Archive for the ‘search’ Category

Real Innovation in Search

July 22, 2008

Reading all the techie articles about semantic search vs. statistical models and semantic web vs. semantic search, I realized that in one way, all that debate is beside the point.

Consumers usually only search with two or three words.  If you pick two random words and google them, you’ll get tens of thousands of pages. 

With only two words to go on, the best search technology in the world is still going to have a really hard time understanding what the searcher wants.  That’s true if you have semantic search, statistical models, or the Head Librarian from the Library of Congress trying to get the “best” answer.

I think real innovation in search is going to have to come from creating user interfaces that get consumers to form better search queries.  For web search, we’ve had the single search box for more than a decade.  For travel sites, it’s still two cities and two dates.  For most job sites, it’s title/location.  (Not Trovix, though!)

Dating sites like eHarmony figured out years ago that to match a person with the best search result, you need more information.  Yet most search sites won’t let you spend three minutes giving relevant information, even if you wanted to.

The technology on the back end can support more complex queries, so the bottleneck is the front end.  Trovix has a huge innovation in that it takes a full document and uses that to create a query. I suspect that other search technology companies will eventually follow this model.  The one-size-fits-all search box is what we need to innovate away from.

Google for Job Search

July 14, 2008

According to Doug Berg, people ran more than 100 million job-related searches on Google in June. The average month has 124 million searches.

Let’s have fun with math.  If each search took 30 seconds, that means on average, people spend more than one million hours a month job searching with Google. Ouch!  Talk about a tough way to spend your time.  When you do the kinds of searches he describes, like “Nursing Jobs New York,” you’re unlikely to get a job that’s even close to what you want. Yet 14,800 people ran that exact search.
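For anyone who wants to check the math, here is a quick sketch. The 124 million figure is the monthly average quoted above, and the 30 seconds per search is just my guess, not a measurement.

```python
# Figures from the post; the 30-second duration is an assumption, not data.
SEARCHES_PER_MONTH = 124_000_000
SECONDS_PER_SEARCH = 30


def hours_spent(searches: int, seconds_each: int) -> float:
    """Convert a number of searches into total hours spent."""
    return searches * seconds_each / 3600


total_hours = hours_spent(SEARCHES_PER_MONTH, SECONDS_PER_SEARCH)
print(f"{total_hours:,.0f} hours per month")
```

Change the 30 to whatever you think a search really takes. Even at 29 seconds, the total stays just under a million hours.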

Are these people really job hunting? I think so.  33,100 people searched for “Construction jobs California,” but only 390 people searched for “construction jobs America.” If some people were looking for economic statistics, the America searches would be more common.  As it is, you’d think California is also too big to get the right results. But people don’t get that.

What this tells me is that a whole lot of people don’t really know how to use search tools like Google, and don’t really know how to job search.  Or maybe they’re just so used to really crummy results when they search that they’re happy with the kinds of things Google comes up with.

Either way, a million hours seems like a lot of time to spend being disappointed.

Microsoft buys Powerset. Great News for Semantic Search.

July 1, 2008

I for one was very happy to read that Powerset got acquired by Microsoft.  The rumored price was “roughly” $100M according to TechCrunch.  For those keeping score, that would be 0.37% of cash on hand for Microsoft, or 14 hours of revenue.  There are three important messages here:

1.  Better search matters.  MSFT wouldn’t be making this move if they didn’t think that a better search technology would help them compete against Google.

2.  Semantic search is real.  Lots of haters (including some folks at Google) like to say that semantic search won’t scale, or that you can’t build a big enough taxonomy or whatever. At Trovix, we know it works. And at Microsoft, someone with a big checkbook just voted that it works.  Semantic technologies are going to completely change the way people interact with data. Microsoft clearly wants to be at the head of that change.

3.  The time for semantic search is now.  The tech highway is littered with technologies that burned out on their own hype without ever delivering the goods.  Now, there are several companies on the cusp, or in Trovix’s case, already shipping products that provide a better user experience based on semantic technologies.  Semantic search can be demoed on applications available to consumers today. Microsoft bought Powerset because they can see a clear path to this technology being a competitive weapon.  They didn’t spend $100M to hire more researchers for the lab. 

Don’t Search for Employment

June 12, 2008

Here’s an example of where keyword search fails.  At news.google.com, I typed in the word “Employment.”  The number one result is an article about Australian employment, so I add a “-Australia.” I rerun the search and the number one result is now about India.  I’m not in India or Australia and Google knows that.  But they can’t do anything about it.   Everyone in the whole world gets the same results.

Besides, if I want articles about employment in the United States, why not search for “United States Employment?”  OK. 

The results come back with almost nothing about employment, and a fairly random scattering of topics. (Actual results are below.)

What’s going on?  I’m not sure, but I suspect this: The word employment shows up in thousands of articles, so it can’t be used to rank results. Same goes for “United States.”  Google doesn’t know what to do.  So it uses an algorithm to determine which stories are the most popular.  My search terms are essentially ignored at this point and the algorithm takes over. I end up with popular articles that include my too-common words.
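To show what I mean by a too-common word, here is a toy sketch using inverse document frequency, a textbook way to weight search terms. I have no idea whether Google News does anything like this. The four headlines are made up, and the `idf` function is my own illustration.

```python
import math


def idf(term: str, documents: list[str]) -> float:
    """Textbook inverse document frequency: log(N / number of docs containing term)."""
    containing = sum(1 for doc in documents if term in doc.lower().split())
    if containing == 0:
        return 0.0  # term never appears, so it cannot help rank anything
    return math.log(len(documents) / containing)


# Invented headlines, for illustration only.
headlines = [
    "employment rises in australia",
    "india employment outlook improves",
    "company washes tomatoes citing employment growth",
    "artists and employment in the united states",
]

print("employment:", idf("employment", headlines))
print("tomatoes:", round(idf("tomatoes", headlines), 3))
```

A word that appears in every document gets a weight of zero, so it can't separate one article from another. Something else has to break the tie.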

This is an example of where conceptual search would blow away keyword search. Imagine if Google knew concepts related to employment, like unemployment, jobs, economic growth, layoffs, etc. Then it could score an article based on how much of the content involves employment-related concepts.  It might still have issues, but I’d bet an article about how to wash tomatoes wouldn’t be in the top 5.
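Here is a minimal sketch of that scoring idea. It assumes a hand-picked set of employment words and two made-up article snippets. A real conceptual search engine would be far more sophisticated; this only shows the principle.

```python
import re

# Hand-picked concept words based on the examples above (an assumption).
EMPLOYMENT_CONCEPTS = {
    "employment", "unemployment", "jobs", "layoffs", "economic", "growth",
}


def concept_score(text: str, concepts: set[str]) -> float:
    """Fraction of the words in `text` that belong to the concept set."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in concepts for token in tokens) / len(tokens)


# Invented snippets, for illustration only.
articles = {
    "jobs report": "Unemployment fell as hiring picked up and layoffs slowed, "
                   "a sign of economic growth in jobs",
    "tomatoes": "Facility will provide certified food safe tomatoes, "
                "supporting employment in Ohio",
}

ranked = sorted(
    articles,
    key=lambda name: concept_score(articles[name], EMPLOYMENT_CONCEPTS),
    reverse=True,
)
print(ranked)
```

The tomato snippet mentions employment once, so a keyword match would let it in. Scored by concept density, it drops below the article that is actually about jobs.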

Trovix is really great at finding jobs for people because we understand concepts related to employment.  So we can show you what you’re looking for. 

Anyways, here were my top 5 results for “United States Employment.”

1.  US Still Leads the world in Science and Technology. (RAND study about R+D spending in the US.)

2. A 21st-Century Profile: Art for Art’s Sake.  (NY Times article about how many artists there are in the US.)

3. Lawmakers ponder next step for E-Verify.  (Bureaucrats in Washington DC try to keep illegal immigrants out of the workforce.)

4. Produce Safety And Security International Ohio Facilities Will Be Operational To Provide Certifed (sic) Food Safe Tomatoes And All Fresh Produce Items (Press release from a company that is washing their tomatoes before selling them.)

5.  Legislature acts to opt AZ out of RealID.  (Another state says no to a federal ID card.)

Semantic Search and the Semantic Web

May 29, 2008

When I want to see what’s up in the search world, sometimes I search Google News for “semantic search” to see if anything interesting has happened.

Something that I’ve noticed is that people seem to be confusing Semantic Search with the Semantic Web. (Example 1, Example 2)  Or maybe I’m confused.  From what I understand, the Semantic Web is a particular idea, first launched years ago, that involves metatags, RDF, data exchange layers etc.  The idea would be to formalize the content of the internet to make it more useful, and to help people filter out what they aren’t interested in.  I’m on the sceptical side of that one.  So is Cory Doctorow.

But Semantic Search is totally different.  Semantic search the way Trovix does it doesn’t require anything of the web page or document in order to be added to a semantic framework. We do all the heavy lifting of tagging the concepts into a tree, and building the indexes so they can be searched from a conceptual and contextual perspective.  That means you can say “show me the resume of a mid level bean counter” and we can do it, even if the accountant in question calls himself experienced. 
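To give a flavor of what matching on concepts means, here is a deliberately tiny sketch. It is not how Trovix works. The phrase lists are hypothetical, and treating “experienced” as mid level is a crude assumption made only for this example.

```python
# Hypothetical phrase lists; a real concept tree would be far larger.
CONCEPTS = {
    "accountant": {"accountant", "bean counter"},
    "mid-level": {"mid level", "mid-level", "experienced"},
}


def concepts_in(text: str) -> set[str]:
    """Return the concepts whose phrases appear anywhere in the text."""
    lowered = text.lower()
    return {
        concept
        for concept, phrases in CONCEPTS.items()
        if any(phrase in lowered for phrase in phrases)
    }


query = "show me the resume of a mid level bean counter"
resume = "Experienced accountant with six years in corporate finance"

print(concepts_in(query) == concepts_in(resume))
```

The query and the resume don’t share a single meaningful keyword, but they land on the same two concepts. A keyword search never finds that match.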

The problem with confusing the two is that the Semantic Web is a long way off if you listen to the proponents. And if you listen to the sceptics, you’d write the idea off altogether.  But semantic search is already able to provide huge value in verticals. The employment space is a massive vertical market. $55 billion is spent by corporations on hiring in the US each year.  We’re not waiting for the semantic web to make search better for people.

Microsoft Investing Big in Search, or Are They?

May 23, 2008

PC World and Ad Age both had articles about Bill Gates’ description of Microsoft and the battle for search.  No surprise that Microsoft is gunning big for Google, and I’d love to know what technologies they think are going to define the next generation of search.  (Besides the word “semantic,” of course.) 

Sadly, no such coverage. While confirming that Gates talked about “new search technologies and future ideas,” both articles swooned over the cash back business model that Microsoft came up with.  Wasn’t that a business model before the dot-com crash? There were (now dead) companies giving away computers, equity and cash as rewards for traffic. So, good for Microsoft for showing up with a 10-year-old idea.  I’m sure it will work out fine for them.

But here’s my real question: What are they doing about search?  People talk about ad serving platforms and keyword management tools as “search,” but they aren’t.  What Microsoft is doing is a business model.  Not a search technology.  The way these articles spin it, Microsoft isn’t even competing on search technology. 

Gates is definitely on the bandwagon that the current approach will be replaced by more intelligent, semantic approaches.  But does he think that they can out-develop Google in this regard? Or is the Microsoft strategy to throw money at the problem?

More on Taleo and Vurv

May 8, 2008

Since everyone says the industry is consolidating, I thought I’d take a quick look at the scoreboard.  Since Trovix started selling its first applicant tracking solution in 2005, the following companies have been bought or shut down: Vurv, Resumix, Brass Ring, Virtual Edge, Deploy, Unicru, Projectix, Hire.com, WetFeet.  And those are just the ones I could think of. Except for WetFeet, all of them were pretty substantial companies.

Meanwhile, we’re continuing to grow, add features and add customers.  I think one thing that works to our advantage is that we have a search technology that other companies simply don’t have.  But also, we designed our interface based on the feedback of people who had already used first generation ATS platforms. (See list of those above.)  So we got to see what problems were tripping up users of other systems and avoid building them into ours.

Microsoft, Yahoo and Search History

May 5, 2008

Seeing Microsoft walk away from Yahoo made me think of a funny story in John Battelle’s book, The Search.  He tells about how Vinod Khosla tried to get Excite to buy Google.  (This is in 1997.  Excite was a very big deal back then.) 

Fortunately for him, Larry Page had too much vision, and set a price for Google that was way too high.  He wanted the sick amount of $1.6 million. (Yes, million with an M.  I think he had his eye on a studio apartment in East Menlo Park.)

So, here’s a case where a search company wanted too much for itself. But they were right in the end.  The question is whether Yahoo, in turning away Microsoft, is also going to be right in the end.  Personally, I doubt it.

Here’s how I see it. Google had unique and powerful technology that solved a real problem that lots of people had: how to find stuff on the Internet.  They just had to grow that into an empire. (Which they did.) Yahoo has a huge amount of traffic. But that isn’t anything all that special.  It’s just traffic.  And it will go away when tastes change. 

Technology is what creates the winners of the future.  Traffic is something that is farmed for money. It’s amazing to me that in 1997, no one saw how valuable the Google technology would become. At the same time, when you think about the big internet players that didn’t have technology (AOL, Netscape, Excite), it makes you wonder if people will be equally amazed at the offer Yahoo walked away from.

Life Beyond Keyword Search

April 30, 2008

There was a solid article on TechCrunch about Nova Spivack’s claims that keyword search is about to break. A line that rang true to me:

But anyone frustrated by the sense that it takes longer to find something on Google today than it did even a year ago knows there is some truth to his argument.

And of course, the graphic:

[Graphic: keyword search crashes into the ground]

He mentions how Google relies on page links to judge popularity. The more data on the web, the less you want a popularity contest to pick your search results.  I would argue that this is already true for jobs, because there is a ton of data (Trovix.com has millions of jobs), and the number of links to those jobs won’t help you pick the best job for you. 

Boolean Search Tips

April 29, 2008

Today, the ERE Exchange ran an article giving some tips on how to use Boolean search.  To me, it was just a great reminder of how bad Boolean search really is.

For example, if you want to see someone with a bachelor’s degree in science, you type:
           ((bachelor* AND science) OR bs* OR “b.s.”)
Sadly, that won’t find anyone from MIT or other schools that call the degrees “SB.”  And Stanford awards ABs instead of BAs. As clever as the Boolean string is, it doesn’t really get the job done.
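If you want to see the miss for yourself, here is a rough sketch that mimics the Boolean string in plain Python. It assumes the wildcard means a case-insensitive prefix match on a word, which varies from one search tool to another. The resume lines are made up.

```python
import re


def matches_boolean(resume: str) -> bool:
    """Mimic ((bachelor* AND science) OR bs* OR "b.s."), case-insensitively."""
    text = resume.lower()
    words = re.findall(r"[a-z]+", text)
    has_bachelor = any(word.startswith("bachelor") for word in words)  # bachelor*
    has_science = "science" in words
    has_bs = any(word.startswith("bs") for word in words)              # bs*
    has_b_dot_s = "b.s." in text                                       # "b.s."
    return (has_bachelor and has_science) or has_bs or has_b_dot_s


# Invented resume lines, for illustration only.
resumes = [
    "Bachelor of Science in Biology",
    "B.S. Chemistry",
    "S.B. in Physics, MIT",
    "SB in Physics, MIT",
]

for line in resumes:
    print(f"{line!r}: {matches_boolean(line)}")
```

The first two lines match and both MIT-style lines fall through. You could bolt on another OR clause for “SB,” and then another for the next school that does things differently. That is the problem in a nutshell.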

Of course, if you’re a recruiter and want to see a resume of someone who went to a top school, has 5 to 8 years work experience, and has enterprise software experience, Boolean can’t do a thing for you. 

Boolean search for job seekers is an even worse idea. Think about looking for a sales manager job.  You might also search for area manager, sales representative, sales associate, business development and a dozen other titles. And you’ll still get back jobs selling cell phones, cars, software and life insurance. 

The good news is that the feedback we’re getting from job seekers and recruiters is that with our search, they don’t need to worry about Boolean any more.