Geosign of the Apocalypse

From Canada’s Financial Post, here’s an interesting summing-up of last year’s Geosign implosion, courtesy of Ahmed Farooq of iBegin.

(Alas, I had skipped over an earlier post on this topic from Peter Krasilovsky, so this was mostly news to me.)

The short version: Geosign operated a bunch of domains that existed solely to serve ads.  Some of these sites included ‘real’ content as a cynical fig leaf.

Googlers know how it goes: You search for ‘XYZ’ and click on an ad (or a result) that looks promising, only to land on a site full of more XYZ-related ads — some of which lead to yet more ad sites, the AdSense version of an infinite loop. 

Since advertisers pay by the click, this provides easy money for companies that are willing to waste your time.  ‘Arbitrage’ is the common — rather charitable — name for the method.

Google ultimately cut off Geosign, presumably because it was hurting the value of Google’s ads, and the company fell apart.

As a strategy, arbitrage isn’t so dissimilar from search-engine marketing (SEM), or even from search-engine optimization (SEO); it’s all a matter of degree.  And when your content is advertising, as it is for Yellow Pages sites, the line gets even blurrier.

So what separates Geosign from the rest of the local universe, which also depends heavily on search-engine traffic?  Witness this chart from Hitwise, recently highlighted by Mike Boland at Kelsey:

Chart

It’s arguable that Geosign is just the chart’s reductio ad absurdum.  Obviously we can make distinctions, but I’d be worried if I were above, say, 35% on this chart and I weren’t Google or Yahoo.

OK, it’s definitely impressive that Local.com gets more of its traffic from search engines than does either Yahoo Local or Google Maps.  Probably the same is true of Marchex, which operates domains like 20176.com.

But if Google and Yahoo want to move their own bars to the right, they can easily do so.  It’ll come from the hide of Local.com, Marchex and similar companies.

And one big lesson of Geosign, scary and refreshing both, is that Google is willing to nuke a 9-digit business overnight.

Time, tide and the print YP

King Canute

R.H. Donnelley, the big Yellow Pages publisher, has lost more than 90% of its market value in the past year — much of it since last week, when it revised its 2008 outlook and announced the resignation of Jake Winebaum, head of its digital operation.  The traded part of the company is now worth about $400 million, vs. $4 billion a year ago.

Same thing, more or less, for competitor Idearc: Down about 85% in the past year.

These are astonishing votes of no-confidence in what are, after all, profitable companies with billions in revenue.  Seems like a few things could be happening:

• Investors are overreacting.  YP companies will get punished by the economy, just like everyone else, but then they’ll recover.

• Investors are overreacting.  Sure, YP companies are in a secular decline, but it’s slow & there’s no reason to panic.

• Investors are right.  Still, the problem is with these specific over-leveraged companies, or the US market, or something else, not the global print YP market.

• Investors are right.  For print YP, this is the last stop before oblivion.

In the world of local experts, all of whom I respect, no one takes the extreme view: See Greg Sterling, John Kelsey, Perry Evans.  The consensus is that we’re looking at a slow decline.  While they ought to get it in gear, YP companies have valuable assets (revenue, sales force, customer relationships) and time to react.

Hard to argue.  And yet … all of this seems weirdly familiar.

In late 1989 I joined the Wall Street Journal to report on IBM.  At the time, the computer giant bestrode the technology world, but there were rumblings from upstarts called Intel and Microsoft.

For the longest time, I believed analysts who — even as they acknowledged the upstarts’ importance — talked up IBM’s huge mainframe revenue, its formidable sales force, and its customer relationships.

More recently, I worked for AOL.  It planned to build something new and valuable while allowing its huge dial-up base (30 million subscribers!) to erode sloooooooowly to broadband.

In both cases, the tide came in far faster and deeper than expected. 

I suspect it’s ever thus: No one here is Canute*, exactly — but when the future is lapping at your feet, big lungs aren’t a viable strategy either.

My gut tells me the tide is coming in quickly for YP publishers. Whatever the stats say about numbers of lookups, I see more & more piles of shrinkwrapped directories that sit for weeks before being tossed.

Only two things are holding back the ocean:

• The lack of a popular tool that makes the Web easier to consult than a phone book.  The iPhone and similar handsets will change this within 18 months.

• Merchants’ failure to realize that their YP ad dollars are buying less and less.  By 2009, this will be impossible to ignore.

What follows will be a real crisis, not just today’s crisis of confidence. 

These companies aren’t 100% print YP, of course.  They include online components.  But they’re fitting things into a print-oriented culture, rather than starting from fundamentals.  Again, it reminds me of IBM, ca. 1990, fitting PCs into its mainframe-oriented strategy.

I visited R.H. Donnelley’s Web site today.  I was greeted by this slogan:

connecting you to the future
building on the past

“Building on the past.”  If I were an investor, I would sell too.  In today’s environment this is an epitaph, not a tagline.

*Aside: So now I read that King Canute was actually making a point to his fawning courtiers.  What use is he, if not as a metaphor?

The Washington Post sure does like Yelp

At least today’s story has a hook: Yelp just announced another round of funding, raising $15 million not because it needs the cash but because, per CEO Jeremy Stoppelman, “it’s a shaky world out there.”  Well, OK.

TechCrunch tosses out a valuation of $200 million on revenue of less than $10 million, and notes (correctly) that Yelp has been on a traffic tear lately.  Still not profitable, says the Post.

One interesting thing was the photo:

This is a local Virginia business giving some TLC to a group of the ‘Yelp Elite,’ which is Yelp’s clubby moniker for those it has anointed as cool kids.

I’m not a fan of the Yelp Elite concept, in part because it works only for twentysomethings, but there’s no doubt it drives engagement.  Smart businesses like this health club in McLean are leveraging that engagement, while Yelp acts as the broker.

I don’t think this strategy will prove cost-effective for Yelp, but it’s definitely interesting to watch.

I continue to be fascinated, too, by the way Yelp has become a platform for self-expression.  Above all, I believe, the ‘Elite’ visits Yelp in order to write artful reviews—not to read them.

It’s a different dynamic than the one we’re trying to tap at Loladex, where the substance of people’s opinions will take precedence over their mode of expression.  We think this will scale better, but it’s certainly tough to argue with how Yelp is doing so far.

Local is weird

Not long ago I was talking with Gib Olander from Localeze, our main data supplier.  The topic was local data, and how weird it can be: Some things look like they must be mistakes—except they’re not.

Gib’s example was a place that sells custom rims and also cellphone service.  If you saw the business listed under both categories, you might figure one was wrong. But Gib can show you a photo of Cell ‘n’ Wheels that proves otherwise.

I laughed. The example wasn’t so terribly outrageous, and Localeze certainly has an interest in promoting this idea.  :)  Yet at the same time, I happened to know of a much better illustration.

I get my hair cut at a barber’s shop that also sells seafood.   Oysters, specifically.  During the holiday season it does a roaring trade in hams, too; they’re piled into a shopping cart by the door.  No one seems to mind buying dinner from a place that can get ankle-deep in human hair.

Whatever about rims and phones, I’d definitely suspect an error if my search for [oysters] near [Leesburg, VA] returned Plaza & Tuffy’s Barber Shop as the top result.  But it’s the only place in Leesburg that advertises oysters on a sign:

 Fresh oysters at Tuffy’s

I shot this photo right before getting a haircut.  When I went inside, my barber Bobby asked why I had been taking pictures.  I told him I had a friend in Chicago who didn’t believe that a barber shop also sold seafood.

“We’re still country here,” he replied. “You tell him that.”

It got me to thinking: In reviewing YellowBot last year, I included a screengrab of something I portrayed as an error—YellowBot’s tags said that a hair-removal place here in Leesburg also sells bail bonds.

Perhaps I was too hasty in calling it a mistake?

Structured vs. unstructured

Yellow Pages folks surely do love structure — especially when it comes to data. Here at the latest Kelsey conference, where YP folks abound, the only good datum is a structured datum.

Consider the title of yesterday’s most interesting panel:

Building a Better Database: Acquiring Content in a Dysfunctional Environment

The title is a bit grad school, but “dysfunctional” is a strong word that caught my eye. Here it mostly means “resistant to structure.”

And them’s fightin’ words in the world of Yellow Pages.

By now I’ve gone to a bunch of YP-oriented conferences. All of them featured a discussion about how to gather structured data. But I’m starting to suspect that this isn’t the most important problem to solve — and not just because these conference discussions never go anywhere.

Here’s my thinking:

In what a YPer would call a functional environment, every business location, small or large, would authorize a regularly updated master version of its “attributes” (hours, certifications, parking facilities, etc.), and would post this information in some microformat on its Web site, or supply it directly to each data vendor, or send it to an industry-wide data clearinghouse that’ll probably never exist.

In addition, lots of other data sources — licensing bodies, rating sites, whatever — would distribute structured information that’s already normalized and can be correlated perfectly to these master records.

All this data would then be collated by data vendors such as Localeze and sold to Web companies such as Google or, for that matter, Loladex.

Finally, the Web companies would build applications that use the structured data for searching by consumers (input) and display to consumers (output).

This worldview may be summarized thus:

More structured data in → Better answers out.

Or as Marchex’s Matthew Berk (who’s a smart guy) said at the panel here: “We think local search is about structured search.”

Berk gave a very good example, which I also use when discussing Loladex: If you’re looking for a doctor, you need to know whether he takes your insurance. That’s true, without a doubt.

But here’s the problem I have:

The majority of information available about any company, and particularly about any small company, will never be structured. It’ll exist only on the general Web, where it must be searched on its own terms — that is, as unstructured text.

To me, this suggests that the most pressing data problem isn’t how to gather more structured data, but how to search unstructured data (on Web pages) and return structured answers.
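As a rough illustration of what "search unstructured text, return structured answers" could mean, here is a minimal Python sketch that pulls a phone number and opening hours out of free-form page text. The patterns, the `extract_attributes` name, and the sample text (including its phone number) are all invented for this example; real Web pages are far messier, and a serious extractor would need much more than two regular expressions.

```python
import re

# Hypothetical patterns: they only cover simple US-style phone numbers and
# "Day-Day 8am to 5pm" style hours. Real pages vary far more than this.
PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})\b")
DAY = r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun)[a-z]*"
HOURS_RE = re.compile(
    rf"\b{DAY}\s*(?:-|to)\s*{DAY}[:,]?\s*"
    r"(\d{1,2})\s*(am|pm)\s*(?:-|to)\s*(\d{1,2})\s*(am|pm)",
    re.IGNORECASE,
)


def extract_attributes(page_text):
    """Return whatever structured attributes can be found in free text."""
    attrs = {}

    phone = PHONE_RE.search(page_text)
    if phone:
        attrs["phone"] = "-".join(phone.groups())

    hours = HOURS_RE.search(page_text)
    if hours:
        first_day, last_day, open_h, open_ap, close_h, close_ap = hours.groups()
        attrs["hours"] = {
            "days": f"{first_day.title()}-{last_day.title()}",
            "open": f"{open_h}{open_ap.lower()}",
            "close": f"{close_h}{close_ap.lower()}",
        }
    return attrs


# Invented sample text, standing in for a small business's home page.
sample = "Fresh cookies baked daily! Open Tue-Sat 8am to 5pm. Call (703) 555-0100 to order."
print(extract_attributes(sample))
```

The point of the sketch is the direction of travel: the input is ordinary prose, and the output is the kind of "attributes" record a data vendor would otherwise have to collect by phone.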

I live on both sides of this equation, by the way. My wife runs a small cookie bakery, and I’m in charge of distributing her data to online sources.

Because of my background, I’m more informed and motivated than most small business owners. And yet, to be honest, just keeping her Web site up-to-date is a chore. On Yelp right now, I’m sorry to say, her hours are incorrect. I should update it, but I just haven’t.

Accuracy on our own Web site is always my #1 priority, because that’s our official voice. Also it’s where most people land when they search for “Lola Cookies.”

Keeping Yahoo Local accurate is on my list, too, but it’s lower down. Ditto Google and YellowPages.com and the other big sites.

I never think about the data vendors one layer back, like InfoUSA, unless they happen to call the store. (Which InfoUSA does, to its credit.)

Meanwhile, plenty of interesting and searchable information about the bakery exists in other places on the Web, in formats that aren’t even addressed by the concept of “attributes.”

A TV broadcast from the bakery aired live on the local morning news recently, for instance. If you watched the show, you might search for us with a term like “fox 5 cookies virginia.” Where does that fit in the world of structured data?

I raised this general issue at yesterday’s panel. What were the panelists doing about this wealth of unstructured Web data, which right now is the dark matter of the local-search universe?

The answer I got was, basically, “Not much.”

Most panelists said they do only highly targeted crawls, focusing on sites that have structured data that can “extend or validate” their own data, in the words of Localeze’s Jeff Beard. An example might be the site of a professional group such as the American Optometric Association.

No panelist was ready to start indexing the sites of individual businesses, or locally focused blogs, or any other sites that are unstructured but potentially rich in content.

The only (mild) exception was Erron Silverstein of YellowBot, who also said his company limits itself to targeted crawls — but included local media, such as newspapers, among his targets.

A few players are indexing the broader Web and then associating pages with specific businesses (which is the important part). Most notable are Google and Yahoo, who do it for their local search products.

Of course, they’re already indexing the entire Web. It’s less of a stretch for them.

Google and Yahoo also buy structured data from InfoUSA, Localeze and others, so it’s not like such data is obsolete. But they’re getting the same info directly from some businesses, and those updates are likely more timely, more accurate, and more complete.

Meanwhile, their Web indices are opening up a realm of data that traditional vendors like Acxiom — represented by Jon Cohn on yesterday’s panel — simply don’t care to address.

I suspect that, sooner than you’d imagine, Google and Yahoo will be buying structured data not so that users can search it directly, but for two less-flattering reasons:

  1. To help find Web pages they can associate with each business
  2. To fill ever-smaller gaps in the coverage that results from #1
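To make reason #1 a little more concrete, here is a small Python sketch of how a structured listing might be used to decide whether a Web page belongs to a given business. This is only my guess at the general shape of such matching, not a description of what Google or Yahoo actually do. The `phones_in` and `match_score` helpers, the 0.6/0.4 weights, and the sample listing and pages are all invented for illustration.

```python
import re

PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})\b")


def phones_in(text):
    """Return every US-style phone number in the text as a bare 10-digit string."""
    return {"".join(groups) for groups in PHONE_RE.findall(text)}


def match_score(listing, page_text):
    """Score from 0 to 1 for how likely a page is about the listed business.

    The weights are arbitrary assumptions: a matching phone number counts
    for 0.6, and the share of name words found on the page for up to 0.4.
    """
    score = 0.0
    if listing["phone"] in phones_in(page_text):
        score += 0.6

    name_tokens = set(re.findall(r"[a-z0-9]+", listing["name"].lower()))
    page_tokens = set(re.findall(r"[a-z0-9]+", page_text.lower()))
    if name_tokens:
        score += 0.4 * len(name_tokens & page_tokens) / len(name_tokens)
    return score


# Invented listing and pages, for illustration only.
listing = {"name": "Example Barber Shop", "phone": "7035550123"}
own_page = "Fresh oysters today at Example Barber Shop. Call 703-555-0123."
other_page = "Ten tips for choosing a barber"

print(match_score(listing, own_page))
print(match_score(listing, other_page))
```

In this toy version the structured record is never shown to the user at all. It serves only as a key for finding and attaching the richer, unstructured page.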

Matthew Berk of Marchex argued that a good local search must be structured to “help someone walk down the decision trail” by using filters to narrow their search progressively:

I need an orthopedist in Boston … in the Back Bay … who accepts United Healthcare.
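For readers who think in code, the "decision trail" can be sketched as a chain of filters over structured records, each step narrowing the previous result. This is a minimal Python illustration, not Marchex's implementation; the `Doctor` record, the `narrow` helper, and the four sample doctors are all made up.

```python
from dataclasses import dataclass, field


@dataclass
class Doctor:
    name: str
    specialty: str
    city: str
    neighborhood: str
    insurers: set = field(default_factory=set)


# Invented sample records, for illustration only.
DOCTORS = [
    Doctor("Dr. A", "orthopedist", "Boston", "Back Bay", {"United Healthcare", "Other Insurer"}),
    Doctor("Dr. B", "orthopedist", "Boston", "Back Bay", {"Other Insurer"}),
    Doctor("Dr. C", "orthopedist", "Boston", "South End", {"United Healthcare"}),
    Doctor("Dr. D", "dermatologist", "Boston", "Back Bay", {"United Healthcare"}),
]


def narrow(candidates, keep):
    """Apply one filter step and return the surviving candidates."""
    return [c for c in candidates if keep(c)]


# Each step corresponds to one clause of the query above.
step1 = narrow(DOCTORS, lambda d: d.specialty == "orthopedist" and d.city == "Boston")
step2 = narrow(step1, lambda d: d.neighborhood == "Back Bay")
step3 = narrow(step2, lambda d: "United Healthcare" in d.insurers)

for label, step in [
    ("orthopedist in Boston", step1),
    ("in the Back Bay", step2),
    ("accepts United Healthcare", step3),
]:
    print(label, "->", [d.name for d in step])
```

Note that every filter depends on a field someone has already collected and normalized. If the insurance field is missing or stale, the last step quietly returns the wrong answer.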

I think users are more likely to learn that they can go to Google and type “orthopedist back bay united healthcare” — particularly if it produces a good top result the first time they try.

The burden of local search, it seems to me, is to do something that Google can’t match with an unstructured Web search.

In any case, the search portals will ultimately use their indexed Web pages to extract and cross-check structured data directly. Over time — probably just a couple of years — such automated processes will yield data that’s more current and detailed than anything that’s produced by scanning phone books or calling stores.

The resulting search functionality, integrating both structured and unstructured data, will be sold to other companies as a Web service, and data vendors such as InfoUSA will become irrelevant to local search.

Now that would be a dysfunctional environment for many of the Kelsey attendees.

I’m not sure exactly how companies like InfoUSA and Acxiom should tackle the unstructured Web. It’ll demand a new way of thinking, and probably a new way of selling.

But I’m certain that they ignore unstructured data at their peril.