I do like the maps, but a 50% error rate? You wouldn't get on an
airplane with that kind of error rate, would you? And I don't think
I'd want to make decisions about my demographics based on something
with that error rate either. Why not take the IPs and check them
against whois or something?

On May 27, 8:21 pm, "Brendan O'Connor" <breno...@gmail.com> wrote:
> On Wed, May 27, 2009 at 5:04 PM, Christian Heilmann <
> chris.heilm...@gmail.com> wrote:
> > http://isithackday.com/hacks/placemaker/tweet-locations.php?user=codepo8
> > What do you think?
> Hey, nicely done.  I like the maps.
> Are you sending the raw tweet texts to the Yahoo Placemaker service?  Do you
> try to use the tweet['user']['location'] data at all?
> It's interesting to look at the quality level of this Yahoo service.
> Unfortunately, it makes lots of mistakes.  I was looking at my own feed
> (since I know what I was trying to talk about):
> http://isithackday.com/hacks/placemaker/tweet-locations.php?user=bren...
> Out of 10 identifications, 5 of them are errors.
>    - "#scala"   !=   "Monte Scala, Switzerland"
>       - i meant the programming language.
>    - "middle-of-the-street *valencia* parking"   !=   "valencia, CA"
>       - that's a street name (in san francisco).
>    - "go easy on the *cancun*"   !=   "cancun, MX"
>       - minor error: name of a (mexican) restaurant.
>    - "sports, *mission*, *bay* bridge"   !=   "mission bay, SF, CA"
>       - that's a list of several things.  the "mission bay" neighborhood is
>       not one of them .. "bay" is part of the multiword "bay bridge".
> and most humorously,
>    - "giant *ec2* nodes"   !=   "EC2 area code, London, England"
> ... I haven't used this Yahoo service before, but I bet that, if it's any
> good at all, it's probably optimized for web pages or big documents, where
> there are many more context words to help disambiguate and safely identify.
> There hasn't been a ton of NLP research on really short Twitter-length
> messages, and I suspect the problem is harder, and might require somewhat
> different algorithms, than document-sized NLP problems.
> Are there any applications for this where a 50% error rate is OK?
> -Brendan
> --
> Brendan O'Connor - http://anyall.org
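
For what it's worth, here is a rough sketch of the kind of cheap pre-filter one might try on the five misidentifications listed above. It only rejects a place match when the matched text is a hashtag. The function name `is_hashtag_match` and the list `REPORTED_ERRORS` are made up for illustration; nothing here is part of Placemaker or the Twitter API, and the tweet fragments are just the snippets quoted in the message.

```python
# The five misidentifications reported in the quoted message:
# (tweet fragment as quoted, text that was matched to a place).
REPORTED_ERRORS = [
    ("#scala", "scala"),
    ("middle-of-the-street valencia parking", "valencia"),
    ("go easy on the cancun", "cancun"),
    ("sports, mission, bay bridge", "mission, bay"),
    ("giant ec2 nodes", "ec2"),
]


def is_hashtag_match(text, matched):
    """Return True if the first occurrence of `matched` in `text`
    is directly preceded by '#' (hypothetical helper, crude on purpose)."""
    idx = text.lower().find(matched.lower())
    return idx > 0 and text[idx - 1] == "#"


# Which of the reported errors would this filter have rejected?
caught = [frag for frag, matched in REPORTED_ERRORS
          if is_hashtag_match(frag, matched)]
print(f"caught {len(caught)} of {len(REPORTED_ERRORS)}: {caught}")
```

A surface rule like this only catches the "#scala" case. The street name, the restaurant, the list of several things, and "ec2" all need context that is not in the tweet itself, which fits the point above that very short messages are a harder problem.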
