I do like the maps, but 50% error -- you would not possibly get on an
airplane with that kind of error rate, would you?  And I don't think
I'd want to make decisions about my demographics based on something
with that error rate either.  Why not take the IPs and bounce them
against whois or something?
N
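
A rough sketch of the whois idea, in case it helps. It assumes the IP
addresses are available in the first place, and the function names and
the default server (whois.arin.net) are illustrative choices only, not
anything the hack currently uses. It speaks plain WHOIS over TCP port 43
and pulls out the first "country:" field it finds.

```python
import socket


def build_query(ip: str) -> bytes:
    """A WHOIS request (RFC 3912) is the query text followed by CRLF."""
    return ip.strip().encode("ascii") + b"\r\n"


def parse_country(response: str):
    """Return the first 'country:' value in a WHOIS response, or None."""
    for line in response.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "country":
            return value.strip().upper() or None
    return None


def whois_country(ip: str, server: str = "whois.arin.net", timeout: float = 10.0):
    """Query a WHOIS server for an IP and return its registered country.

    The server name is an assumption; other registries may be needed
    for addresses outside that registry's region.
    """
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(build_query(ip))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    text = b"".join(chunks).decode("utf-8", errors="replace")
    return parse_country(text)
```

Note that whois reports where an address block is registered, so at
best this gives a country-level guess rather than a city.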


On May 27, 8:21 pm, "Brendan O'Connor" <breno...@gmail.com> wrote:
> On Wed, May 27, 2009 at 5:04 PM, Christian Heilmann <chris.heilm...@gmail.com> wrote:
>
> >http://isithackday.com/hacks/placemaker/tweet-locations.php?user=codepo8
>
> > What do you think?
>
> Hey, nicely done.  I like the maps.
>
> Are you sending the raw tweet texts to the Yahoo Placemaker service?  Do you
> try to use the tweet['user']['location'] data at all?
>
> It's interesting to look at the quality level of this yahoo service.
>  Unfortunately, it makes lots of mistakes.  I was looking at my own feed
> (since i know what i was trying to talk about):
> http://isithackday.com/hacks/placemaker/tweet-locations.php?user=bren...
>
> Out of 10 identifications, 5 of them are errors.
>
>    - "#scala"   !=   "Monte Scala, Switzerland"
>       - i meant the programming language.
>
>    - "middle-of-the-street *valencia* parking"   !=   "valencia, CA"
>       - that's a street name (in san francisco).
>
>    - "go easy on the *cancun*"   !=   "cancun, MX"
>       - minor error: name of a (mexican) restaurant.
>
>    - "sports, *mission*, *bay* bridge"   !=   "mission bay, SF, CA"
>       - that's a list of several things.  the "mission bay" neighborhood is
>       not one of them .. "bay" is part of the multiword "bay bridge".
>
> and most humorously,
>
>    - "giant *ec2* nodes"   !=   "EC2 area code, London, England"
>
> ... I haven't used this Yahoo service before, but I bet that, if it's any
> good at all, it's probably optimized for web pages or big documents, where
> there are many more context words to help disambiguate and safely identify.
>  There hasn't been a ton of NLP research on really short twitter-length
> messages, and I suspect the problem is harder, and might require somewhat
> different algorithms, than document-sized NLP problems.
>
> Are there any applications for this where a 50% error rate is OK?
>
> -Brendan
>
> --
> Brendan O'Connor - http://anyall.org
