I do like the maps, but a 50% error rate -- you wouldn't get on an airplane with that kind of error rate, would you? And I don't think I'd want to base decisions about my demographics on something with that error rate either. Why not take the IPs and check them against whois or something? N
On May 27, 8:21 pm, "Brendan O'Connor" <breno...@gmail.com> wrote:
> On Wed, May 27, 2009 at 5:04 PM, Christian Heilmann <chris.heilm...@gmail.com> wrote:
>
> > http://isithackday.com/hacks/placemaker/tweet-locations.php?user=codepo8
> >
> > What do you think?
>
> Hey, nicely done. I like the maps.
>
> Are you sending the raw tweet texts to the Yahoo Placemaker service? Do you
> try to use the tweet['user']['location'] data at all?
>
> It's interesting to look at the quality level of this yahoo service.
> Unfortunately, it makes lots of mistakes. I was looking at my own feed
> (since i know what i was trying to talk about):
> http://isithackday.com/hacks/placemaker/tweet-locations.php?user=bren...
>
> Out of 10 identifications, 5 of them are errors.
>
> - "#scala" != "Monte Scala, Switzerland"
>   - i meant the programming language.
>
> - "middle-of-the-street *valencia* parking" != "valencia, CA"
>   - that's a street name (in san francisco).
>
> - "go easy on the *cancun*" != "cancun, MX"
>   - minor error: name of a (mexican) restaurant.
>
> - "sports, *mission*, *bay* bridge" != "mission bay, SF, CA"
>   - that's a list of several things. the "mission bay" neighborhood is
>     not one of them .. "bay" is part of the multiword "bay bridge".
>
> and most humorously,
>
> - "giant *ec2* nodes" != "EC2 area code, London, England"
>
> ... I haven't used this Yahoo service before, but I bet that, if it's any
> good at all, it's probably optimized for web pages or big documents, where
> there are many more context words to help disambiguate and safely identify.
> There hasn't been a ton of NLP research on really short twitter-length
> messages, and I suspect the problem is harder, and might require somewhat
> different algorithms, than document-sized NLP problems.
>
> Are there any applications for this where a 50% error rate is OK?
>
> -Brendan
>
> --
> Brendan O'Connor - http://anyall.org
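
For reference, the 50% figure in the quoted message is just the five listed mismatches over the ten identifications reported. Here is a minimal sketch of that tally in Python. The names `reported_errors`, `total_identifications`, and `error_rate` are made up for illustration and are not part of the hack or of any Yahoo API; the data is only what the quoted message lists.

```python
# Mismatches listed in the quoted message:
# (tweet fragment, place the service returned for it).
reported_errors = [
    ("#scala", "Monte Scala, Switzerland"),
    ("middle-of-the-street valencia parking", "valencia, CA"),
    ("go easy on the cancun", "cancun, MX"),
    ("sports, mission, bay bridge", "mission bay, SF, CA"),
    ("giant ec2 nodes", "EC2 area code, London, England"),
]

# Total count stated in the quoted message; the five correct
# identifications are not listed there, so only the count is used.
total_identifications = 10


def error_rate(num_errors: int, total: int) -> float:
    """Return the fraction of identifications that were wrong."""
    if total <= 0:
        raise ValueError("total must be positive")
    return num_errors / total


rate = error_rate(len(reported_errors), total_identifications)
print(f"{len(reported_errors)} of {total_identifications} wrong: {rate:.0%} error rate")
```

With a sample of ten tweets from one feed this is a rough estimate at best, so it says more about the kind of mistakes than about the service's overall accuracy.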