[twitter-dev] Re: Search API questions

2009-11-28 Thread dbasch
Hi Elroy,

I tried your query from python several times within the same minute.
After running the query several times in a row I start getting fresh
results and they remain fresh for a while. I tried changing the least
significant decimal to make it a different query and I get stale
results immediately. Switching back yields fresh results.

This to me suggests that there may be two search tiers: one for low-
frequency queries that probably searches a subset of tweets, and
another one for frequent ones that searches everything and has an LRU
cache of important queries. It seems that we can force queries into
the LRU cache of the "good" tier by querying frequently enough. When I
stop querying for three minutes or so I see the old results again. The
question for the search team is how to have your query treated as an
"important" one without abusing the API.



On Nov 28, 1:18 pm, enygmatic  wrote:
> I got some requests to post the query that I am using:
> Here is the query:
> http://search.twitter.com/search.atom?geocode=19.017656%2C72.856178%2...
> Do correct me if I am not querying or using the API correctly. (Should
> have been my first question actually :) )
> Also here is a sample of the output from my ruby script. It will give
> you an idea of the "stale" results that I am getting. The script was
> run at approximately 21:37 IST.  As you can see, I'm getting tweets
> all the way back to 14:00 hours in the afternoon. I'm pretty sure
> there are more tweets for my location. I'm querying for tweets
> originating out of Mumbai, and by querying through twitter search I
> have noticed that there are at least 40-50 tweets posted every 2
> minutes or so.
> Output follows: Date-Day-Hour-Minute-Tweet-User-Hashtags(csv, if any)-source of tweet
> (All date/time info below is in IST)
> 2009-11-28    Saturday    21    27    @abhishek_rai I too am huge fan of quizzing.. do let me kno if u find anythin interesting. ty    Shakti_Shetty (Shakti Shetty)        web
> 2009-11-28    Saturday    21    21    @surubhi hallow darlin, 'm fine doin great...how about u?    dacku87 (darshan thacker)        mobile web
> 2009-11-28    Saturday    20    40    powai mocha so full of people, smaloe conversations and music..    sumagambs (Sumit Singh Gambhir)        web
> 2009-11-28    Saturday    20    25    @thetruboy idk we'll see. Ari should be home by then    ronniebaby010 (Princess)        UberTwitter
> 2009-11-28    Saturday    19    54    friends do look up www.clickthehorror.com - the website for my new film distirbuted by PNC has been launched - look 4ward to feedbacks    sangeethsivan (sangeeth sivan)        web
> 2009-11-28    Saturday    19    54    I'm guessing @Netra and @prolificd are the two few Twitterers who've had multi-city tweetups. How cool is that. National figures!    b50 (Bombay Addict)        Tweetie
> 2009-11-28    Saturday    19    36    RT: Trupti's Blog: What Commercial Floor Mats Offer: One of the best ways to keep any p.. http://bit.ly/6sZWJg #blog    MishraNatty (Natasha Mishra)    blog    twitterfeed
> 2009-11-28    Saturday    19    09    @mattyza when launched back in 2005, the Xbox 360 was available in Core and Pro. Now it's Arcade and Elite. Same difference!    aalaap (Aalaap Ghag)        Tweetie
> 2009-11-28    Saturday    19    05    Profit with Google, Twitter & affiliate marketing http://snipurl.com/tet1r    Tiifani_Lurid (Tiifani Lurid)
> 2009-11-28    Saturday    18    35    Just voted OOiZiT.com for Best Online Music Label http://mashable.com/owa #openwebawards    ankit_9oct (Ankit Khandelwal)    openwebawards    Mashable Connect
> 2009-11-28    Saturday    18    35    @reginafetalvero HAHA. YUHH. Gift ko ah? :">    Jhoriiliee (Jorylie Cando)        web
> 2009-11-28    Saturday    18    24    @tweet_words JAGGERY PALM    gannirules (gaanish)        Snaptu
> 2009-11-28    Saturday    17    34    @karan_talwar pls post that if you get an answer.    champbox (champbox)        Tweets60
> 2009-11-28    Saturday    17    34    Just Got Home! :) Wee. Had FUN tonight! :) HBD kathy! Sayang wala si Beb, complete na sana.    Jhoriiliee (Jorylie Cando)        web
> 2009-11-28    Saturday    17    34    I'm listening to Kurbaan: Kurbaan Hua (Soundtrack) - @Spinlet    kmadvani (Kunal M Advani)        API
> 2009-11-28    Saturday    17    03    Eastern Province Under-19s 322/7 & 185/5 v South Western Districts Under-19s 92/10 & 152/10 *: Eastern Province.. http://bit.ly/4rS1iA    venky888 (venkatesh iyer)        twitterfeed
> 2009-11-28    Saturday    16    52    Hey tweeps..Rocket Singh pics http://www.yashrajfilms.com/microsites/rocketsingh/fullpage.htm

[twitter-dev] Re: getting older tweets

2009-11-28 Thread dbasch

I don't know if this will be useful to you but we have a
representative sample of tweets from past months in our search tool.
Right now it goes back to February. See for example:


Change the date and query for any date between February and today.


On Nov 28, 3:12 pm, Jack Widman  wrote:
> What are the oldest tweets I can search for? Is use of 'since' the only way?
> On Sat, Nov 28, 2009 at 1:09 PM, Raffi Krikorian  wrote:
> > there is currently no way to search for tweets that are that old.
> >> I am using the search api and want to get tweets from, say, one year
> >> ago. the 'since', parameter seems not to be working. e.g.
> >> http://search.twitter.com/search.atom?q=obama&until=2009-03-24
> >> returns no results.
> >> Am I doing something wrong?
> > --
> > Raffi Krikorian
> > Twitter Platform Team
> > ra...@twitter.com | @raffi

[twitter-dev] Re: Adding more languages to lang parameter in Search API

2009-11-28 Thread dbasch

There are tools you could use to do language detection on your side
and filter out non-Serbian tweets. I assume what slows you down is the
call to Google's Language Detection API:


You should try the n-gram based language identifier that comes with
the Nutch search engine. You can build a language model for Serbian
relatively quickly (just feed it a file with a fair amount of text in
Serbian) and see how well it works:
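
To give a feel for how an n-gram language identifier works, here is a minimal sketch. It is not Nutch's implementation: it swaps in a simple rank-order comparison of character trigram profiles, and the two training strings are tiny toy samples I made up, far too small for real use. The function names are hypothetical.

```python
from collections import Counter

def trigram_profile(text, size=300):
    """Return the most frequent character trigrams of text, most common first."""
    padded = f" {' '.join(text.lower().split())} "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    return [gram for gram, _ in counts.most_common(size)]

def distance(doc_profile, lang_profile):
    """Sum of rank differences; trigrams unknown to the model get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    penalty = len(lang_profile)
    return sum(abs(i - rank[g]) if g in rank else penalty
               for i, g in enumerate(doc_profile))

def identify(text, models):
    """Pick the language whose profile is closest to the text's profile."""
    doc = trigram_profile(text)
    return min(models, key=lambda lang: distance(doc, models[lang]))

# Toy training data (assumption: real models need far more text per language).
sr_text = "живот је леп и дан је сунчан"
ru_text = "жизнь прекрасна и день солнечный"
models = {"sr": trigram_profile(sr_text), "ru": trigram_profile(ru_text)}

print(identify(sr_text, models))
```

With realistic amounts of training text, short tweets would then be scored locally instead of through a remote call per tweet.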



On Nov 29, 1:30 am, Тома Тасовац  wrote:
> Could somebody from the Twitter team please address my question about 
> language recognition in the search API?
> Many thanks in advance.
> T.
> On 25.11.2009, at 10:00, Toma wrote:
> > Hi there.
> > I am working on a WordNet-based Serbian-English dictionary (part of
> > Transpoetika Project at the Belgrade Center for Digital Humanities,
> > http://humanistika.org)
> > I've implemented a "LiveQuote" system with Twitter, where we get most
> > recent tweets exemplifying the use of a given dictionary entry. We
> > also have several other ideas on how to integrate Twitter in our
> > dictionary application, both on the production and reception ends.
> > But we're facing a serious performance issue: Twitter's language
> > parameter (lang) does not recognize Serbian (sr). My workaround has
> > been to use Google Translate's API to check tweets to make sure they
> > are really Serbian. It works, Google is pretty good about this (not
> > 101%, but close enough), but this has considerably slowed down the
> > process -- every tweet we get for a certain word has to be checked
> > with Google before being displayed.
> > Without a language check, however, we run into cases where certain
> > Russian, Bulgarian, Macedonian etc. tweets will sometimes sneak into
> > our results thanks to interlingual homographs. For example, живот in
> > Serbian means "life", while in Russian it means "stomach".
> > I am curious how you guys check for language identity on your backend,
> > and whether there was any chance you could include Serbian in the
> > list?
> > All best,
> > Toma

[twitter-dev] Re: want to get a frequency count of all words on twitter, 1 time/day

2009-12-02 Thread dbasch
We can help you with that, we have word lists for hundreds of millions
of tweets from the past year and add more every day to trendistic.com.
Please email me, maybe we can do something together.


On Dec 2, 3:33 pm, hydrodog  wrote:
> The twitter API allows us to collect the top 10 keywords, but what we
> want is a lot of words (100,000 perhaps?)  but only once per day.
> Obviously, with a firehose, we could do the work ourselves, but it
> seems obvious that internally, such a keyword list must exist, so is
> there any chance to get it?  For any kind of research, it's very
> useful, and clearly our group is not the only one that would benefit.

[twitter-dev] Re: How to get the most followed users?

2009-12-06 Thread dbasch

> not that I know of.  you could, conceivably, stroll through all the
> followers of a particular user, gather their number of followers, and
> then gather their followers and wash, lather, rinse, repeat.

It would be easier to do it the other way around. Start picking random
users and see who they follow. The list of the most followed people
should stabilize relatively quickly.
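
A rough sketch of that sampling idea follows. The `get_friends` callback and the toy follow graph are hypothetical stand-ins for whatever API call returns the accounts a user follows; nothing here talks to Twitter.

```python
import random
from collections import Counter

def estimate_most_followed(users, get_friends, sample_size, top_n=3, seed=0):
    """Tally who a random sample of users follows and return the top_n names."""
    rng = random.Random(seed)
    tally = Counter()
    for user in rng.sample(users, sample_size):
        tally.update(get_friends(user))
    return [name for name, _ in tally.most_common(top_n)]

# Toy graph (made-up data): everyone follows "celeb", three in four follow "news",
# and each user also follows one ordinary neighbour.
graph = {
    f"u{i}": ["celeb"] + (["news"] if i % 4 else []) + [f"u{(i + 1) % 200}"]
    for i in range(200)
}

top = estimate_most_followed(list(graph), graph.__getitem__, sample_size=100)
print(top)
```

In practice you would rerun with growing sample sizes and stop once the top of the list no longer changes.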

[twitter-dev] Re: Max tweet ID

2009-12-07 Thread dbasch
If you mean the since_id parameter, I just tried it for a few queries
and it worked as expected. Do you have examples of a query that
returns older tweets? Just curious.
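
If older tweets do come back, a client-side guard is cheap. This is only a sketch: it assumes each search result is a dict with an integer "id" key, and the function name is hypothetical.

```python
def new_tweets(results, last_seen_id):
    """Drop results at or below last_seen_id; return (fresh, updated_last_seen_id)."""
    fresh = [t for t in results if t["id"] > last_seen_id]
    newest = max((t["id"] for t in fresh), default=last_seen_id)
    return fresh, newest

# Made-up sample: one tweet is older than the stored high-water mark.
results = [{"id": 105, "text": "new"}, {"id": 99, "text": "old"}, {"id": 101, "text": "new"}]
fresh, last_seen = new_tweets(results, last_seen_id=100)
print([t["id"] for t in fresh], last_seen)
```

Store the returned id and pass it both as since_id on the next request and as `last_seen_id` here.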


On Dec 6, 4:30 pm, "ESPR!T"  wrote:
> Hi, I have been searching for tweets newer than a given tweet id (a cron
> job accesses the search and gets all tweets containing some word
> whose id is higher than the previous one), but it looks like it no longer
> works correctly for me. I modified my application about a week ago,
> but I didn't touch the functionality around the search.
> Basically, I am now also getting some older tweets in the search results
> again and again (I store all the found tweets in my db, so I can
> see that a tweet is being found again even if the max id is higher).
> Does anyone have the same problem (maybe something changed in the API in
> the last 3-5 days), or is the problem really on my side?
> Thanks

[twitter-dev] Re: Ping bot now available

2009-12-08 Thread dbasch
Hi Fabien,

Just a thought in case you haven't considered it: be careful not to
get caught in an infinite loop. There are bots that listen to keywords
and reply to you. Someone may trigger a situation like that by making
you echo such keywords, either maliciously or by accident.
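
One simple guard, sketched below, is to cap how many replies any single account can trigger within a time window. The class name and the default limits are arbitrary assumptions, not anything from the bot in question.

```python
import time

class ReplyGuard:
    """Allow at most max_replies replies per user within window_seconds."""

    def __init__(self, max_replies=3, window_seconds=600):
        self.max_replies = max_replies
        self.window_seconds = window_seconds
        self._history = {}  # user -> timestamps of recent replies

    def allow(self, user, now=None):
        now = time.time() if now is None else now
        # Keep only replies that are still inside the window.
        recent = [t for t in self._history.get(user, []) if now - t < self.window_seconds]
        allowed = len(recent) < self.max_replies
        if allowed:
            recent.append(now)
        self._history[user] = recent
        return allowed

guard = ReplyGuard(max_replies=2, window_seconds=60)
decisions = [guard.allow("echo_bot", now=t) for t in (0, 1, 2, 100)]
print(decisions)
```

Two bots bouncing messages off each other then stall after a couple of rounds instead of looping indefinitely.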


On Dec 8, 8:52 am, Fabien Penso  wrote:
> Hi everyone,
> I looked for a few minutes without finding a ping bot account on twitter.
> So here it goes: http://twitter.com/pingpongbot using the streaming
> API, therefore should be fast. Works on public replies for now, suggest
> ideas if you wish.
> ...

[twitter-dev] Re: Getting number of lists a user is on

2009-12-08 Thread dbasch
I second that, users/show should be consistent with what you can see
by going to someone's profile page.

On Dec 8, 5:00 pm, Wynn Netherland  wrote:
> +1 for adding the count to users/show
> Wynn Netherland
> @pengwynn

[twitter-dev] Re: Regarding the search API based on Geo location

2009-12-10 Thread dbasch
>  if you are only looking for tweets
> that use the geotagging API, then do some post processing to find a
> tweet with a populated geo field.

The only way to do this right now is to hammer the search API, because
tweets with a populated geo field are needles in a haystack. I've done
this for a 500-mile radius centered in San Francisco only to find a
tweet or two every several minutes.
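
The post-processing step itself is tiny. A sketch, assuming each result has already been parsed into a dict whose "geo" value is either null/None or a point object; the sample data is made up:

```python
def geotagged(tweets):
    """Keep only tweets whose geo field is populated."""
    return [t for t in tweets if t.get("geo")]

sample = [
    {"id": 1, "text": "no location", "geo": None},
    {"id": 2, "text": "tagged", "geo": {"type": "Point", "coordinates": [37.7749, -122.4194]}},
    {"id": 3, "text": "field missing entirely"},
]

print([t["id"] for t in geotagged(sample)])
```

The expensive part is fetching enough results to find any, not the filter.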

Geo tweets are so few at this point (I'm guessing on the order of one
in a thousand) that they could be thrown into their own in-memory
index on a low-end box. Tagging them with a custom field (e.g.
geo_data_present:true/false) in the regular search index should be
easy. Of course it means regenerating indexes, testing, etc.
Understandably this is not a priority for the search team as they must
be swamped with more urgent issues, but it would be nice.


[twitter-dev] Re: Regarding the search API based on Geo location

2009-12-14 Thread dbasch
Many people use UberTwitter from their phone and also tweet from the
web or a desktop client. UT updates the profile location with GPS data
but the browser doesn't. If the source of the tweet is UT chances are
the location is accurate, otherwise it's probably old. If you
desperately need to pin as many tweets on the map as possible you may
want to use this information for the time being.
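
As a sketch of that idea, the snippet below pulls coordinates out of profile location strings of the "ÜT: lat,lon" or "iPhone: lat,lon" form quoted below, and treats them as current only when the tweet's source is UberTwitter. The function names are hypothetical, and the trust rule is just the heuristic described above, not anything the API guarantees.

```python
import re

# Matches profile locations such as "ÜT: 37.293106,-121.969004".
COORD_RE = re.compile(
    r"^\s*(?:ÜT|iPhone):\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$"
)

def profile_coords(location):
    """Return (lat, lon) parsed from a profile location string, or None."""
    match = COORD_RE.match(location or "")
    if not match:
        return None
    return float(match.group(1)), float(match.group(2))

def location_probably_current(tweet_source):
    """Heuristic: only tweets sent from UberTwitter refresh the profile location."""
    return tweet_source == "UberTwitter"

print(profile_coords("ÜT: 37.293106,-121.969004"))
print(profile_coords("London, UK"))
```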


On Dec 14, 8:33 pm, redders  wrote:
> Hi Raffi,
> I'll be interested to hear when the API adds functionality that'll
> allow us to retrieve *only* tweets with a geopoint! Any hints?
> In the meantime, copied from the first post: what is going on with
> tweets like this?
> {
>     * location: "iPhone: 37.313690,-122.022911"
>     * geo: null
> }
> {
>     * location: "ÜT: 37.293106,-121.969004"
>     * geo: null
> }
> Presumably this is some developer-implemented work around from a
> client that geotagged tweets before the geotagging API was available,
> by setting the profile location (where it normally says London, UK
> etc.) to co-ords?
> If so, I will just ignore it, as these should in theory become less
> and less common as developers update their apps to use the official
> geotagging method, but I want to be sure that I'm not missing some
> crucially geotagged tweets!

[twitter-dev] Re: searching spesific keyword in Tweets

2009-12-16 Thread dbasch
You can also try search.trendistic.com. We have a fraction of the
tweets but you can search all of 2009.

On Dec 16, 11:29 am, John Kalucki  wrote:
> Google.com is your only bet, and it will be very patchy.
> -John Kalucki
> http://twitter.com/jkalucki
> Services, Twitter Inc.
> On Tue, Dec 15, 2009 at 9:50 PM, MuratMetu  wrote:
> > Hello, I am new to Twitter development. Is there any way to get all
> > tweets including specific keyword "x" and posted in the most recent 3
> > - 4 months. I heard that there is a way to do it for 1-2 week old
> > tweets but I need to go 3-4 months back. Thank you.

[twitter-dev] Re: URLification

2009-12-17 Thread dbasch
Periods and parentheses are valid url characters. Assuming that an
adjacent period or closing parenthesis is not part of the url is a
gamble. The most sensible urlification includes all valid characters
until it finds one that clearly delimits the url such as a space.

http://www.ietf.org/rfc/rfc1738.txt


On Dec 17, 7:13 am, Ken Dobruskin  wrote:
> When adding a URL surrounded by parentheses or followed by a period, these 
> marks are included in the resulting link. Is a trailing whitespace the only 
> workaround? It's ugly and wastes a character.
> _
> Windows Live Hotmail: Your friends can get your Facebook updates, right from 
> Hotmail®.http://www.microsoft.com/middleeast/windows/windowslive/see-it-in-act...

[twitter-dev] Re: URLification

2009-12-17 Thread dbasch
You can get pretty sophisticated and have lots of heuristics to guess
what the user actually meant. For example, a period followed by a
space and a word that starts with uppercase almost certainly means
that the period was the end of a sentence and not part of the url.
Twitter should probably do this, since the heuristic is quite conservative.
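
A minimal sketch of such heuristics, under my own assumptions: URLs are runs of non-whitespace starting with http:// or https://, a trailing period is dropped when it is followed by whitespace and an uppercase letter (or ends the text, which is an extra rule I added), and a trailing closing parenthesis is dropped only when the URL has no matching opening one. This is not how Twitter or any mail client actually does it.

```python
import re

URL_RE = re.compile(r"https?://\S+")

def find_urls(text):
    """Extract URLs, trimming trailing punctuation that looks like prose."""
    urls = []
    for match in URL_RE.finditer(text):
        url = match.group()
        rest = text[match.end():]
        # Period, then whitespace and a capitalized word (or end of text): sentence end.
        if url.endswith(".") and (rest == "" or re.match(r"\s+[A-Z]", rest)):
            url = url[:-1]
        # Unbalanced closing parenthesis: it belongs to the surrounding text.
        if url.endswith(")") and url.count("(") < url.count(")"):
            url = url[:-1]
        urls.append(url)
    return urls

print(find_urls("Read http://example.com/a.b. Then stop."))
print(find_urls("(see http://example.com/page)."))
print(find_urls("http://en.wikipedia.org/wiki/Foo_(bar) is fine"))
```

A period followed by a lowercase word is left attached, which keeps the rule on the cautious side.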


On Dec 17, 11:10 am, Ken Dobruskin  wrote:
> True, but Yahoo! Mail and others do get it right.
> It's been a few years I no longer worry sending an email with a URL at the 
> end of a sentence. I wonder how they do it.
> > Date: Thu, 17 Dec 2009 05:48:31 -0800
> > Subject: [twitter-dev] Re: URLification
> > From: dba...@gmail.com
> > To: twitter-development-talk@googlegroups.com
> > Periods and parentheses are valid url characters. Assuming that an
> > adjacent period or closing parenthesis is not part of the url is a
> > gamble. The most sensible urlification includes all valid characters
> > until it finds one that clearly delimits the url such as a space.
> > http://www.ietf.org/rfc/rfc1738.txt
> > On Dec 17, 7:13 am, Ken Dobruskin  wrote:
> > > When adding a URL surrounded by parentheses or followed by a period, 
> > > these marks are included in the resulting link. Is a trailing whitespace 
> > > the only workaround? It's ugly and wastes a character.
> _
> Windows Live: Friends get your Flickr, Yelp, and Digg updates when they 
> e-mail 
> you.http://www.microsoft.com/middleeast/windows/windowslive/see-it-in-act...

[twitter-dev] Re: URLification

2009-12-17 Thread dbasch
I agree. I searched the issues db and didn't find it. Not sure if it
belongs as an API issue but I submitted it anyway.


On Dec 17, 2:49 pm, Ken Dobruskin  wrote:
> A closing parenthesis followed by a space seems like a pretty safe bet too. 
> I'm sure those rules have been worked out long ago - the RFC was published in 
> '94.
> > Date: Thu, 17 Dec 2009 07:55:14 -0800
> > Subject: [twitter-dev] Re: URLification
> > From: dba...@gmail.com
> > To: twitter-development-talk@googlegroups.com
> > You can get pretty sophisticated and have lots of heuristics to guess
> > what the user actually meant. For example, a period followed by a
> > space and a word that starts with uppercase almost certainly means
> > that the period was the end of a sentence and not part of the url.
> > Twitter probably should do this, as it's quite conservative.
> > Diego
> > On Dec 17, 11:10 am, Ken Dobruskin  wrote:
> > > True, but Yahoo! Mail and others do get it right.
> > > It's been a few years I no longer worry sending an email with a URL at 
> > > the end of a sentence. I wonder how they do it.
> > > > Date: Thu, 17 Dec 2009 05:48:31 -0800
> > > > Subject: [twitter-dev] Re: URLification
> > > > From: dba...@gmail.com
> > > > To: twitter-development-talk@googlegroups.com
> > > > Periods and parentheses are valid url characters. Assuming that an
> > > > adjacent period or closing parenthesis is not part of the url is a
> > > > gamble. The most sensible urlification includes all valid characters
> > > > until it finds one that clearly delimits the url such as a space.
> > > > http://www.ietf.org/rfc/rfc1738.txt
> > > > On Dec 17, 7:13 am, Ken Dobruskin  wrote:
> > > > > When adding a URL surrounded by parentheses or followed by a period, 
> > > > > these marks are included in the resulting link. Is a trailing 
> > > > > whitespace the only workaround? It's ugly and wastes a character.
> _
> Keep your friends updated—even when you’re not signed 
> in.http://www.microsoft.com/middleeast/windows/windowslive/see-it-in-act...

[twitter-dev] Re: Reg Fetch tweets by append GEO Code to URL from Search API

2009-12-18 Thread dbasch
I tried your query and got a timeout. My guess is that it's just a
very expensive query to compute because of the large radius. It seems
to work fine with a smaller radius.
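
For anyone experimenting with the radius, here is a small sketch that builds the request URL. It assumes the geocode parameter takes "latitude,longitude,radius" with a unit suffix such as km or mi; the query term, coordinates and radius below are placeholders, not the original poster's values.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://search.twitter.com/search.json"

def geo_search_url(query, lat, lon, radius):
    """Build a Search API URL restricted to a circle around (lat, lon)."""
    params = {"q": query, "geocode": f"{lat},{lon},{radius}"}
    return f"{SEARCH_URL}?{urlencode(params)}"

url = geo_search_url("holiday", 19.017656, 72.856178, "25km")
print(url)
```

Starting small and widening the radius until requests begin to time out gives a feel for where the limit is.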


On Dec 18, 3:25 am, praveenkumar nakka wrote:
> Hi,
> I was using search API to get tweets from Twitter. When i append geo code to
> the URL i got following error  like this
> URL: http://search.twitter.com/search.json?q=%22holiday+list.+Pick+me%21%2...
> .TwitterException: *Server returned HTTP response code: 502 for URL*:
> http://search.twitter.com/search.json?q=%22holiday+list.+Pick+me%21%2...
>         at com.netelixir.api.twitterSrc.http.HttpClient.httpRequest(HttpClient.java:274)
>         at com.netelixir.api.twitterSrc.http.HttpClient.get(HttpClient.java:189)
>         at com.netelixir.api.twitterSrc.Twitter.get(Twitter.java:279)
>         at com.netelixir.api.twitterSrc.Twitter.search(Twitter.java:1125)
>         at com.netelixir.api.twitter.DumpTweetsData.run(DumpTweetsData.java:119)
>         at java.lang.Thread.run(Thread.java:619)
> If i try to pull tweets without geo code then its working fine ,
> What is the wrong in the sending url and why its coming like this?
> Is there any other way to get tweets by using geocode from Search API?
> please give me reply as early as possible.
> Thanks
> Praveen

[twitter-dev] Re: Doing a search with from:username_with_underscore doesnt seem to work

2009-12-18 Thread dbasch
It's not the underscore. These queries work:


This particular one doesn't:


to:the_hindu does work. Just speculating, but perhaps the_hindu's
results were dropped from the index by a spam filter for some reason.
Maybe it tweeted too many times in a very short period.


On Dec 18, 12:23 pm, Joe  wrote:
> I am trying to search for all messages from a particular user, e.g.
> from:the_hindu
> It doesn't work. How is this supposed to be done?

[twitter-dev] Deleting tweets tracked by keyword

2010-02-17 Thread dbasch
I'm using the stream API to track tweets by keyword (filter).
According to the documentation, "Streams may also contain status
deletion notices. Clients are urged to honor deletion requests and
discard deleted statuses immediately."

When I try creating and deleting tweets, I always get the new tweets
but never see deletion notices. The tweets do disappear from my
timeline and the search results. Is this a bug or should I expect to
never receive deletion notices through the filter call?
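
For context, this is roughly how my client would honor a notice if one arrived. It is a sketch that assumes each stream line is a JSON object and that deletion notices look like {"delete": {"status": {"id": ..., "user_id": ...}}}; the function name and the in-memory store are hypothetical.

```python
import json

def apply_stream_message(line, store):
    """Update store (status id -> status dict) from one JSON line of the stream."""
    message = json.loads(line)
    if "delete" in message:
        status_id = message["delete"]["status"]["id"]
        store.pop(status_id, None)  # discard the status; ignore ids we never stored
    elif "id" in message:
        store[message["id"]] = message
    return store

store = {}
apply_stream_message('{"id": 1234, "text": "hello"}', store)
apply_stream_message('{"id": 1235, "text": "world"}', store)
apply_stream_message('{"delete": {"status": {"id": 1234, "user_id": 3}}}', store)
print(sorted(store))
```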
