Thanks a lot for your reply, Bjoern.

***from twitter api wiki***
statuses/public_timeline
Returns the 20 most recent statuses from non-protected users who have set a custom user icon. The public timeline is cached for 60 seconds so requesting it more often than that is a waste of resources.
***end of api wiki***

So according to the documentation it is not a random selection. The docs say the response is cached for about a minute, yet I can see new data every 2 seconds (which is still NOT enough).

The server has already been whitelisted, but 20K requests is not going to be enough for this anyway.

As for being sneaky about it: I am not sure that would solve any of these problems.

On Jul 17, 5:28 pm, Bjoern <bjoer...@googlemail.com> wrote:
> On Jul 17, 1:57 pm, CreativeEye <creativv...@gmail.com> wrote:
>
> > 1) Get Twitter Public timeline repeatedly.
>
> My understanding is that this does not give you all tweets, just a
> random selection.
>
> > 2) Get follower network - user profiles and get their statuses.
>
> You would reach the API limit quickly, I'd expect.
>
> I don't remember the robots.txt definition very well, but I think
> twitter also disallows classic web crawlers: http://twitter.com/robots.txt
>
> > I do know Firehose is an option, but that would again be something
> > like Approach 1. right?
>
> Firehose is only an option if Twitter allows you to use it.
>
> > Please guide me how to proceed.
>
> I think there is no reliable way to get ALL tweets, though I would be
> pleased to learn otherwise. (with the exception of the Firehose, which
> I suppose one can not plan for).
>
> Maybe by being "sneaky" about it one can get a lot of tweets. For
> example by getting people to use your service to access twitter, so
> that you are using up their API limits, not your own. Or at least get
> the service whitelisted so that you can make lots of requests (I doubt
> they would be enough to get ALL tweets, though).