Thanks a lot for your reply, Bjoern.

***from Twitter API wiki***

statuses/public_timeline
Returns the 20 most recent statuses from non-protected users who have
set a custom user icon. The public timeline is cached for 60 seconds
so requesting it more often than that is a waste of resources.

***end of API wiki***

So, according to the documentation, it is not random. The docs say the
responses are cached for about a minute, yet I can see new data every
2 seconds, which is still NOT enough.
The server has already been whitelisted, but 20K requests are not going
to be enough for this anyway.
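
To put rough numbers on this, here is a minimal Python sketch. It
assumes the 20K figure is an hourly quota and that every
public_timeline call returns at most 20 statuses, as the wiki text
says. The function names and the dict-with-"id" shape of a status are
made up for illustration and are not part of any Twitter library. It
also shows how repeated polls could be de-duplicated by status id,
since consecutive responses can overlap.

```python
from typing import Dict, List, Set

# Hypothetical status shape: a dict carrying at least an integer "id".
Status = Dict[str, object]


def max_statuses_per_hour(requests_per_hour: int,
                          statuses_per_request: int = 20) -> int:
    """Upper bound on statuses collected per hour if no response overlaps."""
    return requests_per_hour * statuses_per_request


def min_poll_interval_seconds(requests_per_hour: int) -> float:
    """Shortest average gap between requests that stays inside the quota."""
    return 3600.0 / requests_per_hour


def new_statuses(batch: List[Status], seen_ids: Set[int]) -> List[Status]:
    """Return only statuses whose id was not seen before; record their ids."""
    fresh = []
    for status in batch:
        status_id = status["id"]
        if status_id not in seen_ids:
            seen_ids.add(status_id)
            fresh.append(status)
    return fresh


if __name__ == "__main__":
    quota = 20_000  # assumption: requests per hour for a whitelisted server
    print(max_statuses_per_hour(quota))
    print(min_poll_interval_seconds(quota))

    seen: Set[int] = set()
    first = new_statuses([{"id": 1}, {"id": 2}], seen)
    second = new_statuses([{"id": 2}, {"id": 3}], seen)  # overlaps on id 2
    print(len(first), len(second))
```

Under these assumptions the ceiling is 400,000 statuses per hour, and
only if no two responses overlap, so anything above that rate cannot
be captured this way.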

As for being "sneaky" about it, I am not sure that would solve any of
these problems.

On Jul 17, 5:28 pm, Bjoern <bjoer...@googlemail.com> wrote:
> On Jul 17, 1:57 pm, CreativeEye <creativv...@gmail.com> wrote:
>
> > 1) Get Twitter Public timeline repeatedly.
>
> My understanding is that this does not give you all tweets, just a
> random selection.
>
> > 2) Get follower network - user profiles and get their statuses.
>
> You would reach the API limit quickly, I'd expect.
>
> I don't remember the robots.txt definition very well, but I think
> twitter also disallows classic web crawlers: http://twitter.com/robots.txt
>
> > I do know Firehose is an option, but that would again be something
> > like Approach 1. right?
>
> Firehose is only an option if Twitter allows you to use it.
>
> > Please guide me how to proceed.
>
> I think there is no reliable way to get ALL tweets, though I would be
> pleased to learn otherwise. (with the exception of the Firehose, which
> I suppose one can not plan for).
>
> Maybe by being "sneaky" about it one can get a lot of tweets. For
> example by getting people to use your service to access twitter, so
> that you are using up their API limits, not your own. Or at least get
> the service whitelisted so that you can make lots of requests (I doubt
> they would be enough to get ALL tweets, though).
