Pascal,

These assumptions about since_id and max_id are incorrect. You can,
and must, still rely on them for fetching. The additional jitter
introduced by the ID generation scheme is statistically insignificant
and very small compared to other reordering effects in the Twitter
system. Tweets are already K-ordered over a multi-second window today,
whereas the additional K introduced by the ID generation system will
be sub-second, if not sub-millisecond, and practically irrelevant.
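
As a rough illustration of fetching with max_id, here is a minimal
Python sketch. It assumes max_id is an inclusive upper bound and that
each page comes back newest first. The names fetch_all, fetch_page and
make_fake_search are hypothetical, and the in-memory stand-in replaces
the real Search API call. IDs are treated as opaque keys: the loop
only compares them and de-duplicates the boundary tweet, with no
arithmetic on the values.

```python
from typing import Callable, Dict, List, Optional

Tweet = Dict[str, int]  # hypothetical shape; only the "id" key is used


def fetch_all(fetch_page: Callable[..., List[Tweet]],
              page_size: int = 100,
              max_pages: int = 15) -> List[Tweet]:
    """Page backwards through results, using the lowest id seen as max_id."""
    seen = set()
    results: List[Tweet] = []
    max_id: Optional[int] = None
    for _ in range(max_pages):
        page = fetch_page(max_id=max_id, count=page_size)
        # Drop anything already collected (the inclusive boundary tweet).
        new = [t for t in page if t["id"] not in seen]
        if not new:
            break  # nothing older is left
        seen.update(t["id"] for t in new)
        results.extend(new)
        # Assumption: max_id is inclusive, so the boundary tweet comes back
        # once more on the next page and is filtered out above.
        max_id = min(t["id"] for t in page)
    return results


def make_fake_search(ids: List[int]) -> Callable[..., List[Tweet]]:
    """Hypothetical in-memory stand-in for a search endpoint."""
    store = sorted(ids, reverse=True)  # newest (highest id) first

    def fetch_page(max_id: Optional[int] = None, count: int = 100) -> List[Tweet]:
        matching = [i for i in store if max_id is None or i <= max_id]
        return [{"id": i} for i in matching[:count]]

    return fetch_page
```

Because the boundary tweet is returned twice, page_size has to be at
least 2 or the loop stops after the first page.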

If you are doing repeated automated queries against the Search API,
you should transition to streaming. If you are attempting to get every
tweet that matches, which is clearly the case given the questions
below, transitioning to streaming is your only option, as search is
already filtering for relevance and this filtering will only increase
over time.

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.

2010/6/7 Pascal Jürgens <lists.pascal.juerg...@googlemail.com>:
> Good to know. Did you mean to say "consume … streaming results"? I don't 
> really see where you use the stream here.
>
> Also, please note that it's not a good idea to work with "since_id" and 
> "max_id" any more, because those will soon be (already are?) NON-SEQUENTIAL. 
> This means you will lose tweets if you rely on the IDs incrementing over 
> time. To quote the relevant email from Taylor Singletary:
>
>> Please don't depend on the exact format of the ID. As our infrastructure 
>> needs evolve, we might need to tweak the generation algorithm again.
>>
>> If you've been trying to divine meaning from status IDs aside from their 
>> role as a primary key, you won't be able to anymore. Likewise for usage of 
>> IDs in mathematical operations -- for instance, subtracting two status IDs 
>> to determine the number of tweets in between will no longer be possible
>
> Cheers.
>
> On Jun 8, 2010, at 0:06 , sahmed10 wrote:
>
>> Yes, it works! This algorithm works.
>> It's something like this:
>> Set the query to a string with appropriate To and From dates. Then
>> consume the 1500 streaming results and also save the status id of the
>> very last tweet you got. As they are in order sequentially (with gaps),
>> it won't be a problem. The very last tweet status id should be assigned
>> as the MaxId for the next set of results, and so on.
>
>
