Pascal,

These assumptions about since_id and max_id are incorrect. You can still, and must still, rely upon them for fetching. The additional jitter introduced by the ID generation scheme is statistically insignificant and very small compared to other reordering effects in the Twitter system. Tweets are k-ordered over a multiple-second window, as they are today, whereas the additional k introduced by the ID generation system will be sub-second, if not sub-millisecond, and practically irrelevant.
If you are doing repeated automated queries against the Search API, you should transition to streaming. If you are attempting to get every tweet that matches, which is clearly the case given the questions below, transitioning to streaming is your only option, as search is already filtering for relevance and this filtering will only increase over time.

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.

2010/6/7 Pascal Jürgens <lists.pascal.juerg...@googlemail.com>:
> Good to know. Did you mean to say "consume … streaming results"? I don't
> really see where you use the stream here.
>
> Also, please note that it's not a good idea to work with "since_id" and
> "max_id" any more, because those will soon be (already are?) NON-SEQUENTIAL.
> This means you will lose tweets if you rely on the IDs incrementing over
> time. To quote the relevant email from Taylor Singletary:
>
>> Please don't depend on the exact format of the ID. As our infrastructure
>> needs evolve, we might need to tweak the generation algorithm again.
>>
>> If you've been trying to divine meaning from status IDs aside from their
>> role as a primary key, you won't be able to anymore. Likewise for usage of
>> IDs in mathematical operations -- for instance, subtracting two status IDs
>> to determine the number of tweets in between will no longer be possible.
>
> Cheers.
>
> On Jun 8, 2010, at 0:06, sahmed10 wrote:
>
>> Yes, it works! The algorithm goes something like this:
>> Set the query to a string with the appropriate To and From dates. Then
>> consume the 1500 streaming results and also save the status ID of the
>> very last tweet you got. As the IDs are in sequential order (with gaps),
>> it won't be a problem. That last status ID should be assigned
>> as the max_id for the next set of results, and so on.
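The max_id walk sahmed10 describes can be sketched as follows. This is only an illustration of the paging logic, not Twitter's API: `fetch_page` is a hypothetical stand-in for a Search API call, and the page size and tweet shape are assumptions. Note that, per John's point above, IDs are only k-ordered, so this scheme bounds pages rather than counts tweets.

```python
# Sketch of max_id paging: walk backwards through results by using the
# lowest ID seen so far as the upper bound for the next request.
# `fetch_page` is a hypothetical stand-in for a Search API call.

def paginate(fetch_page, max_pages=15):
    """Collect tweets page by page.

    fetch_page(max_id) returns a list of tweets (dicts with an 'id'),
    newest first, or an empty list when nothing older remains.
    """
    tweets = []
    max_id = None
    for _ in range(max_pages):
        page = fetch_page(max_id)
        if not page:
            break
        tweets.extend(page)
        # IDs decrease (with gaps) within a result set, so the last
        # tweet's ID minus one bounds the next page.
        max_id = page[-1]["id"] - 1
    return tweets

# Demo against a fake corpus of 7 tweets, served 3 per page.
corpus = [{"id": i} for i in range(107, 100, -1)]  # IDs 107..101, newest first

def fake_fetch(max_id):
    eligible = [t for t in corpus if max_id is None or t["id"] <= max_id]
    return eligible[:3]

result = paginate(fake_fetch)
print([t["id"] for t in result])  # prints [107, 106, 105, 104, 103, 102, 101]
```

Because each request bounds the next by ID rather than by page number, the walk is stable even when new tweets arrive between requests, which is why the quoted approach works despite gaps in the ID sequence.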