[twitter-dev] Re: Streaming API time drifting problem and possible solutions

2010-07-09 Thread Larry Zhang
Thanks, everyone, for the quick replies. I have implemented a downloading
program that uses curl, and it is fast enough to avoid the time
drift.
-Larry


On Jul 8, 5:00 pm, Pascal Jürgens
lists.pascal.juerg...@googlemail.com wrote:
 Larry,

 Moreover, I assume you checked I/O and CPU load. But even if that's not the
 issue, you should absolutely check whether you have simplejson with its C
 extension installed. The version included with Python is 1.9, which is
 decidedly slower than the new 2.x branch. You might see JSON decoding load
 drop by 50% or more.

 Pascal
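
A rough way to run the check Pascal suggests is sketched below. It assumes
that both simplejson and the standard-library json package expose a
`c_make_scanner` attribute in their `scanner` submodule, which is set to None
when the C extension could not be loaded. That attribute is an internal
detail and may differ between versions; the helper name
`c_scanner_available` is made up for this example.

```python
import importlib


def c_scanner_available(module_name):
    """Report whether a JSON package appears to have its C scanner loaded.

    Returns True or False when the package is installed, and None when it
    cannot be imported at all. Relies on the internal `c_make_scanner`
    attribute (an assumption, see the note above).
    """
    try:
        scanner = importlib.import_module(module_name + ".scanner")
    except ImportError:
        return None
    return getattr(scanner, "c_make_scanner", None) is not None


if __name__ == "__main__":
    for name in ("simplejson", "json"):
        print(name, "C scanner available:", c_scanner_available(name))
```

A result of None for simplejson means it is not installed; False means only
the slower pure-Python decoder is in use.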

 On Jul 8, 2010, at 17:31, Larry Zhang wrote:



  Hi everyone,

  I have a program calling the statuses/sample method of a garden hose
  of the Streaming API, and I am experiencing the following problem: the
  timestamps of the tweets that I download constantly drift behind
  real time. The drift keeps increasing until it reaches around 25
  minutes; then I get a timeout from the request, sleep for 5
  seconds, and reset the connection. The time drift is also reset to 0
  when the connection is reset.

  One solution I have for now is to proactively reset the
  connection more frequently; e.g., if I reconnect every minute, the
  time drift will be at most 1 minute. But I am not sure whether
  this is allowed by the API.

  So could anyone tell me whether you have seen the same problem, or
  whether I am using the API in the wrong way? And is it OK to reset the
  connection every minute?

  I am using Tweepy (http://github.com/joshthecoder/tweepy) as the
  library for accessing the Streaming API.

  Thanks a lot!
  -Larry


[twitter-dev] Streaming API time drifting problem and possible solutions

2010-07-08 Thread Larry Zhang
Hi everyone,

I have a program calling the statuses/sample method of a garden hose
of the Streaming API, and I am experiencing the following problem: the
timestamps of the tweets that I download constantly drift behind
real time. The drift keeps increasing until it reaches around 25
minutes; then I get a timeout from the request, sleep for 5
seconds, and reset the connection. The time drift is also reset to 0
when the connection is reset.

One solution I have for now is to proactively reset the
connection more frequently; e.g., if I reconnect every minute, the
time drift will be at most 1 minute. But I am not sure whether
this is allowed by the API.

So could anyone tell me whether you have seen the same problem, or
whether I am using the API in the wrong way? And is it OK to reset the
connection every minute?

I am using Tweepy (http://github.com/joshthecoder/tweepy) as the
library for accessing the Streaming API.

Thanks a lot!
-Larry
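
For anyone wanting to reproduce the measurement, the drift described above
can be estimated by comparing each tweet's timestamp with the local clock.
The sketch below is not taken from Larry's program. It assumes the status
JSON carries a `created_at` string such as "Thu Jul 08 21:31:00 +0000 2010",
that the local clock is reasonably accurate, and that day and month names
are parsed in an English locale. The name `drift_seconds` is made up for
this example.

```python
from datetime import datetime, timezone

# Assumed layout of the created_at field, e.g. "Thu Jul 08 21:31:00 +0000 2010".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def drift_seconds(created_at, now=None):
    """Return how many seconds the tweet's timestamp lags behind `now`.

    `now` defaults to the current UTC time; passing it explicitly makes
    the function easy to test.
    """
    tweet_time = datetime.strptime(created_at, CREATED_AT_FORMAT)
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - tweet_time).total_seconds()


if __name__ == "__main__":
    sample = "Thu Jul 08 21:31:00 +0000 2010"
    reference = datetime(2010, 7, 8, 21, 56, 0, tzinfo=timezone.utc)
    print(drift_seconds(sample, now=reference))  # 25 minutes, in seconds
```

Logging this value for every Nth status shows whether the consumer is
keeping up: a steadily growing number suggests the client is processing
tweets more slowly than they arrive.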