You will receive limit messages when your stream is limited. They are
documented here:
http://apiwiki.twitter.com/Streaming-API-Documentation#ParsingResponses

You may need to query on a few stop words before you get the limit
messages to flow, as the proportion currently allowed is pretty large,
even at the default level.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.
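
For readers who want to act on the limit messages described above, here is a minimal sketch of how a client might separate them from ordinary statuses. It assumes the stream delivers one JSON object per line and that a limit notice has the shape {"limit": {"track": N}}; check that shape against the linked documentation. The function name split_stream is hypothetical.

```python
import json


def split_stream(lines):
    """Separate statuses from limit notices in an iterable of JSON lines.

    Assumption: a limit notice looks like {"limit": {"track": N}}.
    Returns (statuses, last_limit), where last_limit is the most recent
    "track" value seen, or None if no limit notice arrived.
    """
    statuses = []
    last_limit = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        msg = json.loads(line)
        if isinstance(msg, dict) and "limit" in msg:
            last_limit = msg["limit"].get("track")
        else:
            statuses.append(msg)
    return statuses, last_limit
```

If last_limit is ever not None, the query is being limited, and a client that prefers complete results could narrow its track terms and reconnect.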

On Sep 28, 3:10 am, Robert Chatley <rob...@metabroadcast.com> wrote:
> Hi,
>
> I also have a question regarding throttling of the streaming API when
> tracking keywords.
>
> We are successfully tracking keywords and reading messages, but would
> like to know when our query is too broad, and we are not receiving all
> the messages, so that we can back off. We would prefer to be getting
> all the messages for a finer-grained query than most of the messages
> for a broader one.
>
> Is it possible for the client to tell whether its query is being
> throttled? I checked the rate-limit data on the returned statuses, but
> these didn't seem to give useful information for the streaming API - I
> guess they only give data about GET requests to other APIs.
>
> We are using the default access level.
>
> regards,
> Robert
>
> On Sep 4, 4:20 am, John Kalucki <jkalu...@gmail.com> wrote:
>
> > Zac,
>
> > It's possible that the track filter is missing something, but there
> > are probably other misunderstandings that are clouding things.
>
> > I don't know how Tweespeed comes up with their numbers, but the
> > Streaming API only makes available a proportion of all public
> > statuses. Spam accounts, for example, are filtered out, as are
> > protected accounts, direct messages, etc. etc. My guess is that
> > Tweespeed is assuming that status_ids are assigned sequentially and
> > they are just reporting the velocity of that column.
>
> > Your estimate that 40% of tweets contain a link seems more than 2x too
> > high. You can come up with a very accurate number by collecting a
> > sampled feed for a few hours or days (there are diurnal and daily
> > patterns to everything on Twitter) and dividing out. Even 10 minutes
> > of the default sampled feed (the old "spritzer") will give you an
> > idea.
>
> > Without knowing your sample size, day of week, or time of day, I'd say
> > that your reported matches per minute and limited statuses per minute
> > are pretty good. I don't think you are missing much, if anything,
> > other than the statuses reported by the limit message.
>
> > As a double check, I just ran a quick test with the highest level of
> > track and compared the result against the firehose. In a one minute
> > sample, the track feed had matched the same tweets as the firehose
> > piped to 'grep -i http'.
>
> > -John Kalucki
> > http://twitter.com/jkalucki
> > Services, Twitter Inc
>
> > On Sep 3, 7:23 pm, Zac Witte <zacwi...@gmail.com> wrote:
>
> > > I'm not sure the filter is actually catching everything that I'm
> > > supposedly tracking. There are ~20,000 tweets per minute right now
> > > according to tweespeed. I'm getting about 1000 tweets/m and skipping
> > > on average 1500 tweets/m according to the limit notifications. That
> > > means my filter is matching about 12.5% of all tweets, but I'm
> > > tracking "http" and supposedly 40% of all tweets contain a link so my
> > > filter would seem to be missing the majority of all links. Is this
> > > making sense?
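
The 12.5% figure in the oldest message can be reproduced as follows. This is only a sketch of the arithmetic using the numbers quoted in the thread; the function name matched_share is hypothetical, and the 20,000 tweets/minute total is Tweespeed's figure, which (as noted in the reply) may not be comparable to what the Streaming API exposes.

```python
def matched_share(delivered_per_min, limited_per_min, total_per_min):
    """Fraction of all tweets the filter matched.

    Matched tweets = those delivered plus those skipped and reported
    by limit notifications.
    """
    return (delivered_per_min + limited_per_min) / total_per_min


# Numbers quoted in the thread: 1000 delivered, 1500 limited, ~20,000 total.
share = matched_share(1000, 1500, 20000)
print(f"{share:.1%}")  # 12.5%
```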
