First, the Gnip Power Track documentation
http://docs.gnip.com/w/page/35663947/Power-Track says in its "has:geo"
section: <<Currently, 'has:geo' is about 2-4% of the full
firehose>>.

Also, I ran some tests a few weeks ago to compare the content returned
by the Search API and the Streaming API for equivalent geolocated
searches. See this thread:
http://groups.google.com/group/twitter-development-talk/browse_thread/thread/a4bf3b7c6373657b#

My results showed that the Streaming API returns a very small fraction
(about 3% in my tests) of what the Search API returns. This is because the
Streaming API only uses the geotagging API to locate tweets, while the
Search API uses both the geotagging API and the user location field.
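
To make the difference concrete, here is a minimal sketch of the two matching rules as described above. Everything in it is an assumption made for illustration: the tweet dictionaries, the field names (`coordinates`, `user_location`), the helper functions and the rough San Francisco box are hypothetical, not actual Twitter API code.

```python
# Hypothetical illustration: the field names and helper functions below are
# assumptions made for this sketch, not real Twitter API code.

def in_bounding_box(point, box):
    """Return True if point (lon, lat) lies in box (west, south, east, north)."""
    lon, lat = point
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def streaming_match(tweet, box):
    """Streaming-style rule: only an explicit geotag can match the box."""
    coords = tweet.get("coordinates")
    return coords is not None and in_bounding_box(coords, box)


def search_match(tweet, box, place_names):
    """Search-style rule: a geotag in the box OR a matching user location."""
    if streaming_match(tweet, box):
        return True
    location = (tweet.get("user_location") or "").lower()
    return any(name in location for name in place_names)


# Rough 1-degree box around San Francisco (west, south, east, north).
SF_BOX = (-122.75, 36.8, -121.75, 37.8)
SF_NAMES = ["san francisco"]

# Made-up sample tweets, only to exercise the two rules.
tweets = [
    {"coordinates": (-122.42, 37.77), "user_location": "San Francisco, CA"},
    {"coordinates": None, "user_location": "San Francisco"},
    {"coordinates": None, "user_location": "Paris"},
]

streaming_hits = [t for t in tweets if streaming_match(t, SF_BOX)]
search_hits = [t for t in tweets if search_match(t, SF_BOX, SF_NAMES)]
print(len(streaming_hits), len(search_hits))
```

With these three made-up tweets, the streaming-style rule matches only the geotagged one, while the search-style rule also picks up the tweet whose profile location names San Francisco.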

For example, I can get around 250,000 tweets/day for San Francisco
using the Search API, while the Streaming API returns around 7,000
tweets/day.

At 7,000 tweets/day for San Francisco alone, 50,000 for the whole US
seems small.
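
The back-of-the-envelope numbers in this thread can be checked with a few lines of Python. The figures are the ones given in this message and in the quoted one below (the San Francisco test counts, the ~140M tweets/day firehose rate, the 3-4% geotagged share and the guessed 50% US share); the variable names are only for illustration.

```python
# Figures taken from the thread; the 50% US share is a guess, not a measurement.
sf_search_per_day = 250_000      # Search API, San Francisco
sf_streaming_per_day = 7_000     # Streaming API, San Francisco
firehose_per_day = 140_000_000   # approximate firehose rate

# Share of the search results that the streaming test returned.
ratio = sf_streaming_per_day / sf_search_per_day
print(f"streaming/search for SF: {ratio:.1%}")

# Geotagged tweets per day at 3% and 4% of the firehose (integer arithmetic).
geotagged_low = firehose_per_day * 3 // 100
geotagged_high = firehose_per_day * 4 // 100
print(f"geotagged: {geotagged_low:,} to {geotagged_high:,} tweets/day")

# Assumed 50% of geotagged tweets coming from the US.
us_low = geotagged_low // 2
us_high = geotagged_high // 2
print(f"US estimate: {us_low:,} to {us_high:,} tweets/day")
```

Under these assumptions the US estimate is in the millions of tweets per day, well above the 50,000 reported.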

Colin

On Apr 1, 2:40 pm, Augusto Santos <augu...@gemeos.org> wrote:
> Sorry Colin, but where did you get this information? Doesn't match with the
> reality. Not at all.
>
> On Fri, Apr 1, 2011 at 12:35 PM, Colin Surprenant <colin.surpren...@gmail.com> wrote:
> > As a side note, currently only 3-4% of the total tweets (firehose) are
> > geo-tagged and are eligible to be selected in a stream location
> > bounding box. If the current firehose rate is about 140M tweets/day,
> > that makes ~5M eligible tweets/day.
>
> > I do not know what proportion of tweets comes from the US, but I would
> > think 50% seems reasonable, which would result in ~2.5M tweets/day. Even
> > if we lower that proportion, your 50 000 tweets/day seems way off.
>
> > There are 3 possibilities: 1) you are being rate limited more than you
> > think, 2) your bounding box is wrong, or 3) your bounding box is too
> > large and Twitter has reduced it somehow. I remember reading somewhere
> > in the API doc that each bounding box could not be more than 1 degree
> > square, "enough to cover most metropolitan areas", but I cannot find
> > it again.
>
> > Colin
>
> > On Mar 31, 4:08 pm, Data Gatherer <gatherer...@gmail.com> wrote:
> > > We have a bounding box set for the United States. Even though it's a
> > > large box, we only receive about 50,000 tweets a day. However, I see
> > > that we get rate limited at least once a week already. The box is
> > > large, but the number of matching results is fairly low.  Knowing how
> > > the rate limiting works more specifically would be important when
> > > trying to gather data for other projects (more bounding boxes, other
> > > keywords).
>
> > > On Mar 31, 3:50 pm, Jeremy Dunck <jdu...@gmail.com> wrote:
>
> > > > On Thu, Mar 31, 2011 at 2:48 PM, Augusto Santos <augu...@gemeos.org> wrote:
> > > > > No it won't. Streaming is rate limited at around 1% of the firehose if
> > > > > your search term is too generic.
> > > > > If your search term or bounding box matches too many tweets, you will
> > > > > start receiving 'limit' status messages, as the doc says.
> > > > > http://dev.twitter.com/pages/streaming_api_concepts#parsing-responses
>
> > > > Sure, I understand that, I just meant to say that 1% of all tweets is
> > > > a lot (140M average per day now).
>
> > > > If your terms are not very general, you have a lot of head room.
>
> > --
> > Twitter developer documentation and resources: http://dev.twitter.com/doc
> > API updates via Twitter: http://twitter.com/twitterapi
> > Issues/Enhancements Tracker:
> > http://code.google.com/p/twitter-api/issues/list
> > Change your membership to this group:
> > http://groups.google.com/group/twitter-development-talk
>
> --
> 氣

