The "existing" APIs stopped providing accurate data about a year ago
and degraded substantially over a period of just a few months. Now the
only data store for social graph data requires cursors to access
complete sets. Pagination is just not possible with the same latency
at this scale without increasing cost by an order of magnitude or two.
So, instead of hardware "units" in the tens and hundreds, think
thousands and tens of thousands.

These APIs and their now-decommissioned backing stores were developed
when having 20,000 followers was a lot. We're an order of magnitude or
two beyond that point along nearly every dimension: accounts,
followers per account, tweets per second, and so on. As systems
evolve, some evolutionary paths become extinct.

Given boundless resources, the best we could do for a REST API, as
Marcel has alluded to, is to do the cursoring for you and aggregate
many blocks into much larger responses. This wouldn't work very well
for at
least two immediate reasons: 1) Running a system with multimodal
service times is a nightmare -- we'd have to provision a specific
endpoint for such a resource. 2) Ruby GC chokes on lots of objects.
We'd have to consider implementing this resource in another stack, or
do a lot of tuning. All this to build the opposite of what most
applications want: a real-time stream of graph deltas for a set of
accounts, or the list of recent set operations since the last poll --
and rarely, if ever, the entire following set.
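
For reference, the client-side version of "doing the cursoring" can be
sketched as below. This is a minimal illustration, not our
implementation. It assumes each page is a dict with an "ids" list and
a "next_cursor" integer, that the first request uses cursor -1, and
that a next_cursor of 0 marks the last page. fetch_page is a
hypothetical callable standing in for whatever HTTP call the
application makes.

```python
from typing import Callable, Dict, Iterator, List

# Assumed page shape: {"ids": [...], "next_cursor": <int>}
Page = Dict[str, object]


def iter_ids(fetch_page: Callable[[int], Page],
             start_cursor: int = -1) -> Iterator[int]:
    """Yield ids one page at a time until next_cursor is 0.

    fetch_page is a hypothetical helper: it takes a cursor and returns
    one page of results in the assumed shape above.
    """
    cursor = start_cursor
    while cursor != 0:
        page = fetch_page(cursor)
        yield from page["ids"]
        cursor = page["next_cursor"]


def fetch_all_ids(fetch_page: Callable[[int], Page]) -> List[int]:
    """Aggregate every page into one list (the 'much larger response')."""
    return list(iter_ids(fetch_page))


if __name__ == "__main__":
    # In-memory stand-in for the remote API, for illustration only.
    fake_pages = {
        -1: {"ids": [1, 2, 3], "next_cursor": 10},
        10: {"ids": [4, 5], "next_cursor": 0},
    }
    print(fetch_all_ids(lambda cursor: fake_pages[cursor]))
```

Using the generator rather than the aggregated list lets the
application start processing ids before the last page arrives, and
avoids holding one very large response in memory.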

Also, I'm a little rusty on the details of the social graph API, but
please detail which public resources allow retrieval of 40,000
followers in two seconds. I'd be very interested in looking at the
implementing code on our end. A curl timing would be nice too (time
curl URL > /dev/null).
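
If curl isn't handy, roughly the same measurement can be taken from a
script. The sketch below is an assumption-laden stand-in for the curl
command, not an official tool: time_fetch and ids_per_second are
hypothetical helper names, and the URL is supplied by the caller on
the command line.

```python
import sys
import time
import urllib.request
from typing import Callable, Tuple


def time_fetch(fetch: Callable[[], bytes]) -> Tuple[float, int]:
    """Run fetch once; return (elapsed seconds, response size in bytes)."""
    start = time.perf_counter()
    body = fetch()
    elapsed = time.perf_counter() - start
    return elapsed, len(body)


def ids_per_second(id_count: int, elapsed_seconds: float) -> float:
    """Convert a count of ids and a wall-clock time into a rate."""
    return id_count / elapsed_seconds


if __name__ == "__main__":
    # Usage: python timing.py URL
    url = sys.argv[1]
    elapsed, size = time_fetch(lambda: urllib.request.urlopen(url).read())
    print(f"{size} bytes in {elapsed:.2f} s")
```

As a sanity check on the numbers in this thread, 40,000 ids in 10
seconds works out to ids_per_second(40000, 10) = 4,000 ids per second.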

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.


On Mon, Jan 4, 2010 at 9:18 PM, PJB <pjbmancun...@gmail.com> wrote:
>
>
> On Jan 4, 8:58 pm, John Kalucki <j...@twitter.com> wrote:
>> at the moment). So, it seems that we're returning the data over home
>> DSL at between 2,500 and 4,000 ids per second, which seems like a
>> perfectly reasonable rate and variance.
>
> It's certainly not reasonable to expect it to take 10+ seconds to get
> 25,000 to 40,000 ids, PARTICULARLY when existing methods, for whatever
> reason, return the same data in less than 2 seconds.  Twitter is being
> incredibly short-sighted if they think this is indeed reasonable.
>
> Some of us have built applications around your EXISTING APIs, and to
> now suggest that we may need formal "business relationships" to
> continue to use such APIs is seriously disquieting.
>
> Disgusted...
>
>
>
