Some quick benchmarks... I grabbed the entire social graph for ~250 users, where each user has between 0 and 80,000 friends/followers. For each fetch I picked at random between the cursor and cursorless API methods.
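For anyone following along, the cursor path being timed amounts to a loop like the one below. This is only a sketch under stated assumptions, not necessarily the code behind these numbers: `fetch_page` and `make_fake_api` are made-up names standing in for the real HTTP request, and it assumes the convention where a cursor of -1 requests the first page and a `next_cursor` of 0 marks the last one. The cursorless method, by contrast, is a single call.

```python
import time


def fetch_all_ids(fetch_page, max_retries=3, retry_delay=0.0):
    """Collect every id by walking cursor pages one request at a time.

    fetch_page is a hypothetical callable standing in for the HTTP call:
    it takes a cursor and returns a dict with 'ids' and 'next_cursor'.
    """
    ids = []
    cursor = -1  # assumed convention: -1 asks for the first page
    while cursor != 0:  # assumed convention: next_cursor == 0 means done
        for attempt in range(max_retries):
            try:
                page = fetch_page(cursor)
                break
            except IOError:
                # Retry a failed page; give up after max_retries attempts.
                if attempt == max_retries - 1:
                    raise
                time.sleep(retry_delay)
        ids.extend(page["ids"])
        cursor = page["next_cursor"]
    return ids


def make_fake_api(all_ids, page_size=5000):
    """In-memory stand-in so the sketch runs without a network.

    Here the cursor is just a list offset; real cursors are opaque values.
    """
    def fetch_page(cursor):
        start = 0 if cursor == -1 else cursor
        end = start + page_size
        next_cursor = end if end < len(all_ids) else 0
        return {"ids": all_ids[start:end], "next_cursor": next_cursor}
    return fetch_page
```

With 5000 ids per page, a user with 80,000 ids needs 16 sequential round trips through this loop, each one waiting on the previous page's `next_cursor`.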
< 5000 ids
cursor: 0.72 avg seconds
cursorless: 0.51 avg seconds

5000 to 10,000 ids
cursor: 1.42 avg seconds
cursorless: 0.94 avg seconds

1 to 80,000 ids
cursor: 2.82 avg seconds
cursorless: 1.21 avg seconds

5,000 to 80,000 ids
cursor: 4.28 avg seconds
cursorless: 1.59 avg seconds

10,000 to 80,000 ids
cursor: 5.23 avg seconds
cursorless: 1.82 avg seconds

20,000 to 80,000 ids
cursor: 6.82 avg seconds
cursorless: 2 avg seconds

40,000 to 80,000 ids
cursor: 9.5 avg seconds
cursorless: 3 avg seconds

60,000 to 80,000 ids
cursor: 12.25 avg seconds
cursorless: 3.12 avg seconds

On Jan 4, 7:58 pm, Jesse Stay <jesses...@gmail.com> wrote:
> Ditto PJB :-)
>
> On Mon, Jan 4, 2010 at 8:12 PM, PJB <pjbmancun...@gmail.com> wrote:
>
> > I think that's like asking someone: why do you eat food? But don't say
> > because it tastes good or nourishes you, because we already know
> > that! ;)
>
> > You guys presumably set the 5000 ids per cursor limit by analyzing
> > your user base and noting that one could still obtain the social graph
> > for the vast majority of users with a single call.
>
> > But this is a bit misleading. For analytics-based apps, who aim to do
> > near real-time analysis of relationships, the focus is typically on
> > consumer brands who have a far larger than average number of
> > relationships (e.g., 50k - 200k).
>
> > This means that those apps are neck-deep in cursor-based stuff, and
> > quickly realize the existing drawbacks, including, in order of
> > significance:
>
> > - Latency. Fetching ids for a user with 3000 friends is comparable
> > between the two calls. But as you increment past 5000, the speed
> > quickly peaks at a 5+x difference (I will include more benchmarks in a
> > short while). For example, fetching 80,000 friends via the get-all
> > method takes on average 3 seconds; it takes, on average, 15 seconds
> > with cursors.
>
> > - Code complexity & elegance. I would say that there is a 3x increase
> > in code lines to account for cursors, from retrying failed cursors, to
> > caching to account for cursor slowness, to UI changes to coddle
> > impatient users.
>
> > - Incomprehensibility. While there are obviously very good reasons
> > from Twitter's perspective (performance) to the cursor based model,
> > there really is no apparent obvious benefit to API users for the ids
> > calls. I would make the case that a large majority of API uses of the
> > ids calls need and require the entire social graph, not an incomplete
> > one. After all, we need to know what new relationships exist, but
> > also what old relationships have failed. To dole out the data in
> > drips and drabs is like serving a pint of beer in sippy cups. That is
> > to say: most users need the entire social graph, so what is the use
> > case, from an API user's perspective, of NOT maintaining at least one
> > means to quickly, reliably, and efficiently get it in a single call?
>
> > - API Barriers to entry. Most of the aforementioned arguments are
> > obviously from an API user's perspective, but there's something, too,
> > for Twitter to consider. Namely, by increasing the complexity and
> > learning curve of particular API actions, you presumably further limit
> > the pool of developers who will engage with that API. That's probably
> > a bad thing.
>
> > - Limits Twitter 2.0 app development. This, again, speaks to issues
> > bearing on speed and complexity, but I think it is important. The
> > first few apps in any given media or innovation invariably have to do
> > with basic functionality building blocks -- tweeting, following,
> > showing tweets. But the next wave almost always has to do with
> > measurement and analysis. By making such analysis more difficult, you
> > forestall the critically important ability for brands, and others, to
> > measure performance.
>
> > - API users have requested it. Shouldn't, ultimately, the use case
> > for a particular API method simply be the fact that a number of API
> > developers have requested that it remain?
>
> > On Jan 4, 2:07 pm, Wilhelm Bierbaum <wilh...@twitter.com> wrote:
> > > Can everyone contribute their use case for this API method? I'm trying
> > > to fully understand the deficiencies of the cursor approach.
>
> > > Please don't include that cursors are slow or that they are charged
> > > against the rate limit, as those are known issues.
>
> > > Thanks.