Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-18 Thread Jesse Stay
On Sun, Jan 17, 2010 at 12:54 PM, Abraham Williams 4bra...@gmail.com wrote:

 From the numbers I've seen in this thread, more than 95% of accounts are
 followed less than 25k times. It would not seem to make sense for Twitter
 to support returning more than 25k ids per call, especially since there
 are only ~775 accounts with more than 100k followers:
 http://twitterholic.com/top800/followers/

 Abraham


Yet, those 775 accounts have the potential to reach 775,000+ members of
Twitter's user base (more, considering the number of retweets they each
get). When they're dissatisfied, people hear.  IMO those are the ones
Twitter should be going out of their way to satisfy.  Add to that the fact
that many of those are the ones willing to pay the biggest bucks when/if
Twitter implements a business account, and they could also be a contributing
factor to Twitter's revenue model in the future.  It makes total sense for
Twitter to support those ~775 accounts.  If they're ignored, they'll take
their followers with them.

Jesse


Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-18 Thread Tim Haines


 Yet, those 775 accounts have the potential to reach 775,000+ members of
 Twitter's user base (more, considering the number of retweets they each
 get). When they're dissatisfied, people hear.  IMO those are the ones
 Twitter should be going out of their way to satisfy.  Add to that the fact
 that many of those are the ones willing to pay the biggest bucks when/if
 Twitter implements a business account, and they could also be a contributing
 factor to Twitter's revenue model in the future.  It makes total sense for
 Twitter to support those ~775 accounts.  If they're ignored, they'll take
 their followers with them.

 Jesse


Getting way off topic, but I think you're wrong here.  They won't be taking
their followers anywhere.  Commonly, the majority of such a large number of
followers aren't engaged followers.
http://dashes.com/anil/2010/01/nobody-has-a-million-twitter-followers.html
Anil's blog post matches my own experience with traffic fluctuations after
receiving tweets.

Tim.


Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-17 Thread Abraham Williams
From the numbers I've seen in this thread, more than 95% of accounts are
followed less than 25k times. It would not seem to make sense for Twitter
to support returning more than 25k ids per call, especially since there
are only ~775 accounts with more than 100k followers:
http://twitterholic.com/top800/followers/

Abraham

On Sat, Jan 16, 2010 at 04:06, st...@implu.com wrote:

 Can we get a decision on this issue? Will a cursor eventually be a
 required element? If not, what will the default return be? If a cursor
 is required, what will be the number of social graph elements
 returned?

 I guess I'm cool with a required cursor so long as cursor=-1 returns
 100k, for example.

 A decision please.

 Cheers,

 Steve

 On Jan 8, 9:24 pm, st...@implu.com wrote:
  Here are some rough numbers... x is the number of Twitter users with a
  follower count of...
 
          x >= 100k:           714    0.007%
   75k <= x < 100k:            151    0.001%
   50k <= x <  75k:            411    0.004%
   25k <= x <  50k:          2,044    0.020%
    0 <  x <  25k:       10,009,489  96.529%
 
  Total:  10,369,396
 
  So I would agree that 100k would be sufficient for our needs.
 
  -Steve
 
  On Jan 8, 3:38 pm, Dossy Shiobara do...@panoptic.com wrote:
 
   100k, at the minimum.
 
   On 1/8/10 3:35 PM, Wilhelm Bierbaum wrote:
 
How much larger do you think makes it easier?
 
On Jan 7, 6:42 pm, st...@implu.com wrote:
I would agree with several views expressed in various posts here.
 
1) A cursor-less call that returns all IDs makes for simpler code and
fewer API calls. i.e. less processing time.
 
2) If we must have a 'cursored' call then at least allow for cursor=-1
to return a larger number than 5k.
 
   --
   Dossy Shiobara  | do...@panoptic.com |http://dossy.org/
   Panoptic Computer Network   |http://panoptic.com/
 He realized the fastest way to change is to laugh at your own
   folly -- then you can let go and quickly move on. (p. 70)
 
 




-- 
Abraham Williams | Moved to Seattle | May cause email delays
Project | Intersect | http://intersect.labs.poseurtech.com
Hacker | http://abrah.am | http://twitter.com/abraham
This email is: [ ] shareable [x] ask first [ ] private.


Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-08 Thread Dossy Shiobara
100k, at the minimum.

On 1/8/10 3:35 PM, Wilhelm Bierbaum wrote:
 How much larger do you think makes it easier?
 
 On Jan 7, 6:42 pm, st...@implu.com wrote:
 I would agree with several views expressed in various posts here.

 1) A cursor-less call that returns all IDs makes for simpler code and
 fewer API calls. i.e. less processing time.

 2) If we must have a 'cursored' call then at least allow for cursor=-1
 to return a larger number than 5k.

-- 
Dossy Shiobara  | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)


Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-08 Thread John Kalucki
What proportion of your users have more than 5k followers? More than 25k
followers?

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.


On Fri, Jan 8, 2010 at 2:57 PM, DustyReagan dustyrea...@gmail.com wrote:

 As large as possible. 100k would be a huge improvement.

 For FriendOrFollow.com I need the user's entire social graph to
 effectively calculate who's not following them back, who they're not
 following back, and their mutual friendships. I can't really cache
 this data because users make decisions on who to follow and unfollow
 based on my data. If the data is old, I start hearing complaints about
 how a user unfollowed someone who was really following them, etc. So
 the data really needs to be pulled on page load. The 5k-at-a-time
 cursors pretty much cripple FriendOrFollow for anyone with an
 impressive amount of followers, and it also takes too many API calls
 to be rate limit effective. The more IDs that can be pulled at once,
 the better.

 If I could have the user's IDs streamed to me like the streaming API
 does tweets, that would be pretty hot.
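 
 The relationship math itself is simple set arithmetic once the full id
 lists are in hand; the pain is purely in fetching them. A minimal
 sketch in Python (the two input lists are assumed to have been pulled
 elsewhere; the function name is illustrative):
 
     def relationship_breakdown(friend_ids, follower_ids):
         friends = set(friend_ids)      # accounts the user follows
         followers = set(follower_ids)  # accounts following the user
         return {
             "mutual": friends & followers,              # follow each other
             "not_following_back": friends - followers,  # user follows, no follow-back
             "not_followed_back": followers - friends,   # they follow, user doesn't
         }
 
     # e.g. relationship_breakdown([1, 2, 3], [2, 3, 4])["mutual"] == {2, 3}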

 On Jan 8, 2:38 pm, Dossy Shiobara do...@panoptic.com wrote:
  100k, at the minimum.
 
  On 1/8/10 3:35 PM, Wilhelm Bierbaum wrote:
 
   How much larger do you think makes it easier?
 
  On Jan 7, 6:42 pm, st...@implu.com wrote:
   I would agree with several views expressed in various posts here.
 
   1) A cursor-less call that returns all IDs makes for simpler code and
   fewer API calls. i.e. less processing time.
 
   2) If we must have a 'cursored' call then at least allow for cursor=-1
   to return a larger number than 5k.
 
  --
  Dossy Shiobara  | do...@panoptic.com |http://dossy.org/
  Panoptic Computer Network   |http://panoptic.com/
He realized the fastest way to change is to laugh at your own
  folly -- then you can let go and quickly move on. (p. 70)



Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-08 Thread Dossy Shiobara
On 1/8/10 5:59 PM, John Kalucki wrote:
 What proportion of your users have more than 5k followers? More than 25k
 followers?

Good point ...

| grouping     | percent |
+--------------+---------+
| 0-4,999      |    72.7 |
| 5,000-24,999 |    22.3 |
| 25,000+      |     5.0 |

I think 27% of users is large enough to pay attention to ... ?


-- 
Dossy Shiobara  | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)


Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-06 Thread Josh Roesslein
Not really sure how capping followers would be of much benefit. A
better solution might be better garbage collection of inactive or spam
accounts. I believe Twitter already does this, maybe not the best it
could, but there is something in place. Capping the follower limit will
hurt users who actually want to follow the user but are no longer able
to do so because the account has already been flooded with other
accounts, some of these being old followers who no longer use Twitter
or spam bots that got by the anti-spam measures.

From a technical standpoint on Twitter's end, followers are not a
really intense calculation. Friends, on the other hand, are, since you
need to query every one of them to build the home timeline. Followers
have no timeline. So I'm not sure I see any gains there from capping
followers.

Just my two cents,

Josh

On Wed, Jan 6, 2010 at 7:36 AM, Dewald Pretorius dpr...@gmail.com wrote:
 This blog post by Anil Dash makes an excellent case for why Twitter
 should cap the number of followers that a Twitter account can have. It
 will make life easier for everyone.

 http://bit.ly/6Al7TU



Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-06 Thread Marcel Molina
That post is a follow-up to his argument for why the SUL doesn't represent
as much value as some might perceive it to. It's an argument for getting rid
of the SUL as it's currently implemented. There are only 500 or so people on
the SUL. Non SUL users with as many followers, though rare, likely have a
far more engaged set of followers.

We're actively devising various mechanisms for making it a lot easier (and
faster) to consume someone's full follower list.

On Wed, Jan 6, 2010 at 5:36 AM, Dewald Pretorius dpr...@gmail.com wrote:

 This blog post by Anil Dash makes an excellent case for why Twitter
 should cap the number of followers that a Twitter account can have. It
 will make life easier for everyone.

 http://bit.ly/6Al7TU




-- 
Marcel Molina
Twitter Platform Team
http://twitter.com/noradio


Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-05 Thread Jesse Stay
If I can suggest keeping it backwards-compatible, that would make much more
sense.  I think we're all aware that it breaks over 200,000 or so followers.
So what if you kept the cursor-less nature, treated it like a cursor, but set
the returned cursor cap to 200,000 per cursor?  Or if it needs to be
smaller (again, I think it would take much less bandwidth and processing time
to just keep it a high, sustainable number rather than having to traverse
multiple times to get that), maybe just return only the last 200,000 if no
cursor is specified?  This way those that aren't aware of the change aren't
affected, new methods can be put into place, documentation can be updated to
reflect the deprecated methods, and everyone's happy.

I'm a little surprised at the surprise by the Twitter team here. If you guys
need an account on one of my servers to test this stuff I'm happy to
provide one. :-)  Hopefully you guys can trust us as much as we trust you.  I'm
always happy to provide examples and help though.  I recognize you guys are
all working your tails off there. (I say this as I wear my Twitter shirt
proudly)

Jesse

On Tue, Jan 5, 2010 at 1:35 AM, John Kalucki j...@twitter.com wrote:

 And so it is. Given the system implementation, I'm quite surprised
 that the cursorless call returns results with acceptable reliability,
 especially during peak system load. The documentation attempts to
 convey that the cursorless approach is risky: "all IDs are attempted
 to be returned, but large sets of IDs will likely fail with timeout
 errors."  When documentation says "attempted" and "fail with timeout
 errors", it doesn't take too much reading between the lines to infer
 that this is a best-effort call. Building upon a risky dependency has,
 well, risks. (The passive voice, on the other hand, is a lowly crime.)

 I also agree that the cursored approach as currently implemented is
 quite problematic. To increase throughput, I'd support increasing the
 block size somewhat, but the boundless behavior of the cursorless
 unauthenticated call just has to go. The combination of these changes
 should reduce both query and memory pressure on the front end, which,
 in theory, if not in practice, should lead to a better overall
 experience. I'd imagine that there are complications, and numbers to
 be run, and trade-offs to be made.

 Trust that the platform people are trading-off many competing
 interests and that there isn't a single capricious bone in their
 collective body.

 -John Kalucki
 http://twitter.com/jkalucki
 Services, Twitter Inc.


 On Mon, Jan 4, 2010 at 10:40 PM, PJB pjbmancun...@gmail.com wrote:
 
  As noted in this thread, the fact that cursor-less methods for friends/
  followers ids will be deprecated was newly announced on December 22.
 
  In fact, the API documentation still clearly indicates that cursors
  are optional, and that their absence will return a complete social
  graph.  E.g.:
 
  http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-followers%C2%A0ids
 
  (If the cursor parameter is not provided, all IDs are attempted to be
  returned)
 
  The example at the bottom of that page gives a good example of
  retrieving 300,000+ ids in several seconds:
 
  http://twitter.com/followers/ids.xml?screen_name=dougw
 
  Of course, retrieving 20-40k users is significantly faster.
 
  Again, many of us have built apps around cursor-less API calls.  To
  now deprecate them, with just a few days warning over the holidays, is
  clearly inappropriate and uncalled for.  Similarly, to announce that
  we must now expect 5x slowness when doing the same calls, when these
  existing methods work well, is shocking.
 
  Many developers live and die by the API documentation.  It's a really
  fouled-up situation when the API documentation is so totally wrong,
  right?
 
  I urge those folks addressing this issue to preserve the cursor-less
  methods.  Barring that, I urge them to return at least 25,000 ids per
  cursor (as you note, time progression has made 5000 per call
  antiquated and ineffective for today's Twitter user) and grant at
  least 3 months before deprecation.
 
  On Jan 4, 10:23 pm, John Kalucki j...@twitter.com wrote:
  The existing APIs stopped providing accurate data about a year ago
  and degraded substantially over a period of just a few months. Now the
  only data store for social graph data requires cursors to access
  complete sets. Pagination is just not possible with the same latency
  at this scale without an order of magnitude or two increase in cost.
  So, instead of hardware units in the tens and hundreds, think about
  the same in the thousands and tens of thousands.
 
  These APIs and their now decommissioned backing stores were developed
  when having 20,000 followers was a lot. We're an order of magnitude or
  two beyond that point along nearly every dimension. Accounts.
  Followers per account. Tweets per second. Etc. As systems evolve, some
  evolutionary paths become extinct.
 
  Given 

Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-05 Thread John Kalucki
That sounds like a good overall technique. It's very best-effort. I'm
concerned about implementation details though. The webserver may
defensively time out the connection a lot, and tight coordination
between container and process is difficult to manage in our stack. And
by difficult, I mean intractably difficult. So, you might see a
premature 503 if the processing is too aggressive and, on average,
receive even less data. It might be best to artificially limit the
processing before the safety kicks in.

In the end, this effort would be more usefully applied towards
streaming social graph deltas.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.


On Tue, Jan 5, 2010 at 7:31 AM, Dewald Pretorius dpr...@gmail.com wrote:
 John,

 To try and make it as transparent and seamless for external developers
 as possible, I propose the following solution.

 Change the API layer so that it returns as many ids as possible in the
 first API call, regardless of whether cursor=-1 is present or omitted.
 If your system is able to return the entire social graph, then simply
 set next_cursor to 0. If it could return only a subset of ids, then
 set next_cursor to its next value.

 The benefits are:

 a) You can dynamically manage the number of ids returned in the first
 call, in accordance with system load at the time of the call.

 b) We, as developers, only need to check for next_cursor > 0 to know
 whether we got the full set or whether we need to make subsequent
 cursored calls.
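 
 A minimal client-side sketch of that contract, where fetch_page() is a
 hypothetical wrapper around the HTTP call and next_cursor follows the
 proposed semantics (0 means the set is complete):
 
     def fetch_all_ids(screen_name, fetch_page):
         ids, cursor = [], -1
         while True:
             page = fetch_page(screen_name, cursor)  # {"ids": [...], "next_cursor": int}
             ids.extend(page["ids"])
             cursor = page["next_cursor"]
             if cursor == 0:  # server signals the full set was returned
                 return ids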

 On Jan 5, 10:34 am, John Kalucki j...@twitter.com wrote:
 Jesse,

 My surprise shouldn't be a surprise. I'm sure the platform team is
 well aware of the issues.

 The fact that it works at 200k users could very well be inherently
 unstable. Minor changes to the system elsewhere could cause this
 number to drop without anyone knowing. We don't monitor this "breaks
 at" threshold in production, and we certainly don't manage the cluster
 to preserve such a threshold. I'd doubt that this is testable in
 development. In practice, should we support this, it could be
 difficult to guarantee such a high threshold as various systems
 approach their capacity limits. The most reliable approach is to make
 all calls approximately the same cost and manage the system to
 provide smooth delivery at that cost per request.

 -John Kalucki
 http://twitter.com/jkalucki
 Services, Twitter Inc.

 On Tue, Jan 5, 2010 at 1:09 AM, Jesse Stay jesses...@gmail.com wrote:
  If I can suggest keeping it backwards-compatible, that would make much
  more sense.  I think we're all aware that it breaks over 200,000 or so
  followers.  So what if you kept the cursor-less nature, treated it like
  a cursor, but set the returned cursor cap to 200,000 per cursor?  Or if
  it needs to be smaller (again, I think it would take much less bandwidth
  and processing time to just keep it a high, sustainable number rather
  than having to traverse multiple times to get that), maybe just return
  only the last 200,000 if no cursor is specified?  This way those that
  aren't aware of the change aren't affected, new methods can be put into
  place, documentation can be updated to reflect the deprecated methods,
  and everyone's happy.

  I'm a little surprised at the surprise by the Twitter team here. If you
  guys need an account on one of my servers to test this stuff I'm happy
  to provide one. :-)  Hopefully you guys can trust us as much as we trust
  you.  I'm always happy to provide examples and help though.  I recognize
  you guys are all working your tails off there. (I say this as I wear my
  Twitter shirt proudly)
  Jesse

  On Tue, Jan 5, 2010 at 1:35 AM, John Kalucki j...@twitter.com wrote:

  And so it is. Given the system implementation, I'm quite surprised
  that the cursorless call returns results with acceptable reliability,
  especially during peak system load. The documentation attempts to
  convey that the cursorless approach is risky: "all IDs are attempted
  to be returned, but large sets of IDs will likely fail with timeout
  errors."  When documentation says "attempted" and "fail with timeout
  errors", it doesn't take too much reading between the lines to infer
  that this is a best-effort call. Building upon a risky dependency has,
  well, risks. (The passive voice, on the other hand, is a lowly crime.)

  I also agree that the cursored approach as currently implemented is
  quite problematic. To increase throughput, I'd support increasing the
  block size somewhat, but the boundless behavior of the cursorless
  unauthenticated call just has to go. The combination of these changes
  should reduce both query and memory pressure on the front end, which,
  in theory, if not in practice, should lead to a better overall
  experience. I'd imagine that there are complications, and numbers to
  be run, and trade-offs to be made.

  Trust that the platform people are trading-off many competing
  interests and that there isn't a single capricious bone in 

Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-05 Thread Michael Steuer
Ditto


On 1/4/10 7:58 PM, Jesse Stay jesses...@gmail.com wrote:

 Ditto PJB :-)
 
 On Mon, Jan 4, 2010 at 8:12 PM, PJB pjbmancun...@gmail.com wrote:
 
 I think that's like asking someone: why do you eat food? But don't say
 because it tastes good or nourishes you, because we already know
 that! ;)
 
 You guys presumably set the 5000 ids per cursor limit by analyzing
 your user base and noting that one could still obtain the social graph
 for the vast majority of users with a single call.
 
 But this is a bit misleading.  For analytics-based apps, who aim to do
 near real-time analysis of relationships, the focus is typically on
 consumer brands who have a far larger than average number of
 relationships (e.g., 50k - 200k).
 
 This means that those apps are neck-deep in cursor-based stuff, and
 quickly realize the existing drawbacks, including, in order of
 significance:
 
 - Latency.  Fetching ids for a user with 3000 friends is comparable
 between the two calls.  But as you increment past 5000, the speed
 quickly peaks at a 5+x difference (I will include more benchmarks in a
 short while).  For example, fetching 80,000 friends via the get-all
 method takes on average 3 seconds; it takes, on average, 15 seconds
 with cursors.
 
 - Code complexity & elegance.  I would say that there is a 3x increase
 in code lines to account for cursors, from retrying failed cursors (a
 retry sketch follows this list), to caching to account for cursor
 slowness, to UI changes to coddle impatient users.
 
 - Incomprehensibility.  While there are obviously very good reasons
 from Twitter's perspective (performance) to the cursor based model,
 there really is no apparent obvious benefit to API users for the ids
 calls.  I would make the case that a large majority of API uses of the
 ids calls need and require the entire social graph, not an incomplete
 one.  After all, we need to know what new relationships exist, but
 also what old relationships have failed.  To dole out the data in
 drips and drabs is like serving a pint of beer in sippy cups.  That is
 to say: most users need the entire social graph, so what is the use
 case, from an API user's perspective, of NOT maintaining at least one
 means to quickly, reliably, and efficiently get it in a single call?
 
 - API Barriers to entry.  Most of the aforementioned arguments are
 obviously from an API user's perspective, but there's something, too,
 for Twitter to consider.  Namely, by increasing the complexity and
 learning curve of particular API actions, you presumably further limit
 the pool of developers who will engage with that API.  That's probably
 a bad thing.
 
 - Limits Twitter 2.0 app development.  This, again, speaks to issues
 bearing on speed and complexity, but I think it is important.  The
 first few apps in any given media or innovation invariably have to do
 with basic functionality building blocks -- tweeting, following,
 showing tweets.  But the next wave almost always has to do with
 measurement and analysis.  By making such analysis more difficult, you
 forestall the critically important ability for brands, and others, to
 measure performance.
 
 - API users have requested it.  Shouldn't, ultimately, the use case
 for a particular API method simply be the fact that a number of API
 developers have requested that it remain?
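 
  The retry bookkeeping mentioned above, as a minimal sketch: re-request
  a cursored page a few times with backoff before giving up. fetch_page()
  is a hypothetical stand-in for the HTTP call, assumed to raise IOError
  on a timeout or 502:
 
      import time
 
      def fetch_page_with_retries(screen_name, cursor, fetch_page, retries=3):
          for attempt in range(retries):
              try:
                  return fetch_page(screen_name, cursor)
              except IOError:               # timeout or 502 from the API
                  time.sleep(2 ** attempt)  # simple exponential backoff
          raise IOError("cursor %s failed after %d retries" % (cursor, retries))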
 
 
 On Jan 4, 2:07 pm, Wilhelm Bierbaum wilh...@twitter.com wrote:
  Can everyone contribute their use case for this API method? I'm trying
  to fully understand the deficiencies of the cursor approach.
 
  Please don't include that cursors are slow or that they are charged
  against the rate limit, as those are known issues.
 
  Thanks.
 
 



Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-04 Thread Marcel Molina
Dewald, it should be noted that, of course, not all HTTP 200 responses
are created equal: just because pulling down a response body with
hundreds of thousands of ids succeeds doesn't mean it doesn't put
substantial strain on our system. We want to make developing against the API
as easy as is feasible but need to do so in a spirit of reasonable
compromise.

On Mon, Jan 4, 2010 at 5:59 PM, Dewald Pretorius dpr...@gmail.com wrote:

 Wilhelm,

 I want the API method to return the full social graph in as few API
 calls as possible.

 If your system can return up to X ids in one call without doing a 502
 freak-out, then continue to do so. For social graphs with X+n ids, we
 can use cursors.

 On Jan 4, 6:07 pm, Wilhelm Bierbaum wilh...@twitter.com wrote:
  Can everyone contribute their use case for this API method? I'm trying
  to fully understand the deficiencies of the cursor approach.
 
  Please don't include that cursors are slow or that they are charged
  against the rate limit, as those are known issues.
 
  Thanks.




-- 
Marcel Molina
Twitter Platform Team
http://twitter.com/noradio


Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-04 Thread Jesse Stay
I'm just now noticing this (I agree - why was this being announced over the
holidays???) - this will make it near impossible to process large users.
 This is a *huge* change that just about kills any of the larger services
processing very large amounts of social graph data.  Please reconsider
allowing the all-in-one calls.  I don't want to have to explain to our users
with hundreds of thousands of followers why Twitter isn't letting us read
their Social Graph. (nor do I think Twitter wants us to)  I had a lot of
high hopes with Ryan Sarver's announcements last year of lifting limits, but
this is really discouraging.

Jesse

On Sun, Dec 27, 2009 at 7:29 PM, Dewald Pretorius dpr...@gmail.com wrote:

 What is being deprecated here is the old pagination method with the
 page parameter.

 As noted earlier, it is going to cause great pain if the API is going
 to assume a cursor of -1 if no cursor is specified, and hence enforce
 the use of cursors regardless of the size of the social graph.

 The API is currently comfortably returning social graphs smaller than
 200,000 members in one call. I very rarely get a 502 on social graphs
 of that size. It makes no sense to force us to make 40 API where 1 API
 call currently suffices and works. Those 40 API calls take between 40
 and 80 seconds to complete, as opposed to 1 to 2 seconds for the
 single API call. Multiply that by a few thousand Twitter accounts, and
 it adds hours of additional processing time, which is completely
 unnecessary, and will make getting through a large number of accounts
 virtually impossible.
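 
  Spelled out, the arithmetic behind those numbers (the 3,000-account
  figure is illustrative):
 
      calls = 200000 // 5000            # 40 cursored calls per large account
      low, high = calls * 1, calls * 2  # 40 to 80 seconds per account
      accounts = 3000                   # "a few thousand Twitter accounts"
      print(low * accounts / 3600.0, "to", high * accounts / 3600.0, "hours")
      # roughly 33 to 67 hours of extra processing time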


 On Dec 27, 7:45 pm, Zac Bowling zbowl...@gmail.com wrote:
  I agree with the others to some extent. Although it's a good signal to
  stop using something ASAP when something is deprecated, saying deprecated
  and not giving a definite timeline on its removal isn't good either. (Source
  params are deprecated but still work and don't have a solid deprecation date,
  and I'm still going on using them because OAuth still sucks for desktop/mobile
  situations and would die with a 15-day heads up on removal.)
 
  Also, iPhone app devs using this API would probably have a hard time
  squeezing a 15-day turnaround out of Apple right now.
 
  Zac Bowling
 
  On Sun, Dec 27, 2009 at 3:28 PM, Dewald Pretorius dpr...@gmail.com
 wrote:
   I agree 100%.
 
   Calls without the starting cursor of -1 must still return all
   followers as is currently the case.
 
   As a test I've set my system to use cursors on all calls. It inflates
   the processing time so much that things become completely unworkable.
 
   We can programmatically use cursors if showuser says that the person
   has more than a certain number of friends/followers. That's what I'm
   currently doing, and it works beautifully. So, please do not force us
   to use cursors on all calls.
 
   On Dec 24, 7:20 am, Aki yoru.fuku...@gmail.com wrote:
I agree with PJB. The previous announcements only said that the
pagination will be deprecated.
 
1.
 http://groups.google.com/group/twitter-api-announce/browse_thread/thr.
   ..
2.
 http://groups.google.com/group/twitter-api-announce/browse_thread/thr.
   ..
 
However, neither of the announcements said that the API call without a
page parameter to get all IDs would be removed or replaced with cursor
pagination. The deprecation of this method is not documented, as PJB
said.
 
On Dec 24, 5:00 pm, PJB pjbmancun...@gmail.com wrote:
 
 Why hasn't this been announced before?  Why does the API suggest
 something totally different?  At the very least, can you please
 hold
 off on deprecation of this until 2/11/2010?  This is a new API
 change.
 
 On Dec 23, 7:45 pm, Raffi Krikorian ra...@twitter.com wrote:
 
  yes - if you do not pass in cursors, then the API will behave as though
  you requested the first cursor.
 
   Wilhelm:
 
   Your announcement is apparently expanding the changeover from page to
   cursor in new, unannounced ways??
 
   The API documentation page says: "If the cursor parameter is not
   provided, all IDs are attempted to be returned, but large sets of IDs
   will likely fail with timeout errors."
 
   Yesterday you wrote: "Starting soon, if you fail to pass a cursor, the
   data returned will be that of the first cursor (-1) and the
   next_cursor and previous_cursor elements will be included."
 
   I can understand the need to swap from page to cursor, but was pleased
   that a single call was still available to return (or attempt to
   return) all friend/follower ids.  Now you are saying that, in addition
   to the changeover from page to cursor, you are also getting rid of
   this?
 
   Can you please confirm/deny?
 
   On Dec 22, 4:13 pm, Wilhelm Bierbaum wilh...@twitter.com wrote:
    We noticed that some clients are still calling social graph methods
    without cursor parameters. 

Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-04 Thread Jesse Stay
Ditto PJB :-)

On Mon, Jan 4, 2010 at 8:12 PM, PJB pjbmancun...@gmail.com wrote:


 I think that's like asking someone: why do you eat food? But don't say
 because it tastes good or nourishes you, because we already know
 that! ;)

 You guys presumably set the 5000 ids per cursor limit by analyzing
 your user base and noting that one could still obtain the social graph
 for the vast majority of users with a single call.

 But this is a bit misleading.  For analytics-based apps, who aim to do
 near real-time analysis of relationships, the focus is typically on
 consumer brands who have a far larger than average number of
 relationships (e.g., 50k - 200k).

 This means that those apps are neck-deep in cursor-based stuff, and
 quickly realize the existing drawbacks, including, in order of
 significance:

 - Latency.  Fetching ids for a user with 3000 friends is comparable
 between the two calls.  But as you increment past 5000, the speed
 quickly peaks at a 5+x difference (I will include more benchmarks in a
 short while).  For example, fetching 80,000 friends via the get-all
 method takes on average 3 seconds; it takes, on average, 15 seconds
 with cursors.

 - Code complexity & elegance.  I would say that there is a 3x increase
 in code lines to account for cursors, from retrying failed cursors, to
 caching to account for cursor slowness, to UI changes to coddle
 impatient users.

 - Incomprehensibility.  While there are obviously very good reasons
 from Twitter's perspective (performance) to the cursor based model,
 there really is no apparent obvious benefit to API users for the ids
 calls.  I would make the case that a large majority of API uses of the
 ids calls need and require the entire social graph, not an incomplete
 one.  After all, we need to know what new relationships exist, but
 also what old relationships have failed.  To dole out the data in
 drips and drabs is like serving a pint of beer in sippy cups.  That is
 to say: most users need the entire social graph, so what is the use
 case, from an API user's perspective, of NOT maintaining at least one
 means to quickly, reliably, and efficiently get it in a single call?

 - API Barriers to entry.  Most of the aforementioned arguments are
 obviously from an API user's perspective, but there's something, too,
 for Twitter to consider.  Namely, by increasing the complexity and
 learning curve of particular API actions, you presumably further limit
 the pool of developers who will engage with that API.  That's probably
 a bad thing.

 - Limits Twitter 2.0 app development.  This, again, speaks to issues
 bearing on speed and complexity, but I think it is important.  The
 first few apps in any given media or innovation invariably have to do
 with basic functionality building blocks -- tweeting, following,
 showing tweets.  But the next wave almost always has to do with
 measurement and analysis.  By making such analysis more difficult, you
 forestall the critically important ability for brands, and others, to
 measure performance.

 - API users have requested it.  Shouldn't, ultimately, the use case
 for a particular API method simply be the fact that a number of API
 developers have requested that it remain?


 On Jan 4, 2:07 pm, Wilhelm Bierbaum wilh...@twitter.com wrote:
  Can everyone contribute their use case for this API method? I'm trying
  to fully understand the deficiencies of the cursor approach.
 
  Please don't include that cursors are slow or that they are charged
  against the rate limit, as those are known issues.
 
  Thanks.



Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-04 Thread John Kalucki
The backend datastore returns following blocks in constant time,
regardless of the cursor depth. When I test a user with 100k+
followers via twitter.com using a ruby script, I see each cursored
block return in between 1.3 and 2.0 seconds, n=46, avg 1.59 seconds,
median 1.47 sec, stddev of .377, (home DSL, shared by several people
at the moment). So, it seems that we're returning the data over home
DSL at between 2,500 and 4,000 ids per second, which seems like a
perfectly reasonable rate and variance.

If I recall correctly, the cursorless methods are just shunted to
the first block each time, and thus represent a constant, incomplete,
amount of data...

Looking into my crystal ball, if you want a lot more than several
thousand widgets per second from Twitter, you probably aren't going to
get them via REST, and you will probably have some sort of business
relationship in place with Twitter.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

(A slice of data below)

url /followers/ids/alexa_chung.xml?cursor=-1
fetch time = 1.478542
url /followers/ids/alexa_chung.xml?cursor=1322524362256299608
fetch time = 2.044831
url /followers/ids/alexa_chung.xml?cursor=1321126009663170021
fetch time = 1.350035
url /followers/ids/alexa_chung.xml?cursor=1319359640017038524
fetch time = 1.44636
url /followers/ids/alexa_chung.xml?cursor=1317653620096535558
fetch time = 1.955163
url /followers/ids/alexa_chung.xml?cursor=1316184964685221966
fetch time = 1.326226
url /followers/ids/alexa_chung.xml?cursor=1314866514116423204
fetch time = 1.96824
url /followers/ids/alexa_chung.xml?cursor=1313551933690106944
fetch time = 1.513922
url /followers/ids/alexa_chung.xml?cursor=1312201296962214944
fetch time = 1.59179
url /followers/ids/alexa_chung.xml?cursor=1311363260604388613
fetch time = 2.259924
url /followers/ids/alexa_chung.xml?cursor=1310627455188010229
fetch time = 1.706438
url /followers/ids/alexa_chung.xml?cursor=1309772694575801646
fetch time = 1.460413
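
A rough Python analogue of the timing loop above (the original was a
ruby script); the XML endpoint is the 2010-era resource shown in the
log and is long gone, so this is illustrative only:

    import time
    from urllib.request import urlopen

    def time_cursored_fetch(screen_name, cursor):
        url = ("http://twitter.com/followers/ids/%s.xml?cursor=%s"
               % (screen_name, cursor))
        start = time.time()
        body = urlopen(url).read()  # one cursored block of follower ids
        print("url /followers/ids/%s.xml?cursor=%s" % (screen_name, cursor))
        print("fetch time = %f" % (time.time() - start))
        return body  # parse next_cursor out of the XML to fetch the next block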



On Mon, Jan 4, 2010 at 8:18 PM, PJB pjbmancun...@gmail.com wrote:

 Some quick benchmarks...

 Grabbed entire social graph for ~250 users, where each user has a
 number of friends/followers between 0 and 80,000.  I randomly used
 both the cursor and cursor-less API methods.

  < 5000 ids
 cursor: 0.72 avg seconds
 cursorless: 0.51 avg seconds

 5000 to 10,000 ids
 cursor: 1.42 avg seconds
 cursorless: 0.94 avg seconds

 1 to 80,000 ids
 cursor: 2.82 avg seconds
 cursorless: 1.21 avg seconds

 5,000 to 80,000 ids
 cursor: 4.28
 cursorless: 1.59

 10,000 to 80,000 ids
 cursor: 5.23
 cursorless: 1.82

 20,000 to 80,000 ids
 cursor: 6.82
 cursorless: 2

 40,000 to 80,000 ids
 cursor: 9.5
 cursorless: 3

 60,000 to 80,000 ids
 cursor: 12.25
 cursorless: 3.12
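 
  A benchmark harness along those lines would look roughly like this;
  fetch_cursorless and fetch_cursored are hypothetical wrappers around
  the two 2010-era fetch styles:
 
      import time
 
      def compare(users, fetch_cursorless, fetch_cursored):
          results = []
          for screen_name, follower_count in users:
              for label, fetch in (("cursorless", fetch_cursorless),
                                   ("cursor", fetch_cursored)):
                  start = time.time()
                  fetch(screen_name)  # pull the full id set either way
                  results.append((follower_count, label, time.time() - start))
          return results  # bucket and average by follower_count range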

 On Jan 4, 7:58 pm, Jesse Stay jesses...@gmail.com wrote:
 Ditto PJB :-)

 On Mon, Jan 4, 2010 at 8:12 PM, PJB pjbmancun...@gmail.com wrote:

  I think that's like asking someone: why do you eat food? But don't say
  because it tastes good or nourishes you, because we already know
  that! ;)

  You guys presumably set the 5000 ids per cursor limit by analyzing
  your user base and noting that one could still obtain the social graph
  for the vast majority of users with a single call.

  But this is a bit misleading.  For analytics-based apps, who aim to do
  near real-time analysis of relationships, the focus is typically on
  consumer brands who have a far larger than average number of
  relationships (e.g., 50k - 200k).

  This means that those apps are neck-deep in cursor-based stuff, and
  quickly realize the existing drawbacks, including, in order of
  significance:

  - Latency.  Fetching ids for a user with 3000 friends is comparable
  between the two calls.  But as you increment past 5000, the speed
  quickly peaks at a 5+x difference (I will include more benchmarks in a
  short while).  For example, fetching 80,000 friends via the get-all
  method takes on average 3 seconds; it takes, on average, 15 seconds
  with cursors.

  - Code complexity & elegance.  I would say that there is a 3x increase
  in code lines to account for cursors, from retrying failed cursors, to
  caching to account for cursor slowness, to UI changes to coddle
  impatient users.

  - Incomprehensibility.  While there are obviously very good reasons
  from Twitter's perspective (performance) to the cursor based model,
  there really is no apparent obvious benefit to API users for the ids
  calls.  I would make the case that a large majority of API uses of the
  ids calls need and require the entire social graph, not an incomplete
  one.  After all, we need to know what new relationships exist, but
  also what old relationships have failed.  To dole out the data in
  drips and drabs is like serving a pint of beer in sippy cups.  That is
  to say: most users need the entire social graph, so what is the use
  case, from an API user's perspective, of NOT maintaining at least one
  means to quickly, reliably, and efficiently get it in 

Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-04 Thread Jesse Stay
Also, how do we get a business relationship set up?  I've been asking for
that for years now.

Jesse

On Mon, Jan 4, 2010 at 10:16 PM, Jesse Stay jesses...@gmail.com wrote:

 John, how are things going on the real-time social graph APIs?  That would
 solve a lot of things for me surrounding this.

 Jesse


 On Mon, Jan 4, 2010 at 9:58 PM, John Kalucki j...@twitter.com wrote:

 The backend datastore returns following blocks in constant time,
 regardless of the cursor depth. When I test a user with 100k+
 followers via twitter.com using a ruby script, I see each cursored
 block return in between 1.3 and 2.0 seconds, n=46, avg 1.59 seconds,
 median 1.47 sec, stddev of .377, (home DSL, shared by several people
 at the moment). So, it seems that we're returning the data over home
 DSL at between 2,500 and 4,000 ids per second, which seems like a
 perfectly reasonable rate and variance.

 If I recall correctly, the cursorless methods are just shunted to
 the first block each time, and thus represent a constant, incomplete,
 amount of data...

 Looking into my crystal ball, if you want a lot more than several
 thousand widgets per second from Twitter, you probably aren't going to
 get them via REST, and you will probably have some sort of business
 relationship in place with Twitter.

 -John Kalucki
 http://twitter.com/jkalucki
 Services, Twitter Inc.

 (A slice of data below)

 url /followers/ids/alexa_chung.xml?cursor=-1
 fetch time = 1.478542
 url /followers/ids/alexa_chung.xml?cursor=1322524362256299608
 fetch time = 2.044831
 url /followers/ids/alexa_chung.xml?cursor=1321126009663170021
 fetch time = 1.350035
 url /followers/ids/alexa_chung.xml?cursor=1319359640017038524
 fetch time = 1.44636
 url /followers/ids/alexa_chung.xml?cursor=1317653620096535558
 fetch time = 1.955163
 url /followers/ids/alexa_chung.xml?cursor=1316184964685221966
 fetch time = 1.326226
 url /followers/ids/alexa_chung.xml?cursor=1314866514116423204
 fetch time = 1.96824
 url /followers/ids/alexa_chung.xml?cursor=1313551933690106944
 fetch time = 1.513922
 url /followers/ids/alexa_chung.xml?cursor=1312201296962214944
 fetch time = 1.59179
 url /followers/ids/alexa_chung.xml?cursor=1311363260604388613
 fetch time = 2.259924
 url /followers/ids/alexa_chung.xml?cursor=1310627455188010229
 fetch time = 1.706438
 url /followers/ids/alexa_chung.xml?cursor=1309772694575801646
 fetch time = 1.460413



 On Mon, Jan 4, 2010 at 8:18 PM, PJB pjbmancun...@gmail.com wrote:
 
  Some quick benchmarks...
 
  Grabbed entire social graph for ~250 users, where each user has a
  number of friends/followers between 0 and 80,000.  I randomly used
  both the cursor and cursor-less API methods.
 
   < 5000 ids
  cursor: 0.72 avg seconds
  cursorless: 0.51 avg seconds
 
  5000 to 10,000 ids
  cursor: 1.42 avg seconds
  cursorless: 0.94 avg seconds
 
  1 to 80,000 ids
  cursor: 2.82 avg seconds
  cursorless: 1.21 avg seconds
 
  5,000 to 80,000 ids
  cursor: 4.28
  cursorless: 1.59
 
  10,000 to 80,000 ids
  cursor: 5.23
  cursorless: 1.82
 
  20,000 to 80,000 ids
  cursor: 6.82
  cursorless: 2
 
  40,000 to 80,000 ids
  cursor: 9.5
  cursorless: 3
 
  60,000 to 80,000 ids
  cursor: 12.25
  cursorless: 3.12
 
  On Jan 4, 7:58 pm, Jesse Stay jesses...@gmail.com wrote:
  Ditto PJB :-)
 
  On Mon, Jan 4, 2010 at 8:12 PM, PJB pjbmancun...@gmail.com wrote:
 
   I think that's like asking someone: why do you eat food? But don't
 say
   because it tastes good or nourishes you, because we already know
   that! ;)
 
   You guys presumably set the 5000 ids per cursor limit by analyzing
   your user base and noting that one could still obtain the social
 graph
   for the vast majority of users with a single call.
 
   But this is a bit misleading.  For analytics-based apps, who aim to
 do
   near real-time analysis of relationships, the focus is typically on
   consumer brands who have a far larger than average number of
   relationships (e.g., 50k - 200k).
 
   This means that those apps are neck-deep in cursor-based stuff, and
   quickly realize the existing drawbacks, including, in order of
   significance:
 
   - Latency.  Fetching ids for a user with 3000 friends is comparable
   between the two calls.  But as you increment past 5000, the speed
   quickly peaks at a 5+x difference (I will include more benchmarks in
 a
   short while).  For example, fetching 80,000 friends via the get-all
   method takes on average 3 seconds; it takes, on average, 15 seconds
   with cursors.
 
   - Code complexity & elegance.  I would say that there is a 3x
 increase
   in code lines to account for cursors, from retrying failed cursors,
 to
   caching to account for cursor slowness, to UI changes to coddle
   impatient users.
 
   - Incomprehensibility.  While there are obviously very good reasons
   from Twitter's perspective (performance) to the cursor based model,
   there really is no apparent obvious benefit to API users for the ids
   calls.  I would make the case that a large 

Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-04 Thread John Kalucki
Ryan Sarver announced that we're going to provide an agreement
framework for Tweet data at Le Web last month. Until all that
licensing machinery is working well, we probably won't put any effort
into syndicating the social graph. At this point, social graph
syndication appears to be totally unformed, completely up in the air,
and any predictions would be unwise.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.


On Mon, Jan 4, 2010 at 9:16 PM, Jesse Stay jesses...@gmail.com wrote:
 John, how are things going on the real-time social graph APIs?  That would
 solve a lot of things for me surrounding this.
 Jesse

 On Mon, Jan 4, 2010 at 9:58 PM, John Kalucki j...@twitter.com wrote:

 The backend datastore returns following blocks in constant time,
 regardless of the cursor depth. When I test a user with 100k+
 followers via twitter.com using a ruby script, I see each cursored
 block return in between 1.3 and 2.0 seconds, n=46, avg 1.59 seconds,
 median 1.47 sec, stddev of .377, (home DSL, shared by several people
 at the moment). So, it seems that we're returning the data over home
 DSL at between 2,500 and 4,000 ids per second, which seems like a
 perfectly reasonable rate and variance.

 If I recall correctly, the cursorless methods are just shunted to
 the first block each time, and thus represent a constant, incomplete,
 amount of data...

 Looking into my crystal ball, if you want a lot more than several
 thousand widgets per second from Twitter, you probably aren't going to
 get them via REST, and you will probably have some sort of business
 relationship in place with Twitter.

 -John Kalucki
 http://twitter.com/jkalucki
 Services, Twitter Inc.

 (A slice of data below)

 url /followers/ids/alexa_chung.xml?cursor=-1
 fetch time = 1.478542
 url /followers/ids/alexa_chung.xml?cursor=1322524362256299608
 fetch time = 2.044831
 url /followers/ids/alexa_chung.xml?cursor=1321126009663170021
 fetch time = 1.350035
 url /followers/ids/alexa_chung.xml?cursor=1319359640017038524
 fetch time = 1.44636
 url /followers/ids/alexa_chung.xml?cursor=1317653620096535558
 fetch time = 1.955163
 url /followers/ids/alexa_chung.xml?cursor=1316184964685221966
 fetch time = 1.326226
 url /followers/ids/alexa_chung.xml?cursor=1314866514116423204
 fetch time = 1.96824
 url /followers/ids/alexa_chung.xml?cursor=1313551933690106944
 fetch time = 1.513922
 url /followers/ids/alexa_chung.xml?cursor=1312201296962214944
 fetch time = 1.59179
 url /followers/ids/alexa_chung.xml?cursor=1311363260604388613
 fetch time = 2.259924
 url /followers/ids/alexa_chung.xml?cursor=1310627455188010229
 fetch time = 1.706438
 url /followers/ids/alexa_chung.xml?cursor=1309772694575801646
 fetch time = 1.460413



 On Mon, Jan 4, 2010 at 8:18 PM, PJB pjbmancun...@gmail.com wrote:
 
  Some quick benchmarks...
 
  Grabbed entire social graph for ~250 users, where each user has a
  number of friends/followers between 0 and 80,000.  I randomly used
  both the cursor and cursor-less API methods.
 
   < 5000 ids
  cursor: 0.72 avg seconds
  cursorless: 0.51 avg seconds
 
  5000 to 10,000 ids
  cursor: 1.42 avg seconds
  cursorless: 0.94 avg seconds
 
  1 to 80,000 ids
  cursor: 2.82 avg seconds
  cursorless: 1.21 avg seconds
 
  5,000 to 80,000 ids
  cursor: 4.28
  cursorless: 1.59
 
  10,000 to 80,000 ids
  cursor: 5.23
  cursorless: 1.82
 
  20,000 to 80,000 ids
  cursor: 6.82
  cursorless: 2
 
  40,000 to 80,000 ids
  cursor: 9.5
  cursorless: 3
 
  60,000 to 80,000 ids
  cursor: 12.25
  cursorless: 3.12
 
  On Jan 4, 7:58 pm, Jesse Stay jesses...@gmail.com wrote:
  Ditto PJB :-)
 
  On Mon, Jan 4, 2010 at 8:12 PM, PJB pjbmancun...@gmail.com wrote:
 
   I think that's like asking someone: why do you eat food? But don't
   say
   because it tastes good or nourishes you, because we already know
   that! ;)
 
   You guys presumably set the 5000 ids per cursor limit by analyzing
   your user base and noting that one could still obtain the social
   graph
   for the vast majority of users with a single call.
 
   But this is a bit misleading.  For analytics-based apps, who aim to
   do
   near real-time analysis of relationships, the focus is typically on
   consumer brands who have a far larger than average number of
   relationships (e.g., 50k - 200k).
 
   This means that those apps are neck-deep in cursor-based stuff, and
   quickly realize the existing drawbacks, including, in order of
   significance:
 
   - Latency.  Fetching ids for a user with 3000 friends is comparable
   between the two calls.  But as you increment past 5000, the speed
   quickly peaks at a 5+x difference (I will include more benchmarks in
   a
   short while).  For example, fetching 80,000 friends via the get-all
   method takes on average 3 seconds; it takes, on average, 15 seconds
   with cursors.
 
    - Code complexity & elegance.  I would say that there is a 3x
   increase
   in code lines to account for cursors, from retrying failed cursors,
   to
   caching 

Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-04 Thread John Kalucki
The existing APIs stopped providing accurate data about a year ago
and degraded substantially over a period of just a few months. Now the
only data store for social graph data requires cursors to access
complete sets. Pagination is just not possible with the same latency
at this scale without an order of magnitude or two increase in cost.
So, instead of hardware units in the tens and hundreds, think about
the same in the thousands and tens of thousands.

These APIs and their now decommissioned backing stores were developed
when having 20,000 followers was a lot. We're an order of magnitude or
two beyond that point along nearly every dimension. Accounts.
Followers per account. Tweets per second. Etc. As systems evolve, some
evolutionary paths become extinct.

Given boundless resources, the best we could do for a REST API, as
Marcel has alluded, is to do the cursoring for you and aggregate many
blocks into much larger responses. This wouldn't work very well for at
least two immediate reasons: 1) Running a system with multimodal
service times is a nightmare -- we'd have to provision a specific
endpoint for such a resource. 2) Ruby GC chokes on lots of objects.
We'd have to consider implementing this resource in another stack, or
do a lot of tuning. All this to build the opposite of what most
applications want: a real-time stream of graph deltas for a set of
accounts, or the list of recent set operations since the last poll --
and rarely, if ever, the entire following set.
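
The delta model described above reduces to set arithmetic on the
client; a minimal sketch, assuming the caller persists the previous
snapshot between polls:

    def follower_deltas(previous_ids, current_ids):
        prev, curr = set(previous_ids), set(current_ids)
        return curr - prev, prev - curr  # (new followers, lost followers)

    # e.g. polls of [1, 2, 3] then [2, 3, 4] yield new={4}, lost={1}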

Also, I'm a little rusty on the details on the social graph api, but
please detail which public resources allow retrieval of 40,000
followers in two seconds. I'd be very interested in looking at the
implementing code on our end. A curl timing would be nice (time curl
URL > /dev/null) too.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.


On Mon, Jan 4, 2010 at 9:18 PM, PJB pjbmancun...@gmail.com wrote:


 On Jan 4, 8:58 pm, John Kalucki j...@twitter.com wrote:
 at the moment). So, it seems that we're returning the data over home
 DSL at between 2,500 and 4,000 ids per second, which seems like a
 perfectly reasonable rate and variance.

 It's certainly not reasonable to expect it to take 10+ seconds to get
 25,000 to 40,000 ids, PARTICULARLY when existing methods, for whatever
 reason, return the same data in less than 2 seconds.  Twitter is being
 incredibly short-sighted if they think this is indeed reasonable.

 Some of us have built applications around your EXISTING APIs, and to
 now suggest that we may need formal business relationships to
 continue to use such APIs is seriously disquieting.

 Disgusted...





Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2010-01-04 Thread Jesse Stay
Again, ditto PJB - just making sure the Twitter devs don't think PJB is
alone in this.  I'm sure Dewald and many other developers, including those
unaware of this (is it even on the status blog?) agree.  I'm also seeing
similar results to PJB in my benchmarks: cursor-less is much, much faster.
 At most, put a cap on the cursor-less calls (200,000 should be
sufficient).  Please don't take them away.

Jesse

On Mon, Jan 4, 2010 at 11:40 PM, PJB pjbmancun...@gmail.com wrote:


 As noted in this thread, the fact that cursor-less methods for friends/
 followers ids will be deprecated was newly announced on December 22.

 In fact, the API documentation still clearly indicates that cursors
 are optional, and that their absence will return a complete social
 graph.  E.g.:

 http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-followers%C2%A0ids

 (If the cursor parameter is not provided, all IDs are attempted to be
 returned)

 The example at the bottom of that page gives a good example of
 retrieving 300,000+ ids in several seconds:

 http://twitter.com/followers/ids.xml?screen_name=dougw

 Of course, retrieving 20-40k users is significantly faster.

 Again, many of us have built apps around cursor-less API calls.  To
 now deprecate them, with just a few days warning over the holidays, is
 clearly inappropriate and uncalled for.  Similarly, to announce that
 we must now expect 5x slowness when doing the same calls, when these
 existing methods work well, is shocking.

 Many developers live and die by the API documentation.  It's a really
 fouled-up situation when the API documentation is so totally wrong,
 right?

 I urge those folks addressing this issue to preserve the cursor-less
 methods.  Barring that, I urge them to return at least 25,000 ids per
 cursor (as you note, time progression has made 5000 per call
 antiquated and ineffective for today's Twitter user) and grant at
 least 3 months before deprecation.

 On Jan 4, 10:23 pm, John Kalucki j...@twitter.com wrote:
  The existing APIs stopped providing accurate data about a year ago
  and degraded substantially over a period of just a few months. Now the
  only data store for social graph data requires cursors to access
  complete sets. Pagination is just not possible with the same latency
  at this scale without an order of magnitude or two increase in cost.
  So, instead of hardware units in the tens and hundreds, think about
  the same in the thousands and tens of thousands.
 
  These APIs and their now decommissioned backing stores were developed
  when having 20,000 followers was a lot. We're an order of magnitude or
  two beyond that point along nearly every dimension. Accounts.
  Followers per account. Tweets per second. Etc. As systems evolve, some
  evolutionary paths become extinct.
 
  Given boundless resources, the best we could do for a REST API, as
  Marcel has alluded, is to do the cursoring for you and aggregate many
  blocks into much larger responses. This wouldn't work very well for at
  least two immediate reasons: 1) Running a system with multimodal
  service times is a nightmare -- we'd have to provision a specific
  endpoint for such a resource. 2) Ruby GC chokes on lots of objects.
  We'd have to consider implementing this resource in another stack, or
  do a lot of tuning. All this to build the opposite of what most
  applications want: a real-time stream of graph deltas for a set of
  accounts, or the list of recent set operations since the last poll --
  and rarely, if ever, the entire following set.
 
  Also, I'm a little rusty on the details on the social graph api, but
  please detail which public resources allow retrieval of 40,000
  followers in two seconds. I'd be very interested in looking at the
  implementing code on our end. A curl timing would be nice (time curl
  URL > /dev/null) too.
 
  -John Kalucki
  http://twitter.com/jkalucki
  Services, Twitter Inc.
 
  On Mon, Jan 4, 2010 at 9:18 PM, PJB pjbmancun...@gmail.com wrote:
 
   On Jan 4, 8:58 pm, John Kalucki j...@twitter.com wrote:
   at the moment). So, it seems that we're returning the data over home
   DSL at between 2,500 and 4,000 ids per second, which seems like a
   perfectly reasonable rate and variance.
 
   It's certainly not reasonable to expect it to take 10+ seconds to get
   25,000 to 40,000 ids, PARTICULARLY when existing methods, for whatever
   reason, return the same data in less than 2 seconds.  Twitter is being
   incredibly short-sighted if they think this is indeed reasonable.
 
   Some of us have built applications around your EXISTING APIs, and to
   now suggest that we may need formal business relationships to
   continue to use such APIs is seriously disquieting.
 
   Disgusted...
 
 



Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2009-12-27 Thread Zac Bowling
I agree with the others to some extent. Although it's a good signal to stop
using something ASAP when it's deprecated, saying "deprecated" without
giving a definite timeline for removal isn't good either. (Source params
are deprecated but still work and have no solid removal date, and I'm
still using them because OAuth still sucks for desktop/mobile situations
and would die with a 15-day heads-up on removal.)

Also, iPhone app devs using this API would probably have a hard time
squeezing a 15-day turnaround out of Apple right now.

Zac Bowling


On Sun, Dec 27, 2009 at 3:28 PM, Dewald Pretorius dpr...@gmail.com wrote:

 I agree 100%.

 Calls without the starting cursor of -1 must still return all
 followers, as is currently the case.

 As a test I've set my system to use cursors on all calls. It inflates
 the processing time so much that things become completely unworkable.

 We can programmatically use cursors when users/show says that the person
 has more than a certain number of friends/followers. That's what I'm
 currently doing, and it works beautifully. So, please do not force us
 to use cursors on all calls.
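
 A minimal sketch of that conditional approach, assuming the users/show
 endpoint and an illustrative 25,000 threshold; error handling and rate
 limiting are omitted:

    # Illustrative sketch: only walk cursors for large accounts.
    import json
    import urllib2

    def fetch_json(url):
        return json.loads(urllib2.urlopen(url).read())

    def follower_ids(screen_name, threshold=25000):
        user = fetch_json("http://twitter.com/users/show.json?screen_name="
                          + screen_name)
        if user["followers_count"] < threshold:
            # Small account: one cursor-less call still returns every id
            # (old format: a bare JSON array).
            return fetch_json("http://twitter.com/followers/ids.json"
                              "?screen_name=" + screen_name)
        # Large account: walk the cursored blocks until next_cursor is 0.
        ids, cursor = [], -1
        while cursor != 0:
            page = fetch_json("http://twitter.com/followers/ids.json"
                              "?screen_name=%s&cursor=%d"
                              % (screen_name, cursor))
            ids.extend(page["ids"])
            cursor = page["next_cursor"]
        return ids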

 On Dec 24, 7:20 am, Aki yoru.fuku...@gmail.com wrote:
  I agree with PJB. The previous announcements only said that
  pagination would be deprecated.

  1. http://groups.google.com/group/twitter-api-announce/browse_thread/thr...
  2. http://groups.google.com/group/twitter-api-announce/browse_thread/thr...

  However, neither announcement said that the API call without a page
  parameter, which returns all IDs, would be removed or replaced with
  cursor pagination. As PJB said, the deprecation of this method is not
  documented.
 
  On Dec 24, 5:00 pm, PJB pjbmancun...@gmail.com wrote:
 
   Why hasn't this been announced before?  Why does the API suggest
   something totally different?  At the very least, can you please hold
   off on deprecation of this until 2/11/2010?  This is a new API change.
 
   On Dec 23, 7:45 pm, Raffi Krikorian ra...@twitter.com wrote:
 
 yes - if you do not pass in cursors, then the API will behave as though
 you requested the first cursor.

  Wilhelm:

  Your announcement is apparently expanding the changeover from page to
  cursor in new, unannounced ways??

  The API documentation page says: If the cursor parameter is not
  provided, all IDs are attempted to be returned, but large sets of IDs
  will likely fail with timeout errors.

  Yesterday you wrote: Starting soon, if you fail to pass a cursor, the
  data returned will be that of the first cursor (-1) and the
  next_cursor and previous_cursor elements will be included.

  I can understand the need to swap from page to cursor, but was pleased
  that a single call was still available to return (or attempt to
  return) all friend/follower ids.  Now you are saying that, in addition
  to the changeover from page to cursor, you are also getting rid of
  this?

  Can you please confirm/deny?
 

Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2009-12-24 Thread Michael Steuer
+1 - I'm currently relying on retrieving the complete social graph when
no cursor is passed. And you're announcing this change right around
Christmas and New Year's, to take effect almost immediately thereafter...








Re: [twitter-dev] Re: Social Graph API: Legacy data format will be eliminated 1/11/2010

2009-12-23 Thread Raffi Krikorian
yes - if you do not pass in cursors, then the API will behave as though you
requested the first cursor.
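
A cursor walk written today therefore keeps working either way. A minimal
sketch of such a walk against the JSON ids endpoint, tolerating both the
old bare-array format and the new cursored one (illustrative only, not a
supported client):

    # Minimal sketch: collect all follower ids via cursors, accepting
    # both the old format (bare JSON array) and the new cursored format.
    import json
    import urllib2

    def all_follower_ids(screen_name):
        ids, cursor = [], -1
        while True:
            page = json.loads(urllib2.urlopen(
                "http://twitter.com/followers/ids.json"
                "?screen_name=%s&cursor=%d" % (screen_name, cursor)).read())
            if isinstance(page, list):   # old format: everything at once
                return page
            ids.extend(page["ids"])      # new format: one block per call
            cursor = page["next_cursor"]
            if cursor == 0:              # 0 signals the final block
                return ids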


 Wilhelm:

 Your announcement is apparently expanding the changeover from page to
 cursor in new, unannounced ways??

 The API documentation page says: If the cursor parameter is not
 provided, all IDs are attempted to be returned, but large sets of IDs
 will likely fail with timeout errors.

 Yesterday you wrote: Starting soon, if you fail to pass a cursor, the
 data returned will be that of the first cursor (-1) and the
 next_cursor and previous_cursor elements will be included.

 I can understand the need to swap from page to cursor, but was pleased
 that a single call was still available to return (or attempt to
 return) all friend/follower ids.  Now you are saying that, in addition
 to the changeover from page to cursor, you are also getting rid of
 this?

 Can you please confirm/deny?


 On Dec 22, 4:13 pm, Wilhelm Bierbaum wilh...@twitter.com wrote:
  We noticed that some clients are still calling social graph methods
  without cursor parameters. We wanted to take time to make sure that
  people were calling the updated methods which return data with cursors
  instead of the old formats that do not.
 
  As previously announced in September (http://bit.ly/46x1iL) and
  November (http://bit.ly/3UQ0LU), the legacy data formats returned
  as a result of calling social graph endpoints without a cursor
  parameter are deprecated and will be removed.
 
  These formats have been removed from the API wiki since September.
 
  You should always pass a cursor parameter. Starting soon, if you fail
  to pass a cursor, the data returned will be that of the first cursor
  (-1) and the next_cursor and previous_cursor elements will be included.
 
  If you aren't seeing next_cursor and previous_cursor in your results,
  you are getting data back in the old format. You will need to adjust
  your parser to handle the new format.
 
  We're going to start assuming you want data in the new format
  (users_list / users / user or id_list / ids / id) instead of the old
  format (users / user or ids / id) regardless of your passing a cursor
  parameter as of 1/11/2010.
 
  * The old formats will no longer be returned after 1/11/2010.
  * Start using the new formats now by passing the 'cursor' parameter.
 
  To recap, the old endpoints at
 
 /statuses/friends.xml
 /statuses/followers.xml
 
  returned
 
  <users type="array">
    <user>
      <!-- ... omitted ... -->
    </user>
  </users>
 
  or JSON like [{/*user record*/} /*, ...*/]
 
  whereas
 
  /statuses/friends.xml?cursor=n
  /statuses/followers.xml?cursor=n
 
  return data that looks like
 
  <users_list>
    <users type="array">
      <user>
        <!-- ... omitted ... -->
      </user>
    </users>
    <next_cursor>7128872798413429387</next_cursor>
    <previous_cursor>0</previous_cursor>
  </users_list>
 
  or, the JSON equivalent:
 
  {users:[{/*user record*/} /*, ...*/], next_cursor:0,
  previous_cursor:0}
 
  and the old endpoints at
 
  /friends/ids.xml
  /followers/ids.xml
 
  returned data that looks like
 
  <ids>
    <id>1</id>
    <id>2</id>
    <id>3</id>
  </ids>
 
  whereas
 
  /friends/ids.xml?cursor=n
  /followers/ids.xml?cursor=n
 
  return data that looks like
 
  <id_list>
    <ids>
      <id>1</id>
      <id>2</id>
      <id>3</id>
    </ids>
    <next_cursor>1288724293877798413</next_cursor>
    <previous_cursor>-1300794057949944903</previous_cursor>
  </id_list>
 
  or, the JSON equivalent:
 
  {ids:[1, 2, 3], next_cursor:0, previous_cursor:0}
 
  If you have any questions or comments, please feel free to post them
  to twitter-development-talk.
 
  Thanks!
 
  --
  Wilhelm Bierbaum
  Twitter Platform Team




-- 
Raffi Krikorian
Twitter Platform Team
http://twitter.com/raffi