[twitter-dev] Cursor Expiration

2009-12-09 Thread Alan Gutierrez
Although million-follower accounts are rare, how do I design for a 
million-follower user logged into my application, which uses the Social 
Graph API?


If Barack Obama were to log into my application, it would take 566 API 
calls to fetch his 2,828,782 followers, but I wouldn't have any left 
after the 150 API calls to fetch his 747,127 friends.
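
The call counts above can be reproduced with a little arithmetic. This is a sketch only: the page size of 5,000 IDs per call and the limit of 150 calls per hour are assumptions inferred from the figures in this message, not values stated in the thread.

```python
import math

PAGE_SIZE = 5000     # assumed IDs returned per cursored call
HOURLY_LIMIT = 150   # assumed API calls allowed per hour


def calls_needed(count, page_size=PAGE_SIZE):
    """Number of cursored calls needed to page through `count` IDs."""
    return math.ceil(count / page_size)


follower_calls = calls_needed(2_828_782)
friend_calls = calls_needed(747_127)
total_calls = follower_calls + friend_calls

# Hours needed if the whole hourly allowance goes to this one user.
hours = math.ceil(total_calls / HOURLY_LIMIT)

print(follower_calls, friend_calls, total_calls, hours)
```

Under those assumptions, the friends list alone uses a full hour's allowance, which is why the crawl has to be spread over several hours.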


Obviously, I'd like to work my way through the list a little bit each 
hour. I'd like to store the cursor after 30 API calls and resume my 
iteration an hour or more later.
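
A minimal sketch of that store-and-resume loop follows. Everything named here is hypothetical: `fetch_page` stands in for whatever function calls the Social Graph API and returns `(ids, next_cursor)`, the state is kept in a local JSON file, and the start and end cursor values of -1 and 0 are assumptions about the API's conventions. The fake API at the bottom exists only so the sketch runs without network access.

```python
import json
import os
import tempfile

START_CURSOR = -1  # assumed value that begins a fresh listing
END_CURSOR = 0     # assumed value that signals no more pages


def load_state(path):
    """Load the saved crawl state, or start a new one."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"cursor": START_CURSOR, "ids": []}


def save_state(path, state):
    with open(path, "w") as f:
        json.dump(state, f)


def crawl_slice(fetch_page, path, budget=30):
    """Spend at most `budget` calls, persist the cursor, report completion."""
    state = load_state(path)
    calls = 0
    while state["cursor"] != END_CURSOR and calls < budget:
        ids, next_cursor = fetch_page(state["cursor"])
        state["ids"].extend(ids)
        state["cursor"] = next_cursor
        calls += 1
    save_state(path, state)
    return state["cursor"] == END_CURSOR


def make_fake_api(all_ids, page_size):
    """Stand-in for the real API; its cursors are plain list offsets."""
    def fetch_page(cursor):
        start = 0 if cursor == START_CURSOR else cursor
        page = all_ids[start:start + page_size]
        nxt = start + page_size
        return page, (nxt if nxt < len(all_ids) else END_CURSOR)
    return fetch_page


path = os.path.join(tempfile.mkdtemp(), "state.json")
api = make_fake_api(list(range(100)), page_size=10)

slices = 1
while not crawl_slice(api, path, budget=3):
    slices += 1  # in a real application, wait for the next hour here

print(slices, len(load_state(path)["ids"]))
```

Saving the cursor after every slice, rather than only at the end, means a crash loses at most one slice of work.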


When do cursors expire? I assume they will still be valid an hour later, 
but I've seen discussion on this group that says that they are opaque 
and that they may change at some point. I suppose that when that time 
comes, if my application is crawling a celebrity, it will not be able to 
resume crawling with the cursor it stored an hour before.


Alan Gutierrez
http://twitter.com/bigeasy


Re: [twitter-dev] Cursor Expiration

2009-12-09 Thread Abraham Williams
Check out the section about whitelisting:
http://apiwiki.twitter.com/Rate-limiting

Abraham




-- 
Abraham Williams | Community Evangelist | http://web608.org
Project | Intersect | http://intersect.labs.poseurtech.com
Hacker | http://abrah.am | http://twitter.com/abraham
This email is: [ ] shareable [x] ask first [ ] private.
Sent from Madison, WI, United States


Re: [twitter-dev] Cursor Expiration

2009-12-09 Thread John Kalucki
A cursor should be valid forever, but as it ages and rows are removed, you
might see some minor data loss and probably more duplicates.
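
Since duplicates are expected when resuming from an aged cursor, a client may want to drop them as pages arrive. The helper below is an illustrative sketch, and `merge_pages` is a hypothetical name. It cannot recover rows lost to the data loss mentioned above; only a fresh crawl could do that.

```python
def merge_pages(pages):
    """Concatenate pages of IDs in order, dropping IDs already seen.

    Returns the de-duplicated list and the number of duplicates dropped.
    """
    seen = set()
    ordered = []
    duplicates = 0
    for page in pages:
        for user_id in page:
            if user_id in seen:
                duplicates += 1
            else:
                seen.add(user_id)
                ordered.append(user_id)
    return ordered, duplicates


# Made-up pages in which each page repeats the last ID of the previous one.
pages = [[9, 8, 7, 6], [6, 5, 4, 3], [3, 2, 1]]
ids, dropped = merge_pages(pages)
print(ids, dropped)
```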

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.





Re: [twitter-dev] Cursor Expiration

2010-01-16 Thread Marc Mims
* John Kalucki  [091209 09:28]:
> A cursor should be valid forever, but as it ages and rows are removed, you
> might see some minor data loss and probably more duplicates.

Out of curiosity, what is a cursor?  From our (the users') perspective,
it's just an opaque number.  But I'm curious.  How is it generated?
What does it represent internally?

-Marc


Re: [twitter-dev] Cursor Expiration

2010-01-17 Thread John Kalucki
A cursor is an opaque deletion-tolerant index into a Btree keyed by source
userid and modification time. It brings you to a point in time in the
reverse chron sorted list. So, since you can't change the past, other than
erasing it, it's effectively stable. (Modifications bubble to the top.) But
you have to deal with additions at the list head and also block shrinkage
due to deletions, so your blocks begin to overlap quite a bit as the data
ages. (If you cache cursors and read much later, you'll see the first few
rows of cursor[n+1]'s block as duplicates of the last rows of cursor[n]'s
block. The intersection cardinality is equal to the number of deletions in
cursor[n]'s block). Still, there may be value in caching these cursors and
then heuristically rebalancing them when the overlap proportion crosses some
threshold.
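
The overlap behaviour described above can be illustrated with a toy model. This is not Twitter's implementation: here a "cursor" is simply a modification time, a block is the next few rows at or before that time, and the page size, the `rebalance` helper and the 0.2 threshold are all made up for illustration. Additions at the list head are ignored. A real client cannot see modification times and would have to rebalance by walking the list again and storing the fresh cursors.

```python
PAGE = 5  # toy page size


def read_block(rows, cursor, page=PAGE):
    """Up to `page` modification times at or before `cursor`, newest first."""
    return [t for t in rows if t <= cursor][:page]


def overlap(rows, cursors, n):
    """Number of rows shared by the blocks of cursor[n] and cursor[n+1]."""
    a = read_block(rows, cursors[n])
    b = read_block(rows, cursors[n + 1])
    return len(set(a) & set(b))


def rebalance(rows, first_cursor, page=PAGE):
    """Recompute cursors so that blocks no longer overlap."""
    remaining = [t for t in rows if t <= first_cursor]
    return [remaining[i] for i in range(0, len(remaining), page)]


# Twenty rows in reverse chronological order, with cursors cached
# while every block was still full.
rows = list(range(20, 0, -1))
cursors = [20, 15, 10, 5]
before = [overlap(rows, cursors, n) for n in range(3)]

# Later, two rows vanish from block 0 and one from block 1.
deleted = {18, 17, 12}
rows = [t for t in rows if t not in deleted]
after = [overlap(rows, cursors, n) for n in range(3)]

# Rebalance once the overlap proportion crosses an arbitrary threshold.
if max(after) / PAGE > 0.2:
    cursors = rebalance(rows, cursors[0])
rebalanced = [overlap(rows, cursors, n) for n in range(len(cursors) - 1)]

print(before, after, cursors, rebalanced)
```

In this model the overlap between adjacent blocks matches the number of deletions in the earlier block, which is the relationship stated above.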


-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.

