On Thu, Apr 15, 2010 at 9:56 AM, Allen He <allenh...@gmail.com> wrote:
> Hello folks,
>
> When Twissandra (the Twitter clone example for Cassandra) posts a tweet, it
> iterates over all of the followers to insert a tweet_id into their timelines (see


>     for follower_id in follower_ids:
>         TIMELINE.insert(str(follower_id), {ts: str(tweet_id)})
>
>
>
> My question is: if a user has millions of followers, are there millions of
> iterations?

I have never looked at the Twissandra code, but it looks that way. It is
probably a trade-off: either you store each tweet only in its author's
own timeline and, when a user wants to read, you fetch the tweets of
everyone they follow (putting the burden on reads), or you do it like
this and put the burden on writes. Since writes are cheap in Cassandra
and reads are more frequent, this seems to make sense.
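
To make the trade-off concrete, here is a minimal sketch of both
approaches. It uses plain in-memory dicts as stand-ins for column
families; every name in it (follow, post_fanout_on_write,
read_fanout_on_read, and so on) is hypothetical and not taken from
Twissandra.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for column families (not Twissandra's code).
followers = defaultdict(set)   # author -> ids of users following the author
following = defaultdict(set)   # user -> ids of authors the user follows
userline = defaultdict(list)   # author -> [(ts, tweet_id)] of the author's own tweets
timeline = defaultdict(list)   # user -> [(ts, tweet_id)] precomputed feed


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def post_fanout_on_write(author, ts, tweet_id):
    """Pay at write time: one extra write per follower."""
    userline[author].append((ts, tweet_id))
    for follower in followers[author]:
        timeline[follower].append((ts, tweet_id))
    # Number of writes performed: the author's row plus one per follower.
    return 1 + len(followers[author])


def read_precomputed(user, limit=20):
    """Cheap read: only the user's own timeline row is touched."""
    return sorted(timeline[user], reverse=True)[:limit]


def read_fanout_on_read(user, limit=20):
    """Pay at read time: visit the row of every author the user follows."""
    merged = []
    for author in following[user]:
        merged.extend(userline[author])
    return sorted(merged, reverse=True)[:limit]
```

With the first approach a post by a user with N followers costs N + 1
writes and each read touches one row; with the second a post costs one
write and each read touches one row per followed author.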


PS: I think it should use batch_mutate anyway, so that only one operation
is sent over the network.
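
A rough sketch of the batching idea, under stated assumptions:
FakeTimeline below is a hypothetical in-memory stand-in that only counts
round trips, and batch_insert is an invented method name, not the real
client API. Note that batching reduces the number of requests, not the
number of rows written, and for very large follower lists you would
probably still send the mutations in chunks rather than in one request.

```python
class FakeTimeline:
    """Hypothetical stand-in for a column family client; counts round trips."""

    def __init__(self):
        self.rows = {}
        self.round_trips = 0

    def insert(self, key, columns):
        # One request per row, as in the quoted loop.
        self.round_trips += 1
        self.rows.setdefault(key, {}).update(columns)

    def batch_insert(self, mutations):
        # One request carrying many rows: {row_key: {column_name: value}}.
        self.round_trips += 1
        for key, columns in mutations.items():
            self.rows.setdefault(key, {}).update(columns)


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_batched(timeline, follower_ids, ts, tweet_id, batch_size=100):
    """Write the tweet id to every follower's row, batch_size rows per request."""
    for group in chunks(list(follower_ids), batch_size):
        timeline.batch_insert({str(fid): {ts: str(tweet_id)} for fid in group})
```

For 250 followers and a batch size of 100 this sends three requests
instead of 250, while still writing 250 rows.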
