[Neo4j] Activity Streams and Twitter Sample App

maxdemarzi Thu, 03 Nov 2011 09:39:16 -0700

Andreas Ronge created a new sample app called kvitter @
https://github.com/andreasronge/kvitter .


This got me thinking about the Twitter clone done in Redis @
http://redis.io/topics/twitter-clone

If you scroll about two-thirds of the way down, you'll read this piece:

"After we create a post we obtain the post id. We need to LPUSH this post id
in every user that's following the author of the post, and of course in the
list of posts of the author."

This is the "bottleneck" of the application (if you can call anything on
Redis a bottleneck, since it's so freaking fast).  A tweet by
http://twitter.com/#!/APLUSK will have to do 8 million writes, one for each of
Ashton's followers.
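
To make the cost concrete, here is a rough sketch of that fan-out-on-write
step in plain Ruby. It uses in-memory hashes as a stand-in for the Redis
lists, and every name in it (followers, timelines, publish) is made up for
illustration rather than taken from the Redis article.

```ruby
# In-memory stand-in for the Redis lists; all names here are hypothetical.
followers = Hash.new { |h, k| h[k] = [] }   # author  => users following them
timelines = Hash.new { |h, k| h[k] = [] }   # user    => post ids, newest first

# Fan-out on write: one list push per follower, plus one for the author.
def publish(author, post_id, followers, timelines)
  targets = followers[author] + [author]
  targets.each { |user| timelines[user].unshift(post_id) }  # plays the role of LPUSH
  targets.size                                              # number of writes performed
end

followers[:aplusk] = [:alice, :bob, :carol]
writes = publish(:aplusk, 42, followers, timelines)
puts "writes for one tweet: #{writes}"
```

The write count grows linearly with the number of followers, which is the
part a relationship-based model tries to avoid.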

In Neo4j, we shouldn't need to do that, since we can express the same thing
with relationships:

# this should return the tweets of all the people I follow.
Person1.outgoing(:follows).outgoing(:tweeted).depth(2).filter("position.length() == 2;")
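
For comparison, the same two-hop read can be sketched in plain Ruby without
Neo4j at all. The follows and tweeted hashes below are hypothetical stand-ins
for the two relationship types, and home_timeline is an invented name; the
point is only that posting a tweet touches one entry and the work moves to
read time.

```ruby
# Hypothetical adjacency lists standing in for the two relationship types.
follows = { me: [:ann, :ben], ann: [:me] }
tweeted = { ann: ["a1", "a2"], ben: ["b1"], me: ["m1"] }

# Two hops: user -follows-> person -tweeted-> tweet.
def home_timeline(user, follows, tweeted)
  follows.fetch(user, []).flat_map { |person| tweeted.fetch(person, []) }
end

p home_timeline(:me, follows, tweeted)   # => ["a1", "a2", "b1"]
```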

I don't want to get every tweet of every person I follow, just 100 of them.

Person1.outgoing(:follows).outgoing(:tweeted).depth(2).
  filter("position.length() == 2;").prune("position.returnedNodesCount() > 100")

But what about ordering them, so I get the latest 100 tweets out of all the
tweets posted by the people a person follows?

What are some options here?

A) Return all of them, and then filter them so only the top 100 are displayed
(see getLatestEvents in
https://trac.neo4j.org/browser/examples/activity-stream/src/main/java/org/neo4j/examples/activitystream/ActivityStreamExample.java?rev=3888 )

B) A custom Evaluator that adds tweets to an ordered set and trims off the
old tweets as it goes, based on a tweet date property.  (It will have to check
a property on every tweet, like
http://sujitpal.blogspot.com/2009/06/custom-traverser-for-neo4j.html )

C) Put the tweet date on the Tweeted Relationship, so we only check at the
relationship property level

D) Claim defeat and use a "tweeted_recently" relationship added from every
new tweet to every follower (with trimming) sort of like Redis

E) Claim defeat and use an index to store the latest tweets like Redis

F) (assuming no id reuse) A custom Breadth First Traverser that evaluates
just the last 100 Tweeted relationships ordered by id in conjunction with B.
(is this possible?)
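
To compare A and B outside of Neo4j, here is a minimal sketch in plain Ruby.
The Tweet struct, its posted_at field (an integer timestamp), and both method
names are assumptions made for illustration. A sorted array stands in for the
ordered set in B, and this is not how a real Evaluator would be written.

```ruby
Tweet = Struct.new(:id, :posted_at)

# Option A: collect everything, sort newest first, keep the first n.
def latest_by_full_sort(tweets, n)
  tweets.sort_by { |t| -t.posted_at }.first(n)
end

# Option B: walk the tweets once, holding at most n of them in a buffer
# that stays sorted oldest first, and drop the oldest on overflow.
def latest_by_bounded_buffer(tweets, n)
  buffer = []
  tweets.each do |t|
    # Skip early when the buffer is full and this tweet is not newer
    # than the oldest one we are keeping.
    next if buffer.size == n && t.posted_at <= buffer.first.posted_at
    idx = buffer.index { |b| b.posted_at > t.posted_at } || buffer.size
    buffer.insert(idx, t)
    buffer.shift if buffer.size > n
  end
  buffer.reverse   # newest first
end

# Ten tweets with distinct, shuffled timestamps.
tweets = (1..10).map { |i| Tweet.new(i, (i * 7) % 11) }

p latest_by_full_sort(tweets, 3).map(&:id)        # => [3, 6, 9]
p latest_by_bounded_buffer(tweets, 3).map(&:id)   # => [3, 6, 9]
```

Both still look at every tweet once; B just keeps memory bounded at n instead
of holding the whole result before trimming.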

Other ideas?

How scalable are these solutions?

_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
