I doubt Titan would be able to give you traversal of billions of nodes in
real time either. In-memory traversal is typically much faster than
Cassandra-based tree traversal, even with in-memory caching.
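
To illustrate what in-memory traversal looks like, here is a minimal Python
sketch. It is hypothetical and is not Titan or GraphX code; the function
`neighbors_within` and the toy `graph` are made up for illustration. Each hop
is a hash lookup in local memory, whereas a store-backed traversal generally
pays a storage read per hop whenever the cache misses.

```python
from collections import deque


def neighbors_within(adjacency, start, max_hops):
    """Breadth-first search: map each node reachable from `start`
    in at most `max_hops` hops to its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        # Each hop is a plain dictionary lookup in local memory.
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen[nxt] = depth + 1
                queue.append(nxt)
    return seen


# Toy adjacency list (illustrative data only).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
print(neighbors_within(graph, "a", 2))
```

This only works while the whole adjacency structure fits in the memory of one
process, which is the hard part with billions of nodes.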


On Tue, Apr 8, 2014 at 1:23 PM, Nick Pentreath <nick.pentre...@gmail.com> wrote:

> GraphX, like Spark, will not typically be "real-time" (where by "real-time"
> here I assume you mean on the order of tens to hundreds of milliseconds, up
> to a few seconds).
>
> Spark can in some cases approach the upper boundary of this definition (a
> second or two, possibly less) when data is cached in memory and the
> computation is not "too heavy", while Spark Streaming may be able to get
> closer to the mid-to-upper boundary of this under similar conditions,
> especially if aggregating over relatively small windows.
>
> However, for this use case (while I haven't used GraphX yet) I would say
> something like Titan (https://github.com/thinkaurelius/titan/wiki) or a
> similar OLTP graph DB may be what you're after. But this depends on what
> kind of graph traversal you need.
>
>
>
>
> On Tue, Apr 8, 2014 at 10:02 PM, love2dishtech <love2disht...@gmail.com> wrote:
>
> > Hi,
> >
> > Is GraphX, on top of Apache Spark, able to process large-scale
> > distributed graph traversal and computation in real time? What is the
> > query execution engine that distributes queries on top of GraphX and
> > Apache Spark? My typical use case is large-scale distributed graph
> > traversal in real time, with billions of nodes.
> >
> > Thanks,
> > Love.
> >
> >
> >
> > --
> > View this message in context:
> >
> > http://apache-spark-developers-list.1001551.n3.nabble.com/Apache-Spark-and-Graphx-for-Real-Time-Analytics-tp6261.html
> > Sent from the Apache Spark Developers List mailing list archive at
> > Nabble.com.
> >
>



-- 
Evan Chan
Staff Engineer
e...@ooyala.com

