Re: Why no virtual nodes for Cassandra on EC2?

mck Mon, 23 Feb 2015 05:24:31 -0800

> … my understanding was that
> performance of Hadoop jobs on C* clusters with vnodes was poor because a
> given Hadoop input split has to run many individual scans (one for each
> vnode) rather than just a single scan.  I've run C* and Hadoop in
> production with a custom input format that used vnodes (and just combined
> multiple vnodes in a single input split) and didn't have any issues (the
> jobs had many other performance bottlenecks besides starting multiple
> scans from C*).


You've described the ticket, and how it has been solved :-)
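
For anyone following along, the idea of combining multiple vnodes in a single input split can be sketched roughly as below. This is only an illustration, not the code from the ticket or from any Cassandra or Hadoop API: the function name, the (start_token, end_token, owner) tuple layout, and the max_ranges_per_split parameter are all made up for the example.

```python
from collections import defaultdict


def combine_vnode_ranges(token_ranges, max_ranges_per_split):
    """Group per-vnode token ranges by owning node, then chunk them into splits.

    token_ranges: iterable of (start_token, end_token, owner) tuples
                  (hypothetical layout, one tuple per vnode range).
    Returns a list of (owner, [(start_token, end_token), ...]) splits.
    """
    # Collect every vnode range under the node that owns it.
    by_owner = defaultdict(list)
    for start, end, owner in token_ranges:
        by_owner[owner].append((start, end))

    # Emit one split per chunk of ranges, so a single task scans
    # several vnode ranges on the same node instead of one task per vnode.
    splits = []
    for owner in sorted(by_owner):
        ranges = sorted(by_owner[owner])
        for i in range(0, len(ranges), max_ranges_per_split):
            splits.append((owner, ranges[i:i + max_ranges_per_split]))
    return splits
```

With 256 vnodes per node and max_ranges_per_split set to 64, each node would yield 4 splits rather than 256. Each split still runs one scan per range, but the per-task startup cost is paid far less often.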

> This is one of the videos where I recall an off-hand mention of the Spark
> connector working with vnodes:
> https://www.youtube.com/watch?v=1NtnrdIUlg0

Thanks.

~mck
