Map-Reduce on top of cassandra

Or Yanay Mon, 14 Mar 2011 08:06:28 -0700

Hi All,

I am trying to write some map-reduce jobs to answer questions such as: how 
many records have status X?
I am using Cassandra 0.7.0 and have 5 nodes with ~100 GB of data on each node.


I have written the code based on the word_count example, and the map-reduce 
job runs successfully but is extremely slow (about 2 hours for the simplest 
key count).

I am now looking to track down the slowness and tune my process, or explore 
alternative ways to achieve the same goal.

Can anyone point me to a way to tune my map-reduce job?
Does anyone have experience exploring Cassandra data with the Hadoop cluster 
configuration suggested in 
http://wiki.apache.org/cassandra/HadoopSupport#ClusterConfig?

Thanks,
Orr
