Re: Spark partitions from CassandraRDD

2015-09-04 Thread Ankur Srivastava
Oh, if that is the case then you can try tuning "spark.cassandra.input.split.size". spark.cassandra.input.split.size | approx number of Cassandra partitions in a Spark partition | 10. Hope this helps. Thanks, Ankur On Thu, Sep 3, 2015 at 12:22 PM, Alaa Zubaidi (PDF)

Re: Spark partitions from CassandraRDD

2015-09-03 Thread Ankur Srivastava
Hi Alaa, Partitioning when using CassandraRDD depends on the partition key of your Cassandra table. If you see only 1 partition in the RDD, it means all the rows you have selected have the same partition_key in C*. Thanks, Ankur On Thu, Sep 3, 2015 at 11:54 AM, Alaa Zubaidi (PDF)

Spark partitions from CassandraRDD

2015-09-03 Thread Alaa Zubaidi (PDF)
Hi, I am testing Spark and Cassandra: Spark 1.4, Cassandra 2.1.7, Cassandra Spark connector 1.4, running in standalone mode. I am getting 4000 rows from Cassandra (4 MB row), where the row keys are random. .. sc.cassandraTable[RES](keyspace, res_name).where(res_where).cache I am expecting that it

Re: Spark partitions from CassandraRDD

2015-09-03 Thread Alaa Zubaidi (PDF)
Thanks Ankur, But I grabbed some keys from the Spark results and ran "nodetool -h getendpoints " and it showed the data is coming from at least 2 nodes? Regards, Alaa On Thu, Sep 3, 2015 at 12:06 PM, Ankur Srivastava < ankur.srivast...@gmail.com> wrote: > Hi Alaa, > > Partition when