Oh, if that is the case then you can try tuning
"spark.cassandra.input.split.size":

spark.cassandra.input.split.size
    approx number of Cassandra partitions in a Spark partition: 10
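
In case it is useful, here is a minimal sketch of how that property could be set on the SparkConf before the context is created. The host address, keyspace and table names, and the value 1000 are placeholders of mine, not taken from your setup, and it assumes the job is launched with spark-submit with the connector on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object SplitSizeSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("split-size-sketch")
      // Assumed contact point; replace with one of your Cassandra nodes.
      .set("spark.cassandra.connection.host", "127.0.0.1")
      // Illustrative value only: a smaller split size should yield more Spark partitions.
      .set("spark.cassandra.input.split.size", "1000")

    val sc = new SparkContext(conf)

    // Hypothetical keyspace and table names.
    val rdd = sc.cassandraTable("my_keyspace", "my_table")

    // Check how many Spark partitions the connector actually produced.
    println(s"Spark partitions: ${rdd.partitions.length}")

    sc.stop()
  }
}
```

Printing rdd.partitions.length before and after changing the value is probably the quickest way to see whether the setting has any effect on your table.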
Hope this helps.
Thanks
Ankur
On Thu, Sep 3, 2015 at 12:22 PM, Alaa Zubaidi (PDF)
Hi Alaa,
Partitioning when using CassandraRDD depends on the partition key of your
Cassandra table.
If you see only 1 partition in the RDD, it means all the rows you have
selected have the same partition_key in C*.
Thanks
Ankur
On Thu, Sep 3, 2015 at 11:54 AM, Alaa Zubaidi (PDF)
Hi,
I am testing Spark and Cassandra: Spark 1.4, Cassandra 2.1.7, Cassandra Spark
connector 1.4, running in standalone mode.
I am getting 4000 rows from Cassandra (4 MB row), where the row keys are
random.
.. sc.cassandraTable[RES](keyspace,res_name).where(res_where).cache
I am expecting that it
Thanks Ankur,
But I grabbed some keys from the Spark results and ran "nodetool -h
getendpoints " and it showed the data is coming from at least 2 nodes?
Regards,
Alaa
On Thu, Sep 3, 2015 at 12:06 PM, Ankur Srivastava <
ankur.srivast...@gmail.com> wrote:
> Hi Alaa,
>
> Partition when