I'm not sure about Spark's data distribution here, but yes, Spark can be used to retrieve that data from Cassandra.
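One hedged sketch, in case it helps: the DataStax Spark Cassandra Connector pushes a partition-key predicate down to Cassandra and pages through the partition in chunks, which avoids the single monolithic read that is timing out. One caveat: the rows for a single partition key live on one replica set, so Spark cannot spread that read across many Cassandra nodes; the win is paging, plus whatever processing you do afterwards in parallel on the Spark side. The keyspace (ks), table (events), partition-key column (id), contact point, and fetch size below are all placeholders for illustration:

import org.apache.spark.sql.SparkSession

object ReadLargePartition {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-large-partition")
      // Placeholder contact point; point this at your cluster.
      .config("spark.cassandra.connection.host", "127.0.0.1")
      // Smaller pages per round trip; tune down if reads still time out.
      .config("spark.cassandra.input.fetch.sizeInRows", "1000")
      .getOrCreate()

    // Hypothetical keyspace/table/column names; substitute your own.
    val rows = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "ks", "table" -> "events"))
      .load()
      .filter("id = 'some-key'") // pushed down to Cassandra as a partition-key restriction

    println(s"rows for key: ${rows.count()}")
    spark.stop()
  }
}

If you don't need Spark for anything else, note that simply lowering the page size on a plain CQL query (the driver's fetch size) and iterating the paged result may also get you past the timeout.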
Regards,
Nitan
Cell: 510 449 9629

> On Jan 17, 2019, at 2:15 PM, Goutham reddy <goutham.chiru...@gmail.com> wrote:
>
> Hi,
> Although each partition key can hold up to 2 billion rows, it is an
> anti-pattern to have such a huge data set under one partition key. In our
> case it is only 300k rows, but when trying to query one particular key we
> are getting a timeout exception. If I use Spark to fetch the 300k rows for
> a particular key, does that solve the timeout problem and distribute the
> data across the Spark nodes, or will it still throw timeout exceptions?
> Can you please help me with the best practice for retrieving the data for
> a key with 300k rows. Any help is highly appreciated.
>
> Regards,
> Goutham.