I'm not sure how Spark will distribute that data, but yes, Spark can be used to retrieve 
such data from Cassandra.
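
A rough sketch with the Spark Cassandra Connector (the keyspace, table, column, and 
key names below are placeholders, not from your schema):

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

// Assumed contact point; adjust for your cluster.
val conf = new SparkConf()
  .setAppName("read-large-partition")
  .set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new SparkContext(conf)

// Push the partition-key restriction down to Cassandra so only
// that key's rows are scanned, not the whole table.
val rows = sc.cassandraTable("my_keyspace", "my_table")
  .where("pk = ?", "the_key")

println(s"rows for key: ${rows.count()}")

One caveat, as far as I know: the connector splits reads by token range, so all rows 
for a single partition key land in one Spark task. Spark parallelizes across keys, not 
within one, so it may not actually spread those 300k rows across nodes.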


Regards,
Nitan
Cell: 510 449 9629

> On Jan 17, 2019, at 2:15 PM, Goutham reddy <goutham.chiru...@gmail.com> wrote:
> 
> Hi,
> Although a partition can hold up to 2 billion cells, it is an anti-pattern to 
> keep such a huge data set under one partition key. In our case it is only 300k 
> rows, yet when we query for one particular key we get a timeout exception. If I 
> use Spark to fetch the 300k rows for a particular key, will that solve the 
> timeout problem and distribute the data across the Spark nodes, or will it 
> still throw timeout exceptions? Can you please share the best practice for 
> retrieving the data for a key with 300k rows? Any help is highly appreciated.
> 
> Regards
> Goutham.
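
On the timeouts themselves: before reaching for Spark, it may be worth paging the 
read on the driver side instead of pulling all 300k rows in one request. A rough 
sketch with the DataStax Java driver (contact point, keyspace, table, and key are 
placeholders):

import com.datastax.driver.core.{Cluster, SimpleStatement}
import scala.collection.JavaConverters._

val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
val session = cluster.connect()

// The fetch size caps how many rows come back per page, so no single
// response has to materialize all 300k rows at once.
val stmt = new SimpleStatement(
  "SELECT * FROM my_keyspace.my_table WHERE pk = ?", "the_key")
stmt.setFetchSize(1000)

// The driver fetches subsequent pages transparently as we iterate.
session.execute(stmt).iterator().asScala.foreach { row =>
  // process row
}

cluster.close()

If it still times out with small pages, I'd look at read_request_timeout_in_ms on 
the server side and whether the partition is bloated by tombstones.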
