Re: spark optimized pagination

2018-06-11 Thread vaquar khan
Spark is a processing engine, not a storage layer or a cache. You can write your results back to Cassandra, and if you still see latency, you can use a cache to hold the Spark results. In short, the answer is no: Spark doesn't provide any API that gives you cache-like storage. Directly reading from dataset millions of

Re: spark optimized pagination

2018-06-11 Thread Teemu Heikkilä
So you are currently serving the data on demand through Spark? I suggest you change your API to query Cassandra and have Spark store its results back there. That way you only have to process the whole dataset once, and Cassandra is well suited to that kind of workload. -T > On 10 Jun 2018,
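
The suggestion above (compute once, store the result, page from the store) can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory dict is a hypothetical stand-in for a Cassandra table, and `store_results` and `fetch_page` are made-up helper names, not part of Spark, Cassandra, or any connector.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a Cassandra table keyed by query id.
# In a real deployment the Spark job would write these rows to Cassandra.
ResultStore = Dict[str, List[dict]]


def store_results(store: ResultStore, query_id: str, rows: List[dict]) -> None:
    """Persist the full aggregated result once, after the expensive job finishes."""
    store[query_id] = list(rows)


def fetch_page(store: ResultStore, query_id: str, page_size: int,
               cursor: int = 0) -> Tuple[List[dict], Optional[int]]:
    """Return one page plus the cursor for the next page (None when exhausted)."""
    rows = store.get(query_id, [])
    page = rows[cursor:cursor + page_size]
    next_cursor = cursor + page_size
    return page, (next_cursor if next_cursor < len(rows) else None)


store: ResultStore = {}
# Pretend this list came out of a Spark aggregation (assumed data).
store_results(store, "q1", [{"id": i, "total": i * 10} for i in range(5)])

# The REST layer serves pages from the store without re-running the job.
first_page, cursor = fetch_page(store, "q1", page_size=2)
second_page, cursor2 = fetch_page(store, "q1", page_size=2, cursor=cursor)
```

The point of the design is that the client only ever receives `page_size` rows per request, and the aggregation itself runs once per query rather than once per page.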

Re: spark optimized pagination

2018-06-10 Thread Deepak Goel
I think your requirement is that of an OLTP system. Spark and Cassandra are better suited to batch jobs (they can be used for OLTP, but there would be a performance hit). Deepak "The greatness of a nation can be judged by the way its animals are treated. Please consider stopping the cruelty by

spark optimized pagination

2018-06-09 Thread onmstester onmstester
Hi, I'm using Spark on top of Cassandra as the CRUD backend of a RESTful application. Most of the REST APIs retrieve a huge amount of data from Cassandra and perform a lot of aggregation on it in Spark, which takes a few seconds. Problem: sometimes the output result is a big list which makes clien