Re: Pagination on big table, splitting joins

2015-08-10 Thread Michael Armbrust
> I am thinking of using the *toLocalIterator* method or something like that, but I have doubts about memory and parallelism, and I am sure there is a better way to do it.
It will still run all earlier parts of the job in parallel. Only the actual retrieval of the final partitions will be serial. This is
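
The behaviour described in the reply could be sketched roughly as follows in PySpark. This is only an illustration under stated assumptions, not the code from the thread: the helper names `batches` and `drain`, the batch size, and the table name `entities` are all hypothetical, and `run_with_spark` assumes a PySpark version that provides `SparkSession`. The upstream stages still run in parallel on the cluster, while `toLocalIterator()` hands partitions to the driver one at a time, so the driver needs roughly enough memory for the largest single partition rather than for the whole result.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a (possibly lazy) stream of rows into lists of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def drain(rows: Iterable[T], batch_size: int,
          send: Callable[[List[T]], None]) -> int:
    """Pass each batch to `send` and return the number of rows handled."""
    total = 0
    for chunk in batches(rows, batch_size):
        send(chunk)
        total += len(chunk)
    return total


def run_with_spark() -> None:
    # Assumption: pyspark is installed and a table named "entities" exists.
    # Both the table name and the batch size are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pagination-sketch").getOrCreate()
    df = spark.table("entities")
    # toLocalIterator() fetches one partition at a time to the driver;
    # the stages that produce those partitions still run in parallel.
    drain(df.rdd.toLocalIterator(), 1000, lambda chunk: print(len(chunk)))
    spark.stop()
```

Because `drain` accepts any iterable, it can be tried without a cluster, for example `drain(range(7), 3, print)`. In a real job the `send` callback would be whatever publishes the batch downstream (for instance to Kafka).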

Pagination on big table, splitting joins

2015-08-08 Thread Gaspar Muñoz
Hi, I have two different parts in my system. 1. A batch application that every x minutes runs SQL queries across several tables containing millions of rows to compose an entity, and sends those entities to Kafka. 2. A streaming application that processes data from Kafka. Now, I have entire system