Re: How to process one partition at a time?

Andrei Wed, 06 Apr 2016 03:19:57 -0700

I'm writing a kind of sampler which in most cases will require only 1
partition, sometimes 2 and very rarely more. So it doesn't make sense to
process all partitions in parallel. What is the easiest way to limit
computations to one partition only?


So far the best idea I've come up with is to create a custom RDD whose
`compute` method looks something like:

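override def compute(split: Partition, context: TaskContext): Iterator[T] = {
    if (split.index == targetPartition) {
        // do computation and return its result as an iterator
    } else {
        // every other partition returns an empty iterator
        Iterator.empty
    }
}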



But it's quite ugly, and I'm unlikely to be the first person with such a
need. Is there an easier way to do it?
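
For concreteness, here is a minimal sketch of the custom-RDD idea described
above. The class name `SinglePartitionRDD` and its `parent` and
`targetPartition` parameters are hypothetical names chosen for illustration;
only `RDD`, `Partition`, `TaskContext`, `getPartitions`, `compute`,
`partitions` and `iterator` come from Spark itself. It assumes the wrapper
simply reuses the parent's partitions, so tasks are still scheduled for every
partition, but all except the target one return immediately.

```scala
import scala.reflect.ClassTag

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper RDD: it has the same partitions as `parent`, but only
// `targetPartition` does any work; all others yield an empty iterator.
class SinglePartitionRDD[T: ClassTag](parent: RDD[T], targetPartition: Int)
    extends RDD[T](parent) {

  // Reuse the parent's partitions so that partition indices line up.
  override protected def getPartitions: Array[Partition] = parent.partitions

  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    if (split.index == targetPartition) parent.iterator(split, context)
    else Iterator.empty
}
```

Usage would then look like `new SinglePartitionRDD(rdd, 2).collect()`, which
should return only the elements that live in partition 2 of `rdd`.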
