Re: Processing a splittable file from a single executor

2017-11-16 Thread Jeroen Miller
On 16 Nov 2017, at 10:22, Michael Shtelma wrote:
> you call repartition(1) before starting processing your files. This
> will ensure that you end up with just one partition.

One question and one remark:

Q) val ds = sqlContext.read.parquet(path).repartition(1)
Am I absolutely sure that my file h

Processing a splittable file from a single executor

2017-11-16 Thread Jeroen Miller
Dear Sparkers,

A while back, I asked how to process non-splittable files in parallel, one file per executor. Vadim's suggested "scheduling within an application" approach worked out beautifully.

I am now facing the 'opposite' problem:
- I have a bunch of parquet files to process
- Once proce