What do you mean by "starts delay scheduling"? Are you saying it is no longer
doing local reads?

If that's the case, you can increase the spark.locality.wait timeout.
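A minimal sketch of what that would look like, assuming the standard `spark.locality.wait` setting (the `10s` value below is illustrative, not a recommendation; the default is 3s):

```scala
import org.apache.spark.SparkConf

// Raise the time the scheduler waits for a node-local slot before
// falling back to a less-local one (rack-local, then any).
val conf = new SparkConf()
  .set("spark.locality.wait", "10s")
```

The same setting can also be passed at submit time with `--conf spark.locality.wait=10s`, and finer-grained variants (`spark.locality.wait.node`, `spark.locality.wait.rack`) exist if only one level needs tuning.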

On Wednesday, November 18, 2015, Renu Yadav <yren...@gmail.com> wrote:

> Hi,
> I am using spark 1.4.1 and saving orc file using
> df.write.format("orc").save("outputlocation")
>
> output location size: 440GB
>
> and while reading: sqlContext.read.format("orc").load("outputlocation").count
>
>
> it has 2618 partitions.
> The count operation runs fine up to about 2500 tasks, but delay scheduling
> kicks in after that, which results in slow performance.
>
> *If anyone has any idea on this, please do reply, as I need this very
> urgently.*
>
> Thanks in advance
>
>
> Regards,
> Renu Yadav
>
>
>
