Hi,
I am using Spark 1.4.1 and saving an ORC file using
df.write.format("orc").save("outputlocation")

The output location is about 440 GB in size.

When I read it back and count with
sqlContext.read.format("orc").load("outputlocation").count()

the resulting DataFrame has 2618 partitions.
The count operation runs fine up until about 2500 tasks, but after that the
tasks start hitting delay scheduling, which results in slow performance.
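In case the full picture helps, here is a minimal sketch of the job as I run
it (the object and app names are placeholders, and using a HiveContext is my
assumption, since ORC support in 1.4 sits in the Hive module):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Minimal sketch of the whole job; names and paths are placeholders.
object OrcCountJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("orc-count"))
    val sqlContext = new HiveContext(sc)

    // Read the ~440 GB of ORC files back and count the rows.
    val df = sqlContext.read.format("orc").load("outputlocation")
    println(df.rdd.partitions.length)  // 2618 partitions in my run
    println(df.count())                // fine up to ~2500 tasks, then slows down
  }
}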

*If anyone has any idea on this, please do reply, as I need to resolve it urgently.*
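The only workaround I have thought of so far (untested) is to shorten the
scheduler's locality wait, since spark.locality.wait is the setting that
controls delay scheduling; the value below is just a guess on my part:

import org.apache.spark.SparkConf

// Untested idea: shorten (or disable, with "0") the delay-scheduling wait so
// tasks fall back to less-local executors instead of stalling for local slots.
val conf = new SparkConf()
  .setAppName("orc-count")
  .set("spark.locality.wait", "0")

I have not verified whether giving up data locality is safe for a read this
large, so any advice there is also welcome.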

Thanks in advance


Regards,
Renu Yadav
