Hi, I am using Spark 1.4.1 and saving an ORC file with df.write.format("orc").save("outputlocation").
The output location is about 440 GB, and when I read it back with df.read.format("orc").load("outputlocation").count, the job has 2618 partitions. The count runs fine up to about 2500 tasks, but after that it starts delay scheduling, which results in slow performance.

*If anyone has any idea about this, please do reply, as I need this urgently.*

Thanks in advance.

Regards,
Renu Yadav