You will have to edit the metadata log file under the _spark_metadata folder (written by the Structured Streaming file sink) to remove the entries that list the corrupt files.
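In case it helps, here is an illustrative sketch of what such a metadata log file can look like. The exact layout depends on your Spark version, and the paths, sizes, and timestamps below are made up; the point is only the general shape: a version header followed by one JSON entry per output file. Deleting the entry that references the corrupt file should stop Spark from listing it.

```
$ cat /data/testdir/_spark_metadata/0
v1
{"path":"/data/testdir/data1.parquet","size":1024,"isDir":false,"modificationTime":1483142400000,"blockReplication":1,"blockSize":134217728,"action":"add"}
{"path":"/data/testdir/corruptblock.0","size":512,"isDir":false,"modificationTime":1483142400000,"blockReplication":1,"blockSize":134217728,"action":"add"}
```

Removing the line for corruptblock.0 (and keeping the version header intact) would leave only the good file listed. Back up the metadata folder before editing it by hand.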
Thanks,
Shobhit G

Sent from my iPhone

> On Dec 31, 2016, at 8:11 PM, khyati [via Apache Spark Developers List]
> <ml-node+s1001551n20418...@n3.nabble.com> wrote:
>
> Hi,
>
> I am trying to read multiple Parquet files in Spark SQL. In one directory
> there are two files, one of which is corrupted. While trying to read these
> files, Spark SQL throws an exception for the corrupted file:
>
> val newDataDF =
>   sqlContext.read.parquet("/data/testdir/data1.parquet", "/data/testdir/corruptblock.0")
>
> newDataDF.show
>
> Is there any way to skip the file with the corrupted block/footer and read
> only the file(s) that are proper?
>
> Thanks

-----
Regards,
Abhi