You will have to edit the metadata log file under the _spark_metadata folder (written by the Structured Streaming file sink) to remove the entries that list the corrupt files.
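In case it helps, here is an illustrative sketch of what such a metadata log file can look like. The exact layout depends on your Spark version, and the paths, sizes, and timestamps below are made up; the point is only the general shape: a version header followed by one JSON entry per output file. Deleting the entry that references the corrupt file should stop Spark from listing it.

```
$ cat /data/testdir/_spark_metadata/0
v1
{"path":"/data/testdir/data1.parquet","size":1024,"isDir":false,"modificationTime":1483142400000,"blockReplication":1,"blockSize":134217728,"action":"add"}
{"path":"/data/testdir/corruptblock.0","size":512,"isDir":false,"modificationTime":1483142400000,"blockReplication":1,"blockSize":134217728,"action":"add"}
```

Removing the line for corruptblock.0 (and keeping the version header intact) would leave only the good file listed. Back up the metadata folder before editing it by hand.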
Thanks,
Shobhit G

Sent from my iPhone

> On Dec 31, 2016, at 8:11 PM, khyati [via Apache Spark Developers List]
> <ml-node+s1001551n20418...@n3.nabble.com> wrote:
>
> Hi,
>
> I am trying to read multiple Parquet files in Spark SQL. In one directory
> there are two files, one of which is corrupted. While trying to read these
> files, Spark SQL throws an exception for the corrupted file:
>
> val newDataDF =
>   sqlContext.read.parquet("/data/testdir/data1.parquet", "/data/testdir/corruptblock.0")
>
> newDataDF.show
>
> Is there any way to skip the file with the corrupted block/footer and read
> only the file(s) that are proper?
>
> Thanks

-----
Regards,
Abhi