We are receiving files from an outside vendor who creates a Parquet data file and gzips it before delivery. Does anyone know how to gunzip the file in Spark and load the Parquet data into a DataFrame? I thought sc.textFile or sc.wholeTextFiles would automatically gunzip the file, but I'm getting a decompression header error when trying to open the Parquet file.
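
In case it helps frame the question, here is a rough sketch of the workaround I'm considering: strip the gzip wrapper first using Hadoop's codec machinery, then point the normal Parquet reader at the decompressed copy. The paths and the SparkSession setup below are placeholders (and I'm assuming Spark 2.x here), so treat this as an outline rather than something I've validated:

    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.hadoop.io.IOUtils
    import org.apache.hadoop.io.compress.CompressionCodecFactory
    import org.apache.spark.sql.SparkSession

    object GunzipThenRead {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("GunzipThenRead").getOrCreate()

        // Placeholder paths: the vendor drop and a scratch location.
        val gzPath  = new Path("/data/incoming/vendor_file.parquet.gz")
        val outPath = new Path("/data/staging/vendor_file.parquet")

        val conf = spark.sparkContext.hadoopConfiguration
        val fs   = FileSystem.get(conf)

        // Pick the codec from the .gz extension and stream-decompress the file.
        val codec = new CompressionCodecFactory(conf).getCodec(gzPath)
        val in  = codec.createInputStream(fs.open(gzPath))
        val out = fs.create(outPath, true)
        try {
          IOUtils.copyBytes(in, out, conf, false)
        } finally {
          in.close()
          out.close()
        }

        // The file is now plain Parquet, so the normal reader can open it.
        val df = spark.read.parquet(outPath.toString)
        df.show()
      }
    }

One concern with this approach is that the decompression streams the whole file through a single JVM, so it only seems workable while the vendor files stay modest in size. If Spark can read a gzipped Parquet file in place without this staging step, I'd much rather do that.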
Thanks,
Ben