Pei-Lun Lee created SPARK-6581:
----------------------------------

             Summary: Metadata is missing when saving parquet file using hadoop 1.0.4
                 Key: SPARK-6581
                 URL: https://issues.apache.org/jira/browse/SPARK-6581
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.3.0
         Environment: hadoop 1.0.4
            Reporter: Pei-Lun Lee

When saving a parquet file with
{code}df.save("foo", "parquet"){code}
only _common_metadata is generated, while _metadata is missing:
{noformat}
-rwxrwxrwx  1 peilunlee  staff    0 Mar 27 11:29 _SUCCESS*
-rwxrwxrwx  1 peilunlee  staff  250 Mar 27 11:29 _common_metadata*
-rwxrwxrwx  1 peilunlee  staff  272 Mar 27 11:29 part-r-00001.parquet*
-rwxrwxrwx  1 peilunlee  staff  272 Mar 27 11:29 part-r-00002.parquet*
-rwxrwxrwx  1 peilunlee  staff  272 Mar 27 11:29 part-r-00003.parquet*
-rwxrwxrwx  1 peilunlee  staff  488 Mar 27 11:29 part-r-00004.parquet*
{noformat}

When saving with
{code}df.save("foo", "parquet", SaveMode.Overwrite){code}
both _metadata and _common_metadata are missing:
{noformat}
-rwxrwxrwx  1 peilunlee  staff    0 Mar 27 11:29 _SUCCESS*
-rwxrwxrwx  1 peilunlee  staff  272 Mar 27 11:29 part-r-00001.parquet*
-rwxrwxrwx  1 peilunlee  staff  272 Mar 27 11:29 part-r-00002.parquet*
-rwxrwxrwx  1 peilunlee  staff  272 Mar 27 11:29 part-r-00003.parquet*
-rwxrwxrwx  1 peilunlee  staff  488 Mar 27 11:29 part-r-00004.parquet*
{noformat}