Hi everyone, I tried upgrading from Spark 1.6.2 to Spark 2.0.0 but ran into an issue reading existing data. Here's how the traceback looks in spark-shell:
scala> spark.read.parquet("/path/to/data")
org.apache.spark.sql.AnalysisException: Unable to infer schema for ParquetFormat at /path/to/data. It must be specified manually;
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$16.apply(DataSource.scala:397)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$16.apply(DataSource.scala:397)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:396)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
  at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:427)
  at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:411)
  ... 48 elided

If I enable DEBUG logging with sc.setLogLevel("DEBUG"), here's what I additionally see in the output: https://gist.github.com/immerrr/4474021ae70f35b7b9e262251c0abc59

Of course, that same data is read and processed correctly by Spark 1.6.2. Any idea what might be wrong here?

Cheers,
immerrr
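
P.S. The error says the schema "must be specified manually", so I suppose I could work around it by passing a schema explicitly via DataFrameReader.schema. A sketch of what I mean (the column names and types below are just placeholders, not my actual schema):

```scala
// Workaround sketch: supply the schema explicitly instead of relying on inference.
// Run inside spark-shell, where `spark` (SparkSession) is already defined.
// Column names and types are hypothetical placeholders.
import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("id", LongType, nullable = false),
  StructField("value", StringType, nullable = true)
))

val df = spark.read.schema(schema).parquet("/path/to/data")
```

But that only sidesteps the inference step; I'd still like to understand why inference works on 1.6.2 and fails on 2.0.0 for the same files.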