Franklyn Dsouza created SPARK-19388:
---------------------------------------

Summary: Reading an empty folder as parquet causes an Analysis Exception
Key: SPARK-19388
URL: https://issues.apache.org/jira/browse/SPARK-19388
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 2.1.0
Reporter: Franklyn Dsouza
Priority: Minor

Reading an empty folder as parquet returned an empty DataFrame up through 2.0. In 2.1.0 the same call raises an AnalysisException:

{code}
In [1]: df = sc.sql.read.parquet("empty_dir/")
---------------------------------------------------------------------------
AnalysisException                         Traceback (most recent call last)
----> 1 df = sqlCtx.read.parquet("empty_dir/")

spark/99f3dfa6151e312379a7381b7e65637df0429941/python/pyspark/sql/readwriter.pyc in parquet(self, *paths)
    272         [('name', 'string'), ('year', 'int'), ('month', 'int'), ('day', 'int')]
    273         """
--> 274         return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths)))
    275
    276     @ignore_unicode_prefix

park/99f3dfa6151e312379a7381b7e65637df0429941/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1131         answer = self.gateway_client.send_command(command)
   1132         return_value = get_return_value(
-> 1133             answer, self.gateway_client, self.target_id, self.name)
   1134
   1135         for temp_arg in temp_args:

spark/99f3dfa6151e312379a7381b7e65637df0429941/python/pyspark/sql/utils.pyc in deco(*a, **kw)
     67                                              e.java_exception.getStackTrace()))
     68             if s.startswith('org.apache.spark.sql.AnalysisException: '):
---> 69                 raise AnalysisException(s.split(': ', 1)[1], stackTrace)
     70             if s.startswith('org.apache.spark.sql.catalyst.analysis'):
     71                 raise AnalysisException(s.split(': ', 1)[1], stackTrace)

AnalysisException: u'Unable to infer schema for Parquet. It must be specified manually.;'
{code}
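
The exception message says the schema "must be specified manually", which suggests one possible caller-side workaround. The following is only a sketch under stated assumptions, not a fix confirmed in this issue: `has_data_files` and `read_parquet_or_empty` are hypothetical helper names, the path is assumed to be on the local filesystem, and the caller is assumed to already hold a SparkSession and a `StructType` for the expected columns. The helper checks for data files first and builds an empty DataFrame from the known schema when there are none.

```python
import os


def has_data_files(path):
    """Return True if `path` contains at least one file that is not hidden.

    Assumption: names starting with "_" or "." (for example _SUCCESS) are
    treated as metadata rather than data. Local filesystem paths only.
    """
    for _root, dirs, files in os.walk(path):
        # Prune hidden/metadata directories so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if not d.startswith(("_", "."))]
        if any(not name.startswith(("_", ".")) for name in files):
            return True
    return False


def read_parquet_or_empty(spark, path, schema):
    """Hypothetical helper: read parquet from `path`, or return an empty
    DataFrame with `schema` when the directory holds no data files.

    `spark` is assumed to be a SparkSession and `schema` a StructType.
    """
    if has_data_files(path):
        # Pass the schema explicitly so both branches yield the same columns.
        return spark.read.schema(schema).parquet(path)
    # Nothing to infer a schema from: build the empty DataFrame directly.
    return spark.createDataFrame([], schema)
```

With a SparkSession named `spark` and a suitable `schema`, `read_parquet_or_empty(spark, "empty_dir/", schema)` would stand in for the failing `read.parquet("empty_dir/")` call. The directory check is a local-filesystem approximation and would need a different implementation for HDFS or object stores.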