Re: SparkSQL exception on cached parquet table

2014-11-20 Thread Sadhan Sood
Also attaching the parquet file if anyone wants to take a further look. On Thu, Nov 20, 2014 at 8:54 AM, Sadhan Sood sadhan.s...@gmail.com wrote: So, I am seeing this issue with Spark SQL throwing an exception when trying to read selective columns from a thrift parquet file and also when …
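For reference, the kind of column-pruned read being described looks roughly like the sketch below, using the Spark 1.x SQLContext API; the path and column names here are placeholders, not the actual data set from the thread.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Minimal sketch: load a thrift-written parquet directory and read only a few columns.
    val sc = new SparkContext(new SparkConf().setAppName("parquet-column-read").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)

    val events = sqlContext.parquetFile("/data/events.parquet")   // placeholder path
    events.registerTempTable("events")

    // Selecting a subset of columns exercises Parquet's column-pruned read path,
    // which is where the exception was reported.
    sqlContext.sql("SELECT event_type, ts FROM events LIMIT 10").collect()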

Re: SparkSQL exception on cached parquet table

2014-11-20 Thread Michael Armbrust
Which version are you running on again? On Thu, Nov 20, 2014 at 8:17 AM, Sadhan Sood sadhan.s...@gmail.com wrote: Also attaching the parquet file if anyone wants to take a further look. On Thu, Nov 20, 2014 at 8:54 AM, Sadhan Sood sadhan.s...@gmail.com wrote: So, I am seeing this issue …

Re: SparkSQL exception on cached parquet table

2014-11-20 Thread Sadhan Sood
I am running on master, pulled yesterday I believe, but saw the same issue with 1.2.0. On Thu, Nov 20, 2014 at 1:37 PM, Michael Armbrust mich...@databricks.com wrote: Which version are you running on again? On Thu, Nov 20, 2014 at 8:17 AM, Sadhan Sood sadhan.s...@gmail.com wrote: Also …

Re: SparkSQL exception on cached parquet table

2014-11-20 Thread Sadhan Sood
Thanks Michael, opened this: https://issues.apache.org/jira/browse/SPARK-4520 On Thu, Nov 20, 2014 at 2:59 PM, Michael Armbrust mich...@databricks.com wrote: Can you open a JIRA? On Thu, Nov 20, 2014 at 10:39 AM, Sadhan Sood sadhan.s...@gmail.com wrote: I am running on master, pulled …

Re: SparkSQL exception on cached parquet table

2014-11-16 Thread Cheng Lian
(Forgot to cc the user mailing list) On 11/16/14 4:59 PM, Cheng Lian wrote: Hey Sadhan, Thanks for the additional information, this is helpful. It seems that some Parquet internal contract was broken, but I'm not sure whether it's caused by Spark SQL or Parquet, or maybe even the Parquet file itself …

Re: SparkSQL exception on cached parquet table

2014-11-16 Thread Sadhan Sood
Hi Cheng, I tried reading the parquet file (on which we were getting the exception) through parquet-tools and it is able to dump the file, and I can read the metadata, etc. I also loaded the file through a Hive table and can run a table scan query on it as well. Let me know if I can do more to help …
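For anyone who wants to run the same sanity check programmatically rather than through parquet-tools, something along these lines reads the footer metadata with the parquet-mr API (package name per the pre-org.apache parquet-mr releases of that era; the path is a placeholder):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import parquet.hadoop.ParquetFileReader

    val conf = new Configuration()
    val footer = ParquetFileReader.readFooter(
      conf, new Path("/data/events.parquet/part-00000.parquet"))  // placeholder file

    // If the file itself is intact, the schema and row-group count come back cleanly.
    println(footer.getFileMetaData.getSchema)
    println(footer.getBlocks.size)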

Re: SparkSQL exception on cached parquet table

2014-11-15 Thread Cheng Lian
Hi Sadhan, Could you please provide the stack trace of the ArrayIndexOutOfBoundsException (if any)? The reason why the first query succeeds is that Spark SQL doesn't bother reading all data from the table to give COUNT(*). In the second case, however, the whole table is asked to be …
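To make that distinction concrete, here is a small sketch (Spark 1.x API; the table name and path are invented) of the two kinds of query: the first can be answered without materializing the Parquet columns, while the second forces the full read path that was failing.

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)  // assumes an existing SparkContext `sc`
    sqlContext.parquetFile("/data/events.parquet").registerTempTable("events")
    sqlContext.cacheTable("events")

    // Succeeds: COUNT(*) doesn't require reading all of the column data.
    sqlContext.sql("SELECT COUNT(*) FROM events").collect()

    // Fails in the reported case: every column has to be read (and cached).
    sqlContext.sql("SELECT * FROM events").collect()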

Re: SparkSQL exception on cached parquet table

2014-11-15 Thread sadhan
Hi Cheng, Thanks for your response. Here is the stack trace from yarn logs: … (View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-exception-on-cached-parquet-table-tp18978p19020.html)

SparkSQL exception on cached parquet table

2014-11-14 Thread Sadhan Sood
While testing SparkSQL on a bunch of parquet files (basically what used to be a partition for one of our Hive tables), I encountered this error:

    import org.apache.spark.sql.SchemaRDD
    import org.apache.hadoop.fs.FileSystem
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    …
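The code in the archived message is cut off after the imports. A rough sketch of the setup described in the thread (loading the old partition directory, caching it as a table, then querying it) follows; the paths, table, and column names are placeholders, not the original snippet.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.sql.{SQLContext, SchemaRDD}

    val sqlContext = new SQLContext(sc)  // `sc` is the existing SparkContext

    // The directory used to be a single partition of a Hive table (placeholder path).
    val partitionDir = "/warehouse/our_table/dt=2014-11-01"

    // Optional: confirm the parquet files are visible before handing them to Spark SQL.
    val fs = FileSystem.get(new Configuration())
    fs.listStatus(new Path(partitionDir)).foreach(s => println(s.getPath))

    val table: SchemaRDD = sqlContext.parquetFile(partitionDir)
    table.registerTempTable("our_table")
    sqlContext.cacheTable("our_table")

    // The exception showed up once the cached table was actually scanned.
    sqlContext.sql("SELECT col_a FROM our_table").collect()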