Re: Does spark supports the Hive function posexplode function?

2015-07-12 Thread David Sabater Dinter
It seems this feature was added in Hive 0.13 (https://issues.apache.org/jira/browse/HIVE-4943). I would assume it is supported, since Spark is compiled against Hive 0.13.1 by default.

On Sun, Jul 12, 2015 at 7:42 PM, Ruslan Dautkhanov dautkha...@gmail.com wrote:
> You can see what Spark SQL functions
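
For readers unfamiliar with the function under discussion: Hive's posexplode turns each element of an array column into its own row and adds the element's zero-based position. The following is a minimal plain-Python sketch of that behaviour, not Spark or Hive code. The helper name, the sample rows, and the output column names "pos" and "val" are assumptions chosen for illustration.

```python
from typing import Any, Dict, Iterable, Iterator


def posexplode(rows: Iterable[Dict[str, Any]], array_col: str) -> Iterator[Dict[str, Any]]:
    """Illustrative stand-in for LATERAL VIEW posexplode(array_col).

    Emits one output row per array element, carrying the other columns
    along and adding the element's zero-based position ("pos") and the
    element itself ("val"). Rows with an empty array emit nothing.
    """
    for row in rows:
        # Keep every column except the array being exploded.
        rest = {k: v for k, v in row.items() if k != array_col}
        for pos, val in enumerate(row[array_col]):
            yield {**rest, "pos": pos, "val": val}


# Hypothetical sample data, for illustration only.
rows = [
    {"id": 1, "items": ["a", "b"]},
    {"id": 2, "items": []},
    {"id": 3, "items": ["c"]},
]
result = list(posexplode(rows, "items"))
```

In this sketch the row with id 2 disappears from the result because its array is empty, which mirrors a lateral view without the OUTER keyword.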

Re: How to upgrade Spark version in CDH 5.4

2015-07-12 Thread David Sabater Dinter
As Sean suggested, you can actually build Spark 1.4 for CDH 5.4.x and also include the Hive 0.13.1 libraries, but *this will be completely unsupported by Cloudera*. I would suggest doing that only if you just want to experiment with new features from Spark 1.4. I.e. Run SparkSQL with sort-merge

SparkSQL cache table with multiple replicas

2015-07-03 Thread David Sabater Dinter
Hi all, do you know if there is an option to specify how many replicas we want when caching a table in memory in the SparkSQL Thrift server? I have not seen any such option so far, but I assumed there is one, since the Storage section of the UI shows that there is 1 x replica of your

Issues when saving dataframe in Spark 1.4 with parquet format

2015-07-01 Thread David Sabater Dinter
Hi chaps, it seems there is an issue when saving dataframes in Spark 1.4. The default file extension inside the Hive warehouse folder is now part-r-X.gz.parquet, but the SparkSQL Thriftserver still looks for part-r-X.parquet when running queries. Is there any config parameter we can use as