@Cheng: My problem is that the connector I use to query Cassandra from Spark does not support the latest Hive (0.12, 0.13), but I need to run Hive queries on data retrieved from Cassandra. I assumed that if I got the data out of Cassandra in some way and registered it as a temp table, I would be able to query it using HiveContext, but it seems I cannot do this!
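For reference, a minimal sketch of why that approach fails as written: a temp table lives in the SQLContext instance that registered it, so a second, independently created HiveContext cannot see it. This is an illustrative example only (Spark 1.2-era SchemaRDD API, made-up data, names are hypothetical), not code from the thread:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("temp-table-scope").setMaster("local[*]"))
val sqlContext = new SQLContext(sc)
val hiveContext = new HiveContext(sc)

// A tiny SchemaRDD registered through sqlContext only.
val json = sc.parallelize("""{"profileId":"p1","createdAt":1425380000000}""" :: Nil)
sqlContext.jsonRDD(json).registerTempTable("profile")

// Visible here, because the registration and the query share the same context ...
sqlContext.sql("select profileId from profile").collect().foreach(println)

// ... but the line below fails with a "table not found" error, because
// hiveContext keeps its own temp-table registry and never saw the registration.
// hiveContext.sql("select profileId from profile")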
@Yin: Yes, it is added in Hive 0.12, but do you mean it is not supported by HiveContext in Spark?

Thanks,
/Shahab

On Tue, Mar 3, 2015 at 5:23 PM, Yin Huai <yh...@databricks.com> wrote:

Regarding current_date, I think it is not in either Hive 0.12.0 or 0.13.1 (the versions that we support). It seems https://issues.apache.org/jira/browse/HIVE-5472 added it to Hive only recently.

On Tue, Mar 3, 2015 at 6:03 AM, Cheng, Hao <hao.ch...@intel.com> wrote:

The temp table in the metastore cannot be shared across SQLContext instances. Since HiveContext is a subclass of SQLContext (it inherits all of its functionality), why not use a single HiveContext globally? Is there any specific requirement in your case that you need multiple SQLContext/HiveContext instances?
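A minimal sketch of the single-HiveContext approach suggested above: read from Cassandra through the connector's context, then rebuild and register the rows against one global HiveContext so that Hive UDFs such as from_unixtime resolve. This is untested against Calliope and assumes the Spark 1.2-era applySchema API; the CassandraAwareSQLContext package is taken from the error output later in the thread, and the table name is the one used there:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.CassandraAwareSQLContext // Calliope's context, package as printed in the error below
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("cassandra-hive-udf"))

// One HiveContext, used both for registering and for querying.
val hiveContext = new HiveContext(sc)

// The connector's own context is used only to read from Cassandra.
val cassContext = new CassandraAwareSQLContext(sc)
val cassRdd = cassContext.sql("select * from db.profile")

// A SchemaRDD is an RDD[Row] carrying its schema, so it can be rebuilt
// under the HiveContext and registered there.
val profiles = hiveContext.applySchema(cassRdd, cassRdd.schema)
profiles.cache()
profiles.registerTempTable("profile")

// The Hive UDFs now resolve, because the query is analyzed by the HiveContext.
val result = hiveContext.sql(
  "select from_unixtime(floor(createdAt / 1000)) from profile where sampling_bucket = 0")
println("Result: " + result.first())

Only the initial read goes through the connector's context here; everything after that stays in the single HiveContext, as suggested.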
From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Tuesday, March 3, 2015 9:46 PM
To: Cheng, Hao
Cc: user@spark.apache.org
Subject: Re: Supporting Hive features in Spark SQL Thrift JDBC server

You are right, CassandraAwareSQLContext is a subclass of SQLContext.

But I did another experiment: I queried Cassandra using CassandraAwareSQLContext, then I registered the RDD as a temp table, and next I tried to query it using HiveContext. It seems that the HiveContext cannot see the table registered using the SQLContext. Is this a normal case?

Best,
/Shahab

On Tue, Mar 3, 2015 at 1:35 PM, Cheng, Hao <hao.ch...@intel.com> wrote:

Hive UDFs are only applicable for HiveContext and its subclass instances. Is the CassandraAwareSQLContext a direct subclass of HiveContext or of SQLContext?

From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Tuesday, March 3, 2015 5:10 PM
To: Cheng, Hao
Cc: user@spark.apache.org
Subject: Re: Supporting Hive features in Spark SQL Thrift JDBC server

val sc: SparkContext = new SparkContext(conf)
val sqlCassContext = new CassandraAwareSQLContext(sc) // I used the Calliope Cassandra Spark connector
val rdd: SchemaRDD = sqlCassContext.sql("select * from db.profile")
rdd.cache
rdd.registerTempTable("profile")
rdd.first // enforce caching
val q = "select from_unixtime(floor(createdAt/1000)) from profile where sampling_bucket=0"
val rdd2 = rdd.sqlContext.sql(q)
println("Result: " + rdd2.first)

And I get the following errors:

Exception in thread "main" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: 'from_unixtime('floor(('createdAt / 1000))) AS c0#7, tree:
Project ['from_unixtime('floor(('createdAt / 1000))) AS c0#7]
 Filter (sampling_bucket#10 = 0)
  Subquery profile
   Project [company#8,bucket#9,sampling_bucket#10,profileid#11,createdat#12L,modifiedat#13L,version#14]
    CassandraRelation localhost, 9042, 9160, normaldb_sampling, profile, org.apache.spark.sql.CassandraAwareSQLContext@778b692d, None, None, false, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml)

at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:72)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:165)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:183)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:212)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:168)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:156)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:70)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:68)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:402)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:402)
at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:403)
at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:403)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:407)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:405)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:411)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:411)
at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)
at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:440)
at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:103)
at org.apache.spark.rdd.RDD.first(RDD.scala:1091)
at boot.SQLDemo$.main(SQLDemo.scala:65) // my code
at boot.SQLDemo.main(SQLDemo.scala) // my code
On Tue, Mar 3, 2015 at 8:57 AM, Cheng, Hao <hao.ch...@intel.com> wrote:

Can you provide the detailed failure call stack?

From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Tuesday, March 3, 2015 3:52 PM
To: user@spark.apache.org
Subject: Supporting Hive features in Spark SQL Thrift JDBC server

Hi,

According to the Spark SQL documentation, "... Spark SQL supports the vast majority of Hive features, such as User Defined Functions (UDF)", and one of these UDFs is the current_date() function, which should therefore be supported.

However, I get an error when I use this UDF in my SQL query. There are a couple of other UDFs which cause a similar error.

Am I missing something in my JDBC server?

/Shahab
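Since current_date() was only added to Hive after the 0.12/0.13 releases (HIVE-5472), one possible workaround at the Hive 0.12/0.13 level is to derive the date from built-ins those versions do provide. This is a hedged sketch, not a resolution confirmed in the thread; it assumes a hiveContext and a registered profile temp table as in the earlier sketch:

// current_date() is unavailable in Hive 0.12/0.13, but unix_timestamp(),
// from_unixtime() and to_date() exist there and compose to the same value.
val today = hiveContext.sql(
  "select to_date(from_unixtime(unix_timestamp())) from profile limit 1")
println(today.first()) // e.g. [2015-03-03]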