@Yin: Sorry for my mistake, you are right, it was added in 1.2, not 0.12.0. My bad!
On Tue, Mar 3, 2015 at 6:47 PM, shahab <shahab.mok...@gmail.com> wrote:

> Thanks Rohit, yes, my mistake, it does work with 1.1 (I am actually
> running it on Spark 1.1).
>
> But do you mean that even the HiveContext of Spark (not the Calliope
> CassandraAwareHiveContext) does not support Hive 0.12?
>
> best,
> /Shahab
>
> On Tue, Mar 3, 2015 at 5:55 PM, Rohit Rai <ro...@tuplejump.com> wrote:
>
>> The Hive dependency comes from spark-hive.
>>
>> It does work with Spark 1.1; we will have the 1.2 release later this month.
>>
>> On Mar 3, 2015 8:49 AM, "shahab" <shahab.mok...@gmail.com> wrote:
>>
>>> Thanks Rohit,
>>>
>>> I am already using Calliope and quite happy with it, well done! Except for
>>> the fact that:
>>> 1- It seems that it does not support Hive 0.12 or higher, am I right?
>>> For example, you cannot use the current_time() UDF, or those new UDFs added
>>> in Hive 0.12. Are they supported? Any plan for supporting them?
>>> 2- It does not support Spark 1.1 and 1.2. Any plan for a new release?
>>>
>>> best,
>>> /Shahab
>>>
>>> On Tue, Mar 3, 2015 at 5:41 PM, Rohit Rai <ro...@tuplejump.com> wrote:
>>>
>>>> Hello Shahab,
>>>>
>>>> I think CassandraAwareHiveContext
>>>> <https://github.com/tuplejump/calliope/blob/develop/sql/hive/src/main/scala/org/apache/spark/sql/hive/CassandraAwareHiveContext.scala>
>>>> in Calliope is what you are looking for. Create a CAHC instance and you should
>>>> be able to run Hive functions against the SchemaRDD you create from there.
>>>>
>>>> Cheers,
>>>> Rohit
>>>>
>>>> *Founder & CEO, **Tuplejump, Inc.*
>>>> ____________________________
>>>> www.tuplejump.com
>>>> *The Data Engineering Platform*
>>>>
>>>> On Tue, Mar 3, 2015 at 6:03 AM, Cheng, Hao <hao.ch...@intel.com> wrote:
>>>>
>>>>> The temp table in the metastore cannot be shared across SQLContext
>>>>> instances. Since HiveContext is a subclass of SQLContext (it inherits all of
>>>>> its functionality), why not use a single HiveContext globally? Is there
>>>>> any specific requirement in your case that you need multiple
>>>>> SQLContext/HiveContext instances?
>>>>>
>>>>> *From:* shahab [mailto:shahab.mok...@gmail.com]
>>>>> *Sent:* Tuesday, March 3, 2015 9:46 PM
>>>>> *To:* Cheng, Hao
>>>>> *Cc:* user@spark.apache.org
>>>>> *Subject:* Re: Supporting Hive features in Spark SQL Thrift JDBC server
>>>>>
>>>>> You are right, CassandraAwareSQLContext is a subclass of SQLContext.
>>>>>
>>>>> But I did another experiment: I queried Cassandra
>>>>> using CassandraAwareSQLContext, then I registered the "rdd" as a temp table,
>>>>> and next I tried to query it using HiveContext, but it seems that the Hive
>>>>> context cannot see the table registered using the SQL context. Is this a
>>>>> normal case?
>>>>>
>>>>> best,
>>>>> /Shahab
>>>>>
>>>>> On Tue, Mar 3, 2015 at 1:35 PM, Cheng, Hao <hao.ch...@intel.com> wrote:
>>>>>
>>>>> Hive UDFs are only applicable for a HiveContext and its subclass
>>>>> instances. Is the CassandraAwareSQLContext a direct subclass of
>>>>> HiveContext or SQLContext?
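For reference, Hao's suggestion above of a single, shared HiveContext might look roughly like the sketch below. This is only a minimal sketch against Spark 1.1-era APIs, not the Calliope API; the application name and the "some_hive_table" source are placeholders, and in shahab's setup the SchemaRDD would come from the Cassandra connector instead.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SingleHiveContextSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("single-hive-context"))

    // One HiveContext shared by the whole application: temp tables registered
    // here are only visible to this same instance, and Hive UDFs resolve here.
    val hc = new HiveContext(sc)

    // Placeholder source table; any SchemaRDD can be registered the same way.
    val profiles = hc.sql("SELECT * FROM some_hive_table")
    profiles.registerTempTable("profile")

    // Because "profile" was registered on this HiveContext, the Hive UDFs
    // from_unixtime and floor resolve against the same instance. Registering
    // the table on a plain SQLContext and querying from a different
    // HiveContext would not work, since temp tables are per-instance.
    val result = hc.sql(
      "SELECT from_unixtime(floor(createdAt / 1000)) FROM profile WHERE sampling_bucket = 0")
    result.collect().foreach(println)
  }
}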
>>>>> *From:* shahab [mailto:shahab.mok...@gmail.com]
>>>>> *Sent:* Tuesday, March 3, 2015 5:10 PM
>>>>> *To:* Cheng, Hao
>>>>> *Cc:* user@spark.apache.org
>>>>> *Subject:* Re: Supporting Hive features in Spark SQL Thrift JDBC server
>>>>>
>>>>> val sc: SparkContext = new SparkContext(conf)
>>>>> val sqlCassContext = new CassandraAwareSQLContext(sc) // I used some Calliope Cassandra Spark connector
>>>>> val rdd: SchemaRDD = sqlCassContext.sql("select * from db.profile")
>>>>> rdd.cache
>>>>> rdd.registerTempTable("profile")
>>>>> rdd.first // enforce caching
>>>>> val q = "select from_unixtime(floor(createdAt/1000)) from profile where sampling_bucket=0"
>>>>> val rdd2 = rdd.sqlContext.sql(q)
>>>>> println("Result: " + rdd2.first)
>>>>>
>>>>> And I get the following errors:
>>>>>
>>>>> Exception in thread "main" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: 'from_unixtime('floor(('createdAt / 1000))) AS c0#7, tree:
>>>>> Project ['from_unixtime('floor(('createdAt / 1000))) AS c0#7]
>>>>>  Filter (sampling_bucket#10 = 0)
>>>>>   Subquery profile
>>>>>    Project [company#8,bucket#9,sampling_bucket#10,profileid#11,createdat#12L,modifiedat#13L,version#14]
>>>>>     CassandraRelation localhost, 9042, 9160, normaldb_sampling, profile, org.apache.spark.sql.CassandraAwareSQLContext@778b692d, None, None, false, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml)
>>>>>
>>>>>   at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:72)
>>>>>   at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:70)
>>>>>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:165)
>>>>>   at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:183)
>>>>>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>>>>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>>>>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>>>>   at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>>>>>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>>>>>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>>>>>   at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>>>>>   at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>>>>>   at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>>>>>   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>>>>>   at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>>>>>   at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>>>>>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:212)
>>>>>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:168)
>>>>>   at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:156)
>>>>>   at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:70)
>>>>>   at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:68)
>>>>>   at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
>>>>>   at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
>>>>>   at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
>>>>>   at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
>>>>>   at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
>>>>>   at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
>>>>>   at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
>>>>>   at scala.collection.immutable.List.foreach(List.scala:318)
>>>>>   at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
>>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:402)
>>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:402)
>>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:403)
>>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:403)
>>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:407)
>>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:405)
>>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:411)
>>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:411)
>>>>>   at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)
>>>>>   at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:440)
>>>>>   at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:103)
>>>>>   at org.apache.spark.rdd.RDD.first(RDD.scala:1091)
>>>>>   at boot.SQLDemo$.main(SQLDemo.scala:65) // my code
>>>>>   at boot.SQLDemo.main(SQLDemo.scala) // my code
>>>>>
>>>>> On Tue, Mar 3, 2015 at 8:57 AM, Cheng, Hao <hao.ch...@intel.com> wrote:
>>>>>
>>>>> Can you provide the detailed failure call stack?
>>>>>
>>>>> *From:* shahab [mailto:shahab.mok...@gmail.com]
>>>>> *Sent:* Tuesday, March 3, 2015 3:52 PM
>>>>> *To:* user@spark.apache.org
>>>>> *Subject:* Supporting Hive features in Spark SQL Thrift JDBC server
>>>>>
>>>>> Hi,
>>>>>
>>>>> According to the Spark SQL documentation, "....Spark SQL supports the
>>>>> vast majority of Hive features, such as User Defined Functions (UDF)",
>>>>> and one of these UDFs is the "current_date()" function, which should be
>>>>> supported.
>>>>>
>>>>> However, I get an error when I use this UDF in my SQL query. There are
>>>>> a couple of other UDFs which cause a similar error.
>>>>>
>>>>> Am I missing something in my JDBC server?
>>>>>
>>>>> /Shahab
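For completeness, the behaviour can also be checked directly against the Spark SQL Thrift JDBC server. The sketch below is only illustrative: the host, port, database, and "profile" table are placeholders, and it assumes the Hive JDBC driver is on the classpath. Note that from_unixtime ships with Hive 0.12, while current_date only appears in Hive 1.2, which is presumably what the correction at the top of this thread refers to.

import java.sql.DriverManager

object ThriftServerUdfCheck {
  def main(args: Array[String]): Unit = {
    // Register the Hive JDBC driver used to talk to the Thrift server.
    Class.forName("org.apache.hive.jdbc.HiveDriver")

    // Host, port, database, and table name are placeholders.
    val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "")
    try {
      val stmt = conn.createStatement()
      // from_unixtime is a Hive 0.12 built-in, so it should resolve on a
      // HiveContext-backed Thrift server; current_date requires Hive 1.2.
      val rs = stmt.executeQuery(
        "SELECT from_unixtime(floor(createdAt / 1000)) FROM profile LIMIT 10")
      while (rs.next()) println(rs.getString(1))
    } finally {
      conn.close()
    }
  }
}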