Re: TreeNodeException: No function to evaluate expression. type: AttributeReference, tree: id#0 on GROUP BY

2014-07-21 Thread Martin Gammelsæter
Aha, that makes sense. Thanks for the response! I guess error messages
are one of the areas where Spark could use some love (:
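To see why the rule exists: once rows are grouped, a column that is neither grouped on nor aggregated can take several distinct values within one group, so there is no single value for the engine to return. A runnable plain-Scala sketch of the same semantics (using a toy Dokument shape, not the actual schema from the pastebin):

```scala
// Toy stand-in for the Dokument class; the real fields are in the pastebin.
case class Dokument(id: Int, category: String)

object WhyGroupByFails {
  val docs = List(Dokument(1, "a"), Dokument(2, "a"), Dokument(3, "b"))

  // SELECT category, COUNT(*) ... GROUP BY category -- well defined,
  // because COUNT collapses each group to a single value.
  val counts: Map[String, Int] =
    docs.groupBy(_.category).map { case (cat, rows) => (cat, rows.size) }

  // SELECT id ... GROUP BY category -- ill defined: group "a" contains
  // both id 1 and id 2, so "the id of group a" has no single answer
  // unless an aggregate (MAX, MIN, ...) picks one.
  val idsPerGroup: Map[String, List[Int]] =
    docs.groupBy(_.category).map { case (cat, rows) => (cat, rows.map(_.id)) }

  def main(args: Array[String]): Unit = {
    println(counts)      // one count per category
    println(idsPerGroup) // group "a" holds two ids: ambiguous without an aggregate
  }
}
```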

On Fri, Jul 18, 2014 at 9:41 PM, Michael Armbrust
 wrote:
> Sorry for the non-obvious error message.  It is not valid SQL to include
> attributes in the select clause unless they are also in the group by clause
> or are inside of an aggregate function.
>
> On Jul 18, 2014 5:12 AM, "Martin Gammelsæter" 
> wrote:
>>
>> Hi again!
>>
>> I am having problems when using GROUP BY on both SQLContext and
>> HiveContext (same problem).
>>
>> My code (simplified as much as possible) can be seen here:
>> http://pastebin.com/33rjW67H
>>
>> In short, I'm getting data from a Cassandra store with Datastax's new
>> driver (which works great, by the way; recommended!), and mapping it to
>> a Spark SQL table through a Product class (Dokument in the source).
>> Regular SELECTs work fine, but once I try to do a GROUP BY,
>> I get the following error:
>>
>> Exception in thread "main" org.apache.spark.SparkException: Job
>> aborted due to stage failure: Task 0.0:25 failed 4 times, most recent
>> failure: Exception failure in TID 63 on host 192.168.121.132:
>> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: No
>> function to evaluate expression. type: AttributeReference, tree: id#0
>>
>> org.apache.spark.sql.catalyst.expressions.AttributeReference.eval(namedExpressions.scala:158)
>>
>> org.apache.spark.sql.catalyst.expressions.MutableProjection.apply(Projection.scala:64)
>>
>> org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:195)
>>
>> org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:174)
>> scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>>
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>>
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>>
>> scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> scala.collection.AbstractIterator.to(Iterator.scala:1157)
>>
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>>
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
>> org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
>>
>> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
>>
>> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
>>
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:112)
>> org.apache.spark.scheduler.Task.run(Task.scala:51)
>>
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>>
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> java.lang.Thread.run(Thread.java:745)
>>
>> What am I doing wrong?
>>
>> --
>> Best regards,
>> Martin Gammelsæter



-- 
Best regards,
Martin Gammelsæter
92209139
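Concretely, the fix Michael describes is to either add the bare column to the GROUP BY clause or wrap it in an aggregate function. A minimal sketch of the two shapes, with hypothetical table and column names (the real schema is in the pastebin above):

```scala
// Hypothetical table/column names for illustration only; the actual
// schema lives in the linked pastebin.
object GroupByFix {
  // Triggers the TreeNodeException: `id` is neither in the GROUP BY
  // clause nor inside an aggregate function.
  val broken = "SELECT id, COUNT(*) FROM dokumenter GROUP BY category"

  // Valid: every selected column is either grouped on or aggregated.
  val grouped    = "SELECT category, COUNT(*) FROM dokumenter GROUP BY category"
  val aggregated = "SELECT MAX(id), category FROM dokumenter GROUP BY category"
}
```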


