Re: Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

2015-09-23 Thread Josh Mahonin
Hi Babar,

Can you file a JIRA for this? I suspect it has something to do with the Spark
1.5 DataFrame API data structures; perhaps they've gone and changed them again!

Can you try with previous Spark versions to see if there's a difference? Also, 
you may have luck interfacing with the RDDs directly instead of the data frames.
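
For reference, the RDD route looks roughly like this (a sketch only; the table name, columns, and ZooKeeper URL are illustrative placeholders):

```scala
import org.apache.spark.SparkContext
import org.apache.phoenix.spark._  // enriches SparkContext with phoenixTableAsRDD

val sc = new SparkContext("local", "phoenix-rdd-test")

// Load the table as an RDD of Map[String, AnyRef] keyed by column name,
// bypassing the DataFrame layer (and the Row conversion that fails here).
val rdd = sc.phoenixTableAsRDD(
  "TABLE1", Seq("ID", "COL1"), zkUrl = Some("localhost:2181"))

println(rdd.count())
```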

Thanks!

Josh

From: Babar Tareen
Reply-To: "user@phoenix.apache.org"
Date: Tuesday, September 22, 2015 at 5:47 PM
To: "user@phoenix.apache.org"
Subject: Spark Plugin Exception - java.lang.ClassCastException: 
org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to 
org.apache.spark.sql.Row

Hi,

I am trying to run the Spark plugin DataFrame sample code available here
(https://phoenix.apache.org/phoenix_spark.html) and am getting the following
exception. I am running the code against HBase 1.1.1, Spark 1.5.0, and Phoenix
4.5.2. HBase is running in standalone mode, locally on OS X. Any ideas what
might be causing this exception?
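
For context, the DataFrame sample on that page is approximately the following (my table name and zkUrl are placeholders for the local setup):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._

val sc = new SparkContext("local", "phoenix-df-test")
val sqlContext = new SQLContext(sc)

// Load TABLE1 through the Phoenix data source.
val df = sqlContext.load(
  "org.apache.phoenix.spark",
  Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181"))

df.filter(df("COL1") === "test_row_1" && df("ID") === 1L)
  .select(df("ID"))
  .show

// An aggregate such as count() exercises the Tungsten aggregation
// path visible in the stack trace below.
println(df.count())
```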


java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row
    at org.apache.spark.sql.SQLContext$$anonfun$7.apply(SQLContext.scala:439) ~[spark-sql_2.11-1.5.0.jar:1.5.0]
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:363) ~[scala-library-2.11.4.jar:na]
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:363) ~[scala-library-2.11.4.jar:na]
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:363) ~[scala-library-2.11.4.jar:na]
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:366) ~[spark-sql_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.start(TungstenAggregationIterator.scala:622) ~[spark-sql_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.org$apache$spark$sql$execution$aggregate$TungstenAggregate$$anonfun$$executePartition$1(TungstenAggregate.scala:110) ~[spark-sql_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119) ~[spark-sql_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119) ~[spark-sql_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.rdd.MapPartitionsWithPreparationRDD.compute(MapPartitionsWithPreparationRDD.scala:64) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.scheduler.Task.run(Task.scala:88) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) ~[spark-core_2.11-1.5.0.jar:1.5.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]

Thanks,
Babar


Re: Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

2015-09-23 Thread Babar Tareen
I have filed PHOENIX-2287
(https://issues.apache.org/jira/browse/PHOENIX-2287) for this, and the code
works fine with Spark 1.4.1.

Thanks



Re: Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

2015-09-23 Thread Josh Mahonin
I've got a patch attached to the ticket that I think should fix your issue.

If you're able to try it out and let us know how it goes, it'd be much 
appreciated.



Re: Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

2015-09-23 Thread Babar Tareen
I tried the patch; it resolves this issue. Thanks.
