Re: Code generation for GPU

2015-09-10 Thread Paul Wais
In order to get a major speedup from applying *single-pass* map/filter/reduce operations on an array in GPU memory, wouldn't you need to stream the columnar data directly into GPU memory somehow? You might find in your experiments that GPU memory allocation is a bottleneck. See e.g. John Canny's
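A rough sketch of the per-batch transfer cost being described, written against the JCuda bindings purely as an assumed example API (the thread does not prescribe any particular binding; names and sizes are illustrative):

import jcuda.{Pointer, Sizeof}
import jcuda.runtime.JCuda._
import jcuda.runtime.cudaMemcpyKind._

// Copy one columnar batch from host to device. Paying cudaMalloc plus a
// host-to-device cudaMemcpy on every small batch is the kind of overhead
// that can swamp a single-pass map/filter/reduce kernel.
def copyBatchToDevice(batch: Array[Float]): Pointer = {
  val devPtr = new Pointer()
  val bytes = batch.length.toLong * Sizeof.FLOAT
  cudaMalloc(devPtr, bytes)
  cudaMemcpy(devPtr, Pointer.to(batch), bytes, cudaMemcpyHostToDevice)
  devPtr // caller is responsible for cudaFree(devPtr)
}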

Re: Code generation for GPU

2015-09-10 Thread Steve Loughran
> On 9 Sep 2015, at 20:18, lonikar wrote: > > I have seen a perf improvement of 5-10 times on expression evaluation even > on "ordinary" laptop GPUs. Thus, it will be a good demo along with some > concrete proposals for vectorization. As you said, I will have to hook up to >

DF.intersection issue in 1.5

2015-09-10 Thread Nitay Joffe
The following fails for me in Spark 1.5: https://gist.github.com/nitay/d08cb294ccf00b80c49a Specifically, it returns 1 instead of 100 (in both versions). When I print out the contents (i.e. collect()) I see all 100 items, yet the count returns 1. This works in 1.3 and 1.4. Any ideas what's going
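A minimal sketch of the kind of repro described (the linked gist is the authoritative version; names here are illustrative):

// Intersect a DataFrame with itself; its distinct rows should survive, so
// collect().length and count() would both be expected to be 100.
val df = sqlContext.range(100).toDF("id")
val intersected = df.intersect(df)
println(intersected.collect().length) // shows all 100 rows
println(intersected.count())          // reported to return 1 on Spark 1.5.0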

Spark 1.5.x: Java files in src/main/scala and vice versa

2015-09-10 Thread lonikar
I found these files: spark-1.5.0/sql/catalyst/*src/main/scala*/org/apache/spark/sql/types/*SQLUserDefinedType.java* spark-1.5.0/core/src/main/java/org/apache/spark/api/java/function/package.scala and several Java files in spark-1.5.0/core/src/main/scala/. Is this intentional or inadvertent?

Re: Spark 1.5.x: Java files in src/main/scala and vice versa

2015-09-10 Thread Sean Owen
I feel like I knew the answer to this but have forgotten. Reynold, do you know about this file? Looks like you added it. On Thu, Sep 10, 2015 at 1:10 PM, lonikar wrote: > I found these files: >

Concurrency issue in SQLExecution.withNewExecutionId

2015-09-10 Thread Olivier Toupin
Look at this code: https://github.com/apache/spark/blob/branch-1.5/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala#L42 and https://github.com/apache/spark/blob/branch-1.5/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala#L87 This exception is
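Not Spark's actual source, but a simplified analogue of the guard at the referenced lines: an execution id is tracked per thread, and the method throws if one is already present, which is where concurrent callers that end up sharing that state can collide.

object ExecutionIdGuard {
  private val currentId = new ThreadLocal[String]

  def withNewExecutionId[T](body: => T): T = {
    // Fail if an execution id is already set for this thread (or was
    // inherited from another one) -- the failure mode described above.
    if (currentId.get() != null) {
      throw new IllegalArgumentException("execution id is already set")
    }
    currentId.set(java.util.UUID.randomUUID().toString)
    try body finally currentId.remove()
  }
}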

Re: [SparkSQL]Could not alter table in Spark 1.5 use HiveContext

2015-09-10 Thread Michael Armbrust
Can you open a JIRA? On Wed, Sep 9, 2015 at 11:11 PM, StanZhai wrote: > After upgrading Spark from 1.4.1 to 1.5.0, I encountered the following > exception when using an ALTER TABLE statement in HiveContext: > > The SQL is: ALTER TABLE a RENAME TO b > > The exception is: > > FAILED:

Re: ClassCastException using DataFrame only when num-executors > 2 ...

2015-09-10 Thread Reynold Xin
Does this still happen on the 1.5.0 release? On Mon, Aug 31, 2015 at 9:31 AM, Olivier Girardot wrote: > tested now against Spark 1.5.0 rc2, and the same exceptions happen when > num-executors > 2 : > > 15/08/25 10:31:10 WARN scheduler.TaskSetManager: Lost task 0.1 in stage > 5.0

Re: Concurrency issue in SQLExecution.withNewExecutionId

2015-09-10 Thread Andrew Or
@Olivier, did you use Scala's parallel collections by any chance? If not, what form of concurrency were you using? 2015-09-10 13:01 GMT-07:00 Andrew Or : > Thanks for reporting this, I have filed > https://issues.apache.org/jira/browse/SPARK-10548. > > 2015-09-10 9:09
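For reference, a hypothetical sketch of the usage being asked about: driving several DataFrame actions from a Scala parallel collection so they enter SQLExecution.withNewExecutionId concurrently (sqlContext is assumed to exist; the queries are placeholders):

// Each count() is an action, so the parallel collection runs several
// query executions at once on the driver's thread pool.
(1 to 8).par.foreach { i =>
  sqlContext.sql(s"SELECT $i AS v").count()
}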

Re: [SparkSQL]Could not alter table in Spark 1.5 use HiveContext

2015-09-10 Thread StanZhai
Thank you for the swift reply! The version of my Hive metastore server is 0.13.1, and I've built Spark using sbt like this: build/sbt -Pyarn -Phadoop-2.4 -Phive -Phive-thriftserver assembly Is Spark 1.5 bound to Hive client version 1.2 by default?

Re: [SparkSQL]Could not alter table in Spark 1.5 use HiveContext

2015-09-10 Thread Yin Huai
Yes, Spark 1.5 uses Hive 1.2's metastore client by default. You can change it by putting the following settings in your Spark conf: spark.sql.hive.metastore.version = 0.13.1 and spark.sql.hive.metastore.jars = maven (or the path of your Hive 0.13 jars and Hadoop jars). For spark.sql.hive.metastore.jars,
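Applied concretely, the settings Yin lists might look like this via SparkConf (they could equally go in spark-defaults.conf; the choice of "maven" versus a jar path is illustrative):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.sql.hive.metastore.version", "0.13.1")
  // either "maven" (resolve the Hive 0.13 client at runtime) or a
  // classpath listing your Hive 0.13 jars and Hadoop jars
  .set("spark.sql.hive.metastore.jars", "maven")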

Re: Spark 1.5.x: Java files in src/main/scala and vice versa

2015-09-10 Thread Sean Owen
This is probably true, as the Scala plugin actually compiles both .scala and .java files. Still, it seems like the wrong place just as a matter of style. Can we try moving it and verify it's still OK? On Fri, Sep 11, 2015 at 12:43 AM, Reynold Xin wrote: > There isn't really

Re: Concurrency issue in SQLExecution.withNewExecutionId

2015-09-10 Thread Andrew Or
Thanks for reporting this, I have filed https://issues.apache.org/jira/browse/SPARK-10548. 2015-09-10 9:09 GMT-07:00 Olivier Toupin : > Look at this code: > > >

[SparkSQL]Could not alter table in Spark 1.5 use HiveContext

2015-09-10 Thread StanZhai
After upgrading Spark from 1.4.1 to 1.5.0, I encountered the following exception when using an ALTER TABLE statement in HiveContext. The SQL is: ALTER TABLE a RENAME TO b The exception is: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table.
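A minimal sketch of the failing statement as described (table names are taken from the email; hiveContext is assumed to be an existing org.apache.spark.sql.hive.HiveContext):

hiveContext.sql("CREATE TABLE a (id INT)")
// Worked on 1.4.1; on 1.5.0 it is reported to fail with
// "Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask"
hiveContext.sql("ALTER TABLE a RENAME TO b")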

Re: DF.intersection issue in 1.5

2015-09-10 Thread Michael Armbrust
Thanks for pointing this out. https://issues.apache.org/jira/browse/SPARK-10539 We will fix this for Spark 1.5.1. On Thu, Sep 10, 2015 at 6:16 AM, Nitay Joffe wrote: > The following fails for me in Spark 1.5: > https://gist.github.com/nitay/d08cb294ccf00b80c49a >