SparkR Reading Tables from Hive

2015-06-08 Thread Eskilson,Aleksander
Hi there, I’m testing out the new SparkR-Hive interop right now. I’m noticing an apparent disconnect between the Hive store where my data is loaded and the store that sparkRHive.init() connects to. For example, in beeline: 0: jdbc:hive2://quickstart.cloudera:1 show databases;

Re: SparkR Reading Tables from Hive

2015-06-08 Thread Eskilson,Aleksander
Resolved, my hive-site.xml wasn’t in the conf folder. I can load tables into DataFrames as expected. Thanks, Alek From: Eskilson, Aleksander Eskilson alek.eskil...@cerner.com Date: Monday, June 8, 2015 at 3:38 PM To:
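One way to catch the problem described in this reply is to confirm that hive-site.xml is present in Spark's conf directory before starting a SparkR session. The sketch below is Python using only the standard library. The helper name `find_hive_site` and the use of a `SPARK_HOME` environment variable as the installation root are assumptions for illustration; the thread itself only says the file belongs in the conf folder.

```python
from pathlib import Path
from typing import Optional


def find_hive_site(spark_home: str) -> Optional[Path]:
    """Return the path to conf/hive-site.xml under spark_home, or None if absent.

    Hypothetical helper: the thread only states that the file belongs in
    the conf folder; the directory layout checked here is an assumption.
    """
    candidate = Path(spark_home) / "conf" / "hive-site.xml"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    import os

    # SPARK_HOME is assumed to point at the Spark installation root.
    home = os.environ.get("SPARK_HOME", ".")
    found = find_hive_site(home)
    if found:
        print(f"hive-site.xml found at {found}")
    else:
        print("hive-site.xml missing from conf/")
```

If the file is missing, Spark may fall back to a local metastore instead of the one that beeline is connected to, which would match the disconnect reported in the original message.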

Re: SparkR Reading Tables from Hive

2015-06-08 Thread Shivaram Venkataraman
Thanks for the confirmation - I was just going to send a pointer to the documentation that talks about hive-site.xml. http://people.apache.org/~pwendell/spark-releases/latest/sql-programming-guide.html#hive-tables Thanks Shivaram On Mon, Jun 8, 2015 at 1:57 PM, Eskilson,Aleksander

Fwd: pull requests no longer closing by commit messages with closes #xxxx

2015-06-08 Thread Reynold Xin
FYI. -- Forwarded message -- From: John Greet (GitHub Staff) supp...@github.com Date: Mon, Jun 8, 2015 at 5:50 PM Subject: Re: pull requests no longer closing by commit messages with closes # To: Reynold Xin r...@databricks.com Hi Reynold, The problem here is that the

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-08 Thread Denny Lee
+1 On Mon, Jun 8, 2015 at 17:51 Wang, Daoyuan daoyuan.w...@intel.com wrote: +1 -Original Message- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Wednesday, June 03, 2015 1:47 PM To: dev@spark.apache.org Subject: Re: [VOTE] Release Apache Spark 1.4.0 (RC4) Hi all - a

RE: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-08 Thread Wang, Daoyuan
+1 -Original Message- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Wednesday, June 03, 2015 1:47 PM To: dev@spark.apache.org Subject: Re: [VOTE] Release Apache Spark 1.4.0 (RC4) Hi all - a tiny nit from the last e-mail. The tag is v1.4.0-rc4. The exact commit and all other

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-08 Thread saurfang
+1 Built for Hadoop 2.4. Ran a few jobs on YARN and tested spark.sql.unsafe, whose performance seems great! -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-4-0-RC4-tp12582p12671.html Sent from the Apache Spark Developers

[SparkSQL ] What is Exchange in physical plan for ?

2015-06-08 Thread invkrh
Hi, DataFrame.explain() shows the physical plan of a query. I noticed there are a lot of `Exchange`s in it, like below: Project [period#20L,categoryName#0,regionName#10,action#15,list_id#16L] ShuffledHashJoin [region#18], [regionCode#9], BuildRight Exchange (HashPartitioning [region#18], 12)

RE: [SparkSQL ] What is Exchange in physical plan for ?

2015-06-08 Thread Cheng, Hao
It means data shuffling, and its arguments show the partitioning strategy. -Original Message- From: invkrh [mailto:inv...@gmail.com] Sent: Monday, June 8, 2015 9:34 PM To: dev@spark.apache.org Subject: [SparkSQL ] What is Exchange in physical plan for ? Hi,
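To illustrate what an Exchange with HashPartitioning does conceptually, here is a minimal pure-Python sketch, not Spark code. It assigns each row to one of N partitions by hashing the partitioning key, so rows with equal keys always land in the same partition. The function name `hash_partition`, the sample rows, and the use of `zlib.crc32` in place of Spark's internal hash function are all assumptions; only the column name `region` and the partition count 12 come from the plan quoted in the question.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key, num_partitions):
    """Group rows into up to num_partitions buckets by a hash of row[key].

    zlib.crc32 stands in for Spark's internal hash function (an assumption);
    the point is only that equal keys always map to the same partition.
    """
    partitions = defaultdict(list)
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        partitions[digest % num_partitions].append(row)
    return dict(partitions)


# Hypothetical rows; only "region" and the count 12 appear in the quoted plan.
rows = [
    {"region": "north", "list_id": 1},
    {"region": "south", "list_id": 2},
    {"region": "north", "list_id": 3},
]
parts = hash_partition(rows, "region", 12)

for partition_id, members in sorted(parts.items()):
    print(partition_id, members)
```

When both sides of a join are partitioned this way on the join key, matching rows end up in the same partition and can be joined locally, which is presumably why an Exchange appears beneath ShuffledHashJoin in the plan above.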

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-08 Thread Patrick Wendell
Hi All, Thanks for the continued voting! I'm going to leave this thread open for another few days to continue to collect feedback. - Patrick On Tue, Jun 2, 2015 at 8:53 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version

[ml] Why all model classes are final?

2015-06-08 Thread Peter Rudenko
Hi, previously all the models in the ml package were package-private, so if I needed to customize some models I inherited from them inside the org.apache.spark.ml package in my project. But now new models

Re: Stages with non-arithmetic numbering Timing metrics in event logs

2015-06-08 Thread Imran Rashid
Hi Mike, all good questions, let me take a stab at answering them: 1. Event Logs + Stages: It's normal for stages to get skipped if they are shuffle map stages, which get read multiple times. E.g., here's a little example program I wrote earlier to demonstrate this: d3 doesn't need to be

[sample code] deeplearning4j for Spark ML (@DeveloperAPI)

2015-06-08 Thread Eron Wright
The deeplearning4j framework provides a variety of distributed, neural network-based learning algorithms, including convolutional nets, deep auto-encoders, deep-belief nets, and recurrent nets. We’re working on integration with the Spark ML pipeline, leveraging the developer API. This