Re: Support Hive 0.13.1 in Spark SQL

2014-10-28 Thread Patrick Wendell
Hey Cheng, Right now we aren't using stable APIs to communicate with the Hive Metastore. We didn't want to drop support for Hive 0.12, so we are using a shim layer to support compiling for 0.12 and 0.13. This is very costly to maintain. If Hive has a stable meta-data API for talking to

Re: best IDE for scala + spark development?

2014-10-28 Thread Duy Huynh
thanks everyone. i've been using vim and sbt recently, and i really like it. it's lightweight, fast. plus, ack, ctrl-t, nerdtree, etc. in vim do all the good work. but, as i'm not familiar with scala/spark api yet, i really wish to have these two things in vim + sbt. 1. code completion as in

Re: HiveContext bug?

2014-10-28 Thread Cheng Lian
Hi Marcelo, yes this is a known Spark SQL bug and we've got PRs to fix it (2887 2967). Not merged yet because newly merged Hive 0.13.1 support causes some conflicts. Thanks for reporting this :) On Tue, Oct 28, 2014 at 6:41 AM, Marcelo Vanzin van...@cloudera.com wrote: Well, looks like a huge

Re: best IDE for scala + spark development?

2014-10-28 Thread Cheng Lian
My two cents for Mac Vim/Emacs users. Fixed a Scala ctags Mac compatibility bug months ago, and you may want to use the most recent version here https://github.com/scala/scala-dist/blob/master/tool-support/src/emacs/contrib/dot-ctags On Tue, Oct 28, 2014 at 4:26 PM, Duy Huynh

Re: [MLlib] Contributing Algorithm for Outlier Detection

2014-10-28 Thread Ashutosh
Hi Anant, Thank you for reviewing and helping us out. Please find the following link where you can see the initial code. https://github.com/codeAshu/Outlier-Detection-with-AVF-Spark/blob/master/OutlierWithAVFModel.scala The input file for the code should be in csv format. We have provided a

Re: jenkins downtime tomorrow morning ~6am-8am PDT

2014-10-28 Thread shane knapp
this is done, and jenkins is up and building again. On Mon, Oct 27, 2014 at 10:46 AM, shane knapp skn...@berkeley.edu wrote: i'll be bringing jenkins down tomorrow morning for some system maintenance and to get our backups kicked off. i do expect to have the system back up and running before

How to run tests properly?

2014-10-28 Thread Niklas Wilcke
Hi, I want to contribute to the MLlib library but I can't get the tests working. I've found three ways of running the tests on the command line. I just want to execute the MLlib tests. 1. via dev/run-tests script This script executes all tests and takes several hours to finish. Some tests

Re: How to run tests properly?

2014-10-28 Thread Sean Owen
On Tue, Oct 28, 2014 at 6:18 PM, Niklas Wilcke 1wil...@informatik.uni-hamburg.de wrote: 1. via dev/run-tests script This script executes all tests and takes several hours to finish. Some tests failed but I can't say which of them. Should this really take that long? Can I specify to run only

Breeze::DiffFunction not serializable

2014-10-28 Thread Xuepeng Sun
Hi, I'm trying to call Breeze::LBFGS from the master on each partition but getting *NonSerializable* error. I guess it's well-known that the Breeze DiffFunction is not serializable. /// import breeze.linalg.{Vector => BV, DenseVector => BDV, SparseVector => BSV} val lbfgs = new

HiveShim not found when building in Intellij

2014-10-28 Thread Stephen Boesch
I have run on the command line via maven and it is fine: mvn -Dscalastyle.failOnViolation=false -DskipTests -Pyarn -Phadoop-2.3 compile package install But with the latest code Intellij builds do not work. Following is one of 26 similar errors: Error:(173, 38) not found: value HiveShim

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Matei Zaharia
Hi Stephen, How did you generate your Maven workspace? You need to make sure the Hive profile is enabled for it. For example sbt/sbt -Phive gen-idea. Matei On Oct 28, 2014, at 7:42 PM, Stephen Boesch java...@gmail.com wrote: I have run on the command line via maven and it is fine: mvn

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Stephen Boesch
Hi Matei, Until my latest pull from upstream/master it had not been necessary to add the hive profile: is it now?? I am not using sbt gen-idea. The way to open in intellij has been to Open the parent directory. IJ recognizes it as a maven project. There are several steps to do surgery on the

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
Hey Stephen, In some cases in the maven build we now have pluggable source directories based on profiles using the maven build helper plug-in. This is necessary to support cross building against different Hive versions, and there will be additional instances of this due to supporting scala 2.11

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Stephen Boesch
Thanks Patrick for the heads up. I have not managed to find a combination of profiles (i.e. enabling hive or hive-0.12.0 or hive-13.0) that works in Intellij with maven. Anyone who knows how to handle this - a quick note here would be appreciated. 2014-10-28 20:20 GMT-07:00 Patrick

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Zhan Zhang
-Phive is to enable hive-0.13.1 and -Phive -Phive-0.12.0 is to enable hive-0.12.0. Note that the thrift-server is not supported yet in hive-0.13, but expected to go to upstream soon (SPARK-3720). Thanks. Zhan Zhang On Oct 28, 2014, at 9:09 PM, Stephen Boesch java...@gmail.com wrote:
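
The profile combinations named in this message can be sketched as a small shell helper. This is only an illustration: `hive_profiles` is a hypothetical function name, and pairing the flags with `-DskipTests` and `-Phadoop-2.4` is an assumption borrowed from other messages in this thread, not a recommended build line.

```shell
# Hypothetical helper: map a Hive version to the Maven profile flags
# described in the message above.
hive_profiles() {
  case "$1" in
    0.13.1) echo "-Phive" ;;                # -Phive alone selects hive-0.13.1
    0.12.0) echo "-Phive -Phive-0.12.0" ;;  # both flags select hive-0.12.0
    *) echo "unsupported Hive version: $1" >&2; return 1 ;;
  esac
}

# Print (rather than run) a Maven command line for Hive 0.12.0.
# -DskipTests and -Phadoop-2.4 are assumptions for illustration only.
echo "mvn -DskipTests -Phadoop-2.4 $(hive_profiles 0.12.0) package"
```

The sketch prints the command instead of invoking Maven, so it can be checked without a Spark checkout.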

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Cheng Lian
Yes, these two combinations work for me. On 10/29/14 12:32 PM, Zhan Zhang wrote: -Phive is to enable hive-0.13.1 and -Phive -Phive-0.12.0 is to enable hive-0.12.0. Note that the thrift-server is not supported yet in hive-0.13, but expected to go to upstream soon (SPARK-3720). Thanks. Zhan

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Stephen Boesch
I am interested specifically in how to build (and hopefully run/debug..) under Intellij. Your posts sound like command line maven - which has always been working already. Do you have instructions for building in IJ? 2014-10-28 21:38 GMT-07:00 Cheng Lian lian.cs@gmail.com: Yes, these two

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
Btw - we should have a part of the official docs that describes a full from-scratch build in IntelliJ including any gotchas. Then we can update it if there are build changes that alter it. I created this JIRA for it: https://issues.apache.org/jira/browse/SPARK-4128 On Tue, Oct 28, 2014 at 9:42 PM,

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Cheng Lian
You may first open the root pom.xml file in IDEA, and then go to the menu View / Tool Windows / Maven Projects, then choose the desired Maven profile combination under the Profiles node (e.g. I usually use hadoop-2.4 + hive + hive-0.12.0). IDEA will ask you to re-import the Maven projects, confirm,

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
I just started a totally fresh IntelliJ project importing from our root pom. I used all the default options and I added hadoop-2.4, hive, hive-0.13.1 profiles. I was able to run spark core tests from within IntelliJ. Didn't try anything beyond that, but FWIW this worked. - Patrick On Tue, Oct

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Cheng Lian
Hao Cheng has just written such a from-scratch guide for building Spark SQL in IDEA. Although it's written in Chinese, I think the illustrations are already descriptive enough. http://www.cnblogs.com//articles/4058371.html On 10/29/14 12:45 PM, Patrick Wendell wrote: Btw - we should

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Stephen Boesch
I have selected the same options as Cheng Lian: hadoop-2.4, hive, hive-0.12.0. After a full Rebuild in IJ I still see the HiveShim errors. I really do not know what is different. I had pulled three hours ago from github upstream master. Just for kicks i am trying PW's combination which uses

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
Cheng - to make it recognize the new HiveShim for 0.12 I had to click on spark-hive under packages in the left pane, then go to Open Module Settings - then explicitly add the v0.12.0/src/main/scala folder to the sources by navigating to it and then ctrl+click to add it as a source. Did you have to

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Stephen Boesch
Thanks guys - adding the source root for the shim manually was the issue. For some reason the other issue I was struggling with (NoClassDefFoundError on ThreadFactoryBuilder) also disappeared. I am able to run tests now inside IJ. Woot 2014-10-28 22:13 GMT-07:00 Patrick Wendell

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
Oops - I actually should have added v0.13.0 (i.e. to match whatever I did in the profile). On Tue, Oct 28, 2014 at 10:05 PM, Patrick Wendell pwend...@gmail.com wrote: Cheng - to make it recognize the new HiveShim for 0.12 I had to click on spark-hive under packages in the left pane, then go to

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Cheng Lian
Hm, the shim source folder used to be recognized automatically, although at the wrong directory level (sql/hive/v0.12.0/src instead of sql/hive/v0.12.0/src/main/scala), and it still compiled. I just tried against a fresh checkout and indeed needed to add the shim source folder manually. Sorry for the