RE: Build with Hive 0.13.1 doesn't have datanucleus and parquet dependencies.

2014-10-27 Thread Cheng, Hao
The hive-thriftserver module is not included when the profile 
hive-0.13.1 is specified. 

-Original Message-
From: Jianshi Huang [mailto:jianshi.hu...@gmail.com] 
Sent: Monday, October 27, 2014 4:48 PM
To: dev@spark.apache.org
Subject: Build with Hive 0.13.1 doesn't have datanucleus and parquet 
dependencies.

There was a change in the build process recently for Hive 0.13 support, and we 
should make it obvious. Based on the new pom.xml I tried to enable Hive
0.13.1 support by using the option

  -Phive-0.13.1

However, it seems the datanucleus and parquet dependencies are not available in 
the final build.

Am I missing anything?
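
For reference, a hedged sketch of how one might check whether those dependencies made it into a build. The path is a placeholder; the actual jar location depends on the build tool and Spark version:

```shell
# Hedged sketch: check whether the expected jars landed in a build.
# JAR_DIR is a placeholder; the actual location depends on the build
# (e.g. lib_managed/jars for an sbt build of this era).
JAR_DIR="${SPARK_HOME:-/path/to/spark}/lib_managed/jars"
for dep in datanucleus parquet; do
  if ls "$JAR_DIR" 2>/dev/null | grep -qi "$dep"; then
    echo "$dep: present"
  else
    echo "$dep: missing"
  fi
done
```

If either dependency prints "missing", the profile did not pull it into the build.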

Jianshi

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: Build with Hive 0.13.1 doesn't have datanucleus and parquet dependencies.

2014-10-27 Thread Jianshi Huang
Ah I see. Thanks Hao! I'll wait for the fix.

Jianshi



-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/


Re: best IDE for scala + spark development?

2014-10-27 Thread Dean Wampler
For what it's worth, I use Sublime Text + the SBT console for everything. I
can live without the extra IDE features.

However, if you like an IDE, the Eclipse Scala IDE 4.0 RC1 is a big
improvement over previous releases. For one thing, it can now support
projects using different versions of Scala, which is convenient for Spark's
current 2.10.4 support and emerging 2.11 support.

http://scala-ide.org/download/milestone.html

Dean


Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
@deanwampler http://twitter.com/deanwampler
http://polyglotprogramming.com

On Sun, Oct 26, 2014 at 5:06 PM, Duy Huynh duy.huynh@gmail.com wrote:

 i like intellij and eclipse too, but sometimes they are too heavy.  i would
 love to use vim.  are there good scala plugins for vim?  (i.e. code
 completion, scala doc, etc.)

 On Sun, Oct 26, 2014 at 12:32 PM, Jay Vyas jayunit100.apa...@gmail.com
 wrote:

  I tried the scala eclipse ide but in scala 2.10 I ran into some weird
  issues
 
 http://stackoverflow.com/questions/24253084/scalaide-and-cryptic-classnotfound-errors
  ... So I switched to IntelliJ and was much more satisfied...
 
  I've written a post on how I use fedora,sbt, and intellij for spark apps.
 
 
 http://jayunit100.blogspot.com/2014/07/set-up-spark-application-devleopment.html?m=1
 
  The IntelliJ sbt plugin is imo less buggy than the eclipse scalaIDE
  stuff.  For example, I found I had to set some special preferences.
 
  Finally... given sbt's automated recompile option, if you just use tmux
  and vim nerdtree with sbt, you could come pretty close to something like
  an IDE without all the drama..
 
   On Oct 26, 2014, at 11:07 AM, ll duy.huynh@gmail.com wrote:
  
   i'm new to both scala and spark.  what IDE / dev environment do you find
   most productive for writing code in scala with spark?  is it just vim + sbt?
   or does a full IDE like intellij work out better?  thanks!
  
  
  
   --
   View this message in context:
 
 http://apache-spark-developers-list.1001551.n3.nabble.com/best-IDE-for-scala-spark-development-tp8965.html
   Sent from the Apache Spark Developers List mailing list archive at
  Nabble.com.
  
  
 



Re: best IDE for scala + spark development?

2014-10-27 Thread andy petrella
I second the S[B]T combo!

I tried ATOM → lack of features and stability (atm)

aℕdy ℙetrella
about.me/noootsab
[image: aℕdy ℙetrella on about.me]

http://about.me/noootsab



Re: best IDE for scala + spark development?

2014-10-27 Thread Koert Kuipers
editor of your choice + sbt console + grep works great.

if only folks stopped using wildcard imports (they have little benefit in
terms of coding yet require an IDE with 1G+ of RAM to track them down).




Re: best IDE for scala + spark development?

2014-10-27 Thread Shivaram Venkataraman
Also, ctags works fine with vim for browsing scala classes.

Shivaram





jenkins downtime tomorrow morning ~6am-8am PDT

2014-10-27 Thread shane knapp
i'll be bringing jenkins down tomorrow morning for some system maintenance
and to get our backups kicked off.

i do expect to have the system back up and running before 8am.

please let me know ASAP if i need to reschedule this.

thanks,

shane


jenkins emergency restart now, was Re: jenkins downtime tomorrow morning ~6am-8am PDT

2014-10-27 Thread shane knapp
so, i'm having a race condition between a plugin i installed putting
jenkins into quiet mode and it failing to perform a backup from this past
weekend.  i'll need to restart the process and get it out of the
constantly-in-quiet-mode cycle it's in now.

this will be quick, and i'll restart the jobs i've killed.

this DOES NOT affect the restart/maintenance tomorrow morning.

sorry about the inconvenience,

shane



Re: jenkins emergency restart now, was Re: jenkins downtime tomorrow morning ~6am-8am PDT

2014-10-27 Thread shane knapp
ok we're back up and building.  i've retriggered the jobs i killed.





Re: best IDE for scala + spark development?

2014-10-27 Thread Will Benton
I'll chime in as yet another user who is extremely happy with sbt and a text 
editor.  (In my experience, running ack from the command line is usually just 
as easy and fast as using an IDE's find-in-project facility.)  You can, of 
course, extend editors with Scala-specific IDE-like functionality (in 
particular, I am aware of -- but have not used -- ENSIME for emacs or TextMate).

Since you're new to Scala, you may not know that you can run any sbt command 
preceded by a tilde, which will watch files in your project and run the command 
when anything changes.  Therefore, running ~compile from the sbt repl will 
get you most of the continuous syntax-checking functionality you can get from 
an IDE.
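
As a rough illustration of what the tilde prefix does (not sbt's actual implementation, which is more sophisticated), a watch loop boils down to polling modification times and re-running an action whenever one changes:

```python
import os
import tempfile
import time

def watch_once(paths, last_mtimes):
    """Return True if any watched file's mtime changed since the last scan."""
    changed = False
    for p in paths:
        m = os.path.getmtime(p)
        if last_mtimes.get(p) != m:
            last_mtimes[p] = m
            changed = True
    return changed

# Demo with a throwaway "source file".
f = tempfile.NamedTemporaryFile(delete=False, suffix=".scala")
f.close()
seen = {}
print(watch_once([f.name], seen))  # True: first scan treats the file as new
print(watch_once([f.name], seen))  # False: nothing changed, so no recompile
os.utime(f.name, (time.time() + 10,) * 2)  # simulate an edit
print(watch_once([f.name], seen))  # True: mtime changed -> rerun the command
```

sbt runs the equivalent of this loop around whatever command follows the tilde, which is why ~compile approximates an IDE's continuous syntax checking.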

best,
wb




Re: HiveContext bug?

2014-10-27 Thread Marcelo Vanzin
Well, looks like a huge coincidence, but this was just sent to github:
https://github.com/apache/spark/pull/2967

On Mon, Oct 27, 2014 at 3:25 PM, Marcelo Vanzin van...@cloudera.com wrote:
 Hey guys,

 I've been using the HiveFromSpark example to test some changes and I
 ran into an issue that manifests itself as an NPE inside Hive code
 because some configuration object is null.

 Tracing back, it seems that `sessionState` being a lazy val in
 HiveContext is causing it. That variable is only evaluated in [1],
 while the call in [2] causes a Driver to be initialized by [3], which
 then tries to use the thread-local session state ([4]), which hasn't
 been set yet.

 This could be seen as a Hive bug ([3] should probably be calling the
 constructor that takes a conf object), but is there a reason why these
 fields are lazy in HiveContext? I explicitly called
 SessionState.setCurrentSessionState() before the
 CommandProcessorFactory call and that seems to fix the issue too.

 [1] 
 https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L305
 [2] 
 https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L289
 [3] 
 https://github.com/apache/hive/blob/9c63b2fdc35387d735f4c9d08761203711d4974b/ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java#L104
 [4] 
 https://github.com/apache/hive/blob/9c63b2fdc35387d735f4c9d08761203711d4974b/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L286
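
The ordering problem described above can be sketched in a few lines (a Python stand-in for the Scala/Java code, with hypothetical names): an object whose constructor reads thread-local session state sees nothing if it is built before that state is installed.

```python
import threading

# Thread-local holder standing in for Hive's SessionState (names hypothetical).
_session = threading.local()

class Driver:
    """Stand-in for Hive's Driver(): the no-arg constructor reads the
    thread-local session state, which may not have been set yet."""
    def __init__(self):
        self.conf = getattr(_session, "state", None)  # None -> NPE territory in the real code

d1 = Driver()
print(d1.conf)  # None: constructed before any session state was installed

# Equivalent of calling SessionState.setCurrentSessionState(...) first:
_session.state = {"hive.metastore.uris": "thrift://..."}
d2 = Driver()
print(d2.conf is not None)  # True: the state is visible once set beforehand
```

This is why explicitly setting the session state before the CommandProcessorFactory call fixes the issue.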

 --
 Marcelo



-- 
Marcelo




Workaround for python's inability to unzip zip64 spark assembly jar

2014-10-27 Thread Rahul Singhal
Hi All,

We recently faced the known issue where pyspark does not work when the assembly 
jar contains more than 65K files. Our build and runtime environments are both 
Java 7, but Python fails to unzip the assembly jar, as expected 
(https://issues.apache.org/jira/browse/SPARK-1911).

All nodes in our YARN cluster have spark deployed (at the same local location) 
on them so we are contemplating the following workaround (apart from using a 
Java 6 compiled assembly):

Modify PYTHONPATH to give preference to $SPARK_HOME/python and 
$SPARK_HOME/python/lib/py4j-0.8.1-src.zip; with this, the assembly does not 
need to be unzipped to access the python files. This worked fine in my 
limited testing. And I think this should work as long as the only reason to 
unzip the assembly jar is to extract the python files and nothing else (any 
reason to believe that this may not be the case?).
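
The reason the zip on PYTHONPATH works at all is that Python imports modules directly from zip archives on sys.path (zipimport), so the small py4j zip never needs extraction; only the huge zip64 assembly jar is unreadable. A minimal, self-contained demonstration (module name and contents are made up):

```python
import os
import sys
import tempfile
import zipfile

# Build a tiny zip containing one module (names and contents are made up).
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "mylib.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("mymod.py", "ANSWER = 42\n")

# Same effect as prepending the zip to PYTHONPATH before starting Python.
sys.path.insert(0, zip_path)
import mymod  # resolved by zipimport, no extraction needed

print(mymod.ANSWER)  # 42
```
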

I would appreciate your opinion on this workaround.

Thanks,
Rahul Singhal