Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc4)

2013-12-16 Thread Matei Zaharia
Are you using 0.8.1? It will build with protobuf 2.5 instead of 2.4 as long as 
you make it depend on Hadoop 2.2. But make sure you build it with 
SPARK_HADOOP_VERSION=2.2.0 or whatever.

Spark 0.8.0 doesn’t support Hadoop 2.2 due to this issue.

Matei

On Dec 15, 2013, at 10:25 PM, Azuryy Yu azury...@gmail.com wrote:

 Maybe I did not give a clear description. I am running Spark on YARN instead
 of Mesos. I just want to build Spark with protobuf 2.5. I do not care about
 Mesos.
 
 I've changed the Spark pom.xml to protobuf 2.5 manually.
 
 
 
 
 On Mon, Dec 16, 2013 at 2:02 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
 
 Mesos will almost certainly compile fine with protobuf 2.5. The protobuf
 compiler and binary format are forward-compatible across releases; it’s just
 the Java artifacts that aren’t. You’ll need to ask Mesos to provide a
 version with protobuf 2.5, and use that with these versions of Hadoop.
 
 Matei
 
 On Dec 15, 2013, at 7:00 PM, Liu, Raymond raymond@intel.com wrote:
 
 That issue covers the solution for 0.9.
 
 As for 0.8.1, when you build against Hadoop 2.2 YARN, protobuf is
 already 2.5.0 instead of 2.4.1, so it will work fine with Hadoop 2.2.
 As for building 0.8.1 against Hadoop 2.2 YARN while running on
 Mesos... that is a strange combination. I am not sure; it might have
 problems. If it does, you might need to build Mesos against protobuf 2.5.0.
 I have not tested that. If you have time, would you mind testing it?
 
 Best Regards,
 Raymond Liu
 
 
 -Original Message-
 From: Liu, Raymond [mailto:raymond@intel.com]
 Sent: Monday, December 16, 2013 10:48 AM
 To: dev@spark.incubator.apache.org
 Subject: RE: [VOTE] Release Apache Spark 0.8.1-incubating (rc4)
 
 Hi Azuryy
 
 Please check https://spark-project.atlassian.net/browse/SPARK-995 for
 this protobuf version issue.
 
 Best Regards,
 Raymond Liu
 
 -Original Message-
 From: Azuryy Yu [mailto:azury...@gmail.com]
 Sent: Monday, December 16, 2013 10:30 AM
 To: dev@spark.incubator.apache.org
 Subject: Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc4)
 
 Hi there,
 Do we have a plan to upgrade protobuf from 2.4.1 to 2.5.0? Protobuf has some
 incompatible APIs between these two versions.
 Hadoop 2.x uses protobuf 2.5.0,
 but if people want to run Spark on Mesos, Mesos currently uses
 protobuf 2.4.1. So we may want to discuss a better solution here.
 
 
 
 On Mon, Dec 16, 2013 at 7:42 AM, Azuryy Yu azury...@gmail.com wrote:
 
 Thanks Patrick.
 On 16 Dec 2013 02:43, Patrick Wendell pwend...@gmail.com wrote:
 
 You can check out the docs mentioned in the vote thread. There is also
 a pre-built binary for hadoop2 that is compiled for YARN 2.2.
 
 - Patrick
 
 On Sun, Dec 15, 2013 at 4:31 AM, Azuryy Yu azury...@gmail.com wrote:
 yarn 2.2, not yarn 0.22, I am so sorry.
 
 
 On Sun, Dec 15, 2013 at 8:31 PM, Azuryy Yu azury...@gmail.com wrote:
 
 Hi,
 Spark 0.8.1 supports YARN 0.22, right? Where can I find the release notes?
 Thanks.
 
 
 On Sun, Dec 15, 2013 at 3:20 AM, Henry Saputra henry.sapu...@gmail.com wrote:
 
 Yeah seems like it. He was ok with our prev release.
 Let's wait for his reply
 
 On Saturday, December 14, 2013, Patrick Wendell wrote:
 
 Henry - from that thread it looks like sebb's concern was
 something different than this.
 
 On Sat, Dec 14, 2013 at 11:08 AM, Henry Saputra henry.sapu...@gmail.com wrote:
 Hi Patrick,
 
 Yep, I agree, but technically an ASF VOTE is on the source release only;
 there is even debate about it =). So putting it in the vote staging
 artifact could confuse people, because in our case we do package 3rd-party
 libraries in the binary jars.
 
 I have sent an email to sebb on the general@ list asking for clarification
 about his concern.
 
 - Henry
 
 On Sat, Dec 14, 2013 at 10:56 AM, Patrick Wendell pwend...@gmail.com wrote:
 Hey Henry,
 
 One thing a lot of people do during the vote is test the binaries and
 make sure they work. This is really valuable. If you'd like I could
 add a caveat to the vote thread explaining that we are only voting on
 the source.
 
 - Patrick
 
 On Sat, Dec 14, 2013 at 10:40 AM, Henry Saputra henry.sapu...@gmail.com wrote:
 Actually we should be fine putting the binaries there as long as the
 VOTE is for the source.
 
 Let's verify with sebb in the general@ list about his concern.
 
 - Henry
 
 On Sat, Dec 14, 2013 at 10:31 AM, Henry Saputra henry.sapu...@gmail.com wrote:
 Hi Patrick, as sebb has mentioned, let's move the binaries from the
 voting directory in your people.apache.org directory.
 ASF release voting is for source code and not binaries,
 and technically we provide binaries for convenience.
 
 And add a link to the KEYS location in the dist[1] to let people verify
 signatures.
 
 Sorry for the late response to the VOTE thread, guys.
 
 - Henry
 
 [1] https://dist.apache.org/repos/dist/release/incubator/spark/KEYS
 
 On Fri, Dec 13, 2013 at 6:37 PM, Patrick Wendell pwend...@gmail.com wrote:
 The vote is now closed. This vote passes with 5 PPMC +1's and no

Re: Intellij IDEA build issues

2013-12-16 Thread Nick Pentreath
Thanks Evan, I tried it and the new SBT direct import seems to work well,
though I did run into issues with some yarn imports on Spark.

n


On Thu, Dec 12, 2013 at 7:03 PM, Evan Chan e...@ooyala.com wrote:

 Nick, have you tried using the latest Scala plug-in, which features native
 SBT project imports? That is, you no longer need to run gen-idea.


 On Sat, Dec 7, 2013 at 4:15 AM, Nick Pentreath nick.pentre...@gmail.com wrote:

  Hi Spark Devs,
 
  Hoping someone can help me out. No matter what I do, I cannot get IntelliJ
  to build Spark from source. I am using IDEA 13. I run sbt gen-idea and
  everything seems to work fine.
 
  When I try to build using IDEA, everything compiles but I get the error
  below.
 
  Have any of you come across the same?
 
  ==
 
  Internal error: (java.lang.AssertionError) java/nio/channels/FileChannel$MapMode already declared as ch.epfl.lamp.fjbg.JInnerClassesAttribute$Entry@1b5b798b
  java.lang.AssertionError: java/nio/channels/FileChannel$MapMode already declared as ch.epfl.lamp.fjbg.JInnerClassesAttribute$Entry@1b5b798b
  at ch.epfl.lamp.fjbg.JInnerClassesAttribute.addEntry(JInnerClassesAttribute.java:74)
  at scala.tools.nsc.backend.jvm.GenJVM$BytecodeGenerator$$anonfun$addInnerClasses$3.apply(GenJVM.scala:738)
  at scala.tools.nsc.backend.jvm.GenJVM$BytecodeGenerator$$anonfun$addInnerClasses$3.apply(GenJVM.scala:733)
  at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
  at scala.collection.immutable.List.foreach(List.scala:76)
  at scala.tools.nsc.backend.jvm.GenJVM$BytecodeGenerator.addInnerClasses(GenJVM.scala:733)
  at scala.tools.nsc.backend.jvm.GenJVM$BytecodeGenerator.emitClass(GenJVM.scala:200)
  at scala.tools.nsc.backend.jvm.GenJVM$BytecodeGenerator.genClass(GenJVM.scala:355)
  at scala.tools.nsc.backend.jvm.GenJVM$JvmPhase$$anonfun$run$4.apply(GenJVM.scala:86)
  at scala.tools.nsc.backend.jvm.GenJVM$JvmPhase$$anonfun$run$4.apply(GenJVM.scala:86)
  at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:104)
  at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:104)
  at scala.collection.Iterator$class.foreach(Iterator.scala:772)
  at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:157)
  at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:190)
  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:45)
  at scala.collection.mutable.HashMap$$anon$2.foreach(HashMap.scala:104)
  at scala.tools.nsc.backend.jvm.GenJVM$JvmPhase.run(GenJVM.scala:86)
  at scala.tools.nsc.Global$Run.compileSources(Global.scala:953)
  at scala.tools.nsc.Global$Run.compile(Global.scala:1041)
  at xsbt.CachedCompiler0.run(CompilerInterface.scala:123)
  at xsbt.CachedCompiler0.liftedTree1$1(CompilerInterface.scala:99)
  at xsbt.CachedCompiler0.run(CompilerInterface.scala:99)
  at xsbt.CompilerInterface.run(CompilerInterface.scala:27)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:601)
  at sbt.compiler.AnalyzingCompiler.call(AnalyzingCompiler.scala:102)
  at sbt.compiler.AnalyzingCompiler.compile(AnalyzingCompiler.scala:48)
  at sbt.compiler.AnalyzingCompiler.compile(AnalyzingCompiler.scala:41)
  at sbt.compiler.AggressiveCompile$$anonfun$6$$anonfun$compileScala$1$1$$anonfun$apply$3$$anonfun$apply$1.apply$mcV$sp(AggressiveCompile.scala:106)
  at sbt.compiler.AggressiveCompile$$anonfun$6$$anonfun$compileScala$1$1$$anonfun$apply$3$$anonfun$apply$1.apply(AggressiveCompile.scala:106)
  at sbt.compiler.AggressiveCompile$$anonfun$6$$anonfun$compileScala$1$1$$anonfun$apply$3$$anonfun$apply$1.apply(AggressiveCompile.scala:106)
  at sbt.compiler.AggressiveCompile.sbt$compiler$AggressiveCompile$$timed(AggressiveCompile.scala:173)
  at sbt.compiler.AggressiveCompile$$anonfun$6$$anonfun$compileScala$1$1$$anonfun$apply$3.apply(AggressiveCompile.scala:105)
  at sbt.compiler.AggressiveCompile$$anonfun$6$$anonfun$compileScala$1$1$$anonfun$apply$3.apply(AggressiveCompile.scala:102)
  at scala.Option.foreach(Option.scala:236)
  at sbt.compiler.AggressiveCompile$$anonfun$6$$anonfun$compileScala$1$1.apply(AggressiveCompile.scala:102)
  at sbt.compiler.AggressiveCompile$$anonfun$6$$anonfun$compileScala$1$1.apply(AggressiveCompile.scala:102)
  at scala.Option.foreach(Option.scala:236)
  at sbt.compiler.AggressiveCompile$$anonfun$6.compileScala$1(AggressiveCompile.scala:102)
  at sbt.compiler.AggressiveCompile$$anonfun$6.apply(AggressiveCompile.scala:151)
  at sbt.compiler.AggressiveCompile$$anonfun$6.apply(AggressiveCompile.scala:89)
  at
 

Re: Re : Scala 2.10 Merge

2013-12-16 Thread Evan Chan
Great job everyone!  A big step forward.


On Sat, Dec 14, 2013 at 2:37 AM, andy.petre...@gmail.com wrote:

 That's very good news!
 Congrats

 Sent from my HTC

 - Reply message -
 From: Sam Bessalah samkil...@gmail.com
 To: dev@spark.incubator.apache.org
 Subject: Scala 2.10 Merge
 Date: Sat, Dec 14, 2013 11:03


 Yes. Awesome.
 Great job guys.

 Sam Bessalah

  On Dec 14, 2013, at 9:59 AM, Patrick Wendell pwend...@gmail.com wrote:
 
  Alright I just merged this in - so Spark is officially Scala 2.10
  from here forward.
 
  For reference I cut a new branch called scala-2.9 with the commit
  immediately prior to the merge:
 
 https://git-wip-us.apache.org/repos/asf/incubator-spark/repo?p=incubator-spark.git;a=shortlog;h=refs/heads/scala-2.9
 
  - Patrick
 
  On Thu, Dec 12, 2013 at 8:26 PM, Patrick Wendell pwend...@gmail.com wrote:
  Hey Raymond,
 
  Let's move this discussion out of this thread and into the associated JIRA.
  I'll write up our current approach over there.
 
  https://spark-project.atlassian.net/browse/SPARK-995
 
  - Patrick
 
 
  On Thu, Dec 12, 2013 at 5:56 PM, Liu, Raymond raymond@intel.com wrote:
 
  Hi Patrick
 
  So what's the plan for supporting YARN 2.2 in 0.9? As far as I can
  see, if you want to support both 2.2 and 2.0, then because of the protobuf
  version incompatibility you need two versions of Akka anyway.
 
  Akka 2.3-M1 looks like it has a few API changes; we could probably
  isolate the code as we did for the YARN part of the API. I remember it
  being mentioned that using reflection for the different APIs is
  preferred. So is the purpose of using reflection to let one release binary
  jar support both versions of Hadoop/YARN at runtime, instead of building
  different binary jars at compile time?
 
  Then will all code related to Hadoop also be built in separate
  modules for loading on demand? This sounds to me like a lot of work. And
  you would still need a shim layer and separate code for the different API
  versions, depending on different Akka versions, etc. That sounds like even
  stricter demands than our current approach on master, with a dynamic class
  loader in addition. And the problems we are facing now would still be there?
 
  Best Regards,
  Raymond Liu
 
  -Original Message-
  From: Patrick Wendell [mailto:pwend...@gmail.com]
  Sent: Thursday, December 12, 2013 5:13 PM
  To: dev@spark.incubator.apache.org
  Subject: Re: Scala 2.10 Merge
 
  Also - the code is still there because of a recent merge that took in some
  newer changes... we'll be removing it for the final merge.
 
 
  On Thu, Dec 12, 2013 at 1:12 AM, Patrick Wendell pwend...@gmail.com wrote:
 
  Hey Raymond,
 
  This won't work because AFAIK akka 2.3-M1 is not binary compatible
  with akka 2.2.3 (right?). For all of the non-yarn 2.2 versions we need
  to still use the older protobuf library, so we'd need to support both.
 
  I'd also be concerned about having a reference to a non-released
  version of akka. Akka is the source of our hardest-to-find bugs and
  simultaneously trying to support 2.2.3 and 2.3-M1 is a bit daunting.
  Of course, if you are building off of master you can maintain a fork
  that uses this.
 
  - Patrick
 
 
  On Thu, Dec 12, 2013 at 12:42 AM, Liu, Raymond raymond@intel.com wrote:
 
  Hi Patrick
 
  What does it mean to drop YARN 2.2? It seems the code is still
  there. Do you mean that if we build against 2.2 it will break and won't
  work, since the home-made Akka build for Scala 2.10 is not there? If that
  is the case, can we just use Akka 2.3-M1, which runs on protobuf 2.5,
  as a replacement?
 
  Best Regards,
  Raymond Liu
 
 
  -Original Message-
  From: Patrick Wendell [mailto:pwend...@gmail.com]
  Sent: Thursday, December 12, 2013 4:21 PM
  To: dev@spark.incubator.apache.org
  Subject: Scala 2.10 Merge
 
  Hi Developers,
 
  In the next few days we are planning to merge Scala 2.10 support into
  Spark. For those that haven't been following this, Prashant Sharma
  has been maintaining the scala-2.10 branch of Spark for several
  months. This branch is current with master and has been reviewed for
  merging:
 
  https://github.com/apache/incubator-spark/tree/scala-2.10
 
  Scala 2.10 support is one of the most requested features for Spark -
  it will be great to get this into Spark 0.9! Please note that *Scala
  2.10 is not binary compatible with Scala 2.9*. With that in mind, I
  wanted to give a few heads-up/requests to developers:
 
  If you are developing applications on top of Spark's master branch,
  those will need to migrate to Scala 2.10. You may want to download
  and test the current scala-2.10 branch in order to make sure you will
  be okay as Spark developments move forward. Of course, you can always
  stick with the current master commit and be fine (I'll cut a tag when
  we do the merge in order to delineate where the version 

Re: spark.task.maxFailures

2013-12-16 Thread Grega Kešpret
Any news regarding this setting? Is this expected behaviour? Is there some
other way I can have Spark fail-fast?

Thanks!

On Mon, Dec 9, 2013 at 4:35 PM, Grega Kešpret gr...@celtra.com wrote:

 Hi!

 I tried this (by setting spark.task.maxFailures to 1) and it still does
 not fail-fast. I started a job and after some time, I killed all JVMs
 running on one of the two workers. I was expecting Spark job to fail,
 however it re-fetched tasks to one of the two workers that was still alive
 and the job succeeded.

 Grega



Re: spark.task.maxFailures

2013-12-16 Thread Reynold Xin
I just merged your pull request
https://github.com/apache/incubator-spark/pull/245


On Mon, Dec 16, 2013 at 2:12 PM, Grega Kešpret gr...@celtra.com wrote:

 Any news regarding this setting? Is this expected behaviour? Is there some
 other way I can have Spark fail-fast?

 Thanks!

 On Mon, Dec 9, 2013 at 4:35 PM, Grega Kešpret gr...@celtra.com wrote:

  Hi!
 
  I tried this (by setting spark.task.maxFailures to 1) and it still does
  not fail-fast. I started a job and after some time, I killed all JVMs
  running on one of the two workers. I was expecting Spark job to fail,
  however it re-fetched tasks to one of the two workers that was still alive
  and the job succeeded.
 
  Grega
 



Re: spark.task.maxFailures

2013-12-16 Thread Dmitriy Lyubimov
I guess it should really be the maximum number of total task run attempts.
At least that's what it looks like logically. In that sense, the rest of the
documentation is correct (should be at least 1; 1 = task is allowed no
retries (1-1=0)).
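
To make the off-by-one concrete, here is a minimal sketch of how a strict versus an inclusive comparison against the limit changes the number of failed attempts before the job is abandoned. It is written in Python rather than Spark's Scala, and it is only a model of the counting logic discussed in this thread: the function name attempts_before_abort and its parameters are hypothetical and are not part of Spark.

```python
def attempts_before_abort(max_failures, inclusive):
    """Count the failed attempts of one task before the job is given up.

    Hypothetical model of the scheduler check discussed in this thread:
    inclusive=False mimics a strict "failures > limit" test,
    inclusive=True mimics "failures >= limit".
    """
    num_failures = 0
    while True:
        num_failures += 1  # the task fails one more time
        if inclusive:
            give_up = num_failures >= max_failures
        else:
            give_up = num_failures > max_failures
        if give_up:
            return num_failures


# With the limit set to 1, the strict test tolerates one extra failure.
print(attempts_before_abort(1, inclusive=False))
print(attempts_before_abort(1, inclusive=True))
```

Under the inclusive test the setting reads as "maximum number of total attempts", which matches the documented rule that allowed retries = value - 1.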




On Fri, Nov 29, 2013 at 2:02 AM, Grega Kešpret gr...@celtra.com wrote:

 Looking at
 http://spark.incubator.apache.org/docs/latest/configuration.html
 the docs say:
 Number of individual task failures before giving up on the job. Should be
 greater than or equal to 1. Number of allowed retries = this value - 1.

 However, looking at the code

 https://github.com/apache/incubator-spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala#L532

 If I set spark.task.maxFailures to 1, this means that the job will fail after
 a task fails for the second time. Shouldn't this line be corrected to
 if (numFailures(index) >= MAX_TASK_FAILURES) {
 ?

 I can open a pull request if this is the case.

 Thanks,
 Grega
 --
 *Grega Kešpret*
 Analytics engineer

 Celtra — Rich Media Mobile Advertising
 celtra.com <http://www.celtra.com/> | @celtramobile <http://www.twitter.com/celtramobile>