[jira] [Updated] (SPARK-5735) Replace uses of EasyMock with Mockito
[ https://issues.apache.org/jira/browse/SPARK-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5735: --- Assignee: Josh Rosen Replace uses of EasyMock with Mockito - Key: SPARK-5735 URL: https://issues.apache.org/jira/browse/SPARK-5735 Project: Spark Issue Type: Improvement Components: Tests Reporter: Patrick Wendell Assignee: Josh Rosen There are a few reasons we should drop EasyMock. First, we should have a single mocking framework in our tests in general to keep things consistent. Second, EasyMock has caused us some dependency pain in our tests due to objenesis. We aren't totally sure, but we suspect such conflicts might be causing non-deterministic test failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
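For readers unfamiliar with the target framework, here is a minimal sketch of the Mockito style the migration moves to; the BlockFetcher trait and the values are hypothetical, invented for illustration, and are not from the actual patch:

{code}
import org.mockito.Mockito.{mock, verify, when}

// Hypothetical collaborator to be mocked.
trait BlockFetcher {
  def fetch(blockId: String): Array[Byte]
}

object MockitoSketch {
  def main(args: Array[String]): Unit = {
    // Stub behavior directly; there are no EasyMock-style record/replay phases.
    val fetcher = mock(classOf[BlockFetcher])
    when(fetcher.fetch("block-1")).thenReturn(Array[Byte](1, 2, 3))

    assert(fetcher.fetch("block-1").length == 3)

    // Verify the interaction after the fact.
    verify(fetcher).fetch("block-1")
  }
}
{code}

Consolidating on one framework also means a single set of transitive dependencies to keep consistent (Mockito itself uses objenesis, but only one copy of it).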
Re: Mail to u...@spark.apache.org failing
Ah - we should update it to suggest mailing the dev@ list (and if there is enough traffic maybe do something else). I'm happy to add you if you can give an organization name, URL, a list of which Spark components you are using, and a short description of your use case.

On Mon, Feb 9, 2015 at 9:00 PM, Meethu Mathew meethu.mat...@flytxt.com wrote: Hi, The mail id given in https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark seems to be failing. Can anyone tell me how to get added to the Powered By Spark list? -- Regards, *Meethu*

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: New Metrics Sink class not packaged in spark-assembly jar
Hi Judy,

If you have added source files in the sink/ source folder, they should appear in the assembly jar when you build. One thing I noticed is that you are looking inside the /dist folder. That only gets populated if you run make-distribution. The normal development process is just to do mvn package and then look at the assembly jar that is contained in core/target.

- Patrick

On Mon, Feb 9, 2015 at 10:02 PM, Judy Nash judyn...@exchange.microsoft.com wrote: Hello, Working on SPARK-5708 https://issues.apache.org/jira/browse/SPARK-5708 - Add Slf4jSink to Spark Metrics Sink. Wrote a new Slf4jSink class (see patch attached), but the new class is not packaged as part of the spark-assembly jar. Do I need to update the build config somewhere to have this packaged? Current packaged class: Thought I must have missed something basic but can't figure out why. Thanks! Judy

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
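For context, a rough sketch of what a metrics sink along these lines can look like, built on the Codahale Slf4jReporter. This is a simplified illustration, not necessarily the attached patch; the constructor shape of Spark's Sink implementations and the 10-second default poll period are assumptions here:

{code}
package org.apache.spark.metrics.sink

import java.util.Properties
import java.util.concurrent.TimeUnit

import com.codahale.metrics.{MetricRegistry, Slf4jReporter}
import org.slf4j.LoggerFactory

private[spark] class Slf4jSink(
    val property: Properties,
    val registry: MetricRegistry) extends Sink {

  // Assumed default: poll every 10 seconds unless configured otherwise.
  val pollPeriod: Int =
    Option(property.getProperty("period")).map(_.toInt).getOrElse(10)

  // Route all registered metrics to an SLF4J logger.
  val reporter: Slf4jReporter = Slf4jReporter.forRegistry(registry)
    .outputTo(LoggerFactory.getLogger(classOf[Slf4jSink]))
    .convertDurationsTo(TimeUnit.MILLISECONDS)
    .convertRatesTo(TimeUnit.SECONDS)
    .build()

  override def start(): Unit = reporter.start(pollPeriod, TimeUnit.SECONDS)
  override def stop(): Unit = reporter.stop()
  override def report(): Unit = reporter.report()
}
{code}

A sink like this would then typically be enabled via conf/metrics.properties, e.g. *.sink.slf4j.class=org.apache.spark.metrics.sink.Slf4jSink.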
[jira] [Resolved] (SPARK-1142) Allow adding jars on app submission, outside of code
[ https://issues.apache.org/jira/browse/SPARK-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1142. Resolution: Not a Problem Allow adding jars on app submission, outside of code Key: SPARK-1142 URL: https://issues.apache.org/jira/browse/SPARK-1142 Project: Spark Issue Type: Improvement Components: Spark Submit Affects Versions: 0.9.0 Reporter: Sandy Pérez González Assignee: Sandy Ryza yarn-standalone mode supports an option that allows adding jars that will be distributed on the cluster with job submission. Providing similar functionality for other app submission modes will allow the spark-app script proposed in SPARK-1126 to support an add-jars option that works for every submit mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5195) When a Hive table is queried with an alias, the cached data loses effectiveness.
[ https://issues.apache.org/jira/browse/SPARK-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5195: --- Assignee: yixiaohua When a Hive table is queried with an alias, the cached data loses effectiveness. Key: SPARK-5195 URL: https://issues.apache.org/jira/browse/SPARK-5195 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.2.0 Reporter: yixiaohua Assignee: yixiaohua Fix For: 1.3.0 Override MetastoreRelation's sameResult method to compare only the database name and table name. Previously, after running "cache table t1; select count(*) from t1;" the data is read from memory, but the query below instead reads from HDFS: "select count(*) from t1 t;". Cached data is keyed by the logical plan and compared with sameResult, so when the table is used with an alias its logical plan is not the same as the logical plan without the alias; hence the sameResult method is modified to compare only the database name and table name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
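A sketch of the change the description outlines, i.e. restricting the equality check inside MetastoreRelation to the database and table name so that an alias on the scan does not defeat the cache lookup (simplified; the actual patch may differ):

{code}
// Inside MetastoreRelation: treat two relations as producing the same result
// whenever they refer to the same Hive table, regardless of any alias.
override def sameResult(plan: LogicalPlan): Boolean = plan match {
  case other: MetastoreRelation =>
    other.databaseName == databaseName && other.tableName == tableName
  case _ => false
}
{code}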
[jira] [Created] (SPARK-5690) Flaky test:
Patrick Wendell created SPARK-5690: -- Summary: Flaky test: Key: SPARK-5690 URL: https://issues.apache.org/jira/browse/SPARK-5690 Project: Spark Issue Type: Bug Components: Tests Reporter: Patrick Wendell Assignee: Andrew Or https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=centos/1647/testReport/junit/org.apache.spark.deploy.rest/StandaloneRestSubmitSuite/simple_submit_until_completion/ {code} org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit until completion Failing for the past 1 build (Since Failed#1647 ) Took 30 sec. Error Message Driver driver-20150209035158- did not finish within 30 seconds. Stacktrace sbt.ForkMain$ForkError: Driver driver-20150209035158- did not finish within 30 seconds. at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:495) at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) at org.scalatest.Assertions$class.fail(Assertions.scala:1328) at org.scalatest.FunSuite.fail(FunSuite.scala:1555) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.org$apache$spark$deploy$rest$StandaloneRestSubmitSuite$$waitUntilFinished(StandaloneRestSubmitSuite.scala:152) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite$$anonfun$1.apply$mcV$sp(StandaloneRestSubmitSuite.scala:57) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite$$anonfun$1.apply(StandaloneRestSubmitSuite.scala:52) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite$$anonfun$1.apply(StandaloneRestSubmitSuite.scala:52) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(StandaloneRestSubmitSuite.scala:41) at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:255) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.runTest(StandaloneRestSubmitSuite.scala:41) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at 
org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.org$scalatest$BeforeAndAfterAll$$super$run(StandaloneRestSubmitSuite.scala:41) at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257) at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.run(StandaloneRestSubmitSuite.scala:41) at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) at org.scalatest.tools.Framework$ScalaTestTask.execute
[jira] [Updated] (SPARK-5690) Flaky test: org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit until completion
[ https://issues.apache.org/jira/browse/SPARK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5690: --- Summary: Flaky test: org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit until completion (was: Flaky test: ) Flaky test: org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit until completion - Key: SPARK-5690 URL: https://issues.apache.org/jira/browse/SPARK-5690 Project: Spark Issue Type: Bug Components: Tests Reporter: Patrick Wendell Assignee: Andrew Or Labels: flaky-test https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=centos/1647/testReport/junit/org.apache.spark.deploy.rest/StandaloneRestSubmitSuite/simple_submit_until_completion/ {code} org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit until completion Failing for the past 1 build (Since Failed#1647 ) Took 30 sec. Error Message Driver driver-20150209035158- did not finish within 30 seconds. Stacktrace sbt.ForkMain$ForkError: Driver driver-20150209035158- did not finish within 30 seconds. at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:495) at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) at org.scalatest.Assertions$class.fail(Assertions.scala:1328) at org.scalatest.FunSuite.fail(FunSuite.scala:1555) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.org$apache$spark$deploy$rest$StandaloneRestSubmitSuite$$waitUntilFinished(StandaloneRestSubmitSuite.scala:152) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite$$anonfun$1.apply$mcV$sp(StandaloneRestSubmitSuite.scala:57) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite$$anonfun$1.apply(StandaloneRestSubmitSuite.scala:52) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite$$anonfun$1.apply(StandaloneRestSubmitSuite.scala:52) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(StandaloneRestSubmitSuite.scala:41) at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:255) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.runTest(StandaloneRestSubmitSuite.scala:41) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at 
org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.org$scalatest$BeforeAndAfterAll$$super$run(StandaloneRestSubmitSuite.scala:41
[jira] [Updated] (SPARK-5689) Document what can be run in different YARN modes
[ https://issues.apache.org/jira/browse/SPARK-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5689: --- Issue Type: Documentation (was: Improvement) Document what can be run in different YARN modes Key: SPARK-5689 URL: https://issues.apache.org/jira/browse/SPARK-5689 Project: Spark Issue Type: Documentation Components: YARN Affects Versions: 1.1.0 Reporter: Thomas Graves We should document what can be run in the different YARN modes. For instance, the interactive shell only works in yarn-client mode; more recently, with https://github.com/apache/spark/pull/3976, users can run Python scripts in cluster mode; etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
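A couple of illustrative invocations for the modes discussed above (Spark 1.x flag spelling; treat these as examples to verify against the docs rather than as authoritative):

{code}
// Interactive shell: client mode only.
//   spark-shell --master yarn-client
//
// Python script in cluster mode (enabled by the PR linked above):
//   spark-submit --master yarn-cluster my_script.py
{code}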
[jira] [Commented] (SPARK-1142) Allow adding jars on app submission, outside of code
[ https://issues.apache.org/jira/browse/SPARK-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312608#comment-14312608 ] Patrick Wendell commented on SPARK-1142: This already exists - you can use the --jars flag to spark-submit or set 'spark.jars' manually. Allow adding jars on app submission, outside of code Key: SPARK-1142 URL: https://issues.apache.org/jira/browse/SPARK-1142 Project: Spark Issue Type: Improvement Components: Spark Submit Affects Versions: 0.9.0 Reporter: Sandy Pérez González Assignee: Sandy Ryza yarn-standalone mode supports an option that allows adding jars that will be distributed on the cluster with job submission. Providing similar functionality for other app submission modes will allow the spark-app script proposed in SPARK-1126 to support an add-jars option that works for every submit mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
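To illustrate the two options named in the comment (the jar paths and class name are placeholders):

{code}
// 1) On the command line:
//      spark-submit --jars /path/to/dep1.jar,/path/to/dep2.jar --class example.Main app.jar
//
// 2) Programmatically, by setting 'spark.jars' before the context is created
//    (the master is supplied by spark-submit in this sketch):
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("jars-example")
  .set("spark.jars", "/path/to/dep1.jar,/path/to/dep2.jar")
val sc = new SparkContext(conf)
{code}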
Re: Keep or remove Debian packaging in Spark?
I have wondered whether we should sort of deprecate it more officially, since otherwise I think people have the reasonable expectation based on the current code that Spark intends to support complete Debian packaging as part of the upstream build. Having something that's sort-of maintained, but where no one is helping review and merge patches on it or make it fully functional - IMO that doesn't benefit us or our users. There are a bunch of other projects that are specifically devoted to packaging, so it seems like there is a clear separation of concerns here.

On Mon, Feb 9, 2015 at 7:31 AM, Mark Hamstra m...@clearstorydata.com wrote: "it sounds like nobody intends these to be used to actually deploy Spark" I wouldn't go quite that far. What we have now can serve as useful input to a deployment tool like Chef, but the user is then going to need to add some customization or configuration within the context of that tooling to get Spark installed just the way they want. So it is not so much that the current Debian packaging can't be used as that it has never really been intended to be a completely finished product that a newcomer could, for example, use to install Spark completely and quickly to Ubuntu and have a fully-functional environment in which they could then run all of the examples, tutorials, etc. Getting to that level of packaging (and maintenance) is something that I'm not sure we want to do, since that is a better fit with Bigtop and the efforts of Cloudera, Hortonworks, MapR, etc. to distribute Spark.

On Mon, Feb 9, 2015 at 2:41 AM, Sean Owen so...@cloudera.com wrote: This is a straw poll to assess whether there is support to keep and fix, or remove, the Debian packaging-related config in Spark. I see several oldish outstanding JIRAs relating to problems in the packaging: https://issues.apache.org/jira/browse/SPARK-1799 https://issues.apache.org/jira/browse/SPARK-2614 https://issues.apache.org/jira/browse/SPARK-3624 https://issues.apache.org/jira/browse/SPARK-4436 (and a similar idea about making RPMs) https://issues.apache.org/jira/browse/SPARK-665 The original motivation seems related to Chef: https://issues.apache.org/jira/browse/SPARK-2614?focusedCommentId=14070908&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14070908 Mark's recent comments cast some doubt on whether it is essential: https://github.com/apache/spark/pull/4277#issuecomment-72114226 and in recent conversations I didn't hear dissent to the idea of removing this. Is this still useful enough to fix up? All else equal I'd like to start to walk back some of the complexity of the build, but I don't know how all-else-equal it is. Certainly, it sounds like nobody intends these to be used to actually deploy Spark. I don't doubt it's useful to someone, but can they maintain the packaging logic elsewhere?

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[jira] [Resolved] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2892. Resolution: Fixed Fix Version/s: 1.2.1 I believe this is fixed by SPARK-5035, so I'm closing this. Socket Receiver does not stop when streaming context is stopped --- Key: SPARK-2892 URL: https://issues.apache.org/jira/browse/SPARK-2892 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.0.2 Reporter: Tathagata Das Assignee: Tathagata Das Priority: Critical Fix For: 1.2.1 Running NetworkWordCount with {quote} ssc.start(); Thread.sleep(1); ssc.stop(stopSparkContext = false); Thread.sleep(6) {quote} gives the following error {quote}
14/08/06 18:37:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 10047 ms on localhost (1/1)
14/08/06 18:37:13 INFO DAGScheduler: Stage 0 (runJob at ReceiverTracker.scala:275) finished in 10.056 s
14/08/06 18:37:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/08/06 18:37:13 INFO SparkContext: Job finished: runJob at ReceiverTracker.scala:275, took 10.179263 s
14/08/06 18:37:13 INFO ReceiverTracker: All of the receivers have been terminated
14/08/06 18:37:13 WARN ReceiverTracker: All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,SocketReceiver-0,null,false,localhost,Stopped by driver,))
14/08/06 18:37:13 INFO ReceiverTracker: ReceiverTracker stopped
14/08/06 18:37:13 INFO JobGenerator: Stopping JobGenerator immediately
14/08/06 18:37:13 INFO RecurringTimer: Stopped timer for JobGenerator after time 1407375433000
14/08/06 18:37:13 INFO JobGenerator: Stopped JobGenerator
14/08/06 18:37:13 INFO JobScheduler: Stopped JobScheduler
14/08/06 18:37:13 INFO StreamingContext: StreamingContext stopped successfully
14/08/06 18:37:43 INFO SocketReceiver: Stopped receiving
14/08/06 18:37:43 INFO SocketReceiver: Closed socket to localhost:
{quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: multi-line comment style
Clearly there isn't a strictly optimal commenting format (pros and cons of both '//' and '/*'). My thought is that for consistency we should just choose one and put it in the style guide.

On Mon, Feb 9, 2015 at 12:25 PM, Xiangrui Meng men...@gmail.com wrote: Btw, I think allowing `/* ... */` without the leading `*` in lines is also useful. Check this line: https://github.com/apache/spark/pull/4259/files#diff-e9dcb3b5f3de77fc31b3aff7831110eaR55, where we put the R commands that can reproduce the test result. It is easier if we write in the following style:
~~~
/*
Using the following R code to load the data and train the model using glmnet package.
library(glmnet)
data <- read.csv(path, header=FALSE, stringsAsFactors=FALSE)
features <- as.matrix(data.frame(as.numeric(data$V2), as.numeric(data$V3)))
label <- as.numeric(data$V1)
weights <- coef(glmnet(features, label, family="gaussian", alpha = 0, lambda = 0))
*/
~~~
So people can copy-paste the R commands directly. Xiangrui

On Mon, Feb 9, 2015 at 12:18 PM, Xiangrui Meng men...@gmail.com wrote: I like the `/* .. */` style more, because it is easier for IDEs to recognize it as a block comment. If you press enter in the comment block with the `//` style, IDEs won't add `//` for you. -Xiangrui

On Wed, Feb 4, 2015 at 2:15 PM, Reynold Xin r...@databricks.com wrote: We should update the style doc to reflect what we have in most places (which I think is //).

On Wed, Feb 4, 2015 at 2:09 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: FWIW I like the multi-line // over /* */ from a purely style standpoint. The Google Java style guide[1] has some comment about code formatting tools working better with /* */, but there don't seem to be any strong arguments for one over the other that I can find. Thanks Shivaram [1] https://google-styleguide.googlecode.com/svn/trunk/javaguide.html#s4.8.6.1-block-comment-style

On Wed, Feb 4, 2015 at 2:05 PM, Patrick Wendell pwend...@gmail.com wrote: Personally I have no opinion, but agree it would be nice to standardize. - Patrick

On Wed, Feb 4, 2015 at 1:58 PM, Sean Owen so...@cloudera.com wrote: One thing Marcelo pointed out to me is that the // style does not interfere with commenting out blocks of code with /* */, which is a small good thing. I am also accustomed to // style for multiline, and reserve /** */ for javadoc / scaladoc. Meaning, seeing the /* */ style inline always looks a little funny to me.

On Wed, Feb 4, 2015 at 3:53 PM, Kay Ousterhout kayousterh...@gmail.com wrote: Hi all, The Spark Style Guide https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide says multi-line comments should be formatted as:
/*
 * This is a
 * very
 * long comment.
 */
But in my experience, we almost always use // for multi-line comments:
// This is a
// very
// long comment.
Here are some examples:
- Recent commit by Reynold, king of style: https://github.com/apache/spark/commit/bebf4c42bef3e75d31ffce9bfdb331c16f34ddb1#diff-d616b5496d1a9f648864f4ab0db5a026R58
- RDD.scala: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L361
- DAGScheduler.scala: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L281
Any objections to me updating the style guide to reflect this? As with other style issues, I think consistency here is helpful (and formatting multi-line comments as // does nicely visually distinguish code comments from doc comments).
-Kay - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[ANNOUNCE] Apache Spark 1.2.1 Released
Hi All,

I've just posted the 1.2.1 maintenance release of Apache Spark. We recommend all 1.2.0 users upgrade to this release, as this release includes stability fixes across all components of Spark.

- Download this release: http://spark.apache.org/downloads.html
- View the release notes: http://spark.apache.org/releases/spark-release-1-2-1.html
- Full list of JIRA issues resolved in this release: http://s.apache.org/Mpn

Thanks to everyone who helped work on this release!

- Patrick

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[jira] [Commented] (SPARK-4423) Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior
[ https://issues.apache.org/jira/browse/SPARK-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313030#comment-14313030 ] Patrick Wendell commented on SPARK-4423: [~joshrosen] Is this specific to foreach? Isn't this true of map() or other operators as well? Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior --- Key: SPARK-4423 URL: https://issues.apache.org/jira/browse/SPARK-4423 Project: Spark Issue Type: Improvement Components: Documentation Reporter: Josh Rosen Assignee: Ilya Ganelin {{foreach}} seems to be a common source of confusion for new users: in {{local}} mode, {{foreach}} can be used to update local variables on the driver, but programs that do this will not work properly when executed on clusters, since the {{foreach}} will update per-executor variables (note that this _will_ work correctly for accumulators, but not for other types of mutable objects). Similarly, I've seen users become confused when {{.foreach(println)}} doesn't print to the driver's standard output. At a minimum, we should improve the documentation to warn users against unsafe uses of {{foreach}} that won't work properly when transitioning from local mode to a real cluster. We might also consider changes to local mode so that its behavior more closely matches the cluster modes; this will require some discussion, though, since any change of behavior here would technically be a user-visible backwards-incompatible change (I don't think that we made any explicit guarantees about the current local-mode behavior, but someone might be relying on the current implicit behavior). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
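A short illustration of the pitfall described above (a hypothetical snippet; it works as written in local mode but is silently wrong on a cluster):

{code}
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("foreach-pitfall").setMaster("local"))
val rdd = sc.parallelize(1 to 100)

// Broken on a cluster: 'counter' is serialized into the closure, so each
// executor increments its own copy and the driver's variable never changes.
var counter = 0
rdd.foreach(x => counter += x)

// Correct: an accumulator is aggregated back to the driver (Spark 1.x API).
val acc = sc.accumulator(0)
rdd.foreach(x => acc += x)
println(acc.value) // 5050
{code}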
[jira] [Commented] (SPARK-5696) HiveThriftServer2Suite fails because of extra log4j.properties in the driver classpath
[ https://issues.apache.org/jira/browse/SPARK-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313005#comment-14313005 ] Patrick Wendell commented on SPARK-5696: Wow - this must have been a substantial effort to figure out what caused this. Sorry I didn't anticipate this when signing off on that patch. HiveThriftServer2Suite fails because of extra log4j.properties in the driver classpath -- Key: SPARK-5696 URL: https://issues.apache.org/jira/browse/SPARK-5696 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.3.0 Reporter: Cheng Lian Assignee: Cheng Lian Labels: flaky-test PR #2982 added the {{--driver-class-path}} flag to {{HiveThriftServer2Suite}} so that it passes when the {{hadoop-provided}} profile is used. However, {{lib_managed/jars/jets3t-0.9.2.jar}} in the classpath has a log4j.properties in it, which sets the root logger level to ERROR. This makes {{HiveThriftServer2Suite}} fail because it starts new processes and checks for log output. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5647) Output metrics do not show up for older hadoop versions (< 2.5)
[ https://issues.apache.org/jira/browse/SPARK-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313020#comment-14313020 ] Patrick Wendell commented on SPARK-5647: Isn't it just possible to get the file path in the case of file output format, and then read the size of that file? The main challenge I see is how quickly that size becomes visible to the HDFS client. In general I think it's worth doing because a lot of people still use older versions of the Spark HDFS client, for instance people based on AWS who primarily read from S3 and don't keep up to date with the newest Hadoop APIs. Output metrics do not show up for older hadoop versions (< 2.5) --- Key: SPARK-5647 URL: https://issues.apache.org/jira/browse/SPARK-5647 Project: Spark Issue Type: New Feature Components: Spark Core Reporter: Kostas Sakellis Need to add output metrics for hadoop < 2.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
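A rough sketch of the idea in the comment (the function name is illustrative, not the eventual patch): after a file-based write, stat the output path and record its length as bytes written.

{code}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Returns the current size of a written file. On some file systems (e.g. S3)
// the reported size may lag the write, which is the visibility concern above.
def outputBytes(pathStr: String, conf: Configuration): Long = {
  val path = new Path(pathStr)
  val fs = path.getFileSystem(conf)
  fs.getFileStatus(path).getLen
}
{code}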
[jira] [Updated] (SPARK-5647) Output metrics do not show up for older hadoop versions (< 2.5)
[ https://issues.apache.org/jira/browse/SPARK-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5647: --- Target Version/s: 1.4.0 Output metrics do not show up for older hadoop versions (< 2.5) --- Key: SPARK-5647 URL: https://issues.apache.org/jira/browse/SPARK-5647 Project: Spark Issue Type: New Feature Components: Spark Core Reporter: Kostas Sakellis Priority: Critical Need to add output metrics for hadoop < 2.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: Improving metadata in Spark JIRA
I think we already have a YARN component. https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20component%20%3D%20YARN I don't think JIRA allows it to be mandatory, but if it does, that would be useful.

On Sat, Feb 7, 2015 at 5:08 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: By the way, isn't it possible to make the Component field mandatory when people open new issues? Shouldn't we do that? Btw Patrick, don't we need a YARN component? I think our JIRA components should roughly match the components on the PR dashboard. Nick

On Fri Feb 06 2015 at 12:25:52 PM Patrick Wendell pwend...@gmail.com wrote: Per Nick's suggestion I added two components: 1. Spark Submit 2. Spark Scheduler I figured I would just add these since if we decide later we don't want them, we can simply merge them into Spark Core.

On Fri, Feb 6, 2015 at 11:53 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Do we need some new components to be added to the JIRA project? Like: - scheduler - YARN - spark-submit - ...? Nick

On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas nicholas.cham...@gmail.com wrote: +9000 on cleaning up JIRA. Thank you Sean for laying out some specific things to tackle. I will assist with this. Regarding email, I think Sandy is right. I only get JIRA email for issues I'm watching. Nick

On Fri Feb 06 2015 at 9:52:58 AM Sandy Ryza sandy.r...@cloudera.com wrote: JIRA updates don't go to this list, they go to iss...@spark.apache.org. I don't think many are signed up for that list, and those that are probably have a flood of emails anyway. So I'd definitely be in favor of any JIRA cleanup that you're up for. -Sandy

On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen so...@cloudera.com wrote: I've wasted no time in wielding the commit bit to complete a number of small, uncontroversial changes. I wouldn't commit anything that didn't already appear to have review, consensus and little risk, but please let me know if anything looked a little too bold, so I can calibrate. Anyway, I'd like to continue some small house-cleaning by improving the state of JIRA's metadata, in order to let it give us a little clearer view on what's happening in the project:
a. Add Component to every (open) issue that's missing one
b. Review all Critical / Blocker issues to de-escalate ones that seem obviously neither
c. Correct open issues that list a Fix version that has already been released
d. Close all issues Resolved for a release that has already been released
The problem with doing so is that it will create a tremendous amount of email to the list, like, several hundred. It's possible to make bulk changes and suppress e-mail though, which could be done for all but b. Better to suppress the emails when making such changes? or just not bother on some of these?

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[RESULT] [VOTE] Release Apache Spark 1.2.1 (RC3)
This vote passes with 5 +1 votes (3 binding) and no 0 or -1 votes.

+1 Votes:
Krishna Sankar
Sean Owen*
Chip Senkbeil
Matei Zaharia*
Patrick Wendell*

0 Votes: (none)

-1 Votes: (none)

On Fri, Feb 6, 2015 at 5:12 PM, Patrick Wendell pwend...@gmail.com wrote: I'll add a +1 as well.

On Fri, Feb 6, 2015 at 2:38 PM, Matei Zaharia matei.zaha...@gmail.com wrote: +1 Tested on Mac OS X. Matei

On Feb 2, 2015, at 8:57 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.2.1! The tag to be voted on is v1.2.1-rc3 (commit b6eaf77): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=b6eaf77d4332bfb0a698849b1f5f917d20d70e97 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.2.1-rc3/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1065/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.2.1-rc3-docs/ Changes from rc2: A single patch fixing a windows issue. Please vote on releasing this package as Apache Spark 1.2.1! The vote is open until Friday, February 06, at 05:00 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.2.1 [ ] -1 Do not release this package because ... For a list of fixes in this release, see http://s.apache.org/Mpn. To learn more about Apache Spark, please see http://spark.apache.org/

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.2.1 (RC3)
I'll add a +1 as well. On Fri, Feb 6, 2015 at 2:38 PM, Matei Zaharia matei.zaha...@gmail.com wrote: +1 Tested on Mac OS X. Matei On Feb 2, 2015, at 8:57 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.2.1! The tag to be voted on is v1.2.1-rc3 (commit b6eaf77): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=b6eaf77d4332bfb0a698849b1f5f917d20d70e97 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.2.1-rc3/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1065/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.2.1-rc3-docs/ Changes from rc2: A single patch fixing a windows issue. Please vote on releasing this package as Apache Spark 1.2.1! The vote is open until Friday, February 06, at 05:00 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.2.1 [ ] -1 Do not release this package because ... For a list of fixes in this release, see http://s.apache.org/Mpn. To learn more about Apache Spark, please see http://spark.apache.org/ - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Unit tests
Hey All,

The tests are in a not-amazing state right now due to a few compounding factors:
1. We've merged a large volume of patches recently.
2. The load on jenkins has been relatively high, exposing races and other behavior not seen at lower load.

For those not familiar, the main issue is flaky (non-deterministic) test failures. Right now I'm trying to prioritize keeping the PullRequestBuilder in good shape since it will block development if it is down.

For other tests, let's try to keep filing JIRAs when we see issues and use the flaky-test label (see http://bit.ly/1yRif9S). I may contact people regarding specific tests. This is a very high priority to get in good shape. This kind of thing is no one's fault but just the result of a lot of concurrent development, and everyone needs to pitch in to get back in a good place.

- Patrick

- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[jira] [Commented] (SPARK-761) Print a nicer error message when incompatible Spark binaries try to talk
[ https://issues.apache.org/jira/browse/SPARK-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311475#comment-14311475 ] Patrick Wendell commented on SPARK-761: --- I think the main thing to catch would be Akka. I.e. try connecting different versions and seeing what happens as an exploratory step. For instance, if akka has a standard exception which says you had an incompatible message type, we can wrap that and give an outer exception explaining that the spark version is likely wrong. So maybe we can see if someone wants to explore this a bit as a starter task. Print a nicer error message when incompatible Spark binaries try to talk Key: SPARK-761 URL: https://issues.apache.org/jira/browse/SPARK-761 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Matei Zaharia Priority: Minor Labels: starter Not sure what component this falls under, or if this is still an issue. Patrick Wendell / Matei Zaharia? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
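A sketch of the wrapping idea; the exception type caught here is a guess, and the exploratory step suggested above is exactly to find out what Akka really throws when versions are incompatible:

{code}
// Hypothetical helper: wrap a low-level deserialization failure in a
// friendlier error that points at a likely Spark version mismatch.
def handleRemoteMessage(deserialize: () => Any): Any =
  try {
    deserialize()
  } catch {
    case e: java.io.InvalidClassException =>
      throw new RuntimeException(
        "Received an incompatible message from a remote endpoint; the " +
        "remote process is likely running a different Spark version.", e)
  }
{code}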
[jira] [Updated] (SPARK-4687) SparkContext#addFile doesn't keep file folder information
[ https://issues.apache.org/jira/browse/SPARK-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4687: --- Component/s: Spark Core SparkContext#addFile doesn't keep file folder information - Key: SPARK-4687 URL: https://issues.apache.org/jira/browse/SPARK-4687 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Jimmy Xiang Assignee: Sandy Ryza Fix For: 1.3.0, 1.4.0 Files added with SparkContext#addFile are loaded with Utils#fetchFile before a task starts. However, Utils#fetchFile puts all files under the Spark root on the worker node. We should have an option to keep the folder information. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5299) Is http://www.apache.org/dist/spark/KEYS out of date?
[ https://issues.apache.org/jira/browse/SPARK-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5299: --- Component/s: (was: Deploy) Build Is http://www.apache.org/dist/spark/KEYS out of date? - Key: SPARK-5299 URL: https://issues.apache.org/jira/browse/SPARK-5299 Project: Spark Issue Type: Question Components: Build Reporter: David Shaw Assignee: Patrick Wendell The keys contained in http://www.apache.org/dist/spark/KEYS do not appear to match the keys used to sign the releases. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3033) [Hive] java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/SPARK-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3033: --- Component/s: (was: Spark Core) [Hive] java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal Key: SPARK-3033 URL: https://issues.apache.org/jira/browse/SPARK-3033 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.0.2 Reporter: pengyanhong run a complex HiveQL via yarn-cluster, got error as below: {quote} 14/08/14 15:05:24 WARN org.apache.spark.Logging$class.logWarning(Logging.scala:70): Loss was due to java.lang.ClassCastException java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveJavaObject(JavaHiveDecimalObjectInspector.java:51) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getHiveDecimal(PrimitiveObjectInspectorUtils.java:1022) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveDecimalConverter.convert(PrimitiveObjectInspectorConverter.java:306) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ReturnObjectInspectorResolver.convertIfNecessary(GenericUDFUtils.java:179) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFIf.evaluate(GenericUDFIf.java:82) at org.apache.spark.sql.hive.HiveGenericUdf.eval(hiveUdfs.scala:276) at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:84) at org.apache.spark.sql.catalyst.expressions.MutableProjection.apply(Projection.scala:62) at org.apache.spark.sql.catalyst.expressions.MutableProjection.apply(Projection.scala:51) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at org.apache.spark.sql.execution.BroadcastNestedLoopJoin$$anonfun$4.apply(joins.scala:309) at org.apache.spark.sql.execution.BroadcastNestedLoopJoin$$anonfun$4.apply(joins.scala:303) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:571) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:571) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111) at org.apache.spark.scheduler.Task.run(Task.scala:51) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-761) Print a nicer error message when incompatible Spark binaries try to talk
[ https://issues.apache.org/jira/browse/SPARK-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-761: -- Labels: starter (was: ) Print a nicer error message when incompatible Spark binaries try to talk Key: SPARK-761 URL: https://issues.apache.org/jira/browse/SPARK-761 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Matei Zaharia Priority: Minor Labels: starter Not sure what component this falls under, or if this is still an issue. Patrick Wendell / Matei Zaharia? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-761) Print a nicer error message when incompatible Spark binaries try to talk
[ https://issues.apache.org/jira/browse/SPARK-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-761: -- Description: As a starter task, it would be good to audit the current behavior for different client - server pairs with respect to how exceptions occur. (was: Not sure what component this falls under, or if this is still an issue. Patrick Wendell / Matei Zaharia?) Print a nicer error message when incompatible Spark binaries try to talk Key: SPARK-761 URL: https://issues.apache.org/jira/browse/SPARK-761 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Matei Zaharia Priority: Minor Labels: starter As a starter task, it would be good to audit the current behavior for different client - server pairs with respect to how exceptions occur. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-761) Print a nicer error message when incompatible Spark binaries try to talk
[ https://issues.apache.org/jira/browse/SPARK-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311490#comment-14311490 ] Patrick Wendell commented on SPARK-761: --- [~aash] right now we don't explicitly encode the spark version anywhere in the RPC. The best possible thing is to give an explicit version number like you said, but we don't have the plumbing to do that at the moment and IMO that's worth punting until we decide to standardize the RPC format. Print a nicer error message when incompatible Spark binaries try to talk Key: SPARK-761 URL: https://issues.apache.org/jira/browse/SPARK-761 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Matei Zaharia Priority: Minor Labels: starter As a starter task, it would be good to audit the current behavior for different client - server pairs with respect to how exceptions occur. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5659) Flaky Test: org.apache.spark.streaming.ReceiverSuite.block
[ https://issues.apache.org/jira/browse/SPARK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5659: --- Component/s: Tests Flaky Test: org.apache.spark.streaming.ReceiverSuite.block -- Key: SPARK-5659 URL: https://issues.apache.org/jira/browse/SPARK-5659 Project: Spark Issue Type: Bug Components: Streaming, Tests Affects Versions: 1.3.0 Reporter: Patrick Wendell Assignee: Tathagata Das Priority: Critical Labels: flaky-test {code} Error Message recordedBlocks.drop(1).dropRight(1).forall(((block: scala.collection.mutable.ArrayBuffer[Int]) => block.size.>=(minExpectedMessagesPerBlock).&&(block.size.<=(maxExpectedMessagesPerBlock)))) was false # records in received blocks = [11,10,10,10,10,10,10,10,10,10,10,4,16,10,10,10,10,10,10,10], not between 7 and 11 Stacktrace sbt.ForkMain$ForkError: recordedBlocks.drop(1).dropRight(1).forall(((block: scala.collection.mutable.ArrayBuffer[Int]) => block.size.>=(minExpectedMessagesPerBlock).&&(block.size.<=(maxExpectedMessagesPerBlock)))) was false # records in received blocks = [11,10,10,10,10,10,10,10,10,10,10,4,16,10,10,10,10,10,10,10], not between 7 and 11 at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500) at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466) at org.apache.spark.streaming.ReceiverSuite$$anonfun$3.apply$mcV$sp(ReceiverSuite.scala:200) at org.apache.spark.streaming.ReceiverSuite$$anonfun$3.apply(ReceiverSuite.scala:158) at org.apache.spark.streaming.ReceiverSuite$$anonfun$3.apply(ReceiverSuite.scala:158) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.streaming.ReceiverSuite.org$scalatest$BeforeAndAfter$$super$runTest(ReceiverSuite.scala:39) at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) at org.apache.spark.streaming.ReceiverSuite.runTest(ReceiverSuite.scala:39) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at 
org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.streaming.ReceiverSuite.org$scalatest$BeforeAndAfter$$super$run(ReceiverSuite.scala:39) at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) at org.apache.spark.streaming.ReceiverSuite.run(ReceiverSuite.scala:39) at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) at sbt.ForkMain$Run$2.call(ForkMain.java:294
[jira] [Created] (SPARK-5679) Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and input metrics with mixed read method
Patrick Wendell created SPARK-5679: -- Summary: Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and .input metrics with mixed read method Key: SPARK-5679 URL: https://issues.apache.org/jira/browse/SPARK-5679 Project: Spark Issue Type: Bug Components: Spark Core, Tests Affects Versions: 1.3.0 Reporter: Patrick Wendell Assignee: Kostas Sakellis Priority: Blocker Please audit these and see if there are any assumptions with respect to File IO that might not hold in all cases. I'm happy to help if you can't find anything. These both failed in the same run: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.3-SBT/38/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/#showFailuresLink {code} org.apache.spark.metrics.InputOutputMetricsSuite.input metrics with mixed read method Failing for the past 13 builds (Since Failed#26 ) Took 48 sec. Error Message 2030 did not equal 6496 Stacktrace sbt.ForkMain$ForkError: 2030 did not equal 6496 at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500) at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply$mcV$sp(InputOutputMetricsSuite.scala:135) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply(InputOutputMetricsSuite.scala:113) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply(InputOutputMetricsSuite.scala:113) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.metrics.InputOutputMetricsSuite.org$scalatest$BeforeAndAfter$$super$runTest(InputOutputMetricsSuite.scala:46) at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) at org.apache.spark.metrics.InputOutputMetricsSuite.runTest(InputOutputMetricsSuite.scala:46) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at 
org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.metrics.InputOutputMetricsSuite.org$scalatest$BeforeAndAfterAll$$super$run(InputOutputMetricsSuite.scala:46) at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257) at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256) at org.apache.spark.metrics.InputOutputMetricsSuite.org$scalatest$BeforeAndAfter$$super$run(InputOutputMetricsSuite.scala:46) at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) at org.apache.spark.metrics.InputOutputMetricsSuite.run(InputOutputMetricsSuite.scala:46
[jira] [Updated] (SPARK-5679) Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and input metrics with mixed read method
[ https://issues.apache.org/jira/browse/SPARK-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5679: --- Description: Please audit these and see if there are any assumptions with respect to File IO that might not hold in all cases. I'm happy to help if you can't find anything. These both failed in the same run: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.3-SBT/38/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/#showFailuresLink {code} org.apache.spark.metrics.InputOutputMetricsSuite.input metrics with mixed read method Failing for the past 13 builds (Since Failed#26 ) Took 48 sec. Error Message 2030 did not equal 6496 Stacktrace sbt.ForkMain$ForkError: 2030 did not equal 6496 at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500) at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply$mcV$sp(InputOutputMetricsSuite.scala:135) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply(InputOutputMetricsSuite.scala:113) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply(InputOutputMetricsSuite.scala:113) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.metrics.InputOutputMetricsSuite.org$scalatest$BeforeAndAfter$$super$runTest(InputOutputMetricsSuite.scala:46) at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) at org.apache.spark.metrics.InputOutputMetricsSuite.runTest(InputOutputMetricsSuite.scala:46) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at 
org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.metrics.InputOutputMetricsSuite.org$scalatest$BeforeAndAfterAll$$super$run(InputOutputMetricsSuite.scala:46) at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257) at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256) at org.apache.spark.metrics.InputOutputMetricsSuite.org$scalatest$BeforeAndAfter$$super$run(InputOutputMetricsSuite.scala:46) at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) at org.apache.spark.metrics.InputOutputMetricsSuite.run(InputOutputMetricsSuite.scala:46) at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) at sbt.ForkMain$Run$2.call(ForkMain.java:294) at sbt.ForkMain$Run$2.call(ForkMain.java:284) at java.util.concurrent.FutureTask.run(FutureTask.java:262
[jira] [Updated] (SPARK-5679) Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and input metrics with mixed read method
[ https://issues.apache.org/jira/browse/SPARK-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5679: --- Summary: Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and input metrics with mixed read method (was: Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and .input metrics with mixed read method ) Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and input metrics with mixed read method -- Key: SPARK-5679 URL: https://issues.apache.org/jira/browse/SPARK-5679 Project: Spark Issue Type: Bug Components: Spark Core, Tests Affects Versions: 1.3.0 Reporter: Patrick Wendell Assignee: Kostas Sakellis Priority: Blocker Please audit these and see if there are any assumptions with respect to File IO that might not hold in all cases. I'm happy to help if you can't find anything. These both failed in the same run: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.3-SBT/38/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=centos/#showFailuresLink {code} org.apache.spark.metrics.InputOutputMetricsSuite.input metrics with mixed read method Failing for the past 13 builds (Since Failed#26 ) Took 48 sec. Error Message 2030 did not equal 6496 Stacktrace sbt.ForkMain$ForkError: 2030 did not equal 6496 at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500) at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply$mcV$sp(InputOutputMetricsSuite.scala:135) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply(InputOutputMetricsSuite.scala:113) at org.apache.spark.metrics.InputOutputMetricsSuite$$anonfun$9.apply(InputOutputMetricsSuite.scala:113) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.metrics.InputOutputMetricsSuite.org$scalatest$BeforeAndAfter$$super$runTest(InputOutputMetricsSuite.scala:46) at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) at org.apache.spark.metrics.InputOutputMetricsSuite.runTest(InputOutputMetricsSuite.scala:46) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at 
org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.metrics.InputOutputMetricsSuite.org$scalatest$BeforeAndAfterAll$$super$run(InputOutputMetricsSuite.scala:46) at org.scalatest.BeforeAndAfterAll
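As a starting point for the audit requested above, one pattern that avoids baking a single read method's byte count into the test is to derive the expected value from the file itself. A minimal sketch, with hypothetical helper and variable names (this is not the actual suite code):
{code}
import java.io.File

// Hypothetical check: compare the recorded metric against the on-disk size
// instead of a hard-coded constant, so buffered vs. unbuffered reads and
// split-boundary effects don't make the assertion flaky.
def assertInputMetric(bytesRead: Long, inputFile: File): Unit = {
  val expected = inputFile.length()
  assert(bytesRead == expected,
    s"bytes read ($bytesRead) did not equal file size ($expected)")
}
{code}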
[jira] [Updated] (SPARK-4896) Don't redundantly copy executor dependencies in Utils.fetchFile
[ https://issues.apache.org/jira/browse/SPARK-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4896: --- Component/s: Spark Core Don't redundantly copy executor dependencies in Utils.fetchFile --- Key: SPARK-4896 URL: https://issues.apache.org/jira/browse/SPARK-4896 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Josh Rosen Assignee: Ryan Williams Fix For: 1.3.0, 1.1.2, 1.2.1 This JIRA is spun off from a comment by [~rdub] on SPARK-3967, quoted here: {quote} I've been debugging this issue as well and I think I've found an issue in {{org.apache.spark.util.Utils}} that is contributing to / causing the problem: {{Files.move}} on [line 390|https://github.com/apache/spark/blob/v1.1.0/core/src/main/scala/org/apache/spark/util/Utils.scala#L390] is called even if {{targetFile}} exists and {{tempFile}} and {{targetFile}} are equal. The check on [line 379|https://github.com/apache/spark/blob/v1.1.0/core/src/main/scala/org/apache/spark/util/Utils.scala#L379] seems to imply the desire to skip a redundant overwrite if the file is already there and has the contents that it should have. Gating the {{Files.move}} call on a further {{if (!targetFile.exists)}} fixes the issue for me; attached is a patch of the change. In practice all of my executors that hit this code path are finding every dependency JAR to already exist and be exactly equal to what they need it to be, meaning they were all needlessly overwriting all of their dependency JARs, and now are all basically no-op-ing in {{Utils.fetchFile}}; I've not determined who/what is putting the JARs there, why the issue only crops up in {{yarn-cluster}} mode (or {{--master yarn --deploy-mode cluster}}), etc., but it seems like either way this patch is probably desirable. {quote} I'm spinning this off into its own JIRA so that we can track the merging of https://github.com/apache/spark/pull/2848 separately (since we have multiple PRs that contribute to fixing the original issue). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
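A minimal sketch of the gating described in the quote above, using Guava's file utilities; the method name and surrounding structure are illustrative, not the actual Utils.fetchFile code:
{code}
import java.io.File
import com.google.common.io.Files

// Skip the redundant overwrite when the target already holds identical bytes.
def moveIfNeeded(tempFile: File, targetFile: File): Unit = {
  if (targetFile.exists() && Files.equal(tempFile, targetFile)) {
    tempFile.delete()  // dependency is already in place; no-op
  } else {
    Files.move(tempFile, targetFile)
  }
}
{code}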
[jira] [Updated] (SPARK-5355) SparkConf is not thread-safe
[ https://issues.apache.org/jira/browse/SPARK-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5355: --- Component/s: Spark Core SparkConf is not thread-safe Key: SPARK-5355 URL: https://issues.apache.org/jira/browse/SPARK-5355 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0, 1.3.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker Fix For: 1.3.0, 1.2.1 SparkConf is not thread-safe, but it is accessed by many threads. getAll() could return only part of the configs if another thread is modifying the conf at the same time. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
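A minimal sketch of one way to make concurrent access safe, assuming a hypothetical settings map rather than the actual SparkConf internals:
{code}
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._

class ThreadSafeConf {
  // ConcurrentHashMap never exposes a partially updated view, so getAll
  // cannot observe a torn read while another thread is writing.
  private val settings = new ConcurrentHashMap[String, String]()

  def set(key: String, value: String): this.type = {
    settings.put(key, value)
    this
  }

  def getAll: Array[(String, String)] =
    settings.entrySet().asScala.map(e => (e.getKey, e.getValue)).toArray
}
{code}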
[jira] [Updated] (SPARK-5289) Backport publishing of repl, yarn into branch-1.2
[ https://issues.apache.org/jira/browse/SPARK-5289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5289: --- Component/s: Build Backport publishing of repl, yarn into branch-1.2 - Key: SPARK-5289 URL: https://issues.apache.org/jira/browse/SPARK-5289 Project: Spark Issue Type: Improvement Components: Build Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Blocker Fix For: 1.2.1 In SPARK-3452 we did some clean-up of published artifacts that turned out to adversely affect some users. This has been mostly patched up in master via SPARK-4925 (hive-thriftserver), which was backported. For the repl and yarn modules, they were fixed in SPARK-4048 as part of a larger change that only went into master. Those pieces should be backported to Spark 1.2 to allow publishing in a 1.2.1 release. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5254) Update the user guide to make clear that spark.mllib is not being deprecated
[ https://issues.apache.org/jira/browse/SPARK-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5254: --- Component/s: MLlib Update the user guide to make clear that spark.mllib is not being deprecated Key: SPARK-5254 URL: https://issues.apache.org/jira/browse/SPARK-5254 Project: Spark Issue Type: Documentation Components: MLlib Reporter: Xiangrui Meng Assignee: Xiangrui Meng Fix For: 1.3.0, 1.2.1 The current statement in the user guide may deliver a confusing message to users. spark.ml contains high-level APIs for building ML pipelines, but that doesn't mean spark.mllib is being deprecated. First of all, the pipeline API is in its alpha stage and we need to see more use cases from the community to stabilize it, which may take several releases. Secondly, the components in spark.ml are simple wrappers over spark.mllib implementations. Neither the APIs nor the implementations from spark.mllib are being deprecated. We expect users to use the spark.ml pipeline APIs to build their ML pipelines, but we will keep supporting and adding features to spark.mllib. For example, there are many features in review at https://spark-prs.appspot.com/#mllib. So users should be comfortable using spark.mllib features and can expect more to come. The user guide needs to be updated to make this message clear. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5308) MD5 / SHA1 hash format doesn't match standard Maven output
[ https://issues.apache.org/jira/browse/SPARK-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5308: --- Fix Version/s: (was: 1.2.1) MD5 / SHA1 hash format doesn't match standard Maven output -- Key: SPARK-5308 URL: https://issues.apache.org/jira/browse/SPARK-5308 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.2.0 Reporter: Kuldeep Assignee: Sean Owen Priority: Minor Fix For: 1.3.0 https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.10/1.2.0/spark-core_2.10-1.2.0.pom.md5 The above does not look like a proper md5, which causes failures in some build tools like Leiningen. https://github.com/technomancy/leiningen/issues/1802 Compare this with the 1.1.0 release: https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.10/1.1.0/spark-core_2.10-1.1.0.pom.md5 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
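For comparison, a Maven-style .md5 file holds just the lowercase hex digest of the artifact. A minimal sketch of producing that format in Scala (illustrative only, not Spark's release tooling):
{code}
import java.nio.file.{Files, Paths}
import java.security.MessageDigest

// Emit the bare hex digest, the format Maven and tools like Leiningen expect.
def md5Hex(path: String): String =
  MessageDigest.getInstance("MD5")
    .digest(Files.readAllBytes(Paths.get(path)))
    .map(b => f"${b & 0xff}%02x")
    .mkString
{code}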
[jira] [Updated] (SPARK-5308) MD5 / SHA1 hash format doesn't match standard Maven output
[ https://issues.apache.org/jira/browse/SPARK-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5308: --- Fix Version/s: 1.2.1 MD5 / SHA1 hash format doesn't match standard Maven output -- Key: SPARK-5308 URL: https://issues.apache.org/jira/browse/SPARK-5308 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.2.0 Reporter: Kuldeep Assignee: Sean Owen Priority: Minor Fix For: 1.3.0, 1.2.1 https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.10/1.2.0/spark-core_2.10-1.2.0.pom.md5 The above does not look like a proper md5, which causes failures in some build tools like Leiningen. https://github.com/technomancy/leiningen/issues/1802 Compare this with the 1.1.0 release: https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.10/1.1.0/spark-core_2.10-1.1.0.pom.md5 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5524) Remove messy dependencies to log4j
[ https://issues.apache.org/jira/browse/SPARK-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5524: --- Component/s: Spark Core Remove messy dependencies to log4j -- Key: SPARK-5524 URL: https://issues.apache.org/jira/browse/SPARK-5524 Project: Spark Issue Type: Task Components: Spark Core Reporter: Jacek Lewandowski There are some tickets regarding loosening the dependency on Log4j; however, some classes still use the following scheme: {code} if (Logger.getLogger(classOf[SomeClass]).getLevel == null) { Logger.getLogger(classOf[SomeClass]).setLevel(someLevel) } {code} This doesn't look good, and it makes it difficult to track why some logs are missing when you use Log4j and why they flood when you use something else, like Logback. There is a Logging class which checks whether we use Log4j or not. Why not delegate all such invocations to the Logging class, which could handle them properly, perhaps supporting more logging implementations? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
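A minimal sketch of the delegation idea, with a hypothetical helper object; the backend check mirrors a common slf4j pattern, but this is not the actual Logging trait:
{code}
import org.apache.log4j.{Level, Logger}

object LogLevels {
  // Only touch log4j when it is actually the active slf4j backend.
  private def usingLog4j12: Boolean =
    org.slf4j.LoggerFactory.getILoggerFactory.getClass.getName ==
      "org.slf4j.impl.Log4jLoggerFactory"

  // Apply a default level only if log4j is in use and the logger is
  // unconfigured, so logback (or any other backend) is never affected.
  def setDefaultLevel(clazz: Class[_], level: Level): Unit = {
    if (usingLog4j12) {
      val logger = Logger.getLogger(clazz)
      if (logger.getLevel == null) logger.setLevel(level)
    }
  }
}
{code}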
[jira] [Commented] (SPARK-5524) Remove messy dependencies to log4j
[ https://issues.apache.org/jira/browse/SPARK-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1430#comment-1430 ] Patrick Wendell commented on SPARK-5524: [~nchammas] I don't think this is related to the build, so I've changed the component. Remove messy dependencies to log4j -- Key: SPARK-5524 URL: https://issues.apache.org/jira/browse/SPARK-5524 Project: Spark Issue Type: Task Components: Spark Core Reporter: Jacek Lewandowski There are some tickets regarding loosening the dependency on Log4j; however, some classes still use the following scheme: {code} if (Logger.getLogger(classOf[SomeClass]).getLevel == null) { Logger.getLogger(classOf[SomeClass]).setLevel(someLevel) } {code} This doesn't look good, and it makes it difficult to track why some logs are missing when you use Log4j and why they flood when you use something else, like Logback. There is a Logging class which checks whether we use Log4j or not. Why not delegate all such invocations to the Logging class, which could handle them properly, perhaps supporting more logging implementations? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5524) Remove messy dependencies to log4j
[ https://issues.apache.org/jira/browse/SPARK-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5524: --- Component/s: (was: Build) Remove messy dependencies to log4j -- Key: SPARK-5524 URL: https://issues.apache.org/jira/browse/SPARK-5524 Project: Spark Issue Type: Task Components: Spark Core Reporter: Jacek Lewandowski There are some tickets regarding loosening the dependency on Log4j; however, some classes still use the following scheme: {code} if (Logger.getLogger(classOf[SomeClass]).getLevel == null) { Logger.getLogger(classOf[SomeClass]).setLevel(someLevel) } {code} This doesn't look good, and it makes it difficult to track why some logs are missing when you use Log4j and why they flood when you use something else, like Logback. There is a Logging class which checks whether we use Log4j or not. Why not delegate all such invocations to the Logging class, which could handle them properly, perhaps supporting more logging implementations? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5388) Provide a stable application submission gateway in standalone cluster mode
[ https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309784#comment-14309784 ] Patrick Wendell commented on SPARK-5388: On DELETE, I'll defer to you guys, have zero strong feelings either way. Provide a stable application submission gateway in standalone cluster mode -- Key: SPARK-5388 URL: https://issues.apache.org/jira/browse/SPARK-5388 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker Attachments: stable-spark-submit-in-standalone-mode-2-4-15.pdf The existing submission gateway in standalone mode is not compatible across Spark versions. If you have a newer version of Spark submitting to an older version of the standalone Master, it is currently not guaranteed to work. The goal is to provide a stable REST interface to replace this channel. For more detail, please see the most recent design doc attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: Improving metadata in Spark JIRA
Per Nick's suggestion I added two components: 1. Spark Submit 2. Spark Scheduler I figured I would just add these since if we decide later we don't want them, we can simply merge them into Spark Core. On Fri, Feb 6, 2015 at 11:53 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Do we need some new components to be added to the JIRA project? Like: - scheduler - YARN - spark-submit - ...? Nick On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas nicholas.cham...@gmail.com wrote: +9000 on cleaning up JIRA. Thank you Sean for laying out some specific things to tackle. I will assist with this. Regarding email, I think Sandy is right. I only get JIRA email for issues I'm watching. Nick On Fri Feb 06 2015 at 9:52:58 AM Sandy Ryza sandy.r...@cloudera.com wrote: JIRA updates don't go to this list, they go to iss...@spark.apache.org. I don't think many are signed up for that list, and those that are probably have a flood of emails anyway. So I'd definitely be in favor of any JIRA cleanup that you're up for. -Sandy On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen so...@cloudera.com wrote: I've wasted no time in wielding the commit bit to complete a number of small, uncontroversial changes. I wouldn't commit anything that didn't already appear to have review, consensus and little risk, but please let me know if anything looked a little too bold, so I can calibrate. Anyway, I'd like to continue some small house-cleaning by improving the state of JIRA's metadata, in order to let it give us a little clearer view on what's happening in the project: a. Add Component to every (open) issue that's missing one b. Review all Critical / Blocker issues to de-escalate ones that seem obviously neither c. Correct open issues that list a Fix version that has already been released d. Close all issues Resolved for a release that has already been released The problem with doing so is that it will create a tremendous amount of email to the list, like, several hundred. It's possible to make bulk changes and suppress e-mail though, which could be done for all but b. Better to suppress the emails when making such changes? or just not bother on some of these? - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[jira] [Commented] (SPARK-5388) Provide a stable application submission gateway in standalone cluster mode
[ https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309825#comment-14309825 ] Patrick Wendell commented on SPARK-5388: On the boolean and numeric values: I don't mind one way or the other how they are handled programmatically (since we are not exposing this). However, it does seem weird that the wire protocol defines these as string types. I looked at a few other APIs (GitHub, Twitter, etc.) and they all use proper boolean types. So I'd definitely recommend setting them as proper types in the JSON, and if that's easiest to do by making them nullable Boolean and Long values, that seems like a good approach. Provide a stable application submission gateway in standalone cluster mode -- Key: SPARK-5388 URL: https://issues.apache.org/jira/browse/SPARK-5388 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker Attachments: stable-spark-submit-in-standalone-mode-2-4-15.pdf The existing submission gateway in standalone mode is not compatible across Spark versions. If you have a newer version of Spark submitting to an older version of the standalone Master, it is currently not guaranteed to work. The goal is to provide a stable REST interface to replace this channel. For more detail, please see the most recent design doc attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
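A minimal sketch of the nullable-wrapper idea, with hypothetical field names rather than the actual protocol messages: boxed java.lang types can be null when a field is absent, while still serializing as real JSON booleans and numbers instead of strings:
{code}
// Illustrative request fields only; not the real submission protocol.
class SubmitRequestFields {
  var appName: String = null
  var supervise: java.lang.Boolean = null    // absent => null, else true/false
  var driverMemoryMb: java.lang.Long = null  // absent => null, else a number
}
{code}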
[jira] [Updated] (SPARK-4874) Report number of records read/written in a task
[ https://issues.apache.org/jira/browse/SPARK-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4874: --- Component/s: Web UI Spark Core Report number of records read/written in a task --- Key: SPARK-4874 URL: https://issues.apache.org/jira/browse/SPARK-4874 Project: Spark Issue Type: Improvement Components: Spark Core, Web UI Reporter: Kostas Sakellis Assignee: Kostas Sakellis Fix For: 1.3.0 This metric will help us find key skew using the WebUI -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4874) Report number of records read/written in a task
[ https://issues.apache.org/jira/browse/SPARK-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4874. Resolution: Fixed Fix Version/s: 1.3.0 Target Version/s: 1.3.0 Report number of records read/written in a task --- Key: SPARK-4874 URL: https://issues.apache.org/jira/browse/SPARK-4874 Project: Spark Issue Type: Improvement Components: Spark Core, Web UI Reporter: Kostas Sakellis Assignee: Kostas Sakellis Fix For: 1.3.0 This metric will help us find key skew using the WebUI -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5659) Flaky Test: org.apache.spark.streaming.ReceiverSuite.block
[ https://issues.apache.org/jira/browse/SPARK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5659: --- Labels: flaky-test (was: ) Flaky Test: org.apache.spark.streaming.ReceiverSuite.block -- Key: SPARK-5659 URL: https://issues.apache.org/jira/browse/SPARK-5659 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.3.0 Reporter: Patrick Wendell Assignee: Tathagata Das Priority: Critical Labels: flaky-test {code} Error Message recordedBlocks.drop(1).dropRight(1).forall(((block: scala.collection.mutable.ArrayBuffer[Int]) => block.size.>=(minExpectedMessagesPerBlock).&&(block.size.<=(maxExpectedMessagesPerBlock)))) was false # records in received blocks = [11,10,10,10,10,10,10,10,10,10,10,4,16,10,10,10,10,10,10,10], not between 7 and 11 Stacktrace sbt.ForkMain$ForkError: recordedBlocks.drop(1).dropRight(1).forall(((block: scala.collection.mutable.ArrayBuffer[Int]) => block.size.>=(minExpectedMessagesPerBlock).&&(block.size.<=(maxExpectedMessagesPerBlock)))) was false # records in received blocks = [11,10,10,10,10,10,10,10,10,10,10,4,16,10,10,10,10,10,10,10], not between 7 and 11 at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500) at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466) at org.apache.spark.streaming.ReceiverSuite$$anonfun$3.apply$mcV$sp(ReceiverSuite.scala:200) at org.apache.spark.streaming.ReceiverSuite$$anonfun$3.apply(ReceiverSuite.scala:158) at org.apache.spark.streaming.ReceiverSuite$$anonfun$3.apply(ReceiverSuite.scala:158) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.streaming.ReceiverSuite.org$scalatest$BeforeAndAfter$$super$runTest(ReceiverSuite.scala:39) at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) at org.apache.spark.streaming.ReceiverSuite.runTest(ReceiverSuite.scala:39) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at 
org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.streaming.ReceiverSuite.org$scalatest$BeforeAndAfter$$super$run(ReceiverSuite.scala:39) at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) at org.apache.spark.streaming.ReceiverSuite.run(ReceiverSuite.scala:39) at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) at sbt.ForkMain$Run$2.call(ForkMain.java:294
[jira] [Resolved] (SPARK-5388) Provide a stable application submission gateway in standalone cluster mode
[ https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5388. Resolution: Fixed Fix Version/s: 1.3.0 Provide a stable application submission gateway in standalone cluster mode -- Key: SPARK-5388 URL: https://issues.apache.org/jira/browse/SPARK-5388 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker Fix For: 1.3.0 Attachments: stable-spark-submit-in-standalone-mode-2-4-15.pdf The existing submission gateway in standalone mode is not compatible across Spark versions. If you have a newer version of Spark submitting to an older version of the standalone Master, it is currently not guaranteed to work. The goal is to provide a stable REST interface to replace this channel. For more detail, please see the most recent design doc attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5662) Flaky test: org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.multi topic stream
[ https://issues.apache.org/jira/browse/SPARK-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5662: --- Priority: Critical (was: Major) Flaky test: org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.multi topic stream -- Key: SPARK-5662 URL: https://issues.apache.org/jira/browse/SPARK-5662 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.3.0 Reporter: Patrick Wendell Priority: Critical {code} sbt.ForkMain$ForkError: java.net.ConnectException: Connection refused at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$createDirectStream$2.apply(KafkaUtils.scala:319) at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$createDirectStream$2.apply(KafkaUtils.scala:319) at scala.util.Either.fold(Either.scala:97) at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:318) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite$$anonfun$3.apply$mcV$sp(KafkaDirectStreamSuite.scala:66) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite$$anonfun$3.apply(KafkaDirectStreamSuite.scala:59) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite$$anonfun$3.apply(KafkaDirectStreamSuite.scala:59) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.org$scalatest$BeforeAndAfter$$super$runTest(KafkaDirectStreamSuite.scala:32) at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.runTest(KafkaDirectStreamSuite.scala:32) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at 
org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.org$scalatest$BeforeAndAfter$$super$run(KafkaDirectStreamSuite.scala:32) at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.run(KafkaDirectStreamSuite.scala:32) at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) at sbt.ForkMain$Run$2.call(ForkMain.java:294) at sbt.ForkMain$Run$2.call(ForkMain.java:284) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745
[jira] [Created] (SPARK-5662) Flaky test: org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.multi topic stream
Patrick Wendell created SPARK-5662: -- Summary: Flaky test: org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.multi topic stream Key: SPARK-5662 URL: https://issues.apache.org/jira/browse/SPARK-5662 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.3.0 Reporter: Patrick Wendell {code} sbt.ForkMain$ForkError: java.net.ConnectException: Connection refused at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$createDirectStream$2.apply(KafkaUtils.scala:319) at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$createDirectStream$2.apply(KafkaUtils.scala:319) at scala.util.Either.fold(Either.scala:97) at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:318) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite$$anonfun$3.apply$mcV$sp(KafkaDirectStreamSuite.scala:66) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite$$anonfun$3.apply(KafkaDirectStreamSuite.scala:59) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite$$anonfun$3.apply(KafkaDirectStreamSuite.scala:59) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.org$scalatest$BeforeAndAfter$$super$runTest(KafkaDirectStreamSuite.scala:32) at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.runTest(KafkaDirectStreamSuite.scala:32) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.org$scalatest$BeforeAndAfter$$super$run(KafkaDirectStreamSuite.scala:32) at 
org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) at org.apache.spark.streaming.kafka.KafkaDirectStreamSuite.run(KafkaDirectStreamSuite.scala:32) at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) at sbt.ForkMain$Run$2.call(ForkMain.java:294) at sbt.ForkMain$Run$2.call(ForkMain.java:284) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code} https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-Master-SBT/1628/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.0,label=centos/testReport/junit/org.apache.spark.streaming.kafka/KafkaDirectStreamSuite/multi_topic_stream
[jira] [Commented] (SPARK-5388) Provide a stable application submission gateway in standalone cluster mode
[ https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308635#comment-14308635 ] Patrick Wendell commented on SPARK-5388: I think it's reasonable to use DELETE per [~tigerquoll]'s suggestion. It's not a perfect match with DELETE semantics, but I think it's fine to use it if it's not too much work. I also think calling it maxProtocolVersion is a good idea if those are indeed the semantics. For security, yeah the killing is the same as it is in the current mode, which is that there is no security. One thing we could do if there is user demand is add a flag that globally disables killing, but let's see if users request this first. Provide a stable application submission gateway in standalone cluster mode -- Key: SPARK-5388 URL: https://issues.apache.org/jira/browse/SPARK-5388 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker Attachments: stable-spark-submit-in-standalone-mode-2-4-15.pdf The existing submission gateway in standalone mode is not compatible across Spark versions. If you have a newer version of Spark submitting to an older version of the standalone Master, it is currently not guaranteed to work. The goal is to provide a stable REST interface to replace this channel. For more detail, please see the most recent design doc attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5557) spark-shell failed to start
[ https://issues.apache.org/jira/browse/SPARK-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308346#comment-14308346 ] Patrick Wendell commented on SPARK-5557: I can send a fix for this shortly. It also works fine if you build with Hadoop 2 support. spark-shell failed to start --- Key: SPARK-5557 URL: https://issues.apache.org/jira/browse/SPARK-5557 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.0 Reporter: Guoqiang Li Priority: Blocker the log: {noformat} 5/02/03 19:06:39 INFO spark.HttpServer: Starting HTTP Server Exception in thread main java.lang.NoClassDefFoundError: javax/servlet/http/HttpServletResponse at org.apache.spark.HttpServer.org$apache$spark$HttpServer$$doStart(HttpServer.scala:75) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1774) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1765) at org.apache.spark.HttpServer.start(HttpServer.scala:62) at org.apache.spark.repl.SparkIMain.init(SparkIMain.scala:130) at org.apache.spark.repl.SparkILoop$SparkILoopInterpreter.init(SparkILoop.scala:185) at org.apache.spark.repl.SparkILoop.createInterpreter(SparkILoop.scala:214) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:946) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1039) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:403) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:77) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.ClassNotFoundException: javax.servlet.http.HttpServletResponse at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ... 25 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
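Until the fix lands, one likely local workaround is to put the servlet API back on the classpath yourself; a sketch in sbt form, using the standard javax coordinates (these are not taken from Spark's build):
{code}
// build.sbt: restore the servlet 3.x API that went missing after the shading
libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1"
{code}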
Re: PSA: Maven supports parallel builds
I've done this in the past, but back when I wasn't using Zinc it didn't make a big difference. It's worth doing this in our Jenkins environment though. - Patrick On Thu, Feb 5, 2015 at 4:52 PM, Dirceu Semighini Filho dirceu.semigh...@gmail.com wrote: Thanks Nicholas, I didn't know this. 2015-02-05 22:16 GMT-02:00 Nicholas Chammas nicholas.cham...@gmail.com: Y'all may already know this, but I haven't seen it mentioned anywhere in our docs or on here, and it's a pretty easy win. Maven supports parallel builds https://cwiki.apache.org/confluence/display/MAVEN/Parallel+builds+in+Maven+3 with the -T command line option. For example: ./build/mvn -T 1C -Dhadoop.version=1.2.1 -DskipTests clean package This will have Maven use 1 thread per core on your machine to build Spark. On my little MacBook Air, this cuts the build time from 14 minutes to 10.5 minutes. A machine with more cores should see a bigger improvement. Note though that the docs mark this as experimental, so I wouldn't change our reference build to use this. But it should be useful, for example, in Jenkins or when working locally. Nick - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[jira] [Commented] (SPARK-5607) NullPointerException in objenesis
[ https://issues.apache.org/jira/browse/SPARK-5607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308423#comment-14308423 ] Patrick Wendell commented on SPARK-5607: This may have actually caused more of an issue than it solved :(. Lots of cascading failures in Spark SQL recently NullPointerException in objenesis - Key: SPARK-5607 URL: https://issues.apache.org/jira/browse/SPARK-5607 Project: Spark Issue Type: Bug Reporter: Reynold Xin Assignee: Patrick Wendell Fix For: 1.3.0 Tests are sometimes failing with the following exception. The problem might be that Kryo is using a different version of objenesis from Mockito. {code} [info] - Process succeeds instantly *** FAILED *** (107 milliseconds) [info] java.lang.NullPointerException: [info] at org.objenesis.strategy.StdInstantiatorStrategy.newInstantiatorOf(StdInstantiatorStrategy.java:52) [info] at org.objenesis.ObjenesisBase.getInstantiatorOf(ObjenesisBase.java:90) [info] at org.objenesis.ObjenesisBase.newInstance(ObjenesisBase.java:73) [info] at org.mockito.internal.creation.jmock.ClassImposterizer.createProxy(ClassImposterizer.java:111) [info] at org.mockito.internal.creation.jmock.ClassImposterizer.imposterise(ClassImposterizer.java:51) [info] at org.mockito.internal.util.MockUtil.createMock(MockUtil.java:52) [info] at org.mockito.internal.MockitoCore.mock(MockitoCore.java:41) [info] at org.mockito.Mockito.mock(Mockito.java:1014) [info] at org.mockito.Mockito.mock(Mockito.java:909) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply$mcV$sp(DriverRunnerTest.scala:50) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply(DriverRunnerTest.scala:47) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply(DriverRunnerTest.scala:47) [info] at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) [info] at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) [info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) [info] at org.scalatest.Transformer.apply(Transformer.scala:22) [info] at org.scalatest.Transformer.apply(Transformer.scala:20) [info] at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) [info] at org.scalatest.Suite$class.withFixture(Suite.scala:1122) [info] at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) [info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) [info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) [info] at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) [info] at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) [info] at org.scalatest.FunSuite.runTest(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) [info] at scala.collection.immutable.List.foreach(List.scala:318) [info] at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) [info] at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) [info] at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) [info] at 
org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) [info] at org.scalatest.Suite$class.run(Suite.scala:1424) [info] at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.SuperEngine.runImpl(Engine.scala:545) [info] at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuite.run(FunSuite.scala:1555) [info] at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) [info] at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:294) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:284) [info] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [info] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [info
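One way to test the version-conflict theory above is to pin objenesis to a single version on the test classpath; a sketch in sbt form (the version shown is illustrative, not taken from Spark's build):
{code}
// Force Kryo and Mockito to resolve the same objenesis at test time.
dependencyOverrides += "org.objenesis" % "objenesis" % "1.2"
{code}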
[jira] [Updated] (SPARK-5557) Servlet API classes now missing after jetty shading
[ https://issues.apache.org/jira/browse/SPARK-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5557: --- Summary: Servlet API classes now missing after jetty shading (was: spark-shell failed to start) Servlet API classes now missing after jetty shading --- Key: SPARK-5557 URL: https://issues.apache.org/jira/browse/SPARK-5557 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.0 Reporter: Guoqiang Li Priority: Blocker the log: {noformat} 5/02/03 19:06:39 INFO spark.HttpServer: Starting HTTP Server Exception in thread main java.lang.NoClassDefFoundError: javax/servlet/http/HttpServletResponse at org.apache.spark.HttpServer.org$apache$spark$HttpServer$$doStart(HttpServer.scala:75) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1774) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1765) at org.apache.spark.HttpServer.start(HttpServer.scala:62) at org.apache.spark.repl.SparkIMain.init(SparkIMain.scala:130) at org.apache.spark.repl.SparkILoop$SparkILoopInterpreter.init(SparkILoop.scala:185) at org.apache.spark.repl.SparkILoop.createInterpreter(SparkILoop.scala:214) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:946) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1039) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:403) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:77) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.ClassNotFoundException: javax.servlet.http.HttpServletResponse at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ... 25 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-5557) Servlet API classes now missing after jetty shading
[ https://issues.apache.org/jira/browse/SPARK-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308391#comment-14308391 ] Patrick Wendell edited comment on SPARK-5557 at 2/6/15 1:07 AM: I'm sorry that this affected so many people for so long. It is not acceptable to have the master build not working for this many hours. Unfortunately our tests do not catch this for some reason. was (Author: pwendell): I'm sorry that this affected so many people for so long. It is not acceptable to have the master build not working for this any hours. Unfortunately our tests do not catch this for some reason. Servlet API classes now missing after jetty shading --- Key: SPARK-5557 URL: https://issues.apache.org/jira/browse/SPARK-5557 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.0 Reporter: Guoqiang Li Priority: Blocker the log: {noformat} 5/02/03 19:06:39 INFO spark.HttpServer: Starting HTTP Server Exception in thread main java.lang.NoClassDefFoundError: javax/servlet/http/HttpServletResponse at org.apache.spark.HttpServer.org$apache$spark$HttpServer$$doStart(HttpServer.scala:75) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1774) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1765) at org.apache.spark.HttpServer.start(HttpServer.scala:62) at org.apache.spark.repl.SparkIMain.init(SparkIMain.scala:130) at org.apache.spark.repl.SparkILoop$SparkILoopInterpreter.init(SparkILoop.scala:185) at org.apache.spark.repl.SparkILoop.createInterpreter(SparkILoop.scala:214) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:946) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1039) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:403) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:77) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.ClassNotFoundException: javax.servlet.http.HttpServletResponse at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ... 
25 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5557) Servlet API classes now missing after jetty shading
[ https://issues.apache.org/jira/browse/SPARK-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308391#comment-14308391 ] Patrick Wendell commented on SPARK-5557: I'm sorry that this affected so many people for so long. It is not acceptable to have the master build not working for this many hours. Unfortunately our tests do not catch this for some reason. Servlet API classes now missing after jetty shading --- Key: SPARK-5557 URL: https://issues.apache.org/jira/browse/SPARK-5557 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.0 Reporter: Guoqiang Li Priority: Blocker the log: {noformat} 5/02/03 19:06:39 INFO spark.HttpServer: Starting HTTP Server Exception in thread main java.lang.NoClassDefFoundError: javax/servlet/http/HttpServletResponse at org.apache.spark.HttpServer.org$apache$spark$HttpServer$$doStart(HttpServer.scala:75) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1774) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1765) at org.apache.spark.HttpServer.start(HttpServer.scala:62) at org.apache.spark.repl.SparkIMain.init(SparkIMain.scala:130) at org.apache.spark.repl.SparkILoop$SparkILoopInterpreter.init(SparkILoop.scala:185) at org.apache.spark.repl.SparkILoop.createInterpreter(SparkILoop.scala:214) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:946) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1039) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:403) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:77) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.ClassNotFoundException: javax.servlet.http.HttpServletResponse at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ... 25 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-5557) Servlet API classes now missing after jetty shading
[ https://issues.apache.org/jira/browse/SPARK-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5557. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Patrick Wendell Servlet API classes now missing after jetty shading --- Key: SPARK-5557 URL: https://issues.apache.org/jira/browse/SPARK-5557 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.0 Reporter: Guoqiang Li Assignee: Patrick Wendell Priority: Blocker Fix For: 1.3.0 the log: {noformat} 5/02/03 19:06:39 INFO spark.HttpServer: Starting HTTP Server Exception in thread main java.lang.NoClassDefFoundError: javax/servlet/http/HttpServletResponse at org.apache.spark.HttpServer.org$apache$spark$HttpServer$$doStart(HttpServer.scala:75) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.HttpServer$$anonfun$1.apply(HttpServer.scala:62) at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1774) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1765) at org.apache.spark.HttpServer.start(HttpServer.scala:62) at org.apache.spark.repl.SparkIMain.init(SparkIMain.scala:130) at org.apache.spark.repl.SparkILoop$SparkILoopInterpreter.init(SparkILoop.scala:185) at org.apache.spark.repl.SparkILoop.createInterpreter(SparkILoop.scala:214) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:946) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:942) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:942) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1039) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:403) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:77) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.ClassNotFoundException: javax.servlet.http.HttpServletResponse at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ... 25 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5626) Spurious test failures due to NullPointerException in EasyMock test code
[ https://issues.apache.org/jira/browse/SPARK-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308482#comment-14308482 ] Patrick Wendell commented on SPARK-5626: [~joshrosen] I may have caused this by merging SPARK-5607 Spurious test failures due to NullPointerException in EasyMock test code Key: SPARK-5626 URL: https://issues.apache.org/jira/browse/SPARK-5626 Project: Spark Issue Type: Bug Affects Versions: 1.3.0 Reporter: Josh Rosen Labels: flaky-test Attachments: consoleText.txt I've seen a few cases where a test failure will trigger a cascade of spurious failures when instantiating test suites that use EasyMock. Here's a sample symptom: {code} [info] CacheManagerSuite: [info] Exception encountered when attempting to run a suite with class name: org.apache.spark.CacheManagerSuite *** ABORTED *** (137 milliseconds) [info] java.lang.NullPointerException: [info] at org.objenesis.strategy.StdInstantiatorStrategy.newInstantiatorOf(StdInstantiatorStrategy.java:52) [info] at org.objenesis.ObjenesisBase.getInstantiatorOf(ObjenesisBase.java:90) [info] at org.objenesis.ObjenesisBase.newInstance(ObjenesisBase.java:73) [info] at org.objenesis.ObjenesisHelper.newInstance(ObjenesisHelper.java:43) [info] at org.easymock.internal.ObjenesisClassInstantiator.newInstance(ObjenesisClassInstantiator.java:26) [info] at org.easymock.internal.ClassProxyFactory.createProxy(ClassProxyFactory.java:219) [info] at org.easymock.internal.MocksControl.createMock(MocksControl.java:59) [info] at org.easymock.EasyMock.createMock(EasyMock.java:103) [info] at org.scalatest.mock.EasyMockSugar$class.mock(EasyMockSugar.scala:267) [info] at org.apache.spark.CacheManagerSuite.mock(CacheManagerSuite.scala:28) [info] at org.apache.spark.CacheManagerSuite$$anonfun$1.apply$mcV$sp(CacheManagerSuite.scala:40) [info] at org.apache.spark.CacheManagerSuite$$anonfun$1.apply(CacheManagerSuite.scala:38) [info] at org.apache.spark.CacheManagerSuite$$anonfun$1.apply(CacheManagerSuite.scala:38) [info] at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:195) [info] at org.apache.spark.CacheManagerSuite.runTest(CacheManagerSuite.scala:28) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) [info] at scala.collection.immutable.List.foreach(List.scala:318) [info] at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) [info] at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) [info] at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) [info] at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) [info] at org.scalatest.Suite$class.run(Suite.scala:1424) [info] at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.SuperEngine.runImpl(Engine.scala:545) [info] at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) [info] at org.apache.spark.CacheManagerSuite.org$scalatest$BeforeAndAfter$$super$run(CacheManagerSuite.scala:28) [info] at 
org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) [info] at org.apache.spark.CacheManagerSuite.run(CacheManagerSuite.scala:28) [info] at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) [info] at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:294) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:284) [info] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [info] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [info] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [info] at java.lang.Thread.run(Thread.java:745) {code} This is from https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26852/consoleFull. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
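[Editor's note] For reference, the Mockito-based stubbing that suites like this were migrated toward looks roughly as follows. This is a minimal, self-contained sketch; {{KeyValueStore}} and the suite name are made up for illustration and are not Spark code:
{code}
import org.mockito.Mockito.{mock, when}
import org.scalatest.FunSuite

// Hypothetical service, used only to illustrate the API shape.
trait KeyValueStore {
  def get(key: String): Option[String]
}

class MockitoExampleSuite extends FunSuite {
  test("stubbing with Mockito instead of EasyMock") {
    // Mockito, like EasyMock, instantiates class proxies via objenesis,
    // so standardizing on one framework avoids mixed objenesis versions
    // of the kind implicated in the NullPointerException above.
    val store = mock(classOf[KeyValueStore])
    when(store.get("spark")).thenReturn(Some("1.3.0"))
    assert(store.get("spark") === Some("1.3.0"))
  }
}
{code}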
[jira] [Reopened] (SPARK-5607) NullPointerException in objenesis
[ https://issues.apache.org/jira/browse/SPARK-5607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-5607: NullPointerException in objenesis - Key: SPARK-5607 URL: https://issues.apache.org/jira/browse/SPARK-5607 Project: Spark Issue Type: Bug Reporter: Reynold Xin Assignee: Patrick Wendell Fix For: 1.3.0 Tests are sometimes failing with the following exception. The problem might be that Kryo is using a different version of objenesis from Mockito. {code} [info] - Process succeeds instantly *** FAILED *** (107 milliseconds) [info] java.lang.NullPointerException: [info] at org.objenesis.strategy.StdInstantiatorStrategy.newInstantiatorOf(StdInstantiatorStrategy.java:52) [info] at org.objenesis.ObjenesisBase.getInstantiatorOf(ObjenesisBase.java:90) [info] at org.objenesis.ObjenesisBase.newInstance(ObjenesisBase.java:73) [info] at org.mockito.internal.creation.jmock.ClassImposterizer.createProxy(ClassImposterizer.java:111) [info] at org.mockito.internal.creation.jmock.ClassImposterizer.imposterise(ClassImposterizer.java:51) [info] at org.mockito.internal.util.MockUtil.createMock(MockUtil.java:52) [info] at org.mockito.internal.MockitoCore.mock(MockitoCore.java:41) [info] at org.mockito.Mockito.mock(Mockito.java:1014) [info] at org.mockito.Mockito.mock(Mockito.java:909) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply$mcV$sp(DriverRunnerTest.scala:50) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply(DriverRunnerTest.scala:47) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply(DriverRunnerTest.scala:47) [info] at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) [info] at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) [info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) [info] at org.scalatest.Transformer.apply(Transformer.scala:22) [info] at org.scalatest.Transformer.apply(Transformer.scala:20) [info] at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) [info] at org.scalatest.Suite$class.withFixture(Suite.scala:1122) [info] at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) [info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) [info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) [info] at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) [info] at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) [info] at org.scalatest.FunSuite.runTest(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) [info] at scala.collection.immutable.List.foreach(List.scala:318) [info] at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) [info] at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) [info] at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) [info] at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) [info] at org.scalatest.Suite$class.run(Suite.scala:1424) [info] at 
org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.SuperEngine.runImpl(Engine.scala:545) [info] at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuite.run(FunSuite.scala:1555) [info] at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) [info] at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:294) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:284) [info] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [info] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [info] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [info] at java.lang.Thread.run(Thread.java:745) {code} More information: Kryo depends on objenesis 1.2
[jira] [Commented] (SPARK-5607) NullPointerException in objenesis
[ https://issues.apache.org/jira/browse/SPARK-5607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308493#comment-14308493 ] Patrick Wendell commented on SPARK-5607: I've reverted my patch since it may have caused more harm than good. NullPointerException in objenesis - Key: SPARK-5607 URL: https://issues.apache.org/jira/browse/SPARK-5607 Project: Spark Issue Type: Bug Reporter: Reynold Xin Assignee: Patrick Wendell Fix For: 1.3.0 Tests are sometimes failing with the following exception. The problem might be that Kryo is using a different version of objenesis from Mockito. {code} [info] - Process succeeds instantly *** FAILED *** (107 milliseconds) [info] java.lang.NullPointerException: [info] at org.objenesis.strategy.StdInstantiatorStrategy.newInstantiatorOf(StdInstantiatorStrategy.java:52) [info] at org.objenesis.ObjenesisBase.getInstantiatorOf(ObjenesisBase.java:90) [info] at org.objenesis.ObjenesisBase.newInstance(ObjenesisBase.java:73) [info] at org.mockito.internal.creation.jmock.ClassImposterizer.createProxy(ClassImposterizer.java:111) [info] at org.mockito.internal.creation.jmock.ClassImposterizer.imposterise(ClassImposterizer.java:51) [info] at org.mockito.internal.util.MockUtil.createMock(MockUtil.java:52) [info] at org.mockito.internal.MockitoCore.mock(MockitoCore.java:41) [info] at org.mockito.Mockito.mock(Mockito.java:1014) [info] at org.mockito.Mockito.mock(Mockito.java:909) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply$mcV$sp(DriverRunnerTest.scala:50) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply(DriverRunnerTest.scala:47) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply(DriverRunnerTest.scala:47) [info] at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) [info] at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) [info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) [info] at org.scalatest.Transformer.apply(Transformer.scala:22) [info] at org.scalatest.Transformer.apply(Transformer.scala:20) [info] at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) [info] at org.scalatest.Suite$class.withFixture(Suite.scala:1122) [info] at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) [info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) [info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) [info] at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) [info] at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) [info] at org.scalatest.FunSuite.runTest(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) [info] at scala.collection.immutable.List.foreach(List.scala:318) [info] at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) [info] at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) [info] at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) [info] at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) [info] at 
org.scalatest.FunSuite.runTests(FunSuite.scala:1555) [info] at org.scalatest.Suite$class.run(Suite.scala:1424) [info] at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.SuperEngine.runImpl(Engine.scala:545) [info] at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuite.run(FunSuite.scala:1555) [info] at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) [info] at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:294) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:284) [info] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [info] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [info] at java.util.concurrent.ThreadPoolExecutor$Worker.run
[jira] [Resolved] (SPARK-5474) curl should support URL redirection in build/mvn
[ https://issues.apache.org/jira/browse/SPARK-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5474. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Guoqiang Li curl should support URL redirection in build/mvn Key: SPARK-5474 URL: https://issues.apache.org/jira/browse/SPARK-5474 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 1.3.0 Reporter: Guoqiang Li Assignee: Guoqiang Li Fix For: 1.3.0 {{http://archive.apache.org/dist/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz}} sometimes return 3xx -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5594) SparkException: Failed to get broadcast (TorrentBroadcast)
[ https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5594: --- Priority: Critical (was: Major) SparkException: Failed to get broadcast (TorrentBroadcast) -- Key: SPARK-5594 URL: https://issues.apache.org/jira/browse/SPARK-5594 Project: Spark Issue Type: Bug Affects Versions: 1.2.0 Reporter: John Sandiford Priority: Critical I am uncertain whether this is a bug, however I am getting the error below when running on a cluster (works locally), and have no idea what is causing it, or where to look for more information. Any help is appreciated. Others appear to experience the same issue, but I have not found any solutions online. Please note that this only happens with certain code and is repeatable, all my other spark jobs work fine. ERROR TaskSetManager: Task 3 in stage 6.0 failed 4 times; aborting job Exception in thread main org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 6.0 failed 4 times, most recent failure: Lost task 3.3 in stage 6.0 (TID 24, removed): java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_6_piece0 of broadcast_6 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1011) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: org.apache.spark.SparkException: Failed to get broadcast_6_piece0 of broadcast_6 at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:136) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119) at scala.collection.immutable.List.foreach(List.scala:318) at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:119) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:174) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1008) ... 
11 more Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696) at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420
[jira] [Commented] (SPARK-5594) SparkException: Failed to get broadcast (TorrentBroadcast)
[ https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305657#comment-14305657 ] Patrick Wendell commented on SPARK-5594: I've seen this occasionally in unit tests also. I think we need better exception logging in this code path to explain exactly why it is failing. SparkException: Failed to get broadcast (TorrentBroadcast) -- Key: SPARK-5594 URL: https://issues.apache.org/jira/browse/SPARK-5594 Project: Spark Issue Type: Bug Affects Versions: 1.2.0 Reporter: John Sandiford I am uncertain whether this is a bug, however I am getting the error below when running on a cluster (works locally), and have no idea what is causing it, or where to look for more information. Any help is appreciated. Others appear to experience the same issue, but I have not found any solutions online. Please note that this only happens with certain code and is repeatable, all my other spark jobs work fine. ERROR TaskSetManager: Task 3 in stage 6.0 failed 4 times; aborting job Exception in thread main org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 6.0 failed 4 times, most recent failure: Lost task 3.3 in stage 6.0 (TID 24, removed): java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_6_piece0 of broadcast_6 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1011) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: org.apache.spark.SparkException: Failed to get broadcast_6_piece0 of broadcast_6 at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:136) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119) at scala.collection.immutable.List.foreach(List.scala:318) at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:119) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:174) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1008) ... 
11 more Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696
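[Editor's note] The "better exception logging" asked for above would amount to wrapping the low-level fetch failure with the identifiers needed to diagnose it. A rough sketch of the idea; the object and method names are invented, and this is not the actual TorrentBroadcast code:
{code}
import org.apache.spark.SparkException

object BroadcastFetchSketch {
  // Stand-in for the real piece fetch, which can fail for many reasons.
  private def fetchPiece(broadcastId: Long, pieceId: Int): Array[Byte] =
    throw new java.io.IOException("piece not found on any block manager")

  def readPieceWithContext(broadcastId: Long, pieceId: Int): Array[Byte] =
    try {
      fetchPiece(broadcastId, pieceId)
    } catch {
      case e: Exception =>
        // Surface which block failed and the likely causes, instead of
        // the bare "Failed to get broadcast_X" seen in the report above.
        throw new SparkException(
          s"Failed to get broadcast_${broadcastId}_piece$pieceId; the " +
          "executor holding it may have been lost, or the broadcast may " +
          "have been destroyed on the driver", e)
    }
}
{code}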
Re: 1.2.1-rc3 - Avro input format for Hadoop 2 broken/fix?
Hi Markus, That most likely won't be included in 1.2.1, because the release votes have already started, and at that point we don't hold the release except for major regression issues from 1.2.0. However, if this goes through we can backport it into the 1.2 branch and it will end up in a future maintenance release, or you can just build Spark from that branch as soon as it's in there. - Patrick On Wed, Feb 4, 2015 at 7:30 AM, M. Dale medal...@yahoo.com.invalid wrote: SPARK-3039 Spark assembly for new hadoop API (hadoop 2) contains avro-mapred for hadoop 1 API was reopened and prevents v.1.2.1-rc3 from using the Avro input format for Hadoop 2 API/instances (it includes the hadoop1 avro-mapred library files). What are the chances of getting the fix outlined here (https://github.com/medale/spark/compare/apache:v1.2.1-rc3...avro-hadoop2-v1.2.1-rc2) included in 1.2.1? My apologies, I do not know how to generate a pull request against a tag version. I did add pull request https://github.com/apache/spark/pull/4315 for the current 1.3.0-SNAPSHOT master on this issue. Even though the 1.3.0 build already does not include avro-mapred in the Spark assembly jar, this minor change improves dependency convergence. Thanks, Markus - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: multi-line comment style
Personally I have no opinion, but agree it would be nice to standardize. - Patrick On Wed, Feb 4, 2015 at 1:58 PM, Sean Owen so...@cloudera.com wrote: One thing Marcelo pointed out to me is that the // style does not interfere with commenting out blocks of code with /* */, which is a small good thing. I am also accustomed to // style for multiline, and reserve /** */ for javadoc / scaladoc. Meaning, seeing the /* */ style inline always looks a little funny to me. On Wed, Feb 4, 2015 at 3:53 PM, Kay Ousterhout kayousterh...@gmail.com wrote: Hi all, The Spark Style Guide https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide says multi-line comments should formatted as: /* * This is a * very * long comment. */ But in my experience, we almost always use // for multi-line comments: // This is a // very // long comment. Here are some examples: - Recent commit by Reynold, king of style: https://github.com/apache/spark/commit/bebf4c42bef3e75d31ffce9bfdb331c16f34ddb1#diff-d616b5496d1a9f648864f4ab0db5a026R58 - RDD.scala: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L361 - DAGScheduler.scala: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L281 Any objections to me updating the style guide to reflect this? As with other style issues, I think consistency here is helpful (and formatting multi-line comments as // does nicely visually distinguish code comments from doc comments). -Kay - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
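[Editor's note] For readers outside the thread, the two styles being compared look like this (an illustrative snippet, not taken from the Spark codebase):
{code}
object CommentStyles {
  // This is a
  // very
  // long comment, in the // style the thread says dominates in practice.
  def parse(line: String): Array[String] = line.split(",")

  /*
   * The same comment in the block style the wiki currently prescribes.
   * As noted above, commenting out a region of code that already
   * contains a block comment is where this style can get awkward.
   */
  def size(fields: Array[String]): Int = fields.length
}
{code}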
[jira] [Updated] (SPARK-5586) Automatically provide sqlContext in Spark shell
[ https://issues.apache.org/jira/browse/SPARK-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5586: --- Priority: Blocker (was: Critical) Automatically provide sqlContext in Spark shell --- Key: SPARK-5586 URL: https://issues.apache.org/jira/browse/SPARK-5586 Project: Spark Issue Type: Improvement Components: Spark Shell, SQL Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Blocker A simple patch, but we should create a sqlContext (and, if supported by the build, a Hive context) in the Spark shell when it's created, and import the DSL. We can just call it sqlContext. This would save us so much time writing code examples :P -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5586) Automatically provide sqlContext in Spark shell
[ https://issues.apache.org/jira/browse/SPARK-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5586: --- Assignee: (was: Patrick Wendell) Automatically provide sqlContext in Spark shell --- Key: SPARK-5586 URL: https://issues.apache.org/jira/browse/SPARK-5586 Project: Spark Issue Type: Improvement Components: Spark Shell, SQL Reporter: Patrick Wendell Priority: Blocker A simple patch, but we should create a sqlContext (and, if supported by the build, a Hive context) in the Spark shell when it's created, and import the DSL. We can just call it sqlContext. This would save us so much time writing code examples :P -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
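[Editor's note] What this change would save users from typing at the start of every shell session is essentially the following (a sketch against the 1.2/1.3-era API; the exact imports may differ in the final patch):
{code}
// Today, by hand, with `sc` already provided by the shell:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext._  // bring the SQL DSL / implicit conversions into scope
{code}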
[jira] [Resolved] (SPARK-5411) Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext
[ https://issues.apache.org/jira/browse/SPARK-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5411. Resolution: Fixed Fix Version/s: 1.3.0 Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext --- Key: SPARK-5411 URL: https://issues.apache.org/jira/browse/SPARK-5411 Project: Spark Issue Type: New Feature Components: Spark Core Reporter: Josh Rosen Assignee: Josh Rosen Fix For: 1.3.0 It would be nice if there was a mechanism to allow SparkListeners to be registered through SparkConf settings. This would allow monitoring frameworks to be easily injected into Spark programs without having to modify those programs' code. I propose to introduce a new configuration option, {{spark.extraListeners}}, that allows SparkListeners to be specified in SparkConf and registered before the SparkContext is created. Here is the proposed documentation for the new option: {quote} A comma-separated list of classes that implement SparkListener; when initializing SparkContext, instances of these classes will be created and registered with Spark's listener bus. If a class has a single-argument constructor that accepts a SparkConf, that constructor will be called; otherwise, a zero-argument constructor will be called. If no valid constructor can be found, the SparkContext creation will fail with an exception. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
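[Editor's note] A sketch of a listener the new option could load, following the constructor rules quoted in the description above; the class name and what it counts are made up for illustration:
{code}
import org.apache.spark.SparkConf
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

class TaskCountListener(conf: SparkConf) extends SparkListener {
  // Preferred by the loader: the single-SparkConf-argument constructor.
  private var tasksEnded = 0L

  // Fallback used when no SparkConf-argument constructor is present.
  def this() = this(new SparkConf)

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    tasksEnded += 1
  }

  def count: Long = tasksEnded
}
{code}
It would then be registered with something like {{spark.extraListeners=com.example.TaskCountListener}} in the application's SparkConf or via {{--conf}} on spark-submit.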
[jira] [Resolved] (SPARK-5607) NullPointerException in objenesis
[ https://issues.apache.org/jira/browse/SPARK-5607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5607. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Patrick Wendell Target Version/s: 1.3.0 I've merged a patch attempting to fix this. Let's re-open this if we see it again NullPointerException in objenesis - Key: SPARK-5607 URL: https://issues.apache.org/jira/browse/SPARK-5607 Project: Spark Issue Type: Bug Reporter: Reynold Xin Assignee: Patrick Wendell Fix For: 1.3.0 Tests are sometimes failing with the following exception. The problem might be that Kryo is using a different version of objenesis from Mockito. {code} [info] - Process succeeds instantly *** FAILED *** (107 milliseconds) [info] java.lang.NullPointerException: [info] at org.objenesis.strategy.StdInstantiatorStrategy.newInstantiatorOf(StdInstantiatorStrategy.java:52) [info] at org.objenesis.ObjenesisBase.getInstantiatorOf(ObjenesisBase.java:90) [info] at org.objenesis.ObjenesisBase.newInstance(ObjenesisBase.java:73) [info] at org.mockito.internal.creation.jmock.ClassImposterizer.createProxy(ClassImposterizer.java:111) [info] at org.mockito.internal.creation.jmock.ClassImposterizer.imposterise(ClassImposterizer.java:51) [info] at org.mockito.internal.util.MockUtil.createMock(MockUtil.java:52) [info] at org.mockito.internal.MockitoCore.mock(MockitoCore.java:41) [info] at org.mockito.Mockito.mock(Mockito.java:1014) [info] at org.mockito.Mockito.mock(Mockito.java:909) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply$mcV$sp(DriverRunnerTest.scala:50) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply(DriverRunnerTest.scala:47) [info] at org.apache.spark.deploy.worker.DriverRunnerTest$$anonfun$1.apply(DriverRunnerTest.scala:47) [info] at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) [info] at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) [info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) [info] at org.scalatest.Transformer.apply(Transformer.scala:22) [info] at org.scalatest.Transformer.apply(Transformer.scala:20) [info] at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) [info] at org.scalatest.Suite$class.withFixture(Suite.scala:1122) [info] at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) [info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) [info] at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) [info] at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) [info] at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) [info] at org.scalatest.FunSuite.runTest(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) [info] at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) [info] at scala.collection.immutable.List.foreach(List.scala:318) [info] at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) [info] at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) [info] at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) [info] at 
org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) [info] at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) [info] at org.scalatest.Suite$class.run(Suite.scala:1424) [info] at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) [info] at org.scalatest.SuperEngine.runImpl(Engine.scala:545) [info] at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) [info] at org.scalatest.FunSuite.run(FunSuite.scala:1555) [info] at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) [info] at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:294) [info] at sbt.ForkMain$Run$2.call(ForkMain.java:284) [info] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [info] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java
[jira] [Updated] (SPARK-5585) Flaky test: Python regression
[ https://issues.apache.org/jira/browse/SPARK-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5585: --- Labels: flaky-test (was: ) Flaky test: Python regression - Key: SPARK-5585 URL: https://issues.apache.org/jira/browse/SPARK-5585 Project: Spark Issue Type: Bug Components: MLlib Affects Versions: 1.3.0 Reporter: Patrick Wendell Assignee: Davies Liu Priority: Critical Labels: flaky-test Hey [~davies] any chance you can take a look at this? The master build is having random python failures fairly often. Not quite sure what is going on: {code} 0inputs+128outputs (0major+13320minor)pagefaults 0swaps Run mllib tests ... Running test: pyspark/mllib/classification.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.43user 0.12system 0:14.85elapsed 3%CPU (0avgtext+0avgdata 94272maxresident)k 0inputs+280outputs (0major+12627minor)pagefaults 0swaps Running test: pyspark/mllib/clustering.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.35user 0.11system 0:12.63elapsed 3%CPU (0avgtext+0avgdata 93568maxresident)k 0inputs+88outputs (0major+12532minor)pagefaults 0swaps Running test: pyspark/mllib/feature.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.28user 0.08system 0:05.73elapsed 6%CPU (0avgtext+0avgdata 93424maxresident)k 0inputs+32outputs (0major+12548minor)pagefaults 0swaps Running test: pyspark/mllib/linalg.py 0.16user 0.05system 0:00.22elapsed 98%CPU (0avgtext+0avgdata 89888maxresident)k 0inputs+0outputs (0major+8099minor)pagefaults 0swaps Running test: pyspark/mllib/rand.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.25user 0.08system 0:05.42elapsed 6%CPU (0avgtext+0avgdata 87872maxresident)k 0inputs+0outputs (0major+11849minor)pagefaults 0swaps Running test: pyspark/mllib/recommendation.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.32user 0.09system 0:11.42elapsed 3%CPU (0avgtext+0avgdata 94256maxresident)k 0inputs+32outputs (0major+11797minor)pagefaults 0swaps Running test: pyspark/mllib/regression.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.53user 0.17system 0:23.53elapsed 3%CPU (0avgtext+0avgdata 99600maxresident)k 0inputs+48outputs (0major+12402minor)pagefaults 0swaps Running test: pyspark/mllib/stat/_statistics.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.29user 0.09system 0:08.03elapsed 4%CPU (0avgtext+0avgdata 92656maxresident)k 0inputs+48outputs (0major+12508minor)pagefaults 0swaps Running test: pyspark/mllib/tree.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.57user 0.16system 0:25.30elapsed 2%CPU (0avgtext+0avgdata 94400maxresident)k 0inputs+144outputs (0major+12600minor)pagefaults 0swaps Running test: pyspark/mllib/util.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.20user 0.06system 0:08.08elapsed 3%CPU (0avgtext+0avgdata 92768maxresident)k 0inputs+56outputs 
(0major+12474minor)pagefaults 0swaps Running test: pyspark/mllib/tests.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath .F/usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) ./usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see
[jira] [Created] (SPARK-5585) Flaky test: Python regression
Patrick Wendell created SPARK-5585: -- Summary: Flaky test: Python regression Key: SPARK-5585 URL: https://issues.apache.org/jira/browse/SPARK-5585 Project: Spark Issue Type: Bug Components: MLlib Reporter: Patrick Wendell Assignee: Davies Liu Hey [~davies] any chance you can take a look at this? The master build is having random python failures fairly often. Not quite sure what is going on: {code} 0inputs+128outputs (0major+13320minor)pagefaults 0swaps Run mllib tests ... Running test: pyspark/mllib/classification.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.43user 0.12system 0:14.85elapsed 3%CPU (0avgtext+0avgdata 94272maxresident)k 0inputs+280outputs (0major+12627minor)pagefaults 0swaps Running test: pyspark/mllib/clustering.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.35user 0.11system 0:12.63elapsed 3%CPU (0avgtext+0avgdata 93568maxresident)k 0inputs+88outputs (0major+12532minor)pagefaults 0swaps Running test: pyspark/mllib/feature.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.28user 0.08system 0:05.73elapsed 6%CPU (0avgtext+0avgdata 93424maxresident)k 0inputs+32outputs (0major+12548minor)pagefaults 0swaps Running test: pyspark/mllib/linalg.py 0.16user 0.05system 0:00.22elapsed 98%CPU (0avgtext+0avgdata 89888maxresident)k 0inputs+0outputs (0major+8099minor)pagefaults 0swaps Running test: pyspark/mllib/rand.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.25user 0.08system 0:05.42elapsed 6%CPU (0avgtext+0avgdata 87872maxresident)k 0inputs+0outputs (0major+11849minor)pagefaults 0swaps Running test: pyspark/mllib/recommendation.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.32user 0.09system 0:11.42elapsed 3%CPU (0avgtext+0avgdata 94256maxresident)k 0inputs+32outputs (0major+11797minor)pagefaults 0swaps Running test: pyspark/mllib/regression.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.53user 0.17system 0:23.53elapsed 3%CPU (0avgtext+0avgdata 99600maxresident)k 0inputs+48outputs (0major+12402minor)pagefaults 0swaps Running test: pyspark/mllib/stat/_statistics.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.29user 0.09system 0:08.03elapsed 4%CPU (0avgtext+0avgdata 92656maxresident)k 0inputs+48outputs (0major+12508minor)pagefaults 0swaps Running test: pyspark/mllib/tree.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.57user 0.16system 0:25.30elapsed 2%CPU (0avgtext+0avgdata 94400maxresident)k 0inputs+144outputs (0major+12600minor)pagefaults 0swaps Running test: pyspark/mllib/util.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.20user 0.06system 0:08.08elapsed 3%CPU (0avgtext+0avgdata 92768maxresident)k 0inputs+56outputs (0major+12474minor)pagefaults 0swaps Running test: pyspark/mllib/tests.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 
.F/usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) ./usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) ./usr/lib64/python2.6/site-packages/numpy/lib
[jira] [Updated] (SPARK-5585) Flaky test: Python regression
[ https://issues.apache.org/jira/browse/SPARK-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5585: --- Affects Version/s: 1.3.0 Flaky test: Python regression - Key: SPARK-5585 URL: https://issues.apache.org/jira/browse/SPARK-5585 Project: Spark Issue Type: Bug Components: MLlib Affects Versions: 1.3.0 Reporter: Patrick Wendell Assignee: Davies Liu Priority: Critical Labels: flaky-test Hey [~davies] any chance you can take a look at this? The master build is having random python failures fairly often. Not quite sure what is going on: {code} 0inputs+128outputs (0major+13320minor)pagefaults 0swaps Run mllib tests ... Running test: pyspark/mllib/classification.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.43user 0.12system 0:14.85elapsed 3%CPU (0avgtext+0avgdata 94272maxresident)k 0inputs+280outputs (0major+12627minor)pagefaults 0swaps Running test: pyspark/mllib/clustering.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.35user 0.11system 0:12.63elapsed 3%CPU (0avgtext+0avgdata 93568maxresident)k 0inputs+88outputs (0major+12532minor)pagefaults 0swaps Running test: pyspark/mllib/feature.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.28user 0.08system 0:05.73elapsed 6%CPU (0avgtext+0avgdata 93424maxresident)k 0inputs+32outputs (0major+12548minor)pagefaults 0swaps Running test: pyspark/mllib/linalg.py 0.16user 0.05system 0:00.22elapsed 98%CPU (0avgtext+0avgdata 89888maxresident)k 0inputs+0outputs (0major+8099minor)pagefaults 0swaps Running test: pyspark/mllib/rand.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.25user 0.08system 0:05.42elapsed 6%CPU (0avgtext+0avgdata 87872maxresident)k 0inputs+0outputs (0major+11849minor)pagefaults 0swaps Running test: pyspark/mllib/recommendation.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.32user 0.09system 0:11.42elapsed 3%CPU (0avgtext+0avgdata 94256maxresident)k 0inputs+32outputs (0major+11797minor)pagefaults 0swaps Running test: pyspark/mllib/regression.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.53user 0.17system 0:23.53elapsed 3%CPU (0avgtext+0avgdata 99600maxresident)k 0inputs+48outputs (0major+12402minor)pagefaults 0swaps Running test: pyspark/mllib/stat/_statistics.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.29user 0.09system 0:08.03elapsed 4%CPU (0avgtext+0avgdata 92656maxresident)k 0inputs+48outputs (0major+12508minor)pagefaults 0swaps Running test: pyspark/mllib/tree.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.57user 0.16system 0:25.30elapsed 2%CPU (0avgtext+0avgdata 94400maxresident)k 0inputs+144outputs (0major+12600minor)pagefaults 0swaps Running test: pyspark/mllib/util.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.20user 0.06system 0:08.08elapsed 3%CPU (0avgtext+0avgdata 92768maxresident)k 0inputs+56outputs (0major+12474minor)pagefaults 
0swaps Running test: pyspark/mllib/tests.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath .F/usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) ./usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see
[jira] [Updated] (SPARK-5585) Flaky test: Python regression
[ https://issues.apache.org/jira/browse/SPARK-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5585: --- Priority: Critical (was: Major) Flaky test: Python regression - Key: SPARK-5585 URL: https://issues.apache.org/jira/browse/SPARK-5585 Project: Spark Issue Type: Bug Components: MLlib Affects Versions: 1.3.0 Reporter: Patrick Wendell Assignee: Davies Liu Priority: Critical Labels: flaky-test Hey [~davies] any chance you can take a look at this? The master build is having random python failures fairly often. Not quite sure what is going on: {code} 0inputs+128outputs (0major+13320minor)pagefaults 0swaps Run mllib tests ... Running test: pyspark/mllib/classification.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.43user 0.12system 0:14.85elapsed 3%CPU (0avgtext+0avgdata 94272maxresident)k 0inputs+280outputs (0major+12627minor)pagefaults 0swaps Running test: pyspark/mllib/clustering.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.35user 0.11system 0:12.63elapsed 3%CPU (0avgtext+0avgdata 93568maxresident)k 0inputs+88outputs (0major+12532minor)pagefaults 0swaps Running test: pyspark/mllib/feature.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.28user 0.08system 0:05.73elapsed 6%CPU (0avgtext+0avgdata 93424maxresident)k 0inputs+32outputs (0major+12548minor)pagefaults 0swaps Running test: pyspark/mllib/linalg.py 0.16user 0.05system 0:00.22elapsed 98%CPU (0avgtext+0avgdata 89888maxresident)k 0inputs+0outputs (0major+8099minor)pagefaults 0swaps Running test: pyspark/mllib/rand.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.25user 0.08system 0:05.42elapsed 6%CPU (0avgtext+0avgdata 87872maxresident)k 0inputs+0outputs (0major+11849minor)pagefaults 0swaps Running test: pyspark/mllib/recommendation.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.32user 0.09system 0:11.42elapsed 3%CPU (0avgtext+0avgdata 94256maxresident)k 0inputs+32outputs (0major+11797minor)pagefaults 0swaps Running test: pyspark/mllib/regression.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.53user 0.17system 0:23.53elapsed 3%CPU (0avgtext+0avgdata 99600maxresident)k 0inputs+48outputs (0major+12402minor)pagefaults 0swaps Running test: pyspark/mllib/stat/_statistics.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.29user 0.09system 0:08.03elapsed 4%CPU (0avgtext+0avgdata 92656maxresident)k 0inputs+48outputs (0major+12508minor)pagefaults 0swaps Running test: pyspark/mllib/tree.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.57user 0.16system 0:25.30elapsed 2%CPU (0avgtext+0avgdata 94400maxresident)k 0inputs+144outputs (0major+12600minor)pagefaults 0swaps Running test: pyspark/mllib/util.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath 0.20user 0.06system 0:08.08elapsed 3%CPU (0avgtext+0avgdata 92768maxresident)k 0inputs+56outputs 
(0major+12474minor)pagefaults 0swaps Running test: pyspark/mllib/tests.py tput: No value for $TERM and no -T specified Spark assembly has been built with Hive, including Datanucleus jars on classpath .F/usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) ./usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`. VisibleDeprecationWarning) /usr/lib64/python2.6/site-packages/numpy/core/fromnumeric.py:2499: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see
[jira] [Updated] (SPARK-5341) Support maven coordinates in spark-shell and spark-submit
[ https://issues.apache.org/jira/browse/SPARK-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5341: --- Assignee: Burak Yavuz Support maven coordinates in spark-shell and spark-submit - Key: SPARK-5341 URL: https://issues.apache.org/jira/browse/SPARK-5341 Project: Spark Issue Type: New Feature Components: Deploy, Spark Shell Reporter: Burak Yavuz Assignee: Burak Yavuz Priority: Critical Fix For: 1.3.0 This feature will allow users to provide the maven coordinates of jars they wish to use in their spark application. Coordinates can be a comma-delimited list and be supplied like: ```spark-submit --maven org.apache.example.a,org.apache.example.b``` This feature will also be added to spark-shell (where it is more critical to have this feature) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-5341) Support maven coordinates in spark-shell and spark-submit
[ https://issues.apache.org/jira/browse/SPARK-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5341. Resolution: Fixed Fix Version/s: 1.3.0 Support maven coordinates in spark-shell and spark-submit - Key: SPARK-5341 URL: https://issues.apache.org/jira/browse/SPARK-5341 Project: Spark Issue Type: New Feature Components: Deploy, Spark Shell Reporter: Burak Yavuz Assignee: Burak Yavuz Priority: Critical Fix For: 1.3.0 This feature will allow users to provide the maven coordinates of jars they wish to use in their spark application. Coordinates can be a comma-delimited list and be supplied like: ```spark-submit --maven org.apache.example.a,org.apache.example.b``` This feature will also be added to spark-shell (where it is more critical to have this feature) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5586) Automatically provide sqlContext in Spark shell
[ https://issues.apache.org/jira/browse/SPARK-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5586: --- Priority: Critical (was: Major) Automatically provide sqlContext in Spark shell --- Key: SPARK-5586 URL: https://issues.apache.org/jira/browse/SPARK-5586 Project: Spark Issue Type: Improvement Components: Spark Shell, SQL Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Critical A simple patch, but we should create a sqlContext (and, if supported by the build, a Hive context) in the Spark shell when it's created, and import the DSL. We can just call it sqlContext. This would save us so much time writing code examples :P -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
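For illustration, a minimal sketch of what the proposed shell initialization could look like, assuming sc is the shell's already-created SparkContext (the actual patch wires this into the REPL startup, and the exact imports are a design choice):
{code}
import org.apache.spark.sql.SQLContext

// Sketch only: create the context once at shell startup.
val sqlContext = new SQLContext(sc)
// If the build has Hive support, a HiveContext could be substituted:
// val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
import sqlContext._ // pull the SQL DSL and implicit conversions into scope
{code}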
[jira] [Created] (SPARK-5586) Automatically provide sqlContext in Spark shell
Patrick Wendell created SPARK-5586: -- Summary: Automatically provide sqlContext in Spark shell Key: SPARK-5586 URL: https://issues.apache.org/jira/browse/SPARK-5586 Project: Spark Issue Type: Improvement Components: Spark Shell, SQL Reporter: Patrick Wendell Assignee: Patrick Wendell Fix For: 1.3.0 A simple patch, but we should create a sqlContext (and, if supported by the build, a Hive context) in the Spark shell when it's created, and import the DSL. We can just call it sqlContext. This would save us so much time writing code examples :P -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5586) Automatically provide sqlContext in Spark shell
[ https://issues.apache.org/jira/browse/SPARK-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5586: --- Fix Version/s: (was: 1.3.0) Automatically provide sqlContext in Spark shell --- Key: SPARK-5586 URL: https://issues.apache.org/jira/browse/SPARK-5586 Project: Spark Issue Type: Improvement Components: Spark Shell, SQL Reporter: Patrick Wendell Assignee: Patrick Wendell Priority: Critical A simple patch, but we should create a sqlContext (and, if supported by the build, a Hive context) in the Spark shell when it's created, and import the DSL. We can just call it sqlContext. This would save us so much time writing code examples :P -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5140) Two RDDs which are scheduled concurrently should be able to wait on parent in all cases
[ https://issues.apache.org/jira/browse/SPARK-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5140: --- Fix Version/s: (was: 1.2.1) (was: 1.3.0) Two RDDs which are scheduled concurrently should be able to wait on parent in all cases --- Key: SPARK-5140 URL: https://issues.apache.org/jira/browse/SPARK-5140 Project: Spark Issue Type: New Feature Reporter: Corey J. Nolet Labels: features Not sure if this would change too much of the internals to be included in 1.2.1, but it would be very helpful if it could be. This ticket is from a discussion between myself and [~ilikerps]. Here's the result of some testing that [~ilikerps] did: bq. I did some testing as well, and it turns out the "wait for the other guy to finish caching" logic is on a per-task basis, and it only works on tasks that happen to be executing on the same machine. bq. Once a partition is cached, we will schedule tasks that touch that partition on that executor. The problem here, though, is that the cache is in progress, and so the tasks are still scheduled randomly (or with whatever locality the data source has), so tasks which end up on different machines will not see that the cache is already in progress. Here was my test, by the way:
{code}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._

val rdd = sc.parallelize(0 until 8).map(i => { Thread.sleep(10000); i }).cache()
val futures = (0 until 4).map { _ => Future { rdd.count } }
Await.result(Future.sequence(futures), 120.second)
{code}
bq. Note that I run the future 4 times in parallel. I found that the first run has all tasks take 10 seconds. The second has about 50% of its tasks take 10 seconds, and the rest just wait for the first stage to finish. The last two runs have no tasks that take 10 seconds; all wait for the first two stages to finish. What we want is the ability to fire off a job and have the DAG figure out that two RDDs depend on the same parent so that when the children are scheduled concurrently, the first one to start will activate the parent and both will wait on the parent. When the parent is done, they will both be able to finish their work concurrently. We are trying to use this pattern by having the parent cache results. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-5569) Checkpoints cannot reference classes defined outside of Spark's assembly
Patrick Wendell created SPARK-5569: -- Summary: Checkpoints cannot reference classes defined outside of Spark's assembly Key: SPARK-5569 URL: https://issues.apache.org/jira/browse/SPARK-5569 Project: Spark Issue Type: Bug Components: Streaming Reporter: Patrick Wendell Not sure if this is a bug or a feature, but it's not obvious, so wanted to create a JIRA to make sure we document this behavior. First documented by Cody Koeninger: https://gist.github.com/koeninger/561a61482cd1b5b3600c {code} 15/01/12 16:07:07 INFO CheckpointReader: Attempting to load checkpoint from file file:/var/tmp/cp/checkpoint-142110041.bk 15/01/12 16:07:07 WARN CheckpointReader: Error reading checkpoint from file file:/var/tmp/cp/checkpoint-142110041.bk java.io.IOException: java.lang.ClassNotFoundException: org.apache.spark.rdd.kafka.KafkaRDDPartition at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1043) at org.apache.spark.streaming.dstream.DStreamCheckpointData.readObject(DStreamCheckpointData.scala:146) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500) at org.apache.spark.streaming.DStreamGraph$$anonfun$readObject$1.apply$mcV$sp(DStreamGraph.scala:180) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1040) at org.apache.spark.streaming.DStreamGraph.readObject(DStreamGraph.scala:176) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) at org.apache.spark.streaming.CheckpointReader$$anonfun$read$2.apply(Checkpoint.scala:251) at org.apache.spark.streaming.CheckpointReader$$anonfun$read$2.apply(Checkpoint.scala:239) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34) at org.apache.spark.streaming.CheckpointReader$.read(Checkpoint.scala:239) at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:552) at example.CheckpointedExample$.main(CheckpointedExample.scala:34) at example.CheckpointedExample.main(CheckpointedExample.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57
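For context, a minimal sketch of the recovery path that hits this error; checkpointDir and createContext are placeholders, and the key point is that getOrCreate deserializes the checkpointed DStream graph with the driver's classloader:
{code}
import org.apache.spark.streaming.StreamingContext

val checkpointDir = "file:/var/tmp/cp" // matches the directory in the log above
def createContext(): StreamingContext = ??? // placeholder: builds a fresh context on first run

// Every class referenced by the checkpointed graph (e.g. the custom
// KafkaRDDPartition in the stack trace) must be on the assembly/system
// classpath for this deserialization to succeed.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
{code}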
[ANNOUNCE] branch-1.3 has been cut
Hey All, Just wanted to announce that we've cut the 1.3 branch which will become the 1.3 release after community testing. There are still some features that will go in (in higher level libraries, and some stragglers in spark core), but overall this indicates the end of major feature development for Spark 1.3 and a transition into testing. Within a few days I'll cut a snapshot package release for this so that people can begin testing. https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog;h=refs/heads/branch-1.3 - Patrick - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[jira] [Commented] (SPARK-4550) In sort-based shuffle, store map outputs in serialized form
[ https://issues.apache.org/jira/browse/SPARK-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302999#comment-14302999 ] Patrick Wendell commented on SPARK-4550: The doc alludes to having to (at some point) deal with comparing serialized objects. In the future one approach would be to restrict this only to SchemaRDDs, where we can have more control over the serialized format. This is effectively what Flink and other systems do (they basically only have SchemaRDDs). In sort-based shuffle, store map outputs in serialized form --- Key: SPARK-4550 URL: https://issues.apache.org/jira/browse/SPARK-4550 Project: Spark Issue Type: Improvement Components: Shuffle, Spark Core Affects Versions: 1.2.0 Reporter: Sandy Ryza Priority: Critical Attachments: SPARK-4550-design-v1.pdf One drawback with sort-based shuffle compared to hash-based shuffle is that it ends up storing many more java objects in memory. If Spark could store map outputs in serialized form, it could * spill less often because the serialized form is more compact * reduce GC pressure This will only work when the serialized representations of objects are independent from each other and occupy contiguous segments of memory. E.g. when Kryo reference tracking is left on, objects may contain pointers to objects farther back in the stream, which means that the sort can't relocate objects without corrupting them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
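As a concrete illustration of the reference-tracking constraint mentioned in the description, this is how an application disables it today; both keys are existing Spark settings, though whether the serialized-shuffle work would require this is exactly the open question above:
{code}
import org.apache.spark.SparkConf

// With reference tracking off, Kryo never emits back-references to earlier
// objects in the stream, so each record's bytes are self-contained and a
// sort can relocate them without corruption.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.referenceTracking", "false")
{code}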
[jira] [Commented] (SPARK-5388) Provide a stable application submission gateway in standalone cluster mode
[ https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304010#comment-14304010 ] Patrick Wendell commented on SPARK-5388: The intention for this is really just to take the single RPC that was using Akka and add a stable version of it that we are okay supporting long term. It doesn't preclude moving to Avro or some other RPC as a general thing we use across all of Spark. However, that design choice was intentionally excluded from this decision given all the complexities you bring up. As for doing some basic message dispatching on our own: there is only a small amount of very straightforward code involved. Adopting Avro would be overkill for this. In the current implementation the client and server exchange Spark versions, so this is the basis of reasoning about version changes - maybe it wasn't in the design doc. In terms of evolvability, the way you do this is that you only add new functionality over time, and you never remove fields from messages. This is similar to the API contract of the history logs with the history server. So the idea is that newer clients would implement a superset of messages and fields as older ones. Adding v1 seems like a good idea in case this evolves into something public or more well specified over time. It would just be good to define precisely what it means to advance that version identifier. That all matters a lot more if we want it to be something others interact with. Provide a stable application submission gateway in standalone cluster mode -- Key: SPARK-5388 URL: https://issues.apache.org/jira/browse/SPARK-5388 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker Attachments: Stable Spark Standalone Submission.pdf The existing submission gateway in standalone mode is not compatible across Spark versions. If you have a newer version of Spark submitting to an older version of the standalone Master, it is currently not guaranteed to work. The goal is to provide a stable REST interface to replace this channel. The first cut implementation will target standalone cluster mode because there are very few messages exchanged. The design, however, should be general enough to potentially support this for other cluster managers too. Note that this is not necessarily required in YARN because we already use YARN's stable interface to submit applications there. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
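A minimal sketch of the additive evolution rule described in the comment; these are not the actual protocol classes, and the field names are invented:
{code}
// Fields are only ever added, never removed, and every post-v1 field is
// optional so a newer server still accepts an older client's message.
case class CreateSubmissionRequest(
  clientSparkVersion: String,             // exchanged so each side can reason about versions
  appResource: String,                    // present since v1
  mainClass: String,                      // present since v1
  addedLaterField: Option[String] = None  // hypothetical later addition; absent in v1 payloads
)
{code}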
Re: Spark Master Maven with YARN build is broken
It's my fault, I'm sending a hot fix now. On Mon, Feb 2, 2015 at 1:44 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/ Is this is a known issue? It seems to have been broken since last night. Here's a snippet from the build output of one of the builds https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/1308/console : [error] bad symbolic reference. A signature in WebUI.class refers to term eclipse [error] in package org which is not available. [error] It may be completely missing from the current classpath, or the version on [error] the classpath might be incompatible with the version used when compiling WebUI.class. [error] bad symbolic reference. A signature in WebUI.class refers to term jetty [error] in value org.eclipse which is not available. [error] It may be completely missing from the current classpath, or the version on [error] the classpath might be incompatible with the version used when compiling WebUI.class. [error] [error] while compiling: /home/jenkins/workspace/Spark-Master-Maven-with-YARN/HADOOP_PROFILE/hadoop-2.4/label/centos/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala [error] during phase: erasure [error] library version: version 2.10.4 [error] compiler version: version 2.10.4 Nick - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[jira] [Commented] (SPARK-3778) newAPIHadoopRDD doesn't properly pass credentials for secure hdfs on yarn
[ https://issues.apache.org/jira/browse/SPARK-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302290#comment-14302290 ] Patrick Wendell commented on SPARK-3778: /cc [~hshreedharan] newAPIHadoopRDD doesn't properly pass credentials for secure hdfs on yarn - Key: SPARK-3778 URL: https://issues.apache.org/jira/browse/SPARK-3778 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Thomas Graves Priority: Blocker The newAPIHadoopRDD routine doesn't properly add the credentials to the conf to be able to access secure hdfs. Note that newAPIHadoopFile does handle these because the org.apache.hadoop.mapreduce.Job automatically adds it for you. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3778) newAPIHadoopRDD doesn't properly pass credentials for secure hdfs on yarn
[ https://issues.apache.org/jira/browse/SPARK-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3778: --- Priority: Blocker (was: Critical) newAPIHadoopRDD doesn't properly pass credentials for secure hdfs on yarn - Key: SPARK-3778 URL: https://issues.apache.org/jira/browse/SPARK-3778 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Thomas Graves Assignee: Thomas Graves Priority: Blocker The newAPIHadoopRDD routine doesn't properly add the credentials to the conf to be able to access secure hdfs. Note that newAPIHadoopFile does handle these because the org.apache.hadoop.mapreduce.Job automatically adds it for you. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4550) In sort-based shuffle, store map outputs in serialized form
[ https://issues.apache.org/jira/browse/SPARK-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4550: --- Target Version/s: 1.4.0 In sort-based shuffle, store map outputs in serialized form --- Key: SPARK-4550 URL: https://issues.apache.org/jira/browse/SPARK-4550 Project: Spark Issue Type: Improvement Components: Shuffle, Spark Core Affects Versions: 1.2.0 Reporter: Sandy Ryza Attachments: SPARK-4550-design-v1.pdf One drawback with sort-based shuffle compared to hash-based shuffle is that it ends up storing many more java objects in memory. If Spark could store map outputs in serialized form, it could * spill less often because the serialized form is more compact * reduce GC pressure This will only work when the serialized representations of objects are independent from each other and occupy contiguous segments of memory. E.g. when Kryo reference tracking is left on, objects may contain pointers to objects farther back in the stream, which means that the sort can't relocate objects without corrupting them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4550) In sort-based shuffle, store map outputs in serialized form
[ https://issues.apache.org/jira/browse/SPARK-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4550: --- Priority: Critical (was: Major) In sort-based shuffle, store map outputs in serialized form --- Key: SPARK-4550 URL: https://issues.apache.org/jira/browse/SPARK-4550 Project: Spark Issue Type: Improvement Components: Shuffle, Spark Core Affects Versions: 1.2.0 Reporter: Sandy Ryza Priority: Critical Attachments: SPARK-4550-design-v1.pdf One drawback with sort-based shuffle compared to hash-based shuffle is that it ends up storing many more java objects in memory. If Spark could store map outputs in serialized form, it could * spill less often because the serialized form is more compact * reduce GC pressure This will only work when the serialized representations of objects are independent from each other and occupy contiguous segments of memory. E.g. when Kryo reference tracking is left on, objects may contain pointers to objects farther back in the stream, which means that the sort can't relocate objects without corrupting them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5195) When a Hive table is queried with an alias, the cached data loses effectiveness
[ https://issues.apache.org/jira/browse/SPARK-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5195: --- Fix Version/s: (was: 1.2.1) When a Hive table is queried with an alias, the cached data loses effectiveness Key: SPARK-5195 URL: https://issues.apache.org/jira/browse/SPARK-5195 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.2.0 Reporter: yixiaohua Fix For: 1.3.0 Override MetastoreRelation's sameResult method to compare only the database name and table name. Previously, after running "cache table t1", the query "select count(*) from t1" reads data from memory, but "select count(*) from t1 t" does not; it reads from HDFS instead. Cached data is keyed by the logical plan and matched with sameResult, so when the table is queried with an alias its logical plan is not the same as the plan without the alias. The fix is therefore to make sameResult compare only the database name and table name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
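The symptom is easy to reproduce from a shell session; a sketch, where sqlContext stands for whichever SQLContext/HiveContext the application uses:
{code}
sqlContext.sql("CACHE TABLE t1")
sqlContext.sql("SELECT COUNT(*) FROM t1")   // answered from the in-memory cache
sqlContext.sql("SELECT COUNT(*) FROM t1 t") // the alias yields a different logical
                                            // plan, sameResult misses, and the
                                            // query scans HDFS instead
{code}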
[jira] [Updated] (SPARK-4508) Native Date type for SQL92 Date
[ https://issues.apache.org/jira/browse/SPARK-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4508: --- Fix Version/s: (was: 1.3.0) Native Date type for SQL92 Date --- Key: SPARK-4508 URL: https://issues.apache.org/jira/browse/SPARK-4508 Project: Spark Issue Type: Sub-task Components: SQL Reporter: Adrian Wang Assignee: Adrian Wang Store daysSinceEpoch as an Int(4 bytes), instead of using java.sql.Date(8 bytes as Long) in catalyst row. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
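A sketch of the representation change this sub-task describes, ignoring the timezone handling a real implementation must get right; the function names are illustrative:
{code}
import java.sql.Date
import java.util.concurrent.TimeUnit

// A 4-byte Int of days since the Unix epoch replaces the 8-byte millisecond
// Long inside java.sql.Date.
def toDaysSinceEpoch(d: Date): Int =
  TimeUnit.MILLISECONDS.toDays(d.getTime).toInt

def fromDaysSinceEpoch(days: Int): Date =
  new Date(TimeUnit.DAYS.toMillis(days.toLong))
{code}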
[jira] [Created] (SPARK-5541) Allow running Maven or SBT in the Spark build
Patrick Wendell created SPARK-5541: -- Summary: Allow running Maven or SBT in the Spark build Key: SPARK-5541 URL: https://issues.apache.org/jira/browse/SPARK-5541 Project: Spark Issue Type: Bug Components: Build Reporter: Patrick Wendell Assignee: Nicholas Chammas It would be nice if we had a hook for the spark test scripts to run with Maven in addition to running with SBT. Right now it is difficult for us to test pull requests in maven and we get master build breaks because of it. A simple first step is to modify run-tests to allow building with maven. Then we can add a second PRB that invokes this maven build. I would just add an env var called SPARK_BUILD_TOOL that can be set to sbt or mvn. And make sure the associated logic works in either case. If we don't want to have the fancy SQL only stuff in Maven, that's fine too. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
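Expressed in Scala purely for illustration (run-tests itself is a shell script, so the real change would be bash), the proposed SPARK_BUILD_TOOL dispatch might look like:
{code}
// Hypothetical sketch; the command names are illustrative, not the actual
// contents of run-tests.
val buildTool = sys.env.getOrElse("SPARK_BUILD_TOOL", "sbt")
val buildCommand = buildTool match {
  case "mvn" => Seq("mvn", "-DskipTests", "package")
  case _     => Seq("sbt/sbt", "package")
}
{code}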
[jira] [Updated] (SPARK-5541) Allow running Maven or SBT in run-tests
[ https://issues.apache.org/jira/browse/SPARK-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5541: --- Summary: Allow running Maven or SBT in run-tests (was: Allow running Maven or SBT in the Spark build) Allow running Maven or SBT in run-tests --- Key: SPARK-5541 URL: https://issues.apache.org/jira/browse/SPARK-5541 Project: Spark Issue Type: Bug Components: Build Reporter: Patrick Wendell Assignee: Nicholas Chammas It would be nice if we had a hook for the spark test scripts to run with Maven in addition to running with SBT. Right now it is difficult for us to test pull requests in maven and we get master build breaks because of it. A simple first step is to modify run-tests to allow building with maven. Then we can add a second PRB that invokes this maven build. I would just add an env var called SPARK_BUILD_TOOL that can be set to sbt or mvn. And make sure the associated logic works in either case. If we don't want to have the fancy SQL only stuff in Maven, that's fine too. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-4508) Native Date type for SQL92 Date
[ https://issues.apache.org/jira/browse/SPARK-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-4508: This has caused several date-related test failures in the master and pull request builds, so I'm reverting it: https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26560/testReport/org.apache.spark.sql/ScalaReflectionRelationSuite/query_case_class_RDD/ Native Date type for SQL92 Date --- Key: SPARK-4508 URL: https://issues.apache.org/jira/browse/SPARK-4508 Project: Spark Issue Type: Sub-task Components: SQL Reporter: Adrian Wang Assignee: Adrian Wang Fix For: 1.3.0 Store daysSinceEpoch as an Int(4 bytes), instead of using java.sql.Date(8 bytes as Long) in catalyst row. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4550) In sort-based shuffle, store map outputs in serialized form
[ https://issues.apache.org/jira/browse/SPARK-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302326#comment-14302326 ] Patrick Wendell commented on SPARK-4550: Yeah, this is a good idea. I don't see why we don't serialize these immediately. In sort-based shuffle, store map outputs in serialized form --- Key: SPARK-4550 URL: https://issues.apache.org/jira/browse/SPARK-4550 Project: Spark Issue Type: Improvement Components: Shuffle, Spark Core Affects Versions: 1.2.0 Reporter: Sandy Ryza Priority: Critical Attachments: SPARK-4550-design-v1.pdf One drawback with sort-based shuffle compared to hash-based shuffle is that it ends up storing many more java objects in memory. If Spark could store map outputs in serialized form, it could * spill less often because the serialized form is more compact * reduce GC pressure This will only work when the serialized representations of objects are independent from each other and occupy contiguous segments of memory. E.g. when Kryo reference tracking is left on, objects may contain pointers to objects farther back in the stream, which means that the sort can't relocate objects without corrupting them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-5542) Decouple publishing, packaging, and tagging in release script
Patrick Wendell created SPARK-5542: -- Summary: Decouple publishing, packaging, and tagging in release script Key: SPARK-5542 URL: https://issues.apache.org/jira/browse/SPARK-5542 Project: Spark Issue Type: Bug Components: Build Reporter: Patrick Wendell Assignee: Patrick Wendell Our release script should make it easy to do these separately. I.e. it should be possible to publish a release from a tag that we already cut. This would help with things such as publishing nightly releases (SPARK-1517). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Temporary jenkins issue
Hey All, I made a change to the Jenkins configuration that caused most builds to fail (attempting to enable a new plugin), I've reverted the change effective about 10 minutes ago. If you've seen recent build failures like below, this was caused by that change. Sorry about that. ERROR: Publisher com.google.jenkins.flakyTestHandler.plugin.JUnitFlakyResultArchiver aborted due to exception java.lang.NoSuchMethodError: hudson.model.AbstractBuild.getTestResultAction()Lhudson/tasks/test/AbstractTestResultAction; at com.google.jenkins.flakyTestHandler.plugin.FlakyTestResultAction.init(FlakyTestResultAction.java:78) at com.google.jenkins.flakyTestHandler.plugin.JUnitFlakyResultArchiver.perform(JUnitFlakyResultArchiver.java:89) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:770) at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:734) at hudson.model.Build$BuildExecution.post2(Build.java:183) at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:683) at hudson.model.Run.execute(Run.java:1784) at hudson.matrix.MatrixRun.run(MatrixRun.java:146) at hudson.model.ResourceController.execute(ResourceController.java:89) at hudson.model.Executor.run(Executor.java:240) - Patrick - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
[jira] [Resolved] (SPARK-5542) Decouple publishing, packaging, and tagging in release script
[ https://issues.apache.org/jira/browse/SPARK-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5542. Resolution: Fixed Fix Version/s: 1.3.0 Decouple publishing, packaging, and tagging in release script - Key: SPARK-5542 URL: https://issues.apache.org/jira/browse/SPARK-5542 Project: Spark Issue Type: Bug Components: Build Reporter: Patrick Wendell Assignee: Patrick Wendell Fix For: 1.3.0 Our release script should make it easy to do these separately. I.e. it should be possible to publish a release from a tag that we already cut. This would help with things such as publishing nightly releases (SPARK-1517). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-5548) Flaky test: org.apache.spark.util.AkkaUtilsSuite.remote fetch ssl on - untrusted server
Patrick Wendell created SPARK-5548: -- Summary: Flaky test: org.apache.spark.util.AkkaUtilsSuite.remote fetch ssl on - untrusted server Key: SPARK-5548 URL: https://issues.apache.org/jira/browse/SPARK-5548 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.3.0 Reporter: Patrick Wendell Assignee: Jacek Lewandowski {code} sbt.ForkMain$ForkError: Expected exception java.util.concurrent.TimeoutException to be thrown, but akka.actor.ActorNotFound was thrown. at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:496) at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) at org.scalatest.Assertions$class.intercept(Assertions.scala:1004) at org.scalatest.FunSuite.intercept(FunSuite.scala:1555) at org.apache.spark.util.AkkaUtilsSuite$$anonfun$8.apply$mcV$sp(AkkaUtilsSuite.scala:373) at org.apache.spark.util.AkkaUtilsSuite$$anonfun$8.apply(AkkaUtilsSuite.scala:349) at org.apache.spark.util.AkkaUtilsSuite$$anonfun$8.apply(AkkaUtilsSuite.scala:349) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.apache.spark.util.AkkaUtilsSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(AkkaUtilsSuite.scala:37) at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:255) at org.apache.spark.util.AkkaUtilsSuite.runTest(AkkaUtilsSuite.scala:37) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.apache.spark.util.AkkaUtilsSuite.org$scalatest$BeforeAndAfterAll$$super$run(AkkaUtilsSuite.scala:37) at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257) at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256) at 
org.apache.spark.util.AkkaUtilsSuite.run(AkkaUtilsSuite.scala:37) at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462) at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671) at sbt.ForkMain$Run$2.call(ForkMain.java:294) at sbt.ForkMain$Run$2.call(ForkMain.java:284) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: sbt.ForkMain$ForkError: Actor not found for: ActorSelection[Anchor(akka.ssl.tcp://spark@localhost:41417/), Path(/user/MapOutputTracker)] at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65
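For readers unfamiliar with the assertion involved, a minimal standalone example of the ScalaTest pattern this test relies on; the test body is a placeholder:
{code}
import java.util.concurrent.TimeoutException
import org.scalatest.FunSuite

class InterceptExampleSuite extends FunSuite {
  test("remote fetch should time out") {
    // intercept passes only if exactly this exception type (or a subtype) is
    // thrown; any other exception (such as akka.actor.ActorNotFound above)
    // fails the test, which is how the flakiness surfaces.
    intercept[TimeoutException] {
      throw new TimeoutException("placeholder for the untrusted-server fetch")
    }
  }
}
{code}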
[RESULT] [VOTE] Release Apache Spark 1.2.1 (RC2)
This is cancelled in favor of RC3. On Mon, Feb 2, 2015 at 8:50 PM, Patrick Wendell pwend...@gmail.com wrote: The windows issue reported only affects actually running Spark on Windows (not job submission). However, I agree it's worth cutting a new RC. I'm going to cancel this vote and propose RC3 with a single additional patch. Let's try to vote that through so we can ship Spark 1.2.1. - Patrick On Sat, Jan 31, 2015 at 7:36 PM, Matei Zaharia matei.zaha...@gmail.com wrote: This looks like a pretty serious problem, thanks! Glad people are testing on Windows. Matei On Jan 31, 2015, at 11:57 AM, MartinWeindel martin.wein...@gmail.com wrote: FYI: Spark 1.2.1rc2 does not work on Windows! On creating a Spark context you get the following log output on my Windows machine: INFO org.apache.spark.SparkEnv:59 - Registering BlockManagerMaster ERROR org.apache.spark.util.Utils:75 - Failed to create local root dir in C:\Users\mweindel\AppData\Local\Temp\. Ignoring this directory. ERROR org.apache.spark.storage.DiskBlockManager:75 - Failed to create any local dir. I have already located the cause. A newly added function chmod700() in org.apache.spark.util.Utils uses functionality which only works on a Unix file system. See also pull request [https://github.com/apache/spark/pull/4299] for my suggestion on how to resolve the issue. Best regards, Martin Weindel -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-2-1-RC2-tp10317p10370.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org