[jira] [Commented] (SPARK-1392) Local spark-shell Runs Out of Memory With Default Settings
[ https://issues.apache.org/jira/browse/SPARK-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039946#comment-14039946 ] Patrick Wendell commented on SPARK-1392: I mentioned this on the pull request, but I think this was an instance of SPARK-1777. I'm running some tests locally on the pull request there to determine whether that was the case. Local spark-shell Runs Out of Memory With Default Settings -- Key: SPARK-1392 URL: https://issues.apache.org/jira/browse/SPARK-1392 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0 Environment: OS X 10.9.2, Java 1.7.0_51, Scala 2.10.3 Reporter: Pat McDonough Using the spark-0.9.0 Hadoop2 binary from the project download page, running the spark-shell locally in the out-of-the-box configuration, and attempting to cache all the attached data, Spark OOMs with: java.lang.OutOfMemoryError: GC overhead limit exceeded. You can work around the issue by either decreasing spark.storage.memoryFraction or increasing SPARK_MEM. -- This message was sent by Atlassian JIRA (v6.2#6252)
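For context, the workaround amounts to something like the following in the Spark 0.9-era configuration model. This is a minimal sketch, assuming the system-property mechanism of that release; the fraction value is illustrative, not a tuned recommendation.
{code}
// Hedged workaround sketch for Spark 0.9: shrink the cache fraction before
// the SparkContext is created (the default was 0.6; 0.5 is illustrative).
System.setProperty("spark.storage.memoryFraction", "0.5")
// Alternatively, export a larger SPARK_MEM before launching spark-shell.
val sc = new org.apache.spark.SparkContext("local", "oom-repro")
{code}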
[jira] [Commented] (SPARK-1392) Local spark-shell Runs Out of Memory With Default Settings
[ https://issues.apache.org/jira/browse/SPARK-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039953#comment-14039953 ] Patrick Wendell commented on SPARK-1392: Okay great, I confirmed this is fixed by SPARK-1777. I tested as follows: {code} SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true SPARK_HIVE=true sbt/sbt clean assembly/assembly sc.textFile("/tmp/wiki_links").cache.count {code} The wiki_links file was downloaded and extracted from here: This worked with the proposed patch but failed with the default build. Local spark-shell Runs Out of Memory With Default Settings -- Key: SPARK-1392 URL: https://issues.apache.org/jira/browse/SPARK-1392 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0 Environment: OS X 10.9.2, Java 1.7.0_51, Scala 2.10.3 Reporter: Pat McDonough Using the spark-0.9.0 Hadoop2 binary from the project download page, running the spark-shell locally in the out-of-the-box configuration, and attempting to cache all the attached data, Spark OOMs with: java.lang.OutOfMemoryError: GC overhead limit exceeded. You can work around the issue by either decreasing spark.storage.memoryFraction or increasing SPARK_MEM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (SPARK-1392) Local spark-shell Runs Out of Memory With Default Settings
[ https://issues.apache.org/jira/browse/SPARK-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039953#comment-14039953 ] Patrick Wendell edited comment on SPARK-1392 at 6/21/14 9:15 PM: - Okay great, I confirmed this is fixed by SPARK-1777. I tested as follows: {code} SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true SPARK_HIVE=true sbt/sbt clean assembly/assembly sc.textFile("/tmp/wiki_links").cache.count {code} The wiki_links file was downloaded and extracted from here: https://drive.google.com/file/d/0BwrkCxCycBCyTmlWYXp0MmdEakk/edit?usp=sharing This worked with the proposed patch but failed with the default build. was (Author: pwendell): Okay great, I confirmed this is fixed by SPARK-1777. I tested as follows: {code} SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true SPARK_HIVE=true sbt/sbt clean assembly/assembly sc.textFile("/tmp/wiki_links").cache.count {code} The wiki_links file was downloaded and extracted from here: This worked with the proposed patch but failed with the default build. Local spark-shell Runs Out of Memory With Default Settings -- Key: SPARK-1392 URL: https://issues.apache.org/jira/browse/SPARK-1392 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0 Environment: OS X 10.9.2, Java 1.7.0_51, Scala 2.10.3 Reporter: Pat McDonough Using the spark-0.9.0 Hadoop2 binary from the project download page, running the spark-shell locally in the out-of-the-box configuration, and attempting to cache all the attached data, Spark OOMs with: java.lang.OutOfMemoryError: GC overhead limit exceeded. You can work around the issue by either decreasing spark.storage.memoryFraction or increasing SPARK_MEM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-1392) Local spark-shell Runs Out of Memory With Default Settings
[ https://issues.apache.org/jira/browse/SPARK-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1392. Resolution: Duplicate Local spark-shell Runs Out of Memory With Default Settings -- Key: SPARK-1392 URL: https://issues.apache.org/jira/browse/SPARK-1392 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0 Environment: OS X 10.9.2, Java 1.7.0_51, Scala 2.10.3 Reporter: Pat McDonough Using the spark-0.9.0 Hadoop2 binary from the project download page, running the spark-shell locally in the out-of-the-box configuration, and attempting to cache all the attached data, Spark OOMs with: java.lang.OutOfMemoryError: GC overhead limit exceeded. You can work around the issue by either decreasing spark.storage.memoryFraction or increasing SPARK_MEM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-1996) Remove use of special Maven repo for Akka
[ https://issues.apache.org/jira/browse/SPARK-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1996. Resolution: Fixed Fix Version/s: (was: 1.0.1) 1.1.0 Fixed via: https://github.com/apache/spark/pull/1170/files Remove use of special Maven repo for Akka - Key: SPARK-1996 URL: https://issues.apache.org/jira/browse/SPARK-1996 Project: Spark Issue Type: Improvement Components: Documentation, Spark Core Reporter: Matei Zaharia Assignee: Sean Owen Fix For: 1.1.0 According to http://doc.akka.io/docs/akka/2.3.3/intro/getting-started.html Akka is now published to Maven Central, so our documentation and POM files don't need to use the old Akka repo. It will be one less step for users to worry about. -- This message was sent by Atlassian JIRA (v6.2#6252)
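As a hedged illustration of what the change means for users (the dependency coordinates come from the linked Akka docs; the commented-out resolver line is the one being removed, not part of the patch itself):
{code}
// Illustrative sbt fragment: Akka 2.3.x is published to Maven Central, so the
// dedicated Akka resolver can simply be dropped from build definitions.
// resolvers += "Akka Repository" at "http://repo.akka.io/releases/"  // no longer needed
libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.3.3"
{code}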
[jira] [Created] (SPARK-2230) Improvements to Jenkins QA Harness
Patrick Wendell created SPARK-2230: -- Summary: Improvements to Jenkins QA Harness Key: SPARK-2230 URL: https://issues.apache.org/jira/browse/SPARK-2230 Project: Spark Issue Type: Umbrella Components: Project Infra Reporter: Patrick Wendell An umbrella for some improvements I'd like to do. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2231) dev/run-tests should include YARN and use a recent Hadoop version
Patrick Wendell created SPARK-2231: -- Summary: dev/run-tests should include YARN and use a recent Hadoop version Key: SPARK-2231 URL: https://issues.apache.org/jira/browse/SPARK-2231 Project: Spark Issue Type: Sub-task Components: Project Infra Reporter: Patrick Wendell Assignee: Patrick Wendell -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2232) Fix Jenkins tests in Maven
Patrick Wendell created SPARK-2232: -- Summary: Fix Jenkins tests in Maven Key: SPARK-2232 URL: https://issues.apache.org/jira/browse/SPARK-2232 Project: Spark Issue Type: Sub-task Reporter: Patrick Wendell It appears Maven tests are failing under the newer Hadoop configurations. We need to go through and make sure all the Spark master build configurations are passing. https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Master%20Matrix/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-1804) Mark 0.9.1 as released in JIRA
[ https://issues.apache.org/jira/browse/SPARK-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1804. Resolution: Fixed Mark 0.9.1 as released in JIRA -- Key: SPARK-1804 URL: https://issues.apache.org/jira/browse/SPARK-1804 Project: Spark Issue Type: Task Components: Documentation, Project Infra Affects Versions: 0.9.1 Reporter: Stevo Slavic Priority: Trivial 0.9.1 has been released but is labeled as unreleased in the SPARK JIRA project. Please have it marked as released. Also, please document that step in the release process. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-1803) Rename test resources to be compatible with Windows FS
[ https://issues.apache.org/jira/browse/SPARK-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1803. Resolution: Fixed Resolved via: https://github.com/apache/spark/pull/739 Rename test resources to be compatible with Windows FS -- Key: SPARK-1803 URL: https://issues.apache.org/jira/browse/SPARK-1803 Project: Spark Issue Type: Task Components: Windows Affects Versions: 0.9.1 Reporter: Stevo Slavic Priority: Trivial {{git clone}} of the master branch and then {{git status}} on Windows reports untracked files: {noformat} # Untracked files: # (use "git add <file>..." to include in what will be committed) # # sql/hive/src/test/resources/golden/Column pruning # sql/hive/src/test/resources/golden/Partition pruning # sql/hive/src/test/resources/golden/Partiton pruning {noformat} The actual issue is that several files under the {{sql/hive/src/test/resources/golden}} directory have a colon in their names, which is an invalid character in file names on Windows. Please have these files renamed to Windows-compatible file names. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-721) Fix remaining deprecation warnings
[ https://issues.apache.org/jira/browse/SPARK-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-721. --- Resolution: Fixed Assignee: (was: Gary Struthers) Fix remaining deprecation warnings -- Key: SPARK-721 URL: https://issues.apache.org/jira/browse/SPARK-721 Project: Spark Issue Type: Improvement Affects Versions: 0.7.1 Reporter: Josh Rosen Priority: Minor Labels: Starter The recent patch to re-enable deprecation warnings fixed many of them, but there are still a few left; it would be nice to fix them. For example, here's one in RDDSuite: {code} [warn] /Users/joshrosen/Documents/spark/spark/core/src/test/scala/spark/RDDSuite.scala:32: method mapPartitionsWithSplit in class RDD is deprecated: use mapPartitionsWithIndex [warn] val partitionSumsWithSplit = nums.mapPartitionsWithSplit { [warn] ^ [warn] one warning found {code} Also, it looks like Scala 2.9 added a second deprecatedSince parameter to @deprecated. We didn't fill this in, which causes some additional warnings: {code} [warn] /Users/joshrosen/Documents/spark/spark/core/src/main/scala/spark/RDD.scala:370: @deprecated now takes two arguments; see the scaladoc. [warn] @deprecated("use mapPartitionsWithIndex") [warn] ^ [warn] one warning found {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2233) make-distribution script should list the git hash in the RELEASE file
Patrick Wendell created SPARK-2233: -- Summary: make-distribution script should list the git hash in the RELEASE file Key: SPARK-2233 URL: https://issues.apache.org/jira/browse/SPARK-2233 Project: Spark Issue Type: Bug Reporter: Patrick Wendell If someone is creating a distribution and also has a version of Spark that has a .git folder in it, we should list the current git hash and put that in the RELEASE file. -- This message was sent by Atlassian JIRA (v6.2#6252)
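The real script is bash, but the information being captured is simple; a hedged Scala sketch of the idea (the output line format is hypothetical):
{code}
import scala.sys.process._

// Illustrative: capture the current commit so it can be written into RELEASE.
// Requires running inside a checkout that still has its .git folder.
val gitHash = Seq("git", "rev-parse", "HEAD").!!.trim
println(s"Built from git revision $gitHash") // hypothetical RELEASE file line
{code}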
[jira] [Updated] (SPARK-2233) make-distribution script should list the git hash in the RELEASE file
[ https://issues.apache.org/jira/browse/SPARK-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2233: --- Issue Type: Improvement (was: Bug) make-distribution script should list the git hash in the RELEASE file - Key: SPARK-2233 URL: https://issues.apache.org/jira/browse/SPARK-2233 Project: Spark Issue Type: Improvement Components: Project Infra Reporter: Patrick Wendell Priority: Minor Labels: starter If someone is creating a distribution and also has a version of Spark that has a .git folder in it, we should list the current git hash and put that in the RELEASE file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2231) dev/run-tests should include YARN and use a recent Hadoop version
[ https://issues.apache.org/jira/browse/SPARK-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2231. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1175 [https://github.com/apache/spark/pull/1175] dev/run-tests should include YARN and use a recent Hadoop version - Key: SPARK-2231 URL: https://issues.apache.org/jira/browse/SPARK-2231 Project: Spark Issue Type: Sub-task Components: Project Infra Reporter: Patrick Wendell Assignee: Patrick Wendell Fix For: 1.1.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2034) KafkaInputDStream doesn't close resources and may prevent JVM shutdown
[ https://issues.apache.org/jira/browse/SPARK-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2034: --- Assignee: Sean Owen KafkaInputDStream doesn't close resources and may prevent JVM shutdown -- Key: SPARK-2034 URL: https://issues.apache.org/jira/browse/SPARK-2034 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.0.0 Reporter: Sean Owen Assignee: Sean Owen Fix For: 1.0.1, 1.1.0 Tobias noted today on the mailing list: {quote} I am trying to use Spark Streaming with Kafka, which works like a charm -- except for shutdown. When I run my program with sbt run-main, sbt will never exit, because there are two non-daemon threads left that don't die. I created a minimal example at https://gist.github.com/tgpfeiffer/b1e765064e983449c6b6#file-kafkadoesntshutdown-scala. It starts a StreamingContext and does nothing more than connecting to a Kafka server and printing what it receives. Using the `future { ... }` construct, I shut down the StreamingContext after some seconds and then print the difference between the threads at start time and at end time. The output can be found at https://gist.github.com/tgpfeiffer/b1e765064e983449c6b6#file-output1. There are a number of threads remaining that will prevent sbt from exiting. When I replace `KafkaUtils.createStream(...)` with a call that does exactly the same, except that it calls `consumerConnector.shutdown()` in `KafkaReceiver.onStop()` (which it should, IMO), the output is as shown at https://gist.github.com/tgpfeiffer/b1e765064e983449c6b6#file-output2. Does anyone have *any* idea what is going on here and why the program doesn't shut down properly? The behavior is the same with both kafka 0.8.0 and 0.8.1.1, by the way. {quote} Something similar was noted last year: http://mail-archives.apache.org/mod_mbox/spark-dev/201309.mbox/%3c1380220041.2428.yahoomail...@web160804.mail.bf1.yahoo.com%3E KafkaInputDStream doesn't close ConsumerConnector in onStop(), and does not close the Executor it creates. The latter leaves non-daemon threads and can prevent the JVM from shutting down even if streaming is closed properly. -- This message was sent by Atlassian JIRA (v6.2#6252)
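A hedged sketch of the cleanup being requested (names and structure assumed; simplified relative to the real KafkaReceiver):
{code}
import java.util.concurrent.ExecutorService
import kafka.consumer.ConsumerConnector

// Hypothetical onStop() cleanup: close the Kafka connector and the thread
// pool so no non-daemon threads outlive the StreamingContext.
def stopReceiver(connector: ConsumerConnector, pool: ExecutorService): Unit = {
  connector.shutdown()  // releases Kafka's consumer threads
  pool.shutdownNow()    // stops the executor's non-daemon worker threads
}
{code}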
[jira] [Updated] (SPARK-2156) When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks
[ https://issues.apache.org/jira/browse/SPARK-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2156: --- Target Version/s: 0.9.2, 1.0.1, 1.1.0 (was: 0.9.2, 1.0.1) When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks -- Key: SPARK-2156 URL: https://issues.apache.org/jira/browse/SPARK-2156 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.1, 1.0.0 Environment: AWS EC2 1 master 2 slaves with the instance type of r3.2xlarge Reporter: Chen Jin Assignee: Xiangrui Meng Priority: Blocker Fix For: 1.0.1 Original Estimate: 504h Remaining Estimate: 504h I have done some experiments when the frameSize is around 10MB. 1) spark.akka.frameSize = 10 If one of the partition sizes is very close to 10MB, say 9.97MB, the execution blocks without any exception or warning. The worker finishes the task and sends the serialized result, then throws an exception saying the Hadoop IPC client connection stopped (visible after changing the logging to debug level). However, the master never receives the results and the program just hangs. But if the sizes of all the partitions are less than some number between 9.96MB and 9.97MB, the program works fine. 2) spark.akka.frameSize = 9 When the partition size is just a little bit smaller than 9MB, it fails as well. This bug behavior is not exactly what SPARK-1112 is about. -- This message was sent by Atlassian JIRA (v6.2#6252)
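For reference, a minimal sketch of the reported setup (the frame size is from the report; the master URL and app name are illustrative assumptions, not the reporter's actual test):
{code}
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative reproduction: a 10 MB frame with task results just under that
// limit is the condition reported to hang.
val conf = new SparkConf()
  .setMaster("spark://master:7077") // hypothetical cluster URL
  .setAppName("frameSize-repro")
  .set("spark.akka.frameSize", "10") // MB; the default in the affected versions
val sc = new SparkContext(conf)
{code}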
[jira] [Updated] (SPARK-1112) When spark.akka.frameSize > 10, task results bigger than 10MiB block execution
[ https://issues.apache.org/jira/browse/SPARK-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1112: --- Fix Version/s: 1.0.1 When spark.akka.frameSize > 10, task results bigger than 10MiB block execution -- Key: SPARK-1112 URL: https://issues.apache.org/jira/browse/SPARK-1112 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0, 1.0.0 Reporter: Guillaume Pitel Assignee: Xiangrui Meng Priority: Blocker Fix For: 1.0.1 When I set the spark.akka.frameSize to something over 10, the messages sent from the executors to the driver completely block the execution if the message is bigger than 10MiB and smaller than the frameSize (if it's above the frameSize, it's ok). The workaround is to set the spark.akka.frameSize to 10. In this case, since 0.8.1, the blockManager deals with the data to be sent. It seems slower than Akka direct messages, though. The configuration seems to be correctly read (see actorSystemConfig.txt), so I don't see where the 10MiB could come from. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2156) When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks
[ https://issues.apache.org/jira/browse/SPARK-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2156: --- Fix Version/s: 1.0.1 When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks -- Key: SPARK-2156 URL: https://issues.apache.org/jira/browse/SPARK-2156 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.1, 1.0.0 Environment: AWS EC2 1 master 2 slaves with the instance type of r3.2xlarge Reporter: Chen Jin Assignee: Xiangrui Meng Priority: Blocker Fix For: 1.0.1 Original Estimate: 504h Remaining Estimate: 504h I have done some experiments when the frameSize is around 10MB. 1) spark.akka.frameSize = 10 If one of the partition sizes is very close to 10MB, say 9.97MB, the execution blocks without any exception or warning. The worker finishes the task and sends the serialized result, then throws an exception saying the Hadoop IPC client connection stopped (visible after changing the logging to debug level). However, the master never receives the results and the program just hangs. But if the sizes of all the partitions are less than some number between 9.96MB and 9.97MB, the program works fine. 2) spark.akka.frameSize = 9 When the partition size is just a little bit smaller than 9MB, it fails as well. This bug behavior is not exactly what SPARK-1112 is about. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-1112) When spark.akka.frameSize > 10, task results bigger than 10MiB block execution
[ https://issues.apache.org/jira/browse/SPARK-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1112: --- Target Version/s: 0.9.2, 1.0.1, 1.1.0 (was: 0.9.2, 1.0.1) When spark.akka.frameSize > 10, task results bigger than 10MiB block execution -- Key: SPARK-1112 URL: https://issues.apache.org/jira/browse/SPARK-1112 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0, 1.0.0 Reporter: Guillaume Pitel Assignee: Xiangrui Meng Priority: Blocker Fix For: 1.0.1 When I set the spark.akka.frameSize to something over 10, the messages sent from the executors to the driver completely block the execution if the message is bigger than 10MiB and smaller than the frameSize (if it's above the frameSize, it's ok). The workaround is to set the spark.akka.frameSize to 10. In this case, since 0.8.1, the blockManager deals with the data to be sent. It seems slower than Akka direct messages, though. The configuration seems to be correctly read (see actorSystemConfig.txt), so I don't see where the 10MiB could come from. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2156) When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks
[ https://issues.apache.org/jira/browse/SPARK-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040360#comment-14040360 ] Patrick Wendell commented on SPARK-2156: This is fixed in the 1.0 branch via: https://github.com/apache/spark/pull/1172 When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks -- Key: SPARK-2156 URL: https://issues.apache.org/jira/browse/SPARK-2156 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.1, 1.0.0 Environment: AWS EC2 1 master 2 slaves with the instance type of r3.2xlarge Reporter: Chen Jin Assignee: Xiangrui Meng Priority: Blocker Fix For: 1.0.1 Original Estimate: 504h Remaining Estimate: 504h I have done some experiments when the frameSize is around 10MB. 1) spark.akka.frameSize = 10 If one of the partition sizes is very close to 10MB, say 9.97MB, the execution blocks without any exception or warning. The worker finishes the task and sends the serialized result, then throws an exception saying the Hadoop IPC client connection stopped (visible after changing the logging to debug level). However, the master never receives the results and the program just hangs. But if the sizes of all the partitions are less than some number between 9.96MB and 9.97MB, the program works fine. 2) spark.akka.frameSize = 9 When the partition size is just a little bit smaller than 9MB, it fails as well. This bug behavior is not exactly what SPARK-1112 is about. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2241) EC2 script should handle quoted arguments correctly
[ https://issues.apache.org/jira/browse/SPARK-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2241: --- Description: We should pass quoted arguments correctly to the underlying ec2 script in spark-ec2 (was: We should pass quoted arguments correctly to the underlying ec2 in spark-ec2) EC2 script should handle quoted arguments correctly --- Key: SPARK-2241 URL: https://issues.apache.org/jira/browse/SPARK-2241 Project: Spark Issue Type: Bug Components: EC2 Affects Versions: 0.9.1, 1.0.0 Reporter: Patrick Wendell We should pass quoted arguments correctly to the underlying ec2 script in spark-ec2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2241) EC2 script should handle quoted arguments correctly
Patrick Wendell created SPARK-2241: -- Summary: EC2 script should handle quoted arguments correctly Key: SPARK-2241 URL: https://issues.apache.org/jira/browse/SPARK-2241 Project: Spark Issue Type: Bug Components: EC2 Affects Versions: 1.0.0, 0.9.1 Reporter: Patrick Wendell -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2241) EC2 script should handle quoted arguments correctly
[ https://issues.apache.org/jira/browse/SPARK-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2241: --- Description: We should pass quoted arguments correctly to the underlying ec2 in spark-ec2 EC2 script should handle quoted arguments correctly --- Key: SPARK-2241 URL: https://issues.apache.org/jira/browse/SPARK-2241 Project: Spark Issue Type: Bug Components: EC2 Affects Versions: 0.9.1, 1.0.0 Reporter: Patrick Wendell We should pass quoted arguments correctly to the underlying ec2 in spark-ec2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2241) EC2 script should handle quoted arguments correctly
[ https://issues.apache.org/jira/browse/SPARK-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2241. Resolution: Fixed Fix Version/s: 1.1.0 0.9.2 1.0.1 Issue resolved by pull request 1169 [https://github.com/apache/spark/pull/1169] EC2 script should handle quoted arguments correctly --- Key: SPARK-2241 URL: https://issues.apache.org/jira/browse/SPARK-2241 Project: Spark Issue Type: Bug Components: EC2 Affects Versions: 0.9.1, 1.0.0 Reporter: Patrick Wendell Fix For: 1.0.1, 0.9.2, 1.1.0 We should pass quoted arguments correctly to the underlying ec2 script in spark-ec2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2166) Enumerating instances to be terminated before prompting the users to continue.
[ https://issues.apache.org/jira/browse/SPARK-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2166. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 270 [https://github.com/apache/spark/pull/270] Enumerating instances to be terminated before prompting the users to continue. -- Key: SPARK-2166 URL: https://issues.apache.org/jira/browse/SPARK-2166 Project: Spark Issue Type: Improvement Components: EC2 Affects Versions: 0.9.0, 1.0.0 Reporter: Jean-Martin Archer Assignee: Jean-Martin Archer Priority: Minor Fix For: 1.1.0 Original Estimate: 0h Remaining Estimate: 0h When destroying a cluster, the user will be prompted for confirmation without first showing which instances will be terminated. Pull Request: https://github.com/apache/spark/pull/270#issuecomment-46341975 This pull request will list the EC2 instances before destroying the cluster. This was added because it can be scary to destroy EC2 instances without knowing which ones will be affected. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2228) onStageSubmitted is not properly called so NoSuchElement will be thrown in onStageCompleted
[ https://issues.apache.org/jira/browse/SPARK-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2228: --- Target Version/s: 1.0.1, 1.1.0 onStageSubmitted is not properly called so NoSuchElement will be thrown in onStageCompleted - Key: SPARK-2228 URL: https://issues.apache.org/jira/browse/SPARK-2228 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Baoxu Shi We are using `saveAsObjectFile` and `objectFile` to cut off lineage during iterative computing, but after several hundred iterations there will be a `NoSuchElementsError`. We checked the code and located the problem in `org.apache.spark.ui.jobs.JobProgressListener`. When `onStageCompleted` is called, the `stageId` cannot be found in `stageIdToPool`, but it does exist in the other HashMaps. So we think `onStageSubmitted` is not properly called: `Spark` did add a stage but failed to send the message to the listeners. When the `finish` message is sent to the listeners, the error occurs. This problem will cause a huge number of `active stages` to show in `SparkUI`, which is really annoying. But it may not affect the final result, according to the result of my testing code. I'm willing to help solve this problem; any idea which part I should change? I assume `org.apache.spark.scheduler.SparkListenerBus` has something to do with it, but it looks fine to me. FYI, here is the test code that can reproduce the problem. I do not know how to put code here with highlighting, so I put the code on gist to keep the issue clean. https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd -- This message was sent by Atlassian JIRA (v6.2#6252)
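For reference, a sketch of the lineage-cutting pattern the reporter describes (the helper name, path, and element type are illustrative assumptions):
{code}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical lineage-cutting helper: writing the RDD out and reading it
// back gives the next iteration a fresh, short lineage, at the cost of a
// round trip through storage.
def cutLineage(sc: SparkContext, rdd: RDD[(Long, Double)], iter: Int): RDD[(Long, Double)] = {
  val path = s"/tmp/lineage_cut_$iter" // hypothetical checkpoint location
  rdd.saveAsObjectFile(path)
  sc.objectFile[(Long, Double)](path)
}
{code}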
[jira] [Updated] (SPARK-2228) onStageSubmitted is not properly called so NoSuchElement will be thrown in onStageCompleted
[ https://issues.apache.org/jira/browse/SPARK-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2228: --- Affects Version/s: (was: 1.1.0) onStageSubmitted is not properly called so NoSuchElement will be thrown in onStageCompleted - Key: SPARK-2228 URL: https://issues.apache.org/jira/browse/SPARK-2228 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Baoxu Shi We are using `saveAsObjectFile` and `objectFile` to cut off lineage during iterative computing, but after several hundred iterations there will be a `NoSuchElementsError`. We checked the code and located the problem in `org.apache.spark.ui.jobs.JobProgressListener`. When `onStageCompleted` is called, the `stageId` cannot be found in `stageIdToPool`, but it does exist in the other HashMaps. So we think `onStageSubmitted` is not properly called: `Spark` did add a stage but failed to send the message to the listeners. When the `finish` message is sent to the listeners, the error occurs. This problem will cause a huge number of `active stages` to show in `SparkUI`, which is really annoying. But it may not affect the final result, according to the result of my testing code. I'm willing to help solve this problem; any idea which part I should change? I assume `org.apache.spark.scheduler.SparkListenerBus` has something to do with it, but it looks fine to me. FYI, here is the test code that can reproduce the problem. I do not know how to put code here with highlighting, so I put the code on gist to keep the issue clean. https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2118) If tools jar is not present, MIMA build should exit with an exception
[ https://issues.apache.org/jira/browse/SPARK-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2118. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1068 [https://github.com/apache/spark/pull/1068] If tools jar is not present, MIMA build should exit with an exception - Key: SPARK-2118 URL: https://issues.apache.org/jira/browse/SPARK-2118 Project: Spark Issue Type: Sub-task Reporter: Patrick Wendell Assignee: Prashant Sharma Fix For: 1.1.0 Right now dev/mima will just produce a bunch of warnings since generating the excludes fails. If the tools jar is not present, it should tell the user to run sbt/sbt assembly and exit nonzero. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-1768) History Server enhancements
[ https://issues.apache.org/jira/browse/SPARK-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1768. Resolution: Fixed Issue resolved by pull request 718 [https://github.com/apache/spark/pull/718] History Server enhancements --- Key: SPARK-1768 URL: https://issues.apache.org/jira/browse/SPARK-1768 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.0.0 Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Fix For: 1.1.0 The history server currently has some limitations; the one that currently concerns me the most is that it limits the number of applications it will show, to avoid having to hold all applications in memory. It would be better if the code were smarter and able to show any application available in the history storage. Also, thinking forward a little bit (I'm thinking SPARK-1537), it would be nice to separate the serving logic from the logic to access app log data. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2204) Scheduler for Mesos in fine-grained mode launches tasks on wrong executors
[ https://issues.apache.org/jira/browse/SPARK-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2204: --- Target Version/s: 1.0.1, 1.1.0 Scheduler for Mesos in fine-grained mode launches tasks on wrong executors -- Key: SPARK-2204 URL: https://issues.apache.org/jira/browse/SPARK-2204 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.0.0 Reporter: Sebastien Rainville Priority: Blocker MesosSchedulerBackend.resourceOffers(SchedulerDriver, List[Offer]) is assuming that TaskSchedulerImpl.resourceOffers(Seq[WorkerOffer]) is returning task lists in the same order as the offers it was passed, but in the current implementation TaskSchedulerImpl.resourceOffers shuffles the offers to avoid assigning the tasks always to the same executors. The result is that the tasks are launched on the wrong executors. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2264) CachedTableSuite SQL Tests are Failing
Patrick Wendell created SPARK-2264: -- Summary: CachedTableSuite SQL Tests are Failing Key: SPARK-2264 URL: https://issues.apache.org/jira/browse/SPARK-2264 Project: Spark Issue Type: Bug Reporter: Patrick Wendell Assignee: Michael Armbrust Priority: Blocker {code} [info] CachedTableSuite: [info] - read from cached table and uncache *** FAILED *** [info] java.lang.RuntimeException: Table Not Found: testData [info] at scala.sys.package$.error(package.scala:27) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:64) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:64) [info] at scala.Option.getOrElse(Option.scala:120) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:64) [info] at org.apache.spark.sql.SQLContext.table(SQLContext.scala:185) [info] at org.apache.spark.sql.CachedTableSuite$$anonfun$1.apply$mcV$sp(CachedTableSuite.scala:43) [info] at org.apache.spark.sql.CachedTableSuite$$anonfun$1.apply(CachedTableSuite.scala:27) [info] at org.apache.spark.sql.CachedTableSuite$$anonfun$1.apply(CachedTableSuite.scala:27) [info] at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22) [info] ... [info] - correct error on uncache of non-cached table *** FAILED *** [info] Expected exception java.lang.IllegalArgumentException to be thrown, but java.lang.RuntimeException was thrown. (CachedTableSuite.scala:55) [info] - SELECT Star Cached Table *** FAILED *** [info] java.lang.RuntimeException: Table Not Found: testData [info] at scala.sys.package$.error(package.scala:27) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:64) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:64) [info] at scala.Option.getOrElse(Option.scala:120) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:64) [info] at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$1.applyOrElse(Analyzer.scala:67) [info] at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$1.applyOrElse(Analyzer.scala:65) [info] at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:165) [info] at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:183) [info] at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) [info] ... [info] - Self-join cached *** FAILED *** [info] java.lang.RuntimeException: Table Not Found: testData [info] at scala.sys.package$.error(package.scala:27) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:64) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:64) [info] at scala.Option.getOrElse(Option.scala:120) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:64) [info] at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$1.applyOrElse(Analyzer.scala:67) [info] at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$1.applyOrElse(Analyzer.scala:65) [info] at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:165) [info] at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:183) [info] at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) [info] ... 
[info] - 'CACHE TABLE' and 'UNCACHE TABLE' SQL statement *** FAILED *** [info] java.lang.RuntimeException: Table Not Found: testData [info] at scala.sys.package$.error(package.scala:27) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:64) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:64) [info] at scala.Option.getOrElse(Option.scala:120) [info] at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:64) [info] at org.apache.spark.sql.SQLContext.cacheTable(SQLContext.scala:189) [info] at org.apache.spark.sql.execution.CacheCommand.sideEffectResult$lzycompute(commands.scala:110) [info] at org.apache.spark.sql.execution.CacheCommand.sideEffectResult(commands.scala:108) [info] at org.apache.spark.sql.execution.CacheCommand.execute(commands.scala:118) [info] at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:322) [info] ... {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2269) Clean up and add unit tests for resourceOffers in MesosSchedulerBackend
Patrick Wendell created SPARK-2269: -- Summary: Clean up and add unit tests for resourceOffers in MesosSchedulerBackend Key: SPARK-2269 URL: https://issues.apache.org/jira/browse/SPARK-2269 Project: Spark Issue Type: Bug Components: Mesos Reporter: Patrick Wendell This function could be simplified a bit. We could re-write it without offerableIndices or creating the mesosTasks array as large as the offer list. There is a lot of logic around making sure you get the correct index into mesosTasks and offers, really we should just build mesosTasks directly from the offers we get back. To associate the tasks we are launching with the offers we can just create a hashMap from the slaveId to the original offer. The basic logic of the function is that you take the mesos offers, convert them to spark offers, then convert the results back. One thing we should check is whether Mesos guarantees that it won't give two offers for the same worker. That would make things much more complicated. -- This message was sent by Atlassian JIRA (v6.2#6252)
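A hedged sketch of the association step suggested above, using the Mesos protobuf API (the helper name is an assumption, not the eventual patch):
{code}
import org.apache.mesos.Protos.Offer
import scala.collection.JavaConverters._

// Illustrative: key each Mesos offer by its slave ID so launched tasks can be
// matched back to the offer they came from, without index bookkeeping.
def offersBySlaveId(offers: java.util.List[Offer]): Map[String, Offer] =
  offers.asScala.map(o => o.getSlaveId.getValue -> o).toMap
{code}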
[jira] [Commented] (SPARK-2156) When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks
[ https://issues.apache.org/jira/browse/SPARK-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042962#comment-14042962 ] Patrick Wendell commented on SPARK-2156: Fixed in 1.1.0 via: https://github.com/apache/spark/pull/1132 When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks -- Key: SPARK-2156 URL: https://issues.apache.org/jira/browse/SPARK-2156 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.1, 1.0.0 Environment: AWS EC2 1 master 2 slaves with the instance type of r3.2xlarge Reporter: Chen Jin Assignee: Xiangrui Meng Priority: Blocker Fix For: 1.0.1, 1.1.0 Original Estimate: 504h Remaining Estimate: 504h I have done some experiments when the frameSize is around 10MB. 1) spark.akka.frameSize = 10 If one of the partition sizes is very close to 10MB, say 9.97MB, the execution blocks without any exception or warning. The worker finishes the task and sends the serialized result, then throws an exception saying the Hadoop IPC client connection stopped (visible after changing the logging to debug level). However, the master never receives the results and the program just hangs. But if the sizes of all the partitions are less than some number between 9.96MB and 9.97MB, the program works fine. 2) spark.akka.frameSize = 9 When the partition size is just a little bit smaller than 9MB, it fails as well. This bug behavior is not exactly what SPARK-1112 is about. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2248) spark.default.parallelism does not apply in local mode
[ https://issues.apache.org/jira/browse/SPARK-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2248. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1194 [https://github.com/apache/spark/pull/1194] spark.default.parallelism does not apply in local mode -- Key: SPARK-2248 URL: https://issues.apache.org/jira/browse/SPARK-2248 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Matei Zaharia Assignee: Guoqiang Li Priority: Trivial Labels: Starter Fix For: 1.1.0 LocalBackend.defaultParallelism ignores the spark.default.parallelism property, unlike the other SchedulerBackends. We should make it respect this setting for consistency. -- This message was sent by Atlassian JIRA (v6.2#6252)
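A small illustration of the consistency point (values illustrative; with the fix, local mode should honor the property the way the cluster backends already do):
{code}
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("local[4]")
  .setAppName("parallelism-check")
  .set("spark.default.parallelism", "8")
val sc = new SparkContext(conf)
assert(sc.defaultParallelism == 8) // previously LocalBackend ignored the setting
{code}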
[jira] [Commented] (SPARK-1112) When spark.akka.frameSize > 10, task results bigger than 10MiB block execution
[ https://issues.apache.org/jira/browse/SPARK-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043738#comment-14043738 ] Patrick Wendell commented on SPARK-1112: [~reachbach] If you are running in standalone mode, it might work if you go to every node in your cluster and add the following to spark-env.sh: {code} SPARK_JAVA_OPTS="-Dspark.akka.frameSize=XXX" {code} However, this workaround will only work if every job in your cluster is using the same frame size (XXX). The main recommendation is to upgrade to 1.0.1. We are very conservative about what we merge into maintenance branches, so we recommend users upgrade immediately once we release them. When spark.akka.frameSize > 10, task results bigger than 10MiB block execution -- Key: SPARK-1112 URL: https://issues.apache.org/jira/browse/SPARK-1112 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0, 1.0.0 Reporter: Guillaume Pitel Assignee: Xiangrui Meng Priority: Blocker Fix For: 1.0.1, 1.1.0 When I set the spark.akka.frameSize to something over 10, the messages sent from the executors to the driver completely block the execution if the message is bigger than 10MiB and smaller than the frameSize (if it's above the frameSize, it's ok). The workaround is to set the spark.akka.frameSize to 10. In this case, since 0.8.1, the blockManager deals with the data to be sent. It seems slower than Akka direct messages, though. The configuration seems to be correctly read (see actorSystemConfig.txt), so I don't see where the 10MiB could come from. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (SPARK-1112) When spark.akka.frameSize > 10, task results bigger than 10MiB block execution
[ https://issues.apache.org/jira/browse/SPARK-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-1112: This is not resolved yet because it needs to be backported into 0.9. When spark.akka.frameSize > 10, task results bigger than 10MiB block execution -- Key: SPARK-1112 URL: https://issues.apache.org/jira/browse/SPARK-1112 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0, 1.0.0 Reporter: Guillaume Pitel Assignee: Xiangrui Meng Priority: Blocker Fix For: 1.0.1, 1.1.0 When I set the spark.akka.frameSize to something over 10, the messages sent from the executors to the driver completely block the execution if the message is bigger than 10MiB and smaller than the frameSize (if it's above the frameSize, it's ok). The workaround is to set the spark.akka.frameSize to 10. In this case, since 0.8.1, the blockManager deals with the data to be sent. It seems slower than Akka direct messages, though. The configuration seems to be correctly read (see actorSystemConfig.txt), so I don't see where the 10MiB could come from. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2269) Clean up and add unit tests for resourceOffers in MesosSchedulerBackend
[ https://issues.apache.org/jira/browse/SPARK-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2269: --- Description: This function could be simplified a bit. We could re-write it without offerableIndices or creating the mesosTasks array as large as the offer list. There is a lot of logic around making sure you get the correct index into mesosTasks and offers, really we should just build mesosTasks directly from the offers we get back. To associate the tasks we are launching with the offers we can just create a hashMap from the slaveId to the original offer. The basic logic of the function is that you take the mesos offers, convert them to spark offers, then convert the results back. One reason I think it might be designed as it is now is to deal with the case where Mesos gives multiple offers for a single slave. I checked directly with the Mesos team and they said this won't ever happen, you'll get at most one offer per mesos slave within a set of offers. was: This function could be simplified a bit. We could re-write it without offerableIndices or creating the mesosTasks array as large as the offer list. There is a lot of logic around making sure you get the correct index into mesosTasks and offers, really we should just build mesosTasks directly from the offers we get back. To associate the tasks we are launching with the offers we can just create a hashMap from the slaveId to the original offer. The basic logic of the function is that you take the mesos offers, convert them to spark offers, then convert the results back. One thing we should check is whether Mesos guarantees that it won't give two offers for the same worker. That would make things much more complicated. Clean up and add unit tests for resourceOffers in MesosSchedulerBackend --- Key: SPARK-2269 URL: https://issues.apache.org/jira/browse/SPARK-2269 Project: Spark Issue Type: Bug Components: Mesos Reporter: Patrick Wendell This function could be simplified a bit. We could re-write it without offerableIndices or creating the mesosTasks array as large as the offer list. There is a lot of logic around making sure you get the correct index into mesosTasks and offers, really we should just build mesosTasks directly from the offers we get back. To associate the tasks we are launching with the offers we can just create a hashMap from the slaveId to the original offer. The basic logic of the function is that you take the mesos offers, convert them to spark offers, then convert the results back. One reason I think it might be designed as it is now is to deal with the case where Mesos gives multiple offers for a single slave. I checked directly with the Mesos team and they said this won't ever happen, you'll get at most one offer per mesos slave within a set of offers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2270) Kryo cannot serialize results returned by asJavaIterable (and thus groupBy/cogroup are broken in Java APIs when Kryo is used)
[ https://issues.apache.org/jira/browse/SPARK-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2270. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 Issue resolved by pull request 1206 [https://github.com/apache/spark/pull/1206] Kryo cannot serialize results returned by asJavaIterable (and thus groupBy/cogroup are broken in Java APIs when Kryo is used) - Key: SPARK-2270 URL: https://issues.apache.org/jira/browse/SPARK-2270 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Reynold Xin Assignee: Reynold Xin Priority: Critical Fix For: 1.0.1, 1.1.0 The combination of the Kryo serializer and the Java API can lead to the following exception in groupBy/groupByKey/cogroup: {code} org.apache.spark.SparkException: Job aborted due to stage failure: Exception while deserializing and fetching task: java.lang.UnsupportedOperationException org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033) org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017) org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015) scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015) org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) scala.Option.foreach(Option.scala:236) org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633) org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207) akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) akka.actor.ActorCell.invoke(ActorCell.scala:456) akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) akka.dispatch.Mailbox.run(Mailbox.scala:219) akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} or {code} 14/06/24 16:38:09 ERROR TaskResultGetter: Exception while getting task result java.lang.UnsupportedOperationException at java.util.AbstractCollection.add(AbstractCollection.java:260) at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:109) at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at carbonite.serializer$mk_collection_reader$fn__50.invoke(serializer.clj:57) at clojure.lang.Var.invoke(Var.java:383) at carbonite.ClojureVecSerializer.read(ClojureVecSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338) at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at
org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:144) at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:79) at org.apache.spark.scheduler.TaskSetManager.handleSuccessfulTask(TaskSetManager.scala:480) at org.apache.spark.scheduler.TaskSchedulerImpl.handleSuccessfulTask(TaskSchedulerImpl.scala:316) at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:68) at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:47) at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:47) at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1213) at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:46) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at
[jira] [Resolved] (SPARK-2204) Scheduler for Mesos in fine-grained mode launches tasks on wrong executors
[ https://issues.apache.org/jira/browse/SPARK-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2204. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 Issue resolved by pull request 1140 [https://github.com/apache/spark/pull/1140] Scheduler for Mesos in fine-grained mode launches tasks on wrong executors -- Key: SPARK-2204 URL: https://issues.apache.org/jira/browse/SPARK-2204 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.0.0 Reporter: Sebastien Rainville Priority: Blocker Fix For: 1.0.1, 1.1.0 MesosSchedulerBackend.resourceOffers(SchedulerDriver, List[Offer]) is assuming that TaskSchedulerImpl.resourceOffers(Seq[WorkerOffer]) is returning task lists in the same order as the offers it was passed, but in the current implementation TaskSchedulerImpl.resourceOffers shuffles the offers to avoid assigning the tasks always to the same executors. The result is that the tasks are launched on the wrong executors. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-1912) Compression memory issue during reduce
[ https://issues.apache.org/jira/browse/SPARK-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1912: --- Fix Version/s: 0.9.2 Compression memory issue during reduce -- Key: SPARK-1912 URL: https://issues.apache.org/jira/browse/SPARK-1912 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Wenchen Fan Assignee: Wenchen Fan Fix For: 0.9.2, 1.0.1, 1.1.0 When we need to read a compressed block, we will first create a compression stream instance (LZF or Snappy) and use it to wrap that block. Let's say a reducer task needs to read 1000 local shuffle blocks: it will first prepare to read those 1000 blocks, which means creating 1000 compression stream instances to wrap them. But the initialization of a compression instance allocates some memory, and when we have many compression instances at the same time, it is a problem. Actually the reducer reads the shuffle blocks one by one, so why do we create all the compression instances up front? Can we do it lazily, creating the compression instance for a block when it is first read? -- This message was sent by Atlassian JIRA (v6.2#6252)
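A minimal sketch of the lazy-initialization idea (the wrapper is hypothetical and simplified; not the actual patch):
{code}
import java.io.InputStream
import org.apache.spark.io.CompressionCodec

// Hypothetical wrapper: the compression stream (and its buffers) is only
// allocated when the block is actually read, not when the read is prepared.
class LazyCompressedBlock(raw: () => InputStream, codec: CompressionCodec) {
  lazy val stream: InputStream = codec.compressedInputStream(raw())
}
{code}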
[jira] [Updated] (SPARK-1912) Compression memory issue during reduce
[ https://issues.apache.org/jira/browse/SPARK-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1912: --- Fix Version/s: 1.0.1 Compression memory issue during reduce -- Key: SPARK-1912 URL: https://issues.apache.org/jira/browse/SPARK-1912 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Wenchen Fan Assignee: Wenchen Fan Fix For: 0.9.2, 1.0.1, 1.1.0 When we need to read a compressed block, we will first create a compression stream instance (LZF or Snappy) and use it to wrap that block. Let's say a reducer task needs to read 1000 local shuffle blocks: it will first prepare to read those 1000 blocks, which means creating 1000 compression stream instances to wrap them. But the initialization of a compression instance allocates some memory, and when we have many compression instances at the same time, it is a problem. Actually the reducer reads the shuffle blocks one by one, so why do we create all the compression instances up front? Can we do it lazily, creating the compression instance for a block when it is first read? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2279) JavaSparkContext should allow creation of EmptyRDD
[ https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044063#comment-14044063 ] Patrick Wendell commented on SPARK-2279: I think `EmptyRDD` is a mostly internal class. Can you just parallelize an empty collection? JavaSparkContext should allow creation of EmptyRDD -- Key: SPARK-2279 URL: https://issues.apache.org/jira/browse/SPARK-2279 Project: Spark Issue Type: New Feature Components: Java API Affects Versions: 1.0.0 Reporter: Hans Uhlig The Scala implementation currently supports creation of an EmptyRDD. Java does not. -- This message was sent by Atlassian JIRA (v6.2#6252)
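A sketch of the suggested alternative, written in Scala against the Java API (the helper name and element type are illustrative):
{code}
import java.util.Collections
import org.apache.spark.api.java.{JavaRDD, JavaSparkContext}

// Illustrative workaround: an empty JavaRDD via parallelize, without needing
// a public EmptyRDD in the Java API.
def emptyStringRDD(jsc: JavaSparkContext): JavaRDD[String] =
  jsc.parallelize(Collections.emptyList[String]())
{code}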
[jira] [Commented] (SPARK-2266) Log page on Worker UI displays Some(app-id)
[ https://issues.apache.org/jira/browse/SPARK-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044070#comment-14044070 ] Patrick Wendell commented on SPARK-2266: Resolved by: https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=9aa603296c285e1acf4bde64583f203008ba3e91 Log page on Worker UI displays Some(app-id) - Key: SPARK-2266 URL: https://issues.apache.org/jira/browse/SPARK-2266 Project: Spark Issue Type: Bug Affects Versions: 1.1.0 Reporter: Andrew Or Priority: Minor Fix For: 1.0.1, 1.1.0 Attachments: Screen Shot 2014-06-24 at 5.07.54 PM.png Oops. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2266) Log page on Worker UI displays Some(app-id)
[ https://issues.apache.org/jira/browse/SPARK-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2266. Resolution: Fixed Fix Version/s: 1.0.1 Log page on Worker UI displays Some(app-id) - Key: SPARK-2266 URL: https://issues.apache.org/jira/browse/SPARK-2266 Project: Spark Issue Type: Bug Affects Versions: 1.1.0 Reporter: Andrew Or Priority: Minor Fix For: 1.0.1, 1.1.0 Attachments: Screen Shot 2014-06-24 at 5.07.54 PM.png Oops. -- This message was sent by Atlassian JIRA (v6.2#6252)
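For context, the class of bug behind a UI label like Some(app-id) is typically an Option interpolated directly into a string instead of being unwrapped first. A guess at the pattern, not the actual commit; the app id string is made up:
{code}
// Interpolating an Option renders its wrapper; unwrap it first.
val appId: Option[String] = Some("app-20140624170754")
println(s"Log page for $appId")                         // Log page for Some(app-20140624170754)
println(s"Log page for ${appId.getOrElse("unknown")}")  // Log page for app-20140624170754
{code}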
[jira] [Commented] (SPARK-985) Support Job Cancellation on Mesos Scheduler
[ https://issues.apache.org/jira/browse/SPARK-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044079#comment-14044079 ] Patrick Wendell commented on SPARK-985: --- Some more notes on this from a related thread: Task killing is not supported in fine-grained mode on Mesos because, in that mode, we use Mesos's built-in support for all of the control plane messages relating to tasks. So we'll have to figure out how to support killing tasks in that model. There are two questions: one is who actually sends the kill message to the executor, and the other is how we tell Mesos that the cores which were in use by the task have been freed. In the course of normal operation that's handled by the Mesos launchTask and sendStatusUpdate interfaces. Support Job Cancellation on Mesos Scheduler --- Key: SPARK-985 URL: https://issues.apache.org/jira/browse/SPARK-985 Project: Spark Issue Type: Improvement Components: Mesos Affects Versions: 0.9.0 Reporter: Josh Rosen https://github.com/apache/incubator-spark/pull/29 added job cancellation but may still need support for Mesos scheduler backends: Quote: {quote} This looks good except that MesosSchedulerBackend isn't yet calling Mesos's killTask. Do you want to add that too or are you planning to push it till later? I don't think it's a huge change. {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
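A sketch of what wiring up the missing kill path might look like, using the Mesos driver's killTask call. This is illustrative only, not the actual patch; the Spark-to-Mesos task-id mapping shown is an assumption:
{code}
// Sketch: forwarding a Spark task kill to Mesos. Mesos later reports
// TASK_KILLED via statusUpdate(), at which point the task's cores are
// freed and re-offered to the framework.
import org.apache.mesos.Protos.TaskID
import org.apache.mesos.SchedulerDriver

def killSparkTask(driver: SchedulerDriver, sparkTaskId: Long): Unit = {
  val mesosTaskId = TaskID.newBuilder().setValue(sparkTaskId.toString).build()
  driver.killTask(mesosTaskId)
}
{code}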
[jira] [Updated] (SPARK-985) Support Job Cancellation on Mesos Scheduler
[ https://issues.apache.org/jira/browse/SPARK-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-985: -- Component/s: Mesos Support Job Cancellation on Mesos Scheduler --- Key: SPARK-985 URL: https://issues.apache.org/jira/browse/SPARK-985 Project: Spark Issue Type: Improvement Components: Mesos Affects Versions: 0.9.0 Reporter: Josh Rosen https://github.com/apache/incubator-spark/pull/29 added job cancellation but may still need support for Mesos scheduler backends: Quote: {quote} This looks good except that MesosSchedulerBackend isn't yet calling Mesos's killTask. Do you want to add that too or are you planning to push it till later? I don't think it's a huge change. {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2204) Scheduler for Mesos in fine-grained mode launches tasks on wrong executors
[ https://issues.apache.org/jira/browse/SPARK-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2204: --- Assignee: Sebastien Rainville Scheduler for Mesos in fine-grained mode launches tasks on wrong executors -- Key: SPARK-2204 URL: https://issues.apache.org/jira/browse/SPARK-2204 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.0.0 Reporter: Sebastien Rainville Assignee: Sebastien Rainville Priority: Blocker Fix For: 1.0.1, 1.1.0 MesosSchedulerBackend.resourceOffers(SchedulerDriver, List[Offer]) assumes that TaskSchedulerImpl.resourceOffers(Seq[WorkerOffer]) returns task lists in the same order as the offers it was passed, but in the current implementation TaskSchedulerImpl.resourceOffers shuffles the offers to avoid always assigning tasks to the same executors. The result is that tasks are launched on the wrong executors. -- This message was sent by Atlassian JIRA (v6.2#6252)
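A toy illustration of the ordering mismatch described in SPARK-2204. The executor names and task lists here are made up; the real code paths are MesosSchedulerBackend.resourceOffers and TaskSchedulerImpl.resourceOffers:
{code}
// If the scheduler shuffles the offers internally but the backend matches
// the returned task lists back to its own offer list by position, tasks
// end up paired with the wrong executors.
import scala.util.Random

val offers = Seq("exec-A", "exec-B", "exec-C")
val shuffled = Random.shuffle(offers)
val taskLists = shuffled.map(e => s"tasks-built-for-$e") // ordered like `shuffled`

offers.zip(taskLists).foreach { case (executor, tasks) =>
  println(s"launching $tasks on $executor") // usually a mismatch
}
{code}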
[jira] [Resolved] (SPARK-1749) DAGScheduler supervisor strategy broken with Mesos
[ https://issues.apache.org/jira/browse/SPARK-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1749. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 Issue resolved by pull request 1219 [https://github.com/apache/spark/pull/1219] DAGScheduler supervisor strategy broken with Mesos -- Key: SPARK-1749 URL: https://issues.apache.org/jira/browse/SPARK-1749 Project: Spark Issue Type: Bug Components: Mesos, Spark Core Affects Versions: 1.0.0 Reporter: Bouke van der Bijl Assignee: Mark Hamstra Priority: Blocker Labels: mesos, scheduler, scheduling Fix For: 1.0.1, 1.1.0 Any bad Python code will trigger this bug; for example, `sc.parallelize(range(100)).map(lambda n: undefined_variable * 2).collect()` will raise an `undefined_variable is not defined` error, which causes Spark to try to kill the task, resulting in the following stack trace: java.lang.UnsupportedOperationException at org.apache.spark.scheduler.SchedulerBackend$class.killTask(SchedulerBackend.scala:32) at org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend.killTask(MesosSchedulerBackend.scala:41) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3$$anonfun$apply$1.apply$mcVJ$sp(TaskSchedulerImpl.scala:184) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3$$anonfun$apply$1.apply(TaskSchedulerImpl.scala:182) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3$$anonfun$apply$1.apply(TaskSchedulerImpl.scala:182) at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3.apply(TaskSchedulerImpl.scala:182) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3.apply(TaskSchedulerImpl.scala:175) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.TaskSchedulerImpl.cancelTasks(TaskSchedulerImpl.scala:175) at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages$1.apply$mcVI$sp(DAGScheduler.scala:1058) at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages$1.apply(DAGScheduler.scala:1045) at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages$1.apply(DAGScheduler.scala:1045) at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1045) at org.apache.spark.scheduler.DAGScheduler.handleJobCancellation(DAGScheduler.scala:998) at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply$mcVI$sp(DAGScheduler.scala:499) at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply(DAGScheduler.scala:499) at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply(DAGScheduler.scala:499) at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) at org.apache.spark.scheduler.DAGScheduler.doCancelAllJobs(DAGScheduler.scala:499) at org.apache.spark.scheduler.DAGSchedulerActorSupervisor$$anonfun$2.applyOrElse(DAGScheduler.scala:1151) at org.apache.spark.scheduler.DAGSchedulerActorSupervisor$$anonfun$2.applyOrElse(DAGScheduler.scala:1147) at akka.actor.SupervisorStrategy.handleFailure(FaultHandling.scala:295) at akka.actor.dungeon.FaultHandling$class.handleFailure(FaultHandling.scala:253) at akka.actor.ActorCell.handleFailure(ActorCell.scala:338) at
akka.actor.ActorCell.invokeAll$1(ActorCell.scala:423) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262) at akka.dispatch.Mailbox.run(Mailbox.scala:218) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) This is because killTask isn't implemented for the MesosSchedulerBackend. I assume this isn't pyspark-specific, as there will be other instances where you might want to kill the task -- This message was sent by Atlassian JIRA
[jira] [Resolved] (SPARK-2251) MLLib Naive Bayes Example SparkException: Can only zip RDDs with same number of elements in each partition
[ https://issues.apache.org/jira/browse/SPARK-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2251. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1229 [https://github.com/apache/spark/pull/1229] MLLib Naive Bayes Example SparkException: Can only zip RDDs with same number of elements in each partition -- Key: SPARK-2251 URL: https://issues.apache.org/jira/browse/SPARK-2251 Project: Spark Issue Type: Bug Components: MLlib Affects Versions: 1.0.0 Environment: OS: Fedora Linux Spark Version: 1.0.0. Git clone from the Spark Repository Reporter: Jun Xie Assignee: Xiangrui Meng Priority: Minor Labels: Naive-Bayes Fix For: 1.0.1, 1.1.0 I followed the exact code from the Naive Bayes example (http://spark.apache.org/docs/latest/mllib-naive-bayes.html) of MLlib. When I executed the final command: val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count() it complained "Can only zip RDDs with same number of elements in each partition". I got the following exception: {code} 14/06/23 19:39:23 INFO SparkContext: Starting job: count at console:31 14/06/23 19:39:23 INFO DAGScheduler: Got job 3 (count at console:31) with 2 output partitions (allowLocal=false) 14/06/23 19:39:23 INFO DAGScheduler: Final stage: Stage 4(count at console:31) 14/06/23 19:39:23 INFO DAGScheduler: Parents of final stage: List() 14/06/23 19:39:23 INFO DAGScheduler: Missing parents: List() 14/06/23 19:39:23 INFO DAGScheduler: Submitting Stage 4 (FilteredRDD[14] at filter at console:31), which has no missing parents 14/06/23 19:39:23 INFO DAGScheduler: Submitting 2 missing tasks from Stage 4 (FilteredRDD[14] at filter at console:31) 14/06/23 19:39:23 INFO TaskSchedulerImpl: Adding task set 4.0 with 2 tasks 14/06/23 19:39:23 INFO TaskSetManager: Starting task 4.0:0 as TID 8 on executor localhost: localhost (PROCESS_LOCAL) 14/06/23 19:39:23 INFO TaskSetManager: Serialized task 4.0:0 as 3410 bytes in 0 ms 14/06/23 19:39:23 INFO TaskSetManager: Starting task 4.0:1 as TID 9 on executor localhost: localhost (PROCESS_LOCAL) 14/06/23 19:39:23 INFO TaskSetManager: Serialized task 4.0:1 as 3410 bytes in 1 ms 14/06/23 19:39:23 INFO Executor: Running task ID 8 14/06/23 19:39:23 INFO Executor: Running task ID 9 14/06/23 19:39:23 INFO BlockManager: Found block broadcast_0 locally 14/06/23 19:39:23 INFO BlockManager: Found block broadcast_0 locally 14/06/23 19:39:23 INFO HadoopRDD: Input split: file:/home/jun/open_source/spark/mllib/data/sample_naive_bayes_data.txt:0+24 14/06/23 19:39:23 INFO HadoopRDD: Input split: file:/home/jun/open_source/spark/mllib/data/sample_naive_bayes_data.txt:24+24 14/06/23 19:39:23 INFO HadoopRDD: Input split: file:/home/jun/open_source/spark/mllib/data/sample_naive_bayes_data.txt:0+24 14/06/23 19:39:23 INFO HadoopRDD: Input split: file:/home/jun/open_source/spark/mllib/data/sample_naive_bayes_data.txt:24+24 14/06/23 19:39:23 ERROR Executor: Exception in task ID 9 org.apache.spark.SparkException: Can only zip RDDs with same number of elements in each partition at org.apache.spark.rdd.RDD$$anonfun$zip$1$$anon$1.hasNext(RDD.scala:663) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388) at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1067) at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:858) at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:858) at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1079) at
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1079) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111) at org.apache.spark.scheduler.Task.run(Task.scala:51) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) 14/06/23 19:39:23 ERROR Executor: Exception in task ID 8 org.apache.spark.SparkException: Can only zip RDDs with same number of elements in each partition at org.apache.spark.rdd.RDD$$anonfun$zip$1$$anon$1.hasNext(RDD.scala:663) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388) at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1067) at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:858) at
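For reference, the zip contract that this exception enforces can be seen with a toy example; a sketch, assuming a live SparkContext named sc:
{code}
// RDD.zip requires the same number of partitions and the same number of
// elements per partition in both RDDs.
val a = sc.parallelize(1 to 6, 2)                            // partitions of 3 and 3
val b = sc.parallelize(Seq("u", "v", "w", "x", "y", "z"), 2) // partitions of 3 and 3
a.zip(b).count()                                             // fine

val c = sc.parallelize(1 to 5, 2)                            // partitions of 2 and 3
// a.zip(c).count()  // throws SparkException: "Can only zip RDDs with same
//                   // number of elements in each partition"
{code}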
[jira] [Commented] (SPARK-2279) JavaSparkContext should allow creation of EmptyRDD
[ https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045517#comment-14045517 ] Patrick Wendell commented on SPARK-2279: Ah I see - I thought you meant the EmptyRDD class, not the emptyRDD() method (which I forgot we even had!). It definitely makes sense to include the latter in the Java API. JavaSparkContext should allow creation of EmptyRDD -- Key: SPARK-2279 URL: https://issues.apache.org/jira/browse/SPARK-2279 Project: Spark Issue Type: New Feature Components: Java API Affects Versions: 1.0.0 Reporter: Hans Uhlig The Scala implementation currently supports creation of an EmptyRDD. Java does not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2181) The keys for sorting the columns of Executor page in SparkUI are incorrect
[ https://issues.apache.org/jira/browse/SPARK-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2181. Resolution: Fixed Fix Version/s: 1.0.2 1.1.0 The keys for sorting the columns of Executor page in SparkUI are incorrect -- Key: SPARK-2181 URL: https://issues.apache.org/jira/browse/SPARK-2181 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Shuo Xiang Assignee: Guoqiang Li Priority: Minor Fix For: 1.1.0, 1.0.2 Under the Executor page of SparkUI, each column is sorted alphabetically (after clicking). However, it should be sorted by the value, not the string. -- This message was sent by Atlassian JIRA (v6.2#6252)
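The symptom in SPARK-2181 is the classic lexicographic-versus-numeric sort problem; a quick illustration (the rendered strings are made up):
{code}
// Sorting the rendered strings puts "9" after "100"; the sort key should
// be the underlying numeric value instead.
val shown = Seq("9 ms", "10 ms", "100 ms")
println(shown.sorted)                               // List(10 ms, 100 ms, 9 ms)
println(shown.sortBy(_.takeWhile(_.isDigit).toInt)) // List(9 ms, 10 ms, 100 ms)
{code}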
[jira] [Commented] (SPARK-2228) onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted
[ https://issues.apache.org/jira/browse/SPARK-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045564#comment-14045564 ] Patrick Wendell commented on SPARK-2228: I ran your reproduction locally. What I found was that it just generates events more quickly than the listener can process, so that was triggering all of the subsequent errors: {code} $ cat job-log.txt |grep ERROR | head -n 10 14/06/26 22:41:02 ERROR scheduler.LiveListenerBus: Dropping SparkListenerEvent because no remaining room in event queue. This likely means one of the SparkListeners is too slow and cannot keep up withthe rate at which tasks are being started by the scheduler. 14/06/26 22:42:01 ERROR scheduler.LiveListenerBus: Listener JobProgressListener threw an exception 14/06/26 22:42:01 ERROR scheduler.LiveListenerBus: Listener JobProgressListener threw an exception 14/06/26 22:42:01 ERROR scheduler.LiveListenerBus: Listener JobProgressListener threw an exception 14/06/26 22:42:01 ERROR scheduler.LiveListenerBus: Listener JobProgressListener threw an exception 14/06/26 22:42:01 ERROR scheduler.LiveListenerBus: Listener JobProgressListener threw an exception 14/06/26 22:42:01 ERROR scheduler.LiveListenerBus: Listener JobProgressListener threw an exception {code} If someone submits a job that creates thousands of stages in a few seconds, this can happen. But I haven't seen it happen in a real production job that does actual nontrivial work inside of the stage. We could consider an alternative design that applies back pressure instead of dropping events. onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted - Key: SPARK-2228 URL: https://issues.apache.org/jira/browse/SPARK-2228 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Baoxu Shi We are using `saveAsObjectFile` and `objectFile` to cut off lineage during iterative computing, but after several hundred iterations, a `NoSuchElementsError` is thrown. We checked the code and located the problem in `org.apache.spark.ui.jobs.JobProgressListener`. When `onStageCompleted` is called, the `stageId` cannot be found in `stageIdToPool`, but it does exist in other HashMaps. So we think `onStageSubmitted` is not properly called: Spark did add a stage but failed to send the message to listeners, and the error occurs when the `finish` message is sent to listeners. This problem causes a huge number of active stages to show in the SparkUI, which is really annoying, but it may not affect the final result, according to my testing code. I'm willing to help solve this problem; any idea which part I should change? I assume `org.apache.spark.scheduler.SparkListenerBus` has something to do with it, but it looks fine to me. FYI, here is the test code that reproduces the problem. I do not know how to put highlighted code here, so I put it on gist to keep the issue clean: https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd -- This message was sent by Atlassian JIRA (v6.2#6252)
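The two policies being weighed here (dropping versus back pressure) can be contrasted with a plain bounded queue; a sketch using java.util.concurrent directly, not Spark's actual LiveListenerBus code:
{code}
import java.util.concurrent.LinkedBlockingQueue

val queue = new LinkedBlockingQueue[String](10000)

// Dropping (current behavior): offer() returns false when the queue is
// full and the event is lost, leaving listeners with an inconsistent
// view of which stages started and finished.
def postDropping(event: String): Unit =
  if (!queue.offer(event)) println("Dropping SparkListenerEvent ...")

// Back pressure (alternative): put() blocks the posting thread -- here,
// the scheduler -- until the listener drains the queue, trading
// scheduler throughput for listener consistency.
def postWithBackPressure(event: String): Unit = queue.put(event)
{code}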
[jira] [Commented] (SPARK-2228) onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted
[ https://issues.apache.org/jira/browse/SPARK-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045626#comment-14045626 ] Patrick Wendell commented on SPARK-2228: [~rxin] unfortunately I think it's more complicated because the inconsistency can happen in both directions. We can miss an event for a stage finishing or we can miss an event for the stage starting. That means we either try to finish a missing stage (and get an NPE), or we have a straggler stage that looks like it never ended in the UI. onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted - Key: SPARK-2228 URL: https://issues.apache.org/jira/browse/SPARK-2228 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Baoxu Shi We are using `saveAsObjectFile` and `objectFile` to cut off lineage during iterative computing, but after several hundred iterations, a `NoSuchElementsError` is thrown. We checked the code and located the problem in `org.apache.spark.ui.jobs.JobProgressListener`. When `onStageCompleted` is called, the `stageId` cannot be found in `stageIdToPool`, but it does exist in other HashMaps. So we think `onStageSubmitted` is not properly called: Spark did add a stage but failed to send the message to listeners, and the error occurs when the `finish` message is sent to listeners. This problem causes a huge number of active stages to show in the SparkUI, which is really annoying, but it may not affect the final result, according to my testing code. I'm willing to help solve this problem; any idea which part I should change? I assume `org.apache.spark.scheduler.SparkListenerBus` has something to do with it, but it looks fine to me. FYI, here is the test code that reproduces the problem. I do not know how to put highlighted code here, so I put it on gist to keep the issue clean: https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2291) Update EC2 scripts to use instance storage on m3 instance types
[ https://issues.apache.org/jira/browse/SPARK-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2291. Resolution: Duplicate This was already fixed by this PR: https://github.com/apache/spark/pull/1156 Update EC2 scripts to use instance storage on m3 instance types --- Key: SPARK-2291 URL: https://issues.apache.org/jira/browse/SPARK-2291 Project: Spark Issue Type: Improvement Components: EC2 Affects Versions: 0.9.0, 0.9.1, 1.0.0 Reporter: Alessandro Andrioni [On January 21|https://aws.amazon.com/about-aws/whats-new/2014/01/21/announcing-new-amazon-ec2-m3-instance-sizes-and-lower-prices-for-amazon-s3-and-amazon-ebs/], Amazon added SSD-backed instance storage for m3 instances, and also added two new types: m3.medium and m3.large. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (SPARK-2292) NullPointerException in JavaPairRDD.mapToPair
[ https://issues.apache.org/jira/browse/SPARK-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046445#comment-14046445 ] Patrick Wendell edited comment on SPARK-2292 at 6/27/14 9:54 PM: - Unfortunately I also can't reproduce this issue. I tried the example job from [~mkim], but I had to generate my own CSV file because none was provided. And I found that there was no exception. Looking at the code I don't see an obvious cause for this, so it would be nice to have a reliable reproduction. was (Author: pwendell): Unfortunately I also can't reproduce this issue. I tried the example job from [~mkim], but I had to generate my own CSV file because none was provided. NullPointerException in JavaPairRDD.mapToPair - Key: SPARK-2292 URL: https://issues.apache.org/jira/browse/SPARK-2292 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Environment: Spark 1.0.0, standalone with the master and a single slave running on Ubuntu on a laptop. 4G of memory and 8 cores were available to the executor. Reporter: Bharath Ravi Kumar Priority: Critical Correction: Invoking JavaPairRDD.mapToPair results in an NPE: {noformat} 14/06/26 21:05:35 WARN scheduler.TaskSetManager: Loss was due to java.lang.NullPointerException java.lang.NullPointerException at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:750) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:750) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:59) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95) at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582) at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) at org.apache.spark.scheduler.Task.run(Task.scala:51) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) {noformat} This occurs only after migrating to the 1.0.0 API. The details of the code and the data file used to test are included in this gist: https://gist.github.com/reachbach/d8977c8eb5f71f889301 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2307) SparkUI Storage page cached statuses incorrect
[ https://issues.apache.org/jira/browse/SPARK-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2307: --- Assignee: Andrew Or SparkUI Storage page cached statuses incorrect -- Key: SPARK-2307 URL: https://issues.apache.org/jira/browse/SPARK-2307 Project: Spark Issue Type: Bug Components: Spark Core, Web UI Affects Versions: 1.1.0 Reporter: Andrew Or Assignee: Andrew Or Fix For: 1.0.1, 1.1.0 Attachments: Screen Shot 2014-06-27 at 11.09.54 AM.png See attached: the executor has 512MB, but somehow it has cached (279 + 27 + 279 + 27) = 612MB? (The correct answer is 279MB). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2307) SparkUI Storage page cached statuses incorrect
[ https://issues.apache.org/jira/browse/SPARK-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2307. Resolution: Fixed Fix Version/s: 1.0.1 Issue resolved by pull request 1249 [https://github.com/apache/spark/pull/1249] SparkUI Storage page cached statuses incorrect -- Key: SPARK-2307 URL: https://issues.apache.org/jira/browse/SPARK-2307 Project: Spark Issue Type: Bug Components: Spark Core, Web UI Affects Versions: 1.1.0 Reporter: Andrew Or Fix For: 1.0.1, 1.1.0 Attachments: Screen Shot 2014-06-27 at 11.09.54 AM.png See attached: the executor has 512MB, but somehow it has cached (279 + 27 + 279 + 27) = 612MB? (The correct answer is 279MB). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2259) Spark submit documentation for --deploy-mode is highly misleading
[ https://issues.apache.org/jira/browse/SPARK-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2259. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 Issue resolved by pull request 1200 [https://github.com/apache/spark/pull/1200] Spark submit documentation for --deploy-mode is highly misleading - Key: SPARK-2259 URL: https://issues.apache.org/jira/browse/SPARK-2259 Project: Spark Issue Type: Bug Components: Documentation Affects Versions: 1.1.0 Reporter: Andrew Or Assignee: Andrew Or Priority: Critical Fix For: 1.0.1, 1.1.0 There are a few issues: 1. Client mode does not necessarily mean the driver program must be launched outside of the cluster. 2. For standalone clusters, only client mode is currently supported. This was the case even before 1.0. Currently, the docs tell the user to use cluster deploy mode when "deploying your driver program within the cluster", which is also true for standalone-client mode. In short, the docs encourage the user to use standalone-cluster, an unsupported mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2243) Support multiple SparkContexts in the same JVM
[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2243: --- Summary: Support multiple SparkContexts in the same JVM (was: Using several Spark Contexts) Support multiple SparkContexts in the same JVM -- Key: SPARK-2243 URL: https://issues.apache.org/jira/browse/SPARK-2243 Project: Spark Issue Type: New Feature Components: Block Manager, Spark Core Affects Versions: 1.0.0 Reporter: Miguel Angel Fernandez Diaz We're developing a platform where we create several Spark contexts for carrying out different calculations. Is there any restriction when using several Spark contexts? We have two contexts, one for Spark calculations and another one for Spark Streaming jobs. The next error arises when we first execute a Spark calculation and, once the execution is finished, a Spark Streaming job is launched: {code} 14/06/23 16:40:08 ERROR executor.Executor: Exception in task ID 0 java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624) at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156) at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40) at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63) at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139) at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40) at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193) at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 14/06/23 16:40:08 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0) 14/06/23 16:40:08 WARN scheduler.TaskSetManager: Loss was due to 
java.io.FileNotFoundException java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624) at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156) at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at
[jira] [Commented] (SPARK-2243) Support multiple SparkContexts in the same JVM
[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046574#comment-14046574 ] Patrick Wendell commented on SPARK-2243: This is not supported - but it's something we could support in the future. Note though that you can use the same SparkContext for a streaming program and your own calculations. This is actually better in some ways because they can share data. Support multiple SparkContexts in the same JVM -- Key: SPARK-2243 URL: https://issues.apache.org/jira/browse/SPARK-2243 Project: Spark Issue Type: New Feature Components: Block Manager, Spark Core Affects Versions: 1.0.0 Reporter: Miguel Angel Fernandez Diaz We're developing a platform where we create several Spark contexts for carrying out different calculations. Is there any restriction when using several Spark contexts? We have two contexts, one for Spark calculations and another one for Spark Streaming jobs. The next error arises when we first execute a Spark calculation and, once the execution is finished, a Spark Streaming job is launched: {code} 14/06/23 16:40:08 ERROR executor.Executor: Exception in task ID 0 java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624) at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156) at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40) at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63) at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139) at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40) at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193) at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 14/06/23 16:40:08 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0) 14/06/23 16:40:08 WARN scheduler.TaskSetManager: Loss was due to java.io.FileNotFoundException java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624) at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156) at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at
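On the suggestion in SPARK-2243 of sharing one SparkContext between streaming and batch work: a minimal sketch, assuming a hypothetical input path and a socket source rather than the reporter's actual platform:
{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

val sc = new SparkContext(
  new SparkConf().setAppName("shared-context").setMaster("local[2]"))
// Build the StreamingContext on top of the same SparkContext...
val ssc = new StreamingContext(sc, Seconds(10))

// ...so batch RDDs (e.g. a cached reference set) are visible to streaming code.
val reference = sc.textFile("/data/reference").cache()
ssc.socketTextStream("localhost", 9999)
  .transform(batch => batch.intersection(reference))
  .print()

ssc.start()
ssc.awaitTermination()
{code}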
[jira] [Commented] (SPARK-2111) pyspark errors when SPARK_PRINT_LAUNCH_COMMAND=1
[ https://issues.apache.org/jira/browse/SPARK-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046586#comment-14046586 ] Patrick Wendell commented on SPARK-2111: I was thinking that SPARK-2313 might be a better general solution to this. pyspark errors when SPARK_PRINT_LAUNCH_COMMAND=1 Key: SPARK-2111 URL: https://issues.apache.org/jira/browse/SPARK-2111 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 1.0.0 Reporter: Thomas Graves If you set SPARK_PRINT_LAUNCH_COMMAND=1 to see what java command is being used to launch Spark and then try to run pyspark, it errors out with a very unhelpful error message: Traceback (most recent call last): File /homes/tgraves/test/hadoop2/y-spark-git/python/pyspark/shell.py, line 43, in module sc = SparkContext(appName=PySparkShell, pyFiles=add_files) File /homes/tgraves/test/hadoop2/y-spark-git/python/pyspark/context.py, line 94, in __init__ SparkContext._ensure_initialized(self, gateway=gateway) File /homes/tgraves/test/hadoop2/y-spark-git/python/pyspark/context.py, line 184, in _ensure_initialized SparkContext._gateway = gateway or launch_gateway() File /homes/tgraves/test/hadoop2/y-spark-git/python/pyspark/java_gateway.py, line 51, in launch_gateway gateway_port = int(proc.stdout.readline()) ValueError: invalid literal for int() with base 10: 'Spark Command: /home/gs/java/jdk/bin/java -cp :/home/gs/hadoop/current/share/hadoop/common/hadoop-gpl-compression.jar:/home/gs/hadoop/current/share/hadoop/hdfs/lib/YahooDNSToSwitchMapping-0.2.14020207' -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2313) PySpark should accept port via a command line argument rather than STDIN
Patrick Wendell created SPARK-2313: -- Summary: PySpark should accept port via a command line argument rather than STDIN Key: SPARK-2313 URL: https://issues.apache.org/jira/browse/SPARK-2313 Project: Spark Issue Type: Bug Components: PySpark Reporter: Patrick Wendell Relying on stdin is a brittle mechanism and has broken several times in the past. From what I can tell this is used only to bootstrap worker.py one time. It would be strictly simpler to just pass it as a command line argument. -- This message was sent by Atlassian JIRA (v6.2#6252)
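The contrast between the two bootstrap mechanisms, in the abstract: the real change would be in PySpark's gateway/worker startup; this generic sketch just shows why argv is more robust than stdin.
{code}
object PortBootstrap {
  def main(args: Array[String]): Unit = {
    // Brittle: read the port from stdin. Anything else written to the
    // stream first (a launch-command echo, a deprecation warning)
    // corrupts the handshake -- exactly the failures seen in SPARK-2111
    // and SPARK-2109.
    // val port = scala.io.StdIn.readLine().trim.toInt

    // Robust: take the port as a command line argument instead.
    val port = args(0).toInt
    println(s"connecting back on port $port")
  }
}
{code}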
[jira] [Commented] (SPARK-2228) onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted
[ https://issues.apache.org/jira/browse/SPARK-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046940#comment-14046940 ] Patrick Wendell commented on SPARK-2228: So I dug into this more and profiled it to confirm. The issue is that we do a bunch of inefficient operations in the storage listener. For instance, I noticed we spend almost all the time doing a big Scala groupBy on the entire list of persisted blocks: {code} at java.lang.Integer.valueOf(Integer.java:642) at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:70) at org.apache.spark.storage.StorageUtils$$anonfun$9.apply(StorageUtils.scala:82) at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:328) at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:327) at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403) at scala.collection.TraversableLike$class.groupBy(TraversableLike.scala:327) at scala.collection.AbstractTraversable.groupBy(Traversable.scala:105) at org.apache.spark.storage.StorageUtils$.rddInfoFromStorageStatus(StorageUtils.scala:82) at org.apache.spark.ui.storage.StorageListener.updateRDDInfo(StorageTab.scala:56) at org.apache.spark.ui.storage.StorageListener.onTaskEnd(StorageTab.scala:67) - locked 0xa27ebe30 (a org.apache.spark.ui.storage.StorageListener) {code} Resizing this buffer won't help the underlying issue at all; it will just defer the failure. onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted - Key: SPARK-2228 URL: https://issues.apache.org/jira/browse/SPARK-2228 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Baoxu Shi We are using `saveAsObjectFile` and `objectFile` to cut off lineage during iterative computing, but after several hundred iterations, a `NoSuchElementsError` is thrown. We checked the code and located the problem in `org.apache.spark.ui.jobs.JobProgressListener`. When `onStageCompleted` is called, the `stageId` cannot be found in `stageIdToPool`, but it does exist in other HashMaps. So we think `onStageSubmitted` is not properly called: Spark did add a stage but failed to send the message to listeners, and the error occurs when the `finish` message is sent to listeners. This problem causes a huge number of active stages to show in the SparkUI, which is really annoying, but it may not affect the final result, according to my testing code. I'm willing to help solve this problem; any idea which part I should change? I assume `org.apache.spark.scheduler.SparkListenerBus` has something to do with it, but it looks fine to me. FYI, here is the test code that reproduces the problem. I do not know how to put highlighted code here, so I put it on gist to keep the issue clean: https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2316) StorageStatusListener should avoid O(blocks) operations
Patrick Wendell created SPARK-2316: -- Summary: StorageStatusListener should avoid O(blocks) operations Key: SPARK-2316 URL: https://issues.apache.org/jira/browse/SPARK-2316 Project: Spark Issue Type: Bug Components: Spark Core, Web UI Affects Versions: 1.0.0 Reporter: Patrick Wendell Assignee: Andrew Or In cases where jobs frequently cause dropped blocks, the storage status listener can become a bottleneck. This is slow for a few reasons: one being that we use Scala collection operations, the other being that we perform operations that are O(number of blocks). I think using a few indices here could make this much faster. {code} at java.lang.Integer.valueOf(Integer.java:642) at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:70) at org.apache.spark.storage.StorageUtils$$anonfun$9.apply(StorageUtils.scala:82) at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:328) at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:327) at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403) at scala.collection.TraversableLike$class.groupBy(TraversableLike.scala:327) at scala.collection.AbstractTraversable.groupBy(Traversable.scala:105) at org.apache.spark.storage.StorageUtils$.rddInfoFromStorageStatus(StorageUtils.scala:82) at org.apache.spark.ui.storage.StorageListener.updateRDDInfo(StorageTab.scala:56) at org.apache.spark.ui.storage.StorageListener.onTaskEnd(StorageTab.scala:67) - locked 0xa27ebe30 (a org.apache.spark.ui.storage.StorageListener) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
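The "few indices" idea might look roughly like this: keep a per-RDD map of blocks that is updated incrementally, so per-RDD info becomes a lookup instead of a groupBy over every persisted block. A sketch with made-up names, not the eventual fix:
{code}
import scala.collection.mutable

case class BlockKey(rddId: Int, split: Int)

// Index maintained incrementally: O(blocks touched by an event) per event,
// not O(total persisted blocks).
val blocksByRdd = mutable.HashMap.empty[Int, mutable.Set[BlockKey]]

def onBlockAdded(b: BlockKey): Unit =
  blocksByRdd.getOrElseUpdate(b.rddId, mutable.Set.empty[BlockKey]) += b

def onBlockDropped(b: BlockKey): Unit =
  blocksByRdd.get(b.rddId).foreach(_ -= b)

// Per-RDD info is now a lookup, not a groupBy over all persisted blocks.
def blocksFor(rddId: Int): Set[BlockKey] =
  blocksByRdd.get(rddId).map(_.toSet).getOrElse(Set.empty)
{code}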
[jira] [Commented] (SPARK-2228) onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted
[ https://issues.apache.org/jira/browse/SPARK-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046943#comment-14046943 ] Patrick Wendell commented on SPARK-2228: I've created SPARK-2316 to deal with the underlying issue here. The fix in this pull request might also alleviate this issue, since it removes dropped blocks from the set that is considered by the UI: https://github.com/apache/spark/pull/1255 onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted - Key: SPARK-2228 URL: https://issues.apache.org/jira/browse/SPARK-2228 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Baoxu Shi We are using `saveAsObjectFile` and `objectFile` to cut off lineage during iterative computing, but after several hundred iterations, a `NoSuchElementsError` is thrown. We checked the code and located the problem in `org.apache.spark.ui.jobs.JobProgressListener`. When `onStageCompleted` is called, the `stageId` cannot be found in `stageIdToPool`, but it does exist in other HashMaps. So we think `onStageSubmitted` is not properly called: Spark did add a stage but failed to send the message to listeners, and the error occurs when the `finish` message is sent to listeners. This problem causes a huge number of active stages to show in the SparkUI, which is really annoying, but it may not affect the final result, according to my testing code. I'm willing to help solve this problem; any idea which part I should change? I assume `org.apache.spark.scheduler.SparkListenerBus` has something to do with it, but it looks fine to me. FYI, here is the test code that reproduces the problem. I do not know how to put highlighted code here, so I put it on gist to keep the issue clean: https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2292) NullPointerException in JavaPairRDD.mapToPair
[ https://issues.apache.org/jira/browse/SPARK-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046946#comment-14046946 ] Patrick Wendell commented on SPARK-2292: [~aash] With your example code I was able to narrow this down (slightly). I think there is something subtle going on here at the byte code level. Your example links against the spark-1.0.0 binaries in Maven. 1. If I ran your example on a downloaded Spark 1.0.0 cluster (I just went and downloaded the Spark binaries), it worked fine. 2. If I ran your example on a local Spark cluster that I compiled myself with SBT, even with the 1.0.0 tag, it didn't work. I'm wondering if this is something similar to SPARK-2075. In general, it would be good if people used a spark-submit binary that was compiled at the same time as their cluster to submit jobs. Otherwise, there can be issues where a closure is created using an internal class name that is different from that on the cluster. NullPointerException in JavaPairRDD.mapToPair - Key: SPARK-2292 URL: https://issues.apache.org/jira/browse/SPARK-2292 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Environment: Spark 1.0.0, standalone with the master and a single slave running on Ubuntu on a laptop. 4G of memory and 8 cores were available to the executor. Reporter: Bharath Ravi Kumar Attachments: SPARK-2292-aash-repro.tar.gz Correction: Invoking JavaPairRDD.mapToPair results in an NPE: {noformat} 14/06/26 21:05:35 WARN scheduler.TaskSetManager: Loss was due to java.lang.NullPointerException java.lang.NullPointerException at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:750) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:750) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:59) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95) at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582) at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) at org.apache.spark.scheduler.Task.run(Task.scala:51) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) {noformat} This occurs only after migrating to the 1.0.0 API. The details of the code and the data file used to test are included in this gist: https://gist.github.com/reachbach/d8977c8eb5f71f889301 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2233) make-distribution script should list the git hash in the RELEASE file
[ https://issues.apache.org/jira/browse/SPARK-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2233: --- Assignee: Guillaume Ballet make-distribution script should list the git hash in the RELEASE file - Key: SPARK-2233 URL: https://issues.apache.org/jira/browse/SPARK-2233 Project: Spark Issue Type: Improvement Components: Project Infra Reporter: Patrick Wendell Assignee: Guillaume Ballet Priority: Minor Labels: starter Fix For: 1.1.0 If someone is creating a distribution and also has a version of Spark that has a .git folder in it, we should list the current git hash and put that in the RELEASE file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2233) make-distribution script should list the git hash in the RELEASE file
[ https://issues.apache.org/jira/browse/SPARK-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2233. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1216 [https://github.com/apache/spark/pull/1216] make-distribution script should list the git hash in the RELEASE file - Key: SPARK-2233 URL: https://issues.apache.org/jira/browse/SPARK-2233 Project: Spark Issue Type: Improvement Components: Project Infra Reporter: Patrick Wendell Priority: Minor Labels: starter Fix For: 1.1.0 If someone is creating a distribution and also has a version of Spark that has a .git folder in it, we should list the current git hash and put that in the RELEASE file. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2292) NullPointerException in JavaPairRDD.mapToPair
[ https://issues.apache.org/jira/browse/SPARK-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047039#comment-14047039 ] Patrick Wendell commented on SPARK-2292: It would still be good to debug and fix this issue because there are definitely users who want to bring their own version of Spark. But a workaround in the short term is to use spark-submit, or otherwise inject the same Spark jars that are present on the cluster into the classpath when you submit your app. NullPointerException in JavaPairRDD.mapToPair - Key: SPARK-2292 URL: https://issues.apache.org/jira/browse/SPARK-2292 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Environment: Spark 1.0.0, standalone with the master and a single slave running on Ubuntu on a laptop. 4G of memory and 8 cores were available to the executor. Reporter: Bharath Ravi Kumar Attachments: SPARK-2292-aash-repro.tar.gz Correction: Invoking JavaPairRDD.mapToPair results in an NPE: {noformat} 14/06/26 21:05:35 WARN scheduler.TaskSetManager: Loss was due to java.lang.NullPointerException java.lang.NullPointerException at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:750) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:750) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:59) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95) at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582) at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) at org.apache.spark.scheduler.Task.run(Task.scala:51) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) {noformat} This occurs only after migrating to the 1.0.0 API. The details of the code and the data file used to test are included in this gist: https://gist.github.com/reachbach/d8977c8eb5f71f889301 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2003) SparkContext(SparkConf) doesn't work in pyspark
[ https://issues.apache.org/jira/browse/SPARK-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2003. Resolution: Won't Fix SparkContext(SparkConf) doesn't work in pyspark --- Key: SPARK-2003 URL: https://issues.apache.org/jira/browse/SPARK-2003 Project: Spark Issue Type: Bug Components: Documentation, PySpark Affects Versions: 1.0.0 Reporter: Diana Carroll Fix For: 1.0.1, 1.1.0 Using SparkConf with SparkContext as described in the Programming Guide does NOT work in Python: conf = SparkConf().setAppName("blah") sc = SparkContext(conf) When I tried, I got AttributeError: 'SparkConf' object has no attribute '_get_object_id' [This equivalent code in Scala works fine: val conf = new SparkConf().setAppName("blah") val sc = new SparkContext(conf)] I think this is because there's no equivalent for the Scala constructor SparkContext(SparkConf). Workaround: if I explicitly set the conf parameter in the Python call, it does work: sconf = SparkConf().setAppName("blah") sc = SparkContext(conf=sconf) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2003) SparkContext(SparkConf) doesn't work in pyspark
[ https://issues.apache.org/jira/browse/SPARK-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049456#comment-14049456 ] Patrick Wendell commented on SPARK-2003: If I understand correctly, [~dcarr...@cloudera.com] is asking us to change this API to make it more consistent with other languages. I don't see a way of doing this without breaking the existing behavior for old users (which we can't do). In Python, it's not possible to overload constructors in the same way as in Java because it's not strongly typed. I'd guess this is why Matei didn't change it when he refactored the constructor to take a configuration. For that reason I'm going to close this as Won't Fix - but if there is indeed a backwards-compatible way to do this, please feel free to re-open it with a proposal. SparkContext(SparkConf) doesn't work in pyspark --- Key: SPARK-2003 URL: https://issues.apache.org/jira/browse/SPARK-2003 Project: Spark Issue Type: Bug Components: Documentation, PySpark Affects Versions: 1.0.0 Reporter: Diana Carroll Fix For: 1.0.1, 1.1.0 Using SparkConf with SparkContext as described in the Programming Guide does NOT work in Python: conf = SparkConf().setAppName("blah") sc = SparkContext(conf) When I tried, I got AttributeError: 'SparkConf' object has no attribute '_get_object_id' [This equivalent code in Scala works fine: val conf = new SparkConf().setAppName("blah") val sc = new SparkContext(conf)] I think this is because there's no equivalent for the Scala constructor SparkContext(SparkConf). Workaround: if I explicitly set the conf parameter in the Python call, it does work: sconf = SparkConf().setAppName("blah") sc = SparkContext(conf=sconf) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-1981) Add AWS Kinesis streaming support
[ https://issues.apache.org/jira/browse/SPARK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049560#comment-14049560 ] Patrick Wendell commented on SPARK-1981: Assigned! Add AWS Kinesis streaming support - Key: SPARK-1981 URL: https://issues.apache.org/jira/browse/SPARK-1981 Project: Spark Issue Type: New Feature Components: Streaming Reporter: Chris Fregly Assignee: Chris Fregly Add AWS Kinesis support to Spark Streaming. Initial discussion occurred here: https://github.com/apache/spark/pull/223 I discussed this with Parviz from AWS recently and we agreed that I would take this over. Look for a new PR that takes into account all the feedback from the earlier PR, including a Spark-1.0-compliant implementation, AWS-license-aware build support, tests, comments, and style guide compliance. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2109) Setting SPARK_MEM for bin/pyspark does not work.
[ https://issues.apache.org/jira/browse/SPARK-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2109. Resolution: Fixed Fixed in master and 1.0 via https://github.com/apache/spark/pull/1050/files Setting SPARK_MEM for bin/pyspark does not work. - Key: SPARK-2109 URL: https://issues.apache.org/jira/browse/SPARK-2109 Project: Spark Issue Type: Bug Affects Versions: 1.0.0 Reporter: Prashant Sharma Assignee: Prashant Sharma Priority: Critical Fix For: 1.0.1, 1.1.0 prashant@sc:~/work/spark$ SPARK_MEM=10G bin/pyspark Python 2.7.6 (default, Mar 22 2014, 22:59:56) [GCC 4.8.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. Traceback (most recent call last): File "/home/prashant/work/spark/python/pyspark/shell.py", line 43, in <module> sc = SparkContext(appName="PySparkShell", pyFiles=add_files) File "/home/prashant/work/spark/python/pyspark/context.py", line 94, in __init__ SparkContext._ensure_initialized(self, gateway=gateway) File "/home/prashant/work/spark/python/pyspark/context.py", line 190, in _ensure_initialized SparkContext._gateway = gateway or launch_gateway() File "/home/prashant/work/spark/python/pyspark/java_gateway.py", line 51, in launch_gateway gateway_port = int(proc.stdout.readline()) ValueError: invalid literal for int() with base 10: 'Warning: SPARK_MEM is deprecated, please use a more specific config option\n' -- This message was sent by Atlassian JIRA (v6.2#6252)
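A minimal sketch of the failure mode in the traceback above (not Spark's actual java_gateway.py code): launch_gateway() expects the first line of the launcher JVM's stdout to be the gateway port, so any warning printed before it breaks the int() parse:
{code}
import subprocess

# Simulate a launcher whose first stdout line is a warning, not a port number.
proc = subprocess.Popen(
    ["echo", "Warning: SPARK_MEM is deprecated, please use a more specific config option"],
    stdout=subprocess.PIPE)
first_line = proc.stdout.readline()
try:
    gateway_port = int(first_line)
except ValueError as e:
    print("gateway launch would fail here:", e)
{code}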
[jira] [Resolved] (SPARK-2350) Master throws NPE
[ https://issues.apache.org/jira/browse/SPARK-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2350. Resolution: Fixed Fix Version/s: 1.0.1 Issue resolved by pull request 1289 [https://github.com/apache/spark/pull/1289] Master throws NPE - Key: SPARK-2350 URL: https://issues.apache.org/jira/browse/SPARK-2350 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Andrew Or Fix For: 1.0.1, 1.1.0 ... if we launch a driver and there are more waiting drivers to be launched. This is because we remove from a list while iterating through it. Here is the culprit from Master.scala (L487 as of the creation of this JIRA, commit bc7041a42dfa84312492ea8cae6fdeaeac4f6d1c). {code} for (driver <- waitingDrivers) { if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) { launchDriver(worker, driver) waitingDrivers -= driver } } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
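The pitfall generalizes beyond Scala; here is a minimal Python sketch of the same remove-while-iterating bug and the iterate-over-a-snapshot fix (names are illustrative and not taken from pull request 1289):
{code}
waiting_drivers = ["driver-1", "driver-2", "driver-3"]

# Buggy: removing during iteration skips elements ("driver-2" survives).
for d in waiting_drivers:
    waiting_drivers.remove(d)
print(waiting_drivers)  # ['driver-2']

# Safe: iterate over a snapshot and mutate the original list.
waiting_drivers = ["driver-1", "driver-2", "driver-3"]
for d in list(waiting_drivers):
    waiting_drivers.remove(d)
print(waiting_drivers)  # []
{code}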
[jira] [Updated] (SPARK-2350) Master throws NPE
[ https://issues.apache.org/jira/browse/SPARK-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2350: --- Assignee: Aaron Davidson Master throws NPE - Key: SPARK-2350 URL: https://issues.apache.org/jira/browse/SPARK-2350 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Andrew Or Assignee: Aaron Davidson Fix For: 1.0.1, 1.1.0 ... if we launch a driver and there are more waiting drivers to be launched. This is because we remove from a list while iterating through it. Here is the culprit from Master.scala (L487 as of the creation of this JIRA, commit bc7041a42dfa84312492ea8cae6fdeaeac4f6d1c). {code} for (driver <- waitingDrivers) { if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) { launchDriver(worker, driver) waitingDrivers -= driver } } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2307) SparkUI Storage page cached statuses incorrect
[ https://issues.apache.org/jira/browse/SPARK-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052183#comment-14052183 ] Patrick Wendell commented on SPARK-2307: There was a follow-up patch: https://github.com/apache/spark/pull/1255 SparkUI Storage page cached statuses incorrect -- Key: SPARK-2307 URL: https://issues.apache.org/jira/browse/SPARK-2307 Project: Spark Issue Type: Bug Components: Spark Core, Web UI Affects Versions: 1.1.0 Reporter: Andrew Or Assignee: Andrew Or Fix For: 1.0.1, 1.1.0 Attachments: Screen Shot 2014-06-27 at 11.09.54 AM.png See attached: the executor has 512MB, but somehow it has cached (279 + 27 + 279 + 27) = 612MB? (The correct answer is 279MB). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2350) Master throws NPE
[ https://issues.apache.org/jira/browse/SPARK-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2350: --- Fix Version/s: 0.9.2 Master throws NPE - Key: SPARK-2350 URL: https://issues.apache.org/jira/browse/SPARK-2350 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Andrew Or Assignee: Aaron Davidson Fix For: 0.9.2, 1.0.1, 1.1.0 ... if we launch a driver and there are more waiting drivers to be launched. This is because we remove from a list while iterating through it. Here is the culprit from Master.scala (L487 as of the creation of this JIRA, commit bc7041a42dfa84312492ea8cae6fdeaeac4f6d1c). {code} for (driver <- waitingDrivers) { if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) { launchDriver(worker, driver) waitingDrivers -= driver } } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2282) PySpark crashes if too many tasks complete quickly
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2282. Resolution: Fixed Fix Version/s: 1.0.0 1.0.1 0.9.2 PySpark crashes if too many tasks complete quickly -- Key: SPARK-2282 URL: https://issues.apache.org/jira/browse/SPARK-2282 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 0.9.1, 1.0.0, 1.0.1 Reporter: Aaron Davidson Assignee: Aaron Davidson Fix For: 0.9.2, 1.0.1, 1.0.0 Upon every task completion, PythonAccumulatorParam constructs a new socket to the Accumulator server running inside the pyspark daemon. This can cause a buildup of used ephemeral ports from sockets in the TIME_WAIT termination stage, which will cause the SparkContext to crash if too many tasks complete too quickly. We ran into this bug with 17k tasks completing in 15 seconds. This bug can be fixed outside of Spark by ensuring these properties are set (on a Linux server): echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle or by adding the SO_REUSEADDR option to the socket creation within Spark. -- This message was sent by Atlassian JIRA (v6.2#6252)
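A minimal sketch of the SO_REUSEADDR mitigation mentioned above, in Python rather than the Java socket code inside Spark (the loopback address and port 0 are placeholders):
{code}
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Allow binding even while old connections to the port linger in TIME_WAIT.
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
print("accumulator-style server listening on port", server.getsockname()[1])
server.close()
{code}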
[jira] [Updated] (SPARK-2282) PySpark crashes if too many tasks complete quickly
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2282: --- Affects Version/s: 0.9.1 PySpark crashes if too many tasks complete quickly -- Key: SPARK-2282 URL: https://issues.apache.org/jira/browse/SPARK-2282 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 0.9.1, 1.0.0, 1.0.1 Reporter: Aaron Davidson Assignee: Aaron Davidson Fix For: 0.9.2, 1.0.0, 1.0.1 Upon every task completion, PythonAccumulatorParam constructs a new socket to the Accumulator server running inside the pyspark daemon. This can cause a buildup of used ephemeral ports from sockets in the TIME_WAIT termination stage, which will cause the SparkContext to crash if too many tasks complete too quickly. We ran into this bug with 17k tasks completing in 15 seconds. This bug can be fixed outside of Spark by ensuring these properties are set (on a Linux server): echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle or by adding the SO_REUSEADDR option to the socket creation within Spark. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-1199) Type mismatch in Spark shell when using case class defined in shell
[ https://issues.apache.org/jira/browse/SPARK-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1199. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 Resolved via: https://github.com/apache/spark/pull/1179 Type mismatch in Spark shell when using case class defined in shell --- Key: SPARK-1199 URL: https://issues.apache.org/jira/browse/SPARK-1199 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0 Reporter: Andrew Kerr Assignee: Prashant Sharma Priority: Blocker Fix For: 1.0.1, 1.1.0 Define a class in the shell: {code} case class TestClass(a:String) {code} and an RDD {code} val data = sc.parallelize(Seq("a")).map(TestClass(_)) {code} define a function on it and map over the RDD {code} def itemFunc(a:TestClass):TestClass = a data.map(itemFunc) {code} Error: {code} <console>:19: error: type mismatch; found : TestClass => TestClass required: TestClass => ? data.map(itemFunc) {code} Similarly with a mapPartitions: {code} def partitionFunc(a:Iterator[TestClass]):Iterator[TestClass] = a data.mapPartitions(partitionFunc) {code} {code} <console>:19: error: type mismatch; found : Iterator[TestClass] => Iterator[TestClass] required: Iterator[TestClass] => Iterator[?] Error occurred in an application involving default arguments. data.mapPartitions(partitionFunc) {code} The behavior is the same whether in local mode or on a cluster. This isn't specific to RDDs. A Scala collection in the Spark shell has the same problem. {code} scala> Seq(TestClass("foo")).map(itemFunc) <console>:15: error: type mismatch; found : TestClass => TestClass required: TestClass => ? Seq(TestClass("foo")).map(itemFunc) ^ {code} When run in the Scala console (not the Spark shell) there are no type mismatch errors. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2380) Support displaying accumulator contents in the web UI
Patrick Wendell created SPARK-2380: -- Summary: Support displaying accumulator contents in the web UI Key: SPARK-2380 URL: https://issues.apache.org/jira/browse/SPARK-2380 Project: Spark Issue Type: Improvement Components: Spark Core, Web UI Reporter: Patrick Wendell Assignee: Patrick Wendell -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2290) Do not send SPARK_HOME from workers to executors
[ https://issues.apache.org/jira/browse/SPARK-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2290: --- Issue Type: Improvement (was: Bug) Do not send SPARK_HOME from workers to executors Key: SPARK-2290 URL: https://issues.apache.org/jira/browse/SPARK-2290 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: YanTang Zhai Priority: Minor The client path is /data/home/spark/test/spark-1.0.0 while the worker deploy path is /data/home/spark/spark-1.0.0, which is different from the client path. Then an application is launched using ./bin/spark-submit --class JobTaskJoin --master spark://172.25.38.244:7077 --executor-memory 128M ../jobtaskjoin_2.10-1.0.0.jar. However, the application fails because an exception occurs: java.io.IOException: Cannot run program "/data/home/spark/test/spark-1.0.0-bin-0.20.2-cdh3u3/bin/compute-classpath.sh" (in directory "."): error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:759) at org.apache.spark.deploy.worker.CommandUtils$.buildJavaOpts(CommandUtils.scala:72) at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:37) at org.apache.spark.deploy.worker.ExecutorRunner.getCommandSeq(ExecutorRunner.scala:109) at org.apache.spark.deploy.worker.ExecutorRunner.fetchAndRunExecutor(ExecutorRunner.scala:124) at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:58) Caused by: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.<init>(UNIXProcess.java:135) at java.lang.ProcessImpl.start(ProcessImpl.java:130) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021) ... 6 more Therefore, I think the worker should not use appDesc.sparkHome when handling LaunchExecutor; instead, the worker could use its own sparkHome directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2290) Do not send SPARK_HOME from workers to executors
[ https://issues.apache.org/jira/browse/SPARK-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2290: --- Priority: Major (was: Minor) Do not send SPARK_HOME from workers to executors Key: SPARK-2290 URL: https://issues.apache.org/jira/browse/SPARK-2290 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: YanTang Zhai Assignee: Patrick Wendell The client path is /data/home/spark/test/spark-1.0.0 while the worker deploy path is /data/home/spark/spark-1.0.0, which is different from the client path. Then an application is launched using ./bin/spark-submit --class JobTaskJoin --master spark://172.25.38.244:7077 --executor-memory 128M ../jobtaskjoin_2.10-1.0.0.jar. However, the application fails because an exception occurs: java.io.IOException: Cannot run program "/data/home/spark/test/spark-1.0.0-bin-0.20.2-cdh3u3/bin/compute-classpath.sh" (in directory "."): error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:759) at org.apache.spark.deploy.worker.CommandUtils$.buildJavaOpts(CommandUtils.scala:72) at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:37) at org.apache.spark.deploy.worker.ExecutorRunner.getCommandSeq(ExecutorRunner.scala:109) at org.apache.spark.deploy.worker.ExecutorRunner.fetchAndRunExecutor(ExecutorRunner.scala:124) at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:58) Caused by: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.<init>(UNIXProcess.java:135) at java.lang.ProcessImpl.start(ProcessImpl.java:130) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021) ... 6 more Therefore, I think the worker should not use appDesc.sparkHome when handling LaunchExecutor; instead, the worker could use its own sparkHome directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2290) Do not send SPARK_HOME from workers to executors
[ https://issues.apache.org/jira/browse/SPARK-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2290: --- Assignee: Patrick Wendell Do not send SPARK_HOME from workers to executors Key: SPARK-2290 URL: https://issues.apache.org/jira/browse/SPARK-2290 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: YanTang Zhai Assignee: Patrick Wendell Priority: Minor The client path is /data/home/spark/test/spark-1.0.0 while the worker deploy path is /data/home/spark/spark-1.0.0, which is different from the client path. Then an application is launched using ./bin/spark-submit --class JobTaskJoin --master spark://172.25.38.244:7077 --executor-memory 128M ../jobtaskjoin_2.10-1.0.0.jar. However, the application fails because an exception occurs: java.io.IOException: Cannot run program "/data/home/spark/test/spark-1.0.0-bin-0.20.2-cdh3u3/bin/compute-classpath.sh" (in directory "."): error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:759) at org.apache.spark.deploy.worker.CommandUtils$.buildJavaOpts(CommandUtils.scala:72) at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:37) at org.apache.spark.deploy.worker.ExecutorRunner.getCommandSeq(ExecutorRunner.scala:109) at org.apache.spark.deploy.worker.ExecutorRunner.fetchAndRunExecutor(ExecutorRunner.scala:124) at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:58) Caused by: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.<init>(UNIXProcess.java:135) at java.lang.ProcessImpl.start(ProcessImpl.java:130) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021) ... 6 more Therefore, I think the worker should not use appDesc.sparkHome when handling LaunchExecutor; instead, the worker could use its own sparkHome directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2290) Do not send SPARK_HOME from workers to executors
[ https://issues.apache.org/jira/browse/SPARK-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2290: --- Summary: Do not send SPARK_HOME from workers to executors (was: Worker should directly use its own sparkHome instead of appDesc.sparkHome when LaunchExecutor) Do not send SPARK_HOME from workers to executors Key: SPARK-2290 URL: https://issues.apache.org/jira/browse/SPARK-2290 Project: Spark Issue Type: Bug Components: Spark Core Reporter: YanTang Zhai Priority: Minor The client path is /data/home/spark/test/spark-1.0.0 while the worker deploy path is /data/home/spark/spark-1.0.0, which is different from the client path. Then an application is launched using ./bin/spark-submit --class JobTaskJoin --master spark://172.25.38.244:7077 --executor-memory 128M ../jobtaskjoin_2.10-1.0.0.jar. However, the application fails because an exception occurs: java.io.IOException: Cannot run program "/data/home/spark/test/spark-1.0.0-bin-0.20.2-cdh3u3/bin/compute-classpath.sh" (in directory "."): error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:759) at org.apache.spark.deploy.worker.CommandUtils$.buildJavaOpts(CommandUtils.scala:72) at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:37) at org.apache.spark.deploy.worker.ExecutorRunner.getCommandSeq(ExecutorRunner.scala:109) at org.apache.spark.deploy.worker.ExecutorRunner.fetchAndRunExecutor(ExecutorRunner.scala:124) at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:58) Caused by: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.<init>(UNIXProcess.java:135) at java.lang.ProcessImpl.start(ProcessImpl.java:130) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021) ... 6 more Therefore, I think the worker should not use appDesc.sparkHome when handling LaunchExecutor; instead, the worker could use its own sparkHome directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2290) Do not send SPARK_HOME from workers to executors
[ https://issues.apache.org/jira/browse/SPARK-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054633#comment-14054633 ] Patrick Wendell commented on SPARK-2290: I updated the description here. It is indeed pretty strange that we ship this to the cluster when it's always available there anyway (since the Worker has its own sparkHome). So we should just remove it. Do not send SPARK_HOME from workers to executors Key: SPARK-2290 URL: https://issues.apache.org/jira/browse/SPARK-2290 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: YanTang Zhai Assignee: Patrick Wendell The client path is /data/home/spark/test/spark-1.0.0 while the worker deploy path is /data/home/spark/spark-1.0.0, which is different from the client path. Then an application is launched using ./bin/spark-submit --class JobTaskJoin --master spark://172.25.38.244:7077 --executor-memory 128M ../jobtaskjoin_2.10-1.0.0.jar. However, the application fails because an exception occurs: java.io.IOException: Cannot run program "/data/home/spark/test/spark-1.0.0-bin-0.20.2-cdh3u3/bin/compute-classpath.sh" (in directory "."): error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:759) at org.apache.spark.deploy.worker.CommandUtils$.buildJavaOpts(CommandUtils.scala:72) at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:37) at org.apache.spark.deploy.worker.ExecutorRunner.getCommandSeq(ExecutorRunner.scala:109) at org.apache.spark.deploy.worker.ExecutorRunner.fetchAndRunExecutor(ExecutorRunner.scala:124) at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:58) Caused by: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.<init>(UNIXProcess.java:135) at java.lang.ProcessImpl.start(ProcessImpl.java:130) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021) ... 6 more Therefore, I think the worker should not use appDesc.sparkHome when handling LaunchExecutor; instead, the worker could use its own sparkHome directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2348) In Windows, having an environment variable named 'classpath' gives an error
[ https://issues.apache.org/jira/browse/SPARK-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2348: --- Assignee: Chirag Todarka In Windows, having an environment variable named 'classpath' gives an error --- Key: SPARK-2348 URL: https://issues.apache.org/jira/browse/SPARK-2348 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Environment: Windows 7 Enterprise Reporter: Chirag Todarka Assignee: Chirag Todarka Operating System: Windows 7 Enterprise If an environment variable named 'classpath' is set, then starting 'spark-shell' gives the error below: mydir\spark\bin>spark-shell Failed to initialize compiler: object scala.runtime in compiler mirror not found. ** Note that as of 2.8 scala does not assume use of the java classpath. ** For the old behavior pass -usejavacp to scala, or if using a Settings ** object programatically, settings.usejavacp.value = true. 14/07/02 14:22:06 WARN SparkILoop$SparkILoopInterpreter: Warning: compiler accessed before init set up. Assuming no postInit code. Failed to initialize compiler: object scala.runtime in compiler mirror not found. ** Note that as of 2.8 scala does not assume use of the java classpath. ** For the old behavior pass -usejavacp to scala, or if using a Settings ** object programatically, settings.usejavacp.value = true. Exception in thread "main" java.lang.AssertionError: assertion failed: null at scala.Predef$.assert(Predef.scala:179) at org.apache.spark.repl.SparkIMain.initializeSynchronous(SparkIMain.scala:202) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:929) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2379) stopReceiver in dead loop causes StackOverflowError
[ https://issues.apache.org/jira/browse/SPARK-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2379: --- Component/s: Streaming stopReceiver in dead loop causes StackOverflowError --- Key: SPARK-2379 URL: https://issues.apache.org/jira/browse/SPARK-2379 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.0.0 Reporter: sunshangchun In streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceiverSupervisor.scala, stop will call stopReceiver, and stopReceiver will call stop if an exception occurs, which makes a dead loop. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2409) Make SQLConf thread safe
[ https://issues.apache.org/jira/browse/SPARK-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2409: --- Component/s: SQL Make SQLConf thread safe Key: SPARK-2409 URL: https://issues.apache.org/jira/browse/SPARK-2409 Project: Spark Issue Type: Improvement Components: SQL Reporter: Reynold Xin Assignee: Reynold Xin -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2201) Improve FlumeInputDStream's stability and make it scalable
[ https://issues.apache.org/jira/browse/SPARK-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2201: --- Component/s: Streaming Improve FlumeInputDStream's stability and make it scalable -- Key: SPARK-2201 URL: https://issues.apache.org/jira/browse/SPARK-2201 Project: Spark Issue Type: Improvement Components: Streaming Reporter: sunshangchun Currently: FlumeUtils.createStream(ssc, "localhost", port). This means that only one Flume receiver can work with FlumeInputDStream, so the solution is not scalable. I use ZooKeeper to solve this problem. Spark Flume receivers register themselves to a ZooKeeper path when started, and a Flume agent gets the physical hosts and pushes events to them. Some work needs to be done here: 1. Receivers create temporary nodes in ZooKeeper; listeners just watch those temporary nodes. 2. When Spark FlumeReceivers start, they acquire a physical host (localhost's IP and an idle port) and register themselves with ZooKeeper. 3. A new Flume sink: in its appendEvents method, it gets the physical hosts and pushes data to them in a round-robin manner (see the sketch below). -- This message was sent by Atlassian JIRA (v6.2#6252)
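A hypothetical sketch of the proposed sink-side behavior described in item 3 above, assuming the receiver endpoints have already been read from ZooKeeper (the hosts, ports, and function names are invented for illustration):
{code}
from itertools import cycle

# Endpoints the receivers would have registered under the ZooKeeper path.
receivers = [("10.0.0.1", 41414), ("10.0.0.2", 41414)]
round_robin = cycle(receivers)

def append_events(batch):
    # Push each batch to the next receiver in round-robin order.
    host, port = next(round_robin)
    print("pushing %d events to %s:%d" % (len(batch), host, port))

append_events(["event-1", "event-2"])
append_events(["event-3"])
{code}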
[jira] [Updated] (SPARK-2414) Remove jquery
[ https://issues.apache.org/jira/browse/SPARK-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2414: --- Component/s: Web UI Remove jquery - Key: SPARK-2414 URL: https://issues.apache.org/jira/browse/SPARK-2414 Project: Spark Issue Type: Improvement Components: Web UI Reporter: Reynold Xin Assignee: Reynold Xin Priority: Minor SPARK-2384 introduces jQuery for tooltip display. We can probably just create a very simple JavaScript tooltip instead of pulling in jQuery. https://github.com/apache/spark/pull/1314 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2381) streaming receiver crashed, but it seems nothing happened
[ https://issues.apache.org/jira/browse/SPARK-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2381: --- Component/s: Streaming streaming receiver crashed, but it seems nothing happened - Key: SPARK-2381 URL: https://issues.apache.org/jira/browse/SPARK-2381 Project: Spark Issue Type: Bug Components: Streaming Reporter: sunshangchun When we submit a streaming job and the receivers don't start normally, the application should stop itself. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2017) web ui stage page becomes unresponsive when the number of tasks is large
[ https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2017: --- Component/s: Web UI web ui stage page becomes unresponsive when the number of tasks is large Key: SPARK-2017 URL: https://issues.apache.org/jira/browse/SPARK-2017 Project: Spark Issue Type: Sub-task Components: Web UI Reporter: Reynold Xin Labels: starter {code} sc.parallelize(1 to 1000000, 1000000).count() {code} The above code creates one million tasks to be executed. The stage detail web ui page takes forever to load (if it ever completes). There are again a few different alternatives: 0. Limit the number of tasks we show. 1. Pagination (see the sketch below). 2. By default only show the aggregate metrics and failed tasks, and hide the successful ones. -- This message was sent by Atlassian JIRA (v6.2#6252)
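A minimal sketch of alternative 1 (pagination): render only a fixed-size page of task rows instead of all one million (the page size and row contents are placeholders, not from the Spark UI code):
{code}
tasks = ["task-%d" % i for i in range(1000000)]  # stand-ins for task rows

def page_of_tasks(page, page_size=100):
    # Slice out just the rows for the requested page.
    start = page * page_size
    return tasks[start:start + page_size]

rows = page_of_tasks(page=3)
print("rendering %d rows, first: %s" % (len(rows), rows[0]))
{code}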
[jira] [Updated] (SPARK-2345) ForEachDStream should have an option of running the foreachfunc on Spark
[ https://issues.apache.org/jira/browse/SPARK-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2345: --- Component/s: Streaming ForEachDStream should have an option of running the foreachfunc on Spark Key: SPARK-2345 URL: https://issues.apache.org/jira/browse/SPARK-2345 Project: Spark Issue Type: Bug Components: Streaming Reporter: Hari Shreedharan Today the generated Job simply calls the foreachfunc but does not run it on Spark itself using the sparkContext.runJob method. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2338) Jenkins Spark-Master-Maven-with-YARN builds failing due to test misconfiguration
[ https://issues.apache.org/jira/browse/SPARK-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2338. Resolution: Fixed Assignee: Pete MacKinnon Thanks a ton for getting to the bottom of this. I was super confused why the tests were so messed up even though this seems totally obvious in retrospect. I went ahead and updated the build configuration. There are some failing tests in MLlib in Maven; I'll try to track those down as well to get this all green. Jenkins Spark-Master-Maven-with-YARN builds failing due to test misconfiguration Key: SPARK-2338 URL: https://issues.apache.org/jira/browse/SPARK-2338 Project: Spark Issue Type: Bug Components: Build, Project Infra, YARN Affects Versions: 1.0.0 Environment: https://amplab.cs.berkeley.edu/jenkins Reporter: Pete MacKinnon Assignee: Pete MacKinnon Labels: hadoop2, jenkins, maven, protobuf, yarn https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/hadoop.version=2.2.0,label=centos/ https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/hadoop.version=2.3.0,label=centos/ These builds are currently failing due to the builder configuration being incomplete. After building, they specify the test command as: {noformat} /home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.0.5/bin/mvn -Dhadoop.version=2.3.0 -Dlabel=centos test -Pyarn -Phive {noformat} However, it is not enough to specify the hadoop.version; the tests should instead be run using the hadoop-2.2 and hadoop-2.3 profiles respectively. For example: {noformat} /home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.0.5/bin/mvn -Phadoop-2.2 -Dlabel=centos test -Pyarn -Phive {noformat} These profiles will not only set the appropriate hadoop.version but also set the version of protobuf-java required by yarn (2.5.0). Without the correct profile set, the test run fails at: {noformat} *** RUN ABORTED *** java.lang.VerifyError: class org.apache.hadoop.yarn.proto.YarnProtos$LocalResourceProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet; {noformat} since it is getting the default version of protobuf-java (2.4.1) which has the old incompatible version of getUnknownFields. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2416) Allow richer reporting of unit test results
Patrick Wendell created SPARK-2416: -- Summary: Allow richer reporting of unit test results Key: SPARK-2416 URL: https://issues.apache.org/jira/browse/SPARK-2416 Project: Spark Issue Type: Sub-task Components: Project Infra Reporter: Patrick Wendell Assignee: Patrick Wendell The built-in Jenkins integration is pretty bad. It's very confusing to users whether tests have passed or failed, and we can't easily customize the message. With some small scripting around the Github API we can do much better than this. -- This message was sent by Atlassian JIRA (v6.2#6252)
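A hypothetical sketch of the "small scripting around the Github API" idea: posting a custom pass/fail message as a commit status on a pull request's head commit (the token, SHA, and context string below are placeholders, not values from this issue):
{code}
import requests

TOKEN = "..."   # a GitHub API token with permission to set commit statuses
REPO = "apache/spark"
SHA = "abc123"  # head commit of the pull request under test

requests.post(
    "https://api.github.com/repos/%s/statuses/%s" % (REPO, SHA),
    headers={"Authorization": "token %s" % TOKEN},
    json={"state": "success",
          "description": "All automated tests passed",
          "context": "jenkins/spark-tests"})
{code}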
[jira] [Updated] (SPARK-2152) Error computing rightNodeAgg in the decision tree algorithm in Spark MLlib
[ https://issues.apache.org/jira/browse/SPARK-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2152: --- Assignee: Jon Sondag Error computing rightNodeAgg in the decision tree algorithm in Spark MLlib Key: SPARK-2152 URL: https://issues.apache.org/jira/browse/SPARK-2152 Project: Spark Issue Type: Bug Affects Versions: 1.0.0 Environment: Windows 7, 32-bit, 3GB RAM Reporter: caoli Assignee: Jon Sondag Labels: features Fix For: 1.0.1, 1.1.0 Original Estimate: 4h Remaining Estimate: 4h In the function extractLeftRightNodeAggregates(), the binData index used when computing rightNodeAgg is wrong. In the DecisionTree.scala file, around line 980: rightNodeAgg(featureIndex)(2 * (numBins - 2 - splitIndex)) = binData(shift + (2 * (numBins - 2 - splitIndex))) + rightNodeAgg(featureIndex)(2 * (numBins - 1 - splitIndex)) The index computed by binData(shift + (2 * (numBins - 2 - splitIndex))) is wrong, so the result of rightNodeAgg includes repeated data about bins. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2152) Error computing rightNodeAgg in the decision tree algorithm in Spark MLlib
[ https://issues.apache.org/jira/browse/SPARK-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056157#comment-14056157 ] Patrick Wendell commented on SPARK-2152: FYI, this caused some new test failures; I created SPARK-2417 to track it. Error computing rightNodeAgg in the decision tree algorithm in Spark MLlib Key: SPARK-2152 URL: https://issues.apache.org/jira/browse/SPARK-2152 Project: Spark Issue Type: Bug Affects Versions: 1.0.0 Environment: Windows 7, 32-bit, 3GB RAM Reporter: caoli Assignee: Jon Sondag Labels: features Fix For: 1.0.1, 1.1.0 Original Estimate: 4h Remaining Estimate: 4h In the function extractLeftRightNodeAggregates(), the binData index used when computing rightNodeAgg is wrong. In the DecisionTree.scala file, around line 980: rightNodeAgg(featureIndex)(2 * (numBins - 2 - splitIndex)) = binData(shift + (2 * (numBins - 2 - splitIndex))) + rightNodeAgg(featureIndex)(2 * (numBins - 1 - splitIndex)) The index computed by binData(shift + (2 * (numBins - 2 - splitIndex))) is wrong, so the result of rightNodeAgg includes repeated data about bins. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2417) Decision tree tests fail in maven build
Patrick Wendell created SPARK-2417: -- Summary: Decision tree tests fail in maven build Key: SPARK-2417 URL: https://issues.apache.org/jira/browse/SPARK-2417 Project: Spark Issue Type: Bug Components: MLlib Reporter: Patrick Wendell Assignee: Xiangrui Meng After SPARK-2152 was merged, these tests started failing in Jenkins: {code} - classification stump with all categorical variables *** FAILED *** org.scalatest.exceptions.TestFailedException was thrown. (DecisionTreeSuite.scala:257) - regression stump with all categorical variables *** FAILED *** org.scalatest.exceptions.TestFailedException was thrown. (DecisionTreeSuite.scala:284) {code} https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-pre-YARN/97/hadoop.version=1.0.4,label=centos/console -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2417) Decision tree tests are failing
[ https://issues.apache.org/jira/browse/SPARK-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2417: --- Summary: Decision tree tests are failing (was: Decision tree tests fail in maven build) Decision tree tests are failing --- Key: SPARK-2417 URL: https://issues.apache.org/jira/browse/SPARK-2417 Project: Spark Issue Type: Bug Components: MLlib Reporter: Patrick Wendell Assignee: Xiangrui Meng After SPARK-2152 was merged, these tests started failing in Jenkins: {code} - classification stump with all categorical variables *** FAILED *** org.scalatest.exceptions.TestFailedException was thrown. (DecisionTreeSuite.scala:257) - regression stump with all categorical variables *** FAILED *** org.scalatest.exceptions.TestFailedException was thrown. (DecisionTreeSuite.scala:284) {code} https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-pre-YARN/97/hadoop.version=1.0.4,label=centos/console -- This message was sent by Atlassian JIRA (v6.2#6252)