[jira] [Created] (SPARK-2162) possible to read from removed block in blockmanager
Raymond Liu created SPARK-2162: -- Summary: possible to read from removed block in blockmanager Key: SPARK-2162 URL: https://issues.apache.org/jira/browse/SPARK-2162 Project: Spark Issue Type: Bug Components: Block Manager Reporter: Raymond Liu Priority: Minor In BlockManager's doGetLocal method, there is a chance that the block info is removed before the info.synchronized block is entered. Thus it will either read in vain in the memory-level case, or throw an exception in the disk-level case, because it believes the block is there while it has actually been removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
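[Editorial note] The race described above is the classic check-then-act pattern: the caller checks that a block exists, but another thread removes it before the lock is taken. A minimal Python sketch of the defensive re-check that avoids it (hypothetical names; Spark's actual BlockManager is Scala and more involved):

```python
import threading

class BlockStore:
    """Toy stand-in for BlockManager's block-info map (illustrative only)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._blocks = {}

    def put(self, block_id, data):
        with self._lock:
            self._blocks[block_id] = data

    def remove(self, block_id):
        with self._lock:
            self._blocks.pop(block_id, None)

    def get_local(self, block_id):
        # Re-check under the lock: between a caller's earlier "is it there?"
        # check and this call, another thread may have removed the block.
        with self._lock:
            return self._blocks.get(block_id)  # None if removed, never raises

store = BlockStore()
store.put("rdd_0_0", b"bytes")
store.remove("rdd_0_0")
print(store.get_local("rdd_0_0"))  # None, not an exception
```

The key point is that absence is reported as a normal value under the lock, rather than assumed impossible after an earlier unsynchronized check.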
[jira] [Resolved] (SPARK-2130) Clarify PySpark docs for RDD.getStorageLevel
[ https://issues.apache.org/jira/browse/SPARK-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2130. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1096 [https://github.com/apache/spark/pull/1096] Clarify PySpark docs for RDD.getStorageLevel Key: SPARK-2130 URL: https://issues.apache.org/jira/browse/SPARK-2130 Project: Spark Issue Type: Improvement Components: PySpark Affects Versions: 1.0.0 Reporter: Nicholas Chammas Assignee: Kan Zhang Priority: Minor Fix For: 1.1.0 The [PySpark docs for RDD.getStorageLevel|http://spark.apache.org/docs/1.0.0/api/python/pyspark.rdd.RDD-class.html#getStorageLevel] are unclear. {quote} rdd1 = sc.parallelize([1,2]) rdd1.getStorageLevel() StorageLevel(False, False, False, False, 1) {quote} What do the 5 values of False, False, False, False, 1 mean? -- This message was sent by Atlassian JIRA (v6.2#6252)
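[Editorial note] For reference, the five positional values in that repr correspond (in PySpark of this era, to the best of my understanding; check pyspark.storagelevel for the authoritative order) to useDisk, useMemory, useOffHeap, deserialized, and replication. A small illustrative sketch, not the real class:

```python
from collections import namedtuple

# Hypothetical mirror of PySpark's StorageLevel fields; the real class lives
# in pyspark.storagelevel. The five printed values map to these names.
StorageLevel = namedtuple(
    "StorageLevel",
    ["useDisk", "useMemory", "useOffHeap", "deserialized", "replication"],
)

# The level reported for an RDD that is not persisted anywhere: no disk,
# no memory, no off-heap, not deserialized, a single replica.
level = StorageLevel(False, False, False, False, 1)
print(level.useMemory)    # False: not cached in memory
print(level.replication)  # 1: a single copy
```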
[jira] [Updated] (SPARK-1990) spark-ec2 should only need Python 2.6, not 2.7
[ https://issues.apache.org/jira/browse/SPARK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1990: --- Assignee: Anant Daksh Asthana spark-ec2 should only need Python 2.6, not 2.7 -- Key: SPARK-1990 URL: https://issues.apache.org/jira/browse/SPARK-1990 Project: Spark Issue Type: Improvement Reporter: Matei Zaharia Assignee: Anant Daksh Asthana Labels: Starter Fix For: 1.0.1, 1.1.0 There were some posts on the lists reporting that spark-ec2 does not work with Python 2.6. In addition, we should check the Python version at the top of the script and exit if it's too old. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-1990) spark-ec2 should only need Python 2.6, not 2.7
[ https://issues.apache.org/jira/browse/SPARK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1990: --- Fix Version/s: 0.9.2 spark-ec2 should only need Python 2.6, not 2.7 -- Key: SPARK-1990 URL: https://issues.apache.org/jira/browse/SPARK-1990 Project: Spark Issue Type: Improvement Reporter: Matei Zaharia Assignee: Anant Daksh Asthana Labels: Starter Fix For: 0.9.2, 1.0.1, 1.1.0 There were some posts on the lists reporting that spark-ec2 does not work with Python 2.6. In addition, we should check the Python version at the top of the script and exit if it's too old. -- This message was sent by Atlassian JIRA (v6.2#6252)
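[Editorial note] The version guard proposed in SPARK-1990 could look something like this (a hedged sketch; the actual spark-ec2 script may word its check differently):

```python
import sys

MIN_VERSION = (2, 6)

def check_python_version():
    # Exit early with a clear message instead of failing later with a
    # confusing SyntaxError or missing-feature error deep in the script.
    if sys.version_info[:2] < MIN_VERSION:
        sys.stderr.write(
            "spark-ec2 requires Python %d.%d or newer; found %s\n"
            % (MIN_VERSION + (sys.version.split()[0],))
        )
        sys.exit(1)

check_python_version()
print("Python version OK")
```

Running the check at the very top of the script is the point: nothing version-dependent executes before it.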
[jira] [Updated] (SPARK-2035) Make a stage's call stack available on the UI
[ https://issues.apache.org/jira/browse/SPARK-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2035: --- Assignee: Daniel Darabos Make a stage's call stack available on the UI - Key: SPARK-2035 URL: https://issues.apache.org/jira/browse/SPARK-2035 Project: Spark Issue Type: Improvement Components: Web UI Reporter: Daniel Darabos Assignee: Daniel Darabos Priority: Minor Fix For: 1.1.0 Attachments: example-html.tgz Currently the stage table displays the file name and line number that is the call site that triggered the given stage. This is enormously useful for understanding the execution. But once a project adds utility classes and other indirections, the call site can become less meaningful, because the interesting line is further up the stack. An idea to fix this is to display the entire call stack that triggered the stage. It would be collapsed by default and could be revealed with a click. I have started working on this. It is a good way to learn about how the RDD interface ties into the UI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2035) Make a stage's call stack available on the UI
[ https://issues.apache.org/jira/browse/SPARK-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2035. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 981 [https://github.com/apache/spark/pull/981] Make a stage's call stack available on the UI - Key: SPARK-2035 URL: https://issues.apache.org/jira/browse/SPARK-2035 Project: Spark Issue Type: Improvement Components: Web UI Reporter: Daniel Darabos Priority: Minor Fix For: 1.1.0 Attachments: example-html.tgz Currently the stage table displays the file name and line number that is the call site that triggered the given stage. This is enormously useful for understanding the execution. But once a project adds utility classes and other indirections, the call site can become less meaningful, because the interesting line is further up the stack. An idea to fix this is to display the entire call stack that triggered the stage. It would be collapsed by default and could be revealed with a click. I have started working on this. It is a good way to learn about how the RDD interface ties into the UI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2155) Support effectful / non-deterministic key expressions in CASE WHEN statements
[ https://issues.apache.org/jira/browse/SPARK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2155: Priority: Minor (was: Major) Support effectful / non-deterministic key expressions in CASE WHEN statements - Key: SPARK-2155 URL: https://issues.apache.org/jira/browse/SPARK-2155 Project: Spark Issue Type: Bug Components: SQL Reporter: Zongheng Yang Priority: Minor Currently we translate CASE KEY WHEN to CASE WHEN, hence incurring redundant evaluations of the key expression. Relevant discussions here: https://github.com/apache/spark/pull/1055/files#r13784248 If we really need support for effectful key expressions, we can at least resort to the baseline approach of having both CaseWhen and CaseKeyWhen as expressions, though that seems to introduce much code duplication (e.g. see https://github.com/concretevitamin/spark/blob/47d406a58d129e5bba68bfadf9dd1faa9054d834/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L216 for a sketch implementation). -- This message was sent by Atlassian JIRA (v6.2#6252)
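[Editorial note] The redundant-evaluation problem is easy to demonstrate: translating CASE KEY WHEN v1 ... into CASE WHEN key = v1 ... turns one evaluation of the key into one per branch tried. A hedged Python model (illustrative, not Catalyst code):

```python
calls = {"n": 0}

def key():
    # Stand-in for an effectful / non-deterministic key expression.
    calls["n"] += 1
    return 2

def case_when(branches, default=None):
    # CASE WHEN form: each branch carries its own predicate, so a translated
    # "CASE KEY WHEN" re-evaluates the key expression inside every predicate.
    for predicate, result in branches:
        if predicate():
            return result
    return default

result = case_when([
    (lambda: key() == 1, "one"),
    (lambda: key() == 2, "two"),
])
print(result)      # "two"
print(calls["n"])  # 2: the key ran twice, not once
```

With a non-deterministic key (e.g. rand()), the branches could even compare against different key values, which is why a dedicated CaseKeyWhen expression matters.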
[jira] [Commented] (SPARK-2160) error of Decision tree algorithm in Spark MLlib
[ https://issues.apache.org/jira/browse/SPARK-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033550#comment-14033550 ] Sean Owen commented on SPARK-2160: -- You already added this as https://issues.apache.org/jira/browse/SPARK-2152 right? error of Decision tree algorithm in Spark MLlib -- Key: SPARK-2160 URL: https://issues.apache.org/jira/browse/SPARK-2160 Project: Spark Issue Type: Bug Components: MLlib Affects Versions: 1.0.0 Reporter: caoli Labels: patch Fix For: 1.1.0 Original Estimate: 4h Remaining Estimate: 4h There is an error in computing rightNodeAgg in the decision tree algorithm in Spark MLlib: in the function extractLeftRightNodeAggregates(), the binData index used when computing rightNodeAgg is wrong. Around line 980 of DecisionTree.scala: rightNodeAgg(featureIndex)(2 * (numBins - 2 - splitIndex)) = binData(shift + (2 * (numBins - 2 - splitIndex))) + rightNodeAgg(featureIndex)(2 * (numBins - 1 - splitIndex)) The index binData(shift + (2 * (numBins - 2 - splitIndex))) is computed incorrectly, so the resulting rightNodeAgg contains repeated bin data. -- This message was sent by Atlassian JIRA (v6.2#6252)
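[Editorial note] The left/right node aggregates discussed here are, in essence, prefix and suffix sums over per-bin statistics; an off-by-one in the suffix indexing double-counts bins on the right side. A simplified sketch of a correct construction (illustrative, not MLlib's actual code):

```python
def left_right_aggregates(bin_counts):
    """For each candidate split index i, left[i] sums bins 0..i and
    right[i] sums bins i+1..end; together they cover every bin exactly once."""
    n = len(bin_counts)
    left = [0] * (n - 1)
    right = [0] * (n - 1)
    acc = 0
    for i in range(n - 1):          # prefix sums for the left child
        acc += bin_counts[i]
        left[i] = acc
    acc = 0
    for i in range(n - 2, -1, -1):  # suffix sums for the right child
        acc += bin_counts[i + 1]    # note i + 1: the bin just past the split
        right[i] = acc
    return left, right

left, right = left_right_aggregates([5, 3, 2, 4])
print(left)   # [5, 8, 10]
print(right)  # [9, 6, 4]
```

A useful invariant check: for every split, left[i] + right[i] equals the total count; a repeated-bin bug like the one reported breaks exactly this invariant.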
[jira] [Updated] (SPARK-2144) SparkUI Executors tab displays incorrect RDD blocks
[ https://issues.apache.org/jira/browse/SPARK-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2144: --- Component/s: (was: Spark Core) SparkUI Executors tab displays incorrect RDD blocks --- Key: SPARK-2144 URL: https://issues.apache.org/jira/browse/SPARK-2144 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 1.0.0 Reporter: Andrew Or Assignee: Andrew Or Fix For: 1.0.1, 1.1.0 If a block is dropped because of memory pressure, this is not reflected in the RDD Blocks column on the Executors page. This is because StorageStatusListener updates the StorageLevel of the dropped block to StorageLevel.None, but does not remove it from the list. This is a simple fix. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2144) SparkUI Executors tab displays incorrect RDD blocks
[ https://issues.apache.org/jira/browse/SPARK-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2144: --- Assignee: Andrew Or SparkUI Executors tab displays incorrect RDD blocks --- Key: SPARK-2144 URL: https://issues.apache.org/jira/browse/SPARK-2144 Project: Spark Issue Type: Bug Components: Spark Core, Web UI Affects Versions: 1.0.0 Reporter: Andrew Or Assignee: Andrew Or Fix For: 1.0.1, 1.1.0 If a block is dropped because of memory pressure, this is not reflected in the RDD Blocks column on the Executors page. This is because StorageStatusListener updates the StorageLevel of the dropped block to StorageLevel.None, but does not remove it from the list. This is a simple fix. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2144) SparkUI Executors tab displays incorrect RDD blocks
[ https://issues.apache.org/jira/browse/SPARK-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2144. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1080 [https://github.com/apache/spark/pull/1080] SparkUI Executors tab displays incorrect RDD blocks --- Key: SPARK-2144 URL: https://issues.apache.org/jira/browse/SPARK-2144 Project: Spark Issue Type: Bug Components: Spark Core, Web UI Affects Versions: 1.0.0 Reporter: Andrew Or Assignee: Andrew Or Fix For: 1.0.1, 1.1.0 If a block is dropped because of memory pressure, this is not reflected in the RDD Blocks column on the Executors page. This is because StorageStatusListener updates the StorageLevel of the dropped block to StorageLevel.None, but does not remove it from the list. This is a simple fix. -- This message was sent by Atlassian JIRA (v6.2#6252)
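[Editorial note] The fix described for SPARK-2144 amounts to treating a drop to "no storage" as removal rather than an update. A hedged Python sketch of that listener behavior (hypothetical names; not the actual StorageStatusListener code):

```python
class StorageStatusListener:
    """Toy model of per-executor block bookkeeping for the Executors page."""
    def __init__(self):
        self.blocks = {}  # block_id -> storage level

    def on_block_update(self, block_id, level):
        # A block dropped under memory pressure arrives with no storage level;
        # removing it (instead of storing a "None" level) keeps the
        # "RDD Blocks" count on the UI accurate.
        if level is None:
            self.blocks.pop(block_id, None)
        else:
            self.blocks[block_id] = level

listener = StorageStatusListener()
listener.on_block_update("rdd_1_0", "MEMORY_ONLY")
listener.on_block_update("rdd_1_0", None)  # dropped under memory pressure
print(len(listener.blocks))  # 0: the dropped block no longer counts
```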
[jira] [Commented] (SPARK-1353) IllegalArgumentException when writing to disk
[ https://issues.apache.org/jira/browse/SPARK-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033607#comment-14033607 ] Gavin commented on SPARK-1353: -- I have the same problem. spark-assembly-1.0.0-hadoop2.2.0 . IllegalArgumentException when writing to disk - Key: SPARK-1353 URL: https://issues.apache.org/jira/browse/SPARK-1353 Project: Spark Issue Type: Bug Components: Block Manager Environment: AWS EMR 3.2.30-49.59.amzn1.x86_64 #1 SMP x86_64 GNU/Linux Spark 1.0.0-SNAPSHOT built for Hadoop 1.0.4 built 2014-03-18 Reporter: Jim Blomo Priority: Minor The Executor may fail when trying to mmap a file bigger than Integer.MAX_VALUE due to the constraints of FileChannel.map (http://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html#map(java.nio.channels.FileChannel.MapMode, long, long)). The signature takes longs, but the size value must be less than MAX_VALUE. This manifests with the following backtrace: java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:828) at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:98) at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:337) at org.apache.spark.storage.BlockManager.getLocal(BlockManager.scala:281) at org.apache.spark.storage.BlockManager.get(BlockManager.scala:430) at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:38) at org.apache.spark.rdd.RDD.iterator(RDD.scala:220) at org.apache.spark.api.python.PythonRDD$$anon$2.run(PythonRDD.scala:85) -- This message was sent by Atlassian JIRA (v6.2#6252)
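[Editorial note] The 2 GiB ceiling comes from FileChannel.map bounding its size argument by Integer.MAX_VALUE even though the parameter is declared as a long. The usual workaround is to map or read such files in sub-2 GiB chunks; a small illustrative Python sketch of the chunking arithmetic (not Spark's implementation):

```python
MAX_CHUNK = 2**31 - 1  # Integer.MAX_VALUE: the per-mapping ceiling in Java

def chunk_ranges(total_size, max_chunk=MAX_CHUNK):
    """Split [0, total_size) into (offset, length) pairs, each <= max_chunk,
    so every range could be handed to a bounded map/read call."""
    ranges = []
    offset = 0
    while offset < total_size:
        length = min(max_chunk, total_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges

# A 5 GiB block needs three mappings under the (2 GiB - 1 byte) ceiling.
ranges = chunk_ranges(5 * 2**30)
print(len(ranges))  # 3
print(all(length <= MAX_CHUNK for _, length in ranges))  # True
```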
[jira] [Created] (SPARK-2163) Change ``setConvergenceTol'' with a parameter of type Double instead of Int
Gang Bai created SPARK-2163: --- Summary: Change ``setConvergenceTol'' with a parameter of type Double instead of Int Key: SPARK-2163 URL: https://issues.apache.org/jira/browse/SPARK-2163 Project: Spark Issue Type: Improvement Components: MLlib Affects Versions: 1.0.0 Reporter: Gang Bai The class LBFGS in mllib.optimization currently provides a {{setConvergenceTol(tolerance: Int)}} method for setting the convergence tolerance. The tolerance parameter is of type {{Int}}. The specified tolerance is then used as a parameter in calling {{LBFGS.runLBFGS}}, where the parameter {{convergenceTol}} is of type {{Double}}. The Int parameter may cause problems when one creates an optimizer and sets a Double-valued tolerance, e.g.: {code:borderStyle=solid} override val optimizer = new LBFGS(gradient, updater) .setNumCorrections(9) .setConvergenceTol(1e-4) // *type mismatch here* .setMaxNumIterations(100) .setRegParam(1.0) {code} IMHO there is no need to make the tolerance of type Int. Let's change it into a Double parameter and eliminate the type mismatch problem. -- This message was sent by Atlassian JIRA (v6.2#6252)
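[Editorial note] In Scala the Int parameter is rejected at compile time, as shown above; but the underlying hazard is that a fractional tolerance cannot survive an integer parameter at all. A hypothetical Python sketch of the same API mistake, where the loss is silent instead of a compile error (made-up names, not MLlib's classes):

```python
class Optimizer:
    """Toy builder illustrating why an integer tolerance parameter is wrong."""
    def __init__(self):
        self._tol = 1

    def set_convergence_tol(self, tolerance):
        # Mimics a parameter declared as Int: fractional tolerances are lost.
        self._tol = int(tolerance)
        return self

    def set_convergence_tol_fixed(self, tolerance):
        # The proposed fix: accept a floating-point tolerance as-is.
        self._tol = float(tolerance)
        return self

opt = Optimizer()
print(opt.set_convergence_tol(1e-4)._tol)        # 0: tolerance destroyed
print(opt.set_convergence_tol_fixed(1e-4)._tol)  # 0.0001
```

A tolerance of 0 would make the convergence check meaningless, which is why widening the parameter to Double is the right fix.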
[jira] [Updated] (SPARK-2163) Set ``setConvergenceTol'' with a parameter of type Double instead of Int
[ https://issues.apache.org/jira/browse/SPARK-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Bai updated SPARK-2163: Summary: Set ``setConvergenceTol'' with a parameter of type Double instead of Int (was: Change ``setConvergenceTol'' with a parameter of type Double instead of Int) Set ``setConvergenceTol'' with a parameter of type Double instead of Int Key: SPARK-2163 URL: https://issues.apache.org/jira/browse/SPARK-2163 Project: Spark Issue Type: Improvement Components: MLlib Affects Versions: 1.0.0 Reporter: Gang Bai The class LBFGS in mllib.optimization currently provides a {{setConvergenceTol(tolerance: Int)}} method for setting the convergence tolerance. The tolerance parameter is of type {{Int}}. The specified tolerance is then used as a parameter in calling {{LBFGS.runLBFGS}}, where the parameter {{convergenceTol}} is of type {{Double}}. The Int parameter may cause problems when one creates an optimizer and sets a Double-valued tolerance, e.g.: {code:borderStyle=solid} override val optimizer = new LBFGS(gradient, updater) .setNumCorrections(9) .setConvergenceTol(1e-4) // *type mismatch here* .setMaxNumIterations(100) .setRegParam(1.0) {code} IMHO there is no need to make the tolerance of type Int. Let's change it into a Double parameter and eliminate the type mismatch problem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-1353) IllegalArgumentException when writing to disk
[ https://issues.apache.org/jira/browse/SPARK-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033625#comment-14033625 ] Mridul Muralidharan commented on SPARK-1353: This is due to a limitation in Spark which is being addressed in https://issues.apache.org/jira/browse/SPARK-1476. IllegalArgumentException when writing to disk - Key: SPARK-1353 URL: https://issues.apache.org/jira/browse/SPARK-1353 Project: Spark Issue Type: Bug Components: Block Manager Environment: AWS EMR 3.2.30-49.59.amzn1.x86_64 #1 SMP x86_64 GNU/Linux Spark 1.0.0-SNAPSHOT built for Hadoop 1.0.4 built 2014-03-18 Reporter: Jim Blomo Priority: Minor The Executor may fail when trying to mmap a file bigger than Integer.MAX_VALUE due to the constraints of FileChannel.map (http://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html#map(java.nio.channels.FileChannel.MapMode, long, long)). The signature takes longs, but the size value must be less than MAX_VALUE. This manifests with the following backtrace: java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:828) at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:98) at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:337) at org.apache.spark.storage.BlockManager.getLocal(BlockManager.scala:281) at org.apache.spark.storage.BlockManager.get(BlockManager.scala:430) at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:38) at org.apache.spark.rdd.RDD.iterator(RDD.scala:220) at org.apache.spark.api.python.PythonRDD$$anon$2.run(PythonRDD.scala:85) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2164) Applying UDF on a struct throws a MatchError
[ https://issues.apache.org/jira/browse/SPARK-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2164. - Resolution: Fixed Fixed by: https://github.com/apache/spark/pull/796 Applying UDF on a struct throws a MatchError Key: SPARK-2164 URL: https://issues.apache.org/jira/browse/SPARK-2164 Project: Spark Issue Type: Bug Components: SQL Reporter: Michael Armbrust Fix For: 1.0.1, 1.1.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2164) Applying UDF on a struct throws a MatchError
Michael Armbrust created SPARK-2164: --- Summary: Applying UDF on a struct throws a MatchError Key: SPARK-2164 URL: https://issues.apache.org/jira/browse/SPARK-2164 Project: Spark Issue Type: Bug Components: SQL Reporter: Michael Armbrust Fix For: 1.0.1, 1.1.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2053) Add Catalyst expression for CASE WHEN
[ https://issues.apache.org/jira/browse/SPARK-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2053. - Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 Add Catalyst expression for CASE WHEN - Key: SPARK-2053 URL: https://issues.apache.org/jira/browse/SPARK-2053 Project: Spark Issue Type: Improvement Components: SQL Reporter: Michael Armbrust Assignee: Zongheng Yang Fix For: 1.0.1, 1.1.0 Here's a rough start: https://github.com/marmbrus/spark/commit/1209daaf49b0a87e7f68f89c79d02b446e624db3 -- This message was sent by Atlassian JIRA (v6.2#6252)
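[Editorial note] A CASE WHEN expression of the kind being added here is commonly modeled as a flat sequence of alternating conditions and values with an optional trailing else. A simplified Python model of that evaluation (illustrative only; see the linked commit for the actual Catalyst expression):

```python
def eval_case_when(branches, else_value=None):
    """branches: flat list [cond1, val1, cond2, val2, ...], mirroring how a
    CASE WHEN expression's children can be laid out in an expression tree.
    The first branch whose condition holds wins; otherwise the else value."""
    assert len(branches) % 2 == 0, "conditions and values must pair up"
    for i in range(0, len(branches), 2):
        if branches[i]:
            return branches[i + 1]
    return else_value

x = 7
result = eval_case_when([x < 5, "small", x < 10, "medium"], "large")
print(result)  # "medium"
```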
[jira] [Commented] (SPARK-2163) Set ``setConvergenceTol'' with a parameter of type Double instead of Int
[ https://issues.apache.org/jira/browse/SPARK-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033703#comment-14033703 ] Gang Bai commented on SPARK-2163: - I've created a pull request on GitHub for this. https://github.com/apache/spark/pull/1104 Set ``setConvergenceTol'' with a parameter of type Double instead of Int Key: SPARK-2163 URL: https://issues.apache.org/jira/browse/SPARK-2163 Project: Spark Issue Type: Improvement Components: MLlib Affects Versions: 1.0.0 Reporter: Gang Bai The class LBFGS in mllib.optimization currently provides a {{setConvergenceTol(tolerance: Int)}} method for setting the convergence tolerance. The tolerance parameter is of type {{Int}}. The specified tolerance is then used as parameter in calling {{LBFGS.runLBFGS}}, where the parameter {{convergenceTol}} is of type {{Double}}. The Int parameter may cause problem when one creates an optimizer and sets a Double-valued tolerance. e.g: {code:borderStyle=solid} override val optimizer = new LBFGS(gradient, updater) .setNumCorrections(9) .setConvergenceTol(1e-4) // *type mismatch here* .setMaxNumIterations(100) .setRegParam(1.0) {code} IMHO there is no need to make the tolerance of type Int. Let's change it into a Double parameter and eliminate the type mismatch problem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-1471) Worker not recognize Driver state at standalone mode
[ https://issues.apache.org/jira/browse/SPARK-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033938#comment-14033938 ] Federico Ragona commented on SPARK-1471: Hello, I'm facing the same issue in version 1.0.0 (built from the sources distribution using {{make-distribution.sh --hadoop 2.0.0-cdh4.7.0}}). I'm running a job using the new {{bin/spark-submit}} script. When the job fails, one of the workers dies with the following error: {code} 2014-06-17 17:00:04,675 [sparkWorker-akka.actor.default-dispatcher-3] ERROR akka.actor.OneForOneStrategy - FAILED (of class scala.Enumeration$Val) scala.MatchError: FAILED (of class scala.Enumeration$Val) at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:317) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} Worker not recognize Driver state at standalone mode - Key: SPARK-1471 URL: https://issues.apache.org/jira/browse/SPARK-1471 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 0.9.0 Environment: standalone Reporter: shenhong When I run a spark job in standalone, ./bin/spark-class org.apache.spark.deploy.Client launch spark://v125050024.bja:7077 file:///home/yuling.sh/spark-0.9.0-incubating/examples/target/spark-examples_2.10-0.9.0-incubating.jar org.apache.spark.examples.SparkPi Here is the Worker log. 
14/04/11 11:15:04 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val) scala.MatchError: FAILED (of class scala.Enumeration$Val) at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- This message was sent by Atlassian JIRA (v6.2#6252)
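[Editorial note] The scala.MatchError above arises because the Worker's receive handler pattern-matches on driver state changes without covering every state value (here FAILED). A loose Python analogy using dict dispatch, where a missing state raises just as a non-exhaustive match does (hypothetical state names, not the actual DriverState enumeration):

```python
handlers = {
    "FINISHED": lambda: "cleanup",
    "KILLED": lambda: "cleanup",
    "ERROR": lambda: "cleanup",
    # "FAILED" missing: the analogue of the non-exhaustive match
}

def on_state_changed(state):
    # Raises KeyError for FAILED, much like scala.MatchError in the Worker.
    return handlers[state]()

def on_state_changed_fixed(state):
    # The fix: handle every terminal state (or add a catch-all branch).
    return handlers.get(state, lambda: "cleanup")()

try:
    on_state_changed("FAILED")
except KeyError as exc:
    print("unhandled state:", exc)

print(on_state_changed_fixed("FAILED"))  # cleanup
```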
[jira] [Commented] (SPARK-1471) Worker not recognize Driver state at standalone mode
[ https://issues.apache.org/jira/browse/SPARK-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033984#comment-14033984 ] Nan Zhu commented on SPARK-1471: I will fix it right now Worker not recognize Driver state at standalone mode - Key: SPARK-1471 URL: https://issues.apache.org/jira/browse/SPARK-1471 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 0.9.0 Environment: standalone Reporter: shenhong When I run a spark job in standalone, ./bin/spark-class org.apache.spark.deploy.Client launch spark://v125050024.bja:7077 file:///home/yuling.sh/spark-0.9.0-incubating/examples/target/spark-examples_2.10-0.9.0-incubating.jar org.apache.spark.examples.SparkPi Here is the Worker log. 14/04/11 11:15:04 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val) scala.MatchError: FAILED (of class scala.Enumeration$Val) at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-1471) Worker not recognize Driver state at standalone mode
[ https://issues.apache.org/jira/browse/SPARK-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033986#comment-14033986 ] Nan Zhu commented on SPARK-1471: this has been fixed by https://github.com/apache/spark/commit/95e4c9c6fb153b7f0aa4c442c4bdb6552d326640 Worker not recognize Driver state at standalone mode - Key: SPARK-1471 URL: https://issues.apache.org/jira/browse/SPARK-1471 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 0.9.0 Environment: standalone Reporter: shenhong When I run a spark job in standalone, ./bin/spark-class org.apache.spark.deploy.Client launch spark://v125050024.bja:7077 file:///home/yuling.sh/spark-0.9.0-incubating/examples/target/spark-examples_2.10-0.9.0-incubating.jar org.apache.spark.examples.SparkPi Here is the Worker log. 14/04/11 11:15:04 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val) scala.MatchError: FAILED (of class scala.Enumeration$Val) at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-1471) Worker not recognize Driver state at standalone mode
[ https://issues.apache.org/jira/browse/SPARK-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li resolved SPARK-1471. Resolution: Fixed Worker not recognize Driver state at standalone mode - Key: SPARK-1471 URL: https://issues.apache.org/jira/browse/SPARK-1471 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 0.9.0 Environment: standalone Reporter: shenhong When I run a spark job in standalone, ./bin/spark-class org.apache.spark.deploy.Client launch spark://v125050024.bja:7077 file:///home/yuling.sh/spark-0.9.0-incubating/examples/target/spark-examples_2.10-0.9.0-incubating.jar org.apache.spark.examples.SparkPi Here is the Worker log. 14/04/11 11:15:04 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val) scala.MatchError: FAILED (of class scala.Enumeration$Val) at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-1471) Worker not recognize Driver state at standalone mode
[ https://issues.apache.org/jira/browse/SPARK-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-1471: --- Fix Version/s: 1.0.0 Worker not recognize Driver state at standalone mode - Key: SPARK-1471 URL: https://issues.apache.org/jira/browse/SPARK-1471 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 0.9.0 Environment: standalone Reporter: shenhong Fix For: 1.0.1 When I run a spark job in standalone, ./bin/spark-class org.apache.spark.deploy.Client launch spark://v125050024.bja:7077 file:///home/yuling.sh/spark-0.9.0-incubating/examples/target/spark-examples_2.10-0.9.0-incubating.jar org.apache.spark.examples.SparkPi Here is the Worker log. 14/04/11 11:15:04 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val) scala.MatchError: FAILED (of class scala.Enumeration$Val) at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-1471) Worker not recognize Driver state at standalone mode
[ https://issues.apache.org/jira/browse/SPARK-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-1471: --- Fix Version/s: (was: 1.0.0) 1.0.1 Worker not recognize Driver state at standalone mode - Key: SPARK-1471 URL: https://issues.apache.org/jira/browse/SPARK-1471 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 0.9.0 Environment: standalone Reporter: shenhong Fix For: 1.0.1 When I run a spark job in standalone, ./bin/spark-class org.apache.spark.deploy.Client launch spark://v125050024.bja:7077 file:///home/yuling.sh/spark-0.9.0-incubating/examples/target/spark-examples_2.10-0.9.0-incubating.jar org.apache.spark.examples.SparkPi Here is the Worker log. 14/04/11 11:15:04 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val) scala.MatchError: FAILED (of class scala.Enumeration$Val) at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-1471) Worker not recognize Driver state at standalone mode
[ https://issues.apache.org/jira/browse/SPARK-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-1471: --- Fix Version/s: 1.1.0 Worker not recognize Driver state at standalone mode - Key: SPARK-1471 URL: https://issues.apache.org/jira/browse/SPARK-1471 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 0.9.0 Environment: standalone Reporter: shenhong Fix For: 1.0.1, 1.1.0 When I run a spark job in standalone, ./bin/spark-class org.apache.spark.deploy.Client launch spark://v125050024.bja:7077 file:///home/yuling.sh/spark-0.9.0-incubating/examples/target/spark-examples_2.10-0.9.0-incubating.jar org.apache.spark.examples.SparkPi Here is the Worker log. 14/04/11 11:15:04 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val) scala.MatchError: FAILED (of class scala.Enumeration$Val) at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2058) SPARK_CONF_DIR should override all present configs
[ https://issues.apache.org/jira/browse/SPARK-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034036#comment-14034036 ] Ryan Fishel commented on SPARK-2058: Was Eugen's fix implemented? SPARK_CONF_DIR should override all present configs -- Key: SPARK-2058 URL: https://issues.apache.org/jira/browse/SPARK-2058 Project: Spark Issue Type: Improvement Components: Deploy Affects Versions: 1.0.0, 1.0.1, 1.1.0 Reporter: Eugen Cepoi Priority: Trivial Fix For: 1.0.1, 1.1.0 When the user defines SPARK_CONF_DIR, I think Spark should use all the configs available there, not only spark-env. This involves changing SparkSubmitArguments to first read from SPARK_CONF_DIR, and updating the scripts to add SPARK_CONF_DIR to the computed classpath for configs such as log4j, metrics, etc. I have already prepared a PR for this. -- This message was sent by Atlassian JIRA (v6.2#6252)
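The lookup order the issue proposes can be sketched as follows. This is a hypothetical helper, not the actual SparkSubmitArguments code; the file name and fallback to SPARK_HOME/conf are assumptions based on the description.

```scala
import java.io.File

// Hypothetical sketch of the proposed resolution order: prefer SPARK_CONF_DIR
// when the user has set it, otherwise fall back to SPARK_HOME/conf.
def defaultPropertiesFile(env: Map[String, String]): Option[File] = {
  env.get("SPARK_CONF_DIR")
    .orElse(env.get("SPARK_HOME").map(_ + File.separator + "conf"))
    .map(dir => new File(dir, "spark-defaults.conf"))
    .filter(_.isFile) // only return a file that actually exists
}
```

The same precedence would also have to be applied in the shell scripts that build the classpath, so log4j and metrics configs in SPARK_CONF_DIR win as well.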
[jira] [Updated] (SPARK-1199) Type mismatch in Spark shell when using case class defined in shell
[ https://issues.apache.org/jira/browse/SPARK-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1199: --- Fix Version/s: (was: 1.1.0) Type mismatch in Spark shell when using case class defined in shell --- Key: SPARK-1199 URL: https://issues.apache.org/jira/browse/SPARK-1199 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0 Reporter: Andrew Kerr Assignee: Prashant Sharma Priority: Blocker Define a class in the shell: {code} case class TestClass(a:String) {code} and an RDD {code} val data = sc.parallelize(Seq("a")).map(TestClass(_)) {code} define a function on it and map over the RDD {code} def itemFunc(a:TestClass):TestClass = a data.map(itemFunc) {code} Error: {code} <console>:19: error: type mismatch; found : TestClass => TestClass required: TestClass => ? data.map(itemFunc) {code} Similarly with a mapPartitions: {code} def partitionFunc(a:Iterator[TestClass]):Iterator[TestClass] = a data.mapPartitions(partitionFunc) {code} {code} <console>:19: error: type mismatch; found : Iterator[TestClass] => Iterator[TestClass] required: Iterator[TestClass] => Iterator[?] Error occurred in an application involving default arguments. data.mapPartitions(partitionFunc) {code} The behavior is the same whether in local mode or on a cluster. This isn't specific to RDDs. A Scala collection in the Spark shell has the same problem. {code} scala> Seq(TestClass("foo")).map(itemFunc) <console>:15: error: type mismatch; found : TestClass => TestClass required: TestClass => ? Seq(TestClass("foo")).map(itemFunc) ^ {code} When run in the Scala console (not the Spark shell) there are no type mismatch errors. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (SPARK-1199) Type mismatch in Spark shell when using case class defined in shell
[ https://issues.apache.org/jira/browse/SPARK-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reassigned SPARK-1199: -- Assignee: Prashant Sharma Prashant said he could look into this - so I'm assigning it to him. Type mismatch in Spark shell when using case class defined in shell --- Key: SPARK-1199 URL: https://issues.apache.org/jira/browse/SPARK-1199 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0 Reporter: Andrew Kerr Assignee: Prashant Sharma Priority: Blocker Define a class in the shell: {code} case class TestClass(a:String) {code} and an RDD {code} val data = sc.parallelize(Seq("a")).map(TestClass(_)) {code} define a function on it and map over the RDD {code} def itemFunc(a:TestClass):TestClass = a data.map(itemFunc) {code} Error: {code} <console>:19: error: type mismatch; found : TestClass => TestClass required: TestClass => ? data.map(itemFunc) {code} Similarly with a mapPartitions: {code} def partitionFunc(a:Iterator[TestClass]):Iterator[TestClass] = a data.mapPartitions(partitionFunc) {code} {code} <console>:19: error: type mismatch; found : Iterator[TestClass] => Iterator[TestClass] required: Iterator[TestClass] => Iterator[?] Error occurred in an application involving default arguments. data.mapPartitions(partitionFunc) {code} The behavior is the same whether in local mode or on a cluster. This isn't specific to RDDs. A Scala collection in the Spark shell has the same problem. {code} scala> Seq(TestClass("foo")).map(itemFunc) <console>:15: error: type mismatch; found : TestClass => TestClass required: TestClass => ? Seq(TestClass("foo")).map(itemFunc) ^ {code} When run in the Scala console (not the Spark shell) there are no type mismatch errors. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-1199) Type mismatch in Spark shell when using case class defined in shell
[ https://issues.apache.org/jira/browse/SPARK-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1199: --- Target Version/s: 1.0.1, 1.1.0 Type mismatch in Spark shell when using case class defined in shell --- Key: SPARK-1199 URL: https://issues.apache.org/jira/browse/SPARK-1199 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 0.9.0 Reporter: Andrew Kerr Assignee: Prashant Sharma Priority: Blocker Define a class in the shell: {code} case class TestClass(a:String) {code} and an RDD {code} val data = sc.parallelize(Seq("a")).map(TestClass(_)) {code} define a function on it and map over the RDD {code} def itemFunc(a:TestClass):TestClass = a data.map(itemFunc) {code} Error: {code} <console>:19: error: type mismatch; found : TestClass => TestClass required: TestClass => ? data.map(itemFunc) {code} Similarly with a mapPartitions: {code} def partitionFunc(a:Iterator[TestClass]):Iterator[TestClass] = a data.mapPartitions(partitionFunc) {code} {code} <console>:19: error: type mismatch; found : Iterator[TestClass] => Iterator[TestClass] required: Iterator[TestClass] => Iterator[?] Error occurred in an application involving default arguments. data.mapPartitions(partitionFunc) {code} The behavior is the same whether in local mode or on a cluster. This isn't specific to RDDs. A Scala collection in the Spark shell has the same problem. {code} scala> Seq(TestClass("foo")).map(itemFunc) <console>:15: error: type mismatch; found : TestClass => TestClass required: TestClass => ? Seq(TestClass("foo")).map(itemFunc) ^ {code} When run in the Scala console (not the Spark shell) there are no type mismatch errors. -- This message was sent by Atlassian JIRA (v6.2#6252)
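For reference, the reported snippet in plain Scala (outside the wrapper classes the Spark shell generates) compiles and runs fine, whether the function is passed directly or eta-expanded at the call site. This is an illustration of the expected behavior, not a confirmed fix for the shell:

```scala
// Plain-Scala version of the snippet from the report. Outside the 0.9 Spark
// shell both forms compile; the explicit lambda is the form sometimes used to
// work around shell-wrapper type mismatches.
case class TestClass(a: String)

def itemFunc(a: TestClass): TestClass = a

val direct   = Seq(TestClass("foo")).map(itemFunc)         // direct reference
val expanded = Seq(TestClass("foo")).map(x => itemFunc(x)) // eta-expanded form
```

Both produce `Seq(TestClass("foo"))`, confirming the bug is in how the shell wraps definitions, not in the user code itself.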
[jira] [Commented] (SPARK-2157) Can't write tight firewall rules for Spark
[ https://issues.apache.org/jira/browse/SPARK-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034135#comment-14034135 ] Andrew Ash commented on SPARK-2157: --- I pulled together Egor's work for HttpBroadcast and HttpFileServer and added configuration options for the block manager and the repl class server in this PR: https://github.com/apache/spark/pull/1107 Can't write tight firewall rules for Spark -- Key: SPARK-2157 URL: https://issues.apache.org/jira/browse/SPARK-2157 Project: Spark Issue Type: Bug Components: Deploy, Spark Core Affects Versions: 1.0.0 Reporter: Andrew Ash Priority: Critical In order to run Spark in places with strict firewall rules, you need to be able to specify every port that's used between all parts of the stack. Per the [network activity section of the docs|http://spark.apache.org/docs/latest/spark-standalone.html#configuring-ports-for-network-security] most of the ports are configurable, but there are a few ports that aren't configurable. We need to make every port configurable to a particular port, so that we can run Spark in highly locked-down environments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2165) spark on yarn: add support for setting maxAppAttempts in the ApplicationSubmissionContext
Thomas Graves created SPARK-2165: Summary: spark on yarn: add support for setting maxAppAttempts in the ApplicationSubmissionContext Key: SPARK-2165 URL: https://issues.apache.org/jira/browse/SPARK-2165 Project: Spark Issue Type: Improvement Components: YARN Affects Versions: 1.0.0 Reporter: Thomas Graves Hadoop 2.x adds support for allowing the application to specify the maximum application attempts. We should add support for it by setting it in the ApplicationSubmissionContext. -- This message was sent by Atlassian JIRA (v6.2#6252)
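A sketch of what the resolution logic could look like. The config key name (`spark.yarn.maxAppAttempts`) and the clamping against YARN's global `yarn.resourcemanager.am.max-attempts` limit are assumptions; the real change would ultimately pass the resolved value to `ApplicationSubmissionContext.setMaxAppAttempts` on Hadoop 2.x.

```scala
// Hypothetical helper deciding what to hand to
// ApplicationSubmissionContext.setMaxAppAttempts. Both the key name and the
// clamping rule are assumptions, not the final design.
def resolveMaxAppAttempts(requested: Option[Int], yarnGlobalMax: Int): Int =
  requested match {
    case Some(n) => math.min(math.max(n, 1), yarnGlobalMax) // stay within YARN's limit
    case None    => yarnGlobalMax                           // keep YARN's default
  }
```

YARN silently caps per-application attempts at the global maximum anyway, so clamping on the Spark side mainly makes the effective value visible to the user.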
[jira] [Created] (SPARK-2166) Enumerating instances to be terminated before prompting the users to continue.
Jean-Martin Archer created SPARK-2166: - Summary: Enumerating instances to be terminated before prompting the users to continue. Key: SPARK-2166 URL: https://issues.apache.org/jira/browse/SPARK-2166 Project: Spark Issue Type: Improvement Components: EC2 Affects Versions: 0.9.0 Reporter: Jean-Martin Archer Priority: Minor When destroying a cluster, the user will be prompted for confirmation without first being shown which instances will be terminated. Pull Request: https://github.com/apache/spark/pull/270#issuecomment-46341975 This pull request will list the EC2 instances before destroying the cluster. This was added because it can be scary to destroy EC2 instances without knowing which ones will be affected. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-1907) spark-submit: add exec at the end of the script
[ https://issues.apache.org/jira/browse/SPARK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-1907: - Assignee: Colin Patrick McCabe spark-submit: add exec at the end of the script --- Key: SPARK-1907 URL: https://issues.apache.org/jira/browse/SPARK-1907 Project: Spark Issue Type: Improvement Components: Deploy Affects Versions: 1.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Add an 'exec' at the end of the spark-submit script, to avoid keeping a bash process hanging around while it runs. This makes ps look a little bit nicer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2167) spark-submit should return exit code based on failure/success
Thomas Graves created SPARK-2167: Summary: spark-submit should return exit code based on failure/success Key: SPARK-2167 URL: https://issues.apache.org/jira/browse/SPARK-2167 Project: Spark Issue Type: Bug Components: Deploy Affects Versions: 1.0.0 Reporter: Thomas Graves Fix For: 1.1.0 The spark-submit script and Java class should exit with 0 on success and non-zero on failure so that other command line tools and workflow managers (like Oozie) can properly tell whether the Spark app succeeded or failed. -- This message was sent by Atlassian JIRA (v6.2#6252)
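The requested contract can be sketched in a few lines. This is an illustration of the behavior being asked for, not Spark's actual submission code:

```scala
// Sketch (not Spark's real code) of the requested contract: run the
// application body and map success/failure to a process exit code that
// workflow managers such as Oozie can inspect.
def exitCodeFor(body: () => Unit): Int =
  try {
    body()
    0 // success
  } catch {
    case e: Throwable =>
      System.err.println(s"Application failed: ${e.getMessage}")
      1 // non-zero so callers can detect the failure
  }
```

The launcher would then call `System.exit(exitCodeFor(...))` so the shell script's own exit status propagates the result.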
[jira] [Resolved] (SPARK-1907) spark-submit: add exec at the end of the script
[ https://issues.apache.org/jira/browse/SPARK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-1907. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 spark-submit: add exec at the end of the script --- Key: SPARK-1907 URL: https://issues.apache.org/jira/browse/SPARK-1907 Project: Spark Issue Type: Improvement Components: Deploy Affects Versions: 1.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 1.0.1, 1.1.0 Add an 'exec' at the end of the spark-submit script, to avoid keeping a bash process hanging around while it runs. This makes ps look a little bit nicer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2022) Spark 1.0.0 is failing if mesos.coarse set to true
[ https://issues.apache.org/jira/browse/SPARK-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034294#comment-14034294 ] Sebastien Rainville commented on SPARK-2022: I'm seeing the same behavior when trying to set spark.executor.extraLibraryPath: in conf/spark-defaults.conf: spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native the error message in stderr: WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:00:55.592289 27091 fetcher.cpp:73] Fetching URI 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:00:55.592428 27091 fetcher.cpp:99] Downloading resource from 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' to '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:01:05.170714 27091 fetcher.cpp:61] Extracted resource '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' into '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2' WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:01:06.105363 27166 exec.cpp:131] Version: 0.18.0 I0617 16:01:06.112191 27175 exec.cpp:205] Executor registered on slave 201311011608-1369465866-5050-9189-86 Spark assembly has been built with Hive, including Datanucleus jars on classpath Exception in thread main java.lang.NumberFormatException: For input string: ca1-dcc1-0106.lab.mtl at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:492) at java.lang.Integer.parseInt(Integer.java:527) at 
scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229) at scala.collection.immutable.StringOps.toInt(StringOps.scala:31) at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:135) at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala) here is the command in stdout: Registered executor on ca1-dcc1-0106.lab.mtl Starting task 9 Forked command at 27178 sh -c 'cd spark-1*; ./bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend -Djava.library.path=/usr/lib/hadoop/lib/native akka.tcp://sp...@ca1-dcc1-0071.lab.mtl:32789/user/CoarseGrainedScheduler 201311011608-1369465866-5050-9189-86 ca1-dcc1-0106.lab.mtl 1' Command exited with status 1 (pid: 27178) Spark 1.0.0 is failing if mesos.coarse set to true -- Key: SPARK-2022 URL: https://issues.apache.org/jira/browse/SPARK-2022 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.0.0 Reporter: Marek Wiewiorka Priority: Critical more stderr --- WARNING: Logging before InitGoogleLogging() is written to STDERR I0603 16:07:53.721132 61192 exec.cpp:131] Version: 0.18.2 I0603 16:07:53.725230 61200 exec.cpp:205] Executor registered on slave 201405220917-134217738-5050-27119-0 Exception in thread main java.lang.NumberFormatException: For input string: sparkseq003.cloudapp.net at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:492) at java.lang.Integer.parseInt(Integer.java:527) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229) at scala.collection.immutable.StringOps.toInt(StringOps.scala:31) at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:135) at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala) more stdout --- Registered executor on sparkseq003.cloudapp.net Starting task 5 Forked command at 61202 sh -c 
'/home/mesos/spark-1.0.0/bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend -Dspark.mesos.coarse=true akka.tcp://sp...@sparkseq001.cloudapp.net:40312/user/CoarseG rainedScheduler 201405220917-134217738-5050-27119-0 sparkseq003.cloudapp.net 4' Command exited with status 1 (pid: 61202) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (SPARK-2022) Spark 1.0.0 is failing if mesos.coarse set to true
[ https://issues.apache.org/jira/browse/SPARK-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034294#comment-14034294 ] Sebastien Rainville edited comment on SPARK-2022 at 6/17/14 8:08 PM: - I'm seeing the same behavior when trying to set spark.executor.extraLibraryPath: in conf/spark-defaults.conf: {noformat} spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native {noformat} the error message in stderr: {noformat} WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:00:55.592289 27091 fetcher.cpp:73] Fetching URI 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:00:55.592428 27091 fetcher.cpp:99] Downloading resource from 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' to '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:01:05.170714 27091 fetcher.cpp:61] Extracted resource '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' into '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2' WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:01:06.105363 27166 exec.cpp:131] Version: 0.18.0 I0617 16:01:06.112191 27175 exec.cpp:205] Executor registered on slave 201311011608-1369465866-5050-9189-86 Spark assembly has been built with Hive, including Datanucleus jars on classpath Exception in thread main java.lang.NumberFormatException: For input string: ca1-dcc1-0106.lab.mtl at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:492) at 
java.lang.Integer.parseInt(Integer.java:527) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229) at scala.collection.immutable.StringOps.toInt(StringOps.scala:31) at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:135) at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala) {noformat} here is the command in stdout: {noformat} Registered executor on ca1-dcc1-0106.lab.mtl Starting task 9 Forked command at 27178 sh -c 'cd spark-1*; ./bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend -Djava.library.path=/usr/lib/hadoop/lib/native akka.tcp://sp...@ca1-dcc1-0071.lab.mtl:32789/user/CoarseGrainedScheduler 201311011608-1369465866-5050-9189-86 ca1-dcc1-0106.lab.mtl 1' Command exited with status 1 (pid: 27178) {noformat} was (Author: srainville): I'm seeing the same behavior when trying to set spark.executor.extraLibraryPath: in conf/spark-defaults.conf: {noformat} spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native {noformat} the error message in stderr: WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:00:55.592289 27091 fetcher.cpp:73] Fetching URI 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:00:55.592428 27091 fetcher.cpp:99] Downloading resource from 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' to '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:01:05.170714 27091 fetcher.cpp:61] Extracted resource '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' into 
'/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2' WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:01:06.105363 27166 exec.cpp:131] Version: 0.18.0 I0617 16:01:06.112191 27175 exec.cpp:205] Executor registered on slave 201311011608-1369465866-5050-9189-86 Spark assembly has been built with Hive, including Datanucleus jars on classpath Exception in thread main java.lang.NumberFormatException: For input string: ca1-dcc1-0106.lab.mtl at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:492) at java.lang.Integer.parseInt(Integer.java:527) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229) at
[jira] [Comment Edited] (SPARK-2022) Spark 1.0.0 is failing if mesos.coarse set to true
[ https://issues.apache.org/jira/browse/SPARK-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034294#comment-14034294 ] Sebastien Rainville edited comment on SPARK-2022 at 6/17/14 8:08 PM: - I'm seeing the same behavior when trying to set spark.executor.extraLibraryPath: in conf/spark-defaults.conf: {noformat} spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native {noformat} the error message in stderr: WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:00:55.592289 27091 fetcher.cpp:73] Fetching URI 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:00:55.592428 27091 fetcher.cpp:99] Downloading resource from 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' to '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:01:05.170714 27091 fetcher.cpp:61] Extracted resource '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' into '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2' WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:01:06.105363 27166 exec.cpp:131] Version: 0.18.0 I0617 16:01:06.112191 27175 exec.cpp:205] Executor registered on slave 201311011608-1369465866-5050-9189-86 Spark assembly has been built with Hive, including Datanucleus jars on classpath Exception in thread main java.lang.NumberFormatException: For input string: ca1-dcc1-0106.lab.mtl at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:492) at 
java.lang.Integer.parseInt(Integer.java:527) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229) at scala.collection.immutable.StringOps.toInt(StringOps.scala:31) at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:135) at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala) here is the command in stdout: Registered executor on ca1-dcc1-0106.lab.mtl Starting task 9 Forked command at 27178 sh -c 'cd spark-1*; ./bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend -Djava.library.path=/usr/lib/hadoop/lib/native akka.tcp://sp...@ca1-dcc1-0071.lab.mtl:32789/user/CoarseGrainedScheduler 201311011608-1369465866-5050-9189-86 ca1-dcc1-0106.lab.mtl 1' Command exited with status 1 (pid: 27178) was (Author: srainville): I'm seeing the same behavior when trying to set spark.executor.extraLibraryPath: in conf/spark-defaults.conf: spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native the error message in stderr: WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:00:55.592289 27091 fetcher.cpp:73] Fetching URI 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:00:55.592428 27091 fetcher.cpp:99] Downloading resource from 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' to '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:01:05.170714 27091 fetcher.cpp:61] Extracted resource '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' into 
'/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2' WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:01:06.105363 27166 exec.cpp:131] Version: 0.18.0 I0617 16:01:06.112191 27175 exec.cpp:205] Executor registered on slave 201311011608-1369465866-5050-9189-86 Spark assembly has been built with Hive, including Datanucleus jars on classpath Exception in thread main java.lang.NumberFormatException: For input string: ca1-dcc1-0106.lab.mtl at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:492) at java.lang.Integer.parseInt(Integer.java:527) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229) at scala.collection.immutable.StringOps.toInt(StringOps.scala:31) at
[jira] [Updated] (SPARK-2166) Enumerating instances to be terminated before prompting the users to continue.
[ https://issues.apache.org/jira/browse/SPARK-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2166: --- Assignee: Jean-Martin Archer Enumerating instances to be terminated before prompting the users to continue. -- Key: SPARK-2166 URL: https://issues.apache.org/jira/browse/SPARK-2166 Project: Spark Issue Type: Improvement Components: EC2 Affects Versions: 0.9.0, 1.0.0 Reporter: Jean-Martin Archer Assignee: Jean-Martin Archer Priority: Minor Original Estimate: 0h Remaining Estimate: 0h When destroying a cluster, the user will be prompted for confirmation without first being shown which instances will be terminated. Pull Request: https://github.com/apache/spark/pull/270#issuecomment-46341975 This pull request will list the EC2 instances before destroying the cluster. This was added because it can be scary to destroy EC2 instances without knowing which ones will be affected. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2166) Enumerating instances to be terminated before prompting the users to continue.
[ https://issues.apache.org/jira/browse/SPARK-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2166: --- Affects Version/s: 1.0.0 Enumerating instances to be terminated before prompting the users to continue. -- Key: SPARK-2166 URL: https://issues.apache.org/jira/browse/SPARK-2166 Project: Spark Issue Type: Improvement Components: EC2 Affects Versions: 0.9.0, 1.0.0 Reporter: Jean-Martin Archer Priority: Minor Original Estimate: 0h Remaining Estimate: 0h When destroying a cluster, the user will be prompted for confirmation without first being shown which instances will be terminated. Pull Request: https://github.com/apache/spark/pull/270#issuecomment-46341975 This pull request will list the EC2 instances before destroying the cluster. This was added because it can be scary to destroy EC2 instances without knowing which ones will be affected. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2166) Enumerating instances to be terminated before prompting the users to continue.
[ https://issues.apache.org/jira/browse/SPARK-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2166: --- Target Version/s: 1.1.0 Enumerating instances to be terminated before prompting the users to continue. -- Key: SPARK-2166 URL: https://issues.apache.org/jira/browse/SPARK-2166 Project: Spark Issue Type: Improvement Components: EC2 Affects Versions: 0.9.0, 1.0.0 Reporter: Jean-Martin Archer Assignee: Jean-Martin Archer Priority: Minor Original Estimate: 0h Remaining Estimate: 0h When destroying a cluster, the user will be prompted for confirmation without first being shown which instances will be terminated. Pull Request: https://github.com/apache/spark/pull/270#issuecomment-46341975 This pull request will list the EC2 instances before destroying the cluster. This was added because it can be scary to destroy EC2 instances without knowing which ones will be affected. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (SPARK-2022) Spark 1.0.0 is failing if mesos.coarse set to true
[ https://issues.apache.org/jira/browse/SPARK-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034294#comment-14034294 ] Sebastien Rainville edited comment on SPARK-2022 at 6/17/14 8:41 PM: - I'm seeing the same behavior when trying to set spark.executor.extraLibraryPath: in conf/spark-defaults.conf: {noformat} spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native {noformat} the error message in stderr: {noformat} WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:00:55.592289 27091 fetcher.cpp:73] Fetching URI 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:00:55.592428 27091 fetcher.cpp:99] Downloading resource from 'hdfs://ca1-dcc1-0071:9200/user/sebastien/spark-1.0.0-bin-cdh4-sebr.tgz' to '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' I0617 16:01:05.170714 27091 fetcher.cpp:61] Extracted resource '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2/spark-1.0.0-bin-cdh4-sebr.tgz' into '/u05/app/mesos/work/slaves/201311011608-1369465866-5050-9189-86/frameworks/20140416-011500-1369465866-5050-26096-0449/executors/9/runs/ba87d7b6-56c1-4892-9ed8-18fa8f8364d2' WARNING: Logging before InitGoogleLogging() is written to STDERR I0617 16:01:06.105363 27166 exec.cpp:131] Version: 0.18.0 I0617 16:01:06.112191 27175 exec.cpp:205] Executor registered on slave 201311011608-1369465866-5050-9189-86 Spark assembly has been built with Hive, including Datanucleus jars on classpath Exception in thread main java.lang.NumberFormatException: For input string: ca1-dcc1-0106.lab.mtl at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:492) at 
java.lang.Integer.parseInt(Integer.java:527) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229) at scala.collection.immutable.StringOps.toInt(StringOps.scala:31) at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:135) at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala) {noformat} here is the command in stdout: {noformat} Registered executor on ca1-dcc1-0106.lab.mtl Starting task 9 Forked command at 27178 sh -c 'cd spark-1*; ./bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend -Djava.library.path=/usr/lib/hadoop/lib/native akka.tcp://sp...@ca1-dcc1-0071.lab.mtl:32789/user/CoarseGrainedScheduler 201311011608-1369465866-5050-9189-86 ca1-dcc1-0106.lab.mtl 1' Command exited with status 1 (pid: 27178) {noformat} In fact, this behavior occurs whenever a JVM arg is set, so setting spark.executor.extraJavaOptions triggers it too. The problem is that CoarseMesosSchedulerBackend passes the JVM args to CoarseGrainedExecutorBackend instead of to the JVM itself: {code}
val uri = conf.get("spark.executor.uri", null)
if (uri == null) {
  val runScript = new File(sparkHome, "./bin/spark-class").getCanonicalPath
  command.setValue(
    "\"%s\" org.apache.spark.executor.CoarseGrainedExecutorBackend %s %s %s %s %d".format(
      runScript, extraOpts, driverUrl, offer.getSlaveId.getValue, offer.getHostname, numCores))
} else {
  // Grab everything to the first '.'. We'll use that and '*' to
  // glob the directory correctly.
  val basename = uri.split('/').last.split('.').head
  command.setValue(
    ("cd %s*; " +
      "./bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend %s %s %s %s %d")
      .format(basename, extraOpts, driverUrl, offer.getSlaveId.getValue, offer.getHostname, numCores))
  command.addUris(CommandInfo.URI.newBuilder().setValue(uri))
}
{code} as a reference, here's the main method in CoarseGrainedExecutorBackend: {code}
def main(args: Array[String]) {
  args.length match {
    case x if x < 4 =>
      System.err.println(
        // Worker url is used in spark standalone mode to enforce fate-sharing with worker
        "Usage: CoarseGrainedExecutorBackend <driverUrl> <executorId> <hostname> " +
          "<cores> [<workerUrl>]")
      System.exit(1)
    case 4 =>
      run(args(0), args(1), args(2), args(3).toInt, None)
    case x if x > 4 =>
      run(args(0), args(1), args(2), args(3).toInt, Some(args(4)))
  }
}
{code} was (Author: srainville): I'm seeing the same behavior when trying to set spark.executor.extraLibraryPath: in conf/spark-defaults.conf: {noformat} spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native {noformat} the error message in
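The failure mode can be sketched outside Spark (hypothetical function and variable names, not Spark's actual code): the backend expects a fixed positional argv of `<driverUrl> <executorId> <hostname> <cores>`, so splicing a JVM flag in front of it shifts every field right by one, and the cores slot no longer parses as an integer — the Python analogue of the `NumberFormatException` on the hostname above.

```python
def parse_backend_args(argv):
    # Expected layout: <driverUrl> <executorId> <hostname> <cores> [<workerUrl>]
    if len(argv) < 4:
        raise SystemExit("Usage: <driverUrl> <executorId> <hostname> <cores> [<workerUrl>]")
    driver_url, executor_id, hostname = argv[0], argv[1], argv[2]
    cores = int(argv[3])  # the 4th slot must be an integer
    worker_url = argv[4] if len(argv) > 4 else None
    return driver_url, executor_id, hostname, cores, worker_url

good = ["akka.tcp://spark@driver:32789/user/CoarseGrainedScheduler",
        "201311011608-1369465866-5050-9189-86", "ca1-dcc1-0106.lab.mtl", "1"]

# Passing the JVM option as a program argument shifts everything right by one,
# so int() now sees the hostname string instead of the core count:
bad = ["-Djava.library.path=/usr/lib/hadoop/lib/native"] + good
```

With `bad`, `int(argv[3])` raises `ValueError` on `"ca1-dcc1-0106.lab.mtl"`, which is exactly what `args(3).toInt` does in the Scala backend.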
[jira] [Created] (SPARK-2169) SparkUI.setAppName() has no effect
Marcelo Vanzin created SPARK-2169: - Summary: SparkUI.setAppName() has no effect Key: SPARK-2169 URL: https://issues.apache.org/jira/browse/SPARK-2169 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 1.0.0 Reporter: Marcelo Vanzin {{SparkUI.setAppName()}} does not do anything useful. It overwrites the instance's {{appName}} field, but all places where that field is used have already read its value into their own copies by the time that happens. E.g. StagePage.scala copies {{parent.appName}} into its own private {{appName}} in the constructor, which is called as part of SparkUI's constructor. So when you call {{SparkUI.setAppName}} it does not overwrite StagePage's copy, and the UI still shows the old value. -- This message was sent by Atlassian JIRA (v6.2#6252)
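A stripped-down sketch of why the setter is a no-op (hypothetical classes mimicking the report, not Spark's actual code): the child snapshots the parent's name during construction, so later writes to the parent's field never reach the copy.

```python
class StagePageLike:
    """Mimics StagePage: copies the parent's appName in its constructor."""
    def __init__(self, parent):
        self.app_name = parent.app_name  # a snapshot, not a live reference

class SparkUILike:
    """Mimics SparkUI: builds its pages eagerly in its own constructor."""
    def __init__(self, app_name):
        self.app_name = app_name
        self.stage_page = StagePageLike(self)  # the copy is taken here

    def set_app_name(self, name):
        # Overwrites only this instance's field; the page already holds
        # its own copy, so the rendered UI keeps showing the old value.
        self.app_name = name

ui = SparkUILike("old-name")
ui.set_app_name("new-name")
# ui.app_name is now "new-name", but ui.stage_page.app_name is still "old-name"
```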
[jira] [Created] (SPARK-2170) Fix for global name 'PIPE' is not defined.
Grega Kespret created SPARK-2170: Summary: Fix for global name 'PIPE' is not defined. Key: SPARK-2170 URL: https://issues.apache.org/jira/browse/SPARK-2170 Project: Spark Issue Type: Bug Components: EC2 Environment: $ python --version Python 2.6.6 $ lsb_release -a No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 6.0.9 (squeeze) Release: 6.0.9 Codename: squeeze Reporter: Grega Kespret Priority: Minor When running the spark-ec2.py script, it fails with the error NameError: global name 'PIPE' is not defined. Traceback (most recent call last): File "./spark_ec2.py", line 894, in <module> main() File "./spark_ec2.py", line 886, in main real_main() File "./spark_ec2.py", line 770, in real_main setup_cluster(conn, master_nodes, slave_nodes, opts, True) File "./spark_ec2.py", line 475, in setup_cluster dot_ssh_tar = ssh_read(master, opts, ['tar', 'c', '.ssh']) File "./spark_ec2.py", line 709, in ssh_read ssh_command(opts) + ['%s@%s' % (opts.user, host), stringify_command(command)]) File "./spark_ec2.py", line 696, in _check_output process = subprocess.Popen(stdout=PIPE, *popenargs, **kwargs) NameError: global name 'PIPE' is not defined -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2170) Fix for global name 'PIPE' is not defined.
[ https://issues.apache.org/jira/browse/SPARK-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034372#comment-14034372 ] Grega Kespret commented on SPARK-2170: -- Added a pull request that resolves this issue: https://github.com/apache/spark/pull/1109 Fix for global name 'PIPE' is not defined. -- Key: SPARK-2170 URL: https://issues.apache.org/jira/browse/SPARK-2170 Project: Spark Issue Type: Bug Components: EC2 Environment: $ python --version Python 2.6.6 $ lsb_release -a No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 6.0.9 (squeeze) Release: 6.0.9 Codename: squeeze Reporter: Grega Kespret Priority: Minor Original Estimate: 1h Remaining Estimate: 1h When running the spark-ec2.py script, it fails with the error NameError: global name 'PIPE' is not defined. Traceback (most recent call last): File "./spark_ec2.py", line 894, in <module> main() File "./spark_ec2.py", line 886, in main real_main() File "./spark_ec2.py", line 770, in real_main setup_cluster(conn, master_nodes, slave_nodes, opts, True) File "./spark_ec2.py", line 475, in setup_cluster dot_ssh_tar = ssh_read(master, opts, ['tar', 'c', '.ssh']) File "./spark_ec2.py", line 709, in ssh_read ssh_command(opts) + ['%s@%s' % (opts.user, host), stringify_command(command)]) File "./spark_ec2.py", line 696, in _check_output process = subprocess.Popen(stdout=PIPE, *popenargs, **kwargs) NameError: global name 'PIPE' is not defined -- This message was sent by Atlassian JIRA (v6.2#6252)
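For reference, the traceback shows `Popen(stdout=PIPE, ...)` where the bare name `PIPE` was never imported; `PIPE` lives in the `subprocess` module, so either qualify it or import the name. A sketch of both forms (illustrative helpers, not the actual spark_ec2.py patch):

```python
import subprocess
import sys

def check_output_qualified(cmd):
    # Qualify the constant: subprocess.PIPE always resolves.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out

def check_output_imported(cmd):
    # Equivalent: bind the names explicitly before using them bare.
    from subprocess import PIPE, Popen
    proc = Popen(cmd, stdout=PIPE)
    out, _ = proc.communicate()
    return out

cmd = [sys.executable, "-c", "print('ok')"]
```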
[jira] [Created] (SPARK-2171) Groovy bindings for Spark
Artur Andrzejak created SPARK-2171: -- Summary: Groovy bindings for Spark Key: SPARK-2171 URL: https://issues.apache.org/jira/browse/SPARK-2171 Project: Spark Issue Type: Improvement Components: Build, Documentation, Examples Affects Versions: 1.0.0 Reporter: Artur Andrzejak Priority: Minor A simple way to add Groovy bindings to Spark, without additional code. The idea is to use the standard Java implementations of RDD and Context, and to rely on Groovy's coercion of closures to abstract classes so that all methods which take anonymous inner classes in Java can be called with a closure. Advantages: - No need for new code, which avoids unnecessary bugs and implementation effort - Access to Spark from Groovy with the ease of closures, using the default Java implementations - No need to install additional software -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2171) Groovy bindings for Spark
[ https://issues.apache.org/jira/browse/SPARK-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artur Andrzejak updated SPARK-2171: --- Attachment: examples-Groovy4Spark.zip Groovy4Spark - Introduction.pdf A short guide for using Spark from Groovy and examples of Groovy code Groovy bindings for Spark - Key: SPARK-2171 URL: https://issues.apache.org/jira/browse/SPARK-2171 Project: Spark Issue Type: Improvement Components: Build, Documentation, Examples Affects Versions: 1.0.0 Reporter: Artur Andrzejak Priority: Minor Attachments: Groovy4Spark - Introduction.pdf, examples-Groovy4Spark.zip A simple way to add Groovy bindings to Spark, without additional code. The idea is to use the standard java implementations of RDD and Context and to use the coercion of Groovy closure to abstract classes to call all methods, which take anonymous inner classes in Java, with a closure. Advantages: - No need for new code, which avoids unnecessary bugs and implementation effort - Access to spark from Groovy with the ease of closures using the default Java implementations - No need to install additional software -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (SPARK-1990) spark-ec2 should only need Python 2.6, not 2.7
[ https://issues.apache.org/jira/browse/SPARK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-1990: -- In your PR, you should use subprocess.PIPE instead of calling PIPE directly. Could you submit a patch for it? spark-ec2 should only need Python 2.6, not 2.7 -- Key: SPARK-1990 URL: https://issues.apache.org/jira/browse/SPARK-1990 Project: Spark Issue Type: Improvement Reporter: Matei Zaharia Assignee: Anant Daksh Asthana Labels: Starter Fix For: 0.9.2, 1.0.1, 1.1.0 There were some posts on the lists that spark-ec2 does not work with Python 2.6. In addition, we should check the Python version at the top of the script and exit if it's too old. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-1990) spark-ec2 should only need Python 2.6, not 2.7
[ https://issues.apache.org/jira/browse/SPARK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-1990. -- Resolution: Fixed HOTFIX: https://github.com/apache/spark/pull/1108 spark-ec2 should only need Python 2.6, not 2.7 -- Key: SPARK-1990 URL: https://issues.apache.org/jira/browse/SPARK-1990 Project: Spark Issue Type: Improvement Reporter: Matei Zaharia Assignee: Anant Daksh Asthana Labels: Starter Fix For: 0.9.2, 1.0.1, 1.1.0 There were some posts on the lists that spark-ec2 does not work with Python 2.6. In addition, we should check the Python version at the top of the script and exit if it's too old. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2168) History Server rendered page not suitable for load balancing
[ https://issues.apache.org/jira/browse/SPARK-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukasz Jastrzebski updated SPARK-2168: -- Description: Small issue, but still: I run the history server through Marathon and balance it through haproxy. The problem is that the links generated by HistoryPage (links to completed applications) are absolute, e.g. <a href="http://some-server:port/history/...">completedApplicationName</a>, but instead they should be relative, e.g. <a href="/history/...">completedApplicationName</a>, so they can be load balanced. was: Small issue, but still: I run the history server through Marathon and balance it through haproxy. The problem is that the links generated by HistoryPage (links to completed applications) are absolute, e.g. http://some-server:port/history..., but instead they should be relative, just /history..., so they can be load balanced. History Server rendered page not suitable for load balancing --- Key: SPARK-2168 URL: https://issues.apache.org/jira/browse/SPARK-2168 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Lukasz Jastrzebski Priority: Minor Small issue, but still: I run the history server through Marathon and balance it through haproxy. The problem is that the links generated by HistoryPage (links to completed applications) are absolute, e.g. <a href="http://some-server:port/history/...">completedApplicationName</a>, but instead they should be relative, e.g. <a href="/history/...">completedApplicationName</a>, so they can be load balanced. -- This message was sent by Atlassian JIRA (v6.2#6252)
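The distinction can be checked with the standard library: a root-relative href is resolved against whichever host served the page (here a hypothetical haproxy frontend), while an absolute href always escapes back to the backend host that rendered the page.

```python
from urllib.parse import urljoin

# Hypothetical hosts: the proxy that users hit, and a backend behind it.
page_url = "http://haproxy.example/history"

absolute_href = "http://some-server:8080/history/app-1"  # what HistoryPage emits today
relative_href = "/history/app-1"                         # what the report asks for

# The absolute link bypasses the proxy entirely:
assert urljoin(page_url, absolute_href) == "http://some-server:8080/history/app-1"
# The relative link stays on whatever host served the page:
assert urljoin(page_url, relative_href) == "http://haproxy.example/history/app-1"
```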
[jira] [Created] (SPARK-2172) PySpark cannot import mllib modules in YARN-client mode
Vlad Frolov created SPARK-2172: -- Summary: PySpark cannot import mllib modules in YARN-client mode Key: SPARK-2172 URL: https://issues.apache.org/jira/browse/SPARK-2172 Project: Spark Issue Type: Bug Components: MLlib, PySpark, Spark Core, YARN Affects Versions: 1.0.0, 1.1.0 Environment: Ubuntu 14.04 Java 7 Python 2.7 CDH 5.0.2 (Hadoop 2.3.0): HDFS, YARN Spark 1.0.0 and git master Reporter: Vlad Frolov Here is the simple reproduce code: {code:title=issue.py|borderStyle=solid} from pyspark.mllib.regression import LabeledPoint sc.parallelize([1,2,3]).map(lambda x: LabeledPoint(1, [2])).count() {code} Note: The same issue occurs with .collect() instead of .count() {code:title=TraceBack|borderStyle=solid} Py4JJavaError: An error occurred while calling o110.collect. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 8.0:0 failed 4 times, most recent failure: Exception failure in TID 52 on host ares: org.apache.spark.api.python.PythonException: Traceback (most recent call last): File /mnt/storage/bigisle/yarn/1/yarn/local/usercache/blb/filecache/18/spark-assembly-1.0.0-hadoop2.2.0.jar/pyspark/worker.py, line 73, in main command = pickleSer._read_with_length(infile) File /mnt/storage/bigisle/yarn/1/yarn/local/usercache/blb/filecache/18/spark-assembly-1.0.0-hadoop2.2.0.jar/pyspark/serializers.py, line 146, in _read_with_length return self.loads(obj) ImportError: No module named mllib.regression org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:115) org.apache.spark.api.python.PythonRDD$$anon$1.init(PythonRDD.scala:145) org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:78) org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) org.apache.spark.rdd.RDD.iterator(RDD.scala:229) org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111) org.apache.spark.scheduler.Task.run(Task.scala:51) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187) 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:745) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633) at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} However, this code works as expected: {code:title=noissue.py|borderStyle=solid} from pyspark.mllib.regression import 
LabeledPoint sc.parallelize([1,2,3]).map(lambda x: LabeledPoint(1, [2])).first() sc.parallelize([1,2,3]).map(lambda x: LabeledPoint(1, [2])).take(3) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2172) PySpark cannot import mllib modules in YARN-client mode
[ https://issues.apache.org/jira/browse/SPARK-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Frolov updated SPARK-2172: --- Description: Here is the simple reproduce code: {noformat} $ HADOOP_CONF_DIR=/etc/hadoop/conf MASTER=yarn-client ./bin/pyspark {noformat} {code:title=issue.py|borderStyle=solid} from pyspark.mllib.regression import LabeledPoint sc.parallelize([1,2,3]).map(lambda x: LabeledPoint(1, [2])).count() {code} Note: The same issue occurs with .collect() instead of .count() {code:title=TraceBack|borderStyle=solid} Py4JJavaError: An error occurred while calling o110.collect. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 8.0:0 failed 4 times, most recent failure: Exception failure in TID 52 on host ares: org.apache.spark.api.python.PythonException: Traceback (most recent call last): File /mnt/storage/bigisle/yarn/1/yarn/local/usercache/blb/filecache/18/spark-assembly-1.0.0-hadoop2.2.0.jar/pyspark/worker.py, line 73, in main command = pickleSer._read_with_length(infile) File /mnt/storage/bigisle/yarn/1/yarn/local/usercache/blb/filecache/18/spark-assembly-1.0.0-hadoop2.2.0.jar/pyspark/serializers.py, line 146, in _read_with_length return self.loads(obj) ImportError: No module named mllib.regression org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:115) org.apache.spark.api.python.PythonRDD$$anon$1.init(PythonRDD.scala:145) org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:78) org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) org.apache.spark.rdd.RDD.iterator(RDD.scala:229) org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111) org.apache.spark.scheduler.Task.run(Task.scala:51) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:745) 
Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633) at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} However, this code works as expected: {code:title=noissue.py|borderStyle=solid} from pyspark.mllib.regression import LabeledPoint sc.parallelize([1,2,3]).map(lambda x: LabeledPoint(1, [2])).first() sc.parallelize([1,2,3]).map(lambda x: LabeledPoint(1, [2])).take(3) {code} was: Here is the simple reproduce code: 
{code:title=issue.py|borderStyle=solid} from pyspark.mllib.regression import LabeledPoint sc.parallelize([1,2,3]).map(lambda x: LabeledPoint(1, [2])).count() {code} Note: The same issue occurs with .collect() instead of .count() {code:title=TraceBack|borderStyle=solid} Py4JJavaError: An error occurred while calling o110.collect. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 8.0:0 failed 4 times, most recent failure: Exception failure in TID 52 on host ares: org.apache.spark.api.python.PythonException: Traceback (most recent call last): File
[jira] [Created] (SPARK-2173) Add Master Computer and SuperStep Accumulator to Pregel GraphX Implement
Ted Malaska created SPARK-2173: -- Summary: Add Master Computer and SuperStep Accumulator to Pregel GraphX Implement Key: SPARK-2173 URL: https://issues.apache.org/jira/browse/SPARK-2173 Project: Spark Issue Type: Improvement Reporter: Ted Malaska In Giraph there is the idea of a master compute and a global superstep value you can access. I would like to add that to GraphX. Let me know what you think. I will try to get a pull request tonight. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (SPARK-2173) Add Master Computer and SuperStep Accumulator to Pregel GraphX Implemention
[ https://issues.apache.org/jira/browse/SPARK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Malaska updated SPARK-2173: --- Summary: Add Master Computer and SuperStep Accumulator to Pregel GraphX Implemention (was: Add Master Computer and SuperStep Accumulator to Pregel GraphX Implement) Add Master Computer and SuperStep Accumulator to Pregel GraphX Implemention --- Key: SPARK-2173 URL: https://issues.apache.org/jira/browse/SPARK-2173 Project: Spark Issue Type: Improvement Reporter: Ted Malaska In Giraph there is the idea of a master compute and a global superstep value you can access. I would like to add that to GraphX. Let me know what you think. I will try to get a pull request tonight. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2174) Implement treeReduce and treeAggregate
Xiangrui Meng created SPARK-2174: Summary: Implement treeReduce and treeAggregate Key: SPARK-2174 URL: https://issues.apache.org/jira/browse/SPARK-2174 Project: Spark Issue Type: New Feature Components: MLlib, Spark Core Reporter: Xiangrui Meng Assignee: Xiangrui Meng In `reduce` and `aggregate`, the driver node spends time linear in the number of partitions. It becomes a bottleneck when there are many partitions and the data from each partition is big. SPARK-1485 tracks the progress of implementing AllReduce on Spark. I did several implementations including butterfly, reduce + broadcast, and treeReduce + broadcast. treeReduce + BT broadcast seems to be the right way to go for Spark. Using a binary tree may introduce some overhead in communication, because the driver still needs to coordinate the data shuffling. In my experiments, n - sqrt(n) - 1 gives the best performance in general. But it certainly needs more testing. -- This message was sent by Atlassian JIRA (v6.2#6252)
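The idea can be sketched in a few lines (illustrative only, not MLlib's implementation): instead of the driver folding all partition results itself, combine them level by level so that no single step — the driver's final fold included — touches more than about sqrt(n) values, matching the n - sqrt(n) - 1 schedule mentioned above.

```python
import math
from functools import reduce

def tree_reduce(partition_results, f):
    """Reduce in levels of fan-in ~sqrt(n) instead of one fold over n values."""
    vals = list(partition_results)
    fan_in = max(2, int(math.sqrt(len(vals))))
    while len(vals) > 1:
        # Each level combines groups of `fan_in` values; in Spark each level
        # would itself run as a distributed stage, leaving the driver only
        # the small final fold instead of a fold over every partition.
        vals = [reduce(f, vals[i:i + fan_in])
                for i in range(0, len(vals), fan_in)]
    return vals[0]

# 100 "partition results": two levels of fan-in 10, rather than one fold of 100.
total = tree_reduce(range(1, 101), lambda a, b: a + b)  # -> 5050
```

For an associative operation the grouping does not change the result, only how the combining work is spread out.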
[jira] [Updated] (SPARK-2174) Implement treeReduce and treeAggregate
[ https://issues.apache.org/jira/browse/SPARK-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2174: - Description: In `reduce` and `aggregate`, the driver node spends linear time on the number of partitions. It becomes a bottleneck when there are many partitions and the data from each partition is big. SPARK-1485 tracks the progress of implementing AllReduce on Spark. I did several implementations including butterfly, reduce + broadcast, and treeReduce + broadcast. treeReduce + BT broadcast seems to be right way to go for Spark. Using binary tree may introduce some overhead in communication, because the driver still need to coordinate on data shuffling. In my experiments, n - sqrt(n) - 1 gives the best performance in general. But it certainly needs more testing. was: In `reduce` and `aggregate`, the driver node spends linear time on the number of partitions. It becomes a bottleneck when there are many partitions and the data from each partition is big. SPARK-1485 tracks the progress of implementing AllReduce on Spark. I didn't several implementations including butterfly, reduce + broadcast, and treeReduce + broadcast. treeReduce + BT broadcast seems to be right way to go for Spark. Using binary tree may introduce some overhead in communication, because the driver still need to coordinate on data shuffling. In my experiments, n - sqrt(n) - 1 gives the best performance in general. But it certainly needs more testing. Implement treeReduce and treeAggregate -- Key: SPARK-2174 URL: https://issues.apache.org/jira/browse/SPARK-2174 Project: Spark Issue Type: New Feature Components: MLlib, Spark Core Reporter: Xiangrui Meng Assignee: Xiangrui Meng In `reduce` and `aggregate`, the driver node spends linear time on the number of partitions. It becomes a bottleneck when there are many partitions and the data from each partition is big. SPARK-1485 tracks the progress of implementing AllReduce on Spark. 
I did several implementations including butterfly, reduce + broadcast, and treeReduce + broadcast. treeReduce + BT broadcast seems to be the right way to go for Spark. Using a binary tree may introduce some overhead in communication, because the driver still needs to coordinate the data shuffling. In my experiments, n - sqrt(n) - 1 gives the best performance in general. But it certainly needs more testing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2170) Fix for global name 'PIPE' is not defined.
[ https://issues.apache.org/jira/browse/SPARK-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2170. Resolution: Not a Problem This was fixed already in a hotfix. But thanks for submitting the patch! https://github.com/apache/spark/pull/1108 Fix for global name 'PIPE' is not defined. -- Key: SPARK-2170 URL: https://issues.apache.org/jira/browse/SPARK-2170 Project: Spark Issue Type: Bug Components: EC2 Environment: $ python --version Python 2.6.6 $ lsb_release -a No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 6.0.9 (squeeze) Release: 6.0.9 Codename: squeeze Reporter: Grega Kespret Priority: Minor Original Estimate: 1h Remaining Estimate: 1h When running the spark-ec2.py script, it fails with the error NameError: global name 'PIPE' is not defined. Traceback (most recent call last): File "./spark_ec2.py", line 894, in <module> main() File "./spark_ec2.py", line 886, in main real_main() File "./spark_ec2.py", line 770, in real_main setup_cluster(conn, master_nodes, slave_nodes, opts, True) File "./spark_ec2.py", line 475, in setup_cluster dot_ssh_tar = ssh_read(master, opts, ['tar', 'c', '.ssh']) File "./spark_ec2.py", line 709, in ssh_read ssh_command(opts) + ['%s@%s' % (opts.user, host), stringify_command(command)]) File "./spark_ec2.py", line 696, in _check_output process = subprocess.Popen(stdout=PIPE, *popenargs, **kwargs) NameError: global name 'PIPE' is not defined -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2060) Querying JSON Datasets with SQL and DSL in Spark SQL
[ https://issues.apache.org/jira/browse/SPARK-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2060. Resolution: Fixed Fix Version/s: 1.1.0 1.0.1 Querying JSON Datasets with SQL and DSL in Spark SQL Key: SPARK-2060 URL: https://issues.apache.org/jira/browse/SPARK-2060 Project: Spark Issue Type: New Feature Components: SQL Reporter: Yin Huai Assignee: Yin Huai Fix For: 1.0.1, 1.1.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (SPARK-2175) Null values when using App trait.
Brandon Amos created SPARK-2175: --- Summary: Null values when using App trait. Key: SPARK-2175 URL: https://issues.apache.org/jira/browse/SPARK-2175 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Environment: Linux Reporter: Brandon Amos Priority: Trivial See http://apache-spark-user-list.1001560.n3.nabble.com/NullPointerExceptions-when-using-val-or-broadcast-on-a-standalone-cluster-tc7524.html -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2172) PySpark cannot import mllib modules in YARN-client mode
[ https://issues.apache.org/jira/browse/SPARK-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034730#comment-14034730 ] Vlad Frolov commented on SPARK-2172: I've tried to run the code in standalone and local modes. There is no such error, but I want to exercise YARN. I've also tried to run similar code in spark-shell (Scala) and it works fine: {code}
scala> import org.apache.spark.mllib.regression.LabeledPoint
scala> import org.apache.spark.mllib.linalg.{Vector, Vectors}
scala> val array: Array[Double] = Array(1, 2)
scala> val vector: Vector = Vectors.dense(array)
scala> sc.parallelize(1 to 3).map(x => LabeledPoint(x, vector)).collect()
res2: Array[org.apache.spark.mllib.regression.LabeledPoint] = Array(LabeledPoint(1.0, [1.0,2.0]), LabeledPoint(2.0, [1.0,2.0]), LabeledPoint(3.0, [1.0,2.0]))
{code} PySpark cannot import mllib modules in YARN-client mode --- Key: SPARK-2172 URL: https://issues.apache.org/jira/browse/SPARK-2172 Project: Spark Issue Type: Bug Components: MLlib, PySpark, Spark Core, YARN Affects Versions: 1.0.0, 1.1.0 Environment: Ubuntu 14.04 Java 7 Python 2.7 CDH 5.0.2 (Hadoop 2.3.0): HDFS, YARN Spark 1.0.0 and git master Reporter: Vlad Frolov Labels: mllib, python Here is the simple reproduce code: {noformat} $ HADOOP_CONF_DIR=/etc/hadoop/conf MASTER=yarn-client ./bin/pyspark {noformat} {code:title=issue.py|borderStyle=solid} from pyspark.mllib.regression import LabeledPoint sc.parallelize([1,2,3]).map(lambda x: LabeledPoint(1, [2])).count() {code} Note: The same issue occurs with .collect() instead of .count() {code:title=TraceBack|borderStyle=solid} Py4JJavaError: An error occurred while calling o110.collect.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 8.0:0 failed 4 times, most recent failure: Exception failure in TID 52 on host ares: org.apache.spark.api.python.PythonException: Traceback (most recent call last): File /mnt/storage/bigisle/yarn/1/yarn/local/usercache/blb/filecache/18/spark-assembly-1.0.0-hadoop2.2.0.jar/pyspark/worker.py, line 73, in main command = pickleSer._read_with_length(infile) File /mnt/storage/bigisle/yarn/1/yarn/local/usercache/blb/filecache/18/spark-assembly-1.0.0-hadoop2.2.0.jar/pyspark/serializers.py, line 146, in _read_with_length return self.loads(obj) ImportError: No module named mllib.regression org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:115) org.apache.spark.api.python.PythonRDD$$anon$1.init(PythonRDD.scala:145) org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:78) org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) org.apache.spark.rdd.RDD.iterator(RDD.scala:229) org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111) org.apache.spark.scheduler.Task.run(Task.scala:51) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:745) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015) at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633) at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at
[jira] [Reopened] (SPARK-2038) Don't shadow conf variable in saveAsHadoop functions
[ https://issues.apache.org/jira/browse/SPARK-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-2038: Don't shadow conf variable in saveAsHadoop functions -- Key: SPARK-2038 URL: https://issues.apache.org/jira/browse/SPARK-2038 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.0.0 Reporter: Patrick Wendell Assignee: Nan Zhu Priority: Minor Fix For: 1.1.0 This could lead to a lot of bugs. We should just change it to hadoopConf. I noticed this when reviewing SPARK-1677. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (SPARK-2038) Don't shadow conf variable in saveAsHadoop functions
[ https://issues.apache.org/jira/browse/SPARK-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2038. Resolution: Won't Fix
[jira] [Commented] (SPARK-2038) Don't shadow conf variable in saveAsHadoop functions
[ https://issues.apache.org/jira/browse/SPARK-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034735#comment-14034735 ] Patrick Wendell commented on SPARK-2038: Unfortunately, after discussion with Reynold, I realized we have to revert this. The issue is that we can't change parameter names in public APIs, because Scala allows callers to pass arguments by name: http://docs.scala-lang.org/tutorials/tour/named-parameters.html So a change like this could break source compatibility for users.
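The compatibility hazard Patrick describes is easiest to see with named arguments. A minimal Python analogue of the same break (the `save_as_hadoop_file` functions and their parameters here are hypothetical, purely for illustration — Scala's named parameters behave the same way):

```python
# Hypothetical v1 API: callers are free to pass `conf` by name.
def save_as_hadoop_file(path, conf=None):
    return (path, conf)

call_v1 = save_as_hadoop_file("out", conf={"k": "v"})  # works

# Hypothetical v2 renames the parameter to `hadoop_conf`. Positional
# callers are unaffected, but the named-argument call above now breaks.
def save_as_hadoop_file_v2(path, hadoop_conf=None):
    return (path, hadoop_conf)

try:
    save_as_hadoop_file_v2("out", conf={"k": "v"})
    rename_breaks_callers = False
except TypeError:  # unexpected keyword argument 'conf'
    rename_breaks_callers = True
```

This is why renaming `conf` to `hadoopConf` in the public `saveAsHadoop` signatures is source-incompatible even though it changes no behavior.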
[jira] [Commented] (SPARK-2038) Don't shadow conf variable in saveAsHadoop functions
[ https://issues.apache.org/jira/browse/SPARK-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034739#comment-14034739 ] Nan Zhu commented on SPARK-2038: Ah, I see; that's fine...
[jira] [Commented] (SPARK-791) [pyspark] operator.getattr not serialized
[ https://issues.apache.org/jira/browse/SPARK-791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034779#comment-14034779 ] Mark Baker commented on SPARK-791: -- I began porting PySpark to Python 3, but with my modest Python-fu, hit a wall at cloudpickle. Dill supports Python 3, so it seems like a big win in that direction too. [pyspark] operator.getattr not serialized - Key: SPARK-791 URL: https://issues.apache.org/jira/browse/SPARK-791 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 0.7.2, 0.9.0 Reporter: Jim Blomo Priority: Minor
Using operator.itemgetter as a function in map seems to confuse the serialization process in pyspark. I'm using itemgetter to return tuples, which fails with a TypeError (details below). Using an equivalent lambda function returns the correct result. Use a test file:
{code:sh}
echo 1,1 > test.txt
{code}
Then try mapping it to a tuple:
{code:python}
import csv
sc.textFile("test.txt").mapPartitions(csv.reader).map(lambda l: (l[0], l[1])).first()
# Out[7]: ('1', '1')
{code}
But this does not work when using operator.itemgetter:
{code:python}
import operator
sc.textFile("test.txt").mapPartitions(csv.reader).map(operator.itemgetter(0, 1)).first()
# TypeError: list indices must be integers, not tuple
{code}
This is running with git master, commit 6d60fe571a405eb9306a2be1817901316a46f892, IPython 0.13.2, java version 1.7.0_25, Scala code runner version 2.9.1, Ubuntu 12.04. Full debug output:
{code:python}
In [9]: sc.textFile("test.txt").mapPartitions(csv.reader).map(operator.itemgetter(0,1)).first()
13/07/04 16:19:49 INFO storage.MemoryStore: ensureFreeSpace(33632) called with curMem=201792, maxMem=339585269
13/07/04 16:19:49 INFO storage.MemoryStore: Block broadcast_6 stored as values to memory (estimated size 32.8 KB, free 323.6 MB)
13/07/04 16:19:49 INFO mapred.FileInputFormat: Total input paths to process : 1
13/07/04 16:19:49 INFO spark.SparkContext: Starting job: takePartition at
NativeMethodAccessorImpl.java:-2
13/07/04 16:19:49 INFO scheduler.DAGScheduler: Got job 4 (takePartition at NativeMethodAccessorImpl.java:-2) with 1 output partitions (allowLocal=true)
13/07/04 16:19:49 INFO scheduler.DAGScheduler: Final stage: Stage 4 (PythonRDD at NativeConstructorAccessorImpl.java:-2)
13/07/04 16:19:49 INFO scheduler.DAGScheduler: Parents of final stage: List()
13/07/04 16:19:49 INFO scheduler.DAGScheduler: Missing parents: List()
13/07/04 16:19:49 INFO scheduler.DAGScheduler: Computing the requested partition locally
13/07/04 16:19:49 INFO scheduler.DAGScheduler: Failed to run takePartition at NativeMethodAccessorImpl.java:-2
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-9-1fdb3e7a8ac7> in <module>()
----> 1 sc.textFile("test.txt").mapPartitions(csv.reader).map(operator.itemgetter(0,1)).first()

/home/jim/src/spark/python/pyspark/rdd.pyc in first(self)
    389         2
    390         """
--> 391         return self.take(1)[0]
    392
    393     def saveAsTextFile(self, path):

/home/jim/src/spark/python/pyspark/rdd.pyc in take(self, num)
    372         items = []
    373         for partition in range(self._jrdd.splits().size()):
--> 374             iterator = self.ctx._takePartition(self._jrdd.rdd(), partition)
    375             # Each item in the iterator is a string, Python object, batch of
    376             # Python objects. Regardless, it is sufficient to take `num`

/home/jim/src/spark/python/lib/py4j0.7.egg/py4j/java_gateway.pyc in __call__(self, *args)
    498         answer = self.gateway_client.send_command(command)
    499         return_value = get_return_value(answer, self.gateway_client,
--> 500                 self.target_id, self.name)
    501
    502         for temp_arg in temp_args:

/home/jim/src/spark/python/lib/py4j0.7.egg/py4j/protocol.pyc in get_return_value(answer, gateway_client, target_id, name)
    298             raise Py4JJavaError(
    299                 'An error occurred while calling {0}{1}{2}.\n'.
--> 300                 format(target_id, '.', name), value)
    301         else:
    302             raise Py4JError(

Py4JJavaError: An error occurred while calling z:spark.api.python.PythonRDD.takePartition.
: spark.api.python.PythonException: Traceback (most recent call last):
  File "/home/jim/src/spark/python/pyspark/worker.py", line 53, in main
    for obj in func(split_index, iterator):
  File "/home/jim/src/spark/python/pyspark/serializers.py", line 24, in batched
    for item in iterator:
TypeError: list indices must be integers, not tuple at
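For reference, `operator.itemgetter(0, 1)` does return the expected tuple when called locally, and is equivalent to the lambda workaround from the description; the failure above only appears after the function has gone through PySpark's serialization of the mapped function:

```python
import operator

row = ['1', '1']

# itemgetter with two indices builds a callable that returns a tuple...
getter = operator.itemgetter(0, 1)

# ...behaving exactly like the lambda workaround that does serialize correctly:
workaround = lambda l: (l[0], l[1])

print(getter(row))      # ('1', '1')
print(workaround(row))  # ('1', '1')
```

So the bug is in the pickling of the `itemgetter` object, not in `itemgetter` itself.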
[jira] [Commented] (SPARK-2173) Add Master Computer and SuperStep Accumulator to Pregel GraphX Implemention
[ https://issues.apache.org/jira/browse/SPARK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034798#comment-14034798 ] Ted Malaska commented on SPARK-2173: Nope, a broadcast won't work either. Let me think about it overnight. Maybe the solution is simply to update the VertexRDD.innerJoin method to take i, which is the superstep. Add Master Computer and SuperStep Accumulator to Pregel GraphX Implemention --- Key: SPARK-2173 URL: https://issues.apache.org/jira/browse/SPARK-2173 Project: Spark Issue Type: Improvement Reporter: Ted Malaska In Giraph there is an idea of a master compute and a global superstep value you can access. I would like to add that to GraphX. Let me know what you think. I will try to get a pull request tonight.
[jira] [Commented] (SPARK-2173) Add Master Computer and SuperStep Accumulator to Pregel GraphX Implemention
[ https://issues.apache.org/jira/browse/SPARK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034816#comment-14034816 ] Ted Malaska commented on SPARK-2173: Sorry, I'm slow; I just realized I don't even need superStep or the master computer. In the confines of Giraph I did need them to solve tree rooting, but in the world of GraphX I can enter and exit Pregel whenever and however often I want. So in the case of tree rooting, I would do at most one superstep of Pregel to broadcast to all my children to identify my roots, then start a second Pregel with unbounded supersteps to root all the other vertices to the roots. GraphX is so freeing in comparison to Giraph. I will close the ticket now.
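The two-phase scheme above can be sketched outside GraphX. A plain-Python stand-in (not GraphX's Pregel API; the `parent` map representing the forest is hypothetical) of one bounded step to find roots, then unbounded steps until every vertex is rooted:

```python
def root_vertices(parent):
    """parent maps each non-root vertex to its parent vertex."""
    vertices = set(parent) | set(parent.values())
    # Phase 1: a single bounded "superstep" -- vertices with no parent are roots.
    root = {v: v for v in vertices if v not in parent}
    # Phase 2: unbounded "supersteps" -- propagate root ids until a fixpoint.
    changed = True
    while changed:
        changed = False
        for child, p in parent.items():
            if child not in root and p in root:
                root[child] = root[p]
                changed = True
    return root

# Two trees, rooted at 'a' and 'd':
result = root_vertices({'b': 'a', 'c': 'b', 'e': 'd'})
print(sorted(result.items()))
# [('a', 'a'), ('b', 'a'), ('c', 'a'), ('d', 'd'), ('e', 'd')]
```

In GraphX terms, phase 1 is a Pregel run with maxIterations of one and phase 2 an unbounded run; the point of the comment is that nothing inside Pregel needs to know which phase it is in.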
[jira] [Closed] (SPARK-2173) Add Master Computer and SuperStep Accumulator to Pregel GraphX Implemention
[ https://issues.apache.org/jira/browse/SPARK-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Malaska closed SPARK-2173. -- Resolution: Invalid Not an issue. GraphX doesn't need these features because it is not as limiting as Giraph in its options.