[jira] [Commented] (SPARK-4160) Standalone cluster mode does not upload all needed jars to driver node

2014-12-22 Thread Gurpreet Singh (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256653#comment-14256653 ]

Gurpreet Singh commented on SPARK-4160:
---

Looks like this bug also affects YARN cluster mode: spark-submit is not 
copying the jars/files specified with the --jars and --files options. This 
worked in version 1.0.2.

 Standalone cluster mode does not upload all needed jars to driver node
 --

 Key: SPARK-4160
 URL: https://issues.apache.org/jira/browse/SPARK-4160
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.2.0
Reporter: Marcelo Vanzin

 If you look at the code in {{DriverRunner.scala}}, there is code to download 
 the main application jar from the launcher node. But that's the only jar 
 that's downloaded - if the driver depends on one of the jars or files 
 specified via {{spark-submit --jars <list> --files <list>}}, it won't be able 
 to run.
 It should be possible to use the same mechanism to distribute the other files 
 to the driver node, even if that's not the most efficient way of doing it. 
 That way, at least, you don't need any external dependencies to be able to 
 distribute the files.
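
A minimal sketch of that idea - reusing the main-jar download path for every entry in --jars / --files - might look like the following. This is a hedged illustration; the helper name and signature are hypothetical and not the actual {{DriverRunner}} code.

{code}
import java.io.File
import java.net.URI
import java.nio.file.{Files, StandardCopyOption}

// Hypothetical helper: fetch each --jars / --files URI into the driver's
// working directory, the same way the main application jar is downloaded
// from the launcher node.
def downloadDependencies(urls: Seq[String], driverDir: File): Unit = {
  urls.foreach { url =>
    val name = new URI(url).getPath.split("/").last
    val target = new File(driverDir, name)
    if (!target.exists()) {
      val in = new URI(url).toURL.openStream()
      try Files.copy(in, target.toPath, StandardCopyOption.REPLACE_EXISTING)
      finally in.close()
    }
  }
}
{code}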




[jira] [Created] (SPARK-4931) Fix the messy format about log4j in running-on-yarn.md

2014-12-22 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-4931:
---

 Summary: Fix the messy format about log4j in running-on-yarn.md
 Key: SPARK-4931
 URL: https://issues.apache.org/jira/browse/SPARK-4931
 Project: Spark
  Issue Type: Documentation
  Components: Documentation, YARN
Reporter: Shixiong Zhu
Priority: Trivial


The log4j-related formatting in running-on-yarn.md is a bit messy.




[jira] [Updated] (SPARK-4931) Fix the messy format about log4j in running-on-yarn.md

2014-12-22 Thread Shixiong Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu updated SPARK-4931:

Attachment: log4j.png

 Fix the messy format about log4j in running-on-yarn.md
 --

 Key: SPARK-4931
 URL: https://issues.apache.org/jira/browse/SPARK-4931
 Project: Spark
  Issue Type: Documentation
  Components: Documentation, YARN
Reporter: Shixiong Zhu
Priority: Trivial
 Attachments: log4j.png


 The log4j-related formatting in running-on-yarn.md is a bit messy.




[jira] [Commented] (SPARK-4931) Fix the messy format about log4j in running-on-yarn.md

2014-12-22 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256674#comment-14256674 ]

Apache Spark commented on SPARK-4931:
-

User 'zsxwing' has created a pull request for this issue:
https://github.com/apache/spark/pull/3774

 Fix the messy format about log4j in running-on-yarn.md
 --

 Key: SPARK-4931
 URL: https://issues.apache.org/jira/browse/SPARK-4931
 Project: Spark
  Issue Type: Documentation
  Components: Documentation, YARN
Reporter: Shixiong Zhu
Priority: Trivial
 Attachments: log4j.png


 The log4j-related formatting in running-on-yarn.md is a bit messy.





[jira] [Updated] (SPARK-4349) Spark driver hangs on sc.parallelize() if exception is thrown during serialization

2014-12-22 Thread Patrick Wendell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell updated SPARK-4349:
---
Target Version/s: 1.3.0

 Spark driver hangs on sc.parallelize() if exception is thrown during 
 serialization
 --

 Key: SPARK-4349
 URL: https://issues.apache.org/jira/browse/SPARK-4349
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: Matt Cheah

 Executing the following in the Spark Shell will lead to the Spark Shell 
 hanging after a stack trace is printed. The serializer is set to the Kryo 
 serializer.
 {code}
 scala> import com.esotericsoftware.kryo.io.Input
 import com.esotericsoftware.kryo.io.Input
 scala> import com.esotericsoftware.kryo.io.Output
 import com.esotericsoftware.kryo.io.Output
 scala> class MyKryoSerializable extends com.esotericsoftware.kryo.KryoSerializable {
   def write(kryo: com.esotericsoftware.kryo.Kryo, output: Output) {
     throw new com.esotericsoftware.kryo.KryoException
   }
   def read(kryo: com.esotericsoftware.kryo.Kryo, input: Input) {
     throw new com.esotericsoftware.kryo.KryoException
   }
 }
 defined class MyKryoSerializable
 scala> sc.parallelize(Seq(new MyKryoSerializable, new MyKryoSerializable)).collect
 {code}
 A stack trace is printed during serialization as expected, but another stack 
 trace is printed afterwards, indicating that the driver can't recover:
 {code}
 14/11/11 14:10:03 ERROR OneForOneStrategy: actor name [ExecutorActor] is not unique!
 akka.actor.PostRestartException: exception post restart (class java.io.IOException)
   at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:249)
   at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:247)
   at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:302)
   at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:297)
   at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
   at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
   at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
   at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:247)
   at akka.actor.dungeon.FaultHandling$class.faultRecreate(FaultHandling.scala:76)
   at akka.actor.ActorCell.faultRecreate(ActorCell.scala:369)
   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:459)
   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
   at akka.dispatch.Mailbox.run(Mailbox.scala:219)
   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: akka.actor.InvalidActorNameException: actor name [ExecutorActor] is not unique!
   at akka.actor.dungeon.ChildrenContainer$NormalChildrenContainer.reserve(ChildrenContainer.scala:130)
   at akka.actor.dungeon.Children$class.reserveChild(Children.scala:77)
   at akka.actor.ActorCell.reserveChild(ActorCell.scala:369)
   at akka.actor.dungeon.Children$class.makeChild(Children.scala:202)
   at akka.actor.dungeon.Children$class.attachChild(Children.scala:42)
   at akka.actor.ActorCell.attachChild(ActorCell.scala:369)
   at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:552)
   at org.apache.spark.executor.Executor.<init>(Executor.scala:97)
   at org.apache.spark.scheduler.local.LocalActor.<init>(LocalBackend.scala:53)
   at org.apache.spark.scheduler.local.LocalBackend$$anonfun$start$1.apply(LocalBackend.scala:96)
   at org.apache.spark.scheduler.local.LocalBackend$$anonfun$start$1.apply(LocalBackend.scala:96)
   at akka.actor.TypedCreatorFunctionConsumer.produce(Props.scala:343)
   at akka.actor.Props.newActor(Props.scala:252)
   at akka.actor.ActorCell.newActor(ActorCell.scala:552)
   at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:234)
   ... 11 more
 {code}




[jira] [Updated] (SPARK-4349) Spark driver hangs on sc.parallelize() if exception is thrown during serialization

2014-12-22 Thread Patrick Wendell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell updated SPARK-4349:
---
Fix Version/s: (was: 1.3.0)

 Spark driver hangs on sc.parallelize() if exception is thrown during 
 serialization
 --

 Key: SPARK-4349
 URL: https://issues.apache.org/jira/browse/SPARK-4349
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: Matt Cheah





[jira] [Updated] (SPARK-4349) Spark driver hangs on sc.parallelize() if exception is thrown during serialization

2014-12-22 Thread Patrick Wendell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell updated SPARK-4349:
---
Priority: Critical  (was: Major)

 Spark driver hangs on sc.parallelize() if exception is thrown during 
 serialization
 --

 Key: SPARK-4349
 URL: https://issues.apache.org/jira/browse/SPARK-4349
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: Matt Cheah
Priority: Critical





[jira] [Updated] (SPARK-4906) Spark master OOMs with exception stack trace stored in JobProgressListener

2014-12-22 Thread Patrick Wendell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell updated SPARK-4906:
---
Component/s: Web UI

 Spark master OOMs with exception stack trace stored in JobProgressListener
 --

 Key: SPARK-4906
 URL: https://issues.apache.org/jira/browse/SPARK-4906
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 1.1.1
Reporter: Mingyu Kim

 Spark master was OOMing with a lot of stack traces retained in 
 JobProgressListener. The object reference chain is as follows:
 JobProgressListener.stageIdToData => StageUIData.taskData => 
 TaskUIData.errorMessage
 Each error message is ~10 KB since it holds the entire stack trace. As we have 
 a lot of tasks, when all of the tasks across multiple stages go bad, these 
 error messages accounted for 0.5 GB of heap at some point.
 Please correct me if I'm wrong, but it looks like all the task info for 
 running applications is kept in memory, which means a long-running application 
 is almost always bound to OOM eventually. Would it make sense to fix this, for 
 example, by spilling some UI state to disk?
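
One possible mitigation, sketched below under stated assumptions: cap how much of each stack trace is retained per task. The limit constant and helper are hypothetical, not an existing Spark setting or API.

{code}
// Hypothetical sketch: bound the heap cost of each retained task error
// by keeping only the first few KB of its stack trace.
val maxErrorLength = 2048 // assumed limit; not a real Spark config

def truncateError(errorMessage: String): String =
  if (errorMessage.length <= maxErrorLength) errorMessage
  else errorMessage.take(maxErrorLength) + "... (truncated)"
{code}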






[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2014-12-22 Thread Nicholas Chammas (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256683#comment-14256683 ]

Nicholas Chammas commented on SPARK-3821:
-

Per the discussion earlier, I've 
[updated|https://github.com/nchammas/spark-ec2/tree/packer/packer] the Packer 
build configuration to drop the release-specific builds. I've also added GNU 
parallel to the list of installed tools and will use it in place of the 
{{while ... rsync ... & wait}} pattern used throughout the various setup scripts.

I'll test out these changes on small (< 5 nodes) and large (>= 100 nodes) 
cluster launches and post updated benchmarks as well as an updated README and 
proposal.

 Develop an automated way of creating Spark images (AMI, Docker, and others)
 ---

 Key: SPARK-3821
 URL: https://issues.apache.org/jira/browse/SPARK-3821
 Project: Spark
  Issue Type: Improvement
  Components: Build, EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
 Attachments: packer-proposal.html


 Right now the creation of Spark AMIs or Docker containers is done manually. 
 With tools like [Packer|http://www.packer.io/], we should be able to automate 
 this work, and do so in such a way that multiple types of machine images can 
 be created from a single template.






[jira] [Reopened] (SPARK-4325) Improve spark-ec2 cluster launch times

2014-12-22 Thread Nicholas Chammas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Chammas reopened SPARK-4325:
-

Hey [~joshrosen], though [#3195|https://github.com/apache/spark/pull/3195] 
relates to this JIRA issue, it does not resolve it completely. There are 
several other improvements described here that have not been implemented yet.

In the future, should we try to have one PR match one JIRA issue? This issue 
could easily be an umbrella issue spanning several sub-tasks, one of which has 
been taken care of by the aforementioned PR.

 Improve spark-ec2 cluster launch times
 --

 Key: SPARK-4325
 URL: https://issues.apache.org/jira/browse/SPARK-4325
 Project: Spark
  Issue Type: Improvement
  Components: EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
Priority: Minor
 Fix For: 1.3.0


 There are several optimizations we know we can make to [{{setup.sh}} | 
 https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches 
 faster.
 There are also some improvements to the AMIs that will help a lot.
 Potential improvements:
 * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This 
 will reduce or eliminate SSH wait time and Ganglia init time.
 * Replace instances of {{download; rsync to rest of cluster}} with parallel 
 downloads on all nodes of the cluster.
 * Replace instances of
 {code}
 for node in $NODES; do
   command &
   sleep 0.3
 done
 wait
 {code}
  with simpler calls to {{pssh}}.
 * Remove the [linear backoff | 
 https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665]
  when we wait for SSH availability now that we are already waiting for EC2 
 status checks to clear before testing SSH.






[jira] [Commented] (SPARK-4241) spark_ec2.py support China AWS region: cn-north-1

2014-12-22 Thread Nicholas Chammas (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256694#comment-14256694 ]

Nicholas Chammas commented on SPARK-4241:
-

[~joshrosen] - I noticed you linked this issue to [SPARK-4890]. Does a boto 
upgrade somehow enable us to support the {{cn-north-1}} region? I thought it 
was a limitation imposed intentionally by AWS/the Chinese government.

 spark_ec2.py support China AWS region: cn-north-1
 -

 Key: SPARK-4241
 URL: https://issues.apache.org/jira/browse/SPARK-4241
 Project: Spark
  Issue Type: Improvement
  Components: EC2
Reporter: Haitao Yao

 Amazon started a new region in China: cn-north-1. But 
 https://github.com/mesos/spark-ec2/tree/v4/ami-list has no AMI ID for the 
 cn-north-1 region, so ec2/spark_ec2.py fails at this step. 
 We need to add an AMI ID for region cn-north-1 to 
 https://github.com/mesos/spark-ec2/tree/v4/ami-list






[jira] [Created] (SPARK-4932) Add help comments in Analytics

2014-12-22 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-4932:
---

 Summary: Add help comments in Analytics
 Key: SPARK-4932
 URL: https://issues.apache.org/jira/browse/SPARK-4932
 Project: Spark
  Issue Type: Improvement
  Components: GraphX
Reporter: Takeshi Yamamuro
Priority: Trivial


Add help comments for taskType in Analytics.
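
As a rough illustration of what such help output could contain, here is a hedged sketch; the task names and wording are assumptions about the Analytics command-line driver, not its actual usage text.

{code}
// Hypothetical usage message for the GraphX Analytics driver.
def printUsage(): Unit = {
  System.err.println(
    """Usage: Analytics <taskType> <file> [other options]
      |Supported task types (assumed for illustration):
      |  pagerank     run PageRank
      |  cc           compute connected components
      |  triangles    count triangles
      |""".stripMargin)
}
{code}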






[jira] [Commented] (SPARK-4932) Add help comments in Analytics

2014-12-22 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256716#comment-14256716 ]

Apache Spark commented on SPARK-4932:
-

User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/3775

 Add help comments in Analytics
 --

 Key: SPARK-4932
 URL: https://issues.apache.org/jira/browse/SPARK-4932
 Project: Spark
  Issue Type: Improvement
  Components: GraphX
Reporter: Takeshi Yamamuro
Priority: Trivial

 Add help comments for taskType in Analytics.


