Steve, something like this will do, I think:
sc.parallelize(1 to 1000, 1000).flatMap(x => 1 to 100000)
The above will launch 1000 tasks (maps), with each task creating 10^5
numbers (a total of 100 million elements).
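As a quick sanity check, the element-count arithmetic can be reproduced with plain Scala collections, whose flatMap expands elements the same way the RDD version does (scaled down here so it runs instantly; the names below are illustrative, not Spark code):

```scala
object FlatMapCount {
  def main(args: Array[String]): Unit = {
    // Scaled-down analogue of sc.parallelize(1 to 1000, 1000).flatMap(x => 1 to 100000):
    // 10 input elements, each expanded to 5 outputs => 10 * 5 = 50 elements in total.
    val expanded = (1 to 10).flatMap(x => 1 to 5)
    println(expanded.size) // 50
    // The full-size job would produce 1000 * 100000 = 100 million elements.
    println(1000L * 100000L) // 100000000
  }
}
```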
On Mon, Dec 8, 2014 at 6:17 PM, Steve Lewis lordjoe2...@gmail.com wrote:
Can someone explain the motivation behind passing the executorAdded event to
*DAGScheduler*? *DAGScheduler* does *submitWaitingStages* when the *executorAdded*
method is called by *TaskSchedulerImpl*. I see an issue in the code
below,
*TaskSchedulerImpl.scala* code:
if (!executorsByHost.contains(o.host))
Some corrections.
On Fri, Sep 26, 2014 at 5:32 PM, praveen seluka praveen.sel...@gmail.com
wrote:
Can someone explain the motivation behind passing the executorAdded event to
*DAGScheduler*? *DAGScheduler* does *submitWaitingStages* when the *executorAdded*
method is called by *TaskSchedulerImpl*. I
I’m not sure if it will create an issue when you have multiple workers on
the same host, as submitWaitingStages is called everywhere and I have never
tried such a deployment mode
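A minimal sketch of the concern (hypothetical names, not the actual TaskSchedulerImpl source): if the executorAdded callback only fires when a host is seen for the first time, a second worker registering on an already-known host would not trigger it:

```scala
import scala.collection.mutable

object ExecutorAddedSketch {
  // Hypothetical stand-in for TaskSchedulerImpl's per-host bookkeeping.
  val executorsByHost = mutable.HashMap.empty[String, mutable.Set[String]]
  var executorAddedCalls = 0

  def registerExecutor(execId: String, host: String): Unit = {
    if (!executorsByHost.contains(host)) {
      executorsByHost(host) = mutable.Set.empty[String]
      executorAddedCalls += 1 // DAGScheduler.executorAdded would be invoked here
    }
    executorsByHost(host) += execId
  }

  def main(args: Array[String]): Unit = {
    registerExecutor("exec-1", "host-a")
    registerExecutor("exec-2", "host-a") // same host: no second executorAdded call
    registerExecutor("exec-3", "host-b")
    println(executorAddedCalls) // 2
  }
}
```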
Best,
--
Nan Zhu
On Friday, September 26, 2014 at 8:02 AM, praveen seluka wrote:
Can someone explain the motivation
Hi all
I’m seeing a strange issue in Spark on YARN (stable). Let me know if this is
known, or if I’m missing something, as it looks very fundamental.
Launch a Spark job with 2 containers. addContainerRequest is called twice and
then allocate is called on the AMRMClient. This gets 2 containers allocated.
Fine as of
Mailed our list - will send it to Spark Dev
On Fri, Sep 5, 2014 at 11:28 AM, Rajat Gupta rgu...@qubole.com wrote:
+1 on this. First step toward more automated autoscaling of the Spark
application master...
On Fri, Sep 5, 2014 at 12:56 AM, Praveen Seluka psel...@qubole.com
wrote:
+user
On Thu, Sep 4, 2014 at 10:53 PM, Praveen Seluka psel...@qubole.com wrote:
Spark on YARN has static allocation of resources.
https://issues.apache.org/jira/browse/SPARK-3174 - This JIRA by Sandy is
about adding and removing executors dynamically based on load. Even before
doing
If you want to make Twitter* classes available in your shell, I believe you
could do the following:
1. Change the parent pom module ordering - move external/twitter before
assembly
2. In assembly/pom.xml, add the external/twitter dependency - this will
package twitter* classes into the assembly jar
Now when
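For step 2, the dependency block would look roughly like the following (the artifactId is assumed from the Spark 1.x build layout of that era; check your local external/twitter/pom.xml for the exact artifactId and version):

```xml
<!-- assembly/pom.xml: pulls the external twitter module into the assembly jar -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-twitter_2.10</artifactId>
  <version>${project.version}</version>
</dependency>
```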
If I understand correctly, you cannot change the number of executors at
runtime, right (correct me if I’m wrong)? It’s defined when we start the
application and fixed. Do you mean the number of tasks?
On Fri, Jul 11, 2014 at 6:29 AM, Tathagata Das tathagata.das1...@gmail.com
wrote:
Can you try
Praveen Seluka psel...@qubole.com:
I am trying to run Spark on YARN. I have a Hadoop 2.2 cluster (YARN +
HDFS) in EC2. Then I compiled Spark using Maven with the 2.2 Hadoop profiles.
Now I am trying to run the example Spark job (in yarn-cluster mode)
from my *local machine*. I have setup
I am trying to run Spark on YARN. I have a Hadoop 2.2 cluster (YARN +
HDFS) in EC2. Then I compiled Spark using Maven with the 2.2 Hadoop profiles.
Now I am trying to run the example Spark job (in yarn-cluster mode)
from my *local machine*. I have set up the HADOOP_CONF_DIR environment variable
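For reference, the client-side setup amounts to something like this (paths, jar name, and the argument to SparkPi are placeholders; yarn-cluster was the master name in Spark 1.x):

```shell
# Point the Spark client at the cluster's Hadoop/YARN configuration
export HADOOP_CONF_DIR=/path/to/hadoop/conf   # placeholder path

# Submit the example in yarn-cluster mode (Spark 1.x syntax)
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  lib/spark-examples-*.jar \
  10
```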