On Tue, Jun 27, 2017 at 9:17 AM, 萝卜丝炒饭 <1427357...@qq.com> wrote:

> My words caused a misunderstanding.
> Step 1: A is submitted to Spark.
> Step 2: B is submitted to Spark.
>
> Spark gets two independent jobs, and the FAIR scheduler can then share resources between A and B.
> Jeffrey's code did not result in two submits.
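(For context: `spark.scheduler.mode = FAIR` only comes into play when jobs are actually submitted concurrently from separate driver threads; a single driver thread still runs A and B one after the other. A minimal sketch of submitting A and B as two concurrent jobs — the pool names and the `sum()` workloads are hypothetical, and a live `SparkContext` `sc` with FAIR mode enabled is assumed:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global

// Each Future runs on its own driver-side thread, so A and B become two
// concurrently submitted jobs that the FAIR scheduler can interleave.
val jobA = Future {
  sc.setLocalProperty("spark.scheduler.pool", "poolA") // hypothetical pool name
  sc.parallelize(1 to 1000000).sum()
}
val jobB = Future {
  sc.setLocalProperty("spark.scheduler.pool", "poolB") // hypothetical pool name
  sc.parallelize(1 to 1000000).sum()
}
Await.result(jobA, Duration.Inf)
Await.result(jobB, Duration.Inf)
```

`setLocalProperty("spark.scheduler.pool", ...)` assigns the jobs submitted from that thread to a named fair-scheduler pool.)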
---Original---
From: "Pralabh Kumar"<pralabhku...@gmail.com>
Date: 2017/6/27 12:09:27
To: "萝卜丝炒饭"<1427357...@qq.com>;
Cc: "user"<user@spark.apache.org>; "satishl"<satish.la...@gmail.com>; "Bryan Jeffrey"<bryan.jeff...@gmail.com>;
Subject: Re: Question about Parallel Stages in Spark

Hi
Subject: Re: Question about Parallel Stages in Spark

Hello.

The driver is running the individual operations in series, but each operation is parallelized internally. If you want them run in parallel you need to provide the driver a mechanism to thread the jobs.
It runs the first and second statements in serial fashion. I have set spark.scheduler.mode = FAIR. Obviously my understanding of parallel stages is wrong. What am I missing?

val rdd1 = sc.parallelize(1 to 100)
val rdd2 = sc.parallelize(1 to 100)

for (i <- (1 to 100))
  println("first: " + rdd1.sum())
for (i <- (1 to 100))
  println("second: " + rdd2.sum())
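The "mechanism to thread the jobs" mentioned above can be sketched by wrapping each loop in a Future, so the driver submits both streams of jobs from separate threads. This is a sketch, not the original poster's code; it assumes a live `SparkContext` `sc`:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global

val rdd1 = sc.parallelize(1 to 100)
val rdd2 = sc.parallelize(1 to 100)

// Each Future executes on its own driver-side thread, so the sum() jobs
// on rdd1 and rdd2 are submitted concurrently instead of in series.
val first  = Future { for (i <- 1 to 100) println("first: " + rdd1.sum()) }
val second = Future { for (i <- 1 to 100) println("second: " + rdd2.sum()) }

Await.result(first, Duration.Inf)
Await.result(second, Duration.Inf)
```

With concurrent submission like this, FAIR mode (or fair-scheduler pools) can actually interleave the two jobs' stages.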
Yes, but what I show can be done in one Spark job.

On Wed, Jul 16, 2014 at 5:01 AM, Wei Tan <w...@us.ibm.com> wrote:

> Thanks Sean. In Oozie you can use fork-join; however, using Oozie to drive Spark jobs, the jobs will not be able to share RDDs. (Am I right? I think multiple jobs submitted by Oozie will ...)
Hi, I wonder if I do wordcount on two different files, like this:

val file1 = sc.textFile("/...")
val file2 = sc.textFile("/...")
val wc1 = file1.flatMap(...).reduceByKey(_ + _, 1)
val wc2 = file2.flatMap(...).reduceByKey(_ + _, 1)
wc1.saveAsTextFile("titles.out")
wc2.saveAsTextFile("tables.out")

Would the two wordcounts run in parallel?
The last two lines are what trigger the operations, and they will each block until the result is computed and saved. So if you execute this code as-is, no. You could write a Scala program that invokes these two operations in parallel, like:

Array((wc1, "titles.out"), (wc2, "tables.out")).par.foreach {
  case (rdd, path) => rdd.saveAsTextFile(path)
}
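A note on the `.par` trick: it relies on Scala's parallel collections, which run the `foreach` body for different elements on a shared thread pool (in Scala 2.13+ this requires the separate scala-parallel-collections module and `import scala.collection.parallel.CollectionConverters._`). A Spark-free sketch of the same pattern, with `Thread.sleep` standing in for the blocking save calls:

```scala
// The two simulated "save" tasks overlap instead of running back to back,
// so total elapsed time is roughly one task's duration, not two.
val outputs = Array("titles.out", "tables.out")
val start = System.nanoTime()
outputs.par.foreach { path =>
  Thread.sleep(1000) // stand-in for a blocking saveAsTextFile call
  println(s"saved $path")
}
println(s"elapsed: ${(System.nanoTime() - start) / 1000000} ms")
```

`.par` gives little control over the degree of parallelism; for driver-side job submission, explicit Futures with a sized thread pool are the more common choice.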
--
Wei Tan, PhD
Research Staff Member
IBM T. J. Watson Research Center
http://researcher.ibm.com/person/us-wtan

From: Sean Owen <so...@cloudera.com>
To: user@spark.apache.org
Date: 07/15/2014 04:37 PM
Subject: Re: parallel stages?