RE: Problem understanding spark word count execution

2015-10-02 Thread java8964
t()", which will transfer the final result to driver and dump it in the console. Yong Date: Fri, 2 Oct 2015 00:50:24 -0700 Subject: Re: Problem understanding spark word count execution From: kar...@bluedata.com To: java8...@hotmail.com CC: nicolae.maras...@adswizz.com; user@spark.apache.

Re: Problem understanding spark word count execution

2015-10-02 Thread Kartik Mathur
using > "rdd.collect()", which will transfer the final result to driver and dump it > in the console. > > Yong > > -- > Date: Fri, 2 Oct 2015 00:50:24 -0700 > Subject: Re: Problem understanding spark word count execution > From: kar...@blu

RE: Problem understanding spark word count execution

2015-10-02 Thread java8964
…fraction of java heap to use as the SortBuffer area. You can find more information in this Jira: https://issues.apache.org/jira/browse/SPARK-2045 Yong -- Date: Fri, 2 Oct 2015 11:55:41 -0700 Subject: Re: Problem understanding spark word count execution From: kar...@bluedata.com To: java8...@hotmail.com …
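For context, SPARK-2045 tracks the sort-based shuffle implementation. A hedged configuration sketch using the Spark 1.x property names Yong appears to be referring to (verify against your release's configuration docs; the local master is only for illustration):

    import org.apache.spark.{SparkConf, SparkContext}

    // Assumed Spark 1.x settings; names may differ in later releases.
    val conf = new SparkConf()
      .setAppName("WordCount")
      .setMaster("local[*]")
      .set("spark.shuffle.manager", "sort")        // sort-based shuffle from SPARK-2045
      .set("spark.shuffle.memoryFraction", "0.2")  // fraction of heap for shuffle/aggregation buffers
    val sc = new SparkContext(conf)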

Re: Problem understanding spark word count execution

2015-10-02 Thread Kartik Mathur
> …kind of low, but I really don't have any other explanation of its meaning. > If your final output shows hundreds of unique words, then it is. > The 2000 bytes sent to the driver is the final output aggregated on the reducers' end, and merged back to the driver. > Yong
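A back-of-the-envelope check of Yong's figure, using assumed numbers (a few hundred short unique words), shows why the driver-side result can be this small:

    // Assumptions only: 250 unique words, roughly 8 bytes per (word, count) pair.
    val uniqueWords = 250
    val bytesPerPair = 8
    println(s"approx aggregated output: ${uniqueWords * bytesPerPair} bytes")  // ~2000 bytes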

Re: Problem understanding spark word count execution

2015-10-01 Thread Kartik Mathur
…*Sent:* Thursday, October 1, 2015 12:42 AM > *To:* user > *Subject:* Problem understanding spark word count execution > Hi All, > I tried running spark word count and I have a couple of questions - > I am analyzing stage 0, i.e. *sc.textFile -> flatMap -> Map (Word count example)* …

Re: Problem understanding spark word count execution

2015-10-01 Thread Kartik Mathur
> …can share more of your context if still unclear. > I just made assumptions to give clarity on a similar thing. > Nicu > -- > *From:* Kartik Mathur <kar...@bluedata.com> > *Sent:* Thursday, October 1, 2015 10:25 PM > *To:* Nicolae Marasoiu …

Re: Problem understanding spark word count execution

2015-10-01 Thread Nicolae Marasoiu
…I just made assumptions to give clarity on a similar thing. Nicu -- From: Kartik Mathur <kar...@bluedata.com> Sent: Thursday, October 1, 2015 10:25 PM To: Nicolae Marasoiu Cc: user Subject: Re: Problem understanding spark word count execution Thanks Nicolae, so in my case all executors …

RE: Problem understanding spark word count execution

2015-10-01 Thread java8964
…The 2000 bytes sent to the driver is the final output aggregated on the reducers' end, and merged back to the driver. Yong -- Date: Thu, 1 Oct 2015 13:33:59 -0700 Subject: Re: Problem understanding spark word count execution From: kar...@bluedata.com To: nicolae.maras...@adswizz.com CC: user@spark.apache.org Hi Nicolae, thanks for the reply. To further clarify …

Problem understanding spark word count execution

2015-09-30 Thread Kartik Mathur
Hi All, I tried running spark word count and I have a couple of questions - I am analyzing stage 0, i.e. *sc.textFile -> flatMap -> Map (Word count example)* 1) In the *Stage logs* under Application UI details for every task I am seeing Shuffle write as 2.7 KB, *question - how can I know where …
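A minimal sketch of the job being analyzed, with the stage boundary marked; spark-shell style, and the HDFS path is a placeholder, not from the thread:

    // Stage 0: textFile -> flatMap -> map are pipelined per input partition and
    // end in the per-task shuffle write (the ~2.7 KB reported in the Stage logs).
    val pairs = sc.textFile("hdfs:///tmp/input.txt")
      .flatMap(line => line.split("\\s+"))
      .map(word => (word, 1))

    // Stage 1: reduceByKey reads the shuffled output and sums the counts per word.
    val counts = pairs.reduceByKey(_ + _)
    counts.collect().foreach(println)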

Re: Problem understanding spark word count execution

2015-09-30 Thread Nicolae Marasoiu
…I bet shuffle is just sending out the textFile to a few nodes to distribute the partitions. From: Kartik Mathur <kar...@bluedata.com> Sent: Thursday, October 1, 2015 12:42 AM To: user Subject: Problem understanding spark word count execution Hi All, …
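Since Nicolae's point is about how the input is partitioned and distributed, a quick way to check the split count from the shell; the path and the value 8 are illustrative assumptions:

    // Each partition of the text file becomes one map task in stage 0.
    val lines = sc.textFile("hdfs:///tmp/input.txt")
    println(s"partitions (map tasks in stage 0): ${lines.partitions.length}")

    // Requesting a higher minimum split count changes how many map tasks run.
    val moreSplits = sc.textFile("hdfs:///tmp/input.txt", 8)
    println(s"with minPartitions = 8: ${moreSplits.partitions.length}")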