From: ayan guha [mailto:guha.a...@gmail.com]
Sent: 18 May 2015 15:46
To: Laeeq Ahmed
Cc: user@spark.apache.org
Subject: Re: Processing multiple columns in parallel
My first thought would be to create 10 RDDs and run your word count on each
of them. I think the Spark scheduler will resolve the dependencies in
parallel and launch 10 jobs.
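Roughly something like this (an untested sketch; it assumes a SparkContext
`sc` as in spark-shell, and "data.tsv" is a placeholder for your file path):

    val lines = sc.textFile("data.tsv")   // placeholder path

    // One word count per column; each reduceByKey action below is an
    // independent Spark job over the same cached/base RDD.
    val countsPerColumn = (0 until 10).map { i =>
      lines.map(_.split("\t")(i))         // project column i
           .map(word => (word, 1))
           .reduceByKey(_ + _)
    }

    // Triggering the actions from parallel threads lets the scheduler
    // run the 10 jobs concurrently (subject to available cores).
    countsPerColumn.par.foreach(_.collect().foreach(println))

Note the .par is just Scala parallel collections on the driver side; the
jobs themselves still share the cluster's executors.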
Best
Ayan
On 18 May 2015 23:41, "Laeeq Ahmed" wrote:
> Hi,
>
> Consider I have a tab delimited text file with 10 columns. Eac