Re: How do I parallelize Spark Jobs at Executor Level.

2015-10-30 Thread Deng Ching-Mallete
…to load the files and continue processing in parallel, then a simple .map should work. If you want to execute arbitrary code based on the list of files that each executor received, then you need to use .foreach, which will get executed for each of the entries, on the worker.
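A minimal sketch of the two options described above, assuming a JavaSparkContext named sc and hypothetical helpers loadFile/processFile that do not appear in the thread; Java 7 anonymous classes, matching the poster's stated environment:

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.VoidFunction;

public class MapVsForeach {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("map-vs-foreach"));
        List<String> fileList = Arrays.asList("/data/a.log", "/data/b.log");
        JavaRDD<String> files = sc.parallelize(fileList);

        // Option 1: load each file and keep the results as a new RDD.
        JavaRDD<String> contents = files.map(new Function<String, String>() {
            @Override
            public String call(String path) throws Exception {
                return loadFile(path); // hypothetical helper, executed on the worker
            }
        });
        contents.count(); // map is lazy; an action is what triggers the parallel work

        // Option 2: run arbitrary side-effecting code once per entry, on the workers.
        files.foreach(new VoidFunction<String>() {
            @Override
            public void call(String path) throws Exception {
                processFile(path); // hypothetical helper, executed on the worker
            }
        });

        sc.stop();
    }

    private static String loadFile(String path) { return path; } // illustrative stub
    private static void processFile(String path) { }             // illustrative stub
}

Both functions are serialized and shipped to the executors; since map is lazy, the count() action is what actually runs the work in parallel.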

Re: How do I parallelize Spark Jobs at Executor Level.

2015-10-30 Thread Deng Ching-Mallete
On Wed, Oct 28, 2015 at 8:29 PM Adrian Tanase <atan...@adobe.com> wrote: The first line is distributing your fileList variable in the cluster as an RDD, partitioned using the default partitioner settings (e.g. the number of cores in your cluster). …

Re: How do I parallelize Spark Jobs at Executor Level.

2015-10-30 Thread Vinoth Sankar
…cores in your cluster). Each of your workers would get one or more slices of data (depending on how many cores each executor has), and the abstraction is called a partition. What is your use case? If you want to load the files and continue…
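To make the slicing concrete, here is a small sketch (the app name, paths, and the slice count 8 are made up for illustration, not taken from the thread) that prints the default partition count next to an explicitly requested one:

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class PartitionCount {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("partition-count"));
        List<String> fileList = Arrays.asList("/a.log", "/b.log", "/c.log");

        // Default slicing: governed by spark.default.parallelism.
        JavaRDD<String> byDefault = sc.parallelize(fileList);
        System.out.println("default partitions: " + byDefault.partitions().size());

        // Explicit slicing: request a fixed number of partitions up front.
        JavaRDD<String> explicit = sc.parallelize(fileList, 8);
        System.out.println("explicit partitions: " + explicit.partitions().size());

        sc.stop();
    }
}

With the default settings, parallelize() uses spark.default.parallelism, which on most cluster managers defaults to the total number of executor cores, matching the explanation above.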

Re: How do I parallelize Spark Jobs at Executor Level.

2015-10-29 Thread Vinoth Sankar
…use .foreach, which will get executed for each of the entries, on the worker. -adrian

From: Vinoth Sankar
Date: Wednesday, October 28, 2015 at 2:49 PM
To: "user@spark.apache.org"
Subject: How do I parallelize Spark Jobs at Executor Level.

Hi, …

How do I parallelize Spark Jobs at Executor Level.

2015-10-28 Thread Vinoth Sankar
Hi, I'm reading and filtering a large number of files using Spark. It's getting parallelized at the Spark Driver level only. How do I make it parallelize at the Executor (Worker) level? Refer to the following sample. Is there any way to iterate the localIterator in parallel? Note: I use Java 1.7. JavaRDD…
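The sample code is cut off in the archive, but the symptom described (parallelism only at the driver) matches iterating toLocalIterator() on the driver. Below is a sketch of that pattern next to the executor-side alternative suggested in the replies; the paths and the matches() predicate are stand-ins, not code from the thread:

import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class DriverVsExecutor {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("driver-vs-executor"));
        List<String> fileList = Arrays.asList("/a.log", "/b.txt");
        JavaRDD<String> files = sc.parallelize(fileList);

        // Anti-pattern: toLocalIterator() streams partitions back to the driver,
        // so this loop runs sequentially in the driver JVM only.
        Iterator<String> it = files.toLocalIterator();
        while (it.hasNext()) {
            matches(it.next());
        }

        // Executor-side alternative: push the predicate into an RDD operation,
        // which runs as parallel tasks on the workers.
        JavaRDD<String> kept = files.filter(new Function<String, Boolean>() {
            @Override
            public Boolean call(String path) {
                return matches(path); // stand-in predicate, evaluated on workers
            }
        });
        System.out.println("kept: " + kept.count());

        sc.stop();
    }

    private static boolean matches(String path) { return path.endsWith(".log"); } // stub
}

toLocalIterator() pulls each partition back to the driver one at a time, so any per-element work inside that loop is inherently sequential; filter() and foreach() run inside the executors instead.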

Re: How do I parallelize Spark Jobs at Executor Level.

2015-10-28 Thread Adrian Tanase
…of the entries, on the worker. -adrian

From: Vinoth Sankar
Date: Wednesday, October 28, 2015 at 2:49 PM
To: "user@spark.apache.org"
Subject: How do I parallelize Spark Jobs at Executor Level.

Hi, I'm reading and filtering a large number of files using Spark…