Re: [Spark Launcher] How to launch parallel jobs?

2017-02-14 Thread Cosmin Posteuca
…actual usage. BTW, it doesn't matter how much memory your program wants but how much it reserves. In your example it will not take the 50 MB of the test but the ~1.5 GB (after overhead) per executor. Hope this helps, Assaf
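For reference, here is a minimal sketch of how a 1 GB executor request ends up as a ~1.5 GB YARN container, assuming the Spark-on-YARN default overhead of max(384 MB, 10% of executor memory) and a hypothetical 512 MB allocation increment on the cluster:

  // Sketch of the container-size arithmetic behind the ~1.5 GB figure above.
  // Assumptions: default memory overhead = max(384 MB, 10% of executor memory),
  // and YARN rounding requests up to a 512 MB minimum-allocation step (cluster-dependent).
  val executorMemoryMb = 1024                                   // --executor-memory 1g
  val overheadMb       = math.max(384, (0.10 * executorMemoryMb).toInt)
  val allocationStepMb = 512                                    // assumed yarn.scheduler.minimum-allocation-mb
  val containerMb      = math.ceil((executorMemoryMb + overheadMb).toDouble / allocationStepMb).toInt * allocationStepMb
  println(s"requested $executorMemoryMb MB -> $containerMb MB container per executor")  // 1536 MB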

Re: [Spark Launcher] How to launch parallel jobs?

2017-02-14 Thread Cosmin Posteuca
…Assaf. From: Cosmin Posteuca [mailto:cosmin.poste...@gmail.com] Sent: Tuesday, February 14, 2017 9:53 AM To: Egor Pahomov Cc: user Subject: Re: [Spark Launcher] How to launch parallel jobs? Hi Egor, About the first problem…

RE: [Spark Launcher] How to launch parallel jobs?

2017-02-14 Thread Mendelson, Assaf
[quoting Cosmin Posteuca] Hi Egor, About the first problem I think you are right, it makes sense. About the second problem, I checked the available resources on port 8088 and it shows 16 available cores. I start my job with 4 executors with 1 core each, and…
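As a side note, the numbers shown by the UI on port 8088 can also be read programmatically from the ResourceManager REST API; a minimal sketch (the host name below is a placeholder, and a real client would parse the JSON rather than print it):

  // Sketch: query the YARN ResourceManager (the same service behind the UI on port 8088)
  // for cluster metrics such as availableVirtualCores and availableMB.
  import scala.io.Source
  val metrics = Source.fromURL("http://resourcemanager-host:8088/ws/v1/cluster/metrics").mkString
  println(metrics)  // response contains fields like "availableVirtualCores" and "availableMB"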

Re: [Spark Launcher] How to launch parallel jobs?

2017-02-13 Thread Cosmin Posteuca
Hi Egor, About the first problem I think you are right, it makes sense. About the second problem, I checked the available resources on port 8088 and it shows 16 available cores. I start my job with 4 executors with 1 core each, and 1 GB per executor. My job uses at most 50 MB of memory (just for the test).
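For context, a submission with those settings through the launcher would look roughly like the sketch below (the jar path and main class are placeholders, not the actual job):

  // Sketch of a SparkLauncher submission matching the settings described above:
  // 4 executors, 1 core and 1 GB each. App resource and main class are placeholders.
  import org.apache.spark.launcher.SparkLauncher

  val handle = new SparkLauncher()
    .setAppResource("/path/to/test-job.jar")
    .setMainClass("com.example.TestJob")
    .setMaster("yarn")
    .setConf(SparkLauncher.EXECUTOR_MEMORY, "1g")
    .setConf(SparkLauncher.EXECUTOR_CORES, "1")
    .setConf("spark.executor.instances", "4")
    .startApplication()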

Re: [Spark Launcher] How to launch parallel jobs?

2017-02-13 Thread Egor Pahomov
About the second problem: as I understand it, this can happen in two cases: (1) one job prevents the other one from getting resources for its executors, or (2) the bottleneck is reading from disk, which you cannot really parallelize. I have no experience with the second case, but it's easy to verify the first one: just look…
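One way to check the first case from the launcher side is to start both applications and watch whether the second one ever reaches RUNNING while the first one is still running; a rough sketch (jar path and class name are placeholders):

  // Sketch: launch two applications in parallel and log their state transitions.
  // If the second app never reaches RUNNING while the first one runs, it is waiting for resources.
  import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

  def launch(name: String): SparkAppHandle =
    new SparkLauncher()
      .setAppName(name)
      .setAppResource("/path/to/test-job.jar")   // placeholder
      .setMainClass("com.example.TestJob")       // placeholder
      .setMaster("yarn")
      .startApplication(new SparkAppHandle.Listener {
        override def stateChanged(h: SparkAppHandle): Unit = println(s"$name -> ${h.getState}")
        override def infoChanged(h: SparkAppHandle): Unit  = ()
      })

  val first  = launch("job-1")
  val second = launch("job-2")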

Re: [Spark Launcher] How to launch parallel jobs?

2017-02-13 Thread Egor Pahomov
"But if i increase only executor-cores the finish time is the same". More experienced ones can correct me, if I'm wrong, but as far as I understand that: one partition processed by one spark task. Task is always running on 1 core and not parallelized among cores. So if you have 5 partitions and