Memory seems to be enough. My cluster has 22.5 GB of memory in total and my job uses 6.88 GB. If I run this job twice, the two runs use 13.75 GB, but sometimes the cluster has a memory spike up to 19.5 GB.

Thanks,
Cosmin
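A minimal sketch of how a requested executor size turns into the amount YARN actually reserves — the "almost 1.5 GB" figure Assaf mentions below — assuming Spark's default memory overhead of max(384 MB, 10% of executor memory) and a hypothetical yarn.scheduler.minimum-allocation-mb of 1024 MB; the exact numbers depend on your YARN configuration:

    // Rough estimate of what YARN reserves per executor container, assuming
    // Spark's default overhead (max(384 MB, 10% of executor memory)) and a
    // hypothetical minimum allocation of 1024 MB per container.
    object ContainerSizeEstimate {
      def containerMb(executorMemoryMb: Int, minAllocationMb: Int = 1024): Int = {
        val overheadMb = math.max(384, (executorMemoryMb * 0.10).toInt)
        val requestedMb = executorMemoryMb + overheadMb
        // YARN rounds each request up to a multiple of the minimum allocation.
        math.ceil(requestedMb.toDouble / minAllocationMb).toInt * minAllocationMb
      }

      def main(args: Array[String]): Unit = {
        // A 1 GB executor becomes a ~1.4 GB request, rounded up to a 2 GB container here.
        println(containerMb(1024))
        // Four such executors plus a 1 GB driver container:
        println(4 * containerMb(1024) + containerMb(1024))
      }
    }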
2017-02-14 10:03 GMT+02:00 Mendelson, Assaf <assaf.mendel...@rsa.com>:

> You should also check your memory usage.
>
> Let's say for example you have 16 cores and 8 GB, and that you use 4 executors with 1 core each.
>
> When you use an executor, Spark reserves it from YARN and YARN allocates the number of cores (e.g. 1 in our case) and the memory. The memory is actually more than you asked for: if you ask for 1 GB it will in fact allocate almost 1.5 GB with overhead. In addition, it will probably allocate a container for the driver (probably with 1024 MB memory usage).
>
> When you run your program and look at port 8088, you should look not only at the VCores Used out of the VCores Total, but also at the Memory Used and Memory Total. You should also navigate to the executors (e.g. Applications -> Running on the left, then choose your application and navigate all the way down to a single container). You can see the actual usage there.
>
> BTW, it doesn't matter how much memory your program wants but how much it reserves. In your example it will not take the 50 MB of the test but the ~1.5 GB (after overhead) per executor.
>
> Hope this helps,
> Assaf.
>
> *From:* Cosmin Posteuca [mailto:cosmin.poste...@gmail.com]
> *Sent:* Tuesday, February 14, 2017 9:53 AM
> *To:* Egor Pahomov
> *Cc:* user
> *Subject:* Re: [Spark Launcher] How to launch parallel jobs?
>
> Hi Egor,
>
> About the first problem I think you are right, it makes sense.
>
> About the second problem, I checked the available resources on port 8088 and it shows 16 available cores. I start my job with 4 executors with 1 core each, and 1 GB per executor. My job uses at most 50 MB of memory (just for a test). From my point of view the resources are enough, and I think the problem is in the YARN configuration files, but I don't know what is missing.
>
> Thank you
>
> 2017-02-13 21:14 GMT+02:00 Egor Pahomov <pahomov.e...@gmail.com>:
>
> About the second problem: I understand this can happen in two cases: (1) one job prevents the other one from getting resources for executors, or (2) the bottleneck is reading from disk, so you cannot really parallelize that. I have no experience with the second case, but it's easy to verify the first one: just look at your Hadoop UI and verify that both jobs get enough resources.
>
> 2017-02-13 11:07 GMT-08:00 Egor Pahomov <pahomov.e...@gmail.com>:
>
> "But if I increase only executor-cores, the finish time is the same." More experienced people can correct me if I'm wrong, but as far as I understand it: one partition is processed by one Spark task, and a task always runs on 1 core and is not parallelized across cores. So if you have 5 partitions and you increase the total number of cores in the cluster from 7 to 10, for example, you have not gained anything. But if you repartition, you give Spark the opportunity to process things in more threads, so more tasks can execute in parallel.
>
> 2017-02-13 7:05 GMT-08:00 Cosmin Posteuca <cosmin.poste...@gmail.com>:
>
> Hi,
>
> I think I don't understand well enough how to launch jobs.
>
> I have one job which takes 60 seconds to finish.
> I run it with the following command:
>
> spark-submit --executor-cores 1 \
>   --executor-memory 1g \
>   --driver-memory 1g \
>   --master yarn \
>   --deploy-mode cluster \
>   --conf spark.dynamicAllocation.enabled=true \
>   --conf spark.shuffle.service.enabled=true \
>   --conf spark.dynamicAllocation.minExecutors=1 \
>   --conf spark.dynamicAllocation.maxExecutors=4 \
>   --conf spark.dynamicAllocation.initialExecutors=4 \
>   --conf spark.executor.instances=4 \
>
> If I increase the number of partitions from code and the number of executors, the app finishes faster, which is OK. But if I increase only executor-cores, the finish time is the same, and I don't understand why. I expect the time to be lower than the initial time.
>
> My second problem: if I launch the above job twice, I expect both jobs to finish in 60 seconds, but this doesn't happen. Both jobs finish after 120 seconds and I don't understand why.
>
> I run this code on AWS EMR, on 2 instances (4 CPUs each, and each CPU has 2 threads). From what I saw in the default EMR configuration, YARN is set to FIFO (default) mode with the CapacityScheduler.
>
> What do you think about these problems?
>
> Thanks,
>
> Cosmin
>
> --
> *Sincerely yours Egor Pakhomov*
>
> --
> *Sincerely yours Egor Pakhomov*
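To make Egor's partition/task point above concrete, a small sketch (Scala, Spark 2.x API; the input path and partition counts are made-up placeholders, not from the original job):

    import org.apache.spark.sql.SparkSession

    // One partition is processed by one task, so extra cores only help once
    // there are enough partitions to keep them busy.
    object RepartitionExample {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("repartition-example").getOrCreate()
        val sc = spark.sparkContext

        val data = sc.textFile("hdfs:///some/input/path")   // hypothetical input path
        println(s"initial partitions: ${data.getNumPartitions}")

        // With e.g. 5 partitions, at most 5 tasks run at once regardless of how
        // many cores the cluster has. Repartitioning to roughly the number of
        // available cores lets more tasks run in parallel.
        val widened = data.repartition(16)
        println(s"after repartition: ${widened.getNumPartitions}")

        println(widened.map(_.length.toLong).sum())   // some work now spread over more tasks
        spark.stop()
      }
    }

Whether this actually shortens the runtime still depends on how many cores YARN grants the application.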
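And since the thread's subject is SparkLauncher, a hedged sketch of launching the same job twice in parallel programmatically, mirroring the spark-submit flags quoted above; the jar path and main class are hypothetical placeholders, and whether the two applications really run concurrently depends on YARN having enough free containers for both:

    import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

    // Launch the same application twice without waiting for the first to finish.
    // Assumes SPARK_HOME is set so the launcher can find spark-submit.
    object ParallelLaunch {
      private def launchOne(): SparkAppHandle =
        new SparkLauncher()
          .setAppResource("/path/to/my-job.jar")   // placeholder jar
          .setMainClass("com.example.MyJob")        // placeholder main class
          .setMaster("yarn")
          .setDeployMode("cluster")
          .setConf("spark.executor.cores", "1")
          .setConf("spark.executor.memory", "1g")
          .setConf("spark.driver.memory", "1g")
          .setConf("spark.dynamicAllocation.enabled", "true")
          .setConf("spark.shuffle.service.enabled", "true")
          .setConf("spark.dynamicAllocation.maxExecutors", "4")
          .startApplication()

      def main(args: Array[String]): Unit = {
        val first = launchOne()
        val second = launchOne()
        // Poll until both applications reach a terminal state.
        while (!first.getState.isFinal || !second.getState.isFinal) Thread.sleep(1000)
        println(s"first: ${first.getState}, second: ${second.getState}")
      }
    }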