You are welcome. You can also use the OS command /usr/bin/free to see how much free memory you have on each node.
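For example, if you collect `free -m` output from each node, the total and available figures can be pulled out mechanically. A minimal Python sketch (the sample output and function name below are illustrative, made up for this example):

```python
# Illustrative only: parse total/available MB from `free -m`-style output.
# The sample text below is made up; real output varies by distro/version.
sample = """\
              total        used        free      shared  buff/cache   available
Mem:          64329       41220        3109        1024       20000       21800
Swap:          8191           0        8191
"""

def free_mem_mb(free_output):
    """Return (total_mb, available_mb) from `free -m`-style output."""
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            fields = line.split()
            # fields[1] is total, last column is available (newer free versions)
            return int(fields[1]), int(fields[-1])
    raise ValueError("no Mem: line found")

total, avail = free_mem_mb(sample)
print(total, avail)  # → 64329 21800
```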
You should also see from the Spark GUI (first job on master-node:4040, the next on 4041, etc.) the resource and storage (memory) usage for each SparkSubmit job.

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 29 May 2016 at 01:16, heri wijayanto <heri0...@gmail.com> wrote:

> Thank you, Dr Mich Talebzadeh. I will capture the error messages, but
> my cluster is currently running another job. After it finishes, I
> will try your suggestions.
>
> On Sun, May 29, 2016 at 7:55 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> You should have errors in the yarn-nodemanager and yarn-resourcemanager logs.
>>
>> Something like the below for a healthy container:
>>
>> 2016-05-29 00:50:50,496 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 29769 for container-id container_1464210869844_0061_01_000001: 372.6 MB of 4 GB physical memory used; 2.7 GB of 8.4 GB virtual memory used
>>
>> It appears that you are running out of memory. Have you also checked with
>> jps and jmonitor the SparkSubmit (driver) process for the failing job?
>> It will show you the resource usage, like memory/heap/CPU etc.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>> On 29 May 2016 at 00:26, heri wijayanto <heri0...@gmail.com> wrote:
>>
>>> I use Spark's join function to process around 250 million rows of text.
>>>
>>> When I used just several hundred rows it ran, but when I use
>>> the large data set it fails.
>>>
>>> My Spark version is 1.6.1, running in yarn-cluster mode, and we have 5
>>> node computers.
>>>
>>> Thank you very much, Ted Yu.
>>>
>>> On Sun, May 29, 2016 at 6:48 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> Can you let us know your case?
>>>>
>>>> When the join failed, what was the error (consider pastebin)?
>>>>
>>>> Which release of Spark are you using?
>>>>
>>>> Thanks
>>>>
>>>> > On May 28, 2016, at 3:27 PM, heri wijayanto <heri0...@gmail.com> wrote:
>>>> >
>>>> > Hi everyone,
>>>> > I perform a join in a loop, and it fails. I found a
>>>> > tutorial on the web that says I should use a broadcast variable, but
>>>> > that is not a good choice when joining in a loop.
>>>> > I need your suggestion to address this problem, thank you very much.
>>>> > And I am sorry, I am a beginner in Spark programming.
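On the broadcast-variable question raised in the thread: a broadcast join only helps when one side of the join is small enough to be held in memory on every executor, because it replaces the shuffle with a local hash lookup. The idea, sketched in plain Python rather than Spark (all names here are illustrative; this is not Spark's API):

```python
# Plain-Python sketch of a broadcast (map-side) join: the small table is
# turned into a hash map every worker could hold in memory, so the large
# side is joined by lookup instead of a shuffle. Not Spark code.

def broadcast_join(large_rows, small_rows):
    """Join (key, value) pairs: hash the small side, stream the large side."""
    small_map = {}
    for key, value in small_rows:          # "broadcast" side: must fit in RAM
        small_map.setdefault(key, []).append(value)
    for key, big_value in large_rows:      # streamed side: can be any size
        for small_value in small_map.get(key, ()):
            yield key, big_value, small_value

large = [(1, "a"), (2, "b"), (1, "c"), (3, "d")]
small = [(1, "x"), (2, "y")]
print(list(broadcast_join(large, small)))
# → [(1, 'a', 'x'), (2, 'b', 'y'), (1, 'c', 'x')]
```

This also shows why it is a poor fit for heri's case: with 250 million rows on both sides, neither side fits in `small_map`, so broadcasting is not an option and executor memory has to accommodate the shuffle instead.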