Re: spark pi example fail on yarn
I modified yarn.nodemanager.vmem-check-enabled to false in yarn-site.xml, and now it works for both yarn-client mode and spark-shell.

On Fri, Oct 21, 2016 at 10:59 AM, Li Li <fancye...@gmail.com> wrote:
> I found a warning in the NodeManager log. Has the virtual memory limit
> been exceeded? How should I configure yarn to solve this problem?
>
> 2016-10-21 10:41:12,588 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 20299 for container-id container_1477017445921_0001_02_01: 335.1 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used
> 2016-10-21 10:41:12,589 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Process tree for container: container_1477017445921_0001_02_01 has processes older than 1 iteration running over the configured limit. Limit=2254857728, current usage = 2338873344
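For anyone hitting the same warning, this is the yarn-site.xml entry in question (a minimal sketch; the property name is a standard YARN setting, and disabling the check should be weighed against simply giving the container more memory):

<property>
  <!-- Stop the NodeManager from killing containers that exceed the
       virtual memory limit (physical memory x vmem-pmem-ratio). -->
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>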
Re: spark pi example fail on yarn
I found a warning in the NodeManager log. Has the virtual memory limit been exceeded? How should I configure yarn to solve this problem?

2016-10-21 10:41:12,588 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 20299 for container-id container_1477017445921_0001_02_01: 335.1 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used
2016-10-21 10:41:12,589 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Process tree for container: container_1477017445921_0001_02_01 has processes older than 1 iteration running over the configured limit. Limit=2254857728, current usage = 2338873344

On Fri, Oct 21, 2016 at 8:49 AM, Saisai Shao <sai.sai.s...@gmail.com> wrote:
> It is not that Spark has difficulty communicating with YARN; it simply means
> the AM exited with the FINISHED state.
>
> I'm guessing it might be related to the container's memory constraints. Please
> check the YARN RM and NM logs to find out more details.
>
> Thanks
> Saisai
>
> On Fri, Oct 21, 2016 at 8:14 AM, Xi Shen <davidshe...@gmail.com> wrote:
>>
>> 16/10/20 18:12:14 ERROR cluster.YarnClientSchedulerBackend: Yarn
>> application has already exited with state FINISHED!
>>
>> From this, I think Spark is having difficulty communicating with YARN. You
>> should check your Spark log.
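The 2.1 GB limit in that warning is the container's 1 GB of physical memory multiplied by yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1; that is where the "2.2 GB of 2.1 GB virtual memory used" comes from. Besides disabling the check, a gentler fix is raising the ratio. A sketch, with 3.0 as a purely illustrative value:

<property>
  <!-- Allow 3 bytes of virtual memory per byte of physical memory, so a
       1 GB container may use up to 3 GB of virtual memory before the
       NodeManager kills it. -->
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>3.0</value>
</property>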
Re: spark pi example fail on yarn
Which log file should I check?

On Thu, Oct 20, 2016 at 10:02 PM, Saisai Shao <sai.sai.s...@gmail.com> wrote:
> Looks like the ApplicationMaster was killed by SIGTERM.
>
> 16/10/20 18:12:04 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
> 16/10/20 18:12:04 INFO yarn.ApplicationMaster: Final app status:
>
> This container may have been killed by the YARN NodeManager or another
> process; you'd better check the YARN logs to dig out more details.
>
> Thanks
> Saisai
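For a finished application, the aggregated container logs (including the AM's stderr) can be pulled with the standard YARN CLI, using the application id from the output above:

yarn logs -applicationId application_1476957324184_0002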
Re: spark pi example fail on yarn
Which log file should I check?

On Thu, Oct 20, 2016 at 11:32 PM, Amit Tank <amittankopensou...@gmail.com> wrote:
> I recently started learning Spark, so I may be completely wrong here, but I
> was facing a similar problem with SparkPi on YARN. After changing yarn to
> cluster mode, it worked perfectly fine.
>
> Thank you,
> Amit
Re: spark pi example fail on yarn
Yes, with yarn-cluster mode it runs correctly. So what's wrong with yarn-client mode? The spark-shell doesn't work either, because it runs in client mode. Is there any solution for this?

On Thu, Oct 20, 2016 at 11:32 PM, Amit Tank <amittankopensou...@gmail.com> wrote:
> I recently started learning Spark, so I may be completely wrong here, but I
> was facing a similar problem with SparkPi on YARN. After changing yarn to
> cluster mode, it worked perfectly fine.
>
> Thank you,
> Amit
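For comparison, the cluster-mode submission that works looks like this (yarn-cluster is the old Spark 1.x master string and is still accepted, but on Spark 2.x the preferred spelling is --master yarn with an explicit deploy mode):

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn --deploy-mode cluster \
  examples/jars/spark-examples_2.11-2.0.1.jar 1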
spark pi example fail on yarn
I am setting up a small YARN/Spark cluster. The Hadoop/YARN version is 2.7.3, and I can run the wordcount map-reduce example correctly on YARN. I am using spark-2.0.1-bin-hadoop2.7, submitted with this command:

~/spark-2.0.1-bin-hadoop2.7$ ./bin/spark-submit \
    --class org.apache.spark.examples.SparkPi --master yarn-client \
    examples/jars/spark-examples_2.11-2.0.1.jar 1

It fails, and the first error is:

16/10/20 18:12:03 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.161.219.189, 39161)
16/10/20 18:12:03 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@76ad6715{/metrics/json,null,AVAILABLE}
16/10/20 18:12:12 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/10/20 18:12:12 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ai-hz1-spark1, PROXY_URI_BASES -> http://ai-hz1-spark1:8088/proxy/application_1476957324184_0002), /proxy/application_1476957324184_0002
16/10/20 18:12:12 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/10/20 18:12:12 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 3(ms)
16/10/20 18:12:12 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@489091bd{/SQL,null,AVAILABLE}
16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1de9b505{/SQL/json,null,AVAILABLE}
16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@378f002a{/SQL/execution,null,AVAILABLE}
16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2cc75074{/SQL/execution/json,null,AVAILABLE}
16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2d64160c{/static/sql,null,AVAILABLE}
16/10/20 18:12:12 INFO internal.SharedState: Warehouse path is '/home/hadoop/spark-2.0.1-bin-hadoop2.7/spark-warehouse'.
16/10/20 18:12:13 INFO spark.SparkContext: Starting job: reduce at SparkPi.scala:38
16/10/20 18:12:13 INFO scheduler.DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 1 output partitions
16/10/20 18:12:13 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
16/10/20 18:12:13 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/10/20 18:12:13 INFO scheduler.DAGScheduler: Missing parents: List()
16/10/20 18:12:13 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
16/10/20 18:12:13 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 366.3 MB)
16/10/20 18:12:13 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1169.0 B, free 366.3 MB)
16/10/20 18:12:13 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.161.219.189:39161 (size: 1169.0 B, free: 366.3 MB)
16/10/20 18:12:13 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/10/20 18:12:13 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34)
16/10/20 18:12:13 INFO cluster.YarnScheduler: Adding task set 0.0 with 1 tasks
16/10/20 18:12:14 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/10/20 18:12:14 INFO server.ServerConnector: Stopped ServerConnector@389adf1d{HTTP/1.1}{0.0.0.0:4040}
16/10/20 18:12:14 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@841e575{/stages/stage/kill,null,UNAVAILABLE}
16/10/20 18:12:14 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@66629f63{/api,null,UNAVAILABLE}
16/10/20 18:12:14 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2b62442c{/,null,UNAVAILABLE}

I also used "yarn logs" to fetch the application logs from YARN (the full log is very lengthy; see the attachment):

16/10/20 18:12:03 INFO yarn.ExecutorRunnable: === YARN executor launch context:
  env:
    CLASSPATH -> {{PWD}}{{PWD}}/__spark_conf__{{PWD}}/__spark_libs__/*$HADOOP_CONF_DIR$HADOOP_COMMON_HOME/share/hadoop/common/*$HADOOP_COMMON_HOME/share/hadoop/common/lib/*$HADOOP_HDFS_HOME/share/hadoop/hdfs/*$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*$HADOOP_YARN_HOME/share/hadoop/yarn/*$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
    SPARK_LOG_URL_STDERR -> http://ai-hz1-spark3:8042/node/containerlogs/container_1476957324184_0002_01_03/hadoop/stderr?start=-4096
    SPARK_YARN_STAGING_DIR ->
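When the AM container dies like this, the NodeManager log on the node that hosted it is where ContainersMonitorImpl records memory usage and kill decisions. A sketch, assuming a default tarball layout (the exact log file name varies by install and user):

# 1476957324184_0002 is the cluster-timestamp/app-id pair embedded in the
# container ids above; this surfaces the monitor messages for all of the
# application's containers on this node.
grep "1476957324184_0002" $HADOOP_HOME/logs/yarn-*-nodemanager-*.log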
Re: running lda in spark throws exception
Could anyone help?

On Wed, Dec 23, 2015 at 1:40 PM, Li Li <fancye...@gmail.com> wrote:
> I ran my LDA example on a YARN 2.6.2 cluster with Spark 1.5.2.
> It throws an exception at the line: Matrix topics = ldaModel.topicsMatrix();
> But in the YARN job history UI the job shows as successful. What's wrong with it?
running lda in spark throws exception
I ran my LDA example on a YARN 2.6.2 cluster with Spark 1.5.2. It throws an exception at the line "Matrix topics = ldaModel.topicsMatrix();", but in the YARN job history UI the job shows as successful. What's wrong with it?

I submit the job with:

./bin/spark-submit --class Myclass \
  --master yarn-client \
  --num-executors 2 \
  --driver-memory 4g \
  --executor-memory 4g \
  --executor-cores 1 \

My code:

corpus.cache();

// Cluster the documents into topicNumber topics using LDA
DistributedLDAModel ldaModel = (DistributedLDAModel) new LDA()
    .setOptimizer("em")
    .setMaxIterations(iterNumber)
    .setK(topicNumber)
    .run(corpus);

// Output topics. Each is a distribution over words (matching word count vectors)
System.out.println("Learned topics (as distributions over vocab of "
    + ldaModel.vocabSize() + " words):");

Matrix topics = ldaModel.topicsMatrix(); // line 81: the exception is thrown here
for (int topic = 0; topic < topicNumber; topic++) {
    System.out.print("Topic " + topic + ":");
    for (int word = 0; word < ldaModel.vocabSize(); word++) {
        System.out.print(" " + topics.apply(word, topic));
    }
    System.out.println();
}

ldaModel.save(sc.sc(), modelPath);

Exception in thread "main" java.lang.IndexOutOfBoundsException: (1025,0) not in [-58,58) x [-100,100)
        at breeze.linalg.DenseMatrix$mcD$sp.update$mcD$sp(DenseMatrix.scala:112)
        at org.apache.spark.mllib.clustering.DistributedLDAModel$$anonfun$topicsMatrix$1.apply(LDAModel.scala:534)
        at org.apache.spark.mllib.clustering.DistributedLDAModel$$anonfun$topicsMatrix$1.apply(LDAModel.scala:531)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at org.apache.spark.mllib.clustering.DistributedLDAModel.topicsMatrix$lzycompute(LDAModel.scala:531)
        at org.apache.spark.mllib.clustering.DistributedLDAModel.topicsMatrix(LDAModel.scala:523)
        at com.mobvoi.knowledgegraph.textmining.lda.ReviewLDA.main(ReviewLDA.java:81)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/12/23 00:01:16 INFO spark.SparkContext: Invoking stop() from shutdown hook
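The bounds in the message suggest topicsMatrix() is allocating a vocabSize x k matrix (58 x 100 here) and then hitting a term index (1025) outside vocabSize. Since MLlib infers vocabSize from the corpus vectors, one thing worth checking is that every document vector in the corpus has the same length. A minimal, hypothetical sanity check (it assumes corpus is a JavaPairRDD<Long, Vector> as in the MLlib Java LDA example, and uses the pre-Java-8 anonymous-class style since the stack trace points at an older JDK):

// Assumed (hypothetical) declarations, matching the MLlib Java LDA example:
//   JavaPairRDD<Long, Vector> corpus;   // org.apache.spark.mllib.linalg.Vector
java.util.List<Integer> sizes = corpus.values()
        .map(new org.apache.spark.api.java.function.Function<Vector, Integer>() {
            @Override
            public Integer call(Vector v) {
                return v.size();   // length of each document's count vector
            }
        })
        .distinct()
        .collect();
System.out.println("Distinct vector sizes in corpus: " + sizes);

If this prints more than one size, the corpus was vectorized inconsistently, and term indices from the longer documents would fall outside the vocabSize the model inferred, which would produce exactly this kind of out-of-bounds error; re-vectorizing with one fixed vocabulary should then fix it.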