The NM logs only seem to contain entries similar to the following; nothing else appears in the same time range. Any help? (A couple of debugging sketches, following up on the suggestions in the thread, are appended below after the quoted messages.)
2015-02-12 20:47:31,245 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1422406067005_0053_01_000002
2015-02-12 20:47:31,246 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1422406067005_0053_01_000012
2015-02-12 20:47:31,246 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1422406067005_0053_01_000022
2015-02-12 20:47:31,246 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1422406067005_0053_01_000032
2015-02-12 20:47:31,246 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1422406067005_0053_01_000042
2015-02-12 21:24:30,515 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: FINISH_APPLICATION sent to absent application application_1422406067005_0053

On Thu, Feb 12, 2015 at 10:38 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> It seems unlikely to me that it would be a 2.2 issue, though not entirely
> impossible. Are you able to find any of the container logs? Is the
> NodeManager launching containers and reporting some exit code?
>
> -Sandy
>
> On Thu, Feb 12, 2015 at 1:21 PM, Anders Arpteg <arp...@spotify.com> wrote:
>
>> No, not submitting from Windows, from a Debian distribution. Had a quick
>> look at the RM logs, and it seems some containers are allocated but then
>> released again for some reason. Not easy to make sense of the logs, but
>> here is a snippet from the logs (from a test in our small test cluster)
>> if you'd like to have a closer look: http://pastebin.com/8WU9ivqC
>>
>> Sandy, sounds like it could possibly be a 2.2 issue then, or what do you
>> think?
>>
>> Thanks,
>> Anders
>>
>> On Thu, Feb 12, 2015 at 3:11 PM, Aniket Bhatnagar <
>> aniket.bhatna...@gmail.com> wrote:
>>
>>> This is tricky to debug. Check the YARN NodeManager and ResourceManager
>>> logs to see if you can trace the error. In the past I have had to look
>>> closely at the arguments being passed to the YARN containers (they get
>>> logged before the containers are launched). If that still didn't give a
>>> clue, I had to check the script YARN generates to execute the container,
>>> and even run it manually to trace the line where the error occurs.
>>>
>>> BTW, are you submitting the job from Windows?
>>>
>>> On Thu, Feb 12, 2015, 3:34 PM Anders Arpteg <arp...@spotify.com> wrote:
>>>
>>>> Interesting to hear that it works for you. Are you using YARN 2.2 as
>>>> well? No strange log messages during startup, and I can't see any
>>>> other log messages since no executor gets launched. It does not seem
>>>> to work in yarn-client mode either, failing with the exception below.
>>>>
>>>> Exception in thread "main" org.apache.spark.SparkException: Yarn
>>>> application has already ended! It might have been killed or unable to
>>>> launch application master.
>>>>   at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:119)
>>>>   at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
>>>>   at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
>>>>   at org.apache.spark.SparkContext.<init>(SparkContext.scala:370)
>>>>   at com.spotify.analytics.AnalyticsSparkContext.<init>(AnalyticsSparkContext.scala:8)
>>>>   at com.spotify.analytics.DataSampler$.main(DataSampler.scala:42)
>>>>   at com.spotify.analytics.DataSampler.main(DataSampler.scala)
>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
>>>>   at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:551)
>>>>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:155)
>>>>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:178)
>>>>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:99)
>>>>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>>
>>>> /Anders
>>>>
>>>> On Thu, Feb 12, 2015 at 1:33 AM, Sandy Ryza <sandy.r...@cloudera.com>
>>>> wrote:
>>>>
>>>>> Hi Anders,
>>>>>
>>>>> I just tried this out and was able to successfully acquire executors.
>>>>> Any strange log messages or additional color you can provide on your
>>>>> setup? Does yarn-client mode work?
>>>>>
>>>>> -Sandy
>>>>>
>>>>> On Wed, Feb 11, 2015 at 1:28 PM, Anders Arpteg <arp...@spotify.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I compiled the latest Spark master yesterday (2015-02-10) for Hadoop
>>>>>> 2.2, and jobs fail to execute in yarn-cluster mode with that build.
>>>>>> It works successfully with Spark 1.2 (and also with master from
>>>>>> 2015-01-16), so something has changed since then that prevents the
>>>>>> job from receiving any executors on the cluster.
>>>>>>
>>>>>> The basic symptoms are that the job fires up the AM, but on the
>>>>>> "executors" page in the web UI only the driver is listed; no
>>>>>> executors are ever received, and the driver keeps waiting forever.
>>>>>> Has anyone seen similar problems?
>>>>>>
>>>>>> Thanks for any insights,
>>>>>> Anders
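Following up on Sandy's question about the container logs and Aniket's suggestion to inspect the script YARN generates: here is a rough sketch of what to run. The paths are examples only (check yarn.nodemanager.log-dirs and yarn.nodemanager.local-dirs in yarn-site.xml), and "yarn logs" only returns output after the application has finished and if log aggregation (yarn.log-aggregation-enable=true) is on.

    # Pull the aggregated stdout/stderr of all containers for the failed app
    yarn logs -applicationId application_1422406067005_0053

    # On the NodeManager host itself: look for the per-container log dirs
    # and the launch script YARN generated for the container. Both
    # directories below are examples; adjust them to your yarn-site.xml.
    ls /var/log/hadoop-yarn/containers/application_1422406067005_0053/
    find /var/lib/hadoop-yarn/cache -name launch_container.sh

If nothing at all shows up for the application, that would be consistent with the "sent to absent container" warnings above, i.e. the containers were allocated by the ResourceManager but never actually launched on the node.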
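A minimal smoke test can also help separate the 2015-02-10 build from the application itself. The sketch below assumes a packaged distribution (built with make-distribution.sh); for a plain sbt/maven build the examples jar lives under examples/target/ instead of lib/.

    # Submit the bundled SparkPi example in yarn-cluster mode against the
    # suspect build; if this also hangs with only the driver listed on the
    # executors page, the regression is in the build, not in DataSampler.
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master yarn-cluster \
      --num-executors 2 \
      lib/spark-examples-*.jar 10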