Hi Krzysztof,

I had connectivity errors, but in my case the cause was a misconfigured /etc/hosts.
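If it helps, here is a minimal sketch of the kind of check that can expose such a problem. Run it on the YARN node itself. It only resolves a hostname the same way the OS does (so /etc/hosts is taken into account) and then tries a plain TCP connect. The function name check_endpoint is made up for illustration and is not part of Samza or Hadoop.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Resolve host, then try a TCP connect.

    Returns (resolved_ip, reachable). resolved_ip is None when the
    name does not resolve at all.
    """
    try:
        # Uses the system resolver, so entries in /etc/hosts apply.
        ip = socket.gethostbyname(host)
    except OSError:
        return None, False
    try:
        # Open and immediately close a TCP connection to the resolved address.
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        return ip, False
```

For example, check_endpoint("hdnn02.company.com", 8030) and check_endpoint("hdnn02.company.com", 8050) use the ResourceManager host and ports from your log. If the returned IP is not the one you expect, or the connect fails, /etc/hosts or DNS on that node would be the first thing I would look at. A successful connect only shows that the port is open, not that the RM protocol works.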
Cheers.

2015-07-10 12:11 GMT-03:00 Roger Hoover <roger.hoo...@gmail.com>:

> Hi Krzysztof,
>
> I haven't seen that error before. It does sound like it could be a
> connection issue. Did you check that the YARN node has access
> to hdfs:///user/samza/deploy/event-log-etl-nested-0.1.0-dist.tar.gz?
>
> One way to set the AM and containers to debug is to include a log4j.xml
> file in your tar.gz in the lib folder. There is special logic in the start
> scripts (
> https://github.com/apache/samza/blob/master/samza-shell/src/main/bash/run-class.sh#L40
> )
> that checks for that path and doesn't work with log4j.properties, for
> example.
>
> Cheers,
>
> Roger
>
> On Fri, Jul 10, 2015 at 4:18 AM, Krzysztof Zarzycki <k.zarzy...@gmail.com>
> wrote:
>
> > Hi there Samza developers,
> >
> > I have a problem that I cannot overcome with deploying a Samza task on
> > YARN. When I submitted the task, ApplicationMasters get created (2 of
> > them), the job is visible, but in state UNASSIGNED. After some time the
> > job FAILED.
> >
> > Application information on the resource manager panel is:
> > State: FAILED
> > FinalStatus: FAILED
> > Elapsed: 25mins, 2sec
> > Diagnostics: Application application_1424354741837_0380 failed 2 times
> > due to ApplicationMaster for attempt appattempt_1424354741837_0380_000002
> > timed out. Failing the application.
> >
> > When I look into the logs of the ApplicationMaster I see no errors, no
> > warnings, nothing wrong. Please see the output of the "yarn logs" command
> > attached.
> >
> > My guess would be that a connection failed between some components
> > (container to ApplicationMaster? NodeManager?). I suspect that when
> > looking at the jstack output in the AM:
> >
> > "main" #1 prio=5 os_prio=0 tid=0x00007f9338015000 nid=0x6f2f waiting on condition [0x00007f933de6e000]
> >    java.lang.Thread.State: TIMED_WAITING (sleeping)
> >     at java.lang.Thread.sleep(Native Method)
> >     at org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:154)
> >     at com.sun.proxy.$Proxy18.registerApplicationMaster(Unknown Source)
> >     at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:196)
> >     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:138)
> >     at org.apache.samza.job.yarn.SamzaAppMasterLifecycle.onInit(SamzaAppMasterLifecycle.scala:39)
> >     at org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:108)
> >     at org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:108)
> >     at scala.collection.immutable.List.foreach(List.scala:318)
> >     at org.apache.samza.job.yarn.SamzaAppMaster$.run(SamzaAppMaster.scala:108)
> >     at org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:95)
> >     at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
> >
> > On the other hand, I see the correct RM addresses in the logs:
> > 15/07/10 12:17:30 INFO client.RMProxy: Connecting to ResourceManager at hdnn02.company.com/148.251.82.11:8030
> > 15/07/10 12:17:31 INFO client.RMProxy: Connecting to ResourceManager at hdnn02.company.com/148.251.82.11:8050
> > ...
> > 2015-07-10 12:17:31,032 [main] INFO o.apache.samza.job.yarn.ClientHelper - trying to connect to RM hdnn02.company.com:8050
> > ...
> > 2015-07-10 12:17:31,680 [main] INFO o.a.s.job.yarn.SamzaAppMasterService - Webapp is started at (rpc http://78.46.56.88:43268/, tracking http://
> >
> > Does anyone know what could be wrong here? I'll be grateful for any
> > help, even with just debugging the case.
> > I'll start with a simple question: do you know how to set log4j for the
> > AM & containers to DEBUG?
> >
> > Thank you!
> > Krzysztof