Hey Nirmal,

It sounds like the yarn-site.xml is being ignored for some reason. Things to do:

1. Could you please send full log files for the RM and NM?

2. You might also try putting your yarn-site.xml in 
hello-samza/deploy/yarn/conf, and explicitly setting the HADOOP_YARN_HOME 
environment variable to:

export HADOOP_YARN_HOME=<path to>/hello-samza/deploy/yarn

Then try running bin/grid start yarn.

3. Try starting YARN WITHOUT bin/grid. This can be done with:

deploy/yarn/bin/yarn resourcemanager
deploy/yarn/bin/yarn nodemanager
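
To see which yarn-site.xml files exist and what each one sets, a small check along these lines might help. This is only a sketch: read_property and report are hypothetical helpers (not part of Samza or YARN), and the two candidate directories are simply the ones mentioned in this thread (conf and etc/hadoop). It does not tell you which file YARN actually loads.

```python
import os
import xml.etree.ElementTree as ET


def read_property(site_xml, name):
    """Return the <value> of property `name` in a Hadoop-style XML file, or None."""
    root = ET.parse(site_xml).getroot()
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def report(yarn_home, name="yarn.resourcemanager.scheduler.address"):
    """Map each candidate yarn-site.xml under yarn_home to the value it sets.

    The candidate directories are assumptions taken from this thread.
    """
    results = {}
    for subdir in ("conf", os.path.join("etc", "hadoop")):
        path = os.path.join(yarn_home, subdir, "yarn-site.xml")
        if os.path.isfile(path):
            results[path] = read_property(path, name)
        else:
            results[path] = "<file missing>"
    return results
```

Calling report(os.environ["HADOOP_YARN_HOME"]) on both the RM and NM machines and comparing the results would show whether the override is present in the files at all.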

Cheers,
Chris
________________________________
From: Nirmal Kumar [[email protected]]
Sent: Friday, November 08, 2013 5:01 AM
To: [email protected]
Subject: RE: Running Samza on multi node

Hi Chris,

The exception below goes away when the job file's timestamp is the same:


Application application_1383907331443_0002 failed 1 times due to AM Container 
for appattempt_1383907331443_0002_000001 exited with exitCode: -1000 due to: 
RemoteTrace:

java.io.IOException: Resource file:/home/temptest/samza-job-package-0.7.0-dist.tar.gz changed on src filesystem (expected 1383904942000, was 1382550495000
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:178)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
    at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
    at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:280)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
at LocalTrace:
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Resource file:/home/temptest/samza-job-package-0.7.0-dist.tar.gz changed on src filesystem (expected 1383904942000, was 1382550495000
    at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
    at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:819)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:491)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:218)
    at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
    at org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
.Failing this attempt.. Failing the application.
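
The two numbers in the message appear to be file modification times in milliseconds since the epoch: the one recorded when the job was submitted ("expected") and the one found on the machine that localizes the package ("was"). Under that assumption, a minimal sketch for comparing a local copy against the expected value could look like this. The helper names are hypothetical and are not part of YARN or Samza.

```python
import os


def mtime_millis(path):
    """Modification time of `path` in whole milliseconds since the epoch."""
    return os.stat(path).st_mtime_ns // 1_000_000


def matches_expected(path, expected_millis):
    """True if the file's modification time equals the expected timestamp."""
    return mtime_millis(path) == expected_millis
```

For example, matches_expected("/home/temptest/samza-job-package-0.7.0-dist.tar.gz", 1383904942000) could be run on the NM machine. Copying the package with a tool that preserves modification times (for instance cp -p or scp -p) keeps the copies aligned.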

Please find attached the exception I am getting. I am still struggling with the 
same exception, i.e. the NM trying to connect to 0.0.0.0:8030.
I don't know where the NM is picking up this 0.0.0.0:8030 value from. 
Overriding yarn.resourcemanager.scheduler.address in yarn-site.xml is not 
working.

I am using the same hello-samza/deploy/yarn/etc/hadoop/yarn-site.xml on both RM 
and NM machines:

<configuration>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
  </property>

  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>10</value>
  </property>

  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>192.168.145.37</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.145.37:8031</value>
  </property>
</configuration>
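
Note that this file sets yarn.resourcemanager.resource-tracker.address but not yarn.resourcemanager.scheduler.address. As a rough sketch, the snippet below lists which of the five RM address properties mentioned in this thread are absent from a yarn-site.xml. The function name is hypothetical, and what an unset property falls back to depends on the YARN version, which this check does not determine.

```python
import xml.etree.ElementTree as ET

# The five ResourceManager address properties that appear in this thread.
RM_ADDRESS_PROPERTIES = [
    "yarn.resourcemanager.scheduler.address",
    "yarn.resourcemanager.resource-tracker.address",
    "yarn.resourcemanager.address",
    "yarn.resourcemanager.admin.address",
    "yarn.resourcemanager.webapp.address",
]

# The yarn-site.xml quoted above.
SITE_XML = """
<configuration>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>10</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>192.168.145.37</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.145.37:8031</value>
  </property>
</configuration>
"""


def unset_rm_addresses(site_xml_text):
    """Return the RM address properties that the given XML does not define."""
    root = ET.fromstring(site_xml_text)
    defined = {(p.findtext("name") or "").strip() for p in root.findall("property")}
    return [name for name in RM_ADDRESS_PROPERTIES if name not in defined]


for name in unset_rm_addresses(SITE_XML):
    print("not set:", name)
```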


Regards,
-Nirmal

From: Nirmal Kumar
Sent: Friday, November 08, 2013 6:11 PM
To: [email protected]
Subject: RE: Running Samza on multi node


Hi Chris,



Using just the yarn.resourcemanager.hostname property gives me the following 
exception on the NM:



Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: 
Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8031 failed on 
connection exception: java.net.ConnectException: Connection refused; For more 
details see:  http://wiki.apache.org/hadoop/ConnectionRefused



I then added the following property as well:

<property>

   <name>yarn.resourcemanager.resource-tracker.address</name>

   <value>192.168.145.37:8031</value>

</property>



After this my RM and NM were up and the NM got registered as well:

13/11/08 16:12:12 INFO service.AbstractService: Service:ResourceManager is 
started.

13/11/08 16:12:19 INFO util.RackResolver: Resolved IMPETUS-DSRV14.impetus.co.in 
to /default-rack

13/11/08 16:12:19 INFO resourcemanager.ResourceTrackerService: NodeManager from 
node IMPETUS-DSRV14.impetus.co.in(cmPort: 32948 httpPort: 8042) registered with 
capability: <memory:8192, vCores:16>, assigned nodeId 
IMPETUS-DSRV14.impetus.co.in:32948

13/11/08 16:12:19 INFO rmnode.RMNodeImpl: IMPETUS-DSRV14.impetus.co.in:32948 
Node Transitioned from NEW to RUNNING

13/11/08 16:12:19 INFO capacity.CapacityScheduler: Added node 
IMPETUS-DSRV14.impetus.co.in:32948 clusterResource: <memory:8192, vCores:16>



When submitting the job I'm still getting the same exception:


YarnAppMaster [WARN] Listener org.apache.samza.job.yarn.SamzaAppMasterLifecycle@500c954e failed to shutdown.
java.lang.reflect.UndeclaredThrowableException
         at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
         at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.finishApplicationMaster(AMRMProtocolPBClientImpl.java:90)
         at org.apache.hadoop.yarn.client.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:244)
         at org.apache.samza.job.yarn.SamzaAppMasterLifecycle.onShutdown(SamzaAppMasterLifecycle.scala:68)
         at org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$9.apply(YarnAppMaster.scala:70)
         at org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$9.apply(YarnAppMaster.scala:69)
         at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
         at scala.collection.immutable.List.foreach(List.scala:45)
         at org.apache.samza.job.yarn.YarnAppMaster.run(YarnAppMaster.scala:69)
         at org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:78)
         at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
         at $Proxy12.finishApplicationMaster(Unknown Source)
         at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.finishApplicationMaster(AMRMProtocolPBClientImpl.java:87)
         ... 9 more
Caused by: java.net.ConnectException: Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
         at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:780)
         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:727)
         at org.apache.hadoop.ipc.Client.call(Client.java:1239)
         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
         ... 11 more
Caused by: java.net.ConnectException: Connection refused
         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:526)
         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:490)
         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:508)
         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:603)
         at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:253)
         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1288)
         at org.apache.hadoop.ipc.Client.call(Client.java:1206)
         ... 12 more



Where do I need to keep the job file?

I am setting the job file name in a test-consumer.properties file:

yarn.package.path=file:/home/temptest/samza-job-package-0.7.0-dist.tar.gz



And I am submitting the job with:

bin/run-job.sh 
--config-factory=org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path=file:/home/bda/nirmal/hello-samza/deploy/samza/config/test-consumer.properties



But in that case, do I need to keep the job file at the same location on both 
the NM and RM machines?



I tried submitting the job several times with different properties in 
yarn-site.xml and now I am getting a strange exception. This is probably due 
to the different timestamps.



Application application_1383907331443_0002 failed 1 times due to AM Container 
for appattempt_1383907331443_0002_000001 exited with exitCode: -1000 due to: 
RemoteTrace:

java.io.IOException: Resource file:/home/temptest/samza-job-package-0.7.0-dist.tar.gz changed on src filesystem (expected 1383904942000, was 1382550495000
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:178)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
    at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
    at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:280)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
at LocalTrace:
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Resource file:/home/temptest/samza-job-package-0.7.0-dist.tar.gz changed on src filesystem (expected 1383904942000, was 1382550495000
    at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
    at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:819)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:491)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:218)
    at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
    at org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)

.Failing this attempt.. Failing the application.



When I run the NM and RM on a single node, everything works fine. Please find 
attached the logs for the job.



Other questions:



1. Both your NM and RM are running YARN 2.2.0, right?

I am using the same YARN on both the NM and RM; it was downloaded as part of 
the hello-samza application.

2. It appears that your AM shuts down. Did you run kill-job.sh to kill it?

I am forcibly killing the java processes using kill -9 pid command.



Thanks,

-Nirmal



-----Original Message-----
From: Chris Riccomini [mailto:[email protected]]
Sent: Thursday, November 07, 2013 9:19 PM
To: [email protected]
Subject: Re: Running Samza on multi node



Hey Nirmal,



Thanks for this detailed report! It makes things much easier to figure out. The 
problem appears to be that the Samza AM is trying to connect to 0.0.0.0:8030 
when trying to talk to the RM. This is an RM port, and the RM is running on 
192.168.145.37 (the RM host), not 192.168.145.43 (the NM host). The connection 
is refused, since 8030 isn't open on localhost for the Samza AM, which is 
running on the NM's box.
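
One way to confirm this from the NM's box is to probe the scheduler port directly. The following is a minimal sketch using only the Python standard library; can_connect is a hypothetical helper, and the hosts and ports are the ones from this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Running can_connect("192.168.145.37", 8030) on the NM host should return True if the RM scheduler port is reachable, while can_connect("127.0.0.1", 8030) on the same host would be expected to return False, since nothing listens on 8030 there.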



It is somewhat interesting that the NM does connect to the RM for the capacity 
scheduler. Rather than setting each individual host/port pair, as you've done, 
I recommend just setting:



  <property>

    <name>yarn.resourcemanager.hostname</name>

    <value>192.168.145.37</value>

  </property>



Your netstat reports look fine – as expected.



Other questions:



1. Both your NM and RM are running YARN 2.2.0, right?

2. It appears that your AM shuts down. Did you run kill-job.sh to kill it?



Regarding (2), it appears that the AM never tries to register, which it 
normally does. I'm wondering if another failure is being triggered, which is 
then causing the AM to try and shut itself down. Could you turn on debugging 
for your Samza job (in log4j.xml), and re-run? I'm curious whether the web 
service that's starting up or the registration itself is failing. In a normal 
execution, you would expect to see:





    info("Got AM register response. The YARN RM supports container requests with max-mem: %s, max-cpu: %s" format (maxMem, maxCpu))



I don't see this in your logs, which means the AM is failing (and triggering a 
shutdown) before it even tries to register.



Cheers,

Chris



From: Nirmal Kumar <[email protected]>

Reply-To: "[email protected]" <[email protected]>

Date: Thursday, November 7, 2013 5:05 AM

To: "[email protected]" <[email protected]>

Subject: Running Samza on multi node



All,



I was able to run the hello-samza application on a single-node machine.

Now I am trying to run the hello-samza application on a 2-node setup.



Node1 has a Resource Manager

Node2 has a Node Manager



The NM gets registered with the RM successfully as seen in rm.log of the RM 
node:

13/11/07 11:44:29 INFO service.AbstractService: Service:ResourceManager is 
started.

13/11/07 11:48:30 INFO util.RackResolver: Resolved IMPETUS-DSRV14.impetus.co.in 
to /default-rack

13/11/07 11:48:30 INFO resourcemanager.ResourceTrackerService: NodeManager from 
node IMPETUS-DSRV14.impetus.co.in(cmPort: 56093 httpPort: 8042) registered with 
capability: <memory:8192, vCores:16>, assigned nodeId 
IMPETUS-DSRV14.impetus.co.in:56093

13/11/07 11:48:30 INFO rmnode.RMNodeImpl: IMPETUS-DSRV14.impetus.co.in:56093 
Node Transitioned from NEW to RUNNING

13/11/07 11:48:30 INFO capacity.CapacityScheduler: Added node 
IMPETUS-DSRV14.impetus.co.in:56093 clusterResource: <memory:8192, vCores:16>



I am submitting the job from the RM machine using the command line:

bin/run-job.sh 
--config-factory=org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path=file:/home/bda/nirmal/hello-samza/deploy/samza/config/test-consumer.properties



However, I am getting the following exception after submitting the job to YARN:



2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got container id: 
container_1383816757258_0001_01_000001

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got app attempt id: 
appattempt_1383816757258_0001_000001

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got node manager host: IMPETUS-DSRV14

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got node manager port: 59828

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got node manager http port: 8042

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got config: 
{task.inputs=kafka.storm-sentence, 
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory, 
systems.kafka.samza.consumer.factory=samza.stream.kafka.KafkaConsumerFactory, 
job.name=test-Consumer, 
systems.kafka.consumer.zookeeper.connect=192.168.145.195:2181/, 
systems.kafka.consumer.auto.offset.reset=largest, 
systems.kafka.samza.msg.serde=json, 
serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory, 
systems.kafka.samza.partition.manager=samza.stream.kafka.KafkaPartitionManager, 
task.window.ms=10000, task.class=samza.examples.wikipedia.task.TestConsumer, 
yarn.package.path=file:/home/temptest/samza+storm/hello-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz,
 systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory, 
systems.kafka.producer.metadata.broker.list=192.168.145.195:9092,192.168.145.195:9093}

2013-11-07 15:05:57 ClientHelper [INFO] trying to connect to RM /0.0.0.0:8032

2013-11-07 15:05:57 JmxServer [INFO] According to 
InetAddress.getLocalHost.getHostName we are IMPETUS-DSRV14.impetus.co.in

2013-11-07 15:05:57 JmxServer [INFO] Started JmxServer port=47115 
url=service:jmx:rmi:///jndi/rmi://IMPETUS-DSRV14.impetus.co.in:47115/jmxrmi

2013-11-07 15:05:57 SamzaAppMasterTaskManager [INFO] No yarn.container.count 
specified. Defaulting to one container.

2013-11-07 15:05:57 VerifiableProperties [INFO] Verifying properties

2013-11-07 15:05:57 VerifiableProperties [INFO] Property client.id is 
overridden to samza_admin-test_Consumer-1-1383816957797-0

2013-11-07 15:05:57 VerifiableProperties [INFO] Property metadata.broker.list 
is overridden to 192.168.145.195:9092,192.168.145.195:9093

2013-11-07 15:05:57 VerifiableProperties [INFO] Verifying properties

2013-11-07 15:05:57 VerifiableProperties [INFO] Property auto.offset.reset is 
overridden to largest

2013-11-07 15:05:57 VerifiableProperties [INFO] Property client.id is 
overridden to samza_admin-test_Consumer-1-1383816957797-0

2013-11-07 15:05:57 VerifiableProperties [INFO] Property group.id is overridden 
to undefined-samza-consumer-group-

2013-11-07 15:05:57 VerifiableProperties [INFO] Property zookeeper.connect is 
overridden to 192.168.145.195:2181/

2013-11-07 15:05:57 VerifiableProperties [INFO] Verifying properties

2013-11-07 15:05:57 VerifiableProperties [INFO] Property client.id is 
overridden to samza_admin-test_Consumer-1-1383816957797-0

2013-11-07 15:05:57 VerifiableProperties [INFO] Property metadata.broker.list 
is overridden to 192.168.145.195:9092,192.168.145.195:9093

2013-11-07 15:05:57 VerifiableProperties [INFO] Property request.timeout.ms is 
overridden to 6000

2013-11-07 15:05:57 ClientUtils$ [INFO] Fetching metadata from broker 
id:0,host:192.168.145.195,port:9092 with correlation id 0 for 1 topic(s) 
Set(storm-sentence)

2013-11-07 15:05:57 SyncProducer [INFO] Connected to 192.168.145.195:9092 for 
producing

2013-11-07 15:05:57 SyncProducer [INFO] Disconnecting from 192.168.145.195:9092

2013-11-07 15:05:57 SamzaAppMasterService [INFO] Starting webapp at rpc 39152, 
tracking port 26751

2013-11-07 15:05:57 log [INFO] Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.eclipse.jetty.util.log) via 
org.eclipse.jetty.util.log.Slf4jLog

2013-11-07 15:05:58 ClientHelper [INFO] trying to connect to RM /0.0.0.0:8032

2013-11-07 15:05:58 log [INFO] jetty-7.0.0.v20091005

2013-11-07 15:05:58 log [INFO] Extract 
jar:file:/tmp/hadoop-vuser/nm-local-dir/usercache/bda/appcache/application_1383816757258_0001/filecache/8004956396276725272/samza-job-package-0.7.0-dist.tar.gz/lib/samza-yarn_2.8.1-0.7.0-yarn-2.0.5-alpha.jar!/scalate/WEB-INF/
 to /tmp/Jetty_0_0_0_0_39152_scalate____xveaws/webinf/WEB-INF

2013-11-07 15:05:58 ServletTemplateEngine [INFO] Scalate template engine using 
working directory: /tmp/scalate-5279562760844696556-workdir

2013-11-07 15:05:58 log [INFO] Started 
[email protected]:39152

2013-11-07 15:05:58 log [INFO] jetty-7.0.0.v20091005

2013-11-07 15:05:58 log [INFO] Extract 
jar:file:/tmp/hadoop-vuser/nm-local-dir/usercache/bda/appcache/application_1383816757258_0001/filecache/8004956396276725272/samza-job-package-0.7.0-dist.tar.gz/lib/samza-yarn_2.8.1-0.7.0-yarn-2.0.5-alpha.jar!/scalate/WEB-INF/
 to /tmp/Jetty_0_0_0_0_26751_scalate____.dr19qj/webinf/WEB-INF

2013-11-07 15:05:58 ServletTemplateEngine [INFO] Scalate template engine using 
working directory: /tmp/scalate-5582747144249485577-workdir

2013-11-07 15:05:58 log [INFO] Started 
[email protected]:26751

2013-11-07 15:06:08 SamzaAppMasterLifecycle [INFO] Shutting down.

2013-11-07 15:06:18 YarnAppMaster [WARN] Listener org.apache.samza.job.yarn.SamzaAppMasterLifecycle@500c954e failed to shutdown.
java.lang.reflect.UndeclaredThrowableException
         at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
         at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.finishApplicationMaster(AMRMProtocolPBClientImpl.java:90)
         at org.apache.hadoop.yarn.client.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:244)
         at org.apache.samza.job.yarn.SamzaAppMasterLifecycle.onShutdown(SamzaAppMasterLifecycle.scala:68)
         at org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$9.apply(YarnAppMaster.scala:70)
         at org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$9.apply(YarnAppMaster.scala:69)
         at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
         at scala.collection.immutable.List.foreach(List.scala:45)
         at org.apache.samza.job.yarn.YarnAppMaster.run(YarnAppMaster.scala:69)
         at org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:78)
         at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
         at $Proxy12.finishApplicationMaster(Unknown Source)
         at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.finishApplicationMaster(AMRMProtocolPBClientImpl.java:87)
         ... 9 more
Caused by: java.net.ConnectException: Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
         at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:780)
         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:727)
         at org.apache.hadoop.ipc.Client.call(Client.java:1239)
         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
         ... 11 more
Caused by: java.net.ConnectException: Connection refused
         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:526)
         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:490)
         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:508)
         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:603)
         at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:253)
         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1288)
         at org.apache.hadoop.ipc.Client.call(Client.java:1206)
         ... 12 more





I have changed the following properties in the 
hello-samza/deploy/yarn/etc/hadoop/yarn-site.xml on the Node Manager machine:



<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>192.168.145.37:8030</value>
</property>

<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>192.168.145.37:8031</value>
</property>

<property>
  <name>yarn.resourcemanager.address</name>
  <value>192.168.145.37:8032</value>
</property>

<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>192.168.145.37:8033</value>
</property>

<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>192.168.145.37:8088</value>
</property>







These properties are reflected on the UI screen as well:



[Inline image omitted: screenshot of the YARN UI showing these properties]



But this overriding of the yarn.resourcemanager.scheduler.address to 
192.168.145.37:8030 does not rectify the error.

I still get:

Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: 
Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 failed on 
connection exception: java.net.ConnectException: Connection refused; For more 
details see:  http://wiki.apache.org/hadoop/ConnectionRefused



Netstat on the RM machine shows me:

tcp        0      0 ::ffff:192.168.145.37:8088  :::*                        LISTEN      14595/java
tcp        0      0 ::ffff:192.168.145.37:8030  :::*                        LISTEN      14595/java
tcp        0      0 ::ffff:192.168.145.37:8031  :::*                        LISTEN      14595/java
tcp        0      0 ::ffff:192.168.145.37:8032  :::*                        LISTEN      14595/java
tcp        0      0 ::ffff:192.168.145.37:8033  :::*                        LISTEN      14595/java



Netstat on the NM machine shows me:

tcp        0      0 :::8040                     :::*                        LISTEN      1331/java
tcp        0      0 :::8042                     :::*                        LISTEN      1331/java
tcp        0      0 :::56877                    :::*                        LISTEN      1331/java



Kindly help me rectify this error.



Regards,

-Nirmal



________________________________













NOTE: This message may contain information that is confidential, proprietary, 
privileged or otherwise protected by law. The message is intended solely for 
the named addressee. If received in error, please destroy and notify the 
sender. Any use of this email is prohibited when received in error. Impetus 
does not represent, warrant and/or guarantee, that the integrity of this 
communication has been maintained nor that the communication is free of errors, 
virus, interception or interference.
