Susheel Gupta created YARN-11900:
------------------------------------
Summary: NullPointerException in ZKConfigurationStore during RM startup when HA enabled and configuration store is ZK
Key: YARN-11900
URL: https://issues.apache.org/jira/browse/YARN-11900
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 3.5.0
Reporter: Susheel Gupta
RM startup intermittently fails when YARN RM HA is enabled and the scheduler
configuration store is set to ZK (ZooKeeper).
During RM restarts, one of the RMs occasionally fails to initialize the
CapacityScheduler with the following exception:
{code:java}
2025-11-18 16:50:23,398 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server quasar-tiwwno-3.vpc.cloudera.com/10.65.54.198:2182, session id = 0x30000621e760015, negotiated timeout = 60000
2025-11-18 16:50:23,399 INFO org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED
2025-11-18 16:50:23,487 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.YarnConfigurationStore: Loaded configuration store version info null
2025-11-18 16:50:23,487 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.YarnConfigurationStore: Storing configuration store version info 0.1
2025-11-18 16:50:23,541 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore: Exception while deserializing scheduler configuration from store
java.lang.NullPointerException
    at java.base/java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:108)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore.deserializeObject(ZKConfigurationStore.java:317)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore.retrieve(ZKConfigurationStore.java:214)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.MutableCSConfigurationProvider.init(MutableCSConfigurationProvider.java:83)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:295)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:403)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:875)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1293)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:334)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1580)
2025-11-18 16:50:23,549 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler failed in state INITED
java.lang.NullPointerException
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:842)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.MutableCSConfigurationProvider.loadConfiguration(MutableCSConfigurationProvider.java:102)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:296)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:403)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:875)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1293)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:334)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1580)
{code}
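The first trace fails inside the ByteArrayInputStream constructor, which
throws NullPointerException when it is handed a null byte array. That would be
consistent with the store returning no data for the configuration node at that
moment, although the log alone does not confirm the cause. The following is a
minimal standalone sketch of a null-tolerant deserialization helper, not the
actual ZKConfigurationStore code or a proposed patch. The class name
NullSafeDeserializeSketch and both method names are hypothetical, and plain
Java serialization of a Map is assumed purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; not part of Hadoop.
class NullSafeDeserializeSketch {

  // Serialize any Serializable object to a byte array.
  static byte[] serialize(Object o) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
      oos.writeObject(o);
    }
    return baos.toByteArray();
  }

  // Return Optional.empty() for null or empty input instead of letting
  // new ByteArrayInputStream(null) throw NullPointerException.
  static Optional<Object> deserialize(byte[] bytes)
      throws IOException, ClassNotFoundException {
    if (bytes == null || bytes.length == 0) {
      return Optional.empty();
    }
    try (ObjectInputStream ois =
             new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      return Optional.ofNullable(ois.readObject());
    }
  }

  public static void main(String[] args) throws Exception {
    // Simulates a store read that returned no data.
    System.out.println("null input present: "
        + deserialize(null).isPresent());

    // Simulates a normal round trip of a small configuration map.
    Map<String, String> conf = new HashMap<>();
    conf.put("yarn.scheduler.capacity.root.queues", "default");
    System.out.println("round trip: "
        + deserialize(serialize(conf)).orElse(null));
  }
}
```

With a guard like this, the caller has to decide explicitly what an empty
result means (retry, fall back to the file-based configuration, or fail with
a descriptive error) rather than passing a null on to a later constructor,
which is what the second trace appears to show.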