Susheel Gupta created YARN-11900:
------------------------------------

             Summary: NullPointerException in ZKConfigurationStore during RM startup when HA enabled and configuration store is ZK
                 Key: YARN-11900
                 URL: https://issues.apache.org/jira/browse/YARN-11900
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
    Affects Versions: 3.5.0
            Reporter: Susheel Gupta


RM startup fails intermittently when YARN RM HA is enabled and the scheduler
configuration store is set to ZK (ZooKeeper).
During RM restarts, one of the RMs occasionally fails to initialize the
CapacityScheduler with the following exceptions:

{code:java}
2025-11-18 16:50:23,398 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server quasar-tiwwno-3.vpc.cloudera.com/10.65.54.198:2182, session id = 0x30000621e760015, negotiated timeout = 60000
2025-11-18 16:50:23,399 INFO org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED
2025-11-18 16:50:23,487 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.YarnConfigurationStore: Loaded configuration store version info null
2025-11-18 16:50:23,487 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.YarnConfigurationStore: Storing configuration store version info 0.1

2025-11-18 16:50:23,541 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore: Exception while deserializing scheduler configuration from store
java.lang.NullPointerException
        at java.base/java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:108)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore.deserializeObject(ZKConfigurationStore.java:317)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore.retrieve(ZKConfigurationStore.java:214)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.MutableCSConfigurationProvider.init(MutableCSConfigurationProvider.java:83)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:295)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:403)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:875)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1293)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:334)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1580)

2025-11-18 16:50:23,549 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler failed in state INITED
java.lang.NullPointerException
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:842)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.MutableCSConfigurationProvider.loadConfiguration(MutableCSConfigurationProvider.java:102)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:296)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:403)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:875)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1293)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:334)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1580)
{code}
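
The first trace shows the NullPointerException being raised in the ByteArrayInputStream constructor, which is what happens when that constructor receives a null byte array. A possible reading, not confirmed here, is that the data read from the store was null at that moment, and that the second NullPointerException in Configuration.<init> follows from the null result being passed on. The sketch below only illustrates a null guard around Java deserialization. The class NullSafeDeserializeSketch and its methods deserializeOrNull and serialize are hypothetical names for this example and are not the actual ZKConfigurationStore code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the Hadoop implementation.
final class NullSafeDeserializeSketch {

  // Returns null for a missing or empty payload instead of letting
  // new ByteArrayInputStream(null) throw a NullPointerException.
  static Object deserializeOrNull(byte[] bytes)
      throws IOException, ClassNotFoundException {
    if (bytes == null || bytes.length == 0) {
      return null; // the caller must decide how to handle "no data yet"
    }
    try (ObjectInputStream ois =
             new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      return ois.readObject();
    }
  }

  // Plain Java serialization, used here only to build a test payload.
  static byte[] serialize(Object value) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
      oos.writeObject(value);
    }
    return baos.toByteArray();
  }

  public static void main(String[] args) throws Exception {
    // Missing payload: no exception, the result is simply null.
    System.out.println(deserializeOrNull(null));

    // Round trip of a small key/value map standing in for scheduler config.
    Map<String, String> conf = new HashMap<>();
    conf.put("yarn.scheduler.capacity.root.queues", "default");
    System.out.println(deserializeOrNull(serialize(conf)));
  }
}
```

A guard like this only removes the first exception. The caller would still need to handle a null result, for example by retrying or by falling back to the file-based configuration, before constructing a Configuration from it.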



