Gour,

Thanks for the prompt reply.

1. A temporary hiccup in HDFS as a possible cause has been on my mind as well; I wanted to reach out to the Slider community to check whether other issues could cause this symptom.

2. I remember that I stopped and started the Slider app after this timestamp. Apparently the app stop/start did not delete this file. Can you confirm that behaviour? Also, would it make sense to have an enhancement to delete this file on app stop/start, if that is indeed not being done?

Thanks,

Manoj

On Wed, Jan 17, 2018 at 1:50 PM, Gour Saha <gs...@hortonworks.com> wrote:

> Manoj,
>
> By any chance is it possible to find out (maybe from logs or sar files) if
> there was HDFS unavailability (say NN node connection issue) around the
> time of 2018-01-06 00:33 (based on the readlock file timestamp)?
>
> -rw-r--r-- 3 xxx xxx 23 2018-01-06 00:33 hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
>
> -Gour
>
> On 1/17/18, 1:05 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>
> >Hello,
> >
> >Slider version 0.80 on CDH 5.5.1 cluster with kerberos
> >
> >Slider upgrade <App> --template /xxx/appConfig.json --resources /xxx/resources.json --queue tenant --force failed with following trace
> >
> >2018-01-17 20:31:23,030 [main] INFO tools.SliderUtils - JVM initialized into secure mode with kerberos realm BIGDATA
> >2018-01-17 20:31:23,869 [main] INFO client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
> >2018-01-17 20:31:24,325 [main] WARN client.SliderClient - Failed to get a Lock on Builder working with spas at hdfs://xxx/user/xxx/.slider/cluster/spas : org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
> >org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
> >    at org.apache.slider.core.persist.ConfPersister.acquireWritelock(ConfPersister.java:141)
> >    at org.apache.slider.core.persist.ConfPersister.save(ConfPersister.java:253)
> >    at org.apache.slider.core.build.InstanceBuilder.persist(InstanceBuilder.java:270)
> >    at org.apache.slider.client.SliderClient.persistInstanceDefinition(SliderClient.java:1836)
> >    at org.apache.slider.client.SliderClient.buildInstanceDefinition(SliderClient.java:1734)
> >    at org.apache.slider.client.SliderClient.actionUpgrade(SliderClient.java:802)
> >    at org.apache.slider.client.SliderClient.exec(SliderClient.java:542)
> >    at org.apache.slider.client.SliderClient.runService(SliderClient.java:424)
> >    at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:188)
> >    at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
> >    at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
> >    at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
> >    at org.apache.slider.Slider.main(Slider.java:49)
> >2018-01-17 20:31:24,327 [main] ERROR main.ServiceLauncher - Failed to save spas: org.apache.slider.core.persist.LockAcquireFailedException: Failed to acquire lock hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
> >2018-01-17 20:31:24,328 [main] INFO util.ExitUtil - Exiting with status 70
> >
> >HDFS ls listing showed a file readlock was created few days back
> >
> >hdfs dfs -ls hdfs://xxx/user/xxx/.slider/cluster/spas
> >...
> >-rw-r--r-- 3 xxx xxx 23 2018-01-06 00:33 hdfs://xxx/user/xxx/.slider/cluster/spas/readlock
> >...
> >
> >After deleting this file manually, the upgrade command works.
> >
> >Any idea when is this file created and why it was not removed ?
> >
> >Thanks in advance,
> >
> >Manoj
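
For anyone hitting the same symptom, a leftover lock can be spotted from the directory listing before running the upgrade. The following is a minimal Python sketch and not part of Slider: it parses `hdfs dfs -ls` output in the format shown in this thread and reports lock files older than a threshold. The function names, the one-hour default threshold, and the choice to look only for a file named `readlock` are illustrative assumptions; whether a reported file is safe to delete still has to be judged by hand.

```python
from datetime import datetime, timedelta


def parse_ls_line(line):
    """Parse one file line of `hdfs dfs -ls` output into (modified, path).

    Assumed layout (as in the listing quoted in this thread):
    perms, replication, owner, group, size, date, time, path.
    Returns None for lines that do not match (e.g. "Found N items").
    """
    fields = line.split()
    if len(fields) < 8:
        return None
    try:
        modified = datetime.strptime(fields[5] + " " + fields[6], "%Y-%m-%d %H:%M")
    except ValueError:
        return None
    return modified, fields[7]


def stale_locks(listing, now, max_age=timedelta(hours=1), lock_names=("readlock",)):
    """Return paths of lock files whose modification time is older than max_age.

    `lock_names` and `max_age` are hypothetical defaults chosen for this sketch.
    """
    stale = []
    for line in listing.splitlines():
        parsed = parse_ls_line(line)
        if parsed is None:
            continue
        modified, path = parsed
        name = path.rsplit("/", 1)[-1]
        if name in lock_names and now - modified > max_age:
            stale.append(path)
    return stale


if __name__ == "__main__":
    listing = (
        "Found 1 items\n"
        "-rw-r--r-- 3 xxx xxx 23 2018-01-06 00:33 "
        "hdfs://xxx/user/xxx/.slider/cluster/spas/readlock\n"
    )
    for path in stale_locks(listing, now=datetime(2018, 1, 17, 20, 31)):
        print("possibly stale lock:", path)
```

In practice the `listing` string would come from the captured output of `hdfs dfs -ls` on the cluster directory; the sketch only reports candidates and never deletes anything.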