Hi,

I opened a JIRA for this issue:
https://issues.apache.org/jira/browse/OOZIE-2755

Thanks

2016-12-07 17:22 GMT+08:00 Andras Piros <andras.pi...@cloudera.com>:

> Hi Dongying,
>
> This seems like a bug in ZKJobsConcurrencyService: when numOozies is
> zero, isJobIdForThisServer() should emit a WARN log stating that the other
> Oozie instance might be missing and return true, rather than throwing a
> RuntimeException.
>
> Can you please file a bug under Apache JIRA?
>
> Thanks, and regards,
>
> Andras
>
> --
> Andras PIROS
> Software Engineer
> <http://www.cloudera.com/>
>
> On Tue, Dec 6, 2016 at 4:33 AM, Dongying Jiao <pineapple...@gmail.com>
> wrote:
>
> > Hi:
> > Do you have detailed steps for setting up Oozie HA using a virtual IP?
> > I set up Oozie HA using a virtual IP with server-1 and server-2
> > (active-active). When we take down server-1, any Oozie job submitted
> > fails with the stack trace below. If both are up, there is no issue.
> >
> > ERROR RecoveryService$RecoveryRunnable:517 - SERVER[XXXX] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Exception, / by zero
> > java.lang.ArithmeticException: / by zero
> >   at org.apache.oozie.service.ZKJobsConcurrencyService.checkJobIdForServer(ZKJobsConcurrencyService.java:167)
> >   at org.apache.oozie.service.ZKJobsConcurrencyService.isJobIdForThisServer(ZKJobsConcurrencyService.java:129)
> >   at org.apache.oozie.service.RecoveryService$RecoveryRunnable.runWFRecovery(RecoveryService.java:362)
> >   at org.apache.oozie.service.RecoveryService$RecoveryRunnable.run(RecoveryService.java:146)
> >   at org.apache.oozie.service.SchedulerService$2.run(SchedulerService.java:175)
> >   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >
> > It seems server-2 can't get the Oozie server list from ZooKeeper. The
> > ZooKeeper connection string is already added to oozie-site.
> >
> > Thanks
> >
> > Best Regards,
> > Dongying Jiao
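
For reference, the guard Andras describes could look roughly like the sketch below. This is not the actual Oozie patch: the class name JobOwnershipSketch, the method signature, and the modulo-based ownership rule are assumptions made for illustration, based only on the "/ by zero" in the stack trace and the usual Oozie job ID layout (a sequence number before the first dash).

```java
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for the check in ZKJobsConcurrencyService; not the real class.
class JobOwnershipSketch {
    private static final Logger LOG = Logger.getLogger(JobOwnershipSketch.class.getName());

    // Decides whether this server should handle the given job.
    // oozieServers: server list as read from ZooKeeper (assumed input).
    // myIndex: this server's position in that list (assumed input).
    static boolean isJobIdForThisServer(String jobId, List<String> oozieServers, int myIndex) {
        int numOozies = oozieServers.size();
        if (numOozies == 0) {
            // Guard: an empty list would otherwise cause a division by zero below.
            LOG.warning("No Oozie servers found in ZooKeeper; other instances may be missing. "
                    + "Processing job " + jobId + " on this server.");
            return true;
        }
        // Assumed ownership rule: job sequence number modulo the number of servers.
        return parseSequence(jobId) % numOozies == myIndex;
    }

    // Extracts the leading sequence number, e.g. 3 from "0000003-161207172200000-oozie-oozi-W".
    static long parseSequence(String jobId) {
        return Long.parseLong(jobId.substring(0, jobId.indexOf('-')));
    }
}
```

With an empty server list the method logs a warning and claims the job, so recovery continues on the surviving server instead of failing with an ArithmeticException.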