[ https://issues.apache.org/jira/browse/OOZIE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andras Salamon updated OOZIE-2310:
----------------------------------
    Fix Version/s:     (was: 5.2.0)
                   5.3.0

> If the Hadoop configuration is not configured, you get a NullPointerException on job submission
> -----------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-2310
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2310
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.1.0
>            Reporter: Robert Kanter
>            Assignee: Artem Ervits
>            Priority: Major
>             Fix For: 5.3.0
>
>
> A user reported an NPE on startup here:
> http://mail-archives.apache.org/mod_mbox/oozie-user/201507.mbox/%3cCALBGZ8oZ0GZ+hf76nQYKxiATHH5g2gbQ_0sQ78uQv_=r4Hct=q...@mail.gmail.com%3e
> I did some digging and the problem is that Oozie is trying to load the
> ShareLib, but the {{FileSystem}} class variable is {{null}} because the
> {{ShareLibService}} wasn't able to create it on {{init}}. That would
> normally cause Oozie to fail on startup, but the default value of
> {{oozie.service.ShareLibService.fail.fast.on.startup}} is {{false}}, so the
> failure gets ignored.
> The code in question is this:
> {code:java}
>         try {
>             fs = FileSystem.get(has.createJobConf(uri.getAuthority()));
>             //cache action key sharelib conf list
>             cacheActionKeySharelibConfList();
>             updateLauncherLib();
>             updateShareLib();
>         }
>         catch (Throwable e) {
>             if (failOnfailure) {
>                 LOG.error("Sharelib initialization fails", e);
>                 throw new ServiceException(ErrorCode.E0104, getClass().getName(),
>                         "Sharelib initialization fails. ", e);
>             }
>             else {
>                 // We don't want to actually fail init by throwing an Exception, so only create the ServiceException and
>                 // log it
>                 ServiceException se = new ServiceException(ErrorCode.E0104, getClass().getName(),
>                         "Not able to cache sharelib. An Admin needs to install the sharelib with oozie-setup.sh and issue the "
>                                 + "'oozie admin' CLI command to update the sharelib", e);
>                 LOG.error(se);
>             }
>         }
> {code}
> where {{failOnfailure}} is {{false}} by default. So, {{fs}} ends up being
> {{null}}, and if anything later tries to use it, you get an NPE.
> I think we should do two things here:
> # Creating the {{FileSystem}} should be in a different try-catch so that
> {{failOnfailure}} doesn't affect it. The original intention of that behavior
> was to ignore ShareLib failures, not Hadoop failures.
> # We should improve the default Hadoop configuration (i.e.
> {{oozie.service.HadoopAccessorService.hadoop.configurations}}). This has
> been a problem for a while now: out of the box, Oozie doesn't work even
> for a local pseudo-cluster because of this config's default. If that's not
> possible, we need to make it more obvious that users must configure this
> before doing anything.
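
For illustration, the first suggestion (a separate try-catch for creating the {{FileSystem}}) could look roughly like the sketch below. This is not the actual Oozie patch. {{ShareLibInitSketch}}, {{Fs}}, {{InitException}} and the two functional parameters are hypothetical stand-ins so the example compiles without Hadoop or Oozie on the classpath; in the real service they would correspond to {{FileSystem.get(...)}}, {{ServiceException}} and the {{updateLauncherLib()}} / {{updateShareLib()}} calls.

```java
import java.util.function.Supplier;

public class ShareLibInitSketch {

    /** Hypothetical stand-in for org.apache.hadoop.fs.FileSystem. */
    interface Fs { }

    /** Hypothetical stand-in for Oozie's ServiceException. */
    static class InitException extends Exception {
        InitException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private final boolean failOnFailure;
    private Fs fs;
    private boolean shareLibCached;

    ShareLibInitSketch(boolean failOnFailure) {
        this.failOnFailure = failOnFailure;
    }

    void init(Supplier<Fs> fsFactory, Runnable shareLibLoader) throws InitException {
        // Step 1: create the file system in its own try-catch.
        // A failure here is a Hadoop problem, so it always fails init,
        // regardless of failOnFailure.
        try {
            fs = fsFactory.get();
        } catch (RuntimeException e) {
            throw new InitException(
                    "Could not create the FileSystem; check the Hadoop configuration", e);
        }
        if (fs == null) {
            throw new InitException("FileSystem factory returned null", null);
        }

        // Step 2: only ShareLib failures are subject to failOnFailure.
        try {
            shareLibLoader.run();
            shareLibCached = true;
        } catch (RuntimeException e) {
            if (failOnFailure) {
                throw new InitException("Sharelib initialization fails", e);
            }
            // Log and continue: fs is guaranteed non-null at this point.
            System.err.println("Not able to cache sharelib: " + e.getMessage());
        }
    }

    boolean hasFileSystem() {
        return fs != null;
    }

    boolean isShareLibCached() {
        return shareLibCached;
    }

    public static void main(String[] args) {
        // Simulates a missing Hadoop configuration with fail-fast disabled.
        ShareLibInitSketch service = new ShareLibInitSketch(false);
        try {
            service.init(
                    () -> { throw new IllegalStateException("no Hadoop configuration found"); },
                    () -> { });
        } catch (InitException e) {
            System.out.println("init failed fast: " + e.getMessage());
        }
    }
}
```

With this split, a missing Hadoop configuration surfaces at startup as an initialization error instead of as a later NPE, while a missing ShareLib is still tolerated when {{failOnfailure}} is {{false}}.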