[ 
https://issues.apache.org/jira/browse/OOZIE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Salamon updated OOZIE-2310:
----------------------------------
    Fix Version/s:     (was: 5.2.0)
                   5.3.0

> If the Hadoop configuration is not configured, you get a NullPointerException 
> on job submission
> -----------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-2310
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2310
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.1.0
>            Reporter: Robert Kanter
>            Assignee: Artem Ervits
>            Priority: Major
>             Fix For: 5.3.0
>
>
> A user reported an NPE on startup here:
> http://mail-archives.apache.org/mod_mbox/oozie-user/201507.mbox/%3cCALBGZ8oZ0GZ+hf76nQYKxiATHH5g2gbQ_0sQ78uQv_=r4Hct=q...@mail.gmail.com%3e
> I did some digging and the problem is that Oozie is trying to load the 
> ShareLib, but the {{FileSystem}} class variable is {{null}} because the 
> {{ShareLibService}} wasn't able to create it on {{init}}.  That would 
> normally cause Oozie to fail on startup, but the default value of 
> {{oozie.service.ShareLibService.fail.fast.on.startup}} is {{false}}, so it 
> gets ignored.
> The code in question is this:
> {code:java}
> try {
>     fs = FileSystem.get(has.createJobConf(uri.getAuthority()));
>     // cache action key sharelib conf list
>     cacheActionKeySharelibConfList();
>     updateLauncherLib();
>     updateShareLib();
> }
> catch (Throwable e) {
>     if (failOnfailure) {
>         LOG.error("Sharelib initialization fails", e);
>         throw new ServiceException(ErrorCode.E0104, getClass().getName(), "Sharelib initialization fails. ", e);
>     }
>     else {
>         // We don't want to actually fail init by throwing an Exception, so only create the ServiceException and
>         // log it
>         ServiceException se = new ServiceException(ErrorCode.E0104, getClass().getName(),
>                 "Not able to cache sharelib. An Admin needs to install the sharelib with oozie-setup.sh and issue the "
>                         + "'oozie admin' CLI command to update the sharelib", e);
>         LOG.error(se);
>     }
> }
> {code}
> where {{failOnfailure}} is {{false}} by default.  So, {{fs}} ends up being 
> {{null}}, and if anything later tries to use it, you get an NPE.
> I think we should do two things here:
> # Creating the {{FileSystem}} should be in a different try-catch so that the 
> {{failOnfailure}} doesn't affect it.  The original intention of that behavior 
> was to ignore ShareLib failures, not Hadoop failures.
> # We should improve the default Hadoop configuration (i.e. 
> {{oozie.service.HadoopAccessorService.hadoop.configurations}}).  This has 
> been a problem for a while now where, out of the box, Oozie doesn't work even 
> for a local pseudo-cluster because of this config's default.  If that's not 
> possible, we need to make it more obvious that users must configure this 
> before doing anything.
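
The first suggestion above could look roughly like the sketch below. It is not the actual Oozie patch: {{ShareLibInitSketch}}, its nested {{ServiceException}} and {{FileSystem}} types, and the {{fsFactory}} / {{shareLibLoader}} parameters are hypothetical stand-ins so the example compiles without Hadoop or Oozie on the classpath, and it catches {{RuntimeException}} where the original code catches {{Throwable}}. The only point it illustrates is that {{FileSystem}} creation gets its own try-catch and always fails fast, while {{failOnFailure}} only governs the ShareLib steps.

```java
import java.util.function.Supplier;

public class ShareLibInitSketch {

    /** Hypothetical, simplified stand-in for Oozie's ServiceException. */
    public static class ServiceException extends Exception {
        public ServiceException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for org.apache.hadoop.fs.FileSystem. */
    public interface FileSystem { }

    private FileSystem fs;
    private boolean shareLibCached;

    public void init(Supplier<FileSystem> fsFactory, Runnable shareLibLoader, boolean failOnFailure)
            throws ServiceException {
        // Step 1: creating the FileSystem is a Hadoop concern, so it always
        // fails init, regardless of failOnFailure.
        try {
            fs = fsFactory.get();
        }
        catch (RuntimeException e) {
            throw new ServiceException("Could not create FileSystem; check the Hadoop configuration", e);
        }
        if (fs == null) {
            throw new ServiceException("FileSystem factory returned null", null);
        }

        // Step 2: ShareLib problems keep the existing behavior and are only
        // fatal when failOnFailure is set.
        try {
            shareLibLoader.run();
            shareLibCached = true;
        }
        catch (RuntimeException e) {
            if (failOnFailure) {
                throw new ServiceException("Sharelib initialization fails", e);
            }
            // Log and continue: fs is guaranteed non-null at this point.
            System.err.println("Not able to cache sharelib: " + e.getMessage());
        }
    }

    public boolean hasFileSystem() {
        return fs != null;
    }

    public boolean isShareLibCached() {
        return shareLibCached;
    }
}
```

With this split, a missing ShareLib still only logs an error when {{failOnFailure}} is {{false}}, but a broken Hadoop configuration stops startup instead of surfacing later as an NPE.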



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
