[ 
https://issues.apache.org/jira/browse/OOZIE-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

abhishek bafna updated OOZIE-2347:
----------------------------------
    Fix Version/s:     (was: trunk)
                   4.3.0

> Remove unnecessary new Configuration()/new jobConf() calls from oozie
> ---------------------------------------------------------------------
>
>                 Key: OOZIE-2347
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2347
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>             Fix For: 4.3.0
>
>         Attachments: OOZIE-2347-V1.patch, OOZIE-2347-V2.patch, 
> amend-OOZIE-2347-V1.patch, amend-OOZIE-2347-V2.patch
>
>
> We noticed that setting up the job sharelib was slow. One prime reason was 
> that a lot of threads were blocked on "java.util.zip.ZipFile.getEntry":
> <0x00000005c0afda68> (a java.util.jar.JarFile): 0 Thread(s) sleeping, 178 
> Thread(s) waiting, 1 Thread(s) locking
> There are a lot of places where we call new Configuration()/new JobConf() 
> unnecessarily. These calls can easily be removed to improve performance.
> 1.
> Configuration defaultConf = new Configuration(); is called for every file we 
> add to the classpath, even when the caller already supplies a FileSystem.
> {code}
> public static void addFileToClassPath(Path file, Configuration conf, 
> FileSystem fs) throws IOException {
>       Configuration defaultConf = new Configuration();
>       XConfiguration.copy(conf, defaultConf);
>       if (fs == null) {
>         // it fails with conf, therefore we pass defaultConf instead
>         fs = file.getFileSystem(defaultConf);
>       }
>       // Hadoop 0.20/1.x.
>       if (defaultConf.get("yarn.resourcemanager.webapp.address") == null) {
>           // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
>           // Refer OOZIE-1806.
>           String filepath = file.toUri().getPath();
>           String classpath = conf.get("mapred.job.classpath.files");
>           conf.set("mapred.job.classpath.files", classpath == null
>               ? filepath
>               : classpath + System.getProperty("path.separator") + filepath);
>           URI uri = fs.makeQualified(file).toUri();
>           DistributedCache.addCacheFile(uri, conf);
>       }
>       else { // Hadoop 0.23/2.x
>           DistributedCache.addFileToClassPath(file, conf, fs);
>       }
>     }
> {code}
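
One possible direction for item 1, as a minimal sketch rather than the attached patch: build the defaults-backed configuration only when no file system handle was passed in, so the common path allocates nothing. `CostlyConf` and `ClassPathSketch` are hypothetical stand-ins (not Hadoop or Oozie classes), the `Object fs` parameter stands in for `FileSystem`, and the Hadoop-version branch from the snippet above is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a configuration object that is expensive to construct. */
final class CostlyConf {
    /** Counts constructor calls so the saving can be observed. */
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();
    private final Map<String, String> props = new HashMap<>();

    CostlyConf() {
        // A real Hadoop Configuration would locate and parse default resources here.
        CONSTRUCTED.incrementAndGet();
    }

    String get(String key) {
        return props.get(key);
    }

    void set(String key, String value) {
        props.put(key, value);
    }
}

final class ClassPathSketch {
    /** Appends a file to the classpath property; allocates defaults only if fs is missing. */
    static void addFileToClassPath(String file, CostlyConf conf, Object fs) {
        if (fs == null) {
            // Only this branch pays for a fresh configuration.
            CostlyConf defaultConf = new CostlyConf();
            fs = new Object(); // placeholder for file.getFileSystem(defaultConf)
        }
        String classpath = conf.get("mapred.job.classpath.files");
        conf.set("mapred.job.classpath.files", classpath == null
                ? file
                : classpath + System.getProperty("path.separator") + file);
    }
}
```

With this shape, adding a hundred files with a non-null `fs` constructs no additional `CostlyConf` instances. Whether the real method can drop the defaults lookup entirely depends on how the Hadoop-version check is handled.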
> 2.
> sharelib setup also calls new Configuration(), which is not needed.
> {code}
> public Configuration getShareLibConf(String inputKey, Path path) {
>         Configuration conf = new Configuration();
>         if (shareLibConfigMap.containsKey(inputKey)) {
>             conf = shareLibConfigMap.get(inputKey).get(path);
>         }
>         return conf;
>     }
> {code}        
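
For item 2, a minimal sketch of the lookup-first shape, assuming the same return behaviour as the snippet above (the cached entry when the key is known, an empty configuration otherwise). `ShareLibConfSketch`, its `put` helper, the `String` path, and the `Supplier` used as a factory are all hypothetical and only there to keep the example self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical holder mirroring shareLibConfigMap; C stands in for Configuration. */
final class ShareLibConfSketch<C> {
    private final Map<String, Map<String, C>> shareLibConfigMap = new ConcurrentHashMap<>();
    private final Supplier<C> emptyConfFactory;

    ShareLibConfSketch(Supplier<C> emptyConfFactory) {
        this.emptyConfFactory = emptyConfFactory;
    }

    /** Registers a cached configuration for a sharelib key and path. */
    void put(String inputKey, String path, C conf) {
        shareLibConfigMap.computeIfAbsent(inputKey, k -> new ConcurrentHashMap<>()).put(path, conf);
    }

    /** Looks up first; the factory runs only when the key is unknown. */
    C getShareLibConf(String inputKey, String path) {
        Map<String, C> byPath = shareLibConfigMap.get(inputKey);
        if (byPath != null) {
            // As in the original snippet, this may be null if the path is not cached.
            return byPath.get(path);
        }
        return emptyConfFactory.get();
    }
}
```

The single `get` also replaces the `containsKey` plus `get` pair, so the map is consulted once per call.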
> 3.
> CoordActionInputCheckXCommand.checkPath also creates a JobConf every time.



