[ https://issues.apache.org/jira/browse/OOZIE-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
abhishek bafna updated OOZIE-2347:
----------------------------------
    Fix Version/s:     (was: trunk)
                   4.3.0

> Remove unnecessary new Configuration()/new jobConf() calls from oozie
> ---------------------------------------------------------------------
>
>                 Key: OOZIE-2347
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2347
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>             Fix For: 4.3.0
>
>         Attachments: OOZIE-2347-V1.patch, OOZIE-2347-V2.patch,
> amend-OOZIE-2347-V1.patch, amend-OOZIE-2347-V2.patch
>
>
> We noticed that setting up the job sharelib was slow. One primary reason
> was that many threads were blocked on
> "java.util.zip.ZipFile.getEntry" <0x00000005c0afda68> (a java.util.jar.JarFile):
> 0 Thread(s) sleeping, 178 Thread(s) waiting, 1 Thread(s) locking
> There are many places where we call new Configuration()/new JobConf()
> unnecessarily. These calls can easily be removed to improve performance.
>
> 1. Configuration defaultConf = new Configuration(); is called for every
> file we add to the classpath.
> {code}
> public static void addFileToClassPath(Path file, Configuration conf, FileSystem fs) throws IOException {
>     Configuration defaultConf = new Configuration();
>     XConfiguration.copy(conf, defaultConf);
>     if (fs == null) {
>         // it fails with conf, therefore we pass defaultConf instead
>         fs = file.getFileSystem(defaultConf);
>     }
>     // Hadoop 0.20/1.x.
>     if (defaultConf.get("yarn.resourcemanager.webapp.address") == null) {
>         // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
>         // Refer OOZIE-1806.
>         String filepath = file.toUri().getPath();
>         String classpath = conf.get("mapred.job.classpath.files");
>         conf.set("mapred.job.classpath.files", classpath == null
>                 ? filepath
>                 : classpath + System.getProperty("path.separator") + filepath);
>         URI uri = fs.makeQualified(file).toUri();
>         DistributedCache.addCacheFile(uri, conf);
>     }
>     else { // Hadoop 0.23/2.x
>         DistributedCache.addFileToClassPath(file, conf, fs);
>     }
> }
> {code}
>
> 2. The sharelib setup also calls new Configuration(), which is not needed.
> {code}
> public Configuration getShareLibConf(String inputKey, Path path) {
>     Configuration conf = new Configuration();
>     if (shareLibConfigMap.containsKey(inputKey)) {
>         conf = shareLibConfigMap.get(inputKey).get(path);
>     }
>     return conf;
> }
> {code}
>
> 3. CoordActionInputCheckXCommand.checkPath also creates a new JobConf
> every time.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
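
For item 2 of the issue description, a minimal sketch of one way to avoid the eager allocation: build the fallback object only when the lookup misses. This is not taken from the attached patches. It uses only the JDK, and CostlyConf, ShareLibConfLookup and ShareLibConfDemo are hypothetical stand-ins (CostlyConf plays the role of Hadoop's Configuration, and plain String paths replace Path) so that the effect can be observed with a constructor counter.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an object that is expensive to construct. */
final class CostlyConf {
    /** Counts constructor calls so the demo can show how many were made. */
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    CostlyConf() {
        CONSTRUCTED.incrementAndGet();
    }
}

/** Hypothetical lookup mirroring the shape of getShareLibConf in item 2. */
final class ShareLibConfLookup {
    private final Map<String, Map<String, CostlyConf>> shareLibConfigMap = new HashMap<>();

    void put(String inputKey, String path, CostlyConf conf) {
        shareLibConfigMap.computeIfAbsent(inputKey, k -> new HashMap<>()).put(path, conf);
    }

    CostlyConf getShareLibConf(String inputKey, String path) {
        Map<String, CostlyConf> perPath = shareLibConfigMap.get(inputKey);
        if (perPath != null) {
            // Same result as the quoted code on a hit (may be null if the
            // path is unknown), but without constructing a throwaway object.
            return perPath.get(path);
        }
        // Only a miss on inputKey pays the construction cost.
        return new CostlyConf();
    }
}

final class ShareLibConfDemo {
    public static void main(String[] args) {
        ShareLibConfLookup lookup = new ShareLibConfLookup();
        lookup.put("pig", "/share/lib/pig", new CostlyConf());

        int before = CostlyConf.CONSTRUCTED.get();
        for (int i = 0; i < 1000; i++) {
            lookup.getShareLibConf("pig", "/share/lib/pig");
        }
        int extra = CostlyConf.CONSTRUCTED.get() - before;
        System.out.println("constructions during 1000 hits: " + extra);
    }
}
```

In the quoted method every call pays for a constructor whose result is discarded on a hit; in the sketch, hits allocate nothing. Whether callers rely on receiving a fresh object on a miss is an assumption that would need checking against the real code.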