Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45985
---

Ship it!

Ship It!

- Rohini Palaniswamy


On June 16, 2014, 7:43 a.m., Benjamin Zhitomirsky wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19929/
> ---
>
> (Updated June 16, 2014, 7:43 a.m.)
>
> Review request for oozie.
>
> Repository: oozie-git
>
> Description
> ---
> When the name-node element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
>
> - Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the name-node element, but later, during job submission, it expects this path to be under the default Oozie name node.
> - Processing of the job-xml element if the job xml is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.
>
> Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows submitting the same job to different Hadoop clusters.
>
> Diffs
> ---
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c
>   core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e
>   core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b
>   core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096
>   core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6
>   core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f
>   core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927
>   docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0
>
> Diff: https://reviews.apache.org/r/19929/diff/
>
> Testing
> ---
> On deployed Hadoop cluster. Two tests were added.
>
> Thanks,
> Benjamin Zhitomirsky
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
---

(Updated June 16, 2014, 7:43 a.m.)

Review request for oozie.

Changes
---
ShareLibService#getFileSystem now simply returns fs as Rohini pointed out

Repository: oozie-git

Diffs (updated)
---
  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0

Diff: https://reviews.apache.org/r/19929/diff/

Testing
---
On deployed Hadoop cluster. Two tests were added.

Thanks,
Benjamin Zhitomirsky
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
On June 15, 2014, 2:03 a.m., Rohini Palaniswamy wrote:
> core/src/main/java/org/apache/oozie/service/ShareLibService.java, line 96
> https://reviews.apache.org/r/19929/diff/6-7/?file=599794#file599794line96
>
> You can just return this fs object

This won't be right, because if sysLibPath is a fully qualified path, then fs would not be the right filesystem. Of course, I could check for an authority in sysLibPath and, if it is not present, return fs; otherwise call getFileSystem().

- Benjamin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45702
---
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
On June 15, 2014, 2:03 a.m., Rohini Palaniswamy wrote:
> core/src/main/java/org/apache/oozie/service/ShareLibService.java, line 96
> https://reviews.apache.org/r/19929/diff/6-7/?file=599794#file599794line96
>
> You can just return this fs object

Benjamin Zhitomirsky wrote:
> This won't be right, because if sysLibPath is a full qualified path than fs would not be the right filesystem. Of cause I may check for authority in sysLibPath and if not present then return fs, otherwise call getFileSystem().

This fs is constructed by doing

    fs = FileSystem.get(has.createJobConf(uri.getAuthority()));

and should take care of the case where uri authority is null or if it is fully qualified for system lib path. Why do you want to construct fs by always passing null for authority?

- Rohini

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45702
---
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
On June 15, 2014, 2:03 a.m., Rohini Palaniswamy wrote:
> core/src/main/java/org/apache/oozie/service/ShareLibService.java, line 96
> https://reviews.apache.org/r/19929/diff/6-7/?file=599794#file599794line96
>
> You can just return this fs object

Benjamin Zhitomirsky wrote:
> This won't be right, because if sysLibPath is a full qualified path than fs would not be the right filesystem. Of cause I may check for authority in sysLibPath and if not present then return fs, otherwise call getFileSystem().

Rohini Palaniswamy wrote:
> This fs is constructed by doing
>     fs = FileSystem.get(has.createJobConf(uri.getAuthority()));
> and should take care of the case where uri authority is null or if it is fully qualified for system lib path. Why do you want to construct fs by always passing null for authority?

Hmm... Yes, you are right... My bad! I will fix it as you say.

- Benjamin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45702
---
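
To make the authority point in this exchange concrete, here is a minimal sketch that uses only java.net.URI. It is not Oozie code: the class name, the method names, and the host names are hypothetical and exist only for illustration. It shows that a fully qualified system lib path carries its own authority, while an absolute path has none and so falls back to whatever default the caller supplies.

```java
import java.net.URI;

class SysLibPathAuthority {
    /** True when the path names its own file system (scheme plus authority). */
    static boolean isFullyQualified(String path) {
        URI uri = URI.create(path);
        return uri.getScheme() != null && uri.getAuthority() != null;
    }

    /** The path's own authority, or the supplied default when the path has none. */
    static String effectiveAuthority(String path, String defaultAuthority) {
        String authority = URI.create(path).getAuthority();
        return authority != null ? authority : defaultAuthority;
    }

    public static void main(String[] args) {
        // Hypothetical default name node, used only when the path has no authority.
        String defaultAuthority = "nn1.example.com:8020";
        System.out.println(effectiveAuthority("/user/oozie/share/lib", defaultAuthority));
        System.out.println(effectiveAuthority(
                "hdfs://nn2.example.com:8020/user/oozie/share/lib", defaultAuthority));
    }
}
```

Because the authority already distinguishes the two cases, a filesystem handle built from it should be usable for either kind of path, which is the gist of Rohini's argument.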
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
---

(Updated June 14, 2014, 7:27 p.m.)

Review request for oozie.

Changes
---
Latest CR comments applied: ShareLibService#getFileSystem now retrieves filesystem simply for Oozie user instead of the caller.

Repository: oozie-git

Diffs (updated)
---
  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0

Diff: https://reviews.apache.org/r/19929/diff/

Testing
---
On deployed Hadoop cluster. Two tests were added.

Thanks,
Benjamin Zhitomirsky
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45702
---

core/src/main/java/org/apache/oozie/service/ShareLibService.java
https://reviews.apache.org/r/19929/#comment80658

    You can just return this fs object

- Rohini Palaniswamy
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45089
---

Patch looks good. Just two minor comments.

core/src/main/java/org/apache/oozie/service/ShareLibService.java
https://reviews.apache.org/r/19929/#comment79756

    Is constructing a new FileSystem required? Can't the local variable fs be returned as is? The fs will access the filesystem as the oozie user, but I guess that should be OK as it is only used for reading files.

core/src/test/java/org/apache/oozie/test/XTestCase.java
https://reviews.apache.org/r/19929/#comment79757

    create - c lower case.

- Rohini Palaniswamy
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
---

(Updated May 30, 2014, 2:32 p.m.)

Review request for oozie.

Changes
---
Fixed according to CR comments

Repository: oozie-git

Diffs (updated)
---
  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 353b382
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0

Diff: https://reviews.apache.org/r/19929/diff/

Testing
---
On deployed Hadoop cluster. Two tests were added.

Thanks,
Benjamin Zhitomirsky
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
---

(Updated May 8, 2014, 4:05 p.m.)

Review request for oozie.

Changes
---
Fixes according to comments. Shared system libraries now default to the default filesystem, unless a fully qualified path is specified in the configuration. The previous implementation wrongly assumed that shared libraries should be located in the application's file system.

Repository: oozie-git

Diffs (updated)
---
  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 353b382
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0

Diff: https://reviews.apache.org/r/19929/diff/

Testing
---
On deployed Hadoop cluster. Two tests were added.

Thanks,
Benjamin Zhitomirsky
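
The rule stated under Changes in this update (system libraries default to the default filesystem unless the configured path is fully qualified) can be sketched with java.net.URI alone. This is an illustration under assumptions, not the patch itself: SystemLibPathResolver, qualify, and the host names used in main are hypothetical.

```java
import java.net.URI;

class SystemLibPathResolver {
    /**
     * Returns the configured path unchanged (apart from normalization) when it
     * is fully qualified; otherwise resolves it against the default file system.
     */
    static URI qualify(String sysLibPath, URI defaultFs) {
        URI uri = URI.create(sysLibPath);
        if (uri.getScheme() != null && uri.getAuthority() != null) {
            return uri.normalize();
        }
        // An absolute path without scheme or authority inherits both from defaultFs.
        return defaultFs.resolve(uri).normalize();
    }

    public static void main(String[] args) {
        // Hypothetical default file system URI.
        URI defaultFs = URI.create("hdfs://nn1.example.com:8020/");
        System.out.println(qualify("/user/oozie/share/lib", defaultFs));
        System.out.println(qualify("hdfs://nn2.example.com:8020/share/lib", defaultFs));
    }
}
```

Note that the name node of the workflow application never enters the computation, which matches the correction described in this update.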
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
On May 5, 2014, 10:59 p.m., Rohini Palaniswamy wrote:
> core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java, lines 443-458
> https://reviews.apache.org/r/19929/diff/3/?file=574188#file574188line443
>
> Can just do:
>
>     Path pathToAdd = new Path(uri.normalize());
>     Services.get().get(HadoopAccessorService.class).addFileToClassPath(user, pathToAdd, conf);
>
> Make the below change to HadoopAccessorService.java:
>
>     public void addFileToClassPath(String user, final Path file, final Configuration conf)
>             throws IOException {
>         ParamChecker.notEmpty(user, "user");
>         try {
>             UserGroupInformation ugi = getUGI(user);
>             ugi.doAs(new PrivilegedExceptionAction<Void>() {
>                 public Void run() throws Exception {
>                     Configuration defaultConf = new Configuration();
>                     XConfiguration.copy(conf, defaultConf);
>                     // Doing this NOP add first to have the FS created and cached
>                     DistributedCache.addFileToClassPath(file, defaultConf);
>                     // Hadoop 0.20/1.x.
>                     if (defaultConf.get("mapred.job.classpath.files") != null) {
>                         // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
>                         // Refer OOZIE-1806.
>                         String filepath = file.toUri().getPath();
>                         String classpath = conf.get("mapred.job.classpath.files");
>                         conf.set("mapred.job.classpath.files", classpath == null ? filepath
>                                 : classpath + System.getProperty("path.separator") + filepath);
>                         URI uri = file.getFileSystem(defaultConf).makeQualified(file).toUri();
>                         DistributedCache.addCacheFile(uri, conf);
>                     } else { // Hadoop 0.23/2.x
>                         DistributedCache.addFileToClassPath(file, conf);
>                     }
>                     return null;
>                 }
>             });
>         } catch (InterruptedException ex) {
>             throw new IOException(ex);
>         }
>     }

The code inside the run() method can be replaced with JobUtils.addFileToClasspath(), which is added by OOZIE-1806.

- Rohini

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review42205
---

On May 4, 2014, 7:05 p.m., Benjamin Zhitomirsky wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19929/
> ---
>
> (Updated May 4, 2014, 7:05 p.m.)
>
> Review request for oozie.
>
> Repository: oozie-git
>
> Diffs
> ---
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 59ad143
>   core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6
>   core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f
>   core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927
>
> Diff: https://reviews.apache.org/r/19929/diff/
>
> Testing
> ---
> On deployed Hadoop cluster. Two tests were added.
>
> Thanks,
> Benjamin Zhitomirsky
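
The Hadoop 0.20/1.x branch of the suggested method appends the file's path to mapred.job.classpath.files by hand. A stand-alone sketch of just that string handling follows. A plain Map stands in for Hadoop's Configuration, and ClasspathAppender and appendToClasspath are hypothetical names; nothing here touches DistributedCache or a real file system.

```java
import java.util.HashMap;
import java.util.Map;

class ClasspathAppender {
    static final String KEY = "mapred.job.classpath.files";

    /** Sets the property to filepath, or appends filepath after the platform path separator. */
    static void appendToClasspath(Map<String, String> conf, String filepath) {
        String classpath = conf.get(KEY);
        conf.put(KEY, classpath == null
                ? filepath
                : classpath + System.getProperty("path.separator") + filepath);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        appendToClasspath(conf, "/user/oozie/share/lib/a.jar");
        appendToClasspath(conf, "/user/oozie/share/lib/b.jar");
        // The separator is platform dependent, so the printed value differs by OS.
        System.out.println(conf.get(KEY));
    }
}
```

Only the path component of the file is stored in the property, as in the quoted code, which is why the fully qualified URI is registered separately there via addCacheFile.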
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review42205
---

core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75952

    The Hadoop Configuration deprecation feature should handle it. You don't have to explicitly check for both. Only checking fs.default.name is good enough.

core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75953

    Just getAuthority() != null is enough, as it contains the host as well.

core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75954

    Since this is in a loop, we can have a Map<String, FileSystem> of authority and filesystem and reuse it.

core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75968

    Can just do:

        Path pathToAdd = new Path(uri.normalize());
        Services.get().get(HadoopAccessorService.class).addFileToClassPath(user, pathToAdd, conf);

    Make the below change to HadoopAccessorService.java:

        public void addFileToClassPath(String user, final Path file, final Configuration conf)
                throws IOException {
            ParamChecker.notEmpty(user, "user");
            try {
                UserGroupInformation ugi = getUGI(user);
                ugi.doAs(new PrivilegedExceptionAction<Void>() {
                    public Void run() throws Exception {
                        Configuration defaultConf = new Configuration();
                        XConfiguration.copy(conf, defaultConf);
                        // Doing this NOP add first to have the FS created and cached
                        DistributedCache.addFileToClassPath(file, defaultConf);
                        // Hadoop 0.20/1.x.
                        if (defaultConf.get("mapred.job.classpath.files") != null) {
                            // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
                            // Refer OOZIE-1806.
                            String filepath = file.toUri().getPath();
                            String classpath = conf.get("mapred.job.classpath.files");
                            conf.set("mapred.job.classpath.files", classpath == null ? filepath
                                    : classpath + System.getProperty("path.separator") + filepath);
                            URI uri = file.getFileSystem(defaultConf).makeQualified(file).toUri();
                            DistributedCache.addCacheFile(uri, conf);
                        } else { // Hadoop 0.23/2.x
                            DistributedCache.addFileToClassPath(file, conf);
                        }
                        return null;
                    }
                });
            } catch (InterruptedException ex) {
                throw new IOException(ex);
            }
        }

core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75959

    uri.normalize()

core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75973

    Can be removed

core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java
https://reviews.apache.org/r/19929/#comment75969

    Can we do this only for TestJavaActionExecutor, as it is not needed by other action executor test cases? Even though cluster setup is only done once for all tests, we will unnecessarily keep creating test dirs in both clusters.

core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75972

    "3. Absolute (but not fully qualified) path located in the both filesystems"

    Comment not in sync with the actual test. job3.xml is a fully qualified path. Need to make it absolute and add another job4.xml for the fully qualified path.

core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75977

    Can you add a comment here and in 3.2.2.4 Syntax of WorkflowFunctionalSpecification.twiki, saying that relative and absolute paths for job-xml point to the Namenode of the app path even if a different namenode is specified for the action, and that to point to a different namenode the path should be fully qualified?

core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java
https://reviews.apache.org/r/19929/#comment75978

    This will not be ignored, right, if job3.xml was not fully qualified? Aren't absolute non-qualified paths picked from the app path filesystem?
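
The suggestion in comment75954 (keep a map from authority to filesystem and reuse it inside the loop) can be sketched generically. This is not the Oozie implementation: PerAuthorityCache is a hypothetical helper, the factory function stands in for whatever actually opens a FileSystem, and a String is used as the handle type in main so the example runs without Hadoop on the classpath.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class PerAuthorityCache<T> {
    private final Map<String, T> byAuthority = new HashMap<>();
    private final Function<String, T> factory;
    private int created = 0;

    PerAuthorityCache(Function<String, T> factory) {
        this.factory = factory;
    }

    /** Returns the handle for the URI's authority, creating it on first use only. */
    T get(URI uri) {
        // URIs without an authority share a single entry under the empty key.
        String key = uri.getAuthority() == null ? "" : uri.getAuthority();
        T handle = byAuthority.get(key);
        if (handle == null) {
            handle = factory.apply(key);
            byAuthority.put(key, handle);
            created++;
        }
        return handle;
    }

    /** Number of handles the factory has produced so far. */
    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        // The factory here only builds a label; a real one would open a file system.
        PerAuthorityCache<String> cache = new PerAuthorityCache<>(a -> "fs for [" + a + "]");
        cache.get(URI.create("hdfs://nn2.example.com:8020/apps/job1.xml"));
        cache.get(URI.create("hdfs://nn2.example.com:8020/apps/job2.xml"));
        cache.get(URI.create("/apps/job3.xml"));
        System.out.println("handles created: " + cache.createdCount());
    }
}
```

Keying on the authority alone follows comment75953, which notes that getAuthority() already includes the host.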
Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
---

(Updated April 30, 2014, 2:19 p.m.)

Review request for oozie.

Changes
---
Fixed small bug with non-default fs detection.

Repository: oozie-git

Diffs (updated)
---
  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 59ad143
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927

Diff: https://reviews.apache.org/r/19929/diff/

Testing
---
On deployed Hadoop cluster. Two tests were added.

Thanks,
Benjamin Zhitomirsky
Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
---

Review request for oozie.

Repository: oozie-git

Description
---
When the name-node element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:

- Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the name-node element, but later, during job submission, it expects this path to be under the default Oozie name node.
- Processing of the job-xml element if the job xml is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.

Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows submitting the same job to different Hadoop clusters.

Diffs
---
  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 68d77a8
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 7841076
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927

Diff: https://reviews.apache.org/r/19929/diff/

Testing
---
On deployed Hadoop cluster. Two tests were added.

Thanks,
Benjamin Zhitomirsky