[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806691#comment-16806691 ]

Andras Salamon commented on OOZIE-3450:
---

I was able to execute the git test successfully. Thanks for the contribution, +1, committed to master.

> Investigate and clean git sharelib
> --
>
> Key: OOZIE-3450
> URL: https://issues.apache.org/jira/browse/OOZIE-3450
> Project: Oozie
> Issue Type: Improvement
> Affects Versions: trunk
> Reporter: Andras Salamon
> Assignee: Mate Juhasz
> Priority: Major
> Attachments: OOZIE-3450-v2.patch, OOZIE-3540-v1.patch
>
>
> I've checked the number of jars in the Oozie sharelibs and realized that git
> sharelib contains the highest number of jars (203), it's much more than the
> hive (85), pig (67). Not to mention that we have really small sharelibs like
> distcp (3).
> I don't really understand the reason for this, we need to check if we really
> need all the jars here. The huge number of jars make it slower and it's more
> likely that we get strange errors because of jar conflicts.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16805038#comment-16805038 ]

Hadoop QA commented on OOZIE-3450:
--

Testing JIRA OOZIE-3450

Cleaning local git workspace

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:green}+1{color} the patch does not introduce any star imports
.{color:green}+1{color} the patch does not introduce any line longer than 132
.{color:red}-1{color} the patch does not add/modify any testcase
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} Javadoc generation succeeded with the patch
.{color:green}+1{color} the patch does not seem to introduce new Javadoc warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac warnings
{color:red}-1{color} There are [1] new bugs found below threshold in total that must be fixed.
.{color:green}+1{color} There are no new bugs found in [sharelib/hive2].
.{color:green}+1{color} There are no new bugs found in [sharelib/spark].
.{color:red}-1{color} There are [1] new bugs found below threshold in [sharelib/oozie] that must be fixed.
.You can find the SpotBugs diff here (look for the red and orange ones): sharelib/oozie/findbugs-new.html
.The most important SpotBugs errors are:
.At ShellMain.java:[line 92]: This usage of java/lang/ProcessBuilder.<init>(Ljava/util/List;)V can be vulnerable to Command Injection
.At ShellMain.java:[line 90]: At ShellMain.java:[line 89]
.At ShellMain.java:[line 91]
.{color:green}+1{color} There are no new bugs found in [sharelib/pig].
.{color:green}+1{color} There are no new bugs found in [sharelib/streaming].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive].
.{color:green}+1{color} There are no new bugs found in [sharelib/distcp].
.{color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
.{color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
.{color:green}+1{color} There are no new bugs found in [sharelib/git].
.{color:green}+1{color} There are no new bugs found in [client].
.{color:green}+1{color} There are no new bugs found in [docs].
.{color:green}+1{color} There are no new bugs found in [tools].
.{color:green}+1{color} There are no new bugs found in [fluent-job/fluent-job-api].
.{color:green}+1{color} There are no new bugs found in [server].
.{color:green}+1{color} There are no new bugs found in [webapp].
.{color:green}+1{color} There are no new bugs found in [examples].
.{color:green}+1{color} There are no new bugs found in [core].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 3169
.{color:orange}Tests failed at first run:{color}
TestPurgeXCommand#testPurgeableBundleUnpurgeableCoordinatorUnpurgeableWorkflow
TestPurgeXCommand#testPurgeableBundleUnpurgeableCoordinatorUnpurgebleWorkflowPurgeableSubWorkflow
.For the complete list of flaky tests, see TEST-SUMMARY-FULL files.
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch

{color:red}*-1 Overall result, please check the reported -1(s)*{color}

The full output of the test-patch run is available at

. https://builds.apache.org/job/PreCommit-OOZIE-Build/1067/
[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804899#comment-16804899 ]

Mate Juhasz commented on OOZIE-3450:

Thanks [~asalamon74] for the testing. Could you please give the new patch a try as well?

Interestingly, each of the sharelib projects uses oozie-core as a provided dependency. The only difference is that most of them just refer to static constants in the ActionExecutor-s, while sharelib-git's GitMain tries to call a method in GitActionExecutor$ActionConfVerifier. That method is ActionConfVerifier#checkAndGetTrimmed, which only returns an action conf value and is not strongly tied to anything else, so I moved it to the GitMain class.
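To illustrate the refactoring described in the comment above, here is a minimal sketch of a self-contained helper living next to its caller, so that no class from oozie-core has to be loaded at runtime. The class name, the use of a plain `Map` instead of the Hadoop configuration type, the key name, and the fail-on-missing behaviour are all assumptions made for illustration; the actual signature of `checkAndGetTrimmed` in the patch may differ. As background, references to compile-time constants are inlined by javac, which may explain why the other sharelibs work without oozie-core on the runtime classpath if the constants they use are of that kind.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for GitMain; not the code from the patch.
public class GitMainSketch {

    // Returns the trimmed value stored under the given key.
    // Assumption: a missing value is treated as an error.
    public static String checkAndGetTrimmed(Map<String, String> actionConf, String key) {
        String value = actionConf.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing action configuration value for key: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> actionConf = new HashMap<>();
        // "example.git.uri" is an illustrative key, not a real Oozie property name.
        actionConf.put("example.git.uri", "  https://example.org/repo.git  ");
        System.out.println(checkAndGetTrimmed(actionConf, "example.git.uri"));
    }
}
```

Because the helper only touches its arguments, moving it into the class that calls it removes the runtime dependency without changing behaviour.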
[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804896#comment-16804896 ]

Hadoop QA commented on OOZIE-3450:
--

PreCommit-OOZIE-Build started
[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804101#comment-16804101 ]

Andras Salamon commented on OOZIE-3450:
---

I tried to run the git example, but it fails with the following error:
{noformat}
java.lang.NoClassDefFoundError: org/apache/oozie/action/hadoop/GitActionExecutor$ActionConfVerifier
    at org.apache.oozie.action.hadoop.GitMain.parseActionConfiguration(GitMain.java:181)
    at org.apache.oozie.action.hadoop.GitMain.run(GitMain.java:63)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:107)
    at org.apache.oozie.action.hadoop.GitMain.main(GitMain.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:413)
    at org.apache.oozie.action.hadoop.LauncherAM.access$400(LauncherAM.java:55)
    at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:226)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:220)
    at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:156)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:144)
Caused by: java.lang.ClassNotFoundException: org.apache.oozie.action.hadoop.GitActionExecutor$ActionConfVerifier
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 20 more
{noformat}
I think we need the oozie-core jar on the sharelib. Or maybe it would be possible to refactor the code and avoid referring to {{GitActionExecutor$ActionConfVerifier}}.
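A failure like the one in the stack trace above can be checked for ahead of time by asking the class loader whether a given class is visible. The following is a minimal sketch, not part of Oozie; the class name `ClassPresenceCheck` and the method `isLoadable` are hypothetical, and only the class name being probed comes from the trace.

```java
// Hypothetical diagnostic helper; not part of Oozie.
public class ClassPresenceCheck {

    // Returns true if the named class can be found by this class's class loader.
    // The class is looked up without running its static initializers.
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClassPresenceCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String nested = "org.apache.oozie.action.hadoop.GitActionExecutor$ActionConfVerifier";
        System.out.println("java.lang.String loadable: " + isLoadable("java.lang.String"));
        System.out.println(nested + " loadable: " + isLoadable(nested));
    }
}
```

Running such a check with the same classpath as the launcher would show whether the jar providing the class is missing, which is the situation the comment above describes.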
[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803155#comment-16803155 ]

Hadoop QA commented on OOZIE-3450:
--

Testing JIRA OOZIE-3450

Cleaning local git workspace

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:green}+1{color} the patch does not introduce any star imports
.{color:green}+1{color} the patch does not introduce any line longer than 132
.{color:red}-1{color} the patch does not add/modify any testcase
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} Javadoc generation succeeded with the patch
.{color:green}+1{color} the patch does not seem to introduce new Javadoc warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac warnings
{color:orange}0{color} There are [4] new bugs found in total that would be nice to have fixed.
.{color:green}+1{color} There are no new bugs found in [examples].
.{color:green}+1{color} There are no new bugs found in [core].
.{color:green}+1{color} There are no new bugs found in [sharelib/distcp].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive].
.{color:green}+1{color} There are no new bugs found in [sharelib/pig].
.{color:green}+1{color} There are no new bugs found in [sharelib/spark].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive2].
.{color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
.{color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
.{color:green}+1{color} There are no new bugs found in [sharelib/oozie].
.{color:green}+1{color} There are no new bugs found in [sharelib/streaming].
.{color:green}+1{color} There are no new bugs found in [sharelib/git].
.{color:green}+1{color} There are no new bugs found in [webapp].
.{color:green}+1{color} There are no new bugs found in [tools].
.{color:green}+1{color} There are no new bugs found in [docs].
.{color:orange}0{color} There are [4] new bugs found in [server] that would be nice to have fixed.
.You can find the SpotBugs diff here: server/findbugs-new.html
.{color:green}+1{color} There are no new bugs found in [fluent-job/fluent-job-api].
.{color:green}+1{color} There are no new bugs found in [client].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 3169
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch

{color:red}*-1 Overall result, please check the reported -1(s)*{color}

The full output of the test-patch run is available at

. https://builds.apache.org/job/PreCommit-OOZIE-Build/1062/
[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802981#comment-16802981 ]

Hadoop QA commented on OOZIE-3450:
--

PreCommit-OOZIE-Build started
[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802946#comment-16802946 ]

Andras Salamon commented on OOZIE-3450:
---

Thanks [~matijhs]. I assigned the Jira to you and changed the status to patch available so precommit will execute the tests. I'll also test it using the git example.
[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib
[ https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802711#comment-16802711 ]

Mate Juhasz commented on OOZIE-3450:

I have managed to reduce the number of jars in the git sharelib to 9 by choosing better dependency scopes, but unfortunately I could not find an environment in which to run the examples code. [~asalamon74] could you please help check what happens if we keep only the jars below? I am uploading a patch as well, but the code changes are really minor.
{noformat}
find share/lib/git -name "*.jar" | sort
share/lib/git/JavaEWAH-1.1.6.jar
share/lib/git/commons-lang3-3.3.2.jar
share/lib/git/httpclient-4.3.6.jar
share/lib/git/httpcore-4.3.3.jar
share/lib/git/jsch-0.1.54.jar
share/lib/git/jzlib-1.1.1.jar
share/lib/git/oozie-sharelib-git-5.2.0-SNAPSHOT.jar
share/lib/git/org.eclipse.jgit-5.0.1.201806211838-r.jar
share/lib/git/slf4j-api-1.6.6.jar
{noformat}
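The per-sharelib jar counts discussed in this issue (203 for git before the change, 9 after) can be reproduced with a small directory walk. The sketch below is an illustrative equivalent of running the `find ... -name "*.jar"` command for each sharelib; the class name `SharelibJarCount` is hypothetical, and the default root `share/lib` is an assumption taken from the paths in the listing above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for counting jars per sharelib directory; not part of Oozie.
public class SharelibJarCount {

    // Counts regular files ending in ".jar" anywhere below the given directory.
    public static long countJars(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".jar"))
                        .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: one subdirectory per sharelib under the root.
        Path root = Paths.get(args.length > 0 ? args[0] : "share/lib");
        if (!Files.isDirectory(root)) {
            System.err.println("Not a directory: " + root);
            return;
        }
        List<Path> sharelibs;
        try (Stream<Path> children = Files.list(root)) {
            sharelibs = children.filter(Files::isDirectory).sorted().collect(Collectors.toList());
        }
        for (Path sharelib : sharelibs) {
            System.out.println(sharelib.getFileName() + ": " + countJars(sharelib));
        }
    }
}
```

Comparing the output before and after a change to the dependency scopes gives a quick view of which sharelibs grew or shrank.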