[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-04-01 Thread Andras Salamon (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806691#comment-16806691
 ] 

Andras Salamon commented on OOZIE-3450:
---

I was able to execute the git test successfully.

Thanks for the contribution, +1, committed to master.

> Investigate and clean git sharelib
> --
>
> Key: OOZIE-3450
> URL: https://issues.apache.org/jira/browse/OOZIE-3450
> Project: Oozie
> Issue Type: Improvement
> Affects Versions: trunk
> Reporter: Andras Salamon
> Assignee: Mate Juhasz
> Priority: Major
> Attachments: OOZIE-3450-v2.patch, OOZIE-3540-v1.patch
>
>
> I've checked the number of jars in the Oozie sharelibs and realized that the git 
> sharelib contains the highest number of jars (203), far more than hive (85) 
> or pig (67). Not to mention that we have really small sharelibs like 
> distcp (3).
> I don't really understand the reason for this; we need to check whether we really 
> need all the jars here. The huge number of jars makes it slower, and it's more 
> likely that we get strange errors because of jar conflicts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-03-29 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16805038#comment-16805038
 ] 

Hadoop QA commented on OOZIE-3450:
--


Testing JIRA OOZIE-3450

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:green}+1{color} the patch does not introduce any star imports
.{color:green}+1{color} the patch does not introduce any line longer than 
132
.{color:red}-1{color} the patch does not add/modify any testcase
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} Javadoc generation succeeded with the patch
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:red}-1{color} There are [1] new bugs found below threshold in total that 
must be fixed.
.{color:green}+1{color} There are no new bugs found in [sharelib/hive2].
.{color:green}+1{color} There are no new bugs found in [sharelib/spark].
.{color:red}-1{color} There are [1] new bugs found below threshold in 
[sharelib/oozie] that must be fixed.
.You can find the SpotBugs diff here (look for the red and orange ones): 
sharelib/oozie/findbugs-new.html
.The most important SpotBugs errors are:
.At ShellMain.java:[line 92]: This usage of 
java/lang/ProcessBuilder.<init>(Ljava/util/List;)V can be vulnerable to 
Command Injection
.At ShellMain.java:[line 90]: At ShellMain.java:[line 89]
.At ShellMain.java:[line 91]
.{color:green}+1{color} There are no new bugs found in [sharelib/pig].
.{color:green}+1{color} There are no new bugs found in [sharelib/streaming].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive].
.{color:green}+1{color} There are no new bugs found in [sharelib/distcp].
.{color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
.{color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
.{color:green}+1{color} There are no new bugs found in [sharelib/git].
.{color:green}+1{color} There are no new bugs found in [client].
.{color:green}+1{color} There are no new bugs found in [docs].
.{color:green}+1{color} There are no new bugs found in [tools].
.{color:green}+1{color} There are no new bugs found in 
[fluent-job/fluent-job-api].
.{color:green}+1{color} There are no new bugs found in [server].
.{color:green}+1{color} There are no new bugs found in [webapp].
.{color:green}+1{color} There are no new bugs found in [examples].
.{color:green}+1{color} There are no new bugs found in [core].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 3169
.{color:orange}Tests failed at first run:{color}
TestPurgeXCommand#testPurgeableBundleUnpurgeableCoordinatorUnpurgeableWorkflow
TestPurgeXCommand#testPurgeableBundleUnpurgeableCoordinatorUnpurgebleWorkflowPurgeableSubWorkflow
.For the complete list of flaky tests, see TEST-SUMMARY-FULL files.
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

. https://builds.apache.org/job/PreCommit-OOZIE-Build/1067/




[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-03-29 Thread Mate Juhasz (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804899#comment-16804899
 ] 

Mate Juhasz commented on OOZIE-3450:


Thanks [~asalamon74] for the testing.

Could you please give the new patch a try as well? 
Interestingly, each of the sharelib projects uses oozie-core as a 
provided dependency. The only difference is that while most of them just 
refer to static constants in the ActionExecutors, sharelib-git's GitMain 
calls a method in GitActionExecutor$ActionConfVerifier. This method is 
ActionConfVerifier#checkAndGetTrimmed, which only returns an action conf 
value and is not strongly tied to anything else, so I moved it to the GitMain class.
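
To illustrate the kind of refactoring described in this comment, here is a minimal sketch. It is not the actual Oozie code: the class name GitMainSketch, the use of a plain Map in place of Hadoop's Configuration, the validation rules, and the example key are all assumptions made for illustration. The idea is only that a small helper which reads one value can live in the launcher-side class, so that no class from oozie-core is needed at runtime.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for GitMain; not the real Oozie class. */
public class GitMainSketch {

    /**
     * Returns the trimmed value stored under key.
     * Assumption: the real helper may validate differently; here a missing
     * or blank value is rejected with an IllegalArgumentException.
     */
    public static String checkAndGetTrimmed(Map<String, String> actionConf, String key) {
        if (key == null) {
            throw new IllegalArgumentException("key must not be null");
        }
        String value = actionConf.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("missing or empty value for " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // A plain map stands in for the action configuration (assumption).
        Map<String, String> conf = new HashMap<>();
        conf.put("example.git.uri", "  https://example.org/repo.git  ");
        System.out.println(checkAndGetTrimmed(conf, "example.git.uri"));
    }
}
```

Because the helper only needs the configuration object and a key, keeping it next to its single caller avoids a runtime dependency on a nested class that is only present in a provided-scope jar.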



[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-03-29 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804896#comment-16804896
 ] 

Hadoop QA commented on OOZIE-3450:
--

PreCommit-OOZIE-Build started




[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-03-28 Thread Andras Salamon (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804101#comment-16804101
 ] 

Andras Salamon commented on OOZIE-3450:
---

I tried to run the git example, but it fails with the following error:

{noformat}
java.lang.NoClassDefFoundError: org/apache/oozie/action/hadoop/GitActionExecutor$ActionConfVerifier
    at org.apache.oozie.action.hadoop.GitMain.parseActionConfiguration(GitMain.java:181)
    at org.apache.oozie.action.hadoop.GitMain.run(GitMain.java:63)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:107)
    at org.apache.oozie.action.hadoop.GitMain.main(GitMain.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:413)
    at org.apache.oozie.action.hadoop.LauncherAM.access$400(LauncherAM.java:55)
    at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:226)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:220)
    at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:156)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:144)
Caused by: java.lang.ClassNotFoundException: org.apache.oozie.action.hadoop.GitActionExecutor$ActionConfVerifier
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 20 more
{noformat}

I think we need the oozie-core jar on the sharelib. Or maybe it would be 
possible to refactor the code and avoid referring to 
{{GitActionExecutor$ActionConfVerifier}}.
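
The NoClassDefFoundError above means the nested class was available when GitMain was compiled but is not on the classpath the launcher runs with. One way to confirm this kind of problem is a small probe run with the same classpath as the failing action. The sketch below is illustrative only; the class name ClasspathProbe is hypothetical and is not part of Oozie.

```java
/** Hypothetical helper for checking whether a class is visible at runtime. */
public class ClasspathProbe {

    /**
     * Returns true if the named class can be located by this class's
     * class loader. The class is not initialized, only looked up.
     */
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Binary name of the nested class from the stack trace above.
        String nested = "org.apache.oozie.action.hadoop.GitActionExecutor$ActionConfVerifier";
        System.out.println(nested + " loadable: " + isLoadable(nested));
        System.out.println("java.lang.String loadable: " + isLoadable("java.lang.String"));
    }
}
```

If the probe reports the nested class as not loadable, the options are the two mentioned in this comment: ship the jar that contains it, or remove the reference from the launcher-side code.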



[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-03-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803155#comment-16803155
 ] 

Hadoop QA commented on OOZIE-3450:
--


Testing JIRA OOZIE-3450

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:green}+1{color} the patch does not introduce any star imports
.{color:green}+1{color} the patch does not introduce any line longer than 
132
.{color:red}-1{color} the patch does not add/modify any testcase
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} Javadoc generation succeeded with the patch
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:orange}0{color} There are [4] new bugs found in total that would be nice 
to have fixed.
.{color:green}+1{color} There are no new bugs found in [examples].
.{color:green}+1{color} There are no new bugs found in [core].
.{color:green}+1{color} There are no new bugs found in [sharelib/distcp].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive].
.{color:green}+1{color} There are no new bugs found in [sharelib/pig].
.{color:green}+1{color} There are no new bugs found in [sharelib/spark].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive2].
.{color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
.{color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
.{color:green}+1{color} There are no new bugs found in [sharelib/oozie].
.{color:green}+1{color} There are no new bugs found in [sharelib/streaming].
.{color:green}+1{color} There are no new bugs found in [sharelib/git].
.{color:green}+1{color} There are no new bugs found in [webapp].
.{color:green}+1{color} There are no new bugs found in [tools].
.{color:green}+1{color} There are no new bugs found in [docs].
.{color:orange}0{color} There are [4] new bugs found in [server] that would 
be nice to have fixed.
.You can find the SpotBugs diff here: server/findbugs-new.html
.{color:green}+1{color} There are no new bugs found in 
[fluent-job/fluent-job-api].
.{color:green}+1{color} There are no new bugs found in [client].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 3169
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

. https://builds.apache.org/job/PreCommit-OOZIE-Build/1062/





[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-03-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802981#comment-16802981
 ] 

Hadoop QA commented on OOZIE-3450:
--

PreCommit-OOZIE-Build started




[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-03-27 Thread Andras Salamon (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802946#comment-16802946
 ] 

Andras Salamon commented on OOZIE-3450:
---

Thanks [~matijhs]. I assigned the Jira to you and changed the status to patch 
available so precommit will execute the tests. I'll also test it using the git 
example.



[jira] [Commented] (OOZIE-3450) Investigate and clean git sharelib

2019-03-27 Thread Mate Juhasz (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802711#comment-16802711
 ] 

Mate Juhasz commented on OOZIE-3450:


I have managed to reduce the number of jars in the git sharelib to 9 with better 
chosen dependency scopes, but unfortunately I could not find an environment to 
run the examples. [~asalamon74] could you please help check what 
happens if we keep only the jars below? I have uploaded a patch as well, but it 
contains only minor code changes.

{noformat}
find share/lib/git -name "*.jar" | sort
share/lib/git/JavaEWAH-1.1.6.jar
share/lib/git/commons-lang3-3.3.2.jar
share/lib/git/httpclient-4.3.6.jar
share/lib/git/httpcore-4.3.3.jar
share/lib/git/jsch-0.1.54.jar
share/lib/git/jzlib-1.1.1.jar
share/lib/git/oozie-sharelib-git-5.2.0-SNAPSHOT.jar
share/lib/git/org.eclipse.jgit-5.0.1.201806211838-r.jar
share/lib/git/slf4j-api-1.6.6.jar
{noformat}
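
For comparing jar counts across all sharelibs rather than one directory at a time, a small program along the following lines could be used. This is a sketch under assumptions: the class name SharelibJarCount is hypothetical, and the default root share/lib is taken from the listing above; pass a different root as the first argument if your layout differs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical utility: counts jar files per sharelib directory. */
public class SharelibJarCount {

    /** Counts regular *.jar files under each immediate subdirectory of shareLibRoot. */
    public static Map<String, Long> countJars(Path shareLibRoot) throws IOException {
        List<Path> dirs;
        try (Stream<Path> children = Files.list(shareLibRoot)) {
            dirs = children.filter(Files::isDirectory).collect(Collectors.toList());
        }
        Map<String, Long> counts = new TreeMap<>(); // sorted by sharelib name
        for (Path dir : dirs) {
            try (Stream<Path> files = Files.walk(dir)) {
                long n = files
                        .filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".jar"))
                        .count();
                counts.put(dir.getFileName().toString(), n);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Default root is an assumption based on the listing above.
        Path root = Paths.get(args.length > 0 ? args[0] : "share/lib");
        if (!Files.isDirectory(root)) {
            System.out.println("Not a directory: " + root);
            return;
        }
        countJars(root).forEach((name, n) -> System.out.println(name + ": " + n));
    }
}
```

Running it before and after applying a patch gives a quick per-sharelib comparison of how many jars are shipped.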

