[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r423002221



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -255,6 +298,21 @@ under the License.



+   
+   org.apache.maven.plugins
+   maven-enforcer-plugin

Review comment:
   But I don't know if it makes sense to mess with Hadoop's transitive 
dependency tree. 
   In our tests, we want to ensure that Flink works with certain vanilla Hadoop 
versions.
   
   If we start hand-crafting Hadoop's dependencies towards convergence, we 
won't ensure that Flink works with those versions -- we ensure it works with 
our version.
   
   If the maven-enforcer-plugin would allow us to control the convergence check 
more fine-grained, I would not be opposed to it, as we need to ensure that some 
dependencyManagement, exclusion etc. from us is affecting vanilla Hadoop's 
dependency tree.
   
   A second problem is that the exclusions might differ between the Hadoop 
versions we use for CI. For Hadoop 2.4.1 we have convergence, for 2.8.3 we 
don't.
   
   Given these thoughts, I believe we should disable the convergence check for 
the tests, and rely on test failures for detecting severe issues with our 
Hadoop integration. We just need to accept that the Hadoop project is wild-west 
when it comes to dependencies.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422965804



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -255,6 +298,21 @@ under the License.



+   
+   org.apache.maven.plugins
+   maven-enforcer-plugin

Review comment:
   There are convergence issues on CI for Hadoop 2.8.3 (on 2.4.1 it 
converges)
   But since we are not providing the flink-end-to-end-tests officially as a 
module for users, I decided to ignore convergence:
   
   ```
   2020-05-11T10:54:23.9920427Z [INFO] --- 
maven-enforcer-plugin:3.0.0-M1:enforce (dependency-convergence) @ 
flink-end-to-end-tests ---
   2020-05-11T10:54:24.0051160Z [WARNING] 
   2020-05-11T10:54:24.0051590Z Dependency convergence error for 
com.google.guava:guava:11.0.2 paths to dependency are:
   2020-05-11T10:54:24.0052711Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0053292Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0053794Z +-org.apache.hadoop:hadoop-common:2.8.3
   2020-05-11T10:54:24.0054246Z   +-com.google.guava:guava:11.0.2
   2020-05-11T10:54:24.0054558Z and
   2020-05-11T10:54:24.0054967Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0055491Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0055971Z +-org.apache.hadoop:hadoop-common:2.8.3
   2020-05-11T10:54:24.0056429Z   +-org.apache.hadoop:hadoop-auth:2.8.3
   2020-05-11T10:54:24.0057162Z 
+-org.apache.curator:curator-framework:2.7.1
   2020-05-11T10:54:24.0057681Z   +-com.google.guava:guava:16.0.1
   2020-05-11T10:54:24.0057931Z and
   2020-05-11T10:54:24.0058355Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0058898Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0059397Z +-org.apache.hadoop:hadoop-common:2.8.3
   2020-05-11T10:54:24.0059886Z   +-org.apache.curator:curator-client:2.7.1
   2020-05-11T10:54:24.0060526Z +-com.google.guava:guava:16.0.1
   2020-05-11T10:54:24.0060915Z and
   2020-05-11T10:54:24.0061396Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0061913Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0062417Z +-org.apache.hadoop:hadoop-common:2.8.3
   2020-05-11T10:54:24.0062922Z   +-org.apache.curator:curator-recipes:2.7.1
   2020-05-11T10:54:24.0063396Z +-com.google.guava:guava:16.0.1
   2020-05-11T10:54:24.0063643Z and
   2020-05-11T10:54:24.0064067Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0064623Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0065096Z +-org.apache.hadoop:hadoop-hdfs:2.8.3
   2020-05-11T10:54:24.0065569Z   +-com.google.guava:guava:11.0.2
   2020-05-11T10:54:24.0065811Z and
   2020-05-11T10:54:24.0066236Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0066774Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0067274Z +-org.apache.hadoop:hadoop-yarn-common:2.8.3
   2020-05-11T10:54:24.0067789Z   +-org.apache.hadoop:hadoop-yarn-api:2.8.3
   2020-05-11T10:54:24.0068256Z +-com.google.guava:guava:11.0.2
   2020-05-11T10:54:24.0068501Z and
   2020-05-11T10:54:24.0068940Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0069632Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0070133Z +-org.apache.hadoop:hadoop-yarn-common:2.8.3
   2020-05-11T10:54:24.0070582Z   +-com.google.guava:guava:11.0.2
   2020-05-11T10:54:24.0071129Z and
   2020-05-11T10:54:24.0071673Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0072195Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0072692Z +-org.apache.hadoop:hadoop-yarn-client:2.8.3
   2020-05-11T10:54:24.0073146Z   +-com.google.guava:guava:11.0.2
   2020-05-11T10:54:24.0073347Z 
   2020-05-11T10:54:24.0073499Z [WARNING] 
   2020-05-11T10:54:24.0074079Z Dependency convergence error for 
commons-beanutils:commons-beanutils:1.8.0 paths to dependency are:
   2020-05-11T10:54:24.0074699Z 
+-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT
   2020-05-11T10:54:24.0075216Z   
+-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT
   2020-05-11T10:54:24.0075681Z +-org.apache.hadoop:hadoop-common:2.8.3
   2020-05-11T10:54:24.0076191Z   
+-commons-configuration:commons-configuration:1.7
   2020-05-11T10:54:24.0076707Z 
+-commons-digester:commons-digester:1.8.1
   2020-05-11T10:54:24.0077208Z   
+-commons-beanutils:commons-beanutils:1.8.0
   2020-05-11T10:54:24.0077482Z and
   2

[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422941894



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -255,6 +298,21 @@ under the License.



+   
+   org.apache.maven.plugins
+   maven-enforcer-plugin

Review comment:
   You are right. Removing it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422941894



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -255,6 +298,21 @@ under the License.



+   
+   org.apache.maven.plugins
+   maven-enforcer-plugin

Review comment:
   You are right. Removing it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422927738



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -153,6 +170,32 @@ under the License.
 


+   
+   maven-resources-plugin
+   

Review comment:
   I think I copy pasted the plugin definition from somewhere and assumed 
that the plugin version is centrally managed.
   
   The other uses of the plugin are also defined without a version
   ```
   --
   flink-quickstart/pom.xml-
   flink-quickstart/pom.xml-
org.apache.maven.plugins
   flink-quickstart/pom.xml:
maven-resources-plugin
   flink-quickstart/pom.xml-
   flink-quickstart/pom.xml-
false
   --
   flink-table/flink-sql-parser-hive/pom.xml-   
   flink-table/flink-sql-parser-hive/pom.xml-   
   flink-table/flink-sql-parser-hive/pom.xml:   
maven-resources-plugin
   flink-table/flink-sql-parser-hive/pom.xml-   

   flink-table/flink-sql-parser-hive/pom.xml-   

   --
   flink-table/flink-sql-parser/pom.xml-
   flink-table/flink-sql-parser/pom.xml-
   flink-table/flink-sql-parser/pom.xml:
maven-resources-plugin
   flink-table/flink-sql-parser/pom.xml-

   flink-table/flink-sql-parser/pom.xml-

   --
   flink-walkthroughs/pom.xml-  
   flink-walkthroughs/pom.xml-  
org.apache.maven.plugins
   flink-walkthroughs/pom.xml:  
maven-resources-plugin
   flink-walkthroughs/pom.xml-  
   flink-walkthroughs/pom.xml-  
false
   ```
   
   It doesn't seem that we are centrally managing the plugin versions. I would 
propose to just remove the `` line.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422928042



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -153,6 +170,32 @@ under the License.
 


+   
+   maven-resources-plugin
+   
+   
+   
+   copy-resources
+   

Review comment:
   Sorry that I didn't catch this in my self-review





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422919014



##
File path: flink-dist/pom.xml
##
@@ -460,50 +460,6 @@ under the License.


 
-   
-   
-   include-hadoop

Review comment:
   Addressed in 
https://github.com/rmetzger/flink/commit/eda372b26dc5ea82b03f0ada8d16bc85afc61123





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422918601



##
File path: flink-dist/pom.xml
##
@@ -460,50 +460,6 @@ under the License.


 
-   
-   
-   include-hadoop

Review comment:
   You are right. The e2e tests will always have access to at least Hadoop 
2.4.1, and we can remove the "hadoopfree" nightly profile





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-07 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r421433013



##
File path: flink-yarn/pom.xml
##
@@ -227,6 +227,27 @@ under the License.



+   

[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-05 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r420202465



##
File path: 
flink-end-to-end-tests/flink-end-to-end-tests-common/src/main/java/org/apache/flink/tests/util/flink/SQLJobSubmission.java
##
@@ -90,6 +91,11 @@ public SQLJobSubmissionBuilder addJar(Path jarFile) {
return this;
}
 
+   public SQLJobSubmissionBuilder addJars(List jarFiles) {
+   this.jars.addAll(jarFiles.stream().map(path -> 
path.toAbsolutePath().toString()).collect(Collectors.toList()));

Review comment:
   Ah, sweet! Thanks! You see I haven't worked much with Java 8 :) 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-05 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r420201968



##
File path: 
flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/src/test/java/org/apache/flink/tests/util/kafka/SQLClientKafkaITCase.java
##
@@ -106,11 +112,16 @@ public SQLClientKafkaITCase(String kafkaVersion, String 
kafkaSQLVersion, String
}
 
@Before
-   public void before() {
+   public void before() throws Exception {
+   downloadCache.before();
Path tmpPath = tmp.getRoot().toPath();
LOG.info("The current temporary path: {}", tmpPath);
this.sqlClientSessionConf = 
tmpPath.resolve("sql-client-session.conf");
this.result = tmpPath.resolve("result");
+
+   
apacheAvroJars.add(downloadCache.getOrDownload("https://repo1.maven.org/maven2/org/apache/avro/avro/1.8.2/avro-1.8.2.jar";,
 tmpPath));

Review comment:
   > So this only was only working by chance since missing stuff was 
provided by flink-shaded-hadoop? 
   
   Yes.
   
   > If so, why does that no longer work?
   
   Because flink-shaded-hadoop does not exist anymore in this change
   
   Re correct fix: I discussed this offline with @dawidwys and @twalthr and 
they agreed that providing the required avro dependencies is an acceptable 
approach.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-05 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r420198337



##
File path: flink-formats/flink-orc-nohive/pom.xml
##
@@ -98,6 +103,22 @@ under the License.
 

 
+   
+   
+   
+   hadoop3-tests
+   
+   
+   org.apache.hadoop
+   
hadoop-hdfs-client

Review comment:
   The artifact is required at runtime, but only if you are running Hadoop 
3. Flink does only compile against 2.4.1, users need to provide the right 
dependencies in their classpath.
   For running the tests on HD3, we provide the right classpath through this 
profile.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-04 Thread GitBox


rmetzger commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r419675846



##
File path: flink-dist/pom.xml
##
@@ -137,8 +137,8 @@ under the License.
${project.version}


-   org.apache.flink
-   
flink-shaded-hadoop-2
+   org.apache.hadoop
+   *

Review comment:
   Oh, I didn't know that this is a newer feature. However, we have already 
a case in master with a star exclude: 
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-elasticsearch-base/pom.xml#L68
   
   Can we raise the minimum maven version to 
[3.2.1](https://maven.apache.org/docs/3.2.1/release-notes.html) then?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org