[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r423002221 ## File path: flink-end-to-end-tests/pom.xml ## @@ -255,6 +298,21 @@ under the License. + + org.apache.maven.plugins + maven-enforcer-plugin Review comment: But I don't know if it makes sense to mess with Hadoop's transitive dependency tree. In our tests, we want to ensure that Flink works with certain vanilla Hadoop versions. If we start hand-crafting Hadoop's dependencies towards convergence, we won't ensure that Flink works with those versions -- we ensure it works with our version. If the maven-enforcer-plugin would allow us to control the convergence check more fine-grained, I would not be opposed to it, as we need to ensure that some dependencyManagement, exclusion etc. from us is affecting vanilla Hadoop's dependency tree. A second problem is that the exclusions might differ between the Hadoop versions we use for CI. For Hadoop 2.4.1 we have convergence, for 2.8.3 we don't. Given these thoughts, I believe we should disable the convergence check for the tests, and rely on test failures for detecting severe issues with our Hadoop integration. We just need to accept that the Hadoop project is wild-west when it comes to dependencies. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r422965804 ## File path: flink-end-to-end-tests/pom.xml ## @@ -255,6 +298,21 @@ under the License. + + org.apache.maven.plugins + maven-enforcer-plugin Review comment: There are convergence issues on CI for Hadoop 2.8.3 (on 2.4.1 it converges) But since we are not providing the flink-end-to-end-tests officially as a module for users, I decided to ignore convergence: ``` 2020-05-11T10:54:23.9920427Z [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce (dependency-convergence) @ flink-end-to-end-tests --- 2020-05-11T10:54:24.0051160Z [WARNING] 2020-05-11T10:54:24.0051590Z Dependency convergence error for com.google.guava:guava:11.0.2 paths to dependency are: 2020-05-11T10:54:24.0052711Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0053292Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0053794Z +-org.apache.hadoop:hadoop-common:2.8.3 2020-05-11T10:54:24.0054246Z +-com.google.guava:guava:11.0.2 2020-05-11T10:54:24.0054558Z and 2020-05-11T10:54:24.0054967Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0055491Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0055971Z +-org.apache.hadoop:hadoop-common:2.8.3 2020-05-11T10:54:24.0056429Z +-org.apache.hadoop:hadoop-auth:2.8.3 2020-05-11T10:54:24.0057162Z +-org.apache.curator:curator-framework:2.7.1 2020-05-11T10:54:24.0057681Z +-com.google.guava:guava:16.0.1 2020-05-11T10:54:24.0057931Z and 2020-05-11T10:54:24.0058355Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0058898Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0059397Z +-org.apache.hadoop:hadoop-common:2.8.3 2020-05-11T10:54:24.0059886Z +-org.apache.curator:curator-client:2.7.1 2020-05-11T10:54:24.0060526Z +-com.google.guava:guava:16.0.1 2020-05-11T10:54:24.0060915Z and 2020-05-11T10:54:24.0061396Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0061913Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0062417Z +-org.apache.hadoop:hadoop-common:2.8.3 2020-05-11T10:54:24.0062922Z +-org.apache.curator:curator-recipes:2.7.1 2020-05-11T10:54:24.0063396Z +-com.google.guava:guava:16.0.1 2020-05-11T10:54:24.0063643Z and 2020-05-11T10:54:24.0064067Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0064623Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0065096Z +-org.apache.hadoop:hadoop-hdfs:2.8.3 2020-05-11T10:54:24.0065569Z +-com.google.guava:guava:11.0.2 2020-05-11T10:54:24.0065811Z and 2020-05-11T10:54:24.0066236Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0066774Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0067274Z +-org.apache.hadoop:hadoop-yarn-common:2.8.3 2020-05-11T10:54:24.0067789Z +-org.apache.hadoop:hadoop-yarn-api:2.8.3 2020-05-11T10:54:24.0068256Z +-com.google.guava:guava:11.0.2 2020-05-11T10:54:24.0068501Z and 2020-05-11T10:54:24.0068940Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0069632Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0070133Z +-org.apache.hadoop:hadoop-yarn-common:2.8.3 2020-05-11T10:54:24.0070582Z +-com.google.guava:guava:11.0.2 2020-05-11T10:54:24.0071129Z and 2020-05-11T10:54:24.0071673Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0072195Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0072692Z +-org.apache.hadoop:hadoop-yarn-client:2.8.3 2020-05-11T10:54:24.0073146Z +-com.google.guava:guava:11.0.2 2020-05-11T10:54:24.0073347Z 2020-05-11T10:54:24.0073499Z [WARNING] 2020-05-11T10:54:24.0074079Z Dependency convergence error for commons-beanutils:commons-beanutils:1.8.0 paths to dependency are: 2020-05-11T10:54:24.0074699Z +-org.apache.flink:flink-end-to-end-tests:1.11-SNAPSHOT 2020-05-11T10:54:24.0075216Z +-org.apache.flink:flink-yarn_2.11:1.11-SNAPSHOT 2020-05-11T10:54:24.0075681Z +-org.apache.hadoop:hadoop-common:2.8.3 2020-05-11T10:54:24.0076191Z +-commons-configuration:commons-configuration:1.7 2020-05-11T10:54:24.0076707Z +-commons-digester:commons-digester:1.8.1 2020-05-11T10:54:24.0077208Z +-commons-beanutils:commons-beanutils:1.8.0 2020-05-11T10:54:24.0077482Z and 2
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r422941894 ## File path: flink-end-to-end-tests/pom.xml ## @@ -255,6 +298,21 @@ under the License. + + org.apache.maven.plugins + maven-enforcer-plugin Review comment: You are right. Removing it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r422941894 ## File path: flink-end-to-end-tests/pom.xml ## @@ -255,6 +298,21 @@ under the License. + + org.apache.maven.plugins + maven-enforcer-plugin Review comment: You are right. Removing it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r422927738 ## File path: flink-end-to-end-tests/pom.xml ## @@ -153,6 +170,32 @@ under the License. + + maven-resources-plugin + Review comment: I think I copy pasted the plugin definition from somewhere and assumed that the plugin version is centrally managed. The other uses of the plugin are also defined without a version ``` -- flink-quickstart/pom.xml- flink-quickstart/pom.xml- org.apache.maven.plugins flink-quickstart/pom.xml: maven-resources-plugin flink-quickstart/pom.xml- flink-quickstart/pom.xml- false -- flink-table/flink-sql-parser-hive/pom.xml- flink-table/flink-sql-parser-hive/pom.xml- flink-table/flink-sql-parser-hive/pom.xml: maven-resources-plugin flink-table/flink-sql-parser-hive/pom.xml- flink-table/flink-sql-parser-hive/pom.xml- -- flink-table/flink-sql-parser/pom.xml- flink-table/flink-sql-parser/pom.xml- flink-table/flink-sql-parser/pom.xml: maven-resources-plugin flink-table/flink-sql-parser/pom.xml- flink-table/flink-sql-parser/pom.xml- -- flink-walkthroughs/pom.xml- flink-walkthroughs/pom.xml- org.apache.maven.plugins flink-walkthroughs/pom.xml: maven-resources-plugin flink-walkthroughs/pom.xml- flink-walkthroughs/pom.xml- false ``` It doesn't seem that we are centrally managing the plugin versions. I would propose to just remove the `` line. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r422928042 ## File path: flink-end-to-end-tests/pom.xml ## @@ -153,6 +170,32 @@ under the License. + + maven-resources-plugin + + + + copy-resources + Review comment: Sorry that I didn't catch this in my self-review This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r422919014 ## File path: flink-dist/pom.xml ## @@ -460,50 +460,6 @@ under the License. - - - include-hadoop Review comment: Addressed in https://github.com/rmetzger/flink/commit/eda372b26dc5ea82b03f0ada8d16bc85afc61123 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r422918601 ## File path: flink-dist/pom.xml ## @@ -460,50 +460,6 @@ under the License. - - - include-hadoop Review comment: You are right. The e2e tests will always have access to at least Hadoop 2.4.1, and we can remove the "hadoopfree" nightly profile This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r421433013 ## File path: flink-yarn/pom.xml ## @@ -227,6 +227,27 @@ under the License. +
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r420202465 ## File path: flink-end-to-end-tests/flink-end-to-end-tests-common/src/main/java/org/apache/flink/tests/util/flink/SQLJobSubmission.java ## @@ -90,6 +91,11 @@ public SQLJobSubmissionBuilder addJar(Path jarFile) { return this; } + public SQLJobSubmissionBuilder addJars(List jarFiles) { + this.jars.addAll(jarFiles.stream().map(path -> path.toAbsolutePath().toString()).collect(Collectors.toList())); Review comment: Ah, sweet! Thanks! You see I haven't worked much with Java 8 :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r420201968 ## File path: flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/src/test/java/org/apache/flink/tests/util/kafka/SQLClientKafkaITCase.java ## @@ -106,11 +112,16 @@ public SQLClientKafkaITCase(String kafkaVersion, String kafkaSQLVersion, String } @Before - public void before() { + public void before() throws Exception { + downloadCache.before(); Path tmpPath = tmp.getRoot().toPath(); LOG.info("The current temporary path: {}", tmpPath); this.sqlClientSessionConf = tmpPath.resolve("sql-client-session.conf"); this.result = tmpPath.resolve("result"); + + apacheAvroJars.add(downloadCache.getOrDownload("https://repo1.maven.org/maven2/org/apache/avro/avro/1.8.2/avro-1.8.2.jar";, tmpPath)); Review comment: > So this only was only working by chance since missing stuff was provided by flink-shaded-hadoop? Yes. > If so, why does that no longer work? Because flink-shaded-hadoop does not exist anymore in this change Re correct fix: I discussed this offline with @dawidwys and @twalthr and they agreed that providing the required avro dependencies is an acceptable approach. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r420198337 ## File path: flink-formats/flink-orc-nohive/pom.xml ## @@ -98,6 +103,22 @@ under the License. + + + + hadoop3-tests + + + org.apache.hadoop + hadoop-hdfs-client Review comment: The artifact is required at runtime, but only if you are running Hadoop 3. Flink does only compile against 2.4.1, users need to provide the right dependencies in their classpath. For running the tests on HD3, we provide the right classpath through this profile. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile
rmetzger commented on a change in pull request #11983: URL: https://github.com/apache/flink/pull/11983#discussion_r419675846 ## File path: flink-dist/pom.xml ## @@ -137,8 +137,8 @@ under the License. ${project.version} - org.apache.flink - flink-shaded-hadoop-2 + org.apache.hadoop + * Review comment: Oh, I didn't know that this is a newer feature. However, we have already a case in master with a star exclude: https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-elasticsearch-base/pom.xml#L68 Can we raise the minimum maven version to [3.2.1](https://maven.apache.org/docs/3.2.1/release-notes.html) then? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org