[GitHub] [flink] zentol commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


zentol commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422996560



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -255,6 +298,21 @@ under the License.



+   <plugin>
+      <groupId>org.apache.maven.plugins</groupId>
+      <artifactId>maven-enforcer-plugin</artifactId>

Review comment:
   Convergence isn't just a problem for users though; we also want convergence so the tests are stable, so I don't want to throw it out in general just because Hadoop is causing yet another issue.
   I'd also like to understand why the exclusions aren't working; we used them in other areas to solve convergence issues IIRC.
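
   For context, the "exclusions" referred to here are standard Maven `<exclusions>` blocks; a minimal sketch of the pattern (coordinates are placeholders, not taken from this PR):

```xml
<!-- Hypothetical sketch: drop a transitive Hadoop artifact that pulls in
     conflicting versions, so the enforcer's convergence check can pass. -->
<dependency>
  <groupId>org.example</groupId>
  <artifactId>module-with-hadoop-deps</artifactId>
  <version>1.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```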





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


zentol commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422992723



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -153,6 +170,32 @@ under the License.
 


+   <plugin>
+      <artifactId>maven-resources-plugin</artifactId>
+      <executions>

Review comment:
   Yes, we should remove it. It's a bit odd that we don't have a managed version, but that's a separate issue.
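
   For context, a "managed version" here would mean a `pluginManagement` entry in a parent POM; a minimal sketch of what that looks like (the version shown is illustrative, not Flink's actual setting):

```xml
<!-- Sketch of a pluginManagement entry; with this in the parent POM, child
     modules can declare maven-resources-plugin without a <version>. -->
<build>
  <pluginManagement>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-resources-plugin</artifactId>
        <version>3.1.0</version> <!-- illustrative version -->
      </plugin>
    </plugins>
  </pluginManagement>
</build>
```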









[GitHub] [flink] zentol commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-11 Thread GitBox


zentol commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r422899707



##
File path: flink-end-to-end-tests/pom.xml
##
@@ -153,6 +170,32 @@ under the License.
 


+   <plugin>
+      <artifactId>maven-resources-plugin</artifactId>
+      <executions>

Review comment:
   ?

##
File path: flink-dist/pom.xml
##
@@ -460,50 +460,6 @@ under the License.


 
-
-   <profile>
-      <id>include-hadoop</id>

Review comment:
   still referenced in the azure files and `run-nightly-tests.sh`

##
File path: flink-end-to-end-tests/pom.xml
##
@@ -153,6 +170,32 @@ under the License.
 


+   <plugin>
+      <artifactId>maven-resources-plugin</artifactId>
+      <executions>
+         <execution>
+            <goals>
+               <goal>copy-resources</goal>
+            </goals>

Review comment:
   ```suggestion
   ```

##
File path: flink-end-to-end-tests/pom.xml
##
@@ -255,6 +298,21 @@ under the License.



+   <plugin>
+      <groupId>org.apache.maven.plugins</groupId>
+      <artifactId>maven-enforcer-plugin</artifactId>

Review comment:
   do we still need this despite excluding all dependencies?
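
   For readers not familiar with the plugin in question: a typical maven-enforcer-plugin convergence check looks roughly like the sketch below; the rule shown (`dependencyConvergence`) is the standard one, but the exact configuration in this PR is not reproduced here.

```xml
<!-- Sketch of a typical enforcer setup; execution id and rules are illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>dependency-convergence</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <dependencyConvergence/>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```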

##
File path: 
flink-yarn-tests/src/test/java/org/apache/flink/yarn/YARNSessionCapacitySchedulerITCase.java
##
@@ -155,9 +155,7 @@ public static void tearDown() throws Exception {
public void testStartYarnSessionClusterInQaTeamQueue() throws Exception {
runTest(() -> runWithArgs(new String[]{
"-j", flinkUberjar.getAbsolutePath(),
-   "-t", flinkLibFolder.getAbsolutePath(),
-   "-t", flinkShadedHadoopDir.getAbsolutePath(),
-   "-jm", "768m",
+   "-t", flinkLibFolder.getAbsolutePath(), "-jm", "768m",

Review comment:
   ```suggestion
"-t", flinkLibFolder.getAbsolutePath(),
"-jm", "768m",
   ```

##
File path: docs/ops/deployment/hadoop.md
##
@@ -120,4 +88,44 @@ This way it should work both in local and cluster run where the provided depende
 To run or debug an application in IntelliJ Idea the provided dependencies can be included to the class path
 in the "Run|Edit Configurations" window.
 
+
+2) Putting the required jar files into /lib directory of the Flink distribution
+Option 1) requires very little work, integrates nicely with existing Hadoop setups and should be the
+preferred approach.
+However, Hadoop has a large dependency footprint that increases the risk for dependency conflicts to occur.
+If this happens, please refer to option 2).
+
+The following subsections explains these approaches in detail.
+
+## Using `flink-shaded-hadoop-2-uber` jar for resolving dependency conflicts (legacy)
+
+
+  Warning: Starting from Flink 1.11, using `flink-shaded-hadoop-2-uber` releases is not officially supported
+  by the Flink project anymore. Users are advised to provide Hadoop dependencies through `HADOOP_CLASSPATH` (see above).
+
+
+
+
+The Flink project used to release Hadoop distributions for specific versions, that relocate or exclude several dependencies

Review comment:
   add a specific release as a date, instead of "used to"

##
File path: docs/ops/deployment/hadoop.md
##
@@ -120,4 +88,44 @@ This way it should work both in local and cluster run where the provided depende
 To run or debug an application in IntelliJ Idea the provided dependencies can be included to the class path
 in the "Run|Edit Configurations" window.
 
+
+2) Putting the required jar files into /lib directory of the Flink distribution

Review comment:
   This paragraph isn't properly integrated with the current documentation. I guess it should be removed since the section below subsumes it?

##
File path: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/util/HBaseTestingClusterAutoStarter.java
##
@@ -142,6 +144,11 @@ private static Configuration initialize(Configuration conf) {
 
@BeforeClass
public static void setUp() throws Exception {
+   // HBase 1.4 does not work with Hadoop 3
+   // because it uses Guava 12.0.1, Hadoop 3 uses Guava 27.0-jre.
+   // There is not Guava version in between that works with both.

Review comment:
   ```suggestion
// There is no Guava version in between that works with both.
   ```

##
File path: flink-end-to-

[GitHub] [flink] zentol commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-06 Thread GitBox


zentol commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r420605259



##
File path: flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/src/test/java/org/apache/flink/tests/util/kafka/SQLClientKafkaITCase.java
##
@@ -106,11 +112,16 @@ public SQLClientKafkaITCase(String kafkaVersion, String kafkaSQLVersion, String
}
 
@Before
-   public void before() {
+   public void before() throws Exception {
+   downloadCache.before();
Path tmpPath = tmp.getRoot().toPath();
LOG.info("The current temporary path: {}", tmpPath);
this.sqlClientSessionConf = tmpPath.resolve("sql-client-session.conf");
this.result = tmpPath.resolve("result");
+
+   apacheAvroJars.add(downloadCache.getOrDownload("https://repo1.maven.org/maven2/org/apache/avro/avro/1.8.2/avro-1.8.2.jar", tmpPath));

Review comment:
   > Because flink-shaded-hadoop does not exist anymore in this change
   
   Well duh, but Hadoop is still on the classpath, no? Otherwise you're not fulfilling the contract of the `Hadoop` category; if the test is still passing then the category should be removed.









[GitHub] [flink] zentol commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-05 Thread GitBox


zentol commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r420172348



##
File path: flink-connectors/flink-connector-hive/pom.xml
##
@@ -126,12 +211,18 @@ under the License.
thus override the default hadoop version from 2.4.1 to 2.7.5
-->
<dependency>
-   <groupId>org.apache.flink</groupId>
-   <artifactId>flink-shaded-hadoop-2-uber</artifactId>
-   <version>${hivemetastore.hadoop.version}-${flink.shaded.version}</version>
+   <groupId>org.apache.hadoop</groupId>
+   <artifactId>hadoop-common</artifactId>
+   <version>${hivemetastore.hadoop.version}</version>

Review comment:
   redundant due to dependency management
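
   For context, "dependency management" refers to a `<dependencyManagement>` section in a parent POM that pins versions once, so child modules can omit `<version>`. A minimal sketch (coordinates and property name are illustrative, not copied from Flink's build):

```xml
<!-- Parent POM: pins the Hadoop version centrally. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- Child module: no <version> needed, which is why repeating it is redundant. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <scope>provided</scope>
</dependency>
```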

##
File path: flink-connectors/flink-connector-hive/pom.xml
##
@@ -126,12 +211,18 @@ under the License.
thus override the default hadoop version from 2.4.1 to 2.7.5
-->
<dependency>
-   <groupId>org.apache.flink</groupId>
-   <artifactId>flink-shaded-hadoop-2-uber</artifactId>
-   <version>${hivemetastore.hadoop.version}-${flink.shaded.version}</version>
+   <groupId>org.apache.hadoop</groupId>
+   <artifactId>hadoop-common</artifactId>
+   <version>${hivemetastore.hadoop.version}</version>
<scope>provided</scope>
</dependency>

+   <dependency>
+   <groupId>org.apache.hadoop</groupId>
+   <artifactId>hadoop-mapreduce-client-core</artifactId>
+   <version>${hivemetastore.hadoop.version}</version>

Review comment:
   redundant due to dependency management

##
File path: flink-formats/flink-orc-nohive/pom.xml
##
@@ -98,6 +103,22 @@ under the License.
 

 
+
+   <profiles>
+      <profile>
+         <id>hadoop3-tests</id>
+         <dependencies>
+            <dependency>
+               <groupId>org.apache.hadoop</groupId>
+               <artifactId>hadoop-hdfs-client</artifactId>

Review comment:
   how come this isn't required at runtime?

##
File path: flink-filesystems/flink-s3-fs-hadoop/pom.xml
##
@@ -32,6 +32,17 @@ under the License.
 
<packaging>jar</packaging>
 
+   

[GitHub] [flink] zentol commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-04 Thread GitBox


zentol commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r419698801



##
File path: flink-dist/pom.xml
##
@@ -137,8 +137,8 @@ under the License.
<version>${project.version}</version>
<exclusions>
<exclusion>
-   <groupId>org.apache.flink</groupId>
-   <artifactId>flink-shaded-hadoop-2</artifactId>
+   <groupId>org.apache.hadoop</groupId>
+   <artifactId>*</artifactId>

Review comment:
   Hmm, should've read MNG-3832 more carefully; it _does_ work with all Maven 3 versions, just prints a warning on everything below 3.2.1.
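
   For context, MNG-3832 is the Maven issue covering wildcard exclusions like the `*` artifactId in this hunk; the pattern looks like the sketch below (the enclosing dependency coordinates are placeholders):

```xml
<!-- Wildcard exclusion: removes every artifact in the org.apache.hadoop group.
     Works on all Maven 3.x; versions below 3.2.1 only print a warning. -->
<dependency>
  <groupId>org.example</groupId>
  <artifactId>module-with-hadoop-deps</artifactId>
  <version>1.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```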









[GitHub] [flink] zentol commented on a change in pull request #11983: [FLINK-11086] Replace flink-shaded-hadoop dependencies; add Hadoop 3 test profile

2020-05-04 Thread GitBox


zentol commented on a change in pull request #11983:
URL: https://github.com/apache/flink/pull/11983#discussion_r419630074



##
File path: flink-dist/pom.xml
##
@@ -137,8 +137,8 @@ under the License.
<version>${project.version}</version>
<exclusions>
<exclusion>
-   <groupId>org.apache.flink</groupId>
-   <artifactId>flink-shaded-hadoop-2</artifactId>
+   <groupId>org.apache.hadoop</groupId>
+   <artifactId>*</artifactId>

Review comment:
   This breaks compatibility with Maven 3.1.




