+1, with the instruction "warn everyone about the guava update possibly
breaking things at run time"

The key issues:
* code compiled with the new guava release will not link against the older
releases, even without any changes in the source files.
* this includes hadoop-common itself.

Applications which exclude the guava dependency published by the hadoop
artifacts, so as to use their own, must set guava.version=27.0-jre to be
consistent with that of this release.
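For example, in a build which takes the guava version as a property (as the
Spark builds below do), the override is just

mvn -Dguava.version=27.0-jre clean test

or the equivalent guava.version property in the project pom.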


My tests all used the artifacts downstream via maven; I trust
others to look at the big tarball release.


*Project 1: cloudstore*


This is my extra diagnostics and cloud utils module.
https://github.com/steveloughran/cloudstore


All compiled fine, but the tests failed on guava linkage:

testNoOverwriteDest(org.apache.hadoop.tools.cloudup.TestLocalCloudup)  Time
elapsed: 0.012 sec  <<< ERROR! java.lang.NoSuchMethodError: 'void
com.google.common.base.Preconditions.checkArgument(boolean,
java.lang.String, java.lang.Object, java.lang.Object)'
at org.apache.hadoop.fs.tools.cloudup.Cloudup.run(Cloudup.java:177)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.tools.store.StoreTestUtils.exec(StoreTestUtils.java:4


Note: that app is designed to run against hadoop branch-2 and other
branches, so I ended up reimplementing the checkArgument and checkState
calls so that I can have a binary which links everywhere. My code, nothing
serious.
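The shape of the workaround is just a local stand-in for the guava methods; a
minimal sketch of the idea (not the actual cloudstore code, names are
illustrative):

// Local preconditions: call sites bind to this class rather than to
// com.google.common.base.Preconditions, so the compiled .class files
// carry no guava method references and link against every guava release.
public final class Checks {

  private Checks() {
  }

  public static void checkArgument(boolean ok, String format, Object... args) {
    if (!ok) {
      // note: String.format here, not guava's more lenient %s-only templating
      throw new IllegalArgumentException(String.format(format, args));
    }
  }

  public static void checkState(boolean ok, String format, Object... args) {
    if (!ok) {
      throw new IllegalStateException(String.format(format, args));
    }
  }
}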

*Project 2: Spark*


Apache Spark main branch, built with maven (I haven't tried the SBT build).


mvn -T 1  -Phadoop-3.2 -Dhadoop.version=3.1.4 -Psnapshots-and-staging
-Phadoop-cloud,yarn,kinesis-asl -DskipTests clean package

All good. Then I ran the committer unit test suite:

mvn -T 1 -Phadoop-3.2 -Dhadoop.version=3.1.4
-Phadoop-cloud,yarn,kinesis-asl -Psnapshots-and-staging -pl hadoop-cloud
test

CommitterBindingSuite:
*** RUN ABORTED ***
  java.lang.NoSuchMethodError: 'void
com.google.common.base.Preconditions.checkArgument(boolean,
java.lang.String, java.lang.Object)'
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
  at
org.apache.spark.internal.io.cloud.CommitterBindingSuite.newJob(CommitterBindingSuite.scala:89)
  at
org.apache.spark.internal.io.cloud.CommitterBindingSuite.$anonfun$new$1(CommitterBindingSuite.scala:55)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  at org.scalatest.Transformer.apply(Transformer.scala:20)
  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
  ...

Fix: again, tell the build this is a later version of Guava:


mvn -T 1 -Phadoop-3.2 -Dhadoop.version=3.1.4
-Phadoop-cloud,yarn,kinesis-asl -Psnapshots-and-staging -pl hadoop-cloud
-Dguava.version=27.0-jre test


The mismatch doesn't break Spark internally: Spark shades its own guava
anyway. The guava.version set here is actually the one which hadoop is to be
linked with.
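If in doubt about which guava a module actually ends up with, something like

mvn dependency:tree -pl hadoop-cloud -Dincludes=com.google.guava

will show what the override has resolved to.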

outcome: tests work

[INFO] --- scalatest-maven-plugin:2.0.0:test (test) @
spark-hadoop-cloud_2.12 ---
Discovery starting.
Discovery completed in 438 milliseconds.
Run starting. Expected test count is: 4
CommitterBindingSuite:
- BindingParquetOutputCommitter binds to the inner committer
- committer protocol can be serialized and deserialized
- local filesystem instantiation
- reject dynamic partitioning
Run completed in 1 second, 411 milliseconds.
Total number of tests run: 4
Suites: completed 2, aborted 0
Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0

This is a real PITA, and it's invariably those checkArgument calls, because
the later guava versions added some overloaded methods. Compile existing
source with a later guava version and the .class files no longer bind to the
older guava releases, even though no new guava APIs have been adopted.
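To make it concrete, a sketch of what javac does with an unchanged source
line, depending on which guava is on the compile classpath:

// identical source in both builds
Preconditions.checkArgument(ok, "invalid values: %s, %s", a, b);

// compiled against an old guava (e.g. 11.0.2), only the varargs template
// overload exists, so the .class file references
//   checkArgument(boolean, String, Object[])
// compiled against guava 27, javac picks the more specific
//   checkArgument(boolean, String, Object, Object)
// which simply does not exist in the old releases: hence the
// NoSuchMethodError at run time.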

I am really tempted to go through src/**/*.java and replace all the Guava
checkArgument/checkState calls with our own implementation in hadoop-common,
at least for any which use the varargs variant. But: it'd be a big change and
there may be related issues elsewhere. At least now things fail fast.

*Project 3: spark cloud integration*

https://github.com/hortonworks-spark/cloud-integration

This is where the functional tests for the s3a committer through spark live.

The build options were -Dhadoop.version=3.1.4 -Dspark.version=3.1.0-SNAPSHOT
-Psnapshots-and-staging

and a full test run

mvn test -Dcloud.test.configuration.file=../test-configs/s3a.xml -pl
cloud-examples -Dhadoop.version=3.1.4 -Dspark.version=3.1.0-SNAPSHOT
-Psnapshots-and-staging

All good, bar a couple of test failures; those were because one of my test
datasets is not on any bucket I have...will have to fix that.


To conclude: the artefacts are all there, and existing code compiles against
the new version without obvious problems. Where people will see stack
traces is from the guava update. It is frustrating, but there is nothing we
can do about it. All we can do is remind ourselves: "don't add
overloaded methods where you have already shipped an implementation with a
varargs one".

For the release notes: we need to explain what is happening and why.



On Fri, 26 Jun 2020 at 14:51, Gabor Bota <gabor.b...@cloudera.com> wrote:

> Hi folks,
>
> I have put together a release candidate (RC2) for Hadoop 3.1.4.
>
> The RC is available at: http://people.apache.org/~gabota/hadoop-3.1.4-RC2/
> The RC tag in git is here:
> https://github.com/apache/hadoop/releases/tag/release-3.1.4-RC2
> The maven artifacts are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1269/
>
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> and http://keys.gnupg.net/pks/lookup?op=get&search=0xB86249D83539B38C
>
> Please try the release and vote. The vote will run for 5 weekdays,
> until July 6. 2020. 23:00 CET.
>
> The release includes the revert of HDFS-14941, as it caused
> HDFS-15421. IBR leak causes standby NN to be stuck in safe mode.
> (https://issues.apache.org/jira/browse/HDFS-15421)
> The release includes HDFS-15323, as requested.
> (https://issues.apache.org/jira/browse/HDFS-15323)
>
> Thanks,
> Gabor
>
