Thanks, Shilun, for your great work!

I am fine with releasing 3.4.0 first, built against hadoop-thirdparty-1.2.0,
and then fixing the issues raised in this thread in the next release. I
don't think we can solve all historical issues in one release. If possible,
we could mark this release (release-3.4.0) as an unstable version. Any
thoughts?

Thanks again.
Best Regards,
- He Xiaoqiao

On Fri, Mar 1, 2024 at 12:19 PM slfan1989 <slfan1...@apache.org> wrote:

> I expect to initiate a vote for hadoop-3.4.0-RC3 in preparation for the
> hadoop-3.4.0 release. We have been working on this for 2 months and have
> already released hadoop-thirdparty-1.2.0.
>
> Regarding the issue described in HADOOP-19090, I believe we can address it
> in the hadoop-3.4.1 release because not all improvements can be expected to
> be completed in hadoop-3.4.0.
>
> I commented on HADOOP-19090:
>
> I am not opposed to releasing hadoop-thirdparty-1.2.1, but I don't think
> now is a good time to do so. If we were to release hadoop-thirdparty-1.2.1,
> our process is too lengthy:
>
> 1. We need to announce this in a public mailing list.
> 2. Then initiate a vote, and after the vote passes, release
>    hadoop-thirdparty-1.2.1.
> 3. Introduce version 1.2.1 in the Hadoop trunk branch.
> 4. backport hadoop-3.4.0
>
> Even if we upgrade to protobuf-3.23.4, there might still be other issues.
> If there really are other issues, would we need to release
> hadoop-thirdparty-1.2.2?
>
> I think a better approach would be:
>
> To notify about this in the release email for hadoop-3.4.0, and then
> release hadoop-thirdparty-1.2.1 before the release of hadoop-3.4.1,
> followed by thorough validation.
>
> I would like to hear the thoughts of other members.
>
> Best Regards,
> Shilun Fan.
>
> On Fri, Mar 1, 2024 at 6:05 AM slfan1989 <slfan1...@apache.org> wrote:
>
>> Thank you for the feedback on this issue!
>>
>> We have already released hadoop-thirdparty-1.2.0. I think we should not
>> release hadoop-thirdparty-1.2.1 before the launch of hadoop-3.4.0, as we
>> are already short on time.
>>
>> Can we consider addressing this matter with the release of hadoop-3.4.1
>> instead?
>>
>> From my personal point of view, I hope to solve this problem in
>> hadoop-3.4.1.
>>
>> Best Regards,
>> Shilun Fan.
>>
>> On Fri, Mar 1, 2024 at 5:37 AM PJ Fanning <fannin...@apache.org> wrote:
>>
>>> There is an issue with the protobuf lib - described here [1]
>>>
>>> The idea would be to do a new hadoop-thirdparty release and uptake that.
>>>
>>> Related to the hadoop-thirdparty uptake, I would like to get the Avro
>>> uptake merged [2]. I think if we don't merge this for Hadoop 3.4.0, we
>>> will have to wait until v3.5.0 instead because changing the Avro
>>> compilation is probably not something that you would want in a patch
>>> release.
>>>
>>>
>>> [1] https://issues.apache.org/jira/browse/HADOOP-19090
>>> [2] https://github.com/apache/hadoop/pull/4854#issuecomment-1967549235
>>>
>>>
>>> On Thu, 29 Feb 2024 at 22:24, slfan1989 <slfan1...@apache.org> wrote:
>>> >
>>> > I am preparing hadoop-3.4.0-RC3 as we have already released 3 RC
>>> > versions before, and I hope hadoop-3.4.0-RC3 will receive the approval
>>> > of the members.
>>> >
>>> > Compared to hadoop-3.4.0-RC2, my plan is to backport 2 PRs from
>>> > branch-3.4 to branch-3.4.0:
>>> >
>>> > HADOOP-18088: Replacing log4j 1.x with reload4j.
>>> > HADOOP-19084: Pruning hadoop-common transitive dependencies.
>>> >
>>> > I will use hadoop-release-support to package the arm version.
>>> >
>>> > I plan to release hadoop-3.4.0-RC3 next Monday.
>>> >
>>> > Best Regards,
>>> > Shilun Fan.
>>> >
>>> > On Sat, Feb 24, 2024 at 11:28 AM slfan1989 <slfan1...@apache.org> wrote:
>>> >
>>> > > Thank you very much for Steve's detailed test report and issue
>>> > > description!
>>> > >
>>> > > I appreciate your time spent helping with validation. I am currently
>>> > > trying to use hadoop-release-support to prepare hadoop-3.4.0-RC3.
>>> > >
>>> > > After completing the hadoop-3.4.0 version, I will document some of the
>>> > > issues encountered in the "how to release" document, so that future
>>> > > members can refer to it during the release process.
>>> > >
>>> > > Once again, thank you to all members involved in the hadoop-3.4.0
>>> > > release.
>>> > >
>>> > > Let's hope for a smooth release process.
>>> > >
>>> > > Best Regards,
>>> > > Shilun Fan.
>>> > >
>>> > > On Sat, Feb 24, 2024 at 2:29 AM Steve Loughran
>>> > > <ste...@cloudera.com.invalid> wrote:
>>> > >
>>> > >> I have been testing this all week, and a -1 until some very minor
>>> > >> changes go in.
>>> > >>
>>> > >>
>>> > >> 1. build the arm64 binaries with the same jar artifacts as the x86 one
>>> > >> 2. include ad8b6541117b HADOOP-18088. Replace log4j 1.x with reload4j.
>>> > >> 3. include 80b4bb68159c HADOOP-19084. Prune hadoop-common transitive
>>> > >>    dependencies
>>> > >>
>>> > >>
>>> > >> For #1 we have automation there in my client-validator module, which I
>>> > >> have moved to be a hadoop-managed project and tried to make more
>>> > >> manageable
>>> > >> https://github.com/apache/hadoop-release-support
>>> > >>
>>> > >> This contains an ant project to perform a lot of the documented build
>>> > >> stages, including using SCP to copy down an x86 release tarball and
>>> > >> make a signed copy of this containing (locally built) arm artifacts.
>>> > >>
>>> > >> Although that only works with my development environment (macbook m1
>>> > >> laptop and remote ec2 server), it should be straightforward to make it
>>> > >> more flexible.
>>> > >>
>>> > >> It also includes and tests a maven project which imports many of the
>>> > >> hadoop-* pom files and run some test with it; this caught some problems
>>> > >> with exported slf4j and log4j2 artifacts getting into the classpath.
>>> > >> That is: hadoop-common pulling in log4j 1.2 and 2.x bindings.
>>> > >>
>>> > >> HADOOP-19084 fixes this; the build file now includes a target to scan
>>> > >> the dependencies and fail if "forbidden" artifacts are found.
>>> > >> I have not been able to stop logback ending up on the transitive
>>> > >> dependency list, but at least there is only one slf4j there.
>>> > >>
>>> > >> HADOOP-18088. Replace log4j 1.x with reload4j switches over to reload4j
>>> > >> while the move to v2 is still something we have to consider a WiP.
>>> > >>
>>> > >> I have tried doing some other changes to the packaging this week
>>> > >> - creating a lean distro without the AWS SDK
>>> > >> - trying to get protobuf-2.5 out of yarn-api
>>> > >> However, I think it is too late to try applying patches this risky.
>>> > >>
>>> > >> I believe we should get the 3.4.0 release out for people to start
>>> > >> playing with while we rapidly iterate a 3.4.1 release out with
>>> > >> - updated dependencies (where possible)
>>> > >> - separate "lean" and "full" installations, where "full" includes all
>>> > >>   the cloud connectors and their dependencies; the default is lean and
>>> > >>   doesn't. That will cut the default download size in half.
>>> > >> - critical issues which people who use the 3.4.0 release raise with us.
>>> > >>
>>> > >> That is: a packaging and bugs release, with a minimal number of new
>>> > >> features.
>>> > >>
>>> > >> I've created HADOOP-19087
>>> > >> <https://issues.apache.org/jira/browse/HADOOP-19087> to cover this.
>>> > >> I'm willing to get my hands dirty here - Shilun Fan and Xiaoqiao He
>>> > >> have put a lot of work into 3.4.0 and probably need other people to
>>> > >> take up the work for the next release. Who else is willing to
>>> > >> participate? (Yes Mukund, I have you in mind too)
>>> > >>
>>> > >> One thing I would like to visit is: what hadoop-tools modules can we
>>> > >> cut? Are rumen and hadoop-streaming being actively used? Or can we
>>> > >> consider them implicitly EOL and strip them? Just think of the
>>> > >> maintenance effort we would save.
>>> > >>
>>> > >> ---
>>> > >>
>>> > >> Incidentally, I have tested the arm stuff on my raspberry pi5 which is
>>> > >> now running 64 bit linux. I believe it is the first time we have
>>> > >> qualified a Hadoop release with the media player under someone's
>>> > >> television.
>>> > >>
>>> > >> On Thu, 15 Feb 2024 at 20:41, Mukund Madhav Thakur
>>> > >> <mtha...@cloudera.com> wrote:
>>> > >>
>>> > >> > Thanks, Shilun for putting this together.
>>> > >> >
>>> > >> > Tried the below things and everything worked for me.
>>> > >> >
>>> > >> > validated checksum and gpg signature.
>>> > >> > compiled from source.
>>> > >> > Ran AWS integration tests.
>>> > >> > untar the binaries and able to access objects in S3 via hadoop fs
>>> > >> > commands.
>>> > >> > compiled gcs-connector successfully using the 3.4.0 version.
>>> > >> >
>>> > >> > qq: what is the difference between RC1 and RC2? apart from some extra
>>> > >> > patches.
>>> > >> >
>>> > >> >
>>> > >> > On Thu, Feb 15, 2024 at 10:58 AM slfan1989 <slfan1...@apache.org>
>>> > >> > wrote:
>>> > >> >
>>> > >> >> Thank you for explaining this part!
>>> > >> >>
>>> > >> >> hadoop-3.4.0-RC2 used the validate-hadoop-client-artifacts tool to
>>> > >> >> generate the ARM tar package, which should meet expectations.
>>> > >> >>
>>> > >> >> We also look forward to other members helping to verify.
>>> > >> >>
>>> > >> >> Best Regards,
>>> > >> >> Shilun Fan.
>>> > >> >>
>>> > >> >> On Fri, Feb 16, 2024 at 12:22 AM Steve Loughran
>>> > >> >> <ste...@cloudera.com> wrote:
>>> > >> >>
>>> > >> >> >
>>> > >> >> > On Mon, 12 Feb 2024 at 15:32, slfan1989 <slfan1...@apache.org>
>>> > >> >> > wrote:
>>> > >> >> >
>>> > >> >> >>
>>> > >> >> >> Note, because the arm64 binaries are built separately on a
>>> > >> >> >> different platform and JVM, their jar files may not match those
>>> > >> >> >> of the x86 release -and therefore the maven artifacts.
>>> > >> >> >> I don't think this is an issue (the ASF actually releases source
>>> > >> >> >> tarballs, the binaries are there for help only, though with the
>>> > >> >> >> maven repo that's a bit blurred).
>>> > >> >> >>
>>> > >> >> >> The only way to be consistent would actually untar the
>>> > >> >> >> x86.tar.gz, overwrite its binaries with the arm stuff, retar,
>>> > >> >> >> sign and push out for the vote.
>>> > >> >> >
>>> > >> >> >
>>> > >> >> > that's exactly what the "arm.release" target in my
>>> > >> >> > client-validator does. builds an arm tar with the x86 binaries
>>> > >> >> > but the arm native libs, signs it.
>>> > >> >> >
>>> > >> >> >
>>> > >> >> >> Even automating that would be risky.
>>> > >> >> >>
>>> > >> >> >
>>> > >> >> > automating is the *only* way to do it; apache ant has everything
>>> > >> >> > needed for this including the ability to run gpg.
>>> > >> >> >
>>> > >> >> > we did this on the relevant 3.3.x releases and nobody has yet
>>> > >> >> > complained...
>>> > >> >> >
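
For anyone following the arm packaging discussion at the bottom of this thread, here is a rough sketch of the repackaging idea (take the x86 tarball, swap in the locally built arm native libraries, re-tar). It is written in Python purely for illustration and is not the ant "arm.release" target from hadoop-release-support, which is not reproduced here. The function name, the `native_prefix` parameter and the paths in the usage note are all hypothetical, and signing is deliberately left out.

```python
import tarfile
from pathlib import Path


def repackage_with_arm_natives(x86_tarball: Path, arm_native_dir: Path,
                               native_prefix: str, out_tarball: Path) -> list[str]:
    """Copy the x86 tarball, replacing everything under native_prefix.

    Hypothetical helper: native_prefix is the in-archive directory that
    holds native libraries; arm_native_dir holds the locally built arm ones.
    Returns the archive names of the arm files that were added.
    """
    added = []
    with tarfile.open(x86_tarball, "r:gz") as src, \
            tarfile.open(out_tarball, "w:gz") as dst:
        for member in src.getmembers():
            if member.name.startswith(native_prefix + "/"):
                continue  # x86 native file: dropped, replaced below
            # Regular files need their data copied; dirs and links do not.
            fileobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fileobj)
        for path in sorted(arm_native_dir.rglob("*")):
            if path.is_file():
                rel = path.relative_to(arm_native_dir).as_posix()
                name = f"{native_prefix}/{rel}"
                dst.add(path, arcname=name)
                added.append(name)
    return added
```

A call might look like `repackage_with_arm_natives(Path("x86.tar.gz"), Path("arm-native"), "hadoop/lib/native", Path("arm.tar.gz"))`, where all four values are placeholders. Because the jars are copied byte for byte from the x86 tarball, they stay identical across the two binary releases, which is the consistency point raised above. The resulting file would still need to be signed as a separate step before a vote.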