Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-31 Thread Gengliang Wang
Hi Chao & DB, Actually, I cut RC2 yesterday, before you posted the Parquet issue: https://github.com/apache/spark/tree/v3.2.0-rc2 It has been 11 days since RC1. I think we can have RC2 today so that the community can test and find potential issues earlier. As for the Parquet issue, we can treat

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-31 Thread DB Tsai
Hello Xiao, there are multiple patches in Spark 3.2 that depend on parquet 1.12, so it might be easier to wait for the fix in the parquet community instead of reverting all the related changes. The fix in the parquet community is trivial, and we hope it will not take too long. Thanks. DB Tsai

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-31 Thread Chao Sun
Hi Xiao, I'm still checking with the Parquet community on this. Since the fix is already +1'd, I'm hoping this won't take long. The delta in parquet-1.12.x branch is also small with just 2 commits so far. Chao On Tue, Aug 31, 2021 at 12:03 PM Xiao Li wrote: > Hi, Chao, > > How long will it

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-31 Thread Xiao Li
Hi, Chao, How long will it take? Normally, in the RC stage, we always revert upgrades made in the current release. We have reverted parquet upgrades multiple times in previous releases to avoid major delays in the Spark release. Thanks, Xiao On Tue, Aug 31, 2021 at 11:03 AM Chao Sun

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-31 Thread Chao Sun
The Apache Parquet community found an issue [1] in 1.12.0 which could cause an incorrect file offset to be written, causing subsequent reads of the same file to fail. A fix has been proposed in the same JIRA, and we may have to wait until a new release is available so that we can upgrade Spark with the

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-27 Thread Sean Owen
Maybe, I'm just confused why it's needed at all. Other profiles that add a dependency seem OK, but something's different here. One thing we can/should change is to simply remove the block in the profile. It should always be a direct dep in Scala 2.13 (which lets us take out the profiles in

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-26 Thread Stephen Coy
Hi Sean, I think that maybe the https://www.mojohaus.org/flatten-maven-plugin/ will help you out here. Cheers, Steve C On 27 Aug 2021, at 12:29 pm, Sean Owen wrote: OK right, you would have seen a different error otherwise. Yes profiles are only a compile-time

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-26 Thread Sean Owen
OK right, you would have seen a different error otherwise. Yes profiles are only a compile-time thing, but they should affect the effective POM for the artifact. mvn -Pscala-2.13 help:effective-pom shows scala-parallel-collections as a dependency in the POM as expected (not in a profile). However

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-26 Thread Stephen Coy
I did indeed. The generated spark-core_2.13-3.2.0.pom that is created alongside the jar file in the local repo contains the scala-2.13 profile with org.scala-lang.modules:scala-parallel-collections_${scala.binary.version} declared as a profile-scoped dependency, which means this dependency will be missing for unit
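The symptom described here can be illustrated with a tiny, hypothetical POM (a sketch, not Spark's actual spark-core POM): a dependency declared only inside a profile stays nested under the profiles section of the installed POM, so downstream builds that never activate that profile never pull it in.

```shell
# Hypothetical minimal POM illustrating the problem: the dependency lives
# only inside the scala-2.13 profile, so consumers that do not activate
# that profile never resolve it.
cat > /tmp/spark-core-sketch.pom <<'EOF'
<project>
  <profiles>
    <profile>
      <id>scala-2.13</id>
      <dependencies>
        <dependency>
          <groupId>org.scala-lang.modules</groupId>
          <artifactId>scala-parallel-collections_2.13</artifactId>
        </dependency>
      </dependencies>
    </profile>
  </profiles>
</project>
EOF
# The only mention of the artifact is inside a <profile> block:
grep -n 'scala-parallel-collections' /tmp/spark-core-sketch.pom
```

The flatten-maven-plugin suggested elsewhere in the thread addresses exactly this: it rewrites the installed POM so that dependencies resolved through active profiles become plain top-level dependencies.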

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-26 Thread Sean Owen
Did you run ./dev/change-scala-version.sh 2.13 ? that's required first to update POMs. It works fine for me. On Thu, Aug 26, 2021 at 8:33 PM Stephen Coy wrote: > Hi all, > > Being adventurous I have built the RC1 code with: > > -Pyarn -Phadoop-3.2 -Pyarn -Phadoop-cloud -Phive-thriftserver
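For context, dev/change-scala-version.sh rewrites the Scala version properties across the project's POMs before a 2.13 build. A rough, self-contained sketch of the idea (not the actual script):

```shell
# Rough sketch of what a cross-build version switch does: rewrite the
# scala.binary.version property in a POM from 2.12 to 2.13.
mkdir -p /tmp/scala-switch
cat > /tmp/scala-switch/pom.xml <<'EOF'
<project>
  <properties>
    <scala.binary.version>2.12</scala.binary.version>
  </properties>
</project>
EOF
# Equivalent in spirit to: ./dev/change-scala-version.sh 2.13
sed -i.bak 's/>2\.12</>2.13</' /tmp/scala-switch/pom.xml
grep 'scala.binary.version' /tmp/scala-switch/pom.xml
```

Skipping this step leaves the POMs pointing at 2.12 artifacts, which is why the -Pscala-2.13 build misbehaves without it.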

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-26 Thread Stephen Coy
Hi all, Being adventurous I have built the RC1 code with: -Pyarn -Phadoop-3.2 -Pyarn -Phadoop-cloud -Phive-thriftserver -Phive-2.3 -Pscala-2.13 -Dhadoop.version=3.2.2 And then attempted to build my Java based spark application. However, I found a number of our unit tests were failing with:

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-25 Thread Yi Wu
Hi Gengliang, I found another ticket: SPARK-36509: Executors don't get rescheduled in standalone mode when worker dies. And it already has the fix: https://github.com/apache/spark/pull/33818 Bests, Yi On Wed, Aug 25, 2021 at 9:49 PM Gengliang

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-25 Thread Gengliang Wang
Hi all, So, RC1 failed. After the RC1 cut, we have merged the following bug fixes to branch-3.2: - Updates AuthEngine to pass the correct SecretKeySpec format - Fix NullPointerException in

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-24 Thread Sean Owen
I think we'll need this revert: https://github.com/apache/spark/pull/33819 Between that and a few other minor but important issues I think I'd say -1 myself and ask for another RC. On Tue, Aug 24, 2021 at 1:01 PM Jacek Laskowski wrote: > Hi Yi Wu, > > Looks like the issue has got resolution:

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-24 Thread Jacek Laskowski
Hi Yi Wu, Looks like the issue has been resolved as Won't Fix. How about your -1? Pozdrawiam, Jacek Laskowski https://about.me/JacekLaskowski "The Internals Of" Online Books Follow me on https://twitter.com/jaceklaskowski On

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Yi Wu
-1. I found a bug (https://issues.apache.org/jira/browse/SPARK-36558) in push-based shuffle, which could lead to a job hang. Bests, Yi On Sat, Aug 21, 2021 at 1:05 AM Gengliang Wang wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.2.0. > > The vote is

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Michael Heuer
Thanks! I found the issue was our explicit dependency on hadoop-client. After dropping it in favor of the one provided by spark-core, we no longer run into the Jackson classpath problem. > On Aug 22, 2021, at 1:29 PM, Sean Owen wrote: > > Jackson was bumped from 2.10.x to 2.12.x, which could well
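As a hedged sketch of the fix described above (a hypothetical application pom.xml fragment; the artifact version and scope are assumptions, not quoted from the thread): rely on spark-core's transitive hadoop-client rather than declaring your own, so Spark's Jackson 2.12.x line wins on the classpath.

```xml
<!-- Hypothetical application pom.xml fragment. The direct
     org.apache.hadoop:hadoop-client dependency (which dragged in an
     older Jackson) is removed; spark-core supplies it transitively. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.12</artifactId>
  <version>3.2.0</version>
  <scope>provided</scope>
</dependency>
```

Running `mvn dependency:tree` before and after such a change is a quick way to confirm only one Jackson version remains on the classpath.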

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Sean Owen
Jackson was bumped from 2.10.x to 2.12.x, which could well explain it if you're exposed to the Spark classpath and have your own different Jackson dep. On Sun, Aug 22, 2021 at 1:21 PM Michael Heuer wrote: > We're seeing runtime classpath issues with Avro 1.10.2, Parquet 1.12.0, > and Spark

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Michael Heuer
We're seeing runtime classpath issues with Avro 1.10.2, Parquet 1.12.0, and Spark 3.2.0 RC1. Our dependency tree is deep though, and will require further investigation. https://github.com/bigdatagenomics/adam/pull/2289 $ mvn test ... *** RUN

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Sean Owen
So far, I've tested Java 8 + Scala 2.12, Scala 2.13 and the results look good per usual. Good to see Scala 2.13 artifacts!! Unless I've forgotten something we're OK for Scala 2.13 now, and Java 11 (and, IIRC, Java 14 works fine minus some very minor corners of the project's deps) I think we're

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Jacek Laskowski
Hi Gengliang, Yay! Thank you! Java 11 with the following MAVEN_OPTS worked fine: $ echo $MAVEN_OPTS -Xss64m -Xmx4g -XX:ReservedCodeCacheSize=1g $ ./build/mvn -Pyarn,kubernetes,hadoop-cloud,hive,hive-thriftserver -DskipTests clean install ... [INFO]
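Collected from the messages above, a sketch of the working Java 11 build setup (the mvn invocation is shown as a comment, since it requires a Spark source checkout):

```shell
# JVM options quoted in this thread; the large thread stack (-Xss64m)
# avoids scalac's StackOverflowError when building Spark with Java 11.
export MAVEN_OPTS="-Xss64m -Xmx4g -XX:ReservedCodeCacheSize=1g"
printf '%s\n' "$MAVEN_OPTS" > /tmp/maven_opts.txt
# Then, from a Spark source checkout:
#   ./build/mvn -Pyarn,kubernetes,hadoop-cloud,hive,hive-thriftserver \
#     -DskipTests clean install
cat /tmp/maven_opts.txt
```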

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Jacek Laskowski
Hi Gengliang, With Java 8 the build worked fine. No other changes. I'm going to give Java 11 a try with the options you mentioned. $ java -version openjdk version "1.8.0_292" OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_292-b10) OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.292-b10,

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Gengliang Wang
Hi Mridul, yes, Spark 3.2.0 should include the fix. The PR was merged after the RC1 cut and there was no JIRA for the issue, so it was missed. On Sun, Aug 22, 2021 at 2:27 PM Mridul Muralidharan wrote: > Hi, > > Signatures, digests, etc check out fine. > Checked out tag and build/tested

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Gengliang Wang
Hi Jacek, The current GitHub action CI for Spark contains Java 11 build. The build is successful with the options "-Xss64m -Xmx2g -XX:ReservedCodeCacheSize=1g": https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml#L506 The default Java stack size is small and we have

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Mridul Muralidharan
Hi, Signatures, digests, etc check out fine. Checked out tag and build/tested with -Pyarn -Phadoop-2.7 -Pmesos -Pkubernetes I am seeing test failures which are addressed by #33790 - this is in branch-3.2, but after the RC tag. After updating to the

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-21 Thread Jacek Laskowski
Hi, I've been building the tag and I'm facing the following StackOverflowError: Exception in thread "main" java.lang.StackOverflowError at scala.tools.nsc.transform.ExtensionMethods$Extender.transform(ExtensionMethods.scala:275) at

[VOTE] Release Spark 3.2.0 (RC1)

2021-08-20 Thread Gengliang Wang
Please vote on releasing the following candidate as Apache Spark version 3.2.0. The vote is open until 11:59pm Pacific time Aug 25 and passes if a majority of +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.2.0 [ ] -1 Do not release this package