Hello Xiao, multiple patches in Spark 3.2 depend on Parquet 1.12, so it
might be easier to wait for the fix from the Parquet community instead of
reverting all the related changes. The fix itself is trivial, and we hope
it will not take long. Thanks.
DB Tsai  |  https://www.dbtsai.com/  |  PGP 42E5B25A8F7A82C1

On Tue, Aug 31, 2021 at 1:09 PM Chao Sun <sunc...@apache.org> wrote:

> Hi Xiao, I'm still checking with the Parquet community on this. Since the
> fix is already +1'd, I'm hoping this won't take long. The delta in
> parquet-1.12.x branch is also small with just 2 commits so far.
> Chao
> On Tue, Aug 31, 2021 at 12:03 PM Xiao Li <lix...@databricks.com> wrote:
>> Hi, Chao,
>> How long will it take? Normally, in the RC stage, we always revert the
>> upgrade made in the current release. We did the parquet upgrade multiple
>> times in previous releases to avoid a major delay in our Spark
>> release.
>> Thanks,
>> Xiao
>> On Tue, Aug 31, 2021 at 11:03 AM Chao Sun <sunc...@apache.org> wrote:
>>> The Apache Parquet community found an issue [1] in 1.12.0 which could
>>> cause an incorrect file offset to be written, so that subsequent reads of
>>> the same file fail. A fix has been proposed in the same JIRA, and we may
>>> have to wait until a new release is available so that we can upgrade Spark
>>> with the hot fix.
>>> [1]: https://issues.apache.org/jira/browse/PARQUET-2078
>>> On Fri, Aug 27, 2021 at 7:06 AM Sean Owen <sro...@gmail.com> wrote:
>>>> Maybe, I'm just confused why it's needed at all. Other profiles that
>>>> add a dependency seem OK, but something's different here.
>>>> One thing we can/should change is to simply remove the
>>>> <dependencyManagement> block in the profile. It should always be a direct
>>>> dep in Scala 2.13 (which lets us take out the profiles in submodules, which
>>>> just repeat that)
>>>> We can also update the version, by the by.
>>>> I tried this and the resulting POM still doesn't look like what I
>>>> expect though.
>>>> (The binary release is OK, FWIW - it gets pulled in as a JAR as
>>>> expected)
>>>> On Thu, Aug 26, 2021 at 11:34 PM Stephen Coy <s...@infomedia.com.au>
>>>> wrote:
>>>>> Hi Sean,
>>>>> I think that maybe the https://www.mojohaus.org/flatten-maven-plugin/ will
>>>>> help you out here.
>>>>> Cheers,
>>>>> Steve C
>>>>> On 27 Aug 2021, at 12:29 pm, Sean Owen <sro...@gmail.com> wrote:
>>>>> OK right, you would have seen a different error otherwise.
>>>>> Yes profiles are only a compile-time thing, but they should affect the
>>>>> effective POM for the artifact. mvn -Pscala-2.13 help:effective-pom shows
>>>>> scala-parallel-collections as a dependency in the POM as expected (not in a
>>>>> profile). However I see what you see in the .pom in the release repo, and
>>>>> in my local repo after building - it's just sitting there as a profile as
>>>>> if it weren't activated or something.
>>>>> I'm confused then; that shouldn't be what happens. I'd say maybe there
>>>>> is a problem with the release script, but it seems to affect a simple local
>>>>> build too. Can anyone more expert in this see the problem, while I try to
>>>>> debug more?
>>>>> The binary distro may actually be fine, I'll check; it may even not
>>>>> matter much for users who generally just treat Spark as a compile-time-only
>>>>> dependency either. But I can see it would break exactly your case,
>>>>> something like a self-contained test job.
>>>>> On Thu, Aug 26, 2021 at 8:41 PM Stephen Coy <s...@infomedia.com.au>
>>>>> wrote:
>>>>>> I did indeed.
>>>>>> The generated spark-core_2.13-3.2.0.pom that is created alongside the
>>>>>> jar file in the local repo contains:
>>>>>> <profile>
>>>>>>   <id>scala-2.13</id>
>>>>>>   <dependencies>
>>>>>>     <dependency>
>>>>>>       <groupId>org.scala-lang.modules</groupId>
>>>>>>       <artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
>>>>>>     </dependency>
>>>>>>   </dependencies>
>>>>>> </profile>
>>>>>> which means this dependency will be missing for unit tests that
>>>>>> create SparkSessions from library code only, a technique inspired by
>>>>>> Spark’s own unit tests.
>>>>>> Cheers,
>>>>>> Steve C
>>>>>> On 27 Aug 2021, at 11:33 am, Sean Owen <sro...@gmail.com> wrote:
>>>>>> Did you run ./dev/change-scala-version.sh 2.13? That's required
>>>>>> first to update POMs. It works fine for me.
>>>>>> On Thu, Aug 26, 2021 at 8:33 PM Stephen Coy <
>>>>>> s...@infomedia.com.au.invalid> wrote:
>>>>>>> Hi all,
>>>>>>> Being adventurous I have built the RC1 code with:
>>>>>>> -Pyarn -Phadoop-3.2  -Pyarn -Phadoop-cloud -Phive-thriftserver
>>>>>>> -Phive-2.3 -Pscala-2.13 -Dhadoop.version=3.2.2
>>>>>>> And then attempted to build my Java based spark application.
>>>>>>> However, I found a number of our unit tests were failing with:
>>>>>>> java.lang.NoClassDefFoundError: scala/collection/parallel/TaskSupport
>>>>>>>     at org.apache.spark.SparkContext.$anonfun$union$1(SparkContext.scala:1412)
>>>>>>>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>>>>>>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>>>>>>>     at org.apache.spark.SparkContext.withScope(SparkContext.scala:789)
>>>>>>>     at org.apache.spark.SparkContext.union(SparkContext.scala:1406)
>>>>>>>     at org.apache.spark.sql.execution.UnionExec.doExecute(basicPhysicalOperators.scala:698)
>>>>>>>     at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:184)
>>>>>>>     …
>>>>>>> I tracked this down to a missing dependency:
>>>>>>> <dependency>
>>>>>>>   <groupId>org.scala-lang.modules</groupId>
>>>>>>>   <artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
>>>>>>> </dependency>
>>>>>>> which unfortunately appears only in a profile in the pom files
>>>>>>> associated with the various spark dependencies.
>>>>>>> As far as I know it is not possible to activate profiles in
>>>>>>> dependencies in maven builds.
>>>>>>> Therefore I suspect that right now a Scala 2.13 migration is not
>>>>>>> quite as seamless as we would like.
>>>>>>> I stress that this is only an issue for developers that write unit
>>>>>>> tests for their applications, as the Spark runtime environment will always
>>>>>>> have the necessary dependencies available to it.
>>>>>>> (You might consider upgrading the
>>>>>>> org.scala-lang.modules:scala-parallel-collections_2.13 version from 0.2 to
>>>>>>> 1.0.3 though!)
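>>>>>>> One possible workaround for the failing unit tests, sketched under
>>>>>>> assumptions: declare the missing module directly in the application's
>>>>>>> own POM. The test scope, the hard-coded _2.13 suffix and the 1.0.3
>>>>>>> version are illustrative assumptions, not taken from the Spark build;
>>>>>>> the version should match whatever the Spark artifacts in use expect.
>>>>>>>
>>>>>>> ```xml
>>>>>>> <!-- Hypothetical addition to the application's own pom.xml -->
>>>>>>> <dependency>
>>>>>>>   <groupId>org.scala-lang.modules</groupId>
>>>>>>>   <artifactId>scala-parallel-collections_2.13</artifactId>
>>>>>>>   <!-- Assumed version; align it with the Spark build being tested -->
>>>>>>>   <version>1.0.3</version>
>>>>>>>   <!-- Only the test classpath needs it; the Spark runtime supplies it -->
>>>>>>>   <scope>test</scope>
>>>>>>> </dependency>
>>>>>>> ```
>>>>>>>
>>>>>>> Declaring it explicitly avoids relying on a profile inside a published
>>>>>>> dependency POM being activated.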
>>>>>>> Cheers and thanks for the great work!
>>>>>>> Steve Coy
>>>>>>> On 21 Aug 2021, at 3:05 am, Gengliang Wang <ltn...@gmail.com> wrote:
>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>>> version 3.2.0.
>>>>>>> The vote is open until 11:59pm Pacific time Aug 25 and passes if a
>>>>>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>>>> [ ] +1 Release this package as Apache Spark 3.2.0
>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>>>>> The tag to be voted on is v3.2.0-rc1 (commit
>>>>>>> 6bb3523d8e838bd2082fb90d7f3741339245c044):
>>>>>>> https://github.com/apache/spark/tree/v3.2.0-rc1
>>>>>>> The release files, including signatures, digests, etc. can be found
>>>>>>> at:
>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc1-bin/
>>>>>>> Signatures used for Spark RCs can be found in this file:
>>>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>>> The staging repository for this release can be found at:
>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1388
>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc1-docs/
>>>>>>> The list of bug fixes going into 3.2.0 can be found at the following
>>>>>>> URL:
>>>>>>> https://issues.apache.org/jira/projects/SPARK/versions/12349407
>>>>>>> This release is using the release script of the tag v3.2.0-rc1.
>>>>>>> FAQ
>>>>>>> =========================
>>>>>>> How can I help test this release?
>>>>>>> =========================
>>>>>>> If you are a Spark user, you can help us test this release by taking
>>>>>>> an existing Spark workload and running it on this release candidate,
>>>>>>> then reporting any regressions.
>>>>>>> If you're working in PySpark, you can set up a virtual env, install
>>>>>>> the current RC, and see if anything important breaks. In Java/Scala,
>>>>>>> you can add the staging repository to your project's resolvers and
>>>>>>> test with the RC (make sure to clean up the artifact cache before and
>>>>>>> after so you don't end up building with an out-of-date RC going forward).
>>>>>>> ===========================================
>>>>>>> What should happen to JIRA tickets still targeting 3.2.0?
>>>>>>> ===========================================
>>>>>>> The current list of open tickets targeted at 3.2.0 can be found at:
>>>>>>> https://issues.apache.org/jira/projects/SPARK
>>>>>>> and search for "Target Version/s" = 3.2.0
>>>>>>> Committers should look at those and triage. Extremely important bug
>>>>>>> fixes, documentation, and API tweaks that impact compatibility should
>>>>>>> be worked on immediately. Everything else please retarget to an
>>>>>>> appropriate release.
>>>>>>> ==================
>>>>>>> But my bug isn't fixed?
>>>>>>> ==================
>>>>>>> In order to make timely releases, we will typically not hold the
>>>>>>> release unless the bug in question is a regression from the previous
>>>>>>> release. That being said, if there is something which is a regression
>>>>>>> that has not been correctly targeted please ping me or a committer to
>>>>>>> help target the issue.