Hi Xiao, I'm still checking with the Parquet community on this. Since the
fix is already +1'd, I'm hoping this won't take long. The delta in the
parquet-1.12.x branch is also small, with just 2 commits so far.

Chao

On Tue, Aug 31, 2021 at 12:03 PM Xiao Li <lix...@databricks.com> wrote:

> Hi, Chao,
>
> How long will it take? Normally, in the RC stage, we always revert the
> upgrade made in the current release. We have done this for the Parquet
> upgrade multiple times in previous releases to avoid a major delay in our
> Spark release.
>
> Thanks,
>
> Xiao
>
>
> On Tue, Aug 31, 2021 at 11:03 AM Chao Sun <sunc...@apache.org> wrote:
>
>> The Apache Parquet community found an issue [1] in 1.12.0 which could
>> cause an incorrect file offset to be written and subsequent reads of the
>> same file to fail. A fix has been proposed in the same JIRA, and we may have
>> to wait until a new release is available so that we can upgrade Spark with
>> the hot fix.
>>
>> [1]: https://issues.apache.org/jira/browse/PARQUET-2078
>>
>> On Fri, Aug 27, 2021 at 7:06 AM Sean Owen <sro...@gmail.com> wrote:
>>
>>> Maybe, I'm just confused why it's needed at all. Other profiles that add
>>> a dependency seem OK, but something's different here.
>>>
>>> One thing we can/should change is to simply remove the
>>> <dependencyManagement> block in the profile. It should always be a direct
>>> dep in Scala 2.13 (which lets us take out the profiles in submodules, which
>>> just repeat that)
>>> We can also update the version, by the by.
>>>
>>> I tried this and the resulting POM still doesn't look like what I expect
>>> though.
>>>
>>> (The binary release is OK, FWIW - it gets pulled in as a JAR as expected)
>>>
>>> On Thu, Aug 26, 2021 at 11:34 PM Stephen Coy <s...@infomedia.com.au>
>>> wrote:
>>>
>>>> Hi Sean,
>>>>
>>>> I think that maybe the https://www.mojohaus.org/flatten-maven-plugin/ will
>>>> help you out here.
>>>>
>>>> Cheers,
>>>>
>>>> Steve C
>>>>
>>>> On 27 Aug 2021, at 12:29 pm, Sean Owen <sro...@gmail.com> wrote:
>>>>
>>>> OK right, you would have seen a different error otherwise.
>>>>
>>>> Yes, profiles are only a compile-time thing, but they should affect the
>>>> effective POM for the artifact. mvn -Pscala-2.13 help:effective-pom shows
>>>> scala-parallel-collections as a dependency in the POM as expected (not in a
>>>> profile). However I see what you see in the .pom in the release repo, and
>>>> in my local repo after building - it's just sitting there as a profile as
>>>> if it weren't activated or something.
>>>>
>>>> I'm confused then; that shouldn't be what happens. I'd say maybe there
>>>> is a problem with the release script, but it seems to affect a simple local
>>>> build as well. Does anyone more expert in this see the problem, while I try
>>>> to debug more?
>>>> The binary distro may actually be fine, I'll check; it may not even
>>>> matter much for users, who generally just treat Spark as a
>>>> compile-time-only dependency. But I can see it would break exactly your
>>>> case, something like a self-contained test job.
>>>>
>>>> On Thu, Aug 26, 2021 at 8:41 PM Stephen Coy <s...@infomedia.com.au>
>>>> wrote:
>>>>
>>>>> I did indeed.
>>>>>
>>>>> The generated spark-core_2.13-3.2.0.pom that is created alongside the
>>>>> jar file in the local repo contains:
>>>>>
>>>>> <profile>
>>>>>   <id>scala-2.13</id>
>>>>>   <dependencies>
>>>>>     <dependency>
>>>>>       <groupId>org.scala-lang.modules</groupId>
>>>>>       <artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
>>>>>     </dependency>
>>>>>   </dependencies>
>>>>> </profile>
>>>>>
>>>>> which means this dependency will be missing for unit tests that create
>>>>> SparkSessions from library code only, a technique inspired by Spark’s own
>>>>> unit tests.
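
A minimal sketch, not part of the original thread, of how one might check an installed POM for dependencies that are declared only inside a profile, which is the situation described above. It uses only Python's standard `xml.etree.ElementTree`. The function name `profile_only_dependencies` and the sample POM are hypothetical and for illustration only; the sample is a cut-down stand-in, not the real spark-core POM.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical, cut-down POM used only to exercise the function below.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
    </dependency>
  </dependencies>
  <profiles>
    <profile>
      <id>scala-2.13</id>
      <dependencies>
        <dependency>
          <groupId>org.scala-lang.modules</groupId>
          <artifactId>scala-parallel-collections_2.13</artifactId>
        </dependency>
      </dependencies>
    </profile>
  </profiles>
</project>
"""


def _coords(dep):
    """Return (groupId, artifactId) for a <dependency> element."""
    group = dep.findtext("m:groupId", default="", namespaces=POM_NS).strip()
    artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS).strip()
    return group, artifact


def profile_only_dependencies(pom_xml):
    """Map each profile id to the dependencies declared only in that profile."""
    root = ET.fromstring(pom_xml.strip())
    # Dependencies declared at the top level of the POM, outside any profile.
    top_level = {
        _coords(d) for d in root.findall("m:dependencies/m:dependency", POM_NS)
    }
    result = {}
    for profile in root.findall("m:profiles/m:profile", POM_NS):
        profile_id = profile.findtext("m:id", default="", namespaces=POM_NS).strip()
        hidden = [
            _coords(d)
            for d in profile.findall("m:dependencies/m:dependency", POM_NS)
            if _coords(d) not in top_level
        ]
        if hidden:
            result[profile_id] = hidden
    return result


print(profile_only_dependencies(SAMPLE_POM))
```

To try this against a real build, one could pass the text of the generated .pom file from the local repository instead of the sample string. An empty result would suggest that every profile dependency is also declared at the top level.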
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Steve C
>>>>>
>>>>> On 27 Aug 2021, at 11:33 am, Sean Owen <sro...@gmail.com> wrote:
>>>>>
>>>>> Did you run ./dev/change-scala-version.sh 2.13? That's required first
>>>>> to update POMs. It works fine for me.
>>>>>
>>>>> On Thu, Aug 26, 2021 at 8:33 PM Stephen Coy <
>>>>> s...@infomedia.com.au.invalid> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Being adventurous I have built the RC1 code with:
>>>>>>
>>>>>> -Pyarn -Phadoop-3.2  -Pyarn -Phadoop-cloud -Phive-thriftserver
>>>>>> -Phive-2.3 -Pscala-2.13 -Dhadoop.version=3.2.2
>>>>>>
>>>>>>
>>>>>> And then attempted to build my Java based spark application.
>>>>>>
>>>>>> However, I found a number of our unit tests were failing with:
>>>>>>
>>>>>> java.lang.NoClassDefFoundError: scala/collection/parallel/TaskSupport
>>>>>>
>>>>>>         at org.apache.spark.SparkContext.$anonfun$union$1(SparkContext.scala:1412)
>>>>>>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>>>>>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>>>>>>         at org.apache.spark.SparkContext.withScope(SparkContext.scala:789)
>>>>>>         at org.apache.spark.SparkContext.union(SparkContext.scala:1406)
>>>>>>         at org.apache.spark.sql.execution.UnionExec.doExecute(basicPhysicalOperators.scala:698)
>>>>>>         at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:184)
>>>>>>         …
>>>>>>
>>>>>>
>>>>>> I tracked this down to a missing dependency:
>>>>>>
>>>>>> <dependency>
>>>>>>   <groupId>org.scala-lang.modules</groupId>
>>>>>>   <artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
>>>>>> </dependency>
>>>>>>
>>>>>>
>>>>>> which unfortunately appears only in a profile in the pom files
>>>>>> associated with the various spark dependencies.
>>>>>>
>>>>>> As far as I know it is not possible to activate profiles in
>>>>>> dependencies in maven builds.
>>>>>>
>>>>>> Therefore I suspect that right now a Scala 2.13 migration is not
>>>>>> quite as seamless as we would like.
>>>>>>
>>>>>> I stress that this is only an issue for developers that write unit
>>>>>> tests for their applications, as the Spark runtime environment will
>>>>>> always have the necessary dependencies available to it.
>>>>>>
>>>>>> (You might consider upgrading the
>>>>>> org.scala-lang.modules:scala-parallel-collections_2.13 version from 0.2
>>>>>> to 1.0.3 though!)
>>>>>>
>>>>>> Cheers and thanks for the great work!
>>>>>>
>>>>>> Steve Coy
>>>>>>
>>>>>>
>>>>>> On 21 Aug 2021, at 3:05 am, Gengliang Wang <ltn...@gmail.com> wrote:
>>>>>>
>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>> version 3.2.0.
>>>>>>
>>>>>> The vote is open until 11:59pm Pacific time Aug 25 and passes if a
>>>>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>>>
>>>>>> [ ] +1 Release this package as Apache Spark 3.2.0
>>>>>> [ ] -1 Do not release this package because ...
>>>>>>
>>>>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>>>>
>>>>>> The tag to be voted on is v3.2.0-rc1 (commit
>>>>>> 6bb3523d8e838bd2082fb90d7f3741339245c044):
>>>>>> https://github.com/apache/spark/tree/v3.2.0-rc1
>>>>>>
>>>>>> The release files, including signatures, digests, etc. can be found
>>>>>> at:
>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc1-bin/
>>>>>>
>>>>>> Signatures used for Spark RCs can be found in this file:
>>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>>
>>>>>> The staging repository for this release can be found at:
>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1388
>>>>>>
>>>>>> The documentation corresponding to this release can be found at:
>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc1-docs/
>>>>>>
>>>>>> The list of bug fixes going into 3.2.0 can be found at the following
>>>>>> URL:
>>>>>> https://issues.apache.org/jira/projects/SPARK/versions/12349407
>>>>>>
>>>>>> This release is using the release script of the tag v3.2.0-rc1.
>>>>>>
>>>>>>
>>>>>> FAQ
>>>>>>
>>>>>> =========================
>>>>>> How can I help test this release?
>>>>>> =========================
>>>>>> If you are a Spark user, you can help us test this release by taking
>>>>>> an existing Spark workload and running on this release candidate,
>>>>>> then reporting any regressions.
>>>>>>
>>>>>> If you're working in PySpark, you can set up a virtual env and install
>>>>>> the current RC and see if anything important breaks. In Java/Scala,
>>>>>> you can add the staging repository to your project's resolvers and test
>>>>>> with the RC (make sure to clean up the artifact cache before/after so
>>>>>> you don't end up building with an out-of-date RC going forward).
>>>>>>
>>>>>> ===========================================
>>>>>> What should happen to JIRA tickets still targeting 3.2.0?
>>>>>> ===========================================
>>>>>> The current list of open tickets targeted at 3.2.0 can be found at:
>>>>>> https://issues.apache.org/jira/projects/SPARK and
>>>>>> search for "Target Version/s" = 3.2.0
>>>>>>
>>>>>> Committers should look at those and triage. Extremely important bug
>>>>>> fixes, documentation, and API tweaks that impact compatibility should
>>>>>> be worked on immediately. Everything else please retarget to an
>>>>>> appropriate release.
>>>>>>
>>>>>> ==================
>>>>>> But my bug isn't fixed?
>>>>>> ==================
>>>>>> In order to make timely releases, we will typically not hold the
>>>>>> release unless the bug in question is a regression from the previous
>>>>>> release. That being said, if there is something which is a regression
>>>>>> that has not been correctly targeted please ping me or a committer to
>>>>>> help target the issue.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>