Spark32 + Java 11 . Reading parquet java.lang.NoSuchMethodError: 'sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner()'

2022-06-13 Thread Pralabh Kumar
Hi Dev team

I have a Spark 3.2 image with Java 11 (running Spark on K8s). While reading
a huge Parquet file via spark.read.parquet("") I am getting the following
error. The same error is mentioned in the Spark docs
(https://spark.apache.org/docs/latest/#downloading), but with respect to Apache Arrow.


   - IMHO, the error is coming from Parquet 1.12.1, which is based
   on Hadoop 2.10, and Hadoop 2.10 is not Java 11 compatible.

Please let me know if this understanding is correct and whether there is a way
to fix it.


java.lang.NoSuchMethodError: 'sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner()'
    at org.apache.hadoop.crypto.CryptoStreamUtils.freeDB(CryptoStreamUtils.java:41)
    at org.apache.hadoop.crypto.CryptoInputStream.freeBuffers(CryptoInputStream.java:687)
    at org.apache.hadoop.crypto.CryptoInputStream.close(CryptoInputStream.java:320)
    at java.base/java.io.FilterInputStream.close(Unknown Source)
    at org.apache.parquet.hadoop.util.H2SeekableInputStream.close(H2SeekableInputStream.java:50)
    at org.apache.parquet.hadoop.ParquetFileReader.close(ParquetFileReader.java:1299)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:54)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:44)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.$anonfun$readParquetFootersInParallel$1(ParquetFileFormat.scala:467)
    at org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:372)
    at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
    at scala.util.Success.$anonfun$map$1(Try.scala:255)
    at scala.util.Success.map(Try.scala:213)
    at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
    at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
    at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
    at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)


Re: Spark32 + Java 11 . Reading parquet java.lang.NoSuchMethodError: 'sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner()'

2022-06-13 Thread Steve Loughran
On Mon, 13 Jun 2022 at 08:52, Pralabh Kumar  wrote:

> Hi Dev team
>
> I have a spark32 image with Java 11 (Running Spark on K8s) .  While
> reading a huge parquet file via  spark.read.parquet("") .  I am getting
> the following error . The same error is mentioned in Spark docs
> https://spark.apache.org/docs/latest/#downloading but w.r.t to apache
> arrow.
>
>
>- IMHO , I think the error is coming from Parquet 1.12.1  which is
>based on Hadoop 2.10 which is not java 11 compatible.
>
>
Correct. See https://issues.apache.org/jira/browse/HADOOP-12760


> Please let me know if this understanding is correct and is there a way to
> fix it.
>



Upgrade to a version of Hadoop that has the fix: any release >= Hadoop 3.2.0,
which has been shipping since 2018.
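
As a quick check before and after upgrading, the Hadoop version the image
actually ships can be confirmed from a spark-shell on the same image. A minimal
sketch follows; the Parquet path is a placeholder, not taken from this thread:

    // Minimal spark-shell sketch: print the Spark and Hadoop versions on the
    // driver, then re-run the read that fails. VersionInfo is Hadoop's version
    // API; the Parquet path below is a placeholder.
    import org.apache.hadoop.util.VersionInfo

    println(s"Spark:  ${spark.version}")
    println(s"Hadoop: ${VersionInfo.getVersion}")  // expect 3.2.0 or later

    // The NoSuchMethodError surfaces when the CryptoInputStream is closed while
    // reading footers, so even a schema-only read exercises the failing path.
    spark.read.parquet("/path/to/huge.parquet").printSchema()
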

>
>
> java.lang.NoSuchMethodError: 'sun.misc.Cleaner
> sun.nio.ch.DirectBuffer.cleaner()'
>
> at
> org.apache.hadoop.crypto.CryptoStreamUtils.freeDB(CryptoStreamUtils.java:41)
>
> at
> org.apache.hadoop.crypto.CryptoInputStream.freeBuffers(CryptoInputStream.java:687)
>
> at
> org.apache.hadoop.crypto.CryptoInputStream.close(CryptoInputStream.java:320)
>
> at java.base/java.io.FilterInputStream.close(Unknown Source)
>
> at
> org.apache.parquet.hadoop.util.H2SeekableInputStream.close(H2SeekableInputStream.java:50)
>
> at
> org.apache.parquet.hadoop.ParquetFileReader.close(ParquetFileReader.java:1299)
>
> at
> org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:54)
>
> at
> org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:44)
>
> at
> org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.$anonfun$readParquetFootersInParallel$1(ParquetFileFormat.scala:467)
>
> at
> org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:372)
>
> at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
>
> at scala.util.Success.$anonfun$map$1(Try.scala:255)
>
> at scala.util.Success.map(Try.scala:213)
>
> at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
>
> at
> scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
>
> at
> scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
>
> at
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>
> at
> java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown
> Source)
>
> at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown
> Source)
>
> at
> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown
> Source)
>
> at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown
> Source)
>
> at
> java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
>
> at
> java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
>


Re: Spark32 + Java 11 . Reading parquet java.lang.NoSuchMethodError: 'sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner()'

2022-06-13 Thread Pralabh Kumar
Hi Steve,

Thanks for the help. We are on Hadoop 3.2; however, we are building Hadoop 3.2
with Java 8.

Do you suggest building Hadoop with Java 11?

Regards
Pralabh kumar
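
As an aside, the HADOOP-12760 fix resolves the cleaner reflectively at runtime,
so a Hadoop 3.2+ build made with Java 8 should still avoid this particular error
on a Java 11 runtime; the more common cause is an older Hadoop artifact sneaking
onto the classpath. A hedged spark-shell sketch for checking that on the same
image (the class name is taken from the stack trace above):

    // Which jar actually provides the Hadoop class that throws? If this resolves
    // to a Hadoop 2.x artifact, the image's classpath (rather than the JDK used
    // to build Hadoop) is the more likely culprit.
    val cls = Class.forName("org.apache.hadoop.crypto.CryptoStreamUtils")
    println(Option(cls.getProtectionDomain.getCodeSource).map(_.getLocation).orNull)
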

On Mon, 13 Jun 2022, 15:25 Steve Loughran,  wrote:

>
>
> On Mon, 13 Jun 2022 at 08:52, Pralabh Kumar 
> wrote:
>
>> Hi Dev team
>>
>> I have a spark32 image with Java 11 (Running Spark on K8s) .  While
>> reading a huge parquet file via  spark.read.parquet("") .  I am getting
>> the following error . The same error is mentioned in Spark docs
>> https://spark.apache.org/docs/latest/#downloading but w.r.t to apache
>> arrow.
>>
>>
>>- IMHO , I think the error is coming from Parquet 1.12.1  which is
>>based on Hadoop 2.10 which is not java 11 compatible.
>>
>>
> correct. see https://issues.apache.org/jira/browse/HADOOP-12760
>
>
> Please let me know if this understanding is correct and is there a way to
>> fix it.
>>
>
>
>
> upgrade to a version of hadoop with the fix. That's any version >= hadoop
> 3.2.0 which shipped since 2018
>
>>
>>
>> java.lang.NoSuchMethodError: 'sun.misc.Cleaner
>> sun.nio.ch.DirectBuffer.cleaner()'
>>
>> at
>> org.apache.hadoop.crypto.CryptoStreamUtils.freeDB(CryptoStreamUtils.java:41)
>>
>> at
>> org.apache.hadoop.crypto.CryptoInputStream.freeBuffers(CryptoInputStream.java:687)
>>
>> at
>> org.apache.hadoop.crypto.CryptoInputStream.close(CryptoInputStream.java:320)
>>
>> at java.base/java.io.FilterInputStream.close(Unknown Source)
>>
>> at
>> org.apache.parquet.hadoop.util.H2SeekableInputStream.close(H2SeekableInputStream.java:50)
>>
>> at
>> org.apache.parquet.hadoop.ParquetFileReader.close(ParquetFileReader.java:1299)
>>
>> at
>> org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:54)
>>
>> at
>> org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:44)
>>
>> at
>> org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.$anonfun$readParquetFootersInParallel$1(ParquetFileFormat.scala:467)
>>
>> at
>> org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:372)
>>
>> at
>> scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
>>
>> at scala.util.Success.$anonfun$map$1(Try.scala:255)
>>
>> at scala.util.Success.map(Try.scala:213)
>>
>> at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
>>
>> at
>> scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
>>
>> at
>> scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
>>
>> at
>> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown
>> Source)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown
>> Source)
>>
>> at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown
>> Source)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
>>
>


Re: Spark32 + Java 11 . Reading parquet java.lang.NoSuchMethodError: 'sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner()'

2022-06-13 Thread Pralabh Kumar
Steve, thanks for your help. Please ignore my last comment.

Regards
Pralabh Kumar

On Mon, 13 Jun 2022, 15:43 Pralabh Kumar,  wrote:

> Hi steve
>
> Thx for help . We are on Hadoop3.2 ,however we are building Hadoop3.2 with
> Java 8 .
>
> Do you suggest to build Hadoop with Java 11
>
> Regards
> Pralabh kumar
>
> On Mon, 13 Jun 2022, 15:25 Steve Loughran,  wrote:
>
>>
>>
>> On Mon, 13 Jun 2022 at 08:52, Pralabh Kumar 
>> wrote:
>>
>>> Hi Dev team
>>>
>>> I have a spark32 image with Java 11 (Running Spark on K8s) .  While
>>> reading a huge parquet file via  spark.read.parquet("") .  I am getting
>>> the following error . The same error is mentioned in Spark docs
>>> https://spark.apache.org/docs/latest/#downloading but w.r.t to apache
>>> arrow.
>>>
>>>
>>>- IMHO , I think the error is coming from Parquet 1.12.1  which is
>>>based on Hadoop 2.10 which is not java 11 compatible.
>>>
>>>
>> correct. see https://issues.apache.org/jira/browse/HADOOP-12760
>>
>>
>> Please let me know if this understanding is correct and is there a way to
>>> fix it.
>>>
>>
>>
>>
>> upgrade to a version of hadoop with the fix. That's any version >= hadoop
>> 3.2.0 which shipped since 2018
>>
>>>
>>>
>>> java.lang.NoSuchMethodError: 'sun.misc.Cleaner
>>> sun.nio.ch.DirectBuffer.cleaner()'
>>>
>>> at
>>> org.apache.hadoop.crypto.CryptoStreamUtils.freeDB(CryptoStreamUtils.java:41)
>>>
>>> at
>>> org.apache.hadoop.crypto.CryptoInputStream.freeBuffers(CryptoInputStream.java:687)
>>>
>>> at
>>> org.apache.hadoop.crypto.CryptoInputStream.close(CryptoInputStream.java:320)
>>>
>>> at java.base/java.io.FilterInputStream.close(Unknown Source)
>>>
>>> at
>>> org.apache.parquet.hadoop.util.H2SeekableInputStream.close(H2SeekableInputStream.java:50)
>>>
>>> at
>>> org.apache.parquet.hadoop.ParquetFileReader.close(ParquetFileReader.java:1299)
>>>
>>> at
>>> org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:54)
>>>
>>> at
>>> org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:44)
>>>
>>> at
>>> org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.$anonfun$readParquetFootersInParallel$1(ParquetFileFormat.scala:467)
>>>
>>> at
>>> org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:372)
>>>
>>> at
>>> scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
>>>
>>> at scala.util.Success.$anonfun$map$1(Try.scala:255)
>>>
>>> at scala.util.Success.map(Try.scala:213)
>>>
>>> at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
>>>
>>> at
>>> scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
>>>
>>> at
>>> scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
>>>
>>> at
>>> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>>>
>>> at
>>> java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown
>>> Source)
>>>
>>> at
>>> java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
>>>
>>> at
>>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown
>>> Source)
>>>
>>> at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown
>>> Source)
>>>
>>> at
>>> java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
>>>
>>> at
>>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
>>>
>>


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Wenchen Fan
+1, tests are all green and there are no more blocker issues AFAIK.

On Fri, Jun 10, 2022 at 12:27 PM Maxim Gekk
 wrote:

> Please vote on releasing the following candidate as
> Apache Spark version 3.3.0.
>
> The vote is open until 11:59pm Pacific time June 14th and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.3.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.3.0-rc6 (commit
> f74867bddfbcdd4d08076db36851e88b15e66556):
> https://github.com/apache/spark/tree/v3.3.0-rc6
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1407
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>
> The list of bug fixes going into 3.3.0 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>
> This release is using the release script of the tag v3.3.0-rc6.
>
>
> FAQ
>
> =
> How can I help test this release?
> =
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC and see if anything important breaks, in the Java/Scala
> you can add the staging repository to your projects resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with a out of date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.3.0?
> ===
> The current list of open tickets targeted at 3.3.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.3.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
> Maxim Gekk
>
> Software Engineer
>
> Databricks, Inc.
>
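
For the Java/Scala route in the FAQ above, a minimal sbt sketch of adding the
staging repository to a project's resolvers. The repository URL is copied from
the vote email; publishing the staged artifacts under the plain "3.3.0" version
string is an assumption here:

    // Hypothetical build.sbt fragment for testing the RC against an existing project.
    // The resolver URL comes from the vote email; "3.3.0" is an assumption about
    // how the staged artifacts are published.
    resolvers += "Apache Spark 3.3.0 RC6 staging" at
      "https://repository.apache.org/content/repositories/orgapachespark-1407"

    libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.3.0"

As the FAQ notes, clean the local artifact cache (for example the
org.apache.spark entries under ~/.ivy2 and ~/.m2) before and after, so later
builds don't silently keep the out-of-date RC.
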


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Herman van Hovell
+1

On Mon, Jun 13, 2022 at 12:53 PM Wenchen Fan  wrote:

> +1, tests are all green and there are no more blocker issues AFAIK.
>
> On Fri, Jun 10, 2022 at 12:27 PM Maxim Gekk
>  wrote:
>
>> Please vote on releasing the following candidate as
>> Apache Spark version 3.3.0.
>>
>> The vote is open until 11:59pm Pacific time June 14th and passes if a
>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.3.0
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v3.3.0-rc6 (commit
>> f74867bddfbcdd4d08076db36851e88b15e66556):
>> https://github.com/apache/spark/tree/v3.3.0-rc6
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1407
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>>
>> The list of bug fixes going into 3.3.0 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>
>> This release is using the release script of the tag v3.3.0-rc6.
>>
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC and see if anything important breaks, in the Java/Scala
>> you can add the staging repository to your projects resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with a out of date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.3.0?
>> ===
>> The current list of open tickets targeted at 3.3.0 can be found at:
>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>> Version/s" = 3.3.0
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>> Maxim Gekk
>>
>> Software Engineer
>>
>> Databricks, Inc.
>>
>


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Gengliang Wang
+1 (non-binding)

On Mon, Jun 13, 2022 at 10:20 AM Herman van Hovell
 wrote:

> +1
>
> On Mon, Jun 13, 2022 at 12:53 PM Wenchen Fan  wrote:
>
>> +1, tests are all green and there are no more blocker issues AFAIK.
>>
>> On Fri, Jun 10, 2022 at 12:27 PM Maxim Gekk
>>  wrote:
>>
>>> Please vote on releasing the following candidate as
>>> Apache Spark version 3.3.0.
>>>
>>> The vote is open until 11:59pm Pacific time June 14th and passes if a
>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>
>>> [ ] +1 Release this package as Apache Spark 3.3.0
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>
>>> The tag to be voted on is v3.3.0-rc6 (commit
>>> f74867bddfbcdd4d08076db36851e88b15e66556):
>>> https://github.com/apache/spark/tree/v3.3.0-rc6
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1407
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>>>
>>> The list of bug fixes going into 3.3.0 can be found at the following URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>>
>>> This release is using the release script of the tag v3.3.0-rc6.
>>>
>>>
>>> FAQ
>>>
>>> =
>>> How can I help test this release?
>>> =
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running on this release candidate, then
>>> reporting any regressions.
>>>
>>> If you're working in PySpark you can set up a virtual env and install
>>> the current RC and see if anything important breaks, in the Java/Scala
>>> you can add the staging repository to your projects resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up building with a out of date RC going forward).
>>>
>>> ===
>>> What should happen to JIRA tickets still targeting 3.3.0?
>>> ===
>>> The current list of open tickets targeted at 3.3.0 can be found at:
>>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>>> Version/s" = 3.3.0
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should
>>> be worked on immediately. Everything else please retarget to an
>>> appropriate release.
>>>
>>> ==
>>> But my bug isn't fixed?
>>> ==
>>> In order to make timely releases, we will typically not hold the
>>> release unless the bug in question is a regression from the previous
>>> release. That being said, if there is something which is a regression
>>> that has not been correctly targeted please ping me or a committer to
>>> help target the issue.
>>>
>>> Maxim Gekk
>>>
>>> Software Engineer
>>>
>>> Databricks, Inc.
>>>
>>


[VOTE][SPIP] Spark Connect

2022-06-13 Thread Herman van Hovell
Hi all,

I’d like to start a vote for SPIP: "Spark Connect"

The goal of the SPIP is to introduce a DataFrame-based client/server API
for Spark.

Please also refer to:

- Previous discussion in dev mailing list: [DISCUSS] SPIP: Spark Connect -
A client and server interface for Apache Spark.

- Design doc: Spark Connect - A client and server interface for Apache
Spark.

- JIRA: SPARK-39375 

Please vote on the SPIP for the next 72 hours:

[ ] +1: Accept the proposal as an official SPIP
[ ] +0
[ ] -1: I don’t think this is a good idea because …

Kind Regards,
Herman
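
For readers skimming the thread, a purely illustrative Scala sketch of the idea:
the client builds DataFrame operations locally and a remote Spark server executes
them. The "sc://host:port" connection string and the remote() builder method used
below are illustrative assumptions, not API defined by the SPIP text above; see
the design doc for the actual proposal.

    // Purely illustrative sketch of a DataFrame-based client/server interaction.
    // The connection string and remote() builder are assumptions for illustration.
    import org.apache.spark.sql.SparkSession

    object ConnectClientSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .remote("sc://spark-server:15002")  // hypothetical remote endpoint
          .getOrCreate()

        // The query is only a logical plan on the client; the server runs it.
        spark.range(0, 1000)
          .selectExpr("id", "id * 2 AS doubled")
          .show()

        spark.stop()
      }
    }
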


Re: [VOTE][SPIP] Spark Connect

2022-06-13 Thread Herman van Hovell
Let me kick off the voting...

+1

On Mon, Jun 13, 2022 at 2:02 PM Herman van Hovell 
wrote:

> Hi all,
>
> I’d like to start a vote for SPIP: "Spark Connect"
>
> The goal of the SPIP is to introduce a Dataframe based client/server API
> for Spark
>
> Please also refer to:
>
> - Previous discussion in dev mailing list: [DISCUSS] SPIP: Spark Connect
> - A client and server interface for Apache Spark.
> 
> - Design doc: Spark Connect - A client and server interface for Apache
> Spark.
> 
> - JIRA: SPARK-39375 
>
> Please vote on the SPIP for the next 72 hours:
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>
> Kind Regards,
> Herman
>


Re: [VOTE][SPIP] Spark Connect

2022-06-13 Thread Matei Zaharia
+1, very excited about this direction.

Matei

> On Jun 13, 2022, at 11:07 AM, Herman van Hovell 
>  wrote:
> 
> Let me kick off the voting...
> 
> +1
> 
> On Mon, Jun 13, 2022 at 2:02 PM Herman van Hovell  > wrote:
> Hi all,
> 
> I’d like to start a vote for SPIP: "Spark Connect"
> 
> The goal of the SPIP is to introduce a Dataframe based client/server API for 
> Spark
> 
> Please also refer to:
> 
> - Previous discussion in dev mailing list: [DISCUSS] SPIP: Spark Connect - A 
> client and server interface for Apache Spark. 
> 
> - Design doc: Spark Connect - A client and server interface for Apache Spark. 
> 
> - JIRA: SPARK-39375 
> 
> Please vote on the SPIP for the next 72 hours:
> 
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
> 
> Kind Regards,
> Herman



Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Sean Owen
+1 still looks good, same as last results.

On Thu, Jun 9, 2022 at 11:27 PM Maxim Gekk
 wrote:

> Please vote on releasing the following candidate as
> Apache Spark version 3.3.0.
>
> The vote is open until 11:59pm Pacific time June 14th and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.3.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.3.0-rc6 (commit
> f74867bddfbcdd4d08076db36851e88b15e66556):
> https://github.com/apache/spark/tree/v3.3.0-rc6
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1407
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>
> The list of bug fixes going into 3.3.0 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>
> This release is using the release script of the tag v3.3.0-rc6.
>
>
> FAQ
>
> =
> How can I help test this release?
> =
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC and see if anything important breaks, in the Java/Scala
> you can add the staging repository to your projects resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with a out of date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.3.0?
> ===
> The current list of open tickets targeted at 3.3.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.3.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
> Maxim Gekk
>
> Software Engineer
>
> Databricks, Inc.
>


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Tom Graves
 +1
Tom
On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk wrote:

 Please vote on releasing the following candidate as Apache Spark version 3.3.0.

The vote is open until 11:59pm Pacific time June 14th and passes if a majority 
+1 PMC votes are cast, with a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 3.3.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v3.3.0-rc6 (commit 
f74867bddfbcdd4d08076db36851e88b15e66556):
https://github.com/apache/spark/tree/v3.3.0-rc6

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1407

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/

The list of bug fixes going into 3.3.0 can be found at the following URL:
https://issues.apache.org/jira/projects/SPARK/versions/12350369
This release is using the release script of the tag v3.3.0-rc6.


FAQ

=
How can I help test this release?
=
If you are a Spark user, you can help us test this release by taking
an existing Spark workload and running on this release candidate, then
reporting any regressions.

If you're working in PySpark you can set up a virtual env and install
the current RC and see if anything important breaks, in the Java/Scala
you can add the staging repository to your projects resolvers and test
with the RC (make sure to clean up the artifact cache before/after so
you don't end up building with a out of date RC going forward).

===
What should happen to JIRA tickets still targeting 3.3.0?
===
The current list of open tickets targeted at 3.3.0 can be found at:
https://issues.apache.org/jira/projects/SPARK and search for "Target Version/s" 
= 3.3.0

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should
be worked on immediately. Everything else please retarget to an
appropriate release.

==
But my bug isn't fixed?
==
In order to make timely releases, we will typically not hold the
release unless the bug in question is a regression from the previous
release. That being said, if there is something which is a regression
that has not been correctly targeted please ping me or a committer to
help target the issue.
Maxim Gekk

Software Engineer

Databricks, Inc.
  

Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Mridul Muralidharan
+1

Signatures, digests, etc. check out fine.
Checked out the tag and built/tested with -Pyarn -Pmesos -Pkubernetes.

The test "SPARK-33084: Add jar support Ivy URI in SQL" in sql.SQLQuerySuite
fails, but other than that the rest looks good.

Regards,
Mridul



On Mon, Jun 13, 2022 at 4:25 PM Tom Graves 
wrote:

> +1
>
> Tom
>
> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk
>  wrote:
>
>
> Please vote on releasing the following candidate as
> Apache Spark version 3.3.0.
>
> The vote is open until 11:59pm Pacific time June 14th and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.3.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.3.0-rc6 (commit
> f74867bddfbcdd4d08076db36851e88b15e66556):
> https://github.com/apache/spark/tree/v3.3.0-rc6
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1407
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>
> The list of bug fixes going into 3.3.0 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>
> This release is using the release script of the tag v3.3.0-rc6.
>
>
> FAQ
>
> =
> How can I help test this release?
> =
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC and see if anything important breaks, in the Java/Scala
> you can add the staging repository to your projects resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with a out of date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.3.0?
> ===
> The current list of open tickets targeted at 3.3.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.3.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
> Maxim Gekk
>
> Software Engineer
>
> Databricks, Inc.
>


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Chris Nauroth
+1 (non-binding)

I repeated all checks I described for RC5:

https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls

Maxim, thank you for your dedication on these release candidates.

Chris Nauroth


On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan 
wrote:

>
> +1
>
> Signatures, digests, etc check out fine.
> Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
>
> The test "SPARK-33084: Add jar support Ivy URI in SQL" in
> sql.SQLQuerySuite fails; but other than that, rest looks good.
>
> Regards,
> Mridul
>
>
>
> On Mon, Jun 13, 2022 at 4:25 PM Tom Graves 
> wrote:
>
>> +1
>>
>> Tom
>>
>> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk
>>  wrote:
>>
>>
>> Please vote on releasing the following candidate as
>> Apache Spark version 3.3.0.
>>
>> The vote is open until 11:59pm Pacific time June 14th and passes if a
>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.3.0
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v3.3.0-rc6 (commit
>> f74867bddfbcdd4d08076db36851e88b15e66556):
>> https://github.com/apache/spark/tree/v3.3.0-rc6
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1407
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>>
>> The list of bug fixes going into 3.3.0 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>
>> This release is using the release script of the tag v3.3.0-rc6.
>>
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC and see if anything important breaks, in the Java/Scala
>> you can add the staging repository to your projects resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with a out of date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.3.0?
>> ===
>> The current list of open tickets targeted at 3.3.0 can be found at:
>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>> Version/s" = 3.3.0
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>> Maxim Gekk
>>
>> Software Engineer
>>
>> Databricks, Inc.
>>
>


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Dongjoon Hyun
+1

Thanks,
Dongjoon.

On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth  wrote:

> +1 (non-binding)
>
> I repeated all checks I described for RC5:
>
> https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls
>
> Maxim, thank you for your dedication on these release candidates.
>
> Chris Nauroth
>
>
> On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan 
> wrote:
>
>>
>> +1
>>
>> Signatures, digests, etc check out fine.
>> Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
>>
>> The test "SPARK-33084: Add jar support Ivy URI in SQL" in
>> sql.SQLQuerySuite fails; but other than that, rest looks good.
>>
>> Regards,
>> Mridul
>>
>>
>>
>> On Mon, Jun 13, 2022 at 4:25 PM Tom Graves 
>> wrote:
>>
>>> +1
>>>
>>> Tom
>>>
>>> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk
>>>  wrote:
>>>
>>>
>>> Please vote on releasing the following candidate as
>>> Apache Spark version 3.3.0.
>>>
>>> The vote is open until 11:59pm Pacific time June 14th and passes if a
>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>
>>> [ ] +1 Release this package as Apache Spark 3.3.0
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>
>>> The tag to be voted on is v3.3.0-rc6 (commit
>>> f74867bddfbcdd4d08076db36851e88b15e66556):
>>> https://github.com/apache/spark/tree/v3.3.0-rc6
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1407
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>>>
>>> The list of bug fixes going into 3.3.0 can be found at the following URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>>
>>> This release is using the release script of the tag v3.3.0-rc6.
>>>
>>>
>>> FAQ
>>>
>>> =
>>> How can I help test this release?
>>> =
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running on this release candidate, then
>>> reporting any regressions.
>>>
>>> If you're working in PySpark you can set up a virtual env and install
>>> the current RC and see if anything important breaks, in the Java/Scala
>>> you can add the staging repository to your projects resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up building with a out of date RC going forward).
>>>
>>> ===
>>> What should happen to JIRA tickets still targeting 3.3.0?
>>> ===
>>> The current list of open tickets targeted at 3.3.0 can be found at:
>>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>>> Version/s" = 3.3.0
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should
>>> be worked on immediately. Everything else please retarget to an
>>> appropriate release.
>>>
>>> ==
>>> But my bug isn't fixed?
>>> ==
>>> In order to make timely releases, we will typically not hold the
>>> release unless the bug in question is a regression from the previous
>>> release. That being said, if there is something which is a regression
>>> that has not been correctly targeted please ping me or a committer to
>>> help target the issue.
>>>
>>> Maxim Gekk
>>>
>>> Software Engineer
>>>
>>> Databricks, Inc.
>>>
>>


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Yuming Wang
+1 (non-binding)

On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun 
wrote:

> +1
>
> Thanks,
> Dongjoon.
>
> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth  wrote:
>
>> +1 (non-binding)
>>
>> I repeated all checks I described for RC5:
>>
>> https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls
>>
>> Maxim, thank you for your dedication on these release candidates.
>>
>> Chris Nauroth
>>
>>
>> On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan 
>> wrote:
>>
>>>
>>> +1
>>>
>>> Signatures, digests, etc check out fine.
>>> Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
>>>
>>> The test "SPARK-33084: Add jar support Ivy URI in SQL" in
>>> sql.SQLQuerySuite fails; but other than that, rest looks good.
>>>
>>> Regards,
>>> Mridul
>>>
>>>
>>>
>>> On Mon, Jun 13, 2022 at 4:25 PM Tom Graves 
>>> wrote:
>>>
 +1

 Tom

 On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk
  wrote:


 Please vote on releasing the following candidate as
 Apache Spark version 3.3.0.

 The vote is open until 11:59pm Pacific time June 14th and passes if a
 majority +1 PMC votes are cast, with a minimum of 3 +1 votes.

 [ ] +1 Release this package as Apache Spark 3.3.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see http://spark.apache.org/

 The tag to be voted on is v3.3.0-rc6 (commit
 f74867bddfbcdd4d08076db36851e88b15e66556):
 https://github.com/apache/spark/tree/v3.3.0-rc6

 The release files, including signatures, digests, etc. can be found at:
 https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/

 Signatures used for Spark RCs can be found in this file:
 https://dist.apache.org/repos/dist/dev/spark/KEYS

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1407

 The documentation corresponding to this release can be found at:
 https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/

 The list of bug fixes going into 3.3.0 can be found at the following
 URL:
 https://issues.apache.org/jira/projects/SPARK/versions/12350369

 This release is using the release script of the tag v3.3.0-rc6.


 FAQ

 =
 How can I help test this release?
 =
 If you are a Spark user, you can help us test this release by taking
 an existing Spark workload and running on this release candidate, then
 reporting any regressions.

 If you're working in PySpark you can set up a virtual env and install
 the current RC and see if anything important breaks, in the Java/Scala
 you can add the staging repository to your projects resolvers and test
 with the RC (make sure to clean up the artifact cache before/after so
 you don't end up building with a out of date RC going forward).

 ===
 What should happen to JIRA tickets still targeting 3.3.0?
 ===
 The current list of open tickets targeted at 3.3.0 can be found at:
 https://issues.apache.org/jira/projects/SPARK and search for "Target
 Version/s" = 3.3.0

 Committers should look at those and triage. Extremely important bug
 fixes, documentation, and API tweaks that impact compatibility should
 be worked on immediately. Everything else please retarget to an
 appropriate release.

 ==
 But my bug isn't fixed?
 ==
 In order to make timely releases, we will typically not hold the
 release unless the bug in question is a regression from the previous
 release. That being said, if there is something which is a regression
 that has not been correctly targeted please ping me or a committer to
 help target the issue.

 Maxim Gekk

 Software Engineer

 Databricks, Inc.

>>>


Re: [VOTE][SPIP] Spark Connect

2022-06-13 Thread Yuming Wang
+1.

On Tue, Jun 14, 2022 at 2:20 AM Matei Zaharia 
wrote:

> +1, very excited about this direction.
>
> Matei
>
> On Jun 13, 2022, at 11:07 AM, Herman van Hovell <
> her...@databricks.com.INVALID> wrote:
>
> Let me kick off the voting...
>
> +1
>
> On Mon, Jun 13, 2022 at 2:02 PM Herman van Hovell 
> wrote:
>
>> Hi all,
>>
>> I’d like to start a vote for SPIP: "Spark Connect"
>>
>> The goal of the SPIP is to introduce a Dataframe based client/server API
>> for Spark
>>
>> Please also refer to:
>>
>> - Previous discussion in dev mailing list: [DISCUSS] SPIP: Spark Connect
>> - A client and server interface for Apache Spark.
>> 
>> - Design doc: Spark Connect - A client and server interface for Apache
>> Spark.
>> 
>> - JIRA: SPARK-39375 
>>
>> Please vote on the SPIP for the next 72 hours:
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>> Kind Regards,
>> Herman
>>
>
>


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Holden Karau
+1

On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang  wrote:

> +1 (non-binding)
>
> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun 
> wrote:
>
>> +1
>>
>> Thanks,
>> Dongjoon.
>>
>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth 
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> I repeated all checks I described for RC5:
>>>
>>> https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls
>>>
>>> Maxim, thank you for your dedication on these release candidates.
>>>
>>> Chris Nauroth
>>>
>>>
>>> On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan 
>>> wrote:
>>>

 +1

 Signatures, digests, etc check out fine.
 Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes

 The test "SPARK-33084: Add jar support Ivy URI in SQL" in
 sql.SQLQuerySuite fails; but other than that, rest looks good.

 Regards,
 Mridul



 On Mon, Jun 13, 2022 at 4:25 PM Tom Graves 
 wrote:

> +1
>
> Tom
>
> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk
>  wrote:
>
>
> Please vote on releasing the following candidate as
> Apache Spark version 3.3.0.
>
> The vote is open until 11:59pm Pacific time June 14th and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.3.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.3.0-rc6 (commit
> f74867bddfbcdd4d08076db36851e88b15e66556):
> https://github.com/apache/spark/tree/v3.3.0-rc6
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1407
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>
> The list of bug fixes going into 3.3.0 can be found at the following
> URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>
> This release is using the release script of the tag v3.3.0-rc6.
>
>
> FAQ
>
> =
> How can I help test this release?
> =
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC and see if anything important breaks, in the Java/Scala
> you can add the staging repository to your projects resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with a out of date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.3.0?
> ===
> The current list of open tickets targeted at 3.3.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.3.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
> Maxim Gekk
>
> Software Engineer
>
> Databricks, Inc.
>


-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: [VOTE][SPIP] Spark Connect

2022-06-13 Thread Hyukjin Kwon
+1

On Tue, 14 Jun 2022 at 08:50, Yuming Wang  wrote:

> +1.
>
> On Tue, Jun 14, 2022 at 2:20 AM Matei Zaharia 
> wrote:
>
>> +1, very excited about this direction.
>>
>> Matei
>>
>> On Jun 13, 2022, at 11:07 AM, Herman van Hovell <
>> her...@databricks.com.INVALID> wrote:
>>
>> Let me kick off the voting...
>>
>> +1
>>
>> On Mon, Jun 13, 2022 at 2:02 PM Herman van Hovell 
>> wrote:
>>
>>> Hi all,
>>>
>>> I’d like to start a vote for SPIP: "Spark Connect"
>>>
>>> The goal of the SPIP is to introduce a Dataframe based client/server
>>> API for Spark
>>>
>>> Please also refer to:
>>>
>>> - Previous discussion in dev mailing list: [DISCUSS] SPIP: Spark
>>> Connect - A client and server interface for Apache Spark.
>>> 
>>> - Design doc: Spark Connect - A client and server interface for Apache
>>> Spark.
>>> 
>>> - JIRA: SPARK-39375 
>>>
>>> Please vote on the SPIP for the next 72 hours:
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>> Kind Regards,
>>> Herman
>>>
>>
>>


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread L. C. Hsieh
+1

On Mon, Jun 13, 2022 at 5:07 PM Holden Karau  wrote:
>
> +1
>
> On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang  wrote:
>>
>> +1 (non-binding)
>>
>> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun  
>> wrote:
>>>
>>> +1
>>>
>>> Thanks,
>>> Dongjoon.
>>>
>>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth  wrote:

 +1 (non-binding)

 I repeated all checks I described for RC5:

 https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls

 Maxim, thank you for your dedication on these release candidates.

 Chris Nauroth


 On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan  
 wrote:
>
>
> +1
>
> Signatures, digests, etc check out fine.
> Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
>
> The test "SPARK-33084: Add jar support Ivy URI in SQL" in 
> sql.SQLQuerySuite fails; but other than that, rest looks good.
>
> Regards,
> Mridul
>
>
>
> On Mon, Jun 13, 2022 at 4:25 PM Tom Graves  
> wrote:
>>
>> +1
>>
>> Tom
>>
>> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk 
>>  wrote:
>>
>>
>> Please vote on releasing the following candidate as Apache Spark version 
>> 3.3.0.
>>
>> The vote is open until 11:59pm Pacific time June 14th and passes if a 
>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.3.0
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v3.3.0-rc6 (commit 
>> f74867bddfbcdd4d08076db36851e88b15e66556):
>> https://github.com/apache/spark/tree/v3.3.0-rc6
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1407
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>>
>> The list of bug fixes going into 3.3.0 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>
>> This release is using the release script of the tag v3.3.0-rc6.
>>
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC and see if anything important breaks, in the Java/Scala
>> you can add the staging repository to your projects resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with a out of date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.3.0?
>> ===
>> The current list of open tickets targeted at 3.3.0 can be found at:
>> https://issues.apache.org/jira/projects/SPARK and search for "Target 
>> Version/s" = 3.3.0
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>> Maxim Gekk
>>
>> Software Engineer
>>
>> Databricks, Inc.
>
>
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau




Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Cheng Su
+1 (non-binding).

Thanks,
Cheng Su

From: L. C. Hsieh 
Date: Monday, June 13, 2022 at 5:13 PM
To: dev 
Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)
+1

On Mon, Jun 13, 2022 at 5:07 PM Holden Karau  wrote:
>
> +1
>
> On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang  wrote:
>>
>> +1 (non-binding)
>>
>> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun  
>> wrote:
>>>
>>> +1
>>>
>>> Thanks,
>>> Dongjoon.
>>>
>>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth  wrote:

 +1 (non-binding)

 I repeated all checks I described for RC5:

 https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls

 Maxim, thank you for your dedication on these release candidates.

 Chris Nauroth


 On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan  
 wrote:
>
>
> +1
>
> Signatures, digests, etc check out fine.
> Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
>
> The test "SPARK-33084: Add jar support Ivy URI in SQL" in 
> sql.SQLQuerySuite fails; but other than that, rest looks good.
>
> Regards,
> Mridul
>
>
>
> On Mon, Jun 13, 2022 at 4:25 PM Tom Graves  
> wrote:
>>
>> +1
>>
>> Tom
>>
>> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk 
>>  wrote:
>>
>>
>> Please vote on releasing the following candidate as Apache Spark version 
>> 3.3.0.
>>
>> The vote is open until 11:59pm Pacific time June 14th and passes if a 
>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.3.0
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see 
>> http://spark.apache.org/
>>
>> The tag to be voted on is v3.3.0-rc6 (commit 
>> f74867bddfbcdd4d08076db36851e88b15e66556):
>> https://github.com/apache/spark/tree/v3.3.0-rc6
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1407
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>>
>> The list of bug fixes going into 3.3.0 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>
>> This release is using the release script of the tag v3.3.0-rc6.
>>
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC and see if anything important breaks, in the Java/Scala
>> you can add the staging repository to your projects resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with a out of date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.3.0?
>> ===
>> The current list of open tickets targeted at 3.3.0 can be found at:
>> https://issues.apache.org/jira/projects/SPARK
>>   and search for "Target Version/s" = 3.3.0
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>> Maxim Gekk
>>
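The testing FAQ quoted above asks reviewers to add the staging repository to their project's resolvers and build against the RC. A minimal sbt sketch of that setup, assuming an sbt 1.x project on Scala 2.12 (the resolver name and the choice of the spark-sql module are illustrative, not part of the vote mail):

  // build.sbt: pull the Spark 3.3.0 RC6 artifacts from the staging repository
  // given in the vote mail. The RC jars are published under the final 3.3.0
  // version number.
  ThisBuild / scalaVersion := "2.12.15"

  resolvers += "Apache Spark 3.3.0 RC6 staging" at "https://repository.apache.org/content/repositories/orgapachespark-1407/"

  libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.3.0"

Running an existing workload or test suite against these artifacts, and clearing the local Ivy/Coursier cache before and after, is what the FAQ describes.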

Re: [VOTE][SPIP] Spark Connect

2022-06-13 Thread Chao Sun
+1 (non-binding)

On Mon, Jun 13, 2022 at 5:11 PM Hyukjin Kwon  wrote:

> +1
>
> On Tue, 14 Jun 2022 at 08:50, Yuming Wang  wrote:
>
>> +1.
>>
>> On Tue, Jun 14, 2022 at 2:20 AM Matei Zaharia 
>> wrote:
>>
>>> +1, very excited about this direction.
>>>
>>> Matei
>>>
>>> On Jun 13, 2022, at 11:07 AM, Herman van Hovell <
>>> her...@databricks.com.INVALID> wrote:
>>>
>>> Let me kick off the voting...
>>>
>>> +1
>>>
>>> On Mon, Jun 13, 2022 at 2:02 PM Herman van Hovell 
>>> wrote:
>>>
 Hi all,

 I’d like to start a vote for SPIP: "Spark Connect"

 The goal of the SPIP is to introduce a Dataframe based client/server
 API for Spark

 Please also refer to:

 - Previous discussion in dev mailing list: [DISCUSS] SPIP: Spark
 Connect - A client and server interface for Apache Spark.
 
 - Design doc: Spark Connect - A client and server interface for Apache
 Spark.
 
 - JIRA: SPARK-39375 

 Please vote on the SPIP for the next 72 hours:

 [ ] +1: Accept the proposal as an official SPIP
 [ ] +0
 [ ] -1: I don’t think this is a good idea because …

 Kind Regards,
 Herman

>>>
>>>


Re: [VOTE][SPIP] Spark Connect

2022-06-13 Thread L. C. Hsieh
+1

On Mon, Jun 13, 2022 at 5:41 PM Chao Sun  wrote:
>
> +1 (non-binding)
>
> On Mon, Jun 13, 2022 at 5:11 PM Hyukjin Kwon  wrote:
>>
>> +1
>>
>> On Tue, 14 Jun 2022 at 08:50, Yuming Wang  wrote:
>>>
>>> +1.
>>>
>>> On Tue, Jun 14, 2022 at 2:20 AM Matei Zaharia  
>>> wrote:

 +1, very excited about this direction.

 Matei

 On Jun 13, 2022, at 11:07 AM, Herman van Hovell 
  wrote:

 Let me kick off the voting...

 +1

 On Mon, Jun 13, 2022 at 2:02 PM Herman van Hovell  
 wrote:
>
> Hi all,
>
> I’d like to start a vote for SPIP: "Spark Connect"
>
> The goal of the SPIP is to introduce a Dataframe based client/server API 
> for Spark
>
> Please also refer to:
>
> - Previous discussion in dev mailing list: [DISCUSS] SPIP: Spark Connect 
> - A client and server interface for Apache Spark.
> - Design doc: Spark Connect - A client and server interface for Apache 
> Spark.
> - JIRA: SPARK-39375
>
> Please vote on the SPIP for the next 72 hours:
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>
> Kind Regards,
> Herman



-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Chao Sun
+1 (non-binding)

Thanks,
Chao

On Mon, Jun 13, 2022 at 5:37 PM Cheng Su  wrote:

> +1 (non-binding).
>
>
>
> Thanks,
>
> Cheng Su
>
>
>
> *From: *L. C. Hsieh 
> *Date: *Monday, June 13, 2022 at 5:13 PM
> *To: *dev 
> *Subject: *Re: [VOTE] Release Spark 3.3.0 (RC6)
>
> +1
>
> On Mon, Jun 13, 2022 at 5:07 PM Holden Karau  wrote:
> >
> > +1
> >
> > On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang  wrote:
> >>
> >> +1 (non-binding)
> >>
> >> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun 
> wrote:
> >>>
> >>> +1
> >>>
> >>> Thanks,
> >>> Dongjoon.
> >>>
> >>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth 
> wrote:
> 
>  +1 (non-binding)
> 
>  I repeated all checks I described for RC5:
> 
>  https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls
> 
>  Maxim, thank you for your dedication on these release candidates.
> 
>  Chris Nauroth
> 
> 
>  On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan 
> wrote:
> >
> >
> > +1
> >
> > Signatures, digests, etc check out fine.
> > Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
> >
> > The test "SPARK-33084: Add jar support Ivy URI in SQL" in
> sql.SQLQuerySuite fails; but other than that, rest looks good.
> >
> > Regards,
> > Mridul
> >
> >
> >
> > On Mon, Jun 13, 2022 at 4:25 PM Tom Graves
>  wrote:
> >>
> >> +1
> >>
> >> Tom
> >>
> >> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk
>  wrote:
> >>
> >>
> >> Please vote on releasing the following candidate as Apache Spark
> version 3.3.0.
> >>
> >> The vote is open until 11:59pm Pacific time June 14th and passes if
> a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
> >>
> >> [ ] +1 Release this package as Apache Spark 3.3.0
> >> [ ] -1 Do not release this package because ...
> >>
> >> To learn more about Apache Spark, please see
> http://spark.apache.org/
> >>
> >> The tag to be voted on is v3.3.0-rc6 (commit
> f74867bddfbcdd4d08076db36851e88b15e66556):
> >> https://github.com/apache/spark/tree/v3.3.0-rc6
> >>
> >> The release files, including signatures, digests, etc. can be found
> at:
> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
> >>
> >> Signatures used for Spark RCs can be found in this file:
> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
> >>
> >> The staging repository for this release can be found at:
> >>
> https://repository.apache.org/content/repositories/orgapachespark-1407
> >>
> >> The documentation corresponding to this release can be found at:
> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
> >>
> >> The list of bug fixes going into 3.3.0 can be found at the
> following URL:
> >> https://issues.apache.org/jira/projects/SPARK/versions/12350369
> >>
> >> This release is using the release script of the tag v3.3.0-rc6.
> >>
> >>
> >> FAQ
> >>
> >> =
> >> How can I help test this release?
> >> =
> >> If you are a Spark user, you can help us test this release by taking
> >> an existing Spark workload and running on this release candidate,
> then
> >> reporting any regressions.
> >>
> >> If you're working in PySpark you can set up a virtual env and
> install
> >> the current RC and see if anything important breaks, in the
> Java/Scala
> >> you can add the staging repository to your projects resolvers and
> test
> >> with the RC (make sure to clean up the artifact cache before/after
> so
> >> you don't end up building with a out of date RC going forward).
> >>
> >> ===
> >> What should happen to JIRA tickets still targeting 3.3.0?
> >> ===
> >> The current list of open tickets targeted at 3.3.0 can be found at:
> >> https://issues.apache.org/jira/projects/SPARK  and search for
> "Target Version/s" = 3.3.0
> >>
> >> Committers should look at those and triage. Extremely important bug
> >> fixes, documentation, and API tweaks that impact compatibility
> should
> >> be worked on immediately. Everything else please retarget to an
> >> appropriate release.
> >>
> >> ==
> >> But my bug isn't fixed?
> >> ==
> >> In order to make timely releases, we will typically not hold the
> >> release unless the bug in question is a regression from the previous
> >> release. That being said, if there is something which is a
> regression
> >> that has not been correctly targeted please ping me or a committer
> to
> >> help target the issue.
> >>
> >> Maxim Gekk
> >>
> >> Software Engineer
> >>
> >> Databricks, Inc.
> >
> >
> >
> > --
> > Twitter: https://twitter.com/ho

Re: [VOTE][SPIP] Spark Connect

2022-06-13 Thread huaxin gao
+1

On Mon, Jun 13, 2022 at 5:42 PM L. C. Hsieh  wrote:

> +1
>
> On Mon, Jun 13, 2022 at 5:41 PM Chao Sun  wrote:
> >
> > +1 (non-binding)
> >
> > On Mon, Jun 13, 2022 at 5:11 PM Hyukjin Kwon 
> wrote:
> >>
> >> +1
> >>
> >> On Tue, 14 Jun 2022 at 08:50, Yuming Wang  wrote:
> >>>
> >>> +1.
> >>>
> >>> On Tue, Jun 14, 2022 at 2:20 AM Matei Zaharia 
> wrote:
> 
>  +1, very excited about this direction.
> 
>  Matei
> 
>  On Jun 13, 2022, at 11:07 AM, Herman van Hovell
>  wrote:
> 
>  Let me kick off the voting...
> 
>  +1
> 
>  On Mon, Jun 13, 2022 at 2:02 PM Herman van Hovell <
> her...@databricks.com> wrote:
> >
> > Hi all,
> >
> > I’d like to start a vote for SPIP: "Spark Connect"
> >
> > The goal of the SPIP is to introduce a Dataframe based client/server
> API for Spark
> >
> > Please also refer to:
> >
> > - Previous discussion in dev mailing list: [DISCUSS] SPIP: Spark
> Connect - A client and server interface for Apache Spark.
> > - Design doc: Spark Connect - A client and server interface for
> Apache Spark.
> > - JIRA: SPARK-39375
> >
> > Please vote on the SPIP for the next 72 hours:
> >
> > [ ] +1: Accept the proposal as an official SPIP
> > [ ] +0
> > [ ] -1: I don’t think this is a good idea because …
> >
> > Kind Regards,
> > Herman
> 
> 
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [VOTE][SPIP] Spark Connect

2022-06-13 Thread Ruifeng Zheng
+1




-- Original Message --
From: "huaxin gao" 


Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Ruifeng Zheng
+1 (non-binding)


Maxim, thank you for driving this release!


thanks,
ruifeng






-- Original Message --
From: "Chao Sun" ;
Sent: Tuesday, June 14, 2022, 8:45 AM
To: "Cheng Su";
Cc: "L. C. Hsieh";"dev";
Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)
https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls 
 
  Maxim, thank you for your dedication on these release 
candidates.
 
  Chris Nauroth
 
 
  On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan  wrote:
 >>
 >> To learn more about Apache Spark, please see http://spark.apache.org/
 >>
 >> The tag to be voted on is v3.3.0-rc6 (commit 
f74867bddfbcdd4d08076db36851e88b15e66556):
 >> https://github.com/apache/spark/tree/v3.3.0-rc6
 >>
 >> The release files, including signatures, digests, 
etc. can be found at:
 >> 
https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/ 
 >>
 >> Signatures used for Spark RCs can be found in this 
file:
 >> https://dist.apache.org/repos/dist/dev/spark/KEYS 
 >>
 >> The staging repository for this release can be found 
at:
 >> 
https://repository.apache.org/content/repositories/orgapachespark-1407 
 >>
 >> The documentation corresponding to this release can 
be found at:
 >> 
https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/ 
 >>
 >> The list of bug fixes going into 3.3.0 can be found 
at the following URL:
 >> 
https://issues.apache.org/jira/projects/SPARK/versions/12350369 
 >>
 >> This release is using the release script of the tag 
v3.3.0-rc6.
 >>
 >>
 >> FAQ
 >>
 >> =
 >> How can I help test this release?
 >> =
 >> If you are a Spark user, you can help us test this 
release by taking
 >> an existing Spark workload and running on this 
release candidate, then
 >> reporting any regressions.
 >>
 >> If you're working in PySpark you can set up a virtual 
env and install
 >> the current RC and see if anything important breaks, 
in the Java/Scala
 >> you can add the staging repository to your projects 
resolvers and test
 >> with the RC (make sure to clean up the artifact cache 
before/after so
 >> you don't end up building with a out of date RC going 
forward).
 >>
 >> ===
 >> What should happen to JIRA tickets still targeting 
3.3.0?
 >> ===
 >> The current list of open tickets targeted at 3.3.0 
can be found at:
 >> https://issues.apache.org/jira/projects/SPARK ; 
and search for "Target Version/s" = 3.3.0
 >>
 >> Committers should look at those and triage. Extremely 
important bug
 >> fixes, documentation, and API tweaks that impact 
compatibility should
 >> be worked on immediately. Everything else please 
retarget to an
 >> appropriate release.
 >>
 >> ==
 >> But my bug isn't fixed?
 >> ==
 >> In order to make timely releases, we will typically 
not hold the
 >> release unless the bug in question is a regression 
from the previous
 >> release. That being said, if there is something which 
is a regression
 >> that has not been correctly targeted please ping me 
or a committer to
 >> help target the issue.
 >>
 >> Maxim Gekk
 >>
 >> Software Engineer
 >>
 >> Databricks, Inc.
 >
 >
 >
 > --
 > Twitter: https://twitter.com/holdenkarau 
 > Books (Learning Spark, High Performance Spark, etc.): 
https://amzn.to/2MaRAG9 
 > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
 
 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread beliefer
+1. AFAIK, there are no blocking issues now.
Glad to hear that 3.3.0 is about to be released!




On 2022-06-14 09:38:35, "Ruifeng Zheng"  wrote:

+1 (non-binding)


Maxim, thank you for driving this release!


thanks,
ruifeng







-- Original Message --
From: "Chao Sun" ;
Sent: Tuesday, June 14, 2022, 8:45 AM
To: "Cheng Su";
Cc: "L. C. Hsieh";"dev";
Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)


+1 (non-binding)


Thanks,
Chao


On Mon, Jun 13, 2022 at 5:37 PM Cheng Su  wrote:


+1 (non-binding).

 

Thanks,

Cheng Su

 

From: L. C. Hsieh 
Date: Monday, June 13, 2022 at 5:13 PM
To: dev 
Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)

+1

On Mon, Jun 13, 2022 at 5:07 PM Holden Karau  wrote:
>
> +1
>
> On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang  wrote:
>>
>> +1 (non-binding)
>>
>> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun  
>> wrote:
>>>
>>> +1
>>>
>>> Thanks,
>>> Dongjoon.
>>>
>>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth  wrote:

 +1 (non-binding)

 I repeated all checks I described for RC5:

 https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls

 Maxim, thank you for your dedication on these release candidates.

 Chris Nauroth


 On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan  
 wrote:
>
>
> +1
>
> Signatures, digests, etc check out fine.
> Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
>
> The test "SPARK-33084: Add jar support Ivy URI in SQL" in 
> sql.SQLQuerySuite fails; but other than that, rest looks good.
>
> Regards,
> Mridul
>
>
>
> On Mon, Jun 13, 2022 at 4:25 PM Tom Graves  
> wrote:
>>
>> +1
>>
>> Tom
>>
>> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk 
>>  wrote:
>>
>>
>> Please vote on releasing the following candidate as Apache Spark version 
>> 3.3.0.
>>
>> The vote is open until 11:59pm Pacific time June 14th and passes if a 
>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.3.0
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v3.3.0-rc6 (commit 
>> f74867bddfbcdd4d08076db36851e88b15e66556):
>> https://github.com/apache/spark/tree/v3.3.0-rc6
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1407
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>>
>> The list of bug fixes going into 3.3.0 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>
>> This release is using the release script of the tag v3.3.0-rc6.
>>
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC and see if anything important breaks, in the Java/Scala
>> you can add the staging repository to your projects resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with a out of date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.3.0?
>> ===
>> The current list of open tickets targeted at 3.3.0 can be found at:
>> https://issues.apache.org/jira/projects/SPARK  and search for "Target 
>> Version/s" = 3.3.0
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>> Maxi

Re: Spark32 + Java 11 . Reading parquet java.lang.NoSuchMethodError: 'sun.misc.Cleaner sun.nio.ch.DirectBuffer.cleaner()'

2022-06-13 Thread Pralabh Kumar
Hi Steve / Dev team

Thanks for the help. A quick question: how can we fix the above error when
running on Hadoop 3.1?

   - The Spark 3.2 Dockerfile uses Java 11:
   
https://github.com/apache/spark/blob/branch-3.2/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile

   - So if we build Spark 3.2, the Spark image will have Java 11. If we run
   it against a Hadoop version lower than 3.2, it will throw this exception.



   - Should there be a separate Dockerfile for Spark 3.2 with Java 8 for
   Hadoop versions < 3.2? Spark 3.0.1 has Java 8 in its Dockerfile, which works
   fine in our environment (with Hadoop 3.1).


Regards
Pralabh Kumar



On Mon, Jun 13, 2022 at 3:25 PM Steve Loughran  wrote:

>
>
> On Mon, 13 Jun 2022 at 08:52, Pralabh Kumar 
> wrote:
>
>> Hi Dev team
>>
>> I have a spark32 image with Java 11 (Running Spark on K8s) .  While
>> reading a huge parquet file via  spark.read.parquet("") .  I am getting
>> the following error . The same error is mentioned in Spark docs
>> https://spark.apache.org/docs/latest/#downloading but w.r.t to apache
>> arrow.
>>
>>
>>- IMHO , I think the error is coming from Parquet 1.12.1  which is
>>based on Hadoop 2.10 which is not java 11 compatible.
>>
>>
> correct. see https://issues.apache.org/jira/browse/HADOOP-12760
>
>
> Please let me know if this understanding is correct and is there a way to
>> fix it.
>>
>
>
>
> upgrade to a version of hadoop with the fix. That's any version >= hadoop
> 3.2.0 which shipped since 2018
>
>>
>>
>> java.lang.NoSuchMethodError: 'sun.misc.Cleaner
>> sun.nio.ch.DirectBuffer.cleaner()'
>>
>> at
>> org.apache.hadoop.crypto.CryptoStreamUtils.freeDB(CryptoStreamUtils.java:41)
>>
>> at
>> org.apache.hadoop.crypto.CryptoInputStream.freeBuffers(CryptoInputStream.java:687)
>>
>> at
>> org.apache.hadoop.crypto.CryptoInputStream.close(CryptoInputStream.java:320)
>>
>> at java.base/java.io.FilterInputStream.close(Unknown Source)
>>
>> at
>> org.apache.parquet.hadoop.util.H2SeekableInputStream.close(H2SeekableInputStream.java:50)
>>
>> at
>> org.apache.parquet.hadoop.ParquetFileReader.close(ParquetFileReader.java:1299)
>>
>> at
>> org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:54)
>>
>> at
>> org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:44)
>>
>> at
>> org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.$anonfun$readParquetFootersInParallel$1(ParquetFileFormat.scala:467)
>>
>> at
>> org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:372)
>>
>> at
>> scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
>>
>> at scala.util.Success.$anonfun$map$1(Try.scala:255)
>>
>> at scala.util.Success.map(Try.scala:213)
>>
>> at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
>>
>> at
>> scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
>>
>> at
>> scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
>>
>> at
>> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown
>> Source)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown
>> Source)
>>
>> at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown
>> Source)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
>>
>> at
>> java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
>>
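For anyone double-checking their own images after this thread: the NoSuchMethodError above only shows up when a Java 11 runtime is paired with a Hadoop client older than 3.2, so a two-line check in spark-shell can confirm what a given image actually ships. This is an illustrative sketch, not something from the original messages:

  // Run in spark-shell inside the image: print the bundled Hadoop client
  // version and the JVM version. A Hadoop client >= 3.2.0 on Java 11 should
  // no longer call sun.nio.ch.DirectBuffer.cleaner().
  println(org.apache.hadoop.util.VersionInfo.getVersion())
  println(System.getProperty("java.version"))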
>


Re: Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Xiao Li
+1

Xiao

On Mon, Jun 13, 2022 at 8:04 PM beliefer  wrote:

> +1. AFAIK, there are no blocking issues now.
> Glad to hear that 3.3.0 is about to be released!
>
>
> On 2022-06-14 09:38:35, "Ruifeng Zheng"  wrote:
>
> +1 (non-binding)
>
> Maxim, thank you for driving this release!
>
> thanks,
> ruifeng
>
>
>
> -- Original Message --
> *From:* "Chao Sun" ;
> *Sent:* Tuesday, June 14, 2022, 8:45 AM
> *To:* "Cheng Su";
> *Cc:* "L. C. Hsieh";"dev";
> *Subject:* Re: [VOTE] Release Spark 3.3.0 (RC6)
>
> +1 (non-binding)
>
> Thanks,
> Chao
>
> On Mon, Jun 13, 2022 at 5:37 PM Cheng Su  wrote:
>
>> +1 (non-binding).
>>
>>
>>
>> Thanks,
>>
>> Cheng Su
>>
>>
>>
>> *From: *L. C. Hsieh 
>> *Date: *Monday, June 13, 2022 at 5:13 PM
>> *To: *dev 
>> *Subject: *Re: [VOTE] Release Spark 3.3.0 (RC6)
>>
>> +1
>>
>> On Mon, Jun 13, 2022 at 5:07 PM Holden Karau 
>> wrote:
>> >
>> > +1
>> >
>> > On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang  wrote:
>> >>
>> >> +1 (non-binding)
>> >>
>> >> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun 
>> wrote:
>> >>>
>> >>> +1
>> >>>
>> >>> Thanks,
>> >>> Dongjoon.
>> >>>
>> >>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth 
>> wrote:
>> 
>>  +1 (non-binding)
>> 
>>  I repeated all checks I described for RC5:
>> 
>>  https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls
>> 
>>  Maxim, thank you for your dedication on these release candidates.
>> 
>>  Chris Nauroth
>> 
>> 
>>  On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan <
>> mri...@gmail.com> wrote:
>> >
>> >
>> > +1
>> >
>> > Signatures, digests, etc check out fine.
>> > Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
>> >
>> > The test "SPARK-33084: Add jar support Ivy URI in SQL" in
>> sql.SQLQuerySuite fails; but other than that, rest looks good.
>> >
>> > Regards,
>> > Mridul
>> >
>> >
>> >
>> > On Mon, Jun 13, 2022 at 4:25 PM Tom Graves
>>  wrote:
>> >>
>> >> +1
>> >>
>> >> Tom
>> >>
>> >> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk
>>  wrote:
>> >>
>> >>
>> >> Please vote on releasing the following candidate as Apache Spark
>> version 3.3.0.
>> >>
>> >> The vote is open until 11:59pm Pacific time June 14th and passes
>> if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>> >>
>> >> [ ] +1 Release this package as Apache Spark 3.3.0
>> >> [ ] -1 Do not release this package because ...
>> >>
>> >> To learn more about Apache Spark, please see
>> http://spark.apache.org/
>> >>
>> >> The tag to be voted on is v3.3.0-rc6 (commit
>> f74867bddfbcdd4d08076db36851e88b15e66556):
>> >> https://github.com/apache/spark/tree/v3.3.0-rc6
>> >>
>> >> The release files, including signatures, digests, etc. can be
>> found at:
>> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>> >>
>> >> Signatures used for Spark RCs can be found in this file:
>> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >>
>> >> The staging repository for this release can be found at:
>> >>
>> https://repository.apache.org/content/repositories/orgapachespark-1407
>> >>
>> >> The documentation corresponding to this release can be found at:
>> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>> >>
>> >> The list of bug fixes going into 3.3.0 can be found at the
>> following URL:
>> >> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>> >>
>> >> This release is using the release script of the tag v3.3.0-rc6.
>> >>
>> >>
>> >> FAQ
>> >>
>> >> =
>> >> How can I help test this release?
>> >> =
>> >> If you are a Spark user, you can help us test this release by
>> taking
>> >> an existing Spark workload and running on this release candidate,
>> then
>> >> reporting any regressions.
>> >>
>> >> If you're working in PySpark you can set up a virtual env and
>> install
>> >> the current RC and see if anything important breaks, in the
>> Java/Scala
>> >> you can add the staging repository to your projects resolvers and
>> test
>> >> with the RC (make sure to clean up the artifact cache before/after
>> so
>> >> you don't end up building with a out of date RC going forward).
>> >>
>> >> ===
>> >> What should happen to JIRA tickets still targeting 3.3.0?
>> >> ===
>> >> The current list of open tickets targeted at 3.3.0 can be found at:
>> >> https://issues.apache.org/jira/projects/SPARK  and search for
>> "Target Version/s" = 3.3.0
>> >>
>> >> Committers should look at those and triage. Extremely important bug
>> >> fixes, documentation, and API tweaks that impact compatibility
>> should
>> >> be worked on immediately. Everything else please retarget

Re: Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Kent Yao
+1, non-binding

On Tue, Jun 14, 2022 at 1:11 PM Xiao Li  wrote:
>
> +1
>
> Xiao
>
> On Mon, Jun 13, 2022 at 8:04 PM beliefer  wrote:
>>
>> +1. AFAIK, there are no blocking issues now.
>> Glad to hear that 3.3.0 is about to be released!
>>
>>
>> On 2022-06-14 09:38:35, "Ruifeng Zheng"  wrote:
>>
>> +1 (non-binding)
>>
>> Maxim, thank you for driving this release!
>>
>> thanks,
>> ruifeng
>>
>>
>>
>> -- Original Message --
>> From: "Chao Sun" ;
>> Sent: Tuesday, June 14, 2022, 8:45 AM
>> To: "Cheng Su";
>> Cc: "L. C. Hsieh";"dev";
>> Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)
>>
>> +1 (non-binding)
>>
>> Thanks,
>> Chao
>>
>> On Mon, Jun 13, 2022 at 5:37 PM Cheng Su  wrote:
>>>
>>> +1 (non-binding).
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Cheng Su
>>>
>>>
>>>
>>> From: L. C. Hsieh 
>>> Date: Monday, June 13, 2022 at 5:13 PM
>>> To: dev 
>>> Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)
>>>
>>> +1
>>>
>>> On Mon, Jun 13, 2022 at 5:07 PM Holden Karau  wrote:
>>> >
>>> > +1
>>> >
>>> > On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang  wrote:
>>> >>
>>> >> +1 (non-binding)
>>> >>
>>> >> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun  
>>> >> wrote:
>>> >>>
>>> >>> +1
>>> >>>
>>> >>> Thanks,
>>> >>> Dongjoon.
>>> >>>
>>> >>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth  
>>> >>> wrote:
>>> 
>>>  +1 (non-binding)
>>> 
>>>  I repeated all checks I described for RC5:
>>> 
>>>  https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls
>>> 
>>>  Maxim, thank you for your dedication on these release candidates.
>>> 
>>>  Chris Nauroth
>>> 
>>> 
>>>  On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan  
>>>  wrote:
>>> >
>>> >
>>> > +1
>>> >
>>> > Signatures, digests, etc check out fine.
>>> > Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
>>> >
>>> > The test "SPARK-33084: Add jar support Ivy URI in SQL" in 
>>> > sql.SQLQuerySuite fails; but other than that, rest looks good.
>>> >
>>> > Regards,
>>> > Mridul
>>> >
>>> >
>>> >
>>> > On Mon, Jun 13, 2022 at 4:25 PM Tom Graves 
>>> >  wrote:
>>> >>
>>> >> +1
>>> >>
>>> >> Tom
>>> >>
>>> >> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk 
>>> >>  wrote:
>>> >>
>>> >>
>>> >> Please vote on releasing the following candidate as Apache Spark 
>>> >> version 3.3.0.
>>> >>
>>> >> The vote is open until 11:59pm Pacific time June 14th and passes if 
>>> >> a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>> >>
>>> >> [ ] +1 Release this package as Apache Spark 3.3.0
>>> >> [ ] -1 Do not release this package because ...
>>> >>
>>> >> To learn more about Apache Spark, please see http://spark.apache.org/
>>> >>
>>> >> The tag to be voted on is v3.3.0-rc6 (commit 
>>> >> f74867bddfbcdd4d08076db36851e88b15e66556):
>>> >> https://github.com/apache/spark/tree/v3.3.0-rc6
>>> >>
>>> >> The release files, including signatures, digests, etc. can be found 
>>> >> at:
>>> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>>> >>
>>> >> Signatures used for Spark RCs can be found in this file:
>>> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>> >>
>>> >> The staging repository for this release can be found at:
>>> >> https://repository.apache.org/content/repositories/orgapachespark-1407
>>> >>
>>> >> The documentation corresponding to this release can be found at:
>>> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>>> >>
>>> >> The list of bug fixes going into 3.3.0 can be found at the following 
>>> >> URL:
>>> >> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>> >>
>>> >> This release is using the release script of the tag v3.3.0-rc6.
>>> >>
>>> >>
>>> >> FAQ
>>> >>
>>> >> =
>>> >> How can I help test this release?
>>> >> =
>>> >> If you are a Spark user, you can help us test this release by taking
>>> >> an existing Spark workload and running on this release candidate, 
>>> >> then
>>> >> reporting any regressions.
>>> >>
>>> >> If you're working in PySpark you can set up a virtual env and install
>>> >> the current RC and see if anything important breaks, in the 
>>> >> Java/Scala
>>> >> you can add the staging repository to your projects resolvers and 
>>> >> test
>>> >> with the RC (make sure to clean up the artifact cache before/after so
>>> >> you don't end up building with a out of date RC going forward).
>>> >>
>>> >> ===
>>> >> What should happen to JIRA tickets still targeting 3.3.0?
>>> >> ===
>>> >> The current list of open tickets targeted at 3.3.0 can be found at:
>>> >> https://issues.apache.org/jira/projects/SPARK  and search fo

Re: Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread huaxin gao
+1 (non-binding)

On Mon, Jun 13, 2022 at 10:47 PM Kent Yao  wrote:

> +1, non-binding
>
> On Tue, Jun 14, 2022 at 1:11 PM Xiao Li  wrote:
> >
> > +1
> >
> > Xiao
> >
> > On Mon, Jun 13, 2022 at 8:04 PM beliefer  wrote:
> >>
> >> +1. AFAIK, there are no blocking issues now.
> >> Glad to hear that 3.3.0 is about to be released!
> >>
> >>
> >> On 2022-06-14 09:38:35, "Ruifeng Zheng"  wrote:
> >>
> >> +1 (non-binding)
> >>
> >> Maxim, thank you for driving this release!
> >>
> >> thanks,
> >> ruifeng
> >>
> >>
> >>
> >> -- Original Message --
> >> From: "Chao Sun" ;
> >> Sent: Tuesday, June 14, 2022, 8:45 AM
> >> To: "Cheng Su";
> >> Cc: "L. C. Hsieh";"dev";
> >> Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)
> >>
> >> +1 (non-binding)
> >>
> >> Thanks,
> >> Chao
> >>
> >> On Mon, Jun 13, 2022 at 5:37 PM Cheng Su 
> wrote:
> >>>
> >>> +1 (non-binding).
> >>>
> >>>
> >>>
> >>> Thanks,
> >>>
> >>> Cheng Su
> >>>
> >>>
> >>>
> >>> From: L. C. Hsieh 
> >>> Date: Monday, June 13, 2022 at 5:13 PM
> >>> To: dev 
> >>> Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)
> >>>
> >>> +1
> >>>
> >>> On Mon, Jun 13, 2022 at 5:07 PM Holden Karau 
> wrote:
> >>> >
> >>> > +1
> >>> >
> >>> > On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang 
> wrote:
> >>> >>
> >>> >> +1 (non-binding)
> >>> >>
> >>> >> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun <
> dongjoon.h...@gmail.com> wrote:
> >>> >>>
> >>> >>> +1
> >>> >>>
> >>> >>> Thanks,
> >>> >>> Dongjoon.
> >>> >>>
> >>> >>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth 
> wrote:
> >>> 
> >>>  +1 (non-binding)
> >>> 
> >>>  I repeated all checks I described for RC5:
> >>> 
> >>>  https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls
> >>> 
> >>>  Maxim, thank you for your dedication on these release candidates.
> >>> 
> >>>  Chris Nauroth
> >>> 
> >>> 
> >>>  On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan <
> mri...@gmail.com> wrote:
> >>> >
> >>> >
> >>> > +1
> >>> >
> >>> > Signatures, digests, etc check out fine.
> >>> > Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes
> >>> >
> >>> > The test "SPARK-33084: Add jar support Ivy URI in SQL" in
> sql.SQLQuerySuite fails; but other than that, rest looks good.
> >>> >
> >>> > Regards,
> >>> > Mridul
> >>> >
> >>> >
> >>> >
> >>> > On Mon, Jun 13, 2022 at 4:25 PM Tom Graves
>  wrote:
> >>> >>
> >>> >> +1
> >>> >>
> >>> >> Tom
> >>> >>
> >>> >> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk <
> maxim.g...@databricks.com.invalid> wrote:
> >>> >>
> >>> >>
> >>> >> Please vote on releasing the following candidate as Apache
> Spark version 3.3.0.
> >>> >>
> >>> >> The vote is open until 11:59pm Pacific time June 14th and
> passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
> >>> >>
> >>> >> [ ] +1 Release this package as Apache Spark 3.3.0
> >>> >> [ ] -1 Do not release this package because ...
> >>> >>
> >>> >> To learn more about Apache Spark, please see
> http://spark.apache.org/
> >>> >>
> >>> >> The tag to be voted on is v3.3.0-rc6 (commit
> f74867bddfbcdd4d08076db36851e88b15e66556):
> >>> >> https://github.com/apache/spark/tree/v3.3.0-rc6
> >>> >>
> >>> >> The release files, including signatures, digests, etc. can be
> found at:
> >>> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
> >>> >>
> >>> >> Signatures used for Spark RCs can be found in this file:
> >>> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
> >>> >>
> >>> >> The staging repository for this release can be found at:
> >>> >>
> https://repository.apache.org/content/repositories/orgapachespark-1407
> >>> >>
> >>> >> The documentation corresponding to this release can be found at:
> >>> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
> >>> >>
> >>> >> The list of bug fixes going into 3.3.0 can be found at the
> following URL:
> >>> >> https://issues.apache.org/jira/projects/SPARK/versions/12350369
> >>> >>
> >>> >> This release is using the release script of the tag v3.3.0-rc6.
> >>> >>
> >>> >>
> >>> >> FAQ
> >>> >>
> >>> >> =
> >>> >> How can I help test this release?
> >>> >> =
> >>> >> If you are a Spark user, you can help us test this release by
> taking
> >>> >> an existing Spark workload and running on this release
> candidate, then
> >>> >> reporting any regressions.
> >>> >>
> >>> >> If you're working in PySpark you can set up a virtual env and
> install
> >>> >> the current RC and see if anything important breaks, in the
> Java/Scala
> >>> >> you can add the staging repository to your projects resolvers
> and test
> >>> >> with the RC (make sure to clean up the artifact cache
> before/after so
> >>> >> you don't end up building with a out of date RC going forward).

Stickers and Swag

2022-06-13 Thread Xiao Li
Hi, all,

The ASF has an official store at RedBubble
 that Apache Community
Development (ComDev) runs. If you are interested in buying Spark Swag, 70
products featuring the Spark logo are available:
https://www.redbubble.com/shop/ap/113203780

Go Spark!

Xiao


Re: Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Jungtaek Lim
+1 (non-binding)

Checked signature and checksum. Confirmed SPARK-39412
 is resolved. Built
source tgz with JDK 11.

Thanks Max for driving the efforts of this huge release!

On Tue, Jun 14, 2022 at 2:51 PM huaxin gao  wrote:

> +1 (non-binding)
>
> On Mon, Jun 13, 2022 at 10:47 PM Kent Yao  wrote:
>
>> +1, non-binding
>>
>> On Tue, Jun 14, 2022 at 1:11 PM Xiao Li  wrote:
>> >
>> > +1
>> >
>> > Xiao
>> >
>> > On Mon, Jun 13, 2022 at 8:04 PM beliefer  wrote:
>> >>
>> >> +1. AFAIK, there are no blocking issues now.
>> >> Glad to hear that 3.3.0 is about to be released!
>> >>
>> >>
>> >> On 2022-06-14 09:38:35, "Ruifeng Zheng"  wrote:
>> >>
>> >> +1 (non-binding)
>> >>
>> >> Maxim, thank you for driving this release!
>> >>
>> >> thanks,
>> >> ruifeng
>> >>
>> >>
>> >>
>> >> -- Original Message --
>> >> From: "Chao Sun" ;
>> >> Sent: Tuesday, June 14, 2022, 8:45 AM
>> >> To: "Cheng Su";
>> >> Cc: "L. C. Hsieh";"dev";
>> >> Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)
>> >>
>> >> +1 (non-binding)
>> >>
>> >> Thanks,
>> >> Chao
>> >>
>> >> On Mon, Jun 13, 2022 at 5:37 PM Cheng Su 
>> wrote:
>> >>>
>> >>> +1 (non-binding).
>> >>>
>> >>>
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Cheng Su
>> >>>
>> >>>
>> >>>
>> >>> From: L. C. Hsieh 
>> >>> Date: Monday, June 13, 2022 at 5:13 PM
>> >>> To: dev 
>> >>> Subject: Re: [VOTE] Release Spark 3.3.0 (RC6)
>> >>>
>> >>> +1
>> >>>
>> >>> On Mon, Jun 13, 2022 at 5:07 PM Holden Karau 
>> wrote:
>> >>> >
>> >>> > +1
>> >>> >
>> >>> > On Mon, Jun 13, 2022 at 4:51 PM Yuming Wang 
>> wrote:
>> >>> >>
>> >>> >> +1 (non-binding)
>> >>> >>
>> >>> >> On Tue, Jun 14, 2022 at 7:41 AM Dongjoon Hyun <
>> dongjoon.h...@gmail.com> wrote:
>> >>> >>>
>> >>> >>> +1
>> >>> >>>
>> >>> >>> Thanks,
>> >>> >>> Dongjoon.
>> >>> >>>
>> >>> >>> On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth <
>> cnaur...@apache.org> wrote:
>> >>> 
>> >>>  +1 (non-binding)
>> >>> 
>> >>>  I repeated all checks I described for RC5:
>> >>> 
>> >>>  https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls
>> >>> 
>> >>>  Maxim, thank you for your dedication on these release candidates.
>> >>> 
>> >>>  Chris Nauroth
>> >>> 
>> >>> 
>> >>>  On Mon, Jun 13, 2022 at 3:21 PM Mridul Muralidharan <
>> mri...@gmail.com> wrote:
>> >>> >
>> >>> >
>> >>> > +1
>> >>> >
>> >>> > Signatures, digests, etc check out fine.
>> >>> > Checked out tag and build/tested with -Pyarn -Pmesos
>> -Pkubernetes
>> >>> >
>> >>> > The test "SPARK-33084: Add jar support Ivy URI in SQL" in
>> sql.SQLQuerySuite fails; but other than that, rest looks good.
>> >>> >
>> >>> > Regards,
>> >>> > Mridul
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Mon, Jun 13, 2022 at 4:25 PM Tom Graves
>>  wrote:
>> >>> >>
>> >>> >> +1
>> >>> >>
>> >>> >> Tom
>> >>> >>
>> >>> >> On Thursday, June 9, 2022, 11:27:50 PM CDT, Maxim Gekk <
>> maxim.g...@databricks.com.invalid> wrote:
>> >>> >>
>> >>> >>
>> >>> >> Please vote on releasing the following candidate as Apache
>> Spark version 3.3.0.
>> >>> >>
>> >>> >> The vote is open until 11:59pm Pacific time June 14th and
>> passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>> >>> >>
>> >>> >> [ ] +1 Release this package as Apache Spark 3.3.0
>> >>> >> [ ] -1 Do not release this package because ...
>> >>> >>
>> >>> >> To learn more about Apache Spark, please see
>> http://spark.apache.org/
>> >>> >>
>> >>> >> The tag to be voted on is v3.3.0-rc6 (commit
>> f74867bddfbcdd4d08076db36851e88b15e66556):
>> >>> >> https://github.com/apache/spark/tree/v3.3.0-rc6
>> >>> >>
>> >>> >> The release files, including signatures, digests, etc. can be
>> found at:
>> >>> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-bin/
>> >>> >>
>> >>> >> Signatures used for Spark RCs can be found in this file:
>> >>> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >>> >>
>> >>> >> The staging repository for this release can be found at:
>> >>> >>
>> https://repository.apache.org/content/repositories/orgapachespark-1407
>> >>> >>
>> >>> >> The documentation corresponding to this release can be found
>> at:
>> >>> >> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc6-docs/
>> >>> >>
>> >>> >> The list of bug fixes going into 3.3.0 can be found at the
>> following URL:
>> >>> >>
>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>> >>> >>
>> >>> >> This release is using the release script of the tag v3.3.0-rc6.
>> >>> >>
>> >>> >>
>> >>> >> FAQ
>> >>> >>
>> >>> >> =
>> >>> >> How can I help test this release?
>> >>> >> =
>> >>> >> If you are a Spark user, you can help us test this release by
>> taking
>> >>> >> an existing Spark workload and running on this release
>> candidate,