[spark] branch master updated: [SPARK-35991][SQL][FOLLOWUP] Add back protected modifier of sparkConf to TPCBase
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new e83f8a8  [SPARK-35991][SQL][FOLLOWUP] Add back protected modifier of sparkConf to TPCBase
e83f8a8 is described below

commit e83f8a872a16d4f049cefb1fc445f91cf84443ad
Author: Angerszh
AuthorDate: Tue Aug 24 11:32:30 2021 +0900

    [SPARK-35991][SQL][FOLLOWUP] Add back protected modifier of sparkConf to TPCBase

    ### What changes were proposed in this pull request?
    Add back the `protected` modifier of `sparkConf` in `TPCBase`, as requested in
    https://github.com/apache/spark/pull/33736/files#r694054229.

    ### Why are the changes needed?
    Restore the `protected` modifier of `sparkConf` so the override keeps the visibility of the parent declaration.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Not needed

    Closes #33813 from AngersZh/SPARK-35991-FOLLOWUP.

    Authored-by: Angerszh
    Signed-off-by: Hyukjin Kwon
---
 sql/core/src/test/scala/org/apache/spark/sql/TPCBase.scala             | 2 +-
 sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala     | 2 +-
 sql/core/src/test/scala/org/apache/spark/sql/TPCDSQueryTestSuite.scala | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/TPCBase.scala b/sql/core/src/test/scala/org/apache/spark/sql/TPCBase.scala
index b1ea70d..1764584 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/TPCBase.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/TPCBase.scala
@@ -25,7 +25,7 @@ trait TPCBase extends SharedSparkSession {
 
   protected def injectStats: Boolean = false
 
-  override def sparkConf: SparkConf = {
+  override protected def sparkConf: SparkConf = {
     if (injectStats) {
       super.sparkConf
         .set(SQLConf.MAX_TO_STRING_FIELDS, Int.MaxValue)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala
index cab117c..22e1b83 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala
@@ -73,6 +73,6 @@ class TPCDSQueryWithStatsSuite extends TPCDSQuerySuite {
 
 @ExtendedSQLTest
 class TPCDSQueryANSISuite extends TPCDSQuerySuite {
-  override def sparkConf: SparkConf =
+  override protected def sparkConf: SparkConf =
     super.sparkConf.set(SQLConf.ANSI_ENABLED, true)
 }

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/TPCDSQueryTestSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/TPCDSQueryTestSuite.scala
index 0f16d25..3e7f898 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/TPCDSQueryTestSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/TPCDSQueryTestSuite.scala
@@ -57,7 +57,7 @@ class TPCDSQueryTestSuite extends QueryTest with TPCDSBase with SQLQueryTestHelp
 
   private val regenerateGoldenFiles = sys.env.get("SPARK_GENERATE_GOLDEN_FILES").exists(_ == "1")
 
   // To make output results deterministic
-  override def sparkConf: SparkConf = super.sparkConf
+  override protected def sparkConf: SparkConf = super.sparkConf
     .set(SQLConf.SHUFFLE_PARTITIONS.key, "1")
 
   protected override def createSparkSession: TestSparkSession = {

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
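Background on why the modifier matters: in both Scala and Java, an override may keep the inherited visibility or widen it, so omitting `protected` on the override silently turns the member public. A minimal Java sketch of the same rule, using hypothetical `BaseSuite`/`StatsSuite` classes rather than the actual Spark test suites:

```java
// Sketch of override visibility rules (hypothetical classes, not the
// real Spark suites). The parent declares the configuration hook as
// protected; the subclass keeps `protected` on its override, which
// preserves the parent's visibility instead of widening it to public.
class BaseSuite {
    protected String sparkConf() {
        return "base";
    }
}

class StatsSuite extends BaseSuite {
    @Override
    protected String sparkConf() {
        // Extend the parent's configuration, mirroring the
        // `super.sparkConf.set(...)` pattern in the Scala diff above.
        return super.sparkConf() + ",stats";
    }
}

public class OverrideVisibilityDemo {
    public static void main(String[] args) {
        // Callable here only because we are in the same package;
        // external code cannot see the protected method.
        System.out.println(new StatsSuite().sparkConf()); // prints base,stats
    }
}
```

Declaring the override `public` would also compile (widening is legal), which is exactly how the modifier was lost in the first place; the follow-up commit simply restores the narrower declaration.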
[spark] branch master updated (cd23426 -> fa53aa0)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from cd23426  [SPARK-34952][SQL][FOLLOWUP] Move aggregates to a separate package
 add fa53aa0  [SPARK-36560][PYTHON][INFRA] Deflake PySpark coverage job

No new revisions were added by this update.

Summary of changes:
 python/pyspark/mllib/tests/test_streaming_algorithms.py | 2 +-
 python/pyspark/sql/tests/test_streaming.py              | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
[spark] branch master updated (9f595c4 -> cd23426)
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 9f595c4  [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing
 add cd23426  [SPARK-34952][SQL][FOLLOWUP] Move aggregates to a separate package

No new revisions were added by this update.

Summary of changes:
 .../sql/connector/expressions/{ => aggregate}/AggregateFunc.java     | 7 ++++---
 .../sql/connector/expressions/{ => aggregate}/Aggregation.java       | 7 ++++---
 .../spark/sql/connector/expressions/{ => aggregate}/Count.java       | 3 ++-
 .../spark/sql/connector/expressions/{ => aggregate}/CountStar.java   | 2 +-
 .../spark/sql/connector/expressions/{ => aggregate}/Max.java         | 3 ++-
 .../spark/sql/connector/expressions/{ => aggregate}/Min.java         | 3 ++-
 .../spark/sql/connector/expressions/{ => aggregate}/Sum.java         | 3 ++-
 .../spark/sql/connector/read/SupportsPushDownAggregates.java         | 2 +-
 .../scala/org/apache/spark/sql/execution/DataSourceScanExec.scala    | 2 +-
 .../spark/sql/execution/datasources/DataSourceStrategy.scala         | 3 ++-
 .../org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala    | 2 +-
 .../apache/spark/sql/execution/datasources/v2/PushDownUtils.scala    | 3 ++-
 .../sql/execution/datasources/v2/V2ScanRelationPushDown.scala        | 2 +-
 .../spark/sql/execution/datasources/v2/jdbc/JDBCScanBuilder.scala    | 2 +-
 14 files changed, 26 insertions(+), 18 deletions(-)
 rename sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/{ => aggregate}/AggregateFunc.java (89%)
 rename sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/{ => aggregate}/Aggregation.java (91%)
 rename sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/{ => aggregate}/Count.java (92%)
 rename sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/{ => aggregate}/CountStar.java (94%)
 rename sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/{ => aggregate}/Max.java (91%)
 rename sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/{ => aggregate}/Min.java (91%)
 rename sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/{ => aggregate}/Sum.java (92%)
[spark] branch branch-3.2 updated: [SPARK-34952][SQL][FOLLOWUP] Move aggregates to a separate package
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new e48de78  [SPARK-34952][SQL][FOLLOWUP] Move aggregates to a separate package
e48de78 is described below

commit e48de7884d218e2f156ee09031b8c9b05e7a2933
Author: Huaxin Gao
AuthorDate: Mon Aug 23 15:31:13 2021 -0700

    [SPARK-34952][SQL][FOLLOWUP] Move aggregates to a separate package

    ### What changes were proposed in this pull request?
    Add an `aggregate` package under `sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions` and move all the aggregates (e.g. `Count`, `Max`, `Min`, etc.) there.

    ### Why are the changes needed?
    Right now these aggregates live directly under `sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions`. That is fine today, but we plan to add a new `filter` package under `expressions` for all the DSV2 filters, and it would look inconsistent for filters to have their own package while aggregates don't.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Existing tests

    Closes #33815 from huaxingao/agg_package.

    Authored-by: Huaxin Gao
    Signed-off-by: Liang-Chi Hsieh
    (cherry picked from commit cd2342691d1182b14f6076f69793441d2aa03e85)
    Signed-off-by: Liang-Chi Hsieh
---
 .../sql/connector/expressions/{ => aggregate}/AggregateFunc.java     | 7 ++++---
 .../sql/connector/expressions/{ => aggregate}/Aggregation.java       | 7 ++++---
 .../spark/sql/connector/expressions/{ => aggregate}/Count.java       | 3 ++-
 .../spark/sql/connector/expressions/{ => aggregate}/CountStar.java   | 2 +-
 .../spark/sql/connector/expressions/{ => aggregate}/Max.java         | 3 ++-
 .../spark/sql/connector/expressions/{ => aggregate}/Min.java         | 3 ++-
 .../spark/sql/connector/expressions/{ => aggregate}/Sum.java         | 3 ++-
 .../spark/sql/connector/read/SupportsPushDownAggregates.java         | 2 +-
 .../scala/org/apache/spark/sql/execution/DataSourceScanExec.scala    | 2 +-
 .../spark/sql/execution/datasources/DataSourceStrategy.scala         | 3 ++-
 .../org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala    | 2 +-
 .../apache/spark/sql/execution/datasources/v2/PushDownUtils.scala    | 3 ++-
 .../sql/execution/datasources/v2/V2ScanRelationPushDown.scala        | 2 +-
 .../spark/sql/execution/datasources/v2/jdbc/JDBCScanBuilder.scala    | 2 +-
 14 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/AggregateFunc.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/AggregateFunc.java
similarity index 89%
rename from sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/AggregateFunc.java
rename to sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/AggregateFunc.java
index eea8c31..6683f73 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/AggregateFunc.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/AggregateFunc.java
@@ -15,12 +15,13 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.connector.expressions;
-
-import org.apache.spark.annotation.Evolving;
+package org.apache.spark.sql.connector.expressions.aggregate;
 
 import java.io.Serializable;
 
+import org.apache.spark.annotation.Evolving;
+import org.apache.spark.sql.connector.expressions.Expression;
+
 /**
  * Base class of the Aggregate Functions.
  *

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Aggregation.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/Aggregation.java
similarity index 91%
rename from sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Aggregation.java
rename to sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/Aggregation.java
index 8eb3491..0392523 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Aggregation.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/Aggregation.java
@@ -15,12 +15,13 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.connector.expressions;
-
-import org.apache.spark.annotation.Evolving;
+package org.apache.spark.sql.connector.expressions.aggregate;
 
 import java.io.Serializable;
 
+import org.apache.spark.annotation.Evolving;
+import org.apache.spark.sql.connector.expressions.FieldReference;
+
 /**
  * Aggregation in SQL statement.
  *

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/Count.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/Count.java
similarity index 92%
[spark] branch master updated: [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 9f595c4  [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing
9f595c4 is described below

commit 9f595c4ce34728f5d8f943eadea8d85a548b2d41
Author: Max Gekk
AuthorDate: Mon Aug 23 13:07:37 2021 +0300

    [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing

    ### What changes were proposed in this pull request?
    In this PR, I propose to update the SQL migration guide about the changes introduced by the PRs https://github.com/apache/spark/pull/33709 and https://github.com/apache/spark/pull/33769.

    Screenshot of the generated page:
    https://user-images.githubusercontent.com/1580697/130419710-640f20b3-6a38-4eb1-a6d6-2e069dc5665c.png

    ### Why are the changes needed?
    To inform users about the upcoming changes in parsing datetime strings. This should help users migrate to the new release.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    By generating the doc and checking it by eye:
    ```
    $ SKIP_API=1 SKIP_RDOC=1 SKIP_PYTHONDOC=1 SKIP_SCALADOC=1 bundle exec jekyll build
    ```

    Closes #33809 from MaxGekk/datetime-cast-migr-guide.

    Authored-by: Max Gekk
    Signed-off-by: Max Gekk
---
 docs/sql-migration-guide.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 7ad384f..47e7921 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -26,6 +26,26 @@ license: |
 
   - Since Spark 3.3, Spark turns a non-nullable schema into nullable for API `DataFrameReader.schema(schema: StructType).json(jsonDataset: Dataset[String])` and `DataFrameReader.schema(schema: StructType).csv(csvDataset: Dataset[String])` when the schema is specified by the user and contains non-nullable fields.
 
+  - Since Spark 3.3, when the date or timestamp pattern is not specified, Spark converts an input string to a date/timestamp using the `CAST` expression approach. The changes affect CSV/JSON datasources and parsing of partition values. In Spark 3.2 or earlier, when the date or timestamp pattern is not set, Spark uses the default patterns: `yyyy-MM-dd` for dates and `yyyy-MM-dd HH:mm:ss` for timestamps. After the changes, Spark still recognizes the pattern together with
+
+    Date patterns:
+      * `[+-]yyyy*`
+      * `[+-]yyyy-[m]m`
+      * `[+-]yyyy-[m]m-[d]d`
+      * `[+-]yyyy-[m]m-[d]d `
+      * `[+-]yyyy-[m]m-[d]d *`
+      * `[+-]yyyy-[m]m-[d]dT*`
+
+    Timestamp patterns:
+      * `[+-]yyyy*`
+      * `[+-]yyyy-[m]m`
+      * `[+-]yyyy-[m]m-[d]d`
+      * `[+-]yyyy-[m]m-[d]d `
+      * `[+-]yyyy-[m]m-[d]d [h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+      * `[+-]yyyy-[m]m-[d]dT[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+      * `[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+      * `T[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+
 ## Upgrading from Spark SQL 3.1 to 3.2
 
 - Since Spark 3.2, ADD FILE/JAR/ARCHIVE commands require each path to be enclosed by `"` or `'` if the path contains whitespaces.
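To make the lenient forms in the guide concrete, here is a small self-contained Java sketch of the `[+-]yyyy-[m]m-[d]d` date shape: an optional sign, a year of four or more digits, and one- or two-digit month and day fields. The regex is a rough approximation for illustration only, not Spark's actual CAST parser:

```java
import java.util.regex.Pattern;

public class LenientDateShape {
    // Approximates `[+-]yyyy-[m]m-[d]d` from the migration guide:
    // optional sign, >= 4-digit year, 1-2 digit month, 1-2 digit day.
    // Illustrative only; Spark's real parser also validates field ranges.
    static final Pattern DATE =
        Pattern.compile("[+-]?\\d{4,}-\\d{1,2}-\\d{1,2}");

    public static void main(String[] args) {
        System.out.println(DATE.matcher("2021-8-1").matches());   // true: 1-digit month/day accepted
        System.out.println(DATE.matcher("2021-08-01").matches()); // true: 2-digit forms accepted
        System.out.println(DATE.matcher("21-08-01").matches());   // false: year must be >= 4 digits
    }
}
```

The point of the sketch is that the bracketed fields in the guide's notation mark optional leading digits, which is why both `2021-8-1` and `2021-08-01` parse under the new CAST-based behavior.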
[GitHub] [spark-website] cloud-fan commented on a change in pull request #356: Improve the guideline of Preparing gpg key
cloud-fan commented on a change in pull request #356:
URL: https://github.com/apache/spark-website/pull/356#discussion_r693773768


##########
File path: release-process.md
##########
@@ -39,15 +39,90 @@ If you are a new Release Manager, you can read up on the process from the follow
 
 You can skip this section if you have already uploaded your key.
 
-After generating the gpg key, you need to upload your key to a public key server. Please refer to
-https://www.apache.org/dev/openpgp.html#generate-key
-for details.
+Generate Key
 
-If you want to do the release on another machine, you can transfer your secret key to that machine
-via the `gpg --export-secret-keys` and `gpg --import` commands.
+Here's an example with gpg 2.0.12. If you use the gpg 1.x series, please refer to
+https://www.apache.org/dev/openpgp.html#generate-key for details.
+
+```
+:::console
+$ gpg --full-gen-key
+gpg (GnuPG) 2.0.12; Copyright (C) 2009 Free Software Foundation, Inc.
+This is free software: you are free to change and redistribute it.
+There is NO WARRANTY, to the extent permitted by law.
+
+Please select what kind of key you want:
+   (1) RSA and RSA (default)
+   (2) DSA and Elgamal
+   (3) DSA (sign only)
+   (4) RSA (sign only)
+Your selection? 1
+RSA keys may be between 1024 and 4096 bits long.
+What keysize do you want? (2048) 4096
+Requested keysize is 4096 bits
+Please specify how long the key should be valid.
+         0 = key does not expire
+      <n>  = key expires in n days
+      <n>w = key expires in n weeks
+      <n>m = key expires in n months
+      <n>y = key expires in n years
+Key is valid for? (0)
+Key does not expire at all
+Is this correct? (y/N) y
+
+GnuPG needs to construct a user ID to identify your key.
+
+Real name: Robert Burrell Donkin
+Email address: rdon...@apache.org
+Comment: CODE SIGNING KEY
+You selected this USER-ID:
+    "Robert Burrell Donkin (CODE SIGNING KEY) "
+
+Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
+We need to generate a lot of random bytes. It is a good idea to perform
+some other action (type on the keyboard, move the mouse, utilize the
+disks) during the prime generation; this gives the random number
+generator a better chance to gain enough entropy.
+gpg: key 04B3B5C426A27D33 marked as ultimately trusted
+gpg: revocation certificate stored as '/home/ubuntu/.gnupg/openpgp-revocs.d/08071B1E23C8A7E2CA1E891A04B3B5C426A27D33.rev'
+public and secret key created and signed.
+
+pub   rsa4096 2021-08-19 [SC]
+      08071B1E23C8A7E2CA1E891A04B3B5C426A27D33
+uid           Jack (test)
+sub   rsa4096 2021-08-19 [E]
+```
+
+Note that the last 8 digits (26A27D33) of the public key fingerprint are the
+key ID (https://infra.apache.org/release-signing.html#key-id).
 
-The last step is to update the KEYS file with your code signing key
-https://www.apache.org/dev/openpgp.html#export-public-key
+Upload Key
+
+After generating the public key, we should upload it to a public key server
+(https://infra.apache.org/release-signing.html#keyserver). You can upload it
+either with the gpg command:
+
+```
+$ gpg --keyserver keys.openpgp.org --send-key 26A27D33
+```
+
+or by copy-pasting the ASCII-armored public key into the OpenPGP Keyserver
+(http://keyserver.ubuntu.com:11371/#submitKey). The ASCII-armored public key
+can be generated by:
+
+```
+:::console
+$ gpg --export --armor 26A27D33
+```
+
+Please refer to https://infra.apache.org/release-signing.html#keyserver-upload for details.
+
+Update KEYS file with your code signing key
+
+The code signing key is exactly the same as the ASCII-armored public key mentioned above.
+You should append it to KEYS (https://dist.apache.org/repos/dist/dev/spark/KEYS) by:

Review comment:
```suggestion
You should append it to the KEYS file by:
```
It doesn't seem necessary to add a URL for `KEYS`; people need to run the svn command below anyway.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
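As the thread above notes, the short key ID is simply the tail of the full fingerprint: the last 8 hex digits. A tiny Java sketch of that relationship, using the fingerprint from the example gpg output:

```java
public class KeyIdDemo {
    public static void main(String[] args) {
        // Full 40-hex-digit fingerprint from the example gpg output above.
        String fingerprint = "08071B1E23C8A7E2CA1E891A04B3B5C426A27D33";
        // The short key ID is the last 8 hex digits of the fingerprint.
        String shortKeyId = fingerprint.substring(fingerprint.length() - 8);
        System.out.println(shortKeyId); // prints 26A27D33
    }
}
```

This matches the `26A27D33` argument passed to `gpg --send-key` and `gpg --export --armor` in the review above (short key IDs are convenient but collision-prone, which is why ASF docs also publish the full fingerprint).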
[spark] branch master updated (0b6af46 -> adc485a)
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 0b6af46  [SPARK-36470][PYTHON] Implement `CategoricalIndex.map` and `DatetimeIndex.map`
 add adc485a  [MINOR][DOCS] Mention Hadoop 3 in YARN introduction on cluster-overview.md

No new revisions were added by this update.

Summary of changes:
 docs/cluster-overview.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)