[spark] branch master updated (c01dad46c77 -> 0f8218c4324)
This is an automated email from the ASF dual-hosted git repository.

yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from c01dad46c77 [MINOR][SQL] Simplify the method resolveExprsAndAddMissingAttrs
     add 0f8218c4324 [SPARK-42916][SQL] JDBCTableCatalog Keeps Char/Varchar meta on the read-side

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/jdbc/MySQLIntegrationSuite.scala     |  2 +-
 .../spark/sql/jdbc/PostgresIntegrationSuite.scala  |  6 +++--
 .../sql/execution/datasources/jdbc/JdbcUtils.scala | 14 ++-
 .../org/apache/spark/sql/jdbc/MySQLDialect.scala   |  3 +++
 .../apache/spark/sql/jdbc/PostgresDialect.scala    |  8 +--
 .../v2/jdbc/JDBCTableCatalogSuite.scala            | 19 +--
 .../org/apache/spark/sql/jdbc/JDBCSuite.scala      | 27 +-
 7 files changed, 55 insertions(+), 24 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
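The digest above carries only the diffstat, but the read-side behavior named in the subject can be sanity-checked from user code: after this change, `CHAR`/`VARCHAR` length information reported by the JDBC dialect should survive into the loaded schema's field metadata instead of collapsing to a bare `StringType`. A minimal sketch, assuming a reachable PostgreSQL instance (the URL, table, and credentials below are placeholders, and the exact metadata key is Spark-internal):

```scala
// Hypothetical source table: CREATE TABLE people (name VARCHAR(32), initial CHAR(1))
val df = spark.read.format("jdbc")
  .option("url", "jdbc:postgresql://localhost:5432/testdb")
  .option("dbtable", "people")
  .option("user", "test")
  .option("password", "test")
  .load()

// Spark surfaces char/varchar info through field metadata; printing it is enough
// to confirm the length annotation is no longer dropped on the read side.
df.schema.fields.foreach(f => println(s"${f.name}: ${f.dataType} ${f.metadata}"))
```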
[spark] branch master updated: [MINOR][SQL] Simplify the method resolveExprsAndAddMissingAttrs
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c01dad46c77 [MINOR][SQL] Simplify the method resolveExprsAndAddMissingAttrs
c01dad46c77 is described below

commit c01dad46c773146b27b8ffc09fdf83b09edefec1
Author: Gengliang Wang
AuthorDate: Wed Apr 12 22:04:31 2023 -0700

    [MINOR][SQL] Simplify the method resolveExprsAndAddMissingAttrs

    ### What changes were proposed in this pull request?

    The method `resolveExprsAndAddMissingAttrs` contains redundant code: getting the `newExprs` and `newChild` shows up 4 times in different branches. This PR simplifies the implementation of the method.

    ### Why are the changes needed?

    Code cleanup; removes redundant code.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Existing tests

    Closes #40761 from gengliangwang/cleanup.

    Lead-authored-by: Gengliang Wang
    Co-authored-by: Gengliang Wang
    Signed-off-by: Gengliang Wang
---
 .../catalyst/analysis/ColumnResolutionHelper.scala | 59 +++---
 1 file changed, 29 insertions(+), 30 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala
index ba550bce791..c5634278490 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala
@@ -50,40 +50,39 @@ trait ColumnResolutionHelper extends Logging {
       (exprs, plan)
     } else {
       plan match {
-        case p: Project =>
-          // Resolving expressions against current plan.
-          val maybeResolvedExprs = exprs.map(resolveExpressionByPlanOutput(_, p))
-          // Recursively resolving expressions on the child of current plan.
-          val (newExprs, newChild) = resolveExprsAndAddMissingAttrs(maybeResolvedExprs, p.child)
-          // If some attributes used by expressions are resolvable only on the rewritten child
-          // plan, we need to add them into original projection.
-          val missingAttrs = (AttributeSet(newExprs) -- p.outputSet).intersect(newChild.outputSet)
-          (newExprs, Project(p.projectList ++ missingAttrs, newChild))
-
-        case a @ Aggregate(groupExprs, aggExprs, child) =>
-          val maybeResolvedExprs = exprs.map(resolveExpressionByPlanOutput(_, a))
-          val (newExprs, newChild) = resolveExprsAndAddMissingAttrs(maybeResolvedExprs, child)
-          val missingAttrs = (AttributeSet(newExprs) -- a.outputSet).intersect(newChild.outputSet)
-          if (missingAttrs.forall(attr => groupExprs.exists(_.semanticEquals(attr)))) {
-            // All the missing attributes are grouping expressions, valid case.
-            (newExprs, a.copy(aggregateExpressions = aggExprs ++ missingAttrs, child = newChild))
-          } else {
-            // Need to add non-grouping attributes, invalid case.
-            (exprs, a)
-          }
-
-        case g: Generate =>
-          val maybeResolvedExprs = exprs.map(resolveExpressionByPlanOutput(_, g))
-          val (newExprs, newChild) = resolveExprsAndAddMissingAttrs(maybeResolvedExprs, g.child)
-          (newExprs, g.copy(unrequiredChildIndex = Nil, child = newChild))
-
         // For `Distinct` and `SubqueryAlias`, we can't recursively resolve and add attributes
         // via its children.
         case u: UnaryNode if !u.isInstanceOf[Distinct] && !u.isInstanceOf[SubqueryAlias] =>
-          val maybeResolvedExprs = exprs.map(resolveExpressionByPlanOutput(_, u))
-          val (newExprs, newChild) = resolveExprsAndAddMissingAttrs(maybeResolvedExprs, u.child)
-          (newExprs, u.withNewChildren(Seq(newChild)))
+          val (newExprs, newChild) = {
+            // Resolving expressions against current plan.
+            val maybeResolvedExprs = exprs.map(resolveExpressionByPlanOutput(_, u))
+            // Recursively resolving expressions on the child of current plan.
+            resolveExprsAndAddMissingAttrs(maybeResolvedExprs, u.child)
+          }
+          // If some attributes used by expressions are resolvable only on the rewritten child
+          // plan, we need to add them into original projection.
+          lazy val missingAttrs =
+            (AttributeSet(newExprs) -- u.outputSet).intersect(newChild.outputSet)
+          u match {
+            case p: Project =>
+              (newExprs, Project(p.projectList ++ missingAttrs, newChild))
+
+            case a @ Aggregate(groupExprs, aggExprs, child) =>
+              if (missingAttrs.forall(attr =>
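The one subtle point in this refactor is the `lazy val`: `missingAttrs` is needed only in some of the inner branches, so deferring it means the fallthrough paths never compute the attribute-set intersection at all, just as before the change. A stripped-down sketch of the same "hoist shared work, defer conditional work" pattern, with made-up types standing in for the plan nodes (not Spark code):

```scala
// Four branches used to repeat the same prologue; hoist it once, and make the
// conditionally-needed value lazy so branches that never read it never compute it.
def resolve(exprs: Seq[String], childOutput: Seq[String]): (Seq[String], Seq[String]) = {
  val newExprs = exprs.map(_.trim)                            // shared step, previously duplicated
  lazy val missing = newExprs.filterNot(childOutput.contains) // computed only when forced
  newExprs match {
    case es if es.nonEmpty => (es, childOutput ++ missing)    // this branch forces `missing`
    case es                => (es, childOutput)               // this one never does
  }
}
```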
[spark] branch master updated (a45affe3c8e -> 69abf14b966)
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from a45affe3c8e [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null
     add 69abf14b966 [SPARK-43115][CONNECT][PS][TESTS] Split pyspark-pandas-connect from pyspark-connect module

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml |  2 ++
 dev/sparktestsupport/modules.py      | 15 +++
 dev/sparktestsupport/utils.py        | 13 -
 3 files changed, 25 insertions(+), 5 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new a45affe3c8e [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null
a45affe3c8e is described below

commit a45affe3c8e7a724aea7dbbc1af08e36001c7540
Author: Yikf
AuthorDate: Thu Apr 13 10:15:14 2023 +0800

    [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null

    ### What changes were proposed in this pull request?

    `df.show` should print NULL instead of null when handling null values, so that the behavior is consistent. For example, the behavior is currently inconsistent:

    ``` shell
    scala> spark.sql("select decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle') as result").show(false)
    +------+
    |result|
    +------+
    |null  |
    +------+
    ```

    ``` shell
    spark-sql> DESC FUNCTION EXTENDED decode;
    function_desc
    Function: decode
    Class: org.apache.spark.sql.catalyst.expressions.Decode
    Usage:
        decode(bin, charset) - Decodes the first argument using the second argument character set.
        decode(expr, search, result [, search, result ] ... [, default]) - Compares expr to each search value in order. If expr is equal to a search value, decode returns the corresponding result. If no match is found, then it returns default. If default is omitted, it returns null.
    Extended Usage:
        Examples:
          > SELECT decode(encode('abc', 'utf-8'), 'utf-8');
           abc
          > SELECT decode(2, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
           San Francisco
          > SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
           Non domestic
          > SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle');
           NULL
        Since: 3.2.0
    Time taken: 0.074 seconds, Fetched 4 row(s)
    ```

    ``` shell
    spark-sql> select decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle');
    NULL
    ```

    ### Why are the changes needed?

    Keep `df.show` consistent with the spark-sql CLI when handling `null`.

    ### Does this PR introduce _any_ user-facing change?

    Yes, null values will display as NULL instead of null.

    ### How was this patch tested?

    GA

    Closes #40699 from Yikf/show-NULL.

    Authored-by: Yikf
    Signed-off-by: Wenchen Fan
---
 python/pyspark/ml/feature.py                       |  2 +-
 python/pyspark/pandas/frame.py                     | 20 ++---
 python/pyspark/sql/column.py                       |  2 +-
 python/pyspark/sql/dataframe.py                    | 68 -
 python/pyspark/sql/functions.py                    | 86 +++---
 python/pyspark/sql/readwriter.py                   | 10 +--
 .../sql/tests/connect/test_connect_basic.py        | 38 +-
 .../sql/tests/connect/test_connect_column.py       | 36 -
 .../sql/tests/connect/test_connect_function.py     | 62 
 .../spark/sql/catalyst/expressions/Cast.scala      | 22 +++---
 .../sql/catalyst/expressions/CastSuiteBase.scala   | 10 +--
 .../main/scala/org/apache/spark/sql/Dataset.scala  | 10 +--
 .../scala/org/apache/spark/sql/DatasetSuite.scala  |  2 +-
 13 files changed, 184 insertions(+), 184 deletions(-)

diff --git a/python/pyspark/ml/feature.py b/python/pyspark/ml/feature.py
index ff7aaf71f9c..e7ec35bffa0 100755
--- a/python/pyspark/ml/feature.py
+++ b/python/pyspark/ml/feature.py
@@ -5313,7 +5313,7 @@ class VectorAssembler(
     +---+---+----+-------------+
     |  a|  b|   c|     features|
     +---+---+----+-------------+
-    |1.0|2.0|null|[1.0,2.0,NaN]|
+    |1.0|2.0|NULL|[1.0,2.0,NaN]|
     |3.0|NaN| 4.0|[3.0,NaN,4.0]|
     |5.0|6.0| 7.0|[5.0,6.0,7.0]|
     +---+---+----+-------------+

diff --git a/python/pyspark/pandas/frame.py b/python/pyspark/pandas/frame.py
index 1f81f0addf9..8bddcb6bae8 100644
--- a/python/pyspark/pandas/frame.py
+++ b/python/pyspark/pandas/frame.py
@@ -1530,7 +1530,7 @@ class DataFrame(Frame, Generic[T]):
         # +---+---+----+
         # |  A|  B|   C|
         # +---+---+----+
         # |  1|  2| 3.0|
-        # |  4|  1|null|
+        # |  4|  1|NULL|
         # +---+---+----+

         pair_scols: List[GenericColumn] = []
@@ -1560,10 +1560,10 @@ class DataFrame(Frame, Generic[T]):
         # |  2|  2|3.0| 3.0|
         # |  0|  0|4.0| 4.0|
         # |  0|  1|4.0| 1.0|
-        # |  0|  2| null|
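Seen from user code, the change is a one-line difference in `show` output. A minimal check in spark-shell (Spark built from master after this commit; table output abbreviated):

```scala
// A one-column DataFrame containing a null; `show` now renders it as NULL.
val df = Seq(Some("Apache Spark"), None).toDF("name")
df.show()
// +------------+
// |        name|
// +------------+
// |Apache Spark|
// |        NULL|   <- previously printed `null`
// +------------+
```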
[spark] branch master updated: [SPARK-43110][SQL] Move asIntegral to PhysicalDataType
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new a3839d8cf2e [SPARK-43110][SQL] Move asIntegral to PhysicalDataType
a3839d8cf2e is described below

commit a3839d8cf2e4d17c930cf3902538e39b58779e88
Author: Rui Wang
AuthorDate: Thu Apr 13 09:53:01 2023 +0800

    [SPARK-43110][SQL] Move asIntegral to PhysicalDataType

    ### What changes were proposed in this pull request?

    This PR proposes moving asIntegral to PhysicalDataType. This simplifies the DataType class, making it a simple interface that is not coupled to internal representations.

    ### Why are the changes needed?

    To make DataType a simpler interface; non-public code can be moved outside of the DataType class.

    ### Does this PR introduce _any_ user-facing change?

    NO

    ### How was this patch tested?

    UT

    Closes #40758 from amaliujia/catalyst_datatype_refactor_5.

    Authored-by: Rui Wang
    Signed-off-by: Wenchen Fan
---
 .../spark/sql/catalyst/expressions/arithmetic.scala      | 6 +++---
 .../spark/sql/catalyst/types/PhysicalDataType.scala      | 4 
 .../org/apache/spark/sql/types/AbstractDataType.scala    | 4 +---
 .../org/apache/spark/sql/types/DecimalType.scala         | 1 -
 .../org/apache/spark/sql/types/DoubleType.scala          | 1 -
 .../org/apache/spark/sql/types/FloatType.scala           | 1 -
 6 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
index 3fbe7269cb7..31d4d71cd40 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
@@ -886,8 +886,8 @@ case class IntegralDivide(
     val integral = left.dataType match {
       case i: IntegralType =>
         PhysicalIntegralType.integral(i)
-      case d: DecimalType =>
-        d.asIntegral.asInstanceOf[Integral[Any]]
+      case DecimalType.Fixed(p, s) =>
+        PhysicalDecimalType(p, s).asIntegral.asInstanceOf[Integral[Any]]
       case _: YearMonthIntervalType =>
         PhysicalIntegerType.integral.asInstanceOf[Integral[Any]]
       case _: DayTimeIntervalType =>
@@ -981,7 +981,7 @@ case class Remainder(
       (left, right) => integral.rem(left, right)
     case d @ DecimalType.Fixed(precision, scale) =>
-      val integral = d.asIntegral.asInstanceOf[Integral[Any]]
+      val integral = PhysicalDecimalType(precision, scale).asIntegral.asInstanceOf[Integral[Any]]
       (left, right) =>
         checkDecimalOverflow(integral.rem(left, right).asInstanceOf[Decimal], precision, scale)
     }

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala
index e7e9a2aa83b..b6e0cd88f08 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala
@@ -89,6 +89,7 @@ object PhysicalNumericType {
 sealed abstract class PhysicalFractionalType extends PhysicalNumericType {
   private[sql] val fractional: Fractional[InternalType]
+  private[sql] val asIntegral: Integral[InternalType]
 }

 object PhysicalFractionalType {
@@ -160,6 +161,7 @@ case class PhysicalDecimalType(precision: Int, scale: Int) extends PhysicalFractionalType {
   private[sql] val numeric = Decimal.DecimalIsFractional
   override private[sql] def exactNumeric = DecimalExactNumeric
   private[sql] val fractional = Decimal.DecimalIsFractional
+  private[sql] val asIntegral = Decimal.DecimalAsIfIntegral
 }

 case object PhysicalDecimalType {
@@ -179,6 +181,7 @@ class PhysicalDoubleType() extends PhysicalFractionalType with PhysicalPrimitiveType {
   private[sql] val numeric = implicitly[Numeric[Double]]
   override private[sql] def exactNumeric = DoubleExactNumeric
   private[sql] val fractional = implicitly[Fractional[Double]]
+  private[sql] val asIntegral = DoubleType.DoubleAsIfIntegral
 }

 case object PhysicalDoubleType extends PhysicalDoubleType
@@ -193,6 +196,7 @@ class PhysicalFloatType() extends PhysicalFractionalType with PhysicalPrimitiveType {
   private[sql] val numeric = implicitly[Numeric[Float]]
   override private[sql] def exactNumeric = FloatExactNumeric
   private[sql] val fractional = implicitly[Fractional[Float]]
+  private[sql] val asIntegral = FloatType.FloatAsIfIntegral
 }

 case object PhysicalFloatType extends PhysicalFloatType

diff --git
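The direction of this series of refactors is to keep the public `DataType` hierarchy as a thin descriptor and host the operational typeclass instances (`numeric`, `fractional`, `asIntegral`, ...) on the internal physical types. A self-contained sketch of the pattern with invented names (these are not Spark's classes):

```scala
// Public-facing type descriptor: no numeric machinery attached.
sealed trait LogicalType
case object IntTy extends LogicalType

// Internal mirror that carries the typeclass instances the engine needs.
sealed trait PhysicalType[T] {
  def asIntegral: Integral[T]
}
case object PhysicalInt extends PhysicalType[Int] {
  // Standard-library Integral instance for Int.
  val asIntegral: Integral[Int] = Numeric.IntIsIntegral
}

// Call sites that used to reach into the logical type now dispatch physically:
def integralFor(t: LogicalType): Integral[_] = t match {
  case IntTy => PhysicalInt.asIntegral
}
```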
[spark] branch master updated: [SPARK-42656][FOLLOWUP] `chmod+x` for `connector/connect/bin/spark-connect-scala-client-classpath` script
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 549c43a8553 [SPARK-42656][FOLLOWUP] `chmod+x` for `connector/connect/bin/spark-connect-scala-client-classpath` script
549c43a8553 is described below

commit 549c43a8553461e5e35f6c3f7d597c7322705170
Author: Juliusz Sompolski
AuthorDate: Thu Apr 13 09:34:59 2023 +0900

    [SPARK-42656][FOLLOWUP] `chmod+x` for `connector/connect/bin/spark-connect-scala-client-classpath` script

    ### What changes were proposed in this pull request?

    Make the script introduced in https://github.com/apache/spark/pull/40676 runnable.

    ### Why are the changes needed?

    Somehow the chmod didn't commit with the previous PR, which only became apparent when I pulled back from master...

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manual dev again...

    Closes #40757 from juliuszsompolski/spark-connect-scala-client-classpath-chmod.

    Authored-by: Juliusz Sompolski
    Signed-off-by: Hyukjin Kwon
---
 connector/connect/bin/spark-connect-scala-client-classpath | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/connector/connect/bin/spark-connect-scala-client-classpath b/connector/connect/bin/spark-connect-scala-client-classpath
old mode 100644
new mode 100755

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (2931993e059 -> 76bd695084c)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 2931993e059 [SPARK-42437][PYTHON][CONNECT] PySpark catalog.cacheTable will allow to specify storage level
     add 76bd695084c [SPARK-43031][SS][CONNECT] Enable unit test and doctest for streaming

No new revisions were added by this update.

Summary of changes:
 dev/sparktestsupport/modules.py                     |   5 +
 python/pyspark/sql/connect/streaming/query.py       |  35 +-
 python/pyspark/sql/connect/streaming/readwriter.py  |  40 ++-
 python/pyspark/sql/dataframe.py                     |   2 +-
 python/pyspark/sql/streaming/query.py               |  14 +-
 python/pyspark/sql/streaming/readwriter.py          |  31 +-
 .../connect/streaming/test_parity_streaming.py     |  68 
 .../pyspark/sql/tests/streaming/test_streaming.py   | 370 +++--
 .../sql/tests/streaming/test_streaming_foreach.py   | 297 +
 .../tests/streaming/test_streaming_foreachBatch.py  | 102 ++
 10 files changed, 603 insertions(+), 361 deletions(-)
 create mode 100644 python/pyspark/sql/tests/connect/streaming/test_parity_streaming.py
 create mode 100644 python/pyspark/sql/tests/streaming/test_streaming_foreach.py
 create mode 100644 python/pyspark/sql/tests/streaming/test_streaming_foreachBatch.py

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (dabd771c37b -> 2931993e059)
This is an automated email from the ASF dual-hosted git repository.

ueshin pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from dabd771c37b [SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()`
     add 2931993e059 [SPARK-42437][PYTHON][CONNECT] PySpark catalog.cacheTable will allow to specify storage level

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/catalog.py              | 23 ---
 python/pyspark/sql/connect/catalog.py      |  5 +++--
 python/pyspark/sql/connect/plan.py         | 23 +--
 python/pyspark/sql/tests/test_catalog.py   | 26 --
 python/pyspark/sql/tests/test_dataframe.py | 16 +++-
 5 files changed, 71 insertions(+), 22 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
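This change brings PySpark in line with the storage-level overload the Scala `Catalog` API already exposes. For reference, the pre-existing Scala counterpart (table name and storage level below are arbitrary):

```scala
import org.apache.spark.storage.StorageLevel

// Cache a temp view at an explicit storage level instead of the default MEMORY_AND_DISK.
spark.range(10).createOrReplaceTempView("t")
spark.catalog.cacheTable("t", StorageLevel.DISK_ONLY)
```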
[spark] branch master updated: [SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new dabd771c37b [SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()`
dabd771c37b is described below

commit dabd771c37be9cbd773b5223d8c78226ece84f8a
Author: Max Gekk
AuthorDate: Wed Apr 12 16:02:29 2023 +0300

    [SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()`

    ### What changes were proposed in this pull request?

    In the PR, I propose a new AES mode for the `aes_encrypt()`/`aes_decrypt()` functions - `CBC` ([Cipher Block Chaining](https://www.ibm.com/docs/en/linux-on-systems?topic=operation-cipher-block-chaining-cbc-mode)) with the padding `PKCS7(5)`. The `aes_encrypt()` function returns a binary value which consists of the following fields:
    1. The salt magic prefix `Salted__` with the length of 8 bytes.
    2. A salt generated per every `aes_encrypt()` call using `java.security.SecureRandom`. Its length is 8 bytes.
    3. The encrypted input.

    The encrypt function derives the secret key and initialization vector (16 bytes) from the salt and the user's key using the same algorithm as OpenSSL's `EVP_BytesToKey()` (versions >= 1.1.0c). The `aes_decrypt()` function assumes that its input has the fields shown above. For example:
    ```sql
    spark-sql> SELECT base64(aes_encrypt('Apache Spark', '', 'CBC', 'PKCS'));
    U2FsdGVkX1/ERGxwEOTDpDD4bQvDtQaNe+gXGudCcUk=
    spark-sql> SELECT aes_decrypt(unbase64('U2FsdGVkX1/ERGxwEOTDpDD4bQvDtQaNe+gXGudCcUk='), '', 'CBC', 'PKCS');
    Apache Spark
    ```

    ### Why are the changes needed?

    To achieve feature parity with other systems/frameworks, and make the migration process from them to Spark SQL easier. For example, the `CBC` mode is supported by:
    - BigQuery: https://cloud.google.com/bigquery/docs/reference/standard-sql/aead-encryption-concepts#block_cipher_modes
    - Snowflake: https://docs.snowflake.com/en/sql-reference/functions/encrypt.html

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    By running new checks:
    ```
    $ build/sbt "sql/testOnly *QueryExecutionErrorsSuite"
    $ build/sbt "sql/test:testOnly org.apache.spark.sql.expressions.ExpressionInfoSuite"
    $ build/sbt "test:testOnly org.apache.spark.sql.MiscFunctionsSuite"
    $ build/sbt "core/testOnly *SparkThrowableSuite"
    ```
    and checked compatibility with LibreSSL/OpenSSL:
    ```
    $ openssl version
    LibreSSL 3.3.6
    $ echo -n 'Apache Spark' | openssl enc -e -aes-128-cbc -pass pass: -a
    U2FsdGVkX1+5GyAmmG7wDWWDBAuUuxjMy++cMFytpls=
    ```
    ```sql
    spark-sql (default)> SELECT aes_decrypt(unbase64('U2FsdGVkX1+5GyAmmG7wDWWDBAuUuxjMy++cMFytpls='), '', 'CBC');
    Apache Spark
    ```
    decrypting Spark's output by OpenSSL:
    ```sql
    spark-sql (default)> SELECT base64(aes_encrypt('Apache Spark', 'abcdefghijklmnop12345678ABCDEFGH', 'CBC', 'PKCS'));
    U2FsdGVkX1+maU2vmxrulgxXuQSyZ3ODnlHKqnt2fDA=
    ```
    ```
    $ echo 'U2FsdGVkX1+maU2vmxrulgxXuQSyZ3ODnlHKqnt2fDA=' | openssl aes-256-cbc -a -d -pass pass:abcdefghijklmnop12345678ABCDEFGH
    Apache Spark
    ```

    Closes #40704 from MaxGekk/aes-cbc.

    Authored-by: Max Gekk
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json   |  5 ++
 .../catalyst/expressions/ExpressionImplUtils.java  | 72 ++
 .../spark/sql/catalyst/expressions/misc.scala      | 16 +++--
 .../spark/sql/errors/QueryExecutionErrors.scala    |  9 +++
 .../org/apache/spark/sql/MiscFunctionsSuite.scala  | 33 +-
 .../sql/errors/QueryExecutionErrorsSuite.scala     | 31 --
 6 files changed, 141 insertions(+), 25 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index ae73071a120..1edf625fdc3 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -978,6 +978,11 @@
         "expects a binary value with 16, 24 or 32 bytes, but got bytes."
       ]
     },
+    "AES_SALTED_MAGIC" : {
+      "message" : [
+        "Initial bytes from input do not match 'Salted__' (0x53616C7465645F5F)."
+      ]
+    },
     "PATTERN" : {
       "message" : [
         "."

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtils.java b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtils.java
index a6e482db57b..680ad11ad73 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtils.java
+++
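A quick round trip of the new mode from Scala (the key is a made-up 16-byte example; output abbreviated). One consequence of the per-call random salt described above: encrypting the same plaintext twice with the same key yields different ciphertexts in CBC mode, so tests should compare decrypted values rather than ciphertext bytes.

```scala
// aes_decrypt returns BINARY, so cast back to STRING to recover the plaintext.
val key = "abcdefghijklmnop" // 16 bytes -> AES-128
val df = spark.sql(
  s"""SELECT CAST(aes_decrypt(aes_encrypt('Apache Spark', '$key', 'CBC'),
     |                        '$key', 'CBC') AS STRING) AS roundtrip""".stripMargin)
df.show(false)
// +------------+
// |roundtrip   |
// +------------+
// |Apache Spark|
// +------------+
```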
[spark] branch master updated (f8751e2afeb -> 74d840c247a)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from f8751e2afeb [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode
     add 74d840c247a [SPARK-43103][SQL] Moving Integral to PhysicalDataType

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/Cast.scala      | 12 +-
 .../sql/catalyst/expressions/arithmetic.scala      | 10 
 .../expressions/collectionOperations.scala         |  7 +++---
 .../sql/catalyst/types/PhysicalDataType.scala      | 28 ++
 .../apache/spark/sql/types/AbstractDataType.scala  |  4 +---
 .../org/apache/spark/sql/types/ByteType.scala      |  1 -
 .../org/apache/spark/sql/types/IntegerType.scala   |  3 ---
 .../org/apache/spark/sql/types/LongType.scala      |  1 -
 .../org/apache/spark/sql/types/ShortType.scala     |  1 -
 9 files changed, 39 insertions(+), 28 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (7e2c6c7ab23 -> f8751e2afeb)
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 7e2c6c7ab23 [SPARK-42985][CONNECT][PYTHON] Fix createDataFrame to respect the SQL configs
     add f8751e2afeb [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode

No new revisions were added by this update.

Summary of changes:
 .../src/main/protobuf/spark/connect/base.proto      |   3 +
 .../src/main/protobuf/spark/connect/commands.proto  |   9 ++
 .../src/main/protobuf/spark/connect/common.proto    |  10 ++
 .../sql/connect/planner/SparkConnectPlanner.scala   |  27 
 .../tests/connect/test_parity_torch_distributor.py  |  59 +
 python/pyspark/ml/torch/distributor.py              |  89 +
 python/pyspark/ml/torch/tests/test_distributor.py   |  30 -
 python/pyspark/sql/connect/client.py                |  16 +++
 python/pyspark/sql/connect/proto/base_pb2.py        | 108 
 python/pyspark/sql/connect/proto/base_pb2.pyi       |  13 ++
 python/pyspark/sql/connect/proto/commands_pb2.py    | 144 ++---
 python/pyspark/sql/connect/proto/commands_pb2.pyi   |  68 ++
 python/pyspark/sql/connect/proto/common_pb2.py      |  16 ++-
 python/pyspark/sql/connect/proto/common_pb2.pyi     |  30 +
 14 files changed, 462 insertions(+), 160 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (631ee6706e6 -> 7e2c6c7ab23)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 631ee6706e6 [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names
     add 7e2c6c7ab23 [SPARK-42985][CONNECT][PYTHON] Fix createDataFrame to respect the SQL configs

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/connect/conversion.py            | 36 +-
 python/pyspark/sql/connect/session.py               | 43 +-
 .../pyspark/sql/tests/connect/test_parity_arrow.py  |  5 ---
 .../pyspark/sql/tests/connect/test_parity_types.py  | 13 ++-
 python/pyspark/sql/tests/test_types.py              | 30 +++
 5 files changed, 80 insertions(+), 47 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (a31ac0492a5 -> 631ee6706e6)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from a31ac0492a5 [SPARK-43039][SQL] Support custom fields in the file source _metadata column
     add 631ee6706e6 [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/connect/client/SparkResult.scala      |  12 ++-
 .../service/SparkConnectStreamHandler.scala         |  36 ++-
 python/pyspark/sql/connect/client.py                |   5 +-
 python/pyspark/sql/connect/conversion.py            | 111 +
 python/pyspark/sql/tests/test_dataframe.py          |  20 
 5 files changed, 135 insertions(+), 49 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org