[spark] branch master updated: [SPARK-38173][SQL] Quoted column cannot be recognized correctly when quotedRegexColumnNa…

2022-02-15 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1ef5638  [SPARK-38173][SQL] Quoted column cannot be recognized 
correctly when quotedRegexColumnNa…
1ef5638 is described below

commit 1ef5638177dcf06ebca4e9b0bc88401e0fce2ae8
Author: TongWeii 
AuthorDate: Wed Feb 16 12:40:59 2022 +0800

[SPARK-38173][SQL] Quoted column cannot be recognized correctly when 
quotedRegexColumnNa…

### What changes were proposed in this pull request?

Bug fix.

### Why are the changes needed?

When spark.sql.parser.quotedRegexColumnNames=true
```
SELECT `(C3)?+.+`,`C1` * C2 FROM (SELECT 3 AS C1,2 AS C2,1 AS C3) T;
```
The above query throws an exception:
```
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply'
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:370)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:266)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
  at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:44)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:266)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:261)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
  at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:275)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'multiply'
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:50)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:49)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:155)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:1700)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:1671)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:342)
  at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:342)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$1(TreeNode.scala:339)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:408)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:406)
  at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:359)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:339)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.expandStarExpression(Analyzer.scala:1671)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.$anonfun$buildExpandedProjectList$1(Analyzer.scala:1656)
```
It works fine in Hive, because Hive treats a pattern consisting only of letters, digits, and "_" as a plain column name rather than a regex:
```
  /**
   * Returns whether
```

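A minimal Scala sketch of the check described above (hypothetical helper, not Spark's or Hive's actual code): a quoted identifier is expanded as a regex only if it contains a character outside letters, digits, and "_".
```
// Hypothetical illustration of the behavior described above.
def isRegexPattern(pattern: String): Boolean =
  pattern.exists(c => !c.isLetterOrDigit && c != '_')

isRegexPattern("C1")       // false -> treated as a plain column reference
isRegexPattern("(C3)?+.+") // true  -> expanded as a regex over column names
```
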
[spark] branch master updated: [SPARK-38201][K8S] Fix `uploadFileToHadoopCompatibleFS` to use `delSrc` and `overwrite` parameters

2022-02-15 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 69859f8  [SPARK-38201][K8S] Fix `uploadFileToHadoopCompatibleFS` to 
use `delSrc` and `overwrite` parameters
69859f8 is described below

commit 69859f81e5b13952b6e37fa4d51b1b4dbe19e5bc
Author: yangjie01 
AuthorDate: Tue Feb 15 19:49:39 2022 -0800

[SPARK-38201][K8S] Fix `uploadFileToHadoopCompatibleFS` to use `delSrc` and 
`overwrite` parameters

### What changes were proposed in this pull request?
`KubernetesUtils#uploadFileToHadoopCompatibleFS` defines the input parameters `delSrc` and `overwrite`, but the constants `false` and `true` are passed when invoking `FileSystem.copyFromLocalFile(boolean delSrc, boolean overwrite, Path src, Path dst)`. This PR changes the call to pass the caller's `delSrc` and `overwrite` through to `copyFromLocalFile`.
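
For reference, a minimal sketch of the intended parameter flow, mirroring the one-line fix in the diff below (the comments describe Hadoop's documented `copyFromLocalFile` semantics):
```
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch only: the caller's flags must reach copyFromLocalFile instead of
// the previously hard-coded constants (false, true).
def uploadSketch(src: Path, dest: Path, fs: FileSystem,
    delSrc: Boolean = false, overwrite: Boolean = true): Unit = {
  // delSrc = true deletes the local source after a successful copy;
  // overwrite = false makes the copy fail if dest already exists.
  fs.copyFromLocalFile(delSrc, overwrite, src, dest)
}
```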

### Why are the changes needed?
Bug fix

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GA and add a new test case.

Closes #35509 from LuciferYang/SPARK-38201.

Lead-authored-by: yangjie01 
Co-authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/deploy/k8s/KubernetesUtils.scala  |  2 +-
 .../spark/deploy/k8s/KubernetesUtilsSuite.scala| 66 +-
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala
index 0c8d964..a05d07a 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala
@@ -344,7 +344,7 @@ object KubernetesUtils extends Logging {
       delSrc : Boolean = false,
       overwrite: Boolean = true): Unit = {
     try {
-      fs.copyFromLocalFile(false, true, src, dest)
+      fs.copyFromLocalFile(delSrc, overwrite, src, dest)
     } catch {
       case e: IOException =>
         throw new SparkException(s"Error uploading file ${src.getName}", e)
diff --git a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/KubernetesUtilsSuite.scala b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/KubernetesUtilsSuite.scala
index ef57a4b..5498238 100644
--- a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/KubernetesUtilsSuite.scala
+++ b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/KubernetesUtilsSuite.scala
@@ -17,13 +17,20 @@
 
 package org.apache.spark.deploy.k8s
 
+import java.io.File
+import java.nio.charset.StandardCharsets
+
 import scala.collection.JavaConverters._
 
 import io.fabric8.kubernetes.api.model.{ContainerBuilder, PodBuilder}
+import org.apache.commons.io.FileUtils
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.scalatest.PrivateMethodTester
 
-import org.apache.spark.SparkFunSuite
+import org.apache.spark.{SparkException, SparkFunSuite}
 
-class KubernetesUtilsSuite extends SparkFunSuite {
+class KubernetesUtilsSuite extends SparkFunSuite with PrivateMethodTester {
   private val HOST = "test-host"
   private val POD = new PodBuilder()
     .withNewSpec()
@@ -65,4 +72,59 @@ class KubernetesUtilsSuite extends SparkFunSuite {
     assert(sparkPodWithNoContainerName.pod.getSpec.getHostname == HOST)
     assert(sparkPodWithNoContainerName.container.getName == null)
   }
+
+  test("SPARK-38201: check uploadFileToHadoopCompatibleFS with different delSrc and overwrite") {
+    withTempDir { srcDir =>
+      withTempDir { destDir =>
+        val upload = PrivateMethod[Unit](Symbol("uploadFileToHadoopCompatibleFS"))
+        val fileName = "test.txt"
+        val srcFile = new File(srcDir, fileName)
+        val src = new Path(srcFile.getAbsolutePath)
+        val dest = new Path(destDir.getAbsolutePath, fileName)
+        val fs = src.getFileSystem(new Configuration())
+
+        def checkUploadException(delSrc: Boolean, overwrite: Boolean): Unit = {
+          val message = intercept[SparkException] {
+            KubernetesUtils.invokePrivate(upload(src, dest, fs, delSrc, overwrite))
+          }.getMessage
+          assert(message.contains("Error uploading file"))
+        }
+
+        def appendFileAndUpload(content: String, delSrc: Boolean, overwrite: Boolean): Unit = {
+          FileUtils.write(srcFile, content, StandardCharsets.UTF_8, true)
+          KubernetesUtils.invokePrivate(upload(src, dest, fs, delSrc, overwrite))
+        }
+
+        // Write a new file, upload file with delSrc = false and

[spark] branch branch-3.2 updated: [SPARK-38221][SQL] Eagerly iterate over groupingExpressions when moving complex grouping expressions out of an Aggregate node

2022-02-15 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 5976a06  [SPARK-38221][SQL] Eagerly iterate over groupingExpressions 
when moving complex grouping expressions out of an Aggregate node
5976a06 is described below

commit 5976a0655e78b1d5666d8b2b5d459eafc6319547
Author: Bruce Robbins 
AuthorDate: Wed Feb 16 11:12:33 2022 +0900

[SPARK-38221][SQL] Eagerly iterate over groupingExpressions when moving 
complex grouping expressions out of an Aggregate node

### What changes were proposed in this pull request?

Change `PullOutGroupingExpressions` to eagerly iterate over 
`groupingExpressions` when building `complexGroupingExpressionMap`.

### Why are the changes needed?

Consider this query:
```
Seq(1).toDF("id").groupBy(Stream($"id" + 1, $"id" + 2): _*).sum("id").show(false)
```
It fails with
```
java.lang.IllegalStateException: Couldn't find _groupingexpression#24 in [id#4,_groupingexpression#23]
  at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:80)
  at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:73)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
  at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:83)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:425)
  at org.apache.spark.sql.catalyst.expressions.BindReferences$.bindReference(BoundAttribute.scala:73)
  at org.apache.spark.sql.catalyst.expressions.BindReferences$.$anonfun$bindReferences$1(BoundAttribute.scala:94)
  at scala.collection.immutable.Stream.$anonfun$map$1(Stream.scala:418)
  at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1173)
  at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1163)
  at scala.collection.immutable.Stream.$anonfun$map$1(Stream.scala:418)
  at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1173)
  at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1163)
  at scala.collection.immutable.Stream.foreach(Stream.scala:534)
  at scala.collection.TraversableOnce.count(TraversableOnce.scala:152)
  at scala.collection.TraversableOnce.count$(TraversableOnce.scala:145)
  at scala.collection.AbstractTraversable.count(Traversable.scala:108)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.createCode(GenerateUnsafeProjection.scala:293)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doConsumeWithKeys(HashAggregateExec.scala:623)
  ... etc ...
```
When `HashAggregateExec` attempts to bind the references in the group-by 
expressions, attribute _groupingexpression#24 is missing from the child 
`ProjectExec`'s output.

This is due to the way `PullOutGroupingExpressions`, when determining which 
grouping expressions to shift from the `Aggregate` node to a `Project` node,  
populates `complexGroupingExpressionMap`. `PullOutGroupingExpressions` uses a 
map operation to iterate over `groupingExpressions` and updates 
`complexGroupingExpressionMap` in the closure passed to `map()`. However, if 
`groupingExpressions` is a `Stream`, the map operation is evaluated lazily, and 
isn't fully completed until `Compute [...]
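
A minimal standalone reproduction of the laziness pitfall (illustrative only, assuming Scala 2.12 `Stream` semantics; this is not the optimizer's actual code):
```
import scala.collection.mutable

val seen = mutable.ArrayBuffer.empty[Int]
val mapped = Stream(1, 2, 3).map { i => seen += i; i + 1 }
// Stream.map is lazy beyond the head, so only the first side effect ran:
println(seen)   // ArrayBuffer(1)
mapped.toList   // forcing the whole stream runs the remaining effects
println(seen)   // ArrayBuffer(1, 2, 3)
```
Eagerly iterating (for example with `foreach`) guarantees the side-effecting map is fully populated before it is read.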

### Does this PR introduce _any_ user-facing change?

No, other than the above query now works.

### How was this patch tested?

New unit test.

Closes #35537 from bersprockets/groupby_stream_issue.

Authored-by: Bruce Robbins 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit ad2bc7d82296527582dfa469aad33123afdf6736)
Signed-off-by: Hyukjin Kwon 
---
 .../spark/sql/catalyst/optimizer/PullOutGroupingExpressions.scala| 2 +-
 .../test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala| 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PullOutGroupingExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PullOutGroupingExpressions.scala
index 859a73a..1bd186d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PullOutGroupingExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PullOutGroupingExpressions.scala
@@ -50,7 +50,7 @@ object PullOutGroupingExpressions extends 

[spark] branch master updated (bb757b5 -> ad2bc7d)

2022-02-15 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from bb757b5  [SPARK-37145][K8S][FOLLOWUP] Add note for 
`KubernetesCustom[Driver/Executor]FeatureConfigStep`
 add ad2bc7d  [SPARK-38221][SQL] Eagerly iterate over groupingExpressions 
when moving complex grouping expressions out of an Aggregate node

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/optimizer/PullOutGroupingExpressions.scala| 2 +-
 .../test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala| 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (66c83df -> bb757b5)

2022-02-15 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 66c83df  [SPARK-38220][BUILD] Upgrade `commons-math3` to 3.6.1
 add bb757b5  [SPARK-37145][K8S][FOLLOWUP] Add note for 
`KubernetesCustom[Driver/Executor]FeatureConfigStep`

No new revisions were added by this update.

Summary of changes:
 .../KubernetesDriverCustomFeatureConfigStep.scala  | 39 ++
 ...KubernetesExecutorCustomFeatureConfigStep.scala | 39 ++
 2 files changed, 78 insertions(+)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-38220][BUILD] Upgrade `commons-math3` to 3.6.1

2022-02-15 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 66c83df  [SPARK-38220][BUILD] Upgrade `commons-math3` to 3.6.1
66c83df is described below

commit 66c83dfb7e443bedc7f9a895f6589f40153b652f
Author: Dongjoon Hyun 
AuthorDate: Wed Feb 16 09:03:04 2022 +0900

[SPARK-38220][BUILD] Upgrade `commons-math3` to 3.6.1

### What changes were proposed in this pull request?

This PR aims to upgrade `commons-math3` to 3.6.1.

### Why are the changes needed?

`3.6.1` is the latest release and more popular than `3.4.1`.
- https://commons.apache.org/proper/commons-math/download_math.cgi
- https://mvnrepository.com/artifact/org.apache.commons/commons-math3

### Does this PR introduce _any_ user-facing change?

Although this is a dependency change, there is no breaking change.

### How was this patch tested?

Pass the CIs.

Closes #35535 from dongjoon-hyun/SPARK-38220.

Authored-by: Dongjoon Hyun 
Signed-off-by: Hyukjin Kwon 
---
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +-
 pom.xml   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 b/dev/deps/spark-deps-hadoop-2-hive-2.3
index 50480e4..26c5439 100644
--- a/dev/deps/spark-deps-hadoop-2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2-hive-2.3
@@ -50,7 +50,7 @@ commons-io/2.4//commons-io-2.4.jar
 commons-lang/2.6//commons-lang-2.6.jar
 commons-lang3/3.12.0//commons-lang3-3.12.0.jar
 commons-logging/1.1.3//commons-logging-1.1.3.jar
-commons-math3/3.4.1//commons-math3-3.4.1.jar
+commons-math3/3.6.1//commons-math3-3.6.1.jar
 commons-net/3.1//commons-net-3.1.jar
 commons-pool/1.5.4//commons-pool-1.5.4.jar
 commons-text/1.6//commons-text-1.6.jar
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 13b23c0..dd95710 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -48,7 +48,7 @@ commons-io/2.11.0//commons-io-2.11.0.jar
 commons-lang/2.6//commons-lang-2.6.jar
 commons-lang3/3.12.0//commons-lang3-3.12.0.jar
 commons-logging/1.1.3//commons-logging-1.1.3.jar
-commons-math3/3.4.1//commons-math3-3.4.1.jar
+commons-math3/3.6.1//commons-math3-3.6.1.jar
 commons-net/3.1//commons-net-3.1.jar
 commons-pool/1.5.4//commons-pool-1.5.4.jar
 commons-text/1.6//commons-text-1.6.jar
diff --git a/pom.xml b/pom.xml
index 04feb2a..b0791f7 100644
--- a/pom.xml
+++ b/pom.xml
@@ -157,7 +157,7 @@
 
 4.5.13
 4.4.14
-3.4.1
+3.6.1
 
 4.4
 2.12.15

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (ece34f0 -> 39166ed)

2022-02-15 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ece34f0  [SPARK-38130][SQL] Remove array_sort orderable entries check
 add 39166ed  [SPARK-38124][SS][FOLLOWUP] Add test to harden assumption of 
SS partitioning requirement

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/plans/physical/partitioning.scala |  2 +
 .../spark/sql/catalyst/DistributionSuite.scala | 78 ++
 2 files changed, 80 insertions(+)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (2a5ef00 -> ece34f0)

2022-02-15 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 2a5ef00  [SPARK-38121][PYTHON][SQL][FOLLOW-UP] Set 
'spark.sql.catalogImplementation' to 'hive' in HiveContext
 add ece34f0  [SPARK-38130][SQL] Remove array_sort orderable entries check

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/higherOrderFunctions.scala  |  9 +++--
 .../scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala   | 10 ++
 2 files changed, 13 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (c7ee7cd -> 2a5ef00)

2022-02-15 Thread viirya
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from c7ee7cd  [SPARK-37413][PYTHON] Inline type hints for 
python/pyspark/ml/tree.py
 add 2a5ef00  [SPARK-38121][PYTHON][SQL][FOLLOW-UP] Set 
'spark.sql.catalogImplementation' to 'hive' in HiveContext

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/context.py | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.2 updated (3dea6c4 -> 0dde12f)

2022-02-15 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a change to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 3dea6c4  [SPARK-38211][SQL][DOCS] Add SQL migration guide on restoring 
loose upcast from string to other types
 add 0dde12f  [SPARK-36808][BUILD][3.2] Upgrade Kafka to 2.8.1

No new revisions were added by this update.

Summary of changes:
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (a9a792b3 -> c7ee7cd)

2022-02-15 Thread zero323
This is an automated email from the ASF dual-hosted git repository.

zero323 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from a9a792b3 [SPARK-38199][SQL] Delete the unused `dataType` specified in 
the definition of `IntervalColumnAccessor`
 add c7ee7cd  [SPARK-37413][PYTHON] Inline type hints for 
python/pyspark/ml/tree.py

No new revisions were added by this update.

Summary of changes:
 python/pyspark/ml/tree.py  | 129 -
 python/pyspark/ml/tree.pyi | 110 --
 2 files changed, 68 insertions(+), 171 deletions(-)
 delete mode 100644 python/pyspark/ml/tree.pyi

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (ea1f922 -> a9a792b3)

2022-02-15 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ea1f922  [SPARK-37707][SQL][FOLLOWUP] Allow implicitly casting Date 
type to AnyTimestampType under ANSI mode
 add a9a792b3 [SPARK-38199][SQL] Delete the unused `dataType` specified in 
the definition of `IntervalColumnAccessor`

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/execution/columnar/ColumnAccessor.scala  | 2 +-
 .../apache/spark/sql/execution/columnar/GenerateColumnAccessor.scala| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-37707][SQL][FOLLOWUP] Allow implicitly casting Date type to AnyTimestampType under ANSI mode

2022-02-15 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ea1f922  [SPARK-37707][SQL][FOLLOWUP] Allow implicitly casting Date 
type to AnyTimestampType under ANSI mode
ea1f922 is described below

commit ea1f922e232b0193927cdeec529083f274b108ac
Author: Gengliang Wang 
AuthorDate: Tue Feb 15 19:06:17 2022 +0900

[SPARK-37707][SQL][FOLLOWUP] Allow implicitly casting Date type to 
AnyTimestampType under ANSI mode

### What changes were proposed in this pull request?

Followup of https://github.com/apache/spark/pull/34976: allow implicitly 
casting Date type to AnyTimestampType under ANSI mode

### Why are the changes needed?

AnyTimestampType is a type collection for Timestamp and TimestampNTZ. Since Spark allows implicitly casting Date to Timestamp/TimestampNTZ under ANSI mode, Date can be implicitly cast to AnyTimestampType as well.
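
A simplified sketch of the rule's effect (hypothetical helper, not the actual rule; assumes the default `spark.sql.timestampType`, under which `AnyTimestampType.defaultConcreteType` is `TimestampType`):
```
import org.apache.spark.sql.types.{DataType, DateType, TimestampType}

// Stand-in for the new AnsiTypeCoercion case: a DATE offered where any
// timestamp type is accepted now resolves to the default concrete
// timestamp type instead of failing implicit-cast resolution.
def castDateToAnyTimestamp(in: DataType): Option[DataType] = in match {
  case DateType => Some(TimestampType)
  case _        => None
}
```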

### Does this PR introduce _any_ user-facing change?

Yes, allow implicitly casting Date type to AnyTimestampType under ANSI mode

### How was this patch tested?

Unit test

Closes #35522 from gengliangwang/fixMoreAnsiTest.

Authored-by: Gengliang Wang 
Signed-off-by: Hyukjin Kwon 
---
 .../apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala |  3 +++
 .../spark/sql/catalyst/analysis/TypeCoercionSuite.scala   | 11 +++
 2 files changed, 14 insertions(+)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala
index 61142fc..90f28fb 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala
@@ -204,6 +204,9 @@ object AnsiTypeCoercion extends TypeCoercionBase {
       case (StringType, AnyTimestampType) =>
         Some(AnyTimestampType.defaultConcreteType)
 
+      case (DateType, AnyTimestampType) =>
+        Some(AnyTimestampType.defaultConcreteType)
+
       case (_, target: DataType) =>
         if (Cast.canANSIStoreAssign(inType, target)) {
           Some(target)
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
index 8ea5886..63ad84e 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
@@ -190,6 +190,7 @@ abstract class TypeCoercionSuiteBase extends AnalysisTest {
   test("implicit type cast - DateType") {
     val checkedType = DateType
     checkTypeCasting(checkedType, castableTypes = Seq(checkedType, StringType) ++ datetimeTypes)
+    shouldCast(checkedType, AnyTimestampType, AnyTimestampType.defaultConcreteType)
     shouldNotCast(checkedType, DecimalType)
     shouldNotCast(checkedType, NumericType)
     shouldNotCast(checkedType, IntegralType)
@@ -198,6 +199,16 @@ abstract class TypeCoercionSuiteBase extends AnalysisTest {
   test("implicit type cast - TimestampType") {
     val checkedType = TimestampType
     checkTypeCasting(checkedType, castableTypes = Seq(checkedType, StringType) ++ datetimeTypes)
+    shouldCast(checkedType, AnyTimestampType, checkedType)
+    shouldNotCast(checkedType, DecimalType)
+    shouldNotCast(checkedType, NumericType)
+    shouldNotCast(checkedType, IntegralType)
+  }
+
+  test("implicit type cast - TimestampNTZType") {
+    val checkedType = TimestampNTZType
+    checkTypeCasting(checkedType, castableTypes = Seq(checkedType, StringType) ++ datetimeTypes)
+    shouldCast(checkedType, AnyTimestampType, checkedType)
     shouldNotCast(checkedType, DecimalType)
     shouldNotCast(checkedType, NumericType)
     shouldNotCast(checkedType, IntegralType)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org