date:20210522

[spark] branch branch-3.1 updated: [SPARK-35226][SQL][FOLLOWUP] Fix test added in SPARK-35226 for DB2KrbIntegrationSuite

2021-05-22 Thread dongjoon

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new def4e52  [SPARK-35226][SQL][FOLLOWUP] Fix test added in SPARK-35226 
for DB2KrbIntegrationSuite
def4e52 is described below

commit def4e526e9ba57759e4e3b5f1853a8b3de30a824
Author: Kousuke Saruta 
AuthorDate: Sat May 22 22:31:43 2021 -0700

[SPARK-35226][SQL][FOLLOWUP] Fix test added in SPARK-35226 for 
DB2KrbIntegrationSuite

### What changes were proposed in this pull request?

This PR fixes a test added in SPARK-35226 (#32344).

### Why are the changes needed?

`SELECT 1` does not seem to be a valid query for DB2.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

DB2KrbIntegrationSuite passes on my laptop.

I also confirmed all the KrbIntegrationSuites pass with the following 
command.
```
build/sbt -Phive -Phive-thriftserver -Pdocker-integration-tests "testOnly org.apache.spark.sql.jdbc.*KrbIntegrationSuite"
```

Closes #32632 from sarutak/followup-SPARK-35226.

Authored-by: Kousuke Saruta 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 1a43415d8de1d6462d64a6b6f8daa505a3a81970)
Signed-off-by: Dongjoon Hyun 
---
 .../sql/jdbc/DockerKrbJDBCIntegrationSuite.scala | 20 
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala
index 4a828ae..3865f91 100644
--- a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala
+++ b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala
@@ -178,19 +178,7 @@ abstract class DockerKrbJDBCIntegrationSuite extends DockerJDBCIntegrationSuite
 .option("keytab", keytabFullPath)
 .option("principal", principal)
 .option("refreshKrb5Config", "true")
-.option("query", "SELECT 1")
-.load()
-}
-
-// Set the authentic krb5.conf but doesn't refresh config
-// so this assertion is expected to fail.
-intercept[Exception] {
-  sys.props(KRB5_CONF_PROP) = origKrb5Conf
-  spark.read.format("jdbc")
-.option("url", jdbcUrl)
-.option("keytab", keytabFullPath)
-.option("principal", principal)
-.option("query", "SELECT 1")
+.option("dbtable", "bar")
 .load()
 }
 
@@ -200,11 +188,11 @@ abstract class DockerKrbJDBCIntegrationSuite extends DockerJDBCIntegrationSuite
   .option("keytab", keytabFullPath)
   .option("principal", principal)
   .option("refreshKrb5Config", "true")
-  .option("query", "SELECT 1")
+  .option("dbtable", "bar")
   .load()
-val result = df.collect().map(_.getInt(0))
+val result = df.collect().map(_.getString(0))
 assert(result.length === 1)
-assert(result(0) === 1)
+assert(result(0) === "hello")
   } finally {
 sys.props(KRB5_CONF_PROP) = origKrb5Conf
   }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

[spark] branch master updated: [SPARK-35226][SQL][FOLLOWUP] Fix test added in SPARK-35226 for DB2KrbIntegrationSuite

2021-05-22 Thread dongjoon

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1a43415  [SPARK-35226][SQL][FOLLOWUP] Fix test added in SPARK-35226 
for DB2KrbIntegrationSuite
1a43415 is described below

commit 1a43415d8de1d6462d64a6b6f8daa505a3a81970
Author: Kousuke Saruta 
AuthorDate: Sat May 22 22:31:43 2021 -0700

[SPARK-35226][SQL][FOLLOWUP] Fix test added in SPARK-35226 for 
DB2KrbIntegrationSuite

### What changes were proposed in this pull request?

This PR fixes a test added in SPARK-35226 (#32344).

### Why are the changes needed?

`SELECT 1` does not seem to be a valid query for DB2.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

DB2KrbIntegrationSuite passes on my laptop.

I also confirmed all the KrbIntegrationSuites pass with the following 
command.
```
build/sbt -Phive -Phive-thriftserver -Pdocker-integration-tests "testOnly org.apache.spark.sql.jdbc.*KrbIntegrationSuite"
```

Closes #32632 from sarutak/followup-SPARK-35226.

Authored-by: Kousuke Saruta 
Signed-off-by: Dongjoon Hyun 
---
 .../sql/jdbc/DockerKrbJDBCIntegrationSuite.scala | 20 
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala
index 4a828ae..3865f91 100644
--- a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala
+++ b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala
@@ -178,19 +178,7 @@ abstract class DockerKrbJDBCIntegrationSuite extends DockerJDBCIntegrationSuite
 .option("keytab", keytabFullPath)
 .option("principal", principal)
 .option("refreshKrb5Config", "true")
-.option("query", "SELECT 1")
-.load()
-}
-
-// Set the authentic krb5.conf but doesn't refresh config
-// so this assertion is expected to fail.
-intercept[Exception] {
-  sys.props(KRB5_CONF_PROP) = origKrb5Conf
-  spark.read.format("jdbc")
-.option("url", jdbcUrl)
-.option("keytab", keytabFullPath)
-.option("principal", principal)
-.option("query", "SELECT 1")
+.option("dbtable", "bar")
 .load()
 }
 
@@ -200,11 +188,11 @@ abstract class DockerKrbJDBCIntegrationSuite extends DockerJDBCIntegrationSuite
   .option("keytab", keytabFullPath)
   .option("principal", principal)
   .option("refreshKrb5Config", "true")
-  .option("query", "SELECT 1")
+  .option("dbtable", "bar")
   .load()
-val result = df.collect().map(_.getInt(0))
+val result = df.collect().map(_.getString(0))
 assert(result.length === 1)
-assert(result(0) === 1)
+assert(result(0) === "hello")
   } finally {
 sys.props(KRB5_CONF_PROP) = origKrb5Conf
   }


[spark] branch master updated (594ffd2 -> a59a214)

2021-05-22 Thread dongjoon

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 594ffd2  [SPARK-35463][BUILD][FOLLOWUP] Redirect output for skipping 
checksum check
 add a59a214  [MINOR][FOLLOWUP] Update SHA for the oracle docker image

No new revisions were added by this update.

Summary of changes:
 .../test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala   | 2 +-
 .../scala/org/apache/spark/sql/jdbc/v2/OracleIntegrationSuite.scala | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)


[spark] branch branch-3.0 updated: [SPARK-35463][BUILD][FOLLOWUP] Redirect output for skipping checksum check

2021-05-22 Thread dongjoon

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 4994c7b  [SPARK-35463][BUILD][FOLLOWUP] Redirect output for skipping 
checksum check
4994c7b is described below

commit 4994c7b65b69ad8332b00872cbc975fc1c83c76a
Author: Liang-Chi Hsieh 
AuthorDate: Sat May 22 19:13:33 2021 -0700

[SPARK-35463][BUILD][FOLLOWUP] Redirect output for skipping checksum check

### What changes were proposed in this pull request?

This patch is a followup of SPARK-35463. In SPARK-35463, we output a 
message to stdout and now we redirect it to stderr.
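
The effect of the redirect on a caller that captures output can be sketched as follows. This is a minimal illustration, not code from `build/mvn`: the function names `get_version_bad` and `get_version_good` and the value `1.4` are hypothetical stand-ins for a script that prints a diagnostic and then the value a caller wants.

```shell
# Hypothetical helper: the diagnostic goes to stdout, so it is captured
# together with the real value.
get_version_bad() {
  echo "Skipping checksum because shasum is not installed."
  echo "1.4"   # hypothetical version string
}

# Same helper, but the diagnostic is redirected to stderr with 1>&2,
# so only the real value stays on stdout.
get_version_good() {
  echo "Skipping checksum because shasum is not installed." 1>&2
  echo "1.4"   # hypothetical version string
}

# Command substitution captures stdout only.
bad="$(get_version_bad)"
good="$(get_version_good)"

printf 'bad:  %s\n' "$bad"
printf 'good: %s\n' "$good"
```

With the redirect, `good` holds only the version string, while `bad` also contains the diagnostic line; the diagnostic from `get_version_good` still reaches the terminal through stderr.
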

### Why are the changes needed?

Every `echo` statement in `build/mvn` that is not followed by `exit` should 
redirect to stderr, because other scripts run `build/mvn` and capture its 
stdout. Without the redirect, the captured output can be invalid, e.g. 
"Skipping checksum because shasum is not installed." was returned as the 
`commons-cli` version.

### Does this PR introduce _any_ user-facing change?

No. Dev only.

### How was this patch tested?

Manually tested on an internal system.

Closes #32637 from viirya/fix-build.

Authored-by: Liang-Chi Hsieh 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 594ffd2db224c89f5375645a7a249d4befca6163)
Signed-off-by: Dongjoon Hyun 
---
 build/mvn | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/build/mvn b/build/mvn
index 39b9a48..0d2da62 100755
--- a/build/mvn
+++ b/build/mvn
@@ -93,7 +93,7 @@ install_app() {
 fi
   fi
 else
-  echo "Skipping checksum because shasum is not installed."
+  echo "Skipping checksum because shasum is not installed." 1>&2
 fi
 
 cd "${_DIR}" && tar -xzf "${local_tarball}"


[spark] branch branch-3.1 updated: [SPARK-35463][BUILD][FOLLOWUP] Redirect output for skipping checksum check

2021-05-22 Thread dongjoon

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 01c7705  [SPARK-35463][BUILD][FOLLOWUP] Redirect output for skipping 
checksum check
01c7705 is described below

commit 01c770549e11d3f78448e8f9d50fa64356b61448
Author: Liang-Chi Hsieh 
AuthorDate: Sat May 22 19:13:33 2021 -0700

[SPARK-35463][BUILD][FOLLOWUP] Redirect output for skipping checksum check

### What changes were proposed in this pull request?

This patch is a followup of SPARK-35463. In SPARK-35463, we output a 
message to stdout and now we redirect it to stderr.

### Why are the changes needed?

Every `echo` statement in `build/mvn` that is not followed by `exit` should 
redirect to stderr, because other scripts run `build/mvn` and capture its 
stdout. Without the redirect, the captured output can be invalid, e.g. 
"Skipping checksum because shasum is not installed." was returned as the 
`commons-cli` version.

### Does this PR introduce _any_ user-facing change?

No. Dev only.

### How was this patch tested?

Manually tested on an internal system.

Closes #32637 from viirya/fix-build.

Authored-by: Liang-Chi Hsieh 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 594ffd2db224c89f5375645a7a249d4befca6163)
Signed-off-by: Dongjoon Hyun 
---
 build/mvn | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/build/mvn b/build/mvn
index e49709f..2c8571f 100755
--- a/build/mvn
+++ b/build/mvn
@@ -85,7 +85,7 @@ install_app() {
 fi
   fi
 else
-  echo "Skipping checksum because shasum is not installed."
+  echo "Skipping checksum because shasum is not installed." 1>&2
 fi
 
 cd "${_DIR}" && tar -xzf "${local_tarball}"


[spark] branch master updated (1d9f09d -> 594ffd2)

2021-05-22 Thread dongjoon

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 1d9f09d  [SPARK-35480][SQL] Make percentile_approx work with pivot
 add 594ffd2  [SPARK-35463][BUILD][FOLLOWUP] Redirect output for skipping 
checksum check

No new revisions were added by this update.

Summary of changes:
 build/mvn | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[spark] branch branch-3.1 updated: Revert "[SPARK-35480][SQL] Make percentile_approx work with pivot"

2021-05-22 Thread dongjoon

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new d1607d6  Revert "[SPARK-35480][SQL] Make percentile_approx work with 
pivot"
d1607d6 is described below

commit d1607d61911ca3d6050e4d1261e3988cfe029d12
Author: Dongjoon Hyun 
AuthorDate: Sat May 22 18:33:24 2021 -0700

Revert "[SPARK-35480][SQL] Make percentile_approx work with pivot"

This reverts commit 612e3b59e81c52f36754ee962e91dc2df98fb941.
---
 .../scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala   | 4 
 .../src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala | 8 
 2 files changed, 12 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index e576ab0..600a5af 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -819,10 +819,6 @@ class Analyzer(override val catalogManager: CatalogManager)
   First(ifExpr(expr), true)
 case Last(expr, _) =>
   Last(ifExpr(expr), true)
-case a: ApproximatePercentile =>
-  // ApproximatePercentile takes two literals for accuracy and percentage which
-  // should not be wrapped by if-else.
-  a.withNewChildren(ifExpr(a.first) :: a.second :: a.third :: Nil)
 case a: AggregateFunction =>
   a.withNewChildren(a.children.map(ifExpr))
   }.transform {
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
index 32cbb8b..51d861e 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
@@ -344,12 +344,4 @@ class DataFramePivotSuite extends QueryTest with SharedSparkSession {
 val actual = df.groupBy("x").pivot("s").count()
 checkAnswer(actual, expected)
   }
-
-  test("SPARK-35480: percentile_approx should work with pivot") {
-val actual = Seq(
-  ("a", -1.0), ("a", 5.5), ("a", 2.5), ("b", 3.0), ("b", 5.2)).toDF("type", "value")
-  .groupBy().pivot("type", Seq("a", "b")).agg(
-percentile_approx(col("value"), array(lit(0.5)), lit(1)))
-checkAnswer(actual, Row(Array(2.5), Array(3.0)))
-  }
 }


[spark] branch branch-3.1 updated: [SPARK-35480][SQL] Make percentile_approx work with pivot

2021-05-22 Thread gurwls223

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 612e3b5  [SPARK-35480][SQL] Make percentile_approx work with pivot
612e3b5 is described below

commit 612e3b59e81c52f36754ee962e91dc2df98fb941
Author: Hyukjin Kwon 
AuthorDate: Sun May 23 07:35:43 2021 +0900

[SPARK-35480][SQL] Make percentile_approx work with pivot

### What changes were proposed in this pull request?

This PR proposes to avoid wrapping the constant literals for `percentage` and 
`accuracy` in `percentile_approx` with if-else. They are expected to be 
literals (or foldable expressions).

Pivot works as a two-phase aggregation, and it works by rewriting the input 
to `null` for non-matched values (pivot column and value).

Note that for some types (basically non-nested types), pivot uses an optimized 
version that does not rewrite the input to `null`. So the issue fixed by this 
PR affects only complex types.

```scala
val df = Seq(
  ("a", -1.0), ("a", 5.5), ("a", 2.5), ("b", 3.0), ("b", 5.2)).toDF("type", "value")
  .groupBy().pivot("type", Seq("a", "b")).agg(
    percentile_approx(col("value"), array(lit(0.5)), lit(1)))
df.show()
```

**Before:**

```
org.apache.spark.sql.AnalysisException: cannot resolve 
'percentile_approx((IF((type <=> CAST('a' AS STRING)), value, CAST(NULL AS 
DOUBLE))), (IF((type <=> CAST('a' AS STRING)), array(0.5D), NULL)), (IF((type 
<=> CAST('a' AS STRING)), 1, CAST(NULL AS INT))))' due to data type 
mismatch: The accuracy or percentage provided must be a constant literal;
'Aggregate [percentile_approx(if ((type#7 <=> cast(a as string))) value#8 
else cast(null as double), if ((type#7 <=> cast(a as string))) array(0.5) else 
cast(null as array<double>), if ((type#7 <=> cast(a as string))) 1 else 
cast(null as int), 0, 0) AS a#16, percentile_approx(if ((type#7 <=> cast(b as 
string))) value#8 else cast(null as double), if ((type#7 <=> cast(b as 
string))) array(0.5) else cast(null as array<double>), if ((type#7 <=> cast(b 
as string))) 1 else cast(null [...]
+- Project [_1#2 AS type#7, _2#3 AS value#8]
   +- LocalRelation [_1#2, _2#3]
```

**After:**

```
+-----+-----+
|    a|    b|
+-----+-----+
|[2.5]|[3.0]|
+-----+-----+
```

### Why are the changes needed?

To make percentile_approx work with pivot as expected

### Does this PR introduce _any_ user-facing change?

Yes. It threw an exception but now it returns a correct result as shown 
above.

### How was this patch tested?

Manually tested, and a unit test was added.

Closes #32619 from HyukjinKwon/SPARK-35480.

Authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit 1d9f09decb280beb313f3ff0384364a40c47c4b4)
Signed-off-by: Hyukjin Kwon 
---
 .../scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala   | 4 
 .../src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala | 8 
 2 files changed, 12 insertions(+)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 600a5af..e576ab0 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -819,6 +819,10 @@ class Analyzer(override val catalogManager: CatalogManager)
   First(ifExpr(expr), true)
 case Last(expr, _) =>
   Last(ifExpr(expr), true)
+case a: ApproximatePercentile =>
+  // ApproximatePercentile takes two literals for accuracy and percentage which
+  // should not be wrapped by if-else.
+  a.withNewChildren(ifExpr(a.first) :: a.second :: a.third :: Nil)
 case a: AggregateFunction =>
   a.withNewChildren(a.children.map(ifExpr))
   }.transform {
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
index 51d861e..32cbb8b 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
@@ -344,4 +344,12 @@ class DataFramePivotSuite extends QueryTest with SharedSparkSession {
 val actual = df.groupBy("x").pivot("s").count()
 checkAnswer(actual, expected)
   }
+
+  test("SPARK-35480: percentile_approx should work with pivot") {
+val actual = Seq(
+  ("a", -1.0), ("a", 5.5), ("a", 2.5), ("b", 3.0),

[spark] branch master updated: [SPARK-35480][SQL] Make percentile_approx work with pivot

2021-05-22 Thread gurwls223

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1d9f09d  [SPARK-35480][SQL] Make percentile_approx work with pivot
1d9f09d is described below

commit 1d9f09decb280beb313f3ff0384364a40c47c4b4
Author: Hyukjin Kwon 
AuthorDate: Sun May 23 07:35:43 2021 +0900

[SPARK-35480][SQL] Make percentile_approx work with pivot

### What changes were proposed in this pull request?

This PR proposes to avoid wrapping the constant literals for `percentage` and 
`accuracy` in `percentile_approx` with if-else. They are expected to be 
literals (or foldable expressions).

Pivot works as a two-phase aggregation, and it works by rewriting the input 
to `null` for non-matched values (pivot column and value).

Note that for some types (basically non-nested types), pivot uses an optimized 
version that does not rewrite the input to `null`. So the issue fixed by this 
PR affects only complex types.

```scala
val df = Seq(
  ("a", -1.0), ("a", 5.5), ("a", 2.5), ("b", 3.0), ("b", 5.2)).toDF("type", "value")
  .groupBy().pivot("type", Seq("a", "b")).agg(
    percentile_approx(col("value"), array(lit(0.5)), lit(1)))
df.show()
```

**Before:**

```
org.apache.spark.sql.AnalysisException: cannot resolve 
'percentile_approx((IF((type <=> CAST('a' AS STRING)), value, CAST(NULL AS 
DOUBLE))), (IF((type <=> CAST('a' AS STRING)), array(0.5D), NULL)), (IF((type 
<=> CAST('a' AS STRING)), 1, CAST(NULL AS INT))))' due to data type 
mismatch: The accuracy or percentage provided must be a constant literal;
'Aggregate [percentile_approx(if ((type#7 <=> cast(a as string))) value#8 
else cast(null as double), if ((type#7 <=> cast(a as string))) array(0.5) else 
cast(null as array<double>), if ((type#7 <=> cast(a as string))) 1 else 
cast(null as int), 0, 0) AS a#16, percentile_approx(if ((type#7 <=> cast(b as 
string))) value#8 else cast(null as double), if ((type#7 <=> cast(b as 
string))) array(0.5) else cast(null as array<double>), if ((type#7 <=> cast(b 
as string))) 1 else cast(null [...]
+- Project [_1#2 AS type#7, _2#3 AS value#8]
   +- LocalRelation [_1#2, _2#3]
```

**After:**

```
+-----+-----+
|    a|    b|
+-----+-----+
|[2.5]|[3.0]|
+-----+-----+
```

### Why are the changes needed?

To make percentile_approx work with pivot as expected

### Does this PR introduce _any_ user-facing change?

Yes. It threw an exception but now it returns a correct result as shown 
above.

### How was this patch tested?

Manually tested, and a unit test was added.

Closes #32619 from HyukjinKwon/SPARK-35480.

Authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
---
 .../scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala   | 4 
 .../src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala | 8 
 2 files changed, 12 insertions(+)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index c1521ac..bba9a38 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -770,6 +770,10 @@ class Analyzer(override val catalogManager: CatalogManager)
   First(ifExpr(expr), true)
 case Last(expr, _) =>
   Last(ifExpr(expr), true)
+case a: ApproximatePercentile =>
+  // ApproximatePercentile takes two literals for accuracy and percentage which
+  // should not be wrapped by if-else.
+  a.withNewChildren(ifExpr(a.first) :: a.second :: a.third :: Nil)
 case a: AggregateFunction =>
   a.withNewChildren(a.children.map(ifExpr))
   }.transform {
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
index 51d861e..32cbb8b 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala
@@ -344,4 +344,12 @@ class DataFramePivotSuite extends QueryTest with SharedSparkSession {
 val actual = df.groupBy("x").pivot("s").count()
 checkAnswer(actual, expected)
   }
+
+  test("SPARK-35480: percentile_approx should work with pivot") {
+val actual = Seq(
+  ("a", -1.0), ("a", 5.5), ("a", 2.5), ("b", 3.0), ("b", 5.2)).toDF("type", "value")
+  .groupBy().pivot("type", Seq("a", "b")).agg(
+

[spark] branch master updated: [SPARK-35489][BUILD] Upgrade ORC to 1.6.8

2021-05-22 Thread dongjoon

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new fa424ac  [SPARK-35489][BUILD] Upgrade ORC to 1.6.8
fa424ac is described below

commit fa424ac2b851ceaf3b2094cafa218d2867cc0da1
Author: Dongjoon Hyun 
AuthorDate: Sat May 22 10:35:40 2021 -0700

[SPARK-35489][BUILD] Upgrade ORC to 1.6.8

### What changes were proposed in this pull request?

This PR aims to upgrade ORC to 1.6.8.

### Why are the changes needed?

This will bring the latest bug fixes.
- https://orc.apache.org/news/2021/05/21/ORC-1.6.8/

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the existing CIs.

Closes #32635 from dongjoon-hyun/SPARK-35489.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 6 +++---
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 6 +++---
 pom.xml | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2.7-hive-2.3 b/dev/deps/spark-deps-hadoop-2.7-hive-2.3
index da02709..f0d8992 100644
--- a/dev/deps/spark-deps-hadoop-2.7-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2.7-hive-2.3
@@ -196,9 +196,9 @@ objenesis/2.6//objenesis-2.6.jar
 okhttp/3.12.12//okhttp-3.12.12.jar
 okio/1.14.0//okio-1.14.0.jar
 opencsv/2.3//opencsv-2.3.jar
-orc-core/1.6.7//orc-core-1.6.7.jar
-orc-mapreduce/1.6.7//orc-mapreduce-1.6.7.jar
-orc-shims/1.6.7//orc-shims-1.6.7.jar
+orc-core/1.6.8//orc-core-1.6.8.jar
+orc-mapreduce/1.6.8//orc-mapreduce-1.6.8.jar
+orc-shims/1.6.8//orc-shims-1.6.8.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/dev/deps/spark-deps-hadoop-3.2-hive-2.3 b/dev/deps/spark-deps-hadoop-3.2-hive-2.3
index c06357f..00de17e 100644
--- a/dev/deps/spark-deps-hadoop-3.2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3.2-hive-2.3
@@ -167,9 +167,9 @@ objenesis/2.6//objenesis-2.6.jar
 okhttp/3.12.12//okhttp-3.12.12.jar
 okio/1.14.0//okio-1.14.0.jar
 opencsv/2.3//opencsv-2.3.jar
-orc-core/1.6.7//orc-core-1.6.7.jar
-orc-mapreduce/1.6.7//orc-mapreduce-1.6.7.jar
-orc-shims/1.6.7//orc-shims-1.6.7.jar
+orc-core/1.6.8//orc-core-1.6.8.jar
+orc-mapreduce/1.6.8//orc-mapreduce-1.6.8.jar
+orc-shims/1.6.8//orc-shims-1.6.8.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 310f11d..7fe327c 100644
--- a/pom.xml
+++ b/pom.xml
@@ -137,7 +137,7 @@
 
 10.14.2.0
 1.12.0
-1.6.7
+1.6.8
 9.4.40.v20210413
 4.0.3
 0.9.5


[spark] branch master updated (0549caf -> 003294c)

2021-05-22 Thread sarutak

This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 0549caf  [MINOR][SQL] Change the script name for creating oracle 
docker image
 add 003294c  [SPARK-35488][BUILD] Upgrade ASM to 7.3.1

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 2 +-
 pom.xml | 6 +++---
 project/plugins.sbt | 4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

