[spark] branch master updated (51ebcd9 -> a4788ee)

2020-12-01 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 51ebcd9  [SPARK-32863][SS] Full outer stream-stream join
 add a4788ee  [MINOR][SS] Rename auxiliary protected methods in 
StreamingJoinSuite

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/streaming/StreamingJoinSuite.scala  | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (f71f345 -> 51ebcd9)

2020-12-01 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from f71f345  [SPARK-33544][SQL] Optimize size of CreateArray/CreateMap to 
be the size of its children
 add 51ebcd9  [SPARK-32863][SS] Full outer stream-stream join

No new revisions were added by this update.

Summary of changes:
 .../analysis/UnsupportedOperationChecker.scala |  71 ---
 .../analysis/UnsupportedOperationsSuite.scala  |  16 +-
 .../streaming/StreamingSymmetricHashJoinExec.scala |  57 --
 .../spark/sql/streaming/StreamingJoinSuite.scala   | 209 -
 4 files changed, 297 insertions(+), 56 deletions(-)





[spark] branch master updated (5a1c5ac -> f71f345)

2020-12-01 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5a1c5ac  [SPARK-33622][R][ML] Add array_to_vector to SparkR
 add f71f345  [SPARK-33544][SQL] Optimize size of CreateArray/CreateMap to 
be the size of its children

No new revisions were added by this update.

Summary of changes:
 .../catalyst/expressions/complexTypeCreator.scala  | 12 +--
 .../spark/sql/catalyst/optimizer/expressions.scala | 13 +++
 .../catalyst/optimizer/ConstantFoldingSuite.scala  | 36 +++
 .../optimizer/InferFiltersFromGenerateSuite.scala  | 41 +-
 4 files changed, 98 insertions(+), 4 deletions(-)





[spark] branch master updated (5d0045e -> 5a1c5ac)

2020-12-01 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5d0045e  [SPARK-33611][UI] Avoid encoding twice on the query parameter 
of rewritten proxy URL
 add 5a1c5ac  [SPARK-33622][R][ML] Add array_to_vector to SparkR

No new revisions were added by this update.

Summary of changes:
 R/pkg/NAMESPACE   |  1 +
 R/pkg/R/functions.R   | 26 +-
 R/pkg/R/generics.R|  4 
 R/pkg/tests/fulltests/test_sparkSQL.R |  3 ++-
 4 files changed, 32 insertions(+), 2 deletions(-)





[spark] branch branch-3.0 updated: [SPARK-33611][UI] Avoid encoding twice on the query parameter of rewritten proxy URL

2020-12-01 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 6abfeb6  [SPARK-33611][UI] Avoid encoding twice on the query parameter 
of rewritten proxy URL
6abfeb6 is described below

commit 6abfeb6884a3cdfe4c6e621219e6cf5a35d6467e
Author: Gengliang Wang 
AuthorDate: Wed Dec 2 01:36:41 2020 +0800

[SPARK-33611][UI] Avoid encoding twice on the query parameter of rewritten 
proxy URL

### What changes were proposed in this pull request?

When running Spark behind a reverse proxy (e.g. Nginx, Apache HTTP server), 
the request URL can be encoded twice if we pass the query string directly to 
the constructor of `java.net.URI`:
```
> val uri = "http://localhost:8081/test"
> // query string of URL from the reverse proxy
> val query = "order%5B0%5D%5Bcolumn%5D=0"
> val rewrittenURI = URI.create(uri.toString())

> new URI(rewrittenURI.getScheme(),
  rewrittenURI.getAuthority(),
  rewrittenURI.getPath(),
  query,
  rewrittenURI.getFragment()).toString
result: http://localhost:8081/test?order%255B0%255D%255Bcolumn%255D=0
```

In Spark's stage page, the URL of "/taskTable" contains the query parameter 
`order[0][dir]`. After being encoded twice, the parameter name becomes 
`order%255B0%255D%255Bdir%255D`, which is decoded as 
`order%5B0%5D%5Bdir%5D` instead of `order[0][dir]`. As a result, a 
NullPointerException is thrown from 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/status/api/v1/StagesResource.scala#L176
Other parameters may also not work as expected after being encoded twice.
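
The decoding mismatch described above can be reproduced outside Spark. The following is a minimal sketch, assuming the receiving side decodes the parameter name exactly once; the object and method names (`DoubleEncodingDemo`, `decodeOnce`) are hypothetical and are not part of the patch.

```scala
import java.net.URLDecoder
import java.nio.charset.StandardCharsets

// Hypothetical demo object, not part of Spark.
object DoubleEncodingDemo {
  // Percent-decode a string exactly once.
  def decodeOnce(s: String): String =
    URLDecoder.decode(s, StandardCharsets.UTF_8.name())

  // Parameter name as sent by the browser (encoded once).
  val encodedOnce: String = "order%5B0%5D%5Bdir%5D"
  // Parameter name after the proxy rewrite encoded it a second time.
  val encodedTwice: String = "order%255B0%255D%255Bdir%255D"

  def main(args: Array[String]): Unit = {
    // One decode of the once-encoded name recovers the bracketed key.
    println(decodeOnce(encodedOnce))
    // One decode of the twice-encoded name still contains percent escapes,
    // so a lookup by the bracketed key finds nothing.
    println(decodeOnce(encodedTwice))
  }
}
```

A lookup keyed on `order[0][dir]` therefore misses when the name arrives encoded twice, which is consistent with the null value reported above.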

This PR fixes the bug by calling the method `URI.create(String)` 
directly. This convenience method parses the string as-is, so the query 
parameter is not encoded a second time.
```
> val uri = "http://localhost:8081/test"
> val query = "order%5B0%5D%5Bcolumn%5D=0"
> URI.create(s"$uri?$query").toString
result: http://localhost:8081/test?order%5B0%5D%5Bcolumn%5D=0

> URI.create(s"$uri?$query").getQuery
result: order[0][column]=0
```
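
For comparison, the two approaches can be placed side by side as standalone functions. This is only a sketch under stated assumptions: `ProxyUriSketch`, `withQuery`, and `withQueryDoubleEncoded` are hypothetical names, and `base` is assumed to be the already rewritten target URL without a query string. It mirrors the idea of the change rather than reproducing the code in `JettyUtils`.

```scala
import java.net.URI

// Hypothetical helper object, not part of Spark.
object ProxyUriSketch {
  // Approach taken by the fix: append the raw query string and let
  // URI.create parse the whole URL, so existing escapes are kept as-is.
  // `query` may be null when the request has no query string.
  def withQuery(base: String, query: String): URI = {
    val queryString = if (query == null) "" else s"?$query"
    URI.create(base + queryString).normalize()
  }

  // Previous approach: the multi-argument constructor always quotes '%',
  // so an already percent-encoded query is encoded a second time.
  def withQueryDoubleEncoded(base: String, query: String): URI = {
    val u = URI.create(base)
    new URI(u.getScheme, u.getAuthority, u.getPath, query, u.getFragment).normalize()
  }
}
```

With `base = "http://localhost:8081/test"` and `query = "order%5B0%5D%5Bcolumn%5D=0"`, the first function keeps `%5B` in the result while the second produces `%255B`, matching the two results quoted in this message.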

### Why are the changes needed?

Fix a potential bug when Spark's reverse proxy is enabled.
The bug itself is similar to https://github.com/apache/spark/pull/29271.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Add a new unit test.
Also tested the master, worker, and app UIs manually behind an nginx proxy.

Spark config:
```
spark.ui.port 8080
spark.ui.reverseProxy=true
spark.ui.reverseProxyUrl=/path/to/spark/
```
nginx config:
```
server {
    listen 9000;
    set $SPARK_MASTER http://127.0.0.1:8080;
    # split spark UI path into prefix and local path within master UI
    location ~ ^(/path/to/spark/) {
        # strip prefix when forwarding request
        rewrite /path/to/spark(/.*) $1  break;
        #rewrite /path/to/spark/ "/" ;
        # forward to spark master UI
        proxy_pass $SPARK_MASTER;
        proxy_intercept_errors on;
        # hand redirects to the named location below
        error_page 301 302 307 = @handle_redirects;
    }
    location @handle_redirects {
        set $saved_redirect_location '$upstream_http_location';
        proxy_pass $saved_redirect_location;
    }
}
```

Closes #30552 from gengliangwang/decodeProxyRedirect.

Authored-by: Gengliang Wang 
Signed-off-by: Gengliang Wang 
(cherry picked from commit 5d0045eedf4b138c031accac2b1fa1e8d6f3f7c6)
Signed-off-by: Gengliang Wang 
---
 core/src/main/scala/org/apache/spark/ui/JettyUtils.scala | 16 ++--
 core/src/test/scala/org/apache/spark/ui/UISuite.scala|  9 +
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala b/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
index a4ba565..3820a88 100644
--- a/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
+++ b/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
@@ -400,17 +400,13 @@ private[spark] object JettyUtils extends Logging {
   uri.append(rest)
 }
 
-val rewrittenURI = URI.create(uri.toString())
-if (query != null) {
-  return new URI(
-  rewrittenURI.getScheme(),
-  rewrittenURI.getAuthority(),
-  rewrittenURI.getPath(),
-  query,
-  rewrittenURI.getFragment()
-).normalize()
+val queryString = if (query == null) {
+  ""
+} else {
+  s"?$query"
 }
-rewrittenURI.normalize()
+// SPARK-33611: use method `URI.create` to avoid percent-encoding twice on the query string.
+URI.create(uri.toString() + queryString).normalize()
   }
 
   

[spark] branch master updated (c24f2b2 -> 5d0045e)

2020-12-01 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from c24f2b2  [SPARK-33612][SQL] Add dataSourceRewriteRules batch to 
Optimizer
 add 5d0045e  [SPARK-33611][UI] Avoid encoding twice on the query parameter 
of rewritten proxy URL

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/ui/JettyUtils.scala | 16 ++--
 core/src/test/scala/org/apache/spark/ui/UISuite.scala|  9 +
 2 files changed, 15 insertions(+), 10 deletions(-)





[spark] branch master updated (478fb7f5 -> c24f2b2)

2020-12-01 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 478fb7f5 [SPARK-33608][SQL] Handle DELETE/UPDATE/MERGE in 
PullupCorrelatedPredicates
 add c24f2b2  [SPARK-33612][SQL] Add dataSourceRewriteRules batch to 
Optimizer

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/optimizer/Optimizer.scala   |  9 +
 .../apache/spark/sql/internal/BaseSessionStateBuilder.scala   | 11 +++
 2 files changed, 20 insertions(+)





[spark] branch master updated (cf4ad21 -> 478fb7f5)

2020-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from cf4ad21  [SPARK-33503][SQL] Refactor SortOrder class to allow multiple 
childrens
 add 478fb7f5 [SPARK-33608][SQL] Handle DELETE/UPDATE/MERGE in 
PullupCorrelatedPredicates

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/optimizer/subquery.scala|  2 +
 .../PullupCorrelatedPredicatesSuite.scala  | 64 +-
 2 files changed, 65 insertions(+), 1 deletion(-)





[spark] branch master updated (9273d42 -> cf4ad21)

2020-12-01 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 9273d42  [SPARK-33045][SQL][FOLLOWUP] Support built-in function 
like_any and fix StackOverflowError issue
 add cf4ad21  [SPARK-33503][SQL] Refactor SortOrder class to allow multiple 
childrens

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala |  2 +-
 .../apache/spark/sql/catalyst/dsl/package.scala|  4 ++--
 .../spark/sql/catalyst/expressions/SortOrder.scala | 10 +
 .../spark/sql/catalyst/parser/AstBuilder.scala |  2 +-
 .../main/scala/org/apache/spark/sql/Column.scala   |  8 +++
 .../sql/execution/AliasAwareOutputExpression.scala |  6 +
 .../sql/execution/joins/SortMergeJoinExec.scala|  9 
 .../apache/spark/sql/execution/PlannerSuite.scala  | 26 ++
 8 files changed, 46 insertions(+), 21 deletions(-)





[spark] branch master updated (d38883c -> 9273d42)

2020-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d38883c  [SPARK-32405][SQL][FOLLOWUP] Throw Exception if provider is 
specified in JDBCTableCatalog create table
 add 9273d42  [SPARK-33045][SQL][FOLLOWUP] Support built-in function 
like_any and fix StackOverflowError issue

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/dsl/package.scala|  4 +
 .../catalyst/expressions/regexpExpressions.scala   | 98 ++
 .../spark/sql/catalyst/parser/AstBuilder.scala | 31 ---
 .../org/apache/spark/sql/internal/SQLConf.scala| 14 
 .../expressions/RegexpExpressionsSuite.scala   | 26 ++
 .../catalyst/parser/ExpressionParserSuite.scala| 12 +--
 .../test/resources/sql-tests/inputs/like-all.sql   |  2 -
 .../test/resources/sql-tests/inputs/like-any.sql   |  2 +
 8 files changed, 138 insertions(+), 51 deletions(-)





[spark] branch master updated (e5bb293 -> d38883c)

2020-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from e5bb293  [SPARK-32032][SS] Avoid infinite wait in driver because of 
KafkaConsumer.poll(long) API
 add d38883c  [SPARK-32405][SQL][FOLLOWUP] Throw Exception if provider is 
specified in JDBCTableCatalog create table

No new revisions were added by this update.

Summary of changes:
 .../datasources/v2/jdbc/JDBCTableCatalog.scala |  3 ++-
 .../v2/jdbc/JDBCTableCatalogSuite.scala| 27 +++---
 .../org/apache/spark/sql/jdbc/JDBCV2Suite.scala| 21 ++---
 3 files changed, 22 insertions(+), 29 deletions(-)





[spark] branch master updated (1034815 -> e5bb293)

2020-12-01 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 1034815  [SPARK-33572][SQL] Datetime building should fail if the year, 
month, ..., second combination is invalid
 add e5bb293  [SPARK-32032][SS] Avoid infinite wait in driver because of 
KafkaConsumer.poll(long) API

No new revisions were added by this update.

Summary of changes:
 docs/ss-migration-guide.md |   5 +
 docs/structured-streaming-kafka-integration.md |  20 +
 .../spark/sql/kafka010/ConsumerStrategy.scala  |  65 ++-
 .../org/apache/spark/sql/kafka010/KafkaBatch.scala |   2 +-
 .../spark/sql/kafka010/KafkaOffsetReader.scala | 601 ++---
 ...etReader.scala => KafkaOffsetReaderAdmin.scala} | 284 +-
 ...eader.scala => KafkaOffsetReaderConsumer.scala} |  39 +-
 .../apache/spark/sql/kafka010/KafkaRelation.scala  |   2 +-
 .../spark/sql/kafka010/KafkaSourceProvider.scala   |   6 +-
 .../spark/sql/kafka010/ConsumerStrategySuite.scala | 147 +
 .../sql/kafka010/KafkaMicroBatchSourceSuite.scala  |  42 +-
 .../sql/kafka010/KafkaOffsetReaderSuite.scala  |  95 +++-
 .../spark/sql/kafka010/KafkaRelationSuite.scala|  47 +-
 .../org/apache/spark/sql/internal/SQLConf.scala|  13 +
 14 files changed, 542 insertions(+), 826 deletions(-)
 copy 
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/{KafkaOffsetReader.scala
 => KafkaOffsetReaderAdmin.scala} (73%)
 copy 
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/{KafkaOffsetReader.scala
 => KafkaOffsetReaderConsumer.scala} (96%)
 create mode 100644 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/ConsumerStrategySuite.scala





[spark] branch master updated (52e5cc4 -> 1034815)

2020-12-01 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 52e5cc4  [SPARK-27188][SS] FileStreamSink: provide a new option to 
have retention on output files
 add 1034815  [SPARK-33572][SQL] Datetime building should fail if the year, 
month, ..., second combination is invalid

No new revisions were added by this update.

Summary of changes:
 .../catalyst/expressions/datetimeExpressions.scala |  27 +++--
 .../catalyst/expressions/intervalExpressions.scala |  23 +++-
 .../expressions/DateExpressionsSuite.scala | 118 +++--
 .../expressions/IntervalExpressionsSuite.scala |  60 +++
 .../sql-tests/results/postgreSQL/date.sql.out  |  15 +--
 5 files changed, 187 insertions(+), 56 deletions(-)

