[spark] branch branch-3.0 updated: [SPARK-33942][DOCS] Remove `hiveClientCalls.count` in `CodeGenerator` metrics docs

2020-12-30 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new b156c1f  [SPARK-33942][DOCS] Remove `hiveClientCalls.count` in 
`CodeGenerator` metrics docs
b156c1f is described below

commit b156c1f9073cdb27e1ca8d7752aa0793d160ad0b
Author: Pradyumn Agrawal (pradyumn.ag) 
AuthorDate: Wed Dec 30 17:25:46 2020 -0800

[SPARK-33942][DOCS] Remove `hiveClientCalls.count` in `CodeGenerator` 
metrics docs

### What changes were proposed in this pull request?
Removed **hiveClientCalls.count** from the CodeGenerator metrics listed under 
Component instance = Executor.

### Why are the changes needed?
The Monitoring documentation displayed wrong information about metrics. I had 
referred to the documentation while adding metrics logging in Graphite, and 
this metric was not being reported. I had to check whether the issue was in my 
application, in the Spark code, or in the documentation. The documentation had 
the wrong info.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual, checked it on my forked repository feature branch 
[SPARK-33942](https://github.com/coderbond007/spark/blob/SPARK-33942/docs/monitoring.md)

Closes #30976 from coderbond007/SPARK-33942.

Authored-by: Pradyumn Agrawal (pradyumn.ag) 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 13e8c2840969a17d5ba113686501abd3c23e3c23)
Signed-off-by: Dongjoon Hyun 
---
 docs/monitoring.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index 8471417..ac6b693 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -1228,7 +1228,6 @@ when running in local mode.
   - compilationTime (histogram)
   - generatedClassSize (histogram)
   - generatedMethodSize (histogram)
-  - hiveClientCalls.count
   - sourceCodeSize (histogram)
 
 - namespace=plugin.\


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-33942][DOCS] Remove `hiveClientCalls.count` in `CodeGenerator` metrics docs

2020-12-30 Thread dongjoon

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 087b9ed  [SPARK-33942][DOCS] Remove `hiveClientCalls.count` in 
`CodeGenerator` metrics docs
087b9ed is described below

commit 087b9edb2e15a8ee889c9f57603645cb7c96f107
Author: Pradyumn Agrawal (pradyumn.ag) 
AuthorDate: Wed Dec 30 17:25:46 2020 -0800

[SPARK-33942][DOCS] Remove `hiveClientCalls.count` in `CodeGenerator` 
metrics docs

### What changes were proposed in this pull request?
Removed **hiveClientCalls.count** from the CodeGenerator metrics listed under 
Component instance = Executor.

### Why are the changes needed?
The Monitoring documentation displayed wrong information about metrics. I had 
referred to the documentation while adding metrics logging in Graphite, and 
this metric was not being reported. I had to check whether the issue was in my 
application, in the Spark code, or in the documentation. The documentation had 
the wrong info.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual, checked it on my forked repository feature branch 
[SPARK-33942](https://github.com/coderbond007/spark/blob/SPARK-33942/docs/monitoring.md)

Closes #30976 from coderbond007/SPARK-33942.

Authored-by: Pradyumn Agrawal (pradyumn.ag) 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 13e8c2840969a17d5ba113686501abd3c23e3c23)
Signed-off-by: Dongjoon Hyun 
---
 docs/monitoring.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index c610518..5b3278b 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -1276,7 +1276,6 @@ These metrics are exposed by Spark executors.
   - compilationTime (histogram)
   - generatedClassSize (histogram)
   - generatedMethodSize (histogram)
-  - hiveClientCalls.count
   - sourceCodeSize (histogram)
 
 - namespace=plugin.\





[spark] branch master updated: [SPARK-33942][DOCS] Remove `hiveClientCalls.count` in `CodeGenerator` metrics docs

2020-12-30 Thread dongjoon

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 13e8c28  [SPARK-33942][DOCS] Remove `hiveClientCalls.count` in 
`CodeGenerator` metrics docs
13e8c28 is described below

commit 13e8c2840969a17d5ba113686501abd3c23e3c23
Author: Pradyumn Agrawal (pradyumn.ag) 
AuthorDate: Wed Dec 30 17:25:46 2020 -0800

[SPARK-33942][DOCS] Remove `hiveClientCalls.count` in `CodeGenerator` 
metrics docs

### What changes were proposed in this pull request?
Removed **hiveClientCalls.count** from the CodeGenerator metrics listed under 
Component instance = Executor.

### Why are the changes needed?
The Monitoring documentation displayed wrong information about metrics. I had 
referred to the documentation while adding metrics logging in Graphite, and 
this metric was not being reported. I had to check whether the issue was in my 
application, in the Spark code, or in the documentation. The documentation had 
the wrong info.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual, checked it on my forked repository feature branch 
[SPARK-33942](https://github.com/coderbond007/spark/blob/SPARK-33942/docs/monitoring.md)

Closes #30976 from coderbond007/SPARK-33942.

Authored-by: Pradyumn Agrawal (pradyumn.ag) 
Signed-off-by: Dongjoon Hyun 
---
 docs/monitoring.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index c610518..5b3278b 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -1276,7 +1276,6 @@ These metrics are exposed by Spark executors.
   - compilationTime (histogram)
   - generatedClassSize (histogram)
   - generatedMethodSize (histogram)
-  - hiveClientCalls.count
   - sourceCodeSize (histogram)
 
 - namespace=plugin.\





[spark] branch master updated (f38265d -> 85de644)

2020-12-30 Thread srowen

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from f38265d  [SPARK-33907][SQL] Only prune columns of from_json if parsing 
options is empty
 add 85de644  [SPARK-33804][CORE] Fix compilation warnings about 'view 
bounds are deprecated'

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala | 7 ++-
 core/src/main/scala/org/apache/spark/rdd/package.scala | 6 +-
 2 files changed, 7 insertions(+), 6 deletions(-)





[spark] branch master updated (ba974ea -> f38265d)

2020-12-30 Thread viirya

viirya pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ba974ea  [SPARK-30789][SQL] Support (IGNORE | RESPECT) NULLS for 
LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE
 add f38265d  [SPARK-33907][SQL] Only prune columns of from_json if parsing 
options is empty

No new revisions were added by this update.

Summary of changes:
 .../catalyst/optimizer/OptimizeCsvJsonExprs.scala  |  9 ++-
 .../optimizer/OptimizeJsonExprsSuite.scala | 20 +++
 .../org/apache/spark/sql/JsonFunctionsSuite.scala  | 65 ++
 3 files changed, 92 insertions(+), 2 deletions(-)





[spark] branch master updated: [SPARK-30789][SQL] Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE

2020-12-30 Thread wenchen

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ba974ea  [SPARK-30789][SQL] Support (IGNORE | RESPECT) NULLS for 
LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE
ba974ea is described below

commit ba974ea8e4cc8075056682c2badab5ca64b90047
Author: gengjiaan 
AuthorDate: Wed Dec 30 13:14:31 2020 +

[SPARK-30789][SQL] Support (IGNORE | RESPECT) NULLS for 
LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE

### What changes were proposed in this pull request?
All of `LEAD`/`LAG`/`NTH_VALUE`/`FIRST_VALUE`/`LAST_VALUE` should support 
IGNORE NULLS | RESPECT NULLS. For example:
```
LEAD (value_expr [, offset ])
[ IGNORE NULLS | RESPECT NULLS ]
OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering )
```

```
LAG (value_expr [, offset ])
[ IGNORE NULLS | RESPECT NULLS ]
OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering )
```

```
NTH_VALUE (expr, offset)
[ IGNORE NULLS | RESPECT NULLS ]
OVER
( [ PARTITION BY window_partition ]
[ ORDER BY window_ordering
 frame_clause ] )
```
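
As an illustration only (this is not Spark's implementation), the difference between the two modes can be sketched in plain Python for `LEAD` and `LAG` over a single partition that is already ordered. The helper names `lead` and `lag` and the sample column are hypothetical, and `offset` is assumed to be at least 1.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def lead(values: List[Optional[T]], offset: int = 1,
         ignore_nulls: bool = False,
         default: Optional[T] = None) -> List[Optional[T]]:
    """Emulate LEAD over one already-ordered partition (hypothetical helper)."""
    if offset < 1:
        raise ValueError("this sketch assumes offset >= 1")
    result = []
    for i in range(len(values)):
        # Rows that come after the current row in window order.
        following = values[i + 1:]
        if ignore_nulls:
            # IGNORE NULLS: null rows do not count towards the offset.
            following = [v for v in following if v is not None]
        # Fall back to the default when there are not enough rows left.
        result.append(following[offset - 1] if offset <= len(following) else default)
    return result


def lag(values: List[Optional[T]], offset: int = 1,
        ignore_nulls: bool = False,
        default: Optional[T] = None) -> List[Optional[T]]:
    """LAG is LEAD applied to the partition in reverse order."""
    return lead(values[::-1], offset, ignore_nulls, default)[::-1]


column = [1, None, 3, None, None, 6]
print(lead(column))                     # RESPECT NULLS (the default)
print(lead(column, ignore_nulls=True))  # IGNORE NULLS
print(lag(column, ignore_nulls=True))
```

With `RESPECT NULLS` a null neighbour is returned as-is, while with `IGNORE NULLS` the function keeps looking until it finds a non-null value or runs out of rows.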

Mainstream databases and engines that support this syntax include:
**Oracle**

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0

**Redshift**
https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html

**Presto**
https://prestodb.io/docs/current/functions/window.html

**DB2**

https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm

**Teradata**
https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w

**Snowflake**
https://docs.snowflake.com/en/sql-reference/functions/lead.html
https://docs.snowflake.com/en/sql-reference/functions/lag.html
https://docs.snowflake.com/en/sql-reference/functions/nth_value.html
https://docs.snowflake.com/en/sql-reference/functions/first_value.html
https://docs.snowflake.com/en/sql-reference/functions/last_value.html

**Exasol**

https://docs.exasol.com/sql_references/functions/alphabeticallistfunctions/lead.htm

https://docs.exasol.com/sql_references/functions/alphabeticallistfunctions/lag.htm

https://docs.exasol.com/sql_references/functions/alphabeticallistfunctions/nth_value.htm

https://docs.exasol.com/sql_references/functions/alphabeticallistfunctions/first_value.htm

https://docs.exasol.com/sql_references/functions/alphabeticallistfunctions/last_value.htm

### Why are the changes needed?
Supporting `(IGNORE | RESPECT) NULLS` for 
`LEAD`/`LAG`/`NTH_VALUE`/`FIRST_VALUE`/`LAST_VALUE` is very useful.

### Does this PR introduce _any_ user-facing change?
Yes.

### How was this patch tested?
Jenkins test

Closes #30943 from beliefer/SPARK-30789.

Lead-authored-by: gengjiaan 
Co-authored-by: beliefer 
Signed-off-by: Wenchen Fan 
---
 docs/sql-ref-ansi-compliance.md|   1 +
 .../apache/spark/sql/catalyst/parser/SqlBase.g4|   6 +-
 .../apache/spark/sql/QueryCompilationErrors.scala  |   4 +
 .../spark/sql/catalyst/analysis/Analyzer.scala |  45 +++-
 .../catalyst/analysis/higherOrderFunctions.scala   |   6 +-
 .../spark/sql/catalyst/analysis/unresolved.scala   |   3 +-
 .../spark/sql/catalyst/parser/AstBuilder.scala |   4 +-
 .../sql/catalyst/analysis/AnalysisErrorSuite.scala |  20 ++
 .../src/test/resources/sql-tests/inputs/window.sql | 148 ++-
 .../resources/sql-tests/results/window.sql.out | 280 -
 10 files changed, 508 insertions(+), 9 deletions(-)

diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 8201fd7..16059a5 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -363,6 +363,7 @@ Below is a list of all the keywords in Spark SQL.
 |REPAIR|non-reserved|non-reserved|non-reserved|
 |REPLACE|non-reserved|non-reserved|non-reserved|
 |RESET|non-reserved|non-reserved|non-reserved|
+|RESPECT|non-reserved|non-reserved|non-reserved|
 |RESTRICT|non-reserved|non-reserved|non-reserved|
 |REVOKE|non-reserved|non-reserved|reserved|
 |RIGHT|reserved|strict-non-reserved|reserved|
diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index d2908a5..ab4b783 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -803,7 +803,8 @@ primaryExpression
 | '(' namedExpression (',' namedExpression)+ ')'   
#rowConstructor