[spark] branch master updated: [SPARK-34123][WEB UI] optimize spark history summary page loading

2021-01-17 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ebd8bc9  [SPARK-34123][WEB UI] optimize spark history summary page 
loading
ebd8bc9 is described below

commit ebd8bc934de9d6aec53beb4ab60c998052038fad
Author: mohan3d 
AuthorDate: Sun Jan 17 14:37:28 2021 -0600

[SPARK-34123][WEB UI] optimize spark history summary page loading

### What changes were proposed in this pull request?
Display history server entries by feeding data to DataTables directly instead of
Mustache + DataTables, which proved to be faster and non-blocking for the webpage
while searching (using the search bar on the page)

### Why are the changes needed?
Small changes in the attempts (entries) and removed part of HTML (Mustache 
template).

### Does this PR introduce _any_ user-facing change?
Not entirely sure, but it is not supposed to change the way the page looks;
rather, it changes how entries are rendered.

### How was this patch tested?
Running existing tests, since it does not add new functionality.

Closes #31191 from mohan3d/feat/history-server-ui-optimization.

Lead-authored-by: mohan3d 
Co-authored-by: mohan3d 
Signed-off-by: Sean Owen 
---
 .../spark/ui/static/historypage-template.html  | 20 
 .../org/apache/spark/ui/static/historypage.js  | 54 --
 2 files changed, 41 insertions(+), 33 deletions(-)
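The diff below removes the Mustache row template and instead hands DataTables a flat array with one row per attempt. The flattening can be sketched in Python (a hypothetical illustration of the data shaping, not the actual JavaScript; field names follow the diff, `ui_root` is an assumed parameter):

```python
# Hypothetical sketch: flatten the nested applications -> attempts structure
# into one dict per attempt, as the updated historypage.js does before passing
# the array to DataTables via the "data" option.
def flatten_applications(applications, ui_root=""):
    """Turn [{id, name, attempts: [...]}] into one row dict per attempt."""
    rows = []
    for app in applications:
        for attempt in app["attempts"]:
            row = dict(attempt)           # copy so the input is not mutated
            row["id"] = app["id"]
            row["name"] = app["name"]
            # attemptId is only present for multi-attempt applications
            suffix = row["attemptId"] + "/" if "attemptId" in row else ""
            row["attemptUrl"] = f"{ui_root}/history/{app['id']}/{suffix}jobs/"
            rows.append(row)
    return rows

apps = [{"id": "app-1", "name": "demo",
         "attempts": [{"attemptId": "1", "startTime": "2021-01-17"}]}]
print(flatten_applications(apps))
```

With flat rows, each DataTables column can bind a field via `data` and render links in a `render` callback, so no pre-rendered HTML template is needed.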

diff --git 
a/core/src/main/resources/org/apache/spark/ui/static/historypage-template.html 
b/core/src/main/resources/org/apache/spark/ui/static/historypage-template.html
index 7e9927d..5427125 100644
--- 
a/core/src/main/resources/org/apache/spark/ui/static/historypage-template.html
+++ 
b/core/src/main/resources/org/apache/spark/ui/static/historypage-template.html
@@ -75,26 +75,6 @@
   
   
   
-  {{#applications}}
-
-  {{#attempts}}
-  {{version}}
-  {{id}}
-  {{name}}
-  {{#hasMultipleAttempts}}
-  {{attemptId}}
-  {{/hasMultipleAttempts}}
-  {{startTime}}
-  {{#showCompletedColumns}}
-  {{endTime}}
-  {{duration}}
-  {{/showCompletedColumns}}
-  {{sparkUser}}
-  {{lastUpdated}}
-  Download
-  {{/attempts}}
-
-  {{/applications}}
   
 
 
diff --git a/core/src/main/resources/org/apache/spark/ui/static/historypage.js 
b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
index 3a4c815..aa542a7 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/historypage.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
@@ -140,9 +140,13 @@ $(document).ready(function() {
 (attempt.hasOwnProperty("attemptId") ? attempt["attemptId"] + "/" 
: "") + "logs";
   attempt["durationMillisec"] = attempt["duration"];
   attempt["duration"] = formatDuration(attempt["duration"]);
-  var hasAttemptId = attempt.hasOwnProperty("attemptId");
-  var app_clone = {"id" : id, "name" : name, "version": version, 
"hasAttemptId" : hasAttemptId, "attempts" : [attempt]};
-  array.push(app_clone);
+  attempt["id"] = id;
+  attempt["name"] = name;
+  attempt["version"] = version;
+  attempt["attemptUrl"] = uiRoot + "/history/" + id + "/" +
+(attempt.hasOwnProperty("attemptId") ? attempt["attemptId"] + "/" 
: "") + "jobs/";
+
+  array.push(attempt);
 }
   }
   if(array.length < 20) {
@@ -165,17 +169,41 @@ $(document).ready(function() {
 var completedColumnName = 'completed';
 var durationColumnName = 'duration';
 var conf = {
+  "data": array,
   "columns": [
-{name: 'version'},
-{name: 'appId', type: "appid-numeric"},
-{name: 'appName'},
-{name: attemptIdColumnName},
-{name: startedColumnName},
-{name: completedColumnName},
-{name: durationColumnName, type: "title-numeric"},
-{name: 'user'},
-{name: 'lastUpdated'},
-{name: 'eventLog'},
+{name: 'version', data: 'version' },
+{
+  name: 'appId', 
+  type: "appid-numeric", 
+  data: 'id',
+  render:  (id, type, row) => `${id}`
+},
+{name: 'appName', data: 'name' },
+{
+  name: attemptIdColumnName, 
+  data: 'attemptId',
+  render: (attemptId, type, row) => (attemptId ? `${attemptId}` : '')
+},
+{name: startedColumnName, data: 'startTime' },
+{name: completedColumnName, data: 'endTime' },
+{name: durationColumnName, type: "title-numeric", data: 'duration' 
},
+{name: 'user', data: 'sparkUser' },
+{name: 'lastUpdated', data: 'lastUpdated' },

[spark] branch branch-2.4 updated (7ae6c8d -> e0e1e21)

2021-01-17 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 7ae6c8d  [SPARK-34118][CORE][SQL][2.4] Replaces filter and check for 
emptiness with exists or forall
 add e0e1e21  [SPARK-34125][CORE][2.4] Make EventLoggingListener.codecMap 
thread-safe

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/scheduler/EventLoggingListener.scala  | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)
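The thread-safety fix itself is in Scala's `EventLoggingListener`, but the underlying pattern is generic: a lazily populated codec cache shared across threads must make its check-then-insert atomic, otherwise concurrent first lookups can race. A minimal Python sketch of that idea (names are illustrative, not Spark's):

```python
import threading

class CodecMap:
    """Lazily caches one codec object per compression codec name."""
    def __init__(self, factory):
        self._factory = factory          # creates a codec for a given name
        self._codecs = {}
        self._lock = threading.Lock()    # serializes cache mutation

    def get(self, name):
        with self._lock:                 # check-then-insert must be atomic
            if name not in self._codecs:
                self._codecs[name] = self._factory(name)
            return self._codecs[name]

codecs = CodecMap(lambda name: f"codec:{name}")
print(codecs.get("lz4"))   # → codec:lz4
```

Holding the lock for the whole lookup keeps the sketch simple; a larger cache might use double-checked locking or a concurrent map instead.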


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-33730][PYTHON] Standardize warning types

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 098f226  [SPARK-33730][PYTHON] Standardize warning types
098f226 is described below

commit 098f2268e4ad43dd9453ada91161ea428dd57d16
Author: zero323 
AuthorDate: Mon Jan 18 09:32:55 2021 +0900

[SPARK-33730][PYTHON] Standardize warning types

### What changes were proposed in this pull request?

This PR:

- Adds a small hierarchy of warnings to be used in PySpark applications.
These extend built-in classes and a top-level `PySparkWarning`.
- Replaces `DeprecationWarning`s (intended for developers) with PySpark-specific
subclasses of `FutureWarning` (intended for end users).

### Why are the changes needed?

- To be more precise and give users additional control (in addition to
standard module-level filters) over PySpark warning handling.
- Correct semantics (at the moment we use `DeprecationWarning` in
user-facing APIs, but it is intended "for warnings about deprecated features
when those warnings are intended for other Python developers").

### Does this PR introduce _any_ user-facing change?

Yes. Code can raise different types of warnings than before.

### How was this patch tested?

Existing tests.
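As a rough illustration of the hierarchy described above (the class and function names other than `PySparkWarning` are assumptions for this sketch, not the exact ones added to PySpark):

```python
import warnings

class PySparkWarning(Warning):
    """Top-level warning for PySpark-specific messages."""

class PySparkDeprecationWarning(PySparkWarning, FutureWarning):
    """User-facing deprecation; FutureWarning is shown by default,
    unlike DeprecationWarning, which is hidden outside __main__."""

def deprecated_api():
    warnings.warn("Deprecated in 3.x. Use new_api instead.",
                  PySparkDeprecationWarning)

# End users can now target PySpark warnings specifically,
# on top of the standard module-level filters:
warnings.filterwarnings("ignore", category=PySparkWarning)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # override filters for the demonstration
    deprecated_api()
print(caught[0].category.__name__)    # → PySparkDeprecationWarning
```

Because the subclass also derives from `FutureWarning`, existing filters on `FutureWarning` keep working while `PySparkWarning` gives a single hook for all PySpark-emitted warnings.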

Closes #30985 from zero323/SPARK-33730.

Authored-by: zero323 
Signed-off-by: HyukjinKwon 
---
 python/pyspark/ml/clustering.py|  2 +-
 python/pyspark/mllib/classification.py |  2 +-
 python/pyspark/mllib/regression.py |  7 ---
 python/pyspark/rdd.py  | 10 ++
 python/pyspark/sql/catalog.py  |  6 --
 python/pyspark/sql/column.py   |  6 --
 python/pyspark/sql/context.py  | 15 ++-
 python/pyspark/sql/dataframe.py|  4 +++-
 python/pyspark/sql/functions.py|  6 +++---
 python/pyspark/worker.py   | 10 --
 10 files changed, 44 insertions(+), 24 deletions(-)

diff --git a/python/pyspark/ml/clustering.py b/python/pyspark/ml/clustering.py
index 54c1a43..60726cb 100644
--- a/python/pyspark/ml/clustering.py
+++ b/python/pyspark/ml/clustering.py
@@ -821,7 +821,7 @@ class BisectingKMeansModel(JavaModel, 
_BisectingKMeansParams, JavaMLWritable, Ja
 """
 warnings.warn("Deprecated in 3.0.0. It will be removed in future 
versions. Use "
   "ClusteringEvaluator instead. You can also get the cost 
on the training "
-  "dataset in the summary.", DeprecationWarning)
+  "dataset in the summary.", FutureWarning)
 return self._call_java("computeCost", dataset)
 
 @property
diff --git a/python/pyspark/mllib/classification.py 
b/python/pyspark/mllib/classification.py
index bd43e91..5705401 100644
--- a/python/pyspark/mllib/classification.py
+++ b/python/pyspark/mllib/classification.py
@@ -324,7 +324,7 @@ class LogisticRegressionWithSGD(object):
 """
 warnings.warn(
 "Deprecated in 2.0.0. Use ml.classification.LogisticRegression or "
-"LogisticRegressionWithLBFGS.", DeprecationWarning)
+"LogisticRegressionWithLBFGS.", FutureWarning)
 
 def train(rdd, i):
 return callMLlibFunc("trainLogisticRegressionModelWithSGD", rdd, 
int(iterations),
diff --git a/python/pyspark/mllib/regression.py 
b/python/pyspark/mllib/regression.py
index c224e38..3908e4a 100644
--- a/python/pyspark/mllib/regression.py
+++ b/python/pyspark/mllib/regression.py
@@ -299,7 +299,7 @@ class LinearRegressionWithSGD(object):
 (default: 0.001)
 """
 warnings.warn(
-"Deprecated in 2.0.0. Use ml.regression.LinearRegression.", 
DeprecationWarning)
+"Deprecated in 2.0.0. Use ml.regression.LinearRegression.", 
FutureWarning)
 
 def train(rdd, i):
 return callMLlibFunc("trainLinearRegressionModelWithSGD", rdd, 
int(iterations),
@@ -453,7 +453,8 @@ class LassoWithSGD(object):
 warnings.warn(
 "Deprecated in 2.0.0. Use ml.regression.LinearRegression with 
elasticNetParam = 1.0. "
 "Note the default regParam is 0.01 for LassoWithSGD, but is 0.0 
for LinearRegression.",
-DeprecationWarning)
+FutureWarning
+)
 
 def train(rdd, i):
 return callMLlibFunc("trainLassoModelWithSGD", rdd, 
int(iterations), float(step),
@@ -607,7 +608,7 @@ class RidgeRegressionWithSGD(object):
 warnings.warn(
 "Deprecated in 2.0.0. Use ml.regression.LinearRegression with 
elasticNetParam = 0.0. "
 "Note the default regParam is 0.01 for RidgeRegressionWithSGD, but 
is 0.0 for "
-"LinearRegression.", DeprecationWarning)
+"LinearRegression.", FutureWarning)
 
 def tra

[spark] branch master updated (098f226 -> 415506c)

2021-01-17 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 098f226  [SPARK-33730][PYTHON] Standardize warning types
 add 415506c  [SPARK-34142][CORE] Support Fallback Storage Cleanup during 
stopping SparkContext

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/SparkContext.scala |  1 +
 .../org/apache/spark/internal/config/package.scala |  7 ++
 .../org/apache/spark/storage/FallbackStorage.scala | 25 --
 .../spark/storage/FallbackStorageSuite.scala   | 21 +-
 4 files changed, 51 insertions(+), 3 deletions(-)





[spark] branch master updated: [MINOR][DOCS] Fix broken python doc links

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 8847b7f  [MINOR][DOCS] Fix broken python doc links
8847b7f is described below

commit 8847b7fa6d9713646257c6640017aab7e0c22e5a
Author: Huaxin Gao 
AuthorDate: Mon Jan 18 10:06:45 2021 +0900

[MINOR][DOCS] Fix broken python doc links

### What changes were proposed in this pull request?
Fix broken Python doc links

### Why are the changes needed?
Links are broken, as shown in the screenshots below.

![image](https://user-images.githubusercontent.com/13592258/104859361-9f60c980-58d9-11eb-8810-cb0669040af4.png)


![image](https://user-images.githubusercontent.com/13592258/104859350-8b1ccc80-58d9-11eb-9a8a-6ba8792595aa.png)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually checked

Closes #31220 from huaxingao/docs.

Authored-by: Huaxin Gao 
Signed-off-by: HyukjinKwon 
---
 docs/ml-classification-regression.md   | 40 -
 docs/ml-clustering.md  | 10 ++---
 docs/ml-collaborative-filtering.md |  2 +-
 docs/ml-features.md| 80 +-
 docs/ml-frequent-pattern-mining.md |  4 +-
 docs/ml-pipeline.md|  8 ++--
 docs/ml-statistics.md  |  6 +--
 docs/ml-tuning.md  |  4 +-
 docs/mllib-clustering.md   | 18 
 docs/mllib-collaborative-filtering.md  |  2 +-
 docs/mllib-data-types.md   | 44 +--
 docs/mllib-decision-tree.md|  4 +-
 docs/mllib-dimensionality-reduction.md |  4 +-
 docs/mllib-ensembles.md|  8 ++--
 docs/mllib-evaluation-metrics.md   |  8 ++--
 docs/mllib-feature-extraction.md   | 14 +++---
 docs/mllib-frequent-pattern-mining.md  |  6 +--
 docs/mllib-isotonic-regression.md  |  2 +-
 docs/mllib-linear-methods.md   |  4 +-
 docs/mllib-naive-bayes.md  |  8 ++--
 docs/mllib-statistics.md   | 28 ++--
 21 files changed, 152 insertions(+), 152 deletions(-)

diff --git a/docs/ml-classification-regression.md 
b/docs/ml-classification-regression.md
index 247989d..bad74cb 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -85,7 +85,7 @@ More details on parameters can be found in the [Java API 
documentation](api/java
 
 
 
-More details on parameters can be found in the [Python API 
documentation](api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegression).
+More details on parameters can be found in the [Python API 
documentation](api/python/reference/api/pyspark.ml.classification.LogisticRegression.html).
 
 {% include_example python/ml/logistic_regression_with_elastic_net.py %}
 
@@ -135,11 +135,11 @@ Continuing the earlier example:
 
 
 
-[`LogisticRegressionTrainingSummary`](api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegressionSummary)
+[`LogisticRegressionTrainingSummary`](api/python/reference/api/pyspark.ml.classification.LogisticRegressionSummary.html)
 provides a summary for a
-[`LogisticRegressionModel`](api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegressionModel).
+[`LogisticRegressionModel`](api/python/reference/api/pyspark.ml.classification.LogisticRegressionModel.html).
 In the case of binary classification, certain additional metrics are
-available, e.g. ROC curve. See 
[`BinaryLogisticRegressionTrainingSummary`](api/python/pyspark.ml.html#pyspark.ml.classification.BinaryLogisticRegressionTrainingSummary).
+available, e.g. ROC curve. See 
[`BinaryLogisticRegressionTrainingSummary`](api/python/reference/api/pyspark.ml.classification.BinaryLogisticRegressionTrainingSummary.html).
 
 Continuing the earlier example:
 
@@ -232,7 +232,7 @@ More details on parameters can be found in the [Java API 
documentation](api/java
 
 
 
-More details on parameters can be found in the [Python API 
documentation](api/python/pyspark.ml.html#pyspark.ml.classification.DecisionTreeClassifier).
+More details on parameters can be found in the [Python API 
documentation](api/python/reference/api/pyspark.ml.classification.DecisionTreeClassifier.html).
 
 {% include_example python/ml/decision_tree_classification_example.py %}
 
@@ -275,7 +275,7 @@ Refer to the [Java API 
docs](api/java/org/apache/spark/ml/classification/RandomF
 
 
 
-Refer to the [Python API 
docs](api/python/pyspark.ml.html#pyspark.ml.classification.RandomForestClassifier)
 for more details.
+Refer to the [Python API 
docs](api/python/reference/api/pyspark.ml.classification.RandomForestClassifier.html)
 for more details.
 
 {% include_example python/ml/random_forest_classifier_example.py %}
 
@@ -316,7 +316,7 @@ Refer to the [Java API 
docs](api/java/org/apache/spark/ml/classif

[spark] branch branch-3.1 updated: [MINOR][DOCS] Fix broken python doc links

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new a88cbbe  [MINOR][DOCS] Fix broken python doc links
a88cbbe is described below

commit a88cbbe9d84abfb5139d5deab7077d2f6a76dc51
Author: Huaxin Gao 
AuthorDate: Mon Jan 18 10:06:45 2021 +0900

[MINOR][DOCS] Fix broken python doc links

### What changes were proposed in this pull request?
Fix broken Python doc links

### Why are the changes needed?
Links are broken, as shown in the screenshots below.

![image](https://user-images.githubusercontent.com/13592258/104859361-9f60c980-58d9-11eb-8810-cb0669040af4.png)


![image](https://user-images.githubusercontent.com/13592258/104859350-8b1ccc80-58d9-11eb-9a8a-6ba8792595aa.png)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually checked

Closes #31220 from huaxingao/docs.

Authored-by: Huaxin Gao 
Signed-off-by: HyukjinKwon 
(cherry picked from commit 8847b7fa6d9713646257c6640017aab7e0c22e5a)
Signed-off-by: HyukjinKwon 
---
 docs/ml-classification-regression.md   | 40 -
 docs/ml-clustering.md  | 10 ++---
 docs/ml-collaborative-filtering.md |  2 +-
 docs/ml-features.md| 80 +-
 docs/ml-frequent-pattern-mining.md |  4 +-
 docs/ml-pipeline.md|  8 ++--
 docs/ml-statistics.md  |  6 +--
 docs/ml-tuning.md  |  4 +-
 docs/mllib-clustering.md   | 18 
 docs/mllib-collaborative-filtering.md  |  2 +-
 docs/mllib-data-types.md   | 44 +--
 docs/mllib-decision-tree.md|  4 +-
 docs/mllib-dimensionality-reduction.md |  4 +-
 docs/mllib-ensembles.md|  8 ++--
 docs/mllib-evaluation-metrics.md   |  8 ++--
 docs/mllib-feature-extraction.md   | 14 +++---
 docs/mllib-frequent-pattern-mining.md  |  6 +--
 docs/mllib-isotonic-regression.md  |  2 +-
 docs/mllib-linear-methods.md   |  4 +-
 docs/mllib-naive-bayes.md  |  8 ++--
 docs/mllib-statistics.md   | 28 ++--
 21 files changed, 152 insertions(+), 152 deletions(-)

diff --git a/docs/ml-classification-regression.md 
b/docs/ml-classification-regression.md
index 247989d..bad74cb 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -85,7 +85,7 @@ More details on parameters can be found in the [Java API 
documentation](api/java
 
 
 
-More details on parameters can be found in the [Python API 
documentation](api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegression).
+More details on parameters can be found in the [Python API 
documentation](api/python/reference/api/pyspark.ml.classification.LogisticRegression.html).
 
 {% include_example python/ml/logistic_regression_with_elastic_net.py %}
 
@@ -135,11 +135,11 @@ Continuing the earlier example:
 
 
 
-[`LogisticRegressionTrainingSummary`](api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegressionSummary)
+[`LogisticRegressionTrainingSummary`](api/python/reference/api/pyspark.ml.classification.LogisticRegressionSummary.html)
 provides a summary for a
-[`LogisticRegressionModel`](api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegressionModel).
+[`LogisticRegressionModel`](api/python/reference/api/pyspark.ml.classification.LogisticRegressionModel.html).
 In the case of binary classification, certain additional metrics are
-available, e.g. ROC curve. See 
[`BinaryLogisticRegressionTrainingSummary`](api/python/pyspark.ml.html#pyspark.ml.classification.BinaryLogisticRegressionTrainingSummary).
+available, e.g. ROC curve. See 
[`BinaryLogisticRegressionTrainingSummary`](api/python/reference/api/pyspark.ml.classification.BinaryLogisticRegressionTrainingSummary.html).
 
 Continuing the earlier example:
 
@@ -232,7 +232,7 @@ More details on parameters can be found in the [Java API 
documentation](api/java
 
 
 
-More details on parameters can be found in the [Python API 
documentation](api/python/pyspark.ml.html#pyspark.ml.classification.DecisionTreeClassifier).
+More details on parameters can be found in the [Python API 
documentation](api/python/reference/api/pyspark.ml.classification.DecisionTreeClassifier.html).
 
 {% include_example python/ml/decision_tree_classification_example.py %}
 
@@ -275,7 +275,7 @@ Refer to the [Java API 
docs](api/java/org/apache/spark/ml/classification/RandomF
 
 
 
-Refer to the [Python API 
docs](api/python/pyspark.ml.html#pyspark.ml.classification.RandomForestClassifier)
 for more details.
+Refer to the [Python API 
docs](api/python/reference/api/pyspark.ml.classification.RandomForestClassifier.html)
 for more details.
 
 {% include_example python/ml/random_forest_

[spark] branch master updated: [MINOR][DOCS] Fix typos in sql-ref-datatypes.md

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 536a725  [MINOR][DOCS] Fix typos in sql-ref-datatypes.md
536a725 is described below

commit 536a7258a829299a13035eb3550e6ce6f7632677
Author: Mitsuru Kariya 
AuthorDate: Mon Jan 18 13:18:03 2021 +0900

[MINOR][DOCS] Fix typos in sql-ref-datatypes.md

### What changes were proposed in this pull request?
Fixing typos in the docs sql-ref-datatypes.md.

### Why are the changes needed?
To display '' correctly.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually run jekyll.

before this fix

![image](https://user-images.githubusercontent.com/2217224/104865408-3df33600-597f-11eb-857b-c6223ff9159a.png)

after this fix

![image](https://user-images.githubusercontent.com/2217224/104865458-62e7a900-597f-11eb-8a21-6d838eecaaf2.png)
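The underlying issue is that Markdown renderers pass raw angle-bracketed text through as inline HTML, so an unescaped type parameter such as `ARRAY<element_type>` (an assumption here; the exact token is elided above) disappears from the rendered page. Python's stdlib HTML parser shows the effect:

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Records which parts of the input are parsed as tags vs. text."""
    def __init__(self):
        super().__init__()
        self.tags, self.text = [], []
    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
    def handle_data(self, data):
        self.text.append(data)

p = TagCollector()
p.feed("ARRAY<element_type>")
print(p.tags)            # the bracketed part is consumed as a tag
print("".join(p.text))   # only 'ARRAY' survives as visible text
```

Escaping the brackets in the Markdown source (e.g. `ARRAY\<element_type\>`) keeps them as literal text in the generated HTML.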

Closes #31221 from kariya-mitsuru/fix-typo.

Authored-by: Mitsuru Kariya 
Signed-off-by: HyukjinKwon 
---
 docs/sql-ref-datatypes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datatypes.md b/docs/sql-ref-datatypes.md
index 0087867..fe1090e 100644
--- a/docs/sql-ref-datatypes.md
+++ b/docs/sql-ref-datatypes.md
@@ -193,7 +193,7 @@ The following table shows the type names as well as aliases 
used in Spark SQL pa
 |**BinaryType**|BINARY|
 |**DecimalType**|DECIMAL, DEC, NUMERIC|
 |**CalendarIntervalType**|INTERVAL|
-|**ArrayType**|ARRAY|
+|**ArrayType**|ARRAY\|
 |**StructType**|STRUCT|
 |**MapType**|MAP|
 





[spark] branch branch-3.1 updated: [MINOR][DOCS] Fix typos in sql-ref-datatypes.md

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 67ee28c  [MINOR][DOCS] Fix typos in sql-ref-datatypes.md
67ee28c is described below

commit 67ee28c1b50c95c09472de21a85ae753b64d1e03
Author: Mitsuru Kariya 
AuthorDate: Mon Jan 18 13:18:03 2021 +0900

[MINOR][DOCS] Fix typos in sql-ref-datatypes.md

### What changes were proposed in this pull request?
Fixing typos in the docs sql-ref-datatypes.md.

### Why are the changes needed?
To display '' correctly.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually run jekyll.

before this fix

![image](https://user-images.githubusercontent.com/2217224/104865408-3df33600-597f-11eb-857b-c6223ff9159a.png)

after this fix

![image](https://user-images.githubusercontent.com/2217224/104865458-62e7a900-597f-11eb-8a21-6d838eecaaf2.png)

Closes #31221 from kariya-mitsuru/fix-typo.

Authored-by: Mitsuru Kariya 
Signed-off-by: HyukjinKwon 
(cherry picked from commit 536a7258a829299a13035eb3550e6ce6f7632677)
Signed-off-by: HyukjinKwon 
---
 docs/sql-ref-datatypes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datatypes.md b/docs/sql-ref-datatypes.md
index 0087867..fe1090e 100644
--- a/docs/sql-ref-datatypes.md
+++ b/docs/sql-ref-datatypes.md
@@ -193,7 +193,7 @@ The following table shows the type names as well as aliases 
used in Spark SQL pa
 |**BinaryType**|BINARY|
 |**DecimalType**|DECIMAL, DEC, NUMERIC|
 |**CalendarIntervalType**|INTERVAL|
-|**ArrayType**|ARRAY|
+|**ArrayType**|ARRAY\|
 |**StructType**|STRUCT|
 |**MapType**|MAP|
 





[spark] branch branch-3.0 updated: [MINOR][DOCS] Fix typos in sql-ref-datatypes.md

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new d8ce224  [MINOR][DOCS] Fix typos in sql-ref-datatypes.md
d8ce224 is described below

commit d8ce2249d0fb62dce161f5caa4a9143295949f93
Author: Mitsuru Kariya 
AuthorDate: Mon Jan 18 13:18:03 2021 +0900

[MINOR][DOCS] Fix typos in sql-ref-datatypes.md

### What changes were proposed in this pull request?
Fixing typos in the docs sql-ref-datatypes.md.

### Why are the changes needed?
To display '' correctly.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually run jekyll.

before this fix

![image](https://user-images.githubusercontent.com/2217224/104865408-3df33600-597f-11eb-857b-c6223ff9159a.png)

after this fix

![image](https://user-images.githubusercontent.com/2217224/104865458-62e7a900-597f-11eb-8a21-6d838eecaaf2.png)

Closes #31221 from kariya-mitsuru/fix-typo.

Authored-by: Mitsuru Kariya 
Signed-off-by: HyukjinKwon 
(cherry picked from commit 536a7258a829299a13035eb3550e6ce6f7632677)
Signed-off-by: HyukjinKwon 
---
 docs/sql-ref-datatypes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datatypes.md b/docs/sql-ref-datatypes.md
index f27f1a0..144594b 100644
--- a/docs/sql-ref-datatypes.md
+++ b/docs/sql-ref-datatypes.md
@@ -191,7 +191,7 @@ The following table shows the type names as well as aliases 
used in Spark SQL pa
 |**BinaryType**|BINARY|
 |**DecimalType**|DECIMAL, DEC, NUMERIC|
 |**CalendarIntervalType**|INTERVAL|
-|**ArrayType**|ARRAY|
+|**ArrayType**|ARRAY\|
 |**StructType**|STRUCT|
 |**MapType**|MAP|
 





[spark] branch master updated (536a725 -> ac322a1)

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 536a725  [MINOR][DOCS] Fix typos in sql-ref-datatypes.md
 add ac322a1  [SPARK-34080][ML][PYTHON][FOLLOWUP] Add 
UnivariateFeatureSelector - make methods private

No new revisions were added by this update.

Summary of changes:
 .../ml/feature/UnivariateFeatureSelector.scala | 74 --
 1 file changed, 27 insertions(+), 47 deletions(-)





[spark] branch branch-3.1 updated: [SPARK-34080][ML][PYTHON][FOLLOWUP] Add UnivariateFeatureSelector - make methods private

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 56f93e5  [SPARK-34080][ML][PYTHON][FOLLOWUP] Add 
UnivariateFeatureSelector - make methods private
56f93e5 is described below

commit 56f93e56ab731be27a05a299fcbe0ef529f280ba
Author: Ruifeng Zheng 
AuthorDate: Mon Jan 18 13:19:59 2021 +0900

[SPARK-34080][ML][PYTHON][FOLLOWUP] Add UnivariateFeatureSelector - make 
methods private

### What changes were proposed in this pull request?
1. Make `getTopIndices`/`selectIndicesFromPValues` private;
2. Avoid setting `selectionThreshold` in `fit`;
3. Move param checking to `transformSchema`.

### Why are the changes needed?
`getTopIndices`/`selectIndicesFromPValues` should not be exposed to end 
users;

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
existing testsuites
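A hypothetical Python sketch of the refactored logic shown in the diff below: use the user-supplied threshold when set, otherwise fall back to a per-mode default, rather than validating and mutating the param inside `fit` (the defaults mirror the diff; the function name is illustrative):

```python
# Per-mode default thresholds, matching the values in the Scala diff.
DEFAULT_THRESHOLDS = {
    "numTopFeatures": 50.0,
    "percentile": 0.1,
    "fpr": 0.05,
    "fdr": 0.05,
    "fwe": 0.05,
}

def resolve_threshold(selection_mode, selection_threshold=None):
    """Return the effective threshold without mutating any stored param."""
    if selection_mode not in DEFAULT_THRESHOLDS:
        raise ValueError(f"Unsupported selection mode: {selection_mode}")
    if selection_threshold is not None:
        return selection_threshold     # user-set value wins
    return DEFAULT_THRESHOLDS[selection_mode]

print(resolve_threshold("percentile"))          # → 0.1
print(resolve_threshold("numTopFeatures", 20))  # → 20
```

Resolving the default locally keeps `fit` side-effect free, so repeated fits with different modes no longer see a threshold left over from a previous call.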

Closes #31222 from zhengruifeng/selector_clean_up.

Authored-by: Ruifeng Zheng 
Signed-off-by: HyukjinKwon 
(cherry picked from commit ac322a1ac3be79b5e514f0119275f53b3a40c923)
Signed-off-by: HyukjinKwon 
---
 .../ml/feature/UnivariateFeatureSelector.scala | 74 --
 1 file changed, 27 insertions(+), 47 deletions(-)

diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala
 
b/mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala
index 6d5f09e..bfe1d5f 100644
--- 
a/mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala
@@ -76,8 +76,7 @@ private[feature] trait UnivariateFeatureSelectorParams 
extends Params
   @Since("3.1.1")
   final val selectionMode = new Param[String](this, "selectionMode",
 "The selection mode. Supported options: numTopFeatures, percentile, fpr, 
fdr, fwe",
-ParamValidators.inArray(Array("numTopFeatures", "percentile", "fpr", "fdr",
-  "fwe")))
+ParamValidators.inArray(Array("numTopFeatures", "percentile", "fpr", 
"fdr", "fwe")))
 
   /** @group getParam */
   @Since("3.1.1")
@@ -161,48 +160,17 @@ final class UnivariateFeatureSelector 
@Since("3.1.1")(@Since("3.1.1") override v
 transformSchema(dataset.schema, logging = true)
 val numFeatures = MetadataUtils.getNumFeatures(dataset, $(featuresCol))
 
-$(selectionMode) match {
-  case ("numTopFeatures") =>
-if (!isSet(selectionThreshold)) {
-  set(selectionThreshold, 50.0)
-} else {
-  require($(selectionThreshold) > 0 && $(selectionThreshold).toInt == 
$(selectionThreshold),
-"selectionThreshold needs to be a positive Integer for selection 
mode numTopFeatures")
-}
-  case ("percentile") =>
-if (!isSet(selectionThreshold)) {
-  set(selectionThreshold, 0.1)
-} else {
-  require($(selectionThreshold) >= 0 && $(selectionThreshold) <= 1,
-"selectionThreshold needs to be in the range of 0 to 1 for 
selection mode percentile")
-}
-  case ("fpr") =>
-if (!isSet(selectionThreshold)) {
-  set(selectionThreshold, 0.05)
-} else {
-  require($(selectionThreshold) >= 0 && $(selectionThreshold) <= 1,
-"selectionThreshold needs to be in the range of 0 to 1 for 
selection mode fpr")
-}
-  case ("fdr") =>
-if (!isSet(selectionThreshold)) {
-  set(selectionThreshold, 0.05)
-} else {
-  require($(selectionThreshold) >= 0 && $(selectionThreshold) <= 1,
-"selectionThreshold needs to be in the range of 0 to 1 for 
selection mode fdr")
-}
-  case ("fwe") =>
-if (!isSet(selectionThreshold)) {
-  set(selectionThreshold, 0.05)
-} else {
-  require($(selectionThreshold) >= 0 && $(selectionThreshold) <= 1,
-"selectionThreshold needs to be in the range of 0 to 1 for 
selection mode fwe")
-}
-  case _ =>
-throw new IllegalArgumentException(s"Unsupported selection mode:" +
-  s" selectionMode=${$(selectionMode)}")
+var threshold = Double.NaN
+if (isSet(selectionThreshold)) {
+  threshold = $(selectionThreshold)
+} else {
+  $(selectionMode) match {
+case "numTopFeatures" => threshold = 50
+case "percentile" => threshold = 0.1
+case "fpr" | "fdr" | "fwe" => threshold = 0.05
+  }
 }
 
-require(isSet(featureType) && isSet(labelType), "featureType and labelType 
need to be set")
 val resultDF = ($(featureType), $(labelType)) match {
   case ("categorical", "categorical") =>
 ChiSquareTest.test(dataset.toDF, getFeaturesCol, getLabelCol, true)
@@ -215,14 +183,12 @@ final class UnivariateFeatureS

[spark] branch master updated (ac322a1 -> b5bdbf2)

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ac322a1  [SPARK-34080][ML][PYTHON][FOLLOWUP] Add 
UnivariateFeatureSelector - make methods private
 add b5bdbf2  [SPARK-30682][R][SQL][FOLLOW-UP] Keep the name similar with 
Scala side in higher order functions

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/functions.R | 8 
 R/pkg/R/generics.R  | 3 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated (56f93e5 -> 8758773)

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 56f93e5  [SPARK-34080][ML][PYTHON][FOLLOWUP] Add UnivariateFeatureSelector - make methods private
 add 8758773  [SPARK-30682][R][SQL][FOLLOW-UP] Keep the name similar with Scala side in higher order functions

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/functions.R | 8 ++++----
 R/pkg/R/generics.R  | 3 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)





[spark] branch branch-3.1 updated (8758773 -> e1ad275)

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 8758773  [SPARK-30682][R][SQL][FOLLOW-UP] Keep the name similar with Scala side in higher order functions
 add e1ad275  [SPARK-33819][CORE][FOLLOWUP][3.1] Restore the constructor of SingleFileEventLogFileReader to remove Mima exclusion

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/deploy/history/EventLogFileReaders.scala  | 4 +++-
 project/MimaExcludes.scala   | 5 +----
 2 files changed, 4 insertions(+), 5 deletions(-)





[spark] branch branch-3.0 updated: [SPARK-33819][CORE][FOLLOWUP][3.0] Restore the constructor of SingleFileEventLogFileReader to remove Mima exclusion

2021-01-17 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 70c0bc9  [SPARK-33819][CORE][FOLLOWUP][3.0] Restore the constructor of SingleFileEventLogFileReader to remove Mima exclusion
70c0bc9 is described below

commit 70c0bc9a3358fc36b967cc28ae9232fe7437ab6a
Author: Dongjoon Hyun 
AuthorDate: Mon Jan 18 14:31:17 2021 +0900

[SPARK-33819][CORE][FOLLOWUP][3.0] Restore the constructor of SingleFileEventLogFileReader to remove Mima exclusion

### What changes were proposed in this pull request?

This PR proposes to remove Mima exclusion via restoring the old constructor 
of SingleFileEventLogFileReader. This partially adopts the remaining parts of 
#30814 which was excluded while porting back.

### Why are the changes needed?

To remove unnecessary Mima exclusion.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass CIs.

Closes #31225 from HeartSaVioR/SPARK-33819-followup-branch-3.0.

Authored-by: Dongjoon Hyun 
Signed-off-by: HyukjinKwon 
---
 .../scala/org/apache/spark/deploy/history/EventLogFileReaders.scala  | 4 +++-
 project/MimaExcludes.scala   | 5 +----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/EventLogFileReaders.scala b/core/src/main/scala/org/apache/spark/deploy/history/EventLogFileReaders.scala
index 6fe3a7b..b4771c8 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/EventLogFileReaders.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/EventLogFileReaders.scala
@@ -167,9 +167,11 @@ object EventLogFileReader {
 private[history] class SingleFileEventLogFileReader(
 fs: FileSystem,
 path: Path,
-    maybeStatus: Option[FileStatus] = None) extends EventLogFileReader(fs, path) {
+    maybeStatus: Option[FileStatus]) extends EventLogFileReader(fs, path) {
   private lazy val status = maybeStatus.getOrElse(fileSystem.getFileStatus(rootPath))
 
+  def this(fs: FileSystem, path: Path) = this(fs, path, None)
+
   override def lastIndex: Option[Long] = None
 
   override def fileSizeForLastIndex: Long = status.getLen
diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index b4e36dc..b799624 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -461,10 +461,7 @@ object MimaExcludes {
     ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryListener#QueryStartedEvent.this"),
 
     // [SPARK-30667][CORE] Add allGather method to BarrierTaskContext
-    ProblemFilters.exclude[IncompatibleTemplateDefProblem]("org.apache.spark.RequestToSync"),
-
-    // [SPARK-33790][CORE] Reduce the rpc call of getFileStatus in SingleFileEventLogFileReader
-    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.deploy.history.SingleFileEventLogFileReader.this")
+    ProblemFilters.exclude[IncompatibleTemplateDefProblem]("org.apache.spark.RequestToSync")
   )
 
   // Exclude rules for 2.4.x
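The reason restoring the constructor removes the need for the Mima exclusion: a Scala default argument (`maybeStatus: Option[FileStatus] = None`) does not generate a two-argument constructor overload in bytecode, so the old `(fs, path)` constructor had disappeared, which is what MiMa flagged. An explicit auxiliary constructor keeps the old binary signature. A minimal sketch of the pattern, using a hypothetical `Reader` class rather than Spark's actual one:

```scala
// Primary constructor carries the new optional argument; the auxiliary
// constructor below reproduces the pre-change single-argument signature,
// so previously compiled callers keep linking against it.
class Reader(val path: String, maybeSize: Option[Long]) {

  // Old binary-compatible constructor: equivalent at the source level
  // to a default argument, but stable in bytecode.
  def this(path: String) = this(path, None)

  // Only falls back to a computed value when no explicit size was given.
  lazy val size: Long = maybeSize.getOrElse(0L)
}
```

With the auxiliary constructor in place, both `new Reader("a.log")` and `new Reader("a.log", Some(7L))` work, and no compatibility filter is needed for the old form.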





[spark] branch master updated (b5bdbf2 -> c87b008)

2021-01-17 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from b5bdbf2  [SPARK-30682][R][SQL][FOLLOW-UP] Keep the name similar with Scala side in higher order functions
 add c87b008  [SPARK-33696][BUILD][SQL] Upgrade built-in Hive to 2.3.8

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.7-hive-2.3| 26 +++---
 dev/deps/spark-deps-hadoop-3.2-hive-2.3| 26 +++---
 docs/building-spark.md |  4 ++--
 docs/sql-data-sources-hive-tables.md   |  8 +++
 docs/sql-migration-guide.md|  2 +-
 pom.xml| 20 +++--
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 11 -
 .../thriftserver/HiveThriftServer2Suites.scala |  4 ++--
 .../org/apache/spark/sql/hive/HiveUtils.scala  |  2 +-
 .../sql/hive/client/IsolatedClientLoader.scala |  4 ++--
 .../org/apache/spark/sql/hive/client/package.scala | 10 -
 .../hive/HiveExternalCatalogVersionsSuite.scala|  2 +-
 .../spark/sql/hive/execution/HiveQuerySuite.scala  |  7 +++---
 13 files changed, 72 insertions(+), 54 deletions(-)





[spark] branch master updated (c87b008 -> 78893b8)

2021-01-17 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from c87b008  [SPARK-33696][BUILD][SQL] Upgrade built-in Hive to 2.3.8
 add 78893b8  [SPARK-34139][SQL] UnresolvedRelation should retain SQL text position for DDL commands

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/CheckAnalysis.scala  |  6 +--
 .../spark/sql/catalyst/parser/AstBuilder.scala | 43 +-
 .../analysis/AnalysisExceptionPositionSuite.scala  | 17 +
 .../sql-tests/results/postgreSQL/with.sql.out  |  2 +-
 4 files changed, 46 insertions(+), 22 deletions(-)

