[spark] branch master updated (27ed89b7be5 -> be5c85cffee)

2022-06-18 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 27ed89b7be5 [SPARK-38775][ML] cleanup validation functions
 add be5c85cffee [SPARK-36979][SQL][TESTS][FOLLOWUP] Move the test from 
`SQLQuerySuite` to `SQLQueryTestSuite`

No new revisions were added by this update.

Summary of changes:
 .../resources/sql-tests/inputs/non-excludable-rule.sql   |  4 ++++
 .../sql-tests/results/non-excludable-rule.sql.out        | 16 ++++++++++++++++
 .../test/scala/org/apache/spark/sql/SQLQuerySuite.scala  |  7 -------
 3 files changed, 20 insertions(+), 7 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-38775][ML] cleanup validation functions

2022-06-18 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 27ed89b7be5 [SPARK-38775][ML] cleanup validation functions
27ed89b7be5 is described below

commit 27ed89b7be5ebb91e4a0b106b1669a7867a6012d
Author: Ruifeng Zheng 
AuthorDate: Sat Jun 18 21:51:50 2022 -0700

[SPARK-38775][ML] cleanup validation functions

### What changes were proposed in this pull request?
1. Remove the unused `extractInstances` and `extractLabeledPoints` from `Predictor`.
2. Remove the unused `checkNonNegativeWeight` from `functions`.
3. Move `getNumClasses` from `Classifier` to `DatasetUtils`.
4. Move `getNumFeatures` from `MetadataUtils` to `DatasetUtils`.

### Why are the changes needed?
To unify the methods.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing test suites.

Closes #36049 from zhengruifeng/validate_cleanup.

Authored-by: Ruifeng Zheng 
Signed-off-by: Dongjoon Hyun 
---
 .../spark/examples/ml/DeveloperApiExample.scala|   7 +-
 .../main/scala/org/apache/spark/ml/Predictor.scala |  51 +-
 .../spark/ml/classification/Classifier.scala   | 106 +
 .../ml/classification/DecisionTreeClassifier.scala |   3 +-
 .../spark/ml/classification/FMClassifier.scala |   2 +-
 .../spark/ml/classification/GBTClassifier.scala|  20 +---
 .../ml/classification/RandomForestClassifier.scala |   2 +-
 .../spark/ml/clustering/GaussianMixture.scala  |   2 +-
 .../evaluation/BinaryClassificationEvaluator.scala |   7 +-
 .../spark/ml/evaluation/ClusteringEvaluator.scala  |  21 ++--
 .../spark/ml/evaluation/ClusteringMetrics.scala|   6 +-
 .../MulticlassClassificationEvaluator.scala|   8 +-
 .../spark/ml/evaluation/RegressionEvaluator.scala  |  16 ++--
 .../scala/org/apache/spark/ml/feature/LSH.scala|   2 +-
 .../org/apache/spark/ml/feature/RobustScaler.scala |   2 +-
 .../org/apache/spark/ml/feature/Selector.scala |   2 +-
 .../ml/feature/UnivariateFeatureSelector.scala |   2 +-
 .../apache/spark/ml/feature/VectorIndexer.scala|   2 +-
 .../main/scala/org/apache/spark/ml/functions.scala |   6 --
 .../apache/spark/ml/regression/FMRegressor.scala   |   2 +-
 .../apache/spark/ml/regression/GBTRegressor.scala  |  20 +---
 .../regression/GeneralizedLinearRegression.scala   |   2 +-
 .../spark/ml/regression/LinearRegression.scala |   2 +-
 .../org/apache/spark/ml/util/DatasetUtils.scala|  82 +++-
 .../org/apache/spark/ml/util/MetadataUtils.scala   |  14 +--
 .../spark/ml/classification/ClassifierSuite.scala  |  44 +
 project/MimaExcludes.scala |  16 +++-
 27 files changed, 152 insertions(+), 297 deletions(-)

diff --git a/examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala b/examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala
index 487cb27b93f..bfee3301f8e 100644
--- a/examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala
+++ b/examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala
@@ -24,6 +24,7 @@ import org.apache.spark.ml.linalg.{BLAS, Vector, Vectors}
 import org.apache.spark.ml.param.{IntParam, ParamMap}
 import org.apache.spark.ml.util.Identifiable
 import org.apache.spark.sql.{Dataset, Row, SparkSession}
+import org.apache.spark.sql.functions.col
 
 /**
  * A simple example demonstrating how to write your own learning algorithm using Estimator,
@@ -120,8 +121,10 @@ private class MyLogisticRegression(override val uid: String)
 
   // This method is used by fit()
   override protected def train(dataset: Dataset[_]): MyLogisticRegressionModel = {
-    // Extract columns from data using helper method.
-    val oldDataset = extractLabeledPoints(dataset)
+    // Extract columns from data.
+    val oldDataset = dataset.select(col($(labelCol)).cast("double"), col($(featuresCol)))
+      .rdd
+      .map { case Row(l: Double, f: Vector) => LabeledPoint(l, f) }
 
     // Do learning to estimate the coefficients vector.
     val numFeatures = oldDataset.take(1)(0).features.size
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala b/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
index e0b128e3698..9c6eb880c80 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
@@ -18,14 +18,11 @@
 package org.apache.spark.ml
 
 import org.apache.spark.annotation.Since
-import org.apache.spark.ml.feature.{Instance, LabeledPoint}
-import org.apache.spark.ml.functions.checkNonNegativeWeight
-import org.apache.spark.ml.linalg.{Vector, VectorUDT}
+import org.apache.spark.ml.linalg.VectorUDT
 import org.apache.spark.ml.param._
 import 

[spark] branch master updated (a859dd25019 -> 362f27f38e9)

2022-06-18 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from a859dd25019 [SPARK-39509][INFRA] Support `DEFAULT_ARTIFACT_REPOSITORY` 
in `check-license`
 add 362f27f38e9 [SPARK-39507][CORE] `SocketAuthServer` should respect Java 
IPv6 options

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)





[GitHub] [spark-website] srowen commented on a diff in pull request #400: [SPARK-39512] Document docker image release steps

2022-06-18 Thread GitBox


srowen commented on code in PR #400:
URL: https://github.com/apache/spark-website/pull/400#discussion_r901012063


##
site/sitemap.xml:
##
@@ -941,27 +941,27 @@
   <changefreq>weekly</changefreq>
 </url>
 <url>
-  <loc>https://spark.apache.org/graphx/</loc>
+  <loc>https://spark.apache.org/news/</loc>

Review Comment:
   I don't know which ordering is correct, but maybe revert this change?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





[GitHub] [spark-website] holdenk opened a new pull request, #400: [SPARK-39512] Document docker image release steps

2022-06-18 Thread GitBox


holdenk opened a new pull request, #400:
URL: https://github.com/apache/spark-website/pull/400

   Document the docker image release steps for the release manager to follow 
when finalizing the release.





[spark] branch master updated: [SPARK-39509][INFRA] Support `DEFAULT_ARTIFACT_REPOSITORY` in `check-license`

2022-06-18 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new a859dd25019 [SPARK-39509][INFRA] Support `DEFAULT_ARTIFACT_REPOSITORY` 
in `check-license`
a859dd25019 is described below

commit a859dd25019715165ddb0defe3ddfd8e3cba866e
Author: Dongjoon Hyun 
AuthorDate: Sat Jun 18 10:05:24 2022 -0700

[SPARK-39509][INFRA] Support `DEFAULT_ARTIFACT_REPOSITORY` in 
`check-license`

### What changes were proposed in this pull request?

This PR aims to make the `check-license` script support IPv6 environments via 
`DEFAULT_ARTIFACT_REPOSITORY`.

### Why are the changes needed?

Apache Maven Central repository has two separate URLs.
- https://repo.maven.apache.org/maven2/ (IPv4)
- https://ipv6.repo1.maven.org/maven2/ (IPv6)

`DEFAULT_ARTIFACT_REPOSITORY` allows IPv6 users to use 
`ipv6.repo1.maven.org` or Google Maven Central Mirror according to their needs.
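
A minimal sketch of how the override behaves, assuming the `${VAR:-default}` expansion shown in the diff for this commit. The function name `rat_url` and the version `0.13` are hypothetical and are not part of `dev/check-license`; only the URL pattern comes from the patch. Because the expansion is concatenated directly with `org/apache/...`, a custom repository value needs a trailing slash.

```shell
#!/bin/sh
# Hypothetical helper (not in dev/check-license): prints the RAT jar URL
# built with the same parameter expansion that the patch introduces.
rat_url() {
  rat_version="$1"
  # ${VAR:-default} falls back to the default when VAR is unset or empty.
  printf '%s\n' "${DEFAULT_ARTIFACT_REPOSITORY:-https://repo1.maven.org/maven2/}org/apache/rat/apache-rat/${rat_version}/apache-rat-${rat_version}.jar"
}

# No override: the default repository is used.
unset DEFAULT_ARTIFACT_REPOSITORY
rat_url 0.13

# Override for an IPv6-only environment (note the trailing slash).
DEFAULT_ARTIFACT_REPOSITORY="https://ipv6.repo1.maven.org/maven2/"
rat_url 0.13
```

An empty value is treated the same as an unset one, so `DEFAULT_ARTIFACT_REPOSITORY=""` still resolves to the default repository.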

### Does this PR introduce _any_ user-facing change?

No. This is a dev-only change.

### How was this patch tested?

Pass the CIs.

Closes #36907 from dongjoon-hyun/SPARK-39509.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 dev/check-license | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/check-license b/dev/check-license
index bd255954d6d..f1cd5a5f1d4 100755
--- a/dev/check-license
+++ b/dev/check-license
@@ -20,7 +20,7 @@
 
 acquire_rat_jar () {
 
-  URL="https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat/${RAT_VERSION}/apache-rat-${RAT_VERSION}.jar"
+  URL="${DEFAULT_ARTIFACT_REPOSITORY:-https://repo1.maven.org/maven2/}org/apache/rat/apache-rat/${RAT_VERSION}/apache-rat-${RAT_VERSION}.jar"
 
   JAR="$rat_jar"
 

