spark git commit: [SPARK-20090][PYTHON] Add StructType.fieldNames in PySpark

2017-07-28 Thread holden
Repository: spark
Updated Branches:
  refs/heads/master 0ef9fe64e -> b56f79cc3


[SPARK-20090][PYTHON] Add StructType.fieldNames in PySpark

## What changes were proposed in this pull request?

This PR proposes `StructType.fieldNames`, which returns a copy of the field name list, rather than the (undocumented) `StructType.names`.

There are two points here:

  - API consistency with Scala/Java

  - Provide a safe way to get the field names. Manipulating the list returned by `names` directly can cause unexpected behaviour, as below:

```python
from pyspark.sql.types import *

struct = StructType([StructField("f1", StringType(), True)])
names = struct.names
del names[0]
spark.createDataFrame([{"f1": 1}], struct).show()
```

```
...
java.lang.IllegalStateException: Input row doesn't have expected number of values required by the schema. 1 fields are required while 0 values are provided.
    at org.apache.spark.sql.execution.python.EvaluatePython$.fromJava(EvaluatePython.scala:138)
    at org.apache.spark.sql.SparkSession$$anonfun$6.apply(SparkSession.scala:741)
    at org.apache.spark.sql.SparkSession$$anonfun$6.apply(SparkSession.scala:741)
...
```
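
With the patch, `fieldNames()` returns a copy, so mutating the returned list cannot corrupt the schema. A minimal sketch (assuming a build that contains this change, and the same `spark` session as in the reproducer above):

```python
from pyspark.sql.types import StructField, StructType, StringType

struct = StructType([StructField("f1", StringType(), True)])
names = struct.fieldNames()   # a copy of the field name list
del names[0]                  # mutating the copy...
print(struct.fieldNames())    # ...leaves the schema intact: ['f1']
spark.createDataFrame([{"f1": "a"}], struct).show()  # schema unaffected, no error
```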

## How was this patch tested?

Added tests in `python/pyspark/sql/tests.py`.

Author: hyukjinkwon 

Closes #18618 from HyukjinKwon/SPARK-20090.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b56f79cc
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b56f79cc
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b56f79cc

Branch: refs/heads/master
Commit: b56f79cc359d093d757af83171175cfd933162d1
Parents: 0ef9fe6
Author: hyukjinkwon 
Authored: Fri Jul 28 20:59:32 2017 -0700
Committer: Holden Karau 
Committed: Fri Jul 28 20:59:32 2017 -0700

--
 python/pyspark/sql/tests.py | 16 ++++++++--------
 python/pyspark/sql/types.py | 15 ++++++++++++++-
 2 files changed, 22 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b56f79cc/python/pyspark/sql/tests.py
--
diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py
index 54756ed..cfd9c55 100644
--- a/python/pyspark/sql/tests.py
+++ b/python/pyspark/sql/tests.py
@@ -1241,26 +1241,29 @@ class SQLTests(ReusedPySparkTestCase):
 struct1 = StructType().add("f1", StringType(), True).add("f2", StringType(), True, None)
 struct2 = StructType([StructField("f1", StringType(), True),
   StructField("f2", StringType(), True, None)])
+self.assertEqual(struct1.fieldNames(), struct2.names)
 self.assertEqual(struct1, struct2)
 
 struct1 = StructType().add("f1", StringType(), True).add("f2", StringType(), True, None)
 struct2 = StructType([StructField("f1", StringType(), True)])
+self.assertNotEqual(struct1.fieldNames(), struct2.names)
 self.assertNotEqual(struct1, struct2)
 
 struct1 = (StructType().add(StructField("f1", StringType(), True))
.add(StructField("f2", StringType(), True, None)))
 struct2 = StructType([StructField("f1", StringType(), True),
   StructField("f2", StringType(), True, None)])
+self.assertEqual(struct1.fieldNames(), struct2.names)
 self.assertEqual(struct1, struct2)
 
 struct1 = (StructType().add(StructField("f1", StringType(), True))
.add(StructField("f2", StringType(), True, None)))
 struct2 = StructType([StructField("f1", StringType(), True)])
+self.assertNotEqual(struct1.fieldNames(), struct2.names)
 self.assertNotEqual(struct1, struct2)
 
 # Catch exception raised during improper construction
-with self.assertRaises(ValueError):
-struct1 = StructType().add("name")
+self.assertRaises(ValueError, lambda: StructType().add("name"))
 
 struct1 = StructType().add("f1", StringType(), True).add("f2", StringType(), True, None)
 for field in struct1:
@@ -1273,12 +1276,9 @@ class SQLTests(ReusedPySparkTestCase):
 self.assertIs(struct1["f1"], struct1.fields[0])
 self.assertIs(struct1[0], struct1.fields[0])
 self.assertEqual(struct1[0:1], StructType(struct1.fields[0:1]))
-with self.assertRaises(KeyError):
-not_a_field = struct1["f9"]
-with self.assertRaises(IndexError):
-not_a_field = struct1[9]
-with self.assertRaises(TypeError):
-not_a_field = struct1[9.9]
+self.assertRaises(KeyError, lambda: struct1["f9"])
+self.assertRaises(IndexError, lambda: struct1[9])
+self.assertRaises(TypeError, lambda: struct1[9.9])
 
 def 

spark git commit: Typo in comment

2017-07-28 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 69ab0e4bd -> 0ef9fe64e


Typo in comment

-

Author: Johan Grande 

Closes #18738 from nahoj/patch-1.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0ef9fe64
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0ef9fe64
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0ef9fe64

Branch: refs/heads/master
Commit: 0ef9fe64e2d59bdf98c0f4729e5017d258f7c07d
Parents: 69ab0e4
Author: Johan Grande 
Authored: Fri Jul 28 16:51:18 2017 +0100
Committer: Sean Owen 
Committed: Fri Jul 28 16:51:18 2017 +0100

--
 core/src/main/scala/org/apache/spark/rdd/RDD.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/0ef9fe64/core/src/main/scala/org/apache/spark/rdd/RDD.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index 2985c90..5435f59 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -55,7 +55,7 @@ import org.apache.spark.util.random.{BernoulliCellSampler, BernoulliSampler, Poi
  * Doubles; and
  * [[org.apache.spark.rdd.SequenceFileRDDFunctions]] contains operations available on RDDs that
  * can be saved as SequenceFiles.
- * All operations are automatically available on any RDD of the right type (e.g. RDD[(Int, Int)]
+ * All operations are automatically available on any RDD of the right type (e.g. RDD[(Int, Int)])
  * through implicit.
  *
  * Internally, each RDD is characterized by five main properties:





spark git commit: [SPARK-21541][YARN] Spark Logs show incorrect job status for a job that does not create SparkContext

2017-07-28 Thread tgraves
Repository: spark
Updated Branches:
  refs/heads/master 784680903 -> 69ab0e4bd


[SPARK-21541][YARN] Spark Logs show incorrect job status for a job that does not create SparkContext

If you run a Spark job without creating a SparkSession or SparkContext, the Spark logs say the job succeeded, but YARN reports it as failed and retries it 3 times. Also, since the ApplicationMaster unregisters from the ResourceManager and exits successfully, it deletes the Spark staging directory, so the subsequent YARN retries cannot find the staging directory and fail as well.

This patch adds a flag that tracks whether the user has initialized a SparkContext. The ApplicationMaster unregisters from the ResourceManager only when that flag is true; otherwise it does not unregister, so YARN records the correct final status.
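
A minimal reproducer sketch (my own illustration, not from the PR): a cluster-mode application that exits without ever constructing a SparkContext. Before this fix the AM could still report success to YARN; with it, the final status becomes FAILED with exit code `ApplicationMaster.EXIT_SC_NOT_INITED`.

```python
# no_context.py -- hypothetical no-op job, submitted e.g. with:
#   spark-submit --master yarn --deploy-mode cluster no_context.py
# It never creates a SparkContext, so the AM must not unregister as SUCCEEDED.
import sys

print("exiting without creating a SparkContext")
sys.exit(0)
```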

## How was this patch tested?
Manually tested the fix.
Before:
https://user-images.githubusercontent.com/8190/28647214-69bf81e2-722b-11e7-9ed0-d416d2bf23be.png

After:
https://user-images.githubusercontent.com/8190/28647220-70f9eea2-722b-11e7-85c6-e56276b15614.png

Author: pgandhi 
Author: pgandhi999 

Closes #18741 from pgandhi999/SPARK-21541.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/69ab0e4b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/69ab0e4b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/69ab0e4b

Branch: refs/heads/master
Commit: 69ab0e4bddccb461f960fcb48a390a1517e504dd
Parents: 7846809
Author: pgandhi 
Authored: Fri Jul 28 09:23:08 2017 -0500
Committer: Tom Graves 
Committed: Fri Jul 28 09:23:08 2017 -0500

--
 .../spark/deploy/yarn/ApplicationMaster.scala   | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/69ab0e4b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
--
diff --git a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
index fc92502..ca6a3ef 100644
--- a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
+++ b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
@@ -90,6 +90,9 @@ private[spark] class ApplicationMaster(
   @volatile private var reporterThread: Thread = _
   @volatile private var allocator: YarnAllocator = _
 
+  // A flag to check whether user has initialized spark context
+  @volatile private var registered = false
+
   private val userClassLoader = {
 val classpath = Client.getUserClasspath(sparkConf)
 val urls = classpath.map { entry =>
@@ -319,7 +322,7 @@ private[spark] class ApplicationMaster(
*/
  final def unregister(status: FinalApplicationStatus, diagnostics: String = null): Unit = {
 synchronized {
-  if (!unregistered) {
+  if (registered && !unregistered) {
 logInfo(s"Unregistering ApplicationMaster with $status" +
   Option(diagnostics).map(msg => s" (diag message: $msg)").getOrElse(""))
 unregistered = true
@@ -332,10 +335,15 @@ private[spark] class ApplicationMaster(
 synchronized {
   if (!finished) {
 val inShutdown = ShutdownHookManager.inShutdown()
-logInfo(s"Final app status: $status, exitCode: $code" +
+if (registered) {
+  exitCode = code
+  finalStatus = status
+} else {
+  finalStatus = FinalApplicationStatus.FAILED
+  exitCode = ApplicationMaster.EXIT_SC_NOT_INITED
+}
+logInfo(s"Final app status: $finalStatus, exitCode: $exitCode" +
   Option(msg).map(msg => s", (reason: $msg)").getOrElse(""))
-exitCode = code
-finalStatus = status
 finalMsg = msg
 finished = true
 if (!inShutdown && Thread.currentThread() != reporterThread && reporterThread != null) {
@@ -439,12 +447,11 @@ private[spark] class ApplicationMaster(
   sc.getConf.get("spark.driver.port"),
   isClusterMode = true)
registerAM(sc.getConf, rpcEnv, driverRef, sc.ui.map(_.webUrl), securityMgr)
+registered = true
   } else {
// Sanity check; should never happen in normal operation, since sc should only be null
// if the user app did not create a SparkContext.
-if (!finished) {
-  throw new IllegalStateException("SparkContext is null but app is still running!")
-}
+throw new IllegalStateException("User did not initialize spark context!")
   

spark git commit: [SPARK-21553][SPARK SHELL] Add the description of the default value of master parameter in the spark-shell

2017-07-28 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 63d168cbb -> 784680903


[SPARK-21553][SPARK SHELL] Add the description of the default value of master parameter in the spark-shell

When I type `spark-shell --help`, I find that the description of the default value for the master parameter is missing. Users do not know what the default is when the master parameter is omitted, so the default value should be added to the master parameter's help text.
https://issues.apache.org/jira/browse/SPARK-21553
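
As a quick illustration (my own sketch, not part of the patch): a shell launched without `--master` falls back to `local[*]`, which is exactly the default the new help text documents. Assuming a stock PySpark shell:

```python
from pyspark.sql import SparkSession

# Inside a shell started with no --master flag (e.g. plain `pyspark`),
# spark-submit applies the documented default master, local[*].
spark = SparkSession.builder.getOrCreate()
print(spark.sparkContext.master)  # expected: local[*]
```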

Author: davidxdh 
Author: Donghui Xu 

Closes #18755 from davidxdh/dev_0728.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/78468090
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/78468090
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/78468090

Branch: refs/heads/master
Commit: 784680903cdf003ac965b2d97b3f793324909bd3
Parents: 63d168c
Author: davidxdh 
Authored: Fri Jul 28 15:21:45 2017 +0100
Committer: Sean Owen 
Committed: Fri Jul 28 15:21:45 2017 +0100

--
 .../main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/78468090/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
index fd15211..3721b98 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
@@ -514,7 +514,8 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
 outStream.println(
   s"""
 |Options:
-|  --master MASTER_URL spark://host:port, mesos://host:port, yarn, or local.
+|  --master MASTER_URL spark://host:port, mesos://host:port, yarn, or local
+|  (Default: local[*]).
 |  --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
 |  on one of the worker machines inside the cluster ("cluster")
 |  (Default: client).





spark git commit: Revert "[SPARK-21306][ML] OneVsRest should support setWeightCol"

2017-07-28 Thread yliang
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 8520d7c6d -> 258ca40cf


Revert "[SPARK-21306][ML] OneVsRest should support setWeightCol"

This reverts commit 8520d7c6d5e880dea3c1a8a874148c07222b4b4b.
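
For context, a minimal sketch of the `OneVsRest` API as it stands after this revert (no weight-column support); `train_df` is a hypothetical DataFrame with label and features columns:

```python
from pyspark.ml.classification import LogisticRegression, OneVsRest

lr = LogisticRegression(maxIter=10, tol=1e-6)
ovr = OneVsRest(classifier=lr)   # no setWeightCol after this revert
model = ovr.fit(train_df)        # train_df: hypothetical (label, features) DataFrame
predictions = model.transform(train_df)
```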


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/258ca40c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/258ca40c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/258ca40c

Branch: refs/heads/branch-2.1
Commit: 258ca40cf43eedae59b014a41fc6197df9bde299
Parents: 8520d7c
Author: Yanbo Liang 
Authored: Fri Jul 28 20:24:54 2017 +0800
Committer: Yanbo Liang 
Committed: Fri Jul 28 20:24:54 2017 +0800

--
 .../spark/ml/classification/OneVsRest.scala | 39 ++--
 .../ml/classification/OneVsRestSuite.scala  | 10 -
 python/pyspark/ml/classification.py | 27 +++---
 python/pyspark/ml/tests.py  | 14 ---
 4 files changed, 9 insertions(+), 81 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/258ca40c/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala b/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
index c4a8f1f..e58b30d 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
@@ -34,7 +34,6 @@ import org.apache.spark.ml._
 import org.apache.spark.ml.attribute._
 import org.apache.spark.ml.linalg.Vector
 import org.apache.spark.ml.param.{Param, ParamMap, ParamPair, Params}
-import org.apache.spark.ml.param.shared.HasWeightCol
 import org.apache.spark.ml.util._
 import org.apache.spark.sql.{DataFrame, Dataset, Row}
 import org.apache.spark.sql.functions._
@@ -54,8 +53,7 @@ private[ml] trait ClassifierTypeTrait {
 /**
  * Params for [[OneVsRest]].
  */
-private[ml] trait OneVsRestParams extends PredictorParams
-  with ClassifierTypeTrait with HasWeightCol {
+private[ml] trait OneVsRestParams extends PredictorParams with ClassifierTypeTrait {
 
   /**
    * param for the base binary classifier that we reduce multiclass classification into.
@@ -301,18 +299,6 @@ final class OneVsRest @Since("1.4.0") (
   @Since("1.5.0")
   def setPredictionCol(value: String): this.type = set(predictionCol, value)
 
-  /**
-   * Sets the value of param [[weightCol]].
-   *
-   * This is ignored if weight is not supported by [[classifier]].
-   * If this is not set or empty, we treat all instance weights as 1.0.
-   * Default is not set, so all instances have weight one.
-   *
-   * @group setParam
-   */
-  @Since("2.3.0")
-  def setWeightCol(value: String): this.type = set(weightCol, value)
-
   @Since("1.4.0")
   override def transformSchema(schema: StructType): StructType = {
 validateAndTransformSchema(schema, fitting = true, getClassifier.featuresDataType)
@@ -331,20 +317,7 @@ final class OneVsRest @Since("1.4.0") (
 }
 val numClasses = MetadataUtils.getNumClasses(labelSchema).fold(computeNumClasses())(identity)
 
-val weightColIsUsed = isDefined(weightCol) && $(weightCol).nonEmpty && {
-  getClassifier match {
-case _: HasWeightCol => true
-case c =>
-  logWarning(s"weightCol is ignored, as it is not supported by $c now.")
-  false
-  }
-}
-
-val multiclassLabeled = if (weightColIsUsed) {
-  dataset.select($(labelCol), $(featuresCol), $(weightCol))
-} else {
-  dataset.select($(labelCol), $(featuresCol))
-}
+val multiclassLabeled = dataset.select($(labelCol), $(featuresCol))
 
 // persist if underlying dataset is not persistent.
 val handlePersistence = dataset.rdd.getStorageLevel == StorageLevel.NONE
@@ -364,13 +337,7 @@ final class OneVsRest @Since("1.4.0") (
   paramMap.put(classifier.labelCol -> labelColName)
   paramMap.put(classifier.featuresCol -> getFeaturesCol)
   paramMap.put(classifier.predictionCol -> getPredictionCol)
-  if (weightColIsUsed) {
-val classifier_ = classifier.asInstanceOf[ClassifierType with HasWeightCol]
-paramMap.put(classifier_.weightCol -> getWeightCol)
-classifier_.fit(trainingDataset, paramMap)
-  } else {
-classifier.fit(trainingDataset, paramMap)
-  }
+  classifier.fit(trainingDataset, paramMap)
 }.toArray[ClassificationModel[_, _]]
 
 if (handlePersistence) {

http://git-wip-us.apache.org/repos/asf/spark/blob/258ca40c/mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala
--
diff --git 

spark git commit: Revert "[SPARK-21306][ML] OneVsRest should support setWeightCol"

2017-07-28 Thread yliang
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 ccb827224 -> f8ae2bdd2


Revert "[SPARK-21306][ML] OneVsRest should support setWeightCol"

This reverts commit ccb82722450c20c9cdea2b2c68783943213a5aa1.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f8ae2bdd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f8ae2bdd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f8ae2bdd

Branch: refs/heads/branch-2.0
Commit: f8ae2bdd2112780ec2b1104119bac2b718a55413
Parents: ccb8272
Author: Yanbo Liang 
Authored: Fri Jul 28 19:45:14 2017 +0800
Committer: Yanbo Liang 
Committed: Fri Jul 28 19:45:14 2017 +0800

--
 .../spark/ml/classification/OneVsRest.scala | 39 ++--
 .../ml/classification/OneVsRestSuite.scala  | 10 -
 python/pyspark/ml/classification.py | 27 +++---
 python/pyspark/ml/tests.py  | 14 ---
 4 files changed, 9 insertions(+), 81 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f8ae2bdd/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala b/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
index 770d5db..f4ab0a0 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
@@ -34,7 +34,6 @@ import org.apache.spark.ml._
 import org.apache.spark.ml.attribute._
 import org.apache.spark.ml.linalg.Vector
 import org.apache.spark.ml.param.{Param, ParamMap, ParamPair, Params}
-import org.apache.spark.ml.param.shared.HasWeightCol
 import org.apache.spark.ml.util._
 import org.apache.spark.sql.{DataFrame, Dataset, Row}
 import org.apache.spark.sql.functions._
@@ -54,8 +53,7 @@ private[ml] trait ClassifierTypeTrait {
 /**
  * Params for [[OneVsRest]].
  */
-private[ml] trait OneVsRestParams extends PredictorParams
-  with ClassifierTypeTrait with HasWeightCol {
+private[ml] trait OneVsRestParams extends PredictorParams with ClassifierTypeTrait {
 
   /**
    * param for the base binary classifier that we reduce multiclass classification into.
@@ -292,18 +290,6 @@ final class OneVsRest @Since("1.4.0") (
   @Since("1.5.0")
   def setPredictionCol(value: String): this.type = set(predictionCol, value)
 
-  /**
-   * Sets the value of param [[weightCol]].
-   *
-   * This is ignored if weight is not supported by [[classifier]].
-   * If this is not set or empty, we treat all instance weights as 1.0.
-   * Default is not set, so all instances have weight one.
-   *
-   * @group setParam
-   */
-  @Since("2.3.0")
-  def setWeightCol(value: String): this.type = set(weightCol, value)
-
   @Since("1.4.0")
   override def transformSchema(schema: StructType): StructType = {
 validateAndTransformSchema(schema, fitting = true, getClassifier.featuresDataType)
@@ -322,20 +308,7 @@ final class OneVsRest @Since("1.4.0") (
 }
 val numClasses = MetadataUtils.getNumClasses(labelSchema).fold(computeNumClasses())(identity)
 
-val weightColIsUsed = isDefined(weightCol) && $(weightCol).nonEmpty && {
-  getClassifier match {
-case _: HasWeightCol => true
-case c =>
-  logWarning(s"weightCol is ignored, as it is not supported by $c now.")
-  false
-  }
-}
-
-val multiclassLabeled = if (weightColIsUsed) {
-  dataset.select($(labelCol), $(featuresCol), $(weightCol))
-} else {
-  dataset.select($(labelCol), $(featuresCol))
-}
+val multiclassLabeled = dataset.select($(labelCol), $(featuresCol))
 
 // persist if underlying dataset is not persistent.
 val handlePersistence = dataset.rdd.getStorageLevel == StorageLevel.NONE
@@ -355,13 +328,7 @@ final class OneVsRest @Since("1.4.0") (
   paramMap.put(classifier.labelCol -> labelColName)
   paramMap.put(classifier.featuresCol -> getFeaturesCol)
   paramMap.put(classifier.predictionCol -> getPredictionCol)
-  if (weightColIsUsed) {
-val classifier_ = classifier.asInstanceOf[ClassifierType with HasWeightCol]
-paramMap.put(classifier_.weightCol -> getWeightCol)
-classifier_.fit(trainingDataset, paramMap)
-  } else {
-classifier.fit(trainingDataset, paramMap)
-  }
+  classifier.fit(trainingDataset, paramMap)
 }.toArray[ClassificationModel[_, _]]
 
 if (handlePersistence) {

http://git-wip-us.apache.org/repos/asf/spark/blob/f8ae2bdd/mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala
--
diff --git 

spark git commit: [MINOR][BUILD] Fix current lint-java failures

2017-07-28 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master a5a318997 -> 63d168cbb


[MINOR][BUILD] Fix current lint-java failures

## What changes were proposed in this pull request?

Fixes current failures in dev/lint-java

## How was this patch tested?

Existing linter, tests.

Author: Sean Owen 

Closes #18757 from srowen/LintJava.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/63d168cb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/63d168cb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/63d168cb

Branch: refs/heads/master
Commit: 63d168cbb8ebe80bfcf91a4d5070f1d9f3ec7aca
Parents: a5a3189
Author: Sean Owen 
Authored: Fri Jul 28 11:31:40 2017 +0100
Committer: Sean Owen 
Committed: Fri Jul 28 11:31:40 2017 +0100

--
 .../org/apache/spark/network/TransportRequestHandlerSuite.java | 6 ++++--
 .../org/apache/spark/shuffle/sort/ShuffleExternalSorter.java   | 6 ++++--
 .../spark/sql/execution/vectorized/ArrowColumnVector.java      | 2 +-
 3 files changed, 9 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/63d168cb/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
--
diff --git a/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java b/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
index 1fb987a..1ed5711 100644
--- a/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
+++ b/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
@@ -105,7 +105,7 @@ public class TransportRequestHandlerSuite {
 private List listeners = new ArrayList<>();
 private boolean success;
 
-public ExtendedChannelPromise(Channel channel) {
+ExtendedChannelPromise(Channel channel) {
   super(channel);
   success = false;
 }
@@ -127,7 +127,9 @@ public class TransportRequestHandlerSuite {
   listeners.forEach(listener -> {
 try {
   listener.operationComplete(this);
-} catch (Exception e) { }
+} catch (Exception e) {
+  // do nothing
+}
   });
 }
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/63d168cb/core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java
--
diff --git a/core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java b/core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java
index 338faaa..da6c55d 100644
--- a/core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java
+++ b/core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java
@@ -120,14 +120,16 @@ final class ShuffleExternalSorter extends MemoryConsumer {
 this.taskContext = taskContext;
 this.numPartitions = numPartitions;
 // Use getSizeAsKb (not bytes) to maintain backwards compatibility if no units are provided
-this.fileBufferSizeBytes = (int) (long) conf.get(package$.MODULE$.SHUFFLE_FILE_BUFFER_SIZE()) * 1024;
+this.fileBufferSizeBytes =
+(int) (long) conf.get(package$.MODULE$.SHUFFLE_FILE_BUFFER_SIZE()) * 1024;
 this.numElementsForSpillThreshold =
   conf.getLong("spark.shuffle.spill.numElementsForceSpillThreshold", 1024 * 1024 * 1024);
 this.writeMetrics = writeMetrics;
 this.inMemSorter = new ShuffleInMemorySorter(
   this, initialSize, conf.getBoolean("spark.shuffle.sort.useRadixSort", true));
 this.peakMemoryUsedBytes = getMemoryUsage();
-this.diskWriteBufferSize = (int) (long) conf.get(package$.MODULE$.SHUFFLE_DISK_WRITE_BUFFER_SIZE());
+this.diskWriteBufferSize =
+(int) (long) conf.get(package$.MODULE$.SHUFFLE_DISK_WRITE_BUFFER_SIZE());
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/63d168cb/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ArrowColumnVector.java
--
diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ArrowColumnVector.java b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ArrowColumnVector.java
index 31dea6a..59d66c5 100644
--- a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ArrowColumnVector.java
+++ b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ArrowColumnVector.java
@@ -322,7 +322,7 @@ public final class ArrowColumnVector extends ReadOnlyColumnVector {
 anyNullsSet = numNulls > 0;
   }
 
-  private static abstract class 

spark-website git commit: Update committer page

2017-07-28 Thread gurwls223
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 6ff5039f3 -> 0e09b2f58


Update committer page


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/0e09b2f5
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/0e09b2f5
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/0e09b2f5

Branch: refs/heads/asf-site
Commit: 0e09b2f580b32e16a6eef81e520e909174ebdb4d
Parents: 6ff5039
Author: hyukjinkwon 
Authored: Fri Jul 28 10:22:44 2017 +0900
Committer: hyukjinkwon 
Committed: Fri Jul 28 10:30:43 2017 +0900

--
 committers.md        | 1 +
 site/committers.html | 4 ++++
 2 files changed, 5 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/0e09b2f5/committers.md
--
diff --git a/committers.md b/committers.md
index e850f8b..a4965cb 100644
--- a/committers.md
+++ b/committers.md
@@ -30,6 +30,7 @@ navigation:
 |Shane Huang|Intel|
 |Holden Karau|IBM|
 |Andy Konwinski|Databricks|
+|Hyukjin Kwon|Mobigen|
 |Ryan LeCompte|Quantifind|
 |Haoyuan Li|Alluxio, UC Berkeley|
 |Xiao Li|Databricks|

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0e09b2f5/site/committers.html
--
diff --git a/site/committers.html b/site/committers.html
index b3137ca..f69529d 100644
--- a/site/committers.html
+++ b/site/committers.html
@@ -285,6 +285,10 @@
   Databricks
 
 
+  Hyukjin Kwon
+  Mobigen
+
+
   Ryan LeCompte
   Quantifind
 

