[GitHub] spark issue #21506: [SPARK-24485][SS] Measure and log elapsed time for files...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21506
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91626/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21506: [SPARK-24485][SS] Measure and log elapsed time for files...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21506
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21506: [SPARK-24485][SS] Measure and log elapsed time for files...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21506
  
**[Test build #91626 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91626/testReport)**
 for PR 21506 at commit 
[`d84f98f`](https://github.com/apache/spark/commit/d84f98fc978262f4165f78b3b223b8bb3151f735).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...

2018-06-09 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21501#discussion_r194244802
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, 
HasInputCol, HasOutputCol, JavaMLReadabl
   typeConverter=TypeConverters.toListString)
 caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do 
a case sensitive " +
   "comparison over the stop words", 
typeConverter=TypeConverters.toBoolean)
+locale = Param(Params._dummy(), "locale", "locale of the input. 
ignored when case sensitive " +
+   "is true", typeConverter=TypeConverters.toString)
 
 @keyword_only
-def __init__(self, inputCol=None, outputCol=None, stopWords=None, 
caseSensitive=False):
+def __init__(self, inputCol=None, outputCol=None, stopWords=None, 
caseSensitive=False,
+ locale=None):
 """
-__init__(self, inputCol=None, outputCol=None, stopWords=None, 
caseSensitive=false)
+__init__(self, inputCol=None, outputCol=None, stopWords=None, 
caseSensitive=false,
+locale=None)
 """
 super(StopWordsRemover, self).__init__()
 self._java_obj = 
self._new_java_obj("org.apache.spark.ml.feature.StopWordsRemover",
 self.uid)
 
self._setDefault(stopWords=StopWordsRemover.loadDefaultStopWords("english"),
- caseSensitive=False)
+ caseSensitive=False, locale=None)
--- End diff --

I think we just need to explain how we handle the default value in the doc.


---




[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...

2018-06-09 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21501#discussion_r194244784
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala ---
@@ -84,7 +86,28 @@ class StopWordsRemover @Since("1.5.0") (@Since("1.5.0") 
override val uid: String
   @Since("1.5.0")
   def getCaseSensitive: Boolean = $(caseSensitive)
 
-  setDefault(stopWords -> 
StopWordsRemover.loadDefaultStopWords("english"), caseSensitive -> false)
+  /**
+   * Locale of the input for case insensitive matching. Ignored when 
[[caseSensitive]]
+   * is true.
+   * Default: Locale.getDefault.toString
+   * @see `StopWordsRemover.loadDefaultStopWords()`
+   * @group param
+   */
+  @Since("2.4.0")
+  val locale: Param[String] = new Param[String](this, "locale",
+"Locale of the input for case insensitive matching. Ignored when 
caseSensitive is true.",
+
ParamValidators.inArray[String](Locale.getAvailableLocales.map(_.toString)))
+
+  /** @group setParam */
+  @Since("2.4.0")
+  def setLocale(value: String): this.type = set(locale, value)
--- End diff --

Not sure if it is necessary. If users already have a `Locale` instance, 
they can use `toString` to provide the value. It is simpler if setter/getter 
have consistent types.


---




[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...

2018-06-09 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21501#discussion_r194244754
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala ---
@@ -84,7 +86,28 @@ class StopWordsRemover @Since("1.5.0") (@Since("1.5.0") 
override val uid: String
   @Since("1.5.0")
   def getCaseSensitive: Boolean = $(caseSensitive)
 
-  setDefault(stopWords -> 
StopWordsRemover.loadDefaultStopWords("english"), caseSensitive -> false)
+  /**
+   * Locale of the input for case insensitive matching. Ignored when 
[[caseSensitive]]
+   * is true.
+   * Default: Locale.getDefault.toString
--- End diff --

Locale is usually configured at the system level. Setting the default locale to 
`en` is a breaking change for users who set the system locale to `tr` and have the 
following code:

~~~
val remover = new StopWordsRemover()
  .setStopWords(StopWordsRemover.loadDefaultStopWords("turkish"))
~~~
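
To illustrate why the locale matters for case-insensitive matching, here is a minimal pure-Python sketch of the Turkish dotted/dotless "I" problem. It does not use Spark; `lower_root` and `lower_turkish` are hypothetical helpers written only for this illustration, and the sample token is arbitrary rather than taken from any stop-word list.

```python
def lower_root(token):
    # Locale-independent lowercasing: "I" becomes "i".
    return token.lower()


def lower_turkish(token):
    # Hand-written approximation of Turkish lowercasing (an assumption for
    # this sketch): "I" maps to dotless "ı" and dotted "İ" maps to "i".
    return token.replace("I", "ı").replace("İ", "i").lower()


stop_words = {"ılık"}  # arbitrary sample entry, already lowercase
token = "ILIK"

matches_root = lower_root(token) in stop_words
matches_turkish = lower_turkish(token) in stop_words
```

Under these assumptions the token only matches when it is lowercased with Turkish rules, which is the kind of behavior change a different default locale could introduce.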


---




[GitHub] spark pull request #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API ...

2018-06-09 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21513#discussion_r194244595
  
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1157,213 @@ def getKeepLastCheckpoint(self):
 return self.getOrDefault(self.keepLastCheckpoint)
 
 
+@inherit_doc
+class PowerIterationClustering(HasMaxIter, HasWeightCol, JavaParams, 
JavaMLReadable,
+   JavaMLWritable):
+"""
+.. note:: Experimental
+
+Power Iteration Clustering (PIC), a scalable graph clustering 
algorithm developed by
+http://www.icml2010.org/papers/387.pdf>Lin and Cohen. From 
the abstract:
+PIC finds a very low-dimensional embedding of a dataset using 
truncated power
+iteration on a normalized pair-wise similarity matrix of the data.
+
+This class is not yet an Estimator/Transformer, use 
:py:func:`assignClusters` method
+to run the PowerIterationClustering algorithm.
+
+.. seealso:: `Wikipedia on Spectral clustering \
+`_
+
+   >>> data = [((long)(1), (long)(0), 0.5), \
+   ((long)(2), (long)(0), 0.5), \
+   ((long)(2), (long)(1), 0.7), \
+   ((long)(3), (long)(0), 0.5), \
+   ((long)(3), (long)(1), 0.7), \
+   ((long)(3), (long)(2), 0.9), \
+   ((long)(4), (long)(0), 0.5), \
+   ((long)(4), (long)(1), 0.7), \
+   ((long)(4), (long)(2), 0.9), \
+   ((long)(4), (long)(3), 1.1), \
+   ((long)(5), (long)(0), 0.5), \
+   ((long)(5), (long)(1), 0.7), \
+   ((long)(5), (long)(2), 0.9), \
+   ((long)(5), (long)(3), 1.1), \
+   ((long)(5), (long)(4), 1.3)]
+>>> df = spark.createDataFrame(data).toDF("src", "dst", "weight")
+>>> pic = PowerIterationClustering()
--- End diff --

If we only keep one example, we should use keyword args:

~~~python
pic = PowerIterationClustering(k=2, maxIter=40, weightCol="weight")
assignments = pic.assignClusters(df)
~~~
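
For readers unfamiliar with what the docstring above describes, the following is a rough pure-Python sketch of the truncated power iteration step on a row-normalized affinity matrix, using the same `(src, dst, weight)` triples as the doctest. It is not Spark's implementation: `pic_embedding` is a hypothetical helper, the iteration count is arbitrary, and the final clustering of the embedding is left out.

```python
def pic_embedding(edges, n, iterations=10):
    # Build a symmetric affinity matrix from (src, dst, weight) triples.
    affinity = [[0.0] * n for _ in range(n)]
    for src, dst, weight in edges:
        affinity[src][dst] = weight
        affinity[dst][src] = weight

    # Row-normalize so that every row sums to 1.
    degrees = [sum(row) for row in affinity]
    normalized = [[affinity[i][j] / degrees[i] for j in range(n)]
                  for i in range(n)]

    # Start from the degree vector scaled to unit L1 norm.
    total = sum(degrees)
    v = [d / total for d in degrees]

    # Truncated power iteration: multiply, then rescale to unit L1 norm.
    for _ in range(iterations):
        nxt = [sum(normalized[i][j] * v[j] for j in range(n))
               for i in range(n)]
        norm = sum(abs(x) for x in nxt)
        v = [x / norm for x in nxt]
    return v


edges = [(1, 0, 0.5),
         (2, 0, 0.5), (2, 1, 0.7),
         (3, 0, 0.5), (3, 1, 0.7), (3, 2, 0.9),
         (4, 0, 0.5), (4, 1, 0.7), (4, 2, 0.9), (4, 3, 1.1),
         (5, 0, 0.5), (5, 1, 0.7), (5, 2, 0.9), (5, 3, 1.1), (5, 4, 1.3)]

embedding = pic_embedding(edges, n=6)
```

The result is a one-dimensional embedding with one value per vertex; a clustering step would then group vertices by those values.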


---




[GitHub] spark pull request #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API ...

2018-06-09 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21513#discussion_r194244698
  
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1157,213 @@ def getKeepLastCheckpoint(self):
 return self.getOrDefault(self.keepLastCheckpoint)
 
 
+@inherit_doc
+class PowerIterationClustering(HasMaxIter, HasWeightCol, JavaParams, 
JavaMLReadable,
+   JavaMLWritable):
+"""
+.. note:: Experimental
+
+Power Iteration Clustering (PIC), a scalable graph clustering 
algorithm developed by
+http://www.icml2010.org/papers/387.pdf>Lin and Cohen. From 
the abstract:
+PIC finds a very low-dimensional embedding of a dataset using 
truncated power
+iteration on a normalized pair-wise similarity matrix of the data.
+
+This class is not yet an Estimator/Transformer, use 
:py:func:`assignClusters` method
+to run the PowerIterationClustering algorithm.
+
+.. seealso:: `Wikipedia on Spectral clustering \
+`_
+
+   >>> data = [((long)(1), (long)(0), 0.5), \
+   ((long)(2), (long)(0), 0.5), \
+   ((long)(2), (long)(1), 0.7), \
+   ((long)(3), (long)(0), 0.5), \
+   ((long)(3), (long)(1), 0.7), \
+   ((long)(3), (long)(2), 0.9), \
+   ((long)(4), (long)(0), 0.5), \
+   ((long)(4), (long)(1), 0.7), \
+   ((long)(4), (long)(2), 0.9), \
+   ((long)(4), (long)(3), 1.1), \
+   ((long)(5), (long)(0), 0.5), \
+   ((long)(5), (long)(1), 0.7), \
+   ((long)(5), (long)(2), 0.9), \
+   ((long)(5), (long)(3), 1.1), \
+   ((long)(5), (long)(4), 1.3)]
+>>> df = spark.createDataFrame(data).toDF("src", "dst", "weight")
+>>> pic = PowerIterationClustering()
+>>> assignments = 
pic.setK(2).setMaxIter(40).setWeightCol("weight").assignClusters(df)
+>>> assignments.sort(assignments.id).show(truncate=False)
++---+-------+
+|id |cluster|
++---+-------+
+|0  |1      |
+|1  |1      |
+|2  |1      |
+|3  |1      |
+|4  |1      |
+|5  |0      |
++---+-------+
+...
+>>> pic_path = temp_path + "/pic"
+>>> pic.save(pic_path)
+>>> pic2 = PowerIterationClustering.load(pic_path)
+>>> pic2.getK()
+2
+>>> pic2.getMaxIter()
+40
+>>> assignments2 = pic2.assignClusters(df)
--- End diff --

This and `pic3` seem unnecessary to me in a doctest. Doctests are 
mainly meant to provide examples, not a full suite of unit tests.


---




[GitHub] spark pull request #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API ...

2018-06-09 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21513#discussion_r194244592
  
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1157,213 @@ def getKeepLastCheckpoint(self):
 return self.getOrDefault(self.keepLastCheckpoint)
 
 
+@inherit_doc
+class PowerIterationClustering(HasMaxIter, HasWeightCol, JavaParams, 
JavaMLReadable,
+   JavaMLWritable):
+"""
+.. note:: Experimental
+
+Power Iteration Clustering (PIC), a scalable graph clustering 
algorithm developed by
+http://www.icml2010.org/papers/387.pdf>Lin and Cohen. From 
the abstract:
+PIC finds a very low-dimensional embedding of a dataset using 
truncated power
+iteration on a normalized pair-wise similarity matrix of the data.
+
+This class is not yet an Estimator/Transformer, use 
:py:func:`assignClusters` method
+to run the PowerIterationClustering algorithm.
+
+.. seealso:: `Wikipedia on Spectral clustering \
+`_
+
+   >>> data = [((long)(1), (long)(0), 0.5), \
--- End diff --

Users do not know that we make `long` an alias of `int` in PySpark + Python 
3.

I think in both Py2 and 3, PySpark infers Python int/long as long type in 
DataFrame. Could you help verify? If that is the case, we can drop 
`(long)(...)` here. If not, we can cast the columns to long type after creating 
the DataFrame.




---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91625/
Test PASSed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #91625 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91625/testReport)**
 for PR 21469 at commit 
[`62fc395`](https://github.com/apache/spark/commit/62fc395d29b67a827612e4e7f4bf134f5817a33c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL] Forbidding string interpolation in Co...

2018-06-09 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21520
  
Sorry, this change is quite large, but it can't be split into smaller pieces 
because it must be changed as a whole to pass compilation.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL] Forbidding string interpolation in Co...

2018-06-09 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21520
  
cc @cloud-fan @kiszk @hvanhovell 


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91624/
Test PASSed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21520
  
**[Test build #91624 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91624/testReport)**
 for PR 21520 at commit 
[`349d63e`](https://github.com/apache/spark/commit/349d63ee92f69a09500acb4b61532af8d591a6d2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21506: [SPARK-24485][SS] Measure and log elapsed time for files...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21506
  
**[Test build #91627 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91627/testReport)**
 for PR 21506 at commit 
[`d84f98f`](https://github.com/apache/spark/commit/d84f98fc978262f4165f78b3b223b8bb3151f735).


---




[GitHub] spark issue #21506: [SPARK-24485][SS] Measure and log elapsed time for files...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21506
  
**[Test build #91626 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91626/testReport)**
 for PR 21506 at commit 
[`d84f98f`](https://github.com/apache/spark/commit/d84f98fc978262f4165f78b3b223b8bb3151f735).


---




[GitHub] spark issue #21506: [SPARK-24485][SS] Measure and log elapsed time for files...

2018-06-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21506
  
ok to test


---




[GitHub] spark issue #21506: [SPARK-24485][SS] Measure and log elapsed time for files...

2018-06-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21506
  
add to whitelist


---




[GitHub] spark issue #21506: [SPARK-24485][SS] Measure and log elapsed time for files...

2018-06-09 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21506
  
cc. @tdas @jose-torres @jerryshao @arunmahadevan @HyukjinKwon 


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #91625 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91625/testReport)**
 for PR 21469 at commit 
[`62fc395`](https://github.com/apache/spark/commit/62fc395d29b67a827612e4e7f4bf134f5817a33c).


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21469
  
@jose-torres No problem. I expected the Spark community to be less active 
during Spark Summit. Addressed the comment regarding renaming.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3880/
Test PASSed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21520
  
**[Test build #91624 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91624/testReport)**
 for PR 21520 at commit 
[`349d63e`](https://github.com/apache/spark/commit/349d63ee92f69a09500acb4b61532af8d591a6d2).


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21520
  
retest this please.


---




[GitHub] spark issue #21521: [SPARK-23732][docs] Fix source links in generated scalad...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21521
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91622/
Test PASSed.


---




[GitHub] spark issue #21521: [SPARK-23732][docs] Fix source links in generated scalad...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21521
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21521: [SPARK-23732][docs] Fix source links in generated scalad...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21521
  
**[Test build #91622 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91622/testReport)**
 for PR 21521 at commit 
[`62301ce`](https://github.com/apache/spark/commit/62301ced0e736fdc445d830f5dadf377cb65cc6d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API for PIC

2018-06-09 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/21513
  
LGTM. Thanks! @mengxr Would you mind taking a look?


---




[GitHub] spark issue #21522: [SPARK-24467][ML] VectorAssemblerEstimator

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21522
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #21522: [SPARK-24467][ML] VectorAssemblerEstimator

2018-06-09 Thread tengpeng
GitHub user tengpeng opened a pull request:

https://github.com/apache/spark/pull/21522

[SPARK-24467][ML] VectorAssemblerEstimator

Background: See the JIRA ticket.

This PR is at a very early stage, and hopefully it will help us decide 
on the right direction.

## What changes were proposed in this pull request? 

1. Add an optional Param to VectorAssembler for specifying the sizes of 
Vectors in the inputCols. 
- If not given, VectorAssembler behaves as it does now. 
- If given, VectorAssembler uses that info instead of figuring out 
the Vector sizes via metadata or by examining Rows in the data, and it performs 
consistency checks.
2. Add a VectorAssemblerEstimator which gets the Vector lengths from data 
and produces a VectorAssembler_Model_ with the vector lengths Param specified.
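
The two steps above can be sketched in plain Python as follows. This is only an illustration of the idea, not the PR's code: `infer_sizes` and `assemble` are hypothetical names, rows are modeled as tuples of Python lists, and taking the sizes from the first row is an assumption made for brevity.

```python
def infer_sizes(rows):
    # "Estimator" step: learn each input vector's length from the data
    # (here simply from the first row).
    return [len(vector) for vector in rows[0]]


def assemble(rows, sizes=None):
    # "Assembler" step: concatenate the vectors in each row. When sizes are
    # given, check every row against them instead of inspecting the data.
    assembled = []
    for row in rows:
        if sizes is not None:
            actual = [len(vector) for vector in row]
            if actual != list(sizes):
                raise ValueError(
                    "expected vector sizes %s but found %s" % (list(sizes), actual))
        assembled.append([x for vector in row for x in vector])
    return assembled


rows = [([1.0, 2.0], [3.0]), ([4.0, 5.0], [6.0])]
sizes = infer_sizes(rows)
output = assemble(rows, sizes=sizes)
```

Passing `sizes=None` keeps the unchecked behavior, which corresponds to the "if not given" case above.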

Todos:
1. Reduce code duplication. Not sure if we want a trait that reduces 
duplication between `VectorAssembler` and `VectorAssemblerEstimator`, like 
`OneHotEncoderBase`.
2. Comments, documentation, etc.


## How was this patch tested?
Added unit tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tengpeng/spark Spark-24467

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21522.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21522


commit 8e3aa44c3937d60d5aa35dd03604e57ef218ebb4
Author: Teng Peng 
Date:   2018-06-09T12:48:30Z

Add a param to VectorAssembler for specifying the sizes of Vectors. Add a 
VectorAssemblerEstimator.




---




[GitHub] spark issue #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API for PIC

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21513
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91623/
Test PASSed.


---




[GitHub] spark issue #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API for PIC

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21513
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API for PIC

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21513
  
**[Test build #91623 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91623/testReport)**
 for PR 21513 at commit 
[`fcb9a51`](https://github.com/apache/spark/commit/fcb9a51e0cb07d0c77c98d93797b21a1b670aa41).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API for PIC

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21513
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API for PIC

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21513
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3879/
Test PASSed.


---




[GitHub] spark issue #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API for PIC

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21513
  
**[Test build #91623 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91623/testReport)**
 for PR 21513 at commit 
[`fcb9a51`](https://github.com/apache/spark/commit/fcb9a51e0cb07d0c77c98d93797b21a1b670aa41).


---




[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

2018-06-09 Thread pgandhi999
Github user pgandhi999 commented on the issue:

https://github.com/apache/spark/pull/21468
  
Thank you @HyukjinKwon 


---




[GitHub] spark pull request #21513: [SPARK-19826][ML][PYTHON]add spark.ml Python API ...

2018-06-09 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21513#discussion_r194237396
  
--- Diff: python/pyspark/ml/clustering.py ---
@@ -1156,6 +1157,204 @@ def getKeepLastCheckpoint(self):
         return self.getOrDefault(self.keepLastCheckpoint)
 
 
+@inherit_doc
+class PowerIterationClustering(HasMaxIter, HasWeightCol, JavaParams, JavaMLReadable,
+                               JavaMLWritable):
+    """
+    .. note:: Experimental
+
+    Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by
+    <a href=http://www.icml2010.org/papers/387.pdf>Lin and Cohen</a>. From the abstract:
+    PIC finds a very low-dimensional embedding of a dataset using truncated power
+    iteration on a normalized pair-wise similarity matrix of the data.
+
+    This class is not yet an Estimator/Transformer, use `assignClusters` method to run the
+    PowerIterationClustering algorithm.
+
+    .. seealso:: `Wikipedia on Spectral clustering \
+    `_
+
+    >>> from pyspark.sql.types import DoubleType, LongType, StructField, StructType
+    >>> import math
+    >>> def genCircle(r, n):
+    ...     points = []
+    ...     for i in range(0, n):
+    ...         theta = 2.0 * math.pi * i / n
+    ...         points.append((r * math.cos(theta), r * math.sin(theta)))
+    ...     return points
+    >>> def sim(x, y):
+    ...     dist = (x[0] - y[0]) * (x[0] - y[0]) + (x[1] - y[1]) * (x[1] - y[1])
+    ...     return math.exp(-dist / 2.0)
+    >>> r1 = 1.0
+    >>> n1 = 10
+    >>> r2 = 4.0
+    >>> n2 = 40
+    >>> n = n1 + n2
+    >>> points = genCircle(r1, n1) + genCircle(r2, n2)
+    >>> data = [(i, j, sim(points[i], points[j])) for i in range(1, n) for j in range(0, i)]
+    >>> rdd = sc.parallelize(data, 2)
+    >>> schema = StructType([StructField("src", LongType(), False), \
+                     StructField("dst", LongType(),  True), \
+                     StructField("weight", DoubleType(), True)])
+    >>> df = spark.createDataFrame(rdd, schema)
+    >>> pic = PowerIterationClustering()
+    >>> assignments = pic.setK(2).setMaxIter(40).setWeightCol("weight").assignClusters(df)
+    >>> result = sorted(assignments.collect(), key=lambda x: x.id)
+    >>> result[0].cluster == result[1].cluster == result[2].cluster == result[3].cluster
+    True
+    >>> result[4].cluster == result[5].cluster == result[6].cluster == result[7].cluster
+    True
+    >>> pic_path = temp_path + "/pic"
+    >>> pic.save(pic_path)
+    >>> pic2 = PowerIterationClustering.load(pic_path)
+    >>> pic2.getK()
+    2
+    >>> pic2.getMaxIter()
+    40
+    >>> assignments2 = pic2.assignClusters(df)
+    >>> result2 = sorted(assignments2.collect(), key=lambda x: x.id)
+    >>> result2[0].cluster == result2[1].cluster == result2[2].cluster == result2[3].cluster
+    True
+    >>> result2[4].cluster == result2[5].cluster == result2[6].cluster == result2[7].cluster
+    True
--- End diff --

Thanks for the comments. Will make changes. 


---




[GitHub] spark issue #21462: [SPARK-24428][K8S] Fix unused code

2018-06-09 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/21462
  
@felixcheung @mccheah pls review.


---




[GitHub] spark issue #21521: [SPARK-23732][docs] Fix source links in generated scalad...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21521
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3878/
Test PASSed.


---




[GitHub] spark issue #21521: [SPARK-23732][docs] Fix source links in generated scalad...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21521
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21521: [SPARK-23732][docs] Fix source links in generated scalad...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21521
  
**[Test build #91622 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91622/testReport)**
 for PR 21521 at commit 
[`62301ce`](https://github.com/apache/spark/commit/62301ced0e736fdc445d830f5dadf377cb65cc6d).


---




[GitHub] spark pull request #21521: [SPARK-23732][docs] Fix source links in generated...

2018-06-09 Thread vanzin
GitHub user vanzin opened a pull request:

https://github.com/apache/spark/pull/21521

[SPARK-23732][docs] Fix source links in generated scaladoc.

Apply the suggestion on the bug to fix source links. Tested with
the 2.3.1 release docs.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vanzin/spark SPARK-23732

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21521.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21521


commit 62301ced0e736fdc445d830f5dadf377cb65cc6d
Author: Marcelo Vanzin 
Date:   2018-06-09T17:59:39Z

[SPARK-23732][docs] Fix source links in generated scaladoc.

Apply the suggestion on the bug to fix source links. Tested with
the 2.3.1 release docs.




---




[GitHub] spark pull request #20909: [SPARK-23776][python][test] Check for needed comp...

2018-06-09 Thread bersprockets
Github user bersprockets closed the pull request at:

https://github.com/apache/spark/pull/20909


---




[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...

2018-06-09 Thread bersprockets
Github user bersprockets commented on the issue:

https://github.com/apache/spark/pull/20909
  
@HyukjinKwon This PR is mostly obsolete. I will close it and re-open 
something smaller... maybe a one-line documentation change to handle the 
missing UDF case for those who build with sbt.


---




[GitHub] spark issue #20901: [SPARK-23792][DOCS] Documentation improvements for datet...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20901
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20901: [SPARK-23792][DOCS] Documentation improvements for datet...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20901
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91620/
Test PASSed.


---




[GitHub] spark issue #20901: [SPARK-23792][DOCS] Documentation improvements for datet...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20901
  
**[Test build #91620 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91620/testReport)**
 for PR 20901 at commit 
[`2544d93`](https://github.com/apache/spark/commit/2544d930f3b8f3015f140bc51578834814d8cf22).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread jose-torres
Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/21469
  
(Sorry to comment after so long with such a minor change - I've been busy 
with Spark Summit)

metricProviderLoaderMapSize should be metricProviderLoaderMapSizeBytes, 
both for clarity and consistency with memoryUsedBytes. Other than that LGTM.


---




[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...

2018-06-09 Thread jose-torres
Github user jose-torres commented on a diff in the pull request:

https://github.com/apache/spark/pull/21469#discussion_r194232690
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryListenerSuite.scala ---
@@ -231,7 +231,7 @@ class StreamingQueryListenerSuite extends StreamTest with BeforeAndAfter {
   test("event ordering") {
     val listener = new EventCollector
     withListenerAdded(listener) {
-      for (i <- 1 to 100) {
+      for (i <- 1 to 50) {
--- End diff --

Makes sense, and I agree with the implicit claim that this slowdown isn't 
too worrying.


---




[GitHub] spark issue #21490: [SPARK-24462][SS] Initialize the offsets correctly when ...

2018-06-09 Thread jose-torres
Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/21490
  
(sorry, I've been busy with Spark Summit)

The problem I see is that fault tolerance might not be cleanly separable 
from query stop tolerance. If a user stops the query at the wrong time, will 
they see strange results which make them think Spark is doing something wrong?

If we can come up with a clean story for what users will see on restart, 
and it makes sense no matter when the query was stopped, I agree that this 
change or #20958 would be reasonable to add.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91621/
Test FAILed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21520
  
**[Test build #91621 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91621/testReport)**
 for PR 21520 at commit 
[`349d63e`](https://github.com/apache/spark/commit/349d63ee92f69a09500acb4b61532af8d591a6d2).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20909
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91612/
Test FAILed.


---




[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20909
  
Build finished. Test FAILed.


---




[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20909
  
**[Test build #91612 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91612/testReport)**
 for PR 20909 at commit 
[`db14acb`](https://github.com/apache/spark/commit/db14acbb3a90c9da184fc9c909640e07100c38fa).
 * This patch **fails from timeout after a configured wait of \`300m\`**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #20168: [SPARK-22730][ML] Add ImageSchema support for all OpenCv...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20168
  
Build finished. Test PASSed.


---




[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20838
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91607/
Test FAILed.


---




[GitHub] spark issue #20168: [SPARK-22730][ML] Add ImageSchema support for all OpenCv...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20168
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91608/
Test PASSed.


---




[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20838
  
Build finished. Test FAILed.


---




[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20838
  
**[Test build #91607 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91607/testReport)**
 for PR 20838 at commit 
[`16d60ce`](https://github.com/apache/spark/commit/16d60cef3de81c9fde33ec21e0dc39f75643f6a0).
 * This patch **fails from timeout after a configured wait of \`300m\`**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #20168: [SPARK-22730][ML] Add ImageSchema support for all OpenCv...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20168
  
**[Test build #91608 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91608/testReport)**
 for PR 20168 at commit 
[`5a632f5`](https://github.com/apache/spark/commit/5a632f5f60afd2e8c225703532d17ed9e56e47f7).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3877/
Test PASSed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21520
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3876/
Test PASSed.


---




[GitHub] spark issue #21520: [SPARK-24505][SQL][WIP] Forbidding string interpolation ...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21520
  
**[Test build #91621 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91621/testReport)**
 for PR 21520 at commit 
[`349d63e`](https://github.com/apache/spark/commit/349d63ee92f69a09500acb4b61532af8d591a6d2).


---




[GitHub] spark issue #21276: [SPARK-24216][SQL] Spark TypedAggregateExpression uses g...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21276
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21276: [SPARK-24216][SQL] Spark TypedAggregateExpression uses g...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21276
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91609/
Test PASSed.


---




[GitHub] spark issue #21497: [SPARK-24466][SS] Fix TextSocketMicroBatchReader to be c...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21497
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91618/
Test FAILed.


---




[GitHub] spark issue #21497: [SPARK-24466][SS] Fix TextSocketMicroBatchReader to be c...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21497
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21497: [SPARK-24466][SS] Fix TextSocketMicroBatchReader to be c...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21497
  
**[Test build #91618 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91618/testReport)**
 for PR 21497 at commit 
[`d069dd0`](https://github.com/apache/spark/commit/d069dd009bac833ac5f1a61bd9f911d1e021e15c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21276: [SPARK-24216][SQL] Spark TypedAggregateExpression uses g...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21276
  
**[Test build #91609 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91609/testReport)**
 for PR 21276 at commit 
[`3d067b8`](https://github.com/apache/spark/commit/3d067b883a947b0b3b3dd200c7767125001aefef).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20901: [SPARK-23792][DOCS] Documentation improvements for datet...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20901
  
Build finished. Test FAILed.


---




[GitHub] spark issue #20901: [SPARK-23792][DOCS] Documentation improvements for datet...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20901
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91606/
Test FAILed.


---




[GitHub] spark issue #20901: [SPARK-23792][DOCS] Documentation improvements for datet...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20901
  
**[Test build #91606 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91606/testReport)**
 for PR 20901 at commit 
[`6dea103`](https://github.com/apache/spark/commit/6dea103733e47e9aa57f331a78a662d3e20b9f77).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #20901: [SPARK-23792][DOCS] Documentation improvements for datet...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20901
  
**[Test build #91620 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91620/testReport)**
 for PR 20901 at commit 
[`2544d93`](https://github.com/apache/spark/commit/2544d930f3b8f3015f140bc51578834814d8cf22).


---




[GitHub] spark issue #21199: [SPARK-24127][SS] Continuous text socket source

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21199
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21199: [SPARK-24127][SS] Continuous text socket source

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21199
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91615/
Test PASSed.


---




[GitHub] spark issue #20901: [SPARK-23792][DOCS] Documentation improvements for datet...

2018-06-09 Thread abradbury
Github user abradbury commented on the issue:

https://github.com/apache/spark/pull/20901
  
@HyukjinKwon, merge conflicts have now been resolved


---




[GitHub] spark issue #21199: [SPARK-24127][SS] Continuous text socket source

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21199
  
**[Test build #91615 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91615/testReport)**
 for PR 21199 at commit 
[`68c5eed`](https://github.com/apache/spark/commit/68c5eed7c6d32db8a7a469a1970e6fa731a91c70).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21482
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21482
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91601/
Test PASSed.


---




[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21482
  
**[Test build #91601 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91601/testReport)**
 for PR 21482 at commit 
[`559900a`](https://github.com/apache/spark/commit/559900a1ffeb926bf06a2f015f5316d4fac862cf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18555
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91602/
Test FAILed.


---




[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18555
  
Build finished. Test FAILed.


---




[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18555
  
**[Test build #91602 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91602/testReport)**
 for PR 18555 at commit 
[`a0efb41`](https://github.com/apache/spark/commit/a0efb41fe041520bf423933b3c7b18ad2feb6711).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91596/
Test PASSed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #91596 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91596/testReport)**
 for PR 21469 at commit 
[`3c80cad`](https://github.com/apache/spark/commit/3c80cad32c056a24a7f5ffd7ab0ae3f7e096a62d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21222
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21222
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91610/
Test PASSed.


---




[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-06-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21222
  
**[Test build #91610 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91610/testReport)**
 for PR 21222 at commit 
[`b9bad1a`](https://github.com/apache/spark/commit/b9bad1ae6618606e91df35f854025bc32c8178de).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---



