[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23052
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23034: [SPARK-26035][PYTHON] Break large streaming/tests.py fil...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23034
  
Thank you @BryanCutler.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of primitive typ...

2018-11-15 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/23054

[SPARK-26085][SQL] Key attribute of primitive type under typed aggregation 
should be named as "key" too

## What changes were proposed in this pull request?

When doing typed aggregation on a Dataset, the key attribute is named "key" for 
complex key types, but for primitive key types it is named "value". The key 
attribute should be named "key" for primitive types as well.

## How was this patch tested?

Added test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-26085

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23054.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23054


commit c7bbe91519aec116ae2c2f449f518f59cc49c7c0
Author: Liang-Chi Hsieh 
Date:   2018-11-16T01:52:12Z

Named key attribute for primitive type as "key".




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23054
  
cc @cloud-fan 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23038: [SPARK-25451][CORE][WEBUI]Aggregated metrics table doesn...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23038
  
**[Test build #98894 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98894/testReport)**
 for PR 23038 at commit 
[`805ebb8`](https://github.com/apache/spark/commit/805ebb8e6b103cbc0688da64ec27841a1491039f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23031: [SPARK-26060][SQL] Track SparkConf entries and make SET ...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23031
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5066/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23031: [SPARK-26060][SQL] Track SparkConf entries and make SET ...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23031
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23026: [SPARK-25960][k8s] Support subpath mounting with Kuberne...

2018-11-15 Thread shaneknapp
Github user shaneknapp commented on the issue:

https://github.com/apache/spark/pull/23026
  
> > if such a list exists it should be the same list that triggers regular 
tests.
> 
> I defer that to @shaneknapp

no, @vanzin is right.  i'll update that tomorrow.

@vanzin for historical knowledge:  once i get spark ported to ubuntu 
(literally down to one or two troublesome builds!  such closeness!), the k8s 
prb will be merged into the regular spark prb.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/te...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23056#discussion_r234080468
  
--- Diff: python/pyspark/mllib/tests/test_linalg.py ---
@@ -0,0 +1,642 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import sys
+import array as pyarray
+
+from numpy import array, array_equal, zeros, arange, tile, ones, inf
+from numpy import sum as array_sum
+
+if sys.version_info[:2] <= (2, 6):
+    try:
+        import unittest2 as unittest
+    except ImportError:
+        sys.stderr.write('Please install unittest2 to test with Python 2.6 or earlier')
+        sys.exit(1)
+else:
+    import unittest
+
+import pyspark.ml.linalg as newlinalg
+from pyspark.mllib.linalg import Vector, SparseVector, DenseVector, VectorUDT, _convert_to_vector, \
+    DenseMatrix, SparseMatrix, Vectors, Matrices, MatrixUDT
+from pyspark.mllib.regression import LabeledPoint
+from pyspark.testing.mllibutils import make_serializer, MLlibTestCase
+
+_have_scipy = False
+try:
+    import scipy.sparse
+    _have_scipy = True
+except:
+    # No SciPy, but that's okay, we'll skip those tests
+    pass
+
+
+ser = make_serializer()
+
+
+def _squared_distance(a, b):
+    if isinstance(a, Vector):
+        return a.squared_distance(b)
+    else:
+        return b.squared_distance(a)
+
+
+class VectorTests(MLlibTestCase):
+
+    def _test_serialize(self, v):
+        self.assertEqual(v, ser.loads(ser.dumps(v)))
+        jvec = self.sc._jvm.org.apache.spark.mllib.api.python.SerDe.loads(bytearray(ser.dumps(v)))
+        nv = ser.loads(bytes(self.sc._jvm.org.apache.spark.mllib.api.python.SerDe.dumps(jvec)))
+        self.assertEqual(v, nv)
+        vs = [v] * 100
+        jvecs = self.sc._jvm.org.apache.spark.mllib.api.python.SerDe.loads(bytearray(ser.dumps(vs)))
+        nvs = ser.loads(bytes(self.sc._jvm.org.apache.spark.mllib.api.python.SerDe.dumps(jvecs)))
+        self.assertEqual(vs, nvs)
+
+    def test_serialize(self):
+        self._test_serialize(DenseVector(range(10)))
+        self._test_serialize(DenseVector(array([1., 2., 3., 4.])))
+        self._test_serialize(DenseVector(pyarray.array('d', range(10))))
+        self._test_serialize(SparseVector(4, {1: 1, 3: 2}))
+        self._test_serialize(SparseVector(3, {}))
+        self._test_serialize(DenseMatrix(2, 3, range(6)))
+        sm1 = SparseMatrix(
+            3, 4, [0, 2, 2, 4, 4], [1, 2, 1, 2], [1.0, 2.0, 4.0, 5.0])
+        self._test_serialize(sm1)
+
+    def test_dot(self):
+        sv = SparseVector(4, {1: 1, 3: 2})
+        dv = DenseVector(array([1., 2., 3., 4.]))
+        lst = DenseVector([1, 2, 3, 4])
+        mat = array([[1., 2., 3., 4.],
+                     [1., 2., 3., 4.],
+                     [1., 2., 3., 4.],
+                     [1., 2., 3., 4.]])
+        arr = pyarray.array('d', [0, 1, 2, 3])
+        self.assertEqual(10.0, sv.dot(dv))
+        self.assertTrue(array_equal(array([3., 6., 9., 12.]), sv.dot(mat)))
+        self.assertEqual(30.0, dv.dot(dv))
+        self.assertTrue(array_equal(array([10., 20., 30., 40.]), dv.dot(mat)))
+        self.assertEqual(30.0, lst.dot(dv))
+        self.assertTrue(array_equal(array([10., 20., 30., 40.]), lst.dot(mat)))
+        self.assertEqual(7.0, sv.dot(arr))
+
+    def test_squared_distance(self):
+        sv = SparseVector(4, {1: 1, 3: 2})
+        dv = DenseVector(array([1., 2., 3., 4.]))
+        lst = DenseVector([4, 3, 2, 1])
+        lst1 = [4, 3, 2, 1]
+        arr = pyarray.array('d', [0, 2, 1, 3])
+        narr = array([0, 2, 1, 3])
+        self.assertEqual(15.0, _squared_distance(sv, dv))
+        self.assertEqual(25.0, _squared_distance(sv, lst))
+        self.assertEqual(20.0, _squared_distance(dv, lst))
+        self.assertEqual(15.0, _squared_distance(dv, sv))

[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23056
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-15 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r234080578
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true)
   // each python worker gets an equal part of the allocation. the worker pool will grow to the
   // number of concurrent tasks, which is determined by the number of cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

I don't think this is necessary. If `resource` can't be imported for any 
reason, then memory will not be limited in python. But the JVM side shouldn't 
be what determines whether that happens. The JVM should do everything the same 
way -- even requesting memory from schedulers like YARN because that space 
should still be allocated as python memory, even if python can't self-limit.
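
For context, the memory being discussed is just the `spark.executor.pyspark.memory` configuration named in this PR; a minimal PySpark sketch of setting it (the sizes are arbitrary, not from this PR):

```
from pyspark import SparkConf

# Illustrative sizes only: as the comment above argues, this pyspark memory is
# meant to be counted toward the executor's container request on schedulers
# like YARN, independent of whether Python can self-limit with 'resource'.
conf = (SparkConf()
        .set("spark.executor.memory", "4g")
        .set("spark.executor.pyspark.memory", "1g"))
```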


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23056
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5068/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/te...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23056#discussion_r234080249
  
--- Diff: python/pyspark/testing/mllibutils.py ---
@@ -0,0 +1,44 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import sys
+
+if sys.version_info[:2] <= (2, 6):
+    try:
+        import unittest2 as unittest
+    except ImportError:
+        sys.stderr.write('Please install unittest2 to test with Python 2.6 or earlier')
+        sys.exit(1)
+else:
+    import unittest
--- End diff --

@BryanCutler, actually I removed this because we dropped 2.6 support while 
we are here. I'm pretty sure we can just import unittest.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r234081475
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true)
   // each python worker gets an equal part of the allocation. the worker pool will grow to the
   // number of concurrent tasks, which is determined by the number of cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

I see. I think the point of view is a bit different. What I was trying to do is this:
we declare this configuration as unsupported on Windows, meaning we disable it on 
Windows from the start, on the JVM side - because it's the JVM that launches the 
Python workers. So I was trying to leave the control to the JVM.

> It seems brittle to disable this on the JVM side and rely on it here. Can 
we also set a flag in the ImportError case and also check that here?

However, in a way, it's a bit odd to call this brittle because we're already 
relying on that.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/23049
  
Hi @vanzin ,
thanks for pointing it out! I have updated the script and PR description.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.en...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23046#discussion_r234088968
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala
 ---
@@ -280,7 +280,7 @@ object ShuffleExchangeExec {
   }
   // The comparator for comparing row hashcode, which should always be Integer.
   val prefixComparator = PrefixComparators.LONG
-  val canUseRadixSort = SparkEnv.get.conf.get(SQLConf.RADIX_SORT_ENABLED)
+  val canUseRadixSort = SQLConf.get.enableRadixSort
--- End diff --

It's a small bug fix, so no need to backport to all the branches. I think 
2.4 is good enough


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23045
  
**[Test build #98901 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98901/testReport)**
 for PR 23045 at commit 
[`574308e`](https://github.com/apache/spark/commit/574308e8f4c23f9549c647178709c7c85d4d2fc7).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23034: [SPARK-26035][PYTHON] Break large streaming/tests.py fil...

2018-11-15 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/23034
  
> Also, @BryanCutler, I think we can talk about the locations of 
testing/...util.py later when we finish splitting the tests. Moving the utils 
would probably cause fewer conflicts and should be good enough to discuss 
separately if that's a worry, and should be changed.

Sounds good, working on MLlib right now. Hopefully I'll have a PR up soon.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23052: [SPARK-26081][SQL] Prevent empty files for empty ...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23052#discussion_r234062564
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala
 ---
@@ -174,13 +174,18 @@ private[csv] class CsvOutputWriter(
 context: TaskAttemptContext,
 params: CSVOptions) extends OutputWriter with Logging {
 
-  private val charset = Charset.forName(params.charset)
+  private var univocityGenerator: Option[UnivocityGenerator] = None
 
-  private val writer = CodecStreams.createOutputStreamWriter(context, new Path(path), charset)
-
-  private val gen = new UnivocityGenerator(dataSchema, writer, params)
+  override def write(row: InternalRow): Unit = {
+    val gen = univocityGenerator.getOrElse {
--- End diff --

Also, one thing we should not forget is that CSV output _could_ have headers 
even if the records are empty.
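
A small PySpark illustration of that case (the output path is a placeholder):

```
from pyspark.sql import SparkSession
from pyspark.sql.types import StructField, StructType, IntegerType, StringType

spark = SparkSession.builder.master("local[2]").appName("empty-csv-header").getOrCreate()

schema = StructType([
    StructField("id", IntegerType()),
    StructField("name", StringType()),
])
# Every partition of this DataFrame is empty, but the schema still carries column names.
empty_df = spark.createDataFrame([], schema)

# With header enabled there is still something meaningful to write (the header row)
# even though there are no records - the case the comment above warns about.
empty_df.write.mode("overwrite").option("header", True).csv("/tmp/empty_csv_out")
```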


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20303: [SPARK-23128][SQL] A new approach to do adaptive executi...

2018-11-15 Thread carsonwang
Github user carsonwang commented on the issue:

https://github.com/apache/spark/pull/20303
  
@cloud-fan @gatorsmile, are you ready to start reviewing this? I can bring 
this up to date.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23038: [SPARK-25451][CORE][WEBUI]Aggregated metrics tabl...

2018-11-15 Thread shahidki31
Github user shahidki31 commented on a diff in the pull request:

https://github.com/apache/spark/pull/23038#discussion_r234072070
  
--- Diff: core/src/main/scala/org/apache/spark/status/api/v1/api.scala ---
@@ -63,6 +63,7 @@ case class ApplicationAttemptInfo private[spark](
 
 class ExecutorStageSummary private[spark](
 val taskTime : Long,
+val activeTasks: Int,
--- End diff --

Hi @vanzin, I have modified it based on your comment. Kindly review.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23038: [SPARK-25451][CORE][WEBUI]Aggregated metrics table doesn...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23038
  
**[Test build #98893 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98893/testReport)**
 for PR 23038 at commit 
[`0d92185`](https://github.com/apache/spark/commit/0d921852045fdca3a528fa807fbd229076b52746).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.en...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23046#discussion_r234073703
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala
 ---
@@ -280,7 +280,7 @@ object ShuffleExchangeExec {
   }
   // The comparator for comparing row hashcode, which should always be Integer.
   val prefixComparator = PrefixComparators.LONG
-  val canUseRadixSort = SparkEnv.get.conf.get(SQLConf.RADIX_SORT_ENABLED)
+  val canUseRadixSort = SQLConf.get.enableRadixSort
--- End diff --

Yes .. I don't mind it but was just thinking that we don't necessarily need to 
backport to all the branches if there's any concern. I will leave it to you 
guys as well.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23026: [SPARK-25960][k8s] Support subpath mounting with Kuberne...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23026
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5062/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23026: [SPARK-25960][k8s] Support subpath mounting with Kuberne...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23026
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/te...

2018-11-15 Thread BryanCutler
GitHub user BryanCutler opened a pull request:

https://github.com/apache/spark/pull/23056

[SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py file into smaller 
files

## What changes were proposed in this pull request?

This PR breaks down the large mllib/tests.py file that contains all Python 
MLlib unit tests into several smaller test files so they are easier to read and 
maintain.

The tests are broken down as follows:
```
pyspark
├── __init__.py
...
├── mllib
│   ├── __init__.py
...
│   ├── tests
│   │   ├── __init__.py
│   │   ├── test_algorithms.py
│   │   ├── test_feature.py
│   │   ├── test_linalg.py
│   │   ├── test_stat.py
│   │   ├── test_streaming_algorithms.py
│   │   └── test_util.py
...
├── testing
...
│   ├── mllibutils.py
...
```

## How was this patch tested?

Ran tests manually by module to ensure the test count was the same, and ran 
`python/run-tests --modules=pyspark-mllib` to verify that all tests pass with 
Python 2.7 and Python 3.6. Also installed scipy to include the optional tests in 
test_linalg.
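
For reviewers who want to poke at a single split module, a minimal sketch (assuming a PySpark dev environment where `pyspark` and its test packages are importable; this is not the command used by `run-tests`):

```
import unittest

# One of the split modules from the tree above.
from pyspark.mllib.tests import test_linalg

suite = unittest.defaultTestLoader.loadTestsFromModule(test_linalg)
unittest.TextTestRunner(verbosity=2).run(suite)
```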

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/BryanCutler/spark 
python-test-breakup-mllib-SPARK-26034

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23056.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23056


commit 2759521df7f2dffc9ddb9379e0b1dac6721da366
Author: Bryan Cutler 
Date:   2018-11-16T03:01:22Z

separated mllib tests




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/23056
  
Dist by line count:
```
313 ./test_algorithms.py
201 ./test_feature.py
642 ./test_linalg.py
197 ./test_stat.py
523 ./test_streaming_algorithms.py
115 ./test_util.py
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23037: [SPARK-26083][k8s] Add Copy pyspark into corresponding d...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23037
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5063/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23037: [SPARK-26083][k8s] Add Copy pyspark into corresponding d...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23037
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pyspark.me...

2018-11-15 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/23055
  
Thanks for fixing this so quickly, @HyukjinKwon! I'd like a couple of 
changes, but overall it is going in the right direction.

We should also plan on porting this to the 2.4 branch when it is committed 
since it is a regression.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23056
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23056
  
**[Test build #98898 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98898/testReport)**
 for PR 23056 at commit 
[`2759521`](https://github.com/apache/spark/commit/2759521df7f2dffc9ddb9379e0b1dac6721da366).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22309
  
Adding @liancheng BTW. IIRC, he took a look at this one before and 
abandoned the change (correct me if I'm remembering this wrongly).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23044: [SPARK-26073][SQL][FOLLOW-UP] remove invalid comment as ...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23044
  
thanks, merging to master!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/te...

2018-11-15 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request:

https://github.com/apache/spark/pull/23056#discussion_r234093063
  
--- Diff: python/pyspark/testing/mllibutils.py ---
@@ -0,0 +1,44 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import sys
+
+if sys.version_info[:2] <= (2, 6):
+    try:
+        import unittest2 as unittest
+    except ImportError:
+        sys.stderr.write('Please install unittest2 to test with Python 2.6 or earlier')
+        sys.exit(1)
+else:
+    import unittest
--- End diff --

Yeah, I wondered about that but thought it might be better to do it in a 
follow-up.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23054
  
**[Test build #98891 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98891/testReport)**
 for PR 23054 at commit 
[`c7bbe91`](https://github.com/apache/spark/commit/c7bbe91519aec116ae2c2f449f518f59cc49c7c0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23055: [SPARK-26080][SQL] Disable 'spark.executor.pyspark.memor...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23055
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23055: [SPARK-26080][SQL] Disable 'spark.executor.pyspark.memor...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23055
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5065/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-15 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r234071565
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2813,6 +2819,11 @@ class Dataset[T] private[sql](
   * When no explicit sort order is specified, "ascending nulls first" is assumed.
   * Note, the rows are not sorted in each partition of the resulting Dataset.
   *
+   * [SPARK-26024] Note that due to performance reasons this method uses sampling to
+   * estimate the ranges. Hence, the output may not be consistent, since sampling can return
+   * different values. The sample size can be controlled by setting the value of the parameter
+   * {{spark.sql.execution.rangeExchange.sampleSizePerPartition}}.
--- End diff --

``` `spark.sql.execution.rangeExchange.sampleSizePerPartition` ```.
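
For readers of this thread, a small PySpark illustration of the behavior the new doc text describes (the numbers are arbitrary):

```
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[4]").appName("range-partition-sampling").getOrCreate()

# Smaller sample sizes make the estimated range boundaries cheaper but coarser;
# the value here is arbitrary, chosen only to show where the knob lives.
spark.conf.set("spark.sql.execution.rangeExchange.sampleSizePerPartition", 200)

df = spark.range(0, 1000000, numPartitions=8)
# Boundaries are estimated by sampling, so exact partition contents can vary between runs.
ranged = df.repartitionByRange(4, "id")
print(ranged.rdd.getNumPartitions())  # 4
```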


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22598: [SPARK-25501][SS] Add kafka delegation token support.

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22598
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22598: [SPARK-25501][SS] Add kafka delegation token support.

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22598
  
**[Test build #98890 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98890/testReport)**
 for PR 22598 at commit 
[`2a0cdb7`](https://github.com/apache/spark/commit/2a0cdb7f397abdc8ce411e2f5c08cf8029676e90).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22598: [SPARK-25501][SS] Add kafka delegation token support.

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22598
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98890/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.en...

2018-11-15 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/23046#discussion_r234073072
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala
 ---
@@ -280,7 +280,7 @@ object ShuffleExchangeExec {
   }
   // The comparator for comparing row hashcode, which should always be Integer.
   val prefixComparator = PrefixComparators.LONG
-  val canUseRadixSort = SparkEnv.get.conf.get(SQLConf.RADIX_SORT_ENABLED)
+  val canUseRadixSort = SQLConf.get.enableRadixSort
--- End diff --

Ah, yes, to be exact, if users specified the config in `SparkConf` before 
Spark ran, it could be read.
I'd leave the decision about which branches we should backport to up to you and 
the other reviewers. @jiangxb1987 @cloud-fan


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23031: [SPARK-26060][SQL] Track SparkConf entries and make SET ...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23031
  
**[Test build #98896 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98896/testReport)**
 for PR 23031 at commit 
[`336a331`](https://github.com/apache/spark/commit/336a331fdc817566c7fd09e5b36d5de24379c5b6).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23041: [SPARK-26069][TESTS]Fix flaky test: RpcIntegrationSuite....

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23041
  
**[Test build #4427 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4427/testReport)**
 for PR 23041 at commit 
[`6bebcb5`](https://github.com/apache/spark/commit/6bebcb5e004ed4b434c550d26ed1a922d13e0446).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23049
  
**[Test build #98899 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98899/testReport)**
 for PR 23049 at commit 
[`daf5e33`](https://github.com/apache/spark/commit/daf5e33f14f28fa28e85a703fbd3acc08075fd1b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23049
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23049
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5070/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

2018-11-15 Thread mt40
Github user mt40 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22309#discussion_r234085471
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala 
---
@@ -373,6 +383,32 @@ object ScalaReflection extends ScalaReflection {
   dataType = ObjectType(udt.getClass))
 Invoke(obj, "deserialize", ObjectType(udt.userClass), path :: Nil)
 
+  case t if isValueClass(t) =>
+    val (_, underlyingType) = getUnderlyingParameterOf(t)
+    val underlyingClsName = getClassNameFromType(underlyingType)
+    val clsName = getUnerasedClassNameFromType(t)
+    val newTypePath = s"""- Scala value class: $clsName($underlyingClsName)""" +:
+      walkedTypePath
+
+    // Nested value class is treated as its underlying type
+    // because the compiler will convert value class in the schema to
+    // its underlying type.
+    // However, for value class that is top-level or array element,
+    // if it is used as another type (e.g. as its parent trait or generic),
+    // the compiler keeps the class so we must provide an instance of the
+    // class too. In other cases, the compiler will handle wrapping/unwrapping
+    // for us automatically.
+    val arg = deserializerFor(underlyingType, path, newTypePath, Some(t))
+    val isCollectionElement = lastType.exists { lt =>
+      lt <:< localTypeOf[Array[_]] || lt <:< localTypeOf[Seq[_]]
--- End diff --

I added the support for Map


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.en...

2018-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23046


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23054
  
Makes sense to me. This is a behavior change, right? Shall we write a 
migration guide?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5064/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23054
  
**[Test build #98891 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98891/testReport)**
 for PR 23054 at commit 
[`c7bbe91`](https://github.com/apache/spark/commit/c7bbe91519aec116ae2c2f449f518f59cc49c7c0).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23056
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23056
  
**[Test build #98897 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98897/testReport)**
 for PR 23056 at commit 
[`2759521`](https://github.com/apache/spark/commit/2759521df7f2dffc9ddb9379e0b1dac6721da366).
 * This patch **fails build dependency tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23056
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98897/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-15 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r234080290
  
--- Diff: python/pyspark/worker.py ---
@@ -268,9 +272,11 @@ def main(infile, outfile):
 
 # set up memory limits
 memory_limit_mb = int(os.environ.get('PYSPARK_EXECUTOR_MEMORY_MB', "-1"))
-total_memory = resource.RLIMIT_AS
-try:
-    if memory_limit_mb > 0:
+# 'PYSPARK_EXECUTOR_MEMORY_MB' should be undefined on Windows because it depends on
+# resource package which is a Unix specific package.
+if memory_limit_mb > 0:
--- End diff --

It seems brittle to disable this on the JVM side and rely on it here. Can 
we also set a flag in the ImportError case and also check that here?
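
A minimal sketch of the flag approach suggested here, with illustrative names (`has_resource_module`, `set_worker_memory_limit`) that are not from the actual patch:

```
import os

# Remember whether the Unix-only 'resource' module imported, and check that flag
# next to the memory setting instead of trusting the JVM side alone.
try:
    import resource
    has_resource_module = True
except ImportError:
    has_resource_module = False


def set_worker_memory_limit():
    memory_limit_mb = int(os.environ.get('PYSPARK_EXECUTOR_MEMORY_MB', "-1"))
    if memory_limit_mb > 0 and has_resource_module:
        try:
            (soft_limit, hard_limit) = resource.getrlimit(resource.RLIMIT_AS)
            limit_bytes = memory_limit_mb * 1024 * 1024
            # Only tighten the address-space limit, never loosen it.
            if soft_limit == resource.RLIM_INFINITY or limit_bytes < soft_limit:
                resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard_limit))
        except (ValueError, OSError):
            pass  # some platforms reject RLIMIT_AS; fall back to no limit
```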


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23049
  
**[Test build #98900 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98900/testReport)**
 for PR 23049 at commit 
[`3269862`](https://github.com/apache/spark/commit/3269862c0b80bb7c546e9d45fd5fd4aa17aa1c7e).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r234086569
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true)
   // each python worker gets an equal part of the allocation. the worker pool will grow to the
   // number of concurrent tasks, which is determined by the number of cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

> JVM could set the request

This is handled on the JVM side so it wouldn't break. The `worker` itself is 
strongly coupled to the JVM.

You mean the case where the client is on a Windows machine and it uses a 
Unix-based cluster, right? I think that is what the fix already does - the 
`PythonRunner`s are created on the executor side, so it isn't affected 
when the client is on Windows.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23042: [SPARK-26070][SQL] add rule for implicit type coe...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23042#discussion_r234091858
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
 ---
@@ -138,6 +138,11 @@ object TypeCoercion {
 case (DateType, TimestampType)
   => if (conf.compareDateTimestampInTimestamp) Some(TimestampType) else Some(StringType)
 
+// to support a popular use case of tables using Decimal(X, 0) for long IDs instead of strings
+// see SPARK-26070 for more details
+case (n: DecimalType, s: StringType) if n.scale == 0 => Some(DecimalType(n.precision, n.scale))
--- End diff --

CC @gatorsmile @mgaido91 I think it's time to look at the SQL standard and 
other mainstream databases, and see how we should update the type coercion 
rules with safe mode. What do you think?
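
As a concrete illustration of the use case named in the added comment, a small PySpark sketch (the values are made up, and the observed result depends on which coercion rules are in effect):

```
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[2]").appName("decimal-string-coercion").getOrCreate()

# Tables that keep long IDs in DECIMAL(X, 0) columns are the popular use case
# mentioned above; the interesting part is how the comparison below is coerced.
spark.sql("SELECT CAST(1234567890123456789 AS DECIMAL(19, 0)) AS id").createOrReplaceTempView("t")

# Whether this predicate is evaluated as a decimal comparison (what the proposed
# rule does) or via some other implicit cast depends on the coercion rules in use.
spark.sql("SELECT * FROM t WHERE id = '1234567890123456789'").show()
```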


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23044: [SPARK-26073][SQL][FOLLOW-UP] remove invalid comm...

2018-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23044


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23054
  
Ok. Let me update the migration guide.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98891/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23045
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5071/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23054
  
**[Test build #98902 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98902/testReport)**
 for PR 23054 at commit 
[`42e32ad`](https://github.com/apache/spark/commit/42e32adda2da3717161fe5f8aa40febc1f32465e).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.en...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23046#discussion_r234063905
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala
 ---
@@ -280,7 +280,7 @@ object ShuffleExchangeExec {
   }
   // The comparator for comparing row hashcode, which should always be Integer.
   val prefixComparator = PrefixComparators.LONG
-  val canUseRadixSort = SparkEnv.get.conf.get(SQLConf.RADIX_SORT_ENABLED)
+  val canUseRadixSort = SQLConf.get.enableRadixSort
--- End diff --

@ueshin, BTW, for clarification, it does read the configuration, but it does 
not respect it when the configuration is set on the session, right? I think we 
don't need to backport this to all the other branches.
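
As an illustration of the distinction being discussed (a setting supplied via `SparkConf` at startup vs. one changed on the running session), a small PySpark sketch; the config key is the one from the diff above:

```
from pyspark import SparkConf
from pyspark.sql import SparkSession

# Way 1: supply the flag through SparkConf before the application starts - the
# only path a read through SparkEnv.get.conf could ever observe.
conf = SparkConf().set("spark.sql.sort.enableRadixSort", "false")
spark = SparkSession.builder.config(conf=conf).master("local[2]").getOrCreate()

# Way 2: change it on the running session (what SQL's SET does); reads that go
# through SQLConf.get pick this up, which is the point of the fix being discussed.
spark.conf.set("spark.sql.sort.enableRadixSort", "true")
print(spark.conf.get("spark.sql.sort.enableRadixSort"))
```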


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23037: [SPARK-26083][k8s] Add Copy pyspark into corresponding d...

2018-11-15 Thread AzureQ
Github user AzureQ commented on the issue:

https://github.com/apache/spark/pull/23037
  
> > This is fine, but please file a bug.
> 
> Okay, as such, @AzureQ could you add an integration test to 
`ClientModeTestsSuite`

Sure


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-15 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r234071213
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2813,6 +2819,11 @@ class Dataset[T] private[sql](
   * When no explicit sort order is specified, "ascending nulls first" is assumed.
   * Note, the rows are not sorted in each partition of the resulting Dataset.
   *
+   * [SPARK-26024] Note that due to performance reasons this method uses sampling to
--- End diff --

We can drop `[SPARK-26024]` here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][SQL] Disable 'spark.executor.pyspar...

2018-11-15 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/23055

[SPARK-26080][SQL] Disable 'spark.executor.pyspark.memory' always on Windows

## What changes were proposed in this pull request?

The `resource` package is a Unix-specific package. See 
https://docs.python.org/2/library/resource.html and 
https://docs.python.org/3/library/resource.html.
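
For illustration only (not part of the patch), the platform difference is easy to check:

```
# On Windows this import raises ImportError; on Unix-like systems it succeeds,
# which is why the Python-side memory limiting cannot work there as-is.
try:
    import resource
    print("resource available, RLIMIT_AS =", resource.getrlimit(resource.RLIMIT_AS))
except ImportError:
    print("the 'resource' module is not available on this platform")
```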

Note that we document Windows support:

> Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS). 

This should be backported into branch-2.4 to restore Windows support in 
Spark 2.4.1.

## How was this patch tested?

Manually, by mocking the changed logic.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-26080

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23055.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23055


commit 2d3315a7dab429abc4d9ef5ed7f8f5484e8421f1
Author: hyukjinkwon 
Date:   2018-11-16T01:46:31Z

Disable 'spark.executor.pyspark.memory' on Windows always




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23055: [SPARK-26080][SQL] Disable 'spark.executor.pyspark.memor...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23055
  
**[Test build #98892 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98892/testReport)**
 for PR 23055 at commit 
[`2d3315a`](https://github.com/apache/spark/commit/2d3315a7dab429abc4d9ef5ed7f8f5484e8421f1).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23055: [SPARK-26080][SQL] Disable 'spark.executor.pyspark.memor...

2018-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23055
  
cc @rdblue, @vanzin and @haydenjeune


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23038: [SPARK-25451][CORE][WEBUI]Aggregated metrics table doesn...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23038
  
**[Test build #98895 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98895/testReport)**
 for PR 23038 at commit 
[`7c3a80b`](https://github.com/apache/spark/commit/7c3a80bce0a45131091ce11e80a939e9de6ebf50).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23056
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23056
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5067/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/23056
  
cc @HyukjinKwon @squito 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23056: [SPARK-26034][PYTHON][TESTS] Break large mllib/tests.py ...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23056
  
**[Test build #98897 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98897/testReport)**
 for PR 23056 at commit 
[`2759521`](https://github.com/apache/spark/commit/2759521df7f2dffc9ddb9379e0b1dac6721da366).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23049
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23049
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5069/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-15 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r234084002
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", 
true)
   // each python worker gets an equal part of the allocation. the worker 
pool will grow to the
   // number of concurrent tasks, which is determined by the number of 
cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

I mean that it is brittle to only try to use `resource` when the JVM has set the 
property. You handle the `ImportError`, but the JVM could still set the request and 
Python would break again.

I think this should not be entirely disabled on Windows. Resource 
requests to YARN and other schedulers should still include this memory. The only 
feature that should be disabled is the resource limiting on the Python side.
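
For illustration only (not the actual PythonRunner change), a minimal, 
self-contained Scala sketch of the split being suggested here: always surface the 
configured amount to the cluster manager, and only skip the worker-side 
enforcement on platforms without Python's `resource` module. All names below are 
made up for the example.

```scala
// Hypothetical sketch: separate "how much memory to request from the scheduler"
// from "whether to enforce the limit inside the Python worker".
object PysparkMemoryPlanSketch {
  /** Returns (memory to request in MiB, whether to apply the Python-side limit). */
  def plan(configuredMb: Option[Long], isWindows: Boolean): (Option[Long], Boolean) =
    (configuredMb, configuredMb.isDefined && !isWindows)

  def main(args: Array[String]): Unit = {
    println(plan(Some(512L), isWindows = true))   // (Some(512),false): request memory, skip enforcement
    println(plan(Some(512L), isWindows = false))  // (Some(512),true): request memory, enforce in worker
    println(plan(None, isWindows = false))        // (None,false): feature not configured at all
  }
}
```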


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23046: [SPARK-23207][SQL][FOLLOW-UP] Use `SQLConf.get.enableRad...

2018-11-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23046
  
thanks, merging to master/2.4!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5072/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-15 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/23043
  
IIUC, we discussed handling `+0.0` and `-0.0` before in another PR. 
@srowen do you remember the previous discussion?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23044: [SPARK-26073][SQL][FOLLOW-UP] remove invalid comment as ...

2018-11-15 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/23044
  
LGTM, pending Jenkins


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-15 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/23043
  
@kiszk This spun out of https://issues.apache.org/jira/browse/SPARK-24834 
and https://github.com/apache/spark/pull/21794; is that what you were 
thinking of? I'm not aware of any others.
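
For anyone catching up on the thread, here is a standalone Scala illustration 
(not taken from either PR) of the behavior being normalized: the two zeros 
compare equal numerically, but their bit patterns differ, so a byte-level hash 
or group-by key would treat them as distinct values.

```scala
// Standalone demo of why -0.0 is replaced with 0.0 before writing a binary row:
// numeric equality says the zeros are the same, the bit patterns say they are not.
object MinusZeroDemo {
  // Maps -0.0 to +0.0 and leaves every other value (including NaN) untouched.
  def normalize(d: Double): Double = if (d == 0.0d) 0.0d else d

  def main(args: Array[String]): Unit = {
    println(0.0 == -0.0)                            // true: numeric comparison
    println(java.lang.Double.doubleToLongBits(0.0) ==
      java.lang.Double.doubleToLongBits(-0.0))      // false: sign bit differs
    println(java.lang.Double.doubleToLongBits(normalize(-0.0)) ==
      java.lang.Double.doubleToLongBits(0.0))       // true after normalization
  }
}
```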


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23050: [SPARK-26079][sql] Ensure listener event delivery in Str...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23050
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5061/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23050: [SPARK-26079][sql] Ensure listener event delivery in Str...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23050
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23049
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23049
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98879/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23049
  
**[Test build #98876 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98876/testReport)**
 for PR 23049 at commit 
[`bf94264`](https://github.com/apache/spark/commit/bf94264a3e037e00a7b7111a677467de980071c9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23040
  
**[Test build #98878 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98878/testReport)**
 for PR 23040 at commit 
[`fa7af44`](https://github.com/apache/spark/commit/fa7af44abcd8ee95c956506c06badd83af067a03).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...

2018-11-15 Thread rezasafi
Github user rezasafi commented on the issue:

https://github.com/apache/spark/pull/22612
  
@squito @mccheah @dhruve  Let me know if there are more comments or this 
can be merged. I appreciate it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23031: [SPARK-26060][SQL] Track SparkConf entries and make SET ...

2018-11-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23031
  
**[Test build #98896 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98896/testReport)**
 for PR 23031 at commit 
[`336a331`](https://github.com/apache/spark/commit/336a331fdc817566c7fd09e5b36d5de24379c5b6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-15 Thread JulienPeloton
Github user JulienPeloton commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r234099956
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2813,6 +2819,11 @@ class Dataset[T] private[sql](
* When no explicit sort order is specified, "ascending nulls first" is 
assumed.
* Note, the rows are not sorted in each partition of the resulting 
Dataset.
*
+   * [SPARK-26024] Note that due to performance reasons this method uses 
sampling to
+   * estimate the ranges. Hence, the output may not be consistent, since 
sampling can return
+   * different values. The sample size can be controlled by setting the 
value of the parameter
+   * {{spark.sql.execution.rangeExchange.sampleSizePerPartition}}.
--- End diff --

Thanks. Done.
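
As a side note for readers of this doc change, a small spark-shell style sketch 
showing the config knob next to the method it affects. It assumes a local 
SparkSession named `spark`; the column and values are illustrative only.

```scala
// Sketch of how the documented config interacts with repartitionByRange.
// Raising sampleSizePerPartition makes the sampled range boundaries more stable
// across runs, at the cost of sampling more rows per partition.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().master("local[*]").appName("rangeRepartitionSketch").getOrCreate()

spark.conf.set("spark.sql.execution.rangeExchange.sampleSizePerPartition", "1000")

val df = spark.range(0L, 1000000L).toDF("id")
val ranged = df.repartitionByRange(8, col("id"))
println(ranged.rdd.getNumPartitions)  // 8; range boundaries are estimated from the sample
```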



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-15 Thread JulienPeloton
Github user JulienPeloton commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r234099934
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2813,6 +2819,11 @@ class Dataset[T] private[sql](
* When no explicit sort order is specified, "ascending nulls first" is 
assumed.
* Note, the rows are not sorted in each partition of the resulting 
Dataset.
*
+   * [SPARK-26024] Note that due to performance reasons this method uses 
sampling to
--- End diff --

Thanks. Done.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23047: [BACKPORT][SPARK-25883][SQL][MINOR] Override meth...

2018-11-15 Thread gengliangwang
Github user gengliangwang closed the pull request at:

https://github.com/apache/spark/pull/23047


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23038: [SPARK-25451][CORE][WEBUI]Aggregated metrics table doesn...

2018-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23038
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


