[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20348
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86461/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20348
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20348
  
**[Test build #86461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86461/testReport)** for PR 20348 at commit [`73a493a`](https://github.com/apache/spark/commit/73a493af4bf4b0a37dc58863cc75682e8fec26cb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19892: [SPARK-22797][PySpark] Bucketizer support multi-column

2018-01-21 Thread MLnick
Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/19892
  
If it is going to get merged to `branch-2.3`, the `since` tags need to be `2.3.0` again.


---




[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20277#discussion_r162862761
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java ---
@@ -33,18 +33,6 @@
   private final ArrowVectorAccessor accessor;
   private ArrowColumnVector[] childColumns;
 
-  private void ensureAccessible(int index) {
-    ensureAccessible(index, 1);
-  }
-
-  private void ensureAccessible(int index, int count) {
--- End diff --

How about we do it later? We need to find a central place to put this 
check, instead of doing it in every implementation.


---




[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/20277
  
LGTM pending Jenkins.


---




[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20277
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/90/
Test PASSed.


---




[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20277
  
**[Test build #86463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86463/testReport)** for PR 20277 at commit [`0c22f5b`](https://github.com/apache/spark/commit/0c22f5bec3ce5d3bd9f54d7950b58bff65f4941b).


---




[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20277
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20277#discussion_r162860736
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala ---
@@ -127,8 +127,10 @@ class VectorizedHashMapGenerator(
 
     def genEqualsForKeys(groupingKeys: Seq[Buffer]): String = {
       groupingKeys.zipWithIndex.map { case (key: Buffer, ordinal: Int) =>
-        s"""(${ctx.genEqual(key.dataType, ctx.getValue(s"vectors[$ordinal]", "buckets[idx]",
-          key.dataType), key.name)})"""
+        // `ColumnVector.getStruct` is different from `InternalRow.getStruct`, it only takes an
+        // `ordinal` parameter.
+        val value = ctx.getValue(s"vectors[$ordinal]", key.dataType, "buckets[idx]")
--- End diff --

`getValueFromVector` instead of `getValue`?


---




[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/89/
Test PASSed.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20348
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/88/
Test PASSed.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20348
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20348
  
**[Test build #86461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86461/testReport)** for PR 20348 at commit [`73a493a`](https://github.com/apache/spark/commit/73a493af4bf4b0a37dc58863cc75682e8fec26cb).


---




[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20343
  
**[Test build #86462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86462/testReport)** for PR 20343 at commit [`5d6092c`](https://github.com/apache/spark/commit/5d6092c4bf029a021930a4ba66e6e1de3a4b15ed).


---




[GitHub] spark pull request #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs...

2018-01-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20348#discussion_r162861480
  
--- Diff: python/pyspark/sql/udf.py ---
@@ -213,6 +213,10 @@ def register(self, name, f, returnType=None):
         `returnType` can be optionally specified when `f` is a Python function but not
         when `f` is a user-defined function. Please see below.
 
+        To register a non-deterministic Python function, users need to first build
+        a nondeterministic user-defined function for the Python function and then register it
--- End diff --

`nondeterministic` is better 


---




[GitHub] spark pull request #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCD...

2018-01-21 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20343#discussion_r162860188
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala ---
@@ -339,6 +340,30 @@ class TPCDSQuerySuite extends BenchmarkQueryTest {
     }
   }
 
+  val tpcdsQueriesV2_7_0 = Seq(
+    "q1", "q2", "q3", "q4", "q5", "q5a", "q6", "q7", "q8", "q9", "q10", "q10a", "q11",
+    "q12", "q13", "q14_1", "q14_2", "q14a_1", "q14a_2", "q15", "q16", "q17", "q18", "q18a", "q19",
+    "q20", "q21", "q22", "q22a", "q23_1", "q23_2", "q24_1", "q24_2", "q25", "q26", "q27", "q27a",
+    "q28", "q29", "q30", "q31", "q32", "q33", "q34", "q35", "q35a", "q36", "q36a", "q37", "q38",
+    "q39_1", "q39_2", "q40", "q41", "q42", "q43", "q44", "q45", "q46", "q47", "q48", "q49",
+    "q50", "q51", "q51a", "q52", "q53", "q54", "q55", "q56", "q57", "q58", "q59",
+    "q60", "q61", "q62", "q63", "q64", "q65", "q66", "q67", "q67a", "q68", "q69",
+    "q70", "q70a", "q71", "q72", "q73", "q74", "q75", "q76", "q77", "q77a", "q78", "q79",
+    "q80", "q80a", "q81", "q82", "q83", "q84", "q85", "q86", "q86a", "q87", "q88", "q89",
+    "q90", "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
+
+  tpcdsQueriesV2_7_0.foreach { name =>
+    val queryString = resourceToString(s"tpcds-v2.7.0/$name.sql",
--- End diff --

ok


---




[GitHub] spark pull request #18906: [SPARK-21692][PYSPARK][SQL] Add nullability suppo...

2018-01-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18906#discussion_r162857718
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2231,6 +2239,16 @@ def pandas_udf(f=None, returnType=None, functionType=None):
        ...     return pd.Series(np.random.randn(len(v))
        >>> random = random.asNondeterministic()  # doctest: +SKIP
 
+    .. note:: The user-defined functions are considered to be able to return null values by default.
+        If your function is not deterministic, call `asNonNullable` on the user defined function.
--- End diff --

ditto.


---




[GitHub] spark pull request #18906: [SPARK-21692][PYSPARK][SQL] Add nullability suppo...

2018-01-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18906#discussion_r162857996
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -602,6 +602,30 @@ def test_non_existed_udf(self):
         self.assertRaisesRegexp(AnalysisException, "Can not load class non_existed_udf",
                                 lambda: sqlContext.registerJavaFunction("udf1", "non_existed_udf"))
 
+    def test_udf_no_nulls(self):
+        from pyspark.sql.functions import udf
+        plus_four = udf(lambda x: x + 4, IntegerType()).asNonNullable()
+        df = self.spark.range(10)
+        res = df.select(plus_four(df['id']).alias('plus_four'))
+        self.assertFalse(plus_four.nullable)
+        self.assertFalse(res.schema['plus_four'].nullable)
+        self.assertEqual(res.agg({'plus_four': 'sum'}).collect()[0][0], 85)
+
+    def test_udf_with_callable_no_nulls(self):
+        df = self.spark.range(10)
+
+        class PlusFour:
+            def __call__(self, col):
+                if col is not None:
+                    return col + 4
--- End diff --

Do we need an `else` clause for this to be non-nullable?
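
To make the question concrete: a Python function that only returns inside an `if` falls off the end and returns `None` otherwise, so a callable shaped like the one in the diff yields a null whenever a null reaches it. The sketch below is plain Python with no Spark involved. `PlusFourStrict` and its raising `else` branch are hypothetical, shown only as one possible way to avoid the implicit `None`, not as what the PR does.

```python
class PlusFour:
    """Mirrors the callable in the diff: there is no `else` branch."""

    def __call__(self, col):
        if col is not None:
            return col + 4
        # Falling off the end of the function returns None implicitly.


class PlusFourStrict:
    """Hypothetical variant that never returns None."""

    def __call__(self, col):
        if col is not None:
            return col + 4
        else:
            raise ValueError("null input is not supported")


plus_four = PlusFour()
implicit_result = plus_four(None)             # no else branch, so this is None
total = sum(plus_four(i) for i in range(10))  # same ids as spark.range(10)
```

With ids 0 through 9 no null ever reaches the callable, so the expected sum of 85 (45 + 10 * 4) holds either way; the `else` only matters once a null input is possible.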


---




[GitHub] spark pull request #18906: [SPARK-21692][PYSPARK][SQL] Add nullability suppo...

2018-01-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18906#discussion_r162857704
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2103,6 +2103,14 @@ def udf(f=None, returnType=StringType()):
         >>> import random
         >>> random_udf = udf(lambda: int(random.random() * 100), IntegerType()).asNondeterministic()
 
+    .. note:: The user-defined functions are considered to be able to return null values by default.
+        If your function is not deterministic, call `asNonNullable` on the user defined function.
--- End diff --

`nullable` instead of `deterministic`?
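
The two properties are independent, which is why the wording matters. A plain-Python sketch of the distinction follows; the function names are made up for illustration and are not part of PySpark.

```python
import random


def safe_ratio(numerator, denominator):
    """Deterministic but nullable: equal inputs give equal outputs, possibly None."""
    if denominator == 0:
        return None
    return numerator / denominator


def random_percent(rng):
    """Nondeterministic but non-nullable: always an int in [0, 99], never None."""
    return int(rng.random() * 100)


rng = random.Random()
samples = [random_percent(rng) for _ in range(1000)]
```

Only a function of the second kind is a candidate for the `asNonNullable` call proposed in this PR; `asNondeterministic` addresses a separate concern.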


---




[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20277
  
**[Test build #86459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86459/testReport)** for PR 20277 at commit [`55a288e`](https://github.com/apache/spark/commit/55a288e925a71cd48a533d6171926e398f857c2e).


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20146
  
**[Test build #86460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86460/testReport)** for PR 20146 at commit [`540c364`](https://github.com/apache/spark/commit/540c364d2a70ecd6ee5b92fadedc5e9b85026d2c).


---




[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20277
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/87/
Test PASSed.


---




[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20277
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20348
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86458/
Test PASSed.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20348
  
**[Test build #86458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86458/testReport)** for PR 20348 at commit [`91ea1f5`](https://github.com/apache/spark/commit/91ea1f58b5daa27232c303f1bbeca09245a1aa9f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20348
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20347: [SPARK-20129][Core] JavaSparkContext should use SparkCon...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20347
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #20297: [SPARK-23020][CORE] Fix races in launcher code, t...

2018-01-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20297


---




[GitHub] spark issue #20347: [SPARK-20129][Core] JavaSparkContext should use SparkCon...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20347
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86456/
Test PASSed.


---




[GitHub] spark issue #20347: [SPARK-20129][Core] JavaSparkContext should use SparkCon...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20347
  
**[Test build #86456 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86456/testReport)** for PR 20347 at commit [`b1ae512`](https://github.com/apache/spark/commit/b1ae5125f65e0d8a59a4006a9777ed5185a758c9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20297
  
thanks, merging to master/2.3!


---




[GitHub] spark pull request #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs...

2018-01-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20348#discussion_r162857800
  
--- Diff: python/pyspark/sql/udf.py ---
@@ -213,6 +213,10 @@ def register(self, name, f, returnType=None):
         `returnType` can be optionally specified when `f` is a Python function but not
         when `f` is a user-defined function. Please see below.
 
+        To register a non-deterministic Python function, users need to first build
--- End diff --

Shall we switch this paragraph with `` `returnType` can be optionally s ... 
``? I intended to explain `returnType` with case 1. and case 2. together.


---




[GitHub] spark pull request #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs...

2018-01-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20348#discussion_r162857579
  
--- Diff: python/pyspark/sql/udf.py ---
@@ -213,6 +213,10 @@ def register(self, name, f, returnType=None):
         `returnType` can be optionally specified when `f` is a Python function but not
         when `f` is a user-defined function. Please see below.
 
+        To register a non-deterministic Python function, users need to first build
+        a nondeterministic user-defined function for the Python function and then register it
--- End diff --

`nondeterministic` -> `non-deterministic` or the opposite.


---




[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19285
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19285
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86455/
Test PASSed.


---




[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19285
  
**[Test build #86455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86455/testReport)** for PR 19285 at commit [`c442494`](https://github.com/apache/spark/commit/c4424943f5b74f8d1c191228cd8055d5482e7658).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20288: [SPARK-23122][PYTHON][SQL] Deprecate register* for UDFs ...

2018-01-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20288
  
@gatorsmile, I am sorry, I don't know why I missed this comment.


---




[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20338
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs...

2018-01-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20348#discussion_r162856477
  
--- Diff: python/pyspark/sql/udf.py ---
@@ -200,7 +200,7 @@ def __init__(self, sparkSession):
     @since("1.3.1")
     def register(self, name, f, returnType=None):
         """Registers a Python function (including lambda function) or a user-defined function
--- End diff --

`Register` instead of `Registers` to be consistent with other descriptions?


---




[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20338
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86454/
Test PASSed.


---




[GitHub] spark issue #20259: [SPARK-23066][WEB-UI] Master Page increase master start-...

2018-01-21 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue:

https://github.com/apache/spark/pull/20259
  
Thank you for the review. I will close this PR. I'm going to use a script to monitor the health of the Master process instead.


---




[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20338
  
**[Test build #86454 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86454/testReport)** for PR 20338 at commit [`c386a9a`](https://github.com/apache/spark/commit/c386a9a9b130e3974dc756b0fa89b7cff93f09ac).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20259: [SPARK-23066][WEB-UI] Master Page increase master...

2018-01-21 Thread guoxiaolongzte
Github user guoxiaolongzte closed the pull request at:

https://github.com/apache/spark/pull/20259


---




[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-01-21 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/13599
  
I want to be clear: there are ways to support this that don't break on executor restart, which is why this is a hard block. Resilience is a first-class concept for Spark, and we can't abandon it to install packages.
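
A toy simulation of the failure mode, in plain Python with no Spark involved. Everything here (`Executor`, `restart`, the `ensure_installed` flag) is hypothetical and only models the argument: a one-shot install reaches the executors alive at that moment, while a per-task check also covers a replacement executor. It illustrates the concern and is not a proposed design.

```python
class Executor:
    """Toy stand-in for one executor's Python environment."""

    def __init__(self):
        self.installed = set()

    def run_task(self, required, ensure_installed=False):
        if ensure_installed:
            # Idempotent: install whatever is missing before running the task.
            self.installed |= set(required)
        missing = set(required) - self.installed
        if missing:
            raise ImportError("missing packages: %s" % sorted(missing))
        return "ok"


def restart(executors, index):
    """Replace an executor with a fresh one, as happens after a failure."""
    executors[index] = Executor()


packages = ["numpy"]
executors = [Executor(), Executor()]

# One-shot install: only the executors that exist right now get the package.
for executor in executors:
    executor.installed |= set(packages)

restart(executors, 0)

try:
    one_shot_result = executors[0].run_task(packages)
except ImportError as exc:
    one_shot_result = "failed: %s" % exc

# Per-task check: the replacement executor recovers on its own.
per_task_result = executors[0].run_task(packages, ensure_installed=True)
```

In this model the surviving executor keeps working, but the restarted one fails until something reinstalls the package, which is the resilience gap being objected to.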


---




[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-01-21 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/13599
  
I think this is important functionality; however, the current PR will break in the event of an executor restart, and that isn't acceptable. I'm -1 on this until that issue is fixed.


---




[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-21 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13599#discussion_r162856254
  
--- Diff: python/pyspark/context.py ---
@@ -1023,6 +1032,35 @@ def getConf(self):
         conf.setAll(self._conf.getAll())
         return conf
 
+    def install_packages(self, packages, install_driver=True):
+        """
+        install python packages on all executors and driver through pip. pip will be installed
+        by default no matter using native virtualenv or conda. So it is guaranteed that pip is
+        available if virtualenv is enabled.
+        :param packages: string for single package or a list of string for multiple packages
+        :param install_driver: whether to install packages in client
+        """
+        if self._conf.get("spark.pyspark.virtualenv.enabled") != "true":
+            raise RuntimeError("install_packages can only use called when "
+                               "spark.pyspark.virtualenv.enabled set as true")
+        if isinstance(packages, basestring):
+            packages = [packages]
+        # seems statusTracker.getExecutorInfos() will return driver + exeuctors, so -1 here.
+        num_executors = len(self._jsc.sc().statusTracker().getExecutorInfos()) - 1
+        dummyRDD = self.parallelize(range(num_executors), num_executors)
+
+        def _run_pip(packages, iterator):
+            import pip
+            pip.main(["install"] + packages)
+
+        # run it in the main thread. Will do it in a separated thread after
+        # https://github.com/pypa/pip/issues/2553 is fixed
+        if install_driver:
+            _run_pip(packages, None)
+
+        import functools
+        dummyRDD.foreachPartition(functools.partial(_run_pip, packages))
--- End diff --

@zjffdu No, it is not; hard -1.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20348
  
**[Test build #86458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86458/testReport)** for PR 20348 at commit [`91ea1f5`](https://github.com/apache/spark/commit/91ea1f58b5daa27232c303f1bbeca09245a1aa9f).


---




[GitHub] spark pull request #20346: [MINOR][SQL] Fix wrong comments on org.apache.spa...

2018-01-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20346


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20348
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/86/
Test PASSed.


---




[GitHub] spark issue #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UD...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20348
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86457/
Test FAILed.


---

-



[GitHub] spark issue #20346: [MINOR][SQL] Fix wrong comments on org.apache.spark.sql....

2018-01-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20346
  
Merged to master and branch-2.3.


---

-



[GitHub] spark pull request #20348: [SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs...

2018-01-21 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/20348

[SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UDF Registration

## What changes were proposed in this pull request?

This PR is to update the docs for UDF registration  

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark testUpdateDoc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20348.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20348


commit 91ea1f58b5daa27232c303f1bbeca09245a1aa9f
Author: gatorsmile 
Date:   2018-01-22T06:16:46Z

update the doc.




---

-



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Merged build finished. Test FAILed.


---

-



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20343
  
**[Test build #86457 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86457/testReport)**
 for PR 20343 at commit 
[`12f687c`](https://github.com/apache/spark/commit/12f687c3c4338478f7f0cc40474c90f55aab8ecf).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-



[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20277#discussion_r162853774
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java 
---
@@ -55,164 +43,82 @@ public void close() {
 if (childColumns != null) {
   for (int i = 0; i < childColumns.length; i++) {
 childColumns[i].close();
+childColumns[i] = null;
   }
 }
--- End diff --

We need to do `childColumns = null` after the for loop, otherwise 
`NullPointerException` will be thrown if `close()` is called twice?
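
The double-close concern can be illustrated with a minimal sketch. `ChildVector` and `ParentVector` below are hypothetical stand-ins, not Spark's classes; the only point is that clearing the array reference after the loop makes a second `close()` a no-op instead of dereferencing slots that were already set to null.

```java
// Hypothetical stand-in for a child column vector; not Spark's class.
class ChildVector {
    private boolean closed = false;

    void close() { closed = true; }

    boolean isClosed() { return closed; }
}

// Hypothetical parent that owns an array of children, as in the diff above.
class ParentVector {
    private ChildVector[] childColumns;

    ParentVector(int numChildren) {
        childColumns = new ChildVector[numChildren];
        for (int i = 0; i < numChildren; i++) {
            childColumns[i] = new ChildVector();
        }
    }

    // Safe to call more than once: the array reference itself is cleared,
    // so a second call skips the loop instead of touching null slots.
    void close() {
        if (childColumns != null) {
            for (int i = 0; i < childColumns.length; i++) {
                childColumns[i].close();
                childColumns[i] = null;
            }
            childColumns = null;
        }
    }

    boolean isReleased() { return childColumns == null; }
}
```

Without the final `childColumns = null`, the second call would enter the loop and invoke `close()` on a null element.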


---

-



[GitHub] spark issue #20259: [SPARK-23066][WEB-UI] Master Page increase master start-...

2018-01-21 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/20259
  
I kind of agree with @CodingCat. We have plenty of third-party 
monitoring tools to monitor the availability of the Master process, so it is not 
really necessary to expose this in the Master UI.


---

-



[GitHub] spark pull request #20053: [SPARK-22873] [CORE] Init lastReportTimestamp wit...

2018-01-21 Thread Ngone51
Github user Ngone51 closed the pull request at:

https://github.com/apache/spark/pull/20053


---

-



[GitHub] spark pull request #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for...

2018-01-21 Thread Ngone51
Github user Ngone51 closed the pull request at:

https://github.com/apache/spark/pull/20056


---

-



[GitHub] spark pull request #20338: [SPARK-23174][BUILD][PYTHON] python code style ch...

2018-01-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20338#discussion_r162852255
  
--- Diff: dev/lint-python ---
@@ -35,11 +35,10 @@ compile_status="${PIPESTATUS[0]}"
 
 # Get pep8 at runtime so that we don't rely on it being installed on the 
build server.
 #+ See: https://github.com/apache/spark/pull/1744#issuecomment-50982162
-#+ TODOs:
-#+  - Download pep8 from PyPI. It's more "official".
-PEP8_VERSION="1.7.0"
+# Updated to latest official version for pep8.pep8 is formally renamed to 
pycodestyle.
+PEP8_VERSION="2.3.1"
--- End diff --

Should we also use `pycodestyle` instead of `pep8` for variable names or 
script names?


---

-



[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20338
  
Merged build finished. Test PASSed.


---

-



[GitHub] spark pull request #20297: [SPARK-23020][CORE] Fix races in launcher code, t...

2018-01-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20297#discussion_r162851582
  
--- Diff: 
launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java ---
@@ -48,14 +48,16 @@ public synchronized void disconnect() {
 
   @Override
   public synchronized void kill() {
-disconnect();
-if (childProc != null) {
-  if (childProc.isAlive()) {
-childProc.destroyForcibly();
+if (!isDisposed()) {
+  setState(State.KILLED);
--- End diff --

Even though the order doesn't matter, I think it's more conventional to set the 
state at the end.
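
As a rough sketch of the suggested ordering (teardown first, state last), assuming a hypothetical `DemoAppHandle` whose field names only loosely mirror the diff and which is not the launcher's actual code:

```java
// Hypothetical handle; not Spark's ChildProcAppHandle.
class DemoAppHandle {
    enum State { RUNNING, KILLED }

    private State state = State.RUNNING;
    private boolean disposed = false;
    private boolean processAlive = true;

    synchronized void kill() {
        if (!disposed) {
            // Do the actual teardown first...
            processAlive = false;
            disposed = true;
            // ...and set the final state last, once teardown is complete.
            state = State.KILLED;
        }
    }

    synchronized State getState() { return state; }

    synchronized boolean isProcessAlive() { return processAlive; }
}
```

Because `kill()` is synchronized, other threads using the synchronized accessors cannot observe an intermediate state either way, which is why the order is a matter of convention here.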


---

-



[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20338
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86451/
Test PASSed.


---

-



[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20338
  
**[Test build #86451 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86451/testReport)**
 for PR 20338 at commit 
[`8718891`](https://github.com/apache/spark/commit/871889113057f819337bd24379cf1f07516c3298).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-



[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20297
  
LGTM


---

-



[GitHub] spark issue #20346: [MINOR][SQL] Fix wrong comments on org.apache.spark.sql....

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20346
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86453/
Test PASSed.


---

-



[GitHub] spark issue #20346: [MINOR][SQL] Fix wrong comments on org.apache.spark.sql....

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20346
  
Merged build finished. Test PASSed.


---

-



[GitHub] spark issue #20346: [MINOR][SQL] Fix wrong comments on org.apache.spark.sql....

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20346
  
**[Test build #86453 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86453/testReport)**
 for PR 20346 at commit 
[`4207754`](https://github.com/apache/spark/commit/4207754cc67633956b80e213875f6f5c3c2c0952).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-



[GitHub] spark pull request #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCD...

2018-01-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20343#discussion_r162850692
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala ---
@@ -339,6 +340,30 @@ class TPCDSQuerySuite extends BenchmarkQueryTest {
 }
   }
 
+  val tpcdsQueriesV2_7_0 = Seq(
+"q1", "q2", "q3", "q4", "q5", "q5a", "q6", "q7", "q8", "q9", "q10", 
"q10a", "q11",
+"q12", "q13", "q14_1", "q14_2", "q14a_1", "q14a_2",  "q15", "q16", 
"q17", "q18", "q18a", "q19",
+"q20", "q21", "q22", "q22a", "q23_1", "q23_2", "q24_1", "q24_2", 
"q25", "q26", "q27", "q27a",
+"q28", "q29", "q30", "q31", "q32", "q33", "q34", "q35", "q35a", "q36", 
"q36a", "q37", "q38",
+"q39_1", "q39_2", "q40", "q41", "q42", "q43", "q44", "q45", "q46", 
"q47", "q48", "q49",
+"q50", "q51", "q51a", "q52", "q53", "q54", "q55", "q56", "q57", "q58", 
"q59",
+"q60", "q61", "q62", "q63", "q64", "q65", "q66", "q67", "q67a", "q68", 
"q69",
+"q70", "q70a", "q71", "q72", "q73", "q74", "q75", "q76", "q77", 
"q77a", "q78", "q79",
+"q80", "q80a", "q81", "q82", "q83", "q84", "q85", "q86", "q86a", 
"q87", "q88", "q89",
+"q90", "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
+
+  tpcdsQueriesV2_7_0.foreach { name =>
+val queryString = resourceToString(s"tpcds-v2.7.0/$name.sql",
--- End diff --

This is only related to test cases. Thus, it is fine if the Spark 2.3 release 
does not have it. You can do it in this PR.

Actually, this PR can be merged as long as we can fix all the issues.


---

-



[GitHub] spark issue #20292: [SPARK-23129][CORE] Make deserializeStream of DiskMapIte...

2018-01-21 Thread caneGuy
Github user caneGuy commented on the issue:

https://github.com/apache/spark/pull/20292
  
Ping @jiangxb1987, could you help review this? Thanks very much!


---

-



[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-21 Thread Ngone51
Github user Ngone51 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19285#discussion_r162848405
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -162,26 +162,29 @@ private[spark] class MemoryStore(
   }
 
   /**
-   * Attempt to put the given block in memory store as values.
+   * Attempt to put the given block in memory store as values or bytes.
*
* It's possible that the iterator is too large to materialize and store 
in memory. To avoid
* OOM exceptions, this method will gradually unroll the iterator while 
periodically checking
* whether there is enough free memory. If the block is successfully 
materialized, then the
* temporary unroll memory used during the materialization is 
"transferred" to storage memory,
* so we won't acquire more memory than is actually needed to store the 
block.
*
-   * @return in case of success, the estimated size of the stored data. In 
case of failure, return
-   * an iterator containing the values of the block. The returned 
iterator will be backed
-   * by the combination of the partially-unrolled block and the 
remaining elements of the
-   * original input iterator. The caller must either fully consume 
this iterator or call
-   * `close()` on it in order to free the storage memory consumed 
by the partially-unrolled
-   * block.
+   * @param blockId The block id.
+   * @param values The values which need be stored.
+   * @param classTag the [[ClassTag]] for the block.
+   * @param memoryMode The values saved mode.
+   * @param valuesHolder A holder that supports storing record of values 
into memory store as
+   *values or bytes.
+   * @return if the block is stored successfully, return the stored data 
size. Else return the
+   * memory has used for unroll the block.
--- End diff --

First, I think you will not disagree that a partially-unrolled case 
exists in the failure situation.

Second,
>The block can be unrolled fully, but the used memory exceeded the request 
and can't request the extra memory.

Yeah, I know. But what I want to say is that a block being unrolled fully 
doesn't mean we have reserved unroll memory for all values (that only happens 
when the last element in iterator % memoryCheckPeriod == 0), because of 
`memoryCheckPeriod`. And here, we are talking about `the memory has used for 
unroll the block`. So, it is not accurate to say 'the block is unrolled fully, 
so the used memory is for all the values'. 

So, mostly, it would be `partially-unrolled`. WDYT?
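
The effect of `memoryCheckPeriod` described above can be sketched with a simplified model. This is not `MemoryStore`'s actual logic: it ignores the initial unroll threshold and the growth factor, and it assumes every element has the same size. `UnrollSimulation` and its parameters are illustrative names only.

```java
// Simplified model of periodic unroll-memory checks; not Spark's MemoryStore.
class UnrollSimulation {
    // Returns {bytesUsed, bytesReserved} after consuming every element.
    static long[] unroll(int numElements, int memoryCheckPeriod, long bytesPerElement) {
        long used = 0L;
        long reserved = 0L;
        for (int i = 1; i <= numElements; i++) {
            used += bytesPerElement;
            // The reservation is only topped up at each periodic check.
            if (i % memoryCheckPeriod == 0 && used > reserved) {
                reserved = used;
            }
        }
        return new long[] {used, reserved};
    }

    public static void main(String[] args) {
        long[] exact = unroll(32, 16, 8L);
        long[] lagging = unroll(40, 16, 8L);
        System.out.println("32 elements: used=" + exact[0] + " reserved=" + exact[1]);
        System.out.println("40 elements: used=" + lagging[0] + " reserved=" + lagging[1]);
    }
}
```

With 32 elements and a period of 16, the last element falls on a check and the reservation matches the usage (256 bytes). With 40 elements, the last check happened at element 32, so only 256 of the 320 bytes used are covered.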


---

-



[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-21 Thread Ngone51
Github user Ngone51 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19285#discussion_r162841437
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -162,26 +162,29 @@ private[spark] class MemoryStore(
   }
 
   /**
-   * Attempt to put the given block in memory store as values.
+   * Attempt to put the given block in memory store as values or bytes.
*
* It's possible that the iterator is too large to materialize and store 
in memory. To avoid
* OOM exceptions, this method will gradually unroll the iterator while 
periodically checking
* whether there is enough free memory. If the block is successfully 
materialized, then the
* temporary unroll memory used during the materialization is 
"transferred" to storage memory,
* so we won't acquire more memory than is actually needed to store the 
block.
*
-   * @return in case of success, the estimated size of the stored data. In 
case of failure, return
-   * an iterator containing the values of the block. The returned 
iterator will be backed
-   * by the combination of the partially-unrolled block and the 
remaining elements of the
-   * original input iterator. The caller must either fully consume 
this iterator or call
-   * `close()` on it in order to free the storage memory consumed 
by the partially-unrolled
-   * block.
+   * @param blockId The block id.
+   * @param values The values which need be stored.
+   * @param classTag the [[ClassTag]] for the block.
+   * @param memoryMode The values saved mode.
--- End diff --

Ah, I mean that putting `values` together with `saved mode` makes me think 
of the values' storage type, rather than the memory mode.


---

-



[GitHub] spark pull request #20298: [SPARK-22976][Core]: Cluster mode driver dir remo...

2018-01-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20298


---

-



[GitHub] spark issue #20298: [SPARK-22976][Core]: Cluster mode driver dir removed whi...

2018-01-21 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/20298
  
Merging to master/2.3, thanks for the fix!


---

-



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86452/
Test FAILed.


---

-



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Merged build finished. Test FAILed.


---

-



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20343
  
**[Test build #86452 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86452/testReport)**
 for PR 20343 at commit 
[`9ac04ed`](https://github.com/apache/spark/commit/9ac04edc5aa770fb04b9ad4c12de75fa6d4ac2c8).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-



[GitHub] spark issue #20208: [SPARK-23007][SQL][TEST] Add schema evolution test suite...

2018-01-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20208
  
Will do it after 2.3 release


---

-



[GitHub] spark pull request #20325: [SPARK-22808][DOCS] add insertInto when save hive...

2018-01-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20325#discussion_r162845525
  
--- Diff: docs/sql-programming-guide.md ---
@@ -580,6 +580,9 @@ default local Hive metastore (using Derby) for you. 
Unlike the `createOrReplaceT
 Hive metastore. Persistent tables will still exist even after your Spark 
program has restarted, as
 long as you maintain your connection to the same metastore. A DataFrame 
for a persistent table can
 be created by calling the `table` method on a `SparkSession` with the name 
of the table.
+Notice that for `DataFrames` is built on Hive table, `insertInto` should 
be used instead of `saveAsTable`.
--- End diff --

Done. Thanks!


---

-



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20343
  
**[Test build #86457 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86457/testReport)**
 for PR 20343 at commit 
[`12f687c`](https://github.com/apache/spark/commit/12f687c3c4338478f7f0cc40474c90f55aab8ecf).


---

-



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/85/
Test PASSed.


---

-



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Merged build finished. Test PASSed.


---

-



[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20277#discussion_r162844101
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java 
---
@@ -55,164 +43,82 @@ public void close() {
 if (childColumns != null) {
   for (int i = 0; i < childColumns.length; i++) {
 childColumns[i].close();
+childColumns[i] = null;
--- End diff --

what do you mean? `ColumnVector.close` is a required interface.


---

-



[GitHub] spark pull request #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20316#discussion_r162842712
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarBatch.java ---
@@ -96,16 +90,6 @@ public void setNumRows(int numRows) {
*/
   public int numRows() { return numRows; }
 
-  /**
-   * Returns the schema that makes up this batch.
-   */
-  public StructType schema() { return schema; }
-
-  /**
-   * Returns the max capacity (in number of rows) for this batch.
-   */
-  public int capacity() { return capacity; }
--- End diff --

`ColumnarBatch` consumers don't care about `capacity`, only `numRows`; 
`capacity` is only needed by column vector builders. They also don't care 
about the schema or field names, only the data type of each column.
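
A minimal sketch of that split between builder and consumer, using a hypothetical single-column `DemoBatch` rather than Spark's `ColumnarBatch`:

```java
// Hypothetical batch with one int column; not Spark's ColumnarBatch.
class DemoBatch {
    private final int[] column;  // backing storage, sized by the builder
    private int numRows = 0;

    DemoBatch(int capacity) { column = new int[capacity]; }

    // Builder side: must respect the capacity of the backing storage.
    void append(int value) {
        if (numRows == column.length) {
            throw new IllegalStateException("batch is full");
        }
        column[numRows++] = value;
    }

    // Consumer side: only numRows and the column data are needed.
    int numRows() { return numRows; }

    int getInt(int rowId) { return column[rowId]; }
}

class BatchConsumer {
    // Reads exactly numRows values and never asks for the capacity.
    static long sum(DemoBatch batch) {
        long total = 0L;
        for (int i = 0; i < batch.numRows(); i++) {
            total += batch.getInt(i);
        }
        return total;
    }
}
```

Here the capacity shows up only in the constructor and in `append`, which is the builder's concern; the consumer's loop is bounded by `numRows()` alone.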


---

-



[GitHub] spark issue #20346: [MINOR][SQL] Fix wrong comments on org.apache.spark.sql....

2018-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/20346
  
Thank you for reviewing and confirming, @viirya!


---

-



[GitHub] spark issue #20347: [SPARK-20129][Core] JavaSparkContext should use SparkCon...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20347
  
**[Test build #86456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86456/testReport)**
 for PR 20347 at commit 
[`b1ae512`](https://github.com/apache/spark/commit/b1ae5125f65e0d8a59a4006a9777ed5185a758c9).


---

-



[GitHub] spark issue #20347: [SPARK-20129][Core] JavaSparkContext should use SparkCon...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20347
  
Merged build finished. Test PASSed.


---

-



[GitHub] spark issue #20347: [SPARK-20129][Core] JavaSparkContext should use SparkCon...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20347
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/84/
Test PASSed.


---

-



[GitHub] spark pull request #20347: [SPARK-20129][Core] JavaSparkContext should use S...

2018-01-21 Thread rekhajoshm
GitHub user rekhajoshm opened a pull request:

https://github.com/apache/spark/pull/20347

[SPARK-20129][Core] JavaSparkContext should use SparkContext.getOrCreate

## What changes were proposed in this pull request?
Use `SparkContext.getOrCreate()` instead of creating a new `SparkContext` in 
`JavaSparkContext`.
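
The general get-or-create idea can be sketched as follows. `DemoContext` is a hypothetical class used only to show the pattern; it is not Spark's API and says nothing about how the patch itself is written.

```java
// Hypothetical context class illustrating a get-or-create accessor; not Spark's API.
class DemoContext {
    private static DemoContext active;

    private final String appName;

    private DemoContext(String appName) { this.appName = appName; }

    // Returns the active context if one exists; otherwise creates and registers one.
    static synchronized DemoContext getOrCreate(String appName) {
        if (active == null) {
            active = new DemoContext(appName);
        }
        return active;
    }

    String appName() { return appName; }
}
```

A second call returns the instance created by the first, so in this sketch the later `appName` argument is ignored rather than producing a second context.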

## How was this patch tested?
Existing tests


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/spark SPARK-20129

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20347.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20347


commit e3677c9fa9697e0d34f9df52442085a6a481c9e9
Author: Rekha Joshi 
Date:   2015-05-05T23:10:08Z

Merge pull request #1 from apache/master

Pulling functionality from apache spark

commit 106fd8eee8f6a6f7c67cfc64f57c1161f76d8f75
Author: Rekha Joshi 
Date:   2015-05-08T21:49:09Z

Merge pull request #2 from apache/master

pull latest from apache spark

commit 0be142d6becba7c09c6eba0b8ea1efe83d649e8c
Author: Rekha Joshi 
Date:   2015-06-22T00:08:08Z

Merge pull request #3 from apache/master

Pulling functionality from apache spark

commit 6c6ee12fd733e3f9902e10faf92ccb78211245e3
Author: Rekha Joshi 
Date:   2015-09-17T01:03:09Z

Merge pull request #4 from apache/master

Pulling functionality from apache spark

commit b123c601e459d1ad17511fd91dd304032154882a
Author: Rekha Joshi 
Date:   2015-11-25T18:50:32Z

Merge pull request #5 from apache/master

pull request from apache/master

commit c73c32aadd6066e631956923725a48d98a18777e
Author: Rekha Joshi 
Date:   2016-03-18T19:13:51Z

Merge pull request #6 from apache/master

pull latest from apache spark

commit 7dbf7320057978526635bed09dabc8cf8657a28a
Author: Rekha Joshi 
Date:   2016-04-05T20:26:40Z

Merge pull request #8 from apache/master

pull latest from apache spark

commit 5e9d71827f8e2e4d07027281b80e4e073e7fecd1
Author: Rekha Joshi 
Date:   2017-05-01T23:00:30Z

Merge pull request #9 from apache/master

Pull apache spark

commit 63d99b3ce5f222d7126133170a373591f0ac67dd
Author: Rekha Joshi 
Date:   2017-09-30T22:26:44Z

Merge pull request #10 from apache/master

pull latest apache spark

commit a7fc787466b71784ff86f9694f617db0f1042da8
Author: Rekha Joshi 
Date:   2018-01-21T00:17:58Z

Merge pull request #11 from apache/master

Apache spark pull latest

commit b1ae5125f65e0d8a59a4006a9777ed5185a758c9
Author: rjoshi2 
Date:   2018-01-22T02:53:06Z

[SPARK-20129][Core] JavaSparkContext should use SparkContext.getOrCreate




---

-



[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19285
  
**[Test build #86455 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86455/testReport)**
 for PR 19285 at commit 
[`c442494`](https://github.com/apache/spark/commit/c4424943f5b74f8d1c191228cd8055d5482e7658).


---

-



[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request:

https://github.com/apache/spark/pull/19285#discussion_r162840896
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -162,26 +162,29 @@ private[spark] class MemoryStore(
   }
 
   /**
-   * Attempt to put the given block in memory store as values.
+   * Attempt to put the given block in memory store as values or bytes.
*
* It's possible that the iterator is too large to materialize and store 
in memory. To avoid
* OOM exceptions, this method will gradually unroll the iterator while 
periodically checking
* whether there is enough free memory. If the block is successfully 
materialized, then the
* temporary unroll memory used during the materialization is 
"transferred" to storage memory,
* so we won't acquire more memory than is actually needed to store the 
block.
*
-   * @return in case of success, the estimated size of the stored data. In 
case of failure, return
-   * an iterator containing the values of the block. The returned 
iterator will be backed
-   * by the combination of the partially-unrolled block and the 
remaining elements of the
-   * original input iterator. The caller must either fully consume 
this iterator or call
-   * `close()` on it in order to free the storage memory consumed 
by the partially-unrolled
-   * block.
+   * @param blockId The block id.
+   * @param values The values which need be stored.
+   * @param classTag the [[ClassTag]] for the block.
+   * @param memoryMode The values saved mode.
+   * @param valuesHolder A holder that supports storing record of values 
into memory store as
+   *values of bytes.
+   * @return if the block is stored successfully, return the stored data 
size. Else return the
+   * memory has used for unroll the block.
--- End diff --

The block can be unrolled fully, but the used memory exceeded the request 
and can't request the extra memory.


---

-



[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request:

https://github.com/apache/spark/pull/19285#discussion_r162840776
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -162,26 +162,29 @@ private[spark] class MemoryStore(
   }
 
   /**
-   * Attempt to put the given block in memory store as values.
+   * Attempt to put the given block in memory store as values or bytes.
*
* It's possible that the iterator is too large to materialize and store in memory. To avoid
* OOM exceptions, this method will gradually unroll the iterator while periodically checking
* whether there is enough free memory. If the block is successfully materialized, then the
* temporary unroll memory used during the materialization is "transferred" to storage memory,
* so we won't acquire more memory than is actually needed to store the block.
*
-   * @return in case of success, the estimated size of the stored data. In case of failure, return
-   * an iterator containing the values of the block. The returned iterator will be backed
-   * by the combination of the partially-unrolled block and the remaining elements of the
-   * original input iterator. The caller must either fully consume this iterator or call
-   * `close()` on it in order to free the storage memory consumed by the partially-unrolled
-   * block.
+   * @param blockId The block id.
+   * @param values The values which need be stored.
+   * @param classTag the [[ClassTag]] for the block.
+   * @param memoryMode The values saved mode.
--- End diff --

`MemoryMode` has only two modes: ON_HEAP and OFF_HEAP.
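
To make the two-mode point concrete, here is a minimal sketch of how a two-valued memory mode might select a buffer allocator. `Mode`, `OnHeap`, `OffHeap`, and `allocatorFor` are hypothetical stand-ins and are not Spark's `MemoryMode` enum or the allocator used in this PR; only `java.nio.ByteBuffer` is a real API.

```scala
import java.nio.ByteBuffer

object MemoryModeSketch {
  // Hypothetical stand-in for a two-valued memory mode (not Spark's MemoryMode).
  sealed trait Mode
  case object OnHeap extends Mode
  case object OffHeap extends Mode

  // Choose an allocator for the given mode. Because Mode is sealed, the
  // compiler can check that both cases are covered.
  def allocatorFor(mode: Mode): Int => ByteBuffer = mode match {
    case OnHeap  => (size: Int) => ByteBuffer.allocate(size)        // JVM heap buffer
    case OffHeap => (size: Int) => ByteBuffer.allocateDirect(size)  // direct (off-heap) buffer
  }
}
```

With only two cases, a parameter description can simply name both modes and say what each one means for where the values are kept.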


---




[GitHub] spark issue #20346: [MINOR][SQL] Fix wrong comments on org.apache.spark.sql....

2018-01-21 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20346
  
This is an old code comment that we should have removed earlier. Thanks for 
fixing it. LGTM



On Jan 22, 2018 11:20 AM, "Dongjoon Hyun" wrote:

> cc @viirya 


---




[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20338
  
**[Test build #86454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86454/testReport)**
 for PR 20338 at commit 
[`c386a9a`](https://github.com/apache/spark/commit/c386a9a9b130e3974dc756b0fa89b7cff93f09ac).


---




[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20338
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20338: [SPARK-23174][BUILD][PYTHON] python code style checker u...

2018-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20338
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/83/
Test PASSed.


---




[GitHub] spark pull request #20338: [SPARK-23174][BUILD][PYTHON] python code style ch...

2018-01-21 Thread rekhajoshm
Github user rekhajoshm commented on a diff in the pull request:

https://github.com/apache/spark/pull/20338#discussion_r162838933
  
--- Diff: dev/tox.ini ---
@@ -13,7 +13,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-[pep8]
-ignore=E402,E731,E241,W503,E226
+[pycodestyle]
+ignore=E402,E731,E241,W503,E226,E722,E741,E305
--- End diff --

Updated SPARK-11222. Added 
https://issues.apache.org/jira/browse/SPARK-23174.


---




[GitHub] spark issue #20346: [MINOR][SQL] Fix wrong comments on org.apache.spark.sql....

2018-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/20346
  
Thank you for the review and approval, @HyukjinKwon!


---



