[GitHub] spark issue #20421: [SPARK-23112][DOC] Update ML migration guide with breaki...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20421
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20421: [SPARK-23112][DOC] Update ML migration guide with breaki...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20421
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86802/
Test PASSed.


---




[GitHub] spark issue #20421: [SPARK-23112][DOC] Update ML migration guide with breaki...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20421
  
**[Test build #86802 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86802/testReport)**
 for PR 20421 at commit 
[`4433d9c`](https://github.com/apache/spark/commit/4433d9cb70bd7a3257aef4e23f8c85f57c7999a6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-29 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/20404
  
LGTM.
I'd like to leave it to @felixcheung to confirm whether setting the default 
session is okay 
(https://github.com/apache/spark/pull/20404#discussion_r164362178).


---




[GitHub] spark pull request #20430: [SPARK-23263][SQL] Create table stored as parquet...

2018-01-29 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20430#discussion_r164662154
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
 ---
@@ -34,16 +34,12 @@ object CommandUtils extends Logging {
 
   /** Change statistics after changing data by commands. */
   def updateTableStats(sparkSession: SparkSession, table: CatalogTable): 
Unit = {
-if (table.stats.nonEmpty) {
+if (sparkSession.sessionState.conf.autoSizeUpdateEnabled) {
   val catalog = sparkSession.sessionState.catalog
-  if (sparkSession.sessionState.conf.autoSizeUpdateEnabled) {
-val newTable = catalog.getTableMetadata(table.identifier)
-val newSize = 
CommandUtils.calculateTotalSize(sparkSession.sessionState, newTable)
-val newStats = CatalogStatistics(sizeInBytes = newSize)
-catalog.alterTableStats(table.identifier, Some(newStats))
-  } else {
-catalog.alterTableStats(table.identifier, None)
--- End diff --

This seems to have been a way to clear out the table stats previously. Don't we 
still need that?
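
To make the question concrete, here is a minimal Java sketch of the branch that the diff removes, modeled on plain values rather than on Spark's catalog. The class `StatsUpdateSketch`, its method, and its parameters are hypothetical stand-ins, not Spark APIs; only the branching mirrors the removed lines.

```java
import java.util.Optional;

// Hypothetical model of the branch removed from CommandUtils.updateTableStats.
// None of these names are Spark APIs; they only mirror the old control flow.
final class StatsUpdateSketch {
    /**
     * Returns the stats (size in bytes) the catalog would hold after the old code ran.
     *
     * @param existingStats  stats recorded before the command (empty if none)
     * @param autoSizeUpdate stand-in for conf.autoSizeUpdateEnabled
     * @param recomputedSize stand-in for the result of calculateTotalSize
     */
    static Optional<Long> oldBehavior(Optional<Long> existingStats,
                                      boolean autoSizeUpdate,
                                      long recomputedSize) {
        if (!existingStats.isPresent()) {
            return existingStats;               // no stats before: nothing was touched
        }
        if (autoSizeUpdate) {
            return Optional.of(recomputedSize); // refresh the recorded size
        }
        return Optional.empty();                // stale stats were cleared
    }
}
```

The last branch is the one the comment asks about: with auto update disabled, the old code dropped the now-stale stats instead of keeping them.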


---




[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-29 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/20404
  
Hi all, can you please review again, thanks!


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20428
  
Let's also update the PR description.


---




[GitHub] spark issue #20431: [SPARK-23222][SQL] Make DataFrameRangeSuite not flaky

2018-01-29 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20431
  
I didn't notice a significant difference. It takes about 450~500 milliseconds.




---




[GitHub] spark issue #20421: [SPARK-23112][DOC] Update ML migration guide with breaki...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20421
  
**[Test build #86802 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86802/testReport)**
 for PR 20421 at commit 
[`4433d9c`](https://github.com/apache/spark/commit/4433d9cb70bd7a3257aef4e23f8c85f57c7999a6).


---




[GitHub] spark issue #20421: [SPARK-23112][DOC] Update ML migration guide with breaki...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20421
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/366/
Test PASSed.


---




[GitHub] spark issue #20421: [SPARK-23112][DOC] Update ML migration guide with breaki...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20421
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #20427: [SPARK-23260][SPARK-23262][SQL] several data sour...

2018-01-29 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/20427#discussion_r164660069
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
 ---
@@ -23,7 +23,7 @@ import org.apache.spark.sql.sources.v2.reader._
 
 case class DataSourceV2Relation(
--- End diff --

Consider removing V2 from `DataSourceV2Relation` and 
`StreamingDataSourceV2Relation`?


---




[GitHub] spark issue #20432: [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*....

2018-01-29 Thread rekhajoshm
Github user rekhajoshm commented on the issue:

https://github.com/apache/spark/pull/20432
  
👍 LGTM @ueshin 


---




[GitHub] spark issue #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer.commit ...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20386
  
**[Test build #86801 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86801/testReport)**
 for PR 20386 at commit 
[`42dc690`](https://github.com/apache/spark/commit/42dc69004ad37a5c4a5d8c96478a875ff4baed4e).


---




[GitHub] spark issue #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer.commit ...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20386
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer.commit ...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/365/
Test PASSed.


---




[GitHub] spark issue #20422: [SPARK-23253][Core][Shuffle]Only write shuffle temporary...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20422
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20422: [SPARK-23253][Core][Shuffle]Only write shuffle temporary...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20422
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86792/
Test FAILed.


---




[GitHub] spark issue #20422: [SPARK-23253][Core][Shuffle]Only write shuffle temporary...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20422
  
**[Test build #86792 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86792/testReport)**
 for PR 20422 at commit 
[`98ea6a7`](https://github.com/apache/spark/commit/98ea6a742143da803eb728c352e7424f504fabba).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20430: [SPARK-23263][SQL] Create table stored as parquet should...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20430
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20430: [SPARK-23263][SQL] Create table stored as parquet should...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20430
  
**[Test build #86790 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86790/testReport)**
 for PR 20430 at commit 
[`08d31c0`](https://github.com/apache/spark/commit/08d31c0823e5f6c257b0917362c8e07b04702af2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20430: [SPARK-23263][SQL] Create table stored as parquet should...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20430
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86790/
Test PASSed.


---




[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20177
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86791/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SPARK-23204][SQL]: DataSourceV2: Use immut...

2018-01-29 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20387
  
Don't we already have `table` in `DataFrameReader`? 
http://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=dataframereader#pyspark.sql.DataFrameReader.table

http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameReader@table(tableName:String):org.apache.spark.sql.DataFrame


---




[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20177
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20177
  
**[Test build #86791 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86791/testReport)**
 for PR 20177 at commit 
[`4c86456`](https://github.com/apache/spark/commit/4c8645623f3b89c9f7b1bc7809c6b9f5a95d2389).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20332: [SPARK-23138][ML][DOC] Multiclass logistic regres...

2018-01-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20332


---




[GitHub] spark issue #20332: [SPARK-23138][ML][DOC] Multiclass logistic regression su...

2018-01-29 Thread MLnick
Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/20332
  
Merged to master / branch-2.3. Thanks @sethah, and @WeichenXu123 for review.


---




[GitHub] spark pull request #20332: [SPARK-23138][ML][DOC] Multiclass logistic regres...

2018-01-29 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/20332#discussion_r164654897
  
--- Diff: docs/ml-classification-regression.md ---
@@ -111,10 +110,9 @@ Continuing the earlier example:
 
[`LogisticRegressionTrainingSummary`](api/java/org/apache/spark/ml/classification/LogisticRegressionTrainingSummary.html)
 provides a summary for a
 
[`LogisticRegressionModel`](api/java/org/apache/spark/ml/classification/LogisticRegressionModel.html).
-Currently, only binary classification is supported and the
-summary must be explicitly cast to

-[`BinaryLogisticRegressionTrainingSummary`](api/java/org/apache/spark/ml/classification/BinaryLogisticRegressionTrainingSummary.html).
 
-Support for multiclass model summaries will be added in the future.
+In the case of binary classification, certain additional metrics are
--- End diff --

I'm ambivalent - I think it is fairly clear through the phrasing 
"additional metrics are available...", and in the API doc link provided.


---




[GitHub] spark issue #20378: [SPARK-11222][Build][Python] Python document style check...

2018-01-29 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/20378
  
I like this idea, too, but it seems there are too many violating files, so 
we can't enable this for now.
I'm wondering how we can encourage contributors to follow the style, 
ideally automatically. Should we keep a blacklist of the currently violating 
files and remove entries from it as their style is fixed, or something similar?
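
As a rough illustration of the blacklist idea, the Java sketch below selects which files a style check would still enforce. The class `StyleCheckTargets` and its method are hypothetical and not part of Spark's lint scripts; it assumes the blacklist is simply a set of file paths.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper: pick the files a style check should enforce,
// skipping files on a temporary list of known violators.
final class StyleCheckTargets {
    static List<String> filesToCheck(List<String> allFiles, Set<String> knownViolators) {
        List<String> targets = new ArrayList<>();
        for (String file : allFiles) {
            if (!knownViolators.contains(file)) {
                targets.add(file); // new and already-clean files are enforced
            }
        }
        return targets;
    }
}
```

Under this scheme, fixing a file and deleting its entry from the set is enough to bring it under enforcement, so the list can only shrink over time.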


---




[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-29 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20343
  
I checked all the queries again and found that some queries (q6, q11, 
q20, q22, q24, q34, q35, q47, q49, q57, q64, q72, q74, q75, q78, q98) only have 
minor changes (see the comments that point out the changes). So, how about 
directly applying these changes in `sql/core/src/test/resources/tpcds`?


---




[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20343
  
**[Test build #86800 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86800/testReport)**
 for PR 20343 at commit 
[`d04b087`](https://github.com/apache/spark/commit/d04b0872bcc02b5eadd309c560cda77ff1b8da0a).


---




[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/364/
Test PASSed.


---




[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20343
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20433
  
**[Test build #86799 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86799/testReport)**
 for PR 20433 at commit 
[`830cf8d`](https://github.com/apache/spark/commit/830cf8d014ae17ade5fd771ca98c8c846c93).


---




[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20433
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/363/
Test PASSed.


---




[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20433
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...

2018-01-29 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20343
  
I opened a new pr to support `[date] + 14 days`: 
https://github.com/apache/spark/pull/20433


---




[GitHub] spark pull request #20361: [SPARK-23188][SQL] Make vectorized columar reader...

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20361#discussion_r164650445
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -377,6 +377,12 @@ object SQLConf {
   .booleanConf
   .createWithDefault(true)
 
+  val PARQUET_VECTORIZED_READER_BATCH_SIZE = 
buildConf("spark.sql.parquet.batchSize")
--- End diff --

Still a question: is it possible to use the estimated memory size instead 
of the number of rows?
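
One way to picture the question is a batch size derived from a memory budget. The Java sketch below is only an illustration: `BatchSizeSketch`, its parameters, and the idea of a per-row size estimate are assumptions, and nothing in this thread says Spark would implement it this way.

```java
// Hypothetical sketch: derive a row-count batch size from a memory budget.
final class BatchSizeSketch {
    static int rowsForBudget(long memoryBudgetBytes, long estimatedRowBytes, int maxRows) {
        if (memoryBudgetBytes <= 0 || estimatedRowBytes <= 0 || maxRows <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        long rows = memoryBudgetBytes / estimatedRowBytes; // rows that fit in the budget
        // Always read at least one row, and never more than the configured cap.
        return (int) Math.max(1L, Math.min(rows, (long) maxRows));
    }
}
```

The trade-off is that a memory-based setting needs a row size estimate, which may be unreliable for variable-length columns, whereas a row count is simple but ignores how wide the rows are.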


---




[GitHub] spark pull request #20433: [SPARK-23264][SQL] Support interval values withou...

2018-01-29 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/20433

[SPARK-23264][SQL] Support interval values without INTERVAL clauses

## What changes were proposed in this pull request?
This PR updates the parsing rules in `SqlBase.g4` to support a SQL query like the one below:
```
SELECT CAST('2017-08-04' AS DATE) + 1 days;
```
The current master cannot parse it, although other DBMS-like systems (e.g., 
Hive and MySQL) support the syntax. The syntax is also frequently used in the 
official TPC-DS queries.
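
For readers unfamiliar with the syntax, the Java sketch below shows the date arithmetic the query is presumably meant to perform. It assumes that `+ 1 days` behaves like adding an explicit one-day interval, which the description does not spell out; `DatePlusDaysSketch` is a hypothetical name and uses `java.time`, not Spark.

```java
import java.time.LocalDate;

// Hypothetical stand-in for evaluating CAST(dateText AS DATE) + <days> days,
// assuming the new syntax means "add that many calendar days".
final class DatePlusDaysSketch {
    static LocalDate plusDays(String dateText, long days) {
        return LocalDate.parse(dateText).plusDays(days);
    }
}
```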

## How was this patch tested?
Added tests in `SQLQuerySuite`.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-23264

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20433.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20433


commit 830cf8d014ae17ade5fd771ca98c8c846c93
Author: Takeshi Yamamuro 
Date:   2018-01-30T06:15:35Z

Fix




---




[GitHub] spark pull request #20429: [SPARK-23157][SQL] Explain restriction on column ...

2018-01-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20429


---




[GitHub] spark issue #20431: [SPARK-23222][SQL] Make DataFrameRangeSuite not flaky

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20431
  
does this significantly increase the test runtime?


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20428
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/362/
Test PASSed.


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20428
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20428
  
LGTM


---




[GitHub] spark issue #20429: [SPARK-23157][SQL] Explain restriction on column express...

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20429
  
Thanks! Merged to master/2.3


---




[GitHub] spark issue #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer.commit ...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20386
  
CC @rdblue @zsxwing @jose-torres @sameeragarwal 


---




[GitHub] spark issue #20432: [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*....

2018-01-29 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/20432
  
cc @rekhajoshm @HyukjinKwon 


---




[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20424#discussion_r164649608
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: 
String, envVars: Map[String
 daemon = pb.start()
 
 val in = new DataInputStream(daemon.getInputStream)
-daemonPort = in.readInt()
+try {
+  daemonPort = in.readInt()
+} catch {
+  case exc: EOFException =>
+throw new IOException(s"No port number in $daemonModule's 
stdout")
+}
+
+// test that the returned port number is within a valid range.
+// note: this does not cover the case where the port number
+// is arbitrary data but is also coincidentally within range
+if (daemonPort < 1 || daemonPort > 0x) {
--- End diff --

I mean, I left my sign-off because what we do is basically move the _same_ 
check (`java.net.InetSocketAddress.checkPort`) earlier, and the other change 
simply wraps an exception, `EOFException`. I think we are safe here in theory.

I get your point about reserved ports, and now the condition has become 
narrower. I would need to check other things, such as which error was produced 
in this case before and whether the new error message is nicer. Also, this does 
not seem to completely address the concerns about it.

I was wondering whether all of this is worth doing. If you strongly prefer 
this, I won't object, but I may request a few more investigations.
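
The two changes under discussion can be sketched in standalone Java as follows. This is not the Spark code: `DaemonPortSketch` is a hypothetical class, the second error message is invented for illustration, and the upper bound is assumed to be `0xFFFF` (the largest 16-bit port number) because the constant in the quoted diff is truncated.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of reading and validating a port written to a daemon's stdout.
final class DaemonPortSketch {
    // Assumption: the largest valid port; the bound in the quoted diff is truncated.
    static final int MAX_PORT = 0xFFFF;

    static int readPort(InputStream daemonStdout, String daemonModule) throws IOException {
        DataInputStream in = new DataInputStream(daemonStdout);
        int port;
        try {
            port = in.readInt(); // expects 4 big-endian bytes
        } catch (EOFException e) {
            // Wrap the low-level EOF in a message that names the daemon module.
            throw new IOException("No port number in " + daemonModule + "'s stdout", e);
        }
        // Reject values that cannot be a usable port. Arbitrary bytes that happen
        // to fall inside the range are not detected, as the diff itself notes.
        if (port < 1 || port > MAX_PORT) {
            throw new IOException("Bad port number in " + daemonModule + "'s stdout: " + port);
        }
        return port;
    }
}
```

Failing early here mainly changes which error the user sees; a later attempt to connect would reject an out-of-range port anyway.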


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20428
  
**[Test build #86798 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86798/testReport)**
 for PR 20428 at commit 
[`7a71c5a`](https://github.com/apache/spark/commit/7a71c5a294da230faf19965dc1d068adc3678411).


---




[GitHub] spark issue #20432: [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*....

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20432
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20432: [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*....

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20432
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/361/
Test PASSed.


---




[GitHub] spark pull request #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer....

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20386#discussion_r164649253
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/EpochCoordinator.scala
 ---
@@ -148,7 +148,8 @@ private[continuous] class EpochCoordinator(
   logDebug(s"Epoch $epoch has received commits from all partitions. 
Committing globally.")
   // Sequencing is important here. We must commit to the writer before 
recording the commit
   // in the query, or we will end up dropping the commit if we restart 
in the middle.
-  writer.commit(epoch, thisEpochCommits.toArray)
+  thisEpochCommits.foreach(writer.add(_))
--- End diff --

is it possible to call `add` once the commit message arrives?


---




[GitHub] spark pull request #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer....

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20386#discussion_r164648934
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java
 ---
@@ -63,32 +65,30 @@
   DataWriterFactory createWriterFactory();
 
   /**
-   * Commits this writing job with a list of commit messages. The commit 
messages are collected from
-   * successful data writers and are produced by {@link 
DataWriter#commit()}.
+   * Handles a commit message produced by {@link DataWriter#commit()}.
--- End diff --

nit: `..., which is collected from a successful data writer in the executor 
side.`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20432: [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*....

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20432
  
**[Test build #86797 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86797/testReport)**
 for PR 20432 at commit 
[`3fb3d78`](https://github.com/apache/spark/commit/3fb3d785a9b2497b6ec3b9ac9329db776568197c).


---




[GitHub] spark pull request #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer....

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20386#discussion_r164648815
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java
 ---
@@ -63,32 +65,30 @@
   DataWriterFactory createWriterFactory();
 
   /**
-   * Commits this writing job with a list of commit messages. The commit 
messages are collected from
-   * successful data writers and are produced by {@link 
DataWriter#commit()}.
+   * Handles a commit message produced by {@link DataWriter#commit()}.
*
* If this method fails (by throwing an exception), this writing job is 
considered to to have been
-   * failed, and {@link #abort(WriterCommitMessage[])} would be called. 
The state of the destination
-   * is undefined and @{@link #abort(WriterCommitMessage[])} may not be 
able to deal with it.
+   * failed, and {@link #abort()} would be called. The state of the 
destination
+   * is undefined and @{@link #abort()} may not be able to deal with it.
--- End diff --

Add some more comments saying that implementations should probably cache 
the commit messages and do the final step in #commit.
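
A minimal Java sketch of that suggestion, under stated assumptions: `CommitMessage` and `BufferingWriter` are hypothetical stand-ins for illustration, not the actual `WriterCommitMessage` or `DataSourceV2Writer` types. `add` only records each message, and the final step happens once in `commit`.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferingWriterSketch {
    // Hypothetical stand-in for a commit message produced by one data writer.
    public interface CommitMessage {
        String path();
    }

    // Hypothetical job-level writer with the add/commit/abort shape under review.
    public static class BufferingWriter {
        private final List<CommitMessage> pending = new ArrayList<>();
        private final List<String> published = new ArrayList<>();

        // Called once per successful data writer; only caches the message.
        public void add(CommitMessage message) {
            pending.add(message);
        }

        // Final step: publish everything that was cached, then clear the cache.
        public void commit() {
            for (CommitMessage message : pending) {
                published.add(message.path());
            }
            pending.clear();
        }

        // Nothing has been published before commit, so dropping the cache is enough here.
        public void abort() {
            pending.clear();
        }

        public List<String> published() {
            return published;
        }
    }

    public static void main(String[] args) {
        BufferingWriter writer = new BufferingWriter();
        writer.add(() -> "part-0");
        writer.add(() -> "part-1");
        writer.commit();
        System.out.println(writer.published()); // prints [part-0, part-1]
    }
}
```

Keeping `add` free of side effects on the destination means an abort before `commit` has nothing to undo in this sketch; a real sink may still need cleanup of files written by the data writers.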


---




[GitHub] spark pull request #20432: [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycode...

2018-01-29 Thread ueshin
GitHub user ueshin opened a pull request:

https://github.com/apache/spark/pull/20432

[SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*.py to .gitignore 
file.

## What changes were proposed in this pull request?

This is a follow-up PR to #20338, which changed the name of the downloaded 
Python code style checker file. The new name is not listed in the .gitignore 
file, so the file remains untracked by git after running the checker.
This PR adds the file name to the .gitignore file.

## How was this patch tested?

Tested manually.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ueshin/apache-spark issues/SPARK-23174/fup1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20432


commit 3fb3d785a9b2497b6ec3b9ac9329db776568197c
Author: Takuya UESHIN 
Date:   2018-01-30T06:03:19Z

Add pycodestyle*.py to .gitignore file.




---




[GitHub] spark pull request #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer....

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20386#discussion_r164648645
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java
 ---
@@ -40,11 +40,13 @@
  *   1. Create a writer factory by {@link #createWriterFactory()}, 
serialize and send it to all the
  *  partitions of the input data(RDD).
  *   2. For each partition, create the data writer, and write the data of 
the partition with this
- *  writer. If all the data are written successfully, call {@link 
DataWriter#commit()}. If
- *  exception happens during the writing, call {@link 
DataWriter#abort()}.
- *   3. If all writers are successfully committed, call {@link 
#commit(WriterCommitMessage[])}. If
+ *  writer. If all the data are written successfully, call {@link 
DataWriter#commit()}.
+ *  On a writer being successfully committed, call {@link 
#add(WriterCommitMessage)} to
+ *  handle its commit message.
+ *  If exception happens during the writing, call {@link 
DataWriter#abort()}.
+ *   3. If all writers are successfully committed, call {@link #commit()}. 
If
--- End diff --

If all the data writers finish successfully, and #add is successfully 
called for all the commit messages, Spark will call #commit. If any of the data 
writers fails, or any of the #add calls fails, or the job fails for an 
unknown reason, Spark will call #abort.
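
The call sequence described above could be sketched roughly as follows. `JobWriter`, `RecordingWriter`, and `runJob` are hypothetical names for illustration only, and the tasks run sequentially here, whereas Spark runs data writers in parallel on executors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class CommitProtocolSketch {
    // Hypothetical stand-in for the job-level writer discussed in the diff.
    public interface JobWriter {
        void add(String commitMessage);
        void commit();
        void abort();
    }

    // Records the calls it receives so the sequence can be inspected.
    public static class RecordingWriter implements JobWriter {
        public final List<String> calls = new ArrayList<>();
        public void add(String commitMessage) { calls.add("add:" + commitMessage); }
        public void commit() { calls.add("commit"); }
        public void abort() { calls.add("abort"); }
    }

    // Each task either returns a commit message or throws.
    public static boolean runJob(List<Supplier<String>> tasks, JobWriter writer) {
        try {
            for (Supplier<String> task : tasks) {
                String message = task.get(); // a data writer finished successfully
                writer.add(message);         // its message is handled on the driver
            }
            writer.commit();                 // every writer and every add succeeded
            return true;
        } catch (RuntimeException e) {
            // Any failure (a writer, an add call, or commit itself) aborts the job.
            writer.abort();
            return false;
        }
    }

    public static void main(String[] args) {
        List<Supplier<String>> ok = new ArrayList<>();
        ok.add(() -> "m0");
        ok.add(() -> "m1");
        RecordingWriter first = new RecordingWriter();
        runJob(ok, first);
        System.out.println(first.calls); // prints [add:m0, add:m1, commit]

        List<Supplier<String>> failing = new ArrayList<>();
        failing.add(() -> "m0");
        failing.add(() -> { throw new IllegalStateException("writer failed"); });
        RecordingWriter second = new RecordingWriter();
        runJob(failing, second);
        System.out.println(second.calls); // prints [add:m0, abort]
    }
}
```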


---




[GitHub] spark issue #20400: [SPARK-23084][PYTHON]Add unboundedPreceding(), unbounded...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20400
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86795/
Test PASSed.


---




[GitHub] spark issue #20400: [SPARK-23084][PYTHON]Add unboundedPreceding(), unbounded...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20400
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20400: [SPARK-23084][PYTHON]Add unboundedPreceding(), unbounded...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20400
  
**[Test build #86795 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86795/testReport)**
 for PR 20400 at commit 
[`bbf8778`](https://github.com/apache/spark/commit/bbf8778a963a5e0b8de1b5ab1fddf4cafe13c180).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer....

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20386#discussion_r164648356
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java
 ---
@@ -40,11 +40,13 @@
  *   1. Create a writer factory by {@link #createWriterFactory()}, 
serialize and send it to all the
  *  partitions of the input data(RDD).
  *   2. For each partition, create the data writer, and write the data of 
the partition with this
- *  writer. If all the data are written successfully, call {@link 
DataWriter#commit()}. If
- *  exception happens during the writing, call {@link 
DataWriter#abort()}.
- *   3. If all writers are successfully committed, call {@link 
#commit(WriterCommitMessage[])}. If
+ *  writer. If all the data are written successfully, call {@link 
DataWriter#commit()}.
--- End diff --

If one data writer finishes successfully, the commit message will be sent 
back to the driver side and Spark will call #add.


---




[GitHub] spark issue #20431: [SPARK-23222][SQL] Make DataFrameRangeSuite not flaky

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20431
  
**[Test build #86796 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86796/testReport)**
 for PR 20431 at commit 
[`9a4a484`](https://github.com/apache/spark/commit/9a4a4842b3f8281e73e564f4dfdad92017630760).


---




[GitHub] spark issue #20431: [SPARK-23222][SQL] Make DataFrameRangeSuite not flaky

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20431
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/360/
Test PASSed.


---




[GitHub] spark issue #20431: [SPARK-23222][SQL] Make DataFrameRangeSuite not flaky

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20431
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20431: [SPARK-23222][SQL] Make DataFrameRangeSuite not flaky

2018-01-29 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20431
  
cc @vanzin @cloud-fan 


---




[GitHub] spark pull request #20431: [SPARK-23222][SQL] Make DataFrameRangeSuite not f...

2018-01-29 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/20431

[SPARK-23222][SQL] Make DataFrameRangeSuite not flaky

## What changes were proposed in this pull request?

It is reported that the test `Cancelling stage in a query with Range` in 
`DataFrameRangeSuite` has failed a few times in unrelated PRs. I have also seen 
it fail in my own PR.

The test is not very flaky; it only fails occasionally. Based on how the test 
works, I guess the failure happens because `range` finishes before the 
listener calls `cancelStage`.

I increase the range number from `10L` to `1000L` and count 
the range in one partition. I also reduce the `interval` of checking the stage id. 
Hopefully this makes the test no longer flaky.

## How was this patch tested?

The modified tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-23222

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20431.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20431


commit 9a4a4842b3f8281e73e564f4dfdad92017630760
Author: Liang-Chi Hsieh 
Date:   2018-01-30T05:49:01Z

Make DataFrameRangeSuite not flaky.




---




[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/20424#discussion_r164646512
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: 
String, envVars: Map[String
 daemon = pb.start()
 
 val in = new DataInputStream(daemon.getInputStream)
-daemonPort = in.readInt()
+try {
+  daemonPort = in.readInt()
+} catch {
+  case exc: EOFException =>
+throw new IOException(s"No port number in $daemonModule's 
stdout")
+}
+
+// test that the returned port number is within a valid range.
+// note: this does not cover the case where the port number
+// is arbitrary data but is also coincidentally within range
+if (daemonPort < 1 || daemonPort > 0x) {
--- End diff --

Sorry @HyukjinKwon , I didn't quite get the last point. Could you rephrase?


---




[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-29 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/13599#discussion_r164646157
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.python
+
+import java.io.File
+import java.util.{Map => JMap}
+import java.util.Arrays
+import java.util.concurrent.atomic.AtomicInteger
+
+import scala.collection.JavaConverters._
+
+import com.google.common.io.Files
+
+import org.apache.spark.SparkConf
+import org.apache.spark.internal.Logging
+
+
+private[spark] class VirtualEnvFactory(pythonExec: String, conf: 
SparkConf, isDriver: Boolean)
+  extends Logging {
+
+  private var virtualEnvType = conf.get("spark.pyspark.virtualenv.type", 
"native")
+  private var virtualEnvPath = 
conf.get("spark.pyspark.virtualenv.bin.path", "")
+  private var virtualEnvName: String = _
+  private var virtualPythonExec: String = _
+  private val VIRTUALENV_ID = new AtomicInteger()
+  private var isLauncher: Boolean = false
+
+  // used by launcher when user want to use virtualenv in pyspark shell. 
Launcher need this class
+  // to create virtualenv for driver.
+  def this(pythonExec: String, properties: JMap[String, String], isDriver: 
java.lang.Boolean) {
--- End diff --

It is used by the launcher module, which doesn't depend on Scala.


---




[GitHub] spark issue #20400: [SPARK-23084][PYTHON]Add unboundedPreceding(), unbounded...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20400
  
**[Test build #86795 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86795/testReport)**
 for PR 20400 at commit 
[`bbf8778`](https://github.com/apache/spark/commit/bbf8778a963a5e0b8de1b5ab1fddf4cafe13c180).


---




[GitHub] spark issue #20400: [SPARK-23084][PYTHON]Add unboundedPreceding(), unbounded...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20400
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20400: [SPARK-23084][PYTHON]Add unboundedPreceding(), unbounded...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20400
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/359/
Test PASSed.


---




[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20424#discussion_r164643325
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: 
String, envVars: Map[String
 daemon = pb.start()
 
 val in = new DataInputStream(daemon.getInputStream)
-daemonPort = in.readInt()
+try {
+  daemonPort = in.readInt()
+} catch {
+  case exc: EOFException =>
+throw new IOException(s"No port number in $daemonModule's 
stdout")
+}
+
+// test that the returned port number is within a valid range.
+// note: this does not cover the case where the port number
+// is arbitrary data but is also coincidentally within range
+if (daemonPort < 1 || daemonPort > 0x) {
--- End diff --

I am saying we can just safely move the same check and fail fast, which is 
simple and theoretically safe.


---




[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/20424#discussion_r164642503
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: 
String, envVars: Map[String
 daemon = pb.start()
 
 val in = new DataInputStream(daemon.getInputStream)
-daemonPort = in.readInt()
+try {
+  daemonPort = in.readInt()
+} catch {
+  case exc: EOFException =>
--- End diff --

I see, exc not used.


---




[GitHub] spark issue #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer.commit ...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86788/
Test PASSed.


---




[GitHub] spark issue #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer.commit ...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20386
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20386: [SPARK-23202][SQL] Break down DataSourceV2Writer.commit ...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20386
  
**[Test build #86788 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86788/testReport)**
 for PR 20386 at commit 
[`7a677fd`](https://github.com/apache/spark/commit/7a677fd63338cdfca4f1406ee9a5a7c45df42521).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20424
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86787/
Test PASSed.


---




[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20424
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20428
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86794/
Test FAILed.


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20428
  
**[Test build #86794 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86794/testReport)**
 for PR 20428 at commit 
[`9a4aada`](https://github.com/apache/spark/commit/9a4aada3aafc0fcb06f06a39ce996ec9751ae0ac).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20424
  
**[Test build #86787 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86787/testReport)**
 for PR 20424 at commit 
[`a1cb1a8`](https://github.com/apache/spark/commit/a1cb1a89d679840845142facf58f15e870f7c81d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20428
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20428
  
**[Test build #86794 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86794/testReport)**
 for PR 20428 at commit 
[`9a4aada`](https://github.com/apache/spark/commit/9a4aada3aafc0fcb06f06a39ce996ec9751ae0ac).


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20428
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/358/
Test PASSed.


---




[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/20424#discussion_r164641641
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: 
String, envVars: Map[String
 daemon = pb.start()
 
 val in = new DataInputStream(daemon.getInputStream)
-daemonPort = in.readInt()
+try {
+  daemonPort = in.readInt()
+} catch {
+  case exc: EOFException =>
+throw new IOException(s"No port number in $daemonModule's 
stdout")
+}
+
+// test that the returned port number is within a valid range.
+// note: this does not cover the case where the port number
+// is arbitrary data but is also coincidentally within range
+if (daemonPort < 1 || daemonPort > 0x) {
--- End diff --

Port 0 has special meaning. A program passes port 0 when it wants the 
system to choose an unused port on the program's behalf. So, the daemon should 
not return 0.

It's valid to pass port 0 to InetSocketAddress, since you might be asking 
for the system to assign a port for you.

However, following my own logic, the code in my pull request really should 
be checking for the range 49152-65535 (ephemeral range) instead of 1-65535, but 
I didn't have the nerve to make it that restrictive.
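
For illustration, the read-and-validate step discussed here might look like this in plain Java. The class and method names are hypothetical, the messages only mirror the ones in the diff, and the 49152-65535 check reflects the range named in this comment; the ephemeral range actually used varies by operating system.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DaemonPortSketch {
    // Reads a 4-byte big-endian int from the daemon's stdout and validates it as a port.
    public static int readPort(InputStream stdout) throws IOException {
        DataInputStream in = new DataInputStream(stdout);
        int port;
        try {
            port = in.readInt();
        } catch (EOFException e) {
            throw new IOException("No port number in daemon's stdout", e);
        }
        // Port 0 means "let the system choose", so a daemon should never report it.
        if (port < 1 || port > 65535) {
            throw new IOException(
                String.format("Bad port number in daemon's stdout: 0x%08x", port));
        }
        return port;
    }

    // The stricter check mentioned in the comment (assumed range, see lead-in).
    public static boolean isInEphemeralRange(int port) {
        return port >= 49152 && port <= 65535;
    }

    public static void main(String[] args) throws IOException {
        byte[] good = ByteBuffer.allocate(4).putInt(50123).array();
        System.out.println(readPort(new ByteArrayInputStream(good)));

        // Stray text printed before the port decodes to an out-of-range int.
        byte[] stray = "helo".getBytes(StandardCharsets.US_ASCII);
        try {
            readPort(new ByteArrayInputStream(stray));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

As the diff's own comment notes, a range check cannot catch arbitrary bytes that happen to decode to a value inside the accepted range.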


---




[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20428
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20378: [SPARK-11222][Build][Python] Python document style check...

2018-01-29 Thread rekhajoshm
Github user rekhajoshm commented on the issue:

https://github.com/apache/spark/pull/20378
  
Thanks @HyukjinKwon for your update.

@HyukjinKwon  @holdenk  @ueshin  @viirya  @icexelloss  @felixcheung  
@BryanCutler and @MrBago - While you are thinking on it, below is my analysis.

As I understand it, the JIRA "seems" to be calling out two things. 
Please validate.
1. Doctest strings must be correctly formatted. 
2. Doctests must NOT be included in the docs?

While working on it, I found that docstring style itself is not enforced at all, 
and that includes doctest style. 
Another aspect seems to be excluding doctests from the documentation (_build/html 
once generated).
I am not certain of the reasoning behind this exclusion, or whether it is 
indeed what SPARK-11222 additionally intends. The JIRA subject and 
description say two different things, so could you validate that understanding?

Meanwhile I tested different configurations of the 
epytext/sphinx extensions to see whether they can suppress doctests in the 
docs.
I played with the RULES set in epytext.py. As per the epytext manual, 
http://epydoc.sourceforge.net/manual-epytext.html, it ensures only correctly 
formatted doctests are rendered, which means doctests are left out of the docs 
only if they are incorrectly formatted. That did not seem right, and even 
seemed contrary to the goal.

So I played around with sphinx extensions -
'sphinx.ext.doctest',
'sphinx.ext.napoleon'
 and in conf.py -
# Napoleon settings
napoleon_google_docstring = True
# Doctest settings
doctest_test_doctest_blocks=''
trim_doctest_flags=True

None of the options I tried suppresses doctests in the docs (_build/html) 
once the build is done.

Thanks for thinking this over.



---




[GitHub] spark issue #20427: [SPARK-23260][SPARK-23262][SQL] several data source v2 n...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20427
  
**[Test build #86793 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86793/testReport)**
 for PR 20427 at commit 
[`b4fdbbe`](https://github.com/apache/spark/commit/b4fdbbe265943012093fbc0f54e8b22184fa2987).


---




[GitHub] spark issue #20427: [SPARK-23260][SPARK-23262][SQL] several data source v2 n...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20427
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20427: [SPARK-23260][SPARK-23262][SQL] several data source v2 n...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20427
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/357/
Test PASSed.


---




[GitHub] spark issue #20427: [SPARK-23260][SPARK-23262][SQL] several data source v2 n...

2018-01-29 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20427
  
Retest this please


---




[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/20424#discussion_r164640285
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: 
String, envVars: Map[String
 daemon = pb.start()
 
 val in = new DataInputStream(daemon.getInputStream)
-daemonPort = in.readInt()
+try {
+  daemonPort = in.readInt()
+} catch {
+  case exc: EOFException =>
+throw new IOException(s"No port number in $daemonModule's 
stdout")
+}
+
+// test that the returned port number is within a valid range.
+// note: this does not cover the case where the port number
+// is arbitrary data but is also coincidentally within range
+if (daemonPort < 1 || daemonPort > 0x) {
+  throw new IOException(s"Bad port number in $daemonModule's 
stdout: " +
+f"0x$daemonPort%08x")
--- End diff --

Yes, that makes sense. I might not be able to make it as clear as the exact 
path to the file, since the PythonWorkerFactory sets a PYTHONPATH environment 
variable and then lets Python itself figure out where on those paths the module 
actually lives. But I could tell the user how the PYTHONPATH was set up (in a 
generic sense, without using any shell's syntax) and then how the python 
command was subsequently run.
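
One possible shape for such a message, sketched in Java under stated assumptions: `describeLaunch` is a hypothetical helper, and the command and path values in `main` are placeholders rather than what `PythonWorkerFactory` actually builds. The PYTHONPATH value and the command are reported separately, so no shell syntax is implied.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DaemonErrorMessageSketch {
    // Builds a diagnostic that tells the user how the daemon was launched.
    public static String describeLaunch(
            List<String> command, Map<String, String> env, int badValue) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(
            "Bad data in daemon's stdout: read 0x%08x where a port was expected.\n", badValue));
        // Report the environment and the command separately, without shell syntax.
        sb.append("PYTHONPATH was: ")
          .append(env.getOrDefault("PYTHONPATH", "(not set)"))
          .append("\n");
        sb.append("Command was: ").append(String.join(" ", command));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        Map<String, String> env = new LinkedHashMap<>();
        env.put("PYTHONPATH", "/path/to/pyspark:/path/to/py4j");
        List<String> command = Arrays.asList("/path/to/python", "-m", "pyspark.daemon");
        System.out.println(describeLaunch(command, env, 0x68656c6f));
    }
}
```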


---




[GitHub] spark issue #20427: [SPARK-23260][SPARK-23262][SQL] several data source v2 n...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20427
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86789/
Test FAILed.


---




[GitHub] spark issue #20427: [SPARK-23260][SPARK-23262][SQL] several data source v2 n...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20427
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20427: [SPARK-23260][SPARK-23262][SQL] several data source v2 n...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20427
  
**[Test build #86789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86789/testReport)** for PR 20427 at commit [`b4fdbbe`](https://github.com/apache/spark/commit/b4fdbbe265943012093fbc0f54e8b22184fa2987).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20424#discussion_r164637553
  
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
@@ -191,7 +191,20 @@ private[spark] class PythonWorkerFactory(pythonExec: String, envVars: Map[String
 daemon = pb.start()
 
 val in = new DataInputStream(daemon.getInputStream)
-daemonPort = in.readInt()
+try {
+  daemonPort = in.readInt()
+} catch {
+  case exc: EOFException =>
+throw new IOException(s"No port number in $daemonModule's stdout")
+}
+
+// test that the returned port number is within a valid range.
+// note: this does not cover the case where the port number
+// is arbitrary data but is also coincidentally within range
+if (daemonPort < 1 || daemonPort > 0xffff) {
+  throw new IOException(s"Bad port number in $daemonModule's stdout: " +
+f"0x$daemonPort%08x")
--- End diff --

Just a thought:

This error message won't mean much to the typical user. Would it be
sensible to tell the user exactly what python command to run themselves to
figure out the problem? E.g., "unexpected stdout from
/foo/bar/some/path/to/python -m /path/to/daemon.py". That's what would help
with the sitecustomize.py case. Or is that not useful in general?
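
To make the suggestion concrete, here is a minimal sketch of the same read-and-validate logic with the command included in the error text. It is written in Python for brevity, not in the Scala under review; `read_daemon_port` and the message wording are hypothetical, and the stream is assumed to carry a 4-byte big-endian signed integer, which is what `DataInputStream.readInt` reads.

```python
import io
import struct


def read_daemon_port(stream, command):
    """Read a 4-byte big-endian port number and check that it is in range."""
    raw = stream.read(4)
    if len(raw) < 4:
        # Corresponds to the EOFException branch in the diff.
        raise IOError("No port number in stdout of: " + " ".join(command))
    (port,) = struct.unpack(">i", raw)  # big-endian, signed 32-bit
    if port < 1 or port > 0xFFFF:
        # Mask to 32 bits so negative values still print as 8 hex digits.
        raise IOError(
            "Bad port number 0x%08x in stdout of: %s"
            % (port & 0xFFFFFFFF, " ".join(command))
        )
    return port
```

If something printed text before the port was written, the first four bytes of that text are interpreted as the port, and the hex dump in the message gives the user a hint about what was actually read.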


---




[GitHub] spark issue #20378: [SPARK-11222][Build][Python] Python document style check...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20378
  
Hey @holdenk, @ueshin, @viirya, @icexelloss, @felixcheung, @BryanCutler and
@MrBago. What do you guys think about checking docstrings, and about the list
above? I think this could prevent nitpicking, and the idea itself seems good.

One vague concern is that it might make backporting super hard.


---




[GitHub] spark pull request #20422: [SPARK-23253][Core][Shuffle]Only write shuffle te...

2018-01-29 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20422#discussion_r164635886
  
--- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ---
@@ -166,8 +153,20 @@ private[spark] class IndexShuffleBlockResolver(
   if (dataTmp != null && dataTmp.exists()) {
 dataTmp.delete()
   }
-  indexTmp.delete()
 } else {
+  val out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(indexTmp)))
--- End diff --

Move this below the comment "This is the first successful attempt".

I'd also include a comment about why we write to a temporary file, even
though we're always going to rename (because in case the task dies somehow,
we'd prefer to not leave a half-written index file in the final location).
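
For readers unfamiliar with the pattern, a minimal Python sketch of "write to a temporary file, then rename" follows. It is not the Scala code from the PR: the function name `write_index_atomically` is hypothetical, and the file layout (cumulative offsets as big-endian 64-bit integers) is an assumption made only for illustration.

```python
import os
import struct
import tempfile


def write_index_atomically(final_path, lengths):
    """Write cumulative offsets to a temp file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as out:
            offset = 0
            out.write(struct.pack(">q", offset))
            for length in lengths:
                offset += length
                out.write(struct.pack(">q", offset))
        # Readers see either no file or the complete one at final_path.
        os.replace(tmp_path, final_path)
    except BaseException:
        # If writing or renaming failed, do not leave the partial file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies mid-write, only the temporary file is incomplete and the final location is never left holding a half-written index.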


---




[GitHub] spark issue #20378: [SPARK-11222][Build][Python] Python document style check...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20378
  
So, seems we got:

```
First line should end with a period. 293
Multiline docstring should end with 1 blank line. 279
Blank line missing after one-line summary. 265
Return value type should be mentioned. 141
All modules should have docstrings. 109
One-liner docstrings should fit on one line with quotes. 91
First line should be in imperative mood ('Do', not 'Does'). 87
Exported definitions should have docstrings. 61
Class docstring should have 1 blank line around them. 35
Use r""" if any backslashes in your docstrings. 19
The entire docstring should be indented same as code. 6
Exported classes should have docstrings. 1
No blank line before docstring in definitions. 1
```

I think we can take in:

```
First line should end with a period. 293
Multiline docstring should end with 1 blank line. 279
Blank line missing after one-line summary. 265
The entire docstring should be indented same as code. 6
Use """triple double quotes""". 3   # this seems only in heapq3.py where we ignore pep8.
No blank line before docstring in definitions. 1
```

Not sure on:

```
Exported definitions should have docstrings. 61
Exported classes should have docstrings. 1
```

and take out

```
Return value type should be mentioned. 141
All modules should have docstrings. 109
One-liner docstrings should fit on one line with quotes. 91
First line should be in imperative mood ('Do', not 'Does'). 87
Class docstring should have 1 blank line around them. 35
Use r""" if any backslashes in your docstrings. 19
```

Also, I think we can take out cloudpickle.py, heapq3.py, shared.py,
python/docs/conf.py, work/*/*.py, and python/.eggs/*, as we do in pep8.
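
To illustrate what two of the "take in" checks amount to, here is a small self-contained sketch using only the standard library `ast` module. It is not the checker proposed in the PR; `check_docstrings` is a hypothetical helper, and it covers only "First line should end with a period." and "Blank line missing after one-line summary."

```python
import ast


def check_docstrings(source):
    """Return (name, message) pairs for two of the docstring checks."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        # Only modules, classes, and functions are checked in this sketch.
        if not isinstance(node, (ast.Module, ast.ClassDef, ast.FunctionDef)):
            continue
        doc = ast.get_docstring(node, clean=True)
        if not doc:
            continue  # missing docstrings are a separate check
        name = getattr(node, "name", "<module>")
        lines = doc.splitlines()
        if not lines[0].rstrip().endswith("."):
            problems.append((name, "First line should end with a period."))
        if len(lines) > 1 and lines[1].strip():
            problems.append((name, "Blank line missing after one-line summary."))
    return problems
```

Both checks are mechanical and need no judgment about content, which is part of why they seem like safe ones to enable first.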


---



