[GitHub] spark pull request #21109: [SPARK-24020][SQL] Sort-merge join inner range op...

2018-06-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/21109#discussion_r193743605 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1205,6 +1205,19 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #21109: [SPARK-24020][SQL] Sort-merge join inner range op...

2018-06-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/21109#discussion_r193736438 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala --- @@ -70,27 +70,41 @@ class InnerJoinSuite extends

[GitHub] spark pull request #21109: [SPARK-24020][SQL] Sort-merge join inner range op...

2018-06-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/21109#discussion_r193737191 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/InMemoryUnsafeRowQueue.scala --- @@ -0,0 +1,183 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #21109: [SPARK-24020][SQL] Sort-merge join inner range op...

2018-06-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/21109#discussion_r193735681 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -131,13 +135,100 @@ object ExtractEquiJoinKeys extends

[GitHub] spark pull request #21109: [SPARK-24020][SQL] Sort-merge join inner range op...

2018-06-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/21109#discussion_r193733146 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -131,13 +135,100 @@ object ExtractEquiJoinKeys extends

[GitHub] spark pull request #21109: [SPARK-24020][SQL] Sort-merge join inner range op...

2018-06-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/21109#discussion_r193734550 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -131,13 +135,100 @@ object ExtractEquiJoinKeys extends

[GitHub] spark pull request #21109: [SPARK-24020][SQL] Sort-merge join inner range op...

2018-06-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/21109#discussion_r193735061 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -131,13 +135,100 @@ object ExtractEquiJoinKeys extends

[GitHub] spark pull request #21109: [SPARK-24020][SQL] Sort-merge join inner range op...

2018-06-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/21109#discussion_r193736960 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala --- @@ -117,101 +131,170 @@ class InnerJoinSuite extends

[GitHub] spark pull request #21502: [SPARK-22575][SQL] Add destroy to Dataset

2018-06-07 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21502#discussion_r193742604 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala --- @@ -152,6 +152,26 @@ class BroadcastJoinSuite extends

[GitHub] spark pull request #21477: [WIP] [SPARK-24396] [SS] [PYSPARK] Add Structured...

2018-06-07 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21477#discussion_r193740695 --- Diff: python/pyspark/sql/streaming.py --- @@ -843,6 +844,169 @@ def trigger(self, processingTime=None, once=None, continuous=None):

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13599 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13599 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91519/ Test PASSed. ---

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13599 **[Test build #91519 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91519/testReport)** for PR 13599 at commit

[GitHub] spark issue #21467: [SPARK-23754][PYTHON][FOLLOWUP] Move UDF stop iteration ...

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21467 **[Test build #91521 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91521/testReport)** for PR 21467 at commit

[GitHub] spark issue #21467: [SPARK-23754][PYTHON][FOLLOWUP] Move UDF stop iteration ...

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21467 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21467: [SPARK-23754][PYTHON][FOLLOWUP] Move UDF stop iteration ...

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21467 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91521/ Test PASSed. ---

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21469 **[Test build #91523 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91523/testReport)** for PR 21469 at commit

[GitHub] spark issue #18900: [SPARK-21687][SQL] Spark SQL should set createTime for H...

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18900 **[Test build #91522 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91522/testReport)** for PR 18900 at commit

[GitHub] spark pull request #18900: [SPARK-21687][SQL] Spark SQL should set createTim...

2018-06-07 Thread debugger87
Github user debugger87 commented on a diff in the pull request: https://github.com/apache/spark/pull/18900#discussion_r193730957 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -1019,6 +1021,8 @@ private[hive] object HiveClientImpl {

[GitHub] spark issue #21467: [SPARK-23754][PYTHON][FOLLOWUP] Move UDF stop iteration ...

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21467 **[Test build #91521 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91521/testReport)** for PR 21467 at commit

[GitHub] spark pull request #21502: [SPARK-22575][SQL] Add destroy to Dataset

2018-06-07 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21502#discussion_r193724774 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala --- @@ -152,6 +152,26 @@ class BroadcastJoinSuite

[GitHub] spark issue #21499: [SPARK-24468][SQL] Handle negative scale when adjusting ...

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21499 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21499: [SPARK-24468][SQL] Handle negative scale when adjusting ...

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21499 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3831/

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193696451 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,23 @@ object DateTimeUtils {

[GitHub] spark issue #21499: [SPARK-24468][SQL] Handle negative scale when adjusting ...

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21499 **[Test build #91520 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91520/testReport)** for PR 21499 at commit

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread ssonker
Github user ssonker commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193694372 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,23 @@ object DateTimeUtils {

[GitHub] spark pull request #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP P...

2018-06-07 Thread DazhuangSu
Github user DazhuangSu commented on a diff in the pull request: https://github.com/apache/spark/pull/19691#discussion_r193691275 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -510,40 +511,86 @@ case class

[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-06-07 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21505 We would appreciate it if you could post the performance numbers before and after this PR. It would be good to use the `Benchmark` class. --- -

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193688565 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,23 @@ object DateTimeUtils {

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193687670 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -114,20 +114,19 @@ object DateTimeUtils { }

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193687346 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -114,20 +114,19 @@ object DateTimeUtils { }

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread ssonker
Github user ssonker commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193686772 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,24 @@ object DateTimeUtils {

[GitHub] spark pull request #18900: [SPARK-21687][SQL] Spark SQL should set createTim...

2018-06-07 Thread cxzl25
Github user cxzl25 commented on a diff in the pull request: https://github.com/apache/spark/pull/18900#discussion_r193685282 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -1019,6 +1021,8 @@ private[hive] object HiveClientImpl {

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread ssonker
Github user ssonker commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193679978 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,24 @@ object DateTimeUtils {

[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-06-07 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20636 cc @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-07 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21061 Let me think about the implementation to keep the order. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark pull request #21477: [WIP] [SPARK-24396] [SS] [PYSPARK] Add Structured...

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21477#discussion_r193678459 --- Diff: python/pyspark/sql/streaming.py --- @@ -843,6 +844,169 @@ def trigger(self, processingTime=None, once=None, continuous=None):

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193678440 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,24 @@ object DateTimeUtils {

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread ssonker
Github user ssonker commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193676953 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,24 @@ object DateTimeUtils {

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193676413 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,24 @@ object DateTimeUtils {

[GitHub] spark pull request #21499: [SPARK-24468][SQL] Handle negative scale when adj...

2018-06-07 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21499#discussion_r193675689 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala --- @@ -161,13 +161,17 @@ object DecimalType extends AbstractDataType {

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread ssonker
Github user ssonker commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193675674 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,24 @@ object DateTimeUtils {

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread ssonker
Github user ssonker commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193675439 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,24 @@ object DateTimeUtils {

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r193674158 --- Diff: python/pyspark/context.py --- @@ -1035,6 +1044,41 @@ def getConf(self): conf.setAll(self._conf.getAll()) return conf

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r193673500 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,115 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r193674619 --- Diff: python/pyspark/context.py --- @@ -1035,6 +1044,41 @@ def getConf(self): conf.setAll(self._conf.getAll()) return conf

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r193672797 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,115 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21505#discussion_r193674578 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -111,6 +113,24 @@ object DateTimeUtils {

[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21505 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21505 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #21505: [SPARK-24457][SQL] Improving performance of strin...

2018-06-07 Thread ssonker
GitHub user ssonker opened a pull request: https://github.com/apache/spark/pull/21505 [SPARK-24457][SQL] Improving performance of stringToTimestamp by caching Calendar instances for input timezones instead of creating new every time ## What changes were proposed in

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13599 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3830/

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13599 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13599 **[Test build #91519 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91519/testReport)** for PR 13599 at commit

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-06-07 Thread zjffdu
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r193664416 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,115 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to For

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r193659778 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,115 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193649314 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -359,17 +368,42 @@ private[spark] class TaskSchedulerImpl(

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193648185 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1310,6 +1311,44 @@ class DAGScheduler( } }

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193647168 --- Diff: core/src/main/scala/org/apache/spark/barrier/BarrierCoordinator.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193658009 --- Diff: core/src/main/scala/org/apache/spark/barrier/BarrierCoordinator.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193640783 --- Diff: core/src/main/scala/org/apache/spark/barrier/BarrierTaskContext.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193644506 --- Diff: core/src/main/scala/org/apache/spark/barrier/BarrierCoordinator.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13599 @JoshRosen, I heard that you took a look at this before. Do you have any concerns that should be addressed? --- - To

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13599 @holdenk and @zjffdu, I believe manual tests are a-okay if it's difficult to write a test. We can manually test and expose this as an experimental feature too. BTW, I believe we can

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/13599 Thanks for the interest in this PR and the info about `Pipfiles`. I think we could support that after this PR gets merged so that we can provide users more options for virtualenv based on their

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-06-07 Thread kokes
Github user kokes commented on the issue: https://github.com/apache/spark/pull/13599 Hi, thanks for all the work on this! I see requirements.txt mentioned here and there and, browsing this and other JIRAs, it seems to be the proposed way to specify dependencies in PySpark. As you

[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21482 **[Test build #91518 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91518/testReport)** for PR 21482 at commit

[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21482 **[Test build #91517 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91517/testReport)** for PR 21482 at commit

[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21482 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21482 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21482 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91518/ Test FAILed. ---

[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf

2018-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21482 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91517/ Test FAILed. ---

[GitHub] spark pull request #21092: [SPARK-23984][K8S] Initial Python Bindings for Py...

2018-06-07 Thread kokes
Github user kokes commented on a diff in the pull request: https://github.com/apache/spark/pull/21092#discussion_r193639554 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala --- @@ -154,6 +176,24 @@ private[spark] object Config

[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...

2018-06-07 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20929 yea, thanks for the comments! I'll try to fix based on the comments. --- - To unsubscribe, e-mail:

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread galv
Github user galv commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193290266 --- Diff: python/pyspark/worker.py --- @@ -232,6 +236,13 @@ def main(infile, outfile): shuffle.DiskBytesSpilled = 0

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread galv
Github user galv commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193269255 --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala --- @@ -627,6 +627,52 @@ class SparkContextSuite extends SparkFunSuite with

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread galv
Github user galv commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193289530 --- Diff: python/pyspark/worker.py --- @@ -232,6 +236,13 @@ def main(infile, outfile): shuffle.DiskBytesSpilled = 0

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread galv
Github user galv commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193291076 --- Diff: python/pyspark/worker.py --- @@ -232,6 +236,13 @@ def main(infile, outfile): shuffle.DiskBytesSpilled = 0

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread galv
Github user galv commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193555968 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -123,6 +124,21 @@ private[spark] class TaskSetManager( // TODO: We

[GitHub] spark pull request #21494: [WIP][SPARK-24375][Prototype] Support barrier sch...

2018-06-07 Thread galv
Github user galv commented on a diff in the pull request: https://github.com/apache/spark/pull/21494#discussion_r193269297 --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala --- @@ -627,6 +627,52 @@ class SparkContextSuite extends SparkFunSuite with

[GitHub] spark issue #21500: Scalable Memory option for HDFSBackedStateStore

2018-06-07 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21500 Retaining versions of state is also relevant to snapshotting the last version in files: HDFSBackedStateStoreProvider doesn't snapshot if the version doesn't exist in loadedMaps. So we may

[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...

2018-06-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21501#discussion_r193635361 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala --- @@ -84,7 +86,36 @@ class StopWordsRemover @Since("1.5.0")

[GitHub] spark pull request #21477: [WIP] [SPARK-24396] [SS] [PYSPARK] Add Structured...

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21477#discussion_r193634436 --- Diff: python/pyspark/sql/streaming.py --- @@ -843,6 +844,169 @@ def trigger(self, processingTime=None, once=None, continuous=None):

[GitHub] spark pull request #21477: [WIP] [SPARK-24396] [SS] [PYSPARK] Add Structured...

2018-06-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21477#discussion_r193633540 --- Diff: python/pyspark/sql/streaming.py --- @@ -843,6 +844,169 @@ def trigger(self, processingTime=None, once=None, continuous=None):
