[GitHub] spark issue #19557: [SPARK-22281][SPARKR] Handle R method breaking signature...

2017-10-25 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/19557 LGTM. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19557: [SPARK-22281][SPARKR][WIP] Handle R method breaking sign...

2017-10-25 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19557 Tested on Windows, r-hub/r-devel.

[GitHub] spark issue #19571: [SPARK-15474][SQL] Write and read back non-emtpy schema ...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19571 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83072/ Test PASSed. ---

[GitHub] spark issue #19571: [SPARK-15474][SQL] Write and read back non-emtpy schema ...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19571 Merged build finished. Test PASSed.

[GitHub] spark issue #19571: [SPARK-15474][SQL] Write and read back non-emtpy schema ...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19571 **[Test build #83072 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83072/testReport)** for PR 19571 at commit

[GitHub] spark issue #19556: [SPARK-22328][Core] ClosureCleaner should not miss refer...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19556 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83069/ Test PASSed. ---

[GitHub] spark issue #19556: [SPARK-22328][Core] ClosureCleaner should not miss refer...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19556 Merged build finished. Test PASSed.

[GitHub] spark issue #19556: [SPARK-22328][Core] ClosureCleaner should not miss refer...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19556 **[Test build #83069 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83069/testReport)** for PR 19556 at commit

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19077 Merged build finished. Test PASSed.

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19077 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83068/ Test PASSed. ---

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19077 **[Test build #83068 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83068/testReport)** for PR 19077 at commit

[GitHub] spark issue #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluation for...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19122 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83075/ Test PASSed. ---

[GitHub] spark issue #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluation for...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19122 Merged build finished. Test PASSed.

[GitHub] spark issue #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluation for...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19122 **[Test build #83075 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83075/testReport)** for PR 19122 at commit

[GitHub] spark issue #11205: [SPARK-11334][Core] Handle maximum task failure situatio...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11205 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83067/ Test PASSed. ---

[GitHub] spark issue #11205: [SPARK-11334][Core] Handle maximum task failure situatio...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11205 Merged build finished. Test PASSed.

[GitHub] spark issue #11205: [SPARK-11334][Core] Handle maximum task failure situatio...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11205 **[Test build #83067 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83067/testReport)** for PR 11205 at commit

[GitHub] spark issue #19565: [SPARK-22111][MLLIB] OnlineLDAOptimizer should filter ou...

2017-10-25 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19565 @akopich If you want to cache the input dataset, create a JIRA to discuss it first. It's another issue, I think. This JIRA is also related to input caching issues:

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19559 **[Test build #83076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83076/testReport)** for PR 19559 at commit

[GitHub] spark pull request #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, ...

2017-10-25 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/19559#discussion_r147041980 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -139,6 +139,7 @@ class Analyzer(

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19559 Merged build finished. Test FAILed.

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19559 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83071/ Test FAILed. ---

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19559 **[Test build #83071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83071/testReport)** for PR 19559 at commit

[GitHub] spark issue #19529: [SPARK-22308] Support alternative unit testing styles in...

2017-10-25 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19529 LGTM pending Jenkins

[GitHub] spark issue #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluation for...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19122 **[Test build #83075 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83075/testReport)** for PR 19122 at commit

[GitHub] spark issue #19529: [SPARK-22308] Support alternative unit testing styles in...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19529 **[Test build #83074 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83074/testReport)** for PR 19529 at commit

[GitHub] spark pull request #19529: [SPARK-22308] Support alternative unit testing st...

2017-10-25 Thread nkronenfeld
Github user nkronenfeld commented on a diff in the pull request: https://github.com/apache/spark/pull/19529#discussion_r147038647 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SharedSQLContext.scala --- @@ -17,86 +17,8 @@ package org.apache.spark.sql.test

[GitHub] spark pull request #19529: [SPARK-22308] Support alternative unit testing st...

2017-10-25 Thread nkronenfeld
Github user nkronenfeld commented on a diff in the pull request: https://github.com/apache/spark/pull/19529#discussion_r147038487 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/PlanTest.scala --- @@ -29,7 +31,14 @@ import

[GitHub] spark pull request #19529: [SPARK-22308] Support alternative unit testing st...

2017-10-25 Thread nkronenfeld
Github user nkronenfeld commented on a diff in the pull request: https://github.com/apache/spark/pull/19529#discussion_r147038504 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SharedSQLContext.scala --- @@ -17,86 +17,8 @@ package org.apache.spark.sql.test

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-25 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19433 > We'll actually only have to run an O(n log n) sort on continuous feature values once (i.e. in the FeatureVector constructor), since once the continuous features are sorted we can update them

[GitHub] spark issue #19579: [SPARK-22356][SQL] data source table should support over...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19579 **[Test build #83073 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83073/testReport)** for PR 19579 at commit

[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19471 Merged build finished. Test PASSed.

[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19471 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83066/ Test PASSed. ---

[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19471 closing in favor of https://github.com/apache/spark/pull/19579

[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19471 **[Test build #83066 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83066/testReport)** for PR 19471 at commit

[GitHub] spark pull request #19471: [SPARK-22245][SQL] partitioned data set should al...

2017-10-25 Thread cloud-fan
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/19471

[GitHub] spark issue #19579: [SPARK-22356][SQL] data source table should support over...

2017-10-25 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19579 cc @gatorsmile

[GitHub] spark pull request #19579: [SPARK-22356][SQL] data source table should suppo...

2017-10-25 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19579 [SPARK-22356][SQL] data source table should support overlapped columns between data and partition schema ## What changes were proposed in this pull request? This is a regression

[GitHub] spark pull request #19577: [SPARK-22355][SQL] Dataset.collect is not threads...

2017-10-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19577#discussion_r147037025 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3102,7 +3103,12 @@ class Dataset[T] private[sql]( * Collect all elements

[GitHub] spark pull request #19433: [SPARK-3162] [MLlib] Add local tree training for ...

2017-10-25 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19433#discussion_r147036693 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/LocalDecisionTree.scala --- @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18544: [SPARK-21318][SQL]Improve exception message thrown by `l...

2017-10-25 Thread stanzhai
Github user stanzhai commented on the issue: https://github.com/apache/spark/pull/18544 Hi @gatorsmile, I've added some test cases, and they pass on my machine.

[GitHub] spark pull request #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, ...

2017-10-25 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19559#discussion_r147035765 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -139,6 +139,7 @@ class Analyzer(

[GitHub] spark pull request #19577: [SPARK-22355][SQL] Dataset.collect is not threads...

2017-10-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19577#discussion_r147035745 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3102,7 +3103,12 @@ class Dataset[T] private[sql]( * Collect all

[GitHub] spark issue #19527: [SPARK-13030][ML] Create OneHotEncoderEstimator for OneH...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19527 Merged build finished. Test PASSed.

[GitHub] spark issue #19527: [SPARK-13030][ML] Create OneHotEncoderEstimator for OneH...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19527 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83070/ Test PASSed. ---

[GitHub] spark issue #19527: [SPARK-13030][ML] Create OneHotEncoderEstimator for OneH...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19527 **[Test build #83070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83070/testReport)** for PR 19527 at commit

[GitHub] spark issue #19577: [SPARK-22355][SQL] Dataset.collect is not threadsafe

2017-10-25 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19577 Nice catch! LGTM with two minor comments.

[GitHub] spark issue #19388: [SPARK-22162] Executors and the driver should use consis...

2017-10-25 Thread rezasafi
Github user rezasafi commented on the issue: https://github.com/apache/spark/pull/19388 Sorry for the delay. It seems that to be able to commit the same RDD in different stages, we need to use stageId. So the jobId and other configurations in the write method of SparkHadoopWriter

[GitHub] spark issue #19571: [SPARK-15474][SQL] Write and read back non-emtpy schema ...

2017-10-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19571 Thank you for the review, @gatorsmile and @cloud-fan. In particular, @cloud-fan's opinion matches my original approach in #17980 and #18953 (before Aug 16). I couldn't agree more. >

[GitHub] spark issue #19578: [SPARK-21983][SQL] Fix Antlr 4.7 deprecation warnings

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19578 Merged build finished. Test PASSed.

[GitHub] spark issue #19578: [SPARK-21983][SQL] Fix Antlr 4.7 deprecation warnings

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19578 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83065/ Test PASSed. ---

[GitHub] spark issue #19571: [SPARK-15474][SQL] Write and read back non-emtpy schema ...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19571 **[Test build #83072 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83072/testReport)** for PR 19571 at commit

[GitHub] spark pull request #19577: [SPARK-22355][SQL] Dataset.collect is not threads...

2017-10-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19577#discussion_r147032429 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2661,7 +2657,12 @@ class Dataset[T] private[sql]( */ def

[GitHub] spark issue #19578: [SPARK-21983][SQL] Fix Antlr 4.7 deprecation warnings

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19578 **[Test build #83065 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83065/testReport)** for PR 19578 at commit

[GitHub] spark pull request #19577: [SPARK-22355][SQL] Dataset.collect is not threads...

2017-10-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19577#discussion_r147032280 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3102,7 +3103,12 @@ class Dataset[T] private[sql]( * Collect all elements

[GitHub] spark issue #19534: [SPARK-22312][CORE] Fix bug in Executor allocation manag...

2017-10-25 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19534 @sitalkedia would you please reopen this PR? I think the second issue I fixed before is no longer valid, and for the first issue my fix is no different from the one here.

[GitHub] spark pull request #11205: [SPARK-11334][Core] Handle maximum task failure s...

2017-10-25 Thread jerryshao
Github user jerryshao closed the pull request at: https://github.com/apache/spark/pull/11205

[GitHub] spark issue #11205: [SPARK-11334][Core] Handle maximum task failure situatio...

2017-10-25 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/11205 Verified again; it looks like the 2nd bullet is not valid anymore. I cannot reproduce it on the latest master branch; this might have already been fixed in SPARK-13054. So only first issue

[GitHub] spark issue #19577: [SPARK-22355][SQL] Dataset.collect is not threadsafe

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19577 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83063/ Test PASSed. ---

[GitHub] spark issue #19577: [SPARK-22355][SQL] Dataset.collect is not threadsafe

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19577 Merged build finished. Test PASSed.

[GitHub] spark issue #19577: [SPARK-22355][SQL] Dataset.collect is not threadsafe

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19577 **[Test build #83063 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83063/testReport)** for PR 19577 at commit

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19559 **[Test build #83071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83071/testReport)** for PR 19559 at commit

[GitHub] spark issue #19556: [SPARK-22328][Core] ClosureCleaner should not miss refer...

2017-10-25 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19556 @cloud-fan The two remaining do-while loops are updated.

[GitHub] spark issue #19527: [SPARK-13030][ML] Create OneHotEncoderEstimator for OneH...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19527 **[Test build #83070 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83070/testReport)** for PR 19527 at commit

[GitHub] spark issue #19527: [SPARK-13030][ML] Create OneHotEncoderEstimator for OneH...

2017-10-25 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19527 @huaxingao Good catch! Thanks.

[GitHub] spark issue #19556: [SPARK-22328][Core] ClosureCleaner should not miss refer...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19556 **[Test build #83069 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83069/testReport)** for PR 19556 at commit

[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19077 **[Test build #83068 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83068/testReport)** for PR 19077 at commit

[GitHub] spark pull request #19531: [SPARK-22310] [SQL] Refactor join estimation to i...

2017-10-25 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19531#discussion_r147027199 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala --- @@ -157,64 +154,100 @@ case class

[GitHub] spark issue #11205: [SPARK-11334][Core] Handle maximum task failure situatio...

2017-10-25 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/11205 @vanzin, in the current code `stageIdToTaskIndices` cannot be used to track the number of running tasks, because this structure doesn't remove a task index from itself when the task is finished

[GitHub] spark issue #11205: [SPARK-11334][Core] Handle maximum task failure situatio...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11205 **[Test build #83067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83067/testReport)** for PR 11205 at commit

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147025026 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala --- @@ -0,0 +1,229 @@ +/*

[GitHub] spark pull request #18664: [SPARK-21375][PYSPARK][SQL] Add Date and Timestam...

2017-10-25 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18664#discussion_r147024180 --- Diff: python/pyspark/sql/types.py --- @@ -1619,11 +1619,38 @@ def to_arrow_type(dt): arrow_type = pa.decimal(dt.precision, dt.scale)

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147024086 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala --- @@ -0,0 +1,229 @@ +/*

[GitHub] spark pull request #18664: [SPARK-21375][PYSPARK][SQL] Add Date and Timestam...

2017-10-25 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18664#discussion_r147023750 --- Diff: python/pyspark/sql/types.py --- @@ -1619,11 +1619,38 @@ def to_arrow_type(dt): arrow_type = pa.decimal(dt.precision, dt.scale)

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147022826 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala --- @@ -0,0 +1,229 @@ +/*

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147022413 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala --- @@ -0,0 +1,229 @@ +/*

[GitHub] spark pull request #18664: [SPARK-21375][PYSPARK][SQL] Add Date and Timestam...

2017-10-25 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18664#discussion_r147021983 --- Diff: python/pyspark/sql/types.py --- @@ -1619,11 +1619,38 @@ def to_arrow_type(dt): arrow_type = pa.decimal(dt.precision, dt.scale)

[GitHub] spark pull request #19565: [SPARK-22111][MLLIB] OnlineLDAOptimizer should fi...

2017-10-25 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/19565#discussion_r147021726 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -446,14 +445,14 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19471 **[Test build #83066 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83066/testReport)** for PR 19471 at commit

[GitHub] spark pull request #19565: [SPARK-22111][MLLIB] OnlineLDAOptimizer should fi...

2017-10-25 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/19565#discussion_r147020853 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -446,14 +445,14 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #19565: [SPARK-22111][MLLIB] OnlineLDAOptimizer should fi...

2017-10-25 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/19565#discussion_r147021004 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -446,14 +445,14 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread ash211
Github user ash211 commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147021138 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala --- @@ -0,0 +1,229 @@ +/*

[GitHub] spark pull request #18664: [SPARK-21375][PYSPARK][SQL] Add Date and Timestam...

2017-10-25 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18664#discussion_r147021068 --- Diff: python/pyspark/serializers.py --- @@ -224,7 +225,13 @@ def _create_batch(series): # If a nullable integer series has been promoted to

[GitHub] spark issue #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - Basic Sc...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19468 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83062/ Test FAILed. ---

[GitHub] spark issue #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - Basic Sc...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19468 Merged build finished. Test FAILed.

[GitHub] spark issue #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - Basic Sc...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19468 **[Test build #83062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83062/testReport)** for PR 19468 at commit

[GitHub] spark issue #19565: [SPARK-22111][MLLIB] OnlineLDAOptimizer should filter ou...

2017-10-25 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/19565 I wonder if we should add cache() for LDA training data, even if not for this feature. @srowen Not sure where we stand on caching the training data for different algorithms. Appreciate

[GitHub] spark pull request #18664: [SPARK-21375][PYSPARK][SQL] Add Date and Timestam...

2017-10-25 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18664#discussion_r147020254 --- Diff: python/pyspark/sql/types.py --- @@ -1619,11 +1619,38 @@ def to_arrow_type(dt): arrow_type = pa.decimal(dt.precision, dt.scale)

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147019793 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala --- @@ -0,0 +1,229 @@ +/*

[GitHub] spark pull request #18664: [SPARK-21375][PYSPARK][SQL] Add Date and Timestam...

2017-10-25 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18664#discussion_r147019262 --- Diff: python/pyspark/serializers.py --- @@ -224,7 +225,13 @@ def _create_batch(series): # If a nullable integer series has been promoted to

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147019065 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala --- @@ -0,0

[GitHub] spark issue #19576: [SPARK-19727][SQL][followup] Fix for round function that...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19576 Merged build finished. Test PASSed.

[GitHub] spark issue #19576: [SPARK-19727][SQL][followup] Fix for round function that...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19576 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83059/ Test PASSed.

[GitHub] spark issue #19576: [SPARK-19727][SQL][followup] Fix for round function that...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19576 **[Test build #83059 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83059/testReport)** for PR 19576 at commit

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147018411 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala --- @@ -0,0 +1,229 @@ +/*

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15770 Merged build finished. Test FAILed.

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15770 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83064/ Test FAILed.

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15770 **[Test build #83064 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83064/testReport)** for PR 15770 at commit

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147017517 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala --- @@ -0,0 +1,229 @@ +/*

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-10-25 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r147017011 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala --- @@ -0,0
