[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19287 **[Test build #81969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81969/testReport)** for PR 19287 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81963/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81963 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81963/testReport)** for PR 17819 at commit

[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-19 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/19287 [SPARK-22074][Core] Task killed by other attempt task should not be resubmitted ## What changes were proposed in this pull request? As the detail scenario described in

[GitHub] spark issue #18193: [SPARK-15616] [SQL] CatalogRelation should fallback to H...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18193 **[Test build #81968 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81968/testReport)** for PR 18193 at commit

[GitHub] spark pull request #18193: [SPARK-15616] [SQL] CatalogRelation should fallba...

2017-09-19 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request: https://github.com/apache/spark/pull/18193#discussion_r139879632 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -140,6 +141,62 @@ class DetermineTableStats(session:

[GitHub] spark pull request #19211: [SPARK-18838][core] Add separate listener queues ...

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19211

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19229 Merged build finished. Test PASSed.

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19229 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81964/ Test PASSed. ---

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19211 thanks, merging to master!

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19229 **[Test build #81964 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81964/testReport)** for PR 19229 at commit

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19285 Merged build finished. Test FAILed.

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19285 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81961/ Test FAILed. ---

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #81961 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81961/testReport)** for PR 19285 at commit

[GitHub] spark pull request #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadi...

2017-09-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19286#discussion_r139878221 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -747,6 +747,19 @@ class JDBCSuite extends SparkFunSuite

[GitHub] spark pull request #19277: [SPARK-22058][CORE]the BufferedInputStream will n...

2017-09-19 Thread zuotingbing
Github user zuotingbing commented on a diff in the pull request: https://github.com/apache/spark/pull/19277#discussion_r139878136 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -351,11 +351,11 @@ private[spark] object

[GitHub] spark pull request #15544: [SPARK-17997] [SQL] Add an aggregation function f...

2017-09-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15544#discussion_r139878051 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala --- @@ -0,0 +1,235 @@

[GitHub] spark pull request #15544: [SPARK-17997] [SQL] Add an aggregation function f...

2017-09-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15544#discussion_r139877802 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala --- @@ -0,0 +1,235 @@

[GitHub] spark pull request #19211: [SPARK-18838][core] Add separate listener queues ...

2017-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19211#discussion_r139877613 --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala --- @@ -39,20 +41,13 @@ import org.apache.spark.util.Utils * has

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19281 **[Test build #81966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81966/testReport)** for PR 19281 at commit

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18805 **[Test build #81967 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81967/testReport)** for PR 18805 at commit

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18659 ok let's work around the type casting issue and discuss arrow upgrading later.

[GitHub] spark pull request #15544: [SPARK-17997] [SQL] Add an aggregation function f...

2017-09-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15544#discussion_r139877421 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala --- @@ -0,0 +1,235 @@

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19281 ok to test

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-19 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/19281 bq. This is not accurate. It depends on the length of required ordering and the length of child ordering. You are right. I did it right in the code but made a mistake in the description

[GitHub] spark pull request #15544: [SPARK-17997] [SQL] Add an aggregation function f...

2017-09-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15544#discussion_r139876729 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala --- @@ -0,0 +1,235 @@

[GitHub] spark pull request #15544: [SPARK-17997] [SQL] Add an aggregation function f...

2017-09-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15544#discussion_r139876548 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala --- @@ -0,0 +1,235 @@

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19281 > If the childOutputOrdering satisfies (is a superset of) the required child ordering => childOutputOrdering This is not accurate. It depends on the length of required ordering and the

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-19 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19229 @viirya Thanks very much! Although the perf gap exists (when numCols is large), it won't block this PR. I will create a JIRA to track this. ---

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15544 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81960/ Test PASSed. ---

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15544 Merged build finished. Test PASSed.

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15544 **[Test build #81960 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81960/testReport)** for PR 15544 at commit

[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19286 **[Test build #81965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81965/testReport)** for PR 19286 at commit

[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19286 cc @gatorsmile @huaxingao

[GitHub] spark pull request #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calc...

2017-09-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19281#discussion_r139875187 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -101,14 +101,15 @@ case class SortMergeJoinExec(

[GitHub] spark pull request #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calc...

2017-09-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19281#discussion_r139875333 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala --- @@ -64,6 +67,42 @@ class JoinSuite extends QueryTest with SharedSQLContext {

[GitHub] spark pull request #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadi...

2017-09-19 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19286 [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTruncateTable() method in AggregatedDialect ## What changes were proposed in this pull request? The implemented

[GitHub] spark pull request #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calc...

2017-09-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19281#discussion_r139873950 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -396,6 +396,26 @@ abstract class SparkPlan extends

[GitHub] spark pull request #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calc...

2017-09-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19281#discussion_r139873547 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -396,6 +396,26 @@ abstract class SparkPlan extends

[GitHub] spark issue #19246: [SPARK-22025][PySpark] Speeding up fromInternal for Stru...

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19246 I'd close this PR if there is no objection @maver1ck and I didn't miss something.

[GitHub] spark pull request #19246: [SPARK-22025][PySpark] Speeding up fromInternal f...

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19246#discussion_r139874732 --- Diff: python/pyspark/sql/types.py --- @@ -410,6 +410,24 @@ def __init__(self, name, dataType, nullable=True, metadata=None):

[GitHub] spark pull request #18754: [WIP][SPARK-21552][SQL] Add DecimalType support t...

2017-09-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18754#discussion_r139872489 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala --- @@ -224,6 +226,25 @@ private[arrow] class DoubleWriter(val

[GitHub] spark pull request #19246: [SPARK-22025][PySpark] Speeding up fromInternal f...

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19246#discussion_r139872337 --- Diff: python/pyspark/sql/types.py --- @@ -410,6 +410,24 @@ def __init__(self, name, dataType, nullable=True, metadata=None):

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229
numColums | RDD Mean | RDD Median | DataFrame Mean | DataFrame Median
-- | -- | -- | -- | --
1 | 0.1642173481 | 0.199774305 | 0.4260180671006 | 0.2025112919
10 | 0.3713707549 |

[GitHub] spark pull request #19246: [SPARK-22025][PySpark] Speeding up fromInternal f...

2017-09-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19246#discussion_r139871986 --- Diff: python/pyspark/sql/types.py --- @@ -410,6 +410,24 @@ def __init__(self, name, dataType, nullable=True, metadata=None): self.dataType

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 I ran the following test code to benchmark the RDD version against the DataFrame version with this `ImputerModel` change: import org.apache.spark.ml.feature._ import org.apache.spark.sql.{DataFrame,

[GitHub] spark pull request #19243: [SPARK-21780][R] Simpler Dataset.sample API in R

2017-09-19 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19243#discussion_r139868488 --- Diff: R/pkg/R/DataFrame.R --- @@ -998,33 +998,44 @@ setMethod("unique", #' sparkR.session() #' path <- "path/to/file.json" #' df <-

[GitHub] spark pull request #19276: [SPARK-22049][DOCS] Confusing behavior of from_ut...

2017-09-19 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19276#discussion_r139868265 --- Diff: R/pkg/R/functions.R --- @@ -2286,8 +2286,8 @@ setMethod("next_day", signature(y = "Column", x = "character"), }) #'

[GitHub] spark pull request #19277: [SPARK-22058][CORE]the BufferedInputStream will n...

2017-09-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19277#discussion_r139867429 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -351,11 +351,11 @@ private[spark] object EventLoggingListener

[GitHub] spark pull request #19277: [SPARK-22058][CORE]the BufferedInputStream will n...

2017-09-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19277#discussion_r139867369 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -351,11 +351,11 @@ private[spark] object EventLoggingListener

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19229 **[Test build #81964 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81964/testReport)** for PR 19229 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick Yes, you're right: moving only `setXXX` to the concrete class also works fine. The root cause is the `setXXX` return type. But I think the multi / single logic can be merged, because single

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81963 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81963/testReport)** for PR 17819 at commit

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #81961 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81961/testReport)** for PR 19285 at commit

[GitHub] spark issue #19160: [SPARK-21934][CORE] Expose Shuffle Netty memory usage to...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19160 **[Test build #81962 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81962/testReport)** for PR 19160 at commit

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2017-09-19 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19285 ok to test.

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19271 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81958/ Test FAILed. ---

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19278 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81959/ Test PASSed. ---

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19271 **[Test build #81958 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81958/testReport)** for PR 19271 at commit

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19271 Merged build finished. Test FAILed.

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19278 Merged build finished. Test PASSed.

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19278 **[Test build #81959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81959/testReport)** for PR 19278 at commit

[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18648 The description is not clear; I only understood it after diving into the code changes.

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19271 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81957/ Test FAILed. ---

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19271 Merged build finished. Test FAILed.

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19271 **[Test build #81957 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81957/testReport)** for PR 19271 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81956/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed.

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81956/testReport)** for PR 17819 at commit

[GitHub] spark issue #18685: [SPARK-21439][PySpark] Support for ABCMeta in PySpark

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18685 This should be good to go as soon as updated.

[GitHub] spark issue #19234: [SPARK-22010][PySpark] Change fromInternal method of Tim...

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19234 Hm, BTW, do we handle https://github.com/python/cpython/blob/018d353c1c8c87767d2335cd884017c2ce12e045/Lib/datetime.py#L1443-L1455: ```python if tz is None:

[GitHub] spark pull request #19160: [SPARK-21934][CORE] Expose Shuffle Netty memory u...

2017-09-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19160#discussion_r139861892 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleClient.java --- @@ -117,6 +118,12 @@ public void

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15544 **[Test build #81960 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81960/testReport)** for PR 15544 at commit

[GitHub] spark pull request #18193: [SPARK-15616] [SQL] CatalogRelation should fallba...

2017-09-19 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/18193#discussion_r139861601 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -140,6 +141,62 @@ class DetermineTableStats(session: SparkSession)

[GitHub] spark pull request #19160: [SPARK-21934][CORE] Expose Shuffle Netty memory u...

2017-09-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19160#discussion_r139861341 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -115,6 +115,7 @@ private[spark] class Executor( if (!isLocal) {

[GitHub] spark pull request #19160: [SPARK-21934][CORE] Expose Shuffle Netty memory u...

2017-09-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19160#discussion_r139861303 --- Diff: core/src/main/scala/org/apache/spark/deploy/ExternalShuffleServiceSource.scala --- @@ -19,19 +19,19 @@ package org.apache.spark.deploy

[GitHub] spark pull request #19246: [SPARK-22025][PySpark] Speeding up fromInternal f...

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19246#discussion_r139861057 --- Diff: python/pyspark/sql/types.py --- @@ -410,6 +410,24 @@ def __init__(self, name, dataType, nullable=True, metadata=None):

[GitHub] spark pull request #19246: [SPARK-22025][PySpark] Speeding up fromInternal f...

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19246#discussion_r139861001 --- Diff: python/pyspark/sql/types.py --- @@ -410,6 +410,24 @@ def __init__(self, name, dataType, nullable=True, metadata=None):

[GitHub] spark pull request #19160: [SPARK-21934][CORE] Expose Shuffle Netty memory u...

2017-09-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19160#discussion_r139860969 --- Diff: core/src/main/scala/org/apache/spark/network/netty/NettyBlockTransferService.scala --- @@ -18,11 +18,14 @@ package

[GitHub] spark pull request #19160: [SPARK-21934][CORE] Expose Shuffle Netty memory u...

2017-09-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19160#discussion_r139860924 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -248,6 +251,16 @@ private[spark] class BlockManager(

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Merged build finished. Test FAILed.

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81955/ Test FAILed. ---

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18659 **[Test build #81955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81955/testReport)** for PR 18659 at commit

[GitHub] spark pull request #19259: [BACKPORT-2.1][SPARK-19318][SPARK-22041][SQL] Doc...

2017-09-19 Thread wangyum
Github user wangyum closed the pull request at: https://github.com/apache/spark/pull/19259

[GitHub] spark pull request #19284: [SPARK-22067][SQL] ArrowWriter should use positio...

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19284

[GitHub] spark issue #19284: [SPARK-22067][SQL] ArrowWriter should use position when ...

2017-09-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19284 Thanks! merging to master.

[GitHub] spark pull request #15544: [SPARK-17997] [SQL] Add an aggregation function f...

2017-09-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15544#discussion_r139859176 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervalsSuite.scala --- @@ -0,0 +1,206 @@

[GitHub] spark issue #19196: [SPARK-21977] SinglePartition optimizations break certai...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19196 Merged build finished. Test PASSed.

[GitHub] spark issue #19196: [SPARK-21977] SinglePartition optimizations break certai...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19196 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81954/ Test PASSed. ---

[GitHub] spark issue #19196: [SPARK-21977] SinglePartition optimizations break certai...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19196 **[Test build #81954 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81954/testReport)** for PR 19196 at commit

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19278 **[Test build #81959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81959/testReport)** for PR 19278 at commit

[GitHub] spark issue #19208: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...

2017-09-19 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19208 @smurching I will update this PR after #19278 merged. Because now this PR depend on that one. Thanks!

[GitHub] spark issue #19284: [SPARK-22067][SQL] ArrowWriter should use position when ...

2017-09-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19284 LGTM.

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-19 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19278 @BryanCutler The reason I add `skipParams` is that, if we don't use `DefaultParamReader.getAndSetParams`, we have to hardcoding all params which are very troublesome. And every time we add new

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139856165 --- Diff: python/pyspark/sql/functions.py --- @@ -2142,18 +2159,26 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE|

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139855188 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,46 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19271 **[Test build #81958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81958/testReport)** for PR 19271 at commit

[GitHub] spark pull request #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidat...

2017-09-19 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19278#discussion_r139855087 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala --- @@ -303,16 +304,16 @@ object CrossValidatorModel extends

[GitHub] spark pull request #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidat...

2017-09-19 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19278#discussion_r139854984 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala --- @@ -212,14 +213,12 @@ object CrossValidator extends
