[GitHub] spark issue #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14412 **[Test build #72291 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72291/testReport)** for PR 14412 at commit [`16975b6`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #16779: [SPARK-19437] Rectify spark executor id in HeartbeatRece...

2017-02-02 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16779 LGTM pending tests --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, o

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99257411 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala --- @@ -235,3 +234,79 @@ case class StateStoreSaveExec(

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99257337 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/KeyedStateImpl.scala --- @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache S

[GitHub] spark issue #16777: [SPARK-19435][SQL] Type coercion between ArrayTypes

2017-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16777 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72290/ Test PASSed.

[GitHub] spark issue #16777: [SPARK-19435][SQL] Type coercion between ArrayTypes

2017-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16777 Merged build finished. Test PASSed.

[GitHub] spark issue #16777: [SPARK-19435][SQL] Type coercion between ArrayTypes

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16777 **[Test build #72290 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72290/testReport)** for PR 16777 at commit [`b860d25`](https://github.com/apache/spark/commit/b

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99256384 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99256480 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99255993 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala --- @@ -147,6 +147,23 @@ private[state] cla

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99255963 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99255889 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala --- @@ -68,7 +71,7 @@ class IncrementalExecution(

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99255665 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99255194 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -313,6 +313,56 @@ case class MapGroups( output

[GitHub] spark pull request #16733: [SPARK-19392][SQL] Fix the bug that throws an exc...

2017-02-02 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16733#discussion_r99255047 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -726,11 +726,14 @@ class JDBCSuite extends SparkFunSuite test(

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99255012 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala --- @@ -46,8 +46,13 @@ object UnsupportedOpera

[GitHub] spark pull request #16733: [SPARK-19392][SQL] Fix the bug that throws an exc...

2017-02-02 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16733#discussion_r99255011 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala --- @@ -29,7 +29,12 @@ private case object OracleDialect extends JdbcDialect {

[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...

2017-02-02 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r99253729 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala --- @@ -124,7 +129,8 @@ private[r] object GaussianMixtureWrapper extends

[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...

2017-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16782 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72292/ Test PASSed.

[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...

2017-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16782 Merged build finished. Test PASSed.

[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16782 **[Test build #72292 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72292/testReport)** for PR 16782 at commit [`83bcce0`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #16740: [SPARK-19400][ML] Allow GLM to handle intercept only mod...

2017-02-02 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/16740 @sethah Your formula for offset does not seem to be a general solution, and I'm not sure if there exists an analytical formula, in particular when the link function is not identity or log. In G

[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-02-02 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15435 So, looking at the design, I'm a bit concerned. Since we're adding summaries in several places around ML, I think we'd ideally design a hierarchy like we did for the estimators and models:

[GitHub] spark issue #12420: [SPARK-14585][ML][WIP] Provide accessor methods for Pipe...

2017-02-02 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/12420 Well, after spending a while looking around, I haven't found a good way to write this and make it Java friendly (i.e., not use ClassTag, Type, or TypeTag). Does anyone else have ideas? I'll try

[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...

2017-02-02 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/16782 Ping @holdenk @davies . I reproduced the code in the JIRA and found that kwargs from one thread were getting overwritten by another, causing a `ml.Pipeline` to be constructed with incorrect par

[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16782 **[Test build #72292 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72292/testReport)** for PR 16782 at commit [`83bcce0`](https://github.com/apache/spark/commit/83

[GitHub] spark pull request #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only d...

2017-02-02 Thread BryanCutler
GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/16782 [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorator is not thread-safe ## What changes were proposed in this pull request? The `@keyword_only` decorator in PySpark is not thread-safe.
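
The kind of race described in this pull request can be sketched in plain Python. This is a minimal illustration, not PySpark's code: the decorator names `keyword_only_shared` and `keyword_only_per_instance`, the two classes, and the `run` helper are all hypothetical, and it assumes the problem arises from keeping the captured keyword arguments in one slot shared by every caller. A `threading.Barrier` forces both threads to store their kwargs before either reads them back, so the overwrite shows up on every run rather than only occasionally.

```python
import functools
import threading


def keyword_only_shared(func):
    """Hypothetical decorator: keeps kwargs on the wrapper function itself,
    so every instance and every thread shares the same slot."""
    @functools.wraps(func)
    def wrapper(self, **kwargs):
        wrapper._input_kwargs = kwargs  # shared across all callers
        return func(self, **kwargs)
    return wrapper


def keyword_only_per_instance(func):
    """Hypothetical alternative: keeps kwargs on the instance instead."""
    @functools.wraps(func)
    def wrapper(self, **kwargs):
        self._input_kwargs = kwargs  # private to this object
        return func(self, **kwargs)
    return wrapper


class SharedParams:
    def __init__(self, barrier):
        self._barrier = barrier
        self.seen = None

    @keyword_only_shared
    def set_params(self, **kwargs):
        # Wait until both threads have stored their kwargs, then read back.
        self._barrier.wait(timeout=5)
        # Attribute lookup on a bound method falls through to the function.
        self.seen = dict(self.set_params._input_kwargs)


class InstanceParams:
    def __init__(self, barrier):
        self._barrier = barrier
        self.seen = None

    @keyword_only_per_instance
    def set_params(self, **kwargs):
        self._barrier.wait(timeout=5)
        self.seen = dict(self._input_kwargs)


def run(cls):
    """Call set_params(value=0) and set_params(value=1) from two threads."""
    barrier = threading.Barrier(2)
    objs = [cls(barrier), cls(barrier)]
    threads = [
        threading.Thread(target=obj.set_params, kwargs={"value": i})
        for i, obj in enumerate(objs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [obj.seen for obj in objs]


if __name__ == "__main__":
    print("shared slot:  ", run(SharedParams))
    print("per instance: ", run(InstanceParams))
```

With the shared slot, both objects end up reading the same kwargs, so one of them sees values it was never given; with per-instance storage each object reads back its own.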

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99238434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99225106 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala --- @@ -111,6 +111,25 @@ class UnsupportedOper

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99239899 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala --- @@ -68,7 +71,7 @@ class IncrementalExecution(

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99240855 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala --- @@ -58,6 +58,8 @@ trait StateStore { */

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99238895 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99240819 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala --- @@ -147,6 +147,23 @@ private[state]

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99238001 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99243402 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala --- @@ -235,3 +234,79 @@ case class StateStoreSaveExec(

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99237776 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99238163 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyedState.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99231831 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/MapGroupsWithStateSuite.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99243856 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala --- @@ -235,3 +234,79 @@ case class StateStoreSaveExec(

[GitHub] spark pull request #16758: [SPARK-19413][SS] MapGroupsWithState for arbitrar...

2017-02-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16758#discussion_r99228681 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/MapGroupsWithStateSuite.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tcondie
Github user tcondie commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99244144 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceRDD.scala --- @@ -135,7 +136,28 @@ private[kafka010] class KafkaSo

[GitHub] spark pull request #16781: [SPARK-12297][SQL][POC] Hive compatibility for Pa...

2017-02-02 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/16781

[GitHub] spark issue #16541: [SPARK-19088][SQL] Optimize sequence type deserializatio...

2017-02-02 Thread michalsenkyr
Github user michalsenkyr commented on the issue: https://github.com/apache/spark/pull/16541 Apologies for taking so long. I tried modifying the serialization logic as best as I could to serialize into `UnsafeArrayData` ([branch diff](https://github.com/michalsenkyr/spark/comp

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tcondie
Github user tcondie commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99240245 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala --- @@ -334,14 +334,15 @@ private[kafka010] object

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 No test errors. Looks like the test process was killed midway. Tests added as a part of this PR took less than 7s, so couldn't have caused the delay.

[GitHub] spark issue #16762: [SPARK-19419] [SPARK-19420] Fix the cross join detection

2017-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16762 Merged build finished. Test FAILed.

[GitHub] spark issue #16762: [SPARK-19419] [SPARK-19420] Fix the cross join detection

2017-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16762 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72289/ Test FAILed.

[GitHub] spark issue #16762: [SPARK-19419] [SPARK-19420] Fix the cross join detection

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16762 **[Test build #72289 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72289/testReport)** for PR 16762 at commit [`671a361`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14412 **[Test build #72291 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72291/testReport)** for PR 14412 at commit [`16975b6`](https://github.com/apache/spark/commit/16

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r99237148 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala --- @@ -188,24 +189,45 @@ class BlockManagerMasterEndpoint(

[GitHub] spark issue #16777: [SPARK-19435][SQL] Type coercion between ArrayTypes

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16777 **[Test build #72290 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72290/testReport)** for PR 16777 at commit [`b860d25`](https://github.com/apache/spark/commit/b8

[GitHub] spark issue #16739: [SPARK-19399][SPARKR] Add R coalesce API for DataFrame a...

2017-02-02 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16739 @gatorsmile thanks for commenting. `coalesce` currently accepts a number even if it is larger than the current number of partitions - I guess we didn't want to throw an exception in that case?

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99198943 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsets.scala --- @@ -22,11 +22,11 @@ import org.apache.kafka.common.TopicP

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99229152 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaUtils.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Sof

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99227219 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -71,94 +77,152 @@ private[kafka010] class Kafka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99198794 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99199482 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsets.scala --- @@ -22,11 +22,11 @@ import org.apache.kafka.common.TopicP

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99201167 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaRelationSuite.scala --- @@ -0,0 +1,255 @@ +/* + * Licensed to the A

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99200075 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -251,7 +315,43 @@ private[kafka010] class Kafka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99227485 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -71,94 +77,152 @@ private[kafka010] class Kafka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99226571 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala --- @@ -384,6 +384,9 @@ class KafkaSourceSuite extends Ka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99227617 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -71,94 +77,152 @@ private[kafka010] class Kafka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99197254 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99196664 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99192856 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala --- @@ -42,7 +42,7 @@ private[kafka010] case class Cac

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99198641 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99200239 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -251,7 +315,43 @@ private[kafka010] class Kafka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99193469 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceRDD.scala --- @@ -135,7 +136,28 @@ private[kafka010] class KafkaSourc

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99227773 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -278,5 +378,13 @@ private[kafka010] class Kafka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99199931 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -251,7 +315,43 @@ private[kafka010] class Kafka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99201123 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaRelationSuite.scala --- @@ -0,0 +1,255 @@ +/* + * Licensed to the A

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99196592 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99227400 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -71,94 +77,152 @@ private[kafka010] class Kafka

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99200748 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaRelationSuite.scala --- @@ -0,0 +1,255 @@ +/* + * Licensed to the A

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99198390 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99197356 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99228186 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaUtils.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Sof

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99192771 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala --- @@ -42,7 +42,7 @@ private[kafka010] case class Cac

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99196854 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99196372 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99196482 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99195732 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala --- @@ -334,14 +334,15 @@ private[kafka010] object Cac

[GitHub] spark pull request #16686: [SPARK-18682][SS] Batch Source for Kafka

2017-02-02 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16686#discussion_r99195778 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -0,0 +1,389 @@ +/* + * Licensed to the Ap

[GitHub] spark issue #16739: [SPARK-19399][SPARKR] Add R coalesce API for DataFrame a...

2017-02-02 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16739 `coalesce` is used to decrease the number of partitions in the RDD, but when you are setting it to a number that is larger than the number of the current RDD partitions, the result is not predica

[GitHub] spark issue #16762: [SPARK-19419] [SPARK-19420] Fix the cross join detection

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16762 **[Test build #72289 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72289/testReport)** for PR 16762 at commit [`671a361`](https://github.com/apache/spark/commit/67

[GitHub] spark issue #16762: [SPARK-19419] [SPARK-19420] Fix the cross join detection

2017-02-02 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16762 Let me first push the new changes to resolve the SparkR issue. However, I still need time to add the corresponding test cases to Scala.

[GitHub] spark pull request #16762: [SPARK-19419] [SPARK-19420] Fix the cross join de...

2017-02-02 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16762#discussion_r99217621 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastNestedLoopJoinExec.scala --- @@ -339,6 +340,18 @@ case class BroadcastNes

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-02 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99217534 --- Diff: pom.xml --- @@ -146,6 +146,8 @@ hadoop2 0.7.1 1.6.2 + +1.10.61 --- End diff -- I believe there

[GitHub] spark pull request #16776: [SPARK-19436][SQL] Add missing tests for approxQu...

2017-02-02 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16776#discussion_r99215565 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala --- @@ -159,16 +159,53 @@ class DataFrameStatSuite extends QueryTest with

[GitHub] spark issue #16776: [SPARK-19436][SQL] Add missing tests for approxQuantile

2017-02-02 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16776 @zhengruifeng Could you split the test cases into multiple independent ones with meaningful titles? Thanks!

[GitHub] spark issue #16780: [SPARK-19438] Both reading and updating executorDataMap ...

2017-02-02 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16780 > Thus it might be possible that both of them will go to else branch. `receiveAndReply` processes messages in sequence, just like using one thread, so this won't happen.

[GitHub] spark pull request #16723: [SPARK-19389][ML][PYTHON][DOC] Minor doc fixes fo...

2017-02-02 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16723

[GitHub] spark issue #16723: [SPARK-19389][ML][PYTHON][DOC] Minor doc fixes for ML Py...

2017-02-02 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16723 OK, thanks a lot @HyukjinKwon and @wangmiao1981! Merging with master.

[GitHub] spark issue #16781: [SPARK-12297][SQL][POC] Hive compatibility for Parquet T...

2017-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16781 Merged build finished. Test FAILed.

[GitHub] spark issue #16781: [SPARK-12297][SQL][POC] Hive compatibility for Parquet T...

2017-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16781 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72288/ Test FAILed. ---

[GitHub] spark issue #16781: [SPARK-12297][SQL][POC] Hive compatibility for Parquet T...

2017-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16781 **[Test build #72288 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72288/testReport)** for PR 16781 at commit [`5b49ae0`](https://github.com/apache/spark/commit/5

[GitHub] spark pull request #16733: [SPARK-19392][SQL] Fix the bug that throws an exc...

2017-02-02 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16733#discussion_r99202768 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -726,11 +726,14 @@ class JDBCSuite extends SparkFunSuite t

[GitHub] spark issue #16767: [SPARK-19386][SPARKR][DOC] Bisecting k-means in SparkR d...

2017-02-02 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/16767 LGTM

[GitHub] spark issue #1051: [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER w...

2017-02-02 Thread HrWangChengdu
Github user HrWangChengdu commented on the issue: https://github.com/apache/spark/pull/1051 Could anyone explain why we don't want to use MEMORY_ONLY by default?

[GitHub] spark issue #16761: [BackPort-2.1][SPARK-19319][SparkR]:SparkR Kmeans summar...

2017-02-02 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/16761 If we don't want to change the parameters in 2.1, it is not necessary to port it back, because the bug occurs only if you use `random` mode with a specific `seed`. If we don't provide seed,
