[GitHub] spark issue #17525: [SPARK-20209][SS] Execute next trigger immediately if pr...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17525 **[Test build #75504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75504/testReport)** for PR 17525 at commit [`50f0195`](https://github.com/apache/spark/commit/50

[GitHub] spark pull request #17455: [Spark-20044][Web UI] Support Spark UI behind fro...

2017-04-03 Thread okoethibm
Github user okoethibm commented on a diff in the pull request: https://github.com/apache/spark/pull/17455#discussion_r109588861 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -132,7 +132,13 @@ private[deploy] class Master( webUi.bind()

[GitHub] spark issue #17526: [SPARKR][DOC] update doc for fpgrowth

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17526 **[Test build #75503 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75503/testReport)** for PR 17526 at commit [`e4e03ea`](https://github.com/apache/spark/commit/e4

[GitHub] spark pull request #17526: [SPARKR][DOC] update doc for fpgrowth

2017-04-03 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/17526 [SPARKR][DOC] update doc for fpgrowth ## What changes were proposed in this pull request? minor update @zero323 You can merge this pull request into a Git repository by runni

[GitHub] spark issue #17445: [SPARK-20115] [CORE] Fix DAGScheduler to recompute all t...

2017-04-03 Thread umehrot2
Github user umehrot2 commented on the issue: https://github.com/apache/spark/pull/17445 Jenkins test this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wish

[GitHub] spark pull request #17525: [SPARK-20209][SS] Execute next trigger immediatel...

2017-04-03 Thread tdas
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/17525 [SPARK-20209][SS] Execute next trigger immediately if previous batch took longer than trigger interval ## What changes were proposed in this pull request? For large trigger intervals (e.g. 10
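
The rule in the PR title can be illustrated with a small Python sketch. This is not the code from the pull request: `next_trigger_time` and its millisecond parameters are hypothetical names chosen for illustration, and the sketch assumes a trigger is normally scheduled one interval after the previous batch started.

```python
def next_trigger_time(batch_start_ms: int, batch_end_ms: int, interval_ms: int) -> int:
    """Return when the next batch should start (hypothetical helper).

    Assumption: the next trigger is normally due one interval after the
    previous batch started. If the batch overran that point, the next
    batch starts as soon as the previous one ends, with no extra wait.
    """
    scheduled_ms = batch_start_ms + interval_ms
    # max() picks the scheduled time when the batch finished early,
    # and the batch end time when the batch took longer than the interval.
    return max(scheduled_ms, batch_end_ms)


# A batch that finishes early waits for the scheduled trigger.
on_time = next_trigger_time(batch_start_ms=0, batch_end_ms=4_000, interval_ms=10_000)
# A batch that overruns the interval triggers the next one immediately.
overrun = next_trigger_time(batch_start_ms=0, batch_end_ms=12_000, interval_ms=10_000)
```

With a 10-second interval, the first call waits until the scheduled trigger, while the second starts the next batch as soon as the slow one ends.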

[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-04-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17170

[GitHub] spark issue #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-04-03 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17170 merged to master. @zero323 could you follow up with vignettes and programming guide update please - we need them for the 2.2.0 release.

[GitHub] spark issue #17524: [SPARK-19235] [SQL] [TEST] [FOLLOW-UP] Enable Test Cases...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17524 **[Test build #75502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75502/testReport)** for PR 17524 at commit [`427741f`](https://github.com/apache/spark/commit/42

[GitHub] spark pull request #17455: [Spark-20044][Web UI] Support Spark UI behind fro...

2017-04-03 Thread okoethibm
Github user okoethibm commented on a diff in the pull request: https://github.com/apache/spark/pull/17455#discussion_r109586326 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/ExecutorRunner.scala --- @@ -157,7 +157,9 @@ private[deploy] class ExecutorRunner(

[GitHub] spark pull request #17394: [SPARK-20067] [SQL] Unify and Clean Up Desc Comma...

2017-04-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17394

[GitHub] spark issue #17524: [SPARK-19235] [SQL] [TEST] [FOLLOW-UP] Enable Test Cases...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17524 **[Test build #75501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75501/testReport)** for PR 17524 at commit [`c102187`](https://github.com/apache/spark/commit/c1

[GitHub] spark issue #17394: [SPARK-20067] [SQL] Unify and Clean Up Desc Commands Usi...

2017-04-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17394 Thanks! Merging to master.

[GitHub] spark issue #17524: [SPARK-19235] [SQL] [TEST] [FOLLOW-UP] Enable Test Cases...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17524 **[Test build #75500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75500/testReport)** for PR 17524 at commit [`8382228`](https://github.com/apache/spark/commit/83

[GitHub] spark pull request #17524: [SPARK-19235] [SQL] [TEST] [FOLLOW-UP] Enable Tes...

2017-04-03 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/17524 [SPARK-19235] [SQL] [TEST] [FOLLOW-UP] Enable Test Cases in DDLSuite with Hive Metastore ### What changes were proposed in this pull request? This is a follow-up of enabling test cases in DD

[GitHub] spark issue #17480: [SPARK-20079][Core][yarn] Re registration of AM hangs sp...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17480 **[Test build #75499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75499/testReport)** for PR 17480 at commit [`f54c9ae`](https://github.com/apache/spark/commit/f5

[GitHub] spark pull request #17469: [SPARK-20132][Docs] Add documentation for column ...

2017-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17469#discussion_r109575589 --- Diff: python/pyspark/sql/column.py --- @@ -303,8 +333,25 @@ def isin(self, *cols): desc = _unary_op("desc", "Returns a sort expression based

[GitHub] spark pull request #17480: [SPARK-20079][Core][yarn] Re registration of AM h...

2017-04-03 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/17480#discussion_r109575470 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -249,7 +249,9 @@ private[spark] class ExecutorAllocationManager(

[GitHub] spark pull request #17469: [SPARK-20132][Docs] Add documentation for column ...

2017-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17469#discussion_r109575023 --- Diff: python/pyspark/sql/column.py --- @@ -250,11 +250,39 @@ def __iter__(self): raise TypeError("Column is not iterable")

[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75498/ Test PASSed.

[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Merged build finished. Test PASSed.

[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17251 **[Test build #75498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75498/testReport)** for PR 17251 at commit [`2150ce5`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #17469: [SPARK-20132][Docs] Add documentation for column string ...

2017-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17469 It might be better to run `./dev/lint-python` locally if possible. It will catch more of the minor nits ahead of time.

[GitHub] spark pull request #17469: [SPARK-20132][Docs] Add documentation for column ...

2017-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17469#discussion_r109574284 --- Diff: python/pyspark/sql/column.py --- @@ -303,8 +333,25 @@ def isin(self, *cols): desc = _unary_op("desc", "Returns a sort expression based

[GitHub] spark pull request #17469: [SPARK-20132][Docs] Add documentation for column ...

2017-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17469#discussion_r109574278 --- Diff: python/pyspark/sql/column.py --- @@ -303,8 +333,25 @@ def isin(self, *cols): desc = _unary_op("desc", "Returns a sort expression based

[GitHub] spark issue #17394: [SPARK-20067] [SQL] Unify and Clean Up Desc Commands Usi...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17394 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75497/ Test PASSed.

[GitHub] spark issue #17394: [SPARK-20067] [SQL] Unify and Clean Up Desc Commands Usi...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17394 Merged build finished. Test PASSed.

[GitHub] spark issue #17394: [SPARK-20067] [SQL] Unify and Clean Up Desc Commands Usi...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17394 **[Test build #75497 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75497/testReport)** for PR 17394 at commit [`862a4d7`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #17459: [SPARK-20109][MLlib] Added toBlockMatrixDense to Indexed...

2017-04-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17459 @johnc1231 The prototype I did: https://github.com/apache/spark/compare/master...viirya:general-toblockmatrix?expand=1 Maybe you can take a look and see if it is useful to you.

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75496/ Test PASSed.

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17494 Merged build finished. Test PASSed.

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17494 **[Test build #75496 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75496/testReport)** for PR 17494 at commit [`fbcc1fe`](https://github.com/apache/spark/commit/f

[GitHub] spark pull request #17505: [SPARK-20187][SQL] Replace loadTable with moveFil...

2017-04-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17505#discussion_r109567793 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -694,12 +694,25 @@ private[hive] class HiveClientImpl(

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17520 Merged build finished. Test PASSed.

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17520 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75494/ Test PASSed.

[GitHub] spark issue #17394: [SPARK-20067] [SQL] Unify and Clean Up Desc Commands Usi...

2017-04-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17394 LGTM

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17520 **[Test build #75494 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75494/testReport)** for PR 17520 at commit [`0bab4fd`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #17459: [SPARK-20109][MLlib] Added toBlockMatrixDense to Indexed...

2017-04-03 Thread johnc1231
Github user johnc1231 commented on the issue: https://github.com/apache/spark/pull/17459 Alright, I agree with this. We'll choose between Dense and Sparse matrix backings based on the type of the first vector in the iterator. I'd be happy to take on making these adjustments.
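
One way to read the approach described above (picking the matrix backing from the type of the first vector in the iterator) is sketched below in Python. `DenseVec`, `SparseVec`, and `choose_backing` are hypothetical stand-ins, not Spark MLlib classes, and the default for an empty iterator is an assumption the thread does not address.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Union


@dataclass
class DenseVec:
    """Hypothetical dense vector: every value stored explicitly."""
    values: List[float]


@dataclass
class SparseVec:
    """Hypothetical sparse vector: only non-zero entries stored."""
    size: int
    entries: Dict[int, float]


Vec = Union[DenseVec, SparseVec]


def choose_backing(rows: Iterable[Vec]) -> str:
    """Pick "dense" or "sparse" from the type of the first vector only."""
    first = next(iter(rows), None)
    if first is None:
        # Assumption: an empty block falls back to a sparse backing.
        return "sparse"
    return "dense" if isinstance(first, DenseVec) else "sparse"


# Later rows are not inspected, so a mixed input follows its first row.
mixed = [DenseVec([1.0, 0.0]), SparseVec(2, {1: 3.0})]
backing = choose_backing(mixed)
```

Because only the first vector is inspected, a block that mixes dense and sparse rows takes the backing of its first row; that is the trade-off of this heuristic.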

[GitHub] spark issue #17506: [SPARK-20189][DStream] Fix spark kinesis testcases to re...

2017-04-03 Thread yssharma
Github user yssharma commented on the issue: https://github.com/apache/spark/pull/17506 The Scala style check probably fails because of the double-spaced lines. But that is how the existing code was, so I thought of keeping it that way.

[GitHub] spark issue #17459: [SPARK-20109][MLlib] Added toBlockMatrixDense to Indexed...

2017-04-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17459 > I considered having toBlockMatrix check if the rows of IndexedRowMatrix were dense or sparse, but there is no guarantee of consistency. Like, an IndexedRowMatrix could be a mix of Dense and Sparse

[GitHub] spark issue #17467: [SPARK-20140][DStream] Remove hardcoded kinesis retry wa...

2017-04-03 Thread yssharma
Github user yssharma commented on the issue: https://github.com/apache/spark/pull/17467 @srowen - Could I get some love here as well? Thanks.

[GitHub] spark issue #15332: [SPARK-10364][SQL] Support Parquet logical type TIMESTAM...

2017-04-03 Thread dilipbiswal
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/15332 Thanks a lot @ueshin @viirya @gatorsmile

[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17251 **[Test build #75498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75498/testReport)** for PR 17251 at commit [`2150ce5`](https://github.com/apache/spark/commit/21

[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...

2017-04-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17251 Retest this please

[GitHub] spark issue #17459: [SPARK-20109][MLlib] Added toBlockMatrixDense to Indexed...

2017-04-03 Thread johnc1231
Github user johnc1231 commented on the issue: https://github.com/apache/spark/pull/17459 @viirya I think we definitely care about giving users the ability to make either dense or sparse Block matrices. I made a 100k by 10k IndexedRowMatrix of random doubles, then converted it to a Blo

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75495/ Test PASSed.

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17494 Merged build finished. Test PASSed.

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17494 **[Test build #75495 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75495/testReport)** for PR 17494 at commit [`8936880`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #17523: [SPARK-20064][PySpark]

2017-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17523 (it would be nicer if the title is fixed to indicate what it proposes in short)

[GitHub] spark issue #16347: [SPARK-18934][SQL] Writing to dynamic partitions does no...

2017-04-03 Thread Downchuck
Github user Downchuck commented on the issue: https://github.com/apache/spark/pull/16347 Is there anyone on the Spark team taking this up? This bug is painful; it's saddened a hundred TB of data I stacked up, and I'm really trying to avoid more manual work. "INSERT OVERWRITE TABLE ...

[GitHub] spark pull request #17394: [SPARK-20067] [SQL] Unify and Clean Up Desc Comma...

2017-04-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17394#discussion_r109558929 --- Diff: sql/core/src/test/resources/sql-tests/results/describe.sql.out --- @@ -1,205 +1,259 @@ -- Automatically generated by SQLQueryTestSuite

[GitHub] spark issue #17394: [SPARK-20067] [SQL] Unify and Clean Up Desc Commands Usi...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17394 **[Test build #75497 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75497/testReport)** for PR 17394 at commit [`862a4d7`](https://github.com/apache/spark/commit/86

[GitHub] spark pull request #15332: [SPARK-10364][SQL] Support Parquet logical type T...

2017-04-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15332

[GitHub] spark issue #15332: [SPARK-10364][SQL] Support Parquet logical type TIMESTAM...

2017-04-03 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/15332 Thanks! Merging to master.

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17494 **[Test build #75496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75496/testReport)** for PR 17494 at commit [`fbcc1fe`](https://github.com/apache/spark/commit/fb

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17494 **[Test build #75495 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75495/testReport)** for PR 17494 at commit [`8936880`](https://github.com/apache/spark/commit/89

[GitHub] spark pull request #17494: [SPARK-20076][ML][PySpark] Add Python interface f...

2017-04-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17494#discussion_r109557018 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlation.scala --- @@ -56,7 +56,7 @@ object Correlation { * Here is how to access the cor

[GitHub] spark issue #17512: [SPARK-20196][PYTHON][SQL] update doc for catalog functi...

2017-04-03 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17512 will update after #17518 + changes to R doc too

[GitHub] spark pull request #17494: [SPARK-20076][ML][PySpark] Add Python interface f...

2017-04-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17494#discussion_r109556837 --- Diff: python/pyspark/ml/stat.py --- @@ -71,6 +71,62 @@ def test(dataset, featuresCol, labelCol): return _java2py(sc, javaTestObj.test(*args))

[GitHub] spark issue #16906: [SPARK-19570][PYSPARK] Allow to disable hive in pyspark ...

2017-04-03 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16906 +1 on that, we do have the log on the R side.

[GitHub] spark issue #17459: [SPARK-20109][MLlib] Added toBlockMatrixDense to Indexed...

2017-04-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17459 I've done some prototyping locally to generalize this change to `SparseMatrix`. While doing so, I wondered whether we have a constraint that all Matrix in `BlockMatrix` need to be the same kind of Matr

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-04-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17415

[GitHub] spark issue #17415: [SPARK-19408][SQL] filter estimation on two columns of s...

2017-04-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17415 Thanks, Merging to master.

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17520 **[Test build #75494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75494/testReport)** for PR 17520 at commit [`0bab4fd`](https://github.com/apache/spark/commit/0b

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-04-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r109554814 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -550,6 +565,220 @@ case clas

[GitHub] spark issue #17415: [SPARK-19408][SQL] filter estimation on two columns of s...

2017-04-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17415 LGTM

[GitHub] spark pull request #17487: [Spark-20145] Fix range case insensitive bug in S...

2017-04-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17487

[GitHub] spark issue #17487: [Spark-20145] Fix range case insensitive bug in SQL

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17487 Merging in master.

[GitHub] spark pull request #17505: [SPARK-20187][SQL] Replace loadTable with moveFil...

2017-04-03 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17505#discussion_r109553390 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -242,6 +251,16 @@ private[client] class Shim_v0_12 extends Shim with

[GitHub] spark pull request #17112: [WIP] Measurement for SPARK-16929.

2017-04-03 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/17112

[GitHub] spark pull request #17336: [SPARK-20003] [ML] FPGrowthModel setMinConfidence...

2017-04-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/17336#discussion_r109548396 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -85,38 +85,58 @@ class FPGrowthSuite extends SparkFunSuite with MLlibTe

[GitHub] spark pull request #17336: [SPARK-20003] [ML] FPGrowthModel setMinConfidence...

2017-04-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/17336#discussion_r109548283 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -85,38 +85,58 @@ class FPGrowthSuite extends SparkFunSuite with MLlibTe

[GitHub] spark issue #15821: [SPARK-13534][PySpark] Using Apache Arrow to increase pe...

2017-04-03 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/15821 Thanks for the review @viirya, I'm working on an update but want to be sure the python tests for arrow get run before I push.

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-03 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r109547685 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2747,6 +2747,17 @@ class Dataset[T] private[sql]( } }

[GitHub] spark issue #17499: [SPARK-20161][CORE] Default log4j properties file should...

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17499 Maybe Hive can do it in Hive.

[GitHub] spark issue #17521: [SPARK-20204][SQL] separate SQLConf into catalyst confs ...

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17521 To be clear, I don't think we should have two separate places to define config entries. If this is what the pr is doing, I strongly veto.

[GitHub] spark issue #17522: [SPARK-18278] [Scheduler] Documentation to point to Kube...

2017-04-03 Thread foxish
Github user foxish commented on the issue: https://github.com/apache/spark/pull/17522 @mridulm, I understand your concern here. This is however an effort from the Kubernetes community (https://github.com/kubernetes/kubernetes/issues/34377), so, the eventuality of a different parallel

[GitHub] spark issue #17522: [SPARK-18278] [Scheduler] Documentation to point to Kube...

2017-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17522 Seems fine to me, since the number of external resource managers is small. We should definitely make it clear there is no firm commitment currently to merge this into Spark though.

[GitHub] spark issue #17521: [SPARK-20204][SQL] separate SQLConf into catalyst confs ...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17521 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75492/ Test FAILed.

[GitHub] spark issue #17521: [SPARK-20204][SQL] separate SQLConf into catalyst confs ...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17521 Merged build finished. Test FAILed.

[GitHub] spark issue #17521: [SPARK-20204][SQL] separate SQLConf into catalyst confs ...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17521 **[Test build #75492 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75492/testReport)** for PR 17521 at commit [`32aaf63`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #16906: [SPARK-19570][PYSPARK] Allow to disable hive in pyspark ...

2017-04-03 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16906 I think this looks reasonable, although it would maybe make sense to add a warning if the user has explicitly requested hive support and we are falling through to non-hive support (e.g. in the excep

[GitHub] spark issue #17375: [SPARK-19019][PYTHON][BRANCH-1.6] Fix hijacked `collecti...

2017-04-03 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/17375 Anaconda defaulting to 3.6 definitely makes this make more sense, thanks @zero323, I had forgotten that. I'll give @davies until next week to say anything about this but otherwise I think the set of bac

[GitHub] spark pull request #17494: [SPARK-20076][ML][PySpark] Add Python interface f...

2017-04-03 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/17494#discussion_r109538706 --- Diff: python/pyspark/ml/stat.py --- @@ -71,6 +71,62 @@ def test(dataset, featuresCol, labelCol): return _java2py(sc, javaTestObj.test(*args)

[GitHub] spark pull request #17494: [SPARK-20076][ML][PySpark] Add Python interface f...

2017-04-03 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/17494#discussion_r109538556 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlation.scala --- @@ -56,7 +56,7 @@ object Correlation { * Here is how to access the co

[GitHub] spark issue #16793: [SPARK-19454][PYTHON][SQL] DataFrame.replace improvement...

2017-04-03 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16793 Let me try and take a look tonight. It seems like there are some small formatting issues still at a quick glance but I feel like this should be close.

[GitHub] spark issue #17328: [SPARK-19975][Python][SQL] Add map_keys and map_values f...

2017-04-03 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/17328 jenkins, ok to test. Does someone on the SQL side have a chance to look at this to say if it's something they want added to the DataFrame API? Maybe @marmbrus ? I'm a little hesitant with adding

[GitHub] spark issue #17523: [SPARK-20064][PySpark]

2017-04-03 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/17523 Thanks for doing this @setjet & welcome to the Spark project :) This change looks good pending jenkins, if everything passes I'll merge it tonight. For others looking at this PR wondering, m

[GitHub] spark issue #17523: [SPARK-20064][PySpark]

2017-04-03 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/17523 Jenkins OK to test.

[GitHub] spark issue #17508: [SPARK-20191][yarn] Crate wrapper for RackResolver so te...

2017-04-03 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/17508 @srowen @tgravescs

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17520 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75493/ Test FAILed.

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17520 Merged build finished. Test FAILed.

[GitHub] spark issue #17522: [SPARK-18278] [Scheduler] Documentation to point to Kube...

2017-04-03 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/17522 I don't think we should be pointing to third-party projects in Spark documentation - for example, it might be possible that some other effort gets merged in instead of the above. If/when it

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17520 **[Test build #75493 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75493/testReport)** for PR 17520 at commit [`4aaab02`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #17523: [SPARK-20064][PySpark]

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17523 Can one of the admins verify this patch?

[GitHub] spark issue #17422: [SPARK-20087] Attach accumulators / metrics to 'TaskKill...

2017-04-03 Thread noodle-fb
Github user noodle-fb commented on the issue: https://github.com/apache/spark/pull/17422 @JoshRosen ping? not sure how to github correctly

[GitHub] spark pull request #17523: [SPARK-20064][PySpark]

2017-04-03 Thread setjet
GitHub user setjet opened a pull request: https://github.com/apache/spark/pull/17523 [SPARK-20064][PySpark] ## What changes were proposed in this pull request? PySpark version in version.py was lagging behind. Versioning is in line with PEP 440: https://www.python.org/dev/pe

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17520 cc: @hvanhovell

[GitHub] spark issue #17087: [SPARK-19372][SQL] Fix throwing a Java exception at df.f...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17087 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75487/ Test PASSed.

[GitHub] spark issue #17087: [SPARK-19372][SQL] Fix throwing a Java exception at df.f...

2017-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17087 Merged build finished. Test PASSed.
