[GitHub] spark pull request: [SPARK-14558][CORE] In ClosureCleaner, clean t...

2016-04-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12327#discussion_r59368815 --- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala --- @@ -77,35 +77,19 @@ private[spark] object ClosureCleaner extends Logging {

[GitHub] spark pull request: [SPARK-14558][CORE] In ClosureCleaner, clean t...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12327#issuecomment-208887603 **[Test build #55604 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55604/consoleFull)** for PR 12327 at commit

[GitHub] spark pull request: [SPARK-14558][CORE] In ClosureCleaner, clean t...

2016-04-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12327#discussion_r59368774 --- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala --- @@ -77,35 +77,19 @@ private[spark] object ClosureCleaner extends Logging {

[GitHub] spark pull request: [SPARK-14558][CORE] In ClosureCleaner, clean t...

2016-04-12 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12327#issuecomment-208886915 cc @yhuai @JoshRosen @rxin --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: [SPARK-14558][CORE] In ClosureCleaner, clean t...

2016-04-12 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/12327 [SPARK-14558][CORE] In ClosureCleaner, clean the outer pointer if it's a REPL line object ## What changes were proposed in this pull request? When we clean a closure, if its outermost

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12299#issuecomment-208883192 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12299#issuecomment-208883187 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12299#issuecomment-208882923 **[Test build #55603 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55603/consoleFull)** for PR 12299 at commit

[GitHub] spark pull request: [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPOR...

2016-04-12 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12303#issuecomment-208882828 LGTM

[GitHub] spark pull request: [SPARK-14485][CORE] ignore task finished for e...

2016-04-12 Thread zhonghaihua
Github user zhonghaihua commented on the pull request: https://github.com/apache/spark/pull/12258#issuecomment-208881577 @JoshRosen @andrewor14 Could you verify this PR? Thanks a lot.

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/12299#issuecomment-208873482 So if the input is vector A, then instead of computing A'A we're computing, for some mean vector M, (A-M)'(A-M) = A'A - A'M - M'A + M'M. A'A remains efficient to
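The expansion quoted in this comment can be checked numerically. The sketch below is not taken from the PR. It assumes that A is an n-row data matrix and that M is the matrix whose every row equals the column-mean vector of A (one reading of "mean vector M"); all helper names are hypothetical. Under that assumption the three terms involving M each equal n times the outer product of the mean vector with itself, so the whole expression collapses to A'A minus that outer-product term.

```python
# Pure-Python check of (A-M)'(A-M) = A'A - A'M - M'A + M'M.
# Assumption: M repeats the column-mean vector of A on every row.

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(x * y for x, y in zip(row, col)) for col in Yt] for row in X]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]   # illustrative data, 3 rows x 2 columns
n = len(A)
mean = [sum(col) / n for col in zip(*A)]   # column means
M = [mean[:] for _ in range(n)]            # mean vector repeated on each row

# Left-hand side: center first, then form the Gram matrix.
centered = sub(A, M)
lhs = matmul(transpose(centered), centered)

# Right-hand side: the four-term expansion.
At, Mt = transpose(A), transpose(M)
rhs = add(sub(sub(matmul(At, A), matmul(At, M)), matmul(Mt, A)), matmul(Mt, M))

# Collapsed form: A'A - n * (mean outer mean), which never materializes A - M.
outer = [[n * mi * mj for mj in mean] for mi in mean]
shortcut = sub(matmul(At, A), outer)
```

The collapsed form is the reason the expansion is attractive: A'A can be accumulated directly from the uncentered rows, and the correction needs only the mean vector and the row count.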

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12299#issuecomment-208867520 **[Test build #55603 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55603/consoleFull)** for PR 12299 at commit

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/12299#issuecomment-208866819 @mengxr I implemented the suggestion. The sparse case is covered now too, but, as you say, I don't know that there's a faster algorithm for this once the data is not

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/12299#discussion_r59363128 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala --- @@ -267,7 +295,6 @@ private[spark] object BLAS extends Serializable with Logging

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/12299#discussion_r59363105 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala --- @@ -237,25 +237,53 @@ private[spark] object BLAS extends Serializable with

[GitHub] spark pull request: [SPARK-14533] [MLLIB] RowMatrix.computeCovaria...

2016-04-12 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/12299#discussion_r59363070 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala --- @@ -237,25 +237,53 @@ private[spark] object BLAS extends Serializable with

[GitHub] spark pull request: [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPOR...

2016-04-12 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12303#issuecomment-208862182 LGTM

[GitHub] spark pull request: [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPOR...

2016-04-12 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12303#issuecomment-208862392 LGTM

[GitHub] spark pull request: [SPARK-13322] [ML] AFTSurvivalRegression suppo...

2016-04-12 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11365#issuecomment-208857657 @mengxr I ran a test on a dataset with a constant nonzero column. * If `fitIntercept==true`, Spark `AFTSurvivalRegression` and R `survreg` output the

[GitHub] spark pull request: [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPOR...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12303#issuecomment-208847478 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPOR...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12303#issuecomment-208847472 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPOR...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12303#issuecomment-208847150 **[Test build #55602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55602/consoleFull)** for PR 12303 at commit

[GitHub] spark pull request: [MINOR][SQL] Remove some unused imports in dat...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12326#issuecomment-208828508 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [MINOR][SQL] Remove some unused imports in dat...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12326#issuecomment-208828507 Merged build finished. Test PASSed.

[GitHub] spark pull request: [MINOR][SQL] Remove some unused imports in dat...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12326#issuecomment-208828190 **[Test build #55601 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55601/consoleFull)** for PR 12326 at commit

[GitHub] spark pull request: [SPARK-14488][SPARK-14493][SQL] "CREATE TEMPOR...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12303#issuecomment-208826385 **[Test build #55602 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55602/consoleFull)** for PR 12303 at commit

[GitHub] spark pull request: [SPARK-14412][ML][WIP]spark.ml ALS prefered st...

2016-04-12 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12202#discussion_r59346482 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -169,6 +169,41 @@ private[recommendation] trait ALSParams extends

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-208812371 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-208812366 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-12 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/12173#issuecomment-208812249 okay, fixed cc: @falaki

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-208812044 **[Test build #55599 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55599/consoleFull)** for PR 12268 at commit

[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12173#issuecomment-208810842 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-208810922 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12173#issuecomment-208810857 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-208810916 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12173#issuecomment-208810207 **[Test build #55598 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55598/consoleFull)** for PR 12173 at commit

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-208810188 **[Test build #55597 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55597/consoleFull)** for PR 12268 at commit

[GitHub] spark pull request: [Minor] [MLlib] remove stale comment

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12325#issuecomment-208798310 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [Minor] [MLlib] remove stale comment

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12325#issuecomment-208798303 Merged build finished. Test PASSed.

[GitHub] spark pull request: [Minor] [MLlib] remove stale comment

2016-04-12 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12325#issuecomment-208798019 LGTM pending test run.

[GitHub] spark pull request: [Minor] [MLlib] remove stale comment

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12325#issuecomment-208797928 **[Test build #55600 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55600/consoleFull)** for PR 12325 at commit

[GitHub] spark pull request: [SPARK-14238][ML][MLLIB][PYSPARK] Add binary t...

2016-04-12 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12079#issuecomment-208797154 A few minor comments, otherwise LGTM. @holdenk @BryanCutler we could merge this and #12308, and then update the param to be shared (if we can do the different

[GitHub] spark pull request: [SPARK-14238][ML][MLLIB][PYSPARK] Add binary t...

2016-04-12 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12079#discussion_r59341296 --- Diff: python/pyspark/ml/feature.py --- @@ -512,14 +512,19 @@ class HashingTF(JavaTransformer, HasInputCol, HasOutputCol, HasNumFeatures, Java ..

[GitHub] spark pull request: [SPARK-14238][ML][MLLIB][PYSPARK] Add binary t...

2016-04-12 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12079#discussion_r59341354 --- Diff: python/pyspark/mllib/feature.py --- @@ -379,6 +379,17 @@ class HashingTF(object): """ def __init__(self, numFeatures=1 << 20):

[GitHub] spark pull request: [MINOR][SQL][DOCS] Remove some unused imports ...

2016-04-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12326#issuecomment-208791039 I also noticed that `SqlNewHadoopRDD` is not used anymore. Just to double check, this wouldn't necessarily mean it has to be removed?

[GitHub] spark pull request: [SPARK-14125] [SQL] Native DDL Support: Alter ...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12324#issuecomment-208788653 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14125] [SQL] Native DDL Support: Alter ...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12324#issuecomment-208788643 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14125] [SQL] Native DDL Support: Alter ...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12324#issuecomment-208787493 **[Test build #55595 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55595/consoleFull)** for PR 12324 at commit

[GitHub] spark pull request: [MINOR][SQL][DOCS] Remove some unused imports ...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12326#issuecomment-208785393 **[Test build #55601 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55601/consoleFull)** for PR 12326 at commit

[GitHub] spark pull request: [SPARK-13432][SQL] add the source file name an...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11301#issuecomment-208783933 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13432][SQL] add the source file name an...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11301#issuecomment-208783925 Merged build finished. Test PASSed.

[GitHub] spark pull request: [MINOR][SQL][DOCS] Remove some unused imports ...

2016-04-12 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/12326 [MINOR][SQL][DOCS] Remove some unused imports in datasources. ## What changes were proposed in this pull request? It looks like several commits for datasources missed removing some unused

[GitHub] spark pull request: [SPARK-13929] Use Scala reflection for UDTs

2016-04-12 Thread joan38
Github user joan38 commented on the pull request: https://github.com/apache/spark/pull/12149#issuecomment-208783120 @marmbrus I think we now have an end-to-end test for both RDDs and Datasets. Is that all good for you?

[GitHub] spark pull request: [SPARK-13432][SQL] add the source file name an...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11301#issuecomment-208783126 **[Test build #55594 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55594/consoleFull)** for PR 11301 at commit

[GitHub] spark pull request: [SPARK-14238][ML][MLLIB][PYSPARK] Add binary t...

2016-04-12 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12079#discussion_r59339176 --- Diff: python/pyspark/ml/tests.py --- @@ -831,6 +831,25 @@ def test_logistic_regression_summary(self):

[GitHub] spark pull request: [SPARK-14238][ML][MLLIB][PYSPARK] Add binary t...

2016-04-12 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12079#discussion_r59338971 --- Diff: python/pyspark/ml/feature.py --- @@ -512,14 +512,19 @@ class HashingTF(JavaTransformer, HasInputCol, HasOutputCol, HasNumFeatures, Java ..

[GitHub] spark pull request: [SPARK-14238][ML][MLLIB][PYSPARK] Add binary t...

2016-04-12 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12079#discussion_r59338620 --- Diff: python/pyspark/ml/feature.py --- @@ -512,14 +512,19 @@ class HashingTF(JavaTransformer, HasInputCol, HasOutputCol, HasNumFeatures, Java ..

[GitHub] spark pull request: [SPARK-14513][CORE] Fix threads left behind af...

2016-04-12 Thread chtyim
Github user chtyim commented on a diff in the pull request: https://github.com/apache/spark/pull/12318#discussion_r59338307 --- Diff: core/src/main/scala/org/apache/spark/HttpServer.scala --- @@ -155,6 +156,7 @@ private[spark] class HttpServer( throw new

[GitHub] spark pull request: [SPARK-13967] [PYSPARK][ML] Added binary Param...

2016-04-12 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12308#issuecomment-208777193 LGTM otherwise.

[GitHub] spark pull request: [Minor] [MLlib] remove stale comment

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12325#issuecomment-208776352 **[Test build #55600 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55600/consoleFull)** for PR 12325 at commit

[GitHub] spark pull request: [SPARK-13967] [PYSPARK][ML] Added binary Param...

2016-04-12 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12308#issuecomment-208773214 @holdenk @BryanCutler I'd say we could make `binary` shared, but the only thing is currently the doc is a bit different between them (the doc for `CountVectorizer`

[GitHub] spark pull request: [Minor] [MLlib] remove stale comment

2016-04-12 Thread hhbyyh
GitHub user hhbyyh opened a pull request: https://github.com/apache/spark/pull/12325 [Minor] [MLlib] remove stale comment ## What changes were proposed in this pull request? Remove the stale comment for KolmogorovSmirnovTest since the implementation has been changed. Let me

[GitHub] spark pull request: [SPARK-13967] [PYSPARK][ML] Added binary Param...

2016-04-12 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12308#issuecomment-208771658 Will take a look

[GitHub] spark pull request: SPARK-14421 : Upgrade Kinesis Client Library (...

2016-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/12209#issuecomment-208770702 @boneill42 if you'll reopen vs master I'll merge it

[GitHub] spark pull request: [SPARK-6429] Implement hashCode and equals tog...

2016-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/12157#issuecomment-208770255 @joan38 what do you think about moving forward with the style check, and at least the changes that are uncontroversial here? some of these are good fixes.

[GitHub] spark pull request: [SPARK-14513][CORE] Fix threads left behind af...

2016-04-12 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/12318#discussion_r59335396 --- Diff: core/src/main/scala/org/apache/spark/HttpServer.scala --- @@ -155,6 +156,7 @@ private[spark] class HttpServer( throw new

[GitHub] spark pull request: [SPARK-2926][Shuffle]Add MR style sort-merge s...

2016-04-12 Thread yaooqinn
Github user yaooqinn commented on the pull request: https://github.com/apache/spark/pull/3438#issuecomment-208769628 Is there any progress on the study of this mechanism?

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-208768839 **[Test build #55599 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55599/consoleFull)** for PR 12268 at commit

[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12173#issuecomment-208766060 **[Test build #55598 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55598/consoleFull)** for PR 12173 at commit

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-208765993 **[Test build #55597 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55597/consoleFull)** for PR 12268 at commit

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r59334349 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -17,153 +17,197 @@ package

[GitHub] spark pull request: [SPARK-14544] [SQL] improve performance of SQL...

2016-04-12 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12311#issuecomment-208763553 @zsxwing Could you help me to fix the streaming test suite? It seems related to this PR.

[GitHub] spark pull request: [SPARK-14238][ML][MLLIB][PYSPARK] Add binary t...

2016-04-12 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/12079#discussion_r59334159 --- Diff: python/pyspark/ml/feature.py --- @@ -512,14 +512,19 @@ class HashingTF(JavaTransformer, HasInputCol, HasOutputCol, HasNumFeatures, Java

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r59334104 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -17,153 +17,197 @@ package

[GitHub] spark pull request: [SPARK-14363] Fix executor OOM due to memory l...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12285#issuecomment-208763429 **[Test build #2777 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2777/consoleFull)** for PR 12285 at commit

[GitHub] spark pull request: [SPARK-14544] [SQL] improve performance of SQL...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12311#issuecomment-208762764 **[Test build #2778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2778/consoleFull)** for PR 12311 at commit

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r59333718 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -17,153 +17,197 @@ package

[GitHub] spark pull request: [SPARK-14555] First cut of Python API for Stru...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12320#issuecomment-208761316 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14555] First cut of Python API for Stru...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12320#issuecomment-208761311 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14555] First cut of Python API for Stru...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12320#issuecomment-208760904 **[Test build #55592 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55592/consoleFull)** for PR 12320 at commit

[GitHub] spark pull request: [SPARK-14508][BUILD] Add a new ScalaStyle Rule...

2016-04-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12280

[GitHub] spark pull request: [SPARK-14508][BUILD] Add a new ScalaStyle Rule...

2016-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/12280#issuecomment-208760839 Yeah I scanned it and looks good

[GitHub] spark pull request: [SPARK-14508][BUILD] Add a new ScalaStyle Rule...

2016-04-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12280#issuecomment-208759696 Thanks - this looks good. I'm going to merge it, assuming @srowen took a detailed look.

[GitHub] spark pull request: [SPARK-13089][ML] [Doc] spark.ml Naive Bayes u...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11015#issuecomment-208754701 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13089][ML] [Doc] spark.ml Naive Bayes u...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11015#issuecomment-208754700 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14513][CORE] Fix threads left behind af...

2016-04-12 Thread chtyim
Github user chtyim commented on a diff in the pull request: https://github.com/apache/spark/pull/12318#discussion_r59331847 --- Diff: core/src/main/scala/org/apache/spark/HttpServer.scala --- @@ -155,6 +156,7 @@ private[spark] class HttpServer( throw new

[GitHub] spark pull request: [SPARK-13089][ML] [Doc] spark.ml Naive Bayes u...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11015#issuecomment-208754612 **[Test build #55596 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55596/consoleFull)** for PR 11015 at commit

[GitHub] spark pull request: [SPARK-14473][SQL] Define analysis rules to ca...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12246#issuecomment-208752572 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14473][SQL] Define analysis rules to ca...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12246#issuecomment-208752569 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-13568] [ML] Create feature transformer ...

2016-04-12 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/11601#discussion_r59331281 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14473][SQL] Define analysis rules to ca...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12246#issuecomment-208752097 **[Test build #55590 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55590/consoleFull)** for PR 12246 at commit

[GitHub] spark pull request: [SPARK-14549][ML][WIP] Copy the Vector and Mat...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12317#issuecomment-208751778 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13089][ML] [Doc] spark.ml Naive Bayes u...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11015#issuecomment-208751703 **[Test build #55596 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55596/consoleFull)** for PR 11015 at commit

[GitHub] spark pull request: [SPARK-14549][ML][WIP] Copy the Vector and Mat...

2016-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12317#issuecomment-208751776 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-13568] [ML] Create feature transformer ...

2016-04-12 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/11601#discussion_r59330944 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-14549][ML][WIP] Copy the Vector and Mat...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12317#issuecomment-208751521 **[Test build #55589 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55589/consoleFull)** for PR 12317 at commit

[GitHub] spark pull request: [SPARK-14125] [SQL] Native DDL Support: Alter ...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12324#issuecomment-208746010 **[Test build #55595 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55595/consoleFull)** for PR 12324 at commit

[GitHub] spark pull request: [SPARK-14125] [SQL] Native DDL Support: Alter ...

2016-04-12 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/12324 [SPARK-14125] [SQL] Native DDL Support: Alter View What changes were proposed in this pull request? This PR is to provide a native DDL support for the following three Alter View

[GitHub] spark pull request: [SPARK-13432][SQL] add the source file name an...

2016-04-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11301#issuecomment-208741852 **[Test build #55594 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55594/consoleFull)** for PR 11301 at commit

[GitHub] spark pull request: [STREAMING] SPARK-2009 Key not found exception...

2016-04-12 Thread ouyangshourui
Github user ouyangshourui commented on the pull request: https://github.com/apache/spark/pull/961#issuecomment-208741187 I use Spark 1.5.2 on YARN and am also hitting the same problem: 16/04/12 14:36:45 ERROR scheduler.DAGScheduler: Failed to update accumulators for ShuffleMapTask(32566,
