[GitHub] spark issue #15874: [Spark-18408] API Improvements for LSH

2016-11-14 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15874 Thanks @yunni, I can take a look at this today. I would prefer to separate the addition of "AND-amplification" into another PR since the other changes I believe we'd like to get into 2.1, whereas

[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-11-14 Thread jamartinh
Github user jamartinh commented on the issue: https://github.com/apache/spark/pull/14638 This is necessary, please add this to a release! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15763 **[Test build #68623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68623/consoleFull)** for PR 15763 at commit

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-14 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/15763 @hvanhovell I removed the redundant checking in `ScalarSubquery` as pointed out in your earlier comment (copied below). "I was also wondering if we still need the following

[GitHub] spark issue #15777: [SPARK-18282][ML][PYSPARK] Add python clustering summari...

2016-11-14 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15777 ping @yanboliang

[GitHub] spark issue #15883: [SPARK-18438][SPARKR][ML] spark.mlp should support RForm...

2016-11-14 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/15883 cc @felixcheung

[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15883#discussion_r87828863 --- Diff: R/pkg/inst/tests/testthat/test_mllib.R --- @@ -385,7 +386,7 @@ test_that("spark.mlp", { # Test predict method mlpTestDF <- df

[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15883#discussion_r87829586 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/MultilayerPerceptronClassifierWrapper.scala --- @@ -73,25 +90,25 @@ private[r] object

[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15883#discussion_r87827463 --- Diff: R/pkg/R/mllib.R --- @@ -936,20 +939,23 @@ setMethod("predict", signature(object = "MultilayerPerceptronClassificationModel # Returns the

[GitHub] spark issue #15883: [SPARK-18438][SPARKR][ML] spark.mlp should support RForm...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15883 **[Test build #68622 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68622/consoleFull)** for PR 15883 at commit

[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/15883 [SPARK-18438][SPARKR][ML] spark.mlp should support RFormula. ## What changes were proposed in this pull request? ```spark.mlp``` should support ```RFormula``` like other ML algorithm

[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/15881#discussion_r87826206 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -171,7 +171,10 @@ class LinearRegression @Since("1.3.0")

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15763 **[Test build #68621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68621/consoleFull)** for PR 15763 at commit

[GitHub] spark pull request #15763: [SPARK-17348][SQL] Incorrect results from subquer...

2016-11-14 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/15763#discussion_r87823864 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1069,11 +1110,19 @@ class Analyzer( case a @

[GitHub] spark issue #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip instal...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15659 **[Test build #68620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68620/consoleFull)** for PR 15659 at commit

[GitHub] spark issue #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip instal...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15659 **[Test build #68619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68619/consoleFull)** for PR 15659 at commit

[GitHub] spark issue #15780: [SPARK-18284][SQL] Make ExpressionEncoder.serializer.nul...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15780 **[Test build #68618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68618/consoleFull)** for PR 15780 at commit

[GitHub] spark issue #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip instal...

2016-11-14 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15659 @JoshRosen: So I took a look at doing that on the flight and I think we will want to keep the release-build tagging so that the artifacts are correct for the different hadoop versions, but we can

[GitHub] spark issue #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on W...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15866 Merged build finished. Test PASSed.

[GitHub] spark issue #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on W...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15866 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68615/ Test PASSed.

[GitHub] spark issue #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on W...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15866 **[Test build #68615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68615/consoleFull)** for PR 15866 at commit

[GitHub] spark pull request #15763: [SPARK-17348][SQL] Incorrect results from subquer...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15763#discussion_r87813912 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1041,12 +1070,24 @@ class Analyzer(

[GitHub] spark pull request #15763: [SPARK-17348][SQL] Incorrect results from subquer...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15763#discussion_r87814081 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1069,11 +1110,19 @@ class Analyzer(

[GitHub] spark pull request #15763: [SPARK-17348][SQL] Incorrect results from subquer...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15763#discussion_r87814868 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1069,11 +1110,19 @@ class Analyzer(

[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87809078 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FoldablePropagationSuite.scala --- @@ -118,14 +118,30 @@ class

[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87809029 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation

[GitHub] spark issue #15857: [SPARK-18300][SQL] Do not apply foldable propagation wit...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15857 **[Test build #68617 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68617/consoleFull)** for PR 15857 at commit

[GitHub] spark issue #15763: [SPARK-17348][SQL] Incorrect results from subquery trans...

2016-11-14 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/15763 @hvanhovell could you please review the latest PR?

[GitHub] spark pull request #15871: [SPARK-17116][Pyspark] Allow parameters to be {st...

2016-11-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15871#discussion_r87807344 --- Diff: python/pyspark/ml/base.py --- @@ -59,6 +59,12 @@ def fit(self, dataset, params=None): return [self.fit(dataset, paramMap) for

[GitHub] spark issue #15865: [SPARK-18420][BUILD] Fix the errors caused by lint check...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15865 **[Test build #3425 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3425/consoleFull)** for PR 15865 at commit

[GitHub] spark issue #15867: [SPARK-18423][Streaming] ReceiverTracker should close ch...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15867 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68616/ Test PASSed.

[GitHub] spark issue #15867: [SPARK-18423][Streaming] ReceiverTracker should close ch...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15867 Merged build finished. Test PASSed.

[GitHub] spark issue #15867: [SPARK-18423][Streaming] ReceiverTracker should close ch...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15867 **[Test build #68616 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68616/consoleFull)** for PR 15867 at commit

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15880 Yeah, you are totally right about that. I like this approach; the only thing bothering me is that this breaks backwards compatibility.

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87799880 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala --- @@ -0,0 +1,131 @@ +/* + *

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87799629 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAggSuite.scala --- @@ -0,0 +1,284 @@ +/*

[GitHub] spark pull request #15865: [SPARK-18420][BUILD] Fix the errors caused by lin...

2016-11-14 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/15865#discussion_r87799361 --- Diff: dev/checkstyle-suppressions.xml --- @@ -30,6 +30,8 @@ + --- End diff -- @HyukjinKwon Also we could try `//

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87799293 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAggSuite.scala --- @@ -0,0 +1,284 @@ +/*

[GitHub] spark pull request #15865: [SPARK-18420][BUILD] Fix the errors caused by lin...

2016-11-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15865#discussion_r87798899 --- Diff: dev/checkstyle-suppressions.xml --- @@ -30,6 +30,8 @@ + --- End diff -- Ah, I thought we could disable it

[GitHub] spark pull request #15865: [SPARK-18420][BUILD] Fix the errors caused by lin...

2016-11-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15865#discussion_r87798658 --- Diff: dev/checkstyle-suppressions.xml --- @@ -30,6 +30,8 @@ + --- End diff -- Oh, sorry. Actually, I didn't mean

[GitHub] spark issue #15865: [SPARK-18420][BUILD] Fix the errors caused by lint check...

2016-11-14 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/15865 @HyukjinKwon Thanks for the review and suggestion, I've updated it. Clear the unused object `hasher` and add suppression rules for the method `finalize` of `NioBufferedFileInputStream`. Please

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87797412 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala --- @@ -0,0 +1,131 @@ +/* + *

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87797485 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala --- @@ -0,0 +1,131 @@ +/* + *

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15880 Merged build finished. Test FAILed.

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15880 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68612/ Test FAILed.

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15880 **[Test build #68612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68612/consoleFull)** for PR 15880 at commit

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87796816 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala --- @@ -0,0 +1,131 @@ +/* + *

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87796671 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala --- @@ -0,0 +1,131 @@ +/* + *

[GitHub] spark issue #15881: [SPARK-18434][ML] Add missing ParamValidations for ML al...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15881 Merged build finished. Test PASSed.

[GitHub] spark pull request #15849: [SPARK-18410][STREAMING] Add structured kafka exa...

2016-11-14 Thread koeninger
Github user koeninger commented on a diff in the pull request: https://github.com/apache/spark/pull/15849#discussion_r87795091 --- Diff: examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java --- @@ -0,0 +1,96 @@ +/* + * Licensed

[GitHub] spark issue #15881: [SPARK-18434][ML] Add missing ParamValidations for ML al...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15881 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68613/ Test PASSed.

[GitHub] spark issue #15881: [SPARK-18434][ML] Add missing ParamValidations for ML al...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15881 **[Test build #68613 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68613/consoleFull)** for PR 15881 at commit

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87794405 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala --- @@ -0,0 +1,131 @@ +/* + *

[GitHub] spark issue #15882: [SPARK-18400][STREAMING] NPE when resharding Kinesis Str...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15882 Merged build finished. Test PASSed.

[GitHub] spark issue #15882: [SPARK-18400][STREAMING] NPE when resharding Kinesis Str...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15882 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68614/ Test PASSed.

[GitHub] spark issue #15882: [SPARK-18400][STREAMING] NPE when resharding Kinesis Str...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15882 **[Test build #68614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68614/consoleFull)** for PR 15882 at commit

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15880 We can only know if this string is castable at runtime. BTW, other databases (like MySQL) have special implicit type conversion rules for constants; should we follow them?

[GitHub] spark issue #15867: [SPARK-18423][Streaming] ReceiverTracker should close ch...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15867 **[Test build #68616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68616/consoleFull)** for PR 15867 at commit

[GitHub] spark issue #15867: [SPARK-18423][Streaming] ReceiverTracker should close ch...

2016-11-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15867 Build started: [Streaming] `org.apache.spark.streaming.JavaAPISuite`

[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...

2016-11-14 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15803 Hm, so I looked at how the other UIs work, and they seem to not be in GMT always. They happen to use the machines' default time zone by way of using a `SimpleDateFormat` to render times. So it has

[GitHub] spark issue #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on W...

2016-11-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15866 Build started: [CORE] `org.apache.spark.JavaAPISuite`

[GitHub] spark issue #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on W...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15866 **[Test build #68615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68615/consoleFull)** for PR 15866 at commit

[GitHub] spark pull request #15879: [SPARK-18432][DOC] Changed HDFS default block siz...

2016-11-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15879

[GitHub] spark issue #15879: [SPARK-18432][DOC] Changed HDFS default block size from ...

2016-11-14 Thread sarutak
Github user sarutak commented on the issue: https://github.com/apache/spark/pull/15879 O.K. Merging into `master`, `branch-2.0` and `branch-2.1`. Thanks @moomindani !

[GitHub] spark issue #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on W...

2016-11-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15866 Let me add a comment here and will try to clean up more.

[GitHub] spark issue #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on W...

2016-11-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15866 Thank you Sean. Actually, this is a bit annoying. Here is what happens in the original test. 1. Writes a file to read back by `wholeTextFiles`. ```scala scala>

[GitHub] spark pull request #15869: [YARN][DOC] Update Yarn configuration doc

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15869#discussion_r87787683 --- Diff: docs/running-on-yarn.md --- @@ -495,6 +468,20 @@ To use a custom metrics.properties for the application master and executors, upd name

[GitHub] spark pull request #15869: [YARN][DOC] Update Yarn configuration doc

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15869#discussion_r87787755 --- Diff: docs/running-on-yarn.md --- @@ -495,6 +468,20 @@ To use a custom metrics.properties for the application master and executors, upd name

[GitHub] spark pull request #15869: [YARN][DOC] Update Yarn configuration doc

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15869#discussion_r87787316 --- Diff: docs/configuration.md --- @@ -156,6 +156,13 @@ of the most common options to set are: + spark.executor.instances --- End

[GitHub] spark pull request #15869: [YARN][DOC] Update Yarn configuration doc

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15869#discussion_r87787468 --- Diff: docs/running-on-yarn.md --- @@ -118,19 +118,6 @@ To use a custom metrics.properties for the application master and executors, upd

[GitHub] spark pull request #15869: [YARN][DOC] Update Yarn configuration doc

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15869#discussion_r87787674 --- Diff: docs/running-on-yarn.md --- @@ -495,6 +468,20 @@ To use a custom metrics.properties for the application master and executors, upd name

[GitHub] spark pull request #14612: [SPARK-16803] [SQL] SaveAsTable does not work whe...

2016-11-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14612#discussion_r87787158 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -89,6 +89,22 @@ case class

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15880 This might be a bad idea: should we follow the old casting strategy if we cannot cast from string to atomic datatype?

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-14 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r87785949 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala --- @@ -0,0 +1,131 @@ +/* + *

[GitHub] spark pull request #15868: [SPARK-18413][SQL] Control the number of JDBC con...

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15868#discussion_r87785862 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -667,9 +667,15 @@ object JdbcUtils extends Logging

[GitHub] spark issue #15882: [SPARK-18400][STREAMING] NPE when resharding Kinesis Str...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15882 **[Test build #68614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68614/consoleFull)** for PR 15882 at commit

[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15881#discussion_r87784943 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -171,7 +171,10 @@ class LinearRegression @Since("1.3.0")

[GitHub] spark issue #15881: [SPARK-18434][ML] Add missing ParamValidations for ML al...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15881 **[Test build #68613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68613/consoleFull)** for PR 15881 at commit

[GitHub] spark pull request #15882: [SPARK-18400][STREAMING] NPE when resharding Kine...

2016-11-14 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/15882 [SPARK-18400][STREAMING] NPE when resharding Kinesis Stream ## What changes were proposed in this pull request? Avoid NPE in KinesisRecordProcessor when shutdown happens without successful

[GitHub] spark pull request #15882: [SPARK-18400][STREAMING] NPE when resharding Kine...

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15882#discussion_r87784660 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala --- @@ -102,27 +101,29 @@ private[kinesis]

[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/15881 [SPARK-18434][ML] Add missing ParamValidations for ML algos ## What changes were proposed in this pull request? Add missing ParamValidations for ML algos ## How was this patch tested?

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15880 **[Test build #68612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68612/consoleFull)** for PR 15880 at commit

[GitHub] spark pull request #15838: [SPARK-18396][HISTORYSERVER]"Duration" column mak...

2016-11-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15838

[GitHub] spark issue #15877: [SPARK-18429] [SQL] implement a new Aggregate for CountM...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15877 Merged build finished. Test PASSed.

[GitHub] spark issue #15877: [SPARK-18429] [SQL] implement a new Aggregate for CountM...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15877 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68610/ Test PASSed.

[GitHub] spark issue #15877: [SPARK-18429] [SQL] implement a new Aggregate for CountM...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15877 **[Test build #68610 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68610/consoleFull)** for PR 15877 at commit

[GitHub] spark issue #15655: [SPARK-18010][CORE] Reduce work performed for building u...

2016-11-14 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15655 Merged to 2.0

[GitHub] spark issue #15683: [SPARK-18166][MLlib] Fix Poisson GLM bug due to wrong re...

2016-11-14 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15683 Merged to master/2.1. @actuaryzhang it doesn't pick cleanly into 2.0 but if you want to open a new PR for 2.0 I'll merge that too.

[GitHub] spark pull request #15683: [SPARK-18166][MLlib] Fix Poisson GLM bug due to w...

2016-11-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15683

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15880 cc @yhuai @gatorsmile

[GitHub] spark issue #15859: [SPARK-18416][Structured Streaming] Fixed temp file leak...

2016-11-14 Thread tdas
Github user tdas commented on the issue: https://github.com/apache/spark/pull/15859 @zsxwing can you take another look.

[GitHub] spark pull request #15880: [SPARK-17913][SQL] compare long and string type c...

2016-11-14 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/15880 [SPARK-17913][SQL] compare long and string type column may return confusing result ## What changes were proposed in this pull request? Spark SQL follows MySQL to do the implicit type

[GitHub] spark issue #15840: [SPARK-18398][SQL] Fix nullabilities of MapObjects and o...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15840 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68607/ Test PASSed.

[GitHub] spark issue #15840: [SPARK-18398][SQL] Fix nullabilities of MapObjects and o...

2016-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15840 Merged build finished. Test PASSed.

[GitHub] spark issue #15840: [SPARK-18398][SQL] Fix nullabilities of MapObjects and o...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15840 **[Test build #68607 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68607/consoleFull)** for PR 15840 at commit

[GitHub] spark issue #15683: [SPARK-18166][MLlib] Fix Poisson GLM bug due to wrong re...

2016-11-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15683 **[Test build #3424 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3424/consoleFull)** for PR 15683 at commit

[GitHub] spark issue #15780: [SPARK-18284][SQL] Make ExpressionEncoder.serializer.nul...

2016-11-14 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/15780 Needs `AssertNotNull()` [here](https://github.com/kiszk/spark/blob/38991d00cbaa50ffc9d22c54f643ed03e51b4785/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L577) if

[GitHub] spark issue #15871: [SPARK-17116][Pyspark] Allow parameters to be {string,va...

2016-11-14 Thread aditya1702
Github user aditya1702 commented on the issue: https://github.com/apache/spark/pull/15871 @HyukjinKwon I have updated the code. Could you please take a look

[GitHub] spark issue #15840: [SPARK-18398][SQL] Fix nullabilities of MapObjects and o...

2016-11-14 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/15840 @hvanhovell The pattern basically I used to replace nullability checking here is like: ```scala val eval = child.genCode(ctx) ev.copy(code = s""" ${eval.code} boolean

[GitHub] spark pull request #15871: [SPARK-17116][Pyspark] Allow parameters to be {st...

2016-11-14 Thread aditya1702
Github user aditya1702 commented on a diff in the pull request: https://github.com/apache/spark/pull/15871#discussion_r87766961 --- Diff: python/pyspark/ml/base.py --- @@ -59,6 +59,12 @@ def fit(self, dataset, params=None): return [self.fit(dataset, paramMap) for
