[GitHub] spark pull request: [SPARK-11940][PYSPARK][ML] Python API for ml.c...

2016-04-28 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12723#issuecomment-215581769 Ready now? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12673

[GitHub] spark pull request: [SPARK-14716][SQL] Added support for partition...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12409#issuecomment-215581159 **[Test build #2911 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2911/consoleFull)** for PR 12409 at commit

[GitHub] spark pull request: [SPARK-14716][SQL] Added support for partition...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12409#issuecomment-215581123 **[Test build #2910 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2910/consoleFull)** for PR 12409 at commit

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215579990 Merging this. Thank you very much @brkyvz

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215578529 **[Test build #2909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2909/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215578315 **[Test build #2908 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2908/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14990][SQL] nvl, coalesce, array with p...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12768#issuecomment-215577290 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-14990][SQL] nvl, coalesce, array with p...

2016-04-28 Thread dosoft
GitHub user dosoft opened a pull request: https://github.com/apache/spark/pull/12768 [SPARK-14990][SQL] nvl, coalesce, array with parameter of type 'array' ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How

[GitHub] spark pull request: [SPARK-11940][PYSPARK][ML] Python API for ml.c...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12723#issuecomment-215576476 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11940][PYSPARK][ML] Python API for ml.c...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12723#issuecomment-215576473 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-11940][PYSPARK][ML] Python API for ml.c...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12723#issuecomment-215576378 **[Test build #57281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57281/consoleFull)** for PR 12723 at commit

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12616#issuecomment-215576137 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12616#issuecomment-215576141 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12616#issuecomment-215575885 **[Test build #57275 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57275/consoleFull)** for PR 12616 at commit

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215575102 **[Test build #57283 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57283/consoleFull)** for PR 12761 at commit

[GitHub] spark pull request: [SPARK-14802] [SQL] [WIP] Disable Passing to H...

2016-04-28 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12692#issuecomment-215574915 Another option is to use a different API to drop multiple partitions by a single command. ```JAVA public List dropPartitions(String dbName, String

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215574717 **[Test build #2909 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2909/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14850][ML] convert primitive array from...

2016-04-28 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/12640#discussion_r61509114 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java --- @@ -336,4 +336,78 @@ public UnsafeArrayData copy() {

[GitHub] spark pull request: [SPARK-14850][ML] convert primitive array from...

2016-04-28 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/12640#discussion_r61509122 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java --- @@ -336,4 +336,62 @@ public UnsafeArrayData copy() {

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215574347 **[Test build #2908 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2908/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573982 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573910 Jenkins, test this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573882 test this please

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573656 Jenkins, please test it again.

[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...

2016-04-28 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12767#discussion_r61508511 --- Diff: python/pyspark/ml/tests.py --- @@ -586,10 +589,13 @@ def test_fit_maximize_metric(self): tvsModel = tvs.fit(dataset)

[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...

2016-04-28 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12767#discussion_r61508260 --- Diff: python/pyspark/ml/tests.py --- @@ -616,6 +622,7 @@ def test_save_load(self): tvsModel.save(tvsModelPath) loadedModel =

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215572538 Thanks. I'm pretty confident this does what it's supposed to do; my main concern is to make sure performance doesn't degrade for anything else. The

[GitHub] spark pull request: [SPARK-14830][SQL] Add RemoveRepetitionFromGro...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12590#issuecomment-215572360 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14830][SQL] Add RemoveRepetitionFromGro...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12590#issuecomment-215572352 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14830][SQL] Add RemoveRepetitionFromGro...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12590#issuecomment-215572100 **[Test build #57274 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57274/consoleFull)** for PR 12590 at commit

[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...

2016-04-28 Thread vectorijk
Github user vectorijk commented on a diff in the pull request: https://github.com/apache/spark/pull/12767#discussion_r61507716 --- Diff: python/pyspark/ml/tuning.py --- @@ -613,7 +615,9 @@ def copy(self, extra=None): """ if extra is None:

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-28 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12720#discussion_r61507610 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -644,9 +643,10 @@ object InferFiltersFromConstraints

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215571604 **[Test build #2907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2907/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215570990 **[Test build #2906 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2906/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215570785 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215570788 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14989][BUILD] Upgrade Jackson from 2.5....

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12766#issuecomment-215570639 **[Test build #57282 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57282/consoleFull)** for PR 12766 at commit

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215570576 **[Test build #57272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57272/consoleFull)** for PR 12761 at commit

[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12767#issuecomment-215570555 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...

2016-04-28 Thread taku-k
GitHub user taku-k opened a pull request: https://github.com/apache/spark/pull/12767 [SPARK-14978][PySpark] PySpark TrainValidationSplitModel should support validationMetrics ## What changes were proposed in this pull request? This pull request includes supporting

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215569984 **[Test build #2905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2905/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14972] Improve performance of JSON sche...

2016-04-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/12750#issuecomment-215569915 /cc @NathanHowell, FYI.

[GitHub] spark pull request: [SPARK-14989] Upgrade Jackson from 2.5.3 to 2....

2016-04-28 Thread JoshRosen
GitHub user JoshRosen opened a pull request: https://github.com/apache/spark/pull/12766 [SPARK-14989] Upgrade Jackson from 2.5.3 to 2.7.3 This patch upgrades Jackson from 2.5.3 to 2.7.3. I'd like to upgrade now in order to take advantage of new performance improvements and

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-28 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12720#discussion_r61505828 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -442,7 +443,7 @@ class Analyzer( */ object

[GitHub] spark pull request: [SPARK-14850][ML] convert primitive array from...

2016-04-28 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/12640#discussion_r61505845 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala --- @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-11940][PYSPARK][ML] Python API for ml.c...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12723#issuecomment-215566571 **[Test build #57281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57281/consoleFull)** for PR 12723 at commit

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215564655 Seems to be very promising. Since the 2.0 window will close soon, this is unlikely to get into 2.0. Let's target 2.1.

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12464

[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215564542 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11940][PYSPARK][ML] Python API for ml.c...

2016-04-28 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12723#issuecomment-215566123 test this please

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215565677 LGTM. I am testing this for flakiness, after which I will merge it. The lack of ctx.streams is causing flakiness and blocking other PRs.

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-28 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12464#issuecomment-215565488 LGTM Merging with master Thanks!

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215565423 **[Test build #2907 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2907/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-12810][PySpark] PySpark CrossValidatorM...

2016-04-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12464#discussion_r61504179 --- Diff: python/pyspark/ml/tests.py --- @@ -461,6 +461,31 @@ def _fit(self, dataset): class CrossValidatorTests(PySparkTestCase): +

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215565371 **[Test build #2906 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2906/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14555] Second cut of Python API for Str...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12673#issuecomment-215565327 **[Test build #2905 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2905/consoleFull)** for PR 12673 at commit

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215565289 **[Test build #57279 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57279/consoleFull)** for PR 12764 at commit

[GitHub] spark pull request: [SPARK-14988][PYTHON] SparkSession catalog and...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12765#issuecomment-215565286 **[Test build #57280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57280/consoleFull)** for PR 12765 at commit

[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215564535 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215564274 **[Test build #57271 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57271/consoleFull)** for PR 12719 at commit

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215564208 Note those results above are the results from my production data. For comparison, I'm told by one of our data scientists the training can be done locally

[GitHub] spark pull request: [SPARK-14988][PYTHON] SparkSession catalog and...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12765#issuecomment-215564024 **[Test build #57278 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57278/consoleFull)** for PR 12765 at commit

[GitHub] spark pull request: [SPARK-14882] [DOCS] Clarify that Spark can be...

2016-04-28 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/12757#discussion_r61503201 --- Diff: docs/programming-guide.md --- @@ -24,7 +24,8 @@ along with if you launch Spark's interactive shell -- either `bin/spark-shell` f

[GitHub] spark pull request: [SPARK-14988][Python] Python SparkSession cata...

2016-04-28 Thread andrewor14
GitHub user andrewor14 opened a pull request: https://github.com/apache/spark/pull/12765 [SPARK-14988][Python] Python SparkSession catalog and conf API ## What changes were proposed in this pull request? The `catalog` and `conf` APIs were exposed in `SparkSession` in #12713

[GitHub] spark pull request: [SPARK-14891][ML] Add schema validation for AL...

2016-04-28 Thread BenFradet
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/12762#issuecomment-215563782 LGTM except for a few minors.

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/12616#issuecomment-215562885 @marmbrus Please take a look once again.

[GitHub] spark pull request: [SPARK-14802] [SQL] [WIP] Disable Passing to H...

2016-04-28 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12692#issuecomment-215562689 Found a serious issue in the existing `Drop Partition`. Now, we are able to drop multiple partitions using a single call. However, this could break atomicity. I

[GitHub] spark pull request: [SPARK-14891][ML] Add schema validation for AL...

2016-04-28 Thread BenFradet
Github user BenFradet commented on a diff in the pull request: https://github.com/apache/spark/pull/12762#discussion_r61502301 --- Diff: mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala --- @@ -58,6 +58,30 @@ object MLTestingUtils extends SparkFunSuite {

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215561740 I'll give the results of my own training flow too. Testing was done on EMR 4.4.0 with Spark 1.6.0. The cluster was configured with six r3.8xlarge nodes:

[GitHub] spark pull request: [SPARK-14891][ML] Add schema validation for AL...

2016-04-28 Thread BenFradet
Github user BenFradet commented on a diff in the pull request: https://github.com/apache/spark/pull/12762#discussion_r61501824 --- Diff: mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala --- @@ -58,6 +58,30 @@ object MLTestingUtils extends SparkFunSuite {

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215561453 You may use some fake data to demonstrate how this PR improves. Thanks.

[GitHub] spark pull request: [SPARK-13902][SCHEDULER] Make DAGScheduler.get...

2016-04-28 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/12655#issuecomment-215560207 Sorry, on re-reading my own post I realize that I wasn't completely clear. I'm saying that the two-line change in DAGScheduler.scala from

[GitHub] spark pull request: [SPARK-10884] [ML] Support prediction on singl...

2016-04-28 Thread sethah
Github user sethah commented on the pull request: https://github.com/apache/spark/pull/8883#issuecomment-215559305 ping @yanboliang Is this still on anyone's radar? I think it would be really useful. I can help review if needed. Maybe start by fixing merge conflicts? :D

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215557197 Is there some Spark benchmark you want me to run?

[GitHub] spark pull request: [SPARK-13902][SCHEDULER] Make DAGScheduler.get...

2016-04-28 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/12655#issuecomment-215556785 I'm fine with the change in https://github.com/apache/spark/pull/8923 I'm looking at refactoring of `newOrUsedShuffleStage` that may effectively turn it into just

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215556734 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215556733 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215556721 **[Test build #57277 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57277/consoleFull)** for PR 12764 at commit

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215556461 **[Test build #2904 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2904/consoleFull)** for PR 12764 at commit

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215556226 **[Test build #57277 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57277/consoleFull)** for PR 12764 at commit

[GitHub] spark pull request: [SPARK-14882] [DOCS] Clarify that Spark can be...

2016-04-28 Thread benmccann
Github user benmccann commented on a diff in the pull request: https://github.com/apache/spark/pull/12757#discussion_r61498763 --- Diff: docs/programming-guide.md --- @@ -24,7 +24,8 @@ along with if you launch Spark's interactive shell -- either `bin/spark-shell` f

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-21911 LGTM pending Jenkins.

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-21990 **[Test build #2904 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2904/consoleFull)** for PR 12764 at commit

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12616#discussion_r61498520 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -439,6 +442,105 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12616#discussion_r61498450 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -98,20 +109,49 @@ class FileStreamSource(

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12616#discussion_r61498388 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -98,20 +109,49 @@ class FileStreamSource(

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12616#discussion_r61498403 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -98,20 +109,49 @@ class FileStreamSource(

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215554998 **[Test build #57276 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57276/consoleFull)** for PR 12416 at commit

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12616#discussion_r61498250 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -55,10 +62,12 @@ class FileStreamSource( */

[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...

2016-04-28 Thread davies
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/12764 [SPARK-14987] [SQL] inline hive-service (cli) into sql/hive-thriftserver ## What changes were proposed in this pull request? This PR copies the thrift-server from hive-service-1.2 (including

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12616#discussion_r61498117 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -55,10 +62,12 @@ class FileStreamSource( */

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12616#discussion_r61498063 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -175,25 +175,11 @@ case class DataSource(

[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215553815 Can you also post the benchmark result with/without this PR for very sparse features? Thanks.

[GitHub] spark pull request: [SPARK-14837][SQL][STREAMING] Added support in...

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12616#issuecomment-215553704 **[Test build #57275 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57275/consoleFull)** for PR 12616 at commit

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-215553142 LGTM except one minor styling issue. Once that is updated, and tests pass, I'll go ahead and merge it. Thank you very much.

[GitHub] spark pull request: [SPARK-14891][ML] Add schema validation for AL...

2016-04-28 Thread BenFradet
Github user BenFradet commented on a diff in the pull request: https://github.com/apache/spark/pull/12762#discussion_r61497201 --- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala --- @@ -512,6 +513,60 @@ class ALSSuite

[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...

2016-04-28 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/12416#discussion_r61496950 --- Diff: project/SparkBuild.scala --- @@ -50,10 +50,11 @@ object BuildCommons { ).map(ProjectRef(buildLocation, _)) val allProjects@Seq(

[GitHub] spark pull request: [SPARK-14984][ML] Deprecated model field in Li...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12763#issuecomment-215551795 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14984][ML] Deprecated model field in Li...

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12763#issuecomment-215551797 Test PASSed. Refer to this link for build results (access rights to CI server needed):
