[GitHub] spark issue #14705: [SPARK-16508][SparkR] Fix CRAN undocumented/duplicated a...

2016-08-19 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/14705 I think it's ok to put @param ... on top of the function in the order we want it. Or @param na.rm for sd, var etc. Yes it is a bit odd to have param in the documentation block that is not in

[GitHub] spark issue #12436: [SPARK-14649][CORE] DagScheduler should not run duplicat...

2016-08-19 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/12436 ping. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the

[GitHub] spark issue #14666: [SPARK-16578][SparkR] Enable SparkR to connect to a remo...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14666 **[Test build #64083 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64083/consoleFull)** for PR 14666 at commit

[GitHub] spark issue #14666: [SPARK-16578][SparkR] Enable SparkR to connect to a remo...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14666 **[Test build #64084 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64084/consoleFull)** for PR 14666 at commit

[GitHub] spark issue #14666: [SPARK-16578][SparkR] Enable SparkR to connect to a remo...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14666 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64084/ Test FAILed. ---

[GitHub] spark issue #14721: [SC-4296][SQL] Change error message for out of range num...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14721 Can one of the admins verify this patch?

[GitHub] spark issue #14705: [SPARK-16508][SparkR] Fix CRAN undocumented/duplicated a...

2016-08-19 Thread junyangq
Github user junyangq commented on the issue: https://github.com/apache/spark/pull/14705 That makes sense. Perhaps this could be done in another PR?

[GitHub] spark issue #14693: [SPARK-17113][Shuffle] Job failure due to Executor OOM i...

2016-08-19 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/14693 Merging this into master and 2.0 and 1.6 (hopefully), thanks

[GitHub] spark pull request #14721: [SPARK-17158][SQL] Change error message for out o...

2016-08-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14721#discussion_r75530744 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -1278,10 +1278,16 @@ class AstBuilder extends

[GitHub] spark pull request #14650: [SPARK-17062][MESOS] add conf option to mesos dis...

2016-08-19 Thread mgummelt
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/14650#discussion_r75530775 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1962,6 +1962,26 @@ private[spark] object Utils extends Logging { path

[GitHub] spark issue #14697: [SPARK-17124][SQL] RelationalGroupedDataset.agg should p...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14697 **[Test build #64080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64080/consoleFull)** for PR 14697 at commit

[GitHub] spark issue #14697: [SPARK-17124][SQL] RelationalGroupedDataset.agg should p...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14697 Merged build finished. Test PASSed.

[GitHub] spark issue #10896: [SPARK-12978][SQL] Skip unnecessary final group-by when ...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/10896 **[Test build #64079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64079/consoleFull)** for PR 10896 at commit

[GitHub] spark issue #14721: [SPARK-17158][SQL] Change error message for out of range...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14721 **[Test build #64095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64095/consoleFull)** for PR 14721 at commit

[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14712#discussion_r75513984 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala --- @@ -88,14 +90,30 @@ case class

[GitHub] spark issue #14719: [SPARK-17154][SQL] Wrong result can be returned or Analy...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14719 **[Test build #64081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64081/consoleFull)** for PR 14719 at commit

[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-19 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14712#discussion_r75515983 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala --- @@ -88,14 +90,30 @@ case class

[GitHub] spark pull request #14705: [SPARK-16508][SparkR] Fix CRAN undocumented/dupli...

2016-08-19 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14705#discussion_r75516177 --- Diff: R/pkg/R/DataFrame.R --- @@ -932,7 +932,7 @@ setMethod("sample_frac", #' @param x a SparkDataFrame. #' @family SparkDataFrame

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523853 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -183,24 +191,18 @@ private[ml] trait DecisionTreeParams extends

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523850 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -183,24 +191,18 @@ private[ml] trait DecisionTreeParams extends

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523858 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -220,32 +222,42 @@ private[ml] object TreeClassifierParams { final

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523822 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impurity/ApproxBernoulliImpurity.scala --- @@ -0,0 +1,162 @@ +/* + * Licensed to the

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523794 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala --- @@ -38,25 +38,35 @@ import org.apache.spark.sql.{DataFrame, Dataset}

[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...

2016-08-19 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75523912 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -200,22 +375,77 @@ private[spark] class

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523854 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -183,24 +191,18 @@ private[ml] trait DecisionTreeParams extends

[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...

2016-08-19 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75523842 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -200,22 +375,77 @@ private[spark] class

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523771 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -42,18 +42,30 @@ import org.apache.spark.sql.types.DoubleType

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523815 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala --- @@ -258,11 +258,13 @@ private[spark] object

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523781 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala --- @@ -17,13 +17,13 @@ package org.apache.spark.ml.regression

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523871 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -501,36 +564,75 @@ private[ml] trait GBTClassifierParams extends GBTParams

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523805 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/DTStatsAggregator.scala --- @@ -33,11 +34,13 @@ private[spark] class DTStatsAggregator(

[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/14547 One question I had - this PR creates an inherent coupling between the impurity used to train the tree and the loss used for boosting. This is not how I understood tree boost. My impression was

[GitHub] spark issue #14038: [SPARK-16317][SQL] Add a new interface to filter files i...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14038 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64075/ Test PASSed. ---

[GitHub] spark pull request #14155: [SPARK-16498][SQL] move hive hack for data source...

2016-08-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14155#discussion_r75528185 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -200,22 +375,77 @@ private[spark] class

[GitHub] spark issue #14038: [SPARK-16317][SQL] Add a new interface to filter files i...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14038 Merged build finished. Test PASSed.

[GitHub] spark issue #14717: [WIP][SPARK-17090][ML]Make tree aggregation level in lin...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14717 **[Test build #64092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64092/consoleFull)** for PR 14717 at commit

[GitHub] spark issue #14650: [SPARK-17062][MESOS] add conf option to mesos dispatcher

2016-08-19 Thread mgummelt
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/14650 I'm generally fine with this, though one downside is that it introduces an inconsistency with other daemon classes such as Master.scala, which only accepts a properties file. Maybe we should make a

[GitHub] spark issue #14719: [SPARK-17154][SQL] Wrong result can be returned or Analy...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14719 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64082/ Test FAILed. ---

[GitHub] spark issue #14697: [SPARK-17124][SQL] RelationalGroupedDataset.agg should p...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14697 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64080/ Test PASSed. ---

[GitHub] spark issue #10896: [SPARK-12978][SQL] Skip unnecessary final group-by when ...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/10896 Merged build finished. Test PASSed.

[GitHub] spark issue #14718: [SPARK-16711] YarnShuffleService doesn't re-init properl...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14718 **[Test build #64094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64094/consoleFull)** for PR 14718 at commit

[GitHub] spark issue #14719: [SPARK-17154][SQL] Wrong result can be returned or Analy...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14719 **[Test build #64082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64082/consoleFull)** for PR 14719 at commit

[GitHub] spark issue #14718: [SPARK-16711] YarnShuffleService doesn't re-init properl...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14718 **[Test build #64074 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64074/consoleFull)** for PR 14718 at commit

[GitHub] spark issue #14204: [SPARK-16520] [WEBUI] Link executors to corresponding wo...

2016-08-19 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/14204 @nblintao Now that #14382 is merged you can update this, here's a patch of what you'll need: `diff --git a/core/src/main/resources/org/apache/spark/ui/static/executorspage.js

[GitHub] spark issue #14666: [SPARK-16578][SparkR] Enable SparkR to connect to a remo...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14666 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64083/ Test FAILed. ---

[GitHub] spark issue #14666: [SPARK-16578][SparkR] Enable SparkR to connect to a remo...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14666 **[Test build #64083 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64083/consoleFull)** for PR 14666 at commit

[GitHub] spark issue #14666: [SPARK-16578][SparkR] Enable SparkR to connect to a remo...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14666 Merged build finished. Test FAILed.

[GitHub] spark issue #14649: [SPARK-17059][SQL] Allow FileFormat to specify partition...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14649 **[Test build #64086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64086/consoleFull)** for PR 14649 at commit

[GitHub] spark issue #14655: [SPARK-16669][SQL]Adding partition prunning to Metastore...

2016-08-19 Thread Parth-Brahmbhatt
Github user Parth-Brahmbhatt commented on the issue: https://github.com/apache/spark/pull/14655 @gatorsmile Not sure if it's the same issue. The issue you are pointing at talks about storing the actual partition-level stats, which could be used by this PR, but until it's available we

[GitHub] spark issue #14467: [SPARK-16861][PYSPARK][CORE] Refactor PySpark accumulato...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14467 **[Test build #64087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64087/consoleFull)** for PR 14467 at commit

[GitHub] spark pull request #14721: [SC-4296][SQL] Change error message for out of ra...

2016-08-19 Thread srinathshankar
GitHub user srinathshankar opened a pull request: https://github.com/apache/spark/pull/14721 [SC-4296][SQL] Change error message for out of range numeric literals ## What changes were proposed in this pull request? Modifies error message for numeric literals to Numeric

[GitHub] spark issue #14721: [SC-4296][SQL] Change error message for out of range num...

2016-08-19 Thread srinathshankar
Github user srinathshankar commented on the issue: https://github.com/apache/spark/pull/14721 @sameeragarwal @rxin @davies

[GitHub] spark issue #14666: [SPARK-16578][SparkR] Enable SparkR to connect to a remo...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14666 **[Test build #64091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64091/consoleFull)** for PR 14666 at commit

[GitHub] spark pull request #14693: [SPARK-17113][Shuffle] Job failure due to Executo...

2016-08-19 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/14693#discussion_r75529799 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java --- @@ -522,7 +522,7 @@ public long spill() throws

[GitHub] spark pull request #14650: [SPARK-17062][MESOS] add conf option to mesos dis...

2016-08-19 Thread mgummelt
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/14650#discussion_r75530884 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcherArguments.scala --- @@ -102,7 +115,9 @@ private[mesos] class

[GitHub] spark pull request #14650: [SPARK-17062][MESOS] add conf option to mesos dis...

2016-08-19 Thread mgummelt
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/14650#discussion_r75530859 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcherArguments.scala --- @@ -102,7 +115,9 @@ private[mesos] class

[GitHub] spark pull request #14721: [SPARK-17158][SQL] Change error message for out o...

2016-08-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14721#discussion_r75530829 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -1291,28 +1297,32 @@ class AstBuilder extends

[GitHub] spark issue #13650: [SPARK-9623] [ML] Provide variance for RandomForestRegre...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13650 **[Test build #64093 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64093/consoleFull)** for PR 13650 at commit

[GitHub] spark issue #13320: [SPARK-13184][SQL] Add a datasource-specific option minP...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13320 **[Test build #64077 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64077/consoleFull)** for PR 13320 at commit

[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14712#discussion_r75515751 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala --- @@ -88,14 +90,30 @@ case class

[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14712 So far, the test coverage is weak. Could we add more test cases to cover all the corner cases? Thanks!

[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-08-19 Thread jisookim0513
Github user jisookim0513 commented on the issue: https://github.com/apache/spark/pull/10212 @vanzin I updated the patch

[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/14547 @vlad17 Thanks for the PR! I'm not done with a review pass, but I'll go ahead and send comments from a partial pass.

[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/14547 Test gists * ```setMinInstancesPerNode(10)```: Is this the same value used by gbm by default? * Is ```counts.max / counts.sum``` meant to verify that the train/test splits are identical?

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523882 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -465,33 +497,64 @@ private[ml] trait GBTParams extends TreeEnsembleParams

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523887 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -465,33 +497,64 @@ private[ml] trait GBTParams extends TreeEnsembleParams

[GitHub] spark issue #14674: [SPARK-17002][CORE]: Document that spark.ssl.protocol. i...

2016-08-19 Thread mgummelt
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/14674 I'm fine with this PR, but I'm wondering if it makes more sense to set the default to one of the values here (probably "TLS"):

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523869 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -465,33 +497,64 @@ private[ml] trait GBTParams extends TreeEnsembleParams

[GitHub] spark issue #14721: [SC-4296][SQL] Change error message for out of range num...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14721 **[Test build #3229 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3229/consoleFull)** for PR 14721 at commit

[GitHub] spark issue #14721: [SC-4296][SQL] Change error message for out of range num...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14721 **[Test build #64089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64089/consoleFull)** for PR 14721 at commit

[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14384 **[Test build #64090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64090/consoleFull)** for PR 14384 at commit

[GitHub] spark issue #14327: [SPARK-16686][SQL] Remove PushProjectThroughSample since...

2016-08-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14327 I have also backported this bug fix into 2.0.

[GitHub] spark issue #14655: [SPARK-16669][SQL]Adding partition prunning to Metastore...

2016-08-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14655 How about waiting for a few days until that is delivered? Let us see whether that might simplify your PR.

[GitHub] spark issue #10896: [SPARK-12978][SQL] Skip unnecessary final group-by when ...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/10896 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64079/ Test PASSed. ---

[GitHub] spark issue #14674: [SPARK-17002][CORE]: Document that spark.ssl.protocol. i...

2016-08-19 Thread mgummelt
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/14674 If we document what the default is, I think it's a fair requirement that the user read the docs and verify that it meets their needs.

[GitHub] spark issue #14697: [SPARK-17124][SQL] RelationalGroupedDataset.agg should p...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14697 **[Test build #64080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64080/consoleFull)** for PR 14697 at commit

[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-19 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14712#discussion_r75516336 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala --- @@ -108,4 +126,8 @@ case class

[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14712#discussion_r75516944 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala --- @@ -88,14 +90,30 @@ case class

[GitHub] spark pull request #13898: [SPARK-16197][PYSPARK] Cleanup of Status API Exam...

2016-08-19 Thread BryanCutler
Github user BryanCutler closed the pull request at: https://github.com/apache/spark/pull/13898

[GitHub] spark issue #14718: [SPARK-16711] YarnShuffleService doesn't re-init properl...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14718 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64074/ Test FAILed. ---

[GitHub] spark issue #14718: [SPARK-16711] YarnShuffleService doesn't re-init properl...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14718 Merged build finished. Test FAILed.

[GitHub] spark pull request #14717: [WIP][SPARK-17090][ML]Make tree aggregation level...

2016-08-19 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/14717#discussion_r75519309 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -256,6 +256,15 @@ class LogisticRegression @Since("1.2.0")

[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/10212 Merged build finished. Test FAILed.

[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/10212 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64085/ Test FAILed. ---

[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/10212 **[Test build #64085 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64085/consoleFull)** for PR 10212 at commit

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523820 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impurity/ApproxBernoulliImpurity.scala --- @@ -0,0 +1,162 @@ +/* + * Licensed to the

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523836 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impurity/ApproxBernoulliImpurity.scala --- @@ -0,0 +1,162 @@ +/* + * Licensed to the

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523777 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -148,11 +154,14 @@ class GBTClassifier @Since("1.4.0") (

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523763 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -42,18 +42,30 @@ import org.apache.spark.sql.types.DoubleType

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523809 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala --- @@ -258,11 +258,13 @@ private[spark] object

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523798 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala --- @@ -134,11 +146,15 @@ class GBTRegressor @Since("1.4.0")

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-08-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/14547#discussion_r75523785 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala --- @@ -38,25 +38,35 @@ import org.apache.spark.sql.{DataFrame, Dataset}

[GitHub] spark issue #14204: [SPARK-16520] [WEBUI] Link executors to corresponding wo...

2016-08-19 Thread nblintao
Github user nblintao commented on the issue: https://github.com/apache/spark/pull/14204 @ajbozarth Great! Glad to see your PR merged. Thanks for providing this patch; I'll fix it in a few days.

[GitHub] spark issue #14674: [SPARK-17002][CORE]: Document that spark.ssl.protocol. i...

2016-08-19 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/14674 @mgummelt I am fine with setting a default value. I have one concern about it. My motivation is to let the user make sure that they choose a proper protocol to use which meets their security

[GitHub] spark issue #14718: [SPARK-16711] YarnShuffleService doesn't re-init properl...

2016-08-19 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/14718 need to update the test to handle the new leveldb

[GitHub] spark pull request #14650: [SPARK-17062][MESOS] add conf option to mesos dis...

2016-08-19 Thread mgummelt
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/14650#discussion_r75530239 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcherArguments.scala --- @@ -73,6 +82,10 @@ private[mesos] class

[GitHub] spark issue #14717: [WIP][SPARK-17090][ML]Make tree aggregation level in lin...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14717 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64092/ Test FAILed. ---

[GitHub] spark pull request #14693: [SPARK-17113][Shuffle] Job failure due to Executo...

2016-08-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14693

[GitHub] spark issue #14717: [WIP][SPARK-17090][ML]Make tree aggregation level in lin...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14717 Merged build finished. Test FAILed.

[GitHub] spark issue #14717: [WIP][SPARK-17090][ML]Make tree aggregation level in lin...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14717 **[Test build #64092 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64092/consoleFull)** for PR 14717 at commit
