[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12720#issuecomment-215328015 **[Test build #2899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2899/consoleFull)** for PR 12720 at commit [`62c5c2f`](https://

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215327760 @viirya can you provide an example of how this works for use here in this PR? --- If your project is set up for it, you can reply to this email and have your reply appe

[GitHub] spark pull request: [SPARK-14706][SPARK-14973][ML][PySpark] Python...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12604#issuecomment-215327755 **[Test build #57228 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57228/consoleFull)** for PR 12604 at commit [`cdab34a`](https://gi

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12736#issuecomment-215327600 Like `INTERSECT ALL`, `EXCEPT ALL` can be done by aggregation of the union of the two tables. We can augment one table with a new column of constant `1`, and the oth
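The aggregation idea in the comment above can be sketched without Spark. This is only an illustration under stated assumptions: the quoted comment is cut off, so tagging the second table with `-1` is an assumption here, and `except_all` is a hypothetical helper name, not anything from the PR. Rows are plain hashable Python values standing in for table rows.

```python
def except_all(left, right):
    """Multiset difference: rows of `left` minus rows of `right`, duplicates kept."""
    # Union of the two inputs, each row augmented with a constant marker column.
    # +1 for the left table follows the comment; -1 for the right is an assumption.
    tagged = [(row, 1) for row in left] + [(row, -1) for row in right]

    # Aggregate over the union: sum the markers per distinct row.
    totals = {}
    for row, marker in tagged:
        totals[row] = totals.get(row, 0) + marker

    # Emit each row as many times as its net count, dropping non-positive counts.
    result = []
    for row, net in totals.items():
        result.extend([row] * max(net, 0))
    return result


if __name__ == "__main__":
    print(sorted(except_all([1, 1, 1, 2, 3], [1, 2, 2])))
```

A row that appears three times on the left and once on the right survives twice, which is the `EXCEPT ALL` behaviour as opposed to plain `EXCEPT`, which would also deduplicate the result.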

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215327227 encoder now supports UDTs. You just need to declare one before you want to use it.

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12736#issuecomment-215326945 **[Test build #57227 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57227/consoleFull)** for PR 12736 at commit [`4920360`](https://gi

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215326887 @viirya yeah, sorry I meant we should create an encoder that can be used in `ml`... whether an implicit or explicit.

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12718#discussion_r61379580 --- Diff: mllib/src/main/scala/org/apache/spark/ml/Predictor.scala --- @@ -121,9 +121,9 @@ abstract class Predictor[ * and put it in an RDD with

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215326639 We have no implicit encoder for vector udt. But we can explicitly create it.

[GitHub] spark pull request: [SPARK-14706][SPARK-14973][ML][PySpark] Python...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12604#issuecomment-215325779 **[Test build #57226 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57226/consoleFull)** for PR 12604 at commit [`ba664f9`](https://gi

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12718#discussion_r61379351 --- Diff: mllib/src/main/scala/org/apache/spark/ml/Predictor.scala --- @@ -121,9 +121,9 @@ abstract class Predictor[ * and put it in an RDD with strong

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215325772 **[Test build #57225 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57225/consoleFull)** for PR 12718 at commit [`b2101b2`](https://gi

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12736#discussion_r61379222 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -398,6 +398,66 @@ class DataFrameSuite extends QueryTest with SharedSQ

[GitHub] spark pull request: [SPARK-14915] [CORE] Don't re-queue a task if ...

2016-04-27 Thread jasonmoore2k
Github user jasonmoore2k commented on the pull request: https://github.com/apache/spark/pull/12751#issuecomment-215325217 @andrewor14 @kayousterhout Would appreciate your thoughts on this change (or anybody else who you recommend with some experience with the task scheduler).

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12736#discussion_r61379135 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercionSuite.scala --- @@ -488,14 +488,6 @@ class HiveTypeCoercionS

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215324565 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215324562 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215324403 **[Test build #57219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57219/consoleFull)** for PR 12259 at commit [`1c230ae`](https://g

[GitHub] spark pull request: [SPARK-11171][SPARK-11237][SPARK-11241][ML] Tr...

2016-04-27 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/9207#issuecomment-215324118 ping @jkbradley @mengxr @srowen - thoughts on my suggestion above? I think the overall gist of this PR is fine, but it seems nicer to me to actually put a trait f

[GitHub] spark pull request: [SPARK-14915] [CORE] Don't re-queue a task if ...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12751#issuecomment-215324080 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-14945][PYTHON] SparkSession Python API

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12746#issuecomment-215323954 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14945][PYTHON] SparkSession Python API

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12746#issuecomment-215323955 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14945][PYTHON] SparkSession Python API

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12746#issuecomment-215323935 **[Test build #57224 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57224/consoleFull)** for PR 12746 at commit [`2c0ea42`](https://g

[GitHub] spark pull request: [SPARK-14412][ML][PYSPARK] Add StorageLevel pa...

2016-04-27 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12660#issuecomment-215323273 ping @srowen @yanboliang @jkbradley any comments?

[GitHub] spark pull request: [SPARK-14915] [CORE] Don't re-queue a task if ...

2016-04-27 Thread jasonmoore2k
GitHub user jasonmoore2k opened a pull request: https://github.com/apache/spark/pull/12751 [SPARK-14915] [CORE] Don't re-queue a task if another attempt has already succeeded ## What changes were proposed in this pull request? Don't re-queue a task if another attempt has al

[GitHub] spark pull request: [SPARK-13965] [CORE] TaskSetManager should kil...

2016-04-27 Thread devaraj-kavali
Github user devaraj-kavali closed the pull request at: https://github.com/apache/spark/pull/11778

[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10220#issuecomment-215320655 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10220#issuecomment-215320650 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14945][PYTHON] SparkSession Python API

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12746#issuecomment-215320663 **[Test build #57224 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57224/consoleFull)** for PR 12746 at commit [`2c0ea42`](https://gi

[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10220#issuecomment-215320437 **[Test build #57223 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57223/consoleFull)** for PR 10220 at commit [`74ba7e8`](https://g

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215320479 We should definitely have an encoder for vector udts... cc @dbtsai @viirya

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215320081 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215320083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215319930 **[Test build #57217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57217/consoleFull)** for PR 12259 at commit [`9ed0f30`](https://g

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215319643 **[Test build #57222 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57222/consoleFull)** for PR 12718 at commit [`e57332a`](https://g

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215319671 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215319667 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14706][ML][PySpark] Python ML persisten...

2016-04-27 Thread yinxusen
Github user yinxusen commented on the pull request: https://github.com/apache/spark/pull/12604#issuecomment-215318737 @jkbradley Another bug found: The `CrossValidator` and `TrainValidationSplit` miss the `seed` when saving and loading. I'd prefer to create a JIRA and fix them

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-27 Thread NarineK
Github user NarineK commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r61377331 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -1981,6 +1982,23 @@ class Dataset[T] private[sql]( } /** +

[GitHub] spark pull request: [SPARK-14850][ML] specialize array data for Ve...

2016-04-27 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/12640#issuecomment-215318542 Had an offline discussion with @cloud-fan and we will try converting from/to UnsafeArrayData directly using memory copy and test its performance.

[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10220#issuecomment-215318404 **[Test build #57223 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57223/consoleFull)** for PR 10220 at commit [`74ba7e8`](https://gi

[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

2016-04-27 Thread sun-rui
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/10220#discussion_r61376989 --- Diff: R/pkg/R/DataFrame.R --- @@ -1426,11 +1426,11 @@ setMethod("withColumn", #' Mutate #' -#' Return a new SparkDataFrame with the sp

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread hvanhovell
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/12720#issuecomment-215317834 @davies we only move pulling out the predicates into the analyzer. The actual planning (conversion into Semi Joins) still takes place in the Optimizer. There are a f

[GitHub] spark pull request: [SPARK-14783] [SPARK-14786] [BRANCH-1.6] Prese...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12724#issuecomment-215317709 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14783] [SPARK-14786] [BRANCH-1.6] Prese...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12724#issuecomment-215317647 **[Test build #57214 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57214/consoleFull)** for PR 12724 at commit [`49b7b52`](https://g

[GitHub] spark pull request: [SPARK-14783] [SPARK-14786] [BRANCH-1.6] Prese...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12724#issuecomment-215317707 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12720#discussion_r61376668 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala --- @@ -337,6 +337,16 @@ case class PrettyAttribute(

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12720#discussion_r61376609 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala --- @@ -337,6 +337,16 @@ case class PrettyAttribute(

[GitHub] spark pull request: [SPARK-13902][SCHEDULER] Make DAGScheduler.get...

2016-04-27 Thread ueshin
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/12655#issuecomment-215317033 I saw the PR #8427 now. Both the #8427 approach and @markhamstra's approach (should we use `getOrElseUpdate` instead of `getOrElse`?) seem like the simplest way to fi

[GitHub] spark pull request: [SPARK-14972] Improve performance of JSON sche...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12750#issuecomment-215316922 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14972] Improve performance of JSON sche...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12750#issuecomment-215316921 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14972] Improve performance of JSON sche...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12750#issuecomment-215316797 **[Test build #57215 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57215/consoleFull)** for PR 12750 at commit [`5d34a64`](https://g

[GitHub] spark pull request: [SPARK-14938][ML] replace some RDD.map with Da...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12718#issuecomment-215316563 **[Test build #57222 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57222/consoleFull)** for PR 12718 at commit [`e57332a`](https://gi

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12720#discussion_r61376302 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -866,71 +867,189 @@ class Analyzer( * Note: CTEs a

[GitHub] spark pull request: [SPARK-10001][Core] Don't short-circuit action...

2016-04-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12745

[GitHub] spark pull request: [SPARK-10001][Core] Don't short-circuit action...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12745#issuecomment-215315836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-10001][Core] Don't short-circuit action...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12745#issuecomment-215315834 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-10001][Core] Don't short-circuit action...

2016-04-27 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12745#issuecomment-215315768 Merging in master. Thanks.

[GitHub] spark pull request: [SPARK-10001][Core] Don't short-circuit action...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12745#issuecomment-215315711 **[Test build #57213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57213/consoleFull)** for PR 12745 at commit [`06f83cc`](https://g

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12720#discussion_r61375897 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -866,71 +867,189 @@ class Analyzer( * Note: CT

[GitHub] spark pull request: [SPARK-14315][SparkR]Add model persistence to ...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12683#issuecomment-215315114 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14315][SparkR]Add model persistence to ...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12683#issuecomment-215315097 **[Test build #57218 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57218/consoleFull)** for PR 12683 at commit [`6650890`](https://g

[GitHub] spark pull request: [SPARK-14315][SparkR]Add model persistence to ...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12683#issuecomment-215315113 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12720#discussion_r61375586 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -866,71 +867,189 @@ class Analyzer( * Note: CT

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12736#issuecomment-215314522 LGTM. An unrelated question: how do we express the EXCEPT ALL semantics?

[GitHub] spark pull request: [SPARK-10001][Core] Don't short-circuit action...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12745#issuecomment-215314462 **[Test build #2898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2898/consoleFull)** for PR 12745 at commit [`7fe0e54`](https://

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12720#discussion_r61375441 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala --- @@ -75,76 +77,63 @@ case class ScalarSubquery(

[GitHub] spark pull request: [SPARK-14783] [SPARK-14786] [BRANCH-1.6] Prese...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12724#issuecomment-215314235 **[Test build #57221 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57221/consoleFull)** for PR 12724 at commit [`49b7b52`](https://gi

[GitHub] spark pull request: [SPARK-14850][ML] specialize array data for Ve...

2016-04-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/12640#discussion_r61375365 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayData.scala --- @@ -29,6 +29,82 @@ abstract class ArrayData extends SpecializedG

[GitHub] spark pull request: [SPARK-14858][SQL] Enable subquery pushdown

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12720#issuecomment-215314230 **[Test build #2899 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2899/consoleFull)** for PR 12720 at commit [`62c5c2f`](https://g

[GitHub] spark pull request: [SPARK-14850][ML] specialize array data for Ve...

2016-04-27 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/12640#issuecomment-215314086 @cloud-fan This is still much slower than 1.4, and adding more subclasses of ArrayData may prevent the JIT from inlining methods like `getInt` and `getDouble`. Is it easy to convert

[GitHub] spark pull request: [SPARK-14783] [SPARK-14786] [BRANCH-1.6] Prese...

2016-04-27 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/12724#issuecomment-215314116 test this please

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12736#discussion_r61375198 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -398,6 +398,66 @@ class DataFrameSuite extends QueryTest with SharedSQL

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12736#discussion_r61375067 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercionSuite.scala --- @@ -488,14 +488,6 @@ class HiveTypeCoercionSu

[GitHub] spark pull request: [SPARK-14654][CORE] New accumulator API

2016-04-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12612#discussion_r61374828 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -175,124 +172,143 @@ class TaskMetrics private[spark] () extends Serializ

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12736#discussion_r61374712 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaDatasetSuite.java --- @@ -291,7 +291,7 @@ public void testSetOperation() { union

[GitHub] spark pull request: [SPARK-14654][CORE] New accumulator API

2016-04-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12612#discussion_r61374697 --- Diff: core/src/main/scala/org/apache/spark/NewAccumulator.scala --- @@ -0,0 +1,391 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

[GitHub] spark pull request: [SPARK-14654][CORE] New accumulator API

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12612#issuecomment-215312526 **[Test build #57220 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57220/consoleFull)** for PR 12612 at commit [`124568b`](https://gi

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215311747 **[Test build #57219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57219/consoleFull)** for PR 12259 at commit [`1c230ae`](https://gi

[GitHub] spark pull request: [SPARK-12235][SPARKR] Enhance mutate() to supp...

2016-04-27 Thread sun-rui
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/10220#discussion_r61374122 --- Diff: R/pkg/R/DataFrame.R --- @@ -1451,17 +1451,54 @@ setMethod("mutate", function(.data, ...) { x <- .data

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215310697 LGTM pending Jenkins

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r61373781 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala --- @@ -117,6 +117,7 @@ class SQLContext private[sql]( * * @since 1.6.0

[GitHub] spark pull request: [SPARK-14315][SparkR]Add model persistence to ...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12683#issuecomment-215310647 **[Test build #57218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57218/consoleFull)** for PR 12683 at commit [`6650890`](https://gi

[GitHub] spark pull request: [SPARK-14346][SQL] Add PARTITIONED BY and CLUS...

2016-04-27 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/12734#issuecomment-215310529 @jodersky Oh sorry, pasted the JIRA ticket summary to the PR title but forgot to add the tags. Updated!

[GitHub] spark pull request: [SPARK-14729][Scheduler] Refactored YARN sched...

2016-04-27 Thread hbhanawat
Github user hbhanawat commented on the pull request: https://github.com/apache/spark/pull/12641#issuecomment-215310532 Hmm. @vanzin I think you have a point. There are a few things that can be done, but I am not sure whether they will simplify the code without reducing flexibility. I will thin

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r61373671 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/util/MLlibTestSparkContext.scala --- @@ -24,14 +24,18 @@ import org.scalatest.Suite import or

[GitHub] spark pull request: [SPARK-14972] Improve performance of JSON sche...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12750#issuecomment-215310239 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-13568] [ML] Create feature transformer ...

2016-04-27 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/11601#discussion_r61373605 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-13568] [ML] Create feature transformer ...

2016-04-27 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/11601#discussion_r61373554 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-14972] Improve performance of JSON sche...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12750#issuecomment-215310152 **[Test build #57212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57212/consoleFull)** for PR 12750 at commit [`4bbf429`](https://g

[GitHub] spark pull request: [SPARK-14972] Improve performance of JSON sche...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12750#issuecomment-215310236 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-13568] [ML] Create feature transformer ...

2016-04-27 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/11601#discussion_r61373571 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -0,0 +1,219 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12259#issuecomment-215310117 **[Test build #57217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57217/consoleFull)** for PR 12259 at commit [`9ed0f30`](https://gi

[GitHub] spark pull request: [SPARK-14706][ML][PySpark] Python ML persisten...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12604#issuecomment-215309644 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14706][ML][PySpark] Python ML persisten...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12604#issuecomment-215309599 **[Test build #57216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57216/consoleFull)** for PR 12604 at commit [`fa8a05c`](https://g

[GitHub] spark pull request: [SPARK-14706][ML][PySpark] Python ML persisten...

2016-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12604#issuecomment-215309643 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-13961][ML] spark.ml ChiSqSelector and R...

2016-04-27 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/12467#issuecomment-215308986 Looks good overall. I have one last inline comment; after that, it should be ready to go.

[GitHub] spark pull request: [SPARK-12660] [SPARK-14967] [SQL] Implement Ex...

2016-04-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12736#discussion_r61372903 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaDatasetSuite.java --- @@ -291,7 +291,7 @@ public void testSetOperation() { unione

[GitHub] spark pull request: [SPARK-14706][ML][PySpark] Python ML persisten...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12604#issuecomment-215308604 **[Test build #57216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57216/consoleFull)** for PR 12604 at commit [`fa8a05c`](https://gi

[GitHub] spark pull request: [SPARK-14972] Improve performance of JSON sche...

2016-04-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12750#issuecomment-215308136 **[Test build #57215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57215/consoleFull)** for PR 12750 at commit [`5d34a64`](https://gi
