[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13403 **[Test build #61106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61106/consoleFull)** for PR 13403 at commit [`bb12a7f`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13624: [SPARK-15858][ML]: Fix calculating error by tree stack o...

2016-06-23 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13624 I think this is correct and can see why it's faster. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request #13624: [SPARK-15858][ML]: Fix calculating error by tree ...

2016-06-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13624#discussion_r68205142 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala --- @@ -205,31 +205,31 @@ private[spark] object GradientBoostedTrees

[GitHub] spark pull request #13494: [SPARK-15752] [SQL] support optimization for meta...

2016-06-23 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request: https://github.com/apache/spark/pull/13494#discussion_r68204608 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -109,108 +111,45 @@ private[sql] object Fil

[GitHub] spark issue #13874: [SQL][minor] ParserUtils.operationNotAllowed should thro...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13874 Merged build finished. Test FAILed.

[GitHub] spark issue #13874: [SQL][minor] ParserUtils.operationNotAllowed should thro...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13874 **[Test build #61110 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61110/consoleFull)** for PR 13874 at commit [`98027fc`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13874: [SQL][minor] ParserUtils.operationNotAllowed should thro...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13874 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61110/ Test FAILed.

[GitHub] spark pull request #13624: [SPARK-15858][ML]: Fix calculating error by tree ...

2016-06-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13624#discussion_r68203172 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala --- @@ -205,31 +205,31 @@ private[spark] object GradientBoostedTrees

[GitHub] spark pull request #13624: [SPARK-15858][ML]: Fix calculating error by tree ...

2016-06-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13624#discussion_r68202993 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala --- @@ -205,31 +205,31 @@ private[spark] object GradientBoostedTrees

[GitHub] spark pull request #13624: [SPARK-15858][ML]: Fix calculating error by tree ...

2016-06-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13624#discussion_r68202942 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala --- @@ -205,31 +205,31 @@ private[spark] object GradientBoostedTrees

[GitHub] spark issue #13874: [SQL][minor] ParserUtils.operationNotAllowed should thro...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13874 **[Test build #61110 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61110/consoleFull)** for PR 13874 at commit [`98027fc`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #13874: [SQL][minor] ParserUtils.operationNotAllowed should thro...

2016-06-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13874 cc @hvanhovell

[GitHub] spark pull request #13874: [SQL][minor] ParserUtils.operationNotAllowed shou...

2016-06-23 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/13874 [SQL][minor] ParserUtils.operationNotAllowed should throw exception directly ## What changes were proposed in this pull request? It's weird that `ParserUtils.operationNotAllowed` returns

[GitHub] spark issue #13865: [SPARK-13709][SQL] Initialize deserializer with both tab...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13865 **[Test build #61109 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61109/consoleFull)** for PR 13865 at commit [`ebed01e`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #13872: [SPARK-16164][SQL] Update `CombineFilters` to try to con...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13872 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61105/ Test PASSed.

[GitHub] spark issue #13872: [SPARK-16164][SQL] Update `CombineFilters` to try to con...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13872 Merged build finished. Test PASSed.

[GitHub] spark issue #13872: [SPARK-16164][SQL] Update `CombineFilters` to try to con...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13872 **[Test build #61105 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61105/consoleFull)** for PR 13872 at commit [`2b21fd7`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-06-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13756 I'm wondering whether it's possible to concentrate the error-checking logic in one place for table creation. For example, we check duplicated table column names in the parser for SQL statements (https://g

[GitHub] spark issue #13872: [SPARK-16164][SQL] Update `CombineFilters` to try to con...

2016-06-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13872 For any conclusion, thank you for review, @mengxr and @liancheng !

[GitHub] spark issue #13839: [SPARK-16128][SQL] Add truncateTo parameter to Dataset.s...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13839 Merged build finished. Test PASSed.

[GitHub] spark issue #13839: [SPARK-16128][SQL] Add truncateTo parameter to Dataset.s...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13839 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61102/ Test PASSed.

[GitHub] spark issue #13839: [SPARK-16128][SQL] Add truncateTo parameter to Dataset.s...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13839 **[Test build #61102 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61102/consoleFull)** for PR 13839 at commit [`b4d9279`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13873: [SPARK-16167][SQL] RowEncoder should preserve array/map ...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13873 **[Test build #61108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61108/consoleFull)** for PR 13873 at commit [`093a9fa`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #13836: [SPARK-16125][YARN] Fix not test yarn cluster mode corre...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61101/ Test PASSed.

[GitHub] spark pull request #13873: [SPARK-16167][SQL] RowEncoder should preserve arr...

2016-06-23 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/13873 [SPARK-16167][SQL] RowEncoder should preserve array/map type nullability. ## What changes were proposed in this pull request? Currently `RowEncoder` doesn't preserve nullability of `ArrayTyp

[GitHub] spark issue #13859: [SPARK-16154] [MLLIB] Update spark.ml and spark.mllib pa...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13859 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61107/ Test PASSed.

[GitHub] spark issue #13836: [SPARK-16125][YARN] Fix not test yarn cluster mode corre...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13836 Merged build finished. Test PASSed.

[GitHub] spark issue #13859: [SPARK-16154] [MLLIB] Update spark.ml and spark.mllib pa...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13859 Merged build finished. Test PASSed.

[GitHub] spark issue #13859: [SPARK-16154] [MLLIB] Update spark.ml and spark.mllib pa...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13859 **[Test build #61107 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61107/consoleFull)** for PR 13859 at commit [`c445b93`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13836: [SPARK-16125][YARN] Fix not test yarn cluster mode corre...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13836 **[Test build #61101 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61101/consoleFull)** for PR 13836 at commit [`8d2dea7`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13847: [SPARK-16135][SQL] Implement hashCode and euqals in Unsa...

2016-06-23 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/13847 I'm now checking failed tests...

[GitHub] spark pull request #13143: [SPARK-15359] [Mesos] Mesos dispatcher should han...

2016-06-23 Thread tnachen
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13143#discussion_r68194509 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -120,14 +120,25 @@ private[mesos] trait MesosSchedul

[GitHub] spark pull request #13494: [SPARK-15752] [SQL] support optimization for meta...

2016-06-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13494#discussion_r68193963 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -109,108 +111,45 @@ private[sql] object FileS

[GitHub] spark issue #13870: [SPARK-16165][SQL] Fix the update logic for InMemoryTabl...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13870 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61100/ Test PASSed.

[GitHub] spark issue #13870: [SPARK-16165][SQL] Fix the update logic for InMemoryTabl...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13870 **[Test build #61100 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61100/consoleFull)** for PR 13870 at commit [`b1a80dd`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13870: [SPARK-16165][SQL] Fix the update logic for InMemoryTabl...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13870 Merged build finished. Test PASSed.

[GitHub] spark issue #13872: [SPARK-16164][SQL] Filter pushdown should keep the order...

2016-06-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13872 I think I had better change the title of this PR. (I just copied it from the JIRA.) Would that reduce your concern a little bit?

[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13756 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61099/ Test PASSed.

[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13756 Merged build finished. Test PASSed.

[GitHub] spark issue #13872: [SPARK-16164][SQL] Filter pushdown should keep the order...

2016-06-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13872 Sure, I fully agree with your view. That's the nature of a declarative language. However, we can provide a more *natural* order as the default order, as in this PR. As you see, without considering th

[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13756 **[Test build #61099 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61099/consoleFull)** for PR 13756 at commit [`24edb5f`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13872: [SPARK-16164][SQL] Filter pushdown should keep the order...

2016-06-23 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13872 @dongjoon-hyun Thanks for the work! However, I think the optimizer should have the freedom to reorder predicate evaluation order. For example, we may evaluate cheap predicates first in order to sh

[GitHub] spark issue #13858: [SPARK-16148] [Scheduler] Allow for underscores in TaskL...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13858 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61098/ Test PASSed.

[GitHub] spark issue #13858: [SPARK-16148] [Scheduler] Allow for underscores in TaskL...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13858 Merged build finished. Test PASSed.

[GitHub] spark issue #13858: [SPARK-16148] [Scheduler] Allow for underscores in TaskL...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13858 **[Test build #61098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61098/consoleFull)** for PR 13858 at commit [`b497dc9`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13868: [SPARK-15899] [SQL]

2016-06-23 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13868 Yes, this is not the change discussed in the JIRA. The best way forward seems to be to replace attempts to make a `file:` URI manually from a string with use of `File.toURI` or something from Java 7'

[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-06-23 Thread lianhuiwang
Github user lianhuiwang commented on the issue: https://github.com/apache/spark/pull/13494 @cloud-fan Yes, I think what you said is right. As in Hive/Prestodb, if queries apply some functions (for example, MIN/MAX) or distinct aggregates on a partition column and the value of config 'spark.

[GitHub] spark pull request #13841: [SPARK-16130][ML] model loading backward compatib...

2016-06-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/13841#discussion_r68191189 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -674,12 +674,12 @@ object LogisticRegressionModel extends

[GitHub] spark issue #13859: [SPARK-16154] [MLLIB] Update spark.ml and spark.mllib pa...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13859 **[Test build #61107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61107/consoleFull)** for PR 13859 at commit [`c445b93`](https://github.com/apache/spark/commit/c

[GitHub] spark pull request #13844: [SPARK-16133][ML] model loading backward compatib...

2016-06-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/13844#discussion_r68191096 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/MinMaxScaler.scala --- @@ -232,7 +233,9 @@ object MinMaxScalerModel extends MLReadable[MinMaxScale

[GitHub] spark pull request #13859: [SPARK-16154] [MLLIB] Update spark.ml and spark.m...

2016-06-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/13859#discussion_r68190859 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/package-info.java --- @@ -16,6 +16,26 @@ */ /** - * Spark's machine learning library

[GitHub] spark issue #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` descript...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13403 **[Test build #61106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61106/consoleFull)** for PR 13403 at commit [`bb12a7f`](https://github.com/apache/spark/commit/b

[GitHub] spark pull request #13494: [SPARK-15752] [SQL] support optimization for meta...

2016-06-23 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request: https://github.com/apache/spark/pull/13494#discussion_r68190320 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -109,108 +111,45 @@ private[sql] object Fil

[GitHub] spark issue #13859: [SPARK-16154] [MLLIB] Update spark.ml and spark.mllib pa...

2016-06-23 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/13859 Looks good subject to @hhbyyh comment above

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r68188989 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -74,6 +74,22 @@ class DoubleRDDFunctions(self: RDD[Double]) extends

[GitHub] spark issue #13872: [SPARK-16164][SQL] Filter pushdown should keep the order...

2016-06-23 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/13872 cc: @liancheng

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r68188408 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -74,6 +74,22 @@ class DoubleRDDFunctions(self: RDD[Double]) extends Loggin

[GitHub] spark issue #13872: [SPARK-16164][SQL] Filter pushdown should keep the order...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13872 **[Test build #61105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61105/consoleFull)** for PR 13872 at commit [`2b21fd7`](https://github.com/apache/spark/commit/2

[GitHub] spark pull request #13836: [SPARK-16125][YARN] Fix not test yarn cluster mod...

2016-06-23 Thread renozhang
Github user renozhang commented on a diff in the pull request: https://github.com/apache/spark/pull/13836#discussion_r68188071 --- Diff: python/pyspark/context.py --- @@ -156,7 +156,7 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, se

[GitHub] spark pull request #13872: [SPARK-16164][SQL] Filter pushdown should keep th...

2016-06-23 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/13872 [SPARK-16164][SQL] Filter pushdown should keep the ordering in the logical plan ## What changes were proposed in this pull request? Chris McCubbin reported a bug when he used StringI

[GitHub] spark issue #13771: [SPARK-13748][PYSPARK][DOC] Add the description for expl...

2016-06-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13771 Hi @davies, it seems the related code was written by you. Would this be a meaningful change?

[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-06-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13494 Hi @lianhuiwang, thanks for working on it! The overall idea LGTM; we should eliminate unnecessary file scans if only partition columns are read. However, the current implementation looks n

[GitHub] spark pull request #13494: [SPARK-15752] [SQL] support optimization for meta...

2016-06-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13494#discussion_r68186202 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala --- @@ -109,108 +111,45 @@ private[sql] object FileS

[GitHub] spark issue #13745: [Spark-15997][DOC][ML] Update user guide for HashingTF, ...

2016-06-23 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/13745 @GayathriMurali couple final comments, then I think it's good to go. Thanks!

[GitHub] spark issue #13871: [SPARK-16163] [SQL] Cache the statistics for logical pla...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13871 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61104/ Test FAILed.

[GitHub] spark issue #13871: [SPARK-16163] [SQL] Cache the statistics for logical pla...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13871 **[Test build #61104 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61104/consoleFull)** for PR 13871 at commit [`ecdf2b8`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r68185536 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -74,6 +74,22 @@ class DoubleRDDFunctions(self: RDD[Double]) extends

[GitHub] spark issue #13871: [SPARK-16163] [SQL] Cache the statistics for logical pla...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13871 Merged build finished. Test FAILed.

[GitHub] spark issue #13871: [SPARK-16163] [SQL] Cache the statistics for logical pla...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13871 **[Test build #61104 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61104/consoleFull)** for PR 13871 at commit [`ecdf2b8`](https://github.com/apache/spark/commit/e

[GitHub] spark pull request #13699: [SPARK-15958] Make initial buffer size for the So...

2016-06-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13699#discussion_r68184716 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriter.java --- @@ -122,6 +123,8 @@ public UnsafeShuffleWriter( this.taskCont

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r68184739 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -74,6 +74,22 @@ class DoubleRDDFunctions(self: RDD[Double]) extends

[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13758 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61097/ Test PASSed.

[GitHub] spark pull request #13871: [SPARK-16163] [SQL] Cache the statistics for logi...

2016-06-23 Thread davies
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/13871 [SPARK-16163] [SQL] Cache the statistics for logical plans ## What changes were proposed in this pull request? This calculation of statistics is not trivial anymore, it could be very slow o

[GitHub] spark issue #13871: [SPARK-16163] [SQL] Cache the statistics for logical pla...

2016-06-23 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/13871 cc @liancheng

[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13758 Merged build finished. Test PASSed.

[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13758 **[Test build #61097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61097/consoleFull)** for PR 13758 at commit [`1f1d77c`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13839: [SPARK-16128][SQL] Add truncateTo parameter to Dataset.s...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13839 **[Test build #61102 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61102/consoleFull)** for PR 13839 at commit [`b4d9279`](https://github.com/apache/spark/commit/b

[GitHub] spark pull request #13403: [SPARK-15660][CORE] Update RDD `variance/stdev` d...

2016-06-23 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13403#discussion_r68184371 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -74,6 +74,22 @@ class DoubleRDDFunctions(self: RDD[Double]) extends Loggin

[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13834 **[Test build #61103 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61103/consoleFull)** for PR 13834 at commit [`04c8637`](https://github.com/apache/spark/commit/0

[GitHub] spark pull request #13839: [SPARK-16128][SQL] Add truncateTo parameter to Da...

2016-06-23 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/13839#discussion_r68184305 --- Diff: R/pkg/R/DataFrame.R --- @@ -194,7 +195,13 @@ setMethod("isLocal", setMethod("showDF", signature(x = "SparkDataFrame"),

[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-23 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13834 Jenkins retest this please
