[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70028288 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class StringRPad(str:

[GitHub] spark issue #14083: [SPARK-16406][SQL] Improve performance of LogicalPlan.re...

2016-07-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14083 Finally! Congrat! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes

[GitHub] spark issue #14093: SPARK-16420: Ensure compression streams are closed.

2016-07-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14093 cc @JoshRosen

[GitHub] spark pull request #14093: SPARK-16420: Ensure compression streams are close...

2016-07-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14093#discussion_r70028149 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriter.java --- @@ -349,12 +349,19 @@ void forceSorterToSpill() throws IOException {

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread janplus
Github user janplus commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70028094 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread janplus
Github user janplus commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70028081 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark pull request #14093: SPARK-16420: Ensure compression streams are close...

2016-07-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14093#discussion_r70028071 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/LimitedInputStream.java --- @@ -102,4 +118,10 @@ public

[GitHub] spark issue #14078: [SPARK-11857] [Mesos] [WIP] Deprecate fine grained

2016-07-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14078 Still waiting?

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid record-per type dispatch in JSO...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14102 **[Test build #61960 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61960/consoleFull)** for PR 14102 at commit

[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid record-per type dispatch in JSO...

2016-07-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14102 cc @yhuai @liancheng Would you mind taking a quick look at this as well, please?

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70027833 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class StringRPad(str:

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13701 @gatorsmile That depends. In practice, we have many solutions to deal with the case you mentioned. It does not make sense to keep so many tiny parquet files.

[GitHub] spark issue #14095: [SPARK-16429][SQL] Include `StringType` columns in `desc...

2016-07-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14095 Of course!

[GitHub] spark issue #14103: [SPARK-16436][SQL] checkEvaluation support NaN and Runti...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14103 Merged build finished. Test PASSed.

[GitHub] spark issue #14068: enhanced simulate multiply

2016-07-07 Thread uzadude
Github user uzadude commented on the issue: https://github.com/apache/spark/pull/14068 Sure. The current method for multiplying distributed block matrices starts by deciding which block should be shuffled to which partition to do the actual multiplications. This stage is

[GitHub] spark issue #14103: [SPARK-16436][SQL] checkEvaluation support NaN and Runti...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14103 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61955/ Test PASSed. ---

[GitHub] spark issue #14103: [SPARK-16436][SQL] checkEvaluation support NaN and Runti...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14103 **[Test build #61955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61955/consoleFull)** for PR 14103 at commit

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70027483 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class StringRPad(str:

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread janplus
Github user janplus commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70027444 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark issue #14075: [SPARK-16401] [SQL] Data Source API: Enable Extending Re...

2016-07-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14075 cc @cloud-fan and @liancheng for review.

[GitHub] spark issue #11748: [SPARK-13921] Store serialized blocks as multiple chunks...

2016-07-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/11748 @bonitao here's a patch that fixes it https://github.com/apache/spark/pull/14099

[GitHub] spark pull request #14057: [SPARK-15425][SQL] Disallow cross joins, even if ...

2016-07-07 Thread rxin
Github user rxin closed the pull request at: https://github.com/apache/spark/pull/14057

[GitHub] spark issue #14095: [SPARK-16429][SQL] Include `StringType` columns in `desc...

2016-07-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14095 And also update the documentation.

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70027201 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class StringRPad(str:

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70027152 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class StringRPad(str:

[GitHub] spark issue #14071: [SPARK-16397][SQL] make CatalogTable more general and le...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14071 **[Test build #61959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61959/consoleFull)** for PR 14071 at commit

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70027089 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark issue #14083: [SPARK-16406][SQL] Improve performance of LogicalPlan.re...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61952/ Test PASSed. ---

[GitHub] spark issue #14083: [SPARK-16406][SQL] Improve performance of LogicalPlan.re...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14083 Merged build finished. Test PASSed.

[GitHub] spark issue #14083: [SPARK-16406][SQL] Improve performance of LogicalPlan.re...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14083 **[Test build #61952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61952/consoleFull)** for PR 14083 at commit

[GitHub] spark issue #14095: [SPARK-16429][SQL] Include `StringType` columns in `desc...

2016-07-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14095 Oh, sure!

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13991 **[Test build #61958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61958/consoleFull)** for PR 13991 at commit

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread petermaxlee
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/13991 I just added the general xpath function that returns an array of strings too.

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13991 **[Test build #61957 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61957/consoleFull)** for PR 13991 at commit

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread janplus
Github user janplus commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70026365 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread janplus
Github user janplus commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70026344 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13991 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61954/ Test FAILed. ---

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13991 Merged build finished. Test FAILed.

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13991 **[Test build #61954 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61954/consoleFull)** for PR 13991 at commit

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13991 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61953/ Test FAILed. ---

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13991 Merged build finished. Test FAILed.

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13991 **[Test build #61953 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61953/consoleFull)** for PR 13991 at commit

[GitHub] spark issue #13374: [SPARK-13638][SQL] Add escapeAll option to CSV DataFrame...

2016-07-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13374 Yup... would be great if you can update this. Otherwise LGTM.

[GitHub] spark pull request #14103: [SPARK-16436][SQL] checkEvaluation support NaN an...

2016-07-07 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14103#discussion_r70025935 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -63,6 +68,10 @@ trait

[GitHub] spark pull request #14103: [SPARK-16436][SQL] checkEvaluation support NaN an...

2016-07-07 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14103#discussion_r70025887 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -41,7 +41,12 @@ trait

[GitHub] spark issue #13374: [SPARK-13638][SQL] Add escapeAll option to CSV DataFrame...

2016-07-07 Thread jurriaan
Github user jurriaan commented on the issue: https://github.com/apache/spark/pull/13374 I thought it should be named in line with the escapeQuotes method, but what it's doing is more like quoting all values than escaping all. So I guess that name could make sense after all

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13701 : ) Up to you. I think we will see smaller performance gains and a more significant performance penalty when a table contains many small parquet files. I also think that is expected, anyway.

[GitHub] spark pull request #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan f...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13890#discussion_r70025823 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -74,13 +74,71 @@ object RDDConversions { } }

[GitHub] spark pull request #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When L...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14034#discussion_r70025602 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala --- @@ -46,6 +46,20 @@ trait CheckAnalysis extends

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13701 @gatorsmile As I said, it is not an important issue here. What we want to confirm is that there is no significant performance penalty.

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13701 @viirya Yeah, it is not easy to get a full performance picture. I do not know how the Spark community did it in the past. When I was working for the mainframe team, we had dedicated PQAs for measuring

[GitHub] spark pull request #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan f...

2016-07-07 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13890#discussion_r70025315 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -74,13 +74,71 @@ object RDDConversions { } }

[GitHub] spark pull request #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When L...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14034#discussion_r70024905 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -660,7 +660,12 @@ case class

[GitHub] spark pull request #14087: [SPARK-16411][SQL][STREAMING] Add textFile to Str...

2016-07-07 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/14087#discussion_r70024634 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala --- @@ -281,6 +281,31 @@ final class DataStreamReader

[GitHub] spark pull request #14087: [SPARK-16411][SQL][STREAMING] Add textFile to Str...

2016-07-07 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/14087#discussion_r70024651 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala --- @@ -281,6 +281,31 @@ final class DataStreamReader

[GitHub] spark issue #14071: [SPARK-16397][SQL] make CatalogTable more general and le...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14071 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61951/ Test FAILed. ---

[GitHub] spark issue #14071: [SPARK-16397][SQL] make CatalogTable more general and le...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14071 Merged build finished. Test FAILed.

[GitHub] spark issue #14071: [SPARK-16397][SQL] make CatalogTable more general and le...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14071 **[Test build #61951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61951/consoleFull)** for PR 14071 at commit

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70024484 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70024458 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14034 ping @cloud-fan : )

[GitHub] spark pull request #14103: [SPARK-16436][SQL] checkEvaluation support NaN an...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14103#discussion_r70024346 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -63,6 +68,10 @@ trait

[GitHub] spark pull request #14103: [SPARK-16436][SQL] checkEvaluation support NaN an...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14103#discussion_r70024354 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -41,7 +41,12 @@ trait

[GitHub] spark issue #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan for inpu...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13890 LGTM, cc @liancheng to take another look

[GitHub] spark pull request #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan f...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13890#discussion_r70024013 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -74,13 +74,71 @@ object RDDConversions { } }

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13701 @gatorsmile I think I have not run the benchmark enough times to confirm that there is a 5% performance difference. But I think it is not important here because we don't want to measure the exact

[GitHub] spark issue #14102: [SPARK-16434][SQL][WIP] Avoid record-per type dispatch i...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14102 **[Test build #61956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61956/consoleFull)** for PR 14102 at commit

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13701 @viirya If you run it multiple times and still see a 5% performance difference, you can confirm the penalty is around 5%. However, this might also depend on other factors, e.g., the total time.

[GitHub] spark issue #14095: [SPARK-16429][SQL] Include `StringType` columns in `desc...

2016-07-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14095 Can you fix Python?

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13701 @gatorsmile BTW, if the TPC-DS performance is measured with the 2.0 codebase, this should benefit the performance.

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13701 @gatorsmile You know that the benchmark results will not be the same every time, even if you run it with the same code. If the difference is within a small range, we can assume they have no significant

[GitHub] spark issue #14103: [SPARK-16436][SQL] checkEvaluation support NaN and Runti...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14103 **[Test build #61955 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61955/consoleFull)** for PR 14103 at commit

[GitHub] spark issue #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan for inpu...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13890 Merged build finished. Test PASSed.

[GitHub] spark issue #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan for inpu...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13890 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61950/ Test PASSed. ---

[GitHub] spark issue #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan for inpu...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13890 **[Test build #61950 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61950/consoleFull)** for PR 13890 at commit

[GitHub] spark pull request #14103: [SPARK-16436][SQL] checkEvaluation support NaN an...

2016-07-07 Thread petermaxlee
GitHub user petermaxlee opened a pull request: https://github.com/apache/spark/pull/14103 [SPARK-16436][SQL] checkEvaluation support NaN and RuntimeReplaceable ## What changes were proposed in this pull request? This small patch modifies ExpressionEvalHelper.checkEvaluation to

[GitHub] spark issue #14103: [SPARK-16436][SQL] checkEvaluation support NaN and Runti...

2016-07-07 Thread petermaxlee
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/14103 cc @cloud-fan

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13991 **[Test build #61954 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61954/consoleFull)** for PR 13991 at commit

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-07-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13701 Great! A performance penalty of around 5% looks OK to me. Maybe we can send the code changes to the performance team to see the TPC-DS improvement. CC @jfchen

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread petermaxlee
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/13991 I pushed a new change to this. We now have better error messages and test coverage for those. These expressions also now require foldable paths. I also changed the test values to make

[GitHub] spark issue #14083: [SPARK-16406][SQL] Improve performance of LogicalPlan.re...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14083 **[Test build #61952 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61952/consoleFull)** for PR 14083 at commit

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13991 **[Test build #61953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61953/consoleFull)** for PR 13991 at commit

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread janplus
Github user janplus commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70021325 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70020929 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class

[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-07-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13778 The Python UDT on the Python side only serializes the Python data to the SQL type defined in the Python UDT. The problem now occurs on the Java side, when the serialized Python data is serialized to a row. I

[GitHub] spark pull request #14071: [SPARK-16397][SQL] make CatalogTable more general...

2016-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14071#discussion_r70020600 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -45,35 +45,28 @@ case class CatalogFunction( */

[GitHub] spark issue #14071: [SPARK-16397][SQL] make CatalogTable more general and le...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14071 **[Test build #61951 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61951/consoleFull)** for PR 14071 at commit

[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14008#discussion_r70020485 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -652,6 +654,152 @@ case class StringRPad(str:

[GitHub] spark issue #14102: [SPARK-16434][SQL][WIP] Avoid record-per type dispatch i...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14102 Merged build finished. Test FAILed.

[GitHub] spark issue #14102: [SPARK-16434][SQL][WIP] Avoid record-per type dispatch i...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14102 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61949/ Test FAILed. ---

[GitHub] spark issue #14102: [SPARK-16434][SQL][WIP] Avoid record-per type dispatch i...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14102 **[Test build #61949 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61949/consoleFull)** for PR 14102 at commit

[GitHub] spark issue #14100: [SPARK-16433][SQL]Improve StreamingQuery.explain when no...

2016-07-07 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/14100 cc @tdas

[GitHub] spark issue #14008: [SPARK-16281][SQL] Implement parse_url SQL function

2016-07-07 Thread janplus
Github user janplus commented on the issue: https://github.com/apache/spark/pull/14008 cc @cloud-fan @rxin @liancheng I optimized the case where `part` is a Literal, so we don't need to check it for every row. But since we cannot assume that `part` is a Literal in all circumstances, I keep

[GitHub] spark pull request #14065: [SPARK-14743][YARN][WIP] Add a configurable token...

2016-07-07 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/14065#discussion_r70020146 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -390,8 +390,9 @@ private[spark] class Client( // Upload Spark and

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13991 Merged build finished. Test PASSed.

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13991 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61947/ Test PASSed. ---

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement various xpath functions

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13991 **[Test build #61947 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61947/consoleFull)** for PR 13991 at commit

[GitHub] spark issue #14102: [SPARK-16434][SQL][WIP] Avoid record-per type dispatch i...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14102 Merged build finished. Test FAILed.

[GitHub] spark issue #14102: [SPARK-16434][SQL][WIP] Avoid record-per type dispatch i...

2016-07-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14102 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61948/ Test FAILed. ---

[GitHub] spark issue #14102: [SPARK-16434][SQL][WIP] Avoid record-per type dispatch i...

2016-07-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14102 **[Test build #61948 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61948/consoleFull)** for PR 14102 at commit
