[GitHub] spark issue #19488: [SPARK-22266][SQL] The same aggregate function was evalu...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19488 **[Test build #82722 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82722/testReport)** for PR 19488 at commit

[GitHub] spark issue #19488: [SPARK-22266][SQL] The same aggregate function was evalu...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19488 add to whitelist --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19488: SPARK-22266 The same aggregate function was evaluated mu...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19488 Update the title to `[SPARK-22266][SQL] The same aggregate function was evaluated multiple times`

[GitHub] spark issue #19488: SPARK-22266 The same aggregate function was evaluated mu...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19488 Can one of the admins verify this patch?

[GitHub] spark pull request #19488: SPARK-22266 The same aggregate function was evalu...

2017-10-12 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/19488 SPARK-22266 The same aggregate function was evaluated multiple times ## What changes were proposed in this pull request? To let the same aggregate function that appears multiple times in

[GitHub] spark pull request #19475: [SPARK-22257][SQL]Reserve all non-deterministic e...

2017-10-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19475

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18460 Thank you, @gatorsmile . Sure, I agree.

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19475 Thanks! Merged to master

[GitHub] spark issue #19470: [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use Spark s...

2017-10-12 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19470 One minor comment doesn't affect this. LGTM.

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19475 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82714/ Test PASSed.

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19475 Merged build finished. Test PASSed.

[GitHub] spark issue #19475: [SPARK-22257][SQL]Reserve all non-deterministic expressi...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19475 **[Test build #82714 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82714/testReport)** for PR 19475 at commit

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use ...

2017-10-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144469314 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,64 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18460 BTW, we are unable to merge this to Spark 2.2 although this is a bug fix.

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #82721 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82721/testReport)** for PR 19433 at commit

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18460 LGTM cc @cloud-fan

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 Will check it if I am not confident next time.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 Ok. Next time, please check it with the committers who are familiar with Spark SQL.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 I did this because I was confident it is a bug: the doc says it should work but it actually does not, and the fix does not break the previous support.

[GitHub] spark issue #19470: [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use Spark s...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19470 LGTM too.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 This one has existed since at least Spark 1.5. If you are not confident whether this is a bug or not, please check before merging it.

[GitHub] spark issue #19470: [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use Spark s...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19470 **[Test build #82720 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82720/testReport)** for PR 19470 at commit

[GitHub] spark pull request #19483: [SPARK-21165][SQL] FileFormatWriter should handle...

2017-10-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19483

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Merged build finished. Test PASSed.

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82717/ Test PASSed.

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #82717 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82717/testReport)** for PR 19433 at commit

[GitHub] spark issue #19483: [SPARK-21165][SQL] FileFormatWriter should handle mismat...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19483 thanks for the review, merging to master!

[GitHub] spark pull request #19484: [SPARK-22252][SQL][2.2] FileFormatWriter should r...

2017-10-12 Thread cloud-fan
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/19484

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19269 **[Test build #82719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82719/testReport)** for PR 19269 at commit

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use ...

2017-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144466637 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,60 @@ class SQLQuerySuite extends

[GitHub] spark pull request #18692: [SPARK-21417][SQL] Infer join conditions using pr...

2017-10-12 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/18692#discussion_r144466472 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -152,3 +152,71 @@ object EliminateOuterJoin extends

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 How come fixing the behaviour as documented is not a bug fix? I think that basically means we don't backport fixes for things not working as documented for other internal configurations.

[GitHub] spark issue #19483: [SPARK-21165][SQL] FileFormatWriter should handle mismat...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19483 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82712/ Test PASSed.

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use ...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144465324 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,60 @@ class SQLQuerySuite extends

[GitHub] spark issue #19484: [SPARK-22252][SQL][2.2] FileFormatWriter should respect ...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19484 Thanks! Merged to 2.2. Could you close this?

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144465575 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,60 @@ class SQLQuerySuite extends

[GitHub] spark issue #19483: [SPARK-21165][SQL] FileFormatWriter should handle mismat...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19483 Merged build finished. Test PASSed.

[GitHub] spark issue #19470: [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use Spark s...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19470 LGTM, pending jenkins

[GitHub] spark issue #19483: [SPARK-21165][SQL] FileFormatWriter should handle mismat...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19483 **[Test build #82712 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82712/testReport)** for PR 19483 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 That conf is an internal one. The end users will not see it. This is not a bug fix. We should not extend the existing functions or introduce new behaviors/features in 2.2.x releases.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 Since the risk is low, I did not revert it.

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19269 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82713/ Test FAILed.

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19269 Merged build finished. Test FAILed.

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19269 **[Test build #82713 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82713/testReport)** for PR 19269 at commit

[GitHub] spark issue #18692: [SPARK-21417][SQL] Infer join conditions using propagate...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18692 cc @gengliangwang Review this?

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18460 Hi, @gatorsmile and @cloud-fan . Could you review this again, too?

[GitHub] spark issue #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema to read...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19470 **[Test build #82718 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82718/testReport)** for PR 19470 at commit

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-10-12 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19222 @tejasapatil Thank you for your comment. I hope that the benchmark result addresses the concern about virtual methods raised by @hvanhovell. @hvanhovell **What do you think?** As a long

[GitHub] spark issue #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema to read...

2017-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19470 @cloud-fan . Thank you so much for the review! I updated the PR except one: If `fieldValue` is `null`, we also use `setNull` again in `else`. So, the current one is simpler. ```scala if

[GitHub] spark issue #19484: [SPARK-22252][SQL][2.2] FileFormatWriter should respect ...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19484 Merged build finished. Test PASSed.

[GitHub] spark issue #19484: [SPARK-22252][SQL][2.2] FileFormatWriter should respect ...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19484 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82711/ Test PASSed.

[GitHub] spark issue #19484: [SPARK-22252][SQL][2.2] FileFormatWriter should respect ...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19484 **[Test build #82711 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82711/testReport)** for PR 19484 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 I think this is a bug to fix as the previous behaviour does not work as documented: ``` subclass of org.apache.hadoop.mapreduce.OutputCommitter... ``` and does not

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-10-12 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/19222 At a high level, this idea is good and worth moving forward with. I still have to dig into your analysis in response to the concern raised by @hvanhovell. In terms of the PR itself, is the

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-10-12 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19222 Sure, I agree with you. I will try performance evaluation for other methods like `getByte()`, and so on.

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-10-12 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/19222 I pulled up the frequency of `UTF8String` methods invoked from FB prod clusters and picked the top 25. ``` .writeToMemory() .getBytes() .toString()

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-10-12 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/19222 Apart from `UTF8String.trim`, can you try some other method? If we have to evaluate performance, it's better to pick a method which would be most frequently used... if I have to guess, `trim()`

[GitHub] spark issue #19419: [SPARK-22188] [CORE] Adding security headers for prevent...

2017-10-12 Thread krishna-pandey
Github user krishna-pandey commented on the issue: https://github.com/apache/spark/pull/19419 @tgravescs These generic headers are about providing available client-side protection for the application. I also think even if there is no sensitive data to formulate an attack by itself

[GitHub] spark issue #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19451 Actually you already have it in the classdoc, so please just update the pr description with it.

[GitHub] spark pull request #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/19451#discussion_r144461898 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1242,6 +1244,53 @@ object ReplaceIntersectWithSemiJoin

[GitHub] spark pull request #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/19451#discussion_r144461913 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1242,6 +1244,53 @@ object ReplaceIntersectWithSemiJoin

[GitHub] spark pull request #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/19451#discussion_r144461813 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1242,6 +1244,53 @@ object ReplaceIntersectWithSemiJoin

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #82717 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82717/testReport)** for PR 19433 at commit

[GitHub] spark issue #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19451 Can you update the PR description with an example plan before / after this optimization, and also put that example in the comment section of the doc?

[GitHub] spark issue #19487: [SPARK-21549][CORE] Respect OutputFormats with no/invali...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19487 Merged build finished. Test PASSed.

[GitHub] spark issue #19487: [SPARK-21549][CORE] Respect OutputFormats with no/invali...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82709/ Test PASSed.

[GitHub] spark issue #19464: [SPARK-22233] [core] Allow user to filter out empty spli...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19464 **[Test build #82716 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82716/testReport)** for PR 19464 at commit

[GitHub] spark issue #19487: [SPARK-21549][CORE] Respect OutputFormats with no/invali...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19487 **[Test build #82709 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82709/testReport)** for PR 19487 at commit

[GitHub] spark pull request #19480: [SPARK-22226][SQL] splitExpression can create too...

2017-10-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19480#discussion_r144459033 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -201,6 +201,23 @@ class

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 This is not eligible for backporting. We should not do it next time.

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-12 Thread smurching
Github user smurching commented on the issue: https://github.com/apache/spark/pull/19433 Sorry, realized I conflated feature subsampling and `subsampleWeights` (instance weights for training examples). IMO feature subsampling can be added in a follow-up PR, but `subsampleWeights`

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144458346 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -426,4 +426,11 @@ package object config { .toSequence

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144458322 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -426,4 +426,11 @@ package object config { .toSequence

[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2017-10-12 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r144458324 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java --- @@ -17,47 +17,168 @@ package

[GitHub] spark issue #19483: [SPARK-21165][SQL] FileFormatWriter should handle mismat...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19483 that will be great, thanks @tejasapatil !

[GitHub] spark issue #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema to read...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19470 LGTM except some minor comments

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144458108 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,80 @@ class SQLQuerySuite extends

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144457977 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,80 @@ class SQLQuerySuite extends

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144457932 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2050,4 +2050,80 @@ class SQLQuerySuite extends

[GitHub] spark issue #19452: [SPARK-22136][SS] Evaluate one-sided conditions early in...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19452 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82710/ Test PASSed.

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144457796 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala --- @@ -272,25 +272,35 @@ private[orc] object OrcRelation extends

[GitHub] spark issue #19452: [SPARK-22136][SS] Evaluate one-sided conditions early in...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19452 Merged build finished. Test PASSed.

[GitHub] spark issue #19452: [SPARK-22136][SS] Evaluate one-sided conditions early in...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19452 **[Test build #82710 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82710/testReport)** for PR 19452 at commit

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144457643 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala --- @@ -138,8 +138,7 @@ class OrcFileFormat extends FileFormat with

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144457479 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala --- @@ -272,25 +272,35 @@ private[orc] object OrcRelation extends

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144457357 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala --- @@ -138,8 +138,7 @@ class OrcFileFormat extends FileFormat with

[GitHub] spark pull request #19470: [SPARK-14387][SPARK-18355][SQL] Use Spark schema ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19470#discussion_r144457235 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala --- @@ -138,8 +138,7 @@ class OrcFileFormat extends FileFormat with

[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2017-10-12 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r144457118 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java --- @@ -17,47 +17,168 @@ package

[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2017-10-12 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r144457126 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java --- @@ -17,47 +17,168 @@ package

[GitHub] spark pull request #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19451#discussion_r144456518 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1242,6 +1244,53 @@ object

[GitHub] spark issue #19483: [SPARK-21165][SQL] FileFormatWriter should handle mismat...

2017-10-12 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/19483 >> I'll refactor it later, to use requiredChildOrdering to do the sort. The hive bucketing PR does that: https://github.com/apache/spark/pull/19001 I can isolate that piece and put

[GitHub] spark pull request #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19451#discussion_r144456402 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1242,6 +1244,53 @@ object

[GitHub] spark pull request #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19451#discussion_r144456109 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1242,6 +1244,53 @@ object

[GitHub] spark pull request #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19451#discussion_r144455482 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1242,6 +1244,53 @@ object

[GitHub] spark pull request #19451: SPARK-22181 Adds ReplaceExceptWithNotFilter rule

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19451#discussion_r144455603 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1242,6 +1244,53 @@ object

[GitHub] spark pull request #19458: [SPARK-22227][CORE] DiskBlockManager.getAllBlocks...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19458#discussion_r144456908 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala --- @@ -100,7 +100,16 @@ private[spark] class DiskBlockManager(conf:

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-12 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144456817 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -426,4 +426,11 @@ package object config { .toSequence

[GitHub] spark issue #19263: [SPARK-22050][CORE] Allow BlockUpdated events to be opti...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19263 **[Test build #82715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82715/testReport)** for PR 19263 at commit

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144456522 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -426,4 +426,11 @@ package object config { .toSequence

[GitHub] spark issue #19483: [SPARK-21165][SQL] FileFormatWriter should handle mismat...

2017-10-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19483 I'll refactor it later, to use `requiredChildOrdering` to do the sort. I just wanna make this bug fix as simple as possible. ---
