[GitHub] spark issue #18808: [SPARK-21605][BUILD] Let IntelliJ IDEA correctly detect ...

2017-08-02 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18808 @srowen 1. just remove hotfix from title 2. encoding is unnecessary, since its default value is [${project.build.sourceEncoding}](https://maven.apache.org/plugins/maven-compiler

[GitHub] spark issue #18808: [SPARK-21605][BUILD] Let IntelliJ IDEA correctly detect ...

2017-08-02 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18808 Jenkins, test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark issue #18808: [SPARK-21605][HOT-FIX][BUILD] Let IntelliJ IDEA correctl...

2017-08-01 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18808 https://issues.apache.org/jira/browse/SPARK-21605 is added

[GitHub] spark issue #18808: [HOT-FIX][BUILD] Let IntelliJ IDEA correctly detect Lang...

2017-08-01 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18808 cc @gslowikowski, @srowen

[GitHub] spark pull request #18808: [HOT-FIX][BUILD] Let IntelliJ IDEA correctly dete...

2017-08-01 Thread baibaichen
GitHub user baibaichen opened a pull request: https://github.com/apache/spark/pull/18808 [HOT-FIX][BUILD] Let IntelliJ IDEA correctly detect Language level and Target byte code version With SPARK-21592, removing source and target properties from maven-compiler-plugin lets IntelliJ

[GitHub] spark issue #18652: [SPARK-21497][SQL] Pull non-deterministic equi join keys...

2017-07-28 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18652 Can we add a flag, e.g. ignore-non-deterministic, so that non-deterministic expressions are treated as deterministic? I believe this is what Hive does.

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18725 @heary-cao your fix is wrong.

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18725 @heary-cao, is performance better with your fix, e.g. after changing RDG's deterministic property from false to true? ``` override def deterministic: Boolean = true

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18725 The `HiveTableScans` strategy needs a `CatalogRelation`, but it's a `LogicalRelation` in my case. Actually, the Hive table is an external table in my test; I guess that's the reason. I believe

[GitHub] spark issue #18725: [SPARK-21520][SQL]Hivetable scan for all the columns the...

2017-07-26 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18725 It's another issue about non-determinism. When generating the SparkPlan in `FileSourceStrategy`, `PhysicalOperation` is used to extract projects and filters on top of the relation. But with [SPARK

[GitHub] spark pull request #18725: [SPARK-21520][SQL]Hivetable scan for all the colu...

2017-07-25 Thread baibaichen
Github user baibaichen commented on a diff in the pull request: https://github.com/apache/spark/pull/18725#discussion_r129276690 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala --- @@ -49,6 +49,8 @@ abstract class RDG

[GitHub] spark issue #18652: [WIP] Pull non-deterministic joining keys from Join oper...

2017-07-20 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18652 @viirya, @jiangxb1987, @gatorsmile In general, Hive doesn't consider non-determinism in join conditions. Some terms: 1. equi-joins with key, i.e. a.key = b.key, using

[GitHub] spark issue #18652: [WIP] Pull non-deterministic joining keys from Join oper...

2017-07-19 Thread baibaichen
Github user baibaichen commented on the issue: https://github.com/apache/spark/pull/18652 The naive database join implementation looks like: ``` for each tuple in left relation for each tuple in right relation matching join condition for each tuple pair
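
The pseudocode in the message above is cut off in this archive, so the following is only a minimal Scala sketch of a naive nested-loop join, not the author's original code and not Spark's implementation. The names `NestedLoopJoinSketch`, `nestedLoopJoin`, `users`, `orders` and `evaluations` are hypothetical and exist only for illustration. The counter shows that, in this sketch, the join condition is evaluated once per tuple pair, which is why a non-deterministic expression inside the condition is a concern.

```scala
object NestedLoopJoinSketch {
  // Hypothetical helper: compare every left tuple with every right tuple
  // and keep the pairs for which the join condition holds.
  def nestedLoopJoin[L, R](left: Seq[L], right: Seq[R])(condition: (L, R) => Boolean): Seq[(L, R)] =
    for {
      l <- left            // outer loop over the left relation
      r <- right           // inner loop over the right relation
      if condition(l, r)   // condition is evaluated once per (l, r) pair
    } yield (l, r)

  // Illustrative data only: (id, name) and (userId, item).
  val users: Seq[(Int, String)]  = Seq((1, "ann"), (2, "bob"))
  val orders: Seq[(Int, String)] = Seq((1, "book"), (1, "pen"), (3, "ink"))

  // Counts how many times the join condition is invoked.
  var evaluations = 0

  val joined: Seq[((Int, String), (Int, String))] =
    nestedLoopJoin(users, orders) { (u, o) =>
      evaluations += 1
      u._1 == o._1         // equi-join on the key column
    }
}
```

With two left tuples and three right tuples the condition runs six times, and only the two pairs with key 1 are kept.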