[GitHub] spark pull request #19606: [SPARK-22333][SQL][Backport-2.2]timeFunctionCall(...

2017-10-31 Thread DonnyZone
Github user DonnyZone closed the pull request at: https://github.com/apache/spark/pull/19606 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #19568: SPARK-22345: Fix sort-merge joins with conditions and co...

2017-10-30 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19568 @rdblue Could you find any cases that can trigger the problem of wrong INPUT_ROW in SortMergeJoinExec after the fix (https://github.com/apache/spark/pull/18656) for CollapseCodegenStages rule? I

[GitHub] spark issue #19568: SPARK-22345: Fix sort-merge joins with conditions and co...

2017-10-30 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19568 This PR is similar to the initial commit from when I tried to fix SPARK-21441 in https://github.com/apache/spark/pull/18656 (https://github.com/apache/spark/pull/18656/commits

[GitHub] spark issue #19606: [SPARK-22333][SQL][Backport-2.2]timeFunctionCall(CURRENT...

2017-10-30 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19606 cc @gatorsmile

[GitHub] spark pull request #19606: [SPARK-22333][SQL][Backport-2.2]timeFunctionCall(...

2017-10-29 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/19606 [SPARK-22333][SQL][Backport-2.2]timeFunctionCall(CURRENT_DATE, CURRENT_TIMESTAMP) has conflicts with columnReference ## What changes were proposed in this pull request

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-29 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 Sure, I will submit it later.

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-27 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 Yes, ordering in `Sort(ordering, global, child)` is resolved in `resolveExpression`

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-27 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 OK, I will close this PR after review and submit a new one, after merging https://github.com/apache/spark/pull/19585

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-27 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 @gatorsmile It seems that we should also support this logic in `resolveExpressions` for the Sort plan, e.g. `select a from t order by current_date`. Therefore, I think current

[GitHub] spark pull request #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, ...

2017-10-27 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/19559#discussion_r147330565 --- Diff: sql/core/src/test/resources/sql-tests/inputs/datetime.sql --- @@ -8,3 +8,18 @@ select to_date(null), to_date('2016-12-31'), to_date('2016-12-31

[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2017-10-26 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19573 Is it similar to the issue below? https://github.com/apache/spark/pull/19178

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-26 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 @gatorsmile There are still two issues that need to be figured out. (1) It will be complicated to determine whether a literal function should be resolved as Expression

[GitHub] spark pull request #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, ...

2017-10-26 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/19559#discussion_r147068164 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -783,6 +783,25 @@ class Analyzer

[GitHub] spark pull request #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, ...

2017-10-25 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/19559#discussion_r147041980 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -139,6 +139,7 @@ class Analyzer

[GitHub] spark issue #19559: [SPARK-22333][SQL]timeFunctionCall(CURRENT_DATE, CURRENT...

2017-10-25 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 @gatorsmile Thanks for your advice, I will work on it.

[GitHub] spark issue #19559: [SPARK-22333][SQL]ColumnReference should get higher prio...

2017-10-23 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 @hvanhovell Yes! I made a mistake. The `timeFunctionCall` has conflicts with `columnReference`. This fix will break every use of CURRENT_DATE/CURRENT_TIMESTAMP. For [SPARK-16836

[GitHub] spark issue #19559: [SPARK-22333][SQL]ColumnReference should get higher prio...

2017-10-23 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 ping @gatorsmile @hvanhovell @cloud-fan

[GitHub] spark issue #19559: [SPARK-22333][SQL]ColumnReference should get higher prio...

2017-10-23 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19559 ping @hvanhovell @gatorsmile

[GitHub] spark pull request #19559: [SPARK-22333][SQL]ColumnReference should get high...

2017-10-23 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/19559 [SPARK-22333][SQL]ColumnReference should get higher priority than timeFunctionCall(CURRENT_DATE, CURRENT_TIMESTAMP) ## What changes were proposed in this pull request? https

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2017-09-28 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19175 cc @hvanhovell @gatorsmile

[GitHub] spark pull request #19175: [SPARK-21964][SQL]Enable splitting the Aggregate ...

2017-09-25 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/19175#discussion_r140741265 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1287,3 +1288,33 @@ object

[GitHub] spark pull request #19175: [SPARK-21964][SQL]Enable splitting the Aggregate ...

2017-09-25 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/19175#discussion_r140741053 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1287,3 +1288,33 @@ object

[GitHub] spark pull request #19175: [SPARK-21964][SQL]Enable splitting the Aggregate ...

2017-09-25 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/19175#discussion_r140729331 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1287,3 +1288,33 @@ object

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2017-09-25 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19175 @cloud-fan Do you have time to review this PR? We found it is useful in high dimensional cube cases.

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2017-09-14 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19175 ping! @hvanhovell @cloud-fan @gatorsmile Do you have any ideas about this optimization? We found it is useful in some scenarios

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2017-09-12 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19175 Could you help to review this PR? @jiangxb1987 @hvanhovell

[GitHub] spark issue #19202: [SPARK-21980][SQL]References in grouping functions shoul...

2017-09-12 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19202 ping @cloud-fan @gatorsmile

[GitHub] spark pull request #19202: [SPARK-21980][SQL]References in grouping function...

2017-09-12 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/19202 [SPARK-21980][SQL]References in grouping functions should be indexed with resolver ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-21980

[GitHub] spark pull request #19178: [SPARK-21966][SQL]ResolveMissingReference rule sh...

2017-09-11 Thread DonnyZone
Github user DonnyZone closed the pull request at: https://github.com/apache/spark/pull/19178

[GitHub] spark issue #19178: [SPARK-21966][SQL]ResolveMissingReference rule should no...

2017-09-11 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/19178 Thanks, I will try it in our private repository, as there are several such cases and the users want it to work in a seamless way. General support is really complicated. Should I close

[GitHub] spark pull request #19178: [SPARK-21966][SQL]ResolveMissingReference rule sh...

2017-09-10 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/19178#discussion_r137979887 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1115,6 +1115,8 @@ class Analyzer

[GitHub] spark pull request #19178: [SPARK-21966][SQL]ResolveMissingReference rule sh...

2017-09-10 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/19178 [SPARK-21966][SQL]ResolveMissingReference rule should not ignore Union ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-21966

[GitHub] spark pull request #19175: [SPARK-21964][SQL]Enable splitting the Aggregate ...

2017-09-09 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/19175 [SPARK-21964][SQL]Enable splitting the Aggregate (on Expand) into a number of Aggregates for grouping analytics ## What changes were proposed in this pull request? https

[GitHub] spark issue #18986: [SPARK-21774][SQL] The rule PromoteStrings should cast a...

2017-08-20 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18986 @gatorsmile For this issue, I think the behavior in the PromoteStrings rule is reasonable, but there are problems in the underlying converter UTF8String. As described in PR-15880 (https

[GitHub] spark pull request #18986: [SPARK-21774][SQL] The rule PromoteStrings should...

2017-08-18 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18986#discussion_r133916434 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -127,8 +127,10 @@ object TypeCoercion

[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...

2017-08-17 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18960 Moreover, how about using `CastSupport.cast`? Shall I initialize a `DataSourceAnalysis` or `DataSourceStrategy`? --- If your project is set up for it, you can reply to this email and have your

[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...

2017-08-17 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18960 Test case updated.

[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...

2017-08-17 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133631784 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/QueryPartitionSuite.scala --- @@ -68,4 +68,25 @@ class QueryPartitionSuite extends QueryTest

[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...

2017-08-16 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133618679 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -227,7 +228,8 @@ class HadoopTableReader( def

[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...

2017-08-16 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133614757 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala --- @@ -104,7 +105,7 @@ case class HiveTableScanExec

[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...

2017-08-16 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133612759 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala --- @@ -104,7 +105,7 @@ case class HiveTableScanExec

[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...

2017-08-16 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133607798 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -227,7 +228,8 @@ class HadoopTableReader( def

[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...

2017-08-16 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18960 @cloud-fan @gatorsmile

[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...

2017-08-16 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18960 @cloud-fan @gatorsmile

[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...

2017-08-16 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/18960 [SPARK-21739][SQL]Cast expression should initialize timezoneId when it is called statically to convert something into TimestampType ## What changes were proposed in this pull request

[GitHub] spark issue #18946: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-15 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18946 Is Jenkins unstable?

[GitHub] spark issue #18946: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-14 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18946 @gatorsmile

[GitHub] spark pull request #18946: [SPARK-19471][SQL]AggregationIterator does not in...

2017-08-14 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/18946 [SPARK-19471][SQL]AggregationIterator does not initialize the generated result projection before using it ## What changes were proposed in this pull request? This is a follow-up PR

[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-14 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18920 Sure, I will do it later.

[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-14 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18920 retest please

[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-14 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18920 Updated, thanks for reviewing.

[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18920 Jenkins, retest this please.

[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18920 updated

[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-13 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18920 Jenkins, retest this please.

[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-11 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18920 @hvanhovell, @yangw1234, @gatorsmile

[GitHub] spark issue #18920: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-11 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18920 Jenkins, test this please

[GitHub] spark pull request #18920: [SPARK-19471][SQL]AggregationIterator does not in...

2017-08-11 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/18920 [SPARK-19471][SQL]AggregationIterator does not initialize the generated result projection before using it ## What changes were proposed in this pull request? Recently, we have also

[GitHub] spark issue #18919: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-11 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18919 There are some conflicts, so I will close it first.

[GitHub] spark pull request #18919: [SPARK-19471][SQL]AggregationIterator does not in...

2017-08-11 Thread DonnyZone
Github user DonnyZone closed the pull request at: https://github.com/apache/spark/pull/18919

[GitHub] spark pull request #18919: [SPARK-19471][SQL]AggregationIterator does not in...

2017-08-11 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/18919 [SPARK-19471][SQL]AggregationIterator does not initialize the generated result projection before using it ## What changes were proposed in this pull request? Recently, we have also

[GitHub] spark issue #18656: [SPARK-21441][SQL]Incorrect Codegen in SortMergeJoinExec...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 Thanks for reviewing, I will add a test later.

[GitHub] spark pull request #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinEx...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18656#discussion_r128142467 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -489,13 +489,13 @@ case class CollapseCodegenStages

[GitHub] spark pull request #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinEx...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18656#discussion_r128142370 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -489,13 +489,13 @@ case class CollapseCodegenStages

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 I have validated both cases with and without CodegenFallback expressions for `SortMergeJoinExec`. The fix works well.

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 Great! I'm also considering disabling codegen for `SortMergeJoinExec` with CodegenFallback expressions. Thanks for your advice. I will work on it and validate in our environment

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 I notice that the CollapseCodegenStages rule will still enable codegen for SortMergeJoinExec without checking CodegenFallback expressions. The logic in `insertInputAdapter` seems to skip

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 That's interesting, I will take a look at why the codegen is enabled.

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 Yeah, CodegenFallback just provides a fallback mode. However, in such a case, SortMergeJoinExec passes an incomplete row as input to a hiveUDF that implements CodegenFallback.

[GitHub] spark issue #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinExec resu...

2017-07-18 Thread DonnyZone
Github user DonnyZone commented on the issue: https://github.com/apache/spark/pull/18656 Hi @cloud-fan, @vanzin, could you help to take a look?

[GitHub] spark pull request #18656: [SPARK-21441]Incorrect Codegen in SortMergeJoinEx...

2017-07-17 Thread DonnyZone
GitHub user DonnyZone opened a pull request: https://github.com/apache/spark/pull/18656 [SPARK-21441]Incorrect Codegen in SortMergeJoinExec results failures in some cases ## What changes were proposed in this pull request? https://issues.apache.org/jira/projects/SPARK