[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431921#comment-15431921 ] Parth Chandra commented on DRILL-4476: -- The PR for DRILL-4510 is still open. I'm almost sure this is also causing DRILL-4850. > Enhance Union-All operator for dealing with empty left input or empty both > inputs > - > > Key: DRILL-4476 > URL: https://issues.apache.org/jira/browse/DRILL-4476 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Sean Hsuan-Yi Chu >Assignee: Sean Hsuan-Yi Chu > Fix For: 1.7.0 > > > Union-All operator does not deal with the situation where left side comes > from empty source. > Due to DRILL-2288's enhancement for empty sources, Union-All operator now can > be allowed to support this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430349#comment-15430349 ] Khurram Faraaz commented on DRILL-4476: --- Has this change been backed out of drill master ?
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331914#comment-15331914 ] Zelaine Fong commented on DRILL-4476: - I don't believe it has been. [~amansinha100]?
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331510#comment-15331510 ] Khurram Faraaz commented on DRILL-4476: --- As per the last comment, this fix needs to be backed out of master ? Was this backed out ?
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271035#comment-15271035 ] Sean Hsuan-Yi Chu commented on DRILL-4476: --
The logic in this patch can cause the Union-All operator to misjudge whether a record batch comes from an empty file/table.
To be more specific, this patch uses the number of rows to determine whether the record batch from the left side was produced by an empty file/table. (This logic was already applied to the right side of Union-All in the DRILL-2288 patch.)
However, the number of rows is not sufficient to make a correct judgement. For instance, a record batch that has passed through a "false filter" or "limit 0" will carry 0 rows. As another example (DRILL-4510), when multiple Union-Alls are running in parallel, one of them might just happen to receive 0 rows (due to data partitioning). That Union-All will then infer an output schema which is different from the others.
Unless the record batch can somehow capture information about its source (whether it comes from an empty file), Union-All cannot make a correct judgement.
Thus, I think we have to back out this patch, and not start supporting empty files until there is a way to carry the empty-file information with the record batch (possibly by having a new type to represent a non-existent column?).
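The ambiguity described in the comment above can be illustrated with a small sketch. It does not use Drill's API: `ZeroRowBatchSketch`, `Origin`, `looksLikeEmptySource` and `isReallyEmptySource` are hypothetical names introduced only for illustration. The sketch assumes the operator sees nothing but a record count, which is the situation the comment describes.

```java
// Hypothetical stand-in for a record batch; this is not Drill's RecordBatch API.
final class ZeroRowBatchSketch {
    // What actually produced the batch. Per the comment, Union-All has no way to see this.
    enum Origin { EMPTY_FILE, FALSE_FILTER, LIMIT_ZERO, PARTITION_WITHOUT_ROWS }

    final Origin origin;
    final int recordCount;

    ZeroRowBatchSketch(Origin origin, int recordCount) {
        this.origin = origin;
        this.recordCount = recordCount;
    }

    // The heuristic under discussion: zero rows is taken to mean "empty source".
    static boolean looksLikeEmptySource(ZeroRowBatchSketch batch) {
        return batch.recordCount == 0;
    }

    // What a correct check would need: information about the source itself.
    static boolean isReallyEmptySource(ZeroRowBatchSketch batch) {
        return batch.origin == Origin.EMPTY_FILE;
    }

    public static void main(String[] args) {
        // Every origin yields a zero-row batch, so the heuristic cannot tell them apart.
        for (Origin origin : Origin.values()) {
            ZeroRowBatchSketch batch = new ZeroRowBatchSketch(origin, 0);
            System.out.println(origin
                + ": heuristic=" + looksLikeEmptySource(batch)
                + ", actual=" + isReallyEmptySource(batch));
        }
    }
}
```

The heuristic answers the same way for all four origins, while only one of them is an empty file. That is the gap the comment proposes to close by carrying source information with the batch.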
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270943#comment-15270943 ] Sean Hsuan-Yi Chu commented on DRILL-4476: -- The change in UnionAllRecordBatch is related.
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195029#comment-15195029 ] Khurram Faraaz commented on DRILL-4476: ---
Verified on MapR Drill 1.7.0 commit ID : 050ff967
Empty input on left side of Union All.
{noformat}
0: jdbc:drill:schema=dfs.tmp> select columns[0] from `empty.csv` union all select columns[0] from `vchar_f.csv`;
+---------+
| EXPR$0  |
+---------+
| 12345   |
| 1       |
| 0       |
| -1      |
| 20      |
| 100     |
| 65535   |
| 13      |
| 19      |
| 17      |
| 1       |
| 10101   |
+---------+
12 rows selected (0.538 seconds)
{noformat}
Empty input on left side of Union All.
{noformat}
0: jdbc:drill:schema=dfs.tmp> select key from `empty.json` union all select key from `fewKeys.json`;
+------+
| key  |
+------+
| 1    |
| 3    |
| 2    |
| 5    |
| 6    |
| 4    |
| 7    |
| 9    |
| 8    |
| 11   |
| 10   |
+------+
11 rows selected (0.34 seconds)
{noformat}
Empty input on both sides on Union All.
{noformat}
0: jdbc:drill:schema=dfs.tmp> select key from `empty.json` union all select key from `empty.json`;
+------+
| key  |
+------+
+------+
No rows selected (0.201 seconds)
{noformat}
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192128#comment-15192128 ] ASF GitHub Bot commented on DRILL-4476: --- Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/407
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191800#comment-15191800 ] ASF GitHub Bot commented on DRILL-4476: --- Github user hsuanyi commented on the pull request: https://github.com/apache/drill/pull/407#issuecomment-195794797 Comments were added to the code; Also passed all the pre commit test. Can you help commit this patch ?
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190320#comment-15190320 ] ASF GitHub Bot commented on DRILL-4476: --- Github user amansinha100 commented on the pull request: https://github.com/apache/drill/pull/407#issuecomment-195138369 Updated changes LGTM. +1
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190319#comment-15190319 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user amansinha100 commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55779800
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java ---
    @@ -294,13 +313,41 @@ public UnionAllInput(UnionAllRecordBatch unionAllRecordBatch, RecordBatch left,
           rightSide = new OneSideInput(right);
         }
    +    private boolean isBothSideEmpty() {
    +      return leftIsFinish && rightIsFinish;
    +    }
    +
         public IterOutcome nextBatch() throws SchemaChangeException {
           if(upstream == RecordBatch.IterOutcome.NOT_YET) {
             IterOutcome iterLeft = leftSide.nextBatch();
             switch(iterLeft) {
               case OK_NEW_SCHEMA:
    -            break;
    +            whileLoop:
    +            while(leftSide.getRecordBatch().getRecordCount() == 0) {
    +              iterLeft = leftSide.nextBatch();
    +
    +              switch(iterLeft) {
    +                case STOP:
    +                case OUT_OF_MEMORY:
    +                  return iterLeft;
    +
    +                case NONE:
    --- End diff --
Could you add comments here about why we need to keep reading after getting OK_NEW_SCHEMA. Also, do any of the union-all tests (either unit or functional) exercise this code ? i.e where the left input has 2 or more empty files ?
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189938#comment-15189938 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user amansinha100 commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55749381
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java ---
    @@ -162,6 +162,25 @@ private IterOutcome doWork() throws ClassTransformationException, IOException, S
         allocationVectors = Lists.newArrayList();
         transfers.clear();
    +    // If both sides of Union-All are empty
    +    if(unionAllInput.isBothSideEmpty()) {
    +      for(int i = 0; i < outputFields.size(); ++i) {
    --- End diff --
Ok, I suppose we are not supporting the SELECT * case, so the output fields only depends on what is specified in the SELECT list, not from the table.
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188421#comment-15188421 ] ASF GitHub Bot commented on DRILL-4476: --- Github user hsuanyi commented on the pull request: https://github.com/apache/drill/pull/407#issuecomment-194598924 @amansinha100 I consolidated the two logic into one: https://github.com/hsuanyi/incubator-drill/blob/DRILL-4476/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java#L597 Can you help review again? Thanks.
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188252#comment-15188252 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user hsuanyi commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55605493
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java ---
    @@ -491,6 +556,25 @@ private void inferOutputFieldsFromLeftSide() {
           }
         }
    +    private void inferOutputFieldsFromRightSide() {
    --- End diff --
Inferencing happens only once when Drill receives the very first batches from left and right. (Schema-change is not yet supported for Union-All). Let me summarize the inference in four different situations:
First of all, the field names are always determined by the left side (even when the left side is from empty file, we have the column names. Please see the comment above.)
1. Left: non-empty; Right: non-empty => types determined by both sides with implicit casting involved
2. Left: empty; Right: non-empty => type from the right
3. Left: non-empty; Right: empty => types from the left
4. Left: empty; Right: empty => types are nullable integer
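The four situations listed in the comment above can be written down as a small decision function. This is only a sketch of the rule as stated in the comment, not the code in `UnionAllRecordBatch`: the class `UnionAllInferenceSketch`, the enum `TypeSource` and the method `typeSource` are hypothetical names, and the implicit-casting rules themselves are deliberately left out.

```java
// Sketch of the four-way inference rule; names are hypothetical, not Drill's.
final class UnionAllInferenceSketch {
    // Where the output column types come from. Field names always come from the left side.
    enum TypeSource {
        BOTH_SIDES_WITH_IMPLICIT_CAST, // case 1
        RIGHT_SIDE,                    // case 2
        LEFT_SIDE,                     // case 3
        NULLABLE_INT_DEFAULT           // case 4
    }

    static TypeSource typeSource(boolean leftEmpty, boolean rightEmpty) {
        if (!leftEmpty && !rightEmpty) {
            return TypeSource.BOTH_SIDES_WITH_IMPLICIT_CAST;
        }
        if (leftEmpty && !rightEmpty) {
            return TypeSource.RIGHT_SIDE;
        }
        if (!leftEmpty) {
            // Only the right side is empty.
            return TypeSource.LEFT_SIDE;
        }
        // Both sides are empty.
        return TypeSource.NULLABLE_INT_DEFAULT;
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        for (boolean leftEmpty : flags) {
            for (boolean rightEmpty : flags) {
                System.out.println("leftEmpty=" + leftEmpty + ", rightEmpty=" + rightEmpty
                    + " -> " + typeSource(leftEmpty, rightEmpty));
            }
        }
    }
}
```

Note that the rule takes "empty" as a given input. How reliably a side can be classified as empty is a separate question that the sketch does not address.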
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188136#comment-15188136 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user hsuanyi commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55597867
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java ---
    @@ -294,13 +313,41 @@ public UnionAllInput(UnionAllRecordBatch unionAllRecordBatch, RecordBatch left,
           rightSide = new OneSideInput(right);
         }
    +    private boolean isBothSideEmpty() {
    +      return leftIsFinish && rightIsFinish;
    +    }
    +
         public IterOutcome nextBatch() throws SchemaChangeException {
           if(upstream == RecordBatch.IterOutcome.NOT_YET) {
             IterOutcome iterLeft = leftSide.nextBatch();
             switch(iterLeft) {
               case OK_NEW_SCHEMA:
    -            break;
    +            whileLoop:
    +            while(leftSide.getRecordBatch().getRecordCount() == 0) {
    +              iterLeft = leftSide.nextBatch();
    +
    +              switch(iterLeft) {
    +                case STOP:
    +                case OUT_OF_MEMORY:
    +                  return iterLeft;
    +
    +                case NONE:
    --- End diff --
The while loop is for the case where the first few batches are empty and then Union-All receives NONE. So we need to replicate this logic on the right side too.
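The behaviour described in the reply above (keep reading while the leading batches carry zero rows, and stop on NONE, STOP or OUT_OF_MEMORY) can be sketched in isolation. This is a simplified model under stated assumptions, not the patch itself: `SkipEmptyBatchesSketch`, `Step` and `skipLeadingEmptyBatches` are hypothetical, and `Outcome` only mirrors the outcome names that appear in the diff.

```java
import java.util.Arrays;
import java.util.Iterator;

// Simplified model of one input side; not Drill's RecordBatch or IterOutcome.
final class SkipEmptyBatchesSketch {
    enum Outcome { OK_NEW_SCHEMA, OK, NONE, STOP, OUT_OF_MEMORY }

    // One call to nextBatch(): the outcome plus the record count of the batch it produced.
    static final class Step {
        final Outcome outcome;
        final int recordCount;

        Step(Outcome outcome, int recordCount) {
            this.outcome = outcome;
            this.recordCount = recordCount;
        }
    }

    // Reads until a batch with rows arrives or the side terminates; returns the outcome seen last.
    static Outcome skipLeadingEmptyBatches(Iterator<Step> side) {
        while (side.hasNext()) {
            Step step = side.next();
            switch (step.outcome) {
                case STOP:
                case OUT_OF_MEMORY:
                case NONE:
                    // Terminal outcomes end the loop regardless of the record count.
                    return step.outcome;
                default:
                    if (step.recordCount > 0) {
                        return step.outcome;
                    }
                    // Zero-row batch: fall out of the switch and keep reading.
            }
        }
        // The model treats an exhausted iterator like NONE.
        return Outcome.NONE;
    }

    public static void main(String[] args) {
        Iterator<Step> allEmpty = Arrays.asList(
            new Step(Outcome.OK_NEW_SCHEMA, 0),
            new Step(Outcome.OK, 0),
            new Step(Outcome.NONE, 0)).iterator();
        System.out.println(skipLeadingEmptyBatches(allEmpty));
    }
}
```

Writing the loop as a helper that takes the side as an argument is one way the same logic could serve both the left and the right input, which is what the reply asks for.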
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188130#comment-15188130 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user hsuanyi commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55597640
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java ---
    @@ -527,6 +532,52 @@ public void testUnionAllRightEmptyJson() throws Exception {
             .build().run();
       }
    +  @Test
    +  public void testUnionAllLeftEmptyJson() throws Exception {
    --- End diff --
Added a few for both union-all and union-distinct
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188113#comment-15188113 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user hsuanyi commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55596051
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java ---
    @@ -527,6 +532,52 @@ public void testUnionAllRightEmptyJson() throws Exception {
             .build().run();
       }
    +  @Test
    +  public void testUnionAllLeftEmptyJson() throws Exception {
    +    final String rootEmpty = FileUtils.getResourceAsFile("/project/pushdown/empty.json").toURI().toString();
    +    final String rootSimple = FileUtils.getResourceAsFile("/store/json/booleanData.json").toURI().toString();
    +
    +    final String queryRightEmpty = String.format(
    --- End diff --
addressed
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188106#comment-15188106 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user hsuanyi commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55595195
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java ---
    @@ -162,6 +162,25 @@ private IterOutcome doWork() throws ClassTransformationException, IOException, S
         allocationVectors = Lists.newArrayList();
         transfers.clear();
    +    // If both sides of Union-All are empty
    +    if(unionAllInput.isBothSideEmpty()) {
    +      for(int i = 0; i < outputFields.size(); ++i) {
    --- End diff --
For example, you could do:
    select a, b, c from empty union all select a, b, c from empty
Then, the output should be (a, b, c) as column names.
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187551#comment-15187551 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user amansinha100 commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55561579
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java ---
    @@ -527,6 +532,52 @@ public void testUnionAllRightEmptyJson() throws Exception {
             .build().run();
       }
    +  @Test
    +  public void testUnionAllLeftEmptyJson() throws Exception {
    --- End diff --
Are there any union-all tests with WHERE 1 = 0 filters on the left and/or right inputs ? If not, could you add such tests ? We should distinguish between a truly empty input vs. one that occurs as a result of a False filter.
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187542#comment-15187542 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user amansinha100 commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r55561188
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java ---
    @@ -294,13 +313,41 @@ public UnionAllInput(UnionAllRecordBatch unionAllRecordBatch, RecordBatch left,
           rightSide = new OneSideInput(right);
         }
    +    private boolean isBothSideEmpty() {
    +      return leftIsFinish && rightIsFinish;
    +    }
    +
         public IterOutcome nextBatch() throws SchemaChangeException {
           if(upstream == RecordBatch.IterOutcome.NOT_YET) {
             IterOutcome iterLeft = leftSide.nextBatch();
             switch(iterLeft) {
               case OK_NEW_SCHEMA:
    -            break;
    +            whileLoop:
    +            while(leftSide.getRecordBatch().getRecordCount() == 0) {
    +              iterLeft = leftSide.nextBatch();
    +
    +              switch(iterLeft) {
    +                case STOP:
    +                case OUT_OF_MEMORY:
    +                  return iterLeft;
    +
    +                case NONE:
    --- End diff --
Is the NONE state not reached outside this while loop ? It wasn't clear why we need the inner while loop for the left input considering that you don't do that for the right input.
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187520#comment-15187520 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user amansinha100 commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r9565
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java ---
    @@ -491,6 +556,25 @@ private void inferOutputFieldsFromLeftSide() {
           }
         }
    +    private void inferOutputFieldsFromRightSide() {
    --- End diff --
It seems like the inferencing is repeated a few times. You could have a single function like inferOutputFieldNameAndType(outputFieldNames, batch). The field type is always inferred from the supplied batch. For the output names, if the outputFieldNames is null, then use the output field names as well as type from that batch (this would be the case for left input). If not null, then use the field names. Would that work ?
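The single helper proposed in the review comment above could look roughly like the following. Only the function name and its two parameters come from the comment. Everything else is an assumption made for illustration: the `Field` class, the use of plain strings for types, and positional matching of names to batch columns are hypothetical and do not reflect Drill's `MaterializedField` or type classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the reviewer's suggestion; the types here are hypothetical stand-ins.
final class InferOutputFieldsSketch {
    static final class Field {
        final String name;
        final String type;

        Field(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // Types always come from the supplied batch. Names come from the batch only when
    // outputFieldNames is null (the left-input case); otherwise the given names are used.
    // Assumption: names are matched to batch columns by position.
    static List<Field> inferOutputFieldNameAndType(List<String> outputFieldNames, List<Field> batchFields) {
        List<Field> output = new ArrayList<>();
        for (int i = 0; i < batchFields.size(); i++) {
            Field fromBatch = batchFields.get(i);
            String name = (outputFieldNames == null) ? fromBatch.name : outputFieldNames.get(i);
            output.add(new Field(name, fromBatch.type));
        }
        return output;
    }

    public static void main(String[] args) {
        List<Field> rightBatch = Arrays.asList(new Field("x", "VARCHAR"), new Field("y", "BIGINT"));
        // Left side supplied the names a and b; the right batch supplies the types.
        for (Field f : inferOutputFieldNameAndType(Arrays.asList("a", "b"), rightBatch)) {
            System.out.println(f.name + " : " + f.type);
        }
    }
}
```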
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187497#comment-15187497 ] ASF GitHub Bot commented on DRILL-4476: ---
Github user amansinha100 commented on a diff in the pull request: https://github.com/apache/drill/pull/407#discussion_r8287
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java ---
    @@ -527,6 +532,52 @@ public void testUnionAllRightEmptyJson() throws Exception {
             .build().run();
       }
    +  @Test
    +  public void testUnionAllLeftEmptyJson() throws Exception {
    +    final String rootEmpty = FileUtils.getResourceAsFile("/project/pushdown/empty.json").toURI().toString();
    +    final String rootSimple = FileUtils.getResourceAsFile("/store/json/booleanData.json").toURI().toString();
    +
    +    final String queryRightEmpty = String.format(
    --- End diff --
The name suggests right is empty but the query has left empty...
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187488#comment-15187488 ]

ASF GitHub Bot commented on DRILL-4476:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/407#discussion_r7460

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java ---
@@ -162,6 +162,25 @@ private IterOutcome doWork() throws ClassTransformationException, IOException, S
 allocationVectors = Lists.newArrayList();
 transfers.clear();
+// If both sides of Union-All are empty
+if(unionAllInput.isBothSideEmpty()) {
+ for(int i = 0; i < outputFields.size(); ++i) {
--- End diff --

If both sides are empty, wouldn't outputFields.size() be 0? Or is outputFields populated from a previous batch, which would imply that both sides may not have been empty? Can you clarify?

> Enhance Union-All operator for dealing with empty left input or empty both inputs
> -
>
> Key: DRILL-4476
> URL: https://issues.apache.org/jira/browse/DRILL-4476
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Reporter: Sean Hsuan-Yi Chu
> Assignee: Sean Hsuan-Yi Chu
>
> Union-All operator does not deal with the situation where left side comes from empty source.
> Due to DRILL-2288's enhancement for empty sources, Union-All operator now can be allowed to support this scenario.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180752#comment-15180752 ] ASF GitHub Bot commented on DRILL-4476: --- Github user hsuanyi commented on the pull request: https://github.com/apache/drill/pull/407#issuecomment-192523111 @amansinha100 can you help review this? > Enhance Union-All operator for dealing with empty left input or empty both > inputs > - > > Key: DRILL-4476 > URL: https://issues.apache.org/jira/browse/DRILL-4476 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Sean Hsuan-Yi Chu >Assignee: Sean Hsuan-Yi Chu > > Union-All operator does not deal with the situation where left side comes > from empty source. > Due to DRILL-2288's enhancement for empty sources, Union-All operator now can > be allowed to support this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs
[ https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180744#comment-15180744 ]

ASF GitHub Bot commented on DRILL-4476:
---

GitHub user hsuanyi opened a pull request:

https://github.com/apache/drill/pull/407

DRILL-4476: Allow UnionAllRecordBatch to manager situations where lef…

…t input side or both sides come(s) from empty source(s).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsuanyi/incubator-drill DRILL-4476

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/407.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #407

commit 5a384a14f23827aee8b20a8ac6529b33bc66ee2e
Author: Hsuan-Yi Chu
Date: 2016-03-04T21:50:02Z

DRILL-4476: Allow UnionAllRecordBatch to manager situations where left input side or both sides come(s) from empty source(s).

> Enhance Union-All operator for dealing with empty left input or empty both inputs
> -
>
> Key: DRILL-4476
> URL: https://issues.apache.org/jira/browse/DRILL-4476
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Reporter: Sean Hsuan-Yi Chu
> Assignee: Sean Hsuan-Yi Chu
>
> Union-All operator does not deal with the situation where left side comes from empty source.
> Due to DRILL-2288's enhancement for empty sources, Union-All operator now can be allowed to support this scenario.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)