[jira] [Commented] (DRILL-6606) Hash Join returns incorrect data types when joining subqueries with limit 0

ASF GitHub Bot (JIRA) Thu, 19 Jul 2018 08:55:23 -0700


    [ 
https://issues.apache.org/jira/browse/DRILL-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16549452#comment-16549452
 ]


ASF GitHub Bot commented on DRILL-6606:
---------------------------------------

ilooner commented on a change in pull request #1384: DRILL-6606: Fixed bug in 
HashJoin that caused it not to return OK_NEW_SCHEMA in some cases.
URL: https://github.com/apache/drill/pull/1384#discussion_r203780860
 
 

 ##########
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java
 ##########
 @@ -317,7 +422,25 @@ public HashJoinMemoryCalculator getCalculatorImpl() {
 
   @Override
   public IterOutcome innerNext() {
-    // In case incoming was killed before, just cleanup and return
+    if (!prefetched) {
+      // If we didn't retrieve our first data hold batch, we need to do it now.
+      prefetched = true;
+      prefetchFirstBatchFromBothSides();
+
+      // Handle emitting the correct outcome for termination conditions
+      // Use the state set by prefetchFirstBatchFromBothSides to emit the 
correct termination outcome.
+      switch (state) {
+        case DONE:
+          return IterOutcome.NONE;
+        case STOP:
+          return IterOutcome.STOP;
+        case OUT_OF_MEMORY:
+          return IterOutcome.OUT_OF_MEMORY;
+        default:
+          // No termination condition so continue processing.
 
 Review comment:
   I took a look at the code in AbstractRecordBatch.next(). The handling is a 
little different for out of memory.
   
   ```
               case DONE:
                 return IterOutcome.NONE;
               case OUT_OF_MEMORY:
                 // because we don't support schema changes, it is safe to fail 
the query right away
                 context.getExecutorState().fail(UserException.memoryError()
                   .build(logger));
                 // FALL-THROUGH
               case STOP:
                 return IterOutcome.STOP;
   ```
   
   I don't see a way to move this code into a common method. If there is a 
clean way please let me know.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Hash Join returns incorrect data types when joining subqueries with limit 0
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-6606
>                 URL: https://issues.apache.org/jira/browse/DRILL-6606
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Bohdan Kazydub
>            Assignee: Timothy Farkas
>            Priority: Blocker
>             Fix For: 1.14.0
>
>
> PreparedStatement for query
> {code:sql}
> SELECT l.l_quantity, l.l_shipdate, o.o_custkey
> FROM (SELECT * FROM cp.`tpch/lineitem.parquet` LIMIT 0) l
>     JOIN (SELECT * FROM cp.`tpch/orders.parquet` LIMIT 0) o 
>     ON l.l_orderkey = o.o_orderkey
> LIMIT 0
> {code}
>  is created with wrong types (nullable INTEGER) for all selected columns, no 
> matter what their actual type is. This behavior reproduces with hash join 
> only and is very likely to be caused by DRILL-6027 as the query works fine 
> before this feature was implemented.
> To reproduce the problem you can put the aforementioned query into 
> TestPreparedStatementProvider#joinOrderByQuery() test method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (DRILL-6606) Hash Join returns incorrect data types when joining subqueries with limit 0

Reply via email to