Review Request 32945: DRILL-2715: Implement nested loop join operator

2015-04-07 Thread Mehant Baid

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32945/
---

Review request for drill and Aman Sinha.


Repository: drill-git


Description
---

This patch implements the nested loop join operator. The main changes are in 
the files NestedLoopJoinBatch and NestedLoopJoinTemplate. This patch only 
contains the execution changes. Planning patch will be posted in a separate 
review request by Aman.


Diffs
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractPhysicalVisitor.java
 27b0ecb 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalVisitor.java
 e6a89d0 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/NestedLoopJoinPOP.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/ExpandableHyperContainerContext.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoin.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatchCreator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java
 PRE-CREATION 
  protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java 9a9d196 
  protocol/src/main/protobuf/UserBitShared.proto 5e44655 

Diff: https://reviews.apache.org/r/32945/diff/


Testing
---

The tests are dependent on the planning changes, hence not uploaded as part of 
this patch. However 
https://github.com/mehant/drill/blob/notin_1/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java
 is a working branch that contains a bunch of tests (planning & execution) 
added for nested loop join.


Thanks,

Mehant Baid



Re: Review Request 32945: DRILL-2715: Implement nested loop join operator

2015-04-08 Thread Hanifi Gunes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32945/#review79434
---


Some initial comments.


exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/ExpandableHyperContainerContext.java


final?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/ExpandableHyperContainerContext.java


Using recordCounts.clear() would be better here.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java


static?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java


static?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java


we should consider making left & right final. Also a null check here would 
be nice.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java


should be final. also can we use longer names here?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


is this instance variable needed when we are holding left instance? can we 
localize?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


the same. isn't this equal to cardinality of rightCounts? localize?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


we should localize this as well.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


localize?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


localize?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


localize?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


use of local variables would be more *performant* here.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


Afaik any non-long integer arithmetic results in an int by default. We may 
save some cpu cycles if we declare nextRightRecordToProcess be short and get 
rid of bit masking here.

Also is there any reason for not altering method signature to 
emitRight(compositeIndex, recordIndex, outIndex) instead of making this 
computation for each tuple?


- Hanifi Gunes


On April 8, 2015, 12:27 a.m., Mehant Baid wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32945/
> ---
> 
> (Updated April 8, 2015, 12:27 a.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> This patch implements the nested loop join operator. The main changes are in 
> the files NestedLoopJoinBatch and NestedLoopJoinTemplate. This patch only 
> contains the execution changes. Planning patch will be posted in a separate 
> review request by Aman.
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractPhysicalVisitor.java
>  27b0ecb 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalVisitor.java
>  e6a89d0 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/NestedLoopJoinPOP.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/ExpandableHyperContainerContext.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoin.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatchCreator.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java
>  PRE-CREATION 
>   protocol/sr

Re: Review Request 32945: DRILL-2715: Implement nested loop join operator

2015-04-09 Thread Mehant Baid

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32945/
---

(Updated April 9, 2015, 9:03 p.m.)


Review request for drill and Aman Sinha.


Changes
---

Addressed review comments.


Repository: drill-git


Description
---

This patch implements the nested loop join operator. The main changes are in 
the files NestedLoopJoinBatch and NestedLoopJoinTemplate. This patch only 
contains the execution changes. Planning patch will be posted in a separate 
review request by Aman.


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractPhysicalVisitor.java
 27b0ecb 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalVisitor.java
 e6a89d0 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/NestedLoopJoinPOP.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoin.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatchCreator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/record/ExpandableHyperContainer.java
 90310e2 
  protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java 9a9d196 
  protocol/src/main/protobuf/UserBitShared.proto 5e44655 

Diff: https://reviews.apache.org/r/32945/diff/


Testing
---

The tests are dependent on the planning changes, hence not uploaded as part of 
this patch. However 
https://github.com/mehant/drill/blob/notin_1/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java
 is a working branch that contains a bunch of tests (planning & execution) 
added for nested loop join.


Thanks,

Mehant Baid



Re: Review Request 32945: DRILL-2715: Implement nested loop join operator

2015-04-09 Thread Mehant Baid


> On April 9, 2015, 1:22 a.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java,
> >  lines 93-94
> > 
> >
> > Will this loop work for left outer join ?  Suppose the right input is 
> > empty, we still want to produce all left rows.  e.g in the following 
> > example query: 
> >   SELECT * FROM t1 
> >LEFT OUTER JOIN
> >   (SELECT * FROM t2 WHERE 1=0) 
> >   ON 1=1  /* for  cartesian join */

Yes, currently we will not output any rows. But like discussed, we wouldn't use 
NLJ for generic outer joins anyways as the filter above would filter out the 
rows not conforming to the join condition. For outer joins we will use NLJ only 
with scalar sub queries.


- Mehant


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32945/#review79455
---


On April 9, 2015, 9:03 p.m., Mehant Baid wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32945/
> ---
> 
> (Updated April 9, 2015, 9:03 p.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> This patch implements the nested loop join operator. The main changes are in 
> the files NestedLoopJoinBatch and NestedLoopJoinTemplate. This patch only 
> contains the execution changes. Planning patch will be posted in a separate 
> review request by Aman.
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractPhysicalVisitor.java
>  27b0ecb 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalVisitor.java
>  e6a89d0 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/NestedLoopJoinPOP.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoin.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatchCreator.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/record/ExpandableHyperContainer.java
>  90310e2 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java 
> 9a9d196 
>   protocol/src/main/protobuf/UserBitShared.proto 5e44655 
> 
> Diff: https://reviews.apache.org/r/32945/diff/
> 
> 
> Testing
> ---
> 
> The tests are dependent on the planning changes, hence not uploaded as part 
> of this patch. However 
> https://github.com/mehant/drill/blob/notin_1/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java
>  is a working branch that contains a bunch of tests (planning & execution) 
> added for nested loop join.
> 
> 
> Thanks,
> 
> Mehant Baid
> 
>



Re: Review Request 32945: DRILL-2715: Implement nested loop join operator

2015-04-09 Thread Mehant Baid


> On April 8, 2015, 10:55 p.m., Hanifi Gunes wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java,
> >  line 39
> > 
> >
> > is this instance variable needed when we are holding left instance? can 
> > we localize?

I will make a local copy of this member for performance but I think having this 
as a class member is ok here. This is used in two cases 1. first time we invoke 
outputRecords(), we use the record count to determine if we need to invoke 
populateOutgoingBatch() otherwise we'll still go over all the records in the 
right. 2. Once we have processed one left batch this is also useful to check if 
we need to end processing. If I remove this I will have to maintain some other 
state like 'isFirst' in the template which does not serve the purpose.


> On April 8, 2015, 10:55 p.m., Hanifi Gunes wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java,
> >  line 98
> > 
> >
> > Afaik any non-long integer arithmetic results in an int by default. We 
> > may save some cpu cycles if we declare nextRightRecordToProcess be short 
> > and get rid of bit masking here.
> > 
> > Also is there any reason for not altering method signature to 
> > emitRight(compositeIndex, recordIndex, outIndex) instead of making this 
> > computation for each tuple?

Good catch.


- Mehant


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32945/#review79434
---


On April 9, 2015, 9:03 p.m., Mehant Baid wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32945/
> ---
> 
> (Updated April 9, 2015, 9:03 p.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> This patch implements the nested loop join operator. The main changes are in 
> the files NestedLoopJoinBatch and NestedLoopJoinTemplate. This patch only 
> contains the execution changes. Planning patch will be posted in a separate 
> review request by Aman.
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractPhysicalVisitor.java
>  27b0ecb 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalVisitor.java
>  e6a89d0 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/NestedLoopJoinPOP.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoin.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatchCreator.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/record/ExpandableHyperContainer.java
>  90310e2 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java 
> 9a9d196 
>   protocol/src/main/protobuf/UserBitShared.proto 5e44655 
> 
> Diff: https://reviews.apache.org/r/32945/diff/
> 
> 
> Testing
> ---
> 
> The tests are dependent on the planning changes, hence not uploaded as part 
> of this patch. However 
> https://github.com/mehant/drill/blob/notin_1/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java
>  is a working branch that contains a bunch of tests (planning & execution) 
> added for nested loop join.
> 
> 
> Thanks,
> 
> Mehant Baid
> 
>



Re: Review Request 32945: DRILL-2715: Implement nested loop join operator

2015-04-08 Thread Aman Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32945/#review79455
---



exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/NestedLoopJoinPOP.java


We could keep the join condition as a placeholder in the POP even though it 
will always be TRUE based on current generated plans. In future we can enhance 
to handle equality/non-equality etc.

Also, keep the JoinType attribute to identify inner and outer joins..



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/ExpandableHyperContainerContext.java


It would be preferable to avoid creating wrapper class unless really 
necessary; could the recordCounts be maintained in the ExpandableHyperContainer 
?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java


This says left batch, should be 'right' batch



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java


Will this loop work for left outer join ?  Suppose the right input is 
empty, we still want to produce all left rows.  e.g in the following example 
query: 
  SELECT * FROM t1 
   LEFT OUTER JOIN
  (SELECT * FROM t2 WHERE 1=0) 
  ON 1=1  /* for  cartesian join */


- Aman Sinha


On April 8, 2015, 12:27 a.m., Mehant Baid wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32945/
> ---
> 
> (Updated April 8, 2015, 12:27 a.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> This patch implements the nested loop join operator. The main changes are in 
> the files NestedLoopJoinBatch and NestedLoopJoinTemplate. This patch only 
> contains the execution changes. Planning patch will be posted in a separate 
> review request by Aman.
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractPhysicalVisitor.java
>  27b0ecb 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalVisitor.java
>  e6a89d0 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/NestedLoopJoinPOP.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/ExpandableHyperContainerContext.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoin.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatchCreator.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinTemplate.java
>  PRE-CREATION 
>   protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java 
> 9a9d196 
>   protocol/src/main/protobuf/UserBitShared.proto 5e44655 
> 
> Diff: https://reviews.apache.org/r/32945/diff/
> 
> 
> Testing
> ---
> 
> The tests are dependent on the planning changes, hence not uploaded as part 
> of this patch. However 
> https://github.com/mehant/drill/blob/notin_1/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java
>  is a working branch that contains a bunch of tests (planning & execution) 
> added for nested loop join.
> 
> 
> Thanks,
> 
> Mehant Baid
> 
>