> On Jan. 26, 2015, 5:51 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java,
> >  line 72
> > <https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line72>
> >
> >     You should specifically note that this doesn't cover the situation 
> > where we have multiple distinct partitions.

What do you mean by "multiple distinct partitions" ?


> On Jan. 26, 2015, 5:51 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java,
> >  line 87
> > <https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line87>
> >
> >     This logic seems a little backwards from what I was expecting.  Batches 
> > are physical concept.  Shouldn't these be stated based on p0.  For example, 
> > it seems like we should be able to process p0 the moment we have received 
> > b0.  We can't do p1 until we get to b2 and see that p1 has ended, etc.  As 
> > you've written it sounds like we can't do p0 until we know the end of p1.  
> > I guess this might make sense if we have small partitions and you're trying 
> > to work a batch at a time inside the generated code.  This sacrifices more 
> > memory for potentially better performance.  Is this the reason you're 
> > operating this way?

This comes from the first iterations of the window operator: processing one 
"full" batch at a time makes it possible to transfer all incoming vectors into 
the outgoing container. Once I started supporting "order by" and window frames, 
vector transfers could no longer be done in the general case. The code could be 
optimized to detect when transfers are possible.

It is possible to process and return partitions as soon as they end, but the 
remaining rows of the processed batch need to be kept into memory until they 
are ready to be processed.


- abdelhakim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/#review69617
-----------------------------------------------------------


On Jan. 21, 2015, 11:05 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30051/
> -----------------------------------------------------------
> 
> (Updated Jan. 21, 2015, 11:05 p.m.)
> 
> 
> Review request for drill.
> 
> 
> Bugs: DRILL-1908
>     https://issues.apache.org/jira/browse/DRILL-1908
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> In order to fix DRILL-1487 a complete rewrite of the 
> StreamingWindowFrameRecordBatch was needed. This patch adds a new 
> WindowFrameRecordBatch that correctly handles window functions with or 
> without order by clauses.
> This code still lacks support for frame clauses and may be optimized to 
> reduce unneeded frame computations.
> 
> 
> Diffs
> -----
> 
>   
> common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java
>  28424a5 
>   common/src/main/java/org/apache/drill/common/logical/data/Window.java 
> 6dba77c 
>   contrib/data/pom.xml 86075f2 
>   contrib/data/window-test-data/pom.xml PRE-CREATION 
>   exec/java-exec/pom.xml 90734a5 
>   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 
> 190c13f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 
> 5288f5d 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java
>  17738ee 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
>  9b8929f 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
>  a3e7940 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
>  b4e3fed 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
>  9588cef 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java
>  f1a8bc0 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java
>  00c20b2 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
>  f20627d 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
>  7c04477 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java
>  6eff6db 
> 
> Diff: https://reviews.apache.org/r/30051/diff/
> 
> 
> Testing
> -------
> 
> Unit tests are available to test window functions in mulitple scenarios:
> - b1.p1: single batch with a single partition
> - b1.p2: 2 batches, each containing a different parition
> - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches
> - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd 
> batch and has rows in 3 batches
> - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an 
> edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has 
> enough saved batches to call it's framer.doWork() without the need to call 
> next(incoming)
> 
> All tests, except the last one, come in 2 variations: with and without "order 
> by" clause
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>

Reply via email to