Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated March 3, 2015, 10 p.m.) Review request for drill and Jacques Nadeau. Changes --- patch rebased to master Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs (updated) - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 28eca2b exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 5efcce8 exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java 87209eb exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java e2c7e9e exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 3b7adca exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 232778a exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 3d3e96f exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java a9d2ef8 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause all unit tests pass. functional, sf100 and customer tests don't add any new failures Thanks, abdelhakim deneche
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Feb. 12, 2015, 6:20 p.m.) Review request for drill and Jacques Nadeau. Changes --- rebased the patch. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs (updated) - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 06f60fb exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 5efcce8 exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java 26d23f2 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java e2c7e9e exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 3b7adca exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 6b3d301 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java aa0a5ad exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java a9d2ef8 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing (updated) --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause all unit tests pass. functional, sf100 and customer tests don't add any new failures Thanks, abdelhakim deneche
Re: Review Request 30051: DRILL-1908: new window function implementation
On Jan. 30, 2015, 1:12 a.m., Chris Westin wrote: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/GenerateTestData.java, line 121 https://reviews.apache.org/r/30051/diff/2/?file=826885#file826885line121 Make the file prefix a command-line argument so that people besides yourself can run this. abdelhakim deneche wrote: Argh, I forgot to remove GenerateTestData again! I just used this to generate the data used in the tests, it was never intended to be part of the final code. Sorry about that I would check it in. We might need it again in the future. What if something changes and we have to re-generate the test data? On Jan. 30, 2015, 1:12 a.m., Chris Westin wrote: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java, line 32 https://reviews.apache.org/r/30051/diff/2/?file=826892#file826892line32 logger should be private. Jacques Nadeau wrote: I disagree. Our current standard is package private. If you think we should change this throughout the code, we should have a discussion but my preference is to maintain the current standard until we decide upon a new one. Loggers identify their source in log messages thanks to the class argument given to getLogger(). They're meant to be associated with a class in a one-to-one manner -- why else would getLogger() have this parameter? There are no uses of the pattern otherclass.logger..., so there's no reason for them to be package private. However, I have come across a few uses where a derived class uses the logger from its base class, and this is confusing. This has sent me looking in the wrong place for the source of the message, so we shouldn't do it. I've assumed it was accidental, and slipped by because the base class's logger wasn't private, so the author was able to use it without realizing it. Making them private will prevent that, and ensure that log messages correctly identify their real source. Because we have not written standard that described this, and because it goes against common best practice elsewhere, I've been converting these to private wherever I've come across them. In only a couple of cases has this made me add new loggers where a derived class was accidentally using its base class's logger. - Chris --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/#review70296 --- On Jan. 28, 2015, 7:50 p.m., abdelhakim deneche wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 28, 2015, 7:50 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java 26d23f2 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java e2c7e9e exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION
Re: Review Request 30051: DRILL-1908: new window function implementation
On Jan. 24, 2015, 12:20 a.m., abdelhakim deneche wrote: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java, line 85 https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line85 Yes, especially if the user uses over() then we'll have one single parition containing all rows. Then it seems like we should have some way of limiting this and aborting, and submit a ticket to possibly find another way to store the required data (spool to disk, like sort?) if this happens in the future. - Chris --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/#review69506 --- On Jan. 28, 2015, 7:50 p.m., abdelhakim deneche wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 28, 2015, 7:50 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java 26d23f2 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java e2c7e9e exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 3b7adca exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 79603eb exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java a9d2ef8 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without
Re: Review Request 30051: DRILL-1908: new window function implementation
On Jan. 26, 2015, 5:51 p.m., Jacques Nadeau wrote: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java, line 38 https://reviews.apache.org/r/30051/diff/3/?file=828952#file828952line38 Please add a negative test case when using multiple partitions. In that case, the failure should happen in the planning phase, not execution. WindowPrel.getPhysicalOperator(PhysicalPlanCreator creator) has the following: ``` checkState(windows.size() == 1, Only one window is expected in WindowPrel); ``` - abdelhakim --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/#review69617 --- On Jan. 28, 2015, 7:50 p.m., abdelhakim deneche wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 28, 2015, 7:50 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java 26d23f2 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java e2c7e9e exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 3b7adca exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 79603eb exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java a9d2ef8 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 28, 2015, 7:50 p.m.) Review request for drill. Changes --- - updated GeneratorMapping and MappingSet variables to follow same convention used in HashJoinBatch and ChainedHashTable - check and disable window function in sql parsing - fail when schema changes - use fail(Exception e) to report errors - rename StreamingWindowPrel/StreamingWindowPrule to WindowPrel/WindowPrule Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs (updated) - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java 26d23f2 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java e2c7e9e exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 3b7adca exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java 79603eb exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java a9d2ef8 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause Thanks, abdelhakim deneche
Re: Review Request 30051: DRILL-1908: new window function implementation
On Jan. 23, 2015, 11:39 p.m., Chris Westin wrote: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java, line 273 https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line273 Shouldn't these all be static? No. We should actually fix the casing on the names. We originally had these as static but we actually maintain state inside the mappings which means they can't be static. In most of the code they were originally static but then we realized the mistake. We fixed the static part but I don't think we did a good job of fixing the case of all the declarations. He is being with consistent with what we have elsewhere but what is elsewhere isn't right stylistically. On Jan. 23, 2015, 11:39 p.m., Chris Westin wrote: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java, line 299 https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line299 static? same as above. - Jacques --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/#review69499 --- On Jan. 21, 2015, 11:05 p.m., abdelhakim deneche wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 21, 2015, 11:05 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause Thanks, abdelhakim deneche
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/#review69617 --- exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java https://reviews.apache.org/r/30051/#comment114332 This should only be in sql parsing, not here. Please check with Aman Sean in how they are disabling SQL functionality and then do the same for this (based on option). exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114361 You should specifically note that this doesn't cover the situation where we have multiple distinct partitions. exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114334 This logic seems a little backwards from what I was expecting. Batches are physical concept. Shouldn't these be stated based on p0. For example, it seems like we should be able to process p0 the moment we have received b0. We can't do p1 until we get to b2 and see that p1 has ended, etc. As you've written it sounds like we can't do p0 until we know the end of p1. I guess this might make sense if we have small partitions and you're trying to work a batch at a time inside the generated code. This sacrifices more memory for potentially better performance. Is this the reason you're operating this way? exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114338 Given that you're not handling schema change, it would be best to fail rather than return incorrect result. exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114339 We should save the failure state in the fragment context here. I think the method is fail(Exception e) or similar. exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114341 Let's discuss this. I think there is a disconnect here. exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java https://reviews.apache.org/r/30051/#comment114362 Please add a negative test case when using multiple partitions. In that case, the failure should happen in the planning phase, not execution. - Jacques Nadeau On Jan. 21, 2015, 11:05 p.m., abdelhakim deneche wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 21, 2015, 11:05 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION
Re: Review Request 30051: DRILL-1908: new window function implementation
On Jan. 26, 2015, 5:51 p.m., Jacques Nadeau wrote: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java, line 72 https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line72 You should specifically note that this doesn't cover the situation where we have multiple distinct partitions. What do you mean by multiple distinct partitions ? On Jan. 26, 2015, 5:51 p.m., Jacques Nadeau wrote: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java, line 87 https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line87 This logic seems a little backwards from what I was expecting. Batches are physical concept. Shouldn't these be stated based on p0. For example, it seems like we should be able to process p0 the moment we have received b0. We can't do p1 until we get to b2 and see that p1 has ended, etc. As you've written it sounds like we can't do p0 until we know the end of p1. I guess this might make sense if we have small partitions and you're trying to work a batch at a time inside the generated code. This sacrifices more memory for potentially better performance. Is this the reason you're operating this way? This comes from the first iterations of the window operator: processing one full batch at a time makes it possible to transfer all incoming vectors into the outgoing container. Once I started supporting order by and window frames, vector transfers could no longer be done in the general case. The code could be optimized to detect when transfers are possible. It is possible to process and return partitions as soon as they end, but the remaining rows of the processed batch need to be kept into memory until they are ready to be processed. - abdelhakim --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/#review69617 --- On Jan. 21, 2015, 11:05 p.m., abdelhakim deneche wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 21, 2015, 11:05 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/#review69499 --- exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114189 If I'm understanding this correctly, you're saying that you don't begin to process a batch until you've received all of the last partition that starts in that batch. Couldn't this be a problem is the partition is very, very large, possibly extending to the end of the data stream? exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114190 Instead of copying the values, can't we just hang on to the previous batch until the frame no longer needs values from it? exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114192 Shouldn't these all be static? exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114193 static? - Chris Westin On Jan. 21, 2015, 11:05 p.m., abdelhakim deneche wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 21, 2015, 11:05 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/#review69506 --- exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114203 Yes, especially if the user uses over() then we'll have one single parition containing all rows. exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114204 when a batch is processed, it means we are ready to send a container downstream. We need to copy those value vectors to the container because because they are part of the schema. exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java https://reviews.apache.org/r/30051/#comment114205 probably. Honestly, I copied most of the code generation stuff from StreamingWindowRecordBatch which was also copied as it is from StreamingAggBatch, but I guess you are right, these should be static. - abdelhakim deneche On Jan. 21, 2015, 11:05 p.m., abdelhakim deneche wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 21, 2015, 11:05 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause Thanks, abdelhakim deneche
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 21, 2015, 11:05 p.m.) Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description (updated) --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause Thanks, abdelhakim deneche
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 21, 2015, 11:03 p.m.) Review request for drill. Changes --- added 'window.enable' boolean configuration option (default to false), when set to false Drill will display window functions are not supported. updated TestWindowFrame to set 'window.enable' to true before launching the tests Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. **Note: I forgot to remove GenerateTestData.java from this patch, it was used to generate the test data files and it's not needed. I will make sure to remove it in the upcoming patch update. You can just ignore it for now.** Diffs (updated) - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java f20627d exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause Thanks, abdelhakim deneche
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 20, 2015, 5:34 p.m.) Review request for drill. Changes --- - added several unit tests - added a new window-test-data module in contrib/data to download the test data from S3 (I am using my own bucket for now, and will update it to drill bucket later). - Removed ignored window tests TestWindowFunctions Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs (updated) - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/GenerateTestData.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing (updated) --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause Thanks, abdelhakim deneche
Re: Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- (Updated Jan. 20, 2015, 5:40 p.m.) Review request for drill. Changes --- added a note about GenerateTestData.java Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description (updated) --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. **Note: I forgot to remove GenerateTestData.java from this patch, it was used to generate the test data files and it's not needed. I will make sure to remove it in the upcoming patch update. You can just ignore it for now.** Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c contrib/data/pom.xml 86075f2 contrib/data/window-test-data/pom.xml PRE-CREATION exec/java-exec/pom.xml 90734a5 exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/GenerateTestData.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db Diff: https://reviews.apache.org/r/30051/diff/ Testing --- Unit tests are available to test window functions in mulitple scenarios: - b1.p1: single batch with a single partition - b1.p2: 2 batches, each containing a different parition - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has rows in 3 batches - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call it's framer.doWork() without the need to call next(incoming) All tests, except the last one, come in 2 variations: with and without order by clause Thanks, abdelhakim deneche
Review Request 30051: DRILL-1908: new window function implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30051/ --- Review request for drill. Bugs: DRILL-1908 https://issues.apache.org/jira/browse/DRILL-1908 Repository: drill-git Description --- In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions with or without order by clauses. This code still lacks support for frame clauses and may be optimized to reduce unneeded frame computations. Diffs - common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5 common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java 9b8929f exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java a3e7940 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java b4e3fed exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java 9588cef exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java PRE-CREATION exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java f1a8bc0 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java 00c20b2 exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 7c04477 Diff: https://reviews.apache.org/r/30051/diff/ Testing --- I initially added 10 unit tests to make sure the results are correct both with and without an order by clause, but the tests require test files that are relatively large (5MB). Luckily I also wrote a small piece of code to generate the test data, and will later update the patch with tests that generate the test data files on the fly. Thanks, abdelhakim deneche