Hi,

Currently I am working in a RANGE Filter Optimization. Previously if in
case there are both Greater Than and Less than predicates on a same column,
then they are evaluated spearately where multiple times scanning of the
blocks are involved which is undoubtedly costly.

To optimize this, if "Greater and Less than" predicates are there the a
same column and joined by AND expression, then these two expression can be
merged into a single expression and can form Range Expression. Rather than
evaluating the expressions separately only one time the block will be
scanned and if the values lies within Range Min and Max then will be
selected.

Current the filter expression tree is transformed as shown below. The below
tranformation helps us to carry the Range MIN MAX value so that prunning
optimization is applied.

//            AND                                      AND
//             |                                             |
//            / \                                           / \
//           /   \                                         /   \
//        Less   Greater         =>      TRUE   Range
//         / \    / \                                               / \
//        /   \  /   \                                             /   \
//       a    10 a   5                                     Less   greater
//                                                                 /\        /\
//                                                               /  \      /  \
//                                                             a   10  a    5


Presence of a Range in a single expression gives few more optimization
during block prunning. Comparing Filter Min and Max and Block Min and Max
values few decisions can be taken as shown below.

Case A)  In case the Filter Range lies completely outside of Block range =>
Then no scanning is required and the block can be completely skipped.

           -----------------------------------

            |               BLOCK             |

            -----------------------------------
                     -----------------------------
    Min                                      MAX
------------------------------------
                    |      FILTER               |
                      or                                              |
   FILTER                     |
                   ------------------------------

 ------------------------------------
                 Min                               Max
                                                                      Min
                                 Max

Case B ) The Filter Completely Covers the Block => No scanning is required.
Select all Rows of the Block. i.e. turn all bits of block to true.

            --------------------------------------

           |               BLOCK                 |

           --------------------------------------

 ------------------------------------------------------------------
                                                                       |
                         FILTER                                     |

 ------------------------------------------------------------------

Case C) The Filter Overlaps the Block, But completely donot cover it. =>
Choose the Block Scan Start or End based on filter overlaping, no need to
do a binary search to find the Start and End Index of Block.


-------------------------------------
                     ------------------------------------------
                                                  |           BLOCK
          |                                                            |
         BLOCK                        |

-------------------------------------
                     -----------------------------------------
                           --------------------------------------------
                                             OR
              -------------------------------------------------
                           |              FILTER                        |

                    |                 FILTER                            |
                           -------------------------------------------

                   -------------------------------------------------
                                      Start of Block will be default
StartIndex Value                                                End of
Blcok will be default EndIndex Value.

Along with these optimization in scanning few optimization are done during
filter expression tree formation.
Case A) Merging of Redundant Greater than Less Than filter predicates to a
Range.
Case B) Multiple Range Filters can be merged into one.

With this optimization the most benefitted candidates are Direct Dictionary
and No Dictionary Columns. For Dictionary we only get the surrogate Keys
withing the Range.

Please let me know your thoughts in this project.
-- 
Thanks
Sounak

Reply via email to