Agree. +1
Regards
JB

On Feb 15, 2017, at 09:09, Kumar Vishal <kumarvishal1...@gmail.com> wrote:
>+1
>This will ease the IO bottleneck. Page-level min/max will improve
>block pruning, and fewer false-positive blocks will improve filter
>query performance. Separating decompression of data from the reader
>layer will improve overall query performance.
>
>-Regards
>Kumar Vishal
>
>On Wed, Feb 15, 2017 at 7:50 PM, Ravindra Pesala
><ravi.pes...@gmail.com> wrote:
>
>> Please find the thrift file at the location below.
>> https://drive.google.com/open?id=0B4TWTVbFSTnqZEdDRHRncVItQ242b1NqSTU2b2g4dkhkVDRj
>>
>> On 15 February 2017 at 17:14, Ravindra Pesala <ravi.pes...@gmail.com>
>> wrote:
>>
>> > Problems in the current format:
>> > 1. IO read is slower since it needs multiple seeks on the file to
>> > read column blocklets. The current blocklet size is 120000, so it
>> > needs to read from the file multiple times to scan the data of that
>> > column. Alternatively we can increase the blocklet size, but filter
>> > queries suffer because they get a big blocklet to filter.
>> > 2. Decompression is slower in the current format. We use an inverted
>> > index for faster filter queries and use NumberCompressor to compress
>> > the inverted index with bit-wise packing. This becomes slow, so we
>> > should avoid the number compressor. One alternative is to keep the
>> > blocklet size within 32000 so that the inverted index can be written
>> > with short, but IO read suffers a lot.
>> >
>> > To overcome the above 2 issues we are introducing a new format, V3.
>> > Here each blocklet has multiple pages of size 32000; the number of
>> > pages in a blocklet is configurable. Since we keep the page within
>> > the short limit, there is no need to compress the inverted index.
>> > We also maintain the max/min for each page to further prune filter
>> > queries.
>> > Read the blocklet with its pages at once and keep it in offheap memory.
>> > During filtering, first check the max/min range and, if it is valid,
>> > go on to decompress the page to filter further.
>> >
>> > Please find the attached V3 format thrift file.
>> >
>> > --
>> > Thanks & Regards,
>> > Ravi
>> >
>>
>>
>>
>> --
>> Thanks & Regards,
>> Ravi
>>
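
To make the page-level min/max pruning described in the quoted proposal concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names Page, build_pages and filter_equals are hypothetical and are not taken from the CarbonData code base or the V3 thrift file, a plain Python list stands in for a compressed page, and only an equality filter is shown. The page size of 32000 comes from the proposal.

```python
from dataclasses import dataclass
from typing import List, Tuple

PAGE_SIZE = 32000        # rows per page, as stated in the proposal
SHORT_MAX = 2**15 - 1    # 32767, largest signed 16-bit value


@dataclass
class Page:
    """Hypothetical page: the list stands in for the compressed payload."""
    values: List[int]
    min_value: int
    max_value: int


def build_pages(column: List[int], page_size: int = PAGE_SIZE) -> List[Page]:
    """Split one column of a blocklet into pages and record min/max per page."""
    pages = []
    for start in range(0, len(column), page_size):
        chunk = column[start:start + page_size]
        pages.append(Page(chunk, min(chunk), max(chunk)))
    return pages


def filter_equals(pages: List[Page], target: int) -> Tuple[List[Tuple[int, int]], int]:
    """Return the matching (page_no, row_in_page) pairs and the number of pages scanned."""
    hits: List[Tuple[int, int]] = []
    scanned = 0
    for page_no, page in enumerate(pages):
        # Check the min/max range first; a page outside the range is skipped
        # without ever being "decompressed".
        if target < page.min_value or target > page.max_value:
            continue
        scanned += 1
        hits.extend((page_no, row) for row, v in enumerate(page.values) if v == target)
    return hits, scanned


# Sorted sample data, so the min/max ranges of the pages do not overlap.
column = list(range(100_000))
pages = build_pages(column)
hits, scanned = filter_equals(pages, 70_000)
print(len(pages), scanned, hits)
```

With this sample data only one of the four pages is scanned. The largest row id inside a page is PAGE_SIZE - 1 = 31999, which is below SHORT_MAX, which is why row ids within a page can be written as a 16-bit short without extra compression.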