Re: "between and" filter query is very slow

Kumar Vishal Fri, 03 Mar 2017 04:48:07 -0800

Hi,

Currently, in the include and exclude filter cases, when a dimension column
does not have an inverted index the code does a linear search. We can add a
binary search for the case where the data for that column is sorted. To find
out whether it is, we can check in the carbon table whether the user selected
no inverted index for that column. If the user selected no inverted index
when creating the column, the current code is fine; if not, the data will be
sorted, so we can add a binary search, which will improve performance.
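
A rough sketch of the idea, not the actual CarbonData code: it assumes the
chunk is a flat byte array of fixed-width values that are physically sorted in
unsigned byte order, and that the bit position equals the row position. The
class and method names (SortedChunkFilter, compareRow, lowerBound, filter) are
made up for illustration. For each filter key it finds the first and last
matching row with two binary searches and sets the whole range at once, so the
search cost is about O(m * log n) instead of O(n * m).

```java
import java.util.BitSet;

/** Hypothetical helper for illustration only; not part of CarbonData. */
final class SortedChunkFilter {

  /** Compares row rowIndex of a flat fixed-width chunk with key, as unsigned bytes. */
  static int compareRow(byte[] chunk, int rowIndex, byte[] key) {
    int offset = rowIndex * key.length;
    for (int i = 0; i < key.length; i++) {
      int a = chunk[offset + i] & 0xFF;
      int b = key[i] & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    return 0;
  }

  /**
   * Returns the index of the first row whose value is >= key,
   * or > key when strict is true. Returns numberOfRows if there is none.
   */
  static int lowerBound(byte[] chunk, int numberOfRows, byte[] key, boolean strict) {
    int low = 0;
    int high = numberOfRows;
    while (low < high) {
      int mid = (low + high) >>> 1;
      int cmp = compareRow(chunk, mid, key);
      if (cmp < 0 || (strict && cmp == 0)) {
        low = mid + 1;
      } else {
        high = mid;
      }
    }
    return low;
  }

  /** Sets the bit of every row equal to one of the filter keys (assumes sorted rows). */
  static BitSet filter(byte[] chunk, int numberOfRows, byte[][] filterKeys) {
    BitSet bitSet = new BitSet(numberOfRows);
    for (byte[] key : filterKeys) {
      int first = lowerBound(chunk, numberOfRows, key, false);
      int end = lowerBound(chunk, numberOfRows, key, true);
      bitSet.set(first, end); // no-op when the key is absent (first == end)
    }
    return bitSet;
  }
}
```

If the column was created with no inverted index, the rows are not sorted and
this approach does not apply, so the existing linear scan would have to stay
as the fallback for that case.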


Please raise a Jira for this improvement

-Regards
Kumar Vishal


On Fri, Mar 3, 2017 at 7:42 PM, 马云 <simafengyun1...@163.com> wrote:

> Hi Dev,
>
>
> I used carbondata version 0.2 on my local machine and found that the
> "between and" filter query is very slow.
> The root cause is the code below in IncludeFilterExecuterImpl.java.
> It takes about 20s in my test.
> The code's time complexity is O(n*m). I think it needs to be optimized,
> please confirm. Thanks.
>
>
>
>
>
>   private BitSet setFilterdIndexToBitSet(
>       DimensionColumnDataChunk dimensionColumnDataChunk, int numerOfRows) {
>     BitSet bitSet = new BitSet(numerOfRows);
>     if (dimensionColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
>       FixedLengthDimensionDataChunk fixedDimensionChunk =
>           (FixedLengthDimensionDataChunk) dimensionColumnDataChunk;
>       byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
>
>       long start = System.currentTimeMillis();
>       for (int k = 0; k < filterValues.length; k++) {
>         for (int j = 0; j < numerOfRows; j++) {
>           if (ByteUtil.UnsafeComparer.INSTANCE
>               .compareTo(fixedDimensionChunk.getCompleteDataChunk(),
>                   j * filterValues[k].length, filterValues[k].length,
>                   filterValues[k], 0, filterValues[k].length) == 0) {
>             bitSet.set(j);
>           }
>         }
>       }
>       System.out.println("loop time: " + (System.currentTimeMillis() - start));
>     }
>     return bitSet;
>   }
