Re: using multiple columns Index

Chrystophe Vergnaud Thu, 16 Dec 2021 06:26:36 -0800

Hi Maxim,

Thanks for the details.


But do you have a tweak in mind to get better response time on this kind of
request ?

BR,

Chrystophe Vergnaud
Architect @ Cyblex Technologies



Le jeu. 16 déc. 2021 à 15:09, Maksim Timonin <[email protected]> a
écrit :

> Hi, Chrystophe!
>
> Multifield index should perfectly work for cases with strict equality
> like: (a == ? && b == ?) OR (a == ? && b > ?) . But for queries with range
> queries for first field "a" ("a" > ? && b ...) you should not expect a
> boost of performance. You're right, it's due to B+Tree implementation - we
> store data in pairs (a, b). And then in the index storage the sequence will
> be: (1, 1), (1, 2), (1, 3), (2, 1). We sort by A, and only in case of
> equality field A, we check field B.
>
> So for queries like (a > 0 && b < 10) there is not much help from the
> condition on B for reducing data slice. Our implementation doesn't skip
> some sub-trees for conditions, but it checks tree range sequentially.
>
> > Worst, if the value of b is not present in the slice, it is responding
> as if the b was not in the WHERE clause at all (it seems to run a full scan
> on the sub-result)
>
> I think it can depend on the amount of data you return. If there is no
> data suitable for your condition, you will hang until the query finishes.
> But if you have some, it will return the cursor earlier, after preparing
> the first page for response, see SqlFieldsQuery.setPageSize().
>
> Also, performance may depend on your index selectivity.
>
>
> On Thu, Dec 16, 2021 at 4:23 PM Chrystophe Vergnaud <
> [email protected]> wrote:
>
>> Hello Stephen,
>>
>> I was created with SQL :
>> CREATE INDEX IF NOT EXISTS "t_idx_1" ON MYSCHEMA."t" ("a", "b", "c");
>>
>> BR,
>>
>> Chrystophe Vergnaud
>> Architect @ Cyblex Technologies
>>
>>
>> Le jeu. 16 déc. 2021 à 13:51, Stephen Darlington <
>> [email protected]> a écrit :
>>
>>> Can you show how you’ve defined your index(es)?
>>>
>>> > On 16 Dec 2021, at 12:27, Chrystophe Vergnaud <
>>> [email protected]> wrote:
>>> >
>>> > Hello,
>>> >
>>> > I'm running an ignite 2.10 and I don't understand the behavior of the
>>> multi-columns index.
>>> >
>>> > For instance, I have a table t(id, a,b,c, d, e, f, g)
>>> > - id is a uuid and is the key
>>> > - a is a TIMESTAMP
>>> > - b is a  SMALLINT
>>> > - c is a TINYINT
>>> > - e, f, g are VARCHAR
>>> >
>>> > this table have around 200M lines
>>> >
>>> > I have to select data based on a, b, c in this order, so basically, I
>>> have setup an index on (a,b,c)
>>> >
>>> > If I apply a select with a WHERE clause on "a>=x AND a < y", it works
>>> perfectly, the response time is ok (using USE INDEX)
>>> >
>>> > If I add the b in the WHERE clause I expect to optimize the response
>>> time but it is not the case. Worst, if the value of b is not present in the
>>> slice, it is responding as if the b was not in the WHERE clause at all (it
>>> seems to run a full scan on the sub-result)
>>> >
>>> > Do I miss something ? is it related to the implementation of the
>>> B+tree ?
>>> >
>>> > Thanks in advance for your help.
>>> >
>>> > Best regards,
>>> >
>>> > Chrystophe Vergnaud
>>> > Architect @ Cyblex Technologies
>>>
>>>
>>>

Re: using multiple columns Index

Reply via email to