Re: SQL query and Indexes architecture

Ilya Kasnacheev Mon, 17 Sep 2018 07:57:52 -0700

Hello!

I recommend starting with H2TreeIndex class. Maybe dropping mails on
developer list with precise questions.


Regards,
-- 
Ilya Kasnacheev


пн, 17 сент. 2018 г. в 16:25, eugene miretsky <eugene.miret...@gmail.com>:

> Thanks!
>
> I am curious about the process of loading data from Ignite to H2 on the
> fly, as H2 creating indexes but storing them in Ignite. Can you point me to
> some JIRAs that discuss it, or which part of the code is responsible for
> that?
>
> On Mon, Sep 17, 2018 at 9:18 AM Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> wrote:
>
>> Hello!
>>
>> 1. 1. H2 executes the query, during which it has to load rows from
>> tables, and Ignite does the row loading part. Then Ignite will collect
>> query results on all nodes and aggregate them on a single node.
>> 1. 2. Index is created by H2, but it is stored in Ignite pages (?).
>> 2. Maybe you're right, I have to admit I'm unfamiliar with precise
>> details here.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> пн, 17 сент. 2018 г. в 16:02, eugene miretsky <eugene.miret...@gmail.com
>> >:
>>
>>> Thanks!
>>>
>>>
>>>    1.
>>>    1.  "Ignite feeds H2 rows that it asks for, and H2 creates indexes
>>>       on them and executes queries on them." - what exactly do you mean by 
>>> that?
>>>       Do you mean that all parts of a query that use indexes are executed 
>>> by H2,
>>>       then the actual data is retrieved from Ignite pages, and the final
>>>       (non-indexed) parts of the query executed by Ignite?
>>>       2.  What happens when I create an index on a new column? Is the
>>>       index created in Ignite (and stored in Ignite pages?), or is it 
>>> created in
>>>       H2?
>>>    2.  The reason I was asking about AFFINITY_KEY, _key_PK and
>>>    _key_PK_hash indexed is that in this   code
>>>    
>>> <https://github.com/apache/ignite/blob/56975c266e7019f307bb9da42333a6db4e47365e/modules/indexing/src/main/java/org/apache/ignite/internal/processors/query/h2/H2TableDescriptor.java>
>>>  it
>>>    looks like they are created in H2
>>>
>>>
>>>
>>> On Mon, Sep 17, 2018 at 8:36 AM Ilya Kasnacheev <
>>> ilya.kasnach...@gmail.com> wrote:
>>>
>>>> Hello!
>>>>
>>>> 1. H2 does not store data but, as far as my understanding goes, it
>>>> created SQL indexes from data. Ignite feeds H2 rows that it asks for, and
>>>> H2 creates indexes on them and executes queries on them.
>>>> 2. Ignite always has special index on your key (since it's a key-value
>>>> storage it can always find tuple by key). Ignite is also aware of key's
>>>> hash code, and affinity key value always maps to one partition of data (of
>>>> 1024 by default). Those are not H2 indexes and they're mostly used on
>>>> planning stage. E.g. you can map query to one node if affinity key is
>>>> present in the request.
>>>> 3. Data is brought onto the heap to read any fields from row. GROUP BY
>>>> will hold its tuples on heap. Ignite has configurable index inlining where
>>>> you can avoid reading objects from heap just to access indexed fields.
>>>> 4. With GROUP BY, lazy evaluation will not help you much. It will still
>>>> have to hold all data on heap at some point. Lazy evaluation mostly helps
>>>> with "SELECT * FROM table" type queries which provide very large and boring
>>>> result set.
>>>>
>>>> Hope this helps.
>>>> --
>>>> Ilya Kasnacheev
>>>>
>>>>
>>>> пт, 14 сент. 2018 г. в 17:39, eugene miretsky <
>>>> eugene.miret...@gmail.com>:
>>>>
>>>>> Hello,
>>>>>
>>>>> Trying to understand how exactly SQL queries are executed in Ignite. A
>>>>> few questions
>>>>>
>>>>>
>>>>>    1. To what extent is H2 used? Does it store the data? Does it
>>>>>    create the indexes? Is it used only for generating execution plans? I
>>>>>    believe that all the data used to be stored in H2, but with the new 
>>>>> durable
>>>>>    memory architecture, I believe that's no longer the case.
>>>>>    2. Which indexes are used? Ignite creates  B+ tree indexes and
>>>>>    stores them in Index pages, but I also see AFFINITY_KEY, _key_PK and
>>>>>    _key_PK_hash indexes created in H2.
>>>>>    3. When is data brought onto the heap? I am assuming that groupby
>>>>>    and aggregate require all the matching queries to first be copied from
>>>>>    off-heap to heap
>>>>>    4. How does lazy evaluation work? For example, for group_by, does
>>>>>    it bring batches of matching records with the same group_by key onto 
>>>>> the
>>>>>    heap?
>>>>>
>>>>> I am not necessarily looking for the exact answers, but rather pointer
>>>>> in the right direction (documentation, code, jiras)
>>>>>
>>>>> Cheers,
>>>>> Eugene
>>>>>
>>>>

Re: SQL query and Indexes architecture

Reply via email to