Hi, all!

This is a gentle reminder. There is a PR for the new Index API [1]. It was
approved by Alex Plekhanov. Does anybody else want to review this API? If
there are no objections, we're going to merge it on Monday, August 16th.

Thanks!

[1] https://github.com/apache/ignite/pull/9118

On Fri, May 21, 2021 at 10:43 PM Maksim Timonin <timonin.ma...@gmail.com>
wrote:

> Andrey, hi!
>
> Some updates here.
>
> I've submitted a PR for IndexQuery [1]. There is an issue with lazy page
> loading, which is also related to the TextQuery ticket IGNITE-12291.
>
> CacheQueries already have pending pages functionality; it's done by
> sending multiple GridCacheQueryRequest messages. There was an issue with
> TextQuery and limit: after exceeding the limit we still sent requests, so
> I submitted a patch to fix this [2].
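
The limit behaviour described above can be illustrated with a minimal sketch.
This is not the Ignite code from the patch: LimitedPageFetcher, PageSource and
requestsSent are hypothetical names, and the "remote node" is just an
in-memory page source. It only shows the idea of not sending another page
request once the limit has been reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: stop requesting pages as soon as the limit is reached. */
class LimitedPageFetcher {
    /** Stand-in for a remote node that serves result pages (assumption, not an Ignite type). */
    interface PageSource {
        /** Returns the page with the given index, or an empty list when there is no more data. */
        List<String> fetchPage(int pageIdx);
    }

    /** Number of page requests issued by the last fetch() call, kept only for illustration. */
    int requestsSent;

    List<String> fetch(PageSource src, int limit) {
        List<String> res = new ArrayList<>();
        requestsSent = 0;

        // The loop condition is what prevents extra requests after the limit is hit.
        for (int pageIdx = 0; res.size() < limit; pageIdx++) {
            List<String> page = src.fetchPage(pageIdx);
            requestsSent++;

            if (page.isEmpty())
                break; // Source is exhausted.

            for (String row : page) {
                if (res.size() == limit)
                    break; // Limit reached in the middle of a page: ignore the rest.

                res.add(row);
            }
        }

        return res;
    }

    public static void main(String[] args) {
        // Ten pages of three rows each.
        PageSource src = i -> i < 10
            ? Arrays.asList("row-" + i + "-0", "row-" + i + "-1", "row-" + i + "-2")
            : Collections.<String>emptyList();

        LimitedPageFetcher fetcher = new LimitedPageFetcher();

        System.out.println(fetcher.fetch(src, 4) + ", requests sent: " + fetcher.requestsSent);
    }
}
```

With a limit of 4 and pages of 3 rows, only two page requests are issued and
the remaining pages are never asked for.
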
>
> But currently TextQuery, like SqlFieldsQuery, prepares the whole result
> set on the query request, holds it, and provides a cursor over this
> collection.
>
> If I understand you correctly, you propose to run TextQuery over the index
> on every page poll request. We can do this with Lucene
> IndexSearcher.searchAfter. On the one hand, it will save resources. On
> the other hand, no queries (neither TextQuery nor SqlFieldsQuery) lock the
> index while querying. So there can be data inconsistency, as there can be
> concurrent operations on an index while a user iterates over the cursor.
> This can already happen for queries now, since there is no index lock, but
> the time window for such inconsistency is much shorter.
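
To illustrate the trade-off above, here is a minimal sketch of "search after"
style paging over an unlocked sorted index. It does not use Lucene or
BPlusTree: a ConcurrentSkipListSet stands in for the index, and
SearchAfterPaging and nextPage are hypothetical names. Each page request
re-enters the index strictly after the last key already returned, so updates
made between two page requests may or may not be visible to the cursor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical sketch: page-by-page iteration that re-queries the index on every page. */
class SearchAfterPaging {
    /**
     * Returns up to pageSize keys strictly greater than 'after'.
     * Pass null as 'after' to start from the beginning of the index.
     */
    static List<Integer> nextPage(NavigableSet<Integer> idx, Integer after, int pageSize) {
        NavigableSet<Integer> tail = after == null ? idx : idx.tailSet(after, false);

        List<Integer> page = new ArrayList<>(pageSize);

        for (Integer key : tail) {
            if (page.size() == pageSize)
                break;

            page.add(key);
        }

        return page;
    }

    public static void main(String[] args) {
        NavigableSet<Integer> idx = new ConcurrentSkipListSet<>(Arrays.asList(1, 2, 3, 4));

        List<Integer> first = nextPage(idx, null, 2);

        // Updates that happen between two page requests (no index lock is held).
        idx.add(0);    // Inserted behind the cursor: never returned.
        idx.remove(3); // Removed ahead of the cursor: never returned.
        idx.add(5);    // Inserted ahead of the cursor: returned.

        List<Integer> second = nextPage(idx, first.get(first.size() - 1), 2);

        System.out.println(first + " then " + second);
    }
}
```

The user ends up with a result that matches neither the index state at the
start of the query nor the state at the end, which is the inconsistency window
discussed above; in exchange, nothing beyond the current page is held in
memory.
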
>
> I have the same dilemma for IndexQuery. In my patch [1] I provide lazy
> iteration over BPlusTree. There is no lock on the index while querying
> either. I want to discuss the right way. I have the following in mind:
> 1. Indexes currently don't support transactions, and SQL queries don't
> lock the index for queries, so Ignite doesn't guarantee data consistency;
> 2. As I understand it, preparing the whole data set for SQL queries is
> required due to relations between tables. The more complex the query and
> relations are, the more consistency issues we get in the result in case of
> parallel operations;
> 3. Querying a single index only (by TextQuery or IndexQuery) doesn't
> affect any relations, so we can allow concurrent updates: they could
> affect a query result, but that doesn't hurt.
>
> Following these thoughts, it seems right to implement lazy iteration over
> indexes. What do you think?
>
> Also, there is a second topic to discuss. BPlusTree indexes support query
> parallelism, but CacheQueries don't. Supporting it requires a change to
> the infrastructure, so in this patch [1] I handle multiple segments in a
> single thread. This works OK: with lazy querying it's very fast to
> initialize a cursor, so there is not much overhead from multiple segments.
> I ran performance tests and found that in some cases IndexQuery beats
> SqlFieldsQuery even with queryParallelism enabled (which helps
> SqlFieldsQuery a lot). So whether IndexQuery needs to support
> queryParallelism has to be tested well. As IndexQuery can already help
> users speed up some queries, I propose to check queryParallelism a little
> bit later. WDYT?
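
One possible way to handle several index segments in a single thread is a
lazy k-way merge of per-segment cursors. The sketch below is only an
illustration under that assumption and is not taken from the patch:
SegmentMergeCursor and Head are hypothetical names, and plain iterators over
sorted lists stand in for segment cursors. Creating the merged cursor reads
just one row per segment, which matches the point that cursor initialization
is cheap.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical sketch: merges sorted per-segment cursors lazily in one thread. */
class SegmentMergeCursor implements Iterator<Integer> {
    /** A segment cursor together with the row it is currently positioned on. */
    private static final class Head {
        final int row;
        final Iterator<Integer> cursor;

        Head(int row, Iterator<Integer> cursor) {
            this.row = row;
            this.cursor = cursor;
        }
    }

    /** Min-heap of segment heads ordered by their current row. */
    private final PriorityQueue<Head> heads =
        new PriorityQueue<Head>((a, b) -> Integer.compare(a.row, b.row));

    SegmentMergeCursor(List<Iterator<Integer>> segments) {
        // Initialization only reads the first row of each non-empty segment.
        for (Iterator<Integer> seg : segments) {
            if (seg.hasNext())
                heads.add(new Head(seg.next(), seg));
        }
    }

    @Override public boolean hasNext() {
        return !heads.isEmpty();
    }

    @Override public Integer next() {
        Head h = heads.poll();

        if (h == null)
            throw new NoSuchElementException();

        // Advance only the segment whose row was just returned.
        if (h.cursor.hasNext())
            heads.add(new Head(h.cursor.next(), h.cursor));

        return h.row;
    }

    public static void main(String[] args) {
        Iterator<Integer> merged = new SegmentMergeCursor(Arrays.asList(
            Arrays.asList(1, 4, 7).iterator(),
            Arrays.asList(2, 5).iterator(),
            Arrays.asList(3, 6, 8).iterator()));

        while (merged.hasNext())
            System.out.print(merged.next() + " ");
    }
}
```

Each next() call costs O(log k) for k segments, so the single-threaded
overhead grows slowly with the number of segments.
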
>
> These 2 things affect which Apache Ignite release IndexQuery will be
> delivered with, so please let me know your thoughts.
>
> Any thoughts from the community are welcome too.
>
>
> [1] https://github.com/apache/ignite/pull/9118
> [2] https://github.com/apache/ignite/pull/9086
>
> On Mon, Apr 12, 2021 at 1:52 PM Maksim Timonin <timonin.ma...@gmail.com>
> wrote:
>
>> Andrey,
>>
>> Thanks! I picked it.
>>
>> On Mon, Apr 12, 2021 at 1:51 PM Maksim Timonin <timonin.ma...@gmail.com>
>> wrote:
>>
>>> Stephen,
>>>
>>> I don't see a reason to replace or deprecate IndexingSpi. I'm not
>>> sure how anybody uses it, but it works now.
>>>
>>> On Mon, Apr 12, 2021 at 1:42 PM Stephen Darlington <
>>> stephen.darling...@gridgain.com> wrote:
>>>
>>>> Is this a replacement for IndexingSpi? Put bluntly, do we deprecate
>>>> (and remove) it?
>>>>
>>>> Or do you see them as complementary?
>>>>
>>>> > On 12 Apr 2021, at 11:29, Maksim Timonin <timonin.ma...@gmail.com> wrote:
>>>> >
>>>> > Hi Stephen!
>>>> >
>>>> > Please have a look at the QueryProcessing paragraph [1]. I've described
>>>> > why IndexingSpi doesn't fit us well.
>>>> >
>>>> > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-71+Public+API+for+secondary+index+search#IEP71PublicAPIforsecondaryindexsearch-2)QueryProcessing
>>>> >
>>>> > On Mon, Apr 12, 2021 at 1:24 PM Stephen Darlington <
>>>> > stephen.darling...@gridgain.com> wrote:
>>>> >
>>>> >> How does this fit with the current IndexingSpi? Superficially they appear
>>>> >> to do very similar things?
>>>> >>
>>>> >> Regards,
>>>> >> Stephen
>>>> >>
>>>> >>> On 6 Apr 2021, at 14:13, Maksim Timonin <timonin.ma...@gmail.com> wrote:
>>>> >>>
>>>> >>> Hi, Igniters!
>>>> >>>
>>>> >>> I'd like to propose a new feature: the ability to query and create
>>>> >>> indexes from the public API.
>>>> >>>
>>>> >>> It will help in some cases, where:
>>>> >>> 1. SQL is not applicable by design of the user application;
>>>> >>> 2. IndexScan is preferable to ScanQuery for performance reasons;
>>>> >>> 3. Functional indexes are required.
>>>> >>>
>>>> >>> Also, it would be great to have transactional support for such queries,
>>>> >>> like the "select for update" query provides. But I haven't dug into
>>>> >>> that much. It will be a next step if this API is implemented.
>>>> >>>
>>>> >>> I've prepared IEP-71 for that [1] with more details. Please share your
>>>> >>> thoughts.
>>>> >>>
>>>> >>>
>>>> >>> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-71+Public+API+for+secondary+index+search
>>>> >>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>
