Hi all! This is a gentle reminder. There is a PR for the new Index API [1]. It was approved by Alex Plekhanov. Does anybody else want to review this API? If there are no objections, we're going to merge it on Monday, 16th of August.
Thanks!

[1] https://github.com/apache/ignite/pull/9118

On Fri, May 21, 2021 at 10:43 PM Maksim Timonin <timonin.ma...@gmail.com> wrote:

> Andrey, hi!
>
> Some updates here.
>
> I've submitted a PR for IndexQuery [1]. There is an issue with lazy page
> loading that is also related to the TextQuery ticket IGNITE-12291.
>
> CacheQueries already have pending-pages functionality; it's done by
> sending multiple GridCacheQueryRequest messages. There was an issue with
> TextQuery and limit: after exceeding the limit we still sent requests, so
> I submitted a patch to fix this [2].
>
> But currently TextQuery, like SqlFieldsQuery, prepares the whole result
> set on the query request, holds it, and provides a cursor over this
> collection.
>
> If I understand you correctly, you propose to run TextQuery over the index
> on every page request. We can do this with Lucene
> IndexSearcher.searchAfter. On the one hand, it will save resources. On
> the other hand, no query (neither TextQuery nor SqlFieldsQuery) locks the
> index while querying. So there can be data inconsistency, as there can be
> concurrent operations on the index while a user iterates over the cursor.
> This can also happen for queries now, since there is no index lock, but
> the time window for such inconsistency is much shorter.
>
> I have the same dilemma for IndexQuery. In my patch [1] I provide lazy
> iteration over BPlusTree. There is no lock on the index while querying
> either. I want to discuss the right way. I have the following in mind:
> 1. Indexes currently don't support transactions, and SQL queries don't
> lock the index for queries, so Ignite doesn't guarantee data consistency;
> 2. As I understand it, preparing the whole data set for SQL queries is
> required due to relations between tables. The more complex the query and
> relations are, the more consistency issues we get in the result in case
> of parallel operations;
> 3. Querying a single index only (by TextQuery or IndexQuery) doesn't
> affect any relations, so we can allow concurrent updates: they could
> affect a query result, but it doesn't hurt.
>
> Following these thoughts, it's right to implement lazy iteration over
> indexes. What do you think?
>
> Also, there is a second topic to discuss. BPlusTree indexes support query
> parallelism, but CacheQueries don't. The infrastructure needs to change
> to support query parallelism, so in this patch [1] I handle multiple
> segments in a single thread. This works OK: with lazy querying it's very
> fast to initialize a cursor, so there is not much overhead for multiple
> segments. I ran performance tests and found that in some cases IndexQuery
> beats SqlFieldsQuery even with queryParallelism enabled (it helps
> SqlFieldsQuery a lot). So the need for supporting queryParallelism for
> IndexQuery has to be tested well. As IndexQuery can already help users
> speed up some queries, I propose to look into queryParallelism a little
> bit later. WDYT?
>
> These 2 things affect which Apache Ignite release IndexQuery will be
> delivered with, so please let me know your thoughts.
>
> Any thoughts from the community are welcome too.
>
>
> [1] https://github.com/apache/ignite/pull/9118
> [2] https://github.com/apache/ignite/pull/9086
>
> On Mon, Apr 12, 2021 at 1:52 PM Maksim Timonin <timonin.ma...@gmail.com>
> wrote:
>
>> Andrey,
>>
>> Thanks! I picked it.
>>
>> On Mon, Apr 12, 2021 at 1:51 PM Maksim Timonin <timonin.ma...@gmail.com>
>> wrote:
>>
>>> Stephen,
>>>
>>> I don't see a reason to replace or deprecate IndexingSpi. I'm not
>>> sure how somebody uses it, but it works now.
>>>
>>> On Mon, Apr 12, 2021 at 1:42 PM Stephen Darlington <
>>> stephen.darling...@gridgain.com> wrote:
>>>
>>>> Is this a replacement for IndexingSpi? Put bluntly, do we deprecate
>>>> (and remove) it?
>>>>
>>>> Or do you see them as complementary?
>>>>
>>>> > On 12 Apr 2021, at 11:29, Maksim Timonin <timonin.ma...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > Hi Stephen!
>>>> >
>>>> > Please have a look at the QueryProcessing paragraph [1]. I've
>>>> > described why IndexingSpi doesn't fit us well.
>>>> >
>>>> > [1]
>>>> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-71+Public+API+for+secondary+index+search#IEP71PublicAPIforsecondaryindexsearch-2)QueryProcessing
>>>> >
>>>> > On Mon, Apr 12, 2021 at 1:24 PM Stephen Darlington <
>>>> > stephen.darling...@gridgain.com> wrote:
>>>> >
>>>> >> How does this fit with the current IndexingSpi? Superficially they
>>>> >> appear to do very similar things?
>>>> >>
>>>> >> Regards,
>>>> >> Stephen
>>>> >>
>>>> >>> On 6 Apr 2021, at 14:13, Maksim Timonin <timonin.ma...@gmail.com>
>>>> >>> wrote:
>>>> >>>
>>>> >>> Hi, Igniters!
>>>> >>>
>>>> >>> I'd like to propose a new feature: the ability to query and create
>>>> >>> indexes from the public API.
>>>> >>>
>>>> >>> It will help in some cases, where:
>>>> >>> 1. SQL is not applicable by design of the user application;
>>>> >>> 2. IndexScan is preferable to ScanQuery for performance reasons;
>>>> >>> 3. Functional indexes are required.
>>>> >>>
>>>> >>> Also, it would be great to have transactional support for such
>>>> >>> queries, like the "select for update" query provides. But I haven't
>>>> >>> dug into that much. It will be the next step if this API is
>>>> >>> implemented.
>>>> >>>
>>>> >>> I've prepared IEP-71 for that [1] with more details. Please
>>>> >>> share your thoughts.
>>>> >>>
>>>> >>>
>>>> >>> [1]
>>>> >>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-71+Public+API+for+secondary+index+search
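
For anyone catching up on the thread, the lazy page-by-page iteration discussed in the May 21 message, and the consistency trade-off that comes with it, can be illustrated with a small standalone sketch. This is not Ignite or Lucene code: `LazyPageScan` and `nextPage` are hypothetical names, and a plain `TreeMap` stands in for a sorted index. Each page request resumes strictly after the last key of the previous page and holds no snapshot or lock, so updates made between page requests show up in later pages.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical illustration only; a TreeMap stands in for a sorted index. */
class LazyPageScan {
    /**
     * Returns up to pageSize entries whose keys are strictly greater than
     * afterKey, or the first entries of the index when afterKey is null.
     * Nothing is cached between calls and the index is not locked.
     */
    static <K, V> List<Map.Entry<K, V>> nextPage(NavigableMap<K, V> index, K afterKey, int pageSize) {
        NavigableMap<K, V> rest = afterKey == null ? index : index.tailMap(afterKey, false);
        List<Map.Entry<K, V>> page = new ArrayList<>();
        for (Map.Entry<K, V> e : rest.entrySet()) {
            if (page.size() == pageSize)
                break;
            // Copy the entry so the page does not depend on the live map view.
            page.add(new AbstractMap.SimpleImmutableEntry<>(e));
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> index = new TreeMap<>();
        for (int i = 1; i <= 5; i++)
            index.put(i, "v" + i);

        // First page request: keys 1 and 2.
        List<Map.Entry<Integer, String>> page = nextPage(index, null, 2);
        System.out.println(page);

        // Updates that happen between page requests (simulated inline here).
        index.put(3, "updated");
        index.remove(4);

        // Second page request resumes after key 2 and observes both updates.
        Integer last = page.get(page.size() - 1).getKey();
        page = nextPage(index, last, 2);
        System.out.println(page);
    }
}
```

The second page reflects the modified index rather than the state at the time the query started, which is the behaviour the thread proposes to accept for single-index queries.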