Hi!

It's a PR re-review request.

I've reworked PR [1]. The major changes are:
1. indexName is a required param for building IndexQuery.
2. the order of criteria fields doesn't make sense - now it checks only
that criteria is a prefix for index fields.
3. IndexQuery can accept a value type (only value class was before). Thanks
a lot to Taras Ledkov for this comment, I missed this.
4. Also refactored code in places where Andrey Mashenkov put his comments.

Also created a separate ticket [2] for building IndexQuery from a list of
criteria fields only.

[1] https://github.com/apache/ignite/pull/9118
[2] https://issues.apache.org/jira/browse/IGNITE-15391

On Fri, Aug 27, 2021 at 10:49 AM Maksim Timonin <timonin.ma...@gmail.com>
wrote:

> Hi!
>
> Thanks all for your comments!
>
> 1. I'm OK on the first step to make the indexName param required and not
> check fields order for this case. So, I'll remove code with search index by
> fields from current PR.
>
> 2. But I propose to create a ticket for possible improvement - IndexQuery
> should be created with a list of fields only.
> I believe that it's a useful feature. My main argument is still the same:
> - Ignite provides to users the possibility to create an index without name
> setting. Then Ignite should provide the possibility to query them.
> - There is an example of using such API in other databases: as Ivan
> mentioned, MongoDB doesn't require to specify an index name [1], it
> searches it by fields only. Note, MongoDB doesn't require strict fields
> order for querying indexes, but requires it for sorting [2].
>
> I don't see reasons not to provide a similar API for our users. Agree,
> that order of fields is under discussion.
>
> So, let's postpone step 2, and discuss it later again, as currently it is
> a blocker for Index API. If somebody sees strong arguments against this
> proposal and my arguments, please write them.
>
> [1] https://docs.mongodb.com/manual/indexes/
> [2]
> https://docs.mongodb.com/manual/tutorial/sort-results-with-indexes/#sort-on-multiple-fields
>
>
>
>
> On Thu, Aug 26, 2021 at 8:00 PM Courtney Robinson <
> courtney.robin...@hypi.io> wrote:
>
>> Prefer 1 from Teras' response. Specifying index name is preferred.
>> I've seen customers do idx(A,B) and idx(B,A) where semantics change
>> between
>> the two.
>>
>> Regards,
>> Courtney Robinson
>> Founder and CEO, Hypi
>> Tel: ++44 208 123 2413 (GMT+0) <https://hypi.io>
>>
>> <https://hypi.io>
>> https://hypi.io
>>
>>
>> On Thu, Aug 26, 2021 at 4:28 PM Taras Ledkov <tled...@gridgain.com>
>> wrote:
>>
>> > Hi,
>> >
>> > My proposal:
>> > 1. Don't search index by criteria, specify the index name always
>> > (preferred).
>> >
>> > OR
>> >
>> > 2. Search index by criteria without check the order of criteriones.
>> > Use the Set of criterions instead of the ordered collection.
>> > In the strange case when the both index exist (a, b) and (b, a) - use
>> > the any index
>> > when index name isn't specified.
>> >
>> > On 26.08.2021 16:49, Maksim Timonin wrote:
>> > > There are some thoughts about strict field order:
>> > > 1. Index (A, B) is not equivalent to index (B, A). Some queries may
>> have
>> > > different performance on such indexes, and users have to specify the
>> > right
>> > > index. What if both indexes exist?
>> > > 2. We should avoid cases when a user uses in query only field B for
>> index
>> > > (A, B). We have to force the user to specify range for (A) too, or
>> > > explicitly set it (null, null). Otherwise it looks like a mistake.
>> > >
>> > >
>> > >
>> > >
>> > > On Thu, Aug 26, 2021 at 4:39 PM Ivan Daschinsky <ivanda...@gmail.com>
>> > wrote:
>> > >
>> > >> 1. I suppose, that the next step is to implement the api for manually
>> > >> creating index. I think that user wants to create index that will
>> speed
>> > up
>> > >> his criteria base queries, so he or she will use the same criteria to
>> > >> define the index. So no problem at all
>> > >> 2. We should print warning or throws exception if there is not any
>> index
>> > >> that match specific criteria.
>> > >>
>> > >> BTW, Mongo DB doesn't make user to write index name in query. It just
>> > >> works.
>> > >>
>> > >> чт, 26 авг. 2021 г., 15:52 Taras Ledkov <tled...@gridgain.com>:
>> > >>
>> > >>> Hi,
>> > >>>
>> > >>>> It is an usability nightmare to make user write index name in all
>> > >> cases.
>> > >>> I don't see any difference between specifying the index name and
>> > >>> specifying the index fields in the right order.
>> > >>> Do you see?
>> > >>>
>> > >>> Let's there is the index:
>> > >>> idx_A_B ON TBL (A, B)
>> > >>>
>> > >>> Is it OK that the query like below doesn't math the index 'idx_A_B'?
>> > >>> new IndexQuery<>(..)
>> > >>>       .setCriteria(lt("b", 1), lt("a", 2));
>> > >>>
>> > >>> On 26.08.2021 15:23, Ivan Daschinsky wrote:
>> > >>>> I am against to make user write index name. It is quite simple and
>> > >>>> straightforward algorithm to match index to field names, so it is
>> > >> strange
>> > >>>> to compare it to sql engine optimizer.
>> > >>>>
>> > >>>> It is an usability nightmare to make user write index name in all
>> > >> cases.
>> > >>>> чт, 26 авг. 2021 г., 14:42 Maksim Timonin <timonin.ma...@gmail.com
>> >:
>> > >>>>
>> > >>>>> Hi, Igniters!
>> > >>>>>
>> > >>>>> There is a discussion about how to specify an index to query with
>> an
>> > >>>>> IndexQuery [1]. Currently my PR provides 2 ways to specify index:
>> > >>>>> 1. With a table and index name;
>> > >>>>> 2. With a table and list of index fields (without index name). In
>> > this
>> > >>> case
>> > >>>>> IndexQueryProcessor tries to find an index that matches table and
>> > >> index
>> > >>>>> fields in strict order (order of fields in criteria has to match
>> the
>> > >>> order
>> > >>>>> of fields in index).
>> > >>>>>
>> > >>>>> Discussion is whether is the second approach valid?
>> > >>>>>
>> > >>>>> Pros:
>> > >>>>> 1. Currently index name is an optional field for QueryIndex and
>> > >>>>> QuerySqlField. Then users can create an index with a table and
>> list
>> > of
>> > >>>>> fields. Then, we should provide an opportunity to define an index
>> for
>> > >>>>> querying the same way as we do for creating.
>> > >>>>> 2. It's required to know the index name to query it (in case the
>> > index
>> > >>> was
>> > >>>>> created without an explicit name). Users can find it and then use
>> it
>> > >> as
>> > >>> a
>> > >>>>> constant in code, but I see some troubles there:
>> > >>>>> 2.1. Get index name by querying the system view INDEXES. Note,
>> that
>> > >>> system
>> > >>>>> views are marked as an experimental feature [2].
>> > >>>>> 2.2. There is a workaround to know an index name with EXPLAIN
>> clause
>> > >> for
>> > >>>>> sql query that uses the required index (but it depends on SQL
>> > >>> optimizer).
>> > >>>>> 2.3. Users can use the index name builder, but it is in the
>> > >>>>> internal package
>> > >>>>>
>> (org.apache.ignite.internal.processors.query.QueryUtils#indexName).
>> > >>> Then it
>> > >>>>> can be changed from version to version without warning, and then
>> the
>> > >>> user
>> > >>>>> can't rely on it in code.
>> > >>>>> 3. Name of the Primary Key index (_key_PK) is predefined and
>> > hardcoded
>> > >>> in
>> > >>>>> Ignite. Users can't set it while creating it, and the name of PK
>> > index
>> > >>> is
>> > >>>>> hardcoded in the internal package too
>> (QueryUtils.PRIMARY_KEY_INDEX).
>> > >>>>>
>> > >>>>> Cons:
>> > >>>>> 1. It's declared that IndexQuery avoids some SQL steps (like
>> > planning,
>> > >>>>> optimizer) in favor of speed. It looks like that looking for an
>> index
>> > >> by
>> > >>>>> list of fields is the work of an optimizer.
>> > >>>>> 2. It can be not obvious that the order of fields in a query has
>> to
>> > >>> match
>> > >>>>> the order of fields in the related index. We should have it in
>> mind
>> > >> when
>> > >>>>> building a query - there should be a check for order of fields
>> before
>> > >>>>> querying.
>> > >>>>>
>> > >>>>>   From my side, I think that arguments for enforcing usage of an
>> > index
>> > >>> name
>> > >>>>> for queries are strong enough. But for me it's strange that it's
>> > >>> possible
>> > >>>>> to create an index without a name, but it's required to use name
>> to
>> > >>> query
>> > >>>>> it. Also taking in consideration that there is no guaranteed way
>> to
>> > >> get
>> > >>> an
>> > >>>>> index name (or I don't know it).
>> > >>>>>
>> > >>>>> Igniters, what do you think?
>> > >>>>> [1]
>> https://github.com/apache/ignite/pull/9118#discussion_r642557531
>> > >>>>> [2]
>> > >>>
>> https://ignite.apache.org/docs/latest/monitoring-metrics/system-views
>> > >>>>>
>> > >>>>> On Fri, Aug 6, 2021 at 4:04 PM Maksim Timonin <
>> > >> timonin.ma...@gmail.com>
>> > >>>>> wrote:
>> > >>>>>
>> > >>>>>> Hi, all!
>> > >>>>>>
>> > >>>>>> It's a gentle reminder. There is a PR for the new Index API [1].
>> It
>> > >> was
>> > >>>>>> approved by Alex Plekhanov. Does anybody want to review this API
>> > too?
>> > >>> If
>> > >>>>>> there won't be objections we're going to merge it Monday, 16th of
>> > >>> August.
>> > >>>>>> Thanks!
>> > >>>>>>
>> > >>>>>> [1] https://github.com/apache/ignite/pull/9118
>> > >>>>>>
>> > >>>>>> On Fri, May 21, 2021 at 10:43 PM Maksim Timonin <
>> > >>> timonin.ma...@gmail.com
>> > >>>>>> wrote:
>> > >>>>>>
>> > >>>>>>> Andrey, hi!
>> > >>>>>>>
>> > >>>>>>> Some updates, there.
>> > >>>>>>>
>> > >>>>>>> I've submitted a PR for IndexQuery [1]. There is an issue about
>> > lazy
>> > >>>>> page
>> > >>>>>>> loading, that is also related to Text query ticket IGNITE-12291.
>> > >>>>>>>
>> > >>>>>>> CacheQueries already have pending pages functionality, it's done
>> > >> with
>> > >>>>>>> multiple sending GridCacheQueryRequest. There was an issue with
>> > >>>>> TextQuery
>> > >>>>>>> and limit, after exceeding a limit we still send requests, so I
>> > >>>>> submitted a
>> > >>>>>>> patch to fix this [2].
>> > >>>>>>>
>> > >>>>>>> But currently, TextQuery, as SqlFieldsQuery also does, prepares
>> > >> whole
>> > >>>>>>> data on query request, holds it, and provides a cursor over this
>> > >>>>>>> collection.
>> > >>>>>>>
>> > >>>>>>> As I understand you correctly, you propose to run TextQuery over
>> > >> index
>> > >>>>>>> with every poll page request. We can do this with Lucene
>> > >>>>>>> IndexSearcher.searchAfter. So from one side, it will save
>> > resources.
>> > >>> But
>> > >>>>>>> from the other side, no queries (no TextQuery, no
>> SqlFieldsQuery)
>> > >> lock
>> > >>>>>>> index for querying. So there can be data inconsistency, as there
>> > can
>> > >>> be
>> > >>>>>>> concurrent operations on an index while a user iterates over the
>> > >>>>> cursor. It
>> > >>>>>>> also could be for queries now, due to no index lock being there,
>> > but
>> > >>> the
>> > >>>>>>> window of time of such inconsistency is much shorter.
>> > >>>>>>>
>> > >>>>>>> The same dilemma I have for IndexQuery. In my patch [1] I
>> provide
>> > >> lazy
>> > >>>>>>> iteration over BPlusTree. There is no lock on an index too while
>> > >>>>> querying.
>> > >>>>>>> And I want to discuss the right way. I have in mind the next
>> > things:
>> > >>>>>>> 1. Indexes currently doesn't support transactions, also SQL
>> queries
>> > >>>>> don't
>> > >>>>>>> lock index for queries, so Ignite don't guarantee data
>> consistency;
>> > >>>>>>> 2. As I understand preparing whole data for SQL queries is
>> required
>> > >>> due
>> > >>>>>>> to relations between tables. The more complex query and
>> relations
>> > we
>> > >>>>> have,
>> > >>>>>>> the much consistency issues we have in result in case of
>> parallel
>> > >>>>>>> operations;
>> > >>>>>>> 3. Querying a single index only (by TextQuery or IndexQuery)
>> > doesn't
>> > >>>>>>> affect any relations, so we can allow concurrent updates, as it
>> > >> could
>> > >>>>>>> affect a query result but it doesn't hurt.
>> > >>>>>>>
>> > >>>>>>> And following these thoughts, it's right to implement lazy
>> > >> iterations
>> > >>>>>>> over indexes. What do you think?
>> > >>>>>>>
>> > >>>>>>> Also, there is a second topic to discuss. BPlusTree indexes
>> support
>> > >>>>> query
>> > >>>>>>> parallelism. But CacheQueries don't. There needs to be a change
>> to
>> > >>>>>>> infrastructure to support query parallelism, so on this patch
>> [1] I
>> > >>>>> handle
>> > >>>>>>> multiple segments in a single thread. And this works OK, as in
>> the
>> > >>> case
>> > >>>>> of
>> > >>>>>>> lazy querying it's very fast to initialize a cursor, so there is
>> > not
>> > >>>>> much
>> > >>>>>>> overhead on multiple segments. I ran performance tests and found
>> > >> that
>> > >>> in
>> > >>>>>>> some cases, IndexQuery beats SqlFieldsQuery even with enabled
>> > >>>>>>> queryParallelism (it helps a SqlFieldsQuery much). So the need
>> for
>> > >>>>>>> supporting queryParallelism for IndexQuery is required to be
>> tested
>> > >>>>> well.
>> > >>>>>>> As IndexQuery already can help users to speed up some queries I
>> > >>> propose
>> > >>>>> to
>> > >>>>>>> check queryParallelism a little bit later. WDYT?
>> > >>>>>>>
>> > >>>>>>> So, those 2 things affect the Apache Ignite release that
>> IndexQuery
>> > >>> will
>> > >>>>>>> be delivered with. So, please let me know your thoughts.
>> > >>>>>>>
>> > >>>>>>> Any thoughts from the community are welcome too.
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>> [1] https://github.com/apache/ignite/pull/9118
>> > >>>>>>> [2] https://github.com/apache/ignite/pull/9086
>> > >>>>>>>
>> > >>>>>>> On Mon, Apr 12, 2021 at 1:52 PM Maksim Timonin <
>> > >>> timonin.ma...@gmail.com
>> > >>>>>>> wrote:
>> > >>>>>>>
>> > >>>>>>>> Andrey,
>> > >>>>>>>>
>> > >>>>>>>> Thanks! I picked it.
>> > >>>>>>>>
>> > >>>>>>>> On Mon, Apr 12, 2021 at 1:51 PM Maksim Timonin <
>> > >>>>> timonin.ma...@gmail.com>
>> > >>>>>>>> wrote:
>> > >>>>>>>>
>> > >>>>>>>>> Stephen,
>> > >>>>>>>>>
>> > >>>>>>>>> I don't see a reason to replace or deprecate IndexingSpi. I'm
>> not
>> > >>>>>>>>> sure how smbd uses it, but it works now.
>> > >>>>>>>>>
>> > >>>>>>>>> On Mon, Apr 12, 2021 at 1:42 PM Stephen Darlington <
>> > >>>>>>>>> stephen.darling...@gridgain.com> wrote:
>> > >>>>>>>>>
>> > >>>>>>>>>> Is this a replacement for IndexingSpi? Put bluntly, do we
>> > >> deprecate
>> > >>>>>>>>>> (and remove) it?
>> > >>>>>>>>>>
>> > >>>>>>>>>> Or do you see them as complimentary?
>> > >>>>>>>>>>
>> > >>>>>>>>>>> On 12 Apr 2021, at 11:29, Maksim Timonin <
>> > >> timonin.ma...@gmail.com
>> > >>>>>>>>>> wrote:
>> > >>>>>>>>>>> Hi Stephen!
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> Please have a look at the QueryProcessing paragraph [1].
>> I've
>> > >>>>>>>>>> described
>> > >>>>>>>>>>> why IndexingSpi doesn't fit us well.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> [1]
>> > >>>>>>>>>>>
>> > >>
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-71+Public+API+for+secondary+index+search#IEP71PublicAPIforsecondaryindexsearch-2)QueryProcessing
>> > >>>>>>>>>>> On Mon, Apr 12, 2021 at 1:24 PM Stephen Darlington <
>> > >>>>>>>>>>> stephen.darling...@gridgain.com> wrote:
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>> How does this fit with the current IndexingSpi?
>> Superficially
>> > >>> they
>> > >>>>>>>>>> appear
>> > >>>>>>>>>>>> to do very similar things?
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> Regards,
>> > >>>>>>>>>>>> Stephen
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>>> On 6 Apr 2021, at 14:13, Maksim Timonin <
>> > >>> timonin.ma...@gmail.com
>> > >>>>>>>>>> wrote:
>> > >>>>>>>>>>>>> Hi, Igniters!
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> I'd like to propose a new feature - opportunity to query
>> and
>> > >>>>> create
>> > >>>>>>>>>>>> indexes
>> > >>>>>>>>>>>>> from public API.
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> It will help in some cases, where:
>> > >>>>>>>>>>>>> 1. SQL is not applicable by design of user application;
>> > >>>>>>>>>>>>> 2. Where IndexScan is preferable than ScanQuery for
>> > >> performance
>> > >>>>>>>>>> reasons;
>> > >>>>>>>>>>>>> 3. Functional indexes are required.
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> Also it'll be great to have a transactional support for
>> such
>> > >>>>>>>>>> queries,
>> > >>>>>>>>>>>> like
>> > >>>>>>>>>>>>> the "select for update" query provides. But I don't dig
>> there
>> > >>>>>>>>>> much. It
>> > >>>>>>>>>>>> will
>> > >>>>>>>>>>>>> be a next step if this API will be implemented.
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> I've prepared an IEP-71 for that [1] with more details.
>> > Please
>> > >>>>>>>>>> share your
>> > >>>>>>>>>>>>> thoughts.
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> [1]
>> > >>>>>>>>>>>>>
>> > >>
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-71+Public+API+for+secondary+index+search
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>
>> > >>> --
>> > >>> Taras Ledkov
>> > >>> Mail-To: tled...@gridgain.com
>> > >>>
>> > >>>
>> > --
>> > Taras Ledkov
>> > Mail-To: tled...@gridgain.com
>> >
>> >
>>
>

Reply via email to