Re: Sync vs async APIs in Ignite 3
Hi Val,

I'd strongly support an async-first API based on CompletionStage <https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletionStage.html> or its subtypes like CompletableFuture.

In Ignite 2 we wrote a wrapper library around IgniteFuture that exposes CompletionStage instead, because many of the newer libraries we use support it. If Ignite 3 went this way, it would remove a lot of the boilerplate and wrapper code that we wrote to get what you're suggesting here.

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0)
https://hypi.io

On Wed, Sep 8, 2021 at 12:44 AM Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:

> Igniters,
>
> I would like to gather some opinions on whether we want to focus on sync vs async APIs in Ignite 3.
>
> Here are some initial considerations that I have:
> 1. Ignite 2.x is essentially "sync first". Async APIs exist, but they use non-standard IgniteFuture and provide counterintuitive guarantees. In my experience, they significantly lack usability, and because of that are rarely used.
> 2. In general, however, async execution becomes more and more prominent. Something we can't ignore if we want to create a modern framework.
> 3. Still, async support in Java is very limited (especially if compared to other languages, like C# for example).
>
> My current position is the following (happy to discuss):
> 1. We should pay more attention to async APIs. As a general rule, the async API should be primary, with the sync version built on top.
> 2. In languages with proper async support (async-await, etc.), we can skip the sync API altogether. As an example of this, you can look at the first version of the .NET client [1]. It exposes only async methods, and it doesn't look like sync counterparts are really needed.
> 3. In Java (as well as other languages where applicable), we will add sync APIs that simply delegate to async APIs. This will help users to avoid CompletableFuture if they don't want to use it.
>
> [1] https://github.com/apache/ignite-3/pull/306
>
> Please share your thoughts.
>
> -Val
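
To make Val's third point concrete, here is a minimal sketch of an async-first API whose sync methods only delegate to the async ones. KeyValueView, InMemoryView and await are hypothetical names invented for illustration; they are not the Ignite 3 API, and the in-memory map merely stands in for a cluster.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical async-first key-value view; names are illustrative, not the Ignite 3 API. */
interface KeyValueView<K, V> {
    /** Primary API: async operations returning a standard CompletableFuture. */
    CompletableFuture<V> getAsync(K key);

    CompletableFuture<Void> putAsync(K key, V value);

    /** Sync variant that only delegates to the async one and blocks for the result. */
    default V get(K key) {
        return KeyValueView.await(getAsync(key));
    }

    default void put(K key, V value) {
        KeyValueView.await(putAsync(key, value));
    }

    /** Blocks on the future and unwraps CompletionException so callers see the original unchecked error. */
    static <T> T await(CompletableFuture<T> future) {
        try {
            return future.join();
        } catch (CompletionException e) {
            if (e.getCause() instanceof RuntimeException) {
                throw (RuntimeException) e.getCause();
            }
            throw e;
        }
    }
}

/** Toy in-memory implementation so the sketch runs without a cluster. */
final class InMemoryView<K, V> implements KeyValueView<K, V> {
    private final Map<K, V> data = new ConcurrentHashMap<>();

    @Override
    public CompletableFuture<V> getAsync(K key) {
        return CompletableFuture.supplyAsync(() -> data.get(key));
    }

    @Override
    public CompletableFuture<Void> putAsync(K key, V value) {
        return CompletableFuture.runAsync(() -> data.put(key, value));
    }
}
```

An implementation only has to provide the async methods; users who want to avoid CompletableFuture call get and put, while everyone else composes the returned futures directly.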
Re: [DISCUSS] IEP-71 Public API for secondary index search
I prefer option 1 from Taras' response. Specifying the index name is preferred. I've seen customers create idx(A,B) and idx(B,A), where the semantics change between the two.

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0)
https://hypi.io

On Thu, Aug 26, 2021 at 4:28 PM Taras Ledkov wrote:

> Hi,
>
> My proposal:
> 1. Don't search for an index by criteria; always specify the index name (preferred).
>
> OR
>
> 2. Search for an index by criteria without checking the order of the criteria. Use a Set of criteria instead of an ordered collection. In the strange case when both indexes (a, b) and (b, a) exist, use either index when the index name isn't specified.
>
> On 26.08.2021 16:49, Maksim Timonin wrote:
> > There are some thoughts about strict field order:
> > 1. Index (A, B) is not equivalent to index (B, A). Some queries may have different performance on such indexes, and users have to specify the right index. What if both indexes exist?
> > 2. We should avoid cases when a user uses only field B in a query for index (A, B). We have to force the user to specify a range for (A) too, or explicitly set it to (null, null). Otherwise it looks like a mistake.
> >
> > On Thu, Aug 26, 2021 at 4:39 PM Ivan Daschinsky wrote:
> >
> >> 1. I suppose that the next step is to implement the API for manually creating an index. I think that a user wants to create an index that will speed up his or her criteria-based queries, so he or she will use the same criteria to define the index. So no problem at all.
> >> 2. We should print a warning or throw an exception if there is no index that matches the specific criteria.
> >>
> >> BTW, MongoDB doesn't make the user write the index name in a query. It just works.
> >>
> >> Thu, Aug 26, 2021, 15:52 Taras Ledkov:
> >>
> >>> Hi,
> >>>
> >>>> It is a usability nightmare to make the user write the index name in all cases.
> >>> I don't see any difference between specifying the index name and > >>> specifying the index fields in the right order. > >>> Do you see? > >>> > >>> Let's there is the index: > >>> idx_A_B ON TBL (A, B) > >>> > >>> Is it OK that the query like below doesn't math the index 'idx_A_B'? > >>> new IndexQuery<>(..) > >>> .setCriteria(lt("b", 1), lt("a", 2)); > >>> > >>> On 26.08.2021 15:23, Ivan Daschinsky wrote: > >>>> I am against to make user write index name. It is quite simple and > >>>> straightforward algorithm to match index to field names, so it is > >> strange > >>>> to compare it to sql engine optimizer. > >>>> > >>>> It is an usability nightmare to make user write index name in all > >> cases. > >>>> чт, 26 авг. 2021 г., 14:42 Maksim Timonin : > >>>> > >>>>> Hi, Igniters! > >>>>> > >>>>> There is a discussion about how to specify an index to query with an > >>>>> IndexQuery [1]. Currently my PR provides 2 ways to specify index: > >>>>> 1. With a table and index name; > >>>>> 2. With a table and list of index fields (without index name). In > this > >>> case > >>>>> IndexQueryProcessor tries to find an index that matches table and > >> index > >>>>> fields in strict order (order of fields in criteria has to match the > >>> order > >>>>> of fields in index). > >>>>> > >>>>> Discussion is whether is the second approach valid? > >>>>> > >>>>> Pros: > >>>>> 1. Currently index name is an optional field for QueryIndex and > >>>>> QuerySqlField. Then users can create an index with a table and list > of > >>>>> fields. Then, we should provide an opportunity to define an index for > >>>>> querying the same way as we do for creating. > >>>>> 2. It's required to know the index name to query it (in case the > index > >>> was > >>>>> created without an explicit name). Users can find it and then use it > >> as > >>> a > >>>>> constant in code, but I see some troubles there: > >>>>> 2.1. Get index name by querying the system view INDEXES. Note, that > >&g
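
The difference between the two matching rules discussed in this thread can be sketched as follows. This is not the IndexQueryProcessor logic from the PR; IndexMatchSketch and its methods are hypothetical, and the sketch assumes a match means the criteria fields cover exactly the index fields (whether a prefix of the index fields should also match is not stated above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of index lookup by criteria fields; not the actual Ignite code. */
final class IndexMatchSketch {
    /** Strict order: an index matches only if its fields equal the criteria fields in the same order. */
    static List<String> matchStrictOrder(Map<String, List<String>> indexes, List<String> criteriaFields) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : indexes.entrySet()) {
            if (e.getValue().equals(criteriaFields)) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    /** Any order: fields are compared as sets, so (A, B) and (B, A) both match criteria on {A, B}. */
    static List<String> matchAnyOrder(Map<String, List<String>> indexes, List<String> criteriaFields) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : indexes.entrySet()) {
            if (new HashSet<>(e.getValue()).equals(new HashSet<>(criteriaFields))) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    /** Sample catalog with both field orders, as in the (a, b) / (b, a) case from the thread. */
    static Map<String, List<String>> sampleIndexes() {
        Map<String, List<String>> indexes = new LinkedHashMap<>();
        indexes.put("idx_A_B", Arrays.asList("A", "B"));
        indexes.put("idx_B_A", Arrays.asList("B", "A"));
        return indexes;
    }
}
```

With the sample catalog, criteria on (B, A) select only idx_B_A under strict order but both indexes under set matching, which is the ambiguity that specifying the index name avoids.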
Re: Ignite 3 async continuation executor
Pavel,

I would really welcome this. When we first started with Ignite, we were constantly getting Ignite threads blocked because we performed other work on them. I'm not sure about the configuration part, however, because I'd argue this isn't a static thing.

Is Ignite 3 still using its own types, or is it switching to CompletableFuture <https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html>? The key APIs in CompletableFuture (acceptEitherAsync, applyToEitherAsync, handleAsync, thenAcceptAsync, thenComposeAsync, whenCompleteAsync) all already accept an Executor argument, so returning CompletableFuture solves the problem; it would just need documentation.

If Ignite 3 still uses its own types, then I'd suggest that what's needed is an argument to accept a custom Executor. We have dedicated pools configured now with a custom UncaughtExceptionHandler and ThreadFactory <https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadFactory.html> that we have various metrics and customisations around. If the only option is the default ForkJoinPool#commonPool, we'd lose this when eventually moving to 3.

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0)
https://hypi.io

On Thu, Aug 19, 2021 at 5:08 PM Alexander Polovtcev wrote:

> Pavel, thanks for the response. Do I understand correctly that it is not expected that a user may want to specify their own custom executor?
>
> On Thu, Aug 19, 2021 at 6:55 PM Pavel Tupitsyn wrote:
>
> > Hi Alexander,
> >
> > To be honest, I'm not sure yet - just getting to know this new configuration mechanism and format.
> >
> > Since we can't use a property of type Executor, we'll have to provide predefined values.
> > It can either be "bool executeAsyncContinuationsDirectly": false (default) => commonPool, true => Runnable::run,
> > or "String asyncContinuationExecutor" which allows two values "direct" and "commonPool".
> >
> > I'm leaning towards the latter:
> >
> > {
> >   "node": {
> >     "metastorageNodes": [ "node-0" ],
> >     "asyncContinuationExecutor": "commonPool"
> >   },
> >   "network": { ... }
> > }
> >
> > On Thu, Aug 19, 2021 at 6:29 PM Alexander Polovtcev <alexpolovt...@gmail.com> wrote:
> >
> > > Hi, Pavel!
> > >
> > > Can you please provide an example (e.g. a HOCON snippet) of what this configuration is going to look like in Ignite 3? Or how is this property going to be set?
> > >
> > > On Thu, Aug 19, 2021 at 6:00 PM Pavel Tupitsyn wrote:
> > >
> > > > Igniters,
> > > >
> > > > I propose to add a configurable async continuation executor for public APIs to Ignite 3, like we have in Ignite 2.x [1].
> > > >
> > > > In short, currently, async APIs return a future to the user code. Continuations like "myCode" in "table.getAsync().thenApply(myCode)" will be executed by the same thread that completes the future, which will be a Netty thread or some other Ignite thread.
> > > >
> > > > This is dangerous because user code can be blocking or long-running, and system threads become unavailable.
> > > >
> > > > Proposal:
> > > > 1. Add an asyncContinuationExecutor configuration property, which defaults to ForkJoinPool#commonPool, both for server and thin client
> > > > 2. Use this executor to complete all public API futures
> > > >
> > > > This means safe default behavior and a possibility to enable unsafe but faster behavior with the Runnable::run executor.
> > > >
> > > > Thoughts?
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-70%3A+Async+Continuation+Executor
> > >
> > > --
> > > With regards,
> > > Aleksandr Polovtcev
>
> --
> With regards,
> Aleksandr Polovtcev
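
The two approaches in this thread can be sketched with plain JDK classes. ContinuationExecutorSketch, completeOn and namedFactory are hypothetical names, and this is only an assumption about how "use this executor to complete all public API futures" might look, not Ignite code. The first method re-completes an internal future on a user-supplied pool so that a plain thenApply continuation runs there; the second shows the per-call alternative, passing an Executor to thenApplyAsync.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; none of these names come from Ignite. */
final class ContinuationExecutorSketch {
    /** Thread factory with custom thread names and an uncaught-exception handler. */
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger seq = new AtomicInteger();
        return r -> {
            Thread t = new Thread(r, prefix + "-" + seq.incrementAndGet());
            t.setDaemon(true);
            t.setUncaughtExceptionHandler((th, ex) -> System.err.println(th.getName() + " failed: " + ex));
            return t;
        };
    }

    /** Returns a future that is completed on the given executor instead of the thread completing the internal one. */
    static <T> CompletableFuture<T> completeOn(CompletableFuture<T> internal, Executor executor) {
        CompletableFuture<T> pub = new CompletableFuture<>();
        internal.whenComplete((res, err) -> executor.execute(() -> {
            if (err != null) {
                pub.completeExceptionally(err);
            } else {
                pub.complete(res);
            }
        }));
        return pub;
    }

    /** Name of the thread that runs a plain thenApply continuation on the re-completed future. */
    static String continuationThreadName() {
        ExecutorService system = Executors.newSingleThreadExecutor(namedFactory("system"));
        ExecutorService user = Executors.newFixedThreadPool(2, namedFactory("user"));
        try {
            CompletableFuture<String> internal = new CompletableFuture<>();
            CompletableFuture<String> pub = completeOn(internal, user);
            // Registered before completion, so it runs on whichever thread completes "pub".
            CompletableFuture<String> seenOn = pub.thenApply(v -> Thread.currentThread().getName());
            system.execute(() -> internal.complete("value")); // a "system" thread completes the internal future
            return seenOn.join();
        } finally {
            system.shutdown();
            user.shutdown();
        }
    }

    /** Per-call alternative: the caller passes its own executor to the *Async variant. */
    static String explicitExecutorThreadName() {
        ExecutorService user = Executors.newFixedThreadPool(2, namedFactory("user"));
        try {
            return CompletableFuture.completedFuture("value")
                    .thenApplyAsync(v -> Thread.currentThread().getName(), user)
                    .join();
        } finally {
            user.shutdown();
        }
    }
}
```

In both methods the continuation runs on a "user-N" thread, so blocking user code does not tie up the thread that completed the original future, and the pool keeps its custom ThreadFactory and UncaughtExceptionHandler.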
Re: Re[2]: Google Guava in Ignite 3
Since Calcite already brings it in, I think your arguments make sense. Would it be pinned to the same version as Calcite's? If not, we risk a NoSuchMethodError at runtime.

+1

On Mon, Aug 9, 2021 at 9:56 AM Alexander Polovtcev wrote:

> Zhenya, Courtney, Andrey,
>
> What do you think about my arguments, was I able to convince you? I would like to reach some consensus here. At the moment, my original points still stand. I'm also ok with shading Guava if needed, though I think it is not necessary at this point.
>
> On Fri, Aug 6, 2021 at 12:45 PM Alexander Polovtcev <alexpolovt...@gmail.com> wrote:
>
> > Zhenya,
> >
> > > But there are no restrictions on running Ignite server nodes from some other code with its own Guava version; it seems we get a fast path to jar hell here?
> >
> > I'm not sure if I fully understand your question, but it looks like we are in this situation already, because we have some dependencies that use Guava. That's why I propose to add Guava explicitly to at least have a deterministic runtime configuration (see this link <https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management> for an explanation).
> >
> > On Fri, Aug 6, 2021 at 12:25 PM Zhenya Stanilovsky wrote:
> >
> >> Alexander, first of all, it looks like Ivan Daschinsky's approach (thin client use only, plus the shadow plugin) covers all the problems Andrey Mashenkov listed.
> >> But there are no restrictions on running Ignite server nodes from some other code with its own Guava version; it seems we get a fast path to jar hell here?
> >>
> >> >Zhenya,
> >> >
> >> >My intentions are the following:
> >> >
> >> >1. Remove some copy-pasted code (like the "bytecode" module or some utility methods). Please see my original message for the links to the code.
> >> >2. Explicitly pin the Guava version to avoid conflicts in the runtime.
> >> > > >> >About allowing to use Guava in the codebase, my thoughts are the > >> following: > >> > > >> >1. We *already* use some code from Guava either directly (like in the > >> >"calcite" module) or by copy-pasting it into a utility class. > >> >2. I understand that some Guava methods are obsolete as of Java 11, but > >> >some of them still don't have any standard library counterparts, in > which > >> >case I think using Guava is justified (which is supported by point 1). > >> > > >> >Can you please explain why you would disapprove of my proposal? > >> > > >> >On Thu, Aug 5, 2021 at 7:56 PM Zhenya Stanilovsky > >> >< arzamas...@mail.ru.invalid > wrote: > >> > > >> >> > >> >> alexpolovtcev please clarify what do you mean under : «possibility of > >> >> using Guava in Ignite 3», using how necessary dependency of calcite > or > >> >> using like «using in our code» ? If using in code, i -1 here. > >> >> thanks. > >> >> > >> >> > >> >> >Hello, dear Igniters! > >> >> > > >> >> >I would like to discuss the possibility of using Guava > >> >> >< https://github.com/google/guava > in Ignite 3. I know about the > >> >> restrictive > >> >> >policy of using it in Ignite 2, but I have the following reasons: > >> >> > > >> >> >1. We are de-facto using it already as an implicit dependency, since > >> the > >> >> >Calcite module depends on it, and Calcite is going to stay for a > >> while =) > >> >> >2. AFAIK, the "bytecode" module is copied into the codebase only to > >> strip > >> >> >Guava away from it. We can remove this module, which will improve > the > >> >> >maintainability of the project. > >> >> >3. We have some copy-paste of Guava code in the project. 
For > example, > >> see > >> >> >this > >> >> >< > >> >> > >> > https://github.com/apache/ignite-3/blob/main/modules/core/src/main/java/org/apache/ignite/internal/util/IgniteUtils.java#L136 > >> >> > > >> >> >and this > >> >> >< > >> >> > >> > https://github.com/apache/ignite-3/blob/main/modules/core/src/main/java/org/apache/ignite/internal/util/IgniteUtils.java#L428 > >> >> > > >> >> >. > >> >> >4. Regarding security concerns, this report > >> >> >< > >> >> > >> > https://www.cvedetails.com/product/52274/Google-Guava.html?vendor_id=1224 > >> >> > > >> >> >shows no major vulnerability issues for the last three years. > >> >> > > >> >> >Taking these points into account, I propose to allow using Guava > both > >> in > >> >> >production and test code and to add it as an explicit dependency. > >> >> > > >> >> >What do you think? > >> >> > > >> >> >-- > >> >> >With regards, > >> >> >Aleksandr Polovtcev > >> >> > >> >> > >> >> > >> >> > >> > > >> > > >> >-- > >> >With regards, > >> >Aleksandr Polovtcev > >> > >> > >> > >> > > > > > > > > -- > > With regards, > > Aleksandr Polovtcev > > > > > -- > With regards, > Aleksandr Polovtcev >
Re: Google Guava in Ignite 3
Also, what impact will this have on peer class loading? That's something I think shading also resolves.

On Thu, Aug 5, 2021 at 7:05 PM Courtney Robinson wrote:

> Can I suggest shading Guava?
> Guava and Netty are two libraries notorious for version conflicts because of their popularity and usefulness.
> Other projects (Elasticsearch, for example) solved it by shading them: https://github.com/elastic/elasticsearch/issues/2091#issuecomment-7156766
>
> We use Ignite entirely as a thick client and already have Guava version conflicts from other projects (Calcite being one, because we already use it directly), so Ignite bringing its own will only make this worse when we get to V3.
>
> Even Calcite itself already has Guava conflicts because of the Cassandra adapter. I'd +1 this but really only if it will be shaded.
>
> Regards,
> Courtney Robinson
> Founder and CEO, Hypi
> Tel: ++44 208 123 2413 (GMT+0)
> https://hypi.io
>
> On Thu, Aug 5, 2021 at 5:56 PM Zhenya Stanilovsky wrote:
>
>> alexpolovtcev, please clarify what you mean by «possibility of using Guava in Ignite 3»: using it as a necessary dependency of Calcite, or «using in our code»? If using in code, I'm -1 here.
>> Thanks.
>>
>> >Hello, dear Igniters!
>> >
>> >I would like to discuss the possibility of using Guava <https://github.com/google/guava> in Ignite 3. I know about the restrictive policy of using it in Ignite 2, but I have the following reasons:
>> >
>> >1. We are de facto using it already as an implicit dependency, since the Calcite module depends on it, and Calcite is going to stay for a while =)
>> >2. AFAIK, the "bytecode" module is copied into the codebase only to strip Guava away from it. We can remove this module, which will improve the maintainability of the project.
>> >3. We have some copy-paste of Guava code in the project.
For example, see >> >this >> >< >> https://github.com/apache/ignite-3/blob/main/modules/core/src/main/java/org/apache/ignite/internal/util/IgniteUtils.java#L136 >> > >> >and this >> >< >> https://github.com/apache/ignite-3/blob/main/modules/core/src/main/java/org/apache/ignite/internal/util/IgniteUtils.java#L428 >> > >> >. >> >4. Regarding security concerns, this report >> >< >> https://www.cvedetails.com/product/52274/Google-Guava.html?vendor_id=1224 >> > >> >shows no major vulnerability issues for the last three years. >> > >> >Taking these points into account, I propose to allow using Guava both in >> >production and test code and to add it as an explicit dependency. >> > >> >What do you think? >> > >> >-- >> >With regards, >> >Aleksandr Polovtcev >> >> >> >> > >
Re: Google Guava in Ignite 3
Can I suggest shading Guava?

Guava and Netty are two libraries notorious for version conflicts because of their popularity and usefulness. Other projects (Elasticsearch, for example) solved it by shading them: https://github.com/elastic/elasticsearch/issues/2091#issuecomment-7156766

We use Ignite entirely as a thick client and already have Guava version conflicts from other projects (Calcite being one, because we already use it directly), so Ignite bringing its own will only make this worse when we get to V3.

Even Calcite itself already has Guava conflicts because of the Cassandra adapter. I'd +1 this but really only if it will be shaded.

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0)
https://hypi.io

On Thu, Aug 5, 2021 at 5:56 PM Zhenya Stanilovsky wrote:

> alexpolovtcev, please clarify what you mean by «possibility of using Guava in Ignite 3»: using it as a necessary dependency of Calcite, or «using in our code»? If using in code, I'm -1 here.
> Thanks.
>
> >Hello, dear Igniters!
> >
> >I would like to discuss the possibility of using Guava <https://github.com/google/guava> in Ignite 3. I know about the restrictive policy of using it in Ignite 2, but I have the following reasons:
> >
> >1. We are de facto using it already as an implicit dependency, since the Calcite module depends on it, and Calcite is going to stay for a while =)
> >2. AFAIK, the "bytecode" module is copied into the codebase only to strip Guava away from it. We can remove this module, which will improve the maintainability of the project.
> >3. We have some copy-paste of Guava code in the project.
For example, see > >this > >< > https://github.com/apache/ignite-3/blob/main/modules/core/src/main/java/org/apache/ignite/internal/util/IgniteUtils.java#L136 > > > >and this > >< > https://github.com/apache/ignite-3/blob/main/modules/core/src/main/java/org/apache/ignite/internal/util/IgniteUtils.java#L428 > > > >. > >4. Regarding security concerns, this report > >< > https://www.cvedetails.com/product/52274/Google-Guava.html?vendor_id=1224 > > > >shows no major vulnerability issues for the last three years. > > > >Taking these points into account, I propose to allow using Guava both in > >production and test code and to add it as an explicit dependency. > > > >What do you think? > > > >-- > >With regards, > >Aleksandr Polovtcev > > > >
Re: Apache Ignite 3 Alpha 2 webinar follow up questions
Hi Ivan,

Atri's description of the query plan being cached is what I had in mind.

Atri, I lack the knowledge of how the statistics are maintained to comment constructively, but my first question about the problem you raise would be: how and where are the stats maintained, and if a query plan is cached based on some stats, is it not possible to invalidate the cached plan periodically or when the statistics change?

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0)
https://hypi.io

On Sat, Jul 31, 2021 at 8:54 AM Atri Sharma wrote:

> Query caching works on three levels - caching results, caching blocks and caching query plans.
>
> Prepared queries work by caching a plan for a query and reusing that plan by changing the parameters for the incoming query. So the query remains the same, but input values keep changing.
>
> The problem with prepared queries is that query execution can go bad very fast if the underlying data distribution changes and the cached plan is no longer optimal for the given statistics.
>
> On Sat, 31 Jul 2021, 12:54 Ivan Pavlukhin, wrote:
>
> > Hi Courtney,
> >
> > Please clarify what you mean by prepared queries and query caching. Do you mean caching query results? If so, in my mind materialized views are the best approach here (Ignite 2 does not support them). Do you have other good approaches in mind? E.g. ones implemented in other databases.
> >
> > 2021-07-26 21:27 GMT+03:00, Valentin Kulichenko <valentin.kuliche...@gmail.com>:
> > > Hi Courtney,
> > >
> > > Generally speaking, query caching certainly makes sense. As far as I know, Ignite 2.x actually does that, but most likely there might be room for improvement as well. We will look into this.
> > >
> > > As for the SQL API - the answer is yes. The requirement for a dummy cache is an artifact of the current architecture. This is 100% wrong and will be changed in 3.0.
> > >
> > > -Val
> > >
> > > On Sun, Jul 25, 2021 at 2:51 PM Courtney Robinson wrote:
> > >
> > >> Something else came to mind, are there plans to support prepared queries?
> > >>
> > >> I recall someone saying before that Ignite does internally cache queries but it's not at all clear if or how it does do that. I assume a simple hash of the query isn't enough.
> > >>
> > >> We generate SQL queries based on user runtime settings and they can get to hundreds of lines long, I imagine this means most of our queries are not being cached but there are patterns so we could generate and manage prepared queries ourselves.
> > >>
> > >> Also, will there be a dedicated API for doing SQL queries rather than having to pass a SqlFieldsQuery to a cache that has nothing to do with the cache being queried? When I first started with Ignite years ago, this was beyond confusing for me. I'm trying to run select x from B but I pass this to a cache called DUMMY or whatever arbitrary name...
> > >>
> > >> On Fri, Jul 23, 2021 at 4:05 PM Courtney Robinson <courtney.robin...@hypi.io> wrote:
> > >>
> > >> > Andrey,
> > >> > Thanks for the response - see my comments inline.
> > >> >
> > >> >> I've gone through the questions and don't have the whole picture of your use case.
> > >> >> Would you please clarify how exactly you use Ignite? What are the integration points?
> > >> >
> > >> > I'll try to clarify - we have a low/no code platform. A user designs a model for their application and we map this model to Ignite tables and other data sources. The model I'll describe is what we're building now and expected to be in alpha some time in Q4 21. Our current production architecture is different and isn't as generic, it is heavily tied to Ignite and we've redesigned to get some fle
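
The question at the top of this message, whether a cached plan can be invalidated when statistics change, can be illustrated with a small sketch. PlanCache is a hypothetical class and says nothing about how Ignite or Calcite actually cache plans; it assumes the engine exposes some monotonically increasing statistics version, which is an assumption made purely for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical plan cache keyed by SQL text; a plan is rebuilt when the statistics version changes. */
final class PlanCache<P> {
    /** A cached plan together with the statistics version it was planned against. */
    private static final class Entry<P> {
        final P plan;
        final long statsVersion;

        Entry(P plan, long statsVersion) {
            this.plan = plan;
            this.statsVersion = statsVersion;
        }
    }

    private final Map<String, Entry<P>> plans = new ConcurrentHashMap<>();
    private final Function<String, P> planner;

    PlanCache(Function<String, P> planner) {
        this.planner = planner;
    }

    /** Returns the cached plan if it matches the current statistics version, otherwise re-plans. */
    P planFor(String sql, long currentStatsVersion) {
        Entry<P> entry = plans.compute(sql, (key, cached) ->
                cached != null && cached.statsVersion == currentStatsVersion
                        ? cached
                        : new Entry<P>(planner.apply(key), currentStatsVersion));
        return entry.plan;
    }
}
```

A parameterised query such as "SELECT x FROM B WHERE id = ?" is planned once per statistics version: repeated calls with the same version reuse the plan, and the first call after the version changes triggers re-planning.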
Re: Text Queries Support
+1, we're all saying the same thing here.

My earlier example, select x from T0 where term(args to solr term query) AND .. term(xxx), was meant to indicate a Lucene term query, so there would be a list of Lucene functions exposed in a similar way.

On Mon, Jul 26, 2021 at 5:45 PM Atri Sharma wrote:

> +1
>
> Let's expose custom functions in Ignite SQL which allow us to use the full capabilities that Lucene offers.
>
> On Mon, 26 Jul 2021, 21:51 Andrey Mashenkov, wrote:
>
> > Val,
> >
> > > I believe this is something we can look into in the scope of Ignite 3.
> > > Andrey, does Calcite have any support for this? What's your view on this?
> >
> > As Atri already mentioned, the SQL 92 standard declares the "LIKE" operator for pattern matching.
> > Calcite supports the LIKE operator.
> >
> > I've found it is a RexNode (expression) and I doubt it supports indices.
> > Maybe LIKE can use a sorted index for prefix matching or equality conditions, but it is very far from what we are talking about.
> >
> > Full-text search is a much wider term than just pattern matching.
> > Lucene provides many more capabilities for that and has a rich syntax, contrary to the "LIKE" operator.
> > So, the LIKE operator is the standard operator with the defined contract. I'm not sure it is worth integrating Lucene just for it.
> > I think we should have native support for full-text search queries and/or a custom SQL function.
> >
> > E.g. Postgres syntax for FTS queries [1] is completely different from the "LIKE" operator.
> >
> > [1] https://www.postgresql.org/docs/9.5/textsearch-intro.html#TEXTSEARCH-MATCHING
> >
> > On Sat, Jul 24, 2021 at 4:49 PM Courtney Robinson <courtney.robin...@hypi.io> wrote:
> >
> > > Hey Atri,
> > > Yes, I wasn't suggesting that Solr should be used. That's just what we're doing now out of necessity.
> > > It was more the fact that Calcite's SqlOperator can be used to provide the interface to Lucene.
> > > For all the reasons you mentioned and more, using Lucene is the right choice.
> > >
> > > Calcite doesn't have support for Solr but it has an ES adapter which is what we modified to support Solr.
> > >
> > > Regards,
> > > Courtney Robinson
> > > Founder and CEO, Hypi
> > > Tel: ++44 208 123 2413 (GMT+0)
> > > https://hypi.io
> > >
> > > On Sat, Jul 24, 2021 at 1:59 PM Atri Sharma wrote:
> > >
> > > > What that entails is that the end user has to keep a Solr cluster running, which comes with its own challenges (now you have to manage two systems instead of one).
> > > >
> > > > I believe Calcite has native support for Solr?
> > > >
> > > > OTOH, having native Lucene indices allows us to control per-partition indices with no distributed overhead, since Lucene is a per-node instance with no global coordination.
> > > >
> > > > On Sat, 24 Jul 2021, 16:57 Courtney Robinson, <courtney.robin...@hypi.io> wrote:
> > > >
> > > > > I'll add in here.
> > > > > I agree with you Valentin, the decoupled state of text queries makes it useless for most use cases we have.
> > > > >
> > > > > As it relates to Calcite and Ignite 3, one approach (the one we're taking because we use Calcite independent of Ignite) is to provide a bunch of SQL functions that we implement as SqlOperator <https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/SqlOperator.html>.
> > > > > I forget how we've done aggregation functions but we have those too and they map to Solr aggregations (which ultimately end up in Lucene).
> > > > >
> > > > > This allows Solr filters to take part in the rest of the query. It's probably more complex than this for Ignite but that's one possible route but we generate queries like select x
Re: Apache Ignite 3 Alpha 2 webinar follow up questions
Something else came to mind: are there plans to support prepared queries? I recall someone saying before that Ignite caches queries internally, but it's not at all clear whether or how it does that. I assume a simple hash of the query isn't enough. We generate SQL queries based on user runtime settings and they can be hundreds of lines long. I imagine this means most of our queries are not being cached, but there are patterns, so we could generate and manage prepared queries ourselves.

Also, will there be a dedicated API for SQL queries, rather than having to pass a SqlFieldsQuery to a cache that has nothing to do with the table being queried? When I first started with Ignite years ago, this was beyond confusing for me: I'm trying to run select x from B, but I pass it to a cache called DUMMY or some other arbitrary name...

On Fri, Jul 23, 2021 at 4:05 PM Courtney Robinson wrote:

> Andrey,
> Thanks for the response - see my comments inline.
>
>> I've gone through the questions and don't have the whole picture of your
>> use case.
>> Would you please clarify how exactly you use Ignite? What are the
>> integration points?
>
> I'll try to clarify - we have a low/no code platform. A user designs a
> model for their application and we map this model to Ignite tables and
> other data sources. The model I'll describe is what we're building now and
> is expected to be in alpha some time in Q4 21. Our current production
> architecture is different and isn't as generic; it is heavily tied to
> Ignite, and we've redesigned to get some flexibility where Ignite doesn't
> provide what we want. Things like window functions and other SQL-99 limits.
>
> In the next gen version we're working on you can create a model for a
> Tweet(content, to) and we will create an Ignite table with content and to
> columns using the type the user selects. This is the simplest case.
> We are adding generic support for sources and sinks and using Calcite as a
> data virtualisation layer. Ignite is one of the available sources/sinks.
>
> When a user creates a model for Tweet, we also allow them to specify how
> they want to index the data. We have a copy of the Calcite Elasticsearch
> adapter modified for Solr.
>
> When a source is queried (Ignite or any other that we support), we
> generate SQL that Calcite executes. Calcite will push down the generated
> queries to Solr and Solr produces a list of IDs (in case of Ignite) and we
> do a multi-get from Ignite to produce the actual results.
>
> Obviously there's a lot more to this but that should give you a general
> idea.
>
>> and maybe share some experience with using Ignite SPIs?
>
> Our evolution with Ignite started from the key value + compute APIs. We
> used the SPIs then but have since moved to using only the Ignite SQL API
> (we gave up transactions for this).
>
> We originally used the indexing SPI to keep our own Lucene index of data
> in a cache. We did not use the Ignite FTS as it is very limited compared to
> what we allow customers to do. If I remember correctly, we were using an
> affinity compute job to send queries to the right Ignite node and
> then doing a multi-get to pull the data from caches.
> I think we used one or two other SPIs and we found them very useful to be
> able to extend and customise Ignite without having to fork/change upstream
> classes. We only stopped using them because we eventually concluded that
> using the SQL only API was better for numerous reasons.
>
>> We'll keep the information in mind while developing Ignite,
>> because this may help us to make a better product.
>>
>> By the way, I'll try to answer the questions.
>>
>> > 1. Schema change - does that include the ability to change the types of
>> > fields/columns?
>> Yes, we plan to support transparent conversion to a wider type on the fly
>> (e.g. 'int' to 'long').
>> This is a major point of our Live-schema concept.
>> In fact, there is no need to convert data on all the nodes in a synchronous
>> way as old SQL databases do (if they support it at all);
>> we are going to support multiple schema versions and convert data on demand
>> on a per-row basis to the latest version,
>> then write back the row.
>
> I can understand. The auto conversion to wider type makes sense.
>
>> More complex things like 'String' -> 'int' are out of scope for now because
>> it requires the execution of user code on the critical path.
>
> I would argue though that executing user code on the critical path
> shouldn't be a blocker for custom conversions. I feel if a user is making
> an advanced enough integration to provide c
Re: Text Queries Support
Hey Atri,

Yes, I wasn't suggesting that Solr should be used. That's just what we're doing now out of necessity. It was more the fact that Calcite's SqlOperator can be used to provide the interface to Lucene. For all the reasons you mentioned and more, using Lucene is the right choice.

Calcite doesn't have support for Solr, but it has an ES adapter, which is what we modified to support Solr.

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0)
https://hypi.io

On Sat, Jul 24, 2021 at 1:59 PM Atri Sharma wrote:

> What that entails is that the end user has to keep a Solr cluster running,
> which comes with its own challenges (now you have to manage two systems
> instead of one).
>
> I believe Calcite has native support for Solr?
>
> OTOH, having native Lucene indices allows us to control per-partition
> indices with no distributed overhead, since Lucene is a per-node instance
> with no global coordination.
>
> On Sat, 24 Jul 2021, 16:57 Courtney Robinson wrote:
>
> > I'll add in here.
> > I agree with you Valentin, the decoupled state of text queries makes it
> > useless for most use cases we have.
> >
> > As it relates to Calcite and Ignite 3, one approach (the one we're taking
> > because we use Calcite independently of Ignite) is to provide a bunch of
> > SQL functions that we implement as SqlOperator
> > <https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/SqlOperator.html>.
> > I forget how we've done aggregation functions but we have those too and
> > they map to Solr aggregations (which ultimately end up in Lucene).
> >
> > This allows Solr filters to take part in the rest of the query. It's
> > probably more complex than this for Ignite, but that's one possible route.
> > We generate queries like select x from T0 where term(args to solr term
> > query) AND ...
Re: Text Queries Support
I'll add in here.
I agree with you Valentin, the decoupled state of text queries makes it useless for most use cases we have.

As it relates to Calcite and Ignite 3, one approach (the one we're taking because we use Calcite independently of Ignite) is to provide a bunch of SQL functions that we implement as SqlOperator <https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/SqlOperator.html>.
I forget how we've done aggregation functions but we have those too and they map to Solr aggregations (which ultimately end up in Lucene).

This allows Solr filters to take part in the rest of the query. It's probably more complex than this for Ignite, but that's one possible route. We generate queries like select x from T0 where term(args to solr term query) AND ...

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0)
https://hypi.io

On Fri, Jul 23, 2021 at 7:14 PM Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:

> Atri,
>
> Sure, go ahead. Let's put the ideas on paper and have a discussion.
>
> -Val
>
> On Fri, Jul 23, 2021 at 10:59 AM Atri Sharma wrote:
>
> > Thanks Andrey.
> >
> > I have collected answers or proposals to many of these questions and
> > would like to start a wiki page covering what we can do for Ignite 3.
> >
> > Does that sound good, please?
> >
> > On Fri, Jul 23, 2021 at 4:26 PM Andrey Mashenkov wrote:
> > >
> > > Atri,
> > >
> > > First of all, I'd recommend going through the Ignite ticket to gather
> > > information about the current implementation issues and users' wants.
> > > Then look at the code to get a complete understanding of how things work
> > > now, which may help in future decisions.
> > >
> > > As we use an outdated Lucene version, some things may be irrelevant for
> > > the latest Lucene version.
> > > So, you will need expertise in the internals of the modern Lucene version
> > > to understand what capabilities, guarantees, and limitations Lucene has
> > > and could bring to Ignite.
> > > The expertise can be gained from the Lucene project code or the Lucene
> > > project dev-list.
> > >
> > > As for now, the potential capabilities are not clear to me.
> > > At first glance, I see the following topics that must be covered first:
> > >
> > > General questions
> > > * How can a Lucene index be split among the nodes?
> > > * If we have a single index for all partitions on a particular node,
> > > then how will index records be aware of partitioning?
> > > This is important to filter out backup records from the results to avoid
> > > duplicates.
> > > * How can results from several nodes be merged on the Reduce stage?
> > > * Does Lucene support something like a JOIN operation or others that may
> > > require data from another partition or index?
> > > If so, then it looks like a multistep query with merging results on
> > > intermediate stages and requires detailed investigation and design.
> > > It is ok if Ignite has some limitations here, but we would like to
> > > know about them at an early stage.
> > > * How can Lucene files be mapped effectively to the page memory? Is it
> > > even possible?
> > > Otherwise, how do we deal with potential OOM on large queries and with
> > > memory capacity planning?
> > >
> > > Persistence.
> > > * How and what consistency guarantees could we have/expect?
> > > It seems we may not be able to write physical records for the Lucene
> > > index to our WAL. What can we do about this?
> > >
> > > Transactions.
> > > * Will we support transactions?
> > > * Should Lucene be aware of transactions and track mvcc (or whatever)
> > > versions for the records?
> > > * What will the consistency guarantees be?
> > >
> > > UX
> > > * How do we add FullText search query syntax to Calcite?
> > > * AFAIK, the Lucene index has many properties for tuning. How will the
> > > user configure the index?
> > > * How and where do we store the settings? Which are cluster-wide and
> > > which are local to a particular node?
> > > * Will all the settings be immutable? Can they be changed on the fly?
> > > After node/grid restart?
> > > * Any limitations on query syntax?
> > >
> > > SQL
> > > * Will we support FullText search in SQL?
> > > * How to integrate Lucene index into Calcite? What is the
Re: Apache Ignite 3 Alpha 2 webinar follow up questions
des) will definitely kill the performance
> in that case.
> So, the preliminary loadCache() call looks like a good compromise.

I think the problem is largely that the CacheStore interface is not sufficient to do this. If it had a richer interface that allowed the cache store to answer index queries, basically hooking into whatever Ignite is doing for its B+tree, then this would be viable. A CacheStore that only implements the KV API doesn't take part in SQL queries.

> 3. Splitting a query into 2 parts, one to run on Ignite and one to run on
> CacheStore, looks possible with Calcite,
> but I think it is impractical because, in general, neither CacheStore nor
> the database structure is aware of the data partitioning.

Hmmm, maybe I missed the point, but as the implementor of the CacheStore you should have knowledge of the structure and partition info, or have some way of retrieving it. Again, I think the current CacheStore interface is the problem. If it were extended to provide this information, then it's up to the implementation to do this, whilst Ignite knows that any implementation of these interfaces will meet the necessary contract.

> 4. Transactions can't be supported in case of direct CacheStore access,
> because even if the underlying database supports 2-phase commit, which is a
> rare case, the recovery protocol looks hard.
> It just looks like this feature isn't worth it.

I'd completely agree with this. It will be incredibly hard to get this done reliably.

> > 6. This question wasn't mine but I was going to ask it as well: What
> > will happen to the Indexing API since H2 is being removed?
> As I wrote above, Indexing SPI will be dropped, but IndexQuery will be
> added.
>
> > 1. As I mentioned above, we index into Solr. In earlier versions of
> > our product we used the indexing SPI to index into Lucene on the Ignite
> > nodes, but this presented so many challenges that we ultimately abandoned
> > it and replaced it with the current Solr solution.
> AFAIK, some guys developed and sell a plugin for Ignite-2 with persistent
> Lucene and Geo indices.
> I don't know about the capabilities and limitations of their solution,
> because the code is closed.
> You can easily google it.
>
> I saw a few encouraged guys who want to improve TEXT queries,
> but unfortunately, things haven't moved far enough. For now, they are in
> the middle of fixing the merging of TEXT query results.
> So far so good.
>
> I think it is a good chance to master the skill of developing a distributed
> system for the one
> who will take the lead on the full-text search feature and add native
> FullText index support to Ignite-3.

I've seen the other thread about this, from Atri I believe.

> > 7. What impact does RAFT now have on conflict resolution?
> RAFT is a state machine replication protocol. It guarantees all the nodes
> will see the updates in the same order.
> So, it seems no conflicts are possible. Recovery from split-brain is
> impossible in the common case.
>
> However, I think we will have a conflict resolver analog in Ignite-3 as it
> is very useful in some cases,
> e.g. datacenter replication, incremental data load from a 3rd-party source,
> recovery from a 3rd-party source.
>
> > 8. CacheGroups.
> AFAIK, CacheGroup will be eliminated. Actually, we'll keep this mechanism,
> but it will be configured in a different way,
> which makes configuring Ignite a bit simpler.
> Sorry, for now, I have no answer to your performance concerns; this part of
> Ignite-3 slipped off my radar.

No worries. I'll wait and see if anyone else suggests something. It's getting a lot worse: a node took 1hr to start yesterday after a deployment, and it's in prod with very little visibility into what it is doing. It just stopped, with no logging or anything, and then resumed.

2021-07-22 13:40:15.997 INFO [ArcOS,,,] 9 --- [orker-#40%hypi%] o.a.i.i.p.cache.GridCacheProcessor [285] : Finished recovery for cache [cache=hypi_01F8ZC3DGT66RNYCDZH3XNVY2E_Hue, grp=hypi, startVer=AffinityTopologyVersion [topVer=79, minorTopVer=0]]

One hour later it printed the next cache recovery message and started 30 seconds after going through other tables.

> Let's wait and see if someone will clarify what we could expect in Ignite-3.
> Guys, can someone chime in and shed more light on questions 3, 4, 7, 8?
>
> On Thu, Jul 22, 2021 at 4:15 AM Courtney Robinson <courtney.robin...@hypi.io>
> wrote:
>
> > Hey everyone,
> > I attended the Alpha 2 update yesterday and was quite pleased to see the
> > progress on things so far. So first, congratulations to everyone on the
> > work being put in and thank you to Val and Kseniya for running yesterday's
> > event.
> >
> > I
Apache Ignite 3 Alpha 2 webinar follow up questions
ctResolver and manager are used by GridCacheMapEntry, which just says: if use old value, do this; otherwise use newVal. Ideally this will be exposed in the new API so that one can override this behaviour. The last-writer-wins approach isn't always ideal, and the semantics of the domain can mean that what is considered "correct" in a conflict in one domain is not so in a different domain.

8. This is last on the list but is actually the most important for us right now, as it is an impending and growing risk. We allow customers to create their own tables on demand. We're already using the same cache group etc. so that data structures are re-used, but now that we're getting to thousands of tables/caches our startup times are sometimes unpredictably long. At present it seems to depend on the state of the cache/table before the restart, but we're in the order of 5 - 7 mins and steadily increasing with the growth of tables. Are there any provisions in Ignite 3 for ensuring startup time isn't proportional to the number of tables/caches available?

Those are the key things I can think of at the moment. Val and others, I'd love to open a conversation around these.

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0)
https://hypi.io
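
To make the request about overriding conflict handling a bit more concrete, here is a minimal sketch of what a pluggable policy could look like. Everything in it is hypothetical: `Versioned`, `ConflictResolver`, `lastWriterWins` and `largestValueWins` are names invented for illustration and are not part of the Ignite 2 or Ignite 3 API. It only shows the idea that last-writer-wins can be one policy among several, with the domain choosing which one applies.

```java
// Hypothetical sketch: none of these types exist in Ignite 2 or Ignite 3.
final class ConflictResolverSketch {

    /** A value plus the timestamp of the write that produced it. */
    static final class Versioned<V> {
        final V value;
        final long timestamp;

        Versioned(V value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Pluggable policy: given the stored and the incoming value, pick the one to keep. */
    interface ConflictResolver<V> {
        Versioned<V> resolve(Versioned<V> oldVal, Versioned<V> newVal);
    }

    /** Default policy: the later timestamp wins; ties go to the incoming value. */
    static <V> ConflictResolver<V> lastWriterWins() {
        return (oldVal, newVal) -> newVal.timestamp >= oldVal.timestamp ? newVal : oldVal;
    }

    /** Domain-specific policy for a counter that must never go backwards. */
    static ConflictResolver<Long> largestValueWins() {
        return (oldVal, newVal) -> newVal.value >= oldVal.value ? newVal : oldVal;
    }

    public static void main(String[] args) {
        Versioned<Long> stored = new Versioned<>(42L, 100L);   // older write, larger counter
        Versioned<Long> incoming = new Versioned<>(7L, 200L);  // newer write, smaller counter

        ConflictResolver<Long> lww = lastWriterWins();
        System.out.println(lww.resolve(stored, incoming).value);                // prints 7
        System.out.println(largestValueWins().resolve(stored, incoming).value); // prints 42
    }
}
```

With the same pair of conflicting writes, the two policies keep different values, which is the point being made above: what counts as "correct" depends on the domain, so the policy would ideally be something the user can supply.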