Re: AggregateUnionTransposeRule fails when some inputs have unique grouping key
Hi Pavel, Yes, I missed the list, sorry. On Wed, May 19, 2021 at 14:40, Pavel Tupitsyn wrote: > Hi Vladimir, > > Looks like this message is for d...@calcite.apache.org, not > dev@ignite.apache.org, > or am I mistaken? > > On Wed, May 19, 2021 at 2:25 PM Vladimir Ozerov > wrote: > > > Hi, > > > > The AggregateUnionTransposeRule attempts to push the Aggregate below the > > Union. > > > > Before: > > Aggregate[group=$0, agg=SUM($1)] > > Union[all] > > Input1 > > Input2 > > > > After: > > Aggregate[group=$0, agg=SUM($1)] > > Union[all] > > Aggregate[group=$0, agg=SUM($1)] > > Input1 > > Aggregate[group=$0, agg=SUM($1)] > > Input2 > > > > When pushing the Aggregate, the rule checks whether the input is definitively > > unique on the grouping key. If so, the Aggregate is not installed on top > > of that input, assuming that the result would be the same as without the > > aggregate. This causes a type mismatch exception when the aggregation is > > pushed to only some of the inputs: > > Aggregate[group=$0, agg=SUM($1)] > > Union[all] > > Aggregate[group=$0, agg=SUM($1)] > > Input1 > > Input2 > > > > It seems that the uniqueness check should not be in that rule at all, and > > the aggregate should be pushed unconditionally. Motivation: we already have > > AggregateRemoveRule, which removes unnecessary aggregates, so there is no need to > > duplicate this non-trivial logic. > > > > Does the proposal make sense to you? > > > > Regards, > > Vladimir. > > >
AggregateUnionTransposeRule fails when some inputs have unique grouping key
Hi,

The AggregateUnionTransposeRule attempts to push the Aggregate below the Union.

Before:

Aggregate[group=$0, agg=SUM($1)]
  Union[all]
    Input1
    Input2

After:

Aggregate[group=$0, agg=SUM($1)]
  Union[all]
    Aggregate[group=$0, agg=SUM($1)]
      Input1
    Aggregate[group=$0, agg=SUM($1)]
      Input2

When pushing the Aggregate, the rule checks whether the input is definitively unique on the grouping key. If so, the Aggregate is not installed on top of that input, assuming that the result would be the same as without the aggregate. This causes a type mismatch exception when the aggregation is pushed to only some of the inputs:

Aggregate[group=$0, agg=SUM($1)]
  Union[all]
    Aggregate[group=$0, agg=SUM($1)]
      Input1
    Input2

It seems that the uniqueness check should not be in that rule at all, and the aggregate should be pushed unconditionally. Motivation: we already have AggregateRemoveRule, which removes unnecessary aggregates, so there is no need to duplicate this non-trivial logic.

Does the proposal make sense to you?

Regards,
Vladimir.
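The failure mode can be illustrated with a small self-contained sketch. This is not Calcite code — the class and method names below (`UnionTypeMismatch`, `unionInputsCompatible`) and the row-type model are hypothetical — but it shows why installing the Aggregate on only some Union inputs yields branches with incompatible row types:

```java
import java.util.List;

// Hypothetical model (not Calcite classes) of why pushing the Aggregate
// below only some Union inputs breaks row-type validation.
public class UnionTypeMismatch {
    // Row type of a raw input: the scan exposes all of its columns.
    static final List<String> RAW_ROW = List.of("key:INTEGER", "val:INTEGER", "other:VARCHAR");

    // Row type produced by Aggregate[group=$0, agg=SUM($1)]: only the group
    // key and the aggregate survive, so arity and types differ from RAW_ROW.
    static final List<String> AGG_ROW = List.of("key:INTEGER", "sum_val:INTEGER");

    // Union requires all of its inputs to expose identical row types.
    static boolean unionInputsCompatible(List<List<String>> inputs) {
        return inputs.stream().allMatch(row -> row.equals(inputs.get(0)));
    }

    public static void main(String[] args) {
        // Aggregate pushed to both inputs: row types match, plan is valid.
        System.out.println(unionInputsCompatible(List.of(AGG_ROW, AGG_ROW)));
        // Aggregate skipped for one "unique" input: row types diverge,
        // which is the type mismatch described above.
        System.out.println(unionInputsCompatible(List.of(AGG_ROW, RAW_ROW)));
    }
}
```

Pushing the aggregate to every input unconditionally keeps all branches at the same row type by construction, leaving cleanup of trivial aggregates to AggregateRemoveRule.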
Re: [ANNOUNCE] New Committer: Taras Ledkov
Congratulations, Taras! Well deserved! On Tue, May 12, 2020 at 20:27, Ivan Rakov wrote: > Taras, > > Congratulations and welcome! > > On Tue, May 12, 2020 at 8:26 PM Denis Magda wrote: > > > Taras, > > > > Welcome, that was long overdue on our part! Hope to see you soon among > the > > PMC group. > > > > - > > Denis > > > > On Tue, May 12, 2020 at 9:09 AM Dmitriy Pavlov > wrote: > > > > > Hello Ignite Community, > > > > > > The Project Management Committee (PMC) for Apache Ignite has invited > > Taras > > > Ledkov to become a committer and we are pleased to announce that he has > > > accepted. > > > > > > Taras is an Ignite SQL veteran who knows the current Ignite-H2 > > > integration and binary serialization in detail, actively participates in JDBC and > > > thin client protocol development, and is eager to help users on the user > > > list within his area of expertise. > > > > > > Being a committer enables easier contribution to the project since there > is > > > no need to go through the patch submission process. This should enable better > > > productivity. > > > > > > Taras, thank you for all your efforts, congratulations and welcome on > > > board! > > > > > > Best Regards, > > > > > > Dmitriy Pavlov > > > > > > on behalf of Apache Ignite PMC > > >
JSR 381 (Visual Recognition in Java) and Apache Ignite
Hi Alexey, Igniters, Let me introduce you to Heather, Zoran, and Frank. Heather is the Chair of the JCP. Zoran and Frank are the JSR 381 spec leads. They are interested in discussing the upcoming visual recognition specification with the Apache Ignite community, to understand whether the community has any interest in implementing it. Zoran and Frank, please meet Alexey Zinoviev, who is the principal maintainer of the Apache Ignite ML module. I hope he will be able to help you with your questions. Regards, Vladimir.
Re: Adding support for Ignite secondary indexes to Apache Calcite planner
Roman, What I am trying to understand is what advantage of the materialization API you see over the normal optimization process. Does it save optimization time, reduce the memory footprint, or maybe provide better plans? I am asking because I do not see how expressing indexes as materializations fits the classical optimization process. We discussed the Sort <- Scan optimization. Let's consider another example: LogicalSort[a ASC] LogicalJoin Initially, you do not know the implementation of the join, and hence do not know its collation. Then you may execute physical join rules, which produce, say, PhysicalMergeJoin[a ASC]. If you execute the sort implementation rule afterwards, you may easily eliminate the sort, or make it simpler (e.g. remove the local sorting phase), depending on the distribution. In other words, a proper implementation of sorting optimization assumes that you have a kind of SortRemoveRule anyway, irrespective of whether you use materializations or not, because sorting may be injected on top of any operator. With this in mind, the use of materializations doesn't make the planner simpler. Nor does it improve the outcome of the whole optimization process. What is left is either lower CPU or RAM usage. Is this the case? On Wed, Dec 11, 2019 at 18:37, Roman Kondakov wrote: > Vladimir, > > the main advantage of the Phoenix approach I can see is the use of > Calcite's native materializations API. Calcite has advanced support for > materializations [1] and lattices [2]. Since secondary indexes can be > considered as materialized views (it's just a sorted representation of > the same table), we can seamlessly use views to simulate index behavior > for the Calcite planner. > > > [1] https://calcite.apache.org/docs/materialized_views.html > [2] https://calcite.apache.org/docs/lattice.html > > -- > Kind Regards > Roman Kondakov > > > On 11.12.2019 17:11, Vladimir Ozerov wrote: > > Roman, > > > > What is the advantage of the Phoenix approach then? BTW, it looks like > Phoenix > > integration with Calcite never made it to production, did it? > > > > On Tue, Dec 10, 2019 at 19:50, Roman Kondakov wrote: > > > >> Hi Vladimir, > >> > >> from what I understand, Drill does not exploit the collation of indexes. To > >> be precise, it does not exploit index collation in a "natural" way where, > >> say, we have a sorted TableScan and hence do not create a new Sort. > >> Instead, Drill always creates a Sort operator, but if the TableScan can > >> be replaced with an IndexScan, this Sort operator is removed by a > >> dedicated rule. > >> > >> Let's consider an initial operator tree: > >> > >> Project > >> Sort > >> TableScan > >> > >> after applying the rule DbScanToIndexScanPrule this tree will be converted > to: > >> > >> Project > >> Sort > >> IndexScan > >> > >> and finally, after applying DbScanSortRemovalRule we have: > >> > >> Project > >> IndexScan > >> > >> while with the Phoenix approach we would have two equivalent subsets in our > >> planner: > >> > >> Project > >> Sort > >> TableScan > >> > >> and > >> > >> Project > >> IndexScan > >> > >> and most likely the last plan will be chosen as the best one. > >> > >> -- > >> Kind Regards > >> Roman Kondakov > >> > >> > >> On 10.12.2019 17:19, Vladimir Ozerov wrote: > >>> Hi Roman, > >>> > >>> Why do you think that Drill-style will not let you exploit collation? > >>> Collation should be propagated from the index scan in the same way as > in > >>> other sorted operators, such as merge join or streaming aggregate. > >> Provided > >>> that you use the converter-hack (or any alternative solution to trigger > >> parent > >>> re-analysis). > >>> In other words, propagation of collation from Drill-style indexes > should > >> be > >>> no different from other sorted operators. > >>> > >>> Regards, > >>> Vladimir. > >>> > >>> On Tue, Dec 10, 2019 at 16:40, Zhenya Stanilovsky wrote: > >>> > >>>> > >>>> Roman, just a quick remark: Phoenix builds their approach on an > >>>> already existing monolithic HBase architecture; in most cases it's just a > >> stub > >>>> for someone who wants to use secondary indexes with a database with no > >>>>
Re: Adding support for Ignite secondary indexes to Apache Calcite planner
Roman, What is the advantage of the Phoenix approach then? BTW, it looks like Phoenix integration with Calcite never made it to production, did it? On Tue, Dec 10, 2019 at 19:50, Roman Kondakov wrote: > Hi Vladimir, > > from what I understand, Drill does not exploit the collation of indexes. To > be precise, it does not exploit index collation in a "natural" way where, > say, we have a sorted TableScan and hence do not create a new Sort. > Instead, Drill always creates a Sort operator, but if the TableScan can > be replaced with an IndexScan, this Sort operator is removed by a > dedicated rule. > > Let's consider an initial operator tree: > > Project > Sort > TableScan > > after applying the rule DbScanToIndexScanPrule this tree will be converted to: > > Project > Sort > IndexScan > > and finally, after applying DbScanSortRemovalRule we have: > > Project > IndexScan > > while with the Phoenix approach we would have two equivalent subsets in our > planner: > > Project > Sort > TableScan > > and > > Project > IndexScan > > and most likely the last plan will be chosen as the best one. > > -- > Kind Regards > Roman Kondakov > > > On 10.12.2019 17:19, Vladimir Ozerov wrote: > > Hi Roman, > > > > Why do you think that Drill-style will not let you exploit collation? > > Collation should be propagated from the index scan in the same way as in > > other sorted operators, such as merge join or streaming aggregate. > Provided > > that you use the converter-hack (or any alternative solution to trigger > parent > > re-analysis). > > In other words, propagation of collation from Drill-style indexes should > be > > no different from other sorted operators. > > > > Regards, > > Vladimir. > > > > On Tue, Dec 10, 2019 at 16:40, Zhenya Stanilovsky wrote: > > > >> > >> Roman, just a quick remark: Phoenix builds their approach on an > >> already existing monolithic HBase architecture; in most cases it's just a > stub > >> for someone who wants to use secondary indexes with a database with no > >> native support of it.
Don't think it's a good idea here. > >> > >>> > >>> > >>> --- Forwarded message --- > >>> From: "Roman Kondakov" < kondako...@mail.ru.invalid > > >>> To: dev@ignite.apache.org > >>> Cc: > >>> Subject: Adding support for Ignite secondary indexes to Apache Calcite > >>> planner > >>> Date: Tue, 10 Dec 2019 15:55:52 +0300 > >>> > >>> Hi all! > >>> > >>> As you may know, work on integrating the Apache Calcite > >>> query optimizer into the Ignite codebase is being carried out [1],[2]. > >>> > >>> One of a bunch of problems in this integration is the absence of > >>> out-of-the-box support for secondary indexes in Apache Calcite. After > >>> some research I came to the conclusion that this problem has a couple of > >>> workarounds. Let's name them: > >>> 1. Phoenix-style approach - representing secondary indexes as > >>> materialized views, which are natively supported by the Calcite engine [3] > >>> 2. Drill-style approach - pushing filters into the table scans and > >>> choosing an appropriate index for lookups when possible [4] > >>> > >>> Both of these approaches have advantages and disadvantages: > >>> > >>> Phoenix style pros: > >>> - natural way of adding indexes as an alternative source of rows: an index > >>> can be considered as a kind of sorted materialized view. > >>> - possibility of using index sortedness for stream aggregates, > >>> deduplication (DISTINCT operator), merge joins, etc. > >>> - ability to support other types of indexes (i.e. functional indexes). > >>> > >>> Phoenix style cons: > >>> - polluting the optimizer's search space with extra table scans, hence increasing > >>> the planning time. > >>> > >>> Drill style pros: > >>> - easier to implement (although it's questionable). > >>> - search space is not inflated. > >>> > >>> Drill style cons: > >>> - missed opportunity to exploit sortedness. > >>> > >>> A good discussion of both approaches can be found in > >> [5].
> >>> > >>> I made a small sketch [6] in order to demonstrate the applicability of > >>> the Phoenix approach to Ignite. Key d
Re: Adding support for Ignite secondary indexes to Apache Calcite planner
Hi Roman, Why do you think that Drill-style will not let you exploit collation? Collation should be propagated from the index scan in the same way as in other sorted operators, such as merge join or streaming aggregate. Provided that you use the converter-hack (or any alternative solution to trigger parent re-analysis). In other words, propagation of collation from Drill-style indexes should be no different from other sorted operators. Regards, Vladimir. On Tue, Dec 10, 2019 at 16:40, Zhenya Stanilovsky wrote: > > Roman, just a quick remark: Phoenix builds their approach on an > already existing monolithic HBase architecture; in most cases it's just a stub > for someone who wants to use secondary indexes with a database with no > native support of it. Don't think it's a good idea here. > > > > > > >--- Forwarded message --- > >From: "Roman Kondakov" < kondako...@mail.ru.invalid > > >To: dev@ignite.apache.org > >Cc: > >Subject: Adding support for Ignite secondary indexes to Apache Calcite > >planner > >Date: Tue, 10 Dec 2019 15:55:52 +0300 > > > >Hi all! > > > >As you may know, work on integrating the Apache Calcite > >query optimizer into the Ignite codebase is being carried out [1],[2]. > > > >One of a bunch of problems in this integration is the absence of > >out-of-the-box support for secondary indexes in Apache Calcite. After > >some research I came to the conclusion that this problem has a couple of > >workarounds. Let's name them: > >1. Phoenix-style approach - representing secondary indexes as > >materialized views, which are natively supported by the Calcite engine [3] > >2. Drill-style approach - pushing filters into the table scans and > >choosing an appropriate index for lookups when possible [4] > > > >Both of these approaches have advantages and disadvantages: > > > >Phoenix style pros: > >- natural way of adding indexes as an alternative source of rows: an index > >can be considered as a kind of sorted materialized view.
> >- possibility of using index sortedness for stream aggregates, > >deduplication (DISTINCT operator), merge joins, etc. > >- ability to support other types of indexes (i.e. functional indexes). > > > >Phoenix style cons: > >- polluting the optimizer's search space with extra table scans, hence increasing > >the planning time. > > > >Drill style pros: > >- easier to implement (although it's questionable). > >- search space is not inflated. > > > >Drill style cons: > >- missed opportunity to exploit sortedness. > > > >A good discussion of both approaches can be found in [5]. > > > >I made a small sketch [6] in order to demonstrate the applicability of > >the Phoenix approach to Ignite. Key design concepts are: > >1. On creation, indexes are registered as tables in the Calcite schema. This > >step is needed for internal Calcite routines. > >2. On planner initialization we register these indexes as materialized > >views in Calcite's optimizer using the VolcanoPlanner#addMaterialization > >method. > >3. Right before query execution Calcite selects all materialized > >views (indexes) which can potentially be used in the query. > >4. During query optimization, indexes are registered by the planner as > >usual TableScans and hence can be chosen by the optimizer if they have a lower > >cost. > > > >This sketch shows the ability to exploit index sortedness only. So future > >work in this direction should be focused on using indexes for > >fast index lookups. At first glance, FilterableTable and > >FilterTableScanRule are good points to start. We can push the Filter into > >the TableScan and then use FilterableTable for fast index lookups, > >avoiding reading the whole index on the TableScan step and then filtering > >its output on the Filter step. > > > >What do you think?
> > > > > > > >[1] > > > http://apache-ignite-developers.2346864.n4.nabble.com/New-SQL-execution-engine-tt43724.html#none > >[2] > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine > >[3] https://issues.apache.org/jira/browse/PHOENIX-2047 > >[4] https://issues.apache.org/jira/browse/DRILL-6381 > >[5] https://issues.apache.org/jira/browse/DRILL-3929 > >[6] https://github.com/apache/ignite/pull/7115 > > > >
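The Drill-style flow discussed in this thread (the Sort stays in the tree until a dedicated rule proves the scan's collation already satisfies it) can be sketched on a toy operator tree. The classes below are hypothetical stand-ins, not Calcite or Drill APIs:

```java
import java.util.List;

// Toy operator tree (hypothetical, not Calcite/Drill classes) illustrating
// DbScanSortRemovalRule-style logic: a Sort on top of a scan is redundant
// when the scan already produces rows in the required collation.
public class SortRemoval {
    interface Op { List<String> collation(); }

    record IndexScan(String index, List<String> collation) implements Op {}
    record TableScan() implements Op {
        public List<String> collation() { return List.of(); } // unsorted
    }
    record Sort(List<String> keys, Op input) implements Op {
        public List<String> collation() { return keys; }
    }

    // The "rule": if the input's collation is a prefix-match for the sort
    // keys, drop the Sort node and keep the input.
    static Op removeRedundantSort(Op op) {
        if (op instanceof Sort s) {
            List<String> c = s.input().collation();
            if (c.size() >= s.keys().size()
                    && c.subList(0, s.keys().size()).equals(s.keys())) {
                return s.input();
            }
        }
        return op;
    }

    public static void main(String[] args) {
        Op overIndex = new Sort(List.of("a"), new IndexScan("idx_a", List.of("a")));
        Op overTable = new Sort(List.of("a"), new TableScan());
        System.out.println(removeRedundantSort(overIndex)); // Sort removed: IndexScan remains
        System.out.println(removeRedundantSort(overTable)); // Sort kept: TableScan is unsorted
    }
}
```

This is the "dedicated rule" half of the Drill approach; the Phoenix approach instead relies on the planner comparing the costs of the two equivalent subsets.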
Re: [DISCUSS] PMC Chair rotation time
Hi Anton, Thanks for adding me to the list. The Ignite community is very vibrant; it would be an honor for me to help it resonate even further. For example, I was thinking about an Ignite - Hazelcast integration (aka "Hazelnite") which could have brought a lot of fresh technical discussions to the list. But as this may affect the diversity of my life, let me think more about whether I am ready for this role ... Vladimir. On Thu, Oct 24, 2019 at 10:34, Anton Vinogradov wrote: > More candidates: > > - Alexey Zinoviev. > He is not a PMC member, but it's definitely time to change this. > BigData evangelist. > > - Ivan Rakov. > He is not a PMC member, but it's definitely time to change this. > Distributed systems evangelist and great lead. > > - Denis Magda. > Keeping this stable is also an option. > > - Pavel Tupitsyn. > Open-minded, experienced in global communications. > > - Roman Shtykh. > RocketMQ and Ignite committer. > Highly involved in Ignite development and popularization. > > - Vladimir Ozerov. > Ignite and Hazelcast committer. > A man with really strong leadership skills. > > - Yakov Zhdanov. > Ignite's godfather. > > On Thu, Oct 24, 2019 at 10:18 AM Alexey Zinoviev > wrote: > > > Currently we are only discussing candidates here, aren't we? This is not a vote? > > > > Also, we should ask candidates about their plans, shouldn't we? > > > > Please, Denis, remind us of the voting procedure if we have more than 1 > > candidate. > > > > And one more suggestion: maybe we should rotate every year, so it will be > > easy for candidates to plan their work-life balance? > > > > Thanks > > > > On Thu, Oct 24, 2019 at 8:44, Dmitriy Pavlov wrote: > > > > > Hi Igniters, > > > > > > I would be happy to serve this role, but since my current day-job > > projects are > > > not related to Ignite, it may cause some delays in replies. > > > > > > I would give my +1 to both candidates, but I'm concerned about how keeping the PMC > > > Chair inside the company that initially donated the code to the ASF could move the > > > community to new trails. I strongly believe that giving more control > > outside > > > and improving diversity always helps to find new ways of developing > > > solutions. > > > > > > So I, in any case, don't object to Alexey serving this role. But if > > Alexey > > > is more active on the list and if he is not affiliated with GridGain, it > > > would be ++1 from my side. For now, I'm not so sure. > > > > > > Sincerely, > > > Dmitriy Pavlov > > > > > > On Thu, Oct 24, 2019 at 03:39, Nikita Ivanov wrote: > > > > > > > > > > > +1 Alexey Goncharuk > > > > -- > > > > Nikita Ivanov > > > > > > > > > On Oct 21, 2019, at 10:21 PM, Denis Magda > wrote: > > > > > > > > > > Igniters, > > > > > > > > > > It's been almost 3 years since my election as the PMC Chair and I'd > > > like > > > > > the community to give other PMC members an opportunity to serve in > > this > > > > > role. It's healthy to rotate the role more frequently and we're > > already > > > > > due. Though the chair doesn't have formal power, he/she can bring a > > > fresh > > > > > perspective and help to navigate the community via trails not > > > considered > > > > > before. > > > > > > > > > > Please propose candidates, selecting among active PMC members: > > > > > https://ignite.apache.org/community/resources.html#people > > > > > > > > > > > > > > > Denis > > > > > > > > > > > > > > >> On Monday, December 5, 2016, Dmitriy Setrakyan < > > dsetrak...@apache.org > > > > > > > > >> wrote: > > > > >> > > > > >> I haven't forgotten. Just got back from a business trip. Will start > > a > > > > vote > > > > >> this week. > > > > >> > > > > >> On Wed, Nov 23, 2016 at 5:18 PM, Dmitriy Setrakyan < > > > > dsetrak...@apache.org> > > > > >> wrote: > > > > >> > > > > >>> Cos, I will start the vote soon. A bit over occupied with travel > > and > > > > >>> holidays at this moment. > > > > >>> > > > > >>> On Mon, Nov 21, 2016 at
[jira] [Created] (IGNITE-11701) SQL: Reflect in documentation change of system views schema from "IGNITE" to "SYS"
Vladimir Ozerov created IGNITE-11701: Summary: SQL: Reflect in documentation change of system views schema from "IGNITE" to "SYS" Key: IGNITE-11701 URL: https://issues.apache.org/jira/browse/IGNITE-11701 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Assignee: Artem Budnikov Fix For: 2.8 Previously all system views were located in the "IGNITE" schema. Now we have moved them to "SYS" because this is more intuitive and consistent with other database vendors. We need to do two things: # Update the documentation of system views: change the "IGNITE" schema to "SYS" # Add a balloon informing users that before AI 2.8 system views were located in the "IGNITE" schema and that the previous behavior can be forced with the "-DIGNITE_SQL_SYSTEM_SCHEMA_NAME_IGNITE=true" system property. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Thin client: transactions support
protocol limitations that are > >> causing > >> > this. > >> > But I have no idea how to support this in the .NET Thin Client, for > example. > >> > > >> > It is thread-safe and can handle multiple async operations in > parallel. > >> > But with TX support we have to somehow switch to single-threaded mode to > >> > avoid unexpected effects. > >> > > >> > Any ideas? > >> > > >> > > >> > On Mon, Apr 1, 2019 at 6:38 PM Alex Plehanov > > >> > wrote: > >> > > >> > > Dmitriy, thank you! > >> > > > >> > > Guys, I've created the IEP [1] on the wiki, please have a look. > >> > > > >> > > [1] > >> > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support > >> > > > >> > > > >> > > On Thu, Mar 28, 2019 at 14:33, Dmitriy Pavlov wrote: > >> > > > >> > > > Hi, > >> > > > > >> > > > I've added permissions to account plehanov.alex > >> > > > > >> > > > Recently Infra integrated Apache LDAP with Confluence, so it is > >> > possible > >> > > to > >> > > > log in using Apache credentials. Probably we can ask Infra if extra > >> > > > permissions to edit pages should be added for committers. > >> > > > > >> > > > Sincerely, > >> > > > Dmitriy Pavlov > >> > > > > >> > > > On Wed, Mar 27, 2019 at 13:37, Alex Plehanov < plehanov.a...@gmail.com > >> >: > >> > > > > >> > > > > Vladimir, > >> > > > > > >> > > > > About the current tx: ok, then we don't need the tx() method in the > >> interface > >> > > at > >> > > > > all (the user can cache the same transaction info himself). > >> > > > > > >> > > > > About decoupling transactions from threads on the server side: for > >> now, > >> > > > we > >> > > > > can start with the thread-per-connection approach (we can only support > >> one > >> > > > > active transaction per connection, see below, so we need one > >> > additional > >> > > > > dedicated thread for each connection with an active transaction), and > >> > > later > >> > > > > change the server-side internals to process client transactions in any > >> > > server > >> > > > > thread (not dedicated to this connection). This change will not > >> > affect > >> > > > the > >> > > > > thin client protocol, it only affects the server side. > >> > > > > In any case, we can't support concurrent transactions per > >> connection > >> > on > >> > > > > the client side without fundamental changes to the current > >> protocol > >> > > > (a cache > >> > > > > operation isn't bound to a transaction or thread, and the server > >> > doesn't > >> > > > > know which thread on the client side does this cache operation). In > >> my > >> > > > > opinion, if a user wants to use concurrent transactions, he must > >> use > >> > > > > different connections from a connection pool. > >> > > > > > >> > > > > About the semantics of suspend/resume on the client side: it's > >> absolutely > >> > > > > different from the server-side semantics (we don't need to do > >> > > suspend/resume > >> > > > to > >> > > > > pass a transaction between threads on the client side), but it can't be > >> > > > > implemented efficiently without suspend/resume implemented on the > >> > > > server side. > >> > > > > > >> > > > > Can anyone give me permissions to create an IEP on the Apache wiki? > >> > > > > > >> > > > > On Wed, Mar 27, 2019 at 11:59, Vladimir Ozerov < > >> voze...@gridgain.com>: > >> > > > > > >> > > > > > Hi Alex, > >> > > > >
[jira] [Created] (IGNITE-11648) Document SCHEMAS system view
Vladimir Ozerov created IGNITE-11648: Summary: Document SCHEMAS system view Key: IGNITE-11648 URL: https://issues.apache.org/jira/browse/IGNITE-11648 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Assignee: Artem Budnikov Fix For: 2.8 We added "SCHEMAS" system view. It contains only one attribute - "SCHEMA_NAME".
Re: Thin client: transactions support
Hi Alex, My comments were only about the protocol. Getting current info about the transaction should be handled by the client itself; it is not the protocol's concern. The same goes for other APIs and the behavior when another transaction is attempted from the same thread. Putting the protocol aside, transaction support is a complicated matter. I would propose to route this through an IEP and a wide community discussion. We need to review the API and semantics very carefully, taking SUSPEND/RESUME into account. Also, I do not see how we can support client transactions efficiently without decoupling transactions from threads on the server side first. Without it you will need a dedicated server thread for every client transaction, which is slow and may even crash the server. Vladimir. On Wed, Mar 27, 2019 at 11:44 AM Alex Plehanov wrote: > Vladimir, what if we want to get current transaction info (the tx() method)? > > Is the close() method mapped to TX_END(rollback)? > > For example, this code: > > try(tx = txStart()) { > tx.commit(); > } > > Will produce: > TX_START > TX_END(commit) > TX_END(rollback) > > Do I understand you right? > > About xid. There is yet another proposal: use a unique per-connection id > (an integer, a simple counter) to identify the transaction in the > commit/rollback message. The client gets this id from the server with the > transaction info and sends it back to the server when trying to > commit/rollback the transaction. This id is not shown to users. But we can also > pass the real transaction id (xid) from server to client with the transaction info > for diagnostic purposes. > > And one more question: what should we do if the client starts a new > transaction without ending the old one? Should we end the old transaction > implicitly (rollback) or throw an exception to the client? In my opinion, > the first option is better. For example, if we got a previously used > connection from the connection pool, we should not worry about any > uncompleted transaction started by the previous user of this connection. > > On Wed, Mar 27, 2019 at 11:02, Vladimir Ozerov wrote: > > > As for SUSPEND/RESUME/SAVEPOINT - we do not support them yet, and > adding > > them in the future should not conflict with the simple START/END infrastructure. > > > > On Wed, Mar 27, 2019 at 11:00 AM Vladimir Ozerov > > wrote: > > > > > Hi Alex, > > > > > > I am not sure we need 5 commands. Wouldn't it be enough to have only > two? > > > > > > START - accepts optional parameters, returns transaction info > > > END - provides commit flag, returns void > > > > > > Vladimir. > > > > > > On Wed, Mar 27, 2019 at 8:26 AM Alex Plehanov > > > > wrote: > > > > > >> Sergey, yes, close is something like a silent rollback. But we can > > >> also implement this on the client side, just using rollback and ignoring > > >> errors in the response. > > >> > > >> On Wed, Mar 27, 2019 at 00:04, Sergey Kozlov wrote: > > >> > > >> > Nikolay > > >> > > > >> > Do I understand your points correctly: > > >> > > > >> >- close: rollback > > >> >- commit, close: do nothing > > >> >- rollback, close: do what? (I suppose nothing) > > >> > > > >> > Also, do you assume that after commit/rollback we may need to free some > > >> > resources on server node(s), or just on the client-started TX? > > >> > > > >> > > > >> > > > >> > On Tue, Mar 26, 2019 at 10:41 PM Alex Plehanov < > > plehanov.a...@gmail.com > > >> > > > >> > wrote: > > >> > > > >> > > Sergey, we have the close() method in the thick client; its behavior > >> is > > >> > > slightly different from the rollback() method (it should roll back if > the > > >> > > transaction is not committed and do nothing if the transaction is > >> already > > >> > > committed). I think we should support try-with-resources semantics > in > > >> the > > >> > > thin client and OP_TX_CLOSE will be useful here.
> > >> > > > > >> > > Nikolay, suspend/resume doesn't work yet for pessimistic > > transactions. > >> > Also, > >> > > the main goal of the suspend/resume operations is to support passing a transaction > >> > > between threads. In the thin client, the transaction is > > bound > >> to > >&
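For reference, the close() semantics discussed in this thread (silent rollback of an uncompleted transaction, no-op if it was already committed or rolled back) can be modeled with a toy AutoCloseable. This is an illustrative sketch, not the actual Ignite thin-client implementation; the class names are hypothetical:

```java
// Toy model (not the real Ignite thin-client classes) of close() semantics:
// close() rolls back an active transaction and does nothing if the
// transaction was already committed or rolled back.
public class TxCloseSemantics {
    enum State { ACTIVE, COMMITTED, ROLLED_BACK }

    static class ClientTransaction implements AutoCloseable {
        State state = State.ACTIVE;

        public void commit()   { ensureActive(); state = State.COMMITTED; }
        public void rollback() { ensureActive(); state = State.ROLLED_BACK; }

        // try-with-resources calls this: silent rollback of an active tx only.
        @Override public void close() {
            if (state == State.ACTIVE) state = State.ROLLED_BACK;
        }

        private void ensureActive() {
            if (state != State.ACTIVE) throw new IllegalStateException("tx is " + state);
        }
    }

    public static void main(String[] args) {
        ClientTransaction committed = new ClientTransaction();
        try (ClientTransaction tx = committed) {
            tx.commit();                      // explicit commit...
        }                                     // ...so close() is a no-op
        ClientTransaction abandoned = new ClientTransaction();
        try (ClientTransaction tx = abandoned) {
            // no commit: leaving the block rolls the transaction back
        }
        System.out.println(committed.state);  // COMMITTED
        System.out.println(abandoned.state);  // ROLLED_BACK
    }
}
```

With these semantics, close() can also be implemented client-side as "rollback, ignoring errors in the response", which is one of the alternatives raised above.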
Re: Thin client: transactions support
As for SUSPEND/RESUME/SAVEPOINT - we do not support them yet, and adding them in the future should not conflict with the simple START/END infrastructure. On Wed, Mar 27, 2019 at 11:00 AM Vladimir Ozerov wrote: > Hi Alex, > > I am not sure we need 5 commands. Wouldn't it be enough to have only two? > > START - accepts optional parameters, returns transaction info > END - provides commit flag, returns void > > Vladimir. > > On Wed, Mar 27, 2019 at 8:26 AM Alex Plehanov > wrote: > >> Sergey, yes, close is something like a silent rollback. But we can >> also implement this on the client side, just using rollback and ignoring >> errors in the response. >> >> On Wed, Mar 27, 2019 at 00:04, Sergey Kozlov wrote: >> >> > Nikolay >> > >> > Do I understand your points correctly: >> > >> >- close: rollback >> >- commit, close: do nothing >> >- rollback, close: do what? (I suppose nothing) >> > >> > Also, do you assume that after commit/rollback we may need to free some >> > resources on server node(s), or just on the client-started TX? >> > >> > >> > On Tue, Mar 26, 2019 at 10:41 PM Alex Plehanov > > >> > wrote: >> > >> > > Sergey, we have the close() method in the thick client; its behavior is >> > > slightly different from the rollback() method (it should roll back if the >> > > transaction is not committed and do nothing if the transaction is already >> > > committed). I think we should support try-with-resources semantics in the >> > > thin client and OP_TX_CLOSE will be useful here. >> > > >> > > Nikolay, suspend/resume doesn't work yet for pessimistic transactions. >> > Also, >> > > the main goal of the suspend/resume operations is to support passing a transaction >> > > between threads. In the thin client, the transaction is bound to >> > > the client connection, not to a client thread. I think passing a transaction >> > > between different client connections is not a very useful case. >> > > >> > > On Tue, Mar 26, 2019 at 22:17, Nikolay Izhikov wrote: >> > > >> > > > Hello, Alex. >> > > > >> > > > We also have suspend and resume operations. >> > > > I think we should support them. >> > > > >> > > > On Tue, Mar 26, 2019 at 22:07, Sergey Kozlov wrote: >> > > > >> > > > > Hi >> > > > > >> > > > > Looks like I missed something, but why do we need the OP_TX_CLOSE operation? >> > > > > >> > > > > Also I suggest reserving a code for a SAVEPOINT operation, which is very >> > > > useful >> > > > > to understand where a transaction has been rolled back. >> > > > > >> > > > > On Tue, Mar 26, 2019 at 6:07 PM Alex Plehanov < >> > plehanov.a...@gmail.com >> > > > >> > > > > wrote: >> > > > > >> > > > > > Hello Igniters! >> > > > > > >> > > > > > I want to pick up the ticket IGNITE-7369 and add transactions support >> > to >> > > > > > our thin client implementation. >> > > > > > I've looked at our current implementation and have some proposals to >> > > > > > support transactions: >> > > > > > >> > > > > > Add new operations to the thin client protocol: >> > > > > > >> > > > > > OP_TX_GET, 4000, Get current transaction for client connection >> > > > > > OP_TX_START, 4001, Start a new transaction >> > > > > > OP_TX_COMMIT, 4002, Commit transaction >> > > > > > OP_TX_ROLLBACK, 4003, Rollback transaction >> > > > > > OP_TX_CLOSE, 4004, Close transaction >> > > > > > >> > > > > > From the client side (java) new interfaces will be added: >> > > > > > >> > > > > > public interface ClientTransactions { >> > > > > > public ClientTransaction txStart(); >> > > > > > public ClientTransaction txStart(TransactionConcurrency >> > > > concurrency, >> > > > > > TransactionIsolation isolation); >> > > > > > public ClientTransact
ion(); > > > > > > public TransactionConcurrency concurrency(); > > > > > > public long timeout(); > > > > > > public String label(); > > > > > > > > > > > > public void commit(); > > > > > > public void rollback(); > > > > > > public void close(); > > > > > > } > > > > > > > > > > > > From the server side, I think as a first step (while transactions > > > > > > suspend/resume is not fully implemented) we can use the same > > approach > > > > as > > > > > > for JDBC: add a new worker to each ClientRequestHandler and > process > > > > > > requests by this worker if the transaction is started explicitly. > > > > > > ClientRequestHandler is bound to client connection, so there will > > be > > > > 1:1 > > > > > > relation between client connection and thread, which process > > > operations > > > > > in > > > > > > a transaction. > > > > > > > > > > > > Also, there is a couple of issues I want to discuss: > > > > > > > > > > > > We have overloaded method txStart with a different set of > > arguments. > > > > Some > > > > > > of the arguments may be missing. To pass arguments with > OP_TX_START > > > > > > operation we have the next options: > > > > > > * Serialize full set of arguments and use some value for missing > > > > > > arguments. For example -1 for int/long types and null for string > > > type. > > > > We > > > > > > can't use 0 for int/long types since 0 it's a valid value for > > > > > concurrency, > > > > > > isolation and timeout arguments. > > > > > > * Serialize arguments as a collection of property-value pairs > > (like > > > > it's > > > > > > implemented now for CacheConfiguration). In this case only > > explicitly > > > > > > provided arguments will be serialized. > > > > > > Which way is better? The simplest solution is to use the first > > option > > > > > and I > > > > > > want to use it if there were no objections. > > > > > > > > > > > > Do we need transaction id (xid) on the client side? 
> > > > > > If yes, we can pass xid along with OP_TX_COMMIT, OP_TX_ROLLBACK, > > > > > > OP_TX_CLOSE operations back to the server and do additional check > > on > > > > the > > > > > > server side (current transaction id for connection == transaction > > id > > > > > passed > > > > > > from client side). This, perhaps, will protect clients against > some > > > > > errors > > > > > > (for example when client try to commit outdated transaction). But > > > > > > currently, we don't have data type IgniteUuid in thin client > > > protocol. > > > > Do > > > > > > we need to add it too? > > > > > > Also, we can pass xid as a string just to inform the client and > do > > > not > > > > > pass > > > > > > it back to the server with commit/rollback operation. > > > > > > Or not to pass xid at all (.NET thick client works this way as > far > > > as I > > > > > > know). > > > > > > > > > > > > What do you think? > > > > > > > > > > > > ср, 7 мар. 2018 г. в 16:22, Vladimir Ozerov < > voze...@gridgain.com > > >: > > > > > > > > > > > > > We already have transactions support in JDBC driver in TX SQL > > > branch > > > > > > > (ignite-4191). Currently it is implemented through separate > > thread, > > > > > which > > > > > > > is not that efficient. Ideally we need to finish decoupling > > > > > transactions > > > > > > > from threads. But alternatively we can change the logic on how > we > > > > > assign > > > > > > > thread ID to specific transaction and "impersonate" thin client > > > > worker > > > > > > > threads when serving requests from multiple users. > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 6, 2018 at 10:01 PM, Denis Magda < > dma...@apache.org> > > > > > wrote: > > > > > > > > > > > > > > > Here is an original discussion with a reference to the JIRA > > > ticket: > > > > > > > > http://apache-ignite-developers.2346864.n4.nabble. 
> > > > > > > > com/Re-Transaction-operations-using-the-Ignite-Thin-Client- > > > > > > > > Protocol-td25914.html > > > > > > > > > > > > > > > > -- > > > > > > > > Denis > > > > > > > > > > > > > > > > On Tue, Mar 6, 2018 at 9:18 AM, Dmitriy Setrakyan < > > > > > > dsetrak...@apache.org > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi Dmitriy. I don't think we have a design proposal for > > > > transaction > > > > > > > > support > > > > > > > > > in thin clients. Do you mind taking this initiative and > > > creating > > > > an > > > > > > IEP > > > > > > > > on > > > > > > > > > Wiki? > > > > > > > > > > > > > > > > > > D. > > > > > > > > > > > > > > > > > > On Tue, Mar 6, 2018 at 8:46 AM, Dmitriy Govorukhin < > > > > > > > > > dmitriy.govoruk...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > Hi, Igniters. > > > > > > > > > > > > > > > > > > > > I've seen a lot of discussions about thin client and > binary > > > > > > protocol, > > > > > > > > > but I > > > > > > > > > > did not hear anything about transactions support. Do we > > have > > > > some > > > > > > > draft > > > > > > > > > for > > > > > > > > > > this purpose? > > > > > > > > > > > > > > > > > > > > As I understand we have several problems: > > > > > > > > > > > > > > > > > > > >- thread and transaction have hard related (we use > > > > > thread-local > > > > > > > > > variable > > > > > > > > > >and thread name) > > > > > > > > > >- we can process only one transaction at the same time > > in > > > > one > > > > > > > thread > > > > > > > > > (it > > > > > > > > > >mean we need hold thread per client. If connect 100 > thin > > > > > clients > > > > > > > to > > > > > > > > 1 > > > > > > > > > >server node, then need to hold 100 thread on the > server > > > > side) > > > > > > > > > > > > > > > > > > > > Let's discuss how we can implement transactions for the > > thin > > > > > > client. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Sergey Kozlov > > > > > GridGain Systems > > > > > www.gridgain.com > > > > > > > > > > > > > > > > > > -- > > Sergey Kozlov > > GridGain Systems > > www.gridgain.com > > >
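To make the two proposed serialization options for OP_TX_START concrete, here is a toy sketch of the first one (serialize the full argument set, with sentinel values: -1 for numeric arguments and null for the label). The class name, field layout, and null-label encoding are assumptions for illustration only, not the actual Ignite thin client protocol:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative codec for OP_TX_START arguments ("option 1" from the thread).
// Missing numeric arguments are encoded as -1 (0 is a valid concurrency,
// isolation, and timeout value, so it cannot serve as the sentinel); a missing
// label is encoded as length -1. All names and layout here are hypothetical.
class TxStartCodec {
    static final byte NOT_SET = -1;

    static byte[] encode(byte concurrency, byte isolation, long timeout, String label) {
        byte[] labelBytes = label == null ? null : label.getBytes(StandardCharsets.UTF_8);
        int labelLen = labelBytes == null ? 0 : labelBytes.length;
        ByteBuffer buf = ByteBuffer.allocate(1 + 1 + 8 + 4 + labelLen);
        buf.put(concurrency);                           // -1 if not set
        buf.put(isolation);                             // -1 if not set
        buf.putLong(timeout);                           // -1 if not set
        buf.putInt(labelBytes == null ? -1 : labelLen); // length -1 marks a null label
        if (labelBytes != null)
            buf.put(labelBytes);
        return buf.array();
    }

    static String decodeLabel(byte[] msg) {
        ByteBuffer buf = ByteBuffer.wrap(msg);
        buf.get();      // skip concurrency
        buf.get();      // skip isolation
        buf.getLong();  // skip timeout
        int len = buf.getInt();
        if (len < 0)
            return null;
        byte[] b = new byte[len];
        buf.get(b);
        return new String(b, StandardCharsets.UTF_8);
    }
}
```

The upside of this scheme is a fixed request layout with no per-property tags; the downside is that every new argument changes the wire format, which is where the property-value-pair option (as used for CacheConfiguration) would be more flexible.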
[jira] [Created] (IGNITE-11630) Document changes to SQL views
Vladimir Ozerov created IGNITE-11630: Summary: Document changes to SQL views Key: IGNITE-11630 URL: https://issues.apache.org/jira/browse/IGNITE-11630 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Artem Budnikov Fix For: 2.8 The following changes were made to our views. {{CACHE_GROUPS}} # {{ID}} -> {{CACHE_GROUP_ID}} # {{GROUP_NAME}} -> {{CACHE_GROUP_NAME}} {{LOCAL_CACHE_GROUPS_IO}} # {{GROUP_ID}} -> {{CACHE_GROUP_ID}} # {{GROUP_NAME}} -> {{CACHE_GROUP_NAME}} {{CACHES}} # {{NAME}} -> {{CACHE_NAME}} # {{GROUP_ID}} -> {{CACHE_GROUP_ID}} # {{GROUP_NAME}} -> {{CACHE_GROUP_NAME}} {{INDEXES}} # {{GROUP_ID}} -> {{CACHE_GROUP_ID}} # {{GROUP_NAME}} -> {{CACHE_GROUP_NAME}} {{NODES}} # {{ID}} -> {{NODE_ID}} {{TABLES}} # Added {{CACHE_GROUP_ID}} # Added {{CACHE_GROUP_NAME}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11564) SQL: Implement KILL QUERY command
Vladimir Ozerov created IGNITE-11564: Summary: SQL: Implement KILL QUERY command Key: IGNITE-11564 URL: https://issues.apache.org/jira/browse/IGNITE-11564 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 This is an umbrella ticket for {{KILL QUERY}} command implementation. The original description can be found in IGNITE-10161.
Re: Peer review: Victory over Patch Available debt
Hi, This is a tough question, and first of all I'd like to ask participants to keep a cool head. This is a public question and can be discussed on the dev list safely. On the one hand, it is true that a number of patches are not reviewed for a long time, which negatively affects community development. On the other hand, we definitely do not want to sacrifice product quality only because e.g. the responsible component owner was on sick leave or vacation and was not able to review the patch in a timely manner. Some compromise is needed. IMO additional comments in HTC may solve the issue. We should stress that a patch should be committed if and only if the committer is confident in the changes. Confidence comes either from experience (you have worked with the component a lot and know what you are doing) or from a review by the component's expert. But if there is an outdated patch and you are not confident enough, just don't merge. Let it stay in Patch Available as long as needed. In case of lazy consensus we may ask committers to add comments to the ticket explaining why they decided to merge a ticket without an expert's review. This should help us avoid bad commits. Thoughts? On Mon, Mar 18, 2019 at 11:33 AM Anton Vinogradov wrote: > Dmitry, > > Phrase "Code modifications can be approved by silence: by lazy consensus > (72h) after Dev.List announcement." looks unacceptable to me. > > Please roll back the changes and start the discussion at the private list > and never do such updates in the future without the discussion. > > On Fri, Mar 15, 2019 at 8:29 PM Dmitriy Pavlov wrote: > > > Hi Igniters, > > > > sorry for the late reply. Because this process time to time causes > > questions, I decided to add a couple of words to our wiki.
> > > > I've added topics about peer review to HTC > > > > > https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute#HowtoContribute-PeerReviewandLGTM > > > > Actually, it is (more or less) rules of Apache Beam project, as well as > > Apache Training(incubating), as well as our current process + Apache > > policies. > > > > Sincerely, > > Dmitriy Pavlov > > > > > > чт, 16 авг. 2018 г. в 17:46, Yakov Zhdanov : > > > > > Dmitry, > > > > > > I like your suggestion very much! And I want everyone to follow. Let's > > see > > > if it helps. > > > > > > Can I ask everyone who has submitted tickets for review to add a > comment > > > described by Dmitry to each ticket submitted and see if any additional > > > check is still required and fix remaining issues? I believe this should > > > speed up review process very much. > > > > > > --Yakov > > > > > >
[jira] [Created] (IGNITE-11551) SQL: Document LOCAL_SQL_QUERY_HISTORY
Vladimir Ozerov created IGNITE-11551: Summary: SQL: Document LOCAL_SQL_QUERY_HISTORY Key: IGNITE-11551 URL: https://issues.apache.org/jira/browse/IGNITE-11551 Project: Ignite Issue Type: Task Reporter: Vladimir Ozerov Name: {{LOCAL_SQL_QUERY_HISTORY}} Fields: # {{SCHEMA_NAME}} - schema name # {{SQL}} - actual SQL being executed # {{LOCAL}} - whether the query was started with the "local=true" flag # {{EXECUTIONS}} - total number of executions # {{FAILURES}} - number of executions which failed # {{DURATION_MIN}} - minimum duration # {{DURATION_MAX}} - maximum duration # {{LAST_START_TIME}} - start time of the last executed query
[jira] [Created] (IGNITE-11518) SQL: Security checks are skipped on some SELECT paths
Vladimir Ozerov created IGNITE-11518: Summary: SQL: Security checks are skipped on some SELECT paths Key: IGNITE-11518 URL: https://issues.apache.org/jira/browse/IGNITE-11518 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 This is a regression introduced by IGNITE-11227. The security check should be moved from {{executeSelectLocal}} to {{executeSelect0}}.
[jira] [Created] (IGNITE-11517) MVCC: Support one-phase commit
Vladimir Ozerov created IGNITE-11517: Summary: MVCC: Support one-phase commit Key: IGNITE-11517 URL: https://issues.apache.org/jira/browse/IGNITE-11517 Project: Ignite Issue Type: Task Components: mvcc Reporter: Vladimir Ozerov One-phase commit is a critical performance optimization for single-key requests. Our profiling revealed that its absence is one of the key reasons why MVCC caches are much slower than non-MVCC caches. Let's add 1PC support to MVCC.
[jira] [Created] (IGNITE-11516) MVCC management and monitoring
Vladimir Ozerov created IGNITE-11516: Summary: MVCC management and monitoring Key: IGNITE-11516 URL: https://issues.apache.org/jira/browse/IGNITE-11516 Project: Ignite Issue Type: Task Components: mvcc Reporter: Vladimir Ozerov This is an umbrella ticket for MVCC management and monitoring capabilities. This should include (but is not limited to): # Proper cache metrics (standard cache operations, number of stale versions aka "bloat", etc.) # MVCC coordinator metrics (node ID, number of received requests, number of active transactions, current cleanup version, current version, etc.) # Cache events (either standard JCache or something else) # Deadlock detector metrics
[jira] [Created] (IGNITE-11515) MVCC: Make sure that multiple cursors are handled properly for JDBC/ODBC
Vladimir Ozerov created IGNITE-11515: Summary: MVCC: Make sure that multiple cursors are handled properly for JDBC/ODBC Key: IGNITE-11515 URL: https://issues.apache.org/jira/browse/IGNITE-11515 Project: Ignite Issue Type: Bug Components: jdbc, mvcc, odbc Reporter: Vladimir Ozerov Consider the following scenario executed from a JDBC/ODBC driver: 1) Open a transaction 2) Get a cursor for some large SELECT 3) Close the transaction 4) Overwrite some of the not-yet-returned values for the cursor 5) Force vacuum 6) Read the remaining values from the cursor Will we get a correct result? Most probably not, because we close the transaction on commit without consulting any open cursors. Possible solutions: 1) Extend the transaction lifetime until all cursors are closed 2) Close the cursors forcibly and throw a proper error message
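The second proposed solution (forcibly closing cursors and failing later reads with a proper error) can be sketched on the cursor side as follows. This is a toy model, not Ignite code; the class and method names are invented:

```java
import java.util.Iterator;
import java.util.List;

// Toy cursor that the owning transaction forcibly closes on commit/rollback,
// so a later read fails with a clear error instead of observing vacuumed rows.
class TxBoundCursor<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private boolean closed;

    TxBoundCursor(List<T> rows) {
        this.delegate = rows.iterator();
    }

    /** Hypothetical hook invoked by the transaction when it completes. */
    void forceClose() {
        closed = true;
    }

    @Override public boolean hasNext() {
        return !closed && delegate.hasNext();
    }

    @Override public T next() {
        if (closed)
            throw new IllegalStateException("Cursor is closed: the owning transaction has completed.");
        return delegate.next();
    }
}
```

The alternative (extending the transaction's lifetime until all cursors close) avoids the error entirely but ties vacuum progress to client cursor behavior.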
[jira] [Created] (IGNITE-11514) MVCC: Client listener: do not delegate implicit operation execution to separate thread for JDBC/ODBC
Vladimir Ozerov created IGNITE-11514: Summary: MVCC: Client listener: do not delegate implicit operation execution to separate thread for JDBC/ODBC Key: IGNITE-11514 URL: https://issues.apache.org/jira/browse/IGNITE-11514 Project: Ignite Issue Type: Task Components: jdbc, mvcc, odbc Reporter: Vladimir Ozerov If an implicit operation over MVCC cache(s) is executed from the JDBC/ODBC driver, we always delegate it to a separate thread. But there is no need to do this: once we understand that no active transaction will be left after execution, the query can be executed safely from the normal listener thread.
[jira] [Created] (IGNITE-11513) MVCC: make sure that unsupported features are documented properly
Vladimir Ozerov created IGNITE-11513: Summary: MVCC: make sure that unsupported features are documented properly Key: IGNITE-11513 URL: https://issues.apache.org/jira/browse/IGNITE-11513 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Fix For: 2.8
[jira] [Created] (IGNITE-11511) SQL: Possible bug with parameters passing for complex DML queries
Vladimir Ozerov created IGNITE-11511: Summary: SQL: Possible bug with parameters passing for complex DML queries Key: IGNITE-11511 URL: https://issues.apache.org/jira/browse/IGNITE-11511 Project: Ignite Issue Type: Bug Components: sql Reporter: Vladimir Ozerov Assignee: Pavel Kuznetsov Fix For: 2.8 See the methods {{IgniteH2Indexing.executeSelectLocal}} and {{IgniteH2Indexing.executeSelectForDml}}. They both can be invoked for {{SELECT}} statements extracted from DML. But notice how parameters are passed: it seems that we may pass the parameters from the DML statement unchanged, which is illegal. E.g. consider the following DML: {code} UPDATE table SET x=? WHERE x=? {code} In this case the SELECT statement should get only the second parameter. Need to create tests to confirm that this is the case and make fixes if necessary.
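The expected fix can be illustrated with a tiny helper. It assumes, purely for illustration, that SET-clause placeholders precede WHERE-clause placeholders in the statement's parameter order, so the SELECT extracted from the DML should receive only the tail of the parameter array:

```java
import java.util.Arrays;

// Hypothetical helper (not Ignite code): given the full DML parameter array
// and the number of parameters consumed by the SET clause, return only the
// parameters that belong to the extracted SELECT (i.e. the WHERE clause).
class DmlParams {
    static Object[] selectParams(Object[] dmlParams, int setClauseParamCnt) {
        return Arrays.copyOfRange(dmlParams, setClauseParamCnt, dmlParams.length);
    }
}
```

For `UPDATE table SET x=? WHERE x=?` invoked with parameters `[10, 5]`, the derived SELECT should see only `[5]`.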
[jira] [Created] (IGNITE-11510) SQL: Rework running queries tests to make them stable to internal code changes
Vladimir Ozerov created IGNITE-11510: Summary: SQL: Rework running queries tests to make them stable to internal code changes Key: IGNITE-11510 URL: https://issues.apache.org/jira/browse/IGNITE-11510 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov See {{RunningQueriesTest}}. It hacks into {{IgniteH2Indexing.querySqlFields}}. This is not resilient to internal code changes. We need to make sure that the whole test uses as few hacks as possible. E.g. we can hook into the running queries manager instead of indexing. Several DML tests are muted due to changes introduced in IGNITE-11227.
Re: Ignite 2.8 Release: Time & Scope & Release manager
Dmitry, “Master is always releasable” is a myth, let’s do not be naive. We develop complex product. Many features are being developed in iterations. Many features are developed by different contributors and have to be aligned with each other after merge. And in the end all this should be tested and benchmarked before becoming a product. None serious products are “releasable” from master in a classical “release” sense. Nightly builds are not releases. чт, 7 марта 2019 г. в 20:31, Dmitriy Pavlov : > Vova it is not cool I have to remind Ignite veterans about How to > contribute page, it says the master is release ready branch. > > Otherwise feature is developed in its own branch. > > So my vote goes for master-based release. > > чт, 7 мар. 2019 г. в 20:28, Vladimir Ozerov : > > > Igniters, > > > > Making release from master is not an option. We have a lot of > not-yet-ready > > and not-yet-tested features. From SQL side this is partition pruning and > > SQL views with KILL command. > > > > So if we do not want to release a mess, then there are only two options: > > release Java 11 fixes on top of 2.7, or make normal release in about > 1.5-2 > > month with proper feature freeze process and testing. > > > > Vladimir. > > > > чт, 7 марта 2019 г. в 20:10, Ilya Kasnacheev >: > > > > > Hello! > > > > > > Then please fast-forward review and merge > > > https://issues.apache.org/jira/browse/IGNITE-11299 because it breaks > SSL > > > on > > > Windows under Java 11. > > > > > > Anything else that needs to be merged before release is branched? > > > > > > Regards, > > > -- > > > Ilya Kasnacheev > > > > > > > > > чт, 7 мар. 2019 г. в 20:07, Nikolay Izhikov : > > > > > > > +1 > > > > > > > > чт, 7 марта 2019 г., 20:00 Denis Magda : > > > > > > > > > Igniters, > > > > > > > > > > How about releasing Ignite 2.8 from the master - creating the > release > > > > > branch on Monday-Tuesday, as fast as we can? 
Don't want us to delay > > > with > > > > > Java 11 improvements, they are really helpful from the usability > > > > > standpoint. > > > > > > > > > > After this release, let's introduce a practice of maintenance > > releases > > > > > 2.8.x. Those who are working on any improvements and won't merge > them > > > to > > > > > the release branch on Monday-Tuesday will be able to roll out in a > > > point > > > > > release like 2.8.1 slightly later. > > > > > > > > > > - > > > > > Denis > > > > > > > > > > > > > > > On Thu, Mar 7, 2019 at 6:22 AM Dmitriy Pavlov > > > > wrote: > > > > > > > > > > > Hi Ignite Developers, > > > > > > > > > > > > In the separate topic, we've touched the question of next release > > of > > > > > Apache > > > > > > Ignite. > > > > > > > > > > > > The main reason for the release is Java 11 support, modularity > > > changes > > > > > > (actually we have a couple of this kind of fixes). Unfortunately, > > > full > > > > > > modularity support is impossible without 3.0 because package > > > > refactoring > > > > > is > > > > > > breaking change in some cases. > > > > > > > > > > > > But I clearly remember that in 2.7 thread we've also discussed > that > > > the > > > > > > next release will contain step 1 of services redesign, - > discovery > > > > > protocol > > > > > > usage for services redeploy. > > > > > > > > > > > > We have 2 alternative options for releasing 2.8; > > > > > > > > > > > > A. (in a small way): 2.7-based branch with particular commits > > > > > cherry-picked > > > > > > into it. It is analog of emergency release but without really > > > > emergency. > > > > > > Since we don't release our new modules we have more time to make > it > > > > > modular > > > > > > for 2.9 and make Ignite fully modules compliant in 3.0 > > > > > > > > > > > > B. 
(in large) And, it is a full release based on master, it will > > > > include > > > > > > new hibernate version, ignite-compress, ignite-services, and all > > > other > > > > > > changes we have. Once it is published we will not be able to > change > > > > > > something. > > > > > > > > > > > > Please share your vision, and please stand up if you want to lead > > > this > > > > > > release (as release manager). > > > > > > > > > > > > Sincerely, > > > > > > Dmitriy Pavlov > > > > > > > > > > > > > > > > > > > > >
Re: Ignite 2.8 Release: Time & Scope & Release manager
Igniters, Making release from master is not an option. We have a lot of not-yet-ready and not-yet-tested features. From SQL side this is partition pruning and SQL views with KILL command. So if we do not want to release a mess, then there are only two options: release Java 11 fixes on top of 2.7, or make normal release in about 1.5-2 month with proper feature freeze process and testing. Vladimir. чт, 7 марта 2019 г. в 20:10, Ilya Kasnacheev : > Hello! > > Then please fast-forward review and merge > https://issues.apache.org/jira/browse/IGNITE-11299 because it breaks SSL > on > Windows under Java 11. > > Anything else that needs to be merged before release is branched? > > Regards, > -- > Ilya Kasnacheev > > > чт, 7 мар. 2019 г. в 20:07, Nikolay Izhikov : > > > +1 > > > > чт, 7 марта 2019 г., 20:00 Denis Magda : > > > > > Igniters, > > > > > > How about releasing Ignite 2.8 from the master - creating the release > > > branch on Monday-Tuesday, as fast as we can? Don't want us to delay > with > > > Java 11 improvements, they are really helpful from the usability > > > standpoint. > > > > > > After this release, let's introduce a practice of maintenance releases > > > 2.8.x. Those who are working on any improvements and won't merge them > to > > > the release branch on Monday-Tuesday will be able to roll out in a > point > > > release like 2.8.1 slightly later. > > > > > > - > > > Denis > > > > > > > > > On Thu, Mar 7, 2019 at 6:22 AM Dmitriy Pavlov > > wrote: > > > > > > > Hi Ignite Developers, > > > > > > > > In the separate topic, we've touched the question of next release of > > > Apache > > > > Ignite. > > > > > > > > The main reason for the release is Java 11 support, modularity > changes > > > > (actually we have a couple of this kind of fixes). Unfortunately, > full > > > > modularity support is impossible without 3.0 because package > > refactoring > > > is > > > > breaking change in some cases. 
> > > > > > > > But I clearly remember that in 2.7 thread we've also discussed that > the > > > > next release will contain step 1 of services redesign, - discovery > > > protocol > > > > usage for services redeploy. > > > > > > > > We have 2 alternative options for releasing 2.8; > > > > > > > > A. (in a small way): 2.7-based branch with particular commits > > > cherry-picked > > > > into it. It is analog of emergency release but without really > > emergency. > > > > Since we don't release our new modules we have more time to make it > > > modular > > > > for 2.9 and make Ignite fully modules compliant in 3.0 > > > > > > > > B. (in large) And, it is a full release based on master, it will > > include > > > > new hibernate version, ignite-compress, ignite-services, and all > other > > > > changes we have. Once it is published we will not be able to change > > > > something. > > > > > > > > Please share your vision, and please stand up if you want to lead > this > > > > release (as release manager). > > > > > > > > Sincerely, > > > > Dmitriy Pavlov > > > > > > > > > >
[jira] [Created] (IGNITE-11499) SQL: DML should not use batches by default
Vladimir Ozerov created IGNITE-11499: Summary: SQL: DML should not use batches by default Key: IGNITE-11499 URL: https://issues.apache.org/jira/browse/IGNITE-11499 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 Currently DML applies updates in batches of size equal to {{SqlFieldsQuery.pageSize}}. This is prone to deadlocks. Instead, we should apply updates one-by-one by default. Proposal: # Introduce a {{SqlFieldsQuery.updateBatchSize}} property, set it to {{1}} by default # Print a warning about possible deadlocks to the log if it is greater than 1 # Add it to the JDBC and ODBC drivers
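The proposed behavior can be sketched as follows; the batching helper and its name are illustrative, not Ignite API, with `updateBatchSize` semantics assumed from the ticket text:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposal: split fetched DML rows into update batches of
// updateBatchSize entries, with a default of 1 (row-by-row) to avoid
// deadlocks with concurrent cache operations.
class UpdateBatcher {
    static <T> List<List<T>> batches(List<T> rows, int updateBatchSize) {
        if (updateBatchSize < 1)
            throw new IllegalArgumentException("updateBatchSize must be >= 1");
        List<List<T>> res = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += updateBatchSize)
            res.add(rows.subList(i, Math.min(i + updateBatchSize, rows.size())));
        return res;
    }
}
```

With the default of 1, each row is applied independently, so no batch can hold locks on multiple entries at once.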
[jira] [Created] (IGNITE-11498) SQL: Rework DML data distribution logic
Vladimir Ozerov created IGNITE-11498: Summary: SQL: Rework DML data distribution logic Key: IGNITE-11498 URL: https://issues.apache.org/jira/browse/IGNITE-11498 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 The current DML implementation has a number of problems: 1) We fetch the whole data set to the originator's node. There is a "skipDmlOnReducer" flag to avoid this in some cases, but it is still in an experimental state and is not enabled by default 2) Updates are deadlock-prone: we update entries in batches of size equal to {{SqlFieldsQuery.pageSize}}, so we can easily deadlock with concurrent cache operations 3) We have very strange retry logic. It is not clear why it is needed in the first place, provided that DML is not transactional and no guarantees are needed. Proposal: # Implement proper routing logic: if a request can be executed on data nodes, bypassing the reducer, do this. Otherwise fetch all data to the reducer. This decision should be made in exactly the same way as for MVCC (see {{GridNearTxQueryEnlistFuture}} as a starting point) # Distribute updates to primary data nodes in batches, but apply them one by one, similar to the data streamer with {{allowOverwrite=false}}. Do not do any partition state or {{AffinityTopologyVersion}} checks, since DML is not transactional. Return and aggregate update counts back. # Remove or at least rethink the retry logic. Why do we need it in the first place?
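The routing step of the proposal boils down to grouping pending updates by the primary node that owns each key. A toy sketch with a stand-in affinity function — Ignite's real affinity (rendezvous hashing, backups, topology versions) is much more involved, and every name here is illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of routing updates to the data nodes that own them: group
// keys by the primary node of each key's partition, so batches can be sent to
// data nodes instead of being applied on the reducer. The affinity function
// (partition = key mod partitions, node = partition mod nodes) is a stand-in.
class UpdateRouter {
    static Map<Integer, List<Integer>> groupByPrimaryNode(List<Integer> keys, int partitions, int nodes) {
        Map<Integer, List<Integer>> byNode = new HashMap<>();
        for (int key : keys) {
            int part = Math.floorMod(key, partitions); // hypothetical affinity
            int node = part % nodes;                   // hypothetical primary mapping
            byNode.computeIfAbsent(node, n -> new ArrayList<>()).add(key);
        }
        return byNode;
    }
}
```

Each resulting per-node batch would then be applied one entry at a time on its primary node, with the update counts aggregated back on the originator.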
Re: Batch updates in Ignite B+ tree.
Hi Pavel, As far as I know batch tree updates are already being developed. Alex, could you please elaborate? On Tue, Mar 5, 2019 at 5:05 PM Pavel Pereslegin wrote: > Hi Igniters! > > I am working on implementing batch updates in PageMemory [1] to > improve the performance of preloader, datastreamer and putAll. > > This task consists of two major related improvements: > 1. Batch writing to PageMemory via FreeList - store several values at > once to single memory page. > 2. Batch updates in BPlusTree (for introducing invokeAll operation). > > I started to investigate the issue with batch updates in B+ tree, and > it seems that the concurrent top-down balancing algorithm (TD) > described in this paper [2] may be suitable for batch insertion of > keys into Ignite B+ Tree. > This algorithm uses a top-down balancing approach and allows to insert > a batch of keys belonging to the leaves having the same parent. The > negative point of top-down balancing approach is that the parent node > is locked when performing insertion/splitting in child nodes. > > WDYT? Do you know other approaches for implementing batch updates in > Ignite B+ Tree? > > [1] https://issues.apache.org/jira/browse/IGNITE-7935 > [2] > https://aaltodoc.aalto.fi/bitstream/handle/123456789/2168/isbn9512258951.pdf >
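The preparatory step behind batch insertion — sorting the incoming keys and grouping consecutive ones by target leaf, so each leaf (and its shared parent) is visited once per batch rather than once per key — can be illustrated with a toy grouping function. Real B+-tree traversal, latching, and splits are not modeled here:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration of grouping a batch of keys by destination leaf. A leaf is
// identified by its position in an array of sorted, exclusive upper bounds of
// each leaf's key range; the last leaf absorbs everything beyond the bounds.
class BatchGrouper {
    /** leafUpperBounds: sorted exclusive upper bound of each leaf's key range. */
    static Map<Integer, List<Integer>> groupByLeaf(int[] leafUpperBounds, List<Integer> keys) {
        List<Integer> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        Map<Integer, List<Integer>> byLeaf = new TreeMap<>();
        for (int key : sorted) {
            int leaf = 0;
            while (leaf < leafUpperBounds.length - 1 && key >= leafUpperBounds[leaf])
                leaf++;
            byLeaf.computeIfAbsent(leaf, l -> new ArrayList<>()).add(key);
        }
        return byLeaf;
    }
}
```

In the top-down scheme from the paper, each such per-leaf group would be inserted under a single latch on the shared parent, which is exactly the trade-off Pavel notes: fewer traversals at the cost of holding the parent locked during child splits.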
Re: Storing short/empty strings in Ignite
Hi Val, I would say that we do not need the string length at all, because it can be derived from the object footer (next field offset MINUS current field offset). It is not a very good idea to implement the proposed change in Apache Ignite 2.x because it is a breaking change and will add unnecessary complexity to the already very complex binary infrastructure. Instead, it is better to review the binary format in 3.0 and remove lengths not only from Strings, but from other variable-length data types as well (arrays, decimals). On Tue, Mar 5, 2019 at 10:12 AM Valentin Kulichenko < valentin.kuliche...@gmail.com> wrote: > Hey folks, > > While working with Ignite users, I keep seeing data models where a single > object (row) might contain many fields (100, 200, more...), and most of > them are strings. > > Correct me if I'm wrong, but per my understanding, for every such field we > store an integer value to represent its length. This is significant > overhead - with 200 fields we spend 800 bytes only for this. > > Now here is the catch: vast majority of those strings are actually empty or > very short (several chars), therefore we don't really need 4 bytes to their > length. > > My suggestions is to introduce another data type, e.g. STRING_SHORT, use it > for all strings that are 255 chars or less, and therefore use a single byte > to encode length. We can go even further, and also introduce STRING_EMPTY, > which obviously doesn't need any length information at all. > > What do you guys think? > > -Val >
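A back-of-the-envelope sketch of Val's proposal; the type codes and layout are invented for illustration and do not reflect Ignite's actual binary format:

```java
import java.nio.charset.StandardCharsets;

// Sketch of the proposed encoding: a dedicated type code for empty strings
// (no length stored at all), a single length byte for strings of up to 255
// encoded bytes, and the regular 4-byte length otherwise.
class ShortStringCodec {
    static final byte TYPE_STRING_EMPTY = 1; // type byte only
    static final byte TYPE_STRING_SHORT = 2; // type + 1-byte length + data
    static final byte TYPE_STRING       = 3; // type + 4-byte length + data

    /** Total serialized size of a string under the sketched scheme. */
    static int encodedSize(String s) {
        int utf8Len = s.getBytes(StandardCharsets.UTF_8).length;
        if (utf8Len == 0)
            return 1;
        if (utf8Len <= 255)
            return 1 + 1 + utf8Len;
        return 1 + 4 + utf8Len;
    }
}
```

With 200 mostly-empty string fields this drops the per-field length overhead from 4 bytes to 0-1 bytes, which is the saving estimated in the thread; Vladimir's counter-proposal would go further and drop the stored length entirely by deriving it from adjacent field offsets in the object footer.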
Re: Please re-commit 3 last changes in the master
Looks like everything is good now - all three commits were returned. On Mon, Mar 4, 2019 at 2:04 PM Dmitriy Pavlov wrote: > Thanks, Ivan, these commits are in sync in GitHub & GitBox. Only one commit > remained, Vladimir O., please chime in > > пн, 4 мар. 2019 г. в 14:03, Ivan Rakov : > > > Thanks for keeping track of it, I've re-applied the following commits: > > > > IGNITE-11199 Add extra logging for client-server connections in TCP > > discovery - Fixes #6048. Andrey Kalinin* 04.03.2019 2:11 > > IGNITE-11322 [USABILITY] Extend Node FAILED message by add consistentId > > if it exist - Fixes #6180. Andrey Kalinin* 04.03.2019 2:03 > > > > Best Regards, > > Ivan Rakov > > > > On 04.03.2019 13:56, Dmitriy Pavlov wrote: > > > Thanks to Alexey Plehanov for noticing and Infra Team for fixing the > > issue: > > > https://issues.apache.org/jira/browse/INFRA-17950 > > > > > > пн, 4 мар. 2019 г. в 13:53, Dmitriy Pavlov : > > > > > >> Hi Developers, > > >> > > >> Because of the sync issue, the following 3 commits were lost. > > >> > > >> Please re-apply it to the master. > > >> > > >> > > > https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=b26bbb29d5fdd9d4de5187042778ebe3b8c6c42e > > >> > > >> > > >> > > > https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=6c562a997c0beb3a3cd9dd2976e016759a808f0c > > >> > > >> > > >> > > > https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=45c4dc98e0eac33cccd2e24acb3e9882f098cad1 > > >> > > >> > > >> Sorry for the inconvenience. > > >> > > >> Sincerely, > > >> Dmitriy Pavlov > > >> > > >
Re: SQL: INSERT with hidden columns _key, _val and check the type of input objects
I do not think this should be deferred, even though it changes default behavior. Clean and simple semantics is much more important. In this regards DML was created incorrectly in the first place. We will fix it, leaving hidden fallback mode for those users who use this strange semantics. ср, 27 февр. 2019 г. в 12:57, Ilya Kasnacheev : > Hello! > > > UPDATE table SET _VAL=? WHERE ... // Disallow > > Breaking change and as such should be deferred to 3.0. > > All of our tables have types, so we can disallow doing _VAL=? where > parameter object is not of table's type, and semantics break down here - > you INSERT object in cache, get "1" rows updated but can't select this row > from table. > But we probably should not disallow _VAL=? where parameter object IS of > table's type, since there may be users whose workflow depends on that and > it isn't fixable easily. > > For example, they can have objects of which only subset of fields is > indexed, the rest is not. Then they are inserting them via SQL as shown. > > Regards, > -- > Ilya Kasnacheev > > > ср, 27 февр. 2019 г. в 12:10, Vladimir Ozerov : > > > Hi Taras, > > > > As far as your original question :-) I would say that user should have > only > > one way to update data with DML - through plain attributes. That is, if > we > > have a composite value with attributes "a" and "b", then we should: > > UPDATE table SET a=?, b=? WHERE ... // Allow > > UPDATE table SET _VAL=? WHERE ... // Disallow > > > > But if the value is an attribute itself (e.g. in case of primitive), then > > DML should be allowed on it for sure: > > UPDATE table SET _VAL=? WHERE ... // Allow > > > > What do you think? 
> > > > On Sat, Feb 23, 2019 at 6:50 PM Denis Magda wrote: > > > > > Vladimir, > > > > > > Ok, agreed, let's not boil the ocean...at least for now ;) > > > > > > -- > > > Denis Magda > > > > > > > > > On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov > > > > wrote: > > > > > > > Denis, > > > > > > > > Yes, this is what my answer was about - you cannot have SQL without > > > > defining fields in advance. Because it breaks a lot of standard SQL > > > > invariants and virtually makes the whole language unusable. For > > instance, > > > > think of product behavior in the following cases: > > > > 1) User queries an empty cache with a query "SELECT a FROM table" - > > what > > > > should happen - exception or empty result? How would I know whether > > field > > > > "a" will appear in future? > > > > 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I > > > > understand whether it is possible or not to add a column without > strict > > > > schema? > > > > 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will > > add > > > an > > > > object with field "c" after that? > > > > 4) User connects to Ignite from Tableau and navigates through schema > - > > > what > > > > should be shown? > > > > > > > > That is, you cannot have SQL without schema because it is at the very > > > heart > > > > of the technology. But you can have schema-less noSQL database. > > > > > > > > Let's do not invent a hybrid with tons of corner cases and separate > > > > learning curve. It should be enough just to rethink and simplify our > > > > configuration - reshape QueryEntity, deprecate all SQL annotations, > > allow > > > > only one table per cache, allow to define SQL script to be executed > on > > > > cache start or so. > > > > > > > > As far as schemaless - it is viable approach for sure, but should be > > > > considered either outside of SQL (e.g. 
a kind of predicate/criteria > API > > > > which can be merged with ScanQuery) or as a special datatype in SQL > > > > ecosystem (like is is done with JSON in many RDBMS databases). > > > > > > > > Vladimir. > > > > > > > > > > > > > > > > > > > > On Fri, Feb 22, 2019 at 11:01 PM Denis Magda > > wrote: > > > > > > > > > Vladimir, > > > > > > > > > > That's understood. I'm just thinking of a use case different from > the > > > DDL >
Re: SQL: INSERT with hidden columns _key, _val and check the type of input objects
Hi Taras, As far as your original question :-) I would say that user should have only one way to update data with DML - through plain attributes. That is, if we have a composite value with attributes "a" and "b", then we should: UPDATE table SET a=?, b=? WHERE ... // Allow UPDATE table SET _VAL=? WHERE ... // Disallow But if the value is an attribute itself (e.g. in case of primitive), then DML should be allowed on it for sure: UPDATE table SET _VAL=? WHERE ... // Allow What do you think? On Sat, Feb 23, 2019 at 6:50 PM Denis Magda wrote: > Vladimir, > > Ok, agreed, let's not boil the ocean...at least for now ;) > > -- > Denis Magda > > > On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov > wrote: > > > Denis, > > > > Yes, this is what my answer was about - you cannot have SQL without > > defining fields in advance. Because it breaks a lot of standard SQL > > invariants and virtually makes the whole language unusable. For instance, > > think of product behavior in the following cases: > > 1) User queries an empty cache with a query "SELECT a FROM table" - what > > should happen - exception or empty result? How would I know whether field > > "a" will appear in future? > > 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I > > understand whether it is possible or not to add a column without strict > > schema? > > 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will add > an > > object with field "c" after that? > > 4) User connects to Ignite from Tableau and navigates through schema - > what > > should be shown? > > > > That is, you cannot have SQL without schema because it is at the very > heart > > of the technology. But you can have schema-less noSQL database. > > > > Let's do not invent a hybrid with tons of corner cases and separate > > learning curve. 
It should be enough just to rethink and simplify our > > configuration - reshape QueryEntity, deprecate all SQL annotations, allow > > only one table per cache, allow to define SQL script to be executed on > > cache start or so. > > > > As far as schemaless - it is viable approach for sure, but should be > > considered either outside of SQL (e.g. a kind of predicate/criteria API > > which can be merged with ScanQuery) or as a special datatype in SQL > > ecosystem (like is is done with JSON in many RDBMS databases). > > > > Vladimir. > > > > > > > > > > On Fri, Feb 22, 2019 at 11:01 PM Denis Magda wrote: > > > > > Vladimir, > > > > > > That's understood. I'm just thinking of a use case different from the > DDL > > > approach where the schema is defined initially. Let's say that someone > > > configured caches with CacheConfiguration and now puts an Object in the > > > cache. For that person, it would be helpful to skip the Annotations or > > > QueryEntities approaches for queryable fields definitions (not even > > > indexes). For instance, the person might simply query some fields with > > the > > > primary index in the WHERE clause and this shouldn't require any extra > > > settings. Yes, it's clear that it might be extremely challenging to > > support > > > but imagine how usable the API could become if we can get rid of > > > Annotations and QueryEntities. > > > > > > Basically, my idea is that all of the objects and their fields stored > in > > > the caches should be visible to SQL w/o extra settings. If someone > wants > > to > > > create indexes then use DDL which was designed for this. > > > > > > > > > - > > > Denis > > > > > > > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov > > > wrote: > > > > > > > Denis, > > > > > > > > SQL is a language with strict schema what was one of significant > > factors > > > of > > > > it's worldwide success. 
I doubt we will ever have SQL without > > > > configuration/definiton, because otherwise it will be not SQL, but > > > > something else (e.g. document-oriented, JSON, whatever). > > > > > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda > wrote: > > > > > > > > > Folks, > > > > > > > > > > Do we want to preserve the annotation-based configuration? There > are > > > too > > > > > many ways to configure SQL indexes/fields. > > > > >
[jira] [Created] (IGNITE-11422) Remove H2 console from documentation
Vladimir Ozerov created IGNITE-11422: Summary: Remove H2 console from documentation Key: IGNITE-11422 URL: https://issues.apache.org/jira/browse/IGNITE-11422 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Assignee: Artem Budnikov H2 console was deprecated as a part of IGNITE-11333. Need to remove all mentions of "H2 console" from documentation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11418) Document SQL IGNITE.INDEXES view
Vladimir Ozerov created IGNITE-11418: Summary: Document SQL IGNITE.INDEXES view Key: IGNITE-11418 URL: https://issues.apache.org/jira/browse/IGNITE-11418 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Assignee: Artem Budnikov A new {{IGNITE.INDEXES}} view was added, which displays indexes in dedicated columns.
[jira] [Created] (IGNITE-11404) Document CREATE TABLE "parallelism" option
Vladimir Ozerov created IGNITE-11404: Summary: Document CREATE TABLE "parallelism" option Key: IGNITE-11404 URL: https://issues.apache.org/jira/browse/IGNITE-11404 Project: Ignite Issue Type: Task Components: documentation, sql Reporter: Vladimir Ozerov Assignee: Artem Budnikov Fix For: 2.8 We added a new {{PARALLELISM}} option: {code} CREATE TABLE ... WITH "parallelism = 4" {code} This option affects query parallelism, which is otherwise taken from {{CacheConfiguration.queryParallelism}}.
[jira] [Created] (IGNITE-11402) SQL: Add ability to specify inline size of PK and affinity key indexes from CREATE TABLE and QueryEntity
Vladimir Ozerov created IGNITE-11402: Summary: SQL: Add ability to specify inline size of PK and affinity key indexes from CREATE TABLE and QueryEntity Key: IGNITE-11402 URL: https://issues.apache.org/jira/browse/IGNITE-11402 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 Currently it is not possible to easily set the inline size for automatically created indexes. We need to make sure that the user has a convenient way to set it both programmatically and from DDL.
Re: SQL: INSERT with hidden columns _key, _val and check the type of input objects
Denis, Yes, this is what my answer was about - you cannot have SQL without defining fields in advance, because it breaks a lot of standard SQL invariants and virtually makes the whole language unusable. For instance, think of product behavior in the following cases: 1) A user queries an empty cache with a query "SELECT a FROM table" - what should happen - exception or empty result? How would I know whether field "a" will appear in the future? 2) A user executes a command "ALTER TABLE ... ADD COLUMN b" - how can I understand whether it is possible or not to add a column without a strict schema? 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if a user adds an object with field "c" after that? 4) A user connects to Ignite from Tableau and navigates through the schema - what should be shown? That is, you cannot have SQL without a schema because it is at the very heart of the technology. But you can have a schema-less NoSQL database. Let's not invent a hybrid with tons of corner cases and a separate learning curve. It should be enough just to rethink and simplify our configuration - reshape QueryEntity, deprecate all SQL annotations, allow only one table per cache, allow defining an SQL script to be executed on cache start, and so on. As for schemaless - it is a viable approach for sure, but it should be considered either outside of SQL (e.g. a kind of predicate/criteria API which can be merged with ScanQuery) or as a special datatype in the SQL ecosystem (like it is done with JSON in many RDBMS databases). Vladimir. On Fri, Feb 22, 2019 at 11:01 PM Denis Magda wrote: > Vladimir, > > That's understood. I'm just thinking of a use case different from the DDL > approach where the schema is defined initially. Let's say that someone > configured caches with CacheConfiguration and now puts an Object in the > cache. For that person, it would be helpful to skip the Annotations or > QueryEntities approaches for queryable fields definitions (not even > indexes). 
For instance, the person might simply query some fields with the > primary index in the WHERE clause and this shouldn't require any extra > settings. Yes, it's clear that it might be extremely challenging to support > but imagine how usable the API could become if we can get rid of > Annotations and QueryEntities. > > Basically, my idea is that all of the objects and their fields stored in > the caches should be visible to SQL w/o extra settings. If someone wants to > create indexes then use DDL which was designed for this. > > > - > Denis > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov > wrote: > > > Denis, > > > > SQL is a language with strict schema what was one of significant factors > of > > it's worldwide success. I doubt we will ever have SQL without > > configuration/definiton, because otherwise it will be not SQL, but > > something else (e.g. document-oriented, JSON, whatever). > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda wrote: > > > > > Folks, > > > > > > Do we want to preserve the annotation-based configuration? There are > too > > > many ways to configure SQL indexes/fields. > > > > > > For instance, if our new SQL API could see and access all of the fields > > > out-of-the-box (without any extra settings) and DDL will be used to > > define > > > indexed fields then that would be a huge usability improvement. > > > > > > - > > > Denis > > > > > > > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov > > wrote: > > > > > > > Hi, > > > > > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > > > > > Ignite doesn't check a type of input objects when hidden columns > _key, > > > > _value is used in a DML statements. > > > > I describe the current behavior for example: > > > > > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > > > Person.class))' > > > > 2. PersonKey type contains 'int id' field. > > > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > > > > > Cases: > > > > 1. 
Invalid value object type: > > > > - Any value object may be passed as a query parameter > > > > - Query is executed without an error and returns '1' (one row > updated); > > > > - There is not inserted row at the 'SELECT * FROM test' results. > > > > - cache.get(key) returns inserted object; > > > > > > > > 2. Invalid key object type: >
Re: SQL: INSERT with hidden columns _key, _val and check the type of input objects
Denis, SQL is a language with a strict schema, which was one of the significant factors of its worldwide success. I doubt we will ever have SQL without configuration/definition, because otherwise it would not be SQL, but something else (e.g. document-oriented, JSON, whatever). On Fri, Feb 22, 2019 at 1:52 AM Denis Magda wrote: > Folks, > > Do we want to preserve the annotation-based configuration? There are too > many ways to configure SQL indexes/fields. > > For instance, if our new SQL API could see and access all of the fields > out-of-the-box (without any extra settings) and DDL will be used to define > indexed fields then that would be a huge usability improvement. > > - > Denis > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov wrote: > > > Hi, > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > Ignite doesn't check a type of input objects when hidden columns _key, > > _value is used in a DML statements. > > I describe the current behavior for example: > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > Person.class))' > > 2. PersonKey type contains 'int id' field. > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > Cases: > > 1. Invalid value object type: > > - Any value object may be passed as a query parameter > > - Query is executed without an error and returns '1' (one row updated); > > - There is not inserted row at the 'SELECT * FROM test' results. > > - cache.get(key) returns inserted object; > > > > 2. Invalid key object type: > > 2.1 Non-primitive object is passed and binary representation doesn't > > contain 'id' field. > > - Query is executed without error and returns '1' (one row updated); > > - The inserted row is available by 'SELECT *' and the row contains id = > > null; > > 2.2 Non-primitive object is passed and binary representation contains > > 'id' field. 
> > - The inserted row is available by 'SELECT *' and the row contains > > expected 'id' field; > > - The cache entry cannot be gathered by 'cache.get' operation with the > > corresponding 'PersonKey(id)' (keys differ). > > > > I propose to check the type of the user's input object. > > > > I guess that using _key/_val columns works close to 'cache.put()' but it > > looks like a significant usability issue. > > To confuse the 'PersonKey.class.getName()' and > > 'node.binary().builder("PersonKey")' is a typical mistake of Ignite > > newcomers. > > > > One more argument for the check: SQL INSERT semantics mean the row is > > inserted into the specified TABLE, not into the cache. > > So, throwing IgniteSQLException is the expected behavior in this case, I think. > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > -- > > Taras Ledkov > > Mail-To: tled...@gridgain.com > > > >
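The strict type check proposed in this thread can be sketched in a few lines. The following Python model is purely illustrative — `Table`, `insert_via_hidden_columns` and the class names are hypothetical stand-ins, not Ignite API: a table knows its configured key/value types, and a DML statement through the hidden `_key`/`_val` columns is rejected, as proposed, when the supplied object's type does not match.

```python
class IgniteSQLException(Exception):
    """Models the error proposed for type-mismatched _key/_val DML."""

class Table:
    def __init__(self, name, key_type, val_type):
        self.name = name
        self.key_type = key_type  # e.g. "PersonKey"
        self.val_type = val_type  # e.g. "Person"

    def insert_via_hidden_columns(self, key, val):
        # Proposed behavior: validate the runtime types of the user's
        # objects against the table's configured key/value types
        # instead of silently performing a cache.put()-like insert.
        if type(key).__name__ != self.key_type:
            raise IgniteSQLException(
                f"Key type mismatch: expected {self.key_type}, "
                f"got {type(key).__name__}")
        if type(val).__name__ != self.val_type:
            raise IgniteSQLException(
                f"Value type mismatch: expected {self.val_type}, "
                f"got {type(val).__name__}")
        return 1  # one row updated

class PersonKey:
    def __init__(self, id):
        self.id = id

class Person:
    def __init__(self, name):
        self.name = name

table = Table("test", "PersonKey", "Person")
assert table.insert_via_hidden_columns(PersonKey(1), Person("a")) == 1
try:
    # Case 1 from the thread: a wrong value object type now fails fast
    # instead of producing an invisible row.
    table.insert_via_hidden_columns(PersonKey(2), "not-a-person")
except IgniteSQLException as e:
    print("rejected:", e)
```

With the check in place, both problem cases from the thread (wrong value type, wrong key type) fail at statement execution rather than leaving a row that `SELECT` cannot see or `cache.get` cannot reach.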
[jira] [Created] (IGNITE-11341) SQL: Enable lazy mode by default
Vladimir Ozerov created IGNITE-11341: Summary: SQL: Enable lazy mode by default Key: IGNITE-11341 URL: https://issues.apache.org/jira/browse/IGNITE-11341 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Taras Ledkov We redesigned lazy mode, so that now it doesn't spawn a new thread and has the same performance as the old "eager" mode (IGNITE-9171). However, we didn't enable it by default because H2 1.4.197 contains several bugs causing query engine slowdown in some cases when lazy mode is set. These issues are resolved in H2 master and will become available as a part of the next release (presumably 1.4.198). We need to enable lazy mode by default once the new version is available (IGNITE-10801).
[jira] [Created] (IGNITE-11340) SQL: Add OOME tests to separate suite
Vladimir Ozerov created IGNITE-11340: Summary: SQL: Add OOME tests to separate suite Key: IGNITE-11340 URL: https://issues.apache.org/jira/browse/IGNITE-11340 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Taras Ledkov Fix For: 2.8 {{IgniteQueryOOMTestSuite}} was added as a part of IGNITE-9171. We need to add this suite to TC and make sure it is executed on a regular basis.
[jira] [Created] (IGNITE-11334) SQL: Deprecate SqlQuery
Vladimir Ozerov created IGNITE-11334: Summary: SQL: Deprecate SqlQuery Key: IGNITE-11334 URL: https://issues.apache.org/jira/browse/IGNITE-11334 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Taras Ledkov This API is very limited compared to {{SqlFieldsQuery}}. Let's deprecate it with proper links to {{SqlFieldsQuery}}. This should include not only deprecation in the public API, but removal from examples as well. A separate ticket for documentation is needed.
[jira] [Created] (IGNITE-11333) SQL: Deprecate H2 console
Vladimir Ozerov created IGNITE-11333: Summary: SQL: Deprecate H2 console Key: IGNITE-11333 URL: https://issues.apache.org/jira/browse/IGNITE-11333 Project: Ignite Issue Type: Task Reporter: Vladimir Ozerov Assignee: Taras Ledkov This functionality is not tested, not supported, and may fail with unexpected errors. This affects user experience. We need to disable it and create a ticket for the relevant documentation update.
[jira] [Created] (IGNITE-11331) SQL: Remove unnecessary parameters binding
Vladimir Ozerov created IGNITE-11331: Summary: SQL: Remove unnecessary parameters binding Key: IGNITE-11331 URL: https://issues.apache.org/jira/browse/IGNITE-11331 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 See usages of {{H2Utils#bindParameters}}. Note that it is used both in SELECT and DML planners without any reason. Let's remove it from there.
[jira] [Created] (IGNITE-11326) SQL: Common parsing logic
Vladimir Ozerov created IGNITE-11326: Summary: SQL: Common parsing logic Key: IGNITE-11326 URL: https://issues.apache.org/jira/browse/IGNITE-11326 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov
[jira] [Created] (IGNITE-11325) SQL: Single place to start missing caches (H2Utils.checkAndStartNotStartedCache)
Vladimir Ozerov created IGNITE-11325: Summary: SQL: Single place to start missing caches (H2Utils.checkAndStartNotStartedCache) Key: IGNITE-11325 URL: https://issues.apache.org/jira/browse/IGNITE-11325 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 We need to start missing caches for the given SELECT/DML statement because we need affinity info during query planning which is only available for started caches. We need to do the following: # Move the method {{H2Utils.checkAndStartNotStartedCache}} to some common place, e.g. parser, so that it has one and only one usage all over the code base # Make sure that this method doesn't produce multiple network hops: missing caches should be started in a single request if possible.
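The second requirement of the ticket — start all missing caches with a single request instead of one network hop per cache — can be sketched as follows. This is a minimal Python illustration with hypothetical names (`start_missing_caches`, `send_start_request`), not actual Ignite internals.

```python
def start_missing_caches(required, started, send_start_request):
    """Collect the caches the statement needs but that are not yet
    started, and start them in one batched request (the single-hop
    behavior the ticket asks for).

    required -- iterable of cache names the query touches
    started -- mutable set of already-started cache names
    send_start_request -- callback taking the whole batch of names
    """
    missing = [c for c in required if c not in started]
    if missing:
        send_start_request(missing)  # one hop for the whole batch
        started.update(missing)
    return missing

requests = []
started = {"cacheA"}
missing = start_missing_caches(["cacheA", "cacheB", "cacheC"],
                               started, requests.append)
assert missing == ["cacheB", "cacheC"]
assert len(requests) == 1  # a single network request, not two
```

The point of the sketch is the batching: however many caches the statement references, at most one start request goes out, and a statement whose caches are all started sends none.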
[jira] [Created] (IGNITE-11317) Document that SQL DML statements (UPDATE/MERGE) cannot update key fields
Vladimir Ozerov created IGNITE-11317: Summary: Document that SQL DML statements (UPDATE/MERGE) cannot update key fields Key: IGNITE-11317 URL: https://issues.apache.org/jira/browse/IGNITE-11317 Project: Ignite Issue Type: Task Components: documentation, sql Reporter: Vladimir Ozerov Assignee: Artem Budnikov This is an architectural limitation which is unlikely to be resolved in the near future.
Re: Binary clients: fallback to the previous versions of the protocol
Hi Dmitriy, It is a very common practice to keep the client protocol compatible with multiple versions of the server. We constantly face this in practice. I do not see any reason to drop or complicate this functionality: the user just connects to the server and we automatically negotiate the best feature set possible. No need to expose it to users somehow. As for development and testing, we are not afraid of challenges and difficulties. Yes, it takes more time, but it is worth it. Vladimir. On Thu, Feb 14, 2019 at 6:28 AM Dmitry Melnichuk < dmitry.melnic...@nobitlost.com> wrote: > Igor, > > I am sorry it took me a while to fully understand your reasoning. > > “Update user software first, then update the server” approach still > looks somewhat weird to me (I think of Let's Encrypt client as an > example of “normal” approach in Python world), but since this approach > is vivid, I just have to take it into account, so I must agree with > you. > > I just want to reiterate on one downside of such multi-protocol client, > that was not yet addressed (not in Jira tasks or in docs, at least). > > Imagine a coder wrote a program with the latest client, using a feature > available only in latest binary protocol. When the coder tests his > program against the latest Ignite cluster, the program works perfectly. > > But then the end user runs the program against the previous version of > the server, which client is still backwards-compatible with, the > program runs, but at some point it tries to use the latest feature of > the binary protocol and fails with some cryptic message. The end user > is clueless, so as the coder. > > To avoid such a case, we must include an explicit parameter in our > client's initialization method, that would set the desired protocol > version(s) the user application is designed to work with. This > parameter should be explicit, i.e. not have a default value, since it > just will be useless the other way. 
And yes, this parameter renders all > the software built with previous client versions incompatible with the > new client. > > I think this problem concerns not only the Python client, but all the > thin clients. What do you think? > > On Wed, 2019-02-13 at 13:45 +0300, Igor Sapego wrote: > > The approach you suggest looks to me pretty much the same as > > installing a new version of client software in C++ or Java. The issue > > here that we break existing installed software and require for user > > to update software in order to have ability to connect to a server. > > Just imagine that application which made with thin client is not used > > by a developer that knows how to use pip and all the stuff, but > > someone with another background. Imagine, you have thousands of such > > users. And now imagine, you want to update your servers. > > > > Best Regards, > > Igor > > > > > > On Tue, Feb 12, 2019 at 8:51 PM Dmitry Melnichuk < > > dmitry.melnic...@nobitlost.com> wrote: > > > > > Igor, > > > > > > Thank you for your explanation. I think the matter begins to clear > > > up > > > for me now. > > > > > > The backward compatibility issue you described can not be applied > > > to > > > Python, because Python applications, unlike Java ones, do not have > > > to > > > be built. They rely on package manager (pip, conda, et c.) to run > > > anywhere, including production. > > > > > > At the stage of deployment, the package manager collects > > > dependencies > > > using a specially crafted response file, often called > > > `requirements.txt`. > > > > > > For example, to ensure that their application will work with the > > > current _and_ future minor versions of pyignite, the user may > > > include a > > > line in their `requirements.txt` file: > > > > > > pyignite < x > > > > > > where x is a next major version number. In compliance with semantic > > > versioning, the line is basically says: “Use the latest available > > > version, that is earlier than x”. 
> > > > > > When upgrading Ignite server, system administrator or devops > > > engineer > > > must also update or recreate the app's environment, or update OS- > > > level > > > packages, or redeploy the app using Docker − the exact procedure > > > may > > > vary, but in any case it should be completely standard − to deliver > > > the > > > latest suitable dependencies. > > > > > > And then the same app connects to a latest Ignite server. > > > > > > Here is more about how pip understands versions: > > > > > > > https://pip.pypa.io/en/stable/reference/pip_install/#requirement-specifiers > > > > > > What we really need to do for this to work seamlessly, is to > > > establish > > > the clear relation between products' versions. Regretfully, I have > > > not > > > done this before; just did not expect for this issue to come up. I > > > think it would be best for pyignite major and minor to be set > > > according > > > to the Ignite binary protocol versions, i.e. pyignite 1.2.z handles > > > Ignite binary protocol v1.2, and so on. But that is another matt
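The `pyignite < x` requirement specifier discussed in this thread follows standard pip semantics: from the available releases, pick the newest one strictly below the bound. A simplified Python sketch of that selection rule (plain numeric versions only, no pre-release or post-release handling):

```python
def parse(version):
    """Turn '1.3.1' into a comparable tuple (1, 3, 1)."""
    return tuple(int(part) for part in version.split("."))

def best_match(available, upper_exclusive):
    """Pick the latest available version strictly below the bound,
    mimicking a 'pyignite < x' requirements.txt line (simplified:
    numeric versions only, no pre-releases)."""
    bound = parse(upper_exclusive)
    candidates = [v for v in available if parse(v) < bound]
    return max(candidates, key=parse) if candidates else None

releases = ["1.2.0", "1.2.3", "1.3.1", "2.0.0"]
# 'pyignite < 2' selects the newest 1.x release and skips 2.0.0:
assert best_match(releases, "2") == "1.3.1"
```

This is why, as Dmitry describes, redeploying the environment after a server upgrade is enough: the package manager re-evaluates the same constraint against the now-larger set of releases and picks the newest compatible one.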
[jira] [Created] (IGNITE-11316) SQL: Support partition pruning for local queries
Vladimir Ozerov created IGNITE-11316: Summary: SQL: Support partition pruning for local queries Key: IGNITE-11316 URL: https://issues.apache.org/jira/browse/IGNITE-11316 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently it is not supported because extraction happens inside the splitter. A local query either: # Does not reach the splitter at all (no-split case) # Reaches the splitter, but skips extraction due to missing infrastructure, which is to be implemented and tested in the scope of the current ticket.
[jira] [Created] (IGNITE-11310) SQL: remove special interaction between query parallelism and distributed joins
Vladimir Ozerov created IGNITE-11310: Summary: SQL: remove special interaction between query parallelism and distributed joins Key: IGNITE-11310 URL: https://issues.apache.org/jira/browse/IGNITE-11310 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently we enable so-called "local distributed joins" when a query is executed locally with parallelism enabled. This behavior is not needed and should be removed.
[jira] [Created] (IGNITE-11304) SQL: Common caching of both local and distributed query metadata
Vladimir Ozerov created IGNITE-11304: Summary: SQL: Common caching of both local and distributed query metadata Key: IGNITE-11304 URL: https://issues.apache.org/jira/browse/IGNITE-11304 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Currently query metadata is only cached for distributed queries. For local queries it is calculated on every request over and over again. Need to cache it always in {{QueryParserResultSelect}}.
[jira] [Created] (IGNITE-11280) SQL: Cache all queries, not only two-step
Vladimir Ozerov created IGNITE-11280: Summary: SQL: Cache all queries, not only two-step Key: IGNITE-11280 URL: https://issues.apache.org/jira/browse/IGNITE-11280 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8
[jira] [Created] (IGNITE-11278) SQL: Extract query parsing into separate class
Vladimir Ozerov created IGNITE-11278: Summary: SQL: Extract query parsing into separate class Key: IGNITE-11278 URL: https://issues.apache.org/jira/browse/IGNITE-11278 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 # Introduce separate command types for SELECT, DML and other commands # Move parsing logic and the query cache to a separate class # Fix a bug with query parallelism where the "distributedQueries" flag is modified not for the newly created query, but globally.
[jira] [Created] (IGNITE-11279) SQL: Remove H2's "prepared" from DML plans
Vladimir Ozerov created IGNITE-11279: Summary: SQL: Remove H2's "prepared" from DML plans Key: IGNITE-11279 URL: https://issues.apache.org/jira/browse/IGNITE-11279 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 Currently it is only used to get the list of participating tables. Instead, we should encapsulate this information into {{ParsingResultDml}}. Streamer methods should use our own parser as well.
[jira] [Created] (IGNITE-11275) SQL: Move all command processing stuff to DDL processor
Vladimir Ozerov created IGNITE-11275: Summary: SQL: Move all command processing stuff to DDL processor Key: IGNITE-11275 URL: https://issues.apache.org/jira/browse/IGNITE-11275 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 If a command is of non-SELECT/non-DML type, it should be encapsulated inside {{ParsingResult}} as a pair of native/H2 commands and passed to a separate processor. This will reduce the complexity of {{IgniteH2Indexing}} significantly, as it will be concerned only with SELECT/DML processing and nothing else.
[jira] [Created] (IGNITE-11274) SQL: Make GridCacheSqlQuery immutable
Vladimir Ozerov created IGNITE-11274: Summary: SQL: Make GridCacheSqlQuery immutable Key: IGNITE-11274 URL: https://issues.apache.org/jira/browse/IGNITE-11274 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov The goal of this ticket is to finally make the two-step plan fully immutable. We already made the first steps in IGNITE-11223; however, the plan's "query" objects are still mutable, which makes plan caching inherently unsafe. # Remove all setters from the message except {{nodeId}}, which is really needed # Make the splitter use another truly immutable object instead of {{GridCacheSqlQuery}} # Copy the splitter's object to {{GridCacheSqlQuery}} during reduce
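The three steps above can be illustrated compactly. In this Python sketch, `SplitterQuery` and `assign_node` are hypothetical stand-ins, not the real classes: the splitter-side query is truly immutable, and filling in the node at reduce time produces a copy, so a cached plan is never mutated and can be shared safely.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class SplitterQuery:
    """Truly immutable splitter-side query (step 2 of the ticket):
    attribute assignment raises, so cached plans need no defensive
    copies before being shared across threads."""
    sql: str
    node_id: str = None  # the one per-execution field kept (step 1)

def assign_node(query, node_id):
    # Step 3: "copying during reduce" is just building a new object
    # with node_id filled in; the cached original stays untouched.
    return replace(query, node_id=node_id)

plan = SplitterQuery(sql="SELECT * FROM t")
per_node = assign_node(plan, "node-42")
assert plan.node_id is None        # cached plan untouched
assert per_node.node_id == "node-42"
```

The design choice is the usual immutability trade: a small allocation per execution buys freedom from the shared-mutable-state bugs that make plan caching "inherently unsafe" today.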
[jira] [Created] (IGNITE-11231) SQL: Remove scan index for merge table
Vladimir Ozerov created IGNITE-11231: Summary: SQL: Remove scan index for merge table Key: IGNITE-11231 URL: https://issues.apache.org/jira/browse/IGNITE-11231 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Reasoning: # No business logic compared to its parent # Duplicated code for cost calculation
[jira] [Created] (IGNITE-11227) SQL: Streamline DML execution logic
Vladimir Ozerov created IGNITE-11227: Summary: SQL: Streamline DML execution logic Key: IGNITE-11227 URL: https://issues.apache.org/jira/browse/IGNITE-11227 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently the DML execution logic is overly complex, with execution flow being transferred back and forth between the indexing and DML processors. We need to simplify it as much as possible.
[jira] [Created] (IGNITE-11226) SQL: Remove GridQueryIndexing.prepareNativeStatement
Vladimir Ozerov created IGNITE-11226: Summary: SQL: Remove GridQueryIndexing.prepareNativeStatement Key: IGNITE-11226 URL: https://issues.apache.org/jira/browse/IGNITE-11226 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov This method is the only leak of H2 internals into the outer code. Close analysis of the code reveals that the only reason we have it is *JDBC metadata*. We need to create a method which prepares metadata for a statement and returns it as a detached object. Most probably we already have all the necessary mechanics; this is mostly a refactoring. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11223) SQL: Merge "collectCacheIds" and "processCaches" methods
Vladimir Ozerov created IGNITE-11223: Summary: SQL: Merge "collectCacheIds" and "processCaches" methods Key: IGNITE-11223 URL: https://issues.apache.org/jira/browse/IGNITE-11223 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Both methods are essentially two pieces of the same process: collect cache IDs for the given query and check the MVCC mode. But because they are separated, we have unnecessary collection copies, "isEmpty" checks and iterations. Given that these methods are on a hot path, let's merge them carefully. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
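A sketch of what merging the two passes could look like (all names are illustrative, not Ignite's actual internals): cache IDs are collected and the MVCC mode is validated in a single iteration, with no intermediate copies or extra "isEmpty" checks.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative merge of the two passes described in the ticket.
final class CacheIdCollector {
    // Hypothetical minimal cache descriptor.
    static final class CacheInfo {
        final int id;
        final boolean mvcc;
        CacheInfo(int id, boolean mvcc) { this.id = id; this.mvcc = mvcc; }
    }

    // Before: collectCacheIds() built a list, then processCaches() iterated
    // it again to check the MVCC mode. After: one loop does both.
    static List<Integer> collectAndValidate(List<CacheInfo> caches) {
        List<Integer> ids = new ArrayList<>(caches.size());
        Boolean mvccMode = null;
        for (CacheInfo c : caches) {
            if (mvccMode == null)
                mvccMode = c.mvcc;                 // first cache fixes the mode
            else if (mvccMode != c.mvcc)
                throw new IllegalStateException("Mixed MVCC and non-MVCC caches");
            ids.add(c.id);
        }
        return ids;
    }
}
```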
[jira] [Created] (IGNITE-11212) SQL: Merge affinity collocation models for partition pruning and distributed joins
Vladimir Ozerov created IGNITE-11212: Summary: SQL: Merge affinity collocation models for partition pruning and distributed joins Key: IGNITE-11212 URL: https://issues.apache.org/jira/browse/IGNITE-11212 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently we have two different tree models for partition pruning and distributed joins. First, this leads to code duplication. Second, they have subtle semantic differences harboring hidden bugs. Let's try to merge them into a single model which is built with the same set of rules. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11211) SQL: Rework connection pool
Vladimir Ozerov created IGNITE-11211: Summary: SQL: Rework connection pool Key: IGNITE-11211 URL: https://issues.apache.org/jira/browse/IGNITE-11211 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently we have a very complex multi-level connection pool. Instead, we could have a single concurrent queue of shared connections that are acquired and released by threads as needed. As an optimization, we may optionally attach connections to thread-local storage. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
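The proposed pool could look roughly like this (a sketch under the ticket's assumptions; `PooledConnection` stands in for a real pooled H2 connection, and the thread-local fast path is the optional optimization mentioned above):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical connection type; in Ignite this would wrap an H2 JDBC connection.
final class PooledConnection {
}

// Single shared concurrent queue, as proposed: threads acquire and release
// connections as needed instead of going through a multi-level pool.
final class SimpleConnectionPool {
    private final Queue<PooledConnection> idle = new ConcurrentLinkedQueue<>();

    // Optional optimization: keep the last released connection attached to
    // the thread so hot paths skip the shared queue entirely.
    private final ThreadLocal<PooledConnection> cached = new ThreadLocal<>();

    PooledConnection acquire() {
        PooledConnection c = cached.get();
        if (c != null) {
            cached.remove();                       // thread-local fast path
            return c;
        }
        c = idle.poll();                           // shared queue
        return c != null ? c : new PooledConnection(); // grow on demand
    }

    void release(PooledConnection c) {
        if (cached.get() == null)
            cached.set(c);                         // keep the fast path warm
        else
            idle.offer(c);                         // fall back to the queue
    }
}
```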
[jira] [Created] (IGNITE-11210) SQL: Introduce common logical execution plan for all query types
Vladimir Ozerov created IGNITE-11210: Summary: SQL: Introduce common logical execution plan for all query types Key: IGNITE-11210 URL: https://issues.apache.org/jira/browse/IGNITE-11210 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov At the moment we have a lot of cached artifacts for the different SQL query types (prepared statements for local queries, two-step queries for distributed queries, update plans for DML). Instead of multiple caches, we need to create a common execution plan for every query which holds both the DML and SELECT parts. Approximate content of such a plan: # Two-step plan # DML plan # Partition pruning data # Possibly even the cached physical node distribution (for reduce queries) for the given {{AffinityTopologyVersion}} # Probably the {{AffinityTopologyVersion}} itself Then we will perform a single plan lookup/build per query execution. In the future we will probably display these plans in SQL views. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11208) SQL: Move reservations from QueryContext to MapQueryResult
Vladimir Ozerov created IGNITE-11208: Summary: SQL: Move reservations from QueryContext to MapQueryResult Key: IGNITE-11208 URL: https://issues.apache.org/jira/browse/IGNITE-11208 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov It is unclear why reservations are handled inside {{QueryContext}}. First, they belong to a specific {{MapQueryResult}}, not to thread-local state. Second, inside the {{QueryContext}} logic they are cleared only for requests with distributed joins. Why? Let's remove this weird logic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11209) SQL: streamline DML execution logic
Vladimir Ozerov created IGNITE-11209: Summary: SQL: streamline DML execution logic Key: IGNITE-11209 URL: https://issues.apache.org/jira/browse/IGNITE-11209 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently, DML execution logic is overly complex, with the execution flow being transferred back and forth between the indexing and DML processors. It needs to be simplified as much as possible. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11207) SQL: Remove MapNodeResults class
Vladimir Ozerov created IGNITE-11207: Summary: SQL: Remove MapNodeResults class Key: IGNITE-11207 URL: https://issues.apache.org/jira/browse/IGNITE-11207 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 This class holds results for a specific node. Let's remove it and refactor the associated code with the following goals in mind: # Performance: one CHM lookup instead of two # Uniformity: move both SELECT and DML under the same {{MapQueryResult}} umbrella -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11206) SQL: Merge execution flow for local and map queries
Vladimir Ozerov created IGNITE-11206: Summary: SQL: Merge execution flow for local and map queries Key: IGNITE-11206 URL: https://issues.apache.org/jira/browse/IGNITE-11206 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently MAP and LOCAL queries are executed in completely different ways. This leads to a number of bugs and discrepancies, not to mention obvious code duplication: # Local queries do not reserve partitions # Security checks might be missed for local queries (needs double-checking) # Different event firing logic Let's merge both flows: # Check security and other prerequisites # Reserve partitions # Get a connection # Execute, firing events along the way # Release the connection # Release the partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
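The merged flow listed in the ticket can be sketched as a single method with try/finally release guarantees (all names are illustrative; a real implementation would take the query, topology version, and so on):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative skeleton of the unified MAP/LOCAL flow: the same method runs
// both query kinds, and try/finally guarantees resources are released in
// reverse order of acquisition even if execution fails.
final class UnifiedQueryFlow {
    final List<String> trace = new ArrayList<>(); // records step order

    String execute(boolean local) {
        trace.add("security");           // 1. security and other prerequisites
        trace.add("reserve");            // 2. reserve partitions (now also for LOCAL)
        try {
            trace.add("connect");        // 3. get a connection
            try {
                trace.add("execute");    // 4. execute, firing events along the way
                return local ? "local result" : "map result";
            }
            finally {
                trace.add("disconnect"); // 5. release the connection
            }
        }
        finally {
            trace.add("release");        // 6. release the partitions
        }
    }
}
```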
[jira] [Created] (IGNITE-11203) SQL: global refactoring
Vladimir Ozerov created IGNITE-11203: Summary: SQL: global refactoring Key: IGNITE-11203 URL: https://issues.apache.org/jira/browse/IGNITE-11203 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Over the years, the SQL business logic has become overly complex because we never invested enough time into paying down technical debt. The most prominent features that led to over-complication are: # Distributed joins # Subqueries in the splitter # MVCC # The query cancel feature # DML As a result, it is currently too difficult to add new features to the product: we have to spend a lot of time figuring out what is going on, and we lose a lot to introduced bugs. The general idea of this initiative is to streamline the query execution engine as much as possible. The most important things to consider: # Simplify H2 connection management: simple pooling, avoid exposing connections when possible # Execute MAP and LOCAL queries through the same flow # Avoid zig-zag code flow in the DML logic # Try to merge partition pruning and distributed join cost calculation -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11202) SQL: Move partition reservation logic to separate class
Vladimir Ozerov created IGNITE-11202: Summary: SQL: Move partition reservation logic to separate class Key: IGNITE-11202 URL: https://issues.apache.org/jira/browse/IGNITE-11202 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently the associated logic is located inside {{GridMapQueryExecutor}}. This is wrong, because partitions should be reserved and then released for both local and distributed queries. To allow for a smooth merge of "map" and "local" queries in the future, it is necessary to move this common logic into a separate place which is independent of {{GridMapQueryExecutor}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11200) SQL: query contexts should not be static
Vladimir Ozerov created IGNITE-11200: Summary: SQL: query contexts should not be static Key: IGNITE-11200 URL: https://issues.apache.org/jira/browse/IGNITE-11200 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently query contexts are static and, as a result, overcomplicated. We need to make them instance-bound and remove the static state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Best Effort Affinity for thin clients
Igor, My idea is simply to add the list of caches with the same distribution to the end of partition response. Client can use this information to populate partition info for more caches in a single request. On Mon, Feb 4, 2019 at 3:06 PM Igor Sapego wrote: > Vladimir, > > So correct me if I'm wrong, what you propose is to avoid mentioning > of cache groups, and use instead of "cache group" term something like > "distribution"? Or do you propose some changes in protocol? If so, can > you briefly explain, what kind of changes they are? > > Best Regards, > Igor > > > On Mon, Feb 4, 2019 at 1:13 PM Vladimir Ozerov > wrote: > > > Igor, > > > > Yes, cache groups are public API. However, we try to avoid new APIs > > depending on them. > > The main point from my side is that “similar cache group” can be easily > > generalized to “similar distribution”. This way we avoid cache groups on > > protocol level at virtually no cost. > > > > Vladimir. > > > > пн, 4 февр. 2019 г. в 12:48, Igor Sapego : > > > > > Guys, > > > > > > Can you explain why do we want to avoid Cache groups in protocol? > > > > > > If it's about simplicity of the protocol, then removing cache groups > will > > > not help much with it - we will still need to include "knownCacheIds" > > > field in request and "cachesWithTheSamePartitioning" field in response. > > > And also, since when do Ignite prefers simplicity over performance? > > > > > > If it's about not wanting to show internals of Ignite then it sounds > like > > > a very weak argument to me, since Cache Groups is a public thing [1]. > > > > > > [1] - https://apacheignite.readme.io/docs/cache-groups > > > > > > Best Regards, > > > Igor > > > > > > > > > On Mon, Feb 4, 2019 at 11:47 AM Vladimir Ozerov > > > wrote: > > > > > > > Pavel, Igor, > > > > > > > > This is not very accurate to say that this will not save memory. 
In > > > > practice we observed a number of OOME issues on the server-side due > to > > > many > > > > caches and it was one of motivations for cache groups (another one > disk > > > > access optimizations). On the other hand, I agree that we'd better to > > > avoid > > > > cache groups in the protocol because this is internal implementation > > > detail > > > > which is likely (I hope so) to be changed in future. > > > > > > > > So I have another proposal - let's track caches with the same > affinity > > > > distribution instead. That is, normally most of PARTITIONED caches > will > > > > have very few variants of configuration: it will be Rendezvous > affinity > > > > function, most likely with default partition number and with 1-2 > > backups > > > at > > > > most. So when affinity distribution for specific cache is requested, > we > > > can > > > > append to the response *list of caches with the same distribution*. > > I.e.: > > > > > > > > class AffinityResponse { > > > > Object distribution;// Actual distribution > > > > List cacheIds; // Caches with similar distribution > > > > } > > > > > > > > Makes sense? > > > > > > > > On Sun, Feb 3, 2019 at 8:31 PM Pavel Tupitsyn > > > > wrote: > > > > > > > > > Igor, I have a feeling that we should omit Cache Group stuff from > the > > > > > protocol. > > > > > It is a rare use case and even then dealing with them on client > > barely > > > > > saves some memory. > > > > > > > > > > We can keep it simple and have partition map per cacheId. Thoughts? > > > > > > > > > > On Fri, Feb 1, 2019 at 6:49 PM Igor Sapego > > wrote: > > > > > > > > > > > Guys, I've updated the proposal once again [1], so please, > > > > > > take a look and let me know what you think. > > > > > > > > > > > > [1] - > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+thin+clients > > > > > > > > > > > > Best Regards, > > > > > > Igor > > > &g
Re: Best Effort Affinity for thin clients
Igor, Yes, cache groups are public API. However, we try to avoid new APIs depending on them. The main point from my side is that “similar cache group” can be easily generalized to “similar distribution”. This way we avoid cache groups on protocol level at virtually no cost. Vladimir. пн, 4 февр. 2019 г. в 12:48, Igor Sapego : > Guys, > > Can you explain why do we want to avoid Cache groups in protocol? > > If it's about simplicity of the protocol, then removing cache groups will > not help much with it - we will still need to include "knownCacheIds" > field in request and "cachesWithTheSamePartitioning" field in response. > And also, since when do Ignite prefers simplicity over performance? > > If it's about not wanting to show internals of Ignite then it sounds like > a very weak argument to me, since Cache Groups is a public thing [1]. > > [1] - https://apacheignite.readme.io/docs/cache-groups > > Best Regards, > Igor > > > On Mon, Feb 4, 2019 at 11:47 AM Vladimir Ozerov > wrote: > > > Pavel, Igor, > > > > This is not very accurate to say that this will not save memory. In > > practice we observed a number of OOME issues on the server-side due to > many > > caches and it was one of motivations for cache groups (another one disk > > access optimizations). On the other hand, I agree that we'd better to > avoid > > cache groups in the protocol because this is internal implementation > detail > > which is likely (I hope so) to be changed in future. > > > > So I have another proposal - let's track caches with the same affinity > > distribution instead. That is, normally most of PARTITIONED caches will > > have very few variants of configuration: it will be Rendezvous affinity > > function, most likely with default partition number and with 1-2 backups > at > > most. So when affinity distribution for specific cache is requested, we > can > > append to the response *list of caches with the same distribution*. 
I.e.: > > > > class AffinityResponse { > > Object distribution;// Actual distribution > > List cacheIds; // Caches with similar distribution > > } > > > > Makes sense? > > > > On Sun, Feb 3, 2019 at 8:31 PM Pavel Tupitsyn > > wrote: > > > > > Igor, I have a feeling that we should omit Cache Group stuff from the > > > protocol. > > > It is a rare use case and even then dealing with them on client barely > > > saves some memory. > > > > > > We can keep it simple and have partition map per cacheId. Thoughts? > > > > > > On Fri, Feb 1, 2019 at 6:49 PM Igor Sapego wrote: > > > > > > > Guys, I've updated the proposal once again [1], so please, > > > > take a look and let me know what you think. > > > > > > > > [1] - > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+thin+clients > > > > > > > > Best Regards, > > > > Igor > > > > > > > > > > > > On Thu, Jan 17, 2019 at 1:05 PM Igor Sapego > > wrote: > > > > > > > > > Yeah, I'll add it. > > > > > > > > > > Best Regards, > > > > > Igor > > > > > > > > > > > > > > > On Wed, Jan 16, 2019 at 11:08 PM Pavel Tupitsyn < > > ptupit...@apache.org> > > > > > wrote: > > > > > > > > > >> > to every server > > > > >> I did not think of this issue. Now I agree with your approach. > > > > >> Can you please add an explanation of this to the IEP? > > > > >> > > > > >> Thanks! > > > > >> > > > > >> On Wed, Jan 16, 2019 at 2:53 PM Igor Sapego > > > wrote: > > > > >> > > > > >> > Pavel, > > > > >> > > > > > >> > Yeah, it makes sense, but to me it seems that this approach can > > lead > > > > >> > to more complicated client logic, as it will require to make > > > > additional > > > > >> > call > > > > >> > to every server, that reports affinity topology change. > > > > >> > > > > > >> > Guys, WDYT? > > > > >> > > > > > >> > Best Regards, > > > > >> > Igor > > > > >> > > > > > >>
[jira] [Created] (IGNITE-11185) SQL: Move distributed joins code from base index to H2TreeIndex
Vladimir Ozerov created IGNITE-11185: Summary: SQL: Move distributed joins code from base index to H2TreeIndex Key: IGNITE-11185 URL: https://issues.apache.org/jira/browse/IGNITE-11185 Project: Ignite Issue Type: Task Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 {{H2TreeIndex}} is the only implementation concerned with distributed joins. Let's move the associated code out of {{GridH2IndexBase}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Best Effort Affinity for thin clients
Pavel, Igor, It is not quite accurate to say that this will not save memory. In practice we observed a number of OOME issues on the server side due to many caches, and that was one of the motivations for cache groups (another was disk access optimization). On the other hand, I agree that we'd better avoid cache groups in the protocol, because they are an internal implementation detail which is likely (I hope) to change in the future. So I have another proposal - let's track caches with the same affinity distribution instead. That is, normally most PARTITIONED caches will have very few configuration variants: the Rendezvous affinity function, most likely with the default partition number and 1-2 backups at most. So when the affinity distribution for a specific cache is requested, we can append the *list of caches with the same distribution* to the response. I.e.: class AffinityResponse { Object distribution; // Actual distribution List cacheIds; // Caches with similar distribution } Makes sense? On Sun, Feb 3, 2019 at 8:31 PM Pavel Tupitsyn wrote: > Igor, I have a feeling that we should omit Cache Group stuff from the > protocol. > It is a rare use case and even then dealing with them on client barely > saves some memory. > > We can keep it simple and have partition map per cacheId. Thoughts? > > On Fri, Feb 1, 2019 at 6:49 PM Igor Sapego wrote: > > > Guys, I've updated the proposal once again [1], so please, > > take a look and let me know what you think. > > > > [1] - > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+thin+clients > > > > Best Regards, > > Igor > > > > > > On Thu, Jan 17, 2019 at 1:05 PM Igor Sapego wrote: > > > > > Yeah, I'll add it. > > > > > > Best Regards, > > > Igor > > > > > > > > > On Wed, Jan 16, 2019 at 11:08 PM Pavel Tupitsyn > > > wrote: > > > > > >> > to every server > > >> I did not think of this issue. Now I agree with your approach.
> > >> Can you please add an explanation of this to the IEP? > > >> > > >> Thanks! > > >> > > >> On Wed, Jan 16, 2019 at 2:53 PM Igor Sapego > wrote: > > >> > > >> > Pavel, > > >> > > > >> > Yeah, it makes sense, but to me it seems that this approach can lead > > >> > to more complicated client logic, as it will require to make > > additional > > >> > call > > >> > to every server, that reports affinity topology change. > > >> > > > >> > Guys, WDYT? > > >> > > > >> > Best Regards, > > >> > Igor > > >> > > > >> > > > >> > On Tue, Jan 15, 2019 at 10:59 PM Pavel Tupitsyn < > ptupit...@apache.org > > > > > >> > wrote: > > >> > > > >> > > Igor, > > >> > > > > >> > > > It is proposed to add flag to every response, that shows > whether > > >> the > > >> > > Affinity Topology Version of the cluster has changed since the > last > > >> > request > > >> > > from the client. > > >> > > I propose to keep this flag. So no need for periodic checks. Makes > > >> sense? > > >> > > > > >> > > On Tue, Jan 15, 2019 at 4:45 PM Igor Sapego > > >> wrote: > > >> > > > > >> > > > Pavel, > > >> > > > > > >> > > > This will require from client to send this new request > > periodically, > > >> > I'm > > >> > > > not > > >> > > > sure this will make clients simpler. Anyway, let's discuss it. > > >> > > > > > >> > > > Vladimir, > > >> > > > > > >> > > > With current proposal, we will have affinity info in message > > header. > > >> > > > > > >> > > > Best Regards, > > >> > > > Igor > > >> > > > > > >> > > > > > >> > > > On Tue, Jan 15, 2019 at 11:01 AM Vladimir Ozerov < > > >> voze...@gridgain.com > > >> > > > > >> > > > wrote: > > >> > > > > > >> > > > > Igor, > > >> > > > > > > >> > > > > I think that "Cache Partitions Request&quo
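A fleshed-out sketch of the `AffinityResponse` idea discussed in this thread (types and field names are assumptions; the real protocol message would differ). The point of the design is that one response lets a client populate affinity information for several caches at once:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Sketch of the proposed response: the distribution for the requested cache
// plus the IDs of all caches known to share exactly that distribution.
final class AffinityResponse {
    final Map<Integer, List<UUID>> partitionMap; // partition -> owner nodes
    final List<Integer> cacheIds;                // caches with this distribution

    AffinityResponse(Map<Integer, List<UUID>> partitionMap, List<Integer> cacheIds) {
        this.partitionMap = partitionMap;
        this.cacheIds = cacheIds;
    }
}

// Client side: apply a single response to every cache it covers.
final class ClientAffinityCache {
    final Map<Integer, Map<Integer, List<UUID>>> byCacheId = new HashMap<>();

    void apply(AffinityResponse res) {
        for (int cacheId : res.cacheIds)
            byCacheId.put(cacheId, res.partitionMap);
    }
}
```

This captures the trade-off argued for above: the protocol stays free of the "cache group" concept, while the memory and round-trip savings of grouping are preserved for any caches that happen to share a distribution.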
[jira] [Created] (IGNITE-11180) SQL: give more sensible names to reducer classes
Vladimir Ozerov created IGNITE-11180: Summary: SQL: give more sensible names to reducer classes Key: IGNITE-11180 URL: https://issues.apache.org/jira/browse/IGNITE-11180 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 # Rename classes in accordance to map/reduce approach to simplify further development # Remove dead code in reducer logic -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11169) SQL: Remove collocation model-related code from GridH2QueryContext
Vladimir Ozerov created IGNITE-11169: Summary: SQL: Remove collocation model-related code from GridH2QueryContext Key: IGNITE-11169 URL: https://issues.apache.org/jira/browse/IGNITE-11169 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 This code should live in the splitter logic instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11160) SQL: Create light-weight row for read-only rows
Vladimir Ozerov created IGNITE-11160: Summary: SQL: Create light-weight row for read-only rows Key: IGNITE-11160 URL: https://issues.apache.org/jira/browse/IGNITE-11160 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 In order to minimize memory overhead during query execution, we can create a simplified version of {{GridH2KeyValueRowOnheap}} which will not hold a reference to the original row. We can also remove the value cache, as it is never used during SELECT execution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: proposed realization KILL QUERY command
Hi Yuriy, Agree that at the moment the simpler the better. Let's return to more complex syntax in future if needed. Regarding proposed syntax, please note that as query ID is not database object name but rather string literal, we'd better wrap it into quotes to keep syntax consistency across commands: KILL QUERY '8a55df83-2f41-4f81-8e11-ab0936d0_6742'; Vladimir. On Wed, Jan 30, 2019 at 3:09 PM Юрий wrote: > Hi Igniters, > > Let's return to KILL QUERY command. Previously we mostly discussed about > two variants of format: > 1) simple - KILL QUERY {running_query_id} > 2) advanced syntax - KILL QUERY WHERE {parameters}. Parameters seems can be > any columns from running queries view or just part of them. > > I've checked approaches used by Industrial RDBMS vendors : > >- > - *ORACLE*: ALTER SYSTEM CANCEL SQL 'SID, SERIAL, SQL_ID' > > >- > - *Postgres*: SELECT pg_cancel_backend() and > SELECT pg_terminate_backend() > - *MySQL*: KILL QUERY > > > As we see all of them use simple syntax to cancel a query and can't do some > filters. > > IMHO simple *KILL QUERY qry_id* better for the few reasons. > User can kill just single query belong (started) to single node and it will > be exactly that query which was passed as parameter - predictable results. > For advance syntax it could lead send kill request to all nodes in a > cluster and potentially user can kill unpredictable queries depend on > passed parameters. > Other vendors use simple syntax > > How it could be used > > 1)SELECT * from sql_running_queries > result is > query_id > | sql | schema_name | duration| > 8a55df83-2f41-4f81-8e11-ab0936d0_6742 | SELECT ... | ... > | | > 8a55df83-2f41-4f81-8e11-ab0936d0_1234 | UPDATE... | ... > | .. | > > 2) KILL QUERY 8a55df83-2f41-4f81-8e11-ab0936d0_6742 > > > > Do you have another opinion? Let's decide which of variant will be prefer. > > > ср, 16 янв. 2019 г. в 18:02, Denis Magda : > > > Yury, > > > > I do support the latter concatenation approach. 
It's simple and > correlates > > with what other DBs do. Plus, it can be passed to KILL command without > > complications. Thanks for thinking this through! > > > > As for the killing of all queries on a particular node, not sure that's a > > relevant use case. I would put this off. Usually, you want to stop a > > specific query (it's slow or resources consuming) and have to know its > id, > > the query runs across multiple nodes and a single KILL command with the > id > > can halt it everywhere. If someone decided to shut all queries on the > node, > > then it sounds like the node is experiencing big troubles and it might be > > better just to shut it down completely. > > > > - > > Denis > > > > > > On Tue, Jan 15, 2019 at 8:00 AM Юрий > wrote: > > > >> Denis and other Igniters, do you have any comments for proposed > approach? > >> Which of these ones will be better to use for us - simple numeric or > hex > >> values (shorter id, but with letters)? > >> > >> As for me hex values preferable due to it shorter and looks more unique > >> across a logs > >> > >> > >> > >> вт, 15 янв. 2019 г. в 18:35, Vladimir Ozerov : > >> > >>> Hi, > >>> > >>> Concatenation through a letter looks like a good approach to me. As far > >>> as > >>> killing all queries on a specific node, I would put it aside for now - > >>> this > >>> looks like a separate command with possibly different parameters. > >>> > >>> On Tue, Jan 15, 2019 at 1:30 PM Юрий > >>> wrote: > >>> > >>> > Thanks Vladimir for your thoughts. > >>> > > >>> > Based on it most convenient ways are first and third. > >>> > But with some modifications: > >>> > For first variant delimiter should be a letter, e.g. 123X15494, then > it > >>> > could be simple copy by user. > >>> > For 3rd variant can be used convert both numeric to HEX and use a > >>> letter > >>> > delimiter not included to HEX symbols (ABCDEF), in this case query id > >>> will > >>> > be shorter and also can be simple copy by user. e.g. 
7BX3C86 ( it the > >>> same > >>> > value as used for first variant), instead of convert all value as > &g
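The ID scheme converged on in this thread — node order and per-node query counter rendered in hex and joined by a letter outside the hex alphabet — can be sketched as follows (a minimal illustration, not Ignite's actual implementation):

```java
// Sketch of the query ID scheme discussed above: two hex numbers joined by
// a letter that is not a hex digit (A-F), so the ID stays short, parses
// unambiguously, and copies easily from logs.
final class QueryId {
    private static final char DELIM = 'X'; // not a hex digit, safe separator

    static String format(long nodeOrder, long qryCounter) {
        return Long.toHexString(nodeOrder).toUpperCase()
            + DELIM
            + Long.toHexString(qryCounter).toUpperCase();
    }

    static long[] parse(String id) {
        int i = id.indexOf(DELIM);
        return new long[] {
            Long.parseLong(id.substring(0, i), 16),
            Long.parseLong(id.substring(i + 1), 16)
        };
    }
}
```

With the numbers from the email (node order 123, query counter 15494), this yields the `7BX3C86` form mentioned above, which is what the user would paste into `KILL QUERY '…'`.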
[jira] [Created] (IGNITE-11134) SQL: Do not wrap key and value objects in GridH2KeyValueRowOnheap
Vladimir Ozerov created IGNITE-11134: Summary: SQL: Do not wrap key and value objects in GridH2KeyValueRowOnheap Key: IGNITE-11134 URL: https://issues.apache.org/jira/browse/IGNITE-11134 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 This wrapping is not needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11118) SQL: Ability to resolve partition from argument without H2
Vladimir Ozerov created IGNITE-11118: Summary: SQL: Ability to resolve partition from argument without H2 Key: IGNITE-11118 URL: https://issues.apache.org/jira/browse/IGNITE-11118 Project: Ignite Issue Type: Task Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently we rely on H2 to get the final partition: we need to convert the originally passed argument to the expected argument type. We need to write our own code to handle this, as H2 code will not be available to thin clients. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11117) SQL: Move partition nodes to core module
Vladimir Ozerov created IGNITE-11117: Summary: SQL: Move partition nodes to core module Key: IGNITE-11117 URL: https://issues.apache.org/jira/browse/IGNITE-11117 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 This is needed for further integration with thin clients, which do not have a dependency on the {{indexing}} module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11115) Binary: rework thread-local binary context to avoid set() operation
Vladimir Ozerov created IGNITE-11115: Summary: Binary: rework thread-local binary context to avoid set() operation Key: IGNITE-11115 URL: https://issues.apache.org/jira/browse/IGNITE-11115 Project: Ignite Issue Type: Task Components: binary Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently we call {{ThreadLocal.set()}} on every serialization/deserialization (see {{GridBinaryMarshaller#BINARY_CTX}} usages). This may lead to high CPU usage, especially during SQL query execution. Let's refactor the access patterns to work only with the {{ThreadLocal.get()}} operation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
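One way to realize the get()-only pattern the ticket asks for (a sketch; `BinaryContextHolder` is a hypothetical name, not Ignite's class): the thread-local stores a per-thread mutable holder created once via `withInitial`, so swapping the context mutates the holder instead of calling `ThreadLocal.set()` on every operation.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the access-pattern change: hot paths only ever call get();
// the holder object itself is allocated once per thread.
final class BinaryContextHolder {
    Object ctx; // the current binary context, swapped in and out cheaply

    // Counts holder allocations, just to demonstrate one-per-thread behavior.
    static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    private static final ThreadLocal<BinaryContextHolder> HOLDER =
        ThreadLocal.withInitial(() -> {
            ALLOCATIONS.incrementAndGet(); // runs once per thread
            return new BinaryContextHolder();
        });

    // Replaces ThreadLocal.set(ctx): a plain field write on the cached holder.
    static Object setContext(Object ctx) {
        BinaryContextHolder h = HOLDER.get(); // get() only, no set()
        Object old = h.ctx;
        h.ctx = ctx;
        return old;
    }

    static Object context() {
        return HOLDER.get().ctx;
    }
}
```

Returning the previous context from `setContext` also makes restore-on-exit (push/pop style usage around nested marshalling) a one-liner for callers.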
Re: H2 license and vulnerabilities
Hi Steve, H2 cannot be removed from Ignite easily, as it is integrated pretty deeply into the indexing module. The good news is that our usage of H2 is fairly limited - we only use its parser, planner and execution pipeline. We do not use H2 as data storage. Please let me know if you need any additional clarifications. Vladimir. On Tue, Jan 29, 2019 at 10:35 AM steve.hostett...@gmail.com < steve.hostett...@gmail.com> wrote: > Hello, > I am using Apache Ignite in a financial setting and it gets reported as a > high risk because of one of its dependencies: H2 > > The blackduck report warns the following: > 1) The H2 license, being weak reciprocal, is not the preferred type of OSS > license (e.g., Apache, MIT) > 2) There are known vulnerabilities that have not been fixed for more than a > year now: > > https://www.cvedetails.com/vulnerability-list/vendor_id-17893/product_id-45580/year-2018/H2database-H2.html > > So here are my questions: > 1) is there any plan to swap H2 for another in-memory database, and if not, > what is the view of the community on the above points? > 2) Does Ignite use the part of H2 that is vulnerable (disk backup)? > > Many thanks in advance > > > > -- > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ >
Re: SQL View with list of existing indexes
Hi Yuriy, Yes, I believe we will have columns view(s) at some point in time for sure. On Thu, Jan 24, 2019 at 7:08 PM Юрий wrote: > Hi Vladimir, > > Thanks for your comments, > > 1) Agree. > 2) Ok. > 3) We create number of index copies depend on query parallelism. But seems > you are right - it should be exposed on TABLES level. > 4) Approx. inline size shouldn't be used here, due to the value depend on > node and not has single value. > 5) Do we have a plans for some view with table columns? If yes, may be will > be better have just array with column order from the columns view. For > example you want to know which columns are indexed already. In case we will > have plain comma-separated form it can't be achieved. > > > > > > чт, 24 янв. 2019 г. в 18:09, Vladimir Ozerov : > > > Hi Yuriy, > > > > Please note that MySQL link is about SHOW command, which is a different > > beast. In general I think that PG approach is better as it allows user to > > get quick overview of index content without complex JOINs. I would start > > with plain single view and add columns view later if we found it useful. > As > > far as view columns: > > 1) I would add both cache ID/name and cache group ID/name > > 2) Number of columns does not look as a useful info to me > > 3) Query parallelism is related to cache, not index, so it should be in > > IGNITE.TABLES view instead > > 4) Inline size is definitely useful metric. Not sure about approximate > > inline size > > 5) I would add list of columns in plain comma-separated form with > ASC/DESC > > modifiers > > > > Thoughts? > > > > Vladimir. > > > > On Thu, Jan 24, 2019 at 3:52 PM Юрий > wrote: > > > > > Hi Igniters, > > > > > > As part of IEP-29: SQL management and monitoring > > > < > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring > > > > > > > I'm going to implement SQL view with list of existing indexes. > > > I've investigate how it expose by ORACLE, MySQL and Postgres. 
> > > ORACLE -
> > > https://docs.oracle.com/en/database/oracle/oracle-database/18/refrn/ALL_INDEXES.html#GUID-E39825BA-70AC-45D8-AF30-C7FF561373B6
> > > MySQL - https://dev.mysql.com/doc/refman/8.0/en/show-index.html
> > > Postgres - https://www.postgresql.org/docs/11/view-pg-indexes.html ,
> > > https://www.postgresql.org/docs/11/catalog-pg-index.html
> > >
> > > All vendors have such views, which show at least the following information:
> > > schema name - Name of the schema related to the table and index.
> > > table name - Name of the table related to an index.
> > > index name - Name of the index.
> > > list of columns - All columns included into an index and their order.
> > > collation - ASC or DESC sort for each column.
> > >
> > > + much vendor-specific information which differs from vendor to vendor.
> > >
> > > In our case such specific information could be at least:
> > >
> > > 1. Owning cache ID - not sure, but may be useful to join with our other views.
> > > 2. number of columns in the index - just to know how many results should be in the columns view
> > > 3. query parallelism - a configuration parameter showing how many threads can be used to execute a query.
> > > 4. inline size - the inline size used for this index.
> > > 5. is affinity - boolean parameter showing that this is an affinity key index
> > > 6. is pk - boolean parameter showing that this is a PK index
> > > 7. approx recommended inline size - dynamically calculated recommended inline size for this index, showing the size required to keep the whole indexed columns inlined.
> > >
> > > All vendors have different ways to present information about index columns:
> > > PG - uses an array of indexed table columns and a second array with the collation of each column.
> > > MySQL - each row in the index view contains information about one of the indexed columns
Re: Distributed MetaStorage discussion
Ivan,

The idea is that certain changes to the system are not relevant for all components. E.g. if the SQL schema is changed, then some SQL caches need to be invalidated. When the affinity topology changes, another set of caches needs to be invalidated. Having a single version may lead to unexpected latency spikes and invalidations in this case.

On Fri, Jan 25, 2019 at 4:50 PM Ivan Bessonov wrote:
> Vladimir,
>
> thank you for the reply. Topology and affinity changes are not reflected in distributed metastorage, we didn't touch baseline history at all. I believe that what you really need is just a distributed property "sqlSchemaVer" that is updated on each schema update. It could be achieved by creating a corresponding key in distributed metastorage without any specific treatment from the API standpoint.
>
> The same thing applies to topology and affinity versions, but the motivation here is not that clear to me, to be honest.
>
> I think that the most common approach with a single incrementing version is much simpler than several counters, and I would prefer to leave it that way.
>
> Fri, 25 Jan 2019 at 16:39, Vladimir Ozerov :
> > Ivan,
> >
> > The change you describe is an extremely valuable thing as it allows detecting changes to the global configuration, which is of great importance for SQL. Will topology and affinity changes be reflected in metastore history as well? From the SQL perspective it is important for us to be able to understand whether cluster topology, data distribution or SQL schema has changed between two versions. Is it possible to have a kind of composite version instead of a hashed counter? E.g.
> > > > class ConfigurationVersion { > > long globalVer; // Global counter > > long topVer; // Increasing topology version > > long affVer; // Increasing affinity version which is incremented > every > > time data distribution is changed (node join/leave, baseline changes, > late > > affinity assignment) > > long sqlSchemaVer; // Incremented every time SQL schema changes > > } > > > > Vladimir. > > > > > > On Fri, Jan 25, 2019 at 11:45 AM Ivan Bessonov > > wrote: > > > > > Hello, Igniters! > > > > > > Here's more info "Distributed MetaStorage" feature [1]. It is a part of > > > Phase II for > > > IEP-4 (Baseline topology) [2] and was mentioned in recent "Baseline > > > auto-adjust`s > > > discuss" topic. I'll partially duplicate that message here. > > > > > > One of key requirements is the ability to store configuration data (or > > any > > > other data) > > > consistently and cluster-wide. There are also other tickets that > require > > > similar > > > mechanisms, for example [3]. Ignite doesn't have any specific API for > > such > > > configurations and we don't want to have many similar implementations > of > > > the > > > same feature across the code. > > > > > > There are several API methods required for the feature: > > > > > > - read(key) / iterate(keyPrefix) - access to the distributed data. > > Should > > > be > > >consistent for all nodes in cluster when it's in active state. > > > - write / remove - modify data in distributed metastorage. Should > > > guarantee that > > >every node in cluster will have this update after the method is > > > finished. > > > - writeAsync / removeAsync (not yet implemented) - same as above, but > > > async. > > >Might be useful if one needs to update several values one after > > another. > > > - compareAndWrite / compareAndRemove - helpful to reduce number of > data > > >updates (more on that later). > > > - listen(keyPredicate) - a way of being notified when some data was > > > changed. 
> > > Normally it is triggered on a "write/remove" operation or node activation. The listener itself will be notified with .
> > >
> > > Now some implementation details:
> > >
> > > First implementation is based on the existing local metastorage API for persistent clusters (in-memory clusters will store data in memory). Write/remove operations use Discovery SPI to send updates to the cluster; it guarantees updates order and the fact that all existing (alive) nodes have handled the update message.
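The composite version sketched in the quoted reply above can be illustrated with a small, self-contained class. This is purely a sketch of the proposal, not an Ignite API: the four field names follow the email, while the constructor and the `sqlCachesStale` helper are illustrative assumptions showing how a SQL component could react only to its own sub-version.

```java
// Hypothetical sketch of the proposed composite ConfigurationVersion.
// Only sqlSchemaVer matters to SQL caches; topology/affinity bumps are ignored.
class ConfigurationVersion {
    final long globalVer;     // global counter, bumped on every update
    final long topVer;        // increasing topology version
    final long affVer;        // bumped when data distribution changes
    final long sqlSchemaVer;  // bumped on every SQL schema change

    ConfigurationVersion(long globalVer, long topVer, long affVer, long sqlSchemaVer) {
        this.globalVer = globalVer;
        this.topVer = topVer;
        this.affVer = affVer;
        this.sqlSchemaVer = sqlSchemaVer;
    }

    /** SQL caches need invalidation only if the schema sub-version moved. */
    static boolean sqlCachesStale(ConfigurationVersion seen, ConfigurationVersion current) {
        return current.sqlSchemaVer > seen.sqlSchemaVer;
    }
}
```

With a single hashed counter, any topology change would look like a SQL-relevant change; with the composite form, a topology-only bump leaves `sqlSchemaVer` untouched and no invalidation happens.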
Re: Distributed MetaStorage discussion
Ivan,

The change you describe is an extremely valuable thing as it allows detecting changes to the global configuration, which is of great importance for SQL. Will topology and affinity changes be reflected in metastore history as well? From the SQL perspective it is important for us to be able to understand whether cluster topology, data distribution or SQL schema has changed between two versions. Is it possible to have a kind of composite version instead of a hashed counter? E.g.

class ConfigurationVersion {
    long globalVer;    // Global counter
    long topVer;       // Increasing topology version
    long affVer;       // Increasing affinity version which is incremented every time data distribution is changed (node join/leave, baseline changes, late affinity assignment)
    long sqlSchemaVer; // Incremented every time SQL schema changes
}

Vladimir.

On Fri, Jan 25, 2019 at 11:45 AM Ivan Bessonov wrote:
> Hello, Igniters!
>
> Here's more info on the "Distributed MetaStorage" feature [1]. It is a part of Phase II for IEP-4 (Baseline topology) [2] and was mentioned in the recent "Baseline auto-adjust`s discuss" topic. I'll partially duplicate that message here.
>
> One of the key requirements is the ability to store configuration data (or any other data) consistently and cluster-wide. There are also other tickets that require similar mechanisms, for example [3]. Ignite doesn't have any specific API for such configurations and we don't want to have many similar implementations of the same feature across the code.
>
> There are several API methods required for the feature:
>
> - read(key) / iterate(keyPrefix) - access to the distributed data. Should be consistent for all nodes in the cluster when it's in the active state.
> - write / remove - modify data in distributed metastorage. Should guarantee that every node in the cluster will have this update after the method is finished.
> - writeAsync / removeAsync (not yet implemented) - same as above, but async.
> Might be useful if one needs to update several values one after another.
> - compareAndWrite / compareAndRemove - helpful to reduce the number of data updates (more on that later).
> - listen(keyPredicate) - a way of being notified when some data was changed. Normally it is triggered on a "write/remove" operation or node activation. The listener itself will be notified with .
>
> Now some implementation details:
>
> First implementation is based on the existing local metastorage API for persistent clusters (in-memory clusters will store data in memory). Write/remove operations use Discovery SPI to send updates to the cluster; it guarantees updates order and the fact that all existing (alive) nodes have handled the update message.
>
> As a way to find out which node has the latest data there is a "version" value of distributed metastorage, which is basically all updates>. The whole updates history until some point in the past is stored along with the data, so when an outdated node connects to the cluster it will receive all the missing data and apply it locally. Listeners will also be invoked after such updates. If there's not enough history stored, or the joining node is clean, then it'll receive a snapshot of distributed metastorage, so there won't be inconsistencies. The "compareAndWrite" / "compareAndRemove" API might help reduce the size of the history, especially for Boolean or other primitive values.
>
> There are, of course, many more details, feel free to ask about them. The first implementation is in master, but there are already known improvements that can be done and I'm working on them right now.
>
> See package "org.apache.ignite.internal.processors.metastorage" for the new interfaces and comment with your opinion or questions. Thank you!
> > [1] https://issues.apache.org/jira/browse/IGNITE-10640 > [2] > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches > [3] https://issues.apache.org/jira/browse/IGNITE-8717 > > -- > Sincerely yours, > Ivan Bessonov >
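The point about `compareAndWrite` reducing history size can be modeled with a tiny in-memory store. This is NOT the Ignite `DistributedMetaStorage` API — just an illustrative sketch (class and method names are made up) of why a failed conditional write produces no new history entry.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of compareAndWrite semantics: the version (a stand-in for the
// update history length) grows only when data actually changes.
class MetaStoreModel {
    private final ConcurrentHashMap<String, Object> data = new ConcurrentHashMap<>();
    private long version; // grows only on a successful write

    synchronized boolean compareAndWrite(String key, Object expected, Object newVal) {
        Object cur = data.get(key);
        if (!Objects.equals(cur, expected))
            return false; // condition failed: no write, no history entry
        data.put(key, newVal);
        version++;
        return true;
    }

    Object read(String key) { return data.get(key); }
    synchronized long version() { return version; }
}
```

For a Boolean flag that many nodes try to set, only the first `compareAndWrite(key, null, TRUE)` succeeds; the rest are no-ops and the history stays short.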
[jira] [Created] (IGNITE-11083) SQL: Extract query model from splitter
Vladimir Ozerov created IGNITE-11083: Summary: SQL: Extract query model from splitter Key: IGNITE-11083 URL: https://issues.apache.org/jira/browse/IGNITE-11083 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 We will need a common query model with join/subquery info for future splitter and partition pruning improvements. Let's accurately extract the model from the splitter, aiming to reuse it for partition pruning in the future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Baseline auto-adjust`s discuss
Got it, makes sense.

On Fri, Jan 25, 2019 at 11:06 AM Anton Kalashnikov wrote:
> Vladimir, thanks for your notes; both of them look good enough, but I have two different thoughts about them.
>
> I think I agree about enabling only one of manual/auto adjustment. It is easier than the current solution, and in fact, as an extra feature, we can allow the user to force the task to execute (if they don't want to wait until the timeout expires).
> But about the second one, I am not sure that one parameter instead of two would be more convenient. For example: if a user changed the timeout and then disabled auto-adjust, then whoever wants to enable it again has to know what the timeout value was before auto-adjust was disabled. I think the "negative value" pattern is a good choice for always-applicable parameters like a connection timeout (e.g. -1 means endless waiting), but in our case we want to disable the whole functionality rather than change a parameter value.
>
> --
> Best regards,
> Anton Kalashnikov
>
> 24.01.2019, 22:03, "Vladimir Ozerov":
> > Hi Anton,
> >
> > This is a great feature, but I am a bit confused about the automatic disabling of the feature during manual baseline adjustment. This may lead to unpleasant situations when a user enabled auto-adjustment, then re-adjusted it manually somehow (e.g. from some previously created script) so that the auto-adjustment disabling went unnoticed, then added more nodes hoping that auto-baseline is still active, etc.
> >
> > Instead, I would rather make manual and auto adjustment mutually exclusive - the baseline cannot be adjusted manually when auto mode is set, and vice versa. If an exception is thrown in those cases, administrators will always know the current behavior of the system.
> >
> > As far as configuration, wouldn't it be enough to have a single long value as opposed to Boolean + long?
Say, 0 - immediate auto adjustment, > negative > > - disabled, positive - auto adjustment after timeout. > > > > Thoughts? > > > > чт, 24 янв. 2019 г. в 18:33, Anton Kalashnikov : > > > >> Hello, Igniters! > >> > >> Work on the Phase II of IEP-4 (Baseline topology) [1] has started. I > want > >> to start to discuss of implementation of "Baseline auto-adjust" [2]. > >> > >> "Baseline auto-adjust" feature implements mechanism of auto-adjust > >> baseline corresponding to current topology after event join/left was > >> appeared. It is required because when a node left the grid and nobody > would > >> change baseline manually it can lead to lost data(when some more nodes > left > >> the grid on depends in backup factor) but permanent tracking of grid > is not > >> always possible/desirible. Looks like in many cases auto-adjust > baseline > >> after some timeout is very helpfull. > >> > >> Distributed metastore[3](it is already done): > >> > >> First of all it is required the ability to store configuration data > >> consistently and cluster-wide. Ignite doesn't have any specific API for > >> such configurations and we don't want to have many similar > implementations > >> of the same feature in our code. After some thoughts is was proposed to > >> implement it as some kind of distributed metastorage that gives the > ability > >> to store any data in it. > >> First implementation is based on existing local metastorage API for > >> persistent clusters (in-memory clusters will store data in memory). > >> Write/remove operation use Discovery SPI to send updates to the > cluster, it > >> guarantees updates order and the fact that all existing (alive) nodes > have > >> handled the update message. As a way to find out which node has the > latest > >> data there is a "version" value of distributed metastorage, which is > >> basically . 
All updates history > >> until some point in the past is stored along with the data, so when an > >> outdated node connects to the cluster it will receive all the missing > data > >> and apply it locally. If there's not enough history stored or joining > node > >> is clear then it'll receive shapshot of distributed metastorage so > there > >> won't be inconsistencies. > >> > >> Baseline auto-adjust: > >> > >> Main scenario: > >> - There is grid with the baseline is equal to the current > topology > >
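Vladimir's single-parameter idea quoted above (0 - immediate auto adjustment, negative - disabled, positive - adjust after timeout) can be sketched as a few predicates over one long value. This is an illustrative sketch of the proposal only; the actual draft implementation in the thread uses the two separate parameters.

```java
// Hypothetical encoding of baseline auto-adjust settings in a single long:
// v < 0  -> feature disabled
// v == 0 -> immediate auto adjustment
// v > 0  -> auto adjustment after v milliseconds
class BaselineAutoAdjust {
    static boolean isEnabled(long v)   { return v >= 0; }
    static boolean isImmediate(long v) { return v == 0; }

    static long timeoutMs(long v) {
        if (v < 0)
            throw new IllegalStateException("auto-adjust is disabled");
        return v;
    }
}
```

One value removes the "what was the timeout before it was disabled?" problem only partially — as Anton notes, re-enabling still requires remembering the previous positive value.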
Re: Baseline auto-adjust`s discuss
Hi Anton,

This is a great feature, but I am a bit confused about the automatic disabling of the feature during manual baseline adjustment. This may lead to unpleasant situations when a user enabled auto-adjustment, then re-adjusted it manually somehow (e.g. from some previously created script) so that the auto-adjustment disabling went unnoticed, then added more nodes hoping that auto-baseline is still active, etc.

Instead, I would rather make manual and auto adjustment mutually exclusive - the baseline cannot be adjusted manually when auto mode is set, and vice versa. If an exception is thrown in those cases, administrators will always know the current behavior of the system.

As far as configuration, wouldn't it be enough to have a single long value as opposed to Boolean + long? Say, 0 - immediate auto adjustment, negative - disabled, positive - auto adjustment after timeout.

Thoughts?

Thu, 24 Jan 2019 at 18:33, Anton Kalashnikov :
>
> Hello, Igniters!
>
> Work on Phase II of IEP-4 (Baseline topology) [1] has started. I want to start a discussion of the implementation of "Baseline auto-adjust" [2].
>
> The "Baseline auto-adjust" feature implements a mechanism for auto-adjusting the baseline to the current topology after a join/left event happens. It is required because when a node leaves the grid and nobody changes the baseline manually, it can lead to lost data (when more nodes leave the grid, depending on the backup factor), but permanent tracking of the grid is not always possible/desirable. It looks like in many cases auto-adjusting the baseline after some timeout is very helpful.
>
> Distributed metastore [3] (it is already done):
>
> First of all, the ability to store configuration data consistently and cluster-wide is required. Ignite doesn't have any specific API for such configurations and we don't want to have many similar implementations of the same feature in our code.
> After some thought it was proposed to implement it as some kind of distributed metastorage that gives the ability to store any data in it.
> The first implementation is based on the existing local metastorage API for persistent clusters (in-memory clusters will store data in memory). Write/remove operations use Discovery SPI to send updates to the cluster; it guarantees updates order and the fact that all existing (alive) nodes have handled the update message. As a way to find out which node has the latest data there is a "version" value of distributed metastorage, which is basically . The whole updates history until some point in the past is stored along with the data, so when an outdated node connects to the cluster it will receive all the missing data and apply it locally. If there's not enough history stored, or the joining node is clean, then it'll receive a snapshot of distributed metastorage, so there won't be inconsistencies.
>
> Baseline auto-adjust:
>
> Main scenario:
> - There is a grid with the baseline equal to the current topology
> - A new node joins the grid or some node leaves (fails) the grid
> - The new mechanism detects this event and adds a task for changing the baseline to a queue, with the configured timeout
> - If a new event happens before the baseline is changed, the task is removed from the queue and a new task is added
> - When the timeout expires, the task tries to set a new baseline corresponding to the current topology
>
> First of all we need to add two parameters [4]:
> - baselineAutoAdjustEnabled - enable/disable the "Baseline auto-adjust" feature.
> - baselineAutoAdjustTimeout - timeout after which the baseline should be changed.
>
> These parameters are cluster-wide and can be changed in real time because they are based on the "Distributed metastore". The first time, these parameters are initialized from the corresponding parameters (initBaselineAutoAdjustEnabled, initBaselineAutoAdjustTimeout) from the Ignite configuration.
> The init value is valid only until the first change; after the value is changed, it is stored in the "Distributed metastore".
>
> Restrictions:
> - This mechanism handles events only on an active grid
> - If baselineNodes != gridNodes on activation, this feature is disabled
> - If lost partitions were detected, this feature is disabled
> - If the baseline was adjusted manually while baselineNodes != gridNodes, this feature is disabled
>
> You can find a draft implementation here [5]. Feel free to ask for more details and make suggestions.
>
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches
> [2] https://issues.apache.org/jira/browse/IGNITE-8571
> [3] https://issues.apache.org/jira/browse/IGNITE-10640
> [4] https://issues.apache.org/jira/browse/IGNITE-8573
> [5] https://github.com/apache/ignite/pull/5907
>
> --
> Best regards,
> Anton Kalashnikov
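The main scenario above — every join/left event replaces the pending task, and the baseline is adjusted only after the timeout passes with no further events — is a classic debounce. The following is a deterministic toy model of that behavior (class and method names are made up, not the Ignite implementation), with time passed in explicitly so it can be reasoned about without threads.

```java
// Toy debounce model of baseline auto-adjust: each topology event re-arms a
// single deadline; adjustment fires only when no new event arrived in time.
class AutoAdjustScheduler {
    private final long timeoutMs;
    private long deadline = Long.MAX_VALUE; // MAX_VALUE = no pending task

    AutoAdjustScheduler(long timeoutMs) { this.timeoutMs = timeoutMs; }

    /** A node joined or left at the given timestamp: replace the pending task. */
    void onTopologyEvent(long nowMs) { deadline = nowMs + timeoutMs; }

    /** True when the pending baseline adjustment should fire. */
    boolean shouldAdjust(long nowMs) { return nowMs >= deadline; }
}
```

The re-arming step is what prevents a rolling restart (a burst of join/left events) from triggering a rebalance after every single event.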
Re: SQL View with list of existing indexes
Hi Yuriy,

Please note that the MySQL link is about the SHOW command, which is a different beast. In general I think that the PG approach is better as it allows the user to get a quick overview of the index content without complex JOINs. I would start with a plain single view and add a columns view later if we find it useful. As far as view columns:
1) I would add both cache ID/name and cache group ID/name
2) Number of columns does not look like useful info to me
3) Query parallelism is related to the cache, not the index, so it should be in the IGNITE.TABLES view instead
4) Inline size is a definitely useful metric. Not sure about approximate inline size
5) I would add the list of columns in plain comma-separated form with ASC/DESC modifiers

Thoughts?

Vladimir.

On Thu, Jan 24, 2019 at 3:52 PM Юрий wrote:
> Hi Igniters,
>
> As part of IEP-29: SQL management and monitoring
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> I'm going to implement an SQL view with the list of existing indexes.
> I've investigated how it is exposed by ORACLE, MySQL and Postgres.
> ORACLE -
> https://docs.oracle.com/en/database/oracle/oracle-database/18/refrn/ALL_INDEXES.html#GUID-E39825BA-70AC-45D8-AF30-C7FF561373B6
> MySQL - https://dev.mysql.com/doc/refman/8.0/en/show-index.html
> Postgres - https://www.postgresql.org/docs/11/view-pg-indexes.html ,
> https://www.postgresql.org/docs/11/catalog-pg-index.html
>
> All vendors have such views, which show at least the following information:
> schema name - Name of the schema related to the table and index.
> table name - Name of the table related to an index.
> index name - Name of the index.
> list of columns - All columns included into an index and their order.
> collation - ASC or DESC sort for each column.
>
> + much vendor-specific information which differs from vendor to vendor.
>
> In our case such specific information could be at least:
>
> 1. Owning cache ID - not sure, but may be useful to join with our other views.
> 2. number of columns in the index - just to know how many results should be in the columns view
> 3. query parallelism - a configuration parameter showing how many threads can be used to execute a query.
> 4. inline size - the inline size used for this index.
> 5. is affinity - boolean parameter showing that this is an affinity key index
> 6. is pk - boolean parameter showing that this is a PK index
> 7. approx recommended inline size - dynamically calculated recommended inline size for this index, showing the size required to keep the whole indexed columns inlined.
>
> All vendors have different ways to present information about index columns:
> PG - uses an array of indexed table columns and a second array with the collation of each column.
> MySQL - each row in the index view contains information about one of the indexed columns with its position in the index. So for one index there are many rows.
> ORACLE - uses a separate view where each row presents a column included into an index with all required information, and it can be joined by schema, table and index names.
> ORACLE indexed columns view -
> https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_1064.htm#i1577532
> MySql -
>
> I propose to use the ORACLE way and have a second view to represent columns included into indexes.
>
> In this case such view can have the following information:
> schema name - Name of the schema related to the table and index.
> table name - Name of the table related to an index.
> index name - Name of the index.
> column name - Name of the column included into the index.
> column type - Type of the column.
> column position - Position of the column within the index.
> collation - Whether the column is sorted descending or ascending.
>
> It can be joined with the index view through the schema, table and index names.
>
> What do you think about such approach and the list of columns which could be included into the views?
>
> --
> Live with a smile! :D
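Vladimir's point 5 — a plain comma-separated column list with ASC/DESC modifiers in the single-view variant — amounts to simple string rendering. A minimal sketch (the output format and column names are illustrative, not the final view schema):

```java
import java.util.Map;
import java.util.stream.Collectors;

// Renders an ordered map of {column -> isAscending} as the proposed
// comma-separated list, e.g. "LAST_NAME ASC, SALARY DESC".
class IndexColumns {
    static String render(Map<String, Boolean> colsAsc) {
        return colsAsc.entrySet().stream()
            .map(e -> e.getKey() + (e.getValue() ? " ASC" : " DESC"))
            .collect(Collectors.joining(", "));
    }
}
```

As Юрий notes, this flat form is easy to eyeball but cannot be joined or filtered per column — that is exactly what the separate columns view would add later.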
[jira] [Created] (IGNITE-11057) Document new SQL system view "CACHE_GROUPS_IO"
Vladimir Ozerov created IGNITE-11057: Summary: Document new SQL system view "CACHE_GROUPS_IO" Key: IGNITE-11057 URL: https://issues.apache.org/jira/browse/IGNITE-11057 Project: Ignite Issue Type: Task Components: documentation, sql Reporter: Vladimir Ozerov Fix For: 2.8 See {{modules\indexing\src\main\java\org\apache\ignite\internal\processors\query\h2\sys\view\SqlSystemViewCacheGroupsIOStatistics.java}}
# {{GROUP_ID}} - cache group ID
# {{GROUP_NAME}} - cache group name
# {{PHYSICAL_READS}} - number of physical reads (i.e. blocks read from disk) for the given group
# {{LOGICAL_READS}} - number of logical reads (i.e. from the buffer cache) for the given group.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11042) Document new SQL system view "TABLES"
Vladimir Ozerov created IGNITE-11042: Summary: Document new SQL system view "TABLES" Key: IGNITE-11042 URL: https://issues.apache.org/jira/browse/IGNITE-11042 Project: Ignite Issue Type: Task Components: documentation, sql Reporter: Vladimir Ozerov Fix For: 2.8 See {{modules\indexing\src\main\java\org\apache\ignite\internal\processors\query\h2\sys\view\SqlSystemViewTables.java}} for the list of columns. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Continuous queries and duplicates
Hi Piotr,

Unfortunately I do not have an answer to the question about ordering guarantees during node crashes for the same affinity key. Hopefully some other Ignite experts will be able to help. But in any case I doubt we will be able to have a public guarantee on the same affinity key, as opposed to the current approach (the key itself).

Vladimir.

On Fri, Jan 11, 2019 at 5:24 PM Piotr Romański wrote:
> Hi Vladimir, thank you for your response. I tested the current behaviour and it seems that the order is maintained for notifications within a partition. Unfortunately, I don't know how it would behave in exceptional situations like losing partitions, rebalancing etc. Do you think it would be possible to make that ordering guarantee a part of the Ignite API? What I would really need is to have order for notifications sharing the same affinity key, not even a partition. So I think it wouldn't require any cross-node ordering.
>
> Thank you,
> Piotr
>
> Wed, 9 Jan 2019, 21:11, Vladimir Ozerov wrote:
> > Hi,
> >
> > MVCC caches have the same ordering guarantees as non-MVCC caches, i.e. two subsequent updates on a single key will be delivered in the proper order. There are no guarantees beyond that. The order of updates in two subsequent transactions affecting the same partition may be guaranteed with the current implementation (though I am not sure), but even if it is so, I am not aware that this was ever our design goal. Most likely, this is an implementation artifact which may be changed in the future. Cache experts are needed to clarify this.
> >
> > As far as MVCC, data anomalies are still possible in the current implementation, because we didn't rework initial query handling in the first iteration, because technically this is not as simple as we thought. Once a snapshot is obtained, a query over that snapshot will return a data set consistent at some point in time.
> > But the problem is that there is a time frame between snapshot acquisition and listener installation (or vice versa), which leads to either duplicates or lost entries. Some multi-step listener installation will be required here. We haven't designed it yet.
> >
> > Vladimir.
> >
> > On Mon, Dec 24, 2018 at 10:06 PM Denis Magda wrote:
> > > > In my case, values are immutable - I never change them, I just add a new entry for newer versions. Does it mean that I won't have any duplicates between the initial query and listener entries when using continuous queries on caches supporting MVCC?
> > >
> > > I'm afraid there still might be a race. Val, Vladimir, other Ignite experts, please confirm.
> > >
> > > > After reading the related thread (
> > > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> > > > )
> > > > I'm now concerned about the ordering. My case assumes that there are groups of entries which belong to a business aggregate object, and I would like to make sure that if I commit two records in two serial transactions then I get notifications in the same order. Those entries will have different keys, so based on what you said ("we'd better to leave things as is and guarantee only per-key ordering"), it would seem that the order is not guaranteed. But do you think it would be possible to guarantee order when those entries share the same affinity key and they belong to the same partition?
> > >
> > > The order should be the same for key-value transactions. Vladimir, could you clarify the MVCC-based behavior?
> > >
> > > --
> > > Denis
> > >
> > > On Mon, Dec 17, 2018 at 9:55 AM Piotr Romański < piotr.roman...@gmail.com > wrote:
> > > > Hi all, sorry for answering so late.
> > > > I would like to use SqlQuery because I can leverage indexes there.
> > > >
> > > > As it was already mentioned earlier, the partition update counter is exposed through CacheQueryEntryEvent. Initially, I thought that the partition update counter is something that's persisted together with
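Piotr's observation — that per-affinity-key ordering should follow from per-partition ordering — rests on one property: keys sharing an affinity key always map to the same partition. The following toy model (the modulo mapping is a simplification, not Ignite's real affinity function) illustrates that property.

```java
// Toy model: the partition of a cache key depends only on its affinity key,
// so two entries of one business aggregate always land in one partition.
class AffinityModel {
    static final class Key {
        final long id;
        final String affKey; // e.g. the id of the business aggregate

        Key(long id, String affKey) { this.id = id; this.affKey = affKey; }
    }

    /** Simplified stand-in for an affinity function: uses affKey only. */
    static int partition(Key k, int parts) {
        return Math.floorMod(k.affKey.hashCode(), parts);
    }
}
```

If per-partition delivery order were a public guarantee, per-affinity-key order would follow for free; as the thread notes, today it is only an implementation artifact.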
Re: CompactFooter for ClientBinaryMarshaller
It's hard to believe that compact footers are not supported, as it was one of critical performance optimizations we implemented more than 4 years ago :-) If it is really so, we should prioritize the fix. On Tue, Jan 22, 2019 at 3:28 PM Igor Sapego wrote: > Roman, > > I've filed a ticket for C++: [1] > > [1] - https://issues.apache.org/jira/browse/IGNITE-11027 > > Best Regards, > Igor > > > On Tue, Jan 22, 2019 at 12:55 PM Roman Shtykh > wrote: > > > Igor, I see. How about having a warning if `BinaryConfiguration` is not > > provided explicitly to at least raise attention? And creating a JIRA > issue > > for C++ clients -- after it resolves we can probably switch it to cluster > > default. > > > > -- > > Roman Shtykh > > > > On Monday, January 21, 2019, 7:04:30 p.m. GMT+9, Igor Sapego < > > isap...@apache.org> wrote: > > > > I believe, it was set to false by default as it was kind of experimental > > optimisation. > > Also, I've checked right now and it seems that C++ clients (thick and > > thin)do not yet support compact footers. It may also be a blocker to set > > compactfooters to true by default. > > Best Regards,Igor > > > > On Sat, Jan 19, 2019 at 6:52 AM Roman Shtykh > > wrote: > > > > Thank you for the explanation. But here is the problem is not exactly > with > > deserialization but with that a user-defined key is being marshalled to a > > binary object with the compact footer set to true, while the key for > > putting has the footer set to false (which is server default). Thus we > have > > a different thing for the key when we try to retrieve and getting null. > > Therefore, I suppose switching client to server defaults is what has to > be > > done. If the user decides to switch to full schema mode, at least he/she > > will be aware of it. And for deserialization, the schema will be > retrieved, > > as you explained. What do you think? > > > > -- Roman > > On Friday, January 18, 2019, 10:52:11 p.m. 
GMT+9, Vladimir Ozerov < > > voze...@gridgain.com> wrote: > > > > "Compact footer" is optimization which saves a lot of space. Object > > serialized in this form do not have the full information required for > > deserialization. Metadata necessary for deserialization (aka "schema") is > > located on cluster nodes. For this client it could be requested through > > special command. Pleass see ClientOperation.GET_BINARY_TYPE as a starting > > point. > > On Fri, Jan 18, 2019 at 1:32 PM Igor Sapego wrote: > > > > I'm not sure, that such a change should be done in minor release, maybe > in > > 3.0 > > Vova, what do you think? It was you, who designed and developed compact > > footer, right? > > Best Regards,Igor > > > > On Fri, Jan 18, 2019 at 4:20 AM Roman Shtykh > > wrote: > > > > > I believe it has something to do with backward compatibility.That's > what > > I would like to know.If there's no strong reason to set it to false, it > > should be as Ignite's default -- that's what a user would expect. And if > > the user changes the configuration at the cluster, he/she will be aware > of > > that and change it at thin client.If we cannot set it to Ignite's > default, > > we can add a log message saying we force it to false. > > > > -- > > Roman > > > > > > On Thursday, January 17, 2019, 7:11:05 p.m. GMT+9, Igor Sapego < > > isap...@apache.org> wrote: > > > > First of all, I do not like that thin client is silently returns null. > It > > should be fixed. > > For the compact footer being set to false by default - I believe it has > > something to do withbackward compatibility. > > Best Regards,Igor > > > > On Thu, Jan 17, 2019 at 7:37 AM Roman Shtykh > > wrote: > > > > Igniters, > > After putting some data with a user-defined key with a thick client, it's > > impossible to retrieve it with a thin client. 
> > https://issues.apache.org/jira/browse/IGNITE-10960 (I was not sure it was > > a bug, so I first reported the issue to the user ML; Mikhail, thanks for > > checking and for the JIRA issue.) > > That happens because for Ignite `compactFooter` is `true` by default, but > > `ClientBinaryMarshaller` forces it to `false` if `BinaryConfiguration` is > > not created explicitly (see ClientBinaryMarshaller#createImpl). > > Any reason to force it to false? I would like to align it with Ignite > > defaults (by setting it to true). > > > > -- Roman
[jira] [Created] (IGNITE-10986) SQL: Drop _VER field support
Vladimir Ozerov created IGNITE-10986: Summary: SQL: Drop _VER field support Key: IGNITE-10986 URL: https://issues.apache.org/jira/browse/IGNITE-10986 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Alexander Lapin Fix For: 2.8 {{_VER}} is an undocumented hidden field that is never used in practice, but profiling shows that it consumes a lot of memory. Let's drop support for this field from all {{GridH2SearchRow}} implementations, as well as from the internal descriptors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10985) SQL: create low-overhead implementation of Row for SELECTs
Vladimir Ozerov created IGNITE-10985: Summary: SQL: create low-overhead implementation of Row for SELECTs Key: IGNITE-10985 URL: https://issues.apache.org/jira/browse/IGNITE-10985 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Alexander Lapin Fix For: 2.8 Currently we use {{GridH2KeyValueRowOnheap}} for both update and search operations. This leads to *huge* memory overhead during {{SELECT}} execution. If you take a closer look at what is inside the row, you will note the following: # It has both a serialized and a deserialized {{GridCacheVersion}}, which is never needed # It has wrapped key and value objects # It has a reference to {{CacheDataRow}}, which is not needed either # It has a {{valCache}} field that is never used in SELECT The goal of this ticket is to create an optimized version of the row which will be created during {{SELECT}} operations only. It should contain only the minimally necessary information: # Key (unwrapped!) # Value (unwrapped!) # Version (unwrapped; we will remove it completely in a separate ticket) It should not contain a reference to {{CacheDataRow}}. There is a chance that we will need some pieces from it (e.g. cache ID and link for caching purposes), but it will definitely be only a small subset of the whole {{CacheDataRowAdapter}} (or, even worse, {{MvccDataRow}}). Entry point: the {{H2Tree.createRowFromLink}} methods. Note that they return {{GridH2Row}}, while their usages need only the much more relaxed {{GridH2SearchRow}}. So let's start with a new row implementation for these methods and then gradually remove all unnecessary stuff from there.
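The slim row described in the ticket can be sketched in plain Java as follows. This is only an illustration: the class and method names ({{H2PlainRow}} and its accessors) are hypothetical, and a real implementation would have to plug into the {{GridH2SearchRow}} hierarchy.

```java
// A minimal sketch of the slim SELECT row described in the ticket.
// Names here (H2PlainRow, key(), value(), version()) are hypothetical.
public class SelectRowSketch {

    /**
     * Slim row: only the unwrapped key, value and version.
     * No CacheDataRow reference, no serialized duplicates, no valCache field.
     */
    static final class H2PlainRow {
        private final Object key; // unwrapped key
        private final Object val; // unwrapped value
        private final long ver;   // version; to be dropped entirely in a follow-up ticket

        H2PlainRow(Object key, Object val, long ver) {
            this.key = key;
            this.val = val;
            this.ver = ver;
        }

        Object key()     { return key; }
        Object value()   { return val; }
        long   version() { return ver; }
    }

    public static void main(String[] args) {
        H2PlainRow row = new H2PlainRow(1, "John", 1L);
        System.out.println(row.key() + "=" + row.value());
    }
}
```

The point of the sketch is simply that per-row state shrinks to three references, instead of the serialized/deserialized version pair, wrapped objects, and the {{CacheDataRow}} back-reference carried by {{GridH2KeyValueRowOnheap}}.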
Re: CompactFooter for ClientBinaryMarshaller
"Compact footer" is an optimization that saves a lot of space. Objects serialized in this form do not have the full information required for deserialization. Metadata necessary for deserialization (aka "schema") is located on cluster nodes. For the thin client it can be requested through a special command. Please see ClientOperation.GET_BINARY_TYPE as a starting point. On Fri, Jan 18, 2019 at 1:32 PM Igor Sapego wrote: > I'm not sure that such a change should be done in a minor release; maybe in > 3.0. > > Vova, what do you think? It was you who designed and developed compact > footer, right? > > Best Regards, > Igor > > > On Fri, Jan 18, 2019 at 4:20 AM Roman Shtykh > wrote: > >> > I believe it has something to do with backward compatibility. That's >> what I would like to know. If there's no strong reason to set it to false, >> it should be the same as Ignite's default -- that's what a user would expect. And if >> the user changes the configuration at the cluster, he/she will be aware of >> that and change it at the thin client. If we cannot set it to Ignite's default, >> we can add a log message saying we force it to false. >> >> -- >> Roman >> >> >> On Thursday, January 17, 2019, 7:11:05 p.m. GMT+9, Igor Sapego < >> isap...@apache.org> wrote: >> >> First of all, I do not like that the thin client silently returns null. >> It should be fixed. >> For the compact footer being set to false by default -- I believe it has >> something to do with backward compatibility. >> Best Regards, >> Igor >> >> >> On Thu, Jan 17, 2019 at 7:37 AM Roman Shtykh >> wrote: >> >> Igniters, >> After putting some data with a user-defined key with a thick client, it's >> impossible to retrieve it with a thin client. 
>> https://issues.apache.org/jira/browse/IGNITE-10960 (I was not sure it was >> a bug, so I first reported the issue to the user ML; Mikhail, thanks for >> checking and for the JIRA issue.) >> That happens because for Ignite `compactFooter` is `true` by default, but >> `ClientBinaryMarshaller` forces it to `false` if `BinaryConfiguration` is >> not created explicitly (see ClientBinaryMarshaller#createImpl). >> Any reason to force it to false? I would like to align it with Ignite >> defaults (by setting it to true). >> >> -- Roman
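For reference, the workaround available to users today, and the alignment Roman proposes making the default, looks roughly like this on the thin-client side. This is a configuration sketch against the Java thin-client API as I understand it (it needs the ignite-core dependency and a running server, and the address is a placeholder):

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.BinaryConfiguration;
import org.apache.ignite.configuration.ClientConfiguration;

public class ThinClientCompactFooter {
    public static void main(String[] args) {
        // Providing BinaryConfiguration explicitly aligns the thin client
        // with the server-side default (compactFooter = true), instead of
        // the 'false' forced by ClientBinaryMarshaller#createImpl when no
        // binary configuration is supplied.
        BinaryConfiguration binCfg = new BinaryConfiguration()
            .setCompactFooter(true);

        ClientConfiguration cfg = new ClientConfiguration()
            .setAddresses("127.0.0.1:10800") // placeholder address
            .setBinaryConfiguration(binCfg);

        try (IgniteClient client = Ignition.startClient(cfg)) {
            // Keys marshalled by this client now use the same footer mode
            // as thick clients, so get() after a thick-client put() works.
        }
    }
}
```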
[jira] [Created] (IGNITE-10971) SQL: Support partition pruning for distributed joins
Vladimir Ozerov created IGNITE-10971: Summary: SQL: Support partition pruning for distributed joins Key: IGNITE-10971 URL: https://issues.apache.org/jira/browse/IGNITE-10971 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 During the IGNITE-10307 implementation it was revealed that distributed joins do not work with partition pruning. We never observed it before because it was impossible to derive partitions from joins. The problem appears as a timeout exception from the reducer due to some timeouts/retries inside the distributed joins logic. Failures can be reproduced as follows: 1) Remove the {{GridSqlQuerySplitter.distributedJoins}} usage which prevents partitions from being derived for the map query. 2) Run any of the following tests and observe that some of the test cases fail with a reducer timeout: {{IgniteSqlSplitterSelfTest}} {{IgniteCacheJoinQueryWithAffinityKeyTest}} {{IgniteCacheDistributedJoinQueryConditionsTest}} {{IgniteCacheCrossCacheJoinRandomTest}} The root cause is unknown, but most likely it is due to some missing messages: some parts of the distributed join engine are not aware of the extracted partitions and await replies from nodes that are not involved. Note that most likely the same problem will appear for queries with distributed joins and explicit partitions ({{SqlFieldsQuery.partitions}}).