Re: AggregateUnionTransposeRule fails when some inputs have unique grouping key
Hi Pavel, Yes, I missed the list, sorry. On Wed, May 19, 2021 at 14:40, Pavel Tupitsyn wrote: > Hi Vladimir, > > Looks like this message is for d...@calcite.apache.org, not > dev@ignite.apache.org, > or am I mistaken? > > On Wed, May 19, 2021 at 2:25 PM Vladimir Ozerov > wrote: > > > Hi, > > > > The AggregateUnionTransposeRule attempts to push the Aggregate below the > > Union. > > > > Before: > > Aggregate[group=$0, agg=SUM($1)] > > Union[all] > > Input1 > > Input2 > > > > After: > > Aggregate[group=$0, agg=SUM($1)] > > Union[all] > > Aggregate[group=$0, agg=SUM($1)] > > Input1 > > Aggregate[group=$0, agg=SUM($1)] > > Input2 > > > > When pushing the Aggregate, the rule checks whether the input is definitively > > unique on the grouping key. If so, the Aggregate is not installed on top > > of that input, assuming that the result would be the same as without the > > aggregate. This causes a type mismatch exception when the aggregation is > > pushed to only some of the inputs: > > Aggregate[group=$0, agg=SUM($1)] > > Union[all] > > Aggregate[group=$0, agg=SUM($1)] > > Input1 > > Input2 > > > > It seems that the uniqueness check should not be in that rule at all, and > > the aggregate should be pushed unconditionally. Motivation: we already have > > AggregateRemoveRule, which removes unnecessary aggregates, so there is no need to > > duplicate this non-trivial logic. > > > > Does the proposal make sense to you? > > > > Regards, > > Vladimir. > > >
AggregateUnionTransposeRule fails when some inputs have unique grouping key
Hi,

The AggregateUnionTransposeRule attempts to push the Aggregate below the Union.

Before:

Aggregate[group=$0, agg=SUM($1)]
  Union[all]
    Input1
    Input2

After:

Aggregate[group=$0, agg=SUM($1)]
  Union[all]
    Aggregate[group=$0, agg=SUM($1)]
      Input1
    Aggregate[group=$0, agg=SUM($1)]
      Input2

When pushing the Aggregate, the rule checks whether the input is definitively unique on the grouping key. If so, the Aggregate is not installed on top of that input, assuming that the result would be the same as without the aggregate. This causes a type mismatch exception when the aggregation is pushed to only some of the inputs:

Aggregate[group=$0, agg=SUM($1)]
  Union[all]
    Aggregate[group=$0, agg=SUM($1)]
      Input1
    Input2

It seems that the uniqueness check should not be in that rule at all, and the aggregate should be pushed unconditionally. Motivation: we already have AggregateRemoveRule, which removes unnecessary aggregates, so there is no need to duplicate this non-trivial logic.

Does the proposal make sense to you?

Regards,
Vladimir.
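The failure mode can be illustrated with a small self-contained sketch. This is not Calcite code — the class and method names below (`UnionTypeMismatch`, `unionInputsCompatible`) and the row-type model are hypothetical — but it shows why installing the Aggregate on only some Union inputs yields branches with incompatible row types:

```java
import java.util.List;

// Hypothetical model (not Calcite classes) of why pushing the Aggregate
// below only some Union inputs breaks row-type validation.
public class UnionTypeMismatch {
    // Row type of a raw input: the scan exposes all of its columns.
    static final List<String> RAW_ROW = List.of("key:INTEGER", "val:INTEGER", "other:VARCHAR");

    // Row type produced by Aggregate[group=$0, agg=SUM($1)]: only the group
    // key and the aggregate survive, so arity and types differ from RAW_ROW.
    static final List<String> AGG_ROW = List.of("key:INTEGER", "sum_val:INTEGER");

    // Union requires all of its inputs to expose identical row types.
    static boolean unionInputsCompatible(List<List<String>> inputs) {
        return inputs.stream().allMatch(row -> row.equals(inputs.get(0)));
    }

    public static void main(String[] args) {
        // Aggregate pushed to both inputs: row types match, plan is valid.
        System.out.println(unionInputsCompatible(List.of(AGG_ROW, AGG_ROW)));
        // Aggregate skipped for one "unique" input: row types diverge,
        // which is the type mismatch described above.
        System.out.println(unionInputsCompatible(List.of(AGG_ROW, RAW_ROW)));
    }
}
```

Pushing the aggregate to every input unconditionally keeps all branches at the same row type by construction, leaving cleanup of trivial aggregates to AggregateRemoveRule.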
Re: [ANNOUNCE] New Committer: Taras Ledkov
Congratulations, Taras! Well deserved! On Tue, May 12, 2020 at 20:27, Ivan Rakov wrote: > Taras, > > Congratulations and welcome! > > On Tue, May 12, 2020 at 8:26 PM Denis Magda wrote: > > > Taras, > > > > Welcome, that was long overdue on our part! Hope to see you soon among > the > > PMC group. > > > > - > > Denis > > > > On Tue, May 12, 2020 at 9:09 AM Dmitriy Pavlov > wrote: > > > > > Hello Ignite Community, > > > > > > The Project Management Committee (PMC) for Apache Ignite has invited > > Taras > > > Ledkov to become a committer and we are pleased to announce that he has > > > accepted. > > > > > > Taras is an Ignite SQL veteran who knows the current Ignite-H2 > > > integration and binary serialization in detail, actively participates in JDBC and > > > thin client protocol development, and is eager to help users on the user > > > list within his area of expertise. > > > > > > Being a committer enables easier contribution to the project since there > is > > > no need to go through the patch submission process. This should enable better > > > productivity. > > > > > > Taras, thank you for all your efforts, congratulations and welcome on > > > board! > > > > > > Best Regards, > > > > > > Dmitriy Pavlov > > > > > > on behalf of Apache Ignite PMC > > >
JSR 381 (Visual Recognition in Java) and Apache Ignite
Hi Alexey, Igniters, Let me introduce you to Heather, Zoran, and Frank. Heather is the Chair of the JCP. Zoran and Frank are the JSR 381 spec leads. They are interested in discussing the upcoming visual recognition specification with the Apache Ignite community, to understand whether the community has any interest in implementing it. Zoran and Frank, please meet Alexey Zinoviev, who is the principal maintainer of the Apache Ignite ML module. I hope he will be able to help you with your questions. Regards, Vladimir.
Re: Adding support for Ignite secondary indexes to Apache Calcite planner
Roman, What I am trying to understand is what advantage of the materialization API you see over the normal optimization process. Does it save optimization time, reduce the memory footprint, or maybe provide better plans? I am asking because I do not see how expressing indexes as materializations fits the classical optimization process. We discussed the Sort <- Scan optimization. Let's consider another example: LogicalSort[a ASC] LogicalJoin Initially, you do not know the implementation of the join, and hence do not know its collation. Then you may execute physical join rules, which produce, say, PhysicalMergeJoin[a ASC]. If you execute the sort implementation rule afterwards, you may easily eliminate the sort, or make it simpler (e.g. remove the local sorting phase), depending on the distribution. In other words, a proper implementation of sorting optimization assumes that you have a kind of SortRemoveRule anyway, irrespective of whether you use materializations or not, because sorting may be injected on top of any operator. With this in mind, the use of materializations doesn't make the planner simpler. Nor does it improve the outcome of the whole optimization process. What is left is either lower CPU or RAM usage. Is this the case? On Wed, Dec 11, 2019 at 18:37, Roman Kondakov wrote: > Vladimir, > > the main advantage of the Phoenix approach I can see is the use of > Calcite's native materializations API. Calcite has advanced support for > materializations [1] and lattices [2]. Since secondary indexes can be > considered as materialized views (it's just a sorted representation of > the same table), we can seamlessly use views to simulate index behavior > for the Calcite planner. > > > [1] https://calcite.apache.org/docs/materialized_views.html > [2] https://calcite.apache.org/docs/lattice.html > > -- > Kind Regards > Roman Kondakov > > > On 11.12.2019 17:11, Vladimir Ozerov wrote: > > Roman, > > > > What is the advantage of the Phoenix approach then? BTW, it looks like > Phoenix > > integration with Calcite never made it to production, did it? > > > > On Tue, Dec 10, 2019 at 19:50, Roman Kondakov wrote: > > > >> Hi Vladimir, > >> > >> from what I understand, Drill does not exploit the collation of indexes. To > >> be precise, it does not exploit index collation in a "natural" way where, > >> say, we have a sorted TableScan and hence do not create a new Sort. > >> Instead, Drill always creates a Sort operator, but if the TableScan can > >> be replaced with an IndexScan, this Sort operator is removed by a > >> dedicated rule. > >> > >> Let's consider an initial operator tree: > >> > >> Project > >> Sort > >> TableScan > >> > >> after applying the rule DbScanToIndexScanPrule this tree will be converted > to: > >> > >> Project > >> Sort > >> IndexScan > >> > >> and finally, after applying DbScanSortRemovalRule we have: > >> > >> Project > >> IndexScan > >> > >> while with the Phoenix approach we would have two equivalent subsets in our > >> planner: > >> > >> Project > >> Sort > >> TableScan > >> > >> and > >> > >> Project > >> IndexScan > >> > >> and most likely the last plan will be chosen as the best one. > >> > >> -- > >> Kind Regards > >> Roman Kondakov > >> > >> > >> On 10.12.2019 17:19, Vladimir Ozerov wrote: > >>> Hi Roman, > >>> > >>> Why do you think that Drill-style will not let you exploit collation? > >>> Collation should be propagated from the index scan in the same way as > in > >>> other sorted operators, such as merge join or streaming aggregate. > >> Provided > >>> that you use the converter-hack (or any alternative solution to trigger > >> parent > >>> re-analysis). > >>> In other words, propagation of collation from Drill-style indexes > should > >> be > >>> no different from other sorted operators. > >>> > >>> Regards, > >>> Vladimir. > >>> > >>> On Tue, Dec 10, 2019 at 16:40, Zhenya Stanilovsky wrote: > >>> > >>>> > >>>> Roman, just a quick remark: Phoenix builds their approach on an > >>>> already existing monolithic HBase architecture; in most cases it's just a > >> stub > >>>> for someone who wants to use secondary indexes with a database with no > >>>>
Re: Adding support for Ignite secondary indexes to Apache Calcite planner
Roman, What is the advantage of the Phoenix approach then? BTW, it looks like Phoenix integration with Calcite never made it to production, did it? On Tue, Dec 10, 2019 at 19:50, Roman Kondakov wrote: > Hi Vladimir, > > from what I understand, Drill does not exploit the collation of indexes. To > be precise, it does not exploit index collation in a "natural" way where, > say, we have a sorted TableScan and hence do not create a new Sort. > Instead, Drill always creates a Sort operator, but if the TableScan can > be replaced with an IndexScan, this Sort operator is removed by a > dedicated rule. > > Let's consider an initial operator tree: > > Project > Sort > TableScan > > after applying the rule DbScanToIndexScanPrule this tree will be converted to: > > Project > Sort > IndexScan > > and finally, after applying DbScanSortRemovalRule we have: > > Project > IndexScan > > while with the Phoenix approach we would have two equivalent subsets in our > planner: > > Project > Sort > TableScan > > and > > Project > IndexScan > > and most likely the last plan will be chosen as the best one. > > -- > Kind Regards > Roman Kondakov > > > On 10.12.2019 17:19, Vladimir Ozerov wrote: > > Hi Roman, > > > > Why do you think that Drill-style will not let you exploit collation? > > Collation should be propagated from the index scan in the same way as in > > other sorted operators, such as merge join or streaming aggregate. > Provided > > that you use the converter-hack (or any alternative solution to trigger > parent > > re-analysis). > > In other words, propagation of collation from Drill-style indexes should > be > > no different from other sorted operators. > > > > Regards, > > Vladimir. > > > > On Tue, Dec 10, 2019 at 16:40, Zhenya Stanilovsky wrote: > > > >> > >> Roman, just a quick remark: Phoenix builds their approach on an > >> already existing monolithic HBase architecture; in most cases it's just a > stub > >> for someone who wants to use secondary indexes with a database with no > >> native support of it.
Don't think it's a good idea here. > >> > >>> > >>> > >>> --- Forwarded message --- > >>> From: "Roman Kondakov" < kondako...@mail.ru.invalid > > >>> To: dev@ignite.apache.org > >>> Cc: > >>> Subject: Adding support for Ignite secondary indexes to Apache Calcite > >>> planner > >>> Date: Tue, 10 Dec 2019 15:55:52 +0300 > >>> > >>> Hi all! > >>> > >>> As you may know, work on integrating the Apache Calcite > >>> query optimizer into the Ignite codebase is being carried out [1],[2]. > >>> > >>> One of a bunch of problems in this integration is the absence of > >>> out-of-the-box support for secondary indexes in Apache Calcite. After > >>> some research I came to the conclusion that this problem has a couple of > >>> workarounds. Let's name them: > >>> 1. Phoenix-style approach - representing secondary indexes as > >>> materialized views, which are natively supported by the Calcite engine [3] > >>> 2. Drill-style approach - pushing filters into the table scans and > >>> choosing an appropriate index for lookups when possible [4] > >>> > >>> Both of these approaches have advantages and disadvantages: > >>> > >>> Phoenix style pros: > >>> - natural way of adding indexes as an alternative source of rows: an index > >>> can be considered as a kind of sorted materialized view. > >>> - possibility of using index sortedness for stream aggregates, > >>> deduplication (DISTINCT operator), merge joins, etc. > >>> - ability to support other types of indexes (i.e. functional indexes). > >>> > >>> Phoenix style cons: > >>> - polluting the optimizer's search space with extra table scans, hence increasing > >>> the planning time. > >>> > >>> Drill style pros: > >>> - easier to implement (although it's questionable). > >>> - search space is not inflated. > >>> > >>> Drill style cons: > >>> - missed opportunity to exploit sortedness. > >>> > >>> A good discussion of both approaches can be found in > >> [5].
> >>> > >>> I made a small sketch [6] in order to demonstrate the applicability of > >>> the Phoenix approach to Ignite. Key d
Re: Adding support for Ignite secondary indexes to Apache Calcite planner
Hi Roman, Why do you think that Drill-style will not let you exploit collation? Collation should be propagated from the index scan in the same way as in other sorted operators, such as merge join or streaming aggregate. Provided that you use the converter-hack (or any alternative solution to trigger parent re-analysis). In other words, propagation of collation from Drill-style indexes should be no different from other sorted operators. Regards, Vladimir. On Tue, Dec 10, 2019 at 16:40, Zhenya Stanilovsky wrote: > > Roman, just a quick remark: Phoenix builds their approach on an > already existing monolithic HBase architecture; in most cases it's just a stub > for someone who wants to use secondary indexes with a database with no > native support of it. Don't think it's a good idea here. > > > > > > >--- Forwarded message --- > >From: "Roman Kondakov" < kondako...@mail.ru.invalid > > >To: dev@ignite.apache.org > >Cc: > >Subject: Adding support for Ignite secondary indexes to Apache Calcite > >planner > >Date: Tue, 10 Dec 2019 15:55:52 +0300 > > > >Hi all! > > > >As you may know, work on integrating the Apache Calcite > >query optimizer into the Ignite codebase is being carried out [1],[2]. > > > >One of a bunch of problems in this integration is the absence of > >out-of-the-box support for secondary indexes in Apache Calcite. After > >some research I came to the conclusion that this problem has a couple of > >workarounds. Let's name them: > >1. Phoenix-style approach - representing secondary indexes as > >materialized views, which are natively supported by the Calcite engine [3] > >2. Drill-style approach - pushing filters into the table scans and > >choosing an appropriate index for lookups when possible [4] > > > >Both of these approaches have advantages and disadvantages: > > > >Phoenix style pros: > >- natural way of adding indexes as an alternative source of rows: an index > >can be considered as a kind of sorted materialized view.
> >- possibility of using index sortedness for stream aggregates, > >deduplication (DISTINCT operator), merge joins, etc. > >- ability to support other types of indexes (i.e. functional indexes). > > > >Phoenix style cons: > >- polluting the optimizer's search space with extra table scans, hence increasing > >the planning time. > > > >Drill style pros: > >- easier to implement (although it's questionable). > >- search space is not inflated. > > > >Drill style cons: > >- missed opportunity to exploit sortedness. > > > >A good discussion of both approaches can be found in [5]. > > > >I made a small sketch [6] in order to demonstrate the applicability of > >the Phoenix approach to Ignite. Key design concepts are: > >1. On creation, indexes are registered as tables in the Calcite schema. This > >step is needed for internal Calcite routines. > >2. On planner initialization we register these indexes as materialized > >views in Calcite's optimizer using the VolcanoPlanner#addMaterialization > >method. > >3. Right before query execution Calcite selects all materialized > >views (indexes) which can potentially be used in the query. > >4. During query optimization, indexes are registered by the planner as > >usual TableScans and hence can be chosen by the optimizer if they have a lower > >cost. > > > >This sketch shows the ability to exploit index sortedness only. So future > >work in this direction should be focused on using indexes for > >fast index lookups. At first glance, FilterableTable and > >FilterTableScanRule are good points to start. We can push the Filter into > >the TableScan and then use FilterableTable for fast index lookups, > >avoiding reading the whole index on the TableScan step and then filtering > >its output on the Filter step. > > > >What do you think?
> > > > > > > >[1] > > > http://apache-ignite-developers.2346864.n4.nabble.com/New-SQL-execution-engine-tt43724.html#none > >[2] > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine > >[3] https://issues.apache.org/jira/browse/PHOENIX-2047 > >[4] https://issues.apache.org/jira/browse/DRILL-6381 > >[5] https://issues.apache.org/jira/browse/DRILL-3929 > >[6] https://github.com/apache/ignite/pull/7115 > > > >
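The Drill-style flow discussed in this thread (the Sort stays in the tree until a dedicated rule proves the scan's collation already satisfies it) can be sketched on a toy operator tree. The classes below are hypothetical stand-ins, not Calcite or Drill APIs:

```java
import java.util.List;

// Toy operator tree (hypothetical, not Calcite/Drill classes) illustrating
// DbScanSortRemovalRule-style logic: a Sort on top of a scan is redundant
// when the scan already produces rows in the required collation.
public class SortRemoval {
    interface Op { List<String> collation(); }

    record IndexScan(String index, List<String> collation) implements Op {}
    record TableScan() implements Op {
        public List<String> collation() { return List.of(); } // unsorted
    }
    record Sort(List<String> keys, Op input) implements Op {
        public List<String> collation() { return keys; }
    }

    // The "rule": if the input's collation is a prefix-match for the sort
    // keys, drop the Sort node and keep the input.
    static Op removeRedundantSort(Op op) {
        if (op instanceof Sort s) {
            List<String> c = s.input().collation();
            if (c.size() >= s.keys().size()
                    && c.subList(0, s.keys().size()).equals(s.keys())) {
                return s.input();
            }
        }
        return op;
    }

    public static void main(String[] args) {
        Op overIndex = new Sort(List.of("a"), new IndexScan("idx_a", List.of("a")));
        Op overTable = new Sort(List.of("a"), new TableScan());
        System.out.println(removeRedundantSort(overIndex)); // Sort removed: IndexScan remains
        System.out.println(removeRedundantSort(overTable)); // Sort kept: TableScan is unsorted
    }
}
```

This is the "dedicated rule" half of the Drill approach; the Phoenix approach instead relies on the planner comparing the costs of the two equivalent subsets.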
Re: [DISCUSS] PMC Chair rotation time
Hi Anton, Thanks for adding me to the list. The Ignite community is very vibrant; it would be an honor for me to help it resonate even further. For example, I was thinking about an Ignite - Hazelcast integration (aka "Hazelnite") which could have brought a lot of fresh technical discussions to the list. But as this may affect the diversity of my life, let me think more about whether I am ready for this role ... Vladimir. On Thu, Oct 24, 2019 at 10:34, Anton Vinogradov wrote: > More candidates: > > - Alexey Zinoviev. > He is not a PMC member, but it's definitely time to change this. > BigData evangelist. > > - Ivan Rakov. > He is not a PMC member, but it's definitely time to change this. > Distributed systems evangelist and great lead. > > - Denis Magda. > Keeping this stable is also an option. > > - Pavel Tupitsyn. > Open-minded, experienced in global communications. > > - Roman Shtykh. > RocketMQ and Ignite committer. > Highly involved in Ignite development and popularization. > > - Vladimir Ozerov. > Ignite and Hazelcast committer. > A man with really strong leadership skills. > > - Yakov Zhdanov. > Ignite's godfather. > > On Thu, Oct 24, 2019 at 10:18 AM Alexey Zinoviev > wrote: > > > Currently we are only discussing candidates here, aren't we? This is not a vote? > > > > Also, we should ask candidates about their plans, shouldn't we? > > > > Please, Denis, remind us of the voting procedure if we have more than 1 > > candidate. > > > > And one more suggestion: maybe we should rotate every year, so it will be > > easy for candidates to plan their work-life balance? > > > > Thanks > > > > On Thu, Oct 24, 2019 at 8:44, Dmitriy Pavlov wrote: > > > > > Hi Igniters, > > > > > > I would be happy to serve this role, but since my current day-job > > projects are > > > not related to Ignite, it may cause some delays in replies. > > > > > > I would give my +1 to both candidates, but I'm concerned about how keeping the PMC > > > Chair inside the company that initially donated the code to the ASF could move the > > > community to new trails. I strongly believe that giving more control > > outside > > > and improving diversity always helps to find new ways of developing > > > solutions. > > > > > > So I, in any case, don't object to Alexey serving this role. But if > > Alexey > > > is more active on the list and if he is not affiliated with GridGain, it > > > would be ++1 from my side. For now, I'm not so sure. > > > > > > Sincerely, > > > Dmitriy Pavlov > > > > > > On Thu, Oct 24, 2019 at 03:39, Nikita Ivanov wrote: > > > > > > > > > > > +1 Alexey Goncharuk > > > > -- > > > > Nikita Ivanov > > > > > > > > > On Oct 21, 2019, at 10:21 PM, Denis Magda > wrote: > > > > > > > > > > Igniters, > > > > > > > > > > It's been almost 3 years since my election as the PMC Chair and I'd > > > like > > > > > the community to give other PMC members an opportunity to serve in > > this > > > > > role. It's healthy to rotate the role more frequently and we're > > already > > > > > due. Though the chair doesn't have formal power, he/she can bring a > > > fresh > > > > > perspective and help to navigate the community via trails not > > > considered > > > > > before. > > > > > > > > > > Please propose candidates, selecting among active PMC members: > > > > > https://ignite.apache.org/community/resources.html#people > > > > > > > > > > > > > > > Denis > > > > > > > > > > > > > > >> On Monday, December 5, 2016, Dmitriy Setrakyan < > > dsetrak...@apache.org > > > > > > > > >> wrote: > > > > >> > > > > >> I haven't forgotten. Just got back from a business trip. Will start > > a > > > > vote > > > > >> this week. > > > > >> > > > > >> On Wed, Nov 23, 2016 at 5:18 PM, Dmitriy Setrakyan < > > > > dsetrak...@apache.org> > > > > >> wrote: > > > > >> > > > > >>> Cos, I will start the vote soon. A bit over occupied with travel > > and > > > > >>> holidays at this moment. > > > > >>> > > > > >>> On Mon, Nov 21, 2016 at
[jira] [Created] (IGNITE-11701) SQL: Reflect in documentation change of system views schema from "IGNITE" to "SYS"
Vladimir Ozerov created IGNITE-11701: Summary: SQL: Reflect in documentation change of system views schema from "IGNITE" to "SYS" Key: IGNITE-11701 URL: https://issues.apache.org/jira/browse/IGNITE-11701 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Assignee: Artem Budnikov Fix For: 2.8 Previously all system views were located in the "IGNITE" schema. Now we have moved them to "SYS" because this is more intuitive and consistent with other database vendors. We need to do two things: # Update the documentation of system views: change the "IGNITE" schema to "SYS" # Add a balloon informing users that before AI 2.8 system views were located in the "IGNITE" schema and that the previous behavior can be forced with the "-DIGNITE_SQL_SYSTEM_SCHEMA_NAME_IGNITE=true" system property. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Thin client: transactions support
protocol limitations that are > >> causing > >> > this. > >> > But I have no idea how to support this in the .NET Thin Client, for > example. > >> > > >> > It is thread-safe and can handle multiple async operations in > parallel. > >> > But with TX support we have to somehow switch to single-threaded mode to > >> > avoid unexpected effects. > >> > > >> > Any ideas? > >> > > >> > > >> > On Mon, Apr 1, 2019 at 6:38 PM Alex Plehanov > > >> > wrote: > >> > > >> > > Dmitriy, thank you! > >> > > > >> > > Guys, I've created the IEP [1] on the wiki, please have a look. > >> > > > >> > > [1] > >> > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support > >> > > > >> > > > >> > > On Thu, Mar 28, 2019 at 14:33, Dmitriy Pavlov wrote: > >> > > > >> > > > Hi, > >> > > > > >> > > > I've added permissions to account plehanov.alex > >> > > > > >> > > > Recently Infra integrated Apache LDAP with Confluence, so it is > >> > possible > >> > > to > >> > > > log in using Apache credentials. Probably we can ask Infra if extra > >> > > > permissions to edit pages should be added for committers. > >> > > > > >> > > > Sincerely, > >> > > > Dmitriy Pavlov > >> > > > > >> > > > On Wed, Mar 27, 2019 at 13:37, Alex Plehanov < plehanov.a...@gmail.com > >> >: > >> > > > > >> > > > > Vladimir, > >> > > > > > >> > > > > About the current tx: ok, then we don't need the tx() method in the > >> interface > >> > > at > >> > > > > all (the user can cache the same transaction info himself). > >> > > > > > >> > > > > About decoupling transactions from threads on the server side: for > >> now, > >> > > > we > >> > > > > can start with the thread-per-connection approach (we can only support > >> one > >> > > > > active transaction per connection, see below, so we need one > >> > additional > >> > > > > dedicated thread for each connection with an active transaction), and > >> > > later > >> > > > > change the server-side internals to process client transactions in any > >> > > server > >> > > > > thread (not dedicated to this connection). This change will not > >> > affect > >> > > > the > >> > > > > thin client protocol, it only affects the server side. > >> > > > > In any case, we can't support concurrent transactions per > >> connection > >> > on > >> > > > > the client side without fundamental changes to the current > >> protocol > >> > > > (a cache > >> > > > > operation isn't bound to a transaction or thread, and the server > >> > doesn't > >> > > > > know which thread on the client side does this cache operation). In > >> my > >> > > > > opinion, if a user wants to use concurrent transactions, he must > >> use > >> > > > > different connections from a connection pool. > >> > > > > > >> > > > > About the semantics of suspend/resume on the client side: it's > >> absolutely > >> > > > > different from the server-side semantics (we don't need to do > >> > > suspend/resume > >> > > > to > >> > > > > pass a transaction between threads on the client side), but it can't be > >> > > > > implemented efficiently without suspend/resume implemented on the > >> > > > server side. > >> > > > > > >> > > > > Can anyone give me permissions to create an IEP on the Apache wiki? > >> > > > > > >> > > > > On Wed, Mar 27, 2019 at 11:59, Vladimir Ozerov < > >> voze...@gridgain.com>: > >> > > > > > >> > > > > > Hi Alex, > >> > > > >
[jira] [Created] (IGNITE-11648) Document SCHEMAS system view
Vladimir Ozerov created IGNITE-11648: Summary: Document SCHEMAS system view Key: IGNITE-11648 URL: https://issues.apache.org/jira/browse/IGNITE-11648 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Assignee: Artem Budnikov Fix For: 2.8 We added "SCHEMAS" system view. It contains only one attribute - "SCHEMA_NAME".
Re: Thin client: transactions support
Hi Alex, My comments were only about the protocol. Getting current info about the transaction should be handled by the client itself; it is not the protocol's concern. The same goes for other APIs and the behavior when another transaction is attempted from the same thread. Putting the protocol aside, transaction support is a complicated matter. I would propose to route this through an IEP and a wide community discussion. We need to review the API and semantics very carefully, taking SUSPEND/RESUME into account. Also, I do not see how we can support client transactions efficiently without decoupling transactions from threads on the server side first. Without it you will need a dedicated server thread for every client transaction, which is slow and may even crash the server. Vladimir. On Wed, Mar 27, 2019 at 11:44 AM Alex Plehanov wrote: > Vladimir, what if we want to get current transaction info (the tx() method)? > > Is the close() method mapped to TX_END(rollback)? > > For example, this code: > > try(tx = txStart()) { > tx.commit(); > } > > Will produce: > TX_START > TX_END(commit) > TX_END(rollback) > > Do I understand you right? > > About xid. There is yet another proposal: use a unique per-connection id > (an integer, a simple counter) to identify the transaction in the > commit/rollback message. The client gets this id from the server with the > transaction info and sends it back to the server when trying to > commit/rollback the transaction. This id is not shown to users. But we can also > pass the real transaction id (xid) from server to client with the transaction info > for diagnostic purposes. > > And one more question: what should we do if the client starts a new > transaction without ending the old one? Should we end the old transaction > implicitly (rollback) or throw an exception to the client? In my opinion, > the first option is better. For example, if we got a previously used > connection from the connection pool, we should not worry about any > uncompleted transaction started by the previous user of this connection. > > On Wed, Mar 27, 2019 at 11:02, Vladimir Ozerov wrote: > > > As for SUSPEND/RESUME/SAVEPOINT - we do not support them yet, and > adding > > them in the future should not conflict with the simple START/END infrastructure. > > > > On Wed, Mar 27, 2019 at 11:00 AM Vladimir Ozerov > > wrote: > > > > > Hi Alex, > > > > > > I am not sure we need 5 commands. Wouldn't it be enough to have only > two? > > > > > > START - accepts optional parameters, returns transaction info > > > END - provides commit flag, returns void > > > > > > Vladimir. > > > > > > On Wed, Mar 27, 2019 at 8:26 AM Alex Plehanov > > > > wrote: > > > > > >> Sergey, yes, close is something like a silent rollback. But we can > > >> also implement this on the client side, just using rollback and ignoring > > >> errors in the response. > > >> > > >> On Wed, Mar 27, 2019 at 00:04, Sergey Kozlov wrote: > > >> > > >> > Nikolay > > >> > > > >> > Do I understand your points correctly: > > >> > > > >> >- close: rollback > > >> >- commit, close: do nothing > > >> >- rollback, close: do what? (I suppose nothing) > > >> > > > >> > Also, do you assume that after commit/rollback we may need to free some > > >> > resources on server node(s), or just on the client-started TX? > > >> > > > >> > > > >> > > > >> > On Tue, Mar 26, 2019 at 10:41 PM Alex Plehanov < > > plehanov.a...@gmail.com > > >> > > > >> > wrote: > > >> > > > >> > > Sergey, we have the close() method in the thick client; its behavior > >> is > > >> > > slightly different from the rollback() method (it should roll back if > the > > >> > > transaction is not committed and do nothing if the transaction is > >> already > > >> > > committed). I think we should support try-with-resources semantics > in > > >> the > > >> > > thin client and OP_TX_CLOSE will be useful here.
> > >> > > > > >> > > Nikolay, suspend/resume doesn't work yet for pessimistic > > transactions. > >> > Also, > >> > > the main goal of the suspend/resume operations is to support passing a transaction > >> > > between threads. In the thin client, the transaction is > > bound > >> to > >&
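For reference, the close() semantics discussed in this thread (silent rollback of an uncompleted transaction, no-op if it was already committed or rolled back) can be modeled with a toy AutoCloseable. This is an illustrative sketch, not the actual Ignite thin-client implementation; the class names are hypothetical:

```java
// Toy model (not the real Ignite thin-client classes) of close() semantics:
// close() rolls back an active transaction and does nothing if the
// transaction was already committed or rolled back.
public class TxCloseSemantics {
    enum State { ACTIVE, COMMITTED, ROLLED_BACK }

    static class ClientTransaction implements AutoCloseable {
        State state = State.ACTIVE;

        public void commit()   { ensureActive(); state = State.COMMITTED; }
        public void rollback() { ensureActive(); state = State.ROLLED_BACK; }

        // try-with-resources calls this: silent rollback of an active tx only.
        @Override public void close() {
            if (state == State.ACTIVE) state = State.ROLLED_BACK;
        }

        private void ensureActive() {
            if (state != State.ACTIVE) throw new IllegalStateException("tx is " + state);
        }
    }

    public static void main(String[] args) {
        ClientTransaction committed = new ClientTransaction();
        try (ClientTransaction tx = committed) {
            tx.commit();                      // explicit commit...
        }                                     // ...so close() is a no-op
        ClientTransaction abandoned = new ClientTransaction();
        try (ClientTransaction tx = abandoned) {
            // no commit: leaving the block rolls the transaction back
        }
        System.out.println(committed.state);  // COMMITTED
        System.out.println(abandoned.state);  // ROLLED_BACK
    }
}
```

With these semantics, close() can also be implemented client-side as "rollback, ignoring errors in the response", which is one of the alternatives raised above.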
Re: Thin client: transactions support
As for SUSPEND/RESUME/SAVEPOINT - we do not support them yet, and adding them in the future should not conflict with the simple START/END infrastructure. On Wed, Mar 27, 2019 at 11:00 AM Vladimir Ozerov wrote: > Hi Alex, > > I am not sure we need 5 commands. Wouldn't it be enough to have only two? > > START - accepts optional parameters, returns transaction info > END - provides commit flag, returns void > > Vladimir. > > On Wed, Mar 27, 2019 at 8:26 AM Alex Plehanov > wrote: > >> Sergey, yes, close is something like a silent rollback. But we can >> also implement this on the client side, just using rollback and ignoring >> errors in the response. >> >> On Wed, Mar 27, 2019 at 00:04, Sergey Kozlov wrote: >> >> > Nikolay >> > >> > Do I understand your points correctly: >> > >> >- close: rollback >> >- commit, close: do nothing >> >- rollback, close: do what? (I suppose nothing) >> > >> > Also, do you assume that after commit/rollback we may need to free some >> > resources on server node(s), or just on the client-started TX? >> > >> > >> > On Tue, Mar 26, 2019 at 10:41 PM Alex Plehanov > > >> > wrote: >> > >> > > Sergey, we have the close() method in the thick client; its behavior is >> > > slightly different from the rollback() method (it should roll back if the >> > > transaction is not committed and do nothing if the transaction is already >> > > committed). I think we should support try-with-resources semantics in the >> > > thin client and OP_TX_CLOSE will be useful here. >> > > >> > > Nikolay, suspend/resume doesn't work yet for pessimistic transactions. >> > Also, >> > > the main goal of the suspend/resume operations is to support passing a transaction >> > > between threads. In the thin client, the transaction is bound to >> > > the client connection, not to a client thread. I think passing a transaction >> > > between different client connections is not a very useful case. >> > > >> > > On Tue, Mar 26, 2019 at 22:17, Nikolay Izhikov wrote: >> > > >> > > > Hello, Alex. >> > > > >> > > > We also have suspend and resume operations. >> > > > I think we should support them. >> > > > >> > > > On Tue, Mar 26, 2019 at 22:07, Sergey Kozlov wrote: >> > > > >> > > > > Hi >> > > > > >> > > > > Looks like I missed something, but why do we need the OP_TX_CLOSE operation? >> > > > > >> > > > > Also I suggest reserving a code for a SAVEPOINT operation, which is very >> > > > useful >> > > > > to understand where a transaction has been rolled back. >> > > > > >> > > > > On Tue, Mar 26, 2019 at 6:07 PM Alex Plehanov < >> > plehanov.a...@gmail.com >> > > > >> > > > > wrote: >> > > > > >> > > > > > Hello Igniters! >> > > > > > >> > > > > > I want to pick up the ticket IGNITE-7369 and add transactions support >> > to >> > > > > > our thin client implementation. >> > > > > > I've looked at our current implementation and have some proposals to >> > > > > > support transactions: >> > > > > > >> > > > > > Add new operations to the thin client protocol: >> > > > > > >> > > > > > OP_TX_GET, 4000, Get current transaction for client connection >> > > > > > OP_TX_START, 4001, Start a new transaction >> > > > > > OP_TX_COMMIT, 4002, Commit transaction >> > > > > > OP_TX_ROLLBACK, 4003, Rollback transaction >> > > > > > OP_TX_CLOSE, 4004, Close transaction >> > > > > > >> > > > > > From the client side (java) new interfaces will be added: >> > > > > > >> > > > > > public interface ClientTransactions { >> > > > > > public ClientTransaction txStart(); >> > > > > > public ClientTransaction txStart(TransactionConcurrency >> > > > concurrency, >> > > > > > TransactionIsolation isolation); >> > > > > > public ClientTransact
ion(); > > > > > > public TransactionConcurrency concurrency(); > > > > > > public long timeout(); > > > > > > public String label(); > > > > > > > > > > > > public void commit(); > > > > > > public void rollback(); > > > > > > public void close(); > > > > > > } > > > > > > > > > > > > From the server side, I think as a first step (while transactions > > > > > > suspend/resume is not fully implemented) we can use the same > > approach > > > > as > > > > > > for JDBC: add a new worker to each ClientRequestHandler and > process > > > > > > requests by this worker if the transaction is started explicitly. > > > > > > ClientRequestHandler is bound to client connection, so there will > > be > > > > 1:1 > > > > > > relation between client connection and thread, which process > > > operations > > > > > in > > > > > > a transaction. > > > > > > > > > > > > Also, there is a couple of issues I want to discuss: > > > > > > > > > > > > We have overloaded method txStart with a different set of > > arguments. > > > > Some > > > > > > of the arguments may be missing. To pass arguments with > OP_TX_START > > > > > > operation we have the next options: > > > > > > * Serialize full set of arguments and use some value for missing > > > > > > arguments. For example -1 for int/long types and null for string > > > type. > > > > We > > > > > > can't use 0 for int/long types since 0 it's a valid value for > > > > > concurrency, > > > > > > isolation and timeout arguments. > > > > > > * Serialize arguments as a collection of property-value pairs > > (like > > > > it's > > > > > > implemented now for CacheConfiguration). In this case only > > explicitly > > > > > > provided arguments will be serialized. > > > > > > Which way is better? The simplest solution is to use the first > > option > > > > > and I > > > > > > want to use it if there were no objections. > > > > > > > > > > > > Do we need transaction id (xid) on the client side? 
> > > > > > If yes, we can pass xid along with OP_TX_COMMIT, OP_TX_ROLLBACK, > > > > > > OP_TX_CLOSE operations back to the server and do additional check > > on > > > > the > > > > > > server side (current transaction id for connection == transaction > > id > > > > > passed > > > > > > from client side). This, perhaps, will protect clients against > some > > > > > errors > > > > > > (for example when client try to commit outdated transaction). But > > > > > > currently, we don't have data type IgniteUuid in thin client > > > protocol. > > > > Do > > > > > > we need to add it too? > > > > > > Also, we can pass xid as a string just to inform the client and > do > > > not > > > > > pass > > > > > > it back to the server with commit/rollback operation. > > > > > > Or not to pass xid at all (.NET thick client works this way as > far > > > as I > > > > > > know). > > > > > > > > > > > > What do you think? > > > > > > > > > > > > ср, 7 мар. 2018 г. в 16:22, Vladimir Ozerov < > voze...@gridgain.com > > >: > > > > > > > > > > > > > We already have transactions support in JDBC driver in TX SQL > > > branch > > > > > > > (ignite-4191). Currently it is implemented through separate > > thread, > > > > > which > > > > > > > is not that efficient. Ideally we need to finish decoupling > > > > > transactions > > > > > > > from threads. But alternatively we can change the logic on how > we > > > > > assign > > > > > > > thread ID to specific transaction and "impersonate" thin client > > > > worker > > > > > > > threads when serving requests from multiple users. > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 6, 2018 at 10:01 PM, Denis Magda < > dma...@apache.org> > > > > > wrote: > > > > > > > > > > > > > > > Here is an original discussion with a reference to the JIRA > > > ticket: > > > > > > > > http://apache-ignite-developers.2346864.n4.nabble. 
> > > > > > > > com/Re-Transaction-operations-using-the-Ignite-Thin-Client- > > > > > > > > Protocol-td25914.html > > > > > > > > > > > > > > > > -- > > > > > > > > Denis > > > > > > > > > > > > > > > > On Tue, Mar 6, 2018 at 9:18 AM, Dmitriy Setrakyan < > > > > > > dsetrak...@apache.org > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi Dmitriy. I don't think we have a design proposal for > > > > transaction > > > > > > > > support > > > > > > > > > in thin clients. Do you mind taking this initiative and > > > creating > > > > an > > > > > > IEP > > > > > > > > on > > > > > > > > > Wiki? > > > > > > > > > > > > > > > > > > D. > > > > > > > > > > > > > > > > > > On Tue, Mar 6, 2018 at 8:46 AM, Dmitriy Govorukhin < > > > > > > > > > dmitriy.govoruk...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > Hi, Igniters. > > > > > > > > > > > > > > > > > > > > I've seen a lot of discussions about thin client and > binary > > > > > > protocol, > > > > > > > > > but I > > > > > > > > > > did not hear anything about transactions support. Do we > > have > > > > some > > > > > > > draft > > > > > > > > > for > > > > > > > > > > this purpose? > > > > > > > > > > > > > > > > > > > > As I understand we have several problems: > > > > > > > > > > > > > > > > > > > >- thread and transaction have hard related (we use > > > > > thread-local > > > > > > > > > variable > > > > > > > > > >and thread name) > > > > > > > > > >- we can process only one transaction at the same time > > in > > > > one > > > > > > > thread > > > > > > > > > (it > > > > > > > > > >mean we need hold thread per client. If connect 100 > thin > > > > > clients > > > > > > > to > > > > > > > > 1 > > > > > > > > > >server node, then need to hold 100 thread on the > server > > > > side) > > > > > > > > > > > > > > > > > > > > Let's discuss how we can implement transactions for the > > thin > > > > > > client. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Sergey Kozlov > > > > > GridGain Systems > > > > > www.gridgain.com > > > > > > > > > > > > > > > > > > -- > > Sergey Kozlov > > GridGain Systems > > www.gridgain.com > > >
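To make the two proposed serialization options for OP_TX_START concrete, here is a toy sketch of the first one (serialize the full argument set, with sentinel values: -1 for numeric arguments and null for the label). The class name, field layout, and null-label encoding are assumptions for illustration only, not the actual Ignite thin client protocol:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative codec for OP_TX_START arguments ("option 1" from the thread).
// Missing numeric arguments are encoded as -1 (0 is a valid concurrency,
// isolation, and timeout value, so it cannot serve as the sentinel); a missing
// label is encoded as length -1. All names and layout here are hypothetical.
class TxStartCodec {
    static final byte NOT_SET = -1;

    static byte[] encode(byte concurrency, byte isolation, long timeout, String label) {
        byte[] labelBytes = label == null ? null : label.getBytes(StandardCharsets.UTF_8);
        int labelLen = labelBytes == null ? 0 : labelBytes.length;
        ByteBuffer buf = ByteBuffer.allocate(1 + 1 + 8 + 4 + labelLen);
        buf.put(concurrency);                           // -1 if not set
        buf.put(isolation);                             // -1 if not set
        buf.putLong(timeout);                           // -1 if not set
        buf.putInt(labelBytes == null ? -1 : labelLen); // length -1 marks a null label
        if (labelBytes != null)
            buf.put(labelBytes);
        return buf.array();
    }

    static String decodeLabel(byte[] msg) {
        ByteBuffer buf = ByteBuffer.wrap(msg);
        buf.get();      // skip concurrency
        buf.get();      // skip isolation
        buf.getLong();  // skip timeout
        int len = buf.getInt();
        if (len < 0)
            return null;
        byte[] b = new byte[len];
        buf.get(b);
        return new String(b, StandardCharsets.UTF_8);
    }
}
```

The upside of this scheme is a fixed request layout with no per-property tags; the downside is that every new argument changes the wire format, which is where the property-value-pair option (as used for CacheConfiguration) would be more flexible.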
[jira] [Created] (IGNITE-11630) Document changes to SQL views
Vladimir Ozerov created IGNITE-11630: Summary: Document changes to SQL views Key: IGNITE-11630 URL: https://issues.apache.org/jira/browse/IGNITE-11630 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Artem Budnikov Fix For: 2.8 The following changes were made to our views. {{CACHE_GROUPS}} # {{ID}} -> {{CACHE_GROUP_ID}} # {{GROUP_NAME}} -> {{CACHE_GROUP_NAME}} {{LOCAL_CACHE_GROUPS_IO}} # {{GROUP_ID}} -> {{CACHE_GROUP_ID}} # {{GROUP_NAME}} -> {{CACHE_GROUP_NAME}} {{CACHES}} # {{NAME}} -> {{CACHE_NAME}} # {{GROUP_ID}} -> {{CACHE_GROUP_ID}} # {{GROUP_NAME}} -> {{CACHE_GROUP_NAME}} {{INDEXES}} # {{GROUP_ID}} -> {{CACHE_GROUP_ID}} # {{GROUP_NAME}} -> {{CACHE_GROUP_NAME}} {{NODES}} # {{ID}} -> {{NODE_ID}} {{TABLES}} # Added {{CACHE_GROUP_ID}} # Added {{CACHE_GROUP_NAME}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11564) SQL: Implement KILL QUERY command
Vladimir Ozerov created IGNITE-11564: Summary: SQL: Implement KILL QUERY command Key: IGNITE-11564 URL: https://issues.apache.org/jira/browse/IGNITE-11564 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 This is an umbrella ticket for {{KILL QUERY}} command implementation. The original description can be found in IGNITE-10161.
Re: Peer review: Victory over Patch Available debt
Hi, This is a tough question, and first of all I'd like to ask participants to keep a cool head. This is a public question and can be discussed on the dev list safely. On the one hand, it is true that a number of patches are not reviewed for a long time, which negatively affects community development. On the other hand, we definitely do not want to sacrifice product quality only because e.g. the responsible component owner was on sick leave or vacation and was not able to review the patch in a timely manner. Some compromise is needed. IMO additional comments in HTC may solve the issue. We should stress that a patch should be committed if and only if the committer is confident in the changes. Confidence comes either from experience (you have worked with the component a lot and know what you are doing) or from a review by the component's expert. But if there is an outdated patch and you are not confident enough, just don't merge. Let it stay in Patch Available as long as needed. In case of lazy consensus we may ask committers to add comments to the ticket explaining why they decided to merge a ticket without an expert's review. This should help us avoid bad commits. Thoughts? On Mon, Mar 18, 2019 at 11:33 AM Anton Vinogradov wrote: > Dmitry, > > Phrase "Code modifications can be approved by silence: by lazy consensus > (72h) after Dev.List announcement." looks unacceptable to me. > > Please roll back the changes and start the discussion at the private list > and never do such updates in the future without the discussion. > > On Fri, Mar 15, 2019 at 8:29 PM Dmitriy Pavlov wrote: > > > Hi Igniters, > > > > sorry for the late reply. Because this process time to time causes > > questions, I decided to add a couple of words to our wiki.
> > > > I've added topics about peer review to HTC > > > > > https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute#HowtoContribute-PeerReviewandLGTM > > > > Actually, it is (more or less) rules of Apache Beam project, as well as > > Apache Training(incubating), as well as our current process + Apache > > policies. > > > > Sincerely, > > Dmitriy Pavlov > > > > > > чт, 16 авг. 2018 г. в 17:46, Yakov Zhdanov : > > > > > Dmitry, > > > > > > I like your suggestion very much! And I want everyone to follow. Let's > > see > > > if it helps. > > > > > > Can I ask everyone who has submitted tickets for review to add a > comment > > > described by Dmitry to each ticket submitted and see if any additional > > > check is still required and fix remaining issues? I believe this should > > > speed up review process very much. > > > > > > --Yakov > > > > > >
[jira] [Created] (IGNITE-11551) SQL: Document LOCAL_SQL_QUERY_HISTORY
Vladimir Ozerov created IGNITE-11551: Summary: SQL: Document LOCAL_SQL_QUERY_HISTORY Key: IGNITE-11551 URL: https://issues.apache.org/jira/browse/IGNITE-11551 Project: Ignite Issue Type: Task Reporter: Vladimir Ozerov Name: {{LOCAL_SQL_QUERY_HISTORY}} Fields: # {{SCHEMA_NAME}} - schema name # {{SQL}} - actual SQL being executed # {{LOCAL}} - whether the query was started with the "local=true" flag # {{EXECUTIONS}} - total number of executions # {{FAILURES}} - number of executions which failed # {{DURATION_MIN}} - minimum duration # {{DURATION_MAX}} - maximum duration # {{LAST_START_TIME}} - start time of the last executed query
[jira] [Created] (IGNITE-11518) SQL: Security checks are skipped on some SELECT paths
Vladimir Ozerov created IGNITE-11518: Summary: SQL: Security checks are skipped on some SELECT paths Key: IGNITE-11518 URL: https://issues.apache.org/jira/browse/IGNITE-11518 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 This is a regression introduced by IGNITE-11227. The security check should be moved from {{executeSelectLocal}} to {{executeSelect0}}.
[jira] [Created] (IGNITE-11517) MVCC: Support one-phase commit
Vladimir Ozerov created IGNITE-11517: Summary: MVCC: Support one-phase commit Key: IGNITE-11517 URL: https://issues.apache.org/jira/browse/IGNITE-11517 Project: Ignite Issue Type: Task Components: mvcc Reporter: Vladimir Ozerov One-phase commit is a critical performance optimization for single-key requests. Our profiling revealed that its absence is one of the key reasons why MVCC caches are much slower than non-MVCC caches. Let's add 1PC support to MVCC.
[jira] [Created] (IGNITE-11516) MVCC management and monitoring
Vladimir Ozerov created IGNITE-11516: Summary: MVCC management and monitoring Key: IGNITE-11516 URL: https://issues.apache.org/jira/browse/IGNITE-11516 Project: Ignite Issue Type: Task Components: mvcc Reporter: Vladimir Ozerov This is an umbrella ticket for MVCC management and monitoring capabilities. This should include (but is not limited to): # Proper cache metrics (standard cache operations, number of stale versions aka "bloat", etc.) # MVCC coordinator metrics (node ID, number of received requests, number of active transactions, current cleanup version, current version, etc.) # Cache events (either standard JCache or something else) # Deadlock detector metrics
[jira] [Created] (IGNITE-11515) MVCC: Make sure that multiple cursors are handled properly for JDBC/ODBC
Vladimir Ozerov created IGNITE-11515: Summary: MVCC: Make sure that multiple cursors are handled properly for JDBC/ODBC Key: IGNITE-11515 URL: https://issues.apache.org/jira/browse/IGNITE-11515 Project: Ignite Issue Type: Bug Components: jdbc, mvcc, odbc Reporter: Vladimir Ozerov Consider the following scenario executed from a JDBC/ODBC driver: 1) Open a transaction 2) Get a cursor for some large SELECT 3) Close the transaction 4) Overwrite some of the not-yet-returned values for the cursor 5) Force vacuum 6) Read the remaining values from the cursor Will we get a correct result? Most probably not, because we close the transaction on commit without consulting any open cursors. Possible solutions: 1) Extend the transaction lifetime until all cursors are closed 2) Close the cursors forcibly and throw a proper error message
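The second proposed solution (forcibly closing cursors and failing later reads with a proper error) can be sketched on the cursor side as follows. This is a toy model, not Ignite code; the class and method names are invented:

```java
import java.util.Iterator;
import java.util.List;

// Toy cursor that the owning transaction forcibly closes on commit/rollback,
// so a later read fails with a clear error instead of observing vacuumed rows.
class TxBoundCursor<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private boolean closed;

    TxBoundCursor(List<T> rows) {
        this.delegate = rows.iterator();
    }

    /** Hypothetical hook invoked by the transaction when it completes. */
    void forceClose() {
        closed = true;
    }

    @Override public boolean hasNext() {
        return !closed && delegate.hasNext();
    }

    @Override public T next() {
        if (closed)
            throw new IllegalStateException("Cursor is closed: the owning transaction has completed.");
        return delegate.next();
    }
}
```

The alternative (extending the transaction's lifetime until all cursors close) avoids the error entirely but ties vacuum progress to client cursor behavior.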
[jira] [Created] (IGNITE-11514) MVCC: Client listener: do not delegate implicit operation execution to separate thread for JDBC/ODBC
Vladimir Ozerov created IGNITE-11514: Summary: MVCC: Client listener: do not delegate implicit operation execution to separate thread for JDBC/ODBC Key: IGNITE-11514 URL: https://issues.apache.org/jira/browse/IGNITE-11514 Project: Ignite Issue Type: Task Components: jdbc, mvcc, odbc Reporter: Vladimir Ozerov If an implicit operation over MVCC cache(s) is executed from the JDBC/ODBC driver, we always delegate it to a separate thread. But there is no need to do this: once we understand that no active transaction will be left after execution, the query can be executed safely from the normal listener thread.
[jira] [Created] (IGNITE-11513) MVCC: make sure that unsupported features are documented properly
Vladimir Ozerov created IGNITE-11513: Summary: MVCC: make sure that unsupported features are documented properly Key: IGNITE-11513 URL: https://issues.apache.org/jira/browse/IGNITE-11513 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Fix For: 2.8
[jira] [Created] (IGNITE-11511) SQL: Possible bug with parameters passing for complex DML queries
Vladimir Ozerov created IGNITE-11511: Summary: SQL: Possible bug with parameters passing for complex DML queries Key: IGNITE-11511 URL: https://issues.apache.org/jira/browse/IGNITE-11511 Project: Ignite Issue Type: Bug Components: sql Reporter: Vladimir Ozerov Assignee: Pavel Kuznetsov Fix For: 2.8 See the methods {{IgniteH2Indexing.executeSelectLocal}} and {{IgniteH2Indexing.executeSelectForDml}}. They both can be invoked for {{SELECT}} statements extracted from DML. But notice how parameters are passed: it seems that we may pass the parameters from the DML statement unchanged, which is illegal. E.g. consider the following DML: {code} UPDATE table SET x=? WHERE x=? {code} In this case the SELECT statement should get only the second parameter. Need to create tests to confirm that this is the case and make fixes if necessary.
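The expected fix can be illustrated with a tiny helper. It assumes, purely for illustration, that SET-clause placeholders precede WHERE-clause placeholders in the statement's parameter order, so the SELECT extracted from the DML should receive only the tail of the parameter array:

```java
import java.util.Arrays;

// Hypothetical helper (not Ignite code): given the full DML parameter array
// and the number of parameters consumed by the SET clause, return only the
// parameters that belong to the extracted SELECT (i.e. the WHERE clause).
class DmlParams {
    static Object[] selectParams(Object[] dmlParams, int setClauseParamCnt) {
        return Arrays.copyOfRange(dmlParams, setClauseParamCnt, dmlParams.length);
    }
}
```

For `UPDATE table SET x=? WHERE x=?` invoked with parameters `[10, 5]`, the derived SELECT should see only `[5]`.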
[jira] [Created] (IGNITE-11510) SQL: Rework running queries tests to make them stable to internal code changes
Vladimir Ozerov created IGNITE-11510: Summary: SQL: Rework running queries tests to make them stable to internal code changes Key: IGNITE-11510 URL: https://issues.apache.org/jira/browse/IGNITE-11510 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov See {{RunningQueriesTest}}. It hacks into {{IgniteH2Indexing.querySqlFields}}. This is not resilient to internal code changes. We need to make sure that the whole test uses as few hacks as possible. E.g. we can hook into the running queries manager instead of indexing. Several DML tests are muted due to changes introduced in IGNITE-11227.
Re: Ignite 2.8 Release: Time & Scope & Release manager
Dmitry, “Master is always releasable” is a myth, let’s do not be naive. We develop complex product. Many features are being developed in iterations. Many features are developed by different contributors and have to be aligned with each other after merge. And in the end all this should be tested and benchmarked before becoming a product. None serious products are “releasable” from master in a classical “release” sense. Nightly builds are not releases. чт, 7 марта 2019 г. в 20:31, Dmitriy Pavlov : > Vova it is not cool I have to remind Ignite veterans about How to > contribute page, it says the master is release ready branch. > > Otherwise feature is developed in its own branch. > > So my vote goes for master-based release. > > чт, 7 мар. 2019 г. в 20:28, Vladimir Ozerov : > > > Igniters, > > > > Making release from master is not an option. We have a lot of > not-yet-ready > > and not-yet-tested features. From SQL side this is partition pruning and > > SQL views with KILL command. > > > > So if we do not want to release a mess, then there are only two options: > > release Java 11 fixes on top of 2.7, or make normal release in about > 1.5-2 > > month with proper feature freeze process and testing. > > > > Vladimir. > > > > чт, 7 марта 2019 г. в 20:10, Ilya Kasnacheev >: > > > > > Hello! > > > > > > Then please fast-forward review and merge > > > https://issues.apache.org/jira/browse/IGNITE-11299 because it breaks > SSL > > > on > > > Windows under Java 11. > > > > > > Anything else that needs to be merged before release is branched? > > > > > > Regards, > > > -- > > > Ilya Kasnacheev > > > > > > > > > чт, 7 мар. 2019 г. в 20:07, Nikolay Izhikov : > > > > > > > +1 > > > > > > > > чт, 7 марта 2019 г., 20:00 Denis Magda : > > > > > > > > > Igniters, > > > > > > > > > > How about releasing Ignite 2.8 from the master - creating the > release > > > > > branch on Monday-Tuesday, as fast as we can? 
Don't want us to delay > > > with > > > > > Java 11 improvements, they are really helpful from the usability > > > > > standpoint. > > > > > > > > > > After this release, let's introduce a practice of maintenance > > releases > > > > > 2.8.x. Those who are working on any improvements and won't merge > them > > > to > > > > > the release branch on Monday-Tuesday will be able to roll out in a > > > point > > > > > release like 2.8.1 slightly later. > > > > > > > > > > - > > > > > Denis > > > > > > > > > > > > > > > On Thu, Mar 7, 2019 at 6:22 AM Dmitriy Pavlov > > > > wrote: > > > > > > > > > > > Hi Ignite Developers, > > > > > > > > > > > > In the separate topic, we've touched the question of next release > > of > > > > > Apache > > > > > > Ignite. > > > > > > > > > > > > The main reason for the release is Java 11 support, modularity > > > changes > > > > > > (actually we have a couple of this kind of fixes). Unfortunately, > > > full > > > > > > modularity support is impossible without 3.0 because package > > > > refactoring > > > > > is > > > > > > breaking change in some cases. > > > > > > > > > > > > But I clearly remember that in 2.7 thread we've also discussed > that > > > the > > > > > > next release will contain step 1 of services redesign, - > discovery > > > > > protocol > > > > > > usage for services redeploy. > > > > > > > > > > > > We have 2 alternative options for releasing 2.8; > > > > > > > > > > > > A. (in a small way): 2.7-based branch with particular commits > > > > > cherry-picked > > > > > > into it. It is analog of emergency release but without really > > > > emergency. > > > > > > Since we don't release our new modules we have more time to make > it > > > > > modular > > > > > > for 2.9 and make Ignite fully modules compliant in 3.0 > > > > > > > > > > > > B. 
(in large) And, it is a full release based on master, it will > > > > include > > > > > > new hibernate version, ignite-compress, ignite-services, and all > > > other > > > > > > changes we have. Once it is published we will not be able to > change > > > > > > something. > > > > > > > > > > > > Please share your vision, and please stand up if you want to lead > > > this > > > > > > release (as release manager). > > > > > > > > > > > > Sincerely, > > > > > > Dmitriy Pavlov > > > > > > > > > > > > > > > > > > > > >
Re: Ignite 2.8 Release: Time & Scope & Release manager
Igniters, Making release from master is not an option. We have a lot of not-yet-ready and not-yet-tested features. From SQL side this is partition pruning and SQL views with KILL command. So if we do not want to release a mess, then there are only two options: release Java 11 fixes on top of 2.7, or make normal release in about 1.5-2 month with proper feature freeze process and testing. Vladimir. чт, 7 марта 2019 г. в 20:10, Ilya Kasnacheev : > Hello! > > Then please fast-forward review and merge > https://issues.apache.org/jira/browse/IGNITE-11299 because it breaks SSL > on > Windows under Java 11. > > Anything else that needs to be merged before release is branched? > > Regards, > -- > Ilya Kasnacheev > > > чт, 7 мар. 2019 г. в 20:07, Nikolay Izhikov : > > > +1 > > > > чт, 7 марта 2019 г., 20:00 Denis Magda : > > > > > Igniters, > > > > > > How about releasing Ignite 2.8 from the master - creating the release > > > branch on Monday-Tuesday, as fast as we can? Don't want us to delay > with > > > Java 11 improvements, they are really helpful from the usability > > > standpoint. > > > > > > After this release, let's introduce a practice of maintenance releases > > > 2.8.x. Those who are working on any improvements and won't merge them > to > > > the release branch on Monday-Tuesday will be able to roll out in a > point > > > release like 2.8.1 slightly later. > > > > > > - > > > Denis > > > > > > > > > On Thu, Mar 7, 2019 at 6:22 AM Dmitriy Pavlov > > wrote: > > > > > > > Hi Ignite Developers, > > > > > > > > In the separate topic, we've touched the question of next release of > > > Apache > > > > Ignite. > > > > > > > > The main reason for the release is Java 11 support, modularity > changes > > > > (actually we have a couple of this kind of fixes). Unfortunately, > full > > > > modularity support is impossible without 3.0 because package > > refactoring > > > is > > > > breaking change in some cases. 
> > > > > > > > But I clearly remember that in 2.7 thread we've also discussed that > the > > > > next release will contain step 1 of services redesign, - discovery > > > protocol > > > > usage for services redeploy. > > > > > > > > We have 2 alternative options for releasing 2.8; > > > > > > > > A. (in a small way): 2.7-based branch with particular commits > > > cherry-picked > > > > into it. It is analog of emergency release but without really > > emergency. > > > > Since we don't release our new modules we have more time to make it > > > modular > > > > for 2.9 and make Ignite fully modules compliant in 3.0 > > > > > > > > B. (in large) And, it is a full release based on master, it will > > include > > > > new hibernate version, ignite-compress, ignite-services, and all > other > > > > changes we have. Once it is published we will not be able to change > > > > something. > > > > > > > > Please share your vision, and please stand up if you want to lead > this > > > > release (as release manager). > > > > > > > > Sincerely, > > > > Dmitriy Pavlov > > > > > > > > > >
[jira] [Created] (IGNITE-11499) SQL: DML should not use batches by default
Vladimir Ozerov created IGNITE-11499: Summary: SQL: DML should not use batches by default Key: IGNITE-11499 URL: https://issues.apache.org/jira/browse/IGNITE-11499 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 Currently DML applies updates in batches of size equal to {{SqlFieldsQuery.pageSize}}. This is prone to deadlocks. Instead, we should apply updates one-by-one by default. Proposal: # Introduce a {{SqlFieldsQuery.updateBatchSize}} property, set it to {{1}} by default # Print a warning about possible deadlocks to the log if it is greater than 1 # Add it to the JDBC and ODBC drivers
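The proposed behavior can be sketched as follows; the batching helper and its name are illustrative, not Ignite API, with `updateBatchSize` semantics assumed from the ticket text:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposal: split fetched DML rows into update batches of
// updateBatchSize entries, with a default of 1 (row-by-row) to avoid
// deadlocks with concurrent cache operations.
class UpdateBatcher {
    static <T> List<List<T>> batches(List<T> rows, int updateBatchSize) {
        if (updateBatchSize < 1)
            throw new IllegalArgumentException("updateBatchSize must be >= 1");
        List<List<T>> res = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += updateBatchSize)
            res.add(rows.subList(i, Math.min(i + updateBatchSize, rows.size())));
        return res;
    }
}
```

With the default of 1, each row is applied independently, so no batch can hold locks on multiple entries at once.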
[jira] [Created] (IGNITE-11498) SQL: Rework DML data distribution logic
Vladimir Ozerov created IGNITE-11498: Summary: SQL: Rework DML data distribution logic Key: IGNITE-11498 URL: https://issues.apache.org/jira/browse/IGNITE-11498 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 The current DML implementation has a number of problems: 1) We fetch the whole data set to the originator's node. There is a "skipDmlOnReducer" flag to avoid this in some cases, but it is still in an experimental state and is not enabled by default 2) Updates are deadlock-prone: we update entries in batches of size equal to {{SqlFieldsQuery.pageSize}}, so we can easily deadlock with concurrent cache operations 3) We have very strange retry logic. It is not clear why it is needed in the first place, provided that DML is not transactional and no guarantees are needed. Proposal: # Implement proper routing logic: if a request can be executed on data nodes, bypassing the reducer, do this. Otherwise fetch all data to the reducer. This decision should be made in exactly the same way as for MVCC (see {{GridNearTxQueryEnlistFuture}} as a starting point) # Distribute updates to primary data nodes in batches, but apply them one by one, similar to the data streamer with {{allowOverwrite=false}}. Do not do any partition state or {{AffinityTopologyVersion}} checks, since DML is not transactional. Return and aggregate update counts back. # Remove or at least rethink the retry logic. Why do we need it in the first place?
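The routing step of the proposal boils down to grouping pending updates by the primary node that owns each key. A toy sketch with a stand-in affinity function — Ignite's real affinity (rendezvous hashing, backups, topology versions) is much more involved, and every name here is illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of routing updates to the data nodes that own them: group
// keys by the primary node of each key's partition, so batches can be sent to
// data nodes instead of being applied on the reducer. The affinity function
// (partition = key mod partitions, node = partition mod nodes) is a stand-in.
class UpdateRouter {
    static Map<Integer, List<Integer>> groupByPrimaryNode(List<Integer> keys, int partitions, int nodes) {
        Map<Integer, List<Integer>> byNode = new HashMap<>();
        for (int key : keys) {
            int part = Math.floorMod(key, partitions); // hypothetical affinity
            int node = part % nodes;                   // hypothetical primary mapping
            byNode.computeIfAbsent(node, n -> new ArrayList<>()).add(key);
        }
        return byNode;
    }
}
```

Each resulting per-node batch would then be applied one entry at a time on its primary node, with the update counts aggregated back on the originator.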
Re: Batch updates in Ignite B+ tree.
Hi Pavel, As far as I know batch tree updates are already being developed. Alex, could you please elaborate? On Tue, Mar 5, 2019 at 5:05 PM Pavel Pereslegin wrote: > Hi Igniters! > > I am working on implementing batch updates in PageMemory [1] to > improve the performance of preloader, datastreamer and putAll. > > This task consists of two major related improvements: > 1. Batch writing to PageMemory via FreeList - store several values at > once to single memory page. > 2. Batch updates in BPlusTree (for introducing invokeAll operation). > > I started to investigate the issue with batch updates in B+ tree, and > it seems that the concurrent top-down balancing algorithm (TD) > described in this paper [2] may be suitable for batch insertion of > keys into Ignite B+ Tree. > This algorithm uses a top-down balancing approach and allows to insert > a batch of keys belonging to the leaves having the same parent. The > negative point of top-down balancing approach is that the parent node > is locked when performing insertion/splitting in child nodes. > > WDYT? Do you know other approaches for implementing batch updates in > Ignite B+ Tree? > > [1] https://issues.apache.org/jira/browse/IGNITE-7935 > [2] > https://aaltodoc.aalto.fi/bitstream/handle/123456789/2168/isbn9512258951.pdf >
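The preparatory step behind batch insertion — sorting the incoming keys and grouping consecutive ones by target leaf, so each leaf (and its shared parent) is visited once per batch rather than once per key — can be illustrated with a toy grouping function. Real B+-tree traversal, latching, and splits are not modeled here:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration of grouping a batch of keys by destination leaf. A leaf is
// identified by its position in an array of sorted, exclusive upper bounds of
// each leaf's key range; the last leaf absorbs everything beyond the bounds.
class BatchGrouper {
    /** leafUpperBounds: sorted exclusive upper bound of each leaf's key range. */
    static Map<Integer, List<Integer>> groupByLeaf(int[] leafUpperBounds, List<Integer> keys) {
        List<Integer> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        Map<Integer, List<Integer>> byLeaf = new TreeMap<>();
        for (int key : sorted) {
            int leaf = 0;
            while (leaf < leafUpperBounds.length - 1 && key >= leafUpperBounds[leaf])
                leaf++;
            byLeaf.computeIfAbsent(leaf, l -> new ArrayList<>()).add(key);
        }
        return byLeaf;
    }
}
```

In the top-down scheme from the paper, each such per-leaf group would be inserted under a single latch on the shared parent, which is exactly the trade-off Pavel notes: fewer traversals at the cost of holding the parent locked during child splits.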
Re: Storing short/empty strings in Ignite
Hi Val, I would say that we do not need the string length at all, because it can be derived from the object footer (next field offset MINUS current field offset). It is not a very good idea to implement the proposed change in Apache Ignite 2.x because it is a breaking change and will add unnecessary complexity to the already very complex binary infrastructure. Instead, it is better to review the binary format in 3.0 and remove lengths not only from Strings, but from other variable-length data types as well (arrays, decimals). On Tue, Mar 5, 2019 at 10:12 AM Valentin Kulichenko < valentin.kuliche...@gmail.com> wrote: > Hey folks, > > While working with Ignite users, I keep seeing data models where a single > object (row) might contain many fields (100, 200, more...), and most of > them are strings. > > Correct me if I'm wrong, but per my understanding, for every such field we > store an integer value to represent its length. This is significant > overhead - with 200 fields we spend 800 bytes only for this. > > Now here is the catch: vast majority of those strings are actually empty or > very short (several chars), therefore we don't really need 4 bytes to their > length. > > My suggestions is to introduce another data type, e.g. STRING_SHORT, use it > for all strings that are 255 chars or less, and therefore use a single byte > to encode length. We can go even further, and also introduce STRING_EMPTY, > which obviously doesn't need any length information at all. > > What do you guys think? > > -Val >
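A back-of-the-envelope sketch of Val's proposal; the type codes and layout are invented for illustration and do not reflect Ignite's actual binary format:

```java
import java.nio.charset.StandardCharsets;

// Sketch of the proposed encoding: a dedicated type code for empty strings
// (no length stored at all), a single length byte for strings of up to 255
// encoded bytes, and the regular 4-byte length otherwise.
class ShortStringCodec {
    static final byte TYPE_STRING_EMPTY = 1; // type byte only
    static final byte TYPE_STRING_SHORT = 2; // type + 1-byte length + data
    static final byte TYPE_STRING       = 3; // type + 4-byte length + data

    /** Total serialized size of a string under the sketched scheme. */
    static int encodedSize(String s) {
        int utf8Len = s.getBytes(StandardCharsets.UTF_8).length;
        if (utf8Len == 0)
            return 1;
        if (utf8Len <= 255)
            return 1 + 1 + utf8Len;
        return 1 + 4 + utf8Len;
    }
}
```

With 200 mostly-empty string fields this drops the per-field length overhead from 4 bytes to 0-1 bytes, which is the saving estimated in the thread; Vladimir's counter-proposal would go further and drop the stored length entirely by deriving it from adjacent field offsets in the object footer.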
Re: Please re-commit 3 last changes in the master
Looks like everything is good now - all three commits were returned. On Mon, Mar 4, 2019 at 2:04 PM Dmitriy Pavlov wrote: > Thanks, Ivan, these commits are in sync in GitHub & GitBox. Only one commit > remained, Vladimir O., please chime in > > пн, 4 мар. 2019 г. в 14:03, Ivan Rakov : > > > Thanks for keeping track of it, I've re-applied the following commits: > > > > IGNITE-11199 Add extra logging for client-server connections in TCP > > discovery - Fixes #6048. Andrey Kalinin* 04.03.2019 2:11 > > IGNITE-11322 [USABILITY] Extend Node FAILED message by add consistentId > > if it exist - Fixes #6180. Andrey Kalinin* 04.03.2019 2:03 > > > > Best Regards, > > Ivan Rakov > > > > On 04.03.2019 13:56, Dmitriy Pavlov wrote: > > > Thanks to Alexey Plehanov for noticing and Infra Team for fixing the > > issue: > > > https://issues.apache.org/jira/browse/INFRA-17950 > > > > > > пн, 4 мар. 2019 г. в 13:53, Dmitriy Pavlov : > > > > > >> Hi Developers, > > >> > > >> Because of the sync issue, the following 3 commits were lost. > > >> > > >> Please re-apply it to the master. > > >> > > >> > > > https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=b26bbb29d5fdd9d4de5187042778ebe3b8c6c42e > > >> > > >> > > >> > > > https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=6c562a997c0beb3a3cd9dd2976e016759a808f0c > > >> > > >> > > >> > > > https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=45c4dc98e0eac33cccd2e24acb3e9882f098cad1 > > >> > > >> > > >> Sorry for the inconvenience. > > >> > > >> Sincerely, > > >> Dmitriy Pavlov > > >> > > >
Re: SQL: INSERT with hidden columns _key, _val and check the type of input objects
I do not think this should be deferred, even though it changes default behavior. Clean and simple semantics is much more important. In this regards DML was created incorrectly in the first place. We will fix it, leaving hidden fallback mode for those users who use this strange semantics. ср, 27 февр. 2019 г. в 12:57, Ilya Kasnacheev : > Hello! > > > UPDATE table SET _VAL=? WHERE ... // Disallow > > Breaking change and as such should be deferred to 3.0. > > All of our tables have types, so we can disallow doing _VAL=? where > parameter object is not of table's type, and semantics break down here - > you INSERT object in cache, get "1" rows updated but can't select this row > from table. > But we probably should not disallow _VAL=? where parameter object IS of > table's type, since there may be users whose workflow depends on that and > it isn't fixable easily. > > For example, they can have objects of which only subset of fields is > indexed, the rest is not. Then they are inserting them via SQL as shown. > > Regards, > -- > Ilya Kasnacheev > > > ср, 27 февр. 2019 г. в 12:10, Vladimir Ozerov : > > > Hi Taras, > > > > As far as your original question :-) I would say that user should have > only > > one way to update data with DML - through plain attributes. That is, if > we > > have a composite value with attributes "a" and "b", then we should: > > UPDATE table SET a=?, b=? WHERE ... // Allow > > UPDATE table SET _VAL=? WHERE ... // Disallow > > > > But if the value is an attribute itself (e.g. in case of primitive), then > > DML should be allowed on it for sure: > > UPDATE table SET _VAL=? WHERE ... // Allow > > > > What do you think? 
> > > > On Sat, Feb 23, 2019 at 6:50 PM Denis Magda wrote: > > > > > Vladimir, > > > > > > Ok, agreed, let's not boil the ocean...at least for now ;) > > > > > > -- > > > Denis Magda > > > > > > > > > On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov > > > > wrote: > > > > > > > Denis, > > > > > > > > Yes, this is what my answer was about - you cannot have SQL without > > > > defining fields in advance. Because it breaks a lot of standard SQL > > > > invariants and virtually makes the whole language unusable. For > > instance, > > > > think of product behavior in the following cases: > > > > 1) User queries an empty cache with a query "SELECT a FROM table" - > > what > > > > should happen - exception or empty result? How would I know whether > > field > > > > "a" will appear in future? > > > > 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I > > > > understand whether it is possible or not to add a column without > strict > > > > schema? > > > > 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will > > add > > > an > > > > object with field "c" after that? > > > > 4) User connects to Ignite from Tableau and navigates through schema > - > > > what > > > > should be shown? > > > > > > > > That is, you cannot have SQL without schema because it is at the very > > > heart > > > > of the technology. But you can have schema-less noSQL database. > > > > > > > > Let's do not invent a hybrid with tons of corner cases and separate > > > > learning curve. It should be enough just to rethink and simplify our > > > > configuration - reshape QueryEntity, deprecate all SQL annotations, > > allow > > > > only one table per cache, allow to define SQL script to be executed > on > > > > cache start or so. > > > > > > > > As far as schemaless - it is viable approach for sure, but should be > > > > considered either outside of SQL (e.g. 
a kind of predicate/criteria > API > > > > which can be merged with ScanQuery) or as a special datatype in SQL > > > > ecosystem (like is is done with JSON in many RDBMS databases). > > > > > > > > Vladimir. > > > > > > > > > > > > > > > > > > > > On Fri, Feb 22, 2019 at 11:01 PM Denis Magda > > wrote: > > > > > > > > > Vladimir, > > > > > > > > > > That's understood. I'm just thinking of a use case different from > the > > > DDL >
Re: SQL: INSERT with hidden columns _key, _val and check the type of input objects
Hi Taras, As far as your original question :-) I would say that user should have only one way to update data with DML - through plain attributes. That is, if we have a composite value with attributes "a" and "b", then we should: UPDATE table SET a=?, b=? WHERE ... // Allow UPDATE table SET _VAL=? WHERE ... // Disallow But if the value is an attribute itself (e.g. in case of primitive), then DML should be allowed on it for sure: UPDATE table SET _VAL=? WHERE ... // Allow What do you think? On Sat, Feb 23, 2019 at 6:50 PM Denis Magda wrote: > Vladimir, > > Ok, agreed, let's not boil the ocean...at least for now ;) > > -- > Denis Magda > > > On Sat, Feb 23, 2019 at 12:50 AM Vladimir Ozerov > wrote: > > > Denis, > > > > Yes, this is what my answer was about - you cannot have SQL without > > defining fields in advance. Because it breaks a lot of standard SQL > > invariants and virtually makes the whole language unusable. For instance, > > think of product behavior in the following cases: > > 1) User queries an empty cache with a query "SELECT a FROM table" - what > > should happen - exception or empty result? How would I know whether field > > "a" will appear in future? > > 2) User executed a command "ALTER TABLE ... ADD COLUMN b" - how can I > > understand whether it is possible or not to add a column without strict > > schema? > > 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if user will add > an > > object with field "c" after that? > > 4) User connects to Ignite from Tableau and navigates through schema - > what > > should be shown? > > > > That is, you cannot have SQL without schema because it is at the very > heart > > of the technology. But you can have schema-less noSQL database. > > > > Let's do not invent a hybrid with tons of corner cases and separate > > learning curve. 
It should be enough just to rethink and simplify our > > configuration - reshape QueryEntity, deprecate all SQL annotations, allow > > only one table per cache, allow to define SQL script to be executed on > > cache start or so. > > > > As far as schemaless - it is viable approach for sure, but should be > > considered either outside of SQL (e.g. a kind of predicate/criteria API > > which can be merged with ScanQuery) or as a special datatype in SQL > > ecosystem (like is is done with JSON in many RDBMS databases). > > > > Vladimir. > > > > > > > > > > On Fri, Feb 22, 2019 at 11:01 PM Denis Magda wrote: > > > > > Vladimir, > > > > > > That's understood. I'm just thinking of a use case different from the > DDL > > > approach where the schema is defined initially. Let's say that someone > > > configured caches with CacheConfiguration and now puts an Object in the > > > cache. For that person, it would be helpful to skip the Annotations or > > > QueryEntities approaches for queryable fields definitions (not even > > > indexes). For instance, the person might simply query some fields with > > the > > > primary index in the WHERE clause and this shouldn't require any extra > > > settings. Yes, it's clear that it might be extremely challenging to > > support > > > but imagine how usable the API could become if we can get rid of > > > Annotations and QueryEntities. > > > > > > Basically, my idea is that all of the objects and their fields stored > in > > > the caches should be visible to SQL w/o extra settings. If someone > wants > > to > > > create indexes then use DDL which was designed for this. > > > > > > > > > - > > > Denis > > > > > > > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov > > > wrote: > > > > > > > Denis, > > > > > > > > SQL is a language with strict schema what was one of significant > > factors > > > of > > > > it's worldwide success. 
I doubt we will ever have SQL without > > > > configuration/definiton, because otherwise it will be not SQL, but > > > > something else (e.g. document-oriented, JSON, whatever). > > > > > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda > wrote: > > > > > > > > > Folks, > > > > > > > > > > Do we want to preserve the annotation-based configuration? There > are > > > too > > > > > many ways to configure SQL indexes/fields. > > > > >
[jira] [Created] (IGNITE-11422) Remove H2 console from documentation
Vladimir Ozerov created IGNITE-11422: Summary: Remove H2 console from documentation Key: IGNITE-11422 URL: https://issues.apache.org/jira/browse/IGNITE-11422 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Assignee: Artem Budnikov H2 console was deprecated as a part of IGNITE-11333. Need to remove all mentions of "H2 console" from documentation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11418) Document SQL IGNITE.INDEXES view
Vladimir Ozerov created IGNITE-11418: Summary: Document SQL IGNITE.INDEXES view Key: IGNITE-11418 URL: https://issues.apache.org/jira/browse/IGNITE-11418 Project: Ignite Issue Type: Task Components: documentation Reporter: Vladimir Ozerov Assignee: Artem Budnikov A new {{IGNITE.INDEXES}} view was added, which displays indexes in dedicated columns.
[jira] [Created] (IGNITE-11404) Document CREATE TABLE "parallelism" option
Vladimir Ozerov created IGNITE-11404: Summary: Document CREATE TABLE "parallelism" option Key: IGNITE-11404 URL: https://issues.apache.org/jira/browse/IGNITE-11404 Project: Ignite Issue Type: Task Components: documentation, sql Reporter: Vladimir Ozerov Assignee: Artem Budnikov Fix For: 2.8 We added a new {{PARALLELISM}} option: {code} CREATE TABLE ... WITH "parallelism = 4" {code} This option affects query parallelism, which is otherwise taken from {{CacheConfiguration.queryParallelism}}.
[jira] [Created] (IGNITE-11402) SQL: Add ability to specify inline size of PK and affinity key indexes from CREATE TABLE and QueryEntity
Vladimir Ozerov created IGNITE-11402: Summary: SQL: Add ability to specify inline size of PK and affinity key indexes from CREATE TABLE and QueryEntity Key: IGNITE-11402 URL: https://issues.apache.org/jira/browse/IGNITE-11402 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 Currently it is not possible to easily set the inline size for automatically created indexes. We need to make sure that the user has a convenient way to set it both programmatically and from DDL.
Re: SQL: INSERT with hidden columns _key, _val and check the type of input objects
Denis, Yes, this is what my answer was about - you cannot have SQL without defining fields in advance, because it breaks a lot of standard SQL invariants and virtually makes the whole language unusable. For instance, think of product behavior in the following cases: 1) A user queries an empty cache with a query "SELECT a FROM table" - what should happen - exception or empty result? How would I know whether field "a" will appear in the future? 2) A user executes a command "ALTER TABLE ... ADD COLUMN b" - how can I understand whether it is possible or not to add a column without a strict schema? 3) "ALTER TABLE ... DROP COLUMN c" - what should happen if a user adds an object with field "c" after that? 4) A user connects to Ignite from Tableau and navigates through the schema - what should be shown? That is, you cannot have SQL without a schema because it is at the very heart of the technology. But you can have a schema-less NoSQL database. Let's not invent a hybrid with tons of corner cases and a separate learning curve. It should be enough just to rethink and simplify our configuration - reshape QueryEntity, deprecate all SQL annotations, allow only one table per cache, allow defining an SQL script to be executed on cache start, and so on. As for schemaless - it is a viable approach for sure, but it should be considered either outside of SQL (e.g. a kind of predicate/criteria API which can be merged with ScanQuery) or as a special datatype in the SQL ecosystem (like it is done with JSON in many RDBMS databases). Vladimir. On Fri, Feb 22, 2019 at 11:01 PM Denis Magda wrote: > Vladimir, > > That's understood. I'm just thinking of a use case different from the DDL > approach where the schema is defined initially. Let's say that someone > configured caches with CacheConfiguration and now puts an Object in the > cache. For that person, it would be helpful to skip the Annotations or > QueryEntities approaches for queryable fields definitions (not even > indexes). 
For instance, the person might simply query some fields with the > primary index in the WHERE clause and this shouldn't require any extra > settings. Yes, it's clear that it might be extremely challenging to support > but imagine how usable the API could become if we can get rid of > Annotations and QueryEntities. > > Basically, my idea is that all of the objects and their fields stored in > the caches should be visible to SQL w/o extra settings. If someone wants to > create indexes then use DDL which was designed for this. > > > - > Denis > > > On Fri, Feb 22, 2019 at 2:27 AM Vladimir Ozerov > wrote: > > > Denis, > > > > SQL is a language with strict schema what was one of significant factors > of > > it's worldwide success. I doubt we will ever have SQL without > > configuration/definiton, because otherwise it will be not SQL, but > > something else (e.g. document-oriented, JSON, whatever). > > > > On Fri, Feb 22, 2019 at 1:52 AM Denis Magda wrote: > > > > > Folks, > > > > > > Do we want to preserve the annotation-based configuration? There are > too > > > many ways to configure SQL indexes/fields. > > > > > > For instance, if our new SQL API could see and access all of the fields > > > out-of-the-box (without any extra settings) and DDL will be used to > > define > > > indexed fields then that would be a huge usability improvement. > > > > > > - > > > Denis > > > > > > > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov > > wrote: > > > > > > > Hi, > > > > > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > > > > > Ignite doesn't check a type of input objects when hidden columns > _key, > > > > _value is used in a DML statements. > > > > I describe the current behavior for example: > > > > > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > > > Person.class))' > > > > 2. PersonKey type contains 'int id' field. > > > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > > > > > Cases: > > > > 1. 
Invalid value object type: > > > > - Any value object may be passed as a query parameter > > > > - Query is executed without an error and returns '1' (one row > updated); > > > > - There is not inserted row at the 'SELECT * FROM test' results. > > > > - cache.get(key) returns inserted object; > > > > > > > > 2. Invalid key object type: >
Re: SQL: INSERT with hidden columns _key, _val and check the type of input objects
Denis, SQL is a language with a strict schema, which was one of the significant factors of its worldwide success. I doubt we will ever have SQL without configuration/definition, because otherwise it would not be SQL, but something else (e.g. document-oriented, JSON, whatever). On Fri, Feb 22, 2019 at 1:52 AM Denis Magda wrote: > Folks, > > Do we want to preserve the annotation-based configuration? There are too > many ways to configure SQL indexes/fields. > > For instance, if our new SQL API could see and access all of the fields > out-of-the-box (without any extra settings) and DDL will be used to define > indexed fields then that would be a huge usability improvement. > > - > Denis > > > On Thu, Feb 21, 2019 at 5:27 AM Taras Ledkov wrote: > > > Hi, > > > > Lets discuss SQL DML (INSERT/UPDATE) current behavior specific: > > > > Ignite doesn't check a type of input objects when hidden columns _key, > > _value is used in a DML statements. > > I describe the current behavior for example: > > > > 1. Cache configuration: 'setIndexedTypes(PersonKey.class, > Person.class))' > > 2. PersonKey type contains 'int id' field. > > 3. SQL statement: 'INSERT INTO test (_val, _key) VALUES (?, ?)' > > > > Cases: > > 1. Invalid value object type: > > - Any value object may be passed as a query parameter > > - Query is executed without an error and returns '1' (one row updated); > > - There is not inserted row at the 'SELECT * FROM test' results. > > - cache.get(key) returns inserted object; > > > > 2. Invalid key object type: > > 2.1 Non-primitive object is passed and binary representation doesn't > > contain 'id' field. > > - Query is executed without error and returns '1' (one row updated); > > - The inserted row is available by 'SELECT *' and the row contains id = > > null; > > 2.2 Non-primitive object is passed and binary representation contains > > 'id' field. 
> > - The inserted row is available by 'SELECT *' and the row contains > > expected 'id' field; > > - The cache entry cannot be gathered by 'cache.get' operation with the > > corresponding 'PersonKey(id)' (keys differ). > > > > I propose to check the type of the user's input object. > > > > I guess that using _key/_val columns works close to 'cache.put()' but it > > looks like a significant usability issue. > > To confuse the 'PersonKey.class.getName()' and > > 'node.binary().builder("PersonKey")' is a typical mistake of Ignite > > newcomers. > > > > One more argument for the check: SQL INSERT semantics mean the row is > > inserted into the specified TABLE, not into the cache. > > So, throwing IgniteSQLException is the expected behavior in this case, I think. > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-5250 > > > > -- > > Taras Ledkov > > Mail-To: tled...@gridgain.com > > > >
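The strict type check proposed in this thread can be sketched in a few lines. The following Python model is purely illustrative — `Table`, `insert_via_hidden_columns` and the class names are hypothetical stand-ins, not Ignite API: a table knows its configured key/value types, and a DML statement through the hidden `_key`/`_val` columns is rejected, as proposed, when the supplied object's type does not match.

```python
class IgniteSQLException(Exception):
    """Models the error proposed for type-mismatched _key/_val DML."""

class Table:
    def __init__(self, name, key_type, val_type):
        self.name = name
        self.key_type = key_type  # e.g. "PersonKey"
        self.val_type = val_type  # e.g. "Person"

    def insert_via_hidden_columns(self, key, val):
        # Proposed behavior: validate the runtime types of the user's
        # objects against the table's configured key/value types
        # instead of silently performing a cache.put()-like insert.
        if type(key).__name__ != self.key_type:
            raise IgniteSQLException(
                f"Key type mismatch: expected {self.key_type}, "
                f"got {type(key).__name__}")
        if type(val).__name__ != self.val_type:
            raise IgniteSQLException(
                f"Value type mismatch: expected {self.val_type}, "
                f"got {type(val).__name__}")
        return 1  # one row updated

class PersonKey:
    def __init__(self, id):
        self.id = id

class Person:
    def __init__(self, name):
        self.name = name

table = Table("test", "PersonKey", "Person")
assert table.insert_via_hidden_columns(PersonKey(1), Person("a")) == 1
try:
    # Case 1 from the thread: a wrong value object type now fails fast
    # instead of producing an invisible row.
    table.insert_via_hidden_columns(PersonKey(2), "not-a-person")
except IgniteSQLException as e:
    print("rejected:", e)
```

With the check in place, both problem cases from the thread (wrong value type, wrong key type) fail at statement execution rather than leaving a row that `SELECT` cannot see or `cache.get` cannot reach.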
[jira] [Created] (IGNITE-11341) SQL: Enable lazy mode by default
Vladimir Ozerov created IGNITE-11341: Summary: SQL: Enable lazy mode by default Key: IGNITE-11341 URL: https://issues.apache.org/jira/browse/IGNITE-11341 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Taras Ledkov We redesigned lazy mode, so that now it doesn't spawn a new thread and has the same performance as the old "eager" mode (IGNITE-9171). However, we didn't enable it by default because H2 1.4.197 contains several bugs causing query engine slowdown in some cases when lazy mode is set. These issues are resolved in H2 master and will become available as a part of the next release (presumably 1.4.198). We need to enable lazy mode by default once the new version is available (IGNITE-10801).
[jira] [Created] (IGNITE-11340) SQL: Add OOME tests to separate suite
Vladimir Ozerov created IGNITE-11340: Summary: SQL: Add OOME tests to separate suite Key: IGNITE-11340 URL: https://issues.apache.org/jira/browse/IGNITE-11340 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Taras Ledkov Fix For: 2.8 {{IgniteQueryOOMTestSuite}} was added as a part of IGNITE-9171. We need to add this suite to TC and make sure it is executed on a regular basis.
[jira] [Created] (IGNITE-11334) SQL: Deprecate SqlQuery
Vladimir Ozerov created IGNITE-11334: Summary: SQL: Deprecate SqlQuery Key: IGNITE-11334 URL: https://issues.apache.org/jira/browse/IGNITE-11334 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Taras Ledkov This API is very limited compared to {{SqlFieldsQuery}}. Let's deprecate it with proper links to {{SqlFieldsQuery}}. This should include not only deprecation in the public API, but removal from examples as well. A separate ticket for documentation is needed.
[jira] [Created] (IGNITE-11333) SQL: Deprecate H2 console
Vladimir Ozerov created IGNITE-11333: Summary: SQL: Deprecate H2 console Key: IGNITE-11333 URL: https://issues.apache.org/jira/browse/IGNITE-11333 Project: Ignite Issue Type: Task Reporter: Vladimir Ozerov Assignee: Taras Ledkov This functionality is not tested, not supported, and may fail with unexpected errors. This affects user experience. We need to disable it and create a ticket for the relevant documentation update.
[jira] [Created] (IGNITE-11331) SQL: Remove unnecessary parameters binding
Vladimir Ozerov created IGNITE-11331: Summary: SQL: Remove unnecessary parameters binding Key: IGNITE-11331 URL: https://issues.apache.org/jira/browse/IGNITE-11331 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 See usages of {{H2Utils#bindParameters}}. Note that it is used both in SELECT and DML planners without any reason. Let's remove it from there.
[jira] [Created] (IGNITE-11326) SQL: Common parsing logic
Vladimir Ozerov created IGNITE-11326: Summary: SQL: Common parsing logic Key: IGNITE-11326 URL: https://issues.apache.org/jira/browse/IGNITE-11326 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov
[jira] [Created] (IGNITE-11325) SQL: Single place to start missing caches (H2Utils.checkAndStartNotStartedCache)
Vladimir Ozerov created IGNITE-11325: Summary: SQL: Single place to start missing caches (H2Utils.checkAndStartNotStartedCache) Key: IGNITE-11325 URL: https://issues.apache.org/jira/browse/IGNITE-11325 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 We need to start missing caches for the given SELECT/DML statement because we need affinity info during query planning which is only available for started caches. We need to do the following: # Move the method {{H2Utils.checkAndStartNotStartedCache}} to some common place, e.g. parser, so that it has one and only one usage all over the code base # Make sure that this method doesn't produce multiple network hops: missing caches should be started in a single request if possible.
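The second requirement of the ticket — start all missing caches with a single request instead of one network hop per cache — can be sketched as follows. This is a minimal Python illustration with hypothetical names (`start_missing_caches`, `send_start_request`), not actual Ignite internals.

```python
def start_missing_caches(required, started, send_start_request):
    """Collect the caches the statement needs but that are not yet
    started, and start them in one batched request (the single-hop
    behavior the ticket asks for).

    required -- iterable of cache names the query touches
    started -- mutable set of already-started cache names
    send_start_request -- callback taking the whole batch of names
    """
    missing = [c for c in required if c not in started]
    if missing:
        send_start_request(missing)  # one hop for the whole batch
        started.update(missing)
    return missing

requests = []
started = {"cacheA"}
missing = start_missing_caches(["cacheA", "cacheB", "cacheC"],
                               started, requests.append)
assert missing == ["cacheB", "cacheC"]
assert len(requests) == 1  # a single network request, not two
```

The point of the sketch is the batching: however many caches the statement references, at most one start request goes out, and a statement whose caches are all started sends none.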
[jira] [Created] (IGNITE-11317) Document that SQL DML statements (UPDATE/MERGE) cannot update key fields
Vladimir Ozerov created IGNITE-11317: Summary: Document that SQL DML statements (UPDATE/MERGE) cannot update key fields Key: IGNITE-11317 URL: https://issues.apache.org/jira/browse/IGNITE-11317 Project: Ignite Issue Type: Task Components: documentation, sql Reporter: Vladimir Ozerov Assignee: Artem Budnikov This is an architectural limitation which is unlikely to be resolved in the near future.
Re: Binary clients: fallback to the previous versions of the protocol
Hi Dmitriy, It is a very common practice to keep the client protocol compatible with multiple versions of the server. We constantly face this in practice. I do not see any reason to drop or complicate this functionality: the user just connects to the server and we automatically negotiate the best feature set possible. No need to expose it to users somehow. As for development and testing, we are not afraid of challenges and difficulties. Yes, it takes more time, but it is worth it. Vladimir. On Thu, Feb 14, 2019 at 6:28 AM Dmitry Melnichuk < dmitry.melnic...@nobitlost.com> wrote: > Igor, > > I am sorry it took me a while to fully understand your reasoning. > > “Update user software first, then update the server” approach still > looks somewhat weird to me (I think of Let's Encrypt client as an > example of “normal” approach in Python world), but since this approach > is vivid, I just have to take it into account, so I must agree with > you. > > I just want to reiterate on one downside of such multi-protocol client, > that was not yet addressed (not in Jira tasks or in docs, at least). > > Imagine a coder wrote a program with the latest client, using a feature > available only in latest binary protocol. When the coder tests his > program against the latest Ignite cluster, the program works perfectly. > > But then the end user runs the program against the previous version of > the server, which client is still backwards-compatible with, the > program runs, but at some point it tries to use the latest feature of > the binary protocol and fails with some cryptic message. The end user > is clueless, so as the coder. > > To avoid such a case, we must include an explicit parameter in our > client's initialization method, that would set the desired protocol > version(s) the user application is designed to work with. This > parameter should be explicit, i.e. not have a default value, since it > just will be useless the other way. 
And yes, this parameter renders all > the software built with previous client versions incompatible with the > new client. > > I think this problem concerns not only the Python client, but all the > thin clients. What do you think? > > On Wed, 2019-02-13 at 13:45 +0300, Igor Sapego wrote: > > The approach you suggest looks to me pretty much the same as > > installing a new version of client software in C++ or Java. The issue > > here that we break existing installed software and require for user > > to update software in order to have ability to connect to a server. > > Just imagine that application which made with thin client is not used > > by a developer that knows how to use pip and all the stuff, but > > someone with another background. Imagine, you have thousands of such > > users. And now imagine, you want to update your servers. > > > > Best Regards, > > Igor > > > > > > On Tue, Feb 12, 2019 at 8:51 PM Dmitry Melnichuk < > > dmitry.melnic...@nobitlost.com> wrote: > > > > > Igor, > > > > > > Thank you for your explanation. I think the matter begins to clear > > > up > > > for me now. > > > > > > The backward compatibility issue you described can not be applied > > > to > > > Python, because Python applications, unlike Java ones, do not have > > > to > > > be built. They rely on package manager (pip, conda, et c.) to run > > > anywhere, including production. > > > > > > At the stage of deployment, the package manager collects > > > dependencies > > > using a specially crafted response file, often called > > > `requirements.txt`. > > > > > > For example, to ensure that their application will work with the > > > current _and_ future minor versions of pyignite, the user may > > > include a > > > line in their `requirements.txt` file: > > > > > > pyignite < x > > > > > > where x is a next major version number. In compliance with semantic > > > versioning, the line is basically says: “Use the latest available > > > version, that is earlier than x”. 
> > > > > > When upgrading Ignite server, system administrator or devops > > > engineer > > > must also update or recreate the app's environment, or update OS- > > > level > > > packages, or redeploy the app using Docker − the exact procedure > > > may > > > vary, but in any case it should be completely standard − to deliver > > > the > > > latest suitable dependencies. > > > > > > And then the same app connects to a latest Ignite server. > > > > > > Here is more about how pip understands versions: > > > > > > > https://pip.pypa.io/en/stable/reference/pip_install/#requirement-specifiers > > > > > > What we really need to do for this to work seamlessly, is to > > > establish > > > the clear relation between products' versions. Regretfully, I have > > > not > > > done this before; just did not expect for this issue to come up. I > > > think it would be best for pyignite major and minor to be set > > > according > > > to the Ignite binary protocol versions, i.e. pyignite 1.2.z handles > > > Ignite binary protocol v1.2, and so on. But that is another matt
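The `pyignite < x` requirement specifier discussed in this thread follows standard pip semantics: from the available releases, pick the newest one strictly below the bound. A simplified Python sketch of that selection rule (plain numeric versions only, no pre-release or post-release handling):

```python
def parse(version):
    """Turn '1.3.1' into a comparable tuple (1, 3, 1)."""
    return tuple(int(part) for part in version.split("."))

def best_match(available, upper_exclusive):
    """Pick the latest available version strictly below the bound,
    mimicking a 'pyignite < x' requirements.txt line (simplified:
    numeric versions only, no pre-releases)."""
    bound = parse(upper_exclusive)
    candidates = [v for v in available if parse(v) < bound]
    return max(candidates, key=parse) if candidates else None

releases = ["1.2.0", "1.2.3", "1.3.1", "2.0.0"]
# 'pyignite < 2' selects the newest 1.x release and skips 2.0.0:
assert best_match(releases, "2") == "1.3.1"
```

This is why, as Dmitry describes, redeploying the environment after a server upgrade is enough: the package manager re-evaluates the same constraint against the now-larger set of releases and picks the newest compatible one.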
[jira] [Created] (IGNITE-11316) SQL: Support partition pruning for local queries
Vladimir Ozerov created IGNITE-11316: Summary: SQL: Support partition pruning for local queries Key: IGNITE-11316 URL: https://issues.apache.org/jira/browse/IGNITE-11316 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently it is not supported because extraction happens inside the splitter. A local query either: # Does not reach the splitter at all (no-split case) # Reaches the splitter, but skips extraction due to missing infrastructure, which is to be implemented and tested in the scope of the current ticket.
[jira] [Created] (IGNITE-11310) SQL: remove special interaction between query parallelism and distributed joins
Vladimir Ozerov created IGNITE-11310: Summary: SQL: remove special interaction between query parallelism and distributed joins Key: IGNITE-11310 URL: https://issues.apache.org/jira/browse/IGNITE-11310 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently we enable so-called "local distributed joins" when a query is executed locally with parallelism enabled. This behavior is not needed and should be removed.
[jira] [Created] (IGNITE-11304) SQL: Common caching of both local and distributed query metadata
Vladimir Ozerov created IGNITE-11304: Summary: SQL: Common caching of both local and distributed query metadata Key: IGNITE-11304 URL: https://issues.apache.org/jira/browse/IGNITE-11304 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Currently query metadata is only cached for distributed queries. For local queries it is calculated on every request over and over again. Need to cache it always in {{QueryParserResultSelect}}.
[jira] [Created] (IGNITE-11280) SQL: Cache all queries, not only two-step
Vladimir Ozerov created IGNITE-11280: Summary: SQL: Cache all queries, not only two-step Key: IGNITE-11280 URL: https://issues.apache.org/jira/browse/IGNITE-11280 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8
[jira] [Created] (IGNITE-11278) SQL: Extract query parsing into separate class
Vladimir Ozerov created IGNITE-11278: Summary: SQL: Extract query parsing into separate class Key: IGNITE-11278 URL: https://issues.apache.org/jira/browse/IGNITE-11278 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 # Introduce separate command types for SELECT, DML and other commands # Move parsing logic and the query cache to a separate class # Fix a bug with query parallelism where the "distributedQueries" flag is modified not for the newly created query, but globally.
[jira] [Created] (IGNITE-11279) SQL: Remove H2's "prepared" from DML plans
Vladimir Ozerov created IGNITE-11279: Summary: SQL: Remove H2's "prepared" from DML plans Key: IGNITE-11279 URL: https://issues.apache.org/jira/browse/IGNITE-11279 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 Currently it is only used to get the list of participating tables. Instead, we should encapsulate this information into {{ParsingResultDml}}. Streamer methods should use our own parser as well.
[jira] [Created] (IGNITE-11275) SQL: Move all command processing stuff to DDL processor
Vladimir Ozerov created IGNITE-11275: Summary: SQL: Move all command processing stuff to DDL processor Key: IGNITE-11275 URL: https://issues.apache.org/jira/browse/IGNITE-11275 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 If a command is of non-SELECT/non-DML type, it should be encapsulated inside {{ParsingResult}} as a pair of native/H2 commands and passed to a separate processor. This will reduce the complexity of {{IgniteH2Indexing}} significantly, as it will be concerned only with SELECT/DML processing and nothing else.
[jira] [Created] (IGNITE-11274) SQL: Make GridCacheSqlQuery immutable
Vladimir Ozerov created IGNITE-11274: Summary: SQL: Make GridCacheSqlQuery immutable Key: IGNITE-11274 URL: https://issues.apache.org/jira/browse/IGNITE-11274 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov The goal of this ticket is to finally make the two-step plan fully immutable. We already made the first steps in IGNITE-11223; however, the plan's "query" objects are still mutable, which makes plan caching inherently unsafe. # Remove all setters from the message except {{nodeId}}, which is really needed # Make the splitter use another truly immutable object instead of {{GridCacheSqlQuery}} # Copy the splitter's object to {{GridCacheSqlQuery}} during reduce
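The three steps above can be illustrated compactly. In this Python sketch, `SplitterQuery` and `assign_node` are hypothetical stand-ins, not the real classes: the splitter-side query is truly immutable, and filling in the node at reduce time produces a copy, so a cached plan is never mutated and can be shared safely.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class SplitterQuery:
    """Truly immutable splitter-side query (step 2 of the ticket):
    attribute assignment raises, so cached plans need no defensive
    copies before being shared across threads."""
    sql: str
    node_id: str = None  # the one per-execution field kept (step 1)

def assign_node(query, node_id):
    # Step 3: "copying during reduce" is just building a new object
    # with node_id filled in; the cached original stays untouched.
    return replace(query, node_id=node_id)

plan = SplitterQuery(sql="SELECT * FROM t")
per_node = assign_node(plan, "node-42")
assert plan.node_id is None        # cached plan untouched
assert per_node.node_id == "node-42"
```

The design choice is the usual immutability trade: a small allocation per execution buys freedom from the shared-mutable-state bugs that make plan caching "inherently unsafe" today.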
[jira] [Created] (IGNITE-11231) SQL: Remove scan index for merge table
Vladimir Ozerov created IGNITE-11231: Summary: SQL: Remove scan index for merge table Key: IGNITE-11231 URL: https://issues.apache.org/jira/browse/IGNITE-11231 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Reasoning: # No business logic compared to its parent # Duplicated code for cost calculation
[jira] [Created] (IGNITE-11227) SQL: Streamline DML execution logic
Vladimir Ozerov created IGNITE-11227: Summary: SQL: Streamline DML execution logic Key: IGNITE-11227 URL: https://issues.apache.org/jira/browse/IGNITE-11227 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently the DML execution logic is overly complex, with execution flow being transferred back and forth between the indexing and DML processors. We need to simplify it as much as possible.
[jira] [Created] (IGNITE-11226) SQL: Remove GridQueryIndexing.prepareNativeStatement
Vladimir Ozerov created IGNITE-11226: Summary: SQL: Remove GridQueryIndexing.prepareNativeStatement Key: IGNITE-11226 URL: https://issues.apache.org/jira/browse/IGNITE-11226 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov This method is the only leak of H2 internals into the outer code. Close analysis of the code reveals that the only reason we have it is *JDBC metadata*. We need to create a method which prepares metadata for a statement and returns it as a detached object. Most probably we already have all the necessary mechanics; this is mostly a refactoring. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11223) SQL: Merge "collectCacheIds" and "processCaches" methods
Vladimir Ozerov created IGNITE-11223: Summary: SQL: Merge "collectCacheIds" and "processCaches" methods Key: IGNITE-11223 URL: https://issues.apache.org/jira/browse/IGNITE-11223 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Both methods are essentially two pieces of the same process: collect cache IDs for the given query and check the MVCC mode. But because they are separated, we have unnecessary collection copies, "isEmpty" checks and iterations. Given that these methods are on a hot path, let's merge them carefully. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
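A sketch of what merging the two passes could look like (all names are illustrative, not Ignite's actual internals): cache IDs are collected and the MVCC mode is validated in a single iteration, with no intermediate copies or extra "isEmpty" checks.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative merge of the two passes described in the ticket.
final class CacheIdCollector {
    // Hypothetical minimal cache descriptor.
    static final class CacheInfo {
        final int id;
        final boolean mvcc;
        CacheInfo(int id, boolean mvcc) { this.id = id; this.mvcc = mvcc; }
    }

    // Before: collectCacheIds() built a list, then processCaches() iterated
    // it again to check the MVCC mode. After: one loop does both.
    static List<Integer> collectAndValidate(List<CacheInfo> caches) {
        List<Integer> ids = new ArrayList<>(caches.size());
        Boolean mvccMode = null;
        for (CacheInfo c : caches) {
            if (mvccMode == null)
                mvccMode = c.mvcc;                 // first cache fixes the mode
            else if (mvccMode != c.mvcc)
                throw new IllegalStateException("Mixed MVCC and non-MVCC caches");
            ids.add(c.id);
        }
        return ids;
    }
}
```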
[jira] [Created] (IGNITE-11212) SQL: Merge affinity collocation models for partition pruning and distributed joins
Vladimir Ozerov created IGNITE-11212: Summary: SQL: Merge affinity collocation models for partition pruning and distributed joins Key: IGNITE-11212 URL: https://issues.apache.org/jira/browse/IGNITE-11212 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently we have two different tree models for partition pruning and distributed joins. First, this leads to code duplication. Second, they have subtle semantic differences harboring hidden bugs. Let's try to merge them into a single model which is built with the same set of rules. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11211) SQL: Rework connection pool
Vladimir Ozerov created IGNITE-11211: Summary: SQL: Rework connection pool Key: IGNITE-11211 URL: https://issues.apache.org/jira/browse/IGNITE-11211 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently we have a very complex multi-level connection pool. Instead, we could have a single concurrent queue of shared connections that are acquired and released by threads as needed. As an optimization, we may optionally attach connections to thread-local storage. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
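The proposed pool could look roughly like this (a sketch under the ticket's assumptions; `PooledConnection` stands in for a real pooled H2 connection, and the thread-local fast path is the optional optimization mentioned above):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical connection type; in Ignite this would wrap an H2 JDBC connection.
final class PooledConnection {
}

// Single shared concurrent queue, as proposed: threads acquire and release
// connections as needed instead of going through a multi-level pool.
final class SimpleConnectionPool {
    private final Queue<PooledConnection> idle = new ConcurrentLinkedQueue<>();

    // Optional optimization: keep the last released connection attached to
    // the thread so hot paths skip the shared queue entirely.
    private final ThreadLocal<PooledConnection> cached = new ThreadLocal<>();

    PooledConnection acquire() {
        PooledConnection c = cached.get();
        if (c != null) {
            cached.remove();                       // thread-local fast path
            return c;
        }
        c = idle.poll();                           // shared queue
        return c != null ? c : new PooledConnection(); // grow on demand
    }

    void release(PooledConnection c) {
        if (cached.get() == null)
            cached.set(c);                         // keep the fast path warm
        else
            idle.offer(c);                         // fall back to the queue
    }
}
```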
[jira] [Created] (IGNITE-11210) SQL: Introduce common logical execution plan for all query types
Vladimir Ozerov created IGNITE-11210: Summary: SQL: Introduce common logical execution plan for all query types Key: IGNITE-11210 URL: https://issues.apache.org/jira/browse/IGNITE-11210 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov At the moment we have a lot of cached artifacts for the different SQL query types (prepared statements for local queries, two-step queries for distributed queries, update plans for DML). Instead of multiple caches, we need to create a common execution plan for every query which holds both the DML and SELECT parts. Approximate content of such a plan: # Two-step plan # DML plan # Partition pruning data # Possibly even the cached physical node distribution (for reduce queries) for the given {{AffinityTopologyVersion}} # Probably the {{AffinityTopologyVersion}} itself Then we will perform a single plan lookup/build per query execution. In the future we will probably display these plans in SQL views. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11208) SQL: Move reservations from QueryContext to MapQueryResult
Vladimir Ozerov created IGNITE-11208: Summary: SQL: Move reservations from QueryContext to MapQueryResult Key: IGNITE-11208 URL: https://issues.apache.org/jira/browse/IGNITE-11208 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov It is unclear why reservations are handled inside {{QueryContext}}. First, they belong to a specific {{MapQueryResult}}, not to thread-local state. Second, inside the {{QueryContext}} logic they are cleared only for requests with distributed joins. Why? Let's remove this weird logic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11209) SQL: streamline DML execution logic
Vladimir Ozerov created IGNITE-11209: Summary: SQL: streamline DML execution logic Key: IGNITE-11209 URL: https://issues.apache.org/jira/browse/IGNITE-11209 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently, DML execution logic is overly complex, with the execution flow being transferred back and forth between the indexing and DML processors. It needs to be simplified as much as possible. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11207) SQL: Remove MapNodeResults class
Vladimir Ozerov created IGNITE-11207: Summary: SQL: Remove MapNodeResults class Key: IGNITE-11207 URL: https://issues.apache.org/jira/browse/IGNITE-11207 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 This class holds results for a specific node. Let's remove it and refactor the associated code with the following goals in mind: # Performance: one CHM lookup instead of two # Uniformity: move both SELECT and DML under the same {{MapQueryResult}} umbrella -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11206) SQL: Merge execution flow for local and map queries
Vladimir Ozerov created IGNITE-11206: Summary: SQL: Merge execution flow for local and map queries Key: IGNITE-11206 URL: https://issues.apache.org/jira/browse/IGNITE-11206 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Currently MAP and LOCAL queries are executed in completely different ways. This leads to a number of bugs and discrepancies, not to mention obvious code duplication: # Local queries do not reserve partitions # Security checks might be missed for local queries (needs double-checking) # Different event firing logic Let's merge both flows: # Check security and other prerequisites # Reserve partitions # Get a connection # Execute, firing events along the way # Release the connection # Release the partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
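The merged flow listed in the ticket can be sketched as a single method with try/finally release guarantees (all names are illustrative; a real implementation would take the query, topology version, and so on):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative skeleton of the unified MAP/LOCAL flow: the same method runs
// both query kinds, and try/finally guarantees resources are released in
// reverse order of acquisition even if execution fails.
final class UnifiedQueryFlow {
    final List<String> trace = new ArrayList<>(); // records step order

    String execute(boolean local) {
        trace.add("security");           // 1. security and other prerequisites
        trace.add("reserve");            // 2. reserve partitions (now also for LOCAL)
        try {
            trace.add("connect");        // 3. get a connection
            try {
                trace.add("execute");    // 4. execute, firing events along the way
                return local ? "local result" : "map result";
            }
            finally {
                trace.add("disconnect"); // 5. release the connection
            }
        }
        finally {
            trace.add("release");        // 6. release the partitions
        }
    }
}
```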
[jira] [Created] (IGNITE-11203) SQL: global refactoring
Vladimir Ozerov created IGNITE-11203: Summary: SQL: global refactoring Key: IGNITE-11203 URL: https://issues.apache.org/jira/browse/IGNITE-11203 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Over the years, the SQL business logic has become overly complex because we never invested enough time into paying down technical debt. The most prominent features that led to over-complication are: # Distributed joins # Subqueries in the splitter # MVCC # The query cancel feature # DML As a result, it is currently too difficult to add new features to the product: we have to spend a lot of time figuring out what is going on, and we lose a lot to introduced bugs. The general idea of this initiative is to streamline the query execution engine as much as possible. The most important things to consider: # Simplify H2 connection management: simple pooling, avoid exposing connections when possible # Execute MAP and LOCAL queries through the same flow # Avoid zig-zag code flow in the DML logic # Try to merge partition pruning and distributed join cost calculation -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11202) SQL: Move partition reservation logic to separate class
Vladimir Ozerov created IGNITE-11202: Summary: SQL: Move partition reservation logic to separate class Key: IGNITE-11202 URL: https://issues.apache.org/jira/browse/IGNITE-11202 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently the associated logic is located inside {{GridMapQueryExecutor}}. This is wrong, because partitions should be reserved and then released for both local and distributed queries. To allow for a smooth merge of "map" and "local" queries in the future, it is necessary to move this common logic into a separate place which is independent of {{GridMapQueryExecutor}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11200) SQL: query contexts should not be static
Vladimir Ozerov created IGNITE-11200: Summary: SQL: query contexts should not be static Key: IGNITE-11200 URL: https://issues.apache.org/jira/browse/IGNITE-11200 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently query contexts are static and, as a result, overcomplicated. We need to make them instance-bound and remove the static state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Best Effort Affinity for thin clients
Igor, My idea is simply to add the list of caches with the same distribution to the end of partition response. Client can use this information to populate partition info for more caches in a single request. On Mon, Feb 4, 2019 at 3:06 PM Igor Sapego wrote: > Vladimir, > > So correct me if I'm wrong, what you propose is to avoid mentioning > of cache groups, and use instead of "cache group" term something like > "distribution"? Or do you propose some changes in protocol? If so, can > you briefly explain, what kind of changes they are? > > Best Regards, > Igor > > > On Mon, Feb 4, 2019 at 1:13 PM Vladimir Ozerov > wrote: > > > Igor, > > > > Yes, cache groups are public API. However, we try to avoid new APIs > > depending on them. > > The main point from my side is that “similar cache group” can be easily > > generalized to “similar distribution”. This way we avoid cache groups on > > protocol level at virtually no cost. > > > > Vladimir. > > > > пн, 4 февр. 2019 г. в 12:48, Igor Sapego : > > > > > Guys, > > > > > > Can you explain why do we want to avoid Cache groups in protocol? > > > > > > If it's about simplicity of the protocol, then removing cache groups > will > > > not help much with it - we will still need to include "knownCacheIds" > > > field in request and "cachesWithTheSamePartitioning" field in response. > > > And also, since when do Ignite prefers simplicity over performance? > > > > > > If it's about not wanting to show internals of Ignite then it sounds > like > > > a very weak argument to me, since Cache Groups is a public thing [1]. > > > > > > [1] - https://apacheignite.readme.io/docs/cache-groups > > > > > > Best Regards, > > > Igor > > > > > > > > > On Mon, Feb 4, 2019 at 11:47 AM Vladimir Ozerov > > > wrote: > > > > > > > Pavel, Igor, > > > > > > > > This is not very accurate to say that this will not save memory. 
In > > > > practice we observed a number of OOME issues on the server-side due > to > > > many > > > > caches and it was one of motivations for cache groups (another one > disk > > > > access optimizations). On the other hand, I agree that we'd better to > > > avoid > > > > cache groups in the protocol because this is internal implementation > > > detail > > > > which is likely (I hope so) to be changed in future. > > > > > > > > So I have another proposal - let's track caches with the same > affinity > > > > distribution instead. That is, normally most of PARTITIONED caches > will > > > > have very few variants of configuration: it will be Rendezvous > affinity > > > > function, most likely with default partition number and with 1-2 > > backups > > > at > > > > most. So when affinity distribution for specific cache is requested, > we > > > can > > > > append to the response *list of caches with the same distribution*. > > I.e.: > > > > > > > > class AffinityResponse { > > > > Object distribution;// Actual distribution > > > > List cacheIds; // Caches with similar distribution > > > > } > > > > > > > > Makes sense? > > > > > > > > On Sun, Feb 3, 2019 at 8:31 PM Pavel Tupitsyn > > > > wrote: > > > > > > > > > Igor, I have a feeling that we should omit Cache Group stuff from > the > > > > > protocol. > > > > > It is a rare use case and even then dealing with them on client > > barely > > > > > saves some memory. > > > > > > > > > > We can keep it simple and have partition map per cacheId. Thoughts? > > > > > > > > > > On Fri, Feb 1, 2019 at 6:49 PM Igor Sapego > > wrote: > > > > > > > > > > > Guys, I've updated the proposal once again [1], so please, > > > > > > take a look and let me know what you think. > > > > > > > > > > > > [1] - > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+thin+clients > > > > > > > > > > > > Best Regards, > > > > > > Igor > > > &g
Re: Best Effort Affinity for thin clients
Igor, Yes, cache groups are public API. However, we try to avoid new APIs depending on them. The main point from my side is that “similar cache group” can be easily generalized to “similar distribution”. This way we avoid cache groups on protocol level at virtually no cost. Vladimir. пн, 4 февр. 2019 г. в 12:48, Igor Sapego : > Guys, > > Can you explain why do we want to avoid Cache groups in protocol? > > If it's about simplicity of the protocol, then removing cache groups will > not help much with it - we will still need to include "knownCacheIds" > field in request and "cachesWithTheSamePartitioning" field in response. > And also, since when do Ignite prefers simplicity over performance? > > If it's about not wanting to show internals of Ignite then it sounds like > a very weak argument to me, since Cache Groups is a public thing [1]. > > [1] - https://apacheignite.readme.io/docs/cache-groups > > Best Regards, > Igor > > > On Mon, Feb 4, 2019 at 11:47 AM Vladimir Ozerov > wrote: > > > Pavel, Igor, > > > > This is not very accurate to say that this will not save memory. In > > practice we observed a number of OOME issues on the server-side due to > many > > caches and it was one of motivations for cache groups (another one disk > > access optimizations). On the other hand, I agree that we'd better to > avoid > > cache groups in the protocol because this is internal implementation > detail > > which is likely (I hope so) to be changed in future. > > > > So I have another proposal - let's track caches with the same affinity > > distribution instead. That is, normally most of PARTITIONED caches will > > have very few variants of configuration: it will be Rendezvous affinity > > function, most likely with default partition number and with 1-2 backups > at > > most. So when affinity distribution for specific cache is requested, we > can > > append to the response *list of caches with the same distribution*. 
I.e.: > > > > class AffinityResponse { > > Object distribution;// Actual distribution > > List cacheIds; // Caches with similar distribution > > } > > > > Makes sense? > > > > On Sun, Feb 3, 2019 at 8:31 PM Pavel Tupitsyn > > wrote: > > > > > Igor, I have a feeling that we should omit Cache Group stuff from the > > > protocol. > > > It is a rare use case and even then dealing with them on client barely > > > saves some memory. > > > > > > We can keep it simple and have partition map per cacheId. Thoughts? > > > > > > On Fri, Feb 1, 2019 at 6:49 PM Igor Sapego wrote: > > > > > > > Guys, I've updated the proposal once again [1], so please, > > > > take a look and let me know what you think. > > > > > > > > [1] - > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+thin+clients > > > > > > > > Best Regards, > > > > Igor > > > > > > > > > > > > On Thu, Jan 17, 2019 at 1:05 PM Igor Sapego > > wrote: > > > > > > > > > Yeah, I'll add it. > > > > > > > > > > Best Regards, > > > > > Igor > > > > > > > > > > > > > > > On Wed, Jan 16, 2019 at 11:08 PM Pavel Tupitsyn < > > ptupit...@apache.org> > > > > > wrote: > > > > > > > > > >> > to every server > > > > >> I did not think of this issue. Now I agree with your approach. > > > > >> Can you please add an explanation of this to the IEP? > > > > >> > > > > >> Thanks! > > > > >> > > > > >> On Wed, Jan 16, 2019 at 2:53 PM Igor Sapego > > > wrote: > > > > >> > > > > >> > Pavel, > > > > >> > > > > > >> > Yeah, it makes sense, but to me it seems that this approach can > > lead > > > > >> > to more complicated client logic, as it will require to make > > > > additional > > > > >> > call > > > > >> > to every server, that reports affinity topology change. > > > > >> > > > > > >> > Guys, WDYT? > > > > >> > > > > > >> > Best Regards, > > > > >> > Igor > > > > >> > > > > > >>
[jira] [Created] (IGNITE-11185) SQL: Move distributed joins code from base index to H2TreeIndex
Vladimir Ozerov created IGNITE-11185: Summary: SQL: Move distributed joins code from base index to H2TreeIndex Key: IGNITE-11185 URL: https://issues.apache.org/jira/browse/IGNITE-11185 Project: Ignite Issue Type: Task Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 {{H2TreeIndex}} is the only implementation concerned with distributed joins. Let's move the associated code out of {{GridH2IndexBase}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Best Effort Affinity for thin clients
Pavel, Igor, It is not quite accurate to say that this will not save memory. In practice we observed a number of OOME issues on the server side due to many caches, and that was one of the motivations for cache groups (another was disk access optimization). On the other hand, I agree that we'd better avoid cache groups in the protocol, because they are an internal implementation detail which is likely (I hope) to change in the future. So I have another proposal - let's track caches with the same affinity distribution instead. That is, normally most PARTITIONED caches will have very few configuration variants: the Rendezvous affinity function, most likely with the default partition number and 1-2 backups at most. So when the affinity distribution for a specific cache is requested, we can append the *list of caches with the same distribution* to the response. I.e.: class AffinityResponse { Object distribution; // Actual distribution List cacheIds; // Caches with similar distribution } Makes sense? On Sun, Feb 3, 2019 at 8:31 PM Pavel Tupitsyn wrote: > Igor, I have a feeling that we should omit Cache Group stuff from the > protocol. > It is a rare use case and even then dealing with them on client barely > saves some memory. > > We can keep it simple and have partition map per cacheId. Thoughts? > > On Fri, Feb 1, 2019 at 6:49 PM Igor Sapego wrote: > > > Guys, I've updated the proposal once again [1], so please, > > take a look and let me know what you think. > > > > [1] - > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+thin+clients > > > > Best Regards, > > Igor > > > > > > On Thu, Jan 17, 2019 at 1:05 PM Igor Sapego wrote: > > > > > Yeah, I'll add it. > > > > > > Best Regards, > > > Igor > > > > > > > > > On Wed, Jan 16, 2019 at 11:08 PM Pavel Tupitsyn > > > wrote: > > > > > >> > to every server > > >> I did not think of this issue. Now I agree with your approach.
> > >> Can you please add an explanation of this to the IEP? > > >> > > >> Thanks! > > >> > > >> On Wed, Jan 16, 2019 at 2:53 PM Igor Sapego > wrote: > > >> > > >> > Pavel, > > >> > > > >> > Yeah, it makes sense, but to me it seems that this approach can lead > > >> > to more complicated client logic, as it will require to make > > additional > > >> > call > > >> > to every server, that reports affinity topology change. > > >> > > > >> > Guys, WDYT? > > >> > > > >> > Best Regards, > > >> > Igor > > >> > > > >> > > > >> > On Tue, Jan 15, 2019 at 10:59 PM Pavel Tupitsyn < > ptupit...@apache.org > > > > > >> > wrote: > > >> > > > >> > > Igor, > > >> > > > > >> > > > It is proposed to add flag to every response, that shows > whether > > >> the > > >> > > Affinity Topology Version of the cluster has changed since the > last > > >> > request > > >> > > from the client. > > >> > > I propose to keep this flag. So no need for periodic checks. Makes > > >> sense? > > >> > > > > >> > > On Tue, Jan 15, 2019 at 4:45 PM Igor Sapego > > >> wrote: > > >> > > > > >> > > > Pavel, > > >> > > > > > >> > > > This will require from client to send this new request > > periodically, > > >> > I'm > > >> > > > not > > >> > > > sure this will make clients simpler. Anyway, let's discuss it. > > >> > > > > > >> > > > Vladimir, > > >> > > > > > >> > > > With current proposal, we will have affinity info in message > > header. > > >> > > > > > >> > > > Best Regards, > > >> > > > Igor > > >> > > > > > >> > > > > > >> > > > On Tue, Jan 15, 2019 at 11:01 AM Vladimir Ozerov < > > >> voze...@gridgain.com > > >> > > > > >> > > > wrote: > > >> > > > > > >> > > > > Igor, > > >> > > > > > > >> > > > > I think that "Cache Partitions Request&quo
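A fleshed-out sketch of the `AffinityResponse` idea discussed in this thread (types and field names are assumptions; the real protocol message would differ). The point of the design is that one response lets a client populate affinity information for several caches at once:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Sketch of the proposed response: the distribution for the requested cache
// plus the IDs of all caches known to share exactly that distribution.
final class AffinityResponse {
    final Map<Integer, List<UUID>> partitionMap; // partition -> owner nodes
    final List<Integer> cacheIds;                // caches with this distribution

    AffinityResponse(Map<Integer, List<UUID>> partitionMap, List<Integer> cacheIds) {
        this.partitionMap = partitionMap;
        this.cacheIds = cacheIds;
    }
}

// Client side: apply a single response to every cache it covers.
final class ClientAffinityCache {
    final Map<Integer, Map<Integer, List<UUID>>> byCacheId = new HashMap<>();

    void apply(AffinityResponse res) {
        for (int cacheId : res.cacheIds)
            byCacheId.put(cacheId, res.partitionMap);
    }
}
```

This captures the trade-off argued for above: the protocol stays free of the "cache group" concept, while the memory and round-trip savings of grouping are preserved for any caches that happen to share a distribution.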
[jira] [Created] (IGNITE-11180) SQL: give more sensible names to reducer classes
Vladimir Ozerov created IGNITE-11180: Summary: SQL: give more sensible names to reducer classes Key: IGNITE-11180 URL: https://issues.apache.org/jira/browse/IGNITE-11180 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 # Rename classes in accordance to map/reduce approach to simplify further development # Remove dead code in reducer logic -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11169) SQL: Remove collocation model-related code from GridH2QueryContext
Vladimir Ozerov created IGNITE-11169: Summary: SQL: Remove collocation model-related code from GridH2QueryContext Key: IGNITE-11169 URL: https://issues.apache.org/jira/browse/IGNITE-11169 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 This code should live in the splitter logic instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11160) SQL: Create light-weight row for read-only rows
Vladimir Ozerov created IGNITE-11160: Summary: SQL: Create light-weight row for read-only rows Key: IGNITE-11160 URL: https://issues.apache.org/jira/browse/IGNITE-11160 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 In order to minimize memory overhead during query execution, we can create a simplified version of {{GridH2KeyValueRowOnheap}} which will not hold a reference to the original row. We can also remove the value cache, as it is never used during SELECT execution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: proposed realization KILL QUERY command
Hi Yuriy, Agree that at the moment the simpler the better. Let's return to more complex syntax in future if needed. Regarding proposed syntax, please note that as query ID is not database object name but rather string literal, we'd better wrap it into quotes to keep syntax consistency across commands: KILL QUERY '8a55df83-2f41-4f81-8e11-ab0936d0_6742'; Vladimir. On Wed, Jan 30, 2019 at 3:09 PM Юрий wrote: > Hi Igniters, > > Let's return to KILL QUERY command. Previously we mostly discussed about > two variants of format: > 1) simple - KILL QUERY {running_query_id} > 2) advanced syntax - KILL QUERY WHERE {parameters}. Parameters seems can be > any columns from running queries view or just part of them. > > I've checked approaches used by Industrial RDBMS vendors : > >- > - *ORACLE*: ALTER SYSTEM CANCEL SQL 'SID, SERIAL, SQL_ID' > > >- > - *Postgres*: SELECT pg_cancel_backend() and > SELECT pg_terminate_backend() > - *MySQL*: KILL QUERY > > > As we see all of them use simple syntax to cancel a query and can't do some > filters. > > IMHO simple *KILL QUERY qry_id* better for the few reasons. > User can kill just single query belong (started) to single node and it will > be exactly that query which was passed as parameter - predictable results. > For advance syntax it could lead send kill request to all nodes in a > cluster and potentially user can kill unpredictable queries depend on > passed parameters. > Other vendors use simple syntax > > How it could be used > > 1)SELECT * from sql_running_queries > result is > query_id > | sql | schema_name | duration| > 8a55df83-2f41-4f81-8e11-ab0936d0_6742 | SELECT ... | ... > | | > 8a55df83-2f41-4f81-8e11-ab0936d0_1234 | UPDATE... | ... > | .. | > > 2) KILL QUERY 8a55df83-2f41-4f81-8e11-ab0936d0_6742 > > > > Do you have another opinion? Let's decide which of variant will be prefer. > > > ср, 16 янв. 2019 г. в 18:02, Denis Magda : > > > Yury, > > > > I do support the latter concatenation approach. 
It's simple and > correlates > > with what other DBs do. Plus, it can be passed to KILL command without > > complications. Thanks for thinking this through! > > > > As for the killing of all queries on a particular node, not sure that's a > > relevant use case. I would put this off. Usually, you want to stop a > > specific query (it's slow or resources consuming) and have to know its > id, > > the query runs across multiple nodes and a single KILL command with the > id > > can halt it everywhere. If someone decided to shut all queries on the > node, > > then it sounds like the node is experiencing big troubles and it might be > > better just to shut it down completely. > > > > - > > Denis > > > > > > On Tue, Jan 15, 2019 at 8:00 AM Юрий > wrote: > > > >> Denis and other Igniters, do you have any comments for proposed > approach? > >> Which of these ones will be better to use for us - simple numeric or > hex > >> values (shorter id, but with letters)? > >> > >> As for me hex values preferable due to it shorter and looks more unique > >> across a logs > >> > >> > >> > >> вт, 15 янв. 2019 г. в 18:35, Vladimir Ozerov : > >> > >>> Hi, > >>> > >>> Concatenation through a letter looks like a good approach to me. As far > >>> as > >>> killing all queries on a specific node, I would put it aside for now - > >>> this > >>> looks like a separate command with possibly different parameters. > >>> > >>> On Tue, Jan 15, 2019 at 1:30 PM Юрий > >>> wrote: > >>> > >>> > Thanks Vladimir for your thoughts. > >>> > > >>> > Based on it most convenient ways are first and third. > >>> > But with some modifications: > >>> > For first variant delimiter should be a letter, e.g. 123X15494, then > it > >>> > could be simple copy by user. > >>> > For 3rd variant can be used convert both numeric to HEX and use a > >>> letter > >>> > delimiter not included to HEX symbols (ABCDEF), in this case query id > >>> will > >>> > be shorter and also can be simple copy by user. e.g. 
7BX3C86 ( it the > >>> same > >>> > value as used for first variant), instead of convert all value as > &g
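The ID scheme converged on in this thread — node order and per-node query counter rendered in hex and joined by a letter outside the hex alphabet — can be sketched as follows (a minimal illustration, not Ignite's actual implementation):

```java
// Sketch of the query ID scheme discussed above: two hex numbers joined by
// a letter that is not a hex digit (A-F), so the ID stays short, parses
// unambiguously, and copies easily from logs.
final class QueryId {
    private static final char DELIM = 'X'; // not a hex digit, safe separator

    static String format(long nodeOrder, long qryCounter) {
        return Long.toHexString(nodeOrder).toUpperCase()
            + DELIM
            + Long.toHexString(qryCounter).toUpperCase();
    }

    static long[] parse(String id) {
        int i = id.indexOf(DELIM);
        return new long[] {
            Long.parseLong(id.substring(0, i), 16),
            Long.parseLong(id.substring(i + 1), 16)
        };
    }
}
```

With the numbers from the email (node order 123, query counter 15494), this yields the `7BX3C86` form mentioned above, which is what the user would paste into `KILL QUERY '…'`.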
[jira] [Created] (IGNITE-11134) SQL: Do not wrap key and value objects in GridH2KeyValueRowOnheap
Vladimir Ozerov created IGNITE-11134: Summary: SQL: Do not wrap key and value objects in GridH2KeyValueRowOnheap Key: IGNITE-11134 URL: https://issues.apache.org/jira/browse/IGNITE-11134 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 This wrapping is not needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11118) SQL: Ability to resolve partition from argument without H2
Vladimir Ozerov created IGNITE-11118: Summary: SQL: Ability to resolve partition from argument without H2 Key: IGNITE-11118 URL: https://issues.apache.org/jira/browse/IGNITE-11118 Project: Ignite Issue Type: Task Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently we rely on H2 to get the final partition: we need to convert the originally passed argument to the expected argument type. We need to write our own code to handle this, as H2 code will not be available to thin clients. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11117) SQL: Move partition nodes to core module
Vladimir Ozerov created IGNITE-11117: Summary: SQL: Move partition nodes to core module Key: IGNITE-11117 URL: https://issues.apache.org/jira/browse/IGNITE-11117 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 This is needed for further integration with thin clients, which do not have a dependency on the {{indexing}} module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11115) Binary: rework thread-local binary context to avoid set() operation
Vladimir Ozerov created IGNITE-11115: Summary: Binary: rework thread-local binary context to avoid set() operation Key: IGNITE-11115 URL: https://issues.apache.org/jira/browse/IGNITE-11115 Project: Ignite Issue Type: Task Components: binary Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 Currently we call {{ThreadLocal.set()}} on every serialization/deserialization (see {{GridBinaryMarshaller#BINARY_CTX}} usages). This may lead to high CPU usage, especially during SQL query execution. Let's refactor the access patterns to work only with the {{ThreadLocal.get()}} operation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
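One way to realize the get()-only pattern the ticket asks for (a sketch; `BinaryContextHolder` is a hypothetical name, not Ignite's class): the thread-local stores a per-thread mutable holder created once via `withInitial`, so swapping the context mutates the holder instead of calling `ThreadLocal.set()` on every operation.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the access-pattern change: hot paths only ever call get();
// the holder object itself is allocated once per thread.
final class BinaryContextHolder {
    Object ctx; // the current binary context, swapped in and out cheaply

    // Counts holder allocations, just to demonstrate one-per-thread behavior.
    static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    private static final ThreadLocal<BinaryContextHolder> HOLDER =
        ThreadLocal.withInitial(() -> {
            ALLOCATIONS.incrementAndGet(); // runs once per thread
            return new BinaryContextHolder();
        });

    // Replaces ThreadLocal.set(ctx): a plain field write on the cached holder.
    static Object setContext(Object ctx) {
        BinaryContextHolder h = HOLDER.get(); // get() only, no set()
        Object old = h.ctx;
        h.ctx = ctx;
        return old;
    }

    static Object context() {
        return HOLDER.get().ctx;
    }
}
```

Returning the previous context from `setContext` also makes restore-on-exit (push/pop style usage around nested marshalling) a one-liner for callers.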
Re: H2 license and vulnerabilities
Hi Steve, H2 cannot be removed from Ignite easily, as it is integrated pretty deeply into the indexing module. The good news is that our usage of H2 is fairly limited - we only use its parser, planner and execution pipeline. We do not use H2 as data storage. Please let me know if you need any additional clarifications. Vladimir. On Tue, Jan 29, 2019 at 10:35 AM steve.hostett...@gmail.com < steve.hostett...@gmail.com> wrote: > Hello, > I am using Apache Ignite in a financial setting and it gets reported as a > high risk because of one of its dependencies: H2 > > The blackduck report warns the following: > 1) The H2 license, being weak reciprocal, is not the preferred type of OSS > license (e.g., Apache, MIT) > 2) There are known vulnerabilities that have not been fixed for more than a > year now: > > https://www.cvedetails.com/vulnerability-list/vendor_id-17893/product_id-45580/year-2018/H2database-H2.html > > So here are my questions: > 1) is there any plan to swap H2 for another in-memory database, and if not, > what is the view of the community on the above points? > 2) Does Ignite use the part of H2 that is vulnerable (disk backup)? > > Many thanks in advance > > > > -- > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ >
Re: SQL View with list of existing indexes
Hi Yuriy, Yes, I believe we will have columns view(s) at some point in time for sure. On Thu, Jan 24, 2019 at 7:08 PM Юрий wrote: > Hi Vladimir, > > Thanks for your comments, > > 1) Agree. > 2) Ok. > 3) We create number of index copies depend on query parallelism. But seems > you are right - it should be exposed on TABLES level. > 4) Approx. inline size shouldn't be used here, due to the value depend on > node and not has single value. > 5) Do we have a plans for some view with table columns? If yes, may be will > be better have just array with column order from the columns view. For > example you want to know which columns are indexed already. In case we will > have plain comma-separated form it can't be achieved. > > > > > > чт, 24 янв. 2019 г. в 18:09, Vladimir Ozerov : > > > Hi Yuriy, > > > > Please note that MySQL link is about SHOW command, which is a different > > beast. In general I think that PG approach is better as it allows user to > > get quick overview of index content without complex JOINs. I would start > > with plain single view and add columns view later if we found it useful. > As > > far as view columns: > > 1) I would add both cache ID/name and cache group ID/name > > 2) Number of columns does not look as a useful info to me > > 3) Query parallelism is related to cache, not index, so it should be in > > IGNITE.TABLES view instead > > 4) Inline size is definitely useful metric. Not sure about approximate > > inline size > > 5) I would add list of columns in plain comma-separated form with > ASC/DESC > > modifiers > > > > Thoughts? > > > > Vladimir. > > > > On Thu, Jan 24, 2019 at 3:52 PM Юрий > wrote: > > > > > Hi Igniters, > > > > > > As part of IEP-29: SQL management and monitoring > > > < > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring > > > > > > > I'm going to implement SQL view with list of existing indexes. > > > I've investigate how it expose by ORACLE, MySQL and Postgres. 
> > > ORACLE -
> > > https://docs.oracle.com/en/database/oracle/oracle-database/18/refrn/ALL_INDEXES.html#GUID-E39825BA-70AC-45D8-AF30-C7FF561373B6
> > > MySQL - https://dev.mysql.com/doc/refman/8.0/en/show-index.html
> > > Postgres - https://www.postgresql.org/docs/11/view-pg-indexes.html ,
> > > https://www.postgresql.org/docs/11/catalog-pg-index.html
> > >
> > > All vendors have such views, which show at least the following information:
> > > schema name - Name of the schema related to the table and index.
> > > table name - Name of the table related to an index.
> > > index name - Name of the index.
> > > list of columns - All columns included into an index and their order.
> > > collation - ASC or DESC sort for each column.
> > >
> > > + much vendor-specific information which differs from vendor to vendor.
> > >
> > > In our case such specific information could be at least:
> > >
> > > 1. Owning cache ID - not sure, but may be useful to join with our other views.
> > > 2. number of columns in the index - just to know how many results should be in the columns view
> > > 3. query parallelism - a configuration parameter showing how many threads can be used to execute a query.
> > > 4. inline size - the inline size used for this index.
> > > 5. is affinity - boolean parameter showing that this is an affinity key index
> > > 6. is pk - boolean parameter showing that this is a PK index
> > > 7. approx recommended inline size - dynamically calculated recommended inline size for this index, showing the size required to keep the whole indexed columns inlined.
> > >
> > > All vendors have different ways to present information about index columns:
> > > PG - uses an array of indexed table columns and a second array with the collation of each column.
> > > MySQL - each row in the index view contains information about one of the indexed columns
Re: Distributed MetaStorage discussion
Ivan,

The idea is that certain changes to the system are not relevant for all components. E.g. if the SQL schema is changed, then some SQL caches need to be invalidated. When the affinity topology changes, another set of caches needs to be invalidated. Having a single version may lead to unexpected latency spikes and invalidations in this case.

On Fri, Jan 25, 2019 at 4:50 PM Ivan Bessonov wrote:
> Vladimir,
>
> thank you for the reply. Topology and affinity changes are not reflected in distributed metastorage, we didn't touch baseline history at all. I believe that what you really need is just a distributed property "sqlSchemaVer" that is updated on each schema update. It could be achieved by creating a corresponding key in distributed metastorage without any specific treatment from the API standpoint.
>
> The same thing applies to topology and affinity versions, but the motivation here is not that clear to me, to be honest.
>
> I think that the most common approach with a single incrementing version is much simpler than several counters, and I would prefer to leave it that way.
>
> Fri, 25 Jan 2019 at 16:39, Vladimir Ozerov :
> > Ivan,
> >
> > The change you describe is an extremely valuable thing as it allows detecting changes to the global configuration, which is of great importance for SQL. Will topology and affinity changes be reflected in metastore history as well? From the SQL perspective it is important for us to be able to understand whether cluster topology, data distribution or SQL schema has changed between two versions. Is it possible to have a kind of composite version instead of a hashed counter? E.g.
> > > > class ConfigurationVersion { > > long globalVer; // Global counter > > long topVer; // Increasing topology version > > long affVer; // Increasing affinity version which is incremented > every > > time data distribution is changed (node join/leave, baseline changes, > late > > affinity assignment) > > long sqlSchemaVer; // Incremented every time SQL schema changes > > } > > > > Vladimir. > > > > > > On Fri, Jan 25, 2019 at 11:45 AM Ivan Bessonov > > wrote: > > > > > Hello, Igniters! > > > > > > Here's more info "Distributed MetaStorage" feature [1]. It is a part of > > > Phase II for > > > IEP-4 (Baseline topology) [2] and was mentioned in recent "Baseline > > > auto-adjust`s > > > discuss" topic. I'll partially duplicate that message here. > > > > > > One of key requirements is the ability to store configuration data (or > > any > > > other data) > > > consistently and cluster-wide. There are also other tickets that > require > > > similar > > > mechanisms, for example [3]. Ignite doesn't have any specific API for > > such > > > configurations and we don't want to have many similar implementations > of > > > the > > > same feature across the code. > > > > > > There are several API methods required for the feature: > > > > > > - read(key) / iterate(keyPrefix) - access to the distributed data. > > Should > > > be > > >consistent for all nodes in cluster when it's in active state. > > > - write / remove - modify data in distributed metastorage. Should > > > guarantee that > > >every node in cluster will have this update after the method is > > > finished. > > > - writeAsync / removeAsync (not yet implemented) - same as above, but > > > async. > > >Might be useful if one needs to update several values one after > > another. > > > - compareAndWrite / compareAndRemove - helpful to reduce number of > data > > >updates (more on that later). > > > - listen(keyPredicate) - a way of being notified when some data was > > > changed. 
> > > Normally it is triggered on a "write/remove" operation or node activation. The listener itself will be notified with .
> > >
> > > Now some implementation details:
> > >
> > > First implementation is based on the existing local metastorage API for persistent clusters (in-memory clusters will store data in memory). Write/remove operations use Discovery SPI to send updates to the cluster; it guarantees updates order and the fact that all existing (alive) nodes have handled the update message.
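The composite version sketched in the quoted reply above can be illustrated with a small, self-contained class. This is purely a sketch of the proposal, not an Ignite API: the four field names follow the email, while the constructor and the `sqlCachesStale` helper are illustrative assumptions showing how a SQL component could react only to its own sub-version.

```java
// Hypothetical sketch of the proposed composite ConfigurationVersion.
// Only sqlSchemaVer matters to SQL caches; topology/affinity bumps are ignored.
class ConfigurationVersion {
    final long globalVer;     // global counter, bumped on every update
    final long topVer;        // increasing topology version
    final long affVer;        // bumped when data distribution changes
    final long sqlSchemaVer;  // bumped on every SQL schema change

    ConfigurationVersion(long globalVer, long topVer, long affVer, long sqlSchemaVer) {
        this.globalVer = globalVer;
        this.topVer = topVer;
        this.affVer = affVer;
        this.sqlSchemaVer = sqlSchemaVer;
    }

    /** SQL caches need invalidation only if the schema sub-version moved. */
    static boolean sqlCachesStale(ConfigurationVersion seen, ConfigurationVersion current) {
        return current.sqlSchemaVer > seen.sqlSchemaVer;
    }
}
```

With a single hashed counter, any topology change would look like a SQL-relevant change; with the composite form, a topology-only bump leaves `sqlSchemaVer` untouched and no invalidation happens.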
Re: Distributed MetaStorage discussion
Ivan,

The change you describe is an extremely valuable thing as it allows detecting changes to the global configuration, which is of great importance for SQL. Will topology and affinity changes be reflected in metastore history as well? From the SQL perspective it is important for us to be able to understand whether cluster topology, data distribution or SQL schema has changed between two versions. Is it possible to have a kind of composite version instead of a hashed counter? E.g.

class ConfigurationVersion {
    long globalVer;    // Global counter
    long topVer;       // Increasing topology version
    long affVer;       // Increasing affinity version which is incremented every time data distribution is changed (node join/leave, baseline changes, late affinity assignment)
    long sqlSchemaVer; // Incremented every time SQL schema changes
}

Vladimir.

On Fri, Jan 25, 2019 at 11:45 AM Ivan Bessonov wrote:
> Hello, Igniters!
>
> Here's more info on the "Distributed MetaStorage" feature [1]. It is a part of Phase II for IEP-4 (Baseline topology) [2] and was mentioned in the recent "Baseline auto-adjust`s discuss" topic. I'll partially duplicate that message here.
>
> One of the key requirements is the ability to store configuration data (or any other data) consistently and cluster-wide. There are also other tickets that require similar mechanisms, for example [3]. Ignite doesn't have any specific API for such configurations and we don't want to have many similar implementations of the same feature across the code.
>
> There are several API methods required for the feature:
>
> - read(key) / iterate(keyPrefix) - access to the distributed data. Should be consistent for all nodes in the cluster when it's in the active state.
> - write / remove - modify data in distributed metastorage. Should guarantee that every node in the cluster will have this update after the method is finished.
> - writeAsync / removeAsync (not yet implemented) - same as above, but async.
> Might be useful if one needs to update several values one after another.
> - compareAndWrite / compareAndRemove - helpful to reduce the number of data updates (more on that later).
> - listen(keyPredicate) - a way of being notified when some data was changed. Normally it is triggered on a "write/remove" operation or node activation. The listener itself will be notified with .
>
> Now some implementation details:
>
> First implementation is based on the existing local metastorage API for persistent clusters (in-memory clusters will store data in memory). Write/remove operations use Discovery SPI to send updates to the cluster; it guarantees updates order and the fact that all existing (alive) nodes have handled the update message.
>
> As a way to find out which node has the latest data there is a "version" value of distributed metastorage, which is basically all updates>. The whole updates history until some point in the past is stored along with the data, so when an outdated node connects to the cluster it will receive all the missing data and apply it locally. Listeners will also be invoked after such updates. If there's not enough history stored, or the joining node is clean, then it'll receive a snapshot of distributed metastorage, so there won't be inconsistencies. The "compareAndWrite" / "compareAndRemove" API might help reduce the size of the history, especially for Boolean or other primitive values.
>
> There are, of course, many more details, feel free to ask about them. The first implementation is in master, but there are already known improvements that can be done and I'm working on them right now.
>
> See package "org.apache.ignite.internal.processors.metastorage" for the new interfaces and comment with your opinion or questions. Thank you!
> > [1] https://issues.apache.org/jira/browse/IGNITE-10640 > [2] > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches > [3] https://issues.apache.org/jira/browse/IGNITE-8717 > > -- > Sincerely yours, > Ivan Bessonov >
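The point about `compareAndWrite` reducing history size can be modeled with a tiny in-memory store. This is NOT the Ignite `DistributedMetaStorage` API — just an illustrative sketch (class and method names are made up) of why a failed conditional write produces no new history entry.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of compareAndWrite semantics: the version (a stand-in for the
// update history length) grows only when data actually changes.
class MetaStoreModel {
    private final ConcurrentHashMap<String, Object> data = new ConcurrentHashMap<>();
    private long version; // grows only on a successful write

    synchronized boolean compareAndWrite(String key, Object expected, Object newVal) {
        Object cur = data.get(key);
        if (!Objects.equals(cur, expected))
            return false; // condition failed: no write, no history entry
        data.put(key, newVal);
        version++;
        return true;
    }

    Object read(String key) { return data.get(key); }
    synchronized long version() { return version; }
}
```

For a Boolean flag that many nodes try to set, only the first `compareAndWrite(key, null, TRUE)` succeeds; the rest are no-ops and the history stays short.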
[jira] [Created] (IGNITE-11083) SQL: Extract query model from splitter
Vladimir Ozerov created IGNITE-11083: Summary: SQL: Extract query model from splitter Key: IGNITE-11083 URL: https://issues.apache.org/jira/browse/IGNITE-11083 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Vladimir Ozerov Fix For: 2.8 We will need a common query model with join/subquery info for future splitter and partition pruning improvements. Let's accurately extract the model from the splitter, aiming to reuse it for partition pruning in the future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Baseline auto-adjust`s discuss
Got it, makes sense.

On Fri, Jan 25, 2019 at 11:06 AM Anton Kalashnikov wrote:
> Vladimir, thanks for your notes; both of them look good enough, but I have two different thoughts about them.
>
> I think I agree about enabling only one of manual/auto adjustment. It is easier than the current solution, and in fact, as an extra feature, we can allow the user to force the task to execute (if they don't want to wait until the timeout expires).
> But about the second one, I am not sure that one parameter instead of two would be more convenient. For example: if a user changed the timeout and then disabled auto-adjust, then whoever wants to enable it again has to know what the timeout value was before auto-adjust was disabled. I think the "negative value" pattern is a good choice for always-applicable parameters like a connection timeout (e.g. -1 means endless waiting), but in our case we want to disable the whole functionality rather than change a parameter value.
>
> --
> Best regards,
> Anton Kalashnikov
>
> 24.01.2019, 22:03, "Vladimir Ozerov":
> > Hi Anton,
> >
> > This is a great feature, but I am a bit confused about the automatic disabling of the feature during manual baseline adjustment. This may lead to unpleasant situations when a user enabled auto-adjustment, then re-adjusted it manually somehow (e.g. from some previously created script) so that the auto-adjustment disabling went unnoticed, then added more nodes hoping that auto-baseline is still active, etc.
> >
> > Instead, I would rather make manual and auto adjustment mutually exclusive - the baseline cannot be adjusted manually when auto mode is set, and vice versa. If an exception is thrown in those cases, administrators will always know the current behavior of the system.
> >
> > As far as configuration, wouldn't it be enough to have a single long value as opposed to Boolean + long?
Say, 0 - immediate auto adjustment, > negative > > - disabled, positive - auto adjustment after timeout. > > > > Thoughts? > > > > чт, 24 янв. 2019 г. в 18:33, Anton Kalashnikov : > > > >> Hello, Igniters! > >> > >> Work on the Phase II of IEP-4 (Baseline topology) [1] has started. I > want > >> to start to discuss of implementation of "Baseline auto-adjust" [2]. > >> > >> "Baseline auto-adjust" feature implements mechanism of auto-adjust > >> baseline corresponding to current topology after event join/left was > >> appeared. It is required because when a node left the grid and nobody > would > >> change baseline manually it can lead to lost data(when some more nodes > left > >> the grid on depends in backup factor) but permanent tracking of grid > is not > >> always possible/desirible. Looks like in many cases auto-adjust > baseline > >> after some timeout is very helpfull. > >> > >> Distributed metastore[3](it is already done): > >> > >> First of all it is required the ability to store configuration data > >> consistently and cluster-wide. Ignite doesn't have any specific API for > >> such configurations and we don't want to have many similar > implementations > >> of the same feature in our code. After some thoughts is was proposed to > >> implement it as some kind of distributed metastorage that gives the > ability > >> to store any data in it. > >> First implementation is based on existing local metastorage API for > >> persistent clusters (in-memory clusters will store data in memory). > >> Write/remove operation use Discovery SPI to send updates to the > cluster, it > >> guarantees updates order and the fact that all existing (alive) nodes > have > >> handled the update message. As a way to find out which node has the > latest > >> data there is a "version" value of distributed metastorage, which is > >> basically . 
All updates history > >> until some point in the past is stored along with the data, so when an > >> outdated node connects to the cluster it will receive all the missing > data > >> and apply it locally. If there's not enough history stored or joining > node > >> is clear then it'll receive shapshot of distributed metastorage so > there > >> won't be inconsistencies. > >> > >> Baseline auto-adjust: > >> > >> Main scenario: > >> - There is grid with the baseline is equal to the current > topology > >
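Vladimir's single-parameter idea quoted above (0 - immediate auto adjustment, negative - disabled, positive - adjust after timeout) can be sketched as a few predicates over one long value. This is an illustrative sketch of the proposal only; the actual draft implementation in the thread uses the two separate parameters.

```java
// Hypothetical encoding of baseline auto-adjust settings in a single long:
// v < 0  -> feature disabled
// v == 0 -> immediate auto adjustment
// v > 0  -> auto adjustment after v milliseconds
class BaselineAutoAdjust {
    static boolean isEnabled(long v)   { return v >= 0; }
    static boolean isImmediate(long v) { return v == 0; }

    static long timeoutMs(long v) {
        if (v < 0)
            throw new IllegalStateException("auto-adjust is disabled");
        return v;
    }
}
```

One value removes the "what was the timeout before it was disabled?" problem only partially — as Anton notes, re-enabling still requires remembering the previous positive value.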
Re: Baseline auto-adjust`s discuss
Hi Anton,

This is a great feature, but I am a bit confused about the automatic disabling of the feature during manual baseline adjustment. This may lead to unpleasant situations when a user enabled auto-adjustment, then re-adjusted it manually somehow (e.g. from some previously created script) so that the auto-adjustment disabling went unnoticed, then added more nodes hoping that auto-baseline is still active, etc.

Instead, I would rather make manual and auto adjustment mutually exclusive - the baseline cannot be adjusted manually when auto mode is set, and vice versa. If an exception is thrown in those cases, administrators will always know the current behavior of the system.

As far as configuration, wouldn't it be enough to have a single long value as opposed to Boolean + long? Say, 0 - immediate auto adjustment, negative - disabled, positive - auto adjustment after timeout.

Thoughts?

Thu, 24 Jan 2019 at 18:33, Anton Kalashnikov :
>
> Hello, Igniters!
>
> Work on Phase II of IEP-4 (Baseline topology) [1] has started. I want to start a discussion of the implementation of "Baseline auto-adjust" [2].
>
> The "Baseline auto-adjust" feature implements a mechanism for auto-adjusting the baseline to the current topology after a join/left event happens. It is required because when a node leaves the grid and nobody changes the baseline manually, it can lead to lost data (when more nodes leave the grid, depending on the backup factor), but permanent tracking of the grid is not always possible/desirable. It looks like in many cases auto-adjusting the baseline after some timeout is very helpful.
>
> Distributed metastore [3] (it is already done):
>
> First of all, the ability to store configuration data consistently and cluster-wide is required. Ignite doesn't have any specific API for such configurations and we don't want to have many similar implementations of the same feature in our code.
> After some thought it was proposed to implement it as some kind of distributed metastorage that gives the ability to store any data in it.
> The first implementation is based on the existing local metastorage API for persistent clusters (in-memory clusters will store data in memory). Write/remove operations use Discovery SPI to send updates to the cluster; it guarantees updates order and the fact that all existing (alive) nodes have handled the update message. As a way to find out which node has the latest data there is a "version" value of distributed metastorage, which is basically . The whole updates history until some point in the past is stored along with the data, so when an outdated node connects to the cluster it will receive all the missing data and apply it locally. If there's not enough history stored, or the joining node is clean, then it'll receive a snapshot of distributed metastorage, so there won't be inconsistencies.
>
> Baseline auto-adjust:
>
> Main scenario:
> - There is a grid with the baseline equal to the current topology
> - A new node joins the grid or some node leaves (fails) the grid
> - The new mechanism detects this event and adds a task for changing the baseline to a queue, with the configured timeout
> - If a new event happens before the baseline is changed, the task is removed from the queue and a new task is added
> - When the timeout expires, the task tries to set a new baseline corresponding to the current topology
>
> First of all we need to add two parameters [4]:
> - baselineAutoAdjustEnabled - enable/disable the "Baseline auto-adjust" feature.
> - baselineAutoAdjustTimeout - timeout after which the baseline should be changed.
>
> These parameters are cluster-wide and can be changed in real time because they are based on the "Distributed metastore". The first time, these parameters are initialized from the corresponding parameters (initBaselineAutoAdjustEnabled, initBaselineAutoAdjustTimeout) from the Ignite configuration.
> The init value is valid only until the first change; after the value is changed, it is stored in the "Distributed metastore".
>
> Restrictions:
> - This mechanism handles events only on an active grid
> - If baselineNodes != gridNodes on activation, this feature is disabled
> - If lost partitions were detected, this feature is disabled
> - If the baseline was adjusted manually while baselineNodes != gridNodes, this feature is disabled
>
> You can find a draft implementation here [5]. Feel free to ask for more details and make suggestions.
>
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches
> [2] https://issues.apache.org/jira/browse/IGNITE-8571
> [3] https://issues.apache.org/jira/browse/IGNITE-10640
> [4] https://issues.apache.org/jira/browse/IGNITE-8573
> [5] https://github.com/apache/ignite/pull/5907
>
> --
> Best regards,
> Anton Kalashnikov
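The main scenario above — every join/left event replaces the pending task, and the baseline is adjusted only after the timeout passes with no further events — is a classic debounce. The following is a deterministic toy model of that behavior (class and method names are made up, not the Ignite implementation), with time passed in explicitly so it can be reasoned about without threads.

```java
// Toy debounce model of baseline auto-adjust: each topology event re-arms a
// single deadline; adjustment fires only when no new event arrived in time.
class AutoAdjustScheduler {
    private final long timeoutMs;
    private long deadline = Long.MAX_VALUE; // MAX_VALUE = no pending task

    AutoAdjustScheduler(long timeoutMs) { this.timeoutMs = timeoutMs; }

    /** A node joined or left at the given timestamp: replace the pending task. */
    void onTopologyEvent(long nowMs) { deadline = nowMs + timeoutMs; }

    /** True when the pending baseline adjustment should fire. */
    boolean shouldAdjust(long nowMs) { return nowMs >= deadline; }
}
```

The re-arming step is what prevents a rolling restart (a burst of join/left events) from triggering a rebalance after every single event.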
Re: SQL View with list of existing indexes
Hi Yuriy,

Please note that the MySQL link is about the SHOW command, which is a different beast. In general I think that the PG approach is better as it allows the user to get a quick overview of the index content without complex JOINs. I would start with a plain single view and add a columns view later if we find it useful. As far as view columns:
1) I would add both cache ID/name and cache group ID/name
2) Number of columns does not look like useful info to me
3) Query parallelism is related to the cache, not the index, so it should be in the IGNITE.TABLES view instead
4) Inline size is a definitely useful metric. Not sure about approximate inline size
5) I would add the list of columns in plain comma-separated form with ASC/DESC modifiers

Thoughts?

Vladimir.

On Thu, Jan 24, 2019 at 3:52 PM Юрий wrote:
> Hi Igniters,
>
> As part of IEP-29: SQL management and monitoring
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> I'm going to implement an SQL view with the list of existing indexes.
> I've investigated how it is exposed by ORACLE, MySQL and Postgres.
> ORACLE -
> https://docs.oracle.com/en/database/oracle/oracle-database/18/refrn/ALL_INDEXES.html#GUID-E39825BA-70AC-45D8-AF30-C7FF561373B6
> MySQL - https://dev.mysql.com/doc/refman/8.0/en/show-index.html
> Postgres - https://www.postgresql.org/docs/11/view-pg-indexes.html ,
> https://www.postgresql.org/docs/11/catalog-pg-index.html
>
> All vendors have such views, which show at least the following information:
> schema name - Name of the schema related to the table and index.
> table name - Name of the table related to an index.
> index name - Name of the index.
> list of columns - All columns included into an index and their order.
> collation - ASC or DESC sort for each column.
>
> + much vendor-specific information which differs from vendor to vendor.
>
> In our case such specific information could be at least:
>
> 1. Owning cache ID - not sure, but may be useful to join with our other views.
> 2. number of columns in the index - just to know how many results should be in the columns view
> 3. query parallelism - a configuration parameter showing how many threads can be used to execute a query.
> 4. inline size - the inline size used for this index.
> 5. is affinity - boolean parameter showing that this is an affinity key index
> 6. is pk - boolean parameter showing that this is a PK index
> 7. approx recommended inline size - dynamically calculated recommended inline size for this index, showing the size required to keep the whole indexed columns inlined.
>
> All vendors have different ways to present information about index columns:
> PG - uses an array of indexed table columns and a second array with the collation of each column.
> MySQL - each row in the index view contains information about one of the indexed columns with its position in the index. So for one index there are many rows.
> ORACLE - uses a separate view where each row presents a column included into an index with all required information, and it can be joined by schema, table and index names.
> ORACLE indexed columns view -
> https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_1064.htm#i1577532
> MySql -
>
> I propose to use the ORACLE way and have a second view to represent columns included into indexes.
>
> In this case such view can have the following information:
> schema name - Name of the schema related to the table and index.
> table name - Name of the table related to an index.
> index name - Name of the index.
> column name - Name of the column included into the index.
> column type - Type of the column.
> column position - Position of the column within the index.
> collation - Whether the column is sorted descending or ascending.
>
> It can be joined with the index view through the schema, table and index names.
>
> What do you think about such approach and the list of columns which could be included into the views?
>
> --
> Live with a smile! :D
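Vladimir's point 5 — a plain comma-separated column list with ASC/DESC modifiers in the single-view variant — amounts to simple string rendering. A minimal sketch (the output format and column names are illustrative, not the final view schema):

```java
import java.util.Map;
import java.util.stream.Collectors;

// Renders an ordered map of {column -> isAscending} as the proposed
// comma-separated list, e.g. "LAST_NAME ASC, SALARY DESC".
class IndexColumns {
    static String render(Map<String, Boolean> colsAsc) {
        return colsAsc.entrySet().stream()
            .map(e -> e.getKey() + (e.getValue() ? " ASC" : " DESC"))
            .collect(Collectors.joining(", "));
    }
}
```

As Юрий notes, this flat form is easy to eyeball but cannot be joined or filtered per column — that is exactly what the separate columns view would add later.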
[jira] [Created] (IGNITE-11057) Document new SQL system view "CACHE_GROUPS_IO"
Vladimir Ozerov created IGNITE-11057: Summary: Document new SQL system view "CACHE_GROUPS_IO" Key: IGNITE-11057 URL: https://issues.apache.org/jira/browse/IGNITE-11057 Project: Ignite Issue Type: Task Components: documentation, sql Reporter: Vladimir Ozerov Fix For: 2.8 See {{modules\indexing\src\main\java\org\apache\ignite\internal\processors\query\h2\sys\view\SqlSystemViewCacheGroupsIOStatistics.java}}
# {{GROUP_ID}} - cache group ID
# {{GROUP_NAME}} - cache group name
# {{PHYSICAL_READS}} - number of physical reads (i.e. blocks read from disk) for the given group
# {{LOGICAL_READS}} - number of logical reads (i.e. from the buffer cache) for the given group.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11042) Document new SQL system view "TABLES"
Vladimir Ozerov created IGNITE-11042: Summary: Document new SQL system view "TABLES" Key: IGNITE-11042 URL: https://issues.apache.org/jira/browse/IGNITE-11042 Project: Ignite Issue Type: Task Components: documentation, sql Reporter: Vladimir Ozerov Fix For: 2.8 See {{modules\indexing\src\main\java\org\apache\ignite\internal\processors\query\h2\sys\view\SqlSystemViewTables.java}} for the list of columns. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Continuous queries and duplicates
Hi Piotr,

Unfortunately I do not have an answer to the question about ordering guarantees during node crashes for the same affinity key. Hopefully some other Ignite experts will be able to help. But in any case I doubt we will be able to have a public guarantee on the same affinity key, as opposed to the current approach (the key itself).

Vladimir.

On Fri, Jan 11, 2019 at 5:24 PM Piotr Romański wrote:
> Hi Vladimir, thank you for your response. I tested the current behaviour and it seems that the order is maintained for notifications within a partition. Unfortunately, I don't know how it would behave in exceptional situations like losing partitions, rebalancing etc. Do you think it would be possible to make that ordering guarantee a part of the Ignite API? What I would really need is to have order for notifications sharing the same affinity key, not even a partition. So I think it wouldn't require any cross-node ordering.
>
> Thank you,
> Piotr
>
> Wed, 9 Jan 2019, 21:11, Vladimir Ozerov wrote:
> > Hi,
> >
> > MVCC caches have the same ordering guarantees as non-MVCC caches, i.e. two subsequent updates on a single key will be delivered in the proper order. There are no guarantees beyond that. The order of updates in two subsequent transactions affecting the same partition may be guaranteed with the current implementation (though I am not sure), but even if it is so, I am not aware that this was ever our design goal. Most likely, this is an implementation artifact which may be changed in the future. Cache experts are needed to clarify this.
> >
> > As far as MVCC, data anomalies are still possible in the current implementation, because we didn't rework initial query handling in the first iteration, because technically this is not as simple as we thought. Once a snapshot is obtained, a query over that snapshot will return a data set consistent at some point in time.
> > But the problem is that there is a time frame between snapshot acquisition and listener installation (or vice versa), which leads to either duplicates or lost entries. Some multi-step listener installation will be required here. We haven't designed it yet.
> >
> > Vladimir.
> >
> > On Mon, Dec 24, 2018 at 10:06 PM Denis Magda wrote:
> > > > In my case, values are immutable - I never change them, I just add a new entry for newer versions. Does it mean that I won't have any duplicates between the initial query and listener entries when using continuous queries on caches supporting MVCC?
> > >
> > > I'm afraid there still might be a race. Val, Vladimir, other Ignite experts, please confirm.
> > >
> > > > After reading the related thread (
> > > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> > > > )
> > > > I'm now concerned about the ordering. My case assumes that there are groups of entries which belong to a business aggregate object, and I would like to make sure that if I commit two records in two serial transactions then I get notifications in the same order. Those entries will have different keys, so based on what you said ("we'd better to leave things as is and guarantee only per-key ordering"), it would seem that the order is not guaranteed. But do you think it would be possible to guarantee order when those entries share the same affinity key and they belong to the same partition?
> > >
> > > The order should be the same for key-value transactions. Vladimir, could you clarify the MVCC-based behavior?
> > >
> > > --
> > > Denis
> > >
> > > On Mon, Dec 17, 2018 at 9:55 AM Piotr Romański < piotr.roman...@gmail.com > wrote:
> > > > Hi all, sorry for answering so late.
> > > > I would like to use SqlQuery because I can leverage indexes there.
> > > >
> > > > As it was already mentioned earlier, the partition update counter is exposed through CacheQueryEntryEvent. Initially, I thought that the partition update counter is something that's persisted together with
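Piotr's observation — that per-affinity-key ordering should follow from per-partition ordering — rests on one property: keys sharing an affinity key always map to the same partition. The following toy model (the modulo mapping is a simplification, not Ignite's real affinity function) illustrates that property.

```java
// Toy model: the partition of a cache key depends only on its affinity key,
// so two entries of one business aggregate always land in one partition.
class AffinityModel {
    static final class Key {
        final long id;
        final String affKey; // e.g. the id of the business aggregate

        Key(long id, String affKey) { this.id = id; this.affKey = affKey; }
    }

    /** Simplified stand-in for an affinity function: uses affKey only. */
    static int partition(Key k, int parts) {
        return Math.floorMod(k.affKey.hashCode(), parts);
    }
}
```

If per-partition delivery order were a public guarantee, per-affinity-key order would follow for free; as the thread notes, today it is only an implementation artifact.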
Re: CompactFooter for ClientBinaryMarshaller
It's hard to believe that compact footers are not supported, as it was one of critical performance optimizations we implemented more than 4 years ago :-) If it is really so, we should prioritize the fix. On Tue, Jan 22, 2019 at 3:28 PM Igor Sapego wrote: > Roman, > > I've filed a ticket for C++: [1] > > [1] - https://issues.apache.org/jira/browse/IGNITE-11027 > > Best Regards, > Igor > > > On Tue, Jan 22, 2019 at 12:55 PM Roman Shtykh > wrote: > > > Igor, I see. How about having a warning if `BinaryConfiguration` is not > > provided explicitly to at least raise attention? And creating a JIRA > issue > > for C++ clients -- after it resolves we can probably switch it to cluster > > default. > > > > -- > > Roman Shtykh > > > > On Monday, January 21, 2019, 7:04:30 p.m. GMT+9, Igor Sapego < > > isap...@apache.org> wrote: > > > > I believe, it was set to false by default as it was kind of experimental > > optimisation. > > Also, I've checked right now and it seems that C++ clients (thick and > > thin)do not yet support compact footers. It may also be a blocker to set > > compactfooters to true by default. > > Best Regards,Igor > > > > On Sat, Jan 19, 2019 at 6:52 AM Roman Shtykh > > wrote: > > > > Thank you for the explanation. But here is the problem is not exactly > with > > deserialization but with that a user-defined key is being marshalled to a > > binary object with the compact footer set to true, while the key for > > putting has the footer set to false (which is server default). Thus we > have > > a different thing for the key when we try to retrieve and getting null. > > Therefore, I suppose switching client to server defaults is what has to > be > > done. If the user decides to switch to full schema mode, at least he/she > > will be aware of it. And for deserialization, the schema will be > retrieved, > > as you explained. What do you think? > > > > -- Roman > > On Friday, January 18, 2019, 10:52:11 p.m. 
GMT+9, Vladimir Ozerov < > > voze...@gridgain.com> wrote: > > > > "Compact footer" is optimization which saves a lot of space. Object > > serialized in this form do not have the full information required for > > deserialization. Metadata necessary for deserialization (aka "schema") is > > located on cluster nodes. For this client it could be requested through > > special command. Pleass see ClientOperation.GET_BINARY_TYPE as a starting > > point. > > On Fri, Jan 18, 2019 at 1:32 PM Igor Sapego wrote: > > > > I'm not sure, that such a change should be done in minor release, maybe > in > > 3.0 > > Vova, what do you think? It was you, who designed and developed compact > > footer, right? > > Best Regards,Igor > > > > On Fri, Jan 18, 2019 at 4:20 AM Roman Shtykh > > wrote: > > > > > I believe it has something to do with backward compatibility.That's > what > > I would like to know.If there's no strong reason to set it to false, it > > should be as Ignite's default -- that's what a user would expect. And if > > the user changes the configuration at the cluster, he/she will be aware > of > > that and change it at thin client.If we cannot set it to Ignite's > default, > > we can add a log message saying we force it to false. > > > > -- > > Roman > > > > > > On Thursday, January 17, 2019, 7:11:05 p.m. GMT+9, Igor Sapego < > > isap...@apache.org> wrote: > > > > First of all, I do not like that thin client is silently returns null. > It > > should be fixed. > > For the compact footer being set to false by default - I believe it has > > something to do withbackward compatibility. > > Best Regards,Igor > > > > On Thu, Jan 17, 2019 at 7:37 AM Roman Shtykh > > wrote: > > > > Igniters, > > After putting some data with a user-defined key with a thick client, it's > > impossible to retrieve it with a thin client. 
> > https://issues.apache.org/jira/browse/IGNITE-10960 (I was not sure it was > > a bug, so I first reported the issue to the user ML; Mikhail, thanks for > > checking and for the JIRA issue.) > > That happens because for Ignite `compactFooter` is `true` by default, but > > `ClientBinaryMarshaller` forces it to `false` if `BinaryConfiguration` is > > not created explicitly (see ClientBinaryMarshaller#createImpl). > > Any reason to force it to false? I would like to align it with Ignite > > defaults (by setting it to true). > > > > -- Roman
[jira] [Created] (IGNITE-10986) SQL: Drop _VER field support
Vladimir Ozerov created IGNITE-10986: Summary: SQL: Drop _VER field support Key: IGNITE-10986 URL: https://issues.apache.org/jira/browse/IGNITE-10986 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Alexander Lapin Fix For: 2.8 {{_VER}} is an undocumented hidden field that is never used in practice, but profiling shows that it consumes a lot of memory. Let's drop support for this field from all {{GridH2SearchRow}} implementations, as well as from the internal descriptors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10985) SQL: create low-overhead implementation of Row for SELECTs
Vladimir Ozerov created IGNITE-10985: Summary: SQL: create low-overhead implementation of Row for SELECTs Key: IGNITE-10985 URL: https://issues.apache.org/jira/browse/IGNITE-10985 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Assignee: Alexander Lapin Fix For: 2.8 Currently we use {{GridH2KeyValueRowOnheap}} for both update and search operations. This leads to *huge* memory overhead during {{SELECT}} execution. If you take a closer look at what is inside the row, you will note the following: # It has both a serialized and a deserialized {{GridCacheVersion}}, which is never needed # It has wrapped key and value objects # It has a reference to {{CacheDataRow}}, which is not needed either # It has a {{valCache}} field that is never used in SELECT The goal of this ticket is to create an optimized version of the row which will be created during {{SELECT}} operations only. It should contain only the minimally necessary information: # Key (unwrapped!) # Value (unwrapped!) # Version (unwrapped; we will remove it completely in a separate ticket) It should not contain a reference to {{CacheDataRow}}. There is a chance that we will need some pieces from it (e.g. cache ID and link for caching purposes), but it will definitely be only a small subset of the whole {{CacheDataRowAdapter}} (or, even worse, {{MvccDataRow}}). Entry point: the {{H2Tree.createRowFromLink}} methods. Note that they return {{GridH2Row}}, while their usages need only the much more relaxed {{GridH2SearchRow}}. So let's start with a new row implementation for these methods and then gradually remove all unnecessary stuff from there.
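The slim row described in the ticket can be sketched in plain Java as follows. This is only an illustration: the class and method names ({{H2PlainRow}} and its accessors) are hypothetical, and a real implementation would have to plug into the {{GridH2SearchRow}} hierarchy.

```java
// A minimal sketch of the slim SELECT row described in the ticket.
// Names here (H2PlainRow, key(), value(), version()) are hypothetical.
public class SelectRowSketch {

    /**
     * Slim row: only the unwrapped key, value and version.
     * No CacheDataRow reference, no serialized duplicates, no valCache field.
     */
    static final class H2PlainRow {
        private final Object key; // unwrapped key
        private final Object val; // unwrapped value
        private final long ver;   // version; to be dropped entirely in a follow-up ticket

        H2PlainRow(Object key, Object val, long ver) {
            this.key = key;
            this.val = val;
            this.ver = ver;
        }

        Object key()     { return key; }
        Object value()   { return val; }
        long   version() { return ver; }
    }

    public static void main(String[] args) {
        H2PlainRow row = new H2PlainRow(1, "John", 1L);
        System.out.println(row.key() + "=" + row.value());
    }
}
```

The point of the sketch is simply that per-row state shrinks to three references, instead of the serialized/deserialized version pair, wrapped objects, and the {{CacheDataRow}} back-reference carried by {{GridH2KeyValueRowOnheap}}.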
Re: CompactFooter for ClientBinaryMarshaller
"Compact footer" is an optimization that saves a lot of space. Objects serialized in this form do not have the full information required for deserialization. Metadata necessary for deserialization (aka "schema") is located on cluster nodes. For the thin client it can be requested through a special command. Please see ClientOperation.GET_BINARY_TYPE as a starting point. On Fri, Jan 18, 2019 at 1:32 PM Igor Sapego wrote: > I'm not sure that such a change should be done in a minor release; maybe in > 3.0. > > Vova, what do you think? It was you who designed and developed compact > footer, right? > > Best Regards, > Igor > > > On Fri, Jan 18, 2019 at 4:20 AM Roman Shtykh > wrote: > >> > I believe it has something to do with backward compatibility. That's >> what I would like to know. If there's no strong reason to set it to false, >> it should be the same as Ignite's default -- that's what a user would expect. And if >> the user changes the configuration at the cluster, he/she will be aware of >> that and change it at the thin client. If we cannot set it to Ignite's default, >> we can add a log message saying we force it to false. >> >> -- >> Roman >> >> >> On Thursday, January 17, 2019, 7:11:05 p.m. GMT+9, Igor Sapego < >> isap...@apache.org> wrote: >> >> First of all, I do not like that the thin client silently returns null. >> It should be fixed. >> For the compact footer being set to false by default -- I believe it has >> something to do with backward compatibility. >> Best Regards, >> Igor >> >> >> On Thu, Jan 17, 2019 at 7:37 AM Roman Shtykh >> wrote: >> >> Igniters, >> After putting some data with a user-defined key with a thick client, it's >> impossible to retrieve it with a thin client. 
>> https://issues.apache.org/jira/browse/IGNITE-10960 (I was not sure it was >> a bug, so I first reported the issue to the user ML; Mikhail, thanks for >> checking and for the JIRA issue.) >> That happens because for Ignite `compactFooter` is `true` by default, but >> `ClientBinaryMarshaller` forces it to `false` if `BinaryConfiguration` is >> not created explicitly (see ClientBinaryMarshaller#createImpl). >> Any reason to force it to false? I would like to align it with Ignite >> defaults (by setting it to true). >> >> -- Roman
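For reference, the workaround available to users today, and the alignment Roman proposes making the default, looks roughly like this on the thin-client side. This is a configuration sketch against the Java thin-client API as I understand it (it needs the ignite-core dependency and a running server, and the address is a placeholder):

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.BinaryConfiguration;
import org.apache.ignite.configuration.ClientConfiguration;

public class ThinClientCompactFooter {
    public static void main(String[] args) {
        // Providing BinaryConfiguration explicitly aligns the thin client
        // with the server-side default (compactFooter = true), instead of
        // the 'false' forced by ClientBinaryMarshaller#createImpl when no
        // binary configuration is supplied.
        BinaryConfiguration binCfg = new BinaryConfiguration()
            .setCompactFooter(true);

        ClientConfiguration cfg = new ClientConfiguration()
            .setAddresses("127.0.0.1:10800") // placeholder address
            .setBinaryConfiguration(binCfg);

        try (IgniteClient client = Ignition.startClient(cfg)) {
            // Keys marshalled by this client now use the same footer mode
            // as thick clients, so get() after a thick-client put() works.
        }
    }
}
```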
[jira] [Created] (IGNITE-10971) SQL: Support partition pruning for distributed joins
Vladimir Ozerov created IGNITE-10971: Summary: SQL: Support partition pruning for distributed joins Key: IGNITE-10971 URL: https://issues.apache.org/jira/browse/IGNITE-10971 Project: Ignite Issue Type: Task Components: sql Reporter: Vladimir Ozerov Fix For: 2.8 During the IGNITE-10307 implementation it was revealed that distributed joins do not work with partition pruning. We never observed it before because it was impossible to derive partitions from joins. The problem appears as a timeout exception from the reducer due to some timeouts/retries inside the distributed joins logic. Failures can be reproduced as follows: 1) Remove the {{GridSqlQuerySplitter.distributedJoins}} usage which prevents partitions from being derived for the map query. 2) Run any of the following tests and observe that some of the test cases fail with a reducer timeout: {{IgniteSqlSplitterSelfTest}} {{IgniteCacheJoinQueryWithAffinityKeyTest}} {{IgniteCacheDistributedJoinQueryConditionsTest}} {{IgniteCacheCrossCacheJoinRandomTest}} The root cause is unknown, but most likely it is due to some missing messages: some parts of the distributed join engine are not aware of the extracted partitions and await replies from nodes that are not involved. Note that most likely the same problem will appear for queries with distributed joins and explicit partitions ({{SqlFieldsQuery.partitions}}).