Re: [Announce] New committer: Iurii Gerzhedovich

2024-02-15 Thread Юрий
Thank you everyone, I'm very pleased!

On Wed, Feb 14, 2024 at 17:28, Pavel Pereslegin wrote:

> Congratulations, Iurii!
>
> On Wed, Feb 14, 2024 at 09:29, Pavel Tupitsyn wrote:
> >
> > Congratulations Iurii!
> >
> > On Wed, Feb 14, 2024 at 8:17 AM Roman Puchkovskiy <
> > roman.puchkovs...@gmail.com> wrote:
> >
> > > Congratulations!
> > >
> > > On Tue, Feb 13, 2024 at 23:51, Dmitriy Pavlov wrote:
> > > >
> > > > Dear Igniters,
> > > >
> > > > The Project Management Committee (PMC) for Apache Ignite
> > > > has invited Iurii Gerzhedovich to become a committer and we are
> pleased
> > > > to announce that he has accepted.
> > > >
> > > > Being a committer enables easier contribution to the
> > > > project since there is no need to go via the patch
> > > > submission process. This should enable better productivity.
> > > >
> > > >
> > > > Please join me in sincere congratulations to Iurii on his new role!
> > > > Iurii, keep the pace!
> > > >
> > > >
> > > > Sincerely,
> > > > Dmitriy Pavlov on behalf of Apache Ignite PMC
> > >
>


-- 
Live with a smile! :D


Re: [VOTE] Release Apache Ignite 3.0.0-beta1 RC2

2022-11-16 Thread Юрий
+1

On Tue, Nov 15, 2022 at 21:11, Vladislav Pyatkov wrote:

> +1
>
> On Tue, Nov 15, 2022 at 3:35 PM Denis C  wrote:
> >
> > +1
> >
> > On Tue, Nov 15, 2022 at 13:33, Alexander Lapin wrote:
> >
> > > +1
> > >
> > > On Tue, Nov 15, 2022 at 08:48, Pavel Tupitsyn wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > On Mon, Nov 14, 2022 at 9:05 PM Вячеслав Коптилин <
> > > > slava.kopti...@gmail.com>
> > > > wrote:
> > > >
> > > > > Dear Community,
> > > > >
> > > > > Ignite 3 is moving forward and I think we're in a good spot to
> release
> > > > the
> > > > > first beta version. In the last few months the following major
> features
> > > > > have been added:
> > > > > - RPM and DEB packages: simplified installation and node management
> > > with
> > > > > system services.
> > > > > - Client's Partition Awareness: Clients are now aware of data
> > > > distribution
> > > > > over the cluster nodes which helps avoid additional network
> > > transmissions
> > > > > and lowers operations latency.
> > > > > - C++ client:  Basic C++ client, able to perform operations on
> data.
> > > > > - Autogenerated values: now a function can be specified as a
> default
> > > > value
> > > > > generator during a table creation. Currently only gen_random_uuid
> is
> > > > > supported.
> > > > > - SQL Transactions.
> > > > > - Transactional Protocol: improved locking model, multi-version
> based
> > > > > lock-free read-only transactions.
> > > > > - Storage: A number of improvements to memory-only and on-disk
> engines
> > > > > based on Page Memory.
> > > > > - Indexes: Basic functionality, hash and sorted indexes.
> > > > > - Client logging: A LoggerFactory may be provided during client
> > > creation
> > > > to
> > > > > specify a custom logger for logs generated by the client.
> > > > > - Metrics framework: Collection and export of cluster metrics.
> > > > >
> > > > > I propose to release 3.0.0-beta1 with the features listed above.
> > > > >
> > > > > Release Candidate:
> > > > > https://dist.apache.org/repos/dist/dev/ignite/3.0.0-beta1-rc2/
> > > > > Maven Staging:
> > > > >
> > >
> https://repository.apache.org/content/repositories/orgapacheignite-1556/
> > > > > Tag: https://github.com/apache/ignite-3/tree/3.0.0-beta1-rc2
> > > > >
> > > > > +1 - accept Apache Ignite 3.0.0-beta1 RC2
> > > > >  0 - don't care either way
> > > > > -1 - DO NOT accept Apache Ignite 3.0.0-beta1 RC2 (explain why)
> > > > >
> > > > > Voting guidelines: https://www.apache.org/foundation/voting.html
> > > > > How to verify the release:
> > > https://www.apache.org/info/verification.html
> > > > >
> > > > > The vote will be closed on Wednesday, 16 November 2022, 18:00:00
> (UTC
> > > > time)
> > > > >
> > > > >
> > > >
> > >
> https://www.timeanddate.com/countdown/generic?iso=20221116T18&p0=1440&msg=Apache+Ignite+3.0.0-beta1+RC2&font=cursive&csz=1
> > > > >
> > > > > Thanks,
> > > > > S.
> > > > >
> > > >
> > >
>
>
>
> --
> Vladislav Pyatkov
>


-- 
Live with a smile! :D


Re: [ANNOUNCE] SCOPE FREEZE for Apache Ignite 3.0.0 beta 1 RELEASE

2022-11-04 Thread Юрий
Hi Igniters,

I would like to add one last bugfix to the beta1:
https://issues.apache.org/jira/browse/IGNITE-18090

It will be ready in a few hours.

On Thu, Nov 3, 2022 at 19:28, Alexander Lapin wrote:

> Hi Igniters,
>
> I would like to ask you to add one more bugfix to the beta1:
> https://issues.apache.org/jira/browse/IGNITE-18003
>
> Best regards,
> Aleksandr
>
> On Tue, Nov 1, 2022 at 17:17, Aleksandr Pakhomov wrote:
>
> > Hi Igniters,
> >
> > I would like to ask you to add two more tickets to the beta1:
> >
> > - https://issues.apache.org/jira/browse/IGNITE-18036 <
> > https://issues.apache.org/jira/browse/IGNITE-18036>
> > - https://issues.apache.org/jira/browse/IGNITE-18025 <
> > https://issues.apache.org/jira/browse/IGNITE-18025>
> >
> > Both of them now have PRs into main.
> >
> > Best regards,
> > Aleksandr
> >
> > > On 15 Oct 2022, at 00:45, Andrey Gura  wrote:
> > >
> > > Igniters,
> > >
> > > The 'ignite-3.0.0-beta1' branch was created (the latest commit is
> > > 8160ef31ecf8d49f227562b6f0ab090c6b4438c1).
> > >
> > > The scope for the release is frozen.
> > >
> > > It means the following:
> > >
> > > - Any issue could be added to the release (fixVersion == 3.0.0-beta1)
> > > only after discussion with the community and a release manager in this
> > > thread.
> > > - Any commit to the release branch must be also applied to the 'main'
> > branch.
> >
> >
>


-- 
Live with a smile! :D


Re: [ANNOUNCE] SCOPE FREEZE for Apache Ignite 3.0.0 beta 1 RELEASE

2022-10-31 Thread Юрий
Hello, I would like to add one more small and simple, but mandatory,
ticket to ignite-3.0.0-beta1:
https://issues.apache.org/jira/browse/IGNITE-18016
<https://issues.apache.org/jira/browse/IGNITE-18016>

On Fri, Oct 28, 2022 at 19:51, Vladislav Pyatkov wrote:

> Hi,
>
> The following tickets were merged to the release branch because they contain
> critical fixes:
>
>1. IGNITE-17967 - contains a patch that solves an issue with hanging RO
>transactions on a multi-node cluster.
>2. IGNITE-18005 - fixes a bug with a batch remove operation that is
>   used in SQL internally. Without the fix it is impossible to delete an
>   entry through SQL; instead, the entry is stored with an empty value
>   slice.
>
>
> On Fri, Oct 28, 2022 at 6:07 PM Mikhail Pochatkin 
> wrote:
>
> > Hello, I would like to add the following tickets to ignite-3.0.0-beta1:
> > [IGNITE-17966] Fix problem with stuck Gradle processes in .NET tests -
> ASF
> > JIRA (apache.org) <https://issues.apache.org/jira/browse/IGNITE-17966>
> > [IGNITE-17965] Enable remote build cache for Gradle - ASF JIRA (
> apache.org
> > )
> > <https://issues.apache.org/jira/browse/IGNITE-17965>
> > [IGNITE-17980] ./gradlew clean build -x test fails - ASF JIRA (
> apache.org)
> > <https://issues.apache.org/jira/browse/IGNITE-17980>
> > [IGNITE-18009] Fix gradle build - ASF JIRA (apache.org)
> > <https://issues.apache.org/jira/browse/IGNITE-18009>
> >
> > These tickets are needed to unblock the Gradle build and are required for
> > the packaging scope of beta1. Thanks!
> >
> > On Mon, Oct 24, 2022 at 10:27 PM Mikhail Pochatkin
>  > >
> > wrote:
> >
> > > Hello, Igniters.
> > >
> > > I want to point out that the current beta seems to be blocked by
> > > [IGNITE-17966] <https://issues.apache.org/jira/browse/IGNITE-17966>.
> The
> > > main problem is that we cannot enable Gradle build on CI at this
> moment,
> > > but we need it because all beta distributions are implemented via
> Gradle
> > > build. So, I am trying to fix it in a short time.
> > >
> > > On Sat, Oct 22, 2022 at 5:35 PM Stanislav Lukyanov <
> > stanlukya...@gmail.com>
> > > wrote:
> > >
> > >> There are 11 unresolved tickets in the scope now, 4 In Progress and 7
> > >> Patch Available.
> > >>
> > >> I think we should try to set the code freeze according to the ticket
> > >> estimates instead of just setting it to the end of next week. I'll
> work
> > >> with each ticket owner to determine the critical path.
> > >>
> > >> I also saw an Open ticket that was added to the scope outside of the
> > >> process. I descoped it already, but we need to be careful of new
> tickets
> > >> being added to the scope.
> > >>
> > >> Thanks,
> > >> Stan
> > >>
> > >> > On 20 Oct 2022, at 15:59, Вячеслав Коптилин <
> slava.kopti...@gmail.com
> > >
> > >> wrote:
> > >> >
> > >> > Hello Alexandr,
> > >> >
> > >> > Ok, I added these tickets to the scope.
> > >> >
> > >> > There are 12 tickets that are not resolved and included into the
> > scope.
> > >> So,
> > >> > we have to move the code freeze to the end of next week.
> > >> >
> > >> > Thanks,
> > >> > Slava.
> > >> >
> > >> > On Thu, Oct 20, 2022 at 08:36, Aleksandr Pakhomov wrote:
> > >> >
> > >> >> Hi, Igniters.
> > >> >>
> > >> >> I would like to ask you to add a couple of tickets that are
> required
> > >> for
> > >> >> packaging:
> > >> >>
> > >> >> https://issues.apache.org/jira/browse/IGNITE-17781 <
> > >> >> https://issues.apache.org/jira/browse/IGNITE-17781>
> > >> >> https://issues.apache.org/jira/browse/IGNITE-17773 <
> > >> >> https://issues.apache.org/jira/browse/IGNITE-17773>
> > >> >>
> > >> >> These tickets are the last tickets in the packaging scope for
> beta1.
> > >> >>
> > >> >> --
> > >> >> Best regards,
> > >> >> Aleksandr
> > >> >>
> > >> >>> On 19 Oct 2022, at 21:48, Вячеслав Коптилин <
> > slava.kopti...@gmail.com
> > >> >
> > 

Re: [ANNOUNCE] SCOPE FREEZE for Apache Ignite 3.0.0 beta 1 RELEASE

2022-10-19 Thread Юрий
Slava, thank you.

During cherry-picking, a dependency of one of the aforementioned tickets
was observed on:
https://issues.apache.org/jira/browse/IGNITE-17907
https://issues.apache.org/jira/browse/IGNITE-17671
https://issues.apache.org/jira/browse/IGNITE-17816

So, I propose adding them into the release scope.

On Wed, Oct 19, 2022 at 15:53, Вячеслав Коптилин wrote:

> Hi Yuriy,
>
> I agree, let's add them to the scope.
>
> Thanks,
> S.
>
>
> On Wed, Oct 19, 2022 at 15:20, Юрий wrote:
>
> > Dear Release managers and Igniters,
> >
> > I would like to add the following tickets to Ignite 3.0.0 beta1:
> >
> > https://issues.apache.org/jira/browse/IGNITE-17820 - an SQL improvement,
> > required for the next ticket
> > https://issues.apache.org/jira/browse/IGNITE-17748 - related to index
> > support
> > https://issues.apache.org/jira/browse/IGNITE-17612 - fixes an issue where
> > some queries couldn't be executed
> > https://issues.apache.org/jira/browse/IGNITE-17330 - support of RO
> > transactions in SQL
> > https://issues.apache.org/jira/browse/IGNITE-17859 - index filling
> > https://issues.apache.org/jira/browse/IGNITE-17813 - related to index
> > support in SQL
> > https://issues.apache.org/jira/browse/IGNITE-17655 - related to index
> > support in SQL
> >
> > On Wed, Oct 19, 2022 at 12:11, Вячеслав Коптилин wrote:
> >
> > > Hello Alexander,
> > >
> > > Thank you for pointing this out. I fully support including RO
> > transactions
> > > into the scope of Ignite 3.0.0-beta1 release.
> > >
> > > Thanks,
> > > S.
> > >
> > >
> > > On Wed, Oct 19, 2022 at 11:42, Alexander Lapin wrote:
> > >
> > > > Igniters,
> > > >
> > > > I would like to add the following tickets to ignite-3.0.0-beta1:
> > > > https://issues.apache.org/jira/browse/IGNITE-17806
> > > > https://issues.apache.org/jira/browse/IGNITE-17759
> > > > https://issues.apache.org/jira/browse/IGNITE-17637
> > > > https://issues.apache.org/jira/browse/IGNITE-17263
> > > > https://issues.apache.org/jira/browse/IGNITE-17260
> > > >
> > > > It's all about read-only transactions.
> > > >
> > > > Best regards,
> > > > Alexander
> > > >
> > > > On Fri, Oct 14, 2022 at 19:45, Andrey Gura wrote:
> > > >
> > > > > Igniters,
> > > > >
> > > > > The 'ignite-3.0.0-beta1' branch was created (the latest commit is
> > > > > 8160ef31ecf8d49f227562b6f0ab090c6b4438c1).
> > > > >
> > > > > The scope for the release is frozen.
> > > > >
> > > > > It means the following:
> > > > >
> > > > > - Any issue could be added to the release (fixVersion ==
> 3.0.0-beta1)
> > > > > only after discussion with the community and a release manager in
> > this
> > > > > thread.
> > > > > - Any commit to the release branch must be also applied to the
> 'main'
> > > > > branch.
> > > > >
> > > >
> > >
> >
> >
> > --
> > Live with a smile! :D
> >
>


-- 
Live with a smile! :D


Re: [ANNOUNCE] SCOPE FREEZE for Apache Ignite 3.0.0 beta 1 RELEASE

2022-10-19 Thread Юрий
Dear Release managers and Igniters,

I would like to add the following tickets to Ignite 3.0.0 beta1:

https://issues.apache.org/jira/browse/IGNITE-17820 - an SQL improvement,
required for the next ticket
https://issues.apache.org/jira/browse/IGNITE-17748 - related to index support
https://issues.apache.org/jira/browse/IGNITE-17612 - fixes an issue where some
queries couldn't be executed
https://issues.apache.org/jira/browse/IGNITE-17330 - support of RO
transactions in SQL
https://issues.apache.org/jira/browse/IGNITE-17859 - index filling
https://issues.apache.org/jira/browse/IGNITE-17813 - related to index support
in SQL
https://issues.apache.org/jira/browse/IGNITE-17655 - related to index support
in SQL

On Wed, Oct 19, 2022 at 12:11, Вячеслав Коптилин wrote:

> Hello Alexander,
>
> Thank you for pointing this out. I fully support including RO transactions
> into the scope of Ignite 3.0.0-beta1 release.
>
> Thanks,
> S.
>
>
> On Wed, Oct 19, 2022 at 11:42, Alexander Lapin wrote:
>
> > Igniters,
> >
> > I would like to add the following tickets to ignite-3.0.0-beta1:
> > https://issues.apache.org/jira/browse/IGNITE-17806
> > https://issues.apache.org/jira/browse/IGNITE-17759
> > https://issues.apache.org/jira/browse/IGNITE-17637
> > https://issues.apache.org/jira/browse/IGNITE-17263
> > https://issues.apache.org/jira/browse/IGNITE-17260
> >
> > It's all about read-only transactions.
> >
> > Best regards,
> > Alexander
> >
> > On Fri, Oct 14, 2022 at 19:45, Andrey Gura wrote:
> >
> > > Igniters,
> > >
> > > The 'ignite-3.0.0-beta1' branch was created (the latest commit is
> > > 8160ef31ecf8d49f227562b6f0ab090c6b4438c1).
> > >
> > > The scope for the release is frozen.
> > >
> > > It means the following:
> > >
> > > - Any issue could be added to the release (fixVersion == 3.0.0-beta1)
> > > only after discussion with the community and a release manager in this
> > > thread.
> > > - Any commit to the release branch must be also applied to the 'main'
> > > branch.
> > >
> >
>


-- 
Live with a smile! :D


Re: [ANNOUNCE] Apache Ignite 3.0.0 alpha 5: Code freeze

2022-06-08 Thread Юрий
It seems SQL API: Add batched DML queries support [1] will be moved to the
next alpha, so please don't wait for it.

At the same time, SQL API: Examples [2] is already done and merged to the
alpha 5 branch.

[1] https://issues.apache.org/jira/browse/IGNITE-16963
[2] https://issues.apache.org/jira/browse/IGNITE-17088

On Mon, Jun 6, 2022 at 22:52, Andrey Gura wrote:

> Igniters,
>
> ignite-3.0.0-alpha5 release branch has been created. But the following
> issues are still in progress:
>
> Data rebalancing
> https://issues.apache.org/jira/browse/IGNITE-14209
>
> SQL API: Add batched DML queries support.
> https://issues.apache.org/jira/browse/IGNITE-16963
>
> SQL API: Examples.
> https://issues.apache.org/jira/browse/IGNITE-17088
>
> Please make sure these commits are merged to the main and to the
> release branch.
>
> Thanks!
>
> On Mon, Jun 6, 2022 at 6:02 PM Aleksandr Pakhomov 
> wrote:
> >
> > Hi Andrey,
> >
> > As for CLI MVP, the planned timeline is today till 21:00.
> > It will probably be ready in an hour. Just waiting for the CI build.
> >
> > Best regards,
> > Aleksandr
> >
> > > On 6 Jun 2022, at 17:56, Andrey Gura  wrote:
> > >
> > > Igniters,
> > >
> > > our release schedule has shifted a bit. But it is time for a code
> > > freeze and a new branch creation.
> > >
> > > The following issues are still in progress (not the issue status, but the
> > > actual work state):
> > >
> > > Data rebalancing
> > > https://issues.apache.org/jira/browse/IGNITE-14209
> > >
> > > CLI MVP
> > > https://issues.apache.org/jira/browse/IGNITE-16971
> > >
> > > [Native Persistence 3.0] End-to-end test for persistent PageMemory
> > > https://issues.apache.org/jira/browse/IGNITE-17107
> > >
> > > SQL API: Implement query metadata
> > > https://issues.apache.org/jira/browse/IGNITE-16962
> > >
> > > SQL API: Add batched DML queries support.
> > > https://issues.apache.org/jira/browse/IGNITE-16963
> > >
> > > SQL API: Examples.
> > > https://issues.apache.org/jira/browse/IGNITE-17088
> > >
> > > Please, give some planned timelines for these issues. I would like to
> > > create the ignite-3.0.0-alpha5 branch today and announce Code Freeze.
> > > Otherwise, extra steps will be required to include the PR's to the
> > > release branch.
> > >
> > > Thanks!
> >
>


-- 
Live with a smile! :D


Re: Apache Ignite 3.0.0 alpha 5 RELEASE [Time, Scope, Manager]

2022-05-23 Thread Юрий
Hi Andrey,

That's good news. Thanks for taking the RM role and providing the list of
changes for the alpha.
The proposed dates look good to me, as does the scope of the release.

On Mon, May 23, 2022 at 16:05, Andrey Gura wrote:

> Hi Igniters,
>
> Four months have passed already since the Ignite 3 alpha 4 release. At
> the moment we have a set of features that can be released in order to
> give users the ability to try the features and share some feedback
> with the community. The expected feature list consists of:
>
>   - Pluggable storages: ability to choose a specific storage for a
> table (LSM based storage, Page memory persistent and in-memory
> storage) with some known limitations.
>   - Compute API (A simple remote job execution): The first phase of
> Compute API design and implementation. Of course, with known
> limitations.
>   - Data colocation: The colocation key concept replaces the affinity
> key concept. DDL introduces COLOCATE BY clause. Colocated job
> execution.
>   - Open API for the Ignite REST endpoints: A Specification to
> generate a client for any language + auto-generated docs for REST API.
>   - Ignite REPL: The Ignite CLI as a REPL with autocompletion and improved
> UX.
>   - Cluster lifecycle: It introduces cluster initialization logic and
> allows to specify cluster management and meta storage groups. Improved
> node join protocol.
>   - Local and distributed recovery: Now it is possible to restart a
> cluster/node without data loss.
>   - Data rebalance improvements (in progress and could be excluded
> from the release), including dynamically changing the number of
> partition replicas.
>   - Robust client connection with seamless reconnection support and
> retry policies.
>   - Java API for SQL: A simplified API (design only) for executing SQL
> queries on a cluster.
>
> I want to propose myself to be the release manager of the Ignite 3 alpha 5.
>
> Also I propose the following milestones for the release:
>
> Scope Freeze: June 1, 2022
> Code Freeze: June 3, 2022
> Voting Date: June 6, 2022
> Release Date: June 10, 2022
>
> Please, take into account that the proposed release is still alpha, so
> we can afford to have such a compressed schedule.
>
> WDYT?
>


-- 
Live with a smile! :D


Re: [VOTE] Create separate Jira project and Confluence space for Ignite 3

2021-10-05 Thread Юрий
+1

On Tue, Oct 5, 2021 at 02:52, Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:

> Hello Community,
>
> As discussed in [1], I would like to propose the creation of a separate
> Jira project and Confluence space for Ignite 3.
>
> Ignite 2 and Ignite 3 are developed in parallel in separate repos, so we
> need a clear separation in other tools as well - this will help to
> streamline the development process. Please refer to the discussion for more
> details.
>
> [1]
>
> https://lists.apache.org/thread.html/rdcad3fc64b9f3a848c93089baae2bee1124a97869a94f4a04dd80fdf%40%3Cdev.ignite.apache.org%3E
>
> Voting options:
>
>- +1 - Agree with the suggestion
>- 0 - Don't care much about the suggestion
>- -1 - Disagree with the suggestion
>
> This is a majority vote.
>
> Voting ends in 72 hours, at 5pm PDT on October 7:
>
> https://www.timeanddate.com/counters/fullscreen.html?mode=a&iso=20211007T17&year=2021&month=10&day=7&hour=17&min=0&sec=0&p0=224
>
> -Val
>


-- 
Live with a smile! :D


Re: [DISCUSS] Confuse default inspections.

2021-07-20 Thread Юрий
I totally agree.
Let's get rid of them.

On Tue, Jul 20, 2021 at 18:11, Konstantin Orlov wrote:

> + for both
>
> --
> Regards,
> Konstantin Orlov
>
>
> > On 20 Jul 2021, at 16:32, Pavel Tupitsyn  wrote:
> >
> > Agree with both points
> >
> > On Tue, Jul 20, 2021 at 3:14 PM Alexander Polovtcev <
> alexpolovt...@gmail.com>
> > wrote:
> >
> >> this is a very welcome change for me
> >>
> >> On Tue, Jul 20, 2021 at 10:13 AM Ivan Pavlukhin 
> >> wrote:
> >>
> >>> + for both points.
> >>>
> >>> 2021-07-20 9:56 GMT+03:00, Ivan Daschinsky :
>  Hi!
> 
>  Firstly, lets talk about interfaces.
> 
>  1. First of all, do we have an automatic inspection for it? AFAIK we
> >>> don't
>  2. I am for consistency. At least for production code. Nothing worse
> is
>  when someone mixes both approaches.
> 
>  About a prohibition of curly brackets around one line -- I am strongly
>  against this rule, it should be removed. There were a few bugs that
> >> this
>  rule caused.
> 
>  вт, 20 июл. 2021 г. в 09:23, Zhenya Stanilovsky
> >>>  > :
> 
> >
> > Igniters, I understand that this is a very long and fundamental story,
> > but I still want to raise this discussion. We have 2 very strange
> > inspections:
> > *  «public» modifier in interface methods.
> > *  Illegal ‘{}’ for one line statement. — I found it harmful.
> > I don't want to link an additional discussion about pos. 2; I think the
> > harm is obvious.
> > I suggest getting rid of them.
> >
> > What do you think?
> >
> >
> >
> 
> 
> 
>  --
>  Sincerely yours, Ivan Daschinskiy
> 
> >>>
> >>>
> >>> --
> >>>
> >>> Best regards,
> >>> Ivan Pavlukhin
> >>>
> >>
> >>
> >> --
> >> With regards,
> >> Aleksandr Polovtcev
> >>
>
>

-- 
Live with a smile! :D


Re: Stop sending IGNITE Created e-mails to dev@

2021-04-29 Thread Юрий
Hi,
Ilya, could you please add me to "Contributors 1" too?

On Mon, Apr 26, 2021 at 12:29, Ilya Kasnacheev wrote:

> Hello!
>
> I have added you to the "Contributors 1" role. Everybody in this role will
> still get those "issue created" e-mails.
>
> Feel free to ask me to enlist you.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Thu, Apr 22, 2021 at 18:16, Ivan Pavlukhin wrote:
>
> > > All issues notifications are also sent to iss...@ignite.apache.org so
> > one can subscribe to this list in order to track the created tickets.
> >
> > Does not sound like useful advice. The issues list [1] looks like a real
> > scrapyard; I doubt that it can be usable for anyone in its current flavor.
> > Can we send only "Created" notifications there?
> >
> > [1] https://lists.apache.org/list.html?iss...@ignite.apache.org
> >
> > 2021-04-21 18:30 GMT+03:00, Ilya Kasnacheev :
> > > Hello!
> > >
> > > INFRA ticket created:
> https://issues.apache.org/jira/browse/INFRA-21762
> > >
> > > I have asked to keep sending the created issue notifications for
> > > "Contributors 1" role, which is empty at present. So if you wish to
> keep
> > > getting those e-mails, please add yourself to this role or tell me to
> do
> > so
> > > for you.
> > >
> > > Regards,
> > > --
> > > Ilya Kasnacheev
> > >
> > >
> > > On Wed, Apr 21, 2021 at 17:59, Alexey Goncharuk <
> > alexey.goncha...@gmail.com> wrote:
> > >
> > >> I support the idea. All issues notifications are also sent to
> > >> iss...@ignite.apache.org so one can subscribe to this list in order
> to
> > >> track the created tickets. The notifications trash the devlist archive
> > UI
> > >> and make it extremely difficult to navigate.
> > >>
> > >> > On Tue, Apr 20, 2021 at 18:35, Ilya Kasnacheev <
> ilya.kasnach...@gmail.com> wrote:
> > >>
> > >> > Hello, Maxim!
> > >> >
> > >> > You are free to revert any commit which has led to any new stable
> test
> > >> > failure, or new flaky test that was non-flaky before.
> > >> >
> > >> > Just revert the change and reopen the ticket.
> > >> >
> > >> > The problem here is that it's very hard to detect on the spot, most
> of
> > >> > MTCGA e-mails are false positives and even if they are not, it is
> not
> > >> > relevant for most of developers.
> > >> >
> > >> > WDYT? I'm also still waiting for more input.
> > >> >
> > >> > Regards,
> > >> > --
> > >> > Ilya Kasnacheev
> > >> >
> > >> >
> > >> > On Wed, Apr 14, 2021 at 21:26, Maxim Muzafarov wrote:
> > >> >
> > >> > > +1 for new JIRA issues
> > >> > > -1 for MTCGA notifications
> > >> > >
> > >> > > Why should we hide errors from the dev-list? Who should take care of
> > >> > > issues reported by MTCGA.Bot in this case?
> > >> > > We must apply stricter rules for such issues: a commit leading to
> an
> > >> > > error must be reverted.
> > >> > >
> > >> > > On Wed, 14 Apr 2021 at 20:00, Denis Mekhanikov
> > >> > > 
> > >> > > wrote:
> > >> > > >
> > >> > > > Huge +1 to this.
> > >> > > >
> > >> > > > I've already brought up this topic in the past:
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/Bots-on-dev-list-td34406.html
> > >> > > > I hope some day newcomers won't need to set up their email
> filters
> > >> when
> > >> > > > they come to the developers list.
> > >> > > >
> > >> > > > Denis
> > >> > > >
> > >> > > > On Wed, Apr 14, 2021 at 18:07, Atri Sharma wrote:
> > >> > > >
> > >> > > > > +1 to move issues to the issues list.
> > >> > > > >
> > >> > > > > For MTCGA, maybe build@?
> > >> > > > >
> > >> > > > > On Wed, Apr 14, 2021 at 8:35 PM Ilya Kasnacheev
> > >> > > > > 
> > >> > > wrote:
> > >> > > > > >
> > >> > > > > > Hello!
> > >> > > > > >
> > >> > > > > > We have a discussion on how to ensure best engagement in
> dev@
> > >> > list,
> > >> > > and
> > >> > > > > it
> > >> > > > > > seems that Issue Created emails from IGNITE project consume
> a
> > >> > > > > > lot
> > >> > of
> > >> > > > > screen
> > >> > > > > > space, it's hard to spot genuine discussions in
> > >> > > > > > https://lists.apache.org/list.html?dev@ignite.apache.org
> for
> > >> > > example.
> > >> > > > > >
> > >> > > > > > We already have issues@ mailing list. I propose that we
> stop
> > >> > > sending any
> > >> > > > > > JIRA emails to dev@. If anyone wishes to get just Created
> > >> emails,
> > >> > > they
> > >> > > > > can
> > >> > > > > > subscribe to these messages in their JIRA account settings.
> I
> > >> > imagine
> > >> > > > > most
> > >> > > > > > of you already filter these messages out, so you may need to
> > >> adjust
> > >> > > your
> > >> > > > > > filters slightly.
> > >> > > > > >
> > >> > > > > > A distant second is MTCGA messages, which are also
> > >> > > > > > autogenerated
> > >> > and
> > >> > > not
> > >> > > > > > informative for most readers of the channel, since they are
> at
> > >> best
> > >> > > > > > targeted at a single committer and at worst flaky.
> > >> > > > > >
> > >> > > > > > Where could we move those? What is your opinion here, on
> both
> > >> > issues?
> > >> > > > > >
>

Re: Removing MVCC public API

2020-12-10 Thread Юрий
+1

On Wed, Dec 9, 2020 at 11:25, Maxim Muzafarov wrote:

> +1
>
>
> Also, I want to mention the list of MVCC-related open issues [1]
> without any updates for over a year.
>
> [1]  https://s.apache.org/1r5yk
>
> On Wed, 9 Dec 2020 at 10:22, Alexei Scherbakov
>  wrote:
> >
> > +1
> >
> > On Wed, Dec 9, 2020 at 10:03, Petr Ivanov wrote:
> >
> > > +1
> > >
> > >
> > > > On 9 Dec 2020, at 09:39, Nikita Amelchev 
> wrote:
> > > >
> > > > +1
> > > >
> > > > On Wed, Dec 9, 2020 at 08:29, ткаленко кирилл wrote:
> > > >>
> > > >> +1
> > > >>
> > > >>
> > > >> 08.12.2020, 23:47, "Andrey Mashenkov" :
> > > >>> +1
> > > >>>
> > > >>> On Tue, Dec 8, 2020 at 11:22 PM Igor Seliverstov <
> gvvinbl...@gmail.com
> > > >
> > > >>> wrote:
> > > >>>
> > >  +1
> > > 
> > >  On 08.12.2020 22:38, Andrey Gura wrote:
> > > > +1
> > > >
> > > > On Tue, Dec 8, 2020 at 10:02 PM Nikolay Izhikov <
> nizhi...@apache.org
> > > >
> > >  wrote:
> > > >> +1
> > > >>
> > > >>> On 8 Dec 2020, at 21:54, Valentin Kulichenko <
> > >  valentin.kuliche...@gmail.com> wrote:
> > > >>>
> > > >>> +1
> > > >>>
> > > >>> On Tue, Dec 8, 2020 at 8:31 AM Вячеслав Коптилин <
> > >  slava.kopti...@gmail.com>
> > > >>> wrote:
> > > >>>
> > >  Hello Igniters,
> > > 
> > >  I want to start voting on removing the public API (and
> eventually
> > > all
> > >  unused parts) related to the MVCC feature.
> > > 
> > >  This topic has already been discussed many times (at least,
> [1],
> > > [2])
> > >  and
> > >  the community has agreed the feature implementation must be
> > >  reapproached,
> > >  because using coordinator node for transactions ordering and
> 2pc
> > >  protocol
> > >  is slow by design and will not scale well. [3]
> > > 
> > >  Moreover, the current implementation has critical issues [4],
> not
> > >  supported
> > >  by the community, and not well tested at all.
> > > 
> > >  Removing the public API first will allow us to clean up the
> code
> > >  later step
> > >  by step without rushing and keep intact useful improvements
> that
> > > are
> > >  already in use or can be reused for other parts in the future.
> > >  For instance, partition counters implementation is already
> > > adapted to
> > >  fix
> > >  tx caches protocol issues [5].
> > > 
> > >  The future of MVCC is unclear for now, but, definitely, this
> > > feature
> > >  is
> > >  useful for a lot of user scenarios and can be scheduled for
> later
> > >  Ignite
> > >  versions.
> > >  Also, the MVCC feature is in an experimental state, so it can
> be
> > >  modified
> > >  in any way, I think.
> > > 
> > >  +1 - to accept removing MVVC feature from public API
> > >  0 - don't care either way
> > >  -1 - do not accept removing API (explain why)
> > > 
> > >  The vote will hold for 7 days and will end on Wednesday,
> December
> > >  16th at
> > >  19:00 UTC:
> > > 
> > > 
> > > 
> > >
> https://www.timeanddate.com/countdown/generic?iso=20201216T19&p0=1440&font=cursive
> > > 
> > >  [1]
> > > 
> > > 
> > > 
> > >
> http://apache-ignite-developers.2346864.n4.nabble.com/Mark-MVCC-with-IgniteExperimental-td45669.html
> > >  [2]
> > > 
> > > 
> > > 
> > >
> http://apache-ignite-developers.2346864.n4.nabble.com/Disable-MVCC-test-suites-td50416.html
> > >  [3]
> > > 
> > > 
> > > 
> > >
> http://apache-ignite-developers.2346864.n4.nabble.com/Mark-MVCC-with-IgniteExperimental-tp45669p45727.html
> > >  [4]
> > > 
> > > 
> > > 
> > >
> http://apache-ignite-developers.2346864.n4.nabble.com/Mark-MVCC-with-IgniteExperimental-tp45669p45716.html
> > >  [5]
> > > 
> > > 
> > > 
> > >
> http://apache-ignite-developers.2346864.n4.nabble.com/Mark-MVCC-with-IgniteExperimental-tp45669p45714.html
> > > 
> > >  Thanks,
> > >  Slava.
> > > 
> > > >>>
> > > >>> --
> > > >>> Best regards,
> > > >>> Andrey V. Mashenkov
> > > >
> > > >
> > > >
> > > > --
> > > > Best wishes,
> > > > Amelchev Nikita
> > >
> > >
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
>


-- 
Live with a smile! :D


Re: IEP-54: Schema-first approach for 3.0

2020-11-26 Thread Юрий
A few of my thoughts about unsigned types:

1. It seems we can support unsigned types.
2. It requires adding new types to the internal representation, protocol,
etc.
3. The internal representation should be the same as for signed types, so it
will not require more memory.
4. Users should be aware of the specifics of such types on platforms that do
not support unsigned types. For example, a user could get the value -6 in
Java for the unsigned byte value 250 (which is correct from a bit-pattern
perspective). I don't think we should use a wider type for such cases; in
particular, it would be bad for unsigned long, which would require returning
a BigInteger.
5. Possibly it requires a suffix/prefix for literals of the new types, like
'250u', meaning that 250 is an unsigned value.
6. It requires slightly more expensive comparison logic for indexes.
7. It requires new comparison logic for expressions. I think it is not
possible with the current H2 engine and probably possible with the new
Calcite engine. Clarification is needed from anybody involved in this part.

WDYT?
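To illustrate points 4 and 6, here is a small self-contained Java sketch (standard JDK only, no Ignite APIs) showing how a stored unsigned byte gets reinterpreted and why index comparison needs unsigned semantics:

```java
public class UnsignedDemo {
    public static void main(String[] args) {
        // The unsigned byte value 250 stored in a signed Java byte is
        // reinterpreted as -6 (same bit pattern, 0xFA) -- point 4 above.
        byte b = (byte) 250;
        System.out.println(b);                     // prints -6

        // JDK helpers recover the unsigned interpretation without
        // widening the stored type:
        System.out.println(Byte.toUnsignedInt(b)); // prints 250

        // Point 6: a plain signed comparison gets the order wrong (-6 < 10):
        System.out.println(Integer.compare(b, 10) < 0);              // prints true

        // An unsigned comparison restores the intended order (250 > 10):
        System.out.println(Integer.compareUnsigned(b & 0xFF, 10) > 0); // prints true
    }
}
```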

вт, 24 нояб. 2020 г. в 18:36, Alexey Goncharuk :

> Actually, we can support comparisons in 3.0: once we the actual type
> information, we can make proper runtime adjustments and conversions to
> treat those values as unsigned - it will be just a bit more expensive.
>
> вт, 24 нояб. 2020 г. в 18:32, Pavel Tupitsyn :
>
> > > SQL range queries it will break
> > > WHERE x > y may return wrong results
> >
> > Yes, range queries, inequality comparisons and so on are broken
> > for unsigned data types, I think I mentioned this somewhere above.
> >
> > Again, in my opinion, we can document that SQL is not supported on those
> > types,
> > end of story.
> >
> > On Tue, Nov 24, 2020 at 6:25 PM Alexey Goncharuk <
> > alexey.goncha...@gmail.com>
> > wrote:
> >
> > > Folks, I think this is a reasonable request. I thought about this when
> I
> > > was drafting the IEP, but hesitated to add these types right away.
> > >
> > > > That is how it works in Ignite since the beginning with .NET and C++
> :)
> > > I have some doubts that it actually works as expected, it needs some
> > > checking (will be glad if my concerns are false):
> > >
> > >- It's true that equality check works properly, but for SQL range
> > >queries it will break unless some special care is taken on Java
> side:
> > > for
> > >u8 255 > 10, but in Java (byte)255 will be converted to -1, which
> will
> > >break the comparison. Since we don't have unsigned types now, I
> doubt
> > it
> > >works.
> > >- There is an obvious cross-platform data loss when "intuitive" type
> > >mapping is used by a user (u8 corresponds to byte type in .NET, but
> to
> > >avoid values loss, a user will have to use short type in Java, and
> > > Ignite
> > >will also need to take care of the range check during
> serialization).
> > I
> > >think we can even allow to try to deserialize a value into arbitrary
> > > type,
> > >but throw an exception if the range is out of bounds.
> > >
> > > Overall, I agree with Andrey's comments.
> > > Andrey, do you mind updating the IEP once all the details are settled
> > here?
> > >
> > > вт, 24 нояб. 2020 г. в 18:19, Andrey Mashenkov <
> > andrey.mashen...@gmail.com
> > > >:
> > >
> > > > Pavel,
> > > >
> > > > I believe uLong values beyond 2^63 can't be treated correctly for now
> > > > (WHERE x > y may return wrong results)
> > > >
> > > > I think we could make "true" support for unsigned types, but they
> will
> > > have
> > > > limitations on the Java side.
> > > > Thus, the one will not be able to map uint64 to Java long primitive,
> > but
> > > to
> > > > BigInteger only.
> > > > As for indices, we could read uint64 to Java long, but treat negative
> > > > values in a different way to preserve correct ordering.
> > > >
> > > > These limitations will affect only mixed environments when .Net and
> > Java
> > > > used to access the data.
> > > > Will this solution address your issues?
> > > >
> > > >
> > > > On Tue, Nov 24, 2020 at 5:45 PM Pavel Tupitsyn  >
> > > > wrote:
> > > >
> > > > > > That way is impossible.
> > > > >
> > > > > That is how it works in Ignite since the beginning with .NET and
> C++
> > :)
> > > > > You can use unsigned primitives as cache keys and values, as fields
> > and
> > > > > properties,
> > > > > and in SQL queries (even in WHERE x=y clauses) - it works
> > transparently
> > > > for
> > > > > the users.
> > > > > Java side knows nothing and treats those values as corresponding
> > signed
> > > > > types.
> > > > >
> > > > > However, this abstraction leaks in some cases only because there
> are
> > no
> > > > > corresponding type ids.
> > > > > That is why I'm proposing a very simple change to the protocol -
> add
> > > type
> > > > > ids, but handle them the same way as signed counterparts.
> > > > >
> > > > >
> > > > > On Tue, Nov 24, 2020 at 5:00 PM Andrey Mashenkov <
> > > > > andrey.mashen...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Pavel,

Re: delete is too slow, sometimes even causes OOM

2020-11-19 Thread Юрий
Frank,

Ticket [1] has been resolved. Try using the LAZY flag for your DML query on
the new nightly build.

[1] https://issues.apache.org/jira/browse/IGNITE-9182
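For reference, a minimal sketch of enabling the flag on a DML statement (this assumes the standard SqlFieldsQuery API; the table, condition, and `cache` variable are hypothetical, and the snippet needs a running Ignite node):

```java
// Lazy execution streams result rows instead of materializing
// the whole result set in memory at once.
SqlFieldsQuery qry = new SqlFieldsQuery("DELETE FROM Person WHERE age > ?")
    .setArgs(60)
    .setLazy(true);

cache.query(qry).getAll();
```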

пн, 9 нояб. 2020 г. в 19:28, Denis Magda :

> Frank,
>
> The ticket doesn't suggest the lazy flag as a workaround. The flag is
> supposed to be used to address the performance issue.
>
> How about a workaround on your application side while you're waiting for
> this improvement?
>
>- Query all the records for a deletion - "SELECT record_primary_key
>WHERE delete_condition"
>- Delete the records using the key-value API -
>cache.removeAll(all_primary_keys).
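That workaround can be sketched roughly as follows (hypothetical table and `cache` variable; assumes the standard Ignite SQL and key-value APIs and a running node):

```java
// Step 1: query only the primary keys of the rows to delete.
List<List<?>> rows = cache.query(
    new SqlFieldsQuery("SELECT _key FROM Person WHERE age > ?").setArgs(60)
).getAll();

// Step 2: collect the keys and remove them via the key-value API.
Set<Object> keys = new HashSet<>();
for (List<?> row : rows)
    keys.add(row.get(0));

cache.removeAll(keys);
```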
>
> -
> Denis
>
>
> On Mon, Nov 9, 2020 at 8:20 AM frank li  wrote:
>
> > I enforced  a lazy flag in DELETE code for tesing, but it is stil running
> > very slow. I mean that "Lazy" flag cannot solve the problem of running
> too
> > slow.
> >
> > On 2020/11/06 09:50:15, Юрий  wrote:
> > > Hi Frank!
> > >
> > > There is an old ticket [1] - We will try to prioritize it to finish
> > before
> > > the end of the year it should prevent OOM for most cases.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-9182
> > >
> > > вт, 3 нояб. 2020 г. в 18:53, frank li :
> > >
> > > > Current code logic for DELETE is as follows:
> > > > if WHERE clause contains a condition as "key=xxx", it uses fastUpdate
> > > > which remove the related item directly.
> > > >
> > > > else
> > > > do select for update;
> > > > for each row, call closure code "RMV" to remove it.
> > > >
> > > > 1. As "executeSelectForDml" get _KEY and _VAL columns for all
> condidate
> > > > rows, it often causes OOM when there are a lot of data  to delete.
> Why
> > do
> > > > we verify "val" during remove operation?
> > > >
> > > > 2. After selection,  why don't we just remove it with cache.remove as
> > > > fastUpdate does?
> > > >
> > > >
> > > >
> > >
> > > --
> > > Живи с улыбкой! :D
> > >
> >
>


-- 
Живи с улыбкой! :D


Re: delete is too slow, sometimes even causes OOM

2020-11-06 Thread Юрий
Hi Frank!

There is an old ticket [1]. We will try to prioritize it and finish it
before the end of the year; it should prevent OOM in most cases.

[1] https://issues.apache.org/jira/browse/IGNITE-9182

вт, 3 нояб. 2020 г. в 18:53, frank li :

> Current code logic for DELETE is as follows:
> if WHERE clause contains a condition as "key=xxx", it uses fastUpdate
> which remove the related item directly.
>
> else
> do select for update;
> for each row, call closure code "RMV" to remove it.
>
> 1. As "executeSelectForDml" get _KEY and _VAL columns for all condidate
> rows, it often causes OOM when there are a lot of data  to delete. Why do
> we verify "val" during remove operation?
>
> 2. After selection,  why don't we just remove it with cache.remove as
> fastUpdate does?
>
>
>

-- 
Живи с улыбкой! :D


Re: Proposal of new event QUERY_EXECUTION_EVENT

2020-09-14 Thread Юрий
Dmitrii, it seems you are right; we can go with a new separate event.

пн, 7 сент. 2020 г. в 23:53, Dmitrii Ryabov :

> Any objections to create a separate event, which will be fired before
> executing a query?
>
> ср, 2 сент. 2020 г. в 22:33, Dmitrii Ryabov :
> >
> > I agree with Max, we need to add a separate event for starting query
> > execution, and EVT_CACHE_QUERY_EXECUTED shouldn't be deprecated,
> > because it is another case - it is fired when cache query was
> > successfully finished.
> >
> > > Would the event notification be synchronous? In which thread?
> > As Max said, synchronicity depends on implementation. As I see, we
> > don't use a separate thread for any record calls.
> >
> > > What happens in case the event listener fails?
> > Exceptions are logged by `U.error(...)` call. Errors are thrown out.
> >
> > > Should we discuss this within this topic?
> > I suggest to separate adding a new event and configuring existing events.
> >
> > пн, 20 июл. 2020 г. в 14:37, Max Timonin :
> > >
> > > Looks like EVT_CACHE_QUERY_EXECUTED just works for different use cases:
> > > 1. it relates to a specific cache (Event for SQL queries looks
> different as
> > > it could contain multiple caches or none of them);
> > > 2. Also the EVT_CACHE_QUERY_EXECUTED event fires multiple times for
> > > distributed queries, see GridMapQueryExecutor class (For SQL query it's
> > > required to fire once independently on how many nodes are affected).
> > >
> > > So there are different patterns for events. I think
> > > EVT_CACHE_QUERY_EXECUTED should not be deprecated or changed.
> > >
> > > > What happens in case the event listener fails?
> > > > Would the event notification be synchronous?
> > > It depends on how other events are implemented. As I see for the
> > > EVT_CACHE_QUERY_EXECUTED event - it's synchronous, and listener errors
> > > aren't handled.
> > >
> > > I think these questions are related to GridEventStorageManager as it
> just
> > > provides an API for recording events. Internal implementations (sync
> > > / async, error handling) is not related to an event, is it?
> > >
> > > > I have some doubts about provide text of a query even with
> > > hidden arguments, probably it should be configured due to it could lead
> > > to security leak
> > > Currently event EVT_CACHE_QUERY_EXECUTED provides a sql clause without
> > > limitations. If we're going to provide some restrictions it will
> require
> > > additional investigation. I see at least 2 configurations here:
> > > 1. Ignite can be configured to hide clase, params only or nothing for
> all
> > > listeners;
> > > 2. Only authorized listeners can subscribe to the event.
> > >
> > > Should we discuss this within this topic?
> > >
> > > On Mon, Jul 20, 2020 at 1:55 PM Юрий 
> wrote:
> > >
> > > > In my opinion existing events EVT_CACHE_QUERY_EXECUTION_STARTED
> should be
> > > > deprecated and added two new for start and for end of queries which
> should
> > > > cover all SQL query types.
> > > > I have some doubts about provide text of a query even with hidden
> > > > arguments, probably it should be configured due to it could lead to
> > > > security leak
> > > >
> > > > пн, 20 июл. 2020 г. в 12:49, Stanislav Lukyanov <
> stanlukya...@gmail.com>:
> > > >
> > > > > Maksim,
> > > > >
> > > > > Can we change the EVT_CACHE_QUERY_EXECUTED to fire earlier? Or
> should
> > > > > there be an EVT_CACHE_QUERY_EXECUTION_STARTED for the query start,
> while
> > > > > the old event would continue to work for query finish?
> > > > > I think the new event needs to either reuse the old one, or be its
> mirror
> > > > > for the query start. It also means that we probably should resolve
> the
> > > > > issues you've listed.
> > > > >
> > > > > Would the event notification be synchronous? In which thread?
> > > > Asynchronous
> > > > > is generally preferred - would it work?
> > > > >
> > > > > What happens in case the event listener fails?
> > > > >
> > > > > Thanks,
> > > > > Stan
> > > > >
> > > > > > On 16 Jul 2020, at 18:49, Denis Magda  wrote:
> > > > > &g

Re: Proposal of new event QUERY_EXECUTION_EVENT

2020-07-20 Thread Юрий
In my opinion, the existing EVT_CACHE_QUERY_EXECUTED event should be
deprecated, and two new events should be added, one for the start and one
for the end of a query, covering all SQL query types.
I have some doubts about providing the text of a query even with hidden
arguments; it should probably be configurable, since it could lead to a
security leak.
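For context, subscribing to the existing event looks roughly like this (standard Ignite events API; the handler body is illustrative, and the event type must first be enabled via IgniteConfiguration.setIncludeEventTypes):

```java
// Listen locally for executed cache queries on this node.
ignite.events().localListen(evt -> {
    CacheQueryExecutedEvent<?, ?> qryEvt = (CacheQueryExecutedEvent<?, ?>) evt;
    System.out.println("Query executed on cache " + qryEvt.cacheName()
        + ": " + qryEvt.clause());
    return true; // keep listening
}, EventType.EVT_CACHE_QUERY_EXECUTED);
```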

пн, 20 июл. 2020 г. в 12:49, Stanislav Lukyanov :

> Maksim,
>
> Can we change the EVT_CACHE_QUERY_EXECUTED to fire earlier? Or should
> there be an EVT_CACHE_QUERY_EXECUTION_STARTED for the query start, while
> the old event would continue to work for query finish?
> I think the new event needs to either reuse the old one, or be its mirror
> for the query start. It also means that we probably should resolve the
> issues you've listed.
>
> Would the event notification be synchronous? In which thread? Asynchronous
> is generally preferred - would it work?
>
> What happens in case the event listener fails?
>
> Thanks,
> Stan
>
> > On 16 Jul 2020, at 18:49, Denis Magda  wrote:
> >
> > Taras, Yury, Ivan,
> >
> > Could you please join this thread and share your thoughts? Do we already
> > have any plans on tracking of the DDL and DML queries?
> >
> > -
> > Denis
> >
> >
> > On Wed, Jul 15, 2020 at 12:09 AM Max Timonin 
> > wrote:
> >
> >> Hi Denis, thanks for the answer!
> >>
> >> We already checked EVT_CACHE_QUERY_EXECUTED and found that it works
> only in
> >> cases:
> >> 1. Scan queries and Select queries (common pattern is access to cache
> >> data);
> >> 2. This event triggers only if query execution succeeds, in case of
> failure
> >> while execution this event won't fire.
> >>
> >> Our additional requirements are to protocol queries:
> >> 1. that aren't cache related (for example, alter user);
> >> 2. that relate to multiple caches (while EVT_CACHE_QUERY_EXECUTED have
> >> field cacheName related to specific cache);
> >> 3. we need to protocol also DDL and DML queries.
> >>
> >> Regards,
> >> Maksim
> >>
> >> On Tue, Jul 14, 2020 at 10:20 PM Denis Magda  wrote:
> >>
> >>> Hi Max,
> >>>
> >>> Could you check if the EVT_CACHE_QUERY_EXECUTED event is what you're
> >>> looking for?
> >>>
> >>>
> >>
> https://www.gridgain.com/docs/latest/developers-guide/events/events#cache-query-events
> >>>
> >>> -
> >>> Denis
> >>>
> >>>
> >>> On Fri, Jul 10, 2020 at 3:54 AM Max Timonin 
> >>> wrote:
> >>>
>  Hi Igniters!
> 
>  We're going to protocol all input SQL queries from our users.
> Currently
>  there is no such mechanism in Ignite to use for it. So we're proposing
> >> to
>  add a new event: QUERY_EXECUITION_EVENT.
> 
>  Requirements for the event:
>  1. If this event fires it means that a query is correct and will be
>  executed (and failed only in exceptional cases);
> 
>  2. Event fires for all query types;
> 
>  3. Required fields are:
>  - text of a query (with hidden arguments);
>  - arguments of query;
>  - query type;
>  - node id.
> 
>  Looks that this event should go along with `runningQryMgr::register`
> in
>  class `IgniteH2Indexing` as this method invoked for all input queries
> >>> too.
> 
>  What do you think?
> 
>  Regards,
>  Maksim
> 
> >>>
> >>
>
>

-- 
Живи с улыбкой! :D


Roadmap of new distributed SQL engine based on Calcite

2020-06-08 Thread Юрий
Dear Igniters,

Many of you have heard that development of a new SQL engine based on
Calcite has started. Currently we are only at the beginning of the road,
and it will take a lot of work to achieve the goal. To understand where we
are, which features are already done, and which of them you could pick up,
it would be better to have a consolidated resource. For that reason I've
prepared a roadmap page [1], where we can track the progress of the new
SQL engine development.

Let's start filling in the page and providing rough estimates. Dear Ignite
user community, please share your suggestions as well.

[1]
https://cwiki.apache.org/confluence/display/IGNITE/Apache+Calcite-powered+SQL+Engine+Roadmap

-- 
Живи с улыбкой! :D


Re: [CVE-2020-1963] Apache Ignite access to file system disclosure vulnerability

2020-06-08 Thread Юрий
Denis,

It was done the same day it was announced here, as described at
https://www.apache.org/security/committers.html#vulnerability-handling.
It probably takes some time for the information to be updated.


Also, I can confirm that there are no plans to provide a patch for any
previous versions of Ignite.



пт, 5 июн. 2020 г. в 19:20, Denis Magda :

> Yury,
>
> Could you please update the CVE with the details from this announcement?
>
> Nick, to my knowledge, there are no any plans to propagate this fix to the
> downstream versions such as 2.7, etc.
>
> -
> Denis
>
>
> On Wed, Jun 3, 2020 at 8:10 AM Nick Popov  wrote:
>
>> Are you going to provide CVE-2020-1964 patches and patch instructions for
>> previous Ignite versions?
>>
>>
>>
>> Regards,
>>
>> -Nick
>>
>>
>>
>> *From:* Sriveena Mattaparthi 
>> *Sent:* Wednesday, June 3, 2020 9:04 AM
>> *To:* u...@ignite.apache.org; dev ;
>> annou...@apache.org; Apache Security Team 
>> *Subject:* COMMERCIAL:RE: [CVE-2020-1963] Apache Ignite access to file
>> system disclosure vulnerability
>>
>>
>>
>> Thanks, Could you please confirm when the analysis will be updated here
>> for the CVE logged.
>>
>> https://nvd.nist.gov/vuln/detail/CVE-2020-1963
>>
>>
>>
>> Regards,
>> Sriveena
>>
>>
>>
>> *From:* Юрий 
>> *Sent:* 03 June 2020 16:02
>> *To:* dev ; u...@ignite.apache.org;
>> annou...@apache.org; Apache Security Team ;
>> Sriveena Mattaparthi 
>> *Subject:* [CVE-2020-1963] Apache Ignite access to file system
>> disclosure vulnerability
>>
>>
>>
>> Hi All,
>>
>> Apache Ignite 2.8.1 has been released. The release contain fix of
>> critical vulnerability
>>
>> CVE-2020-1963: Apache Ignite access to file system through predefined H2
>> SQL functions
>>
>> Severity: Critical
>>
>> Vendor:
>> The Apache Software Foundation
>>
>> Versions Affected:
>> All versions of Apache Ignite up to 2.8
>>
>> Impact
>> An attacker can use embedded H2 SQL functions to access a filesystem for
>> write and read.
>>
>> Description:
>> Apache Ignite uses H2 database to build SQL distributed execution engine.
>> H2 provides SQL functions which could be used by attacker to access to a
>> filesystem.
>>
>> Mitigation:
>> Ignite 2.8 or earlier users should upgrade to 2.8.1
>> In case SQL is not used at all the issue could be mitigated by removing
>> ignite-indexing.jar from Ignite classpath
>> Risk could be partially mitigated by using non privileged user to start
>> Apache Ignite.
>>
>> Credit:
>> This issue was discovered by Sriveena Mattaparthi of ekaplus.com
>> <https://apc01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fekaplus.com%2F&data=02%7C01%7CSriveena.Mattaparthi%40ekaplus.com%7Cfd4be57b204d40b49a3208d807a952ca%7C2a5b4e9716be4be4b2d40f3fcb3d373c%7C1%7C0%7C637267771122745491&sdata=eOKf4r6a1PmMvRg1HKa79HZqd%2Fp%2Fhq%2BJGlHmIZoLy%2Bo%3D&reserved=0>
>>
>>
>>
>> --
>>
>> Живи с улыбкой! :D
>>
>> “Confidentiality Notice: The contents of this email message and any
>> attachments are intended solely for the addressee(s) and may contain
>> confidential and/or privileged information and may be legally protected
>> from disclosure. If you are not the intended recipient of this message or
>> their agent, or if this message has been addressed to you in error, please
>> immediately alert the sender by reply email and then delete this message
>> and any attachments. If you are not the intended recipient, you are hereby
>> notified that any use, dissemination, copying, or storage of this message
>> or its attachments is strictly prohibited.”
>>
>>
>>
>>  CAUTION EXTERNAL EMAIL - The email originated outside the organization.  Do 
>> not click on any links or open attachments unless you recognize the sender 
>> and know the content is safe.
>>
>>
>>
>>
>>
>>
>> TDECU and our subsidiaries are committed to maintaining Member 
>> confidentiality. Please note this message is being sent using a secure 
>> connection to ensure all information remains private and confidential. The 
>> information contained in this message is intended only for the recipient. If 
>> the reader of this message is not the intended recipient, please delete 
>> immediately.
>>
>>
>>
>>

-- 
Живи с улыбкой! :D


[CVE-2020-1963] Apache Ignite access to file system disclosure vulnerability

2020-06-03 Thread Юрий
Hi All,

Apache Ignite 2.8.1 has been released. The release contains a fix for a
critical vulnerability:

CVE-2020-1963: Apache Ignite access to file system through predefined H2
SQL functions

Severity: Critical

Vendor:
The Apache Software Foundation

Versions Affected:
All versions of Apache Ignite up to 2.8

Impact:
An attacker can use embedded H2 SQL functions to access a filesystem for
reading and writing.

Description:
Apache Ignite uses the H2 database to build its distributed SQL execution
engine. H2 provides SQL functions which could be used by an attacker to
access a filesystem.

Mitigation:
Ignite 2.8 or earlier users should upgrade to 2.8.1.
In case SQL is not used at all, the issue can be mitigated by removing
ignite-indexing.jar from the Ignite classpath.
The risk can be partially mitigated by running Apache Ignite as a
non-privileged user.

Credit:
This issue was discovered by Sriveena Mattaparthi of ekaplus.com

-- 
Живи с улыбкой! :D


Re: Apache Ignite 2.8.1 RELEASE [Time, Scope, Manager]

2020-05-14 Thread Юрий
Nikolay,

Release 2.8.1 is delayed, and the announced dates [1] on the release page
are no longer accurate. Could you update the page to reflect the current
expected release date?

[1]. https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8.1

чт, 7 мая 2020 г. в 17:23, Nikolay Izhikov :

> Done.
>
>
> > 7 мая 2020 г., в 13:20, Denis Garus  написал(а):
> >
> > Nikolay,
> > could we add the simple improvement [1] to 2.8.1 scope?
> >
> > 1. https://issues.apache.org/jira/browse/IGNITE-12983
> >
> > чт, 30 апр. 2020 г. в 11:26, Alex Plehanov :
> >
> >> Nikolay,
> >>
> >> TC results: [1], [2]
> >> I've cherry-picked IGNITE-12933 and IGNITE-12855 to 2.8.1
> >>
> >> [1]:
> >>
> >>
> https://mtcga.gridgain.com/pr.html?serverId=apache&suiteId=IgniteTests24Java8_RunAll&branchForTc=pull%2F7754%2Fhead&action=Latest&baseBranchForTc=ignite-2.8.1
> >> [2]:
> >>
> >>
> https://mtcga.gridgain.com/pr.html?serverId=apache&suiteId=IgniteTests24Java8_RunAll&branchForTc=pull%2F7755%2Fhead&action=Latest&baseBranchForTc=ignite-2.8.1
> >>
> >> вт, 28 апр. 2020 г. в 15:19, Nikolay Izhikov :
> >>
> >>> Hello, Alex.
> >>>
> >>> +1 from me.
> >>>
>  28 апр. 2020 г., в 15:03, Alex Plehanov 
> >>> написал(а):
> 
>  Hello guys,
> 
>  While we are still waiting for some tickets to resolve I propose to
>  cherry-pick to 2.8.1 two more bugfixes:
>  IGNITE-12933 Fixed node failure after put incorrect key class for
> cache
>  with indexed types
>  IGNITE-12855 Fixed node failure with concurrent get operation and
> entry
>  expiration when persistent is enabled
>  Both fixes prevent node failure in some circumstances, both fixed
> >> already
>  merged to master.
> 
>  WDYT?
> 
>  пн, 27 апр. 2020 г. в 11:53, Nikolay Izhikov :
> 
> > Taras.
> >
> > Thank you, very much!
> > You changes merged to 2.8.1 branch.
> >
> > Igniters,
> >
> > We have 10 tickets scheduled for 2.8.1 release:
> >
> > OPEN:
> >
> > IGNITE-11687Concurrent WAL replay & log may fail with CRC error
> on
> > read - Dmitriy Govorukhin
> > IGNITE-12346.NET: Platform error:System.NullReferenceException -
> > Pavel Tupitsyn
> >
> > IN PROGRESS:
> >
> > IGNITE-12637IgniteSparkSession doesn't start the clients on
> really
> > distributed cluster - Yaroslav Molochkov
> > IGNITE-12788Cluster achieved fully rebalanced (PME-free ready)
> >> state
> > metric - Nikolay Izhikov
> >
> > PATCH AVAILABLE:
> >
> > IGNITE-10417notifyDiscoveryListener() call can be lost. - Pavel
> > Voronkin
> > IGNITE-12852Comma in field is not supported by COPY command -
> >> YuJue
> >>> Li
> > IGNITE-12252Unchecked exceptions during rebalancing should be
> >>> handled
> > - Nikolai Kulagin
> > IGNITE-12905QueryKeyValueIterable missing custom spliterator()
> > implementation - Johnny Galatikitis
> > IGNITE-12801Possible extra page release when throttling and
> >>> checkpoint
> > thread store its concurrently - Anton Kalashnikov
> > IGNITE-12794Scan query fails with an assertion error: Unexpected
> >> row
> > key - Denis Mekhanikov
> >
> >
> >> 27 апр. 2020 г., в 11:08, Taras Ledkov 
> > написал(а):
> >>
> >> Hi,
> >>
> >> Nikolay, i've created PR [1] that contains the SQL-related tickets
> to
> > port into 2.8.1:
> >>
> >> IGNITE-12790 Introduce distributed SQL configuration and ability to
> > disable SQL functions.
> >> IGNITE-12887 Fix handle type mismatch exception on compare values
> >> while
> > traversing index tree.
> >> IGNITE-12848 fix H2Connection leaks on INSERT
> >> IGNITE-12800  SQL: local queries cursors must be closed or full read
> >> to
> > unlock the GridH2Table.
> >>
> >> TC test are OK. Please take a look at the TC bot report [2].
> >> Please review & merge the patch into ignite-2.8.1.
> >>
> >> [1]. https://github.com/apache/ignite/pull/7703
> >> [2].
> >
> >>>
> >>
> https://mtcga.gridgain.com/pr.html?serverId=apache&suiteId=IgniteTests24Java8_RunAll&branchForTc=pull%2F7703%2Fhead&action=Latest&baseBranchForTc=ignite-2.8.1
> >>
> >>
> >> On 23.04.2020 13:25, Mikhail Petrov wrote:
> >>> Hello, Igniters.
> >>>
> >>> I propose to cherry-pick to 2.8.1 ticket [1].
> >>>
> >>>
> >>> In addition to adding a new metric, it fixes a bug when, after
> > deactivation, GridDhtPartitionsExchangeFuture#rebalanced flag was not
> > reset. And therefore, it can be different on nodes that are already
> in
> >>> the
> > cluster from newly joined ones.
> >>>
> >>> [1] - https://issues.apache.org/jira/browse/IGNITE-12788
> >>>
> >>> Regards,
> >>> Mikhail.
> >>>
> >>> On 22.04.2020 14:03, Nikolay Izhikov wrote:
>  Hello, Ivan.
> 
>  I think we can include this improvements.
> >>

Re: [DISCUSSION] Major changes in Ignite in 2020

2020-04-10 Thread Юрий
Hi Igniters!

Major changes that are going to be contributed from our side for Ignite SQL:

   - Local runtime statistics, which help estimate the query execution plan.
   They should produce the right join order in most cases. It seems this
   could be done by Q3.
   - Most efforts will be directed to the new SQL engine based on Calcite. I
   hope the new engine will be able to execute queries of arbitrary
   complexity by the end of the year; however, many performance
   optimizations will still be missing.


пт, 10 апр. 2020 г. в 14:52, Ivan Rakov :

> Hi everyone!
>
> Major changes that are going to be contributed from our side:
> - https://issues.apache.org/jira/browse/IGNITE-11704 - keeping tombstones
> for removed entries to make rebalance consistent (this problem is solved by
> on-heap deferred deletes queue so far).
> - https://issues.apache.org/jira/browse/IGNITE-11147  - don't cancel
> ongoing rebalance if affinity assignment for the rebalancing group wasn't
> changed during the PME.
> - Batch of other updates related to the historical rebalance. Goal is to
> make historical rebalance stable and to ensure that if WAL history is
> configured properly the cluster will be able to recover data consistency
> via historical rebalance in case of any topology changes (including cycling
> restart).
> - Overhaul of partition loss handling. It has several flaws so far; the
> most critical one is that by default (with PartitionLossPolicy.IGNORE)
> Ignite may silently lose data. Also, (PartitionLossPolicy.IGNORE) is
> totally inapplicable to scenarios when persistence is enabled and BLT is
> established. Also, even safe policies have bugs: LOST state is reset when
> node rejoins the cluster, so data actually can be lost even with safe
> policy. We are going to set safe policy as default and fix related bugs.
> - Distributed tracing (via OpenCensus). Discovery, communication and
> transactions will be covered.
>
> On Fri, Apr 10, 2020 at 11:43 AM Anton Kalashnikov 
> wrote:
>
>> My top priorities:
>> * Cache warm-up - loading data from disk to memory before the join to
>> cluster -
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-40+Cache+warm-up
>> * PDS Defragmentation - possibility to free up space on disc after
>> removing entries
>>
>>
>> --
>> Best regards,
>> Anton Kalashnikov
>>
>>
>>
>> 20.03.2020, 10:19, "Pavel Tupitsyn" :
>>
>> My top priorities:
>>
>>- Thin Client API extension: Compute, Continuous Queries, Services
>>- .NET Near Cache: soon to come in Thick API, to be investigated for
>>Thin Clients
>>- .NET Modernization for Ignite 3.0: drop legacy .NET Framework
>>support, target .NET Standard 2.0, add nullable annotations to the API
>>
>>
>> On Fri, Mar 20, 2020 at 5:23 AM Saikat Maitra 
>> wrote:
>>
>> Hi Denis,
>>
>> Thank you for sharing the list of top changes. The list looks good.
>>
>> I wanted to share that efforts regarding IEP-36 is already underway and
>> there are also open PRs under review and working through review feedback.
>> One of the area that we are focussing is first we will merge changes in
>> ignite-extensions repo before removing the specific migrated module from
>> ignite repo.
>>
>> There are also contribution from community on bug fixes in
>> ignite-extensions repo as well which we are verifying and merging in
>> ignite-extensions repo after running through CI pipeline in teamcity.
>>
>> I like the focus area on docs and I really like the Apache Ignite
>> Usecases page https://ignite.apache.org/provenusecases.html,  I would
>> like to suggest if we can add a page like powered by Apache Ignite and list
>> few Org who are already using Apache Ignite in prod.
>>
>> Something similar to this page https://flink.apache.org/poweredby.html
>>
>> Regards,
>> Saikat
>>
>>
>>
>>
>>
>>
>> On Thu, Mar 19, 2020 at 1:44 PM Denis Magda  wrote:
>>
>> My top list of changes is as follows:
>>
>>- Feature: New lightweight Apache Ignite website with advanced search
>>engine optimizations and updated technical content. Why? Much better
>>discoverability of Ignite via search engines like Google to let many more
>>application developers learn about Ignite existence. This change is to be
>>brought to live soon:
>>
>> http://apache-ignite-developers.2346864.n4.nabble.com/Ignite-Website-New-Look-td46324.html
>>
>>
>>- Feature: New Ignite documentation on a new platform and with a new
>>structure. Why? Ignite documentation has to help new application 
>> developers
>>to get up and running as quickly as possible, it also has to become a
>>primary source that answers most of the questions. Our current docs have a
>>lot of gaps: https://issues.apache.org/jira/browse/IGNITE-7595
>>
>>
>>- Process Change: to be successful with the point above,
>>documentation should be created/updated before we close a JIRA ticket for
>>code/API/feature contribution. Why? First, application developers learn
>>Ignite and create their Ignite-apps referring to API refe

Re: [VOTE] Allow or prohibit a joint use of @deprecated and @IgniteExperimental

2020-02-10 Thread Юрий
-1 Prohibit

It looks inconsistent to me to deprecate one API without presenting a new
stable API as a replacement.

пн, 10 февр. 2020 г. в 11:02, Alexey Goncharuk :

> Dear Apache Ignite community,
>
> We would like to conduct a formal vote on the subject of whether to allow
> or prohibit a joint existence of @deprecated annotation for an old API
> and @IgniteExperimental [1] for a new (replacement) API. The result of this
> vote will be formalized as an Apache Ignite development rule to be used in
> future.
>
> The discussion thread where you can address all non-vote messages is [2].
>
> The votes are:
> *[+1 Allow]* Allow to deprecate the old APIs even when new APIs are marked
> with @IgniteExperimental to explicitly notify users that an old APIs will
> be removed in the next major release AND new APIs are available.
> *[-1 Prohibit]* Never deprecate the old APIs unless the new APIs are stable
> and released without @IgniteExperimental. The old APIs javadoc may be
> updated with a reference to new APIs to encourage users to evaluate new
> APIs. The deprecation and new API release may happen simultaneously if the
> new API is not marked with @IgniteExperimental or the annotation is removed
> in the same release.
>
> Neither of the choices prohibits deprecation of an API without a
> replacement if community decides so.
>
> The vote will hold for 72 hours and will end on February 13th 2020 08:00
> UTC:
>
> https://www.timeanddate.com/countdown/to?year=2020&month=2&day=13&hour=8&min=0&sec=0&p0=utc-1
>
> All votes count, there is no binding/non-binding status for this.
>
> [1]
>
> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/lang/IgniteExperimental.java
> [2]
>
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSS-Public-API-deprecation-rules-td45647.html
>
> Thanks,
> --AG
>


-- 
Live with a smile! :D


Re: Aggregation functions

2019-08-30 Thread Юрий
Hi Pavel,

1. Yes, the GROUP_CONCAT function works for both the collocated and the
non-collocated case. There are the following tests:

org.apache.ignite.internal.processors.query.IgniteSqlGroupConcatNotCollocatedTest

org.apache.ignite.internal.processors.query.IgniteSqlGroupConcatCollocatedTest

2. It seems so, yes - they should be equivalent.
3. I think yes, it could be implemented as an aggregate function. Maybe one
of the Igniters wants to implement it?
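For illustration, the aggregation from point 3 can be modeled in plain
Python (an illustrative sketch only, not Ignite code; the table data comes
from Pavel's example):

```python
# Sketch: per-company min/max birthday plus the names of the oldest and
# youngest person - what first_value/last_value over an ordered window
# (or the GROUP_CONCAT parsing workaround) would return.
from itertools import groupby
from operator import itemgetter

# rows: (id, company_id, name, birthday) - Pavel's sample data
persons = [
    (1, 1, "John", "2000-01-01"),
    (2, 1, "Mike", "2010-01-01"),
    (3, 1, "Nick", "2015-01-01"),
]

def min_max_names(rows):
    """For each company, return (oldest_name, min_date, youngest_name, max_date)."""
    result = {}
    # sort by (company_id, birthday) so each group is already ordered
    for company, group in groupby(sorted(rows, key=itemgetter(1, 3)),
                                  key=itemgetter(1)):
        ordered = list(group)
        oldest, youngest = ordered[0], ordered[-1]
        result[company] = (oldest[2], oldest[3], youngest[2], youngest[3])
    return result

print(min_max_names(persons))
# {1: ('John', '2000-01-01', 'Nick', '2015-01-01')}
```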


Tue, Aug 27, 2019 at 15:30, Pavel Vinokurov :

> Hi Igniters!
>
> I often meet the following use case.
> Id
> Company_id
> Name
> Birthday(dd.mm.)
> 1 1 John 01.01.2000
> 2 1 Mike 01.01.2010
> 3 1 Nick 01.01.2015
>
> Having table Person, it requires to select min and max birth-dates and name
> of the youngest and oldest person for each company.
>
> The current possible solution is  write the query using join between the
> same table. Such query has poor performance and looks quite clumsy. Also it
> requires to handle same birth dates:
> *Ignite Query(simplified)*
> SELECT
>   MIN_MAX.company_id,
>   p1.name as oldest_name,
>   MIN_MAX.min_date,
>   p2.name as youngest,
>   MIN_MAX.max_date
> FROM
> (SELECT
>  company_id,
>  min(birthday) as min_date,
>  max(birthday) as max_date
> FROM Person GROUP BY company_id) MIN_MAX
> INNER JOIN Person p1 on p1.birthday=MIN_MAX.MIN_DATE and
> p1.company_id=MIN_MAX.company_id
> INNER JOIN Person p2 on p2.birthday=MIN_MAX.MAX_DATE and
> p2.company_id=MIN_MAX.company_id
>
> Given performance of this query, it's make sense to re-implement this
> usecase using pure java code.
>
> But in H2 it's possible to execute the following query:
> SELECT
>  company_id,
>  first_value(name) over( ORDER BY birthday) as oldest_name,
>  min(birthday)
>  last_value(name) over( ORDER BY birthday) as youngest_name,
>  max(birthday)
> FROM Person GROUP BY company_id
>
> Ignite doesn't provide any window or inside grouping functions excepting
> GROUP_CONCAT, so we could make the similar query.
> SELECT
>  company_id,
>  PARSE_STRING_AND_GET_FIRST_STRING(GROUP_CONCAT( name order by birthday
> SEPARATOR ',')) as oldest_name
>  min(birthday)
>  PARSE_STRING_AND_GET_LAST_STRING(GROUP_CONCAT( name order by birthday
> SEPARATOR ',')) as youngest_name
>  max(birthday)
> FROM Person GROUP BY company_id
>
> These last 2 queries are much faster(10-100x) than the first one.
>
> Thus I want to clarify a few questions:
>
>1. Does GROUP_CONCAT[2] function really work and make aggregation
>inside  group( in collocated case)?
>2. Are queries 2 and 3 equivalent?
>3. Is there any options to implement first_value[1], last_value without
>custom partitioning. IMHO first_value is the simplified version of
>GROUP_CONCAT. Am I right?
>
>
> [1] http://www.h2database.com/html/functions.html#first_value
> <
> http://ggsystems.atlassian.net/wiki/pages/createpage.action?spaceKey=GG&title=1&linkCreation=true&fromPageId=1296597032
> >
>
> [2] https://apacheignite-sql.readme.io/docs/group_concat
>
>
> Thanks,
>
> Pavel
>
>
>
> --
>
> Regards
>
> Pavel Vinokurov
>


-- 
Live with a smile! :D


Re: Is Ignite planning to support COMMENT ON statements?

2019-07-10 Thread Юрий
Hi Liyuj,

As of now we don't support COMMENT ON statements and have no plans or
tickets to add them. But I agree that they would be a useful facility for
users.

Feel free to raise a ticket.

Wed, Jul 10, 2019 at 00:48, Dmitriy Pavlov :

> Hi,
>
> I'm not aware of such plans, for now.
>
> Maybe SQL experts (CCed) can provide some more details.
>
> Sincerely,
> Dmitriy Pavlov
>
> вс, 7 июл. 2019 г. в 15:39, liyuj <18624049...@163.com>:
>
> > Hi,
> >
> > Is Ignite planning to support COMMENT ON statements?
> >
> > H2 supports this command.
> > If Ignite can support this command, it will be helpful to increase the
> > maintainability of tables.
> >
> >
> >
>


-- 
Live with a smile! :D


Re: [discussion] using custom build of H2 for Ignite

2019-07-10 Thread Юрий
Agree with Ivan.

We can start working with the H2 fork owned by GG and decide to change it
later if it brings issues to the Ignite community. Currently I don't see
any issues here.

I'm worried about the process to synchronize changes between the H2 fork
and Ignite. A possible solution could be as follows:
Ignite depends only on released versions of the H2 fork.
Then we can modify the fork at any time without compatibility issues.
When a new feature is ready to be used by Ignite, we modify the Ignite code
and bump the dependency version of the H2 fork.
However, in that case the H2 fork should be released more often than Ignite.

WDYT?
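The release-based workflow described here could look like this on the
Ignite side (the Maven coordinates below are purely hypothetical, chosen
only to illustrate pinning a released fork version):

```xml
<!-- Illustrative only: hypothetical coordinates. The idea is that Ignite
     depends on a fixed, released version of the H2 fork and bumps the
     version explicitly when a new fork feature is needed. -->
<dependency>
    <groupId>org.example.h2fork</groupId>
    <artifactId>h2-fork</artifactId>
    <version>1.4.197-fork-1</version>
</dependency>
```

This keeps the fork free to change at any time, since Ignite only ever sees
tagged releases.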

Wed, Jul 10, 2019 at 13:08, Павлухин Иван :

> Nikolay,
>
> Could you please elaborate why is it "closed source"?
>
> > What the difference for the Ignite community?
> The difference is similar to using version X and version Y of the same
> library. Version Y might be better.
>
> > I think all Ignite commiters should have write priveledges to H2 fork.
> I agree, it is quite natural. Actually, my only point is that we can
> do it at any point later, cannot we?
>
> ср, 10 июл. 2019 г. в 12:25, Nikolay Izhikov :
> >
> > Ivan
> >
> > We have closed source code dependency for now owned by H2 owners.
> > With new fork we will have the same closed dependency owned by Grid Gain.
> >
> > What the difference for the Ignite community?
> >
> > > 2. Anyways some process must be established for merging changes
> > > requiring changes in h2 library. So, I suppose it should be review of
> > > changes in 2 repositories.
> >
> > The question is - *Who can apply those changes*.
> >
> > I think all Ignite commiters should have write priveledges to H2 fork.
> >
> > В Ср, 10/07/2019 в 11:30 +0300, Павлухин Иван пишет:
> > > Folks,
> > >
> > > I would like to highlight a couple of points.
> > > 1. Perhaps it is not so crucial where is this fork located if the code
> > > is publicly available and can be cloned to another repository easily.
> > > We can relocate code and use it at any point in future.
> > > 2. Anyways some process must be established for merging changes
> > > requiring changes in h2 library. So, I suppose it should be review of
> > > changes in 2 repositories.
> > >
> > > Now (and beforehand) we use original h2. And how many of us were ever
> > > interested what changes were made in h2? So, perhaps for the first
> > > time we can start with GG fork? And if later on some problems with
> > > that appear we can clone it and use that new fork without much
> > > trouble, can't we?
> > >
> > > ср, 10 июл. 2019 г. в 09:52, Nikolay Izhikov :
> > > >
> > > > Hello, Denis.
> > > >
> > > > > Nickolay, as for that fork which is in GG codebase - GridGain is a
> major
> > > > > contributor and maintainer but the others are welcomed to send
> > > > > pull-requests.
> > > >
> > > > Can we make this fork maintained by Ignite Community?
> > > >
> > > > With all respect to Grid Gain as an author of Apache Ignite I don't
> like when some huge dependencies
> > > > (incompatible with community-driven analogue) belongs to the
> enterprise.
> > > >
> > > > This leads us to the situation when Grid Gain will decide which
> features will be added to the SQL engine and which not.
> > > >
> > > > В Пн, 08/07/2019 в 13:51 -0700, Denis Magda пишет:
> > > > > Dmitry,
> > > > >
> > > > > To make this fully-vendor neutral even at the originating
> repository level,
> > > > > we can create and work with the H2 fork as a separate Github repo
> (separate
> > > > > project governed and maintained by Ignite community). That repo
> can't be
> > > > > part of Ignite due to license mismatch. Thus, during release
> times, we need
> > > > > to assemble a binary (maven artifact) from that fork.
> > > > >
> > > > > However, it's not clear to me how to use those sources during the
> dev time?
> > > > > It sounds like Ignite can use only the binary (Maven) artifact
> that has to
> > > > > be updated/regenerated if there are any changes. *SQL experts*,
> could you
> > > > > please step in?
> > > > >
> > > > > Nickolay, as for that fork which is in GG codebase - GridGain is a
> major
> > > > > contributor and maintainer but the others are welcomed to send
> > > > > pull-requests.
> > > > >
> > > > > -
> > > > > Denis
> > > > >
> > > > >
> > > > > On Thu, Jul 4, 2019 at 9:26 AM Dmitriy Pavlov 
> wrote:
> > > > >
> > > > > > Hi Denis,
> > > > > >
> > > > > > As you know, some time ago I've started a discussion about
> removing
> > > > > > dependence from gridgain:shmem. Ignite community seems to be not
> so much
> > > > > > interested in this removal, for now. So once added it could stay
> here
> > > > > > forever. Reverse dependency direction seems to be more natural.
> It is like
> > > > > > the open-core model.
> > > > > >
> > > > > > I feel more comfortable if all Ignite dependencies are released
> as part of
> > > > > > the Ignite code base, or some open governed project with a
> license from
> > > > > > Category A https://www.apache.or

Re: proposed realization KILL QUERY command

2019-01-30 Thread Юрий
Hi Igniters,

Let's return to the KILL QUERY command. Previously we mostly discussed two
variants of the format:
1) simple - KILL QUERY {running_query_id}
2) advanced syntax - KILL QUERY WHERE {parameters}. The parameters could be
any columns from the running queries view, or just some of them.

I've checked the approaches used by industrial RDBMS vendors:

   - *ORACLE*: ALTER SYSTEM CANCEL SQL 'SID, SERIAL, SQL_ID'
   - *Postgres*: SELECT pg_cancel_backend() and
   SELECT pg_terminate_backend()
   - *MySQL*: KILL QUERY 

As we can see, all of them use a simple syntax to cancel a query and cannot
apply filters.

IMHO the simple *KILL QUERY qry_id* is better for a few reasons.
A user kills a single query started on a single node, and it will be
exactly the query passed as the parameter - predictable results.
The advanced syntax could require sending the kill request to all nodes in
the cluster and, depending on the passed parameters, a user could kill
unpredictable queries.
Other vendors use the simple syntax.

How it could be used:

1) SELECT * from sql_running_queries

result is

query_id                              | sql        | schema_name | duration | ...
--------------------------------------+------------+-------------+----------+----
8a55df83-2f41-4f81-8e11-ab0936d0_6742 | SELECT ... | ...         | ...      | ...
8a55df83-2f41-4f81-8e11-ab0936d0_1234 | UPDATE ... | ...         | ...      | ...

2) KILL QUERY 8a55df83-2f41-4f81-8e11-ab0936d0_6742



Do you have another opinion? Let's decide which variant is preferable.
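A small sketch of how a client-side tool might split such an id (assuming
the `<node id>_<local counter>` format shown above; the helper itself is
hypothetical, not an Ignite API):

```python
# Hypothetical sketch: the global query id proposed in this thread is
# "<originating node id>_<local query counter>". A client-side helper can
# split it to find which node a KILL QUERY request should be routed to.
def parse_query_id(query_id: str):
    """Split a global query id into (node_id, local_counter)."""
    node_id, _, counter = query_id.rpartition("_")
    if not node_id or not counter.isdigit():
        raise ValueError(f"malformed query id: {query_id!r}")
    return node_id, int(counter)

node, counter = parse_query_id("8a55df83-2f41-4f81-8e11-ab0936d0_6742")
print(node)     # 8a55df83-2f41-4f81-8e11-ab0936d0
print(counter)  # 6742
```

Using `rpartition` keeps the parsing robust even if the node id itself
contained an underscore.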


Wed, Jan 16, 2019 at 18:02, Denis Magda :

> Yury,
>
> I do support the latter concatenation approach. It's simple and correlates
> with what other DBs do. Plus, it can be passed to KILL command without
> complications. Thanks for thinking this through!
>
> As for the killing of all queries on a particular node, not sure that's a
> relevant use case. I would put this off. Usually, you want to stop a
> specific query (it's slow or resources consuming) and have to know its id,
> the query runs across multiple nodes and a single KILL command with the id
> can halt it everywhere. If someone decided to shut all queries on the node,
> then it sounds like the node is experiencing big troubles and it might be
> better just to shut it down completely.
>
> -
> Denis
>
>
> On Tue, Jan 15, 2019 at 8:00 AM Юрий  wrote:
>
>> Denis and other Igniters, do you have any comments for proposed approach?
>> Which of these ones will be better to use for us - simple numeric  or hex
>> values (shorter id, but with letters)?
>>
>> As for me hex values preferable due to it  shorter and looks more unique
>> across a logs
>>
>>
>>
>> вт, 15 янв. 2019 г. в 18:35, Vladimir Ozerov :
>>
>>> Hi,
>>>
>>> Concatenation through a letter looks like a good approach to me. As far
>>> as
>>> killing all queries on a specific node, I would put it aside for now -
>>> this
>>> looks like a separate command with possibly different parameters.
>>>
>>> On Tue, Jan 15, 2019 at 1:30 PM Юрий 
>>> wrote:
>>>
>>> > Thanks Vladimir for your thoughts.
>>> >
>>> > Based on it most convenient ways are first and third.
>>> > But with some modifications:
>>> > For first variant delimiter should be a letter, e.g. 123X15494, then it
>>> > could be simple copy by user.
>>> > For 3rd variant can be used convert both numeric to HEX and use a
>>> letter
>>> > delimiter not included to HEX symbols (ABCDEF), in this case query id
>>> will
>>> > be shorter and also can be simple copy by user. e.g. 7BX3C86 ( it the
>>> same
>>> > value as used for first variant), instead of convert all value as
>>> string to
>>> > base16 due to it will be really long value.
>>> >
>>> > Possible realization for the cases:
>>> > 1) Concatenation node order id with query id with a letter delimiter.
>>> >
>>> > query_id = 1234X8753 , where *1234* - node order, *8753* - local node
>>> query
>>> > counter. *X* - delimeter
>>> >
>>> > 2) Converting both node order id and query id to HEX.
>>> >
>>> > query_id =  7BX3C86,  value is concat(hex(node),"X",hex(queryID))
>>> >
>>> > For both variants we can use either simple or copmlex KILL QUERY
>>> syntax.
>>> > Simple:
>>> >
>>> > KILL QUERY 7BX3C86 - for kill concrete query
>>> > KILL QUERY 7B - for killing all queries on a node.  May be need extra
>>> > symbols for such queries to avoid fault of user a

Re: SQL View with list of existing indexes

2019-01-24 Thread Юрий
Hi Vladimir,

Thanks for your comments.

1) Agree.
2) OK.
3) We create a number of index copies depending on query parallelism. But
it seems you are right - it should be exposed at the TABLES level.
4) The approximate inline size shouldn't be used here, because its value
depends on the node and is not a single value.
5) Do we have plans for a view with table columns? If yes, maybe it would
be better to have just an array with the column order from the columns
view. For example, suppose you want to know which columns are already
indexed - with a plain comma-separated form that can't be achieved.





Thu, Jan 24, 2019 at 18:09, Vladimir Ozerov :

> Hi Yuriy,
>
> Please note that MySQL link is about SHOW command, which is a different
> beast. In general I think that PG approach is better as it allows user to
> get quick overview of index content without complex JOINs. I would start
> with plain single view and add columns view later if we found it useful. As
> far as view columns:
> 1) I would add both cache ID/name and cache group ID/name
> 2) Number of columns does not look as a useful info to me
> 3) Query parallelism is related to cache, not index, so it should be in
> IGNITE.TABLES view instead
> 4) Inline size is definitely useful metric. Not sure about approximate
> inline size
> 5) I would add list of columns in plain comma-separated form with ASC/DESC
> modifiers
>
> Thoughts?
>
> Vladimir.
>
> On Thu, Jan 24, 2019 at 3:52 PM Юрий  wrote:
>
> > Hi Igniters,
> >
> > As part of IEP-29: SQL management and monitoring
> > <
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > >
> > I'm going to implement SQL view with list of existing indexes.
> > I've investigate how it expose by ORACLE, MySQL and Postgres.
> > ORACLE -
> >
> >
> https://docs.oracle.com/en/database/oracle/oracle-database/18/refrn/ALL_INDEXES.html#GUID-E39825BA-70AC-45D8-AF30-C7FF561373B6
> >
> > MySQL - https://dev.mysql.com/doc/refman/8.0/en/show-index.html
> > Postgres - https://www.postgresql.org/docs/11/view-pg-indexes.html ,
> > https://www.postgresql.org/docs/11/catalog-pg-index.html
> >
> > All vendors have such views which show at least following information:
> > schema name   - Name of schema related to table and index.
> > table name- Name of table related to an index.
> > index name   - Name of index.
> > list of columns   - All columns and their order included into an
> > index.
> > collation - ASC or DESC sort for each columns.
> >
> > + many specific information which different form vendor to vendor.
> >
> > In our case such specific information could be at least:
> >
> >1. Owning cache ID   - not sure, but may
> be
> >useful to join with other our views.
> >2. number of columns at the index- just to know how many
> result
> >should be in columns view
> >3. query parallelism   - It's
> configuration
> >parameter show how many thread can be used to execute query.
> >4. inline size   - inline size
> >used for this index.
> >5. is affinity - boolean
> >parameter show that affinity key index
> >6. is pk- boolean
> >parameter show that PK index
> >7. approx recommended inline size- dynamically calculated
> >recommended inline size for this index to show required size to keep
> > whole
> >indexed columns as inlined.
> >
> >
> >
> > All vendors have different ways  to present information about index
> > columns:
> > PG - use array of index table columns and second array for collation each
> > of columns.
> > MySQL - each row in index view contains information about one of indexed
> > columsn with ther position at the index. So for one index there are many
> > columns.
> > ORACLE,  - use separate view where each of row present column included
> into
> > index with all required information and can be joined by schema, table
> and
> > index names.
> > ORACLE indexed columns view -
> >
> >
> https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_1064.htm#i1577532
> > MySql -
> >
> > I propose use ORACLE way and have second view to represent column
> included
> > into indexes.
> >
> > In this case such view can have the following information:
> > schema 

Re: SQL View with list of existing indexes

2019-01-24 Thread Юрий
One additional thought which I figured out just now.

It seems the approximate recommended inline size is not a good choice for
the index view, because this parameter has a different value on each node.
Even more, for a non-affinity node it will always be zero. So it should be
excluded from my initial proposal.

Thu, Jan 24, 2019 at 15:51, Юрий :

> Hi Igniters,
>
> As part of IEP-29: SQL management and monitoring
> <https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring>
> I'm going to implement SQL view with list of existing indexes.
> I've investigate how it expose by ORACLE, MySQL and Postgres.
> ORACLE -
> https://docs.oracle.com/en/database/oracle/oracle-database/18/refrn/ALL_INDEXES.html#GUID-E39825BA-70AC-45D8-AF30-C7FF561373B6
>
> MySQL - https://dev.mysql.com/doc/refman/8.0/en/show-index.html
> Postgres - https://www.postgresql.org/docs/11/view-pg-indexes.html ,
> https://www.postgresql.org/docs/11/catalog-pg-index.html
>
> All vendors have such views which show at least following information:
> schema name   - Name of schema related to table and index.
> table name- Name of table related to an index.
> index name   - Name of index.
> list of columns   - All columns and their order included into an
> index.
> collation - ASC or DESC sort for each columns.
>
> + many specific information which different form vendor to vendor.
>
> In our case such specific information could be at least:
>
>1. Owning cache ID   - not sure, but may
>be useful to join with other our views.
>2. number of columns at the index- just to know how many
>result should be in columns view
>3. query parallelism   - It's
>configuration parameter show how many thread can be used to execute query.
>4. inline size   - inline size
>used for this index.
>5. is affinity - boolean
>parameter show that affinity key index
>6. is pk- boolean
>parameter show that PK index
>7. approx recommended inline size- dynamically calculated
>recommended inline size for this index to show required size to keep whole
>indexed columns as inlined.
>
>
>
> All vendors have different ways  to present information about index
> columns:
> PG - use array of index table columns and second array for collation each
> of columns.
> MySQL - each row in index view contains information about one of indexed
> columsn with ther position at the index. So for one index there are many
> columns.
> ORACLE,  - use separate view where each of row present column included
> into index with all required information and can be joined by schema, table
> and index names.
> ORACLE indexed columns view -
> https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_1064.htm#i1577532
> MySql -
>
> I propose use ORACLE way and have second view to represent column included
> into indexes.
>
> In this case such view can have the following information:
> schema name   - Name of schema related to table and index.
> table name- Name of table related to an index.
> index name   - Name of index.
> column name- Name of column included into index.
> column type  - Type of the column.
> column position - Position of column within the index.
> collation- Either the column is sorted descending or
> ascending
>
> And can be joined with index view through schema, table and index names.
>
>
>
> What do you think about such approach and list of columns which could be
> included into the views?
>
> --
> Живи с улыбкой! :D
>


-- 
Live with a smile! :D


SQL View with list of existing indexes

2019-01-24 Thread Юрий
Hi Igniters,

As part of IEP-29: SQL management and monitoring

I'm going to implement a SQL view with the list of existing indexes.
I've investigated how it is exposed by ORACLE, MySQL and Postgres.
ORACLE -
https://docs.oracle.com/en/database/oracle/oracle-database/18/refrn/ALL_INDEXES.html#GUID-E39825BA-70AC-45D8-AF30-C7FF561373B6

MySQL - https://dev.mysql.com/doc/refman/8.0/en/show-index.html
Postgres - https://www.postgresql.org/docs/11/view-pg-indexes.html ,
https://www.postgresql.org/docs/11/catalog-pg-index.html

All vendors have such views, which show at least the following information:
schema name     - Name of the schema related to the table and index.
table name      - Name of the table related to an index.
index name      - Name of the index.
list of columns - All columns and their order included in an index.
collation       - ASC or DESC sort for each column.

+ a lot of vendor-specific information which differs from vendor to vendor.

In our case such specific information could be at least:

   1. Owning cache ID - not sure, but may be useful for joining with our
   other views.
   2. Number of columns in the index - just to know how many results to
   expect in the columns view.
   3. Query parallelism - a configuration parameter showing how many
   threads can be used to execute a query.
   4. Inline size - the inline size used for this index.
   5. Is affinity - a boolean parameter marking the affinity key index.
   6. Is PK - a boolean parameter marking the PK index.
   7. Approx. recommended inline size - a dynamically calculated
   recommended inline size for this index, showing the size required to
   keep all indexed columns inlined.



All vendors have different ways to present information about index columns:
PG - uses an array of indexed table columns and a second array with the
collation of each column.
MySQL - each row in the index view contains information about one of the
indexed columns with its position in the index, so for one index there are
many rows.
ORACLE - uses a separate view where each row represents a column included
in an index, with all required information, and can be joined by the
schema, table and index names.
ORACLE indexed columns view -
https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_1064.htm#i1577532
MySql -

I propose to use the ORACLE way and have a second view to represent the
columns included in indexes.

In this case the view can have the following information:
schema name     - Name of the schema related to the table and index.
table name      - Name of the table related to an index.
index name      - Name of the index.
column name     - Name of the column included in the index.
column type     - Type of the column.
column position - Position of the column within the index.
collation       - Whether the column is sorted descending or ascending.

And it can be joined with the index view through the schema, table and
index names.
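As a sketch, the two proposed views could be joined like this (the view
and column names below follow this proposal and are not an existing Ignite
API; `PUBLIC`/`PERSON` are placeholder names):

```sql
-- Sketch against the proposed views: list every indexed column of a
-- table, in index order.
SELECT i.index_name,
       c.column_name,
       c.column_position,
       c.collation
FROM IGNITE.INDEXES i
JOIN IGNITE.INDEX_COLUMNS c
  ON c.schema_name = i.schema_name
 AND c.table_name  = i.table_name
 AND c.index_name  = i.index_name
WHERE i.schema_name = 'PUBLIC'
  AND i.table_name  = 'PERSON'
ORDER BY i.index_name, c.column_position;
```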



What do you think about this approach and the list of columns that could
be included in the views?

-- 
Live with a smile! :D


Re: SQL views for IO statistics

2019-01-17 Thread Юрий
Denis,

As I understand, logical and physical IO operations are standard terms
used by other DB vendors, for example Oracle -
http://www.dba-oracle.com/t_oracle_logical_io_physical_io.htm ; sometimes
a logical IO operation is called a 'page/buffer hit'.

So I think the current naming is OK.

WDYT?





Wed, Jan 16, 2019 at 18:44, Denis Magda :

> Wouldn't disk_read and memory_read be better naming?
>
> -
> Denis
>
>
> On Wed, Jan 16, 2019 at 7:38 AM Юрий  wrote:
>
> > Denis,
> >
> > Physical reads is load page from storage to memory.
> > Logical reads is read page which already in memory.
> >
> > We gather IO statistics on CACHE_GROUP level due to Ignite use one page
> to
> > keep all caches related to one cache group.  Unfortunately gathering on
> > table level will be expensive due to the reason. That's way name of view
> > contains words cache and groups.
> >
> > ср, 16 янв. 2019 г. в 17:52, Denis Magda :
> >
> > > Yury,
> > >
> > > How do we differentiate between logical and physical reads?
> > >
> > > Also, it looks counter-intuitive when "CACHE" is used in the name of
> the
> > > views for SQL table related statistics. It's still hard to explain the
> > user
> > > the relations between caches and tables. Hopefully, this will be fixed
> in
> > > 3.0 with renaming but as for the statistics can we use anything neutral
> > for
> > > the view names?
> > >
> > > -
> > > Denis
> > >
> > >
> > > On Tue, Jan 15, 2019 at 5:57 AM Юрий 
> > wrote:
> > >
> > > > Hi Igniters!
> > > >
> > > > As part of IEP-27
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-27%3A+Page+IO+statistics
> > > > >
> > > > we
> > > > already gathering IO statistics and expose it through JMX.
> > > >
> > > > User  who use only SQL should have access to the statistics also. So
> > > let's
> > > > discuss about how such SQL view should looks.
> > > >
> > > > My proposal it is two SQL views:
> > > > 1) STATIO_CACHE_GRP
> > > >
> > > > cache_grp_name - Name of cache group
> > > > physical_read   - Number of physical read of pages
> > > > logical_read  - Number of logical read of pages
> > > >
> > > >
> > > >  The view can be filtered by name, like SELECT * from
> > > > IGNITE.STATIO_CACHE_GRP where cache_grp_name='cache1'
> > > > 2) STATIO_IDX
> > > >
> > > > cache_grp_name - Name of cache group
> > > >
> > > > idx_name - Name of index
> > > > physical_read   - Common number of physical reads of
> pages
> > > for
> > > > the index
> > > > logical_read  - Common number of logical reads of
> pages
> > > for
> > > > the index
> > > >
> > > > leaf_logical_read  - Number of logical reads of index leaf
> > pages
> > > >
> > > > leaf_physical_read   - Number of physical reads of index leaf
> pages
> > > >
> > > > inner_logical_read- Number of logical reads of index inner
> > pages
> > > >
> > > > inner_physical_read - Number of physical reads of index leaf
> pages
> > > >
> > > >
> > > >  The view can be filtered by cache group name or by index name,
> > like
> > > > SELECT * from IGNITE.STATIO_IDX where idx_name='cache1_name_idx'
> > > >
> > > > We also have time of start gathering statistics, but I'm not sure
> that
> > it
> > > > should be exposed here.
> > > >
> > > >
> > > > WDYT about proposed format for the SQL views?
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Живи с улыбкой! :D
> > > >
> > >
> >
> >
> > --
> > Живи с улыбкой! :D
> >
>


-- 
Live with a smile! :D


Re: SQL views for IO statistics

2019-01-16 Thread Юрий
Denis,

Physical reads load a page from storage into memory.
Logical reads access a page which is already in memory.

We gather IO statistics at the CACHE_GROUP level because Ignite uses one
page to keep all caches related to one cache group. Unfortunately,
gathering at the table level would be expensive for that reason. That's
why the name of the view contains the words cache and group.
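A minimal sketch of these definitions (illustrative only, not Ignite
internals; here a read counts as logical only when the page is already in
memory, as described above - the exact accounting in Ignite is an
assumption):

```python
# Tiny page cache that counts logical reads (page already in memory)
# vs physical reads (page loaded from storage).
class CountingPageCache:
    def __init__(self, storage):
        self.storage = storage   # page_id -> page data on "disk"
        self.memory = {}         # pages currently in memory
        self.logical_reads = 0
        self.physical_reads = 0

    def read(self, page_id):
        if page_id in self.memory:
            self.logical_reads += 1          # hit: page already in memory
        else:
            self.memory[page_id] = self.storage[page_id]
            self.physical_reads += 1         # miss: loaded from storage
        return self.memory[page_id]

cache = CountingPageCache({1: "page-1", 2: "page-2"})
cache.read(1); cache.read(1); cache.read(2)
print(cache.logical_reads, cache.physical_reads)  # 1 2
```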

Wed, Jan 16, 2019 at 17:52, Denis Magda :

> Yury,
>
> How do we differentiate between logical and physical reads?
>
> Also, it looks counter-intuitive when "CACHE" is used in the name of the
> views for SQL table related statistics. It's still hard to explain the user
> the relations between caches and tables. Hopefully, this will be fixed in
> 3.0 with renaming but as for the statistics can we use anything neutral for
> the view names?
>
> -
> Denis
>
>
> On Tue, Jan 15, 2019 at 5:57 AM Юрий  wrote:
>
> > Hi Igniters!
> >
> > As part of IEP-27
> > <
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-27%3A+Page+IO+statistics
> > >
> > we
> > already gathering IO statistics and expose it through JMX.
> >
> > User  who use only SQL should have access to the statistics also. So
> let's
> > discuss about how such SQL view should looks.
> >
> > My proposal it is two SQL views:
> > 1) STATIO_CACHE_GRP
> >
> > cache_grp_name - Name of cache group
> > physical_read   - Number of physical read of pages
> > logical_read  - Number of logical read of pages
> >
> >
> >  The view can be filtered by name, like SELECT * from
> > IGNITE.STATIO_CACHE_GRP where cache_grp_name='cache1'
> > 2) STATIO_IDX
> >
> > cache_grp_name - Name of cache group
> >
> > idx_name - Name of index
> > physical_read   - Common number of physical reads of pages
> for
> > the index
> > logical_read  - Common number of logical reads of pages
> for
> > the index
> >
> > leaf_logical_read  - Number of logical reads of index leaf pages
> >
> > leaf_physical_read   - Number of physical reads of index leaf pages
> >
> > inner_logical_read- Number of logical reads of index inner pages
> >
> > inner_physical_read - Number of physical reads of index leaf pages
> >
> >
> >  The view can be filtered by cache group name or by index name, like
> > SELECT * from IGNITE.STATIO_IDX where idx_name='cache1_name_idx'
> >
> > We also have time of start gathering statistics, but I'm not sure that it
> > should be exposed here.
> >
> >
> > WDYT about proposed format for the SQL views?
> >
> >
> >
> >
> >
> >
> > --
> > Живи с улыбкой! :D
> >
>


-- 
Live with a smile! :D


Re: proposed realization KILL QUERY command

2019-01-15 Thread Юрий
Denis and other Igniters, do you have any comments on the proposed
approach? Which of these would be better for us - simple numeric or hex
values (a shorter id, but with letters)?

As for me, hex values are preferable because they are shorter and look
more unique across logs.
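A small sketch of the two encodings under discussion (assumed semantics,
not Ignite code; the 'X' delimiter is safe for the hex variant because hex
digits are only 0-9 and A-F):

```python
# Two candidate query-id encodings: decimal node order and local query
# counter joined by a letter delimiter, vs the same numbers in hex
# (shorter, as noted in the thread: 123X15494 vs 7BX3C86).
def encode_decimal(node_order: int, query_counter: int) -> str:
    return f"{node_order}X{query_counter}"

def encode_hex(node_order: int, query_counter: int) -> str:
    return f"{node_order:X}X{query_counter:X}"

def decode_hex(query_id: str):
    node, counter = query_id.split("X")
    return int(node, 16), int(counter, 16)

print(encode_decimal(123, 15494))  # 123X15494
print(encode_hex(123, 15494))      # 7BX3C86
print(decode_hex("7BX3C86"))       # (123, 15494)
```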



Tue, Jan 15, 2019 at 18:35, Vladimir Ozerov :

> Hi,
>
> Concatenation through a letter looks like a good approach to me. As far as
> killing all queries on a specific node, I would put it aside for now - this
> looks like a separate command with possibly different parameters.
>
> On Tue, Jan 15, 2019 at 1:30 PM Юрий  wrote:
>
> > Thanks Vladimir for your thoughts.
> >
> > Based on it most convenient ways are first and third.
> > But with some modifications:
> > For first variant delimiter should be a letter, e.g. 123X15494, then it
> > could be simple copy by user.
> > For 3rd variant can be used convert both numeric to HEX and use a letter
> > delimiter not included to HEX symbols (ABCDEF), in this case query id
> will
> > be shorter and also can be simple copy by user. e.g. 7BX3C86 ( it the
> same
> > value as used for first variant), instead of convert all value as string
> to
> > base16 due to it will be really long value.
> >
> > Possible realization for the cases:
> > 1) Concatenation node order id with query id with a letter delimiter.
> >
> > query_id = 1234X8753 , where *1234* - node order, *8753* - local node
> query
> > counter. *X* - delimeter
> >
> > 2) Converting both node order id and query id to HEX.
> >
> > query_id =  7BX3C86,  value is concat(hex(node),"X",hex(queryID))
> >
> > For both variants we can use either simple or copmlex KILL QUERY syntax.
> > Simple:
> >
> > KILL QUERY 7BX3C86 - for kill concrete query
> > KILL QUERY 7B - for killing all queries on a node.  May be need extra
> > symbols for such queries to avoid fault of user and kill all queries by
> > mistake, like KILL QUERY 7B*
> >
> > Complex:
> >
> > KILL QUERY WHERE queryId=7BX3C86 - for killing concrete query.
> >
> > KILL QUERY WHERE nodeId=37d7afd8-b87d-4aa1-b3d1-c1c03380  - for kill
> > all running queries on a given node.
> >
> >
> >
> > What do you think?
> >
> >
> > вт, 15 янв. 2019 г. в 11:20, Vladimir Ozerov :
> >
> > > Hi Yuriy,
> > >
> > > I think all proposed approaches might work. The question is what is the
> > > most convenient from user perspective. Encoded values without special
> > > characters are good because they are easy to copy with mouse
> > (double-click)
> > > or keyboard (Ctrl+Shift+arrow). On the other hand, ability to identify
> > > ID/name of suspicious node from query ID is also a good thing. Several
> > > examples of query ID:
> > >
> > > CockroachDB: 14dacc1f9a781e3d0001
> > > MongoDB: shardB:79014
> > >
> > > Also it is important that the same query ID is printed in various log
> > > messages. This will be very useful for debugging purposes, e.g. grep
> over
> > > logs. So ideally query ID should not have any symbols which interfere
> > with
> > > grep syntax.
> > >
> > >
> > > On Mon, Jan 14, 2019 at 3:09 PM Юрий 
> > wrote:
> > >
> > > > Hi Igniters,
> > > >
> > > > Earlier we discuss about columns for running queries. Let's summarize
> > it
> > > > and continue discussion for not closed questions.
> > > >
> > > > What we had:
> > > > *name of view**: *running_queries
> > > > *columns and meaning*:
> > > >query_id -  unique id of query on node
> > > >node_id - initial node of request.
> > > >sql - text of query
> > > >schema_name - name of sql schema
> > > >duration - duration in milliseconds from start
> > of
> > > > execution.
> > > >
> > > > All of this columns are clear, except query_id.
> > > > Let's keep in mind that the query_id column of the view coupled with
> > KILL
> > > > QUERY command.
> > > >
> > > > We have the following variants what is query_id:
> > > > 1) It's string, internally with two parts separated by '.'(it can be
> > > other
> > > > separator): numeric node order and numeric query counter unique for
> > local
> > > > node,

SQL views for IO statistics

2019-01-15 Thread Юрий
Hi Igniters!

As part of IEP-27
we are already gathering IO statistics and exposing them through JMX.

Users who use only SQL should also have access to these statistics. So let's
discuss how such SQL views should look.

My proposal is two SQL views:
1) STATIO_CACHE_GRP

cache_grp_name - Name of the cache group
physical_read  - Number of physical page reads
logical_read   - Number of logical page reads

 The view can be filtered by name, like SELECT * from
IGNITE.STATIO_CACHE_GRP where cache_grp_name='cache1'
2) STATIO_IDX

cache_grp_name - Name of the cache group
idx_name - Name of the index
physical_read  - Total number of physical page reads for the index
logical_read   - Total number of logical page reads for the index
leaf_logical_read   - Number of logical reads of index leaf pages
leaf_physical_read  - Number of physical reads of index leaf pages
inner_logical_read  - Number of logical reads of index inner pages
inner_physical_read - Number of physical reads of index inner pages

 The view can be filtered by cache group name or by index name, like
SELECT * from IGNITE.STATIO_IDX where idx_name='cache1_name_idx'

We also track the time when statistics gathering started, but I'm not sure
it should be exposed here.


WDYT about the proposed format for the SQL views?
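To make the counters behind these columns concrete, here is a minimal Java sketch of what could back one STATIO_IDX row. The class and method names are illustrative assumptions, not actual Ignite code, and it assumes every physical read is also counted as a logical one:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counters behind one STATIO_IDX row; names are assumptions,
// not real Ignite classes.
public class IndexIoStats {
    private final LongAdder leafLogicalReads = new LongAdder();
    private final LongAdder leafPhysicalReads = new LongAdder();
    private final LongAdder innerLogicalReads = new LongAdder();
    private final LongAdder innerPhysicalReads = new LongAdder();

    /** Called on every page access; physical means the page was read from disk. */
    public void onPageRead(boolean leaf, boolean physical) {
        if (leaf) {
            leafLogicalReads.increment();
            if (physical)
                leafPhysicalReads.increment();
        }
        else {
            innerLogicalReads.increment();
            if (physical)
                innerPhysicalReads.increment();
        }
    }

    /** Value of the logical_read column: all logical reads for the index. */
    public long logicalRead() {
        return leafLogicalReads.sum() + innerLogicalReads.sum();
    }

    /** Value of the physical_read column: all physical reads for the index. */
    public long physicalRead() {
        return leafPhysicalReads.sum() + innerPhysicalReads.sum();
    }
}
```

With this shape, the per-cache-group STATIO_CACHE_GRP columns would simply sum the same counters over all indexes of the group.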






-- 
Live with a smile! :D


Re: proposed realization KILL QUERY command

2019-01-15 Thread Юрий
Thanks, Vladimir, for your thoughts.

Based on them, the most convenient options are the first and the third,
with some modifications:
For the first variant the delimiter should be a letter, e.g. 123X15494, so
that the user can easily copy it.
For the third variant we can convert both numbers to HEX and use a letter
delimiter that is not among the HEX symbols (ABCDEF); in this case the query id
will be shorter and still easy to copy, e.g. 7BX3C86 (the same
value as used for the first variant), instead of converting the whole value as
a string to base16, which would produce a really long value.

Possible realizations for the two cases:
1) Concatenate the node order id and the query id with a letter delimiter.

query_id = 1234X8753, where *1234* is the node order, *8753* is the local node
query counter, and *X* is the delimiter.

2) Convert both the node order id and the query id to HEX.

query_id = 7BX3C86, i.e. the value is concat(hex(node), "X", hex(queryID))

For both variants we can use either a simple or a complex KILL QUERY syntax.
Simple:

KILL QUERY 7BX3C86 - kill a concrete query
KILL QUERY 7B - kill all queries on a node. We may need extra
symbols for such commands to prevent a user from killing all queries by
mistake, e.g. KILL QUERY 7B*

Complex:

KILL QUERY WHERE queryId=7BX3C86 - kill a concrete query.

KILL QUERY WHERE nodeId=37d7afd8-b87d-4aa1-b3d1-c1c03380  - kill
all running queries on a given node.



What do you think?
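Variant 2 can be sketched in a few lines of Java; QueryIdEncoder is a hypothetical name, not an Ignite API, and the scheme relies only on 'X' not being a hex digit:

```java
// Sketch of the proposed query id format: hex(nodeOrder) + 'X' + hex(localCounter).
// 'X' works as a delimiter because uppercase hex only uses the letters A-F.
public class QueryIdEncoder {
    private static final char DELIM = 'X';

    /** Builds the cluster-wide query id, e.g. encode(123, 15494) -> "7BX3C86". */
    public static String encode(long nodeOrder, long localCounter) {
        return (Long.toHexString(nodeOrder) + DELIM + Long.toHexString(localCounter)).toUpperCase();
    }

    /** Parses the id back into {nodeOrder, localCounter}. */
    public static long[] decode(String qryId) {
        int i = qryId.indexOf(DELIM);

        return new long[] {
            Long.parseLong(qryId.substring(0, i), 16),
            Long.parseLong(qryId.substring(i + 1), 16)
        };
    }
}
```

Since the delimiter cannot collide with a hex digit, decoding is unambiguous, and the id stays double-click selectable and grep-friendly.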


Tue, Jan 15, 2019 at 11:20, Vladimir Ozerov :

> Hi Yuriy,
>
> I think all proposed approaches might work. The question is what is the
> most convenient from user perspective. Encoded values without special
> characters are good because they are easy to copy with mouse (double-click)
> or keyboard (Ctrl+Shift+arrow). On the other hand, ability to identify
> ID/name of suspicious node from query ID is also a good thing. Several
> examples of query ID:
>
> CockroachDB: 14dacc1f9a781e3d0001
> MongoDB: shardB:79014
>
> Also it is important that the same query ID is printed in various log
> messages. This will be very useful for debugging purposes, e.g. grep over
> logs. So ideally query ID should not have any symbols which interfere with
> grep syntax.
>
>
> On Mon, Jan 14, 2019 at 3:09 PM Юрий  wrote:
>
> > Hi Igniters,
> >
> > Earlier we discuss about columns for running queries. Let's summarize it
> > and continue discussion for not closed questions.
> >
> > What we had:
> > *name of view**: *running_queries
> > *columns and meaning*:
> >query_id -  unique id of query on node
> >node_id - initial node of request.
> >sql - text of query
> >schema_name - name of sql schema
> >duration - duration in milliseconds from start of
> > execution.
> >
> > All of this columns are clear, except query_id.
> > Let's keep in mind that the query_id column of the view coupled with KILL
> > QUERY command.
> >
> > We have the following variants what is query_id:
> > 1) It's string, internally with two parts separated by '.'(it can be
> other
> > separator): numeric node order and numeric query counter unique for local
> > node, e.g. '172.67321'. For this case query id will be really unique
> across
> > a cluster, but can be looks a strange for a user, especially in case we
> > will have ability to kill all queries on a node, when user should get
> first
> > part before separator to use it, e.g. KILL QUERY '172.*'.
> >
> > 2) Just single numeric id, unique for local node, e.g '127'. In this case
> > we need more complicated syntax for further KILL QUERY command, which
> lead
> > to use two columns from the view, e.g. KILL QUERY WHERE nodeId=
> > 37d7afd8-b87d-4aa1-b3d1-c1c03380 and queryId=67321
> >
> > 3) Use base16String(concat(node,".",queryID) as query id, e.g. '
> > 3132332E393337'. Then we hide internal structure of id and such id will
> be
> > unique across a cluster. However we will need use complicated syntax for
> > KILL QUERY command as for 2nd case.
> >
> > 4) Just single numeric id, unique for local node, e.g '127'. But user
> > should use two columns to merge it and create query id unique in a
> cluster.
> > Such approach use  by Oracle:ALTER SYSTEM CANCEL SQL 'SID, SERIAL,
> SQL_ID'.
> > In this case user will know real meaning of each part of passed parameter
> > for KILL QUERY command. But it hard to use.
> >
> > 5) Any other approach you can think of
> >
> > If be honestly I prefer first variant, it looks simple to use by user (it
&

Re: proposed realization KILL QUERY command

2019-01-14 Thread Юрий
lems with separate parameters are explained above.
> > > > > >
> > > > > > чт, 22 нояб. 2018 г. в 3:23, Denis Magda :
> > > > > >
> > > > > > > Vladimir,
> > > > > > >
> > > > > > > All of the alternatives are reminiscent of mathematical
> > operations.
> > > > Don't
> > > > > > > look like a SQL command. What if we use a SQL approach
> > introducing
> > > > named
> > > > > > > parameters:
> > > > > > >
> > > > > > > KILL QUERY query_id=10 [AND node_id=5]
> > > > > > >
> > > > > > > --
> > > > > > > Denis
> > > > > > >
> > > > > > > On Wed, Nov 21, 2018 at 4:11 AM Vladimir Ozerov <
> > > > voze...@gridgain.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Denis,
> > > > > > > >
> > > > > > > > Space is bad candidate because it is a whitespace. Without
> > > > whitespaces
> > > > > > we
> > > > > > > > can have syntax without quotes at all. Any non-whitespace
> > delimiter
> > > > > > will
> > > > > > > > work, though:
> > > > > > > >
> > > > > > > > KILL QUERY 45.1
> > > > > > > > KILL QUERY 45-1
> > > > > > > > KILL QUERY 45:1
> > > > > > > >
> > > > > > > > On Wed, Nov 21, 2018 at 3:06 PM Юрий <
> > jury.gerzhedow...@gmail.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Denis,
> > > > > > > > >
> > > > > > > > > Let's consider parameter of KILL QUERY just a string with
> > some
> > > > query
> > > > > > > id,
> > > > > > > > > without any meaning for user. User just need to get the id
> > and
> > > > pass
> > > > > > as
> > > > > > > > > parameter to KILL QUERY command.
> > > > > > > > >
> > > > > > > > > Even if query is distributed it have single query id from
> > user
> > > > > > > > perspective
> > > > > > > > > and will killed on all nodes. User just need to known one
> > global
> > > > > > query
> > > > > > > > id.
> > > > > > > > >
> > > > > > > > > How it can works.
> > > > > > > > > 1)SELECT * from running_queries
> > > > > > > > > result is
> > > > > > > > >  query_id | node_id
> > > > > > > > >   | sql   | schema_name | connection_id |
> > duration
> > > > > > > > > 123.33 | e0a69cb8-a1a8-45f6-b84d-ead367a0   |
> SELECT
> > > > ...  |
> > > > > > ...
> > > > > > > > >   |   22 | 23456
> > > > > > > > > 333.31 | aaa6acb8-a4a5-42f6-f842-ead111b00020 |
> > > > UPDATE...  |
> > > > > > > ...
> > > > > > > > >   |  321| 346
> > > > > > > > > 2) KILL QUERY '123.33'
> > > > > > > > >
> > > > > > > > > So, user need select query_id from running_queries view and
> > use
> > > > it
> > > > > > for
> > > > > > > > KILL
> > > > > > > > > QUERY command.
> > > > > > > > >
> > > > > > > > > I hope it became clearer.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ср, 21 нояб. 2018 г. в 02:11, Denis Magda <
> dma...@apache.org
> > >:
> > > > > > > > >
> > > > > > > > > > Folks,
> > > > > > > > > >
> > > > > > > > > > The decimal syntax is really odd - KILL QUERY
> > > > > > > > > > '[node_order].[query_counter]'
> &

Re: Query history statistics API

2019-01-09 Thread Юрий
Hi,

I have a question related to the subject. Do you think we should track
EXPLAIN queries as well? I see reasons both to skip them and to include them in
the history:

pros:

We will have the full picture and see all queries.

cons:

Such queries can be considered investigation/debug/service queries and can
push real queries out of the history.


What is your opinion?


Fri, Dec 21, 2018 at 19:23, Юрий :

> Vladimir, thanks for your expert opinion.
>
> I have some thoughts about 5 point.
> I tried to find how it works for Oracle and PG:
>
> *PG*: keep by default 1000 (can be configured) statements without and
> discard the least-executed statements. Update statistics is asynchronous
> process and statistics may have lag.
>
> *Oracle*: use shared pool for historical data and can evict records with
> min time of last execution in case free space at shared pool is not enough
> for a data which can be related not only historical statistics. So seems
> also separate asynchronous process (information about it so small).
>
>
> Unfortunately I could not find information about big workload and how it
> handled for these databases. However We could see that both of vendors use
> asynchronous statistic processing.
>
>
> I see few variants how we can handle very high workload.
>
> First part of variants use asynchronous model with separate thread which
> should take elements to update stats from a queue:
> 1) We blocking on overlimited queue and wait when capacity will be enough
> to put new element.
>
> + We have all actual statistics
> - End of our query execution can be blocked.
>
> 2) Discard statistics for ended query in case queue is full.
>
> + Very fast for current query
> - We lose part of statistics.
>
> 3) Do full clean of statistic's queue.
>
> + Fast and freespace for further elements
> - We lose big number of statistic elements.
>
>
> Second part of variants use current approach for queryMetrics. When we
> have some additional capacity for CHM with history + periodical cleanup the
> Map. In case even the additional space is not enough we can :
> 1) Discard statistics for ended query
> 2) Do full clean CHM and discard all gathered information.
>
> First part of variants potentially should work faster due to we can update
> history Map in single thread without contention and put to queue should be
> faster.
>
>
> What do you think? Which of the variant will be prefer or may be you can
> suggest another way to handle potential huge workload?
>
> Also there is one initial question which stay not clear to me - it is
> right place for new API.
>
>
> пт, 21 дек. 2018 г. в 13:05, Vladimir Ozerov :
>
>> Hi,
>>
>> I'd propose the following approach:
>> 1) Enable history by default. Becuase otherwise users will have to restart
>> the node to enable it, or we will have to implement dynamic history
>> enable,
>> which is complex thing. Default value should be relatively small yet
>> allowing to accommodate typical workloads. E.g. 1000 entries. This should
>> not put any serious pressure to GC.
>> 2) Split queries by: schema, query, local flag
>> 3) Track only growing values: execution count, error count, minimum
>> duration, maximum duration
>> 4) Implement ability to clear history - JMX, SQL command, whatever (may be
>> this is different ticket)
>> 5) History cleanup might be implemented similarly to current approach:
>> store everything in CHM. Periodically check it's size. If it is too big -
>> evict oldest entries. But this should be done with care - under some
>> workloads new queries will be generated very quickly. In this case we
>> should either fallback to synchronous evicts, or do not log history at
>> all.
>>
>> Thoughts?
>>
>> Vladimir.
>> -
>>
>> On Fri, Dec 21, 2018 at 11:22 AM Юрий 
>> wrote:
>>
>> > Alexey,
>> >
>> > Yes, such property to configuration history size will be added. I think
>> > default value should be 0 and history by default shouldn't be gather at
>> > all, and can be switched on by property in case when it required.
>> >
>> > Currently I planned use the same way to evicting old data as for
>> > queryMetrics - scheduled task will evict will old data by oldest start
>> time
>> > of query.
>> >
>> > Will be gathered statistics for only initial clients queries, so
>> internal
>> > queries will not including. For the same queries we will have one
>> record in
>> > history with merged statistics.
>> >
>> > All above points just my proposal. Please revert back in case you think
>> > anything sh

Re: Query history statistics API

2018-12-21 Thread Юрий
Vladimir, thanks for your expert opinion.

I have some thoughts about point 5.
I tried to find out how this works in Oracle and PG:

*PG*: keeps 1000 statements by default (configurable) and
discards the least-executed statements. Updating the statistics is an
asynchronous process, and the statistics may lag.

*Oracle*: uses a shared pool for historical data and can evict the records with
the minimum last-execution time when the free space in the shared pool is not
enough for data that may be unrelated to historical statistics. So this also
seems to be a separate asynchronous process (the available information about it
is scarce).


Unfortunately I could not find information about heavy workloads and how they
are handled by these databases. However, we can see that both vendors use
asynchronous statistics processing.


I see a few options for handling a very high workload.

The first group of options uses an asynchronous model with a separate thread
that takes elements to update stats from a queue:
1) Block on an over-limit queue and wait until there is enough capacity
to put a new element.

+ We have all the actual statistics
- The end of our query execution can be blocked.

2) Discard statistics for the finished query when the queue is full.

+ Very fast for the current query
- We lose part of the statistics.

3) Fully clean the statistics queue.

+ Fast, and frees space for further elements
- We lose a large number of statistics elements.


The second group of options uses the current approach for queryMetrics: a CHM
with some additional capacity for the history, plus periodical cleanup of the
map. In case even the additional space is not enough, we can:
1) Discard statistics for the finished query
2) Fully clean the CHM and discard all gathered information.

The first group of options should potentially work faster, because we can
update the history map in a single thread without contention, and putting into
a queue should be faster.


What do you think? Which of the options is preferable, or maybe you can
suggest another way to handle a potentially huge workload?

There is also one initial question which is still not clear to me - the right
place for the new API.
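Option 2 of the first group (never block the query thread; drop statistics when the queue is full) can be sketched as follows. The class and method names are hypothetical illustrations, not Ignite code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the non-blocking variant: the query thread never waits,
// records are dropped when the bounded queue is full.
public class AsyncStatsQueue {
    private final BlockingQueue<String> queue;
    private final AtomicLong discarded = new AtomicLong();

    public AsyncStatsQueue(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Called at the end of query execution; returns false if the record was dropped. */
    public boolean offerFinished(String qryStats) {
        if (queue.offer(qryStats))
            return true;

        discarded.incrementAndGet(); // Part of the statistics is lost.

        return false;
    }

    /** Called by the single statistics-updater thread. */
    public String poll() {
        return queue.poll();
    }

    /** How many records were dropped because the queue was full. */
    public long discardedCount() {
        return discarded.get();
    }
}
```

A single consumer thread draining poll() keeps the history map free of contention, which is the reason this group of options should be faster.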


Fri, Dec 21, 2018 at 13:05, Vladimir Ozerov :

> Hi,
>
> I'd propose the following approach:
> 1) Enable history by default. Becuase otherwise users will have to restart
> the node to enable it, or we will have to implement dynamic history enable,
> which is complex thing. Default value should be relatively small yet
> allowing to accommodate typical workloads. E.g. 1000 entries. This should
> not put any serious pressure to GC.
> 2) Split queries by: schema, query, local flag
> 3) Track only growing values: execution count, error count, minimum
> duration, maximum duration
> 4) Implement ability to clear history - JMX, SQL command, whatever (may be
> this is different ticket)
> 5) History cleanup might be implemented similarly to current approach:
> store everything in CHM. Periodically check it's size. If it is too big -
> evict oldest entries. But this should be done with care - under some
> workloads new queries will be generated very quickly. In this case we
> should either fallback to synchronous evicts, or do not log history at all.
>
> Thoughts?
>
> Vladimir.
> -
>
> On Fri, Dec 21, 2018 at 11:22 AM Юрий  wrote:
>
> > Alexey,
> >
> > Yes, such property to configuration history size will be added. I think
> > default value should be 0 and history by default shouldn't be gather at
> > all, and can be switched on by property in case when it required.
> >
> > Currently I planned use the same way to evicting old data as for
> > queryMetrics - scheduled task will evict will old data by oldest start
> time
> > of query.
> >
> > Will be gathered statistics for only initial clients queries, so internal
> > queries will not including. For the same queries we will have one record
> in
> > history with merged statistics.
> >
> > All above points just my proposal. Please revert back in case you think
> > anything should be implemented by another way.
> >
> >
> >
> >
> >
> > чт, 20 дек. 2018 г. в 18:23, Alexey Kuznetsov :
> >
> > > Yuriy,
> > >
> > > I have several questions:
> > >
> > > Are we going to add some properties to cluster configuration for
> history
> > > size?
> > >
> > > And what will be default history size?
> > >
> > > Will the same queries count as same item of historical data?
> > >
> > > How we will evict old data that not fit into history?
> > >
> > > Will we somehow count "reduce" queries? Or only final "map" ones?
> > >
> > > --
> > > Alexey Kuznetsov
> > >
> >
> >
> > --
> > Живи с улыбкой! :D
> >
>


-- 
Live with a smile! :D


Re: Query history statistics API

2018-12-21 Thread Юрий
Alexey,

Yes, such a property to configure the history size will be added. I think
the default value should be 0, so that the history is not gathered at
all by default and can be switched on by the property when required.

Currently I plan to evict old data the same way as for
queryMetrics - a scheduled task will evict old data by the oldest query
start time.

Statistics will be gathered only for initial client queries, so internal
queries will not be included. For identical queries we will have one record in
the history with merged statistics.

All the above points are just my proposal. Please reply in case you think
anything should be implemented another way.
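As an illustration of the "one record with merged statistics" idea, here is a possible shape of a history record that tracks only growing values (execution count, failures, min/max duration); the class name and fields are assumptions, not the actual implementation:

```java
// Sketch of one history record that merges statistics of identical queries.
public class QueryHistoryEntry {
    private long executions;
    private long failures;
    private long minDurationMs = Long.MAX_VALUE;
    private long maxDurationMs = Long.MIN_VALUE;

    /** Merges one finished execution of the same query into this record. */
    public void merge(long durationMs, boolean failed) {
        executions++;

        if (failed)
            failures++;

        minDurationMs = Math.min(minDurationMs, durationMs);
        maxDurationMs = Math.max(maxDurationMs, durationMs);
    }

    public long executions() { return executions; }
    public long failures() { return failures; }
    public long minDurationMs() { return minDurationMs; }
    public long maxDurationMs() { return maxDurationMs; }
}
```

Because every field only grows (or shrinks monotonically, for the minimum), merging is order-independent, which makes asynchronous updates safe.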





Thu, Dec 20, 2018 at 18:23, Alexey Kuznetsov :

> Yuriy,
>
> I have several questions:
>
> Are we going to add some properties to cluster configuration for history
> size?
>
> And what will be default history size?
>
> Will the same queries count as same item of historical data?
>
> How we will evict old data that not fit into history?
>
> Will we somehow count "reduce" queries? Or only final "map" ones?
>
> --
> Alexey Kuznetsov
>


-- 
Live with a smile! :D


Query history statistics API

2018-12-20 Thread Юрий
Hi Igniters,

As of now we have query statistics
(*org.apache.ignite.IgniteCache#queryMetrics*), but they look a little bit
wrong, for at least a few reasons:
1) The execution duration is just the time between the start of execution and
returning the cursor to the client; it doesn't include the whole lifetime of
the query.
2) They don't know about multi-statement queries. Such queries participate
in the statistics as a single query, without splitting.
3) The API to access the statistics is exposed as cache-dependent, although
queries don't have such a dependency.

I want to create a parallel, similar implementation to the one we already have,
but it should fix all the above mistakes:
1) Use the new infrastructure for tracking running queries developed under
IGNITE-10621 and
update statistics when the query has really finished, so we will have the real
execution time.
2) The new infrastructure supports multi-statement queries, and all parts of
such queries will be tracked independently of each other.
3) Expose the API at a higher level than it is placed now. It seems it should
be at the Ignite level, but I'm not fully sure.

The old API will be marked as deprecated and could be deleted later, maybe
in the 3.0 version.

I created the corresponding JIRA ticket - IGNITE-10754


Please give me some recommendations on the right place to expose the API, to
avoid misunderstandings in the future.

WDYT, do you have additional proposals for this?





-- 
Live with a smile! :D


Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

2018-12-04 Thread Юрий
Hey Igniters!

I continue working on IEP-29
<https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring>
in the part related to exposing a view of currently running queries.

I checked how we track running queries right now and see that it is
complicated, spread across a few classes, and we don't track all
running queries. Also, there are internal queries which are tracked within user
queries and can't be distinguished from user queries, e.g. map queries for
map-reduce queries, or DML operations which require a first step such as a
select to modify data.

My proposal is to extract the logic for working with running-query information
into a separate class, like RunningQueryManager. The class will track running
queries and will be the single point to retrieve information about them.
Currently the GridRunningQueryInfo class is used to keep information about
running queries. As of now it can't provide enough information to distinguish
internal queries from user queries, so the class needs to be extended to keep
the type of the query and, for an internal query, the id of the initial user
query, to be able to identify it.

The new RunningQueryManager should be used in all places where we currently
track running queries, and added in all places not covered
yet - mostly DDL and DML operations.

After implementing the proposed change we can simply expose a SQL view for all
running queries on the local node.
Collecting information from all nodes in a cluster is currently out of scope
of this change and will be described for discussion later.

Are there any objections to the described proposal?
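A minimal sketch of what such a RunningQueryManager could look like, based purely on the description above; all names, and the way internal queries point to their initial user query, are assumptions rather than actual Ignite code:

```java
import java.util.Collection;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a single registration point for running queries.
public class RunningQueryManager {
    /** Immutable descriptor of one running query. */
    public static class QueryInfo {
        final long id;
        final String sql;
        final Long initialUserQryId; // null for user queries, set for internal ones.

        QueryInfo(long id, String sql, Long initialUserQryId) {
            this.id = id;
            this.sql = sql;
            this.initialUserQryId = initialUserQryId;
        }

        /** Internal queries carry the id of the user query that spawned them. */
        public boolean internal() {
            return initialUserQryId != null;
        }
    }

    private final AtomicLong idGen = new AtomicLong();
    private final ConcurrentMap<Long, QueryInfo> running = new ConcurrentHashMap<>();

    /** Registers a query and returns its local id; pass null for a user query. */
    public long register(String sql, Long initialUserQryId) {
        long id = idGen.incrementAndGet();
        running.put(id, new QueryInfo(id, sql, initialUserQryId));
        return id;
    }

    /** Removes a finished query; the natural hook for future statistics. */
    public QueryInfo unregister(long id) {
        return running.remove(id);
    }

    /** Snapshot that a running_queries SQL view could be built on. */
    public Collection<QueryInfo> runningQueries() {
        return running.values();
    }
}
```

Routing every register/unregister through one manager is what makes it possible to expose a complete local running_queries view, DDL and DML included.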

Thu, Nov 29, 2018 at 09:03, Юрий :

> Hi Alex,
>
> I've just started implement of the view. Thanks for the your efforts!
>
> ср, 28 нояб. 2018 г. в 19:00, Alex Plehanov :
>
>> Yuriy,
>>
>> If you have plans to implement running queries view in the nearest future,
>> I already have implemented draft for local node queries some time ago [1].
>> Maybe it will help to implement a view for whole cluster queries.
>>
>> [1]:
>>
>> https://github.com/alex-plekhanov/ignite/commit/6231668646a2b0f848373eb4e9dc38d127603e43
>>
>>
>> ср, 28 нояб. 2018 г. в 17:34, Vladimir Ozerov :
>>
>> > Denis
>> >
>> > I would wait for running queries view first.
>> >
>> > ср, 28 нояб. 2018 г. в 1:57, Denis Magda :
>> >
>> > > Vladimir,
>> > >
>> > > Please see inline
>> > >
>> > > On Mon, Nov 19, 2018 at 8:23 AM Vladimir Ozerov > >
>> > > wrote:
>> > >
>> > > > Denis,
>> > > >
>> > > > I partially agree with you. But there are several problem with
>> syntax
>> > > > proposed by you:
>> > > > 1) This is harder to implement technically - more parsing logic to
>> > > > implement. Ok, this is our internal problem, users do not care
>> about it
>> > > > 2) User will have to consult to docs in any case
>> > > >
>> > >
>> > > Two of these are not a big deal. We just need to invest more time for
>> > > development and during the design phase so that people need to consult
>> > the
>> > > docs rarely.
>> > >
>> > >
>> > > > 3) "nodeId" is not really node ID. For Ignite users node ID is
>> UUID. In
>> > > our
>> > > > case this is node order, and we intentionally avoided any naming
>> here.
>> > > >
>> > >
>> > > Let's use a more loose name such as "node".
>> > >
>> > >
>> > > > Query is just identified by a string, no more than that
>> > > > 4) Proposed syntax is more verbose and open ways for misuse. E.g.
>> what
>> > is
>> > > > "KILL QUERY WHERE queryId=1234"?
>> > > >
>> > > > I am not 100% satisfied with both variants, but the first one looks
>> > > simpler
>> > > > to me. Remember, that user will not guess query ID. Instead, he will
>> > get
>> > > > the list of running queries with some other syntax. What we need to
>> > > > understand for now is how this syntax will look like. I think that
>> we
>> > > > should implement getting list of running queries, and only then
>> start
>> > > > working on cancellation.
>> > > >
>> > >
>> > > That's a good point. Syntax of both running and killing queires
>> commands
>> > > should be tightly coupled. We're going to name a column i

Re: [VOTE] Apache Ignite 2.7.0 RC1

2018-11-30 Thread Юрий
+1

Fri, Nov 30, 2018 at 13:25, Anton Vinogradov :

> +1
>
> пт, 30 нояб. 2018 г. в 10:05, Seliverstov Igor :
>
> > +1
> >
> > пт, 30 нояб. 2018 г., 9:59 Nikolay Izhikov nizhi...@apache.org:
> >
> > > Igniters,
> > >
> > > We've uploaded a 2.7.0 release candidate to
> > >
> > > https://dist.apache.org/repos/dist/dev/ignite/2.7.0-rc1/
> > >
> > > Git tag name is 2.7.0-rc1
> > >
> > > This release includes the following changes:
> > >
> > > Apache Ignite In-Memory Database and Caching Platform 2.7
> > > -
> > >
> > > Ignite:
> > > * Added experimental support for multi-version concurrency control with
> > > snapshot isolation
> > >   - available for both cache API and SQL
> > >   - use CacheAtomicityMode.TRANSACTIONAL_SNAPSHOT to enable it
> > >   - not production ready, data consistency is not guaranteed in case of
> > > node failures
> > > * Implemented Transparent Data Encryption based on JKS certificates
> > > * Implemented Node.JS Thin Client
> > > * Implemented Python Thin Client
> > > * Implemented PHP Thin Client
> > > * Ignite start scripts now support Java 9 and higher
> > > * Added ability to set WAL history size in bytes
> > > * Added SslContextFactory.protocols and SslContextFactory.cipherSuites
> > > properties to control which SSL encryption algorithms can be used
> > > * Added JCache 1.1 compliance
> > > * Added IgniteCompute.withNoResultCache method with semantics similar
> to
> > > ComputeTaskNoResultCache annotation
> > > * Spring Data 2.0 is now supported in the separate module
> > > 'ignite-spring-data_2.0'
> > > * Added monitoring of critical system workers
> > > * Added ability to provide custom implementations of ExceptionListener
> > for
> > > JmsStreamer
> > > * Ignite KafkaStreamer was upgraded to use new KafkaConsmer
> configuration
> > > * S3 IP Finder now supports subfolder usage instead of bucket root
> > > * Improved dynamic cache start speed
> > > * Improved checkpoint performance by decreasing mark duration.
> > > * Added ability to manage compression level for compressed WAL
> archives.
> > > * Added metrics for Entry Processor invocations.
> > > * Added JMX metrics: ClusterMetricsMXBean.getTotalBaselineNodes and
> > > ClusterMetricsMXBean.getActiveBaselineNodes
> > > * Node uptime metric now includes days count
> > > * Exposed info about thin client connections through JMX
> > > * Introduced new system property IGNITE_REUSE_MEMORY_ON_DEACTIVATE to
> > > enable reuse of allocated memory on node deactivation (disabled by
> > default)
> > > * Optimistic transaction now will be properly rolled back if waiting
> too
> > > long for a new topology on remap
> > > * ScanQuery with setLocal flag now checks if the partition is actually
> > > present on local node
> > > * Improved cluster behaviour when a left node does not cause partition
> > > affinity assignment changes
> > > * Interrupting user thread during partition initialization will no
> longer
> > > cause node to stop
> > > * Fixed problem when partition lost event was not triggered if multiple
> > > nodes left cluster
> > > * Fixed massive node drop from the cluster on temporary network issues
> > > * Fixed service redeployment on cluster reactivation
> > > * Fixed client node stability under ZooKeeper discovery
> > > * Massive performance and stability improvements
> > >
> > > Ignite .Net:
> > > * Add .NET Core 2.1 support
> > > * Added thin client connection failover
> > >
> > > Ignite C++:
> > > * Implemented Thin Client with base cache operations
> > > * Implemented smart affinity routing for Thin Client to send requests
> > > directly to nodes containing data when possible
> > > * Added Clang compiler support
> > >
> > > SQL:
> > > * Added experimental support for fully ACID transactional SQL with the
> > > snapshot isolation:
> > >   - use CacheAtomicityMode.TRANSACTIONAL_SNAPSHOT to enable it
> > >   - a transaction can be started through native API
> (IgniteTransactions),
> > > thin JDBC driver or ODBC driver
> > >   - not production ready, data consistency is not guaranteed in case of
> > > node failures
> > > * Added a set of system views located in "IGNITE" schema to view
> cluster
> > > information (NODES, NODE_ATTRIBUTES, NODE_METRICS, BASELINE_NODES)
> > > * Added ability to create predefined SQL schemas
> > > * Added GROUP_CONCAT function support
> > > * Added string length constraint
> > > * Custom Java objects are now inlined into primary and secondary
> indexes
> > > what may significantly improve performance when AFFINITY_KEY is used
> > > * Added timeout to fail query execution in case it cannot be mapped to
> > > topology
> > > * Restricted number of cores allocated for CREATE INDEX by default to 4
> > to
> > > avoid contention on index tree Fixed transaction hanging during runtime
> > > error on commit.
> > > * Fixed possible memory leak when result set size is multiple of the
> page
> > > size
> > > * Fixed situation when data may be returned f

Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

2018-11-28 Thread Юрий
Hi Alex,

I've just started implementing the view. Thanks for your efforts!

Wed, Nov 28, 2018 at 19:00, Alex Plehanov :

> Yuriy,
>
> If you have plans to implement running queries view in the nearest future,
> I already have implemented draft for local node queries some time ago [1].
> Maybe it will help to implement a view for whole cluster queries.
>
> [1]:
>
> https://github.com/alex-plekhanov/ignite/commit/6231668646a2b0f848373eb4e9dc38d127603e43
>
>
> ср, 28 нояб. 2018 г. в 17:34, Vladimir Ozerov :
>
> > Denis
> >
> > I would wait for running queries view first.
> >
> > ср, 28 нояб. 2018 г. в 1:57, Denis Magda :
> >
> > > Vladimir,
> > >
> > > Please see inline
> > >
> > > On Mon, Nov 19, 2018 at 8:23 AM Vladimir Ozerov 
> > > wrote:
> > >
> > > > Denis,
> > > >
> > > > I partially agree with you. But there are several problem with syntax
> > > > proposed by you:
> > > > 1) This is harder to implement technically - more parsing logic to
> > > > implement. Ok, this is our internal problem, users do not care about
> it
> > > > 2) User will have to consult to docs in any case
> > > >
> > >
> > > Two of these are not a big deal. We just need to invest more time for
> > > development and during the design phase so that people need to consult
> > the
> > > docs rarely.
> > >
> > >
> > > > 3) "nodeId" is not really node ID. For Ignite users node ID is UUID.
> In
> > > our
> > > > case this is node order, and we intentionally avoided any naming
> here.
> > > >
> > >
> > > Let's use a more loose name such as "node".
> > >
> > >
> > > > Query is just identified by a string, no more than that
> > > > 4) Proposed syntax is more verbose and open ways for misuse. E.g.
> what
> > is
> > > > "KILL QUERY WHERE queryId=1234"?
> > > >
> > > > I am not 100% satisfied with both variants, but the first one looks
> > > simpler
> > > > to me. Remember, that user will not guess query ID. Instead, he will
> > get
> > > > the list of running queries with some other syntax. What we need to
> > > > understand for now is how this syntax will look like. I think that we
> > > > should implement getting list of running queries, and only then start
> > > > working on cancellation.
> > > >
> > >
> > > That's a good point. Syntax of both running and killing queires
> commands
> > > should be tightly coupled. We're going to name a column if running
> > queries
> > > IDs somehow anyway and that name might be resued in the WHERE clause of
> > > KILL.
> > >
> > > Should we discuss the syntax in a separate thread?
> > >
> > > --
> > > Denis
> > >
> > > >
> > > > Vladimir.
> > > >
> > > >
> > > > On Mon, Nov 19, 2018 at 7:02 PM Denis Mekhanikov <
> > dmekhani...@gmail.com>
> > > > wrote:
> > > >
> > > > > Guys,
> > > > >
> > > > > Syntax like *KILL QUERY '25.1234'* look a bit cryptic to me.
> > > > > I'm going to look up in documentation, which parameter goes first
> in
> > > this
> > > > > query every time I use it.
> > > > > I like the syntax, that Igor suggested more.
> > > > > Will it be better if we make *nodeId* and *queryId *named
> properties?
> > > > >
> > > > > Something like this:
> > > > > KILL QUERY WHERE nodeId=25 and queryId=1234
> > > > >
> > > > > Denis
> > > > >
> > > > > пт, 16 нояб. 2018 г. в 14:12, Юрий :
> > > > >
> > > > > > I fully agree with last sentences and can start to implement this
> > > part.
> > > > > >
> > > > > > Guys, thanks for your productive participate at discussion.
> > > > > >
> > > > > > пт, 16 нояб. 2018 г. в 2:53, Denis Magda :
> > > > > >
> > > > > > > Vladimir,
> > > > > > >
> > > > > > > Thanks, make perfect sense to me.
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Nov 15, 201

Re: [VOTE] Creation dedicated list for github notifiacations

2018-11-27 Thread Юрий
+1

Tue, Nov 27, 2018 at 11:22, Andrey Mashenkov :

> +1
>
> On Tue, Nov 27, 2018 at 10:12 AM Sergey Chugunov <
> sergey.chugu...@gmail.com>
> wrote:
>
> > +1
> >
> > Plus this dedicated list should be properly documented in wiki,
> mentioning
> > it in How to Contribute [1] or in Make Teamcity Green Again [2] would be
> a
> > good idea.
> >
> > [1] https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/Make+Teamcity+Green+Again
> >
> > On Tue, Nov 27, 2018 at 9:51 AM Павлухин Иван 
> wrote:
> >
> > > +1
> > > вт, 27 нояб. 2018 г. в 09:22, Dmitrii Ryabov :
> > > >
> > > > 0
> > > > вт, 27 нояб. 2018 г. в 02:33, Alexey Kuznetsov <
> akuznet...@apache.org
> > >:
> > > > >
> > > > > +1
> > > > > Do not forget notification from GitBox too!
> > > > >
> > > > > On Tue, Nov 27, 2018 at 2:20 AM Zhenya  >
> > > wrote:
> > > > >
> > > > > > +1, already make it by filers.
> > > > > >
> > > > > > > This was discussed already [1].
> > > > > > >
> > > > > > > So, I want to complete this discussion with moving outside
> > dev-list
> > > > > > > GitHub-notification to dedicated list.
> > > > > > >
> > > > > > > Please start voting.
> > > > > > >
> > > > > > > +1 - to accept this change.
> > > > > > > 0 - you don't care.
> > > > > > > -1 - to decline this change.
> > > > > > >
> > > > > > > This vote will go for 72 hours.
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > >
> > >
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/Time-to-remove-automated-messages-from-the-devlist-td37484i20.html
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Alexey Kuznetsov
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Ivan Pavlukhin
> > >
> >
>
>
> --
> Best regards,
> Andrey V. Mashenkov
>


-- 
Живи с улыбкой! :D


Re: proposed realization KILL QUERY command

2018-11-21 Thread Юрий
Denis,

Let's consider the parameter of KILL QUERY to be just a string containing a
query id, without any special meaning for the user. The user only needs to
get the id and pass it as a parameter to the KILL QUERY command.

Even if a query is distributed, it has a single query id from the user's
perspective and will be killed on all nodes. The user just needs to know
one global query id.

Here is how it can work:
1) SELECT * FROM running_queries
result is
query_id | node_id                              | sql        | schema_name | connection_id | duration
123.33   | e0a69cb8-a1a8-45f6-b84d-ead367a0     | SELECT ... | ...         | 22            | 23456
333.31   | aaa6acb8-a4a5-42f6-f842-ead111b00020 | UPDATE ... | ...         | 321           | 346
2) KILL QUERY '123.33'

So, the user needs to select the query_id from the running_queries view and
use it in the KILL QUERY command.

I hope this makes it clearer.
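Just to illustrate the id format (this is a sketch of mine, not Ignite code;
the function name is hypothetical): the global id is simply a node-order
prefix and a per-node query counter joined by a dot, so it can be split back
apart trivially:

```python
# Hypothetical sketch: split a global query id such as '123.33' into its
# node-order and per-node query-counter parts.
def parse_global_query_id(query_id: str) -> tuple[int, int]:
    node_order, _, counter = query_id.partition('.')
    return int(node_order), int(counter)

print(parse_global_query_id('123.33'))  # (123, 33)
```

A KILL QUERY implementation could then use the first part to route the
request to the initiating node and the second part to pick the query there.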



Wed, Nov 21, 2018 at 02:11, Denis Magda :

> Folks,
>
> The decimal syntax is really odd - KILL QUERY
> '[node_order].[query_counter]'
>
> Confusing, let's use a space to separate parameters.
>
> Also, what if I want to halt a specific query with certain ID? Don't know
> the node number, just know that the query is distributed and runs across
> several machines. Sounds like the syntax still should consider
> [node_order/id] as an optional parameter.
>
> Probably, if you explain to me how an end user will use this command from
> the very beginning (how do I look for a query id and node id, etc) then the
> things get clearer.
>
> --
> Denis
>
> On Tue, Nov 20, 2018 at 1:03 AM Юрий  wrote:
>
> > Hi Vladimir,
> >
> > Thanks for your suggestion to use MANAGEMENT_POOL for processing
> > cancellation requests.
> >
> > About your questions.
> > 1) I'm going to implements SQL view to provide list of running queries.
> The
> > SQL VIEW has been a little bit discussed earlier. Proposed name is
> > *running_queries* with following columns: query_id, node_id, sql,
> > schema_name, connection_id, duration. Currently most of the information
> can
> > be  retrieved through cache API, however it doesn't matter, any case we
> > need to expose SQL VIEW. Seem's you are right - the part should be
> > implemented firstly.
> > 2) Fully agree that we need to support all kind of SQL queries
> > (SLECT/DML/DDL, transactional, non transnational, local, distributed). I
> > definitely sure that it will possible for all of above, however I'm not
> > sure about DDL - need to investigate it deeper. Also need to understand
> > that canceled DML operation can lead to partially updated data for non
> > transational caches.
> >
> >
> >
> > пн, 19 нояб. 2018 г. в 19:17, Vladimir Ozerov :
> >
> > > Hi Yuriy,
> > >
> > > I think we can use MANAGEMENT_POOL for this. It is already used for
> some
> > > internal Ignite tasks, and it appears to be a good candidate to process
> > > cancel requests.
> > >
> > > But there are several things which are not clear enough for me at the
> > > moment:
> > > 1) How user is going to get the list of running queries in the first
> > place?
> > > Do we already have any SQL commands/views to get this information?
> > > 2) We need to ensure that KILL command will be processed properly by
> all
> > > kinds of SQL queries - SELECT/DML/DDL, non-transactional or
> > transactional,
> > > local queries and distributed queries. Will we be able to support all
> > these
> > > modes?
> > >
> > > Vladimir.
> > >
> > > On Mon, Nov 19, 2018 at 6:37 PM Юрий 
> > wrote:
> > >
> > > > Hi Igniters,
> > > >
> > > > Earlier we agreed about syntax KILL QUERY
> > '[node_order].[query_counter]',
> > > > e.g. KILL QUERY '25.123' for single query  or KILL QUERY '25.*' for
> all
> > > > queries on the node. Which is part of IEP-29
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > > > >
> > > > .
> > > >
> > > > Now I want to discuss internal realization of KILL query feature.
> > > >
> > > > My current vision is following:
> > > > After parsing, Ignite create KILL query command with two parameters:
> > > > nodeOrderId, nodeQryId. To determine that need to kill all queries
> on a
> > > > node we can use negative value of query id, due to qry id always have
> > > > positive values.
>

Re: proposed realization KILL QUERY command

2018-11-20 Thread Юрий
Hi Vladimir,

Thanks for your suggestion to use MANAGEMENT_POOL for processing
cancellation requests.

About your questions.
1) I'm going to implement an SQL view that provides the list of running
queries. The SQL view was briefly discussed earlier. The proposed name is
*running_queries* with the following columns: query_id, node_id, sql,
schema_name, connection_id, duration. Currently most of the information can
be retrieved through the cache API, but in any case we need to expose the
SQL view. It seems you are right: this part should be implemented first.
2) I fully agree that we need to support all kinds of SQL queries
(SELECT/DML/DDL, transactional, non-transactional, local, distributed). I
am fairly sure it will be possible for all of the above, though I'm not
sure about DDL: that needs deeper investigation. We also need to keep in
mind that a canceled DML operation can leave partially updated data in
non-transactional caches.
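As a rough sketch of what one row of the proposed view could carry
(illustrative Python; the column names follow the proposal, while the class
and method names are mine), including the timezone-independent duration:

```python
from dataclasses import dataclass

# Hypothetical sketch of one running_queries row; the column names follow
# the proposal, the class and method names are made up for illustration.
@dataclass
class RunningQuery:
    query_id: str
    node_id: str
    sql: str
    schema_name: str
    connection_id: int
    start_time: float  # kept internally; the view exposes a duration instead

    def duration_ms(self, now: float) -> int:
        # duration = now - startTime, so the value does not depend on a timezone
        return int((now - self.start_time) * 1000)

row = RunningQuery('123.33', 'e0a69cb8', 'SELECT ...', 'PUBLIC', 22, 10.0)
print(row.duration_ms(now=12.5))  # 2500
```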



Mon, Nov 19, 2018 at 19:17, Vladimir Ozerov :

> Hi Yuriy,
>
> I think we can use MANAGEMENT_POOL for this. It is already used for some
> internal Ignite tasks, and it appears to be a good candidate to process
> cancel requests.
>
> But there are several things which are not clear enough for me at the
> moment:
> 1) How user is going to get the list of running queries in the first place?
> Do we already have any SQL commands/views to get this information?
> 2) We need to ensure that KILL command will be processed properly by all
> kinds of SQL queries - SELECT/DML/DDL, non-transactional or transactional,
> local queries and distributed queries. Will we be able to support all these
> modes?
>
> Vladimir.
>
> On Mon, Nov 19, 2018 at 6:37 PM Юрий  wrote:
>
> > Hi Igniters,
> >
> > Earlier we agreed about syntax KILL QUERY '[node_order].[query_counter]',
> > e.g. KILL QUERY '25.123' for single query  or KILL QUERY '25.*' for all
> > queries on the node. Which is part of IEP-29
> > <
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > >
> > .
> >
> > Now I want to discuss internal realization of KILL query feature.
> >
> > My current vision is following:
> > After parsing, Ignite create KILL query command with two parameters:
> > nodeOrderId, nodeQryId. To determine that need to kill all queries on a
> > node we can use negative value of query id, due to qry id always have
> > positive values.
> > The command process at IgniteH2Indexing as native command.
> > By nodeOrderId we find node which initial for the query and send to the
> > node new GridQueryKillRequest with nodeQryId to TOPIC_QUERY with not
> QUERY
> > POOL executor.
> > At GridReduceQueryExecutor we add support of processing new
> > GridQueryKillRequest
> > which just run already exists cancelQueries method with given qryId or
> with
> > all qryIds which currently running at the node in case at initial KILL
> > QUERY parameters used star symbol.
> >
> > I have a doubt which of thread pool we should use to process
> > GridQueryKillRequest.
> > My opinion it shouldn't be QUERY pool, due to the pool can be fully used
> by
> > executing queries, it such case we can't cancel query immediately. May we
> > use one of already existed pool or create new one? Or may be I'm mistaken
> > and it should use QUERY pool.
> >
> > What do you think about proposed plan of implementation?
> >
> > And please give comments about which of thread pool will be better to use
> > for kill query requests. It's small, but really important part of the
> > realization.
> >
> >
> > Thanks.
> >
> >
> > --
> > Живи с улыбкой! :D
> >
>


-- 
Live with a smile! :D


Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

2018-11-19 Thread Юрий
Hi Denis,

It's not a problem: the full query id can be obtained from an additional
column of the *running_queries* view. So you don't need to know the real
meaning of each part of the string in order to use it.
Does that work?

Mon, Nov 19, 2018 at 19:02, Denis Mekhanikov :

> Guys,
>
> Syntax like *KILL QUERY '25.1234'* look a bit cryptic to me.
> I'm going to look up in documentation, which parameter goes first in this
> query every time I use it.
> I like the syntax, that Igor suggested more.
> Will it be better if we make *nodeId* and *queryId *named properties?
>
> Something like this:
> KILL QUERY WHERE nodeId=25 and queryId=1234
>
> Denis
>
> пт, 16 нояб. 2018 г. в 14:12, Юрий :
>
> > I fully agree with last sentences and can start to implement this part.
> >
> > Guys, thanks for your productive participate at discussion.
> >
> > пт, 16 нояб. 2018 г. в 2:53, Denis Magda :
> >
> > > Vladimir,
> > >
> > > Thanks, make perfect sense to me.
> > >
> > >
> > > On Thu, Nov 15, 2018 at 12:18 AM Vladimir Ozerov  >
> > > wrote:
> > >
> > > > Denis,
> > > >
> > > > The idea is that QueryDetailMetrics will be exposed through separate
> > > > "historical" SQL view in addition to current API. So we are on the
> same
> > > > page here.
> > > >
> > > > As far as query ID I do not see any easy way to operate on a single
> > > integer
> > > > value (even 64bit). This is distributed system - we do not want to
> have
> > > > coordination between nodes to get query ID. And coordination is the
> > only
> > > > possible way to get sexy "long". Instead, I would propose to form ID
> > from
> > > > node order and query counter within node. This will be (int, long)
> > pair.
> > > > For use convenience we may convert it to a single string, e.g.
> > > > "[node_order].[query_counter]". Then the syntax would be:
> > > >
> > > > KILL QUERY '25.1234'; // Kill query 1234 on node 25
> > > > KILL QUERY '25.*; // Kill all queries on the node 25
> > > >
> > > > Makes sense?
> > > >
> > > > Vladimir.
> > > >
> > > > On Wed, Nov 14, 2018 at 1:25 PM Denis Magda 
> wrote:
> > > >
> > > > > Yury,
> > > > >
> > > > > As I understand you mean that the view should contains both running
> > and
> > > > > > finished queries. If be honest for the view I was going to use
> just
> > > > > queries
> > > > > > running right now. For finished queries I thought about another
> > view
> > > > with
> > > > > > another set of fields which should include I/O related ones. Is
> it
> > > > works?
> > > > >
> > > > >
> > > > > Got you, so if only running queries are there then your initial
> > > proposal
> > > > > makes total sense. Not sure we need a view of the finished queries.
> > It
> > > > will
> > > > > be possible to analyze them through the updated DetailedMetrics
> > > approach,
> > > > > won't it?
> > > > >
> > > > > For "KILL QUERY node_id query_id"  node_id required as part of
> unique
> > > key
> > > > > > of query and help understand Ignite which node start the
> > distributed
> > > > > query.
> > > > > > Use both parameters will allow cheap generate unique key across
> all
> > > > > nodes.
> > > > > > Node which started a query can cancel it on all nodes participate
> > > > nodes.
> > > > > > So, to stop any queries initially we need just send the cancel
> > > request
> > > > to
> > > > > > node who started the query. This mechanism is already in Ignite.
> > > > >
> > > > >
> > > > > Can we locate node_id behind the scenes if the user supplies
> query_id
> > > > only?
> > > > > A query record in the view already contains query_id and node_id
> and
> > it
> > > > > sounds like an extra work for the user to fill in all the details
> for
> > > us.
> > > > > Embed node_id into query_id if you'd like to avoid extra network
> hops
> > > for
> > > > > query_id to node_id mapping.
> > > >

proposed realization KILL QUERY command

2018-11-19 Thread Юрий
Hi Igniters,

Earlier we agreed on the syntax KILL QUERY '[node_order].[query_counter]',
e.g. KILL QUERY '25.123' for a single query, or KILL QUERY '25.*' for all
queries on a node, which is part of IEP-29
<https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring>.

Now I want to discuss the internal implementation of the KILL QUERY feature.

My current vision is the following:
After parsing, Ignite creates a KILL query command with two parameters:
nodeOrderId and nodeQryId. To indicate that all queries on a node should be
killed, we can use a negative query id, since query ids always have
positive values.
The command is processed in IgniteH2Indexing as a native command.
By nodeOrderId we find the node that initiated the query and send that node
a new GridQueryKillRequest with nodeQryId to TOPIC_QUERY, using an executor
other than the QUERY pool.
In GridReduceQueryExecutor we add support for processing the new
GridQueryKillRequest, which simply runs the already existing cancelQueries
method with the given qryId, or with all qryIds currently running on the
node if the star symbol was used in the original KILL QUERY parameters.
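For illustration only, the negative-id convention could work roughly like
this (a Python sketch with made-up names and an in-memory dict standing in
for the node's query registry; the real code would call cancelQueries):

```python
# Hypothetical sketch: a kill request carries (nodeOrderId, nodeQryId); a
# negative nodeQryId stands for the '*' form and cancels every query on
# the node. Real query ids are always positive.
running = {25: {123: 'SELECT ...', 124: 'UPDATE ...'}}

def process_kill_request(node_order: int, node_qry_id: int) -> list[int]:
    node_queries = running.get(node_order, {})
    if node_qry_id < 0:                  # KILL QUERY '25.*'
        targets = list(node_queries)
    elif node_qry_id in node_queries:    # KILL QUERY '25.123'
        targets = [node_qry_id]
    else:
        targets = []
    for qry_id in targets:
        node_queries.pop(qry_id)         # stands in for cancelQueries(qryId)
    return targets

print(process_kill_request(25, 123))  # [123]
print(process_kill_request(25, -1))   # [124]
```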

I have doubts about which thread pool we should use to process
GridQueryKillRequest.
In my opinion it shouldn't be the QUERY pool, because that pool can be
fully occupied by executing queries, in which case we couldn't cancel a
query immediately. Should we use one of the already existing pools or
create a new one? Or maybe I'm mistaken and it should use the QUERY pool.

What do you think about the proposed implementation plan?

And please comment on which thread pool would be better to use for kill
query requests. It's a small but really important part of the
implementation.


Thanks.


-- 
Live with a smile! :D


Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

2018-11-16 Thread Юрий
I fully agree with the last sentences and can start implementing this part.

Guys, thanks for your productive participation in the discussion.

Fri, Nov 16, 2018 at 2:53, Denis Magda :

> Vladimir,
>
> Thanks, make perfect sense to me.
>
>
> On Thu, Nov 15, 2018 at 12:18 AM Vladimir Ozerov 
> wrote:
>
> > Denis,
> >
> > The idea is that QueryDetailMetrics will be exposed through separate
> > "historical" SQL view in addition to current API. So we are on the same
> > page here.
> >
> > As far as query ID I do not see any easy way to operate on a single
> integer
> > value (even 64bit). This is distributed system - we do not want to have
> > coordination between nodes to get query ID. And coordination is the only
> > possible way to get sexy "long". Instead, I would propose to form ID from
> > node order and query counter within node. This will be (int, long) pair.
> > For use convenience we may convert it to a single string, e.g.
> > "[node_order].[query_counter]". Then the syntax would be:
> >
> > KILL QUERY '25.1234'; // Kill query 1234 on node 25
> > KILL QUERY '25.*; // Kill all queries on the node 25
> >
> > Makes sense?
> >
> > Vladimir.
> >
> > On Wed, Nov 14, 2018 at 1:25 PM Denis Magda  wrote:
> >
> > > Yury,
> > >
> > > As I understand you mean that the view should contains both running and
> > > > finished queries. If be honest for the view I was going to use just
> > > queries
> > > > running right now. For finished queries I thought about another view
> > with
> > > > another set of fields which should include I/O related ones. Is it
> > works?
> > >
> > >
> > > Got you, so if only running queries are there then your initial
> proposal
> > > makes total sense. Not sure we need a view of the finished queries. It
> > will
> > > be possible to analyze them through the updated DetailedMetrics
> approach,
> > > won't it?
> > >
> > > For "KILL QUERY node_id query_id"  node_id required as part of unique
> key
> > > > of query and help understand Ignite which node start the distributed
> > > query.
> > > > Use both parameters will allow cheap generate unique key across all
> > > nodes.
> > > > Node which started a query can cancel it on all nodes participate
> > nodes.
> > > > So, to stop any queries initially we need just send the cancel
> request
> > to
> > > > node who started the query. This mechanism is already in Ignite.
> > >
> > >
> > > Can we locate node_id behind the scenes if the user supplies query_id
> > only?
> > > A query record in the view already contains query_id and node_id and it
> > > sounds like an extra work for the user to fill in all the details for
> us.
> > > Embed node_id into query_id if you'd like to avoid extra network hops
> for
> > > query_id to node_id mapping.
> > >
> > > --
> > > Denis
> > >
> > > On Wed, Nov 14, 2018 at 1:04 AM Юрий 
> > wrote:
> > >
> > > > Denis,
> > > >
> > > > Under the hood 'time' will be as startTime, but for system view I
> > planned
> > > > use duration which will be simple calculated as now - startTime. So,
> > > there
> > > > is't a performance issue.
> > > > As I understand you mean that the view should contains both running
> and
> > > > finished queries. If be honest for the view I was going to use just
> > > queries
> > > > running right now. For finished queries I thought about another view
> > with
> > > > another set of fields which should include I/O related ones. Is it
> > works?
> > > >
> > > > For "KILL QUERY node_id query_id"  node_id required as part of unique
> > key
> > > > of query and help understand Ignite which node start the distributed
> > > query.
> > > > Use both parameters will allow cheap generate unique key across all
> > > nodes.
> > > > Node which started a query can cancel it on all nodes participate
> > nodes.
> > > > So, to stop any queries initially we need just send the cancel
> request
> > to
> > > > node who started the query. This mechanism is already in Ignite.
> > > >
> > > > Native SQL APIs will automatically support the futures after
> > implementing
> > > &

Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

2018-11-13 Thread Юрий
Igor,

I think we shouldn't mix management and select syntax. Potentially it can
be dangerous. e.g. your example you don't know set of queries which will be
cancelled. Also I have not seen such approach in other databases.

Yes, the syntax should work from SQL API also.

Tue, Nov 13, 2018 at 14:20, Igor Sapego :

> Yuriy,
>
> Would not it be more convenient fro user to write a request in a free
> form, like
> KILL QUERY WHERE ...
>
> For example,
> KILL QUERY WHERE duration > 15000
>
> Or is it going to be too hard to implement?
>
> Also, is this syntax going to work only from thin clients, or if it just
> designed for them, but will also be usable from basic SQL API?
>
> Best Regards,
> Igor
>
>
> On Tue, Nov 13, 2018 at 12:15 PM Юрий  wrote:
>
> > Igniters,
> >
> > Some comments for my original email's.
> >
> > The proposal related to part of IEP-29
> > <
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > >
> > .
> >
> > What purpose are we pursuing of the proposal?
> > We want to be able check which queries running right now through thin
> > clients. Get some information related to the queries and be able to
> cancel
> > a query if it required for some reasons.
> > So, we need interface to get a running queries. For the goal we propose
> > running_queries system view. The view contains unique query identifier
> > which need to pass to kill query command to cancel the query.
> >
> > What do you think about fields of the running queries view? May be some
> > useful fields we could easy add to the view.
> >
> > Also let's discuss syntax of cancellation of query. I propose to use
> MySQL
> > like syntax as easy to understand and shorter then Oracle and Postgres
> > syntax ( detailed information in IEP-29
> > <
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > >
> > ).
> >
> >
> >
> > пн, 12 нояб. 2018 г. в 19:28, Юрий :
> >
> > > Igniters,
> > >
> > > Below is a proposed design for thin client SQL management and
> monitoring
> > > to cancel a queries.
> > >
> > > 1) Ignite expose system SQL view with name *running_queries*
> > > proposed columns: *node_id, query_id, sql, schema_name, connection_id,
> > > duration*.
> > >
> > > node_id - initial node of request
> > > query_id - unique id of query on node
> > > sql - text of query
> > > schema name - name of sql schema
> > > connection_id - id of client connection from
> > ClientListenerConnectionContext
> > > class
> > > duration - duration in millisecond from start of query
> > >
> > >
> > > Ignite will gather info about running queries from each of nodes and
> > > collect it during user query. We already have most of the information
> at
> > GridRunningQueryInfo
> > > on each of nodes.
> > >
> > > Instead of duration we can use start_time, but I think duration will be
> > > simple to use due to it not depend on a timezone.
> > >
> > >
> > > 2) Propose to use following syntax to kill a running query:
> > >
> > > *KILL QUERY node_Id query_id*
> > >
> > >
> > > Both parameters node_id and query_id can be get through running_queries
> > > system view.
> > >
> > > When a node receive such request it can be run locally in case node
> have
> > > given node_id or send message to node with given id. Because node have
> > > information about local running queries then can cancel it - it already
> > > implemented in GridReduceQueryExecutor.cancelQueries(qryId) method.
> > >
> > > Comments are welcome.
> > > --
> > > Живи с улыбкой! :D
> > >
> >
> >
> > --
> > Живи с улыбкой! :D
> >
>


-- 
Live with a smile! :D


Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

2018-11-13 Thread Юрий
Denis,

Under the hood the 'time' will be stored as startTime, but for the system
view I planned to use duration, which is simply calculated as now -
startTime. So there isn't a performance issue.
As I understand it, you mean that the view should contain both running and
finished queries. To be honest, for this view I was going to include just
the queries running right now. For finished queries I thought about another
view with another set of fields, which should include the I/O related ones.
Does that work?

For "KILL QUERY node_id query_id", node_id is required as part of the
unique key of a query and helps Ignite understand which node started the
distributed query. Using both parameters allows cheaply generating a unique
key across all nodes. The node which started a query can cancel it on all
participating nodes. So, to stop any query we just need to send the cancel
request to the node that started the query. This mechanism already exists
in Ignite.

Native SQL APIs will automatically support these features once they are
implemented for thin clients. So we are good here.
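The "cheap unique key across all nodes" idea can be sketched as follows
(illustrative Python with hypothetical names, not Ignite code): each node
pairs its cluster-wide node order with a local counter, so no cross-node
coordination is needed to mint an id:

```python
import itertools

# Hypothetical sketch: globally unique query ids without cross-node
# coordination, formed from the node order plus a per-node counter.
class QueryIdGenerator:
    def __init__(self, node_order: int):
        self.node_order = node_order
        self._counter = itertools.count(1)  # local, monotonically increasing

    def next_id(self) -> str:
        return f'{self.node_order}.{next(self._counter)}'

gen = QueryIdGenerator(25)
print(gen.next_id())  # 25.1
print(gen.next_id())  # 25.2
```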



Tue, Nov 13, 2018 at 18:52, Denis Magda :

> Yury,
>
> Please consider the following:
>
>- If we record the duration instead of startTime, then the former has to
>be updated frequently - sounds like a performance red flag. Should we
> store
>startTime and endTime instead? This way a query record will be updated
>twice - when the query is started and terminated.
>- In the IEP you've mentioned I/O related fields that should help to
>grasp why a query runs that slow. Should they be stored in this view?
>- "KILL QUERY query_id" is more than enough. Let's not add "node_id"
>unless it's absolutely required. Our queries are distributed and
> executed
>across several nodes that's why the node_id parameter is redundant.
>- This API needs to be supported across all our interfaces. We can start
>with JDBC/ODBC and thin clients and then support for the native SQL APIs
>(Java, Net, C++)
>- Please share examples of SELECTs in the IEP that would show how to
>    find long running queries, queries that cause a lot of I/O troubles.
>
> --
> Denis
>
> On Tue, Nov 13, 2018 at 1:15 AM Юрий  wrote:
>
> > Igniters,
> >
> > Some comments for my original email's.
> >
> > The proposal related to part of IEP-29
> > <
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > >
> > .
> >
> > What purpose are we pursuing of the proposal?
> > We want to be able check which queries running right now through thin
> > clients. Get some information related to the queries and be able to
> cancel
> > a query if it required for some reasons.
> > So, we need interface to get a running queries. For the goal we propose
> > running_queries system view. The view contains unique query identifier
> > which need to pass to kill query command to cancel the query.
> >
> > What do you think about fields of the running queries view? May be some
> > useful fields we could easy add to the view.
> >
> > Also let's discuss syntax of cancellation of query. I propose to use
> MySQL
> > like syntax as easy to understand and shorter then Oracle and Postgres
> > syntax ( detailed information in IEP-29
> > <
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > >
> > ).
> >
> >
> >
> > пн, 12 нояб. 2018 г. в 19:28, Юрий :
> >
> > > Igniters,
> > >
> > > Below is a proposed design for thin client SQL management and
> monitoring
> > > to cancel a queries.
> > >
> > > 1) Ignite expose system SQL view with name *running_queries*
> > > proposed columns: *node_id, query_id, sql, schema_name, connection_id,
> > > duration*.
> > >
> > > node_id - initial node of request
> > > query_id - unique id of query on node
> > > sql - text of query
> > > schema name - name of sql schema
> > > connection_id - id of client connection from
> > ClientListenerConnectionContext
> > > class
> > > duration - duration in millisecond from start of query
> > >
> > >
> > > Ignite will gather info about running queries from each of nodes and
> > > collect it during user query. We already have most of the information
> at
> > GridRunningQueryInfo
> > > on each of nodes.
> > >
> > > Instead of duration we can use start_time, but I think duration will be
> > > simple to use due to it not depend on a timezone.
> > >
> > >
> 

Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

2018-11-13 Thread Юрий
Igniters,

Some comments on my original email.

The proposal related to part of IEP-29
<https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring>
.

What purpose are we pursuing with this proposal?
We want to be able to check which queries are running right now through
thin clients, get some information related to those queries, and be able
to cancel a query if required for some reason.
So, we need an interface to get the running queries. For this goal we
propose the running_queries system view. The view contains a unique query
identifier which needs to be passed to the KILL QUERY command to cancel
the query.

What do you think about the fields of the running_queries view? Maybe there
are some useful fields we could easily add to the view.

Also, let's discuss the syntax for cancelling a query. I propose to use a
MySQL-like syntax, as it is easy to understand and shorter than the Oracle
and Postgres syntax (detailed information in IEP-29
<https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring>
).
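To make the discussion concrete, here is a minimal sketch of how a client-side
tool could build the proposed MySQL-like cancellation statement from a row of
the proposed running_queries view. All class and field names below are
illustrative assumptions, not an existing Ignite API; only the KILL QUERY
syntax itself comes from the proposal:

```java
import java.util.UUID;

// Illustrative sketch only: models one row of the proposed running_queries
// system view and builds the proposed "KILL QUERY node_id query_id" statement.
public class KillQuerySketch {
    // Hypothetical holder for a row of the proposed system view.
    static final class RunningQueryRow {
        final UUID nodeId;   // initial node of the request
        final long queryId;  // unique id of the query on that node
        final String sql;    // query text

        RunningQueryRow(UUID nodeId, long queryId, String sql) {
            this.nodeId = nodeId;
            this.queryId = queryId;
            this.sql = sql;
        }
    }

    // Builds the cancellation statement in the proposed MySQL-like syntax.
    static String killStatement(RunningQueryRow row) {
        return "KILL QUERY " + row.nodeId + " " + row.queryId;
    }

    public static void main(String[] args) {
        UUID nodeId = UUID.fromString("6fa749a0-7f58-11e8-9b0e-0800200c9a66");
        RunningQueryRow row = new RunningQueryRow(nodeId, 42, "SELECT * FROM t");
        System.out.println(killStatement(row));
    }
}
```

Both identifiers come straight from the view row, so a user can copy them from
a SELECT over running_queries into the KILL QUERY statement.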



Mon, Nov 12, 2018 at 19:28, Юрий :

> Igniters,
>
> Below is a proposed design for thin client SQL management and monitoring
> to cancel queries.
>
> 1) Ignite expose system SQL view with name *running_queries*
> proposed columns: *node_id, query_id, sql, schema_name, connection_id,
> duration*.
>
> node_id - initial node of the request
> query_id - unique id of the query on the node
> sql - text of the query
> schema_name - name of the SQL schema
> connection_id - id of the client connection from ClientListenerConnectionContext
> class
> duration - duration in milliseconds since the start of the query
>
>
> Ignite will gather info about running queries from each of the nodes and
> collect it during the user query. We already have most of the information in
> GridRunningQueryInfo
> on each of the nodes.
>
> Instead of duration we could use start_time, but I think duration will be
> simpler to use, since it does not depend on a time zone.
>
>
> 2) I propose to use the following syntax to kill a running query:
>
> *KILL QUERY node_id query_id*
>
>
> Both parameters, node_id and query_id, can be obtained through the
> running_queries system view.
>
> When a node receives such a request, it either runs it locally (if it has
> the given node_id) or sends a message to the node with the given id. Since a
> node has information about its local running queries, it can cancel them;
> this is already implemented in the GridReduceQueryExecutor.cancelQueries(qryId) method.
>
> Comments are welcome.
> --
> Live with a smile! :D
>


-- 
Live with a smile! :D


proposed design for thin client SQL management and monitoring (view running queries and kill it)

2018-11-12 Thread Юрий
Igniters,

Below is a proposed design for thin client SQL management and monitoring to
cancel queries.

1) Ignite expose system SQL view with name *running_queries*
proposed columns: *node_id, query_id, sql, schema_name, connection_id,
duration*.

node_id - initial node of the request
query_id - unique id of the query on the node
sql - text of the query
schema_name - name of the SQL schema
connection_id - id of the client connection from the ClientListenerConnectionContext
class
duration - duration in milliseconds since the start of the query


Ignite will gather info about running queries from each of the nodes and
collect it during the user query. We already have most of the information
in GridRunningQueryInfo
on each of the nodes.

Instead of duration we could use start_time, but I think duration will be
simpler to use, since it does not depend on a time zone.
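As a small illustration of the time-zone point (purely a sketch;
DurationVsStartTime is not an Ignite class): a duration column is a single
elapsed-milliseconds number that every client interprets identically, while a
start_time rendered as a local timestamp depends on the client's zone:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Illustrative sketch: duration is zone-independent, a rendered start
// timestamp is not. The millisecond values are arbitrary example data.
public class DurationVsStartTime {
    public static void main(String[] args) {
        long startMillis = 1541929680000L;     // server-side query start (epoch ms)
        long nowMillis = startMillis + 2500;   // 2.5 seconds later

        // duration: one number, identical for every client
        System.out.println("duration=" + (nowMillis - startMillis));

        // start_time: the same instant renders differently per zone
        Instant start = Instant.ofEpochMilli(startMillis);
        System.out.println(ZonedDateTime.ofInstant(start, ZoneId.of("UTC")));
        System.out.println(ZonedDateTime.ofInstant(start, ZoneId.of("Europe/Moscow")));
    }
}
```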


2) I propose to use the following syntax to kill a running query:

*KILL QUERY node_id query_id*


Both parameters, node_id and query_id, can be obtained through the
running_queries system view.

When a node receives such a request, it either runs it locally (if it has
the given node_id) or sends a message to the node with the given id. Since a
node has information about its local running queries, it can cancel them;
this is already implemented in the GridReduceQueryExecutor.cancelQueries(qryId) method.
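The routing just described could be modeled roughly as follows. This is an
illustrative simulation only: the cluster map stands in for real node
messaging, and cancelLocally merely imitates what
GridReduceQueryExecutor.cancelQueries(qryId) would do on the target node:

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative model of KILL QUERY routing: run locally when the target
// node id matches this node, otherwise forward the request to that node.
public class KillRoutingSketch {
    final UUID localNodeId;
    final Set<Long> localQueries = ConcurrentHashMap.newKeySet();
    final Map<UUID, KillRoutingSketch> cluster; // stands in for messaging

    KillRoutingSketch(UUID localNodeId, Map<UUID, KillRoutingSketch> cluster) {
        this.localNodeId = localNodeId;
        this.cluster = cluster;
        cluster.put(localNodeId, this);
    }

    // Imitates GridReduceQueryExecutor.cancelQueries(qryId) on this node.
    boolean cancelLocally(long qryId) {
        return localQueries.remove(qryId);
    }

    // Entry point for "KILL QUERY nodeId qryId".
    boolean kill(UUID nodeId, long qryId) {
        if (localNodeId.equals(nodeId))
            return cancelLocally(qryId);

        KillRoutingSketch target = cluster.get(nodeId); // "send message"
        return target != null && target.cancelLocally(qryId);
    }

    public static void main(String[] args) {
        Map<UUID, KillRoutingSketch> cluster = new ConcurrentHashMap<>();
        KillRoutingSketch a = new KillRoutingSketch(UUID.randomUUID(), cluster);
        KillRoutingSketch b = new KillRoutingSketch(UUID.randomUUID(), cluster);
        b.localQueries.add(42L);

        // The request arrives at node A but targets a query on node B.
        System.out.println("cancelled=" + a.kill(b.localNodeId, 42L));
        System.out.println("stillRunning=" + b.localQueries.contains(42L));
    }
}
```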

Comments are welcome.
-- 
Live with a smile! :D


Re: SQL management and monitoring improvements

2018-11-12 Thread Юрий
Hi Alex,

Thanks for the ideas! They will be useful.
However, from my side, the Oracle-like syntax looks excessive. It would be
good if someone else shared their opinion about the syntax.

Igniters, which syntax for Ignite management via SQL would you prefer?


Fri, Nov 9, 2018 at 0:37, Alex Plehanov :

> Yuri,
>
> I think it will be useful if, in the future, we extend the management-via-SQL
> tool to cover not only queries but also other parts of Ignite (for example:
> canceling tasks, killing transactions, activation/deactivation,
> baseline topology changes, etc.).
>
> Maybe, in this case, we should use a common prefix for such management
> commands (like Oracle does: "ALTER SYSTEM ...")?
>
> Something like:
> ALTER GRID KILL QUERY ... [ON NODE ...]
> ALTER GRID KILL TRANSACTION ... [ON NODE ...]
> ALTER GRID ACTIVATE
> etc.
>
> What do you think about it?
>
>
> Thu, Nov 8, 2018 at 0:06, Denis Magda :
>
> > Yuri,
> >
> > That's an excellent idea, thank you for driving it.
> >
> > What is not explained is how to leverage all those stats. For
> instance,
> > how can I know the total number of SELECTs, or the total number of SELECTs
> > that happened yesterday for specific data sets. I do believe that it's feasible
> > and just not covered.
> >
> > --
> > Denis
> >
> >
> >
> > On Wed, Nov 7, 2018 at 2:01 AM Юрий  wrote:
> >
> > > Hi Igniters!
> > >
> > > I think we can improve Ignite management and monitoring instruments
> > related
> > > to SQL.
> > > I've prepared draft of IEP-29: SQL management and monitoring
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > > >
> > > .
> > >
> > > What do you think about it? May be do you have some additional
> > suggestions?
> > >
> > >
> > > --
> > > Live with a smile! :D
> > >
> >
>


-- 
Live with a smile! :D


Re: SQL management and monitoring improvements

2018-11-08 Thread Юрий
Denis,

I mentioned it, though maybe it's not so clear now. I'll try to add more
explanation about it to the IEP. There should be some SQL system views and JMX
interfaces.
Deeper details will be covered in the tasks during a more detailed analysis.

Thanks for your feedback.


Thu, Nov 8, 2018 at 0:06, Denis Magda :

> Yuri,
>
> That's an excellent idea, thank you for driving it.
>
> What is not explained is how to leverage all those stats. For instance,
> how can I know the total number of SELECTs, or the total number of SELECTs
> that happened yesterday for specific data sets. I do believe that it's feasible
> and just not covered.
>
> --
> Denis
>
>
>
> On Wed, Nov 7, 2018 at 2:01 AM Юрий  wrote:
>
> > Hi Igniters!
> >
> > I think we can improve Ignite management and monitoring instruments
> related
> > to SQL.
> > I've prepared draft of IEP-29: SQL management and monitoring
> > <
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> > >
> > .
> >
> > What do you think about it? May be do you have some additional
> suggestions?
> >
> >
> > --
> > Live with a smile! :D
> >
>


-- 
Live with a smile! :D


SQL management and monitoring improvements

2018-11-07 Thread Юрий
Hi Igniters!

I think we can improve the Ignite management and monitoring instruments
related to SQL.
I've prepared a draft of IEP-29: SQL management and monitoring
<https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring>
.

What do you think about it? Maybe you have some additional suggestions?


-- 
Live with a smile! :D


Re: Page IO statistics for Ignite

2018-10-02 Thread Юрий
Hi,

Thank you for participating here.
I've prepared a draft of IEP-27
<https://cwiki.apache.org/confluence/display/IGNITE/IEP-27%3A+Page+IO+statistics>,
which aggregates the information from this thread. Please take a look and give
your feedback.

Also, there are some open questions:
1) We can simply gather metrics on each node. But we also need cluster-wide
metrics. How can that be achieved? I see that we already share some
metrics through TcpDiscoveryMetricsUpdateMessage
and TcpDiscoveryClientMetricsUpdateMessage. If we add all the IO metrics there,
the messages will grow more than two times in size. Can that lead to
performance degradation?
2) What would be the most convenient way to reset the statistics? In my opinion,
it can be done by time (for example, once per hour) and on request, plus keeping
a history of the statistics. What do you think?

Please find below an example of node statistics which can be simply gathered
on a node. Such statistics can be obtained separately for physical reads/writes
and logical reads:
1. Fine-grained:
{T_PAGE_LIST_NODE=1592, UNKNOWN=0, T_PART_META=865, T_H2_MVCC_REF_LEAF=0,
T_TX_LOG_INNER=0, T_DATA=592, T_DATA_METASTORAGE=0,
T_CACHE_ID_AWARE_PENDING_REF_INNER=0, T_CACHE_ID_DATA_REF_MVCC_LEAF=0,
T_PAGE_LIST_META=365, T_DATA_REF_INNER=0, T_H2_EX_REF_MVCC_INNER=0,
T_PENDING_REF_LEAF=408, T_DATA_REF_METASTORAGE_LEAF=0,
T_DATA_REF_MVCC_INNER=0, T_DATA_REF_METASTORAGE_INNER=0,
T_DATA_REF_MVCC_LEAF=0, T_CACHE_ID_AWARE_DATA_REF_LEAF=0,
T_CACHE_ID_AWARE_PENDING_REF_LEAF=0, T_TX_LOG_LEAF=0, T_DATA_REF_LEAF=1000,
T_H2_EX_REF_MVCC_LEAF=0, T_BPLUS_META=408, T_PAGE_UPDATE_TRACKING=0,
T_H2_REF_LEAF=0, T_METASTORE_INNER=0, T_META=0, T_H2_REF_INNER=0,
T_PART_CNTRS=0, T_H2_EX_REF_INNER=0, T_CACHE_ID_AWARE_DATA_REF_INNER=0,
T_PENDING_REF_INNER=0, T_CACHE_ID_DATA_REF_MVCC_INNER=0,
T_H2_EX_REF_LEAF=0, T_H2_MVCC_REF_INNER=0, T_METASTORE_LEAF=0}
2. Aggregated:
{DATA=592, INDEX=1000, OTHER=3638}
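For illustration, the aggregated numbers can be rolled up from the fine-grained
map. The grouping rule in this sketch is a simplified assumption, not Ignite's
exact page-type classification, and the input below is a reduced example:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: roll fine-grained per-page-type counters up into
// aggregated DATA / INDEX / OTHER groups. The grouping rule here is a
// simplified assumption, not Ignite's real page-type classification.
public class PageStatsAggregator {
    static Map<String, Long> aggregate(Map<String, Long> fineGrained) {
        Map<String, Long> agg = new LinkedHashMap<>();
        agg.put("DATA", 0L);
        agg.put("INDEX", 0L);
        agg.put("OTHER", 0L);

        for (Map.Entry<String, Long> e : fineGrained.entrySet()) {
            String group;
            if (e.getKey().equals("T_DATA"))
                group = "DATA"; // data pages
            else if (e.getKey().endsWith("_LEAF") || e.getKey().endsWith("_INNER"))
                group = "INDEX"; // B+ tree pages (simplified rule)
            else
                group = "OTHER"; // everything else
            agg.merge(group, e.getValue(), Long::sum);
        }
        return agg;
    }

    public static void main(String[] args) {
        Map<String, Long> fine = new LinkedHashMap<>();
        fine.put("T_DATA", 592L);
        fine.put("T_DATA_REF_LEAF", 1000L);
        fine.put("T_PAGE_LIST_NODE", 1592L);
        fine.put("T_PART_META", 865L);
        System.out.println(aggregate(fine));
    }
}
```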


Maybe I missed something important; please share your opinion.


Thanks.


Tue, Sep 25, 2018 at 22:20, Alex Plehanov :

> Hi,
>
> I've made some investigation a couple of months ago about a statistics
> collected by some RDBMS vendors (Oracle, Postgres, MySQL). These databases
> collect detailed IO statistics in dimensions such as queries, database
> objects (tables and indexes), files, sessions, users, event types etc.
>
> Some views where you can get IO statistics:
> Oracle: v$filestat, v$segment_statistics, v$sqlarea, v$sysstat, v$sesstat
> Postgres: pg_stat_database, pg_statio_all_tables, pg_statio_all_indexes,
> pg_statio_all_sequences
> MySQL: table_io_waits_summary_by_table,
> table_io_waits_summary_by_index_usage, io_global_by_file_by_bytes,
> io_global_by_wait_by_bytes, file_summary_by_event_name,
> file_summary_by_instance, host_summary_by_file_io, user_summary_by_file_io,
> metrics, etc.
>
> I think we can start by collecting statistics per FilePageStore (updating
> counters on read(...) and write(...) methods). Each FilePageStore is
> bounded to cache and partition or cache index, so we can easily aggregate
> values and get IO statistics per cache/index/node.
>
> Tue, Sep 25, 2018 at 10:57, Vladimir Ozerov :
>
>> Hi Yuriy,
>>
>> I think this is a great idea. But we need to collect more details on how and
>> what to collect. I think one of the most interesting parts for us would be
>> index and data page usages, split by different "dimensions":
>> 1) Global node statistics
>> 2) Per-cache statistics
>> 3) Per-index statistics
>>
>> We can start with a short summary of what is collected by other database
>> vendors.
>>
>> Vladimir.
>>
>>
>> On Sat, Sep 22, 2018 at 1:07 AM Denis Magda  wrote:
>>
>> > Hello Yuri,
>> >
>> > I might give useful feedback if see how the metrics will look like from
>> the
>> > API standpoint. If it's not difficult please create a draft.
>> >
>> > AS for the interface, in addition to JMX and SQL we need to ensure Visor
>> > CMD and Web Console gets updated. *Alex K.*, please join the thread and
>> > share your requirements.
>> >
>> > --
>> > Denis
>> >
>> > On Fri, Sep 21, 2018 at 8:16 AM Юрий 
>> wrote:
>> >
>> > > Hi Igniters,
>> > >
>> > > I started IGNITE-8580
>> > > <https://issues.apache.org/jira/browse/IGNITE-8580> ticket
>> > > related to print page read/write metrics and did some investigation
>> what
>> > > other databases provide for the similar purposes.
>> > >
>> > > Based on the investigation I want to propose my raw vision of how to
>> > IGNITE
>> > > can be more transparent from performance perspective.
>> > >
>> &

Re: ML examples wrap logic in IgniteThread. Why?

2018-09-26 Thread Юрий Бабак
Denis,

Thanks for this notice; actually, this is some kind of atavism. Running this
code inside an IgniteThread was a requirement when we had distributed
matrices. But now all our algorithms build over distributed datasets and
we don't need it anymore.

I created a JIRA ticket 
for this.

Thanks,
Yuriy

Thu, Sep 27, 2018 at 0:20, Denis Magda :

> Yury, ML folks,
>
> I've noticed a strange thing. It looks like every example we have wraps up
> its logic in the following block
>
> IgniteThread igniteThread = new
> IgniteThread(ignite.configuration().getIgniteInstanceName(),
> KMeansClusterizationExample.class.getSimpleName(), () -> {
>
>
> //ML specific stuff (training, predicting, calculations, etc.)
>
> });
>
> igniteThread.start();
> igniteThread.join();
>
>
> Why do we do that?
>
> Denis
>


[ML] New features and improvement of ML module for 2.7 release

2018-09-26 Thread Юрий Бабак
Hello Igniters,

I want to give an overview of all the features and major improvements of the
ML module for this release.

So let me start with one of our main features for this release:

*TensorFlow integration* 

This integration allows us to use Apache Ignite as a data source for
TensorFlow. Also, this integration will allow creating and maintaining
TensorFlow clusters over Apache Ignite and submitting TF jobs to those
clusters. More details are in the related umbrella ticket.

Also, for this release we have some new algorithms:

* Random forest  
* Gradient boosted trees 
* Logistic regression[binary
][multi-class
]
* ANN 

New features related with data preprocessing:

* Pipeline 
* L1,L2 normalization 
* Data filtering for new datasets

* Encoding categorical features [OneHotEncoder
][OneOfKEncoder
]
* Imputer and Binarizer 
* MaxAbsScaler 
* Dataset splitting 

New features for model validation:

* Model estimator 
* k-fold cross-validation

* Param grid for tuning hyper-parameters in cross-validation


Other features and improvements:

* Model updating 
* ML tutorial 
* Optional indexing for decision trees

* Learning context for trainers (local parallelization and logging of the
training process) 
* Unification of API for feature extractor

* Several tickets for removing old unused classes and improving code
coverage and examples [1 
][2 ][3
][4
][5
][6
]

Sincerely,
Yuriy Babak


[ML] TensorFlow intergration module release

2018-09-25 Thread Юрий Бабак
Hello, Igniters.

For release 2.7 we will introduce an integration between TensorFlow and Apache
Ignite. This integration contains changes on the Apache Ignite side and on
the TensorFlow side.

The Apache Ignite part is a command-line tool which allows creating and
maintaining TensorFlow clusters over Apache Ignite and submitting TF jobs to
those clusters.

For TensorFlow we implemented an "ignite dataset". More details are in the related PR
[1]

As the Apache Ignite part is done and the TensorFlow part is ready for the
merge, I suggest adding the "ignite-tensorflow" module to the other Ignite
deliverables, so I've created a ticket in JIRA for this [2]. In that case, we
will be able to release this feature with the Apache Ignite binary release,
including deb/rpm packages.

[1] https://github.com/tensorflow/tensorflow/pull/22210
[2] https://issues.apache.org/jira/browse/IGNITE-9685

Regards,
Yury


Page IO statistics for Ignite

2018-09-21 Thread Юрий
Hi Igniters,

I started the IGNITE-8580
<https://issues.apache.org/jira/browse/IGNITE-8580> ticket,
related to printing page read/write metrics, and did some investigation into
what other databases provide for similar purposes.

Based on the investigation, I want to propose my rough vision of how Ignite
can be made more transparent from a performance perspective.

We need to collect statistics for logical (from memory) and physical (from
storage) page reads/writes. All these metrics should be split by the following
dimensions:
1) index/cache
2) query level
3) node/cluster
...

It seems the statistics should be limited by time.

If we have such statistics, we could do things like:
1) Get IO statistics per SQL query, globally and/or split by
indexes/caches.
2) Be able to understand why performance degrades when it is related
to IO, for example on a concrete node or cache.
3) Evaluate the effectiveness of index usage. Find unused indexes.
4) Keep the TOP queries with aggressive physical reads.



Such statistics could be available at least via JMX and SQL interfaces.
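One possible shape for such per-dimension counters, as a rough sketch only
(the class, dimension keys, and method names are assumptions, not an existing
Ignite API): it records logical (from memory) and physical (from storage)
reads per cache and supports resetting on request:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch of IO statistics split by cache and by read type.
// All names here are assumptions, not an existing Ignite API.
public class IoStatsSketch {
    enum ReadType { LOGICAL, PHYSICAL }

    // One counter per (cache, read type) pair.
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    private static String key(String cache, ReadType type) {
        return cache + "/" + type;
    }

    void onRead(String cache, ReadType type) {
        counters.computeIfAbsent(key(cache, type), k -> new LongAdder()).increment();
    }

    long reads(String cache, ReadType type) {
        LongAdder a = counters.get(key(cache, type));
        return a == null ? 0 : a.sum();
    }

    // Time-limited statistics: reset on request (or on a schedule).
    void reset() {
        counters.clear();
    }

    public static void main(String[] args) {
        IoStatsSketch stats = new IoStatsSketch();
        stats.onRead("orders", ReadType.LOGICAL);
        stats.onRead("orders", ReadType.LOGICAL);
        stats.onRead("orders", ReadType.PHYSICAL);
        System.out.println("logical=" + stats.reads("orders", ReadType.LOGICAL));
        System.out.println("physical=" + stats.reads("orders", ReadType.PHYSICAL));
        stats.reset();
        System.out.println("afterReset=" + stats.reads("orders", ReadType.LOGICAL));
    }
}
```

A JMX bean or SQL system view could then simply expose the current counter
values per cache.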

Let's discuss. If it is interesting to you, I can dig deeper into
this area and prepare an IEP based on our discussion.


Igniters, what do you think?




-- 
Live with a smile! :D


Re: Apache Ignite 2.7 release

2018-08-29 Thread Юрий Бабак
Denis, Nikolay, Igniters,

This is the list of planned ML features for the Apache Ignite 2.7 release:

   - Tensor Flow integration (
   https://issues.apache.org/jira/browse/IGNITE-8670)
   - Data preprocessing (https://issues.apache.org/jira/browse/IGNITE-8662)
   - Model validation (https://issues.apache.org/jira/browse/IGNITE-8665)
   - Random forest algorithm (
   https://issues.apache.org/jira/browse/IGNITE-8840)
   - Gradient boosted trees (
   https://issues.apache.org/jira/browse/IGNITE-7149)
   - ANN algorithm (https://issues.apache.org/jira/browse/IGNITE-9261)
   - ML tutorial (https://issues.apache.org/jira/browse/IGNITE-8741)
   - And other improvements of the ML module, like bugfixes, code cleanup,
   optimizations, etc.

Regards,
Yury


Wed, Aug 29, 2018 at 2:43, Denis Magda :

> Nikolay, Igniters, let me help you with the list.
>
> That what I was tracking on my side (something we can announce). Don't have
> a JIRA ticket for every ticket but CC-ed everyone who claimed to be in
> charge. Nikolay, please work with the community members to add these
> capabilities to the release wiki page. If something doesn't get delivered
> then let's exclude it.
>
> 1. Partition map exchange optimizations. Are we releasing any of them?
> *(Sergey
> Chugunov, Andrey Mashenkov)*
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-25%3A+Partition+Map+Exchange+hangs+resolving
>
> 2. Java 9/10/11 Support. Are we on track to support it better? *(Peter,
> Vladimir)*.
>
> 3. SQL *(Vladimir)*:
>
>- Transactional SQL beta?
>- Basic monitoring facilities (inline index alerts, page reads/writes
>per type)?
>- SQL index update optimizations? (
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-19%3A+SQL+index+update+optimizations
>)
>- ODBC/JDBC session management
>- Result set offload to disk. Looks it doesn't get to the release?
>https://issues.apache.org/jira/browse/IGNITE-7526
>
> 4. JCache 1.1 support. Completed!
>
> 5. Transparent data encryption? What exactly goes in 2.7? *(Nikolay)*.
>
> 6. Ignite + Informatica integration
>
> 7. Ignite and Spring Session integration (heard it was done but the ticket
> is still Open):
> https://issues.apache.org/jira/browse/IGNITE-2741
>
> 8. Service Grid 2.0. What exactly goes in 2.7? (*Vyacheslav)*
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Oil+Change+in+Service+Grid
>
> 9. Ignite Multi Map *(Amir, Anton)*
> https://issues.apache.org/jira/browse/IGNITE-640
>
> 10. Thin Clients:
>
>- Node.JS (https://issues.apache.org/jira/browse/IGNITE-) - *Pavel
>Petroshenko*
>- Python (https://issues.apache.org/jira/browse/IGNITE-7782) - *Pavel
>Petroshenko, **Dmitry Melnichuk*
>- PHP (https://issues.apache.org/jira/browse/IGNITE-7783) - *Pavel P.,
>Ekaterina*
>- C++: *Igor S.*
>- Affinity awareness for thin clients (*Igor S.*):
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+thin+clients
>
> 11. Machine and Deep Learning (preprocessing APIs, TensorFlow integration,
> extra algorithms) - *Yuri and our ML experts*
>
>
>
> On Tue, Aug 28, 2018 at 10:17 AM Dmitriy Setrakyan 
> wrote:
>
> > Hi Nikolai,
> >
> > Generally looks OK, however, It is hard to comment on your schedule
> without
> > seeing a full list of all must-have features we plan to add to this
> > release. I am hoping that the community will see this list at some point.
> >
> > D.
> >
> > On Tue, Aug 28, 2018 at 8:23 AM, Nikolay Izhikov 
> > wrote:
> >
> > > Hello, Igniters.
> > >
> > > I think we should discuss the release schedule.
> > >
> > > Current dates are following:
> > >
> > > * Code Freeze: September 30, 2018
> > > * Voting Date: October 1, 2018
> > > * Release Date: October 15, 2018
> > >
> > > We discussed it privately with Vladimir Ozerov.
> > >
> > > Is seems better to reschedule a bit:
> > >
> > > * Scope freeze - September 17 - We should have a full list of
> > > tickets for 2.7 here.
> > > * Code freeze - October 01 - We should merge all 2.7 tickets to
> > > master here.
> > > * Vote - October 08.
> > >
> > > What do you think?
> > >
> > >
> > > On Sat, 25/08/2018 at 00:57 +0300, Dmitriy Pavlov wrote:
> > > > I hope Vyacheslav can comment better than me. I suppose it is, more
> or
> > > > less, rectifications and clarifications of design aspects. Not
> overall
> > > > redesign.
> > > >
> > > > I also hope Igniters, especially Services experts, will join the
> > > discussion
> > > > in the separate topic. Now after a couple of days there is no
> reaction.
> > > >
> > > > Sat, Aug 25, 2018 at 0:53, Dmitriy Setrakyan >:
> > > >
> > > > > On Fri, Aug 24, 2018 at 2:50 PM, Dmitriy Pavlov <
> > dpavlov@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Dmitriy, I suppose it highly depends on how fast community
> will
> > > come
> > > > >
> > > > > to
> > > > > > a consensus about design. So it is up to us to make this happen
>

[ML] Bugs in GA Grid

2018-08-29 Thread Юрий Бабак
Turik,

Could you please take a look at these two bugs:

https://issues.apache.org/jira/browse/IGNITE-9354
https://issues.apache.org/jira/browse/IGNITE-9359

Thanks,
Yury


welcome

2018-08-07 Thread Юрий
Hello, Ignite Community!

My name is Iurii. I want to contribute to Apache Ignite.
My JIRA user name is jooger. Any help on this will be appreciated.

Thanks!

-- 
Live with a smile! :D