Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Denis Magda
Hi Raymond,

You will not come across any internode-communication issues with your
deployment configuration as long as the servers and clients are running
within the K8s environment.

The issue discussed here occurs if one of the following happens:

   - The clients are deployed in K8 while the servers are running on
   virtual machines (or vice versa).
   - A serverless function attempts to use a thick client that by the
   current design creates a ServerSocket connection:
   https://issues.apache.org/jira/browse/IGNITE-13013

As for continuous queries, I have the following use case that can
easily fail. Let's say your 10 servers are running on virtual machines
while a thick client is managed by K8s. The client registers a CQ in the
cluster, and at some point in time all 10 servers will need to send an
update notification to the client. To do that, they have to open a
connection to the client, and this is where things can fall apart.
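
To make the scenario concrete, here is a minimal sketch of such a registration
from a thick client (cache name and types are illustrative, not taken from your
deployment):

import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.configuration.IgniteConfiguration;

public class CqFromThickClient {
    public static void main(String[] args) {
        // Thick client node joins the cluster topology.
        Ignite client = Ignition.start(new IgniteConfiguration().setClientMode(true));

        IgniteCache<Integer, String> cache = client.getOrCreateCache("data");

        ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();

        // Runs on the client; every server that owns an updated entry has to
        // open a communication connection back to this client to deliver it.
        qry.setLocalListener(evts -> evts.forEach(e ->
            System.out.println("Updated: " + e.getKey() + " -> " + e.getValue())));

        QueryCursor<Cache.Entry<Integer, String>> cur = cache.query(qry);
        // Keep 'cur' open for as long as notifications are needed; closing it
        // deregisters the query.
    }
}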

-
Denis


On Fri, Jun 26, 2020 at 11:54 PM Raymond Wilson 
wrote:

> I have just caught up with this discussion and wanted to outline a set of
> use
> cases we have that rely on server nodes communicating with client nodes.
>
> Firstly, I'd like to confirm my mental model of server & client nodes
> within
> a grid (ignoring thin clients for now):
>
> A grid contains a set of nodes somewhat arbitrarily labelled 'server' and
> 'client' where the distinction of a 'server' node is that it is responsible
> for containing data (in-memory only, or also with persistence). Apart from
> that distinction, all nodes are essentially peers in the grid and may use
> the messaging fabric, compute layer and other grid features on an equal
> footing.
>
> In our solution we leverage these capabilities to build and orchestrate
> complex analytics queries that utilise compute functions that are initiated
> in three distinct ways: client -> client, client -> server and server ->
> client, and where all three styles of initiation are used within a single
> analytics request made to the grid itself. I can go into more detail about
> the exact sequencing of these activities if you like, but it may be
> sufficient to know they are used to reason about the problem statement and
> proposals outlined here.
>
> Our infrastructure is deployed to Kubernetes using EKS on AWS, and all
> three
> relationships between client and server nodes noted above function well
> (caveat: we do see odd things though such as long pauses on critical worker
> threads, and occasional empty topology warnings when locating client nodes
> to send requests to). We also use continuous queries in three contexts (all
> within server nodes).
>
> If this thread is suggesting changing the functional relationship between
> server and client nodes then this may have impacts on our architecture and
> implementation that we will need to consider.
>
> This thread has highlighted issues with K8s deployments and also CQ issues.
> The suggestion is that Server to Client just doesn't work on K8s, which
> does
> not agree with our experience of it working. I'd also like to understand
> better the bounds of the issue with CQ: When does it not work and what are
> the symptoms we would see if there was an issue with the way we are using
> it, or the K8s infrastructure we deploy to?
>
> Thanks,
> Raymond.
>
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>


[jira] [Created] (IGNITE-13196) Greater than Where condition on secondary column in Primary key does not work when also filtering on primary column in primary key

2020-06-29 Thread Andrew (Jira)
Andrew created IGNITE-13196:
---

 Summary: Greater than Where condition on secondary column in 
Primary key does not work when also filtering on primary column in primary key
 Key: IGNITE-13196
 URL: https://issues.apache.org/jira/browse/IGNITE-13196
 Project: Ignite
  Issue Type: Bug
  Components: thin client
Affects Versions: 2.8.1
 Environment: Windows 10
Reporter: Andrew


Using the Java thin client's SQL API, I have an issue where I am unable to 
query by both columns of a composite primary key when one of the conditions 
is a greater-than condition.

 

For example, if I create a table: "Create Table If Not Exists test (foo varchar, 
bar int, biz int, PRIMARY KEY(foo, bar))"

And then insert the following data: INSERT INTO test VALUES ('key1', 1, 2), 
('key2', 2, 3 )

 

The following queries return the same result (('key2', 2, 3)):

"Select * from test where foo = 'key2'"

"Select * from test where bar > 1"

 

However, the following query returns no data:

"Select * from test where foo = 'key2' and bar > 1"

 

For reference, each query is executed using: igniteClient.query(new 
SqlFieldsQuery(Query)).getAll()

Could you help explain why this is happening and if there are any workarounds? 
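
A self-contained sketch of the reproduction (the address is illustrative; it 
assumes a node listening on the default thin-client port):

{code:java}
import java.util.List;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

public class CompositePkRangeQuery {
    public static void main(String[] args) throws Exception {
        try (IgniteClient client = Ignition.startClient(
                new ClientConfiguration().setAddresses("127.0.0.1:10800"))) {

            client.query(new SqlFieldsQuery(
                "CREATE TABLE IF NOT EXISTS test (foo VARCHAR, bar INT, biz INT, " +
                "PRIMARY KEY (foo, bar))")).getAll();

            client.query(new SqlFieldsQuery(
                "INSERT INTO test VALUES ('key1', 1, 2), ('key2', 2, 3)")).getAll();

            // Each predicate on its own returns ('key2', 2, 3)...
            List<List<?>> byFoo = client.query(new SqlFieldsQuery(
                "SELECT * FROM test WHERE foo = 'key2'")).getAll();
            List<List<?>> byBar = client.query(new SqlFieldsQuery(
                "SELECT * FROM test WHERE bar > 1")).getAll();

            // ...but the combined query unexpectedly returns no rows on 2.8.1.
            List<List<?>> combined = client.query(new SqlFieldsQuery(
                "SELECT * FROM test WHERE foo = 'key2' AND bar > 1")).getAll();

            System.out.println(byFoo + " / " + byBar + " / " + combined);
        }
    }
}
{code}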

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Event for failed queries

2020-06-29 Thread Max Timonin
Hi Igniters,

I'm looking for a technique that can expose SQL clauses to an external
observer. There is an event 'CacheQueryExecutedEvent' (maps to
EVT_CACHE_QUERY_EXECUTED) that contains the clause. But as far as I can see,
it is produced only for successful queries, and there is no event at all if a
query fails. I wonder:

1. Is there a reason why it is designed so that only successful queries produce
the event?
2. Can I rely on the event to subscribe to all successful queries for my task?
3. Is there a valid way to subscribe to failed queries?
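
For context, this is roughly how the existing event can be consumed today (a
minimal sketch, names illustrative); the gap is that nothing comparable seems
to fire when a query fails:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.CacheQueryExecutedEvent;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;

public class QueryEventListener {
    public static void main(String[] args) {
        // The event type has to be enabled explicitly, otherwise it is not recorded.
        Ignite ignite = Ignition.start(new IgniteConfiguration()
            .setIncludeEventTypes(EventType.EVT_CACHE_QUERY_EXECUTED));

        ignite.events().localListen(new IgnitePredicate<CacheQueryExecutedEvent<?, ?>>() {
            @Override public boolean apply(CacheQueryExecutedEvent<?, ?> evt) {
                // Fired for successfully executed queries only; failed queries
                // currently have no counterpart event.
                System.out.println("Query clause: " + evt.clause());
                return true; // keep listening
            }
        }, EventType.EVT_CACHE_QUERY_EXECUTED);
    }
}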

Thanks for any help,

Maksim


Apache Ignite Virtual Meetup - stay connected from wherever you are

2020-06-29 Thread Kseniya Romanova
Hi Igniters!

Probably the only good thing about quarantine is that we realized that our
community spans the globe. We would like to keep these connections even after
our offline, in-person activities resume. So we created the Apache Ignite
Virtual Meetup[1].


Join the Virtual Meetup group to keep in touch with Igniters from around
the world. Attend our virtual events from any place that offers an internet
connection. We will try to schedule events in various time zones[2]. Please
feel free to suggest topics and speakers.


Use this form[3] to submit your Virtual Meetup proposal. Share your
knowledge and expertise with the worldwide Apache Ignite Community.

Cheers,

Kseniya

[1] https://www.meetup.com/Apache-Ignite-Virtual-Meetup/

[2] https://www.meetup.com/Apache-Ignite-Virtual-Meetup/events/271602652/

[3] https://bit.ly/2Bmq1WT


Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Ivan Bessonov
Ivan,

Currently we have no requirement to keep all possible connections
open. Every node can have an arbitrary number of connections to every
other node (it's configurable with the "connectionsPerNode" setting).

Also, we can't expect that a client will magically open a connection when
we need it; that's the main issue. Changing this approach is out of
scope, and I can't guarantee that it can or will be implemented this way.
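
For reference, this is the knob I mean (the value is illustrative):

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

public class CommunicationConfig {
    public static void main(String[] args) {
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();

        // Allow up to 4 communication connections per remote node.
        commSpi.setConnectionsPerNode(4);

        Ignition.start(new IgniteConfiguration().setCommunicationSpi(commSpi));
    }
}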


Mon, Jun 29, 2020 at 17:30, Ivan Pavlukhin :

> Ivan,
>
> It seems that if a server notices that an existing connection to a
> client cannot be used anymore then the server can expect that the
> client will establish a new one. Is it just out of current iteration
> scope? Or are there still other fundamental problems?
>
> 2020-06-29 16:32 GMT+03:00, Ivan Bessonov :
> > Hi Ivan,
> >
> > sure, TCP connections are lazy. So, if a connection is not already opened
> > then node (trying to send a message) will initiate connection opening.
> > It's also possible that the opened connection is spontaneously closed for
> > some reason. Otherwise you are right, everything is as you described.
> >
> > There's also a tie breaker when two nodes connect to each other at the
> > same time. Only one of them will succeed and it depends on internal
> > discovery order, which you can't control basically.
> >
> > Mon, Jun 29, 2020 at 16:01, Ivan Pavlukhin :
> >
> >> Hi Ivan,
> >>
> >> Sorry for a possibly naive question. As I understand we are talking
> >> about order of establishing client-server connections. And I suppose
> >> that in some environments (e.g. cloud) servers cannot directly
> >> establish connections with clients. But TCP connections are
> >> bidirectional and we still can send messages in both directions. Could
> >> you please provide an example case in which servers have to initiate
> >> new connections to clients?
> >>
> >> 2020-06-29 13:08 GMT+03:00, Ivan Bessonov :
> >> > Hi igniters, Hi Raymond,
> >> >
> >> > that was a really good point. I will try to address it as much as I
> >> > can.
> >> >
> >> > First of all, this new mode will be configurable for now. As Val
> >> suggested,
> >> > "TcpCommunicationSpi#forceClientToServerConnections" will be a new
> >> > setting to trigger this behavior. Disabled by default.
> >> >
> >> > About issues with K8S deployments - I'm not an expert, but from what
> >> > I've
> >> > heard, sometimes servers and client nodes are not in the same
> >> environments.
> >> > For example, there is an Ignite cluster and user tries to start client
> >> node
> >> > in
> >> > isolated K8S pod. In this case clients cannot properly resolve their
> >> > own
> >> > addresses
> >> > and send it to servers, making it impossible for servers to connect to
> >> such
> >> > clients.
> >> > Or, in other words, clients are used as if they were thin.
> >> >
> >> > In your case everything is fine, clients and servers share the same
> >> network
> >> > and can resolve each other's addresses.
> >> >
> >> > Now, CQ issue [1]. You can pass a custom event filter when you
> register
> >> > a
> >> > new
> >> > continuous query. But, depending on the setup, the class of this
> filter
> >> may
> >> > not
> >> > be in the classpath of the server node that holds the data and invokes
> >> that
> >> > filter.
> >> > There are two solutions to the problem:
> >> > - server fails to resolve class name and fails to register CQ;
> >> > - or server can have p2p deployment enabled. Let's assume that it was
> a
> >> > client
> >> > node that requested CQ. In this case the server will try to download
> >> > "class" file
> >> > directly from the node that sent the filter object in the first place.
> >> Due
> >> > to a poor
> >> > design decision it will be done synchronously while registering the
> >> query,
> >> > and
> >> > query registration is happening in "discovery" thread. In normal
> >> > circumstances
> >> > the server will load the class and finish query registration, it's
> just
> >> > a
> >> > little bit slow.
> >> >
> >> > Second case is not compatible with a new
> >> > "forceClientToServerConnections"
> >> > setting. I'm not sure that I need to go into all technical details,
> but
> >> the
> >> > result of
> >> > such procedure is a cluster that cannot process any discovery messages
> >> > during
> >> > TCP connection timeout, we're talking about tens of seconds or maybe
> >> > even
> >> > several minutes depending on the settings and the environment. All
> this
> >> > time the
> >> > server will be in a "deadlock" state inside of the "discovery" thread.
> >> > It
> >> > means that
> >> > some cluster operations will be unavailable during this period, like
> >> > new
> >> > node joining
> >> > or starting a new cache. Node failures will not be processed properly
> >> > as
> >> > well. For
> >> > me it's hard to predict real behavior until we reproduce the situation
> >> in a
> >> > live
> >> > environment. I saw this in tests only.
> >> >
> >> > I hope that my message clarifies the 

Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Ivan Pavlukhin
Ivan,

It seems that if a server notices that an existing connection to a
client cannot be used anymore, then the server can expect the
client to establish a new one. Is this just out of scope for the current
iteration, or are there still other fundamental problems?

2020-06-29 16:32 GMT+03:00, Ivan Bessonov :
> Hi Ivan,
>
> sure, TCP connections are lazy. So, if a connection is not already opened
> then node (trying to send a message) will initiate connection opening.
> It's also possible that the opened connection is spontaneously closed for
> some reason. Otherwise you are right, everything is as you described.
>
> There's also a tie breaker when two nodes connect to each other at the
> same time. Only one of them will succeed and it depends on internal
> discovery order, which you can't control basically.
>
> Mon, Jun 29, 2020 at 16:01, Ivan Pavlukhin :
>
>> Hi Ivan,
>>
>> Sorry for a possibly naive question. As I understand we are talking
>> about order of establishing client-server connections. And I suppose
>> that in some environments (e.g. cloud) servers cannot directly
>> establish connections with clients. But TCP connections are
>> bidirectional and we still can send messages in both directions. Could
>> you please provide an example case in which servers have to initiate
>> new connections to clients?
>>
>> 2020-06-29 13:08 GMT+03:00, Ivan Bessonov :
>> > Hi igniters, Hi Raymond,
>> >
>> > that was a really good point. I will try to address it as much as I
>> > can.
>> >
>> > First of all, this new mode will be configurable for now. As Val
>> suggested,
>> > "TcpCommunicationSpi#forceClientToServerConnections" will be a new
>> > setting to trigger this behavior. Disabled by default.
>> >
>> > About issues with K8S deployments - I'm not an expert, but from what
>> > I've
>> > heard, sometimes servers and client nodes are not in the same
>> environments.
>> > For example, there is an Ignite cluster and user tries to start client
>> node
>> > in
>> > isolated K8S pod. In this case clients cannot properly resolve their
>> > own
>> > addresses
>> > and send it to servers, making it impossible for servers to connect to
>> such
>> > clients.
>> > Or, in other words, clients are used as if they were thin.
>> >
>> > In your case everything is fine, clients and servers share the same
>> network
>> > and can resolve each other's addresses.
>> >
>> > Now, CQ issue [1]. You can pass a custom event filter when you register
>> > a
>> > new
>> > continuous query. But, depending on the setup, the class of this filter
>> may
>> > not
>> > be in the classpath of the server node that holds the data and invokes
>> that
>> > filter.
>> > There are two solutions to the problem:
>> > - server fails to resolve class name and fails to register CQ;
>> > - or server can have p2p deployment enabled. Let's assume that it was a
>> > client
>> > node that requested CQ. In this case the server will try to download
>> > "class" file
>> > directly from the node that sent the filter object in the first place.
>> Due
>> > to a poor
>> > design decision it will be done synchronously while registering the
>> query,
>> > and
>> > query registration is happening in "discovery" thread. In normal
>> > circumstances
>> > the server will load the class and finish query registration, it's just
>> > a
>> > little bit slow.
>> >
>> > Second case is not compatible with a new
>> > "forceClientToServerConnections"
>> > setting. I'm not sure that I need to go into all technical details, but
>> the
>> > result of
>> > such procedure is a cluster that cannot process any discovery messages
>> > during
>> > TCP connection timeout, we're talking about tens of seconds or maybe
>> > even
>> > several minutes depending on the settings and the environment. All this
>> > time the
>> > server will be in a "deadlock" state inside of the "discovery" thread.
>> > It
>> > means that
>> > some cluster operations will be unavailable during this period, like
>> > new
>> > node joining
>> > or starting a new cache. Node failures will not be processed properly
>> > as
>> > well. For
>> > me it's hard to predict real behavior until we reproduce the situation
>> in a
>> > live
>> > environment. I saw this in tests only.
>> >
>> > I hope that my message clarifies the situation, or at least doesn't
>> > cause
>> > more
>> > confusion. These changes will not affect your infrastructure or your
>> Ignite
>> > installations, they are aimed at adding more flexibility to other ways
>> > of
>> > using Ignite.
>> >
>> > [1] https://issues.apache.org/jira/browse/IGNITE-13156
>> >
>> >
>> >
> >> > Sat, Jun 27, 2020 at 09:54, Raymond Wilson > >:
>> >
>> >> I have just caught up with this discussion and wanted to outline a set
>> of
>> >> use
>> >> cases we have that rely on server nodes communicating with client
>> >> nodes.
>> >>
>> >> Firstly, I'd like to confirm my mental model of server & client nodes
>> >> within
>> >> a grid (ignoring thin clients for now):
>> >>
>> >> A grid 

Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Ivan Bessonov
Hi Ivan,

sure, TCP connections are lazy. So, if a connection is not already open,
the node trying to send a message will initiate opening one.
It's also possible that an open connection is spontaneously closed for
some reason. Otherwise you are right, everything is as you described.

There's also a tie-breaker when two nodes connect to each other at the
same time: only one of them will succeed, and which one depends on internal
discovery order, which you basically can't control.
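
To illustrate with a sketch (not code from the ticket): a server-side call like
the one below is what lazily opens connections from servers to client nodes,
and it is exactly where unresolvable client addresses become a problem.

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.lang.IgniteRunnable;

public class ServerToClientCall {
    public static void main(String[] args) {
        Ignite server = Ignition.start(new IgniteConfiguration());

        // Broadcasting a closure to all client nodes: if no communication
        // connection to a client exists yet, sending the job is what opens it.
        IgniteRunnable job = () -> System.out.println("Executed on a client node");

        server.compute(server.cluster().forClients()).broadcast(job);
    }
}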

Mon, Jun 29, 2020 at 16:01, Ivan Pavlukhin :

> Hi Ivan,
>
> Sorry for a possibly naive question. As I understand we are talking
> about order of establishing client-server connections. And I suppose
> that in some environments (e.g. cloud) servers cannot directly
> establish connections with clients. But TCP connections are
> bidirectional and we still can send messages in both directions. Could
> you please provide an example case in which servers have to initiate
> new connections to clients?
>
> 2020-06-29 13:08 GMT+03:00, Ivan Bessonov :
> > Hi igniters, Hi Raymond,
> >
> > that was a really good point. I will try to address it as much as I can.
> >
> > First of all, this new mode will be configurable for now. As Val
> suggested,
> > "TcpCommunicationSpi#forceClientToServerConnections" will be a new
> > setting to trigger this behavior. Disabled by default.
> >
> > About issues with K8S deployments - I'm not an expert, but from what I've
> > heard, sometimes servers and client nodes are not in the same
> environments.
> > For example, there is an Ignite cluster and user tries to start client
> node
> > in
> > isolated K8S pod. In this case clients cannot properly resolve their own
> > addresses
> > and send it to servers, making it impossible for servers to connect to
> such
> > clients.
> > Or, in other words, clients are used as if they were thin.
> >
> > In your case everything is fine, clients and servers share the same
> network
> > and can resolve each other's addresses.
> >
> > Now, CQ issue [1]. You can pass a custom event filter when you register a
> > new
> > continuous query. But, depending on the setup, the class of this filter
> may
> > not
> > be in the classpath of the server node that holds the data and invokes
> that
> > filter.
> > There are two solutions to the problem:
> > - server fails to resolve class name and fails to register CQ;
> > - or server can have p2p deployment enabled. Let's assume that it was a
> > client
> > node that requested CQ. In this case the server will try to download
> > "class" file
> > directly from the node that sent the filter object in the first place.
> Due
> > to a poor
> > design decision it will be done synchronously while registering the
> query,
> > and
> > query registration is happening in "discovery" thread. In normal
> > circumstances
> > the server will load the class and finish query registration, it's just a
> > little bit slow.
> >
> > Second case is not compatible with a new "forceClientToServerConnections"
> > setting. I'm not sure that I need to go into all technical details, but
> the
> > result of
> > such procedure is a cluster that cannot process any discovery messages
> > during
> > TCP connection timeout, we're talking about tens of seconds or maybe even
> > several minutes depending on the settings and the environment. All this
> > time the
> > server will be in a "deadlock" state inside of the "discovery" thread. It
> > means that
> > some cluster operations will be unavailable during this period, like new
> > node joining
> > or starting a new cache. Node failures will not be processed properly as
> > well. For
> > me it's hard to predict real behavior until we reproduce the situation
> in a
> > live
> > environment. I saw this in tests only.
> >
> > I hope that my message clarifies the situation, or at least doesn't cause
> > more
> > confusion. These changes will not affect your infrastructure or your
> Ignite
> > installations, they are aimed at adding more flexibility to other ways of
> > using Ignite.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-13156
> >
> >
> >
> > Sat, Jun 27, 2020 at 09:54, Raymond Wilson  >:
> >
> >> I have just caught up with this discussion and wanted to outline a set
> of
> >> use
> >> cases we have that rely on server nodes communicating with client nodes.
> >>
> >> Firstly, I'd like to confirm my mental model of server & client nodes
> >> within
> >> a grid (ignoring thin clients for now):
> >>
> >> A grid contains a set of nodes somewhat arbitrarily labelled 'server'
> and
> >> 'client' where the distinction of a 'server' node is that it is
> >> responsible
> >> for containing data (in-memory only, or also with persistence). Apart
> >> from
> >> that distinction, all nodes are essentially peers in the grid and may
> use
> >> the messaging fabric, compute layer and other grid features on an equal
> >> footing.
> >>
> >> In our solution we leverage these capabilities to build and orchestrate
> >> 

Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Ivan Pavlukhin
Hi Ivan,

Sorry for a possibly naive question. As I understand it, we are talking
about the order of establishing client-server connections. And I suppose
that in some environments (e.g. cloud) servers cannot directly
establish connections with clients. But TCP connections are
bidirectional, and we can still send messages in both directions. Could
you please provide an example case in which servers have to initiate
new connections to clients?

2020-06-29 13:08 GMT+03:00, Ivan Bessonov :
> Hi igniters, Hi Raymond,
>
> that was a really good point. I will try to address it as much as I can.
>
> First of all, this new mode will be configurable for now. As Val suggested,
> "TcpCommunicationSpi#forceClientToServerConnections" will be a new
> setting to trigger this behavior. Disabled by default.
>
> About issues with K8S deployments - I'm not an expert, but from what I've
> heard, sometimes servers and client nodes are not in the same environments.
> For example, there is an Ignite cluster and user tries to start client node
> in
> isolated K8S pod. In this case clients cannot properly resolve their own
> addresses
> and send it to servers, making it impossible for servers to connect to such
> clients.
> Or, in other words, clients are used as if they were thin.
>
> In your case everything is fine, clients and servers share the same network
> and can resolve each other's addresses.
>
> Now, CQ issue [1]. You can pass a custom event filter when you register a
> new
> continuous query. But, depending on the setup, the class of this filter may
> not
> be in the classpath of the server node that holds the data and invokes that
> filter.
> There are two solutions to the problem:
> - server fails to resolve class name and fails to register CQ;
> - or server can have p2p deployment enabled. Let's assume that it was a
> client
> node that requested CQ. In this case the server will try to download
> "class" file
> directly from the node that sent the filter object in the first place. Due
> to a poor
> design decision it will be done synchronously while registering the query,
> and
> query registration is happening in "discovery" thread. In normal
> circumstances
> the server will load the class and finish query registration, it's just a
> little bit slow.
>
> Second case is not compatible with a new "forceClientToServerConnections"
> setting. I'm not sure that I need to go into all technical details, but the
> result of
> such procedure is a cluster that cannot process any discovery messages
> during
> TCP connection timeout, we're talking about tens of seconds or maybe even
> several minutes depending on the settings and the environment. All this
> time the
> server will be in a "deadlock" state inside of the "discovery" thread. It
> means that
> some cluster operations will be unavailable during this period, like new
> node joining
> or starting a new cache. Node failures will not be processed properly as
> well. For
> me it's hard to predict real behavior until we reproduce the situation in a
> live
> environment. I saw this in tests only.
>
> I hope that my message clarifies the situation, or at least doesn't cause
> more
> confusion. These changes will not affect your infrastructure or your Ignite
> installations, they are aimed at adding more flexibility to other ways of
> using Ignite.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-13156
>
>
>
> Sat, Jun 27, 2020 at 09:54, Raymond Wilson :
>
>> I have just caught up with this discussion and wanted to outline a set of
>> use
>> cases we have that rely on server nodes communicating with client nodes.
>>
>> Firstly, I'd like to confirm my mental model of server & client nodes
>> within
>> a grid (ignoring thin clients for now):
>>
>> A grid contains a set of nodes somewhat arbitrarily labelled 'server' and
>> 'client' where the distinction of a 'server' node is that it is
>> responsible
>> for containing data (in-memory only, or also with persistence). Apart
>> from
>> that distinction, all nodes are essentially peers in the grid and may use
>> the messaging fabric, compute layer and other grid features on an equal
>> footing.
>>
>> In our solution we leverage these capabilities to build and orchestrate
>> complex analytics queries that utilise compute functions that are
>> initiated
>> in three distinct ways: client -> client, client -> server and server ->
>> client, and where all three styles of initiation are used within a
>> single
>> analytics request made to the grid itself. I can go into more detail
>> about
>> the exact sequencing of these activities if you like, but it may be
>> sufficient to know they are used to reason about the problem statement
>> and
>> proposals outlined here.
>>
>> Our infrastructure is deployed to Kubernetes using EKS on AWS, and all
>> three
>> relationships between client and server nodes noted above function well
>> (caveat: we do see odd things though such as long pauses on critical
>> worker
>> threads, and occasional empty 

Announcing ApacheCon @Home 2020

2020-06-29 Thread Rich Bowen

Hi, Apache enthusiast!

(You’re receiving this because you’re subscribed to one or more dev or 
user mailing lists for an Apache Software Foundation project.)


The ApacheCon Planners and the Apache Software Foundation are pleased to 
announce that ApacheCon @Home will be held online, September 29th 
through October 1st, 2020. We’ll be featuring content from dozens of our 
projects, as well as content about community, how Apache works, business 
models around Apache software, the legal aspects of open source, and 
many other topics.


Full details about the event, and registration, are available at 
https://apachecon.com/acah2020


Due to the confusion around how and where this event was going to be 
held, and in order to open up to presenters from around the world who 
may previously have been unable or unwilling to travel, we’ve reopened 
the Call For Presentations until July 13th. Submit your talks today at 
https://acna2020.jamhosted.net/


We hope to see you at the event!
Rich Bowen, VP Conferences, The Apache Software Foundation


Re: [MTCGA]: new failures in builds [5419627] needs to be handled

2020-06-29 Thread Steshin Vladimir

Dmitry, hi.

Here are the ticket [1] and the PR [2]. It looks trivial. Waiting for the test 
run to complete.


[1] https://issues.apache.org/jira/browse/IGNITE-13194

[2] https://github.com/apache/ignite/pull/7969


On 29.06.2020 03:43, dpavlov.ta...@gmail.com wrote:

Hi Igniters,

  I've detected a new issue on TeamCity that needs to be handled. You are more than 
welcome to help.

  If your changes can lead to this failure(s): We're grateful that you volunteered 
to make a contribution to this project, but things change and you may no longer 
be able to finalize your contribution.
  Could you respond to this email and indicate whether you wish to continue and fix 
the test failures, or step down so that a committer may revert your commit.

  *New test failure in master 
IgnitePdsBinaryMetadataOnClusterRestartTest.testNodeWithIncompatibleMetadataIsProhibitedToJoinTheCluster
 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-7949590731195529429=%3Cdefault%3E=testDetails
  Changes may lead to failure were done by
 - maksim timonin  
https://ci.ignite.apache.org/viewModification.html?modId=903615
 - tledkov  
https://ci.ignite.apache.org/viewModification.html?modId=903624
 - alexey zinoviev  
https://ci.ignite.apache.org/viewModification.html?modId=903701
 - ivan daschinskiy  
https://ci.ignite.apache.org/viewModification.html?modId=903700

 - Here's a reminder of what contributors agreed to do 
https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute
 - Should you have any questions please contact dev@ignite.apache.org

Best Regards,
Apache Ignite TeamCity Bot
https://github.com/apache/ignite-teamcity-bot
Notification generated at 03:43:21 29-06-2020



Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Ivan Bessonov
Hi igniters, Hi Raymond,

that was a really good point. I will try to address it as much as I can.

First of all, this new mode will be configurable for now. As Val suggested,
"TcpCommunicationSpi#forceClientToServerConnections" will be a new
setting to trigger this behavior. Disabled by default.
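
To show what a user configuration might look like (a sketch only; the setter
name below is my assumption derived from the property mentioned above, the API
is not final):

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

public class ClientCommunicationConfig {
    public static void main(String[] args) {
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();

        // Hypothetical setter for the proposed property; disabled by default.
        commSpi.setForceClientToServerConnections(true);

        Ignition.start(new IgniteConfiguration()
            .setClientMode(true)
            .setCommunicationSpi(commSpi));
    }
}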

About issues with K8S deployments - I'm not an expert, but from what I've
heard, sometimes server and client nodes are not in the same environments.
For example, there is an Ignite cluster and a user tries to start a client node
in an isolated K8S pod. In this case clients cannot properly resolve their own
addresses and send them to servers, making it impossible for servers to connect
to such clients. Or, in other words, clients are used as if they were thin.

In your case everything is fine, clients and servers share the same network
and can resolve each other's addresses.

Now, the CQ issue [1]. You can pass a custom event filter when you register a
new continuous query. But, depending on the setup, the class of this filter may
not be in the classpath of the server node that holds the data and invokes that
filter. There are two ways this can play out:
- the server fails to resolve the class name and fails to register the CQ;
- or the server can have p2p deployment enabled. Let's assume that it was a
client node that requested the CQ. In this case the server will try to download
the "class" file directly from the node that sent the filter object in the
first place. Due to a poor design decision this is done synchronously while
registering the query, and query registration happens in the "discovery"
thread. In normal circumstances the server will load the class and finish query
registration; it's just a little bit slow.
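
For illustration, a registration that hits the p2p case looks roughly like this
(class and cache names are made up): the filter class executes on server nodes,
so it must either be on their classpath or be peer-class-loaded from the node
that registers the query.

import javax.cache.configuration.FactoryBuilder;
import javax.cache.event.CacheEntryEvent;
import javax.cache.event.CacheEntryEventFilter;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.ContinuousQuery;

public class CqWithRemoteFilter {
    /** Executed on server nodes; this is the class that may need p2p deployment. */
    public static class EvenKeyFilter implements CacheEntryEventFilter<Integer, String> {
        @Override public boolean evaluate(CacheEntryEvent<? extends Integer, ? extends String> e) {
            return e.getKey() % 2 == 0;
        }
    }

    public static void register(Ignite client) {
        IgniteCache<Integer, String> cache = client.getOrCreateCache("data");

        ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();
        qry.setRemoteFilterFactory(FactoryBuilder.factoryOf(EvenKeyFilter.class));
        qry.setLocalListener(evts -> evts.forEach(e ->
            System.out.println("Filtered update: " + e.getKey())));

        // Registration is the moment when servers resolve (and, with p2p
        // enabled, download) the EvenKeyFilter class.
        cache.query(qry);
    }
}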

The second case is not compatible with the new "forceClientToServerConnections"
setting. I'm not sure that I need to go into all the technical details, but the
result of such a procedure is a cluster that cannot process any discovery
messages for the duration of the TCP connection timeout; we're talking about
tens of seconds or maybe even several minutes depending on the settings and the
environment. All this time the server will be in a "deadlock" state inside the
"discovery" thread. It means that some cluster operations will be unavailable
during this period, like new nodes joining or starting a new cache. Node
failures will not be processed properly either. For me it's hard to predict the
real behavior until we reproduce the situation in a live environment. I have
seen this in tests only.

I hope that my message clarifies the situation, or at least doesn't cause more
confusion. These changes will not affect your infrastructure or your Ignite
installations; they are aimed at adding more flexibility to other ways of
using Ignite.

[1] https://issues.apache.org/jira/browse/IGNITE-13156



Sat, Jun 27, 2020 at 09:54, Raymond Wilson :

> I have just caught up with this discussion and wanted to outline a set of
> use
> cases we have that rely on server nodes communicating with client nodes.
>
> Firstly, I'd like to confirm my mental model of server & client nodes
> within
> a grid (ignoring thin clients for now):
>
> A grid contains a set of nodes somewhat arbitrarily labelled 'server' and
> 'client' where the distinction of a 'server' node is that it is responsible
> for containing data (in-memory only, or also with persistence). Apart from
> that distinction, all nodes are essentially peers in the grid and may use
> the messaging fabric, compute layer and other grid features on an equal
> footing.
>
> In our solution we leverage these capabilities to build and orchestrate
> complex analytics queries that utilise compute functions that are initiated
> in three distinct ways: client -> client, client -> server and server ->
> client, and where all three styles of initiation are used within a single
> analytics request made to the grid itself. I can go into more detail about
> the exact sequencing of these activities if you like, but it may be
> sufficient to know they are used to reason about the problem statement and
> proposals outlined here.
>
> Our infrastructure is deployed to Kubernetes using EKS on AWS, and all
> three
> relationships between client and server nodes noted above function well
> (caveat: we do see odd things though such as long pauses on critical worker
> threads, and occasional empty topology warnings when locating client nodes
> to send requests to). We also use continuous queries in three contexts (all
> within server nodes).
>
> If this thread is suggesting changing the functional relationship between
> server and client nodes then this may have impacts on our architecture and
> implementation that we will need to consider.
>
> This thread has highlighted issues with K8s deployments and also CQ issues.
> The suggestion is that Server to Client just doesn't work on K8s, which
> does
> not agree with our experience of it working. I'd also like to understand
> better the bounds of the issue with CQ: When does it not work and what are
> 

[jira] [Created] (IGNITE-13195) Allow skipping autotools invocation when building Ignite release

2020-06-29 Thread Alexey Goncharuk (Jira)
Alexey Goncharuk created IGNITE-13195:
-

 Summary: Allow skipping autotools invocation when building Ignite 
release
 Key: IGNITE-13195
 URL: https://issues.apache.org/jira/browse/IGNITE-13195
 Project: Ignite
  Issue Type: Improvement
Reporter: Alexey Goncharuk


I do not have an up-to-date set of autotools installed on my local machine, and 
the Apache Ignite release build fails locally with the following error:
{code}
main::scan_file() called too early to check prototype at /usr/local/bin/aclocal 
line 617.
configure.ac:36: warning: macro `AM_PROG_AR' not found in library
configure.ac:21: error: Autoconf version 2.69 or higher is required
configure.ac:21: the top level
autom4te: /usr/bin/m4 failed with exit status: 63
aclocal: autom4te failed with exit status: 63
{code}
I do not need to run these commands locally because I only need a quick 
assembly (Java only, even without javadocs) to verify the release structure and 
the integrity of the command-line utilities.

It would be great to move these commands to a separate profile (enabled by 
default) so users can skip them when building the release package, something 
like {{mvn initialize -Prelease -P!autotools}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13194) Fix testNodeWithIncompatibleMetadataIsProhibitedToJoinTheCluster()

2020-06-29 Thread Vladimir Steshin (Jira)
Vladimir Steshin created IGNITE-13194:
-

 Summary: Fix 
testNodeWithIncompatibleMetadataIsProhibitedToJoinTheCluster()
 Key: IGNITE-13194
 URL: https://issues.apache.org/jira/browse/IGNITE-13194
 Project: Ignite
  Issue Type: Bug
Reporter: Vladimir Steshin
Assignee: Vladimir Steshin






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Extended logging for rebalance performance analysis

2020-06-29 Thread Ivan Rakov
+1 to Alex G.

From my experience, the most interesting cases with Ignite rebalancing
happen exactly in production. Given that we already have detailed rebalancing
logging, adding info about rebalance performance looks like a reasonable
improvement. With the new logs we'll be able to detect and investigate
situations when rebalance is slow due to uneven supplier distribution or
network issues.
The option to disable the feature at runtime shouldn't be used often, but it
will keep us on the safe side in case something goes wrong.
The format described in
https://issues.apache.org/jira/browse/IGNITE-12080 looks
good to me.

On Tue, Jun 23, 2020 at 7:01 PM ткаленко кирилл 
wrote:

> Hello, Alexey!
>
> Currently there is no way to disable / enable it, but it seems that the
> logs will not be overloaded, since Alexei Scherbakov's suggestion seems
> reasonable and compact. Of course, you can add disabling / enabling statistics
> collection via JMX, for example.
>
> 23.06.2020, 18:47, "Alexey Goncharuk" :
> > Hello Maxim, folks,
> >
> > Wed, May 6, 2020 at 21:01, Maxim Muzafarov :
> >
> >>  We won't do performance analysis on the production environment. Each
> >>  time we need performance analysis it will be done on a test
> >>  environment with verbose logging enabled. Thus I suggest moving these
> >>  changes to a separate `profiling` module and extending the logging much
> >>  more without any size limitations. The same as these [2] [3]
> >>  activities do.
> >
> >  I strongly disagree with this statement. I am not sure who is meant here
> > by 'we', but I see a strong momentum in increasing observability tooling
> > that helps people to understand what exactly happens in the production
> > environment [1]. Not everybody can afford two identical environments for
> > testing. We should make sure users have enough information to understand
> > the root cause after the incident happened, and not force them to
> reproduce
> > it, let alone make them add another module to the classpath and restart
> the
> > nodes.
> > I think having this functionality in the core module with the ability to
> > disable/enable it is the right approach. Having the information printed
> to
> > log is ok, having it in an event that can be sent to a monitoring/tracing
> > subsystem is even better.
> >
> > Kirill, can we enable and disable this feature in runtime to avoid the
> very
> > same nodes restart?
> >
> > [1]
> https://www.honeycomb.io/blog/yes-i-test-in-production-and-so-do-you/
>


[jira] [Created] (IGNITE-13193) Implement fallback to full partition rebalancing in case historical supplier failed to read all necessary data updates from WAL

2020-06-29 Thread Vyacheslav Koptilin (Jira)
Vyacheslav Koptilin created IGNITE-13193:


 Summary: Implement fallback to full partition rebalancing in case 
historical supplier failed to read all necessary data updates from WAL
 Key: IGNITE-13193
 URL: https://issues.apache.org/jira/browse/IGNITE-13193
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.8.1
Reporter: Vyacheslav Koptilin
Assignee: Vyacheslav Koptilin


Historical rebalance may fail for several reasons:
1) The WAL on the supplier node is corrupted - in the current implementation the 
supplier triggers a failure handler.
2) After iterating over the WAL, the demander node didn't receive all updates 
needed to make the MOVING partition up-to-date (the resulting update counter 
didn't converge with the expected update counter of the OWNING partition) - in 
the current implementation the demander silently ignores the lack of updates.
Such behavior negatively affects the stability of the cluster: an inappropriate 
state of the historical WAL is not a reason to fail a supplier node.
A more proper way to handle this scenario is to:
 - either try to rebalance the partition historically from another supplier, or
 - use full partition rebalancing for the problem partition.

Once the supplier fails to provide data from part of the WAL, its corresponding 
sequence of checkpoints should be marked as inapplicable for historical 
rebalance in order to prevent further errors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13192) Get value after INSERT and put

2020-06-29 Thread Surkov Aleksandr (Jira)
Surkov Aleksandr created IGNITE-13192:
-

 Summary: Get value after INSERT and put
 Key: IGNITE-13192
 URL: https://issues.apache.org/jira/browse/IGNITE-13192
 Project: Ignite
  Issue Type: Bug
Reporter: Surkov Aleksandr


Reproducer:
{code:java}
@Test
public void testSql() throws Exception {
    try (Ignite ignored = Ignition.start(Config.getServerConfiguration());
         Ignite ignored2 = Ignition.start(Config.getServerConfiguration());
         IgniteClient client = Ignition.startClient(new ClientConfiguration()
             .setBinaryConfiguration(new BinaryConfiguration().setCompactFooter(true))
             .setAddresses(Config.SERVER))
    ) {
        // 1. Create table
        client.query(
            new SqlFieldsQuery(String.format(
                "CREATE TABLE IF NOT EXISTS Person (id INT PRIMARY KEY, name VARCHAR) " +
                "WITH \"VALUE_TYPE=%s,CACHE_NAME=%s\"",
                Person.class.getName(), "PersonCache"
            )).setSchema("PUBLIC")
        ).getAll();

        int key = 1;
        Person val = new Person(key, "Person " + key);

        // 2. INSERT value to cache
        client.query(new SqlFieldsQuery("INSERT INTO Person(id, name) VALUES(?, ?)")
            .setArgs(val.getId(), val.getName())
            .setSchema("PUBLIC")
        ).getAll();

        // 4. Execute put()
        // Without this line, there will be no exception
        client.getOrCreateCache("PersonCache").put(2, val);

        // 5. Execute get(). There will be an exception:
        // org.apache.ignite.binary.BinaryObjectException: Cannot find metadata for object with compact footer
        assertNotNull(client.getOrCreateCache("PersonCache").get(1));
    }
}
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Apache Ignite 2.9.0 RELEASE [Time, Scope, Manager]

2020-06-29 Thread Anton Vinogradov
You're now in the "Ignite Release Managers" group.
Please check that you have access.

On Fri, Jun 26, 2020 at 9:38 PM Alex Plehanov 
wrote:

> Guys,
>
> I've created the 2.9 release confluence page [1].
> Also, I found that I don't have permission to Team City release tasks. Can
> anyone give me such permissions?
>
> [1]: https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.9
>
> Fri, Jun 26, 2020 at 13:38, Alexey Zinoviev :
>
> > Igniters
> >
> > Unfortunately, the ML model export/import feature is still under
> > development.
> > I need to delay it until the 2.10 release.
> >
> >
> >
> > Fri, Jun 26, 2020 at 06:50, Alex Plehanov :
> >
> > > Denis,
> > >
> > > Yes, I still ready to manage it.
> > > Today I will prepare a release page on wiki and will try to go over
> > tickets
> > > list.
> > > Also, I have plans to cut the branch by the end of next week if there
> are
> > > no objections.
> > >
> > >
> > > Fri, Jun 26, 2020 at 03:48, Denis Magda :
> > >
> > > > Igniters,
> > > >
> > > > Are we moving forward with this release? Alex Plehanov, are you still
> > > ready
> > > > to manage it? It seems like everyone agreed with the timeline you
> > > proposed
> > > > in the very beginning.
> > > >
> > > > -
> > > > Denis
> > > >
> > > >
> > > > On Tue, Jun 16, 2020 at 8:52 AM Denis Magda 
> wrote:
> > > >
> > > > > Sergey, Ivan,
> > > > >
> > > > > Could you please check the questions below? If it's time-consuming
> to
> > > > > rework continuous queries, then the new mode can become available
> in
> > > the
> > > > experimental state and should not allow registering continuous queries
> to
> > > > avoid
> > > > > potential deadlocks. Overall, this design gap in continuous queries
> > was
> > > > > like a bomb that has just detonated [1]. Anyway, this new
> > connectivity
> > > > mode
> > > > > will be priceless even if you can't use continuous queries with
> them
> > > > > because right now we cannot even start a thick client inside of a
> > > > > serverless function.
> > > > >
> > > > > Alexey Plehanov,
> > > > >
> > > > > It looks like we can proceed with the release taking your
> timelines.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/IGNITE-13156
> > > > >
> > > > > -
> > > > > Denis
> > > > >
> > > > >
> > > > > On Wed, Jun 10, 2020 at 4:16 PM Denis Magda 
> > wrote:
> > > > >
> > > > >> Ivan, Sergey,
> > > > >>
> > > > >> How much effort should we put to resolve the issue with
> > > > >> continuous queries? Are you working on that issue actively? Let's
> > try
> > > to
> > > > >> guess what would be the ETA.
> > > > >>
> > > > >> -
> > > > >> Denis
> > > > >>
> > > > >>
> > > > >> On Wed, Jun 10, 2020 at 3:55 AM Ivan Bessonov <
> > bessonov...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>> Hello,
> > > > >>>
> > > > >>> Sorry for the delay. Sergey Chugunov (sergey.chugu...@gmail.com)
> > > just
> > > > >>> replied
> > > > >>> to the main conversation about "communication via discovery" [1].
> > We
> > > > >>> work on it
> > > > >>> together and recently have found one hard-to-fix scenario,
> detailed
> > > > >>> description is
> > > > >>> provided in Sergey's reply.
> > > > >>>
> > > > >>> In short, July 10th looks realistic only if we introduce new
> > behavior
> > > > in
> > > > >>> its current
> > > > >>> implementation, with new setting and IgniteExperimental status.
> > > Blocker
> > > > >>> here is
> > > >>> the current implementation of the Continuous Query protocol that in some
> > cases
> > > > >>> (described
> > > > >>> at the end) initiates TCP connection right from discovery thread
> > > which
> > > > >>> obviously
> > > > >>> leads to deadlock. We haven't estimated efforts needed to
> redesign
> > of
> > > > CQ
> > > > >>> protocol
> > > > >>> but it is definitely a risk and fixing it isn't feasible with a
> > code
> > > > >>> freeze at 10th of July.
> > > > >>> So my verdict: we can include this new feature in 2.9 scope as
> > > > >>> experimental and with
> > > > >>> highlighted limitation on CQ usage. Is that OK?
> > > > >>>
> > > > >>> CQ limitation: server needs to open a communication connection to
> > the
> > > > >>> client if during
> > > > >>> CQ registration client tries to p2p deploy new class not
> available
> > on
> > > > >>> server classpath.
> > > > >>> In other cases registration of CQ should be fine.
> > > > >>>
> > > > >>> [1]
> > > > >>>
> > > >
> > >
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-New-Ignite-settings-for-IGNITE-12438-and-IGNITE-13013-td47586.html
> > > > >>>
> > > > >>> Tue, Jun 9, 2020 at 19:36, Ivan Rakov :
> > > > >>>
> > > >  Hi,
> > > > 
> > > >  Indeed, the tracing feature is almost ready. Discovery,
> > > communication
> > > >  and
> > > >  transactions tracing will be introduced, as well as an option to
> > > >  configure
> > > >  tracing in runtime. Right now we are working on final
> performance
> > > >