Re: Strange annoying messages in Ignite 2.11 logs for one of Ignite nodes

2021-09-29 Thread Ilya Roublev
Hi all,

I'd like to add that this is accompanied by the following messages:
```
ignite_1| 
[17:07:27,540][WARNING][grid-nio-worker-tcp-comm-0-#103%TcpCommunicationSpi%][TcpCommunicationSpi]
 Failed to shutdown socket: null
ignite_1| java.nio.channels.ClosedChannelException
ignite_1|   at 
sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:797)
ignite_1|   at 
sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:407)
ignite_1|   at 
org.apache.ignite.internal.util.IgniteUtils.close(IgniteUtils.java:4231)
ignite_1|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.closeKey(GridNioServer.java:2784)
ignite_1|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.close(GridNioServer.java:2835)
ignite_1|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.close(GridNioServer.java:2794)
ignite_1|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2157)
ignite_1|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1910)
ignite_1|   at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
ignite_1|   at java.lang.Thread.run(Thread.java:748)
ignite_2| 
[17:07:27,541][WARNING][grid-nio-worker-tcp-comm-1-#104%TcpCommunicationSpi%][TcpCommunicationSpi]
 Failed to shutdown socket: null
ignite_2| java.nio.channels.ClosedChannelException
ignite_2|   at 
sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:797)
ignite_2|   at 
sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:407)
ignite_2|   at 
org.apache.ignite.internal.util.IgniteUtils.close(IgniteUtils.java:4231)
ignite_2|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.closeKey(GridNioServer.java:2784)
ignite_2|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.close(GridNioServer.java:2835)
ignite_2|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.close(GridNioServer.java:2794)
ignite_2|   at 
org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1357)
ignite_2|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2508)
ignite_2|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2273)
ignite_2|   at 
org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1910)
ignite_2|   at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
ignite_2|   at java.lang.Thread.run(Thread.java:748)
```
I believe this is just due to some initialization of the Ignite cluster within Docker.
But I can't understand how to fix it... Could someone help us, please?
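For context, here is a minimal sketch of the kind of static-IP-finder discovery configuration typically used for a two-node Docker setup like ours (the hostnames and port range are illustrative, not our exact values):
```
import java.util.Arrays;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

public class TwoNodeDiscovery {
    public static void main(String[] args) {
        // Pin the discovery IP finder to the two known container addresses so the
        // ip-finder cleaner does not keep probing stale or unreachable addresses.
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList("ignite-1:47500..47509", "ignite-2:47500..47509"));

        TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
        discoverySpi.setIpFinder(ipFinder);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(discoverySpi);

        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Nodes in topology: " + ignite.cluster().nodes().size());
        }
    }
}
```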

On 2021/09/27 17:47:54,  wrote: 
> I switched to Ignite 2.11, and started to obtain the following messages for 
> one of Ignite nodes (I have two nodes in total launched within Docker for our 
> testing infrastructure)
> ignite_2| 
> [12:30:35,205][WARNING][tcp-disco-ip-finder-cleaner-#8-#175][TcpDiscoverySpi] 
> Failed to ping node [nodeId=null]. Reached the timeout 1ms. Cause: 
> Connection refused (Connection refused)
> ignite_2| 
> [12:30:35,206][WARNING][tcp-disco-ip-finder-cleaner-#8-#175][TcpDiscoverySpi] 
> Failed to ping node [nodeId=null]. Reached the timeout 1ms. Cause: 
> Connection refused (Connection refused)
> ignite_2| 
> [12:30:35,206][WARNING][tcp-disco-ip-finder-cleaner-#8-#175][TcpDiscoverySpi] 
> Failed to ping node [nodeId=null]. Reached the timeout 1ms. Cause: 
> Connection refused (Connection refused)
> ignite_2| 
> [12:30:35,206][WARNING][tcp-disco-ip-finder-cleaner-#8-#175][TcpDiscoverySpi] 
> Failed to ping node [nodeId=null]. Reached the timeout 1ms. Cause: 
> Connection refused (Connection refused)
> ignite_2| 
> [12:30:35,207][WARNING][tcp-disco-ip-finder-cleaner-#8-#175][TcpDiscoverySpi] 
> Failed to ping node [nodeId=null]. Reached the timeout 1ms. Cause: 
> Connection refused (Connection refused)
> ignite_2| 
> [12:30:35,207][WARNING][tcp-disco-ip-finder-cleaner-#8-#175][TcpDiscoverySpi] 
> Failed to ping node [nodeId=null]. Reached the timeout 1ms. Cause: 
> Connection refused (Connection refused)
> ignite_2| 
> [12:30:35,208][WARNING][tcp-disco-ip-finder-cleaner-#8-#175][TcpDiscoverySpi] 
> Failed to ping node [nodeId=null]. Reached the timeout 1ms. Cause: 
> Connection refused 

Re: Ignite client node hangs while IgniteAtomicLong is created

2020-08-10 Thread Ilya Roublev
Hello, Ilya,

In my post one week ago, I attached the necessary thread dumps. Could
you please say whether you have sufficient information to investigate the
problem with IgniteAtomicLong hanging? I think the issue is not all that
harmless: it concerns the latest Ignite 2.8.1, and fixing it may IMHO be
important for the community (I think the cause is the simultaneous
initialization of ignite-sys-atomic-cache in several nodes, but I may
certainly be mistaken). But unfortunately I have seen no reaction to this
for a week. Could you please give at least a hint that the problem is under
investigation and there is at least a slight chance that it can be
resolved? Or would it be better for us to work out some workaround?
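For reference, the atomic in question is obtained by a call of roughly this shape (the configuration path and names here are illustrative rather than our exact code):
```
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteAtomicLong;
import org.apache.ignite.Ignition;

public class AtomicLongCreation {
    public static void main(String[] args) {
        // Start a client node from a Spring XML config and obtain the atomic.
        // The first call with create = true lazily initializes
        // ignite-sys-atomic-cache@default-ds-group; the suspected race is several
        // client nodes issuing this call at the same moment.
        try (Ignite client = Ignition.start("client.xml")) {
            IgniteAtomicLong idGen = client.atomicLong("idGen", 0, true);
            System.out.println("Next id: " + idGen.incrementAndGet());
        }
    }
}
```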

Thank you very much in advance for any response.

My best regards,
Ilya


ilya.kasnacheev wrote
> Hello!
> 
> Can you collect thread dumps from all nodes once you get them hanging?
> 
> Can you throw together a reproducer project?
> 
> Regards,
> -- 
> Ilya Kasnacheev
> 
> 
> Tue, 4 Aug 2020 at 12:51, Ilya Roublev (iroublev@):
> 
>> We are developing Jira cloud app using Apache Ignite both as data storage
>> and as job scheduler. This is done via a standard Ignite client node. But
>> we need to use Atlassian Connect Spring Boot to be able to communicate
>> with
>> Jira. In short, all is done exactly as in our article Boosting Jira Cloud
>> app development with Apache Ignite
>> (https://medium.com/alliedium/boosting-jira-cloud-app-development-with-apache-ignite-7eebc7bb3d48).
>> At first we used simple Ignite JDBC driver
>> (https://apacheignite-sql.readme.io/docs/jdbc-driver) just for
>> Atlassian
>> Connect Spring Boot along with a separate Ignite client node for our own
>> purposes. But this turned out to be very unstable being deployed in our
>> local Kubernetes cluster (built via Kubespray) due to constant exceptions
>>
>> java.net.SocketException: Connection reset
>>
>> occurring from time to time (in fact, this revealed itself only in our local
>> cluster; in AWS EKS all worked fine). To make all this more stable we
>> tried
>> to use the Ignite JDBC Client driver
>> (https://apacheignite-sql.readme.io/docs/jdbc-client-driver)
>> exactly as
>> described in the article mentioned above. Thus, now our backend uses two
>> Ignite client nodes per single JVM: the first one for JDBC used by
>> Atlassian Connect Spring Boot, the second one for our own purposes. This
>> solution turned out to be good enough, because our app now works very
>> stably both in our local cluster and in AWS EKS. But when we deploy our
>> app
>> in Docker for testing and developing purposes, our Ignite client nodes
>> hang
>> from time to time. After some investigation we were able to see that this
>> occurs exactly at the instant when an object of IgniteAtomicLong is
>> created. Below are logs both for successful initialization of our app and
>> for the case when the Ignite client node hung.
>> Logs when all is OK:
>> ignite-appclientnode-successful.log
>> (http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-appclientnode-successful.log)
>> ignite-jdbcclientnode-successful.log
>> (http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-jdbcclientnode-successful.log)
>> Logs when both client nodes hang:
>> ignite-appclientnode-failed.log
>> (http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-appclientnode-failed.log)
>> ignite-jdbcclientnode-failed.log
>> (http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-jdbcclientnode-failed.log)
>> Some analysis and questions:
>> From the logs one can see that the caches default,
>> tenants, atlassian_host_audit, SQL_PUBLIC_ATLASSIAN_HOST are manipulated;
>> in fact, default is given in the client configuration: client.xml
>> (http://apache-ignite-users.70518.x6.nabble.com/file/t2262/client.xml),
>> the cache SQL_PUBLIC_ATLASSIAN_HOST contains the atlassian_host table
>> mentioned
>> in Boosting Jira Cloud app development with Apache Ignite
>> (https://medium.com/alliedium/boosting-jira-cloud-app-development-with-apache-ignite-7eebc7bb3d48)
>> and is created in advance even before the app starts. Further,
>> atlassian_host_audit is a copy of atlassian_host; in any case, it is not
>> yet
>> created when the app hangs. As for other entities processed by
>> Ignite, they are created by the following code:
>>
>> CacheConfiguration<Long, Tenant> tenantCacheCfg = new
>> CacheConfiguration<>();
>> tenantCacheCfg.setName("tenants");
>> tenantCacheCfg.setSqlSchema("PROD");
>> tenantCacheCfg.


Re: Ignite client node hangs while IgniteAtomicLong is created

2020-08-05 Thread Ilya Roublev
Hello, Ilya,

Attached are two thread dumps, the second taken 13 minutes after the
first one: threaddump.txt, threaddump2.txt.

The hanging occurs in the main thread (in fact, the same output appears in a thread
dump taken after 8 hours):


The differences between the two thread dumps are minor; one of them is as
follows:
in the first thread dump

in the second


As for a reproducer project, this is not an easy task, because it is
difficult to understand which factors may be treated as significant. Our
initial project is in general stable; the matter is that we have dozens of builds
per day, and only some of these builds fail. It is very difficult to
catch this situation: I had to launch 5 builds one after another before
this situation really occurred. And it may be that this situation requires
launching very specific containers in Docker, each at a very specific time. And
we cannot share our original project; all I can do is give you those
parts of the code that deal with Ignite. For example, the full code of the start
method from DbManager is as follows:



And we have logs for all containers of our app, including those for the Ignite
server nodes; if you like, I can provide them.

Thank you very much for your help in advance.

My best regards,
Ilya



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Ignite client node hangs while IgniteAtomicLong is created

2020-08-04 Thread Ilya Roublev
We are developing a Jira cloud app using Apache Ignite both as data storage and
as job scheduler. This is done via a standard Ignite client node. But we
need to use Atlassian Connect Spring Boot to be able to communicate with
Jira. In short, all is done exactly as in our article Boosting Jira Cloud
app development with Apache Ignite.
At first we used the simple Ignite JDBC driver just for Atlassian
Connect Spring Boot along with a separate Ignite client node for our own
purposes. But this turned out to be very unstable when deployed in our
local Kubernetes cluster (built via Kubespray) due to constant exceptions
occurring from time to time (in fact, this revealed itself only in our local
cluster; in AWS EKS all worked fine). To make all this more stable we tried
to use the Ignite JDBC Client driver exactly as
described in the article mentioned above. Thus, now our backend uses two
Ignite client nodes per single JVM: the first one for JDBC used by Atlassian
Connect Spring Boot, the second one for our own purposes. This solution
turned out to be good enough, because our app now works very stably both in
our local cluster and in AWS EKS. But when we deploy our app in Docker for
testing and developing purposes, our Ignite client nodes hang from time to
time. After some investigation we were able to see that this occurs exactly
at the instant when an object of IgniteAtomicLong is created. Below are logs
both for successful initialization of our app and for the case when the Ignite
client node hung.
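For context on the two-driver setup described above, here is a minimal sketch of the two kinds of JDBC connections involved (the driver classes and URL formats are the standard Ignite ones; hostnames, ports, and config paths are illustrative, not our exact values). The logs themselves follow below.
```
import java.sql.Connection;
import java.sql.DriverManager;

public class IgniteJdbcExample {
    public static void main(String[] args) throws Exception {
        // Thin JDBC driver: a plain socket connection to one server node;
        // no Ignite client node is started inside the application JVM.
        try (Connection thin = DriverManager.getConnection("jdbc:ignite:thin://ignite-server:10800")) {
            System.out.println("Thin driver connected: " + !thin.isClosed());
        }

        // JDBC Client driver: starts a full Ignite client node from a Spring XML
        // configuration, so it joins the cluster topology like any other client node.
        Class.forName("org.apache.ignite.IgniteJdbcDriver");
        try (Connection client = DriverManager.getConnection(
                "jdbc:ignite:cfg://file:///opt/ignite/config/client.xml")) {
            System.out.println("Client driver connected: " + !client.isClosed());
        }
    }
}
```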
Logs when all is OK:
ignite-appclientnode-successful.log
ignite-jdbcclientnode-successful.log
Logs when both client nodes hang:
ignite-appclientnode-failed.log
ignite-jdbcclientnode-failed.log
Some analysis and questions
From the logs one can see that the caches default, tenants, atlassian_host_audit,
SQL_PUBLIC_ATLASSIAN_HOST are manipulated; in fact, default is given in the
client configuration (client.xml),
the cache SQL_PUBLIC_ATLASSIAN_HOST contains the atlassian_host table mentioned
in Boosting Jira Cloud app development with Apache Ignite
and is created in advance even before the app starts. Further,
atlassian_host_audit is a copy of atlassian_host; in any case, it is not yet
created when the app hangs. As for other entities processed by Ignite,
they are created by the following code:

And from the logs of the app itself
it is clear that the app hangs exactly on the last line. This is confirmed
by the fact that in ignite-jdbcclientnode-successful.log we have the
following lines:

while in ignite-jdbcclientnode-failed.log all the lines,
starting from the first time the cache ignite-sys-atomic-cache@default-ds-group
(the cache used for atomics) was mentioned, are as follows:

In particular, the
following line from ignite-jdbcclientnode-successful.log is absent in
ignite-jdbcclientnode-failed.log:

But it should be noted that for the failure
case there are other client nodes running in separate containers, executed
simultaneously with the backend app and with the same code creating the
cache tenants and the IgniteAtomicLong idGen. As regards the logs below (see
above for the code), their node ids are 653143b2-6e80-49ff-9e9a-ae10237b32e8
and 30e24e06-ab76-4053-a36e-548e87ffe5d1, respectively (and it can easily be
seen that all the lines in ignite-jdbcclientnode-failed.log with
ignite-sys-atomic-cache@default-ds-group relate namely to these nodes); the
logs for the time segment when the code with tenants and idGen is executed
are as follows:

And the code creating tenants and idGen is executed
successfully. But is it possible that this simultaneous creation of idGen
may hang some nodes? (As for the case when all was executed successfully,
there we also have two separate containers, but they are executed strictly
after all is done in the main app, so the simultaneous execution of the same
code in several client nodes may be the reason for the hanging, may it not?) And if
the answer is positive, what is to be done? Certainly it is possible to
set a delay for those separate containers, but this does not look like a
particularly safe solution... And we have another small question: when we have two
separate client nodes in our app, both configured for logging, why, starting
from some instant, only the 

Re: Prevent automatic Ignite server node start for Spring Boot

2019-01-25 Thread Ilya Roublev
ilya.kasnacheev wrote
> It seems that somebody (like Spring Boot) has started
> 
> org.apache.ignite.cache.CacheManager
> 
> 
> How to prevent it I am unaware, you should probably consult with Spring
> Boot docs.

Thank you very much! It seems that 

spring.cache.type: simple

put in the application.yml file switches off the start of
org.apache.ignite.cache.CacheManager.

But it is not clear how spring.cache.type: simple can influence other
things loaded by Spring Boot (by now I'm also using another database via
Hibernate).

Certainly it would be better to exclude the start of
org.apache.ignite.cache.CacheManager via some parameter specific to
Ignite.
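Another possible direction (just a sketch, assuming the server node really is started through Spring Boot's cache auto-configuration, as the reply above suggests; the class name MyApplication is hypothetical) is to exclude that auto-configuration explicitly:
```
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.cache.CacheAutoConfiguration;

// Excluding the cache auto-configuration keeps Spring Boot from looking up a
// JCache provider (Ignite's CachingProvider) and thus from starting a node for it.
@SpringBootApplication(exclude = CacheAutoConfiguration.class)
public class MyApplication {
    public static void main(String[] args) {
        SpringApplication.run(MyApplication.class, args);
    }
}
```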

Thanks once again for your help!




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Prevent automatic Ignite server node start for Spring Boot

2019-01-22 Thread Ilya Roublev
I have the following problem. I'm developing a Spring Boot application; this
application uses Ignite, but via the creation of client nodes (when
necessary). The following is the relevant part of pom.xml for the project:


The problem is that when the application is started, I get the following log:


But that is not at all what was expected: this server node was created
without any intention to do so (Ignite server nodes are started
separately). The matter is, just for test purposes I removed all the code
working with Ignite; the only thing that has to do with Ignite is the
following dependency:

Now there are no configuration XMLs in the JAR at all (relating to Ignite or not).
But when the JAR is launched, the above Ignite server node still starts.

Could you please help me and explain how to prevent such a start of Ignite
server nodes? Later, within the application, special client nodes are to be
created, but this is to be done intentionally. I'd like to avoid any
unintentional creation of any Ignite nodes (server or client ones).
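For clarity, the kind of intentional client node start we do want later in the app looks roughly like this (a sketch only, not our actual code; the instance name is illustrative):
```
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class IntentionalClientStart {
    public static void main(String[] args) {
        // Explicitly start an Ignite client node only when we decide to,
        // rather than having one started implicitly at application boot.
        IgniteConfiguration cfg = new IgniteConfiguration()
            .setIgniteInstanceName("app-client")
            .setClientMode(true);

        try (Ignite client = Ignition.start(cfg)) {
            System.out.println("Client node started: " + client.cluster().localNode().isClient());
        }
    }
}
```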

Thank you very much in advance.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/