tcp-comm-worker-#1 attempts to connect to unrecognized IP addresses

2019-11-08 Thread Conrad Mukai (cmukai)
I am running a 4-node Ignite cluster in AWS. We are running Ignite inside
Docker containers (the image is apacheignite/ignite:2.7.6 from Docker Hub). The
containers are set up to use the host network.

I noticed a lot of warnings from the tcp-comm-worker-#1 thread showing attempts
to connect to unrecognized IP addresses:

[2019-11-08 22:38:24,205][WARN ][tcp-comm-worker-#1][TcpCommunicationSpi] 
Connect timed out (consider increasing 'failureDetectionTimeout' configuration 
property) [addr=ip-172-17-0-1.ec2.internal/172.17.0.1:47104, 
failureDetectionTimeout=30]
[2019-11-08 22:40:31,501][WARN ][tcp-comm-worker-#1][TcpCommunicationSpi] 
Connect timed out (consider increasing 'failureDetectionTimeout' configuration 
property) [addr=/9.0.1.1:47104, failureDetectionTimeout=30]
[2019-11-08 22:44:40,973][WARN ][tcp-comm-worker-#1][TcpCommunicationSpi] 
Connect timed out (consider increasing 'failureDetectionTimeout' configuration 
property) [addr=/198.51.100.3:47105, failureDetectionTimeout=30]
[2019-11-08 22:45:03,931][WARN ][tcp-comm-worker-#1][TcpCommunicationSpi] 
Connect timed out (consider increasing 'failureDetectionTimeout' configuration 
property) [addr=ip-172-17-0-1.ec2.internal/172.17.0.1:47104, 
failureDetectionTimeout=30]
[2019-11-08 22:47:11,245][WARN ][tcp-comm-worker-#1][TcpCommunicationSpi] 
Connect timed out (consider increasing 'failureDetectionTimeout' configuration 
property) [addr=ip-172-31-254-1.ec2.internal/172.31.254.1:47104, 
failureDetectionTimeout=30]
[2019-11-08 22:50:00,333][WARN ][tcp-comm-worker-#1][TcpCommunicationSpi] 
Connect timed out (consider increasing 'failureDetectionTimeout' configuration 
property) [addr=/198.51.100.1:47109, failureDetectionTimeout=30]

These IP addresses were not configured in the discoverySpi. I recognize 
172.17.0.1 as part of the default Docker bridge network, but as I mentioned 
earlier we are using the host network. Also 172.31.0.0/16 is the default VPC in 
AWS, but we are not using that VPC. I have no idea why these IP addresses are 
being accessed. Can someone explain what is going on?
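
If it is relevant, would explicitly binding the communication SPI to the host
interface be a reasonable workaround? A minimal sketch of what I have in mind
(Java config, placeholder address, not our actual setup):

import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

// Sketch only: pin communication to one known host-network address (placeholder IP).
TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
commSpi.setLocalAddress("10.0.0.10");   // host-network interface of this node
commSpi.setLocalPort(47100);

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setCommunicationSpi(commSpi);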

Thanks in advance,
Conrad



Re: Cluster in AWS can not have more than 100 nodes?

2019-11-08 Thread codeboyyong
Hi Ilya,
Yes, from the message I can see there are more than 100 server nodes:

Topology snapshot [ver=2104, locNode=63120c8d, servers=117, clients=129,
state=ACTIVE, CPUs=9240, offheap=2300.0GB, heap=7300.0GB]

We are running a Spark job that keeps writing to the database every minute.
At least from the log I can see there are 117 server nodes. Compared with the
100 IPs in the S3 bucket that is better, but I am still not sure why it is not
the 128 nodes we started.

Thank You
Yong



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Ignite.Net Server & client node

2019-11-08 Thread Pavel Tupitsyn
Hi,

Every request is communicated to the server, there is no "local cache" or
anything like that.
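
For example, with the Java thin client every get() below goes over the wire to
the server; nothing is kept locally (a rough sketch, the address is a placeholder):

import org.apache.ignite.Ignition;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

ClientConfiguration cfg = new ClientConfiguration().setAddresses("server-host:10800");
try (IgniteClient client = Ignition.startClient(cfg)) {
    ClientCache<Integer, String> cache = client.getOrCreateCache("myCache");
    cache.put(1, "value");
    // Both reads below are separate network calls to the server node.
    String first = cache.get(1);
    String second = cache.get(1);
}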

On Fri, Nov 8, 2019 at 8:37 PM Sudhir Patil  wrote:

> Hi All,
>
> In Ignite.NET the server node stores cache data and a thin client
> communicates with the server to get cache data.
>
> In such situations, after the thin client has requested cache data once, do all
> further requests still communicate with the server, or is that cache data stored
> on the thin client and served from there without contacting the server?
>
> Regards,
> Sudhir
>
>
> --
> Thanks & Regards,
> Sudhir Patil,
> +91 9881095647.
>


Ignite.Net Server & client node

2019-11-08 Thread Sudhir Patil
Hi All,

In Ignite.NET the server node stores cache data and a thin client communicates
with the server to get cache data.

In such situations, after the thin client has requested cache data once, do all
further requests still communicate with the server, or is that cache data stored
on the thin client and served from there without contacting the server?

Regards,
Sudhir


-- 
Thanks & Regards,
Sudhir Patil,
+91 9881095647.


Re: Ignite-spring-data_2.0 not working

2019-11-08 Thread Ilya Kasnacheev
Hello!

I don't recommend this since an index can't be used in this case. Better to
introduce a firstNameInUpper field.
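
Something along these lines (a rough sketch, names are just examples):

import org.apache.ignite.cache.query.annotations.QuerySqlField;

public class Person {
    @QuerySqlField(index = true)
    private String firstName;

    // Denormalized upper-case copy so the lookup can use an index.
    @QuerySqlField(index = true)
    private String firstNameInUpper;

    public void setFirstName(String firstName) {
        this.firstName = firstName;
        this.firstNameInUpper = firstName == null ? null : firstName.toUpperCase();
    }
}

// Repository method matching against the upper-cased field (pass an upper-cased argument):
// List<Person> findByFirstNameInUpper(String firstName);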

Regards,
-- 
Ilya Kasnacheev


Fri, 8 Nov 2019 at 18:32, Humphrey :

> Thanks, is there a way to use IgnoreCase as well?
>
> IgnoreCase  findByFirstnameIgnoreCase   … where UPPER(x.firstame)
> = UPPER(?1)
>
>
> https://docs.spring.io/spring-data/jpa/docs/1.5.1.RELEASE/reference/html/jpa.repositories.html#jpa.query-methods.query-creation
>
> Or is there another way to accomplish that with spring-data?
>
> Humphrey
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Ignite-spring-data_2.0 not working

2019-11-08 Thread Humphrey
Thanks, is there a way to use IgnoreCase as well?

IgnoreCase  findByFirstnameIgnoreCase   … where UPPER(x.firstame) = 
UPPER(?1)

https://docs.spring.io/spring-data/jpa/docs/1.5.1.RELEASE/reference/html/jpa.repositories.html#jpa.query-methods.query-creation

Or is there another way to accomplish that with spring-data?

Humphrey



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-08 Thread Ilya Kasnacheev
Hello!

You seem to have an awful lot of errors related to connectivity problems
between nodes, such as:

Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to
connect to address [addr=ult-s2-svr1.dataprocessors.com.au/10.16.1.47:47106,
err=Connection refused]

Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to
connect to address [addr=ult-s2-svr3/10.16.1.43:47102, err=Remote node ID
is not as expected [expected=d97b5e5d-fb46-4b5b-91ad-79a69fce738f,
rcvd=1dc23ebb-0997-4858-9433-d5d30c9b643e]]

I recommend figuring those errors out: it's possible that you have nodes in
your cluster which are not reachable via communication from the server node(s)
but are present in discovery. Such nodes will cause all kinds of problems in
the cluster.
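
A quick way to check is to try opening a plain TCP connection to each node's
communication port from every other host, for example (a sketch; the host and
port are taken from the error above):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Simple reachability probe for a node's communication port.
try (Socket s = new Socket()) {
    s.connect(new InetSocketAddress("ult-s2-svr1.dataprocessors.com.au", 47106), 3000);
    System.out.println("reachable");
} catch (IOException e) {
    System.out.println("NOT reachable: " + e.getMessage());
}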

Regards,
-- 
Ilya Kasnacheev


Fri, 8 Nov 2019 at 17:12, mvkarp :

> Ok, there are no exceptions in the ignite logs for the client JVMs but I've
> attached the log for one of the problem servers. Looks like a few errors
> but
> I am unable to determine the root cause.
> ignite-46073e05.zip
> <
> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/ignite-46073e05.zip>
>
>
>
> ilya.kasnacheev wrote
> > Hello!
> >
> > This is very strange, since we expect this collection to be cleared on
> > exchange.
> >
> > Please make sure you don't have any stray exceptions during exchange in
> > your logs.
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > пт, 8 нояб. 2019 г. в 12:49, mvkarp 
>
> > liquid_ninja2k@
>
> > :
> >
> >> Hi,
> >>
> >> This is not the case. Always only a maximum total of two server nodes.
> >> One
> >> JVM server on each. However there are many client JVMs that start and
> >> stop
> >> caches with setClientMode=true. It looks like one of the server
> instances
> >> is
> >> immune to the issue, whilst the most newly created one gets the leak,
> >> with
> >> a
> >> lot of partition exchanges happening for EVT_NODE_JOINED and
> >> EVT_NODE_LEFT
> >> (one of the nodes don't get any of these partition exchanges, however
> the
> >> exact server node that gets this can alternate so its not linked to one
> >> node
> >> in particular but seems to be linked to the most newly launched server).
> >>
> >>
> >> ilya.kasnacheev wrote
> >> > Hello!
> >> >
> >> > How many nodes do you have in your cluster?
> >> >
> >> > From the dump it seems that the number of server nodes is in
> thousands.
> >> Is
> >> > this the case?
> >> >
> >> > Regards,
> >> > --
> >> > Ilya Kasnacheev
> >> >
> >> >
> >> > пт, 8 нояб. 2019 г. в 10:26, mvkarp 
> >>
> >> > liquid_ninja2k@
> >>
> >> > :
> >> >
> >> >> Let me know if these help or if you need anything more specific.
> >> >> recoveryBallotBoxes.zip
> >> >> <
> >> >>
> >>
> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/recoveryBallotBoxes.zip
> >> >
> >> >>
> >> >>
> >> >>
> >> >> ilya.kasnacheev wrote
> >> >> > Hello!
> >> >> >
> >> >> > Can you please check whether there are any especially large objects
> >> >> inside
> >> >> > recoveryBallotBoxes object graph? Sorting by retained heap may help
> >> in
> >> >> > determining this. It would be nice to know what is the type
> >> histogram
> >> >> of
> >> >> > what's inside recoveryBallotBoxes and where the bulk of heap usage
> >> >> > resides.
> >> >> >
> >> >> > Regards,
> >> >> > --
> >> >> > Ilya Kasnacheev
> >> >> >
> >> >> >
> >> >> > чт, 7 нояб. 2019 г. в 06:23, mvkarp 
> >> >>
> >> >> > liquid_ninja2k@
> >> >>
> >> >> > :
> >> >> >
> >> >> >> I've attached another set of screenshots, might be more clear.
> >> >> >> heap.zip
> >> >> >> 
> >> >>
> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip;
> >> >> >>
> >> >> >>
> >> >> >> mvkarp wrote
> >> >> >> > I've attached some extra screenshots showing what is inside
> these
> >> >> >> records
> >> >> >> > and path to GC roots. heap.zip
> >> >> >> > 
> >> >> >>
> >> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip;
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > --
> >> >> >> > Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> >> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> >> >>
> >>
> >>
> >>
> >>
> >>
> >> --
> >> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> >>
>
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Ignite-spring-data_2.0 not working

2019-11-08 Thread Ilya Kasnacheev
Hello!

Well, I guess that's how it is currently.

Regards,
-- 
Ilya Kasnacheev


Fri, 8 Nov 2019 at 17:17, Humphrey :

> Thanks,
>
> I've tried it with version 2.0.14 and it works, but with version 2.1.0 it
> doesn't.
>
> 
> <dependency>
>   <groupId>org.springframework.data</groupId>
>   <artifactId>spring-data-commons</artifactId>
>   <version>2.0.14.RELEASE</version>
> </dependency>
> 
>
> Humphrey
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Ignite-spring-data_2.0 not working

2019-11-08 Thread Humphrey
Thanks,

I've tried it with version 2.0.14 and it works, but with version 2.1.0 it
doesn't.


<dependency>
  <groupId>org.springframework.data</groupId>
  <artifactId>spring-data-commons</artifactId>
  <version>2.0.14.RELEASE</version>
</dependency>


Humphrey



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-08 Thread mvkarp
OK, there are no exceptions in the Ignite logs for the client JVMs, but I've
attached the log for one of the problem servers. It looks like there are a few
errors, but I am unable to determine the root cause.
ignite-46073e05.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/t2658/ignite-46073e05.zip>


ilya.kasnacheev wrote
> Hello!
> 
> This is very strange, since we expect this collection to be cleared on
> exchange.
> 
> Please make sure you don't have any stray exceptions during exchange in
> your logs.
> 
> Regards,
> -- 
> Ilya Kasnacheev
> 
> 
> пт, 8 нояб. 2019 г. в 12:49, mvkarp 

> liquid_ninja2k@

> :
> 
>> Hi,
>>
>> This is not the case. Always only a maximum total of two server nodes.
>> One
>> JVM server on each. However there are many client JVMs that start and
>> stop
>> caches with setClientMode=true. It looks like one of the server instances
>> is
>> immune to the issue, whilst the most newly created one gets the leak,
>> with
>> a
>> lot of partition exchanges happening for EVT_NODE_JOINED and
>> EVT_NODE_LEFT
>> (one of the nodes don't get any of these partition exchanges, however the
>> exact server node that gets this can alternate so its not linked to one
>> node
>> in particular but seems to be linked to the most newly launched server).
>>
>>
>> ilya.kasnacheev wrote
>> > Hello!
>> >
>> > How many nodes do you have in your cluster?
>> >
>> > From the dump it seems that the number of server nodes is in thousands.
>> Is
>> > this the case?
>> >
>> > Regards,
>> > --
>> > Ilya Kasnacheev
>> >
>> >
>> > пт, 8 нояб. 2019 г. в 10:26, mvkarp 
>>
>> > liquid_ninja2k@
>>
>> > :
>> >
>> >> Let me know if these help or if you need anything more specific.
>> >> recoveryBallotBoxes.zip
>> >> <
>> >>
>> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/recoveryBallotBoxes.zip
>> >
>> >>
>> >>
>> >>
>> >> ilya.kasnacheev wrote
>> >> > Hello!
>> >> >
>> >> > Can you please check whether there are any especially large objects
>> >> inside
>> >> > recoveryBallotBoxes object graph? Sorting by retained heap may help
>> in
>> >> > determining this. It would be nice to know what is the type
>> histogram
>> >> of
>> >> > what's inside recoveryBallotBoxes and where the bulk of heap usage
>> >> > resides.
>> >> >
>> >> > Regards,
>> >> > --
>> >> > Ilya Kasnacheev
>> >> >
>> >> >
>> >> > чт, 7 нояб. 2019 г. в 06:23, mvkarp 
>> >>
>> >> > liquid_ninja2k@
>> >>
>> >> > :
>> >> >
>> >> >> I've attached another set of screenshots, might be more clear.
>> >> >> heap.zip
>> >> >> 
>> >> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip;
>> >> >>
>> >> >>
>> >> >> mvkarp wrote
>> >> >> > I've attached some extra screenshots showing what is inside these
>> >> >> records
>> >> >> > and path to GC roots. heap.zip
>> >> >> > 
>> >> >>
>> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip;
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>> >> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>> >>
>>
>>
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-08 Thread Ilya Kasnacheev
Hello!

This is very strange, since we expect this collection to be cleared on
exchange.

Please make sure you don't have any stray exceptions during exchange in
your logs.

Regards,
-- 
Ilya Kasnacheev


Fri, 8 Nov 2019 at 12:49, mvkarp :

> Hi,
>
> This is not the case. Always only a maximum total of two server nodes. One
> JVM server on each. However there are many client JVMs that start and stop
> caches with setClientMode=true. It looks like one of the server instances
> is
> immune to the issue, whilst the most newly created one gets the leak, with
> a
> lot of partition exchanges happening for EVT_NODE_JOINED and EVT_NODE_LEFT
> (one of the nodes don't get any of these partition exchanges, however the
> exact server node that gets this can alternate so its not linked to one
> node
> in particular but seems to be linked to the most newly launched server).
>
>
> ilya.kasnacheev wrote
> > Hello!
> >
> > How many nodes do you have in your cluster?
> >
> > From the dump it seems that the number of server nodes is in thousands.
> Is
> > this the case?
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > пт, 8 нояб. 2019 г. в 10:26, mvkarp 
>
> > liquid_ninja2k@
>
> > :
> >
> >> Let me know if these help or if you need anything more specific.
> >> recoveryBallotBoxes.zip
> >> <
> >>
> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/recoveryBallotBoxes.zip
> >
> >>
> >>
> >>
> >> ilya.kasnacheev wrote
> >> > Hello!
> >> >
> >> > Can you please check whether there are any especially large objects
> >> inside
> >> > recoveryBallotBoxes object graph? Sorting by retained heap may help in
> >> > determining this. It would be nice to know what is the type histogram
> >> of
> >> > what's inside recoveryBallotBoxes and where the bulk of heap usage
> >> > resides.
> >> >
> >> > Regards,
> >> > --
> >> > Ilya Kasnacheev
> >> >
> >> >
> >> > чт, 7 нояб. 2019 г. в 06:23, mvkarp 
> >>
> >> > liquid_ninja2k@
> >>
> >> > :
> >> >
> >> >> I've attached another set of screenshots, might be more clear.
> >> >> heap.zip
> >> >> 
> >> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip;
> >> >>
> >> >>
> >> >> mvkarp wrote
> >> >> > I've attached some extra screenshots showing what is inside these
> >> >> records
> >> >> > and path to GC roots. heap.zip
> >> >> > 
> >> >>
> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip;
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> >> >>
> >>
> >>
> >>
> >>
> >>
> >> --
> >> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> >>
>
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Blocked Worker Thread

2019-11-08 Thread Andrei Aleksandrov

Hi Conrad,

The reasons can be different. Could you please share the logs?

BR,
Andrei

11/7/2019 10:35 PM, Conrad Mukai (cmukai) wrote:


We are running a cache in a 4-node cluster with atomicityMode set to ATOMIC
and have persistence enabled. We repeatedly get a SYSTEM_WORKER_BLOCKED error
on one node, which disables the entire cluster. We were seeing a lot of sockets
in the TIME_WAIT state, which was blocking clients from connecting, so we did
the following on all the nodes:


# ignore TIME_WAIT state on sockets
echo "1" > /proc/sys/net/ipv4/tcp_tw_reuse
echo "1" > /proc/sys/net/ipv4/tcp_tw_recycle

This made that issue go away, but it may play a part in this new issue. The
first question is: what is the root cause of the error? The second question is:
why does this bring down the entire cluster?


Here is the error message:

[2019-11-07 16:36:22,037][ERROR][tcp-disco-msg-worker-#2][root] 
Critical system error detected. Will be handled accordingly to 
configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, 
timeout=0, super=AbstractFailureHandler 
[ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, 
SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext 
[type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: 
GridWorker [name=partition-exchanger, igniteInstanceName=null, 
finished=false, heartbeatTs=1573090509428]]]


class org.apache.ignite.IgniteException: GridWorker [name=partition-exchanger, igniteInstanceName=null, finished=false, heartbeatTs=1573090509428]
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
    at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
    at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2663)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7181)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2700)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
    at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)


This is followed by a warning and a thread dump:

[2019-11-07 16:36:22,038][WARN ][tcp-disco-msg-worker-#2][FailureProcessor] No deadlocked threads detected.

[2019-11-07 16:36:22,328][WARN ][tcp-disco-msg-worker-#2][FailureProcessor] Thread dump at 2019/11/07 16:36:22 GMT


For the particular thread in the error and warning messages here is 
the thread dump:


Thread [name="tcp-disco-msg-worker-#2", id=113, state=RUNNABLE, blockCnt=211, waitCnt=4745368]
    at sun.management.ThreadImpl.dumpThreads0(Native Method)
    at sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:454)
    at o.a.i.i.util.IgniteUtils.dumpThreads(IgniteUtils.java:1368)
    at o.a.i.i.processors.failure.FailureProcessor.process(FailureProcessor.java:128)
    - locked o.a.i.i.processors.failure.FailureProcessor@7e65ceba
    at o.a.i.i.processors.failure.FailureProcessor.process(FailureProcessor.java:104)
    at o.a.i.i.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1829)
    at o.a.i.i.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
    at o.a.i.i.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
    at o.a.i.i.util.worker.GridWorker.onIdle(GridWorker.java:297)
    at o.a.i.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2663)
    at o.a.i.spi.discovery.tcp.ServerImpl$RingMessageWorker$$Lambda$47/1047515321.run(Unknown Source)
    at o.a.i.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7181)
    at o.a.i.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2700)
    at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:120)
    at o.a.i.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
    at o.a.i.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)

In addition all the system threads are in TIMED_WAITING state:

Thread [name="sys-#7099", id=9252, state=TIMED_WAITING, blockCnt=0, waitCnt=1]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@677ec573, ownerName=null, ownerId=-1]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at 

Re: recoveryBallotBoxes in MvccProcessorImpl memory leak?

2019-11-08 Thread mvkarp
Hi, 

This is not the case. There are always at most two server nodes in total, one
JVM server on each. However, there are many client JVMs that start and stop
caches with setClientMode=true. It looks like one of the server instances is
immune to the issue, whilst the most newly created one gets the leak, with a
lot of partition exchanges happening for EVT_NODE_JOINED and EVT_NODE_LEFT
(one of the nodes doesn't get any of these partition exchanges; however, the
exact server node that gets this can alternate, so it's not linked to one node
in particular but seems to be linked to the most newly launched server).


ilya.kasnacheev wrote
> Hello!
> 
> How many nodes do you have in your cluster?
> 
> From the dump it seems that the number of server nodes is in thousands. Is
> this the case?
> 
> Regards,
> -- 
> Ilya Kasnacheev
> 
> 
> пт, 8 нояб. 2019 г. в 10:26, mvkarp 

> liquid_ninja2k@

> :
> 
>> Let me know if these help or if you need anything more specific.
>> recoveryBallotBoxes.zip
>> <
>> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/recoveryBallotBoxes.zip>
>>
>>
>>
>> ilya.kasnacheev wrote
>> > Hello!
>> >
>> > Can you please check whether there are any especially large objects
>> inside
>> > recoveryBallotBoxes object graph? Sorting by retained heap may help in
>> > determining this. It would be nice to know what is the type histogram
>> of
>> > what's inside recoveryBallotBoxes and where the bulk of heap usage
>> > resides.
>> >
>> > Regards,
>> > --
>> > Ilya Kasnacheev
>> >
>> >
>> > чт, 7 нояб. 2019 г. в 06:23, mvkarp 
>>
>> > liquid_ninja2k@
>>
>> > :
>> >
>> >> I've attached another set of screenshots, might be more clear.
>> >> heap.zip
>> >> 
>> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip;
>> >>
>> >>
>> >> mvkarp wrote
>> >> > I've attached some extra screenshots showing what is inside these
>> >> records
>> >> > and path to GC roots. heap.zip
>> >> > 
>> >> http://apache-ignite-users.70518.x6.nabble.com/file/t2658/heap.zip;
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>> >>
>>
>>
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: How to insert data?

2019-11-08 Thread Ivan Pavlukhin
Hi Boris,

Perhaps the confusing point here is that Ignite separates the key and value
parts in its KV storage, and there is always an underlying KV storage in
Ignite. So you cannot have the cache key and value in a single POJO. I tried
your class with annotations and ran INSERTs with it. The trick here is the
"_key" column.
class DataX {
    @QuerySqlField
    private Integer key;
    @QuerySqlField
    private String value;

    public DataX(int key, String value) {
        this.key = key;
        this.value = value;
    }
    public int getKey() {
        return key;
    }
    public void setKey(int key) {
        this.key = key;
    }
    public String getValue() {
        return value;
    }
    public void setValue(String value) {
        this.value = value;
    }
}

IgniteCache cache = ignite.createCache(new CacheConfiguration<>(DEFAULT_CACHE_NAME)
    .setIndexedTypes(Integer.class, DataX.class));

cache.query(new SqlFieldsQuery("insert into DataX(_key, key, value) values(1, 42, 'value42')"));

System.err.println(cache.query(new SqlFieldsQuery("select * from DataX")).getAll());

Wed, 6 Nov 2019 at 11:46, BorisBelozerov :
>
> When I run your code on one node, it runs OK.
> But when I run your code in a cluster, it can't run. I can select but can't
> insert or do other things.
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/



-- 
Best regards,
Ivan Pavlukhin