Hello!
2-4 kilobytes is not big. Still, you may want to check your logs for
long-running transactions, etc.
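
As a rough sketch of such a log check, the snippet below filters log lines for suspicious phrases. The first phrase is the error text quoted in this thread; the second is only an assumed wording for long-running transaction warnings and should be adjusted to whatever your Ignite version actually prints. The class name LogScan and the log path are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LogScan {
    // Phrases to look for. The first is the error text from this thread;
    // the second is an ASSUMED wording for long-running transaction warnings.
    static final String[] PHRASES = {
        "Checkpoint read lock acquisition has been timed out",
        "long running transaction"
    };

    /** Returns the lines containing any of the phrases (case-insensitive). */
    static List<String> findSuspicious(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            String lower = line.toLowerCase(Locale.ROOT);
            for (String phrase : PHRASES) {
                if (lower.contains(phrase.toLowerCase(Locale.ROOT))) {
                    hits.add(line);
                    break; // report each line at most once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java LogScan /path/to/ignite.log  (the path is hypothetical)
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        findSuspicious(lines).forEach(System.out::println);
    }
}
```

Running it over the oldest available log first may help, since later logs tend to contain only the follow-up errors.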
Regards,
--
Ilya Kasnacheev
On Fri, 25 Oct 2019 at 18:21, ihalilaltun wrote:
> Hi Ilya,
>
> It is almost impossible for us to get thread dumps: since this is a production
> environment, we cannot
Hi Ilya,
It is almost impossible for us to get thread dumps: since this is a production
environment, we cannot use a profiler :(
Our biggest objects range from 2 to 4 kilobytes. We are planning to shrink
the sizes, but the timing for this has not been decided yet.
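
On thread dumps without a profiler: the JDK can produce one through the standard java.lang.management API, or externally with the jstack tool shipped with the JDK. The following is a minimal sketch, assuming code can be run inside the node's own JVM; ThreadDumper is an illustrative name, not part of Ignite.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumper {
    /** Builds a text dump of all live threads in the current JVM. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Request monitor/synchronizer details only where the JVM supports them.
        ThreadInfo[] infos = mx.dumpAllThreads(
            mx.isObjectMonitorUsageSupported(),
            mx.isSynchronizerUsageSupported());
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            // Print full stack traces; ThreadInfo.toString() would truncate them.
            for (StackTraceElement frame : info.getStackTrace())
                sb.append("    at ").append(frame).append('\n');
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}
```

This only sees threads of the JVM it runs in, so several dumps taken a few seconds apart from within the affected node would be needed to spot threads that stay blocked.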
regards.
-
İbrahim Halil Altun
Senior Softwa
Hello!
Ignite operations will use built-in locks even if you don't explicitly use
any. If you have uncommitted transactions or something like that,
checkpoint can't start (and other operations are waiting for it too).
How big are we talking about? I recommend capturing several thread dumps
after
Hi Ilya,
Sorry for the late response. We don't use any lock mechanism in our environment.
We have a lot of put and get operations; as far as I remember, these operations
do not hold locks. In addition to these operations, in many update/put
operations we use CacheEntryProcessor which also does not h
Hello!
Then this error likely means that you have very long operations preventing
checkpoint from starting in time.
Make sure your code does not cause Ignite to hold locks for a prolonged time.
It may work OK without persistence, but if it interferes with checkpoint it
becomes a problem.
Regards,
Hi Ilya,
Yes we have persistence enabled.
The OS is not swapping out Ignite memory, since we have more than enough
resources on the server. The disks used for persistence are SSDs with
96 MB/s read and write speed. Is there any easy way to check if we are
r
Hello!
Unfortunately it's hard to say what happens here, because the oldest log
already starts with error messages and any root causes are clobbered.
Do you have persistence in this cluster? If the answer is yes, are you sure
that OS is not swapping out Ignite memory to disk, and that your disk i
Hi There,
We had an unresponsive cluster today after the following error:
[2019-10-09T07:08:13,623][ERROR][sys-stripe-94-#95][GridCacheDatabaseSharedManager]
Checkpoint read lock acquisition has been timed out.
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$