Jeremy,

Thank you for the response. I reviewed the cache properties in GG Control Center and found nothing suggesting that an expiry policy or TTL is configured for the cache. It wasn't set at the operation level, either.
I decided to delete the cache entirely and re-create it. Tomorrow I'll check if it helps.

My best,
Alex Avrutin

On Thu, Feb 22, 2024 at 3:56 AM Jeremy McMillan <[email protected]> wrote:

> First, logging should be configured to at least WARN level, if not INFO.
>
> Ignite manages data internally at the page level. If you see errors about
> pages, these are very low-level Ignite problems. The next level up is
> partitions. Errors involving partitions are mid-to-low-level Ignite problems.
> The next level up is caches. Errors at the cache level are mid-to-high-level
> problems. The next level is cache records. Errors in cache record
> handling are at a high level of abstraction, and the next level is client
> application operations.
>
> The lower the level of abstraction at which errors appear, the lower the
> chance that operations in general will succeed. Since the cache appears to
> operate mostly as expected, and there are no obvious errors in the Ignite
> logs, most likely there is some client-side logic which is deleting records,
> and Ignite does not consider this behavior to be an error.
>
> I would recommend fine-tuning log coverage of the cache delete method. First,
> identify whether the deletion is happening on a client connection thread pool
> or on a thread for server-initiated operations.
>
> My guess is that a client is connecting, getting a cache object, and then
> setting expiration on that cache connection so that all cache adds under
> that cache connection will have expiration applied to them.
>
> https://ignite.apache.org/docs/2.14.0/configuring-caches/expiry-policies#configuration
>
> "You can also change or set Expiry Policy for individual cache operations.
> This policy is used for each operation invoked on the returned cache
> instance."
>
> https://ignite.apache.org/releases/latest/dotnetdoc/api/Apache.Ignite.Core.Client.Cache.ICacheClient-2.html?q=withExpiryPolicy#Apache_Ignite_Core_Client_Cache_ICacheClient_2_WithExpiryPolicy_Apache_Ignite_Core_Cache_Expiry_IExpiryPolicy_
>
> On Wed, Feb 21, 2024, 19:17 Aleksej Avrutin <[email protected]> wrote:
>
>> Hello,
>>
>> A couple of days ago I encountered a strange phenomenon in our
>> application based on Apache Ignite .Net 2.14 with persistence (3 nodes, 1
>> backup per cache).
>> Data in a cache started disappearing for seemingly no reason and the
>> number of records could be halved (220K to 108K) overnight. I spent a
>> couple of days trying to find a problem in the application, crunched
>> hundreds of megabytes of application logs but didn't manage to find a
>> reason to blame the application. Retention/TTL is not set for the cache.
>> Apache Ignite logs with the option -DIGNITE_QUIET=false also don't reveal
>> any anomalies (or I don't know what to look for). The data shares are
>> expected to be durable (based on Azure Disk) and we never had any issues
>> with them.
>> RAM utilisation is normal and there's plenty of available RAM.
>> The Ignite cluster is hosted in a 3 node Kubernetes cluster on Azure.
>>
>> The question is: how would you recommend investigating issues like this?
>> What metrics and logs can I check? Is it possible to log and track
>> individual Remove() operations as well as SQL queries at Ignite engine
>> level?
>>
>> The application has been working on Ignite for years already and we
>> didn't encounter data loss at such a scale before. It's possible that the
>> app wasn't used as extensively before as it is now and the problem went
>> unnoticed.
>>
>> My best,
>> Alex Avrutin
>>
>
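
The hypothesis in the quoted reply (a client takes a cache instance, applies an expiry policy to it, and every add made through that instance then expires, even though the cache configuration itself shows no TTL) can be illustrated with a small toy model. This is only a sketch in plain Python and does not use the Ignite API: `ToyCache`, `FakeClock`, and `with_expiry_policy` are hypothetical names invented for the illustration, and the lazy eviction shown here is an assumption that may differ from how Ignite actually removes expired entries.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class FakeClock:
    """Hypothetical manual clock so the example is deterministic."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class ToyCache:
    """Toy in-memory cache; NOT the Ignite API, only a model of the idea."""

    def __init__(
        self,
        clock: Callable[[], float],
        store: Optional[Dict[Any, Tuple[Any, Optional[float]]]] = None,
        ttl_seconds: Optional[float] = None,
    ) -> None:
        self._clock = clock
        # The store is shared between the base cache and all derived views.
        self._store = {} if store is None else store
        # TTL applies only to writes made through this particular instance.
        self._ttl = ttl_seconds

    def with_expiry_policy(self, ttl_seconds: float) -> "ToyCache":
        """Return a view over the same data whose writes carry a TTL."""
        return ToyCache(self._clock, self._store, ttl_seconds)

    def put(self, key: Any, value: Any) -> None:
        expires_at = None if self._ttl is None else self._clock() + self._ttl
        self._store[key] = (value, expires_at)

    def get(self, key: Any) -> Any:
        self._evict_expired()
        entry = self._store.get(key)
        return None if entry is None else entry[0]

    def size(self) -> int:
        self._evict_expired()
        return len(self._store)

    def _evict_expired(self) -> None:
        # Lazy eviction on access (an assumption made for this sketch).
        now = self._clock()
        expired = [
            k for k, (_, exp) in self._store.items()
            if exp is not None and now >= exp
        ]
        for k in expired:
            del self._store[k]


if __name__ == "__main__":
    clock = FakeClock()
    cache = ToyCache(clock)                 # base cache: no TTL configured
    view = cache.with_expiry_policy(60.0)   # one code path adds a TTL
    cache.put("durable", 1)
    view.put("short-lived", 2)
    print("size before:", cache.size())
    clock.advance(61.0)
    print("size after:", cache.size())
```

In this model the base cache never has a TTL, so inspecting its configuration reveals nothing, yet records written through the derived view vanish once their time is up. If something similar were happening in the real application, the place to look would be code paths that obtain a cache instance with an expiry policy attached rather than the cache configuration.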
