Retaining local PDX types in the client after a disconnect will cause problems, as you observed. Take a look at the “ON_DISCONNECT_CLEAR_PDXTYPEIDS” property to improve this.
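A minimal sketch of turning the flag on, assuming it is read as a boolean JVM system property under Geode's usual "gemfire." prefix and that it has to be set before the client cache and its pools are created. The class and method names here are only for illustration; please check the exact property name against your Geode version.

```java
public class ClearPdxTypeIdsOnDisconnect {
    // Assumption: full name is the "gemfire." prefix plus the name quoted above.
    static final String PROPERTY = "gemfire.ON_DISCONNECT_CLEAR_PDXTYPEIDS";

    /** Sets the flag for this JVM; call this before creating the ClientCache. */
    static void enable() {
        System.setProperty(PROPERTY, "true");
    }

    /** Reads the flag back the way boolean system properties are normally read. */
    static boolean isEnabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        enable();
        System.out.println(PROPERTY + "=" + isEnabled());
        // Create the ClientCache only after this point.
    }
}
```

The same effect should be achievable with -Dgemfire.ON_DISCONNECT_CLEAR_PDXTYPEIDS=true on the client's java command line, under the same assumption about the property name.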
Anthony

> On May 4, 2021, at 4:36 AM, Mario Salazar de Torres
> <mario.salazar.de.tor...@est.tech> wrote:
>
> Hi everyone,
>
> While debugging some coredumps in the native client related to
> PdxTypeRegistry cleanup, I tried to reproduce the scenario with the Java
> client API to see how it was handled.
> The thing is, I've noticed that this scenario in the Java client might
> lead to Geode storing a corrupted entry, meaning that queries won't work
> on regions containing corrupted entries.
> By corrupted entries, I mean entries that reference a missing PdxType.
> The scenario involves a cluster restart. It's described below:
>
> 1. Start a cluster with 1 locator and 3 servers, with persistence
>    disabled for PdxTypes.
> 2. Set up a region called "test-region" with persistence disabled. It
>    doesn't matter whether it is replicated or partitioned.
> 3. In the client, instantiate the client region with the PROXY region
>    shortcut and establish the connection to the cluster.
> 4. In the client, create a PdxInstance and put it into "test-region"
>    with key "test".
> 5. In the client, get the entry whose key is "test", which turns out to
>    be the PdxInstance inserted in step 4.
> 6. At this point, the cluster is restarted, meaning that all the data is
>    lost, including PdxTypes.
> 7. In the client, the PdxInstance obtained in step 5 is put into
>    "test-region" with key "test2".
> 8. In the client, the following query is executed: "SELECT * FROM
>    /test-region WHERE value = -1".
>    The query fails with the message "Unknown pdx type=<PdxType ID>" and
>    it won't work until the corrupted entry is removed.
>
> Also, the above scenario could be solved by enabling persistence for
> PdxTypes, but if you have an unrecoverable issue in your cluster and you
> need to spin up a backup, it could happen that the PdxType of the
> PdxInstance obtained in step 5 is not present in the backup, leading to
> the entry being inserted but, yet again, the PdxType being missing.
>
> It's worth mentioning that in the native client this scenario currently
> results in a coredump, but no data corruption, given that after losing
> the connection to the cluster the PdxTypeRegistry is cleaned up and
> PdxTypes are obtained by their ID, rather than by directly using the
> object.
>
> My questions here are:
>
> * Have you seen this issue before?
> * Is there a way to verify that PdxTypes are present in the cluster
>   before writing an entry which holds some PdxInstances?
>
> Thanks,
> Mario.
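
For anyone following along, the failure mode in steps 4 to 8 of the quoted scenario can be modeled without a cluster. The sketch below is a toy in-memory model, not Geode's API: Server, Entry and every method name are hypothetical stand-ins. It only illustrates why a value that carries a stale type ID is accepted on put but breaks a later query once the registry has been wiped.

```java
import java.util.HashMap;
import java.util.Map;

public class StalePdxTypeModel {
    /** Stand-in for a serialized value that refers to its type only by ID. */
    static final class Entry {
        final int typeId;
        final int value;
        Entry(int typeId, int value) { this.typeId = typeId; this.value = value; }
    }

    /** Hypothetical server: a type registry and one region, both non-persistent. */
    static final class Server {
        final Map<Integer, String> typeRegistry = new HashMap<>();
        final Map<String, Entry> region = new HashMap<>();
        private int nextTypeId = 1;

        int registerType(String typeName) {
            int id = nextTypeId++;
            typeRegistry.put(id, typeName);
            return id;
        }

        /** Stores the entry without checking that its type ID is registered. */
        void put(String key, Entry entry) { region.put(key, entry); }

        /** Models "WHERE value = wanted": every entry's type must be resolvable. */
        int countWhereValueEquals(int wanted) {
            int matches = 0;
            for (Entry e : region.values()) {
                if (!typeRegistry.containsKey(e.typeId)) {
                    throw new IllegalStateException("Unknown pdx type=" + e.typeId);
                }
                if (e.value == wanted) matches++;
            }
            return matches;
        }

        /** Restart with persistence disabled: data and types are both lost. */
        void restart() { typeRegistry.clear(); region.clear(); }
    }

    static String runScenario() {
        Server server = new Server();
        int typeId = server.registerType("TestType");   // step 4: type registered on first put
        server.put("test", new Entry(typeId, -1));
        Entry fetched = server.region.get("test");      // step 5: client keeps the instance
        server.restart();                               // step 6
        server.put("test2", fetched);                   // step 7: stale type ID is accepted
        try {
            return "matches=" + server.countWhereValueEquals(-1);  // step 8
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runScenario());  // prints "Unknown pdx type=1"
    }
}
```

In this model, the put in step 7 would have to check the registry (or the client would have to drop its cached type IDs on disconnect and re-register the type) for the query in step 8 to succeed.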