This sounds strange. There definitely should be a cause of such behaviour.
Rebalancing happens only after a topology change (node join/leave,
deactivation/activation).
Could you please share logs from the node with the exception you mentioned
in the message, from the node with id "5423e6b5-c9be-4eb8-8f68-e643357ec2b3",
and from the coordinator (oldest) node (you can find it by grepping for
"crd=true" in the logs), so we can find the root cause of this behaviour?
Cache configurations / data storage configurations would also be very
useful for debugging.
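
To show the kind of details that matter here, below is a minimal sketch of an
Ignite configuration with native persistence and an explicit
PartitionLossPolicy. The cache name, backup count and region size are
placeholders for illustration only, not your actual settings.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.cache.PartitionLossPolicy;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class ConfigExample {
        public static void main(String[] args) {
            // Persistence enabled: partition data is kept in part*.bin files
            // on disk, so rebalancing and partition states matter on restart.
            DataStorageConfiguration storageCfg = new DataStorageConfiguration();
            storageCfg.getDefaultDataRegionConfiguration()
                .setPersistenceEnabled(true)
                .setMaxSize(2L * 1024 * 1024 * 1024); // placeholder region size (2 GB)

            // READ_ONLY_SAFE: writes are rejected and reads of keys in lost
            // partitions fail until the lost partitions are reset/recovered.
            CacheConfiguration<Integer, String> cacheCfg =
                new CacheConfiguration<>("myCache"); // placeholder cache name
            cacheCfg.setCacheMode(CacheMode.PARTITIONED)
                .setBackups(1)
                .setPartitionLossPolicy(PartitionLossPolicy.READ_ONLY_SAFE);

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setDataStorageConfiguration(storageCfg)
                .setCacheConfiguration(cacheCfg);

            Ignite ignite = Ignition.start(cfg);
            ignite.cluster().active(true); // persistent clusters must be activated
        }
    }

With a configuration like this at hand it is much easier to say whether a
stuck MOVING partition can actually lose data (e.g. backups=0 vs backups>=1).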

1) If rebalancing didn't happen, you should notice MOVING partitions in your
cache groups (from the metrics MXBeans or Visor); see the JMX sketch after
this list. Whether you can still write data to such partitions and read from
them depends on the PartitionLossPolicy configured for your caches. If you
have at least one owner (OWNING state) for each such partition, there is no
data loss. Such MOVING partitions will be properly rebalanced after the node
restarts, and the data will become consistent between primary and backup
partitions.
2) If part*.bin files are corrupted, you may only notice it during a node
restart, during a subsequent cluster deactivation/activation, or if you have
less RAM than your data size and the node swaps (replaces) pages to/from
disk. In normal cluster operation this is undetectable, since all the data
is held in RAM.
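
As a rough illustration of point 1, here is a sketch of reading the
MOVING/OWNING partition counts over JMX on a node. The MBean group and
attribute names ("Cache groups", LocalNodeMovingPartitionsCount,
LocalNodeOwningPartitionsCount) are assumptions based on Ignite 2.x cache
group metrics and may differ between versions; the same numbers are visible
from Visor or any JMX console.

    import java.lang.management.ManagementFactory;
    import java.util.Set;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class MovingPartitionsCheck {
        public static void main(String[] args) throws Exception {
            // Local platform MBean server; for a remote node, connect through
            // a JMXConnector using the node's JMX service URL instead.
            MBeanServer srv = ManagementFactory.getPlatformMBeanServer();

            // Assumed ObjectName pattern for Ignite cache group metrics MBeans.
            Set<ObjectName> beans = srv.queryNames(
                new ObjectName("*:group=\"Cache groups\",*"), null);

            for (ObjectName bean : beans) {
                // Assumed attribute names on the cache group metrics MXBean.
                Object moving = srv.getAttribute(bean, "LocalNodeMovingPartitionsCount");
                Object owning = srv.getAttribute(bean, "LocalNodeOwningPartitionsCount");

                System.out.println(bean.getKeyProperty("name")
                    + ": MOVING=" + moving + ", OWNING=" + owning);
            }
        }
    }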


Wed, 26 Dec 2018 at 13:44, aMark <feku.fa...@gmail.com>:

> Thanks Pavel for prompt response.
>
> I can confirm that node "5423e6b5-c9be-4eb8-8f68-e643357ec2b3" did not go
> down (and neither did any other node in the cluster), so I am not sure how
> the stale data cropped up on a few nodes. And this type of exception is
> coming from every server node in the cluster.
>
> What happens if rebalancing did not happen properly due to this exception;
> could it lead to data loss?
> Does data get corrupted in the part*.bin files (in the persistent store) of
> the Ignite cache due to this exception?
>
> Thanks,
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
