Hi!

When persistence is enabled, binary metadata is written to disk upon 
registration. Currently this happens in the discovery thread, which makes 
processing of the related messages very slow.
With many nodes and slow disks, registering a single binary type can take 
several minutes. It also blocks processing of other messages.

I propose starting a separate thread that will be responsible for writing 
binary metadata to disk. Binary type registration will then be considered 
finished before information about it is written to disk on all nodes.
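
A minimal sketch of the idea, not actual Ignite code: the class 
AsyncMetadataWriter and all of its methods are hypothetical, and an in-memory 
map stands in for the disk. The discovery thread only updates the in-memory 
registry and queues the write; a dedicated thread performs it later.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; the names are illustrative and not part of Ignite. */
class AsyncMetadataWriter {
    /** In-memory registry, updated synchronously by the discovery thread. */
    private final Map<Integer, byte[]> registered = new ConcurrentHashMap<>();

    /** Stand-in for the on-disk store; a real implementation would write files. */
    private final Map<Integer, byte[]> disk = new ConcurrentHashMap<>();

    /** A single dedicated thread, so writes are applied in submission order. */
    private final ExecutorService writer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "binary-metadata-writer");
        t.setDaemon(true);
        return t;
    });

    /** Called from the discovery thread: returns without waiting for the disk. */
    void onTypeRegistered(int typeId, byte[] meta) {
        registered.put(typeId, meta);
        writer.execute(() -> disk.put(typeId, meta));
    }

    boolean isRegistered(int typeId) {
        return registered.containsKey(typeId);
    }

    boolean isPersisted(int typeId) {
        return disk.containsKey(typeId);
    }

    /** Drains pending writes; returns true if all finished within the timeout. */
    boolean stop(long timeoutMs) {
        writer.shutdown();
        try {
            return writer.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Between onTypeRegistered() returning and the queued task running, a type is 
registered but not yet persisted. That window is exactly where the consistency 
concern arises.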

The main concern here is data consistency in cases when a node acknowledges 
type registration and then fails before writing the metadata to disk.
I see two parts of this issue:
1. Nodes will have different metadata after restarting.
2. If we write some data into a persisted cache and shut down the nodes before 
   the new binary type is written to disk, then after a restart we won’t have 
   the binary type needed to work with that data.

The first case is similar to the situation when one node fails and a new type 
is registered in the cluster after that. This issue is resolved by the 
discovery data exchange: all nodes receive information about all binary types 
in the initial discovery messages sent by other nodes. So, once you restart a 
node, it will receive from other nodes the metadata that it failed to write to 
disk.
If all nodes shut down before finishing writing the metadata to disk, then 
after a restart the type will be considered unregistered, so another 
registration will be required.

The second case is a bit more complicated, but it can be resolved by making 
the discovery thread on every node create a future that is completed when 
writing to disk is finished. Every node will then have a future that reflects 
the current state of persisting the metadata to disk.
After that, if some operation needs this binary type, it will have to wait on 
that future until flushing to disk is finished.
This way discovery threads won’t be blocked, but other threads that actually 
need this type will be.
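
The future-based approach could look roughly like this. It is only a sketch 
under assumptions: MetadataWriteFutures and its method names are hypothetical, 
and error handling for failed writes is left out.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; the names are illustrative and not part of Ignite. */
class MetadataWriteFutures {
    /** One future per type id, completed when the local disk write finishes. */
    private final Map<Integer, CompletableFuture<Void>> futs = new ConcurrentHashMap<>();

    /** Discovery thread: creates the future at registration and does not block. */
    CompletableFuture<Void> onTypeRegistered(int typeId) {
        return futs.computeIfAbsent(typeId, id -> new CompletableFuture<>());
    }

    /** Writer thread: signals that the type has reached the disk. */
    void onWriteFinished(int typeId) {
        CompletableFuture<Void> fut = futs.get(typeId);
        if (fut != null)
            fut.complete(null);
    }

    /**
     * Any thread that needs the type: blocks until the write is finished.
     * Returns immediately if no write was ever tracked for this type id.
     */
    void awaitPersisted(int typeId) {
        CompletableFuture<Void> fut = futs.get(typeId);
        if (fut != null)
            fut.join();
    }
}
```

Only callers of awaitPersisted() pay for a slow disk; the thread calling 
onTypeRegistered() never waits.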

Please let me know what you think about that.

Denis
