Sorry, I don't understand the purpose of flushing only the current datanode.

IMO, flush all should mean that every storage group is flushed; in other 
words, flush sg is a subset of flush all.
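
To make the subset relationship concrete, here is a rough sketch. All names 
are hypothetical (they are not IoTDB classes), and the model is simplified so 
that each DataRegion lives on exactly one datanode, ignoring replicas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model for discussion only; none of these names are IoTDB classes.
class FlushScopes {
    // Simplifying assumption: one DataRegion belongs to one storage group
    // and lives on one datanode (replicas are ignored).
    static final class DataRegion {
        final String storageGroup;
        final int regionId;
        final String dataNode;

        DataRegion(String storageGroup, int regionId, String dataNode) {
            this.storageGroup = storageGroup;
            this.regionId = regionId;
            this.dataNode = dataNode;
        }
    }

    // "flush all": every DataRegion in the cluster.
    static List<DataRegion> flushAll(List<DataRegion> cluster) {
        return new ArrayList<>(cluster);
    }

    // "flush sg": only the DataRegions of the given storage group.
    static List<DataRegion> flushSg(List<DataRegion> cluster, String sg) {
        List<DataRegion> out = new ArrayList<>();
        for (DataRegion r : cluster) {
            if (r.storageGroup.equals(sg)) {
                out.add(r);
            }
        }
        return out;
    }

    // "flush" on the current datanode: only the DataRegions hosted on that node.
    static List<DataRegion> flushDataNode(List<DataRegion> cluster, String node) {
        List<DataRegion> out = new ArrayList<>();
        for (DataRegion r : cluster) {
            if (r.dataNode.equals(node)) {
                out.add(r);
            }
        }
        return out;
    }

    // Made-up layout: root.sg1 is spread over dn1 and dn2, root.sg2 sits on dn1.
    static List<DataRegion> sampleCluster() {
        return Arrays.asList(
            new DataRegion("root.sg1", 1, "dn1"),
            new DataRegion("root.sg1", 2, "dn2"),
            new DataRegion("root.sg2", 3, "dn1"));
    }

    public static void main(String[] args) {
        List<DataRegion> cluster = sampleCluster();
        // flush sg is always covered by flush all: prints true
        System.out.println(
            flushAll(cluster).containsAll(flushSg(cluster, "root.sg1")));
        // flushing dn1 alone does not cover root.sg1: prints false
        System.out.println(
            flushDataNode(cluster, "dn1").containsAll(flushSg(cluster, "root.sg1")));
    }
}
```

In this toy layout, flushing a single datanode leaves part of root.sg1 
unflushed, which is why I find the per-datanode scope hard to explain to a 
user who only sees storage groups.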

For users, the distributed layer is a black box, while the SG is an exposed 
structure. Therefore, CLI commands should not need to be aware of the 
relationship between a datanode and the SGs the user created.

In addition, the flush operation may speed up our restart recovery process. 
For example, once an SG has been flushed successfully, we can mark the 
associated data files to indicate that all copies are consistent at that point 
in time (the relative priority of flush and write matters here). On the next 
restart, we can use this flag to skip the verification step.
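
A minimal sketch of the marker idea, under my own assumptions: the marker is 
an empty file in the SG's data directory, and any later write removes it. The 
file name, the class and the method names are all hypothetical, not existing 
IoTDB code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; the marker name and all methods are assumptions.
class FlushMarker {
    static final String MARKER = "flush.consistent"; // made-up file name

    // After a successful flush of the SG, record that its files are consistent.
    static void onFlushSuccess(Path sgDir) {
        try {
            Files.write(sgDir.resolve(MARKER), new byte[0]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumption: any write after the flush invalidates the marker.
    static void onWrite(Path sgDir) {
        try {
            Files.deleteIfExists(sgDir.resolve(MARKER));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // On restart, verification can be skipped only if the marker is still there.
    static boolean canSkipVerification(Path sgDir) {
        return Files.exists(sgDir.resolve(MARKER));
    }

    // Helper for trying the sketch out in a scratch directory.
    static Path newScratchDir() {
        try {
            return Files.createTempDirectory("sg-demo");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path sgDir = newScratchDir();
        onFlushSuccess(sgDir);
        System.out.println(canSkipVerification(sgDir)); // true
        onWrite(sgDir);
        System.out.println(canSkipVerification(sgDir)); // false
    }
}
```

The ordering between onWrite and onFlushSuccess is exactly where the flush 
and write priorities would need to be defined.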

In summary, here are my questions and thoughts:
1. Is it necessary to flush a single datanode? What are the benefits?  
2. Can the flush operation interact with the consensus group or the WAL to 
enable a quick restart?  

BR,
-----------------------------------
Sijia Li


-----Original Message-----
From: Jialin Qiao <[email protected]> 
Sent: May 23, 2022 11:07
To: [email protected]
Subject: Flush function in cluster

Hi,

Flush is a frequently used command in IoTDB, which flushes memtables to disk 
and closes all TsFiles.

In the new cluster, we need to redefine this function [1].

* flush: flush the current datanode

* flush all/cluster: flush all datanodes

* flush sg: flush all DataRegions of a storage group


What do you think?

[1] https://issues.apache.org/jira/browse/IOTDB-3099

—————————————————
Jialin Qiao
Apache IoTDB PMC
