On 1/9/21 10:23 PM, Zygo Blaxell wrote:
> On a loaded test server, I observed 90th percentile fsync times drop
> from 7 seconds without preferred_metadata to 0.7 seconds with
> preferred_metadata when all the metadata is on the SSDs. If some
> metadata ever lands on a spinner, we go back to almost 7 seconds
> latency again (it sometimes only gets up to 5 or 6 seconds, but it's
> still very bad). We lost our performance gain, so our test resulted
> in failure.
Wow, this is very interesting information: a use case where there is a 10x speed increase! Could you share more details about this server? With more data supporting this patch, we can convince David to include it. [...]
> We could handle all these use cases with two bits:
>
>   bit 0: 0 = prefer data,       1 = prefer metadata
>   bit 1: 0 = allow other types, 1 = exclude other types
>
> which gives 4 encoded values:
>
>   0 = prefer data, allow metadata (default)
>   1 = prefer metadata, allow data (same as v4 patch)
>   2 = prefer data, disallow metadata
>   3 = prefer metadata, disallow data
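To make sure I understand, the encoding above would look more or less like this in C (the identifier names below are invented for illustration only; they are not from any version of the patch):

/* Sketch only -- identifiers invented for illustration. */

/* bit 0: which chunk type the device prefers */
#define ALLOC_PREFER_METADATA   (1 << 0)
/* bit 1: whether the other chunk type is excluded entirely */
#define ALLOC_EXCLUSIVE         (1 << 1)

enum dev_alloc_hint {
        ALLOC_PREFER_DATA_ALLOW_METADATA = 0,                       /* 0: default          */
        ALLOC_PREFER_METADATA_ALLOW_DATA = ALLOC_PREFER_METADATA,   /* 1: as in v4 patch   */
        ALLOC_DATA_ONLY                  = ALLOC_EXCLUSIVE,         /* 2: no metadata here */
        ALLOC_METADATA_ONLY              = ALLOC_PREFER_METADATA |
                                           ALLOC_EXCLUSIVE,         /* 3: no data here     */
};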
What you are suggesting allows the maximum flexibility. However, I still fear that we are mixing two discussions that are unrelated, except that the solution *may* be the same:

1) the first discussion is about the performance increase we get by putting the metadata on the faster disks and the data on the slower ones.

2) the second discussion is about how to avoid the data chunks consuming the space needed by the metadata.

Regarding 2), I think that a more generic approach is something like:

- don't allocate a *data* chunk if the free space left for chunk allocation is less than <X>

where <X> is the maximum size of a metadata chunk (IIRC 1GB?), possibly multiplied by 2x or 3x. The metadata allocation policy, instead, is still constrained only by having enough space. As a further step (to allow a metadata balance command to succeed), we could constrain the metadata allocation policy to allocate up to half of the available space (or 1GB, whichever is smaller). A rough sketch of the data chunk check is at the end of this mail.

Regarding 1), I prefer to keep the patch as simple as possible to increase the likelihood of inclusion. We can always add further constraints afterwards.

Anyway, I am rebasing the patch on the latest kernel. Let me check how complex it would be to implement your algorithm (the two-bit one).

BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
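P.S.: here is the very rough sketch of the data chunk check mentioned above. It is not real btrfs code; all the names (can_alloc_data_chunk, METADATA_HEADROOM, ...) are invented for illustration only.

/* Sketch only -- not actual btrfs code; names invented for illustration. */
typedef unsigned long long u64;

#define SZ_1G             (1024ULL * 1024 * 1024)
/* <X>: maximum size of a metadata chunk (1GB), times a 3x safety margin */
#define METADATA_HEADROOM (3 * SZ_1G)

/*
 * Refuse to allocate a new *data* chunk if it would leave less than
 * METADATA_HEADROOM of unallocated space, so data can never starve
 * future metadata chunk allocations.  Metadata allocation itself is
 * only constrained by having enough free space.
 */
static int can_alloc_data_chunk(u64 unallocated_bytes, u64 data_chunk_size)
{
        return unallocated_bytes >= data_chunk_size + METADATA_HEADROOM;
}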