Folks, I have a conceptual question about how secondary OSDs are selected in Ceph.
I have a cluster that consists of three nodes. For the sake of argument, I have simulated latency between the third node and the other two using tc and netem. I have set the primary affinity of the OSDs on the third node to 0, and indeed RADOS no longer uses any of these OSDs as a primary OSD, so this part works as expected.

Furthermore, my expectation for a pool with size=3 and min_size=2 is that for any given write, the primary OSD on node 1 or node 2 will select one secondary OSD on the other of these two nodes (node 2 or node 1, respectively) and one on node 3. This would lead me to believe that a client writing into the cluster from node 1 only ever incurs the latency between node 1 and node 2 as an actual performance penalty, because:

* the client selects a primary OSD on node 1 or node 2
* the primary OSD selects the secondary OSDs and starts the transfers in parallel
* the write to the lower-latency OSD finishes much sooner than the write to the other OSD, and the write acknowledgement is then sent to the client, because min_size=2

But that appears not to be the case. Primary affinity has a very slight impact, but the overall performance when writing into the cluster with a queue depth of 1 and a request size of 4k still very much resembles a scenario in which every single write is penalized with the latency between nodes 1/2 and node 3.

Where is my understanding incorrect? Or are there any configuration settings for this? I tried to search for this, but the only results I can find refer to primary affinity. I guess I am looking for something like a "secondary affinity", but I do not think that such a thing exists in Ceph, which leads me to believe that my understanding of this is seriously wrong somehow.

Any hint will be greatly appreciated. Thank you very much in advance.
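For reference, the latency model behind my expectation can be sketched roughly as follows. This is only an illustration of my reasoning, not a description of how Ceph actually behaves: the function name `ack_latency_ms` and the latency figures are made up for the example, and the primary's local write is assumed to cost nothing.

```python
def ack_latency_ms(replica_latencies_ms, copies_required):
    """Time until `copies_required` copies exist, with replicas written in parallel.

    Assumption for illustration: the primary's own copy is counted as 0 ms.
    """
    completion_times = sorted([0.0] + list(replica_latencies_ms))
    return completion_times[copies_required - 1]


# Hypothetical latencies: 0.2 ms to the nearby node, 10 ms to the node behind netem.
replica_latencies = [0.2, 10.0]

# What I expected: the client is acknowledged once min_size=2 copies exist.
expected = ack_latency_ms(replica_latencies, copies_required=2)

# What my measurements resemble: the client is acknowledged once all size=3 copies exist.
measured_like = ack_latency_ms(replica_latencies, copies_required=3)

print(f"expected per-write latency:       {expected} ms")
print(f"latency resembling my benchmarks: {measured_like} ms")
```

With these made-up numbers, the first model is bounded by the fast link and the second by the slow one, which is the gap I am trying to understand.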
Best regards
Martin

--
Martin Gerhard Loschwitz
Geschäftsführer / CEO, True West IT Services GmbH
