Hi,

this is the final patch-series to make dlm reliable when re-connection
occurs. You can easily generate a couple of re-connections by running:

tcpkill -9 -i $IFACE port 21064

on your own to test these patches. At some point dlm will detect message
drops and will re-transmit messages if necessary. The series introduces a
new dlm protocol behaviour and increases the dlm protocol version. I
tested it with SCTP as well and tried to be backwards compatible with dlm
protocol version 3.1. However, I don't recommend mixing these versions in
a setup at all, since dlm version 3.2 fixes long-term issues.

- Alex

changes since v4:

- add a big midcomms file header comment explaining the idea behind the
  midcomms layer and how it works.
- add the close mutex lock to prevent running the close API call while a
  connection is being terminated. However, when a close call occurs it
  will terminate the current termination wait until the close lock is
  released. If the node is removed from the nodes hash the lowcomms close
  call will occur anyway. I added a define to insert some sleep to test
  this behaviour.

changes since v3:

- align dlm messages to an 8 byte boundary size (more pads), because
  there are uint64_t fields and we should be prepared for future 8 byte
  fields. This makes them directly aligned to 4 and 2 as well.
- change unaligned memory access handling. I will not fix it yet. It
  seems nobody is using dlm on an architecture which cannot handle
  unaligned memory access at all (panics). However, I added a note that
  this is a known problem. There is a slight performance improvement
  (it depends on many things, e.g. if another message gets allocated
  after a message with a (len % 8) != 0 length got allocated). However,
  I saw that such cases occur rarely (for now some user space messages
  only). The receiving side is not the problem here; the sending side is,
  and we run into unaligned memory access in dlm message fields there as
  well. However, fixing the sending side will fix the receiving side, and
  more length checks can then be applied to drop invalid message lengths.
- be sure to remove the node from the hash first at close call. I am a
  little bit worried about the midcomms/lowcomms close call being made
  while the timer is running at exactly this time and maybe begins to
  re-transmit messages. I thought about stopping/starting the timer, but
  I ended up removing the node from the hash first and making sure that
  no readers are left when calling lowcomms close. I think this should be
  fine because we "should" not receive any dlm messages from this node
  while close is running.
- add patch "fs: dlm: add per node receive flush". As I was worried that
  the lowcomms close call flushes the receive work on a socket close and
  we already removed the node from the hash, I added functionality to
  flush the receive work right before we remove the node. With this
  functionality we make sure we don't receive any messages after we
  removed the node from the hash.
- add patch "fs: dlm: remove obsolete code and comment"
- add patch "fs: dlm: check for invalid namelen"

changes since v2:

- add patch "fs: dlm: set connected bit after accept"
- add patch "fs: dlm: set subclass for othercon sock_mutex"
- change title "fs: dlm: public utils header utils" to
  "fs: dlm: public header in out utility"
- squash "fs: dlm: add check for minimum allocation length" into
  "fs: dlm: remove unaligned memory access handling"
- make the midcomms timeout a little bit longer, because I saw that
  sometimes it's not enough (I hope that was the reason)
- midcomms: fix version mismatch handling
- remove DLM_ACK in invalid sequence handling
- add additional length check in dlm_opts_check_msglen()
- use optlen to skip DLM_OPTS header
- add DLM_MSGLEN_IS_NOT_ALIGNED to check if msglen is properly aligned
  before parsing
- change dlm_midcomms_close() to close first and then cut queues, because
  lowcomms close may flush some messages which need to be dropped
  afterwards if seq doesn't fit.
- remove newline in "fs: dlm: add more midcomms hooks"
- maybe more changes which I didn't keep track of.
- change defines handling for calculating max application buffer size vs
  max allocation size
- run aspell on my commit msgs

Alexander Aring (20):
  fs: dlm: set connected bit after accept
  fs: dlm: set subclass for othercon sock_mutex
  fs: dlm: add errno handling to check callback
  fs: dlm: add check if dlm is currently running
  fs: dlm: change allocation limits
  fs: dlm: public header in out utility
  fs: dlm: use GFP_ZERO for page buffer
  fs: dlm: simplify writequeue handling
  fs: dlm: add more midcomms hooks
  fs: dlm: make buffer handling per msg
  fs: dlm: make new buffer handling softirq ready
  fs: dlm: add functionality to re-transmit a message
  fs: dlm: move out some hash functionality
  fs: dlm: remove unaligned memory access handling
  fs: dlm: add union in dlm header for lockspace id
  fs: dlm: add per node receive flush
  fs: dlm: add reliable connection if reconnect
  fs: dlm: don't allow half transmitted messages
  fs: dlm: remove obsolete code and comment
  fs: dlm: check for invalid namelen

 fs/dlm/config.c       |   60 +-
 fs/dlm/dlm_internal.h |   41 +-
 fs/dlm/lock.c         |   16 +-
 fs/dlm/lockspace.c    |    5 +-
 fs/dlm/lowcomms.c     |  288 +++++++---
 fs/dlm/lowcomms.h     |   27 +-
 fs/dlm/member.c       |   16 +
 fs/dlm/member.h       |    1 +
 fs/dlm/midcomms.c     | 1266 +++++++++++++++++++++++++++++++++++++++--
 fs/dlm/midcomms.h     |   10 +
 fs/dlm/rcom.c         |   61 +-
 fs/dlm/recoverd.c     |    3 +
 fs/dlm/user.c         |    3 +
 fs/dlm/util.c         |   10 +-
 fs/dlm/util.h         |    2 +
 15 files changed, 1628 insertions(+), 181 deletions(-)

-- 
2.26.2
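
For illustration only, the 8 byte padding and the "(len % 8) != 0" check
mentioned in the changelog can be sketched in plain userspace C as below.
This is a sketch under my own assumptions, not code from the patches:
example_pad8() and example_len_is_not_aligned() are hypothetical names and
are not the kernel helpers or the DLM_MSGLEN_IS_NOT_ALIGNED macro itself.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patches): round a message length up
 * to the next multiple of 8, so that a following message starts on an
 * 8 byte boundary. A length that is already a multiple of 8 is unchanged.
 */
static size_t example_pad8(size_t len)
{
	return (len + 7) & ~(size_t)7;
}

/*
 * Hypothetical check (not from the patches): returns 1 if the length is
 * not a multiple of 8, 0 otherwise. A receiver could use such a check to
 * drop a message before parsing any of its fields.
 */
static int example_len_is_not_aligned(size_t len)
{
	return (len % 8) != 0;
}
```

A length that is a multiple of 8 is automatically a multiple of 4 and 2
as well, which is why padding to 8 also covers the smaller field sizes.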