1 and 2 are minor cleanups. Patch 3-5 are follow-ups of the structure-freeing cleanup of my previous patch series "Multipath-tools: various bug fixes". The previous series tried to make sure that no freed "struct multipath" objects are ever referenced by any "struct path" objects. To achieve that, it nullified pp->mpp links from all paths in a struct multipath when the latter was discarded in free_multipath(). This is only safe if we make sure that the path referenced by a struct multipath (either via mpp->paths or mpp->pg[i]->paths for some i), aren't freed before the struct multipath itself. Ideally, the orphaning in free_multipath() wouldn't need to happen if we'd make sure that pp->mpp is set *if and only if* pp was referenced by mpp as explained above. I think we do a decent job at that, but it's hard to tell with 100% confidence.
In order to improve confidence, patch 4 and 5 add some logging. It's now an error if a path which is freed still links to a multipath map. Patch 3/5 eliminates a few cases where this could still happen, even though the path was not member of the map. Patch 5 is mainly meant to debug cases pp->mpp was either NULL or pointing to a different map. This shouldn't happen and I haven't observed it in my testing so far. Patch 6 avoids a strange behavior that multipathd has had forever - after grouping paths, mpp->paths was freed. In most of our code, we assume that mpp->paths and mpp->pg should reference the same set of path devices, and we have update_mpp_paths() and sync_paths() to take care of it. But right after path grouping, this is not the case. Subsequent invocations of update_mpp_path() (usually via refresh_multipath()) will re-allocate and fill the paths vector. This causes a lot of unnecessary heap operations. Patch 7 and 8 are unrelated bug fixes which I came across lately. Patch 9 is motivated by the recent patch "libmpathutil: use union for bitfield". I did some further experiments with strict aliasing and found that the multipath-tools code base just should not use it at all. Like the kernel, we use between low-level data structures that don't comply with strict aliasing rules. So like the kernel, we should disable strict aliasing. That basically obsoletes "libmpathutil: use union for bitfield", but having it doesn't hurt, either. Interestingly, switching to -fno-strict-aliasing seems to have fixed [1], which I'd unsuccessfully tried to fix with various methods before. [1] https://github.com/opensvc/multipath-tools/issues/130 Martin Wilck (9): libmultipath: reset pgindex in uninitialize_path() libmultipath: clarify code in set_path_removed() libmultipath: nullify pp->mpp before calling free_path() libmultipath: log error when freeing path that refers to a map libmultipath: sync_paths(): print message when fixing pp->mpp libmultipath: don't free mpp->paths in group_paths() libmultipath: warn only once in scsi_tmo_error_msg() libmultipath: find_hwe(): fix gcc errors at high optimization levels multipath-tools: compile with -fno-strict-aliasing Makefile.inc | 2 +- libmultipath/config.c | 13 ++++++++++--- libmultipath/discovery.c | 3 +++ libmultipath/pgpolicies.c | 2 -- libmultipath/structs.c | 4 ++++ libmultipath/structs_vec.c | 26 +++++++++++++++++--------- tests/pgpolicy.c | 7 +++++-- 7 files changed, 40 insertions(+), 17 deletions(-) -- 2.52.0
