** Patch added: "utopic_multipath-tools_0.4.9-3ubuntu9.debdiff" https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172259/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff
** Description changed:

- [Impact]
- * Multipath can cause segmentation fault due to wrong code and can
- possibly cause user to lose access to multipath devices.
+ * Multipath can cause segmentation fault due to wrong code and can
+ possibly cause user to lose access to multipath devices.

  [Test Case]

  * Working on it.

  [Regression Potential]

- * Fix based on upstream code (96f8146) Tag 0.5.0 already functioning.
- * Introducing mutex, logic to deal with already dead pthread and other
- way to access same data (instead of accessing other time lived
- structure).
+ * Patch 1/4 tries to fix the issue. Patch 2/4 fixes the 1/4.
+ * Patch 3/4 discovers 1/4 was no good. Patch 4/4 fixes 3/4.
+
+ * Fix based on upstream code (96f8146) + subsequent patches.
+ * Followed this code development until the issue was addressed.

  [Other Info]

- * Original bug description:
+ * Original bug description:

  ----------------

  It was brought to me (~inaddy) the following situation with multipathd:

  #####

  Program terminated with signal 6, Aborted.
  #0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
  (gdb) bt
  #0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #1 0x00007fbc6ae0cbab in abort () from /lib/x86_64-linux-gnu/libc.so.6
  #2 0x00007fbc6ae0210e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  #3 0x00007fbc6ae021b2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #4 0x00007fbc6b849efb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
  #5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
  #6 0x00007fbc6b1cc25a in waitevent (et=0x1691de0) at waiter.c:204
  #7 0x00007fbc6b847e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
  #8 0x00007fbc6aec54bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
  #9 0x0000000000000000 in ?? ()

  --------------------------------------------------------------------------------------------

  #5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
  44 lock(wp->vecs->lock);
  (gdb) print wp->vecs->lock
  $1 = {mutex = 0x168c280, depth = 1}

  In pthread_mutex_lock.c:62 there's an assert that fails:

  #4 0x00007fbc6b849efb in __pthread_mutex_lock (mutex=0xfefefefefefefeff) at pthread_mutex_lock.c:62
  62 assert (mutex->__data.__owner == 0);

  In this run:

  (gdb) p *wp->vecs->lock->mutex
  $3 = {__data = {__lock = 1, __count = 0, __owner = 49, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0xffffffff}},
  __size = "\001\000\000\000\000\000\000\000\061", '\000' <repeats 23 times>, "\377\377\377\377\000\000\000", __align = 1}

  so __owner is 49 and not 0. Note that 49 is somewhat strange; it's
  expected to be a pid_t obtained via

  pid_t id = THREAD_GETMEM (THREAD_SELF, tid);

  According to https://bugzilla.redhat.com/show_bug.cgi?id=570278 , this
  assert failure could be an expected behaviour if, for some reason, the
  multipath code was trying to release a mutex that has already been
  freed.
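
  To illustrate the "mutex already freed" scenario in general terms, here
  is a minimal sketch of one common way to keep a lock alive until its
  last user is done: a reference count stored next to the mutex. This is
  NOT the upstream fix (96f8146) and not multipath-tools code; the type
  `struct shared_lock` and the functions `shared_lock_new`,
  `shared_lock_get` and `shared_lock_put` are hypothetical names used
  only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical type for illustration; not part of multipath-tools. */
struct shared_lock {
	pthread_mutex_t mutex;
	int refs;		/* number of holders, protected by mutex */
};

/* Allocate a lock owned by exactly one holder (the creator). */
struct shared_lock *shared_lock_new(void)
{
	struct shared_lock *l = malloc(sizeof(*l));

	if (!l)
		return NULL;
	if (pthread_mutex_init(&l->mutex, NULL) != 0) {
		free(l);
		return NULL;
	}
	l->refs = 1;
	return l;
}

/*
 * Take an extra reference. The caller must already hold a valid
 * reference, so the lock cannot disappear while this runs.
 */
void shared_lock_get(struct shared_lock *l)
{
	pthread_mutex_lock(&l->mutex);
	l->refs++;
	pthread_mutex_unlock(&l->mutex);
}

/*
 * Drop one reference. The mutex is destroyed and the memory freed only
 * when the last holder lets go. Returns 1 if the lock was freed.
 */
int shared_lock_put(struct shared_lock *l)
{
	int last;

	pthread_mutex_lock(&l->mutex);
	assert(l->refs > 0);
	last = (--l->refs == 0);
	pthread_mutex_unlock(&l->mutex);

	if (last) {
		pthread_mutex_destroy(&l->mutex);
		free(l);
	}
	return last;
}
```

  With a scheme like this, the creator would call shared_lock_get() on
  behalf of each thread before starting it, and each thread would call
  shared_lock_put() as its last action, so no thread could reach
  pthread_mutex_lock() on memory that had already been released.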
  The multipath-tools package is up to date (0.4.9-3ubuntu5).

  I do not find anything obviously related in
  http://git.opensvc.com/gitweb.cgi?p=multipath-tools%2F.git
  except maybe
  http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=5ee9f716549d913aeb9800041f78ee9c6a50d860

  #####

  Between Precise's version and upstream, the following patches touch
  waiter.c:

  d887f4a = signal waiter thread to stop waiting on dm events
  5ee9f71 = simplify multipath signal handlers
  af4fd6d = Fix race condition in stop_waiter_thread()
  e1fcc59 = multipath: clean up code for stopping the waiter threads
  03ec4ef = multipath: fix shutdown crashes
  4dfdaf2 = multipath: Update multipath device on show topology
  c301a3f = Race condition when calling stop_waiter_thread()
  96f8146 = libmultipath: update waiter handling

  This specific one: 96f8146 (libmultipath: update waiter handling)

  """
  The current 'waiter' structure accesses fields which belong to the main
  'mpp' structure, which has a totally different lifetime.
  """

  shows that, because the two structures have different lifetimes,
  use-after-free segfaults are possible (which seems to be what is
  happening here).

  waiter.c:44 = lock(wp->vecs->lock);

-- 
https://bugs.launchpad.net/bugs/1354114

Title:
  multipath segmentation Fault (libmultipath: update waiter handling)