Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-02-05 Thread lixiaokeng
Hi Martin, I have removed multipathd_query in my test script. And a flock is added before/after reconfigure() and iscsi login/out. Sequence of events: (1)iscsi log out /dev/sdi(36001405b7679bd96b094bccbf971bc90) is removed. multipath -r: sdi->fd is closed. ref of sdi becomes 0. (2)iscsi log i

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-02-04 Thread Martin Wilck
On Thu, 2021-02-04 at 19:25 +0800, lixiaokeng wrote: > > Hi Martin, > > On 2021/1/27 7:11, Martin Wilck wrote: > > So we can only conclude that (if there's no kernel refcounting bug, > > which I doubt) either orphan_path()->uninitialize_path() had been > > called (closing the fd),  or that openin

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-02-04 Thread lixiaokeng
Hi Martin, On 2021/1/27 7:11, Martin Wilck wrote: > So we can only conclude that (if there's no kernel refcounting bug, > which I doubt) either orphan_path()->uninitialize_path() had been > called (closing the fd), or that opening the sd device had failed in > the first place (in which case the

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-28 Thread Martin Wilck
On Thu, 2021-01-28 at 16:27 +0800, lixiaokeng wrote: >    I don't know why dm-5 is destroyed. I doubt there may be some > issue in the kernel that I add some print. I have tested this on > three computers, but the other two have no problem (they have been > running for 96h and for 48h respectively).

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-28 Thread lixiaokeng
> >> >> (1)multipath -r: The sdf is found as a path of >> 36001405b7679bd96b094bccbf971bc90 >> (iscsi node is 4:0:0:2) >> Here is a log "dm destory name dm-5; majir:minor 253:5; dm-5" (dm-5 36001405b7679bd96b094bccbf971bc90) in system time [1202538.163972]. I added this print in dm_destroy. >

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-26 Thread Martin Wilck
On Tue, 2021-01-26 at 19:14 +0800, lixiaokeng wrote: > > > > Hi, > > >   Unfortunately the verify_paths() called before *and* after > > > domap() > > > in > > > coalesce_paths can't solve this problem. I think it is another > > > way to > > > lead multipath with wrong path, but now I can't find the

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-26 Thread lixiaokeng
>> Hi, >>   Unfortunately the verify_paths() called before *and* after domap() >> in >> coalesce_paths can't solve this problem. I think it is another way to >> lead multipath with wrong path, but now I can't find the way from >> log. > > Can you provide multipathd -v3 logs, and kernel logs? Mayb

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-25 Thread lixiaokeng
> > Can you provide multipathd -v3 logs, and kernel logs? Maybe I'll see > something. > Hi Martin, The first scene was not found, so I can't provide any useful logs. I will try again to get more information from the logs. If possible, please use scripts to reproduce the problem. Regards,

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-25 Thread Martin Wilck
Hi lixiaokeng, On Mon, 2021-01-25 at 09:33 +0800, lixiaokeng wrote: > > > > > verify_paths() before *and* after domap(). > > > > > > Can calling verify_paths() before *and* after domap() deal this > > > entirely? > > > Hi, >   Unfortunately the verify_paths() called before *and* after domap() >

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-24 Thread lixiaokeng
>>> verify_paths() before *and* after domap(). >> >> Can calling verify_paths() before *and* after domap() deal this >> entirely? > Hi, Unfortunately the verify_paths() called before *and* after domap() in coalesce_paths can't solve this problem. I think it is another way to lead multipath with

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-20 Thread Martin Wilck
On Wed, 2021-01-20 at 07:02 -0600, Roger Heflin wrote: > > > I don't know if this helps, or is exactly like what he is > duplicating: > > I debugged and verified a corruption issue a few years ago where this > was what happened: > > DiskA was presented at say sdh (via SAN) and a multipath devic

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-20 Thread Martin Wilck
On Wed, 2021-01-20 at 10:30 +0800, lixiaokeng wrote: > Hi Martin: >     Thanks for your reply. > > > > verify_paths() would detect this. We do call verify_paths() in > > coalesce_paths() before calling domap(), but not immediately > > before. > > Perhaps we should move the verify_paths() call dow

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-20 Thread Roger Heflin
> verify_paths() would detect this. We do call verify_paths() in > coalesce_paths() before calling domap(), but not immediately before. > Perhaps we should move the verify_paths() call down to immediately > before the domap() call. That would at least minimize the time window > for this race. It's

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-19 Thread lixiaokeng
Hi Martin: Thanks for your reply. > verify_paths() would detect this. We do call verify_paths() in > coalesce_paths() before calling domap(), but not immediately before. > Perhaps we should move the verify_paths() call down to immediately > before the domap() call. That would at least minimiz

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-19 Thread Martin Wilck
On Mon, 2021-01-18 at 19:08 +0800, lixiaokeng wrote: > Hi >   When we make IO stress test on multipath device, there will > be a  metadata err because of wrong path. > > > Sequence of events: > (1)multipath -r, ip1 logout at same > The load table params of 36001405ca5165367d67447ea68108e1d is > "

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-19 Thread Martin Wilck
Hi Lixiaokeng, On Mon, 2021-01-18 at 19:08 +0800, lixiaokeng wrote: > Hi >   When we make IO stress test on multipath device, there will > be a  metadata err because of wrong path. > thanks for the detailed report and analysis, also for sharing your test scripts! I'll take a closer look asap. M

Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-19 Thread lixiaokeng
Hi Martin, I have tested 0.7.7 and 0.8.5 code. They both have the metadata error. I find I missed some information in last email. Ip1 logged in between (1) and (2) and sdi became a path of 36001405b7679bd96b094bccbf971bc90, but uevents weren't dealt with until (3) finished. The details described are bas

[dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err

2021-01-18 Thread lixiaokeng
Hi When we make IO stress test on multipath device, there will be a metadata err because of wrong path. There are three test scripts. First: #!/bin/bash disk_list="/dev/mapper/3600140531f063b3e19349bc82028e0cc /dev/mapper/36001405ca5165367d67447ea68108e1d /dev/mapper/3600140584e11eb1818c4afab1