On Fri, Jun 19, 2020 at 01:42:47PM +0000, Martin Wilck wrote:
> On Thu, 2020-06-18 at 18:17 -0500, Benjamin Marzinski wrote:
> > On Thu, Jun 18, 2020 at 07:34:38PM +0000, Martin Wilck wrote:
> > > On Wed, 2020-06-17 at 19:24 -0500, Benjamin Marzinski wrote:
> > > > 
> > > >  static void
> > > > @@ -2088,8 +2084,13 @@ check_path (struct vectors * vecs, struct
> > > > path
> > > > * pp, unsigned int ticks)
> > > >          * Synchronize with kernel state
> > > >          */
> > > >         if (update_multipath_strings(pp->mpp, vecs->pathvec, 1)) {
> > > > +               struct dm_info info;
> > > >                 condlog(1, "%s: Could not synchronize with kernel
> > > > state",
> > > >                         pp->dev);
> > > > +               if (pp->mpp && pp->mpp->alias &&
> > > > +                   do_dm_get_info(pp->mpp->alias, &info) == 0)
> > > > +                       /* multipath device missing. Likely removed */
> > > > +                       return 0;
> > > >                 pp->dmstate = PSTATE_UNDEF;
> > > 
> > > It would be more elegant if we could distinguish different failure
> > > modes from update_multipath_strings() directly, without having to
> > > call
> > > do_dm_get_info() again.
> > 
> > Seems reasonable. I'll take a look at that.
> 
> Yet another idea: I just discussed this with Hannes, and he pointed out
> that right below this code, we have
> 
>       /* if update_multipath_strings orphaned the path, quit early */
>       if (!pp->mpp)
>               return 0;
> 
> ... which could have the same effect, if pp->mpp was reloaded. Probably
> that doesn't happen because the pp->mpp dereference is cached in a
> register, but we could simply add a READ_ONCE there.

When update_multipath_strings() calls update_multipath_table() it will
fail because the table no longer exists.  If we differentiate between
a dm error and not finding the map, we can exit early without having to
call do_dm_get_info() again. But right now, if the map is gone, but we
haven't got the uevent removing it, then nothing will clear pp->mpp. If
we did get the uevent, then it must have grabbed the vecs lock. That
better have caused a memory barrier, which will guarantee that
check_path() sees the updated value.

-Ben
 
> Choose what you prefer.
> 
> Regards,
> Martin
> 
> -- 
> Dr. Martin Wilck <mwi...@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Reply via email to