On Tue, 2010-10-26 at 15:22 +0200, Erwan Loaëc wrote:
> Hi,
> 
> First, thank you for spending time to analyse the problem.
> 
> I've apply the patch in our dev environnement, and defined a slow 
> timeout for the mount (60s)
> 
> FYI, in production, in order to limit this problem, the timeout is set 
> to 2 hours.
> 
> When it will be possible, I'll update every autofs. The problem is quite 
> rare, so I won't be able to attest that the problem is solve or not 
> immediatly.

Right, that sounds sensible since I can't work out how it gets through.

I also need to know that the change doesn't cause any unwanted side
effects so let me know how things go.

Adding an is_mounted() check in a frequently called function like this
can have an overhead if your using the old ioctl interface, which you
are by the look of the log. This should only show up if you have a
largish number of mounts. How many is a large number is very much site
dependent but if the CPU usage of the daemon is acceptable to you then
you don't need to worry about it.

> 
> In addition, can you tell me why the patch is "autofs-5.0.3 - fix expire 
> race" and not autofs-5.0.5 ?

That's just a typo, I'll rename the patch before committing it.

> 
> Thanks,
> 
> Erwan
> 
> 
> Ian Kent wrote:
> > On Tue, 2010-10-26 at 09:22 +0200, Erwan Loaëc wrote:
> >> Sorry for my previous mail with missing parts...
> >>
> >>
> >> Ian Kent wrote:
> >>> On Fri, 2010-10-22 at 17:19 +0200, Erwan Loaëc wrote:
> >>>> Hello,
> >>>>
> >>>> Sorry for the time, but before posting again I've upgrade every 
> >>>> production servers with newest autofs4 module (with last patch) and last 
> >>> What does "newest autofs4 module" mean exactly?
> >>
> >> Yes it is not the "newest module" but the module recompiled with the 
> >> patch autofs4-2.6.26-v5-update-20090903.patch
> > 
> > OK.
> > 
> >>>> automount with all patch EXCEPT these:
> >>>>
> >>>> autofs-5.0.5-fix-restart.patch
> >>>> autofs-5.0.5-fix-status-privilege-error.patch
> >>>> autofs-5.0.4-always-read-file-maps-mount-lookup-map-read-fix.patch
> >>>> autofs-5.0.5-fix-direct-map-not-updating-on-reread.patch
> >>>> autofs-5.0.5-add-external-bind-method.patch
> >>>> autofs-5.0.5-fix-add-simple-bind-auth.patch
> >>>> autofs-5.0.5-add-dump-maps-option.patch
> >>>> autofs-5.0.5-fix-submount-shutdown-wait.patch
> >>>>
> >>>>
> >>>> Today I had the same behaviour than the issue explained my previous mail.
> >>>>
> >>>> --------------------
> >>>> Oct 22 16:44:06 SERVERNAME automount[2665]: umount_autofs_offset: 
> >>>> couldn't get ioctl fd for offset /cifs/XXXXXXX/volume: No such file or 
> >>>> directory
> >>>> Oct 22 16:44:06 SERVERNAME automount[2665]: 
> >>>> handle_packet_missing_direct:1363: can't find map entry for (20,3548545)
> >>>> --------------------
> >>> This could be caused by a umount returning success when in fact it
> >>> didn't succeed with the umount. Are you sure umount is returning correct
> >>> status?
> >> Unfortunaly the last case occured on production server with debug 
> >> disable. I can't find more information in logfile...
> >>
> >> But, as I've explained in my previous mail, this could logical with the 
> >> bad sequence found in my previous case:
> >>
> >> *Call to the share /cifs/XXXXXXX/volume
> >> *Mount /cifs/XXXXXXX/volume expiring
> >> *New call to the share /cifs/XXXXXXX/volume
> >> *Umount /cifs/XXXXXXX/volume
> > 
> > But that isn't quite what the problem is.
> > 
> > I've had a closer look at the log and:
> > 
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_direct: 
> > token 1526, name /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: expiring path 
> > /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path 
> > /cifs/SERV2/TM_termoz incl 1
> > Aug 26 10:11:42 bacchus automount[17827]: umount_subtree_mounts: unmounting 
> > dir = /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1526
> > 
> > /cifs/SERV2/TM_termoz is umounted.
> > 
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet_missing_direct: 
> > token 1527, name /cifs/SERV2/TM_termoz, request pid 29523
> > Aug 26 10:11:42 bacchus automount[17827]: attempting to mount entry 
> > /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: lookup_mount: lookup(program): 
> > /cifs/SERV2/TM_termoz -> 
> > -fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
> >  ://SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): expanded 
> > entry: 
> > -fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
> >  ://SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): gathered 
> > options: 
> > fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
> > Aug 26 10:11:42 bacchus automount[17827]: sun_mount: parse(sun): mounting 
> > root /tmp/auto1I7gB7, mountpoint /cifs/SERV2/TM_termoz, what 
> > //SERV2/TM_termoz, fstype cifs, options 
> > file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
> > Aug 26 10:11:42 bacchus automount[17827]: do_mount: //SERV2/TM_termoz 
> > /cifs/SERV2/TM_termoz type cifs options 
> > file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
> >  using module generic
> > Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): 
> > calling mkdir_path /tmp/auto1I7gB7
> > Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): 
> > calling mount -t cifs -s -o 
> > file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
> >  //SERV2/TM_termoz /tmp/auto1I7gB7
> > Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): 
> > mounted //SERV2/TM_termoz type cifs on /tmp/auto1I7gB7
> > Aug 26 10:11:42 bacchus automount[17827]: move_mount: moved mount tree from 
> > /tmp/auto1I7gB7 to /cifs/SERV2/TM_termoz
> > 
> > And is then mounted again here, but the request is not complete and
> > we get an expire, which means that the kernel saw the tree of mounts
> > below /cifs/SERV2 as not busy. This means the dentry was selected
> > for expire before the mount, and in fact before the path walk which
> > caused it even started.
> > 
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 4
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_indirect: 
> > token 1528, name SERV2
> > Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path /cifs/SERV2 
> > incl 1
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/01_MARKETING
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/01_MARKETING
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/02_ADMINISTRATIF
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/02_ADMINISTRATIF
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/03_EXPLOITATION
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/03_EXPLOITATION
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/04_SERVICE-CLIENTS
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/04_SERVICE-CLIENTS
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/05_TELECHARGEMENTS
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/05_TELECHARGEMENTS
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/06_UTILISATEURS
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/06_UTILISATEURS
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/07_PRODUCTION
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/07_PRODUCTION
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/08_BDD
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/08_BDD
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/ADMIN$
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/ADMIN$
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/C$
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/C$
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/CGIN
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/CGIN
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/D$
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/D$
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/F$
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/F$
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/ImageMagick-6.3.3V6
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/ImageMagick-6.3.3V6
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: umount_autofs_offset: couldn't 
> > get ioctl fd for offset /cifs/SERV2/TM_termoz: No such file or directory
> > Aug 26 10:11:42 bacchus automount[17827]: forcing umount of offset mount 
> > /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: failed to remove dir 
> > /cifs/SERV2/TM_termoz: Device or resource busy
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/Projets
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/Projets
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/Projets_Air
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/Projets_Air
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/termozs
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/termozs
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/packages
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/packages
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/powervault
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/powervault
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount 
> > offset /cifs/SERV2/temp
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount 
> > /cifs/SERV2/temp
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/01_MARKETING
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/02_ADMINISTRATIF
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/03_EXPLOITATION
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/04_SERVICE-CLIENTS
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/05_TELECHARGEMENTS
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/06_UTILISATEURS
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/07_PRODUCTION
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/08_BDD
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/ADMIN$
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/C$
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/CGIN
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/D$
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/F$
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/ImageMagick-6.3.3V6
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/Projets
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/Projets_Air
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/termozs
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/packages
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/powervault
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: 
> > deleting offset key /cifs/SERV2/temp
> > 
> > Now we are trying to umount the tree of autofs trigger mounts but one of
> > the trigger mounts now has a mount on it which causes the problem.
> > 
> > Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1527
> > Aug 26 10:11:42 bacchus automount[17827]: mounted /cifs/SERV2/TM_termoz
> > 
> > Now the mount completes.
> > 
> > It is the mount not completing before the expire which causes autofs to
> > not recognize that the tree has become busy which should prevent it from
> > attempting to umount it.
> > 
> > The check to determine if the umount of the tree can go ahead is done by
> > checking if any of these autofs trigger mounts have a open file handle
> > (which is used for control operations when they have a mount on top of
> > them). But, during the mount, the file handle is not set until the time
> > we see the "ioctl_send_ready: token = 1527" message so it is missed by
> > the expire check.
> > 
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5
> > Aug 26 10:11:42 bacchus automount[17827]: rm_unwanted_fn: removing 
> > directory /cifs/SERV2
> > Aug 26 10:11:42 bacchus automount[17827]: unable to remove directory 
> > /cifs/SERV2: Directory not empty
> > Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2
> > Aug 26 10:11:42 bacchus automount[17827]: 
> > handle_packet_missing_direct:1323: can't find map entry for (46,12396497)
> > 
> > If we ever get any of these then it is game over.
> > They are fatal and, although you will see a hang because a lock is not
> > released, you must start over anyway. There is no other option.
> > 
> > I don't know why the above expire is getting through, it shouldn't, but
> > user space does need to handle this case anyway.
> > 
> > I've done some basic testing and this still needs work (like a problem
> > description) but I'm interested to know if this helps.
> > 
> > autofs-5.0.3 - fix expire race
> > 
> > From: Ian Kent <ra...@themaw.net>
> > 
> > 
> > ---
> > 
> >  lib/mounts.c |    5 ++++-
> >  1 files changed, 4 insertions(+), 1 deletions(-)
> > 
> > 
> > diff --git a/lib/mounts.c b/lib/mounts.c
> > index 4c44982..f26579e 100644
> > --- a/lib/mounts.c
> > +++ b/lib/mounts.c
> > @@ -1525,8 +1525,11 @@ int umount_multi_triggers(struct autofs_point *ap, 
> > struct mapent *me, char *root
> >             oe_base = oe->key + strlen(root);
> >             left += umount_multi_triggers(ap, oe, root, oe_base);
> >  
> > -           if (oe->ioctlfd != -1)
> > +           if (oe->ioctlfd != -1 ||
> > +               is_mounted(_PROC_MOUNTS, oe->key, MNTS_REAL)) {
> >                     left++;
> > +                   break;
> > +           }
> >     }
> >  
> >     if (left)
> > 
> > 
> 
> _______________________________________________
> autofs mailing list
> autofs@linux.kernel.org
> http://linux.kernel.org/mailman/listinfo/autofs


_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs

Reply via email to