On Thu, 2011-02-17 at 12:01 +0100, Erwan Loaëc wrote: > Hello, > > I'm going to generate a new autofs package for squeeze for our IT, and I > notice that the patch you've done about my problem does not seems to be > available on ftp://ftp.kernel.org/pub/linux/daemons/autofs/v5/ > > Is there a reason ?
No, it's in the queue waiting to get pushed, when I do push the patches I have in the queue. Some will get dropped, I rearrange the order fairly often, but eventually I'll push some of the patches. I don't have any reason to drop your patch so it'll stay in the queue until I reach a point where I'm ready to push them. > Is the problem has been solved elsewhere ? Nope. > > -- > Erwan > > > Ian Kent wrote: > > On Tue, 2010-12-07 at 09:54 +0100, Erwan Loaëc wrote: > >> Hello! > >> > >> For information, since I've updated our autofs with the suggested patch, > >> the problem has not occurred, and I don't see any CPU usage problem. > > > > Thanks for letting me know. > > I'll commit that change soon as I get a chance. > > > >> Thanks!! > >> > >> I hope this problem is definitively solved :o) > >> > >> -- > >> Erwan Loaec > >> > >> > >> Ian Kent wrote: > >>> On Tue, 2010-10-26 at 15:22 +0200, Erwan Loaëc wrote: > >>>> Hi, > >>>> > >>>> First, thank you for spending time to analyse the problem. > >>>> > >>>> I've apply the patch in our dev environnement, and defined a slow > >>>> timeout for the mount (60s) > >>>> > >>>> FYI, in production, in order to limit this problem, the timeout is set > >>>> to 2 hours. > >>>> > >>>> When it will be possible, I'll update every autofs. The problem is quite > >>>> rare, so I won't be able to attest that the problem is solve or not > >>>> immediatly. > >>> Right, that sounds sensible since I can't work out how it gets through. > >>> > >>> I also need to know that the change doesn't cause any unwanted side > >>> effects so let me know how things go. > >>> > >>> Adding an is_mounted() check in a frequently called function like this > >>> can have an overhead if your using the old ioctl interface, which you > >>> are by the look of the log. This should only show up if you have a > >>> largish number of mounts. How many is a large number is very much site > >>> dependent but if the CPU usage of the daemon is acceptable to you then > >>> you don't need to worry about it. > >>> > >>>> In addition, can you tell me why the patch is "autofs-5.0.3 - fix expire > >>>> race" and not autofs-5.0.5 ? > >>> That's just a typo, I'll rename the patch before committing it. > >>> > >>>> Thanks, > >>>> > >>>> Erwan > >>>> > >>>> > >>>> Ian Kent wrote: > >>>>> On Tue, 2010-10-26 at 09:22 +0200, Erwan Loaëc wrote: > >>>>>> Sorry for my previous mail with missing parts... > >>>>>> > >>>>>> > >>>>>> Ian Kent wrote: > >>>>>>> On Fri, 2010-10-22 at 17:19 +0200, Erwan Loaëc wrote: > >>>>>>>> Hello, > >>>>>>>> > >>>>>>>> Sorry for the time, but before posting again I've upgrade every > >>>>>>>> production servers with newest autofs4 module (with last patch) and > >>>>>>>> last > >>>>>>> What does "newest autofs4 module" mean exactly? > >>>>>> Yes it is not the "newest module" but the module recompiled with the > >>>>>> patch autofs4-2.6.26-v5-update-20090903.patch > >>>>> OK. > >>>>> > >>>>>>>> automount with all patch EXCEPT these: > >>>>>>>> > >>>>>>>> autofs-5.0.5-fix-restart.patch > >>>>>>>> autofs-5.0.5-fix-status-privilege-error.patch > >>>>>>>> autofs-5.0.4-always-read-file-maps-mount-lookup-map-read-fix.patch > >>>>>>>> autofs-5.0.5-fix-direct-map-not-updating-on-reread.patch > >>>>>>>> autofs-5.0.5-add-external-bind-method.patch > >>>>>>>> autofs-5.0.5-fix-add-simple-bind-auth.patch > >>>>>>>> autofs-5.0.5-add-dump-maps-option.patch > >>>>>>>> autofs-5.0.5-fix-submount-shutdown-wait.patch > >>>>>>>> > >>>>>>>> > >>>>>>>> Today I had the same behaviour than the issue explained my previous > >>>>>>>> mail. > >>>>>>>> > >>>>>>>> -------------------- > >>>>>>>> Oct 22 16:44:06 SERVERNAME automount[2665]: umount_autofs_offset: > >>>>>>>> couldn't get ioctl fd for offset /cifs/XXXXXXX/volume: No such file > >>>>>>>> or > >>>>>>>> directory > >>>>>>>> Oct 22 16:44:06 SERVERNAME automount[2665]: > >>>>>>>> handle_packet_missing_direct:1363: can't find map entry for > >>>>>>>> (20,3548545) > >>>>>>>> -------------------- > >>>>>>> This could be caused by a umount returning success when in fact it > >>>>>>> didn't succeed with the umount. Are you sure umount is returning > >>>>>>> correct > >>>>>>> status? > >>>>>> Unfortunaly the last case occured on production server with debug > >>>>>> disable. I can't find more information in logfile... > >>>>>> > >>>>>> But, as I've explained in my previous mail, this could logical with > >>>>>> the > >>>>>> bad sequence found in my previous case: > >>>>>> > >>>>>> *Call to the share /cifs/XXXXXXX/volume > >>>>>> *Mount /cifs/XXXXXXX/volume expiring > >>>>>> *New call to the share /cifs/XXXXXXX/volume > >>>>>> *Umount /cifs/XXXXXXX/volume > >>>>> But that isn't quite what the problem is. > >>>>> > >>>>> I've had a closer look at the log and: > >>>>> > >>>>> Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_direct: > >>>>> token 1526, name /cifs/SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: expiring path > >>>>> /cifs/SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path > >>>>> /cifs/SERV2/TM_termoz incl 1 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_subtree_mounts: > >>>>> unmounting dir = /cifs/SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1526 > >>>>> > >>>>> /cifs/SERV2/TM_termoz is umounted. > >>>>> > >>>>> Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: handle_packet_missing_direct: > >>>>> token 1527, name /cifs/SERV2/TM_termoz, request pid 29523 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: attempting to mount entry > >>>>> /cifs/SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: lookup_mount: > >>>>> lookup(program): /cifs/SERV2/TM_termoz -> > >>>>> -fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile > >>>>> ://SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): > >>>>> expanded entry: > >>>>> -fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile > >>>>> ://SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): > >>>>> gathered options: > >>>>> fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile > >>>>> Aug 26 10:11:42 bacchus automount[17827]: sun_mount: parse(sun): > >>>>> mounting root /tmp/auto1I7gB7, mountpoint /cifs/SERV2/TM_termoz, what > >>>>> //SERV2/TM_termoz, fstype cifs, options > >>>>> file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile > >>>>> Aug 26 10:11:42 bacchus automount[17827]: do_mount: //SERV2/TM_termoz > >>>>> /cifs/SERV2/TM_termoz type cifs options > >>>>> file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile > >>>>> using module generic > >>>>> Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): > >>>>> calling mkdir_path /tmp/auto1I7gB7 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): > >>>>> calling mount -t cifs -s -o > >>>>> file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile > >>>>> //SERV2/TM_termoz /tmp/auto1I7gB7 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): > >>>>> mounted //SERV2/TM_termoz type cifs on /tmp/auto1I7gB7 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: move_mount: moved mount tree > >>>>> from /tmp/auto1I7gB7 to /cifs/SERV2/TM_termoz > >>>>> > >>>>> And is then mounted again here, but the request is not complete and > >>>>> we get an expire, which means that the kernel saw the tree of mounts > >>>>> below /cifs/SERV2 as not busy. This means the dentry was selected > >>>>> for expire before the mount, and in fact before the path walk which > >>>>> caused it even started. > >>>>> > >>>>> Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 4 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: > >>>>> handle_packet_expire_indirect: token 1528, name SERV2 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path > >>>>> /cifs/SERV2 incl 1 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/01_MARKETING > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/01_MARKETING > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/02_ADMINISTRATIF > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/02_ADMINISTRATIF > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/03_EXPLOITATION > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/03_EXPLOITATION > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/04_SERVICE-CLIENTS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/04_SERVICE-CLIENTS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/05_TELECHARGEMENTS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/05_TELECHARGEMENTS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/06_UTILISATEURS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/06_UTILISATEURS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/07_PRODUCTION > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/07_PRODUCTION > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/08_BDD > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/08_BDD > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/ADMIN$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/ADMIN$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/C$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/C$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/CGIN > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/CGIN > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/D$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/D$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/F$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/F$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/ImageMagick-6.3.3V6 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/ImageMagick-6.3.3V6 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_autofs_offset: > >>>>> couldn't get ioctl fd for offset /cifs/SERV2/TM_termoz: No such file or > >>>>> directory > >>>>> Aug 26 10:11:42 bacchus automount[17827]: forcing umount of offset > >>>>> mount /cifs/SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: failed to remove dir > >>>>> /cifs/SERV2/TM_termoz: Device or resource busy > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/Projets > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/Projets > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/Projets_Air > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/Projets_Air > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/termozs > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/termozs > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/packages > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/packages > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/powervault > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/powervault > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount > >>>>> offset /cifs/SERV2/temp > >>>>> Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount > >>>>> /cifs/SERV2/temp > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/01_MARKETING > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/02_ADMINISTRATIF > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/03_EXPLOITATION > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/04_SERVICE-CLIENTS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/05_TELECHARGEMENTS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/06_UTILISATEURS > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/07_PRODUCTION > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/08_BDD > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/ADMIN$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/C$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/CGIN > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/D$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/F$ > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/ImageMagick-6.3.3V6 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/TM_termoz > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/Projets > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/Projets_Air > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/termozs > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/packages > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/powervault > >>>>> Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: > >>>>> deleting offset key /cifs/SERV2/temp > >>>>> > >>>>> Now we are trying to umount the tree of autofs trigger mounts but one of > >>>>> the trigger mounts now has a mount on it which causes the problem. > >>>>> > >>>>> Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1527 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: mounted /cifs/SERV2/TM_termoz > >>>>> > >>>>> Now the mount completes. > >>>>> > >>>>> It is the mount not completing before the expire which causes autofs to > >>>>> not recognize that the tree has become busy which should prevent it from > >>>>> attempting to umount it. > >>>>> > >>>>> The check to determine if the umount of the tree can go ahead is done by > >>>>> checking if any of these autofs trigger mounts have a open file handle > >>>>> (which is used for control operations when they have a mount on top of > >>>>> them). But, during the mount, the file handle is not set until the time > >>>>> we see the "ioctl_send_ready: token = 1527" message so it is missed by > >>>>> the expire check. > >>>>> > >>>>> Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: rm_unwanted_fn: removing > >>>>> directory /cifs/SERV2 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: unable to remove directory > >>>>> /cifs/SERV2: Directory not empty > >>>>> Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2 > >>>>> Aug 26 10:11:42 bacchus automount[17827]: > >>>>> handle_packet_missing_direct:1323: can't find map entry for > >>>>> (46,12396497) > >>>>> > >>>>> If we ever get any of these then it is game over. > >>>>> They are fatal and, although you will see a hang because a lock is not > >>>>> released, you must start over anyway. There is no other option. > >>>>> > >>>>> I don't know why the above expire is getting through, it shouldn't, but > >>>>> user space does need to handle this case anyway. > >>>>> > >>>>> I've done some basic testing and this still needs work (like a problem > >>>>> description) but I'm interested to know if this helps. > >>>>> > >>>>> autofs-5.0.3 - fix expire race > >>>>> > >>>>> From: Ian Kent <ra...@themaw.net> > >>>>> > >>>>> > >>>>> --- > >>>>> > >>>>> lib/mounts.c | 5 ++++- > >>>>> 1 files changed, 4 insertions(+), 1 deletions(-) > >>>>> > >>>>> > >>>>> diff --git a/lib/mounts.c b/lib/mounts.c > >>>>> index 4c44982..f26579e 100644 > >>>>> --- a/lib/mounts.c > >>>>> +++ b/lib/mounts.c > >>>>> @@ -1525,8 +1525,11 @@ int umount_multi_triggers(struct autofs_point > >>>>> *ap, struct mapent *me, char *root > >>>>> oe_base = oe->key + strlen(root); > >>>>> left += umount_multi_triggers(ap, oe, root, oe_base); > >>>>> > >>>>> - if (oe->ioctlfd != -1) > >>>>> + if (oe->ioctlfd != -1 || > >>>>> + is_mounted(_PROC_MOUNTS, oe->key, MNTS_REAL)) { > >>>>> left++; > >>>>> + break; > >>>>> + } > >>>>> } > >>>>> > >>>>> if (left) > >>>>> > >>>>> > >>>> _______________________________________________ > >>>> autofs mailing list > >>>> autofs@linux.kernel.org > >>>> http://linux.kernel.org/mailman/listinfo/autofs > >>> > >> _______________________________________________ > >> autofs mailing list > >> autofs@linux.kernel.org > >> http://linux.kernel.org/mailman/listinfo/autofs > > > > > > _______________________________________________ > autofs mailing list > autofs@linux.kernel.org > http://linux.kernel.org/mailman/listinfo/autofs _______________________________________________ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs