On Tue, 2010-10-26 at 09:22 +0200, Erwan Loaëc wrote:
Sorry for my previous mail with missing parts...
Ian Kent wrote:
On Fri, 2010-10-22 at 17:19 +0200, Erwan Loaëc wrote:
Hello,
Sorry for the time, but before posting again I've upgrade every
production servers with newest autofs4 module (with last patch) and last
What does "newest autofs4 module" mean exactly?
Yes it is not the "newest module" but the module recompiled with the
patch autofs4-2.6.26-v5-update-20090903.patch
OK.
automount with all patch EXCEPT these:
autofs-5.0.5-fix-restart.patch
autofs-5.0.5-fix-status-privilege-error.patch
autofs-5.0.4-always-read-file-maps-mount-lookup-map-read-fix.patch
autofs-5.0.5-fix-direct-map-not-updating-on-reread.patch
autofs-5.0.5-add-external-bind-method.patch
autofs-5.0.5-fix-add-simple-bind-auth.patch
autofs-5.0.5-add-dump-maps-option.patch
autofs-5.0.5-fix-submount-shutdown-wait.patch
Today I had the same behaviour than the issue explained my previous mail.
--------------------
Oct 22 16:44:06 SERVERNAME automount[2665]: umount_autofs_offset:
couldn't get ioctl fd for offset /cifs/XXXXXXX/volume: No such file or
directory
Oct 22 16:44:06 SERVERNAME automount[2665]:
handle_packet_missing_direct:1363: can't find map entry for (20,3548545)
--------------------
This could be caused by a umount returning success when in fact it
didn't succeed with the umount. Are you sure umount is returning correct
status?
Unfortunaly the last case occured on production server with debug
disable. I can't find more information in logfile...
But, as I've explained in my previous mail, this could logical with the
bad sequence found in my previous case:
*Call to the share /cifs/XXXXXXX/volume
*Mount /cifs/XXXXXXX/volume expiring
*New call to the share /cifs/XXXXXXX/volume
*Umount /cifs/XXXXXXX/volume
But that isn't quite what the problem is.
I've had a closer look at the log and:
Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_direct: token
1526, name /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path
/cifs/SERV2/TM_termoz incl 1
Aug 26 10:11:42 bacchus automount[17827]: umount_subtree_mounts: unmounting dir
= /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1526
/cifs/SERV2/TM_termoz is umounted.
Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5
Aug 26 10:11:42 bacchus automount[17827]: handle_packet_missing_direct: token
1527, name /cifs/SERV2/TM_termoz, request pid 29523
Aug 26 10:11:42 bacchus automount[17827]: attempting to mount entry
/cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: lookup_mount: lookup(program):
/cifs/SERV2/TM_termoz ->
-fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
://SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): expanded
entry:
-fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
://SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): gathered
options:
fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
Aug 26 10:11:42 bacchus automount[17827]: sun_mount: parse(sun): mounting root
/tmp/auto1I7gB7, mountpoint /cifs/SERV2/TM_termoz, what //SERV2/TM_termoz,
fstype cifs, options
file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
Aug 26 10:11:42 bacchus automount[17827]: do_mount: //SERV2/TM_termoz
/cifs/SERV2/TM_termoz type cifs options
file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
using module generic
Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): calling
mkdir_path /tmp/auto1I7gB7
Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): calling
mount -t cifs -s -o
file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
//SERV2/TM_termoz /tmp/auto1I7gB7
Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): mounted
//SERV2/TM_termoz type cifs on /tmp/auto1I7gB7
Aug 26 10:11:42 bacchus automount[17827]: move_mount: moved mount tree from
/tmp/auto1I7gB7 to /cifs/SERV2/TM_termoz
And is then mounted again here, but the request is not complete and
we get an expire, which means that the kernel saw the tree of mounts
below /cifs/SERV2 as not busy. This means the dentry was selected
for expire before the mount, and in fact before the path walk which
caused it even started.
Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 4
Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_indirect: token
1528, name SERV2
Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2
Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path /cifs/SERV2 incl 1
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/01_MARKETING
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/01_MARKETING
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/02_ADMINISTRATIF
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/02_ADMINISTRATIF
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/03_EXPLOITATION
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/03_EXPLOITATION
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/04_SERVICE-CLIENTS
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/04_SERVICE-CLIENTS
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/05_TELECHARGEMENTS
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/05_TELECHARGEMENTS
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/06_UTILISATEURS
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/06_UTILISATEURS
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/07_PRODUCTION
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/07_PRODUCTION
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/08_BDD
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/08_BDD
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/ADMIN$
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/ADMIN$
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/C$
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/C$
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/CGIN
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/CGIN
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/D$
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/D$
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/F$
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/F$
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/ImageMagick-6.3.3V6
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/ImageMagick-6.3.3V6
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: umount_autofs_offset: couldn't get
ioctl fd for offset /cifs/SERV2/TM_termoz: No such file or directory
Aug 26 10:11:42 bacchus automount[17827]: forcing umount of offset mount
/cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: failed to remove dir
/cifs/SERV2/TM_termoz: Device or resource busy
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/Projets
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/Projets
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/Projets_Air
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/Projets_Air
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/termozs
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/termozs
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/packages
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/packages
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/powervault
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount
/cifs/SERV2/powervault
Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset
/cifs/SERV2/temp
Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/temp
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/01_MARKETING
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/02_ADMINISTRATIF
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/03_EXPLOITATION
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/04_SERVICE-CLIENTS
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/05_TELECHARGEMENTS
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/06_UTILISATEURS
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/07_PRODUCTION
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/08_BDD
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/ADMIN$
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/C$
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/CGIN
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/D$
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/F$
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/ImageMagick-6.3.3V6
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/Projets
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/Projets_Air
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/termozs
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/packages
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/powervault
Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting
offset key /cifs/SERV2/temp
Now we are trying to umount the tree of autofs trigger mounts but one of
the trigger mounts now has a mount on it which causes the problem.
Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1527
Aug 26 10:11:42 bacchus automount[17827]: mounted /cifs/SERV2/TM_termoz
Now the mount completes.
It is the mount not completing before the expire which causes autofs to
not recognize that the tree has become busy which should prevent it from
attempting to umount it.
The check to determine if the umount of the tree can go ahead is done by
checking if any of these autofs trigger mounts have a open file handle
(which is used for control operations when they have a mount on top of
them). But, during the mount, the file handle is not set until the time
we see the "ioctl_send_ready: token = 1527" message so it is missed by
the expire check.
Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5
Aug 26 10:11:42 bacchus automount[17827]: rm_unwanted_fn: removing directory
/cifs/SERV2
Aug 26 10:11:42 bacchus automount[17827]: unable to remove directory
/cifs/SERV2: Directory not empty
Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2
Aug 26 10:11:42 bacchus automount[17827]: handle_packet_missing_direct:1323:
can't find map entry for (46,12396497)
If we ever get any of these then it is game over.
They are fatal and, although you will see a hang because a lock is not
released, you must start over anyway. There is no other option.
I don't know why the above expire is getting through, it shouldn't, but
user space does need to handle this case anyway.
I've done some basic testing and this still needs work (like a problem
description) but I'm interested to know if this helps.
autofs-5.0.3 - fix expire race
From: Ian Kent <ra...@themaw.net>
---
lib/mounts.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/lib/mounts.c b/lib/mounts.c
index 4c44982..f26579e 100644
--- a/lib/mounts.c
+++ b/lib/mounts.c
@@ -1525,8 +1525,11 @@ int umount_multi_triggers(struct autofs_point *ap,
struct mapent *me, char *root
oe_base = oe->key + strlen(root);
left += umount_multi_triggers(ap, oe, root, oe_base);
- if (oe->ioctlfd != -1)
+ if (oe->ioctlfd != -1 ||
+ is_mounted(_PROC_MOUNTS, oe->key, MNTS_REAL)) {
left++;
+ break;
+ }
}
if (left)