On Sat, 2008-11-29 at 18:48 +0100, Ondrej Valousek wrote:
> To summarize:
> process 4032 - D (disk sleep)
> process 18848 - S (sleep, but does not react to kill)
> process 18851 - Z (zombie)
> O.

For a start, you need to make available the entire debug log of the
session in which this occurs.
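
As an aside, the D, S and Z states summarized above can be re-checked at
any time from the third field of /proc/<pid>/stat on Linux. The following
is only a minimal sketch: the helper names and the sample line are made up
for illustration and are not taken from the affected machine.

```python
# Sketch: decode the state field of a Linux /proc/<pid>/stat line.
# All names here (STATE_NAMES, parse_stat_line, process_state) are
# hypothetical helpers, not part of autofs.

STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible (disk) sleep",
    "Z": "zombie",
    "T": "stopped",
}


def parse_stat_line(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST ')' instead of splitting on whitespace.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_text), comm, state


def process_state(pid):
    """Read and decode the state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return parse_stat_line(stat_file.read())


# Illustrative sample line (assumed values, truncated after a few fields).
SAMPLE = "4032 (automount) D 1 4032 4032 0 -1 4194624 0 0"
pid, comm, state = parse_stat_line(SAMPLE)
print(pid, comm, STATE_NAMES.get(state, state))
```

On the live system, process_state(4032) would read the real entry for the
/proj daemon, provided the process still exists.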

> Ondrej Valousek wrote:
> >> It seems that the expire is completing before the parent's signals
> >> are restored. But I thought a signal that is sent while it is blocked
> >> (SIGCHLD in this case) is delivered once it is unblocked, so this is
> >> a bit of a puzzle.
> >>
> > And what role does process 18848 play here? It looks like it is the
> > first one to hang.
> >
> > Nov 20 15:02:39 login02 automount[18848]: lookup(yp): looking up .directory
> > Nov 20 15:02:39 login02 automount[18848]: failed to mount /proj/.directory
> > Nov 20 15:02:39 login02 automount[18848]: umount_multi: path=/proj/.directory incl=1
> > Nov 20 15:02:39 login02 automount[4125]: handle_child: got pid 18848, sig 0 (0), stat 1
> > Nov 20 15:02:39 login02 automount[4125]: sig_child: found pending iop pid 18848: signalled 0 (sig 0), exit status 1
> > Nov 21 15:07:55 login02 automount[18848]: lookup(yp): looking up .raw_data
> > Nov 21 15:07:55 login02 automount[18848]: failed to mount /proj/.raw_data
> > Nov 21 15:07:55 login02 automount[18848]: umount_multi: path=/proj/.raw_data incl=1
> > Nov 21 15:07:55 login02 automount[4125]: handle_child: got pid 18848, sig 0 (0), stat 1
> > Nov 21 15:07:55 login02 automount[4125]: sig_child: found pending iop pid 18848: signalled 0 (sig 0), exit status 1
> >
> >>> Ondrej
> >>>
> >>>> Hi All,
> >>>>
> >>>> I had hoped this had gone away for good, but unfortunately I was
> >>>> wrong. Here we go again:
> >>>> RHEL-4, full updates, autofs 4, automounter hangs:
> >>>> ps -ef | grep auto:
> >>>> root      3805     1  0 Nov21 ?        00:00:00 /usr/sbin/automount --timeout=3600 --debug --use-old-ldap-lookup /softappli yp auto.softappli -rw
> >>>> root      3880     1  0 Nov21 ?        00:00:00 /usr/sbin/automount --timeout=3600 --debug --use-old-ldap-lookup /cadappl yp auto.cadappl -rw
> >>>> root      3947     1  0 Nov21 ?        00:00:00 /usr/sbin/automount --timeout=3600 --debug --use-old-ldap-lookup /appli yp auto.appli -rw
> >>>> root      4032     1  0 Nov21 ?        00:00:00 /usr/sbin/automount --timeout=3600 --debug --use-old-ldap-lookup /proj yp auto.proj -rw
> >>>> root      4118     1  0 Nov21 ?        00:00:00 /usr/sbin/automount --timeout=3600 --debug --use-old-ldap-lookup /home yp auto.home -rw
> >>>> root     18848  4032  0 Nov27 ?        00:00:00 /usr/sbin/automount --timeout=3600 --debug --use-old-ldap-lookup /proj yp auto.proj -rw
> >>>> root     18851  4032  0 Nov27 ?        00:00:00 [automount] <defunct>
> >>>> root     28454 21820  0 15:25 pts/134  00:00:00 grep auto
> >>>>
> >>>> Debug logs:
> >>>> Nov 27 13:07:28 login02 automount[4032]: sig 14 switching from 1 to 2
> >>>> Nov 27 13:07:28 login02 automount[4032]: get_pkt: state 1, next 2
> >>>> Nov 27 13:07:28 login02 automount[4032]: st_expire(): state = 1
> >>>> Nov 27 13:07:28 login02 automount[4032]: expire_proc: exp_proc=18848
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 2
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet_expire_multi: token 7150, name towerip
> >>>> Nov 27 13:07:28 login02 automount[18849]: expiring path /proj/towerip
> >>>> Nov 27 13:07:28 login02 automount[18849]: umount_multi: path=/proj/towerip incl=1
> >>>> Nov 27 13:07:28 login02 automount[18849]: umount_multi: unmounting dir=/proj/towerip
> >>>> Nov 27 13:07:28 login02 automount[18849]: expired /proj/towerip
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_child: got pid 18849, sig 0 (0), stat 0
> >>>> Nov 27 13:07:28 login02 automount[4032]: sig_child: found pending iop pid 18849: signalled 0 (sig 0), exit status 0
> >>>> Nov 27 13:07:28 login02 automount[4032]: send_ready: token=7150
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 2
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet_expire_multi: token 7151, name pdld4
> >>>> Nov 27 13:07:28 login02 automount[18851]: expiring path /proj/pdld4
> >>>> Nov 27 13:07:28 login02 automount[18851]: umount_multi: path=/proj/pdld4 incl=1
> >>>> Nov 27 13:07:28 login02 automount[18851]: umount_multi: unmounting dir=/proj/pdld4
> >>>> Nov 27 13:07:28 login02 automount[18851]: expired /proj/pdld4
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 0
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet_missing: token 7152, name towerip
> >>>>
> >>>> The automounter daemon handling the /proj map stalled.
> >>>> Please help.
> >>>> Thanks,
> >>>>
> >>>> Ondrej
> >>>>
> >>>> Ondrej Valousek wrote:
> >>>>> Hi Jeff,
> >>>>>
> >>>>> Yes, I am trying to reproduce this with debug enabled; it will
> >>>>> take some time.
> >>>>> Please stay tuned.
> >>>>>
> >>>>> Ondrej
> >>>>>
> >>>>>> It rings a bell, but I can't put my finger on it.  Can you reproduce
> >>>>>> this?  If so, could you send along a debug log?  Instructions for
> >>>>>> collecting debug information can be found at:
> >>>>>>   http://people.redhat.com/~jmoyer/
> >>>>>>
> >>>>>> Cheers,
> >>>>>>
> >>>>>> Jeff
> >>>>>>
> >>>>>>
> >>>> _______________________________________________
> >>>> autofs mailing list
> >>>> autofs@linux.kernel.org
> >>>> http://linux.kernel.org/mailman/listinfo/autofs
> >>>>
> 
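
On the blocked SIGCHLD question quoted above: the general POSIX behaviour
can be checked in isolation. This is only a sketch of the mechanism, not
of what automount does internally; it assumes Linux, Python 3 and a
single-threaded process, and the function names are made up for the
example. It blocks SIGCHLD, lets a child exit, confirms the signal is
pending, and then unblocks it.

```python
import os
import signal


def demo():
    """Show that a SIGCHLD raised while blocked is delivered on unblock."""
    received = []

    def on_sigchld(signum, frame):
        received.append(signum)

    old_handler = signal.signal(signal.SIGCHLD, on_sigchld)

    # Block SIGCHLD, as a parent might do around a critical section.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})

    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits immediately

    # Wait until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    # The signal was generated while blocked, so it is pending, not handled.
    pending = signal.SIGCHLD in signal.sigpending()
    handled_before = len(received)

    # Unblocking delivers the pending signal and runs the handler.
    signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
    handled_after = len(received)

    os.waitpid(pid, 0)  # reap the child so no zombie is left behind
    signal.signal(signal.SIGCHLD, old_handler)
    return pending, handled_before, handled_after


result = demo()
print("pending while blocked:", result[0])
print("handler calls before unblock:", result[1])
print("handler calls after unblock:", result[2])
```

Note that standard signals are not queued: several children exiting while
SIGCHLD is blocked can collapse into a single delivery, so a handler that
reaps only one child per signal can leave zombies behind.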
