On Fri, Oct 05, 2007 at 09:35:16AM +0200, Mario Goebbels wrote:
> > i'm not really sure what's going on.  perhaps init is missing/dropping
> > a SIGCHLD notification?  you could check its signal disposition via
> > "psig <init_pid>".
>
> root@bigmclargehuge:~ > psig 1011 | grep SIGCHLD
> root@bigmclargehuge:~ >
>
> If I understood the idea behind full output right, and then look at
> this, it appears that it isn't even caught.
>

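so the application is actually catching sigcld.  psig prints signal
names without the SIG prefix (and uses CLD rather than CHLD), so the
grep above finds nothing.  to check, look for the line that matches
^CLD: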
        CLD     caught  lx_sigacthandler ...


> > it would also be interesting to have a truss output for init to see
> > if it does get a SIGCHLD.  to do that you'd have to stop init when
> > booting the zone.
> >     zoneadm -z ubuntu halt
> >     dtrace -w -n 'syscall:::return/execname == "init" && zonename == "ubuntu"/{stop(); trace(pid); exit(0);}'
> >     zoneadm -z ubuntu boot
> >
> > and then attach to it with truss
> >     truss -a -o ubuntu.init.txt -p <init_pid>
>
> I've attached the full output. I can't figure out much from it, but it
> appears that init just goes to sleep and waits for something. Then the
> zone halt happens, pretty short, too.
>

well, it is in fact useful.  here are the key tidbits (with some lines
removed for readability):
---8<---
xmknod(2, "/tmp/.pipe.1368.1", 010600, 0x00000000) = 0
open("/tmp/.pipe.1368.1", O_RDWR)               = 3
open("/tmp/.pipe.1368.1", O_RDONLY)             = 4
open("/tmp/.pipe.1368.1", O_WRONLY)             = 5
unlink("/tmp/.pipe.1368.1")                     = 0
fcntl(4, F_GETFL)                               = 0
fcntl(4, F_SETFL, FNONBLOCK)                    = 0
fcntl(5, F_GETFL)                               = 1
fcntl(5, F_SETFL, FNONBLOCK)                    = 0
...
forkx(0)                                        = 1376
...
pollsys(0x080479D0, 2, 0x00000000, 0x00000000) (sleeping...)
    Received signal #18, SIGCLD, in pollsys() [caught]
      siginfo: SIGCLD CLD_EXITED pid=1376 status=0x0000
pollsys(0x080479D0, 2, 0x00000000, 0x00000000)  Err#91 ERESTART
write(5, "\0", 1)                               = 1
fstat64(4, 0x08047A60)                          = 0
read(4, "\0", 1)                                = 1
fstat64(4, 0x08047A60)                          = 0
read(4, 0x08047D8B, 1)                          Err#11 EAGAIN
pollsys(0x080479D0, 2, 0x00000000, 0x00000000) (sleeping...)
    Stopped by signal #23, SIGSTOP, in pollsys()
---8<---

the first bit of truss output above (xmknod through the fcntl calls)
is probably lx_pipe(), so i'd guess that init is setting up a pipe to
talk to the child process.  then it forks to create the child process,
at which point it tries to read data from the child process.  something
is going wrong here.

to debug it any further i'd probably have to look at the init
source code under ubuntu to see how it handles io with children
and how it detects child process termination to determine why
it's failing.

given that init is trying to read from an empty pipe, my first
guess would be that we're not emulating some aspect of linux
non-blocking pipe behavior correctly.  (i know that for master
pseudo terminals i had to do a lot of work to get the polling
and blocking behaviors correct.)
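
as a starting point for checking that guess, here is a small probe of
the one rule i'm sure about: a non-blocking read on an empty pipe fails
with EAGAIN while a write end is still open, and returns 0 (EOF) once
every write end is closed.  probe_empty_pipe is a made-up name, and
this only covers that single case, not the full set of polling
behaviors.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * do one non-blocking read on an empty pipe and classify the result:
 *   'A' = failed with EAGAIN, 'E' = EOF, 'D' = got data, '?' = other
 * if close_writer is non-zero the write end is closed before reading.
 */
static char probe_empty_pipe(int close_writer)
{
    int fds[2];
    char c;

    if (pipe(fds) != 0)
        return '?';
    fcntl(fds[0], F_SETFL, fcntl(fds[0], F_GETFL) | O_NONBLOCK);
    if (close_writer)
        close(fds[1]);

    ssize_t n = read(fds[0], &c, 1);
    int err = errno;                 /* save before close() can change it */

    close(fds[0]);
    if (!close_writer)
        close(fds[1]);

    if (n == 1)
        return 'D';
    if (n == 0)
        return 'E';
    if (err == EAGAIN || err == EWOULDBLOCK)
        return 'A';
    return '?';
}
```

probe_empty_pipe(0) should give 'A' and probe_empty_pipe(1) should give
'E'.  any other answer inside the zone would point at the pipe
emulation.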

ed
