On 10/4/2019 10:27 AM, Ken Brown wrote:
> On 9/29/2019 4:05 PM, Ken Brown wrote:
>> On 9/27/2019 10:12 AM, Ken Brown wrote:
>>> On 9/27/2019 9:37 AM, Norton Allen wrote:
>>>> On 9/26/2019 10:50 PM, Ken Brown wrote:
>>>>>
>>>>>> As a simple test example, consider:
>>>>>>
>>>>>> /bin/ssh-agent /bin/sleep 10
>>>>>>
>>>>>> While the sleep is still running, ps shows:
>>>>>>
>>>>>>     PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
>>>>>>    1694    1693    1694       1576  ?          22534 00:01:10 /usr/bin/ssh-agent
>>>>>>    1653       1    1653      11740  cons1      22534 00:00:37 /usr/bin/bash
>>>>>>    1693    1653    1693       1552  cons1      22534 00:01:10 /usr/bin/sleep
>>>>>>
>>>>>> One oddity is that ssh-agent is listed as a subprocess of sleep
>>>>> ...but this isn't a bug.  ssh-agent forks, and then the parent execs the 
>>>>> command.
>>>>
>>>> With the salient difference presumably being that the exec is done in the
>>>> parent instead of the child as usual?
>>>
>>> Yes.  The idea is that 'ssh-agent command' should be more-or-less equivalent
>>> to running 'command', with ssh-agent running as a subprocess.
>>>
>>> The ssh-agent subprocess periodically checks to see if its parent is still
>>> alive, and it exits when the parent has died.  Someone should figure out why
>>> this is not working on Cygwin.
>>
>> As an aid to someone who might want to debug this (probably Corinna when she
>> returns), I've created a test program agent.c (attached) that simulates the
>> relevant part of ssh-agent:
>>
>> 1. It forks a subprocess that periodically checks to see if its parent has
>>    died, and then exits.
>>
>> 2. The parent execs "/usr/bin/sleep 1".
>>
>> As with ssh-agent, the subprocess never detects that the parent has died,
>> and so it never exits.
>>
>> Running this program under strace shows the following error in the pinfo
>> constructor:
>>
>> pinfo::pinfo: couldn't duplicate parent rd_proc_pipe handle 0x1BC for forked
>> child 1666 after exec, Win32 error 5
>>
>> [Win32 error 5 is ERROR_ACCESS_DENIED.]
> 
> It seems that the pinfo constructor failure happens in
> cygheap_exec_info::reattach_children().  The latter is preceded by the
> following comment:
> 
> /* Reattach non-reaped subprocesses passed in from the cygwin process
>      which previously operated under this pid.  FIXME: Is there a race here
>      if the process exits during cygwin's exec handoff?  */
> 
> I tried running my test program under gdb with a breakpoint at
> reattach_children, and the breakpoint was never hit.  That gives an
> affirmative answer to the question in the FIXME.
>
> As a result, the exec'd program never becomes aware that it has a subprocess,
> so it exits without resetting the subprocess's ppid to 1.
> 
> Is there someone out there familiar enough with Cygwin's exec to suggest a
> fix?  It would be a nice gift to Corinna to get this fixed before her return.

What I said above about gdb is nonsense.  It's the exec'd process that calls 
reattach_children, so I wouldn't expect gdb to see that call.  I think the rest 
of my analysis is correct, but I'm not sure that the FIXME explains the failure.

Ken
