It is not a bug in syslog(), as explained in the web page Hans Feldt 
sent. syslog() is illegal between fork() and exec() in a multithreaded 
program.

regards,
Anders Widell

2013-08-12 06:00, Nagendra Kumar skrev:
> Hi,
>       I agree with Anders. We had reproduced last week only with syslog 
> addition 1. between fork and exec, 2 Few syslog in main thread of leap.
> So, to me, it looks like a bug in syslog(when added between fork and exec).
>
> Thanks
> -Nagu
>
> -----Original Message-----
> From: Anders Bjornerstedt [mailto:[email protected]]
> Sent: 09 August 2013 13:54
> To: Anders Widell
> Cc: [email protected]; [email protected]
> Subject: Re: [devel] [PATCH 1 of 1] leap: ncs_os_process_execute_timed child 
> process takes too long time before exec (#514)
>
> WE have had several inciudents/problems causeed by syslog messages inside 
> signal handlers.
> So I guess its not so surprising that you could get problems between fork and 
> exec.
>
> /AndersBj
>
> Anders Widell wrote:
>> Wow, this was news to me. Ok then we must go through all functions we
>> use after fork() and check that they are safe.
>>
>> So do we know if syslog() in general can block (if we disregard the
>> case when it is used after fork(), which apparently is a illegal).
>>
>> regards,
>> Anders Widell
>>
>> 2013-08-09 09:17, Hans Feldt skrev:
>>    
>>> Yeah it seems that syslog in a child between fork() and exec() is unsecure. 
>>> Googling reveals this:
>>>
>>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=454183
>>>
>>> In search for #514 a syslog entry was added  between fork() and exec() that 
>>> increased the likeliness of this to happen. The remaining syslogs between 
>>> fork() and exec() are for error cases. I guess those need to be stderr 
>>> prints. The problem is that stderr is closed in the child...
>>> We need to come up with a way to have stderr available in some
>>> trouble shooting scenarios
>>>
>>> /HansF
>>>
>>>      
>>>> -----Original Message-----
>>>> From: Hans Nordebäck
>>>> Sent: den 9 augusti 2013 09:04
>>>> To: praveen malviya; Nagendra Kumar; Ramesh Babu Betham; Hans Feldt
>>>> Cc: [email protected]; Anders Widell
>>>> Subject: Re: SV: [PATCH 1 of 1] leap: ncs_os_process_execute_timed
>>>> child process takes too long time before exec (#514)
>>>>
>>>> Hi, we got feedback from one site with the alarm patch applied (#532).
>>>> In this case it hangs when calling syslog:
>>>>
>>>> #0  0x00007f03b0b9db35 in raise () from /lib64/libc.so.6
>>>> #0  0x00007f03b0b9db35 in raise () from /lib64/libc.so.6
>>>> #1  0x00007f03b0b9f111 in abort () from /lib64/libc.so.6
>>>> #2  0x00007f03b1e1dd99 in sigalrm_handler (sig=5928) at os_defs.c:79
>>>> #3  <signal handler called>
>>>> #4  0x00007f03b0c51dac in __lll_lock_wait_private () from
>>>> /lib64/libc.so.6
>>>> #5  0x00007f03b0c04e0d in _L_lock_1598 () from /lib64/libc.so.6
>>>> #6  0x00007f03b0c04bc6 in __tz_convert () from /lib64/libc.so.6
>>>> #7  0x00007f03b0c40f6a in __vsyslog_chk () from /lib64/libc.so.6
>>>> #8  0x00007f03b0c415b0 in syslog () from /lib64/libc.so.6
>>>> #9  0x00007f03b1e1e992 in ncs_os_process_execute_timed
>>>> (req=0x7fff10644ef0) at os_defs.c:1086
>>>> #10 0x0000000000416d03 in avnd_comp_clc_cmd_execute (cb=0x6578c0,
>>>> comp=0x67f680,
>>>> cmd_type=AVND_COMP_CLC_CMD_TYPE_INSTANTIATE) at avnd_clc.c:2702
>>>> #11 0x000000000041a810 in avnd_comp_clc_uninst_inst_hdler
>>>> (cb=0x6578c0, comp=0x67f680) at avnd_clc.c:1384
>>>> #12 0x00000000004187fb in avnd_comp_clc_fsm_run (cb=0x6578c0,
>>>> comp=0x67f680, ev=AVND_COMP_CLC_PRES_FSM_EV_INST) at
>>>> avnd_clc.c:874
>>>> #13 0x000000000043438f in avnd_su_pres_insting_compinst_hdler
>>>> (cb=0x6578c0, su=0x676380, comp=0x680750) at avnd_susm.c:1759
>>>> #14 0x0000000000435bc4 in avnd_su_pres_fsm_run (cb=0x6578c0,
>>>> su=0x676380, comp=0x680750,
>>>> ev=AVND_SU_PRES_FSM_EV_COMP_INSTANTIATED) at avnd_susm.c:1293
>>>> #15 0x00000000004175cb in avnd_comp_clc_st_chng_prc (cb=0x6578c0,
>>>> comp=0x680750, prv_st=SA_AMF_PRESENCE_INSTANTIATING,
>>>> final_st=SA_AMF_PRESENCE_INSTANTIATED) at avnd_clc.c:1313
>>>> #16 0x000000000041888f in avnd_comp_clc_fsm_run (cb=0x6578c0,
>>>> comp=0x680750,
>>>> ev=AVND_COMP_CLC_PRES_FSM_EV_INST_SUCC) at avnd_clc.c:897
>>>> #17 0x000000000041adb8 in avnd_evt_clc_resp_evh (cb=0x6578c0,
>>>> evt=0x67d0e0) at avnd_clc.c:446
>>>> #18 0x00000000004300e0 in avnd_evt_process (evt=<optimized out>) at
>>>> avnd_proc.c:278
>>>> #19 avnd_main_process () at avnd_proc.c:219
>>>> #20 0x0000000000408815 in main (argc=1, argv=0x7fff10645988) at
>>>> amfnd_main.c:61
>>>>
>>>>
>>>> /HansN
>>>>
>>>> On 08/06/13 09:34, praveen malviya wrote:
>>>>        
>>>>> Hi,
>>>>>           I added a new syslog entry(to debug) before execv and was
>>>>> able to reproduce the problem.
>>>>> The reason for the time out seems to be that syslog was *hung*, but
>>>>> *after* writing the new log message into /var/log/messages.
>>>>> So, there is a high possibility that the child process gets hung in
>>>>> syslog or other file operations.
>>>>> See bt below:
>>>>> Thread 1 (Thread 0x7f030a16f700 (LWP 16331)):
>>>>>
>>>>> #0  0x00007f0308de4b1e in __lll_lock_wait_private () from
>>>>> /lib64/libc.so.6
>>>>> #1  0x00007f0308dd4398 in _L_lock_634 () from /lib64/libc.so.6
>>>>> #2  0x00007f0308dd3e62 in __vsyslog_chk () from /lib64/libc.so.6
>>>>> #3  0x00007f0308dd4320 in syslog () from /lib64/libc.so.6
>>>>> #4  0x00007f0309d38feb in ncs_os_process_execute_timed
>>>>> (req=0x7fffdcc52770, flag=false) at os_defs.c:1059
>>>>> #5  0x0000000000416bd6 in avnd_comp_clc_cmd_execute (cb=0x6578a0,
>>>>> comp=0x673840, cmd_type=AVND_COMP_CLC_CMD_TYPE_INSTANTIATE)
>>>>>       at avnd_clc.c:2687
>>>>> #6  0x000000000041a630 in avnd_comp_clc_uninst_inst_hdler
>>>>> (cb=0x6578a0, comp=0x673840) at avnd_clc.c:1384
>>>>> #7  0x000000000041865b in avnd_comp_clc_fsm_run (cb=0x6578a0,
>>>>> comp=0x673840, ev=AVND_COMP_CLC_PRES_FSM_EV_INST)
>>>>>       at avnd_clc.c:874
>>>>> #8  0x0000000000433f5f in avnd_su_pres_insting_compinst_hdler
>>>>> (cb=0x6578a0, su=0x66fc70, comp=0x674e10) at avnd_susm.c:1759
>>>>> #9  0x0000000000435764 in avnd_su_pres_fsm_run (cb=0x6578a0,
>>>>> su=0x66fc70, comp=0x674e10,
>>>>>       ev=AVND_SU_PRES_FSM_EV_COMP_INSTANTIATED) at avnd_susm.c:1293
>>>>> #10 0x000000000041748b in avnd_comp_clc_st_chng_prc (cb=0x6578a0,
>>>>> comp=0x674e10, prv_st=SA_AMF_PRESENCE_INSTANTIATING,
>>>>>       final_st=SA_AMF_PRESENCE_INSTANTIATED) at avnd_clc.c:1313
>>>>> #11 0x00000000004186ef in avnd_comp_clc_fsm_run (cb=0x6578a0,
>>>>> comp=0x674e10, ev=AVND_COMP_CLC_PRES_FSM_EV_INST_SUCC)
>>>>>       at avnd_clc.c:897
>>>>> #12 0x000000000041f400 in avnd_comp_reg_prc (cb=0x6578a0,
>>>>> comp=0x674e10, pxy_comp=<optimized out>, reg=0x677528,
>>>>>       dest=0x677518) at avnd_comp.c:709
>>>>> #13 0x000000000041fbd7 in avnd_evt_ava_comp_reg_evh (cb=0x6578a0,
>>>>> evt=0x683530) at avnd_comp.c:212
>>>>> #14 0x000000000042fcc8 in avnd_evt_process (evt=<optimized out>) at
>>>>> avnd_proc.c:278
>>>>> #15 avnd_main_process () at avnd_proc.c:219
>>>>> #16 0x0000000000408835 in main (argc=2, argv=0x7fffdcc53588) at
>>>>> amfnd_main.c:53
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Praveen.
>>>>>
>>>>>
>>>>> On 03-Aug-13 9:47 PM, Hans Nordebäck wrote:
>>>>>          
>>>>>> Hi,
>>>>>>
>>>>>> I think the problem with "child before exec" "hangs" and the
>>>>>> problem where "child" at cleanup times out instantly, even though a 10 
>>>>>> sec.
>>>>>> timeout is set, are related.
>>>>>>
>>>>>> Some reasons why the child seems to hang could be:
>>>>>>
>>>>>> 1. Not run at all
>>>>>> 2. "Hangs" in e.g. close.
>>>>>> 3. Crashes, but then whe should have core dumps.
>>>>>>
>>>>>> but the problem with "hanging" and instantly times out could be:
>>>>>>
>>>>>> 4. When parent process adds the pid in the patricia tree,
>>>>>> add_new_req_pid_in_list, ncs_exec_mod_hdlr is able to kill this
>>>>>> child pid before it has started.
>>>>>> A e.g. race condtion in this code could explain both these faults.
>>>>>> I looked briefly in this code and one thing is the delivery of the
>>>>>> SIGCHLD signal, any thread can get that signal, e.g amfnd, mds,
>>>>>> timer or ncs_exec_mod_hdlr. I think a dedicated thread should
>>>>>> handle SIGCHLD.
>>>>>> I guess we should look closer to this part.
>>>>>>
>>>>>> /BR HansN
>>>>>> ________________________________________
>>>>>> Från: Hans Nordebäck
>>>>>> Skickat: den 30 juli 2013 13:05
>>>>>> Till: Nagendra Kumar
>>>>>> Cc: Ramesh Babu Betham; [email protected];
>>>>>> Praveen Malviya; Hans Feldt
>>>>>> Ämne: RE: [PATCH 1 of 1] leap: ncs_os_process_execute_timed child
>>>>>> process takes too long time before exec (#514)
>>>>>>
>>>>>> Hi Nagu, I have tested the patch and simulated "hanging" in the
>>>>>> child part, the core dump produced points out where.
>>>>>> The system works as before, exept the core dump if it hangs in the
>>>>>> child part. We have this patch running at a site waiting for it to
>>>>>> trigger, so we don't know yet what is causing this and if the
>>>>>> patch will help.
>>>>>>
>>>>>>     /BR HansN
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Nagendra Kumar [mailto:[email protected]]
>>>>>> Sent: den 30 juli 2013 13:05
>>>>>> To: Hans Nordebäck
>>>>>> Cc: Ramesh Babu Betham; [email protected];
>>>>>> Praveen Malviya; Hans Feldt
>>>>>> Subject: RE: [PATCH 1 of 1] leap: ncs_os_process_execute_timed
>>>>>> child process takes too long time before exec (#514)
>>>>>>
>>>>>> Hi Hans N,
>>>>>>
>>>>>> Close is not a sync call. I am still not convinced, these calls
>>>>>> may get blocked because of NFS system issue.
>>>>>> Have you tested by keeping abort in child process and see how
>>>>>> amfnd or other services behaves(just to simulate abort)?
>>>>>>
>>>>>> Thanks
>>>>>> -Nagu
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Hans Nordebäck [mailto:[email protected]]
>>>>>> Sent: 30 July 2013 16:16
>>>>>> To: Nagendra Kumar
>>>>>> Cc: Ramesh Babu Betham; [email protected];
>>>>>> Praveen Malviya; Hans Feldt
>>>>>> Subject: Re: [PATCH 1 of 1] leap: ncs_os_process_execute_timed
>>>>>> child process takes too long time before exec (#514)
>>>>>>
>>>>>> Hi Nagu, I agree with Ramesh, e.g. in the child part:
>>>>>>
>>>>>>         /* By default we close all inherited file descriptors in
>>>>>> the child */
>>>>>>             if (getenv("OPENSAF_KEEP_FD_OPEN_AFTER_FORK") == NULL) {
>>>>>>                 /* Close all inherited file descriptors */
>>>>>>                 int i = sysconf(_SC_OPEN_MAX);
>>>>>>                 if (i == -1) {
>>>>>>                     syslog(LOG_ERR, "%s: sysconf failed - %s",
>>>>>> __FUNCTION__, strerror(errno));
>>>>>>                     exit(EXIT_FAILURE);
>>>>>>                 }
>>>>>>                 for (i--; i >= 0; --i)
>>>>>>                     (void) close(i); /* close all descriptors */
>>>>>>
>>>>>>                 /* Redirect standard files to /dev/null */
>>>>>>                 if (freopen("/dev/null", "r", stdin) == NULL)
>>>>>>                     syslog(LOG_ERR, "%s: freopen stdin failed -
>>>>>> %s", __FUNCTION__, strerror(errno));
>>>>>>                 if (freopen("/dev/null", "w", stdout) == NULL)
>>>>>>                     syslog(LOG_ERR, "%s: freopen stdout failed -
>>>>>> %s", __FUNCTION__, strerror(errno));
>>>>>>                 if (freopen("/dev/null", "w", stderr) == NULL)
>>>>>>                     syslog(LOG_ERR, "%s: freopen stderr failed -
>>>>>> %s", __FUNCTION__, strerror(errno));
>>>>>>             }
>>>>>>
>>>>>> close are issued.
>>>>>>
>>>>>> /BR HansN
>>>>>>
>>>>>> On 07/30/13 12:34, Nagendra Kumar wrote:
>>>>>>            
>>>>>>> But where are file operations coming during fork, can you please
>>>>>>> explain ?
>>>>>>>
>>>>>>> Thanks
>>>>>>> -Nagu
>>>>>>> -----Original Message-----
>>>>>>> From: Ramesh Betham
>>>>>>> Sent: 30 July 2013 15:53
>>>>>>> To: Nagendra Kumar
>>>>>>> Cc: Hans Nordebäck; [email protected]; Praveen
>>>>>>> Malviya; Hans Feldt
>>>>>>> Subject: Re: [PATCH 1 of 1] leap: ncs_os_process_execute_timed
>>>>>>> child process takes too long time before exec (#514)
>>>>>>>
>>>>>>> Nagu: Hans N might be pointing to the chances of hung of
>>>>>>> file-operation calls (esp.  when some inconsistency happens with
>>>>>>> NFS).  Just a guess, let Hans N confirm it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ramesh.
>>>>>>>
>>>>>>> On 7/30/2013 3:27 PM, Nagendra Kumar wrote:
>>>>>>>              
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>                
>>>>>>>>>> regarding what can "hang" in the child part, e.g close of file
>>>>>>>>>> descriptors close of file descriptors.
>>>>>>>>>>                    
>>>>>>>> When this can happen? After fork is successful, this shouldn't
>>>>>>>> happen. Can you please provide any example.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> -Nagu
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Hans Nordebäck [mailto:[email protected]]
>>>>>>>> Sent: 30 July 2013 12:53
>>>>>>>> To: Nagendra Kumar
>>>>>>>> Cc: Hans Nordebäck; [email protected]; Praveen
>>>>>>>> Malviya; Ramesh Babu Betham; Hans Feldt
>>>>>>>> Subject: Re: [PATCH 1 of 1] leap: ncs_os_process_execute_timed
>>>>>>>> child process takes too long time before exec (#514)
>>>>>>>>
>>>>>>>> Hi Nagu, regarding what can "hang" in the child part, e.g close
>>>>>>>> of file descriptors. /BR HansN On 07/30/13 09:01, Hans Nordebäck wrote:
>>>>>>>>                
>>>>>>>>> Hi Nagu,
>>>>>>>>>
>>>>>>>>> On 07/30/13 08:54, Nagendra Kumar wrote:
>>>>>>>>>                  
>>>>>>>>>> Hi Hans N,
>>>>>>>>>>
>>>>>>>>>>                    
>>>>>>>>>>>> 1. OPENSAF_CHILD_EXEC_TIME_TOLERANCE is the name of a new
>>>>>>>>>>>> environment variable where value is used as input to alarm,
>>>>>>>>>>>> if not set it is default 2 seconds.
>>>>>>>>>>>>                        
>>>>>>>>>> Do we have some place holder for this variable for
>>>>>>>>>> configuration and are we going to add it in README for information.
>>>>>>>>>>                    
>>>>>>>>> Perhaps the name isn't the best, but it should be handled as
>>>>>>>>> the other env variable I guess, e.g. "AVND_PM_MONITORING_RATE", etc.
>>>>>>>>>                  
>>>>>>>>>>>> if the child  "hangs" before exec this extra coredump should
>>>>>>>>>>>> give information  where/what is wrong.
>>>>>>>>>>>>                        
>>>>>>>>>> This means that fork hangs, am I right ? If yes, then dump is
>>>>>>>>>> not going to provide any information as it is a system call,
>>>>>>>>>> it can only show, ithangs in fork.
>>>>>>>>>>                    
>>>>>>>>> I don't think fork hangs as the parent part continues and
>>>>>>>>> later, with the help of ncs_exec_mod_hdlr,  the parent detects
>>>>>>>>> that the child or "exec" has timed out,
>>>>>>>>> 10 sec in this case. But in this case the exec has not been
>>>>>>>>> performed.
>>>>>>>>>                  
>>>>>>>>>>>> After exec, it will work as usual
>>>>>>>>>>>>                        
>>>>>>>>>> This confirms that we are only targeting fork to debug.
>>>>>>>>>>                    
>>>>>>>>> Yes, the extra core dump will help troubleshooting.
>>>>>>>>> /BR HansN
>>>>>>>>>                  
>>>>>>>>>> Thanks
>>>>>>>>>> -Nagu
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Hans Nordebäck [mailto:[email protected]]
>>>>>>>>>> Sent: 30 July 2013 11:57
>>>>>>>>>> To: Nagendra Kumar
>>>>>>>>>> Cc: [email protected]; Praveen Malviya;
>>>>>>>>>> Ramesh Babu Betham; Hans Feldt; Hans Nordebäck
>>>>>>>>>> Subject: RE: [PATCH 1 of 1] leap: ncs_os_process_execute_timed
>>>>>>>>>> child process takes too long time before exec (#514)
>>>>>>>>>>
>>>>>>>>>> Hi Nagu,
>>>>>>>>>>
>>>>>>>>>> 1. OPENSAF_CHILD_EXEC_TIME_TOLERANCE is the name of a new
>>>>>>>>>> environment variable where value is used as input to alarm,
>>>>>>>>>> if not set it is default 2 seconds.
>>>>>>>>>> 2. Yes you are right, in this particular case it is set to 10
>>>>>>>>>> sec, that's why the env. variable above can be set.
>>>>>>>>>> 3. This alarm is just an additional precaution, at no extra
>>>>>>>>>> cost, to check the child part before the exec.  After exec
>>>>>>>>>>            it will work as usual but if the child  "hangs"
>>>>>>>>>> before exec this extra coredump should give information  where/what 
>>>>>>>>>> is wrong.
>>>>>>>>>>
>>>>>>>>>> /BR HansN
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Nagendra Kumar [mailto:[email protected]]
>>>>>>>>>> Sent: den 30 juli 2013 07:11
>>>>>>>>>> To: Hans Nordebäck; Praveen Malviya; Hans Feldt; Ramesh Babu
>>>>>>>>>> Betham
>>>>>>>>>> Cc: [email protected]
>>>>>>>>>> Subject: RE: [PATCH 1 of 1] leap: ncs_os_process_execute_timed
>>>>>>>>>> child process takes too long time before exec (#514)
>>>>>>>>>>
>>>>>>>>>> Hi Hans N,
>>>>>>>>>>              For my understanding, can you please provide the
>>>>>>>>>> below
>>>>>>>>>> information:
>>>>>>>>>>
>>>>>>>>>> 1.    I can't find OPENSAF_CHILD_EXEC_TIME_TOLERANCE in opensaf
>>>>>>>>>> source code.
>>>>>>>>>> 2.    I hope the child process is hung for more than
>>>>>>>>>> saAmfCtDefClcCliTimeout resulting in CLC time out. Am I right?
>>>>>>>>>> 3.    Even we add assert in child process and we get core dump, but
>>>>>>>>>> it may not give any information as it got delayed because of
>>>>>>>>>>          system issue. Are we targeting, which system call the
>>>>>>>>>> child process is hung?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> -Nagu
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Hans Nordeback [mailto:[email protected]]
>>>>>>>>>> Sent: 22 July 2013 17:07
>>>>>>>>>> To: Nagendra Kumar; Praveen Malviya; [email protected];
>>>>>>>>>> Ramesh Babu Betham
>>>>>>>>>> Cc: [email protected]
>>>>>>>>>> Subject: [PATCH 1 of 1] leap: ncs_os_process_execute_timed
>>>>>>>>>> child process takes too long time before exec (#514)
>>>>>>>>>>
>>>>>>>>>>        osaf/libs/core/leap/os_defs.c |  27 
>>>>>>>>>> +++++++++++++++++++++++++++
>>>>>>>>>>        1 files changed, 27 insertions(+), 0 deletions(-)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> amfnd calls ncs_os_process_execute_timed and the child process
>>>>>>>>>> takes too long time before exec, (10 sec timeout). An alarm is
>>>>>>>>>> set in the ncs_os_process_execute_timed child process. If
>>>>>>>>>> timed out a core dump will be produced to be able to trouble shoot.
>>>>>>>>>>
>>>>>>>>>> diff --git a/osaf/libs/core/leap/os_defs.c
>>>>>>>>>> b/osaf/libs/core/leap/os_defs.c
>>>>>>>>>> --- a/osaf/libs/core/leap/os_defs.c
>>>>>>>>>> +++ b/osaf/libs/core/leap/os_defs.c
>>>>>>>>>> @@ -65,6 +65,15 @@ bool gl_ncs_atomic_mtx_initialise = fals
>>>>>>>>>>         * description of SOCK_CLOEXEC. */
>>>>>>>>>>        static pthread_mutex_t s_cloexec_mutex =
>>>>>>>>>> PTHREAD_MUTEX_INITIALIZER;
>>>>>>>>>>        +/*
>>>>>>>>>> + * ALRM signal is used to detect if child process takes too
>>>>>>>>>> + long
>>>>>>>>>> time before exec.
>>>>>>>>>> + *
>>>>>>>>>> + * @param sig
>>>>>>>>>> + */
>>>>>>>>>> +static void sigalrm_handler(int sig) {
>>>>>>>>>> +    abort();
>>>>>>>>>> +}
>>>>>>>>>> /*************************************************************
>>>>>>>>>> **************
>>>>>>>>>>
>>>>>>>>>>         *
>>>>>>>>>>         * uns64
>>>>>>>>>> @@ -999,6 +1008,22 @@ uint32_t ncs_os_process_execute_timed(NC
>>>>>>>>>>            osaf_mutex_lock_ordie(&s_cloexec_mutex);
>>>>>>>>>>              if ((pid = fork()) == 0) {
>>>>>>>>>> +                unsigned int alarm_time_sec;
>>>>>>>>>> +                char* alarm_time;
>>>>>>>>>> +
>>>>>>>>>> +                if (signal(SIGALRM, sigalrm_handler) == SIG_ERR) {
>>>>>>>>>> +                        LOG_ER("signal ALRM failed: %s",
>>>>>>>>>> strerror(errno));
>>>>>>>>>> +                }
>>>>>>>>>> +                if ((alarm_time =
>>>>>>>>>> getenv("OPENSAF_CHILD_EXEC_TIME_TOLERANCE")) != NULL) {
>>>>>>>>>> +                        alarm_time_sec = strtol(alarm_time,
>>>>>>>>>> NULL, 0);
>>>>>>>>>> +                }
>>>>>>>>>> +                else {
>>>>>>>>>> +                        // default alarm timeout 2 seconds
>>>>>>>>>> +                        alarm_time_sec = 2;
>>>>>>>>>> +                }
>>>>>>>>>> +
>>>>>>>>>> +                alarm(alarm_time_sec);
>>>>>>>>>> +
>>>>>>>>>>                /*
>>>>>>>>>>                 ** Make sure forked processes have default
>>>>>>>>>> scheduling class
>>>>>>>>>>                 ** independent of the callers scheduling class.
>>>>>>>>>> @@ -1054,6 +1079,8 @@ uint32_t ncs_os_process_execute_timed(NC
>>>>>>>>>>                }
>>>>>>>>>>        #endif
>>>>>>>>>>        +                alarm(0);
>>>>>>>>>> +
>>>>>>>>>>                /* child part */
>>>>>>>>>>                if (execvp(req->i_script, req->i_argv) == -1) {
>>>>>>>>>>                    syslog(LOG_ERR, "%s: execvp '%s' failed -
>>>>>>>>>> %s", __FUNCTION__, req->i_script, strerror(errno));
>>>>>>>>>>                    
>>
>>
>> ----------------------------------------------------------------------
>> -------- Get 100% visibility into Java/.NET code with AppDynamics
>> Lite!
>> It's a free troubleshooting tool designed for production.
>> Get down to code-level detail for bottlenecks, with <2% overhead.
>> Download for free and get started troubleshooting in minutes.
>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.c
>> lktrk _______________________________________________
>> Opensaf-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
>>    
>
>
> ------------------------------------------------------------------------------
> Get 100% visibility into Java/.NET code with AppDynamics Lite!
> It's a free troubleshooting tool designed for production.
> Get down to code-level detail for bottlenecks, with <2% overhead.
> Download for free and get started troubleshooting in minutes.
> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
> _______________________________________________
> Opensaf-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel



------------------------------------------------------------------------------
Get 100% visibility into Java/.NET code with AppDynamics Lite!
It's a free troubleshooting tool designed for production.
Get down to code-level detail for bottlenecks, with <2% overhead. 
Download for free and get started troubleshooting in minutes. 
http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to