When we observed the child process, it was /usr/local/bin/ecl, so if we assume it was forked to invoke the C compiler, then it hung before 'exec'.
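
The parent/child coordination discussed in this message can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code from src/sys/unixsys.d: the function name spawn_with_handshake, the exit code 42, and the use of /bin/sh as the exec target are all hypothetical. It shows one detail that may matter for the hang: if the child closes its own copy of the pipe's write end, a parent that dies before writing produces EOF in the child instead of a blocked read. Whether ECL's child does this has not been checked here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not ECL code): fork a child that waits for one byte
 * on a pipe before calling exec. With parent_writes == 0 the parent never
 * sends the byte, imitating a parent that was killed before the write.
 * Returns the child's exit status, or -1 on error. */
int spawn_with_handshake(int parent_writes)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: drop our copy of the write end. Without this, read()
         * below could block forever after the parent dies, because the
         * child itself would still hold the pipe open for writing. */
        close(fds[1]);

        char byte;
        ssize_t n = read(fds[0], &byte, 1);
        close(fds[0]);
        if (n != 1)
            _exit(42);          /* EOF or error: parent never wrote */

        /* Assumed exec target for the sketch only. */
        execl("/bin/sh", "sh", "-c", "exit 0", (char *)NULL);
        _exit(127);             /* exec failed */
    }

    /* Parent: this is where it would finish setting up its record of the
     * child before releasing it. */
    close(fds[0]);
    if (parent_writes) {
        char byte = 'x';
        if (write(fds[1], &byte, 1) != 1) {
            /* Fall through: the child will see EOF and exit with 42. */
        }
    }
    close(fds[1]);              /* without a prior write, this gives EOF */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Assuming /bin/sh exists, spawn_with_handshake(1) returns 0 (the child was released and ran 'exec'), while spawn_with_handshake(0) returns 42 (the child noticed the missing byte and gave up). If the close(fds[1]) in the child is removed, the second call would instead block in read(), which is the kind of pre-'exec' hang described above.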
I saw in src/sys/unixsys.d that when ECL forks, it first coordinates the parent and child processes by writing a byte to a pipe in the parent and reading that byte in the child. Only after this does the child call 'exec[ve|vp]'. This coordination is necessary to allow the parent to finish initializing the 'process' structure representing the child.

So, a possible explanation of the hanging child process is that the parent was killed by test-grid-agent right before it had written the byte to the child. It's just a guess, and the problem is a bit difficult to reproduce: we saw it only a few times in many attempts.

What would be the best way to debug such a situation if we encounter it in the future? Create a core dump of the hanging ECL? Should ECL be compiled with debug information, or is a normally compiled ECL enough to analyze the core dump?

Best regards,
- Anton

_______________________________________________
Ecls-list mailing list
Ecls-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ecls-list