Several people have been able to reproduce a problem building Perl 
with MMK on OpenVMS Alpha v7.3-1.  The symptom is a lot of looping in 
kernel mode and the parent process doing tens of thousands of DIOs 
per second (or so MONITOR and SHOW PROC/CONT claim).  I believe the 
sequence of events when shutting down a child process in MMK is 
tripping over something introduced as part of the performance 
improvements in 7.3-1, perhaps those affecting AST delivery or those 
affecting mailbox I/O.  The performance improvements in 7.3-1 are 
summarized here:

<http://www.openvms.compaq.com/doc/731FINAL/6657/6657pro.html#pfeat>

What it comes down to is that the write attention AST that is 
supposed to set up notification for when the child wants to send info 
to the parent via a mailbox keeps requeueing itself infinitely even 
if the child no longer exists.  This is timing sensitive and doesn't 
always happen, probably because if the parent image manages to exit 
soon enough the looping sequence never gets started, or perhaps it 
manages to delete the mailbox quickly enough sometimes.

I have no idea why a successfully queued write attention AST is 
counted as a DIO rather than a BIO.  I also don't understand why the 
write attention AST would continually fire when the only process that 
could be writing to it no longer exists; this may well be a bug in 
some of the new VMS code.  However, it seems clear that you never 
want to queue a write attention AST when you know for sure the writer 
is already gone.  I've made the modifications described below to 
accomplish that.  All may not be peachy yet, however; I did see an 
accvio once after this patch, though I could not reproduce it.  There
may still be something fishy going on in the shutdown sequence. 
Reinvoking MMK after the accvio completed the build successfully.  I
also got Perl 5.8.0 to run its test suite, which uses MMK 
extensively, and it passed all tests.

I've patched echo_ast in build_target.c to store and return the 
status it gets from sp_receive.  I've patched sp_wrtattn_ast in 
sp_mgr.c to check the return status it gets from echo_ast.  If 
sp_wrtattn_ast gets SS$_NONEXPR, it knows the child no longer exists, 
so it does not queue itself again; there's not much point in asking 
to be notified when a nonexistent process writes to its output 
mailbox.
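
To make the control flow concrete, here is a minimal portable sketch of
the same idea.  It is not MMK code and does not call any VMS system
service: the DEMO_* status values, demo_ctx, demo_child, demo_rearm and
demo_run are all hypothetical stand-ins I made up for illustration, and
the receive callback is simplified to report one status per firing.
The only point is that the handler re-arms itself unless the callback
reports that the writer is gone.

```c
/* Hypothetical stand-ins for VMS condition codes; on OpenVMS the real
   SS$_NORMAL and SS$_NONEXPR values come from <ssdef.h>. */
#define DEMO_NORMAL  1u
#define DEMO_NONEXPR 2u   /* plays the role of "nonexistent process" */

/* Simplified context, loosely modeled on the ctx used in sp_mgr.c. */
typedef struct {
    unsigned int (*rcvast)(void *);  /* receive callback */
    void *astprm;                    /* argument passed to the callback */
    int requeue_count;               /* times the handler re-armed itself */
} demo_ctx;

/* Stand-in for queueing the write attention AST again. */
static void demo_rearm(demo_ctx *ctx)
{
    ctx->requeue_count++;
}

/* Mirrors the patched handler: re-arm only if the writer still exists. */
unsigned int demo_wrtattn_ast(demo_ctx *ctx)
{
    unsigned int status = ctx->rcvast(ctx->astprm);

    if (status != DEMO_NONEXPR) {
        demo_rearm(ctx);
    }
    return status;
}

/* A pretend child that delivers a few messages and then disappears. */
typedef struct {
    int remaining;
} demo_child;

unsigned int demo_receive(void *prm)
{
    demo_child *child = prm;

    if (child->remaining > 0) {
        child->remaining--;
        return DEMO_NORMAL;
    }
    return DEMO_NONEXPR;   /* the writer is gone */
}

/* Fires the handler until it stops re-arming or max_fires is reached;
   returns how many times it fired. */
int demo_run(int messages, int max_fires)
{
    demo_child child = { messages };
    demo_ctx ctx = { demo_receive, &child, 0 };
    int fires = 0;
    int armed = 1;

    while (armed && fires < max_fires) {
        int before = ctx.requeue_count;

        demo_wrtattn_ast(&ctx);
        fires++;
        armed = (ctx.requeue_count > before);
    }
    return fires;
}
```

With three pending messages the handler fires four times in this
sketch: three deliveries plus the one firing that sees the writer is
gone and declines to re-arm.  Without the status check it would keep
firing until max_fires, which is the simulated analogue of the looping
described above.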

These changes against MMK 3.9-3 are available below as a GNU unified 
diff and also as the output of DIFFERENCES/SLP.  The former can be 
applied with GNU patch and the latter with EDIT/SUM.

--- build_target.c;-0   Mon Dec 28 07:00:47 1998
+++ build_target.c      Thu Oct 17 13:42:12 2002
@@ -869,3 +869,3 @@
 **  Keeps reading the output and echoing it until it gets the magic
-**  end-of-command text.
+**  end-of-command text or the read fails.
 **
@@ -891,5 +891,6 @@
     $DESCRIPTOR(end_marker,EOM_TEXT);
+    unsigned int status;

     INIT_DYNDESC(rcvstr);
-    while (OK(sp_receive(&spctx, &rcvstr, 0))) {
+    while (OK(status = sp_receive(&spctx, &rcvstr, 0))) {
        if (rcvstr.dsc$w_length > EOM_LEN &&
@@ -909,3 +910,3 @@
     }
-    return SS$_NORMAL;
+    return status;
 } /* echo_ast */
--- sp_mgr.c;-0 Mon Dec 28 06:15:58 1998
+++ sp_mgr.c    Thu Oct 17 14:06:42 2002
@@ -451,9 +451,10 @@
     unsigned int status;

     status = (ctx->rcvast)(ctx->astprm);
-    sys$qiow(0, ctx->outchn, IO$_SETMODE|IO$M_WRTATTN, 0, 0, 0,
-       sp_wrtattn_ast, ctx, 0, 0, 0, 0);
-
+    if (status != SS$_NONEXPR) {
+        sys$qiow(0, ctx->outchn, IO$_SETMODE|IO$M_WRTATTN, 0, 0, 0,
+           sp_wrtattn_ast, ctx, 0, 0, 0, 0);
+    }
     return status;

 }
[end of patch]

$ type build_target.dif
-  870,  870
**  end-of-command text or the read fails.
-  892,  894
    unsigned int status;

    INIT_DYNDESC(rcvstr);
    while (OK(status = sp_receive(&spctx, &rcvstr, 0))) {
-  910,  910
    return status;
/
[end of BUILD_TARGET.DIF]

$ type sp_mgr.dif
-  454,  456
    if (status != SS$_NONEXPR) {
        sys$qiow(0, ctx->outchn, IO$_SETMODE|IO$M_WRTATTN, 0, 0, 0,
            sp_wrtattn_ast, ctx, 0, 0, 0, 0);
    }
/
[end of SP_MGR.DIF]

-- 
________________________________________
Craig A. Berry
mailto:craigberry@mac.com

"... getting out of a sonnet is much more
 difficult than getting in."
                 Brad Leithauser
