>>>>> "Joe" == Joe Slagel <[EMAIL PROTECTED]> writes:

Joe> We're seeing some very odd behavior with using alarm() around the
Joe> DBI->connect() call only on Solaris platforms. It appears that
Joe> even though we clear the alarm after a successful connection, for
Joe> some reason the alarm signal is still getting sent and we receive
Joe> a very nice "Alarm Clock" message.  The following script
Joe> demonstrates the problem (on Solaris), but works fine on Linux.
Joe> Anyone have any guesses why this may be?

>>>>> "David" == "David M. Lloyd" <[EMAIL PROTECTED]> writes:

David> This is a Perl bug on Solaris.  Doing an alarm(0) doesn't
David> really clear the alarm.

>>>>> "Tim" == "Tim Bunce" <[EMAIL PROTECTED]> writes:

Tim> Can you point me to a url for more information on the bug?

I've been following this thread through the archives, and I found
another report which was very helpful:

   http://groups.google.com/groups?selm=93he36%24egc%241%40nnrp1.deja.com

Adam points out that, after DBD::Oracle is loaded, Perl's alarm() ends
up calling lwp_alarm() instead of the basic alarm().  The fix is to
load DBD::Oracle explicitly before doing anything with Perl's alarm(),
and with that change every alarm call in the truss output is indeed
lwp_alarm().
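
A minimal sketch of that ordering, for illustration only: the driver
is loaded explicitly first, and alarm() is touched only afterwards.
The variable names are hypothetical, and the eval guard is an
assumption added so the snippet also runs on a machine without DBI or
DBD::Oracle installed.

```perl
use strict;
use warnings;

# Load DBI and the Oracle driver up front, before the first alarm() call.
# Without this, DBI->connect() loads DBD::Oracle lazily, possibly after
# alarm() has already been used.
# The eval guard is an assumption: it lets the sketch run where the
# modules are not installed.
my $driver_loaded = eval { require DBI; require DBD::Oracle; 1 } ? 1 : 0;
warn "DBD::Oracle not available: $@" unless $driver_loaded;

# Only now use alarm().  alarm(0) cancels any pending alarm and returns
# the number of seconds that were left on it (0 if none was pending).
my $remaining = alarm 0;

print "driver preloaded: $driver_loaded, pending alarm: $remaining\n";
```

From the command line, the equivalent is to put -MDBD::Oracle on the
perl command line alongside -MDBI.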

Sadly, I'm still having issues.  Now, instead of getting a rogue
"Alarm Clock" message, it looks like the alarm is ignored until a read
completes, and only then is the signal acted on.  Take this script:

perl -MDBI -we \
  '$dbh = DBI->connect("dbi:Oracle:db", "user", "pass", {});
  alarm 1;
  @info = $dbh->selectrow_array("SELECT COUNT(*) FROM big_table")
    or warn $dbh->errstr();
  print join("|", @info), "\n";
  $dbh->disconnect();'

This works correctly.  After one second, it prints "Alarm Clock" then
exits.

If I install a SIGALRM handler like:

  $SIG{ALRM} = sub { die "foo!\n" };

Then it still takes a long time (executing the whole query to
completion), and I get that message at the end of the run.
Interestingly enough, if I interrupt it with SIGINT (after the alarm
has fired), I still see the "foo!" message.
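
For comparison, here is a minimal sketch of what that handler is
normally expected to do when the blocking call is plain Perl rather
than the Oracle client library.  It is not the failing case: sleep()
stands in for the long query (an assumption, no database is involved),
and run_with_alarm is a hypothetical helper name.

```perl
use strict;
use warnings;

# Run $code with a SIGALRM-based timeout of $secs seconds.
# Returns "ok" if $code finished first, "timeout" if the alarm fired.
sub run_with_alarm {
    my ($secs, $code) = @_;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "foo!\n" };   # same handler as above
        alarm $secs;
        $code->();
        alarm 0;                                   # cancel the pending alarm
        1;
    };
    alarm 0;                       # never leave an alarm pending after a die
    return "ok"      if $ok;
    return "timeout" if $@ eq "foo!\n";
    die $@;                        # some other error: rethrow it
}

# sleep() is a stand-in for the long-running query (assumption).
my $result = run_with_alarm(1, sub { sleep 5 });
print "$result\n";
```

Here the die from the handler interrupts the sleep after about one
second, which is the behaviour the selectrow_array() call above does
not show.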

Looking at the truss output (with "-f -l", so the first column is
pid/lwp_id), I see this:

   13013/1: lwp_alarm(1)                                    = 0

Ok, set the alarm...

   13013/1: write(5, "\0 v\0\006\0\0\0\0\011 k".., 118)     = 118
   13013/1: read(5, "\0 s\0\006\0\0\0\0\01019".., 2064)     = 115
   13013/1: write(5, "\0 n\0\006\0\0\0\0\003 ^".., 110)     = 110

Send the query, read something, send again, and start reading the
query result:

   13013/1:     Received signal #14, SIGALRM, in read() [caught]
   13013/1: read(5, 0x082981DE, 2064)                       Err#4 EINTR

Which gets interrupted by the SIGALRM.

   13013/1: sigprocmask(SIG_SETMASK, 0xDF35A8F4, 0x00000000) = 0
   13013/1: sigprocmask(SIG_SETMASK, 0xDF365E98, 0x00000000) = 0
   13013/1: setcontext(0x080450B4)
   13013/1: read(5, 0x082981DE, 2064)       (sleeping...)

But do some magic, and then resume the read!

   13013/2: signotifywait()                 (sleeping...)
   13013/3: lwp_sema_wait(0xDEF06E3C)       (sleeping...)
   13013/4: lwp_cond_wait(0xDF360E6C, 0xDF360E7C, 0xDF35A66C) (sleeping...)
   13013/5: door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)
   13013/6: lwp_cond_wait(0xDF360E6C, 0xDF360E7C, 0xDF35A66C) (sleeping...)
   13013/1: read(5, "\09C\0\006\0\0\0\0\01019".., 2064)     = 156

The actual result is read here.

   13013/1: brk(0x082AEAF0)                                 = 0
   13013/1: brk(0x082B4AF0)                                 = 0
   13013/1: getcontext(0x08046B04)
   13013/1: setcontext(0x080467E4)
   13013/1: write(2, " f o o !   a t   - e   l".., 19)      = 19

But there's an error pending, so print that ...

Anyway.  Anyone have ideas?

Thanks,
t.
