Re: [OT] select and sysread problem on solaris

Mark Blackman Thu, 11 Sep 2008 02:42:13 -0700

On 11 Sep 2008, at 02:12, Paul Johnson wrote:

I'm looking for a little help in solving a problem which has me stumped
and couldn't think of anywhere better to come.  That's not the problem
by the way, but I'll take answers to that as well.


I have about 210 named pipes (FIFOs) and three processes which are
running a select over a third of the pipes each, and then calling
sysread on the pipe before writing out the data to log files.

This has been working well in production for almost two years handling
many GB of data daily.

Recently, another thirty or so pipes have been added to this group and
very occasionally I am noticing a problem whereby select will indicate
that a pipe is ready for reading and sysread will attempt to read from
the pipe, but there is actually nothing there to be read, and so the
sysread call hangs waiting for input.

Reproducing this problem is difficult, but I currently have the system
in such a state.  The pipe on which the sysread call is waiting is one
of the new pipes.

I can only think of four possible explanations here:

 1.  My code is broken.  I don't think this is the case but don't want
     to rule it out.

 2.  Some other process has read the data in between the select returning
     and the sysread being called.  lsof shows no unexpected processes
     accessing the pipe at the moment and no one should have been on the
     system to have run cat or anything.  last shows nothing suspicious.

 3.  Perl's select is broken.

 4.  The OS is broken.

Is my assumption correct that if select tells you there is something to
be read then there should be something there to be read?  Can anyone
think of any other possibilities?

What is curious to me is that the process writing to the named pipe is
hung.  Is the pipe locked somehow until the sysread call has returned?

Unless I can think of anything better to do, tomorrow I will try to send
some data to the named pipe that is being read to see if that will allow
the sysread to return.  If it does, I should be able to tell whether any
data has been lost from the named pipe, which might indicate that
another process had read it.

I am running perl-5.8.8 on Solaris 8.  The program writing to the named pipe
is a Java program which is writing to STDOUT.  That program has been
called using system by a Perl wrapper which has reopened STDOUT to the
named pipe.  The program reading from the named pipe is using PERLIO.

I'm open to any hints, suggestions or solutions.


This reminds me of an issue I found with select/sysread on Solaris too,
although it turned out to be a misunderstanding on my part of the Perl
sysread semantics compared to the read system call.  It was something
to do with what happened when a pipe was closed unexpectedly, I think.
You might review the docs on sysread and select, but I'm sure you've
done that already.

The Perl select docs also suggest you use the O_NONBLOCK flag for the
case you're referring to as well.
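
As a rough illustration of the O_NONBLOCK approach (a sketch only, written
in Python for brevity rather than Perl; the helper names set_nonblocking,
try_read and read_ready are made up here, and none of this is taken from
your code or from the select docs): set the flag on each descriptor, and
treat "would block" after a ready indication as "nothing there after all"
instead of waiting.

```python
import fcntl
import os
import select


def set_nonblocking(fd):
    """Add O_NONBLOCK to the descriptor's existing status flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def try_read(fd, size=65536):
    """Return the bytes read, b"" at end of file, or None if the read
    would have blocked (EAGAIN / EWOULDBLOCK)."""
    try:
        return os.read(fd, size)
    except BlockingIOError:
        return None


def read_ready(fds, timeout):
    """Select over fds, then read each one reported as readable.

    A descriptor that select reported as ready but that has no data is
    simply skipped, so a spurious wakeup cannot hang the loop.
    """
    readable, _, _ = select.select(fds, [], [], timeout)
    results = {}
    for fd in readable:
        data = try_read(fd)
        if data is not None:
            results[fd] = data
    return results
```

In Perl the same idea would be to set O_NONBLOCK with fcntl (constants from
the Fcntl module) and to check for sysread returning undef with $! set to
EAGAIN, going back to select when that happens.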

Sorry, but that's all I can offer without doing any serious research.

- Mark
