From: Vic Sage
I'm writing a multiplexed TCP-based server that reads \n-terminated
strings from clients and does something with them. Since this is one process
with multiple client connections, it uses select() (or, actually, can_read
from IO::Select) to block until data has arrived from any client.
It's my understanding that TCP does not preserve message boundaries.
That is, if a client writes:
$sock->autoflush(1);
print $sock "abel\n";
print $sock "baker\n";
print $sock "charlie\n";
... there are no guarantees about what will be returned from the server's
system buffer by sysread() when select() pops. It could be "a", or "abel\n",
or "abel\nbak", etc.
What I *want* is to block until an entire \n-terminated string [can that be
referred to as a line?] can be retrieved from one of my clients. I'm sure I
could work out the logic to maintain application-level buffers, but I suspect
I would merely be reinventing the wheel, one bug at a time :-). What does
the experienced Perl programmer - or socket-level programmer in general -
do in this situation?
TCP is a stream-oriented protocol. As you noted, it makes no guarantee that
one write arrives as one read: data may be split into fragments or merged with
adjacent writes, because TCP does not care what it transports. It simply
guarantees the octets will arrive in the same order they were sent. There are
no assumptions or restrictions on what those octets represent.
Therefore, Perl cannot do what you asked for either. It is necessary for your
application to keep track of the message boundaries and manage multiple and/or
incomplete messages from a single read.
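
A minimal sketch of that bookkeeping, assuming one string buffer per client keyed by fileno. The names pump and %buffer and the 4096-byte read size are made up for illustration and are not from the original post. Each readable handle gets one sysread(), the new octets are appended to that client's buffer, and only complete \n-terminated lines are handed to the callback; a trailing fragment stays buffered until the rest arrives.

```perl
use strict;
use warnings;
use IO::Select;

my %buffer;    # per-client accumulation buffers, keyed by fileno (assumed layout)

# Service every handle that is readable right now. Calls $on_line->($fh, $line)
# once per complete "\n"-terminated line and returns the handles that were
# dropped because of EOF or a read error.
sub pump {
    my ($select, $on_line, $timeout) = @_;
    my @closed;
    for my $fh ($select->can_read($timeout)) {
        my $key = fileno($fh);
        $buffer{$key} = '' unless defined $buffer{$key};

        # Append whatever arrived to the end of this client's buffer.
        my $n = sysread($fh, $buffer{$key}, 4096, length($buffer{$key}));
        if (!$n) {    # undef means error, 0 means EOF; this sketch treats both alike
            delete $buffer{$key};
            $select->remove($fh);
            push @closed, $fh;
            next;
        }

        # Peel off complete lines; an incomplete tail stays in the buffer.
        while ($buffer{$key} =~ s/\A([^\n]*)\n//) {
            $on_line->($fh, $1);
        }
    }
    return @closed;
}
```

A server loop would call pump($select, \&handle_line) repeatedly, where handle_line is whatever the application does with a line. The sketch puts no limit on how large a buffer may grow, so a client that never sends "\n" can consume memory without bound.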
Having dealt with this issue many times over the past 20 years or so, I have
developed a couple of approaches. The first is a two-layer version: the lower
layer reads the incoming data from the socket and queues it into a circular
buffer, and the next layer extracts individual messages from the buffer and
hands them off to a parser, or whatever is needed next. The right buffer size
depends on many factors that vary by application and protocol, and the
possibility of overflow is too much of a risk in some cases.
Another approach is message queues, where the incoming octets are moved into a
buffer until the boundary marker arrives. That buffer is sent to a message
queue and the next buffer is opened. I often use XINU-style message queues for
this approach. The risk here is the possibility of running out of buffers.
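
The split between a layer that accepts raw octets and a layer that hands out whole messages can be sketched as below. This is neither a real circular buffer nor a XINU message queue: it swaps in a plain Perl string and array, with explicit limits so the overflow and out-of-buffers risks stay visible. The name make_framer, the option names, and the default limits are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a framer with a cap on the size of the partial
# message and on the number of complete messages waiting in the queue.
sub make_framer {
    my (%opt) = @_;
    my $max_partial = $opt{max_partial} // 8192;    # assumed default limits
    my $max_queued  = $opt{max_queued}  // 64;
    my $partial     = '';    # octets received since the last boundary marker
    my @queue;               # complete messages waiting for the consumer

    return {
        # Layer 1: accept raw octets from one read. Returns false on overflow.
        feed => sub {
            my ($octets) = @_;
            $partial .= $octets;
            while ((my $i = index($partial, "\n")) >= 0) {
                return 0 if @queue >= $max_queued;    # out of "buffers"
                my $msg = substr($partial, 0, $i + 1, '');
                chomp $msg;
                push @queue, $msg;
            }
            return length($partial) <= $max_partial ? 1 : 0;
        },
        # Layer 2: hand the next complete message to the parser, or undef.
        next_message => sub { return shift @queue },
    };
}
```

One framer per client connection keeps fragments from different clients apart. What to do when feed returns false (drop the client, stop reading from it for a while, and so on) is an application decision that the sketch leaves open.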
There may be other options to simplify this. One popular variation is to
precede each message with a two-byte length value. Normally this will be a
16-bit integer in network byte order. You read the two bytes, then do another
read for the number of bytes indicated by them. You still have to manage
fragments, but no longer need to deal directly with multiple messages in one read.
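
A small sketch of the length-prefix variation, using pack/unpack with the 'n' template (16-bit unsigned, network byte order). The function names frame and unframe are made up for illustration. Rather than issuing a second blocking read, this version works on an accumulation buffer, which suits a select()-driven server; it assumes the payload is already a string of octets.

```perl
use strict;
use warnings;

# Sender side: 16-bit big-endian length, followed by the payload octets.
sub frame {
    my ($payload) = @_;
    die "payload too long for a 16-bit length\n" if length($payload) > 0xFFFF;
    return pack('n', length($payload)) . $payload;
}

# Receiver side: remove and return every complete message at the front of
# the buffer. A partial header or partial body is left in place for later.
sub unframe {
    my ($buf_ref) = @_;
    my @messages;
    while (length($$buf_ref) >= 2) {
        my $len = unpack('n', $$buf_ref);
        last if length($$buf_ref) < 2 + $len;    # body has not fully arrived
        push @messages, substr($$buf_ref, 2, $len);
        substr($$buf_ref, 0, 2 + $len, '');      # drop the consumed frame
    }
    return @messages;
}
```

The receiver appends each sysread() result to its buffer and then calls unframe(\$buffer). The two-byte prefix limits a message to 65535 octets.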
Bob McConnell