> From: Vic Sage
> 
> I'm writing a multiplexed TCP-based server that reads "\n"-terminated
> strings from clients and does something with them.  Since this is one process
> with multiple client connections, it uses select() (or, actually, can_read from
> IO::Select) to block until data has arrived from any client.
> 
> It's my understanding that TCP does not preserve "message boundaries."
> That is, if a client writes:
> 
> $sock->autoflush(1);
> print $sock "abel\n";
> print $sock "baker\n";
> print $sock "charlie\n";
> 
> ... there are no guarantees about what will be returned from the server's
> system buffer by sysread() when select() pops.  It could be "a", or "abel\n",
> or "abel\nbak", etc.
> 
> What I *want* is to block until an entire "\n"-terminated string [can that be
> referred to as a "line"?]  can be retrieved from one of my clients.  I'm sure I
> could work out the logic to maintain application-level buffers, but I suspect I
> would merely be reinventing the wheel, one bug at a time :-).   What does
> the experienced Perl programmer - or socket-level programmer in general -
> do in this situation?

TCP is a stream-oriented protocol. As you noted, it makes no guarantee about 
how the data is split up or run together in transit, because it does not care 
what it transports. It guarantees only that the octets will arrive in the same 
order they were sent. It makes no assumptions about, and places no restrictions 
on, what those octets represent.

Therefore, Perl cannot do what you asked for either. It is necessary for your 
application to keep track of the message boundaries and manage multiple and/or 
incomplete messages from a single read.
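
As a minimal sketch of that bookkeeping, assuming one buffer per client and "\n" as the boundary marker: the helper name take_lines below is made up for illustration, and in a real server the chunk would be whatever sysread() returned for that client.

```perl
use strict;
use warnings;

# Hypothetical helper: append a freshly read chunk to a client's buffer and
# return every complete "\n"-terminated line that is now available. Any
# trailing partial line stays in the buffer until more data arrives.
sub take_lines {
    my ($buf_ref, $chunk) = @_;
    $$buf_ref .= $chunk;
    my @lines;
    # Strip one complete line off the front of the buffer per iteration.
    while ($$buf_ref =~ s/\A([^\n]*)\n//) {
        push @lines, $1;
    }
    return @lines;
}

# Simulate three reads that split the data at arbitrary points.
my $buffer = '';
for my $chunk ("a", "bel\nbak", "er\ncharlie\n") {
    print "got: $_\n" for take_lines(\$buffer, $chunk);
}
```

The first simulated read yields nothing, because no boundary has arrived yet. The later reads yield "abel", then "baker" and "charlie".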

Having dealt with this issue many times over the past 20 years or so, I have 
developed a couple of approaches. The first is a two-layer design. The lower 
layer reads the incoming data from the socket and queues it into a circular 
buffer. The upper layer extracts individual messages from the buffer and hands 
them off to a parser, or whatever is needed next. The right size for the buffer 
depends on many factors that vary by application and protocol, and in some 
cases the risk of overflow is too great.

The second approach uses message queues. The incoming octets are moved into a 
buffer until the boundary marker arrives. That buffer is then sent to a message 
queue and the next buffer is opened. I often use XINU-style message queues for 
this approach. The risk here is the possibility of running out of buffers.
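
A rough sketch of the message-queue idea in Perl follows. It uses plain strings and an array rather than XINU-style queues or a true circular buffer. The names feed and next_message, and the 8192-byte cap, are assumptions made for illustration only.

```perl
use strict;
use warnings;

my $MAX_PENDING = 8192;  # assumed cap on an unterminated message, in bytes
my %pending;             # client id => octets received since the last "\n"
my @queue;               # completed messages as [client id, text] pairs

# Hypothetical lower layer: call with whatever sysread() returned for a
# client. Returns false if the client has exceeded the cap without sending
# a boundary marker, so the caller can drop the connection.
sub feed {
    my ($id, $chunk) = @_;
    $pending{$id} = '' unless defined $pending{$id};
    $pending{$id} .= $chunk;
    while ((my $pos = index($pending{$id}, "\n")) >= 0) {
        push @queue, [ $id, substr($pending{$id}, 0, $pos) ];
        substr($pending{$id}, 0, $pos + 1, '');  # drop the message and its "\n"
    }
    return length($pending{$id}) <= $MAX_PENDING;
}

# Hypothetical upper layer: hand the next complete message to the parser.
sub next_message { return shift @queue }

# Two clients whose data arrives interleaved and fragmented.
feed('c1', "abel\nbak");
feed('c2', "xray\n");
feed('c1', "er\n");
while (my $msg = next_message()) {
    print "$msg->[0]: $msg->[1]\n";
}
```

Capping the pending data per client is one way to bound the overflow risk: a client that never sends a boundary marker can be disconnected instead of exhausting memory.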

There may be other options to simplify this. One popular variation is to 
precede each message with a two-byte length value, normally a 16-bit integer 
in network byte order. You read the two bytes, then do another read for the 
number of bytes they indicate. You still have to manage fragments, but you no 
longer need to deal directly with multiple messages in a single read.
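
The length-prefix variation might look something like this in Perl, using pack and unpack with the "n" template (16-bit unsigned, network byte order). The names frame and take_frames are hypothetical, and the sketch assumes the payload is a byte string of at most 65535 bytes.

```perl
use strict;
use warnings;

# Sender side: prefix the payload with its length as a 16-bit integer in
# network byte order.
sub frame {
    my ($payload) = @_;
    die "payload too long for a 16-bit length\n" if length($payload) > 0xFFFF;
    return pack('n', length($payload)) . $payload;
}

# Receiver side: append the new chunk to the buffer, then return every
# message whose length prefix and body have both fully arrived.
sub take_frames {
    my ($buf_ref, $chunk) = @_;
    $$buf_ref .= $chunk;
    my @messages;
    while (length($$buf_ref) >= 2) {
        my $len = unpack('n', $$buf_ref);        # read the two-byte prefix
        last if length($$buf_ref) < 2 + $len;    # body is still a fragment
        push @messages, substr($$buf_ref, 2, $len);
        substr($$buf_ref, 0, 2 + $len, '');      # consume prefix and body
    }
    return @messages;
}

# Two framed messages, delivered in two arbitrary pieces.
my $wire   = frame("abel") . frame("baker");
my $buffer = '';
for my $chunk (substr($wire, 0, 3), substr($wire, 3)) {
    print "got: $_\n" for take_frames(\$buffer, $chunk);
}
```

This version keeps a buffer instead of issuing a second blocking read, so it stays compatible with a select()-driven loop where a read may return fewer bytes than the prefix announced.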

Bob McConnell

