> From: Vic Sage
>
> I'm writing a multiplexed TCP-based server that reads "\n"-terminated
> strings from clients and does something with them. Since this is one process
> with multiple client connections, it uses select() (or, actually, can_read
> from IO::Select) to block until data has arrived from any client.
>
> It's my understanding that TCP does not preserve "message boundaries."
> That is, if a client writes:
>
>     $sock->autoflush(1);
>     print $sock "abel\n";
>     print $sock "baker\n";
>     print $sock "charlie\n";
>
> ... there are no guarantees about what will be returned from the server's
> system buffer by sysread() when select() pops. It could be "a", or "abel\n",
> or "abel\nbak", etc.
>
> What I *want* is to block until an entire "\n"-terminated string [can that be
> referred to as a "line"?] can be retrieved from one of my clients. I'm sure I
> could work out the logic to maintain application-level buffers, but I suspect
> I would merely be reinventing the wheel, one bug at a time :-). What does
> the experienced Perl programmer - or socket-level programmer in general -
> do in this situation?

TCP is a stream-oriented protocol. As you noted, it makes no guarantee about how the data you write is split up or combined in transit, because it does not care what it transports. It guarantees only that the octets arrive in the order they were sent; it makes no assumptions about, and places no restrictions on, what those octets represent. Perl cannot do what you asked for either, so your application has to keep track of the message boundaries and cope with multiple messages, incomplete messages, or both arriving in a single read.

Having dealt with this issue many times over the past 20 years or so, I have developed a couple of approaches.

The first is a two-layer design. The lower layer reads the incoming data from the socket and queues it in a circular buffer. The upper layer extracts individual messages from the buffer and hands them off to a parser, or whatever is needed next. The right buffer size depends on many variables that differ by application and protocol, and in some cases the possibility of overflow is too much of a risk.

The second uses message queues. Incoming octets are moved into a buffer until the boundary marker arrives; that buffer is then placed on a message queue and the next buffer is opened. I often use XINU-style message queues for this approach. The risk here is running out of buffers.

There may be other options to simplify this. One popular variation is to precede each message with a two-byte length value, normally a 16-bit integer in network byte order. You read the two bytes, then do another read for the number of bytes they indicate. You still have to manage fragments, but you no longer need to deal directly with multiple messages in one read.

Bob McConnell
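
To make the buffering concrete, here is a minimal Perl sketch of the application-level bookkeeping described above. It is not the circular-buffer or XINU-style queue design, just a plain string buffer per client, and the helper names extract_lines and extract_length_prefixed are hypothetical, made up for this illustration. The first helper handles "\n"-terminated messages; the second handles the two-byte length prefix variation, assuming an unsigned 16-bit length in network byte order.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every complete "\n"-terminated line from the
# front of the buffer and return them without the newline. A trailing
# partial line is left in the buffer for the next read to complete.
sub extract_lines {
    my ($buf_ref) = @_;
    my @lines;
    while ($$buf_ref =~ s/\A([^\n]*)\n//) {
        push @lines, $1;
    }
    return @lines;
}

# Hypothetical helper for the length-prefixed variation: each message is
# assumed to be a 16-bit unsigned length in network byte order, followed
# by exactly that many bytes. Incomplete messages stay in the buffer.
sub extract_length_prefixed {
    my ($buf_ref) = @_;
    my @messages;
    while (length($$buf_ref) >= 2) {
        my $len = unpack('n', $$buf_ref);        # reads the first two bytes
        last if length($$buf_ref) < 2 + $len;    # body not fully arrived yet
        push @messages, substr($$buf_ref, 2, $len);
        substr($$buf_ref, 0, 2 + $len, '');      # drop prefix and body
    }
    return @messages;
}

# Simulate the fragments sysread() might hand back for one client.
my $buf = '';
for my $chunk ("a", "bel\nbak", "er\ncharlie\n") {
    $buf .= $chunk;
    print "got: $_\n" for extract_lines(\$buf);
}
```

In a select()-based server you would keep one such buffer per connection (for example in a hash keyed by fileno), append whatever each sysread() returns, and call the helper after every append. A real server should also cap the buffer size so that a client that never sends a terminator cannot grow it without bound.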