>
> BM> Of course I know regexes are evil, index() is much better in performance,
> BM> but it still doesn't solve what I am truly after, and that is a select()
> BM> that returns when a complete line of data is available, not just a few
> BM> characters.
>
> would it be possible for you to switch to a stream protocol? In that
> case, you could read blocks of data, knowing their length before,
> using fast sysread() or read() calls, without need to investigate the
> data by a regular expression.
>
> Such protocols are available in CPAN. POE, for example, has them (I
> only read about them, but I think the block IO wheels would do this).
> Or have a look at IPC::LDT, which implements such a protocol directly
> and handles data transparently, so you could send everything including
> your terminator and expect to receive it as one block on the other
> side (the block length is determined dynamically). It can be used with
> Event.
>

Hmmm, an automatically adjusting block reader. That is clever. However, much of my data is variable-length, even within the same data stream. GPS data, for instance. This is mostly industrial data acquisition, and some of the instruments may take 40 samples in one trace and then 41 samples in the next.

Besides, I've abstracted the concept of filehandles out to the derived classes, so creating a new class is not, and in my opinion should not be, limited by the format of the data it expects. In short, I'm not willing to give up that flexibility, because it might come back to bite me in the future. I've digressed a little beyond the scope of your comments, but I mention it because one of the suggestions I received was to either make everything fixed-size packets or prepend a packet length to each transmission.

I've been playing around with alternatives, and forking would be "the way", except I would have to rewrite a bit of code. I have, however, found a solution that is quite acceptable. My GPS handler used to consume about 15% of the CPU (Pentium II, 233); now it consumes less than 1%.

The old way:
1) Create a filehandle, set it to nonblocking.
2) Add the filehandle to a big select, and wait for select() to return.
3) Read as much data as is available from the filehandle. Split it into
   lines and save any partial line that does not yet have a separator.

The new way:
1) Create a filehandle, set it to blocking.
2) Add the filehandle to a big select, and wait for select() to return.
3) Read one single complete line of data from the filehandle in a
   blocking fashion.
4) Check the filehandle for eof(). If not at eof(), return to 3).
5) Return all lines that were read.

The problem was the slow writers writing individual characters at a time. I found a simple solution that is not very efficient resource-wise, but these are slow writers, and reading these guys inefficiently makes things much better for the fast writers.

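As a rough sketch, steps 3 through 5 of the new way could look something like the Perl below. The sub name read_ready_lines is made up for illustration, the select() call from step 2 is left out, and an in-memory filehandle stands in for a real device so the snippet runs on its own. On a live blocking handle, eof() may itself wait for input when nothing is buffered.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the original code):
# meant to be called once select() has reported $fh as readable.
sub read_ready_lines {
    my ($fh) = @_;
    my @lines;
    while (1) {
        my $line = readline($fh);   # step 3: blocking read of one complete line
        last unless defined $line;  # true end of file or a read error
        chomp $line;
        push @lines, $line;
        last if eof($fh);           # step 4: stop once nothing more is pending
    }
    return @lines;                  # step 5: hand back everything that was read
}

# Stand-in for the device: an in-memory filehandle holding two lines.
my $data = "line one\nline two\n";
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
my @lines = read_ready_lines($fh);
close($fh);

print scalar(@lines), " lines read\n";
```

In the real program the returned lines would go to whichever derived class owns the filehandle; here they are only counted.
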
Having read Perl's implementation of eof(), I found that all it does is try a getc(). If that succeeds, you are not at the end of the file, and it then does an ungetc().

For the slow GPS writer, I solved it with an open() like this:

    open($fh, "cat /dev/gps |");

"cat" very nicely takes the GPS data and sends it over to me a line at a time. I can still open /dev/gps for writing if I need to.

The rest of the fast writers? Well, I write most of the device drivers, so I simply return -EAGAIN on a call to poll(), unless there is an entire line ready to be written out to the reader.

I have a radar application that generates traces at 50 Hz, maximum. I thought for sure the eof() would occasionally return false (more data available) as I loaded down the system, but I see these double reads less than 2% of the time.

I'm impressed and surprised that the solution was so simple. I've never gone into optimization before. I always wrote the code so that it worked, but as the embedded computer was asked to do more and more, I decided something needed to be done.

Why Perl? It got the job done quickly and cheaply (for the customer). If they really want to optimize it, it's going to have to be rewritten in C.

Brian Michalk <http://www.michalk.com>
Life is what you make of it ... never wish you had done something.
Aviator, experimental aircraft builder, motorcyclist, SCUBA diver
musician, home-brewer, entrepreneur and barely single
