Wouldn't this interact rather badly with the /gc option (which also
leaves C<pos> set on failure)?

This question arose because I was trying to work out how one would
write a lexer with the new /z option, and it made my head ache ;-)

> As you can see from the example code, the program flow stays very close
> to what people would ordinarily program under normal circumstances.
>
> By contrast, RFC 93 proposes another solution to the same problem, but
> using callbacks. Since the same sub must do one of several things, the
> first thing that needs to be done is to channel different kinds of
> requests to their own handler. As a result, you need a complete rewrite
> from what you'd use in the ordinary case.
>
> I think that a lot of people will find my approach far less
> intimidating.

I'm not sure I see that this:

>     my $chunksize = 1024;
>     while(read FH, my $buffer, $chunksize) {
>         while(/(abcd|bc)/gz) {
>             # do something boring with the matched string:
>             print "$1\n";
>         }
>         if(defined pos) { # end-of-buffer exception
>             # append the next chunk to the current one
>             read FH, $buffer, $chunksize, length $buffer;
>             # retry matching
>             redo;
>         }
>     }

is less intimidating or closer to the "ordinary program flow" than:

    \*FH =~ /(abcd|bc)/g;

(as proposed in RFC 93).

> =head2 Match prefix
>
> It can be useful to be able to recognize if a string could possibly be a
> prefix for a potential match. For example in an interactive program,
> you want to allow a user to enter a number into an input field, but
> nothing else. After every single keystroke, you can test what he just
> entered against a regex matching the valid format for a number, so that
> C<1234E> can be recognized as a prefix for the regex
>
>     /^\d+\.?\d*(?:E[+-]?\d+)$/

Isn't this just:

    \*STDIN =~ /^\d+\.?\d*(?:E[+-]?\d+)$/ or die "Not a number";

???

Damian