Suppose we are doing a multiline regex pattern search on a bunch of files
and we want to extract the matches, e.g. for further processing. By
default, grep outputs matches separated by newlines, but since we are doing
multiline patterns this creates the inconvenience that we cannot easily
extract the individual matches. So we would want to have the matches
separated by null bytes. This seems to be a very straightforward feature,
and I was surprised that this was not already possible.

Here is a tiny example:

grep -rzPIho '}\n\n\w\w\b' | od -a

Depending on the files in your file tree, this may yield output like:

0000000   }  nl  nl   m   y  nl   }  nl  nl   i   f  nl   }  nl  nl   m
0000020   y  nl   }  nl  nl   m   y  nl   }  nl  nl   i   f  nl   }  nl
0000040  nl   m   y  nl
0000044

As you can see, we cannot split on newlines to obtain the matches for
further processing, since the matches contain newline characters themselves.

Now grep already has the -Z/--null flag, but that only affects file names
in the output (for example with the -l flag, which makes grep output
filenames instead of matches); it does not change how matches are separated.

So here is the feature request: can we make null-byte separation (the -z
flag, say) also affect the normal output?
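
To illustrate what null-separated output would make possible, here is a
minimal bash sketch. It does not call grep: the printf at the end only
simulates the output being requested, and the function name split_matches
is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read null-terminated records from stdin and print
# each one on its own line, shell-quoted so embedded newlines stay visible.
split_matches() {
    local match n=0
    # -d '' makes read stop at a null byte instead of a newline;
    # IFS= and -r keep whitespace and backslashes in the record intact.
    while IFS= read -r -d '' match; do
        n=$((n + 1))
        printf 'match %d: %q\n' "$n" "$match"
    done
}

# Simulated input (an assumption, not real grep output): two multiline
# matches, each terminated by a null byte.
printf '}\n\nmy\0}\n\nif\0' | split_matches
```

Each match arrives as one record even though it contains newlines, which
is exactly what cannot be done with the newline-separated output above.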


Regards,

Chiel
