I can't speak for the grep guys, but at least I was correct that current gawk is much faster than gawk 4.0.2.

Arnold

Daniel Green <ddgr...@gmail.com> wrote:

> I don't have access to a newer gawk where I did the initial timings, but I
> ran an almost identical test on my home machine.
>
> grep (v3.11): ~0.60s
> perl (v5.38.0): ~3.21s
> gawk (v4.0.2 built from source with `-O3 -march=native`): ~10.22s
> gawk (v5.2.2 built from source with `-O3 -march=native`): ~4.95s
>
> If grep will never add this functionality I'll survive, it just seemed like
> it might not be too much work to implement, and would probably still be
> much faster than using awk/perl. I've never looked at the grep source code
> before, but could be tempted to try implementing it myself if there was any
> chance of the patch being accepted.
>
> Dan
>
> On Mon, Aug 21, 2023 at 2:37 PM <arn...@skeeve.com> wrote:
>
> > Gawk 4.0.2 is 11 years old. Try timing the current version,
> > I'll bet it's faster. And it solves your problem NOW,
> > instead of waiting for a feature that the grep developers
> > aren't likely to add.
> >
> > My two cents of course.
> >
> > Arnold
> >
> > Daniel Green <ddgr...@gmail.com> wrote:
> >
> > > That works, as well as the Perl version I've been using:
> > >
> > > perl -ne 'print if ($. == 1 || /pattern/)'
> > >
> > > But timings for a real-life example (3GB file with ~16m lines, CentOS 7)
> > > show the problem:
> > >
> > > grep (v2.20): ~1.15s
> > > perl (v5.36.1): ~4.48s
> > > awk (v4.0.2): ~10.81s
> > >
> > > Admittedly grep is just searching in those timings, but I suspect it could
> > > accomplish the full task with a minimal decrease in speed.
> > >
> > > Dan
> > >
> > > On Mon, Aug 21, 2023 at 12:57 PM <arn...@skeeve.com> wrote:
> > >
> > > > Daniel Green <ddgr...@gmail.com> wrote:
> > > >
> > > > > I'm frequently searching CSV files with 20-30 columns, and when there's a
> > > > > hit it can be hard to know what the columns are. An option to also print
> > > > > the first line of a file (either always, or only if that file had a match
> > > > > to the pattern) in addition to any hits would be nice.
> > > > >
> > > > > Thanks,
> > > > > Dan
> > > >
> > > > It sounds like awk would be a better tool:
> > > >
> > > > awk 'FNR == 1 || /pattern/' files ...
> > > >
> > > > should do the trick.
> > > >
> > > > HTH,
> > > >
> > > > Arnold
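
For anyone who wants to reuse the awk approach suggested in the thread, here is a minimal sketch that wraps it in a shell function. The function name `grep_with_header` and the variable `pat` are hypothetical, chosen only for illustration; the awk condition is the one from the thread, with the pattern passed in through `-v` so it does not have to be spliced into the program text. Because awk processes backslash escapes in `-v` values, patterns containing backslashes may need extra escaping.

```shell
# Hypothetical helper (name is an assumption, not from the thread).
# Usage: grep_with_header PATTERN FILE...
grep_with_header() {
    pattern=$1
    shift
    # FNR resets to 1 at the start of each input file, so the first line
    # (the CSV header) of every file is printed, followed by every line
    # that matches the pattern. A header that also matches is printed once.
    awk -v pat="$pattern" 'FNR == 1 || $0 ~ pat' "$@"
}
```

Called as `grep_with_header Lima data.csv`, this prints the header line of `data.csv` and then each line containing `Lima`. Like the original one-liner, it prints the header of every file, whether or not that file has a match.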