How can I make grep searches across a large number of files run faster?
My plan is to use GNU Parallel to run multiple instances of grep in
parallel, instead of using xargs with the -P option.
Here is my parallel implementation. I first locate the files and then pipe
them to parallel.
The ptrn array holds the search patterns, while ictx holds the context
options for grep.
I would appreciate help getting this working correctly, along with any
suggested improvements.
ptrn=("-e" "FN" "-e" "DS")
ictx=(-A 8)

# isufx, fdir, sgc, sgr and procs are set earlier in my script.
grep --null -r -l "${isufx[@]}" \
    -f <(printf "%s\n" "${ptrn[@]}") -- "${fdir[@]}" \
  | PARALLEL_SHELL=bash psgc=$sgc psgr=$sgr \
    parallel -m -0 -k -j "$procs" \
      'for fl in {}; do
           printf "\n%s\n\n" "${psgc}==> $fl <==${psgr}"
           grep -ni '"${ictx[@]@Q}"' '"${ptrn[@]@Q}"' -- "$fl"
       done'
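A minimal sketch of the quoting trick the inner grep depends on: a bash array cannot be exported to a child shell, but the @Q expansion (bash 4.4+) turns each element into a shell-quoted word that can be spliced into the command string the child parses. The array below just reuses the example patterns; nothing else is from the real script.

```shell
# bash arrays cannot be exported, so serialize them with the @Q
# expansion and splice the result into the child's command string.
ptrn=("-e" "FN" "-e" "DS")

# ${ptrn[*]@Q} expands to: '-e' 'FN' '-e' 'DS'  (one quoted word each)
out=$(bash -c 'printf "[%s]" '"${ptrn[*]@Q}")
printf '%s\n' "$out"    # [-e][FN][-e][DS]
```

Inside the parallel command string the `[@]` form works the same way, because parallel joins its arguments with spaces before handing the command to the child shell.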
For comparison, I know that find can locate the files and pipe them to xargs
with the -P option to run multiple grep instances in parallel, but I would
prefer to get the GNU Parallel version above working.
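For reference, a hedged sketch of that xargs baseline, run against throwaway files in a temporary directory (the file names and patterns are placeholders, not from my actual script):

```shell
# Create sample files to search (placeholders for the real tree).
dir=$(mktemp -d)
printf 'hello FN world\n' > "$dir/a.txt"
printf 'nothing here\n'   > "$dir/b.txt"

# -print0/-0 keep unusual filenames safe; -P runs up to 4 greps at
# once; -n batches files per grep invocation; -H forces the filename
# prefix even when a batch contains a single file.
out=$(find "$dir" -type f -print0 \
        | xargs -0 -r -P 4 -n 64 grep -nH -e FN -e DS -- \
        | sort)
printf '%s\n' "$out"

rm -rf "$dir"
```

Unlike GNU Parallel's -k, xargs gives no ordering guarantee across parallel batches, which is why the sketch pipes through sort.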