How can I make grep searches across a large number of files run faster?
My plan is to use GNU Parallel to run multiple instances of grep in
parallel, instead of using xargs with the -P option.
Here is my parallel implementation. I first locate the files and then pipe
them to parallel.
The ptrn array holds the search patterns, while ictx holds the context
options for grep.
I would appreciate help getting this working correctly, along with any
suggested improvements.
ptrn=("-e" "FN" "-e" "DS")
ictx=(-A 8)

# isufx, fdir, sgc, sgr and procs are set earlier in my script.
grep --null -r -l "${isufx[@]}" \
    -f <(printf "%s\n" "${ptrn[@]}") -- "${fdir[@]}" \
  | PARALLEL_SHELL=bash psgc=$sgc psgr=$sgr \
    parallel -m -0 -k -j "$procs" \
      'for fl in {}; do
           printf "\n%s\n\n" "${psgc}==> $fl <==${psgr}"
           grep -ni '"${ictx[@]@Q}"' '"${ptrn[@]@Q}"' -- "$fl"
       done'
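A minimal sketch of the quoting trick the inner grep depends on: a bash array cannot be exported to a child shell, but the @Q expansion (bash 4.4+) turns each element into a shell-quoted word that can be spliced into the command string the child parses. The array below just reuses the example patterns; nothing else is from the real script.

```shell
# bash arrays cannot be exported, so serialize them with the @Q
# expansion and splice the result into the child's command string.
ptrn=("-e" "FN" "-e" "DS")

# ${ptrn[*]@Q} expands to: '-e' 'FN' '-e' 'DS'  (one quoted word each)
out=$(bash -c 'printf "[%s]" '"${ptrn[*]@Q}")
printf '%s\n' "$out"    # [-e][FN][-e][DS]
```

Inside the parallel command string the `[@]` form works the same way, because parallel joins its arguments with spaces before handing the command to the child shell.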
For comparison, I know that find can locate the files and pipe them to xargs
with the -P option to run multiple grep instances in parallel, but I would
prefer to get the GNU Parallel version above working.
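For reference, a hedged sketch of that xargs baseline, run against throwaway files in a temporary directory (the file names and patterns are placeholders, not from my actual script):

```shell
# Create sample files to search (placeholders for the real tree).
dir=$(mktemp -d)
printf 'hello FN world\n' > "$dir/a.txt"
printf 'nothing here\n'   > "$dir/b.txt"

# -print0/-0 keep unusual filenames safe; -P runs up to 4 greps at
# once; -n batches files per grep invocation; -H forces the filename
# prefix even when a batch contains a single file.
out=$(find "$dir" -type f -print0 \
        | xargs -0 -r -P 4 -n 64 grep -nH -e FN -e DS -- \
        | sort)
printf '%s\n' "$out"

rm -rf "$dir"
```

Unlike GNU Parallel's -k, xargs gives no ordering guarantee across parallel batches, which is why the sketch pipes through sort.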