On Sat, 20 Jun 2020, Albretch Mueller wrote:
_X=".\(html\|txt\)"
_SDIR="$(pwd)"
_AR_TERMS=( Kant "Gilbert Ryle" Hegel )
for iZ in ${!_AR_TERMS[@]}; do
  find "${_SDIR}" -type f -iregex .*"${_X}" -exec grep -il "${_AR_TERMS[$iZ]}" {} \;
done # iZ: terms search/grep'ped inside text files; echo "~";

# this would be much faster
find "${_SDIR}" -type f -iregex .*"${_X}" -exec grep -il "Kant\|Gilbert Ryle\|Hegel" {} \;
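There is also the -exec ... '{}' + form instead of the -exec ... '{}' ';' version. The '+' form hands many file names to each grep invocation rather than starting one grep per file. You could compare them. There was a recent thread here about this?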
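One way to compare the two forms is to count how often the command actually gets started. This is only a rough sketch: the scratch directory and file names are made up, and echo stands in for grep so that each invocation shows up as exactly one output line.

```shell
# Scratch directory with three small files (names are invented for this demo).
demo_dir="$(mktemp -d)"
printf 'Kant\n'  > "$demo_dir/a.txt"
printf 'Hegel\n' > "$demo_dir/b.txt"
printf 'Ryle\n'  > "$demo_dir/c.txt"

# ';' form: the command is started once per file found,
# so echo prints one line per file.
runs_semicolon="$(find "$demo_dir" -type f -exec echo run '{}' ';' | wc -l | tr -d ' ')"

# '+' form: file names are batched onto as few command lines as possible,
# so with three short names echo is started only once.
runs_plus="$(find "$demo_dir" -type f -exec echo run '{}' + | wc -l | tr -d ' ')"

echo "';' form: $runs_semicolon invocations, '+' form: $runs_plus invocation(s)"
rm -rf "$demo_dir"
```

One side effect to keep in mind: grep only prefixes output lines with the file name when it is given more than one file, so the two forms can produce differently shaped output. Adding -H to grep should make the prefix unconditional.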
but how do I know which match happened in order to save it into separate files?
Remove -l flag, use -o flag.

$ man grep  # Read it now. Read it later too.
...
  -l, --files-with-matches
      Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match.
...
  -o, --only-matching
      Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
...
  -n, --line-number
      Prefix each line of output with the 1-based line number within its input file.
...
  -Z, --null
      Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name. For example, grep -lZ outputs a zero byte after each file name instead of the usual newline. This option makes the output unambiguous, even in the presence of file names containing unusual characters like newlines. This option can be used with commands like find -print0, perl -0, sort -z, and xargs -0 to process arbitrary file names, even those that contain newline characters.
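To see what swapping -l for -o (plus -n) buys you, here is a small sketch on a throwaway file. The sample text is invented, and the three -e patterns are just another way of writing the "Kant\|Gilbert Ryle\|Hegel" alternation.

```shell
# Throwaway sample file (contents are invented for this demo).
sample="$(mktemp)"
printf '%s\n' \
  'Kant wrote three Critiques.' \
  'Nothing to see here.' \
  'Gilbert Ryle and Hegel disagree with KANT.' > "$sample"

# -i: ignore case, -n: prefix the line number, -o: print only the matched text.
# Each match gets its own output line, so you can see which term was hit.
matches="$(grep -ino -e 'Kant' -e 'Gilbert Ryle' -e 'Hegel' "$sample")"
printf '%s\n' "$matches"

rm -f "$sample"
```

With a single input file grep leaves off the file name, so this should print 1:Kant, then 3:Gilbert Ryle, 3:Hegel and 3:KANT, one per line. With -l you would only have learned that the file matched something.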
grep doesn't do replacements: https://stackoverflow.com/questions/16197406/grep-regex-replace-specific-find-in-text-file
No. Do you want replacements? You haven't asked for that.
but at least (in my way to understand reality, since it must try such searches sequentially) it should give you the index of the match
It can. See -n flag above.

$ find . -type f \
    \( -name "*.[Hh][Tt][Mm][Ll]" -o -name "*.[Tt][Xx][Tt]" \) \
    -exec grep -ino "Waldo\|Ice Cream" '{}' + | uniq
[...]
./wikipedia.org/Transport_Layer_Security.html:999:Ice Cream
./wikipedia.org/Transport_Layer_Security.html:1623:Ice Cream
./wikipedia.org/Edward_R_Murrow.html:338:Waldo
./wikipedia.org/Firefox_version_history.html:3219:Ice Cream
./wikipedia.org/Firefox_version_history.html:3710:Ice Cream
./wikipedia.org/Firefox_version_history.html:6021:Ice Cream
./wikipedia.org/Rwandan_genocide.html:993:Waldo
./wikipedia.org/Agar.html:61:Ice cream
./wikipedia.org/Agar.html:61:ice cream
./wikipedia.org/Turmeric.html:341:ice cream
./wikipedia.org/There_Will_Be_Blood.html:150:ice cream
./wikipedia.org/Boko_haram.html:1328:ice cream
./wikipedia.org/Phantasm_(franchise).html:126:ice cream
./wikipedia.org/Phantasm_(franchise).html:226:ice cream
./wikipedia.org/Thundarr_the_Barbarian.html:153:Waldo
./wikipedia.org/Ecuador.html:1487:waldo
./wikipedia.org/French_Guiana.html:886:Ice cream
./wikipedia.org/French_Guiana.html:886:ice cream
./wikipedia.org/Henry_david_thoreau.html:136:Waldo
./wikipedia.org/Henry_david_thoreau.html:235:Waldo
[...]
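Since each output line now ends in the term that matched, saving the hits into separate files is a matter of splitting on the last colon. A minimal sketch, assuming the search terms themselves contain no colon; the input lines, the out_dir variable and the output file names are all made up for illustration.

```shell
# Stand-in for file:line:match output like the listing above
# (hypothetical paths; real input would be piped in from find ... -exec grep).
out_dir="$(mktemp -d)"
printf '%s\n' \
  './a.html:999:Ice Cream' \
  './b.html:338:Waldo' \
  './c.html:61:ice cream' \
  './d.html:1487:waldo' |
while IFS= read -r line; do
  term="${line##*:}"      # text after the last colon: the matched term
  location="${line%:*}"   # everything before it: file:line
  # Fold case and replace spaces so "Ice Cream" and "ice cream" share a file.
  key="$(printf '%s' "$term" | tr '[:upper:]' '[:lower:]' | tr ' ' '_')"
  printf '%s\n' "$location" >> "$out_dir/$key.txt"
done

ls "$out_dir"
```

This leaves one file per term (here ice_cream.txt and waldo.txt), each listing the file:line locations where that term was found. For a real run, replace the printf with the find command and consider adding -H to grep so the file name is always present.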
and if grep doesn't do that I am sure some other batch utility would
"Batch utility" sounds like COBOL talk, or something else paleographic and interesting.
(I have never used sed in my code)
I believe sed is a write-only language that is overwhelmingly useful at the command line. If I find myself putting it in a script, I try to find another way. For example, awk is far nicer to look at.

-- 
@almightygenie 8 Jun 2020 | @Windex Thanks, Windex. That's a relief. Your drink is even more refreshing now that I know that it deplores racism & discrimination.
twitter.com/almightygenie/status/1270096054864809988