I would like to submit a patch that is quite short and meant more as a feature request. It adds the predicate "-dtype <regex>" (dtype meaning datatype). The dtype predicate uses libmagic, the library behind the "file" command, to get the *content datatype* of the file being examined, and then matches a regex against that description. For example, "echo abc >f.txt; file f.txt" yields "ASCII text". Therefore "find f.txt -dtype '.*text.*'" would match the regex ".*text.*" against "ASCII text" (and succeed).
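To make the intended behaviour concrete without reading the patch, here is a rough shell sketch of what the predicate is supposed to do. The helper names (dtype_of, dtype_match, has_dtype) are made up for illustration and are not in the patch. It calls the "file" command instead of libmagic, and it assumes extended regex syntax and a match against the whole description (the way find's -regex matches the whole path); the patch itself may differ on both points.

```shell
# Hypothetical helpers, not part of the patch: they only mimic what
# "-dtype <regex>" is meant to do, using file(1) instead of libmagic.

# Print the content description of one file, e.g. "ASCII text".
# "file -b" leaves out the leading "name:" prefix.
dtype_of() {
    file -b -- "$1"
}

# Succeed if the regex matches the whole description.
# Assumption: anchored, extended-regex matching (grep -E -x).
dtype_match() {
    printf '%s\n' "$1" | grep -Eqx -e "$2"
}

# Succeed if the content description of file $1 matches regex $2.
has_dtype() {
    dtype_match "$(dtype_of "$1")" "$2"
}
```

With these, "has_dtype f.txt '.*text.*' && echo f.txt" should print f.txt for the example file above, provided "file" describes it as some kind of text.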
The problem this patch addresses is the following: I have several source project directories with several million files in them. I want to make a backup, but I only want to back up text files (Makefiles, shell scripts, .c and .h files, etc.). Currently I do something like this:

  (for f in `find <srcdir> -type f`; do if (file $f | cut -d: -f2 | grep text &> /dev/null ); then echo $f; fi; done) > file.list

Then I use file.list to create a tar. But this pipeline is very slow, because it starts "file", "cut" and "grep" once for every single file (I run it overnight, so it works, but...). With the attached patch I can do:

  find <srcdir> -dtype '.*text.*' >file.list

This version is an order of magnitude faster.

As I am not really familiar with patch submissions, I am sending this short and easy-to-understand patch so that a developer can integrate it. I think a content datatype match is quite useful.

To make it compile you need to have libmagic (from the "file" command distribution) installed. I then added LIBS=-lmagic to the configure call. In a real integration there should probably be a --with-magic configure option or similar.

--
Greetings
Konrad Eisele
fu.diff
Description: Binary data
