Post, Mark K writes:
> Finally, there is a subtle difference between doing an -exec rm and piping
> the output of find to an xargs rm command.  The difference there is that the
> find command will invoke the rm command once for each file that it finds
> that matches your criteria.  The xargs version will "batch" them up to the
> maximum command-line length allowed on your system, and invoke rm once
> per batch, thus reducing the system overhead of process creation and
> destruction.  I tend to use
> that a lot these days.  It really does speed things up when there are a lot
> of objects to be handled.
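
To make that difference concrete, here is a sketch (the path and the
pattern are only illustrative):

    # one rm process per matching file:
    find /var/tmp -name '*.tmp' -exec rm {} \;

    # one rm process per batch of filenames, sized to the system's
    # command-line length limit:
    find /var/tmp -name '*.tmp' -print | xargs rm

With many thousands of matches, the second form can be noticeably
faster simply because it forks and execs rm far fewer times.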

However, if you use xargs, be extremely careful about the possibility of
whitespace in filenames. If you have a file called "old price.list" and
use a pipe such as
    find .... -print | xargs rm
(or, equivalently, omit the "-print" since it's the default action)
then xargs will parse its input stream
    foo
    bar
    old price.list
    baz
for arguments to rm by splitting at whitespace, and will end up
attempting to remove the file "old" (which probably doesn't exist) and
the file "price.list" (which may be your new file, which you definitely
don't want removed). It's much safer to use
    find .... -print0 | xargs -0 rm
(those are zeroes). These are GNU extensions: -print0 makes find print
each filename terminated with NUL (a.k.a. \0, a.k.a. ASCII code 0), and
-0 makes xargs split its input stream at the \0 character (which cannot
appear in a filename), so exactly the right files are removed. This
also handles filenames containing \n correctly; such names are not a
common accident, but they can form part of a malicious attack against
programs which mis-parse them.
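
If your find or xargs lacks those extensions, a reasonable alternative
(assuming your find supports the "{} +" form of -exec, which batches
arguments much as xargs does) is
    find .... -exec rm {} +
which never passes the filenames through a text stream at all and so
avoids the quoting problem entirely.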

Talking of attacks, there were examples elsewhere in the thread of
using find to traverse a directory such as /tmp to clean things up.
I should warn people that there are race conditions that are easy
to miss when doing such recursive operations on filesystems which are
writable by potential attackers. The races arise from the order in
which directories are read, lists are built up, directories and
symbolic links are traversed, and the resulting actions are executed.
There have been known exploits in the past resulting from such
automation; examples include versions of the automated /tmp cleanup
scripts run from cron in various older distributions. Anyone whose
threat model includes attacks from local users should be careful how
such automated scripts are coded.
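
As a concrete sketch of one such race (all of the names here are
invented for illustration), consider a cleanup job and an attacker
interleaved like this:

    # cleanup job, run from cron:
    find /tmp -mtime +7 -type f -print0 | xargs -0 rm -f

    # attacker, in the window between find printing
    # /tmp/work/old.dat and rm unlinking it:
    mv /tmp/work /tmp/x && ln -s /etc /tmp/work

    # rm now resolves /tmp/work/old.dat through the symbolic
    # link and unlinks /etc/old.dat instead

The point is that a pathname printed by find can refer to a different
object by the time the action is carried out on it.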

--Malcolm

--
Malcolm Beattie <[EMAIL PROTECTED]>
Linux Technical Consultant
IBM EMEA Enterprise Server Group...
...from home, speaking only for myself
