At 8:31 AM +0930 8/15/04, Paul A. Hoadley wrote:
> Hello,
>
> I'm in the process of cleaning a Maildir full of spam. It has
> somewhere in the vicinity of 400K files in it. I started running
> this yesterday:
>
>   find . -atime +1 -exec mv {} /home/paulh/tmp/spam/sne/ \;
>
> It's been running for well over 12 hours. It certainly is
> working---the spams are slowly moving to their new home---but
> it is taking a long time. It's a very modest system, running
> 4.8-R on a P2-350. I assume this is all overhead for spawning
> a shell and running mv 400K times.
Some of it is that, and some of it is the performance-penalty of
deleting files from a directory which has 400K filenames in it,
only to add the same files into a directory which will eventually
have 400K filenames in it. Directory adds/deletes are not fast
when a directory has that many filenames. It is probably even
worse if there are other processes still working on the same
directory (such as sendmail importing more mail).
Where is '.' in the above `find .' command? Is it on the same
partition as /home/paulh/tmp/spam/sne/ ?
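The partition question matters because a mv within one filesystem is
only a rename, while a mv across filesystems has to copy each file's
data and then delete the original. A minimal sketch for checking this,
assuming a df that supports the POSIX -P flag; same_fs is a
hypothetical helper name, not a standard command:

```shell
# same_fs DIR1 DIR2: succeed if both paths are on the same mounted filesystem.
# Compares the mount point that df reports for each path (last column of -P output).
# Assumption: mount points do not contain spaces.
same_fs() {
    a=$(df -P "$1" | awk 'NR==2 {print $NF}')
    b=$(df -P "$2" | awk 'NR==2 {print $NF}')
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, `same_fs . /home/paulh/tmp/spam/sne/ && echo same` run from
inside the Maildir would print "same" only if the two are on one
filesystem.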
You may find it much faster to do something like:
mkdir usermail.new
chown user:group usermail.new
mv usermail usermail.bigspam
mv usermail.new usermail
cd usermail.bigspam
find . -type f \! -atime +1 -exec mv {} ../usermail \;
My assumption there is that you have a LOT fewer "good files" than
you have "bad files", so there will be fewer files to move. But I
am also making the assumption that all your files are in a single
directory (and not a tree of directories), which may be a bad
assumption.
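Both assumptions can be checked before moving anything. A rough
sketch, with survey as a hypothetical helper name; the counts will be
off if any filenames contain newlines:

```shell
# survey DIR: report how many subdirectories DIR has, and how the regular
# files split between "not accessed recently" (-atime +1) and the rest.
survey() {
    (
        cd "$1" || exit 1
        # Directories below DIR itself; 0 means the flat layout assumed above.
        echo "subdirs: $(find . \! -name . -type d | wc -l | tr -d ' ')"
        # Files matching -atime +1, i.e. the ones the original find was moving.
        echo "stale: $(find . -type f -atime +1 | wc -l | tr -d ' ')"
        # Files that do not match, i.e. the ones that would be moved back.
        echo "recent: $(find . -type f \! -atime +1 | wc -l | tr -d ' ')"
    )
}
```

If "recent" is much smaller than "stale", moving the recent files out
is the cheaper direction. A nonzero "subdirs" count means the
single-directory assumption does not hold.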
> Is there a better way to move all files based on some characteristic
> of their date stamp? Maybe separating the find and the move, piping
> it through xargs?
The thing to use is the '-J' option of xargs. That way you can
have the destination-directory be the last argument in the command
that gets executed, and yet you're still moving as many files in
a single `mv' command as possible. E.g., change my earlier `find'
command to:
find . -type f \! -atime +1 -print0 | xargs -0J[] mv [] ../usermail
Check the man page for xargs for a description of -J.
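
Not every xargs has -J (GNU xargs, for one, does not). Where it is
missing, a similar batching effect can be sketched with find's
`-exec ... {} +' form and a small sh wrapper that puts the destination
last. This assumes a find that supports the `+' terminator;
move_recent is a hypothetical helper name, and like the commands above
it assumes a flat directory:

```shell
# move_recent SRC DEST: move regular files in SRC that do NOT match -atime +1
# into DEST, passing many filenames to each mv invocation.
# DEST is resolved relative to SRC unless it is an absolute path.
move_recent() {
    src=$1
    dest=$2
    (
        cd "$src" || exit 1
        # Inside the wrapper: $1 is the destination, the rest are the
        # filenames that find appended in place of {}.
        find . -type f \! -atime +1 \
            -exec sh -c 'dest=$1; shift; mv "$@" "$dest"' sh "$dest" {} +
    )
}
```

From the parent of usermail.bigspam, this could be called as
`move_recent usermail.bigspam ../usermail'.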
--
Garance Alistair Drosehn = [EMAIL PROTECTED]
Senior Systems Programmer or [EMAIL PROTECTED]
Rensselaer Polytechnic Institute or [EMAIL PROTECTED]