On 30/03/2008 4:43 PM, Justin Mason wrote:
> Daryl C. W. O'Shea writes:
>> Also, I noticed that the client side corpus caches are growing somewhat
>> large. They should probably be pruned (I usually just use 'find'). I
>> don't know what the expected life-span of a message is in the corpus,
>> though, so I'll defer this to you.
>
> what command line do you use?
I use this for managing my corpus...
/usr/bin/find ${CORPUSDIR}/ham/ -type f -mtime +${AGE_HAM} -exec rm -f {} \;
/usr/bin/find ${CORPUSDIR}/spam/ -type f -mtime +${AGE_SPAM} -exec rm -f {} \;
/usr/bin/find ${CORPUSDIR}/ham/ -depth -type d -empty -exec rmdir {} \;
/usr/bin/find ${CORPUSDIR}/spam/ -depth -type d -empty -exec rmdir {} \;
...the corpus cache is a little different, though, as the clients don't
know which messages are spam and which are ham. Hopefully either both
spam and ham stick around in the corpora for the same period of time, or
you can tell them apart based on their path.
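
If a single age works for everything in the cache, something along these
lines might do. This is only a rough sketch: the prune_cache function and
the CACHEDIR and AGE_CACHE names are made up for illustration (they are
not from any existing script), and -mindepth and -empty need GNU or BSD
find.

```shell
#!/bin/sh
# Hypothetical helper: prune_cache DIR AGE_DAYS
# Deletes regular files under DIR whose mtime is more than AGE_DAYS days
# old, then removes any directories left empty (never DIR itself).
prune_cache() {
    dir=$1
    age=$2

    # Refuse to run against a missing directory or a missing age.
    [ -n "$age" ] || return 1
    [ -d "$dir" ] || return 1

    # One threshold for all messages, since the client cannot tell
    # spam from ham.
    find "$dir" -type f -mtime +"$age" -exec rm -f {} \;

    # -depth visits children before parents, so nested empty directories
    # are removed bottom-up; -mindepth 1 keeps the top-level directory.
    find "$dir" -mindepth 1 -depth -type d -empty -exec rmdir {} \;
}

# Example call (CACHEDIR and AGE_CACHE are assumed to be set elsewhere):
# prune_cache "${CACHEDIR}" "${AGE_CACHE}"
```

If it turns out spam and ham can be told apart by path, the same function
could be called once per subdirectory with different ages.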
Daryl