Thanks for the reply.

On 10/04/2012 07:08, Thomas Eckardt wrote:
>> For the first two to three thousand files the rebuild goes through
>> really quickly. Somewhere after that it starts to slow down
> First: what is your setting for 'useDB4rebuild' and is BerkeleyDB
> installed ?

useDB4rebuild = on
Perl Modules:
  BerkeleyDB            0.51 / 0.42   enabled
  BerkeleyDB_DBEngine   4.8 / 4.5     enabled

Packages installed: libdb4.8 libdb4.8-dev

>> Once it completes the
>> spam folder it moves to the notspam folder and starts off quicker again.
> This behavior is mostly normal and depends on the contents of the corpus
> files.
>
> You can see, the files in the error folder are processed slower - because
> the size of the analyzed data is 2 times the size in the other both
> folders.

This is actually the opposite of what I am seeing. The error folders are
processed really quickly.

> You may also try to reduce 'MaxFiles'. It should be at least at a value of
> the largest error folder. In most case, it makes not a big difference to
> the confidence of the resulting databases,
> if the rebuild processes 7000 or 14000 files (this may or may not be true
> !)

I'll see what difference it makes. It will obviously reduce the runtime,
but it doesn't solve the actual problem.

>> sit around 1.9GB usage
>> According to other discussions I have
>> seen here that may be a bit on the high side as they aren't particularly
>> high volume so I'm wondering if there's anything I can do to improve
>> that.
> If you switched all the main hashes and lists to MySQL and all the
> 'useDB4....' config parms are set - all is done to reduce the memory usage
> to a minimum.
>
> I'm running exactly this config with all plugins fully in use and 5
> workers - the system uses never more than 500 MB of memory

Odd, top gives me:

VIRT=1379m, RES=828m, SHR=229m

1.3GB is quite a bit bigger than 500MB, and even the physical (RES) usage
of 828MB is roughly 65% over that. Still, it seems to stay stable and not
increase, and it is significantly better than running on a 64-bit OS.

Thanks for your help,

Colin.