Re: Parallelizing Spam Assassin
I did that - with DNSBL off there are no port 53 communications from SA

Jason Philbrook wrote:
>
> I would run a tcpdump on the ethernet interface while doing this, just
> in case there are network tests happening that you are not aware of.
>
> On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh wrote:
>>
>> Hi
>>
>> I was measuring how quickly could SA [spam assassin] process spams when
>> several SA processes are run in parallel over separate mbox files. I used
>> a 8 core machine. Below are the numbers when I forked different number of
>> processes.
>>
>> Fork = 8; Rate = 57 msgs/sec
>> Fork = 4; Rate = 44 msgs/sec
>> Fork = 1; Rate = 22 msgs/sec
>>
>> I ran freshly build SA with Bayes and DNSBL turned off. Why am I not
>> seeing a linear increase in the throughput? Is a file locking creating the
>> bottleneck? If yes, which particular file is being locked? If no, what
>> could be the reason for this?
>>
>> thnx
>
> --
> /*
> Jason Philbrook | Midcoast Internet Solutions - Wireless and DSL
> KB1IOJ | Broadband Internet Access, Dialup, and Hosting
> http://f64.nu/ | for Midcoast Maine http://www.midcoast.com/
> */
Re: Parallelizing Spam Assassin
I would run a tcpdump on the ethernet interface while doing this, just
in case there are network tests happening that you are not aware of.

On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh wrote:
>
> Hi
>
> I was measuring how quickly could SA [spam assassin] process spams when
> several SA processes are run in parallel over separate mbox files. I used a
> 8 core machine. Below are the numbers when I forked different number of
> processes.
>
> Fork = 8; Rate = 57 msgs/sec
> Fork = 4; Rate = 44 msgs/sec
> Fork = 1; Rate = 22 msgs/sec
>
> I ran freshly build SA with Bayes and DNSBL turned off. Why am I not seeing
> a linear increase in the throughput? Is a file locking creating the
> bottleneck? If yes, which particular file is being locked? If no, what could
> be the reason for this?
>
> thnx

--
/*
Jason Philbrook | Midcoast Internet Solutions - Wireless and DSL
KB1IOJ | Broadband Internet Access, Dialup, and Hosting
http://f64.nu/ | for Midcoast Maine http://www.midcoast.com/
*/
Re: Parallelizing Spam Assassin
This whole time I thought the subject line was "Paralyzing Spam Assassin" and the original poster was having trouble with SA locking up. Oops. ;-) -- Dan Schaefer Web Developer/Systems Analyst Performance Administration Corp.
Some benchmarks (Re: Parallelizing Spam Assassin)
On Sat, Aug 01, 2009 at 01:34:34PM +0300, Henrik K wrote:
>
> That reminds me, gotta test how SA runs on a Sun T5240 with 16 core "128
> cores"..

Well not that impressive for SA, price/speed wise..

T2+ 2x8x1.4Ghz, 144 msgs/sec @ 128 processes
AMD X4 4x3Ghz, 43 msgs/sec @ 4 processes

Note that this is 3.3 SVN with all the rulesrc included, perl 5.10.

I saved the used stuff at http://sa.hege.li/bench/ to be able to make real
comparisons, if someone has interesting servers. And this is as scientific
as I can bother. :)
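To put the two benchmark figures on a common footing, here is a small sketch that normalizes them per core and per process. The core counts (16 and 4) are read off the "2x8" and "4x" notation in the message; that reading is an assumption, and the helper name is made up for illustration.

```python
# Figures as reported in the message above. Core counts are an interpretation
# of the "2x8x1.4Ghz" and "4x3Ghz" notation, not something stated explicitly.
benchmarks = {
    "Sun T2+ 2x8x1.4Ghz": {"msgs_per_sec": 144, "cores": 16, "processes": 128},
    "AMD X4 4x3Ghz": {"msgs_per_sec": 43, "cores": 4, "processes": 4},
}


def per_unit(rate, units):
    """Throughput divided evenly over a number of cores or processes."""
    return rate / units


for name, b in benchmarks.items():
    per_core = per_unit(b["msgs_per_sec"], b["cores"])
    per_proc = per_unit(b["msgs_per_sec"], b["processes"])
    print(f"{name}: {per_core:.2f} msgs/sec per core, "
          f"{per_proc:.3f} msgs/sec per process")
```

On these numbers the per-core rates come out close (9 vs. 10.75 msgs/sec), which fits the "not that impressive, price/speed wise" remark.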
Re: Parallelizing Spam Assassin
Um, Linda.. I'm pretty positive Justin is Irish, not American. Linda Walsh wrote: > It's an American thing. Things that are normal speech for UK blokes, get > Americans all disturbed. > > Funny, used to be the other way around...but well...times change. > > > > Justin Mason wrote: >> On Fri, Jul 31, 2009 at 09:32, >> rich...@buzzhost.co.uk wrote: >>> Imagine what Barracuda Networks could do with that if they did not fill >>> their gay little boxes with hardware rubbish from the floors of MSI and >>> supermicro. Jesus, try and process that many messages with a $30,000 >>> Barracuda and watch support bitch 'You are fully scanning to much mail >>> and making our rubbish hardware wet the bed.' LOL. >> >> Richard -- please watch your language. This is a public mailing >> list, and offensive language here is inappropriate. >> > >
Re: Parallelizing Spam Assassin
On Fri, 2009-07-31 at 23:56 -0700, Linda Walsh wrote: > May I point out, that while you may find the language crude -- it isn't > language that would violate FTC standards in that in used any of the > 7 or so 'unmentionable words'... It's not about words on their own -- it's about how they are being used, and their meaning in context. > BTW, I've never even 'heard' or seen his name before this post. Must have been a warm and cozy place, the rock you've been hiding under. ;) You missed a 3 digit figure of posts and uncalled-for off-topic rants within a few weeks. > If I was talking with [...] I just apply my linguistic filter and > attempt to get the meaning. Sic. -- char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4"; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: Parallelizing Spam Assassin
Henrik K wrote:

> On Sat, Aug 01, 2009 at 11:46:57AM +0200, Per Jessen wrote:
>> Henrik K wrote:
>>
>> > On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote:
>> >> Well -- it's not just the cores -- what was the usage of the cores
>> >> that were being used? were 3 out the 8 'pegged'? Are these 'real'
>> >> cores, or HT cores? In the Core2 and P4 archs, HT's actually slowed
>> >> down a good many workloads unless they were tightly constructed to
>> >> work on the same data in cache. Else, those HT's did just enough
>> >> extra work to block cache contents more than anything else.
>> >
>> > I really doubt there's HT involved in a recent looking 8 core 16GB
>> > machine..
>>
>> Why not? I have a couple of brandnew Intel Core i7 (Nehalem) systems
>> with 8Gb RAM - they have 1 physical CPU with 4 cores and HT =
>> 8 "cores". And they've got room for more RAM :-)
>
> Ah a comeback.. I guess it's atleast better than the P4 stuff?

Not sure about that - AFAICT, it's exactly the same technology. (I haven't
done any exhaustive tests though).

/Per Jessen, Zürich
Re: Parallelizing Spam Assassin
On Sat, Aug 01, 2009 at 11:46:57AM +0200, Per Jessen wrote:
> Henrik K wrote:
>
> > On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote:
> >> Well -- it's not just the cores -- what was the usage of the cores
> >> that were being used? were 3 out the 8 'pegged'? Are these 'real'
> >> cores, or HT cores? In the Core2 and P4 archs, HT's actually slowed
> >> down a good many workloads unless they were tightly constructed to
> >> work on the same data in cache. Else, those HT's did just enough
> >> extra work to block cache contents more than anything else.
> >
> > I really doubt there's HT involved in a recent looking 8 core 16GB
> > machine..
>
> Why not? I have a couple of brandnew Intel Core i7 (Nehalem) systems
> with 8Gb RAM - they have 1 physical CPU with 4 cores and HT =
> 8 "cores". And they've got room for more RAM :-)

Ah a comeback.. I guess it's atleast better than the P4 stuff?

That reminds me, gotta test how SA runs on a Sun T5240 with 16 core "128
cores"..
Re: Parallelizing Spam Assassin
On Sat, Aug 1, 2009 at 10:04, Henrik K wrote:
>
> On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote:
>> Well -- it's not just the cores -- what was the usage of the cores that
>> were being used? were 3 out the 8 'pegged'? Are these 'real' cores, or
>> HT cores? In the Core2 and P4 archs, HT's actually slowed down a good
>> many workloads unless they were tightly constructed to work on the same
>> data in cache. Else, those HT's did just enough extra work to block cache
>> contents more than anything else.
>
> I really doubt there's HT involved in a recent looking 8 core 16GB machine..
>
>> What's the disk I/O look like? I mean don't just focus on idle cores --
>> if the wait is on disk, maybe the cores can't get the data fast enough.
>
> As we already guessed, AWL (BerkeleyDB) caused disk I/O and slowness. For
> heavy loads you need to use SQL (or maybe the better BDB plugin in 3.3 if we
> get it working).
>
>> If the network is involved, well, that's a drag on any message checking.
>> I'm seeing times of .3msgs/sec, but I think that's with networking turned
>> on. Pretty Ugly.
>
> It affects single messages, but not total throughput. With network checks
> you just dedicate a lot more childs. Waiting for network responses takes no
> CPU time, thus you can process more messages simultaneously.

although you will also need to allocate more memory, as well, to ensure
that no swapping takes place.

--
--j.
Re: Parallelizing Spam Assassin
Henrik K wrote:

> On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote:
>> Well -- it's not just the cores -- what was the usage of the cores
>> that were being used? were 3 out the 8 'pegged'? Are these 'real'
>> cores, or HT cores? In the Core2 and P4 archs, HT's actually slowed
>> down a good many workloads unless they were tightly constructed to
>> work on the same data in cache. Else, those HT's did just enough
>> extra work to block cache contents more than anything else.
>
> I really doubt there's HT involved in a recent looking 8 core 16GB
> machine..

Why not? I have a couple of brandnew Intel Core i7 (Nehalem) systems with
8Gb RAM - they have 1 physical CPU with 4 cores and HT = 8 "cores". And
they've got room for more RAM :-)

/Per Jessen, Zürich
Re: Parallelizing Spam Assassin
On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote:
> Well -- it's not just the cores -- what was the usage of the cores that
> were being used? were 3 out the 8 'pegged'? Are these 'real' cores, or
> HT cores? In the Core2 and P4 archs, HT's actually slowed down a good
> many workloads unless they were tightly constructed to work on the same
> data in cache. Else, those HT's did just enough extra work to block cache
> contents more than anything else.

I really doubt there's HT involved in a recent looking 8 core 16GB machine..

> What's the disk I/O look like? I mean don't just focus on idle cores --
> if the wait is on disk, maybe the cores can't get the data fast enough.

As we already guessed, AWL (BerkeleyDB) caused disk I/O and slowness. For
heavy loads you need to use SQL (or maybe the better BDB plugin in 3.3 if we
get it working).

> If the network is involved, well, that's a drag on any message checking.
> I'm seeing times of .3msgs/sec, but I think that's with networking turned
> on. Pretty Ugly.

It affects single messages, but not total throughput. With network checks
you just dedicate a lot more childs. Waiting for network responses takes no
CPU time, thus you can process more messages simultaneously.
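The point about dedicating more children when network checks are on can be put as a back-of-the-envelope calculation using Little's law (concurrency = throughput x latency). This is only a sketch: the target rate of 20 msgs/sec is a made-up figure, and treating the quoted .3 msgs/sec as the rate of a single process (about 3.3 seconds per message) is an assumption.

```python
import math


def children_needed(target_msgs_per_sec, seconds_per_msg):
    """Little's law: messages in flight = throughput * per-message latency.

    Each child handles one message at a time, so the number of children
    needed is the number of messages that must be in flight at once.
    """
    return math.ceil(target_msgs_per_sec * seconds_per_msg)


target = 20  # hypothetical target throughput, msgs/sec

# Hypothetical local-only scan time of 0.25 s per message.
print("local only:    ", children_needed(target, 0.25))

# Assumed network-bound latency: one process doing .3 msgs/sec.
print("network checks:", children_needed(target, 1 / 0.3))
```

Most of the extra children sit waiting on the network rather than using CPU, so the practical cost of raising the child count is mainly memory.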
Re: Parallelizing Spam Assassin
On Fri, 2009-07-31 at 23:40 -0700, Linda Walsh wrote: > It's an American thing. Things that are normal speech for UK blokes, get > Americans all disturbed. I'm sure that is mostly it, Linda. They don't seem to 'get' it. Two things I observe in this whole 'barracuda-gate' posting; 1. Being 'offended' is not terminal, it does not kill, disable or have any side effects. Can you image going to a doctor and saying "You've got to treat me Doctor, I got offended, my feelings are hurt." 2. Cultural differences exist. If I am expected to respect the 'diversity' that has people jumping up and down about the use of 'gay' because *they* have a different meaning for it, it is not unreasonable to expect *them* to respect my diversity in using it in it's original context. I'm tired of being told not to offend or upset people who don't show my views and beliefs equal respect. Anyway, it's all OT and pointless in any context of processing spam - the point I made was factual love it or hate it. That was poor hardware spec used in a well known retail anti-spam appliance = 6-8 MPS 'fully scanned'.
Re: Parallelizing Spam Assassin
Well -- it's not just the cores -- what was the usage of the cores that
were being used? were 3 out the 8 'pegged'? Are these 'real' cores, or
HT cores? In the Core2 and P4 archs, HT's actually slowed down a good
many workloads unless they were tightly constructed to work on the same
data in cache. Else, those HT's did just enough extra work to block cache
contents more than anything else.

What's the disk I/O look like? I mean don't just focus on idle cores --
if the wait is on disk, maybe the cores can't get the data fast enough.

If the network is involved, well, that's a drag on any message checking.
I'm seeing times of .3msgs/sec, but I think that's with networking turned
on. Pretty Ugly.

poifgh wrote:
> Henrik K wrote:
>> Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without
>> Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was
>> used and any nondefault rules/settings? Certainly sounds strange that 1
>> core could top out the same. Anyone else have figures? Maybe I've borked
>> something myself..
>
> The problem is not with 22 being a low number, but when we have other free
> cores to run different SA parallely why doesnt the throughput scale
> linearly .. I expect for 8 cores with 8 SA running simultaneously the
> number to be 150+ msgs/sec but it is 1/3rd at 50 msgs/sec
Re: Parallelizing Spam Assassin
May I point out, that while you may find the language crude -- it isn't language that would violate FTC standards in that in used any of the 7 or so 'unmentionable words'... People -- these standards of 'crude language' really need to be strongly held 'in check' -- the US is 'supposed' to be the society of 'free speech' unless it is obscene or threatening. I don't think his posting was either (BTW, I've never even 'heard' or seen his name before this post. All I saw was his 'uk' addr -- and I've known a few 'uk' types, and many of them sound very crude to an American ear these days. So in addition to applying strictures in a conservative manner, we must, hopefully, try to be sensitive to different cultural backgrounds. If I was talking with a black teen from downtown SF/Oakland, I'd have to translate from Eubonics -- which can sound rather crude and might contain and F-word every other sentence. I just apply my linguistic filter and attempt to get the meaning. I hardly thing this list is aimed at an young audience -- and kid 13+ is going to have heard quite an ear-full of 'colorful explicatives' from ST4:Voyage home (a family movie), to everyday peer talk. Yes -- it sounded crude...more than I, normally hear in America -- but not more than I'd hear in London. Just my 2-cents on cultural sensitivity, and the ability to be amused at cultural differences (rather than choosing to be offended by them). p.s. - Most Commercial vendor products are Bantha Poodoo -- especially for Virus/Security and Spam protection, but NOT all. Usually the highest advertised profile are the worst -- they put more budget into advertising than engineering. Yeah, I still thing SA is a bit slow, but I put much of that up to it being written in an interpretive language and it's wide flexibility and extensibility with plug-ins. Whatcha gonna do? Maybe we should rewrite it in Forth? *grin*...
Re: Parallelizing Spam Assassin
* Linda Walsh : > It's an American thing. Things that are normal speech for UK blokes, get > Americans all disturbed. Sloppy language is sloppy language everywhere! I took offense in the message, too and I am neither American nor am I from the UK. But what annoys me the most is that the comments were simply off-topic. I can go and meet some friends and I can happily spend the whole night cracking one joke after another - pc or not pc. There's a place of everything. This is the place for SpamAssassin. I wish we could get back to what this thread was all about: "Parallelizing SpamAssassin". p...@rick > Funny, used to be the other way around...but well...times change. > > Justin Mason wrote: > >On Fri, Jul 31, 2009 at 09:32, > >rich...@buzzhost.co.uk wrote: > >>Imagine what Barracuda Networks could do with that if they did not fill > >>their gay little boxes with hardware rubbish from the floors of MSI and > >>supermicro. Jesus, try and process that many messages with a $30,000 > >>Barracuda and watch support bitch 'You are fully scanning to much mail > >>and making our rubbish hardware wet the bed.' LOL. > > > >Richard -- please watch your language. This is a public mailing > >list, and offensive language here is inappropriate. > > -- state of mind Digitale Kommunikation http://www.state-of-mind.de Franziskanerstraße 15 Telefon +49 89 3090 4664 81669 München Telefax +49 89 3090 4666 Amtsgericht MünchenPartnerschaftsregister PR 563
Re: Parallelizing Spam Assassin
It's an American thing. Things that are normal speech for UK blokes, get Americans all disturbed. Funny, used to be the other way around...but well...times change. Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk wrote: Imagine what Barracuda Networks could do with that if they did not fill their gay little boxes with hardware rubbish from the floors of MSI and supermicro. Jesus, try and process that many messages with a $30,000 Barracuda and watch support bitch 'You are fully scanning to much mail and making our rubbish hardware wet the bed.' LOL. Richard -- please watch your language. This is a public mailing list, and offensive language here is inappropriate.
Re: Parallelizing Spam Assassin
From: "poifgh"
Sent: Friday, 2009/July/31 19:47

> I am sorry, I did not provide any statistics of the machine involved.
>
> CPU - 8 cores with each core 2327 MHz
> RAM - 16GB
> Afair its has 7200RPM disk - 2TB.

One disk? You might consider a striped array to get disk speed. 50 megabytes
per second stresses most disks pretty hard - not to the limit. But if there
is a lot of seeking involved as well as multiple copies of the files being
made as they pass through the system I can see how it'd be a little rough on
the disk throughput.

{^_^}
Re: Parallelizing Spam Assassin
From: "LuKreme" Sent: Friday, 2009/July/31 12:37 On Jul 31, 2009, at 1:33 PM, jdow wrote: Given that profanity is the effort of a small mind to express itself I have a feeling he's going to receive his third and final warning any time now, Matt Given that nothing that richard said is not anything I've heard on, say, prime time TV or... a committee meeting I am really curious now as to what was considered 'obscene'. I'm quite serious. Have I stumbled into a list run by religious freaks? Not me. I can happily go several whole days without hearing the B word. When I hear it I get B...y. {^_^} Joanne
Re: Parallelizing Spam Assassin
From: "LuKreme" Sent: Friday, 2009/July/31 12:30 On Jul 31, 2009, at 9:25 AM, John Hardin wrote: On Fri, 31 Jul 2009, rich...@buzzhost.co.uk wrote: ... dropping in here and making jokes at such low hanging fruit. Make all the jokes at Barracuda's expense that you like, complain about them all you like, just avoid offensive language. Really? Referring to gay hardware is THAT offensive that someone would need to be banned over it? No, it's the word "expensive". {+_+}
Re: Parallelizing Spam Assassin
On Fri, 2009-07-31 at 17:37 -0400, Glenn Sieb wrote: > LuKreme said the following on 7/31/09 3:27 PM: > >> Richard -- please watch your language. This is a public mailing > >> list, and offensive language here is inappropriate. > > > > I dunno, 'gay' isn't that offensive. > > > > > > Gay is *not* a synonym for stupid. > > I do take offense to the term being used in that manner. > > --Glenn > I find it deeply offensive that the word 'gay' is used as a synonym for homosexual in an attempt to stop people from using 'queer' - but hey 'gays' are not the only ones with opinions that 'matter'. Gay **is** a synonym for 'stupid' (silly) as far as I am concerned. It's original meaning of 'carefree','happy','silly' and 'showy' are clearly being used with sarcasm. The fact is 'queers' hijacked the word as per this; "— USAGE Gay is now a standard term for ‘homosexual’, and is the term preferred by homosexual men to describe themselves. As a result, it is now very difficult to use gay in its earlier meanings ‘carefree’ or ‘bright and showy’ without arousing a sense of double entendre. Gay in its modern sense typically refers to men, lesbian being the standard term for homosexual women." http://www.askoxford.com/concise_oed/gay?view=uk So please *quit* with the sympathetic pink preaching and learn what the word actually means. Just because it "is the term preferred by homosexual men to describe themselves" does not mean a minority have the right to slate people who use the word properly. With regards to the dig about Barracuda - this *WAS* OT. There were some benchmark tests discussed here that were impressive. My experience of SA in daily production is on Barracuda Appliances that STRUGGLE to push 6-8 messages a second through, so it was relevant as comparison. The wording could have been chosen with more care and I apologise to Christians or dog lovers who found the use of the messiah or female form offensive. 
However, the use of gay in a sarcastic context clearly fits with the original origin of the word, not by that section of the society who have stolen it and made it OT and OM. For that I make ***NO*** apology. I appreciate that using 'gay' in it's real meaning may hurt the feelings of some 'homosexuals' but as I have to respect their choices and views, they should show *me* the same respect for *my* views and choices. You may not like who I am and what I do, I may not like who you are and what you do. Now do we need to continue this or throw little tin God banning threats around more or can we just *get along* knowing we are all different but frequenting this list for Spamassassin information ?
Re: Parallelizing Spam Assassin
I haven't tried with sa-compile yet - I can give it a shot

Henrik K wrote:
>
> On Fri, Jul 31, 2009 at 10:41:47AM -0700, poifgh wrote:
>>
>> Henrik K wrote:
>> >
>> > Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without
>> > Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was
>> > used and any nondefault rules/settings? Certainly sounds strange that
>> > 1 core could top out the same. Anyone else have figures? Maybe I've
>> > borked something myself..
>> >
>>
>> The problem is not with 22 being a low number, but when we have other
>> free
>
> I did not say it was a problem. I was just wondering how fast CPU/memory
> you have, since my 3Ghz AMD doesn't seem to keep up.
>
> I just tested with fresh 3.2.5 install, and running 500 mail mbox with
> single core resulted in 11 msgs / sec. Then I used sa-compile, and it
> raised to 15. Did you use it also?
>
> Of course your mailbox could be a lot different, so hard to compare.
>
>> cores to run different SA parallely why doesnt the throughput scale
>> linearly .. I expect for 8 cores with 8 SA running simultaneously the
>> number to be 150+ msgs/sec but it is 1/3rd at 50 msgs/sec
>
> Anyway as people have already said here, disable AWL:
>
> use_auto_whitelist 0
Re: Parallelizing Spam Assassin
I am sorry, I did not provide any statistics of the machine involved.

CPU - 8 cores, each core 2327 MHz
RAM - 16GB
Disk - AFAIR it has a 7200RPM disk, 2TB

Yes, people were right in indicating AWL could be the problem. Turning off
AWL results in near-linear scaling of SA as we increase the number of
processes. My input is more than 100K [mostly] spams, which allowed me to
have each run last for several minutes and then take an average to get
msgs/sec.

With AWL, Bayes and DNSBL turned off, I get about
24 msgs/sec for 1 fork and
166 msgs/sec for 8 forks

With AWL on and Bayes and DNSBL off, I get about
22 msgs/sec for 1 fork and
50 msgs/sec for 8 forks

Thnx everyone for helping out.

Henrik K wrote:
>
> On Fri, Jul 31, 2009 at 10:41:47AM -0700, poifgh wrote:
>
> I did not say it was a problem. I was just wondering how fast CPU/memory
> you have, since my 3Ghz AMD doesn't seem to keep up.
>
> I just tested with fresh 3.2.5 install, and running 500 mail mbox with
> single core resulted in 11 msgs / sec. Then I used sa-compile, and it
> raised to 15. Did you use it also?
>
> Of course your mailbox could be a lot different, so hard to compare.
>
>> cores to run different SA parallely why doesnt the throughput scale
>> linearly .. I expect for 8 cores with 8 SA running simultaneously the
>> number to be 150+ msgs/sec but it is 1/3rd at 50 msgs/sec
>
> Anyway as people have already said here, disable AWL:
>
> use_auto_whitelist 0
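For reference, the two sets of numbers above can be turned into speedup and parallel efficiency figures. A minimal sketch; the function name is made up for illustration, and the rates are exactly those reported in the message.

```python
def scaling(rate_1, rate_n, n):
    """Return (speedup, parallel efficiency) when going from 1 to n processes."""
    speedup = rate_n / rate_1
    return speedup, speedup / n


# msgs/sec at 1 fork and at 8 forks, as reported in the message above
runs = {
    "AWL off": (24, 166),
    "AWL on": (22, 50),
}

for label, (rate_1, rate_8) in runs.items():
    speedup, efficiency = scaling(rate_1, rate_8, 8)
    print(f"{label}: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

With AWL off the 8-fork run is about 6.9x the single-fork rate (roughly 86% efficiency); with AWL on it is only about 2.3x (roughly 28%).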
Re: Parallelizing Spam Assassin
On Fri, Jul 31, 2009 at 10:41:47AM -0700, poifgh wrote:
>
> Henrik K wrote:
> >
> > Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without
> > Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was
> > used and any nondefault rules/settings? Certainly sounds strange that
> > 1 core could top out the same. Anyone else have figures? Maybe I've
> > borked something myself..
> >
>
> The problem is not with 22 being a low number, but when we have other free

I did not say it was a problem. I was just wondering how fast CPU/memory you
have, since my 3Ghz AMD doesn't seem to keep up.

I just tested with fresh 3.2.5 install, and running 500 mail mbox with
single core resulted in 11 msgs / sec. Then I used sa-compile, and it raised
to 15. Did you use it also?

Of course your mailbox could be a lot different, so hard to compare.

> cores to run different SA parallely why doesnt the throughput scale linearly
> .. I expect for 8 cores with 8 SA running simultaneously the number to be
> 150+ msgs/sec but it is 1/3rd at 50 msgs/sec

Anyway as people have already said here, disable AWL:

use_auto_whitelist 0
Re: Parallelizing Spam Assassin
rich...@buzzhost.co.uk wrote:
> email me off list as I've just been
> banned for upsetting a sponsor LOL
>

Richard, this has nothing to do with Barracuda. They have no influence over
my opinions whatsoever. I don't work for Apache or Barracuda, or any company
sponsored by either. Neither Apache nor Barracuda has complained. At the
time I warned you, I didn't even remember that Barracuda ever donated to
Apache. I don't think any member of the PMC has any regular contact with
Barracuda, although we've had occasional contact about using their RBL.

Your warning is about using foul language, and then choosing to thumb your
nose at the warning Justin gave you. You're behaving like an impudent and
foul mouthed child, and that's unwelcome here.

That said, I really don't appreciate you using this list to rant about
Barracuda's products, or discuss them at all. This is the SpamAssassin list,
not the Barracuda list. Barracuda may use SpamAssassin, and SpamAssassin may
support the Barracuda public RBL, but beyond that, any discussion of them
is, quite frankly, off-topic. I don't care how good or bad their commercial
product, or its support is, because it is off-topic here. I don't welcome
people praising Barracuda any more than I welcome complaints. It simply
doesn't matter to SpamAssassin, so it doesn't belong here. You may as well
be ranting about Ford cars for all I care, it still doesn't belong here.
This list is about SpamAssassin, nothing more, nothing less.

Continue with the foul language, and you'll find the door very quickly. Keep
harping on the same off-topic subject and we will eventually get tired of
it. You've said your piece about Barracuda, now give it a rest, because
frankly I don't care about their products, I care about our product.

Is that difficult to understand?
Re: Parallelizing Spam Assassin
LuKreme said the following on 7/31/09 3:27 PM: >> Richard -- please watch your language. This is a public mailing >> list, and offensive language here is inappropriate. > > I dunno, 'gay' isn't that offensive. > > Gay is *not* a synonym for stupid. I do take offense to the term being used in that manner. --Glenn
Re: Parallelizing Spam Assassin
On Fri, Jul 31, 2009 at 12:37, LuKreme wrote: > On Jul 31, 2009, at 1:33 PM, jdow wrote: >> >> Given that profanity is the effort of a small mind to express itself >> I have a feeling he's going to receive his third and final warning any >> time now, Matt > > Given that nothing that richard said is not anything I've heard on, say, > prime time TV or... a committee meeting I am really curious now as to what > was considered 'obscene'. > > I'm quite serious. > > Have I stumbled into a list run by religious freaks? (mods: sorry if this also falls into the verboten category, I'm more trying to explore/catalog than perpetuate) Maybe it was using the word "bitch", where he could have used the word "complain". (and, religious freaks aren't the only freaks that don't like to see the word "Jesus" used in that kind of context ... saying words like "Jesus" around atheist freaks can also result in them claiming offence ... luckily religious freaks and atheist freaks aren't as common as merely religious people and merely atheist people)
Re: Parallelizing Spam Assassin
On Jul 31, 2009, at 1:33 PM, jdow wrote: Given that profanity is the effort of a small mind to express itself I have a feeling he's going to receive his third and final warning any time now, Matt Given that nothing that richard said is not anything I've heard on, say, prime time TV or... a committee meeting I am really curious now as to what was considered 'obscene'. I'm quite serious. Have I stumbled into a list run by religious freaks? -- Clark's Law: Sufficiently advanced cluelessness is indistinguishable from malice Clark Slaw: Anything that has been severely damaged or destroyed by application of Clark's Law
Re: Parallelizing Spam Assassin
From: "Matt Kettler" Sent: Friday, 2009/July/31 04:26 rich...@buzzhost.co.uk wrote: On Fri, 2009-07-31 at 09:53 +0100, Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk wrote: ... Richard -- please watch your language. This is a public mailing list, and offensive language here is inappropriate. ... Richard, we are not joking. Please watch your language on this mailing list, or you will be banned from it. You have now been warned by 2 members of the Project Management Committee. You will not be warned again. Given that profanity is the effort of a small mind to express itself I have a feeling he's going to receive his third and final warning any time now, Matt. {^_-}
Re: Parallelizing Spam Assassin
On Jul 31, 2009, at 9:25 AM, John Hardin wrote: On Fri, 31 Jul 2009, rich...@buzzhost.co.uk wrote: ... dropping in here and making jokes at such low hanging fruit. Make all the jokes at Barracuda's expense that you like, complain about them all you like, just avoid offensive language. Really? Referring to gay hardware is THAT offensive that someone would need to be banned over it? -- Is a vegetarian permitted to eat animal crackers?
Re: Parallelizing Spam Assassin
On Jul 31, 2009, at 2:53 AM, Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk wrote: Imagine what Barracuda Networks could do with that if they did not fill their gay little boxes with hardware rubbish from the floors of MSI and supermicro. Jesus, try and process that many messages with a $30,000 Barracuda and watch support bitch 'You are fully scanning to much mail and making our rubbish hardware wet the bed.' LOL. Richard -- please watch your language. This is a public mailing list, and offensive language here is inappropriate. I dunno, 'gay' isn't that offensive. -- Overhead, without any fuss, the stars were going out.
Re: Parallelizing Spam Assassin
On Jul 31, 2009, at 1:55 AM, poifgh wrote:
> I ran freshly build SA with Bayes and DNSBL turned off. Why am I not seeing a linear increase in the throughput? Is a file locking creating the bottleneck? If yes, which particular file is being locked? If no, what could be the reason for this?

There could be many reasons. Check out my talk (admittedly a little out of date, but it should still be mostly relevant) on High Performance Apache SpamAssassin at the following link:

http://people.apache.org/~parker/presentations/index.html

Keep in mind that you might also be seeing other factors like memory and disk I/O contention. You don't really spell out your testing infrastructure, so it's not really clear whether you're even performing a valid test.

Also, I wouldn't necessarily expect to see a linear increase, although you might be able to take some easy steps to increase your overall performance.

Michael
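One way to put a number on "not linear" is the Karp-Flatt metric, which estimates the fraction of the work that behaves as if it were serialized, given a measured speedup. The sketch below applies it to the rates reported at the start of the thread (22, 44 and 57 msgs/sec for 1, 4 and 8 processes). It assumes an Amdahl-style model, which is only a rough stand-in here: with independent processes the "serial fraction" really means any shared bottleneck (disk, memory bandwidth, locks), and the function names are made up for illustration.

```python
def karp_flatt(speedup, procs):
    """Experimentally determined serial fraction for a measured speedup."""
    if procs < 2:
        raise ValueError("need at least 2 processes to estimate a serial fraction")
    return (1.0 / speedup - 1.0 / procs) / (1.0 - 1.0 / procs)


def amdahl_speedup(serial_fraction, procs):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / procs)


# Rates reported in the thread (msgs/sec), keyed by number of forked processes.
reported = {1: 22.0, 4: 44.0, 8: 57.0}

for procs in (4, 8):
    speedup = reported[procs] / reported[1]
    fraction = karp_flatt(speedup, procs)
    print(f"{procs} procs: speedup {speedup:.2f}x, apparent serial fraction {fraction:.2f}")
```

If the estimated fraction stays roughly constant as processes are added, a fixed shared bottleneck is a plausible explanation; if it grows, contention is probably getting worse with each extra process. Either way it only narrows down where to look, it does not identify the bottleneck.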
Re: Parallelizing Spam Assassin
> In my tests - there was not MTA. The mails/spam were collected from
> some server in mbox format and fed to SA using --mbox switch. The
> size of msgs was not altered in any fashion - just the usual size of
> incoming spam/mails

If you're interested in testing/tuning spamassassin for heavy loads, you should consider using the spamd daemon. Then you may use SLAMD [1] as a performance evaluation platform [2]. It takes some effort to set up the environment, but SLAMD helps with repetitive testing and with keeping track of the results (comparison, history, charts).

[1] http://www.slamd.com
[2] https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5689

--
Pawel Sasin
Re: Parallelizing Spam Assassin
OK - I can see what metrics you are trying to ascertain - I think. I'm not sure that your test and real life are 'right'. For obvious reasons I don't want to carry this one on via list - I would suggest you ask Justin and I will be happy to give info on my local setup (this assumes Justin can grab time away from toxic nappies/daipers) There is a lot you can do to ameliorate load. On bad days my quad does 50 a second so it's doable. I will freely admit I have no clue quite how this came to be, but it is (a case of having colleagues knowing more than I do - for which I am eternally grateful; the usual culprits know who they are) Kind regards Nigel On Fri, 31 Jul 2009 11:41:14 -0700 (PDT), poifgh wrote: > >In my tests - there was not MTA. The mails/spam were collected from some >server in mbox format and fed to SA using --mbox switch. The size of msgs >was not altered in any fashion - just the usual size of incoming spam/mails > >There are no AV [you mean Anti Virus right?] running on the machine > >Would be back with results > >-- > > > > >Nigel Frankcom-2 wrote: >> >> I'm assuming you run a tad more messages than I, but on a quad with a >> failover I have never seen the failover kick in 4 years. This is not >> disputing your observations, just noting mine. >> >> I claim absolutely no knowledge about the core processing/stacking >> though I would assume (perhaps incorrectly) that the parsing would be >> part of the software (MTA). >> >> I freely admit I only picked up what seems the tail end of this thread >> but having used SA for so many years I think I have at least a handle >> on how it plays (hence the failover). My failover SA is in place to >> handle slow queries from the primary SA. Assuming (again) that mail >> size has been factored and any AV is running remotely? >> >> Just a few thoughts based on a very cursory read of a few posts, sadly >> - or happily, work make my contributions here limited. >> >> I'd be interested in the results of this though. 
>> >> Kind regards >> >> Nigel >> >> PS - apologies if I'm repeating prior observations. >> >> On Fri, 31 Jul 2009 10:41:47 -0700 (PDT), poifgh >> wrote: >> >>> >>> >>> >>>Henrik K wrote: Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was used and any nondefault rules/settings? Certainly sounds strange that 1 core could top out the same. Anyone else have figures? Maybe I've borked something myself.. >>> >>>The problem is not with 22 being a low number, but when we have other free >>>cores to run different SA parallely why doesnt the throughput scale >linearly >>>.. I expect for 8 cores with 8 SA running simultaneously the number to be >>>150+ msgs/sec but it is 1/3rd at 50 msgs/sec >> >>
Re: Parallelizing Spam Assassin
In my tests there was no MTA. The mails/spam were collected from some server in mbox format and fed to SA using the --mbox switch. The size of the msgs was not altered in any fashion; they are just the usual size of incoming spam/mails.

There is no AV [you mean Anti Virus, right?] running on the machine.

I will be back with results.

Nigel Frankcom-2 wrote:
> I'm assuming you run a tad more messages than I, but on a quad with a
> failover I have never seen the failover kick in 4 years. This is not
> disputing your observations, just noting mine.
>
> I claim absolutely no knowledge about the core processing/stacking
> though I would assume (perhaps incorrectly) that the parsing would be
> part of the software (MTA).
>
> I freely admit I only picked up what seems the tail end of this thread
> but having used SA for so many years I think I have at least a handle
> on how it plays (hence the failover). My failover SA is in place to
> handle slow queries from the primary SA. Assuming (again) that mail
> size has been factored and any AV is running remotely?
>
> Just a few thoughts based on a very cursory read of a few posts, sadly
> - or happily, work make my contributions here limited.
>
> I'd be interested in the results of this though.
>
> Kind regards
>
> Nigel
>
> PS - apologies if I'm repeating prior observations.
>
> On Fri, 31 Jul 2009 10:41:47 -0700 (PDT), poifgh wrote:
>
>> Henrik K wrote:
>>> Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without
>>> Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was
>>> used and any nondefault rules/settings? Certainly sounds strange that 1 core
>>> could top out the same. Anyone else have figures? Maybe I've borked
>>> something myself..
>>
>> The problem is not with 22 being a low number, but when we have other free
>> cores to run different SA parallely why doesnt the throughput scale linearly
>> .. I expect for 8 cores with 8 SA running simultaneously the number to be
>> 150+ msgs/sec but it is 1/3rd at 50 msgs/sec
Re: Parallelizing Spam Assassin
I'm assuming you run a tad more messages than I, but on a quad with a failover I have never seen the failover kick in 4 years. This is not disputing your observations, just noting mine. I claim absolutely no knowledge about the core processing/stacking though I would assume (perhaps incorrectly) that the parsing would be part of the software (MTA). I freely admit I only picked up what seems the tail end of this thread but having used SA for so many years I think I have at least a handle on how it plays (hence the failover). My failover SA is in place to handle slow queries from the primary SA. Assuming (again) that mail size has been factored and any AV is running remotely? Just a few thoughts based on a very cursory read of a few posts, sadly - or happily, work make my contributions here limited. I'd be interested in the results of this though. Kind regards Nigel PS - apologies if I'm repeating prior observations. On Fri, 31 Jul 2009 10:41:47 -0700 (PDT), poifgh wrote: > > > >Henrik K wrote: >> >> Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without >> Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was >> used >> and any nondefault rules/settings? Certainly sounds strange that 1 core >> could top out the same. Anyone else have figures? Maybe I've borked >> something myself.. >> > >The problem is not with 22 being a low number, but when we have other free >cores to run different SA parallely why doesnt the throughput scale linearly >.. I expect for 8 cores with 8 SA running simultaneously the number to be >150+ msgs/sec but it is 1/3rd at 50 msgs/sec
Re: Parallelizing Spam Assassin
Henrik K wrote:
> Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without
> Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was
> used and any nondefault rules/settings? Certainly sounds strange that 1 core
> could top out the same. Anyone else have figures? Maybe I've borked
> something myself..

The problem is not that 22 is a low number. When we have other free cores to run different SA processes in parallel, why doesn't the throughput scale linearly? With 8 cores and 8 SA processes running simultaneously I expect 150+ msgs/sec, but it is about 1/3rd of that at 50 msgs/sec.
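The gap between the expected and the observed rate can be worked out directly from the numbers posted at the start of the thread. This is a minimal sketch, assuming the single-process rate (22 msgs/sec) is the right baseline for "linear" scaling; `scaling_report` is a name invented for the example.

```python
def scaling_report(rates):
    """Compare measured rates against perfect linear scaling.

    `rates` maps process count -> measured msgs/sec and must include 1.
    """
    base = rates[1]
    report = {}
    for procs, rate in sorted(rates.items()):
        speedup = rate / base
        report[procs] = {
            "ideal_rate": base * procs,      # what linear scaling would give
            "speedup": speedup,              # measured rate relative to 1 process
            "efficiency": speedup / procs,   # fraction of the ideal actually reached
        }
    return report


# Figures from the first message in the thread.
reported = {1: 22.0, 4: 44.0, 8: 57.0}

for procs, row in scaling_report(reported).items():
    print(
        f"{procs} procs: ideal {row['ideal_rate']:.0f} msgs/sec, "
        f"speedup {row['speedup']:.2f}x, efficiency {row['efficiency']:.0%}"
    )
```

With these inputs the ideal rate for 8 processes is 176 msgs/sec and the measured 57 msgs/sec is roughly a third of it, which matches the "1/3rd" estimate above. Efficiency is already down to 50% at 4 processes, so whatever limits scaling shows up well before all 8 cores are in use.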
Re: Parallelizing Spam Assassin
c. r. wrote:
> On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh wrote:
>> Why am I not seeing a linear increase in the throughput?
>> Is a file locking creating the bottleneck?
>
> Maybe the auto white list.

I can try turning off AWL and get back here.

Thnx
Re: Parallelizing Spam Assassin
Bernd Petrovitsch wrote:
> On Thu, 2009-07-30 at 23:55 -0700, poifgh wrote:
> [...]
>> I ran freshly build SA with Bayes and DNSBL turned off. Why am I not seeing
>> a linear increase in the throughput? Is a file locking creating the
> Because the bottleneck is not (only) the CPUs?
> Run `vmstat 1` or similar to see (or at least get an idea;-) if the
> workload is I/O bound or CPU-bound or
>
>> bottleneck? If yes, which particular file is being locked? If no, what could
> Maybe. The default "store in files" drivers locks the DBs exclusively
> for each access.
>
>> be the reason for this?
> Switch the DB backend to some MySQL or PostgreSQL (or whatever you like
> using from the "supported" ones). Run that on the very same machine and
> compare the numbers with the above.

Running 'top' with a single SA process gives 12.5% CPU utilization, which makes sense: one core out of 8 is fully utilized at that point, and the SA process reports 100% utilization for that CPU.

When the fork count goes to 8, each individual CPU is utilized between 30% and 70%, mostly staying around 30% with only a few reaching 70%.

I can run vmstat to check the I/O, which I don't think should be a problem: the disks are fast enough to deliver orders of magnitude more reads than 50 msgs/sec.

Can you elaborate on 'store in files'? What are these files, what are they used for, and can they be turned off?

Thnx
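For anyone who wants the CPU-bound versus I/O-bound split without eyeballing `top` or `vmstat`, the same counters can be read from the aggregate `cpu` line of `/proc/stat`. The sketch below is Linux-only and is an illustration rather than anything used in this thread: it takes two samples of that line as strings and reports where the time went in between. The sample values are made up.

```python
# Field order of the aggregate "cpu" line in /proc/stat (Linux).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu  ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    values += [0] * (len(FIELDS) - len(values))  # older kernels report fewer fields
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Percentage of time spent in each state between two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: second[name] - first[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * ticks / total for name, ticks in delta.items()}


# Hypothetical samples taken a few seconds apart while the scanners run.
sample_before = "cpu 100 0 50 800 50 0 0 0 0 0"
sample_after = "cpu 400 0 150 1200 250 0 0 0 0 0"

for state, percent in cpu_breakdown(sample_before, sample_after).items():
    print(f"{state:8s} {percent:5.1f}%")
```

A large `iowait` share alongside idle cores would point at the disks; mostly `idle` with little `iowait` while throughput is flat would point at something else, such as lock contention.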
Re: Parallelizing Spam Assassin
Henrik K wrote:
> Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without
> Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was
> used and any nondefault rules/settings? Certainly sounds strange that 1 core
> could top out the same. Anyone else have figures? Maybe I've borked
> something myself..

The rule sets were default.

1. Took a fresh SA download.
2. Ran [configured number of parallel] SA processes, each on a [different giant] mbox file, without DNSBL and with 'use_bayes 0' and 'bayes_auto_learn 0'.
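The procedure above can be sketched as a small harness: count the messages in each mbox, start one scanner process per file at the same moment, and divide the total message count by the wall-clock time. This is a guess at the shape of the test, not the script that produced the numbers in this thread. The function names are invented, and the scanner command is passed in by the caller so that nothing here depends on a particular SpamAssassin invocation.

```python
import mailbox
import subprocess
import time


def count_messages(mbox_path):
    """Return the number of messages in an mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return len(box)
    finally:
        box.close()


def run_parallel(commands):
    """Start every command at once, wait for all of them.

    Returns (elapsed wall-clock seconds, list of exit codes).
    """
    start = time.perf_counter()
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for cmd in commands
    ]
    codes = [proc.wait() for proc in procs]
    return time.perf_counter() - start, codes


def measure_rate(mbox_paths, command_prefix):
    """Scan each mbox with its own process; return overall messages per second."""
    total = sum(count_messages(path) for path in mbox_paths)
    elapsed, codes = run_parallel([list(command_prefix) + [path] for path in mbox_paths])
    if any(codes):
        raise RuntimeError(f"scanner exit codes: {codes}")
    return total / elapsed


# Assumed usage; only the --mbox switch is mentioned in the thread, the file
# names and the --local flag (skip network tests) are illustrative:
# measure_rate(["part1.mbox", "part2.mbox"], ["spamassassin", "--local", "--mbox"])
```

Timing the whole batch rather than each process separately matters here: if the mbox files differ in size, the slowest process determines the elapsed time, and per-process rates added together would overstate the throughput.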
Re: Parallelizing Spam Assassin
On Fri, 2009-07-31 at 08:25 -0700, John Hardin wrote: > On Fri, 31 Jul 2009, rich...@buzzhost.co.uk wrote: > > > ... dropping in here and making jokes at such low hanging fruit. > > Make all the jokes at Barracuda's expense that you like, complain about > them all you like, just avoid offensive language. Vitriol is more > impressive if you are creative enough to avoid using profanity and > vulgarity while still blasting your target to pieces. > Received and understood.
Re: Parallelizing Spam Assassin
On Fri, 31 Jul 2009, rich...@buzzhost.co.uk wrote: ... dropping in here and making jokes at such low hanging fruit. Make all the jokes at Barracuda's expense that you like, complain about them all you like, just avoid offensive language. Vitriol is more impressive if you are creative enough to avoid using profanity and vulgarity while still blasting your target to pieces. -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- Vista is at best mildly annoying and at worst makes you want to rush to Redmond, Wash. and rip somebody's liver out. -- Forbes --- 5 days until the 274th anniversary of John Peter Zenger's acquittal
Re: Parallelizing Spam Assassin
On Fri, 2009-07-31 at 07:26 -0400, Matt Kettler wrote: > rich...@buzzhost.co.uk wrote: > > On Fri, 2009-07-31 at 09:53 +0100, Justin Mason wrote: > > > >> On Fri, Jul 31, 2009 at 09:32, > >> rich...@buzzhost.co.uk wrote: > >> > >>> Imagine what Barracuda Networks could do with that if they did not fill > >>> their gay little boxes with hardware rubbish from the floors of MSI and > >>> supermicro. Jesus, try and process that many messages with a $30,000 > >>> Barracuda and watch support bitch 'You are fully scanning to much mail > >>> and making our rubbish hardware wet the bed.' LOL. > >>> > >> Richard -- please watch your language. This is a public mailing > >> list, and offensive language here is inappropriate. > >> > >> > > I apologise for the any language deemed offensive. Whilst 'Jesus', > > 'Bitch' and 'Wet the bed' are mostly acceptable, I offer no defence for > > openly swearing and using the filty phrase 'Barracuda Networks'. For > > this I apologise. > > > > > > > > > Richard, we are not joking. Please watch your language on this mailing > list, or you will be banned from it. > > You have now been warned by 2 members of the Project Management > Committee. You will not be warned again. > > > I have already apologised. I will not use the words you appear to have found offensive again. Can I ask, is this actually about the words I used *or* because of my comments regarding Barracuda Networks? I ask because I note they made a 'monetary donation' to Apache: http://www.barracudanetworks.com/ns/company/open-source.php If you want to ban me I will understand - you need to keep the wheels greased. It would give me more time to concentrate on leaking all the Barracuda code into the public domain, along with the various 'warez' tools I've written for it. This would probably be more beneficial to Barracuda Customers than dropping in here and making jokes at such low hanging fruit. 
If any Barracuda Customer would like to know how to unlock their barracuda without lifting the lid, or get change the model serial number and get free e.u. email me off list as I've just been banned for upsetting a sponsor LOL
Re: Parallelizing Spam Assassin
rich...@buzzhost.co.uk wrote: > On Fri, 2009-07-31 at 09:53 +0100, Justin Mason wrote: > >> On Fri, Jul 31, 2009 at 09:32, >> rich...@buzzhost.co.uk wrote: >> >>> Imagine what Barracuda Networks could do with that if they did not fill >>> their gay little boxes with hardware rubbish from the floors of MSI and >>> supermicro. Jesus, try and process that many messages with a $30,000 >>> Barracuda and watch support bitch 'You are fully scanning to much mail >>> and making our rubbish hardware wet the bed.' LOL. >>> >> Richard -- please watch your language. This is a public mailing >> list, and offensive language here is inappropriate. >> >> > I apologise for the any language deemed offensive. Whilst 'Jesus', > 'Bitch' and 'Wet the bed' are mostly acceptable, I offer no defence for > openly swearing and using the filty phrase 'Barracuda Networks'. For > this I apologise. > > > > Richard, we are not joking. Please watch your language on this mailing list, or you will be banned from it. You have now been warned by 2 members of the Project Management Committee. You will not be warned again.
Re: Parallelizing Spam Assassin
On Thu, 2009-07-30 at 23:55 -0700, poifgh wrote:
[...]
> I was measuring how quickly could SA [spam assassin] process spams when
> several SA processes are run in parallel over separate mbox files. I used a
> 8 core machine. Below are the numbers when I forked different number of
> processes.
>
> Fork = 8;
> Rate = 57 msgs/sec
>
> Fork = 4;
> Rate = 44 msgs/sec
>
> Fork = 1;
> Rate = 22 msgs/sec
>
>
> I ran freshly build SA with Bayes and DNSBL turned off. Why am I not seeing
> a linear increase in the throughput? Is a file locking creating the

Because the bottleneck is not (only) the CPUs?
Run `vmstat 1` or similar to see (or at least get an idea ;-) if the workload is I/O bound or CPU-bound or

> bottleneck? If yes, which particular file is being locked? If no, what could

Maybe. The default "store in files" driver locks the DBs exclusively for each access.

> be the reason for this?

Switch the DB backend to MySQL or PostgreSQL (or whichever of the "supported" ones you like). Run that on the very same machine and compare the numbers with the above.

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services
Re: Parallelizing Spam Assassin
On Fri, 2009-07-31 at 09:53 +0100, Justin Mason wrote: > On Fri, Jul 31, 2009 at 09:32, > rich...@buzzhost.co.uk wrote: > > Imagine what Barracuda Networks could do with that if they did not fill > > their gay little boxes with hardware rubbish from the floors of MSI and > > supermicro. Jesus, try and process that many messages with a $30,000 > > Barracuda and watch support bitch 'You are fully scanning to much mail > > and making our rubbish hardware wet the bed.' LOL. > > Richard -- please watch your language. This is a public mailing > list, and offensive language here is inappropriate. > I apologise for the any language deemed offensive. Whilst 'Jesus', 'Bitch' and 'Wet the bed' are mostly acceptable, I offer no defence for openly swearing and using the filty phrase 'Barracuda Networks'. For this I apologise.
Re: Parallelizing Spam Assassin
On Fri, Jul 31, 2009 at 09:32:42AM +0100, rich...@buzzhost.co.uk wrote: > On Thu, 2009-07-30 at 23:55 -0700, poifgh wrote: > > Hi > > > > I was measuring how quickly could SA [spam assassin] process spams when > > several SA processes are run in parallel over separate mbox files. I used a > > 8 core machine. Below are the numbers when I forked different number of > > processes. > > > > Fork = 8; > > Rate = 57 msgs/sec > > > > Fork = 4; > > Rate = 44 msgs/sec > > > > Fork = 1; > > Rate = 22 msgs/sec > > > > > > I ran freshly build SA with Bayes and DNSBL turned off. Why am I not seeing > > a linear increase in the throughput? Is a file locking creating the > > bottleneck? If yes, which particular file is being locked? If no, what could > > be the reason for this? > > > > thnx > Wow! That's a real flying machine! Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was used and any nondefault rules/settings? Certainly sounds strange that 1 core could top out the same. Anyone else have figures? Maybe I've borked something myself..
Re: Parallelizing Spam Assassin
On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk wrote: > Imagine what Barracuda Networks could do with that if they did not fill > their gay little boxes with hardware rubbish from the floors of MSI and > supermicro. Jesus, try and process that many messages with a $30,000 > Barracuda and watch support bitch 'You are fully scanning to much mail > and making our rubbish hardware wet the bed.' LOL. Richard -- please watch your language. This is a public mailing list, and offensive language here is inappropriate. -- --j.
Re: Parallelizing Spam Assassin
On Thu, 2009-07-30 at 23:55 -0700, poifgh wrote: > Hi > > I was measuring how quickly could SA [spam assassin] process spams when > several SA processes are run in parallel over separate mbox files. I used a > 8 core machine. Below are the numbers when I forked different number of > processes. > > Fork = 8; > Rate = 57 msgs/sec > > Fork = 4; > Rate = 44 msgs/sec > > Fork = 1; > Rate = 22 msgs/sec > > > I ran freshly build SA with Bayes and DNSBL turned off. Why am I not seeing > a linear increase in the throughput? Is a file locking creating the > bottleneck? If yes, which particular file is being locked? If no, what could > be the reason for this? > > thnx Wow! That's a real flying machine! Imagine what Barracuda Networks could do with that if they did not fill their gay little boxes with hardware rubbish from the floors of MSI and supermicro. Jesus, try and process that many messages with a $30,000 Barracuda and watch support bitch 'You are fully scanning to much mail and making our rubbish hardware wet the bed.' LOL. Well done you!
Re: Parallelizing Spam Assassin
On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh wrote: > Why am I not seeing a linear increase in the throughput? > Is a file locking creating the bottleneck? Maybe the auto white list. --
Re: Parallelizing Spam Assassin
hi -- turn off Bayes and AWL. On Fri, Jul 31, 2009 at 07:55, poifgh wrote: > > Hi > > I was measuring how quickly could SA [spam assassin] process spams when > several SA processes are run in parallel over separate mbox files. I used a > 8 core machine. Below are the numbers when I forked different number of > processes. > > Fork = 8; > Rate = 57 msgs/sec > > Fork = 4; > Rate = 44 msgs/sec > > Fork = 1; > Rate = 22 msgs/sec > > > I ran freshly build SA with Bayes and DNSBL turned off. Why am I not seeing > a linear increase in the throughput? Is a file locking creating the > bottleneck? If yes, which particular file is being locked? If no, what could > be the reason for this? > > thnx > -- > View this message in context: > http://www.nabble.com/Parallelizing-Spam-Assassin-tp24751958p24751958.html > Sent from the SpamAssassin - Users mailing list archive at Nabble.com. > > -- --j.