[SAtalk] Re: Thinking of performance

2002-05-10 Thread Daniel Pittman
On Sat, 11 May 2002, Mail Admin wrote: > Hi, I want to use spamassassin on a system where real heavy load > exists. I have 540,000 incoming emails daily. I know spamc/spamd do > well under moderate load, but this is not enough. Did anybody think > of rewriting spamassassin in C, Yup. It's been s

[SAtalk] Re: Thinking of performance

2002-05-11 Thread Daniel Pittman
On Sat, 11 May 2002, Craig R. Hughes wrote: > Daniel Pittman wrote: > > DP> For the first, *nothing* that you do is likely to improve things > DP> much other than rewriting the rules themselves; this can be done > DP> equally well with Perl. > > Rule optimization is proceeding. You might find a

[SAtalk] Re: Thinking of performance

2002-05-12 Thread Daniel Pittman
On Sat, 11 May 2002, Marc MERLIN wrote: > On Sat, May 11, 2002 at 02:26:55PM -0500, dman wrote: [...] > I am using spamd, but I'm pretty sure what's killing me are the rbl > checks. [...] > However, since we're going to do up to 10 queries, and each can be > blocking, wouldn't it be better to

[SAtalk] Re: Thinking of performance

2002-05-13 Thread Daniel Pittman
On Sun, 12 May 2002, [EMAIL PROTECTED] wrote: > On Mon, May 13, 2002 at 11:52:23AM +1200, Jason Haar wrote: > | On Sat, May 11, 2002 at 10:33:41AM +1000, Daniel Pittman wrote: > | > Fix that first, if you want to fix anything. Grab, or write, a > | > version of spamproxyd that you trust[1] with yo

[SAtalk] Re: Thinking of performance

2002-05-13 Thread Daniel Pittman
On Sun, 12 May 2002, Marc MERLIN wrote: > On Mon, May 13, 2002 at 11:52:23AM +1200, Jason Haar wrote: >> I'd suggest the opposite is better: have the real MTA relay it to >> spamproxyd. If you do it your way, you've just lost all anti-relaying >> protection... > > Yep. Running spamproxyd is reall

Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Arpi
Hi, > On Sat, 11 May 2002, Mail Admin wrote: > > Hi, I want to use spamassassin on a system where real heavy load > > exists. I have 540,000 incoming emails daily. I know spamc/spamd do > > well under moderate load, but this is not enough. Did anybody think > > of rewriting spamassassin in C, >

Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread dman
On Sat, May 11, 2002 at 04:09:43PM +0200, Arpi wrote: | (yes, i agree on that perl is useful thing for text processing, but it is | no more true when high performance does matter - then asm+c kicks in) This completely depends. First you MUST *profile* to determine where the hotspots are. Mayb
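The "profile first" advice in this message can be illustrated with a minimal sketch. It is written in Python with the standard cProfile module purely for illustration; SpamAssassin itself is Perl, and `score_message` and `profile_top` are hypothetical stand-ins, not anything from the thread or the project.

```python
import cProfile
import io
import pstats
import re


def score_message(text, rules):
    """Hypothetical stand-in for rule matching: count the rules that hit."""
    return sum(1 for rule in rules if rule.search(text))


def profile_top(func, *args, limit=5):
    """Run func under the profiler and return (result, text report).

    The report lists the `limit` entries with the highest cumulative time,
    which is where any rewrite effort should be aimed first.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, out.getvalue()


if __name__ == "__main__":
    rules = [re.compile(p) for p in (r"free", r"viagra", r"\d{3}")]
    hits, report = profile_top(score_message, "free offer 123", rules)
    print(hits)
    print(report)
```

Only the functions that dominate the report are candidates for rewriting in a lower-level language; the rest is unlikely to repay the effort.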

Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Craig R Hughes
Daniel Pittman wrote: DP> For the first, *nothing* that you do is likely to improve things much DP> other than rewriting the rules themselves; this can be done equally well DP> with Perl. Rule optimization is proceeding. You might find a better/faster regex engine, but you'll probably have to r

Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Marc MERLIN
On Sat, May 11, 2002 at 02:26:55PM -0500, dman wrote: > This completely depends. First you MUST *profile* to determine where > the hotspots are. Maybe _those_ pieces of the program would be better > in C or ASM. Remember that 90% of the execution time is spent in 10% > of the code (generally).

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Nathan Neulinger
> However, since we're going to do up to 10 queries, and each can be blocking, > wouldn't it be better to fork for each DNS lookup (even optionally) and kill > the children if the DNS query hasn't returned in x seconds? > That way, since all the DNS queries are run in parallel, at worst, you spen
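The proposal quoted here (run the lookups in parallel and stop waiting after x seconds) can be sketched as follows. This is a minimal illustration in Python, not SpamAssassin code: it swaps the forked children for worker threads, abandons slow lookups instead of killing them, and `resolve` and `parallel_lookups` are hypothetical names chosen for the example.

```python
import concurrent.futures
import socket


def resolve(name):
    """Blocking lookup; raises OSError if the name does not resolve."""
    return socket.gethostbyname(name)


def parallel_lookups(names, timeout=5.0, lookup=resolve):
    """Start every lookup at once and wait at most `timeout` seconds in total.

    Returns {name: result}. A name maps to None if its lookup failed or
    had not finished when the deadline passed.
    """
    results = {name: None for name in names}
    if not results:
        return results
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(results))
    futures = {pool.submit(lookup, name): name for name in results}
    done, _pending = concurrent.futures.wait(futures, timeout=timeout)
    for future in done:
        if future.exception() is None:
            results[futures[future]] = future.result()
    # Do not block on stragglers; they finish (or fail) in the background.
    pool.shutdown(wait=False)
    return results
```

With this shape the worst case for a message is roughly one timeout, rather than the sum of the timeouts of each blocking query run in sequence.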

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Jason Haar
On Sat, May 11, 2002 at 10:33:41AM +1000, Daniel Pittman wrote: > Fix that first, if you want to fix anything. Grab, or write, a version > of spamproxyd that you trust[1] with your email, then have inbound SMTP > talk directly to that and have it relay on to the real MTA. I'd suggest the opposite

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread dman
On Mon, May 13, 2002 at 11:52:23AM +1200, Jason Haar wrote: | On Sat, May 11, 2002 at 10:33:41AM +1000, Daniel Pittman wrote: | > Fix that first, if you want to fix anything. Grab, or write, a version | > of spamproxyd that you trust[1] with your email, then have inbound SMTP | > talk directly to

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Jeremy Zawodny
On Sun, May 12, 2002 at 08:57:32PM -0500, dman wrote: > > Why not just embed spamc in the MTA itself? Then there's no extra > process running and the MTA just does a little more socket work > passing the message through spamd. In fact, Marc's sa-exim patch > almost does this. The only thing i

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Vivek Khera
> "DP" == Daniel Pittman <[EMAIL PROTECTED]> writes: >> Rule optimization is proceeding. You might find a better/faster regex >> engine, but you'll probably have to re-optimize the rules for that >> engine vs the perl engine. I think we're going to be focussing on In one of my previous lives

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Marc MERLIN
On Sun, May 12, 2002 at 05:28:27PM +1000, Daniel Pittman wrote: > > However, since we're going to do up to 10 queries, and each can be > > blocking, wouldn't it be better to fork for each DNS lookup (even > > optionally) and kill the children if the DNS query hasn't returned in > > x seconds? >

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Richie Laager
On Sunday 12 May 2002 21:20 pm, Jeremy Zawodny wrote: > That would be idea. Being able to just add a few lines of > code and link in libspamc or libspamassasin or whatever > would rock. Indeed it would. Now, imagine for a second that we have two ve

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Marc MERLIN
On Sun, May 12, 2002 at 09:53:51AM -0500, Nathan Neulinger wrote: > > > However, since we're going to do up to 10 queries, and each can be blocking, > > wouldn't it be better to fork for each DNS lookup (even optionally) and kill > > the children if the DNS query hasn't returned in x seconds? > >

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Marc MERLIN
On Mon, May 13, 2002 at 11:52:23AM +1200, Jason Haar wrote: > I'd suggest the opposite is better: have the real MTA relay it to > spamproxyd. If you do it your way, you've just lost all anti-relaying > protection... Yep. Running spamproxyd is really not an option for most of us. You lose (if I'm

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Marc MERLIN
On Sun, May 12, 2002 at 08:48:41PM -0700, Marc MERLIN wrote: > > What about using the bgsend/bgisready functionality in Net::DNS? That > > should allow multiple queries in the background in parallel. > > Sounds like a great idea, I wasn't aware of the functionality. > Craig, is it something: > -

Re: [SAtalk] Re: Thinking of performance

2002-05-12 Thread Marc MERLIN
On Sun, May 12, 2002 at 08:57:21PM -0700, Marc MERLIN wrote: > Yep. Running spamproxyd is really not an option for most of us. > You lose (if I'm not mistaken) > - SMTP AUTH > - STARTTLS/SSL > - The IP of the real sender Sorry, I meant: the ability to block senders by IP from your MTA (accept the

Re: [SAtalk] Re: Thinking of performance

2002-05-13 Thread Daniel Quinlan
Jesus Climent wrote: > Anyone's comments about the upload of the c code in the cvs? It's interesting, but given how quickly spammers adapt and change, I think trying to keep up with them using C as the implementation language will be problematic. It seems like any solution that loses the flexib

Re: [SAtalk] Re: Thinking of performance

2002-05-13 Thread Jesus Climent
On Mon, May 13, 2002 at 02:05:24AM -0700, Daniel Quinlan wrote: > Jesus Climent wrote: > > > Anyone's comments about the upload of the c code in the cvs? > > A few things off the top of my head: > > - Limit the maximum amount of data going into body tests (max [snip] > - Order tests bette

Re: [SAtalk] Re: Thinking of performance

2002-05-13 Thread Matt Sergeant
Jesus Climent wrote: > On Sat, May 11, 2002 at 06:23:19PM +0200, Arpi wrote: > >>Hi, >> >> >>>I will add the code to the CVS and put a mark that is still beta >> > code. > >>where should i send the updates later? >>i mean i see no sense of having it in your cvs, while the developed > > version

Re: [SAtalk] Re: Thinking of performance

2002-05-13 Thread Craig R Hughes
Marc MERLIN wrote: MM> That's what I wanted to do originally. MM> I just didn't do it in my current version of SA-Exim, because I didn't want MM> to track the spamd protocol, or embed spamc into exim right now, and then MM> have to maintain that. MM> That said, if it were to be a library, I'd

Re: [SAtalk] Re: Thinking of performance

2002-05-13 Thread Craig R Hughes
Marc MERLIN wrote: MM> On Sun, May 12, 2002 at 09:53:51AM -0500, Nathan Neulinger wrote: MM> > MM> > What about using the bgsend/bgisready functionality in Net::DNS? That MM> > should allow multiple queries in the background in parallel. MM> MM> Sounds like a great idea, I wasn't aware of the fun

Re: [SAtalk] Re: Thinking of performance

2002-05-13 Thread Marc MERLIN
On Mon, May 13, 2002 at 11:35:40AM +0200, Jesus Climent wrote: > Having a strict configuration in my MTA (not allowing the use of our > domain for mail not coming from localhost), I do not want to check mail > in MTA level (using spamd/c) originated in my own system. SA-Exim lets you do that and

Re: [SAtalk] Re: Thinking of performance

2002-05-13 Thread Marc MERLIN
On Mon, May 13, 2002 at 07:21:05PM +1000, Daniel Pittman wrote: > > Yep. Running spamproxyd is really not an option for most of us. > > You lose (if I'm not mistaken) > > - SMTP AUTH > > Nope, at least not with my model. Admittedly this /is/ vaporware > because I don't need it, but it would work

Re: [SAtalk] Re: Thinking of performance

2002-05-13 Thread Daniel Quinlan
Jesus Climent wrote: >> Having a strict configuration in my MTA (not allowing the use of our >> domain for mail not coming from localhost), I do not want to check mail >> in MTA level (using spamd/c) originated in my own system. Marc MERLIN <[EMAIL PROTECTED]> writes: > SA-Exim lets you do that

Re: Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Jesus Climent
On Sat, May 11, 2002 at 05:19:26PM +0200, Arpi wrote: > Hi, > > as i said - it is not in usable form yet > it's just the "core", no interfaces added yet > > anyway usage is simple: > cat mail.txt | ./check > > it will print spam stats on it OK. I will add the code to the CVS and put a mark

Re: Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Arpi
Hi, > The only argument I'm making is against the preconceived notion that a > program in C is always faster than the same program in perl. agree. there are good c programmers and bad c programmers. good ones know how to write fast c code - bad ones don't. good ones usually thinking in lowlevel a

Re: Re: Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Arpi
Hi, > I will add the code to the CVS and put a mark that is still beta code. where should i send the updates later? i mean i see no sense of having it in your cvs, while the developed version is here... > If someone is decided to hack it a bit more, then they will be welcome > to do it. do you

Re: Re: Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Jesus Climent
On Sat, May 11, 2002 at 06:23:19PM +0200, Arpi wrote: > Hi, > > > I will add the code to the CVS and put a mark that is still beta code. > > where should i send the updates later? > i mean i see no sense of having it in your cvs, while the developed version > is here... Usually patches, so far,

Re: Re: Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Craig R Hughes
Arpi wrote: A> Hi, A> A> > I will add the code to the CVS and put a mark that is still beta code. A> A> where should i send the updates later? We can add it to the spamassassin CVS tree and handle bugfixes/updates through bugzilla. We can give CVS write access to you for working on the code the

Re: Re: Re: [SAtalk] Re: Thinking of performance

2002-05-11 Thread Craig R Hughes
Jesus Climent wrote: JC> Usually patches, so far, go to the mailing list. That's discouraged now the volumes are getting higher. Please submit patches as attachments to bugzilla tickets at http://bugzilla.spamassassin.org/ JC> Anyone's comments about the upload of the c code in the cvs? Well,