>-----Original Message-----
>From: Keith Ivey [mailto:[EMAIL PROTECTED]
>Sent: Thursday, April 07, 2005 10:32 PM
>To: users@spamassassin.apache.org
>Cc: Jesse Houwing
>Subject: Re: Extra Sare Rules for meds?
>
>
>Jesse Houwing wrote:
>
>> BODY TABLEOBFU
>>
>> m{<td([^>]+|"[^"]+)>(<([^>]+|"[^"]+)>)*[a-z]{1,2}(<([^>]+|"[^"]+)>)*</td([^>]+|"[^"]+)>}i
>
>I think you may want a * after the ) inside the <>.  As it is,
>you're looking for either a bunch of characters that are not >
>or a quote followed by a bunch of characters that are not quote.
>In fact, I think what was really intended was something more
>like this (note that this also requires an ending quote on
>contained quoted strings and allows ""):
>
>m{<td([^>"]+|"[^"]*")*>(<([^>"]+|"[^"]*")*>)*[a-z]{1,2}(<([^>"]+|"[^"]*")*>)*</td([^>"]+|"[^"]*")*>}i
>
>
>The other problem with the pattern as written (with no *) is
>that the subpatterns don't match plain <td> or </td>, since they
>require at least one character between the td and the >.
>
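For anyone who wants to see the difference between the two patterns quoted above, here is a rough sketch that runs both against a few short strings. It uses Python's re module rather than Perl (the syntax of these two patterns carries over unchanged, as far as I can tell), and the compare() helper and the sample strings are made up for illustration; they are not taken from any SARE rule set or real spam.

```python
import re

# Jesse's pattern as posted: every tag needs at least one character after
# "td", and an opening quote is never required to be closed.
ORIGINAL = re.compile(
    r'<td([^>]+|"[^"]+)>'
    r'(<([^>]+|"[^"]+)>)*'
    r'[a-z]{1,2}'
    r'(<([^>]+|"[^"]+)>)*'
    r'</td([^>]+|"[^"]+)>',
    re.IGNORECASE,
)

# Keith's suggested revision: attribute text is optional, and quoted
# strings must be closed (so a ">" inside quotes does not end the tag).
REVISED = re.compile(
    r'<td([^>"]+|"[^"]*")*>'
    r'(<([^>"]+|"[^"]*")*>)*'
    r'[a-z]{1,2}'
    r'(<([^>"]+|"[^"]*")*>)*'
    r'</td([^>"]+|"[^"]*")*>',
    re.IGNORECASE,
)


def compare(html):
    """Return (original_matches, revised_matches) for one HTML snippet."""
    return (ORIGINAL.search(html) is not None,
            REVISED.search(html) is not None)


# Hypothetical test strings, kept short on purpose.
samples = [
    '<td>v</td>',                # plain tags: only the revision matches
    '<td x>v</td x>',            # text inside both tags: both match
    '<td title="a>b">x</td >',   # ">" inside quotes: only the revision matches
    '<td>word</td>',             # more than two letters: neither matches
]

if __name__ == "__main__":
    for s in samples:
        print(s, compare(s))
```

The first sample is the plain <td>/</td> case Keith describes, and the third shows what requiring the closing quote buys you. The samples are deliberately short: the nested quantifiers in the revised pattern can backtrack heavily on long input that fails to match.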
One of the things the SARE group has realized is that using '*' in any regex is a bad idea. Trust me on that one. We avoid it like the plague.

--Chris