Re: [sa] Re: Yahoo/URL spam

2010-03-24 Thread Mike Grau

On 3/23/2010 2:49 PM the voices made Charles Gregory write:

On Tue, 23 Mar 2010, Alex wrote:

This is what I have:
/^[^a-z]{0,10}(http:\/\/|www\.)(\w+\.)+(com|net|org|biz|cn|ru)\/?[^ ]{0,20}[a-z]{0,10}$/msi


My bad. I got an option wrong. Please remove the 'm' above.
I always get it backwards. According to 'man perlre' (the definitive
resource for SA regexes!) the 'm' makes '^' match after every newline!
We want it to only match the beginning of the body.

So just remove it, and, as noted by others, add the '^' that was
missing... like so

... ]{0,20}[^a-z]{0,10}$/si
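
For anyone following along, the effect of the 'm' flag can be reproduced outside SA. This is only a sketch in Python's re module rather than Perl, on the assumption that re.M, re.S and re.I behave like Perl's /m, /s and /i for this pattern, and that the text under test arrives as one string. The sample bodies and the example.com host are made up for illustration, and the trailing class already has the '^' added.

```python
import re

# The rule from this thread, with the trailing [^a-z] fix applied.
# Python needs no backslash before '/', otherwise the pattern is unchanged.
PATTERN = (r"^[^a-z]{0,10}(http://|www\.)(\w+\.)+"
           r"(com|net|org|biz|cn|ru)/?[^ ]{0,20}[^a-z]{0,10}$")

with_m = re.compile(PATTERN, re.M | re.S | re.I)   # like /msi
without_m = re.compile(PATTERN, re.S | re.I)       # like /si

# Hypothetical legitimate message: the URL sits on its own line mid-body.
legit = ("Hi Bob,\nhere is the link you asked for:\n"
         "http://www.example.com/\nSee you tomorrow.\n")

# Hypothetical spam message: the body is nothing but a URL.
spam = "http://www.example.com/abc\n"

# With 'm', '^' and '$' anchor at every line, so the legit mail hits too.
print("legit, /msi:", bool(with_m.search(legit)))
# Without 'm', '^' only anchors at the very start of the string.
print("legit, /si: ", bool(without_m.search(legit)))
print("spam,  /si: ", bool(without_m.search(spam)))
```

In this sketch only the URL-only body is hit once the 'm' is gone, which is the behaviour described above.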


Hello,

You might want to change  (\w+\.)+  to  ([\w-]+\.)+  to account for
hyphenated domains like polster-jj.de
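
A quick way to see the difference, sketched in Python's re module on the assumption that \w and character classes behave as they do in Perl here. The host www.some-shop.com is a made-up example (chosen because .com is in the rule's TLD list), and HEAD/TAIL are just helper names for this sketch.

```python
import re

# Shared front and back of the rule; only the host-label part varies below.
HEAD = r"^[^a-z]{0,10}(http://|www\.)"
TAIL = r"(com|net|org|biz|cn|ru)/?[^ ]{0,20}[^a-z]{0,10}$"

word_only = re.compile(HEAD + r"(\w+\.)+" + TAIL, re.S | re.I)
with_hyphen = re.compile(HEAD + r"([\w-]+\.)+" + TAIL, re.S | re.I)

hyphenated = "http://www.some-shop.com/"   # hypothetical hyphenated host
plain = "http://www.example.com/"          # hypothetical plain host

# \w does not include '-', so the original label pattern stops at "some".
print("word_only,   hyphenated:", bool(word_only.search(hyphenated)))
print("with_hyphen, hyphenated:", bool(with_hyphen.search(hyphenated)))
# Plain hosts are matched by both variants.
print("word_only,   plain:     ", bool(word_only.search(plain)))
print("with_hyphen, plain:     ", bool(with_hyphen.search(plain)))
```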


-- MG


Re: [sa] Re: Yahoo/URL spam

2010-03-23 Thread Alex

Hi,

>> This is what I have:
>> /^[^a-z]{0,10}(http:\/\/|www\.)(\w+\.)+(com|net|org|biz|cn|ru)\/?[^ ]{0,20}[a-z]{0,10}$/msi
>
> My bad. I got an option wrong. Please remove the 'm' above.
> I always get it backwards. According to 'man perlre' (the definitive
> resource for SA regexes!) the 'm' makes '^' match after every newline!
> We want it to only match the beginning of the body.

Much better. Not sure how I introduced that typo. Thanks so much.

Best,
Alex


Re: [sa] Re: Yahoo/URL spam

2010-03-23 Thread Charles Gregory

On Tue, 23 Mar 2010, Alex wrote:

This is what I have:
/^[^a-z]{0,10}(http:\/\/|www\.)(\w+\.)+(com|net|org|biz|cn|ru)\/?[^ ]{0,20}[a-z]{0,10}$/msi


My bad. I got an option wrong. Please remove the 'm' above.
I always get it backwards. According to 'man perlre' (the definitive 
resource for SA regexes!) the 'm' makes '^' match after every newline!

We want it to only match the beginning of the body.

So just remove it, and, as noted by others, add the '^' that was 
missing... like so


... ]{0,20}[^a-z]{0,10}$/si
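
To show what the missing '^' changes, here is a small sketch in Python's re module, assuming its classes and flags behave like Perl's for this pattern. Both sample bodies and the example.com host are invented for illustration; FRONT is just a helper name for this sketch.

```python
import re

# Everything up to the final character class is identical in both versions.
FRONT = (r"^[^a-z]{0,10}(http://|www\.)(\w+\.)+"
         r"(com|net|org|biz|cn|ru)/?[^ ]{0,20}")

typo = re.compile(FRONT + r"[a-z]{0,10}$", re.S | re.I)    # '^' missing
fixed = re.compile(FRONT + r"[^a-z]{0,10}$", re.S | re.I)  # '^' restored

# URL followed by a space and punctuation: trailing non-letters.
trailing_junk = "http://www.example.com/ !!\n"

# URL with a 25-letter path: 20 fit in [^ ]{0,20}, 5 letters are left over.
long_path = "http://www.example.com/" + "abcdefghijklmnopqrstuvwxy"

# The typo version wants letters at the end, so it misses the junk case
# and accepts the leftover letters of the long path.
print("typo,  trailing junk:", bool(typo.search(trailing_junk)))
print("fixed, trailing junk:", bool(fixed.search(trailing_junk)))
print("typo,  long path:    ", bool(typo.search(long_path)))
print("fixed, long path:    ", bool(fixed.search(long_path)))
```

With the '^' restored, the tail mirrors the leading [^a-z]{0,10}: a few non-letters are tolerated after the URL, extra words are not.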

- Charles