Re: help with regular expressions

2003-02-22 Thread Marck D Pearlstone
Hi Task,

22-Feb-2003, 03:02 -0400 (07:02 UK time) Task Control said:

   To process the text of the mails in vampire, I first use the
   uppercase function (i.e. uppercase(GooGLe) = GOOGLE). My
   questions are:

   - What happens if I apply the uppercase function to a regular
   expression? Will it work fine? -

Don't. It won't work fine. The RegEx engine must be passed the
unmodified raw text and pattern; let the analysis routines within
RegEx do their own case conversion.
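
To illustrate why, here is a minimal sketch in Python. It is an assumption that Python's `re` module behaves like the plug-in's engine on this point; it is used only as a stand-in. Uppercasing a pattern turns `\b` into `\B`, which changes its meaning, while a case-insensitive flag leaves the pattern intact.

```python
import re

pattern = r"\bgoogle\b"            # whole-word pattern, as the user wrote it
text = "Search with GooGLe today"

# Uppercasing the pattern turns \b (word boundary) into \B (NOT a boundary).
broken = pattern.upper()           # r"\BGOOGLE\B"
print(re.search(broken, text.upper()))                  # no match: meaning changed

# Leave pattern and text alone; ask the engine to ignore case instead.
print(bool(re.search(pattern, text, re.IGNORECASE)))    # matches
```

The same damage happens to `\s`, `\w` and `\d`, whose uppercase forms are their negations.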

 Which is better? a. Tell the plug-in user: your mails will be
 converted to uppercase letters before processing, so write your
 regex filters accordingly.

No.

 b. Silently convert the expressions and mail texts to uppercase.

No.

Best to say: if an expression is a regex, leave its case unmodified.

 - Would you like to see a regex filter in vampire?

Yes.

 Does anyone know a good regular expressions guide? Please reply with
 the URL.

http://www.silverstones.com/thebat/RegEx.html and TheBat help file.

-- 
Cheers -- .\\arck D Pearlstone -- List moderator
TB! v1.63 Beta/7 on Windows 2000 5.0.2195 Service Pack 2

Current version is 1.62 | Using TBDEV information:
http://www.silverstones.com/thebat/TBUDLInfo.html


Re[4]: Regular expressions in AntiSpam: is it possible?

2003-02-22 Thread Alexey N. Vinogradov
Hello, Task. 
You wrote in mid:[EMAIL PROTECTED]

TC And I don't understand what you are thinking of, guys; please
TC explain.
TC First question: what is a regular expression?

Well, the simplest example: you can see my letter prefix,
'You wrote in <mid:[EMAIL PROTECTED]>'. It is generated automatically
by a template using a regular expression. Every letter contains an ID
such as '<[EMAIL PROTECTED]>'. You need to insert 'mid:' inside the
angle brackets. That is done by just one line in the reply template:

%SetPattRegExp='\<(.*)\>' You wrote in <mid:%RegExpMatch='%OMSGID'>

The pattern for the regular expression is '\<(.*)\>'. The first \< and
the last \> simply mean < and >. The phrase in parentheses, (.*),
marks the very substring which will be returned as the result. Inside
the phrase, . is called an 'atom' and means any symbol; * is called a
'quantifier', is attached to . and means 'the previous atom any number
of times', so .* means any run of symbols. By applying this whole
regexp pattern to a string enclosed in angle brackets, like
'<[EMAIL PROTECTED]>', you get the string without the brackets, i.e.
'[EMAIL PROTECTED]', because the brackets in the pattern are outside
the parentheses. So the macro defines the pattern '\<(.*)\>' with
%SetPattRegExp, then applies it to the message ID with
%RegExpMatch='%OMSGID', and then places the result between
'You wrote in <mid:' and '>'.
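
The same extraction can be sketched outside The Bat!, here in Python's `re` module (an assumption made for illustration; the template macros themselves are not Python). The message ID below is a made-up placeholder, since the real one is elided in this thread. Python does not need a backslash before the angle brackets.

```python
import re

# Hypothetical Message-ID value; the real one is elided in the thread.
omsgid = "<1234567890@example.com>"

# The brackets sit outside the parentheses, so group 1 is the bare ID.
match = re.search(r"<(.*)>", omsgid)
bare_id = match.group(1) if match else ""

print(f"You wrote in <mid:{bare_id}>")
```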


...just a few more examples:

...if you have to find 'elephant' in a letter, you just need to define
an appropriate pattern that recognizes the variants you want (let me
treat a non-empty result as 'true' and an empty string as 'false'):

'\belephant\b'    TRUE for 'my elephant red' and FALSE for 'telephantom'
'\Belephant\B'    FALSE for 'my elephant red' and TRUE for 'telephantom'
'\W(elephant)\W'  TRUE for '$elephant#' and 's elephant df', FALSE for '3elephants'
'e\s+l\s+e\s+p\s+h\s+a\s+n\s+t' (a \s+ between every pair of letters)
  TRUE for 'e l e p h a n t' and 'e l  e p h   a n t', FALSE for plain
  'elephant' (use \s* instead of \s+ to match that as well)
'(?s)e.?l.?e.?p.?h.?a.?n.?t' (a .? between every pair of letters)
  TRUE for 'elephant', 'e.l.e.p.h.a.n.t' and
'e
l
e
p
h
a
n
t'
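
Patterns like these are easy to check mechanically. The sketch below uses Python's `re` module, which is assumed here to agree with PCRE on `\b`, `\B`, `\W` and `(?s)`; the helper name `found` is made up for the example.

```python
import re

def found(pattern, text):
    """True when the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

# \b is a word boundary, \B is the absence of one.
print(found(r"\belephant\b", "my elephant red"))
print(found(r"\belephant\b", "telephantom"))
print(found(r"\Belephant\B", "my elephant red"))
print(found(r"\Belephant\B", "telephantom"))

# \W needs a real non-word character on each side.
print(found(r"\W(elephant)\W", "$elephant#"))
print(found(r"\W(elephant)\W", "3elephants"))

# With (?s) the dot also matches a newline, so one letter per line is caught.
print(found(r"(?s)e.?l.?e.?p.?h.?a.?n.?t", "e\nl\ne\np\nh\na\nn\nt"))
```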

So, if you need to filter any specific variant, you just construct
the regexp from your signal word by adding a prefix and/or suffix to
it and (if necessary) by inserting something between the letters.

Let us take a string to search for, for example our 'elephant'. We can
test a text for it in two steps:

1. Construct a regexp pattern from the signal string:

- for a 'partial search', leave the string as it is:
  'elephant' -- 'elephant' (no change)
  
- for a 'whole-word search', add '\b' before and after the string:
  'elephant' -- '\belephant\b'
  
- for a 'space-delimited search', add '\s+' between the letters:
  'elephant' -- 'e\s+l\s+e\s+p\s+h\s+a\s+n\s+t'

- for anything more fanciful, just learn how to write the appropriate
  regexp.

2. Just apply the constructed pattern to the target text by calling
pcre.dll. That's all!
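
The two steps can be sketched as follows. This is only an illustration in Python, with the `re` module standing in for pcre.dll; the function names and the mode names 'partial', 'wholeword' and 'spaced' are hypothetical.

```python
import re

def build_pattern(word, mode="partial"):
    """Step 1: turn a signal word into a regexp pattern."""
    # Escape each letter so characters like '.' in the word stay literal.
    letters = [re.escape(ch) for ch in word]
    if mode == "partial":
        return "".join(letters)
    if mode == "wholeword":
        return r"\b" + "".join(letters) + r"\b"
    if mode == "spaced":
        return r"\s+".join(letters)
    raise ValueError(f"unknown mode: {mode}")

def is_hit(word, text, mode="partial"):
    """Step 2: apply the constructed pattern to the target text."""
    return re.search(build_pattern(word, mode), text) is not None

print(build_pattern("elephant", "spaced"))
print(is_hit("elephant", "telephantom", "partial"))
print(is_hit("elephant", "telephantom", "wholeword"))
print(is_hit("elephant", "e l e p h a n t", "spaced"))
```

Escaping the letters matters once the signal word itself contains regexp metacharacters.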



TC We can make a plug-in that calls a DLL, and use the DLL in The Bat!

I meant that The Bat! already uses regular expressions! And if I
understand correctly, it uses the same pcre library, but it is
statically linked into TheBat.exe (I think Stefan Tanurkov knows this
better :) My point was that if The Bat! developers made the library
accessible for sharing, then it would not be necessary for 'outsiders'
to deploy an extra copy of the library with their plugins.


-- 
Sincerely,
 Alexey.
Using TB 1.63b7 on WinXP SP1 Corp + MUI RU, spelling by ORFO2002
   mailto:[EMAIL PROTECTED]





Re[2]: vampire updated now 0.01c

2003-02-22 Thread Task Control
Dear followers of tbdev at thebat.dutaint.com:


Regarding what Luc posted earlier:

  

L Good evening Task,

L It was foretold that on 22-2-2003  16:50:28 GMT-0400 (which was
L 21:50:28 where I live) Task Control would mumble:

L <snipped a bit>
TC The first time, was it put in the same directory where we put
TC vampire? Do you see the correct directory in the title of the
TC window in the configuration utility?

L Yes

TC including c:\vampire.ini

L Question first: does the vampire.ini file need to be in the same
L directory as vampire?
No, vampire only works if vampire.ini is in c:\. In the next
releases I will move the information in vampire.ini to the Windows
registry.

-- 
Regards,
 Task Control 
   mail: TaskControl at SoftHome dot net

Using: 
- Windows 98 4.10.1998 
- AVG 6.0 Free Edition
- The Bat! 1.63 Beta/7
- Trillian PRO 1.0 B





Re: PACSPAM words list

2003-02-22 Thread Task Control
Dear followers of tbdev at thebat.dutaint.com:

Regarding what Leif posted earlier:

LG Hello users,

LG Well, here again are my current list of words. These have been tweaked
LG over and over since PACSPAM came out, and so far,

LG the only ones getting through right now are the ones completely in
LG Russian, or Korean. Anyone have any thoughts on being able to
LG filter those ones?
Easy: capture the ASCII characters that represent these words and put
them in your list files.

LG The problem I'm having filtering on those is trying to find words
LG (quoted, because it all looks like garbage on my screen), which would
LG be more or less common in all of them.
Maybe you can filter on the name of the character encoding (charset);
I think you can find it in the raw message.
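
As a rough sketch of that idea (in Python, not in the plug-in's own code): the charset is normally declared in the Content-Type header of the raw message, so it can be read without understanding the body. The set of suspect charsets and the function names below are assumptions made for the example.

```python
from email import message_from_string

# Assumed list of charsets to flag; adjust to the languages you cannot read.
SUSPECT_CHARSETS = {"koi8-r", "windows-1251", "euc-kr", "ks_c_5601-1987"}

def declared_charsets(raw_message):
    """Collect the charset named in each part's Content-Type header."""
    msg = message_from_string(raw_message)
    found = set()
    for part in msg.walk():
        charset = part.get_content_charset()   # lower-cased, or None
        if charset:
            found.add(charset)
    return found

def looks_foreign(raw_message):
    """True when any part declares one of the suspect charsets."""
    return bool(declared_charsets(raw_message) & SUSPECT_CHARSETS)

raw = (
    "From: someone@example.com\n"
    "Subject: test\n"
    'Content-Type: text/plain; charset="KOI8-R"\n'
    "\n"
    "body\n"
)
print(looks_foreign(raw))
```

A message that declares no charset, or a misleading one, would slip past this check, so it can only complement a word list.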

LG So, these words are in the UNKNOWN / BODY / COMPLETE WORDS list. I'd
LG really like to see other people's lists too. Once I'm happy that
LG nothing is being caught that shouldn't. I'll move this words list into
LG the TRASH tab.

Why are you not using vampire? I abandoned the development of
pacspam because ... you can find the explanation on the pacspam page.
You can use your pacspam list in vampire.

LG :pacSpam - unknow list
LG :
[...]
LG :
If you put the list of words in the body, the plug-in may say "it is
spam", you know, because of all the spam words! Please zip it (or rar
it) when you send it.

-- 
Regards,
 Task Control 
   mail: TaskControl at SoftHome dot net

Using: 
- Windows 98 4.10.1998 
- AVG 6.0 Free Edition
- The Bat! 1.63 Beta/7
- Trillian PRO 1.0 B





Re: help with regular expressions

2003-02-22 Thread Leif Gregory
Hi Task,

On Sat, 22 Feb 2003, at 03:02:49 [GMT -0400] (which was 12:02 AM where
I live) you wrote:
TC Hi, I'm working on a regular expressions (regex) filter in
TC vampire. I use the freeware TRegExpr, available at
TC http://anso.da.ru

One thing that regexps should help with is the case I mentioned once
before, where I was receiving a lot of SPAM with text similar to the
below:

To be re move d from fu ture e mail ings cli ck here

With regexps, we can have it ignore whitespace (space, tab etc). That
should help out considerably if Bayesian filtering catches on, because
splitting words up like this is the only way I can foresee spammers
being able to send you their message without inventing a whole new
lexicon (short of just misspelling everything).
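
A minimal sketch of such a whitespace-ignoring match, written in Python for illustration (the plug-ins discussed here are not Python, and the helper name `spaced_out` is made up): every letter of the phrase is joined with `\s*`, so any amount of whitespace, including none, is allowed between letters.

```python
import re

def spaced_out(phrase):
    """Allow any amount of whitespace (including none) between the letters."""
    letters = [re.escape(ch) for ch in phrase if not ch.isspace()]
    return re.compile(r"\s*".join(letters), re.IGNORECASE)

spam = "To be re move d from fu ture e mail ings cli ck here"
pattern = spaced_out("removed from future emailings")

print(bool(pattern.search(spam)))
print(bool(pattern.search("Removed from future emailings")))
```

Because `\s*` also allows zero whitespace, one pattern covers both the broken-up and the normal spelling.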

Speaking of which.. Everyone has put the thebat.dutaint.com as a
PARTIAL word in the KLUDGES of the SECURE tab right? If you don't do
something to that effect, since we are discussing SPAM and word lists,
they'll be caught by the PACSPAM plugin. By doing that, it covers all
the lists except for TBOT, but here's my PARTIAL words list for
KLUDGES on the SECURE tab:

thebat.dutaint.com
[EMAIL PROTECTED]
TBLH
silverstones.com

The last two are for another list I run, and for list moderation of
the TB lists that Marck has set up, so you probably don't need those
last two.


-- 
Cheers,
Leif Gregory 

List Moderator (and fellow registered end-user)
PCWize Editor  /  ICQ 216395  /  PGP Key ID 0x7CD4926F
Web Site http://www.PCWize.com
TB FAQ   http://www.silverstones.com/thebat/FAQ.html
Using The Bat! 1.63 Beta/6 under Windows 2000 5.0 Build 2195 Service Pack 3 
on a P4 1.6Ghz OC'd to 2.32Ghz with 512MB.

Tagline of the day:
Where are we going and why am I in this handbasket?







vampire with support to regular expressions uploaded

2003-02-22 Thread Task Control
Dear followers of tbdev at thebat.dutaint.com:

  A beta version for testing, this time with support for regular
  expressions. You can get it and test it from:
  
  http://fyberger.tripod.com/vampire/vampire.htm

  New in this version:
  - new way of handling international support
  - regular expressions
  - syntax check utility for regex's
  
-- 
Regards,
 Task Control 
   mail: TaskControl at SoftHome dot net

Using: 
- Windows 98 4.10.1998 
- AVG 6.0 Free Edition
- The Bat! 1.63 Beta/7
- Trillian PRO 1.0 B



