On Wed, 2005-04-27 at 21:16 +0200, Paul J Stevens wrote:
> Geo Carncross wrote:
> > On Wed, 2005-04-27 at 10:51 +0200, Paul J Stevens wrote:
> 
> >>No. It is pretty much required for getting at the base-subject as
> >>required for SORT and THREAD (draft-ietf-imapext-sort-17).
> > 
> > 
> > No it's not. I [have] used:
> > 
> > http://www.jwz.org/doc/threading.html
> > 
> > to implement threading (same as in imapext-sort) without using regex at
> > all.
> 
> Thanks for the URL. Nice reading. Jamie rulez. Still, anything can be done
> without regex. And parsing base-subjects is no rocket science. But I like
> regex :-) and wanted to try my hand at using them in c-code.

The reason I highly recommend avoiding regex is the amount of memory (and
time) needed. Subject lines can be really (really!) long, and excessive
backtracking can mean exponential running time, plus stack usage that
grows with the input in recursive matchers.

Quoting regular expressions is _hard_ - IIRC, perl does this with
quotemeta (the \Q ... \E escapes inside a pattern), which puts a
backslash in front of every non-word character.

There are other problems that inevitably crop up. Better to avoid the
temptation and write state-machine parsers that use fixed memory...
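
To make the fixed-memory point concrete, here is a minimal sketch of
skipping reply/forward markers at the front of a subject without regex.
It is only a simplified subset of the base-subject rules in the sort
draft (it ignores [blob] prefixes, trailing "(fwd)" and "[fwd: ...]"
wrappers), and the function name skip_reply_prefixes is made up for
illustration, not taken from any existing code.

```c
#include <stddef.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Return a pointer into `s` just past any leading "Re:", "Fw:" or "Fwd:"
 * markers and the whitespace around them. Single forward scan, no
 * allocation, no backtracking. Hypothetical helper, simplified rules. */
static const char *skip_reply_prefixes(const char *s)
{
    for (;;) {
        const char *p;
        size_t n;

        /* skip whitespace before a possible marker */
        while (*s == ' ' || *s == '\t')
            s++;

        if (strncasecmp(s, "re", 2) == 0)
            n = 2;
        else if (strncasecmp(s, "fwd", 3) == 0)
            n = 3;
        else if (strncasecmp(s, "fw", 2) == 0)
            n = 2;
        else
            return s;           /* no marker: subject text starts here */

        /* optional whitespace, then a mandatory colon */
        p = s + n;
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p != ':')
            return s;           /* e.g. "Regarding ...": not a marker */

        s = p + 1;              /* consume the marker and look again */
    }
}
```

The work is linear in the length of the subject and the only state is a
couple of pointers, no matter how long the line is.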


> >>>I'd say that PCRE is the way to go.
> >>
> >>I spent all day yesterday trying to build a rather complicated pattern
> >>with regex. No good. With pcre, it's a breeze ... once you get the hang
> >>of the api :-)
> > 
> > 
> > If this is the only place we've got regex, then I'll be happy to rewrite
> > it.
> 
> Currently the *only* place that uses (posix) regex is the namespace code in
> imap. I just finished a working implementation of base-subject retrieval using
> pcre, but that's not in svn yet.

Why do we need regex for namespaces? The server defines the namespaces,
and they're all fixed-strings... aren't they?
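
If they are fixed strings, the match reduces to a prefix comparison. A
rough sketch, where the prefixes "#Users/" and "#Public/" and the name
match_namespace are placeholders for illustration, not necessarily what
the imap code uses:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical namespace prefixes; the real ones come from the server
 * configuration. */
static const char *const ns_prefixes[] = { "#Users/", "#Public/" };

/* Return the index of the namespace whose prefix starts `mailbox`,
 * or -1 if none matches (i.e. the default/personal namespace). */
static int match_namespace(const char *mailbox)
{
    size_t i;

    for (i = 0; i < sizeof ns_prefixes / sizeof ns_prefixes[0]; i++) {
        if (strncmp(mailbox, ns_prefixes[i], strlen(ns_prefixes[i])) == 0)
            return (int)i;
    }
    return -1;
}
```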

-- 
Internet Connection High Quality Web Hosting
http://www.internetconnection.net/
