Bulat Ziganshin wrote:
Hello Chris,

Thursday, July 13, 2006, 12:17:30 PM, you wrote:

Question 2: Is there interest in getting this into an official release of the
base libraries?  The Compat module could at least replace or sit alongside the
performance sink of the current Text.Regex code.
I 120% want to see ByteString, regular-expression matching for
String and ByteString, and JRegex (the =~ operator implementation)
included in GHC 6.6.

That typeclass interface is very handy, BUT it expects the thing being matched
against to be a list of something.  This prevents making ByteString an instance of
RegexLike.

The answer will be to alter the type class to not make such an assumption.
Luckily John Meacham put JRegex under the 3-clause BSD license, so I will
   * Make a modified version of the type classes
   * Make Text.Regex.Lazy an instance of these type classes
   * Port JRegex to be instances of these type classes (links to PCRE!)
Then I or someone else can
   * Implement an efficient instance for ByteString, handled by PCRE.

Regexp support for ByteStrings already exists:

========================================================================
BTW, what would be really useful now, IMHO, is an interface to
Text.Regex. How about working on it as the next stage?

This is already done actually, here:
    http://www.cse.unsw.edu.au/~dons/code/lambdabot/Lib/Regex.hsc
    http://www.cse.unsw.edu.au/~dons/code/hmp3/Regex.hsc
========================================================================

Thanks, I'll go take a look at that. I have pcre + JRegex installed now, and I have a remote darcs repository with my current version imported. (URL coming once I am sure it won't get reorganized.)


Well, I'm just a dumb user saying what I want to see in GHC 6.6:

* regexp matching for Strings and ByteStrings
* perl-like syntax for doing it
* the ability to select the regexp engine for each matching operation, with
the most efficient one (Lazy for String, posix or pcre (?) for
ByteString) used by default

I also know that Simon Marlow wants to see a JRegex(-like) engine
included in 6.6 (see http://hackage.haskell.org/trac/ghc/ticket/710 )

what you mentioned is just implementation details for me, the dumb user :)

From a user's point of view, the JRegex API can also support only a single Regex type and a single backend. But it would be really handy to be able to use different types of regular expressions. Mainly, there are going to be several possible regex syntaxes:

  * Old Text.Regex syntax, also emulated by Text.Regex.Lazy.Compat
  * The "Full" syntax of Text.Regex.Lazy (close to Extended regex)
  * regex.h syntax (perhaps Basic as well as Extended)
  * pcre.h syntax

All of these might conceivably come in [Word8] and [Char] sources.

The backend will vary too, if only because we will want both a Lazy version and a version that hands off to the pcre library (if installed) or to the regex library (more likely to be installed).

And the plan is to generalize the target to be either [Char] or ByteString.
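
One possible shape for that generalization, as a minimal sketch only: a class that lets a matcher pull one Char at a time from its target, so the matcher no longer assumes a list. The names Source, next and startsWith are hypothetical and do not come from JRegex or Text.Regex.Lazy; B.uncons is the existing function from Data.ByteString.Char8.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import qualified Data.ByteString.Char8 as B

-- Hypothetical class: any target that can hand out one Char at a time.
class Source s where
  next :: s -> Maybe (Char, s)

-- The old case: a plain list of Char.
instance Source [Char] where
  next []     = Nothing
  next (c:cs) = Just (c, cs)

-- The new case: a strict ByteString, read as 8-bit characters.
instance Source B.ByteString where
  next = B.uncons

-- A matcher written only against the class, not against lists:
-- does the target begin with the given literal text?
startsWith :: Source s => String -> s -> Bool
startsWith []     _ = True
startsWith (p:ps) s = case next s of
  Just (c, rest) | c == p -> startsWith ps rest
  _                       -> False
```

With this, startsWith "He" "Hello" and startsWith "He" (B.pack "Hello") go through the same code path; a real backend would more likely hand the whole ByteString to the C library than walk it one character at a time.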

New Question: What do people think is the best way to use data/newtype/class to allow for
  1) Different regex syntax as different types
  2) Different target [Char] or ByteString
  3) Different engine in the back end.

My first thought is that the type of the regex encodes both which syntax is in use and which back-end will be used. Something like

 "Hello" =~ (pcre "el+")

would use PCRE syntax and pcre library backend against the [Char]. And

 (pack "Hello") =~ (compatRE "el+")

would use the old Text.Regex syntax and my lazy backend against the ByteString produced by pack.
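
To make the idea concrete, here is a rough sketch under loud assumptions. The two-parameter class below is hypothetical (it reuses the name RegexLike but is not JRegex's definition), PcreRE and CompatRE are made-up newtypes, and toyMatch is a tiny stand-in matcher (literals, '.', postfix '*' and '+') used in place of the real pcre and lazy backends so that the example runs on its own.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import qualified Data.ByteString.Char8 as B
import Data.List (tails)

-- Hypothetical regex types: each one fixes both a syntax and a backend.
newtype PcreRE   = PcreRE String    -- PCRE syntax, pcre library backend
newtype CompatRE = CompatRE String  -- old Text.Regex syntax, lazy backend

pcre :: String -> PcreRE
pcre = PcreRE

compatRE :: String -> CompatRE
compatRE = CompatRE

-- Hypothetical class: the first parameter picks syntax and engine,
-- the second picks the target type.
class RegexLike regex source where
  matchTest :: regex -> source -> Bool

infix 4 =~
(=~) :: RegexLike regex source => source -> regex -> Bool
source =~ regex = matchTest regex source

-- Every instance here delegates to the same toy matcher; real instances
-- would call their own engine.
instance RegexLike PcreRE [Char] where
  matchTest (PcreRE p) = toyMatch p
instance RegexLike PcreRE B.ByteString where
  matchTest (PcreRE p) = toyMatch p . B.unpack
instance RegexLike CompatRE [Char] where
  matchTest (CompatRE p) = toyMatch p
instance RegexLike CompatRE B.ByteString where
  matchTest (CompatRE p) = toyMatch p . B.unpack

-- Stand-in engine: unanchored search, trying every suffix of the input.
toyMatch :: String -> String -> Bool
toyMatch pat s = any (matchHere pat) (tails s)

matchHere :: String -> String -> Bool
matchHere [] _ = True
matchHere (c:'*':rest) s = matchStar c rest s
matchHere (c:'+':rest) (x:xs) | matchChar c x = matchStar c rest xs
matchHere (_:'+':_) _ = False
matchHere (c:rest) (x:xs) | matchChar c x = matchHere rest xs
matchHere _ _ = False

-- Zero or more repetitions of c, then the rest of the pattern.
matchStar :: Char -> String -> String -> Bool
matchStar c rest s = matchHere rest s || case s of
  (x:xs) | matchChar c x -> matchStar c rest xs
  _                      -> False

matchChar :: Char -> Char -> Bool
matchChar '.' _ = True
matchChar c   x = c == x
```

Both expressions above then typecheck as written, and adding a new syntax or engine means adding a newtype plus its instances, without touching existing ones.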

Other answers?

--
Chris
_______________________________________________
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell
