[Haskell-cafe] Re: ANN: weighted-regexp-0.1.0.0

Sjoerd Visscher Mon, 26 Jul 2010 16:04:29 -0700

Hi Sebastian,

I enjoyed this paper very much. Writing papers in the style of a play seems to 
work very well! (although I think you should spice it up more if your want to 
get it on Broadway)


It seems that only shift needs the reg field of the RegW datatype. So you can 
also replace the reg field with a shift field. This makes the regexp parser 
extensible, as there is no longer a dependence on the (closed) datatype Reg:

> data RegW w c = RegW { active :: !Bool,
>                        empty :: !w,
>                        final_ :: !w,
>                        shift :: w -> c -> RegW w c }

For example it is then easy to define the parser that matches nothing, which is 
the identity element of alt:

> noMatch :: RegExp c
> noMatch = RegExp noMatchW
> 
> noMatchW :: Semiring w => RegW w c
> noMatchW = RegW False zero zero $ \_ _ -> noMatchW

But otherwise I do wonder if the parser needs to be extensible. For example 
some XML Schema implementations that are based on finite automata have special 
cases for the xs:all construct, which matches a list of elements, each 
occurring once in any order. But I tried a straightforward implementation and 
it works fine:

> eachOnce :: [RegExp c] -> RegExp c
> eachOnce [] = eps
> eachOnce ps = eachOnce' ps [] where
>   eachOnce' [] _ = noMatch
>   eachOnce' (p:ps) qs = (p `seq_` eachOnce (ps ++ qs)) `alt` eachOnce' ps 
> (p:qs)

*Main> accept (eachOnce (map char ['a'..'z'])) $ reverse ['a'..'z']
True
(0.05 secs, 8706356 bytes)

greetings,
Sjoerd




_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] Re: ANN: weighted-regexp-0.1.0.0

Reply via email to