Hi, Here is Richard Bird's version for reference. I changed it a bit.
data State a = E | S a (State a) (State a) matched (S (_, []) _ _) = True matched _ = False kmpSearch4 :: (Eq a) => [a] -> [a] -> [Int] kmpSearch4 ws txt = snd $ foldl tr (root, []) (zip txt [1..]) where root = build E ([], ws) build fails (xs, []) = S (xs, []) fails E build fails s@(xs, (y:ys)) = S s fails succs where succs = build' (fst (tr (fails, []) (y, 0))) (xs++[y], ys) tr (E, ns) _ = (root, ns) tr ((S (xs, ys) fails succs), ns) (x, n) | [x] `isPrefixOf` ys = if matched succs then (succs, ns++[n]) else (succs, ns) | otherwise = tr (fails, ns) (x, n) In the program, tr is the transfer function applied to the state tree. And build function is used to build the automaton. Best regards. -- LIU On Mar 3, 5:25 pm, "larry.liuxinyu" <liuxiny...@gmail.com> wrote: > Hi, > > I read about some KMP implementation in Haskell including: > > [1] Richard Bird. ``Pearls of Functional algorithm design'' > [2]http://twan.home.fmf.nl/blog/haskell/Knuth-Morris-Pratt-in-Haskell.de... > [3]http://www.haskell.org/haskellwiki/Runtime_compilation > [4] LazyString version > > [1] builds a infinite lazy state transfer trees, while [3] uses index > to build overlap table. > > I created a version which isn't as efficient as in [1]. Just for fun: > > failure :: (Eq a)=> ([a], [a]) -> ([a], [a]) > failure ([], ys) = ([], ys) > failure (xs, ys) = fallback (init xs) (last xs:ys) where > fallback as bs | as `isSuffixOf` xs = (as, bs) > | otherwise = fallback (init as) (last as:bs) > > kmpSearch2 :: (Eq a) => [a] -> [a] ->[Int] > kmpSearch2 ws txt = snd $ foldl f (([], ws), []) (zip txt [1..]) where > f (p@(xs, (y:ys)), ns) (x, n) | x == y = if ys==[] then ((xs++[y], > ys), ns++[n]) > else ((xs++[y], ys), ns) > | xs == [] = (p, ns) > | otherwise = f (failure p, ns) (x, > n) > f (p, ns) e = f (failure p, ns) e > > The function failure just follows the idea that in case (xs, ys) fails > matching some letter c in text, > where xs++ys = pattern and c!= head ys, it means we must fallback to > (xs', ys') so that > xs' = longest { s: s is prefix of xs AND s is suffix of xs } > > The bad thing is that failure can't memorize what it has compute > before, for example, as pattern = "ababc" > and we fails at ("abab", "c"), then we call function failure to get > the new one as ("ab", "abc"). > After several matches, we fails again at ("abab", "c"), failure can't > just return ("ab", "abc") what it has > been compute already. It has too do the same work again. > > Function f inside kmpSearch2 is in fact a state-transfer function. If > we try to use some data structure (for example tree) to memorize the > results which failure function calculated, we can finally reach to the > idea in [1]. > > -- > LIUhttp://sites.google.com/site/algoxy/ > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-C...@haskell.orghttp://www.haskell.org/mailman/listinfo/haskell-cafe _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe