Am Montag, 12. Dezember 2005 16:28 schrieben Sie: > From: Daniel Fischer <[EMAIL PROTECTED]> > > >To: "Branimir Maksimovic" <[EMAIL PROTECTED]> > >CC: Haskell-Cafe@haskell.org > >Subject: Re: [Haskell-cafe] Substring replacements > >Date: Mon, 12 Dec 2005 16:15:46 +0100 > > > >Earlier today: > > > Sorry, but > > > Prelude SearchRep> searchReplace "abaaba" "##" "abababaaba" > > > "abababaaba" > > > > > > I haven't analyzed the algorithm, so I don't know why exactly this > > > >fails. > > > > > I'll take a look sometime soon. > > > >I found the problem (one at least). > >Say the pattern to be replaced begins with 'a' and we have a sufficiently > >long > >match with the pattern starting at the first 'a' in the String. Upon > >encountering the second 'a', while the first pattern still matches, you > >start > >pushing onto the rollback-stack. But that isn't inspected anymore, so if > >the > >actual occurence of the pattern starts at the third (or fourth, n-th) > >occurence of 'a' and that is already pushed onto the rollback, you miss > > it. > > I've corrected this with adjusting rollback position. if rollBack is null > then > search for rollback starts at second character if not starts at same as > searhed > character because I skip what was searched. That's all. > Though I'm not so sure now when I read this. > Still not working:
*New> searchReplace "abababc" "#" "ababababababc" "ababababababc" *New> searchReplace1 "abababc" "#" "ababababababc" "ababababababc" > >So the question is, can we find a cheap test to decide whether to use KMP > >or > >Bulat's version? > > In real world situation your KMP will always be fastest on average. > I like that we are not using C arrays as then we have advantage > of lazyness and save on memory usage. C++ program will be faster > on shorter strings but on this large strings will loose due memory > latency. and with your test, both programs are very fast. > > Greetings, Bane. > On my 256MB RAM AMD Duron 1200 MHz, Bulat's version is consistently about 20% faster than my KMP on your test -- btw, I unboxed the pat array, which gave a bit of extra speed, but not much. And apologies to Sebastian Sylvan, I also included an unboxed version of bord, built from the boxed version, and that sped things further up -- not much, again, but there it is. I wonder about this difference, -10% on one system and +20% on another system, ist that normal? Cheers, Daniel ------------------------------------------------------------------------------------------ Up-To-Date version of KMP: import Data.Array.Unboxed (UArray, listArray, (!)) import qualified Data.Array as A (array, (!), elems) searchReplace :: String -> String -> String -> String searchReplace "" _ str = str searchReplace src@(c:cs) dst str = process 0 str "" where len = {-# scc "len" #-} length src pat :: UArray Int Char pat = {-# scc "pat" #-} listArray (0,len-1) src bord ={-# scc "bord" #-} A.array (0,len) $ (0,-1):(1,0): [(i+1,getBord (pat!i) i + 1) | i <- [1 .. len-1]] getBord s n | m < 0 = m | s == pat!m = m | otherwise = getBord s m where m = bord A.! n bor :: UArray Int Int bor = listArray (0,len) $ A.elems bord getBor s n | m < 0 || s == pat!m = m | otherwise = getBor s m where m = bor!n process n str _ | n >= len = {-# scc "process" #-} dst ++ process 0 str "" process _ "" mat = {-# scc "process" #-} reverse mat process 0 (s:st) _ | s == c = {-# scc "process" #-} process 1 st [s] | otherwise = {-# scc "process" #-} s:process 0 st "" process n str@(s:st) mat | s == pat!n = {-# scc "process" #-} process (n+1) st (s:mat) | otherwise = {-# scc "process" #-} let j = getBor s n (ret,skip) = splitAt j mat in if j < 0 then reverse mat ++ process 0 str "" else reverse skip ++ process (j+1) st (s:ret) _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe