From: Daniel Fischer <[EMAIL PROTECTED]>
To: "Branimir Maksimovic" <[EMAIL PROTECTED]>
CC: Haskell-Cafe@haskell.org
Subject: Re: [Haskell-cafe] Substring replacements
Date: Tue, 13 Dec 2005 11:23:29 +0100


After seeing that your program is fastest (I've also tried one from
http://haskell.org/hawiki/RunTimeCompilation but perhaps I'm not
that good in converting to search replace?) I've decided to
try with Rabin-Karp algorithm.
This algorithm performs same operation as straightforward search,
but compares hashes instead of chars.
With ability to rotate hash (remove first, add next) characters
there is also optimisation, that hash is calculated only for single
next character rather again for whole substring.
Unfortunatelly on my machine it is very cheap to compare
characters so with my test hashing overweights character compare,
except in your test when hash searching is faster then straightforward
search.

This is best I can write in terms of performance and readability.
I've tried with getFst that returns Maybe but it was slower so I decided
to return '\0' in case that argument is empty list, which renders '\0'
unusable, but then I really doubt that 0 will be used in strings.

-- Rabin-Karp string search algorithm, it is very effective in searching of set
-- of patterns of length n on same string
-- this program is for single pattern search, but can be crafted
-- for multiple patterns of length m

hSearchReplace :: String -> String -> String -> String
hSearchReplace sr rp xs
   | not (null remaining) = found ++ rp
++ hSearchReplace sr rp (drop (length sr) remaining)
   | otherwise = found
   where
   (found,remaining) = hSearch sr xs

hSearch :: String -> String -> (String,String)
hSearch sr xs = hSearch' sr xs hcmp ""
   where
       hsrch = hash sr
       hcmp = hash $ take ls xs
       cmp = take ls xs
       ls = length sr

       hSearch' [] xs _ _= (xs,[])
       hSearch' sr [] _ fndFail = (reverse fndFail,[])
       hSearch' srch xxs@(x:xs) hcmps fndFail
           = if hsrch == hcmps
                then if isPrefixOf srch xxs
                        then (reverse fndFail,xxs)
                        else searchAgain
                else searchAgain
           where
           searchAgain
            = hSearch' srch xs
(hashRotate (getFst xxs) (getFst nextxxs) (ls-1) hcmps)
                       (x:fndFail)
           nextxxs = drop ls xxs

getFst :: String -> Char
getFst [] = '\0'
getFst (a:as) = a

hash :: String -> Int
hash str
   =  hash' str (length str - 1)
   where
   hash' :: String -> Int -> Int
   hash' [] _ = 0
   hash' (s:str) pow  = (101 ^ pow) *(fromEnum s)
                        + hash' str (pow-1)

hashRotate :: Char -> Char -> Int -> Int -> Int
hashRotate cout cin pow hsh
   = (hsh - ((101 ^ pow) * (fromEnum cout)))*101
     + (fromEnum cin)



Greetings, Bane.

_________________________________________________________________
Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to