Kamil Dworakowski wrote:
On Jun 22, 10:03 am, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
Hey, you're using String I/O!

nWORDS <- fmap (train . map B.pack . words) (readFile "big.txt")

This should be

WORDS <- fmap (train . B.words) (B.readFile "big.txt")

By the way, which exact file do you use as a misspellings file? The
corpus linked to at Norvig's page has many.
And do you have a driver program that I could run and obtain your timings?

Yep, Don pointed that out and I have changed the program accordingly.
It didn't make any difference though. The time spent on building the
dictionary is a small portion of the overall run time.

Please see the repository contents for the current version of the
program:
http://patch-tag.com/r/spellcorrect/snapshot/current/content/pretty

The eval-bytestring.hs there is the program I used for timing. Inside
of it you will find the name of the misspellings file needed.

Thanks all for the suggestions. I'll try them when I get home tonight.


Another suggestion, is that you should try to make sure that the lists constructed by getCommonSpellingMistakesWithCorrections get optimized away. As written I'm not sure there will be sufficient inlining to ensure that. If you want to be explicit about removing them, something like the following should help:

> module Main where
> import Prelude hiding (words)
> import SpellingCorrection
> import qualified Data.ByteString.Char8 as B
> import Data.Char
> import Data.IORef
> import Control.Monad (forM_)
>
> main = do
>     corrector       <- getCorrector
>     misspell_corpus <- B.readFile "FAWTHROP1DAT.643"
>     n               <- newIORef (0::Int)
>     wrong           <- newIORef (0::Int)
>     forM_ (B.lines misspell_corpus) $ \line -> do
>         modifyIORef' n (1+)
>         let [ms,c] = map (B.map toLower) . B.words $ line
>         if corrector ms /= c
>             then modifyIORef' wrong (1+)
>             else return ()
>     accuracy <- do
>         n' <- readIORef n
>         w' <- readIORef wrong
>         return $! 100 * fromIntegral w' / fromIntegral n'
>     putStrLn $ "accuracy" ++ show (100 - accuracy) ++ "%"
>
> modifyIORef'    :: IORef a -> (a -> a) -> IO ()
> modifyIORef' r f = readIORef r >>= (writeIORef r $!) . f
> {-# INLINE modifyIORef' #-}

--
Live well,
~wren
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to