Re: [Haskell-cafe] Code Review: Sudoku solver

Daniel Fischer Fri, 07 Apr 2006 16:06:17 -0700

Am Freitag, 7. April 2006 01:50 schrieben Sie:
> On Apr 6, 2006, at 6:05 PM, Daniel Fischer wrote:
> > I've also written a version using David F. Place's EnumSet instead
> > of [Int],
> > that takes less MUT time, but more GC time, so is slower on the
> > 36,628 test,
> > but faster for a single puzzle.
>
> That's a curious result.  Did you compile with optimization?  It


I considered that curious, too.
Everything was compiled with -O2 (does -O3 give better results?, would adding 
-optc-On improve performance? I'll try).
What makes it even weirder is that the EnumSet-version does indeed allocate 
fewer bytes and performs fewer garbage collections (small difference, 
though).
But I got consistent results, when running on a large number of puzzles, the 
list-version's faster gc'ing led to shorter overall run times.
The same held when compiled with -O3 -optc-O3, however, I've been stupid,
my excuse is that I was ill this week, the list version spent 46.5% gc'ing and 
the set version 53.5%, which is not really cricket, so today I used -AxM,
x <- [10,16,32,64], all reducing GC time to reasonable 0.5 - 2%. That, plus a 
few polishings, reduced the running time to about 16 minutes for EnumSet, a 
little more for lists. 

But, lo and behold, I  also tried how plai Array fared in comparison to 
DiffArray and ... reduced the running time to under ten minutes (a little 
above for the list version), 5% GC time without -AxM, 1.2% with -A8M.

And I thought, DiffArrays were supposed to be fast!

> should compile into primitive bit-twiddling operations and do no
> allocating at all.  I'd be curious to see how fast my solver works on

Well, since I've wrapped the sets in the Entry datatype, I wouldn't expect 
that, switching from Poss (singleton k) to Def k costs.
I tried making Board a DiffArray (Int,Int) (Set Int), but then I had the 
problem that either I lost the information gained by placing & forbidding for 
those positions where the number of possibilities dropped to one by 
inference, or had to scan the grid and re-place every now and then, both 
resulting in poor performance.

> the 36,628 test.  I'm afraid to run my ancient limping powerbook in
> such a tight loop for that long.  It gets too hot!
>
> If you'd find it amusing to give it a whirl, I'd love to know the
> result.

I ran your incrsud on the first fifteen 17-hint puzzles, took over 20s, so I 
decided against the full 36,628 test. Extrapolation makes me believe it'd 
take thirteen to fourteen hours.
The really big thing is to include the "if there is only one place to put a 
number in a row/column/cell, then put it there" inference step. Further 
inference has smaller impact (but the group-inference bought me a 20% 
speedup, which isn't bad, is it?).
But using Array instead of DiffArray gave almost 40%, that's impressive.

Attached is the fastest version I have, oddly, compiled with -O2 it's faster 
than with -O3 -optc-O3 (on my computer), how come?

setUni +RTS -A8M -sstderr
True
99,859,933,904 bytes allocated in the heap
104,713,900 bytes copied during GC
    150,260 bytes maximum residency (72 sample(s))

      11558 collections in generation 0 (  6.83s)
         72 collections in generation 1 (  0.16s)

         13 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time  554.68s  (568.29s elapsed)
  GC    time    6.99s  (  7.22s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time  561.67s  (575.51s elapsed)

  %GC time       1.2%  (1.3% elapsed)

  Alloc rate    180,031,610 bytes per MUT second

  Productivity  98.8% of total user, 96.4% of total elapsed

an average of 0.015s per 17-hint puzzle, cool!

Cheers,
Daniel

>
> --------------------------------
> David F. Place
> mailto:[EMAIL PROTECTED]

-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
        -- Blair P. Houghton

module SudokuSet where

import Data.Array
import Data.List (unfoldr, intersperse, sortBy, nub)
import Data.Char (isDigit, digitToInt)
import System.IO
import qualified EnumSet as Set (null)
import EnumSet (Set, (\\), size, member, empty,
                delete, unions, intersection, toList,
                fromList, singleton, findMin, fold)

type Position = (Int, Int)

pool :: Entry
pool = Poss $ fromList [1 .. 9]

ixes :: [Int]
ixes = [0 .. 8]

data Entry
    = Def !Int         -- ^ definite entry
    | Poss !(Set Int)  -- ^ set of possibilities at a position
      deriving Eq

-- Show instance, the Poss case comes from the original design,
-- where a sudoku puzzle was solved step by step
instance Show Entry where
    show (Def k)  = show k
    show (Poss s) = case size s of
                      0 -> "#"
                      1 -> "*"
                      2 -> ":"
                      _ -> " "

-- was a DiffArray, however this is MUCH faster
type Board = Array Position Entry

emptyBoard :: Board
emptyBoard = listArray ((0,0),(8,8)) $ replicate 81 pool

-- get the entry in (cellNo, cellInd) form
(?) :: Board -> Position -> Entry
b ? p = b ! toGridPos p

-- a puzzle is unsolvable if at any position, we can't place a number
failed :: Board -> Bool
failed = any isImp . elems

-- the puzzle is solved, if we have entered a number at all positions,
-- the constraints are guarded by the place & forbid connection
solved :: Board -> Bool
solved = all isDef . elems

-- has this board exactly one solution?
unique :: Board -> Bool
unique b = case solve b of
             [_] -> True
             _   -> False

----------------------------------------------------------------------
--                         Solving a Puzzle                         --
----------------------------------------------------------------------

-- place a number at a position, if that's possible
--
-- if there is already a number, either it is the one we want to
-- place, in which case we do nothing, or it is another one, in
-- which case we make our board failed (actually the latter can't
-- occur, I think)
--
-- if the number is still allowed at the position, we put it there
-- and forbid it at the other positions in the same row,
-- column and cell
place :: Board -> Position -> Int -> Board
place b p@(r,c) k
    = case b!p of
        Def n  -> if k == n then b
                     else b // [(p, Poss empty)]
        Poss s -> if k `member` s then b'
                     else b // [(p, Poss empty)]
            where
              b1 = b // [(p,Def k)]
              ri = [(r,i) | i <- ixes, i /= c]
              ci = [(i,c) | i <- ixes, i /= r]
              cn = cellNo p
              good (x,y) = x /= r && y /= c
              bi = filter good [toGridPos (cn,i) | i <- ixes]
              b' = foldl (forbid k) b1 $ ri ++ ci ++ bi

-- restrict the possibilities at a position to the members of a set,
-- if that's a singleton, we 'place' its only element
-- the Def case is never used, only present for completeness
restrict :: Board -> (Position, Set Int) -> Board
restrict b (p,s)
    | size s == 1 = place b p $ findMin s
    | otherwise   = case b!p of
                      Poss t -> b // [(p, Poss $ intersection s t)]
                      Def k  -> if k `member` s then b
                                  else b // [(p, Poss empty)]

-- forbid a number at a position, only if it is one of the
-- possibilities need we do anything (we never try to forbid
-- a number previously placed)
forbid :: Int -> Board -> Position -> Board
forbid k b p
    = case b!p of
        Poss s | k `member` s -> b // [(p, Poss $ delete k s)]
        _      -> b

-- list of determined (Position, Int) pairs
certs :: Board -> [(Position, Int)]
certs b = [(p,findMin s) | (p, Poss s) <- assocs b, size s == 1]

-- define (i.e. 'place') all determined entries, if that leads
-- to further determined entries, recurse; and tell us whether
-- we have defined anything
defCerts :: Board -> (Bool, Board)
defCerts b
    = case certs b of
        []  -> (False, b)
        cs  -> let b1 = foldl (uncurry . place) b cs
                   (_, b2) = defCerts b1
               in (True, b2)

-- find the possible positions of number 'k' in cell 'cn',
-- if there is only one, place it there, if all are in one
-- row (column), we 'forbid' 'k' on that row (column) in
-- the other cells
posCell :: Int -> Board -> Int -> Board
posCell cn b k
    = let ps = [(i,mkSt e) | i <- ixes, let e = b?(cn,i), not (isDef e)]
          fp = findPos k ps
          rs = nub [i `quot` 3 | i <- fp]
          cs = nub [i `rem` 3 | i <- fp]
          rc = filter (/= cn) [(cn `quot` 3) * 3 + i | i <- [0 .. 2]]
          cc = filter (/= cn) [3 * i + (cn `rem` 3) | i <- [0 .. 2]]
          ls = case (rs,cs) of
                ([r], _)  -> [(c',3*r+i) | c' <- rc, i <- [0 .. 2]]
                (_, [c])  -> [(c',3*i+c) | c' <- cc, i <- [0 .. 2]]
                _         -> []
      in case fp of
           [i] -> place b (toGridPos (cn,i)) k
           _   -> foldl (forbid k) b $ map toGridPos ls

-- do 'posCell' for all numbers not yet placed in cell no 'cn'
treatCell :: Int -> Board -> Board
treatCell cn b
    = fold (posCell cn) b st
      where
        st = unions [mkSt e | i <- ixes, let e = b?(cn,i), not (isDef e)]

-- do the above for all cells
treatCells :: Board -> Board
treatCells b = foldr treatCell b [0 .. 8]

-- find the possible positions of 'k' in row 'r', if there's
-- only one, place it, if all are in one cell, forbid 'k' in
-- the other rows of that cell
posRow :: Int -> Board -> Int -> Board
posRow r b k
    = let ps = [(i,mkSt e) | i <- ixes, let e = b!(r,i), not (isDef e)]
          fp = findPos k ps
          cs = nub [i `quot` 3 | i <- fp]
      in case (fp,cs) of
            ([c],_) -> place b (r,c) k
            (_,[c]) -> let ls = [(r',c') | i <- [0 .. 2]
                                , let r' = 3*(r `quot` 3) + i, r' /= r
                                , j <- [0 .. 2], let c' = 3*c + j]
                       in foldl (forbid k) b ls
            _       -> b

-- do 'posRow' for all numbers not yet placed in row 'r'
treatRow :: Int -> Board -> Board
treatRow r b
    = fold (posRow r) b st
      where
        st = unions [mkSt s | i <- ixes, let s = b!(r,i), not (isDef s)]

-- do the above for all rows
treatRows :: Board -> Board
treatRows b = foldr treatRow b [0 .. 8]

-- find the possible positions of 'k' in column 'c', if there's
-- only one, place it, if all are in one cell, forbid 'k' in
-- the other columns of that cell
posCol :: Int -> Board -> Int -> Board
posCol c b k
    = let ps = [(i,mkSt e) | i <- ixes, let e = b!(i,c), not (isDef e)]
          fp = findPos k ps
          rs = nub [i `quot` 3 | i <- fp]
      in case (fp,rs) of
            ([r],_) -> place b (r,c) k
            (_,[r]) -> let ls = [(r',c') | i <- [0 .. 2]
                                , let r' = 3*r + i, j <- [0 .. 2]
                                , let c' = 3*(c `quot` 3) + j, c' /= c]
                       in foldl (forbid k) b ls
            _       -> b

-- do 'posCol' for all numbers not yet placed in column 'c'
treatCol :: Int -> Board -> Board
treatCol c b
    = fold (posCol c) b st
      where
        st = unions [mkSt s | i <- ixes, let s = b!(i,c), not (isDef s)]

-- do the above for all columns
treatCols :: Board -> Board
treatCols b =  foldr treatCol b [0 .. 8]

-- infer until we either have determined (but not yet placed)
-- some entries or don't make any progress (e.g. when we've
-- solved our puzzle)
infer :: Board -> Board
infer b
    = let b1 = treatCells b
          c1 = certs b1
          b2 = treatRows b1
          c2 = certs b2
          b3 = treatCols b2
          c3 = certs b3
          b4 = treatGroups b3
          c4 = certs b4
      in case (c1,c2,c3,c4) of
           ((_:_),_,_,_) -> b1
           (_,(_:_),_,_) -> b2
           (_,_,(_:_),_) -> b3
           (_,_,_,(_:_)) -> b4
           _             -> if b4 == b then b4 else infer b4

-- find a position with the fewest possible entries and return
-- the list of boards with that position filled accordingly
guess :: Board -> [Board]
guess b
    = let ass = filter (not . isDef . snd) $ assocs b
          cmp (_,Poss s1) (_,Poss s2) = compare (size s1) (size s2)
          sas = sortBy cmp ass
      in case sas of
            [] -> []
            ((p,Poss s):_) -> [place b p n | n <- toList s]

-- create the list of all solutions of a board
--
-- if it's already solved, there obviously is only one solution
-- if the board is failed, there is none
-- otherwise we infer as much as we can and concat all solutions
-- of all guesses
solve :: Board -> [Board]
solve b
    | solved b  = [b]
    | failed b  = []
    | otherwise = let b1 = infer b in
                  case defCerts b1 of
                    (True,b2) -> solve b2
                    (_   ,b2) -> if b == b2 then
                                    do bd <- guess b
                                       solve bd
                                    else solve b2

----------------------------------------------------------------------
--                        Inference on Groups                       --
----------------------------------------------------------------------

-- group inference:
-- from a group of positions (one cell/column/row) we select
-- k as yet unfilled positions and consider the set of all
-- numbers allowed at one of these, if this set has k members,
-- we know that they can't appear anywhere else in that cell/
-- column/row
-- if that set has fewer than k members, our board will fail,
-- so we do it as quickly as possible
--
-- this also treats the dual problem: if we take a set of k
-- numbers and have altogether k possible positions for them,
-- then no other number can appear at any of these positions
inferGroup :: Board -> [Position] -> [[(Position, Set Int)]]
inferGroup b ps
    = let ass = [(p,b!p) | p <- ps]
          sts = [(p,s) | (p, Poss s) <- ass]
          len = length sts
      in do k <- [2 .. len - 2]
            (xs,ys) <- select k sts
            let ws = unions $ map snd xs
            case compare (size ws) k of
              GT -> fail "nothing found"
              LT -> return [(p,empty) | (p,_) <- sts]
              _  -> return $ map (`without` ws) ys

-- apply the restriction inferred by 'inferGroups'
infGr :: Board -> [Position] -> Board
infGr b ps
    = let ups = concat $ inferGroup b ps
      in foldl restrict b ups

-- list of (list of) positions in one cell, row, column
indGroups :: [[Position]]
indGroups = [map toGridPos [(cn,i) | i <- ixes] | cn <- ixes]
            ++ [[(r,c) | c <- ixes] | r <- ixes]
            ++ [[(r,c) | r <- ixes] | c <- ixes]

-- apply the above inference to all cells, rows, columns
treatGroups :: Board -> Board
treatGroups b = foldl infGr b indGroups

----------------------------------------------------------------------
--                  Reading and Displaying a Board                  --
----------------------------------------------------------------------

-- read a board in line-format
readLine :: String -> Board
readLine str
    = let nlns = zip [0 .. 8] $ takeBy 9 str
          tr s = zip [0 .. 8] $ map digitToInt s
          grd  = map (\(n,l) -> map (\(k,m) -> ((n,k),m)) $ tr l) nlns
          plcs = filter ((/= 0) . snd) $ concat grd
      in foldl (uncurry . place) emptyBoard plcs

-- read a board in either line or grid format
readBoard :: String -> Board
readBoard = readLine . take 81 . filter isDigit

-- make a list of lists out of an array
boardToLists :: Board -> [[Entry]]
boardToLists b = [[b!(r,c) | c <- ixes] | r <- ixes]

-- String representation, unashamedly stolen from David F. Place's
-- solver, because it's much prettier than my original display
display :: [[Entry]] -> String
display s = stretch . (intersperse ["---+---+---\n"]) . (stitch knot) . gather $ s
    where
      sewRow = stretch . (intersperse ["|"]) . (stitch show) . gather
      knot r = (sewRow r)++['\n']
      stretch = concat . concat
      stitch f = map (map f)
      gather = takeBy 3

-- show a board in nice format
showBoard :: Board -> String
showBoard = display . boardToLists

-- pretty-print a board
pretty :: Board -> IO ()
pretty = putStr . showBoard

----------------------------------------------------------------------
--                         Helper Functions                         --
----------------------------------------------------------------------

-- calculate the cell number of a position
cellNo :: Position -> Int
cellNo (r,c) = (r `quot` 3) * 3 + c `quot` 3

-- calculate the index of a position in the cell
cellInd :: Position -> Int
cellInd (r,c) =(r `rem` 3) * 3 + c `rem` 3

-- transform from (row, column) to (cell, index) format
toCellPos :: Position -> Position
toCellPos p = (cn,ci)
              where
                cn = cellNo p
                ci = cellInd p

-- transform from (cell, index) to (row, column) format
toGridPos :: Position -> Position
toGridPos = toCellPos

-- also from David F. Places solver
takeBy :: Int -> [a] -> [[a]]
takeBy n xs = unfoldr f xs
    where
      f [] = Nothing
      f xs = Just $ splitAt n xs

isDef :: Entry -> Bool
isDef e = case e of
            Def _ -> True
            _     -> False

isImp :: Entry -> Bool
isImp e = case e of
            Poss s | Set.null s -> True
            _                   -> False

findPos :: Int -> [(Int, Set Int)] -> [Int]
findPos k ps = [y | (y,s) <- ps, k `member` s]

mkSt :: Entry -> Set Int
mkSt e = case e of
            Poss s -> s
            Def k  -> singleton k

-------------------------------------------------------

select :: Int -> [a] -> [([a],[a])]
select 0 xs = [([],xs)]
select _ [] = []
select n (x:xs) = [(x:ys,zs) | (ys,zs) <- select (n-1) xs]
               ++ [(ys,x:zs) | (ys,zs) <- select n xs]

without :: (a,Set Int) -> Set Int -> (a,Set Int)
(x,st) `without` ts = (x, st \\ ts)

_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Code Review: Sudoku solver

Reply via email to