Re: [Haskell-cafe] Matrices in Haskell

Ivan Lazar Miljenovic Sun, 22 Jul 2007 21:25:58 -0700

On Tue, 2007-03-20 at 15:09 +1000, Matthew Brecknell wrote:

> I'm not sure I see the problem, since any operation that touches all the
> elements of a n-by-n matrix will be at least O(n^2). For such an
> operation, a transposition should just add a constant factor.


What I was hoping for was a data structure such that I could directly
access the columns of the matrix, rather than having to apply a function
to get to them.  I didn't think it likely, but then again, a man can
dream... ;)

> 
> When you tried using Arrays, I presume you used an array indexed by a
> pair (i,j), and just reversed the order of the index pair to switch from
> row-wise to column-wise access? It's hard to see how that would slow you
> down. Perhaps the slowdown was caused by excessive array copying?
> Immutable arrays can be efficient for algorithms that write a whole
> array at once, but usually not for algorithms that modify one element at
> a time.
> 
> I think you'll need to show specific functions that are performing below
> expectation before the list will be able to help.
> 

I've attached the code I had written using Arrays.  Note that it wasn't
the final version of my algorithms, etc. as I continued work a bit
further using lists of lists, but the working behind them is just the
same (just some extra optimizations, etc. are missing).

Here's the way I was pulling out the rows and columns from the array:


type Matrix a = IArray MatCoord a
type MatCoord = (Int,Int)
type Row a = Array Int a

getRows :: Matrix a -> [Row a]
getRows m = [newRow sr [(c,v) | ((r,c),v) <- vals, r == r'] |
                r' <- values sr]
            where
            sr = fst . snd . bounds $ m
            vals = assocs m

unRows :: [Row a] -> Matrix a
unRows = liftM2 newMatrix length addRowValues
         where
         addRowValue r = map (\ (c,v) -> ((r,c),v)) . assocs
         addRowValues = concat . zipWith addRowValue [1..]


getCols :: Matrix a -> [Row a]
getCols m = [newRow sc [(r,v) | ((r,c),v) <- vals, c == c'] |
                c' <- values sc]
            where
            sc = snd . snd . bounds $ m
            vals = assocs m

unCols :: [Row a] -> Matrix a
unCols = liftM2 newMatrix length addColValues
         where
         addColValue c = map (\ (r,v) -> ((r,c),v)) . assocs
         addColValues = concat . zipWith addColValue [1..]

And here's an example where I used these functions.  What it's doing is
storing a list of all possible values for each cell in the matrix, then
removing double ups.

type Choices = [[Int]]

prune   :: Matrix Choices -> Matrix Choices
prune =  pruneBy cols . pruneBy rows
         where 
         pruneBy (f,f') = f' . map reduce . f

reduce     :: Row Choices -> Row Choices
reduce r =  array size (map reducer vals)
            where
            size = bounds r
            vals = assocs r
            reducer (c,vs) = (c, vs `minus` singles)
            singles = getSingles r

minus   :: Choices -> Choices -> Choices
xs `minus` ys   =  if single xs then xs else xs \\ ys

> For problems like latin squares and sudoku, also try thinking "outside
> the matrix". Do you really need to structure the problem in terms of
> rows and columns? What about a set of mutually-intersecting sets of
> cells?

I based my code upon the paper by Richard Bird and later implemented by
Graham Hutton.  I'm going to be re-writing it a bit, in terms of first
generating the "shapes" that the Partial Latin Squares can take, then
trying to fill in the numbers.  I have considered using a graph-theory
approach, but I have no idea where to even start to code it.
Fundamentally, I'm trying to generate all Partial Latin Squares that fit
a given criteria, so I need the rows/columns to ensure that I have no
more than one instance of each value in each row or column.

-- 
Ivan Lazar Miljenovic

Latin Squares Solver and Generator
==================================

Declaring this as a module
--------------------------

> module LatinSquares  where

Importing Modules
-----------------

> import Data.List
> import Data.Maybe()
> import Control.Monad.List
> import Data.Array.IArray

Defining Types
--------------

There are two main storage structures used: Sets and Matrices.
Whilst both of these utilise underlying Haskell lists, this distinction
is used to separate the different uses.  Sets are used to store all possible
or given values in no particular order, whilst matrices are used to store 
elements in some order in a 2-dimensional fashion.

A matrix is defined as a list of rows, where each row is a list
of values.  Note that, due to this definition, matrices are not 
inherently defined as having the same number of items in each 
row and column: this is up to the function that creates the matrix!

> type Matrix a = IArray MatCoord a
>
> type MatCoord = (Int,Int)
>
> type Row a = Array Int a

A set is merely a collection of items of the same type

> type Set a = [a]

A Latin Square is defined as a matrix of values, with Value as a wrapper
around Int.  Choices is a set of values used to indicate which values
can be stored in a particular position in the Latin Square. SolutionSet is used
to signify the results from the solve function, and should only be used
by those functions that require this output (otherwise, use "Set LatinSquare"
directly).

> newtype LatinSquare = LS (Matrix Value)
>     deriving (Eq, Ord)
>
> type SolutionSet = Set LatinSquare
>
> type Value = Int
>
> type Choices = Set Value

We require that LatinSquare is in the Eq and Show classes:

> instance Show LatinSquare where
>     show (LS m) = matToString printFunc m
>         where
>         printFunc v = if (isUnset v) then unsetString else (show v)
>         unsetString = "."

Basic Definitions
-----------------

A list of all the allowed values

> values :: Int -> [Int]
> values = enumFromTo 1
 
Indicates an undefined value

> unsetValue :: Value
> unsetValue = 0

Checks if the given value is set or unset

> isUnset :: Value -> Bool
> isUnset = (== unsetValue)
>
> isSet :: Value -> Bool
> isSet = (/= unsetValue)

Checks if there is only one element

> single     :: Set a -> Bool
> single [_] = True
> single  _  = False

Checks if there are any elements

> empty :: Set a -> Bool
> empty = null

Find the first element in a given Set that fulfills the criteria,
or else the last element if none of them fulfill the criteria
(note: the set cannot be non-empty).

> dropUntil          :: (a -> Bool) -> Set a -> a
> dropUntil _ []     = error "dropUntil requires a non-empty list!!!"
> dropUntil p xs     = if (not (empty filtered)) then (head filtered) else (last xs)
>                      where
>                      filtered = filter p $ xs


Matrix-related Functions
------------------------

To create a square, array-based matrix

> newMatrix   :: Int -> [(MatCoord,a)] -> Matrix a
> newMatrix n = array ((1,1),(n,n))

To create a row/column

> newRow   :: Int -> [(Int,a)] -> Row a
> newRow n = array (1,n)

To create a new matrix filled with the given value:

> blankMatrix     :: a -> Int -> Matrix a
> blankMatrix v n = newMatrix n [((r,c),v) | r <- values n, c <- values n]

To get the values only, without worrying about row or columns (though due
to the storage mechanism of arrays, it should come in terms of row by row).
This should, however, also get the values out of a Row.

> getVals :: (Ix i ) => Array i a -> Set a
> getVals = map snd . assocs

To unset each set value in turn.

> getSubSquares        :: LatinSquare -> Set LatinSquare
> getSubSquares (LS m) = [LS (m // [((r,c),unsetValue)]) 
>                         | ((r,c),v) <- vals, isSet v]
>                        where
>                        vals = assocs m

To get the rows from a Matrix, we need to pull out all coordinates from
the matrix that are in a given row.

> getRows :: Matrix a -> Set (Row a)
> getRows m = [newRow sr [(c,v) | ((r,c),v) <- vals, r == r'] | r' <- values sr]
>             where
>             sr = fst . snd . bounds $ m
>             vals = assocs m

To turn a set of rows back into a matrix, assumed to be square:

> unRows :: Set (Row a) -> Matrix a
> unRows = liftM2 newMatrix length addRowValues
>          where
>          addRowValue r = map (\ (c,v) -> ((r,c),v)) . assocs
>          addRowValues = concat . zipWith addRowValue [1..]


Similarly for columns:

> getCols :: Matrix a -> Set (Row a)
> getCols m = [newRow sc [(r,v) | ((r,c),v) <- vals, c == c'] | c' <- values sc]
>             where
>             sc = snd . snd . bounds $ m
>             vals = assocs m
>
> unCols :: Set (Row a) -> Matrix a
> unCols = liftM2 newMatrix length addColValues
>          where
>          addColValue c = map (\ (r,v) -> ((r,c),v)) . assocs
>          addColValues = concat . zipWith addColValue [1..]

We thus have getRows . unRows = id = getCols . unCols.  To simplify calling
these, we shall bundle them up:

> rows, cols :: (Matrix a -> Set (Row a), Set (Row a) -> Matrix a)
> rows = (getCols, unCols)
> cols = (getCols, unCols)

Creating a new Latin Square
---------------------------

Creates a new Latin Square of the given size filled with unsetValue

> blank :: Int -> Matrix Value
> blank = blankMatrix unsetValue

>
> createLatinSquare   :: Matrix Value -> LatinSquare
> createLatinSquare m = LS m
>
> blankLatinSquare :: Int -> LatinSquare
> blankLatinSquare = createLatinSquare . blank

To recover the matrix from the Latin Square:

> getMatrix        :: LatinSquare -> Matrix Value
> getMatrix (LS m) = m

Validity Checking
-----------------

A Latin Square is "complete" (i.e. finished) if there is only one 
possible choice available for each position

> complete :: Matrix Choices -> Bool
> complete =  (all single) . getVals

Similarly, a Latin Square is "void" if some square contains no choices:

> void :: Matrix Choices -> Bool
> void =  (any empty) . getVals

A Latin Square is "safe" when all rows and columns have no duplicates
of the same single choice.

> safe   :: Matrix Choices -> Bool
> safe m =  all consistent (getRows m) 
>           && all consistent (getCols m)
>
> consistent :: Row Choices -> Bool
> consistent =  nodups . getSingles
>
> getSingles :: (Ix i) => Array i Choices -> Set Value
> getSingles = getSinglesSet . getVals
>              where
>              getSinglesSet r = [ v | [v] <- r, isSet v]
>
> nodups        :: Choices -> Bool
> nodups []     = True
> nodups (x:xs) = notElem x xs && nodups xs

Finally, a Latin Square is "blocked" if it is void or unsafe:

> blocked   :: Matrix Choices -> Bool
> blocked m = void m || not (safe m)

To determine how many elements in a Latin Square are set or unset:

> numSetRow :: Row Value -> Int
> numSetRow = length . filter isSet . getVals

Generating all Possibilities
----------------------------

Constructing a Matrix where each position has a list of all
possible values that might go there

> choices   :: Int -> LatinSquare -> Matrix Choices
> choices n =  prune . newMatrix n . map chooser . assocs . getMatrix
>              where
>              chooser (i,v) = (i, choice v)
>              choice v = if isUnset v then vals else [v]
>              vals = values n

The prune function goes through and eliminates all set values from the list 
of possible values in each row and column.  The minus function is an adapted
form of the (\\) function, such that it ignores single cells.

> prune	:: Matrix Choices -> Matrix Choices
> prune =  pruneBy cols . pruneBy rows
>          where 
>          pruneBy (f,f') = f' . map reduce . f
>
> reduce     :: Row Choices -> Row Choices
> reduce r =  array size (map reducer vals)
>             where
>             size = bounds r
>             vals = assocs r
>             reducer (c,vs) = (c, vs `minus` singles)
>             singles = getSingles r
>
> minus 	:: Choices -> Choices -> Choices
> xs `minus` ys	=  if single xs then xs else xs \\ ys

To ensure that whenever we are collapsing a matrix of Choices that the Latin
Square created at each step is still valid, we only expand the first cell
with more than once choice.  When pruned after each expansion, the Latin Squares
produced are guarunteed to fulfill the criteria that each value between 1 and n
occurs at most one time in each row and column.  An extra advantage of this approach
is that if the Latin Square produced does not fulfill other criteria, then
the expansion of that particular matrix of Choices can be halted, saving computation
time.  Note: as a consequence of this, all outputs of this function should be
'safe', but it is best to use the safe function to make sure (i.e. in case
overlaps from various pre-set values cause un-safeness).

Since the values are stored in matrices in a list of associations, this
simplifies the matter of choosing a single value.

> expand   :: Matrix Choices -> Set (Matrix Choices)
> expand m = map (array size . (++) before . flip (:) after . (,) i . return) vs
>            where
>            size = bounds m
>            (before, (i,vs):after) = break (not . single . snd) . assocs $ m

Solving a Latin Square
----------------------

To solve a Partial Latin Square, we construct matrix of Choices,
with the known values set.  We then go and expand each possible unset
position at a time, and see if that provides us with a solution.

> solve   :: Int -> LatinSquare -> SolutionSet
> solve n = solutionSearch . (choices n)
>
> solutionSearch :: Matrix Choices -> Set LatinSquare
> solutionSearch m
>     | blocked m  = []
>     | complete m = [fixVals m]
>     | otherwise  = [g | m' <- expand m,
>                     g <- solutionSearch (prune m')]

fixVals is used to turn a matrix of Choices for a Latin Square into that Latin
Square by 'fixing' the values in place.  Make sure you check that it isn't 
void first!!!!!!!  The extra else segment for non-single matrices of Choices 
is in case this function is used to test a Latin Square that hasn't been 
fully expanded as yet.

> fixVals :: Matrix Choices -> LatinSquare
> fixVals = createLatinSquare . valFixer
>
> valFixer   :: Matrix Choices -> Matrix Value
> valFixer m = array size . map fixer $ vals
>              where
>              size = bounds m
>              vals = assocs m
>              fixer (i,(x:xs)) = if empty xs then (i,x) else (i,unsetValue)

Partial Latin Square (PLS) Properties
-------------------------------------

The functions completeable, premature and uniquelyCompleteable are designed to
receive the input of the solve function.

A PLS is completeable if it has at least one solution, premature otherwise

> completeable :: SolutionSet -> Bool
> completeable = not . empty

> premature :: SolutionSet -> Bool
> premature = empty

A PLS is uniquely completeable if it has exactly one solution
(note that, this function returns false for non-completeable PLSs
as well, i.e. if uniquely completeable, then it is completeable).

> uniquelyCompleteable :: SolutionSet -> Bool
> uniquelyCompleteable = single

The following functions are the same as those above, but take in a Latin Square
and solve it first.

> isCompleteable   :: Int -> LatinSquare -> Bool
> isCompleteable n = completeable . solve n
>
> isPremature   :: Int -> LatinSquare -> Bool
> isPremature n = premature . solve n
>
> isUniquelyCompleteable   :: Int -> LatinSquare -> Bool
> isUniquelyCompleteable n = uniquelyCompleteable . solve n

A PLS is a critical set if it is uniquely completeable, but all subsets 
have more than one completion (note that we do not need to check if the subsets
are completeable, since if the given Latin Square is completeable then the 
sub-squares must be).

> isCriticalSet      :: Int -> LatinSquare -> Bool
> isCriticalSet n ls = isUniquelyCompleteable n ls
>                        && all (not . isUniquelyCompleteable n) subSquares
>                      where
>                      subSquares = getSubSquares ls

Helper functions for creating Partial Latin Squares (PLS)
---------------------------------------------------------

Similar to choice, partialChoices constructs a Matrix of Choices, except that
it includes the unsetValue as well.

> partialChoices   :: Int -> LatinSquare -> Matrix Choices
> partialChoices n = newMatrix n . addUnset . assocs . (choices n)
>                    where
>                    addUnset = map (\ (i,vs) -> (i,unsetValue:vs))

Since Latin Squares are O(n^(n^2)), we want to try and reduce the number
of them considered as soon as possible.  As such, we shall try and construct
exactly _one_ from each isotopy class.

Two Latin Squares are in the same isotopic class if one can be transformed into
another by performing combinations of the following:

    1) Row swaps
    2) Column swaps
    3) Symbol swaps (e.g. replace all 1s with 2s and vice versa)
    4) Transposition

This works because all that defines a PLS is that each number from 1 to n appears
at most once in each row or column (for a non-Partial Latin Square, replace
"at most" with "exactly").  As such, none of the above four transformations will
change this behaviour.

Since swapping rows and columns does not change the structural definition
of the Latin Square, we define the "simplest isotopic form" of a PLS as 
the one where the number of set values in each row or column is greater than
or equal to the number of set values in the next row or column.  Furthermore,
if we consider the first time any particular (set) number appears in the Latin 
Square, then they should be in ascending order.

As such, we need functions that will check to see if a given matrix of Choices
or a Latin Square fulfill this criteria.

First of all, we compare the rows and columns.  A row or column has a higher
precedence than another row or column if it has a higher weighting, or -- if
they have the same weighting -- by finding which of the two first has a set
value where the other one doesn't (if they both have values in the same positions,
give precedence to the one with the lower value).

> isSimplestRows :: Matrix Value -> Bool
> isSimplestRows = isSimplestBy getRows
>
> isSimplestCols :: Matrix Value -> Bool
> isSimplestCols = isSimplestBy getCols
>
> isSimplestBy :: (Matrix Value -> Set (Row Value)) 
>                      -> Matrix Value -> Bool
> isSimplestBy f = and . ap (zipWith comesBefore) tail . f
>
> comesBefore :: Row Value -> Row Value -> Bool
> comesBefore r1 r2 = (r1 == r2) || s1 > s2 || ((s1 == s2) && (isLessThan fr1 fr2))
>                     where
>                     s1 = numSetRow r1
>                     s2 = numSetRow r2
>                     joined = zip (getVals r1) (getVals r2)
>                     (fr1,fr2) = dropUntil dropFunc joined
>                     dropFunc (x,y) = (isSet x) /= (isSet y)
>                     isLessThan v1 v2 = (isUnset v2) || ((isSet v1) && (v1 < v2))

Checking if the Matrix of Choices is in the simplest numerical order
i.e. each new, unique numerical value must be in ascending order.

> isSimplestNumbers   :: Int -> Matrix Choices -> Bool
> isSimplestNumbers n = simplestNumbers n . valOrder
>
> valOrder :: Matrix Choices -> Set Value
> valOrder = nub . getSingles
>
> simplestNumbers :: Int -> Set Value -> Bool
> simplestNumbers = (and .) . zipWith (==) . values

Since it is possible to transpose a Latin Square and still have it retain
is isotopy-ness, we need a way to check if a Latin Square would be simpler as
it is or as its transpose (if the latter, dump it).  Ultimately, this compares
the shape of the given Latin Square to its transpose, and determines which is
simpler.  If the shape is invariant under transposition, use sameShapeCheck.

> transposeCheck   :: Matrix Value -> Bool
> transposeCheck m = validWeightings || (equalWeightings && sameShapeCheck m)
>                    where
>                    dropped = dropUntil (\ (x1,x2) -> x1 /= x2)
>                    comparer = zip (weightOf getRows) (weightOf getCols)
>                    weightOf f = map numSetRow . f $ m
>                    droppedWeightings = dropped  comparer
>                    validWeightings =  uncurry (>) droppedWeightings
>                    equalWeightings = uncurry (==) droppedWeightings

If the shape is invariant under transposition, then preference is given to the
LatinSquare where - when the elements in the kth column are paired up with the
elements in the kth row - the first pair where both elements are set has a lower 
or equal value in the column than the row.  Note that if the shape is invariant, 
then if there is a set value at (i,j), then there is also a set value at (j,i).

> sameShapeCheck   :: Matrix Value -> Bool
> sameShapeCheck m = all comparer firsts
>                    where
>                    getVs f = map getVals . f
>                    zippedRows = zip (getVs getRows m) (getVs getCols m)
>                    zippedVals = map (uncurry zip) zippedRows
>                    firsts = map (dropUntil dropFun) zippedVals
>                    dropFun (r,_) = isSet r
>                    comparer (r,c) = r >= c

Generating PLSs that are in simplest form
-----------------------------------------

Since values are expanded along each row first, when creating the Set of PLSs
in simplest form, at each stage we check if the _rows_ are simplest, and leave
the checking of columns to the end.  To see why this is important, consider
the following PLS:

     0 1 2
     1 0 0
     2 0 0

If we check both rows and columns at each stage of expansion, then any PLS
where the top-left-hand element is not set will automatically fail, as
if we then set any other element in the first row then the criteria for that
intermediary PLS to be simplest isn't satisfied (but row-wise will be OK).
As such, we try and cut down as many as possible during the expansion stage so
that non-simplest PLSs aren't needlessly expanded.

> generatePartials :: Int -> Set LatinSquare
> generatePartials n = partialSearch n partialOptions
>                      where
>                      partialOptions = partialChoices n (blankLatinSquare n)
>
> partialSearch :: Int -> Matrix Choices -> Set LatinSquare
> partialSearch n m
>     | blocked m  = []
>     | not . isSimplestRows . valFixer $ m = []
>     | not (isSimplestNumbers n m) = []
>     | complete m = createPartial m
>     | otherwise  = [g | m' <- expand m,
>                     g <- partialSearch n (prune m')]

createPartial checks a few extra criteria to ensure that the Latin Square
is in the simplest form.  Note that its input _must_ fulfill "complete m"
(i.e. each Set of Values should be single).

> createPartial     :: Matrix Choices -> Set LatinSquare
> createPartial m = if valid then [createLatinSquare fixed] else []
>                   where
>                   fixed = valFixer m
>                   valid = (isSimplestCols fixed) && (transposeCheck fixed)

To simplify the construction of PLSs that fulfill a certain criteria, use this
function (it is just a wrapper for filtering the results of generatePartials).

> generatePartialType :: (Int -> LatinSquare -> Bool) -> Int -> Set LatinSquare
> generatePartialType p n = filter (p n) $! partials
>                           where
>                           partials = generatePartials n

Generating Critical Sets:

> generateCriticalSets :: Int -> Set LatinSquare
> generateCriticalSets = generatePartialType isCriticalSet

Matrix Printing
---------------

Converts a Matrix to a String (used to show Latin Squares).

> matToString     :: (a -> String) -> Matrix a -> String
> matToString f m = flip (++) "\n" . unlines . map unwords $ mStr'
>                   where
>                   mStr = map ((map f) . getVals) . getRows $ m
>                   strLen = maxStrLen mStr
>                   mStr' = map (map (bulkStr strLen)) mStr

> maxStrLen :: [[String]] -> Int
> maxStrLen = maximum . (map (maximum . map length))

> bulkStr :: Int -> String -> String
> bulkStr l = (++) =<< flip replicate space . (-) l . length
>             where
>             space = ' '

signature.asc
Description: This is a digitally signed message part

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Matrices in Haskell

Reply via email to