Re: literate scripts.

Malcolm Wallace Tue, 05 Aug 2003 07:51:34 -0700

Immanuel Litzroth <[EMAIL PROTECTED]> writes:

> I have a small question relating to literate haskell programs that
> use the \begin{code} \end{code} style. Am I correct to assume that 
> \end{code} inside a string should be recognized as being inside a
> string. The report seems to say this, but the unlit that is
> distributed with ghc doesn't grok this. Has anyone written an Unlit
> module in Haskell?


I have attached the Unlit.hs module from the nhc98 compiler, based
on the original specification in the the Haskell 1.2 Report.  Use the
exported function
    unlit :: String -> String -> String
whose first argument is the filename (for error messages), the second
argument is the file content, and whose result is the un-literated
file.

Regards,
    Malcolm

module Unlit(unlit) where

-- Part of the following code is from
-- "Report on the Programming Language Haskell",
--   version 1.2, appendix C.

import Char

data Classified = Program String | Blank | Comment
                | Include Int String | Pre String

classify :: [String] -> [Classified]
classify []                = []
classify (('\\':x):xs) | x == "begin{code}" = Blank : allProg xs
   where allProg [] = []  -- Should give an error message,
                          -- but I have no good position information.
         allProg (('\\':x):xs) |  x == "end{code}" = Blank : classify xs
         allProg (x:xs) = Program x:allProg xs
classify (('>':x):xs)      = Program (' ':x) : classify xs
classify (('#':x):xs)      = (case words x of
                                (line:file:_) | all isDigit line
                                   -> Include (read line) file
                                _  -> Pre x
                             ) : classify xs
classify (x:xs) | all isSpace x = Blank:classify xs
classify (x:xs)                 = Comment:classify xs

unclassify :: Classified -> String
unclassify (Program s) = s
unclassify (Pre s)     = '#':s
unclassify (Include i f) = '#':' ':show i ++ ' ':f
unclassify Blank       = ""
unclassify Comment     = ""

unlit :: String -> String -> String
unlit file lhs = (unlines
                 . map unclassify
                 . adjecent file (0::Int) Blank
                 . classify) (inlines lhs)

adjecent :: String -> Int -> Classified -> [Classified] -> [Classified]
adjecent file 0 _             (x              :xs) = x : adjecent file 1 x xs -- force 
evaluation of line number
adjecent file n y@(Program _) ([EMAIL PROTECTED]      :xs) = error (message file n 
"program" "comment")
adjecent file n y@(Program _) (x@(Include i f):xs) = x: adjecent f    i     y xs
adjecent file n y@(Program _) (x@(Pre _)      :xs) = x: adjecent file (n+1) y xs
adjecent file n [EMAIL PROTECTED]     (x@(Program _)  :xs) = error (message file n 
"comment" "program")
adjecent file n [EMAIL PROTECTED]     (x@(Include i f):xs) = x: adjecent f    i     y 
xs
adjecent file n [EMAIL PROTECTED]     (x@(Pre _)      :xs) = x: adjecent file (n+1) y 
xs
adjecent file n [EMAIL PROTECTED]       (x@(Include i f):xs) = x: adjecent f    i     
y xs
adjecent file n [EMAIL PROTECTED]       (x@(Pre _)      :xs) = x: adjecent file (n+1) 
y xs
adjecent file n _             ([EMAIL PROTECTED]         :xs) = x: adjecent file (n+1) 
x xs
adjecent file n _             []                    = []

message "\"\"" n p c = "Line "++show n++": "++p++ " line before "++c++" line.\n"
message []     n p c = "Line "++show n++": "++p++ " line before "++c++" line.\n"
message file   n p c = "In file " ++ file ++ " at line "++show n++": "++p++ " line 
before "++c++" line.\n"


-- Re-implementation of 'lines', for better efficiency (but decreased laziness).
-- Also, importantly, accepts non-standard DOS and Mac line ending characters.
inlines s = lines' s id
  where
  lines' []             acc = [acc []]
  lines' ('\^M':'\n':s) acc = acc [] : lines' s id      -- DOS
  lines' ('\^M':s)      acc = acc [] : lines' s id      -- MacOS
  lines' ('\n':s)       acc = acc [] : lines' s id      -- Unix
  lines' (c:s)          acc = lines' s (acc . (c:))

Re: literate scripts.

Reply via email to