lines --- to derive, or not to derive ;-)

Wolfram Kahl Wed, 03 May 2000 04:28:15 -0700

Currently I teach functional programming classes again,
and I wanted to present the derivation of ``lines''
that can be found in [Bird-Wadler-1988].
However, since I wanted to spare my students the confusion of
a non-prelude-equivalent definition of ``unlines'',
I re-did the development of ``lines'' from the current
prelude definition of ``unlines'', therewith in fact solving
exercise 4.3.3. of that book :-).

To my surprise,
the prelude definition of lines:

> lines   :: String -> [String]
> lines "" = []
> lines s  = let (l, s') = break (== '\n') s
>            in  l : case s' of
>                     []      -> []
>                     (_:s'') -> lines s''

(for which I have not yet seen a derivation,
 and which is actually used in hugs, ghc, and hbc
 although the report allows more efficient implementations)
not only is not much less cryptic than the definition I arrived at
(and anybody would arrive at):

> lines :: String -> [String]
> lines = foldr f []
>  where
>   f '\n' xss = "" : xss
>   f x [] = [[x]]
>   f x (ys:yss) = (x:ys) : yss

but it is also much less efficient:

Here are the best running times (out of ten runs) of

> main = interact (unlines . lines)

on a 1.4Mb source file using

 *  first the prelude definition,
 *  then the new definition,
 *  then the new definition with foldr expanded
    (the fact that this is faster should be an
     urgent challenge to compiler writers),
 *  and finally even with f expanded: 

               ghc-4.06 -O    |   hbc-0.9999.5b -O
               real    user   |   real    user
  PrelLines:  1.772s  1.660s  |  2.615s  2.330s
  NewLines:   1.340s  1.220s  |  1.428s  1.280s
  NewLines2:  1.204s  1.080s  |  1.361s  1.210s
  NewLines3:  1.190s  1.060s  |  1.291s  1.140s

Since everybody on the Haskell committee knows [Bird-Wadler-1988] :-),
I can only conclude that the prelude definition of ``lines''
that is contained in the report is a typo (TM) ;-),
especially since it has been stated repeatedly on the Haskell list
that prelude definitions in the report should also serve as
examples of good coding practices.


The picture is essentially the same for ``words'',
where I would propose the following definition:

> words :: String -> [String]
> words [] = []
> words (x:xs)
>  | Char.isSpace x  =  words1 xs
>  | otherwise       =  case words xs of
>                         []  ->  [[x]]
>                         (ys:yss) -> (x:ys):yss
>  where
>   words1 [] = []
>   words1 xs@(y:ys)
>    | Char.isSpace y  =  words1 ys
>    | otherwise       =  [] : case words ys of
>                                []  ->  [[y]]
>                                (zs:zss) -> (y:zs):zss

All the details can be found at:

http://ist.unibw-muenchen.de/Lectures/FT2000/FP/Lines.html


Best regards,

Wolfram Kahl
lines --- to derive, or not to derive ;-)

Reply via email to