RE: Robustness of instance Read Char

2001-11-02 Thread Simon Peyton-Jones

I do agree with you that it would be better for the Read class
to use a Maybe result rather than a list of parses.  But I'm not
sure your problem can be solved simply by making the Char
instance of Read better.  The point is that the parser has to read
the *whole* string before it can be sure that it is syntactically
well formed (e.g. no duff escape sequence in it), and hence it can't
produce the result string till it's sure that it can parse it.  So it
gets tummy ache.
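
To make the point concrete, here is a minimal sketch (not part of the
original message) of how the standard `reads` treats a String literal.
A single bad escape near the end means there is no parse at all, so no
prefix of the result can be handed out before the end has been checked.
The names `good` and `bad` are made up for the example.

```haskell
-- A well-formed literal yields exactly one complete parse.
good :: [(String, String)]
good = reads "\"abc\\n\""

-- \q is not a valid escape, so the whole literal is rejected,
-- even though the first three characters were fine.
bad :: [(String, String)]
bad = reads "\"abc\\q\""

main :: IO ()
main = do
  print good  -- [("abc\n","")]
  print bad   -- []
```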

Better perhaps to roll your own Read class which produces output
earlier.  For that it would help if I finished up the generics support
in GHC so that you could do something like deriving Read for your
own new class.
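
A rough sketch of what such a home-grown class might look like, under
stated assumptions: `LazyRead` and `lazyRead` are hypothetical names
(nothing in GHC or the libraries), the input is assumed to come from
`show` (so no string gaps), and a malformed literal is reported through
`error` only when the bad part is actually demanded. That late failure
is the price paid for producing output early.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import Data.Char (isSpace, readLitChar)

-- Hypothetical class: returns the value directly instead of a list of parses.
class LazyRead a where
  lazyRead :: String -> a

instance LazyRead String where
  lazyRead input = case dropWhile isSpace input of
      '"' : rest -> go rest
      _          -> error "lazyRead: expected opening quote"
    where
      go ('"' : _)        = []          -- closing quote: end of literal
      go ('\\' : '&' : s) = go s        -- empty escape emitted by show
      go s = case readLitChar s of
        (c, s') : _ -> c : go s'        -- c is available before s' is examined
        []          -> error "lazyRead: bad or unterminated string literal"

main :: IO ()
main = putStrLn (take 5 (lazyRead (show (replicate 200000 'x'))))  -- prints xxxxx
```

Only the first few characters of the literal are inspected here; the
rest of the 200000 is never parsed.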

Simon

| -----Original Message-----
| From: Peter Thiemann [mailto:[EMAIL PROTECTED]] 
| Sent: 15 October 2001 11:45
| To: [EMAIL PROTECTED]
| Cc: [EMAIL PROTECTED]
| Subject: Robustness of instance Read Char
| 
| 
| Folks,
| 
| my code has unwillingly been forced to read a large string 
| generated by show. This turned out to be a robustness test 
| because the effect is a stack overflow (with Hugs as well as 
| with GHC) and, of course, this error happened in a CGI script. 
| 
| If you want to try the effect yourself, just take a file 
| foo of, say, 150k and type this into your hungry Hugs prompt:
| 
| readFile "foo" >>= \s -> putStr (read (show s))
| 
| Digging down into the prelude code (taken from Hugs's prelude 
| file), you find this: 
| 
|  instance Read Char where
|    readsPrec p  = readParen False
|                     (\r -> [(c,t) | ('\'':s,t) <- lex r,
|                                     (c,"\'")   <- readLitChar s ])
|    readList = readParen False (\r -> [(l,t) | ('"':s, t) <- lex r,
|                                               (l,_)      <- readl s ])
|        where readl ('"':s)      = [("",s)]
|              readl ('\\':'&':s) = readl s
|              readl s            = [(c:cs,u) | (c ,t) <- readLitChar s,
|                                               (cs,u) <- readl t ]
| 
| which means that the parser reading this string has the 
| ability to fail and to backtrack *at every single character*. 
| While this might be useful in the general case, it certainly 
| causes our little one-line program to die. 
| 
| Unfortunately, in my real program, the String is embedded in 
| a data type which is deriving Read, so that writing the 
| specific instance of read is a major pain. Two things would 
| help me in this situation:
| 
| 1. some kind-hearted maintainer of a particularly well-behaved
|    Haskell implementation might put in a more efficient definition
|    in the instance Read Char (or convince me that backtracking
|    inside of reading a String is a useful gadget). The following
|    code will do:
| 
| readListChar :: String -> [(String, String)]
| readListChar =
|   return . readListChar' . dropWhile isSpace
| 
| readListChar' ('\"':rest) =
|   readListChar'' rest
| 
| readListChar'' ('\"':rest) =
|   ("",rest)
| readListChar'' rest = 
|   let (c, s') = head (readLitChar rest) 
|       (s, s'') = readListChar'' s'
|   in  (c:s, s'')
| 
| {- clearly, taking the head should be guarded and a proper 
| error message generated -}
| 
| 2. provide a way of locally replacing the offending instance of Read
|    with something else. [urgh, a language extension]
| 
| Any suggestions or comments?
| -Peter
| 
| ___
| Haskell mailing list
| [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
| 



Robustness of instance Read Char

2001-10-16 Thread Peter Thiemann

Folks,

my code has unwillingly been forced to read a large string generated
by show. This turned out to be a robustness test because the effect is
a stack overflow (with Hugs as well as with GHC) and, of course, this
error happened in a CGI script. 

If you want to try the effect yourself, just take a file foo of,
say, 150k and type this into your hungry Hugs prompt:

readFile "foo" >>= \s -> putStr (read (show s))

Digging down into the prelude code (taken from Hugs's prelude file),
you find this: 

 instance Read Char where
   readsPrec p  = readParen False
                    (\r -> [(c,t) | ('\'':s,t) <- lex r,
                                    (c,"\'")   <- readLitChar s ])
   readList = readParen False (\r -> [(l,t) | ('"':s, t) <- lex r,
                                              (l,_)      <- readl s ])
       where readl ('"':s)      = [("",s)]
             readl ('\\':'&':s) = readl s
             readl s            = [(c:cs,u) | (c ,t) <- readLitChar s,
                                              (cs,u) <- readl t ]

which means that the parser reading this string has the ability to
fail and to backtrack *at every single character*. While this might be 
useful in the general case, it certainly causes our little one-line
program to die. 
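
The helper from the instance above can be pulled out and run on its
own to see where the depth comes from. This is a sketch, assuming the
Prelude definition quoted above: each character opens one more nested
list comprehension, and the outermost result list is known to be
non-empty only once the innermost call has found the closing quote.

```haskell
import Data.Char (readLitChar)

-- Body of a string literal, after the opening quote has been consumed.
-- Copied from the Prelude instance above, lifted to the top level.
readl :: ReadS String
readl ('"' : s)        = [("", s)]
readl ('\\' : '&' : s) = readl s
readl s                = [ (c : cs, u) | (c, t)  <- readLitChar s
                                       , (cs, u) <- readl t ]

main :: IO ()
main = do
  print (readl "ab\" rest")  -- [("ab"," rest")]
  print (readl "ab")         -- []  (no closing quote, so no result at all)
```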

Unfortunately, in my real program, the String is embedded in a data
type which is deriving Read, so that writing the specific instance of
read is a major pain. Two things would help me in this situation:

1. some kind-hearted maintainer of a particularly well-behaved Haskell 
   implementation might put in a more efficient definition in the
   instance Read Char (or convince me that backtracking inside of
   reading a String is a useful gadget). The following code will do:

readListChar :: String -> [(String, String)]
readListChar =
  return . readListChar' . dropWhile isSpace

readListChar' ('\"':rest) =
  readListChar'' rest

readListChar'' ('\"':rest) =
  ("",rest)
readListChar'' rest = 
  let (c, s') = head (readLitChar rest) 
      (s, s'') = readListChar'' s'
  in  (c:s, s'')

{- clearly, taking the head should be guarded and a proper error
message generated -}

2. provide a way of locally replacing the offending instance of Read
   with something else. [urgh, a language extension]
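
For illustration only, here is one way both points could be
approximated without a language extension. It is a sketch, not the
code from this message: the reader is a guarded, accumulator-based
variant of readListChar (it returns no parse instead of calling head
on an empty list, but gives up the laziness of the version above), and
the local replacement is done with a newtype wrapper. The names `Str`
and `Entry` are hypothetical, and input is assumed to come from show.

```haskell
import Data.Char (isSpace, readLitChar)

-- Deterministic reader for a string literal: no backtracking, and
-- malformed input gives the empty list rather than a crash.
readListChar :: ReadS String
readListChar input =
  case dropWhile isSpace input of
    '"' : rest -> maybe [] (: []) (go [] rest)
    _          -> []
  where
    go acc ('"' : rest)        = Just (reverse acc, rest)
    go acc ('\\' : '&' : rest) = go acc rest
    go acc s = case readLitChar s of
      (c, s') : _ -> go (c : acc) s'   -- tail call, characters accumulated in reverse
      []          -> Nothing           -- bad escape or missing closing quote

-- Hypothetical wrapper whose Read instance uses the reader above.
newtype Str = Str String deriving Eq

instance Show Str where
  showsPrec _ (Str s) = shows s

instance Read Str where
  readsPrec _ r = [(Str s, t) | (s, t) <- readListChar r]

-- A type that derives Read picks up the custom instance for its field.
data Entry = Entry Int Str deriving (Eq, Show, Read)

main :: IO ()
main = print (read (show (Entry 1 (Str "a\"b\n"))) :: Entry)
```

The cost is that every String field has to be wrapped in Str, which
may or may not be less painful than writing the instance by hand.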

Any suggestions or comments?
-Peter
