Re: [Haskell'-private] pragmas and annotations (RE: the record system)

Ben Rudiak-Gould Tue, 28 Feb 2006 13:09:06 -0800

Malcolm Wallace wrote:

  * If the first three bytes of the file are "{-#", then keep reading in
    ASCII/Latin-1/whatever until you discover an ENCODING decl (or not).


  * If the first six bytes of the file are one of the two possible
    UTF-16 representations of "{-#", then assume UTF-16 with that
    byte-encoding until we find the ENCODING decl.  (A missing decl in
    this case would be an error.)

  * If the first twelve bytes of the file are a UCS-4 representation of
    "{-#" then ... you get the picture.

  * For UTF-16 and UCS-4 variations, you must also permit the file to
    begin with an optional byte-order mark (two or four bytes).


You'd also want to look for the UTF-8 BOM (the bytes EF BB BF), which is very common on Windows.
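For concreteness, here is a rough sketch of how the sniffing rules quoted above, plus the UTF-8 BOM check, might be written. It is only an illustration and not anything taken from an actual compiler: the names Guess, candidates, widen and guessEncoding are made up for this example, and it assumes that a BOM on its own is enough to pick a provisional encoding. The result is only a first guess; the reader would still go on to look for the ENCODING decl.

```haskell
import Data.List (isPrefixOf)
import Data.Word (Word8)

-- Provisional guess made from the first few bytes of a source file.
-- (Hypothetical type, invented for this sketch.)
data Guess
  = AsciiCompatible  -- "{-#" as single bytes: read on in ASCII/Latin-1
  | Utf8Bom          -- EF BB BF
  | Utf16BE
  | Utf16LE
  | Ucs4BE
  | Ucs4LE
  | Unknown          -- nothing recognised; fall back to some default
  deriving (Eq, Show)

-- The three bytes of "{-#" in ASCII.
pragmaOpen :: [Word8]
pragmaOpen = [0x7B, 0x2D, 0x23]

-- Pad each byte out to a code unit of the given width and byte order.
widen :: Int -> Bool -> [Word8] -> [Word8]
widen width bigEndian = concatMap unit
  where
    unit b
      | bigEndian = replicate (width - 1) 0 ++ [b]
      | otherwise = b : replicate (width - 1) 0

-- Signatures in the order they are tried. The four-byte BOMs come
-- before the two-byte ones because FF FE 00 00 begins with FF FE.
candidates :: [([Word8], Guess)]
candidates =
  [ ([0x00, 0x00, 0xFE, 0xFF], Ucs4BE)
  , ([0xFF, 0xFE, 0x00, 0x00], Ucs4LE)
  , ([0xEF, 0xBB, 0xBF], Utf8Bom)
  , ([0xFE, 0xFF], Utf16BE)
  , ([0xFF, 0xFE], Utf16LE)
  , (widen 4 True pragmaOpen, Ucs4BE)
  , (widen 4 False pragmaOpen, Ucs4LE)
  , (widen 2 True pragmaOpen, Utf16BE)
  , (widen 2 False pragmaOpen, Utf16LE)
  , (pragmaOpen, AsciiCompatible)
  ]

-- Return the guess for the first signature that prefixes the input.
guessEncoding :: [Word8] -> Guess
guessEncoding bytes =
  case [g | (sig, g) <- candidates, sig `isPrefixOf` bytes] of
    (g : _) -> g
    []      -> Unknown
```

With this, guessEncoding [0xEF, 0xBB, 0xBF, 0x7B, 0x2D, 0x23] evaluates to Utf8Bom. Trying the longer BOMs first means a UTF-16 little-endian file whose first character after the BOM is U+0000 would be misread as UCS-4, which seems acceptable for source files.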

As for literate source, I suppose you could forbid .lhs files from using UTF-16 or UCS-4 unless there's a BOM. Then unlit wouldn't need to know the encoding (I think), and the .hs heuristics would work on the output.


-- Ben

_______________________________________________
Haskell-prime mailing list
Haskell-prime@haskell.org
http://haskell.org/mailman/listinfo/haskell-prime
