Re: [Chicken-users] BOM in a Scheme source file

2007-09-09 Thread Pierpaolo Bernardi
On 9/10/07, John Cowan <[EMAIL PROTECTED]> wrote:
> Pierpaolo Bernardi scripsit:
> > which says that you can put a bom in a utf8 file (of course, you can
> > put whatever character you want in a file), but it is a character
> > like every other character, it has no particular meaning wrt the encod

Re: [Chicken-users] BOM in a Scheme source file

2007-09-09 Thread John Cowan
Zbigniew scripsit:
> BOM breaks the UNIX shebang mechanism. To me, this is good enough
> reason to avoid prepending a BOM to scripts, and to detect encoding
> via heuristic, user directive or current locale.
I agree w/r/t scripts. My point is not that it's a Good Thing to generate 8-BOMs, but th

Re: [Chicken-users] BOM in a Scheme source file

2007-09-09 Thread Zbigniew
BOM breaks the UNIX shebang mechanism. To me, this is good enough reason to avoid prepending a BOM to scripts, and to detect encoding via heuristic, user directive or current locale.
On 9/9/07, John Cowan <[EMAIL PROTECTED]> wrote:
> Shawn Rutledge scripsit:
> > It would be nice if Chicken was to
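
A small illustration of the shebang point, as a hedged sketch rather than anything posted in the thread: a shebang is only recognized when the file's first two bytes are "#!", so a script that begins with the three BOM bytes EF BB BF no longer qualifies. The helper name `starts_with_shebang` and the sample script are made up for this example.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # the three bytes an editor may prepend

def starts_with_shebang(data: bytes) -> bool:
    """True if the very first two bytes are '#!' (hypothetical helper)."""
    return data[:2] == b"#!"

# Sample script contents, assumed for illustration only.
script = b"#!/bin/sh\necho hi\n"
script_with_bom = UTF8_BOM + script

plain_ok = starts_with_shebang(script)          # shebang is at offset 0
bom_ok = starts_with_shebang(script_with_bom)   # shebang is pushed to offset 3
```

With the BOM in front, the "#!" sits at byte offset 3, which is why the check on `script_with_bom` fails.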

Re: [Chicken-users] BOM in a Scheme source file

2007-09-09 Thread John Cowan
Shawn Rutledge scripsit:
> Instead, you think Scite should assume that when it sees any bytes
> with the MSB set, the file is UTF-8? Or there is a better way to
> detect it?
There is no *guaranteed correct* way to detect UTF-8, because a Latin-1 (or various other 8859-x encodings) file can conta
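
To make the detection question concrete, here is a minimal sketch of one common heuristic: treat the data as UTF-8 if it decodes strictly, otherwise fall back to Latin-1. It is not what Scite or Chicken does (the thread does not say), and the function name `guess_encoding` and the Latin-1 fallback are assumptions for this example. The last case shows why the guess cannot be guaranteed: some Latin-1 byte sequences are also valid UTF-8.

```python
def guess_encoding(data: bytes) -> str:
    """Heuristic guess only (hypothetical helper), never a proof."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"            # leading BOM used as a signature
    try:
        data.decode("utf-8")          # strict decode rejects malformed input
    except UnicodeDecodeError:
        return "latin-1"              # assumed fallback for this sketch
    return "utf-8"                    # also covers pure ASCII

utf8_guess = guess_encoding("é".encode("utf-8"))      # bytes C3 A9
latin1_guess = guess_encoding("é".encode("latin-1"))  # lone byte E9, invalid UTF-8

# Ambiguous case: the Latin-1 text "Ã©" is stored as C3 A9,
# which is also the valid UTF-8 encoding of "é".
ambiguous_guess = guess_encoding("Ã©".encode("latin-1"))
```

In the ambiguous case the heuristic answers "utf-8" even though the bytes were produced as Latin-1, so a wrong guess is always possible.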

Re: [Chicken-users] BOM in a Scheme source file

2007-09-09 Thread Elf
everything i've seen on the unicode site itself seems to discourage the use of a BOM outside of protocol-ambiguous cases since it's not a necessary object. it's not an easy thing to be tolerant of in code text, although it is relatively easy to be tolerant of it in plain text. possibilities: is it

Re: [Chicken-users] BOM in a Scheme source file

2007-09-09 Thread John Cowan
Pierpaolo Bernardi scripsit:
> See here for example: http://unicode.org/faq/utf_bom.html#29
>
> which says that you can put a bom in a utf8 file (of course, you can
> put whatever character you want in a file), but it is a character
> like every other character, it has no particular meaning wrt t

Re: [Chicken-users] BOM in a Scheme source file

2007-09-09 Thread Shawn Rutledge
On 9/8/07, Elf <[EMAIL PROTECTED]> wrote:
> and does not state anything about byte order.[1] Quite a lot of
> Windows software (including Windows Notepad) adds one to UTF-8 files.
> However in Unix-like systems (which make heavy use of text files for
> configuration) this practice i

Re: [Chicken-users] BOM in a Scheme source file

2007-09-09 Thread Elf
and according to the unicode consortium: A: Yes, UTF-8 can contain a BOM. However, it makes no difference as to the endianness of the byte stream. UTF-8 always has the same byte order. An initial BOM is only used as a signature -- an indication that an otherwise unmarked text fil

Re: [Chicken-users] BOM in a Scheme source file

2007-09-08 Thread Elf
from that page: While UTF-8 does not have byte order issues, a BOM encoded in UTF-8 may be used to mark text as UTF-8. It only identifies a file as UTF-8 and does not state anything about byte order.[1] Quite a lot of Windows software (including Windows Notepad) adds one to UTF-8 files.

Re: [Chicken-users] BOM in a Scheme source file

2007-09-08 Thread Pierpaolo Bernardi
On 9/9/07, Graham Fawcett <[EMAIL PROTECTED]> wrote:
> On 9/8/07, Pierpaolo Bernardi <[EMAIL PROTECTED]> wrote:
> > UTF8 has no BOM. A BOM in a utf8 file should be there only if you
> > put it there.
>
> Not true.
>
> http://en.wikipedia.org/wiki/Byte_Order_Mark
UTF8 is defined by the Unicode con

Re: [Chicken-users] BOM in a Scheme source file

2007-09-08 Thread Graham Fawcett
On 9/8/07, Pierpaolo Bernardi <[EMAIL PROTECTED]> wrote:
> UTF8 has no BOM. A BOM in a utf8 file should be there only if you
> put it there.
Not true.
http://en.wikipedia.org/wiki/Byte_Order_Mark
G

Re: [Chicken-users] BOM in a Scheme source file

2007-09-08 Thread Pierpaolo Bernardi
On 9/9/07, Shawn Rutledge <[EMAIL PROTECTED]> wrote:
> If I save a Scheme source file from Scite (my usual editor) in UTF8
> mode, it writes the Byte Order Marker at the beginning (EF BB BF).
UTF8 has no BOM. A BOM in a utf8 file should be there only if you put it there. It's a bug in Scite.
P.

[Chicken-users] BOM in a Scheme source file

2007-09-08 Thread Shawn Rutledge
If I save a Scheme source file from Scite (my usual editor) in UTF8 mode, it writes the Byte Order Marker at the beginning (EF BB BF). If I load it like this
csi myfile.scm
I get
Error: unbound variable: ||
But if I delete the BOM using a hex editor and try again, csi seems to assume it's UTF8
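
For anyone hitting the same error, deleting the BOM does not need a hex editor. A minimal sketch, assuming the file's bytes are already in memory; the helper name `strip_utf8_bom` and the sample source are made up for this example and are not part of Chicken or Scite.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading EF BB BF, if present (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Sample Scheme source as an editor might save it, BOM first (assumed content).
source = codecs.BOM_UTF8 + b'(display "hello")\n'
cleaned = strip_utf8_bom(source)
```

Only a BOM at the very start is removed; the same three bytes later in the file are left alone, and input without a BOM comes back unchanged.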