Re: byte-order marks

2013-01-29 Thread Ludovic Courtès
Mark H Weaver skribis:
> I wrote:
>> Having slept on this, I think I agree that 'open-input-file' should
>> auto-consume BOMs.
Good.
> So what should (open-file FILENAME "r+") do?
What about doing the same as for just “r”? I can’t think of any reasonable scenario where this could be a problem

Re: byte-order marks

2013-01-29 Thread Neil Jerram
Andy Wingo writes:
> On Tue 29 Jan 2013 20:22, Neil Jerram writes:
>
>> (define (read-csv file-name)
>>   (let ((s (utf16->string (get-bytevector-all (open-input-file file-name))
>>                           'little)))
>>
>>     ;; Discard possible byte order mark.
>>     (if (and (>= (string-lengt

Re: byte-order marks

2013-01-29 Thread Andy Wingo
On Tue 29 Jan 2013 20:22, Neil Jerram writes:
> (define (read-csv file-name)
>   (let ((s (utf16->string (get-bytevector-all (open-input-file file-name))
>                           'little)))
>
>     ;; Discard possible byte order mark.
>     (if (and (>= (string-length s) 1)
>              (char=?
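
The BOM-discarding step quoted above can be sketched in isolation. This is a minimal sketch and not the code from the thread, which is truncated in this preview; the name `strip-leading-bom` is hypothetical, and it assumes the already-decoded string may begin with the character U+FEFF.

```scheme
;; Hypothetical helper, not taken from the thread: return S without a
;; leading byte-order mark (U+FEFF), or S unchanged if it has none.
(define (strip-leading-bom s)
  (if (and (>= (string-length s) 1)
           (char=? (string-ref s 0) #\xFEFF))
      (substring s 1 (string-length s))
      s))
```

Under these assumptions it would be applied to the result of `utf16->string`, so the rest of the parser never sees the mark.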

Re: byte-order marks

2013-01-29 Thread Ludovic Courtès
Mark H Weaver skribis:
>>> However, there’s no way to open a file in binary mode when using
>>> ‘open-input-file’, ‘call-with-input-file’, etc.
>>
>> We can add keyword or optional arguments of course. (Not suggesting
>> that we do so at this time though.)
>
> This has been on my TODO list for a

Re: byte-order marks

2013-01-29 Thread Neil Jerram
Andy Wingo writes:
> What do people think about this attached patch?
>
> Andy
>
>
>>From 831c3418941f2d643f91e3076ef9458f700a2c59 Mon Sep 17 00:00:00 2001
> From: Andy Wingo
> Date: Mon, 28 Jan 2013 22:41:34 +0100
> Subject: [PATCH] detect and consume byte-order marks for textual ports
In case

Re: byte-order marks

2013-01-29 Thread Mark H Weaver
I wrote:
> Having slept on this, I think I agree that 'open-input-file' should
> auto-consume BOMs.
On the other hand, there's a nasty complication. Of course (open-input-file FILENAME) is just (open-file FILENAME "r"), so the auto-consuming logic should be in 'open-file'. So what should (open-f

Re: byte-order marks

2013-01-29 Thread Mark H Weaver
Hi,
l...@gnu.org (Ludovic Courtès) writes:
>> For textual files, it doesn’t seem unreasonable for ‘open-input-file’ to
>> consume the BOM, IMO. It’s not much different from the ‘eol-style’
>> transcoders.
Andy Wingo writes:
> I could go either way. I would prefer for open-input-file to consume

Re: byte-order marks

2013-01-29 Thread Andy Wingo
Hi,
[Ludo and Mark and I scribas]:
>>> * 'open-input-file' could perhaps auto-consume a BOM at the beginning of
>>> the stream, but *only* if the BOM is already in the encoding specified
>>> by the user (possibly via an explicit call to 'file-encoding').
>>
>> The problem is that we have no wa

Re: byte-order marks

2013-01-29 Thread Ludovic Courtès
Andy Wingo skribis:
[...]
>> Regarding byte-order marks, my preference is that users should explicitly
>> consume BOMs if that's what they want (ideally using some convenience
>> procedure provided by Guile). Sometimes consuming the BOM is the wrong
>> thing. For example, if the user is copying

Re: more capable xml->sxml

2013-01-29 Thread Ludovic Courtès
Hi! Andy Wingo skribis: > I just pushed some changes to (sxml simple)'s xml->sxml. Basically it > has keyword arguments now that do most of what people have been > requesting for a while (non-significant whitespace, easier handling of > entities, declaration of namespaces). We can add handling

Re: byte-order marks

2013-01-29 Thread Andy Wingo
On Mon 28 Jan 2013 23:20, Mike Gran writes:
> So if there is a "coding:" line in the doc, I think it
> should nullify giving precedence to a UTF-16 BOM.
OK.
Cheers,
Andy
-- 
http://wingolog.org/

Re: byte-order marks

2013-01-29 Thread Andy Wingo
Hi Mark, Let me work the other way around, starting at the problem and not a potential solution. There is a file: https://cvs.khronos.org/svn/repos/ogl/trunk/ecosystem/public/sdk/docs/man3/glBlendEquationSeparate.xml It is valid XML. It also has a UTF-8 BOM. It fails to parse in SSAX. The
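
For reference, a UTF-8 BOM is the three-byte sequence EF BB BF at the very start of the data. The check below is a minimal sketch of detecting it at the byte level; the name `utf8-bom?` is hypothetical and is not part of Guile or of the patch under discussion.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical predicate, not part of Guile: true if the bytevector BV
;; starts with the UTF-8 encoding of U+FEFF (bytes EF BB BF).
(define (utf8-bom? bv)
  (and (>= (bytevector-length bv) 3)
       (= (bytevector-u8-ref bv 0) #xEF)
       (= (bytevector-u8-ref bv 1) #xBB)
       (= (bytevector-u8-ref bv 2) #xBF)))
```

A caller that reads the file as bytes could use such a check to decide whether to skip the first three bytes before handing the rest to a parser.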

Re: byte-order marks

2013-01-29 Thread Mark H Weaver
Hi Andy, Andy Wingo writes: > What do people think about this attached patch? I'm strongly opposed to making 'open-input-file' any more clever than it already is. Furthermore, I strongly believe that it should be much less clever than it is now. Our basic textual I/O should be robust by defaul