On Thu, Apr 14, 2005 at 10:52:25AM -0500, Kenneth Pronovici wrote:
> Hi,
>
> I'm the maintainer for the Debian XML::Writer packages. I've just
> received Debian bug #304477:
>
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=304477
>
> This bug documents a problem with the OUTPUT constructor parameter. The
> constructor bombs out when this parameter is passed a scalar reference.
At the moment, the OUTPUT and ENCODING parameters aren't designed to work together, although this isn't indicated by the documentation. I can think of a few solutions, so here are some notes for the reporter and anyone else who's interested.

Perl distinguishes between Unicode strings (stored, as an implementation detail, in UTF-8) and strings of octets. By design, all input to XML::Writer should be Unicode. All files must be written as octets, and ENCODING specifies how to do this. If the OUTPUT parameter is a scalar reference, the current intention is that the scalar is built up as a Unicode string - that is, ENCODING is meaningless because the string is never encoded.

One obvious way to conceal the original bug would be to ignore ENCODING when OUTPUT is a scalar reference. This would ensure that the program given in the bug report didn't fail. (Alternatively, this incompatibility could be made explicit, with a 'die' added for the case where they're both passed in.)

Or, the strings could be encoded before appending them to the scalar. This would mean that OUTPUT ended up as octets, rather than Unicode.

Alternatively, OUTPUT could be built up as a Unicode string and given a method (asBytes, say) that would do the encoding and return the octets. I like this solution because it requires the developer to be explicit about what they're expecting. (Perhaps with a complementary asString.)

The real purpose of the ENCODING parameter is as a marker to indicate that the developer using the library is aware of Unicode and character encoding. Given that they are, it's not obvious to me which case they'd expect when appending to a scalar. The main use-cases I would expect are storing XML in a database and sending it over a socket, so those should be convenient.

Sorry for the lengthy response - none of those solutions is more than about five lines, and it's really a question of policy rather than mechanism.
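
To make the asBytes option a little more concrete, here is a minimal sketch of the idea. It is not a patch against XML::Writer. The UnicodeBuffer class and its append method are made up purely for illustration; asBytes and asString are only the names floated above, and the encoding itself is done with the core Encode module.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper class, NOT part of XML::Writer: it accumulates
# output as a Unicode (character) string and only encodes on request.
package UnicodeBuffer;

sub new {
    my ($class, %args) = @_;
    my $self = {
        encoding => $args{encoding} || 'UTF-8',   # assumed default
        text     => '',
    };
    return bless $self, $class;
}

# Append a chunk of character data; nothing is encoded here.
sub append {
    my ($self, $chunk) = @_;
    $self->{text} .= $chunk;
    return $self;
}

# Return the accumulated characters, unencoded.
sub asString {
    my ($self) = @_;
    return $self->{text};
}

# Return octets in the configured encoding, e.g. for a socket or a file.
sub asBytes {
    my ($self) = @_;
    return Encode::encode($self->{encoding}, $self->{text});
}

package main;

my $buf = UnicodeBuffer->new(encoding => 'UTF-8');
$buf->append("<name>caf\x{e9}</name>");   # U+00E9, e with acute accent

my $chars  = $buf->asString;
my $octets = $buf->asBytes;

# U+00E9 is one character but two octets in UTF-8.
printf "%d characters, %d octets\n", length($chars), length($octets);
# prints "17 characters, 18 octets"
```

The caller has to choose between characters and octets at the point of use, which is the explicitness argued for above.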
-- 
Joseph Walton
"Dude, you know you are just making up animal names now."