On Thu, Apr 14, 2005 at 10:52:25AM -0500, Kenneth Pronovici wrote:
> Hi,
> 
> I'm the maintainer for the Debian XML::Writer packages.  I've just
> received Debian bug #304477:
> 
>    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=304477
> 
> This bug documents a problem with the OUTPUT constructor parameter.  The
> constructor bombs out when this parameter is passed a scalar reference.

At the moment, ENCODING and a scalar OUTPUT aren't designed to work
together, although the documentation doesn't say so. I can think of a few
solutions, so here are some notes for the reporter and anyone else who's
interested.

Perl distinguishes between Unicode strings (stored, as an implementation
detail, in UTF-8) and strings of octets. By design, all input to XML::Writer
should be Unicode. All files must be written as octets, and ENCODING
specifies how to do this. If the OUTPUT parameter is a scalar, the
current intention is that it's built up as a Unicode string - that is,
ENCODING is meaningless because the string is never encoded.
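
To make the distinction concrete, here is a small sketch using the core
Encode module. It has nothing to do with XML::Writer itself; the variable
names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A character (Unicode) string: a single character, U+263A.
my $chars = "\x{263A}";

# The same text as UTF-8 octets: three bytes, 0xE2 0x98 0xBA.
my $octets = encode('UTF-8', $chars);

printf "characters: %d\n", length $chars;
printf "octets:     %d\n", length $octets;

# Decoding the octets gives back the original character string.
my $round_trip = decode('UTF-8', $octets);
print "round trip ok\n" if $round_trip eq $chars;
```

The same text has a different length depending on which kind of string
holds it, which is why it matters which one ends up in the scalar.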

One obvious way to conceal the original bug would be to ignore ENCODING
when OUTPUT is a scalar. This would ensure that the program given in the
bug report didn't fail. (Alternatively, the incompatibility could be made
explicit, with a 'die' added for the case where they're both passed in.)
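
As a rough sketch of the explicit version: check_output_args below is a
made-up helper, not anything in XML::Writer, and the message wording is
only a suggestion. It assumes the constructor parameters arrive as a
simple key/value list.

```perl
use strict;
use warnings;

# Hypothetical check, not part of XML::Writer: refuse the combination
# of ENCODING with a scalar-reference OUTPUT instead of failing later.
sub check_output_args {
    my (%params) = @_;
    if (defined $params{ENCODING} && ref($params{OUTPUT}) eq 'SCALAR') {
        die "ENCODING cannot be combined with a scalar OUTPUT\n";
    }
    return 1;
}

my $xml = '';
my $ok = eval { check_output_args(OUTPUT => \$xml, ENCODING => 'utf-8'); 1 };
print "rejected: $@" unless $ok;
```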

Or, the strings could be encoded before appending them to the scalar.
This would mean that OUTPUT ended up as octets, rather than Unicode.
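
Roughly like this, using the core Encode module. The
make_encoding_appender helper is hypothetical and only stands in for
whatever the library would do internally when appending.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper, not part of XML::Writer: encode each chunk
# before appending, so the target scalar only ever holds octets.
sub make_encoding_appender {
    my ($target_ref, $encoding) = @_;
    return sub {
        my ($chunk) = @_;
        $$target_ref .= encode($encoding, $chunk);
    };
}

my $buffer = '';
my $append = make_encoding_appender(\$buffer, 'UTF-8');
$append->('<name>');
$append->("Andr\x{e9}");    # e-acute becomes two octets in UTF-8
$append->('</name>');

printf "%d octets\n", length $buffer;
```

The caller then has bytes ready to write to a socket, but can no longer
treat the scalar as text without decoding it first.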

Alternatively, OUTPUT could be built up as a Unicode string and given a
method (asBytes, say) that would do the encoding and return the octets.
I like this solution because it requires the developer to be explicit
about what they're expecting. (Perhaps with a complementary asString.)
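
A minimal sketch of what I have in mind. CharacterBuffer and its append
method are invented for this example and exist nowhere in XML::Writer;
only asBytes and asString are the names suggested above.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical output buffer, not part of XML::Writer. It accumulates
# characters and only encodes when the caller asks for bytes.
package CharacterBuffer;

sub new {
    my ($class, %args) = @_;
    my $self = {
        encoding => $args{ENCODING} || 'UTF-8',
        text     => '',
    };
    return bless $self, $class;
}

# Append chunks of characters; nothing is encoded here.
sub append {
    my ($self, @chunks) = @_;
    $self->{text} .= join '', @chunks;
    return 1;
}

# The accumulated output as a Unicode (character) string.
sub asString { return $_[0]{text} }

# The accumulated output as octets in the configured encoding.
sub asBytes {
    my ($self) = @_;
    return Encode::encode($self->{encoding}, $self->{text});
}

package main;

my $out = CharacterBuffer->new(ENCODING => 'UTF-8');
$out->append('<name>', "Andr\x{e9}", '</name>');
printf "%d characters, %d octets\n",
    length $out->asString, length $out->asBytes;
```

The encoding is applied in exactly one place, and the caller chooses
between characters and octets at the point of use.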

The real purpose of the ENCODING parameter is as a marker to indicate
that the developer using the library is aware of Unicode and character
encoding. Given that they are, it's not obvious to me which behaviour
they'd expect when appending to a scalar. The main use-cases I would
expect are storing XML in a database and sending it over a socket, so
those should be convenient.

Sorry for the lengthy response - none of those solutions is more
than about five lines, and it's really a question of policy rather than
mechanism.
-- 
------------------------------------------------------------ Joseph Walton --
---------------- "Dude, you know you are just making up animal names now." --

