On 11/23/05, Joe Orton <[EMAIL PROTECTED]> wrote:
> On Sun, Nov 20, 2005 at 09:53:50AM -0500, Jeff Trawick wrote:
> > On input path, ap_xml_parse_input() handles converting xml to native
> > charset (at least in 2.2). On output, there is no provision for
> > converting xml in responses.
>
> OK, pop quiz: how is a Unicode XML document getting converted into
> EBCDIC on input without losing most of the character set along the way?

Unclear to me, at least...

For this code:

server/util_xml.c::ap_xml_parse_input():
...
#if APR_CHARSET_EBCDIC
    apr_xml_parser_convert_doc(r->pool, *pdoc, ap_hdrs_from_ascii);
#endif
...

The xml library apparently parses the input well enough to understand
the nodes. After that, it looks like the charset translation specified
here (ap_hdrs_from_ascii) should instead use the real charset specified
by the client. As it is, interesting* characters won't be handled
correctly.

No idea here on how interesting* chars in resource names are to be
handled on an EBCDIC box. Preserving them in UTF-8 would be optimal, but
reporting these in log files wouldn't work too well without major
modifications.

*characters not in some lowest-common-denominator EBCDIC (something
like 7-bit ASCII)
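
To make the "interesting characters" point concrete, here is a rough sketch of how one might check whether a UTF-8 buffer contains anything outside 7-bit ASCII, i.e. anything a byte-for-byte ASCII-to-EBCDIC table could not carry across intact. The function names are hypothetical and are not part of httpd or APR; this only illustrates the problem and is not a proposed fix.

```c
#include <stddef.h>

/* Hypothetical helper (not an httpd/APR function): count the bytes in a
 * UTF-8 buffer that lie outside 7-bit ASCII. Every character beyond
 * U+007F is encoded in UTF-8 using only bytes >= 0x80, so a nonzero
 * count means the text holds characters that a single-byte
 * ASCII-to-EBCDIC translation table cannot represent faithfully. */
static size_t count_non_ascii_bytes(const char *buf, size_t len)
{
    size_t i;
    size_t count = 0;

    for (i = 0; i < len; i++) {
        if ((unsigned char)buf[i] > 0x7F) {
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: nonzero if the buffer is pure 7-bit ASCII and so
 * could pass through a byte-wise translation without loss. */
static int is_ascii_only(const char *buf, size_t len)
{
    return count_non_ascii_bytes(buf, len) == 0;
}
```

A resource name such as "index.html" passes this check, while one containing an accented letter does not, because that letter is encoded as two bytes that are both above 0x7F.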