http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5083
------- Additional Comments From [EMAIL PROTECTED] 2006-09-05 15:33 -------

No, that doesn't work unfortunately. Leaving $msg_resp flagged as native Perl Unicode causes the length() function to give the number of Unicode characters, and hence the Content-Length header sent back to the client doesn't match the number of UTF-8 bytes in the response body. spamc whinges, although other clients might either be using protocol 1.2 or below (no Content-Length) or just ignore the discrepancy and carry on anyway.

How about explicitly turning *off* the :utf8 layer on the socket (ie. stop the magic that appears to be happening on Linux but not on BSD), and do the UTF-8 conversion as in the original patch? At least that way we won't be potentially double-encoding anything.

Or we could use "length(Encode::encode_utf8($msg_resp))" or "use bytes; length($msg_resp)" to try to ensure the Content-Length matches the data generated by the :utf8 output layer. However, that feels even hackier than doing the conversion ourselves once, and then using the same resulting byte-string both for Content-Length and for verbatim output.

I agree that in general we should let Perl get on with the details of character conversion, but one other advantage of doing it ourselves in the *particular* case of spamd, rather than relying on PerlIO layers, is that we can control the encoding more precisely. This would be useful if announcement of the encoding becomes part of the spamc protocol (ie. Content-Transfer-Encoding). We can also be more certain about when the encoding happens, which might be an issue if a different encoding is needed in each direction: as you pointed out, even with the current API the best place for a binmode() call is *after* the client request has been read, since binmode() will apply the translation layer to both input and output, and it's only output we want to be UTF-8-encoded.
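
To make the "encode once, reuse the bytes" idea concrete, here is a rough sketch. It is not the actual patch or the real spamd code: build_response() is a hypothetical helper, the header is reduced to a bare Content-Length line, and STDOUT stands in for the client socket. It only illustrates why length() on the character string and length() on the encoded byte string disagree, and how encoding once keeps the header and the body in step.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not from spamd): encode the response once and
# derive Content-Length from the very same byte string we will send.
sub build_response {
    my ($msg_resp) = @_;                         # character string, may be UTF-8 flagged
    my $body   = Encode::encode_utf8($msg_resp); # byte string, no UTF-8 flag
    my $header = sprintf("Content-Length: %d\r\n\r\n", length($body));
    return ($header, $body);
}

# Sample text: "caf", e-acute, space, U+263A. Six characters in total.
my $msg_resp = "caf\x{e9} \x{263a}";
my ($header, $body) = build_response($msg_resp);

# length() counts characters on the flagged string (6) but bytes on the
# encoded one (9: e-acute takes 2 bytes, U+263A takes 3).
my $chars = length($msg_resp);
my $bytes = length($body);

# STDOUT stands in for the client socket here. Turn off any encoding
# layer so the already-encoded bytes go out verbatim, not double-encoded.
binmode(STDOUT, ':raw');
print $header, $body;
```

With the sample string above, the header announces 9 bytes, which is what actually goes out on the wire; taking length() of the unencoded string would have announced 6.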
