How exactly should we handle I18N in Freenet? The HTTP spec says that
text/html defaults to the charset "ISO-8859-1". To prevent ambiguity in
the filter, we need to set the charset explicitly in the Content-Type
header that we send back to the browser. The first question is whether
this will force the browser to use ISO-8859-1, or whether, IE-style, it
will autodetect anyway and use whatever it thinks the content looks
like.

Ideally we would always send an explicit charset, and allow any charset
to be specified as long as java.io.InputStreamReader knows about it and
we can therefore filter it. The problem is that the browser may try to
read the page as a different charset from the one we filtered it as. So
either we assume that the browser will honour an explicit charset in the
Content-Type header, or we have to put in autodetection code for every
conceivable charset, starting with the UTF-16 patch I recently hacked
up. So what do we do?
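
To make the first option concrete, here is a minimal sketch, assuming
the browser honours an explicit charset. It is not existing Freenet
code: the class name CharsetSketch and its methods are hypothetical, and
it uses the java.nio.charset API to ask the JVM whether it knows a
charset. It accepts any charset the JVM can decode, falls back to
ISO-8859-1 otherwise, decodes through InputStreamReader, and always
builds an explicit Content-Type value.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class CharsetSketch {
    // The HTTP default for text/html mentioned above.
    static final Charset DEFAULT = StandardCharsets.ISO_8859_1;

    // Returns the declared charset if this JVM supports it, else the default.
    static Charset resolveCharset(String declared) {
        if (declared == null) {
            return DEFAULT;
        }
        try {
            return Charset.forName(declared.trim());
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return DEFAULT;
        }
    }

    // Builds the header value so the charset is never left implicit.
    static String contentType(String declared) {
        return "text/html; charset=" + resolveCharset(declared).name();
    }

    // Decodes the raw bytes with the same charset that will be advertised,
    // so the filter sees the text the browser is told to expect.
    static String decode(byte[] data, String declared) {
        Charset cs = resolveCharset(declared);
        StringBuilder out = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), cs)) {
            char[] buf = new char[1024];
            int n;
            while ((n = r.read(buf)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

The point of the design is that decoding and the header both go through
resolveCharset, so the filter and the browser are at least told the same
thing. Whether a browser that autodetects will respect that is exactly
the open question above.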
-- 
Matthew Toseland
toad at amphibian.dyndns.org
amphibian at users.sourceforge.net
Freenet/Coldstore open source hacker.
Employed full time by Freenet Project Inc. from 11/9/02 to 11/1/03
http://freenetproject.org/