Hi,
I don't know how to set up a combination of the latest AOLserver
(using the nsd8x interpreter) and a Postgres 7.2 database that allows
me to work safely with the iso8859-1 charset. Please don't throw
stones, I know this has been discussed very often ;-)
How do I do it?
The problems I run into during my own tests and the problems that
other people have (I read through some threads on various boards) are:
A.
Using the latest server out of the box, working with most European
characters will almost always fail with the typical string and regexp
functions, which internally use utf-8 ... _IF_ you are returning (in the
http header or meta tags) a charset of e.g. iso8859-1 (you have to
know what comes in from forms/submits; you try to return umlauts
or other language-dependent chars and tell the browser about it).
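
To make the mismatch in (A) concrete, here is a minimal sketch in plain
Python (not Tcl and not AOLserver code, just an illustration under the
assumption that the string layer emits utf-8 bytes while the response
declares iso8859-1):

```python
# Illustration only: plain Python, not AOLserver or Tcl code.
text = "M\u00fcller"  # "Mueller" written with a u-umlaut

utf8_bytes = text.encode("utf-8")         # what a utf-8 based string layer emits
latin1_bytes = text.encode("iso-8859-1")  # what the declared charset promises

# A browser that was told "charset=iso-8859-1" decodes the utf-8 bytes
# one byte per character, so the two-byte umlaut turns into two characters.
as_seen_by_browser = utf8_bytes.decode("iso-8859-1")

print(len(utf8_bytes), len(latin1_bytes))  # the utf-8 form is one byte longer
print(as_seen_by_browser)                  # garbled umlaut
```

The same bytes are fine as soon as the declared charset matches the
encoding that actually goes over the wire.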
B.
I got the tip (last year on this list) to use an undocumented
parameter in the config file that maps the iso8859-1 charset to
.adp files.
But this does not guarantee that all characters that _leave_ an
adp will be in iso8859-1 encoding (e.g. if I use combinations of
ns_adp_parse -file (where strings from the DB are regexped and
stringed and whatever) and return the string with ns_return).
At least if everything that comes in goes iso-encoded to the DB, you
could work around this and translate all outgoing special chars to
&#123;-style html codes.
(There was an example of a ns_adp_puts function that does this
given by Harray Moreau on the list)... If you escape all the characters
that way, I assume, you would no longer have to return a
charset header or charset meta tag.
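
The escaping idea in (B) can be sketched like this in plain Python. This
is not the ns_adp_puts function from the list; escape_non_ascii is a
hypothetical helper name, and the sketch assumes the string has already
been decoded correctly:

```python
def escape_non_ascii(s: str) -> str:
    """Replace each non-ASCII character with a decimal numeric
    character reference (&#NNN;), leaving ASCII untouched."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in s)

escaped = escape_non_ascii("Gr\u00fc\u00dfe")  # German "Gruesse" with umlaut and sharp s
print(escaped)

# The result is pure ASCII, so it reads the same under iso8859-1 and utf-8.
escaped.encode("ascii")
```

Note that this only helps for text in the html body; form input coming
back from the browser is not affected by it.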
C.
The uncool way: Using charset=utf-8 outgoing, then also
expecting it incoming. A special character should come in as unicode
and tcl should treat it this way. The database must be installed with
unicode encoding. You will run into performance problems and/or, maybe,
some unresolved issues in the Postgres unicode implementation as well.
I have not tried this yet, I merely assume this would work. What's
your opinion on that?
D.
Using an AD-patched version of AOLserver. The problem: it's an
older version of the server. Will it be kept up to date in the future?
(Of course, there's a large user base running it.)
E.
I did not try --enable-recode and setting up a charset table for
utf8 - latin1 and latin1 - utf8 for Postgres. Maybe this would
work if you can guarantee all charsets coming from AOLserver
are coming as utf8 or iso8859x. Maybe, don't know.
I tried putting SET ENCODING TO 'UNICODE' or 'LATIN1' in front
of every SQL statement in my test api (and took into account that it
may make a difference whether you are SELECTing or INSERTing) to tell
the Postgres server that the client uses unicode or latin, with and
without the undocumented feature of (B). This only led to error
messages notifying me of failed encoding translations. Every insert
_always_ was logged with German special characters (umlauts) and never
as unicode characters (don't know if this is correct).
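
One possible way such a translation can fail (an assumption, not
necessarily what happened in my tests): the client announces unicode
but the bytes it sends are really iso8859-1. A minimal sketch in plain
Python, nothing Postgres-specific:

```python
# Illustration only: a latin1-encoded umlaut is not valid utf-8 on its own.
latin1_bytes = "\u00fc".encode("iso-8859-1")  # u-umlaut as the single byte 0xFC

try:
    latin1_bytes.decode("utf-8")  # pretend the bytes were declared as unicode
    translated = True
except UnicodeDecodeError:
    translated = False            # the converter rejects the byte sequence

print(translated)
```

If that is the cause, the declared client encoding and the bytes
actually sent would have to agree before any conversion can work.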
Will AOLserver 4 come with I18N support that solves all these problems,
and what should you do if you need a solution here and now?
Solution B? C?
Thanks for reading through this one...
Bernd.