Hi,
IANAP, but it seems that we will have to seriously consider doing the
right thing and staying with iconv for all encodings except the few
supported internally by expat/sablotron. This would obviously mean
working with the expat people to build iconv into expat for input
conversion, passing only UTF from expat to sablotron, and using iconv
again at the end for output translation. IMHO this is the only robust
long-term solution. The only drawback is systems where iconv is not
available or is somehow broken. These would be limited to the internal
character sets, right?
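
To make the shape of that pipeline concrete, here is a rough sketch. It is
written in Python rather than C, and the codecs module stands in for iconv on
both ends; that substitution, and the function name transcode_text, are
assumptions for illustration only and say nothing about how expat or
sablotron would actually be wired up. The input is converted to UTF-8 chunk
by chunk, expat only ever sees UTF-8, and the result is converted once more
on the way out.

```python
import codecs
import xml.parsers.expat


def transcode_text(chunks, src_encoding, dst_encoding):
    """Hypothetical helper: stream XML through expat as UTF-8 only.

    chunks is an iterable of bytes in src_encoding; a chunk boundary may
    fall in the middle of a multi-byte character. Returns the document's
    character data encoded in dst_encoding.
    """
    # Input side (stand-in for iconv): an incremental decoder keeps its
    # state between calls, so split multi-byte sequences are handled.
    decoder = codecs.getincrementaldecoder(src_encoding)()

    # An explicit encoding overrides whatever the XML declaration says,
    # which is what we want: the parser is only ever fed UTF-8.
    parser = xml.parsers.expat.ParserCreate(encoding="utf-8")

    pieces = []
    # expat may deliver one text node in several calls; collect them all.
    parser.CharacterDataHandler = pieces.append

    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            parser.Parse(text.encode("utf-8"), False)

    # Flush the decoder and tell the parser the input is complete.
    tail = decoder.decode(b"", final=True)
    parser.Parse(tail.encode("utf-8"), True)

    # Output side (again a stand-in for iconv).
    return "".join(pieces).encode(dst_encoding)
```

Because the input is fed in chunks, the whole file never has to sit in
memory. On a system without a working converter, only the encodings the
parser handles natively would remain usable.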
Now I am ducking to avoid the stones from people who know what they are
discussing.
SvZ> I would prefer a solution based on either iconv or ICU lib by IBM. It would
SvZ> be nice if we could use this in a stream, so we do not have to have the
SvZ> entire XML file in memory.
SvZ> Apache 2.0 and APR use iconv. I know they have also discussed using ICU; I
SvZ> will have to dig through and find out their reasoning and figure out what
SvZ> would be the best for Sab. We should also ask William Rowe, since he was
SvZ> the one that imported the libs for Apache.
SvZ> Also, with regard to Expat: maybe there should be some development on the
SvZ> new expat that would allow for an option of iconv to be used instead of
SvZ> the tables. I am sure this is going to be an issue for Greg Stein as well
SvZ> sooner or later.
Pavel mailto:[EMAIL PROTECTED]