Hi,

IANAP, but it seems that we will have to seriously consider doing the
right thing and staying with iconv for all encodings except the few
supported internally by expat/sablotron. This would obviously mean
working with the expat people to build iconv into expat for input
conversion, passing only UTF from expat to sablotron, and at the end
using iconv again for output translation. This is the only robust,
long-term solution IMHO. The only drawback is systems where iconv is
not available or is somehow broken. Those would be limited to the
internal character sets, right?
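
To make the pipeline concrete, here is a rough sketch of the three stages
(convert input to UTF, parse, convert output). It is only an illustration:
Python's codecs module stands in for iconv, Python's expat binding stands
in for the expat/sablotron pair, and the function names transcode_chunks
and parse_stream are made up for this example. It works chunk by chunk, so
the whole file never has to sit in memory.

```python
import codecs
import xml.parsers.expat


def transcode_chunks(chunks, src_encoding):
    """Yield UTF-8 bytes for each input chunk (stand-in for iconv on input)."""
    # An incremental decoder keeps partial multi-byte sequences between chunks.
    decoder = codecs.getincrementaldecoder(src_encoding)()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text.encode("utf-8")
    # Flush the decoder; raises if the input ended inside a character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail.encode("utf-8")


def parse_stream(chunks, src_encoding, out_encoding):
    """Parse XML from byte chunks and return its text in out_encoding."""
    collected = []
    # Tell expat the input is UTF-8, since we converted it ourselves.
    parser = xml.parsers.expat.ParserCreate("UTF-8")
    parser.CharacterDataHandler = collected.append
    for utf8_chunk in transcode_chunks(chunks, src_encoding):
        parser.Parse(utf8_chunk, False)
    parser.Parse(b"", True)  # signal end of document
    # Output translation (stand-in for the second iconv pass).
    return "".join(collected).encode(out_encoding)


if __name__ == "__main__":
    raw = "<a>žluťoučký kůň</a>".encode("iso-8859-2")
    pieces = [raw[i:i + 5] for i in range(0, len(raw), 5)]
    print(parse_stream(pieces, "iso-8859-2", "utf-8"))
```

The parser only ever sees UTF-8 here, so the set of supported input and
output encodings depends entirely on the conversion layer, which is the
point of the proposal.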

Now I am ducking to avoid the stones from people who know what they are
discussing.


SvZ> I would prefer a solution based on either iconv or ICU lib by IBM. It would
SvZ> be nice if we could use this in a stream, so we do not have to have the 
SvZ> entire XML file in memory.

SvZ> Apache 2.0 and APR use iconv. I know they have also discussed using ICU, I
SvZ> will have to dig through and find out their reasoning and figure out what
SvZ> would be the best for Sab. We should also ask William Rowe, since he was
SvZ> the one that imported the libs for Apache.

SvZ> Also in regards to Expat. Maybe there should be some development on the
SvZ> new expat that would allow for an option of iconv to be used instead of
SvZ> the tables. I am sure this is going to be an issue for Greg Stein as well
SvZ> sooner or later.




Pavel                          mailto:[EMAIL PROTECTED]

