Friends

Why not use iconv?  

I have been experimenting with it because I need character sets that
libxml cannot handle.
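
Roughly what I have in mind, as a minimal sketch and not a patch. The
helper name convert_charset is hypothetical (it is not a Kannel function),
and it uses plain C buffers instead of Octstr. It relies only on the
standard iconv_open(), iconv() and iconv_close() calls.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Kannel: convert in_len bytes of `in`
 * from charset `from` to charset `to`. Returns a malloc'ed, NUL-terminated
 * buffer and stores its length in *out_len, or returns NULL on failure.
 * The caller frees the result. */
char *convert_charset(const char *from, const char *to,
                      const char *in, size_t in_len, size_t *out_len)
{
    assert(in != NULL && out_len != NULL);

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return NULL;                 /* this charset pair is not supported */

    /* Generous bound: at most 4 output bytes per input byte, plus room
     * for a byte-order mark or a final shift sequence. */
    size_t cap = in_len * 4 + 4;
    char *out = malloc(cap + 1);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *src = (char *)in;          /* iconv() takes char **, input is not modified */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = cap;

    /* Convert the whole input, then flush any pending shift state. */
    if (iconv(cd, &src, &src_left, &dst, &dst_left) == (size_t)-1 ||
        iconv(cd, NULL, NULL, &dst, &dst_left) == (size_t)-1) {
        free(out);
        iconv_close(cd);
        return NULL;                 /* invalid, incomplete or unconvertible input */
    }

    iconv_close(cd);
    *out_len = (size_t)(dst - out);
    out[*out_len] = '\0';
    return out;
}
```

For example, convert_charset("ISO-8859-7", "UTF-8", "\xCC", 1, &n) should
give the two bytes 0xCE 0x9C for a capital Mu. As far as I know, glibc's
iconv() fails with EILSEQ when a character has no equivalent in the target
charset, and does not substitute an XML-style character reference, but
that behaviour is implementation-defined.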

Worik


Bruno David Simões Rodrigues <[EMAIL PROTECTED]> writes:

> On Wed, 2001-11-21 at 17:34, Nektarios K. Papadopoulos wrote:
> Yes, it should be a bug.
> When I coded it, I asked for help with this part because, first, I didn't
> know the xml_* functions and, second, I usually only use iso-8859-1 or
> ucs-2 directly.
> That's why there are so many debug lines around it.
> Feel free to correct it.
> BTW, there should be a bug somewhere in this code that panics. I've seen it
> once, but I don't recall what I did (besides passing some different
> charsets and codings).
> 
>           Andreas Fink wrote:
> > 
> > >  > Index: gw/smsbox.c
> > >>  ===================================================================
> > >>  RCS file: /home/cvs/gateway/gw/smsbox.c,v
> > >>  retrieving revision 1.156
> > >>  diff -r1.156 smsbox.c
> > >>  1392,1395d1391
> > >>  <       if (charset_processing(charset, &body, coding) == -1) {
> > >>  <           *status = 415;
> > >>  <           ret = octstr_create("Charset or body misformed, rejected");
> > >>  <       }
> > >
> > >votings from the smsbox hackers for the proposed change?! Andreas?
> > >Nick?
> > 
> > If it's a bug, let's fix it. I had a user complaining that he has
> > problems with Greek characters. Sounds like the source of the problem.
> > 
> 
> Actually, this is a bug (I think) that I found while trying to solve the
> problem with Greek characters.
> 
> Removing this line is not enough.
> 
> The code in charset_processing does well when coding==DC_UCS2 (well, this
> is the easy case).
> 
> It also does well when coding==DC_7BIT and charset=="ISO-8859-1" (well,
> that is even easier: just do nothing).
> 
> But when coding==DC_7BIT and charset!="ISO-8859-1" it seems to be trying
> to do something like this:
> first, encode to UTF-8
> then, convert UTF-8 to ISO-8859-1
> always using libxml calls.
> 
> Actually the code for UTF-8 to ISO-8859-1 is wrong and commented out.
> /* UTF-8 to ISO-8859-1 */
> /*
>     charset = octstr_create("ISO-8859-1");
>     if (charset_from_utf8(new_body, &temp, charset) >= 0) {
>         octstr_destroy(new_body);
>         new_body = temp;
>         octstr_dump(new_body, 0);
>         octstr_destroy(charset);
>     } else {
>         octstr_destroy(charset);
>         octstr_destroy(new_body);
>         return NULL;
>     }
>     debug("sms.http", 0, "coding=7bit, after iso8859-1, msgdata is %s",
>           octstr_get_cstr(new_body));
> */
> 
> Anyway, it would *NOT* do the job. libxml maps any character that does not
> map directly to ISO-8859-1 to something like &#x39C; (the XML way),
> which is not good!
> 
> I am working on a solution for the Greek characters, which must be
> relatively easy since the GSM default alphabet has all the Greek capital
> letters.
> 
> But I don't know how to give a more general solution for all the possible
> charsets (other than ISO-8859-7, which is for Greek).
> 
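
On the Greek capitals mentioned above, a table lookup might be enough.
This is only a sketch: both function names are hypothetical and nothing
here is taken from Kannel. It assumes the input is ISO-8859-7 and the
output is unpacked GSM 03.38 default-alphabet codes. Capitals that look
like Latin letters (Alpha, Beta, Epsilon and so on) are sent to the Latin
letter, and the rest use the GSM codes 0x10 to 0x1A. Lowercase letters are
not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: map one ISO-8859-7 byte holding a Greek capital
 * letter (0xC1..0xD9) to its GSM 03.38 default-alphabet code.
 * Returns -1 for any other byte, including the unassigned 0xD2. */
int greek_capital_to_gsm(unsigned char c)
{
    static const signed char table[0xD9 - 0xC1 + 1] = {
        /* 0xC1 Alpha .. 0xCF Omicron */
        'A', 'B', 0x13, 0x10, 'E', 'Z', 'H', 0x19,
        'I', 'K', 0x14, 'M', 'N', 0x1A, 'O',
        /* 0xD0 Pi .. 0xD9 Omega (0xD2 is unassigned) */
        0x16, 'P', -1, 0x18, 'T', 'Y', 0x12, 'X', 0x17, 0x15
    };

    if (c < 0xC1 || c > 0xD9)
        return -1;
    return table[c - 0xC1];
}

/* Hypothetical helper: convert up to len bytes from `in` to `out`.
 * Stops at the first byte that is not a Greek capital and returns the
 * number of bytes converted. `out` must have room for len bytes. */
size_t greek_capitals_to_gsm(const unsigned char *in, size_t len,
                             unsigned char *out)
{
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        int g = greek_capital_to_gsm(in[i]);
        if (g < 0)
            break;
        out[i] = (unsigned char)g;
    }
    return i;
}
```

This does nothing for the other charsets, which is where iconv would come
in.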

-- 
                                                      Worik Macky Turei Stanton
Whew!                                                        [EMAIL PROTECTED]
                                                                       Aotearoa
