I imagine my problem stems from my own ignorance of how character encodings 
work and how libxml2 handles them, but I’m growing frustrated with my inability 
to figure it out, so I thought I’d ask the list for advice.

Given this small program:

/*
 * author:   Lucas Brasilino <brasil...@recife.pe.gov.br>
 * copy:     see Copyright for the status of this software
 * hacked up by Fred Smith to illustrate a problem I'm having.
 */

#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

int
main(int argc, char **argv)
{
    xmlDocPtr doc = NULL;       /* document pointer */
    xmlChar * convstr;          /* escaped copy returned by libxml2 */
    char tststr[40];            /* test string containing the raw 0xC9 byte */

    LIBXML_TEST_VERSION;
    doc = xmlNewDoc(BAD_CAST "1.0");
    snprintf (tststr, sizeof(tststr), "Test %c Test", 0xC9);
    convstr = xmlEncodeEntitiesReentrant (doc, (xmlChar *)tststr);
    if (convstr)
    {
        printf ("tststr:  %s\n", tststr);
        printf ("convstr: %s\n", convstr);
        xmlFree (convstr);      /* strings allocated by libxml2 are released with xmlFree() */
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    xmlMemoryDump();
    return(0);
}

I get this output:

$ ./tree
tststr:  Test � Test
convstr: Test &#x260;Test

hexdump reveals it as:

000000: 73 74 73 74 72 3a 20 20 54 65 73 74 20 c9 20 54    ststr:  Test . T
000010: 65 73 74 0a 63 6f 6e 76 73 74 72 3a 20 54 65 73    est.convstr: Tes
000020: 74 20 26 23 78 32 36 30 3b 54 65 73 74 0a          t &#x260;Test.

Now, I’m puzzled because the output from xmlEncodeEntitiesReentrant() seems 
clearly (to me) to be wrong. First, it has swallowed not only the 0xC9 but also 
the character following it (the space). Just as bad, the app that should be 
receiving this cannot reconstruct the actual Unicode code point that appeared 
in the original text (i.e., 0xC9, which represents a capital E with acute 
accent).

I’m sure I’m doing something wrong here, but I am unable to see it, so your 
advice will be appreciated.

Thanks in advance!

Fred Smith


