OK, but in this case it really depends on your input XML format and what you 
consider "useless".

If you only have "locally" useless whitespace, as here:

<txt>     Some text </txt>

and you want to get <txt>Some text</txt>, you can still use the traversal from 
the function below and "strip" every text node with a C function (I don't think 
a standard C function exists for that).
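
As a minimal sketch of such a "strip" helper: XML 1.0 treats only four
characters as white space (space, tab, line feed and carriage return), so the
test is short. The names strip_xml_space and is_xml_space are hypothetical, not
libxml2 functions, and the code works on a plain NUL-terminated C string; it is
not wired into libxml2 here.

```c
#include <assert.h>
#include <string.h>

/* The four characters that XML 1.0 treats as white space:
   space, tab, line feed and carriage return. */
static int
is_xml_space (char c)
{
  return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Remove leading and trailing XML white space from s, in place.
   Returns s so that the call can be used inside an expression. */
char *
strip_xml_space (char *s)
{
  size_t start = 0;
  size_t end;

  assert (s != NULL);

  /* Find the end of the text, ignoring trailing white space. */
  end = strlen (s);
  while (end > 0 && is_xml_space (s[end - 1]))
    end--;

  /* Find the start of the text, ignoring leading white space. */
  while (start < end && is_xml_space (s[start]))
    start++;

  /* Shift the kept characters to the front and terminate the string. */
  memmove (s, s + start, end - start);
  s[end - start] = '\0';
  return s;
}
```

With the example above, the string "     Some text " becomes "Some text". To
apply it to a document you would call it on the content of each text node
during the same kind of traversal; that part is left out here and untested.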

But if you have mixed content like:

<p>  Some text<b><i> in bold italics </i></b>and continuing</p>

it is tricky to define "useless" whitespace in a recursive descent because the 
decision is not local (here, clearly, the whitespace inside the <i> element 
should not be removed, since it separates words across element boundaries).
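
One conservative local policy, assuming a single space is always acceptable
wherever a run of whitespace occurred, is to collapse runs instead of stripping
them. This is only a sketch under that assumption, not a general answer to the
mixed-content problem, and collapse_xml_space is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Collapse every run of XML white space (space, tab, LF, CR) in s to a
   single space, in place.  Leading and trailing runs are reduced to one
   space rather than removed.  Returns s. */
char *
collapse_xml_space (char *s)
{
  char *src;
  char *dst;
  int in_run = 0;               /* non-zero while inside a white space run */

  assert (s != NULL);

  for (src = dst = s; *src != '\0'; src++)
    {
      /* *src is never NUL here, so strchr cannot match the terminator. */
      if (strchr (" \t\n\r", *src) == NULL)
        {
          *dst++ = *src;        /* ordinary character: copy it */
          in_run = 0;
        }
      else if (!in_run)
        {
          *dst++ = ' ';         /* first white space of a run: keep one */
          in_run = 1;
        }
      /* further white space in the same run is dropped */
    }
  *dst = '\0';
  return s;
}
```

On the example above, "  Some text" becomes " Some text" and
" in bold italics " is left unchanged, so the words on either side of the <b>
element stay separated.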

So, depending on your definition of "useless", the difficulty of the answer can 
range from very simple to very complicated.

Best regards,

Georges-André SILBER
LUXIA

On 16 Feb 2012, at 09:32, spam.spam.spam.s...@free.fr wrote:

> I think this function removes blank nodes.
> That's not exactly what I want.
> I want to strip useless whitespace from text nodes.
> These nodes aren't considered blank nodes because they also contain 
> visible characters.
> 
> ----- Original Message -----
> From: "Georges-André SILBER" <gasil...@luxia.fr>
> To: "spam spam spam spam" <spam.spam.spam.s...@free.fr>
> Cc: xml@gnome.org
> Sent: Thursday, 16 February 2012 09:14:17
> Subject: Re: [xml] Remove whitespaces from text nodes
> 
> Hi,
> 
> I wrote a small function for this purpose some time ago.
> I didn't test it with the latest versions of libxml2, nor did I ensure that 
> this code is correct, but it gives you the idea of a method that you can use 
> to remove blank nodes.
> 
> Usage is, for instance:
> 
> doc = xmlReadFile (xmlfile, NULL, 0);
> if (doc == NULL)
>   {
>     /* Deal with error... */
>     return 1;
>   }
> glbRemoveBlankNodes (xmlDocGetRootElement (doc));
> 
> Hope this helps,
> 
> Best regards,
> 
> Georges-André SILBER
> LUXIA
> 
> /* Recursively remove blank (whitespace-only) text nodes below n.
>    Always returns 0. */
> int
> glbRemoveBlankNodes (xmlNodePtr n)
> {
>  xmlNodePtr cur;
>  xmlNodePtr next;
> 
>  if (n == NULL)
>    return 0;
> 
>  cur = n->children;
>  while (cur)
>    {
>      /* Save the sibling first: cur may be freed below. */
>      next = cur->next;
>      if (xmlIsBlankNode (cur))
>       {
>         xmlUnlinkNode (cur);
>         xmlFreeNode (cur);
>       }
>      else
>       glbRemoveBlankNodes (cur);
>      cur = next;
>    }
> 
>  return 0;
> }
> 
> 
> On 16 Feb 2012, at 08:57, spam.spam.spam.s...@free.fr wrote:
> 
>> Yes, you are right.
>> But I am not sure my function will do a good job.
>> I know some whitespace characters (" ", "\t", ...), but I am not sure that 
>> I know all of them.
>> My function will probably forget to strip some whitespace...
>> This is the reason why I would like to use an already defined function.
>> 
>> Is there a function which does this job?
>> 
>> ----- Original Message -----
>> From: "Liam R E Quin" <l...@holoweb.net>
>> To: "spam spam spam spam" <spam.spam.spam.s...@free.fr>
>> Cc: xml@gnome.org
>> Sent: Thursday, 16 February 2012 08:40:31
>> Subject: Re: [xml] Remove whitespaces from text nodes
>> 
>> On Thu, 2012-02-16 at 08:28 +0100, spam.spam.spam.s...@free.fr wrote:
>>> [...].
>>> Anyway, there seems to be no other solution with libxml2 only.
>> 
>> The spaces are part of the text of the document, so it's not likely that
>> a conformant XML parser will strip them for you.
>> 
>> You could of course remove the spaces in C after parsing, just as if you
>> decided to remove every occurrence of an upper-case "B" from the input.
>> 
>> That's just standard C string processing.
>> 
>> -- 
>> Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
>> Pictures from old books: http://fromoldbooks.org/
>> 
>> _______________________________________________
>> xml mailing list, project page  http://xmlsoft.org/
>> xml@gnome.org
>> http://mail.gnome.org/mailman/listinfo/xml