On Thu, Dec 10, 2009 at 1:23 AM, Daniel Veillard <veill...@redhat.com> wrote:
> On Wed, Dec 09, 2009 at 08:36:59AM -0800, Aaron Patterson wrote:
>> On Wed, Dec 9, 2009 at 7:54 AM, Daniel Veillard <veill...@redhat.com> wrote:
>> > On Sat, Dec 05, 2009 at 11:03:26AM -0800, Aaron Patterson wrote:
>> >> Hey everyone,
>> >>
>> >> It looks like sometimes there is unexpected behavior when parsing with
>> >> XML_PARSE_NOBLANKS.  It seems that sometimes blank nodes will get
>> >> included in the resulting tree.  I don't think this is expected
>> >
>> >  Once libxml2 has detected a non-blank text node at a given level, it
>> > keeps all further text nodes there, assuming a mixed content element.
>>
>> Understood.  Thank you!
>
>  This tends to surprise people, but since blank node elimination without
> having read the DTD is a pure heuristic, the parser tries to be as safe as
> possible (though it's not possible to go back on nodes already parsed).
> In general XML_PARSE_NOBLANKS is a deviation from the normal parsing
> behaviour, so I suggest avoiding it and just ignoring the nodes
> you know are purely formatting; the parser can't guess that 100%.
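
A rough sketch of the "ignore the nodes you know are purely formatting"
approach, for illustration only. It uses Python's standard-library
xml.dom.minidom rather than libxml2 itself, and the helper name
strip_formatting_text and the sample document are hypothetical. The
caller, not the parser, states which elements hold only element content,
so whitespace inside mixed content is left alone.

```python
from xml.dom import minidom


def strip_formatting_text(node, element_only):
    """Drop whitespace-only text children of elements listed in element_only.

    element_only is the caller's own knowledge of which elements never
    carry meaningful text (an assumption the parser cannot make for us).
    """
    for child in list(node.childNodes):
        if child.nodeType == child.ELEMENT_NODE:
            # Recurse first so nested element-only containers are cleaned too.
            strip_formatting_text(child, element_only)
        elif (child.nodeType == child.TEXT_NODE
              and node.tagName in element_only
              and not child.data.strip()):
            node.removeChild(child)


doc = minidom.parseString(
    "<root>\n  <item>a <b>bold</b> word</item>\n  <item>two</item>\n</root>"
)
# Only <root> is known to contain nothing but child elements.
strip_formatting_text(doc.documentElement, {"root"})
print(doc.documentElement.toxml())
```

The indentation between the <item> elements is removed, while the spaces
inside the mixed-content <item> are kept because "item" was not listed.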

Excellent.  Thanks for the tips.  I suspect the person using this was
trying to save memory with the tree-style parser.  Can you think of
any other techniques that might help save memory but keep a tree-style
interface?  Maybe the Reader parser?  I'm just curious.

Thanks!

-- 
Aaron Patterson
http://tenderlovemaking.com/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml
