Charlie Clark wrote on 19.01.24 at 15:00:
On 18 Jan 2024, at 18:10, Charlie Clark wrote:

Apart from the fact that this currently doesn't work, I imagine that both 
Elements and their children would happily be passed to the write, which could 
lead to an almighty mess. Getting this to work properly, possibly rewritten for 
async to avoid the awfully awful (yield) hack could be a nice addition to the 
documentation.

Thinking about this again, I think a pull parser is probably the way to go, as I 
really don't want or need to create elements. It's probably fine if I just make 
the changes to what's coming through and stream the text straight back into 
another file. I'll give that a go.

If you want to avoid creating element objects altogether, and perhaps don't even need a full (sub-)tree structure to get all the relevant information, I suggest you try the low-level, SAX-like target parser interface.

https://lxml.de/parsing.html#the-target-parser-interface

It's quite efficient and usable for locally constrained XML transformations, e.g. filtering elements or attributes.
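
To give an idea of what such a filter might look like, here is a minimal sketch of a parser target that writes the events straight back out as text, dropping selected elements (with their subtrees) and selected attributes. The class name, the function name and the drop_tags/drop_attrs options are made up for illustration. The sketch ignores namespaces, comments, processing instructions and the XML declaration, so treat it as a starting point rather than a complete serialiser. The fallback import is only there so the example also runs without lxml, since the standard library's ElementTree accepts the same kind of target object.

```python
from io import StringIO
from xml.sax.saxutils import escape, quoteattr

try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib parser is an acceptable stand-in here,
    # since it supports the same start/end/data/close target methods.
    import xml.etree.ElementTree as etree


class FilteringWriter:
    """Hypothetical parser target: re-serialises events, dropping some."""

    def __init__(self, out, drop_tags=(), drop_attrs=()):
        self.out = out
        self.drop_tags = set(drop_tags)
        self.drop_attrs = set(drop_attrs)
        self.skip_depth = 0  # > 0 while inside a dropped element

    def start(self, tag, attrib):
        if self.skip_depth or tag in self.drop_tags:
            self.skip_depth += 1
            return
        attrs = "".join(
            f" {name}={quoteattr(value)}"
            for name, value in attrib.items()
            if name not in self.drop_attrs
        )
        self.out.write(f"<{tag}{attrs}>")

    def end(self, tag):
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.out.write(f"</{tag}>")

    def data(self, text):
        # May be called several times for one text node.
        if not self.skip_depth:
            self.out.write(escape(text))

    def close(self):
        # Whatever is returned here becomes the result of the parse.
        return self.out


def filter_xml(xml_text, drop_tags=(), drop_attrs=()):
    """Parse xml_text and return the filtered document as a string."""
    target = FilteringWriter(StringIO(), drop_tags, drop_attrs)
    parser = etree.XMLParser(target=target)
    return etree.fromstring(xml_text, parser).getvalue()
```

No element objects are created at any point. The only state the target keeps is a depth counter, which is what makes this approach suitable for locally constrained changes.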

And you can still parse input chunk by chunk, if you need that:

https://lxml.de/parsing.html#the-feed-parser-interface
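
As a rough sketch of how the two can be combined: the parser is created with a target, the input is read in fixed-size chunks and handed to feed(), and close() returns whatever the target's close() method returns. The TagCounter target and the parse_in_chunks helper are hypothetical names used only for this example, and the chunk size is an arbitrary choice. As an assumption for portability, the import falls back to the standard library's ElementTree, which offers the same feed()/close() methods.

```python
from io import BytesIO

try:
    from lxml import etree
except ImportError:
    # Assumption: stdlib fallback so the sketch runs without lxml.
    import xml.etree.ElementTree as etree


class TagCounter:
    """Hypothetical target that only counts start tags."""

    def __init__(self):
        self.counts = {}

    def start(self, tag, attrib):
        self.counts[tag] = self.counts.get(tag, 0) + 1

    def end(self, tag):
        pass

    def data(self, text):
        pass

    def close(self):
        return self.counts


def parse_in_chunks(stream, target, chunk_size=64 * 1024):
    """Feed a binary stream to a target parser chunk by chunk."""
    parser = etree.XMLParser(target=target)
    # iter() with a sentinel stops once read() returns an empty chunk.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        parser.feed(chunk)
    return parser.close()
```

Chunk boundaries do not have to line up with tag boundaries, because the parser buffers incomplete input between calls to feed().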

Stefan

