More XML fun in the morning!

#!/usr/bin/env python3

from lxml import etree as et

html_parser = et.HTMLParser(encoding='utf8')

WRAPPER_FILENAME = 'wrapper.html'
CONTENT_FILENAME = 'content.html'

wrapper_tree = et.parse(WRAPPER_FILENAME, parser=html_parser)

# Note: we can't use the HTML parser for the content, as it is not
# a full, well-formed HTML file.  Also, this file needs to
# be encapsulated within a single XML element, e.g. a <div>
content_tree = et.parse(CONTENT_FILENAME)

wrapper_root = wrapper_tree.getroot()
content_root = content_tree.getroot()

# Get the <body> element of the wrapper, else raise an exception
wrapper_body_list = wrapper_root.xpath('//body')
if not wrapper_body_list:
    raise Exception("Could not find <body> in wrapper")
wrapper_body = wrapper_body_list[0]

# Use an index of 0 to insert our content as the first
# child element of the wrapper's body element...
wrapper_body.insert(index=0, element=content_root)

print(et.tostring(wrapper_root, pretty_print=True).decode('utf8'))
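
If the block is held in a variable rather than a file, something along these lines may be closer to what was asked. This is only a sketch: WRAPPER_HTML, HTML_BLOCK and prepend_to_body() are made-up names for illustration, and it assumes the block starts with an element rather than bare text. It relies on lxml.html.fragments_fromstring(), which returns a list of sibling elements, so the block does not need a single enclosing <div>.

```python
from lxml import html

# Hypothetical stand-ins for the wrapper page and the block to insert.
WRAPPER_HTML = (
    "<html><head><title>Demo</title></head>"
    "<body><p>Existing</p></body></html>"
)
HTML_BLOCK = "<h1>Inserted heading</h1><p>Inserted paragraph</p>"


def prepend_to_body(wrapper_html, html_block):
    """Return wrapper_html with html_block inserted right after <body>."""
    root = html.document_fromstring(wrapper_html)
    body = root.find('body')
    if body is None:
        raise ValueError("Could not find <body> in wrapper")

    # no_leading_text=True guarantees a list of elements only; lxml
    # raises a ParserError if the block starts with bare text.
    fragments = html.fragments_fromstring(html_block, no_leading_text=True)

    # Insert at increasing positions so the fragments keep their order.
    for position, fragment in enumerate(fragments):
        body.insert(position, fragment)

    return html.tostring(root, encoding='unicode')


result = prepend_to_body(WRAPPER_HTML, HTML_BLOCK)
print(result)
```

The design choice here is to parse both pieces with lxml.html instead of the XML parser, so the inserted block only has to be valid HTML, not well-formed XML.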


> On 11 May 2022, at 11:59, Gilles <[email protected]> wrote:
> 
> Hello,
> 
> I need to add ~twenty lines of HTML right after the <body> tag.
> 
> Does lxml provide a way to read that data from a variable, to keep things 
> simple?
> 
> ========
> for body in root.xpath('//body[@*]'):
> 
>     et.SubElement(body,"<p>",HTML_block)
> ========
> 
> Thank you.
> 
> _______________________________________________
> lxml - The Python XML Toolkit mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/lxml.python.org/
> Member address: [email protected]
