Hello,
I can't find how to tell lxml/BeautifulSoup to preserve line breaks in an
HTML snippet when calling soup.body.text: in addition to removing the </br>
tags, it also removes the CRLF that follows them.
==========
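from bs4 import BeautifulSoup
from bs4.builder import LXMLTreeBuilderForXML

builder = LXMLTreeBuilderForXML(preserve_whitespace_tags=["body"])
rows = cur.fetchall()  # cur: an already-open database cursor
for row in rows:
    # BAD: soup = BeautifulSoup(row["introtext"], builder=builder, features='lxml')
    soup = BeautifulSoup(row["intro"], features='lxml')
    print(soup.body.text)
    break  # only look at the first row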
==========
Is there an option to keep those line breaks in the extracted text?
Thank you.
_______________________________________________
lxml - The Python XML Toolkit mailing list -- [email protected]
https://mail.python.org/mailman3/lists/lxml.python.org/