I hope this is an appropriate mailing list for BeautifulSoup questions; it's been a long time since I've used python-list and I don't remember whether third-party modules are on topic. I did try posting to the BeautifulSoup mailing list on Google Groups, but I've waited a day or two and my message hasn't been approved yet.
Say I have the following HTML (I hope this shows up as plain text here rather than formatting):

<div style="font-size: 20pt;"><span style="color: #000000;"><em><strong>"Is today the day?"</strong></em></span></div>

And I want to extract the "Is today the day?" part. There are other places in the document with <em> and <strong>, but this is the only place that uses color #000000, so I want to extract anything that's within a color #000000 style, even if it's nested multiple levels deep within that.

- Sometimes the color is defined as RGB(0, 0, 0) and sometimes it's defined as #000000.
- Sometimes the <strong> is within the <em> and sometimes the <em> is within the <strong>.
- There may be other discrepancies I haven't noticed yet.

How can I do this in BeautifulSoup (or is this better done in lxml.html)?

Thanks
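For the extraction described above, one possible starting point is sketched below. It is only a rough sketch, assuming BeautifulSoup 4 (the bs4 package) with the standard-library "html.parser" backend. The names BLACK and black_text are made up for illustration, and the regular expression only covers the color spellings mentioned here (#000000, RGB(0, 0, 0)) plus the #000 shorthand, so other discrepancies would need extra cases.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical pattern: matches "color: #000000", "color: #000" or
# "color: rgb(0, 0, 0)" inside a style attribute, ignoring case and spacing.
# The lookbehind keeps "background-color" from matching.
BLACK = re.compile(
    r"(?<![\w-])color\s*:\s*"
    r"(?:#000000|#000|rgb\(\s*0\s*,\s*0\s*,\s*0\s*\))(?![0-9a-f])",
    re.IGNORECASE,
)


def black_text(html):
    """Return the text of every outermost tag whose style sets color to black."""
    soup = BeautifulSoup(html, "html.parser")
    # A compiled regex passed as an attribute filter is applied to the
    # attribute value, so this finds any tag whose style matches BLACK.
    tags = soup.find_all(style=BLACK)
    # get_text() collects text from all descendants, so the nesting order of
    # <em> and <strong> does not matter. Tags that sit inside another
    # matching tag are skipped to avoid returning the same text twice.
    return [
        tag.get_text(strip=True)
        for tag in tags
        if tag.find_parent(style=BLACK) is None
    ]


sample = (
    '<div style="font-size: 20pt;"><span style="color: #000000;">'
    '<em><strong>"Is today the day?"</strong></em></span></div>'
)
print(black_text(sample))
```

Matching on the style attribute rather than on <em> or <strong> is what makes the nesting order irrelevant. A similar attribute-based match should also be possible with an XPath query in lxml.html, though that is not shown here.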