On Thursday, July 17th, 2025 at 8:57 AM, Paul Heinlein <heinl...@madboa.com> wrote:
> On Thu, 17 Jul 2025, Eldo Varghese wrote:
>
> > I second the xmlstarlet approach, why bother creating one's own (crude)
> > parser when xml parsers already exist.
> >
> Another vote for this approach! For anything other than a one-off
> operation, using an XML parser is the smart move.
>
> Developers have spent untold labor years getting their browsers and
> parsing libraries to work with malformed html or xhtml. You're best
> off taking advantage of that work, even if it means a bit of a
> learning curve.
>

XML parsers will also handle minor changes in the formatting of the source data. It looks like that's what happened here. In the long run, a simple XML parser would prevent a lot of problems like this.

-Ben
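
To illustrate the point about formatting changes, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree rather than xmlstarlet, and the element names (items, item, name), the id attribute, and the sample data are all made up for illustration; they are not taken from the file discussed in this thread. The same record is laid out two different ways, and the parser should return the same values for both, which is where a line-oriented grep/sed approach tends to break.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same record in two different layouts.
COMPACT = '<items><item id="1"><name>alpha</name></item></items>'
REFORMATTED = """<items>
  <item
      id='1'>
    <name>alpha</name>
  </item>
</items>"""


def item_names(xml_text):
    """Return a dict mapping each <item>'s id attribute to its <name> text."""
    root = ET.fromstring(xml_text)
    return {
        item.get("id"): item.findtext("name", default="").strip()
        for item in root.iter("item")
    }


if __name__ == "__main__":
    # Line breaks, indentation, and quote style differ; the parsed data does not.
    print(item_names(COMPACT) == item_names(REFORMATTED))
```

The lookup is by element and attribute name, so whitespace, line wrapping, and single versus double quotes in the source do not matter. A change to the element names themselves would still need a code change.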