Jon Freeman writes:
| From: "John Chambers" <[EMAIL PROTECTED]>
| > I've seen a couple of sites that have ABC-in-HTML with <p> at the end
| > of each line. This is purposely double spaced, so there's probably no
| > good fix for it.
|
| If you are talking about filtering the HTML, perhaps this would work for
| dealing with line endings:
|
| 1. remove all CR and LF characters.
| 2 remove all </p>
| 3 change all <p> to CR/LF
| 4 change all <br> to CR/LF
|
| I've probably missed something but it's a thought...

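For concreteness, the four steps quoted above could be sketched roughly like this in Python. The function name `naive_html_to_lines` is made up for illustration, the regular expressions are my own guess at what "all <p>" and "all <br>" should match, and a plain "\n" stands in for CR/LF:

```python
import re


def naive_html_to_lines(html: str) -> str:
    """Apply the four quoted steps literally (hypothetical helper)."""
    # 1. Remove all CR and LF characters.
    text = html.replace("\r", "").replace("\n", "")
    # 2. Remove all </p>.
    text = re.sub(r"</p\s*>", "", text, flags=re.IGNORECASE)
    # 3. Change all <p> to a line break (the pattern does not match <pre>).
    text = re.sub(r"<p(?:\s[^>]*)?>", "\n", text, flags=re.IGNORECASE)
    # 4. Change all <br> (or <br/>) to a line break.
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    # A page that ends every ABC line with <p> comes out single spaced.
    print(naive_html_to_lines("X:1<p>\nT:Tune<p>\nK:D<p>\n"))
```

Note that step 1 throws away the only line breaks in a page that relies on <pre> alone, so `naive_html_to_lines("<pre>X:1\nK:D\n</pre>")` runs the two lines together.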
That would work for some files, and fail for others. The main problem
is that a lot of ABC-in-HTML tunes use <br> to terminate lines and <p>
to terminate the tune. This is probably the best way to do it. This
means that <p> should expand to two newlines. But that makes the
above example double spaced.

I've also seen tunes surrounded with <pre> ... </pre> tags, with <br>
added to the lines. Either alone works, but both together produce
double-spaced text.

Unfortunately, it takes a fair degree of human-level pattern matching
to figure out what to do in each case. No two sites seem to use the
same scheme for embedding ABC inside HTML. And a scheme that works
for one site will mess up some others.

I've basically treated it as hopeless. My Tune Finder code has some
simple heuristics to try to extract ABC from HTML. But I haven't found
anything that works in all cases. The fact that different browsers
produce different spacing in some cases implies that even a full
HTML+CSS parse can't be guaranteed to handle the task successfully.

I conclude that it's not worth worrying about. I just warn people
that ABC inside HTML is not a good idea. If you insist on doing it,
you should expect messages from people who can't figure out how to
use your site's tunes.
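
To make the conflict concrete, here is a rough Python sketch of one possible heuristic. It is not the Tune Finder code, which isn't shown in this message. All names are hypothetical, and the rules are only the ones described above: outside <pre>, <br> ends a line and <p> ends a tune (two newlines); inside <pre>, the existing newlines are kept and a <br> sitting at the end of a line is dropped instead of doubling the break.

```python
import re

# Hypothetical patterns; real pages will have variations these miss.
PRE_BLOCK = re.compile(r"<pre(?:\s[^>]*)?>(.*?)</pre\s*>", re.IGNORECASE | re.DOTALL)
BR = re.compile(r"<br\s*/?>", re.IGNORECASE)
BR_THEN_NEWLINE = re.compile(r"<br\s*/?>[ \t]*\n", re.IGNORECASE)
P_OPEN = re.compile(r"<p(?:\s[^>]*)?>", re.IGNORECASE)
P_CLOSE = re.compile(r"</p\s*>", re.IGNORECASE)


def _flow(fragment: str) -> str:
    """Outside <pre>: source newlines don't count, <br> = 1 newline, <p> = 2."""
    text = fragment.replace("\r", "").replace("\n", "")
    text = P_CLOSE.sub("", text)
    text = P_OPEN.sub("\n\n", text)
    text = BR.sub("\n", text)
    # A <br> right before a <p> would otherwise leave two blank lines.
    return re.sub(r"\n{3,}", "\n\n", text)


def _pre(fragment: str) -> str:
    """Inside <pre>: keep source newlines; a <br> at end of line adds nothing."""
    text = fragment.replace("\r\n", "\n")
    text = BR_THEN_NEWLINE.sub("\n", text)
    return BR.sub("\n", text)


def extract_text(html: str) -> str:
    out = []
    pos = 0
    for match in PRE_BLOCK.finditer(html):
        out.append(_flow(html[pos:match.start()]))
        out.append(_pre(match.group(1)))
        pos = match.end()
    out.append(_flow(html[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(extract_text("<pre>X:1<br>\nK:D<br>\n</pre>"))
    print(extract_text("X:1<br>\nK:D<br>\n<p>\nX:2<br>\nK:G<br>\n"))
```

With these rules the <pre>-plus-<br> page and the <br>/<p> page both come out single spaced, but a page that puts <p> at the end of every line, as in the first example, still comes out double spaced: `extract_text("X:1<p>\nK:D<p>\n")` gives "X:1\n\nK:D\n\n". Fixing that case means guessing that <p> is being used as a line terminator on that particular site, which is exactly the kind of per-site judgment that doesn't generalize.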