Xah Lee <xah...@gmail.com> writes:

> 〈Emacs Lisp vs Perl: Validate Local File Links〉
> http://xahlee.org/emacs/elisp_vs_perl_validate_links.html
>
> a comparison of 2 scripts.
>
> lots code, so i won't paste plain text version here.
>
> i have some comments at the bottom. Excerpt:
>
> ------------------
>
> «One thing interesting is to compare the approaches in perl and emacs
> lisp.»
>
> «For our case, regex is not powerful enough to deal with the problem
> by itself, due to the nested nature of html. This is why, in my perl
> code, i split the file by < into segments first, then, use regex to
> deal with now the non-nested segment. This will break if you have <a
> title="x < href=z" href="math.html">math</a>. This cannot be worked
> around unless you really start to write a real parser.»
>
> «The elisp here is more powerful, not because of any lisp features,
> but because emacs's buffer datatype. You can think of it as a
> glorified string datatype, that you can move a cursor back and forth,
> or use regex to search forward or backward, or save cursor positions
> (index) and grab parts of text for further analysis.»
>
> ------------------
>
> If you are a perl coder, and disagree, let me know your opinion.
> (showing working code is very welcome) My comment about perl there
> applies to python too. (python code welcome too.)

Interesting. Perl, Python, and Lisp have real HTML parsers available.
I've used the ones for Perl and Python.

-- 
Dan Espen
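
For illustration, here is a minimal sketch of the parser-based approach in Python. It assumes the standard library's html.parser module is good enough for the pages in question. The names LinkCollector, extract_hrefs, and broken_local_links are made up for this example and do not come from either of the scripts being compared. The point is only that a real parser hands you the attributes already separated, so a "<" inside a quoted title attribute should not confuse it.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlsplit, unquote


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper class)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, already split by the parser.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_hrefs(html_text):
    """Return every <a href="..."> value found in html_text, in order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


def broken_local_links(html_text, base_dir):
    """Return the local hrefs whose target does not exist under base_dir.

    Assumption: a link is "local" when it has no scheme and no host.
    Pure fragment links such as "#top" are skipped.
    """
    broken = []
    for href in extract_hrefs(html_text):
        parts = urlsplit(href)
        if parts.scheme or parts.netloc or not parts.path:
            continue  # remote link, mailto:, or same-page fragment
        target = Path(base_dir) / unquote(parts.path)
        if not target.exists():
            broken.append(href)
    return broken
```

Feeding it the problem case from the quoted article, extract_hrefs('<a title="x < href=z" href="math.html">math</a>'), should give back just the one href, math.html. To check a whole site you would read each file and pass its own directory as base_dir. Hrefs that start with "/" are resolved against the filesystem root here, so a real checker would want to map those onto the site root instead.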