The html-parser egg cannot correctly handle spaces between HTML
attributes and values.

Spaces are allowed around '=' in a tag's attributes, even though it is
bad practice (see the discussion at
https://stackoverflow.com/questions/7064095/spaces-between-html-attributes-and-values).

But when I use the latest version of html-parser to process the following HTML:

#+begin_example
<a href="/dictionary/lineament" class = "mw_t_sx"><span 
class='text-uppercase'>lineament</span></a>
#+end_example

It generates SXML like this:

#+begin_example
(*TOP* (a (@ (href "/dictionary/lineament") (class)) "= \"mw_t_sx\">" (span (@ 
(class "text-uppercase")) "lineament")) "\n")
#+end_example
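
A minimal script along the following lines should reproduce the output
above. It is only a sketch: it assumes CHICKEN 5 with the html-parser egg
installed and that html->sxml is called with an input port; the variable
name html is just for illustration.

#+begin_src scheme
;; Reproduction sketch (assumes: chicken-install html-parser).
(import html-parser)

;; Note the spaces around '=' in the class attribute of <a>.
(define html
  "<a href=\"/dictionary/lineament\" class = \"mw_t_sx\"><span class='text-uppercase'>lineament</span></a>\n")

;; Parse the string through an input port and print the resulting SXML.
(write (html->sxml (open-input-string html)))
(newline)
#+end_src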

Since html-parser's major goal is "bug-for-bug compatibility", it should
handle the spaces around '=' correctly and parse the attribute as
(class "mw_t_sx").

#+begin_example
$ chicken-csi -version
CHICKEN
(c) 2008-2021, The CHICKEN Team
(c) 2000-2007, Felix L. Winkelmann
Version 5.3.0 (rev e31bbee5)
linux-unix-gnu-x86-64 [ 64bit dload ptables ]

$ cat ~/.cache/chicken-install/html-parser/VERSION
"0.3"
#+end_example

Thanks
Pan
