Hello there,

it turns out that the module pmccabe2html does not properly escape the characters '&', '<', and '>' when including C source code. The following patch fixes this, and it works with Gawk, Mawk, and Nawk. The result is even xmllint-clean for the GNU Shishi project, where the generated file presents 876 functions in a total of 510k of HTML code.
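
To illustrate why every occurrence must be replaced and why '&' has to be handled first, here is a minimal stand-alone sketch of the same idea. The function name escape_html is hypothetical and is not part of pmccabe2html. The replacement strings use a doubled backslash, which should reach gsub() as a literal ampersand in any POSIX awk.

```shell
# Hypothetical helper, not part of pmccabe2html: escape the
# HTML-special characters on every line of standard input.
escape_html () {
  awk '{
    gsub(/&/, "\\&amp;")  # must come first, else "&lt;" would become "&amp;lt;"
    gsub(/</, "\\&lt;")   # "<" is not special in a regexp, so no backslash
    gsub(/>/, "\\&gt;")
    print
  }'
}

printf '%s\n' 'if (a < b && c > d) x = &y;' | escape_html
# prints: if (a &lt; b &amp;&amp; c &gt; d) x = &amp;y;
```

With sub() in place of gsub(), only the first '&' on the line would be replaced, which is the multiplicity problem mentioned in the commit message.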

Best regards,
Mats Erik Andersson, on behalf of GNU Shishi

From d50f9be209e66832524d5b7b4addf27bf33b46c3 Mon Sep 17 00:00:00 2001
From: Mats Erik Andersson <g...@gisladisker.se>
Date: Wed, 25 Sep 2013 22:27:03 +0200
Subject: [PATCH] pmccabe2html: escaping of special characters

The C code characters '<', '>', and '&' were improperly
escaped in HTML output, and their multiplicity was ignored.
---
 ChangeLog              | 11 +++++++++++
 build-aux/pmccabe2html |  6 +++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 211d296..770710c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2013-09-25  Mats Erik Andersson  <g...@gisladisker.se>
+
+	pmccabe2html: escaping of special characters
+	Escape all '<', '>', and '&' in HTML output.
+	* build-aux/pmccabe2html (html_fnc): Call gsub()
+	instead of sub() to capture all '<', '>', and '&'.
+	Neither of '<' and '>' is special in a regexp,
+	so first arguments to gsub() are corrected.  Also,
+	in replacement strings, ampersand must be escaped.
+	Finally, '&' must be handled first, then '<' and '>'.
+
 2013-09-24  Eric Blake  <ebl...@redhat.com>
 
 	manywarnings: enable nicer gcc warning messages
diff --git a/build-aux/pmccabe2html b/build-aux/pmccabe2html
index 094c3e9..ffd0788 100644
--- a/build-aux/pmccabe2html
+++ b/build-aux/pmccabe2html
@@ -422,9 +422,9 @@ function html_fnc (nfun,
 
         while ((getline codeline < (fname nfun "_fn.txt")) > 0)
         {
-            sub(/\\</, "&lt;", codeline)
-            sub(/\\>/, "&gt;", codeline)
-            sub(/&/, "&amp;", codeline)
+            gsub(/&/, "\&amp;", codeline)	# Must come first.
+            gsub(/</, "\&lt;", codeline)
+            gsub(/>/, "\&gt;", codeline)
 
             print codeline
         }
-- 
1.8.4.rc3