Hello there,

it turns out that the module pmccabe2html does not properly
escape the characters '&', '<', and '>' when including
C source code. The following patch does so, and works with
Gawk, Mawk, and Nawk. It even produces xmllint-clean output
for the GNU Shishi project, where the resulting file presents
876 functions in a total of 510k of HTML code.
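
For illustration only, the three substitutions can be tried outside
pmccabe2html with a small shell wrapper. This is a sketch, not code
from the patch: the function name escape_html is hypothetical, and the
replacement strings are written with a doubled backslash ("\\&"), the
spelling that reliably yields a literal ampersand in a gsub()
replacement, whereas the patch itself writes "\&".

```shell
# Hypothetical helper, not part of pmccabe2html: reads text on stdin
# and writes it to stdout with '&', '<' and '>' escaped for HTML.
escape_html () {
    awk '{
        gsub(/&/, "\\&amp;")   # must come first, see note below
        gsub(/</, "\\&lt;")
        gsub(/>/, "\\&gt;")
        print
    }'
}

# Every occurrence on the line is replaced, not only the first one.
printf '%s\n' 'if (a < b && p->n > 0)' | escape_html
# prints: if (a &lt; b &amp;&amp; p-&gt;n &gt; 0)
```

If the '&' substitution ran last instead, it would also rewrite the
ampersands just introduced by "&lt;" and "&gt;", giving "&amp;lt;" and
"&amp;gt;". That is why the ampersand has to be handled first.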

Best regards,
  Mats Erik Andersson, on behalf of GNU Shishi
From d50f9be209e66832524d5b7b4addf27bf33b46c3 Mon Sep 17 00:00:00 2001
From: Mats Erik Andersson <g...@gisladisker.se>
Date: Wed, 25 Sep 2013 22:27:03 +0200
Subject: [PATCH] pmccabe2html: escaping of special characters

The C code characters '<', '>', and '&' were improperly
escaped in HTML output, and their multiplicity was ignored.
---
 ChangeLog              | 11 +++++++++++
 build-aux/pmccabe2html |  6 +++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 211d296..770710c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2013-09-25  Mats Erik Andersson  <g...@gisladisker.se>
+
+	pmccabe2html: escaping of special characters
+	Escape all '<', '>', and '&' in HTML output.
+	* build-aux/pmccabe2html (html_fnc): Call gsub()
+	instead of sub() to capture all '<', '>', and '&'.
+	Neither of '<' and '>' is special in a regexp,
+	so first arguments to gsub() are corrected. Also,
+	in replacement strings, ampersand must be escaped.
+	Finally, '&' must be handled first, then '<' and '>'.
+
 2013-09-24  Eric Blake  <ebl...@redhat.com>
 
 	manywarnings: enable nicer gcc warning messages
diff --git a/build-aux/pmccabe2html b/build-aux/pmccabe2html
index 094c3e9..ffd0788 100644
--- a/build-aux/pmccabe2html
+++ b/build-aux/pmccabe2html
@@ -422,9 +422,9 @@ function html_fnc (nfun,
 
             while ((getline codeline < (fname nfun "_fn.txt")) > 0)
             {
-                sub(/\\</, "&lt;", codeline)
-                sub(/\\>/, "&gt;", codeline)
-                sub(/&/, "&amp;", codeline)
+                gsub(/&/, "\&amp;", codeline)	# Must come first.
+                gsub(/</, "\&lt;", codeline)
+                gsub(/>/, "\&gt;", codeline)
 
                 print codeline
             }
-- 
1.8.4.rc3
