Hi,

On Thu, Aug 16, 2012 at 04:58:17AM +0100, Ben Hutchings wrote:
> Attaching an updated patch which corrects many errors in the original
> description of the bug.

One update from me. Please also read the TODO list.

Your patch is a very fundamental one for SDATA. The XML backend I made is
nothing more than a minor modification of the original HTML backend. I kept
HTML as the original one.

> --
> Ben Hutchings
> I say we take off; nuke the site from orbit. It's the only way to be sure.

> From b0a67f447be96d4b5df683da6f280d4b24eb3ff4 Mon Sep 17 00:00:00 2001
> From: Ben Hutchings <[email protected]>
> Date: Thu, 16 Aug 2012 04:16:54 +0100
> Subject: [PATCH] Fix mangling of '&' in URLs
>
> SDATA entities, e.g. '&' are converted on input to different
> escape sequences, e.g. '\|[amp ]\|'. These need to be converted
> back to literal characters or entities on output, depending on the
> format. Currently we fail to do this because:
>
> 1. The HTML and XML back-ends only do URL-normalization; they never
> process the SDATA escape sequences.

I trust you here. I vaguely remember thinking something similar, but I did
not have the courage to change this fundamental feature.

> Change them to do CDATA conversion before normalizing URLs. This is
> not theoretically correct: we should convert to literal text, then
> normalize the URL, then convert to CDATA. However we know that '&'
> and ';' will not be URL-escaped and therefore the result should be the
> same.
>
> 2. The driver does some basic normalization of the URL by squashing
> multiple spaces. Since the spaces are significant in matching of the
> SDATA escape sequences, they are then not converted by any back-end.
>
> Change it to trim leading and trailing space only; URLs should not
> normally contain any spaces anyway.

I think your assessment is sound.

Osamu
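
For readers following the thread, the two changes described in the patch can be sketched roughly as below. This is only an illustration in Python, not the code from the patch. The function names, the example URL, and the space-padded escape sequence are all assumptions; the padding is chosen to be consistent with the remark that spaces are significant when matching the sequences.

```python
import re
from urllib.parse import quote

# Hypothetical padded form of the SDATA escape sequence for '&'.
# The message quotes it as '\|[amp ]\|'; the exact padding is an assumption.
SDATA_AMP = r"\|[amp   ]\|"


def sdata_to_cdata(text):
    """Convert the SDATA escape sequence to an HTML/XML entity."""
    return text.replace(SDATA_AMP, "&amp;")


def normalize_url(url):
    """Percent-escape unusual characters; '&' and ';' are left untouched."""
    return quote(url, safe=":/?#[]@!$&'()*+,;=%")


def old_driver_cleanup(url):
    """Old behaviour: squash runs of whitespace (breaks the padded sequence)."""
    return re.sub(r"\s+", " ", url)


def new_driver_cleanup(url):
    """New behaviour: trim leading and trailing whitespace only."""
    return url.strip()


def render_href(raw):
    """Trim, convert SDATA to CDATA, then normalize the URL."""
    return normalize_url(sdata_to_cdata(new_driver_cleanup(raw)))


raw = "  http://example.org/?a=1" + SDATA_AMP + "b=2 "
print(render_href(raw))
# With the old cleanup the padded sequence no longer matches:
print(sdata_to_cdata(old_driver_cleanup(raw)))
```

Converting to CDATA before normalizing works in this sketch only because '&' and ';' are in the set of characters that the normalizer leaves alone, which is the same observation the patch description makes.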

