Bernd Kümmerlen wrote:
> You also said many times that it is not very likely that Entities are 
> fully supported in XXE in the future. This is also a problem for 
> entities which are used as "macros" which are bound to be changed some 
> time in the future. Wouldn't it be possible to have a mode in which 
> expansion of entities is completely disabled? I think that you will need 
> something like this for the multi-file documents implementation as well, 
> so it would be great if this could be disabled for all entities (or will 
> this work only for external entities?) It wouldn't be a problem if the 
> entities could not be declared in XXE itself (this could be done with 
> another editor) but it would be great if they were not expanded.

This is one of the few things about XXE which was a problem for us as 
well.  We use a number of macros for things like product names (which 
always end up getting changed by marketing people), and having XXE 
expand them was very annoying.  To deal with this, I developed a Perl 
script, unxxe.pl (sorry about the name!), that scans all DTD references 
(DOCTYPE and external parameter ENTITYs) in an XML document and creates 
an expansion table of general (internal and external) ENTITYs, which it 
then "unexpands" in the input file.
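
For readers who want a feel for the idea without reading the Perl, here 
is a minimal sketch in Python.  It is not unxxe.pl and makes no claim to 
match its behaviour: it assumes the declarations have already been read 
into a string, it handles only internal general entities, and the names 
build_expansion_table and unexpand are made up for this illustration.

```python
import re

# Matches internal general entity declarations such as
# <!ENTITY prod "Acme Editor">.  Parameter entities (<!ENTITY % name ...>)
# and external entities (SYSTEM/PUBLIC) are deliberately not handled here.
ENTITY_DECL = re.compile(r'<!ENTITY\s+([\w.:-]+)\s+(["\'])(.*?)\2\s*>', re.S)


def build_expansion_table(dtd_text):
    """Return a dict mapping each replacement text to its entity name."""
    return {
        value: name
        for name, _quote, value in ENTITY_DECL.findall(dtd_text)
        if value  # an empty expansion would match everywhere, so skip it
    }


def unexpand(text, table):
    """Replace every literal expansion found in text with &name;."""
    if not table:
        return text
    # Try longer expansions first so a short one cannot split a longer match.
    values = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(v) for v in values))
    return pattern.sub(lambda m: "&" + table[m.group(0)] + ";", text)
```

Like the approach described above, this is a purely textual replacement, 
so it also touches matches inside attribute values and markup.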

I only wrote this a few nights ago, and we haven't used it too 
extensively, but I suspect that many XXE users may find it useful. 
However, you should be aware of its limitations:

It doesn't use PUBLIC ids and/or catalogs to support external entities, 
so only the system location is used (and it must be a file: URL or just 
a filename).  If the script cannot retrieve an external entity, it 
warns but does not fail; it simply won't unexpand that entity (or, if 
it was a parameter entity, any of the entities it defined).
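
The "warn but carry on" handling of system locations could look roughly 
like the following Python sketch.  Again, this is only an illustration 
under my own assumptions, not the script's code: the function name 
read_system_entity, the base_dir parameter and the UTF-8 encoding are 
all hypothetical.

```python
import sys
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def read_system_entity(system_id, base_dir="."):
    """Return the text of a local external entity, or None if unreadable."""
    parsed = urlparse(system_id)
    if parsed.scheme == "file":
        # file: URL, e.g. file:///home/me/doc/common.ent
        path = Path(url2pathname(parsed.path))
    elif parsed.scheme == "":
        # Plain filename, resolved relative to the referring document.
        path = Path(base_dir) / system_id
    else:
        # http:, ftp:, etc. are not supported in this sketch.
        print(f"warning: unsupported location {system_id!r}", file=sys.stderr)
        return None
    try:
        return path.read_text(encoding="utf-8")
    except (OSError, UnicodeError) as exc:
        # Warn and continue; the caller just skips this entity.
        print(f"warning: cannot read {system_id!r}: {exc}", file=sys.stderr)
        return None
```

A caller would treat None as "nothing to unexpand for this entity" and 
move on to the next declaration.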

It unexpands *anything* that matches an entity expansion.  If you have 
entities that expand into small, common text (one of our entities was 
<!ENTITY year "2002">), the text is unexpanded *everywhere*, including 
many places where you don't want it to be.  (Our solution was to 
eliminate this entity.)
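
To make the problem concrete, here is a small Python illustration of 
what plain text matching does with the year entity.  The sample sentence 
and the function name unexpand_year are invented for the example; only 
the entity itself comes from our DTD.

```python
import re

# The one entity from the example above: <!ENTITY year "2002">
table = {"2002": "year"}


def unexpand_year(text):
    """Naively replace every occurrence of an expansion with &name;."""
    pattern = re.compile("|".join(re.escape(v) for v in table))
    return pattern.sub(lambda m: "&" + table[m.group(0)] + ";", text)


sample = "<para>Copyright 2002. Call 555-2002 or see bug #12002.</para>"
print(unexpand_year(sample))
# prints: <para>Copyright &year;. Call 555-&year; or see bug #1&year;.</para>
```

Only the first replacement is wanted; the phone number and the bug 
number are collateral damage.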

While the script will unexpand external entities in a master document, 
it will do so only if the text matches.  That means that if you edit or 
modify any of the text that came from an external entity, it will not be 
unexpanded.  The script does warn you if you have external entities 
(general or parameter) that are not referenced, so you will usually know 
if an external entity was modified.
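
The check behind that warning can be sketched in a few lines of Python.  
This is an assumption about how one might do it, not a description of 
the script's internals; unreferenced_entities is a hypothetical name, 
and the sketch only looks at general entity references (&name;), not 
parameter entity references.

```python
import re

# General entity references such as &prod;.  Character references like
# &#160; do not match because "#" is not in the name character class.
ENTITY_REF = re.compile(r"&([\w.:-]+);")


def unreferenced_entities(document_text, declared_names):
    """Return the declared entity names that the document never references."""
    referenced = set(ENTITY_REF.findall(document_text))
    return sorted(set(declared_names) - referenced)


after_unexpand = "<book>&chap1;<para>&prod; &amp; friends&#160;</para></book>"
for name in unreferenced_entities(after_unexpand, ["prod", "chap1", "chap2"]):
    print(f"warning: entity {name} is declared but not referenced")
# prints: warning: entity chap2 is declared but not referenced
```

An external entity that shows up in this list after unexpansion is a 
hint that its text was edited and no longer matches.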

This is a work in progress, and in particular we plan to enhance it in 
order to support some CVS integration (e.g. to run it automatically on 
update or checkin).  Anyone who has requests, comments, or bug reports 
regarding this, or who wants to enhance it (e.g. to support http: or 
catalogs) please send them to me and I'll try to incorporate them.

@alex
-- 
mailto:dupuy at sysd.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: unxxe.pl
Url: 
http://www.xmlmind.com/pipermail/xmleditor-support/attachments/20021209/58272922/attachment.bat
 
