Re: Finding unclosed XML tags?

Ben Doom Fri, 02 Nov 2007 05:47:41 -0800

This kind of thing (parsing) is pretty hard to do with a regex.  I think 
it's possible, but I can't think offhand of an easy way to do it and 
still be sure your XML is valid (i.e., you could have three open tags and 
only one close tag).

What I would probably do (if I were doing it from scratch) is build 
something that loops over the document, reading each tag and building a 
"stack" where you add open tags and remove the top tag when you find the 
matching close tag.
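
As a rough sketch, that stack approach could look something like the 
following in Python.  The language, the function name find_unbalanced, 
and the recovery rule (when a close tag matches something deeper in the 
stack, everything above it is reported as unclosed) are assumptions made 
for illustration; "prefix" is the placeholder from the original question.  
Tags in other namespaces, such as CF tags, are simply skipped, and an 
attribute value containing ">" would confuse this simple pattern.

```python
import re

# Matches <prefix:name ...>, </prefix:name> and self-closing <prefix:name ... />.
# "prefix" is the placeholder namespace from the original question.
TAG_RE = re.compile(r"<(/?)prefix:([a-z_]+)([^>]*)>", re.IGNORECASE)


def find_unbalanced(text):
    """Return (unclosed, stray): lists of (tag_name, offset) tuples."""
    stack = []     # open tags still waiting for their close tag
    unclosed = []  # open tags that never got a matching close tag
    stray = []     # close tags with no matching open tag

    for m in TAG_RE.finditer(text):
        closing, name, rest = m.group(1), m.group(2).lower(), m.group(3)
        if closing:
            names = [n for n, _ in stack]
            if name in names:
                # Find the nearest matching open tag; anything opened
                # after it was never closed.
                idx = len(names) - 1 - names[::-1].index(name)
                unclosed.extend(stack[idx + 1:])
                del stack[idx:]
            else:
                stray.append((name, m.start()))
        elif rest.rstrip().endswith("/"):
            continue  # self-closing tag, nothing to track
        else:
            stack.append((name, m.start()))

    # Whatever is left on the stack was opened but never closed.
    unclosed.extend(stack)
    return unclosed, stray
```

Calling find_unbalanced() on the contents of each file and printing the 
returned names and offsets should be enough for a one-off check across a 
couple of thousand files.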

In reality, what I would do is run it through an XML tidier like the XML 
CodeSweeper in HS+.

--Ben Doom

Peter Boughton wrote:
> Can anyone provide a regex that will identify any <prefix:tag...> which isn't 
> followed by its own </prefix:tag>
> 
> Getting the initial tag is easy enough ( <prefix:([a-z_]+)[^>]*[^/]> ), but I 
> can't think how to check for a lack of closing tag.
> 
> 
> (This is just for a one-off check/fix, so if anyone knows of a tool/editor 
> that can do this (for a little under two thousand files), without getting 
> muddled up by CF tags, that would work too)
> 
> 
> Thanks. 
> 
> 

Archive: http://www.houseoffusion.com/groups/RegEx/message.cfm/messageid:1075