Get a better parser.  SAX, which is designed for stream processing, is
exactly what you need.  The DOM-centric CF XML functions are great for
simple jobs, but as you've found, they only work for small documents
because the whole tree has to fit in memory.  I haven't checked CF9,
but CF8 uses the Apache XML tooling, which includes a SAX parser.  I'd
expect CF9 to be the same.
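
To give a feel for the streaming approach, here is a rough Java sketch
of a SAX handler that collects every SITE_LOC_ID value that is not
purely digits.  It only uses the standard javax.xml.parsers and
org.xml.sax classes; the class name SiteLocIdScanner and the scan()
helper are made up for illustration, and I haven't tried wiring this
into CF itself.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: the parser pushes events at it one at a time,
// so memory use does not grow with the size of the file.
class SiteLocIdScanner extends DefaultHandler {
    final List<String> nonNumeric = new ArrayList<>();
    private StringBuilder text; // non-null only while inside <SITE_LOC_ID>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("SITE_LOC_ID".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one node's text in several chunks, so accumulate.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("SITE_LOC_ID".equals(qName) && text != null) {
            String value = text.toString();
            // Keep values containing at least one non-digit character.
            if (!value.matches("[0-9]*")) {
                nonNumeric.add(value);
            }
            text = null;
        }
    }

    // Convenience wrapper; checked exceptions are rethrown unchecked for brevity.
    static List<String> scan(InputStream in) {
        try {
            SiteLocIdScanner handler = new SiteLocIdScanner();
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
            return handler.nonNumeric;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Passing a FileInputStream to scan() would walk the file without
building a document tree.  Writing the corrected XML back out takes
more plumbing than this, so treat it as a starting point only.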

If you can't or won't, and must pursue the regex approach, you'll
need a better regex engine than the one CF ships with, specifically
one that supports both lookahead and lookbehind so you can anchor the
match inside a single node.  Again, you already have what you need:
the java.util.regex package provides all of this (CF's built-in
functions use ORO, which lacks lookbehind, rather than the Java-native
engine).  CF strings are java.lang.String objects underneath, so the
gist is this (which will work from CFML as-is):

newXmlString = xmlString.replaceAll(regex, replaceString);
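
For the regex itself, something along these lines might do.  This is a
sketch in plain Java, not tested against your file: the class name,
the clean() method and the "0" replacement value are all made up, and
it assumes the text inside SITE_LOC_ID never contains a '<' (no nested
markup).  The lookbehind and lookahead pin the match between one pair
of tags without consuming them, and [^<]*[^0-9<][^<]* means "any run
of characters with at least one non-digit in it".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the lookbehind/lookahead anchoring.
class SiteLocIdCleaner {
    // Matches only the content of a SITE_LOC_ID node, and only when that
    // content holds at least one non-digit. Assumes the content has no '<'.
    static final Pattern NON_NUMERIC_ID = Pattern.compile(
        "(?<=<SITE_LOC_ID>)[^<]*[^0-9<][^<]*(?=</SITE_LOC_ID>)");

    static String clean(String xml, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return NON_NUMERIC_ID.matcher(xml)
                             .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String xml = "<REC><SITE_LOC_ID>12a4</SITE_LOC_ID></REC>"
                   + "<REC><SITE_LOC_ID>5678</SITE_LOC_ID></REC>";
        // Only the first node's content should be replaced.
        System.out.println(clean(xml, "0"));
    }
}
```

From CFML the same pattern string should be usable as the first
argument to xmlString.replaceAll(), since that call goes straight to
java.lang.String.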

cheers,
barneyb

On Mon, Mar 7, 2011 at 10:40 AM, Ian Skinner <[email protected]> wrote:
>
> I'm struggling with a regular expression to match this:
>
> <SITE_LOC_ID><!--help--></SITE_LOC_ID>
>
> Where the help content is any string that contains one or more
> characters that are NOT digits.
>
> I know [^0-9] would match a single non-digit character.  But I don't
> know how to allow there to be one or more such characters mixed in with
> zero or more digit characters.
>
> I need to return the strings that match this so that I can replace them.
>
> Plus this is an XML file that will have upwards of 75,000 record nodes,
> each with one of these <SITE_LOC_ID>...</SITE_LOC_ID> nodes alongside
> several other nodes in each record.  So I want to make sure I match only
> the content of a single node.
>
> TIA
>
> P.S.  The file is too large to parse into an XML data structure, so I am
> doing simple string replace() and rereplace() functions to modify the
> XML text file.

Archive: http://www.houseoffusion.com/groups/regex/message.cfm/messageid:1251