Hey Chaps, I've been doing a little work with some RSS feeds of late, and for the most part all is very well. The one problem I'm running into is people who publish RSS feeds containing lots of junk HTML (urgh!), like inline links, images, divs and whatnot, in the description content of the feed.
I only want the plain text version of these feeds and not all the other junk. This means stripping out HTML tags such as <div> and <a>, some of which are being published entity-escaped as &lt; and &gt;. Also, I want to convert HTML character entities into their plain text equivalents, for instance turning &amp; into a standard &.

Presumably this can all be done with regex (I couldn't find any nice built-in CF functions), but my skills in this area are pretty much non-existent, and I know some of you are fairly experienced with this kind of thing.

I'm also hoping to run some form of regex 'find' with the same rules first, so I can say to the user 'this feed appears to contain lots of redundant crap, would you like it cleaned for you? This may cause formatting issues.' or something to that effect. I can then process the replace rules if they choose to do so.

I'd appreciate any advice.

Rob
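For illustration, here is a minimal sketch of the detect-then-clean flow described above. It is written in Python rather than CFML so that it can be run as-is; the function names (looks_dirty, clean_description) and the two patterns are assumptions made for this example, not anything taken from a feed library. The patterns are deliberately simple and will not cope with every kind of malformed markup. The same regexes would probably carry over to CFML's REFindNoCase and REReplaceNoCase, though that is untested here.

```python
import html
import re

# Assumed pattern: anything that looks like a tag, e.g. <div>, </a>, <img src="x"/>.
TAG_RE = re.compile(r"</?[a-zA-Z][^>]*>")
# Assumed pattern: named or numeric character entities, e.g. &lt; &amp; &#39; &#x27;
ENTITY_RE = re.compile(r"&(?:[a-zA-Z]+|#\d+|#x[0-9a-fA-F]+);")


def looks_dirty(text):
    """Return True if the text appears to contain tags or character entities."""
    return bool(TAG_RE.search(text) or ENTITY_RE.search(text))


def clean_description(text):
    """Return a plain-text version of a feed description."""
    # Decode entities first, so that &lt;div&gt; becomes <div>
    # and is then caught by the tag pattern below.
    decoded = html.unescape(text)
    # Replace each tag with a space so adjacent words do not run together.
    stripped = TAG_RE.sub(" ", decoded)
    # Collapse the runs of whitespace left behind by removed tags.
    return re.sub(r"\s+", " ", stripped).strip()


sample = ('&lt;div&gt;Fish &amp; chips '
          '&lt;a href="http://example.com"&gt;here&lt;/a&gt;&lt;/div&gt; '
          '<img src="x.png"/>')

if looks_dirty(sample):
    # This is the point where the user would be asked before cleaning.
    print(clean_description(sample))  # prints: Fish & chips here
```

Decoding before stripping means a description that legitimately contains escaped angle brackets around a word would lose that text as well, which is one of the formatting issues the user would want to be warned about.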