>>> Can anyone tell me how to remove the XML code that text pasted from
>>> Word or Works adds? For example: <P class=MsoNormal style="MARGIN: 0in 0in 0pt">
>
> This is not XML but HTML.
> Anyway, to remove it, all you need are regular expressions.
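For what it's worth, the regular-expression approach suggested above might look something like the following in CFML. This is only a rough sketch, not a complete Word cleaner: the variable names and the sample string are made up for illustration, and it assumes the only things you want gone are the class and style attributes that Word adds to its tags.

```cfm
<!--- Hypothetical sample input: a paragraph as pasted from Word --->
<cfset pastedText = '<P class=MsoNormal style="MARGIN: 0in 0in 0pt">Hello from Word</P>'>

<!--- Remove class attributes, whether the value is double-quoted or unquoted --->
<cfset cleanText = REReplaceNoCase(pastedText, '\s+class=("[^"]*"|[^\s>]+)', '', 'all')>

<!--- Remove double-quoted inline style attributes --->
<cfset cleanText = REReplaceNoCase(cleanText, '\s+style="[^"]*"', '', 'all')>

<!--- Escape the result so the remaining markup is visible in the browser --->
<cfoutput>#HTMLEditFormat(cleanText)#</cfoutput>
```

With the sample above this should leave <P>Hello from Word</P>. Note that the patterns are not tag-aware, so they would also match something like class=foo appearing in ordinary body text after a space. For anything beyond a quick cleanup you would probably want to restrict the match to the inside of tags.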
To build on Claude's comments, there's a very good UDF at cflib.org called DeMoronize:
http://www.cflib.org/udf.cfm?id=725

From cflib:
--
Description
Fixes text using Microsoft Latin-1 "Extensions", namely ASCII characters 128-160. Supplies semicolons where missing in HTML numeric and common non-numeric entities. This is a rough port of John Walker's demoroniser, written in Perl.
http://www.fourmilab.ch/webtools/demoroniser/

Parameters
Name   Description            Required
text   Text to be modified.   Yes

Return Values
Returns a string.

Example
<cfset MSText = "My name is #Chr(147)#Foo#Chr(148)##Chr(133)#<br>">
<cfoutput>With MS Latin-1 Extensions:<br>#MSText#</cfoutput>
<cfset ValidText = DeMoronize(MSText)>
<cfoutput>Valid ASCII:<br>#ValidText#</cfoutput>
--

hth,
larry

--
Larry C. Lyons
Web Analyst
BEI Resources
American Type Culture Collection
http://www.beiresources.org
email: llyons(at)atcc(dot)org
tel: 703.365.2700.2678

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:290973