Converting MS Word-exported HTML to clean HTML
Does anyone have a CF component (it needs to be server-side), or a cfm script/page, that converts the HTML you get from an MS Word export into safe HTML for sending an HTML email? In particular, I'm looking to have the extended characters escaped into &#nnnn; references, such as the forward/backward quotes, the elongated hyphens, and the apostrophes that MS Word uses, and to have the styling stripped out, etc. I know there's something like this in fckEditor, but I'm looking for it to be server-side.

Thanks!
Dov

~|
Macromedia ColdFusion MX7 Upgrade to MX7 experience time-saving features, more productivity. http://www.adobe.com/products/coldfusion
Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:270752
Subscription: http://www.houseoffusion.com/groups/CF-Talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Converting MS Word-exported HTML to clean HTML
Dov,

I don't know of anything to strip out the styling, but DataMgr converts the extended characters automatically. The relevant code:

<cfscript>
// Replace the special characters that Microsoft uses with plain ASCII equivalents.
MyStruct[Key] = Replace(MyStruct[Key], Chr(8217), Chr(39), "ALL"); // right single quote / apostrophe
MyStruct[Key] = Replace(MyStruct[Key], Chr(8216), Chr(39), "ALL"); // left single quote
MyStruct[Key] = Replace(MyStruct[Key], Chr(8220), Chr(34), "ALL"); // left double quote
MyStruct[Key] = Replace(MyStruct[Key], Chr(8221), Chr(34), "ALL"); // right double quote
MyStruct[Key] = Replace(MyStruct[Key], Chr(8211), "-", "ALL"); // en dash
MyStruct[Key] = Replace(MyStruct[Key], Chr(8212), "-", "ALL"); // em dash
</cfscript>

HTH,

Steve Bryant
Bryant Web Consulting LLC
http://www.BryantWebConsulting.com/
http://steve.coldfusionjournal.com/

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:270754
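The replacements above swap Word's punctuation for plain ASCII. If the goal is instead to keep the characters but escape them as numeric references, as the original question asks, a minimal cfscript sketch could look like the following. The function name escapeExtendedChars is hypothetical and is not part of DataMgr; it assumes the input is already a Unicode string and only handles characters in the Basic Multilingual Plane, which covers Word's quotes and dashes.

```cfml
<cfscript>
// Hypothetical helper (an assumption, not DataMgr code): escape every
// character above ASCII 127 as a decimal numeric character reference.
function escapeExtendedChars(str) {
    var result = "";
    var i = 0;
    var ch = "";
    var code = 0;
    for (i = 1; i LTE Len(str); i = i + 1) {
        ch = Mid(str, i, 1);
        code = Asc(ch);
        if (code GT 127) {
            // "##" is a literal pound sign inside a CFML string
            result = result & "&##" & code & ";";
        } else {
            result = result & ch;
        }
    }
    return result;
}
</cfscript>
```

With this in place, a right single quote, Chr(8217), would come out as &#8217; while plain ASCII text passes through unchanged. Building the string one character at a time is simple rather than fast, so very large documents may want a different approach.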
Re: Converting MS Word-exported HTML to clean HTML
Dov Katz wrote:
> I know there's something like this in fckEditor, but I'm looking for it to be server-side.

The JS used to do this in FCK is in FCKEditor/editor/dialog/fck_paste.html. It is just a bunch of regex replaces; perhaps you could massage those over to some cfscript.

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:270757
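As a rough idea of what such server-side replaces might look like in cfscript, here is a small sketch. It is not a port of fck_paste.html: the function name stripWordMarkup and the patterns are illustrative assumptions, and they cover only style blocks, Office namespace tags, inline style attributes, and Mso* class attributes. Word's conditional comments and other residue would need further patterns.

```cfml
<cfscript>
// Hypothetical helper; patterns are illustrative, not taken from FCKeditor.
function stripWordMarkup(html) {
    var result = html;
    var startPos = 0;
    var endPos = 0;

    // Drop whole <style>...</style> blocks with plain string functions,
    // which sidesteps multi-line regex matching.
    startPos = FindNoCase("<style", result);
    while (startPos GT 0) {
        endPos = FindNoCase("</style>", result, startPos);
        if (endPos EQ 0) {
            break; // unterminated block: leave the rest untouched
        }
        // 8 is the length of the closing "</style>" tag
        result = RemoveChars(result, startPos, endPos + 8 - startPos);
        startPos = FindNoCase("<style", result);
    }

    // Remove Office namespace tags such as <o:p> and </o:p>.
    result = REReplaceNoCase(result, "</?o:[^>]*>", "", "ALL");
    // Remove inline style attributes.
    result = REReplaceNoCase(result, '\s+style="[^"]*"', "", "ALL");
    // Remove Word's Mso* class attributes, quoted or unquoted.
    result = REReplaceNoCase(result, '\s+class="?Mso[A-Za-z0-9]*"?', "", "ALL");

    return result;
}
</cfscript>
```

Usage would be along the lines of cleaned = stripWordMarkup(wordHtml), where wordHtml is assumed to hold the exported document as a string. Each pattern is deliberately narrow so that it is easy to see what it removes; test against real Word output before relying on it for email.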