Converting MS Word-exported HTML to clean HTML

2007-02-27 Thread Dov Katz
Does anyone have a CF (needs to be server-side) component, or cfm script/page 
to convert the HTML you get from an MS Word export into safe HTML for sending 
an HTML email?

In particular, I'm looking to have the extended characters escaped into 
&#NNNN; references, such as the forward/backward quotes, the elongated hyphens, 
and the apostrophes that MS Word uses, and the style stripped out, etc.

I know there's something like this in fckEditor, but I'm looking for it to be 
server-side. 

Thanks!
Dov

Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:270752
Subscription: http://www.houseoffusion.com/groups/CF-Talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4


Re: Converting MS Word-exported HTML to clean HTML

2007-02-27 Thread Steve Bryant
Dov,

I don't know of anything to strip out the styling, but DataMgr converts the 
extended characters automatically.

The relevant code:

<cfscript>
// Replace the special characters that Microsoft uses with plain ASCII.
MyStruct[Key] = Replace(MyStruct[Key], Chr(8217), Chr(39), "ALL"); // right single quote / apostrophe
MyStruct[Key] = Replace(MyStruct[Key], Chr(8216), Chr(39), "ALL"); // left single quote
MyStruct[Key] = Replace(MyStruct[Key], Chr(8220), Chr(34), "ALL"); // left double quote
MyStruct[Key] = Replace(MyStruct[Key], Chr(8221), Chr(34), "ALL"); // right double quote
MyStruct[Key] = Replace(MyStruct[Key], Chr(8211), "-", "ALL"); // en dash
MyStruct[Key] = Replace(MyStruct[Key], Chr(8212), "-", "ALL"); // em dash
</cfscript>
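
The replacements above swap Word's characters for plain ASCII. If the goal is instead to escape them into numeric references, as the original question asked, a minimal sketch might look like the following. The function name escapeExtendedChars is hypothetical (it is not part of DataMgr), and the sketch assumes the input is already a correctly decoded Unicode string.

```cfml
<cfscript>
// Hypothetical helper, not part of DataMgr.
// Replaces every character above ASCII 127 with a numeric character
// reference, e.g. Chr(8217) becomes &#8217;.
// Caveat: characters outside the Basic Multilingual Plane are not handled specially.
function escapeExtendedChars(str) {
    var result = "";
    var i = 0;
    var ch = "";
    var code = 0;
    for (i = 1; i lte Len(str); i = i + 1) {
        ch = Mid(str, i, 1);
        code = Asc(ch);
        if (code gt 127) {
            // "##" is an escaped pound sign inside a CFML string literal.
            result = result & "&##" & code & ";";
        } else {
            result = result & ch;
        }
    }
    return result;
}
</cfscript>
```

It could then be called as MyStruct[Key] = escapeExtendedChars(MyStruct[Key]); in place of the six Replace calls. Building the string one character at a time is simple but may be slow on very large documents.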

HTH,

Steve Bryant
Bryant Web Consulting LLC
http://www.BryantWebConsulting.com/
http://steve.coldfusionjournal.com/ 


Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:270754


Re: Converting MS Word-exported HTML to clean HTML

2007-02-27 Thread Jim Wright
Dov Katz wrote:
> I know there's something like this in fckEditor, but I'm looking for it to be 
> server-side. 

The JS used to do this in FCK is in 
FCKEditor/editor/dialog/fck_paste.html. It is just a bunch of regex 
replaces; perhaps you could port those over to some cfscript.
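
As a rough idea of what such a port might look like, here is a sketch of a few regex replaces in cfscript. The function name stripWordMarkup is hypothetical, and the patterns are illustrative guesses at typical Word output; they are not copied from fck_paste.html.

```cfml
<cfscript>
// Hypothetical helper; patterns are illustrative, not taken from fck_paste.html.
function stripWordMarkup(html) {
    var out = html;
    // Remove HTML comments, including Word's conditional comments.
    out = REReplace(out, "<!--[\s\S]*?-->", "", "ALL");
    // Remove embedded <style> blocks.
    out = REReplaceNoCase(out, "<style[^>]*>[\s\S]*?</style>", "", "ALL");
    // Remove XML declarations such as <?xml:namespace ... />.
    out = REReplaceNoCase(out, "<\?xml[^>]*>", "", "ALL");
    // Remove namespaced tags such as <o:p> and </o:p>, keeping their contents.
    out = REReplaceNoCase(out, "</?[a-z0-9]+:[^>]*>", "", "ALL");
    // Remove class, style, and lang attributes (double-quoted, single-quoted, or unquoted).
    out = REReplaceNoCase(out, "\s+(class|style|lang)\s*=\s*(""[^""]*""|'[^']*'|[^\s>]+)", "", "ALL");
    return out;
}
</cfscript>
```

Note that the attribute pattern is not tag-aware, so it would also remove body text that happens to look like class=..., style=..., or lang=...; a real cleanup would want to test against actual Word exports first.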



Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:270757