[PHP] Cleaning pasted Word text

2002-10-29 Thread a . h . s . boy
I'm working on a PHP-based CMS that allows users to post lengthy article texts by submitting through a form. The short version of my quandary is this: How can I create a conversion routine that reliably substitutes HTML-acceptable output for high-ASCII characters pasted into the form (from

Re: [PHP] Cleaning pasted Word text

2002-10-29 Thread Brent Baisley
I think you have posted before and probably didn't get an answer. I'm not going to give you an answer (because I don't have one), but perhaps I can point you in the right direction. Look at http://www.w3.org/TR/REC-html40/charset.html and see if that helps you. Below is a paragraph I pulled

Re: [PHP] Cleaning pasted Word text

2002-10-29 Thread a . h . s . boy
Brent -- Thanks for the pointer, but it doesn't really address the problem. I am specifying the character set for the page (ISO-8859-1), and I'm inserting an ACCEPT-CHARSET parameter into the FORM element, but it specifies acceptable charsets as UTF-8, ISO-8859-1, and Windows 1252. The
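
Because the form advertises three acceptable charsets, the script cannot be sure which encoding a given submission arrives in. One possible way to normalize the input, sketched here, is to keep anything that already validates as UTF-8 and treat everything else as Windows-1252. The function name `to_utf8` is hypothetical, the mbstring extension is assumed to be available, and the "not UTF-8 means Windows-1252" rule is a heuristic rather than something the thread proposes.

```php
<?php
// Hypothetical helper (not from the thread): normalize a form submission to UTF-8.
// Assumption: input that is not valid UTF-8 is Windows-1252, whose 0x80-0x9F
// range holds Word's curly quotes, dashes, bullets and ellipsis.
function to_utf8(string $text): string
{
    // Valid UTF-8 is passed through untouched.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Otherwise reinterpret the bytes as Windows-1252 and convert.
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}
```

Short single-byte strings can occasionally pass as valid UTF-8 by accident, so this is a best-effort guess, not a guarantee.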

Re: [PHP] Cleaning pasted Word text

2002-10-29 Thread Daniel Guerrier
Paste into Notepad, then copy the text from Notepad. Notepad should remove the high ASCII text. --- Brent Baisley [EMAIL PROTECTED] wrote: I think you have posted before and probably didn't get an answer. I'm not going to give you an answer (because I don't have one), but perhaps I can

Re: [PHP] Cleaning pasted Word text

2002-10-29 Thread Jimmy Brake
For FileMaker Pro (Windows/Mac) -- Word (Windows/Mac): function make_safe($text) { $text = preg_replace('/(\cM)/', '', $text); $text = preg_replace('/(\c])/', '', $text); $text = str_replace("\r\n", '', $text); $text = str_replace("\x0B", '', $text); $text =

Re: [PHP] Cleaning pasted Word text

2002-10-29 Thread a . h . s . boy
Errr...I'm not sure how this is applicable to my situation. I'm concerned, above all, with converting curly double quotes, curly single quotes, em and en dashes, inverted exclamation points, inverted question marks, ellipses, non-breaking spaces, registered trademark symbols, bullets, left and right
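
A conversion of the kind listed above could be sketched as a byte-to-entity lookup table. This is a minimal sketch, not the routine the poster ended up with: the function name `word_chars_to_entities` is hypothetical, the input is assumed to be single-byte Windows-1252 text, and the map covers only characters named in the message.

```php
<?php
// Hypothetical helper (not from the thread): replace common Word characters
// with numeric HTML entities. Assumes $text holds Windows-1252 bytes.
function word_chars_to_entities(string $text): string
{
    static $map = [
        "\x85" => '&#8230;', // horizontal ellipsis
        "\x91" => '&#8216;', // left single curly quote
        "\x92" => '&#8217;', // right single curly quote
        "\x93" => '&#8220;', // left double curly quote
        "\x94" => '&#8221;', // right double curly quote
        "\x95" => '&#8226;', // bullet
        "\x96" => '&#8211;', // en dash
        "\x97" => '&#8212;', // em dash
        "\xA0" => '&#160;',  // non-breaking space
        "\xA1" => '&#161;',  // inverted exclamation point
        "\xAE" => '&#174;',  // registered trademark symbol
        "\xBF" => '&#191;',  // inverted question mark
    ];
    // strtr() with an array replaces every key in one pass over the string.
    return strtr($text, $map);
}

echo word_chars_to_entities("\x93Quoted\x94 text \x97 with an ellipsis\x85"), "\n";
```

Numeric entities are used because they render the same whatever charset the page declares. The lookup works byte by byte, so it should only be run on single-byte input; on UTF-8 text some of these byte values occur inside multi-byte sequences and would be corrupted.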