On 13 Dec 2013, at 2:29 AM, peter <peterchutchin...@gmail.com> wrote:
> Thank you Jonathan for taking the time to make two good suggestions. The 
> first did not get the apostrophes to appear. WRT the second, one can only 
> control the encoding of Word when outputting txt files.
> 
> I have now created a workaround, a bit ugly but the best I can achieve for 
> the moment.
> 
> Smart apostrophes should be &#8217 for HTML, so I use TextPad and replace all 
> smart apostrophes with &#8217. I have to do this twice, once for the 
> left-leaning apostrophe and once for the right-leaning one.
> 
> This is not a nice solution but it does work.

One more thing to try: the mht file should have one or more charset 
declarations. What do they say?
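
A rough way to list them, as a sketch only (Python 3; find_charsets and the 
sample bytes are made up for illustration, and I'm assuming the declarations 
use the usual charset=NAME form, with "=" possibly written as "=3D" inside the 
quoted-printable parts):

```python
import re

# Matches charset=NAME in MIME headers and <meta> tags. The optional "3D"
# covers quoted-printable bodies, where "=" is written as "=3D".
CHARSET_RE = re.compile(
    rb'charset\s*=(?:3D)?\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)

def find_charsets(raw):
    """Return every charset name declared in the raw file bytes, in order."""
    return [m.group(1).decode('ascii') for m in CHARSET_RE.finditer(raw)]

# Made-up sample resembling what a Word .mht might contain.
sample = (b'Content-Type: text/html; charset="windows-1252"\r\n'
          b'\r\n'
          b'<meta http-equiv=3DContent-Type '
          b'content=3D"text/html; charset=3Dwindows-1252">')

charsets = find_charsets(sample)
```

On the real file you would call it as find_charsets(open(path, 'rb').read()), 
with path pointing at your .mht.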

It occurs to me that a couple of other things could be going wrong. One is that 
my re-coding suggestion isn't working because even though it might change the 
*character* encoding to UTF-8, the internal declarations are telling the 
browser to treat it as something else.
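
To illustrate what such a mismatch can look like (a sketch in Python 3, not 
necessarily the symptom you're seeing; the sample text is taken from your 
line, and "decoding as cp1252" is only standing in for what a browser would do 
if the declaration still said windows-1252):

```python
text = 'Jones\u2019 story'   # U+2019 is the right single quotation mark

# What an "ANSI" (windows-1252) file contains for this text.
cp1252_bytes = text.encode('cp1252')

# The re-coding step: decode as cp1252, encode as UTF-8.
utf8_bytes = cp1252_bytes.decode('cp1252').encode('utf-8')

# Bytes and declaration agree (both UTF-8): the apostrophe survives.
ok = utf8_bytes.decode('utf-8')

# Bytes are UTF-8 but are interpreted as windows-1252: the three UTF-8
# bytes of the apostrophe each turn into a separate character.
mismatch = utf8_bytes.decode('cp1252')
```

So if you re-code the bytes, the charset declarations need to change with 
them.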

Another problem is that the database itself might be doing some 
encoding/decoding, depending on how the field is defined. You probably want to 
make sure you're using a field that treats its contents as binary data, or else 
store it as base64 text.
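
The base64 route could look roughly like this (Python 3 sketch; the sample 
bytes are made up, and how you declare the field is up to your model):

```python
import base64

# cp1252 bytes as read from the file: 0x91 and 0x92 are the curly quotes.
raw = b'\x91One Lettuce Does Not a Salad Make\x92'

# base64 output is plain ASCII, so no layer in between has any reason
# to re-encode it on the way into or out of the database.
stored = base64.b64encode(raw).decode('ascii')

# Decoding gives back exactly the bytes that went in.
restored = base64.b64decode(stored)
```

You would then decode the restored bytes yourself, with whatever encoding the 
file really uses, just before display.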

Yet another thing to try: rather than using your browser's view-source to see 
what you're getting back, fetch the result with curl or wget and look at the 
result in a text editor. 
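
If a text editor doesn't make it obvious, something like this sketch would 
report which form the apostrophes arrived in (Python 3; apostrophe_forms is a 
made-up helper, and it only knows about the three forms discussed in this 
thread):

```python
def apostrophe_forms(raw):
    """Report how curly single quotes are represented in raw bytes."""
    found = []
    entities = (b'&lsquo;', b'&rsquo;', b'&#8216;', b'&#8217;')
    if any(e in raw for e in entities):
        found.append('entity')
    try:
        text = raw.decode('utf-8')
    except UnicodeDecodeError:
        # Not valid UTF-8, so look for the single-byte cp1252 quotes.
        if b'\x91' in raw or b'\x92' in raw:
            found.append('cp1252')
    else:
        if '\u2018' in text or '\u2019' in text:
            found.append('utf-8')
    return found
```

Call it on the saved response read in binary mode. An empty list would mean 
the apostrophes were already gone before the browser ever saw them.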

BTW, your html entities should be terminated with a semicolon. Depending on the 
next character in the string, most browsers will try to do the right thing 
without it, but don't count on it. And FWIW you can use &rsquo; and &lsquo; 
instead.
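
If you'd rather do the replacement in Python than in TextPad, a minimal sketch 
(Python 3; curly_quotes_to_entities is a made-up name, and it assumes you have 
already decoded the text to Unicode):

```python
import html

def curly_quotes_to_entities(text):
    """Replace curly single quotes with named, semicolon-terminated entities."""
    return text.replace('\u2018', '&lsquo;').replace('\u2019', '&rsquo;')

encoded = curly_quotes_to_entities('\u2018One Lettuce\u2019 is like Jones\u2019 story')

# Round trip through the standard library's entity decoder.
decoded = html.unescape(encoded)

# Without the semicolon, a following digit is read as part of the number,
# so this does not decode to an apostrophe followed by "5".
ambiguous = html.unescape('&#8217' + '5')
```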

One last thing. In Word for Mac 2011, when I save to .mht I get a dialog with a 
Web Options button. That button gives me a dialog with several tabs. One of the 
tabs is Encoding, which lets me save as UTF-8 (and set the default to UTF-8). 
Hopefully Word for Windows has something similar.

> 
> Peter
> 
> On Friday, 13 December 2013 02:20:31 UTC, Jonathan Lundell wrote:
> On 12 Dec 2013, at 6:12 PM, Jonathan Lundell <jlun...@pobox.com> wrote:
>> On 12 Dec 2013, at 4:16 PM, peter <peterchu...@gmail.com> wrote:
>>> I have a Word document that I output as a '.mht' file, i.e. a 'single file 
>>> web page'.
>>> 
>>> I can put sections of this into a string field in a database and then 
>>> display the field through a view, and the formatting in the Word document 
>>> is preserved. 
>>> 
>>> Here is a line from the file that I read into web2py and insert into a 
>>> field in a database.
>>> 
>>> <p class=3DStyle7 
>>> style=3D'line-height:11.5pt;mso-line-height-rule:exactly'><span 
>>> lang=3DEN-US style=3D'font-family:"Adobe 
>>> Garamond","serif";mso-bidi-font-family: "Adobe Garamond"'>‘One Lettuce Does 
>>> Not a Salad Make’ is similar to Jones’ story  .......
>>> 
>>> 
>>> Everything works fine except the apostrophes in the text disappear.
>>> 
>>> When I display the field on the screen, there are no apostrophes. If I 
>>> 'view source', it is as above, but without the apostrophes before One, 
>>> after Make and after Jones.
>>> 
>>> Clearly this is an encoding problem. If I read the .mht file into TextPad, 
>>> the apostrophes appear, and TextPad says the file is 'ANSI'. The question 
>>> is how do I read the file in such a way as to correctly encode the 
>>> apostrophes?
>>> 
>>> I have tried various encodings including 'locale.getpreferredencoding()'.
>>> 
>>> 
>>> Does anyone know how to solve this problem?
>>> 
>> 
>> Your email headers suggest that the string (at least in the email) is 
>> encoded as windows-1252.
>> 
>> So if s is your encoded string, you might try 
>> s.decode('cp1252').encode('utf8'). Assuming that UTF-8 is OK for output.
>> 
>> 
> 
> Alternatively, you might try to persuade Word to emit UTF-8 directly. This 
> might help: 
> http://office.microsoft.com/en-us/outlook-help/choose-text-encoding-when-you-open-and-save-files-HA010121249.aspx
> 
> -- 
> Resources:
> - http://web2py.com
> - http://web2py.com/book (Documentation)
> - http://github.com/web2py/web2py (Source code)
> - https://code.google.com/p/web2py/issues/list (Report Issues)



