On 13 Dec 2013, at 2:29 AM, peter <peterchutchin...@gmail.com> wrote:
> Thank you Jonathan for taking the time to make two good suggestions. The
> first did not get the apostrophes to appear. WRT the second, one can only
> control the encoding of Word when outputting txt files.
>
> I have now created a workaround, a bit ugly but the best I can achieve for
> the moment.
>
> smart apostrophes should be ’ for html. So I use textpad and replace all
> smart apostrophes with ’. I have to do this twice, once for the left
> leaning apostrophe and once for the right leaning one.
>
> This is not a nice solution but it does work.
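The TextPad search-and-replace quoted above could also be done in Python before the text goes into the database. This is only a minimal sketch, assuming the text has already been decoded to a unicode string; the function name `smart_quotes_to_entities` is made up for illustration and is not part of web2py.

```python
# Map the two curly single quotes to numeric HTML entities.
# Each entity ends with a semicolon. The named forms &lsquo; and &rsquo;
# are equivalent to the numeric ones used here.
SMART_QUOTES = {
    u"\u2018": "&#8216;",  # left single quotation mark
    u"\u2019": "&#8217;",  # right single quotation mark
}


def smart_quotes_to_entities(text):
    """Return text with curly single quotes replaced by HTML entities.

    Hypothetical helper: assumes `text` is already a unicode string.
    """
    for char, entity in SMART_QUOTES.items():
        text = text.replace(char, entity)
    return text


sample = u"\u2018One Lettuce Does Not a Salad Make\u2019 is similar to Jones\u2019 story"
print(smart_quotes_to_entities(sample))
```

Because the result is plain ASCII, it should survive whatever encoding the database field or the browser assumes, which is why this kind of workaround tends to work even when the underlying encoding mismatch is unresolved.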
One more thing to try: the mht file should have one or more charset declarations. What do they say?

It occurs to me that a couple of other things could be going wrong. One is that my re-coding suggestion isn't working because, even though it might change the *character* encoding to UTF-8, the internal declarations are telling the browser to treat it as something else.

Another problem is that the database itself might be doing some encoding/decoding, depending on how the field is defined. You probably want to make sure you're using a field that treats its contents as binary data, or else store it as base64 text.

Yet another thing to try: rather than using your browser's view-source to see what you're getting back, fetch the result with curl or wget and look at the result in a text editor.

BTW, your html entities should be terminated with a semicolon. Depending on the next character in the string, most browsers will try to do the right thing without it, but don't count on it. And FWIW you can use &rsquo; and &lsquo; instead.

One last thing. In Word for Mac 2011, when I save to .mht I get a dialog with a Web Options button. That button gives me a dialog with several tabs. One of the tabs is Encoding, which lets me save as UTF-8 (and set the default to UTF-8). Hopefully Word for Windows has something similar.

>
> Peter
>
> On Friday, 13 December 2013 02:20:31 UTC, Jonathan Lundell wrote:
> On 12 Dec 2013, at 6:12 PM, Jonathan Lundell <jlun...@pobox.com> wrote:
>> On 12 Dec 2013, at 4:16 PM, peter <peterchu...@gmail.com> wrote:
>>> I have a word document that I output as a '.mht' file ie, a 'single file
>>> web page'.
>>>
>>> I can put sections of this into a string field in a database and then
>>> display the field through a view, and the formatting in the word document
>>> is preserved.
>>>
>>> here is a line from the file that I read into web2py and insert into a
>>> field in a database.
>>>
>>> <p class=3DStyle7
>>> style=3D'line-height:11.5pt;mso-line-height-rule:exactly'><span
>>> lang=3DEN-US style=3D'font-family:"Adobe
>>> Garamond","serif";mso-bidi-font-family: "Adobe Garamond"'>‘One Lettuce Does
>>> Not a Salad Make’ is similar to Jones’ story .......
>>>
>>>
>>> Everything works fine except the apostrophes in the text disappear.
>>>
>>> When I display the field on the screen, there are no apostrophes. If I
>>> 'view source', it is as above, but without the apostrophes before One,
>>> after Make and after Jones.
>>>
>>> Clearly this is an encoding problem. If I read the .mht file into textpad,
>>> the apostrophes appear, and textpad says the file is 'ANSI'. The question
>>> is how do I read the file in such a way as to correctly encode the
>>> apostrophes?
>>>
>>> I have tried various encodings including 'locale.getpreferredencoding()'.
>>>
>>>
>>> Does anyone know how to solve this problem
>>>
>>
>> Your email headers suggest that the string (at least in the email) is
>> encoded as windows-1252.
>>
>> So if s is your encoded string, you might try
>> s.decode('cp1252').encode('utf8'). Assuming that UTF-8 is OK for output.
>>
>>
>
> Alternatively, you might try to persuade Word to emit UTF-8 directly. This
> might help:
> http://office.microsoft.com/en-us/outlook-help/choose-text-encoding-when-you-open-and-save-files-HA010121249.aspx
>
> --
> Resources:
> - http://web2py.com
> - http://web2py.com/book (Documentation)
> - http://github.com/web2py/web2py (Source code)
> - https://code.google.com/p/web2py/issues/list (Report Issues)

---
You received this message because you are subscribed to the Google Groups "web2py-users" group.
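P.S. A rough sketch of the two checks discussed in this thread: listing the charset declarations inside the .mht file, and re-coding the content from windows-1252 to UTF-8. It assumes the file was read in binary mode and that the apostrophes are stored as raw cp1252 bytes (0x91 and 0x92) rather than as quoted-printable escapes. The function names `find_charsets` and `recode_to_utf8` are invented for this example.

```python
import re

# Matches MIME-header style (charset="windows-1252") as well as the
# quoted-printable form found inside the HTML part (charset=3Dwindows-1252).
CHARSET_RE = re.compile(br'charset=(?:3D)?["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE)


def find_charsets(raw_bytes):
    """Return every charset name declared in the raw bytes of a file."""
    return [name.decode("ascii") for name in CHARSET_RE.findall(raw_bytes)]


def recode_to_utf8(raw_bytes, source_encoding="cp1252"):
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")


# Stand-in for the contents of the .mht file, e.g. open(path, "rb").read()
sample = (
    b'Content-Type: text/html; charset="windows-1252"\r\n'
    b'<meta http-equiv=3DContent-Type '
    b'content=3D"text/html; charset=3Dwindows-1252">\r\n'
    b'\x91One Lettuce Does Not a Salad Make\x92'
)

print(find_charsets(sample))
print(repr(recode_to_utf8(b"\x91One Lettuce\x92")))
```

Note that re-coding the bytes does not rewrite the declarations themselves. If the file still says windows-1252 after its content has been converted to UTF-8, the browser may go on misreading it, so the declarations would need to be updated to match.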