Re: [docbook-apps] Unicode characters in epub
On Tue, January 31, 2012 10:39 pm, Boris Schäling wrote:
...
> 1. My book is about C++. Unfortunately C++ is not a word - so
> e-readers seem to break C++ wherever they like. A line could end
> with C+ or C, and the plus sign(s) is on the next line. I turned C++
> into C&#xfeff;+&#xfeff;+ (which is already crazy as I don't know how
> often I refer to C++ in my book). However this had some unfortunate
> side effects: If &#xfeff; is used in the book title or titles which
> appear in the table of contents, the Sony Reader displays rectangles
> (not in the body text though).

&#xFEFF; has a dual role as Zero Width No-Break Space and as the byte order mark (BOM). Unicode 3.2 added &#x2060;, WORD JOINER, which serves only as a word joiner. [1]

The Unicode Standard says that you are supposed to use &#x2060; in new text, and that applications are supposed to support word joining with either &#x2060; or &#xFEFF;.

Maybe, just maybe, your EPUB readers will do better with &#x2060; than they do with &#xFEFF;.

Regards,

Tony Graham
tgra...@mentea.net
Consultant
http://www.mentea.net
Mentea
13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
XML, XSL-FO and XSLT consulting, training and programming

[1] Page 5 (or 524) of http://www.unicode.org/versions/Unicode6.0.0/ch16.pdf

To unsubscribe, e-mail: docbook-apps-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: docbook-apps-h...@lists.oasis-open.org
RE: [docbook-apps] Unicode characters in epub
-----Original Message-----
From: Tony Graham [mailto:tgra...@mentea.net]
Sent: Wednesday, 1 February 2012 14:21
To: docbook-apps@lists.oasis-open.org
Subject: Re: [docbook-apps] Unicode characters in epub

[...]

> The Unicode Standard says that you are supposed to use &#x2060; in
> new text, and that applications are supposed to support word joining
> with either &#x2060; or &#xFEFF;.
>
> Maybe, just maybe, your EPUB readers will do better with &#x2060;
> than they do with &#xFEFF;.

Thanks, I just tried it: Adobe Digital Editions and the Sony Reader show a rectangle with a 0 inside when &#x2060; is used in the book title or table of contents. The Kindle shows rectangles in the table of contents. I didn't see any problems on the Kobo Touch.

Anyway, I find this too risky and will probably not use any special characters unless I know that without them some parts of the book become entirely unreadable.

Boris

[...]
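
The word-joiner substitution discussed in this thread could be scripted instead of typed by hand. Below is a minimal Python sketch, assuming the text is handled as plain strings before it goes into the DocBook source. The helper names glue, protect and as_char_refs are hypothetical, and the sketch does nothing about how a particular e-reader renders the characters.

```python
WORD_JOINER = "\u2060"  # U+2060 WORD JOINER, the character recommended for new text
ZWNBSP = "\ufeff"       # U+FEFF, the older character that also serves as the BOM


def glue(token: str, joiner: str = WORD_JOINER) -> str:
    """Put a joiner between every pair of characters in token."""
    return joiner.join(token)


def protect(text: str, tokens=("C++",), joiner: str = WORD_JOINER) -> str:
    """Replace each listed token in text with its glued form."""
    for token in tokens:
        text = text.replace(token, glue(token, joiner))
    return text


def as_char_refs(text: str) -> str:
    """Spell the invisible joiners as XML character references."""
    return text.replace(WORD_JOINER, "&#x2060;").replace(ZWNBSP, "&#xFEFF;")


body = protect("Templates in C++ are powerful.")
print(as_char_refs(body))  # Templates in C&#x2060;+&#x2060;+ are powerful.
```

Given the rectangles reported above for the book title and the table of contents, a script like this would presumably be run over body text only, leaving titles untouched.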
[docbook-apps] Unicode characters in epub
Hello,

I successfully generated an epub file with the epub stylesheets and tested it on various e-readers (or emulators of e-readers). I have some annoying problems with Unicode characters though and wonder what others do or recommend.

1. My book is about C++. Unfortunately C++ is not a word - so e-readers seem to break C++ wherever they like. A line could end with C+ or C, and the plus sign(s) is on the next line. I turned C++ into C&#xfeff;+&#xfeff;+ (which is already crazy as I don't know how often I refer to C++ in my book). However this had some unfortunate side effects: If &#xfeff; is used in the book title or titles which appear in the table of contents, the Sony Reader displays rectangles (not in the body text though). If I use &#xfeff; somewhere else like in --&#xfeff;option in the body text (to keep a command-line option from being broken after the double minus), the Sony Reader displays something like --`option (and still breaks after the double minus). I don't know whether this is only a problem with the Sony Reader. But if in doubt I would rather have line breaks than have some readers see rectangles or other funny characters everywhere.

2. Some e-readers like the Sony Reader and the Kobo Touch don't break long words. If you have a book about C++, you can have very long paths to header files or very long macros. The Kindle does the right thing and puts a line break into a word that you couldn't read otherwise. I tried different CSS properties like word-wrap and overflow-wrap but to no avail. Is there any trick to make e-readers break words by all means if they are too long?

3. I use a table with three columns in my book which is already difficult to display on a narrow e-reader. If there are some long words, e-readers can mess up completely (because of 2.). So I added &#xad; here and there to insert soft hyphens. The Sony Reader, Kobo Touch and Adobe Digital Editions do break the words now where I put &#xad; - but they don't display a hyphen! Adobe Digital Editions does display a hyphen in the table of contents if I add &#xad; to a chapter title - although the chapter title doesn't need to be broken, and isn't, in the table of contents. Only the Kindle seems to do the right thing.

So is the conclusion that it's better not to try to beautify an epub with Unicode characters? I think I'll use &#xad; where it's absolutely required to break words (like in a table with three columns) because I know that otherwise some parts of the text will not be displayed at all. Everywhere else it's probably better to blame the e-reader? ;)

Boris
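
Placing soft hyphens "here and there" in long header paths or macro names could also be automated. The Python sketch below is only one possible approach and not the method used in the message above: it assumes a fixed-width split is acceptable, and the names soften, soften_text, as_char_refs and the 12-character threshold are hypothetical. Whether a reader then breaks the word, or shows a hyphen, still depends on the device.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN, written &#xad; in the source


def soften(word: str, max_run: int = 12) -> str:
    """Insert a soft hyphen after every max_run characters of an over-long word."""
    if len(word) <= max_run:
        return word
    chunks = [word[i:i + max_run] for i in range(0, len(word), max_run)]
    return SOFT_HYPHEN.join(chunks)


def soften_text(text: str, max_run: int = 12) -> str:
    """Apply soften() to every whitespace-delimited word in text."""
    return re.sub(r"\S+", lambda m: soften(m.group(0), max_run), text)


def as_char_refs(text: str) -> str:
    """Spell soft hyphens as XML character references."""
    return text.replace(SOFT_HYPHEN, "&#xad;")


# Illustrative input only: a long header path of the kind mentioned in item 2.
cell = soften_text("see boost/interprocess/managed_shared_memory.hpp")
print(as_char_refs(cell))
```

Limiting this to the places where text would otherwise be cut off, such as narrow table cells, matches the cautious use of &#xad; proposed above.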