Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-11 Thread Keith J. Schultz
Hi All, Philip. Let me recap the situation here: The original post from Scott stated he had a problem going from his wiki to PDF via Xe(La)TeX! His problem involved texts with mixed directionality. I did not express myself very well and should have said that in Unicode one can identify

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-10 Thread Keith J. Schultz
Hi Philip, I will repeat: I do not know Vietnamese, so I cannot give you the UTF-8 sequence for it. All I can say is that in UTF-8 the individual letters will be encoded as multi-byte sequences, whereas the English letters will be just one byte each. Now, I also mentioned that differentiating western language

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-10 Thread Zdenek Wagner
2013/12/10 Keith J. Schultz keithjschu...@web.de: Hi Philip, I will repeat: I do not know Vietnamese, so I cannot give you the UTF-8 sequence for it. All I can say is that in UTF-8 the individual letters will be encoded as multi-byte sequences, whereas the English letters will be just one byte each. It has no

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-10 Thread C. Scott Ananian
On Tue, Dec 10, 2013 at 6:09 AM, Zdenek Wagner zdenek.wag...@gmail.com wrote: 2013/12/10 Keith J. Schultz keithjschu...@web.de: I will repeat: I do not know Vietnamese, so I cannot give you [...] Now, if "sang" is true Vietnamese and not a latinized form, I stand corrected! Though I have [...]

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-10 Thread Andrew Cunningham
On 11/12/2013 5:27 AM, C. Scott Ananian csc...@cscott.net wrote: ..which is indeed the issue I am attempting to deal with (trying to put the discussion back on track) -- a bunch of user authored content which looks correct to a native speaker when using the unicode bidi algorithm

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-10 Thread Khaled Hosny
On Wed, Dec 11, 2013 at 04:36:31AM +0200, Khaled Hosny wrote: On Tue, Dec 10, 2013 at 11:11:27AM -0500, C. Scott Ananian wrote: On Tue, Dec 10, 2013 at 6:09 AM, Zdenek Wagner zdenek.wag...@gmail.com wrote: 2013/12/10 Keith J. Schultz keithjschu...@web.de: I will repeat I do not know

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-10 Thread Khaled Hosny
On Wed, Dec 11, 2013 at 08:36:53AM +1100, Andrew Cunningham wrote: More to the point which libraries is XeTeX using for Bidi support and how up-to-date are they? The issue is not the libraries (we use ICU which implemented the latest BiDi algorithm changes in its last release), but the fact that

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Keith J. Schultz
Hi Khaled, your question cannot be serious! It is pretty much in the standard! True enough that for most western languages (American, English, Spanish, German, Austrian, etc.) this is somewhat difficult. Yet, these are not causing the problems. Regards, Keith. On 05.12.2013 at 09:46

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Khaled Hosny
On Mon, Dec 09, 2013 at 09:22:10AM +0100, Keith J. Schultz wrote: Hi Khaled, your question cannot be serious! No, it is. It is pretty much in the standard! No. True enough that for most western languages (American, English, Spanish, German, Austrian, etc.) this is somewhat difficult.

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Keith J. Schultz
Hi Khaled, I would agree with you if the text was not encoded in Unicode! A properly encoded UTF-8 string should contain everything you need! Unfortunately, for efficiency reasons, UTF-8 strings are not properly encoded and programs assume a particular language, to save space. In multi-language

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Philip Taylor
Keith -- could you possibly supply an example of a properly encoded UTF-8 string from which it can be unambiguously determined whether the string "sang" is an English word (the past tense of "sing") or a Vietnamese word meaning "to", "posh" or "knowingly" in English? Could you also paste that string into
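
As a rough illustration of the question above (a sketch, not part of the original message): UTF-8 encodes code points, not languages, so the four ASCII letters of "sang" produce the same four bytes whichever language they are meant to be in. Only characters outside ASCII become multi-byte sequences; the Vietnamese letter U+1EC7 is used below as an arbitrary example. The helper name utf8_hex is made up for this illustration.

```python
def utf8_hex(text):
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))

# Plain ASCII letters: one byte each, with no language information attached.
print(utf8_hex("sang"))        # 73 61 6e 67

# U+1EC7 (LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW) takes three bytes.
print(utf8_hex("Vi\u1ec7t"))   # 56 69 e1 bb 87 74
```

Byte length therefore tells you which code points were used, but nothing about whether an all-ASCII word is English or Vietnamese.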

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Zdenek Wagner
2013/12/9 Philip Taylor p.tay...@rhul.ac.uk: Keith -- could you possibly supply an example of a properly encoded UTF-8 string from which it can be unambiguously determined whether the string "sang" is an English word (the past tense of "sing") or a Vietnamese word meaning "to", "posh" or "knowingly"

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread mskala
On Mon, 9 Dec 2013, Khaled Hosny wrote: U+E0001 U+E0065 U+E006E U+0073 U+0061 U+006E U+0067 And it is a kind of tagging, so beyond the scope of identifying the language of *untagged* text (which is the claim that spurred all this discussion). The claim was A properly encoded utf-8 string
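
The code point sequence quoted above can be unpacked as follows. This is a sketch that assumes only what the Unicode standard says about the (deprecated) tag characters: U+E0001 LANGUAGE TAG introduces a tag, and U+E0020 to U+E007E mirror ASCII at an offset of 0xE0000. The function name split_language_tag is invented for illustration.

```python
LANGUAGE_TAG = 0xE0001                    # U+E0001 LANGUAGE TAG
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E    # tag characters mirroring ASCII
TAG_OFFSET = 0xE0000

def split_language_tag(text):
    """Return (tag, rest) for text that may start with a Unicode language tag.

    tag is None when text does not begin with U+E0001.
    """
    if not text or ord(text[0]) != LANGUAGE_TAG:
        return None, text
    tag_chars = []
    i = 1
    while i < len(text) and TAG_FIRST <= ord(text[i]) <= TAG_LAST:
        tag_chars.append(chr(ord(text[i]) - TAG_OFFSET))
        i += 1
    return "".join(tag_chars), text[i:]

tagged = "\U000E0001\U000E0065\U000E006E" + "sang"
print(split_language_tag(tagged))   # ('en', 'sang')
print(split_language_tag("sang"))   # (None, 'sang')
```

The language is recoverable in the first call only because the string was explicitly tagged; the untagged string carries no such information.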

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread mskala
On Mon, 9 Dec 2013, Zdenek Wagner wrote: A bit off topic, do you know a good Linux text editor with a properly implemented bidi algorithm so that I could type multilingual texts? No, I don't really do any work with RTL languages myself. Wikipedia's comparison list at

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread maxwell
On 2013-12-09 11:15, Zdenek Wagner wrote: A bit off topic, do you know a good Linux text editor with a properly implemented bidi algorithm so that I could type multilingual texts? Yudit (http://www.yudit.org/) claims to be that. I have not used it. Mike Maxwell

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread C. Scott Ananian
In my particular case, I have citations in (for example) the Arabic Wikipedia, which cite references on English or Turkish webpages (to cite the example of the arwiki article on 'Istanbul'). The original author of the article did not explicitly mark the language of the reference, because the
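
One way to picture the preprocessing problem described above is to split unmarked text into directional runs before handing it to a typesetter. The sketch below is deliberately much simpler than the Unicode bidi algorithm: it looks only at strong bidi classes (L, R, AL) and lets every other character inherit the direction of the run it follows. The function name directional_runs is hypothetical.

```python
import unicodedata

def directional_runs(text):
    """Split text into (direction, substring) runs using strong bidi classes only.

    Neutral and weak characters (spaces, digits, punctuation) are attached to
    the run they follow; this is a simplification, not UAX #9.
    """
    runs = []
    current = None   # direction of the run being collected, or None so far
    buf = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            d = "ltr"
        elif cls in ("R", "AL"):
            d = "rtl"
        else:
            d = current              # inherit from the current run
        if current is not None and d != current:
            runs.append((current, "".join(buf)))
            buf = []
        buf.append(ch)
        current = d
    if buf:
        # Text with no strong character at all defaults to left-to-right.
        runs.append((current or "ltr", "".join(buf)))
    return runs

# An Arabic word followed by a Latin-script title, with no markup at all.
print(directional_runs("\u0645\u0631\u062c\u0639 Istanbul"))
```

A run list like this says nothing about language, only direction, which is why a Turkish and an English title come out looking the same.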

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread mskala
On Mon, 9 Dec 2013, C. Scott Ananian wrote: feeding the output to xelatex. That work won't help others who find themselves in a similar situation (or document authors who would prefer not to have to explicitly annotate every LTR embedding), but it The software also doesn't automatically

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Axel E. Retif
On 12/09/2013 10:15 AM, Zdenek Wagner wrote: A bit off topic, do you know a good Linux text editor with a properly implemented bidi algorithm so that I could type multilingual texts? Even the combination of Urdu and TeX macros is a pain, it is not easy to type \textbf{میں نے \today\ کو سب کچھ

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Khaled Hosny
On Mon, Dec 09, 2013 at 09:32:05AM -0600, msk...@ansuz.sooke.bc.ca wrote: On Mon, 9 Dec 2013, Khaled Hosny wrote: U+E0001 U+E0065 U+E006E U+0073 U+0061 U+006E U+0067 And it is a kind of tagging, so beyond the scope of identifying the language of *untagged* text (which is the claim

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Khaled Hosny
On Mon, Dec 09, 2013 at 01:40:21PM -0600, msk...@ansuz.sooke.bc.ca wrote: On Mon, 9 Dec 2013, C. Scott Ananian wrote: feeding the output to xelatex. That work won't help others who find themselves in a similar situation (or document authors who would prefer not to have to explicitly

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread mskala
On Tue, 10 Dec 2013, Khaled Hosny wrote: Now you beat Keith in Who Wrote The Most Nonessential Text In This Thread contest. Well, it's always nice to be a winner. -- Matthew Skala msk...@ansuz.sooke.bc.ca People before principles. http://ansuz.sooke.bc.ca/

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-09 Thread Philip Taylor
Keith J. Schultz wrote: Hi Philip, 1) I do not know Vietnamese! 2) If I did, using the proper BMP would give me the answer. As "sang" would be a sequence of single octets, and Vietnamese would use multi-byte sequences! Case closed! Like I mentioned there are often ways

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-07 Thread Georg Duffner
On 07.12.2013 01:28, Dominik Wujastyk wrote: I'm sensing, I think, that you don't like that font, Khaled? Dominik :-) He’s not alone and the Arabic is not the only problem... Georg 2013/12/5 Khaled Hosny khaledho...@eglug.org: Please, please, please, never ever use GNU free font

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-07 Thread Khaled Hosny
Not at all! I even designed a companion Latin font, so that readers of Latin script can enjoy the same quality and polish of FreeSerif that Arabic script readers enjoy: http://www.khaledhosny.org/files/tmp/freeserif.html Please use both and don’t let Arabic readers have all the joy. Regards,

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-07 Thread Zdenek Wagner
2013/12/7 Khaled Hosny khaledho...@eglug.org: Not at all! I even designed a companion Latin font, so that readers of Latin script can enjoy the same quality and polish of FreeSerif that Arabic script readers enjoy: http://www.khaledhosny.org/files/tmp/freeserif.html Do I understand well

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-07 Thread Khaled Hosny
On Sat, Dec 07, 2013 at 03:15:53PM +0100, Zdenek Wagner wrote: 2013/12/7 Khaled Hosny khaledho...@eglug.org: Not at all! I even designed a companion Latin font, so that readers of Latin script can enjoy the same quality and polish of FreeSerif that Arabic script readers enjoy:

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-06 Thread Dominik Wujastyk
I'm sensing, I think, that you don't like that font, Khaled? Dominik :-) 2013/12/5 Khaled Hosny khaledho...@eglug.org: Please, please, please, never ever use GNU free font for Arabic; it is the most hideous, crappy and useless un-Arabic font ever created, my blood boils every time I

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread Keith J. Schultz
Hi Scott, We are talking Unicode here, right? What is there to guess? Then there is always the possibility of having the text tagged when written by the original author. Of course, only when you can control his input tools. Lua(La)TeX has another great feature. You have a complete programming

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread Khaled Hosny
On Thu, Dec 05, 2013 at 09:41:04AM +0100, Keith J. Schultz wrote: Hi Scott, We are talking Unicode here, right? What is there to guess? And how do you, using Unicode, tell in what language this line is written? Regards, Khaled --

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread Jonathan Kew
On 4/12/13 13:24, C. Scott Ananian wrote: The goal is to match the Unicode bidi algorithm, because that is how the web page displays and thus how the original author saw the text as they wrote. This would be a nice enhancement, but would require a significant amount of work (or in other

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread Zdenek Wagner
2013/12/5 Khaled Hosny khaledho...@eglug.org: On Wed, Dec 04, 2013 at 12:31:58AM -0500, C. Scott Ananian wrote: On Tue, Dec 3, 2013 at 5:33 PM, Khaled Hosny khaledho...@eglug.org wrote: On Tue, Dec 03, 2013 at 01:42:21PM -0500, C. Scott Ananian wrote: Does XeLaTeX implement the Unicode BiDi

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread Khaled Hosny
On Thu, Dec 05, 2013 at 12:29:40PM +0100, Zdenek Wagner wrote: 2013/12/5 Khaled Hosny khaledho...@eglug.org: On Wed, Dec 04, 2013 at 12:31:58AM -0500, C. Scott Ananian wrote: On Tue, Dec 3, 2013 at 5:33 PM, Khaled Hosny khaledho...@eglug.org wrote: On Tue, Dec 03, 2013 at 01:42:21PM -0500,

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread Zdenek Wagner
2013/12/5 Khaled Hosny khaledho...@eglug.org: On Thu, Dec 05, 2013 at 12:29:40PM +0100, Zdenek Wagner wrote: 2013/12/5 Khaled Hosny khaledho...@eglug.org: On Wed, Dec 04, 2013 at 12:31:58AM -0500, C. Scott Ananian wrote: On Tue, Dec 3, 2013 at 5:33 PM, Khaled Hosny khaledho...@eglug.org

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread C. Scott Ananian
Can anyone point me to docs on XeT--TeX? A Google the other day failed to turn up anything useful. Also: polyglossia appears to be doing some amount of LTR/RTL directionality switching based on the character block. Can anyone offer advice on how to avoid fighting with that, if I'm implementing

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread Alan Munn
On Dec 5, 2013, at 7:48 AM, C. Scott Ananian csc...@cscott.net wrote: Can anyone point me to docs on XeT--TeX? A Google the other day failed to turn up anything useful. On your TeX system, texdoc xetex gives the main documentation. But the bidi documentation, polyglossia documentation and

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-05 Thread Jonathan Kew
On 5/12/13 12:48, C. Scott Ananian wrote: Can anyone point me to docs on XeT--TeX? A Google the other day failed to turn up anything useful. (TeX--XeT, not XeT--TeX.) This is part of e-TeX; see the e-TeX manual[1], section 4.1. HTH, JK [1]

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-04 Thread C. Scott Ananian
On Tue, Dec 3, 2013 at 5:33 PM, Khaled Hosny khaledho...@eglug.org wrote: On Tue, Dec 03, 2013 at 01:42:21PM -0500, C. Scott Ananian wrote: Does XeLaTeX implement the Unicode BiDi algorithm? Short answer: no. I think sample documents (minimal working example) are needed for any useful

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-04 Thread C. Scott Ananian
The goal is to match the Unicode bidi algorithm, because that is how the web page displays and thus how the original author saw the text as they wrote. Guessing the proper language tag to use is likely infeasible; note that the example given contains titles in Turkish as well as English. The

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-04 Thread Zdenek Wagner
2013/12/4 C. Scott Ananian csc...@cscott.net: The goal is to match the Unicode bidi algorithm, because that is how the web page displays and thus how the original author saw the text as they wrote. Guessing the proper language tag to use is likely infeasible; note that the example given

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-04 Thread Andrew Cunningham
Well, the first step is implementing and providing ways of using the bidi algorithm and its changes in Unicode 6.3, especially being able to leverage off bidi isolation. Andrew On 4 December 2013 20:07, Keith J. Schultz schul...@uni-trier.de wrote: Hi Scott, On 03.12.2013 at 19:42, C. Scott
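
For readers unfamiliar with the isolation feature mentioned above: Unicode 6.3 added four formatting characters, LRI (U+2066), RLI (U+2067), FSI (U+2068) and PDI (U+2069), that let a span of text be laid out without affecting the ordering of the text around it. The sketch below only wraps a string in those characters; the helper name isolate and its direction argument are assumptions made for this example, and whether a given renderer honours the characters is a separate matter.

```python
LRI = "\u2066"   # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"   # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"   # FIRST STRONG ISOLATE (direction taken from the content)
PDI = "\u2069"   # POP DIRECTIONAL ISOLATE

def isolate(text, direction="auto"):
    """Wrap text in a bidi isolate: 'ltr', 'rtl', or 'auto' (first strong)."""
    opener = {"ltr": LRI, "rtl": RLI, "auto": FSI}[direction]
    return opener + text + PDI

# A citation title embedded in right-to-left text, isolated so that its
# neutral characters do not interact with the surrounding paragraph.
wrapped = isolate("Istanbul, 2013")
print([f"U+{ord(c):04X}" for c in (wrapped[0], wrapped[-1])])   # ['U+2068', 'U+2069']
```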

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-04 Thread Khaled Hosny
On Wed, Dec 04, 2013 at 12:31:58AM -0500, C. Scott Ananian wrote: On Tue, Dec 3, 2013 at 5:33 PM, Khaled Hosny khaledho...@eglug.org wrote: On Tue, Dec 03, 2013 at 01:42:21PM -0500, C. Scott Ananian wrote: Does XeLaTeX implement the Unicode BiDi algorithm? Short answer: no. I think

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-04 Thread Khaled Hosny
On Wed, Dec 04, 2013 at 11:50:05PM +0100, Zdenek Wagner wrote: 2013/12/4 C. Scott Ananian csc...@cscott.net: 3) Arabic comma instead of English comma in citation [23]. (in both web and XeLaTeX output) The engine cannot recognize the context if the language is not tagged, the comma will

[XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-03 Thread C. Scott Ananian
I'm using Xe(La)TeX for a rewrite of the PDF booklet backend for the Wikimedia Foundation (Wikipedia). It's going pretty well, and the output looks good across a wide variety of Wikipedia's languages, but I'm having an issue with mixed RTL/LTR texts. I'm using the polyglossia package (and hence

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-03 Thread Khaled Hosny
On Tue, Dec 03, 2013 at 01:42:21PM -0500, C. Scott Ananian wrote: Does XeLaTeX implement the Unicode BiDi algorithm? Short answer: no. Long answer: XeTeX, more or less, breaks words at spaces or other non-character material (spaces in TeX are converted to the so-called glue, so are not handled
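
For contrast with the word-by-word behaviour described above, the Unicode algorithm starts from a bidi class assigned to every character and derives the paragraph direction from the first strong one. The sketch below shows those classes via Python's standard unicodedata module and a simplified "first strong character" rule; it ignores isolates and explicit embeddings, so it should be read as an approximation of that first step only. The function name first_strong_direction is made up for this example.

```python
import unicodedata

def first_strong_direction(text):
    """Guess paragraph direction from the first strong character.

    Simplified: L gives 'ltr', R or AL gives 'rtl', and text with no strong
    character defaults to 'ltr'. Isolates and embeddings are not handled.
    """
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):
            return "rtl"
    return "ltr"

# Bidi classes of a Latin letter, a Hebrew letter, an Arabic letter,
# a European digit and a space.
for ch in ("a", "\u05d0", "\u0627", "1", " "):
    print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch))

print(first_strong_direction("123 \u0627\u0628 abc"))   # rtl
```

Digits and spaces are not strong, which is why the leading "123 " in the last call does not decide the direction.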

Re: [XeTeX] xetex and the unicode bidirectional algorithm.

2013-12-03 Thread Zdenek Wagner
2013/12/3 Khaled Hosny khaledho...@eglug.org: On Tue, Dec 03, 2013 at 01:42:21PM -0500, C. Scott Ananian wrote: Does XeLaTeX implement the Unicode BiDi algorithm? Short answer: no. Long answer: XeTeX, more or less, breaks words at spaces or other non-character material (spaces in TeX are