Re: [XeTeX] Hyphenation in Transliterated Sanskrit
On 11 Sept. 2011 at 20:40, Dominik Wujastyk wrote:

> To get appropriate hyphenation in Romanisation, we need to go down the Patgen
> path. So we need to develop a large lexicon of appropriately-hyphenated
> romanised Sanskrit words in UTF8 encoding, and when that list is reasonably
> long, process it through Patgen to make patterns.
>
> I am slowly developing such a list, but it would be great to collaborate.

Several years ago Somadeva Vasudeva wrote here that he had started compiling such a list. See:

http://tug.org/mailman/htdig/xetex/2005-March/002053.html

Maybe he would be willing to share what he had done.

Regards,

Yves

--
Subscriptions, Archive, and List information, etc.:
http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] bug using \underbrace with unicode-math package
Sorry, the message became double-spaced (copy-and-paste from TeXworks) -- I will try a second time.

\documentclass{article}
\RequirePackage{amsmath}
\RequirePackage{unicode-math}
\setmathfont{xits-math.otf}
\def\midshift#1{
  \setbox0=\hbox{#1}\dimen0=\ht0\advance\dimen0by+\dp0\advance\dimen0by-1ex
  \lower.5\dimen0\box0
}
\def\rotatebrace#1{%
  \leavevmode\setbox0=\hbox{#1}\rlap{%
    \kern.5\wd0\dimen0=\ht0\advance\dimen0by-\dp0%\advance\dimen0by+1ex%
    \raise.5\dimen0\hbox{\special{x:gsave}\special{x:rotate 90}}}%
  \box0\special{x:grestore}}
\XeTeXmathchardef\bracelu = 0 3 `\⎧
\XeTeXmathchardef\bracemu = 0 3 `\⎨
\XeTeXmathchardef\braceru = 0 3 `\⎩
\XeTeXmathchardef\bracebar = 0 3 `\⎪
\XeTeXmathchardef\braceld = 0 3 `\⎫
\XeTeXmathchardef\bracemd = 0 3 `\⎬
\XeTeXmathchardef\bracerd = 0 3 `\⎭
\def\upbracefill{%
  \setbox0=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracemu$}}}}\ht0=.1\wd0\dp0=0pt%
  \setbox1=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracelu$}}\kern-.2em}}\ht1=.1\wd0\dp1=0pt%
  \setbox2=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracebar$}}}}\ht2=.1\wd0\dp2=0pt%
  \setbox3=\hbox{\lower.64ex\hbox{\kern-.2em\rotatebrace{\midshift{$\braceru$}}}}\ht3=.1\wd0\dp3=0pt%
  \box1\cleaders\copy2\hfill\box0\cleaders\box2\hfill\box3}
\def\downbracefill{%
  \setbox0=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracemd$}}}}\ht0=.1\wd0\dp0=0pt%
  \setbox1=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\braceld$}}\kern-.2em}}\ht1=.1\wd0\dp1=0pt%
  \setbox2=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracebar$}}}}\ht2=.1\wd0\dp2=0pt%
  \setbox3=\hbox{\lower.64ex\hbox{\kern-.2em\rotatebrace{\midshift{$\bracerd$}}}}\ht3=.1\wd0\dp3=0pt%
  \box1\cleaders\copy2\hfill\box0\cleaders\box2\hfill\box3}%
\setmathfont {XITS Math}
\begin {document}
$$ \underbrace{xyz} $$
\end{document}

Philip Taylor
Re: [XeTeX] bug using \underbrace with unicode-math package
Daniel Greenhoe wrote:

> Using \underbrace with the unicode-math package under XeLaTeX produces
> garbage output. Here is a minimal example:
>
> \documentclass{book}
> \usepackage{unicode-math}
> \setmathfont{xits-math.otf}
> \begin{document}%
> \[ \underbrace{xyz} \]
> \end{document}%
>
> Using the mathspec package instead of unicode-math seems to be OK.
> The \underbrace with unicode-math problem was discussed almost one year ago.
> Does anyone have a solution?

For reasons that are totally beyond my comprehension, the following (stolen from http://tex.stackexchange.com/questions/3488/horizontal-braces-with-xetex-take-too-much-space) appears to solve the problem:

\documentclass{article}
\RequirePackage{amsmath}
\RequirePackage{unicode-math}
\setmathfont{xits-math.otf}
\def\midshift#1{
  \setbox0=\hbox{#1}\dimen0=\ht0\advance\dimen0by+\dp0\advance\dimen0by-1ex
  \lower.5\dimen0\box0
}
\def\rotatebrace#1{%
  \leavevmode\setbox0=\hbox{#1}\rlap{%
    \kern.5\wd0\dimen0=\ht0\advance\dimen0by-\dp0%\advance\dimen0by+1ex%
    \raise.5\dimen0\hbox{\special{x:gsave}\special{x:rotate 90}}}%
  \box0\special{x:grestore}}
\XeTeXmathchardef\bracelu = 0 3 `\⎧
\XeTeXmathchardef\bracemu = 0 3 `\⎨
\XeTeXmathchardef\braceru = 0 3 `\⎩
\XeTeXmathchardef\bracebar = 0 3 `\⎪
\XeTeXmathchardef\braceld = 0 3 `\⎫
\XeTeXmathchardef\bracemd = 0 3 `\⎬
\XeTeXmathchardef\bracerd = 0 3 `\⎭
\def\upbracefill{%
  \setbox0=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracemu$}}}}\ht0=.1\wd0\dp0=0pt%
  \setbox1=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracelu$}}\kern-.2em}}\ht1=.1\wd0\dp1=0pt%
  \setbox2=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracebar$}}}}\ht2=.1\wd0\dp2=0pt%
  \setbox3=\hbox{\lower.64ex\hbox{\kern-.2em\rotatebrace{\midshift{$\braceru$}}}}\ht3=.1\wd0\dp3=0pt%
  \box1\cleaders\copy2\hfill\box0\cleaders\box2\hfill\box3}
\def\downbracefill{%
  \setbox0=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracemd$}}}}\ht0=.1\wd0\dp0=0pt%
  \setbox1=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\braceld$}}\kern-.2em}}\ht1=.1\wd0\dp1=0pt%
  \setbox2=\hbox{\lower.64ex\hbox{\rotatebrace{\midshift{$\bracebar$}}}}\ht2=.1\wd0\dp2=0pt%
  \setbox3=\hbox{\lower.64ex\hbox{\kern-.2em\rotatebrace{\midshift{$\bracerd$}}}}\ht3=.1\wd0\dp3=0pt%
  \box1\cleaders\copy2\hfill\box0\cleaders\box2\hfill\box3}%
\setmathfont {XITS Math}
\begin {document}
$$ \underbrace{xyz} $$
\end{document}

Philip Taylor
[XeTeX] bug using \underbrace with unicode-math package
Using \underbrace with the unicode-math package under XeLaTeX produces garbage output. Here is a minimal example:

\documentclass{book}
\usepackage{unicode-math}
\setmathfont{xits-math.otf}
\begin{document}%
\[ \underbrace{xyz} \]
\end{document}%

Using the mathspec package instead of unicode-math seems to be OK. The \underbrace with unicode-math problem was discussed almost one year ago. Does anyone have a solution?

I posted an email very similar to this one more than 48 hours ago but received no responses. My previous post included a web link and I fear this may have caused the email to be identified as spam. This email is basically a repost. If you received the previous email, I apologize for the annoyance.

Dan
Re: [XeTeX] Hyphenation in Transliterated Sanskrit
Hello, Neal.

I still don't receive your messages :(

On 11 Sept. 2011 at 22:21, Zdenek Wagner wrote:

>> Also Zdenek raises an interesting possibility. If I were to want to typeset
>> Sanskrit, say this very Sanskrit, in Bengali or Telugu script. How would I
>> go about that?
>>
> Probably you can mechanically rewrite RomDev.map to convert the
> transliteration to another script and compile it with teckit_compile.
> I do not know Sanskrit and do not know other scripts, my knowledge in
> this area is almost zero, so I am not sure whether such mechanical
> approach would work.

I presume you can do as Zdeněk says (I don't know much about TECkit). Otherwise you can write Sanskrit directly in Bengali or Telugu script. If you tell Polyglossia which text is in Sanskrit, it should be hyphenated correctly in those scripts as well.

Best wishes,

Yves
Re: [XeTeX] Hyphenation in Transliterated Sanskrit
2011/9/11 Neal Delmonico :
> Thanks to both Yves and Zdenek for your suggestions and examples. The
> hyphenation is working now in both Devanagari and Roman Translit. I'd have
> never figured it out on my own. If I were to want to read more on this
> where would I look?
>
Frankly I do not know. I often read the source code of the packages in order to understand the internals. In fact I even studied the whole source code of LaTeX.

> Also Zdenek raises an interesting possibility. If I were to want to typeset
> Sanskrit, say this very Sanskrit, in Bengali or Telugu script. How would I
> go about that?
>
Probably you can mechanically rewrite RomDev.map to convert the transliteration to another script and compile it with teckit_compile. I do not know Sanskrit and do not know other scripts, my knowledge in this area is almost zero, so I am not sure whether such a mechanical approach would work.

> Thanks again.
>
> Neal
>
> On Sun, 11 Sep 2011 04:32:59 -0500, Zdenek Wagner wrote:
>
>> 2011/9/11 Neal Delmonico :
>>>
>>> Thanks! How would one set it up so that the English portions are
>>> hyphenated according to English rules and the transliteration is
>>> hyphenated according to Sanskrit rules?
>>>
>> I am sending an example. You can see another nice feature of the
>> TECkit mapping. The mapping is applied when the text is typeset. You
>> can thus store the transliterated text in a temporary macro and
>> typeset it twice.
>>
>> There is one problem (this is the reason why I am sending a copy to
>> François). It is requested that Sanskrit text is typeset by a font
>> with Devanagari characters. However, Sanskrit is also written in other
>> scripts so that people in other parts of India, who do not know
>> Devanagari, could read it. Even the Tibetan script contains retroflex
>> consonants that are not used in the Tibetan language but serve for
>> writing Sanskrit (and recently writing words of English origin).
>> Polyglossia should not be that demanding.
>>
>> And just to François: I found two bugs in documentation. Section 5.2
>> mentions selection between Western and Devanagari numerals, but it
>> should be Bengali numerals (I am not sure which option is really
>> implemented). At the introduction, Vafa Khaligi's name is wrong. AFAIK
>> in Urdu and Farsi, the isolated and final form of YEH are dotless (it
>> is not a big bug), but in fact the name is written as Khaliql, there
>> is ق instead of غ
>>
>>> Best
>>>
>>> Neal
>>>
>>> On Sat, 10 Sep 2011 19:40:51 -0500, Zdenek Wagner wrote:
>>>
>>>> 2011/9/11 Neal Delmonico :
>>>>>
>>>>> Here are the source files for the pdf. Sorry to take so long to send
>>>>> them.
>>>>>
>>>> Your default language for polyglossia is defined as English. You
>>>> switch to Sanskrit only inside the \skt macro. The text in Devanagari
>>>> is therefore hyphenated according to Sanskrit rules but the
>>>> transliterated text is hyphenated according to the English rules. You
>>>> have to switch the language to Sanskrit also for the transliterated
>>>> text.
>>>>
>>>>> Best
>>>>>
>>>>> Neal
>>>>>
>>>>> On Sat, 10 Sep 2011 17:53:42 -0500, Mojca Miklavec wrote:
>>>>>
>>>>>> On Sun, Sep 11, 2011 at 00:39, Neal Delmonico wrote:
>>>>>>>
>>>>>>> Here is an example of what I mean in the pdf attached.
>>>>>>
>>>>>> Do I get it right that hyphenation is working, it is just that it
>>>>>> misses a lot of valid hyphenation points?
>>>>>>
>>>>>> You should talk to Yves Codet, the author of Sanskrit patterns.
>>>>>>
>>>>>> But PLEASE: do post an example of your code when you ask for help. If you
>>>>>> don't send the source, it is not clear whether you are in fact using
>>>>>> Sanskrit patterns or if you are falling back to English when you try
>>>>>> to switch fonts. You could just as well send us a PDF with French
>>>>>> hyphenation enabled and claim that TeX is buggy since it doesn't
>>>>>> hyphenate right.
>>>>>>
>>>>>> Mojca

--
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz
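The example attached to the quoted message is not preserved in the archive. What follows is only a minimal sketch of the idea described there (store the transliterated text once, typeset it twice), not a reconstruction of that attachment. It assumes that RomDev.map has been compiled to RomDev.tec with teckit_compile, that XeTeX can find the .tec file, and that the fonts named elsewhere in this thread (Charis SIL, Sanskrit 2003) are installed. The names \translitfont, \storedskt and \both are hypothetical.

```latex
% Sketch only: assumes RomDev.tec (compiled from RomDev.map) is where XeTeX
% can find it, e.g. in the working directory.
\documentclass{article}
\usepackage{fontspec}
\usepackage{polyglossia}
\setmainlanguage{english}
\setotherlanguage{sanskrit}
\setmainfont{Charis SIL}
% Devanagari font with the mapping attached: romanised input is rendered
% through RomDev when the text is typeset.
\newfontfamily\sanskritfont[Script=Devanagari,Mapping=RomDev]{Sanskrit 2003}
% Roman font without the mapping, for the transliteration (hypothetical name).
\newfontfamily\translitfont{Charis SIL}
% \both (hypothetical helper) stores its argument once and typesets it twice,
% both times with the language switched to Sanskrit.
\newcommand{\both}[1]{%
  \def\storedskt{#1}%
  \textsanskrit{\storedskt}% first pass: through the mapping
  \ (\textsanskrit{\translitfont\storedskt})% second pass: Roman font, no mapping
}
\begin{document}
\both{namaste}
\end{document}
```

Because the mapping belongs to the font rather than to the text, the same stored input can be sent to either font; switching the language inside both calls is what keeps English hyphenation rules away from the transliteration.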
Re: [XeTeX] Hyphenation in Transliterated Sanskrit
Thanks to both Yves and Zdenek for your suggestions and examples. The hyphenation is working now in both Devanagari and Roman Translit. I'd have never figured it out on my own. If I were to want to read more on this where would I look?

Also Zdenek raises an interesting possibility. If I were to want to typeset Sanskrit, say this very Sanskrit, in Bengali or Telugu script. How would I go about that?

Thanks again.

Neal

On Sun, 11 Sep 2011 04:32:59 -0500, Zdenek Wagner wrote:

> 2011/9/11 Neal Delmonico :
>>
>> Thanks! How would one set it up so that the English portions are
>> hyphenated according to English rules and the transliteration is
>> hyphenated according to Sanskrit rules?
>>
> I am sending an example. You can see another nice feature of the
> TECkit mapping. The mapping is applied when the text is typeset. You
> can thus store the transliterated text in a temporary macro and
> typeset it twice.
>
> There is one problem (this is the reason why I am sending a copy to
> François). It is requested that Sanskrit text is typeset by a font
> with Devanagari characters. However, Sanskrit is also written in other
> scripts so that people in other parts of India, who do not know
> Devanagari, could read it. Even the Tibetan script contains retroflex
> consonants that are not used in the Tibetan language but serve for
> writing Sanskrit (and recently writing words of English origin).
> Polyglossia should not be that demanding.
>
> And just to François: I found two bugs in documentation. Section 5.2
> mentions selection between Western and Devanagari numerals, but it
> should be Bengali numerals (I am not sure which option is really
> implemented). At the introduction, Vafa Khaligi's name is wrong. AFAIK
> in Urdu and Farsi, the isolated and final form of YEH are dotless (it
> is not a big bug), but in fact the name is written as Khaliql, there
> is ق instead of غ
>
>> Best
>>
>> Neal
>>
>> On Sat, 10 Sep 2011 19:40:51 -0500, Zdenek Wagner wrote:
>>
>>> 2011/9/11 Neal Delmonico :
>>>>
>>>> Here are the source files for the pdf. Sorry to take so long to send
>>>> them.
>>>>
>>> Your default language for polyglossia is defined as English. You
>>> switch to Sanskrit only inside the \skt macro. The text in Devanagari
>>> is therefore hyphenated according to Sanskrit rules but the
>>> transliterated text is hyphenated according to the English rules. You
>>> have to switch the language to Sanskrit also for the transliterated
>>> text.
>>>
>>>> Best
>>>>
>>>> Neal
>>>>
>>>> On Sat, 10 Sep 2011 17:53:42 -0500, Mojca Miklavec wrote:
>>>>
>>>>> On Sun, Sep 11, 2011 at 00:39, Neal Delmonico wrote:
>>>>>>
>>>>>> Here is an example of what I mean in the pdf attached.
>>>>>
>>>>> Do I get it right that hyphenation is working, it is just that it
>>>>> misses a lot of valid hyphenation points?
>>>>>
>>>>> You should talk to Yves Codet, the author of Sanskrit patterns.
>>>>>
>>>>> But PLEASE: do post an example of your code when you ask for help. If you
>>>>> don't send the source, it is not clear whether you are in fact using
>>>>> Sanskrit patterns or if you are falling back to English when you try
>>>>> to switch fonts. You could just as well send us a PDF with French
>>>>> hyphenation enabled and claim that TeX is buggy since it doesn't
>>>>> hyphenate right.
>>>>>
>>>>> Mojca
Re: [XeTeX] Hyphenation in Transliterated Sanskrit
Sanskrit is hyphenated differently in Devanagari and in Roman script. If you use the hyph-sa.tex patterns, you get Roman hyphenated *as if it were Devanagari,* which is not acceptable in scholarly circles. The last 150 years of European writing on Sanskrit, using Romanisation, have developed hyphenation rules based on Sanskrit etymology, paying attention to compound words, internal sandhi, etc. (i.e., like German in some respects). The Devanagari hyphenation uses a much simpler idea: basically, hyphenate after almost any vowel.

To get appropriate hyphenation in Romanisation, we need to go down the Patgen path. So we need to develop a large lexicon of appropriately-hyphenated romanised Sanskrit words in UTF8 encoding, and when that list is reasonably long, process it through Patgen to make patterns.

I am slowly developing such a list, but it would be great to collaborate.

While the list is in the making, it can still be used, by using \hyphenation. Thus:

\documentclass{article}
% polyglossia, xltxtra, whatnot
...
\setotherlanguage{sanskrit}
% for transliterated Sanskrit
\newfontfamily\sanskritfont{TeX Gyre Pagella}
% Define \sansk{} which is the same as \emph{}, except that it causes appropriate hyphenation
% for Sanskrit words. Use \sansk{} for Sanskrit and \emph{} for English.
\newcommand{\sansk}[1]{\emph{\textsanskrit{#1}}}
...
\begin{document}
\input{sanskrit-hyphenations.tex} % see attached file.
Blah English blah. \sansk{āyurveda, avicchinnasampradāyatvād}.
\end{document}

Best,
Dominik

sanskrit-hyphenations.tex
Description: TeX document
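The attached sanskrit-hyphenations.tex is not reproduced in the archive. As a rough illustration of what such an exception file might look like (the contents below are hypothetical, and the break points are placed only at compound boundaries of the two words used in the message above), one could write:

```latex
% sanskrit-hyphenations.tex -- hypothetical contents, for illustration only.
% \hyphenation records exceptions for whichever language is current, so the
% list is wrapped in the sanskrit environment (assumes polyglossia with
% \setotherlanguage{sanskrit}, as in the message above).
\begin{sanskrit}
\hyphenation{
  āyur-veda
  avicchinna-sampradāya-tvād
}
\end{sanskrit}
```

A word listed in \hyphenation is broken only at the marked points, overriding the patterns for that word, so each entry has to carry every break that should be allowed. The assignments made by \hyphenation are global, which is why they survive the end of the environment.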
Re: [XeTeX] Hyphenation in Transliterated Sanskrit
Hello, Neal.

You could do something like this to have correct hyphenations:

\documentclass[12pt]{article}
\usepackage{fontspec}
\setmainfont{Charis SIL}
\newfontfamily\sanskritfont[Script=Devanagari]{Sanskrit 2003}
\usepackage{polyglossia}
\setmainlanguage{english}
\setotherlanguage{sanskrit}
\newcommand{\dev}[1]{{\begin{sanskrit}#1\end{sanskrit}}}
\newcommand{\tra}[1]{{\fontspec{Charis SIL}\begin{sanskrit}#1\end{sanskrit}}}
\textwidth 6cm % only to have more hyphenations
\begin{document}
Greetings. Greetings. Greetings. Greetings. Greetings. Greetings. Greetings. Greetings. Greetings. Greetings. Greetings. Greetings.
\dev{नमस्ते। नमस्ते। नमस्ते। नमस्ते। नमस्ते। नमस्ते। नमस्ते। नमस्ते। नमस्ते। नमस्ते। नमस्ते। नमस्ते।}
\tra{Namaste. Namaste. Namaste. Namaste. Namaste. Namaste. Namaste. Namaste. Namaste. Namaste. Namaste. Namaste.}
\end{document}

Regards,

Yves

P.S. For some reason I don't receive Neal's messages but I do receive replies to his messages. Anything wrong with the list?