Re: [XeTeX] XeLaTeX and SIunitx
Hello,

thanks very much for that. What does "normalise" mean with angstrom and ohm?

ciao
T

On 11.06.2012 22:57, Joseph Wright wrote:
> [...]
> - Ångström u+00c5 (u+212b normalises here)
> [...]
> - Ohm u+03a9 (u+2126 normalises here)
> [...]

-- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] XeLaTeX and SIunitx
Tobias Schoel wrote:
> What does normalise mean with angstrom and ohm?

Perhaps as per http://en.wikipedia.org/wiki/Unicode_equivalence#Normalization

Philip Taylor
Re: [XeTeX] XeLaTeX and SIunitx
On 12/06/2012 15:10, Philip TAYLOR wrote:
> Tobias Schoel wrote:
>> What does normalise mean with angstrom and ohm?
>
> Perhaps as per http://en.wikipedia.org/wiki/Unicode_equivalence#Normalization
>
> Philip Taylor

Indeed: normalization is a way of dealing with characters that differ in logical meaning but whose symbols are identical. For siunitx, I have to balance meaning with the likelihood of the symbol appearing in the output at all. Using the characters that the unit signs normalise to means that you have the best chance of getting the visually correct output, while still being able to search using the UTF-8 characters correctly.

-- Joseph Wright
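The code points under discussion can be compared side by side with a small test file. This is only a sketch for XeLaTeX or LuaLaTeX: it prints the dedicated unit signs (U+212B, U+2126) next to the ordinary letters they normalise to (U+00C5, U+03A9). Whether every glyph actually shows up depends on the font in use, which is an assumption here (the fontspec default).

```latex
% Compile with XeLaTeX or LuaLaTeX.
\documentclass{article}
\usepackage{fontspec} % default font assumed; glyph coverage varies by font
\begin{document}
% ^^^^xxxx is the four-digit hexadecimal notation for a Unicode code point
Angstrom sign U+212B: ^^^^212b \quad letter U+00C5: ^^^^00c5

Ohm sign U+2126: ^^^^2126 \quad letter U+03A9: ^^^^03a9
\end{document}
```

If a font only provides the letters, the second item of each pair is the one that still prints, which matches the reasoning given for preferring the normalised characters.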
Re: [XeTeX] XeLaTeX and SIunitx
Hello all,

Taking a look back over the code, I already have some auto-detection in for picking up UTF-8 symbols when the correct engine is in use. I've revised this a bit for the next release (v2.5d, on CTAN tomorrow), so that all of the 'problematic' symbols are covered in what seems to be the best way possible.

Nothing happens unless appropriate support (fontspec/unicode-math) is loaded. If it is, then you get the following symbols:

- Ångström u+00c5 (u+212b normalises here)
- Degree Celsius u+00b0 + C (u+2103 is a compatibility character)
- Micro u+00b5 (u+03bc is wrong)
- Ohm u+03a9 (u+2126 normalises here)
- Degree u+00b0
- Arc minute u+2032 (requires unicode-math)
- Arc second u+2033 (requires unicode-math)

-- Joseph Wright
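A minimal test document for these defaults might look as follows. It is a sketch that assumes siunitx v2.5d or later together with fontspec and unicode-math; the values are arbitrary, and the result still depends on the chosen fonts containing the glyphs.

```latex
% Compile with XeLaTeX or LuaLaTeX.
\documentclass{article}
\usepackage{fontspec}
\usepackage{unicode-math} % needed for the arc minute and arc second symbols
\usepackage{siunitx}
\begin{document}
\SI{10}{\micro\metre}   % micro, expected to use U+00B5

\SI{4.7}{\kilo\ohm}     % ohm, expected to use U+03A9

\SI{25}{\degreeCelsius} % degree sign followed by C

\ang{12;30;15}          % degree, arc minute, arc second
\end{document}
```

Copying the units out of the resulting PDF and checking the code points is one way to confirm which characters were actually used.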
Re: [XeTeX] XeLaTeX and SIunitx
Hi,

On 14.05.2012 21:46, Joseph Wright wrote:
> [...]
> My current approach is to be honest with XeTeX/LuaTeX users and say 'look, you are going to have to check that the font you've chosen to use has the correct symbols available'. I am happy to consider changes, but what I don't want to do is give the impression that it's possible to do all of this automatically: that is not what I've found.

Maybe you misunderstood me. That wasn't meant as a feature request for the siunitx package. It was more of a general question.

But what you said seems to indicate to me that it would be more sensible to create my own package xesiunitx, which solves the problem for my situation. As I only use open fonts, there aren't so many possibilities, and even for arbitrary fonts, one might check only for the best solutions and otherwise use siunitx's fallbacks.
Re: [XeTeX] XeLaTeX and SIunitx
On 15/05/2012 17:21, Tobias Schoel wrote:
> But what you said seems to indicate to me that it would be more sensible to create my own package xesiunitx, which solves the problem for my situation. As I only use open fonts, there aren't so many possibilities, and even for arbitrary fonts, one might check only for the best solutions and otherwise use siunitx's fallbacks.

I'm keen to avoid package proliferation where possible, especially where we are looking essentially at settings for another package. I've created a new issue for siunitx: https://bitbucket.org/josephwright/siunitx/issue/199/improve-default-symbols-when-using-utf-8#comment-1423028. I will take a look at this over the next few days: there will need to be some non-trivial testing.

As it's not tied to XeTeX (the same applies to LuaTeX), I'd suggest anyone wanting to discuss this particular case does so via the comments on the BitBucket site :-)

-- Joseph Wright
Re: [XeTeX] XeLaTeX and SIunitx
On 5/14/12, Ross Moore ross.mo...@mq.edu.au wrote:
> [...]
> The body of your document source should be engine independent, so this should look more like:
>
>   \documentclass{article}
>   \usepackage{ifxetex}
>   \ifxetex
>     \newcommand{\micronChar}{^^^^00b5}
>     % handle other characters ...
>   \else
>     \if ...
>       % handle other possibilities
>       % e.g. ^^c2^^b5
>       ...
>     \fi
>   \fi
>   \begin{document}
>   \micronChar
>   \end{document}
> [...]

You cannot do

  \ifxetex ^^^^00b5 \else ^^c2^^b5 \fi

because the character ^^^ is invalid in pdfTeX (catcode 15), hence pdfTeX chokes whenever it sees that character in a line, with the exception of \^^^, the control symbol (otherwise it would be difficult to change the catcode of ^^^). On the other hand, you can do

  \ifxetex \expandafter \@gobble \string \^^^^00b5 \else ... \fi

Regards,
Bruno
Re: [XeTeX] XeLaTeX and SIunitx(in a way OT)
Hi All,

I have a question. Would TeX barf on a style file that contained UTF-8 characters, even if I have them in a conditional so that they are not processed by the engine, just parsed?

regards
Keith

On 14.05.2012 at 01:19, Ross Moore wrote:
> [...]
> Better still, of course is to have the conditional definitions made in a separate file, so that similar things can all be handled together and used in multiple documents.
> [...]
Re: [XeTeX] XeLaTeX and SIunitx(in a way OT)
Thanx for the info. I was just curious because of the discussions about backward/legacy compatibility with UTF.

It would seem to me that all base packages should be refactored to the point where the package/style file contains the if(engine) test and then loads the appropriate files for that engine. Yes, this approach does have the big drawback that there is more code to maintain, yet it has the advantage that the code for older engines can be left alone and one can concentrate on the modern technologies.

regards
Keith.

On 14.05.2012 at 09:08, Bruno Le Floch wrote:
>> In a style file would say TeX barf if it contained utf-8 characters even if I have them in a conditional so that they are not processed by the engine just parsed?
>
> I believe that it would be ok if you use the actual bytes ^^c2 and ^^b5 in the file. The reason is that pdfTeX only makes (most) code points from 0 to 31 invalid (?), and those only appear in the utf-8 encoding of the Unicode code points 0 to 31, which you are probably not using in your files (except 9, 10, 13, which are ok for pdfTeX).
>
> On the other hand, if you want to use the ^^ notation, for pdfTeX (set up with the appropriate inputenc option) you'd need to use the eight characters ^, ^, c, 2, ^, ^, b, and 5 (that'd give you Â and µ in [Xe/Lua]TeX), whereas for the other two engines you'd need either ^^b5 or ^^^^00b5. In this last case of the ^^^^ notation, pdfTeX will choke even if it appears in the unused branch of a conditional.
>
> Now, why would anyone use the ^^ notation? Because it is most robust against encoding changes, since we then only use ASCII characters. Using utf-8 encoded characters directly is only good if you stick with utf-8 (which I'd advise).
>
> So I'd say my impression is that the best is to use ^^^^, but in a separate file, loaded for the luatex or xetex engine. Three other options:
>
> * Keep one file, work in a group, and use \catcode`\^^^=9 for the pdftex engine before any ^^^^ appears.
> * Put the pdfTeX-specific commands first in the file, and conditionally \ifpdftex \endinput \fi, then anything can appear later in the file.
> * Use \char"00B5, which only works if your font is encoded in a sensible way IIRC.
>
> Hope that helps (and is correct),
> Bruno
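The second option above (pdfTeX-specific commands first, then `\ifpdftex \endinput \fi`) could be sketched as below. The file name `unitchars.def` and the macro `\microChar` are hypothetical names chosen for illustration, and the iftex package is assumed as the provider of `\ifpdftex`. The `filecontents` environment is used only to keep the example in one file; it writes the definitions file verbatim, so pdfTeX does not interpret the `^^^^` notation while writing it.

```latex
% Intended to compile with pdfLaTeX, XeLaTeX and LuaLaTeX.
\begin{filecontents}{unitchars.def}
\providecommand*{\microChar}{\textmu}
\ifpdftex \endinput \fi
\renewcommand*{\microChar}{^^^^00b5}
\end{filecontents}
\documentclass{article}
\usepackage{textcomp} % provides \textmu on older LaTeX releases
\usepackage{iftex}    % assumed source of \ifpdftex
\input{unitchars.def}
\begin{document}
10\,\microChar m
\end{document}
```

The `\endinput` and its `\fi` sit on the same line on purpose: `\endinput` stops reading after the current line, so the conditional is closed before pdfTeX leaves the file, and the line containing `^^^^00b5` is never read by that engine.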
Re: [XeTeX] XeLaTeX and SIunitx
On Mon, 14 May 2012 08:04:47 +0200, Bruno Le Floch wrote:
> You cannot do
>
>   \ifxetex ^^^^00b5 \else ^^c2^^b5 \fi
>
> because the character ^^^ is invalid in pdfTeX (catcode 15), hence pdfTeX chokes whenever it sees that character in a line, with the exception of \^^^, the control symbol (otherwise it would be difficult to change the catcode of ^^^). On the other hand, you can do
>
>   \ifxetex \expandafter \@gobble \string \^^^^00b5 \else ... \fi

Or one could use the actual character. In this case pdftex would see the two octets and not complain about an invalid character. So one has to make a choice between encoding independence and the invalid-character problem.

On the whole I would say the best is really to separate such engine-dependent code into different files - much cleaner.

-- Ulrike Fischer
Re: [XeTeX] XeLaTeX and SIunitx
On 5/14/12, Ulrike Fischer ne...@nililand.de wrote:
> On Mon, 14 May 2012 09:19:18 +1000, Ross Moore wrote:
>>> But it doesn't solve the problem here as pdftex chokes if it sees more than two ^^:
>>
>> ... this is not a good example to support this view.
>>
>>>   \documentclass{article}
>>>   \begin{document}
>>>   ^^^^00b5
>>>   \end{document}
>>
>> The body of your document source should be engine independent, so this should look more like: [...]
>
> Well I wanted to show that pdftex *chokes* over more than two ^^-symbols - and my document demonstrates this in a shorter way than your example (which gives an error too) ;-).

If it gives an error, it might be that \ifxetex is not defined. The code

  \let\ifxetex\iffalse % to be replaced by the appropriate package from Heiko
  \ifxetex \expandafter\@gobble\string\^^^^00b5 \else ... \fi
  \bye

works perfectly well in pdftex. I agree, though, that the best is to separate engine-dependent code into different files.

Regards,
Bruno
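As a fuller illustration of the `\@gobble\string` idea, here is a sketch of a complete LaTeX document. `\microChar` is a hypothetical macro name, the iftex package is assumed for `\ifpdftex`, and `\textmu` is assumed as the pdfTeX fallback. The point is that the backslash turns the problematic sequence into a one-character control sequence, so pdfTeX can skip the branch without meeting an invalid character.

```latex
% Intended to compile with pdfLaTeX, XeLaTeX and LuaLaTeX.
\documentclass{article}
\usepackage{textcomp} % provides \textmu on older LaTeX releases
\usepackage{iftex}    % assumed source of \ifpdftex
\makeatletter
\ifpdftex
  \newcommand*{\microChar}{\textmu}
\else
  % In XeTeX/LuaTeX the next line names the control sequence made of a
  % backslash plus U+00B5; \string turns it into two characters and
  % \@gobble removes the backslash, leaving the micro sign.
  \newcommand*{\microChar}{\expandafter\@gobble\string\^^^^00b5}
\fi
\makeatother
\begin{document}
10\,\microChar m
\end{document}
```

The trick relies on `\escapechar` having its default value, so that `\string` really produces a backslash for `\@gobble` to remove.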
Re: [XeTeX] XeLaTeX and SIunitx
On Mon, 14 May 2012 11:51:33 +0200, Bruno Le Floch wrote:
> If it gives an error, it might be that \ifxetex is not defined. The code

No, you misunderstood my remark. Ross's example chokes over the same invalid char (^^^^00b5) because, as you correctly remarked in another post, it doesn't help to hide the invalid char in an \if-\fi branch.

-- Ulrike Fischer
Re: [XeTeX] XeLaTeX and SIunitx
If I understand you correctly, there are two ways in which this could/should be solved at package level:

1. siunitx gets an option / command / whatever, which does approximately:

     \ifxetex\input{other file which can include suitable unicode symbols}\fi

2. a new package xesiunitx is created, which does approximately:

     \usepackage{siunitx}
     \sisetup{definitions using suitable unicode symbols, depending on package option}

   or

     \usepackage{siunitx}
     \testiffonthassymbols
       \sisetup{definitions using suitable unicode symbols}
     \else
       \sisetup{some other helpful definitions}
     \fi
Re: [XeTeX] XeLaTeX and SIunitx
On 14/05/2012 20:38, Tobias Schoel wrote:
> If I understand you correctly, there are two ways in which this could/should be solved at package level:
> [...]

As I've tried to explain, there are simply too many possible combinations to cover things for XeTeX and LuaTeX users without them actually checking the settings they use. The best that I can do even with pdfTeX is provide some sensible defaults, and even there, there are failure cases (for a start, any 'non-standard' font packages may well fail to give good output).

My current approach is to be honest with XeTeX/LuaTeX users and say 'look, you are going to have to check that the font you've chosen to use has the correct symbols available'. I am happy to consider changes, but what I don't want to do is give the impression that it's possible to do all of this automatically: that is not what I've found.

-- Joseph Wright
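Checking a font by hand, as suggested here, can be partly scripted with the e-TeX primitive `\iffontchar`. The following is a sketch only: `\IfGlyphInCurrentFont` is a hypothetical helper, not part of siunitx or fontspec, and it tests only the font that is current at the point of use, so it says nothing about math fonts or fonts selected later.

```latex
% Compile with XeLaTeX or LuaLaTeX.
\documentclass{article}
\usepackage{fontspec}
% Hypothetical helper: #1 = code point (e.g. "00B5), #2 = text if the
% glyph exists in the current font, #3 = text if it does not.
% The space after #1 ends the number that \iffontchar reads.
\newcommand{\IfGlyphInCurrentFont}[3]{%
  \iffontchar\font#1 #2\else#3\fi}
\begin{document}
\IfGlyphInCurrentFont{"00B5}{micro sign found}{micro sign missing}

\IfGlyphInCurrentFont{"03A9}{Omega found}{Omega missing}

\IfGlyphInCurrentFont{"2032}{prime found}{prime missing}
\end{document}
```

A test of this kind covers one font at a time, which is consistent with the point that the combinations cannot all be handled automatically by the package.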
Re: [XeTeX] XeLaTeX and SIunitx
Hi Ulrike, and Bruno,

On 13/05/2012, at 11:05 PM, Ulrike Fischer wrote:
> On Fri, 11 May 2012 19:44:00 +0200, Bruno Le Floch wrote:
>> I'm really no expert, but the siunitx package could include, e.g., µ as ^^^^00b5. This would not make pdftex choke when appearing in the false branch of an engine-dependent conditional.
>
> Using ^^..-notation is certainly a good idea in styles - regardless of the engine - as it avoids encoding confusion.

If by styles, you mean in a macro definition made within a separate style file, then I agree with you 100%. But ...

> But it doesn't solve the problem here as pdftex chokes if it sees more than two ^^:

... this is not a good example to support this view.

>   \documentclass{article}
>   \begin{document}
>   ^^^^00b5
>   \end{document}

The body of your document source should be engine independent, so this should look more like:

  \documentclass{article}
  \usepackage{ifxetex}
  \ifxetex
    \newcommand{\micronChar}{^^^^00b5}
    % handle other characters ...
  \else
    \if ...
      % handle other possibilities
      % e.g. ^^c2^^b5
      ...
    \fi
  \fi
  \begin{document}
  \micronChar
  \end{document}

Better still, of course, is to have the conditional definitions made in a separate file, so that similar things can all be handled together and used in multiple documents.

You want to avoid having to find and replace multiple instances of the special characters when you share your work with colleagues or need to reuse your own work in other contexts. Instead you should simply need to adjust the macro expansions, and all that previous work will adapt automatically.

> ! Text line contains an invalid character.
> l.9 ^^^
>        ^00b5
> ? x
>
> For pdftex you would have to code it as two 8-bit octets: ^^c2^^b5
> But this naturally will assume that pdftex is expecting utf8-input.
>
> -- Ulrike Fischer

Hope this helps,

Ross

Ross Moore                 ross.mo...@mq.edu.au
Mathematics Department     office: E7A-419
Macquarie University       tel: +61 (0)2 9850 8955
Sydney, Australia 2109     fax: +61 (0)2 9850 8114
Re: [XeTeX] XeLaTeX and SIunitx
On 11.05.2012 19:44, Bruno Le Floch wrote:
> [...]
> I'm really no expert, but the siunitx package could include, e.g., µ as ^^^^00b5. This would not make pdftex choke when appearing in the false branch of an engine-dependent conditional.

Could unicode-math-symbols be used? Can one load a package depending only on the engine?

  \ifxetex\usepackage{unicode-math-symbols}\fi

?
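Loading packages depending on the engine is possible in a document preamble. A minimal sketch, assuming the iftex package for the engine test and assuming that unicode-math is the package meant by "unicode-math-symbols":

```latex
\documentclass{article}
\usepackage{iftex}
\ifpdftex
  % 8-bit engine: classic font and input encoding setup
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
  \usepackage{textcomp}
\else
  % XeTeX or LuaTeX: OpenType fonts and Unicode maths
  \usepackage{fontspec}
  \usepackage{unicode-math}
\fi
\usepackage{siunitx}
\begin{document}
\SI{10}{\micro\metre}
\end{document}
```

This only selects which packages are loaded; whether the selected fonts contain the unit symbols still has to be checked separately.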
Re: [XeTeX] XeLaTeX and SIunitx
2012/5/12 Tobias Schoel liesdieda...@googlemail.com:
> Could unicode-math-symbols be used? Can one load a package only dependent on engine?
>
>   \ifxetex\usepackage{unicode-math-symbols}\fi

AFAIK there are four Unicode math fonts:

Cambria Math (commercial)
Asana Math
XITS
Neo Euler

How could the engine-only switch recognize which font you wish to use? Moreover, \Omega in the math sense must be typeset in math italic, while as a unit (ohm) it must be upright and may be taken from any font containing Greek. It would be better to have a Unicode SI units package that will allow users to select a font they like.

-- Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz
Re: [XeTeX] XeLaTeX and SIunitx
On 5/11/12, Joseph Wright joseph.wri...@morningstar2.co.uk wrote:
> On 11/05/2012 17:36, Tobias Schoel wrote:
>> Hi,
>>
>> I have done a few tests with the problematic symbols in siunitx (namely micro, ohm, angstrom, celsius, degree/arcsecond/arcminute) and different math fonts. You'll find source and result attached. As I don't have access to commercial fonts (which includes MS Fonts), I could only test some of them.
>>
>> The results aren't overwhelming. Is it possible and acceptable to include a package option or sisetup-option which makes the suitable definitions? It shouldn't be default, even when loading fontspec in xetex, but easily accessible.
>>
>> Thanks
>> bye
>> Tobias
>
> As the siunitx documents state, there are simply too many combinations of font packages to hope to cover all of them 'out of the box' or indeed in the documentation, especially as XeLaTeX and LuaLaTeX users may be loading /any/ system font. Furthermore, the package code has to work with pdfTeX, so it cannot contain UTF-8 characters outside of the ASCII range. As such, I can only make general recommendations in the documentation on what to do to print these symbols correctly when using UTF-8 engines.

I'm really no expert, but the siunitx package could include, e.g., µ as ^^^^00b5. This would not make pdftex choke when appearing in the false branch of an engine-dependent conditional.

Regards,
Bruno