Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread tlaronde
On Thu, Jun 30, 2011 at 07:00:48PM +0200, tlaronde wrote: > > This does not preclude the user from directly entering the unicode > codepoint: in the TFM, if you want, the glyph information is duplicated, > in the conventional plain TeX position, and as a literal in the unicode > position. More pr

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread tlaronde
On Thu, Jun 30, 2011 at 12:31:17PM -0400, erik quanstrom wrote: > > I don't despise XeTeX. Nor Unicode. And I will take Unicode as is. But I > > will take TeX conventions as is too, since I'm working on TeX, and not > > another formatting system; since these conventions are confined to the > > ASCI

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread erik quanstrom
> I don't despise XeTeX. Nor Unicode. And I will take Unicode as is. But I > will take TeX conventions as is too, since I'm working on TeX, and not > another formatting system; since these conventions are confined to the > ASCII subrange and only diverging from ASCII for the not glyph > positions.

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread tlaronde
On Thu, Jun 30, 2011 at 10:51:53AM -0400, Karljurgen Feuerherm wrote: > [...] > > >But starting with "modern fonts", "modern system", "archaic" and the > like, it's like starting with: "only Adolf Hitler would still use not > Unicode fonts". > > Looking here: http://scripts.sil.org/cms/scripts/pa

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread Michael Kerpan
On Thu, Jun 30, 2011 at 10:51 AM, Karljurgen Feuerherm wrote: > Thanks for this. Two notes: > >>Re-reading it, it's not "all ligatures" that are gone with > "Unicode-compliant fonts", but it spoke about the em- and en-dashes and > double quotes. So on these ones, I plead guilty. > > Alright. Not a

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread Karljurgen Feuerherm
Thanks for this. Two notes: >Re-reading it, it's not "all ligatures" that are gone with "Unicode-compliant fonts", but it spoke about the em- and en-dashes and double quotes. So on these ones, I plead guilty. Alright. Not a big deal, it seems to me. >But starting with "modern fonts", "modern sy

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread tlaronde
On Thu, Jun 30, 2011 at 09:14:10AM -0400, erik quanstrom wrote: > > But, as the present state allows the use for every character set that > > fits in eight bits, by using (for Plan9 users) tcs(1) to feed TeX with > > what it expects, I will not delay forever the release of 1.0 waiting for > > this

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread erik quanstrom
> But, as the present state allows the use for every character set that > fits in eight bits, by using (for Plan9 users) tcs(1) to feed TeX with > what it expects, I will not delay forever the release of 1.0 waiting for > this next solution. good grief. how hard is it to write this code!? this b

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-30 Thread tlaronde
On Wed, Jun 29, 2011 at 07:43:08PM -0400, Karljurgen Feuerherm wrote: >[...] First to make clear what I was referring to (and making a false generalization): the XeTeX FAQ: "However, standard Unicode-compliant fonts do not include ligatures for these sequences, as the normal expectation is that

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-29 Thread Karljurgen Feuerherm
I'd like to make a few comments concerning what you say below. 1. I've been involved with Unicode, both UTC and as a representative to WG2, and I can confidently affirm that there is no Unicode God. No one has ever said There is no Code but Unicode, and UTC/WG2 is its prophet, or anything like th

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-28 Thread erik quanstrom
> BUT the documentation found told that with "modern" fonts, one has the > absolute obligation threatened by Thy Unicode GOD to enter the codepoint > and that ligatures were deprecated. well of course, just use tcs. ;-|. - erik

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-28 Thread tlaronde
On Tue, Jun 28, 2011 at 01:19:15PM +0200, tlaro...@polynum.com wrote: >[...] > some \'e let > CID Please ignore this trailing garbage. -- Thierry Laronde http://www.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-28 Thread tlaronde
On Mon, Jun 27, 2011 at 05:17:16PM -0400, Michael Kerpan wrote: > > The subfont system works fine if you both have a complete Type 1 font > set including all the "expert fonts" including the extra glyphs and > the like AND are willing to put together a mapping for it. The problem > is that fonts h

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-28 Thread tlaronde
On Mon, Jun 27, 2011 at 07:45:34PM -0400, Karljurgen Feuerherm wrote: > Thierry, > > > I only say that: > > > 1) Forcing, as this was written in the XeTeX FAQ, user to enter the > special codepoint for the fi ligature since, white eyes, scornful wave > of the hand: "this is the way this is done w

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread erik quanstrom
> I don't know who told you that... just because there is a codepoint > for something does not mean that one has to access that codepoint > directly in all cases. Software at various levels can render a > ligature on the basis of various actual character sequences (e.g. f + > i, or f, i when lig
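
To illustrate the point quoted above (the ligature is produced from the plain sequence f + i, without the user typing its codepoint), here is a minimal sketch in Python. It is not how TeX, XeTeX or any font engine does it: a real renderer substitutes glyphs using the font's own ligature table, whereas this sketch substitutes Unicode presentation-form characters in a string. The names `LIGATURES` and `apply_ligatures` are made up for the example.

```python
# Hypothetical ligature table: plain letter sequences -> Unicode
# presentation forms (U+FB00..U+FB04). A real engine would map to
# glyphs in the font instead of to characters.
LIGATURES = {
    "ffi": "\uFB03",
    "ffl": "\uFB04",
    "ff": "\uFB00",
    "fi": "\uFB01",
    "fl": "\uFB02",
}


def apply_ligatures(text, table=LIGATURES):
    """Replace each known letter sequence by its ligature, longest match first."""
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No ligature starts at this position: copy the character as is.
            out.append(text[i])
            i += 1
    return "".join(out)
```

The user only ever types `f` and `i`; anything placed between the two letters prevents the match, so the sequence is left alone.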

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread Karljurgen Feuerherm
Thierry, > I only say that: > 1) Forcing, as this was written in the XeTeX FAQ, user to enter the special codepoint for the fi ligature since, white eyes, scornful wave of the hand: "this is the way this is done with Unicode" is sheer stupidity. I don't know who told you that... just because t

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread Michael Kerpan
On Mon, Jun 27, 2011 at 2:01 PM, wrote: > On Mon, Jun 27, 2011 at 01:34:07PM -0400, erik quanstrom wrote: >> >> i don't even have an opinion on this. i don't understand the conflation >> of the input character set and tex's internal representations. could >> you explain why you are talking about

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread tlaronde
On Mon, Jun 27, 2011 at 01:34:07PM -0400, erik quanstrom wrote: > > i don't even have an opinion on this. i don't understand the conflation > of the input character set and tex's internal representations. could > you explain why you are talking about them as the same? > > to be brutally honest,

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread erik quanstrom
> As can be clear from the even more disastrous level of my English > than usual, I only had a minute or two to write the message. > > I DON'T SAY THAT I WILL RESTRICT TEX TO THE FIRST 256 CODEPOINTS. > > This is precisely why I have rejected your proposal. KerTeX will > provide, because this is

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread tlaronde
On Mon, Jun 27, 2011 at 08:36:35AM -0400, erik quanstrom wrote: > > and there's no penalty for having that many glyphs. it just > means that my font file has a couple hundred subfonts. these > are only open if needed. typically only 3 subfonts are open > at any one time. As can be clear from th

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread Karljurgen Feuerherm
Thanks for bringing up Sumerian (better: Sumero-Akkadian Cuneiform). I was thinking along exactly those lines. For me at least, solutions that satisfy 'the majority' are no solutions at all. And obviously, I'm not alone. (Though it could well be that I missed the intent of Thierry's comment and a

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread erik quanstrom
> But I don't want to have the obligation to "know" 65536 signs to > express what I want to express. I'm sorry, but I think that the > main majority (remember that for latin1/latin2 accented letters > are just variants so need less "user memory" than plain different > characters) can do with (less

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-27 Thread tlaronde
On Sun, Jun 26, 2011 at 09:01:13PM -0400, Michael Kerpan wrote: > On Sun, Jun 26, 2011 at 3:57 AM, wrote: > > > I don't know what "automagic" ligatures are; but ligatures are here in > > the kerTeX fonts, user having nothing special to do to have them. Small > > caps are here. Using the system f

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-26 Thread Michael Kerpan
On Sun, Jun 26, 2011 at 3:57 AM, wrote: > I don't know what "automagic" ligatures are; but ligatures are here in > the kerTeX fonts, user having nothing special to do to have them. Small > caps are here. Using the system fonts is here too, at least for T1 > fonts: afm2tfm(1) makes them available

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-26 Thread tlaronde
On Sat, Jun 25, 2011 at 02:43:32PM -0400, Michael Kerpan wrote: > Modern TeX implementations like XeTeX and LuaTeX handle UTF-8 natively > and also bring all sorts of benefits like OpenType support (automagic > ligatures, real small caps, selectable lining or old-style figures and > more) and the a

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-25 Thread Michael Kerpan
Modern TeX implementations like XeTeX and LuaTeX handle UTF-8 natively and also bring all sorts of benefits like OpenType support (automagic ligatures, real small caps, selectable lining or old-style figures and more) and the ability to define fonts from the system font pool rather than using archa

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-25 Thread tlaronde
On Sat, Jun 25, 2011 at 04:34:17PM +, Mauricio CA wrote: > > Since TeX is "8 bits", the tex file must have characters encoded in > > 8 bits, with the not control positions of the first half being, after > > perhaps mapping defined at compile time (can be remapped at user level > > but with appa

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-25 Thread Mauricio CA
> Since TeX is "8 bits", the tex file must have characters encoded in > 8 bits, with the not control positions of the first half being, after > perhaps mapping defined at compile time (can be remapped at user level > but with apparently "strange" macro commands), conforming to ASCII--- > used as li

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-25 Thread tlaronde
On Sat, Jun 25, 2011 at 11:11:50AM -0400, erik quanstrom wrote: > On Sat Jun 25 11:01:38 EDT 2011, tlaro...@polynum.com wrote: > > > > I mean the .tex file. The font files as seen by TeX are only the metrics > > tfm, and they are binaries. > > so are you planning on hiding this conversion within

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-25 Thread erik quanstrom
On Sat Jun 25 11:01:38 EDT 2011, tlaro...@polynum.com wrote: > On Sat, Jun 25, 2011 at 08:19:40AM -0400, erik quanstrom wrote: > > > So for now, TeX is kept 8 bits. I make no assumption for the encoding > > > (and user has to feed "8 bits encoding" to TeX; ASCII users have nothing > > > to change;

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-25 Thread tlaronde
On Sat, Jun 25, 2011 at 08:19:40AM -0400, erik quanstrom wrote: > > So for now, TeX is kept 8 bits. I make no assumption for the encoding > > (and user has to feed "8 bits encoding" to TeX; ASCII users have nothing > > to change; others, if they want to use directly another 8 bits encoding > > (ex.

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-25 Thread erik quanstrom
> So for now, TeX is kept 8 bits. I make no assumption for the encoding > (and user has to feed "8 bits encoding" to TeX; ASCII users have nothing > to change; others, if they want to use directly another 8 bits encoding > (ex.: directly accented letters latin1 code) have to tcs(1) the file > first

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-24 Thread tlaronde
On Fri, Jun 24, 2011 at 11:05:23PM +, Mauricio CA wrote: > > I found this text in TeX by Topic[1] that seems to support Quanstrom's > idea. It describes how TeX reads input, and says it's done one line at > a time (where it follows what the system defines as lines) and then for > each line it

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-24 Thread Mauricio CA
>> i'm not sure what the hard part is. just front the normal input function >> with one that calls chartorune and rejects anything above codepoint 255. >> that can't be more than 10 lines of code. [...] > Yes, "casting" to byte can do and this is almost trivial since the input > is buffered and h

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-21 Thread tlaronde
On Mon, Jun 20, 2011 at 05:53:25PM -0400, erik quanstrom wrote: > > i'm not sure what the hard part is. just front the normal input > function with one that calls chartorune and rejects anything above > codepoint 255. that can't be more than 10 lines of code. > > that way there is no possibilit
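
The input filter suggested in the quoted message (decode the UTF-8 input, reject anything above codepoint 255, hand the rest to the 8-bit engine) could look roughly like the sketch below. It is written in Python rather than against Plan 9's chartorune, so it only shows the logic, not actual kerTeX code; the function name `utf8_to_bytes256` is invented for the example.

```python
def utf8_to_bytes256(data: bytes) -> bytes:
    """Decode UTF-8 input and return one byte per character.

    Raises ValueError on malformed UTF-8 or on any codepoint above 255,
    so that an 8-bit engine never receives input it would silently mangle.
    """
    try:
        text = data.decode("utf-8")  # strict decoding: bad input raises
    except UnicodeDecodeError as err:
        raise ValueError(f"malformed UTF-8 at byte {err.start}") from err

    out = bytearray()
    for pos, ch in enumerate(text):
        cp = ord(ch)
        if cp > 255:
            raise ValueError(
                f"codepoint U+{cp:04X} at character {pos} does not fit in 8 bits"
            )
        out.append(cp)  # codepoints 0..255 map one-to-one onto byte values
    return bytes(out)
```

Because the first 256 Unicode codepoints coincide with Latin-1, the accepted output is exactly the Latin-1 encoding of the input; anything else is refused outright instead of being truncated.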

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-20 Thread erik quanstrom
On Mon Jun 20 07:17:16 EDT 2011, tlaro...@polynum.com wrote: > On Sun, Jun 19, 2011 at 06:38:59PM -0400, erik quanstrom wrote: > > > > nobody cares what font encoding tex uses internally. the > > real issue is the input to tex. i sure would be very reluctant > > to load anything on my system tha

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-20 Thread tlaronde
On Sun, Jun 19, 2011 at 06:38:59PM -0400, erik quanstrom wrote: > > nobody cares what font encoding tex uses internally. the > real issue is the input to tex. i sure would be very reluctant > to load anything on my system that will mangle utf-8, especially > for codepoints <256. that's the path

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-19 Thread erik quanstrom
> > perhaps you mean the subset of unicode corresponding to the codepoints > > encoded by latin1 encoded in utf-8. the system character set is utf-8, > > and latin1 is not a compatible encoding. utf-8 is assumed everywhere except > > when the data is inbound, and explicitly tagged as having a diff

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-19 Thread tlaronde
On Sun, Jun 19, 2011 at 06:34:58PM +0200, tlaronde wrote: > > There is a reason here: for now, TeX is 8 bits and that's all. So, if > allowing to use, at least, all of the 8 bits means something, it shall > be latin1. To be more accurate: TeX is 8 bits, and wants ASCII for the first semi-range.

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-19 Thread tlaronde
On Sun, Jun 19, 2011 at 10:07:19AM -0400, erik quanstrom wrote: > > > > Why latin1? Not only because, as a French, I use it, but because it is > > compatible with unicode. > > perhaps you mean the subset of unicode corresponding to the codepoints > encoded by latin1 encoded in utf-8. the system
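
The distinction drawn in the quoted reply, between sharing codepoints and sharing an encoding, can be checked with a few lines of Python. This is only an illustration; the helper name `encodings` is made up.

```python
def encodings(ch: str):
    """Return (codepoint, Latin-1 bytes, UTF-8 bytes) for one character."""
    return ord(ch), ch.encode("latin-1"), ch.encode("utf-8")


# 'é' has codepoint 0xE9 in both Latin-1 and Unicode, but its byte
# sequences differ: one byte in Latin-1, two bytes in UTF-8.
codepoint, as_latin1, as_utf8 = encodings("\u00e9")
```

For ASCII characters the two encodings give the same single byte; from 0x80 upward the codepoints still agree but the bytes on disk do not, which is why a Latin-1 file is not valid UTF-8 input.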

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-19 Thread erik quanstrom
> I've given a look at it. I don't want to start a discussion about > Unicode, since, supplementary to the "characters" (alphabetical, > syllabics, ideographics; but no hieroglyphs or Linear B, so it's not > complete ;) not central to my point, but this is not correct ; grep -i 'linear b syllab
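
The same kind of check can be made without a copy of the Unicode data file to grep, using the character database shipped with Python. This is a sketch; `block_sample` is an invented helper, and the two starting codepoints are those of the Linear B Syllabary and Egyptian Hieroglyphs blocks.

```python
import unicodedata


def block_sample(first: int, count: int = 3) -> list[str]:
    """Return the character names of `count` codepoints starting at `first`."""
    return [
        unicodedata.name(chr(cp), "<unassigned>")
        for cp in range(first, first + count)
    ]


linear_b = block_sample(0x10000)      # Linear B Syllabary block
hieroglyphs = block_sample(0x13000)   # Egyptian Hieroglyphs block
```

Both lists come back with real character names, so the scripts are encoded; whether a given system has fonts for them is a separate question.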

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-19 Thread erik quanstrom
> I have so extended the encoding used to generate the virtual fonts so > that for the ASCII range it matches the Computer Modern expectations > (hence it is totally compatible with plain TeX), and so that the latin1 > encoding used as input will give the correct glyphs. And the cryptic > names wi

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-17 Thread tlaronde
On Fri, Jun 17, 2011 at 02:07:42PM -0400, Joel C. Salomon wrote: > On 06/17/2011 11:37 AM, tlaro...@polynum.com wrote: >[...] > > but no hieroglyphs or Linear B, so it's not complete ;) > > The fonts may be lacking, but Hieroglyphs & Linear B *are* in Unicode; > see and > . I stand corrected

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-17 Thread Joel C. Salomon
On 06/17/2011 11:37 AM, tlaro...@polynum.com wrote: > On Fri, Jun 17, 2011 at 10:18:20AM -0400, Joel C. Salomon wrote: >> At which point you've reinvented XeTeX. > > I've given a look at it. I don't want to start a discussion about > Unicode, since, supplementary to the "characters" > there are f

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-17 Thread tlaronde
On Fri, Jun 17, 2011 at 10:18:20AM -0400, Joel C. Salomon wrote: >[...] > OK to generate automatically. But "ae"→"æ" and "oe", &c.; please > don't make these substitutions I have already found (and answered) that "oe" cannot be a ligature since, even in French, the "oe" sequence appears in wo

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-17 Thread Joel C. Salomon
On Thu, Jun 16, 2011 at 8:17 AM, wrote: > Second question: I'm trying to find if, in western languages, including > ligatures for ae and oe would be good since it is generally needed (one > can forbid ligatures by inserting "{}" between the letters), or if it's > not correct to set this by defaul

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-16 Thread tlaronde
On Thu, Jun 16, 2011 at 11:43:28AM -0700, Bakul Shah wrote: > > Modifying TeX to accept utf as input (I mean the compiler/interpreter by > > itself; not macros), converting to rune and then using 16 bits à la math > > mode to switch inside a font family to the "correct" 256 vector is > > something

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-16 Thread Bakul Shah
> Modifying TeX to accept utf as input (I mean the compiler/interpreter by > itself; not macros), converting to rune and then using 16 bits à la math > mode to switch inside a font family to the "correct" 256 vector is > something that, for a first step, seems to me both reasonable and > simple. W
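
One way to read the proposal quoted above (decode to a rune, then use 16 bits to select the "correct" 256-position vector inside a font family) is as a split of the codepoint into a high byte and a low byte. The sketch below shows only that arithmetic; it is not the kerTeX design, and the name `split_rune` and the restriction to 16 bits are assumptions made for the example.

```python
def split_rune(cp: int) -> tuple[int, int]:
    """Split a 16-bit codepoint into (vector index, position in the vector).

    The high byte picks one of 256 font vectors, the low byte the
    glyph position inside that 256-entry vector.
    """
    if not 0 <= cp <= 0xFFFF:
        raise ValueError(f"U+{cp:04X} does not fit in 16 bits")
    return cp >> 8, cp & 0xFF
```

With this split, every Latin-1 character lands in vector 0 at its usual position, while for example U+0153 (œ) lands in vector 1 at position 0x53.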

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-16 Thread tlaronde
On Thu, Jun 16, 2011 at 02:17:00PM +0200, tlaro...@polynum.com wrote: >[...] > Second question: I'm trying to find if, in western languages, including > ligatures for ae and oe would be good since it is generally needed (one > can forbid ligatures by inserting "{}" between the letters), or if it's

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-16 Thread tlaronde
On Thu, Jun 16, 2011 at 12:49:12PM -0400, Russ Cox wrote: > Virtual fonts tricks can't be the correct solution. Virtual fonts are not the whole solution. To accept, naturally, utf as input, TeX will have to be adapted (and it is perhaps not as deep as one could think). But virtual fonts can use f

Re: [9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-16 Thread Russ Cox
Virtual fonts tricks can't be the correct solution. The correct solution is to use a font format that can handle >256 glyphs, such as OTF. This is what heirloom troff does. Failing that, it is not clear how much you want to hack up tex versus just going along to get along. For Latin alphabets, the

[9fans] [RFC] fonts and unicode/utf [TeX]

2011-06-16 Thread tlaronde
Hello, I'm currently exploring, for kerTeX, the area I know least about so far: fonts. It seems that the TeX community has spent a huge amount of time, and produced a huge number of tricks to try to use fonts that have glyphs Computer Modern lacks, especially accented letters. I