Re: [HarfBuzz] Exposing attachment tree / Arabic joining to shaping clients
Dear Behdad,

> For reasons that many of you know (letter-spacing, Arabic elongation, other
> postprocessing) I would like to expose attachment data to the shaping clients.
> There are two separate pieces so far:
>
> - The Arabic joining info, which is applicable to all Arabic-like fonts,
>   even the ones that do NOT use cursive joining.
>
> - Mark attachment and cursive attachment. These form a tree with an
>   attachment-type enum.
>
> I'm not sure which slots in the pos buffer to expose this in. The latter
> definitely belongs to the pos buffer, whereas the former is more a property
> of the text. So I feel like we should expose them separately.
>
> Ideas?

Pulling back to the more generic script question, I think both may be covered by giving each glyph an attachment parent and an attachment type. Arabic joining would then be expressed as cursive attachment, and diacritic attachment as mark attachment. IMHO, this would also apply naturally to other scripts and give some help with regard to justification.

Yours,
Martin
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] Why mark glyphs are skipped when MarkBasePos matching?
Dear Shusaku,

> For example, the following glyph sequence is input. B is a glyph in
> Base Coverage Table, M is a glyph in Mark Coverage Table
> and m is a mark glyph defined by GDEF.
>
> B m M
>
> The implementation of HarfBuzz finds B as base of M even though there
> is m between B and M. If m is not a mark glyph, B and M
> are not matched. I tried to find out the reason of the behavior in the
> specification, but could not. Can anyone teach me why
> such behavior is needed? The link to the related discussion log is enough.

In the OpenType spec, under GPOS Lookup Type 4, there is this start to a paragraph:

  To identify the base glyph that combines with a mark, the text-processing client must look backward in the glyph string from the mark to the preceding base glyph.

Unfortunately, the English here is ambiguous. It could be saying that the immediately preceding glyph is tested and, if it is a base, attachment occurs. On that reading, only the sequence B M would result in attachment, and you would have to tell the lookup to skip marks for it to find that the immediately preceding glyph is a base. I would suggest that this is the more common use of a phrase like 'the preceding glyph'.

But 'preceding' may also mean the first glyph, looking backwards, that meets the requirement of being a base. In this case, it is this latter interpretation that is intended. I would suggest that if the sentence were rewritten without using the word 'preceding', things would be a lot clearer.

Yours,
Martin
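The backward search described above can be sketched in a few lines of Python. This is only an illustration of the "skip intervening marks" reading, not HarfBuzz's actual code; the function name `find_base` is made up, and the class numbers follow the GDEF glyph class values (1 = base, 2 = ligature, 3 = mark).

```python
BASE, LIGATURE, MARK = 1, 2, 3  # GDEF glyph class values

def find_base(glyph_classes, mark_index):
    """Look backward from a mark, skipping other marks.

    Returns the index of the first non-mark glyph found if it is a base,
    otherwise None. Hypothetical helper, for illustration only.
    """
    i = mark_index - 1
    while i >= 0 and glyph_classes[i] == MARK:
        i -= 1  # intervening marks are skipped
    if i >= 0 and glyph_classes[i] == BASE:
        return i
    return None

# B m M: the intervening mark m is skipped, so M finds B.
print(find_base([BASE, MARK, MARK], 2))      # 0
# B x M where x is not a mark: the search stops at x and no base is found.
print(find_base([BASE, LIGATURE, MARK], 2))  # None
```

Under the other reading of 'preceding', the loop would be dropped and only `glyph_classes[mark_index - 1]` tested.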
[HarfBuzz] problems with hmn
Dear Behdad,

I notice that there are a lot of language tags that are mapped to the internal OT language tag HMN. Given that a bunch of those are Miao languages that have significant stylistic variation among them, I am interested in removing their unification. Do you know where this set of mappings came from (particularly hmd, sfm, etc.)? Is there any chance of removing them, or does Uniscribe suffer from the same fate?

TIA,
Yours,
Martin
[HarfBuzz] Changes in SIL
Dear All,

This is an internal name change announcement that may be relevant to folk here:

Within SIL the team that has worked on scripts and writing system issues has been known as the NRSI (Non-Roman Scripts Initiative). But as the team's work has widened to include Roman scripts (since 2001) and more than just fonts, there has been a growing feeling that a team name change is in order. Therefore in January 2019 the team will change its name to WSTech (Writing Systems Technology), and updates to websites etcetera are already underway.

A clear case of change the name and do the same!

Yours,
Martin
Re: [HarfBuzz] [docs][question] Emoji "shaping", etc.
Dear Nathan,

> Kind of a high-level question about describing HB functionality
>
> Would you consider the handling of emoji variation-selector sequences to be
> "shaping", or some other operation?
>
> Feels like kind of a gray area to me; when it comes to describing what
> HarfBuzz does and what the use cases are that developers need it for,
> though, it's important to get it right. So I'd like to know what the
> consensus is

Not for me, since it involves neither a glyph-to-glyph mapping nor the positioning of any glyphs. But creating ligatures, kerning (of any kind) and getting combining marks positioned correctly all are shaping. So handling the skin tone letters to give different kinds of faces in emoji is shaping.

> [Related question also applies to handling the MATH table -- AIUI, HarfBuzz
> leaves math layout to others higher up in the stack, so I wouldn't call
> math-table support "math shaping", but other people may see it
> differently...]

HarfBuzz need not be the only shaping processor :) But at that point I think we are splitting hairs.

Yours,
Martin
Re: [HarfBuzz] query cvXX feature name table references
On Mon, 30 Apr 2018 23:22:29 +0200 Khaled Hosny <khaledho...@eglug.org> wrote:

> On Mon, Apr 30, 2018 at 08:50:57PM +0700, Martin Hosken wrote:
> > Dear Behdad,
> >
> > Do you have any plans (pretty please) to add an API to enable a client
> > to query a font to get hold of the name table references for the
> > various cvXX features in a font?
>
> See https://github.com/harfbuzz/harfbuzz/pull/976, may be you can
> comment on the proposed API?

This looks good. Given that there is already a function for iterating over all the features in a font, it should not be too hard to create a UI that presents feature options to a user (which is the reason I'm asking).

So how long before this PR gets in?

Yours,
Martin
Re: [HarfBuzz] script segmentation
Dear Richard,

I would reply to my own message, but that never comes back to me. Thank you (NOT) Google. So here goes.

I've started a discussion document here: https://github.com/OpenType/opentype-layout/blob/master/docs/script_segmentation.md. Please feel free to interact with it. If you are one of the implementers of the segmentation algorithms summarised, please feel free to correct me. I'm certain I've made lots of mistakes in describing this.

Yours,
Martin

> > 1. Do we have a standard algorithm for this?
> Well, the obvious fix is a per-block default script, just as some
> unassigned characters have a default property of AL or R. The problem
> comes with Indic scripts, though a default of consonant will often work.
>
> > 2. Do we want one?
> I suspect you're the expert. How well does MultiScribe work on
> Windows? On Apple systems, the answer for ordinary users is to use
> AAT, and I suspect that will soon extend to Linux applications courtesy
> of HarfBuzz. I don't know if that would work on ChromeOS.
>
> On the other hand, in the free world it would be nice to test out
> OpenType fonts. Several applications already use a Linux sharable
> object for HarfBuzz, and one could in principle replace them with a
> version that already included the new characters. LibreOffice is one
> such application.
>
> > 3. How can we make it more future resilient?
>
> A mechanism that ascribes properties to PUA points could be extended to
> unassigned characters in general.
>
> In principle, the USE grammar policeman is a problem. Combining marks
> can usually be identified by an OpenType glyph category of 'mark', but
> unassigned combining marks are unlikely to get a security clearance, so
> the obvious relaxation will not work.
>
> Richard.
[HarfBuzz] script segmentation
Dear All,

One problem I am facing as we add characters to Unicode is that when a character is added to a block, there is no guarantee that an existing application will keep that character in the same run as the other characters of that block's script. This means the app is broken until the character is published in a future Unicode standard, a library is updated, and the application is updated to use the new version of the library. It also makes it impossible to test out proposed changes to Unicode.

It would be great if we could come up with a standard script segmentation algorithm for runs of text that is also somewhat future proof, even if it is not perfect and changes in the future. A best guess at what script an unknown character may take has a much higher probability of being correct than giving it a special script category of unknown, which is always going to be wrong.

So:

1. Do we have a standard algorithm for this?
2. Do we want one?
3. How can we make it more future resilient?

TIA,
Yours,
Martin
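One possible shape for the "best guess" idea is a per-block default script that is consulted only when a character has no script assigned in the data the application ships with. The Python sketch below is purely illustrative: the tables `KNOWN_SCRIPTS` and `BLOCK_DEFAULTS` and the function names are made up, and real data would come from the Unicode Character Database rather than hand-written dictionaries.

```python
# Toy tables, for illustration only (not real UCD data).
KNOWN_SCRIPTS = {0x1000: "Mymr", 0x1001: "Mymr", 0x0041: "Latn", 0x0042: "Latn"}
BLOCK_DEFAULTS = [(0x0000, 0x024F, "Latn"), (0x1000, 0x109F, "Mymr")]

def guess_script(cp):
    """Known script if we have one, else the block default, else unknown."""
    if cp in KNOWN_SCRIPTS:
        return KNOWN_SCRIPTS[cp]
    for start, end, script in BLOCK_DEFAULTS:
        if start <= cp <= end:
            return script  # best guess for a character we know nothing about
    return "Zzzz"

def segment(text):
    """Split text into (script, substring) runs using guess_script."""
    runs = []
    for ch in text:
        script = guess_script(ord(ch))
        if runs and runs[-1][0] == script:
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(s, t) for s, t in runs]

# U+1099 is absent from the toy table, so it plays the "newly added" character.
print(segment("\u1000\u1099\u1001AB"))
```

With the block default, the unknown character stays inside the Myanmar run instead of splitting it into three runs.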
[HarfBuzz] unsafe to break
Dear Behdad,

Please could you explain the purpose and function of HB_GLYPH_FLAG_UNSAFE_TO_BREAK? Is this about line breaking? Grapheme clustering?

TIA,
Yours,
Martin
[HarfBuzz] seems not to build without fallback shaper
Dear Behdad,

I recently had occasion to build harfbuzz without setting -DHAVE_FALLBACK and it didn't do too well. I don't think it unreasonable to require the fallback shaper. But in that case, perhaps it's worth getting rid of the need for -DHAVE_FALLBACK.

A minor point, but I wanted to flag it in passing. And I could be wrong.

Yours,
Martin
Re: [HarfBuzz] harfbuzz: Branch 'master'
Dear Behdad,

> -  if (buffer == NULL)
> +  if (!buffer)

Always wanting to learn. How does this cause a divide by zero? Or what led you to make the change?

TIA,
Martin
[HarfBuzz] parse_one_feature
Dear Behdad,

I notice that hb-shape has a parse_one_feature function, but that nothing refers to it and it is not in the public API. Does this mean that it is deprecated or that one day you will publicise it?

Yours,
Martin
Re: [HarfBuzz] Python: accessing harfbuzz enums
On Sun, 19 Jun 2016 16:34:23 -0400 Kelvin Ma wrote:

> When using the python harfbuzz bindings, how do I access the harfbuzz enums
> for direction, script, & language? Also can someone explain the difference
> between direction and script because I thought script implies direction.

Look in hb-common.h, which is pulled in by hb.h. There are enums for hb_direction_t and hb_script_t. There is no enum for hb_language_t, but you can get one by using hb_language_from_string().

A run needs to have a direction separate from its script because, in the case of bidi scripts like Arabic, a particular sequence of characters may run in the opposite direction to the default direction of the script (arab). Harfbuzz expects each run to be in a single direction. So a sequence of Arabic digits in an Arabic text, ARABIC1234ARABIC, would get passed to harfbuzz as 3 runs: "ARABIC" (rtl), "1234" (ltr), "ARABIC" (rtl). A bidi processor, say from ICU, will do the maths for you here and work out the runs and their directions.

HTH,
Yours,
Martin
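To make the three-run example concrete, here is a deliberately simplified Python sketch that splits a string into single-direction runs using only the Bidi_Class of each character from the standard `unicodedata` module. It is a toy, not the Unicode Bidirectional Algorithm (no embedding levels, no neutral resolution), and the function name `direction_runs` is made up; a real application would use a bidi library such as ICU, as noted above.

```python
import unicodedata

def direction_runs(text):
    """Toy split into (substring, direction) runs, in logical order.

    Characters with Bidi_Class R or AL are treated as rtl, everything
    else (including digits) as ltr. This ignores most of the real UBA.
    """
    runs = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        direction = "rtl" if bidi in ("R", "AL") else "ltr"
        if runs and runs[-1][1] == direction:
            runs[-1][0] += ch
        else:
            runs.append([ch, direction])
    return [tuple(r) for r in runs]

arabic = "\u0639\u0631\u0628\u064a"  # four Arabic letters standing in for "ARABIC"
for run, direction in direction_runs(arabic + "1234" + arabic):
    print(len(run), direction)
```

Each resulting run would then be shaped separately, with its own direction set on the buffer.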
Re: [HarfBuzz] have markfiltersets ever worked?
Dear All,

> > I meant 1.2.3 and 1.2.4 are broken.

I may be wrong, but my understanding is that this error came in with revision 9a13ed453ef96822a47d6e6f58332b87f38d5c59, which was released with 1.2.1. So the problem may be a bit wider than that. I don't think Android has this in, at least not in 6.0.1, which I think uses older code. But I can't speak for the webview application/component.

> Firefox is currently on 1.2.2:
>
>   https://bugzilla.mozilla.org/show_bug.cgi?id=1249861
>
> and we'll jump straight to 1.2.5 when we update:
>
>   https://bugzilla.mozilla.org/show_bug.cgi?id=1251203
>
> so no problem here.

Might be worth checking: see hb-open-type-private.hh and the struct FixedVersion around line 750.

GB,
Martin
Re: [HarfBuzz] difference between harfbuzz and uniscribe?
Dear Behdad,

Some pertinent further information: both diacritics were already attached to the base before these rules are applied.

> Consider a font with the following FEA rule in it:
>
> pos u1014' 79 u1032 u1037;
>
> which adds 79 to the advance of the first glyph in the sequence u1014 u1032
> u1037:
>
> [u1014=0+609|u1032=0@-89,-42+0|u1037=0@-55,0+0]
>
> vs without the 1037:
>
> [u1014=0+530|u1032=0@-10,-42+0]
>
> But if we compare with a recent uniscribe we get:
>
> [u1014=0+609|u1032=1@-10,-42+0|u1037=2@24,0+0]
> [u1014=0+530|u1032=1@-10,-42+0]
>
> and if we change the rule to compensate for the advance in the windows case
> we get:
>
> pos u1014' 79 u1032' <-79 0 0 0> u1037;
>
> harfbuzz gives:
>
> [u1014=0+609|u1032=0@-168,-42+0|u1037=0@-55,0+0]
> [u1014=0+530|u1032=0@-10,-42+0]
>
> uniscribe gives:
>
> [u1014=0+609|u1032=1@-89,-42+0|u1037=2@24,0+0]
> [u1014=0+530|u1032=1@-10,-42+0]
>
> Yours,
> Martin
[HarfBuzz] difference between harfbuzz and uniscribe?
Dear Behdad,

Consider a font with the following FEA rule in it:

pos u1014' 79 u1032 u1037;

which adds 79 to the advance of the first glyph in the sequence u1014 u1032 u1037:

[u1014=0+609|u1032=0@-89,-42+0|u1037=0@-55,0+0]

vs without the u1037:

[u1014=0+530|u1032=0@-10,-42+0]

But if we compare with a recent uniscribe we get:

[u1014=0+609|u1032=1@-10,-42+0|u1037=2@24,0+0]
[u1014=0+530|u1032=1@-10,-42+0]

and if we change the rule to compensate for the advance in the Windows case we get:

pos u1014' 79 u1032' <-79 0 0 0> u1037;

harfbuzz gives:

[u1014=0+609|u1032=0@-168,-42+0|u1037=0@-55,0+0]
[u1014=0+530|u1032=0@-10,-42+0]

uniscribe gives:

[u1014=0+609|u1032=1@-89,-42+0|u1037=2@24,0+0]
[u1014=0+530|u1032=1@-10,-42+0]

Yours,
Martin
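The bracketed lines above are in the hb-shape text format, where each glyph is written as name=cluster@x_offset,y_offset+x_advance and the @... part is omitted when both offsets are zero. A small Python sketch like the following, with a made-up helper name `parse_shape`, can pull the numbers out so that the two shapers' results are easy to compare. It only handles the horizontal form shown in this message.

```python
import re

# name=cluster[@dx,dy]+advance, as seen in the output lines quoted above
GLYPH_RE = re.compile(
    r"(?P<name>[^=|\[\]]+)=(?P<cluster>\d+)"
    r"(?:@(?P<dx>-?\d+),(?P<dy>-?\d+))?\+(?P<adv>-?\d+)"
)

def parse_shape(line):
    """Return a list of (name, cluster, x_offset, y_offset, x_advance)."""
    return [
        (m["name"], int(m["cluster"]), int(m["dx"] or 0),
         int(m["dy"] or 0), int(m["adv"]))
        for m in GLYPH_RE.finditer(line)
    ]

hb = parse_shape("[u1014=0+609|u1032=0@-89,-42+0|u1037=0@-55,0+0]")
us = parse_shape("[u1014=0+609|u1032=1@-10,-42+0|u1037=2@24,0+0]")

# Ignore cluster numbering (it differs by design) and compare positions only.
for h, u in zip(hb, us):
    if h[2:] != u[2:]:
        print(h[0], "x offset differs by", u[2] - h[2])
```

For the first pair of results this reports a difference of 79 for both marks, which is exactly the amount added to the advance of u1014.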
[HarfBuzz] have markfiltersets ever worked?
Dear All,

Has anyone had any success with mark filter sets? According to bug #238 there is a problem in hb-open-type-private.hh, where the to_int() function is programmed wrongly (it needs to multiply sizeof() by 8 when shifting). This would mean that no mark filter set lookup will ever succeed. Has anyone managed to get harfbuzz to execute a lookup with a mark filter set in it?

Oh, and you can't work around it by setting the GDEF version to 0x40010002, because harfbuzz carefully sanitizes that away. So there's no workaround. Back to MarkAttachment, I suppose.

Yours,
Martin
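The arithmetic behind the bug as described can be checked with a few lines of Python. This is a sketch under the assumption that the version is stored as a 16-bit major and a 16-bit minor field and that to_int() combines them by shifting the major part; the function names are made up and this is not the HarfBuzz source.

```python
MINOR_BYTES = 2  # sizeof() of a 16-bit minor field, in bytes

def to_int_buggy(major, minor):
    # Shifts by the size in bytes, which is the error described above.
    return (major << MINOR_BYTES) + minor

def to_int_fixed(major, minor):
    # Shifts by the size in bits: sizeof() multiplied by 8.
    return (major << (MINOR_BYTES * 8)) + minor

# GDEF version 1.2, the first version with mark glyph sets.
print(hex(to_int_fixed(1, 2)))       # 0x10002
print(hex(to_int_buggy(1, 2)))       # 0x6, so a ">= 0x00010002" test fails

# The attempted workaround value 0x40010002 (major 0x4001, minor 2).
print(hex(to_int_buggy(0x4001, 2)))  # 0x10006
```

With the byte-sized shift, a genuine 1.2 table compares as 6, far below 0x00010002, while 0x40010002 happens to come out above it, which is presumably why that value was tried.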
[HarfBuzz] uniscribe confusion
Dear Behdad,

This is default shaping. The u1037 is attached to the u102F and we get the following results:

* uniscribe: [u1000=0+1002|u103D=1@-55,0+0|u102F=2+147|u1037=3@217,0+0]
* harfbuzz: [u1000=0+1002|u103D=0@-55,0+0|u102F=0+147|u1037=0@217,0+0]

which makes sense. Now if we introduce an increase of the advance on the u102F via a feature rule:

u102F' 272 u1037

we get:

* uniscribe: [u1000=0+1002|u103D=1@-55,0+0|u102F=2+419|u1037=3@217,0+0]
* harfbuzz: [u1000=0+1002|u103D=0@-55,0+0|u102F=0+419|u1037=0@-55,0+0]

I'm wondering whether harfbuzz is compensating where uniscribe does not. It looks like uniscribe just slaps on the extra advance and doesn't compensate the attached components. I don't really mind what results I get so long as they are consistent.

Any thoughts?

Yours,
Martin
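One way to see what each shaper is doing is to convert the results into absolute drawing positions. The Python sketch below assumes left-to-right layout where each x offset is applied relative to the pen position before that glyph's advance is added, which is how hb-shape output is normally interpreted; the helper name `absolute_x` is made up.

```python
def absolute_x(glyphs):
    """glyphs: list of (name, x_offset, x_advance); returns {name: drawn x}."""
    pen, out = 0, {}
    for name, dx, adv in glyphs:
        out[name] = pen + dx  # offset is relative to the current pen position
        pen += adv
    return out

default   = [("u1000", 0, 1002), ("u103D", -55, 0), ("u102F", 0, 147), ("u1037", 217, 0)]
harfbuzz  = [("u1000", 0, 1002), ("u103D", -55, 0), ("u102F", 0, 419), ("u1037", -55, 0)]
uniscribe = [("u1000", 0, 1002), ("u103D", -55, 0), ("u102F", 0, 419), ("u1037", 217, 0)]

for label, result in (("default", default), ("harfbuzz", harfbuzz), ("uniscribe", uniscribe)):
    print(label, absolute_x(result)["u1037"])
# default 1366
# harfbuzz 1366
# uniscribe 1638
```

On these numbers harfbuzz keeps u1037 where it was drawn before the rule (217 - 272 = -55), while uniscribe lets it shift right by the extra 272 units of advance.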
Re: [HarfBuzz] harfbuzz: Branch 'master'
Dear Behdad,

> > So, I would add an enum to the debug message to give a debug message event
> > type.
>
> My current thinking is that everything is transferred as a text API in
> one-line messages. The client can transform that to an enum if desired.

That works only if the messages are constrained to be parsable. In effect the string is being used as a way of passing an identifier and varargs. To make life easy for the poor debugger, the messages should be structured in a way that makes them parsable without knowing the context of the message.

> > One big question that always needs to be answered in the debugger is: where
> > are we? Where in the buffer are we now processing. This is the idx field of
> > the buffer. I don't think this is exposed in the public buffer interface.
> > So it either needs to be exposed or passed as part of the debug message.
>
> I'm unsure about this one. We don't expose the out_buf part of the buffer, so
> calling client code in the middle of a pass of transformation is harmful
> currently. Exposing all of that, on the other hand, leaks a lot of the buffer
> design, which I like to avoid right now. Indeed, we might end up changing the
> buffer internals to accommodate the lookup direction proposal.
>
> So, for now, no callbacks in the middle of a pass. I understand that's far
> from ideal, but at least we are now answering the big question: which lookup
> did what.

It doesn't answer: which lookup did what *where*. It's the "where" I am trying to answer. If you are given the buffer before the change and after, it's asking a lot of the debugger to work out precisely where the change occurred. Can we, therefore, please pass idx as part of the debug message?

> > For GPOS we need to be passing parameters like the two points in an
> > attachment or the actual calculated offset in a pair or single adjustment.
> > When doing classed based activities, we should be passing the class values
> > involved or perhaps pointers (or offsets) to the data structures involved
> > so that a debugger can cross-reference that back to source code.
>
> GPOS is more friendly since the buffer structure is fully exposed. Though,
> deferred attachments won't be exposed.

But even with the buffer, you don't know which AP was used or what the relative positions of the APs were.

> I'm probably going to add shape_plan to list of arguments. After that, if I
> make a release, the API is here to stay... So, speak very loudly if you think
> for whatever reason this is not workable. Ie, there are things that cannot be
> done using a message.

I can't think of any. Bear in mind that the structure of the message, at least at a high level, is part of that API.

Yours,
Martin
Re: [HarfBuzz] harfbuzz: Branch 'master'
Dear Behdad,

>  buf = hb.buffer_create ()
> +class Debugger(object):
> +	def message (self, buf, font, msg, data, _x_what_is_this):
> +		print(msg)
> +		return True
> +debugger = Debugger()
> +hb.buffer_set_message_func (buf, debugger.message, 1, 0)
>  hb.buffer_add_utf8 (buf, text.encode('utf-8'), 0, -1)
>  hb.buffer_guess_segment_properties (buf)

Yippee. At last, a debug interface :) (Behdad reminds me that I have been asking for this once per year for the last 4 years!) Thank you.

OK. Now to make a great debug interface! There are two ways of doing a debug interface: event driven and one shot. There are probably more, but those are the only two that come to mind now. One shot sends, in a single message, all the information needed to describe a debug point. This allows the debugger not to have to keep state, but just to record the results and pass them on. Event driven sends, well, events to the debugger and requires the debugger to keep state.

While one shot seems more inviting and is more in line with what Graphite does, I think for harfbuzz I would recommend an event based debugger, where you send debug events at the start and end of every lookup, at recursion, during initial reordering and shaping, at dotted circle insertion, etc., have an enum of events, and let the debugger work out what it wants to do with that information.

So, I would add an enum to the debug message to give a debug message event type.

One big question that always needs to be answered in the debugger is: where are we? Where in the buffer are we now processing? This is the idx field of the buffer. I don't think this is exposed in the public buffer interface. So it either needs to be exposed or passed as part of the debug message.

I suggest that rather than relying on a message to give the lookup number, the lookup number be passed as a separate parameter (or in a struct or whatever). The lookup number can be overloaded based on event type. So we could have a "starting high level phase" event type and use the lookup number to say whether that is initial shaping, GSUB, GPOS, etc. Or we could have different event types for each one. That's up to you. I think we need to send a message at each shaper pause, when the pause occurs.

For GPOS we need to be passing parameters like the two points in an attachment or the actual calculated offset in a pair or single adjustment. When doing class-based activities, we should be passing the class values involved, or perhaps pointers (or offsets) to the data structures involved, so that a debugger can cross-reference that back to source code.

What does that look like now:

debug_message(type, buf, idx, lkupidx, void *aptr, void *bptr, uint32 aoffset, uint32 boffset, msg, ...)

where aoffset and boffset are defined by type and lkupidx and may point to things like an attachment point record or a lookup record in a class based contextual lookup or somesuch. aptr and bptr may also point to debugger specific data structures (perhaps for an attachment point one needs a pointer to the AP record and 2 floats for the resolved x,y coordinates). Of course this could all end up in a structure of some kind.

You know, if we get this right, we should be able to drop the msg, ... since debuggers really don't want to have to parse textual messages. Yes, they are easy for a quick trace, but not for a real debugger. The message is welcome to stay to make such tracing programs' lives easier, but it shouldn't contain anything that isn't in the other parameters. If it does, then we need a way to pass that information outside the message.

And yes, while I'm trying to define what the kitchen sink is, I'm also trying to keep this lightweight. I know the moment I hit send, I'll think of things I've forgotten!

Yours,
Martin
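As a rough illustration of the event-driven style argued for above, the Python sketch below shows a debugger that receives typed events plus a buffer index and lookup number, and keeps its own state. Everything here is hypothetical: the `DebugEvent` names, the `message` signature and the sample calls are invented for the sketch and are not part of the HarfBuzz API or its Python bindings.

```python
from enum import Enum, auto

class DebugEvent(Enum):
    """Hypothetical event types, loosely following the list in the message."""
    LOOKUP_START = auto()
    LOOKUP_END = auto()
    SUBSTITUTION = auto()
    ATTACHMENT = auto()

class Debugger:
    """Event-driven debugger: it must keep state between events."""

    def __init__(self):
        self.lookup_stack = []  # lookups currently being applied (handles recursion)
        self.log = []           # (event, active lookups, buffer index) records

    def message(self, event, idx, lookup_index, msg=""):
        if event is DebugEvent.LOOKUP_START:
            self.lookup_stack.append(lookup_index)
        elif event is DebugEvent.LOOKUP_END:
            self.lookup_stack.pop()
        else:
            # The "where" comes from idx, not from parsing msg.
            self.log.append((event, tuple(self.lookup_stack), idx))
        return True  # tell the (imagined) shaper to carry on

dbg = Debugger()
dbg.message(DebugEvent.LOOKUP_START, 0, 12)
dbg.message(DebugEvent.SUBSTITUTION, 3, 12, "ligated 2 glyphs")
dbg.message(DebugEvent.LOOKUP_START, 3, 40)   # nested (recursed) lookup
dbg.message(DebugEvent.ATTACHMENT, 4, 40)
dbg.message(DebugEvent.LOOKUP_END, 4, 40)
dbg.message(DebugEvent.LOOKUP_END, 5, 12)
print(dbg.log)
```

The textual `msg` is accepted but never inspected, which matches the point that a real debugger should not need to parse it.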
[HarfBuzz] USE and zeroing marks
Dear Behdad,

I see from the code that the USE shaper doesn't zero marks. But the USE spec implies that they are zeroed:

  the width of the base character must be added back using the feature. This is necessary because OT processing cancels the width associated with a mark. It is necessary to cancel the width of a non-spacing mark because it is not clear where to apply the width of a non-spacing mark during OpenType processing.

I'm hoping that the spacing marks proposal will answer that final question. But it does imply that marks need to have their advances zeroed.

I don't quite follow how not zeroing marks works. If I attach an acute with an advance of 100 to an a with an advance of 200, I assume I end up with a total advance of 300? I think the spacing mark proposal helps sort out the overlap problem, which is really tricky to resolve otherwise, even for those shapers that don't zero their marks.

Yours,
Martin
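The advance arithmetic in the question can be written down directly. This tiny Python sketch, with a made-up function name, just sums advances with and without mark zeroing; it says nothing about what any particular shaper actually does.

```python
def total_advance(glyphs, zero_marks):
    """glyphs: list of (advance, is_mark). Marks contribute 0 when zeroed."""
    return sum(0 if (is_mark and zero_marks) else adv for adv, is_mark in glyphs)

cluster = [(200, False), (100, True)]  # 'a' followed by an attached acute
print(total_advance(cluster, zero_marks=False))  # 300
print(total_advance(cluster, zero_marks=True))   # 200
```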
[HarfBuzz] Spacing Mark Proposal
Dear All,

Here is an initial draft of a proposal to change OpenType to support spacing marks.

# Spacing Marks

This is a proposal to extend the OpenType standard to support spacing marks. Spacing marks are marks considered to have extent. When attached to a base or another mark, such marks cause the extent of the base to be adjusted to ensure that the combined cluster includes the extent of the mark in its attached position.

For example, if a mark is attached to the right of a base, the advance of the base is extended to include the extent of the mark, and the mark itself is given a zero advance. Likewise, if such a mark were attached to the left of a base, the origin of the cluster would be shifted back to include the origin of the mark, while the offset of the base from that origin would be adjusted by the same amount to keep the base in the same relative position.

## Changes

### LookupTable

The LookupTable is extended to support multiple flags in anticipation of extra flag needs:

LookupTable:

Type   | Name                    | Description
------ | ----------------------- | -----------
uint16 | LookupType              | Different enumerations for GSUB and GPOS
uint16 | LookupFlag              | Lookup qualifiers
uint16 | SubTableCount           | Number of subtables in this lookup
Offset | Subtable[SubTableCount] | Array of offsets to Subtables from beginning of Lookup table
uint16 | MarkFilteringSet        | Only present if UseMarkFilteringSet of LookupFlag is set
uint16 | ExtraFlag               | More lookup qualifiers, only present if ExtraFlags of LookupFlag is set

LookupFlag bit enumeration:

Mask   | Name                | Description
------ | ------------------- | -----------
0x0001 | RightToLeft         | Only used for cursive attachment
0x0002 | IgnoreBaseGlyphs    | If set, skips over base glyphs
0x0004 | IgnoreLigatures     | If set, skips over ligatures
0x0008 | IgnoreMarks         | If set, skips over all combining marks
0x0010 | UseMarkFilteringSet | If set, use mark filtering
0x0060 | Reserved            | Set to zero
0x0080 | ExtraFlags          | If set, the ExtraFlag field is included and used
0xFF00 | MarkAttachmentType  | If not zero, skips over all marks of attachment type different from specified

ExtraFlag bit enumeration:

Mask   | Name         | Description
------ | ------------ | -----------
0x0001 | SpacingMarks | When applied to a Mark Attachment GPOS lookup, specifies spacing mark attachment semantics
0x7FFE | Reserved     | Set to zero
0x8000 | Reserved     | Set to zero, reserved for future extra flag extensions

## Comments

The Reserved bits in the LookupFlag enumeration will probably be used for directionality filtering. The need here is to ensure that the flags are extensible into the future.

Where fields are unchanged from the original text, the original text description should be used.
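To show how a reader of the proposed table might consume the optional trailing fields, here is a Python sketch that parses a Lookup table header laid out as described in the proposal. It is an illustration of the draft layout only, not of any shipping OpenType version; the function name `parse_lookup` and the sample bytes are made up, and it assumes big-endian 16-bit fields and 16-bit subtable offsets.

```python
import struct

MARK_FILTERING_SET_BIT = 0x0010  # LookupFlag bit that adds the MarkFilteringSet field
EXTRA_FLAGS = 0x0080             # proposed LookupFlag bit that adds the ExtraFlag field
SPACING_MARKS = 0x0001           # proposed ExtraFlag bit

def parse_lookup(data):
    """Parse the proposed Lookup table header from big-endian bytes."""
    lookup_type, flag, count = struct.unpack_from(">HHH", data, 0)
    pos = 6
    offsets = struct.unpack_from(">%dH" % count, data, pos)
    pos += 2 * count
    mark_set = extra = None
    if flag & MARK_FILTERING_SET_BIT:
        (mark_set,) = struct.unpack_from(">H", data, pos)
        pos += 2
    if flag & EXTRA_FLAGS:
        # ExtraFlag follows MarkFilteringSet when both are present.
        (extra,) = struct.unpack_from(">H", data, pos)
    return {
        "type": lookup_type,
        "flag": flag,
        "subtable_offsets": list(offsets),
        "mark_filtering_set": mark_set,
        "spacing_marks": bool(extra is not None and extra & SPACING_MARKS),
    }

# GPOS type 4 (mark-to-base), ExtraFlags set, one subtable, SpacingMarks on.
sample = struct.pack(">HHHHH", 4, EXTRA_FLAGS, 1, 10, SPACING_MARKS)
print(parse_lookup(sample))
```

Because each optional field is gated by its own flag bit, a parser that does not know about ExtraFlags still reads the earlier fields at the same positions.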
[HarfBuzz] Move Lookup Proposal
Dear All,

This is a proposal to add a move lookup to GSUB:

# Move Lookup

This is a proposal to add a GSUB lookup to support glyph movement. The purpose of this lookup is to move a glyph relative to its current position in the glyph string. The lookup also supports swapping two glyphs.

Most OpenType implementations use a cluster model whereby glyphs that are attached or are reordered in relation to each other are in the same cluster. Therefore, if a glyph is moved across a cluster boundary, that cluster boundary should be removed and the clusters merged.

## Changes

Add a new GSUB lookup with lookup type of 9. There is only one format for this lookup type.

MoveLookupFormat1:

Type  | Name       | Description
----- | ---------- | -----------
uint8 | MoveFlags  | Flags governing the move
int8  | MoveOffset | Distance to move, may be negative

If the MoveOffset results in a position outside the glyph string, or the absolute value of MoveOffset is 0 or greater than 32, no action occurs and the lookup is ignored.

MoveFlags bit enumeration:

Mask | Name      | Description
---- | --------- | -----------
0x01 | MoveThis  | Moves the current glyph by the given offset
0x02 | MoveOther | Moves the glyph at the given offset to before the current glyph

Notice that if both bits are set, the moves are considered to happen in parallel. If executed in series then the offset may need to be adjusted by 1 to get the final positions correct.
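The semantics above can be tried out on a plain list. The Python sketch below is one reading of the draft, with several assumptions: MoveThis places the current glyph at index i + MoveOffset, MoveOther places the glyph from index i + MoveOffset immediately before the current glyph, and both bits together are treated as a swap (the "in parallel" case). The function name `apply_move` is made up and cluster merging is not modelled.

```python
MOVE_THIS, MOVE_OTHER = 0x01, 0x02

def apply_move(glyphs, i, flags, offset):
    """Apply the proposed move lookup at index i; returns a new list."""
    j = i + offset
    # Out-of-range target, zero offset or |offset| > 32: the lookup is ignored.
    if offset == 0 or abs(offset) > 32 or not 0 <= j < len(glyphs):
        return list(glyphs)
    out = list(glyphs)
    if flags & MOVE_THIS and flags & MOVE_OTHER:
        out[i], out[j] = out[j], out[i]        # parallel moves: a swap
    elif flags & MOVE_THIS:
        out.insert(j, out.pop(i))              # current glyph ends up at index j
    elif flags & MOVE_OTHER:
        other = out.pop(j)
        # If the other glyph came from before i, the current glyph shifted left.
        out.insert(i - 1 if j < i else i, other)
    return out

g = ["a", "b", "c", "d"]
print(apply_move(g, 1, MOVE_THIS, 2))               # ['a', 'c', 'd', 'b']
print(apply_move(g, 1, MOVE_OTHER, 2))              # ['a', 'd', 'b', 'c']
print(apply_move(g, 0, MOVE_THIS | MOVE_OTHER, 3))  # ['d', 'b', 'c', 'a']
print(apply_move(g, 3, MOVE_THIS, 2))               # unchanged: target is out of range
```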
Re: [HarfBuzz] Streamlining hb_font_t some more
Dear Simon,

> In InDesign, both (1) and (2) get 12 x 1.2 = 14.4pt interline space.
> This means that the descender of the "p" in "ipsum" will bump into
> letters on the next line. That's clearly wrong.
>
> In the CSS model, both (1) and (2) get half leading added to the top and
> the bottom of the first line. So there is a gap between the first and
> second lines of (1) even though there is no large descender using that
> gap. That's not *wrong* but I don't like it. Imagine a large drop-cap
> "T" at the start of a paragraph - the line spacing after it becomes
> inconsistent.

Another approach is to say that the ascent and descent of a line are max(ascent, for all font ascents) and max(descent, for all font descents) on the line.

> I am not a typographer, I just play one on the Internet, so I am not
> sure what someone who was actually typesetting a book would do in that
> situation. My guess would be that they would, basically, do what SILE
> does right now (and what TeX does; perhaps Knuth knew what he was doing
> after all) - use consistent 14.4pt (or whatever) line spacing in
> situation (1) and use larger line spacing which fits in the descender in
> situation (2). But I would have to ask a real typesetter to know.

I like this, but as stated by Werner, if you think the OS/2 values don't make sense, then you could use the font bounding box and add 1pt interline gap or some such, as a fallback.

GB,
Martin
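The max-ascent / max-descent idea can be sketched as follows. This is only an illustration: the function names are made up, the metrics are assumed to be positive values in points, and the `gap` parameter stands in for the "1pt interline gap or some such" fallback.

```python
def line_extent(fonts_on_line):
    """Return (ascent, descent) for a line.

    `fonts_on_line` is a list of (ascent, descent) pairs, one per font used
    on the line, both given as positive distances from the baseline.
    """
    ascent = max(a for a, _ in fonts_on_line)
    descent = max(d for _, d in fonts_on_line)
    return ascent, descent


def baseline_distance(prev_line, next_line, gap=0.0):
    """Distance between two baselines: descent above + gap + ascent below."""
    _, prev_descent = line_extent(prev_line)
    next_ascent, _ = line_extent(next_line)
    return prev_descent + gap + next_ascent
```

With this model a line containing a large drop cap simply reports a larger ascent, and the spacing to the neighbouring lines follows from that.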
Re: [HarfBuzz] Kerning, glyph width, and x advance
Dear Simon,

> So I'm using Harfbuzz to shape stuff and put it out to PDF. When you
> output a string in PDF, you are expected to kern it manually, or else
> each glyph will be placed one after the other with no kerning:
>
> No kerning: Td (VAVAVOOM) Tj
> Kerning: Td[(V) 153 (A) 122 (V) ... ]TJ
>
> The numeric values in the TJ array are the distances to go back after
> the glyph is painted. In other words, this is the unkerned X advance
> minus the kerned X advance.
>
> I had assumed that the unkerned glyph advance was the width of the
> character, but that's not the case:

No. There is an advance array in the PDF as part of the font definition. It comes from the hmtx table, with each glyph having its own advance width, which is independent of the bounding box of the glyph. So you'll need to calculate the distance between adjacent glyphs and subtract the advance width of the first glyph to get your number. But thankfully you won't need to run shaping twice.

Yours,
Martin
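As a rough sketch of the arithmetic, assuming you already have the hmtx advance and the shaped advance for each glyph in font units: the TJ number is the unkerned advance minus the shaped advance, scaled to thousandths of the em. The function name and the sample numbers are invented for illustration, and the sign follows Simon's "distance to go back" description. String escaping is not handled.

```python
def tj_array(text, hmtx_advances, shaped_advances, units_per_em):
    """Build a PDF TJ operand from per-glyph advances given in font units."""
    parts = []
    for ch, unkerned, shaped in zip(text, hmtx_advances, shaped_advances):
        parts.append("(%s)" % ch)
        # TJ adjustments are in thousandths of text space; positive moves back.
        adjust = round((unkerned - shaped) * 1000 / units_per_em)
        if adjust:
            parts.append(str(adjust))
    return "[" + " ".join(parts) + "] TJ"
```

Only one shaping run is needed: the shaped advances come from the shaper and the unkerned ones straight from the font's hmtx table.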
[HarfBuzz] hb-graphite2 patch progress chase
Dear Behdad,

Any progress on accepting the core of the patch I sent you? I notice that it fixes a bug in SILE, for example.

Yours,
Martin
Re: [HarfBuzz] Loading Graphite dynamically
Dear Behdad,

> 1. Can distro people please chime in with their preferences?

Debian, Ubuntu and Fedora and derivatives (AFAIK) make their harfbuzz packages dependent on the libgraphite package. Thus they all enable Graphite at the system level. libgraphite is too small not to ship. The only people who get concerned about this are those who statically link harfbuzz into a framework: qt, gecko, chromium, gtk?. Where such frameworks dynamically link to a system harfbuzz, there is less of an issue.

From what you say, chromium won't dlopen a library, although I got it to work just fine in that mode in an android app (content shell). So perhaps this is a difference between the app and the view?

> 2. What are the security implications of this?

The same as for any dlopen. Notice that only the system library load path is used, so if someone nefarious can write to that area, they may be able to use that as a vector, but that would be just as true for any other dependent library, whether loaded at startup or during preload. I realise this is a bit of an off-the-cuff answer, and I would love to hear from a security expert on this.

Yours,
Martin
Re: [HarfBuzz] Loading Graphite dynamically [2]
Dear Behdad,

> 3. Why stop at Graphite? Why not use this for ICU, FreeType, glib, gobject, fontconfig, as well as others? That way I can have one libharfbuzz.so with all the bits and pieces without pulling in 100MB worth of libraries; and we can fold libharfbuzz-icu.so back into libharfbuzz.so...

I think the difference lies in the needs of the framework layers that call harfbuzz. Harfbuzz is rarely called directly by applications, but by intermediate layers like qt, gtk, chromium (the browser is the new OS), etc. And those intermediate layers are currently having to make support choices for their applications regarding which backends to support.

There are two kinds of library decisions that integrators of harfbuzz need to make. The first is the choice of libraries necessary to have harfbuzz work in the direct calling environment. These include things like unicode support and freetype/font querying support; in effect, everything that harfbuzz calls back to the application for. The second is which backends harfbuzz supports.

For the most part the intermediate layers don't care which backends harfbuzz supports. Clearly they want OT support, otherwise they wouldn't be using harfbuzz, but others are of less value to them. They are caught between pressure to support applications' needs for backends and the desire to keep the intermediate layer itself as thin as possible. So they would prefer not to ship with a niche backend like graphite, forcing all applications to support it. But they would like to allow applications to support graphite if they so desire. The same would apply to, say, someone producing a cross platform AAT backend.

Currently, we have two types of backends: all platform and single platform. uniscribe and coretext are single platform, and therefore their support by an intermediate layer is chosen based on what platform the layer is being built for (along with that support being near zero cost, since the supporting libraries are part of the platform).

Of the cross platform backends, there are 3: OT (required), FALLBACK (required), GRAPHITE2 (not required, but desired). Therefore Graphite2 is currently the only backend that would benefit from being dynamically loaded, allowing the intermediate layer to leave this backend support question to the application.

Sorry, I forgot to expand my first point in that paragraph. The first class of libraries, the ones necessary to have harfbuzz work in the direct calling environment, gain nothing from being dynamically loaded (unlike graphite) because their choice is a static one based on their direct call environment. The calling environment does not want to change those library choices on the fly or in response to application choice. The choice is made once by the programmer of the intermediate layer, at the time they integrate harfbuzz.

HTH,
Yours,
Martin
Re: [HarfBuzz] Loading Graphite dynamically
Dear Jonathan,

> AFAICS, the patch will leak the graphite2_funcs_t record that's attached to the face, as it fails to free it in _hb_graphite2_shaper_face_data_destroy.

Thanks for flagging that. I've restructured grfuncs to be part of the shaper_data so there is no m/calloc to free now. It has the same lifetime and handling as the overall shaper_data.

> (It also fails to free it if _hb_graphite2_shaper_face_data_create hits an error in gr_make_face, or if hb_graphite2_load_gr fails to find one of the expected functions in the library.)

A second patch fixes the dlhandle leak if gr_make_face fails.

> I wonder if it'd be better to ALWAYS do the dynamic-load thing, and scrap the HAVE_GRAPHITE2_STATIC option? This would substantially clean up the #if-clutter that currently makes things look a bit hairy, and probably make it easier to verify that the code paths are all sane.

I'm undecided on that. Users may still want the option of direct linking?

> As for whether to do this in general -- I think that if we can ensure the code is clean enough that it won't introduce new leaks (see above) or vulnerabilities, it'd provide a crucial feature that's currently lacking for most client apps. In Gecko, we don't need this as we have a separate Graphite codepath that's independent of harfbuzz (though we could consider changing that some day), but for software that uses a harfbuzz rendering path exclusively, this could offer a valuable added capability.

Yours,
Martin
Re: [HarfBuzz] Loading Graphite dynamically
Dear Behdad,

> For the curious, this is the PR: https://github.com/behdad/harfbuzz/pull/107 This basically uses dlopen to open libgraphite2.
>
> I need to understand better what you are trying to address with this. "A number of people have asked" does not quite justify it. Why do people care?
>
> Years ago I thought about doing this, specially since with all the integrations (FreeType, glib, gobject, icu, graphite2, more to come later), dynamic loading sounds very attractive. But decided not to pursue that as it adds unnecessary complexity. We just removed support for dynamic modules from Pango and I'm so glad we did... I don't want to add those back to HarfBuzz unless we absolutely have to.
>
> So, what are you trying to fix?

In case my previous answer wasn't as helpful as it might be: I think this is a different case from the idea of dynamic modules. Dynamically loading the graphite library will lessen harfbuzz's relation to Graphite. It means that harfbuzz can support graphite without having to know anything about it. If the library happens to be on the system, it'll get loaded; otherwise it won't. So harfbuzz doesn't need to manage anything. It's not a module in the sense that harfbuzz has to manage it, locate it, handle registration, etc. The filename and lack of path are hardwired to improve security (graphite must be a true library in that respect), and an application can always dlopen the graphite library itself if it wants to load it from somewhere else.

If this patch is accepted and enabled on, say, android, then there is no need for android to consider including graphite on the system. Instead applications can include the graphite library if they want, and only if they include the graphite library will it get used. This is far preferable for such applications to having to bundle a 50MB browser with every app. Remember that for minorities that might mean a keyboard app, an sms app, a document reader, etc.

In the case of chromium.org, the patch allows them to ship chromium without graphite and lets the application include the library or not.

The difficulty we face at the moment is that the support of graphite in an application is an all or nothing choice. Either the complete stack has to be built with graphite in, or it is locked out. If graphite is not supported by the stack, then any application that wants to support graphite has to copy the entire stack with the slight difference of enabling graphite. As the stack grows, this burden only increases and becomes more ridiculous. This patch breaks that deadlock and allows the decision of whether graphite is supported to be made by the application rather than the framework on which the application is based.

Yours,
Martin
[HarfBuzz] Loading Graphite dynamically
Dear All,

A number of people have asked me for a mechanism by which graphite may be dynamically loaded only when a Graphite font is used. I've struggled with the notion of this, but I think I understand it now. I hope that this can help everyone to have what they want for minimal cost.

I've submitted a pull request on github for a patch that does the above. This patch adds dynamic loading of graphite support for graphite fonts in harfbuzz. The three way configure option is now:

--with-graphite2=no means no graphite support.
--with-graphite2=yes means to build and link against an existing graphite library.
--with-graphite2=auto means to build independently of any graphite library but to attempt to dynamically load graphite when a graphite font is encountered.

This patch has been built and tested on linux only at the moment.

Yours,
Martin
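The "auto" behaviour can be pictured with a small sketch. This is not the patch itself (which is C++ inside hb-graphite2.cc); it is a Python illustration of the same idea using ctypes. The class name and the library file name `libgraphite2.so.3` are assumptions made for the example.

```python
import ctypes


def load_optional_library(name):
    """Try to load a shared library by bare file name; return None if it is absent."""
    try:
        # No directory component: only the system library search path is used.
        return ctypes.CDLL(name)
    except OSError:
        return None


class LazyBackend:
    """Loads the library the first time it is needed, and only once."""

    def __init__(self, library_name):
        self.library_name = library_name
        self._tried = False
        self._handle = None

    def handle(self):
        # Called only when a font that needs this backend is encountered.
        if not self._tried:
            self._tried = True
            self._handle = load_optional_library(self.library_name)
        return self._handle


# Assumed file name, for illustration only.
graphite = LazyBackend("libgraphite2.so.3")
```

If `graphite.handle()` returns None, the caller simply falls through to the next shaper, so nothing is paid by those who never install the library.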
Re: [HarfBuzz] Loading Graphite dynamically
Dear Behdad,

> For the curious, this is the PR: https://github.com/behdad/harfbuzz/pull/107 This basically uses dlopen to open libgraphite2.
>
> I need to understand better what you are trying to address with this. "A number of people have asked" does not quite justify it. Why do people care?
>
> Years ago I thought about doing this, specially since with all the integrations (FreeType, glib, gobject, icu, graphite2, more to come later), dynamic loading sounds very attractive. But decided not to pursue that as it adds unnecessary complexity. We just removed support for dynamic modules from Pango and I'm so glad we did... I don't want to add those back to HarfBuzz unless we absolutely have to.
>
> So, what are you trying to fix?

This aims to fix the "we don't care about Graphite. We aren't against it. Surely this should be integrated at some other level than ours?" response that I get when I try to get Graphite integrated in such a way that I can create applications that use Graphite. For example, chromium.org.

This approach allows everyone to ignore the problem and leaves applications or system integrators to decide whether to support Graphite or not. By dynamically loading only when a Graphite font is encountered (and if a Graphite library is available), there is no extra cost to supporting Graphite for those not interested, and it allows those interested to integrate Graphite if they want.

It actually goes in the opposite direction from modules, since it says that Graphite doesn't have to ship with harfbuzz. In addition, the Graphite library is not tied to harfbuzz, like a true module would be. There are no modules to be 'managed' by harfbuzz.

In effect, this allows harfbuzz to support Graphite while not requiring harfbuzz users to ship the Graphite library unless they want to.

HTH,
Yours,
Martin
Re: [HarfBuzz] Choosing dev2 vs deva OT script for shaping
Dear Behdad,

> I can extend the BCP 47 extension to also choose the script system if it's available. Eg, a language setting of x-hbotdeva will choose deva whereas x-hbotdev2 will choose dev2. This works for script tags that have four letters (ie, not 'lao ', 'yi ', 'nko ', and 'vai '). We would only recognize the three-letter ones as language system tag. This will be useful for choosing 'math' script as well.

+1

Or you could use x-hbscdev2 (as in script) to separate the namespaces.

Yours,
Martin
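A hypothetical sketch of how a client could pull a four-letter script tag out of such a private-use language tag. The function name is invented, and both the `hbsc` and `hbot` prefixes are treated as proposals from this thread rather than as behaviour HarfBuzz is known to implement.

```python
def script_tag_from_language(language, prefix="hbsc"):
    """Return the 4-letter OT script tag from a BCP 47 private-use subtag, or None."""
    subtags = language.lower().split("-")
    if "x" not in subtags:
        return None  # no private-use section at all
    for subtag in subtags[subtags.index("x") + 1:]:
        # Only exactly four letters after the prefix count as a script tag.
        if subtag.startswith(prefix) and len(subtag) == len(prefix) + 4:
            return subtag[len(prefix):]
    return None
```

Using a separate prefix for scripts means three-letter values under `hbot` can stay unambiguous as language system tags.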
[HarfBuzz] towards an OT debugger
Dear Behdad,

Thanks for your help in getting the debug output from hb. I've been playing and have some thoughts about how to use the existing debug framework to help with font development.

1. It would be great if the HB_DEBUG variable controlled which categories of debug are output, rather than the depth. For example, I don't need the SANITIZE debug (or pretty much anything other than APPLY).

2. It would be good if we could get the lookup index, somehow, into the debug report. I don't think it's stored in the lookup. But I may be wrong.

3. Is there a way to print out the whole buffer as per, perhaps, the output from hb-shape? And can we call that when a lookup starts?

These would be the basics. I'm sort of thinking about creeping up on the problem rather than building some massive implementation layer either to the side or inside. If we can get the debug system to print at least the info we need, we can have it call a side library to do the debug reporting later on.

TIA,
Yours,
Martin
[HarfBuzz] is opentype crazy
Dear All,

It seems that if I want to insert a glyph (say a dotted circle) between two diacritics, then I can. I can sub diac1' diac2 by diac1 dottedcircle or something akin to that. But when I do, because diac1 is a mark and not a ligature, harfbuzz resets the advance of my dottedcircle to 0, even though dottedcircle is listed in GDEF as being a base. Is this a bug, he asks hopefully?

GB,
Martin
[HarfBuzz] using HB_DEBUG
Dear Behdad,

How does one use HB_DEBUG? Are there magic values? Is there a configure option? How do you typically turn on HB_DEBUG?

TIA,
Yours,
Martin
Re: [HarfBuzz] Unsure how to use HB output to combine characters
Dear Simon,

> When I try to shape the diglyph སྐ (TIBETAN LETTER SA U+0F66, TIBETAN SUBJOINED LETTER KA U+0F90) in the Kokonor font I get back two glyph values, 118 and 160:
>
> 1: (སྐ)
> 1: U+0F66,U+0F90
> 1: [118=0+1539|160=0+0]

Is it that the Kokonor font has the diacritics as overstriking, i.e. with a negative x-min (and probably x-max too)? This would account for all the advance being on the base character and none on the diacritic.

> The glyph_pos structures (using a scaled font via hb_ft_font_create) for each glyph look like this:
>
> { x_advance = 375, y_advance = 0, x_offset = 0, y_offset = 0 }
> { x_advance = 0, y_advance = 0, x_offset = 0, y_offset = 0 }
>
> That all seems fine, I think. (I'm confused why I should be advancing after the first glyph and then not after the combining character, but I don't think that's actually the problem here.)
>
> Next I output that glyph string in my PDF document, where it looks like Td[<007600a0>]TJ. (I have been spending too much time reading PDF documents in a text editor this week.)
>
> What I see in my output is two separate glyphs next to each other, ས (TIBETAN LETTER SA U+0F66) and ྐ (TIBETAN SUBJOINED LETTER KA U+0F90) with the "hello I am a combining character" dotted circle around it. Shouldn't the combination be its own glyph in the font? How do I say in PDF-speak "combine these two glyphs together", or should the font be doing that for me?

That is very surprising, since the PDF viewer should not be doing any shaping. What is the world coming to if we can't trust PDF viewers to put our glyphs where we tell them! Perhaps if you munge the post table to not let on to PDF what characters the glyphs correspond to?

> What am I doing wrong?

Allowing the PDF viewer to do shaping?

GB,
Martin
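For what it's worth, here is a small sketch of how a client typically turns a list of glyph positions into drawing origins: the pen advances by each glyph's advance, and the offset only shifts where that one glyph is drawn. The function name and the dict layout are assumptions for the example; the numbers are the two glyph_pos records quoted above.

```python
def place_glyphs(positions, origin=(0, 0)):
    """Return the (x, y) drawing origin of each glyph from advances and offsets."""
    pen_x, pen_y = origin
    placed = []
    for pos in positions:
        # Offsets move this glyph only; they do not move the pen.
        placed.append((pen_x + pos["x_offset"], pen_y + pos["y_offset"]))
        pen_x += pos["x_advance"]
        pen_y += pos["y_advance"]
    return placed


positions = [
    {"x_advance": 375, "y_advance": 0, "x_offset": 0, "y_offset": 0},
    {"x_advance": 0, "y_advance": 0, "x_offset": 0, "y_offset": 0},
]
origins = place_glyphs(positions)
```

Here the second glyph's origin lands at x = 375, after the base. That is consistent with an overstriking design, where the mark's outline sits to the left of its own origin.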
Re: [HarfBuzz] Thai text render issue in Android L
Dear Roozbeh,

> My mistake. This is intentional. Basically, the grapheme cluster would go too deep for Android, so Noto Sans Thai UI pushes the SARA UU to the left so it can show something instead of making SARA UU disappear.

Moving the sara uu to the side is certainly a novel solution to the problem. I think I would go so far as to say that it was a unique solution to the problem. Do the Noto fonts want to be producing unique styling not found in any other print in the language?

Yours,
Martin
Re: [HarfBuzz] feature execution order
Dear Bob,

> I'm copy/pasting some stuff I wrote a while back in another thread. In general, look for add_gsub_pause in harfbuzz source code and you get to the place you need.

Which, from reading the code for arabic, results in the following feature order. Merged features are listed on the same line. Each line is a run of the lookups for the given features:

ccmp, locl
isol
fina
fin2
fin3
medi
med2
init
rlig
calt
mset

HTH,
Yours,
Martin
Re: [HarfBuzz] feature execution order
Dear Bob,

Further to the script specific list. The general features list plan is as follows; again, all features on the same line are merged into one pass, except the script specific features, which follow the plan laid down for that script.

rtla, rtlm, frac, numr, dnom
script specific features
ccmp, locl, mark, mkmk, rlig, calt, clig, curs, kern, liga, rclt, user features

Note that if you are in an LTR script, you replace rtla, rtlm with ltra, ltrm. Also, if you are in vertical text, then replace calt, clig, curs, kern, liga, rclt with vert. In addition, a specific script can override this plan, but even if it does, all user features end up at the end. Arabic doesn't mess with the plan at all.

Since this overall plan is applied to both GSUB and GPOS, it means that you can use GSUB features in GPOS and vice versa. But bear in mind, if you do that, that you are constrained to the type of lookup you can use according to which table (GSUB/GPOS) you are in.

BTW, only passes for which the table has lookups for the features of that pass will actually happen. No lookups, no pass.

HTH,
Yours,
Martin
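The plan above can be modelled as a list of passes, each pass being a list of feature tags. The sketch below is only a model of this description, not HarfBuzz's actual data structures; the function names are invented, and `font_features` stands for the set of features that have lookups in the table being processed (GSUB or GPOS).

```python
HORIZONTAL_ONLY = ["calt", "clig", "curs", "kern", "liga", "rclt"]

# Arabic script specific passes, as listed earlier in the thread.
ARABIC_PASSES = [["ccmp", "locl"], ["isol"], ["fina"], ["fin2"], ["fin3"],
                 ["medi"], ["med2"], ["init"], ["rlig"], ["calt"], ["mset"]]


def general_passes(script_passes, direction="rtl", vertical=False, user_features=()):
    """Assemble the overall plan: direction pass, script passes, common pass."""
    first = ["rtla", "rtlm"] if direction == "rtl" else ["ltra", "ltrm"]
    first += ["frac", "numr", "dnom"]
    last = ["ccmp", "locl", "mark", "mkmk", "rlig"]
    last += ["vert"] if vertical else HORIZONTAL_ONLY
    last += list(user_features)  # user features always end up at the end
    return [first] + [list(p) for p in script_passes] + [last]


def passes_for_font(passes, font_features):
    """Drop features the table lacks; a pass with nothing left does not happen."""
    kept = []
    for features in passes:
        present = [f for f in features if f in font_features]
        if present:
            kept.append(present)
    return kept
```

For a table that only has lookups for ccmp, init and kern, `passes_for_font(general_passes(ARABIC_PASSES), {"ccmp", "init", "kern"})` leaves three passes: ccmp, then init, then ccmp with kern.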
Re: [HarfBuzz] Inter-glyph spacing
Dear Simon,

>     if (FT_Set_Char_Size(uds->ft_face, f->pointSize * 64.0, 0, 0, 0))
>         return 0;
>
> What also confuses me is that the result is very font-specific. SIL fonts are squashed. Times and Optima render perfectly: Pango and Harfbuzz equivalent. Adobe Garamond Pro and Caslon Pro are horrible, with some very strange inter-glyph spacing; in particular there is too much space after every letter a, and too little after an s.

I think I know what the problem is. I'm not sure what the best solution is. The SIL fonts have horizontal device metrics, and you are basically getting back hinted positioning, due to it thinking you are working on a 1dpi screen or somesuch. This gives you hinted positioning that looks really funny in a pdf.

The key code inside hb-ft.cc is:

    hb_font_set_scale (font,
        (int) (((uint64_t) ft_face->size->metrics.x_scale * (uint64_t) ft_face->units_per_EM + (1<<15)) >> 16),
        (int) (((uint64_t) ft_face->size->metrics.y_scale * (uint64_t) ft_face->units_per_EM + (1<<15)) >> 16));

since it is the scale that is passed to graphite make_font as the ppem value. Off the top of my head, I'm not sure how you could set an appropriate metrics.x_scale to get the rendering you want at an appropriate ppem that gets scaled back to points for you.

I'm wondering if using the font scale rather than the font ppem is the right answer in the graphite integration code. Anyone got any thoughts?

Yours,
Martin
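To see what number that expression produces, here is the same rounding arithmetic in Python. The helper `ft_scale_for` is an assumption: it approximates a FreeType-style 16.16 scale factor (font units to 26.6 pixels) with floor division, which is not necessarily how FreeType rounds. The values 12 ppem and 2048 units per em are sample inputs only.

```python
def ft_scale_for(ppem, units_per_em):
    """Approximate 16.16 fixed-point factor mapping font units to 26.6 pixels."""
    return ((ppem * 64) << 16) // units_per_em


def hb_scale_from_ft(ft_scale_16_16, units_per_em):
    """Mirror of the hb-ft.cc expression: multiply, add half, shift right by 16."""
    return (ft_scale_16_16 * units_per_em + (1 << 15)) >> 16


x_scale = ft_scale_for(12, 2048)
hb_scale = hb_scale_from_ft(x_scale, 2048)
```

With these inputs the result is 768, i.e. 12 * 64: the value is the em size in 26.6 fixed point rather than a pixels-per-em figure, which is the distinction the question above turns on.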
Re: [HarfBuzz] Inter-glyph spacing
Dear Simon,

> What also confuses me is that the result is very font-specific. SIL fonts are squashed. Times and Optima render perfectly: Pango and Harfbuzz equivalent. Adobe Garamond Pro and Caslon Pro are horrible, with some very strange inter-glyph spacing; in particular there is too much space after every letter a, and too little after an s.

What version of harfbuzz-ng? Did you compile it --with-graphite2? Perhaps this is a graphite integration bug?

Yours,
Martin
Re: [HarfBuzz] behavior of mark-width-zeroing
Dear Jonathan,

> Our current mark-zeroing code, in zero_mark_widths_by_gdef() and zero_mark_widths_by_unicode(), modifies only the advance of the glyphs, so that they no longer take up any space on the line. I'm wondering whether we should also adjust the offset, by subtracting the advance from it before we zero the advance. (Though perhaps only if there's no GPOS positioning?)

How would that affect anchor positioning? Would the offset be ignored or would one need to subtract the advance from all the anchors too?

Yours,
Martin
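The two behaviours being compared can be written down in a few lines. This is a sketch of the idea only, using a plain dict for the position record; it is not the code in zero_mark_widths_by_gdef(), and the function name and `adjust_offset` switch are invented for the example.

```python
def zero_mark_width(pos, adjust_offset=False):
    """Return a copy of a horizontal position record with its advance zeroed.

    With adjust_offset=True the advance is first subtracted from the offset,
    which is the variant being proposed in the quoted text.
    """
    new = dict(pos)
    if adjust_offset:
        new["x_offset"] -= new["x_advance"]
    new["x_advance"] = 0
    return new
```

For a mark with an advance of 300 and no offset, the current behaviour leaves it drawn at the pen position, whereas the adjusted variant draws it 300 units to the left, back over the preceding glyph. An anchor-based offset applied later would be added on top of whichever value this leaves behind, which is the reason for the question.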
[HarfBuzz] bug in graphite integration
Dear Behdad,

I came across this bug in the graphite integration code. The cluster component of the info structure needs to point back into the original string. Currently the code is returning a character offset that is zero based. The fix is to use the cluster attributes passed in. I don't think we can get away without storing the cluster again. I enclose a patch.

This fix could well clean up some of the instabilities in using graphite fonts in hbng applications I've been seeing and wondering about. I think it's a pretty key bug fix for people like Debian.

Yours,
Martin

diff --git a/src/hb-graphite2.cc b/src/hb-graphite2.cc
index a309ca8..709aa94 100644
--- a/src/hb-graphite2.cc
+++ b/src/hb-graphite2.cc
@@ -209,6 +209,7 @@ struct hb_graphite2_cluster_t {
   unsigned int num_chars;
   unsigned int base_glyph;
   unsigned int num_glyphs;
+  unsigned int cluster;
 };
 
 hb_bool_t
@@ -299,6 +300,7 @@ _hb_graphite2_shape (hb_shape_plan_t *shape_plan,
   memset (clusters, 0, sizeof (clusters[0]) * buffer->len);
 
   hb_codepoint_t *pg = gids;
+  clusters[0].cluster = buffer->info[0].cluster;
   for (is = gr_seg_first_slot (seg), ic = 0; is; is = gr_slot_next_in_segment (is), ic++)
   {
     unsigned int before = gr_slot_before (is);
@@ -316,6 +318,7 @@ _hb_graphite2_shape (hb_shape_plan_t *shape_plan,
     {
       hb_graphite2_cluster_t *c = clusters + ci + 1;
       c->base_char = clusters[ci].base_char + clusters[ci].num_chars;
+      c->cluster = buffer->info[c->base_char].cluster;
       c->num_chars = before - c->base_char;
       c->base_glyph = ic;
       c->num_glyphs = 0;
@@ -335,7 +338,7 @@ _hb_graphite2_shape (hb_shape_plan_t *shape_plan,
     {
       hb_glyph_info_t *info = &buffer->info[clusters[i].base_glyph + j];
       info->codepoint = gids[clusters[i].base_glyph + j];
-      info->cluster = gr_cinfo_base(gr_seg_cinfo(seg, clusters[i].base_char));
+      info->cluster = clusters[i].cluster;
     }
   }
   buffer->len = glyph_count;
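The difference between the old and new behaviour is easiest to see with a buffer whose cluster values do not start at zero. The sketch below is a toy model, not the patch: `input_clusters` stands for the cluster values the caller put into the buffer, and `base_chars` for the zero-based index of the first character of each output cluster. Both names are illustrative.

```python
def cluster_values(input_clusters, base_chars):
    """Map each output cluster's first character index back to the caller's cluster value."""
    return [input_clusters[base] for base in base_chars]


# The caller is shaping a run that starts at offset 10 of a longer string.
input_clusters = [10, 11, 12, 13]
base_chars = [0, 2, 3]          # zero-based character indices within the run

fixed = cluster_values(input_clusters, base_chars)
buggy = list(base_chars)        # what a zero-based character offset would report
```

With the fix the client gets 10, 12, 13 and can index back into its own text; with the zero-based offsets it would get 0, 2, 3, which only happen to be right when the run starts at the beginning of the string.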
[HarfBuzz] Fixing Khmer U+17DD
Dear Behdad,

Khmer has a rare character, U+17DD, used by linguists and minorities, and it isn't shaping correctly: a dotted circle is inserted if it occurs before U+17C8.

The fix is to change the test for vowel to include U+17DD. In hb-ot-shape-complex-indic.cc, near the start of set_indic_properties, there is a test of hb_in_range(u, 0x17CB, 0x17D3); this would need an extra || u == 0x17DD.

TIA,
Yours,
Martin
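Expressed as a stand-alone predicate, the suggested condition looks like this. It is a Python restatement for illustration only; the function name is invented and the real test lives in C++ inside set_indic_properties.

```python
def matches_khmer_sign_test(u, include_17dd=True):
    """Range test from the message, optionally with the extra U+17DD check."""
    in_range = 0x17CB <= u <= 0x17D3
    return in_range or (include_17dd and u == 0x17DD)
```

With `include_17dd=False` the function reproduces the existing test, so the two can be compared side by side for U+17DD.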
[HarfBuzz] Adding debug support to hb
Dear Behdad,

As we approach a v1.0 I would like to encourage us to add debug support for font developers. Looking at the rather nice hb_auto_trace, it would really help if the trace routine could be passed the buffer as well as the lookup being traced. That way a font debugger could show the glyph string state before and after a lookup runs, the position in the glyph string where the lookup is executed and which lookup executed.

Yours,
Martin
Re: [HarfBuzz] Thai below-base normalization
Dear All, It's not clear to me, then, why uniscribe treats this in the way it does. (Perhaps there was no good reason, and it was merely an arbitrary choice of ordering in the absence of any clear requirement?) In my data, I have no examples of a U+0E3A occurring after/below U+0E38/9. So I agree with this patch. Note that U+0E3A does occur following upper vowels (U+0E34-7). Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] Question regarding the use of HB_SCRIPT_KATAKANA for regular Japanese
Dear All, https://github.com/arielm/Unicode/blob/master/Projects/ScriptDetector This is awesome! Thank you. As I work with minority languages, automatic language detectors make me shudder and cry. Please do not assume that because something is in, say Thai script, that it is in Thai language. This is true for nearly every script there is. Yours, Martin behdad Feedback is welcome, Ariel P.S. the next step is to mix script/lang items with BIDI items (the Mapnik project should be very helpful here...) On Mon, Dec 23, 2013 at 4:46 AM, Behdad Esfahbod beh...@behdad.org mailto:beh...@behdad.org wrote: On 13-12-22 08:51 PM, Ariel Malka wrote: Thanks Behdad, the info on how it works in Pango is indeed super useful. An attempt to recap using my original Japanese example: ユニコードは、すべての文字に固有の番号を付与します ICU's scrptrun is detecting Katakana, Hiragana and Han scripts. Case 1: no input list of languages is provided. a) For Katakana and Hiragana items, ja will be selected, with the help of http://goo.gl/mpD9Fg In turn, MTLmr3m.ttf (default for ja in my system) will be used. So far so good. b) For Han items, no language will be selected because of http://goo.gl/xusqwn At this stage, we still need to pick a font, so I guess we choose DroidSansFallback.ttf (default for Han in my system), unless... Some additional strategy could be used, like: observing the surrounding items? Yes. All itemization issues can use surrounding context when in doubt... It's just about managing complexity... Case 2: we use ja (say, collected from the locale) as input language For all the items, ja will be selected because the 3 scripts are valid for writing this language, as defined in http://goo.gl/hwQri5 By the way, I wonder why Korean is not including Han (see http://goo.gl/bI5BLj), in contradiction to the explanations in http://goo.gl/xusqwn? Great point. The way the script-per-language was put together is using fontconfig's orth files, which basically only list Hangul characters for Korean. 
It definitely can be improved upon and I'm willing to hear from Roozbeh and others whether we have better data somewhere. behdad On Mon, Dec 23, 2013 at 1:35 AM, Behdad Esfahbod beh...@behdad.org wrote: On 13-12-22 06:17 PM, Ariel Malka wrote: As it happens, those three scripts are all considered simple, so the shaping logic in HarfBuzz is the same for all three. Good to know. For the record, there's a function for checking if a script is complex in the recent Harfbuzz-flavored Android OS: http://goo.gl/KL1KUi Please NEVER use something like that. It's broken by design. It exists in Android for legacy reasons, and will eventually be removed. Where it does make a difference is if the font has ligatures, kerning, etc for those. OpenType organizes those features by script, and if you request the wrong script you will miss out on the features. Makes sense to me for Hebrew, Arabic, Thai, etc., but I was a bit surprised to find out that LATN was also a complex script. LATN uses the generic shaper, so it's not complex, no. So for instance, if I would shape some text containing Hebrew and English solely using the HEBR script, I would probably lose kerning and ffi-like ligatures for the English part Correct. (this is what I'm actually doing currently in my simple BIDI implementation...) Then fix it. BIDI and script itemization are two separate issues. How you do font selection and what script you pass to HarfBuzz are two completely separate issues. Font fallback stack should be per-language. I understand that the best scenario will always be to take decisions based on language rather than solely on script, but it creates a problem: Say you work on an API for Unicode text rendering: you can't promise your users a solution where they would use arbitrary text without providing language-context per span. These are very good questions. And we have answers to all.
Unfortunately there's no single location with
Re: [HarfBuzz] Change in HarfBuzz after version 0.90 ?
Dear Ed, When shaping text, the correct ordering for your typical consonant-vowel-consonant syllable is: BASE_CONSONANT + (i)VOWEL_MARK + (ii) TONE MARK +(iv) U+1A60 SAKOT + (iii) SUBJOINED CONSONANT With Harfbuzz 0.90, I get the shaping that I expect as shown in the first image. But with the most recent versions of HarfBuzz including the latest version (sorry, I don't know at which intermediate version things changed), I get incorrect shaping, as shown in the 2nd image. These flaws are apparent in the latest versions of Firefox too (which presumably contains the latest HarfBuzz library unchanged in some statically linked form I guess ... ) Can someone please give me a hint about what changed in HarfBuzz? Is this a bug in HarfBuzz? Or is some definition in my OpenType feature file not correct after changes were made in HarfBuzz? The answer is simple but insidious. The normalization for Tai Tham is somewhat broken in that sakot has been given a lower combining order than a tone mark. Thus when text is normalized, the sequence cons + vowel + tone + sakot + cons, gets reordered to cons + vowel + sakot + tone + cons. I have tried very hard to get the UTC to fix this, but they absolutely refuse on the basis of stability. (Please don't get me started!). The best answer is to have the Tai Tham shaper re-reorder the wrong normalised order of sakot + tone + cons, back to tone + sakot + cons before allocating features, etc. Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
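To make the suggested re-reordering concrete, here is a rough sketch. It is not the HarfBuzz shaper code: fix_sakot_tone_order is a made-up name, the text is assumed to be a plain array of code points, and as a simplification it treats only U+1A75..U+1A79 as tone marks and does not check that a consonant follows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

static const uint32_t SAKOT = 0x1A60;  // TAI THAM SIGN SAKOT

// Assumption for this sketch: only U+1A75..U+1A79 count as tone marks.
static bool
is_tone (uint32_t u)
{
  return u >= 0x1A75 && u <= 0x1A79;
}

// Swap each normalised "sakot, tone" pair back to "tone, sakot".
void
fix_sakot_tone_order (std::vector<uint32_t> &text)
{
  for (std::size_t i = 0; i + 1 < text.size (); ++i)
    if (text[i] == SAKOT && is_tone (text[i + 1]))
    {
      std::swap (text[i], text[i + 1]);
      ++i;  // step over the sakot that was just moved forward
    }
}
```

A sakot that is followed directly by a consonant, with no tone mark, is left where it is.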
Re: [HarfBuzz] Tai Tham NGA, SAKOT is not Kinzi
Dear Richard and Thep, In a Lanna tutorial [1], it's stated in page 12 that MAI KANG LAI is placed on the second consonant only. But the position is actually in the middle. [1] http://dl.dropbox.com/u/12266813/TaiTham/lanna-tutorial.pdf I think this is a wrong analysis. On p23 of the same pdf the last example shows the mai tang lai over the following glyph which is an -e vowel. So clearly the mai tang lai is not associated with the second consonant. In my analysing this character I came to the conclusion that for Lanna, Khuen and Lue, mai kang lai is a final on the first consonant, even if it hangs out to the right quite a long way. For Lao Tham, [2] on page 14, [3] on page 15, it's clearly placed on the second consonant. [2] http://www.laomanuscripts.net/downloads/tham_pali.pdf [3] http://www.esansawang.in.th/esanweb/es3_text/palitx_web.pdf Is this Lao Tham or Isaan? Anyway, it's a new discovery to me, so thank you for finding it. It's tricky to decide how to mark this ordering. I think we have two choices to make: 1. We don't encode the difference. We keep the mai tang lai encoded after the first consonant (which also puts it in front of the second consonant, so the encoding position is unchanged) and we say that the difference in rendering position is stylistic. Thus an OT engine would need a specific feature to trigger the reordering (given that GPOS can't attach forwards, only backwards). 2. We encode the difference. I like Thep's suggestion of using sakot for this. Thus an encoding of C1, vowels, mai kang lai, sakot, C2 The difference in position *is* stylistic. As to whether it's based purely on language or whether it's language and style, I don't know. But the difference in position carries no meaning. This would argue for approach 1. On the other hand, such a radical rendering difference can be argued as being a spelling difference, and this favours approach 2. Approach 2 is also easier for an OpenType engine. 
But it is harder for users who would have to use a special keyboard to do the reordering. Mind you they would have to have that for approach 1 also, so there's nothing to be gained there. My natural inclination is to keep the data clean and go with approach 1. What do you guys think? Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] pangocairo on Mac: transformation matrix and fontsize
Dear Khaled, > The warning comes from libtool, I also get it when linking XeTeX with system harfbuzz and graphite2. I don’t have the slightest idea what it means, but it seems to be caused by this line in libgraphite2.la: # Should we warn about portability when linking against -modules? shouldnotlink=yes Which is, apparently, generated by this line in Graphite.cmake: GET_TARGET_PROPERTY_WITH_DEFAULT(_target_shouldnotlink ${_target} LT_SHOULDNOTLINK yes) Ooh. Thank you for spotting that. That's so good to nail. Fixed, and sharpish. My profuse apologies to Behdad for thinking that he had put this in on purpose. My bad judgement. Sorry. Behdad, are you planning a release date for harfbuzz? I'm wondering when to push out an interim release of Graphite and whether to aim for that date for you. Yours, Martin
[HarfBuzz] patch to add getters for gr_font and gr_face
Dear Behdad,

Khaled says he needs this for xetex. I enclose a patch to add getter functions to hb-graphite2 for gr_font and gr_face. No idea if they work though ;)

Yours,
Martin

diff --git a/src/hb-graphite2.cc b/src/hb-graphite2.cc
index 6c890d4..06eca24 100644
--- a/src/hb-graphite2.cc
+++ b/src/hb-graphite2.cc
@@ -113,7 +113,7 @@ _hb_graphite2_shaper_face_data_create (hb_face_t *face)
   hb_blob_destroy (silf_blob);
 
   data->face = face;
-  data->grface = gr_make_face (data, hb_graphite2_get_table, gr_face_default);
+  data->grface = gr_make_face (data, hb_graphite2_get_table, gr_face_preloadAll);
 
   if (unlikely (!data->grface)) {
     free (data);
@@ -193,6 +193,21 @@ _hb_graphite2_shaper_shape_plan_data_destroy (hb_graphite2_shaper_shape_plan_dat
  * shaper
  */
 
+gr_font *
+hb_graphite2_font_get_font (hb_font_t *font)
+{
+  if (unlikely (!hb_graphite2_shaper_font_data_ensure(font))) return 0;
+  return HB_SHAPER_DATA_GET (font);
+}
+
+gr_face *
+hb_graphite2_font_get_face (hb_font_t *font)
+{
+  if (unlikely (!hb_graphite2_shaper_face_data_ensure(font->face))) return 0;
+  hb_face_t *face = font->face;
+  return HB_SHAPER_DATA_GET (face)->grface;
+}
+
 struct hb_graphite2_cluster_t {
   unsigned int base_char;
   unsigned int num_chars;
@@ -311,10 +326,18 @@ _hb_graphite2_shape (hb_shape_plan_t *shape_plan,
   }
   ci++;
 
-  buffer->clear_output ();
+  //buffer->clear_output ();
   for (unsigned int i = 0; i < ci; ++i)
-    buffer->replace_glyphs (clusters[i].num_chars, clusters[i].num_glyphs, gids + clusters[i].base_glyph);
-  buffer->swap_buffers ();
+  {
+    for (unsigned int j = 0; j < clusters[i].num_glyphs; ++j)
+    {
+      hb_glyph_info_t *info = &buffer->info[clusters[i].base_glyph + j];
+      info->codepoint = gids[clusters[i].base_glyph + j];
+      info->cluster = gr_cinfo_base(gr_seg_cinfo(seg, clusters[i].base_char));
+    }
+  }
+  buffer->len = glyph_count;
+  //buffer->swap_buffers ();
 
   if (HB_DIRECTION_IS_BACKWARD(buffer->props.direction))
     curradvx = gr_seg_advance_X(seg);
diff --git a/src/hb-graphite2.h b/src/hb-graphite2.h
index 8122495..c244f09 100644
--- a/src/hb-graphite2.h
+++ b/src/hb-graphite2.h
@@ -34,6 +34,11 @@ HB_BEGIN_DECLS
 #define HB_GRAPHITE2_TAG_SILF HB_TAG('S','i','l','f')
 
 /* TODO add gr_font/face etc getters and other glue API */
+gr_font *
+hb_graphite2_font_get_font (hb_font_t *font);
+
+gr_face *
+hb_graphite2_font_get_face (hb_font_t *font);
 
 HB_END_DECLS
Re: [HarfBuzz] pangocairo on Mac: transformation matrix and fontsize
Dear Behdad, The next hold-up is some graphite2 build problems and test failures on PowerPC that I'm dealing with that developer about. Please package without graphite2. That's my recommendation anyway. And my recommendation is the opposite (of course). Please do package with graphite so that pango will 'just work' with graphite fonts. It's not a big cost, and it is no less portable than having harfbuzz depend on glib, despite Behdad's warnings that if you link to graphite the world will come to an end :) Could you explain why you think that linking against libgraphite2 is non-portable? Perhaps you could remove this warning? Graphite2 builds in all environments that TeX Live and Firefox do. And we would be more than happy to ensure that Graphite builds in all contexts that harfbuzz does. Yours, Martin

PS. Here are my current dependencies on a typical system build of harfbuzz:

ldd /usr/local/lib/libharfbuzz.so
linux-vdso.so.1 => (0x7fff005be000)
libgobject-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0 (0x7fcb4ce7)
libglib-2.0.so.0 => /lib/x86_64-linux-gnu/libglib-2.0.so.0 (0x7fcb4cb7b000)
libfreetype.so.6 => /usr/lib/x86_64-linux-gnu/libfreetype.so.6 (0x7fcb4c8de000)
libgraphite2.so.3 => /usr/local/lib/libgraphite2.so.3 (0x7fcb4c6aa000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7fcb4c2eb000)
libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x7fcb4c0e2000)
libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x7fcb4bea5000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x7fcb4bc88000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x7fcb4ba7f000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x7fcb4b868000)
/lib64/ld-linux-x86-64.so.2 (0x7fcb4d3bb000)

which raises questions as to why some of those are in the list (libpcre, libz). And (FWIW) here's graphite's:

ldd /usr/local/lib/libgraphite2.so
linux-vdso.so.1 => (0x7fff2bbff000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7fdc1408)
/lib64/ld-linux-x86-64.so.2 (0x7fdc146a6000)
[HarfBuzz] building debug
Dear All, Probably a dumb question, but how do I build harfbuzz with debug turned on? I was hoping for a --enable-debug option to configure or some such. TIA, Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
[HarfBuzz] Adding font debugging
Dear Behdad, It would be great if we could add a little telemetry to harfbuzz to make life easier for font developers so they can find out why their font behaves other than how they are expecting it to behave. In the Graphite project we've been working on a gui based debugger that does after the fact debugging. It runs a string through graphite and collects information from the engine about what it did and then presents that to the user so that they can work out why graphite did what it did rather than what they expected it to. The approach we took in the graphite engine is to have it spit out a json dump describing the segment processing. While you would probably prefer to give the information via a callback mechanism, it's still the same idea. Thinking about the information that we dump and use, the main thing is that for every rule (lookup) we give the position in the glyph string where the lookup fired and then the glyph string after the lookup executes, along with which lookup executed (by index I assume). In the case of harfbuzz, I would also expect it to give the glyph string after pre-shaping giving all the glyphs and the features associated with each one. In the case of a contextual chaining lookup there probably needs to be a debug call made before the sublookup fires to say where in the string that lookup is executed and which lookup it is, and therefore what the match was. HTH, Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
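As a rough picture of the callback flavour of this idea, the sketch below reports, for each lookup that fires, which lookup it was, where in the glyph string it fired, and the glyph string afterwards. Everything here is hypothetical and invented for the example (LookupTraceEvent, TraceCallback, apply_single_subst); it is not an existing HarfBuzz or Graphite interface, and the "lookup" is a toy single substitution.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical event handed to a font debugger each time a lookup applies.
struct LookupTraceEvent {
  unsigned int lookup_index;               // which lookup executed
  std::size_t position;                    // where in the glyph string it fired
  std::vector<unsigned int> glyphs_after;  // glyph string after it ran
};

using TraceCallback = std::function<void (const LookupTraceEvent &)>;

// Toy lookup: replace glyph `from` with glyph `to`, reporting every hit.
void
apply_single_subst (std::vector<unsigned int> &glyphs,
                    unsigned int lookup_index,
                    unsigned int from, unsigned int to,
                    const TraceCallback &trace)
{
  for (std::size_t i = 0; i < glyphs.size (); ++i)
    if (glyphs[i] == from)
    {
      glyphs[i] = to;
      if (trace)  // tracing is optional; an empty callback costs one test
        trace (LookupTraceEvent{lookup_index, i, glyphs});
    }
}
```

A debugger could collect these events into a list and replay them after the fact, which is close to what the json dump gives us in graphite.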
[HarfBuzz] non-portable linkages
Dear Behdad, I would just like to point out that linking harfbuzz or anything against ICU, glib, freetype makes the library just as non-portable as linking against graphite2. Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] HarfBuzz.old backend in new HarfBuzz
Dear Behdad, Pretty much like the Uniscribe and CoreText backends, this new backend is primarily for testing, and may be removed in the future (after I have convinced everyone to move to the real HarfBuzz). Is now the right time to ask for the reenabling of the graphite backend? Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] Tai Tham / Lanna (iso15924=lana) shaping question
Dear Behdad, Good to know. I'll give HB a run on my Myanmar corpus and see if I can fix a few high-impact issues. I would commend UTN#11 as worth reading on this, at least the first half. It'll give you a good feel for what's involved. In the case of Tai Tham, we took the Myanmar model as the basis and built on it. Tai Tham writing systems have the extra fun that you can subjoin a final and even the start of another word. So you can get fun things like: U+1A3B U+1A66 U+1A76 U+1A60 U+1A36 U+1A6C U+1A26 = p + ii + tone2 + sakot + n + o + ng and this common spelling: U+1A20 U+1A63 U+1A60 U+1A36 : k + aa + sakot + n Of course you can also mix prevowels and medials into this mix: U+1A4B U+1A6B U+1A36 U+1A60 U+1A32 U+1A55 U+1A63 U+1A60 U+1A3F : ?a + vowel + n + sakot + t + medial ra + aa + sakot + y Tai Tham is just plain fun :) Yours, Martin Will look at my sources to confirm for Tai Tham. Thanks, b A. On Thursday, 24 May 2012, Behdad Esfahbod beh...@behdad.org wrote: Hi Thep, Humm, the message from Ed that you are replying to never made it to me or to the list. Replies inline. On 05/23/2012 06:53 AM, Theppitak Karoonboonyanan wrote: Hi, Ed, Behdad, On Sun, May 20, 2012 at 3:45 AM, Ed Trager ed.tra...@gmail.com wrote: On Fri, May 18, 2012 at 5:48 PM, Behdad Esfahbod beh...@behdad.org wrote: On 05/18/2012 04:02 PM, Ed Trager wrote: In Tai Tham, U+1A6E VOWEL SIGN E needs to be shifted all the way to the left so that the final visual appearance would be: Are you sure? Without U+1A60 TAI THAM SIGN SAKOT before the subjoined consonant? Reading Unicode suggests that you need that sign between PA and LA. For most subjoined consonants, yes, that's true. But note in particular that U+1A56 MEDIAL LA and U+1A57 MEDIAL LA TANG LAI were encoded separately.
In the case of these two LA signs, I believe there are two reasons justifying the separate encoding: (1) These are variant forms of the same subjoined letter LA: apparently, there is no other good way to do it other than encoding both. (2) Both of these LA signs can be part of triple consonant clusters, e.g. KLW appears in the common Thai / Tai word for banana, กล้วย, klwy. In Tai Tham, both the L and the W appear as below-base stacked forms (and actually the y is also a subjoined form, but it's kind of hanging off the right side of the whole stack). I'm not questioning the separate encoding. I don't care :-). What I'm saying is that you need a SAKOT before them for them to be considered part of the same syllable according to the Indic OpenType spec and my implementation. Now, if you think Unicode intended these to subjoin without a SAKOT, then I'd like you to point me to documentation about that. If that is the case, we would need changes to the Indic machine. Not impossible, but I first want to make sure that it is indeed the case. behdad There are some other separately-encoded subjoining consonant signs: U+1A5B, U+1A5C, U+1A5D, U+1A5E. Please also count U+1A55 (MEDIAL RA) in the rule, although it's not a subjoined form. Regards, -Thep. -- Andrew Cunningham Senior Project Manager, Research and Development Vicnet State Library of Victoria Australia andr...@vicnet.net.au lang.supp...@gmail.com
[HarfBuzz] applying the graphite patch
Dear Behdad, I notice that you have made various fixes to the hb-graphite code, but you seem not to have applied the patch I sent you that both re-enables graphite and fixes the rtl problem. At least that's my understanding from the git repo pull I have here. TIA, GB, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] hb-view assert with graphite version of Scheherazade
Dear Behdad, Humm. As it is the logic in hb-graphite.cc only handles LTR runs. I'm not fluent enough in Graphite to try to fix that. I'll wait to see what Martin has to offer. OK. Rather than bend my brain around all the possible configurations of clustering in RTL text, let me ask the following question and then I can work on getting the same clustering out of graphite. With an rtl input text of arabic, rendering as CIBARA, where we consider a vowel to be a base character and a consonant to be a diacritic, are the clusters (in order): 1. (A R), (A B), (I C) 2. (R A), (B A), (C I) 3. (C I), (B A), (R A) 4. (I C), (A B), (A R) or something else I haven't thought of? TIA. Yours, Martin behdad On 09/19/11 17:08, Khaled Hosny wrote: Hi Behdad, I get an assertion failure from hb-view when using Graphite version of Scheherazade (http://scripts.sil.org/graphitefonts): ERROR:helper-cairo.cc:356:void helper_cairo_line_from_buffer(helper_cairo_line_t*, hb_buffer_t*, const char*, unsigned int, double): assertion failed: (hb_glyph[i].cluster hb_glyph[i+1].cluster) I didn't have this few days ago. Regards, Khaled ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] what not to indic shape
Dear Behdad, The questionable ones are rare scripts and I would take the risk of treating them as simple given there is no reordering involved. We can always move them with a bug report. One key aspect is that there is no reordering involved, and that's a key issue. The way I designed the shaper, the idea was to use it for nonreordering scripts too. If the script can use blwf, abvf, etc features, it belongs to this shaper. But I'm pretty certain they don't. I might do some more digging to get the actual proposals to find out, but IIRC they don't. Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
[HarfBuzz] what not to indic shape
Dear Behdad,

Here is a list of scripts that I think shouldn't be using the indic shaper. Justification of simple means that there is no reordering or conjuncts involved and that there is probably no actual shaping (so just generic shaping will be sufficient).

BATAK: ? Simple
BRAHMI: ? Simple
HANUNOO: ? Simple
KAYAH_LI: Simple
LAO: See Thai
LIMBU: Simple
MEETEI_MAYEK: ? Simple
MYANMAR: Current implementations do not have complex shaping. The current indic shaper is inappropriate. This is a temporary measure. Ideally the font should be queried for a key feature like blwf. If missing, then use generic shaping else use either fixed indic or myanmar specific.
PHAGS_PA: Simple
SAURASHTRA: ? Simple
SYLOTI_NAGRI: Simple
TAGALOG: Simple
TAGBANWA: Simple
TAI_LE: Simple
TAI_VIET: See Thai
THAI: No reordering, no conjuncts, some ligation, generic shaping sufficient. Note that for the Thai class of scripts reordering prevowels would be wrong.
TIBETAN: Subjoined characters have their own codes.

HTH,
Martin
Re: [HarfBuzz] harfbuzz-ng: Changes to 'graphite2'
Dear Behdad,

You mentioned that you intend to simplify the whole userdata approach for both uniscribe and graphite, which is a great idea. One thing that is worth pointing out is that graphite is somewhat unique in needing a difference between undef (not yet tried to create a gr_face) and NULL (tried and failed). Most fonts are not going to have graphite tables so we don't want to keep trying to create a gr_face for them. Instead we just want to try once and note that this isn't a graphite font. The fail is then quick.

I haven't refactored the hb-graphite2.cc to put that back since you intend to refactor that area anyway. I would suggest the following patch though:

diff --git a/src/hb-graphite2.cc b/src/hb-graphite2.cc
index df97175..dfeab9f 100644
--- a/src/hb-graphite2.cc
+++ b/src/hb-graphite2.cc
@@ -227,6 +227,7 @@ hb_graphite_shape (hb_font_t *font,
   buffer->guess_properties ();
 
   hb_gr_font_data_t *data = _hb_gr_font_get_data (font);
+  if (!data->grface) return FALSE;
 
   unsigned int charlen;
   hb_glyph_info_t *bufferi = hb_buffer_get_glyph_infos (buffer, &charlen);

otherwise you try to do feature stuff on a null font, and so on.

Yours,
Martin
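The undef versus NULL distinction can be sketched as a small three-state cache. This is only an illustration under stated assumptions: Face stands in for gr_face, and FaceState, FaceCache and get_face are made-up names, not part of hb-graphite2.cc. The point is that creation is attempted once, and a font without graphite tables then fails quickly on every later call.

```cpp
#include <cassert>

// Hypothetical stand-in for gr_face.
struct Face { int id; };

enum class FaceState { Untried, Failed, Ready };

struct FaceCache {
  FaceState state = FaceState::Untried;
  Face *face = nullptr;
  unsigned int attempts = 0;  // how many times creation was actually tried
};

// `make` is a hypothetical factory that returns nullptr on failure.
template <typename Factory>
Face *
get_face (FaceCache &cache, Factory make)
{
  if (cache.state == FaceState::Untried)
  {
    cache.attempts++;
    cache.face = make ();
    cache.state = cache.face ? FaceState::Ready : FaceState::Failed;
  }
  // Ready implies a usable face; Failed implies nullptr, with no retry.
  assert (cache.state != FaceState::Ready || cache.face != nullptr);
  return cache.face;
}
```

The shaper entry point would then bail out early when get_face returns nullptr, much as the one-line patch above does with data->grface.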
Re: [HarfBuzz] Fwd: Where to report a bug with Tamil text rendering in LibreOffice?
Dear Shriramana, hb-view only produces images. The problem is with the cursor placement. Unless I configure a tiny text box to use hb-ng for CTL, I can't test whether the problem exists in hb-ng myself. Can you please tell me how to do that? Or is cursor placement entirely out of the purview of HB, as I asked before? there is a contrib/python hbtestfont that prints out cluster information as well as positions. I tried it for you with: ./runpy scripts/hbtestfont -f Lohit Tamil -c taml 0BA4 0BA4 0BCD 0BA4 Processing: /usr/share/fonts/truetype/ttf-indic-fonts-core/lohit_ta.ttf [720@(0.00,0.00)+(20.00,0.00), 2273@(0.00,0.00)+(20.00,0.00), 729@(0.00,0.00)+(20.00,0.00)] This means that there were 3 clusters in the sequence, which is what you want. So it seems that hbng is doing what you want. Remember, hbng doesn't do anything with cursors, but it does do clustering. Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] graphite 1.0.1 released
Dear Shriramana, Perhaps I wasn't sufficiently clear. I was interested in knowing whether Graphite rendering via Harfbuzz is slower than OT rendering via Harfbuzz for the same text/script. Somehow I get the perception that OT is inbuilt into HB whereas Graphite is an *external* library that HB calls, which is why I ask. For some scripts Graphite is faster and for others it is slower. But it is not much slower and it is sometimes much faster. The more lookups that fire in your font (particularly contextual lookups) the more likely that Graphite will be faster. It being an external library has nothing to do with it. Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] graphite 1.0.1 released
Dear Shriramana, Without any offence to anybody, and with no desire to bash OT but only to make things clear (?) it appears (from this) that the basic principle upon which OT lookup tables were designed aren't very effective. Is that so? No that is far from the case. OpenType is generally quicker because much of the complexity in shaping can be implemented in code, and working with underlying data structures, rather than generic fsm+engine code, allows for various optimisations, etc. But writing all that shaping code centralises the solution to the shaping problems that people face and so the cost of addressing those problems, is in updating library code. (Which then needs to be distributed, etc.) Graphite, OTOH is designed to address the use cases where current shaping is insufficient by providing a generic solution mechanism. But that generality costs, and we work very hard to get it running fast. So OT and Graphite are addressing similar problems in different ways. In summary, OT aims to solve the 80% problem and Graphite to solve the 20% problem (in the traditional arbitrary 80:20 split, not reflecting the real ratio which nobody knows). They do not compete because you can support both in the same font. I realise that I have presented one view on this, and that others will have differing opinions, particularly on the relative importance of the 80% vs the 20% (or is that 98% and 2%?). It being an external library has nothing to do with it. Oh -- right -- I learnt this early along my (limited) programming knowledge but forgot it -- since there is direct linkage of the HB code against the Graphite library, it is fast. (Right?) In effect, yes. Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] graphite 1.0.1 released
Dear Shriramana, On Sat, Aug 6, 2011 at 5:08 AM, Behdad Esfahbod beh...@behdad.org wrote: Thanks Martin. Now that we have multi-shaper infrastructure in place, I'll go ahead and do that. Sorry to sound ignorant, but HB is OpenType technology for rendering, Graphite is a different technology for the same -- what exactly is the need/nature of connection between the two? Integration between two technologies in *general* sounds a good thing, but I'm curious to know the details. HB supplies a shaping interface to applications, which don't care *how* that shaping is done, just so long as it is done and is returned in an agreed fashion. HB implements, as its primary technology, OpenType to do that shaping. But there is nothing to stop it also using Graphite or AAT (if someone were to write an implementation of AAT) or any other shaping technology, to do the shaping the application requires. By integrating Graphite under the HB shaper API, applications only need concern themselves with the one API to get good shaping from Graphite fonts as well as OpenType fonts. Yours, Martin ___ HarfBuzz mailing list HarfBuzz@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/harfbuzz
[HarfBuzz] graphite 1.0.1 released
Dear All, Graphite2 v1.0.1 has been released as the first major release of the new graphite engine. It is available from: http://sourceforge.net/projects/silgraphite/files/graphite2/graphite2-1.0.1.tgz/download http://projects.palaso.org/attachments/download/139/graphite2-1.0.1.tgz In addition, I have upgraded the integration code for harfbuzz and also the python test code. You can merge from: git://gitorious.org/harfbuzz-dev/harfbuzz-dev.git I'm hoping Behdad will do this soon. In tracking the internal and external API changes, I've only ever had to deal in hb concepts and never in graphite concepts (FWIW). Yours, Martin
[HarfBuzz] cd build ../configure
Dear Behdad,

The enclosed patch fixes an out-of-tree build, which currently fails with this sequence:

./autogen.sh
make distclean
mkdir build
cd build
../configure
make

diff --git a/test/Makefile.am b/test/Makefile.am
index b1a9b87..adf1ec8 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -7,7 +7,7 @@ DISTCLEANFILES =
 MAINTAINERCLEANFILES =
 if HAVE_GLIB
-AM_CPPFLAGS = -DSRCDIR=\$(srcdir)\ -I$(top_srcdir)/src/ $(GLIB_CFLAGS) $(GT
+AM_CPPFLAGS = -DSRCDIR=\$(srcdir)\ -I$(top_builddir)/src/ -I$(top_srcdir)/s
 LDADD = $(top_builddir)/src/libharfbuzz.la $(GLIB_LIBS) $(GTHREAD_LIBS)
 EXTRA_DIST += hb-test.h

Yours, Martin
Re: [HarfBuzz] Quick show of hands needed
Dear Behdad, > I'm wondering: should harfbuzz try compatibility composition/decomposition (NFKC/NFKD) if a font doesn't support a character? > No, I don't think this belongs in a shaping/rendering engine. I agree. It shouldn't go near compatibility characters. But it should do canonical composition/decomposition. Yours, Martin
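The canonical versus compatibility distinction can be seen with Python's standard `unicodedata` module. This is a sketch only: the helper `canonical_fallback` and the idea of modelling font coverage as a set of characters are assumptions for illustration, not how HarfBuzz does it.

```python
import unicodedata
from typing import Optional, Set


def canonical_fallback(text: str, supported: Set[str]) -> Optional[str]:
    """Return the first canonically equivalent spelling the font covers.

    Only the canonical forms (NFC, NFD) are tried; the compatibility
    forms (NFKC, NFKD) are deliberately left out, as argued above.
    `supported` is a hypothetical stand-in for the font's character map.
    """
    candidates = (
        text,
        unicodedata.normalize("NFC", text),
        unicodedata.normalize("NFD", text),
    )
    for candidate in candidates:
        if all(ch in supported for ch in candidate):
            return candidate
    return None


# U+00E9 (e with acute) decomposes canonically into 'e' + U+0301.
E_ACUTE = "\u00e9"
# U+FB01 (the fi ligature) has only a compatibility decomposition to "fi".
FI_LIGATURE = "\ufb01"
```

With a font that covers only `e` and U+0301, `canonical_fallback(E_ACUTE, {"e", "\u0301"})` finds the decomposed spelling. With a font that covers only `f` and `i`, `canonical_fallback(FI_LIGATURE, {"f", "i"})` returns `None`, because reaching "fi" would need NFKD.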
Re: [HarfBuzz] A few HarfBuzz-ng questions
Dear Behdad, > 2. SHAPERS: In void hb_shape(...) in hb-shape.cc, I see this: > The current Graphite shaper is disabled because it was crashing on me all the time. Deep in the libgraphite code... That's a major concern. Most probably I never enable graphite or any other non-included backend by default and make users --enable them if they wish. That's the first I've heard of your having problems. We could dig into them or we could wait for the rewrite. We're aiming for a good solid beta by the end of the year. Yours, Martin
Re: [HarfBuzz] harfbuzz coordinate system
Dear Behdad, > I think I'm going to make that change, especially since everytime I hooked up HarfBuzz to another system, it worked the first try except for a missing negation sign for y_offset. I don't think it matters too much which way you jump on this one so long as: 1. You clearly state the directionality in lots of places in the documentation and code. 2. You make it clear where the different y directionalities are used. For example, what should a call to a font metrics function return? If it is as per a font with y increasing up the page, then where in the code does the y directionality switch? I sympathise with the quandary: font designers think in terms of y going up the page, while graphics programmers think of y increasing down the page. Yours, Martin
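The "missing negation sign" is easiest to see where the switch between the two conventions happens. The sketch below assumes that shaping offsets follow the font convention (y increasing up the page) and that the drawing surface has y increasing down the page. The function name `to_device` and its parameters are hypothetical.

```python
from typing import Tuple


def to_device(pen_x: float, baseline_y: float,
              x_offset: float, y_offset: float,
              scale: float = 1.0) -> Tuple[float, float]:
    """Map a glyph offset in font space to a position on a y-down surface.

    Assumes offsets are in font units with y increasing upwards, and that
    (pen_x, baseline_y) is the pen position in device space, where y
    increases downwards. The sign flip on y lives here and nowhere else.
    """
    device_x = pen_x + x_offset * scale
    device_y = baseline_y - y_offset * scale  # the negation in question
    return device_x, device_y
```

A mark raised by 5 units above a baseline at device y = 100 lands at y = 95, that is, higher on the screen: `to_device(10, 100, 2, 5)` gives `(12.0, 95.0)`. Keeping the flip in a single documented place is one way to meet both conditions listed above.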
[HarfBuzz] Adding tracing
Dear All, I've added tracing to harfbuzz-ng. This is to help font developers, rather than programmers, figure out what is going on with their fonts. The cost is a simple if() for each lookup. If that is too high, we can probably make tracing optional. Code available from gitorious.org:harfbuzz-dev/harfbuzz-dev.git (can't remember how to get it publicly). The tracing goes all the way through to the python test code via a callback. Yours, Martin
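The shape of "one if() per lookup plus a callback" can be sketched as follows. This is an illustration only and not the code in the repository above: `apply_lookups`, the `trace` signature and the toy ligature lookup are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence

Glyphs = List[str]
Lookup = Callable[[Glyphs], Glyphs]
# Hypothetical trace signature: (lookup index, glyphs before, glyphs after).
TraceFn = Callable[[int, Glyphs, Glyphs], None]


def apply_lookups(glyphs: Glyphs, lookups: Sequence[Lookup],
                  trace: Optional[TraceFn] = None) -> Glyphs:
    """Run each lookup in order, reporting to `trace` if one is given.

    When no callback is installed, the only extra cost per lookup is the
    `is not None` test. Each lookup is assumed to return a new list.
    """
    for index, lookup in enumerate(lookups):
        result = lookup(glyphs)
        if trace is not None:
            trace(index, glyphs, result)
        glyphs = result
    return glyphs


def ligate_f_i(glyphs: Glyphs) -> Glyphs:
    """Toy lookup: replace each 'f', 'i' pair with a single 'f_i' glyph."""
    out: Glyphs = []
    i = 0
    while i < len(glyphs):
        if glyphs[i:i + 2] == ["f", "i"]:
            out.append("f_i")
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out
```

A font developer's test script could pass `trace=lambda i, before, after: print(i, before, after)` to see which lookup changed the glyph string and how.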