Re: Static compose table in gtkimcontextsimple.c
On Tue, 2007-12-04 at 08:31 +0200, Tor Lillqvist wrote:
> > GDK_dead_circumflex, GDK_C, 0, 0, 0, 0x0108, /* LATIN_CAPITAL_LETTER_C_WITH_CIRCUMFLEX */
> > [...]
> > GDK_dead_circumflex, GDK_c, 0, 0, 0, 0x0109, /* LATIN_SMALL_LETTER_C_WITH_CIRCUMFLEX */
> > [...]
>
> The sequences you list are exactly of the straightforward kind that in
> my opinion can and should be handled algorithmically. I.e. a "dead"
> accent followed by a letter can be mapped to the corresponding
> precomposed character without an explicit table. I have a patch in bug
> #321896 that implements such an algorithm (and which would handle your
> cases, too.) Basically it's waiting for a second opinion from the GTK+
> maintainers.

I made two small changes to the patch (now at #321896):

1. If diacritic marks belong to the same combining class, normalisation
   does not reorder them, so we need to try every permutation of the
   marks and attempt to normalise each one.
2. Added a check for overlong compose sequences; otherwise one can type
   too many dead keys and overflow the buffer.

I also added a script at #321896 that parses UnicodeData.txt, then
checks and counts all the characters that the algorithmic function can
take care of. There are about 1000 of them.

Simos
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list
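The two changes Simos describes can be sketched in a few lines of Python. This is only an illustration built on the standard `unicodedata` module, not the code from the patch in bug #321896; the function name `compose_sequence` and the limit `MAX_COMPOSE_LEN` are assumptions made up for the example.

```python
import itertools
import unicodedata

# Assumed upper bound on the number of keys in one compose sequence.
# The real limit lives in gtkimcontextsimple.c; this value is illustrative.
MAX_COMPOSE_LEN = 7


def compose_sequence(marks, base):
    """Combine a base letter with combining marks typed as dead keys.

    Returns a single precomposed character, or None if no ordering of
    the marks normalises to one code point.
    """
    # Change 2: refuse overlong sequences instead of overflowing a buffer.
    if len(marks) + 1 > MAX_COMPOSE_LEN:
        return None
    # Change 1: marks of the same combining class are never reordered by
    # normalisation, so every ordering has to be tried explicitly.
    for order in itertools.permutations(marks):
        candidate = unicodedata.normalize("NFC", base + "".join(order))
        if len(candidate) == 1:
            return candidate
    return None


ACUTE = "\u0301"       # COMBINING ACUTE ACCENT (combining class 230)
CIRCUMFLEX = "\u0302"  # COMBINING CIRCUMFLEX ACCENT (combining class 230)

# Typed as acute then circumflex, plain NFC leaves two code points behind;
# the permutation loop finds the order that Unicode can compose.
typed_order = unicodedata.normalize("NFC", "a" + ACUTE + CIRCUMFLEX)
result = compose_sequence([ACUTE, CIRCUMFLEX], "a")
print(len(typed_order), "U+%04X" % ord(result))
```

The number of permutations grows factorially, but with sequences capped at a handful of keys the loop stays small.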
Re: Static compose table in gtkimcontextsimple.c
I wrote:
> > a "dead" accent followed by a letter can be mapped to the
> > corresponding precomposed character without an explicit table.

On 06/12/2007, Paul LeoNerd Evans <[EMAIL PROTECTED]> wrote:
> Really..? Last time I checked, the precomposed letters weren't in any
> particularly easy-to-find locations;

Well, obviously there have to be some tables somewhere (in GLib's case I
guess they are in the generated header files like gunicomp.h), but what
I meant is that the information shouldn't have to be effectively
duplicated in GTK+.

--tml
Re: Static compose table in gtkimcontextsimple.c
On Thu, 2007-12-06 at 17:30 +, Paul LeoNerd Evans wrote:
> On Thu, 06 Dec 2007 12:12:39 -0500
> Owen Taylor <[EMAIL PROTECTED]> wrote:
>
> > Note also that loading /usr/share/X11/locale/en_US.UTF-8/Compose
>
> That's not quite what I meant.
>
> What I meant was, I thought that the X11 server did some of this work
> for us? So can we not ask it to do that?
>
> Or have I misunderstood how it works, and that this is really a
> clientside thing done by Xlib?

The latter.

- Owen
Re: Static compose table in gtkimcontextsimple.c
On Thu, 06 Dec 2007 12:12:39 -0500 Owen Taylor <[EMAIL PROTECTED]> wrote:
> Note also that loading /usr/share/X11/locale/en_US.UTF-8/Compose

That's not quite what I meant.

What I meant was, I thought that the X11 server did some of this work
for us? So can we not ask it to do that?

Or have I misunderstood how it works, and that this is really a
clientside thing done by Xlib?

--
Paul "LeoNerd" Evans
[EMAIL PROTECTED]
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
Re: Static compose table in gtkimcontextsimple.c
On Thu, 2007-12-06 at 12:28 +, Paul LeoNerd Evans wrote:
> On Tue, 04 Dec 2007 05:38:56 +
> Simos Xenitellis <[EMAIL PROTECTED]> wrote:
>
> > If you would like to help with bug 321896 it would be great. The current
> > state is on how to make the table much smaller, even with the addition of
> > more keysyms. There is a script that converts en_US.UTF-8/Compose into a
> > series of arrays that should be easy for GTK+ to work on.
>
> OK, I've had a good read through that bug, and now I'm confused again.
>
> Can someone explain why GTK has to have this large table compiled into
> it..? I thought X itself provided ways to perform input composition into
> Unicode strings. Otherwise, why do I have a file
>
> /usr/share/X11/locale/en_US.UTF-8/Compose
>
> Can we just use that?

Note also that loading /usr/share/X11/locale/en_US.UTF-8/Compose causes
a large amount of per-process memory to be allocated, and quite a bit of
time to be spent parsing it. While the GTK+ table is "large", it is
mapped read-only and so shared between all GTK+ applications. (*) (**)

I don't have any exact or recent numbers here; the Compose table was a
significant fraction of the per-process overhead when I measured it
before writing gtkimcontextsimple.c, and the current UTF-8 table is much
bigger than anything I measured. On the other hand, it's possible that
optimization has been done within Xlib in the subsequent 5-6 years.

The original motivations, in order of priority:

1. Reliable compose sequences in non-UTF-8 locales
2. Efficiency
3. Cross-platform portability

Item 1 is luckily no longer an issue, but the other two still apply.

- Owen

(*) The Xlib problem could obviously be fixed by precompiling and
    mem-mapping the Compose tables, as we do for similar things.

(**) The one thing to be careful about when modifying
     gtkimcontextsimple.c is not to save "size" by introducing
     relocations. Arrays that include pointers to other arrays cannot
     be mapped read-only. Other than that, go for it!
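Owen's two footnotes point at the same idea: a table made of fixed-size records that refer to nothing by address can be precompiled once and mapped read-only. The following Python sketch illustrates that layout with the standard `struct` and `mmap` modules. It is not how Xlib or GTK+ store their tables; the record format, the file name and the helper names are assumptions for the example, and the numeric dead-key values are quoted from memory of the X11 keysym definitions and should be treated as illustrative.

```python
import mmap
import os
import struct
import tempfile

# One fixed-size record: two 32-bit keysym values and one 32-bit code point.
# No record holds an address, so the file can be mapped read-only as-is.
RECORD = struct.Struct("<III")

# Illustrative keysym values for the two dead keys used below.
DEAD_CIRCUMFLEX = 0xFE52
DEAD_CARON = 0xFE5A


def write_table(path, entries):
    """Precompile: sort the entries and write them as packed records."""
    with open(path, "wb") as f:
        for first, second, codepoint in sorted(entries):
            f.write(RECORD.pack(first, second, codepoint))


def lookup(table, first, second):
    """Binary-search the mapped table for the key pair (first, second)."""
    lo, hi = 0, len(table) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        k1, k2, codepoint = RECORD.unpack_from(table, mid * RECORD.size)
        if (k1, k2) == (first, second):
            return codepoint
        if (k1, k2) < (first, second):
            lo = mid + 1
        else:
            hi = mid
    return None


entries = [
    (DEAD_CIRCUMFLEX, ord("C"), 0x0108),
    (DEAD_CIRCUMFLEX, ord("c"), 0x0109),
    (DEAD_CARON, ord("U"), 0x01D3),
    (DEAD_CARON, ord("u"), 0x01D4),
]

path = os.path.join(tempfile.mkdtemp(), "compose.bin")
write_table(path, entries)

# Map the precompiled file read-only; nothing is parsed at load time.
with open(path, "rb") as f:
    table = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    found = lookup(table, DEAD_CIRCUMFLEX, ord("c"))
    missing = lookup(table, DEAD_CARON, ord("q"))
    table.close()

print(hex(found), missing)
```

If a record instead stored a pointer to another array, the loader would have to patch that pointer at startup, which is exactly the relocation Owen warns about; storing offsets or keeping everything in one flat array avoids it.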
Re: Static compose table in gtkimcontextsimple.c
On Tue, 04 Dec 2007 05:38:56 + Simos Xenitellis <[EMAIL PROTECTED]> wrote:
> If you would like to help with bug 321896 it would be great. The current
> state is on how to make the table much smaller, even with the addition of
> more keysyms. There is a script that converts en_US.UTF-8/Compose into a
> series of arrays that should be easy for GTK+ to work on.

OK, I've had a good read through that bug, and now I'm confused again.

Can someone explain why GTK has to have this large table compiled into
it..? I thought X itself provided ways to perform input composition into
Unicode strings. Otherwise, why do I have a file

/usr/share/X11/locale/en_US.UTF-8/Compose

Can we just use that?

--
Paul "LeoNerd" Evans
Re: Static compose table in gtkimcontextsimple.c
On Tue, 4 Dec 2007 08:31:30 +0200 "Tor Lillqvist" <[EMAIL PROTECTED]> wrote:
> The sequences you list are exactly of the straightforward kind that in
> my opinion can and should be handled algorithmically. I.e. a "dead"
> accent followed by a letter can be mapped to the corresponding
> precomposed character without an explicit table.

Really..? Last time I checked, the precomposed letters weren't in any
particularly easy-to-find locations; I looked them up by typing them in
xterm and seeing what unicode sequences were generated.

> I have a patch in bug #321896 that implements such an algorithm (and
> which would handle your cases, too.) Basically it's waiting for a
> second opinion from the GTK+ maintainers.

Perhaps we could subtly poke them here then to remind them? :)

--
Paul "LeoNerd" Evans
Re: Static compose table in gtkimcontextsimple.c
On Dec 6, 2007 8:22 AM, Simos Xenitellis <[EMAIL PROTECTED]> wrote:
> I just compiled Tor's working patch which actually eliminates most of
> the compose sequences and it is amazing in the way it simplifies the work.
> I think it is the way to go once the small issues are resolved.

Thanks for staying on this issue for so long, Simos. It will be good to
have this finally resolved.
Re: Static compose table in gtkimcontextsimple.c
On Thu, 2007-12-06 at 12:28 +, Paul LeoNerd Evans wrote:
> On Tue, 04 Dec 2007 05:38:56 +
> Simos Xenitellis <[EMAIL PROTECTED]> wrote:
>
> > If you would like to help with bug 321896 it would be great. The current
> > state is on how to make the table much smaller, even with the addition of
> > more keysyms. There is a script that converts en_US.UTF-8/Compose into a
> > series of arrays that should be easy for GTK+ to work on.
>
> OK, I've had a good read through that bug, and now I'm confused again.
>
> Can someone explain why GTK has to have this large table compiled into
> it..? I thought X itself provided ways to perform input composition into
> Unicode strings. Otherwise, why do I have a file
>
> /usr/share/X11/locale/en_US.UTF-8/Compose
>
> Can we just use that?

There are issues with GTK+ running on other platforms that require it to
have a separate copy. Having the file contents in the library as static
data is good for performance and memory use.

I just compiled Tor's working patch, which actually eliminates most of
the compose sequences, and it is amazing how much it simplifies the
work. I think it is the way to go once the small issues are resolved.

Simos
Re: Static compose table in gtkimcontextsimple.c
> GDK_dead_circumflex, GDK_C, 0, 0, 0, 0x0108, /* LATIN_CAPITAL_LETTER_C_WITH_CIRCUMFLEX */
> [...]
> GDK_dead_circumflex, GDK_c, 0, 0, 0, 0x0109, /* LATIN_SMALL_LETTER_C_WITH_CIRCUMFLEX */
> [...]

The sequences you list are exactly of the straightforward kind that in
my opinion can and should be handled algorithmically. I.e. a "dead"
accent followed by a letter can be mapped to the corresponding
precomposed character without an explicit table. I have a patch in bug
#321896 that implements such an algorithm (and which would handle your
cases, too.) Basically it's waiting for a second opinion from the GTK+
maintainers.

--tml
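The idea Tor describes can be illustrated with Unicode normalisation: append the combining mark that corresponds to the dead key and let NFC composition find the precomposed character. The Python sketch below uses the standard `unicodedata` module and is not the patch from bug #321896; the dictionary `DEAD_KEY_TO_COMBINING` and the function `compose` are hypothetical names, with keys that merely mirror the `GDK_dead_*` keysyms quoted above.

```python
import unicodedata

# Hypothetical mapping from dead-key names to Unicode combining marks.
DEAD_KEY_TO_COMBINING = {
    "dead_circumflex": "\u0302",  # COMBINING CIRCUMFLEX ACCENT
    "dead_caron": "\u030C",       # COMBINING CARON
    "dead_acute": "\u0301",       # COMBINING ACUTE ACCENT
}


def compose(dead_key, letter):
    """Return the precomposed character for dead_key + letter, or None."""
    mark = DEAD_KEY_TO_COMBINING.get(dead_key)
    if mark is None:
        return None
    # NFC merges base + mark when Unicode defines a precomposed code point
    # for the pair; otherwise the two code points are left side by side.
    composed = unicodedata.normalize("NFC", letter + mark)
    return composed if len(composed) == 1 else None


for key, letter in [("dead_circumflex", "C"),
                    ("dead_circumflex", "g"),
                    ("dead_caron", "U")]:
    print(key, letter, "U+%04X" % ord(compose(key, letter)))
```

The composition data still has to live somewhere, but only once, in the Unicode tables of the normalisation library, rather than being repeated row by row in a compose table.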
Re: Static compose table in gtkimcontextsimple.c
On Mon, 2007-12-03 at 17:08 +, Paul LeoNerd Evans wrote:
> I notice there's a large table of compose sequences in
> gtkimcontextsimple.c. Is there any particular logic to the exact
> sequences listed here, or would it be acceptable to add some more?

The table should be in sync with the one from Xorg,
/usr/share/X11/locale/en_US.UTF-8/Compose

There is a bug report on this,
"Synch gdkkeysyms.h/gtkimcontextsimple.c with X.org 6.9/7.0"
http://bugzilla.gnome.org/show_bug.cgi?id=321896

> I'd quite like to have some mappings of Esperanto characters added;
> namely:
>
> GDK_dead_circumflex, GDK_C, 0, 0, 0, 0x0108, /* LATIN_CAPITAL_LETTER_C_WITH_CIRCUMFLEX */
> GDK_dead_circumflex, GDK_G, 0, 0, 0, 0x011C, /* LATIN_CAPITAL_LETTER_G_WITH_CIRCUMFLEX */
> GDK_dead_circumflex, GDK_H, 0, 0, 0, 0x0124, /* LATIN_CAPITAL_LETTER_H_WITH_CIRCUMFLEX */
> GDK_dead_circumflex, GDK_J, 0, 0, 0, 0x0134, /* LATIN_CAPITAL_LETTER_J_WITH_CIRCUMFLEX */
> GDK_dead_circumflex, GDK_S, 0, 0, 0, 0x015C, /* LATIN_CAPITAL_LETTER_S_WITH_CIRCUMFLEX */
>
> GDK_dead_circumflex, GDK_c, 0, 0, 0, 0x0109, /* LATIN_SMALL_LETTER_C_WITH_CIRCUMFLEX */
> GDK_dead_circumflex, GDK_g, 0, 0, 0, 0x011D, /* LATIN_SMALL_LETTER_G_WITH_CIRCUMFLEX */
> GDK_dead_circumflex, GDK_h, 0, 0, 0, 0x0125, /* LATIN_SMALL_LETTER_H_WITH_CIRCUMFLEX */
> GDK_dead_circumflex, GDK_j, 0, 0, 0, 0x0135, /* LATIN_SMALL_LETTER_J_WITH_CIRCUMFLEX */
> GDK_dead_circumflex, GDK_s, 0, 0, 0, 0x015D, /* LATIN_SMALL_LETTER_S_WITH_CIRCUMFLEX */
>
> GDK_dead_caron, GDK_U, 0, 0, 0, 0x01D3, /* LATIN_CAPITAL_LETTER_U_WITH_CARON */
>
> GDK_dead_caron, GDK_u, 0, 0, 0, 0x01D4, /* LATIN_SMALL_LETTER_U_WITH_CARON */
>
> Should I submit a patch?

A quick glance at the compose file of Xorg shows that these sequences
exist there, which is good.

If you would like to help with bug 321896 it would be great. The current
state is on how to make the table much smaller, even with the addition of
more keysyms. There is a script that converts en_US.UTF-8/Compose into a
series of arrays that should be easy for GTK+ to work on.

Regarding Greek polytonic, there is an optimisation suggested by Tor to
reduce the number of sequences (currently about 1000 out of 5000).

Simos
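As a rough idea of what converting the Compose file into arrays involves, here is a small Python sketch. It is not the script attached to bug 321896; the function names, the regular expression and the sample lines (written in the style of the Xorg Compose file) are assumptions for illustration, and the row layout simply copies the five-keysym-plus-value shape of the rows quoted above.

```python
import re

# Matches lines such as:  <dead_circumflex> <C> : "X" U0108 # COMMENT
# Group 1 holds the <keysym> events, group 2 the quoted result string.
LINE_RE = re.compile(r'^\s*((?:<\w+>\s*)+):\s*"((?:[^"\\]|\\.)*)"')


def parse_compose(text):
    """Return a list of (keysym_names, codepoint) pairs from Compose text."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # comments, blank lines, unsupported syntax
        keys = tuple(re.findall(r"<(\w+)>", match.group(1)))
        result = match.group(2)
        if len(result) != 1:
            continue  # this sketch skips multi-character or escaped results
        rows.append((keys, ord(result)))
    return rows


def to_table_row(keys, codepoint, width=5):
    """Format one sequence as a zero-padded row of keysyms plus value."""
    cells = ["GDK_" + key for key in keys] + ["0"] * (width - len(keys))
    return ", ".join(cells) + ", 0x%04X," % codepoint


SAMPLE = '''\
# Comment lines and blank lines are ignored.
<dead_circumflex> <C> : "\u0108" U0108 # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
<dead_circumflex> <c> : "\u0109" U0109 # LATIN SMALL LETTER C WITH CIRCUMFLEX
<dead_caron> <U> : "\u01d3" U01D3 # LATIN CAPITAL LETTER U WITH CARON
'''

parsed = parse_compose(SAMPLE)
for keys, codepoint in parsed:
    print(to_table_row(keys, codepoint))
```

Grouping the parsed rows by sequence length or by first keysym, which is where the size savings would come from, is left out of the sketch.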
Static compose table in gtkimcontextsimple.c
I notice there's a large table of compose sequences in
gtkimcontextsimple.c. Is there any particular logic to the exact
sequences listed here, or would it be acceptable to add some more?

I'd quite like to have some mappings of Esperanto characters added;
namely:

GDK_dead_circumflex, GDK_C, 0, 0, 0, 0x0108, /* LATIN_CAPITAL_LETTER_C_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_G, 0, 0, 0, 0x011C, /* LATIN_CAPITAL_LETTER_G_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_H, 0, 0, 0, 0x0124, /* LATIN_CAPITAL_LETTER_H_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_J, 0, 0, 0, 0x0134, /* LATIN_CAPITAL_LETTER_J_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_S, 0, 0, 0, 0x015C, /* LATIN_CAPITAL_LETTER_S_WITH_CIRCUMFLEX */

GDK_dead_circumflex, GDK_c, 0, 0, 0, 0x0109, /* LATIN_SMALL_LETTER_C_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_g, 0, 0, 0, 0x011D, /* LATIN_SMALL_LETTER_G_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_h, 0, 0, 0, 0x0125, /* LATIN_SMALL_LETTER_H_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_j, 0, 0, 0, 0x0135, /* LATIN_SMALL_LETTER_J_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_s, 0, 0, 0, 0x015D, /* LATIN_SMALL_LETTER_S_WITH_CIRCUMFLEX */

GDK_dead_caron, GDK_U, 0, 0, 0, 0x01D3, /* LATIN_CAPITAL_LETTER_U_WITH_CARON */

GDK_dead_caron, GDK_u, 0, 0, 0, 0x01D4, /* LATIN_SMALL_LETTER_U_WITH_CARON */

Should I submit a patch?

--
Paul "LeoNerd" Evans
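Hand-written rows like these are easy to get wrong by one code point, so it can help to check each value against the character name in its comment. The Python sketch below does that with the standard `unicodedata` module. The helper `check_rows` and the three sample rows are made up for the illustration; the second sample row pairs the capital-G name with 0x011D, the code point of the small letter, so that the check has something to report.

```python
import re
import unicodedata

# Captures the hexadecimal value and the NAME_WITH_UNDERSCORES comment
# from rows shaped like the compose table entries.
ROW_RE = re.compile(r"0x([0-9A-Fa-f]{4}),\s*/\*\s*(\w+)\s*\*/")


def check_rows(text):
    """Return (name, found, expected) for rows whose value and name disagree."""
    problems = []
    for value, name in ROW_RE.findall(text):
        found = int(value, 16)
        # unicodedata.lookup raises KeyError for an unknown character name.
        expected = ord(unicodedata.lookup(name.replace("_", " ")))
        if found != expected:
            problems.append((name, found, expected))
    return problems


SAMPLE_ROWS = """
GDK_dead_circumflex, GDK_C, 0, 0, 0, 0x0108, /* LATIN_CAPITAL_LETTER_C_WITH_CIRCUMFLEX */
GDK_dead_circumflex, GDK_G, 0, 0, 0, 0x011D, /* LATIN_CAPITAL_LETTER_G_WITH_CIRCUMFLEX */
GDK_dead_caron, GDK_u, 0, 0, 0, 0x01D4, /* LATIN_SMALL_LETTER_U_WITH_CARON */
"""

problems = check_rows(SAMPLE_ROWS)
for name, found, expected in problems:
    print("%s: table has U+%04X, Unicode says U+%04X" % (name, found, expected))
```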