Re: Your presentation on LibreOffice code

2020-03-21 Thread Jan-Marek Glogowski
On 21.03.20 at 18:02, Arvind Kumar wrote:
> On Fri, 20 Mar 2020 Jan-Marek Glogowski wrote:
> 
>> Hmm - I know fcitx uses some kind of tables for the direct mappings. My
>> Debian has fcitx-table-emoji. I guess that would be the easiest starting
>> point, if your language's typed letters don't depend on already existing
>> previous or next letters and just need a key to code point mapping.
> 
> There are two separate issues here - keyboard input and display of the
> glyph. Leaving aside the input mechanism for the moment, and assuming that I
> have done what you suggest, I'd like to understand the code dealing with
> the display mechanism in LO. This is because even if some external
> method did the input mappings and the keycode came into LO as a result
> of those mappings, the problem here is that although everything works
> fine in the case of copy-paste, it is not the same with keyboard input.

[more stuff that I think is not relevant here. But I have only limited
knowledge of the whole IM stuff, as I have only implemented IM handling in
vcl/qt5]

From my POV it's just a single issue. I think you are trying to solve a
non-existent problem. LO uses UTF-16 as its internal string
representation (OUString), which also stores its length explicitly. And you
already wrote that copy and paste of your text works.
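
To make the UTF-16 point concrete: a code point above U+FFFF does not fit
into a single 16-bit unit and is stored as a surrogate pair. The following
is a small Python sketch for illustration only, not LO code; the helper name
utf16_code_units is made up, and 1051531 is the PUA value used as an example
elsewhere in this thread.

```python
import struct

def utf16_code_units(code_point: int) -> list:
    """Return the UTF-16 code units (16-bit values) for one code point."""
    raw = chr(code_point).encode("utf-16-be")
    # Each code unit is a big-endian unsigned 16-bit integer.
    return list(struct.unpack(f">{len(raw) // 2}H", raw))

bmp_units = utf16_code_units(0x61)      # 'a' fits into one unit
pua_units = utf16_code_units(1051531)   # U+100B8B needs a surrogate pair

print([hex(u) for u in bmp_units])
print([hex(u) for u in pua_units])
```

So a UTF-16 string can carry such a character, but only as two units that
have to stay together.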

And X11 based input methods aren't restricted to either virtual or real
key codes, but can also insert "text". For some IM key events, namely these
composed input cases, you'll get a text value with length > 1. Commit
1c6ea413cb01c0b482f87e625ba208eec230618a ("tdf#130071 tdf#127815 qt5:
Use ExtTextInput only for multi-char key events") has some more
explanations for the recent fix to the Qt5 backend.
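
A rough sketch of that distinction, in Python for illustration only. This is
not the Qt5 backend code: classify_key_event and the two labels are invented
names, and measuring the length in UTF-16 code units is an assumption about
what "length" means here.

```python
def utf16_length(text: str) -> int:
    """Number of UTF-16 code units needed to store text."""
    return len(text.encode("utf-16-le")) // 2

def classify_key_event(text: str) -> str:
    """Hypothetical dispatch: multi-unit text takes the text-input path."""
    return "ext_text_input" if utf16_length(text) > 1 else "key_input"

print(classify_key_event("a"))            # plain key press
print(classify_key_event(chr(1051531)))   # one character, two UTF-16 units
```

Under that assumption, a single character above U+FFFF already counts as
"text with length > 1".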

I tried to find a good architecture overview of X11 / Xorg, input
methods and keyboard handling, but couldn't find any, except references
to a book (Xlib Programming Manual). But maybe the compose key
handling is sufficient as a start for your case[1].

As far as I know, there is no need to change anything in LO or even Gtk.
You "just" have to develop a "plugin" for your preferred IM system. I
have no knowledge about that, so it's best to contact the people behind your
preferred IM implementation.

HTH

Jan-Marek

[1]
https://wiki.archlinux.org/index.php/Xorg/Keyboard_configuration#Configuring_compose_key
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: Your presentation on LibreOffice code

2020-03-21 Thread Arvind Kumar
On Fri, 20 Mar 2020 Jan-Marek Glogowski wrote:

> Hmm - I know fcitx uses some kind of tables for the direct mappings. My
> Debian has fcitx-table-emoji. I guess that would be the easiest starting
> point, if your language's typed letters don't depend on already existing
> previous or next letters and just need a key to code point mapping.

There are two separate issues here - keyboard input and display of the glyph.
Leaving aside the input mechanism for the moment, and assuming that I have done
what you suggest, I'd like to understand the code dealing with the display
mechanism in LO. This is because even if some external method did the input
mappings and the keycode came into LO as a result of those mappings, the
problem here is that although everything works fine in the case of copy-paste,
it is not the same with keyboard input.

In the case of keyboard input, keycodes with a value above 65535 get
truncated to 16 bits when they pass through the various layers of functions
that handle the codes. The PUAs I use have values greater than 65535.

As an example, the values of keyval and aOrigCode in the arguments of
GtkSalFrame::doKeyCallback are both 97 when you type the letter 'a' on the
standard keyboard. Printing the individual elements of the array pStr in
CommonSalLayout::LayoutText, you see the value 97 printed there. Now change the
97 to a PUA value in doKeyCallback (e.g. 1051531) and the corresponding value
printed in LayoutText is the truncated one (2955 instead of 1051531). 2955 is
what you get when an integer containing 1051531 is written into a 16-bit
short and printed.

I also see that uInt16 is used in many places in the code.
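
The truncation arithmetic can be reproduced outside LO. This is a minimal
Python sketch, for illustration only; ctypes is used here merely to mimic
storing a value in a 16-bit integer, it is not what LO does internally.

```python
import ctypes

code_point = 1051531   # 0x100B8B, the PUA value from the example above

# Storing the value in a 16-bit integer keeps only the low 16 bits.
as_uint16 = ctypes.c_uint16(code_point).value
as_int16 = ctypes.c_int16(code_point).value
masked = code_point & 0xFFFF

print(hex(code_point), as_uint16, as_int16, masked)
```

All three results are 2955 (0x0B8B), i.e. the leading 0x10 of 0x100B8B is
simply dropped.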

At this point, I just want to understand the flow. I'm not suggesting that LO
make any change. Where in the code do the key values get handled as they are
typed in, and where in the code do they get mapped to the value needed for
displaying the glyph? I assume the value for display will be encoded in UTF-8.
I'd like to know where in the source code that happens as well.

> Yup. No LO changes needed, unless you find some bug.
I'm definitely not suggesting changes, but am trying to understand the code as
I explained above. However, I would also not rule out the possibility that the
copy-paste part of the code works well because it correctly reads the UTF-8
encoded values of the codepoints expected by the font file, while the keyboard
input results in these values becoming incorrect as they pass through various
layers of the program. I just want to know what these layers are.

> I'm not sure I understand you. Is this a Gtk-only problem, so that qt5 or kf5
> work? I'm not aware of any restriction regarding file names. Sure, Gtk+
> and Qt5 default to UTF-8 encoding, but that should just work. Or do they
> reject PUA code points (which IMHO makes sense, because a filename has
> no font)?

Not sure about other systems, but GNOME restricts to valid Unicode values. It
does not reject PUA, but it rejects 32-bit values encoded in UTF-8. I wrote my
own UTF-8 encoding mechanism that would take 32-bit values, but some GNOME
functions fail, which is why I mapped my coding system to PUAs. As far as this
discussion of LO's functionality is concerned, it is only related to PUA values.
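
A sketch of that difference, in Python for illustration only.
encode_utf8_legacy is a hypothetical helper that follows the original UTF-8
scheme of up to six bytes (31 bits). Python's strict UTF-8 decoder stands in
for a validating library here; that any particular GNOME function behaves
exactly the same way is an assumption.

```python
def encode_utf8_legacy(cp: int) -> bytes:
    """Encode cp with the original UTF-8 scheme (1 to 6 bytes, up to 31 bits)."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if cp < 0x80:
        return bytes([cp])
    # (payload bits, total bytes, lead-byte prefix) for the multi-byte forms
    forms = ((11, 2, 0xC0), (16, 3, 0xE0), (21, 4, 0xF0), (26, 5, 0xF8), (31, 6, 0xFC))
    length, prefix = next((n, p) for bits, n, p in forms if cp < (1 << bits))
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (cp & 0x3F))   # continuation byte: 10xxxxxx
        cp >>= 6
    return bytes([prefix | cp] + tail[::-1])

def is_valid_utf8(data: bytes) -> bool:
    """True if a strict, standards-conforming decoder accepts data."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

pua = encode_utf8_legacy(0x100B8B)      # PUA code point, inside the Unicode range
beyond = encode_utf8_legacy(0x110000)   # first value past U+10FFFF
print(pua.hex(), is_valid_utf8(pua))
print(beyond.hex(), is_valid_utf8(beyond))
```

The PUA value encodes to the same bytes a standard encoder produces and
passes validation; the value beyond U+10FFFF is well-formed under the old
scheme but is rejected by a strict decoder.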

> From the filesystem POV it's all just bytes. 

This is not related to LO, but this is where many GNOME libraries impose the
restriction. They do not follow the filesystem's view of filenames being just
bytes. If you try using a g_filesystem* function and pass a filename containing
a character which is not approved by the Unicode Consortium, it will fail. GNOME
is not agnostic to the various standards out there but follows the standards set
by some organizations. Of course, in those cases, I just use fopen or related
calls.

-a



Re: Your presentation on LibreOffice code

2020-03-20 Thread Jan-Marek Glogowski
Sorry for the late reply. Had some mail problems.

On 16.03.20 at 22:20, Arvind Kumar wrote:
> Jan-Marek Glogowski <glo...@fbihome.de> wrote:
> 
>> If you want to generate single glyphs from multiple keystrokes, then you
>> should have a look into input method handling (IM), like ibus or fcitx,
>> which is normally used to type complex-glyph based languages, like Chinese.
> 
> I know this is outside LO, but is this as easy as editing a file and
> adding my mapping, and if so, is there an example I can look at?

Hmm - I know fcitx uses some kind of tables for the direct mappings. My
Debian has fcitx-table-emoji. I guess that would be the easiest starting
point, if your language's typed letters don't depend on already existing
previous or next letters and just need a key to code point mapping.
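
As a rough illustration of such a direct mapping, here is a table-driven
translator with longest-match lookup, in Python for illustration only. This
is not fcitx's table format: the key sequences, the code points and the
names KEY_TABLE and translate are invented for the example.

```python
# Hypothetical table: typed key sequence -> Unicode code point.
KEY_TABLE = {
    "k": 0x100B8B,
    "ka": 0x100B8C,
    "kh": 0x100B8D,
}
MAX_SEQ = max(len(seq) for seq in KEY_TABLE)

def translate(keys: str) -> str:
    """Replace known key sequences by their mapped characters, longest match first."""
    out = []
    i = 0
    while i < len(keys):
        for size in range(min(MAX_SEQ, len(keys) - i), 0, -1):
            cp = KEY_TABLE.get(keys[i:i + size])
            if cp is not None:
                out.append(chr(cp))
                i += size
                break
        else:
            out.append(keys[i])   # unmapped key passes through unchanged
            i += 1
    return "".join(out)

result = translate("kakhx")
print([hex(ord(ch)) for ch in result])
```

A pure lookup like this only works under the condition stated above: the
result for a key sequence must not depend on what was typed before or after.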

>> Hard to say whether this is a general problem of your font, a bug in LO, or
>> just caused by your changes to the VCL gtk3 plugin key handling code in LO.
>>
>> If you have some other working example document, like a UTF-8 encoded
>> text file, which you know is displayed correctly in some Gtk
>> application, then you could copy and paste that text into Writer and
>> then select your font. That should already work, without any code changes.
> 
> I just tested this and it works very well and correctly shows my text!
> So it now comes down to the input mechanism and making it work for the
> keystrokes. LayoutText is not the right place?

Yup. No LO changes needed, unless you find some bug.

[some unicode politics, I can't do anything about]

> Another problem is that even GTK's code tests for unicode
> compatibility and will not accept "non-standard" strings, for example,
> file names not recognized as unicode compatible.

I'm not sure I understand you. Is this a Gtk-only problem, so that qt5 or kf5
work? I'm not aware of any restriction regarding file names. Sure, Gtk+
and Qt5 default to UTF-8 encoding, but that should just work. Or do they
reject PUA code points (which IMHO makes sense, because a filename has
no font)?

From the filesystem POV it's all just bytes. Encoding depends on your
locale, like C.UTF-8. There is
https://bugs.documentfoundation.org/show_bug.cgi?id=125995 as a result
of this IMHO sane UTF-8 default.


Re: Your presentation on LibreOffice code

2020-03-16 Thread Arvind Kumar
Jan-Marek Glogowski wrote:
> If you want to generate single glyphs from multiple keystrokes, then you
> should have a look into input method handling (IM), like ibus or fcitx,
> which is normally used to type complex-glyph based languages, like Chinese.

I know this is outside LO, but is this as easy as editing a file and adding my
mapping, and if so, is there an example I can look at?

> Hard to say whether this is a general problem of your font, a bug in LO, or
> just caused by your changes to the VCL gtk3 plugin key handling code in LO.
>
> If you have some other working example document, like a UTF-8 encoded
> text file, which you know is displayed correctly in some Gtk
> application, then you could copy and paste that text into Writer and
> then select your font. That should already work, without any code changes.

I just tested this and it works very well and correctly shows my text! So it
now comes down to the input mechanism and making it work for the keystrokes.
LayoutText is not the right place?

> Maybe try contacting people from either unicode.org or icu-project.org?
> They should eventually be able to assign some code point ranges for your
> language / glyphs, so other programs can also represent your glyphs
> correctly. I have no idea how the Unicode people work, but especially
> if you have an actively used language, that would be my way to go.

Unicode is the problem! In the early 1990s, UTF-8 started off as a 31-bit
encoding scheme. IEEE came up with a 32-bit scheme. Unicode should have
done the same thing. It was well intentioned when it started off and was
supposed to accommodate all languages. Today, the Unicode Consortium is run by
people from large firms who, over time, restricted Unicode to 16 and 20 bits,
and there is no space for every character. Of course, Unicode neatly fits in
with the ancient DBCS technology of Microsoft. Another problem is that even
GTK's code tests for Unicode compatibility and will not accept "non-standard"
strings, for example, file names not recognized as Unicode compatible.

Thanks!
-a


Re: Your presentation on LibreOffice code

2020-03-16 Thread Jan-Marek Glogowski
Hi Arvind,

On 15.03.20 at 17:15, Arvind Kumar wrote:

[some text about modifying the key handling in gtk3]

I don't think you're on the right track. In theory you should be able to
create a keyboard mapping with your special characters, so the keypress
is translated to the correct key code. If you want to generate single
glyphs from multiple keystrokes, then you should have a look into input
method handling (IM), like ibus or fcitx, which is normally used to type
complex-glyph based languages, like Chinese.

OTOH, since you're using Private Use Area (PUA) codes, this won't work
well if other software doesn't use your font, as PUAs - by definition
- can't have fallbacks in other fonts, because they are font specific. But it
should work just fine with any LO document type, if you select your
font, which already shows up in LO.
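
For reference, whether a code point is private-use can be checked
mechanically. This is a small Python sketch for illustration only; the
helper name is_private_use is made up, and 0x100B8B is just an example
value from Supplementary Private Use Area-B.

```python
import unicodedata

# The three Private Use ranges defined by Unicode.
PUA_RANGES = (
    (0xE000, 0xF8FF),       # Private Use Area (in the BMP)
    (0xF0000, 0xFFFFD),     # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),   # Supplementary Private Use Area-B
)

def is_private_use(cp: int) -> bool:
    """True if cp lies in one of the Private Use ranges."""
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)

for cp in (0x61, 0xE000, 0x100B8B):
    # General category "Co" is how the Unicode database marks private-use characters.
    print(hex(cp), is_private_use(cp), unicodedata.category(chr(cp)))
```

Only the first range fits into a single 16-bit value; the two supplementary
ranges lie above U+FFFF.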

> The other problem I had when I hard coded the value of nGlyphIndex is
> that the cursor does not move all the way to the end of the glyph but is
> in the center of the glyph so that the next keystroke results in an
> overlap of the right half of the glyph with the left half of the next
> glyph. This is the case only with the glyph in the font file I
> generated. Note that I have used my fontfile with other programs written
> using Gtk+ and they work fine.

Hard to say whether this is a general problem of your font, a bug in LO, or
just caused by your changes to the VCL gtk3 plugin key handling code in LO.

If you have some other working example document, like a UTF-8 encoded
text file, which you know is displayed correctly in some Gtk
application, then you could copy and paste that text into Writer and
then select your font. That should already work, without any code changes.

If not, please open a bug report at https://bugs.documentfoundation.org/

> Any ideas would be appreciated.

Quoting your old mail:

> This is needed for some Indian languages.

Maybe try contacting people from either unicode.org or icu-project.org?
They should eventually be able to assign some code point ranges for your
language / glyphs, so other programs can also represent your glyphs
correctly. I have no idea how the Unicode people work, but especially
if you have an actively used language, that would be my way to go.

HTH

Jan-Marek


Re: Your presentation on LibreOffice code

2020-03-15 Thread Arvind Kumar
Hi Miklos,

After digging in some more, I was able to track down the hardware key events to
gtk3gtkframe.cxx in vcl/unx/gtk3. Hopefully this information is useful for
others looking for the same information. Within the doKeyCallback function, I
set mnCharCode in the SalKeyEvent object to the value I want, and this value is
reflected in the LayoutText function of CommonSalLayout.cxx.

Coming to the actual display, I faced two problems. I was able to display my
glyph by hard coding the value of nGlyphIndex in LayoutText, but that results in
every single character in LibreOffice printing my glyph (on every menu item,
label, etc.). I need to figure out how to display my glyph based on the value
passed into the LayoutText function, by mapping this value to the value of
nGlyphIndex. The mapping is the easy part, but identifying the control flow
is where things seem to get tricky. Print statements do not seem to help, as
they get printed multiple times for a single keystroke (18 times) and I am
unsure which invocation in the loop is the relevant one. An attempt to
insert an if condition based on the value I passed seemed to have no effect.

The other problem I had when I hard coded the value of nGlyphIndex is that the 
cursor does not move all the way to the end of the glyph but is in the center 
of the glyph so that the next keystroke results in an overlap of the right half 
of the glyph with the left half of the next glyph. This is the case only with 
the glyph in the font file I generated. Note that I have used my fontfile with 
other programs written using Gtk+ and they work fine.

Any ideas would be appreciated.
-a

On Monday, February 17, 2020, 2:36:57 AM CST, Miklos Vajna wrote:

> Hi Arvind,
>
> Try 'git grep key_press vcl/source/', and keep in mind that if you type
> into a Writer document, Calc cell or Impress shape, those are all
> "custom widgets" in terms of gtk3.
>
> Regards,
>
> Miklos


Re: Your presentation on LibreOffice code

2020-02-17 Thread Miklos Vajna
Hi Arvind,

Try 'git grep key_press vcl/source/', and keep in mind that if you type
into a Writer document, Calc cell or Impress shape, those are all
"custom widgets" in terms of gtk3.

Regards,

Miklos




Re: Your presentation on LibreOffice code

2020-02-14 Thread Miklos Vajna
Hi Arvind,

[ Please don't drop the list from CC, you may get valuable input from
others. ]

On Thu, Feb 13, 2020 at 12:22:18PM -0600, Arvind Kumar wrote:
> What I'm trying is this: I have created a fontfile with additional
> characters (non-standard codes - I use the Private Use Area of Unicode)
> that should get represented in the document when a certain combination of
> keys is pressed. This is needed for some Indian languages and the absence
> of such a technique is why Indian language computers do not exist.
> 
> I have written a few applications using Gtk that achieve the above by
> trapping the keypress and keyrelease events. I want to now do this in
> LibreOffice so I can use it for my own document creation and printing. My
> fontfile shows up in the drop down menu, but I need it to use my glyph in
> response to the keycodes sent by a non-standard keyboard (I have a 121 key
> keyboard in which I can assign custom HID codes).
> 
> If you can tell me which files I need to specifically edit to achieve this,
> that would be perfect!

Perhaps what you want is to tweak/extend the code that maps an OUString
to a list of glyphs, i.e. the vcl text layout.

GenericSalLayout::LayoutText() at vcl/source/gdi/CommonSalLayout.cxx:286
does that using HarfBuzz; I believe standard gtk apps use that in the
end as well.
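
Conceptually, that layout step turns code points into positioned glyphs.
The toy Python sketch below is not HarfBuzz or LO code: TOY_FONT, its glyph
indices and its advance widths are invented, and real shaping does much more
(ligatures, reordering, kerning). It only shows a lookup from code point to
glyph index, with glyph 0 as the conventional "missing glyph", and a pen
position that moves on by each glyph's advance.

```python
# Hypothetical font data: code point -> (glyph index, advance width in font units).
TOY_FONT = {
    0x61: (68, 500),
    0x100B8B: (301, 1200),
}
NOTDEF = (0, 600)   # glyph 0 conventionally means "missing glyph"

def layout(text: str) -> list:
    """Return (glyph index, x position) pairs for text, one glyph per code point."""
    pen_x = 0
    glyphs = []
    for ch in text:   # Python iterates over code points, not UTF-16 units
        glyph, advance = TOY_FONT.get(ord(ch), NOTDEF)
        glyphs.append((glyph, pen_x))
        pen_x += advance   # the next glyph starts after this one's advance
    return glyphs

print(layout("a" + chr(0x100B8B) + "a"))
print(layout(chr(0x100B8B & 0xFFFF)))   # a truncated code point misses the table
```

In this toy model a code point that arrives truncated no longer matches the
font's table and falls back to glyph 0.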

Regards,

Miklos




Re: Your presentation on LibreOffice code

2020-02-13 Thread Miklos Vajna
Hi Arvind,

On Wed, Feb 12, 2020 at 06:41:24PM -0600, Arvind Kumar wrote:
> I came across your presentation on the code structure of LibreOffice source
> code.
> https://libocon.org/assets/Conference/Rome/Slides/beginners-structure-locon-rome-2k17.pdf
> 
> I was able to download the code and compile it on my Linux machine. I'm now
> trying to make sense of the code and wish to know the following. If you
> don't mind, I have three questions.
> 
> (1) Are the pages in a LibreOffice document GtkTextView objects? If so,
> where is the code for the creation of the TextView object?

git grep GtkTextView vcl/

should give you some hints, the gtk3 case either creates widgets using
.ui files ("welded" case) or using the VclBuilder (non-welded case).

> (2) Where is the code to get the input from the keyboard and then set the
> text in the TextView object?

Keyboard input is typically handled by the KeyInput() virtual member
function of vcl::Window subclasses. See include/vcl/weld.hxx for the
interface that is an abstraction on top of the gtk3 and vcl (non-gtk3)
cases.

> (3) Where is the code to save the text from the TextView object into a file?

A typical Writer/Calc/Impress document content is presented using custom
widgets in gtk3 terms, so it's rare that a GtkTextView content would be
really saved to a file as-is.

> By understanding the code for the creation, read, and write operations, it
> will get me started off since that is the core functionality of a document
> writer.

Perhaps ask something more specific, so you can get specific answers.
:-)

Regards,

Miklos

