Re: [arch-general] Opening a document with unicode in path
> "I forgot to generate the locales" will cause this issue. Try running
> `localedef --list-archive` and checking that en_CA.UTF-8 actually exists.
> If not, uncomment it in /etc/locale.gen and run `sudo locale-gen`.

Right on the mark, Mr. Celti, I discovered this mere minutes before your mail. Please just allow me to save my honour by adding the fact that I'm troubleshooting a machine that isn't mine.

--
"That gum you like is going to come back in style."
Re: [arch-general] Opening a document with unicode in path
August 2, 2019 11:10 AM, "Eli Schwartz via arch-general" wrote:

> The ls command will by default escape the character into its numeric
> code if it thinks the character is invalid in your locale. I can get ls
> to print the same thing as you (using shell-escaped $'\303\251') *iff* I
> first export LC_ALL=C (which is not a UTF-8 locale and therefore cannot
> print unicode characters).
>
> This indicates something is wrong with your locale, because at the very
> least, your shell cannot parse the character correctly -- maybe neither
> can libreoffice.

"I forgot to generate the locales" will cause this issue. Try running `localedef --list-archive` and checking that en_CA.UTF-8 actually exists. If not, uncomment it in /etc/locale.gen and run `sudo locale-gen`.

~Celti
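
A rough way to run the same check from Python, offered only as a sketch: on glibc systems, asking the C library to switch to a locale that was never generated fails, and Python reports that as `locale.Error`. The helper name `locale_is_usable` is made up for illustration, and en_CA.UTF-8 is simply the locale discussed in this thread.

```python
import locale


def locale_is_usable(name: str) -> bool:
    """Return True if the C library can actually load the named locale.

    Hypothetical helper for illustration; it only probes LC_CTYPE and
    restores the previous setting before returning.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        # Typically "unsupported locale setting": the locale is not generated.
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


# The "C" locale always exists; en_CA.UTF-8 depends on /etc/locale.gen
# having been uncommented and locale-gen having been run.
print("C:", locale_is_usable("C"))
print("en_CA.UTF-8:", locale_is_usable("en_CA.UTF-8"))
```

If the second line prints False while $LANG names that locale, programs in the session are probably falling back to the C locale, which would match the symptoms described in this thread.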
Re: [arch-general] Opening a document with unicode in path
> The ls command will by default escape the character into its numeric
> code if it thinks the character is invalid in your locale. I can get ls
> to print the same thing as you (using shell-escaped $'\303\251') *iff* I
> first export LC_ALL=C (which is not a UTF-8 locale and therefore cannot
> print unicode characters).
>
> This indicates something is wrong with your locale, because at the very
> least, your shell cannot parse the character correctly -- maybe neither
> can libreoffice.

Man, can't thank you enough. You guided me to the issue.

So, I tried what you said, but I couldn't modify LC_ALL at all - bash was complaining. When I echoed it, I got back en_CA.UTF-8. I started wondering if there had been an issue with the locales since the install, so I figured I'd check /etc/locale.gen and regenerate them, and lo and behold - all locales were commented out. I uncommented en_CA.UTF-8, ran locale-gen, and now both `ls` and libreoffice work correctly.

Thanks, everyone, for your time, both for reading my questions and writing out answers, and for helping me fix this issue.

--
"That gum you like is going to come back in style."
Re: [arch-general] Opening a document with unicode in path
On 8/2/19 1:24 PM, John Z. wrote:
>> Could you verify that the encoding of the filepath is, in fact, UTF8?
>> Filepaths in linux are free to be arbitrary bytes despite the locale
>> settings. Most tools don't care, though I would expect the filepath to
>> display incorrectly in the terminal and file browser if it were not UTF8.
>> So it is probably a long shot but perhaps worth checking.
>
> Hi, thank you for the suggestion. I tried running your script, and all
> filenames are decoded correctly, no exception was thrown (I also tried
> without try/except just in case something else gets thrown)
>
> However, you might be onto something here because, interestingly enough:
> while BASH prompt and autocompletion feature both decode the character
> correctly, `ls` does not and outputs a sequence of escape codes:
>
> Proc'$'\303\251''dures
>
> instead of
>
> Procedures (where first 'e' is the unicode char, and has french accent)

The ls command will by default escape the character into its numeric code if it thinks the character is invalid in your locale. I can get ls to print the same thing as you (using shell-escaped $'\303\251') *iff* I first export LC_ALL=C (which is not a UTF-8 locale and therefore cannot print unicode characters).

This indicates something is wrong with your locale, because at the very least, your shell cannot parse the character correctly -- maybe neither can libreoffice.

--
Eli Schwartz
Bug Wrangler and Trusted User
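
To make the escaping described above concrete, here is a small Python approximation. It is not how ls is implemented (real ls also wraps the result in shell quoting such as '$'...''); it only assumes that a tool limited to ASCII keeps printable ASCII bytes and shows every other byte as a three-digit octal escape. The function name `escape_like_c_locale` is invented for this sketch.

```python
def escape_like_c_locale(name: bytes) -> str:
    """Render a filename roughly the way an ASCII-only tool might.

    Printable ASCII bytes are kept as-is; every other byte becomes a
    backslash followed by three octal digits. Illustrative only.
    """
    out = []
    for byte in name:
        if 0x20 <= byte < 0x7F:
            out.append(chr(byte))
        else:
            out.append(f"\\{byte:03o}")
    return "".join(out)


# UTF-8 bytes of the directory name from this thread (e-acute is C3 A9).
name = b"Proc\xc3\xa9dures"
print(escape_like_c_locale(name))  # Proc\303\251dures
```

The two escaped bytes are exactly the UTF-8 encoding of the accented character, so the filename itself is fine; only the locale used to display it is wrong.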
Re: [arch-general] Opening a document with unicode in path
On 8/2/19 1:48 PM, Chris Billington via arch-general wrote:
> ...
> I do not understand the escape sequences \303\251
> ...

They're octal:

303 (octal) = 011 000 011 (binary) = 0 1100 0011 (binary) = c3 (hex)
251 (octal) = 010 101 001 (binary) = 0 1010 1001 (binary) = a9 (hex)

HTH,
Dan
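
The same arithmetic can be double-checked in Python. This is just a verification sketch; the variable names are arbitrary.

```python
# The two escapes that ls printed, as octal digit strings.
octal_escapes = ["303", "251"]

# Interpret each one as a base-8 number and collect the resulting bytes.
raw = bytes(int(digits, 8) for digits in octal_escapes)

print(raw.hex())                         # c3a9
print([f"{b:08b}" for b in raw])         # ['11000011', '10101001']
print(raw == "\u00e9".encode("utf-8"))   # True: U+00E9 is e with acute accent
```

So \303\251 in the shell and \xc3\xa9 in Python are the same two bytes, written in octal and hexadecimal respectively.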
Re: [arch-general] Opening a document with unicode in path
> What happens if you run the following?
>
> $ echo $'\303\251'
>
> I get the character printing correctly.

Same here, it prints out fine. Terminal is Konsole.

I tried touching a new file with é, and ls again prints the escape sequence, however - trying to `cat` the file by hitting Tab to get the autocompletion list, it prints it correctly there.

I am not entirely sure how to check for locale issues? I know there's an extensive page in the arch wiki which I checked, but I don't think any of the Troubleshooting issues applies. I tried `localectl status` and, I dunno if this is normal, but it prints a different locale than the one set in $LANG.

> echo $LANG
en_CA.UTF-8
> localectl status
System Locale: LANG=en_US.UTF-8
VC Keymap: n/a
X11 layout: n/a

Changing keyboard layouts doesn't change the output of `localectl status`

--
"That gum you like is going to come back in style."
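
One way to reason about the mismatch above: `localectl status` reports the system-wide default, whereas each program looks at the environment of its own process. Assuming the usual POSIX precedence (LC_ALL first, then LC_CTYPE, then LANG, with empty values ignored), the locale name a program will request for character handling can be sketched like this. The function name `effective_ctype` is hypothetical, and the sketch says nothing about whether the named locale has actually been generated.

```python
import os


def effective_ctype(env=None) -> str:
    """Return the locale name that governs character handling.

    Follows the POSIX lookup order LC_ALL > LC_CTYPE > LANG and falls
    back to "C" when none of them is set to a non-empty value.
    """
    env = os.environ if env is None else env
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            return value
    return "C"


# What the current process would ask for:
print(effective_ctype())

# The situation from this mail: the session's LANG names the locale.
print(effective_ctype({"LANG": "en_CA.UTF-8"}))  # en_CA.UTF-8
```

A session can therefore legitimately differ from the system default; the difference only becomes a problem when the requested locale does not exist on the machine.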
Re: [arch-general] Opening a document with unicode in path
> However, you might be onto something here because, interestingly enough:
> while BASH prompt and autocompletion feature both decode the character
> correctly, `ls` does not and outputs a sequence of escape codes:
>

That's interesting. If I run:

touch Proc$'\303\251'dures

and then ls, I get it printing correctly with the accented character. Then if I do an os.listdir(b'.') in Python and look at its raw bytes, they are the same as if I type the character on my keyboard (US keyboard but with a compose key) and encode UTF8. So it looks to me to be UTF8 encoded (I do not understand the escape sequences \303\251 - once in Python I see the two bytes \xc3\xa9 for the character, which is the correct UTF8 encoding but do not map to the numbers in the bash escape sequences).

What happens if you run the following?

$ echo $'\303\251'

I get the character printing correctly.

This could be terminal-dependent behaviour, it works for me in xterm, tilix, alacritty and gnome-terminal. Perhaps if it doesn't work for you in one of these terminals it indicates there is a locale issue deeper than the check you already did to ensure the locale was set correctly.

On Fri, Aug 2, 2019 at 1:36 PM John Z. wrote:

> > Can you determine some steps that exactly reproduce the problem?
> > Assuming that the problem should manifest when opening the file using
> > /usr/bin/loffice /path/to/file, I tried creating a test file and opening
> > it, and it worked:
>
> Hi Eli,
> good idea, I tried following your sequence as well.
>
> I created a directory using `mkdir`, then launched libre office and
> tried to save a file in it. Interesting thing happens: it actually
> creates a directory named 'Proc?dures' instead of the original
> 'Procédures' directory, and saves it in there. I repeated the test
> twice, because the first time around, I was puzzled enough that I
> wasn't sure I actually saved the file.
>
> Furthermore, I copied the file using console into the 'Procédures',
> then opened it using libreoffice, and it opened the one in
> 'Proc?dures' - I know because I updated the file and saved it, and
> the latter one was updated.
>
> The only difference between us is that I'm using `libreoffice`
> launcher command, and you seem to have `loffice`? The package is
> also libreoffice-fresh, package version 6.2.5-1, and `libreoffice
> --version` 6.2.5.2 2@(build: 2)
> The --version in ubuntu, that works, is 6.0.7.3
>
>
> P.S. I am unsure how well Unicode fares in mailing lists, so I
> apologize if there are weird escape sequences in there. I just
> composed it with vim.
>
> --
> "That gum you like is going to come back in style."
>
Re: [arch-general] Opening a document with unicode in path
> There might also be a difference between libreoffice-fresh and
> libreoffice-still which is quite a bit behind fresh.

Hi Gene,
also a good idea, I wasn't even aware of the `libreoffice-still` package. I tried replacing `libreoffice-fresh` with it, and I still get the same error, although with slightly different looking dialog :-(

--
"That gum you like is going to come back in style."
Re: [arch-general] Opening a document with unicode in path
> Can you determine some steps that exactly reproduce the problem?
> Assuming that the problem should manifest when opening the file using
> /usr/bin/loffice /path/to/file, I tried creating a test file and opening
> it, and it worked:

Hi Eli,
good idea, I tried following your sequence as well.

I created a directory using `mkdir`, then launched libre office and tried to save a file in it. Interesting thing happens: it actually creates a directory named 'Proc?dures' instead of the original 'Procédures' directory, and saves it in there. I repeated the test twice, because the first time around, I was puzzled enough that I wasn't sure I actually saved the file.

Furthermore, I copied the file using console into the 'Procédures', then opened it using libreoffice, and it opened the one in 'Proc?dures' - I know because I updated the file and saved it, and the latter one was updated.

The only difference between us is that I'm using `libreoffice` launcher command, and you seem to have `loffice`? The package is also libreoffice-fresh, package version 6.2.5-1, and `libreoffice --version` 6.2.5.2 2@(build: 2)
The --version in ubuntu, that works, is 6.0.7.3

P.S. I am unsure how well Unicode fares in mailing lists, so I apologize if there are weird escape sequences in there. I just composed it with vim.

--
"That gum you like is going to come back in style."
Re: [arch-general] Opening a document with unicode in path
> Could you verify that the encoding of the filepath is, in fact, UTF8?
> Filepaths in linux are free to be arbitrary bytes despite the locale
> settings. Most tools don't care, though I would expect the filepath to
> display incorrectly in the terminal and file browser if it were not UTF8.
> So it is probably a long shot but perhaps worth checking.

Hi, thank you for the suggestion. I tried running your script, and all filenames are decoded correctly, no exception was thrown (I also tried without try/except just in case something else gets thrown)

However, you might be onto something here because, interestingly enough: while BASH prompt and autocompletion feature both decode the character correctly, `ls` does not and outputs a sequence of escape codes:

Proc'$'\303\251''dures

instead of

Procedures (where first 'e' is the unicode char, and has french accent)

>
> The following Python script, run in the directory containing the
> file/directory containing the french character should tell you if it is
> valid UTF8:
>
> import os
> for item in os.listdir(b'.'):
>     try:
>         item.decode('utf8')
>     except UnicodeDecodeError:
>         print(item, "is not valid UTF8")
>         raise
>
> On Fri, Aug 2, 2019 at 12:48 PM Eli Schwartz via arch-general <
> arch-general@archlinux.org> wrote:
>
> > On 8/2/19 8:59 AM, John Z. wrote:
> > > Hi everyone,
> > > there's a document on Dropbox, that has unicode character in its
> > > path (french character). Trying to open this document with libre
> > > office (Plasma is running) fails with 'file not found', and the path
> > > shown with error clearly presents the path with that unicode
> > > character replaced by '??'
> > >
> > > What I tried:
> > > * copy the document in a path where there's no unicode - it opens
> > > * copy the document using shell - it works
> > > * copy the document using Dolphin (from Plasma) - it works
> > > * check $LANG - its set to `en_CA.UTF8`
> > > * search for 'libreoffice unicode path', 'archlinux unicode path'
> > > and plethora of similar search terms - not much came through
> > >
> > > This makes me think the issue is actually with LibreOffice, but the
> > > reason I ask here, and not in their forum, is that on another
> > > computer running Ubuntu - this works without fail, so I'm fairly
> > > certain the issue is in some local configuration.
> > >
> > > Could anyone shed some light on this, please, or at least point me
> > > in some direction where I could look?
> >
> > Can you determine some steps that exactly reproduce the problem?
> > Assuming that the problem should manifest when opening the file using
> > /usr/bin/loffice /path/to/file, I tried creating a test file and opening
> > it, and it worked:
> >
> > $ mkdir -p '/tmp/unicode paths are /'
> > $ touch '/tmp/unicode paths are /testfile.txt'
> > $ loffice '/tmp/unicode paths are /testfile.txt'
> > $
> >
> > I could successfully edit this file in libreoffice, save content, or
> > reopen it.
> > Tested with LANG=en_US.UTF-8 and the libreoffice-fresh package
> >
> > --
> > Eli Schwartz
> > Bug Wrangler and Trusted User
> >
> >

--
"That gum you like is going to come back in style."
Re: [arch-general] Opening a document with unicode in path
Could you verify that the encoding of the filepath is, in fact, UTF8? Filepaths in linux are free to be arbitrary bytes despite the locale settings. Most tools don't care, though I would expect the filepath to display incorrectly in the terminal and file browser if it were not UTF8. So it is probably a long shot but perhaps worth checking.

The following Python script, run in the directory containing the file/directory containing the french character should tell you if it is valid UTF8:

import os
for item in os.listdir(b'.'):
    try:
        item.decode('utf8')
    except UnicodeDecodeError:
        print(item, "is not valid UTF8")
        raise

On Fri, Aug 2, 2019 at 12:48 PM Eli Schwartz via arch-general <
arch-general@archlinux.org> wrote:

> On 8/2/19 8:59 AM, John Z. wrote:
> > Hi everyone,
> > there's a document on Dropbox, that has unicode character in its
> > path (french character). Trying to open this document with libre
> > office (Plasma is running) fails with 'file not found', and the path
> > shown with error clearly presents the path with that unicode
> > character replaced by '??'
> >
> > What I tried:
> > * copy the document in a path where there's no unicode - it opens
> > * copy the document using shell - it works
> > * copy the document using Dolphin (from Plasma) - it works
> > * check $LANG - its set to `en_CA.UTF8`
> > * search for 'libreoffice unicode path', 'archlinux unicode path'
> > and plethora of similar search terms - not much came through
> >
> > This makes me think the issue is actually with LibreOffice, but the
> > reason I ask here, and not in their forum, is that on another
> > computer running Ubuntu - this works without fail, so I'm fairly
> > certain the issue is in some local configuration.
> >
> > Could anyone shed some light on this, please, or at least point me
> > in some direction where I could look?
>
> Can you determine some steps that exactly reproduce the problem?
> Assuming that the problem should manifest when opening the file using
> /usr/bin/loffice /path/to/file, I tried creating a test file and opening
> it, and it worked:
>
> $ mkdir -p '/tmp/unicode paths are /'
> $ touch '/tmp/unicode paths are /testfile.txt'
> $ loffice '/tmp/unicode paths are /testfile.txt'
> $
>
> I could successfully edit this file in libreoffice, save content, or
> reopen it.
> Tested with LANG=en_US.UTF-8 and the libreoffice-fresh package
>
> --
> Eli Schwartz
> Bug Wrangler and Trusted User
>
>
Re: [arch-general] Opening a document with unicode in path
On 8/2/19 8:59 AM, John Z. wrote:
> Hi everyone,
> there's a document on Dropbox, that has unicode character in its
> path (french character). Trying to open this document with libre
> office (Plasma is running) fails with 'file not found', and the path
> shown with error clearly presents the path with that unicode
> character replaced by '??'
>
> What I tried:
> * copy the document in a path where there's no unicode - it opens
> * copy the document using shell - it works
> * copy the document using Dolphin (from Plasma) - it works
> * check $LANG - its set to `en_CA.UTF8`
> * search for 'libreoffice unicode path', 'archlinux unicode path'
> and plethora of similar search terms - not much came through
>
> This makes me think the issue is actually with LibreOffice, but the
> reason I ask here, and not in their forum, is that on another
> computer running Ubuntu - this works without fail, so I'm fairly
> certain the issue is in some local configuration.
>
> Could anyone shed some light on this, please, or at least point me
> in some direction where I could look?

Can you determine some steps that exactly reproduce the problem? Assuming that the problem should manifest when opening the file using /usr/bin/loffice /path/to/file, I tried creating a test file and opening it, and it worked:

$ mkdir -p '/tmp/unicode paths are /'
$ touch '/tmp/unicode paths are /testfile.txt'
$ loffice '/tmp/unicode paths are /testfile.txt'
$

I could successfully edit this file in libreoffice, save content, or reopen it.
Tested with LANG=en_US.UTF-8 and the libreoffice-fresh package

--
Eli Schwartz
Bug Wrangler and Trusted User
Re: [arch-general] Opening a document with unicode in path
On 8/2/19 12:23 PM, John Z. wrote:
...
>> I don't have a direct answer, but check the version(s) of LibreOffice,

There might also be a difference between libreoffice-fresh and libreoffice-still which is quite a bit behind fresh.
Re: [arch-general] Opening a document with unicode in path
> Good jump on the research.

I try to do what I can, before asking other people to spend their time on me :-)

> I don't have a direct answer, but check the version(s) of LibreOffice,
> Dropbox, and possibly some of the other packages you've already
> mentioned. Perhaps your issue is something that's been fixed by a newer
> package in Ubuntu and not yet fixed in the corresponding package in Arch
> (or something that's been broken by a newer package in Arch and not yet
> merged into Ubuntu). Maybe a release note or a patch will jump out at
> you.

That is a solid idea! I'll see if maybe I can find a version mismatch and downgrade accordingly. I already checked Libreoffice's bugtracker for this (which would indicate there's a patch incoming), but haven't found any entries, so I'll file one.

Thank you.

--
"That gum you like is going to come back in style."
[arch-general] Does Evolution or Claws work for other Arch Linux users?
Hi,

both MUAs Evolution and Claws are broken on my machine.

I tried to at least work around the Evolution issue, but this was a naive miscalculation.

I ensured that icu 64 is available for other apps, that icu 63 is available to build and run evolution, and that the sonames are linked against icu 63 [1]. I checked that icu 63 from an old package isn't broken [2].

After checking out Evolution 3.32.4 and increasing the pkgrel to 1.1 [3], I tried building evolution-data-server and got

[rocketmouse@archlinux extra-x86_64]$ makepkg -s
==> Making package: evolution-data-server 3.32.4-1.1 (Fri 02 Aug 2019 04:41:43 PM CEST)
[snip]
[ 25%] Linking C shared library libcamel-1.2.so
/usr/bin/ld: CMakeFiles/camel.dir/camel-net-utils.c.o: in function `camel_host_idna_to_ascii':
camel-net-utils.c:(.text+0x15b): undefined reference to `u_strFromUTF8_64'
/usr/bin/ld: camel-net-utils.c:(.text+0x1a2): undefined reference to `u_strFromUTF8_64'
/usr/bin/ld: camel-net-utils.c:(.text+0x1ec): undefined reference to `uidna_IDNToASCII_64'
/usr/bin/ld: camel-net-utils.c:(.text+0x289): undefined reference to `u_strToUTF8_64'
/usr/bin/ld: camel-net-utils.c:(.text+0x2ca): undefined reference to `u_strToUTF8_64'
collect2: error: ld returned 1 exit status
make[2]: *** [src/camel/CMakeFiles/camel.dir/build.make:1941: src/camel/libcamel-1.2.so.62.0.0] Error 1
make[1]: *** [CMakeFiles/Makefile2:2958: src/camel/CMakeFiles/camel.dir/all] Error 2
make: *** [Makefile:141: all] Error 2
==> ERROR: A failure occurred in build().
    Aborting...

Obviously it's not as simple as just providing 2 versions of icu ;).

The reasons that I try to build Evolution 3.32.4 against icu 63 are

1. a bug that prevents access to the contacts, see https://bugs.archlinux.org/task/62317 .

2. I'm using the downgraded packages of version 3.32.0-1 against the provided icu 63 libs. This works without issues, if I write emails in (broken) English.
Unfortunately, sent emails containing umlauts and/or the "Eszett" show "??", instead of umlauts and/or the "Eszett", while editing the message with those "special characters" works. Btw. the stored emails still contain the umlauts, just received mails are missing them.

3. Claws on Arch Linux is one of those apps that suffers from the well known "!xcb_xlib_threads_sequence_lost" issue, IOW my alternative MUA claws is completely broken, since it crashes all the time. The same version(s) built against dependencies of a very old Ubuntu 16.04 LTS release is (were) stable, using identical build options. Claws upstream know about it, see

"Walter Lapchynski 2019-05-05 07:06:45 CEST

I just wanted to add a note that there's a similar bug in Ubuntu that is affecting PCManFM. It seems there are several other bugs against other packages that all feature the same behavior: crashing with the "Assertion `!xcb_xlib_threads_sequence_lost' failed" error. The other thing they have in common, like Claws: GTK2. There were applications (e.g SpaceFM, LibreOffice) that also had GTK3 support and when compiled with it, worked without problem.

https://bugs.launchpad.net/ubuntu/+source/pcmanfm/+bug/1782984

That said, I'm pretty sure GTK2 is to blame, but not sure how to actually solve it or where exactly to look to find the answers." - https://www.thewildbeast.co.uk/claws-mail/bugzilla/show_bug.cgi?id=4203#c10

https://bugs.launchpad.net/ubuntu/+source/gtk+2.0/+bug/1808710

FWIW Evolution upstream also knows about the Evolution icu issue, see https://mail.gnome.org/archives/evolution-list/2019-August/msg00032.html .
Regards,
Ralf

[1]
[root@archlinux lib]# ls -go libicu*so
lrwxrwxrwx 1 18 Aug 2 16:06 libicudata.so -> libicudata.so.63.1
lrwxrwxrwx 1 18 Aug 2 16:06 libicui18n.so -> libicui18n.so.63.1
lrwxrwxrwx 1 16 Aug 2 16:06 libicuio.so -> libicuio.so.63.1
lrwxrwxrwx 1 18 Aug 2 16:06 libicutest.so -> libicutest.so.63.1
lrwxrwxrwx 1 16 Aug 2 16:06 libicutu.so -> libicutu.so.63.1
lrwxrwxrwx 1 16 Aug 2 16:06 libicuuc.so -> libicuuc.so.63.1
[root@archlinux lib]# ls -go libicu*so*63*
lrwxrwxrwx 1 18 Oct 24 2018 libicudata.so.63 -> libicudata.so.63.1
-rwxr-xr-x 1 27185816 Oct 24 2018 libicudata.so.63.1
lrwxrwxrwx 1 18 Oct 24 2018 libicui18n.so.63 -> libicui18n.so.63.1
-rwxr-xr-x 1 2996296 Oct 24 2018 libicui18n.so.63.1
lrwxrwxrwx 1 16 Oct 24 2018 libicuio.so.63 -> libicuio.so.63.1
-rwxr-xr-x 155064 Oct 24 2018 libicuio.so.63.1
lrwxrwxrwx 1 18 Oct 24 2018 libicutest.so.63 -> libicutest.so.63.1
-rwxr-xr-x 176912 Oct 24 2018 libicutest.so.63.1
lrwxrwxrwx 1 16 Oct 24 2018 libicutu.so.63 -> libicutu.so.63.1
-rwxr-xr-x 1 211488 Oct 24 2018 libicutu.so.63.1
lrwxrwxrwx 1 16 Oct 24 2018 libicuuc.so.63 -> libicuuc.so.63.1
-rwxr-xr-x 1 1890072 Oct 24 2018 libicuuc.so.63.1
[root@archlinux lib]# ls -go libicu*so*64*
lrwxrwxrwx 1 18 Apr 23 18:25 libicudata.so.64 -> libicudata.so.64.2
-rwxr-xr-x 1 27538072 Apr 23 18:25 libicudata.so.64.2
lrwxrwxrwx 1 18 Apr 23
Re: [arch-general] Opening a document with unicode in path
On 8/2/19 8:59 AM, John Z. wrote:
> This makes me think the issue is actually with LibreOffice, but the
> reason I ask here, and not in their forum, is that on another computer
> running Ubuntu - this works without fail, so I'm fairly certain the
> issue is in some local configuration.

Good jump on the research.

> Could anyone shed some light on this, please, or at least point me in
> some direction where I could look?

I don't have a direct answer, but check the version(s) of LibreOffice, Dropbox, and possibly some of the other packages you've already mentioned. Perhaps your issue is something that's been fixed by a newer package in Ubuntu and not yet fixed in the corresponding package in Arch (or something that's been broken by a newer package in Arch and not yet merged into Ubuntu). Maybe a release note or a patch will jump out at you.