It's not a new problem: https://github.com/leo-editor/leo-editor/issues/1368
On Thursday, April 14, 2022 at 2:55:33 AM UTC+2 tbp1...@gmail.com wrote:
>
> There could also be a problem with a specific version of Qt, so if you can
> try a later version (or possibly an earlier one) it might behave differently.
> Supposedly, all Qt widgets and strings work correctly with unicode and/or
> utf-8 encoding.
>
> On Wednesday, April 13, 2022 at 4:14:07 PM UTC-4 tbp1...@gmail.com wrote:
>
>> It looks like, on that particular page, the non-ascii characters are
>> emojis. I copied part of that page with two of the emojis into a Leo node
>> and didn't see any unusual behavior. <Home>, <End> and copying with
>> <CTRL-C> worked as expected. Do you have an example that didn't work right
>> for you?
>>
>> Here's an online checker for non-ascii characters: Non-Ascii Checker
>> <https://pages.cs.wisc.edu/~markm/ascii.html>. You can paste suspect
>> text in or point it to a file.
>>
>> Since Python by default uses utf-8 and unicode, text that isn't encoded
>> in utf-8 could cause problems. The same goes for text that is wrongly
>> encoded, or encoded with some other encoding. Some text editors can figure
>> out the encoding, and you can tell them to save a file in a different one.
>> EditPlus is the one I use for this. Not free but worth the $35. Notepad++
>> also can do it, though I haven't used it.
>>
>> Characters that your font does not have a glyph for might be troublesome
>> too, but I'm not sure. Again, emojis would probably be the most likely
>> culprits if we're not getting into cjk characters, since so many new emojis
>> are getting introduced.
>>
>> If we see the kind of behavior you experienced in properly encoded
>> strings, then for sure we'd have a problem. Unfortunately there is a lot
>> of incorrectly encoded material out there. Hmm, I wonder if Leo should
>> have an encoding checker built in?
>>
>> On Wednesday, April 13, 2022 at 2:48:06 PM UTC-4 SegundoBob wrote:
>>
>>> I don't know if this is a bug or just the way PyQt works, but this is a
>>> very annoying problem. Sometimes HOME takes you to the end of the line
>>> instead of the start. Sometimes select and Ctrl+C copies unselected
>>> characters. The "mistakes" are endless because the displayed cursor
>>> position is not "correct".
>>>
>>> I first noticed this problem in 2022-02 because more and more articles
>>> posted on the Internet contain non-ASCII characters, and every day I copy
>>> many articles to node bodies and then edit them slightly.
>>>
>>> 2022-04-13 Wed I definitely identified the problem with the help of this
>>> command:
>>>
>>> grep --color='auto' -P -n "[^\x00-\x7F]" x.txt
>>>
>>> which I obtained from
>>>
>>> https://stackoverflow.com/questions/3001177/how-do-i-grep-for-all-non-ascii-characters
>>>
>>> Here is an example article containing many non-ASCII characters:
>>>
>>> https://newsletter.pragmaticengineer.com/p/scoop-atlassian
>>>
>>> There are many suggestions on the Internet for removing non-ASCII
>>> characters using Python. So far this is the best workaround that I've come
>>> up with. If we don't come up with a fix or a better workaround, I'll
>>> eventually figure out how to replace non-ASCII characters that have similar
>>> ASCII characters with the appropriate ASCII characters. Someone has
>>> probably implemented this, but so far I have not found it.
>>>
>>> Unfortunately, I have higher priority problems right now that prevent me
>>> from devoting much time to this problem.
>>>
>>> Versions tested:
>>>
>>> Leo 6.6b2-devel, devel branch, build 0ce2fa9ad5
>>> 2022-02-24 09:55:29 -0600
>>> Python 3.8.10, PyQt version 5.12.8
>>> linux
>>> ---------------
>>> Leo 6.6.1-devel, devel branch, build 90bad4f475
>>> 2022-04-13 09:33:47 -0500
>>> Python 3.8.10, PyQt version 5.12.8
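For anyone who wants to try the workaround from Python rather than grep, here is a minimal sketch of the two ideas quoted above: listing the non-ASCII characters in a piece of text, and replacing those that have an obvious ASCII look-alike. The names `find_non_ascii`, `fold_to_ascii` and `PUNCTUATION` are made up for this example and are not part of Leo. The replacement table is a small hand-picked assumption, not a complete mapping. This only cleans up the text before it goes into a node body; it does not fix the cursor behavior itself.

```python
import re
import unicodedata

# Same character class as the grep pattern: anything outside 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]")

# Hypothetical, hand-picked replacements for common typographic characters.
# Extend this table to suit the articles you copy.
PUNCTUATION = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2026": "...",                # horizontal ellipsis
    "\u00a0": " ",                  # no-break space
})


def find_non_ascii(text):
    """Return (line_number, column, character) for each non-ASCII character.

    Line numbers and columns are 1-based; columns count Python characters
    (code points), which is not necessarily how Qt counts positions.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in NON_ASCII.finditer(line):
            hits.append((line_no, match.start() + 1, match.group()))
    return hits


def fold_to_ascii(text, placeholder="?"):
    """Replace non-ASCII characters with ASCII look-alikes where possible.

    Anything without an ASCII equivalent (emojis, for example) becomes
    `placeholder`.
    """
    text = text.translate(PUNCTUATION)
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return NON_ASCII.sub(placeholder, stripped)
```

Usage would be along the lines of `fold_to_ascii(open("x.txt", encoding="utf-8").read())`, assuming the file really is UTF-8. Passing `placeholder=""` drops unmappable characters instead of marking them.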