Hello,

I tried to make my Perl 5 code Unicode-compliant after reading a post on Stack Overflow[1].
As suggested in the post: “always run incoming stuff through NFD and outbound stuff from NFC.” I had a hard time figuring out why my Test::More tests were failing while displaying exactly the same strings for “got” and “expected”.

I finally checked how string literals in my UTF-8 source files are handled and found that they are in NFC form. I ran the following script:

#+begin_src perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use Unicode::Normalize;

my $unistring = 'C’est une chaîne unicode';

# Map each normalization form name to its Unicode::Normalize function,
# instead of calling the function through a symbolic reference.
my %forms = (NFD => \&NFD, NFC => \&NFC, NFKD => \&NFKD, NFKC => \&NFKC);

for my $form (qw(NFD NFC NFKD NFKC)) {
    # The string is already in a form if normalizing leaves it unchanged.
    if ($unistring eq $forms{$form}->($unistring)) {
        print "UTF-8 source is in form '$form'\n";
    }
}
#+end_src

and got:

#+begin_src
UTF-8 source is in form 'NFC'
UTF-8 source is in form 'NFKC'
#+end_src

So Test::More::is_deeply was comparing an input in NFD with an expected string in NFC.

My own code can use Unicode::Collate, but for all the code I did not write, I wonder if there is a way to handle this cleanly. Or maybe I'm doing something wrong?

Regards.

Footnotes:
[1] https://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default

-- 
Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
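
P.S. To make the mismatch described above concrete, here is a minimal sketch of one possible workaround on the test side: bring both “got” and “expected” to the same normalization form before comparing. The helper name is_nfc is hypothetical (it is not part of Test::More), and the strings are written with \x{...} escapes so the example does not depend on how an editor saves the file. It only covers plain strings, not the nested structures that is_deeply handles.

#+begin_src perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More tests => 2;
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper (not part of Test::More): compare two strings
# after normalizing both sides to NFC.
sub is_nfc {
    my ($got, $expected, $name) = @_;
    # Report failures at the caller's line, not inside this helper.
    local $Test::Builder::Level = $Test::Builder::Level + 1;
    return is(NFC($got), NFC($expected), $name);
}

# "chaîne" with a precomposed U+00EE, i.e. already in NFC.
my $expected = "cha\x{00EE}ne";

# Simulates input that was normalized to NFD on the way in:
# "i" followed by U+0302 COMBINING CIRCUMFLEX ACCENT.
my $got = NFD($expected);

# The two strings render identically but differ code point by code point.
isnt($got, $expected, 'raw NFD and NFC strings are not eq');

# Once both sides are in the same form, the comparison succeeds.
is_nfc($got, $expected, 'strings match after normalizing both to NFC');
#+end_src

This does not help with code I did not write, which is still the open part of my question.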