Karl Williamson <pub...@khwilliamson.com> writes:

> On 05/09/2016 08:53 AM, Daniel Dehennin wrote:
>> Hello,
>>
>> I tried to make my Perl5 code unicode compliant after reading a post on
>> stackoverflow[1].
>>
>> As suggested in the post:
>>
>> “always run incoming stuff through NFD and outbound stuff from NFC.”
>>
>> I got a hard time finding why my Test::More was failing but displaying
>> exactly the same strings for “got” and “expected”.
>>
>> I finally check how UTF-8 sources are handled and found that they are in
>> NFC form, I run the following script:
[...]

> I'm afraid that when it comes to normalization in Perl5, you have to
> do it yourself.  I hear that Perl6 is much friendlier in this regard,
> but I have no personal experience with it.  Your $unistring is in
> whatever normalization you made it when you typed it into your editor,
> or whatever your editor did with it as you were typing.  You could
> have typed it in NFD, but probably the most natural way to enter
> things on your keyboard will underlying it all be NFC.

That's what I finally found out in another post. Normally all my inputs
are NFD, but my tests used static strings to match against, so I declared
them with NFD to make it explicit.

I added a note in my POD to signal that the sub returns NFD strings.

> Normalization is tricky, and the Unicode Consortium has had to modify
> things years after they were first specified, because no one could
> reasonably implement what was expected.  I may tackle getting
> normalization to be more developer friendly in future Perl5 versions,
> but not in the next couple of years.

Thanks. As soon as my little work project is working well, I'll try to
redo it in Perl6.

Regards.
-- 
Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF
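
For illustration only, here is a minimal sketch of the static-string problem described above. It is not the elided script from the original message. The sub name returns_nfd is hypothetical and merely stands in for a sub documented as returning NFD strings; NFC and NFD come from the core Unicode::Normalize module. The literal is written with an escape so the example does not depend on how an editor saved the file.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# A literal as most editors would produce it: precomposed e-acute (NFC).
my $typed = "caf\N{U+00E9}";

# Hypothetical stand-in for a sub documented as returning NFD strings.
sub returns_nfd {
    my ($string) = @_;
    return NFD($string);
}

my $got = returns_nfd($typed);

# Both strings render identically, but the code points differ, so a
# plain "eq" (which is what Test::More's is() uses) reports a mismatch.
my $raw_match = ($got eq $typed) ? 1 : 0;

# Normalizing the expected value to the documented form fixes the test.
my $nfd_match = ($got eq NFD($typed)) ? 1 : 0;

# Recomposing on output gives back the form that was typed.
my $nfc_round_trip = (NFC($got) eq $typed) ? 1 : 0;

printf "raw match: %d, NFD match: %d, NFC round trip: %d\n",
    $raw_match, $nfd_match, $nfc_round_trip;
printf "length typed: %d, length got: %d\n", length($typed), length($got);
```

In a real test file the same idea would read is($got, NFD($expected), ...), which keeps the expected normalization form explicit at the point of comparison.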