t/spec/S02-builtin_data_types/unicode.t has tests like this:

    # LATIN CAPITAL LETTER A, COMBINING GRAVE ACCENT
    my Str $u = "\x[0041,0300]";
    is $u.bytes, 3, 'combining À is three bytes as utf8';
    is $u.codes, 2, 'combining À is two codes';
    is $u.graphs, 1, 'combining À is one graph';

These tests seem to imply that a Str remembers its original codepoints even when it is in grapheme mode (which is the default). Is this correct?

I don't think that's sensible. I'd expect a compiler to store strings in composed normalization (NFC, plus NFG), so $u.codes would be 1.

Cheers,
Moritz
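
P.S. To make the two possibilities concrete, here is a small sketch of the arithmetic. It is written in Python purely as a neutral calculator, using the standard unicodedata module; it says nothing about what the Perl 6 spec requires. The helper name counts is made up for this illustration, and grapheme counts are not shown because Python's standard library does not provide them.

```python
import unicodedata

def counts(s):
    """Return (UTF-8 byte count, codepoint count) for the string s."""
    return len(s.encode("utf-8")), len(s)

# LATIN CAPITAL LETTER A followed by COMBINING GRAVE ACCENT,
# i.e. the same two codepoints as "\x[0041,0300]" in the test.
decomposed = "\u0041\u0300"

# Composed normalization (NFC) folds the pair into the single
# precomposed codepoint U+00C0.
composed = unicodedata.normalize("NFC", decomposed)

print(counts(decomposed))  # (3, 2)
print(counts(composed))    # (2, 1)
```

So the values in the test (3 bytes, 2 codes) match the decomposed form as written in the source. If strings were stored in composed form, both .codes and the UTF-8 byte count would change, to 1 and 2 respectively.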