Unicode is hard to handle properly because the right way to handle it depends heavily on context.
A grapheme is one visible character and consists of one or more code points. A code point is a single Unicode value and is encoded as one or more code units (bytes, in UTF-8). You do not want to handle this yourself, because knowing which code points combine into a grapheme is hard. Thankfully, std.uni exists. Specifically, look at decodeGrapheme: it pops one grapheme off the front of an input range and returns it. Never write code that deals with Unicode at the byte level; it will be wrong for some input.
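To make the three levels concrete, here is a minimal sketch, assuming a current Phobos where strings are iterated by code point. graphemeCount is a hypothetical helper name used only for illustration; byGrapheme and decodeGrapheme are the std.uni functions.

```d
import std.range : walkLength;
import std.uni : byGrapheme, decodeGrapheme;

// Hypothetical helper (not part of Phobos): counts user-visible characters.
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT (one visible character), then 'x'.
    string s = "e\u0301x";

    assert(s.length == 4);          // UTF-8 code units (bytes)
    assert(s.walkLength == 3);      // code points
    assert(graphemeCount(s) == 2);  // graphemes

    // decodeGrapheme takes the range by ref and pops one grapheme off the front.
    string rest = s;
    auto first = decodeGrapheme(rest);
    assert(first.length == 2);      // this grapheme spans two code points
    assert(rest == "x");            // only the unconsumed tail remains
}
```

Note that decodeGrapheme needs an lvalue, which is why the string is copied into rest first. If you only want to iterate, byGrapheme is probably the more convenient choice.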