The paper points out that the input buffer needs to be padded with 3 null bytes as a precondition.
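For illustration, one way to meet that precondition is to copy the input into a buffer that is allocated 3 bytes larger and zero-filled at the end. This is only a sketch, on the assumption that the decoder may read up to 4 bytes at a time regardless of the length of the sequence it finds. The names `utf8_padded_copy` and `UTF8_PAD` are hypothetical and do not come from the linked post.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of zero bytes required after the data (the "3 null bytes"). */
#define UTF8_PAD 3

/* Hypothetical helper: copy len bytes from src into a freshly allocated
 * buffer followed by UTF8_PAD zero bytes, so that a decoder reading a
 * few bytes past the end never touches unallocated memory.
 * Returns NULL on size overflow or allocation failure.
 * The caller releases the result with free(). */
unsigned char *utf8_padded_copy(const void *src, size_t len)
{
    assert(src != NULL || len == 0);

    if (len > SIZE_MAX - UTF8_PAD)
        return NULL;                 /* len + UTF8_PAD would overflow */

    unsigned char *buf = malloc(len + UTF8_PAD);
    if (buf == NULL)
        return NULL;

    if (len > 0)
        memcpy(buf, src, len);       /* the original bytes */
    memset(buf + len, 0, UTF8_PAD);  /* the zero padding */
    return buf;
}
```

Decoding would then run over the first `len` bytes of the returned buffer. Strings that are already allocated with 3 spare bytes would not need the copy, only the zero fill.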

Mark <https://twitter.com/mark_e_davis>

On Mon, Oct 9, 2017 at 10:57 AM, J Decker via Unicode <[email protected]> wrote:

> that's interesting; however it will segfault if the string ends on a
> memory allocation boundary. will have to make sure strings are always
> allocated with 3 extra bytes.
>
> 2017-10-09 1:37 GMT-07:00 Martin J. Dürst via Unicode <[email protected]>:
>
>> A friend of mine sent me a pointer to
>> http://nullprogram.com/blog/2017/10/06/, a branchless UTF-8 decoder.
>>
>> Regards, Martin.

