Good idea, thanks. It should be a bit slower than a lookup table, but faster
than it is now.

On Sun, Feb 10, 2019, 21:02 Rowan Collins <rowan.coll...@gmail.com wrote:

> On 10/02/2019 12:29, Legale Legage wrote:
> > This concept could also be used for the UTF-16 encoding, but the table size
> > would be 65536 bytes, against 256 bytes for the UTF-8 table.
>
> Rather than two 65 kilobyte lookup tables with most entries identical,
> would it be reasonable to use a bit mask to check for the range we care
> about?
>
> I may have this slightly wrong, but something like:
>
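> /* Assumes a little-endian host; the BE variant tests the byte-swapped value. */
> #define UTF16_LE_CODE_UNIT_IS_HIGH_SURROGATE(code_unit) (((code_unit) & 0xFC00) == 0xD800)
> #define UTF16_BE_CODE_UNIT_IS_HIGH_SURROGATE(code_unit) (((code_unit) & 0x00FC) == 0x00D8)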
>
> m = UTF16_LE_CODE_UNIT_IS_HIGH_SURROGATE(*(uint16_t *)p) ? 4 : 2;
>
> Regards,
>
> --
> Rowan Collins
> [IMSoP]
>
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>
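
For reference, the bit-mask check discussed above can be sketched in C as
follows. This is only an illustration, not code from PHP: the function names
are hypothetical, and the input is assumed to be well-formed UTF-16. The
code unit is assembled from individual bytes, so the result does not depend
on the host byte order or on the alignment of the pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* A high surrogate is a code unit in 0xD800..0xDBFF, i.e. its top six
 * bits are 110110. Note the parentheses: in C, == binds tighter than &. */
static inline int utf16_is_high_surrogate(uint16_t code_unit)
{
    return (code_unit & 0xFC00u) == 0xD800u;
}

/* Build a code unit from two bytes stored in little-endian order. */
static inline uint16_t utf16le_load(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Build a code unit from two bytes stored in big-endian order. */
static inline uint16_t utf16be_load(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Length in bytes (2 or 4) of the character starting at p.
 * Assumes well-formed input: a high surrogate is always followed
 * by a low surrogate, and p has at least 2 readable bytes. */
static inline size_t utf16le_char_len(const unsigned char *p)
{
    return utf16_is_high_surrogate(utf16le_load(p)) ? 4 : 2;
}

static inline size_t utf16be_char_len(const unsigned char *p)
{
    return utf16_is_high_surrogate(utf16be_load(p)) ? 4 : 2;
}
```

Because only the upper byte of the code unit takes part in the test, the
same check could also be written against a single byte, (p[1] & 0xFC) == 0xD8
for little-endian data and (p[0] & 0xFC) == 0xD8 for big-endian data.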
