Good idea, thanks. It should be a bit slower than a lookup table, but faster than what we have now.

On Sun, Feb 10, 2019, 21:02 Rowan Collins <rowan.coll...@gmail.com> wrote:

> On 10/02/2019 12:29, Legale Legage wrote:
> > This conception can be used for the utf-16 encoding, but table size
> > would be 65536 bytes against 256 byte for the utf-8 table.
>
> Rather than two 65 kilobyte lookup tables with most entries identical,
> would it be reasonable to use a bit mask to check for the range we care
> about?
>
> I may have this slightly wrong, but something like:
>
> #define UTF16_LE_CODE_UNIT_IS_HIGH_SURROGATE (code_unit & 0xFC00 == 0xD800)
> #define UTF16_BE_CODE_UNIT_IS_HIGH_SURROGATE (code_unit & 0x00FC == 0x00D8)
>
> m = UTF16_LE_CODE_UNIT_IS_HIGH_SURROGATE(*(uint16_t *)p) ? 4 : 2;
>
> Regards,
>
> --
> Rowan Collins
> [IMSoP]
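
For reference, here is a minimal sketch of how the quoted bit-mask check might look once the macro takes a parameter and the mask is parenthesized. All names below are hypothetical and not taken from php-src, and the byte-wise loads are my assumption to avoid depending on pointer alignment or host endianness.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper names for illustration; not taken from php-src. */

/* High (lead) surrogates are 0xD800..0xDBFF, so the top six bits are 110110. */
#define UTF16_IS_HIGH_SURROGATE(code_unit) \
    ((((uint16_t)(code_unit)) & 0xFC00u) == 0xD800u)

/* Same test for a code unit whose two bytes are swapped relative to its
 * numeric value (e.g. big-endian data loaded raw on a little-endian host). */
#define UTF16_SWAPPED_IS_HIGH_SURROGATE(code_unit) \
    ((((uint16_t)(code_unit)) & 0x00FCu) == 0x00D8u)

/* Byte length (2 or 4) of the UTF-16LE character starting at p. */
static size_t utf16le_char_len(const unsigned char *p)
{
    /* Assemble the code unit byte by byte: low byte first. */
    uint16_t code_unit = (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
    return UTF16_IS_HIGH_SURROGATE(code_unit) ? 4 : 2;
}

/* Byte length (2 or 4) of the UTF-16BE character starting at p. */
static size_t utf16be_char_len(const unsigned char *p)
{
    /* Assemble the code unit byte by byte: high byte first. */
    uint16_t code_unit = (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
    return UTF16_IS_HIGH_SURROGATE(code_unit) ? 4 : 2;
}
```

The extra parentheses matter: in C, `==` binds tighter than `&`, so `code_unit & 0xFC00 == 0xD800` parses as `code_unit & (0xFC00 == 0xD800)`, which is always 0. The sketch only inspects the lead code unit; it does not validate that a low surrogate actually follows.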