Er... why would it be slower than CPython? Anyway, the speeds I'm reporting are based on C/assembler programs so far.

On Sat, Mar 4, 2017 at 7:36 PM, Phyo Arkar <phyo.arkarl...@gmail.com> wrote:
> SSE means https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions?
>
> In comparison to CPython, is this much slower?
>
> On Sun, Mar 5, 2017 at 12:32 AM Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> Hello everyone
>>
>> I've been experimenting a bit with faster utf8 operations (and
>> conversion that does not do much). I'm writing down the results so
>> they don't get forgotten, as well as trying to put them in rpython
>> comments.
>>
>> As far as non-SSE algorithms go, for things like splitlines, split
>> etc. it is important to walk the utf8 string quickly and check
>> properties of characters.
>>
>> So far the current finding has been that a lookup table, for example:
>>
>> def next_codepoint_pos(code, pos):
>>     chr1 = ord(code[pos])
>>     if chr1 < 0x80:
>>         return pos + 1
>>     return pos + ord(runicode._utf8_code_length[chr1 - 0x80])
>>
>> is significantly slower than the following code (neither does error
>> checking):
>>
>> def next_codepoint_pos(code, pos):
>>     chr1 = ord(code[pos])
>>     if chr1 < 0x80:
>>         return pos + 1
>>     if 0xC2 <= chr1 <= 0xDF:
>>         return pos + 2
>>     if chr1 >= 0xE0 and chr1 <= 0xEF:
>>         return pos + 3
>>     return pos + 4
>>
>> The exact difference depends on how many multi-byte characters there
>> are and how big the strings are. It's up to 40%, but as a general
>> rule, the more ascii characters there are, the less of an impact it
>> has, and the larger the strings are, the more impact memory/L2/L3
>> cache has.
>>
>> PS. SSE will be faster still, but we might not want SSE for just
>> splitlines
>>
>> Cheers,
>> fijal
>> _______________________________________________
>> pypy-dev mailing list
>> pypy-dev@python.org
>> https://mail.python.org/mailman/listinfo/pypy-dev
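
For anyone who wants to play with the two next_codepoint_pos variants quoted above, here is a minimal self-contained sketch in plain Python 3. It works on bytes rather than an RPython str, so code[pos] is already an int and the ord() calls go away. UTF8_CODE_LENGTH is a hypothetical stand-in for runicode._utf8_code_length (its real contents are not shown in this thread, and mapping invalid lead bytes to length 1 is my assumption), and codepoint_offsets is a helper name made up for illustration. The two-byte range test is written as 0xC2 <= chr1 <= 0xDF, with chr1 used throughout. Like the quoted code, neither variant does error checking. This only shows that the two approaches agree on well-formed input; timing it under CPython would say little about the C/assembler numbers discussed here.

```python
def _length_for(lead):
    """Sequence length implied by a lead byte (assumed table contents)."""
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 1  # assumption: continuation/invalid lead bytes advance by one


# Hypothetical stand-in for runicode._utf8_code_length:
# one entry per byte value 0x80..0xFF, indexed by (byte - 0x80).
UTF8_CODE_LENGTH = bytes(_length_for(b) for b in range(0x80, 0x100))


def next_codepoint_pos_table(code, pos):
    """Lookup-table variant: one branch for ASCII, then a table read."""
    chr1 = code[pos]
    if chr1 < 0x80:
        return pos + 1
    return pos + UTF8_CODE_LENGTH[chr1 - 0x80]


def next_codepoint_pos_branch(code, pos):
    """Comparison-only variant: no memory access besides the lead byte."""
    chr1 = code[pos]
    if chr1 < 0x80:
        return pos + 1
    if 0xC2 <= chr1 <= 0xDF:
        return pos + 2
    if 0xE0 <= chr1 <= 0xEF:
        return pos + 3
    return pos + 4


def codepoint_offsets(code, step):
    """Walk a UTF-8 byte string and collect the start offset of each codepoint."""
    offsets = []
    pos = 0
    while pos < len(code):
        offsets.append(pos)
        pos = step(code, pos)
    return offsets
```

Calling codepoint_offsets(data, next_codepoint_pos_table) and codepoint_offsets(data, next_codepoint_pos_branch) on the same well-formed UTF-8 bytes should give identical lists; the variants only differ on malformed input, where the table sketch advances by one byte and the branch version by four.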