[issue14654] More fast utf-8 decoding

2012-04-24 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Thank you, Antoine. It is an interesting result: the case that sped up
only a little on 32 bits is greatly accelerated on 64 bits. It was a
pathology that decoding 2-byte sequences to UCS1 was 1.5x slower than
decoding 2-byte sequences to UCS2. Are the small accelerations in the
other cases random deviations or a side effect? The difference for
ASCII-only text looks strange, since that branch is not affected by the
patch, unless it is a consequence of global optimization. The slowdown
in decoding 4-byte data is expected.

Here is a patch that uses a risky trick with signed numbers. For me, it
shows a speedup of a few percent over the previous patch. But I cannot
recommend it; it looks too hacky for such a small improvement. It will
not work on exotic platforms where signed numbers are not represented in
two's complement (but Python does not support such platforms).
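
The signed patch itself is not quoted in this message, so the following is only a guess at the kind of trick meant. On a two's-complement platform, viewing a UTF-8 byte as a signed 8-bit value makes every non-ASCII byte negative and maps the continuation bytes 0x80-0xBF to -128..-65, so a single signed comparison can stand in for a mask-and-compare. All function names are hypothetical, and the conversion to int8_t is assumed to wrap, which is typical but not guaranteed by older C standards.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not taken from utf8-signed.diff.
   Assumption: converting 0x80..0xFF to int8_t wraps to -128..-1. */

/* Non-ASCII bytes (0x80..0xFF) are negative when viewed as signed. */
static int is_non_ascii_signed(unsigned char c)
{
    return (int8_t)c < 0;
}

/* Continuation bytes 0x80..0xBF become -128..-65, so one signed
   comparison replaces (c & 0xC0) == 0x80. */
static int is_continuation_signed(unsigned char c)
{
    return (int8_t)c < -64;
}

/* Compare the signed forms against the plain unsigned tests
   for every possible byte value. */
static void check_signed_trick(void)
{
    int c;
    for (c = 0; c < 256; c++) {
        unsigned char b = (unsigned char)c;
        assert(is_non_ascii_signed(b) == (b >= 0x80));
        assert(is_continuation_signed(b) == ((b & 0xC0) == 0x80));
    }
}
```

Calling check_signed_trick() on a given platform shows whether the assumption about signed conversion holds there.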

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14654
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue14654] More fast utf-8 decoding

2012-04-24 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Added file: http://bugs.python.org/file25338/utf8-signed.diff




[issue14654] More fast utf-8 decoding

2012-04-24 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

I'm -1 on using signed char in the implementation. If this gives any advantage, 
it's because the compiler is not able to generate as efficient code for 
unsigned char as it does for signed char. So the performance results may again 
change if you switch compilers, or use the next compiler version.

The code should do what is *logically* correct; IMO, UTF-8 is really a sequence 
of unsigned bytes, conceptually.

So if you want to demonstrate any performance improvements, you need to do so 
with unsigned chars.
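
As an illustration of keeping everything in unsigned bytes, here is a minimal sketch of a continuation-byte test that uses a single comparison without any signed type. The helper names are hypothetical, and whether a compiler generates faster code for one form or the other is exactly the open question, so no performance claim is made.

```c
#include <assert.h>

/* Hypothetical helper: c - 0x80 maps 0x80..0xBF to 0..0x3F; every
   other byte lands outside that range once converted back to
   unsigned char. */
static int is_continuation_unsigned(unsigned char c)
{
    return (unsigned char)(c - 0x80) < 0x40;
}

/* Reference form: mask the two top bits and compare. */
static int is_continuation_mask(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* The two forms agree for every possible byte value. */
static void check_all_bytes(void)
{
    int c;
    for (c = 0; c < 256; c++) {
        unsigned char b = (unsigned char)c;
        assert(is_continuation_unsigned(b) == is_continuation_mask(b));
    }
}
```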

--
nosy: +loewis




[issue14654] More fast utf-8 decoding

2012-04-24 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

> I'm -1 on using signed char in the implementation.

I completely agree with you, for these and for other reasons not
mentioned. That is why I did not publish this patch yesterday and do not
propose it for acceptance. I showed it just out of curiosity: is the
effect stronger on a 64-bit platform? Although this technique will not
be accepted, it sets the bar for what can be achieved (if it is worth
it).

--




[issue14654] More fast utf-8 decoding

2012-04-24 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Here are two new patches. The first one takes into account Martin's
wishes about comments. The second also drops the optimization for ASCII.

On the Intel Atom the last patch wipes out the speedup for some cases
(mostly-ASCII with UCS2 data):

                          vanilla       patch1        patch3

utf-8  'A'*+'\u0100'      124 (+8%)     288 (-53%)    134
utf-8  'A'*+'\u8000'      124 (+8%)     291 (-54%)    134
utf-8  '\u0100'+'A'*      78 (+5%)      123 (-33%)    82
utf-8  '\u8000'+'A'*      78 (+5%)      124 (-34%)    82

On the AMD Athlon there is no noticeable effect.

--
Added file: http://bugs.python.org/file25342/decode_utf8_2.patch
Added file: http://bugs.python.org/file25343/decode_utf8_3.patch

diff -r c820aa9c0c00 Objects/stringlib/codecs.h
--- a/Objects/stringlib/codecs.h	Fri Apr 20 18:04:03 2012 -0400
+++ b/Objects/stringlib/codecs.h	Tue Apr 24 19:51:31 2012 +0300
@@ -21,7 +21,6 @@
const char **src_pos, Py_ssize_t *dest_index)
 {
 int ret;
-Py_ssize_t n;
 const char *s = start;
 const char *aligned_end = (const char *) ((size_t) end & ~LONG_PTR_MASK);
 STRINGLIB_CHAR *p = dest;
@@ -48,15 +47,33 @@
 unsigned long value = *(unsigned long *) _s;
 if (value & ASCII_CHAR_MASK)
 break;
-_p[0] = _s[0];
-_p[1] = _s[1];
-_p[2] = _s[2];
-_p[3] = _s[3];
-#if (SIZEOF_LONG == 8)
-_p[4] = _s[4];
-_p[5] = _s[5];
-_p[6] = _s[6];
-_p[7] = _s[7];
+#ifdef BYTEORDER_IS_LITTLE_ENDIAN
+_p[0] = (STRINGLIB_CHAR)(value & 0xFFu);
+_p[1] = (STRINGLIB_CHAR)((value >> 8) & 0xFFu);
+_p[2] = (STRINGLIB_CHAR)((value >> 16) & 0xFFu);
+_p[3] = (STRINGLIB_CHAR)((value >> 24) & 0xFFu);
+#if SIZEOF_LONG == 8
+_p[4] = (STRINGLIB_CHAR)((value >> 32) & 0xFFu);
+_p[5] = (STRINGLIB_CHAR)((value >> 40) & 0xFFu);
+_p[6] = (STRINGLIB_CHAR)((value >> 48) & 0xFFu);
+_p[7] = (STRINGLIB_CHAR)((value >> 56) & 0xFFu);
+#endif
+#else
+#if SIZEOF_LONG == 8
+_p[0] = (STRINGLIB_CHAR)((value >> 56) & 0xFFu);
+_p[1] = (STRINGLIB_CHAR)((value >> 48) & 0xFFu);
+_p[2] = (STRINGLIB_CHAR)((value >> 40) & 0xFFu);
+_p[3] = (STRINGLIB_CHAR)((value >> 32) & 0xFFu);
+_p[4] = (STRINGLIB_CHAR)((value >> 24) & 0xFFu);
+_p[5] = (STRINGLIB_CHAR)((value >> 16) & 0xFFu);
+_p[6] = (STRINGLIB_CHAR)((value >> 8) & 0xFFu);
+_p[7] = (STRINGLIB_CHAR)(value & 0xFFu);
+#else
+_p[0] = (STRINGLIB_CHAR)((value >> 24) & 0xFFu);
+_p[1] = (STRINGLIB_CHAR)((value >> 16) & 0xFFu);
+_p[2] = (STRINGLIB_CHAR)((value >> 8) & 0xFFu);
+_p[3] = (STRINGLIB_CHAR)(value & 0xFFu);
+#endif
 #endif
 _s += SIZEOF_LONG;
 _p += SIZEOF_LONG;
@@ -67,78 +84,114 @@
 break;
 ch = (unsigned char)*s;
 }
+if (ch < 0x80) {
+s++;
+*p++ = ch;
+continue;
+}
 }
 
-if (ch < 0x80) {
-s++;
+if (ch < 0xC2) {
+/* invalid sequence
+   \x80-\xBF -- continuation byte
+   \xC0-\xC1 -- fake 0000-007F */
+goto _error;
+}
+
+if (ch < 0xE0) {
+/* \xC2\x80-\xDF\xBF -- 0080-07FF */
+Py_UCS4 ch2;
+if (end - s < 2) {
+/* unexpected end of data: the caller will decide whether
+   it's an error or not */
+goto _error;
+}
+ch2 = (unsigned char)s[1];
+if ((ch2 & 0xc0) != 0x80)
+/* invalid continuation byte */
+goto _error;
+ch = (ch << 6) + ch2 - 030200;
+assert ((ch > 0x007F) && (ch <= 0x07FF));
+s += 2;
 *p++ = ch;
 continue;
 }
 
-n = utf8_code_length[ch];
-
-if (s + n > end) {
-/* unexpected end of data: the caller will decide whether
-   it's an error or not */
-goto _error;
-}
-
-switch (n) {
-case 0:
-/* invalid start byte */
-goto _error;
-case 1:
-/* internal error */
-goto _error;
-case 2:
-if ((s[1] & 0xc0) != 0x80)
- 
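
The constant 030200 in the two-byte branch of the patch is octal for 0x3080, that is (0xC0 << 6) + 0x80. Subtracting it strips the marker bits of the lead byte and of the continuation byte in one step instead of masking each byte separately. A minimal stand-alone sketch of that arithmetic follows; the function name and the -1 error convention are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>

/* Hypothetical stand-alone version of the two-byte branch.
   Returns the code point, or -1 for an invalid sequence. */
static long decode_two_byte(unsigned char b0, unsigned char b1)
{
    long ch;

    if (b0 < 0xC2 || b0 >= 0xE0)
        return -1;      /* not a valid two-byte lead byte */
    if ((b1 & 0xC0) != 0x80)
        return -1;      /* invalid continuation byte */

    /* 030200 (octal) == 0x3080 == (0xC0 << 6) + 0x80 */
    ch = ((long)b0 << 6) + b1 - 030200;
    assert(ch > 0x007F && ch <= 0x07FF);
    return ch;
}
```

For example, the bytes 0xC4 0x80 give (0xC4 << 6) + 0x80 - 0x3080 = 0x3180 - 0x3080 = 0x0100.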

[issue14654] More fast utf-8 decoding

2012-04-23 Thread Serhiy Storchaka

New submission from Serhiy Storchaka storch...@gmail.com:

The utf-8 decoder is already well optimized. I propose a patch which
accelerates the utf-8 decoder even more for some of the frequent cases
(+10-30%). In particular, 2-byte non-latin1 codes get about +30%.

This is not the final result of the optimization. It may be possible to
optimize the decoding of ascii and mostly-ascii text (up to the speed of
memcpy) and of text with occasional errors, and to reduce code
duplication. But I'm not sure of success.
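
For context on the ASCII fast path mentioned here: the decoder tests a whole machine word at once for bytes with the high bit set and falls back to byte-by-byte work only when one is found. A minimal sketch of that idea follows. The helper name is hypothetical, a fixed 8-byte word is assumed, and memcpy is used for the load to avoid alignment assumptions, whereas the real code in Objects/stringlib/codecs.h works on aligned pointers with `unsigned long`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: length of the leading pure-ASCII run in buf. */
static size_t ascii_prefix_len(const unsigned char *buf, size_t len)
{
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    size_t i = 0;

    assert(buf != NULL || len == 0);

    /* Word at a time: any byte >= 0x80 sets one of the masked bits. */
    while (len - i >= sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, buf + i, sizeof word);
        if (word & high_bits)
            break;
        i += sizeof word;
    }
    /* Finish the tail, or locate the exact byte, one byte at a time. */
    while (i < len && buf[i] < 0x80)
        i++;
    return i;
}
```

A decoder could copy the first ascii_prefix_len() bytes straight to the output and switch to the multi-byte branches from there.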

Related issues:
[issue4868] Faster utf-8 decoding
[issue13417] faster utf-8 decoding
[issue14419] Faster ascii decoding
[issue14624] Faster utf-16 decoder
[issue14625] Faster utf-32 decoder

--
components: Interpreter Core
files: decode_utf8.patch
keywords: patch
messages: 159080
nosy: haypo, pitrou, storchaka
priority: normal
severity: normal
status: open
title: More fast utf-8 decoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25326/decode_utf8.patch




[issue14654] More fast utf-8 decoding

2012-04-23 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Here are the results of benchmarking (numbers in MB/s).

On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:

                           Py2.7         Py3.2         Py3.3         patch

utf-8  'A'*1               191 (+790%)   1170 (+45%)   1664 (+2%)    1700
utf-8  '\x80'*1            187 (+4%)     219 (-11%)    172 (+13%)    194
utf-8  '\x80'+'A'*         191 (+98%)    1152 (-67%)   376 (+1%)     378
utf-8  '\u0100'*1          188 (+15%)    221 (-2%)     164 (+32%)    217
utf-8  '\u0100'+'A'*       191 (+103%)   1150 (-66%)   382 (+1%)     387
utf-8  '\u0100'+'\x80'*    188 (+15%)    221 (-2%)     164 (+32%)    217
utf-8  '\u8000'*1          244 (-12%)    263 (-18%)    191 (+13%)    215
utf-8  '\u8000'+'A'*       191 (+102%)   1174 (-67%)   382 (+1%)     386
utf-8  '\u8000'+'\x80'*    188 (+15%)    216 (+0%)     164 (+32%)    217
utf-8  '\u8000'+'\u0100'*  188 (+15%)    216 (+0%)     164 (+32%)    217
utf-8  '\U0001'*1          251 (-15%)    248 (-14%)    199 (+7%)     213
utf-8  '\U0001'+'A'*       191 (+97%)    1173 (-68%)   372 (+1%)     376
utf-8  '\U0001'+'\x80'*    188 (+21%)    221 (+3%)     180 (+26%)    227
utf-8  '\U0001'+'\u0100'*  188 (+21%)    221 (+3%)     180 (+26%)    227
utf-8  '\U0001'+'\u8000'*  244 (-9%)     263 (-16%)    201 (+10%)    221

On 32-bit Linux, Intel Atom N570 @ 1.66GHz:

                           Py2.7         Py3.2         Py3.3         patch

utf-8  'A'*1               117 (+414%)   349 (+72%)    597 (+1%)     601
utf-8  '\x80'*1            86 (-5%)      89 (-8%)      67 (+22%)     82
utf-8  '\x80'+'A'*         117 (+6%)     340 (-64%)    126 (-2%)     124
utf-8  '\u0100'*1          86 (-2%)      89 (-6%)      66 (+27%)     84
utf-8  '\u0100'+'A'*       117 (+5%)     339 (-64%)    78 (+58%)     123
utf-8  '\u0100'+'\x80'*    86 (-2%)      89 (-6%)      66 (+27%)     84
utf-8  '\u8000'*1          109 (-26%)    98 (-17%)     71 (+14%)     81
utf-8  '\u8000'+'A'*       116 (+7%)     339 (-63%)    78 (+59%)     124
utf-8  '\u8000'+'\x80'*    86 (-3%)      89 (-7%)      66 (+26%)     83
utf-8  '\u8000'+'\u0100'*  86 (-3%)      89 (-7%)      66 (+26%)     83
utf-8  '\U0001'*1          106 (-14%)    105 (-13%)    81 (+12%)     91
utf-8  '\U0001'+'A'*       116 (+12%)    338 (-62%)    127 (+2%)     130
utf-8  '\U0001'+'\x80'*    86 (+6%)      88 (+3%)      69 (+32%)     91
utf-8  '\U0001'+'\u0100'*  86 (+6%)      88 (+3%)      69 (+32%)     91
utf-8  '\U0001'+'\u8000'*  109 (-24%)    98 (-15%)     74 (+12%)     83

The results are mixed (a gain nearly everywhere, but of varying size). I
would like to see the results for 64-bit platforms. For the scripts, see
issue14624.
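
The percentages in these tables appear to show how the last column (the patched build) compares with the column they are attached to, that is patch / column - 1. For example, 1700 / 191 - 1 is about +790%. This reading is inferred from the numbers and is not stated in the thread; the function name below is hypothetical. Because the speeds shown are themselves rounded, recomputing a percentage from the table may occasionally be off by one.

```c
#include <assert.h>

/* Hypothetical helper: percentage by which `patched` differs from
   `baseline`, rounded to the nearest integer (halves away from zero). */
static long relative_change_percent(double baseline, double patched)
{
    double pct;

    assert(baseline > 0.0);
    pct = (patched / baseline - 1.0) * 100.0;
    return (long)(pct >= 0.0 ? pct + 0.5 : pct - 0.5);
}
```

With the first Athlon row, relative_change_percent(191, 1700) gives 790 and relative_change_percent(1170, 1700) gives 45, matching the +790% and +45% printed next to the Py2.7 and Py3.2 figures.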

--




[issue14654] More fast utf-8 decoding

2012-04-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

64-bit Linux, Intel Core i5-2500K CPU @ 3.30GHz:

                           vanilla 3.3   patched
utf-8  'A'*1               6668 (+7%)    7145
utf-8  'A'*+'\x80'         2358 (+3%)    2418
utf-8  'A'*+'\u0100'       2306 (+0%)    2311
utf-8  'A'*+'\u8000'       2299 (+0%)    2309
utf-8  'A'*+'\U0001'       2373 (-4%)    2278
utf-8  '\x80'*1            366 (+53%)    559
utf-8  '\x80'+'A'*         859 (+1%)     868
utf-8  '\x80'*+'\u0100'    529 (+5%)     558
utf-8  '\x80'*+'\u8000'    529 (+5%)     558
utf-8  '\x80'*+'\U0001'    529 (+5%)     558
utf-8  '\u0100'*1          520 (+6%)     549
utf-8  '\u0100'+'A'*       822 (+0%)     823
utf-8  '\u0100'+'\x80'*    519 (+6%)     549
utf-8  '\u0100'*+'\u8000'  520 (+6%)     549
utf-8  '\u0100'*+'\U0001'  520 (+6%)     549
utf-8  '\u8000'*1          470 (+4%)     491
utf-8  '\u8000'+'A'*       822 (+0%)     822
utf-8  '\u8000'+'\x80'*    509 (+8%)     549
utf-8  '\u8000'+'\u0100'*  509 (+8%)     549
utf-8  '\u8000'*+'\U0001'  470 (-4%)     451
utf-8  '\U0001'*1          483 (-6%)     453
utf-8  '\U0001'+'A'*       938 (-1%)     926
utf-8  '\U0001'+'\x80'*    561 (+6%)     595
utf-8  '\U0001'+'\u0100'*  561 (+6%)     595
utf-8  '\U0001'+'\u8000'*  503 (-4%)     482

--




[issue14654] More fast utf-8 decoding

2012-04-23 Thread STINNER Victor

STINNER Victor victor.stin...@gmail.com added the comment:

> 64-bit Linux, Intel Core i5-2500K CPU @ 3.30GHz: (...)

Hum, the patch doesn't look very interesting if it only optimizes one
specific case:

> utf-8  '\x80'*1            366 (+53%)    559

--




[issue14654] More fast utf-8 decoding

2012-04-23 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea




[issue14654] More fast utf-8 decoding

2012-04-23 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
nosy: +Arfrever
