On 15/03/2016 00:25, BartC wrote:
> On 14/03/2016 23:24, Steven D'Aprano wrote:

>> Try this instead:
>>
>> c = chr(c)
>> if 'A' <= c <= 'Z':
>>      upper += 1
>> elif 'a' <= c <= 'z':
>>      lower += 1
>> elif '0' <= c <= '9':
>>      digits += 1
>> else:
>>      other += 1
>>
>>
>> But even better:
>>
>> if c.isupper():
>>      upper += 1
>> elif c.islower():
>>      lower += 1
>> elif c.isdigit():
>>      digits += 1
>> else:
>>      other += 1
>>
>>
>> which will work correctly for non-ASCII characters as well.
>
> Yes, but now you've destroyed my example!
>
> A more realistic use of switch is shown below [not Python].

A tokeniser along those lines in Python, with most of the bits filled in, is here:

http://pastebin.com/dtM8WnFZ

This is a test of a character-at-a-time task in Python; I know such tasks are not popular here, but they are exactly what I often use dynamic languages for.

I started off trying to write it in a more efficient way that would suit Python better, but quickly tired of that. I should be able to express the code how I want.

But it is based on strings (integers seem to be slower in Python), and uses your style above.

However, this is crying out for at least a case statement (which works with anything) if not a switch that works with ints and constants.
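For readers wondering what the usual Python stand-in for a switch looks like, here is a minimal sketch using a dict lookup table. It is not the pastebin code: the class names (LETTER, DIGIT, SPACE, OTHER), the CHAR_CLASS table and count_classes() are made up for illustration, and it assumes ASCII-only input.

```python
import string

# Hypothetical character classes; the names are illustrative only.
LETTER, DIGIT, SPACE, OTHER = "letter", "digit", "space", "other"

# Build the lookup table once, so that classifying a character costs a
# single dict lookup rather than a chain of if/elif comparisons.
CHAR_CLASS = {}
for ch in string.ascii_letters + "_":
    CHAR_CLASS[ch] = LETTER
for ch in string.digits:
    CHAR_CLASS[ch] = DIGIT
for ch in " \t\r\n":
    CHAR_CLASS[ch] = SPACE


def count_classes(text):
    """Count how many characters of each class appear in text."""
    counts = {LETTER: 0, DIGIT: 0, SPACE: 0, OTHER: 0}
    for ch in text:
        # Anything not in the table falls through to OTHER,
        # much like the default branch of a switch.
        counts[CHAR_CLASS.get(ch, OTHER)] += 1
    return counts
```

Calling count_classes("ab1 +") gives two letters, one digit, one space and one other. Whether this is faster than the if/elif chain would need measuring; it only shows the shape of the workaround.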

All other versions it is compared against use switches, integers, and names that don't change at runtime.

---------------------------------------------------------------------

Performance figures for the test code above. These all use the same input file (220K lines of CPython C sources), and show lines per second achieved, not runtime.
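A lines-per-second figure of this kind could be obtained roughly as follows. This is only a sketch, not the harness actually used for the numbers below; lines_per_second() and its process_line parameter are hypothetical names, and it assumes the input lines are already in memory.

```python
import time


def lines_per_second(process_line, lines):
    """Run process_line over every line and return the throughput."""
    start = time.perf_counter()
    for line in lines:
        process_line(line)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very small inputs.
    return len(lines) / elapsed if elapsed > 0 else float("inf")
```

For example, lines_per_second(tokenise, source_lines) would time a hypothetical tokenise() function over a list of lines read from the input file.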

Interpreters:

Py2      Python 2 (various)
Py3      Python 3.4
PyPy     PyPy 3.2.5

PCA      Mine (accelerated via some ASM routines, 64-bit)
PCC      Mine (standard, 32-bit)


Windows 7 64-bits:

Py3:       33K lps (lines per second)
Py2:       43K lps
PyPy:     100K lps

PCC:      460K lps
C:       1000K lps (Tiny C compiler, native code)
PCA:     1200K lps
B:       5800K lps (My own compiler, native code)
C:       9000K lps (gcc C compiler, native code)

(Tiny C is particularly poor at generating code for switches. Still, native code is running slower than byte-code, which is something.)


Ubuntu 15.x 64-bits (run via VirtualBox on Windows 7):

Py2:       58K lps
PyPy:     133K lps (2.x)

PCC:  250-330K lps (via 'wine'; timings variable)
PCA:  500-800K lps (via 'wine')


(Debian on an old Raspberry Pi:

Py2:        2K lps
PCC:       16K lps

I think these, especially for PCC, should be faster. But this was a hastily created Linux port of PCC.)

--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list
