On 11/03/2016 21:36, Chris Angelico wrote:
> On Sat, Mar 12, 2016 at 5:57 AM, BartC <b...@freeuk.com> wrote:
>> Functions are dynamic. That is, you don't know that the F in F() is actually
>> a function, even if it was defined a few lines previously, because it could
>> have been rebound to something else. That means a runtime check to ensure
>> that it is one.
> All of these are really one thing: when you compile some code, you
> don't know whether F() is still going to be the same object when you
> run it. FAT Python is intended to pick this up;
That's an interesting project. And they seem to understand the problems.
> it recognizes
> constants, and can then perform all the other optimizations you
> describe (since functions are themselves immutable, all you need is an
> identity check on the function object itself, and everything else you
> describe can be done safely).
You mean the compiler assumes it's a particular function, but adds a
runtime check? (Perhaps by generating two call sequences, one fast, and
selecting the path at runtime.)
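That two-path scheme can be sketched in plain Python. All the names below are made up for illustration; this is not FAT Python's actual mechanism, just a sketch of an identity-guarded call site with a fast path and a fallback:

```python
# Hypothetical sketch of a guard-based call site: the "compiled" fast
# path assumes F is still the same object it was bound to when the call
# site was specialized, and falls back to a generic call if not.

def make_specialized_call(func):
    original = func  # snapshot the function object at specialization time

    def call_site(current_binding, *args):
        if current_binding is original:      # cheap identity guard
            return original(*args)           # fast path: assumption holds
        return current_binding(*args)        # slow path: F was rebound

    return call_site

def F(x):
    return x * 2

call_F = make_specialized_call(F)
fast = call_F(F, 21)     # guard passes: original F runs, giving 42

F = lambda x: x + 1      # rebind F to something else
slow = call_F(F, 21)     # guard fails: generic path runs, giving 22

print(fast, slow)
```

A real implementation would do the guard inside the interpreter rather than in Python code, which is why it can be nearly free on the fast path.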
>> * Variables are dynamic. So you can't, for example, declare (with a new
>> 'const' feature) 'const rows=50'.
> Definitely agree with this. Having a way to declare that a name is
> "truly constant" would be extremely handy; there currently isn't a
> way, and I'm not sure whether FAT Python is looking into this or not.
> I think it's more focusing on the situation where something simply has
> never been rebound, which is probably the better optimization; but
> sometimes it'd be nice to be able to assert that something *will not*
> be rebound, to help with self-documenting code. (Note that "constant"
> implies both "never rebound" and "immutable".)
The 'const' prefix here is intended to define a named constant (a
numeric literal with a convenient alias) rather than some read-only
attribute for a conventional variable.

But introducing that into Python would be a can of worms: a named
constant needs a compile-time expression on the right-hand side, the
compiler needs to be able to see named constants which live in an
imported module and are accessed via attributes, and so on.
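One way to see why the compiler can't treat a module-level name as a constant today is to look at the bytecode. A small sketch using the standard `dis` module (function names are my own invention):

```python
import dis
import io

rows = 50  # a "constant" by convention only; the compiler can't assume it

def area_literal():
    return 50 * 50        # literal expression: the compiler can fold this

def area_named():
    return rows * rows    # name: must be looked up afresh on every call

def uses_global_lookup(func):
    # Disassemble the function and check whether any instruction
    # performs a runtime global-name lookup.
    buf = io.StringIO()
    dis.dis(func, file=buf)
    return "LOAD_GLOBAL" in buf.getvalue()

print(uses_global_lookup(area_literal))  # False: no runtime lookup needed
print(uses_global_lookup(area_named))    # True: rows might have been rebound
```

The literal version is folded at compile time, while the named version pays two dictionary lookups per call, precisely because nothing guarantees `rows` is still 50.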
>> * Without a way of defining compile-time constants, very useful language
>> features such as 'switch' are not practical.
> Maybe, but I honestly don't miss 'switch' all that often - and when I
> do, it's usually because I want a range. Consider this
> semi-hypothetical Pike code:
>
>     switch (key)
>     {
>         case 'A'..'Z': case 'a'..'z': return "Letter";
>         case 0xFF00..0xFFFF: return "Special key";
>         case '.': case ',': case '!': return "Punctuation";
>         default: return "Other";
>     }
>
> With a small number of cases, this wouldn't be too bad in if/elif
> form; but as it gets bigger, the value of a switch block becomes
> evident. The normal advice in Python is "use a dispatch dictionary",
> but that would be impractical here. So this would end up being a
> massive if/elif tree, because there's just no better way to do this.
Your example doesn't really need a switch: a range check plus a
predefined list or tuple mapping each value 0 to 255 to the right
string would do. Otherwise a list of functions indexed by the switch
key would generally work, and might be faster than an if/elif chain;
but that needs some untidy setup code, and the case bodies end up
spread out across various functions.

A proper 'switch' statement takes care of all that setup code and keeps
the blocks of code localised. Selecting the correct bit of code is also
very fast, as it's done inside the interpreter with a single byte-code
(well, a load and a switch).

(See yet another microbenchmark below.)
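The table approach, applied to Chris's Pike example, might look something like this (a sketch only; the 0xFF00..0xFFFF range is handled with an explicit check, since those values sit outside a 256-entry table):

```python
# Precompute a 256-entry category table once, so the per-key work is a
# single list index plus one range check for the out-of-table values.

def build_table():
    table = ["Other"] * 256
    for c in range(ord('A'), ord('Z') + 1):
        table[c] = "Letter"
    for c in range(ord('a'), ord('z') + 1):
        table[c] = "Letter"
    for c in map(ord, ".,!"):
        table[c] = "Punctuation"
    return table

TABLE = build_table()

def classify(key):
    if 0xFF00 <= key <= 0xFFFF:   # special keys sit outside the table
        return "Special key"
    if 0 <= key < 256:
        return TABLE[key]
    return "Other"

print(classify(ord('Q')))   # Letter
print(classify(ord('!')))   # Punctuation
print(classify(0xFF10))     # Special key
```

This is the "untidy setup code" trade-off in miniature: the lookup itself is fast, but the case logic has migrated out of the control flow and into table-building code.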
>> * Python 3 has decided to have a single 'long' type for integers, instead of
>> separate int (32- or 64-bit) and long types. This gives a noticeable
>> slow-down in programs executing lots of integer code, compared with Python 2.
> That said, though: I suspect the reason nobody's gone and done this is
> not that it wouldn't be any benefit, but that applications doing a lot
> of integer code are usually going to see more benefit from Numpy,
> Cython, or PyPy.
You'd be surprised how much any kind of program relies on ad-hoc integer
operations. It doesn't need to work with large int arrays for them to be
important. (Look at the benchmark below: the inner loop is nearly all
integer ops.)
> For anything other than individual characters, this is likely to
> perform better. However, I was unable to see this, because *CPython
> does optimize string appends*.
I guess mine doesn't! (I don't think my PC is 30 times slower than yours.)
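For reference, the two usual ways of building a string incrementally. CPython can often resize the string in place for the `+=` form (when it holds the only reference), but `''.join` is the portable advice, since that optimization isn't guaranteed on other implementations:

```python
# Building a string piece by piece: repeated += versus a single join.

def build_with_append(parts):
    s = ""
    for p in parts:
        s += p          # may be an in-place resize in CPython
    return s

def build_with_join(parts):
    return "".join(parts)  # one allocation, works the same everywhere

parts = ["ab", "cd", "ef"]
print(build_with_append(parts))   # abcdef
print(build_with_join(parts))     # abcdef
```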
----------------------------
'Switch' testing benchmark. The little program shown below reads a text
file (I used the entire CPython C sources, 6MB), and counts the number
of characters in each of the categories upper, lower, digit and other.

(Note there are other ways to approach this task, but a proper 'lexer'
usually does more than count. 'Switch' then becomes invaluable.)
On my machine, I got 2.8 - 3.6 seconds (PyPy 0.4 seconds).
I tried my bytecode language, with the same if-else chain, and it was
0.6 seconds. Then I used 'switch', and it halved to 0.3 seconds.
And the switch here looks at only 3 different groups; typical switch
statements might have dozens. Switch can make a big difference! However,
because of the problem with consts, it would be awkward to add to Python
(and it's unlikely to happen anyway).
(I then tried something else, and got 0.08 seconds ...
http://pastebin.com/whSbieiW for the (non-Python) code.)
upper = 0
lower = 0
digits = 0
other = 0

def lex(data):
    global upper, lower, digits, other
    AA = ord('A')
    ZZ = ord('Z')
    aa = ord('a')
    zz = ord('z')
    ZERO = ord('0')
    NINE = ord('9')
    for c in data:
        k = ord(c)
        if AA <= k <= ZZ:
            upper += 1
        elif aa <= k <= zz:
            lower += 1
        elif ZERO <= k <= NINE:
            digits += 1
        else:
            other += 1

print("READING")
with open("big", "r") as myfile:
    data = myfile.read()

print("LEXING")
lex(data)

print("Upper", upper)
print("Lower", lower)
print("Digits", digits)
print("Other", other)
print("Total", upper + lower + digits + other)
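For comparison, the table approach mentioned earlier can be applied to the same lexer: the if/elif chain becomes one precomputed 128-entry category list, so the per-character work is a single index. This is only a sketch (counts go in a list rather than four globals, and the short test string stands in for the real input file):

```python
# Table-driven variant of the benchmark lexer: one list index per
# character instead of an if/elif chain.

UPPER, LOWER, DIGIT, OTHER = range(4)

def build_categories(size=128):
    cats = [OTHER] * size
    for c in range(ord('A'), ord('Z') + 1):
        cats[c] = UPPER
    for c in range(ord('a'), ord('z') + 1):
        cats[c] = LOWER
    for c in range(ord('0'), ord('9') + 1):
        cats[c] = DIGIT
    return cats

CATS = build_categories()

def lex_table(data):
    counts = [0, 0, 0, 0]
    cats = CATS
    for c in data:
        k = ord(c)
        counts[cats[k] if k < 128 else OTHER] += 1
    return counts

print(lex_table("Hello, World! 123"))   # [2, 8, 3, 4]
```

Whether this beats the if/elif chain in CPython would need measuring; the point is that it's the kind of setup code a real 'switch' statement would make unnecessary.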
--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list