On 11/03/2016 21:36, Chris Angelico wrote:
On Sat, Mar 12, 2016 at 5:57 AM, BartC <b...@freeuk.com> wrote:

Functions are dynamic. That is, you don't know the F in F() is actually a 
function, even if it was defined a few lines previously, because it could have 
been rebound to something else. That means a runtime check to ensure that it is 
one.

All of these are really one thing: When you compile some code, you
don't know whether F() is still going to be the same object when you
run it. FAT Python is intended to pick this up;

That's an interesting project. And they seem to understand the problems.

 it recognizes
constants, and can then perform all the other optimizations you
describe (since functions are themselves immutable, all you need is an
identity check on the function object itself, and everything else you
describe can be done safely).

You mean the compiler assumes it's a particular function, but adds a runtime check? (Perhaps by generating two call sequences, one fast, and selecting the path at runtime.)

* Variables are dynamic. So you can't for example declare (with a new const 
feature)
'const rows=50'

Definitely agree with this. Having a way to declare that a name is
"truly constant" would be extremely handy; there currently isn't a
way, and I'm not sure whether FAT Python is looking into this or not.
I think it's more focusing on the situation where something simply has
never been rebound, which is probably the better optimization; but
sometimes it'd be nice to be able to assert that something *will not*
be rebound, to help with self-documenting code. (Note that "constant"
implies both "never rebound" and "immutable".

The 'const' prefix here is intended to define a named constant (a numeric literal with a convenient alias) rather than some 'read-only attribute for a conventional variable.

But introducing that into Python would be a can of worms. (A named constant needs a compile-time expression on the right-hand-side. The compiler needs to be able to see named constants which are in an imported module, and which are accessed via attributes, and so on.)

  * Without a way of defining compile-time constants, very useful language 
features such as 'switch' are not practical.

Maybe, but I honestly don't miss 'switch' all that often - and when I
do, it's usually because I want a range. Consider this
semi-hypothetical Pike code:

switch (key)
{
     case 'A'..'Z': case 'a'..'z': return "Letter";
     case 0xFF00..0xFFFF: return "Special key";
     case '.': case ',': case '!': return "Punctuation";
     default: return "Other";
}

With a small number of cases, this wouldn't be too bad in if/elif
form; but as it gets bigger, the value of a switch block becomes
evident. The normal advice in Python is "use a dispatch dictionary",
but that would be impractical here. So this would end up being a
massive if/elif tree, because there's just no better way to do this.

Your example doesn't really need a switch: a range check and a predefined list or tuple which maps 0 to 255 to a certain string.

Otherwise a list of functions indexed by the switch key would generally do, and might be faster than an if-elsif chain. But there's some untidy setup code and the rest would be spread out across various functions.

A proper 'switch' statement takes care of all that setup code, and keeps the blocks of code localised. And selection of the correct bit of code is very fast, as it's done inside the interpreter with a single byte-code (well, load and switch).

(See yet another microbenchmark below.)

* Python 3 has decided to have a single 'long' type for integers, instead of 
separate int (32 or 64-bit) and long types. This gives a noticeable slow-down 
in programs executing lots of integer code, compared with Python 2.

That said, though: I suspect the reason nobody's gone and done this is
not that it wouldn't be any benefit, but that applications doing a lot
of integer code are usually going to see more benefit from Numpy,
Cython, or PyPy,

You'd be surprised how much any kind of program relies on adhoc integer operations. It doesn't need to work with large int arrays for them to be important. (Look at the benchmark below: the inner loop is nearly all integer ops.)

For anything other than individual characters, this is likely to
perform better. However, I was unable to see this, because *CPython
does optimize string appends*.

I guess mine doesn't! (I don't think my PC is 30 times slower than yours.)

----------------------------

'Switch' testing benchmark. The little program show below reads a text file (I used the entire CPython C sources, 6MB), and counts the number of characters of each category in upper, lower, digit and other.

(Note there are other ways to approach this task, but a proper 'lexer' usually does more than count. 'Switch' then becomes invaluable.)

On my machine, I got 2.8 - 3.6 seconds (PyPy 0.4 seconds).

I tried my bytecode language, with the same if-else chain, and it was 0.6 seconds. Then I used 'switch', and it halved to 0.3 seconds.

And the switch here looks at only 3 different groups; typical switch statements might have dozens. Switch can make a big difference! However, because of the problem with consts, it would be awkward to add to Python (it would be unlikely anyway).

(I then tried something else, and got 0.08 seconds ... http://pastebin.com/whSbieiW for the (non-Python) code.)


upper=0
lower=0
digits=0
other=0

def lex(data):
        global upper,lower,digits,other
        AA=ord('A')
        ZZ=ord('Z')
        aa=ord('a')
        zz=ord('z')
        ZERO=ord('0')
        NINE=ord('9')

        for c in data:
                k=ord(c)
                if AA<=k<=ZZ:
                        upper+=1
                elif aa<=k<=zz:
                        lower+=1
                elif ZERO<=k<=NINE:
                        digits+=1
                else:
                        other+=1

print ("READING")

with open ("big", "r") as myfile:
        data=myfile.read()

print ("LEXING")

lex(data)

print ("Upper",upper)
print ("Lower",lower)
print ("Digits",digits)
print ("Other",other)
print ("Total",upper+lower+digits+other)


--
Bartc


--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to