Re: [Python-Dev] Inconsistency between PEP492, documentation and behaviour of "async with"

2017-06-19 Thread Damien George
> > Can someone please clarify the exact behaviour of "async with"?
>
> "async with" is expected to behave essentially the same way that
> normal "with" does as far as return, break, and continue are concerned
> (i.e. calling __aexit__ without an exception set, so it's more like
> try/finally than it is try/else).
>
> Would you mind filing a documentation bug for that? We clearly missed
> that the semantics described in the new documentation didn't actually
> match the original with statement semantics (even though matching
> those semantics is the intended behaviour).

Ok, bug filed at: http://bugs.python.org/issue30707
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Inconsistency between PEP492, documentation and behaviour of "async with"

2017-06-19 Thread Damien George
Hi all,

Regarding the behaviour of the "async with" statement: it seems that its
description in PEP 492 and in the language documentation does not match
the behaviour of CPython (v3.6.1).

The PEP and the docs here:
https://www.python.org/dev/peps/pep-0492/#asynchronous-context-managers-and-async-with
https://docs.python.org/3/reference/compound_stmts.html#async-with
say that "async with" is equivalent to a particular use of try/except/else.

But the implementation seems more like a try/except/finally, because
__aexit__ is always executed, even if a return statement is in the try
block ("else" won't be executed if there's a "return" in the "try").  Also,
as with normal "with", the implementation is a bit more complex than
try/except/finally, because you don't want to execute the __aexit__ method
twice if there is an exception in the try.
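(For concreteness, here is a small runnable check of the try/finally-style
behaviour; the class and names are invented for illustration, and
asyncio.run needs Python 3.7+.)

```python
import asyncio

class Recorder:
    """Toy async context manager that records whether __aexit__ ran."""
    def __init__(self):
        self.exited = False
    async def __aenter__(self):
        return self
    async def __aexit__(self, exc_type, exc, tb):
        self.exited = True
        return False  # don't suppress exceptions

async def use(mgr):
    async with mgr:
        return "early"  # leaves the block via return

rec = Recorder()
result = asyncio.run(use(rec))
print(result, rec.exited)  # CPython: "early True" -- __aexit__ still ran
```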

Can someone please clarify the exact behaviour of "async with"?

Background: in implementing "async with" in MicroPython, we went by the
PEP/docs, and now our behaviour doesn't match that of CPython.

Cheers,
Damien.


Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Damien George
Hi Yury,

That's great news about the speed improvements with the dict offset cache!

> The cache struct is defined in code.h [2], and is 32 bytes long. When a
> code object becomes hot, it gets a cache offset table allocated for it
> (+1 byte for each opcode) + an array of cache structs.

Ok, so each opcode has a 1-byte cache that sits separately from the
actual bytecode.  But a lot of opcodes don't use it, so that leads to
some wasted memory, correct?

But then how do you index the cache: do you keep a count of the
current opcode number?  If I remember correctly, CPython has some
opcodes taking 1 byte and some taking 3 bytes, so an offset into the
bytecode cannot easily be mapped to an opcode number.
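(One way to do such a mapping, sketched here as toy Python with invented
opcode values and widths, is a single linear scan that records a sequential
opcode index for each byte offset; a per-opcode side table can then be
addressed by that index.)

```python
# Hypothetical opcode widths, loosely in the style of pre-3.6 CPython
# (1-byte opcodes, or 3 bytes when the opcode carries a 2-byte argument).
WIDTHS = {0x64: 3, 0x7C: 3, 0x17: 1, 0x53: 1}

def opcode_indices(bytecode):
    """Map byte offset -> sequential opcode index for variable-width bytecode."""
    table, offset, index = {}, 0, 0
    while offset < len(bytecode):
        table[offset] = index
        offset += WIDTHS[bytecode[offset]]  # skip over opcode + its argument
        index += 1
    return table

code = bytes([0x64, 0, 0, 0x7C, 0, 0, 0x17, 0x53])
print(opcode_indices(code))  # {0: 0, 3: 1, 6: 2, 7: 3}
```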

Cheers,
Damien.


Re: [Python-Dev] Speeding up CPython 5-10%

2016-01-29 Thread Damien George
Hi Yury,

> An off-topic: have you ever tried hg.python.org/benchmarks
> or compare MicroPython vs CPython?  I'm curious if MicroPython
> is faster -- in that case we'll try to copy some optimization
> ideas.

I've tried a small number of those benchmarks, but not in any rigorous
way, and not enough to compare properly with CPython.  Maybe one day I
(or someone) will get to it and report results :)

One thing that makes MP fast is the use of pointer tagging and
stuffing of small integers within object pointers.  Thus integer
arithmetic below 2**30 (on a 32-bit arch) requires no heap allocation.
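(A toy model of such a tagging scheme, in pure Python just to illustrate
the idea; MicroPython does this in C on real machine words, and the exact
tag layout here is invented.)

```python
def tag_small_int(n, word_bits=32):
    """Encode a small int in a machine word with the low bit set as a tag."""
    assert -(1 << (word_bits - 2)) <= n < (1 << (word_bits - 2)), "doesn't fit"
    return ((n << 1) | 1) & ((1 << word_bits) - 1)

def untag(word, word_bits=32):
    """Decode a tagged word back to a signed int (arithmetic shift right)."""
    assert word & 1, "not a tagged small int"
    if word >= 1 << (word_bits - 1):   # sign-extend the word
        word -= 1 << word_bits
    return word >> 1

assert untag(tag_small_int(12345)) == 12345
assert untag(tag_small_int(-7)) == -7   # round-trips without touching the heap
```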

> Do you use opcode dictionary caching only for LOAD_GLOBAL-like
> opcodes?  Do you have an equivalent of LOAD_FAST, or you use
> dicts to store local variables?

The opcodes that have dict caching are:

LOAD_NAME
LOAD_GLOBAL
LOAD_ATTR
STORE_ATTR
LOAD_METHOD (not implemented yet in mainline repo)

For local variables we use LOAD_FAST and STORE_FAST (and DELETE_FAST).
Actually, there are 16 dedicated opcodes for loading from positions
0-15, and 16 for storing to these positions.  Eg:

LOAD_FAST_0
LOAD_FAST_1
...

Mostly this is done to save RAM, since LOAD_FAST_0 is 1 byte.
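(A toy emitter showing the saving; the opcode values and the fallback
encoding are invented for illustration.)

```python
LOAD_FAST_BASE = 0xB0  # hypothetical: LOAD_FAST_0 .. LOAD_FAST_15
LOAD_FAST_N = 0xC0     # hypothetical: generic form with an operand byte

def emit_load_fast(slot):
    """Emit the shortest encoding for loading local variable `slot`."""
    if slot < 16:
        return bytes([LOAD_FAST_BASE + slot])  # slot folded into the opcode: 1 byte
    return bytes([LOAD_FAST_N, slot])          # opcode + operand: 2 bytes

assert len(emit_load_fast(3)) == 1
assert len(emit_load_fast(40)) == 2
```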

> If we change the opcode size, it will probably affect libraries
> that compose or modify code objects.  Modules like "dis" will
> also need to be updated.  And that's probably just the tip of the
> iceberg.
>
> We can still implement your approach if we add a separate
> private 'unsigned char' array to each code object, so that
> LOAD_GLOBAL can store the key offsets.  It should be a bit
> faster than my current patch, since it has one less level
> of indirection.  But this way we lose the ability to
> optimize LOAD_METHOD, simply because it requires more memory
> for its cache.  In any case, I'll experiment!

The problem with that approach (having a separate array for offset_guess)
is: how do you know where to look in that array for a given
LOAD_GLOBAL opcode?  The second LOAD_GLOBAL in your bytecode should
look at the second entry in the array, but how does it know that?

I'd love to experiment implementing my original caching idea with
CPython, but no time!

Cheers,
Damien.


Re: [Python-Dev] Speeding up CPython 5-10%

2016-01-27 Thread Damien George
Hi Yuri,

I think these are great ideas to speed up CPython.  They are probably
the simplest yet most effective ways to get performance improvements
in the VM.

MicroPython has had LOAD_METHOD/CALL_METHOD from the start (inspired
by PyPy; the main reason to have it is that you don't need to
allocate on the heap when doing a simple method call).  The specific
opcodes are:

LOAD_METHOD # same behaviour as you propose
CALL_METHOD # for calls with positional and/or keyword args
CALL_METHOD_VAR_KW # for calls with one or both of */**

We also have LOAD_ATTR, CALL_FUNCTION and CALL_FUNCTION_VAR_KW for
non-method calls.

MicroPython also has dictionary lookup caching, but it's a bit
different to your proposal.  We do something much simpler: each opcode
that has a cache ability (eg LOAD_GLOBAL, STORE_GLOBAL, LOAD_ATTR,
etc) includes a single byte in the opcode which is an offset-guess
into the dictionary to find the desired element.  Eg for LOAD_GLOBAL
we have (pseudo code):

CASE(LOAD_GLOBAL):
    key = DECODE_KEY;
    offset_guess = DECODE_BYTE;
    if (global_dict[offset_guess].key == key) {
        // found the element straight away
    } else {
        // not found, do a full lookup and save the offset
        offset_guess = dict_lookup(global_dict, key);
        UPDATE_BYTECODE(offset_guess);
    }
    PUSH(global_dict[offset_guess].elem);
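(The same idea as runnable Python, names invented; the real implementation
is C inside the VM, and the "dict" here is modelled as a flat slot array.)

```python
class CachedGlobals:
    """Toy globals table where each call site keeps an offset guess."""
    def __init__(self, d):
        self.slots = list(d.items())  # (key, value) pairs at fixed offsets

    def lookup(self, key, guess):
        """Return (value, new_guess); fast path when the guess is right."""
        if 0 <= guess < len(self.slots) and self.slots[guess][0] == key:
            return self.slots[guess][1], guess      # cache hit, no search
        for i, (k, v) in enumerate(self.slots):     # slow path: full lookup
            if k == key:
                return v, i                         # caller stores new guess
        raise NameError(key)

g = CachedGlobals({"len": len, "print": print})
val, guess = g.lookup("print", 0)      # stale guess: slot 0 holds "len"
assert val is print and guess == 1
val, guess = g.lookup("print", guess)  # second execution hits the cache
assert guess == 1
```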

We have found that such caching gives a massive performance increase,
on the order of 20%.  The issue (for us) is that it increases bytecode
size by a considerable amount, requires writeable bytecode, and can be
non-deterministic in terms of lookup time.  Those things are important
in the embedded world, but not so much on the desktop.

Good luck with it!

Regards,
Damien.


Re: [Python-Dev] Speeding up CPython 5-10%

2016-01-27 Thread Damien George
Hi Yury,

(Sorry for misspelling your name previously!)

> Yes, we'll need to add CALL_METHOD{_VAR|_KW|etc} opcodes to optimize all
> kind of method calls.  However, I'm not sure how big the impact will be,
> need to do more benchmarking.

I never did such fine-grained analysis with MicroPython.  I don't
think there are enough uses of * and ** for it to be worth it, but
there are definitely lots of uses of plain keywords.  Also, you'd want
to consider how simple/complex it is to treat all these different
opcodes in the compiler.  For us, it's simpler to treat everything the
same.  Otherwise the LOAD_METHOD part of your compiler will need to
peek deep into the AST to see what kind of call it is.

> BTW, how do you benchmark MicroPython?

Haha, good question!  Well, we use Pystone 1.2 (unmodified) to do
basic benchmarking, and find it to be quite good.  We track our code
live at:

http://micropython.org/resources/code-dashboard/

You can see there the red line, which is the Pystone result.  There
was a big jump around Jan 2015 which is when we introduced opcode
dictionary caching.  And since then it's been very gradually
increasing due to small optimisations here and there.

Pystone is actually a great benchmark for embedded systems because it
gives very reliable results there (almost zero variation across runs)
and if we can squeeze 5 more Pystones out with some change then we
know that it's a good optimisation (for efficiency at least).

For us, low RAM usage and small code size are the most important
factors, and we track those meticulously.  But in fact, smaller code
size quite often correlates with more efficient code because there's
less to execute and it fits in the CPU cache (at least on the
desktop).

We do have some other benchmarks, but they are highly specialised for
us.  For example, how fast can you bit bang a GPIO pin using pure
Python code.  Currently we get around 200kHz on a 168MHz MCU, which
shows that pure (Micro)Python code is about 100 times slower than C.

> That's a neat idea!  You're right, it does require bytecode to become
> writeable.  I considered implementing a similar strategy, but this would
> be a big change for CPython.  So I decided to minimize the impact of the
> patch and leave the opcodes untouched.

I think you need to consider "big" changes, especially ones like this
that can have a great (and good) impact.  But really, this is a
behind-the-scenes change that *should not* affect end users, and so
you should not have any second thoughts about doing it.  One problem I
see with CPython is that it exposes way too much to the user (both the
Python programmer and the C extension writer), and this hurts both
language evolution (you constantly need to provide backwards
compatibility) and the ability to optimise.

Cheers,
Damien.


[Python-Dev] Clarification of PEP 394 for scripts that run under Python 2 and 3

2015-11-13 Thread Damien George
Hi python-dev,

We have a Python script that runs correctly under Python 2.6, 2.7 and 3.3+.
It is executed on a *nix system using the "python" executable (ie not
python2 or python3 specifically). This works just fine on systems that
have only Python 2 installed, or both 2 and 3, or just 3 with "python"
symlinked to "python3" (eg Arch Linux).

But it fails for systems that have only Python 3 and do not create a
"python" symlink, ie only "python3" exists as an executable.

We thought that PEP 394 would come to the rescue here but it seems to be
unclear on this point. In particular it says:

- 4th point of the abstract: "so python should be used in the shebang line
only for scripts that are source compatible with both Python 2 and 3"

- 6th point of the recommendation section: "One exception to this is
scripts that are deliberately written to be source compatible with both
Python 2.x and 3.x. Such scripts may continue to use python on their
shebang line without affecting their portability"

- 8th point in the migration notes: "If these conventions are adhered to,
it will become the case that the python command is only executed in an
interactive manner as a user convenience, or to run scripts that are source
compatible with both Python 2 and Python 3."

Well, that's pretty clear to me: one can expect the "python" executable to
be available to run scripts that are compatible with versions 2.x and 3.x.

The confusion comes because there are systems that install Python 3 without
creating a "python" symlink (hence breaking the above).  And Guido said
that "'python' should always be the same as 'python2'" (see
https://mail.python.org/pipermail/python-dev/2014-September/136389.html).
Further, Nick Coghlan seemed to agree that "when there's only python3
installed, there should be no /usr/bin/python" (see
https://mail.python.org/pipermail/python-dev/2014-September/136527.html).

My questions are:

1. What is the true intent of PEP 394 when only Python 3 is installed?  Is
"python" available or not to run scripts compatible with 2.x and 3.x?

2. Is it possible to write a shebang line that supports all variations of
Python installations on *nix machines?

3. If the answer to 2 is no, then what is the recommended way to support
all Python installations with one standalone script?
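(One known workaround for question 2, offered as a sketch and not an
official recommendation, is an sh/Python polyglot: the file is a valid
shell script whose first commands re-execute it under whichever interpreter
exists, and also valid Python in which the shell code is hidden inside a
triple-quoted string.)

```python
#!/bin/sh
''':'
# sh executes these lines; Python sees them as part of one string literal.
if command -v python3 >/dev/null 2>&1; then exec python3 "$0" "$@"; fi
exec python "$0" "$@"
'''
import sys
print("running under Python", sys.version_info[0])
```

The trick relies on `''':'` being the no-op `:` command in sh while opening
a triple-quoted string in Python; the `#!/bin/sh` shebang is always honoured,
so no `python` symlink is required.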

Thanks!

Regards,
Damien.


Re: [Python-Dev] async/await in Python; v2

2015-04-21 Thread Damien George
Hi Yury,

In your PEP 492 draft, in the Grammar section, I think you're missing
the modifications to the flow_stmt line.

Cheers,
Damien.