Re: [Python-Dev] Review request: issue 27350, compact ordered dict

2016-08-10 Thread Maciej Fijalkowski
Hello everyone.

I did do only a cursory look on that one, but I would like to
reiterate that this gives huge benefits in general and we measured
nice speedups on pypy (where all the dicts are ordered forever):

* you can essentially kill OrderedDict or make it almost OrderedDict =
dict (in pypy it's a simple dict subclass that has one or two extra
things that OrderedDict has in the API)
* there are nice speedups
* the C version of OrderedDict can be killed
* it saves memory, on 64bit by quite a bit (not everyone stores more
than 4bln items in a dictionary)
* it solves the problem of tests relying on order in dictionaries

In short, it has no downsides

On Tue, Aug 9, 2016 at 3:12 PM, INADA Naoki  wrote:
> Hi, devs.
>
> I've implemented compact and ordered dictionary [1], which PyPy
> implemented in 2015 [2].
>
> Since it is my first large patch, I would like to have enough time for
> review cycle by Python 3.6 beta1.
>
> Could someone review it?
>
> [1] http://bugs.python.org/issue27350
> [2] 
> https://morepypy.blogspot.jp/2015/01/faster-more-memory-efficient-and-more.html
>
> --
> INADA Naoki  
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Rewrite @contextlib.contextmanager in C

2016-08-10 Thread Nick Coghlan
On 10 August 2016 at 04:43, Giampaolo Rodola'  wrote:

>
>
> On Tue, Aug 9, 2016 at 3:30 PM, Nick Coghlan  wrote:
>
>> On 9 August 2016 at 23:26, Nick Coghlan  wrote:
>>
>>> On 9 August 2016 at 06:18, Guido van Rossum  wrote:
>>>
 I think Nick would be interested in understanding why this is the case.
 What does the decorator do that could be so expensive?

>>>
>>> Reviewing https://hg.python.org/cpython/file/default/Lib/contextlib.py
>>> #l57, Chris's analysis seems plausible to me
>>>
>>
>> Sorry Wolfgang - I missed that Chris was expanding on a comparison you
>> initially made!
>>
>> Either way, I agree that aspect does make up the bulk of the difference
>> in speed, so moving to C likely wouldn't help much. However, the speed
>> difference relative to the simpler warppers is far less defensible - I
>> think there are some opportunities for improvement there, especially around
>> moving introspection work out of _GeneratorContextManager.__init__ and
>> into the contextmanager decorator itself.
>>
>
> By moving the __doc__ introspection out of __init__ and by introducing
> __slots__ I got from -4.37x to -3.16x (__slot__ speedup was about +0.3x).
>

The draft changes aren't preserving the semantics (which may suggest we
have some missing test cases), so we can't take timing numbers just yet.


> Chris' SimplerContextManager solution is faster because it avoids the
> factory function but that is necessary for supporting the decoration of
> methods. Here's a PoC:
>
>
> diff --git a/Lib/contextlib.py b/Lib/contextlib.py
> index 7d94a57..45270dd 100644
> --- a/Lib/contextlib.py
> +++ b/Lib/contextlib.py
> @@ -2,7 +2,7 @@
>  import abc
>  import sys
>  from collections import deque
> -from functools import wraps
> +from functools import wraps, update_wrapper
>
>  __all__ = ["contextmanager", "closing", "AbstractContextManager",
> "ContextDecorator", "ExitStack", "redirect_stdout",
> @@ -57,25 +57,18 @@ class ContextDecorator(object):
>  class _GeneratorContextManager(ContextDecorator, AbstractContextManager):
>  """Helper for @contextmanager decorator."""
>
> +__slots__ = ['gen', 'funcak']
> +
>

Note that this is is still keeping the __dict__ allocation (you'd have to
make ContexDecorator and AbstractContextManager use __slots__ as well to
eliminate it).

While there's likely some speedup just from the dict->descriptor shift,
let's see how far we can improve *without* taking the __slots__ step, and
then consider the question of adopting __slots__ (and the backwards
compatibility implications if we take it as far as eliminating __dict__)
separately.


>  def __init__(self, func, args, kwds):
>  self.gen = func(*args, **kwds)
> -self.func, self.args, self.kwds = func, args, kwds
> -# Issue 19330: ensure context manager instances have good
> docstrings
> -doc = getattr(func, "__doc__", None)
> -if doc is None:
> -doc = type(self).__doc__
> -self.__doc__ = doc
> -# Unfortunately, this still doesn't provide good help output when
> -# inspecting the created context manager instances, since pydoc
> -# currently bypasses the instance docstring and shows the
> docstring
> -# for the class instead.
> -# See http://bugs.python.org/issue19404 for more details.
> +self.funcak = func, args, kwds
>

This actually gives me an idea - we should compare the performance of the
current code with that of using a functools.partial object defined in the
decorator function, so only the partial object needs to be passed in to
_GeneratorContextManager, and we can drop the args and kwds parameters
entirely.

That should move some more of the work out of the per-use code in
_GeneratorContextManager.__init__ and into the per-definition code in the
contextmanager decorator, and "self.gen = func()" should then be faster
than "self.gen = func(*args, **kwds)".


>  def _recreate_cm(self):
>  # _GCM instances are one-shot context managers, so the
>  # CM must be recreated each time a decorated function is
>  # called
> -return self.__class__(self.func, self.args, self.kwds)
> +func, args, kwds = self.funcak
> +return self.__class__(func, args, kwds)
>
>  def __enter__(self):
>  try:
> @@ -157,6 +150,8 @@ def contextmanager(func):
>  @wraps(func)
>  def helper(*args, **kwds):
>  return _GeneratorContextManager(func, args, kwds)
> +
> +update_wrapper(helper, func)
>  return helper
>

As Chris noted, this probably isn't having the effect you want - to move
the introspection code out to the decorator, you still need to pull __doc__
from the function (at decoration time) and set it on the
_GeneratorContextManager instance (at instance creation time).

However, we're getting down to discussing proposed patch details now, so
it's probably time to file an enhancement issue on the tracker and move
further discussion there :)

Re: [Python-Dev] Review request: issue 27350, compact ordered dict

2016-08-10 Thread INADA Naoki
Thanks you for supporting.

BTW, my patch doesn't include OrderedDict speedup. (I'll try it when
compact dict is merged.)

In case of Python 3, OrderedDict.move_to_end(key, last=False) can append
item at front.
So implementing OrderedDict based on compact dict is not so easy as
OrderedDict = dict.

My idea is using compact_array (described in PyPy blog) as ring, not fixed
capacity array.
It means one of these two:

a. dict has one additional word and support ring internally.
b. OrderedDict reimplements many APIs (iterating, resizing, etc...) to
support ring.

I like (a) because it's easier to implement and maintain.
But I don't know it can be in time for Python 3.6 beta.

On Thu, Aug 11, 2016 at 12:06 AM Maciej Fijalkowski 
wrote:

> Hello everyone.
>
> I did do only a cursory look on that one, but I would like to
> reiterate that this gives huge benefits in general and we measured
> nice speedups on pypy (where all the dicts are ordered forever):
>
> * you can essentially kill OrderedDict or make it almost OrderedDict =
> dict (in pypy it's a simple dict subclass that has one or two extra
> things that OrderedDict has in the API)
> * there are nice speedups
> * the C version of OrderedDict can be killed
> * it saves memory, on 64bit by quite a bit (not everyone stores more
> than 4bln items in a dictionary)
> * it solves the problem of tests relying on order in dictionaries
>
> In short, it has no downsides
>
> On Tue, Aug 9, 2016 at 3:12 PM, INADA Naoki 
> wrote:
> > Hi, devs.
> >
> > I've implemented compact and ordered dictionary [1], which PyPy
> > implemented in 2015 [2].
> >
> > Since it is my first large patch, I would like to have enough time for
> > review cycle by Python 3.6 beta1.
> >
> > Could someone review it?
> >
> > [1] http://bugs.python.org/issue27350
> > [2]
> https://morepypy.blogspot.jp/2015/01/faster-more-memory-efficient-and-more.html
> >
> > --
> > INADA Naoki  
> > ___
> > Python-Dev mailing list
> > Python-Dev@python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Rewrite @contextlib.contextmanager in C

2016-08-10 Thread Chris Angelico
On Wed, Aug 10, 2016 at 4:43 AM, Giampaolo Rodola'  wrote:
> Chris' SimplerContextManager solution is faster because it avoids the
> factory function but that is necessary for supporting the decoration of
> methods.

Hooking a tangent onto this: I have no idea, based on the current
source code, which layers of complexity are necessary for which
reasons. Possibly as a side effect of this transformation, you might
consider adding some comments to the code? There are a few (citing
tracker issues such as #11647 and #27112), but they don't cover
everything.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com