[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2011-02-21 Thread Christos Georgiou

Χρήστος Γεωργίου (Christos Georgiou) t...@users.sourceforge.net added the 
comment:

This is not the proper place for it, but in the 3.2 and 2.7 news it is reported 
that “The multi-argument form of operator.attrgetter() function now runs 
slightly faster” while it should be “The multi-argument form of 
operator.attrgetter() function now runs slightly faster and the single-argument 
form runs much faster”.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-31 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

The PyUnicode_GET_SIZE issue was still there, but I've fixed it and committed 
in r86036. Thanks for your contribution!

--
resolution:  - fixed
stage: patch review - committed/rejected
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-30 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Some comments about the patch:
- this seems to be some dead code:

 if (nattrs = 1) {
 if (!PyArg_UnpackTuple(args, attrgetter, 1, 1, attr))
 return NULL;

- you can't call PyUnicode_GET_SIZE or PyUnicode_AS_UNICODE before you have 
called PyUnicode_Check. If the object is not an unicode object, it triggers an 
assertion in debug mode:
python: ./Modules/operator.c:404: attrgetter_new: Assertion 
`((PyObject*)(item))-ob_type))-tp_flags  ((1L28))) != 0)' failed.

- you should also call PyUnicode_InternInPlace() on the last dotless name 
(after the for loop)

- this comment looks false:
  /* attrgetter_new code should ensure we never come here */

- in dotted_getattr, when attr is a tuple, you leak references to intermediate 
objects (because you don't decref obj before doing newobj = obj)


Other than that, looks quite worthwhile.

--
nosy: +pitrou

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-30 Thread Christos Georgi ou

Χρήστος Γεωργίου (Christos Georgiou) t...@users.sourceforge.net added the 
comment:

Thank you very much, Antoine, for your review. My comments in reply:

- the dead code: it's not dead, IIRC it ensures that at least one argument is 
given, otherwise it raises an exception.

- PyUnicode_GET_SIZE: you're right. The previous patch didn't have this 
problem, because there were two loops: the first one made sure in advance that 
all arguments are PyUnicode.

- the false comment: right again. A remain from the first patch.

- dotted_getattr and references: right! I should have noted better what 
Raymond's initial loop did.

Attached a corrected version of the patch according to Antoine's comments.

--
Added file: http://bugs.python.org/file19440/issue10160.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-30 Thread Raymond Hettinger

Raymond Hettinger rhettin...@users.sourceforge.net added the comment:

The patch looks fine.

You're overview of the process presented here in the tracker would make a nice 
comment in the code. 

If Antoine is happy with the latest patch, then it's ready to apply.

--
assignee: rhettinger - pitrou

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-23 Thread Christos Georgi ou

Χρήστος Γεωργίου (Christos Georgiou) t...@users.sourceforge.net added the 
comment:

A newer version of the patch with the following changes:

- single loop in the ag-attr setup phase of attrgetter_new; interning of the 
stored attribute names
- added two more tests of invalid attrgetter parameters (.attr, attr.)

--
Added file: http://bugs.python.org/file19339/issue10160-2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-22 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

I am not sure whether I should attach a zip/tar file with both the attachments 
(the sample benchmark and the diff); so I'll attach the diff in a further 
comment.

Uploading separate files with .py, .diff extensions that can be viewed in a 
browser is indeed the right thing to do.\

--
nosy: +terry.reedy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-21 Thread Christos Georgi ou

Χρήστος Γεωργίου (Christos Georgiou) t...@users.sourceforge.net added the 
comment:

Modules/operator.c grows by ~70 lines, most of it the setup code for ag-attr; 
also I loop twice over the args of attrgetter_new, choosing fast code that runs 
once per attrgetter creation than temporary data.

Alex's suggestion to make use of Python-level functions to shorten the code of 
attrgetter_new could obviously work to decrease the source lines. I don't know 
how fast I would produce such a version if requested, though.

Whatever the way attrgetter_new sets up the data, I would suggest that you keep 
the logic changes in general, i.e. set-up in attrgetter_new and keep a thinner 
dotted_getattr , since it avoids running the same checks and splitting over and 
over again for every attrgetter_call invocation.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-21 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
components:  -Documentation
nosy:  -d...@python

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-20 Thread Christos Georgi ou

New submission from Χρήστος Γεωργίου (Christos Georgiou) 
t...@users.sourceforge.net:

(Discovered in that StackOverflow answer: 
http://stackoverflow.com/questions/3940518/3942509#3942509 ; check the comments 
too)

operator.attrgetter in its simplest form (i.e. with a single non-dotted name) 
needs more time to execute than an equivalent lambda expression.

Attached file so3940518.py runs a simple benchmark comparing: a list 
comprehension of plain attribute access; attrgetter; and lambda. I will append 
sample benchmark times at the end of the comment.

Browsing Modules/operator.c, I noticed that the dotted_getattr function was 
using PyUnicode_Check and (possibly) splitting on dots on *every* call of the 
attrgetter, which I thought to be most inefficient.

I changed the py3k-daily-snapshot source to make the PyUnicode_Check calls in 
the attrgetter_new function; also, I modified the algorithm to pre-parse the 
operator.attrgetter functions for possible dots in the names, in order for the 
dotted_getattr function to become speedier.

The only “drawback” is that now operator.attrgetter raises a TypeError on 
creation, not on subsequent calls of the attrgetter object; this shouldn't be a 
compatibility problem. However, I obviously had to update both 
Doc/library/operator.rst and Lib/test/test_operator.py .

I am not sure whether I should attach a zip/tar file with both the attachments 
(the sample benchmark and the diff); so I'll attach the diff in a further 
comment.

On the Ubuntu server 9.10 where I made the changes, I ran the so3940518.py 
sample benchmark before and after the changes.

Run before the changes (third column is seconds, less is better):

list comp 0.40925 100
map attrgetter 1.3897 100
map lambda 1.0098 100

Run after the changes:

list comp 0.40036 100
map attrgetter 0.5196 100
map lambda 0.96 100

--
assignee: d...@python
components: Documentation, Library (Lib), Tests
files: so3940518.py
messages: 119247
nosy: d...@python, tzot
priority: normal
severity: normal
status: open
title: operator.attrgetter slower than lambda after adding dotted names ability
versions: Python 3.2
Added file: http://bugs.python.org/file19310/so3940518.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-20 Thread Christos Georgi ou

Χρήστος Γεωργίου (Christos Georgiou) t...@users.sourceforge.net added the 
comment:

Here comes the diff to Modules/operator.c, Doc/library/operator.rst and 
Lib/test/test_operator.py . As far as I could check, there are no leaks, but a 
more experienced eye in core development could not hurt. Also, obviously 
test_operatory.py passes all tests.

Should this be accepted, I believe it should be backported to 2.7 (at least). I 
can do that, just let me know.

--
keywords: +patch
Added file: http://bugs.python.org/file19312/issue10160.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-20 Thread Benjamin Peterson

Changes by Benjamin Peterson benja...@python.org:


--
assignee: d...@python - rhettinger
nosy: +rhettinger

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-20 Thread Christos Georgi ou

Changes by Χρήστος Γεωργίου (Christos Georgiou) t...@users.sourceforge.net:


Removed file: http://bugs.python.org/file19312/issue10160.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-20 Thread Christos Georgi ou

Χρήστος Γεωργίου (Christos Georgiou) t...@users.sourceforge.net added the 
comment:

Newer version of the diff, since I forgot some if(0) fprintf debug calls that 
shouldn't be there.

--
Added file: http://bugs.python.org/file19313/issue10160.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-20 Thread Christos Georgi ou

Χρήστος Γεωργίου (Christos Georgiou) t...@users.sourceforge.net added the 
comment:

An explanation to the changes.

The old code kept the operator.itemgetter arguments in the ag-attr member. If 
the argument count (ag-nattrs) was 1, the single argument was kept; if more 
than 1, a tuple of the original arguments was kept.

On every attrgetter_call call, if ag-nattrs was 1, dotted_getattr was called 
with the plain ag-attr as attribute name; if  2, dotted_getattr was called 
for every one of the original arguments.

Now, ag-attr is always a tuple, containing either dotless strings or tuples of 
dotless strings:

operator.attrgetter(name1, name2.name3, name4)

stores (name1, (name2, name3), name4) in ag-attr.

dotted_getattr accordingly chooses based on type (either str or tuple, ensured 
by attrgetter_new) whether to do a single access or a recursive one.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-20 Thread Raymond Hettinger

Raymond Hettinger rhettin...@users.sourceforge.net added the comment:

Thanks for noticing this.  I wasn't aware that it had slowed down after dotted 
name support had been added.

Since it is a mild performance issue, I'm lowering the priority.  Will take a 
further look when I get the chance.  A first look at the patch shows that it is 
bigger than I expected.  Isn't it possible to check for a dot on construction 
and just record the boolean so that a fast, non-splitting path can be used.

I'm reluctant to make the code volume grow much for this function.  It is 
already a bit voluminous for the minor convenience it offers.

--
priority: normal - low
stage:  - patch review
type:  - performance

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10160] operator.attrgetter slower than lambda after adding dotted names ability

2010-10-20 Thread Alex

Alex alex.gay...@gmail.com added the comment:

Voice of ignorance here: why can't this be implemented in the naive way one 
might in Python, use the existing string splitting algorithms of stringlib, but 
just leave it in __new__.

--
nosy: +alex

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10160
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com