[issue35369] List sorting makes duplicate comparisons

2018-12-02 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35369] List sorting makes duplicate comparisons

2018-12-01 Thread David Wyde


David Wyde  added the comment:

Okay. Thanks!

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35369] List sorting makes duplicate comparisons

2018-12-01 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> I've heard of using human choices for comparisons, when fewer decisions could 
> provide a notable speedup.

Memoization is the usual solution when expensive computations are being 
repeated.  That technique preserves the work that was done and it avoids adding 
unnecessary complexity to the consumer code.

> Do you think it's worth posting to python-ideas to see 
> what people's use cases are?

No, that would be a waste of time and it wouldn't change the validity of Tim's 
insights.

--
nosy: +rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35369] List sorting makes duplicate comparisons

2018-12-01 Thread David Wyde


David Wyde  added the comment:

Thanks for the speedy and helpful response.

Keeping complexity down is fair. The wasted if-checks on subsequent iterations 
are certainly a negative trade-off. I saw that binarysort() is only called in 
one place, but I understand wanting to keep it generic.

I think that slow comparison functions, especially when repeatedly sorting 
short lists, are the main use case.

I don't know if that's common in performance-critical code. I've heard of using 
human choices for comparisons, when fewer decisions could provide a notable 
speedup. The patched code seems a bit slower in some situations, but is faster 
in others.

Do you think it's worth posting to python-ideas to see what people's use cases 
are?

--
Added file: https://bugs.python.org/file47964/sort-fix-2.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35369] List sorting makes duplicate comparisons

2018-11-30 Thread Tim Peters


Tim Peters  added the comment:

Yup, it can do some redundant comparisons; more on that here:

https://mail.python.org/pipermail/python-dev/2018-August/155020.html

I'm not inclined to make this already-difficult code even harder to understand 
for something that's quite likely not to matter - or even hurt.  For example, 
if we're sorting a million randomly ordered objects, we can expect about 20 
million comparisons.  9,800 is so close to zero compared to that it wouldn't be 
a measurable difference even if the new code came for free - which it doesn't.

It adds whole new piles of tests/branches, which will almost certainly be 
mis-predicted on the first iteration of the outer loop because they always fail 
the "initial" part on iterations after the first.  At best, they're cheap but 
pure do-nothing overhead on outer-loop iterations after the first.

I also dislike that binarysort grows a pile of undocumented assumptions about 
the _context_ it's called in.  As is, it can be used to sort any slice meeting 
the requirements spelled out in the comments.  "Leaking" requirements across 
function boundaries is fragile.

Note that this part:

else IFLT(pivot, *p) {
r = p;
}
else {
l = p + 1;
}

doesn't work at all the way it "looks like" it works.  It expands to:

else if ((k = ISLT(pivot, *p)) < 0) goto fail;
if (k) {
r = p:
}
else {
l = p + 1;
}

The

r = p;

and

l = p + 1;

lines you added above in your new code are useless, because this final "if (k)" 
block executes regardless of whether `initial` is true or false.  Indeed, if 
you hadn't _also_ added new code to set `k` to 0 or 1, the code would be wrong. 
 But that's impossible to see without expanding the macro.

So if you pursue this, at least rewrite that block as:

else {
IFLT(pivot, *p) {
r = p;
else {
l = p + 1;
}
}

and remove the new "k = 1;" and "k = 0;" mystery lines.

But unless there are real-world cases that show significant speedups, I'm - as 
I said in the message I linked to above - happy with the tradeoffs already in 
place.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35369] List sorting makes duplicate comparisons

2018-11-30 Thread David Wyde


Change by David Wyde :


Added file: https://bugs.python.org/file47963/sort.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35369] List sorting makes duplicate comparisons

2018-11-30 Thread David Wyde


New submission from David Wyde :

Python's Timsort sometimes makes the same comparison twice. This leads to extra 
compares, which can hurt performance.

Python sorts several length-3 permutations in 4 steps, and the problem 
accumulates with bigger data. There are ~9,800 duplicate less-than checks when 
sorting a million randomly ordered numbers, and 24,386 wasted comparisons when 
sorting all permutations of length 8.

I've attached a patch to fix this issue. Feedback and improvements are 
appreciated. Speed seems roughly comparable, and should be much improved for 
expensive comparison functions.

The same problem occurs in the Chromium Browser, and probably other ports of 
Python's Timsort implementation.

--
files: sort-fix.diff
keywords: patch
messages: 330839
nosy: dwyde, tim.peters
priority: normal
severity: normal
status: open
title: List sorting makes duplicate comparisons
type: performance
versions: Python 3.8
Added file: https://bugs.python.org/file47962/sort-fix.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com