Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

Erik Tue, 07 Mar 2017 17:24:18 -0800

On 08/03/17 00:18, Steven D'Aprano wrote:

I thought about that and rejected it as an unnecessary complication.
Heterogeneous and unknown might as well be the same state: either way,
you cannot use the homogeneous-type optimization.

Knowing it's definitely one of two positive states and not knowing which
of those two states it is is not the same thing when it comes to what
one can and can't optimize cheaply :) It sort of depends on how cheaply
one can track the states though ...

Part of the complexity here is that I'd like this flag to be available
to Python code, not just a hidden internal state of the list.

Out of interest, for what purpose? Generally, I thought Python code
should not need to worry about low-level optimisations such as this
(which are C-Python specific AIUI). A list.is_heterogeneous() method
could be implemented if it was necessary, but how would that be used?

But also avoids bothering with an O(N) scan in some situations where
the list really is heterogeneous. So there's both an opportunity cost and
a benefit.


O(N) is worst case.

Most of the anecdotal evidence in this thread so far seems to suggest
that heterogeneous lists are not common. May or may not be true.
Empirically, for me, it is true. Who knows? (and there is the question).
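
To make the "O(N) is worst case" point concrete, here is a minimal
pure-Python sketch of such a scan. It is an illustration only: the real
check would live in the C implementation, and common_type is a
hypothetical name, not an existing API. The loop stops at the first
mismatch, so the full O(N) pass only happens when the list is
homogeneous or the odd element sits near the end.

```python
def common_type(items):
    """Return the exact type shared by all items, or None if empty or mixed.

    Hypothetical helper for illustration; not part of CPython.
    """
    it = iter(items)
    try:
        first = type(next(it))
    except StopIteration:
        return None  # empty input: there is no type to report
    for item in it:
        # Exact type comparison: a subclass instance counts as a mismatch.
        if type(item) is not first:
            return None  # early exit at the first mismatch
    return first
```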

Remember, we're talking about opportunities for applying an optimization
here, nothing more. You're not giving up anything: at worst, the
ordinary, unoptimized routine will run and you're no worse off than you
are today.

You are a little bit - the extra overhead of checking all of this (which
is the unknown factor we're all skirting around ATM) costs. So
converting a previously-heterogeneous list to a homogeneous list via a
delete or whatever has a benefit if the optimisations can then be
applied to that list many times in the future (i.e., once it becomes
recognised as homogeneous again, it benefits from optimised paths in the
interpreter).

And of course, all that depends on your use case. It might work out
better for one application over another. As you quite rightly point out,
it needs someone to measure the alternatives and work out if _overall_
it has a positive impact ...

so I'm not
a fan of the "once heterogeneous, always considered heterogeneous"
behaviour if it's cheap enough to avoid it.


It is not just a matter of the cost of tracking three states versus two.
It is a matter of the complexity of the interface.

I suppose this could be reported to Python code as None, False or the
type.

I didn't think any of this stuff would come back to Python code (I
thought we were talking about C-Python specific implementation only).
How is this useful to Python code?
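
For anyone who wants to experiment, the three states discussed above
(unknown, heterogeneous, or a known type, reported as None, False or the
type) can be sketched in pure Python. This is only a rough model under
stated assumptions: TrackedList, _kind and kind() are hypothetical
names, and only append() and del are tracked, so any other mutation
(extend, insert, item assignment, ...) would need the same treatment. A
deletion from a heterogeneous list drops the state back to "unknown"
rather than leaving it "once heterogeneous, always heterogeneous", and
the rescan is deferred until somebody asks.

```python
class TrackedList(list):
    """Hypothetical list subclass; only append() and del update the state."""

    def __init__(self, iterable=()):
        super().__init__(iterable)
        # None = unknown, False = heterogeneous, otherwise the common type.
        self._kind = None

    def append(self, item):
        super().append(item)
        if len(self) == 1:
            self._kind = type(item)  # a one-element list is homogeneous
        elif isinstance(self._kind, type) and type(item) is not self._kind:
            self._kind = False  # known type broken by the new item

    def __delitem__(self, index):
        super().__delitem__(index)
        if not self or self._kind is False:
            # Empty, or the odd item may just have been removed:
            # fall back to "unknown" and recheck lazily.
            self._kind = None

    def kind(self):
        """Return the common type, False if mixed, or None if empty."""
        if self._kind is None and self:
            first = type(self[0])
            self._kind = first if all(type(x) is first for x in self) else False
        return self._kind
```

Whether this bookkeeping pays for itself is exactly the thing that
would have to be measured.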

Ultimately, this is all very pie-in-the-sky unless somebody tests just
how expensive this is and whether the benefit is worthwhile.

I agree. As I said before, I'm just pointing out things I noticed while
looking at the current C code which could be picked up on if someone
wants to try implementing and benchmarking any of this.

It sort of feels like an argument, but I hope we're just violently
agreeing on a generally shared goal ;)


Regards, E.
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
