On Sep 23, 2019, at 15:32, Richard Higginbotham <higgi...@gmail.com> wrote:

>> Considering your use case however, I wonder, if you would not be better
>> going with the iterator approach (as Andrew has been hinting already for
>> some time in his posts).
>> Richard M.
> They will most likely have good performance on data sets that are close in 
> size, but very poor on those that aren't balanced.

The code you posted doesn’t do any galloping; it just advances the indexes one 
step at a time. The fact that you could do approximate binary searches to 
optimize it doesn’t help if you never actually do those searches. So it’ll be 
exactly as poor on unbalanced inputs as my Iterator code. After all, they’re 
exactly the same algorithm (except for trivial differences in how they handle 
one input ending before the other); the only difference is that one 
implementation calls next and remembers the value, while the other increments 
an index and subscripts a list.
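To make that concrete, here’s a minimal sketch of my own (not the code from 
either of our posts; all the names are made up) showing the same sorted-merge 
intersection written both ways, plus what the binary-search skipping would 
look like:

from bisect import bisect_left

def intersect_indexed(a, b):
    """Merge-style intersection of two sorted lists, advancing indexes
    one step at a time."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif b[j] < a[i]:
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    return out

def intersect_iter(a, b):
    """Exactly the same algorithm over arbitrary sorted iterables:
    next() plus a remembered value stands in for the index plus the
    list subscript."""
    it_a, it_b = iter(a), iter(b)
    out = []
    try:
        x, y = next(it_a), next(it_b)
        while True:
            if x < y:
                x = next(it_a)
            elif y < x:
                y = next(it_b)
            else:
                out.append(x)
                x, y = next(it_a), next(it_b)
    except StopIteration:
        pass  # one side ran out; no further matches are possible
    return out

def intersect_skipping(a, b):
    """The indexed version with the optimization actually applied: when
    one side lags, jump it forward with a binary search instead of one
    step at a time.  (True galloping probes at exponentially growing
    offsets before bisecting, which is cheaper when the gap is small.)"""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i = bisect_left(a, b[j], i + 1)
        elif b[j] < a[i]:
            j = bisect_left(b, a[i], j + 1)
        else:
            out.append(a[i])
            i += 1
            j += 1
    return out

The first two are step-for-step identical; only the third behaves differently 
on wildly unbalanced inputs.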

Of course if you always have lists already in memory, the added overhead of 
using an Iterator over the list is pure cost with no benefit. But that’s a 
fixed multiplier; it’s no different for balanced vs. unbalanced inputs. And of 
course if the Iterator ever lets you avoid building an otherwise-unnecessary 
list, it’s a win, but that win is likewise a flat amount that depends only on 
the allocation time for N elements, which also doesn’t vary based on balanced 
vs. unbalanced inputs.
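For example, reusing intersect_iter from the sketch above with a source that 
never materializes a list (the filenames are made up):

def sorted_lines(path):
    # Hypothetical lazy source; assumes the lines in the file are
    # already sorted.
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")

# Neither input is ever built as a list; the saving is the skipped
# allocation of two N-element lists, flat regardless of balance.
common = intersect_iter(sorted_lines("a.txt"), sorted_lines("b.txt"))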

Anyway, if your use case really is a gigantic directory on a filesystem that 
guarantees sorted order and supports iterdir, iterators should help quite a 
bit.
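Roughly like this, reusing intersect_iter again. Note that os.scandir itself 
makes no ordering promise, so the sorted-order assumption really does have to 
come from the filesystem (the paths here are made up):

import os

def sorted_names(dirpath):
    # Assumes the underlying filesystem yields entries already sorted;
    # os.scandir makes no such guarantee in general.
    for entry in os.scandir(dirpath):
        yield entry.name

# Neither directory listing is ever held in memory as a whole.
common = intersect_iter(sorted_names("/big/dir1"), sorted_names("/big/dir2"))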
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LT4N6YIFFQ5H7G2PDLEZREOJOGCBPGGD/
Code of Conduct: http://python.org/psf/codeofconduct/
