Tim Peters wrote: > That's not by accident - the inspiration for CPython's sort's basic > "galloping" approach was taken from this paper, which wasn't about > sorting at all: > "Adaptive Set Intersections, Unions, and Differences" (2000) > Erik D. Demaine, Alejandro López-Ortiz, J. Ian Munro > > That's concerned with manipulating large sets represented as sorted > sequences. They noticed too that data was "lumpy" in real life, and > that paper was the start of a number of papers trying ways to take > advantage of that. > So if someone _really_ wants to go gonzo with this, that's a promising > approach. Do note, though, that it doesn't fit naturally with > iterators. "Galloping" relies on O(1) access to data at arbitrary > index positions; it's a fancier variant of binary search (see > listsort.txt in your CPython source distribution for details). > If cache misses can be so expensive, how can that win? Because the > number of index accesses per galloping step is logarithmic in the > number of sequence elements consumed, and "compare" steps can (very > roughly speaking) be skipped entirely for all the sequence elements > _not_ touched by the galloping indices.
That's funny as that was about the same time as I was working on my list compare. I actually used a binary search as well. I divided the 'merged' list into three sections a head, body, and tail. The head is where the values in one list are totally outside the other list on the 'low' (in sort order) side. The body was where the two lists started to overlap and I used the minimum of the two lists and a binary search to find that. The tail was the opposite end so there was no need to use a binary search to find where it started (its where one list ends). I was very surprised I could sort the list and step through it with any speed improvement. No matter how I changed the initial list state it was very fast. I expected faster on the upper end with large lists, but not on the lower. My first time using sort and my second python program and I lucked out. Good times! _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/Y6HBHZRHPBMCXQCJD3FTU4JPFFSHVNNN/ Code of Conduct: http://python.org/psf/codeofconduct/