On Wed, May 28, 2014 at 3:02 AM, Steven D'Aprano <[email protected]> wrote: > But I know that Python is a high-level language with > lots of high-level data structures like dicts which trade-off time and > memory for programmer convenience, and that I'd want to see some real > benchmarks proving that my application was too slow before giving up that > convenience with a complicated strategy like this.
And they trade off that time and memory on the basis of X years of development expertise. A while ago (about ten years or so, now... wow, that's quite a while) I wrote a C++ program that needed an ever-growing array; for simplicity, I went with a very basic system of doubling the size every time, from a base of something like 1024 or 8192. (Note that it was storing and moving around only pointers, so it's comparable to Python's list.) That means it has an average 25% slack space at all times, more until its first allocation, and every now and then it has a huge job of copying a pile of pointers into a new array. (Imagine it's now at 16777216 and it needs to add the 16,777,217th string to the array. Bye-bye CPU caches.) These boundaries became *user-visible pauses*, fortunately at predictable points, but on a not-terrible computer it could cause a >1s pause just copying heaps of pointers. How do you think a Python list will perform, under the same workload (periodic appending of single strings or small groups of strings, very frequent retrieval based on index - probably about a 20:1 read:write ratio)? Not only would it be far more convenient, it's probably going to outperform my hand-rolled code - unless I'm so brilliant (or lucky) that I can stumble to something as good as can be achieved with years of dedicated development. Yeah, I don't think so. Same goes for hashing algorithms. I can at least boast that I've never written one of those... although I have several times written a binary tree of one form or another, in order to provide comparable features to a dict. And once again, now that I know how convenient and performant high level languages can be (which may or may not have been true 15 years ago, but I certainly didn't know it), I don't rewrite what can be done better by just tossing in a couple of curly braces and saying "here, map this to that". ChrisA -- https://mail.python.org/mailman/listinfo/python-list
