On Jan 3, 2005, at 4:49 PM, Tim Peters wrote:

[Tim Peters]
Ya, I understood that.  My conclusion was that Darwin's realloc()
implementation isn't production-quality.  So it goes.

[Bob Ippolito]
Whatever that means.

Well, it means what it said. The C standard says nothing about performance metrics of any kind, and a production-quality implementation of C requires very much more than just meeting what the standard requires. The phrase "quality of implementation" is used in the C Rationale (but not in the standard proper) to cover all such issues. realloc() pragmatics are quality-of-implementation issues; the accuracy of fp arithmetic is another (e.g., if you get back -666.0 from the C expression 1.0 + 2.0, there's nothing in the standard to justify a complaint).

 free() can be called either explicitly, or implicitly by calling
realloc() with a size larger than the size of the allocation.

From later comments feigning outrage <wink>, I take it that "the size of the allocation" here does not mean the specific number the user passed to the previous malloc/realloc call, but means whatever amount of address space the implementation decided to use internally. Sorry, but I assumed it meant the former at first.

Sorry for the confusion.

Was this a good decision? Probably not!

Sounds more like a bug (or two) to me than "a decision", but I don't
know.

You said yourself that it is standards compliant ;) I have filed it as
a bug, but a fix is unlikely to be backported to current versions
of Mac OS X unless a case can be made that it is indeed a security
flaw.

That's plausible. If you showed me a case where Python's list.sort() took cubic time, I'd certainly consider that to be "a bug", despite that nothing promises better behavior. If I wrote a malloc subsystem and somebody pointed out "did you know that when I malloc 1024**2+1 bytes, and then realloc(1), I lose the other megabyte forever?", I'd consider that to be "a bug" too (because, docs be damned, I wouldn't intentionally design a malloc subsystem with such behavior; and pymalloc does in fact copy bytes on a shrinking realloc in blocks it controls, whenever at least a quarter of the space is given back -- and it didn't at the start, and I considered that to be "a bug" when it was pointed out).
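The pymalloc rule described above ("copy bytes on a shrinking realloc ... whenever at least a quarter of the space is given back") can be modelled in a few lines. This is only a Python sketch of the decision, not the C code pymalloc uses; the function name and the treatment of the exact one-quarter boundary are assumptions.

```python
def shrink_should_copy(old_size, new_size):
    """Return True if a shrinking realloc should copy into a smaller block.

    Hypothetical helper: models "copy when at least a quarter of the
    space is given back". Integer arithmetic avoids float rounding.
    """
    if new_size >= old_size:
        return False  # not a shrink, so the rule does not apply
    return 4 * (old_size - new_size) >= old_size
```

Under these assumptions, shrinking a 100-byte block to 90 bytes keeps the block in place, while shrinking it to 75 bytes or fewer triggers a copy into a smaller block.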

I wouldn't equate "until free() is called" with "forever". But yes, I consider it a bug just as you do, and have reported it appropriately. Practically, since it exists in Mac OS X 10.2 and Mac OS X 10.3, and may not ever be fixed, we should at least consider it.


...
Known case?  No.  Do I want to search Python application-space to find
one?  No.

Serious problems on a platform are usually well-known to users on that platform. For example, it was well-known that Python's list-growing strategy as of a few years ago fragmented address space horribly on Win9X. This was a C quality-of-implementation issue specific to that platform. It was eventually resolved by improving the list-growing strategy on all platforms -- although it's still the case that Win9X does worse on list-growing than other platforms, it's no longer a disaster for most list-growing apps on Win9X.

It does take a long time to figure such weird behavior out though. I would have to guess that most Python users on Darwin have been at it for less than 3 years.


The number of people using Python on Darwin who have written or used code that exercised this scenario and are determined enough to track this sort of thing down is probably very small.

If there's a problem with "overallocate then realloc() to cut back" on
Darwin that affects many apps, then I'd expect Darwin users to know
about that already -- lots of people have used Python on Macs since
Python's beginning, "mysterious slowdowns" and "mysterious bloat" get
noticed, and Darwin has been around for a while.

Most people on Mac OS X have a lot of memory, and Mac OS X generally does a good job of swapping in and out without causing much of a problem, so I'm personally not very surprised that it could go unnoticed this long.


Google says:
Results 1 - 10 of about 1,150 for (darwin OR Mac OR "OS X") AND MemoryError AND Python.
Results 1 - 10 of about 942 for malloc vm_allocate failed. (0.73 seconds) 


Of course, in both cases, not all of these can be attributed to realloc()'s implementation, but I'm sure some of them can, especially the Python ones!

They're [#ifdef's] also the only good way to deal with platform-specific
inconsistencies. In this specific case, it's not even possible to
determine if a particular allocator implementation is stupid or not
without at least using a platform-allocator-specific function to query
the size reserved by a given allocation.

We've had bad experience on several platforms when passing large numbers to recv(). If that were addressed, it's unclear that Darwin realloc() behavior would remain a real issue. OTOH, it is clear that *just* worming around Darwin realloc() behavior won't help other platforms with problems in the same *immediate* area of bug 1092502. Gross over-allocation followed by a shrinking realloc() just isn't common in Python. sock_recv() is an exceptionally bad case. More typical is, e.g., fileobject.c's get_line(), where if "a line" exceeds 100 characters the buffer keeps growing by 25% until there's enough room, then it's cut back once at the end. That typical use for shrinking realloc() just isn't going to be implicated in a real problem -- the over-allocation is always minor.
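To make "the over-allocation is always minor" concrete, here is a small Python model of the get_line()-style strategy described above: start at 100 bytes and grow by 25% until the line fits. It is a sketch, not the code in fileobject.c; the function name and the use of `capacity >> 2` to round the 25% step are assumptions.

```python
def grow_until_fits(needed, initial=100):
    """Model a buffer that grows by about 25% until `needed` bytes fit.

    Returns (final_capacity, excess). The starting size of 100 comes from
    the text; growing by `capacity >> 2` is an assumption about how the
    25% step is rounded.
    """
    capacity = initial
    while capacity < needed:
        capacity += capacity >> 2  # add a quarter of the current size
    return capacity, capacity - needed
```

In this model, once the line is longer than the initial 100 bytes, the excess handed back by the final shrinking realloc() is always less than a fifth of the buffer. By contrast, sock_recv() allocates whatever size the caller asked for up front, so its excess has no such bound.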

What about for list objects that are big at some point, then progressively shrink, but happen to stick around for a while? An "event queue" that got clogged for some reason and then became stable? Dictionaries? Of course these potential problems are a lot less likely to happen.


...
There's obviously a tradeoff between copying lots of bytes and having
lots of memory go to waste. That should be taken into consideration
when considering how many pages could be returned to the allocator.
Note that we can ask the allocator how much memory an allocation has
actually reserved (which is usually somewhat larger than the amount you
asked it for) and how much memory an allocation will reserve for a
given size. An allocation resize wouldn't even show up as smaller
unless at least one page would be freed (for sufficiently large
allocations anyway, the minimum granularity is 16 bytes because it
guarantees that alignment). Obviously if you have a lot of pages
anyway, one page isn't a big deal, so we would probably only resort to
free()/memcpy() if some fair percentage of the total pages used by the
allocation could be rescued.
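The policy proposed above could look roughly like this. It is a Python model of the decision only: `PAGE_SIZE`, the function names, and the 25% threshold are all assumptions standing in for "some fair percentage", and `reserved` stands for whatever size the platform allocator reports it is actually holding for the block.

```python
PAGE_SIZE = 4096  # assumption: a typical page size; real code would query it


def pages(nbytes):
    """Number of whole pages needed to hold nbytes (ceiling division)."""
    return -(-nbytes // PAGE_SIZE)


def worth_copying(reserved, wanted, min_fraction=0.25):
    """Decide whether to allocate, copy, and free instead of shrinking in place.

    reserved     -- bytes the allocator actually holds for the block
    wanted       -- bytes requested by the shrinking realloc
    min_fraction -- hypothetical knob: share of pages that must be rescued
    """
    held = pages(reserved)
    freed = held - pages(wanted)
    if freed < 1:
        return False  # no whole page would be returned, so do not bother
    return freed >= min_fraction * held
```

With these numbers, cutting a block of 1024**2+1 bytes back to 1 byte is worth the copy, while trimming one page off a 100-page block is not.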


If it does end up causing some real performance problems anyway,
there's always deeper hacks like using vm_copy(), a Darwin specific
function which will do copy-on-write instead (which only makes sense if
the allocation is big enough for this to actually be a performance
improvement).

As above, I'm skeptical that there's a general problem worth addressing here, and am still under the possible illusion that the Mac developers will eventually change their realloc()'s behavior anyway. If you're convinced it's worth the bother, go for it. If you do, I strongly hope that it keys off a new platform-neutral symbol (say, Py_SHRINKING_REALLOC_COPIES) and avoids Darwin-specific implementation code. Then if it turns out that it is a broad problem (across apps or across platforms), everyone can benefit. PyObject_Realloc() seems the best place to put it. Unfortunately, for blocks obtained from the system malloc(), there is no portable way to find out how much excess was allocated in a release-build Python, so "avoids Darwin-specific implementation code" may be impossible to achieve. The more it *can't* be used on any platform other than this flavor of Darwin, the more inclined I am to advise just fixing the immediate problem (sock_recv's potentially unbounded over-allocation).

I'm pretty sure this kind of malloc functionality is very specific to Darwin and does not carry over to any other BSD. An intelligent implementation requires equivalents of malloc_size() and malloc_good_size(). Unfortunately, despite the man page, malloc_good_size() is not declared in <malloc/malloc.h>; however, there is another, declared, way to get at that functionality (by poking into the malloc_introspection_t struct of the malloc_default_zone()).


-bob
