On Thu, Apr 30, 2015 at 11:42 AM, James K. Lowden <jklowden at schemamania.org> wrote:
> On Wed, 29 Apr 2015 20:29:07 -0600
> Scott Robison <scott at casaderobison.com> wrote:
>
> > > That code can fail on a system configured to overcommit memory. By
> > > that standard, the pointer is invalid.
> >
> > Accidentally sent before I was finished. In any case, by "invalid
> > pointer" I did not mean to imply "it returns a bit pattern that could
> > never represent a valid pointer". I mean "if you dereference a
> > pointer returned by malloc that is not null or some implementation
> > defined value, it should not result in an invalid memory access".
>
> Agreed. And I don't think that will happen with malloc. It might, and
> I have a plausible scenario, but I don't think that's what happened.

The Linux man page for malloc documents that the returned pointer may not be usable in the case of optimistic memory allocation: the eventual use of the pointer may trigger the need to commit a page of memory to the address space, and a page may not be available at that point in time. Thus malloc, on Linux, makes no guarantee as to the viability of using the returned pointer.

Perhaps you are correct and SIGSEGV is not the literal signal that is triggered in this case. I don't care, really. The fact is that an apparently valid pointer was returned from a memory allocation function yet can result in an invalid access for whatever reason (out of memory, in this case).

The Linux OOM killer may kill the offending process (which is what one would expect, though one would also expect malloc to return NULL, so we already know not to expect the expected). Or it may kill some other process which has done nothing wrong! Sure, the OS is protecting the two processes' address spaces from one another, but it seems to me that if one process can kill another process, there is a problem. I can see the utility of a memory allocation strategy like this.
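For what it's worth, the behavior described above is tunable on Linux. A quick sketch (not from the original thread) of how one might inspect or tighten the policy via the `vm.overcommit_memory` sysctl:

```shell
# Inspect the current overcommit policy. The modes are documented in
# proc(5): 0 = heuristic overcommit (the default), 1 = always
# overcommit, 2 = strict accounting.
cat /proc/sys/vm/overcommit_memory

# Switch to strict accounting (requires root). In mode 2 the kernel
# refuses allocations it cannot back, so the traditional contract is
# honored: malloc returns NULL instead of the OOM killer firing later.
sysctl vm.overcommit_memory=2
sysctl vm.overcommit_ratio=80   # percent of RAM counted toward the commit limit
```

Note that mode 2 trades the OOM-killer failure mode for earlier allocation failures, which well-behaved programs can at least detect.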
It should in no way be the *default* memory allocation strategy, especially for a system that tries to be POSIX compliant, because this is in direct violation of POSIX (by my reading). From http://pubs.opengroup.org/onlinepubs/009695399/functions/malloc.html

> Upon successful completion with *size* not equal to 0, *malloc*() shall
> return a pointer to the allocated space.

Or maybe POSIX just needs a better definition of "allocated space". Sure, an address range was allocated in the process's address space, but actual memory was not allocated.

The decades-old interface contract was "if you call malloc with a non-zero size, you can depend on malloc to return a null pointer or a pointer to the first byte of an uninitialized allocation". Thus your application could decide what to do if the memory was not available: abort, exit, select an alternative code path that can get the job done with less or no memory, or ignore the return value and let SIGSEGV handle it later.

Now, with optimistic memory allocation, you do not have a choice. If your malloc call results in an overcommit, your process can be killed later when it tries to access the memory. Or some other innocent process might be killed. I really wonder how many man-hours have been wasted debugging a killed process only to find out that it did nothing wrong, and that some other process was overcommitting memory. Or worse, how many man-hours were wasted and no good reason was ever learned.

I came across this last night while learning more about OOM: https://lwn.net/Articles/104179/ -- in particular, the analogy, which I think is spot on.

I realize that there is no one right answer to how an OS should handle memory exhaustion. There are various tradeoffs. However, C is not an operating system, it is a language, and the standards tell you how you can expect it to behave.
In this case, the C API is broken on Linux by default, so it becomes impossible to write fault-tolerant applications in the face of this feature.

--
Scott Robison