Re: malloc troubles on 64-bit machine
Hi Matthijs,

On Monday, 08.08.2005, at 14:56 +0200, Matthijs van der Klip wrote:
...
> Linux 2.4 and 2.6 kernels have a setting for their overcommitment
> behaviour under /proc/sys/vm/overcommit_memory. The different settings
> are:
...
> For now I've set this to '2' which means the kernel won't overcommit
> anymore, just like any other proper OS... ;-)

I have been running with this setting too, ever since you pointed me to it some time ago. I do not notice a difference, though, and it does not fix my memory leak.

A 'fillmem'-like tool can, however, reclaim the memory. Unfortunately it also reclaims the space held by the file system buffers. On my development system this well-filled file system buffer space is the most valuable resource. :(

My experiments with the 'fillmem'-like tool showed that just allocating memory does not show up in the 'Active' memory value. Only initializing the allocated memory does the trick. This means that the memory leak results from pages which have been in real use.

> One final question though: my experience with InnoDB is that it really,
> really likes to be able to fit all of it's data and keys into the buffer
> pool. This would limit the maximum size of my database to roughly 4GB in
> this case, correct? This is in a website hosting environment where the
> database is hit with about 1000 queries/s (mixed read/write).

I do not believe this. Perhaps you mean that the performance degrades if the database is bigger than the cache. In this case you are right, but I can't think of any way to get around it.

If you mean something else, I can't help you much with InnoDB. Please start a new thread with a good "Subject:" on the MySQL mailing list and/or on the InnoDB forum (forums.mysql.com).

Regards,
Ingo
--
Ingo Strüwing, Senior Software Developer
MySQL AB, www.mysql.com
Office: +49 30 43672407

Are you MySQL certified? www.mysql.com/certification

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
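The difference between merely allocating memory and actually initializing it, as described in the message above, can be illustrated with a small sketch. This is not the 'fillmem' tool itself: it uses Python's `mmap` module as a stand-in for `malloc()` in C, and the function name `reserve` is hypothetical. Whether the touched pages then show up under 'Active' in /proc/meminfo depends on the kernel and is not checked here.

```python
import mmap


def reserve(size_bytes, touch=False):
    """Create an anonymous mapping; optionally write one byte per page.

    Returns (mapping, pages_touched). Without touch=True the kernel is
    free to leave the mapping without physical pages behind it.
    """
    buf = mmap.mmap(-1, size_bytes)  # anonymous, zero-filled mapping
    touched = 0
    if touch:
        for offset in range(0, size_bytes, mmap.PAGESIZE):
            buf[offset] = 1  # the first write to a page forces real backing
            touched += 1
    return buf, touched


if __name__ == "__main__":
    size = 64 * mmap.PAGESIZE
    lazy, n_lazy = reserve(size)
    used, n_used = reserve(size, touch=True)
    print("pages written without touch:", n_lazy)
    print("pages written with touch:", n_used)
    lazy.close()
    used.close()
```

Comparing /proc/meminfo before and after each call (with a much larger size) would be one way to repeat the experiment on a given system.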
Re: malloc troubles on 64-bit machine
On Mon, 1 Aug 2005, Joerg Bruehe wrote:
> As a result, the allocation succeeds, but some process gets killed when
> the paging space cannot take such an additional page. To the affected
> process, this looks like a crash.

Linux 2.4 and 2.6 kernels have a setting for their overcommitment behaviour under /proc/sys/vm/overcommit_memory. The different settings are:

0 - Heuristic overcommit handling. Obvious overcommits of address space are refused. Used for a typical system. It ensures a seriously wild allocation fails while allowing overcommit to reduce swap usage. root is allowed to allocate slightly more memory in this mode. This is the default.

1 - Always overcommit. Appropriate for some scientific applications.

2 - Don't overcommit. The total address space commit for the system is not permitted to exceed swap + a configurable percentage (default is 50) of physical RAM. Depending on the percentage you use, in most situations this means a process will not be killed while accessing pages but will receive errors on memory allocation as appropriate.

Heuristic overcommit handling seems to be the default, and my problem lies in the 'Obvious overcommits of address space are refused' part. For some reason unknown to me, the kernel treats a single 7GB malloc as an 'obvious overcommit', while 100 2GB mallocs (200GB total) are no problem. :P

For now I've set this to '2' which means the kernel won't overcommit anymore, just like any other proper OS... ;-) This makes things far simpler, as I can now only allocate as much memory as is physically available. However, it does force me to be a bit more conservative. I have now configured InnoDB with a 4GB buffer pool, which leaves about 3GB for connections (about 300 with my current MySQL settings). This seems reasonable.

One final question though: my experience with InnoDB is that it really, really likes to be able to fit all of its data and keys into the buffer pool. This would limit the maximum size of my database to roughly 4GB in this case, correct? This is in a website hosting environment where the database is hit with about 1000 queries/s (mixed read/write).

> I am a bit surprised that the Linux kernel management will only allocate
> memory if a single chunk of sufficient size is available. My
> understanding was that in a paging system this is not necessary.
>
> If this is (becoming) standard Linux policy, it might be necessary to
> demand memory piecewise. One drawback of this approach is increased
> bookkeeping, if it ever needs to be released.
>
> I have no idea how the developers view this issue - you might open a
> change request if you consider this Linux kernel policy definite.
>
> You wrote that if a mysql server start fails, you can run "fillmem", and
> after its exit the memory will be available. I am not sure whether
> Rick's explanation addresses this issue as well - it might be the
> "memory defragger" he refers to. If not, the once used chunks might
> still be considered "active".

I think it all comes down to the IMHO buggy (hey, even the manpages state it!) VM memory allocation scheme. As stated, I have disabled the overcommitment behaviour for now, which seems a better fit for a dedicated database server.

Best regards,
--
Matthijs van der Klip
System Administrator
Spill E-Projects BV
Arendstraat 1-3
1223 RE Hilversum
Tel. 035-6478248
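The rule quoted above for setting 2 (swap plus a configurable percentage of physical RAM) can be worked through for a machine like the one in this thread. This is a minimal sketch: the function name `commit_limit` is hypothetical, the 8 GB of RAM comes from the thread, and the 2 GB swap size is an assumed value because the thread never states it. It ignores huge pages and any other kernel-side adjustments.

```python
GIB = 1024 ** 3


def commit_limit(ram_bytes, swap_bytes, overcommit_ratio=50):
    """Address-space limit under /proc/sys/vm/overcommit_memory = 2.

    Follows the rule quoted from the kernel documentation:
    swap + overcommit_ratio percent of physical RAM.
    """
    return swap_bytes + ram_bytes * overcommit_ratio // 100


if __name__ == "__main__":
    ram = 8 * GIB   # from the thread
    swap = 2 * GIB  # assumed; not stated in the thread
    limit = commit_limit(ram, swap)
    print(f"commit limit: {limit / GIB:.1f} GiB")
    print("7 GiB buffer pool fits:", 7 * GIB <= limit)
    print("4 GiB buffer pool fits:", 4 * GIB <= limit)
```

With the default percentage of 50 the limit depends heavily on the amount of swap; the percentage itself is read from /proc/sys/vm/overcommit_ratio.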
Re: malloc troubles on 64-bit machine
Hi Matthijs!

Matthijs van der Klip wrote:
> On Fri, 29 Jul 2005, Joerg Bruehe wrote:
>>> Now the only question that remains is why the Active memory goes close to
>>> zero when exiting fillmem and is not when ending a compile run. I asked
>>
>> Again IMHO, it shows an error in memory management.
>
> I do not know if it's an error or not. I do agree with you that the memory

Well, I understood your description as "after several compile runs or a mysql run, 'active memory' is not returned". This is what looks to me like an error.

> management in Linux 2.6 does not seem to be ideal. I even found the
> following comment in the malloc manpage:
>
> 'By default, Linux follows an optimistic memory allocation strategy.
> This means that when malloc() returns non-NULL there is no guarantee
> that the memory really is available. [[...]]
> See also the kernel Documentation directory, files
> vm/overcommit-accounting and sysctl/vm.txt.'

This "memory overcommitment" is the other way around: processes get more memory requests granted (in total) than they can use afterwards.

AIUI, this is implemented by delayed allocation in paging space: a page that was added to the process' address space (for example by a "malloc()") is not immediately assigned a location in the paging space. That happens only once it has been modified in RAM and/or needs to be written out to the paging space for the first time.

As a result, the allocation succeeds, but some process gets killed when the paging space cannot take such an additional page. To the affected process, this looks like a crash.

> What I don't understand is why I seem to be one of few suffering from this
> problem. MySQL on Linux 2.6 (combined with a massive amount of RAM) is
> hardly an uncommon configuration nowadays. Secondly it seems two parties
> (MySQL and Fedora) are pointing to each other right now. Let me quote:
>
> On Fri, 29 Jul 2005, Rick Stevens wrote:
>> Well, malloc() will fail if you request a chunk of memory and there
>> isn't a SINGLE chunk available of that size. So if memory gets fragged,
>> there isn't a single 7GB chunk available and malloc() will fail.
>> fillmem allocates in smaller chunks, then releases it all so the
>> memory defragger can clean things up.
>>
>> Ideally, that's what mysql should do. Or start off at some huge
>> size and keep trying progressively smaller chunks until it gets some,
>> e.g. try 8GB. If that fails, try 6GB, then 4, then 2, you get the
>> idea. It could then link those together and manage them.
>>
>> I'm not surprised that it fails. You're asking a single application to
>> grab 7/8 of your RAM--and all in one chunk--regardless of what else has
>> been run before it. On a pristine system (e.g. right after a boot),
>> it may work. After that...
>
> It sounds kind of reasonable if explained like this. Now, which method
> (allocating all in one single malloc call or allocating multiple smaller
> blocks) is considered good programming practice? And would this be
> something InnoDB would be likely to change? (A long shot, I guess)

This is thin ice for me, but still:

I am a bit surprised that the Linux kernel management will only allocate memory if a single chunk of sufficient size is available. My understanding was that in a paging system this is not necessary.

If this is (becoming) standard Linux policy, it might be necessary to demand memory piecewise. One drawback of this approach is increased bookkeeping, if it ever needs to be released.

I have no idea how the developers view this issue - you might open a change request if you consider this Linux kernel policy definite.

You wrote that if a mysql server start fails, you can run "fillmem", and after its exit the memory will be available. I am not sure whether Rick's explanation addresses this issue as well - it might be the "memory defragger" he refers to. If not, the once used chunks might still be considered "active".

Regards,
Jörg
--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com
Re: malloc troubles on 64-bit machine
On Fri, 29 Jul 2005, Joerg Bruehe wrote:
> > Now the only question that remains is why the Active memory goes close to
> > zero when exiting fillmem and is not when ending a compile run. I asked
>
> Again IMHO, it shows an error in memory management.

I do not know if it's an error or not. I do agree with you that the memory management in Linux 2.6 does not seem to be ideal. I even found the following comment in the malloc manpage:

'By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like

    # echo 2 > /proc/sys/vm/overcommit_memory

See also the kernel Documentation directory, files vm/overcommit-accounting and sysctl/vm.txt.'

What I don't understand is why I seem to be one of the few suffering from this problem. MySQL on Linux 2.6 (combined with a massive amount of RAM) is hardly an uncommon configuration nowadays. Secondly, it seems the two parties (MySQL and Fedora) are pointing at each other right now. Let me quote:

On Fri, 29 Jul 2005, Rick Stevens wrote:
> Well, malloc() will fail if you request a chunk of memory and there
> isn't a SINGLE chunk available of that size. So if memory gets fragged,
> there isn't a single 7GB chunk available and malloc() will fail.
> fillmem allocates in smaller chunks, then releases it all so the
> memory defragger can clean things up.
>
> Ideally, that's what mysql should do. Or start off at some huge
> size and keep trying progressively smaller chunks until it gets some,
> e.g. try 8GB. If that fails, try 6GB, then 4, then 2, you get the
> idea. It could then link those together and manage them.
>
> I'm not surprised that it fails. You're asking a single application to
> grab 7/8 of your RAM--and all in one chunk--regardless of what else has
> been run before it. On a pristine system (e.g. right after a boot),
> it may work. After that...

It sounds kind of reasonable if explained like this. Now, which method (allocating everything in one single malloc call or allocating multiple smaller blocks) is considered good programming practice? And would this be something InnoDB would be likely to change? (A long shot, I guess)

Best regards,
--
Matthijs van der Klip
System Administrator
Spill E-Projects
The Netherlands
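The strategy Rick Stevens suggests above (try 8GB, then 6GB, then 4, then 2) can be sketched as follows. This only illustrates the idea and is not how InnoDB allocates its buffer pool: `allocate_with_fallback` and `demo_allocator` are hypothetical names, and the demo allocator reserves no real memory. It raises `MemoryError` where a C `malloc()` would return NULL.

```python
GIB = 1024 ** 3


def allocate_with_fallback(try_alloc, sizes):
    """Try each size in order; return (size, block) for the first success.

    try_alloc is a caller-supplied function that returns a block or
    raises MemoryError, standing in for malloc() returning NULL.
    """
    sizes = list(sizes)
    for size in sizes:
        try:
            return size, try_alloc(size)
        except MemoryError:
            continue  # this size was refused, try the next smaller one
    raise MemoryError(f"no allocation succeeded for sizes {sizes}")


def demo_allocator(limit):
    """Hypothetical allocator that refuses requests above `limit` bytes."""
    def try_alloc(size):
        if size > limit:
            raise MemoryError(size)
        return ("block", size)  # placeholder; no real memory is reserved
    return try_alloc


if __name__ == "__main__":
    sizes = [8 * GIB, 6 * GIB, 4 * GIB, 2 * GIB]
    size, _block = allocate_with_fallback(demo_allocator(5 * GIB), sizes)
    print(f"got {size // GIB} GiB")
```

A real implementation would also have to link the smaller blocks together and manage them, which is the extra bookkeeping raised elsewhere in the thread.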
Re: malloc troubles on 64-bit machine
Matthijs,

thank you for your detailed description:

Matthijs van der Klip wrote:
> [[...]]
>
> I'd like to start with the following:
>
> http://lists.debian.org/debian-kernel/2004/12/msg00410.html
>
> This implies I'm not the only one struggling with 'Active' memory on a 2.6
> kernel. Interesting detail: the problem report is issued by a MySQL
> developer named Ingo Strüwing, maybe you know him? Either way, I already
> contacted him to share my experiences.

I know him very well :-)
If Ingo does not have an answer, I will probably have none either.

> Furthermore I have started a thread on the Fedora mailing list about this,
> as it seems to be related to somewhat wacky memory management.
>
> Now back to the problem, what I've found out is basically the following:
>
> - When doing a malloc call it appears the requested amount of memory is
> tested against the total amount of memory minus the amount of 'Active'
> (according to /proc/meminfo) memory. [[...]]
>
> - Interestingly enough it is perfectly possible to allocate multiple 2GB
> blocks in above situation. This can be done almost without limit, [[...]]
>
> - Even more interesting is the fact that 'fillmem' is in fact able to
> reclaim the Active memory. [[...]]

This is quite a detailed description, IMHO.

> Now the only question that remains is why the Active memory goes close to
> zero when exiting fillmem and is not when ending a compile run. I asked

Again IMHO, it shows an error in memory management.

> this question on the Fedora list to find out if this is a normal situation
> or if there could be a memory leak somewhere in the compiler, linker etc

"Memory leak" typically means that a process acquired additional memory, does not use it any longer but also does not return it for future allocations. So the process' memory consumption would grow, but at its exit the system would make all that memory available again.

AIUI, what you describe is that it does _not_ become available after process exit, but this is a system issue and not internal to the application process / program.

> chain. In the meanwhile I can use the mentioned workaround, but it's still
> a bit weird situation.

I agree.

>> Have you ever tried to wait a bit after a failing restart and then
>> attempt it again, rather than rebooting?
>
> We have waited up to 48 hours, but alas the Active memory never
> returned...

So my assumption of delayed releasing was wrong.

Sorry I cannot help.

Regards,
Jörg
--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com
Re: malloc troubles on 64-bit machine
On Fri, 29 Jul 2005, Jigal van Hemert wrote:
> I do not know exactly which speedup optimizations might be taken in
> Fedora Core 4 (as mentioned in your first posting) in general, or in a
> 64 bit version specifically, so I am speculating:
>
> A running MySQL server as configured by you, with 7 GB buffer pool, will
> occupy substantial amounts of RAM, probably backed in the "swap area"
> (even though this is really a paging area). When the process terminates,
> all its resources need to be freed, including flushing files, closing
> file descriptors, and releasing these 7 GB. This may take some time.
>
> Consider that there are file systems that delay writes in order to
> optimize disk I/O and to favor reads on which other processes might be
> waiting. I suspect that similar strategies might be used on the page device.
>
> IOW: I doubt that the removal of a process from "ps" output implies that
> all its resources have already been freed, and are available.
> I admit that the Linux kernel should detect such a situation and delay
> the new request (rather than reject it) as the scarce resources are just
> getting available, but maybe this is not (yet) done?

Hi Joerg,

I am a colleague of Jigal van Hemert, with whom you had this discussion earlier. I subscribed to the MySQL list to clarify the situation, as I'm the one actually experiencing the problems.

I'd like to start with the following:

http://lists.debian.org/debian-kernel/2004/12/msg00410.html

This implies I'm not the only one struggling with 'Active' memory on a 2.6 kernel. Interesting detail: the problem report was filed by a MySQL developer named Ingo Strüwing; maybe you know him? Either way, I have already contacted him to share my experiences. Furthermore I have started a thread on the Fedora mailing list about this, as it seems to be related to somewhat wacky memory management.

Now back to the problem. What I've found out is basically the following:

- When doing a malloc call, it appears the requested amount of memory is tested against the total amount of memory minus the amount of 'Active' memory (according to /proc/meminfo). So when 6GB of Active memory has piled up on my system after a couple of compiles, the largest block of memory allocatable through malloc seems to be roughly 8GB-6GB=2GB. This is why the single malloc call for 7GB from InnoDB fails.

- Interestingly enough, it is perfectly possible to allocate multiple 2GB blocks in the above situation. This can be done almost without limit, because the memory is not actually in use yet; it is only allocated. I have been able to allocate up to 12GB (did not try any higher) this way. As long as each single malloc call requests a block which falls within the Total - Active equation, it will succeed. I tested this by modifying the 'fillmem' utility from the 'memtest' package: http://carpanta.dc.fi.udc.es/~quintela/memtest/

- Even more interesting is the fact that 'fillmem' is in fact able to reclaim the Active memory. If I instruct fillmem to allocate close to 8GB of RAM (and actually use it by filling it with random values), it does so successfully, and in the end the total amount of Active memory is near zero. After this I can restart MySQL again. This is a temporary workaround.

Now the only question that remains is why the Active memory drops close to zero when fillmem exits but not when a compile run ends. I asked this question on the Fedora list to find out whether this is a normal situation or whether there could be a memory leak somewhere in the compiler, linker etc. chain. In the meantime I can use the workaround mentioned above, but it's still a somewhat weird situation.

> Have you ever tried to wait a bit after a failing restart and then
> attempt it again, rather than rebooting?

We have waited up to 48 hours, but alas the Active memory never returned...

Thanks for your time,
--
Matthijs van der Klip
System Administrator
Spill E-Projects
The Netherlands
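The 'Total - Active' arithmetic from the message above (8GB-6GB=2GB) can be reproduced from /proc/meminfo-style text. This is a sketch under stated assumptions: the function names are hypothetical, the sample values are invented to match the figures in the message, and the rule itself is the poster's empirical observation rather than documented kernel behaviour.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def largest_single_request_kb(meminfo):
    """MemTotal minus Active: the limit observed in the message above."""
    return meminfo["MemTotal"] - meminfo["Active"]


# Invented sample: 8 GiB total, 6 GiB Active, as in the message.
SAMPLE = """\
MemTotal:      8388608 kB
MemFree:        524288 kB
Active:        6291456 kB
Inactive:      1048576 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    limit_kb = largest_single_request_kb(info)
    print(f"largest single request: {limit_kb / 1024 ** 2:.1f} GiB")
```

On a live Linux system the same functions can be fed the contents of /proc/meminfo instead of `SAMPLE`.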
Re: malloc troubles on 64-bit machine
Hi Jigal, all;

Jigal van Hemert wrote:
> Hi Joerg,
>
> From: "Joerg Bruehe"
>> Jigal van Hemert wrote:
>>> 050726 14:13:12 mysqld started
>>> 050726 14:13:12 InnoDB: Error: cannot allocate 7340048384 bytes of
>>> InnoDB: memory with malloc! Total allocated memory
>>> InnoDB: by InnoDB 78086952 bytes. Operating system errno: 12
>>
>> On my machine (Linux: SuSE 9.1), I have this line in
>> /usr/include/asm-generic/errno-base.h :
>> #define ENOMEM 12 /* Out of memory */
>
> And perror 12 also produces a similar error description.
>
>> So it looks like some address space (paging area?) was not yet free when
>> the restart was attempted. Maybe the MySQL server had not yet fully
>> terminated?
>
> MySQL server was terminated; at least it didn't show up in the output of
> the ps-command.

Hmm. I do not know exactly which speedup optimizations might be taken in Fedora Core 4 (as mentioned in your first posting) in general, or in a 64 bit version specifically, so I am speculating:

A running MySQL server as configured by you, with 7 GB buffer pool, will occupy substantial amounts of RAM, probably backed in the "swap area" (even though this is really a paging area). When the process terminates, all its resources need to be freed, including flushing files, closing file descriptors, and releasing these 7 GB. This may take some time.

Consider that there are file systems that delay writes in order to optimize disk I/O and to favor reads on which other processes might be waiting. I suspect that similar strategies might be used on the page device.

IOW: I doubt that the removal of a process from "ps" output implies that all its resources have already been freed, and are available.
I admit that the Linux kernel should detect such a situation and delay the new request (rather than reject it) as the scarce resources are just getting available, but maybe this is not (yet) done?

> It doesn't happen all the time; the server was running for a few days now.
> We have never encountered such a situation on a 32-bit machine yet. You
> could simply terminate MySQL and start it immediately.

Well, on a 32 bit machine the areas are smaller, so freeing them should be faster.

> Can memory get fragmented in some way after it is allocated?

AFAIK, this should not happen since Linux is a paging system, not swapping. Of course I can imagine (RAM or paging space) management strategies that try to keep areas contiguous, to allow larger I/O transfers, but IMHO these should not be taken so absolutely that they delay operation.

All in all, I suspect that with your growing storage sizes you need growing amounts of time to release them. Even though hardware gets faster, resource consumption manages to grow at least at the same rate ;-)

Have you ever tried to wait a bit after a failing restart and then attempt it again, rather than rebooting?

Sorry I cannot give more concrete help,
Jörg
--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com
Re: malloc troubles on 64-bit machine
Hi Joerg,

From: "Joerg Bruehe"
> Jigal van Hemert wrote:
> > 050726 14:13:12 mysqld started
> > 050726 14:13:12 InnoDB: Error: cannot allocate 7340048384 bytes of
> > InnoDB: memory with malloc! Total allocated memory
> > InnoDB: by InnoDB 78086952 bytes. Operating system errno: 12
>
> On my machine (Linux: SuSE 9.1), I have this line in
> /usr/include/asm-generic/errno-base.h :
> #define ENOMEM 12 /* Out of memory */

And perror 12 also produces a similar error description.

> So it looks like some address space (paging area?) was not yet free when
> the restart was attempted. Maybe the MySQL server had not yet fully
> terminated?

MySQL server was terminated; at least it didn't show up in the output of the ps-command.

It doesn't happen all the time; the server has been running for a few days now. We have never encountered such a situation on a 32-bit machine; there you could simply terminate MySQL and start it again immediately.

Can memory get fragmented in some way after it is allocated?

Regards, Jigal.
Re: malloc troubles on 64-bit machine
Hi Jigal!

Jigal van Hemert wrote:
> [[...]]
> After a while he needed to restart MySQL (made some changes somewhere) and
> it refused to do so:
>
> 050726 14:13:12 mysqld started
> 050726 14:13:12 InnoDB: Error: cannot allocate 7340048384 bytes of
> InnoDB: memory with malloc! Total allocated memory
> InnoDB: by InnoDB 78086952 bytes. Operating system errno: 12
> [[...]]
>
> He then rebooted the entire server and:
> [[...]]
> ...it runs happily again.
>
> Any ideas anyone on the cause and (more importantly) how to fix this
> problem?

On my machine (Linux: SuSE 9.1), I have this line in /usr/include/asm-generic/errno-base.h :

#define ENOMEM 12 /* Out of memory */

So it looks like some address space (paging area?) was not yet free when the restart was attempted. Maybe the MySQL server had not yet fully terminated?

HTH,
Joerg
--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com
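The lookup done above by reading the errno header (and, elsewhere in the thread, with `perror 12`) can also be done from Python's standard library. A small sketch; the helper name `describe_errno` is hypothetical, and the message text returned by `os.strerror` comes from the platform's C library, so its exact wording varies.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 12 is the value reported in the InnoDB error log above.
    name, message = describe_errno(12)
    print(f"errno 12 = {name}: {message}")
```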
malloc troubles on 64-bit machine
Hi all,

We're trying to get a new 64-bit machine going to get around the memory limitations of the 32-bit machines we have. On this dual Opteron server with 8GB memory we've installed Fedora Core 4 and MySQL 4.1.13.

Our sysadmin configured MySQL to use a 7GB buffer pool to accommodate a few big tables. After a while he needed to restart MySQL (he had made some changes somewhere) and it refused to do so:

050726 14:13:12 mysqld started
050726 14:13:12 InnoDB: Error: cannot allocate 7340048384 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 78086952 bytes. Operating system errno: 12
InnoDB: Check if you should increase the swap file or
InnoDB: ulimits of your operating system.
InnoDB: On FreeBSD check you have compiled the OS with
InnoDB: a big enough maximum process size.
InnoDB: We keep retrying the allocation for 60 seconds...
InnoDB: Fatal error: cannot allocate the memory for the buffer pool
050726 14:14:12 [ERROR] Can't init databases
050726 14:14:12 [ERROR] Aborting
050726 14:14:12 [Note] /usr/sbin/mysqld: Shutdown complete
050726 14:14:12 mysqld ended

He then rebooted the entire server and:

050726 14:16:37 mysqld started
050726 14:16:41 InnoDB: Started; log sequence number 0 43635
/usr/sbin/mysqld: ready for connections.
Version: '4.1.13-standard' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Edition - Standard (GPL)

...it runs happily again.

Does anyone have any ideas on the cause and (more importantly) how to fix this problem?

Regards, Jigal.