Re: malloc troubles on 64-bit machine

2005-08-09 Thread Ingo Strüwing
Hi Matthijs,

On Monday, 08.08.2005, at 14:56 +0200, Matthijs van der Klip wrote:
...
 Linux 2.4 and 2.6 kernels have a setting for their overcommitment 
 behaviour under /proc/sys/vm/overcommit_memory. The different settings 
 are:
...
 For now I've set this to '2' which means the kernel won't overcommit 
 anymore, just like any other proper OS... ;-) 

I have been running with this setting too, ever since you pointed me to it
some time ago. I do not notice a difference, though, and it does not fix my
memory leak. A 'fillmem'-like tool can, however, reclaim the memory.
Unfortunately it also reclaims the space held by the file system
buffers. On my development system this well-filled file system buffer
space is the most valuable resource. :(

My experiments with the 'fillmem'-like tool showed that merely allocating
memory does not show up in the 'Active' memory value. Only initializing
the allocated memory does the trick. This means that the memory leak
results from pages which have been in real use.
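
The difference between allocating memory and actually using it can be sketched roughly as follows. This is only an illustration, not the 'fillmem' tool itself: the function name allocate_and_touch is made up for this example, and it assumes a system where Python's mmap module can create anonymous mappings.

```python
import mmap


def allocate_and_touch(size_bytes, touch=True):
    """Map anonymous memory and optionally write one byte per page.

    Returns the number of pages that were written to.
    Hypothetical helper for illustration only.
    """
    buf = mmap.mmap(-1, size_bytes)  # anonymous mapping, not backed by a file
    touched = 0
    try:
        if touch:
            # Writing to a page is what makes the kernel back it with real memory.
            for offset in range(0, size_bytes, mmap.PAGESIZE):
                buf[offset] = 1
                touched += 1
    finally:
        buf.close()
    return touched
```

With touch=False the mapping is only reserved in the address space; with touch=True every page is written once, which corresponds to the "initializing" case described above.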

 
 One final question though: my experience with InnoDB is that it really,
 really likes to be able to fit all of its data and keys into the buffer
 pool. This would limit the maximum size of my database to roughly 4GB in
 this case, correct?  This is in a website hosting environment where the
 database is hit with about 1000 queries/s (mixed read/write).

I do not believe this. Perhaps you mean that the performance degrades if
the database is bigger than the cache. In this case you are right. But I
can't think of any way to get around it.

If you mean something else, I can't help you much with InnoDB. Please
start a new thread with a good Subject: line on the MySQL mailing list
and/or on the InnoDB forum (forums.mysql.com).
 
Regards,
Ingo
-- 
Ingo Strüwing, Senior Software Developer
MySQL AB, www.mysql.com
Office: +49 30 43672407




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Re: malloc troubles on 64-bit machine

2005-08-08 Thread Matthijs van der Klip
On Mon, 1 Aug 2005, Joerg Bruehe wrote:
 As a result, the allocation succeeds, but some process gets killed when 
 the paging space cannot take such an additional page. To the affected 
 process, this looks like a crash.

Linux 2.4 and 2.6 kernels have a setting for their overcommitment 
behaviour under /proc/sys/vm/overcommit_memory. The different settings 
are:

0   -   Heuristic overcommit handling. Obvious overcommits of
        address space are refused. Used for a typical system. It
        ensures a seriously wild allocation fails while allowing
        overcommit to reduce swap usage.  root is allowed to
        allocate slightly more memory in this mode. This is the
        default.

1   -   Always overcommit. Appropriate for some scientific
        applications.

2   -   Don't overcommit. The total address space commit
        for the system is not permitted to exceed swap + a
        configurable percentage (default is 50) of physical RAM.
        Depending on the percentage you use, in most situations
        this means a process will not be killed while accessing
        pages but will receive errors on memory allocation as
        appropriate.
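
The limit described for mode 2 can be worked out with a small sketch. The function name commit_limit and the 2 GiB swap figure in the usage note are assumptions made for illustration; only the "swap + percentage of RAM" rule and the default of 50 come from the text above.

```python
GIB = 1024 ** 3


def commit_limit(ram_bytes, swap_bytes, overcommit_ratio=50):
    """Commit limit in mode 2: swap plus a percentage of physical RAM.

    overcommit_ratio is the configurable percentage (default 50).
    Hypothetical helper for illustration only.
    """
    return swap_bytes + ram_bytes * overcommit_ratio // 100
```

For example, a machine with 8 GiB of RAM and (hypothetically) 2 GiB of swap would get commit_limit(8 * GIB, 2 * GIB) == 6 * GIB with the default percentage, so a higher percentage is needed before a single allocation close to the size of physical RAM can succeed.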

Heuristic overcommit handling seems to be the default, and my problem lies 
in the 'Obvious overcommits of address space are refused' part. For some 
reason (unknown to me) the kernel treats a single 7GB malloc as an 
'obvious overcommit', while 100 2GB mallocs (200GB total) are no problem. :P

For now I've set this to '2', which means the kernel won't overcommit 
anymore, just like any other proper OS... ;-) This makes things far 
simpler, as I can now only allocate as much memory as is physically 
available. However, it does force me to be a bit more conservative. I 
have now configured InnoDB with a 4GB buffer pool, which leaves about 3GB 
for connections (about 300 with my current MySQL settings). This seems 
reasonable.
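
As a rough check of that budget, the arithmetic can be sketched like this. The function name and the assumption that about 7GB in total (4GB + 3GB) is being divided up are illustrative only; actual per-connection memory use depends on the MySQL settings, which are not given here.

```python
def per_connection_budget_mb(total_mb, buffer_pool_mb, connections):
    """Memory left per connection after reserving the buffer pool.

    Hypothetical helper for illustration; all values are in MB.
    """
    return (total_mb - buffer_pool_mb) / connections
```

With 7 * 1024 MB in total, a 4 * 1024 MB buffer pool and 300 connections, this comes to roughly 10 MB per connection.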

One final question though: my experience with InnoDB is that it really,
really likes to be able to fit all of its data and keys into the buffer
pool. This would limit the maximum size of my database to roughly 4GB in
this case, correct?  This is in a website hosting environment where the
database is hit with about 1000 queries/s (mixed read/write).


 I am a bit surprised that the Linux kernel management will only allocate 
 memory if a single chunk of sufficient size is available. My 
 understanding was that in a paging system this is not necessary.
 
 If this is (becoming) standard Linux policy, it might be necessary to 
 demand memory piecewise. One drawback of this approach is increased 
 bookkeeping, if it ever needs to be released.
 
 I have no idea how the developers view this issue - you might open a 
 change request if you consider this Linux kernel policy definite.
 
 You wrote that if a mysql server start fails, you can run fillmem, and 
 after its exit the memory will be available. I am not sure whether 
 Rick's explanation addresses this issue as well - it might be the 
 memory defragger he refers to. If not, the once used chunks might 
 still be considered active.

I think it all comes down to the VM memory allocation scheme, which is 
IMHO buggy (hey, even the manpages say so!). As stated, I have disabled 
overcommitment for now, which seems a better fit for a dedicated database
server.


Best regards,

-- 
Matthijs van der Klip
Systeembeheerder

Spill E-Projects BV
Arendstraat 1-3
1223 RE  Hilversum
Tel. 035-6478248






Re: malloc troubles on 64-bit machine

2005-08-01 Thread Matthijs van der Klip
On Fri, 29 Jul 2005, Joerg Bruehe wrote:
  Now the only question that remains is why the Active memory goes close to 
  zero when exiting fillmem and is not when ending a compile run. I asked 
 
 Again IMHO, it shows an error in memory management.

I do not know if it's an error or not. I do agree with you that the memory 
management in Linux 2.6 does not seem to be ideal. I even found the 
following comment in the malloc manpage:

'By default, Linux follows an optimistic memory allocation strategy. This 
means that when malloc() returns non-NULL there is no guarantee that 
the memory really is available. This is a really bad bug. In case it 
turns out that the system is out of memory, one or more processes will be 
killed by the infamous OOM killer. In case Linux is employed under 
circumstances where it would be less desirable to suddenly lose some 
randomly picked processes, and moreover the kernel version is sufficiently 
recent, one can switch off this overcommitting behavior using a command 
like
    # echo 2 > /proc/sys/vm/overcommit_memory
See also the kernel Documentation directory, files 
vm/overcommit-accounting and sysctl/vm.txt.'


What I don't understand is why I seem to be one of the few suffering from
this problem. MySQL on Linux 2.6 (combined with a massive amount of RAM) is
hardly an uncommon configuration nowadays.

Secondly, it seems two parties (MySQL and Fedora) are pointing at each 
other right now. Let me quote:

On Fri, 29 Jul 2005, Rick Stevens wrote:
 Well, malloc() will fail if you request a chunk of memory and there
 isn't a SINGLE chunk available of that size.  So if memory gets fragged,
 there isn't a single 7GB chunk available and malloc() will fail.
 fillmem allocates in smaller chunks, then releases it all so the
 memory defragger can clean things up.
 
 Ideally, that's what mysql should do.  Or start off at some huge
 size and keep trying progressively smaller chunks until it gets some,
 e.g. try 8GB.  If that fails, try 6GB, then 4, then 2, you get the   
 idea.  It could then link those together and manage them.
 
 I'm not surprised that it fails.  You're asking a single application to
 grab 7/8 of your RAM--and all in one chunk--regardless of what else has
 been run before it.  On a pristine system (e.g. right after a boot),   
 it may work.  After that...

It sounds kind of reasonable if explained like this. Now, which method 
(allocating all in one single malloc call or allocating multiple smaller 
blocks) is considered good programming practice? And would this be 
something InnoDB would be likely to change? (A long shot, I guess)
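
The "try progressively smaller chunks" idea from Rick's message can be sketched as below. This is not how InnoDB allocates its buffer pool; the names allocate_with_fallback and make_limited_allocator are made up for this example, and the allocator is a stand-in that simply refuses requests above a limit. Linking several chunks together and managing them, which Rick also mentions, is not covered.

```python
GIB = 1024 ** 3


def make_limited_allocator(limit_bytes):
    """Return a fake allocator that fails for requests above limit_bytes.

    Hypothetical stand-in for a real allocation call.
    """
    def try_alloc(size_bytes):
        if size_bytes > limit_bytes:
            raise MemoryError(f"cannot allocate {size_bytes} bytes")
        return object()  # placeholder for the allocated block
    return try_alloc


def allocate_with_fallback(try_alloc, sizes):
    """Try each size in order; return (size, block) for the first success."""
    for size in sizes:
        try:
            return size, try_alloc(size)
        except MemoryError:
            continue  # this size failed, fall through to the next smaller one
    raise MemoryError("no candidate size could be allocated")
```

With an allocator limited to 5 GiB and candidate sizes of 8, 6, 4 and 2 GiB, the call returns the 4 GiB block: the first two requests fail and the third succeeds.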


Best regards,

-- 
Matthijs van der Klip
System Administrator
Spill E-Projects
The Netherlands








Re: malloc troubles on 64-bit machine

2005-08-01 Thread Joerg Bruehe

Hi Matthijs!


Matthijs van der Klip wrote:

On Fri, 29 Jul 2005, Joerg Bruehe wrote:

Now the only question that remains is why the Active memory goes close to 
zero when exiting fillmem and is not when ending a compile run. I asked 


Again IMHO, it shows an error in memory management.



I do not know if it's an error or not. I do agree with you that the memory 


Well, I understood your description to mean that after several compile 
runs or a mysql run, 'active memory' is not returned. This is what looks 
to me like an error.


management in Linux 2.6 does not seem to be ideal. I even found the 
following comment in the malloc manpage:


'By default, Linux follows an optimistic memory allocation strategy. This 
means that when malloc() returns non-NULL there is no guarantee that 
the memory really is available. [[...]]
See also the kernel Documentation directory, files 
vm/overcommit-accounting and sysctl/vm.txt.'


This memory overcommitment is the other way around: Processes get more 
memory requests granted (in total) than they can use afterwards.
AIUI, this is implemented by a delayed allocation in paging space: A 
page that got added to the process' address space (e.g. by a malloc()) 
is not immediately assigned a location in the paging space, but only when 
it has been modified in RAM and/or needs to be written out to the paging 
space for the first time.
As a result, the allocation succeeds, but some process gets killed when 
the paging space cannot take such an additional page. To the affected 
process, this looks like a crash.





What I don't understand is why I seem to be one of few suffering from this
problem. MySQL on Linux 2.6 (combined with a massive amount of RAM) is
hardly an uncommon configuration nowadays.

Secondly it seems two parties (MySQL and Fedora) are pointing to each 
other right now. Let me quote:


On Fri, 29 Jul 2005, Rick Stevens wrote:


Well, malloc() will fail if you request a chunk of memory and there
isn't a SINGLE chunk available of that size.  So if memory gets fragged,
there isn't a single 7GB chunk available and malloc() will fail.
fillmem allocates in smaller chunks, then releases it all so the
memory defragger can clean things up.

Ideally, that's what mysql should do.  Or start off at some huge
size and keep trying progressively smaller chunks until it gets some,
e.g. try 8GB.  If that fails, try 6GB, then 4, then 2, you get the   
idea.  It could then link those together and manage them.


I'm not surprised that it fails.  You're asking a single application to
grab 7/8 of your RAM--and all in one chunk--regardless of what else has
been run before it.  On a pristine system (e.g. right after a boot),   
it may work.  After that...



It sounds kind of reasonable if explained like this. Now, which method 
(allocating all in one single malloc call or allocating multiple smaller 
blocks) is considered good programming practice? And would this be 
something InnoDB would be likely to change? (A long shot, I guess)


This is thin ice for me, but still:
I am a bit surprised that the Linux kernel management will only allocate 
memory if a single chunk of sufficient size is available. My 
understanding was that in a paging system this is not necessary.


If this is (becoming) standard Linux policy, it might be necessary to 
demand memory piecewise. One drawback of this approach is increased 
bookkeeping, if it ever needs to be released.


I have no idea how the developers view this issue - you might open a 
change request if you consider this Linux kernel policy definite.



You wrote that if a mysql server start fails, you can run fillmem, and 
after its exit the memory will be available. I am not sure whether 
Rick's explanation addresses this issue as well - it might be the 
memory defragger he refers to. If not, the once used chunks might 
still be considered active.



Regards,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com




Re: malloc troubles on 64-bit machine

2005-07-29 Thread Matthijs van der Klip
On Fri, 29 Jul 2005, Jigal van Hemert wrote:
 I do not know exactly which speedup optimizations might be taken in
 Fedora Core 4 (as mentioned in your first posting) in general, or in a
 64 bit version specifically, so I am speculating:
 
 A running MySQL server as configured by you, with 7 GB buffer pool, will
 occupy substantial amounts of RAM, probably backed in the swap area
 (even though this is really a paging area). When the process terminates,
 all its resources need to be freed, including flushing files, closing
 file descriptors, and releasing these 7 GB. This may take some time.
 
 Consider that there are file systems that delay writes in order to
 optimize disk I/O and to favor reads on which other processes might be
 waiting. I suspect that similar strategies might be used on the page device.
 
 IOW: I doubt that the removal of a process from ps output implies that
 all its resources have already been freed, and are available.
 I admit that the Linux kernel should detect such a situation and delay
 the new request (rather than reject it) as the scarce resources are just
 getting available, but maybe this is not (yet) done?


Hi Joerg,

I am a colleague of Jigal van Hemert, with whom you had this discussion 
earlier. I subscribed to the MySQL list to clarify the situation, as I'm 
the one actually experiencing the problems.

I'd like to start with the following:

http://lists.debian.org/debian-kernel/2004/12/msg00410.html

This implies I'm not the only one struggling with 'Active' memory on a 2.6 
kernel. Interesting detail: the problem report was filed by a MySQL 
developer named Ingo Strüwing, maybe you know him? Either way, I already 
contacted him to share my experiences.

Furthermore I have started a thread on the Fedora mailing list about this, 
as it seems to be related to somewhat whacky memory management.

Now back to the problem, what I've found out is basically the following:

- When doing a malloc call it appears the requested amount of memory is 
  tested against the total amount of memory minus the amount of 'Active' 
  (according to /proc/meminfo) memory. So when 6GB of Active memory has 
  piled up on my system after a couple of compiles, the largest block of 
  memory allocatable through malloc seems to be roughly 8GB-6GB=2GB. This 
  is why the single malloc call for 7GB from InnoDB fails.

- Interestingly enough it is perfectly possible to allocate multiple 2GB 
  blocks in above situation. This can be done almost without limit, 
  because the memory is not actually in use yet, it is only allocated. I 
  have been able to allocate up to 12GB (did not try any higher) this way. 
  As long as the individual malloc calls request blocks which fall within the 
  Total - Active equation, this will succeed. I tested this by modifying 
  the 'fillmem' utility from the 'memtest' package:

  http://carpanta.dc.fi.udc.es/~quintela/memtest/

- Even more interesting is the fact that 'fillmem' is in fact able to 
  reclaim the Active memory. If I instruct fillmem to allocate (and 
  actually use it by filling it with random values) near to 8GB of RAM, it 
  does so with success and in the end the total amount of Active memory is 
  near zero. After this I can restart MySQL again. This is a temporary 
  workaround.
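
The "Total minus Active" rule of thumb from the first point can be sketched as follows. It only encodes the behaviour observed above, not documented kernel behaviour. The function names and the sample figures are assumptions for illustration; on a real system the text would be read from /proc/meminfo.

```python
# Hypothetical sample in the /proc/meminfo layout (values in kB).
SAMPLE_MEMINFO = """\
MemTotal:      8388608 kB
MemFree:        524288 kB
Active:        6291456 kB
Inactive:      1048576 kB
"""


def parse_meminfo(text):
    """Parse 'Name: value kB' lines into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def largest_single_request_kb(meminfo):
    """Observed heuristic: total memory minus 'Active' memory, in kB."""
    return meminfo["MemTotal"] - meminfo["Active"]
```

With the sample figures (8GB total, 6GB Active) the result is 2097152 kB, i.e. the roughly 2GB block size mentioned above.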


Now the only question that remains is why the Active memory drops close to 
zero when fillmem exits, but not when a compile run ends. I asked 
this question on the Fedora list to find out if this is a normal situation 
or if there could be a memory leak somewhere in the compiler, linker etc. 
chain. In the meantime I can use the workaround mentioned above, but it is 
still a somewhat weird situation.


 Have you ever tried to wait a bit after a failing restart and then
 attempt it again, rather than rebooting?

We have waited up to 48 hours, but alas the Active memory never 
returned...


Thanks for your time,

-- 
Matthijs van der Klip
System Administrator
Spill E-Projects
The Netherlands









Re: malloc troubles on 64-bit machine

2005-07-29 Thread Joerg Bruehe

Matthijs,


thank you for your detailed description:

Matthijs van der Klip wrote:

[[...]]

I'd like to start with the following:

http://lists.debian.org/debian-kernel/2004/12/msg00410.html

This implies I'm not the only one struggling with 'Active' memory on a 2.6 
kernel. Interesting detail: the problem report is issued by a MySQL 
developer named Ingo Strüwing, maybe you know him? Either way, I already 
contacted him to share my experiences.


I know him very well :-)
If Ingo does not have an answer, I will probably have none either.



Furthermore I have started a thread on the Fedora mailing list about this, 
as it seems to be related to somewhat whacky memory management.


Now back to the problem, what I've found out is basically the following:

- When doing a malloc call it appears the requested amount of memory is 
  tested against the total amount of memory minus the amount of 'Active' 
  (according to /proc/meminfo) memory. [[...]]


- Interestingly enough it is perfectly possible to allocate multiple 2GB 
  blocks in above situation. This can be done almost without limit, [[...]]


- Even more interesting is the fact that 'fillmem' is in fact able to 
  reclaim the Active memory. [[...]]


This is quite a detailed description, IMHO.



Now the only question that remains is why the Active memory goes close to 
zero when exiting fillmem and is not when ending a compile run. I asked 


Again IMHO, it shows an error in memory management.

this question on the Fedora list to find out if this is a normal situation 
or if there could be a memory leak somewhere in the compiler, linker etc 


A memory leak typically means that a process acquired additional memory, 
no longer uses it, but also does not return it for future 
allocations. So the process' memory consumption would grow, but at its 
exit the system would make all that memory available again.
AIUI, what you describe is that it does _not_ become available after 
process exit, but this is a system issue and not internal to the 
application process / program.


chain. In the meanwhile I can use the mentioned workaround, but it's still 
a bit weird situation.


I agree.




Have you ever tried to wait a bit after a failing restart and then
attempt it again, rather than rebooting?



We have waited up to 48 hours, but alas the Active memory never 
returned...


So my assumption of delayed releasing was wrong. Sorry I cannot help.


Regards,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com




Re: malloc troubles on 64-bit machine

2005-07-27 Thread Joerg Bruehe

Hi Jigal, all;


Jigal van Hemert wrote:

Hi Joerg,

From: Joerg Bruehe


Jigal van Hemert wrote:


050726 14:13:12  mysqld started
050726 14:13:12  InnoDB: Error: cannot allocate 7340048384 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 78086952 bytes. Operating system errno: 12


On my machine (Linux: SuSE 9.1), I have this line in
/usr/include/asm-generic/errno-base.h :
   #define ENOMEM  12  /* Out of memory */



And perror 12 also produces a similar error description.



So it looks like some address space (paging area?) was not yet free when
the restart was attempted. Maybe the MySQL server had not yet fully
terminated?



MySQL server was terminated; at least it didn't show up in the output of the
ps-command.


Hmm.
I do not know exactly which speedup optimizations might be taken in 
Fedora Core 4 (as mentioned in your first posting) in general, or in a 
64 bit version specifically, so I am speculating:


A running MySQL server as configured by you, with 7 GB buffer pool, will 
occupy substantial amounts of RAM, probably backed in the swap area 
(even though this is really a paging area). When the process terminates, 
all its resources need to be freed, including flushing files, closing 
file descriptors, and releasing these 7 GB. This may take some time.


Consider that there are file systems that delay writes in order to 
optimize disk I/O and to favor reads on which other processes might be 
waiting. I suspect that similar strategies might be used on the page device.


IOW: I doubt that the removal of a process from ps output implies that 
all its resources have already been freed, and are available.
I admit that the Linux kernel should detect such a situation and delay 
the new request (rather than reject it) as the scarce resources are just 
getting available, but maybe this is not (yet) done?




It doesn't happen all the time; the server was running for a few days now.
We have never encountered such a situation on a 32-bit machine yet. You
could simply terminate MySQL and start it immediately.


Well, on a 32 bit machine the areas are smaller, so freeing them should 
be faster.




Can memory get fragmented in some way after it is allocated?


AFAIK, this should not happen since Linux is a paging system, not swapping.
Of course I can imagine (RAM or paging space) management strategies that 
try to keep areas contiguous, to allow larger I/O transfers, but IMHO 
these should not be applied so strictly that they delay operation.



All in all, I suspect that with your growing storage sizes you need 
growing amounts of time to release them.
Even though hardware gets faster, resource consumption manages to grow 
at at least the same rate ;-)


Have you ever tried to wait a bit after a failing restart and then 
attempt it again, rather than rebooting?



Sorry I cannot give more concrete help,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com




Re: malloc troubles on 64-bit machine

2005-07-26 Thread Joerg Bruehe

Hi Jigal!

Jigal van Hemert wrote:

[[...]]
After a while he needed to restart MySQL (made some changes somewhere) and
it refused to do so:

050726 14:13:12  mysqld started
050726 14:13:12  InnoDB: Error: cannot allocate 7340048384 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 78086952 bytes. Operating system errno: 12
[[...]]

He then rebooted the entire server and:
[[...]]
...it runs happily again.

Any ideas anyone on the cause and (more importantly) how to fix this
problem?


On my machine (Linux: SuSE 9.1), I have this line in 
/usr/include/asm-generic/errno-base.h :

   #define ENOMEM  12  /* Out of memory */

So it looks like some address space (paging area?) was not yet free when 
the restart was attempted. Maybe the MySQL server had not yet fully 
terminated?
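
The mapping from the number in the log to the symbolic name can also be checked without the header file. A minimal sketch, assuming a Linux system (errno numbers are platform-specific); the function name describe_errno is made up for this example.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, system message) for an errno value.

    Hypothetical helper for illustration only.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)
```

On Linux, describe_errno(12) gives the name ENOMEM together with the system's message text for it.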


HTH,
Joerg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com




Re: malloc troubles on 64-bit machine

2005-07-26 Thread Jigal van Hemert
Hi Joerg,

From: Joerg Bruehe
 Jigal van Hemert wrote:
  050726 14:13:12  mysqld started
  050726 14:13:12  InnoDB: Error: cannot allocate 7340048384 bytes of
  InnoDB: memory with malloc! Total allocated memory
  InnoDB: by InnoDB 78086952 bytes. Operating system errno: 12
 On my machine (Linux: SuSE 9.1), I have this line in
 /usr/include/asm-generic/errno-base.h :
 #define ENOMEM  12  /* Out of memory */

And perror 12 also produces a similar error description.

 So it looks like some address space (paging area?) was not yet free when
 the restart was attempted. Maybe the MySQL server had not yet fully
 terminated?

MySQL server was terminated; at least it didn't show up in the output of the
ps-command.

It doesn't happen all the time; the server was running for a few days now.
We have never encountered such a situation on a 32-bit machine yet. You
could simply terminate MySQL and start it immediately.

Can memory get fragmented in some way after it is allocated?

Regards, Jigal.

