Re: LNXADMIN as IDENTITY
Hi Tito,

the reason is that this machine also serves as a terminal server. In the cookbook, it is also used to show the different setups of RHEL and SLES for LNXADMIN, but the main reason is to run the IUCV terminal server.

Berthold

On Thu, 9 Jul 2015 22:00:19 -0300 Tito Garrido wrote:
> Hi Folks,
>
> I am trying to understand why the cookbook (For SSI installation)
> creates LNXADMIN as a IDENTITY guest. Is it because of the clone
> script?
>
> Thanks,
>
> Tito

--
Berthold Gunreben
Build Service Team
http://www.suse.de/
Maxfeldstr. 5
SUSE Linux GmbH
D-90409 Nuernberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton
HRB 21284 (AG Nürnberg)

--
For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit http://wiki.linuxvm.org/
Re: How to find a memory leak?
I replied to Mike and Alan yesterday evening but it does not show in the archives. I am assuming it got lost and I am resending. Sorry if this is a duplicate.

> If vmcp is called with a buffer of 1M and the last slab in
> /proc/buddyinfo is 0, would it not be reasonable to nudge
> the kernel to free at least one slot up, assuming this can be done safely?

> So there's no point in nudging the kernel to do a Hail Mary attempt to
> find more memory. If it were available, the slab count would already be > 0.

As I understand it from the time I was researching this, /proc/buddyinfo shows the current state of the slab cache. Since the kernel uses a large amount of memory for caches and buffers and these are ready to be freed when needed, a zero slab count does not necessarily mean that a call needing that slab will fail. The kernel does several rounds of freeing and rearranging memory to find or construct a suitable slab. I looked at this in kernel 2.6 and it may have changed, but there the algorithm was different for slabs smaller than 32k: for those it tried even harder to free memory. I also remember there was some time limit on the freeing; if the kernel did not free the memory in time, it failed.

So a vmcp failure happens when there are zero free slabs and the kernel fails to free enough contiguous memory. I guess you can end up freeing a lot and still have enough fragmentation not to be able to find a large slab.

Where the s390 is different is that it uses large contiguous buffers all over. The rest of Linux tries to use smaller or discontiguous buffers, which may be why the kernel mainline is not bothered by problems with reclaiming larger slabs. So the question for the VM/zLinux devs could be whether the diag that allows Linux to make CP calls could be changed to return partial data or do something else in order to not use a large buffer.
Tomas
Re: How to find a memory leak?
Perhaps double check if /bin/echo is a link to /usr/bin/echo on your system, in which case try updating your sudoers line to point to /usr/bin/echo instead of /bin/echo?

Tom Anderson
Ex ignorantia ad sapientiam
e tenebris ad lucem!

> On Jul 9, 2015, at 8:51 AM, Michael MacIsaac wrote:
>
> The next question is - can this ever be done by a non-root user? I tried
> adding /bin/echo to /etc/sudoers, but still get an error:
>
> mike@lab153:~ $ sudo /bin/echo 3 > /proc/sys/vm/drop_caches
> -bash: /proc/sys/vm/drop_caches: Permission denied
>
> -Mike
LNXADMIN as IDENTITY
Hi Folks,

I am trying to understand why the cookbook (For SSI installation) creates LNXADMIN as an IDENTITY guest. Is it because of the clone script?

Thanks,

Tito

--
Linux User #387870
Re: How to find a memory leak?
Alan Altmark writes:
> On Thursday, 07/09/2015 at 04:25 EDT, Mark Post wrote:
> > > The next question is - can this ever be done by a non-root user? I tried
> >
> > No.
> > # ls -l /proc/sys/vm/drop_caches
> > -rw-r--r-- 1 root root 0 Jul 9 16:23 /proc/sys/vm/drop_caches
>
> Thank heavens! That's all we need -- unprivileged users messing with the
> cache

Even unprivileged programs have limited and controlled access to influencing the caching behaviour for files that they deal with, whether via read/write or mapped into memory. There are the POSIXy interfaces:

    madvise(..., MADV_RANDOM) and posix_fadvise(..., POSIX_FADV_RANDOM)
    madvise(..., MADV_SEQUENTIAL) and posix_fadvise(..., POSIX_FADV_SEQUENTIAL)

Similarly WILLNEED, DONTNEED and a few extras like:

    fsync(...)
    fdatasync(...)

and one or two where the APIs or functionality aren't as standardised or common, like readahead(...).

Linux has "per-open-file" tracking of readahead window information and per-page marks in the page cache itself and does a good job of deducing the right amount of sync/async readahead based on access pattern and memory pressure in most common cases. However, it's nice to be able to give it a hint or two (e.g. "I'm going to stream through this file once and then won't need it again") while continuing to use the usual simple file APIs without having to mess around reinventing your own buffering or fiddle around with separate threads, async I/Os or separate access methods (or equivalent) in O/Ses where caching is all-or-nothing or privileged-control-only.

--Malcolm

--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group
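The "stream through this file once" hint can also be given from a shell script without root. The following is only a sketch: it assumes GNU dd from coreutils 8.11 or later, whose nocache flag asks the kernel to discard cached pages for the file (the DONTNEED advice mentioned above). The function name read_once is made up for illustration, and the advice is only a hint that the kernel may ignore, for example for dirty pages.

```shell
# Stream a file to stdout once, then advise the kernel that its cached
# pages are no longer needed.
# Assumption: GNU dd with the "nocache" flag (coreutils 8.11 or later).
# read_once is a hypothetical helper name, not part of any package.
read_once() {
    file="$1"
    cat "$file" || return 1
    # count=0 copies no data; dd only sends the drop-cache advice for the file.
    dd if="$file" iflag=nocache count=0 2>/dev/null
}
```

For example, read_once /var/log/big.log | gzip > big.log.gz (a hypothetical path) reads the file through the page cache as usual and then hints that the pages can go. Nothing here touches other users' cached data.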
Re: How to find a memory leak?
On Thursday, 07/09/2015 at 04:25 EDT, Mark Post wrote:
> > The next question is - can this ever be done by a non-root user? I tried
>
> No.
> # ls -l /proc/sys/vm/drop_caches
> -rw-r--r-- 1 root root 0 Jul 9 16:23 /proc/sys/vm/drop_caches

Thank heavens! That's all we need -- unprivileged users messing with the cache.

Alan Altmark

Senior Managing z/VM and Linux Consultant
Lab Services System z Delivery Practice
IBM Systems & Technology Group
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott
Re: How to find a memory leak?
>>> On 7/9/2015 at 11:51 AM, Michael MacIsaac wrote:
> Tomas,
>
>> I forgot to answer this question: you can drop buffers and cache by running
>> echo 3 > /proc/sys/vm/drop_caches
>
> Nice, even easier. Thanks!
>
> The next question is - can this ever be done by a non-root user? I tried

No.

# ls -l /proc/sys/vm/drop_caches
-rw-r--r-- 1 root root 0 Jul 9 16:23 /proc/sys/vm/drop_caches

Mark Post
Re: How to find a memory leak?
On Thursday, 07/09/2015 at 01:16 EDT, Michael MacIsaac wrote:
> I'm going to stop here for now. I've learned a lot about Linux memory from
> this thread (but that's easy when you don't know much to begin with :)).
>
> I guess a question to the Linux developers in Germany would be:
>
> If vmcp is called with a buffer of 1M and the last slab in /proc/buddyinfo
> is 0, would it not be reasonable to nudge the kernel to free at least one
> slot up, assuming this can be done safely?

My 0.02 USD: CP has similar issues for I/O and V-SIE.

Slab creation (coalescing adjacent page frames into larger slabs) is a function that is intended to ensure the available count for each slab is > 0. The ideal time to create a larger slab is when memory is being released. The only way to get larger slabs is to force more memory to be released. This is why the cache controls discussed here are important - they keep as much memory released as advisable.

So there's no point in nudging the kernel to do a Hail Mary attempt to find more memory. If it were available, the slab count would already be > 0.

Alan Altmark
Re: How to find a memory leak?
If the diag 8 response is truncated, CP sets condition code 1 and returns how many bytes of the output did not fit in the buffer. If this information were somehow returned by the vmcp command, you would know how much bigger your response buffer should be and could reissue the command with the correct buffer size. While that doesn't fix the problem of vmcp not being able to obtain a buffer, it would help avoid it by not needing a very large buffer for many commands.

CMS Pipelines automatically obtains a larger buffer for CP QUERY commands, because there are no side effects from issuing a query more than once. If the command is not a query, the number of bytes that didn't fit in the buffer can be returned to the program, so that the command can be issued again with a larger buffer.

On Thu, Jul 9, 2015 at 1:16 PM, Michael MacIsaac wrote:
> I'm going to stop here for now. I've learned a lot about Linux memory from
> this thread (but that's easy when you don't know much to begin with :)).
>
> I guess a question to the Linux developers in Germany would be:
>
> If vmcp is called with a buffer of 1M and the last slab in /proc/buddyinfo
> is 0, would it not be reasonable to nudge the kernel to free at least one
> slot up, assuming this can be done safely?
>
> Thanks.
>
> -Mike

--
Bruce Hayden
z/VM and Linux on z Systems ATS
IBM, Endicott, NY
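The same idea can be approximated from a script today: start with a small buffer and only ask for a larger one when the response did not fit. This is a sketch under assumptions, not vmcp's documented contract: it relies on vmcp exiting with status 2 when the response was truncated, which is how I recall the s390-tools man page describing it, so please verify on your system. cp_query and the VMCP override variable are made-up names. The function is limited to CP QUERY because, as noted above, reissuing anything else can have side effects.

```shell
# Run a CP QUERY through vmcp, growing the response buffer only on truncation.
# Assumption: vmcp exit status 2 means "response truncated" (check man vmcp).
# VMCP is a hypothetical override so the sketch can be tried without z/VM.
cp_query() {
    out=""
    rc=0
    for size in 8k 64k 1M; do
        rc=0
        out=$(${VMCP:-vmcp} --buffer="$size" query "$@") || rc=$?
        # Anything other than "truncated" ends the loop, success or not.
        if [ "$rc" -ne 2 ]; then
            break
        fi
    done
    printf '%s\n' "$out"
    return "$rc"
}
```

Usage would be along the lines of cp_query userid. Most queries would then be served from the 8k or 64k buffer, and the 1M allocation, which is the one that suffers from fragmentation, is attempted only when really needed.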
Re: How to find a memory leak?
I'm going to stop here for now. I've learned a lot about Linux memory from this thread (but that's easy when you don't know much to begin with :)).

I guess a question to the Linux developers in Germany would be:

If vmcp is called with a buffer of 1M and the last slab in /proc/buddyinfo is 0, would it not be reasonable to nudge the kernel to free at least one slot up, assuming this can be done safely?

Thanks.

-Mike

On Thu, Jul 9, 2015 at 12:53 PM, Pavelka, Tomas wrote:
> > Maybe I'll think about sudo-enabling cmmflush and checking the last
> > field of /proc/buddyinfo to see if it needs to be run.
>
> I tried doing things based on the values of /proc/buddyinfo but what I
> found is that if there are zeroes in the high order slab counts, there is a
> chance that vmcp with 1M buffer will fail. But not a guarantee. Sometimes
> Linux just rearranges the slabs and finds the memory. Which makes it even
> harder to reproduce. Beware that you can spend ages debugging this ;-)
>
> Tomas
Re: How to find a memory leak?
Finding the cause and setting an alert would certainly help anticipate. This data is collected each minute automatically, at a cost of less than 0.1% of one IFL per server, at process and system level. There are more metrics; this is a sample:

Report: ESALNXP LINUX Velocity Software Corporate ZMAP 4.2.0 02/
Monitor initialized: 02/27/ First record analyzed: 02/27/15 19:00:00
node/ <-Process Ident-> <---Storage Metrics (MB)-->
Name IDPPID GRP Size RSS Peak Swap Data Stk EXEC Lib Lck PTbl
02/27/15 19:01:00
oracle0 0 0 7375 980 72120 174 4.9 1839 478 0 8.98
init 1 1 010 0.80 0.14 0.1 0.6 0 0 0.01
perl 2140 1 214096 9.00 4.06 0.1 1.4 2.2 0 0.03
snmpd22809 1 22808 359 34.70 3.50 0.1 0.0 29 0 0.05

and at system level:

Report: ESALNXR LINUX RAM/Storage Analysis Report Velocity Sof
Monitor initialized: 02/27/15 at 19:00:00 on 2828 serial 314C7 First record
Node/<-Kernel(MB)-> <-Buffers(MB <---Cache><---Anonymous---> Stack<-Slab-->
Time Total Free Size Actv Swap Total Actv Inact Size Size SRec Size Dirty B
02/27/15 19:01:00 oracle 994.8 13.7 5500 0.8 115.60 00 38.40 251 0.2
19:02:00 oracle 994.8 13.7 5500 0.8 115.60 00 38.30 251 0.0

On 7/9/2015 9:31 AM, Michael MacIsaac wrote:
> Barton,
>
> It reports on the /proc/buddyinfo values and anticipates vmcp failing?
Re: How to find a memory leak?
> Maybe I'll think about sudo-enabling cmmflush and checking the last field of > /proc/buddyinfo to see if it needs to be run. I tried doing things based on the values of /proc/buddyinfo but what I found is that if there are zeroes in the high order slab counts, there is a chance that vmcp with 1M buffer will fail. But not a guarantee. Sometimes Linux just rearranges the slabs and finds the memory. Which makes it even harder to reproduce. Beware that you can spend ages debugging this ;-) Tomas
Re: How to find a memory leak?
Barton,

It reports on the /proc/buddyinfo values and anticipates vmcp failing?

On Thu, Jul 9, 2015 at 12:26 PM, Barton Robinson <bar...@velocitysoftware.com> wrote:
> And a good performance monitor would already have this reported - down
> to the process level.
Re: How to find a memory leak?
Tomas,

> But as I said, in my experiments dropping caches did not help.

So we both arrived at a technique that will not work - (he he :))

> What makes this hard to test is that vmcp running out of memory is not easily reproducible.

Yes, the error has been quite intermittent. Maybe I'll think about sudo-enabling cmmflush and checking the last field of /proc/buddyinfo to see if it needs to be run.

Thanks all.

-Mike
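A check of that kind could look roughly like the sketch below. It is illustrative only: has_largest_block and flush_if_fragmented are made-up names, the command to run (for example sudo cmmflush) is passed in by the caller, and a zero in the last column is a hint rather than proof that a 1M vmcp buffer will fail, as discussed elsewhere in this thread.

```shell
# Succeed (exit 0) when any zone in a /proc/buddyinfo-style file reports at
# least one free block in its last (highest-order) column.
# Both function names are hypothetical helpers for this sketch.
has_largest_block() {
    awk '$NF > 0 { found = 1 } END { exit found ? 0 : 1 }' "${1:-/proc/buddyinfo}"
}

# Run the given command only when no highest-order block is free.
# $1 is the buddyinfo file, the remaining arguments are the command.
flush_if_fragmented() {
    buddyinfo="$1"
    shift
    if has_largest_block "$buddyinfo"; then
        return 0
    fi
    "$@"
}
```

It might be called as flush_if_fragmented /proc/buddyinfo sudo cmmflush before issuing vmcp --buffer=1M, assuming cmmflush has been sudo-enabled for the calling user.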
Re: How to find a memory leak?
And a good performance monitor would already have this reported - down to the process level.

On 7/9/2015 9:06 AM, Michael MacIsaac wrote:
> Let me answer my own question. Perhaps kludgy, but by adding 'tee' to
> sudo, this technique works:
>
> mike@lab141:~ # echo 3 | sudo /usr/bin/tee /proc/sys/vm/drop_caches > /dev/null
Re: How to find a memory leak?
Let me answer my own question. Perhaps kludgy, but by adding 'tee' to sudo, this technique works:

root@lab141:~ # visudo
root@lab141:~ # tail -1 /etc/sudoers
%zoom ALL=NOPASSWD:/usr/bin/tee
root@lab141:~ # su - mike
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        473         18          0        111        170
-/+ buffers/cache:        190        300
Swap:          512          0        511
mike@lab141:~ # echo 3 | sudo /usr/bin/tee /proc/sys/vm/drop_caches > /dev/null
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        103        388          0          1         12
-/+ buffers/cache:         89        401
Swap:          512          0        511
Re: How to find a memory leak?
> The next question is - can this ever be done by a non-root user? I tried
> adding /bin/echo to /etc/sudoers, but still get an error:

I was able to google these two approaches to dropping caches over sudo:

sudo sh -c "sync; echo 3 > /proc/sys/vm/drop_caches"

or

echo 3 | sudo tee /proc/sys/vm/drop_caches

See the comments here: http://www.linuxinsight.com/proc_sys_vm_drop_caches.html

But as I said, in my experiments dropping caches did not help.

What makes this hard to test is that vmcp running out of memory is not easily reproducible. It can happen once, then you can try rerunning for a while and it keeps happening. But suddenly the kernel rearranges the slabs and you can run fine for days.

The problem is that I have not found a way to free memory for large kernel slabs from within a script. If you are trying to fix the problem as a human, the solution is to repeatedly run vmcp --buffer=1M q userid and it will eventually go away.

Tomas
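Both approaches work for the same reason: the write to /proc/sys/vm/drop_caches is performed by a process running as root. In "sudo /bin/echo 3 > file" the redirection is opened by the calling, unprivileged shell before sudo even starts, which explains the "Permission denied". A minimal wrapper around the tee form, with the made-up name drop_caches and a hypothetical SUDO override variable used only so the function can be tried on an ordinary file:

```shell
# Write "3" to the drop_caches control file through a privileged tee.
# drop_caches (the function) and the SUDO override are assumptions made for
# this sketch; by default the write goes through "sudo tee".
drop_caches() {
    target="${1:-/proc/sys/vm/drop_caches}"
    # Flush dirty pages first so that more of the cache can be dropped.
    sync
    # tee, not the calling shell, opens the target file for writing.
    printf '3\n' | ${SUDO-sudo} tee "$target" > /dev/null
}
```

Calling drop_caches with no argument needs a sudoers entry for tee. An entry that names the full command line, /usr/bin/tee /proc/sys/vm/drop_caches, should be narrower than allowing tee on any file, though that is an untested suggestion.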
Re: How to find a memory leak?
Easier, but the pages aren't dropped from the z/VM side immediately, so if you are memory constrained there, cmmflush is your friend.

-----Original Message-----
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael MacIsaac
Sent: Thursday, July 09, 2015 8:51 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] How to find a memory leak?

Tomas,

> I forgot to answer this question: you can drop buffers and cache by running
> echo 3 > /proc/sys/vm/drop_caches

Nice, even easier. Thanks!
Re: How to find a memory leak?
Tomas,

> I forgot to answer this question: you can drop buffers and cache by running
> echo 3 > /proc/sys/vm/drop_caches

Nice, even easier. Thanks!

The next question is - can this ever be done by a non-root user? I tried adding /bin/echo to /etc/sudoers, but still get an error:

mike@lab153:~ $ sudo /bin/echo 3 > /proc/sys/vm/drop_caches
-bash: /proc/sys/vm/drop_caches: Permission denied

-Mike
Re: How to find a memory leak?
> Thanks. I copied and pasted cmmflush and it seems to work nicely

If I understand it right then you have to look at how cmmflush affects the output of /proc/buddyinfo. If you see a non-zero count in the last order of slab (i.e. the one with 1MB size) then you are good to run vmcp --buffer=1M. Otherwise you may still run into problems even if free -m shows a lot of free memory.

But I have not tried cmmflush, maybe it will help.

The way that I was able to reproduce the memory fragmentation problem was by copying a large amount of data over SCP to that Linux machine. Try that and see if you can reproduce the vmcp --buffer=1M failure.

Tomas
Re: How to find a memory leak?
> As a workaround, is there a command to flush the buffer cache?

I forgot to answer this question: you can drop buffers and cache by running

echo 3 > /proc/sys/vm/drop_caches

See http://linux-mm.org/Drop_Caches

As far as I remember this did not help at all. My guess about why is that when looking for memory, the kernel will already try to drop some caches, but in the case of memory fragmentation that does not help. But feel free to try.

Other things I tried that did not work, or did not work consistently, were repeating the vmcp call, possibly with a wait, and increasing the server memory to about 2G. What definitely does not help is increasing the memory with chmem, because that adds memory not usable by the kernel for this kind of buffer allocation (again, I forgot the details).

Tomas
Re: How to find a memory leak?
Tomas, Marcy,

Thanks. I copied and pasted cmmflush and it seems to work nicely:

# free -m
             total       used       free     shared    buffers     cached
Mem:           492        162        329          0         29         83
-/+ buffers/cache:         49        442
Swap:          898          0        898
# cmmflush
11:16:17 Currently free 328MB, dropping cache...
11:16:18 Now free 422MB, released 93MB
11:16:18 CMM base is 0MB, target is 396MB
11:16:19 CMM currently at 396MB...
11:16:19 Done! CMM base restored to 0MB
11:16:19 Released 396 MB of memory
# free -m
             total       used       free     shared    buffers     cached
Mem:           492         69        423          0          0         19
-/+ buffers/cache:         49        442
Swap:          898          0        898

Rob, thanks for the contribution.

-Mike
Re: How to find a memory leak?
This is a really ugly problem that I don't have a solution for. But let me give you a bit of info if you want to do your own digging:

The way I found this is that I was adding NICs to a Linux on the fly. Sometimes this would fail, with a page allocation failure reported in syslog. The discussion on this list is here:

http://www.mail-archive.com/linux-390%40vm.marist.edu/msg65371.html

What I found later is that the NIC driver needs 64k of memory in kernel space. This means the memory needs to be contiguous. The kernel keeps memory in structures called slabs, and keeps pools of these. If you do

cat /proc/buddyinfo
Node 0, zone DMA 9078 10398 3135838164 14 0 0 2

you will see how many slabs of each order you have. Counting orders from 0 as the kernel does: 9078 order 0 slabs (4 KB), 10398 order 1 slabs (8 KB) ... 2 order 8 slabs (1 MB). If a slab of lower order is needed the kernel may split a higher order one (e.g. if it wants a 4k slab it may split an 8k slab into two). Lots of kernel allocations and you may run out of the higher order slabs. What worked for me for triggering this condition was moving a lot of data to the Linux over SCP. There may be other causes.

Another way to get a memory report is to run "echo m > /proc/sysrq-trigger" and look into syslog for a report about kernel memory usage.

Now the significance of 32k is that this is where Linux stops retrying to rearrange memory to find larger slabs. I don't remember the details, but if you want to investigate look at the kernel sources, namely mm/page_alloc.c and mm/vmscan.c

So the bottom line is, anytime you have an operation that needs a large buffer in kernel (chccwdev of a NIC, vmcp with --buffer, DIAG from Linux) it may fail at unexpected times. I have not found a good way to get around this but I will be interested if you find anything.

In the case of VMCP what may help is if it allocated a buffer at kernel startup. At the moment it allocates it for every call, see http://lxr.free-electrons.com/source/drivers/s390/char/vmcp.c#L105

Tomas
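To make the columns easier to read, the counts can be labelled with their order and block size. The sketch below assumes 4 KB pages (as on s390) and numbers orders from 0 as the kernel does, so the ninth column is order 8, a 1 MB block. What this thread calls slabs are the free-block counts the kernel reports per order in /proc/buddyinfo. buddy_orders is a made-up helper name.

```shell
# Print one labelled row per order from a /proc/buddyinfo-style file.
# Assumption: 4 KB pages, so an order N block is 4 * 2^N KB of contiguous
# memory. buddy_orders is a hypothetical name used only in this sketch.
buddy_orders() {
    awk '{
        # Fields 1-4 are "Node", "0,", "zone" and the zone name;
        # the per-order free counts start at field 5.
        for (i = 5; i <= NF; i++) {
            order = i - 5
            printf "%s order=%d block_kb=%d free=%s\n", $4, order, 4 * 2 ^ order, $i
        }
    }' "${1:-/proc/buddyinfo}"
}
```

Run without arguments it reads /proc/buddyinfo. A row such as "DMA order=8 block_kb=1024 free=0" is the situation in which a request for a 1M buffer has to rely on the kernel compacting or reclaiming memory first.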
Re: How to find a memory leak?
Use Rob's cmmflush! https://zvmperf.wordpress.com/2012/07/06/using-cmm-to-flush-a-linux-guests-memory/ We use it every day in dev to keep the vm paging rate way down.

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael MacIsaac
Sent: Thursday, July 09, 2015 7:49 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] How to find a memory leak?

Thomas,

> Did you use a buffer larger than 32k on those vmcp commands?
Yes, I always use 1M (vmcpCmd="/sbin/vmcp --buffer=1M") in the event there is a lot of output from CP.

> Vmcp can fail due to memory fragmentation even on a server with lots
> of free memory.
Hmmm, interesting... could this be considered a bug? As a workaround, is there a command to flush the buffer cache?

Thanks.

-Mike M.
Re: How to find a memory leak?
Thomas,

> Did you use a buffer larger than 32k on those vmcp commands?
Yes, I always use 1M (vmcpCmd="/sbin/vmcp --buffer=1M") in the event there is a lot of output from CP.

> Vmcp can fail due to memory fragmentation even on a server with lots of free memory.
Hmmm, interesting... could this be considered a bug? As a workaround, is there a command to flush the buffer cache?

Thanks.

-Mike M.

On Thu, Jul 9, 2015 at 10:40 AM, Pavelka, Tomas wrote:
> > In the past this server has gone to near zero memory, and vmcp commands fail.
>
> Do you have any specifics? Did you use a buffer larger than 32k on those vmcp commands? Vmcp can fail due to memory fragmentation even on a server with lots of free memory.
Re: How to find a memory leak?
> In the past this server has gone to near zero memory, and vmcp commands fail.

Do you have any specifics? Did you use a buffer larger than 32k on those vmcp commands? Vmcp can fail due to memory fragmentation even on a server with lots of free memory.

Tomas Pavelka
CA Technologies
Sr Software Engineer

CA CZ, s.r.o
V Parku 12,
148 00 Praha
Czech Republic

Office: +25996 | tomas.pave...@ca.com

Id. Císlo 25694073, z obchodního rejstříku, vedeného Městským soudem v Praze, oddíl C, vložka 61808 / Id. No. 25694073, registered in the Commercial Register maintained by the Municipal Court in Praque, Section C, File 61808

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael MacIsaac
Sent: Thursday, July 09, 2015 4:15 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: How to find a memory leak?

Thanks Richard for the joke :))

Thanks Thomas for the input. I changed the ps command flag to '--sort -rss', and restarted memusage - will continue to monitor.

Thanks Dave for the pointer, but I don't have any of my own C/C++ programs running, just many bash scripts (if they do no 'malloc's, can they still cause memory leaks?).

In the past this server has gone to near zero memory, and vmcp commands fail. I'm guessing the OOM killer was invoked, but by then it's already too late ...

-Mike
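
To make the 32k question concrete, here is a rough sketch that works out which block order a given buffer size needs and whether /proc/buddyinfo currently shows a free block that large. It assumes 4 KB pages and that the buffer is taken as one contiguous kernel allocation, as described elsewhere in this thread. The helper names order_for_bytes and has_free_block are hypothetical, and orders are counted from 0 as the kernel does.

```shell
#!/bin/bash
# Smallest buddy order whose block holds the given number of bytes.
# Assumption: 4 KiB pages unless a page size is passed as second argument.
order_for_bytes() {
    local bytes=$1 page_size=${2:-4096}
    local order=0 block=$page_size
    while [ "$block" -lt "$bytes" ]; do
        block=$((block * 2))
        order=$((order + 1))
    done
    echo "$order"
}

# Exit 0 if buddyinfo-style input on stdin shows at least one free block
# of the given order or higher in any zone, exit 1 otherwise.
has_free_block() {
    awk -v want="$1" '
    $1 == "Node" {
        # The count for order N sits in field 5 + N.
        for (i = 5 + want; i <= NF; i++)
            if ($i > 0) found = 1
    }
    END { exit (found ? 0 : 1) }'
}
```

A 1M buffer works out to order 8 and a 32k buffer to order 3. A check before calling vmcp could look like: has_free_block "$(order_for_bytes 1048576)" < /proc/buddyinfo || echo "no free 1M block right now". A miss does not prove the call will fail, since the kernel may still free and rearrange memory, so treat it as a hint only.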
Re: How to find a memory leak?
Thanks Richard for the joke :))

Thanks Thomas for the input. I changed the ps command flag to '--sort -rss', and restarted memusage - will continue to monitor.

Thanks Dave for the pointer, but I don't have any of my own C/C++ programs running, just many bash scripts (if they do no 'malloc's, can they still cause memory leaks?).

In the past this server has gone to near zero memory, and vmcp commands fail. I'm guessing the OOM killer was invoked, but by then it's already too late ...

-Mike

On Thu, Jul 9, 2015 at 9:54 AM, Dave Jones wrote:
> Hi, Mike.
>
> If the package AddressSanitizer (ASan) is available, you might want to give it a go. It is a fast memory error detector that can find use-after-free and {heap,stack,global}-buffer overflow bugs in C/C++ programs. It's here:
>
> https://code.google.com/p/address-sanitizer/
>
> Good luck... I still think C/C++ will be the death of us all. :-)
>
> DJ
Re: How to find a memory leak?
Hi, Mike.

If the package AddressSanitizer (ASan) is available, you might want to give it a go. It is a fast memory error detector that can find use-after-free and {heap,stack,global}-buffer overflow bugs in C/C++ programs. It's here:

https://code.google.com/p/address-sanitizer/

Good luck... I still think C/C++ will be the death of us all. :-)

DJ

On 07/09/2015 07:50 AM, Pavelka, Tomas wrote:
> Look at the " -/+ buffers/cache" line in the free output:
>
> Before:
> -/+ buffers/cache: 41 450
> After:
> -/+ buffers/cache: 48 443
>
> (First number used, second free)
>
> Linux has various buffers and caches that are allocated if there is free memory, for example for disk reads. These are dropped if the memory is needed by processes. The " -/+ buffers/cache" line shows what memory is actually used by processes and not the buffers. In your case the used memory rose only by 7 MB.
>
> BTW I would not look at the virtual memory size of processes; this may be allocated way over the virtual memory size of your machine. The more interesting metric is RSS, which is how much memory is actually used.
>
> HTH,
> Tomas
>
> -Original Message-
> From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael MacIsaac
> Sent: Thursday, July 09, 2015 2:19 PM
> To: LINUX-390@VM.MARIST.EDU
> Subject: How to find a memory leak?
>
> Hello list,
>
> I have a SLES 11 SP3 system that is leaking memory, but I don't know how or where.
>
> I find a script on the Internet that runs forever, adapt it somewhat, and start logging some info to a temp file.
> Here's the script:
>
> # cat memusage
> #!/bin/bash
> #
> # track memory usage
> #
> outFile="/tmp/memusage"
> while true
> do
>   echo "---" >> $outFile
>   date >> $outFile
>   ps aux --sort -vsz | head -22 >> $outFile
>   echo >> $outFile
>   free -m >> $outFile
>   sleep 300
> done
>
> After a fresh reboot of a 512 MB virtual machine, I start the script and the first entry in the temp file shows about 20 MB (512 - 492) used by Linux and 97 MB used by processes:
>
> Wed Jul 8 12:37:45 EDT 2015
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root      2181  0.0  0.2 115404  1024 ?        Ssl  12:36   0:00 /usr/sbin/nscd
> root      1851  0.0  0.1  11512   692 ?        S                 /sbin/auditd -s disable
> root      2556  0.3  0.7  11456  4004 ?        Ss   12:37   0:00 sshd: root@pts/0
> root      2306  0.0  0.7  10720  3700 ?        Ss   12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2307  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2308  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2309  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2310  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2311  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> root      1853  0.0  0.1  10428   824 ?        S                 /sbin/audispd
> root       997  0.0  0.6   9036  3224 ?        Ssl  12:36   0:00 /usr/sbin/console-kit-da
> root      2265  0.0  0.5   8136  2532 ?        Ss   12:36   0:00 /usr/lib/postfix/master
> postfix   2277  0.0  0.4   8004  2372 ?        S    12:36   0:00 qmgr -l -t fifo -u
> postfix   2276  0.0  0.4   7948  2352 ?        S    12:36   0:00 pickup -l -t fifo -u
> root      2172  0.0  0.3   7916  1532 ?        Ss   12:36   0:00 /usr/sbin/sshd -o PidFi
> 101        994  0.0  0.5   7852  2804 ?        Ss   12:36   0:00 /usr/sbin/hald --daemon
> root      1869  0.0  0.8   6464  4504 ?        Ss   12:36   0:00 /sbin/haveged -w 1024 -
> root      2559  1.0  0.6   6056  3076 pts/0    Ss   12:37   0:00 -bash
> root       998  0.0  0.2   3980  1332 ?        S    12:36   0:00 hald-runner
> root      2591  0.0  0.3   3652  1604 pts/0    S+   12:37   0:00 /bin/bash /usr/local/sb
> root      2343  0.0  0.1   3508   944 ?        Ss   12:36   0:00 /usr/sbin/xinetd -pidfi
>
>                     total  used  free  shared  buffers  cached
> Mem:                  492    97   394       0        5      50
> -/+ buffers/cache:           41   450
> Swap:                 898     0   898
>
> This morning the last entry shows 156 MB used by processes: ~59 MB of memory lost in less than a day. But the 'VSZ' of the top 22 processes seems to be about the same:
>
> Thu Jul 9 07:57:47 EDT 2015
Re: How to find a memory leak?
Look at the " -/+ buffers/cache" line in the free output:

Before:
-/+ buffers/cache: 41 450
After:
-/+ buffers/cache: 48 443

(First number used, second free)

Linux has various buffers and caches that are allocated if there is free memory, for example for disk reads. These are dropped if the memory is needed by processes. The " -/+ buffers/cache" line shows what memory is actually used by processes and not the buffers. In your case the used memory rose only by 7 MB.

BTW I would not look at the virtual memory size of processes; this may be allocated way over the virtual memory size of your machine. The more interesting metric is RSS, which is how much memory is actually used.

HTH,
Tomas

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael MacIsaac
Sent: Thursday, July 09, 2015 2:19 PM
To: LINUX-390@VM.MARIST.EDU
Subject: How to find a memory leak?

Hello list,

I have a SLES 11 SP3 system that is leaking memory, but I don't know how or where.
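
The "used minus buffers and cache" figure discussed in this message can also be computed straight from /proc/meminfo, which helps where free does not print a " -/+ buffers/cache" row. This is a minimal sketch; the function name mem_without_cache is made up, and the result can differ by a megabyte or so from what free prints because of rounding.

```shell
#!/bin/bash
# Print "used_mb free_mb" with buffers and page cache counted as free,
# from /proc/meminfo-style input on stdin (values in kB).
mem_without_cache() {
    awk '
    $1 == "MemTotal:" { total = $2 }
    $1 == "MemFree:"  { free = $2 }
    $1 == "Buffers:"  { buffers = $2 }
    $1 == "Cached:"   { cached = $2 }
    END {
        used = total - free - buffers - cached
        printf "%d %d\n", used / 1024, (free + buffers + cached) / 1024
    }'
}
```

On a live system it would be called as: mem_without_cache < /proc/meminfo. Logging the first number over time is closer to "memory used by processes" than the used column of the Mem: row.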
Re: How to find a memory leak?
Spray soapy water on it and look for bubbles :)

--- mike99...@gmail.com wrote:

From: Michael MacIsaac
To: LINUX-390@VM.MARIST.EDU
Subject: How to find a memory leak?
Date: Thu, 9 Jul 2015 08:19:20 -0400

Hello list,

I have a SLES 11 SP3 system that is leaking memory, but I don't know how or where.
How to find a memory leak?
Hello list,

I have a SLES 11 SP3 system that is leaking memory, but I don't know how or where.

I find a script on the Internet that runs forever, adapt it somewhat, and start logging some info to a temp file. Here's the script:

# cat memusage
#!/bin/bash
#
# track memory usage
#
outFile="/tmp/memusage"
while true
do
  echo "---" >> $outFile
  date >> $outFile
  ps aux --sort -vsz | head -22 >> $outFile
  echo >> $outFile
  free -m >> $outFile
  sleep 300
done

After a fresh reboot of a 512 MB virtual machine, I start the script and the first entry in the temp file shows about 20 MB (512 - 492) used by Linux and 97 MB used by processes:

Wed Jul 8 12:37:45 EDT 2015
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2181  0.0  0.2 115404  1024 ?        Ssl  12:36   0:00 /usr/sbin/nscd
root      1851  0.0  0.1  11512   692 ?        S
Re: SSHD CPU spike
No, it would make it more secure. It's almost impossible to brute-force a public key, and that would be the only authentication method enabled. I would do it, but sometimes I have to ssh in from computers other than my own, so public-key-only authentication would not work for me. I always have my phone, so Google Auth works fine.

On Jul 9, 2015 8:08 AM, "Jake Anderson" wrote:
> Hi Philippe
>
> Disabling the two features won't be a security vulnerability?
>
> Jake
Re: SSHD CPU spike
Hi Philipp,

Won't disabling the two features be a security vulnerability?

Jake

On Thursday 9 July 2015, Philipp Kern wrote:
> It should be enough to set PasswordAuthentication and ChallengeResponseAuthentication to no in sshd_config and simply use public key cryptography to log in.
>
> Kind regards
> Philipp Kern
Re: SSHD CPU spike
On Wed, Jul 08, 2015 at 03:45:01PM -0300, Mauro Souza wrote:
> I have a VPS that got a continuous stream of ssh login attempts, so I set up fail2ban on it. After that, I changed SSH port from 22 to a random one. And installed portsentry. And configured PAM to use Google Authentication for SSH.
>
> Doing this, the failed logins went to zero. No more bots crawling around and bruteforcing my VPS.

It should be enough to set PasswordAuthentication and ChallengeResponseAuthentication to no in sshd_config and simply use public key cryptography to log in.

Kind regards
Philipp Kern
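
To double-check that the two settings named above really are off, a small sketch like the following can read an sshd_config-style file. It relies on sshd taking the first value it finds for each keyword and treating keywords as case-insensitive. It is deliberately simple: it assumes whitespace-separated "Keyword value" lines and ignores Match blocks and Include directives. The function name sshd_auth_settings is hypothetical.

```shell
#!/bin/bash
# Print the first value found for each authentication keyword of interest
# in the sshd_config-style file given as the first argument.
# Assumption: plain "Keyword value" lines; Match and Include are not handled.
sshd_auth_settings() {
    awk '
    /^[ \t]*#/ || NF == 0 { next }
    {
        key = tolower($1)
        if ((key == "passwordauthentication" ||
             key == "challengeresponseauthentication" ||
             key == "pubkeyauthentication") && !(key in seen)) {
            seen[key] = 1
            print key, tolower($2)
        }
    }' "$1"
}
```

Typical use would be: sshd_auth_settings /etc/ssh/sshd_config. A keyword that is not printed is not set in that file, so the built-in default applies. Running sshd -T as root prints the full effective configuration and is the more authoritative check.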