Re: How to find a memory leak? cmmflush and iib/wmb

2015-11-12 Thread Marcy Cortes
Since I've advocated using cmmflush to clean up Linux memory, I should mention 
that a recent PTF broke it, and here is the fix:

APAR VM65097: PEVM64770 LINUX CMM FLUSH DIAG X'10' SLOWDOWN 
http://www-01.ibm.com/support/docview.wss?uid=isg1VM65097&myns=apar&mynp=DOCTYPEcomponent&mync=E&cm_sp=apar-_-DOCTYPEcomponent-_-E



-Original Message-
From: Cortes, Marcy D. 
Sent: Tuesday, July 21, 2015 7:08 AM
To: LINUX-390@VM.MARIST.EDU
Subject: RE: How to find a memory leak? cmmflush and iib/wmb

Maybe?!  I checked one WMB server that has 4G.  It seems to drop roughly 
2.5G a day.  Not sure how to tell what that is from!



-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Will, 
Chris
Sent: Tuesday, July 21, 2015 7:02 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] How to find a memory leak? cmmflush and iib/wmb

Will cmmflush cause WMB or IIB to release memory that may build up over the 
week in the execution groups?

Chris Will

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Marcy 
Cortes
Sent: Thursday, July 09, 2015 12:00 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: How to find a memory leak?

Easier, but the pages aren't dropped from the zVM side immediately so if you 
are memory constrained there, cmmflush is your friend.


-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael 
MacIsaac
Sent: Thursday, July 09, 2015 8:51 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] How to find a memory leak?

Tomas,

> I forgot to answer this question: you can drop buffers and cache by running
> echo 3 > /proc/sys/vm/drop_caches

Nice, even easier. Thanks!

The next question is - can this ever be done by a non-root user? I tried adding 
/bin/echo to /etc/sudoers, but still get an error:

mike@lab153:~ $ sudo /bin/echo 3 > /proc/sys/vm/drop_caches
-bash: /proc/sys/vm/drop_caches: Permission denied
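
The error above is expected: the `>` redirection is opened by the calling, unprivileged shell before sudo even starts, so only echo runs as root. A minimal sketch of the usual workaround follows, assuming the user is permitted to run tee (or sh) through sudo. The function name write_as_root and the SUDO override variable are hypothetical, introduced only so the helper can be tried on a scratch file.

```shell
# Write a value to a file that only root may open.
# The privileged process (tee) opens the file itself, unlike
# "sudo echo 3 > file", where the unprivileged shell opens it.
# SUDO is a hypothetical override: set SUDO= to run without sudo.
write_as_root() {
    file=$1
    value=$2
    printf '%s\n' "$value" | ${SUDO-sudo} tee "$file" > /dev/null
}

# Usage on a real system (flush dirty pages first):
#   sync && write_as_root /proc/sys/vm/drop_caches 3
# Equivalent one-liner:
#   sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
```

Either way the write still needs root privileges; sudo only decides which command gets them, so the sudoers entry would have to cover tee or sh rather than /bin/echo.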



-Mike

On Thu, Jul 9, 2015 at 11:36 AM, Pavelka, Tomas wrote:

> > Thanks.  I copied and pasted cmmflush and it seems to work nicely
>
> If I understand it right then you have to look at how cmmflush affects 
> the output of /proc/buddyinfo. If you see a non-zero count in the last 
> (highest-order) column, i.e. the one for 1MB blocks, then you are good 
> to run vmcp --buffer=1M. Otherwise you may still run into problems even 
> if free -m shows a lot of free memory.
>
> But I have not tried cmmflush, maybe it will help.
>
> The way that I was able to reproduce the memory fragmentation problem 
> was by copying large amount of data over SCP to that Linux machine.
> Try that and see if you can reproduce the vmcp --buffer=1M failure.
>
> Tomas
>
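
The /proc/buddyinfo check described above can be sketched as a small filter. This assumes, as the message states for this system, that the last column of each zone line counts free 1MB blocks (that holds only for a particular page size and number of orders). The function name highest_order_free is hypothetical.

```shell
# Sum the free-block counts in the last (highest-order) column of
# buddyinfo-style input read from stdin; one line per memory zone.
# Prints 0 when no such blocks are free (or when the input is empty).
highest_order_free() {
    awk '{ sum += $NF } END { print sum + 0 }'
}

# Usage on a live system:
#   highest_order_free < /proc/buddyinfo
```

A result of 0 would suggest that a 1MB contiguous allocation is likely to fail even when free -m reports plenty of free memory.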

--
For LINUX-390 subscribe / signoff / archive access instructions, send email to 
lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit http://wiki.linuxvm.org/


The information contained in this communication is highly confidential and is 
intended solely for the use of the individual(s) to whom this communication is 
directed. If you are not the intended recipient, you are hereby notified that 
any viewing, copying, disclosure or distribution of this information is 
prohibited. Please notify the sender, by electronic mail or telephone, of any 
unintended receipt and delete the original message without making any copies.
 
 Blue Cross Blue Shield of Michigan and Blue Care Network of Michigan are 
nonprofit corporations and independent licensees of the Blue Cross and Blue 
Shield Association.


Re: How to find a memory leak? cmmflush and iib/wmb

2015-07-21 Thread Marcy Cortes
Maybe?!  I checked one WMB server that has 4G.  It seems to drop roughly 
2.5G a day.  Not sure how to tell what that is from!





Re: How to find a memory leak? cmmflush and iib/wmb

2015-07-21 Thread Will, Chris
Will cmmflush cause WMB or IIB to release memory that may build up over the 
week in the execution groups?

Chris Will



Re: How to find a memory leak?

2015-07-13 Thread Pavelka, Tomas
Let me try a different example than Mike's 'Q DASD DETAILS 0-'. Suppose you 
are writing software for disaster recovery of LVM disks. The Linux that owned 
them will not come up, so you link them from another Linux: get a list of 
minidisk addresses and their owners and issue LINK against each. LINK needs a 
local virtual address that is free. You don't know who will be using this, 
so it needs to work in any configuration. You could ask the user to provide a 
range in a config file, but it would be nicer to find the range 
automatically, so you run QUERY VIRTUAL to see which virtual addresses are 
occupied. Once you start doing this, you run QUERY VIRTUAL every time you need 
to find a free address, even when you have only a few minidisks linked.

At this point you realize that your program is unreliable: it can 
occasionally fail due to memory fragmentation, because listing all the used 
virtual addresses needs a large contiguous buffer in the kernel. What people 
wrote about reading the response of vmcp and enlarging the buffer if needed 
helps in the average case but not in the worst case. So if you are using 
vmcp in critical code, you have to be very careful about keeping buffers 
small. It would be nice if this were not necessary, i.e. if there were a 
way to run DIAG 8 without the need for a contiguous buffer.

Tomas
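
The "find a free local virtual address" step in the scenario above could look roughly like this once the used addresses are known. This is only a sketch: the function name first_free_vaddr is hypothetical, and extracting the used addresses from the QUERY VIRTUAL response is deliberately left out because it depends on the response format.

```shell
# Print the first address in [first, last] that is not listed on stdin.
# Arguments and stdin entries are hex device addresses without a 0x
# prefix; stdin holds one used address per line.
# Returns 1 (and prints nothing) if every address in the range is used.
first_free_vaddr() {
    first=$((0x$1))
    last=$((0x$2))
    # Normalise the used list to upper-case, 4-digit, space-delimited form.
    used=" $(while read -r a; do
                 [ -n "$a" ] && printf '%04X ' "$((0x$a))"
             done) "
    addr=$first
    while [ "$addr" -le "$last" ]; do
        candidate=$(printf '%04X' "$addr")
        case "$used" in
            *" $candidate "*) ;;    # taken, try the next one
            *) printf '%s\n' "$candidate"; return 0 ;;
        esac
        addr=$((addr + 1))
    done
    return 1
}
```

The point of the message still stands: this helper is only as reliable as the vmcp call that produces its input.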


Re: How to find a memory leak?

2015-07-13 Thread Alan Altmark
On Monday, 07/13/2015 at 11:12 EDT, Michael MacIsaac wrote:
> I thought you also pointed out that CMS found a way to work around it (or
> maybe that was Chuckie).  I thought, perhaps naively, that Linux may be
> able to work around it also.

Sorry, Mike.  I thought you were still talking about the "large memory"
problem.  There are two different (and independent) issues:
1. DIAGNOSE 8 requires contiguous memory.  In a memory-constrained or
highly-fragmented environment, Linux may not be able to allocate memory of
the needed size.
2. vmcp doesn't perform automatic buffer allocation for query-type
functions the same way as CMS.  Not fatal, but annoying, given the history
of CMS.

Alan Altmark

Senior Managing z/VM and Linux Consultant
Lab Services System z Delivery Practice
IBM Systems & Technology Group
ibm.com/systems/services/labservices
office: 607.429.3323
mobile; 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: How to find a memory leak?

2015-07-13 Thread Marcy Cortes
Quick test on devices that actually exist shows that 9 devices seem to fit in 
the default 8192-byte buffer.
So you may need many more calls, depending on what size buffer you decide you 
can use.

What one device looks like, FWIW (880 bytes):

D008  CUTYPE = 2107-E8, DEVTYPE = 3390-0A, VOLSER = VDL439, CYLS = 3339
  CACHE DETAILS:  CACHE NVS CFW DFW PINNED CONCOPY
   -SUBSYSTEM       Y    Y   N   -    N       N
   -DEVICE          Y    -   -   Y    N       N
  DEVICE DETAILS: CCA = 08, DDC = --, DED = NO, SSD = YES, SS = 0
  DUPLEX DETAILS: --

  HYPERPAV DETAILS: BASE VOLUME IN POOL 8
  PPRC DETAILS: PRIMARY VOLUME
  CU DETAILS: SSID = D000, CUNUM = D000
  FENCED STATE: NONE
  HOST ACCESS: CPU  PARTITION GROUPED RESERVED MPATHMODE
   82C70EY   N Y
   784A04Y   N Y
   784A06Y   N Y
   82C70FY   N Y
   82C703Y   N Y
   82C708Y   N N



Q DASD DETAILS - on a test system reports that it would need over 4MB.  
Since 1MB is the biggest buffer you can ask for, splitting it up is going to 
have to happen anyway if you intend to run this at larger shops.
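
The arithmetic behind those numbers, as a rough sketch. It assumes the 880 bytes per device measured above, which will vary with the number of HOST ACCESS rows and other details; the function names devices_per_buffer and calls_needed are hypothetical.

```shell
# Whole device entries that fit in one response buffer.
# bytes_per_device defaults to the 880 bytes measured above.
devices_per_buffer() {
    buffer_bytes=$1
    bytes_per_device=${2:-880}
    echo $((buffer_bytes / bytes_per_device))
}

# Number of calls needed to cover a given number of devices,
# rounded up to a whole call.
calls_needed() {
    devices=$1
    per_call=$2
    echo $(( (devices + per_call - 1) / per_call ))
}
```

With the default 8192-byte buffer this gives 9 devices per call, matching the quick test; a 1MB buffer (1048576 bytes) would hold about 1191 such entries.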





Re: How to find a memory leak?

2015-07-13 Thread Michael MacIsaac
Alan,

> Mike, I already posted that the vmcp problem is caused by the DIAGNOSE
> 0x08 requirement for contiguous memory.

Thanks.

I thought you also pointed out that CMS found a way to work around it (or
maybe that was Chuckie).  I thought, perhaps naively, that Linux may be
able to work around it also.

Actually, I don't really want the output of 'Q DASD DETAILS 0-',
rather, it was the best test case I could think of to generate a lot of
output.  What I do want is to be able to get the output of any CP command
the user wants, and display it.

-Mike M.



Re: How to find a memory leak?

2015-07-13 Thread Christian Borntraeger
Am 10.07.2015 um 14:18 schrieb Bruce Hayden:
> The message sent to stderr is not documented in the device drivers book.
> It tells you about the response code of 2, but the description of that
> response code doesn't say anything about the error message to stderr or
> that the message tells you how long the output was.

I will have a look to see if I can add something to the man page and the book.
At least the two messages

Error: non-zero CP response for command '': #"
and
Error: output ( bytes) was truncated, try --buffer to increase size

should be documented, yes.
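
Until that is documented, a script can pick the byte count out of the truncation message itself. The sketch below assumes the message has the shape "Error: output (NNN bytes) was truncated, ..." with the count in parentheses; the archive has stripped the placeholder above, so that shape is an assumption, as is the function name truncated_size.

```shell
# Extract the byte count from a vmcp truncation message read on stdin.
# Assumed message shape:
#   Error: output (NNN bytes) was truncated, try --buffer to increase size
# Prints nothing if no such message is present.
truncated_size() {
    sed -n 's/.*output (\([0-9][0-9]*\) bytes) was truncated.*/\1/p'
}

# Possible use, for query-type commands only (the command runs twice):
#   size=$(vmcp q dasd details 0- 2>&1 >/dev/null | truncated_size)
#   [ -n "$size" ] && vmcp --buffer="$size" q dasd details 0-
```

As discussed elsewhere in the thread, the retry can still fail when the requested buffer exceeds the 1MB limit or when contiguous memory of that size is not available.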



Re: How to find a memory leak?

2015-07-13 Thread Christian Borntraeger
Am 13.07.2015 um 16:42 schrieb Michael MacIsaac:
> Any chance the requirement for contiguous memory in vmcp could be relaxed?

That would require a change in the diagnose definition of z/VM. diag 8 does
require the buffer contiguous in guest real storage.

> In my test case, 'Q DA DETAILS 0-' requires ~2.5 MB and, as I
> understand it, that output can simply not be obtained through vmcp.

Splitting this up is probably the easiest workaround.

Christian



Re: How to find a memory leak?

2015-07-13 Thread Pavelka, Tomas
> the qeth driver has been improved in 2014 to reduce its demand for contiguous 
> storage:

Thanks Ursula, the memories keep coming back ;-) I remembered that the vmcp is 
still problematic but managed to forget that the qeth driver got fixed.


Re: How to find a memory leak?

2015-07-13 Thread Alan Altmark
On Monday, 07/13/2015 at 10:43 EDT, Michael MacIsaac wrote:
> Any chance the requirement for contiguous memory in vmcp could be relaxed?
> In my test case, 'Q DA DETAILS 0-' requires ~2.5 MB and, as I
> understand it, that output can simply not be obtained through vmcp.

Mike, I already posted that the vmcp problem is caused by the DIAGNOSE
0x08 requirement for contiguous memory.

Alan Altmark



Re: How to find a memory leak?

2015-07-13 Thread Marcy Cortes
How about dividing it up in a little loop?
Issue Q D DETAILS 000-FFF, Q DA DETAILS 1000-1FFF
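
A sketch of such a loop, generating one command per chunk of the device range. The function name dasd_detail_commands is hypothetical, and the chunk size has to be chosen so that each response fits the buffer you are willing to allocate.

```shell
# Print one "QUERY DASD DETAILS" command per chunk of the device range.
# Arguments: first address, last address, chunk size (all hex, no 0x).
dasd_detail_commands() {
    addr=$((0x$1))
    last=$((0x$2))
    step=$((0x$3))
    while [ "$addr" -le "$last" ]; do
        upper=$((addr + step - 1))
        # Clamp the final chunk to the end of the requested range.
        [ "$upper" -gt "$last" ] && upper=$last
        printf 'QUERY DASD DETAILS %04X-%04X\n' "$addr" "$upper"
        addr=$((upper + 1))
    done
}

# Possible use (assumes vmcp accepts the command as a single argument):
#   dasd_detail_commands 0 FFFF 400 |
#       while read -r cmd; do vmcp --buffer=1M "$cmd"; done
```

A chunk of 400 hex (1024 devices) is only a guess at what fits in 1MB; it depends on how many of those addresses exist and how large each entry is.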





Re: How to find a memory leak?

2015-07-13 Thread Michael MacIsaac
Ursula,

Any chance the requirement for contiguous memory in vmcp could be relaxed?
In my test case, 'Q DA DETAILS 0-' requires ~2.5 MB and, as I
understand it, that output can simply not be obtained through vmcp.

Thanks.

-Mike MacIsaac



Re: How to find a memory leak?

2015-07-13 Thread Ursula Braun
Tomas,

the qeth driver has been improved in 2014 to reduce its demand for
contiguous storage:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/s390/net/qeth_core.h?id=d445a4e28c0ff740e946ae22860be85428814c39

Regards, Ursula Braun, IBM Germany, Linux on System z Dev.



Re: How to find a memory leak?

2015-07-13 Thread Alan Altmark
On Monday, 07/13/2015 at 09:40 EDT, "Pavelka, Tomas" wrote:
> > I wouldn't really put that at the feet of s390 (z/Architecture).
>
> Bad wording on my part. When I said s390 I meant the s390 part of the Linux
> kernel implementation, not the entire architecture. I meant to point out that
> the other parts of the kernel are working on getting out of the requirement of
> large contiguous buffers. AFAIK the vmcp driver uses the largest buffer and as
> you say, if the diag allowed to return discontiguous memory then it would solve
> the fragmentation problem. There are few other places that used larger buffers,
> NIC driver was one of them. So not sure if "all over" was the good wording
> either, maybe I should have said "multiple places" ;-)

The QDIO architecture doesn't require contiguous page frames.

Alan Altmark



Re: How to find a memory leak?

2015-07-13 Thread Pavelka, Tomas
> I wouldn't really put that at the feet of s390 (z/Architecture).

Bad wording on my part. When I said s390 I meant the s390 part of the Linux 
kernel implementation, not the entire architecture. I meant to point out that 
other parts of the kernel are working on getting away from the requirement for 
large contiguous buffers. AFAIK the vmcp driver uses the largest buffer and, as 
you say, if the diag were allowed to return discontiguous memory it would solve 
the fragmentation problem. There are a few other places that used larger 
buffers; the NIC driver was one of them. So I'm not sure "all over" was good 
wording either, maybe I should have said "multiple places" ;-)

Tomas



Re: How to find a memory leak?

2015-07-13 Thread Alan Altmark
On Friday, 07/10/2015 at 02:09 EDT, "Pavelka, Tomas" wrote:
> Where the s390 is different is that it uses large contiguous buffers all over.

I wouldn't really put that at the feet of s390 (z/Architecture).  As you
noted, the DIAGNOSE 0x08 problem is because CP requires the buffer to be
contiguous guest-absolute memory.  It has never been enhanced to use
scatter/gather the way other DIAGNOSE instructions have.  (And the way
most of the architecture operates.)

Such an enhancement would let vmcp request logically contiguous memory that
is backed by discontiguous page frames.

Alan Altmark



Re: How to find a memory leak?

2015-07-10 Thread Alan Altmark
On Friday, 07/10/2015 at 03:03 EDT, Rick Troth  wrote:
> Same thing *has* been said of VM ... by an IBMer ... to me ... as he
> deflected a legit inquiry/requirement.

I certainly hope that's ancient history.  If not, feel free to contact me
offline and maybe we can figure out what went wrong and how future
occurrences can be prevented.

About the only question for which such an answer might be appropriate in
z/VM is one that begins with "I want to write a modification to CP."

Alan Altmark

Senior Managing z/VM and Linux Consultant
Lab Services System z Delivery Practice
IBM Systems & Technology Group
ibm.com/systems/services/labservices
office: 607.429.3323
mobile; 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: How to find a memory leak?

2015-07-10 Thread Rick Troth
Same thing *has* been said of VM ... by an IBMer ... to me ... as he
deflected a legit inquiry/requirement.


On 07/10/2015 11:59 AM, Stewart, Lee wrote:
> The same could be said of VM...   Well, most of it
>
> Lee Stewart ● VM System Support ● Visa ● Phone:  6(750)4601 - +1-303-389-4601 
> ● lstew...@visa.com
>
> -Original Message-
> From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Alan 
> Altmark
> Sent: Friday, July 10, 2015 7:23 AM
> To: LINUX-390@VM.MARIST.EDU
> Subject: Re: How to find a memory leak?
>
> This is Linux.  You have source code.  Who needs documentation?
>



Re: How to find a memory leak?

2015-07-10 Thread Stewart, Lee
The same could be said of VM...   Well, most of it

Lee Stewart ● VM System Support ● Visa ● Phone:  6(750)4601 - +1-303-389-4601 ● 
lstew...@visa.com

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Alan 
Altmark
Sent: Friday, July 10, 2015 7:23 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: How to find a memory leak?

This is Linux.  You have source code.  Who needs documentation?  



Re: How to find a memory leak?

2015-07-10 Thread Alan Altmark
On Friday, 07/10/2015 at 08:20 EDT, Bruce Hayden 
wrote:
> The message sent to stderr is not documented in the device drivers book.
> It tells you about the response code of 2, but the description of that
> response code doesn't say anything about the error message to stderr or
> that the message tells you how long the output was.
>
> For use in scripts, it would be nice if there was a way to get the output
> length without needing to parse the text of an error message!

This is Linux.  You have source code.  Who needs documentation?  And why
not change vmcp to do the
auto-retry-with-a-bigger-buffer-if-query-thing-you-were-talking-about that
CMS does?   Bozo says that there are other commands such as COMMANDS and
CPTYPE that can be re-issued without side effect.   He also said something
about CP FOR, but I wasn't listening by that point.

Well, I'm off to see if I find anyone to tell me more about leap seconds.
Or take a nap.  Oooh!  I can do both at the same time!

-- C



Re: How to find a memory leak?

2015-07-10 Thread Michael MacIsaac
OK, y'all convinced me that always using a 1M buffer for vmcp is not the
best idea. I appreciate all the feedback.

Here's my first take at a bash function to retry a vmcp command with a few
boundary test cases:

# cat testvmcp
#!/bin/bash

function zVerbose
 {
  echo "$@"
 }

#+--+
function zCPcmd
# invoke a CP command with the vmcp module/command
# Args 1-n: the command to issue
# Return:   the CP return code (not the vmcp rc)
#+--+
 {
  : SOURCE: ${BASH_SOURCE}
  : STACK:  ${FUNCNAME[@]}

  local CPrc=0                          # assume CP command succeeds
  local CPout                           # CP output

  zVerbose "Invoking: $vmcpCmd $*"
  CPout=`$vmcpCmd "$@" 2>&1`            # run the CP command
  local rc=$?
  if [ "$rc" = 2 ]; then                # output buffer overflow
    # the required size is the number in parentheses in the vmcp error message
    local bytes=`echo $CPout | awk -F'(' '{print $2}' | awk '{print $1}'`
    if [[ "$bytes" -gt 1048576 ]]; then # output too large
      zVerbose "Error: output of $bytes bytes larger than 1 MB"
      return 2
    fi
    zVerbose "Warning: increasing vmcp buffer size to $bytes bytes and trying again"
    CPout=`$vmcpCmd --buffer=$bytes "$@" 2>&1`
    local rc2=$?
    if [ $rc2 != 0 ]; then              # capture the CP return code after "#"
      CPrc=`echo "$CPout" | grep "Error: non-zero CP" | awk -F# '{print $2}'`
    fi
  elif [ $rc != 0 ]; then               # capture the CP return code after "#"
    CPrc=`echo "$CPout" | grep "Error: non-zero CP" | awk -F# '{print $2}'`
  fi
  zVerbose "CP output:"                 # show the output in verbose mode
  zVerbose "$CPout"
  zVerbose "CP return code: $CPrc"      # show the CP return code in verbose mode
  return $CPrc                          # return code from CP
 } # zCPcmd()

# main
vmcpCmd="/sbin/vmcp"

zCPcmd "Q DASD DETAILS -07FF"
echo
zCPcmd "Q DASD DETAILS -"
echo
zCPcmd "NOT A CP COMMNAD"

Here's a test of running it:

# testvmcp
Invoking: /sbin/vmcp Q DASD DETAILS -07FF
Warning: increasing vmcp buffer size to 109568 bytes and trying again
CP output:
HCPQDD040E Device  does not exist
HCPQDD040E Device 0001 does not exist
...
HCPQDD040E Device 07FF does not exist
Error: non-zero CP response for command 'Q DASD DETAILS -07FF': #40
CP return code: 40

Invoking: /sbin/vmcp Q DASD DETAILS -
Error: output of 2577330 bytes larger than 1 MB

Invoking: /sbin/vmcp NOT A CP COMMNAD
CP output:
Error: non-zero CP response for command 'NOT A CP COMMNAD': #1
CP return code: 1

-Mike M



Re: How to find a memory leak?

2015-07-10 Thread Bruce Hayden
The message sent to stderr is not documented in the device drivers book.
It tells you about the response code of 2, but the description of that
response code doesn't say anything about the error message to stderr or
that the message tells you how long the output was.

For use in scripts, it would be nice if there was a way to get the output
length without needing to parse the text of an error message!

On Fri, Jul 10, 2015 at 7:47 AM, Christian Borntraeger <
borntrae...@de.ibm.com> wrote:

> Am 10.07.2015 um 13:19 schrieb Christian Borntraeger:
> > Another thing: 1M is quite large (the largest contiguous memory that
> > Linux wants to allocate as slab). So try not to use that unless you need
> > it. If you want to know how much memory is needed, vmcp gives you back
> > the result of the diagnose:
> >
> > # vmcp q dasd all
> > [...]
> > Error: output (74764 bytes) was truncated, try --buffer to increase size
>
> For those who want to trigger some action in that case:
> The output from CP goes to stdout and the error message to stderr.
> The return value of vmcp is 2 instead of 0 in that case. So something
> if return == 2 then parse stderr and retry could work out.
>
> Christian
>



--
Bruce Hayden
z/VM and Linux on z Systems ATS
IBM, Endicott, NY



Re: How to find a memory leak?

2015-07-10 Thread Christian Borntraeger
Am 10.07.2015 um 13:19 schrieb Christian Borntraeger:
> Another thing: 1M is quite large (the largest contiguous memory that Linux
> wants to allocate as slab). So try not to use that unless you need it. If
> you want to know how much memory is needed, vmcp gives you back the result
> of the diagnose:
>
> # vmcp q dasd all
> [...]
> Error: output (74764 bytes) was truncated, try --buffer to increase size

For those who want to trigger some action in that case:
The output from CP goes to stdout and the error message to stderr.
The return value of vmcp is 2 instead of 0 in that case. So something like
"if return == 2, then parse stderr and retry" could work out.
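
A minimal sketch of that retry idea, in shell. The helper names
parse_truncated_bytes and run_vmcp_with_retry and the VMCP override variable
are made up for illustration; the message format is the one shown in the vmcp
output quoted above, and other s390-tools versions may word it differently.
The retry re-issues the CP command, so it is only appropriate for commands
without side effects, such as queries.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical, not part of s390-tools.

# Extract the byte count from a message such as
#   Error: output (74764 bytes) was truncated, try --buffer to increase size
# Prints nothing if no line of the input matches.
parse_truncated_bytes() {
  printf '%s\n' "$1" |
    sed -n 's/^Error: output (\([0-9][0-9]*\) bytes) was truncated.*/\1/p'
}

# Run vmcp once with stdout and stderr kept apart. If it exits with 2,
# run it again with the buffer size named in the stderr message.
run_vmcp_with_retry() {
  vmcp=${VMCP:-/sbin/vmcp}        # VMCP is an assumed override for testing
  errfile=$(mktemp) || return 1
  out=$("$vmcp" "$@" 2>"$errfile")
  rc=$?
  if [ "$rc" -eq 2 ]; then
    bytes=$(parse_truncated_bytes "$(cat "$errfile")")
    if [ -n "$bytes" ]; then
      out=$("$vmcp" --buffer="$bytes" "$@" 2>"$errfile")
      rc=$?
    fi
  fi
  printf '%s\n' "$out"            # CP output on stdout
  cat "$errfile" >&2              # any remaining error text on stderr
  rm -f "$errfile"
  return "$rc"
}
```

Usage would be something like "run_vmcp_with_retry q dasd all".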

Christian



Re: How to find a memory leak?

2015-07-10 Thread Pavelka, Tomas
> Only admins would have access to those sudo commands.

But the sudoers line shows an intent to restrict access to tee only:

%zoom ALL=NOPASSWD:/usr/bin/tee

The hole that Karsten has shown is that the line in sudoers is really the 
security equivalent of:

%zoom ALL=NOPASSWD:ALL

Whether it is a huge hole depends on whether you would be ok with allowing all 
users in %zoom to be able to run any command through sudo without a password.
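
One way to narrow the rule, sketched under assumptions: put the single
operation into a small root-owned wrapper and grant sudo for that wrapper
only. The path /usr/local/sbin/dropcaches and the function name
write_drop_caches are hypothetical. Because the target file is fixed inside
the script, the caller cannot redirect the write to /etc/passwd the way they
can with tee.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. installed as /usr/local/sbin/dropcaches
# (owned by root, not writable by anyone else). The sudoers line would be:
#   %zoom ALL=NOPASSWD:/usr/local/sbin/dropcaches

# Write a drop_caches mode to the given file. Only 1, 2 or 3 are accepted:
# 1 = page cache, 2 = dentries and inodes, 3 = both.
write_drop_caches() {
  mode=$1
  target=$2
  case "$mode" in
    1|2|3) ;;
    *) echo "usage: dropcaches 1|2|3" >&2; return 64 ;;
  esac
  sync                       # write out dirty pages first so more can be freed
  echo "$mode" > "$target"
}

# In the installed script the target is hard-coded, never caller-supplied:
#   write_drop_caches "${1:-3}" /proc/sys/vm/drop_caches
```

A member of %zoom would then run "sudo /usr/local/sbin/dropcaches 3", and
sudo still logs who did it.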

Tomas


Re: How to find a memory leak?

2015-07-10 Thread Christian Borntraeger
Am 09.07.2015 um 19:16 schrieb Michael MacIsaac:
> I'm going to stop here for now.  I've learned a lot about Linux memory from
> this thread (but that's easy when you don't know much to begin with :)).
>
> I guess a question to the Linux developers in Germany would be:
>
> If vmcp is called with a buffer of 1M and the last slab in /proc/buddyinfo
> is 0, would it not be reasonable to nudge the kernel to free at least one
> slot up, assuming this can be done safely?

That does not help. The kernel frees up memory when needed (or when below a
watermark). So the AS-IS state does  not tell you anything. Now: newer kernels
(those that offer transparent hugepages) do have memory compaction which tries
to reorganize memory on pressure, so this case should be less of an issue.

Another thing: 1M is quite large (the largest contiguous memory that Linux
wants to allocate as slab). So try not to use that unless you need it. If you
want to know how much memory is needed, vmcp gives you back the result of the
diagnose:

# vmcp q dasd all
[...]
Error: output (74764 bytes) was truncated, try --buffer to increase size

# vmcp --buffer 75000 q dasd all
[...]
#

Christian



Re: How to find a memory leak?

2015-07-10 Thread Michael MacIsaac
> That's a huge security hole, btw. Don't do that!
This is an attempt to be more security-minded, not less. Only admins would
have access to those sudo commands.

For example, if I am root and issue the command:

echo "newroot:x:0:0:root:/root:/bin/bash" >> /etc/passwd

then who added the newroot user?  root did, but there is no audit trail.
If I am a trusted admin and I log in with my own user ID, and must use sudo
in the command, then there is an audit trail.

-Mike



On Fri, Jul 10, 2015 at 4:02 AM, Karsten Hopp  wrote:

> That's a huge security hole, btw. Don't do that!
>
> echo "newroot:x:0:0:root:/root:/bin/bash" | sudo /usr/bin/tee -a /etc/passwd
>
> a similar command for /etc/shadow and you've gained root.
>
>
>
> Am 09.07.2015 um 18:06 schrieb Michael MacIsaac:
>
>> Let me answer my own question.  Perhaps kludgy, but by adding 'tee' to
>> sudo, this technique works:
>>
>> root@lab141:~ # visudo
>> root@lab141:~ # tail -1 /etc/sudoers
>> %zoom ALL=NOPASSWD:/usr/bin/tee
>> root@lab141:~ # su - mike
>> mike@lab141:~ # free -m
>>              total       used       free     shared    buffers     cached
>> Mem:           491        473         18          0        111        170
>> -/+ buffers/cache:        190        300
>> Swap:          512          0        511
>> mike@lab141:~ # echo 3 | sudo /usr/bin/tee /proc/sys/vm/drop_caches > /dev/null
>> mike@lab141:~ # free -m
>>              total       used       free     shared    buffers     cached
>> Mem:           491        103        388          0          1         12
>> -/+ buffers/cache:         89        401
>> Swap:          512          0        511
>>
>>
>>


Re: How to find a memory leak?

2015-07-10 Thread Karsten Hopp

That's a huge security hole, btw. Don't do that!

echo "newroot:x:0:0:root:/root:/bin/bash" | sudo /usr/bin/tee -a /etc/passwd

a similar command for /etc/shadow and you've gained root.


Am 09.07.2015 um 18:06 schrieb Michael MacIsaac:

Let me answer my own question.  Perhaps kludgy, but by adding 'tee' to
sudo, this technique works:

root@lab141:~ # visudo
root@lab141:~ # tail -1 /etc/sudoers
%zoom ALL=NOPASSWD:/usr/bin/tee
root@lab141:~ # su - mike
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        473         18          0        111        170
-/+ buffers/cache:        190        300
Swap:          512          0        511
mike@lab141:~ # echo 3 | sudo /usr/bin/tee /proc/sys/vm/drop_caches > /dev/null
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        103        388          0          1         12
-/+ buffers/cache:         89        401
Swap:          512          0        511





Re: How to find a memory leak?

2015-07-09 Thread Pavelka, Tomas
I replied to Mike and Alan yesterday evening but it does not show in the 
archives. I am assuming it got lost and I am resending. Sorry if this is a 
duplicate.

> If vmcp is called with a buffer of 1M and the last slab in 
> /proc/buddyinfo is 0, would it not be reasonable to nudge 
> the kernel to free at least one slot up, assuming this can be done safely?

> So there's no point in nudging the kernel to do a Hail Mary attempt to
> find more memory.  If it were available, the slab count would already be > 0.

As I understand it from the time I was researching this, /proc/buddyinfo
shows the current state of the slab cache. Since the kernel uses a large
amount of memory for caches and buffers, and these are ready to be freed when
needed, a zero slab count does not necessarily mean that a call needing that
slab will fail. The kernel does several rounds of freeing and rearranging
memory to find or construct a suitable slab.

I looked at this in kernel 2.6 and it may have changed, but there the
algorithm was different for slabs smaller than 32k: for those it tried even
harder to free memory. I also remember there was some time limit on the
freeing; if the kernel did not free the memory in time, the allocation failed.

So a vmcp failure happens when there are zero free slabs and the kernel fails
to free enough contiguous memory. I guess you can end up freeing a lot and
still have enough fragmentation not to be able to find a large slab.

Where the s390 is different is that it uses large contiguous buffers all over.
The rest of Linux tries to use smaller or discontiguous buffers, which may be
why the kernel mainline is not bothered by problems with reclaiming larger
slabs. So the question for the VM/zLinux devs could be whether the diag that
allows Linux to make CP calls could be changed to return partial data or do
something else in order to not use a large buffer.

Tomas



Re: How to find a memory leak?

2015-07-09 Thread Thomas Anderson
Perhaps double check if /bin/echo is a link to /usr/bin/echo on your system, in 
which case try updating your sudoers line to point to /usr/bin/echo instead of 
/bin/echo ?


Tom Anderson
Ex ignorantia ad sapientiam
e tenebris ad lucem!

> On Jul 9, 2015, at 8:51 AM, Michael MacIsaac  wrote:
> 
> Tomas,
> 
>> I forgot to answer this question: you can drop buffers and cache by
>> running
>> echo 3 > /proc/sys/vm/drop_caches
> 
> Nice, even easier. Thanks!
> 
> The next question is - can this ever be done by a non-root user? I tried
> adding /bin/echo to /etc/sudoers, but still get an error:
> 
> mike@lab153:~ $ sudo /bin/echo 3 > /proc/sys/vm/drop_caches
> -bash: /proc/sys/vm/drop_caches: Permission denied
> 
> 
> 
>-Mike
> 
> On Thu, Jul 9, 2015 at 11:36 AM, Pavelka, Tomas 
> wrote:
> 
>>> Thanks.  I copied and pasted cmmflush and it seems to work nicely
>> 
>> If I understand it right then you have to look at how cmmflush affects the
>> output of /proc/buddyinfo. If you see non-zero in the last order of slab
>> (i.e. the one with 1MB size) then you are good to run vmcp --buffer=1M.
>> Otherwise you may still run into problems even if free -m shows a lot of
>> free memory.
>> 
>> But I have not tried cmmflush, maybe it will help.
>> 
>> The way that I was able to reproduce the memory fragmentation problem was
>> by copying large amount of data over SCP to that Linux machine. Try that
>> and see if you can reproduce the vmcp --buffer=1M failure.
>> 
>> Tomas
>> 
> 


Re: How to find a memory leak?

2015-07-09 Thread Malcolm Beattie
Alan Altmark writes:
> On Thursday, 07/09/2015 at 04:25 EDT, Mark Post  wrote:
> > > The next question is - can this ever be done by a non-root user? I
> > > tried
> >
> > No.
> > # ls -l /proc/sys/vm/drop_caches
> > -rw-r--r-- 1 root root 0 Jul  9 16:23 /proc/sys/vm/drop_caches
>
> Thank heavens!   That's all we need -- unprivileged users messing with the
> cache

Even unprivileged programs have limited and controlled access to
influencing the caching behaviour for files that they deal with,
whether via read/write or mapped into memory. There are the POSIXy
interfaces:
  madvise(..., MADV_RANDOM) and posix_fadvise(..., POSIX_FADV_RANDOM)
  madvise(..., MADV_SEQUENTIAL) and posix_fadvise(..., POSIX_FADV_SEQUENTIAL)
Similarly WILLNEED, DONTNEED and a few extras like:
  fsync(...)
  fdatasync(...)
and one or two where the APIs or functionality aren't as standardised
or common like readahead(...).

Linux has "per-open-file" tracking of readahead window information and
per-page marks in the page cache itself and does a good job of deducing
the right amount of sync/async readahead based on access pattern and
memory pressure in most common cases. However, it's nice to be able to
give it a hint or two (e.g. "I'm going to stream through this file once
and then won't need it again") while continuing to use the usual simple
file APIs without having to mess around reinventing your own buffering
or fiddle around with separate threads, async I/Os or separate access
methods (or equivalent) in O/Ses where caching is all-or-nothing or
privileged-control-only.
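
From the shell, the DONTNEED hint can be tried without writing any C,
assuming GNU dd from a reasonably recent coreutils: its "nocache" flag asks
the kernel to discard cached pages for the file it opens. The sketch below is
illustrative only; drop_file_cache is a made-up helper name, and other dd
implementations may not know the flag.

```shell
#!/bin/sh
# Sketch: ask the kernel to drop the cached pages of ONE file. No root is
# needed, only read access to the file. Relies on GNU dd's "nocache" flag.
# drop_file_cache is a hypothetical helper name.
drop_file_cache() {
  # count=0 copies no data; dd just opens the file and issues the advice.
  dd if="$1" iflag=nocache count=0 2>/dev/null
}
```

For example, "drop_file_cache /var/tmp/bigfile" after streaming through a
large file once. This is a hint, not a command: pages that are still dirty or
mapped by another process may stay in the cache.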

--Malcolm

--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group



Re: How to find a memory leak?

2015-07-09 Thread Alan Altmark
On Thursday, 07/09/2015 at 04:25 EDT, Mark Post  wrote:
> > The next question is - can this ever be done by a non-root user? I
> > tried
>
> No.
> # ls -l /proc/sys/vm/drop_caches
> -rw-r--r-- 1 root root 0 Jul  9 16:23 /proc/sys/vm/drop_caches

Thank heavens!   That's all we need -- unprivileged users messing with the
cache

Alan Altmark

Senior Managing z/VM and Linux Consultant
Lab Services System z Delivery Practice
IBM Systems & Technology Group
ibm.com/systems/services/labservices
office: 607.429.3323
mobile; 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: How to find a memory leak?

2015-07-09 Thread Mark Post
>>> On 7/9/2015 at 11:51 AM, Michael MacIsaac  wrote: 
> Tomas,
> 
>> I forgot to answer this question: you can drop buffers and cache by
>> running
>> echo 3 > /proc/sys/vm/drop_caches
> 
> Nice, even easier. Thanks!
> 
> The next question is - can this ever be done by a non-root user? I tried

No.
# ls -l /proc/sys/vm/drop_caches
-rw-r--r-- 1 root root 0 Jul  9 16:23 /proc/sys/vm/drop_caches


Mark Post



Re: How to find a memory leak?

2015-07-09 Thread Alan Altmark
On Thursday, 07/09/2015 at 01:16 EDT, Michael MacIsaac
 wrote:
> I'm going to stop here for now.  I've learned a lot about Linux memory
> from this thread (but that's easy when you don't know much to begin with :)).
>
> I guess a question to the Linux developers in Germany would be:
>
> If vmcp is called with a buffer of 1M and the last slab in /proc/buddyinfo
> is 0, would it not be reasonable to nudge the kernel to free at least one
> slot up, assuming this can be done safely?

My 0.02 USD:

CP has similar issues for I/O and V-SIE.  Slab creation (coalescing
adjacent page frames into larger slabs) is a function that is intended to
ensure the available count for each slab is > 0.  The ideal time to create
a larger slab is when memory is being released.  The only way to get
larger slabs is to force more memory to be released.   This is why the
cache controls discussed here are important - they keep as much memory
released as advisable.

So there's no point in nudging the kernel to do a Hail Mary attempt to
find more memory.  If it were available, the slab count would already
be > 0.

Alan Altmark

Senior Managing z/VM and Linux Consultant
Lab Services System z Delivery Practice
IBM Systems & Technology Group
ibm.com/systems/services/labservices
office: 607.429.3323
mobile; 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: How to find a memory leak?

2015-07-09 Thread Bruce Hayden
If the diag 8 response is truncated, the response from CP sets condition
code 1 and returns how many bytes of the output would not fit in the
buffer.  If this information was somehow returned by the vmcp command, then
you'd know how much bigger your response buffer should be, and then reissue
the command with the correct buffer size.

While that doesn't fix the problem of vmcp not being able to obtain a
buffer, it would help avoid it by not needing a very large buffer for many
commands.

Pipelines in CMS automatically obtains a larger buffer for CP QUERY
commands, because there are no side effects from issuing a query more than
once.  If the command is not a query, the number of bytes that didn't fit
the buffer can be returned to the program, so that the command can be
issued again with a larger buffer.
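
The arithmetic behind that is just "buffer used plus bytes that did not fit".
A small sketch follows; next_buffer_size is a hypothetical helper, and the
1048576-byte ceiling is the 1 MB limit used in the test script posted
elsewhere in this thread.

```shell
#!/bin/sh
# Sketch of the arithmetic only; next_buffer_size is a hypothetical helper.
# $1 = size of the buffer that was too small
# $2 = number of bytes that did not fit
# Prints the size needed and returns non-zero if that exceeds 1 MB.
next_buffer_size() {
  need=$(( $1 + $2 ))
  echo "$need"
  [ "$need" -le 1048576 ]
}
```

With made-up numbers: an 8192-byte buffer that came up 66572 bytes short
would need 74764 bytes on the second try.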

On Thu, Jul 9, 2015 at 1:16 PM, Michael MacIsaac 
wrote:

> I'm going to stop here for now.  I've learned a lot about Linux memory from
> this thread (but that's easy when you don't know much to begin with :)).
>
> I guess a question to the Linux developers in Germany would be:
>
> If vmcp is called with a buffer of 1M and the last slab in /proc/buddyinfo
> is 0, would it not be reasonable to nudge the kernel to free at least one
> slot up, assuming this can be done safely?
>
> Thanks.
>
> -Mike
>
>

--
Bruce Hayden
z/VM and Linux on z Systems ATS
IBM, Endicott, NY



Re: How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
I'm going to stop here for now.  I've learned a lot about Linux memory from
this thread (but that's easy when you don't know much to begin with :)).

I guess a question to the Linux developers in Germany would be:

If vmcp is called with a buffer of 1M and the last slab in /proc/buddyinfo
is 0, would it not be reasonable to nudge the kernel to free at least one
slot up, assuming this can be done safely?

Thanks.

-Mike

On Thu, Jul 9, 2015 at 12:53 PM, Pavelka, Tomas 
wrote:

> > Maybe I'll think about sudo-enabling cmmflush and checking the last
> > field of /proc/buddyinfo to see if it needs to be run.
>
> I tried doing things based on the values of /proc/buddyinfo but what I
> found is that if there are zeroes in the high order slab counts, there is a
> chance that vmcp with 1M buffer will fail. But not a guarantee. Sometimes
> Linux just rearranges the slabs and finds the memory. Which makes it even
> harder to reproduce. Beware that you can spend ages debugging this ;-)
>
> Tomas
>



Re: How to find a memory leak?

2015-07-09 Thread Barton Robinson

Finding the cause and setting an alert would certainly help anticipate it.
This data is collected each minute automatically, at a cost of less than
0.1% of one IFL per server, at process and system level.  There are more
metrics; this is a sample:


Report: ESALNXP  LINUX  Velocity Software Corporate ZMAP 4.2.0 02/
Monitor initialized: 02/27/ First record analyzed: 02/27/15 19:00:00

node/ <-Process Ident-> <---Storage Metrics (MB)-->
 Name IDPPID   GRP   Size RSS Peak Swap Data Stk EXEC Lib Lck PTbl
- - - -        --- --- 

02/27/15
19:01:00
oracle0 0 0 7375  980 72120  174  4.9 1839 478   0 8.98
 init 1 1 010  0.80 0.14  0.1 0.6   0   0 0.01
 perl  2140 1  214096  9.00 4.06  0.1 1.4 2.2   0 0.03
 snmpd22809 1 22808   359 34.70 3.50  0.1 0.0  29   0 0.05


and at system level:
Report: ESALNXR  LINUX RAM/Storage Analysis Report
Velocity Sof
Monitor initialized: 02/27/15 at 19:00:00 on 2828 serial 314C7 First
record
---

Node/ <-Kernel(MB)->
<-Buffers(MB
<---Cache><---Anonymous---> Stack<-Slab-->
Time Total Free Size Actv Swap Total Actv Inact Size Size SRec Size
Dirty B
 -     -  -    
- -
02/27/15
19:01:00
oracle   994.8 13.7  5500  0.8 115.60 00 38.40
251   0.2
---

19:02:00
oracle   994.8 13.7  5500  0.8 115.60 00 38.30
251   0.0



On 7/9/2015 9:31 AM, Michael MacIsaac wrote:

Barton,

It reports on the /proc/buddyinfo values and anticipates vmcp failing?

On Thu, Jul 9, 2015 at 12:26 PM, Barton Robinson <
bar...@velocitysoftware.com> wrote:


And a good performance monitor would already have this reported - down
to the process level.


On 7/9/2015 9:06 AM, Michael MacIsaac wrote:


Let me answer my own question.  Perhaps kludgy, but by adding 'tee' to
sudo, this technique works:

root@lab141:~ # visudo
root@lab141:~ # tail -1 /etc/sudoers
%zoom ALL=NOPASSWD:/usr/bin/tee
root@lab141:~ # su - mike
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        473         18          0        111        170
-/+ buffers/cache:        190        300
Swap:          512          0        511
mike@lab141:~ # echo 3 | sudo /usr/bin/tee /proc/sys/vm/drop_caches > /dev/null
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        103        388          0          1         12
-/+ buffers/cache:         89        401
Swap:          512          0        511





Re: How to find a memory leak?

2015-07-09 Thread Pavelka, Tomas
> Maybe I'll think about sudo-enabling cmmflush and checking the last field of 
> /proc/buddyinfo to see if it needs to be run.

I tried doing things based on the values of /proc/buddyinfo but what I found is 
that if there are zeroes in the high order slab counts, there is a chance that 
vmcp with 1M buffer will fail. But not a guarantee. Sometimes Linux just 
rearranges the slabs and finds the memory. Which makes it even harder to 
reproduce. Beware that you can spend ages debugging this ;-)

Tomas


Re: How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
Barton,

It reports on the /proc/buddyinfo values and anticipates vmcp failing?

On Thu, Jul 9, 2015 at 12:26 PM, Barton Robinson <
bar...@velocitysoftware.com> wrote:

> And a good performance monitor would already have this reported - down
> to the process level.


Re: How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
Tomas,

> But as I said, in my experiments dropping caches did not help.
So we both arrived at a technique that will not work - (he he :))

> What makes this hard to test is that vmcp running out of memory is not
> easily reproducible.
Yes, the error has been quite intermittent.

Maybe I'll think about sudo-enabling cmmflush and checking the last field
of /proc/buddyinfo to see if it needs to be run.

Thanks all.

-Mike



Re: How to find a memory leak?

2015-07-09 Thread Barton Robinson

And a good performance monitor would already have this reported - down
to the process level.

On 7/9/2015 9:06 AM, Michael MacIsaac wrote:

Let me answer my own question.  Perhaps kludgy, but by adding 'tee' to
sudo, this technique works:

root@lab141:~ # visudo
root@lab141:~ # tail -1 /etc/sudoers
%zoom ALL=NOPASSWD:/usr/bin/tee
root@lab141:~ # su - mike
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        473         18          0        111        170
-/+ buffers/cache:        190        300
Swap:          512          0        511
mike@lab141:~ # echo 3 | sudo /usr/bin/tee /proc/sys/vm/drop_caches > /dev/null
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        103        388          0          1         12
-/+ buffers/cache:         89        401
Swap:          512          0        511





Re: How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
Let me answer my own question.  Perhaps kludgy, but by adding 'tee' to
sudo, this technique works:

root@lab141:~ # visudo
root@lab141:~ # tail -1 /etc/sudoers
%zoom ALL=NOPASSWD:/usr/bin/tee
root@lab141:~ # su - mike
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        473         18          0        111        170
-/+ buffers/cache:        190        300
Swap:          512          0        511
mike@lab141:~ # echo 3 | sudo /usr/bin/tee /proc/sys/vm/drop_caches > /dev/null
mike@lab141:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           491        103        388          0          1         12
-/+ buffers/cache:         89        401
Swap:          512          0        511
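
A side note on the sudoers line above: NOPASSWD for /usr/bin/tee lets anyone
in %zoom overwrite any file as root, not just drop_caches. A narrower variant
is a small root-owned wrapper with the target path fixed, plus a sudoers rule
that allows only that wrapper. The sketch below is an assumption, not
something tested in this thread; the path /usr/local/sbin/drop-caches and the
name DROP_CACHES_FILE are made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper, e.g. installed root-owned as /usr/local/sbin/drop-caches.
# Matching sudoers line (assumption, modelled on the one above):
#   %zoom ALL=NOPASSWD:/usr/local/sbin/drop-caches

# Fixed target; the wrapper never takes a path from the caller.
DROP_CACHES_FILE=/proc/sys/vm/drop_caches

drop_caches() {
    sync                            # write out dirty pages first so more cache is droppable
    echo 3 > "$DROP_CACHES_FILE"    # 3 = page cache plus dentries and inodes
}
```

Installed as a script it would end with a bare drop_caches call, and a
non-root user would run it as "sudo /usr/local/sbin/drop-caches".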





Re: How to find a memory leak?

2015-07-09 Thread Pavelka, Tomas
> The next question is - can this ever be done by a non-root user? I tried 
> adding /bin/echo to /etc/sudoers, but still get an error:

I was able to google these two approaches to dropping caches over sudo:

sudo sh -c "sync; echo 3 > /proc/sys/vm/drop_caches"

or

echo 3 | sudo tee /proc/sys/vm/drop_caches

See the comments here: http://www.linuxinsight.com/proc_sys_vm_drop_caches.html
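
For what it's worth, the reason plain "sudo /bin/echo 3 > /proc/sys/vm/drop_caches"
fails is that the redirection is opened by the calling shell, as the
unprivileged user, before sudo even starts; only echo runs as root. Both
approaches above move the open into a process that runs as root. A minimal
sketch of the tee variant as a reusable function; write_as_root and its
optional third parameter are names introduced here for illustration, not part
of any tool:

```shell
#!/bin/bash
# write_as_root VALUE FILE [ELEVATE]
# Pipes VALUE into "tee FILE" run through ELEVATE, so the file is opened by
# the elevated process rather than by the calling shell.
# ELEVATE defaults to sudo; passing "env" runs tee unprivileged (for dry runs).
write_as_root() {
    local value="$1" file="$2" elevate="${3:-sudo}"
    printf '%s\n' "$value" | "$elevate" tee "$file" > /dev/null
}

# On a real system (needs a sudoers rule that permits tee):
#   sync && write_as_root 3 /proc/sys/vm/drop_caches
```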

But as I said, in my experiments dropping caches did not help. What makes this 
hard to test is that vmcp running out of memory is not easily reproducible. It 
can happen once, and then keep happening for a while as you rerun the command. 
Then the kernel suddenly rearranges the slabs and you can run fine for days. The 
problem is that I have not found a way to free memory for large kernel slabs 
from within a script. If you are trying to fix the problem as a human, the 
solution is to run vmcp --buffer=1M q userid repeatedly; the failure will 
eventually go away.
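
The "run it repeatedly" workaround can be sketched as a bounded retry loop.
This is only an illustration: retry is a helper name introduced here, and the
try count and delay are arbitrary. Elsewhere in this thread, repeating the
vmcp call from a script is reported as not working consistently, so treat it
as a best-effort measure.

```shell
#!/bin/bash
# retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_TRIES attempts have been made.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    local max="$1" delay="$2" try=1
    shift 2
    while [ "$try" -le "$max" ]; do
        "$@" && return 0
        # Pause between attempts, but not after the last one.
        [ "$try" -lt "$max" ] && sleep "$delay"
        try=$((try + 1))
    done
    return 1
}

# Example (on a z/VM guest):
#   retry 20 1 vmcp --buffer=1M q userid
```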

Tomas


Re: How to find a memory leak?

2015-07-09 Thread Marcy Cortes
Easier, but the pages aren't dropped from the zVM side immediately so if you 
are memory constrained there, cmmflush is your friend.


-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael 
MacIsaac
Sent: Thursday, July 09, 2015 8:51 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] How to find a memory leak?

Tomas,

> I forgot to answer this question: you can drop buffers and cache by
> running
> echo 3 > /proc/sys/vm/drop_caches

Nice, even easier. Thanks!

The next question is - can this ever be done by a non-root user? I tried adding 
/bin/echo to /etc/sudoers, but still get an error:

mike@lab153:~ $ sudo /bin/echo 3 > /proc/sys/vm/drop_caches
-bash: /proc/sys/vm/drop_caches: Permission denied



-Mike



Re: How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
Tomas,

> I forgot to answer this question: you can drop buffers and cache by
> running
> echo 3 > /proc/sys/vm/drop_caches

Nice, even easier. Thanks!

The next question is - can this ever be done by a non-root user? I tried
adding /bin/echo to /etc/sudoers, but still get an error:

mike@lab153:~ $ sudo /bin/echo 3 > /proc/sys/vm/drop_caches
-bash: /proc/sys/vm/drop_caches: Permission denied



-Mike



Re: How to find a memory leak?

2015-07-09 Thread Pavelka, Tomas
> Thanks.  I copied and pasted cmmflush and it seems to work nicely

If I understand it right then you have to look at how cmmflush affects the 
output of /proc/buddyinfo. If you see non-zero in the last order of slab (i.e. 
the one with 1MB size) then you are good to run vmcp --buffer=1M. Otherwise you 
may still run into problems even if free -m shows a lot of free memory.
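
A rough way to script that check is to add up the last column of
/proc/buddyinfo across all zones. This is a sketch under assumptions:
highest_order_free is a name made up here, and the last column only
corresponds to 1 MB blocks on a system where the largest order is 1 MB, as
described in this thread. A non-zero result is a hint that a 1M buffer can be
allocated, not a guarantee.

```shell
#!/bin/bash
# highest_order_free [FILE]
# Prints the total number of free blocks in the highest order (last column),
# summed over every zone line. FILE defaults to /proc/buddyinfo.
highest_order_free() {
    awk '{ total += $NF } END { print total + 0 }' "${1:-/proc/buddyinfo}"
}

# Example:
#   if [ "$(highest_order_free)" -eq 0 ]; then
#       echo "no free highest-order blocks; vmcp --buffer=1M may fail" >&2
#   fi
```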

But I have not tried cmmflush, maybe it will help.

The way that I was able to reproduce the memory fragmentation problem was by 
copying a large amount of data over SCP to that Linux machine. Try that and see 
if you can reproduce the vmcp --buffer=1M failure.

Tomas


Re: How to find a memory leak?

2015-07-09 Thread Pavelka, Tomas
> As a workaround, is there a command to flush the buffer cache?

I forgot to answer this question: you can drop buffers and cache by running

echo 3 > /proc/sys/vm/drop_caches

See http://linux-mm.org/Drop_Caches

As far as I remember this did not help at all. My guess about why is that when 
looking for memory, the kernel will already try to drop some caches by itself, 
but in the case of memory fragmentation that does not help. But feel free to 
try.

Other things I tried that did not work, or did not work consistently, were 
repeating the vmcp call with a possible wait and increasing the server memory 
to about 2G. What definitely does not help is increasing the memory with chmem, 
because that adds memory not usable by the kernel for this kind of buffer 
allocation (again, I forgot the details).

Tomas


Re: How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
Tomas, Marcy,

Thanks.  I copied and pasted cmmflush and it seems to work nicely:

# free -m
             total       used       free     shared    buffers     cached
Mem:           492        162        329          0         29         83
-/+ buffers/cache:         49        442
Swap:          898          0        898
# cmmflush
11:16:17 Currently free 328MB, dropping cache...
11:16:18 Now free 422MB, released 93MB
11:16:18 CMM base is 0MB, target is 396MB
11:16:19 CMM currently at 396MB...
11:16:19 Done! CMM base restored to 0MB
11:16:19 Released 396 MB of memory
# free -m
             total       used       free     shared    buffers     cached
Mem:           492         69        423          0          0         19
-/+ buffers/cache:         49        442
Swap:          898          0        898

Rob, thanks for the contribution.

-Mike




Re: How to find a memory leak?

2015-07-09 Thread Pavelka, Tomas
This is a really ugly problem that I don't have a solution for. But let me give 
you a bit of info if you want to do your own digging:

The way I found this is that I was adding NICs to a Linux on the fly. Sometimes 
this would fail, with a page allocation failure reported in syslog. The 
discussion on this list is here:

http://www.mail-archive.com/linux-390%40vm.marist.edu/msg65371.html

What I found later is that the NIC driver needs 64k of memory in kernel space. 
This means the memory needs to be contiguous. The kernel keeps memory in 
structures called slabs, and keeps pools of these. If you run

cat /proc/buddyinfo
Node 0, zone      DMA   9078  10398   3135    838    164     14      0      0      2

you will see how many slabs of each order you have: 9078 order 1 slabs 
(4kb), 10398 order 2 slabs (8kb) ... 2 order 9 slabs (1MB). If a slab of a 
lower order is needed, the kernel may split a higher order one (e.g. if it 
wants a 4k slab it may split an 8k slab into two). After lots of kernel 
allocations you may run out of the higher order slabs. What worked for me for 
triggering this condition was moving a lot of data to the Linux over SCP. There 
may be other causes.

Another way to get a memory report is to run "echo m > /proc/sysrq-trigger" and 
look into syslog for a report about kernel memory usage.
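
To make the order-to-size mapping easier to read, here is a small sketch that
labels each buddyinfo column with its block size. The names buddy_sizes and
PAGE_KB are introduced here for illustration, and a 4 KB page size is assumed.
Note that the kernel numbers orders from 0, so what is called "order 1 (4kb)"
above shows up as order 0 in this output.

```shell
#!/bin/bash
# buddy_sizes [FILE]
# For every zone line, prints one row per order:
#   <zone> order <n> (<block size> KB): <free count>
# FILE defaults to /proc/buddyinfo; PAGE_KB overrides the assumed 4 KB page.
buddy_sizes() {
    awk -v page_kb="${PAGE_KB:-4}" '
        /^Node/ {
            zone = $4                      # fields: Node, "0,", zone, <name>, counts...
            for (i = 5; i <= NF; i++) {
                order = i - 5              # first count column is order 0
                printf "%s order %d (%d KB): %d\n", zone, order, page_kb * 2^order, $i
            }
        }' "${1:-/proc/buddyinfo}"
}

# Example:
#   buddy_sizes | tail -3    # the three largest block sizes per the last zone
```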

Now the significance of 32k is that this is where Linux stops retrying to 
rearrange memory to find larger slabs. I don't remember the details, but if you 
want to investigate look at the kernel sources, namely mm/page_alloc.c and 
mm/vmscan.c

So the bottom line is, anytime you have an operation that needs a large buffer 
in kernel (chccwdev of a NIC, vmcp with --buffer, DIAG from Linux) it may fail 
at unexpected times. I have not found a good way to get around this but I will 
be interested if you find anything.

In the case of VMCP what may help is if it allocated a buffer at kernel 
startup. At the moment it allocates it for every call, see 
http://lxr.free-electrons.com/source/drivers/s390/char/vmcp.c#L105 

Tomas



Re: How to find a memory leak?

2015-07-09 Thread Marcy Cortes
Use Rob's cmmflush !
https://zvmperf.wordpress.com/2012/07/06/using-cmm-to-flush-a-linux-guests-memory/

We use it every day in dev to keep the vm paging rate way down.


-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael 
MacIsaac
Sent: Thursday, July 09, 2015 7:49 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] How to find a memory leak?

Thomas,

> Did you use a buffer larger than 32k on those vmcp commands?
Yes, I always use 1M (vmcpCmd="/sbin/vmcp --buffer=1M") in the event there is a 
lot of output from CP.

> Vmcp can fail due to memory fragmentation even on a server with lots of
> free memory.
Hmmm, interesting... could this be considered a bug?

As a workaround, is there a command to flush the buffer cache?

Thanks.

-Mike M.




Re: How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
Thomas,

> Did you use a buffer larger than 32k on those vmcp commands?
Yes, I always use 1M (vmcpCmd="/sbin/vmcp --buffer=1M") in the event there
is a lot of output from CP.

> Vmcp can fail due to memory fragmentation even on a server with lots of
> free memory.
Hmmm, interesting... could this be considered a bug?

As a workaround, is there a command to flush the buffer cache?

Thanks.

-Mike M.




Re: How to find a memory leak?

2015-07-09 Thread Pavelka, Tomas
> In the past this server has gone to near zero memory, and vmcp commands fail.

Do you have any specifics? Did you use a buffer larger than 32k on those vmcp 
commands? Vmcp can fail due to memory fragmentation even on a server with lots 
of free memory.

Tomas Pavelka
CA Technologies
Sr Software Engineer

CA CZ, s.r.o 
V Parku 12, 
148 00 Praha 
Czech Republic

Office: +25996 | tomas.pave...@ca.com



Id. No. 25694073, registered in the Commercial Register maintained by the 
Municipal Court in Prague, Section C, File 61808




Re: How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
Thanks Richard for the joke :))

Thanks Thomas for the input.  I changed the ps command flag to '--sort
-rss', and restarted memusage - will continue to monitor.
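
For reference, the changed script could look roughly like the sketch below.
It is based on the memusage script quoted further down; log_sample is a
function name introduced here, and the only functional change is sorting by
RSS instead of VSZ.

```shell
#!/bin/bash
# log_sample FILE
# Appends one sample to FILE: a separator, the date, the 21 processes with
# the largest resident set size (plus the ps header line), and free -m output.
log_sample() {
    local out="$1"
    {
        echo "---"
        date
        ps aux --sort -rss | head -22
        echo
        free -m
    } >> "$out"
}

# Same loop as the original script, one sample every 5 minutes:
#   while true; do log_sample /tmp/memusage; sleep 300; done
```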

Thanks Dave for the pointer, but I don't have any of my own C/C++ programs
running, just many bash scripts (if they do no 'malloc's, can they still
cause memory leaks?).

In the past this server has gone to near zero memory, and vmcp commands
fail.  I'm guessing the OOM killer was invoked, but by then it's already
too late ...

-Mike

On Thu, Jul 9, 2015 at 9:54 AM, Dave Jones  wrote:

> Hi, Mike.
>
> If the package AddressSanitizer (ASan) is available, you might want to
> give it a go.  It is a fast memory error detector that can find
> use-after-free and {heap,stack,global}-buffer overflow bugs in C/C++
> programs. It's here:
>
> https://code.google.com/p/address-sanitizer/
>
> Good luck. I still think C/C++ will be the death of us all. :-)
>
> DJ
>
> On 07/09/2015 07:50 AM, Pavelka, Tomas wrote:
> > Look at the " -/+ buffers/cache" line in the free output:
> >
> > Before:
> > -/+ buffers/cache:         41        450
> > After:
> > -/+ buffers/cache:         48        443
> >
> > (First number used, second free)
> >
> > Linux has various buffers and caches that are allocated if there is free
> > memory. For example for disk reads. These are dropped if the memory is
> > needed by processes. The " -/+ buffers/cache" line shows what memory is
> > actually used by processes and not the buffers. In your case the used
> > memory rose only by 7 MB.
> >
> > BTW I would not look at the virtual memory size of processes, this may be
> > allocated way over the virtual memory size of your machine. The more
> > interesting metric is RSS which is how much memory is actually used.
> >
> > HTH,
> > Tomas
> >
> >
> > -Original Message-
> > From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of
> Michael MacIsaac
> > Sent: Thursday, July 09, 2015 2:19 PM
> > To: LINUX-390@VM.MARIST.EDU
> > Subject: How to find a memory leak?
> >
> > Hello list,
> >
> > I have a SLES 11 SP3 system that is leaking memory, but I don't know how
> or where.
> >
> > I find a script on the Internet that runs forever, adapt it somewhat,
> and start logging some info to a temp file. Here's the script:
> >
> > # cat memusage
> > #!/bin/bash
> > #
> > # track memory usage
> > #
> > outFile="/tmp/memusage"
> > while true
> > do
> >   echo "---" >> $outFile
> >   date >> $outFile
> >   ps aux --sort -vsz | head -22 >> $outFile
> >   echo >> $outFile
> >   free -m >> $outFile
> >   sleep 300
> > done
> >
> > After a fresh reboot of a 512 MB virtual machine, I start the script and
> the first entry in the temp file shows about 20 MB (512 - 492) used by
> Linux and 97 MB used by processes:
> >
> > Wed Jul  8 12:37:45 EDT 2015
> > USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
> > root  2181  0.0  0.2 115404  1024 ?Ssl  12:36   0:00
> > /usr/sbin/nscd
> > root  1851  0.0  0.1  11512   692 ?S > /sbin/auditd -s disable
> > root  2556  0.3  0.7  11456  4004 ?Ss   12:37   0:00 sshd:
> > root@pts/0
> > root  2306  0.0  0.7 10720 3700 ?Ss   12:36   0:00
> > /usr/sbin/httpd2-prefork
> > wwwrun2307  0.0  0.4  10720  2204 ?S12:36   0:00
> > /usr/sbin/httpd2-prefork
> > wwwrun2308  0.0  0.4  10720  2204 ?S12:36   0:00
> > /usr/sbin/httpd2-prefork
> > wwwrun2309  0.0  0.4  10720  2204 ?S12:36   0:00
> > /usr/sbin/httpd2-prefork
> > wwwrun2310  0.0  0.4  10720  2204 ?S12:36   0:00
> > /usr/sbin/httpd2-prefork
> > wwwrun2311  0.0  0.4  10720  2204 ?S12:36   0:00
> > /usr/sbin/httpd2-prefork
> > root  1853  0.0  0.1  10428   824 ?S > /sbin/audispd
> > root   

Re: How to find a memory leak?

2015-07-09 Thread Dave Jones
Hi, Mike.

if the package AddressSanitizer (ASan) is available, you might want to
give it a go.  It is a fast memory error detector that can find
use-after-free and {heap,stack,global}-buffer overflow bugs in C/C++
programs. It's here:

https://code.google.com/p/address-sanitizer/

Good luck. I still think C/C++ will be the death of us all. :-)

DJ

On 07/09/2015 07:50 AM, Pavelka, Tomas wrote:
> Look at the " -/+ buffers/cache" line in the free output:
> 
> Before:
> -/+ buffers/cache:         41        450
> After:
> -/+ buffers/cache:         48        443
> 
> (First number used, second free)
> 
> Linux has various buffers and caches that it allocates when there is free
> memory, for example for disk reads. These are dropped if the memory is needed
> by processes. The " -/+ buffers/cache" line shows what memory is actually
> used by processes and not the buffers. In your case the used memory rose only
> by 7 MB (from 41 MB to 48 MB).
>
> BTW I would not look at the virtual memory size of processes; this may be
> allocated way over the virtual memory size of your machine. The more
> interesting metric is RSS, which is how much memory is actually used.
> 
> HTH,
> Tomas
> 
> Tomas Pavelka
> CA Technologies
> Sr Software Engineer
> 
> CA CZ, s.r.o 
> V Parku 12, 
> 148 00 Praha 
> Czech Republic
> 
> Office: +25996 | tomas.pave...@ca.com
> 
> 
> 
> Id. No. 25694073, registered in the Commercial Register maintained by the
> Municipal Court in Prague, Section C, File 61808
> 
> 
> -Original Message-
> From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael 
> MacIsaac
> Sent: Thursday, July 09, 2015 2:19 PM
> To: LINUX-390@VM.MARIST.EDU
> Subject: How to find a memory leak?
> 
> Hello list,
> 
> I have a SLES 11 SP3 system that is leaking memory, but I don't know how or 
> where.
> 
> I find a script on the Internet that runs forever, adapt it somewhat, and 
> start logging some info to a temp file. Here's the script:
> 
> # cat memusage
> #!/bin/bash
> #
> # track memory usage
> #
> outFile="/tmp/memusage"
> while true
> do
>   echo "---" >> $outFile
>   date >> $outFile
>   ps aux --sort -vsz | head -22 >> $outFile
>   echo >> $outFile
>   free -m >> $outFile
>   sleep 300
> done
> 
> After a fresh reboot of a 512 MB virtual machine, I start the script and the 
> first entry in the temp file shows about 20 MB (512 - 492) used by Linux and 
> 97 MB used by processes:
> 
> Wed Jul  8 12:37:45 EDT 2015
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root      2181  0.0  0.2 115404  1024 ?        Ssl  12:36   0:00 /usr/sbin/nscd
> root      1851  0.0  0.1  11512   692 ?        S                 /sbin/auditd -s disable
> root      2556  0.3  0.7  11456  4004 ?        Ss   12:37   0:00 sshd: root@pts/0
> root      2306  0.0  0.7  10720  3700 ?        Ss   12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2307  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2308  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2309  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2310  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> wwwrun    2311  0.0  0.4  10720  2204 ?        S    12:36   0:00 /usr/sbin/httpd2-prefork
> root      1853  0.0  0.1  10428   824 ?        S                 /sbin/audispd
> root       997  0.0  0.6   9036  3224 ?        Ssl  12:36   0:00 /usr/sbin/console-kit-da
> root      2265  0.0  0.5   8136  2532 ?        Ss   12:36   0:00 /usr/lib/postfix/master
> postfix   2277  0.0  0.4   8004  2372 ?        S    12:36   0:00 qmgr -l -t fifo -u
> postfix   2276  0.0  0.4   7948  2352 ?        S    12:36   0:00 pickup -l -t fifo -u
> root      2172  0.0  0.3   7916  1532 ?        Ss   12:36   0:00 /usr/sbin/sshd -o PidFi
> 101        994  0.0  0.5   7852  2804 ?        Ss   12:36   0:00 /usr/sbin/hald --daemon
> root      1869  0.0  0.8   6464  4504 ?        Ss   12:36   0:00 /sbin/haveged -w 1024 -
> root      2559  1.0  0.6   6056  3076 pts/0    Ss   12:37   0:00 -bash
> root       998  0.0  0.2   3980  1332 ?        S    12:36   0:00 hald-runner
> root      2591  0.0  0.3   3652  1604 pts/0    S+   12:37   0:00 /bin/bash /usr/local/sb
> root      2343  0.0  0.1   3508   944 ?        Ss   12:36   0:00 /usr/sbin/xinetd -pidfi
> 
>  

Re: How to find a memory leak?

2015-07-09 Thread Pavelka, Tomas
Look at the " -/+ buffers/cache" line in the free output:

Before:
-/+ buffers/cache:         41        450
After:
-/+ buffers/cache:         48        443

(First number used, second free)

Linux has various buffers and caches that it allocates when there is free
memory, for example for disk reads. These are dropped if the memory is needed
by processes. The " -/+ buffers/cache" line shows what memory is actually used
by processes and not the buffers. In your case the used memory rose only by 7
MB (from 41 MB to 48 MB).
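
The same figure can be approximated straight from /proc/meminfo. This is a minimal sketch, assuming the older free(1) arithmetic (used minus buffers minus cached); the function name used_minus_cache_mb is made up for illustration, and newer versions of free may count memory differently.

```shell
#!/bin/bash
# Hypothetical helper: report memory used by processes, excluding buffers
# and page cache, in whole MB. Reads a meminfo-style file (values in kB);
# defaults to /proc/meminfo.
used_minus_cache_mb() {
  local meminfo="${1:-/proc/meminfo}"
  awk '
    $1 == "MemTotal:" { total   = $2 }
    $1 == "MemFree:"  { free    = $2 }
    $1 == "Buffers:"  { buffers = $2 }
    $1 == "Cached:"   { cached  = $2 }
    END { printf "%d\n", int((total - free - buffers - cached) / 1024) }
  ' "$meminfo"
}

# Usage: used_minus_cache_mb
```

If this number keeps climbing across samples while the workload stays the same, that points at processes (or the kernel) rather than at the cache.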

BTW I would not look at the virtual memory size of processes; this may be
allocated way over the virtual memory size of your machine. The more
interesting metric is RSS, which is how much memory is actually used.
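
To rank processes by RSS instead of VSZ, something along these lines may help. It is a sketch only: top_rss is a made-up name, and it keeps just the first word of the command name.

```shell
#!/bin/bash
# Hypothetical helper: print the N largest processes by resident set size.
# Expects "rss_kB command" lines on stdin, largest N first on output.
top_rss() {
  local count="${1:-10}"
  sort -k1,1nr | head -n "$count" |
    awk '{ printf "%8.1f MB  %s\n", $1 / 1024, $2 }'
}

# Usage (separate -o options keep the header line suppressed):
#   ps -e -o rss= -o comm= | top_rss 5
```

In the memusage script itself, the equivalent change is sorting with --sort -rss instead of --sort -vsz.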

HTH,
Tomas

Tomas Pavelka
CA Technologies
Sr Software Engineer

CA CZ, s.r.o 
V Parku 12, 
148 00 Praha 
Czech Republic

Office: +25996 | tomas.pave...@ca.com



Id. No. 25694073, registered in the Commercial Register maintained by the
Municipal Court in Prague, Section C, File 61808


-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Michael 
MacIsaac
Sent: Thursday, July 09, 2015 2:19 PM
To: LINUX-390@VM.MARIST.EDU
Subject: How to find a memory leak?

Hello list,

I have a SLES 11 SP3 system that is leaking memory, but I don't know how or 
where.

I find a script on the Internet that runs forever, adapt it somewhat, and start 
logging some info to a temp file. Here's the script:

# cat memusage
#!/bin/bash
#
# track memory usage
#
outFile="/tmp/memusage"
while true
do
  echo "---" >> $outFile
  date >> $outFile
  ps aux --sort -vsz | head -22 >> $outFile
  echo >> $outFile
  free -m >> $outFile
  sleep 300
done

After a fresh reboot of a 512 MB virtual machine, I start the script and the 
first entry in the temp file shows about 20 MB (512 - 492) used by Linux and 97 
MB used by processes:

Wed Jul  8 12:37:45 EDT 2015
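USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2181  0.0  0.2 115404  1024 ?        Ssl  12:36   0:00 /usr/sbin/nscd
root      1851  0.0  0.1  11512   692 ?        S

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit http://wiki.linuxvm.org/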


Re: How to find a memory leak?

2015-07-09 Thread Richard Pinion
Spray soapy water on it and look for bubbles :)



--- mike99...@gmail.com wrote:

From: Michael MacIsaac 
To:   LINUX-390@VM.MARIST.EDU
Subject: How to find a memory leak?
Date: Thu, 9 Jul 2015 08:19:20 -0400

Hello list,

I have a SLES 11 SP3 system that is leaking memory, but I don't know how or
where.

I find a script on the Internet that runs forever, adapt it somewhat, and
start logging some info to a temp file. Here's the script:

# cat memusage
#!/bin/bash
#
# track memory usage
#
outFile="/tmp/memusage"
while true
do
  echo "---" >> $outFile
  date >> $outFile
  ps aux --sort -vsz | head -22 >> $outFile
  echo >> $outFile
  free -m >> $outFile
  sleep 300
done

After a fresh reboot of a 512 MB virtual machine, I start the script and
the first entry in the temp file shows about 20 MB (512 - 492) used by
Linux and 97 MB used by processes:

Wed Jul  8 12:37:45 EDT 2015
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2181  0.0  0.2 115404  1024 ?        Ssl  12:36   0:00 /usr/sbin/nscd
root      1851  0.0  0.1  11512   692 ?        S






How to find a memory leak?

2015-07-09 Thread Michael MacIsaac
Hello list,

I have a SLES 11 SP3 system that is leaking memory, but I don't know how or
where.

I find a script on the Internet that runs forever, adapt it somewhat, and
start logging some info to a temp file. Here's the script:

# cat memusage
#!/bin/bash
#
# track memory usage
#
outFile="/tmp/memusage"
while true
do
  echo "---" >> $outFile
  date >> $outFile
  ps aux --sort -vsz | head -22 >> $outFile
  echo >> $outFile
  free -m >> $outFile
  sleep 300
done
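
Once the script has run for a while, the log can be boiled down to one line per sample. This is a rough sketch: memusage_trend is a made-up name, and it assumes the log layout produced by the script above plus the older free -m output that still prints a "-/+ buffers/cache" line.

```shell
#!/bin/bash
# Hypothetical helper: summarize the log written by the memusage script.
# Prints one line per sample: the timestamp, then the "used" column of the
# "-/+ buffers/cache" line from free -m.
memusage_trend() {
  local log="${1:-/tmp/memusage}"
  awk '
    /^---$/           { getline stamp; next }   # the date line follows "---"
    /buffers\/cache:/ { print stamp "  " $(NF - 1) " MB used" }
  ' "$log"
}

# Usage: memusage_trend /tmp/memusage
```

A steadily rising "used" value across samples is a better leak indicator than the shrinking "free" value on the Mem: line.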

After a fresh reboot of a 512 MB virtual machine, I start the script and
the first entry in the temp file shows about 20 MB (512 - 492) used by
Linux and 97 MB used by processes:

Wed Jul  8 12:37:45 EDT 2015
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2181  0.0  0.2 115404  1024 ?        Ssl  12:36   0:00 /usr/sbin/nscd
root      1851  0.0  0.1  11512   692 ?        S