Re: Significant memory leak in 9.3p10?

2015-03-27 Thread Konstantin Belousov
On Thu, Mar 26, 2015 at 03:46:05PM -0400, J David wrote:
> On Mon, Mar 16, 2015 at 7:52 PM, J David  wrote:
> > On Mon, Mar 16, 2015 at 7:24 PM, Konstantin Belousov
> >  wrote:
> >> There are a lot of possibilities to create persistent anonymous shared
> >> memory objects.  An incomplete list: tmpfs mounts, swap-backed md disks,
> >> sysv shared memory, possibly posix shared memory (I do not remember which
> >> implementation is used in stable/9).
> >
> > If that's the explanation, how could it be
> > detected/measured/investigated/resolved/prevented?
> >
> > Under ordinary circumstances, machines will run like this for days/weeks:
> >
> > Mem: 549M Active, 3623M Inact, 567M Wired, 3484K Cache, 827M Buf, 3156M Free
> > Swap: 1024M Total, 1024M Free
> >
> > Then, when this happens, it rapidly degrades from that to so bad that
> > processes start getting killed for being out of swap space.
> 
> These FreeBSD machines running out of swap space and dying continues
> to be a daily problem causing outages and unscheduled reboots.  Is
> there really no way to even research what might be causing the
> problem?
> 
> (Widening the cross-posting in the hopes of eliciting more help, so
> the brief summary of the problem originally posted to freebsd-stable is
> that an unknown actor consumes all the user-space memory in the
> system, including swap space, to the point where processes are killed
> for being out of swap space, but if every process on the machine is
> stopped, very little of the user-space memory in use is freed.
> Original message with more details is here:
> https://lists.freebsd.org/pipermail/freebsd-stable/2015-March/081986.html
> .)
> 
> There are no tmpfs mounts or md disks, so it would have to be one of
> the other causes.  How can FreeBSD's use of persistent, anonymous
> shared memory objects be investigated, measured, or controlled so we
> can get a handle on this issue?

Start by providing useful information about your system, not a description
of the information.

E.g., a consistent snapshot of the following:
ps auxww
swapinfo
mount -v
mdconfig -lv
vmstat -z
vmstat -m
vmstat -s
sysctl -a
ipcs -a

Collect this data three times: during a normal run, while the problem is
occurring but before any userspace processes are killed, and after you have
killed the processes.

Just in case, show kldstat.
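
One way to gather the snapshots requested above is a small script that runs each command and saves its output under a labelled, timestamped directory. This is only a sketch: the function name collect_snapshot, the "snapshots" directory, and the file naming are assumptions made for illustration. Because the commands run one after another, the result is only approximately a consistent snapshot.

```python
import subprocess
import time
from pathlib import Path

# Commands taken from the list above, plus kldstat.
COMMANDS = [
    "ps auxww", "swapinfo", "mount -v", "mdconfig -lv",
    "vmstat -z", "vmstat -m", "vmstat -s", "sysctl -a", "ipcs -a", "kldstat",
]


def collect_snapshot(label, commands=COMMANDS, root="snapshots"):
    """Run each command and save its output under root/<label>-<timestamp>/."""
    outdir = Path(root) / f"{label}-{time.strftime('%Y%m%d-%H%M%S')}"
    outdir.mkdir(parents=True, exist_ok=True)
    for cmd in commands:
        # File name derived from the command, e.g. "vmstat -z" -> "vmstat_-z.txt".
        outfile = outdir / (cmd.replace(" ", "_") + ".txt")
        try:
            result = subprocess.run(
                cmd.split(), capture_output=True, text=True, timeout=120
            )
            text = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Record the failure instead of aborting the whole snapshot.
            text = f"failed to run {cmd!r}: {exc}\n"
        outfile.write_text(text)
    return outdir
```

Calling collect_snapshot("normal"), collect_snapshot("leaking"), and collect_snapshot("after-kill") at the three points in time yields three directories that can then be compared with diff -r.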
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Significant memory leak in 9.3p10?

2015-03-26 Thread J David
On Thu, Mar 26, 2015 at 9:28 PM, Steven Hartland
 wrote:
> Does vmstat -m or vmstat -z shed any light?

None, as those show kernel memory usage, not user space.  Looking at
them anyway shows nothing unusual: nothing consuming large amounts of
memory, and nothing disproportionate to the kernel memory shown as in use.

The list of suspects that can consume user memory without being
associated with any user process is very short: some sort of
anonymous, persistent shared memory object.  Konstantin offered a
partial list of some likely candidates in response to the initial
message, including:

- NO: tmpfs mounts (not used)
- NO: swap-backed md disks (not used)
- PROBABLY NO: sysv shared memory (believed not to be used)
- MAYBE: possibly posix shared memory (unknown whether used)
- MAYBE: anonymous mmap segments that have somehow got lost (i.e. file
descriptor is hanging around in the kernel somewhere) -- proposed by
someone off-list
- MAYBE: others?

Of the two remaining known possibilities, posix shared memory seems
more likely than an unknown mmap bug.  Unfortunately, I have not found
any way to gather statistics and/or get/set limits on posix shared
memory usage.  Does such a method exist?

Really, it would be great if there were a tool that could walk the
entire list of VM blocks and generate some kind of report or
statistics (like vmstat -z or vmstat -m, but for VM rather than kernel
memory).  As it is, we are reduced to guessing what might be going on,
which is decidedly suboptimal.  However, I have no idea if such a tool
exists, if it is even possible to write, or (if it is) how to go about
writing it.

Thanks!


Re: Significant memory leak in 9.3p10?

2015-03-26 Thread Doug Hardie

> On 26 March 2015, at 18:02, Chris H  wrote:
> 
> On Thu, 26 Mar 2015 20:28:15 -0400 J David  wrote
> 
>> On Thu, Mar 26, 2015 at 8:25 PM, Chris H  wrote:
>>> As Kevin already noted; stopping firefox, and starting it again,
>>> seems the only solution.
>> 
>> The machines in question are servers; they do not run Firefox or any
>> GUI.  And whatever is using the memory does not show up on ps or top.
> Fair enough. I'm still getting caught up, on the thread.
> 
> Maybe another "shot in the dark", but speaking of servers: we
> ran into trouble with a web server generating *enormous* error
> logs -- a runaway script. The result was that, even though there
> was far more than adequate space for the swelling log(s),
> memory, and eventually swap usage, began to climb quite steadily.
> 
> Like I said; maybe a shot in the dark. But just thought I'd
> mention it.

I just encountered the same problem on a FreeBSD 8.2-RELEASE-p3 server today.
Swap was at 100% and processes were being killed.  I used ps ax and killed all
the processes with W status that I could.  Swap usage went down to 99%.  This
was a production server, so I was forced to reboot.  After the reboot, the system
came back up with the same process set and zero swap used.  Shortly after that
a core image appeared and the root filesystem was full.  The core file was
about 1 GB.  However, none of my processes are anywhere near that size.  The
specific process that was dumped is only about 140 lines of C code and doesn’t
use any dynamic storage, just a couple of short character strings and one
integer.  The binary file is 23KB.  I couldn’t take time to run gdb on it, as it
was affecting production.



Re: Significant memory leak in 9.3p10?

2015-03-26 Thread Steven Hartland



On 26/03/2015 23:47, J David wrote:
> On Thu, Mar 26, 2015 at 5:03 PM, Kevin Oberman  wrote:
>> This is just a shot in the dark and not a really likely one, but I have had
>> issues with Firefox leaking memory badly. I can free the space by killing
>> firefox and restarting it.
>
> In our case, we can log in from the console, kill every single
> user-mode process on the system except the init, login, and the
> console shell, and the memory is not recovered.  Gigabytes and
> gigabytes of user memory are being held by some un-findable
> anonymous persistent structure not linked to any process.  Konstantin
> proposed that it was some sort of shared memory usage, but there
> appears to be no way to check or investigate most types of shared
> memory usage on FreeBSD.

Does vmstat -m or vmstat -z shed any light?

Regards
Steve


Re: Significant memory leak in 9.3p10?

2015-03-26 Thread Chris H
On Thu, 26 Mar 2015 20:28:15 -0400 J David  wrote

> On Thu, Mar 26, 2015 at 8:25 PM, Chris H  wrote:
> > As Kevin already noted; stopping firefox, and starting it again,
> > seems the only solution.
> 
> The machines in question are servers; they do not run Firefox or any
> GUI.  And whatever is using the memory does not show up on ps or top.
Fair enough. I'm still getting caught up, on the thread.

Maybe another "shot in the dark", but speaking of servers: we
ran into trouble with a web server generating *enormous* error
logs -- a runaway script. The result was that, even though there
was far more than adequate space for the swelling log(s),
memory, and eventually swap usage, began to climb quite steadily.

Like I said; maybe a shot in the dark. But just thought I'd
mention it.
> 
> Thanks!
--Chris



Re: Significant memory leak in 9.3p10?

2015-03-26 Thread J David
On Thu, Mar 26, 2015 at 8:25 PM, Chris H  wrote:
> As Kevin already noted; stopping firefox, and starting it again,
> seems the only solution.

The machines in question are servers; they do not run Firefox or any
GUI.  And whatever is using the memory does not show up on ps or top.

Thanks!


Re: Significant memory leak in 9.3p10?

2015-03-26 Thread Chris H
On Thu, 26 Mar 2015 14:03:45 -0700 Kevin Oberman  wrote

> On Thu, Mar 26, 2015 at 12:46 PM, J David  wrote:
> 
> > On Mon, Mar 16, 2015 at 7:52 PM, J David  wrote:
> > > On Mon, Mar 16, 2015 at 7:24 PM, Konstantin Belousov
> > >  wrote:
> > >> There are a lot of possibilities to create persistent anonymous shared
> > >> memory objects.  An incomplete list: tmpfs mounts, swap-backed md disks,
> > >> sysv shared memory, possibly posix shared memory (I do not remember which
> > >> implementation is used in stable/9).
> > >
> > > If that's the explanation, how could it be
> > > detected/measured/investigated/resolved/prevented?
> > >
> > > Under ordinary circumstances, machines will run like this for days/weeks:
> > >
> > > Mem: 549M Active, 3623M Inact, 567M Wired, 3484K Cache, 827M Buf, 3156M Free
> > > Swap: 1024M Total, 1024M Free
> > >
> > > Then, when this happens, it rapidly degrades from that to so bad that
> > > processes start getting killed for being out of swap space.
> >
> > These FreeBSD machines running out of swap space and dying continues
> > to be a daily problem causing outages and unscheduled reboots.  Is
> > there really no way to even research what might be causing the
> > problem?
> >
> > (Widening the cross-posting in the hopes of eliciting more help, so
> > the brief summary of the problem originally posted to freebsd-stable is
> > that an unknown actor consumes all the user-space memory in the
> > system, including swap space, to the point where processes are killed
> > for being out of swap space, but if every process on the machine is
> > stopped, very little of the user-space memory in use is freed.
> > Original message with more details is here:
> > https://lists.freebsd.org/pipermail/freebsd-stable/2015-March/081986.html
> > .)
> >
> > There are no tmpfs mounts or md disks, so it would have to be one of
> > the other causes.  How can FreeBSD's use of persistent, anonymous
> > shared memory objects be investigated, measured, or controlled so we
> > can get a handle on this issue?
> >
> 
> This is just a shot in the dark and not a really likely one, but I have had
> issues with Firefox leaking memory badly. I can free the space by killing
> firefox and restarting it.
> 
> It seems to be linked to certain web sites, probably javascript. I have not
> been able to confirm which one does it. It just will start growing until
> the system slows to a crawl as too many things are swapped out. Normally my
> system does not touch swap.
I can confirm this with both the regular release and ESR. Upgrading firefox
[ultimately] has little-to-no effect. I have experienced this for nearly
2 years. I suspect Firefox's JS engine. Any one of any number of
sites could cause it.

As Kevin already noted; stopping firefox, and starting it again,
seems the only solution.
> 
> If it is in user space, top should show it under RES.
> --
> Kevin Oberman, Network Engineer, Retired
> E-mail: rkober...@gmail.com

--Chris




Re: Significant memory leak in 9.3p10?

2015-03-26 Thread J David
On Thu, Mar 26, 2015 at 7:39 PM, The Lost Admin  wrote:
> Have you looked through the system shutdown scripts (part of init/rc) to see 
> what happens after the uptime is printed? that might give you a lead.

All of that output is printed by the kernel (see
sys/kern/kern_shutdown.c), not by scripts.  It happens after any
shutdown scripts are run.

> The output from your ps seems to be much shorter than I would expect. Are you
> sure it included everything? For example, I would expect to see processes for
> cron, syslog, and normally sshd.

Killed them all.  Killed absolutely every user process but init, login, and
the shell.  Memory not freed.

> I’ve also got a few more kernel processes that you don’t appear to have. Most 
> notably is pagedaemon

pagedaemon is on the list with pid 5.

Thanks!

Re: Significant memory leak in 9.3p10?

2015-03-26 Thread J David
On Thu, Mar 26, 2015 at 5:03 PM, Kevin Oberman  wrote:
> This is just a shot in the dark and not a really likely one, but I have had
> issues with Firefox leaking memory badly. I can free the space by killing
> firefox and restarting it.

In our case, we can log in from the console, kill every single
user-mode process on the system except the init, login, and the
console shell, and the memory is not recovered.  Gigabytes and
gigabytes of user memory are being held by some un-findable
anonymous persistent structure not linked to any process.  Konstantin
proposed that it was some sort of shared memory usage, but there
appears to be no way to check or investigate most types of shared
memory usage on FreeBSD.

> If it is in user space, top should show it under RES.

This is definitely *not* the case.  Whatever is using the memory is
not associated with any user-space process, and does not show up on
top or ps.

It also does not appear to be SysV shared memory, as that reports:

$ ipcs -m
Shared Memory:
T           ID          KEY MODE        OWNER    GROUP

$

Also, kern.ipc.shmmax is only 512MB, whereas this problem usually
consumes 8-10GB.  So I guess the remaining possibilities are anonymous
mmaps that are somehow not associated with any process, and POSIX
shared memory.  Are there any ways to investigate either possibility?
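
A note on the shmmax comparison: kern.ipc.shmmax limits the size of a single SysV segment, while kern.ipc.shmall limits the total across all segments, counted in pages. So the ceiling that matters for this argument is shmall times the page size. The sketch below does that arithmetic. The shmall value and the 4096-byte page size are hypothetical placeholders, not values read from the affected machines, and the function names are made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumed page size, check with `sysctl hw.pagesize`
GIB = 1024 ** 3


def sysv_shm_ceiling_bytes(shmall_pages, page_size=PAGE_SIZE):
    """Upper bound on total SysV shared memory implied by kern.ipc.shmall."""
    return shmall_pages * page_size


def could_explain(leak_bytes, shmall_pages, page_size=PAGE_SIZE):
    """True if SysV shared memory could, in principle, account for leak_bytes."""
    return leak_bytes <= sysv_shm_ceiling_bytes(shmall_pages, page_size)


# Hypothetical value; read the real one with `sysctl kern.ipc.shmall`.
shmall_pages = 131072
# Lower end of the 8-10GB reported above.
leak_bytes = 8 * GIB

print(sysv_shm_ceiling_bytes(shmall_pages) // (1024 ** 2), "MB ceiling")
print("SysV shm could explain the leak:", could_explain(leak_bytes, shmall_pages))
```

With these placeholder numbers the ceiling is 512 MB, far below the leak, which agrees with the empty ipcs -m listing above.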

Thanks!


Re: Significant memory leak in 9.3p10?

2015-03-26 Thread Kevin Oberman
On Thu, Mar 26, 2015 at 12:46 PM, J David  wrote:

> On Mon, Mar 16, 2015 at 7:52 PM, J David  wrote:
> > On Mon, Mar 16, 2015 at 7:24 PM, Konstantin Belousov
> >  wrote:
> >> There are a lot of possibilities to create persistent anonymous shared
> >> memory objects.  An incomplete list: tmpfs mounts, swap-backed md disks,
> >> sysv shared memory, possibly posix shared memory (I do not remember which
> >> implementation is used in stable/9).
> >
> > If that's the explanation, how could it be
> > detected/measured/investigated/resolved/prevented?
> >
> > Under ordinary circumstances, machines will run like this for days/weeks:
> >
> > Mem: 549M Active, 3623M Inact, 567M Wired, 3484K Cache, 827M Buf, 3156M Free
> > Swap: 1024M Total, 1024M Free
> >
> > Then, when this happens, it rapidly degrades from that to so bad that
> > processes start getting killed for being out of swap space.
>
> These FreeBSD machines running out of swap space and dying continues
> to be a daily problem causing outages and unscheduled reboots.  Is
> there really no way to even research what might be causing the
> problem?
>
> (Widening the cross-posting in the hopes of eliciting more help, so
> the brief summary of the problem originally posted to freebsd-stable is
> that an unknown actor consumes all the user-space memory in the
> system, including swap space, to the point where processes are killed
> for being out of swap space, but if every process on the machine is
> stopped, very little of the user-space memory in use is freed.
> Original message with more details is here:
> https://lists.freebsd.org/pipermail/freebsd-stable/2015-March/081986.html
> .)
>
> There are no tmpfs mounts or md disks, so it would have to be one of
> the other causes.  How can FreeBSD's use of persistent, anonymous
> shared memory objects be investigated, measured, or controlled so we
> can get a handle on this issue?
>

This is just a shot in the dark and not a really likely one, but I have had
issues with Firefox leaking memory badly. I can free the space by killing
firefox and restarting it.

It seems to be linked to certain web sites, probably javascript. I have not
been able to confirm which one does it. It just will start growing until
the system slows to a crawl as too many things are swapped out. Normally my
system does not touch swap.

If it is in user space, top should show it under RES.
--
Kevin Oberman, Network Engineer, Retired
E-mail: rkober...@gmail.com


Re: Significant memory leak in 9.3p10?

2015-03-26 Thread J David
On Mon, Mar 16, 2015 at 7:52 PM, J David  wrote:
> On Mon, Mar 16, 2015 at 7:24 PM, Konstantin Belousov
>  wrote:
>> There are a lot of possibilities to create persistent anonymous shared
>> memory objects.  An incomplete list: tmpfs mounts, swap-backed md disks,
>> sysv shared memory, possibly posix shared memory (I do not remember which
>> implementation is used in stable/9).
>
> If that's the explanation, how could it be
> detected/measured/investigated/resolved/prevented?
>
> Under ordinary circumstances, machines will run like this for days/weeks:
>
> Mem: 549M Active, 3623M Inact, 567M Wired, 3484K Cache, 827M Buf, 3156M Free
> Swap: 1024M Total, 1024M Free
>
> Then, when this happens, it rapidly degrades from that to so bad that
> processes start getting killed for being out of swap space.

These FreeBSD machines running out of swap space and dying continues
to be a daily problem causing outages and unscheduled reboots.  Is
there really no way to even research what might be causing the
problem?

(Widening the cross-posting in the hopes of eliciting more help, so
the brief summary of the problem originally posted to freebsd-stable is
that an unknown actor consumes all the user-space memory in the
system, including swap space, to the point where processes are killed
for being out of swap space, but if every process on the machine is
stopped, very little of the user-space memory in use is freed.
Original message with more details is here:
https://lists.freebsd.org/pipermail/freebsd-stable/2015-March/081986.html
.)

There are no tmpfs mounts or md disks, so it would have to be one of
the other causes.  How can FreeBSD's use of persistent, anonymous
shared memory objects be investigated, measured, or controlled so we
can get a handle on this issue?

Thanks!


Re: Significant memory leak in 9.3p10?

2015-03-16 Thread J David
On Mon, Mar 16, 2015 at 7:24 PM, Konstantin Belousov
 wrote:
> There are a lot of possibilities to create persistent anonymous shared
> memory objects.  An incomplete list: tmpfs mounts, swap-backed md disks,
> sysv shared memory, possibly posix shared memory (I do not remember which
> implementation is used in stable/9).

If that's the explanation, how could it be
detected/measured/investigated/resolved/prevented?

Under ordinary circumstances, machines will run like this for days/weeks:

Mem: 549M Active, 3623M Inact, 567M Wired, 3484K Cache, 827M Buf, 3156M Free
Swap: 1024M Total, 1024M Free

Then, when this happens, it rapidly degrades from that to so bad that
processes start getting killed for being out of swap space.

Thanks!


Re: Significant memory leak in 9.3p10?

2015-03-16 Thread Konstantin Belousov
On Mon, Mar 16, 2015 at 06:59:33PM -0400, J David wrote:
> Recently we have seen a large-scale memory leak on amd64 machines
> running FreeBSD 9.3-RELEASE-p10.
> 
> This was first observed on 9.3p2 but has since shown up all the way through p10.
>
> Here's what the header of top shows:
>
> last pid: 32329;  load averages:  0.00,  0.01,  0.21    up 3+15:37:29  22:34:04
> 25 processes:  2 running, 22 sleeping, 1 waiting
> CPU: % user, % nice, % system, % interrupt, % idle
> Mem: 4072M Active, 895M Inact, 1284M Wired, 125M Cache, 826M Buf, 1521M Free
> Swap: 1024M Total, 874M Used, 149M Free, 85% Inuse
> 
> About 4G actively being used, another 895M inactive, and another 874M
> in swap.  So it seems like this is a user-space leak, rather than a
> kernel-space leak.
> 
> At the time of measurement, this machine was not doing anything and
> every possible process had been killed trying to find a culprit.  The
> entire output of "ps axlww" is:
> 
> UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN   STAT TT       TIME COMMAND
>   0     0     0   0 -52  0     0  224 -        DLs  ??    0:00.82 [kernel]
>   0     1     0   0  20  0  6280  556 wait     SLs  ??    0:00.57 /sbin/init --
>   0     2     0   0 -16  0     0   16 pftm     DL   ??    0:00.85 [pfpurge]
>   0     3     0   0 -16  0     0   16 waiting_ DL   ??    0:00.00 [sctp_iterator]
>   0     4     0   0 -16  0     0   16 -        DL   ??    0:00.00 [xpt_thrd]
>   0     5     0   0 -16  0     0   16 psleep   DL   ??    0:28.85 [pagedaemon]
>   0     6     0   0 -16  0     0   16 psleep   DL   ??    0:45.03 [vmdaemon]
>   0     7     0   0 -16  0     0   16 pollid   DL   ??    0:00.23 [idlepoll]
>   0     8     0   0 155  0     0   16 pgzero   DL   ??    0:00.00 [pagezero]
>   0     9     0   0 -16  0     0   16 psleep   DL   ??    0:00.83 [bufdaemon]
>   0    10     0   0 -16  0     0   16 audit_wo DL   ??    0:00.00 [audit]
>   0    11     0   0 155  0     0   32 -        RL   ?? 8317:13.37 [idle]
>   0    12     0   0 -76  0     0  240 -        WL   ??  301:43.54 [intr]
>   0    13     0   0  -8  0     0   48 -        DL   ??    0:09.89 [geom]
>   0    14     0   0 -16  0     0   16 -        DL   ??    2:58.88 [yarrow]
>   0    15     0   0 -68  0     0   64 -        DL   ??    0:02.32 [usb]
>   0    16     0   0 -16  0     0   16 vlruwt   DL   ??    0:06.35 [vnlru]
>   0    17     0   0  16  0     0   16 syncer   DL   ??    5:28.89 [syncer]
>   0    18     0   0 -16  0     0   16 sdflush  DL   ??    0:10.27 [softdepflush]
>   0    19     0   0 -16  0     0   16 -        DL   ??    0:55.09 [racctd]
>   0   830     1   0  20  0 45348 2396 wait     Is   u0    0:00.07 login [pam] (login)
> 500 32269   830   0  20  0 14556 2428 wait     S    u0    0:00.09 -sh (sh)
> 500 32340 32269   0  20  0 16296 1908 -        R+   u0    0:00.00 ps axlww
> 
> Since the issue doesn't seem related to kernel memory usage, the output of
> vmstat -m and -z is omitted here, but nothing in it jumps out as using
> gigs of RAM; it appears consistent with 1284M of wired memory, which is
> not unreasonable for the affected machines' tuning and workload.
> 
> The only user-space processes running are login, sh, and ps.  So where
> did 5.5G of userspace RAM go?
> 
> The only other potentially useful information is that when this
> happens, shutting down the system will hang for about ten minutes.
> 
> $ sudo halt -p
> Waiting (max 60 seconds) for system process `vnlru' to stop...done
> Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
> Waiting (max 60 seconds) for system process `syncer' to stop...
> Syncing disks, vnodes remaining...0 0 0 0 0 0 0 0 0 done
> All buffers synced.  <- 10 MINUTE HANG AFTER PRINTING THIS
> Uptime: 3d15h56m32s
> usbus0: Controller shutdown
> uhub0: at usbus0, port 1, addr 1 (disconnected)
> usbus0: controller did not stop
> usbus0: Controller shutdown complete
> acpi0: Powering system off
> Connection closed by foreign host.
> 
> So it seems like somewhere after "All buffers synced" and printing the
> uptime, it's very slowly unwinding whatever is using up all that RAM
> and swap.
> 
> Does anyone have any idea what might be causing this or how to fix/prevent it?

There are a lot of possibilities to create persistent anonymous shared
memory objects.  An incomplete list: tmpfs mounts, swap-backed md disks,
sysv shared memory, possibly posix shared memory (I do not remember which
implementation is used in stable/9).

I quite possibly missed some object types.  Also note that active/inactive
can be explained by cached file pages, and only the swap usage suggests that
it might be something persistent from the list above.


Significant memory leak in 9.3p10?

2015-03-16 Thread J David
Recently we have seen a large-scale memory leak on amd64 machines
running FreeBSD 9.3-RELEASE-p10.

This was first observed on 9.3p2 but has since shown up all the way through p10.

Here's what the header of top shows:

last pid: 32329;  load averages:  0.00,  0.01,  0.21    up 3+15:37:29  22:34:04
25 processes:  2 running, 22 sleeping, 1 waiting
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 4072M Active, 895M Inact, 1284M Wired, 125M Cache, 826M Buf, 1521M Free
Swap: 1024M Total, 874M Used, 149M Free, 85% Inuse

About 4G actively being used, another 895M inactive, and another 874M
in swap.  So it seems like this is a user-space leak, rather than a
kernel-space leak.

At the time of measurement, this machine was not doing anything and
every possible process had been killed trying to find a culprit.  The
entire output of "ps axlww" is:

UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN   STAT TT       TIME COMMAND
  0     0     0   0 -52  0     0  224 -        DLs  ??    0:00.82 [kernel]
  0     1     0   0  20  0  6280  556 wait     SLs  ??    0:00.57 /sbin/init --
  0     2     0   0 -16  0     0   16 pftm     DL   ??    0:00.85 [pfpurge]
  0     3     0   0 -16  0     0   16 waiting_ DL   ??    0:00.00 [sctp_iterator]
  0     4     0   0 -16  0     0   16 -        DL   ??    0:00.00 [xpt_thrd]
  0     5     0   0 -16  0     0   16 psleep   DL   ??    0:28.85 [pagedaemon]
  0     6     0   0 -16  0     0   16 psleep   DL   ??    0:45.03 [vmdaemon]
  0     7     0   0 -16  0     0   16 pollid   DL   ??    0:00.23 [idlepoll]
  0     8     0   0 155  0     0   16 pgzero   DL   ??    0:00.00 [pagezero]
  0     9     0   0 -16  0     0   16 psleep   DL   ??    0:00.83 [bufdaemon]
  0    10     0   0 -16  0     0   16 audit_wo DL   ??    0:00.00 [audit]
  0    11     0   0 155  0     0   32 -        RL   ?? 8317:13.37 [idle]
  0    12     0   0 -76  0     0  240 -        WL   ??  301:43.54 [intr]
  0    13     0   0  -8  0     0   48 -        DL   ??    0:09.89 [geom]
  0    14     0   0 -16  0     0   16 -        DL   ??    2:58.88 [yarrow]
  0    15     0   0 -68  0     0   64 -        DL   ??    0:02.32 [usb]
  0    16     0   0 -16  0     0   16 vlruwt   DL   ??    0:06.35 [vnlru]
  0    17     0   0  16  0     0   16 syncer   DL   ??    5:28.89 [syncer]
  0    18     0   0 -16  0     0   16 sdflush  DL   ??    0:10.27 [softdepflush]
  0    19     0   0 -16  0     0   16 -        DL   ??    0:55.09 [racctd]
  0   830     1   0  20  0 45348 2396 wait     Is   u0    0:00.07 login [pam] (login)
500 32269   830   0  20  0 14556 2428 wait     S    u0    0:00.09 -sh (sh)
500 32340 32269   0  20  0 16296 1908 -        R+   u0    0:00.00 ps axlww

Since the issue doesn't seem related to kernel memory usage, the output of
vmstat -m and -z is omitted here, but nothing in it jumps out as using
gigs of RAM; it appears consistent with 1284M of wired memory, which is
not unreasonable for the affected machines' tuning and workload.

The only user-space processes running are login, sh, and ps.  So where
did 5.5G of userspace RAM go?
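
The mismatch can be made concrete with a little arithmetic over the figures quoted above. The sketch below parses the Mem: and Swap: lines from the top header and compares them with the RSS column of the four user processes in the ps listing. The helper name parse_top_line is an assumption for illustration, and the total is an upper bound on the missing memory rather than an exact figure, since active and inactive pages can also include cached file data.

```python
import re


def parse_top_line(line):
    """Parse a top(1) 'Mem:' or 'Swap:' header line into {label: megabytes}."""
    units = {"K": 1 / 1024, "M": 1, "G": 1024}
    fields = {}
    # Matches pairs such as "4072M Active"; "85% Inuse" is skipped on purpose.
    for value, unit, label in re.findall(r"(\d+)([KMG]) (\w+)", line):
        fields[label] = int(value) * units[unit]
    return fields


mem = parse_top_line(
    "Mem: 4072M Active, 895M Inact, 1284M Wired, 125M Cache, 826M Buf, 1521M Free"
)
swap = parse_top_line("Swap: 1024M Total, 874M Used, 149M Free, 85% Inuse")

# RSS column (in KB) of the user processes in the ps listing: init, login, sh, ps.
user_rss_kb = [556, 2396, 2428, 1908]

in_use_mb = mem["Active"] + mem["Inact"] + swap["Used"]
attributed_mb = sum(user_rss_kb) / 1024

print(f"active + inactive + swap used: {in_use_mb:.0f} MB")
print(f"resident in user processes:    {attributed_mb:.1f} MB")
```

This prints 5841 MB against roughly 7.1 MB, so almost none of the memory in use can be attributed to a process that ps can see.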

The only other potentially useful information is that when this
happens, shutting down the system will hang for about ten minutes.

$ sudo halt -p
Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...0 0 0 0 0 0 0 0 0 done
All buffers synced.  <- 10 MINUTE HANG AFTER PRINTING THIS
Uptime: 3d15h56m32s
usbus0: Controller shutdown
uhub0: at usbus0, port 1, addr 1 (disconnected)
usbus0: controller did not stop
usbus0: Controller shutdown complete
acpi0: Powering system off
Connection closed by foreign host.

So it seems like somewhere after "All buffers synced" and printing the
uptime, it's very slowly unwinding whatever is using up all that RAM
and swap.

Does anyone have any idea what might be causing this or how to fix/prevent it?

Thanks in advance for any advice!