David,

As far as I've found, this was changed because there were a lot of these messages in dmesg:

WARNING: mclpools limit reached; increase kern.maxclusters

before hitting the panic. Does this mean it is an unhandled panic that can now be fixed?

Btw, how can we calculate a "safe value"?

--
Evgeniy

On Tue, Apr 7, 2015 at 10:58 AM, David Gwynne <da...@gwynne.id.au> wrote:
>
>> On 7 Apr 2015, at 6:38 pm, Evgeniy Sudyr <eject.in...@gmail.com> wrote:
>>
>> David,
>>
>> yes, there are the following changes in sysctl.conf, but kernel options
>> were untouched (again it was GENERIC.MP -stable).
>>
>> $ cat /etc/sysctl.conf
>> net.inet.ip.forwarding=1
>> net.inet.carp.preempt=1
>> net.inet6.ip6.forwarding=1
>> kern.maxfiles=5048026
>> kern.maxclusters=2000000
>
> why did you raise those last two values?
>
> dlg
>
> ps. that last one is the cause of your panics.
>
>>
>>
>> On Tue, Apr 7, 2015 at 9:37 AM, David Gwynne <da...@gwynne.id.au> wrote:
>>>
>>>> On 6 Apr 2015, at 05:32, Evgeniy Sudyr <eject.in...@gmail.com> wrote:
>>>>
>>>> Mark, I will dig into this.
>>>>
>>>> Sorry, but can someone give a hint as to which values for pools there
>>>> are "unusual" and could be related to the kernel panic I've reported
>>>> at the very beginning?
>>>>
>>>> Current vmstat -m output is:
>>>
>>> the abbreviated version below is kind of interesting.
>>>
>>> are you setting the kern.maxclusters sysctl? if so, to what value?
>>>
>>>>
>>>> Memory Totals:  In Use    Free    Requests
>>>>                 76695K    862K    24831415
>>>> Memory resource pool statistics
>>>> Name   Size   Requests Fail InUse Pgreq Pgrel Npage Hiwat Minpg   Maxpg Idle
>>>> mbpl    256 2741641011    0   346  4789     0     0  4789     1  125000 4767
>>>> mcl2k  2048 1108887843    0   183 10052     0     0 10052     4 1000000 9959
>>>>
>>>> In use 210238K, total allocated 0K; utilization inf%
>>>>
>>>>
>>>> Will update if I find something...
>>>>
>>>> On Sun, Apr 5, 2015 at 6:59 PM, Mark Kettenis <mark.kette...@xs4all.nl> wrote:
>>>>>> Date: Sun, 5 Apr 2015 18:44:43 +0200
>>>>>> From: Evgeniy Sudyr <eject.in...@gmail.com>
>>>>>>
>>>>>> Stuart,
>>>>>>
>>>>>> as part of troubleshooting, BIOS was upgraded from R 3.0 to latest R 3.2
>>>>>>
>>>>>> http://www.supermicro.com/products/motherboard/Xeon/C600/X9SRW-F.cfm
>>>>>> X9SRW5.115
>>>>>>
>>>>>> How likely is it that it hit a bug which was fixed in the latest BIOS
>>>>>> release and this will not occur again? Did you notice something we
>>>>>> can check with Supermicro support to make sure?
>>>>>
>>>>> So far I've not seen any real evidence that the BIOS is causing
>>>>> problems. Ted noticed the higher-than-usual ACPI memory usage,
>>>>> suggesting a memory leak. This made Stuart suggest that it might be
>>>>> worth updating your BIOS. But we haven't actually established that
>>>>> there is indeed a memory leak. In fact the information you posted
>>>>> earlier suggests that there is no ACPI memory leak, or at least not
>>>>> one directly related to executing AML.
>>>>>
>>>>> You'll really need to do some digging yourself here. Look at the
>>>>> vmstat -m output immediately after booting your machine. Then keep
>>>>> looking at it periodically and identify the memory types and pools
>>>>> that keep growing. For malloc'ed memory look at the "MemUse" column
>>>>> under "Memory statistics by type". For pools, look at the "InUse"
>>>>> column under "Memory resource pool statistics".
>>>>
>>>>
>>>>
>>>> --
>>>> --
>>>> With regards,
>>>> Eugene Sudyr
>>>>
>>>
>>
>>
>>
>> --
>> --
>> With regards,
>> Eugene Sudyr
>

--
With regards,
Eugene Sudyr
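
A minimal sketch of the check Mark describes in the quoted message (compare the "InUse" column under "Memory resource pool statistics" across vmstat -m snapshots and see which pools keep growing), written in Python. The function names and the numbers in the second snapshot are made up for illustration; only the first snapshot's values come from the output quoted in this thread. The parser assumes each pool row sits on a single line with the same number of fields as the header.

```python
# Hypothetical helper names; not part of vmstat or any OpenBSD tool.

def parse_pools(vmstat_text):
    """Return {pool name: InUse} from the pool section of `vmstat -m` text."""
    in_use = {}
    header = None
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The pool table starts at the header row that contains "InUse".
        if fields[0] == "Name" and "InUse" in fields:
            header = fields
            continue
        # A pool row has one field per header column and a numeric Size.
        if header and len(fields) == len(header) and fields[1].isdigit():
            row = dict(zip(header, fields))
            in_use[row["Name"]] = int(row["InUse"])
    return in_use


def growing_pools(before, after):
    """Pools whose InUse grew between two snapshots, largest growth first."""
    a, b = parse_pools(before), parse_pools(after)
    growth = {n: b[n] - a[n] for n in b if n in a and b[n] > a[n]}
    return sorted(growth.items(), key=lambda kv: kv[1], reverse=True)


HEADER = "Name Size Requests Fail InUse Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle\n"

# First snapshot: values taken from the output quoted in this thread.
BOOT = (
    "Memory resource pool statistics\n" + HEADER +
    "mbpl 256 2741641011 0 346 4789 0 0 4789 1 125000 4767\n"
    "mcl2k 2048 1108887843 0 183 10052 0 0 10052 4 1000000 9959\n"
    "\n"
    "In use 210238K, total allocated 0K; utilization inf%\n"
)

# Second snapshot: invented numbers, only to show a growing pool.
LATER = (
    "Memory resource pool statistics\n" + HEADER +
    "mbpl 256 2741650000 0 346 4789 0 0 4789 1 125000 4767\n"
    "mcl2k 2048 1108990000 0 950 10052 0 0 10052 4 1000000 9192\n"
)

for name, delta in growing_pools(BOOT, LATER):
    print(f"{name}: InUse grew by {delta}")
```

In practice the two strings would be saved vmstat -m outputs taken right after boot and some time later; a pool that shows up in every comparison is the one worth reporting.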