Re: killed: out of swap

2022-06-15 Thread Johnny Billquist

On 2022-06-15 06:57, Michael van Elst wrote:

> b...@softjar.se (Johnny Billquist) writes:
>
>> I don't see any realistic way of doing anything with that.
>> It's basically the first process that tries to allocate another page
>> when there are no more. There are no other processes at that moment in
>> time that have the problem, so why should any of them be considered?
>
> They might be the reason for the memory shortage. You can prefer large
> processes as victims or protect system services to keep the system
> manageable.


So when one process tries to grow, you'd kill a process that currently
has no issues running? Which means you might end up killing a lot of
non-problematic processes because of one runaway process? That doesn't
seem like a good decision to me.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: killed: out of swap

2022-06-15 Thread David Brownlee
On Wed, 15 Jun 2022 at 08:31, Johnny Billquist  wrote:
>
> On 2022-06-15 06:57, Michael van Elst wrote:
> > b...@softjar.se (Johnny Billquist) writes:
> >
> >> I don't see any realistic way of doing anything with that.
> >> It's basically the first process that tries to allocate another page
> >> when there are no more. There are no other processes at that moment in
> >> time that have the problem, so why should any of them be considered?
> >
> > They might be the reason for the memory shortage. You can prefer large
> > processes as victims or protect system services to keep the system
> > manageable.
>
> So when one process tries to grow, you'd kill a process that currently
> has no issues running? Which means you might end up killing a lot of
> non-problematic processes because of one runaway process? That doesn't
> seem like a good decision to me.

As opposed to the process which had a successful malloc some time ago
and is running without issues, and is just about to try to use some of
its existing allocation?

Both options are wrong in some cases. Having a way to influence the
order in which processes are chosen would seem to be the best way to
end up with a better outcome. The existing behaviour should remain an
option, but (at least for me) it would not be the one chosen.

David


Re: killed: out of swap

2022-06-15 Thread Johnny Billquist

On 2022-06-15 11:09, David Brownlee wrote:

> On Wed, 15 Jun 2022 at 08:31, Johnny Billquist  wrote:
>> On 2022-06-15 06:57, Michael van Elst wrote:
>>> b...@softjar.se (Johnny Billquist) writes:
>>>
>>>> I don't see any realistic way of doing anything with that.
>>>> It's basically the first process that tries to allocate another page
>>>> when there are no more. There are no other processes at that moment in
>>>> time that have the problem, so why should any of them be considered?
>>>
>>> They might be the reason for the memory shortage. You can prefer large
>>> processes as victims or protect system services to keep the system
>>> manageable.
>>
>> So when one process tries to grow, you'd kill a process that currently
>> has no issues running? Which means you might end up killing a lot of
>> non-problematic processes because of one runaway process? That doesn't
>> seem like a good decision to me.
>
> As opposed to the process which had a successful malloc some time ago
> and is running without issues, and is just about to try to use some of
> its existing allocation?


That is speculation, which is my problem here. You are trading a known
requester of non-existent memory for speculation that another process
*might* want non-existent memory.



> Both options are wrong in some cases. Having a way to influence the
> order in which processes are chosen would seem to be the best way to
> end up with a better outcome. The existing behaviour should remain an
> option, but (at least for me) it would not be the one chosen.


I (obviously) disagree. :-)

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: killed: out of swap

2022-06-15 Thread Michael van Elst
b...@softjar.se (Johnny Billquist) writes:

>> They might be the reason for the memory shortage. You can prefer large
>> processes as victims or protect system services to keep the system
>> manageable.

>So when one process tries to grow, you'd kill a process that currently
>has no issues running?


All processes have issues on that system, and the goal is to keep things
alive so that you can recover; a system hang, crash or reboot is the
worst outcome.

Obviously there is no heuristic that can predict what action will have
the best outcome and which causes the least damage. Guessing at the
cost of various kinds of damage is an impossible task by itself, as
that is fairly subjective.

But there can be a heuristic that helps in many cases, and for the rest
you can hint the system.
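
As a rough standalone sketch of the kind of heuristic and hinting being
discussed (this is not NetBSD's actual page-daemon code; the process
table, the hint values, and the scoring rule are all invented for
illustration), a victim score could combine a process's resident size
with a per-process hint that protects system services or marks
preferred victims:

/*
 * Illustrative only: rank OOM-kill candidates by resident size,
 * adjusted by an administrator-supplied hint.  The struct and the
 * hint semantics are hypothetical, not an existing NetBSD interface.
 */
#include <stddef.h>
#include <stdio.h>

struct proc_info {
	const char *name;
	size_t rss_pages;	/* resident set size, in pages */
	int hint;		/* <0: protect, 0: neutral, >0: prefer as victim */
};

static long
badness(const struct proc_info *p)
{
	if (p->hint < 0)	/* protected system service: never chosen */
		return -1;
	/* larger processes score higher; a positive hint doubles the score */
	return (long)p->rss_pages << (p->hint > 0 ? 1 : 0);
}

int
main(void)
{
	struct proc_info procs[] = {
		{ "syslogd",  200, -1 },	/* protected */
		{ "Xorg",    8000,  0 },
		{ "runaway", 9000,  1 },	/* preferred victim */
	};
	const struct proc_info *victim = NULL;
	long best = -1;

	for (size_t i = 0; i < sizeof(procs) / sizeof(procs[0]); i++) {
		long b = badness(&procs[i]);
		if (b > best) {
			best = b;
			victim = &procs[i];
		}
	}
	if (victim != NULL)
		printf("would kill %s\n", victim->name);
	return 0;
}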




Re: killed: out of swap

2022-06-15 Thread Brian Buhrow
hello.  One algorithm I think might be a good option, and which would
address the concerns I've seen on this topic, is to kill the process with
the latest start time when looking for resources to free.  Obviously this
would fail in the case where a long-running process suddenly starts
consuming memory, but in general it seems that badly behaved processes
would typically begin behaving badly from their inception.  In any case,
it should address the issue of X or syslogd getting killed, since they
would often have older start times than the processes that cause the
trouble.

-Brian
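
A minimal standalone sketch of that start-time rule (the process names
and start times below are invented): scan the candidates and pick the
one that started most recently.

/* Sketch of "kill the newest process" victim selection; data is made up. */
#include <stddef.h>
#include <stdio.h>
#include <time.h>

struct candidate {
	const char *name;
	time_t start;		/* process start time (seconds) */
};

int
main(void)
{
	struct candidate c[] = {
		{ "syslogd", 1000 },	/* started long ago */
		{ "Xorg",    2000 },
		{ "hog",     9000 },	/* started most recently */
	};
	size_t newest = 0;

	for (size_t i = 1; i < sizeof(c) / sizeof(c[0]); i++)
		if (c[i].start > c[newest].start)
			newest = i;
	printf("would kill %s\n", c[newest].name);
	return 0;
}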


Re: killed: out of swap

2022-06-15 Thread Johnny Billquist




On 2022-06-15 16:01, Michael van Elst wrote:

> b...@softjar.se (Johnny Billquist) writes:
>
>>> They might be the reason for the memory shortage. You can prefer large
>>> processes as victims or protect system services to keep the system
>>> manageable.
>
>> So when one process tries to grow, you'd kill a process that currently
>> has no issues running?
>
> All processes have issues on that system, and the goal is to keep things
> alive so that you can recover; a system hang, crash or reboot is the
> worst outcome.


Maybe, but not definitely.

And the outcome is, in general, processes being killed, which basically
should never result in an outright crash or reboot. Not even a hang,
although if the wrong process is killed, you might end up not being able
to access the system, so it's a bit of a grey area.



> Obviously there is no heuristic that can predict what action will have
> the best outcome and which causes the least damage. Guessing at the
> cost of various kinds of damage is an impossible task by itself, as
> that is fairly subjective.


Agreed. But the one thing that is known at that specific point in time is
that there is one process that needed one more page, which could not be
satisfied. All the other processes at that moment in time are not in
trouble. Which also means we do not know if killing another process is
enough to keep this process going, and we do not know if that other
process would ever get into trouble at all. So we are faced with the
choice of killing the one process we know is in trouble, or speculatively
killing something else and then hoping that will help.


The suggestion that we'd add some kind of hinting could at least help 
some, but it is rather imperfect. And if we don't have any hints, we're 
back in the same place again.



> But there can be a heuristic that helps in many cases, and for the rest
> you can hint the system.


If you can come up with some heuristics, it would be interesting to see 
them. I don't see any easy ones.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: killed: out of swap

2022-06-15 Thread Johnny Billquist
By the way. This obviously does not at all solve the problem that the OP
had. He was writing code with the expectation that malloc() should fail.
For this, we need something that will not allow overcommitting memory,
and where malloc() can then return an error instead of the process
getting killed.


A killed process won't make the OP happy, even if it was his own 
program/process.
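
As an aside, a per-process workaround (not the system-wide no-overcommit
behaviour being asked for here) is to cap the process's own address
space with setrlimit(2), so that malloc() returns NULL instead of the
whole system running out of swap.  A minimal sketch, assuming the
platform enforces RLIMIT_AS on anonymous mappings; the 64 MB limit and
1 MB chunk size are arbitrary:

/*
 * Cap this process's address space so malloc() fails cleanly, then
 * allocate (and touch) memory until it does.
 */
#include <sys/resource.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
	struct rlimit rl = {
		.rlim_cur = 64UL * 1024 * 1024,
		.rlim_max = 64UL * 1024 * 1024
	};
	size_t chunk = 1024 * 1024, total = 0;
	void *p;

	if (setrlimit(RLIMIT_AS, &rl) == -1) {
		perror("setrlimit");
		return 1;
	}
	while ((p = malloc(chunk)) != NULL) {
		memset(p, 0xa5, chunk);		/* actually touch the pages */
		total += chunk;
	}
	printf("malloc failed after about %zu MB\n", total / (1024 * 1024));
	return 0;
}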


  Johnny

On 2022-06-15 17:41, Johnny Billquist wrote:



 [...]



--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: killed: out of swap

2022-06-15 Thread Mouse
> By the way.  This obviously does not at all solve the problem that
> the OP had.  He was writing code with the expectation that malloc()
> should fail.  [...]  A killed process won't make the OP happy, even
> if it was his own program/process.

I'm not sure that last sentence is true.  As I read it, the reaction to
malloc failing would have been that we've put as much pressure on as we
can, so it's time to exit.  If so, killing that process is a reasonable
reaction - and that's what my proposed approaches were based on.
Perhaps my understanding is wrong, in which case not much will help
except no overcommit.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML   mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: killed: out of swap

2022-06-15 Thread Johnny Billquist

On 2022-06-15 18:41, Mouse wrote:

>> By the way.  This obviously does not at all solve the problem that
>> the OP had.  He was writing code with the expectation that malloc()
>> should fail.  [...]  A killed process won't make the OP happy, even
>> if it was his own program/process.
>
> I'm not sure that last sentence is true.  As I read it, the reaction to
> malloc failing would have been that we've put as much pressure on as we
> can, so it's time to exit.  If so, killing that process is a reasonable
> reaction - and that's what my proposed approaches were based on.
> Perhaps my understanding is wrong, in which case not much will help
> except no overcommit.


I might have misunderstood the purpose. I thought there was a desire to
stress things a bit once you get to the memory-full state, to see how
things behave.


So you might be absolutely right.

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: killed: out of swap

2022-06-15 Thread Edgar Fuß
> Perhaps my understanding is wrong
No.


Dell PERC H750

2022-06-15 Thread Mark Davies
I have a machine with a Dell PERC H750 RAID card.  I'd like to get it
working under NetBSD.



When FreeBSD added support to their driver (mrsas) it looks like they 
did it with these three patches:


https://cgit.freebsd.org/src/commit/sys/dev/mrsas?id=2909aab4cfc296bcf83fa3e87ed41ed1f4244fea

https://cgit.freebsd.org/src/commit/sys/dev/mrsas?id=b518670c218c4e2674207e946d1b9a70502c5451

https://cgit.freebsd.org/src/commit/sys/dev/mrsas?id=e315cf4dc4d167d9f2e34fe03cd79468f035a6e8

The first patch seems to treat it the same as the previous-generation
card, and then the other two patches add changes specific to the "aero"
generation of cards.



If I add the following to our mfii driver:

--- mfii.c  17 May 2022 10:29:47 -  1.4.4.1
+++ mfii.c  8 Jun 2022 04:22:54 -
@@ -604,6 +604,8 @@
{ PCI_VENDOR_SYMBIOS,   PCI_PRODUCT_SYMBIOS_MEGARAID_3416,
&mfii_iop_35 },
{ PCI_VENDOR_SYMBIOS,   PCI_PRODUCT_SYMBIOS_MEGARAID_3516,
+   &mfii_iop_35 },
+   { PCI_VENDOR_SYMBIOS,   PCI_PRODUCT_SYMBIOS_MEGARAID_39XX_3,
&mfii_iop_35 }
 };



to add the card and initially treat it the same as the previous-gen
card, then on startup I detect the card but never get an "sd" disk
attached:


 [...]
[ 1.058596] mfii0 at pci6 dev 0 function 0: "PERC H750 Adapter", 
firmware 52.16.1-4074, 8192MB cache

[ 1.058596] mfii0: interrupting at ioapic2 pin 2
[ 1.058596] scsibus0 at mfii0: 240 targets, 8 luns per target
 [...]
[ 1.418319] mfii0: physical disk inserted id 64 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 0 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 1 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 2 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 3 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 4 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 5 enclosure 64
 [...]



On another machine with an H730 installed I see:

mfii0 at pci6 dev 0 function 0: "PERC H730P Adapter", firmware 
25.5.6.0009, 2048MB cache

mfii0: interrupting at ioapic2 pin 2
scsibus0 at mfii0: 64 targets, 8 luns per target
 [...]
mfii0: physical disk inserted id 32 enclosure 32
mfii0: physical disk inserted id 0 enclosure 32
mfii0: physical disk inserted id 1 enclosure 32
mfii0: physical disk inserted id 2 enclosure 32
mfii0: physical disk inserted id 3 enclosure 32
 [...]
sd0 at scsibus0 target 0 lun 0:  disk fixed
sd0: fabricating a geometry
sd0: 2681 GB, 2745600 cyl, 64 head, 32 sec, 512 bytes/sect x 5622988800 
sectors

sd0: fabricating a geometry
sd0: GPT GUID: 92f5aca9-29d3-4c7e-8c41-85fb2df819d6
 [...]



Any suggestions on what the equivalent of the FreeBSD patches 2 and 3
would be, or anything else I may need to do to get this going (or why
this approach won't work), would be appreciated.
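
As background on what the mfii.c hunk above accomplishes (and why the
generation-specific parts of the second and third FreeBSD patches remain
a separate question), the driver simply walks a vendor/product table and
claims the card when a PCI ID pair matches, reusing the iop parameters
named in the matching entry.  The following standalone sketch
illustrates that lookup only; the numeric IDs and the table contents are
invented for the example and are not the real pcidevs values:

/* Illustrative device-match lookup; IDs and entries are made up. */
#include <stddef.h>
#include <stdio.h>

struct dev_match {
	unsigned vendor;
	unsigned product;
	const char *iop;	/* stand-in for &mfii_iop_35 and friends */
};

static const struct dev_match mfii_devices[] = {
	{ 0x1000, 0x0014, "iop_35" },	/* hypothetical previous-gen entry */
	{ 0x1000, 0x10e2, "iop_35" },	/* hypothetical newly added entry */
};

static const struct dev_match *
lookup(unsigned vendor, unsigned product)
{
	for (size_t i = 0; i < sizeof(mfii_devices) / sizeof(mfii_devices[0]); i++)
		if (mfii_devices[i].vendor == vendor &&
		    mfii_devices[i].product == product)
			return &mfii_devices[i];
	return NULL;
}

int
main(void)
{
	const struct dev_match *m = lookup(0x1000, 0x10e2);

	printf("%s\n", m != NULL ? m->iop : "no match");
	return 0;
}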


cheers
mark