> Did those jobs share nodes -- sometimes two or more jobs using the same
> nodes? I am sure SGI has such users too, though such job mixes make
> the runtimes of specific jobs less obvious, so customers are more
> tolerant of variations and some inefficiencies, as they get hidden in
> the mix.
Hm
Kosaki-san wrote:
> Yes.
> Fujitsu HPC middleware watches the total memory consumption of the job
> and, if over-consumption happens, kills the process and removes the job from the schedule.
Did those jobs share nodes -- sometimes two or more jobs using the same
nodes? I am sure SGI has such users too, though such j
Rik wrote:
> In that case the user is better off having that job killed and
> restarted elsewhere, than having all of the jobs on that node
> crawl to a halt due to swapping.
>
> Paul, is this guess correct? :)
Not for the loads I focus on. Each job gets exclusive use of its own
dedicated set of
Hi Rik
> > Sounds like a job for memory limits (ulimit?), not for OOM
> > notification, right?
>
> I suspect one problem could be that an HPC job scheduling program
> does not know exactly how much memory each job can take, so it can
> sometimes end up making a mistake and overcommitting the memo
On Tue, 19 Feb 2008 23:28:28 +0100
Pavel Machek <[EMAIL PROTECTED]> wrote:
> Sounds like a job for memory limits (ulimit?), not for OOM
> notification, right?
I suspect one problem could be that an HPC job scheduling program
does not know exactly how much memory each job can take, so it can
somet
Pavel, responding to pj:
> > There is not much my customers' HPC jobs can do with notification before
> > swap. Their jobs either have the main memory they need to perform the
> > requested calculations with the desired performance, or their job is
> > useless and should be killed. Unlike the appl
On Tue 2008-02-19 09:00:08, Paul Jackson wrote:
> Kosaki-san wrote:
> > Thank you for the wonderful, interesting comment.
>
> You're most welcome. The pleasure is all mine.
>
> > You think the process should be killed just after it swaps, right?
> > Unfortunately, almost all users hope to receive the notification before sw
pj, talking to himself:
> Of course
> for embedded use, I'd have to adapt it to a non-cpuset based mechanism
> (not difficult), as embedded definitely doesn't do cpusets.
I'm forgetting an important detail here. Kosaki-san has clearly stated
that this hook, at vmscan's writepage, is too late for
Rik wrote:
> Basically in all situations, the kernel needs to warn at the same point
> in time: when the system is about to run out of RAM for anonymous pages.
>
> ...
>
> In the HPC case, it leads to swapping (and a management program can kill or
> restart something else).
Thanks for stopping by
On Tue, 19 Feb 2008 09:00:08 -0600
Paul Jackson <[EMAIL PROTECTED]> wrote:
> Depending on what we're trying to do:
> 1) warn applications of swap coming soon (your case),
> 2) show how close we are to swapping,
> 3) show how much swap has happened already,
> 4) kill instantly if try to swap (m
Kosaki-san wrote:
> Thank you for the wonderful, interesting comment.
You're most welcome. The pleasure is all mine.
> You think the process should be killed just after it swaps, right?
> Unfortunately, almost all users hope to receive the notification before swap ;-)
> because they want to avoid swap.
There is not much my customers H
Hi Paul,
Thank you for the wonderful, interesting comment.
Your comment is really nice.
I was an HPC guy with a large NUMA box in the past.
I promise I won't ignore HPC users.
Unfortunately, I have no experience using cpusets,
because at that time they were still under development.
I hope discuss you that
I just noticed this patchset, kosaki-san. It looks quite interesting;
my apologies for not commenting earlier.
I see mention somewhere that mem_notify is of particular interest to
embedded systems.
I have what seems, intuitively, a similar problem at the opposite
end of the world, on big-honkin
> > The Linux Today article is a very nice description (great work by Jake Edge).
> > http://www.linuxworld.com/news/2008/020508-kernel.html
>
> Just for future reference...the above-mentioned article is from LWN,
> syndicated onto LinuxWorld. It has, so far as I know, never been near
> Linux Today
Hi Rik
> More importantly, all gtk+ programs, as well as most databases and other
> system daemons have a poll() loop as their main loop.
Not only gtk+; maybe all modern GUI programs :)
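
For readers unfamiliar with the pattern being discussed, here is a minimal sketch of a poll()-based wait on a notification descriptor. It is only an illustration, not code from the patch series: the helper name wait_for_notification is hypothetical, and a pipe stands in for whatever descriptor the notification interface would actually expose.

```python
import os
import select


def wait_for_notification(fd, timeout_ms):
    """Return True if fd became readable within timeout_ms, else False."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(ready_fd == fd and (mask & select.POLLIN)
               for ready_fd, mask in events)


# Assumption: a pipe stands in for the real notification source.
read_end, write_end = os.pipe()

# Nothing has been written yet, so the wait times out.
quiet = wait_for_notification(read_end, 10)

# Writing one byte plays the role of the kernel signalling an event.
os.write(write_end, b"x")
signalled = wait_for_notification(read_end, 10)

os.close(read_end)
os.close(write_end)
```

A program that already has a poll() main loop would simply register the extra descriptor alongside its existing ones instead of calling a separate helper.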
On Sun, 10 Feb 2008 01:33:49 +0900
"KOSAKI Motohiro" <[EMAIL PROTECTED]> wrote:
> > Where is the netlink interface? Polling an FD is so last century :)
>
> To be honest, I don't know of anyone who uses netlink, or why anyone
> would hope to receive low-memory notification via netlink.
>
> poll() is the old way, but it works well
Hi
> Interesting patch series (I am being yuppie and reading this thread
> from my iPhone on a treadmill at the gym - so further comments later).
> I think that this is broadly along the lines that I was thinking, but
> this should be an RFC only patch series for now.
Sorry, I will fix that in the next post.
Yo,
Interesting patch series (I am being yuppie and reading this thread
from my iPhone on a treadmill at the gym - so further comments later).
I think that this is broadly along the lines that I was thinking, but
this should be an RFC only patch series for now.
Some initial questions:
Wh