Re: OOM killer problem - how to read the kernel log?

2007-11-09 Thread Douglas McNaught
"Tobias Brox" <[EMAIL PROTECTED]> writes:

> We have a database server which we've had some problems with, we've
> had some serious crashes (particularly during postgres vacuuming)
> which couldn't be fixed without hardware reboot.  We assumed that the
> problem would go away by itself after upgrading the server to a new
> box with 32G of RAM ... but, alas, one of the first days in operation
> it first killed lots of postgres children during the evening, and
> finally it crashed and required hardware reboot.  The box is certainly
> not out of memory, but after reading some posts here, I've understood
> that memory management is a complex issue.
>
> We're running 32-bit Linux; maybe it would help to upgrade to 64-bit?

Big-time.  There are a lot of "artificial" internal kernel memory
limits in the 32-bit architecture, so you can run up against those
(triggering the OOM killer) even if you have plenty of application
memory.

32GB in the box actually might be counter-productive unless you're
running 64-bit, because just tracking that much memory will consume
precious low kernel RAM.
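
As a rough illustration of the tracking cost, here is a back-of-the-envelope
sketch in C.  The constants are assumptions, not figures from this thread:
4 KiB pages, about 32 bytes of per-page bookkeeping (struct page) on a 32-bit
kernel, and roughly 896 MiB of directly mapped low memory with the usual
3G/1G split.  The function name is made up for the example.

```c
#include <stdint.h>

/* All three constants are assumptions for illustration only. */
#define PAGE_SIZE_BYTES   4096ULL        /* assumed page size */
#define STRUCT_PAGE_BYTES 32ULL          /* assumed per-page bookkeeping cost */
#define LOWMEM_BYTES      (896ULL << 20) /* assumed low memory, 3G/1G split */

/* Bytes of low memory needed just to describe every page of RAM. */
uint64_t page_tracking_bytes(uint64_t ram_bytes)
{
    return (ram_bytes / PAGE_SIZE_BYTES) * STRUCT_PAGE_BYTES;
}

/* Share of low memory (in percent, rounded down) used by that bookkeeping. */
unsigned lowmem_percent_used(uint64_t ram_bytes)
{
    return (unsigned)(page_tracking_bytes(ram_bytes) * 100 / LOWMEM_BYTES);
}
```

Under those assumptions, 32 GiB of RAM needs 256 MiB of page descriptors,
which is more than a quarter of low memory gone before any other kernel
allocation is made.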

-Doug
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: yield API

2007-10-02 Thread Douglas McNaught
"linux-os \(Dick Johnson\)" <[EMAIL PROTECTED]> writes:

> Whether or not there is a POSIX definition of sched_yield(),
> there is a need for something that will give up the CPU
> and not busy-wait. There are many control applications
> where state-machines are kept in user-mode code. The code
> waits for an event. It shouldn't be spinning, wasting
> CPU time, when the kernel can be doing file and network
> I/O with the wasted CPU cycles.

These "control applications" would be real-time processes, for which
(AIUI) sched_yield() behavior is completely well-defined and
implemented as such by Linux.  The question here is how useful the
call is for SCHED_OTHER (non-real-time) processes, for which it has no
well-defined semantics.
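
A minimal sketch of that distinction, assuming a Linux/POSIX system: under
SCHED_FIFO and SCHED_RR, POSIX specifies that sched_yield() moves the caller
to the tail of the run list for its priority, so a program can check its own
policy before relying on the call.  The helper names are hypothetical.

```c
#include <sched.h>

/* Human-readable name for the scheduling policies discussed here. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_OTHER: return "SCHED_OTHER";
    default:          return "unknown";
    }
}

/* Nonzero if sched_yield() has well-defined semantics under this policy. */
int yield_is_well_defined(int policy)
{
    return policy == SCHED_FIFO || policy == SCHED_RR;
}

/* Yield only when the calling process runs under a real-time policy.
 * Returns 1 if it yielded, 0 if it did not, -1 on error. */
int yield_if_realtime(void)
{
    int policy = sched_getscheduler(0);   /* 0 means the calling process */
    if (policy < 0)
        return -1;
    if (!yield_is_well_defined(policy))
        return 0;
    return sched_yield() == 0 ? 1 : -1;
}
```

A SCHED_OTHER process that needs to wait for an event is usually better off
blocking in the kernel (for example in poll() or read()) than yielding in a
loop.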

-Doug


Stracing Amanda (was: RSDL for 2.6.21-rc3- 0.29)

2007-03-12 Thread Douglas McNaught
Patrick Mau <[EMAIL PROTECTED]> writes:

> Why not temporarily replace "/bin/tar" with a shell script that does:
>
> #!/bin/sh
> exec strace -f -o output /bin/real.tar "$@"

You beat me to it.  :) I've done that before; it's a great suggestion.

Except that if you expect 'tar' to be invoked multiple times in a run,
you should probably use 'output.$$' for the output filename so things
don't get clobbered.

-Doug


Re: RSDL for 2.6.21-rc3- 0.29

2007-03-12 Thread Douglas McNaught
Gene Heskett <[EMAIL PROTECTED]> writes:

> On Monday 12 March 2007, Douglas McNaught wrote:
>>Gene Heskett <[EMAIL PROTECTED]> writes:
>>> I'd considered it, but with 32 dle entries, the whole strace output
>>> would be terabytes & I don't have THAT much disk.  Not to mention it
>>> traces only the parent process, so tar would be merrily marching along
>>> to its own drummer and not traced, I'm afraid.
>>
>>$ strace -ff
>>
>>-Doug
>
> Someone else suggested the single -f, and I tried that, but even with the
> shell history set for 100,000 lines, I can't get back to the start, and I
> think it's mucking with the shell arguments numbering, as what I can see is
> about 5 reads through /etc/services accompanied by endless complaints
> of -EBADFD.  The logfile it generates says the port it was given was
> rejected when amcheck was run; here is that snip:

I'd do 'strace -ff -o /tmp/amanda-strace ', which will give
you a set of files in /tmp, one for each PID created by fork().  Then
find the one that has the 'tar' invocation you're looking for.

-Doug


Re: RSDL for 2.6.21-rc3- 0.29

2007-03-12 Thread Douglas McNaught
Gene Heskett <[EMAIL PROTECTED]> writes:

> I'd considered it, but with 32 dle entries, the whole strace output would
> be terabytes & I don't have THAT much disk.  Not to mention it traces
> only the parent process, so tar would be merrily marching along to its
> own drummer and not traced, I'm afraid.

$ strace -ff 

-Doug


Re: RSDL for 2.6.21-rc3- 0.29

2007-03-12 Thread Douglas McNaught
Gene Heskett <[EMAIL PROTECTED]> writes:

> If, and I have previously, I revert to a 2.6.20-ck1 patching, this does 
> not occur.  So my contention is that someplace in this recent progression 
> from 2.6.20 to 2.6.21-rc3, there is a patch which acts to change how 
> c-time is being reported to tar.  Or there is a spillage into c-times 
> when tar does its estimate scans where the output goes to /dev/null.
> Or possibly even this version of tar is doing it differently.  I just 
> looked up how to get the c-times out of ls, and they, as far as ls is 
> concerned, look sane.  But tar's actions while running a 2.6.21-rcX kernel
> certainly are not.  I do have a plain -rc2 I can try, so that will be the 
> next test.  If that also fails in this manner, I'll build a later 
> 2.6.20-2 or whatever to verify that it doesn't so suffer.

You may find 'strace' useful to track down this sort of thing (though
the output can be voluminous).

-Doug


Re: [PATCH] WorkStruct: Implement generic UP cmpxchg() where an arch doesn't support it

2006-12-06 Thread Douglas McNaught
Matthew Wilcox <[EMAIL PROTECTED]> writes:

> On Wed, Dec 06, 2006 at 05:36:29PM -0800, Linus Torvalds wrote:
>> Or are you saying that gcc aligns normal 32-bit entities at 
>> 16-bit alignment? Neither of those sound very likely.
>
> alignof(u32) is 2 on m68k.  Crazy, huh?

The original 68000 had a 16-bit bus (but 32-bit registers), which is
probably why it's that way.
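
To see what a given compiler and target actually do, a small probe like the
one below can help.  It is only a sketch, assuming a C11 compiler; the struct
and function names are made up for the example, and the values it reports
depend entirely on the ABI being compiled for.

```c
#include <stddef.h>
#include <stdint.h>

/* A char followed by a 32-bit integer: the padding the compiler inserts
 * before 'v' reveals the alignment it requires for uint32_t. */
struct align_probe {
    char     c;
    uint32_t v;
};

/* Alignment the compiler requires for a 32-bit integer on this target. */
size_t u32_alignment(void)
{
    return _Alignof(uint32_t);
}

/* Offset of the 32-bit member; equals the alignment, since 'c' sits at 0. */
size_t u32_offset_after_char(void)
{
    return offsetof(struct align_probe, v);
}
```

On most 32-bit and 64-bit targets both functions return 4; on an ABI with
2-byte alignment for 32-bit integers, as described above for m68k, they would
return 2.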

-Doug


Re: PROBLEM: Incorrect RAM Detected at kernel init

2005-08-22 Thread Douglas McNaught
"Terry" <[EMAIL PROTECTED]> writes:

> The kernel appears to compile perfectly, installs fine, but after reboot it
> is only reporting 16M of RAM. I have tried with and without the mem=768M

I've seen this happen with BIOSes of your vintage when there's a
"memory hole at 16M" turned on--the kernel doesn't see anything beyond
it.  See if you can get into the Setup program and turn that off.

Since earlier kernels work, the later kernels are probably trusting
the e820 tables which may not be set up properly...

[Not that I know that much about this stuff]

-Doug


Re: Binding a thread (or specific process) to a designated CPU

2005-08-22 Thread Douglas McNaught
"Brian D. McGrew" <[EMAIL PROTECTED]> writes:

> Good morning,
>
> Using FC3 or FC4 with the 2.6.9 or later kernel, we're looking for a way
> to bind a thread (or an entire process) to a designated CPU.  We're
> using dual processor systems as well as P4 with HT and Xeons so all of
> our boxes either have two CPU's or 'appear' to have two.
>
> I want to be able, in my C++ code to designate a specific thread to a
> specific processor.  I've heard rumors that with the 2.6 kernel this is
> now possible???

Look into sched_setaffinity() and friends.
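
A minimal sketch of how that call can be used from C (it works the same from
C++), assuming glibc with _GNU_SOURCE defined.  The helper names are
hypothetical; passing 0 as the pid applies the mask to the calling thread.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Lowest-numbered CPU the calling thread may currently run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Restrict the calling thread to a single CPU.  Returns 0 on success,
 * -1 on error (errno is set by sched_setaffinity). */
int pin_calling_thread(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling pin_calling_thread(1) from a worker thread would keep that thread on
the second CPU.  For threads created with pthreads, glibc also provides
pthread_setaffinity_np(), which takes a pthread_t instead of a pid.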

-Doug


Re: Environment variables inside the kernel?

2005-08-18 Thread Douglas McNaught
Guillermo López Alejos <[EMAIL PROTECTED]> writes:

> Hi,
>
> I have a piece of code which uses environment variables. I have been
> told that it is not going to work in kernel space because the concept
> of environment is not applicable inside the kernel.

Correct.

> I believe that, but I need to demonstrate it. I do not know how to
> prove this, perhaps by referring to a solid reference about Linux design
> that points to the idea that it makes no sense to use environment
> variables in kernel space.

Environment variables are a part of the API that Unix supplies to
userspace programs.  The kernel is not a userspace program, and as far
as I know it doesn't even do most of the work of maintaining the
environment for a process--that's done by the C library and the
userspace program loader.
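
One way to demonstrate this is to look at where the environment lives in a
userspace process.  The sketch below assumes a POSIX C library; the helper
name is made up.  It walks the `environ` array directly, which is ordinary
process memory holding NAME=value strings: execve() receives the array as an
argument and places the strings in the new process's memory, and from then on
getenv() and setenv() are plain library routines operating on it.

```c
#include <stddef.h>
#include <string.h>

/* Provided by the C library: a NULL-terminated array of "NAME=value". */
extern char **environ;

/* Look a variable up by scanning environ by hand, much as getenv() does.
 * Returns a pointer to the value, or NULL if the name is not set. */
const char *find_in_environ(const char *name)
{
    size_t len = strlen(name);

    for (char **entry = environ; entry != NULL && *entry != NULL; entry++) {
        if (strncmp(*entry, name, len) == 0 && (*entry)[len] == '=')
            return *entry + len + 1;
    }
    return NULL;
}
```

Nothing in this lookup enters the kernel, and kernel code has no `environ` of
its own to consult.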

> Does anyone know of the existence of such a document?

No, probably because it's such an obvious concept.  You might get hold
of one of the several books on Linux kernel programming and see if
they mention it.

If someone is insisting you use environment variables in kernel code,
challenge them to show you where they are implemented in the kernel.  :)

-Doug


Re: Swapping broken on 2.6.9? Limit Page Cache growth?

2005-07-11 Thread Douglas McNaught
Jon Florence <[EMAIL PROTECTED]> writes:

> Hi,
> I have got a box running  2.6.9-1.667smp (FC3)

That's a Red Hat kernel so you should take it up with them, not the
LKML.

-Doug