Re: Urgent bugzilla mainteinance needed

2007-09-23 Thread Martin J. Bligh

Natalie Protasevich wrote:

On 9/23/07, David Woodhouse <[EMAIL PROTECTED]> wrote:

On Sun, 2007-09-23 at 11:08 -0700, Natalie Protasevich wrote:

On 9/23/07, Diego Calleja <[EMAIL PROTECTED]> wrote:

Take a look at http://bugzilla.kernel.org/show_bug.cgi?id=3710

Bugzilla tries to send a mail to the reporter and it fails ("unknown user
account"), but the failure notice is appended as a bugzilla comment. Then
bugzilla tries to send that comment to everyone involved in the bug,
including the reporter, so it fails again. Houston, we have an endless loop.

There are 540 comments in that bug report already, and the bugme-daemon
mailing list is being spammed.

I just sent emails to those who maintain bugzilla software and systems
that run it, hope someone will be online soon to help alleviate
this...

Bugzilla really shouldn't be accepting any mail with empty reverse-path
(MAIL FROM:<>)
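
A minimal sketch of the check suggested above: before an incoming mail is turned into a bug comment, look for the null reverse-path that bounce messages carry. It assumes the delivering MTA records the envelope sender in a Return-Path header, which is a property of the local mail setup rather than of Bugzilla. The function name and the sample messages are made up for illustration.

```python
from email import message_from_string


def is_bounce(raw_message: str) -> bool:
    """Return True if the message was delivered with an empty reverse-path.

    Assumption: the delivering MTA copied the SMTP envelope sender
    (MAIL FROM) into a Return-Path header. Bounces use MAIL FROM:<>.
    """
    msg = message_from_string(raw_message)
    return_path = msg.get("Return-Path", "").strip()
    return return_path == "<>"


# Hypothetical messages, for illustration only.
bounce = "Return-Path: <>\nSubject: Undelivered mail\n\nunknown user account\n"
comment = "Return-Path: <reporter@example.org>\nSubject: Re: bug\n\nStill broken.\n"

print(is_bounce(bounce))
print(is_bounce(comment))
```

Dropping (or at least not re-broadcasting) anything for which such a check returns true would break the loop at the point where the bounce becomes a comment.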


Ah, then it should be an easy fix. I don't have access to the system
though, so I will have to wait helplessly until one of the guys picks up...
:(


Sorry, can't fix this - I don't have direct access to the system,
and I don't think the others will be back until Monday ;-(

M.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-12 Thread Martin J. Bligh

Christoph Lameter wrote:

On Tue, 11 Sep 2007, Nick Piggin wrote:


But that's not my place to say, and I'm actually not arguing that high
order pagecache does not have uses (especially as a practical,
shorter-term solution which is unintrusive to filesystems).

So no, I don't think I'm really going against the basics of what we agreed
in Cambridge. But it sounds like it's still being billed as first-order
support right off the bat here.


Well, it seems that we have different interpretations of what was agreed
on. My understanding was that the large blocksize patchset was okay
provided that I supply an acceptable mmap implementation and put a
warning in.


I think all we agreed on was that both patches needed significant work
and would need to be judged after they were completed ;-)

There was talk of putting Christoph's approach in more-or-less as-is
as a very specialized and limited application ... but I don't think
we concluded anything for the more general and long-term case apart
from "this is hard" ;-)

M.


Re: [PATCH 00/23] per device dirty throttling -v8

2007-08-08 Thread Martin J. Bligh

Christoph Hellwig wrote:

On Sat, Aug 04, 2007 at 09:42:59PM +0200, Jörn Engel wrote:

On Sat, 4 August 2007 21:26:15 +0200, Jörn Engel wrote:


Given the choice between only "atime" and "noatime" I'd agree with you.
Heck, I use it myself.  But "relatime" seems to combine the best of both
worlds.  It currently just suffers from mount not supporting it in any
relevant distro.
  

And here is a completely untested patch to enable it by default.  Ingo,
can you see how well this fares compared to "atime" and
"noatime,nodiratime"?



Umm, no f**king way.  atime selection is 100% policy and belongs in
userspace.  Add to that the problem that we can't actually re-enable
atimes because of the way the vfs-level mount flags API is designed.
Instead of doing such a fugly kernel patch just talk to the handful
of distributions that matter to update their defaults.
  


From what I've seen the problem seems to be that the inode
gets marked dirty when we update atime.

Why isn't this easily fixable by just adding an additional dirty
flag that says atime has changed? Then we only cause a write
when we remove the inode from the inode cache, if only atime
is updated.

Unlike relatime, there's no user-visible change (unless the
machine crashes without a clean unmount, but not sure anyone
cares that much about that corner case). Atime changes are
thus kept in-ram until umount / inode reclaim.
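
The idea above can be sketched as a toy model with two dirty bits. This is not kernel code: the class, the flag names and the write counter are hypothetical, and the sketch only shows how a separate "atime changed" flag defers the write until the inode leaves the cache.

```python
class Inode:
    """Toy model of an in-memory inode with two dirty bits (hypothetical names)."""

    def __init__(self):
        self.dirty = False        # data or metadata other than atime changed
        self.dirty_atime = False  # only the access time changed
        self.disk_writes = 0

    def read(self):
        # A read only touches atime: remember it, but do not queue a write.
        self.dirty_atime = True

    def modify(self):
        self.dirty = True

    def periodic_writeback(self):
        # Regular writeback skips inodes whose only change is atime.
        if self.dirty:
            self._write()

    def evict(self):
        # On umount or inode reclaim, flush whatever is pending, atime included.
        if self.dirty or self.dirty_atime:
            self._write()

    def _write(self):
        self.disk_writes += 1
        self.dirty = False
        self.dirty_atime = False


inode = Inode()
for _ in range(1000):
    inode.read()
    inode.periodic_writeback()
inode.evict()
print(inode.disk_writes)  # 1
```

In this model a thousand reads cost a single write at eviction time, at the price of losing the latest atime if the machine goes down before the inode is evicted.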



Re: vm/fs meetup in september?

2007-06-30 Thread Martin J. Bligh

Christoph Hellwig wrote:

On Tue, Jun 26, 2007 at 12:35:09PM +1000, Nick Piggin wrote:
  

I'd like to see you there, so I hope we can find a date that most
people are happy with. I'll try to start working that out after we
have a rough idea of who's interested.



Do we have any data preferences yet?
  


You mean date?


VM is arranged for the 3rd, IIRC Kernel summit doesn't
start until the 5th, so there's a gap on the 4th if you want
to sort out the fs stuff then? Not 100% sure on the dates.

M.


Re: regression tracking (Re: Linux 2.6.21)

2007-06-19 Thread Martin J. Bligh



Yes, good work, thanks a lot for it!  The new interface is much better and more
useful.

Greetings,
Rafael


PS
BTW, would it be possible to create the "Hibernation/Suspend" subcategory
of "Power Management" that I asked for some time ago, please? :-)
  


Oops. Sorry. Done.

M.



Re: Who is administering the kernel bugzilla?

2007-06-02 Thread Martin J. Bligh

Adrian Bunk wrote:

On Sat, Jun 02, 2007 at 01:42:06AM +0200, Rafael J. Wysocki wrote:

Hi,

Can anyone please tell me who's administering the kernel bugzilla now?

I've tried to write to [EMAIL PROTECTED] , but this address seems
to point to nowhere.
...


Martin Bligh (explicitly Cc'ed) should be able to help.


Yup, it does point somewhere, it's just an alias that bounces
one address ... I did get your request - will fix it.

M.



Re: Conveying memory pressure to userspace?

2007-05-10 Thread Martin J. Bligh

Bas Westerbaan wrote:

Hello,

Quite a lot of userspace applications cache.  Firefox caches pages;
MySQL caches queries;  libc's free (like practically all other
userspace allocators) won't directly return the first byte of memory
freed, etc.

These applications are unaware of memory pressure.  While the kernel
is trying its best to free memory, for instance by dropping
possibly more valuable caches, userspace is left blissfully unaware.

Obviously this isn't a really big problem, given that we've still got
swap to swap out those rarely used caches, except when the caches
aren't _that_ rarely used and rebuilding them from their backing store
(e.g. precomputed values) might be faster than swapping the pages back
in from disk.

A solution would be to either

 a) let the application make the kernel aware of pages that may be
dropped under memory pressure.  This would be tricky to implement
in userspace: it's hard to keep an application from racing into a
dropped page.  However, the kernel can directly free a page from
userspace, which makes it useful under real pressure.  This is in
contrast to
 b) letting the application register itself with a cache share
priority.  The application (and other aware applications) would then
be able to query how fair they are at the moment in proportion to their
cache share priority.  Freeing would still be completely in their own
hands.


The only related mechanisms I could find were madvise and mincore.

With madvise, pages can be marked as unnecessary and these should be
swapped out earlier.  With mincore one can determine whether pages are
resident (not cached).  This would make an existing alternative to
solution a.  However, this doesn't eliminate the writes to swap,
and polling every time before accessing a cache isn't really pretty.

I did consider guessing the memory pressure by looking at
/proc/meminfo, but I think it isn't that accurate.


The prev_priority field in the zoneinfo stuff is more useful for
memory pressure. I'm playing with making a blocking callback that
can wake someone up when this gets down to a certain priority level
(prio=12 => everything's rosy, prio=0 => we're in deep shit).
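
As a rough illustration of the blocking callback described above, here is a toy userspace model in which a waiter sleeps until the reclaim priority falls to a threshold. Nothing here is a real kernel interface: the class, its methods and the way the priority gets published are all assumptions made for the sketch. Only the 12-to-0 scale comes from the message.

```python
import threading

DEF_PRIORITY = 12  # "everything's rosy" on the scale quoted above


class PressureMonitor:
    """Toy model: block until the reclaim priority drops to a threshold."""

    def __init__(self):
        self._priority = DEF_PRIORITY
        self._cond = threading.Condition()

    def update(self, priority: int) -> None:
        # Stand-in for the kernel publishing a new priority value.
        with self._cond:
            self._priority = priority
            self._cond.notify_all()

    def wait_for_pressure(self, threshold: int, timeout: float = None) -> bool:
        # True once priority <= threshold, False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._priority <= threshold, timeout=timeout
            )


monitor = PressureMonitor()
# Simulate pressure building up shortly after the application starts waiting.
threading.Timer(0.05, monitor.update, args=(3,)).start()
if monitor.wait_for_pressure(threshold=4, timeout=2.0):
    print("memory pressure: shrink caches")
```

The point of blocking rather than polling is that a cache-heavy application pays nothing while memory is plentiful and is woken only when it should start giving memory back.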


Before hacking something together (and being uncertain about the
thoroughness with which I searched for existing work, sorry), I would
like your thoughts on this.

Please CC me, I'm not on the list.

 Bas





Re: Bugzilla (was Linux 2.6.21)

2007-05-04 Thread Martin J. Bligh

Sorry, have been out sick, and someone removed me from the cc list,
which didn't help. In response to various bits:

Firstly a general comment - we're about to upgrade versions, which
will ease a few of these issues. I should really finish the creation
of virtual category owners for *all* categories. Will see if we can
batch that, as it's a total pain to do.

Andi Kleen wrote:
> - Ask more people to just categorize and reassign bugs (anybody
> interested?)

The category owners should be able to do that, and help spread the
load. The virtual category owner stuff enables many people to "watch"
new bugs for that category and help out.

> - Give more people in bugzilla the power to reassign arbitrary bugs
> (bugzilla maintainers would need to do that)

Fairly easy to do, just a permissions issue. Either I can add a bunch
of "known" people, or let everyone do it and then slap people if
they're silly about it.


- You are required to select a category and 'component' for your report, which
often is difficult (especially if you're not a kernel expert)


Usually there is "other" and then someone else figures it out.


I can make that clearer in the form if it helps.


The Novell bugzilla actually has that fixed. You have a search email button
to look up addresses.  Perhaps that feature will be ported someday into
the kernel.org one (I would like to have it too) 


I *think* that's in the new version. Will check.


The only sane way to do that would be to save them somewhere and keep
a list and then let a group of people process them.

Hmm, wait... sounds like bugzilla, doesn't it?


Yes, though we could do with some improved email hooks still, I guess.
I much prefer having people watch categories than spamming lists, but
if people want lists spammed, we can have that.




Re: Linux 2.6.21

2007-04-28 Thread Martin J. Bligh

The thing is, these reports MUST NOT go to "everybody". If they do, that 
is actually *worse* than nothing, because people will just ignore them 
entirely, since they aren't "directed".


The emails need to be directed to the appropriate parties, not go to 
everybody. There is nobody who is interested in seeing all regressions, 
except perhaps me and Andrew. Most *real* developers (as opposed to people 
like me, who are integrators, not "real developers") want to be notified 
about problems in *their* area, and if it's just automation that sends out 
everything, it just dilutes the value of the thing, to the point where 
people will ignore it even for the cases when they happen to be related to 
what they do.


It's easy to send the different categories to different mailing lists,
if that's what we want to do. The one snag is that some aggressive
filtering on the SCSI lists etc. stops me from bouncing messages to them,
but that's fixable.

Yes, human involvement from someone with half a brain would be better.
Andrew does a lot of that. Not a particularly good use of talent really,
but still.

As Andrew has pointed out before though - even though he forwards
the bugs, nobody does anything with them. The sad truth seems to be
that people have very little interest in fixing bugs when they are
reported - it's not sexy, I guess.

Let me put it another way: I would never use a source control system that 
forces me to look at my 22,000 files one at a time. I think such a system 
is fundamentally broken, because it makes it impossible to get the big 
picture ("what changed in the last week" kind of thing). The same is true 
of bugzilla: if you *know* which bug you're looking at, it's good. For 
anything else, it's almost worse than useless, exactly because there is no 
way to get an overview.


Go to http://bugzilla.kernel.org. Hit query. Find the box that says
"Bug Changes, Only bugs changed in the last __ days". Stick 7 in it.

74 bugs found.

Not hard to do.

(I've said this before, but I'll say it again: one thing that would 
already make bugzilla better is to just always drop any bug reports that 
are more than a week old and haven't been touched. It wouldn't need *much* 
touching, but if a reporter cannot be bothered to say "still true with 
current snapshot" once a week, then it shouldn't be seen as being somehow 
up to those scarce resources we call "developers" to have to go through 
it).


I'm reluctant to drop / close them. We could fairly easily move them to
a "STALE" state if you want, and have that ping the user. Not sure what
we'd ping them with apart from "Nobody seems to give a toss about your
bug. Life's a bitch. Try sending chocolates, flowers, or fireworks".
I'm still unconvinced the users or the tool are the problem, but if it
makes you happier, we can do that.
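
A sketch of what such a "STALE" sweep could look like, purely as an illustration. The bug records, field names and status strings are hypothetical and do not reflect Bugzilla's actual schema or API. The one-week cutoff is the figure from the quoted suggestion.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=7)  # "more than a week old and haven't been touched"


def mark_stale(bugs, now):
    """Move untouched open bugs to STALE and return the ids whose reporters to ping."""
    to_ping = []
    for bug in bugs:
        if bug["status"] == "OPEN" and now - bug["last_touched"] > STALE_AFTER:
            bug["status"] = "STALE"
            to_ping.append(bug["id"])
    return to_ping


# Hypothetical bug records, for illustration only.
now = datetime(2007, 4, 28)
bugs = [
    {"id": 1, "status": "OPEN", "last_touched": datetime(2007, 4, 1)},
    {"id": 2, "status": "OPEN", "last_touched": datetime(2007, 4, 25)},
]
print(mark_stale(bugs, now))  # [1]
```

Moving bugs to a separate state instead of closing them keeps the report and its history around, so a reporter who answers the ping can bring the bug back.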

So there are probably things that bugzilla could do to become more useful, 
but I don't see that happening. We'd need either a smarter/better 
bugzilla, or somebody who actually turns noise into real information. 
Adrian did that (although in fairness to others, other people definitely 
do it too. Dave Jones, for example. Very useful).


What would you want from a smarter / better bugzilla or other bug
tracking tool? A list of requirements / suggestions would be nice. The
main complaint we had before was lack of an email interface, and that
was fixed a long time ago. I admit development has not exactly been
active since, but the only person I got real feedback from was Dave J,
and we've been fixing his UI issues.

M.


Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL

2007-04-02 Thread Martin J. Bligh

cc: apw ... seeing as he wrote sparsemem in the first place, please copy
him on this stuff?

Andi Kleen wrote:

On Monday 02 April 2007 17:37, Christoph Lameter wrote:

On Sun, 1 Apr 2007, Andi Kleen wrote:


Hmm, this means there is at least 2MB worth of struct page on every node?
Or do you have overlaps with other memory (I think you have)
In that case you have to handle the overlap in change_page_attr()

Correct. 2MB worth of struct page is 128 MB of memory. Are there nodes
with smaller amounts of memory?


Yes the discontigmem minimum is 64MB and there are some setups
(mostly with numa emulation) where you end up with nodes that small.


We're actually using numa emulation to do real (container) things with.
However, 128MB is still pretty small for that ... and worst case, we
just waste 1MB for a 64MB node, right? Which isn't beautiful, but
doesn't seem like the end of the world for an obscure corner case.
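
The figures in this exchange can be checked with a few lines of arithmetic. The sketch assumes 4 KB base pages and a 64-byte struct page. Neither number is stated in the thread, but together they reproduce the "2MB of struct page per 128MB of memory" ratio quoted above. The function names are illustrative.

```python
MB = 1024**2
PAGE_SIZE = 4 * 1024      # assumed base page size
STRUCT_PAGE_SIZE = 64     # assumed bytes per struct page
LARGE_MAPPING = 2 * MB    # granularity at which the virtual memmap is mapped


def memmap_bytes(node_bytes: int) -> int:
    """Bytes of struct page needed to describe a node of the given size."""
    return node_bytes // PAGE_SIZE * STRUCT_PAGE_SIZE


def wasted_bytes(node_bytes: int) -> int:
    """Memmap bytes mapped but unused when rounding up to whole 2 MB mappings."""
    needed = memmap_bytes(node_bytes)
    mapped = -(-needed // LARGE_MAPPING) * LARGE_MAPPING  # round up
    return mapped - needed


print(memmap_bytes(128 * MB) // MB)  # 2
print(wasted_bytes(64 * MB) // MB)   # 1
```

Under these assumptions a 64MB node needs only 1MB of struct page, so a full 2MB mapping leaves 1MB unused, which is the worst case mentioned above.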


Do you have any benchmark numbers to prove it? There seem to be a few
benchmarks where the discontig virt_to_page is a problem
(although I know ways to make it more efficient), and sparsemem
is normally slower. Still some numbers would be good.

You want a benchmark to prove that the removal of memory references and
code improves performance?


You're just moving them into the MMU, not really removing them, and you
need more TLB entries.
It might be faster or it might not. There are some unexpected issues: most x86-64
CPUs have quite a small number of large-page TLB entries, so you can get thrashing etc.


So numbers with TLB intensive workloads would be good. 


There's also the possibility it just doesn't make enough difference
to affect a real benchmark ...

M.


Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-03 Thread Martin J. Bligh

Christoph Lameter wrote:

On Fri, 2 Mar 2007, William Lee Irwin III wrote:


On Fri, Mar 02, 2007 at 02:22:56PM -0800, Andrew Morton wrote:

Opterons seem to be particularly prone to lock starvation where a cacheline
gets captured in a single package for ever.

AIUI that phenomenon is universal to NUMA. Maybe it's time we
reexamined our locking algorithms in the light of fairness
considerations.


This is a phenomenon that is usually addressed at the cache logic level.
It's a hardware maturation issue. A certain package should not be allowed
to hold onto a cacheline forever and other packages must have a minimum
time when they can operate on that cacheline.


That'd be nice. Unfortunately we're stuck in the real world with
real hardware, and the situation is likely to remain thus for
quite some time ...

M.


Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-02 Thread Martin J. Bligh

.. and think about a realistic future.

EVERYBODY will do on-die memory controllers. Yes, Intel doesn't do it 
today, but in the one- to two-year timeframe even Intel will.


What does that mean? It means that in bigger systems, you will no longer 
even *have* 8 or 16 banks where turning off a few banks makes sense. 
You'll quite often have just a few DIMM's per die, because that's what you 
want for latency. Then you'll have CSI or HT or another interconnect.


And with a few DIMM's per die, you're back where even just 2-way 
interleaving basically means that in order to turn off your DIMM, you 
probably need to remove HALF the memory for that CPU.


In other words: TURNING OFF DIMM's IS A BEDTIME STORY FOR DIMWITTED 
CHILDREN.


Even with only 4 banks per CPU, and 2-way interleaving, we could still
power off half the DIMMs in the system. That's a huge impact on the
power budget for a large cluster.
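
A rough model of the interleaving argument, as a sketch only. It assumes one DIMM per bank, that DIMMs can only be powered off a whole interleave set at a time, and that each CPU keeps at least one set powered; the function names are made up for illustration.

```python
def max_dimms_off(dimms_per_cpu, interleave_ways):
    """DIMMs per CPU that can be powered off while one interleave set stays on."""
    sets = dimms_per_cpu // interleave_ways
    if sets < 2:
        return 0  # only one set: nothing can be turned off
    return (sets - 1) * interleave_ways


def fraction_off(dimms_per_cpu, interleave_ways):
    """Share of a CPU's DIMMs that can be powered off under the same model."""
    return max_dimms_off(dimms_per_cpu, interleave_ways) / dimms_per_cpu
```

With 4 DIMMs per CPU and 2-way interleaving the model allows half the DIMMs to be powered off; with only 2 DIMMs and 2-way interleaving it allows none, which is the case the quoted text is worried about.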

No, it's not ideal, but what was that quote again ... "perfect is the
enemy of good"? Something like that ;-)

There are maybe a couple machines IN EXISTENCE TODAY that can do it. But 
nobody actually does it in practice, and nobody even knows if it's going 
to be viable (yes, DRAM takes energy, but trying to keep memory free will 
likely waste power *too*, and I doubt anybody has any real idea of how 
much any of this would actually help in practice).


Batch jobs across clusters have spikes at different times of the day,
etc that are fairly predictable in many cases.

M.



Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-02 Thread Martin J. Bligh

32GB is pretty much the minimum size to reproduce some of these
problems. Some workloads may need larger systems to easily trigger
them.


We can find a 32GB system here pretty easily to test things on if
need be.  Setting up large commercial databases is much harder.


That's my problem, too.

There does not seem to exist any single set of test cases that
accurately predicts how the VM will behave with customer
workloads.


Tracing might help? Showing Andrew traces of what happened in
production for the prev_priority change made it much easier to
demonstrate and explain the real problem ...

M.




Re: [RFC] [PATCH] more support for memory-less-node.

2007-02-13 Thread Martin J. Bligh

Andi Kleen wrote:

I wasn't suggesting having NULL pointers for pgdats, if that's what you
mean. 


That is what started the original thread at least. Can happen on some
ia64 platforms.


OK, that does seem kind of ugly.

Just nodes with no memory in them, the pgdat would still be there. 
pgdat = struct node, except everything's badly named.


Ok those can happen even on x86-64, mostly because it's possible
to fill up a node early during boot up with bootmem and then
it's effectively empty.

[there is even still an open bug when this happens on node 0]
 
Handling out of memory here of course has to be always done.


Yup, if we just set the "size" of the node to zero, it seems
like a natural degenerate case that should be handled anyway.

Just NULL pointers in core data structures are evil. But I'm glad we 
agree here.


Now whether it's better to set up an empty node or use a nearby node
for a memory-less cpu can be further discussed. I still think
I lean towards the latter.


Just seems kind of ugly and unnecessary, particularly if that
memory-less cpu (or IO node) is equidistant from one or more
memory-possessing nodes. As long as their zonelist is set up
correctly, it should all work fine without that, right?

build_zonelists_node already checks populated_zone() so it looks
like it's all set up for that already ...
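
To illustrate the idea, here is a simplified userspace model, not the kernel's build_zonelists_node. It assumes a memory-less node is simply one whose page count is zero, and that the fallback list is ordered by a distance table; Node, populated and build_fallback_list are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    nid: int
    present_pages: int  # 0 models a memory-less node


def populated(node):
    """Stand-in for a populated_zone()-style check in this model."""
    return node.present_pages > 0


def build_fallback_list(local_nid, nodes, distance):
    """Order nodes by distance from local_nid, skipping empty ones."""
    ordered = sorted(nodes, key=lambda n: (distance[local_nid][n.nid], n.nid))
    return [n.nid for n in ordered if populated(n)]
```

In this model a cpu-only node that is equidistant from two memory nodes ends up with both of them in its fallback list, without being remapped onto either one and without any NULL pointers.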

M.



Re: [RFC] [PATCH] more support for memory-less-node.

2007-02-13 Thread Martin J. Bligh

Christoph Lameter wrote:

On Tue, 13 Feb 2007, Andi Kleen wrote:


Adding NULL tests all over mm for this would seem like a clear case
of this to me.


Maybe there is an alternative. We are free to number the nodes, right?
How about requiring the low node numbers to have memory and the high ones
not to?


F.e. have a boundary like

nr_mem_nodes ?

Everything above nr_mem_nodes has no memory and cannot be specified in a 
nodemask. Those nodes would not be visible to user space via memory 
policies and page migration. So the core mempolicy logic could be left 
untouched.


The nodes above nr_mem_nodes would exist purely within the kernel. They
would have proximity information (which can be used to determine
neighboring memory; more flexible than the current attachment
to one fixed memory node) but those node numbers could not be specified as
node masks in any memory operations. This would then allow memory-less nodes
with I/O or cpus. The user would not be aware of these.


What's wrong with just setting the existing counters like
node_spanned_pages / node_present_pages to zero?

M.


Re: [RFC] [PATCH] more support for memory-less-node.

2007-02-13 Thread Martin J. Bligh

Andi Kleen wrote:

Your description of the node is correct, it's an arbitrary container of
one or more resources. Not only is this definition flexible, it's also
very useful, for memory hotplug, odd types of NUMA boxes, etc.


I must disagree here. Special cases are always dangerous especially
if they are hard to regression test. I made this discovery the hard
way on x86-64 ... It's best to eliminate them in the first place,
otherwise they will later come back and bite you when you don't expect it.

Adding NULL tests all over mm for this would seem like a clear case
of this to me.


I wasn't suggesting having NULL pointers for pgdats, if that's what you
mean. Just nodes with no memory in them, the pgdat would still be there.
pgdat = struct node, except everything's badly named.




Re: [RFC] [PATCH] more support for memory-less-node.

2007-02-13 Thread Martin J. Bligh

KAMEZAWA Hiroyuki wrote:

In my last posting, the mempolicy-fix-for-memory-less-node patch, there was a
discussion of 'what do you consider the definition of a "node" to be?'

I found there is no consensus. But I want to go ahead.
Before posting the patch again, I'd like to discuss more.

-Kame

In my understanding, a "node" is a block of cpus, memory and devices,
and there could be cpu-only-node, memory-only-node, device-only-node...

There will be discussion. IMHO, to represent hardware configuration
as it is, this definition is very natural and flexible.
(And because my work is memory-hotplug, this definition fits me.)

"Don't support such a crazy configuration" is one opinion.
I hear x86_64 doesn't support it and defines a node as a block of memory;
it remaps cpus on memory-less-nodes to other nodes.
I know ia64 allows memory-less-node. (I don't know about ppc.)
It works well on my box (and HP's box).


It doesn't make much sense for an architecture independent structure to
be "defined" in different ways by specific architectures. "not
supported" or "currently broken" might be a better description.

Your description of the node is correct, it's an arbitrary container of
one or more resources. Not only is this definition flexible, it's also
very useful, for memory hotplug, odd types of NUMA boxes, etc.

M.


Re: [RFC] Tracking mlocked pages and moving them off the LRU

2007-02-03 Thread Martin J. Bligh

Christoph Hellwig wrote:

On Fri, Feb 02, 2007 at 10:20:12PM -0800, Christoph Lameter wrote:

This is a new variation on the earlier RFC for tracking mlocked pages.
We now mark a mlocked page with a bit in the page flags and remove
them from the LRU. Pages get moved back when no vma that references
the page has VM_LOCKED set anymore.

This means that vmscan no longer uselessly cycles over large amounts
of mlocked memory should someone attempt to mlock large amounts of
memory (may even result in a livelock on large systems).

Synchronization is built around state changes of the PageMlocked bit.
The NR_MLOCK counter is incremented and decremented based on
state transitions of PageMlocked. So the count is accurate.

There is still some unfinished business:

1. We use the 21st page flag and we only have 20 on 32 bit NUMA platforms.

2. Since mlocked pages are now off the LRU page migration will no longer
   move them.

3. Use NR_MLOCK to tune various VM behaviors so that the VM no
   longer fails due to too many mlocked pages in certain areas.


This patch seems not to handle the case where more than one process mlocks
a page; you really need a pincount in the page so it is not released before
all processes have munlocked it or died.  I did a similar patch a while
ago and tried to handle it by overloading the lru list pointers with
a pincount, but at some point I gave up because I couldn't get that part
right.


Doesn't matter - you can just do it lazily. If you find a page that is
locked, move it to the locked list. When unlocking a page you *always*
move it back to the normal list. If someone else is still locking it,
we'll move it back to the locked list on the next reclaim pass.
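
The lazy scheme described above can be sketched as a toy model. This is not the half-finished patch mentioned below: Page, LazyMlockLists and the per-page list of VMA flags are assumptions made purely for illustration. The point is that no pincount is stored in the page.

```python
class Page:
    def __init__(self, name):
        self.name = name
        self.vmas = []  # one flag per mapping VMA; True means VM_LOCKED


def is_mlocked(page):
    """A page counts as locked if any mapping VMA has VM_LOCKED set."""
    return any(page.vmas)


class LazyMlockLists:
    def __init__(self, pages):
        self.normal = list(pages)
        self.locked = []

    def reclaim_pass(self):
        """Scan the normal list and park locked pages on the locked list."""
        still_normal = []
        for page in self.normal:
            if is_mlocked(page):
                self.locked.append(page)
            else:
                still_normal.append(page)
        self.normal = still_normal

    def munlock(self, page, vma_index):
        """Clear one VMA's lock and always return the page to the normal list."""
        page.vmas[vma_index] = False
        if page in self.locked:
            self.locked.remove(page)
            self.normal.append(page)
```

If two VMAs lock the same page and one of them unlocks, the page goes back to the normal list even though it is still locked; the next reclaim pass notices and moves it to the locked list again. The cost is one wasted scan of that page, traded against not having to track a count.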

I have a half-finished patch from 6 months ago that does this, but never
found time to complete it ;-(

M.


Re: [PATCH] libata: fix translation for START STOP UNIT

2007-01-31 Thread Martin J. Bligh

Jeff Garzik wrote:

Robert Hancock wrote:

Jeff Garzik wrote:

* Include the patch inline rather than as an attachment.  Even a
text/plain attachment is very difficult to review and quote in
popular email programs.


Jeff


I'd love to, but unfortunately nobody seems to have come up with a way 
of doing this in Thunderbird that keeps it from mangling whitespace 
without a ton of hassle. I was able to get it to cooperate once (sort 
of, anyway, I think it may have still damaged something on the first 
try), but it required mangling a bunch of settings that made using it 
for normal mail impossible.


The last time I looked, the main "how-to" page I found had an addendum 
that "I gave up on this, it's too hard, I just attach the patches". If 
anyone has gotten any new insight..


Just use "cat mail | sendmail -t" or git-send-email, Thunderbird will 
never get this stuff right.
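
For anyone scripting the first option, a minimal sketch using Python's standard email package is shown here. It only builds the message; the addresses, subject and patch text are placeholders, and build_patch_mail is a made-up helper name.

```python
from email.message import EmailMessage


def build_patch_mail(sender, to, subject, patch_text):
    """Build a plain-text mail with the patch inline in the body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject
    # The body is set as-is, so tabs and leading spaces in the diff survive.
    msg.set_content(patch_text)
    return msg


patch = "--- a/f.c\n+++ b/f.c\n@@ -1 +1 @@\n-\tint x;\n+\tint y;\n"
msg = build_patch_mail("dev@example.org", "list@example.org",
                       "[PATCH] example: rename x to y", patch)
```

The result could then be handed to `sendmail -t`, which reads the recipients from the headers, for example with `subprocess.run(["sendmail", "-t"], input=msg.as_bytes(), check=True)`. That step is left out of the block because it depends on a local mail setup.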


I'm fighting Thunderbird right this second, in fact, because it randomly 
decided to stop supporting drag-n-drop to sub-folders in Fedora Core 6 :(


http://mbligh.org/linuxdocs/Email/Clients/Thunderbird

Describes how to fix this.

M.


Re: Bug report : reproducible memory bug (hardware failure, sorry)

2007-01-29 Thread Martin J. Bligh

I finally re-ran memtest86 on the machine since it began to have too
many different kinds of errors (GPF, invalid instruction...). It turned
out that one of the memory modules was bad. I guess my brand new
list_debug race condition debugger will be useful in the future, but not
now. :)

I'll remember to let memtest86 run a few hours more on my new machines
next time.


Heh, well that's a lot less confusing at least ;-)

M.


Re: [PATCH] mm: remove global locks from mm/highmem.c

2007-01-29 Thread Martin J. Bligh

Andrew Morton wrote:

On Mon, 29 Jan 2007 17:31:20 -0800
"Martin J. Bligh" <[EMAIL PROTECTED]> wrote:


Peter Zijlstra wrote:

On Sun, 2007-01-28 at 14:29 -0800, Andrew Morton wrote:


As Christoph says, it's very much preferred that code be migrated over to
kmap_atomic().  Partly because kmap() is deadlockable in situations where a
large number of threads are trying to take two kmaps at the same time and
we run out.  This happened in the past, but incidences have gone away,
probably because of kmap->kmap_atomic conversions.

From which callsite have you measured problems?

CONFIG_HIGHPTE code in -rt was horrid. I'll do some measurements on
mainline.


CONFIG_HIGHPTE is always horrid - we've known that for years.


We have?  What's wrong with it?  


http://www.ussg.iu.edu/hypermail/linux/kernel/0307.0/0463.html

July 2003.


M.


Re: [PATCH] mm: remove global locks from mm/highmem.c

2007-01-29 Thread Martin J. Bligh

Peter Zijlstra wrote:

On Sun, 2007-01-28 at 14:29 -0800, Andrew Morton wrote:


As Christoph says, it's very much preferred that code be migrated over to
kmap_atomic().  Partly because kmap() is deadlockable in situations where a
large number of threads are trying to take two kmaps at the same time and
we run out.  This happened in the past, but incidences have gone away,
probably because of kmap->kmap_atomic conversions.
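
The deadlock described above can be shown with a toy model, offered as a sketch only. It assumes a fixed pool of mapping slots and a worst-case schedule in which every thread takes its first slot before any thread takes its second; the pool sizes used are hypothetical and this is not the kernel's kmap code.

```python
def round_robin_deadlocks(pool_size, threads, need=2):
    """Return True if the worst-case schedule gets stuck."""
    free = pool_size
    held = [0] * threads
    done = [False] * threads
    while not all(done):
        progressed = False
        for t in range(threads):
            if done[t]:
                continue
            if held[t] < need and free > 0:
                free -= 1          # take one more mapping slot
                held[t] += 1
                progressed = True
            if held[t] == need:
                free += held[t]    # work done: unmap and release the slots
                held[t] = 0
                done[t] = True
                progressed = True
        if not progressed:
            return True            # everyone holds one slot and waits
    return False
```

With 4 slots and 3 threads the model always finishes, but with 4 slots and 4 or more threads every slot ends up held by a thread waiting for a second one, so nothing can make progress.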



From which callsite have you measured problems?


CONFIG_HIGHPTE code in -rt was horrid. I'll do some measurements on
mainline.



CONFIG_HIGHPTE is always horrid - we've known that for years.
Don't use it.

If that's all we're fixing here, I'd be highly suspect ...

M.



Re: Bug report : reproducible memory bug (hardware failure, sorry)

2007-01-29 Thread Martin J. Bligh

I finally re-ran memtest86 on the machine since it began to have too
many different kinds of errors (GPF, invalid instruction...). It turned
out that one of the memory modules was bad. I guess my brand new
list_debug race condition debugger will be useful in the future, but not
now. :)

I'll remember to let memtest86 run a few hours more on my new machines
next time.


Heh, well that's a lot less confusing at least ;-)

M.


Re: lockmeter

2007-01-28 Thread Martin J. Bligh

Arjan van de Ven wrote:

On Sun, 2007-01-28 at 17:04 +, Christoph Hellwig wrote:

On Sun, Jan 28, 2007 at 08:52:25AM -0800, Martin J. Bligh wrote:

Mmm. not wholly convinced that's true. Whilst I don't have lockmeter
stats to hand, the heavy time in __d_lookup seems to indicate we may
still have a problem to me. I guess we could move the spinlocks out
of line again to test this fairly easily (or get lockmeter upstream).

We definitely should get lockmeter in.  Does anyone volunteer for doing
the cleanup and merge?


specifically; implementing it on top of lockdep should be very lean and
simple...


cc: John Hawkes, if we're going to discuss this ;-)

M.



Re: 2.6.20-rc6-mm1

2007-01-28 Thread Martin J. Bligh

Andrew Morton wrote:

On Sun, 28 Jan 2007 08:56:08 -0800
"Martin J. Bligh" <[EMAIL PROTECTED]> wrote:


- It seems that people have been busy creating the need for this.  I had to
  apply over sixty patches to this tree to fix post-2.6.20-rc4-mm1 compilation
  errors.  And a number of patches were dropped due to no-compile or to
  runtime errors.  Heaven knows how many runtime bugs were added.

dbench seems to panic on xfs / cfq ?


OK, I'll dump git-block.patch.  That means that the fsaio patches get
temporarily dropped as well as they depend on git-block changes somewhat.



OK ... if you can dump something in hotfixes, it should hopefully
auto-trigger another run. But OTOH, xfs on dbench seemed to do it
reliably on a bunch of machines, so maybe it's easy to reproduce.

M.


Re: Bug report : reproducible memory allocator bug in 2.6.19.2

2007-01-28 Thread Martin J. Bligh

Mathieu Desnoyers wrote:

Hi,

Trying to build cross-compilers (or kernels) on a 2-way x86_64 (amd64) with
make -j3 triggers the following OOPS after about 30 minutes on
2.6.19.2. Due to the amount of time and the heavy load it takes before it
happens, I suspect a race condition. Memtest86 tests passed ok. The
amount of swap used when the condition happens is about 52k and stable
(only ~800MB/1GB are used).

I am going to give it a look, but I suspect you might help narrowing it
down more quickly. Any insight would be appreciated.


Mmm. that's going to be messy to debug ... but didn't we already know
that kernel was racy? Or is 2.6.19.2 after that fix already? Does 20-rc6
still break?

M.


Re: lockmeter

2007-01-28 Thread Martin J. Bligh

Christoph Hellwig wrote:

On Sun, Jan 28, 2007 at 08:52:25AM -0800, Martin J. Bligh wrote:

Mmm. not wholly convinced that's true. Whilst I don't have lockmeter
stats to hand, the heavy time in __d_lookup seems to indicate we may
still have a problem to me. I guess we could move the spinlocks out
of line again to test this fairly easily (or get lockmeter upstream).


We definitely should get lockmeter in.  Does anyone volunteer for doing
the cleanup and merge?


On second thoughts .. I don't think it'd actually work for this since
the locks aren't global. Not that it shouldn't be done anyway, but ...

ISTR we still thought dcache scalability was a significant problem last
time anyone looked at it seriously - just never got fixed. Dipankar?


Re: 2.6.20-rc6-mm1

2007-01-28 Thread Martin J. Bligh

- It seems that people have been busy creating the need for this.  I had to
  apply over sixty patches to this tree to fix post-2.6.20-rc4-mm1 compilation
  errors.  And a number of patches were dropped due to no-compile or to
  runtime errors.  Heaven knows how many runtime bugs were added.


Build failure:

fs/built-in.o: In function `load_elf_binary':
binfmt_elf.c:(.text+0x32cc0): undefined reference to 
`arch_setup_additional_pages'


config:
http://test.kernel.org/abat/68517/build/dotconfig


Re: [PATCH 0/7] breaking the global file_list_lock

2007-01-28 Thread Martin J. Bligh

Ingo Molnar wrote:

* Christoph Hellwig <[EMAIL PROTECTED]> wrote:


On Sun, Jan 28, 2007 at 12:51:18PM +0100, Peter Zijlstra wrote:
This patch-set breaks up the global file_list_lock which was found 
to be a severe contention point under basically any filesystem 
intensive workload.

Benchmarks, please.  Where exactly do you see contention for this?


it's the most contended spinlock we have during a parallel kernel 
compile on an 8-way system. But it's pretty common-sense as well, 
without doing any measurements, it's basically the only global lock left 
in just about every VFS workload that doesn't involve massive amounts of
dentries created/removed (which is still dominated by the dcache_lock).


filesystem intensive workload apparently means namespace operation 
heavy workload, right?  The biggest bottleneck I've seen with those is 
dcache lock.


the dcache lock is not a problem during kernel compiles. (its 
rcu-ification works nicely in that workload)


Mmm. not wholly convinced that's true. Whilst I don't have lockmeter
stats to hand, the heavy time in __d_lookup seems to indicate we may
still have a problem to me. I guess we could move the spinlocks out
of line again to test this fairly easily (or get lockmeter upstream).

114076 total                    0.0545
 57766 default_idle           916.9206
 11869 prep_new_page           49.4542
  3830 find_trylock_page       67.1930
  2637 zap_pte_range            3.9125
  2486 strnlen_user            54.0435
  2018 do_page_fault            1.1941
  1940 do_wp_page               1.6973
  1869 __d_lookup               7.7231
  1331 page_remove_rmap         5.2196
  1287 do_no_page               1.6108
  1272 buffered_rmqueue         4.6423
  1160 __copy_to_user_ll       14.6835
  1027 _atomic_dec_and_lock    11.1630
   655 release_pages            1.9670
   644 do_path_lookup           1.6304
   630 schedule                 0.4046
   617 kunmap_atomic            7.7125
   573 __handle_mm_fault        0.7365
   548 free_hot_page           78.2857
   500 __copy_user_intel        3.3784
   483 copy_pte_range           0.5941
   482 page_address             2.9571
   478 file_move                9.1923
   441 do_anonymous_page        0.7424
   429 filemap_nopage           0.4450
   401 anon_vma_unlink          4.8902
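
To put the __d_lookup figure above in context, here is a rough sketch that reads a few lines of the profile and expresses a symbol's ticks as a share of non-idle ticks. It assumes the listing is readprofile-style output (ticks, symbol, normalized load); `parse_profile` and `busy_share` are hypothetical helper names, and only a handful of the lines above are copied into the sample.

```python
import re

# A few lines copied from the profile above.
PROFILE = """\
114076 total 0.0545
 57766 default_idle 916.9206
 11869 prep_new_page 49.4542
  1869 __d_lookup 7.7231
   478 file_move 9.1923
"""


def parse_profile(text):
    """Return {symbol: ticks} from 'ticks symbol load' lines."""
    ticks = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(\S+)\s+([\d.]+)\s*$", line)
        if m:
            ticks[m.group(2)] = int(m.group(1))
    return ticks


def busy_share(ticks, symbol, idle="default_idle"):
    """Fraction of non-idle ticks attributed to `symbol`."""
    busy = ticks["total"] - ticks[idle]
    return ticks[symbol] / busy


ticks = parse_profile(PROFILE)
d_lookup_share = busy_share(ticks, "__d_lookup")
file_move_share = busy_share(ticks, "file_move")
```

On these numbers __d_lookup accounts for roughly 3% of non-idle ticks and file_move for under 1%, which is the comparison the reply is leaning on. A tick count alone does not show how much of that time is lock contention.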


Re: 2.6.20-rc6-mm1

2007-01-28 Thread Martin J. Bligh



- It seems that people have been busy creating the need for this.  I had to
  apply over sixty patches to this tree to fix post-2.6.20-rc4-mm1 compilation
  errors.  And a number of patches were dropped due to no-compile or to
  runtime errors.  Heaven knows how many runtime bugs were added.


dbench seems to panic on xfs / cfq ?

http://test.kernel.org/abat/68498/debug/console.log

Pid: 30381, comm: dbench Not tainted 2.6.20-rc6-mm1-autokern1 #1
RIP: 0010:[8039b19a]  [8039b19a] cfq_dispatch_requests+0xa9/0x4a2

RSP: :8100e3231418  EFLAGS: 00010002
RAX:  RBX: 81007db97600 RCX: 00010410
RDX: 0002 RSI:  RDI: 81007db976a8
RBP: 8100e3231458 R08:  R09: a23e
R10: 8101ebf5c9c0 R11: 8101ed8d0298 R12: 81007db97688
R13:  R14:  R15: 
FS:  () GS:8101fe003740(0063) knlGS:f7de9460
CS:  0010 DS: 002b ES: 002b CR0: 80050033
CR2: f7eb6050 CR3: e251d000 CR4: 06e0
Process dbench (pid: 30381, threadinfo 8100e323, task 8100e3855800)

Stack:  8100e3231478 8039296a e32314b8 81007dba4c48
 81007dba4c48 81007dba4c48 8100e36e0c38 
 8100e3231478 803910c2 0218abe9 8101ed8d0298
Call Trace:
 [803910c2] elv_drain_elevator+0x1b/0x63
 [80391155] elv_insert+0x4b/0x144
 [8039130f] __elv_add_request+0x6e/0x70
 [80395b84] __make_request+0x255/0x32e
 [80394c14] generic_make_request+0x1c5/0x1fc
 [80394d09] submit_bio+0xbe/0xc7
 [80383a80] xfs_buf_iorequest+0x37e/0x3db
 [802243c5] default_wake_function+0x0/0xf
 [802243c5] default_wake_function+0x0/0xf
 [80382a67] xfs_buf_associate_memory+0x100/0x220
 [8036b0a0] xlog_bdstrat_cb+0x1c/0x45
 [8036bcd9] xlog_state_release_iclog+0x2f2/0x4b1
 [8036cb37] xlog_state_sync_all+0xce/0x21c
 [803500b3] xfs_btree_del_cursor+0x59/0x61
 [8036d467] _xfs_log_force+0x93/0x2fe
 [8038917e] xfs_fs_alloc_inode+0x15/0x27
 [80282921] iget_locked+0x6a/0x147
 [80361eb3] xfs_iget+0x360/0x783
 [8037765f] xfs_trans_iget+0xa3/0x10f
 [80363125] xfs_ialloc+0x8e/0x44a
 [80377f16] xfs_dir_ialloc+0x74/0x283
 [8037cc87] xfs_create+0x347/0x626
 [804dcae9] __up+0x19/0x1b
 [80386be0] xfs_vn_mknod+0x14b/0x2b7
 [8036bead] xfs_log_release_iclog+0x15/0x44
 [80280f7e] __d_lookup+0xd7/0x114
 [80280f7e] __d_lookup+0xd7/0x114
 [80277666] do_lookup+0x63/0x1ba
 [8027fe46] dput+0x22/0x151
 [80277fbb] __link_path_walk+0x6b2/0xd16
 [80284991] mntput_no_expire+0x1e/0x81
 [802786f5] link_path_walk+0xd6/0xe7
 [80386d57] xfs_vn_create+0xb/0xd
 [80278f63] vfs_create+0x7e/0xc2
 [80279300] open_namei+0x18d/0x600
 [8026fe9f] do_filp_open+0x28/0x4b
 [80270038] get_unused_fd+0x78/0x106
 [802701ec] do_sys_open+0x4d/0xd4
 [8029ae0c] compat_sys_open+0x15/0x17
 [8021b0e2] ia32_sysret+0x0/0xa
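
When comparing oopses like the one above across machines, it can help to reduce each call trace to (symbol, offset, size) tuples. The following is a minimal sketch, not an existing tool: `parse_frames` is a hypothetical helper, and the sample text is just the first three frames of the trace above. In a `symbol+0xOFF/0xSIZE` entry, OFF is the offset into the function and SIZE is the function's length.

```python
import re

# Matches entries such as "elv_insert+0x4b/0x144".
FRAME = re.compile(r"([A-Za-z_]\w*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

TRACE = """\
 [803910c2] elv_drain_elevator+0x1b/0x63
 [80391155] elv_insert+0x4b/0x144
 [8039130f] __elv_add_request+0x6e/0x70
"""


def parse_frames(text):
    """Return a list of (symbol, offset, size) for each trace line found."""
    frames = []
    for line in text.splitlines():
        m = FRAME.search(line)
        if m:
            frames.append((m.group(1), int(m.group(2), 16), int(m.group(3), 16)))
    return frames


frames = parse_frames(TRACE)
symbols = [name for name, _, _ in frames]
```

Comparing only the symbol names makes it easy to see when two reports share the same innermost path even though the offsets differ between builds.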


Re: 2.6.20-rc6-mm1

2007-01-28 Thread Martin J. Bligh

- It seems that people have been busy creating the need for this.  I had to
  apply over sixty patches to this tree to fix post-2.6.20-rc4-mm1 compilation
  errors.  And a number of patches were dropped due to no-compile or to
  runtime errors.  Heaven knows how many runtime bugs were added.


What looks to me like it might be another cfq problem? Not confined to
XFS this time.

http://test.kernel.org/abat/68514/debug/console.log


RIP: 0010:[80321d34]  [80321d34] cfq_dispatch_requests+0xaf/0x48c

RSP: 0018:81003465d9e8  EFLAGS: 00010002
RAX: 81003ebac200 RBX:  RCX: 81003eed5200
RDX:  RSI:  RDI: 
RBP: 81003eed5200 R08: 8031a363 R09: 2000
R10:  R11: 0001 R12: 
R13:  R14:  R15: 
FS:  2affc43a86f0() GS:81003ee37cc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 5556d026 CR3: 346c7000 CR4: 06e0
Process smartd (pid: 1943, threadinfo 81003465c000, task 81003bccc040)

Stack:  0044 81003e1cb4c0 81003e1ea5e0 81003e1cb4c0
  81003eed5800 81003e1cb4c0 803174ed
 81003e1ea5e0 80317e3d 00d0 0002
Call Trace:
 [803174ed] elv_drain_elevator+0x16/0x5d
 [80317e3d] elv_insert+0x4b/0x158
 [8031a43d] blk_execute_rq_nowait+0x6d/0x89
 [8031a512] blk_execute_rq+0xb9/0xe1
 [802aa72f] bio_phys_segments+0xf/0x15
 [8031e4db] sg_io+0x217/0x328
 [8045eeb8] sock_def_readable+0x18/0x6c
 [8031ea68] scsi_cmd_ioctl+0x1bd/0x391
 [803df122] sd_ioctl+0x93/0xc1
 [8031c9d8] blkdev_driver_ioctl+0x5d/0x72
 [8031d025] blkdev_ioctl+0x638/0x693
 [804c65f3] _spin_lock_irqsave+0x9/0xe
 [8032707c] __up_read+0x13/0x8a
 [804c863d] do_page_fault+0x45e/0x7b3
 [8045fdaf] skb_dequeue+0x48/0x50
 [802ab382] block_ioctl+0x1b/0x1f
 [80294105] do_ioctl+0x21/0x6b
 [802943b5] vfs_ioctl+0x266/0x27f
 [80294427] sys_ioctl+0x59/0x7a
 [80209bae] system_call+0x7e/0x83


Re: .version keeps being updated

2007-01-10 Thread Martin J. Bligh

Andrew Morton wrote:

On Tue, 9 Jan 2007 15:21:51 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:



On Tue, 9 Jan 2007, Andrew Morton wrote:

This new behavior of the kernel build system is likely to
make developers angry pretty quickly.

That might motivate them to fix it ;)

Actually, how about just removing the incrementing version count entirely?


I use it pretty commonly to answer the question "did I remember to install
that new kernel I just built before I rebooted"?  By comparing `uname -a'
with $TOPDIR/.version.


Yup, we need to do the same thing in automated testing. Especially when
you're doing lilo -R, and don't know if you ended up fscking or panicking
during attempted reboot to new kernel.

Better would be a checksum of the vmlinux vs the running kernel text,
but that seems to be impossible due to code rewriting. Could we embed
a checksum in a little /proc file for this?

M.
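
The check Andrew describes (comparing the build count in the running kernel's version string with $TOPDIR/.version) could be automated roughly as below. This is a sketch under assumptions: it expects the `uname -v` portion of the string to begin with `#N`, as in the `#1` seen in oops banners, and it assumes .version holds a bare integer. The function names are hypothetical.

```python
import re


def build_number_from_uname(uname_v: str):
    """Extract N from a version string that starts with '#N', else None."""
    m = re.match(r"#(\d+)\b", uname_v.strip())
    return int(m.group(1)) if m else None


def running_kernel_matches(uname_v: str, version_file_text: str) -> bool:
    """True if the running kernel's build count equals the .version contents."""
    running = build_number_from_uname(uname_v)
    try:
        built = int(version_file_text.strip())
    except ValueError:
        return False
    return running is not None and running == built
```

In a test harness this might be called with `os.uname().version` and the text read from the build tree's .version file. It only confirms the build count, not that the booted image is bit-for-bit the one that was built, which is why a checksum is suggested above.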


Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Martin J. Bligh

Jeff V. Merkey wrote:


Again, I agree with EVERY statement Linus made here. We operate exactly
as Linus describes, and legally, NO ONE can take us to task on GPL issues.
We post patches of affected kernel code (albeit the code resembles what
Linus describes as a "skeleton driver") and our proprietary non-derived
code we sell with our appliances.



Yeah, like this one?

ftp://ftp.soleranetworks.com/pub/solera/dsfs/FedoraCore6/datascout-only-2.6.18-11-13-06.patch

@@ -1316,8 +1316,8 @@

        mod->license_gplok = license_is_gpl_compatible(license);
        if (!mod->license_gplok && !(tainted & TAINT_PROPRIETARY_MODULE)) {
-               printk(KERN_WARNING "%s: module license '%s' taints kernel.\n",
-                      mod->name, license);
+//             printk(KERN_WARNING "%s: module license '%s' taints kernel.\n",
+//                    mod->name, license);
                add_taint(TAINT_PROPRIETARY_MODULE);
        }
 }
@@ -1691,10 +1691,10 @@
        /* Set up license info based on the info section */
        set_license(mod, get_modinfo(sechdrs, infoindex, "license"));

-       if (strcmp(mod->name, "ndiswrapper") == 0)
-               add_taint(TAINT_PROPRIETARY_MODULE);
-       if (strcmp(mod->name, "driverloader") == 0)
-               add_taint(TAINT_PROPRIETARY_MODULE);
+//     if (strcmp(mod->name, "ndiswrapper") == 0)
+//             add_taint(TAINT_PROPRIETARY_MODULE);
+//     if (strcmp(mod->name, "driverloader") == 0)
+//             add_taint(TAINT_PROPRIETARY_MODULE);

        /* Set up MODINFO_ATTR fields */
        setup_modinfo(mod, sechdrs, infoindex);



Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Martin J. Bligh

Dave Jones wrote:

On Wed, Dec 13, 2006 at 09:39:11PM -0800, Martin J. Bligh wrote:

 > The Ubuntu feisty fawn mess was a dangerous warning bell of where we're
 > going. If we don't stand up at some point, and ban binary drivers, we
 > will, I fear, end up with an unsustainable ecosystem for Linux when
 > binary drivers become pervasive. I don't want to see Linux destroyed
 > like that.

Thing is, if kernel.org kernels get patched to disallow binary modules,
what's to stop Ubuntu (or anyone else) reverting that change in the
kernels they distribute ?  The landscape doesn't really change much,
given that the majority of Linux end-users are probably running
distro kernels.


I don't think they'd dare spit in our faces quite that directly.
They think binary modules are permissible because we don't seem to have
consistently stated an intent contradicting that - some individual
developers have, but ultimately Linus hasn't.

I'm not talking about any legal issues to do with derived works,
copyrights or licenses - a clear statement of intent is probably all
it'd take to tip the balance.

M.


Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-13 Thread Martin J. Bligh

Linus Torvalds wrote:


On Wed, 13 Dec 2006, Greg KH wrote:

Numerous kernel developers feel that loading non-GPL drivers into the
kernel violates the license of the kernel and their copyright.  Because
of this, a one year notice for everyone to address any non-GPL
compatible modules has been set.


Btw, I really think this is shortsighted.

It will only result in _exactly_ the crap we were just trying to avoid, 
namely stupid "shell game" drivers that don't actually help anything at 
all, and move code into user space instead.


I don't think pushing the drivers into userspace is a good idea at all,
that wasn't what I was getting at. Pushing the problem into a different
space doesn't fix it. IMHO, we're not a microkernel, and drivers for
hardware belong in the kernel.

Whether we allow binary kernel modules or not, I don't think we should
allow an API for userspace drivers - obviously that's not my call, it's
yours, but at least I don't want my opinion / intent misinterpreted.

> What was the point again?
>
> Was the point to alienate people by showing how we're less about the
> technology than about licenses?

The point of banning binary drivers would be to leverage hardware
companies into either releasing open source drivers, or the specs for
someone else to write them. Whether we have that much leverage is
debatable ... I suspect we do in some cases and not in others. It'll
cause some pain, as well as some gain, but I think we'd live through
it pretty well, personally.

The details of the legal minutiae are, I feel, less interesting than
what goal we want to achieve. If we decided to get rid of binary
drivers, we could likely find a way to achieve that. Is it a worthwhile
goal?

I've done Linux support where binary drivers were involved, and I've
also supported Sequent's Dynix/PTX in the face of a similar situation
with CA Unicenter. Binary drivers make life extremely difficult, if not
impossible, for a support organisation, for fairly obvious and
well-known reasons. When there are two binary drivers from different
vendors in there, any semblance of support becomes farcical.

The Ubuntu feisty fawn mess was a dangerous warning bell of where we're
going. If we don't stand up at some point, and ban binary drivers, we
will, I fear, end up with an unsustainable ecosystem for Linux when
binary drivers become pervasive. I don't want to see Linux destroyed
like that.

I don't think the motive behind what we decide to do should be driven
by legal stuff, though I'm sure we'd have to wade through that to
implement it. It's not about that ... it's about what kind of ecosystem
we want to create, and whether that can be successful or not. Indeed,
there are good arguments both for and against binary drivers on that
basis.

But please can we have the pragmatic argument about what we want to
achieve, and why ... rather than the legal / religious arguments about
licenses? The law is a tool, not an end in itself.

If you don't feel it's legitimate to leverage that tool to achieve a
pragmatic end, fair enough. But please don't assume that the motivation
was legal / religious, at least not on my part.

Perhaps, in the end, we will decide we'd like to ban binary drivers,
but can't. Either for pragmatic reasons (e.g. we don't have enough
leverage to create the hardware support base), or for legal ones
(we don't think it's enforceable). But we seem to be muddled between
those different reasons right now, at least it seems that way to me.

I think allowing binary hardware drivers in userspace hurts our ability
to leverage companies to release hardware specs. The 'grey water' of
binary kernel drivers convinces a lot of them to release stuff, and
Greg and others have pushed that cause, all credit to them. In one way,
it does make the kernel easier to support, but I don't think it really
helps much to make a supportable *system*.

M.


Re: Device naming randomness (udev?)

2006-12-05 Thread Martin J. Bligh

Greg KH wrote:

On Mon, Dec 04, 2006 at 12:12:06AM +0100, Björn Steinbrink wrote:

On 2006.12.03 14:39:44 -0800, Martin J. Bligh wrote:

This PC has 1 ethernet interface, an e1000. Ubuntu Dapper.

On 2.6.14, my e1000 interface appears as eth0.
On 2.6.15 to 2.6.18, my e1000 interface appears as eth1.

In both cases, there are no other ethX interfaces listed in
"ifconfig -a". There are no modules involved, just a static
kernel build.

Is this a bug in udev, or the kernel? I'm presuming udev,
but seems odd it changes over a kernel release boundary.
Any ideas on how I get rid of it? Makes automatic switching
between kernel versions a royal pain in the ass.

Just a wild guess here... Debian's (and I guess Ubuntu's) udev rules
contain a generator for persistent interface name rules. Maybe these
start working with 2.6.15 and thus the switch (ie. the kernel would call
it eth0, but udev renames it to eth1).
The generated rules are written to
/etc/udev/rules.d/z25_persistent-net.rules on Debian, not sure if it's
the same for Ubuntu. Editing/removing the rules should fix your problem.


Yes, I'd place odds on this one.


Huh. Somehow there was this entry in /etc/iftab:

eth0 mac 00:0d:61:44:90:12 arp 1

But that's not my mac address. Damned if I know how that got there, but
if the persistent rules only work on later kernels, that'd explain it.
And indeed ... removing that entry makes it work more normally.
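
A quick way to spot this kind of stale entry is to compare the MAC addresses named in /etc/iftab against the NICs actually in the machine. The sketch below assumes the file format is "name selector value ..." as in the single line quoted above, and only handles the "mac" selector; `parse_iftab` and `stale_entries` are hypothetical helper names, and the list of present MACs has to be supplied by the caller.

```python
def parse_iftab(text):
    """Parse iftab-style lines into {interface: mac}; only 'mac' selectors are handled."""
    entries = {}
    for line in text.splitlines():
        # Drop anything after a '#' and split the rest on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[1] == "mac":
            entries[fields[0]] = fields[2].lower()
    return entries


def stale_entries(iftab_text, present_macs):
    """Return iftab entries whose MAC matches no NIC actually in the machine."""
    present = {mac.lower() for mac in present_macs}
    return {name: mac for name, mac in parse_iftab(iftab_text).items()
            if mac not in present}
```

Feeding it the line above together with the real card's MAC would report eth0 as stale, which fits the symptom: the name eth0 is reserved for a card that isn't there, so the real e1000 presumably ends up as eth1.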

Thanks for the pointers - most helpful.

M.




Device naming randomness (udev?)

2006-12-03 Thread Martin J. Bligh

This PC has 1 ethernet interface, an e1000. Ubuntu Dapper.

On 2.6.14, my e1000 interface appears as eth0.
On 2.6.15 to 2.6.18, my e1000 interface appears as eth1.

In both cases, there are no other ethX interfaces listed in
"ifconfig -a". There are no modules involved, just a static
kernel build.

Is this a bug in udev, or the kernel? I'm presuming udev,
but seems odd it changes over a kernel release boundary.
Any ideas on how I get rid of it? Makes automatic switching
between kernel versions a royal pain in the ass.

M.


Re: OOM killer firing on 2.6.18 and later during LTP runs

2006-11-25 Thread Martin J. Bligh

The traces are a bit confusing, but I don't actually see anything wrong
there.  The machine has used up all swap, has used up all memory and has
correctly gone and killed things.  After that, there's free memory again.


Yeah, it's just a bit odd that it's always in the IO path. Makes me
suspect there's actually a bunch of pagecache in the box as well, but
maybe it's just coincidence, and the rest of the box really is full
of anon mem. I thought we dumped the alt-sysrq-m type stuff on an OOM
kill, but it seems not. Maybe that's just not in mainline.


This doesn't seem to happen every run, unfortunately, only
intermittently, and we don't have much data before that, so
hard to tell how long it's been going on.

Still happening on latest kernels.
http://test.kernel.org/abat/62445/debug/console.log


The same appears to have happened there too.  Although it does seem to have
killed a lot more than it should have.

Has something changed in the configuration of that machine?  New LTP
version?  Less swap space?


Difficult to tell, it's a fairly new box to the grid, so it seems to
have been doing that intermittently forever.



OOM killer firing on 2.6.18 and later during LTP runs

2006-11-25 Thread Martin J. Bligh

On 2.6.18-rc7 and later during LTP:
http://test.kernel.org/abat/48393/debug/console.log

oom-killer: gfp_mask=0x201d2, order=0

Call Trace:
 [802638cb] out_of_memory+0x33/0x220
 [80265374] __alloc_pages+0x23a/0x2c3
 [802667d2] __do_page_cache_readahead+0x99/0x212
 [80260799] sync_page+0x0/0x45
 [804b304c] io_schedule+0x28/0x33
 [804b32b8] __wait_on_bit_lock+0x5b/0x66
 [8043d849] dm_any_congested+0x3b/0x42
 [80262e50] filemap_nopage+0x14b/0x353
 [8026cf9a] __handle_mm_fault+0x387/0x93f
 [804b6366] do_page_fault+0x44b/0x7ba
 [80245a4e] autoremove_wake_function+0x0/0x2e
oom-killer: gfp_mask=0x280d2, order=0

Call Trace:
 [802638cb] out_of_memory+0x33/0x220
 [80265374] __alloc_pages+0x23a/0x2c3
 [8026cde3] __handle_mm_fault+0x1d0/0x93f
 [804b6366] do_page_fault+0x44b/0x7ba
 [804b2854] thread_return+0x0/0xe0
 [8020a405] error_exit+0x0/0x84

--

This doesn't seem to happen every run, unfortunately, only
intermittently, and we don't have much data before that, so
hard to tell how long it's been going on.

Still happening on latest kernels.
http://test.kernel.org/abat/62445/debug/console.log

automount invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
lamb-payload invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0

Call Trace:
 [80264dca] out_of_memory+0x70/0x262
 [802459f6] autoremove_wake_function+0x0/0x2e
 [802668bf] __alloc_pages+0x238/0x2c1
 [80268070] __do_page_cache_readahead+0xab/0x234
 [8026205c] sync_page+0x0/0x45
 [804bf888] io_schedule+0x28/0x33
 [804bfaeb] __wait_on_bit_lock+0x5b/0x66
 [80446fc9] dm_any_congested+0x3b/0x42
 [80264158] filemap_nopage+0x148/0x34e
 [8026e49a] __handle_mm_fault+0x1f8/0x9b0
 [804c2d0f] do_page_fault+0x441/0x7b5
 [804c0d61] _spin_unlock_irq+0x9/0xc
 [804bf121] thread_return+0x64/0x100
 [804c119d] error_exit+0x0/0x84

It does at least seem to be mostly the same stack, and this machine
seems to be using dm, which most of the others aren't.
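
Comparing the traces by eye gets tedious across many runs, so here is a rough Python sketch that pulls the function names out of two call traces and lists the ones they share. The regular expression is an assumption based only on the trace lines quoted above (a bracketed address, possibly empty, followed by symbol+offset/size); `frames` and `common_frames` are hypothetical names.

```python
import re

# Matches lines such as " [80264dca] out_of_memory+0x70/0x262"; the address may be empty.
FRAME = re.compile(r"^\s*\[[0-9a-fA-F<>]*\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace_text):
    """Return the function names in a call trace, in the order they appear."""
    names = []
    for line in trace_text.splitlines():
        m = FRAME.match(line)
        if m:
            names.append(m.group(1))
    return names


def common_frames(trace_a, trace_b):
    """Return functions from trace_a that also appear in trace_b, keeping trace_a's order."""
    seen = set(frames(trace_b))
    return [name for name in frames(trace_a) if name in seen]
```

Run against the first and last traces above, this would show that the readahead, dm_any_congested and page fault frames appear in both, while offsets and a few wait-queue frames differ.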


Re: [patch 2/2] enables booting a NUMA system where some nodes have no memory

2006-11-16 Thread Martin J. Bligh

Christian Krafft wrote:

On Wed, 15 Nov 2006 16:57:56 -0800 (PST)
Christoph Lameter <[EMAIL PROTECTED]> wrote:


On Thu, 16 Nov 2006, KAMEZAWA Hiroyuki wrote:

But there is no memory on the node. Does the zonelist contain the zones of 
the node without memory or not? We simply fall back each allocation to the 
next node as if the node was overflowing?

yes. just fallback.
Ok, so we got a useless pglist_data struct and the struct zone contains a 
zonelist that does not include the zone.


Okay, I slowly understand what you are talking about.
I just tried a "numactl --cpunodebind 1 --membind 1 true" which hit an 
uninitialized zone in slab_node:

return zone_to_nid(policy->v.zonelist->zones[0]);

I also still don't know if it makes sense to have memoryless nodes, but 
supporting it does.
So what would be reasonable: to have empty zonelists for those nodes, or to 
check whether zonelists are uninitialized?


You don't want empty zonelists on a node containing CPUs, else it won't
know where to allocate from. You just want to make sure that the zones
in that node (if existent) are not contained in *anyone's* zonelist.
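
As a toy illustration of that rule (not the kernel's actual zonelist code), the Python sketch below builds a fallback order for every node, memoryless ones included, while listing only nodes that have memory. The `node_memory` and `distance` tables and the function name `build_zonelists` are hypothetical, and whole nodes stand in for their zones.

```python
def build_zonelists(node_memory, distance):
    """Build a fallback order per node, listing only nodes that have memory.

    node_memory: {node_id: pages of memory on that node}
    distance:    {(from_node, to_node): relative cost}, a hypothetical distance table
    """
    with_memory = [n for n, pages in node_memory.items() if pages > 0]
    zonelists = {}
    for node in node_memory:
        # Every node, including a memoryless one, gets a non-empty list sorted
        # nearest-first; memoryless nodes never appear as allocation targets.
        zonelists[node] = sorted(with_memory, key=lambda t: (distance[(node, t)], t))
    return zonelists
```

With node 1 holding CPUs but no memory, node 1 still gets a usable list of fallback targets, and no list (its own included) ever points back at node 1.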

M.



Re: 2.6.13-mm2

2005-09-08 Thread Martin J. Bligh

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm2/
> 
> (kernel.org propagation is slow.  There's a temp copy at
> http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-mm2.bz2)
> 
> 
> 
> - Added Andi's x86_64 tree, as separate patches
> 
> - Added a driver for TI acx1xx cardbus wireless NICs
> 
> - Large revamp of pcmcia suspend handling
> 
> - Largeish v4l and DVB updates
> 
> - Significant parport rework
> 
> - Many tty drivers still won't compile
> 
> - Lots of framebuffer driver updates
> 
> - There are still many patches here for 2.6.14.  We're doing pretty well
>   with merging up the subsystem trees.  ia64 and CIFS are still pending. 
>   x86_64 and several of Greg's trees (especially USB) aren't merged yet.

Build fails on x86_64, at least, with this config:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/config/abat/amd64

arch/x86_64/pci/built-in.o(.init.text+0xa88): In function `pci_acpi_scan_root':
: undefined reference to `pxm_to_node'
make: *** [.tmp_vmlinux1] Error 1
09/08/05-06:52:31 Build the kernel. Failed rc = 2
09/08/05-06:52:31 build: kernel build Failed rc = 1
09/08/05-06:52:31 command complete: (2) rc=126
Failed and terminated the run
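
For automated runs like this one, it can help to pull the missing symbols straight out of the build log. The following Python sketch assumes linker output shaped like the two lines quoted above ("In function" followed by "undefined reference to"); `undefined_references` is a hypothetical helper name and the patterns are not meant to cover every linker message.

```python
import re

# "path/to/object.o(.section+0xoffset): In function `caller':"
IN_FUNCTION = re.compile(r"^(?P<obj>\S+?)\(.*\): In function `(?P<func>\w+)'")
# ": undefined reference to `symbol'"
UNDEFINED = re.compile(r"undefined reference to `(?P<sym>\w+)'")


def undefined_references(log_text):
    """Return (object, calling function, missing symbol) tuples from linker output."""
    results = []
    obj = func = None
    for line in log_text.splitlines():
        m = IN_FUNCTION.match(line)
        if m:
            # Remember the most recent object/function context line.
            obj, func = m.group("obj"), m.group("func")
            continue
        m = UNDEFINED.search(line)
        if m:
            results.append((obj, func, m.group("sym")))
    return results
```

On the log above this would yield one entry naming arch/x86_64/pci/built-in.o, pci_acpi_scan_root and pxm_to_node.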



Re: [PATCH] i386: single node SPARSEMEM fix

2005-09-08 Thread Martin J. Bligh
>> >> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
>> >> machines. This is essential in order for the distros to support it - same
>> >> will go for sparsemem.
>> > 
>> > That's a different issue.  The current code works if you boot a NUMA=y
>> > SPARSEMEM=y machine with a single node.  The current Kconfig options
>> > also enforce that SPARSEMEM depends on NUMA on i386.
>> > 
>> > Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n.  That
>> > requires some Kconfig changes, as well as an extra memory present call.
>> > I'm questioning why we need to do that when we could never do
>> > DISCONTIG=y while NUMA=n on i386.
>> 
>> Ah, OK - makes more sense. However, some machines do have large holes
>> in e820 map setups - is not really critical, more of an efficiency
>> thing.
> 
> Confused.   Does all this mean that we want the patch, or not?

From that POV, nothing urgent, and would require more work to make use
of it anyway. Not sure if Magnus had another more immediate use for it?

M.


Re: [PATCH] i386: single node SPARSEMEM fix

2005-09-08 Thread Martin J. Bligh
  CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
  machines. This is essential in order for the distros to support it - same
  will go for sparsemem.
  
  That's a different issue.  The current code works if you boot a NUMA=y
  SPARSEMEM=y machine with a single node.  The current Kconfig options
  also enforce that SPARSEMEM depends on NUMA on i386.
  
  Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n.  That
  requires some Kconfig changes, as well as an extra memory present call.
  I'm questioning why we need to do that when we could never do
  DISCONTIG=y while NUMA=n on i386.
 
 Ah, OK - makes more sense. However, some machines do have large holes
 in e820 map setups - is not really critical, more of an efficiency
 thing.
 
 Confused.   Does all this mean that we want the patch, or not?

From that POV, nothing urgent, and would require more work to make use
of it anyway. Not sure if Magnus had another more immediate use for it?

M.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.13-mm2

2005-09-08 Thread Martin J. Bligh

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm2/
> 
> (kernel.org propagation is slow.  There's a temp copy at
> http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-mm2.bz2)
> 
> - Added Andi's x86_64 tree, as separate patches
> 
> - Added a driver for TI acx1xx cardbus wireless NICs
> 
> - Large revamp of pcmcia suspend handling
> 
> - Largeish v4l and DVB updates
> 
> - Significant parport rework
> 
> - Many tty drivers still won't compile
> 
> - Lots of framebuffer driver updates
> 
> - There are still many patches here for 2.6.14.  We're doing pretty well
>   with merging up the subsystem trees.  ia64 and CIFS are still pending.
>   x86_64 and several of Greg's trees (especially USB) aren't merged yet.

Build fails on x86_64, at least, with this config:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/config/abat/amd64

arch/x86_64/pci/built-in.o(.init.text+0xa88): In function `pci_acpi_scan_root':
: undefined reference to `pxm_to_node'
make: *** [.tmp_vmlinux1] Error 1
09/08/05-06:52:31 Build the kernel. Failed rc = 2
09/08/05-06:52:31 build: kernel build Failed rc = 1
09/08/05-06:52:31 command complete: (2) rc=126
Failed and terminated the run



Re: [PATCH] i386: single node SPARSEMEM fix

2005-09-07 Thread Martin J. Bligh


--On Wednesday, September 07, 2005 11:27:54 -0700 Dave Hansen <[EMAIL PROTECTED]> wrote:

> On Wed, 2005-09-07 at 11:22 -0700, Martin J. Bligh wrote:
>> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
>> machines. This is essential in order for the distros to support it - same
>> will go for sparsemem.
> 
> That's a different issue.  The current code works if you boot a NUMA=y
> SPARSEMEM=y machine with a single node.  The current Kconfig options
> also enforce that SPARSEMEM depends on NUMA on i386.
> 
> Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n.  That
> requires some Kconfig changes, as well as an extra memory present call.
> I'm questioning why we need to do that when we could never do
> DISCONTIG=y while NUMA=n on i386.

Ah, OK - makes more sense. However, some machines do have large holes
in e820 map setups - is not really critical, more of an efficiency
thing.

M.



Re: [PATCH] i386: single node SPARSEMEM fix

2005-09-07 Thread Martin J. Bligh
--On Wednesday, September 07, 2005 10:28:36 -0700 Dave Hansen <[EMAIL PROTECTED]> wrote:

> On Tue, 2005-09-06 at 12:56 +0900, Magnus Damm wrote:
>> This patch for 2.6.13-git5 fixes single node sparsemem support. In the case
>> when multiple nodes are used, setup_memory() in arch/i386/mm/discontig.c
>> calls get_memcfg_numa() which calls memory_present(). The single node case
>> with setup_memory() in arch/i386/kernel/setup.c does not call
>> memory_present() without this patch, which breaks single node support.
> 
> First of all, this is really a feature addition, not a bug fix. :)
> 
> The reason we haven't included this so far is that we don't really have
> any machines that need sparsemem on i386 that aren't NUMA.  So, we
> disabled it for now, and probably need to decide first why we need it
> before a patch like that goes in.

CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
machines. This is essential in order for the distros to support it - same
will go for sparsemem.
 
M.
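
To make the call chain in the quoted patch description concrete, here is a toy userspace model, not kernel code. Only the name memory_present() and the two setup_memory() paths come from the thread; every toy_* name, the section size and the PFN ranges are made up for illustration.

```c
#include <stdbool.h>

#define TOY_NR_SECTIONS      16      /* hypothetical number of sections */
#define TOY_PFNS_PER_SECTION 1024UL  /* hypothetical section size in pages */

/* One "present" flag per section, standing in for sparsemem's bookkeeping. */
static bool toy_section_present[TOY_NR_SECTIONS];

/* Toy analogue of memory_present(): mark every section that overlaps
 * the page frame range [start_pfn, end_pfn). */
static void toy_memory_present(unsigned long start_pfn, unsigned long end_pfn)
{
    unsigned long first, last, s;

    if (end_pfn <= start_pfn)
        return;
    first = start_pfn / TOY_PFNS_PER_SECTION;
    last = (end_pfn - 1) / TOY_PFNS_PER_SECTION;
    for (s = first; s <= last && s < TOY_NR_SECTIONS; s++)
        toy_section_present[s] = true;
}

static void toy_reset(void)
{
    for (int s = 0; s < TOY_NR_SECTIONS; s++)
        toy_section_present[s] = false;
}

static int toy_count_present(void)
{
    int n = 0;

    for (int s = 0; s < TOY_NR_SECTIONS; s++)
        n += toy_section_present[s] ? 1 : 0;
    return n;
}

/* Multi-node path: the NUMA config parsing registers each node's range. */
static void toy_setup_memory_numa(void)
{
    toy_memory_present(0, 4096);      /* pretend node 0 */
    toy_memory_present(8192, 12288);  /* pretend node 1, after a hole */
}

/* Single-node path: call_present models whether the extra call is made. */
static void toy_setup_memory_flat(bool call_present)
{
    if (call_present)
        toy_memory_present(0, 4096);
}
```

With call_present set to false, the flat path leaves every section marked absent, which is the single node breakage the patch description refers to.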


Re: Hugh's alternate page fault scalability approach on 512p Altix

2005-09-07 Thread Martin J. Bligh
>> > Anticipatory prefaulting raises the highest fault rate obtainable
>> > three-fold through gang scheduling faults but may allocate some pages to
>> > a task that are not needed.
>> 
>> IIRC that cost more than it saved, at least for forky workloads like a
>> kernel compile - extra cost in zap_pte_range etc. If things have changed
>> substantially in that path, I guess we could run the numbers again - has
>> been a couple of years.
> 
> Right. The costs come about through wrong anticipations installing useless 
> mappings. The patches that I posted have this feature off by default. Gang 
> scheduling can be enabled by modifying a value in /proc. But I guess the 
> approach is essentially dead unless others want this feature too. The 
> current page fault scalability approach should be fine for a couple of 
> years and who knows what direction mmu technology has taken then.

It would seem to depend on the locality of reference in the affected files.
Which implies to me that the locality of libc, etc probably sucks, though
we had a simple debug patch somewhere to print out a bitmap of which pages
are faulted in and which are not ... was somewhere, I'll see if I can find
it.

M.
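
The debug patch mentioned above is not shown in this thread. As a rough userspace analogue only, the sketch below uses mmap() and mincore() to build a bitmap of which pages of a mapping are currently resident. The helper names residency_bitmap() and map_and_touch() are invented for this example, and it assumes a Linux system where mincore() is available.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* for mincore() and MAP_ANONYMOUS on glibc */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fill bitmap with '1' (resident) or '0' (not resident) for npages pages
 * starting at addr, and NUL-terminate it. bitmap must hold npages + 1 bytes.
 * Returns the number of resident pages, or -1 on error. */
static int residency_bitmap(void *addr, size_t npages, char *bitmap)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *vec = malloc(npages);
    int resident = 0;

    if (vec == NULL || page <= 0) {
        free(vec);
        return -1;
    }
    if (mincore(addr, npages * (size_t)page, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < npages; i++) {
        int in_core = vec[i] & 1;  /* low bit set means the page is resident */

        bitmap[i] = in_core ? '1' : '0';
        resident += in_core;
    }
    bitmap[npages] = '\0';
    free(vec);
    return resident;
}

/* Map npages anonymous pages and write to every stride-th page so that
 * those pages get faulted in. Returns the mapping, or NULL on failure. */
static void *map_and_touch(size_t npages, size_t stride)
{
    long page = sysconf(_SC_PAGESIZE);
    char *p;

    if (page <= 0 || stride == 0)
        return NULL;
    p = mmap(NULL, npages * (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    for (size_t i = 0; i < npages; i += stride)
        p[i * (size_t)page] = 1;
    return p;
}
```

A caller would map a region, touch part of it, then print the string from residency_bitmap(). The touched pages should show as resident; whether untouched neighbours also do depends on the kernel and its configuration. The same function can be pointed at a file-backed mapping such as a shared library to get a feel for its locality.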



Re: Hugh's alternate page fault scalability approach on 512p Altix

2005-09-07 Thread Martin J. Bligh
> Anticipatory prefaulting raises the highest fault rate obtainable three-fold
> through gang scheduling faults but may allocate some pages to a task that are
> not needed.

IIRC that cost more than it saved, at least for forky workloads like a
kernel compile - extra cost in zap_pte_range etc. If things have changed
substantially in that path, I guess we could run the numbers again - has
been a couple of years.

M.



Re: 2.6.13-mm1

2005-09-01 Thread Martin J. Bligh

Breaks build on PPC64

Lots of this:

In file included from fs/xfs/linux-2.6/xfs_linux.h:57,
 from fs/xfs/xfs.h:35,
 from fs/xfs/xfs_rtalloc.c:37:
fs/xfs/xfs_arch.h:55:21: warning: "__LITTLE_ENDIAN" is not defined
In file included from fs/xfs/xfs_rtalloc.c:50:
fs/xfs/xfs_bmap_btree.h:65:21: warning: "__LITTLE_ENDIAN" is not defined
  CC  fs/xfs/xfs_acl.o
In file included from fs/xfs/linux-2.6/xfs_linux.h:57,
 from fs/xfs/xfs.h:35,
 from fs/xfs/xfs_acl.c:33:
fs/xfs/xfs_arch.h:55:21: warning: "__LITTLE_ENDIAN" is not defined

Can't see anything obvious to cause that.
Then this:

CC  drivers/char/hvc_console.o
drivers/char/hvc_console.c: In function `hvc_poll':
drivers/char/hvc_console.c:600: error: `count' undeclared (first use in this function)
drivers/char/hvc_console.c:600: error: (Each undeclared identifier is reported only once
drivers/char/hvc_console.c:600: error: for each function it appears in.)
drivers/char/hvc_console.c:636: error: structure has no member named `flip'
make[2]: *** [drivers/char/hvc_console.o] Error 1
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2

Presumably this:

diff -puN drivers/char/hvc_console.c~tty-layer-buffering-revamp drivers/char/hvc_console.c
--- 25/drivers/char/hvc_console.c~tty-layer-buffering-revamp	Wed Aug 31 12:50:55 2005
+++ 25-akpm/drivers/char/hvc_console.c	Wed Aug 31 12:50:56 2005
@@ -597,10 +597,8 @@ static int hvc_poll(struct hvc_struct *h
 
 	/* Read data if any */
 	for (;;) {
-		int count = N_INBUF;
-		if (count > (TTY_FLIPBUF_SIZE - tty->flip.count))
-			count = TTY_FLIPBUF_SIZE - tty->flip.count;
-
+		count = tty_buffer_request_room(tty, N_INBUF);
+
 		/* If flip is full, just reschedule a later read */
 		if (count == 0) {
 			poll_mask |= HVC_POLL_READ;

shouldn't be deleting the declaration of count. 

and possibly the "flip removal" was incomplete (line 636) ???
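
For illustration, here is a standalone sketch of the loop shape from the diff with a declaration of count kept in scope. It is not the actual fix. fake_buffer_request_room() is a stub written for this example and only borrows its role from tty_buffer_request_room(); struct fake_tty, poll_sketch() and the value of N_INBUF are placeholders.

```c
#define N_INBUF 16  /* placeholder value for this sketch */

/* Hypothetical stand-in for the tty: it only tracks free buffer room. */
struct fake_tty {
    int room;
};

/* Stub for illustration only: grants at most `want` bytes of room. */
static int fake_buffer_request_room(struct fake_tty *tty, int want)
{
    int granted = want < tty->room ? want : tty->room;

    tty->room -= granted;
    return granted;
}

/* Same shape as the read loop in the diff. Returns the number of bytes
 * accepted and sets *need_resched when the buffer filled up first. */
static int poll_sketch(struct fake_tty *tty, int available, int *need_resched)
{
    int total = 0;
    int count;  /* the declaration that the patch removed */

    *need_resched = 0;
    for (;;) {
        int want;

        if (available == 0)
            break;
        want = available < N_INBUF ? available : N_INBUF;
        count = fake_buffer_request_room(tty, want);

        /* If the buffer is full, just reschedule a later read */
        if (count == 0) {
            *need_resched = 1;
            break;
        }
        total += count;
        available -= count;
    }
    return total;
}
```

Deleting the `int count;` line from poll_sketch() reproduces the same class of error as the first build failure quoted above, since the assignment then refers to an undeclared identifier.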



Re: 2.6.13-mm1

2005-09-01 Thread Martin J. Bligh

Breaks build on PPC64

Lots of this:

In file included from fs/xfs/linux-2.6/xfs_linux.h:57,
 from fs/xfs/xfs.h:35,
 from fs/xfs/xfs_rtalloc.c:37:
fs/xfs/xfs_arch.h:55:21: warning: __LITTLE_ENDIAN is not defined
In file included from fs/xfs/xfs_rtalloc.c:50:
fs/xfs/xfs_bmap_btree.h:65:21: warning: __LITTLE_ENDIAN is not defined
  CC  fs/xfs/xfs_acl.o
In file included from fs/xfs/linux-2.6/xfs_linux.h:57,
 from fs/xfs/xfs.h:35,
 from fs/xfs/xfs_acl.c:33:
fs/xfs/xfs_arch.h:55:21: warning: __LITTLE_ENDIAN is not defined

Can't see anything obvious to cause that.
Then this:

CC  drivers/char/hvc_console.o
drivers/char/hvc_console.c: In function `hvc_poll':
drivers/char/hvc_console.c:600: error: `count' undeclared (first use in this 
function)
drivers/char/hvc_console.c:600: error: (Each undeclared identifier is reported 
only once
drivers/char/hvc_console.c:600: error: for each function it appears in.)
drivers/char/hvc_console.c:636: error: structure has no member named `flip'
make[2]: *** [drivers/char/hvc_console.o] Error 1
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2

Presumably this:

diff -puN drivers/char/hvc_console.c~tty-layer-buffering-revamp drivers/char/hvc
_console.c
--- 25/drivers/char/hvc_console.c~tty-layer-buffering-revampWed Aug 31 12:50
:55 2005
+++ 25-akpm/drivers/char/hvc_console.c  Wed Aug 31 12:50:56 2005
@@ -597,10 +597,8 @@ static int hvc_poll(struct hvc_struct *h
 
/* Read data if any */
for (;;) {
-   int count = N_INBUF;
-   if (count  (TTY_FLIPBUF_SIZE - tty-flip.count))
-   count = TTY_FLIPBUF_SIZE - tty-flip.count;
-
+   count = tty_buffer_request_room(tty, N_INBUF);
+   
/* If flip is full, just reschedule a later read */
if (count == 0) {
poll_mask |= HVC_POLL_READ;

shouldn't be deleting the declaration of count. 

and possibly the flip removal was incomplete (line 636) ???

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] Implement shared page tables

2005-08-31 Thread Martin J. Bligh
>> They're incompatible, but you could be left to choose one or the other
>> via config option.
> 
> Wouldn't need config option: there's /proc/sys/kernel/randomize_va_space
> for the whole running system, compatibility check on the ELFs run, and
> the infinite stack rlimit: enough ways to suppress randomization if it
> doesn't suit you.

Even better - much easier to deal with distro stuff if we can do it at
runtime.
 
>> 3% on "a certain industry-standard database benchmark" (cough) is huge,
>> and we expect the benefit for PPC64 will be larger as we can share the
>> underlying hardware PTEs without TLB flushing as well.
> 
> Okay - and you're implying that 3% comes from _using_ the shared page
> tables, rather than from avoiding the fork/exit overhead of setting
> them up and tearing them down.  And it can't use huge TLB pages
> because...  fragmentation?

Yes - as I understand it, that was a straight measurement with/without the
patch, and the shmem segment was already using hugetlb (in both cases). 
Yes, I find that a bit odd as to why as well - they are still trying 
to get some detailed profiling to explain. 

M.
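
As a small illustration of the runtime switch discussed above, the sketch below reads /proc/sys/kernel/randomize_va_space, the path named in the quoted text. The function names are invented for this example, and it assumes a Linux system with /proc mounted. A value of 0 means randomization is off; non-zero values turn it on.

```c
#include <stdio.h>

/* Parse the text of the sysctl file. Returns the integer value,
 * or -1 if the text does not start with a number. */
static int parse_randomize_setting(const char *text)
{
    int value;

    if (sscanf(text, "%d", &value) != 1)
        return -1;
    return value;
}

/* Read the current system-wide setting. Returns -1 if the file
 * cannot be opened or parsed (for example on a non-Linux system). */
static int read_randomize_va_space(void)
{
    FILE *f = fopen("/proc/sys/kernel/randomize_va_space", "r");
    char buf[32];
    int value = -1;

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) != NULL)
        value = parse_randomize_setting(buf);
    fclose(f);
    return value;
}
```

Changing the setting means writing to the same file as root, which is why a distro can ship one kernel and leave the choice to the administrator.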



Re: [PATCH 1/1] Implement shared page tables

2005-08-31 Thread Martin J. Bligh
--Hugh Dickins <[EMAIL PROTECTED]> wrote (on Wednesday, August 31, 2005 14:42:38 +0100):

> On Wed, 31 Aug 2005, Arjan van de Ven wrote:
>> On Wed, 2005-08-31 at 12:44 +0100, Hugh Dickins wrote:
>> > I was going to say, doesn't randomize_va_space take away the rest of
>> > the point?  But no, it appears "randomize_va_space", as it currently
>> > appears in mainline anyway, is somewhat an exaggeration: it just shifts
>> > the stack a little, with no effect on the rest of the va space.
>> 
>> it also randomizes mmaps
> 
> Ah, via PF_RANDOMIZE, yes, thanks: so long as certain conditions are
> fulfilled - and my RLIM_INFINITY RLIMIT_STACK has been preventing it.
> 
> And mmaps include shmats: so unless the process specifies non-NULL
> shmaddr to attach at, it'll choose a randomized address for that too
> (subject to those various conditions).
> 
> Which is indeed a further disincentive against shared page tables.

Or shared pagetables a disincentive to randomizing the mmap space ;-)
They're incompatible, but you could be left to choose one or the other
via config option.

3% on "a certain industry-standard database benchmark" (cough) is huge,
and we expect the benefit for PPC64 will be larger as we can share the
underlying hardware PTEs without TLB flushing as well.

M.



Re: 2.6.13-rc6-mm1

2005-08-21 Thread Martin J. Bligh
> -scheduler-cache-hot-autodetect.patch
> 
>  Made Martin's machine crash

That machine now boots again with this -mm release. Darren and/or I
will continue trying to figure out what went wrong with this.
 
M.


Re: 2.6.13-rc6-mm1

2005-08-20 Thread Martin J. Bligh
--Andrew Morton <[EMAIL PROTECTED]> wrote (on Friday, August 19, 2005 04:33:31 -0700):

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm1/
> 
> - Lots of fixes, updates and cleanups all over the place.
> 
> - If you have the right debugging options set, this kernel will generate
>   a storm of sleeping-in-atomic-code warnings at boot, from the scsi code.
>   It is being worked on.

Get a couple of debug warnings as you mention ... but then it panics.


scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec aic7899 Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec aic7899 Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

  Vendor: IBM   Model: GNHv1 S2  Rev: 0
  Type:   Processor  ANSI SCSI revision: 02
 target1:0:9: Beginning Domain Validation
 target1:0:9: Ending Domain Validation
  Vendor: IBM-ESXS  Model: DTN036C1UCDY10F   Rev: S25J
  Type:   Direct-Access  ANSI SCSI revision: 03
scsi1:A:12:0: Tagged Queuing enabled.  Depth 253
 target1:0:12: Beginning Domain Validation
 target1:0:12: wide asynchronous.
 target1:0:12: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
 target1:0:12: Ending Domain Validation
  Vendor: IBM-ESXS  Model: DTN036C1UCDY10F   Rev: S25J
  Type:   Direct-Access  ANSI SCSI revision: 03
scsi1:A:13:0: Tagged Queuing enabled.  Depth 253
 target1:0:13: Beginning Domain Validation
 target1:0:13: wide asynchronous.
 target1:0:13: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
 target1:0:13: Ending Domain Validation
scsi2 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec aic7899 Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scheduling while atomic: swapper/0x0100/0
 [<c02f65c5>] schedule+0x45/0x724
 [<c0115a15>] __wake_up+0x31/0x3c
 [<c016892a>] dcache_shrinker_del+0x2e/0x38
 [<c0168c62>] dput_recursive+0x232/0x270
 [<c0168c95>] dput_recursive+0x265/0x270
 [<c02f7d0f>] __down+0x7f/0x108
 [<c0115978>] default_wake_function+0x0/0x1c
 [<c02f6517>] __down_failed+0x7/0xc
 [<c0233612>] .text.lock.attribute_container+0x8b/0xc1
 [<c02337eb>] transport_remove_device+0xf/0x14
 [<c0233784>] transport_remove_classdev+0x0/0x58
 [<c0265b62>] scsi_target_reap+0x86/0xb4
 [<c02670c0>] scsi_device_dev_release+0x134/0x15c
 [<c022f5b8>] device_release+0x14/0x4c
 [<c01bc2c3>] kobject_cleanup+0x47/0x6c
 [<c01bc2e8>] kobject_release+0x0/0x14
 [<c01bc2f5>] kobject_release+0xd/0x14
 [<c01bcb95>] kref_put+0x79/0x89
 [<c01bc312>] kobject_put+0x16/0x1c
 [<c01bc2e8>] kobject_release+0x0/0x14
 [<c022f925>] put_device+0x11/0x18
 [<c025ecad>] scsi_put_command+0xa5/0xb0
 [<c0263df5>] scsi_next_command+0x11/0x1c
 [<c0263ed9>] scsi_end_request+0xa9/0xb4
 [<c02644c1>] scsi_io_completion+0x489/0x494
 [<c02646da>] scsi_generic_done+0x32/0x38
 [<c025f618>] scsi_finish_command+0xa0/0xa8
 [<c025f52d>] scsi_softirq+0x139/0x154
 [<c011e32d>] __do_softirq+0x8d/0x100
 [<c011e3cf>] do_softirq+0x2f/0x34
 [<c011e474>] irq_exit+0x34/0x38
 [<c0104840>] do_IRQ+0x20/0x28
 [<c01033ae>] common_interrupt+0x1a/0x20
 [<c0100c00>] default_idle+0x0/0x2c
 [<c0100c23>] default_idle+0x23/0x2c
 [<c0100cf3>] cpu_idle+0x7b/0x8c
 [<c01002c8>] rest_init+0x28/0x2c
 [<c03de87f>] start_kernel+0x197/0x19c
scheduling while atomic: swapper/0x0100/0
 [<c02f65c5>] schedule+0x45/0x724
 [<c011598f>] default_wake_function+0x17/0x1c
 [<c01159cb>] __wake_up_common+0x37/0x50
 [<c0115a15>] __wake_up+0x31/0x3c
 [<c02f6d34>] wait_for_completion+0x90/0xe8
 [<c0115978>] default_wake_function+0x0/0x1c
 [<c0115978>] default_wake_function+0x0/0x1c
 [<c01280c8>] call_usermodehelper_keys+0x144/0x15a
 [<c0127f38>] __call_usermodehelper+0x0/0x4c
 [<c0127f38>] __call_usermodehelper+0x0/0x4c
 [<c01bca4d>] kobject_hotplug+0x255/0x280
 [<c0231c39>] class_device_del+0x8d/0xa8
 [<c0233541>] attribute_container_class_device_del+0x11/0x18
 [<c02337d3>] transport_remove_classdev+0x4f/0x58
 [<c02333ef>] attribute_container_device_trigger+0x7f/0xb8
 [<c02337eb>] transport_remove_device+0xf/0x14
 [<c0233784>] transport_remove_classdev+0x0/0x58
 [<c0265b62>] scsi_target_reap+0x86/0xb4
 [<c02670c0>] scsi_device_dev_release+0x134/0x15c
 [<c022f5b8>] device_release+0x14/0x4c
 [<c01bc2c3>] kobject_cleanup+0x47/0x6c
 [<c01bc2e8>] kobject_release+0x0/0x14
 [<c01bc2f5>] kobject_release+0xd/0x14
 [<c01bcb95>] kref_put+0x79/0x89
 [<c01bc312>] kobject_put+0x16/0x1c
 [<c01bc2e8>] kobject_release+0x0/0x14
 [<c022f925>] put_device+0x11/0x18
 [<c025ecad>] scsi_put_command+0xa5/0xb0
 [<c0263df5>] scsi_next_command+0x11/0x1c
 [<c0263ed9>] scsi_end_request+0xa9/0xb4
 [<c02644c1>] scsi_io_completion+0x489/0x494
 [<c02646da>] scsi_generic_done+0x32/0x38
 [<c025f618>] scsi_finish_command+0xa0/0xa8
 [<c025f52d>] scsi_softirq+0x139/0x154
 [<c011e32d>] __do_softirq+0x8d/0x100
 [<c011e3cf>] do_softirq+0x2f/0x34
 [<c011e474>] irq_exit+0x34/0x38
 [<c0104840>] do_IRQ+0x20/0x28
 [<c01033ae>] common_interrupt+0x1a/0x20
 [<c0100c00>] default_idle+0x0/0x2c
 [<c0100c23>] default_idle+0x23/0x2c
 [<c0100cf3>] cpu_idle+0x7b/0x8c
 [<c01002c8>] rest_init+0x28/0x2c
 [<c03de87f>] start_kernel+0x197/0x19c
Unable to handle kernel NULL pointer dereference at virtual address 
 printing eip:
c0263cf2
*pde = 0042c001
*pte = 
Oops:  [#1]
SMP 
last sysfs file: 
CPU:0
EIP:    0060:[<c0263cf2>]    Not tainted VLI
EFLAGS: 00010282   (2.6.13-rc6-mm1-autokern1) 
EIP is at scsi_run_queue+0xe/0xb4
eax: d777ce30   ebx: d777ce30   ecx: 0282   edx: d6c4cc00
esi: d777ce30   edi:    ebp: 0246   esp: c03dde98
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c03dc000 task=c0373be0)
Stack: d777ce30 d777ce30 d76fb500 0246 c0263dfb d777ce30
