date:20070719

Re: [NFS] [PATCH 5/5] knfsd: clean up EX_RDONLY

2007-07-19 Thread Christoph Hellwig

On Wed, Jul 18, 2007 at 06:57:30PM -0400, J. Bruce Fields wrote:
> From: J. Bruce Fields <[EMAIL PROTECTED]>
> 
> Share a little common code, reverse the arguments for consistency, drop
> the unnecessary "inline", and lowercase the name.

Ah, sorry - didn't notice this was a separate patch.

> @@ -1845,7 +1838,7 @@ nfsd_permission(struct svc_rqst *rqstp, struct svc_export *exp,
>  	 */
>  	if (!(acc & MAY_LOCAL_ACCESS))
>  		if (acc & (MAY_WRITE | MAY_SATTR | MAY_TRUNC)) {
> -			if (EX_RDONLY(exp, rqstp) || IS_RDONLY(inode))
> +			if (exp_rdonly(rqstp, exp) || IS_RDONLY(inode))

In fact, with just a single caller left and the function reduced to a
one-liner, we could kill this function completely.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [NFS] [PATCH 4/5] knfsd: move EX_RDONLY out of header

2007-07-19 Thread Christoph Hellwig

On Wed, Jul 18, 2007 at 06:57:29PM -0400, J. Bruce Fields wrote:
> From: J. Bruce Fields <[EMAIL PROTECTED]>
> 
> EX_RDONLY is only called in one place; just put it there.
> 
> Signed-off-by: "J. Bruce Fields" <[EMAIL PROTECTED]>
> ---
>  fs/nfsd/vfs.c   |   12 
>  include/linux/nfsd/export.h |   12 
>  2 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 5c97d0e..f2684e5 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1797,6 +1797,18 @@ nfsd_statfs(struct svc_rqst *rqstp, struct svc_fh *fhp, struct kstatfs *stat)
>  	return err;
>  }
>  
> +static inline int EX_RDONLY(struct svc_export *exp, struct svc_rqst *rqstp)
> +{
> +	struct exp_flavor_info *f;
> +	struct exp_flavor_info *end = exp->ex_flavors + exp->ex_nflavors;
> +
> +	for (f = exp->ex_flavors; f < end; f++) {
> +		if (f->pseudoflavor == rqstp->rq_flavor)
> +			return f->flags & NFSEXP_READONLY;
> +	}
> +	return exp->ex_flags & NFSEXP_READONLY;
> +}

As mentioned last time, please remove the inline qualifier and give it a
lower-case name.
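
For illustration, the requested cleanup could look roughly like the
sketch below. It is a stand-alone approximation, not the kernel code:
the struct definitions, the fixed-size ex_flavors array and the value
of NFSEXP_READONLY are simplified stand-ins so that the function
compiles outside the tree. The lower-case name and the (rqstp, exp)
argument order follow patch 5/5.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel definitions (assumptions). */
#define NFSEXP_READONLY 0x0001

struct exp_flavor_info {
	unsigned int pseudoflavor;
	unsigned int flags;
};

struct svc_export {
	unsigned int ex_flags;
	size_t ex_nflavors;
	struct exp_flavor_info ex_flavors[8];
};

struct svc_rqst {
	unsigned int rq_flavor;
};

/* Lower-case name and no "inline"; a static function with a single
 * caller can be inlined by the compiler anyway. */
static int exp_rdonly(const struct svc_rqst *rqstp,
		      const struct svc_export *exp)
{
	const struct exp_flavor_info *f;
	const struct exp_flavor_info *end = exp->ex_flavors + exp->ex_nflavors;

	/* A per-flavor entry overrides the export-wide flags. */
	for (f = exp->ex_flavors; f < end; f++) {
		if (f->pseudoflavor == rqstp->rq_flavor)
			return (f->flags & NFSEXP_READONLY) != 0;
	}
	return (exp->ex_flags & NFSEXP_READONLY) != 0;
}
```

Unlike the quoted version, the sketch normalises the result to 0 or 1
instead of returning the raw flag bits.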


Re: cmpxchg is not available to generic code

2007-07-19 Thread Andrew Morton

On Thu, 19 Jul 2007 18:15:03 +1000 "Dave Airlie" <[EMAIL PROTECTED]> wrote:

> Maybe we could add CONFIG_HAVE_CMPXCHG and let DRM depend on it..

That would certainly be better than adding a sprinkle of architectures
in DRM Kconfig dependencies.

I don't know how important DRM is on ARM.  Zero?

Re: [PATCH] net/, drivers/net/ , missing EXPERIMENTAL in menus

2007-07-19 Thread Robert P. J. Day

On Thu, 19 Jul 2007, Adrian Bunk wrote:

...
> I would consider it more ugly to special case this and that in the
> kconfig code when plain dependencies already offer exactly the same
> functionality...

well, this is the *third* time i've proposed adding this kind of
feature so, at this point, i've really given up caring about it.  if
someone wants to do this, have at it.  i have better things to do than
to keep suggesting it and getting nowhere with it.

rday
--

Robert P. J. Day Linux Consulting, Training and Annoying Kernel
Pedantry Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page


Re: cmpxchg is not available to generic code

2007-07-19 Thread Andrew Morton

On Thu, 19 Jul 2007 09:02:03 +0100 (IST) Dave Airlie <[EMAIL PROTECTED]> wrote:

> 
> 
> > arm:
> >
> > drivers/char/drm/drm_lock.c: In function `drm_lock_take':
> > drivers/char/drm/drm_lock.c:221: error: implicit declaration of function `cmpxchg'
> >
> > You might be able to use atomic_cmpxchg, which _is_ present
> > on all architectures.  Or use a spinlock.
> >
> > What's that code doing anyway?  driver-private locking primitives?
> 
> When did arm suddenly start wanting DRM?

It's selectable in config.  allmodconfig broke.

> they need to grow a userspace 
> cmpxchg as davem mentioned to go along with this, changing the drm now 
> isn't possible due to backwards compat..

For reference purposes, that position is not acceptable.  We _never_ accept the
"oh I can't change my proposed kernel interface because I already have
userspace relying on it" argument.

Hopefully that won't be an issue here.  I guess DRM now needs a
`depends on !ARM'.


Re: cmpxchg is not available to generic code

2007-07-19 Thread Dave Airlie


On 7/19/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Thu, 19 Jul 2007 18:15:03 +1000 "Dave Airlie" <[EMAIL PROTECTED]> wrote:
>
> > Maybe we could add CONFIG_HAVE_CMPXCHG and let DRM depend on it..
>
> That would certainly be better than adding a sprinkle of architectures
> in DRM Kconfig dependencies.
>
> I don't know how important DRM is on ARM.  Zero?



I'd guess zero. I suppose if you wanted, you could hook up a PCI
graphics card on ARM, but if you do that I think you could implement
cmpxchg :-)

Dave.

Re: [PATCH 1/1] i386: Geode's TSC is not neccessary to mark tu unstable

2007-07-19 Thread Andi Kleen


> Wow, that's a really cool bug; nice work!  Don't forget to update
> arch/i386/kernel/cpu/mtrr/state.c, though; it uses setCx86() as well.  It
> needs to include processor-cyrix.h.

It also needs some big fat comments

-Andi

Re: Documentation for sysfs, hotplug, and firmware loading.

2007-07-19 Thread Cornelia Huck

On Wed, 18 Jul 2007 13:39:53 -0400,
Rob Landley <[EMAIL PROTECTED]> wrote:

> Nope.  If you recurse down under /sys/class following symlinks, you go into
> an endless loop bouncing off of /sys/devices and getting pointed back.  If
> you don't follow symlinks, it works fine up until about 2.6.20 at which point
> things that were previously directories BECAME symlinks because the
> directories got moved, and it all broke.

I have no idea what you're doing.

> Which is why I want it documented where to look for these suckers.  Just give 
> me ONE STABLE WAY TO FIND THIS INFORMATION, PLEASE.

See Documentation/sysfs-rules.txt.

> This document is trying to document just enough information to make hotplug 
> work using sysfs (which includes firmware loading if necessary).
> 
> > (And how about referring to Documentation/sysfs-rules.txt?)
> 
> Because there isn't one in 2.6.22, and I've been writing this file on and off 
> for a month as I tracked down various bits of information?

That was a _suggestion_.

> I know.  I'm just trying to show people how to do it.  Notice that this
> script doesn't DO anything, it just dumps the variables (and proves
> that /sys/hotplug got called).  You're worried about the scalability of a
> debugging script.

If you use bash scripts as examples, people will write bash scripts.

> (Rummage)  Seems to be "add, remove, change, online, offline, move"?
> 
> I can list 'em.  Now I'm vaguely curious what generates online and offline 
> events (MII transceiver state transitions on a network card, or does this 
> have to do with power saving modes?)  And I have no idea what the difference 
> between "change" and "move" is

"change" - something about the device has changed
"move" - the device is in a different position in the tree now

You may want to grep for the usage...

> 
> > >   DEVPATH
> > > Path under /sys at which this device's sysfs directory can be found.
> > > If $DEVPATH begins with /block/ the event refers to a block device,
> > > otherwise it refers to a char device.
> >
> > Huh? That's just the path in sysfs. And there's more than block and
> > char :) Check SUBSYSTEM for what your device actually is.
> 
> If you are doing mknod, you need three pieces of information:
> 1) Major, 2) Minor, 3) Block or Char device.  That's pretty much it.  If 
> you're trying to populate /dev you need that info.
> 
> > >   SUBSYSTEM
> > > If this is "block", it's a block device.  Anything else is a char
> > > device.
> >
> > No. For devices, SUBSYSTEM may be the class (like 'scsi_device') or the
> > bus (like 'pci').
> 
> Do you make a /dev node for either one?
> 
> I'm trying to, at minimum, document what you pass to mknod.  I consider it 
> important to know.

The problem is that your information is wrong. Imagine someone reading
this document, thinking "cool, I'll create a char node if
SUBSYSTEM!=block" and subsequently getting completely confused about
all those SUBSYSTEM==pci events.
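
To make the pitfall concrete, here is a minimal sketch of a safer
decision in a hotplug helper. It assumes that events for devices which
have a device node carry MAJOR and MINOR in their environment, and that
events without them (for example SUBSYSTEM==pci) need no mknod at all.
node_type_for_event() and enum node_type are hypothetical names, not
part of any kernel or udev interface.

```c
#include <string.h>

enum node_type { NODE_NONE, NODE_BLOCK, NODE_CHAR };

/*
 * Hypothetical helper: decide what, if anything, to pass to mknod.
 * Each argument is the value of the matching event variable
 * (SUBSYSTEM, MAJOR, MINOR), or NULL when the variable is absent.
 */
static enum node_type node_type_for_event(const char *subsystem,
					  const char *major,
					  const char *minor)
{
	/* No device number in the event: nothing to create. */
	if (!major || !minor)
		return NODE_NONE;
	if (subsystem && strcmp(subsystem, "block") == 0)
		return NODE_BLOCK;
	return NODE_CHAR;
}
```

With this check a SUBSYSTEM==pci event without MAJOR/MINOR yields
NODE_NONE instead of being mistaken for a char device.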

> 
> > >   DRIVER
> > > If present, a suggested driver (module) for handling this device.  No
> > > relation to whether or not a driver is currently handling the device.
> >
> > No, this actually is the current driver.
> 
> I've had it suggest drivers for devices that didn't have any loaded, and I
> had it _not_ specify drivers for devices that were loaded.  (I checked.)

The code disagrees with you. If a driver matches and probing succeeds,
it will be specified, otherwise not. Maybe you were checking the wrong
devices?

> Ah yes.  I replied to that when it was first posted.  It's still "here's a 
> list of things NOT to do" rather than telling you what you CAN do.  I'm 
> trying to document what you can do.
> 
> Useful documentation is not "Doing THIS is forbidden.  Doing THIS is 
> forbidden.  Doing THIS is forbidden.  What are you allowed to do?  Guess!  
> Oh, and anything I didn't explicitly mention could change at any time.  Have 
> fun."

It _does_ specify what you may rely on. Don't rely on anything else.

> Sysfs CAN export a stable API.  It may only be a subset of what it's 
> exporting, but it can still do so.

And that is exactly what sysfs-rules.txt is doing. I don't understand
your problem.

If you think that getting this information from sysfs-rules.txt could
be made easier, do a patch against it.

Re: [PATCH try #3] security: Convert LSM into a static interface

2007-07-19 Thread Christian Ehrhardt

On Wed, Jul 18, 2007 at 06:35:03PM -0700, Andrew Morton wrote:
> On Sat, 14 Jul 2007 12:37:01 -0400 (EDT)
> James Morris <[EMAIL PROTECTED]> wrote:
> 
> > Convert LSM into a static interface, as the ability to unload a security
> > module is not required by in-tree users and potentially complicates the
> > overall security architecture.
> > 
> > Needlessly exported LSM symbols have been unexported, to help reduce API
> > abuse.
> > 
> > Parameters for the capability and root_plug modules are now specified
> > at boot.
> > 
> > The SECURITY_FRAMEWORK_VERSION macro has also been removed.
> 
> I'd like to understand who is (or claims to be) adversely affected by this
> change, and what their complaints (if any) will be.

I am currently loading and unloading a prototype-like security module
on a regular basis. The fact that such a module can be loaded and
unloaded (albeit in an insecure way) greatly simplifies development.
Thus this change will adversely affect me and probably also others that
develop LSMs.

Additionally deployment of and choice among legitimate security modules
that may or may not (yet) be part of the main kernel tree is simplified by
an option to load these security modules (e.g. at boot time) into a running
kernel. This way a distribution can provide AppArmor, SELinux, SecLevl and
whatever as options very much in the same way that this works for a driver.

> Because I prefer my flamewars pre- rather than post-merge.

You asked for opinions. I do not plan to engage in any flamewars.

regards Christian


Re: [patch] fix the softlockup watchdog to actually work

2007-07-19 Thread Ingo Molnar


* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > [this is -stable material too.]
> 
> This seems terribly sensitive.
> 
> Someone has broken the Vaio (shock, horror).  It now has mysterious 
> jerkiness: when leaning on autorepeat it stalls for maybe 0.25 seconds 
> every 1.5 seconds.  The stalls are far less than a second.  Yet this 
> is enough to trigger random softlockup warnings.
> 
> Some of those warnings are below.  Note that the traces are all pretty 
> useless, as softlockup warnings so often seem to be.

hm, you haven't picked up the other softlockup enhancements i did, which 
make the warnings more useful.

Ingo

Re: [PATCH 5/5] [V2] Move alloc_pid() to copy_process()

2007-07-19 Thread Pavel Emelyanov


[EMAIL PROTECTED] wrote:


Subject: [PATCH 5/5] Move alloc_pid call to copy_process

From: Sukadev Bhattiprolu <[EMAIL PROTECTED]>

Move alloc_pid() into copy_process(). This will keep all pid and pid
namespace code together and simplify error handling when we support
multiple pid namespaces.


I would add something like this to the comment:

When a task creates a new pid namespace, its init (i.e. this task's
child) will have a pid carrying extra information: the new numerical id
that represents this new task in the new namespace. Thus, we have
to allocate this new pid only after the namespace is created, so that
we know which namespace the pid will live in.


I hope I expressed my idea clearly.

Acked-by: Pavel Emelyanov <[EMAIL PROTECTED]>


Signed-off-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]>

Cc: Pavel Emelianov <[EMAIL PROTECTED]>
Cc: Eric W. Biederman <[EMAIL PROTECTED]>
Cc: Cedric Le Goater <[EMAIL PROTECTED]>
Cc: Dave Hansen <[EMAIL PROTECTED]>
Cc: Serge Hallyn <[EMAIL PROTECTED]>
Cc: Herbert Poetzel <[EMAIL PROTECTED]>
---
 kernel/fork.c |   19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

Index: lx26-22-rc6-mm1a/kernel/fork.c
===
--- lx26-22-rc6-mm1a.orig/kernel/fork.c 2007-07-16 12:55:13.0 -0700
+++ lx26-22-rc6-mm1a/kernel/fork.c  2007-07-17 10:08:12.0 -0700
@@ -1029,6 +1029,12 @@ static struct task_struct *copy_process(
 	if (p->binfmt && !try_module_get(p->binfmt->module))
 		goto bad_fork_cleanup_put_domain;
 
+	if (pid != &init_struct_pid) {
+		pid = alloc_pid();
+		if (!pid)
+			goto bad_fork_put_binfmt_module;
+	}
+
 	p->did_exec = 0;
 	delayacct_tsk_init(p);	/* Must remain after dup_task_struct() */
 	copy_flags(clone_flags, p);
@@ -1316,6 +1322,9 @@ bad_fork_cleanup_container:
 #endif
 	container_exit(p, container_callbacks_done);
 	delayacct_tsk_free(p);
+	if (pid != &init_struct_pid)
+		free_pid(pid);
+bad_fork_put_binfmt_module:
 	if (p->binfmt)
 		module_put(p->binfmt->module);
 bad_fork_cleanup_put_domain:
@@ -1380,19 +1389,16 @@ long do_fork(unsigned long clone_flags,
 {
 	struct task_struct *p;
 	int trace = 0;
-	struct pid *pid = alloc_pid();
 	long nr;
 
-	if (!pid)
-		return -EAGAIN;
-	nr = pid->nr;
 	if (unlikely(current->ptrace)) {
 		trace = fork_traceflag (clone_flags);
 		if (trace)
 			clone_flags |= CLONE_PTRACE;
 	}
 
-	p = copy_process(clone_flags, stack_start, regs, stack_size, parent_tidptr, child_tidptr, pid);
+	p = copy_process(clone_flags, stack_start, regs, stack_size,
+			parent_tidptr, child_tidptr, NULL);
/*
 * Do this prior waking up the new thread - the thread pointer
 * might get invalid after that point, if the thread exits quickly.
@@ -1400,6 +1406,8 @@ long do_fork(unsigned long clone_flags,
 	if (!IS_ERR(p)) {
 		struct completion vfork;
 
+		nr = pid_nr(task_pid(p));
+
 		if (clone_flags & CLONE_VFORK) {
 			p->vfork_done = &vfork;
 			init_completion(&vfork);
@@ -1433,7 +1441,6 @@ long do_fork(unsigned long clone_flags,
}
}
} else {
-   free_pid(pid);
nr = PTR_ERR(p);
}
return nr;




[PATCH 2.6.22.y] firewire: fix memory leak of fw_request instances

2007-07-19 Thread Stefan Richter

Date: Tue, 17 Jul 2007 02:15:36 +0200 (CEST)
From: Stefan Richter <[EMAIL PROTECTED]>
Subject: firewire: fix memory leak of fw_request instances

Found and debugged by Jay Fenlason <[EMAIL PROTECTED]>.
The bug was especially noticeable with direct I/O over fw-sbp2.

Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
Signed-off-by: Kristian Høgsberg <[EMAIL PROTECTED]>
---
Same as commit 9c9bdf4d50730fd04b06077e22d7a83b585f26b5.

 drivers/firewire/fw-transaction.c |4 +++-
 drivers/firewire/fw-transaction.h |4 
 2 files changed, 7 insertions(+), 1 deletion(-)

Index: linux-2.6.22/drivers/firewire/fw-transaction.c
===
--- linux-2.6.22.orig/drivers/firewire/fw-transaction.c
+++ linux-2.6.22/drivers/firewire/fw-transaction.c
@@ -605,8 +605,10 @@ fw_send_response(struct fw_card *card, s
 	 * check is sufficient to ensure we don't send response to
 	 * broadcast packets or posted writes.
 	 */
-	if (request->ack != ACK_PENDING)
+	if (request->ack != ACK_PENDING) {
+		kfree(request);
 		return;
+	}
 
 	if (rcode == RCODE_COMPLETE)
 		fw_fill_response(&response, request->request_header,
Index: linux-2.6.22/drivers/firewire/fw-transaction.h
===
--- linux-2.6.22.orig/drivers/firewire/fw-transaction.h
+++ linux-2.6.22/drivers/firewire/fw-transaction.h
@@ -124,6 +124,10 @@ typedef void (*fw_transaction_callback_t
  size_t length,
  void *callback_data);
 
+/*
+ * Important note:  The callback must guarantee that either fw_send_response()
+ * or kfree() is called on the @request.
+ */
 typedef void (*fw_address_callback_t)(struct fw_card *card,
  struct fw_request *request,
  int tcode, int destination, int source,

-- 
Stefan Richter
-=-=-=== -=== =--==
http://arcgraph.de/sr/
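
The ownership rule stated in the new header comment can be modelled in
user space. The following is only a sketch under simplifying
assumptions: struct request, send_response() and the ACK values are
placeholders, free() stands in for kfree(), and the request is released
immediately after the simulated send rather than in a later completion
path.

```c
#include <stdlib.h>

/* Placeholder ack codes for this sketch (assumed values). */
#define ACK_COMPLETE 1
#define ACK_PENDING  2

struct request {
	int ack;
};

static int responses_sent;	/* counts simulated response packets */

static struct request *new_request(int ack)
{
	struct request *request = malloc(sizeof(*request));

	if (request)
		request->ack = ack;
	return request;
}

/*
 * Consumes the request on every path. Before the fix, the early
 * return below dropped the last reference without freeing it.
 */
static void send_response(struct request *request)
{
	if (request->ack != ACK_PENDING) {
		free(request);	/* no response is due, but release it */
		return;
	}

	responses_sent++;	/* stand-in for queuing the response */
	free(request);		/* released once the response is done */
}
```

A callback that receives a request therefore has exactly two valid
exits: hand it to send_response(), or free it itself.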


[PATCH 2.6.22.y] fw-ohci: fix "scheduling while atomic"

2007-07-19 Thread Stefan Richter

Date: Thu, 12 Jul 2007 22:25:14 +0200 (CEST)
From: Stefan Richter <[EMAIL PROTECTED]>
Subject: firewire: fw-ohci: fix "scheduling while atomic"

context_stop is called by bus_reset_tasklet, among others, and
therefore must not sleep.

Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
---
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=8735.
Same as commit b980f5a224f3df6c884dbf5ae48797ce352ba139.

 drivers/firewire/fw-ohci.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.22/drivers/firewire/fw-ohci.c
===
--- linux-2.6.22.orig/drivers/firewire/fw-ohci.c
+++ linux-2.6.22/drivers/firewire/fw-ohci.c
@@ -586,7 +586,7 @@ static void context_stop(struct context 
break;
 
fw_notify("context_stop: still active (0x%08x)\n", reg);
-   msleep(1);
+   mdelay(1);
}
 }
 

-- 
Stefan Richter
-=-=-=== -=== =--==
http://arcgraph.de/sr/


Re: [patch] fix the softlockup watchdog to actually work

2007-07-19 Thread Andrew Morton

On Tue, 17 Jul 2007 17:49:34 +0200 Ingo Molnar <[EMAIL PROTECTED]> wrote:

> Subject: fix the softlockup watchdog to actually work
> From: Ingo Molnar <[EMAIL PROTECTED]>
> 
> this Xen related commit:
> 
>commit 966812dc98e6a7fcdf759cbfa0efab77500a8868
>Author: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
>Date:   Tue May 8 00:28:02 2007 -0700
> 
>Ignore stolen time in the softlockup watchdog
> 
> broke the softlockup watchdog to never report any lockups. (!)
> 
> print_timestamp defaults to 0, this makes the following condition
> always true:
> 
>   if (print_timestamp < (touch_timestamp + 1) ||
> 
> and we'll in essence never report soft lockups.
> 
> apparently the functionality of the soft lockup watchdog was never
> actually tested with that patch applied ...
> 
> [this is -stable material too.]

This seems terribly sensitive.

Someone has broken the Vaio (shock, horror).  It now has mysterious
jerkiness: when leaning on autorepeat it stalls for maybe 0.25 seconds
every 1.5 seconds.  The stalls are far less than a second.  Yet this
is enough to trigger random softlockup warnings.

Some of those warnings are below.  Note that the traces are all pretty
useless, as softlockup warnings so often seem to be.

Of course, it could be that whatever is causing these pauses really _is_
stalling for a whole second occasionally, dunno.  But I didn't notice any
long stalls in the console output when a particular storm of softlockup
warnings came out.

But I'll sit on this patch for a while until this gets sorted out. 
Meanwhile, please double-check the elapsed-time arithmetic in there,
maybe do a bit of runtime testing?
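
For reference, the first clause of the condition quoted above can be
looked at in isolation. This is a minimal sketch, assuming both
timestamps are plain unsigned second counters; report_suppressed() is a
made-up name and the other clauses of the real check are left out.

```c
/*
 * Models only the first clause of the quoted condition. A nonzero
 * return means the watchdog returns early without printing anything.
 */
static int report_suppressed(unsigned long print_timestamp,
			     unsigned long touch_timestamp)
{
	return print_timestamp < (touch_timestamp + 1);
}
```

With print_timestamp still at its default of 0, the comparison holds
for any ordinary touch_timestamp, so the early return is always taken.
It only stops holding once print_timestamp has moved past
touch_timestamp, which never happens if nothing is ever printed.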



[   78.820961] BUG: soft lockup detected on CPU#0!
[   78.821083]  [] update_process_times+0x32/0x54
[   78.821216]  [] tick_sched_timer+0x61/0x9c
[   78.821340]  [] hrtimer_interrupt+0x142/0x1d4
[   78.821463]  [] tick_sched_timer+0x0/0x9c
[   78.821587]  [] tick_do_broadcast+0x1f/0x3f
[   78.821707]  [] tick_handle_oneshot_broadcast+0x47/0x72
[   78.821852]  [] timer_interrupt+0x1a/0x20
[   78.821968]  [] handle_IRQ_event+0x1a/0x3f
[   78.822089]  [] handle_edge_irq+0x9d/0xcc
[   78.822206]  [] do_IRQ+0x53/0x6c
[   78.822307]  [] tick_notify+0x15c/0x208
[   78.822422]  [] common_interrupt+0x23/0x28
[   78.822539]  [] clockevents_notify+0x8/0x36
[   78.822663]  [] acpi_processor_idle+0x1d2/0x36d
[   78.822798]  [] cpu_idle+0x44/0x5e
[   78.822900]  [] start_kernel+0x26d/0x275
[   78.823017]  [] unknown_bootoption+0x0/0x202
[   78.823142]  ===
[  106.282830] BUG: soft lockup detected on CPU#0!
[  106.282967]  [] update_process_times+0x32/0x54
[  106.283116]  [] tick_sched_timer+0x61/0x9c
[  106.283255]  [] hrtimer_interrupt+0x142/0x1d4
[  106.283391]  [] tick_sched_timer+0x0/0x9c
[  106.283530]  [] tick_do_broadcast+0x1f/0x3f
[  106.283663]  [] tick_handle_oneshot_broadcast+0x47/0x72
[  106.283821]  [] timer_interrupt+0x1a/0x20
[  106.283949]  [] handle_IRQ_event+0x1a/0x3f
[  106.284084]  [] handle_edge_irq+0x9d/0xcc
[  106.284215]  [] do_IRQ+0x53/0x6c
[  106.284326]  [] tick_notify+0x15c/0x208
[  106.284455]  [] common_interrupt+0x23/0x28
[  106.284587]  [] clockevents_notify+0x8/0x36
[  106.284725]  [] acpi_processor_idle+0x1d2/0x36d
[  106.284875]  [] cpu_idle+0x44/0x5e
[  106.284988]  [] start_kernel+0x26d/0x275
[  106.285117]  [] unknown_bootoption+0x0/0x202
[  106.285257]  ===
[  109.266423] BUG: soft lockup detected on CPU#0!
[  109.266558]  [] update_process_times+0x32/0x54
[  109.266703]  [] tick_sched_timer+0x61/0x9c
[  109.270745]  [] hrtimer_interrupt+0x142/0x1d4
[  109.274790]  [] tick_sched_timer+0x0/0x9c
[  109.278865]  [] tick_do_broadcast+0x1f/0x3f
[  109.282950]  [] tick_handle_oneshot_broadcast+0x47/0x72
[  109.287026]  [] timer_interrupt+0x1a/0x20
[  109.291012]  [] handle_IRQ_event+0x1a/0x3f
[  109.294950]  [] handle_edge_irq+0x9d/0xcc
[  109.298864]  [] do_IRQ+0x53/0x6c
[  109.302818]  [] tick_notify+0x15c/0x208
[  109.306740]  [] common_interrupt+0x23/0x28
[  109.310641]  [] clockevents_notify+0x8/0x36
[  109.314543]  [] acpi_processor_idle+0x1d2/0x36d
[  109.318461]  [] cpu_idle+0x44/0x5e
[  109.322348]  [] start_kernel+0x26d/0x275
[  109.326267]  [] unknown_bootoption+0x0/0x202
[  109.330188]  ===

(ah, the Vaio breakage seems to be -mm-only, whew)


Re: cmpxchg is not available to generic code

2007-07-19 Thread David Miller

From: Andrew Morton <[EMAIL PROTECTED]>
Date: Thu, 19 Jul 2007 00:05:49 -0700

> What's that code doing anyway?  driver-private locking primitives?

It's an atomic lock shared with userspace.  Whatever implementation is
used to do the lock on that object must be identical in the userspace
DRM bits.

Unlike futex, the lock operation on the user side isn't optional.
So if the platform can't do a true cmpxchg it generally cannot
support DRM.
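
To illustrate why both sides need the same primitive, here is a rough
user-space sketch of a lock word taken with compare-and-exchange, using
C11 atomics. It is not the DRM implementation: LOCK_HELD, lock_take()
and lock_release() are hypothetical names, and the layout of the lock
word (owner id plus a "held" bit) is an assumption made for the
example.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LOCK_HELD 0x80000000u	/* hypothetical "held" bit */

/* Try to take the lock for `context`; fails if somebody holds it. */
static bool lock_take(_Atomic unsigned int *lock, unsigned int context)
{
	unsigned int old = atomic_load(lock);

	do {
		if (old & LOCK_HELD)
			return false;
		/* On failure the CAS reloads `old` and we re-check it. */
	} while (!atomic_compare_exchange_weak(lock, &old,
					       context | LOCK_HELD));
	return true;
}

/* Release succeeds only if `context` is the current holder. */
static bool lock_release(_Atomic unsigned int *lock, unsigned int context)
{
	unsigned int expected = context | LOCK_HELD;

	return atomic_compare_exchange_strong(lock, &expected, 0);
}
```

Because the word lives in memory shared by kernel and user space, a
side that replaced the compare-and-exchange with, say, a private
spinlock would no longer exclude the other side.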


[PATCH 4/5] [V2] Define is_global_init() and is_container_init()

2007-07-19 Thread sukadev

Subject: [PATCH 4/5] Define is_global_init() and is_container_init().

From: Serge E. Hallyn <[EMAIL PROTECTED]>


is_init() is an ambiguous name for the pid==1 check.  Split it into
is_global_init() and is_container_init().

A container init has its tsk->pid == 1.

A global init also has its tsk->pid == 1 and its active pid namespace
is the init_pid_ns.  But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.
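
The distinction can be shown with a small stand-alone sketch. The
structs below are cut down to the fields used here and
init_task_sketch is a placeholder for the real init task; only the two
predicates mirror what the patch defines.

```c
/* Cut-down stand-ins for the kernel structures (assumptions). */
struct task_struct {
	int pid;
};

struct pid_namespace {
	struct task_struct *child_reaper;
};

static struct task_struct init_task_sketch = { 1 };
static struct pid_namespace init_pid_ns = { &init_task_sketch };

/* True for the init of any pid namespace, including the global one. */
static int is_container_init(const struct task_struct *tsk)
{
	return tsk->pid == 1;
}

/* True only for the task recorded as the initial namespace's reaper. */
static int is_global_init(const struct task_struct *tsk)
{
	return tsk == init_pid_ns.child_reaper;
}
```

A task with pid 1 inside a container satisfies is_container_init() but
not is_global_init(), because the second check compares task pointers
rather than pid numbers.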

Changelog:

2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
  global init (/sbin/init) process. This would improve performance
  and remove dependence on the task_pid().

2.6.21-mm2-pidns2:

- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
  ppc,avr32}/traps.c for the _exception() call to is_global_init().
  This way, we kill only the container if the container's init has a
  bug rather than force a kernel panic.

Signed-off-by: Serge E. Hallyn <[EMAIL PROTECTED]>
Signed-off-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]>
Acked-by: Pavel Emelianov <[EMAIL PROTECTED]>

Cc: Eric W. Biederman <[EMAIL PROTECTED]>
Cc: Cedric Le Goater <[EMAIL PROTECTED]>
Cc: Dave Hansen <[EMAIL PROTECTED]>
Cc: Herbert Poetzel <[EMAIL PROTECTED]>
---
 arch/alpha/mm/fault.c|2 +-
 arch/arm/mm/fault.c  |2 +-
 arch/arm26/mm/fault.c|2 +-
 arch/avr32/kernel/traps.c|2 +-
 arch/avr32/mm/fault.c|6 +++---
 arch/i386/lib/usercopy.c |2 +-
 arch/i386/mm/fault.c |2 +-
 arch/ia64/mm/fault.c |2 +-
 arch/m68k/mm/fault.c |2 +-
 arch/mips/mm/fault.c |2 +-
 arch/powerpc/kernel/traps.c  |2 +-
 arch/powerpc/mm/fault.c  |2 +-
 arch/powerpc/platforms/pseries/ras.c |2 +-
 arch/ppc/kernel/traps.c  |2 +-
 arch/ppc/mm/fault.c  |2 +-
 arch/s390/lib/uaccess_pt.c   |2 +-
 arch/s390/mm/fault.c |2 +-
 arch/sh/mm/fault.c   |2 +-
 arch/sh64/mm/fault.c |6 +++---
 arch/um/kernel/trap.c|2 +-
 arch/x86_64/mm/fault.c   |2 +-
 arch/xtensa/mm/fault.c   |2 +-
 drivers/char/sysrq.c |2 +-
 include/linux/sched.h|   12 ++--
 kernel/capability.c  |3 ++-
 kernel/exit.c|2 +-
 kernel/kexec.c   |2 +-
 kernel/pid.c |7 +++
 kernel/signal.c  |2 +-
 kernel/sysctl.c  |2 +-
 mm/oom_kill.c|4 ++--
 security/commoncap.c |3 ++-
 32 files changed, 54 insertions(+), 37 deletions(-)

Index: lx26-22-rc6-mm1a/include/linux/sched.h
===
--- lx26-22-rc6-mm1a.orig/include/linux/sched.h 2007-07-16 12:55:15.0 -0700
+++ lx26-22-rc6-mm1a/include/linux/sched.h  2007-07-16 13:10:48.0 -0700
@@ -1219,12 +1219,20 @@ static inline int pid_alive(struct task_
 }
 
 /**
- * is_init - check if a task structure is init
+ * is_global_init - check if a task structure is init
  * @tsk: Task structure to be checked.
  *
  * Check if a task structure is the first user space task the kernel created.
+ *
+ * TODO: We should inline this function after some cleanups in pid_namespace.h
+ */
+extern int is_global_init(struct task_struct *tsk);
+
+/*
+ * is_container_init:
+ * check whether in the task is init in it's own pid namespace.
  */
-static inline int is_init(struct task_struct *tsk)
+static inline int is_container_init(struct task_struct *tsk)
 {
return tsk->pid == 1;
 }
Index: lx26-22-rc6-mm1a/kernel/pid.c
===
--- lx26-22-rc6-mm1a.orig/kernel/pid.c  2007-07-16 12:55:15.0 -0700
+++ lx26-22-rc6-mm1a/kernel/pid.c   2007-07-16 13:10:48.0 -0700
@@ -69,6 +69,13 @@ struct pid_namespace init_pid_ns = {
 	.last_pid = 0,
 	.child_reaper = &init_task
 };
+EXPORT_SYMBOL(init_pid_ns);
+
+int is_global_init(struct task_struct *tsk)
+{
+   return tsk == init_pid_ns.child_reaper;
+}
+EXPORT_SYMBOL(is_global_init);
 
 /*
  * Note: disable interrupts while the pidmap_lock is held as an
Index: lx26-22-rc6-mm1a/arch/alpha/mm/fault.c
===
--- lx26-22-rc6-mm1a.orig/arch/alpha/mm/fault.c 2007-07-16 12:55:15.0 -0700
+++ lx26-22-rc6-mm1a/arch/alpha/mm/fault.c  2007-07-16 13:10:48.0 -0700
@@ -192,7 +192,7 @@ do_page_fault(unsigned long address, uns
/* We ran out of memory, or

[PATCH 5/5] [V2] Move alloc_pid() to copy_process()

2007-07-19 Thread sukadev



Subject: [PATCH 5/5] Move alloc_pid call to copy_process

From: Sukadev Bhattiprolu <[EMAIL PROTECTED]>

Move alloc_pid() into copy_process(). This will keep all pid and pid
namespace code together and simplify error handling when we support
multiple pid namespaces.

Signed-off-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]>

Cc: Pavel Emelianov <[EMAIL PROTECTED]>
Cc: Eric W. Biederman <[EMAIL PROTECTED]>
Cc: Cedric Le Goater <[EMAIL PROTECTED]>
Cc: Dave Hansen <[EMAIL PROTECTED]>
Cc: Serge Hallyn <[EMAIL PROTECTED]>
Cc: Herbert Poetzel <[EMAIL PROTECTED]>
---
 kernel/fork.c |   19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

Index: lx26-22-rc6-mm1a/kernel/fork.c
===
--- lx26-22-rc6-mm1a.orig/kernel/fork.c 2007-07-16 12:55:13.0 -0700
+++ lx26-22-rc6-mm1a/kernel/fork.c  2007-07-17 10:08:12.0 -0700
@@ -1029,6 +1029,12 @@ static struct task_struct *copy_process(
 	if (p->binfmt && !try_module_get(p->binfmt->module))
 		goto bad_fork_cleanup_put_domain;
 
+	if (pid != &init_struct_pid) {
+		pid = alloc_pid();
+		if (!pid)
+			goto bad_fork_put_binfmt_module;
+	}
+
 	p->did_exec = 0;
 	delayacct_tsk_init(p);	/* Must remain after dup_task_struct() */
 	copy_flags(clone_flags, p);
@@ -1316,6 +1322,9 @@ bad_fork_cleanup_container:
 #endif
 	container_exit(p, container_callbacks_done);
 	delayacct_tsk_free(p);
+	if (pid != &init_struct_pid)
+		free_pid(pid);
+bad_fork_put_binfmt_module:
 	if (p->binfmt)
 		module_put(p->binfmt->module);
 bad_fork_cleanup_put_domain:
@@ -1380,19 +1389,16 @@ long do_fork(unsigned long clone_flags,
 {
 	struct task_struct *p;
 	int trace = 0;
-	struct pid *pid = alloc_pid();
 	long nr;
 
-	if (!pid)
-		return -EAGAIN;
-	nr = pid->nr;
 	if (unlikely(current->ptrace)) {
 		trace = fork_traceflag (clone_flags);
 		if (trace)
 			clone_flags |= CLONE_PTRACE;
 	}
 
-	p = copy_process(clone_flags, stack_start, regs, stack_size, parent_tidptr, child_tidptr, pid);
+	p = copy_process(clone_flags, stack_start, regs, stack_size,
+			parent_tidptr, child_tidptr, NULL);
/*
 * Do this prior waking up the new thread - the thread pointer
 * might get invalid after that point, if the thread exits quickly.
@@ -1400,6 +1406,8 @@ long do_fork(unsigned long clone_flags,
if (!IS_ERR(p)) {
struct completion vfork;
 
+   nr = pid_nr(task_pid(p));
+
if (clone_flags & CLONE_VFORK) {
p->vfork_done = &vfork;
init_completion(&vfork);
@@ -1433,7 +1441,6 @@ long do_fork(unsigned long clone_flags,
}
}
} else {
-   free_pid(pid);
nr = PTR_ERR(p);
}
return nr;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/5] [V2] Rename child_reaper() function

2007-07-19 Thread sukadev

Pavel,

Pls ack this if you agree.

Suka
---

Subject: [PATCH 2/5] Rename child_reaper function.

From: Sukadev Bhattiprolu <[EMAIL PROTECTED]>


Rename the child_reaper() function to task_child_reaper() to be
similar to other task_* functions and to distinguish the function
from 'struct pid_namespace.child_reaper'.

Signed-off-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]>
---
 fs/exec.c |2 +-
 include/linux/pid_namespace.h |2 +-
 kernel/exit.c |4 ++--
 kernel/signal.c   |2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

Index: lx26-22-rc6-mm1/include/linux/pid_namespace.h
===
--- lx26-22-rc6-mm1.orig/include/linux/pid_namespace.h  2007-07-13 
13:07:48.0 -0700
+++ lx26-22-rc6-mm1/include/linux/pid_namespace.h   2007-07-13 
13:12:01.0 -0700
@@ -44,7 +44,7 @@ static inline struct pid_namespace *task
return tsk->nsproxy->pid_ns;
 }
 
-static inline struct task_struct *child_reaper(struct task_struct *tsk)
+static inline struct task_struct *task_child_reaper(struct task_struct *tsk)
 {
return init_pid_ns.child_reaper;
 }
Index: lx26-22-rc6-mm1/fs/exec.c
===
--- lx26-22-rc6-mm1.orig/fs/exec.c  2007-07-13 13:07:48.0 -0700
+++ lx26-22-rc6-mm1/fs/exec.c   2007-07-13 13:12:01.0 -0700
@@ -826,7 +826,7 @@ static int de_thread(struct task_struct 
 * Reparenting needs write_lock on tasklist_lock,
 * so it is safe to do it under read_lock.
 */
-   if (unlikely(tsk->group_leader == child_reaper(tsk)))
+   if (unlikely(tsk->group_leader == task_child_reaper(tsk)))
task_active_pid_ns(tsk)->child_reaper = tsk;
 
zap_other_threads(tsk);
Index: lx26-22-rc6-mm1/kernel/exit.c
===
--- lx26-22-rc6-mm1.orig/kernel/exit.c  2007-07-13 13:07:48.0 -0700
+++ lx26-22-rc6-mm1/kernel/exit.c   2007-07-13 13:12:01.0 -0700
@@ -695,7 +695,7 @@ forget_original_parent(struct task_struc
do {
reaper = next_thread(reaper);
if (reaper == father) {
-   reaper = child_reaper(father);
+   reaper = task_child_reaper(father);
break;
}
} while (reaper->exit_state);
@@ -908,7 +908,7 @@ fastcall NORET_TYPE void do_exit(long co
panic("Aiee, killing interrupt handler!");
if (unlikely(!tsk->pid))
panic("Attempted to kill the idle task!");
-   if (unlikely(tsk == child_reaper(tsk))) {
+   if (unlikely(tsk == task_child_reaper(tsk))) {
if (task_active_pid_ns(tsk) != &init_pid_ns)
task_active_pid_ns(tsk)->child_reaper =
init_pid_ns.child_reaper;
Index: lx26-22-rc6-mm1/kernel/signal.c
===
--- lx26-22-rc6-mm1.orig/kernel/signal.c2007-07-13 13:06:52.0 
-0700
+++ lx26-22-rc6-mm1/kernel/signal.c 2007-07-13 13:12:01.0 -0700
@@ -1853,7 +1853,7 @@ relock:
 * within that pid space. It can of course get signals from
 * its parent pid space.
 */
-   if (current == child_reaper(current))
+   if (current == task_child_reaper(current))
continue;
 
if (sig_kernel_stop(signr)) {

[PATCH 3/5] [V2] Use task_pid() to find leader's pid

2007-07-19 Thread sukadev


Subject: [PATCH 3/5] Use task_pid() to find leader's pid

From: Sukadev Bhattiprolu <[EMAIL PROTECTED]>

Use task_pid() to get leader's 'struct pid' and avoid the find_pid().

Signed-off-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]>
Acked-by: Pavel Emelianov <[EMAIL PROTECTED]>

Cc: Eric W. Biederman <[EMAIL PROTECTED]>
Cc: Cedric Le Goater <[EMAIL PROTECTED]>
Cc: Dave Hansen <[EMAIL PROTECTED]>
Cc: Serge Hallyn <[EMAIL PROTECTED]>
Cc: Herbert Poetzel <[EMAIL PROTECTED]>
---
 fs/exec.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: lx26-22-rc6-mm1a/fs/exec.c
===
--- lx26-22-rc6-mm1a.orig/fs/exec.c 2007-07-13 18:23:55.0 -0700
+++ lx26-22-rc6-mm1a/fs/exec.c  2007-07-16 12:56:22.0 -0700
@@ -908,7 +908,7 @@ static int de_thread(struct task_struct 
 */
detach_pid(tsk, PIDTYPE_PID);
tsk->pid = leader->pid;
-   attach_pid(tsk, PIDTYPE_PID,  find_pid(tsk->pid));
+   attach_pid(tsk, PIDTYPE_PID,  task_pid(leader));
transfer_pid(leader, tsk, PIDTYPE_PGID);
transfer_pid(leader, tsk, PIDTYPE_SID);
list_replace_rcu(&leader->tasks, &tsk->tasks);

[PATCH 1/5] [V2] Define and use task_active_pid_ns() wrapper

2007-07-19 Thread sukadev


Subject: [PATCH 1/5] Define and use task_active_pid_ns() wrapper

From: Sukadev Bhattiprolu <[EMAIL PROTECTED]>

With multiple pid namespaces, a process is known by some pid_t in
every ancestor pid namespace.  Every time the process forks, the
child process also gets a pid_t in every ancestor pid namespace.

While a process is visible in >=1 pid namespaces, it can see pid_t's
in only one pid namespace.  We call this pid namespace its "active
pid namespace", and it is always the youngest pid namespace in which
the process is known.

This patch defines and uses a wrapper to find the active pid namespace
of a process. The implementation of the wrapper will be changed
when support for multiple pid namespaces is added.
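
To make the "active pid namespace" idea concrete, here is a small userspace
model in C. It is a sketch under assumptions, not what this patch
implements (the wrapper here simply returns tsk->nsproxy->pid_ns). All
model_* names and MAX_LEVELS are hypothetical: a pid carries one number per
namespace level, and the active namespace is the one at the deepest level.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 4	/* arbitrary nesting limit for this sketch */

struct model_pid_ns {
	int level;			/* 0 for the initial namespace */
	struct model_pid_ns *parent;	/* NULL for the initial namespace */
};

struct model_upid {
	int nr;				/* pid_t value as seen in 'ns' */
	struct model_pid_ns *ns;
};

struct model_pid {
	int level;			/* depth of the youngest namespace */
	struct model_upid numbers[MAX_LEVELS];
};

/* The active namespace is the youngest one the process is known in. */
static struct model_pid_ns *model_active_pid_ns(const struct model_pid *pid)
{
	assert(pid->level >= 0 && pid->level < MAX_LEVELS);
	return pid->numbers[pid->level].ns;
}

/* pid_t value as seen from 'ns', or 0 if the process is not visible there. */
static int model_pid_nr_ns(const struct model_pid *pid,
			   const struct model_pid_ns *ns)
{
	if (ns->level <= pid->level && pid->numbers[ns->level].ns == ns)
		return pid->numbers[ns->level].nr;
	return 0;
}
```

With a root namespace and one child namespace, a process could be pid 1234
in the root and pid 1 in the child; model_active_pid_ns() then returns the
child namespace, while model_pid_nr_ns() answers for either level.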

Changelog:
2.6.22-rc4-mm2-pidns1:
- [Pavel Emelianov, Alexey Dobriyan] Back out the change to use
  task_active_pid_ns() in child_reaper() since task->nsproxy
  can be NULL during task exit (so child_reaper() continues to
  use init_pid_ns).

2.6.21-rc6-mm1:
- Rename task_pid_ns() to task_active_pid_ns() to reflect that a
  process can have multiple pid namespaces.

Signed-off-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]>
Acked-by: Pavel Emelianov <[EMAIL PROTECTED]>

Cc: Eric W. Biederman <[EMAIL PROTECTED]>
Cc: Cedric Le Goater <[EMAIL PROTECTED]>
Cc: Dave Hansen <[EMAIL PROTECTED]>
Cc: Serge Hallyn <[EMAIL PROTECTED]>
Cc: Herbert Poetzel <[EMAIL PROTECTED]>
---
 fs/exec.c |2 +-
 fs/proc/proc_misc.c   |3 ++-
 include/linux/pid_namespace.h |7 ++-
 kernel/exit.c |5 +++--
 kernel/nsproxy.c  |2 +-
 kernel/pid.c  |4 ++--
 6 files changed, 15 insertions(+), 8 deletions(-)

Index: lx26-22-rc6-mm1/include/linux/pid_namespace.h
===
--- lx26-22-rc6-mm1.orig/include/linux/pid_namespace.h  2007-07-13 
13:07:01.0 -0700
+++ lx26-22-rc6-mm1/include/linux/pid_namespace.h   2007-07-13 
18:22:49.0 -0700
@@ -20,7 +20,7 @@ struct pid_namespace {
struct pidmap pidmap[PIDMAP_ENTRIES];
int last_pid;
struct task_struct *child_reaper;
-   struct kmem_cache_t *pid_cachep;
+   struct kmem_cache *pid_cachep;
 };
 
 extern struct pid_namespace init_pid_ns;
@@ -39,6 +39,11 @@ static inline void put_pid_ns(struct pid
kref_put(&ns->kref, free_pid_ns);
 }
 
+static inline struct pid_namespace *task_active_pid_ns(struct task_struct *tsk)
+{
+   return tsk->nsproxy->pid_ns;
+}
+
 static inline struct task_struct *child_reaper(struct task_struct *tsk)
 {
return init_pid_ns.child_reaper;
Index: lx26-22-rc6-mm1/fs/exec.c
===
--- lx26-22-rc6-mm1.orig/fs/exec.c  2007-07-13 13:05:38.0 -0700
+++ lx26-22-rc6-mm1/fs/exec.c   2007-07-13 18:13:39.0 -0700
@@ -827,7 +827,7 @@ static int de_thread(struct task_struct 
 * so it is safe to do it under read_lock.
 */
if (unlikely(tsk->group_leader == child_reaper(tsk)))
-   tsk->nsproxy->pid_ns->child_reaper = tsk;
+   task_active_pid_ns(tsk)->child_reaper = tsk;
 
zap_other_threads(tsk);
read_unlock(&tasklist_lock);
Index: lx26-22-rc6-mm1/fs/proc/proc_misc.c
===
--- lx26-22-rc6-mm1.orig/fs/proc/proc_misc.c2007-07-13 13:05:38.0 
-0700
+++ lx26-22-rc6-mm1/fs/proc/proc_misc.c 2007-07-13 13:07:48.0 -0700
@@ -94,7 +94,8 @@ static int loadavg_read_proc(char *page,
LOAD_INT(a), LOAD_FRAC(a),
LOAD_INT(b), LOAD_FRAC(b),
LOAD_INT(c), LOAD_FRAC(c),
-   nr_running(), nr_threads, current->nsproxy->pid_ns->last_pid);
+   nr_running(), nr_threads,
+   task_active_pid_ns(current)->last_pid);
return proc_calc_metrics(page, start, off, count, eof, len);
 }
 
Index: lx26-22-rc6-mm1/kernel/exit.c
===
--- lx26-22-rc6-mm1.orig/kernel/exit.c  2007-07-13 13:06:52.0 -0700
+++ lx26-22-rc6-mm1/kernel/exit.c   2007-07-13 18:13:39.0 -0700
@@ -909,8 +909,9 @@ fastcall NORET_TYPE void do_exit(long co
if (unlikely(!tsk->pid))
panic("Attempted to kill the idle task!");
if (unlikely(tsk == child_reaper(tsk))) {
-   if (tsk->nsproxy->pid_ns != &init_pid_ns)
-   tsk->nsproxy->pid_ns->child_reaper = 
init_pid_ns.child_reaper;
+   if (task_active_pid_ns(tsk) != &init_pid_ns)
+   task_active_pid_ns(tsk)->child_reaper =
+   init_pid_ns.child_reaper;
else

Re: [patch 1/3] ps3: Disk Storage Driver

2007-07-19 Thread Geert Uytterhoeven

On Thu, 19 Jul 2007, Jens Axboe wrote:
> On Wed, Jul 18 2007, Andrew Morton wrote:
> > On Mon, 16 Jul 2007 18:15:40 +0200
> > Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:
> > 
> > > From: Geert Uytterhoeven <[EMAIL PROTECTED]>
> > > 
> > > Add a Disk Storage Driver for the PS3:
> > 
> > Your patchset significantly hits powerpc, scsi and block.  So who gets to
> > merge this?  Jens?  James?  Paul?
> > 
> > Me, I guess ;)
> 
> I think Paul was going to take it, or at least Geert hinted as such.

Yep, but as I heard Paul is on holidays, I was just going to send it to Andrew
anyway.

With kind regards,

Geert Uytterhoeven
Software Architect

Sony Network and Software Technology Center Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium

Phone:+32 (0)2 700 8453 
Fax:  +32 (0)2 700 8622 
E-mail:   [EMAIL PROTECTED] 
Internet: http://www.sony-europe.com/

Sony Network and Software Technology Center Europe  
A division of Sony Service Centre (Europe) N.V. 
Registered office: Technologielaan 7 · B-1840 Londerzeel · Belgium  
VAT BE 0413.825.160 · RPR Brussels  
Fortis Bank Zaventem · Swift GEBABEBB08A · IBAN BE39001382358619

Re: [PATCH 1/1] i386: Geode's TSC is not neccessary to mark tu unstable

2007-07-19 Thread Andres Salomon

On Thu, 19 Jul 2007 08:49:05 +0200
Juergen Beisert <[EMAIL PROTECTED]> wrote:

> On Thursday 19 July 2007 03:02, Andrew Morton wrote:
> > On Sun, 15 Jul 2007 21:06:27 +0200
> >
> > Juergen Beisert <[EMAIL PROTECTED]> wrote:
> > > Replace NSC/Cyrix specific chipset access macros by inlined functions.
> > > With the macros a line like this fails (and does nothing):
> > >   setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);
> > > With inlined functions this line will work as expected.
> >
> > I don't get it.  Why would the macros behave differently from inlined
> > functions?
> 
> X86 magic. The access order is important. The first access must always be the 
> offset at 0x22. This access enables the next access to 0x23 (data). If you do 
> it in wrong order, it fails. With the macros you get something like 0x22, 
> 0x22, 0x23, 0x23. With the inline functions 0x22,0x23,0x22,0x23.
> 
> Juergen

Wow, that's a really cool bug; nice work!  Don't forget to update
arch/i386/kernel/cpu/mtrr/state.c, though; it uses setCx86() as well.  It needs
to include processor-cyrix.h.


Acked-by: Andres Salomon <[EMAIL PROTECTED]>

-- 
Andres Salomon <[EMAIL PROTECTED]>

[PATCH 0/5][V2] Misc helper patches for pid namespaces

2007-07-19 Thread sukadev

Some helper patches to support multiple pid namespaces. These
were posted earlier on Containers@ mailing list. 

[PATCH 1/5] Define and use task_active_pid_ns() wrapper
[PATCH 2/5] Rename child_reaper() function.
[PATCH 3/5] Use task_pid() to find leader's pid
[PATCH 4/5] Define is_global_init() and is_container_init().
[PATCH 5/5] Move alloc_pid() to copy_process()

Changelog:

- Addressed Oleg Nesterov's comments on [V1] of this patchset.

cmpxchg is not available to generic code

2007-07-19 Thread Andrew Morton

arm:

drivers/char/drm/drm_lock.c: In function `drm_lock_take':
drivers/char/drm/drm_lock.c:221: error: implicit declaration of function 
`cmpxchg'

You might be able to use atomic_cmpxchg, which _is_ present
on all architectures.  Or use a spinlock.

What's that code doing anyway?  driver-private locking primitives?
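
For illustration, the compare-and-swap style of lock take suggested above
might look roughly like this. It is a userspace sketch using C11
<stdatomic.h>, not the kernel's atomic_cmpxchg() and not the actual
drm_lock.c code; sketch_lock_take(), sketch_lock_free() and
SKETCH_LOCK_HELD are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_LOCK_HELD 0x80000000u	/* hypothetical "held" flag bit */

/* Try to take the lock word on behalf of 'context'; true on success. */
static bool sketch_lock_take(_Atomic unsigned int *lock, unsigned int context)
{
	unsigned int old = atomic_load(lock);
	unsigned int desired;

	do {
		if (old & SKETCH_LOCK_HELD)
			return false;	/* somebody else holds it */
		desired = context | SKETCH_LOCK_HELD;
		/*
		 * On failure the call refreshes 'old' with the current
		 * value and the loop re-checks the held bit.
		 */
	} while (!atomic_compare_exchange_weak(lock, &old, desired));

	return true;
}

/* Drop the held bit; the context bits are left in place. */
static void sketch_lock_free(_Atomic unsigned int *lock)
{
	unsigned int old = atomic_load(lock);

	assert(old & SKETCH_LOCK_HELD);
	atomic_store(lock, old & ~SKETCH_LOCK_HELD);
}
```

The alternative mentioned above, a spinlock around a plain read-modify-write
of the lock word, trades the retry loop for a short critical section.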

Re: [PATCH 3/3]x86_64: early_printk for early debug port support

2007-07-19 Thread Yinghai Lu


On 7/18/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
> On Monday 21 May 2007 07:19:18 Yinghai Lu wrote:
> > add early dbgp to early_printk.
> >
> > kernel command line:
> > earlyprintk=dbgp
> > or
> > earlyprintk=dbgp1
>
> Just checking some old patches. Was there ever an update for this one?
> What were the testing results?
>
> -Andi

Please check the attachment; it is the diff against Linus' current git tree.

After removing the PCI quirks for USB handoff, I get the boot log up to the
point where ohci tries to reset the port with the debug device; that reset
fails.

Maybe Greg could continue debugging it.

YH
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9a54148..956d8dc 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -550,11 +550,12 @@ and is between 256 and 4096 characters. It is defined in the file
 	earlyprintk=	[IA-32,X86-64,SH]
 			earlyprintk=vga
 			earlyprintk=serial[,ttySn[,baudrate]]
+			earlyprintk=dbgp
 
 			Append ",keep" to not disable it when the real console
 			takes over.
 
-			Only vga or serial at a time, not both.
+			Only vga or serial or usb debug port at a time.
 
 			Currently only ttyS0 and ttyS1 are supported.
 
diff --git a/arch/x86_64/kernel/early_printk.c b/arch/x86_64/kernel/early_printk.c
index fd9aff3..3621d68 100644
--- a/arch/x86_64/kernel/early_printk.c
+++ b/arch/x86_64/kernel/early_printk.c
@@ -3,10 +3,19 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#define EARLY_PRINTK
+#include "../../../drivers/usb/host/ehci.h"
 
 /* Simple VGA output */
 
@@ -156,6 +165,594 @@ static struct console early_serial_console = {
 	.index =	-1,
 };
 
+
+static struct ehci_caps __iomem *ehci_caps;
+static struct ehci_regs __iomem *ehci_regs;
+static struct ehci_dbg_port __iomem *ehci_debug;
+static unsigned dbgp_endpoint_out;
+
+#define USB_DEBUG_DEVNUM 127
+
+#define DBGP_DATA_TOGGLE	0x8800
+
+static inline u32 dbgp_pid_update(u32 x, u32 tok)
+{
+	return ((((x) ^ DBGP_DATA_TOGGLE) & 0xffff00) | ((tok) & 0xff));
+}
+
+static inline u32 dbgp_len_update(u32 x, u32 len)
+{
+	return (((x) & ~0x0f) | ((len) & 0x0f));
+}
+
+/*
+ * USB Packet IDs (PIDs)
+ */
+
+/* token */
+#define USB_PID_OUT		0xe1
+#define USB_PID_IN		0x69
+#define USB_PID_SOF		0xa5
+#define USB_PID_SETUP		0x2d
+/* handshake */
+#define USB_PID_ACK		0xd2
+#define USB_PID_NAK		0x5a
+#define USB_PID_STALL		0x1e
+#define USB_PID_NYET		0x96
+/* data */
+#define USB_PID_DATA0		0xc3
+#define USB_PID_DATA1		0x4b
+#define USB_PID_DATA2		0x87
+#define USB_PID_MDATA		0x0f
+/* Special */
+#define USB_PID_PREAMBLE	0x3c
+#define USB_PID_ERR		0x3c
+#define USB_PID_SPLIT		0x78
+#define USB_PID_PING		0xb4
+#define USB_PID_UNDEF_0		0xf0
+
+#define USB_PID_DATA_TOGGLE	0x88
+#define DBGP_CLAIM (DBGP_OWNER | DBGP_ENABLED | DBGP_INUSE)
+
+#define PCI_CAP_ID_EHCI_DEBUG	0xa
+
+#define HUB_ROOT_RESET_TIME	50	/* times are in msec */
+#define HUB_SHORT_RESET_TIME	10
+#define HUB_LONG_RESET_TIME	200
+#define HUB_RESET_TIMEOUT	500
+
+#define DBGP_MAX_PACKET		8
+
+static int dbgp_wait_until_complete(void)
+{
+	unsigned ctrl;
+	int loop = 0x10;
+
+	do {
+		ctrl = readl(&ehci_debug->control);
+		/* Stop when the transaction is finished */
+		if (ctrl & DBGP_DONE)
+			break;
+	} while (--loop > 0);
+
+	if (!loop)
+		return -1;
+
+	/* Now that we have observed the completed transaction,
+	 * clear the done bit.
+	 */
+	writel(ctrl | DBGP_DONE, &ehci_debug->control);
+	return (ctrl & DBGP_ERROR) ? -DBGP_ERRCODE(ctrl) : DBGP_LEN(ctrl);
+}
+
+static void dbgp_mdelay(int ms)
+{
+	int i;
+	while (ms--) {
+		for (i = 0; i < 1000; i++)
+			outb(0x1, 0x80);
+	}
+}
+
+static void dbgp_breath(void)
+{
+	/* Sleep to give the debug port a chance to breathe */
+}
+
+static int dbgp_wait_until_done(unsigned ctrl)
+{
+	unsigned pids, lpid;
+	int ret;
+	int loop = 3;
+
+retry:
+	writel(ctrl | DBGP_GO, &ehci_debug->control);
+	ret = dbgp_wait_until_complete();
+	pids = readl(&ehci_debug->pids);
+	lpid = DBGP_PID_GET(pids);
+
+	if (ret < 0)
+		return ret;
+
+	/* If the port is getting full or it has dropped data
+	 * start pacing ourselves, not necessary but it's friendly.
+	 */
+	if ((lpid == USB_PID_NAK) || (lpid == USB_PID_NYET))
+		dbgp_breath();
+
+	/* If I get a NACK reissue the transmission */
+	if (lpid == USB_PID_NAK) {
+		if(--loop > 0)
+			goto retry;
+	}
+
+	return ret;
+}
+
+static void dbgp_set_data(const void *buf, int size)
+{
+	const unsigned char *bytes = buf;
+	unsigned lo, hi;
+	int i;
+
+	lo = hi = 0;
+	for (i = 0; i < 4 && i < size; i++)
+		lo |= bytes[i] << (8*i);
+	for (; i < 8 && i < size; i++)
+		hi |= bytes[i] << (8*(i - 4));
+	writel(lo, &ehci_debug->data03);
+	writel(hi, &ehci_debug->data47);
+}
+
+static void dbgp_get_data(void *buf, int size)
+{
+	unsigned char *bytes = buf;
+	unsigned lo, hi;
+	int i;
+	lo = readl(&ehci_debug->data03);
+	hi = readl(&ehci_debug->data47);
+	for (i = 0; i < 4 && i < size; i++)
+		bytes[i] = (lo >> (8*i)) &

Re: [BUGFIX]{PATCH] flush icache on ia64 take2

2007-07-19 Thread KAMEZAWA Hiroyuki

On Fri, 6 Jul 2007 11:29:01 +0900
KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:

> This is a patch for fixing icache flush race in ia64(Montecito) by 
> implementing
> flush_icache_page() at el.
> 
> Changelog:
>  - updated against 2.6.22-rc7 (previous one was against 2.6.21)
>  - removed hugetlbe's lazy_mmu_prot_update().
>  - rewrote patch description.
>  - removed patch against mprotect() if flushes cache.
> 
So, what more should I do to get this SIGILL problem fixed?

-Kame


Re: 2.6.22-git10 compile error

2007-07-19 Thread Cornelia Huck

On Thu, 19 Jul 2007 09:28:26 +0300,
Plamen Petrov <[EMAIL PROTECTED]> wrote:

> Hi, all!
> 
> Just for the record - linux kernel version 2.6.22-git10 fails to build
> with the following error:
> 
> In file included from net/netfilter/xt_connlimit.c:27:
> include/net/netfilter/nf_conntrack.h:99: error: field `ct_general' has 
> incomplete type
> include/net/netfilter/nf_conntrack.h: In function `nf_ct_get':
> include/net/netfilter/nf_conntrack.h:163: error: structure has no member 
> named `nfct'
> include/net/netfilter/nf_conntrack.h: In function `nf_ct_put':
> include/net/netfilter/nf_conntrack.h:170: error: implicit declaration of 
> function `nf_conntrack_put'
> include/net/netfilter/nf_conntrack.h: In function `nf_ct_is_untracked':
> include/net/netfilter/nf_conntrack.h:252: error: structure has no member 
> named `nfct'
> In file included from net/netfilter/xt_connlimit.c:28:
> include/net/netfilter/nf_conntrack_core.h: In function 
> `nf_conntrack_confirm':
> include/net/netfilter/nf_conntrack_core.h:68: error: structure has no 
> member named `nfct'
> make[2]: *** [net/netfilter/xt_connlimit.o] Error 1
> make[1]: *** [net/netfilter] Error 2
> make: *** [net] Error 2

This is fixed with commit 3fd8f9e4b6c184d03d340bc86630f700de967fa8.

Re: [PATCH 1/1] i386: Geode's TSC is not neccessary to mark tu unstable

2007-07-19 Thread Juergen Beisert

On Thursday 19 July 2007 03:02, Andrew Morton wrote:
> On Sun, 15 Jul 2007 21:06:27 +0200
>
> Juergen Beisert <[EMAIL PROTECTED]> wrote:
> > Replace NSC/Cyrix specific chipset access macros by inlined functions.
> > With the macros a line like this fails (and does nothing):
> > setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);
> > With inlined functions this line will work as expected.
>
> I don't get it.  Why would the macros behave differently from inlined
> functions?

X86 magic. The access order is important. The first access must always be the 
offset at 0x22. This access enables the next access to 0x23 (data). If you do 
it in wrong order, it fails. With the macros you get something like 0x22, 
0x22, 0x23, 0x23. With the inline functions 0x22,0x23,0x22,0x23.
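
A small userspace simulation shows the ordering difference. It is a sketch
under assumptions: sim_outb()/sim_inb() only log which port is touched, the
macros are modelled on the index/data pattern described above rather than
copied from the kernel header, and SKETCH_CCR2 is a placeholder register
index.

```c
#include <assert.h>

#define LOG_MAX 16
#define SKETCH_CCR2 0xc2	/* placeholder register index for the sketch */

static unsigned short port_log[LOG_MAX];	/* ports touched, in order */
static int port_log_len;

/* Simulated port write: only records which port was accessed. */
static void sim_outb(unsigned char value, unsigned short port)
{
	(void)value;
	assert(port_log_len < LOG_MAX);
	port_log[port_log_len++] = port;
}

/* Simulated port read: records the port and returns a dummy value. */
static unsigned char sim_inb(unsigned short port)
{
	assert(port_log_len < LOG_MAX);
	port_log[port_log_len++] = port;
	return 0;
}

/* Macro versions: the 'data' argument is expanded in the middle of the body. */
#define GET_CX86_MACRO(reg) (sim_outb((reg), 0x22), sim_inb(0x23))
#define SET_CX86_MACRO(reg, data) do {	\
	sim_outb((reg), 0x22);		\
	sim_outb((data), 0x23);		\
} while (0)

/* Inline versions: arguments are fully evaluated before the body runs. */
static inline unsigned char get_cx86_inline(unsigned char reg)
{
	sim_outb(reg, 0x22);
	return sim_inb(0x23);
}

static inline void set_cx86_inline(unsigned char reg, unsigned char data)
{
	sim_outb(reg, 0x22);
	sim_outb(data, 0x23);
}

static void run_macro_version(void)
{
	port_log_len = 0;
	SET_CX86_MACRO(SKETCH_CCR2, GET_CX86_MACRO(SKETCH_CCR2) | 0x88);
}

static void run_inline_version(void)
{
	port_log_len = 0;
	set_cx86_inline(SKETCH_CCR2, get_cx86_inline(SKETCH_CCR2) | 0x88);
}
```

After run_macro_version() the log holds 0x22, 0x22, 0x23, 0x23, because the
nested read is expanded after the outer index write. After
run_inline_version() it holds 0x22, 0x23, 0x22, 0x23.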

Juergen

Re: regression: disk error loop (panic?) ide_do_rw_disk-bad:

2007-07-19 Thread Jens Axboe

On Thu, Jul 19 2007, Giacomo Catenazzi wrote:
> Linus Torvalds wrote:
> > 
> > On Thu, 19 Jul 2007, Bartlomiej Zolnierkiewicz wrote:
> >> Thanks for finding and fixing this.
> >>
> >> The latest patch (with additional cleanups) also looks good and should be
> >> safe enough (unchanged behavior for all non-pc requests) to merge it now.
> >>
> >> Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
> > 
> > Ok, Jens - mind signing off on the patch you sent out, and writing an 
> > explanatory message? Feel free to just crib from my explanation of my 
> > original patch, or whatever.
> > 
> > And it would be beautiful if people who saw the bad behaviour before 
> > reverting the ide.c changes were to go back to that broken state, and try 
> > the patch, and just verify that it acts like it should (ie you should see 
> > just a few error messages, and it shouldn't cause the IDE layer to go 
> > ballistic any more).
> 
> Ok, I tested a5fcaa210626a79465321e344c91a6a7dc3881fa , with
> the Jeans' patch with clean-up (Message-ID:
> <[EMAIL PROTECTED]>).
> 
> I don't see the error loop. but only 4 errors (2 for each hd, at hddtemp
> start)
> 
> Jul 19 08:22:19 catee kernel: hda: selected mode 0x45
> Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hda:
> type=2, flags=104c8
> Jul 19 08:22:23 catee kernel:
> Jul 19 08:22:23 catee kernel: sector 14657019, nr/cnr 0/0
> Jul 19 08:22:23 catee kernel: bio c21a4780, biotail c21a4780, buffer
> , data , len 36
> Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
> 00 00 00 00
> Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hdc:
> type=2, flags=104c8
> Jul 19 08:22:23 catee kernel:
> Jul 19 08:22:23 catee kernel: sector 34711027, nr/cnr 0/0
> Jul 19 08:22:23 catee kernel: bio c21a4740, biotail c21a4740, buffer
> , data , len 36
> Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
> 00 00 00 00
> Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hdc:
> type=2, flags=104c8
> Jul 19 08:22:23 catee kernel:
> Jul 19 08:22:23 catee kernel: sector 7387152, nr/cnr 0/0
> Jul 19 08:22:23 catee kernel: bio c21a4900, biotail c21a4900, buffer
> , data , len 36
> Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
> 00 00 00 00
> Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hda:
> type=2, flags=104c8
> Jul 19 08:22:23 catee kernel:
> Jul 19 08:22:23 catee kernel: sector 7387152, nr/cnr 0/0
> Jul 19 08:22:23 catee kernel: bio c21a4900, biotail c21a4900, buffer
> , data , len 36
> Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
> 00 00 00 00

Perfect, thanks a lot for testing!

Tested-By: Giacomo Catenazzi <[EMAIL PROTECTED]>

Linus, if you merge the patch I sent, can you just add this Tested-by?

-- 
Jens Axboe


Re: [patch 0/3] PS3 Storage Drivers for 2.6.23, take 5

2007-07-19 Thread Alessandro Rubini


Hello.

> I didn't hear anything from the misc device maintainer (for the FLASH ROM
> Storage Driver).

Actually, I am not acting as a maintainer. I'm not active enough nor
up to date with all the structure of kernel maintenance. So please
don't wait for me.

Actually, I tried a couple of times to have my name removed from the
MAINTAINERS file over the years, without success. I didn't care a lot
because nobody really used that entry. I think it's time for me to
learn how to do it in the proper way.

Regards
/alessandro

Re: System hangs on running kernbench

2007-07-19 Thread Dhaval Giani

> (gdb) thread 6
> [Switching to thread 6 (process 6233)]#0  __do_softirq ()
> at kernel/softirq.c:231
> 231 if (pending & 1) {
> (gdb) bt
> #0  __do_softirq () at kernel/softirq.c:231
> #1  0xc012998b in do_softirq () at kernel/softirq.c:269
> #2  0xc0129a09 in irq_exit () at kernel/softirq.c:305
> #3  0xc0117443 in smp_apic_timer_interrupt (regs=Variable "regs" is not 
> available.
> )
> at arch/i386/kernel/apic.c:592
> #4  0xc0105877 in apic_timer_interrupt () at include/asm/current.h:11
> #5  0xc0564480 in contig_page_data ()
> #6  0x0007 in ?? ()
> #7  0x0001 in ?? ()
> #8  0x in ?? ()

Looks interesting.

-- 
regards,
Dhaval

I would like to change the world but they don't give me the source code!

Re: regression: disk error loop (panic?) ide_do_rw_disk-bad:

2007-07-19 Thread Giacomo Catenazzi

Linus Torvalds wrote:
> 
> On Thu, 19 Jul 2007, Bartlomiej Zolnierkiewicz wrote:
>> Thanks for finding and fixing this.
>>
>> The latest patch (with additional cleanups) also looks good and should be
>> safe enough (unchanged behavior for all non-pc requests) to merge it now.
>>
>> Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
> 
> Ok, Jens - mind signing off on the patch you sent out, and writing an 
> explanatory message? Feel free to just crib from my explanation of my 
> original patch, or whatever.
> 
> And it would be beautiful if people who saw the bad behaviour before 
> reverting the ide.c changes were to go back to that broken state, and try 
> the patch, and just verify that it acts like it should (ie you should see 
> just a few error messages, and it shouldn't cause the IDE layer to go 
> ballistic any more).

Ok, I tested a5fcaa210626a79465321e344c91a6a7dc3881fa with
Jens' patch with clean-up (Message-ID:
<[EMAIL PROTECTED]>).

I don't see the error loop, only 4 errors (2 for each hd, at hddtemp
start):

Jul 19 08:22:19 catee kernel: hda: selected mode 0x45
Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hda:
type=2, flags=104c8
Jul 19 08:22:23 catee kernel:
Jul 19 08:22:23 catee kernel: sector 14657019, nr/cnr 0/0
Jul 19 08:22:23 catee kernel: bio c21a4780, biotail c21a4780, buffer
, data , len 36
Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
00 00 00 00
Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hdc:
type=2, flags=104c8
Jul 19 08:22:23 catee kernel:
Jul 19 08:22:23 catee kernel: sector 34711027, nr/cnr 0/0
Jul 19 08:22:23 catee kernel: bio c21a4740, biotail c21a4740, buffer
, data , len 36
Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
00 00 00 00
Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hdc:
type=2, flags=104c8
Jul 19 08:22:23 catee kernel:
Jul 19 08:22:23 catee kernel: sector 7387152, nr/cnr 0/0
Jul 19 08:22:23 catee kernel: bio c21a4900, biotail c21a4900, buffer
, data , len 36
Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
00 00 00 00
Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hda:
type=2, flags=104c8
Jul 19 08:22:23 catee kernel:
Jul 19 08:22:23 catee kernel: sector 7387152, nr/cnr 0/0
Jul 19 08:22:23 catee kernel: bio c21a4900, biotail c21a4900, buffer
, data , len 36
Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
00 00 00 00
Jul 19 08:22:27 catee kernel: ttyS1: LSR safety check engaged!

The latest git tree gives me no errors.

patch in Message-ID: <[EMAIL PROTECTED]>

Tested-By: Giacomo Catenazzi <[EMAIL PROTECTED]>

ciao
cate


Re: [PATH 0/1] Kexec jump - v2 - the first step to kexec based hibernation

2007-07-19 Thread Huang, Ying

On Wed, 2007-07-18 at 18:04 -0700, Andrew Morton wrote:
> I like the idea but I think I'll let people chat about it a bit more
> before looking at merging the patches, OK?

I think maybe we should wait for Rafael to separate the device hibernate
(quiesce and state save) from device suspend. Without that, the ACPI
issue can not be resolved.

Best Regards,
Huang Ying

2.6.22-git10 compile error

2007-07-19 Thread Plamen Petrov


Hi, all!

Just for the record - linux kernel version 2.6.22-git10 fails to build
with the following error:

In file included from net/netfilter/xt_connlimit.c:27:
include/net/netfilter/nf_conntrack.h:99: error: field `ct_general' has 
incomplete type

include/net/netfilter/nf_conntrack.h: In function `nf_ct_get':
include/net/netfilter/nf_conntrack.h:163: error: structure has no member 
named `nfct'

include/net/netfilter/nf_conntrack.h: In function `nf_ct_put':
include/net/netfilter/nf_conntrack.h:170: error: implicit declaration of 
function `nf_conntrack_put'

include/net/netfilter/nf_conntrack.h: In function `nf_ct_is_untracked':
include/net/netfilter/nf_conntrack.h:252: error: structure has no member 
named `nfct'

In file included from net/netfilter/xt_connlimit.c:28:
include/net/netfilter/nf_conntrack_core.h: In function 
`nf_conntrack_confirm':
include/net/netfilter/nf_conntrack_core.h:68: error: structure has no 
member named `nfct'

make[2]: *** [net/netfilter/xt_connlimit.o] Error 1
make[1]: *** [net/netfilter] Error 2
make: *** [net] Error 2

Attached is the kernel config used, system is running Slackware 11 on
AMD Duron, gcc version is 3.4.6.

For more info - mail me.

--
Plamen Petrov, network administrator
Technical College - Silistra,
RU "Angel Kantchev"
http://tk.ru.acad.bg/

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.22-git10
# Thu Jul 19 06:44:59 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=15
# CONFIG_CPUSETS is not set
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set
# CONFIG_BLK_DEV_BSG is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MCORE2 is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y

Re: System hangs on running kernbench

2007-07-19 Thread Dhaval Giani

On Wed, Jul 18, 2007 at 08:16:37PM +0530, Dhaval Giani wrote:
> Hi Andrew,
> 
> On Wed, Jul 18, 2007 at 03:11:42PM +0530, Dhaval Giani wrote:
> > On Wed, Jul 18, 2007 at 01:07:00AM -0700, Andrew Morton wrote:
> > > On Wed, 18 Jul 2007 13:26:48 +0530 Dhaval Giani <[EMAIL PROTECTED]> wrote:
> > >  
> > In the meantime I will go and check if it was there in 2.6.22-rc4-mm2
> > 
> 
> It is hanging with 2.6.22-rc4-mm2 as well as on the latest git on
> kernel.org (2.6.22-git10).
> 
> I will get back to you with more information as soon as I have it.

Hi Andrew,

I've got a crash dump and stack traces. They are as follows (The trace
is on 2.6.22-git10)


(gdb) thread 1
[Switching to thread 1 (process 8096)]#0  delay_tsc (loops=1)
at include/asm/msr.h:64
64  {
(gdb) bt
#0  delay_tsc (loops=1) at include/asm/msr.h:64
#1  0xc0245130 in __delay (loops=Variable "loops" is not available.
) at arch/i386/lib/delay.c:74
#2  0xc0247115 in __spin_lock_debug (lock=0xc0564480)
at lib/spinlock_debug.c:111
#3  0xc02471cc in _raw_spin_lock (lock=0xc0564480) at lib/spinlock_debug.c:132
#4  0xc041ad3e in _spin_lock_irq (lock=0xc0564480) at kernel/spinlock.c:105
#5  0xc015ff2c in shrink_active_list (nr_pages=32, zone=0xc0563300, 
sc=0xd65b3e60, priority=5) at mm/vmscan.c:926
#6  0xc01602a3 in shrink_zone (priority=5, zone=0xc0563300, sc=0xd65b3e60)
at mm/vmscan.c:1044
#7  0xc016036c in shrink_zones (priority=5, zones=0xc056584c, sc=0xd65b3e60)
at mm/vmscan.c:1101
#8  0xc0160488 in try_to_free_pages (zones=0xc056584c, order=Variable "order" 
is not available.
)
at mm/vmscan.c:1153
#9  0xc015c190 in __alloc_pages (gfp_mask=688338, order=0, zonelist=0xc0565848)
at mm/page_alloc.c:1336
#10 0xc0165285 in do_anonymous_page (mm=0xe4498280, vma=0xd3afef3c, 
address=3083890688, page_table=0xd65ca838, pmd=0xe3e33df0, write_access=1)
at include/linux/gfp.h:100
#11 0xc0165a58 in __handle_mm_fault (mm=0xe4498280, vma=0xd3afef3c, 
address=3083890688, write_access=1) at mm/memory.c:2549
#12 0xc041c984 in do_page_fault (regs=0xd65b3fb8, error_code=6)
at include/linux/mm.h:776
#13 0xc041b37a in error_code () at include/linux/sched.h:13
#14 0x006c in ?? ()
#15 0x001b in ?? ()
#16 0x in ?? ()
(gdb) thread 2
[Switching to thread 2 (process 7371)]#0  __spin_lock_debug (lock=0xc0564480)
at include/asm/spinlock.h:88
88  {
(gdb) bt
#0  __spin_lock_debug (lock=0xc0564480) at include/asm/spinlock.h:88
#1  0xc02471cc in _raw_spin_lock (lock=0xc0564480) at lib/spinlock_debug.c:132
#2  0xc041ad3e in _spin_lock_irq (lock=0xc0564480) at kernel/spinlock.c:105
#3  0xc0160181 in shrink_active_list (nr_pages=32, zone=0xc0563300, sc=Variable 
"sc" is not available.
)
at mm/vmscan.c:994
#4  0xc01602a3 in shrink_zone (priority=5, zone=0xc0563300, sc=0xeea67e60)
at mm/vmscan.c:1044
#5  0xc016036c in shrink_zones (priority=5, zones=0xc056584c, sc=0xeea67e60)
at mm/vmscan.c:1101
#6  0xc0160488 in try_to_free_pages (zones=0xc056584c, order=Variable "order" 
is not available.
)
at mm/vmscan.c:1153
#7  0xc015c190 in __alloc_pages (gfp_mask=688338, order=0, zonelist=0xc0565848)
at mm/page_alloc.c:1336
#8  0xc0165285 in do_anonymous_page (mm=0xeeaabdc0, vma=0xf137c7ac, 
address=3084021760, page_table=0xeeabf938, pmd=0xeeaafdf0, write_access=1)
at include/linux/gfp.h:100
#9  0xc0165a58 in __handle_mm_fault (mm=0xeeaabdc0, vma=0xf137c7ac, 
address=3084021760, write_access=1) at mm/memory.c:2549
#10 0xc041c984 in do_page_fault (regs=0xeea67fb8, error_code=6)
at include/linux/mm.h:776
#11 0xc041b37a in error_code () at include/linux/sched.h:13
#12 0x002c in ?? ()
#13 0x000b in ?? ()
#14 0x in ?? ()
(gdb) thread 3
[Switching to thread 3 (process 8392)]#0  __spin_lock_debug (lock=0xc0564480)
at include/asm/spinlock.h:88
88  {
(gdb) bt
#0  __spin_lock_debug (lock=0xc0564480) at include/asm/spinlock.h:88
#1  0xc02471cc in _raw_spin_lock (lock=0xc0564480) at lib/spinlock_debug.c:132
#2  0xc041ad3e in _spin_lock_irq (lock=0xc0564480) at kernel/spinlock.c:105
#3  0xc016000f in shrink_active_list (nr_pages=32, zone=0xc0563300, sc=Variable 
"sc" is not available.
)
at mm/vmscan.c:950
#4  0xc01602a3 in shrink_zone (priority=5, zone=0xc0563300, sc=0xf733de60)
at mm/vmscan.c:1044
#5  0xc016036c in shrink_zones (priority=5, zones=0xc056584c, sc=0xf733de60)
at mm/vmscan.c:1101
#6  0xc0160488 in try_to_free_pages (zones=0xc056584c, order=Variable "order" 
is not available.
)
at mm/vmscan.c:1153
#7  0xc015c190 in __alloc_pages (gfp_mask=688338, order=0, zonelist=0xc0565848)
at mm/page_alloc.c:1336
#8  0xc0165285 in do_anonymous_page (mm=0xf3054940, vma=0xf60496a4, 
address=135188540, page_table=0xd275c768, pmd=0xf317d200, write_access=1)
at include/linux/gfp.h:100
#9  0xc0165a58 in __handle_mm_fault (mm=0xf3054940, vma=0xf60496a4, 
address=135188540, write_access=1) at mm/memory.c:2549
#10 0xc041c984 in do_page_fault (regs=0xf733dfb8,

Re: [PATCH] Check for compound pages in set_page_dirty()

2007-07-19 Thread Jens Axboe

On Wed, Jul 18 2007, Hugh Dickins wrote:
> On Wed, 18 Jul 2007, Jens Axboe wrote:
> > 
> > Since I had my hands dirty already...
> 
> Great, thanks.  (There's also such a test in fs/nfs/direct.c,
> but let's not trouble Trond until we've settled what to do here.)
> 
> > 
> > ---
> > 
> > [PATCH] Remove PageCompound() checks before calling set_page_dirty()
> > 
> > Pre commit 41d78ba55037468e6c86c53e3076d1a74841de39 it was illegal
> > to call set_page_dirty() on a compound page, since it stored the
> > destructor in the mapping field. But now it's ok, so remove the
> > ugly PageCompound() checks from bio and direct-io.
> > 
> > Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
> 
> I was about to Ack that, now that I've found something or other in the
> libhugetlb testsuite comes this way, even on page[1], without showing
> any problem.
> 
> However, I have noticed a particular inefficiency arising: that
> bio_check_pages_dirty test specifically avoids pages already
> PageDirty; but hugetlbfs_set_page_dirty carefully redirects to
> set the head page dirty: so tail pages of a hugetlb compound page
> will tend never to be PageDirty, and keep on coming back this way.
> 
> Which led me to look up the origin of those PageCompound tests:
> Author: Andrew Morton <[EMAIL PROTECTED]>
> Date:   Sun Sep 21 01:42:22 2003 -0700
> 
> [PATCH] Speed up direct-io hugetlbpage handling
> 
> This patch short-circuits all the direct-io page dirtying logic for
> higher-order pages.  Without this, we pointlessly bounce BIOs up to keventd
> all the time.
> 
> diff --git a/fs/bio.c b/fs/bio.c
> index d016523..2463163 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -532,6 +532,12 @@ void bio_unmap_user(struct bio *bio, int write_to_vm)
>   * check that the pages are still dirty.   If so, fine.  If not, redirty them
>   * in process context.
>   *
> + * We special-case compound pages here: normally this means reads into hugetlb
> + * pages.  The logic in here doesn't really work right for compound pages
> + * because the VM does not uniformly chase down the head page in all cases.
> + * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
> + * handle them at all.  So we skip compound pages here at an early stage.
> ...
> 
> It looks like I was wrong in thinking it was just trying to avoid 
> the crash on page[1].mapping.  At the least, your patch needs also
> to remove that paragraph of comment from Andrew.  But really, it
> looks like those PageCompound tests should stay, unless you can
> persuade Andrew to Ack their removal.
> 
> Except (now, how many times can I change my mind in the course of
> one email?), hugetlbfs_set_page_dirty was specifically added by
> Ken Chen to avoid losing data via /proc/sys/vm/drop_caches.  Yet
> fs/bio.c is carefully avoiding going there when dirtying a hugepage.
> How does this work?  Looks like those PageCompound tests need to go!

Hehe, that didn't really get us much further, did it? :-)

My opinion is that since the win is marginal at best, we want to remove
such tests as it just clutters up the code. And it's definitely not
obvious why the tests are there, since they are not commented at all.
Since it's even confusing you, then we can't expect the more vm ignorant
of us (which definitely includes me) to grasp it!

> I'm lost: I hope Andrew and Ken can sort it out for us.

Posting a revised version, still leaving nfs out of it (I'll ping Trond
to do the same, if this goes in).

diff --git a/fs/bio.c b/fs/bio.c
index 33e4634..dcbb160 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -884,12 +884,6 @@ struct bio *bio_map_kern(request_queue_t *q, void *data, unsigned int len,
  * check that the pages are still dirty.   If so, fine.  If not, redirty them
  * in process context.
  *
- * We special-case compound pages here: normally this means reads into hugetlb
- * pages.  The logic in here doesn't really work right for compound pages
- * because the VM does not uniformly chase down the head page in all cases.
- * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
- * handle them at all.  So we skip compound pages here at an early stage.
- *
  * Note that this code is very hard to test under normal circumstances because
  * direct-io pins the pages with get_user_pages().  This makes
  * is_page_cache_freeable return false, and the VM will not clean the pages.
@@ -911,7 +905,7 @@ void bio_set_pages_dirty(struct bio *bio)
for (i = 0; i < bio->bi_vcnt; i++) {
struct page *page = bvec[i].bv_page;
 
-   if (page && !PageCompound(page))
+   if (page)
set_page_dirty_lock(page);
}
 }
@@ -978,7 +972,7 @@ void bio_check_pages_dirty(struct bio *bio)
for (i = 0; i < bio->bi_vcnt; i++) {
struct page *page = bvec[i].bv_page;
 
-   if (PageDirty(page) || PageCompound(page)) {
+   if (PageDirty(page)) {
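
The tail-page behaviour Hugh describes in this thread can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: struct toy_page, toy_set_page_dirty() and toy_count_needing_redirty() are hypothetical names, not kernel APIs, and the model ignores locking and the real page flags. It illustrates why a tail page of a compound page never looks dirty when dirtying is redirected to the head page, so a PageDirty-style check keeps finding it clean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct page. */
struct toy_page {
    bool dirty;
    struct toy_page *head;  /* NULL for a head page or an ordinary page */
};

/*
 * Models the redirect described for hugetlbfs_set_page_dirty:
 * dirtying any page of a compound page marks only the head page.
 */
void toy_set_page_dirty(struct toy_page *page)
{
    struct toy_page *target = page->head ? page->head : page;

    target->dirty = true;
}

/*
 * Models a bio_check_pages_dirty-style scan: count the pages that
 * still look clean and would therefore be handed off for redirtying.
 */
size_t toy_count_needing_redirty(struct toy_page **pages, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (!pages[i]->dirty)
            count++;
    return count;
}
```

In this model a tail page stays clean no matter how often it is dirtied, which is the "keep on coming back this way" effect from the discussion.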

Re: [PATCH 33/33] IDE: sg chaining support

2007-07-19 Thread Jens Axboe

On Wed, Jul 18 2007, Bartlomiej Zolnierkiewicz wrote:
> On Monday 16 July 2007, Jens Axboe wrote:
> > Cc: [EMAIL PROTECTED]
> > Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
> 
> Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>

(for both acks) Thanks for reviewing and acking!

-- 
Jens Axboe


Re: regression: disk error loop (panic?) ide_do_rw_disk-bad:

2007-07-19 Thread Jens Axboe

On Wed, Jul 18 2007, Linus Torvalds wrote:
> 
> 
> On Thu, 19 Jul 2007, Bartlomiej Zolnierkiewicz wrote:
> > 
> > Thanks for finding and fixing this.
> > 
> > The latest patch (with additional cleanups) also looks good and should be
> > safe enough (unchanged behavior for all non-pc requests) to merge it now.
> > 
> > Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
> 
> Ok, Jens - mind signing off on the patch you sent out, and writing an 
> explanatory message? Feel free to just crib from my explanation of my 
> original patch, or whatever.

Sure thing, it's below.

> And it would be beautiful if people who saw the bad behaviour before
> reverting the ide.c changes were to go back to that broken state, and
> try the patch, and just verify that it acts like it should (ie you
> should see just a few error messages, and it shouldn't cause the IDE
> layer to go ballistic any more).

---

[PATCH] IDE: fix termination of non-fs requests

ide-disk calls

ide_end_request(drive, 0, 0);

to finish an unknown request, but this doesn't work so well for non-fs
requests, since ide_end_request() internally looks at ->hard_cur_sectors
to see how much data to end. Only file system requests store a transfer
value in there, pc requests fill out ->data_len as a byte based transfer
value instead.

Since we ask to end 0 bytes of that request, it will never be terminated
and ide-disk gets stuck in a loop "handling" that same request over and
over.

Switch __ide_end_request() to take a byte based transfer count, and
adjust ide_end_request() to look at the right field to determine how
much IO to end when it's being passed in 0.

Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>

diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index c5b5011..f9de798 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -55,7 +55,7 @@
 #include 
 
 static int __ide_end_request(ide_drive_t *drive, struct request *rq,
-int uptodate, int nr_sectors)
+int uptodate, unsigned int nr_bytes)
 {
int ret = 1;
 
@@ -64,7 +64,7 @@ static int __ide_end_request(ide_drive_t *drive, struct request *rq,
 * complete the whole request right now
 */
if (blk_noretry_request(rq) && end_io_error(uptodate))
-   nr_sectors = rq->hard_nr_sectors;
+   nr_bytes = rq->hard_nr_sectors << 9;
 
if (!blk_fs_request(rq) && end_io_error(uptodate) && !rq->errors)
rq->errors = -EIO;
@@ -78,7 +78,7 @@ static int __ide_end_request(ide_drive_t *drive, struct request *rq,
HWGROUP(drive)->hwif->ide_dma_on(drive);
}
 
-   if (!end_that_request_first(rq, uptodate, nr_sectors)) {
+   if (!end_that_request_chunk(rq, uptodate, nr_bytes)) {
add_disk_randomness(rq->rq_disk);
if (!list_empty(&rq->queuelist))
blkdev_dequeue_request(rq);
@@ -103,6 +103,7 @@ static int __ide_end_request(ide_drive_t *drive, struct request *rq,
 
 int ide_end_request (ide_drive_t *drive, int uptodate, int nr_sectors)
 {
+   unsigned int nr_bytes = nr_sectors << 9;
struct request *rq;
unsigned long flags;
int ret = 1;
@@ -114,10 +115,14 @@ int ide_end_request (ide_drive_t *drive, int uptodate, int nr_sectors)
spin_lock_irqsave(&ide_lock, flags);
rq = HWGROUP(drive)->rq;
 
-   if (!nr_sectors)
-   nr_sectors = rq->hard_cur_sectors;
+   if (!nr_bytes) {
+   if (blk_pc_request(rq))
+   nr_bytes = rq->data_len;
+   else
+   nr_bytes = rq->hard_cur_sectors << 9;
+   }
 
-   ret = __ide_end_request(drive, rq, uptodate, nr_sectors);
+   ret = __ide_end_request(drive, rq, uptodate, nr_bytes);
 
spin_unlock_irqrestore(&ide_lock, flags);
return ret;

-- 
Jens Axboe
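
The byte-count selection in the patch above can be tried out in isolation with a small user-space model. This is a minimal sketch, not the driver code: struct toy_request and bytes_to_end() are hypothetical names, the is_pc flag stands in for blk_pc_request(), and a sector is assumed to be 512 bytes as implied by the "<< 9" shifts in the diff.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct request. */
struct toy_request {
    bool is_pc;                     /* stands in for blk_pc_request(rq) */
    unsigned int hard_cur_sectors;  /* only filled in by fs requests */
    unsigned int data_len;          /* byte count filled in by pc requests */
};

/*
 * Mirrors the logic added to ide_end_request(): a caller passing
 * 0 sectors means "end the current chunk", so the byte count has to
 * come from the field that this type of request actually fills in.
 */
unsigned int bytes_to_end(const struct toy_request *rq, int nr_sectors)
{
    unsigned int nr_bytes = (unsigned int)nr_sectors << 9;

    if (!nr_bytes) {
        if (rq->is_pc)
            nr_bytes = rq->data_len;
        else
            nr_bytes = rq->hard_cur_sectors << 9;
    }
    return nr_bytes;
}
```

With the old sector-based lookup, a pc request whose hard_cur_sectors is 0 would ask to end 0 bytes and never complete; in this model it ends data_len bytes instead.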


Re: [2.6 patch] kernel/sched.c: remove 2 unused exports

2007-07-19 Thread Adrian Bunk

On Tue, Jul 17, 2007 at 09:22:33PM +0200, Ingo Molnar wrote:
> 
> * Adrian Bunk <[EMAIL PROTECTED]> wrote:
> 
> > This patch removes the following unused exports:
> > - EXPORT_SYMBOL(cond_resched_softirq);
> > - EXPORT_SYMBOL_GPL(__wake_up_sync);
> 
> these are there for API completeness - their counterparts are exported.

Why is something with a comment "For internal use only" part of the API?

>   Ingo

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [patch] Change softlockup trigger limit using a kernel parameter

2007-07-19 Thread Andrew Morton

On Wed, 18 Jul 2007 22:41:21 -0700 Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote:

> On Wed, Jul 18, 2007 at 04:08:58PM -0700, Andrew Morton wrote:
> > On Mon, 16 Jul 2007 15:26:50 -0700
> > Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote:
> >
> > > Kernel warns of softlockups if the softlockup thread is not able to run
> > > on a CPU for 10s.  It is useful to lower the softlockup warning
> > > threshold in testing environments to catch potential lockups early.
> > > Following patch adds a kernel parameter 'softlockup_lim' to control
> > > the softlockup threshold.
> > >
> >
> > Why not make it tunable at runtime?
> 
> Sure! Like a sysctl?
> 
> Here's a patch that does that (On top of Ingo's
> softlockup-improve-debug-output.patch)
>
> ...
>
> --- linux-2.6.22.orig/kernel/sysctl.c 2007-07-08 16:32:17.0 -0700
> +++ linux-2.6.22/kernel/sysctl.c  2007-07-18 21:05:57.877436750 -0700
> @@ -78,6 +78,7 @@ extern int percpu_pagelist_fraction;
>  extern int compat_log;
>  extern int maps_protect;
>  extern int sysctl_stat_interval;
> +extern int softlockup_thresh;

Just because sysctl.c does this all over the place doesn't make it right ;)
Please, if poss, find a header file for it.

>  /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
>  static int maxolduid = 65535;
> @@ -206,6 +207,10 @@ static ctl_table root_table[] = {
>   { .ctl_name = 0 }
>  };
>  
> +/* Constants for kernel table minimum and  maximum */
> +static int one = 1;
> +static int ten = 10;

I'd suggest that these go next to "zero", "two" and "one_hundred".  Move 'em
all to top-of-file where they should always have been.

>  static ctl_table kern_table[] = {
>   {
>   .ctl_name   = KERN_PANIC,
> @@ -615,6 +620,19 @@ static ctl_table kern_table[] = {
>   .proc_handler   = _dointvec,
>   },
>  #endif
> +#ifdef CONFIG_DETECT_SOFTLOCKUP
> + {
> + .ctl_name   = KERN_SOFTLOCKUP_THRESHOLD,
> + .procname   = "softlockup_thresh",
> + .data   = &softlockup_thresh,
> + .maxlen = sizeof(int),
> + .mode   = 0644,
> + .proc_handler   = &proc_dointvec_minmax,
> + .strategy   = &sysctl_intvec,
> + .extra1 = &one,
> + .extra2 = &ten,
> + },
> +#endif

argh.  There's supposed to be a big comment right here:

/*
 * NOTE: do not add new entries to this table unless you have read
 * Documentation/sysctl/ctl_unnumbered.txt
 */

I'll fix that up.  Please use CTL_UNNUMBERED.

>  };
> Index: linux-2.6.22/include/linux/sysctl.h
> ===
> --- linux-2.6.22.orig/include/linux/sysctl.h  2007-07-08 16:32:17.0 -0700
> +++ linux-2.6.22/include/linux/sysctl.h   2007-07-18 21:41:56.584347500 -0700
> @@ -165,6 +165,7 @@ enum
>   KERN_MAX_LOCK_DEPTH=74,
>   KERN_NMI_WATCHDOG=75, /* int: enable/disable nmi watchdog */
>   KERN_PANIC_ON_NMI=76, /* int: whether we will panic on an unrecovered */
> + KERN_SOFTLOCKUP_THRESHOLD=77, /* int: softlockup tolerance threshold */
>  };

and zap this

> Index: linux-2.6.22/Documentation/sysctl/kernel.txt
> ===
> --- linux-2.6.22.orig/Documentation/sysctl/kernel.txt 2007-07-08 16:32:17.0 -0700
> +++ linux-2.6.22/Documentation/sysctl/kernel.txt  2007-07-18 22:07:29.460146250 -0700
> @@ -320,6 +320,14 @@ kernel.  This value defaults to SHMMAX.
>  
>  ==
>  
> +softlockup_thresh:
> +
> +This value can be used to lower the softlockup tolerance
> +threshold. The default threshold is 10s.  If a cpu is locked up
> +for 10s, the kernel complains.  Valid values are 1-10s.
> +

neato.
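
The behaviour the quoted patch aims for (a threshold tunable between 1 and 10 seconds, compared against the last watchdog touch) can be sketched in user space. This is a model under assumptions, not the kernel code: toy_softlockup_thresh, toy_set_softlockup_thresh() and toy_lockup_detected() are hypothetical names, and rejecting out-of-range writes is an assumption about the intended effect of the min/max bounds.

```c
#include <stdbool.h>

#define TOY_THRESH_MIN 1   /* seconds, matches "one" in the patch */
#define TOY_THRESH_MAX 10  /* seconds, matches "ten" in the patch */

/* Models softlockup_thresh; the default is 10s as in the patch. */
int toy_softlockup_thresh = 10;

/*
 * Models a bounded sysctl write: values outside 1-10s are rejected
 * and leave the current threshold untouched.
 */
int toy_set_softlockup_thresh(int secs)
{
    if (secs < TOY_THRESH_MIN || secs > TOY_THRESH_MAX)
        return -1;
    toy_softlockup_thresh = secs;
    return 0;
}

/*
 * Models the check in softlockup_tick(): no warning while
 * now <= touch_timestamp + threshold.
 */
bool toy_lockup_detected(unsigned long now, unsigned long touch_timestamp)
{
    return now > touch_timestamp + (unsigned long)toy_softlockup_thresh;
}
```

Lowering the threshold to 5 in this model makes a CPU that has been stuck for 6 seconds trip the check, where the default of 10 would stay quiet.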

Re: [PATCH] posix-timer: fix deletion race

2007-07-19 Thread Thomas Gleixner

On Wed, 2007-07-18 at 16:43 -0700, Jeremy Katz wrote:
> On Wed, 18 Jul 2007, Jeremy Katz wrote:
> 
> > On Wed, 18 Jul 2007, Thomas Gleixner wrote:
> >
> >>> Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
> >>> locking problem, which might be related to this.
> >> 
> >> Another test: Can you please disable CONFIG_SCHED_SMT to narrow it down
> >> further ?
> >
> > I'll try both of these.
> 
> I'm still seeing the sys_timer_delete version with your patch, and 
> sys_timer_delete and posix_timer_event without. The itimer_delete version 
> is currently missing in action, but getting any particular one to perform 
> on demand is currently not in my power.

Ok, let me summarize:

2.6.22 + hrt6

Both problems are there whether CONFIG_SCHED_SMT is on or not

2.6.22 + hrt6 + posixtimer patch

Both problems are there when CONFIG_SCHED_SMT is on
The timer callback problem is gone when CONFIG_SCHED_SMT is off

Correct ?

tglx



Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Jarek Poplawski

On Wed, Jul 18, 2007 at 01:48:20PM +0200, Jarek Poplawski wrote:
...
> I'd be very glad if it could be verified and/or tested.

Jarek,

This patch is verified crap!

Regards,
Jarek P.

PS: Olaf,

You've written earlier that one of the main reasons for poll_napi is
to work when the kernel "doesn't even service softirqs anymore". But
in your patch poll_napi leaves netif_rx_complete for softirqs, so
even if it starts to work for Ingo in normal conditions, probably
something else is needed, anyway.

Re: [PATCH] net/, drivers/net/ , missing EXPERIMENTAL in menus

2007-07-19 Thread Adrian Bunk

On Wed, Jul 18, 2007 at 05:18:20PM -0400, Robert P. J. Day wrote:
> On Wed, 18 Jul 2007, Adrian Bunk wrote:
> 
> > On Wed, Jul 18, 2007 at 04:51:33PM -0400, Robert P. J. Day wrote:
> > > On Wed, 18 Jul 2007, Jeff Garzik wrote:
> > >
> > > > Randy Dunlap wrote:
> > > > > On Wed, 18 Jul 2007 16:23:09 -0400 (EDT) Robert P. J. Day wrote:
> > > > > > there's no point adding all that redundant content when it can
> > > > > > all be done automatically.
> > > > >
> > > > > I like it.  Are there any kconfig patches to support this plan?
> > > >
> > > > Speaking specifically to adding 'EXPERIMENTAL', I distinctly
> > > > remember at some point in the past the config system was smart
> > > > enough to print " (EXPERIMENTAL)" if that entry depended on
> > > > CONFIG_EXPERIMENTAL.
> > > >
> > > > We should head in that direction.
> > >
> > > there's one point i want to re-iterate.  i'd prefer to see
> > > EXPERIMENTAL stop being a dependency, as in:
> > >
> > >   depends on SNAFU && FUBAR && EXPERIMENTAL
> > >
> > > "EXPERIMENTAL" is not a dependency in the true sense of the word
> > > -- it is more of an attribute, and i think it would make far more
> > > sense to see entries like:
> > >
> > >   depends on SNAFU && FUBAR
> > >   maturity EXPERIMENTAL
> >
> > Plus some special case in the kconfig code that you can somewhere
> > select the maturity levels you want to use (currently it's a normal
> > option kconfig doesn't have to know anything about).
> 
> i already described that here:
> 
> http://readlist.com/lists/vger.kernel.org/linux-kernel/66/334172.html
> 
> where the top-level config would look something like:
> 
> [*] Activate maturity attributes
>   [*] EXPERIMENTAL
>   [*] DEPRECATED
>   [*] OBSOLETE
>   [*] BROKEN

We already made the mistake of offering BROKEN as an option in the past, 
and the result was that people enabled it instead of reporting that a 
dependency on BROKEN was wrong.

> whereupon you could select any combination of the attributes you want
> displayed *beyond the regular ones* during the config process.
> 
> > Remind me, would there be any big advantage after such a change
> > besides being able to automatically print " (EXPERIMENTAL)" at the
> > end of the prompt?
> 
> defining a new Kconfig attribute means you can process it differently
> from regular dependencies.  and if it's added as a general feature, it
> can be used for other possible attributes beyond just a maturity
> level.
> 
> if you leave these maturity levels as regular dependencies, you're
> going to have to brute force and manually process them, and why make
> it that ugly?

I would consider it more ugly to special case this and that in the 
kconfig code when plain dependencies already offer exactly the same 
functionality...

> rday

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [PATCH try #3] security: Convert LSM into a static interface

2007-07-19 Thread Greg KH

On Wed, Jul 18, 2007 at 10:42:09PM -0400, James Morris wrote:
> On Wed, 18 Jul 2007, Andrew Morton wrote:
> > aww man, you passed over an opportunity to fix vast amounts of coding style
> > cruftiness.
> 
> GregKH-esque :-)

Yeah, sorry, that was when I was young and foolish and liked to bang on
the spacebar more than I should have :)

greg k-h

Re: [patch] Change softlockup trigger limit using a kernel parameter

2007-07-19 Thread Ravikiran G Thirumalai

On Wed, Jul 18, 2007 at 04:08:58PM -0700, Andrew Morton wrote:
> On Mon, 16 Jul 2007 15:26:50 -0700
> Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote:
>
> > Kernel warns of softlockups if the softlockup thread is not able to run
> > on a CPU for 10s.  It is useful to lower the softlockup warning
> > threshold in testing environments to catch potential lockups early.
> > Following patch adds a kernel parameter 'softlockup_lim' to control
> > the softlockup threshold.
> >
>
> Why not make it tunable at runtime?

Sure! Like a sysctl?

Here's a patch that does that (On top of Ingo's
softlockup-improve-debug-output.patch)

>
> >
> > Control the trigger limit for softlockup warnings.  This is useful for
> > debugging softlockups, by lowering the softlockup_lim to identify
> > possible softlockups earlier.
>
> Please check your patches with scripts/checkpatch.pl.

Yep will-do.
(checkpatch emitted one warning for the patch below, but that was because
of a 'stylo' that already exists in include/linux/sysctl.h -- which probably
needs a style change patch by itself)

---

Control the trigger limit for softlockup warnings.  This is useful for
debugging softlockups, by lowering the softlockup_thresh sysctl,
to identify possible softlockups earlier.

Patch also changes the softlockup printk to print the cpu softlockup time.

Signed-off-by: Ravikiran Thirumalai <[EMAIL PROTECTED]>
Signed-off-by: Shai Fultheim <[EMAIL PROTECTED]>

Index: linux-2.6.22/kernel/softlockup.c
===
--- linux-2.6.22.orig/kernel/softlockup.c   2007-07-18 11:15:18.506614500 -0700
+++ linux-2.6.22/kernel/softlockup.c2007-07-18 21:39:20.498592750 -0700
@@ -23,6 +23,7 @@ static DEFINE_PER_CPU(unsigned long, pri
 static DEFINE_PER_CPU(struct task_struct *, watchdog_task);
 
 static int did_panic;
+int softlockup_thresh = 10;
 
 static int
 softlock_panic(struct notifier_block *this, unsigned long event, void *ptr)
@@ -101,7 +102,7 @@ void softlockup_tick(void)
wake_up_process(per_cpu(watchdog_task, this_cpu));
 
/* Warn about unreasonable 10+ seconds delays: */
-   if (now <= (touch_timestamp + 10))
+   if (now <= (touch_timestamp + softlockup_thresh))
return;
 
regs = get_irq_regs();
@@ -109,8 +110,9 @@ void softlockup_tick(void)
per_cpu(print_timestamp, this_cpu) = touch_timestamp;
 
spin_lock(&print_lock);
-   printk(KERN_ERR "BUG: soft lockup detected on CPU#%d! [%s:%d]\n",
-   this_cpu, current->comm, current->pid);
+   printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n",
+   this_cpu, now - touch_timestamp,
+   current->comm, current->pid);
if (regs)
show_regs(regs);
else
Index: linux-2.6.22/kernel/sysctl.c
===
--- linux-2.6.22.orig/kernel/sysctl.c   2007-07-08 16:32:17.0 -0700
+++ linux-2.6.22/kernel/sysctl.c2007-07-18 21:05:57.877436750 -0700
@@ -78,6 +78,7 @@ extern int percpu_pagelist_fraction;
 extern int compat_log;
 extern int maps_protect;
 extern int sysctl_stat_interval;
+extern int softlockup_thresh;
 
 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
 static int maxolduid = 65535;
@@ -206,6 +207,10 @@ static ctl_table root_table[] = {
{ .ctl_name = 0 }
 };
 
+/* Constants for kernel table minimum and  maximum */
+static int one = 1;
+static int ten = 10;
+
 static ctl_table kern_table[] = {
{
.ctl_name   = KERN_PANIC,
@@ -615,6 +620,19 @@ static ctl_table kern_table[] = {
.proc_handler   = &proc_dointvec,
},
 #endif
+#ifdef CONFIG_DETECT_SOFTLOCKUP
+   {
+   .ctl_name   = KERN_SOFTLOCKUP_THRESHOLD,
+   .procname   = "softlockup_thresh",
+   .data   = &softlockup_thresh,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = &proc_dointvec_minmax,
+   .strategy   = &sysctl_intvec,
+   .extra1 = &one,
+   .extra2 = &ten,
+   },
+#endif
 
{ .ctl_name = 0 }
 };
Index: linux-2.6.22/include/linux/sysctl.h
===
--- linux-2.6.22.orig/include/linux/sysctl.h    2007-07-08 16:32:17.0 -0700
+++ linux-2.6.22/include/linux/sysctl.h 2007-07-18 21:41:56.584347500 -0700
@@ -165,6 +165,7 @@ enum
KERN_MAX_LOCK_DEPTH=74,
KERN_NMI_WATCHDOG=75, /* int: enable/disable nmi watchdog */
KERN_PANIC_ON_NMI=76, /* int: whether we will panic on an unrecovered */
+   KERN_SOFTLOCKUP_THRESHOLD=77, /* int: softlockup tolerance threshold */
 };
 
 
Index: linux-2.6.22/Documentation/sysctl/kernel.txt
===
---

Re: [PATCH 1/5][TAKE8] manpage for fallocate

2007-07-19 Thread Mark Fasheh

On Thu, Jul 19, 2007 at 03:10:52PM +1000, David Chinner wrote:
> % git-log 84e1e99f112dead8f9ba036c02d24a9f5ce7f544 |head -10
> commit 84e1e99f112dead8f9ba036c02d24a9f5ce7f544
> Author: David Chinner <[EMAIL PROTECTED]>
> Date:   Mon Jun 18 16:50:27 2007 +1000
> 
> [XFS] Prevent ENOSPC from aborting transactions that need to succeed
> 
> During delayed allocation extent conversion or unwritten extent
> conversion, we need to reserve some blocks for transactions reservations.
> We need to reserve these blocks in case a btree split occurs and we need
> to allocate some blocks.
> 
> --
> 
> IOWs, XFS didn't provide this guarantee until about a month ago

Ok, once again XFS is ahead of the curve ;)

Comment rescinded then...
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[EMAIL PROTECTED]

Re: [patch 1/3] ps3: Disk Storage Driver

2007-07-19 Thread Jens Axboe

On Wed, Jul 18 2007, Andrew Morton wrote:
> On Mon, 16 Jul 2007 18:15:40 +0200
> Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:
> 
> > From: Geert Uytterhoeven <[EMAIL PROTECTED]>
> > 
> > Add a Disk Storage Driver for the PS3:
> 
> Your patchset significantly hits powerpc, scsi and block.  So who gets to
> merge this?  Jens?  James?  Paul?
> 
> Me, I guess ;)

I think Paul was going to take it, or at least Geert hinted as such.

> > +#define PS3DISK_MAX_DISKS  16
> > +#define PS3DISK_MINORS 16
> > +
> > +#define KERNEL_SECTOR_SIZE 512
> 
> Sigh.  We have at least ten separate definitions of SECTOR_SIZE, none of
> them in the right place.  Cleanup opportunity for someone.

Indeed, it's universally 512 or << 9, just use that...
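
A minimal sketch of the shift idiom mentioned here, assuming plain
user-space C. The macro and helper names are hypothetical and are not
taken from the PS3 driver or from any kernel header:

```c
#include <stdint.h>

/* Hypothetical names; the thread only says "512 or << 9". */
#define SECTOR_SHIFT 9
#define SECTOR_SIZE  (1u << SECTOR_SHIFT)   /* 512 bytes */

/* Convert a sector count to a byte count. */
static inline uint64_t sectors_to_bytes(uint64_t sectors)
{
	return sectors << SECTOR_SHIFT;
}

/* Convert a byte count to whole sectors; a partial sector is dropped. */
static inline uint64_t bytes_to_sectors(uint64_t bytes)
{
	return bytes >> SECTOR_SHIFT;
}
```

With a fixed shift, a driver-private KERNEL_SECTOR_SIZE define of this
kind is not needed just to convert between the two units.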

> > +#ifdef DEBUG
> > +   unsigned int n = 0;
> > +   struct bio *bio;
> > +   rq_for_each_bio(bio, req)
> > +   n++;
> 
> I'm surprised that the block core doesn't have a helper to count the number
> of bios in a request.

What would be the point of such a helper? I've never seen a need for it.
Geert uses it as debug code here, but the fact is that the number of
bios in a request is a pretty pointless number. It doesn't tell you
anything. There's no 1:1 mapping between bios and segments (or anything
else for that matter), so the exercise is purely pointless.

-- 
Jens Axboe


Re: [PATCH try #3] security: Convert LSM into a static interface

2007-07-19 Thread Greg KH

On Wed, Jul 18, 2007 at 10:42:09PM -0400, James Morris wrote:
> On Wed, 18 Jul 2007, Andrew Morton wrote:
> > aww man, you passed over an opportunity to fix vast amounts of coding style
> > cruftiness.
> 
> GregKH-esque :-)

Yeah, sorry, that was when I was young and foolish and liked to bang on
the spacebar more than I should have :)

greg k-h

Re: [PATCH] net/, drivers/net/ , missing EXPERIMENTAL in menus

2007-07-19 Thread Adrian Bunk

On Wed, Jul 18, 2007 at 05:18:20PM -0400, Robert P. J. Day wrote:
> On Wed, 18 Jul 2007, Adrian Bunk wrote:
> 
> > On Wed, Jul 18, 2007 at 04:51:33PM -0400, Robert P. J. Day wrote:
> > > On Wed, 18 Jul 2007, Jeff Garzik wrote:
> > >
> > > > Randy Dunlap wrote:
> > > > > On Wed, 18 Jul 2007 16:23:09 -0400 (EDT) Robert P. J. Day wrote:
> > > > > > there's no point adding all that redundant content when it can all be
> > > > > > done automatically.
> > > > >
> > > > > I like it.  Are there any kconfig patches to support this plan?
> > > >
> > > > Speaking specifically to adding 'EXPERIMENTAL', I distinctly
> > > > remember at some point in the past the config system was smart
> > > > enough to print " (EXPERIMENTAL)" if that entry depended on
> > > > CONFIG_EXPERIMENTAL.
> > > >
> > > > We should head in that direction.
> > >
> > > there's one point i want to re-iterate.  i'd prefer to see
> > > EXPERIMENTAL stop being a dependency, as in:
> > >
> > >   depends on SNAFU && FUBAR && EXPERIMENTAL
> > >
> > > EXPERIMENTAL is not a dependency in the true sense of the word
> > > -- it is more of an attribute, and i think it would make far more
> > > sense to see entries like:
> > >
> > >   depends on SNAFU && FUBAR
> > >   maturity EXPERIMENTAL
> >
> > Plus some special case in the kconfig code that you can somewhere
> > select the maturity levels you want to use (currently it's a normal
> > option kconfig doesn't have to know anything about).
> 
> i already described that here:
> 
> http://readlist.com/lists/vger.kernel.org/linux-kernel/66/334172.html
> 
> where the top-level config would look something like:
> 
> [*] Activate maturity attributes
>   [*] EXPERIMENTAL
>   [*] DEPRECATED
>   [*] OBSOLETE
>   [*] BROKEN

We already made the mistake of offering BROKEN as an option in the past, 
and the result was that people enabled it instead of reporting that a 
dependency on BROKEN was wrong.

> whereupon you could select any combination of the attributes you want
> displayed *beyond the regular ones* during the config process.
> 
> > Remind me, would there be any big advantage after such a change
> > besides being able to automatically print " (EXPERIMENTAL)" at the
> > end of the prompt?
> 
> defining a new Kconfig attribute means you can process it differently
> from regular dependencies.  and if it's added as a general feature, it
> can be used for other possible attributes beyond just a maturity
> level.
> 
> if you leave these maturity levels as regular dependencies, you're
> going to have to brute force and manually process them, and why make
> it that ugly?

I would consider it more ugly to special case this and that in the 
kconfig code when plain dependencies already offer exactly the same 
functionality...

> rday

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

2007-07-19 Thread Jarek Poplawski

On Wed, Jul 18, 2007 at 01:48:20PM +0200, Jarek Poplawski wrote:
...
> I'd be very glad if it could be verified and/or tested.

Jarek,

This patch is verified crap!

Regards,
Jarek P.

PS: Olaf,

You've written earlier that one of the main reasons for poll_napi is
to work when the kernel doesn't even service softirqs anymore. But
in your patch poll_napi leaves netif_rx_complete for softirqs, so
even if it starts to work for Ingo in normal conditions, probably
something else is needed, anyway.

Re: [PATCH] posix-timer: fix deletion race

2007-07-19 Thread Thomas Gleixner

On Wed, 2007-07-18 at 16:43 -0700, Jeremy Katz wrote:
> On Wed, 18 Jul 2007, Jeremy Katz wrote:
> 
> > On Wed, 18 Jul 2007, Thomas Gleixner wrote:
> 
> >> Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
> >> locking problem, which might be related to this.
> >> 
> >> Another test: Can you please disable CONFIG_SCHED_SMT to narrow it down
> >> further ?
> 
> > I'll try both of these.
> 
> I'm still seeing the sys_timer_delete version with your patch, and 
> sys_timer_delete and posix_timer_event without. The itimer_delete version 
> is currently missing in action, but getting any particular one to perform 
> on demand is currently not in my power.

Ok, let me summarize:

2.6.22 + hrt6

Both problems are there whether CONFIG_SCHED_SMT is on or not

2.6.22 + hrt6 + posixtimer patch

Both problems are there when CONFIG_SCHED_SMT is on
The timer callback problem is gone when CONFIG_SCHED_SMT is off

Correct ?

tglx



Re: [patch] Change softlockup trigger limit using a kernel parameter

2007-07-19 Thread Andrew Morton

On Wed, 18 Jul 2007 22:41:21 -0700 Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote:

> On Wed, Jul 18, 2007 at 04:08:58PM -0700, Andrew Morton wrote:
> > On Mon, 16 Jul 2007 15:26:50 -0700
> > Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote:
> > 
> > > Kernel warns of softlockups if the softlockup thread is not able to run
> > > on a CPU for 10s.  It is useful to lower the softlockup warning
> > > threshold in testing environments to catch potential lockups early.
> > > Following patch adds a kernel parameter 'softlockup_lim' to control
> > > the softlockup threshold.
> > > 
> > 
> > Why not make it tunable at runtime?
> 
> Sure! Like a sysctl?
> 
> Here's a patch that does that (On top of Ingo's
> softlockup-improve-debug-output.patch)
>
> ...
>
> --- linux-2.6.22.orig/kernel/sysctl.c	2007-07-08 16:32:17.0 -0700
> +++ linux-2.6.22/kernel/sysctl.c	2007-07-18 21:05:57.877436750 -0700
> @@ -78,6 +78,7 @@ extern int percpu_pagelist_fraction;
>  extern int compat_log;
>  extern int maps_protect;
>  extern int sysctl_stat_interval;
> +extern int softlockup_thresh;

Just because sysctl.c does this all over the place doesn't make it right ;)
Please, if poss, find a header file for it.

>  /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
>  static int maxolduid = 65535;
> @@ -206,6 +207,10 @@ static ctl_table root_table[] = {
>  	{ .ctl_name = 0 }
>  };
>  
> +/* Constants for kernel table minimum and  maximum */
> +static int one = 1;
> +static int ten = 10;

I'd suggest that these go next to zero, two and one_hundred.  Move 'em
all to top-of-file where they should always have been.

>  static ctl_table kern_table[] = {
>  	{
>  		.ctl_name	= KERN_PANIC,
> @@ -615,6 +620,19 @@ static ctl_table kern_table[] = {
>  		.proc_handler	= &proc_dointvec,
>  	},
>  #endif
> +#ifdef CONFIG_DETECT_SOFTLOCKUP
> +	{
> +		.ctl_name	= KERN_SOFTLOCKUP_THRESHOLD,
> +		.procname	= "softlockup_thresh",
> +		.data		= &softlockup_thresh,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= &proc_dointvec_minmax,
> +		.strategy	= &sysctl_intvec,
> +		.extra1		= &one,
> +		.extra2		= &ten,
> +	},
> +#endif

argh.  There's supposed to be a big comment right here:

/*
 * NOTE: do not add new entries to this table unless you have read
 * Documentation/sysctl/ctl_unnumbered.txt
 */

I'll fix that up.  Please use CTL_UNNUMBERED.

>  };
> Index: linux-2.6.22/include/linux/sysctl.h
> ===
> --- linux-2.6.22.orig/include/linux/sysctl.h	2007-07-08 16:32:17.0 -0700
> +++ linux-2.6.22/include/linux/sysctl.h	2007-07-18 21:41:56.584347500 -0700
> @@ -165,6 +165,7 @@ enum
>  	KERN_MAX_LOCK_DEPTH=74,
>  	KERN_NMI_WATCHDOG=75, /* int: enable/disable nmi watchdog */
>  	KERN_PANIC_ON_NMI=76, /* int: whether we will panic on an unrecovered */
> +	KERN_SOFTLOCKUP_THRESHOLD=77, /* int: softlockup tolerance threshold */
>  };

and zap this

> Index: linux-2.6.22/Documentation/sysctl/kernel.txt
> ===
> --- linux-2.6.22.orig/Documentation/sysctl/kernel.txt	2007-07-08 16:32:17.0 -0700
> +++ linux-2.6.22/Documentation/sysctl/kernel.txt	2007-07-18 22:07:29.460146250 -0700
> @@ -320,6 +320,14 @@ kernel.  This value defaults to SHMMAX.
>  
>  ==
>  
> +softlockup_thresh:
> +
> +This value can be used to lower the softlockup tolerance
> +threshold. The default threshold is 10s.  If a cpu is locked up
> +for 10s, the kernel complains.  Valid values are 1-10s.
> +

neato.
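
A minimal user-space sketch of the behaviour the quoted patch aims for,
assuming the 1-10 second bounds from its extra1/extra2 values. The names
set_softlockup_thresh and softlockup_should_warn are hypothetical and are
not kernel APIs; the real range check is done by proc_dointvec_minmax:

```c
#include <stdbool.h>

/* Assumed bounds, mirroring the patch's "one" and "ten" constants. */
enum { THRESH_MIN = 1, THRESH_MAX = 10 };

/* Default of 10 seconds, as in the patch. */
static int softlockup_thresh = 10;

/* Accept a new threshold only if it lies in [THRESH_MIN, THRESH_MAX]. */
static bool set_softlockup_thresh(int val)
{
	if (val < THRESH_MIN || val > THRESH_MAX)
		return false;	/* out-of-range writes are rejected */
	softlockup_thresh = val;
	return true;
}

/* Same comparison as the patched softlockup_tick(): no warning while
 * now <= touch_timestamp + softlockup_thresh. */
static bool softlockup_should_warn(unsigned long now,
				   unsigned long touch_timestamp)
{
	return now > touch_timestamp + (unsigned long)softlockup_thresh;
}
```

Lowering the threshold to 3, for instance, makes a CPU that was last
touched 4 seconds ago qualify for a warning, where the default of 10
would stay silent.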

Re: [2.6 patch] kernel/sched.c: remove 2 unused exports

2007-07-19 Thread Adrian Bunk

On Tue, Jul 17, 2007 at 09:22:33PM +0200, Ingo Molnar wrote:
 
> * Adrian Bunk <[EMAIL PROTECTED]> wrote:
> 
> > This patch removes the following unused exports:
> > - EXPORT_SYMBOL(cond_resched_softirq);
> > - EXPORT_SYMBOL_GPL(__wake_up_sync);
> 
> these are there for API completeness - their counterparts are exported.

Why is something with a comment "For internal use only" part of the API?

> 	Ingo

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


Re: regression: disk error loop (panic?) ide_do_rw_disk-bad:

2007-07-19 Thread Jens Axboe

On Wed, Jul 18 2007, Linus Torvalds wrote:
 
 
> On Thu, 19 Jul 2007, Bartlomiej Zolnierkiewicz wrote:
> > 
> > Thanks for finding and fixing this.
> > 
> > The latest patch (with additional cleanups) also looks good and should be
> > safe enough (unchanged behavior for all non-pc requests) to merge it now.
> > 
> > Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
> 
> Ok, Jens - mind signing off on the patch you sent out, and writing an 
> explanatory message? Feel free to just crib from my explanation of my 
> original patch, or whatever.

Sure thing, it's below.

> And it would be beautiful if people who saw the bad behaviour before
> reverting the ide.c changes were to go back to that broken state, and
> try the patch, and just verify that it acts like it should (ie you
> should see just a few error messages, and it shouldn't cause the IDE
> layer to go ballistic any more).

---

[PATCH] IDE: fix termination of non-fs requests

ide-disk calls

ide_end_request(drive, 0, 0);

to finish an unknown request, but this doesn't work so well for non-fs
requests, since ide_end_request() internally looks at ->hard_cur_sectors
to see how much data to end. Only file system requests store a transfer
value in there, pc requests fill out ->data_len as a byte based transfer
value instead.

Since we ask to end 0 bytes of that request, it will never be terminated
and ide-disk gets stuck in a loop handling that same request over and
over.

Switch __ide_end_request() to take a byte based transfer count, and
adjust ide_end_request() to look at the right field to determine how
much IO to end when it's being passed in 0.

Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>

diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index c5b5011..f9de798 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -55,7 +55,7 @@
 #include <asm/bitops.h>
 
 static int __ide_end_request(ide_drive_t *drive, struct request *rq,
-			     int uptodate, int nr_sectors)
+			     int uptodate, unsigned int nr_bytes)
 {
 	int ret = 1;
 
@@ -64,7 +64,7 @@ static int __ide_end_request(ide_drive_t *drive, struct request *rq,
 	 * complete the whole request right now
 	 */
 	if (blk_noretry_request(rq) && end_io_error(uptodate))
-		nr_sectors = rq->hard_nr_sectors;
+		nr_bytes = rq->hard_nr_sectors << 9;
 
 	if (!blk_fs_request(rq) && end_io_error(uptodate) && !rq->errors)
 		rq->errors = -EIO;
@@ -78,7 +78,7 @@ static int __ide_end_request(ide_drive_t *drive, struct request *rq,
 		HWGROUP(drive)->hwif->ide_dma_on(drive);
 	}
 
-	if (!end_that_request_first(rq, uptodate, nr_sectors)) {
+	if (!end_that_request_chunk(rq, uptodate, nr_bytes)) {
 		add_disk_randomness(rq->rq_disk);
 		if (!list_empty(&rq->queuelist))
 			blkdev_dequeue_request(rq);
@@ -103,6 +103,7 @@ static int __ide_end_request(ide_drive_t *drive, struct request *rq,
 
 int ide_end_request (ide_drive_t *drive, int uptodate, int nr_sectors)
 {
+	unsigned int nr_bytes = nr_sectors << 9;
 	struct request *rq;
 	unsigned long flags;
 	int ret = 1;
@@ -114,10 +115,14 @@ int ide_end_request (ide_drive_t *drive, int uptodate, int nr_sectors)
 	spin_lock_irqsave(&ide_lock, flags);
 	rq = HWGROUP(drive)->rq;
 
-	if (!nr_sectors)
-		nr_sectors = rq->hard_cur_sectors;
+	if (!nr_bytes) {
+		if (blk_pc_request(rq))
+			nr_bytes = rq->data_len;
+		else
+			nr_bytes = rq->hard_cur_sectors << 9;
+	}
 
-	ret = __ide_end_request(drive, rq, uptodate, nr_bytes);
+	ret = __ide_end_request(drive, rq, uptodate, nr_bytes);
 
 	spin_unlock_irqrestore(&ide_lock, flags);
 	return ret;
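
The byte-count selection in the patch above can be sketched in isolation
as follows. This is a user-space illustration only: struct fake_request
and bytes_to_end() are hypothetical stand-ins for the handful of
struct request fields the patch reads, not the block layer's types:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the request fields the patch looks at. */
struct fake_request {
	bool is_pc;			/* packet-command (non-fs) request? */
	unsigned int data_len;		/* byte count, filled in by pc requests */
	unsigned int hard_cur_sectors;	/* sector count, filled in by fs requests */
};

/* Return how many bytes to complete. Passing 0 sectors means "pick the
 * size from the request itself", using the field that matches its type. */
static unsigned int bytes_to_end(const struct fake_request *rq, int nr_sectors)
{
	unsigned int nr_bytes = (unsigned int)nr_sectors << 9;

	if (!nr_bytes)
		nr_bytes = rq->is_pc ? rq->data_len
				     : rq->hard_cur_sectors << 9;
	return nr_bytes;
}
```

Before the change, a pc request with hard_cur_sectors left at 0 would
yield a size of 0 here, which is the case that kept the request from
ever being terminated.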

-- 
Jens Axboe


Re: [PATCH 33/33] IDE: sg chaining support

2007-07-19 Thread Jens Axboe

On Wed, Jul 18 2007, Bartlomiej Zolnierkiewicz wrote:
> On Monday 16 July 2007, Jens Axboe wrote:
> > Cc: [EMAIL PROTECTED]
> > Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
> 
> Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>

(for both acks) Thanks for reviewing and acking!

-- 
Jens Axboe


Re: [PATCH] Check for compound pages in set_page_dirty()

2007-07-19 Thread Jens Axboe

On Wed, Jul 18 2007, Hugh Dickins wrote:
> On Wed, 18 Jul 2007, Jens Axboe wrote:
> > 
> > Since I had my hands dirty already...
> 
> Great, thanks.  (There's also such a test in fs/nfs/direct.c,
> but let's not trouble Trond until we've settled what to do here.)
> 
> > 
> > ---
> > 
> > [PATCH] Remove PageCompound() checks before calling set_page_dirty()
> > 
> > Pre commit 41d78ba55037468e6c86c53e3076d1a74841de39 it was illegal
> > to call set_page_dirty() on a compound page, since it stored the
> > destructor in the mapping field. But now it's ok, so remove the
> > ugly PageCompound() checks from bio and direct-io.
> > 
> > Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
> 
> I was about to Ack that, now that I've found something or other in the
> libhugetlb testsuite comes this way, even on page[1], without showing
> any problem.
> 
> However, I have noticed a particular inefficiency arising: that
> bio_check_pages_dirty test specifically avoids pages already
> PageDirty; but hugetlbfs_set_page_dirty carefully redirects to
> set the head page dirty: so tail pages of a hugetlb compound page
> will tend never to be PageDirty, and keep on coming back this way.
> 
> Which led me to look up the origin of those PageCompound tests:
> Author: Andrew Morton <[EMAIL PROTECTED]>
> Date:   Sun Sep 21 01:42:22 2003 -0700
> 
>     [PATCH] Speed up direct-io hugetlbpage handling
>     
>     This patch short-circuits all the direct-io page dirtying logic for
>     higher-order pages.  Without this, we pointlessly bounce BIOs up to keventd
>     all the time.
> 
> diff --git a/fs/bio.c b/fs/bio.c
> index d016523..2463163 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -532,6 +532,12 @@ void bio_unmap_user(struct bio *bio, int write_to_vm)
>   * check that the pages are still dirty.   If so, fine.  If not, redirty them
>   * in process context.
>   *
> + * We special-case compound pages here: normally this means reads into hugetlb
> + * pages.  The logic in here doesn't really work right for compound pages
> + * because the VM does not uniformly chase down the head page in all cases.
> + * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
> + * handle them at all.  So we skip compound pages here at an early stage.
> ...
> 
> It looks like I was wrong in thinking it was just trying to avoid 
> the crash on page[1].mapping.  At the least, your patch needs also
> to remove that paragraph of comment from Andrew.  But really, it
> looks like those PageCompound tests should stay, unless you can
> persuade Andrew to Ack their removal.
> 
> Except (now, how many times can I change my mind in the course of
> one email?), hugetlbfs_set_page_dirty was specifically added by
> Ken Chen to avoid losing data via /proc/sys/vm/drop_caches.  Yet
> fs/bio.c is carefully avoiding going there when dirtying a hugepage.
> How does this work?  Looks like those PageCompound tests need to go!

Hehe, that didn't really get us much further, did it? :-)

My opinion is that since the win is marginal at best, we want to remove
such tests as it just clutters up the code. And it's definitely not
obvious why the tests are there, since they are not commented at all.
Since it's even confusing you, then we can't expect the more vm ignorant
of us (which definitely includes me) to grasp it!

> I'm lost: I hope Andrew and Ken can sort it out for us.

Posting a revised version, still leaving nfs out of it (I'll ping Trond
to do the same, if this goes in).

diff --git a/fs/bio.c b/fs/bio.c
index 33e4634..dcbb160 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -884,12 +884,6 @@ struct bio *bio_map_kern(request_queue_t *q, void *data, unsigned int len,
  * check that the pages are still dirty.   If so, fine.  If not, redirty them
  * in process context.
  *
- * We special-case compound pages here: normally this means reads into hugetlb
- * pages.  The logic in here doesn't really work right for compound pages
- * because the VM does not uniformly chase down the head page in all cases.
- * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
- * handle them at all.  So we skip compound pages here at an early stage.
- *
  * Note that this code is very hard to test under normal circumstances because
  * direct-io pins the pages with get_user_pages().  This makes
  * is_page_cache_freeable return false, and the VM will not clean the pages.
@@ -911,7 +905,7 @@ void bio_set_pages_dirty(struct bio *bio)
 	for (i = 0; i < bio->bi_vcnt; i++) {
 		struct page *page = bvec[i].bv_page;
 
-		if (page && !PageCompound(page))
+		if (page)
 			set_page_dirty_lock(page);
 	}
 }
@@ -978,7 +972,7 @@ void bio_check_pages_dirty(struct bio *bio)
 	for (i = 0; i < bio->bi_vcnt; i++) {
 		struct page *page = bvec[i].bv_page;
 
-		if (PageDirty(page) || PageCompound(page)) {
+		if (PageDirty(page)) {
 			page_cache_release(page);
 			bvec[i].bv_page = NULL;
 		}
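
The inefficiency Hugh describes in the quoted text (tail pages never
becoming PageDirty because the dirtying is redirected to the head page)
can be modelled with a toy structure. This is only a sketch under that
description: struct mini_page and both functions are hypothetical and do
not reflect the real struct page layout or the hugetlbfs code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a compound page: tails point at their head. */
struct mini_page {
	bool dirty;
	struct mini_page *head;	/* NULL for a head page or an ordinary page */
};

/* Set the head page dirty, as the thread says hugetlbfs_set_page_dirty does. */
static void mini_set_page_dirty(struct mini_page *page)
{
	struct mini_page *target = page->head ? page->head : page;

	target->dirty = true;
}

/* Dirty every page, then count those whose own flag is still clear.
 * These are the pages a PageDirty()-style check would keep revisiting. */
static size_t count_not_dirty(struct mini_page *pages, size_t n)
{
	size_t i, clean = 0;

	for (i = 0; i < n; i++) {
		mini_set_page_dirty(&pages[i]);
		if (!pages[i].dirty)
			clean++;
	}
	return clean;
}
```

For one head page with two tails, the count comes out as 2: only the
head ever carries the dirty flag, which is why a check on the tail
pages themselves keeps failing.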

Re: System hangs on running kernbench

2007-07-19 Thread Dhaval Giani

> (gdb) thread 6
> [Switching to thread 6 (process 6233)]#0  __do_softirq ()
> at kernel/softirq.c:231
> 231 if (pending & 1) {
> (gdb) bt
> #0  __do_softirq () at kernel/softirq.c:231
> #1  0xc012998b in do_softirq () at kernel/softirq.c:269
> #2  0xc0129a09 in irq_exit () at kernel/softirq.c:305
> #3  0xc0117443 in smp_apic_timer_interrupt (regs=Variable "regs" is not available.
> )
> at arch/i386/kernel/apic.c:592
> #4  0xc0105877 in apic_timer_interrupt () at include/asm/current.h:11
> #5  0xc0564480 in contig_page_data ()
> #6  0x0007 in ?? ()
> #7  0x0001 in ?? ()
> #8  0x in ?? ()

Looks interesting.

-- 
regards,
Dhaval

I would like to change the world but they don't give me the source code!

Re: regression: disk error loop (panic?) ide_do_rw_disk-bad:

2007-07-19 Thread Jens Axboe

On Thu, Jul 19 2007, Giacomo Catenazzi wrote:
> Linus Torvalds wrote:
> >
> > On Thu, 19 Jul 2007, Bartlomiej Zolnierkiewicz wrote:
> >> Thanks for finding and fixing this.
> >>
> >> The latest patch (with additional cleanups) also looks good and should be
> >> safe enough (unchanged behavior for all non-pc requests) to merge it now.
> >>
> >> Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
> >
> > Ok, Jens - mind signing off on the patch you sent out, and writing an 
> > explanatory message? Feel free to just crib from my explanation of my 
> > original patch, or whatever.
> >
> > And it would be beautiful if people who saw the bad behaviour before 
> > reverting the ide.c changes were to go back to that broken state, and try 
> > the patch, and just verify that it acts like it should (ie you should see 
> > just a few error messages, and it shouldn't cause the IDE layer to go 
> > ballistic any more).
>
> Ok, I tested a5fcaa210626a79465321e344c91a6a7dc3881fa with
> Jens' patch with clean-up (Message-ID:
> [EMAIL PROTECTED]).
>
> I don't see the error loop, only 4 errors (2 for each hd, at hddtemp
> start):
>
> Jul 19 08:22:19 catee kernel: hda: selected mode 0x45
> Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hda:
> type=2, flags=104c8
> Jul 19 08:22:23 catee kernel:
> Jul 19 08:22:23 catee kernel: sector 14657019, nr/cnr 0/0
> Jul 19 08:22:23 catee kernel: bio c21a4780, biotail c21a4780, buffer
> , data , len 36
> Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
> 00 00 00 00
> Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hdc:
> type=2, flags=104c8
> Jul 19 08:22:23 catee kernel:
> Jul 19 08:22:23 catee kernel: sector 34711027, nr/cnr 0/0
> Jul 19 08:22:23 catee kernel: bio c21a4740, biotail c21a4740, buffer
> , data , len 36
> Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
> 00 00 00 00
> Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hdc:
> type=2, flags=104c8
> Jul 19 08:22:23 catee kernel:
> Jul 19 08:22:23 catee kernel: sector 7387152, nr/cnr 0/0
> Jul 19 08:22:23 catee kernel: bio c21a4900, biotail c21a4900, buffer
> , data , len 36
> Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
> 00 00 00 00
> Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hda:
> type=2, flags=104c8
> Jul 19 08:22:23 catee kernel:
> Jul 19 08:22:23 catee kernel: sector 7387152, nr/cnr 0/0
> Jul 19 08:22:23 catee kernel: bio c21a4900, biotail c21a4900, buffer
> , data , len 36
> Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
> 00 00 00 00

Perfect, thanks a lot for testing!

Tested-By: Giacomo Catenazzi [EMAIL PROTECTED]

Linus, if you merge the patch I sent, can you just add this Tested-by?

-- 
Jens Axboe


Re: [patch 0/3] PS3 Storage Drivers for 2.6.23, take 5

2007-07-19 Thread Alessandro Rubini


Hello.

> I didn't hear anything from the misc device maintainer (for the FLASH ROM
> Storage Driver).

Actually, I am not acting as a maintainer. I'm not active enough nor
up to date with all the structure of kernel maintenance. So please
don't wait for me.

I tried a couple of times over the years to have my name removed from
the MAINTAINERS file, without success. I didn't care a lot because
nobody really used that entry. I think it's time for me to learn how
to do it in the proper way.

Regards
/alessandro

Re: [PATCH 1/1] i386: Geode's TSC is not neccessary to mark tu unstable

2007-07-19 Thread Juergen Beisert

On Thursday 19 July 2007 03:02, Andrew Morton wrote:
> On Sun, 15 Jul 2007 21:06:27 +0200
>
> Juergen Beisert <[EMAIL PROTECTED]> wrote:
> > Replace NSC/Cyrix specific chipset access macros by inlined functions.
> > With the macros a line like this fails (and does nothing):
> > setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);
> > With inlined functions this line will work as expected.
>
> I don't get it.  Why would the macros behave differently from inlined
> functions?

X86 magic. The access order is important. The first access must always be the 
offset at 0x22. This access enables the next access to 0x23 (data). If you do 
it in wrong order, it fails. With the macros you get something like 0x22, 
0x22, 0x23, 0x23. With the inline functions 0x22,0x23,0x22,0x23.
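
To make the ordering concrete, here is a minimal userspace sketch (an
illustration only, not the kernel code). The port accessors are simulated
and merely record which port is touched; REG_CCR2 and every helper name
below are made up for the example, and the macro only mirrors the
"write index to 0x22, then data to 0x23" shape described above.

```c
#include <stddef.h>

#define REG_CCR2 0xc2	/* illustrative register index, assumed for the demo */

/* Simulated I/O space: log every port access in the order it happens. */
unsigned short port_log[16];
size_t port_log_len;
static unsigned char sim_index;		/* last value written to 0x22 */
static unsigned char sim_regs[256];	/* simulated configuration registers */

static void sim_outb(unsigned char val, unsigned short port)
{
	if (port_log_len < 16)
		port_log[port_log_len++] = port;
	if (port == 0x22)
		sim_index = val;
	else if (port == 0x23)
		sim_regs[sim_index] = val;
}

static unsigned char sim_inb(unsigned short port)
{
	if (port_log_len < 16)
		port_log[port_log_len++] = port;
	return port == 0x23 ? sim_regs[sim_index] : 0;
}

/* Read a register: index write to 0x22, then data read from 0x23. */
static unsigned char get_cx86(unsigned char reg)
{
	sim_outb(reg, 0x22);
	return sim_inb(0x23);
}

/* Macro form: the index write is emitted BEFORE 'data' is evaluated. */
#define SET_CX86_MACRO(reg, data) do {	\
	sim_outb((reg), 0x22);		\
	sim_outb((data), 0x23);		\
} while (0)

/* Function form: 'data' is fully evaluated before the body runs. */
static inline void set_cx86_inline(unsigned char reg, unsigned char data)
{
	sim_outb(reg, 0x22);
	sim_outb(data, 0x23);
}

/* Logs 0x22, 0x22, 0x23, 0x23: the nested read lands between the
 * outer index write and the outer data write. */
void demo_macro(void)
{
	port_log_len = 0;
	SET_CX86_MACRO(REG_CCR2, get_cx86(REG_CCR2) | 0x88);
}

/* Logs 0x22, 0x23, 0x22, 0x23: read completes, then the write starts. */
void demo_inline(void)
{
	port_log_len = 0;
	set_cx86_inline(REG_CCR2, get_cx86(REG_CCR2) | 0x88);
}
```

Calling demo_macro() and demo_inline() and then inspecting port_log shows
the two access sequences side by side.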

Juergen

Re: 2.6.22-git10 compile error

2007-07-19 Thread Cornelia Huck

On Thu, 19 Jul 2007 09:28:26 +0300,
Plamen Petrov <[EMAIL PROTECTED]> wrote:

> Hi, all!
>
> Just for the record - linux kernel version 2.6.22-git10 fails to build
> with the following error:
>
> In file included from net/netfilter/xt_connlimit.c:27:
> include/net/netfilter/nf_conntrack.h:99: error: field `ct_general' has 
> incomplete type
> include/net/netfilter/nf_conntrack.h: In function `nf_ct_get':
> include/net/netfilter/nf_conntrack.h:163: error: structure has no member 
> named `nfct'
> include/net/netfilter/nf_conntrack.h: In function `nf_ct_put':
> include/net/netfilter/nf_conntrack.h:170: error: implicit declaration of 
> function `nf_conntrack_put'
> include/net/netfilter/nf_conntrack.h: In function `nf_ct_is_untracked':
> include/net/netfilter/nf_conntrack.h:252: error: structure has no member 
> named `nfct'
> In file included from net/netfilter/xt_connlimit.c:28:
> include/net/netfilter/nf_conntrack_core.h: In function 
> `nf_conntrack_confirm':
> include/net/netfilter/nf_conntrack_core.h:68: error: structure has no 
> member named `nfct'
> make[2]: *** [net/netfilter/xt_connlimit.o] Error 1
> make[1]: *** [net/netfilter] Error 2
> make: *** [net] Error 2

This is fixed with commit 3fd8f9e4b6c184d03d340bc86630f700de967fa8.

2.6.22-git10 compile error

2007-07-19 Thread Plamen Petrov


Hi, all!

Just for the record - linux kernel version 2.6.22-git10 fails to build
with the following error:

In file included from net/netfilter/xt_connlimit.c:27:
include/net/netfilter/nf_conntrack.h:99: error: field `ct_general' has 
incomplete type

include/net/netfilter/nf_conntrack.h: In function `nf_ct_get':
include/net/netfilter/nf_conntrack.h:163: error: structure has no member 
named `nfct'

include/net/netfilter/nf_conntrack.h: In function `nf_ct_put':
include/net/netfilter/nf_conntrack.h:170: error: implicit declaration of 
function `nf_conntrack_put'

include/net/netfilter/nf_conntrack.h: In function `nf_ct_is_untracked':
include/net/netfilter/nf_conntrack.h:252: error: structure has no member 
named `nfct'

In file included from net/netfilter/xt_connlimit.c:28:
include/net/netfilter/nf_conntrack_core.h: In function 
`nf_conntrack_confirm':
include/net/netfilter/nf_conntrack_core.h:68: error: structure has no 
member named `nfct'

make[2]: *** [net/netfilter/xt_connlimit.o] Error 1
make[1]: *** [net/netfilter] Error 2
make: *** [net] Error 2

Attached is the kernel config used, system is running Slackware 11 on
AMD Duron, gcc version is 3.4.6.

For more info - mail me.

--
Plamen Petrov, network administrator
Technical College - Silistra,
RU Angel Kantchev
http://tk.ru.acad.bg/

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.22-git10
# Thu Jul 19 06:44:59 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=15
# CONFIG_CPUSETS is not set
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set
# CONFIG_BLK_DEV_BSG is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED=cfq

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MCORE2 is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y

Re: System hangs on running kernbench

2007-07-19 Thread Dhaval Giani

On Wed, Jul 18, 2007 at 08:16:37PM +0530, Dhaval Giani wrote:
> Hi Andrew,
>
> On Wed, Jul 18, 2007 at 03:11:42PM +0530, Dhaval Giani wrote:
> > On Wed, Jul 18, 2007 at 01:07:00AM -0700, Andrew Morton wrote:
> > > On Wed, 18 Jul 2007 13:26:48 +0530 Dhaval Giani <[EMAIL PROTECTED]> wrote:
> >
> > In the meantime I will go and check if it was there in 2.6.22-rc4-mm2
> >
>
> It is hanging with 2.6.22-rc4-mm2 as well as on the latest git on
> kernel.org (2.6.22-git10).
>
> I will get back to you with more information as soon as I have it.

Hi Andrew,

I've got a crash dump and stack traces. They are as follows (The trace
is on 2.6.22-git10)


(gdb) thread 1
[Switching to thread 1 (process 8096)]#0  delay_tsc (loops=1)
at include/asm/msr.h:64
64  {
(gdb) bt
#0  delay_tsc (loops=1) at include/asm/msr.h:64
#1  0xc0245130 in __delay (loops=Variable loops is not available.
) at arch/i386/lib/delay.c:74
#2  0xc0247115 in __spin_lock_debug (lock=0xc0564480)
at lib/spinlock_debug.c:111
#3  0xc02471cc in _raw_spin_lock (lock=0xc0564480) at lib/spinlock_debug.c:132
#4  0xc041ad3e in _spin_lock_irq (lock=0xc0564480) at kernel/spinlock.c:105
#5  0xc015ff2c in shrink_active_list (nr_pages=32, zone=0xc0563300, 
sc=0xd65b3e60, priority=5) at mm/vmscan.c:926
#6  0xc01602a3 in shrink_zone (priority=5, zone=0xc0563300, sc=0xd65b3e60)
at mm/vmscan.c:1044
#7  0xc016036c in shrink_zones (priority=5, zones=0xc056584c, sc=0xd65b3e60)
at mm/vmscan.c:1101
#8  0xc0160488 in try_to_free_pages (zones=0xc056584c, order=Variable order 
is not available.
)
at mm/vmscan.c:1153
#9  0xc015c190 in __alloc_pages (gfp_mask=688338, order=0, zonelist=0xc0565848)
at mm/page_alloc.c:1336
#10 0xc0165285 in do_anonymous_page (mm=0xe4498280, vma=0xd3afef3c, 
address=3083890688, page_table=0xd65ca838, pmd=0xe3e33df0, write_access=1)
at include/linux/gfp.h:100
#11 0xc0165a58 in __handle_mm_fault (mm=0xe4498280, vma=0xd3afef3c, 
address=3083890688, write_access=1) at mm/memory.c:2549
#12 0xc041c984 in do_page_fault (regs=0xd65b3fb8, error_code=6)
at include/linux/mm.h:776
#13 0xc041b37a in error_code () at include/linux/sched.h:13
#14 0x006c in ?? ()
#15 0x001b in ?? ()
#16 0x in ?? ()
(gdb) thread 2
[Switching to thread 2 (process 7371)]#0  __spin_lock_debug (lock=0xc0564480)
at include/asm/spinlock.h:88
88  {
(gdb) bt
#0  __spin_lock_debug (lock=0xc0564480) at include/asm/spinlock.h:88
#1  0xc02471cc in _raw_spin_lock (lock=0xc0564480) at lib/spinlock_debug.c:132
#2  0xc041ad3e in _spin_lock_irq (lock=0xc0564480) at kernel/spinlock.c:105
#3  0xc0160181 in shrink_active_list (nr_pages=32, zone=0xc0563300, sc=Variable 
sc is not available.
)
at mm/vmscan.c:994
#4  0xc01602a3 in shrink_zone (priority=5, zone=0xc0563300, sc=0xeea67e60)
at mm/vmscan.c:1044
#5  0xc016036c in shrink_zones (priority=5, zones=0xc056584c, sc=0xeea67e60)
at mm/vmscan.c:1101
#6  0xc0160488 in try_to_free_pages (zones=0xc056584c, order=Variable order 
is not available.
)
at mm/vmscan.c:1153
#7  0xc015c190 in __alloc_pages (gfp_mask=688338, order=0, zonelist=0xc0565848)
at mm/page_alloc.c:1336
#8  0xc0165285 in do_anonymous_page (mm=0xeeaabdc0, vma=0xf137c7ac, 
address=3084021760, page_table=0xeeabf938, pmd=0xeeaafdf0, write_access=1)
at include/linux/gfp.h:100
#9  0xc0165a58 in __handle_mm_fault (mm=0xeeaabdc0, vma=0xf137c7ac, 
address=3084021760, write_access=1) at mm/memory.c:2549
#10 0xc041c984 in do_page_fault (regs=0xeea67fb8, error_code=6)
at include/linux/mm.h:776
#11 0xc041b37a in error_code () at include/linux/sched.h:13
#12 0x002c in ?? ()
#13 0x000b in ?? ()
#14 0x in ?? ()
(gdb) thread 3
[Switching to thread 3 (process 8392)]#0  __spin_lock_debug (lock=0xc0564480)
at include/asm/spinlock.h:88
88  {
(gdb) bt
#0  __spin_lock_debug (lock=0xc0564480) at include/asm/spinlock.h:88
#1  0xc02471cc in _raw_spin_lock (lock=0xc0564480) at lib/spinlock_debug.c:132
#2  0xc041ad3e in _spin_lock_irq (lock=0xc0564480) at kernel/spinlock.c:105
#3  0xc016000f in shrink_active_list (nr_pages=32, zone=0xc0563300, sc=Variable 
sc is not available.
)
at mm/vmscan.c:950
#4  0xc01602a3 in shrink_zone (priority=5, zone=0xc0563300, sc=0xf733de60)
at mm/vmscan.c:1044
#5  0xc016036c in shrink_zones (priority=5, zones=0xc056584c, sc=0xf733de60)
at mm/vmscan.c:1101
#6  0xc0160488 in try_to_free_pages (zones=0xc056584c, order=Variable order 
is not available.
)
at mm/vmscan.c:1153
#7  0xc015c190 in __alloc_pages (gfp_mask=688338, order=0, zonelist=0xc0565848)
at mm/page_alloc.c:1336
#8  0xc0165285 in do_anonymous_page (mm=0xf3054940, vma=0xf60496a4, 
address=135188540, page_table=0xd275c768, pmd=0xf317d200, write_access=1)
at include/linux/gfp.h:100
#9  0xc0165a58 in __handle_mm_fault (mm=0xf3054940, vma=0xf60496a4, 
address=135188540, write_access=1) at mm/memory.c:2549
#10 0xc041c984 in do_page_fault (regs=0xf733dfb8, error_code=6)
at

Re: regression: disk error loop (panic?) ide_do_rw_disk-bad:

2007-07-19 Thread Giacomo Catenazzi

Linus Torvalds wrote:
 
> On Thu, 19 Jul 2007, Bartlomiej Zolnierkiewicz wrote:
>> Thanks for finding and fixing this.
>>
>> The latest patch (with additional cleanups) also looks good and should be
>> safe enough (unchanged behavior for all non-pc requests) to merge it now.
>>
>> Acked-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
>
> Ok, Jens - mind signing off on the patch you sent out, and writing an 
> explanatory message? Feel free to just crib from my explanation of my 
> original patch, or whatever.
>
> And it would be beautiful if people who saw the bad behaviour before 
> reverting the ide.c changes were to go back to that broken state, and try 
> the patch, and just verify that it acts like it should (ie you should see 
> just a few error messages, and it shouldn't cause the IDE layer to go 
> ballistic any more).

Ok, I tested a5fcaa210626a79465321e344c91a6a7dc3881fa with
Jens' patch with clean-up (Message-ID:
[EMAIL PROTECTED]).

I don't see the error loop, only 4 errors (2 for each hd, at hddtemp
start):

Jul 19 08:22:19 catee kernel: hda: selected mode 0x45
Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hda:
type=2, flags=104c8
Jul 19 08:22:23 catee kernel:
Jul 19 08:22:23 catee kernel: sector 14657019, nr/cnr 0/0
Jul 19 08:22:23 catee kernel: bio c21a4780, biotail c21a4780, buffer
, data , len 36
Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
00 00 00 00
Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hdc:
type=2, flags=104c8
Jul 19 08:22:23 catee kernel:
Jul 19 08:22:23 catee kernel: sector 34711027, nr/cnr 0/0
Jul 19 08:22:23 catee kernel: bio c21a4740, biotail c21a4740, buffer
, data , len 36
Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
00 00 00 00
Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hdc:
type=2, flags=104c8
Jul 19 08:22:23 catee kernel:
Jul 19 08:22:23 catee kernel: sector 7387152, nr/cnr 0/0
Jul 19 08:22:23 catee kernel: bio c21a4900, biotail c21a4900, buffer
, data , len 36
Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
00 00 00 00
Jul 19 08:22:23 catee kernel: ide_do_rw_disk - bad command: dev hda:
type=2, flags=104c8
Jul 19 08:22:23 catee kernel:
Jul 19 08:22:23 catee kernel: sector 7387152, nr/cnr 0/0
Jul 19 08:22:23 catee kernel: bio c21a4900, biotail c21a4900, buffer
, data , len 36
Jul 19 08:22:23 catee kernel: cdb: 12 00 00 00 24 00 00 00 00 00 00 00
00 00 00 00
Jul 19 08:22:27 catee kernel: ttyS1: LSR safety check engaged!

The latest git tree gives me no errors.

patch in Message-ID: [EMAIL PROTECTED]

Tested-By: Giacomo Catenazzi [EMAIL PROTECTED]

ciao
cate


Re: [PATH 0/1] Kexec jump - v2 - the first step to kexec based hibernation

2007-07-19 Thread Huang, Ying

On Wed, 2007-07-18 at 18:04 -0700, Andrew Morton wrote:
> I like the idea but I think I'll let people chat about it a bit more
> before looking at merging the patches, OK?

I think maybe we should wait for Rafael to separate the device hibernate
(quiesce and state save) from device suspend. Without that, the ACPI
issue can not be resolved.

Best Regards,
Huang Ying

Re: [BUGFIX]{PATCH] flush icache on ia64 take2

2007-07-19 Thread KAMEZAWA Hiroyuki

On Fri, 6 Jul 2007 11:29:01 +0900
KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:

> This is a patch for fixing icache flush race in ia64(Montecito) by 
> implementing
> flush_icache_page() et al.
>
> Changelog:
>  - updated against 2.6.22-rc7 (previous one was against 2.6.21)
>  - removed hugetlb's lazy_mmu_prot_update().
>  - rewrote patch description.
>  - removed patch against mprotect() if flushes cache.
>
Then, what more should I do to fix this SIGILL problem?

-Kame


Re: [PATCH 3/3]x86_64: early_printk for early debug port support

2007-07-19 Thread Yinghai Lu


On 7/18/07, Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Monday 21 May 2007 07:19:18 Yinghai Lu wrote:
> > add early dbgp to early_printk.
> >
> > kernel command line:
> > earlyprintk=dbgp
> > or
> > earlyprintk=dbgp1
>
> Just checking some old patches. Was there ever an update for this one?
> What were the testing results?
>
> -Andi



Please check the attachment; it is the diff against Linus's current git tree.

After removing the PCI quirks for USB handoff, the boot log comes through
until OHCI tries to reset the port with the debug device; that reset fails.

Maybe Greg could continue debugging it.

YH
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9a54148..956d8dc 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -550,11 +550,12 @@ and is between 256 and 4096 characters. It is defined in the file
 	earlyprintk=	[IA-32,X86-64,SH]
 			earlyprintk=vga
 			earlyprintk=serial[,ttySn[,baudrate]]
+			earlyprintk=dbgp
 
 			Append ,keep to not disable it when the real console
 			takes over.
 
-			Only vga or serial at a time, not both.
+			Only vga or serial or usb debug port at a time.
 
 			Currently only ttyS0 and ttyS1 are supported.
 
diff --git a/arch/x86_64/kernel/early_printk.c b/arch/x86_64/kernel/early_printk.c
index fd9aff3..3621d68 100644
--- a/arch/x86_64/kernel/early_printk.c
+++ b/arch/x86_64/kernel/early_printk.c
@@ -3,10 +3,19 @@
 #include <linux/init.h>
 #include <linux/string.h>
 #include <linux/screen_info.h>
+#include <linux/usb/ch9.h>
+#include <linux/pci_regs.h>
+#include <linux/pci_ids.h>
+#include <linux/errno.h>
 #include <asm/io.h>
 #include <asm/processor.h>
 #include <asm/fcntl.h>
 #include <xen/hvc-console.h>
+#include <asm/pci-direct.h>
+#include <asm/pgtable.h>
+#include <asm/fixmap.h>
+#define EARLY_PRINTK
+#include "../../../drivers/usb/host/ehci.h"
 
 /* Simple VGA output */
 
@@ -156,6 +165,594 @@ static struct console early_serial_console = {
 	.index =	-1,
 };
 
+
+static struct ehci_caps __iomem *ehci_caps;
+static struct ehci_regs __iomem *ehci_regs;
+static struct ehci_dbg_port __iomem *ehci_debug;
+static unsigned dbgp_endpoint_out;
+
+#define USB_DEBUG_DEVNUM 127
+
+#define DBGP_DATA_TOGGLE	0x8800
+
+static inline u32 dbgp_pid_update(u32 x, u32 tok)
+{
+	return ((((x) ^ DBGP_DATA_TOGGLE) & 0xffff00) | ((tok) & 0xff));
+}
+
+static inline u32 dbgp_len_update(u32 x, u32 len)
+{
+	return (((x) & ~0x0f) | ((len) & 0x0f));
+}
+
+/*
+ * USB Packet IDs (PIDs)
+ */
+
+/* token */
+#define USB_PID_OUT		0xe1
+#define USB_PID_IN		0x69
+#define USB_PID_SOF		0xa5
+#define USB_PID_SETUP		0x2d
+/* handshake */
+#define USB_PID_ACK		0xd2
+#define USB_PID_NAK		0x5a
+#define USB_PID_STALL		0x1e
+#define USB_PID_NYET		0x96
+/* data */
+#define USB_PID_DATA0		0xc3
+#define USB_PID_DATA1		0x4b
+#define USB_PID_DATA2		0x87
+#define USB_PID_MDATA		0x0f
+/* Special */
+#define USB_PID_PREAMBLE	0x3c
+#define USB_PID_ERR		0x3c
+#define USB_PID_SPLIT		0x78
+#define USB_PID_PING		0xb4
+#define USB_PID_UNDEF_0		0xf0
+
+#define USB_PID_DATA_TOGGLE	0x88
+#define DBGP_CLAIM (DBGP_OWNER | DBGP_ENABLED | DBGP_INUSE)
+
+#define PCI_CAP_ID_EHCI_DEBUG	0xa
+
+#define HUB_ROOT_RESET_TIME	50	/* times are in msec */
+#define HUB_SHORT_RESET_TIME	10
+#define HUB_LONG_RESET_TIME	200
+#define HUB_RESET_TIMEOUT	500
+
+#define DBGP_MAX_PACKET		8
+
+static int dbgp_wait_until_complete(void)
+{
+	unsigned ctrl;
+	int loop = 0x10;
+
+	do {
+		ctrl = readl(&ehci_debug->control);
+		/* Stop when the transaction is finished */
+		if (ctrl & DBGP_DONE)
+			break;
+	} while (--loop > 0);
+
+	if (!loop)
+		return -1;
+
+	/* Now that we have observed the completed transaction,
+	 * clear the done bit.
+	 */
+	writel(ctrl | DBGP_DONE, &ehci_debug->control);
+	return (ctrl & DBGP_ERROR) ? -DBGP_ERRCODE(ctrl) : DBGP_LEN(ctrl);
+}
+
+static void dbgp_mdelay(int ms)
+{
+	int i;
+	while (ms--) {
+		for (i = 0; i < 1000; i++)
+			outb(0x1, 0x80);
+	}
+}
+
+static void dbgp_breath(void)
+{
+	/* Sleep to give the debug port a chance to breathe */
+}
+
+static int dbgp_wait_until_done(unsigned ctrl)
+{
+	unsigned pids, lpid;
+	int ret;
+	int loop = 3;
+
+retry:
+	writel(ctrl | DBGP_GO, &ehci_debug->control);
+	ret = dbgp_wait_until_complete();
+	pids = readl(&ehci_debug->pids);
+	lpid = DBGP_PID_GET(pids);
+
+	if (ret < 0)
+		return ret;
+
+	/* If the port is getting full or it has dropped data
+	 * start pacing ourselves, not necessary but it's friendly.
+	 */
+	if ((lpid == USB_PID_NAK) || (lpid == USB_PID_NYET))
+		dbgp_breath();
+
+	/* If I get a NACK reissue the transmission */
+	if (lpid == USB_PID_NAK) {
+		if (--loop > 0)
+			goto retry;
+	}
+
+	return ret;
+}
+
+static void dbgp_set_data(const void *buf, int size)
+{
+	const unsigned char *bytes = buf;
+	unsigned lo, hi;
+	int i;
+
+	lo = hi = 0;
+	for (i = 0; i < 4 && i < size; i++)
+		lo |= bytes[i] << (8*i);
+	for (; i < 8 && i < size; i++)
+		hi |= bytes[i] << (8*(i - 4));
+	writel(lo, &ehci_debug->data03);
+	writel(hi, &ehci_debug->data47);
+}
+
+static void dbgp_get_data(void *buf, int size)
+{
+

cmpxchg is not available to generic code

2007-07-19 Thread Andrew Morton

arm:

drivers/char/drm/drm_lock.c: In function `drm_lock_take':
drivers/char/drm/drm_lock.c:221: error: implicit declaration of function 
`cmpxchg'

You might be able to use atomic_cmpxchg, which _is_ present
on all architectures.  Or use a spinlock.
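
For illustration, a compare-and-swap lock-take loop could look roughly
like the sketch below. It is written against C11 <stdatomic.h> so that it
runs in userspace; it is not the kernel's atomic_cmpxchg() interface and
it does not reproduce the semantics of drm_lock_take(). LOCK_HELD and
both function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LOCK_HELD 0x80000000u	/* hypothetical "held" flag bit */

/*
 * Try to take the lock on behalf of 'context'.
 * Returns true if the lock word was changed from "free" to
 * "held by context", false if somebody else already holds it.
 */
bool lock_take(atomic_uint *lock, unsigned int context)
{
	unsigned int old = atomic_load(lock);

	do {
		if (old & LOCK_HELD)
			return false;	/* already held, give up */
		/* On failure the exchange reloads 'old' with the current
		 * value, so the held check is repeated before retrying. */
	} while (!atomic_compare_exchange_weak(lock, &old,
					       context | LOCK_HELD));
	return true;
}

/* Release the lock unconditionally. */
void lock_free(atomic_uint *lock)
{
	atomic_store(lock, 0);
}
```

The spinlock alternative would simply wrap a plain read-modify-write of
the lock word in a lock/unlock pair instead of retrying.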

What's that code doing anyway?  driver-private locking primitives?

[PATCH 0/5][V2] Misc helper patches for pid namespaces

2007-07-19 Thread sukadev

Some helper patches to support multiple pid namespaces. These
were posted earlier on Containers@ mailing list. 

[PATCH 1/5] Define and use task_active_pid_ns() wrapper
[PATCH 2/5] Rename child_reaper() function.
[PATCH 3/5] Use task_pid() to find leader's pid
[PATCH 4/5] Define is_global_init() and is_container_init().
[PATCH 5/5] Move alloc_pid() to copy_process()

Changelog:

- Addressed Oleg Nesterov's comments on [V1] of this patchset.

Re: [PATCH 1/1] i386: Geode's TSC is not neccessary to mark tu unstable

2007-07-19 Thread Andres Salomon

On Thu, 19 Jul 2007 08:49:05 +0200
Juergen Beisert [EMAIL PROTECTED] wrote:

> On Thursday 19 July 2007 03:02, Andrew Morton wrote:
> > On Sun, 15 Jul 2007 21:06:27 +0200
> >
> > Juergen Beisert <[EMAIL PROTECTED]> wrote:
> > > Replace NSC/Cyrix specific chipset access macros by inlined functions.
> > > With the macros a line like this fails (and does nothing):
> > >     setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);
> > > With inlined functions this line will work as expected.
> >
> > I don't get it.  Why would the macros behave differently from inlined
> > functions?
>
> X86 magic. The access order is important. The first access must always be the 
> offset at 0x22. This access enables the next access to 0x23 (data). If you do 
> it in wrong order, it fails. With the macros you get something like 0x22, 
> 0x22, 0x23, 0x23. With the inline functions 0x22,0x23,0x22,0x23.
>
> Juergen

Wow, that's a really cool bug; nice work!  Don't forget to update
arch/i386/kernel/cpu/mtrr/state.c, though; it uses setCx86() as well.  It needs
to include processor-cyrix.h.


Acked-by: Andres Salomon [EMAIL PROTECTED]

-- 
Andres Salomon [EMAIL PROTECTED]

Re: [patch 1/3] ps3: Disk Storage Driver

2007-07-19 Thread Geert Uytterhoeven

On Thu, 19 Jul 2007, Jens Axboe wrote:
> On Wed, Jul 18 2007, Andrew Morton wrote:
> > On Mon, 16 Jul 2007 18:15:40 +0200
> > Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:
> >
> > > From: Geert Uytterhoeven <[EMAIL PROTECTED]>
> > >
> > > Add a Disk Storage Driver for the PS3:
> >
> > Your patchset significantly hits powerpc, scsi and block.  So who gets to
> > merge this?  Jens?  James?  Paul?
> >
> > Me, I guess ;)
>
> I think Paul was going to take it, or at least Geert hinted as such.

Yep, but as I heard Paul is on holidays, I was just going to send it to Andrew
anyway.

With kind regards,
 
Geert Uytterhoeven
Software Architect

Sony Network and Software Technology Center Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium
 
Phone:+32 (0)2 700 8453 
Fax:  +32 (0)2 700 8622 
E-mail:   [EMAIL PROTECTED] 
Internet: http://www.sony-europe.com/

Sony Network and Software Technology Center Europe  
A division of Sony Service Centre (Europe) N.V. 
Registered office: Technologielaan 7 · B-1840 Londerzeel · Belgium  
VAT BE 0413.825.160 · RPR Brussels  
Fortis Bank Zaventem · Swift GEBABEBB08A · IBAN BE39001382358619

[PATCH 1/5] [V2] Define and use task_active_pid_ns() wrapper

2007-07-19 Thread sukadev


Subject: [PATCH 1/5] Define and use task_active_pid_ns() wrapper

From: Sukadev Bhattiprolu [EMAIL PROTECTED]

With multiple pid namespaces, a process is known by some pid_t in
every ancestor pid namespace.  Every time the process forks, the
child process also gets a pid_t in every ancestor pid namespace.

While a process is visible in >=1 pid namespaces, it can see pid_t's
in only one pid namespace.  We call this pid namespace its active
pid namespace, and it is always the youngest pid namespace in which
the process is known.

This patch defines and uses a wrapper to find the active pid namespace
of a process. The implementation of the wrapper will be changed
when support for multiple pid namespaces is added.
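
As a rough userspace model of the idea (not the kernel data structures;
every type and function name below is hypothetical), a task can carry one
pid number per namespace level, with the active namespace being the
deepest level at which it is known:

```c
#include <stddef.h>

#define MAX_NS_DEPTH 4	/* illustrative limit, not a kernel constant */

struct ns_model {
	const struct ns_model *parent;	/* NULL for the initial namespace */
	int level;			/* 0 for the initial namespace */
};

struct task_model {
	const struct ns_model *active;	/* youngest namespace the task is known in */
	int nr[MAX_NS_DEPTH];		/* nr[l] = pid as seen from level l */
};

/* Model of the wrapper: the one namespace whose pids the task can see. */
const struct ns_model *model_active_ns(const struct task_model *t)
{
	return t->active;
}

/*
 * pid of 't' as seen from 'ns', or 0 if 't' is not visible there.
 * A task is visible in its active namespace and in every ancestor of it.
 */
int model_pid_in(const struct task_model *t, const struct ns_model *ns)
{
	const struct ns_model *cur;

	for (cur = t->active; cur != NULL; cur = cur->parent)
		if (cur == ns)
			return t->nr[ns->level];
	return 0;
}
```

In this model a task created in a child namespace has one number in the
initial namespace and another in the child, while a sibling namespace
cannot see it at all.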

Changelog:
2.6.22-rc4-mm2-pidns1:
- [Pavel Emelianov, Alexey Dobriyan] Back out the change to use
  task_active_pid_ns() in child_reaper() since task->nsproxy
  can be NULL during task exit (so child_reaper() continues to
  use init_pid_ns).

2.6.21-rc6-mm1:
- Rename task_pid_ns() to task_active_pid_ns() to reflect that a
  process can have multiple pid namespaces.

Signed-off-by: Sukadev Bhattiprolu [EMAIL PROTECTED]
Acked-by: Pavel Emelianov [EMAIL PROTECTED]

Cc: Eric W. Biederman [EMAIL PROTECTED]
Cc: Cedric Le Goater [EMAIL PROTECTED]
Cc: Dave Hansen [EMAIL PROTECTED]
Cc: Serge Hallyn [EMAIL PROTECTED]
Cc: Herbert Poetzel [EMAIL PROTECTED]
---
 fs/exec.c |2 +-
 fs/proc/proc_misc.c   |3 ++-
 include/linux/pid_namespace.h |7 ++-
 kernel/exit.c |5 +++--
 kernel/nsproxy.c  |2 +-
 kernel/pid.c  |4 ++--
 6 files changed, 15 insertions(+), 8 deletions(-)

Index: lx26-22-rc6-mm1/include/linux/pid_namespace.h
===
--- lx26-22-rc6-mm1.orig/include/linux/pid_namespace.h  2007-07-13 13:07:01.0 -0700
+++ lx26-22-rc6-mm1/include/linux/pid_namespace.h   2007-07-13 18:22:49.0 -0700
@@ -20,7 +20,7 @@ struct pid_namespace {
 	struct pidmap pidmap[PIDMAP_ENTRIES];
 	int last_pid;
 	struct task_struct *child_reaper;
-	struct kmem_cache_t *pid_cachep;
+	struct kmem_cache *pid_cachep;
 };
 
 extern struct pid_namespace init_pid_ns;
@@ -39,6 +39,11 @@ static inline void put_pid_ns(struct pid
 	kref_put(&ns->kref, free_pid_ns);
 }
 
+static inline struct pid_namespace *task_active_pid_ns(struct task_struct *tsk)
+{
+	return tsk->nsproxy->pid_ns;
+}
+
 static inline struct task_struct *child_reaper(struct task_struct *tsk)
 {
 	return init_pid_ns.child_reaper;
Index: lx26-22-rc6-mm1/fs/exec.c
===
--- lx26-22-rc6-mm1.orig/fs/exec.c  2007-07-13 13:05:38.0 -0700
+++ lx26-22-rc6-mm1/fs/exec.c   2007-07-13 18:13:39.0 -0700
@@ -827,7 +827,7 @@ static int de_thread(struct task_struct 
 	 * so it is safe to do it under read_lock.
 	 */
 	if (unlikely(tsk->group_leader == child_reaper(tsk)))
-		tsk->nsproxy->pid_ns->child_reaper = tsk;
+		task_active_pid_ns(tsk)->child_reaper = tsk;
 
 	zap_other_threads(tsk);
 	read_unlock(&tasklist_lock);
Index: lx26-22-rc6-mm1/fs/proc/proc_misc.c
===
--- lx26-22-rc6-mm1.orig/fs/proc/proc_misc.c  2007-07-13 13:05:38.0 -0700
+++ lx26-22-rc6-mm1/fs/proc/proc_misc.c 2007-07-13 13:07:48.0 -0700
@@ -94,7 +94,8 @@ static int loadavg_read_proc(char *page,
 		LOAD_INT(a), LOAD_FRAC(a),
 		LOAD_INT(b), LOAD_FRAC(b),
 		LOAD_INT(c), LOAD_FRAC(c),
-		nr_running(), nr_threads, current->nsproxy->pid_ns->last_pid);
+		nr_running(), nr_threads,
+		task_active_pid_ns(current)->last_pid);
 	return proc_calc_metrics(page, start, off, count, eof, len);
 }
 
Index: lx26-22-rc6-mm1/kernel/exit.c
===
--- lx26-22-rc6-mm1.orig/kernel/exit.c  2007-07-13 13:06:52.0 -0700
+++ lx26-22-rc6-mm1/kernel/exit.c   2007-07-13 18:13:39.0 -0700
@@ -909,8 +909,9 @@ fastcall NORET_TYPE void do_exit(long co
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");
 	if (unlikely(tsk == child_reaper(tsk))) {
-		if (tsk->nsproxy->pid_ns != &init_pid_ns)
-			tsk->nsproxy->pid_ns->child_reaper = init_pid_ns.child_reaper;
+		if (task_active_pid_ns(tsk) != &init_pid_ns)
+			task_active_pid_ns(tsk)->child_reaper =
+				init_pid_ns.child_reaper;
 		else

[PATCH 2/5] [V2] Rename child_reaper() function

2007-07-19 Thread sukadev

Pavel,

Pls ack this if you agree.

Suka
---

Subject: [PATCH 2/5] Rename child_reaper function.

From: Sukadev Bhattiprolu [EMAIL PROTECTED]


Rename the child_reaper() function to task_child_reaper() to be
similar to other task_* functions and to distinguish the function
from 'struct pid_namespace.child_reaper'.

Signed-off-by: Sukadev Bhattiprolu [EMAIL PROTECTED]
---
 fs/exec.c |2 +-
 include/linux/pid_namespace.h |2 +-
 kernel/exit.c |4 ++--
 kernel/signal.c   |2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

Index: lx26-22-rc6-mm1/include/linux/pid_namespace.h
===
--- lx26-22-rc6-mm1.orig/include/linux/pid_namespace.h  2007-07-13 13:07:48.0 -0700
+++ lx26-22-rc6-mm1/include/linux/pid_namespace.h   2007-07-13 13:12:01.0 -0700
@@ -44,7 +44,7 @@ static inline struct pid_namespace *task
 	return tsk->nsproxy->pid_ns;
 }
 
-static inline struct task_struct *child_reaper(struct task_struct *tsk)
+static inline struct task_struct *task_child_reaper(struct task_struct *tsk)
 {
 	return init_pid_ns.child_reaper;
 }
Index: lx26-22-rc6-mm1/fs/exec.c
===
--- lx26-22-rc6-mm1.orig/fs/exec.c  2007-07-13 13:07:48.0 -0700
+++ lx26-22-rc6-mm1/fs/exec.c   2007-07-13 13:12:01.0 -0700
@@ -826,7 +826,7 @@ static int de_thread(struct task_struct 
 	 * Reparenting needs write_lock on tasklist_lock,
 	 * so it is safe to do it under read_lock.
 	 */
-	if (unlikely(tsk->group_leader == child_reaper(tsk)))
+	if (unlikely(tsk->group_leader == task_child_reaper(tsk)))
 		task_active_pid_ns(tsk)->child_reaper = tsk;
 
 	zap_other_threads(tsk);
Index: lx26-22-rc6-mm1/kernel/exit.c
===
--- lx26-22-rc6-mm1.orig/kernel/exit.c  2007-07-13 13:07:48.0 -0700
+++ lx26-22-rc6-mm1/kernel/exit.c   2007-07-13 13:12:01.0 -0700
@@ -695,7 +695,7 @@ forget_original_parent(struct task_struc
 	do {
 		reaper = next_thread(reaper);
 		if (reaper == father) {
-			reaper = child_reaper(father);
+			reaper = task_child_reaper(father);
 			break;
 		}
 	} while (reaper->exit_state);
@@ -908,7 +908,7 @@ fastcall NORET_TYPE void do_exit(long co
 		panic("Aiee, killing interrupt handler!");
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");
-	if (unlikely(tsk == child_reaper(tsk))) {
+	if (unlikely(tsk == task_child_reaper(tsk))) {
 		if (task_active_pid_ns(tsk) != &init_pid_ns)
 			task_active_pid_ns(tsk)->child_reaper =
 				init_pid_ns.child_reaper;
Index: lx26-22-rc6-mm1/kernel/signal.c
===
--- lx26-22-rc6-mm1.orig/kernel/signal.c  2007-07-13 13:06:52.0 -0700
+++ lx26-22-rc6-mm1/kernel/signal.c 2007-07-13 13:12:01.0 -0700
@@ -1853,7 +1853,7 @@ relock:
 		 * within that pid space. It can of course get signals from
 		 * its parent pid space.
 		 */
-		if (current == child_reaper(current))
+		if (current == task_child_reaper(current))
 			continue;
 
 		if (sig_kernel_stop(signr)) {
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 3/5] [V2] Use task_pid() to find leader's pid

2007-07-19 Thread sukadev


Subject: [PATCH 3/5] Use task_pid() to find leader's pid

From: Sukadev Bhattiprolu [EMAIL PROTECTED]

Use task_pid() to get leader's 'struct pid' and avoid the find_pid().

Signed-off-by: Sukadev Bhattiprolu [EMAIL PROTECTED]
Acked-by: Pavel Emelianov [EMAIL PROTECTED]

Cc: Eric W. Biederman [EMAIL PROTECTED]
Cc: Cedric Le Goater [EMAIL PROTECTED]
Cc: Dave Hansen [EMAIL PROTECTED]
Cc: Serge Hallyn [EMAIL PROTECTED]
Cc: Herbert Poetzel [EMAIL PROTECTED]
---
 fs/exec.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: lx26-22-rc6-mm1a/fs/exec.c
===
--- lx26-22-rc6-mm1a.orig/fs/exec.c 2007-07-13 18:23:55.0 -0700
+++ lx26-22-rc6-mm1a/fs/exec.c  2007-07-16 12:56:22.0 -0700
@@ -908,7 +908,7 @@ static int de_thread(struct task_struct 
 */
 		detach_pid(tsk, PIDTYPE_PID);
 		tsk->pid = leader->pid;
-		attach_pid(tsk, PIDTYPE_PID,  find_pid(tsk->pid));
+		attach_pid(tsk, PIDTYPE_PID,  task_pid(leader));
 		transfer_pid(leader, tsk, PIDTYPE_PGID);
 		transfer_pid(leader, tsk, PIDTYPE_SID);
 		list_replace_rcu(&leader->tasks, &tsk->tasks);

[PATCH 4/5] [V2] Define is_global_init() and is_container_init()

2007-07-19 Thread sukadev

Subject: [PATCH 4/5] Define is_global_init() and is_container_init().

From: Serge E. Hallyn [EMAIL PROTECTED]


is_init() is an ambiguous name for the pid==1 check.  Split it into
is_global_init() and is_container_init().

A container init has its tsk->pid == 1.

A global init also has its tsk->pid == 1, and its active pid namespace
is the init_pid_ns.  But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.
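
A minimal user-space sketch of the two predicates described above. The
structs are simplified stand-ins, not the kernel definitions; only the
pid field and the child_reaper pointer are modelled.

```c
#include <stdbool.h>

/* Simplified stand-ins (assumption: one pid per task, as in this patch). */
struct task_struct {
	int pid;
};

struct pid_namespace {
	struct task_struct *child_reaper;
};

/* Models the boot-time state: the global init is the reaper of init_pid_ns. */
static struct task_struct init_task = { .pid = 1 };
static struct pid_namespace init_pid_ns = { .child_reaper = &init_task };

/* True for any task that is pid 1 in its own namespace. */
static bool is_container_init(const struct task_struct *tsk)
{
	return tsk->pid == 1;
}

/* True only for the one task recorded as the reaper of init_pid_ns;
 * a pointer comparison, so no pid lookup is involved. */
static bool is_global_init(const struct task_struct *tsk)
{
	return tsk == init_pid_ns.child_reaper;
}
```

A container's init satisfies the first predicate but not the second,
which is what lets the trap handlers kill only the container.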

Changelog:

2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
  global init (/sbin/init) process. This would improve performance
  and remove dependence on the task_pid().

2.6.21-mm2-pidns2:

- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
  ppc,avr32}/traps.c for the _exception() call to is_global_init().
  This way, we kill only the container if the container's init has a
  bug rather than force a kernel panic.

Signed-off-by: Serge E. Hallyn [EMAIL PROTECTED]
Signed-off-by: Sukadev Bhattiprolu [EMAIL PROTECTED]
Acked-by: Pavel Emelianov [EMAIL PROTECTED]

Cc: Eric W. Biederman [EMAIL PROTECTED]
Cc: Cedric Le Goater [EMAIL PROTECTED]
Cc: Dave Hansen [EMAIL PROTECTED]
Cc: Herbert Poetzel [EMAIL PROTECTED]
---
 arch/alpha/mm/fault.c|2 +-
 arch/arm/mm/fault.c  |2 +-
 arch/arm26/mm/fault.c|2 +-
 arch/avr32/kernel/traps.c|2 +-
 arch/avr32/mm/fault.c|6 +++---
 arch/i386/lib/usercopy.c |2 +-
 arch/i386/mm/fault.c |2 +-
 arch/ia64/mm/fault.c |2 +-
 arch/m68k/mm/fault.c |2 +-
 arch/mips/mm/fault.c |2 +-
 arch/powerpc/kernel/traps.c  |2 +-
 arch/powerpc/mm/fault.c  |2 +-
 arch/powerpc/platforms/pseries/ras.c |2 +-
 arch/ppc/kernel/traps.c  |2 +-
 arch/ppc/mm/fault.c  |2 +-
 arch/s390/lib/uaccess_pt.c   |2 +-
 arch/s390/mm/fault.c |2 +-
 arch/sh/mm/fault.c   |2 +-
 arch/sh64/mm/fault.c |6 +++---
 arch/um/kernel/trap.c|2 +-
 arch/x86_64/mm/fault.c   |2 +-
 arch/xtensa/mm/fault.c   |2 +-
 drivers/char/sysrq.c |2 +-
 include/linux/sched.h|   12 ++--
 kernel/capability.c  |3 ++-
 kernel/exit.c|2 +-
 kernel/kexec.c   |2 +-
 kernel/pid.c |7 +++
 kernel/signal.c  |2 +-
 kernel/sysctl.c  |2 +-
 mm/oom_kill.c|4 ++--
 security/commoncap.c |3 ++-
 32 files changed, 54 insertions(+), 37 deletions(-)

Index: lx26-22-rc6-mm1a/include/linux/sched.h
===
--- lx26-22-rc6-mm1a.orig/include/linux/sched.h 2007-07-16 12:55:15.0 -0700
+++ lx26-22-rc6-mm1a/include/linux/sched.h  2007-07-16 13:10:48.0 -0700
@@ -1219,12 +1219,20 @@ static inline int pid_alive(struct task_
 }
 
 /**
- * is_init - check if a task structure is init
+ * is_global_init - check if a task structure is init
  * @tsk: Task structure to be checked.
  *
  * Check if a task structure is the first user space task the kernel created.
+ *
+ * TODO: We should inline this function after some cleanups in pid_namespace.h
+ */
+extern int is_global_init(struct task_struct *tsk);
+
+/*
+ * is_container_init:
+ * check whether the task is init in its own pid namespace.
  */
-static inline int is_init(struct task_struct *tsk)
+static inline int is_container_init(struct task_struct *tsk)
 {
 	return tsk->pid == 1;
 }
Index: lx26-22-rc6-mm1a/kernel/pid.c
===
--- lx26-22-rc6-mm1a.orig/kernel/pid.c  2007-07-16 12:55:15.0 -0700
+++ lx26-22-rc6-mm1a/kernel/pid.c   2007-07-16 13:10:48.0 -0700
@@ -69,6 +69,13 @@ struct pid_namespace init_pid_ns = {
 	.last_pid = 0,
 	.child_reaper = &init_task
 };
+EXPORT_SYMBOL(init_pid_ns);
+
+int is_global_init(struct task_struct *tsk)
+{
+   return tsk == init_pid_ns.child_reaper;
+}
+EXPORT_SYMBOL(is_global_init);
 
 /*
  * Note: disable interrupts while the pidmap_lock is held as an
Index: lx26-22-rc6-mm1a/arch/alpha/mm/fault.c
===
--- lx26-22-rc6-mm1a.orig/arch/alpha/mm/fault.c 2007-07-16 12:55:15.0 -0700
+++ lx26-22-rc6-mm1a/arch/alpha/mm/fault.c  2007-07-16 13:10:48.0 -0700
@@ -192,7 +192,7 @@ do_page_fault(unsigned long address, uns
/* We ran out of memory, or some other thing

[PATCH 5/5] [V2] Move alloc_pid() to copy_process()

2007-07-19 Thread sukadev



Subject: [PATCH 5/5] Move alloc_pid call to copy_process

From: Sukadev Bhattiprolu [EMAIL PROTECTED]

Move alloc_pid() into copy_process(). This will keep all pid and pid
namespace code together and simplify error handling when we support
multiple pid namespaces.
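
The error-handling benefit can be sketched in user space as below. All
names ending in _sketch are hypothetical stand-ins, and the live_pids
counter exists only so the example can show that nothing leaks; the
point is that allocation and unwinding sit in one function, with each
failure jumping to the label that undoes what has been done so far.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's struct pid or task_struct. */
struct pid_sketch {
	int nr;
};

struct task_sketch {
	struct pid_sketch *pid;
};

static int live_pids;	/* outstanding pid allocations, for illustration */

static struct pid_sketch *alloc_pid_sketch(void)
{
	struct pid_sketch *pid = malloc(sizeof(*pid));

	if (pid) {
		pid->nr = 100;	/* arbitrary number for the sketch */
		live_pids++;
	}
	return pid;
}

static void free_pid_sketch(struct pid_sketch *pid)
{
	free(pid);
	live_pids--;
}

/*
 * fail_late simulates an error in a step that runs after the pid has
 * been allocated; the goto labels unwind in reverse order of setup.
 */
static struct task_sketch *copy_process_sketch(int fail_late)
{
	struct task_sketch *p = malloc(sizeof(*p));

	if (!p)
		return NULL;

	p->pid = alloc_pid_sketch();
	if (!p->pid)
		goto bad_fork_free_task;

	if (fail_late)
		goto bad_fork_free_pid;

	return p;

bad_fork_free_pid:
	free_pid_sketch(p->pid);
bad_fork_free_task:
	free(p);
	return NULL;
}
```

With this shape the caller no longer has to remember to free the pid
when the copy fails.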

Signed-off-by: Sukadev Bhattiprolu [EMAIL PROTECTED]

Cc: Pavel Emelianov [EMAIL PROTECTED]
Cc: Eric W. Biederman [EMAIL PROTECTED]
Cc: Cedric Le Goater [EMAIL PROTECTED]
Cc: Dave Hansen [EMAIL PROTECTED]
Cc: Serge Hallyn [EMAIL PROTECTED]
Cc: Herbert Poetzel [EMAIL PROTECTED]
---
 kernel/fork.c |   19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

Index: lx26-22-rc6-mm1a/kernel/fork.c
===
--- lx26-22-rc6-mm1a.orig/kernel/fork.c 2007-07-16 12:55:13.0 -0700
+++ lx26-22-rc6-mm1a/kernel/fork.c  2007-07-17 10:08:12.0 -0700
@@ -1029,6 +1029,12 @@ static struct task_struct *copy_process(
 	if (p->binfmt && !try_module_get(p->binfmt->module))
 		goto bad_fork_cleanup_put_domain;
 
+	if (pid != &init_struct_pid) {
+		pid = alloc_pid();
+		if (!pid)
+			goto bad_fork_put_binfmt_module;
+	}
+
 	p->did_exec = 0;
 	delayacct_tsk_init(p);	/* Must remain after dup_task_struct() */
 	copy_flags(clone_flags, p);
@@ -1316,6 +1322,9 @@ bad_fork_cleanup_container:
 #endif
 	container_exit(p, container_callbacks_done);
 	delayacct_tsk_free(p);
+	if (pid != &init_struct_pid)
+		free_pid(pid);
+bad_fork_put_binfmt_module:
 	if (p->binfmt)
 		module_put(p->binfmt->module);
 bad_fork_cleanup_put_domain:
@@ -1380,19 +1389,16 @@ long do_fork(unsigned long clone_flags,
 {
 	struct task_struct *p;
 	int trace = 0;
-	struct pid *pid = alloc_pid();
 	long nr;
 
-	if (!pid)
-		return -EAGAIN;
-	nr = pid->nr;
 	if (unlikely(current->ptrace)) {
 		trace = fork_traceflag (clone_flags);
 		if (trace)
 			clone_flags |= CLONE_PTRACE;
 	}
 
-	p = copy_process(clone_flags, stack_start, regs, stack_size, parent_tidptr, child_tidptr, pid);
+	p = copy_process(clone_flags, stack_start, regs, stack_size,
+			parent_tidptr, child_tidptr, NULL);
 	/*
 	 * Do this prior waking up the new thread - the thread pointer
 	 * might get invalid after that point, if the thread exits quickly.
@@ -1400,6 +1406,8 @@ long do_fork(unsigned long clone_flags,
 	if (!IS_ERR(p)) {
 		struct completion vfork;
 
+		nr = pid_nr(task_pid(p));
+
 		if (clone_flags & CLONE_VFORK) {
 			p->vfork_done = &vfork;
 			init_completion(&vfork);
@@ -1433,7 +1441,6 @@ long do_fork(unsigned long clone_flags,
 			}
 		}
 	} else {
-		free_pid(pid);
 		nr = PTR_ERR(p);
 	}
 	return nr;

Re: [patch] fix the softlockup watchdog to actually work

2007-07-19 Thread Andrew Morton

On Tue, 17 Jul 2007 17:49:34 +0200 Ingo Molnar [EMAIL PROTECTED] wrote:

> Subject: fix the softlockup watchdog to actually work
> From: Ingo Molnar [EMAIL PROTECTED]
>
> this Xen related commit:
>
>    commit 966812dc98e6a7fcdf759cbfa0efab77500a8868
>    Author: Jeremy Fitzhardinge [EMAIL PROTECTED]
>    Date:   Tue May 8 00:28:02 2007 -0700
>
>        Ignore stolen time in the softlockup watchdog
>
> broke the softlockup watchdog to never report any lockups. (!)
>
> print_timestamp defaults to 0, this makes the following condition
> always true:
>
>    if (print_timestamp < (touch_timestamp + 1) ||
>
> and we'll in essence never report soft lockups.
>
> apparently the functionality of the soft lockup watchdog was never
> actually tested with that patch applied ...
>
> [this is -stable material too.]
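
The "always true" claim in the quoted text can be checked with a small
user-space model, assuming the comparison in the quoted condition is
`<`. Both function names are hypothetical, and suppress_windowed() is
only one possible bounded form for contrast; it is not claimed to be the
condition the patch actually uses.

```c
#include <stdbool.h>

/* Model of the quoted check: a true result means "do not report". */
static bool suppress_old(unsigned long print_timestamp,
			 unsigned long touch_timestamp)
{
	return print_timestamp < (touch_timestamp + 1);
}

/* Hypothetical bounded variant: suppress only when a report was already
 * printed at or after the last touch, within the same second. */
static bool suppress_windowed(unsigned long print_timestamp,
			      unsigned long touch_timestamp)
{
	return print_timestamp >= touch_timestamp &&
	       print_timestamp < (touch_timestamp + 1);
}
```

With print_timestamp left at its default of 0, suppress_old() returns
true for every touch_timestamp tried, so no report is ever printed.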

This seems terribly sensitive.

Someone has broken the Vaio (shock, horror).  It now has mysterious
jerkiness: when leaning on autorepeat it stalls for maybe 0.25 seconds
every 1.5 seconds.  The stalls are far less than a second.  Yet this
is enough to trigger random softlockup warnings.

Some of those warnings are below.  Note that the traces are all pretty
useless, as softlockup warnings so often seem to be.

Of course, it could be that whatever is causing these pauses really _is_
stalling for a whole second occasionally, dunno.  But I didn't notice any
long stalls in the console output when a particular storm of softlockup
warnings came out.

But I'll sit on this patch for a while until this gets sorted out. 
Meanwhile, please double-check the elapsed-time arithmetic in there,
maybe do a bit of runtime testing?



[   78.820961] BUG: soft lockup detected on CPU#0!
[   78.821083]  [c0122475] update_process_times+0x32/0x54
[   78.821216]  [c012fe7a] tick_sched_timer+0x61/0x9c
[   78.821340]  [c012c2e7] hrtimer_interrupt+0x142/0x1d4
[   78.821463]  [c012fe19] tick_sched_timer+0x0/0x9c
[   78.821587]  [c012f74a] tick_do_broadcast+0x1f/0x3f
[   78.821707]  [c012fa01] tick_handle_oneshot_broadcast+0x47/0x72
[   78.821852]  [c01067ca] timer_interrupt+0x1a/0x20
[   78.821968]  [c014291e] handle_IRQ_event+0x1a/0x3f
[   78.822089]  [c0143521] handle_edge_irq+0x9d/0xcc
[   78.822206]  [c0105d7b] do_IRQ+0x53/0x6c
[   78.822307]  [c012f4f0] tick_notify+0x15c/0x208
[   78.822422]  [c01044cf] common_interrupt+0x23/0x28
[   78.822539]  [c012f1d4] clockevents_notify+0x8/0x36
[   78.822663]  [c020d199] acpi_processor_idle+0x1d2/0x36d
[   78.822798]  [c0102345] cpu_idle+0x44/0x5e
[   78.822900]  [c03baa8d] start_kernel+0x26d/0x275
[   78.823017]  [c03ba3fe] unknown_bootoption+0x0/0x202
[   78.823142]  ===
[  106.282830] BUG: soft lockup detected on CPU#0!
[  106.282967]  [c0122475] update_process_times+0x32/0x54
[  106.283116]  [c012fe7a] tick_sched_timer+0x61/0x9c
[  106.283255]  [c012c2e7] hrtimer_interrupt+0x142/0x1d4
[  106.283391]  [c012fe19] tick_sched_timer+0x0/0x9c
[  106.283530]  [c012f74a] tick_do_broadcast+0x1f/0x3f
[  106.283663]  [c012fa01] tick_handle_oneshot_broadcast+0x47/0x72
[  106.283821]  [c01067ca] timer_interrupt+0x1a/0x20
[  106.283949]  [c014291e] handle_IRQ_event+0x1a/0x3f
[  106.284084]  [c0143521] handle_edge_irq+0x9d/0xcc
[  106.284215]  [c0105d7b] do_IRQ+0x53/0x6c
[  106.284326]  [c012f4f0] tick_notify+0x15c/0x208
[  106.284455]  [c01044cf] common_interrupt+0x23/0x28
[  106.284587]  [c012f1d4] clockevents_notify+0x8/0x36
[  106.284725]  [c020d199] acpi_processor_idle+0x1d2/0x36d
[  106.284875]  [c0102345] cpu_idle+0x44/0x5e
[  106.284988]  [c03baa8d] start_kernel+0x26d/0x275
[  106.285117]  [c03ba3fe] unknown_bootoption+0x0/0x202
[  106.285257]  ===
[  109.266423] BUG: soft lockup detected on CPU#0!
[  109.266558]  [c0122475] update_process_times+0x32/0x54
[  109.266703]  [c012fe7a] tick_sched_timer+0x61/0x9c
[  109.270745]  [c012c2e7] hrtimer_interrupt+0x142/0x1d4
[  109.274790]  [c012fe19] tick_sched_timer+0x0/0x9c
[  109.278865]  [c012f74a] tick_do_broadcast+0x1f/0x3f
[  109.282950]  [c012fa01] tick_handle_oneshot_broadcast+0x47/0x72
[  109.287026]  [c01067ca] timer_interrupt+0x1a/0x20
[  109.291012]  [c014291e] handle_IRQ_event+0x1a/0x3f
[  109.294950]  [c0143521] handle_edge_irq+0x9d/0xcc
[  109.298864]  [c0105d7b] do_IRQ+0x53/0x6c
[  109.302818]  [c012f4f0] tick_notify+0x15c/0x208
[  109.306740]  [c01044cf] common_interrupt+0x23/0x28
[  109.310641]  [c012f1d4] clockevents_notify+0x8/0x36
[  109.314543]  [c020d199] acpi_processor_idle+0x1d2/0x36d
[  109.318461]  [c0102345] cpu_idle+0x44/0x5e
[  109.322348]  [c03baa8d] start_kernel+0x26d/0x275
[  109.326267]  [c03ba3fe] unknown_bootoption+0x0/0x202
[  109.330188]  ===

(ah, the Vaio breakage seems to be -mm-only, whew)


Re: cmpxchg is not available to generic code

2007-07-19 Thread David Miller

From: Andrew Morton [EMAIL PROTECTED]
Date: Thu, 19 Jul 2007 00:05:49 -0700

> What's that code doing anyway?  driver-private locking primitives?

It's an atomic lock shared with userspace.  Whatever implementation is
used to do the lock on that object must be identical in the userspace
DRM bits.

Unlike futex, the lock operation on the user side isn't optional.
So if the platform can't do a true cmpxchg it generally cannot
support DRM.
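
A rough user-space sketch of a lock word taken with compare-and-exchange,
using C11 atomics. LOCK_HELD_SKETCH, try_lock_sketch() and
unlock_sketch() are hypothetical names and the bit layout is an
assumption for illustration; this is not the DRM implementation. It only
shows why both sides must agree on one atomic primitive.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LOCK_HELD_SKETCH 0x80000000u	/* hypothetical "held" bit */

/* Try to take the lock for context ctx; fails if someone holds it. */
static bool try_lock_sketch(atomic_uint *lock, unsigned int ctx)
{
	unsigned int old = atomic_load(lock);

	if (old & LOCK_HELD_SKETCH)
		return false;
	/* Succeeds only if the word is still exactly what was read. */
	return atomic_compare_exchange_strong(lock, &old,
					      ctx | LOCK_HELD_SKETCH);
}

/* Release the lock; fails if ctx is not the current holder. */
static bool unlock_sketch(atomic_uint *lock, unsigned int ctx)
{
	unsigned int expected = ctx | LOCK_HELD_SKETCH;

	return atomic_compare_exchange_strong(lock, &expected, ctx);
}
```

If one side updated the word with a different mechanism (say, a
spinlock-protected store), the other side's compare-and-exchange could
no longer detect the race, which is why a platform without a true
cmpxchg cannot take part.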


[PATCH 2.6.22.y] fw-ohci: fix scheduling while atomic

2007-07-19 Thread Stefan Richter

Date: Thu, 12 Jul 2007 22:25:14 +0200 (CEST)
From: Stefan Richter [EMAIL PROTECTED]
Subject: firewire: fw-ohci: fix scheduling while atomic

context_stop is called by bus_reset_tasklet, among others, so it can
run in atomic context and must not sleep.

Signed-off-by: Stefan Richter [EMAIL PROTECTED]
---
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=8735.
Same as commit b980f5a224f3df6c884dbf5ae48797ce352ba139.

 drivers/firewire/fw-ohci.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.22/drivers/firewire/fw-ohci.c
===
--- linux-2.6.22.orig/drivers/firewire/fw-ohci.c
+++ linux-2.6.22/drivers/firewire/fw-ohci.c
@@ -586,7 +586,7 @@ static void context_stop(struct context 
 			break;
 
 		fw_notify("context_stop: still active (0x%08x)\n", reg);
-		msleep(1);
+		mdelay(1);
 	}
 }
 

-- 
Stefan Richter
-=-=-=== -=== =--==
http://arcgraph.de/sr/


[PATCH 2.6.22.y] firewire: fix memory leak of fw_request instances

2007-07-19 Thread Stefan Richter

Date: Tue, 17 Jul 2007 02:15:36 +0200 (CEST)
From: Stefan Richter [EMAIL PROTECTED]
Subject: firewire: fix memory leak of fw_request instances

Found and debugged by Jay Fenlason [EMAIL PROTECTED].
The bug was especially noticeable with direct I/O over fw-sbp2.
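
The ownership rule behind the fix can be sketched in user space as
follows. Every name ending in _sketch is a hypothetical stand-in, and
the outstanding counter exists only so the example can show that both
paths release the request.

```c
#include <stdbool.h>
#include <stdlib.h>

enum ack_sketch { ACK_COMPLETE_SKETCH, ACK_PENDING_SKETCH };

struct request_sketch {
	enum ack_sketch ack;
};

static int outstanding;	/* live requests, for illustration only */

static struct request_sketch *request_new(enum ack_sketch ack)
{
	struct request_sketch *req = malloc(sizeof(*req));

	if (req) {
		req->ack = ack;
		outstanding++;
	}
	return req;
}

static void request_free(struct request_sketch *req)
{
	free(req);
	outstanding--;
}

/* Takes ownership of req on every path, including the early return. */
static bool send_response_sketch(struct request_sketch *req)
{
	if (req->ack != ACK_PENDING_SKETCH) {
		request_free(req);	/* an early return without this leaks */
		return false;
	}
	/* A real implementation would transmit here and free on completion. */
	request_free(req);
	return true;
}
```

Once the function owns the request on all paths, a callback only has to
make sure it either hands the request over or frees it itself.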

Signed-off-by: Stefan Richter [EMAIL PROTECTED]
Signed-off-by: Kristian Høgsberg [EMAIL PROTECTED]
---
Same as commit 9c9bdf4d50730fd04b06077e22d7a83b585f26b5.

 drivers/firewire/fw-transaction.c |4 +++-
 drivers/firewire/fw-transaction.h |4 
 2 files changed, 7 insertions(+), 1 deletion(-)

Index: linux-2.6.22/drivers/firewire/fw-transaction.c
===
--- linux-2.6.22.orig/drivers/firewire/fw-transaction.c
+++ linux-2.6.22/drivers/firewire/fw-transaction.c
@@ -605,8 +605,10 @@ fw_send_response(struct fw_card *card, s
 	 * check is sufficient to ensure we don't send response to
 	 * broadcast packets or posted writes.
 	 */
-	if (request->ack != ACK_PENDING)
+	if (request->ack != ACK_PENDING) {
+		kfree(request);
 		return;
+	}
 
 	if (rcode == RCODE_COMPLETE)
 		fw_fill_response(&request->response, request->request_header,
Index: linux-2.6.22/drivers/firewire/fw-transaction.h
===
--- linux-2.6.22.orig/drivers/firewire/fw-transaction.h
+++ linux-2.6.22/drivers/firewire/fw-transaction.h
@@ -124,6 +124,10 @@ typedef void (*fw_transaction_callback_t
  size_t length,
  void *callback_data);
 
+/*
+ * Important note:  The callback must guarantee that either fw_send_response()
+ * or kfree() is called on the @request.
+ */
 typedef void (*fw_address_callback_t)(struct fw_card *card,
  struct fw_request *request,
  int tcode, int destination, int source,

-- 
Stefan Richter
-=-=-=== -=== =--==
http://arcgraph.de/sr/


Re: [PATCH 5/5] [V2] Move alloc_pid() to copy_process()

2007-07-19 Thread Pavel Emelyanov


[EMAIL PROTECTED] wrote:


> Subject: [PATCH 5/5] Move alloc_pid call to copy_process
>
> From: Sukadev Bhattiprolu [EMAIL PROTECTED]
>
> Move alloc_pid() into copy_process(). This will keep all pid and pid
> namespace code together and simplify error handling when we support
> multiple pid namespaces.

I would add something like this to the comment:

When a task creates a new pid namespace, its init (i.e. this task's
child) will have pids carrying extra info: the new numerical id
that represents the new task in the new namespace. Thus, we have
to allocate the new pid only after the namespace is created, so that
we know which namespace the pid will live in.

I hope I expressed my idea clearly.

Acked-by: Pavel Emelyanov [EMAIL PROTECTED]



Re: [patch] fix the softlockup watchdog to actually work

2007-07-19 Thread Ingo Molnar


* Andrew Morton [EMAIL PROTECTED] wrote:

> > [this is -stable material too.]
>
> This seems terribly sensitive.
>
> Someone has broken the Vaio (shock, horror).  It now has mysterious
> jerkiness: when leaning on autorepeat it stalls for maybe 0.25 seconds
> every 1.5 seconds.  The stalls are far less than a second.  Yet this
> is enough to trigger random softlockup warnings.
>
> Some of those warnings are below.  Note that the traces are all pretty
> useless, as softlockup warnings so often seem to be.

hm, you haven't picked up the other softlockup enhancements i did, which
make the warnings more useful.

Ingo

Re: [PATCH try #3] security: Convert LSM into a static interface

2007-07-19 Thread Christian Ehrhardt

On Wed, Jul 18, 2007 at 06:35:03PM -0700, Andrew Morton wrote:
> On Sat, 14 Jul 2007 12:37:01 -0400 (EDT)
> James Morris [EMAIL PROTECTED] wrote:
>
> > Convert LSM into a static interface, as the ability to unload a security
> > module is not required by in-tree users and potentially complicates the
> > overall security architecture.
> >
> > Needlessly exported LSM symbols have been unexported, to help reduce API
> > abuse.
> >
> > Parameters for the capability and root_plug modules are now specified
> > at boot.
> >
> > The SECURITY_FRAMEWORK_VERSION macro has also been removed.
>
> I'd like to understand who is (or claims to be) adversely affected by this
> change, and what their complaints (if any) will be.

I am currently loading and unloading a prototype-like security module
on a regular basis. The fact that such a module can be loaded and
unloaded (albeit in an insecure way) greatly simplifies development.
Thus this change will adversely affect me and probably also others who
develop LSMs.

Additionally, deployment of and choice among legitimate security modules
that may or may not (yet) be part of the main kernel tree is simplified by
an option to load these security modules (e.g. at boot time) into a running
kernel. This way a distribution can provide AppArmor, SELinux, SecLevl and
whatever as options, very much in the same way that this works for a driver.

> Because I prefer my flamewars pre- rather than post-merge.

You asked for opinions. I do not plan to engage in any flamewars.

regards Christian




Re: Documentation for sysfs, hotplug, and firmware loading.

2007-07-19 Thread Cornelia Huck

On Wed, 18 Jul 2007 13:39:53 -0400,
Rob Landley [EMAIL PROTECTED] wrote:

> Nope.  If you recurse down under /sys/class following symlinks, you go into
> an endless loop bouncing off of /sys/devices and getting pointed back.  If you
> don't follow symlinks, it works fine up until about 2.6.20 at which point
> things that were previously directories BECAME symlinks because the
> directories got moved, and it all broke.

I have no idea what you're doing.

> Which is why I want it documented where to look for these suckers.  Just give
> me ONE STABLE WAY TO FIND THIS INFORMATION, PLEASE.

See Documentation/sysfs-rules.txt.

> This document is trying to document just enough information to make hotplug
> work using sysfs (which includes firmware loading if necessary).
>
> > (And how about referring to Documentation/sysfs-rules.txt?)
>
> Because there isn't one in 2.6.22, and I've been writing this file on and off
> for a month as I tracked down various bits of information?

That was a _suggestion_.

> I know.  I'm just trying to show people how to do it.  Notice that this
> script doesn't DO anything, it just dumps the variables (and proves
> that /sys/hotplug got called).  You're worried about the scalability of a
> debugging script.

If you use bash scripts as examples, people will write bash scripts.

> (Rummage)  Seems to be add, remove, change, online, offline, move?
>
> I can list 'em.  Now I'm vaguely curious what generates online and offline
> events (MII transceiver state transitions on a network card, or does this
> have to do with power saving modes?)  And I have no idea what the difference
> between change and move is

change - something about the device has changed
move - the device is in a different position in the tree now

You may want to grep for the usage...

 
> > > DEVPATH
> > >   Path under /sys at which this device's sysfs directory can be found.
> > >   If $DEVPATH begins with /block/ the event refers to a block device,
> > >   otherwise it refers to a char device.
> >
> > Huh? That's just the path in sysfs. And there's more than block and
> > char :) Check SUBSYSTEM for what your device actually is.
>
> If you are doing mknod, you need three pieces of information:
> 1) Major, 2) Minor, 3) Block or Char device.  That's pretty much it.  If
> you're trying to populate /dev you need that info.
>
> > > SUBSYSTEM
> > >   If this is block, it's a block device.  Anything else is a char
> > >   device.
> >
> > No. For devices, SUBSYSTEM may be the class (like 'scsi_device') or the
> > bus (like 'pci').
>
> Do you make a /dev node for either one?
>
> I'm trying to, at minimum, document what you pass to mknod.  I consider it
> important to know.

The problem is that your information is wrong. Imagine someone reading
this document, thinking "cool, I'll create a char node if
SUBSYSTEM!=block" and subsequently getting completely confused about
all those SUBSYSTEM==pci events.
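
To make the pitfall concrete, here is a sketch of a safer decision rule.
It assumes that events for devices which have a device node carry MAJOR
and MINOR variables, and that events without them (such as a bare pci
device) need no node at all; node_kind_for_event() and the enum are
hypothetical names, and the three arguments stand for the values a
handler would read from its environment.

```c
#include <stddef.h>
#include <string.h>

enum node_kind { NODE_NONE, NODE_BLOCK, NODE_CHAR };

/*
 * Decide what kind of /dev node, if any, an event calls for.
 * A NULL major or minor means the variable was absent from the event.
 */
static enum node_kind node_kind_for_event(const char *subsystem,
					  const char *major,
					  const char *minor)
{
	/* No device number, no node: do not fall back to "char". */
	if (major == NULL || minor == NULL)
		return NODE_NONE;

	if (subsystem != NULL && strcmp(subsystem, "block") == 0)
		return NODE_BLOCK;

	return NODE_CHAR;
}
```

Under this rule a SUBSYSTEM==pci event with no device number yields no
node instead of a bogus char node.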

 
> > > DRIVER
> > >   If present, a suggested driver (module) for handling this device.  No
> > >   relation to whether or not a driver is currently handling the device.
> >
> > No, this actually is the current driver.
>
> I've had it suggest drivers for devices that didn't have any loaded, and I
> had it _not_ specify drivers for devices that were loaded.  (I checked.)

The code disagrees with you. If a driver matches and probing succeeds,
it will be specified, otherwise not. Maybe you were checking the wrong
devices?

> Ah yes.  I replied to that when it was first posted.  It's still "here's a
> list of things NOT to do" rather than telling you what you CAN do.  I'm
> trying to document what you can do.
>
> Useful documentation is not "Doing THIS is forbidden.  Doing THIS is
> forbidden.  Doing THIS is forbidden.  What are you allowed to do?  Guess!
> Oh, and anything I didn't explicitly mention could change at any time.  Have
> fun."

It _does_ specify what you may rely on. Don't rely on anything else.

 Sysfs CAN export a stable API.  It may only be a subset of what it's 
 exporting, but it can still do so.

And that is exactly what sysfs-rules.txt is doing. I don't understand
your problem.

If you think that getting this information from sysfs-rules.txt could
be made easier, do a patch against it.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/1] i386: Geode's TSC is not neccessary to mark tu unstable

2007-07-19 Thread Andi Kleen


 Wow, that's a really cool bug; nice work!  Don't forget to update
 arch/i386/kernel/cpu/mtrr/state.c, though; it uses setCx86() as well.  It
 needs to include processor-cyrix.h.

It also needs some big fat comments

-Andi

Re: cmpxchg is not available to generic code

2007-07-19 Thread Dave Airlie


On 7/19/07, Andrew Morton [EMAIL PROTECTED] wrote:

On Thu, 19 Jul 2007 18:15:03 +1000 Dave Airlie [EMAIL PROTECTED] wrote:

 Maybe we could add CONFIG_HAVE_CMPXCHG and let DRM depend on it..

That would certainly be better than adding a sprinkle of architectures
in DRM Kconfig dependencies.

I don't know how important DRM is on ARM.  Zero?



I'd guess zero. I suppose if you wanted you could hook up a PCI
graphics card on ARM, but if you do that I think you could implement
cmpxchg :-)

Dave.

Re: [PATCH] net/, drivers/net/ , missing EXPERIMENTAL in menus

2007-07-19 Thread Robert P. J. Day

On Thu, 19 Jul 2007, Adrian Bunk wrote:

...
 I would consider it more ugly to special case this and that in the
 kconfig code when plain dependencies already offer exactly the same
 functionality...

well, this is the *third* time i've proposed adding this kind of
feature so, at this point, i've really given up caring about it.  if
someone wants to do this, have at it.  i have better things to do than
to keep suggesting it and getting nowhere with it.

rday
--

Robert P. J. Day Linux Consulting, Training and Annoying Kernel
Pedantry Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page


Re: cmpxchg is not available to generic code

2007-07-19 Thread Andrew Morton

On Thu, 19 Jul 2007 09:02:03 +0100 (IST) Dave Airlie [EMAIL PROTECTED] wrote:

 
 
  arm:
 
  drivers/char/drm/drm_lock.c: In function `drm_lock_take':
  drivers/char/drm/drm_lock.c:221: error: implicit declaration of function 
  `cmpxchg'
 
  You might be able to use atomic_cmpxchg, which _is_ present
  on all architectures.  Or use a spinlock.
 
  What's that code doing anyway?  driver-private locking primitives?
 
 When did arm suddenly start wanting DRM?

It's selectable in config.  allmodconfig broke.

 they need to grow a userspace
 cmpxchg as davem mentioned to go along with this, changing the drm now
 isn't possible due to backwards compat..

For reference purposes, that position is not acceptable.  We _never_ accept the
"oh I can't change my proposed kernel interface because I already have
userspace relying on it" argument.

Hopefully that won't be an issue here.  I guess DRM now needs a
`depends on !ARM'.


Re: cmpxchg is not available to generic code

2007-07-19 Thread Andrew Morton

On Thu, 19 Jul 2007 18:15:03 +1000 Dave Airlie [EMAIL PROTECTED] wrote:

 Maybe we could add CONFIG_HAVE_CMPXCHG and let DRM depend on it..

That would certainly be better than adding a sprinkle of architectures
in DRM Kconfig dependencies.

I don't know how important DRM is on ARM.  Zero?

Re: [NFS] [PATCH 4/5] knfsd: move EX_RDONLY out of header

2007-07-19 Thread Christoph Hellwig

On Wed, Jul 18, 2007 at 06:57:29PM -0400, J. Bruce Fields wrote:
 From: J. Bruce Fields [EMAIL PROTECTED]
 
 EX_RDONLY is only called in one place; just put it there.
 
 Signed-off-by: J. Bruce Fields [EMAIL PROTECTED]
 ---
  fs/nfsd/vfs.c   |   12 
  include/linux/nfsd/export.h |   12 
  2 files changed, 12 insertions(+), 12 deletions(-)
 
 diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
 index 5c97d0e..f2684e5 100644
 --- a/fs/nfsd/vfs.c
 +++ b/fs/nfsd/vfs.c
 @@ -1797,6 +1797,18 @@ nfsd_statfs(struct svc_rqst *rqstp, struct svc_fh 
 *fhp, struct kstatfs *stat)
   return err;
  }
  
 +static inline int EX_RDONLY(struct svc_export *exp, struct svc_rqst *rqstp)
 +{
 + struct exp_flavor_info *f;
 + struct exp_flavor_info *end = exp->ex_flavors + exp->ex_nflavors;
 +
 + for (f = exp->ex_flavors; f < end; f++) {
 + if (f->pseudoflavor == rqstp->rq_flavor)
 + return f->flags & NFSEXP_READONLY;
 + }
 + return exp->ex_flags & NFSEXP_READONLY;
 +}

As mentioned last time, please remove the inline qualifier and give it a
lower-case name.


Re: [NFS] [PATCH 5/5] knfsd: clean up EX_RDONLY

2007-07-19 Thread Christoph Hellwig

On Wed, Jul 18, 2007 at 06:57:30PM -0400, J. Bruce Fields wrote:
 From: J. Bruce Fields [EMAIL PROTECTED]
 
 Share a little common code, reverse the arguments for consistency, drop
 the unnecessary "inline", and lowercase the name.

Ah, sorry - didn't notice this was a separate patch.

 @@ -1845,7 +1838,7 @@ nfsd_permission(struct svc_rqst *rqstp, struct 
 svc_export *exp,
*/
   if (!(acc & MAY_LOCAL_ACCESS))
   if (acc & (MAY_WRITE | MAY_SATTR | MAY_TRUNC)) {
 - if (EX_RDONLY(exp, rqstp) || IS_RDONLY(inode))
 + if (exp_rdonly(rqstp, exp) || IS_RDONLY(inode))

In fact, with just a single caller left and reduced to a one-liner, we
could kill this function completely..
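
For readers following the series, a self-contained sketch of the lookup under discussion may help. It combines the loop quoted in patch 4/5 with the lower-case name and (rqstp, exp) argument order from patch 5/5. The shared helper that 5/5 factors out is not shown in this thread, so the loop is written out in full here. The struct definitions are cut-down stand-ins rather than the kernel's, and the value given to NFSEXP_READONLY and the array size are placeholders for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NFSEXP_READONLY 0x0001	/* placeholder value for this sketch */

/* Cut-down stand-ins for the kernel structures: only the fields used here. */
struct exp_flavor_info {
	unsigned int pseudoflavor;
	unsigned int flags;
};

struct svc_export {
	unsigned int ex_flags;
	int ex_nflavors;
	struct exp_flavor_info ex_flavors[8];	/* size is arbitrary here */
};

struct svc_rqst {
	unsigned int rq_flavor;
};

/*
 * Non-zero if the export is read-only for this request: a per-flavor
 * entry matching the request's security flavor wins, otherwise the
 * export-wide flags decide.
 */
int exp_rdonly(const struct svc_rqst *rqstp, const struct svc_export *exp)
{
	const struct exp_flavor_info *f;
	const struct exp_flavor_info *end;

	assert(rqstp != NULL && exp != NULL);
	end = exp->ex_flavors + exp->ex_nflavors;

	for (f = exp->ex_flavors; f < end; f++) {
		if (f->pseudoflavor == rqstp->rq_flavor)
			return f->flags & NFSEXP_READONLY;
	}
	return exp->ex_flags & NFSEXP_READONLY;
}
```

The caller in nfsd_permission() then reads as in the hunk above: exp_rdonly(rqstp, exp) || IS_RDONLY(inode).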

Re: cmpxchg is not available to generic code

2007-07-19 Thread Andrew Morton

On Thu, 19 Jul 2007 09:19:10 +0100 (IST) Dave Airlie [EMAIL PROTECTED] wrote:

 
  they need to grow a userspace
  cmpxchg as davem mentioned to go along with this, changing the drm now
  isn't possible due to backwards compat..
 
  For reference purposes, that position is not acceptable.  We _never_ accept
  the "oh I can't change my proposed kernel interface because I already have
  userspace relying on it" argument.
 
 Yes it is, I thought breaking userspace API was the worst crime in kernel 
 history, (unless you are sysfs...) the userspace DRM has been around since 
 2.2 days at least, so there are lots of legacy userspaces to break..

oh, sorry, I thought this cmpxchg stuff was newly-added.

I guess something changed which has now made DRM available under arm
allmodconfig.



Re: [NFS] [PATCH 4/5] knfsd: move EX_RDONLY out of header

2007-07-19 Thread Andrew Morton

On Thu, 19 Jul 2007 09:28:38 +0100 Christoph Hellwig [EMAIL PROTECTED] wrote:

 On Wed, Jul 18, 2007 at 06:57:29PM -0400, J. Bruce Fields wrote:
  From: J. Bruce Fields [EMAIL PROTECTED]
  
  EX_RDONLY is only called in one place; just put it there.
  
  Signed-off-by: J. Bruce Fields [EMAIL PROTECTED]
  ---
   fs/nfsd/vfs.c   |   12 
   include/linux/nfsd/export.h |   12 
   2 files changed, 12 insertions(+), 12 deletions(-)
  
  diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
  index 5c97d0e..f2684e5 100644
  --- a/fs/nfsd/vfs.c
  +++ b/fs/nfsd/vfs.c
  @@ -1797,6 +1797,18 @@ nfsd_statfs(struct svc_rqst *rqstp, struct svc_fh 
  *fhp, struct kstatfs *stat)
  return err;
   }
   
  +static inline int EX_RDONLY(struct svc_export *exp, struct svc_rqst *rqstp)
  +{
  +   struct exp_flavor_info *f;
  +   struct exp_flavor_info *end = exp->ex_flavors + exp->ex_nflavors;
  +
  +   for (f = exp->ex_flavors; f < end; f++) {
  +   if (f->pseudoflavor == rqstp->rq_flavor)
  +   return f->flags & NFSEXP_READONLY;
  +   }
  +   return exp->ex_flags & NFSEXP_READONLY;
  +}
 
 As mentioned last time, please remove the inline qualifier and give it a
 lower-case name.

that's the next patch in the series.

Re: [RFT][PATCH v7] sata_mv: convert to new EH

2007-07-19 Thread Pasi Kärkkäinen

On Wed, Jul 18, 2007 at 09:40:33AM -0700, dean gaudet wrote:
 On Wed, 18 Jul 2007, Pasi Kärkkäinen wrote:
 
  What brand/model your sata_mv controller is? Would be nice to know to be
  able to get a known-to-work one.. 
 
 http://supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
 

Thanks! In fact I was thinking of exactly this model :-)

-- Pasi

Re: sysfs root link count broken in 2.6.22-git5

2007-07-19 Thread Jean Delvare

Hi Kay,

On Thu, 19 Jul 2007 02:44:54 +0200, Kay Sievers wrote:
 On 7/18/07, Jean Delvare [EMAIL PROTECTED] wrote:
  On Tue, 17 Jul 2007 20:38:28 -0700, Greg KH wrote:
   On Tue, Jul 17, 2007 at 11:05:30PM +0200, Jean Delvare wrote:
The code looks like:
   
   if (sysfs_get_mnt_path(sensors_sysfs_mount, NAME_MAX)
 || stat(sensors_sysfs_mount, &statbuf) < 0
 || statbuf.st_nlink <= 2)  /* Empty directory */
return 0;   /* Failure */
   
This works OK with 2.6.22.1, but the last test fails with the current
git kernel even when sysfs is mounted.
  
   Yeah, but is checking the number of hard links in the directory a safe
   way to always verify that it isn't empty?
 
  I think so, yes. To the best of my knowledge, it has worked on all
  Unix-like systems for decades. There are other ways, but this is by far
  the least expensive.
 
 Well, just check if /sys/devices/ exists, that should be cheap enough. :)

Yes, this is a possibility, and one I had considered at first. But I
wasn't sure which subdirectory to check. sysfs isn't well known for its
stability, and I didn't know which directories exist since the
early days of sysfs, and which do not. For example, fs, kernel and
module were not present in 2.6.5. I am also not sure if directories
which exist today are guaranteed to exist forever. This is the reason
why I decided to check the link count instead, basically checking that
at least one subdirectory exists, without having to name it.
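
For reference, the link-count test can be isolated as the small sketch below. It relies on the traditional Unix convention that a directory's link count is 2 plus its number of subdirectories. That convention is an assumption about the filesystem, not a guarantee: some filesystems do not maintain directory link counts this way, which is the risk raised earlier in the thread. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * True if path is a directory whose link count says it has at least one
 * subdirectory (st_nlink > 2 under the traditional convention).  Returns
 * false if stat() fails, if path is not a directory, or if the directory
 * looks empty of subdirectories.
 */
bool dir_has_subdirs(const char *path)
{
	struct stat statbuf;

	assert(path != NULL);

	if (stat(path, &statbuf) < 0 || !S_ISDIR(statbuf.st_mode))
		return false;

	return statbuf.st_nlink > 2;
}
```

The alternative suggested above, checking for a named directory such as /sys/devices, trades this filesystem assumption for an assumption about which directory names stay stable.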

-- 
Jean Delvare

Re: [PATCH] Check for compound pages in set_page_dirty()

2007-07-19 Thread Jens Axboe

On Thu, Jul 19 2007, Jens Axboe wrote:
 On Wed, Jul 18 2007, Hugh Dickins wrote:
  On Wed, 18 Jul 2007, Jens Axboe wrote:
   
   Since I had my hands dirty already...
  
  Great, thanks.  (There's also such a test in fs/nfs/direct.c,
  but let's not trouble Trond until we've settled what to do here.)
  
   
   ---
   
   [PATCH] Remove PageCompound() checks before calling set_page_dirty()
   
   Pre commit 41d78ba55037468e6c86c53e3076d1a74841de39 it was illegal
   to call set_page_dirty() on a compound page, since it stored the
   destructor in the mapping field. But now it's ok, so remove the
   ugly PageCompound() checks from bio and direct-io.
   
   Signed-off-by: Jens Axboe [EMAIL PROTECTED]
  
  I was about to Ack that, now that I've found something or other in the
  libhugetlb testsuite comes this way, even on page[1], without showing
  any problem.
  
  However, I have noticed a particular inefficiency arising: that
  bio_check_pages_dirty test specifically avoids pages already
  PageDirty; but hugetlbfs_set_page_dirty carefully redirects to
  set the head page dirty: so tail pages of a hugetlb compound page
  will tend never to be PageDirty, and keep on coming back this way.
  
  Which led me to look up the origin of those PageCompound tests:
  Author: Andrew Morton [EMAIL PROTECTED]
  Date:   Sun Sep 21 01:42:22 2003 -0700
  
  [PATCH] Speed up direct-io hugetlbpage handling
  
  This patch short-circuits all the direct-io page dirtying logic for
  higher-order pages.  Without this, we pointlessly bounce BIOs up to
  keventd all the time.
  
  diff --git a/fs/bio.c b/fs/bio.c
  index d016523..2463163 100644
  --- a/fs/bio.c
  +++ b/fs/bio.c
  @@ -532,6 +532,12 @@ void bio_unmap_user(struct bio *bio, int write_to_vm)
* check that the pages are still dirty.   If so, fine.  If not, redirty them
* in process context.
*
  + * We special-case compound pages here: normally this means reads into hugetlb
  + * pages.  The logic in here doesn't really work right for compound pages
  + * because the VM does not uniformly chase down the head page in all cases.
  + * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
  + * handle them at all.  So we skip compound pages here at an early stage.
  ...
  
  It looks like I was wrong in thinking it was just trying to avoid 
  the crash on page[1].mapping.  At the least, your patch needs also
  to remove that paragraph of comment from Andrew.  But really, it
  looks like those PageCompound tests should stay, unless you can
  persuade Andrew to Ack their removal.
  
  Except (now, how many times can I change my mind in the course of
  one email?), hugetlbfs_set_page_dirty was specifically added by
  Ken Chen to avoid losing data via /proc/sys/vm/drop_caches.  Yet
  fs/bio.c is carefully avoiding going there when dirtying a hugepage.
  How does this work?  Looks like those PageCompound tests need to go!
 
 Hehe, that didn't really get us much further, did it? :-)
 
 My opinion is that since the win is marginal at best, we want to remove
 such tests as it just clutters up the code. And it's definitely not
 obvious why the tests are there, since they are not commented at all.
 Since it's even confusing you, then we can't expect the more vm ignorant
 of us (which definitely includes me) to grasp it!
 
  I'm lost: I hope Andrew and Ken can sort it out for us.
 
 Posting a revised version, still leaving nfs out of it (I'll ping Trond
 to do the same, if this goes in).

FWIW, I ran a hugepage+O_DIRECT read test, that will cause the direct-io
code to hit set_page_dirty() for a compound page - and it works fine for
me. The fio job file used was:

[global]
directory=/data1
bs=4k
direct=1
hugepage-size=4m
iomem=shmhuge

[iothread]
filename=testfile
size=128m
rw=read

Test box is a lowly x86.

-- 
Jens Axboe


Re: cmpxchg is not available to generic code

2007-07-19 Thread Dave Airlie

On 7/19/07, David Miller [EMAIL PROTECTED] wrote:

From: Andrew Morton [EMAIL PROTECTED]
Date: Thu, 19 Jul 2007 00:05:49 -0700

 What's that code doing anyway?  driver-private locking primitives?

It's an atomic lock shared with userspace.  Whatever implementation is
used to do the lock on that object must be identical in the userspace
DRM bits.

Unlike futex, the lock operation on the user side isn't optional.
So if the platform can't do a true cmpxchg it generally cannot
support DRM.

Actually, in theory the userspace side is optional; it should fall back
to always entering the kernel and being slow, but I've no idea how
well that codepath is tested... but it's an area I'd hate to play with
now ..

Maybe we could add CONFIG_HAVE_CMPXCHG and let DRM depend on it..
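
To illustrate why a true compare-and-swap matters for a lock word shared with userspace, here is a minimal sketch using C11 atomics. This is not the DRM code: it swaps the kernel's cmpxchg() for atomic_compare_exchange_strong(), and the "held" bit, the lock-word layout and the function names are all hypothetical, chosen only for the illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_LOCK_HELD 0x80000000u	/* hypothetical "held" bit */

/*
 * Try once to take the lock for `context`.  The compare-and-swap makes
 * the "was it free?" check and the "now it is mine" store one atomic
 * step, so both sides of a shared mapping can run the same sequence.
 */
bool sketch_lock_try_take(atomic_uint *lock, unsigned int context)
{
	unsigned int expected;

	assert(lock != NULL);

	expected = atomic_load(lock);
	if (expected & SKETCH_LOCK_HELD)
		return false;	/* someone else holds it */

	return atomic_compare_exchange_strong(lock, &expected,
					      context | SKETCH_LOCK_HELD);
}

/* Release the lock, but only if `context` is the current holder. */
void sketch_lock_release(atomic_uint *lock, unsigned int context)
{
	unsigned int expected = context | SKETCH_LOCK_HELD;

	assert(lock != NULL);
	atomic_compare_exchange_strong(lock, &expected, 0u);
}
```

On a platform without a real compare-and-swap, the check and the store cannot be made atomic against the other side of the mapping, which is the situation a CONFIG_HAVE_CMPXCHG dependency would exclude.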

Dave.
