date:20070418

[PATCH -mm] workqueue: debug possible lockups in flush_workqueue

2007-04-18 Thread Jarek Poplawski

Hi,

Here is my patch proposal for detecting possible lockups
when the flush_workqueue() caller holds a lock (e.g. rtnl_lock)
that is also used in work functions.

Regards,
Jarek P.

Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nurp 2.6.21-rc6-mm1-/kernel/workqueue.c 2.6.21-rc6-mm1/kernel/workqueue.c
--- 2.6.21-rc6-mm1-/kernel/workqueue.c  2007-04-18 20:07:45.0 +0200
+++ 2.6.21-rc6-mm1/kernel/workqueue.c   2007-04-18 21:29:50.0 +0200
@@ -67,6 +67,12 @@ struct workqueue_struct {
 /* All the per-cpu workqueues on the system, for hotplug cpu to add/remove
threads to each one as cpus come/go. */
 static DEFINE_MUTEX(workqueue_mutex);
+
+#ifdef CONFIG_PROVE_LOCKING
+/* Detect possible flush_workqueue() lockup with circular dependency check. */
+static struct lockdep_map flush_dep_map = { .name = "flush_dep_map" };
+#endif
+
 static LIST_HEAD(workqueues);
 
 static int singlethread_cpu __read_mostly;
@@ -247,8 +253,15 @@ static void run_workqueue(struct cpu_wor
 
BUG_ON(get_wq_data(work) != cwq);
work_clear_pending(work);
+#ifdef CONFIG_PROVE_LOCKING
+   /* lockdep dependency: flush_dep_map (read) before any lock: */
+   lock_acquire(&flush_dep_map, 0, 0, 1, 2, _THIS_IP_);
+#endif
f(work);
 
+#ifdef CONFIG_PROVE_LOCKING
+   lock_release(&flush_dep_map, 1, _THIS_IP_);
+#endif
if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
printk(KERN_ERR "BUG: workqueue leaked lock or atomic: "
"%s/0x%08x/%d\n",
@@ -389,6 +402,14 @@ void fastcall flush_workqueue(struct wor
int cpu;
 
might_sleep();
+#ifdef CONFIG_PROVE_LOCKING
+   /*
+* Add lockdep dependency: flush_dep_map (exclusive)
+* after any held mutex or rwsem.
+*/
+   lock_acquire(&flush_dep_map, 0, 0, 0, 2, _THIS_IP_);
+   lock_release(&flush_dep_map, 1, _THIS_IP_);
+#endif
for_each_cpu_mask(cpu, *cpu_map)
flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
 }
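
As a rough illustration of the dependency check this patch relies on, here
is a userspace sketch. It is not lockdep and not part of the patch: it
assumes a toy graph of at most MAX_LOCKS locks, and the names dep_graph,
dep_add and dep_reachable are hypothetical. An edge a -> b means "b was
acquired while a was held". The work function path records
flush_dep_map -> lock, the flush_workqueue() path records
held lock -> flush_dep_map, and an edge that would close a cycle is
rejected as a possible lockup.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* dep[a][b] is true when lock b was acquired while lock a was held. */
struct dep_graph {
	bool dep[MAX_LOCKS][MAX_LOCKS];
};

static void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool dep_reachable(const struct dep_graph *g, int from, int to,
			  bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++)
		if (g->dep[from][next] && !seen[next] &&
		    dep_reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired was taken while held was held".
 * Returns false, without recording, if the new edge would close a cycle.
 */
static bool dep_add(struct dep_graph *g, int held, int acquired)
{
	bool seen[MAX_LOCKS] = { false };

	if (dep_reachable(g, acquired, held, seen))
		return false;
	g->dep[held][acquired] = true;
	return true;
}
```

With flush_dep_map as node 0 and rtnl_lock as node 1, dep_add(&g, 0, 1)
succeeds (a work function takes rtnl_lock), and a later dep_add(&g, 1, 0)
returns false (flush_workqueue() called with rtnl_lock held).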
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/8] RSS controller based on process containers (v2)

2007-04-18 Thread Vaidyanathan Srinivasan



Pavel Emelianov wrote:
> Peter Zijlstra wrote:
>> *ugh* /me no like.
>>
>> The basic premise seems to be that we can track page owners perfectly
>> (although this patch set does not yet do so), through get/release
> 
> It looks like you have examined the patches not very carefully
> before concluding this. These patches DO track page owners.
> 
> I know that a page may be shared among several containers and
> thus have many owners so we should track all of them. This is
> exactly what we decided not to do half-a-year ago.
> 
> Page sharing accounting is performed in OpenVZ beancounters, and
> this functionality will be pushed to mainline after this simple
> container.
> 
>> operations (on _mapcount).
>>
>> This is simply not true for unmapped pagecache pages. Those receive no
>> 'release' event; (the usage by find_get_page() could be seen as 'get').
> 
> These patches concern the mapped pagecache only. Unmapped pagecache
> control is out of the scope of it since we do not want one container
> to track all the resources.

Unmapped pagecache control and swapcache control are part of an
independent pagecache controller that is being developed.  The initial
version was posted at http://lkml.org/lkml/2007/3/06/51
I plan to post a new version based on this patchset in a couple of days.

--Vaidy

>> Also, you don't seem to balance the active/inactive scanning on a per
>> container basis. This skews the per container working set logic.
> 
> This is not true. Balbir sent a patch to the first version of this
> container that added active/inactive balancing to the container.
> I have included this (a bit reworked) patch into this version and
> pointed this fact in the zeroth letter.
> 

 [snip]


Re: [PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-18 Thread Abhijit Bhopatkar


> > I just wanted to know whether it's worth going forward or we have
> > better reasons to discount any such direction?
>
> The reason that the wrong pages get swapped out sometimes
> could be due to a side effect of the way the swappiness
> policy is implemented.
>
> While the VM only reclaims page cache pages, it will still
> rotate through the anonymous pages on the LRU list, which
> effectively randomizes the order of those pages on the list.

In my mind I find it fundamentally wrong to separate anon pages from
page cache. It should rather depend a lot more on which task
accessed them last. Although it seems that, due to some twisted
relationship between anon pages and interactive tasks, separating
them improves it. Am I missing something here?

> I need to get back to benchmarking my patch to split the
> lists - anonymous and other swap backed pages on one set
> of pageout lists, filesystem backed pages on another list.
>
> Unfortunately my main desktop system at home depends on
> Xen, so it's not as easy to use that patch there :(

Can you send me those patches please, or point me to where I can find them?

Abhijit

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Andrew Morton

On Thu, 19 Apr 2007 05:18:07 +0200 Nick Piggin <[EMAIL PROTECTED]> wrote:

> And yes, by fairly, I mean fairly among all threads as a base resource
> class, because that's what Linux has always done

Yes, there are potential compatibility problems.  Example: a machine with
100 busy httpd processes and suddenly a big gzip starts up from console or
cron.

Under current kernels, that gzip will take ages and the httpds will take a
1% slowdown, which may well be exactly the behaviour which is desired.

If we were to schedule by UID then the gzip suddenly gets 50% of the CPU
and those httpd's all take a 50% hit, which could be quite serious.

That's simple to fix via nicing, but people have to know to do that, and
there will be a transition period where some disruption is possible.
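
The arithmetic behind this example can be sketched as follows. This is only
an illustration under stated assumptions: every task is CPU-bound and at the
same nice level, the 100 httpd processes share one UID, and gzip runs under
a different UID. The function names are hypothetical.

```c
#include <assert.h>

/* Percent of CPU one task gets when all runnable tasks are weighted equally. */
static double per_task_share(int nr_tasks)
{
	return 100.0 / nr_tasks;
}

/*
 * Percent of CPU one task gets when the CPU is first split equally
 * between users and then equally between that user's tasks.
 */
static double per_uid_share(int nr_users, int nr_tasks_of_user)
{
	return 100.0 / nr_users / nr_tasks_of_user;
}
```

per_task_share(101) is about 0.99, so each httpd drops from 1% to roughly
0.99% of the CPU when gzip starts. per_uid_share(2, 1) is 50 for gzip and
per_uid_share(2, 100) is 0.5 for each httpd, half of what it had before.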


Re: [PATCH][BUG] Fix possible NULL pointer access in 8250 serial driver

2007-04-18 Thread Andrew Morton

On Thu, 19 Apr 2007 11:28:37 +0900 izumi <[EMAIL PROTECTED]> wrote:

> Russell King wrote:
> 
> > NAK.  This means that you change the list of ports available on the
> > machine to be limited to only those which are currently open.  Utterly
> > useless for debugging, where you normally want people to dump the
> > contents of /proc/tty/driver/*.
> > 
> > The original patch was better.
> > 
> 
>Is the original patch sufficient?  or is there anything we should 
> correct?
> 

Would it be better to do something like

--- a/drivers/serial/serial_core.c~a
+++ a/drivers/serial/serial_core.c
@@ -1686,9 +1686,12 @@ static int uart_line_info(char *buf, str
pm_state = state->pm_state;
if (pm_state)
uart_change_pm(state, 0);
-   spin_lock_irq(&port->lock);
-   status = port->ops->get_mctrl(port);
-   spin_unlock_irq(&port->lock);
+   status = 0;
+   if (port->info) {
+       spin_lock_irq(&port->lock);
+       status = port->ops->get_mctrl(port);
+       spin_unlock_irq(&port->lock);
+   }
if (pm_state)
uart_change_pm(state, pm_state);
mutex_unlock(&state->mutex);
_

so that a) we treat all uart types in the same way and b) the same problem
doesn't occur later with some other driver which is assuming an opened
device in its ->get_mctrl() handler?


Re: [RFC 1/2] Input: ff, add FF_RAW effect

2007-04-18 Thread Dmitry Torokhov

Hi,

On Thursday 19 April 2007 00:25, johann deneux wrote:
> On 4/18/07, Jiri Slaby <[EMAIL PROTECTED]> wrote:
> > johann deneux wrote:
> > > Jiri,
> > >
> > > Which solution did you choose to implement? From what I remember, we
> > > last discussed Dmitry's idea of specifying an axis for an effect, then
> > > combine several effects to achieve complex effects.
> >
> > I think you mean motor instead of axis, because I don't push real axes to
> > the devices, but motor's torques...
> >
> 
> Yes, sorry, I meant motor.
> 

I have been thinking about this and I don't think that exporting motor
data is a good idea, at least not in the case of the Phantom driver. The
fact that there are 3 motors is a hardware implementation detail and it
is not interesting for a general application.

My understanding is that the end result of controlling these 3 motors
is a force vector (I don't know if there is such an English term, this
is a literal translation from Russian) applied to the user's hand.
If we are interested in using the FF API we need to come up with a way
to express this effect without exposing implementation details of
one particular device.

-- 
Dmitry

Re: [PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-18 Thread Rik van Riel


Abhijit Bhopatkar wrote:

> I just wanted to know whether it's worth going forward or we have
> better reasons to discount any such direction?


The reason that the wrong pages get swapped out sometimes
could be due to a side effect of the way the swappiness
policy is implemented.

While the VM only reclaims page cache pages, it will still
rotate through the anonymous pages on the LRU list, which
effectively randomizes the order of those pages on the list.

I need to get back to benchmarking my patch to split the
lists - anonymous and other swap backed pages on one set
of pageout lists, filesystem backed pages on another list.

One report I got was that the system is more interactive
under very heavy load, and my desktop system at the office
seems to behave better than it used to when I get back to
it after a few days.

Unfortunately my main desktop system at home depends on
Xen, so it's not as easy to use that patch there :(

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.

Success! Was: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues

2007-04-18 Thread Florin Iucha

On Wed, Apr 18, 2007 at 10:45:13PM -0400, Trond Myklebust wrote:
> On Wed, 2007-04-18 at 20:52 -0500, Florin Iucha wrote:
> > It seems that my original problem report had a big mistake!  There is
> > no hang, but at some point the write slows down to a trickle (from
> > 40,000 blocks/s to 22 blocks/s) as can be seen from the iostat log.
> 
> Yeah. You only captured the outgoing traffic to the server, but already
> it looks as if there were 'interesting' things going on. In frames 29346
> to 29350, the traffic stops altogether for 5 seconds (I only see
> keepalives) then it starts up again. Ditto for frames 40477-40482
> (another 5 seconds). ...
> Then at around frame 92072, the client starts to send a bunch of RSTs.
> Aha, I'll bet that reverting the appended patch fixes the problem.

You win!

Reverting this patch (on top of your previous 5) allowed the big copy
to complete (70GB) as well as successful log-in to gnome!

Acked-By: Florin Iucha <[EMAIL PROTECTED]>

Thanks so much for the patience with this elusive bug and stubborn
bugreporter!

Regards,
florin

> ---
> commit 43d78ef2ba5bec26d0315859e8324bfc0be23766
> Author: Chuck Lever <[EMAIL PROTECTED]>
> Date:   Tue Feb 6 18:26:11 2007 -0500
> 
> NFS: disconnect before retrying NFSv4 requests over TCP
> 
> RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
> twice on the same connection unless it is the NULL procedure.  Section
> 3.1.1 suggests that the client should disconnect and reconnect if it
> wants to retry a request.
> 
> Implement this by adding an rpc_clnt flag that an ULP can use to
> specify that the underlying transport should be disconnected on a
> major timeout.  The NFSv4 client asserts this new flag, and requests
> no retries after a minor retransmit timeout.
> 
> Note that disconnecting on a retransmit is in general not safe to do
> if the RPC client does not reuse the TCP port number when reconnecting.
> 
> See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6
> 
> Signed-off-by: Chuck Lever <[EMAIL PROTECTED]>
> Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
> 
> diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> index a3191f0..c46e94f 100644
> --- a/fs/nfs/client.c
> +++ b/fs/nfs/client.c
> @@ -394,7 +394,8 @@ static void nfs_init_timeout_values(struct rpc_timeout 
> *to, int proto,
>  static int nfs_create_rpc_client(struct nfs_client *clp, int proto,
>   unsigned int timeo,
>   unsigned int retrans,
> - rpc_authflavor_t flavor)
> + rpc_authflavor_t flavor,
> + int flags)
>  {
>   struct rpc_timeout  timeparms;
>   struct rpc_clnt *clnt = NULL;
> @@ -407,6 +408,7 @@ static int nfs_create_rpc_client(struct nfs_client *clp, 
> int proto,
>   .program= &nfs_program,
>   .version= clp->rpc_ops->version,
>   .authflavor = flavor,
> + .flags  = flags,
>   };
>  
>   if (!IS_ERR(clp->cl_rpcclient))
> @@ -548,7 +550,7 @@ static int nfs_init_client(struct nfs_client *clp, const 
> struct nfs_mount_data *
>* - RFC 2623, sec 2.3.2
>*/
>   error = nfs_create_rpc_client(clp, proto, data->timeo, data->retrans,
> - RPC_AUTH_UNIX);
> + RPC_AUTH_UNIX, 0);
>   if (error < 0)
>   goto error;
>   nfs_mark_client_ready(clp, NFS_CS_READY);
> @@ -868,7 +870,8 @@ static int nfs4_init_client(struct nfs_client *clp,
>   /* Check NFS protocol revision and initialize RPC op vector */
>   clp->rpc_ops = &nfs_v4_clientops;
>  
> - error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour);
> + error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour,
> + RPC_CLNT_CREATE_DISCRTRY);
>   if (error < 0)
>   goto error;
>   memcpy(clp->cl_ipaddr, ip_addr, sizeof(clp->cl_ipaddr));
> diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
> index a1be89d..c7a78ee 100644
> --- a/include/linux/sunrpc/clnt.h
> +++ b/include/linux/sunrpc/clnt.h
> @@ -40,6 +40,7 @@ struct rpc_clnt {
>  
>   unsigned intcl_softrtry : 1,/* soft timeouts */
>   cl_intr : 1,/* interruptible */
> + cl_discrtry : 1,/* disconnect before retry */
>   cl_autobind : 1,/* use getport() */
>   cl_oneshot  : 1,/* dispose after use */
>   cl_dead : 1;/* abandoned */
> @@ -111,6 +112,7 @@ struct rpc_create_args {
>  #define RPC_CLNT_CREATE_ONESHOT  (1UL << 3)
>  #define

Re: [RFC 1/2] Input: ff, add FF_RAW effect

2007-04-18 Thread johann deneux


On 4/18/07, Jiri Slaby <[EMAIL PROTECTED]> wrote:
> johann deneux wrote:
> > Jiri,
> >
> > Which solution did you choose to implement? From what I remember, we
> > last discussed Dmitry's idea of specifying an axis for an effect, then
> > combine several effects to achieve complex effects.
>
> I think you mean motor instead of axis, because I don't push real axes to
> the devices, but motor's torques...

Yes, sorry, I meant motor.

--
Johann

[KJ][PATCH]SPIN_LOCK_UNLOCKED cleanup in drivers/s390

2007-04-18 Thread Milind Arun Choudhary

SPIN_LOCK_UNLOCKED cleanup: use __SPIN_LOCK_UNLOCKED instead.
 
Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---
 char/vmlogrdr.c |6 +++---
 cio/cmf.c   |2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)


diff --git a/drivers/s390/char/vmlogrdr.c b/drivers/s390/char/vmlogrdr.c
index b87d3b0..75d61a4 100644
--- a/drivers/s390/char/vmlogrdr.c
+++ b/drivers/s390/char/vmlogrdr.c
@@ -125,7 +125,7 @@ static struct vmlogrdr_priv_t sys_ser[] = {
  .recording_name = "EREP",
  .minor_num  = 0,
  .buffer_free= 1,
- .priv_lock  = SPIN_LOCK_UNLOCKED,
+ .priv_lock  = __SPIN_LOCK_UNLOCKED(sys_ser[0].priv_lock),
  .autorecording  = 1,
  .autopurge  = 1,
},
@@ -134,7 +134,7 @@ static struct vmlogrdr_priv_t sys_ser[] = {
  .recording_name = "ACCOUNT",
  .minor_num  = 1,
  .buffer_free= 1,
- .priv_lock  = SPIN_LOCK_UNLOCKED,
+ .priv_lock  = __SPIN_LOCK_UNLOCKED(sys_ser[1].priv_lock),
  .autorecording  = 1,
  .autopurge  = 1,
},
@@ -143,7 +143,7 @@ static struct vmlogrdr_priv_t sys_ser[] = {
  .recording_name = "SYMPTOM",
  .minor_num  = 2,
  .buffer_free= 1,
- .priv_lock  = SPIN_LOCK_UNLOCKED,
+ .priv_lock  = __SPIN_LOCK_UNLOCKED(sys_ser[2].priv_lock),
  .autorecording  = 1,
  .autopurge  = 1,
}
diff --git a/drivers/s390/cio/cmf.c b/drivers/s390/cio/cmf.c
index 90b22fa..28abd69 100644
--- a/drivers/s390/cio/cmf.c
+++ b/drivers/s390/cio/cmf.c
@@ -476,7 +476,7 @@ struct cmb_area {
 };
 
 static struct cmb_area cmb_area = {
-   .lock = SPIN_LOCK_UNLOCKED,
+   .lock = __SPIN_LOCK_UNLOCKED(cmb_area.lock),
.list = LIST_HEAD_INIT(cmb_area.list),
.num_channels  = 1024,
 };
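
For readers unfamiliar with the pattern: the named initializer takes the
lock's own name as an argument, so each statically initialized lock can
carry its own identity for debugging (in the kernel, for lockdep) instead
of sharing one anonymous initializer. The following userspace sketch shows
the idea only; toy_lock, TOY_LOCK_UNLOCKED and toy_cmb_area are
hypothetical names, not kernel definitions.

```c
#include <assert.h>
#include <string.h>

/* Toy lock that remembers the name it was initialized with. */
struct toy_lock {
	int locked;
	const char *name;
};

/* Stringify the argument so every lock gets a distinct debug name. */
#define TOY_LOCK_UNLOCKED(lockname) { .locked = 0, .name = #lockname }

struct toy_cmb_area {
	struct toy_lock lock;
	int num_channels;
};

/* Same shape as the cmb_area initializer in the diff, with toy types. */
static struct toy_cmb_area toy_cmb_area = {
	.lock = TOY_LOCK_UNLOCKED(toy_cmb_area.lock),
	.num_channels = 1024,
};
```

Here toy_cmb_area.lock.name ends up as the string "toy_cmb_area.lock",
which is the kind of per-lock label an anonymous initializer cannot provide.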
-- 
Milind Arun Choudhary

[KJ][PATCH] i2c: SPIN_LOCK_UNLOCKED cleanup

2007-04-18 Thread Milind Arun Choudhary

SPIN_LOCK_UNLOCKED cleanup: use __SPIN_LOCK_UNLOCKED instead.

Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---
 i2c-pxa.c |2 +-
 i2c-s3c2410.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-pxa.c b/drivers/i2c/busses/i2c-pxa.c
index 14e83d0..d5d44ed 100644
--- a/drivers/i2c/busses/i2c-pxa.c
+++ b/drivers/i2c/busses/i2c-pxa.c
@@ -825,7 +825,7 @@ static const struct i2c_algorithm i2c_pxa_algorithm = {
 };
 
 static struct pxa_i2c i2c_pxa = {
-   .lock   = SPIN_LOCK_UNLOCKED,
+   .lock   = __SPIN_LOCK_UNLOCKED(i2c_pxa.lock),
.adap   = {
.owner  = THIS_MODULE,
.algo   = &i2c_pxa_algorithm,
diff --git a/drivers/i2c/busses/i2c-s3c2410.c b/drivers/i2c/busses/i2c-s3c2410.c
index 556f244..3eb5958 100644
--- a/drivers/i2c/busses/i2c-s3c2410.c
+++ b/drivers/i2c/busses/i2c-s3c2410.c
@@ -570,7 +570,7 @@ static const struct i2c_algorithm s3c24xx_i2c_algorithm = {
 };
 
 static struct s3c24xx_i2c s3c24xx_i2c = {
-   .lock   = SPIN_LOCK_UNLOCKED,
+   .lock   = __SPIN_LOCK_UNLOCKED(s3c24xx_i2c.lock),
.wait   = __WAIT_QUEUE_HEAD_INITIALIZER(s3c24xx_i2c.wait),
.adap   = {
.name   = "s3c2410-i2c",

-- 
Milind Arun Choudhary

Re: [PATCH][BUG] Fix possible NULL pointer access in 8250 serial driver

2007-04-18 Thread izumi


Russell King wrote:

> NAK.  This means that you change the list of ports available on the
> machine to be limited to only those which are currently open.  Utterly
> useless for debugging, where you normally want people to dump the
> contents of /proc/tty/driver/*.
>
> The original patch was better.

Is the original patch sufficient?  Or is there anything we should
correct?

Taku Izumi <[EMAIL PROTECTED]>




Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin

On Wed, Apr 18, 2007 at 10:49:45PM +1000, Con Kolivas wrote:
> On Wednesday 18 April 2007 22:13, Nick Piggin wrote:
> >
> > The kernel compile (make -j8 on 4 thread system) is doing 1800 total
> > context switches per second (450/s per runqueue) for cfs, and 670
> > for mainline. Going up to 20ms granularity for cfs brings the context
> > switch numbers similar, but user time is still a % or so higher. I'd
> > be more worried about compute heavy threads which naturally don't do
> > much context switching.
> 
> While kernel compiles are nice and easy to do I've seen enough criticism of 
> them in the past to wonder about their usefulness as a standard benchmark on 
> their own.

Actually it is a real workload for most kernel developers including you
no doubt :)

The criticisms of kernbench for the kernel are probably fair in that
kernel compiles don't exercise a lot of kernel functionality (page
allocator and fault paths mostly, IIRC). However as far as I'm concerned,
they're great for testing the CPU scheduler, because it doesn't actually
matter whether you're running in userspace or kernel space for a context
switch to blow your caches. The results are quite stable.

You could actually make up a benchmark that hurts a whole lot more from
context switching, but I figure that kernbench is a real world thing
that shows it up quite well.

> > Some other numbers on the same system
> > Hackbench:        2.6.21-rc7  cfs-v2 1ms[*]  nicksched
> > 10 groups: Time:  1.332       0.743          0.607
> > 20 groups: Time:  1.197       1.100          1.241
> > 30 groups: Time:  1.754       2.376          1.834
> > 40 groups: Time:  3.451       2.227          2.503
> > 50 groups: Time:  3.726       3.399          3.220
> > 60 groups: Time:  3.548       4.567          3.668
> > 70 groups: Time:  4.206       4.905          4.314
> > 80 groups: Time:  4.551       6.324          4.879
> > 90 groups: Time:  7.904       6.962          5.335
> > 100 groups: Time: 7.293       7.799          5.857
> > 110 groups: Time: 10.595      8.728          6.517
> > 120 groups: Time: 7.543       9.304          7.082
> > 130 groups: Time: 8.269       10.639         8.007
> > 140 groups: Time: 11.867      8.250          8.302
> > 150 groups: Time: 14.852      8.656          8.662
> > 160 groups: Time: 9.648       9.313          9.541
> 
> Hackbench even more so. In a prolonged discussion with Rusty Russell on this
> issue, he suggested hackbench was more a pass/fail benchmark to ensure there
> was no starvation scenario that never ended, and very little value should be
> placed on the actual results returned from it.

Yeah, cfs seems to do a little worse than nicksched here, but I
include the numbers not because I think that is significant, but to
show mainline's poor characteristics.

Re: Announce - Staircase Deadline cpu scheduler v0.42

2007-04-18 Thread Nick Piggin

On Thu, Apr 19, 2007 at 12:12:14PM +1000, Con Kolivas wrote:
> On Thursday 19 April 2007 10:41, Con Kolivas wrote:
> > On Thursday 19 April 2007 09:59, Con Kolivas wrote:
> > > Since there is so much work currently ongoing with alternative cpu
> > > schedulers, as a standard for comparison with the alternative virtual
> > > deadline fair designs I've addressed a few issues in the Staircase
> > > Deadline cpu scheduler which improve behaviour likely in a noticeable
> > > fashion and released version 0.41.
> > >
> > > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.41.patch
> > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.41.patch
> > >
> > > and an incremental for those on 0.40:
> > > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7/sched-implement-staircase-deadline-scheduler-further-improvements.patch
> > >
> > > Remember to renice X to -10 for nicest desktop behaviour :)
> > >
> > > Have fun.
> >
> > Oops forgot to cc a few people
> >
> > Nick you said I should still have something to offer so here it is.
> > Peter you said you never saw this design (it's a dual array affair sorry).
> > Gene and Willy you were some of the early testers that noticed the
> > advantages of the earlier designs,
> > Matt you did lots of great earlier testing.
> > WLI you inspired a lot of design ideas.
> > Mike you were the stick.
> > And a few others I've forgotten to mention and include.
> 
> Version 0.42
> 
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.42.patch

OK, I'll run some tests later today...

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin

On Wed, Apr 18, 2007 at 07:48:21AM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 18 Apr 2007, Matt Mackall wrote:
> > 
> > Why is X special? Because it does work on behalf of other processes?
> > Lots of things do this. Perhaps a scheduler should focus entirely on
> > the implicit and directed wakeup matrix and optimizing that
> > instead[1].
> 
> I 100% agree - the perfect scheduler would indeed take into account where 
> the wakeups come from, and try to "weigh" processes that help other 
> processes make progress more. That would naturally give server processes 
> more CPU power, because they help others
> 
> I don't believe for a second that "fairness" means "give everybody the 
> same amount of CPU". That's a totally illogical measure of fairness. All 
> processes are _not_ created equal.

I believe that unless the kernel is told of these inequalities, then it
must schedule fairly.

And yes, by fairly, I mean fairly among all threads as a base resource
class, because that's what Linux has always done (and if you aggregate
into higher classes, you still need that per-thread scheduling).

So I'm not excluding extra scheduling classes like per-process, per-user,
but among any class of equal schedulable entities, fair scheduling is the
only option because the alternative of unfairness is just insane.

> That said, even trying to do "fairness by effective user ID" would 
> probably already do a lot. In a desktop environment, X would get as much 
> CPU time as the user processes, simply because it's in a different 
> protection domain (and that's really what "effective user ID" means: it's 
> not about "users", it's really about "protection domains").
> 
> And "fairness by euid" is probably a hell of a lot easier to do than 
> trying to figure out the wakeup matrix.

Well my X server has an euid of root, which would mean my X clients can
cause X to do work and eat into root's resources. Or as Ingo said, X
may not be running as root. Seems like just another hack to try to
implicitly solve the X problem and probably create a lot of others
along the way.

All fairness issues aside, in the context of keeping a very heavily
loaded desktop interactive, X is special. That you are trying to think
up funny rules that would implicitly give X better priority is kind of
indicative of that.


Re: [RFC 0/8] Cpuset aware writeback

2007-04-18 Thread Christoph Lameter

On Wed, 18 Apr 2007, Ethan Solomita wrote:

>Any new ETA? I'm trying to decide whether to go back to your original
> patches or wait for the new set. Adding new knobs isn't as important to me as
> having something that fixes the core problem, so hopefully this isn't waiting
> on them. They could always be patches on top of your core patches.
>-- Ethan

Hmm. Sorry. I got distracted and I have sent them to Kame-san, who was
interested in working on them.

I have placed the most recent version at
http://ftp.kernel.org/pub/linux/kernel/people/christoph/cpuset_dirty


Re: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues

2007-04-18 Thread Trond Myklebust

On Wed, 2007-04-18 at 20:52 -0500, Florin Iucha wrote:
> On Wed, Apr 18, 2007 at 10:11:46AM -0400, Trond Myklebust wrote:
> > Do you have a copy of wireshark or ethereal on hand? If so, could you
> > take a look at whether or not any NFS traffic is going between the
> > client and server once the hang happens?
> 
> I used the following command 
> 
>tcpdump -w nfs-traffic -i eth0 -vv -tt dst port nfs
> 
> to capture
> 
>http://iucha.net/nfs/21-rc7-nfs4/nfs-traffic.bz2
> 
> I started the capture before starting the copy and left it to run for
> a few minutes after the traffic slowed to a crawl.
> 
> The iostat and vmstat are at:
> 
>http://iucha.net/nfs/21-rc7-nfs4/iostat
>http://iucha.net/nfs/21-rc7-nfs4/vmstat
>
> It seems that my original problem report had a big mistake!  There is
> no hang, but at some point the write slows down to a trickle (from
> 40,000 blocks/s to 22 blocks/s) as can be seen from the iostat log.

Yeah. You only captured the outgoing traffic to the server, but already
it looks as if there were 'interesting' things going on. In frames 29346
to 29350, the traffic stops altogether for 5 seconds (I only see
keepalives) then it starts up again. Ditto for frames 40477-40482
(another 5 seconds). ...
Then at around frame 92072, the client starts to send a bunch of RSTs.
Aha, I'll bet that reverting the appended patch fixes the problem.

The assumption Chuck makes, that if _no_ request bytes have been sent
yet the request is on the 'receive list' then it must be a resend, is
patently false in the case where the send queue just happens to be full.
A better solution would probably be to disconnect the socket following
the ETIMEDOUT handling in call_status().
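
To make the failure mode concrete, here is a simplified sketch. It does not
reproduce the sunrpc code: struct toy_request and both helper functions are
hypothetical, and the second helper is only one reading of the suggestion
to act after the timeout has been handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified request state; not the sunrpc structures. */
struct toy_request {
	size_t bytes_sent;	/* bytes of this request already written */
	bool on_receive_list;	/* request is queued waiting for a reply */
	bool timed_out;		/* a reply timeout has actually expired */
};

/*
 * Heuristic as described above: nothing sent yet, but the request is
 * already waiting for a reply, so it is taken to be a resend.
 */
static bool looks_like_resend(const struct toy_request *req)
{
	return req->on_receive_list && req->bytes_sent == 0;
}

/* Alternative sketch: only disconnect once a timeout has really expired. */
static bool disconnect_after_timeout(const struct toy_request *req)
{
	return req->timed_out;
}
```

A first transmission that is blocked by a full send queue has
on_receive_list set and bytes_sent == 0, so looks_like_resend() reports a
resend that never happened, while disconnect_after_timeout() stays false
until the timeout actually fires.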

Cheers
  Trond
---
commit 43d78ef2ba5bec26d0315859e8324bfc0be23766
Author: Chuck Lever <[EMAIL PROTECTED]>
Date:   Tue Feb 6 18:26:11 2007 -0500

NFS: disconnect before retrying NFSv4 requests over TCP

RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
twice on the same connection unless it is the NULL procedure.  Section
3.1.1 suggests that the client should disconnect and reconnect if it
wants to retry a request.

Implement this by adding an rpc_clnt flag that an ULP can use to
specify that the underlying transport should be disconnected on a
major timeout.  The NFSv4 client asserts this new flag, and requests
no retries after a minor retransmit timeout.

Note that disconnecting on a retransmit is in general not safe to do
if the RPC client does not reuse the TCP port number when reconnecting.

See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6

Signed-off-by: Chuck Lever <[EMAIL PROTECTED]>
Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index a3191f0..c46e94f 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -394,7 +394,8 @@ static void nfs_init_timeout_values(struct rpc_timeout *to, 
int proto,
 static int nfs_create_rpc_client(struct nfs_client *clp, int proto,
unsigned int timeo,
unsigned int retrans,
-   rpc_authflavor_t flavor)
+   rpc_authflavor_t flavor,
+   int flags)
 {
struct rpc_timeout  timeparms;
struct rpc_clnt *clnt = NULL;
@@ -407,6 +408,7 @@ static int nfs_create_rpc_client(struct nfs_client *clp, int proto,
.program= &nfs_program,
.version= clp->rpc_ops->version,
.authflavor = flavor,
+   .flags  = flags,
};

if (!IS_ERR(clp->cl_rpcclient))
@@ -548,7 +550,7 @@ static int nfs_init_client(struct nfs_client *clp, const struct nfs_mount_data *
 * - RFC 2623, sec 2.3.2
 */
error = nfs_create_rpc_client(clp, proto, data->timeo, data->retrans,
-   RPC_AUTH_UNIX);
+   RPC_AUTH_UNIX, 0);
if (error < 0)
goto error;
nfs_mark_client_ready(clp, NFS_CS_READY);
@@ -868,7 +870,8 @@ static int nfs4_init_client(struct nfs_client *clp,
/* Check NFS protocol revision and initialize RPC op vector */
clp->rpc_ops = &nfs_v4_clientops;

-   error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour);
+   error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour,
+   RPC_CLNT_CREATE_DISCRTRY);
if (error < 0)
goto error;
memcpy(clp->cl_ipaddr, ip_addr, sizeof(clp->cl_ipaddr));
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index a1be89d..c7a78ee 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -40,6 +40,7

Re: is there any generic GPIO chip framework like IRQ chips?

2007-04-18 Thread David Brownell

> >> > So, talking about what an (optional) implementation framework might
> >> > look like (and which could handle the SOC, FPGA, I2C, and MFD cases
> >> > I've looked at):
> 
> > See patches in following messages ... a preliminary "gpio_chip" core
> > for such a framework, plus example support for one SOC family's GPIOs,
> > and then updating one board's handling of GPIOs, including over I2C.
> 
> Just to compare, diffstats for GPIODEV:

Now, if they were functionally equivalent, such a comparison
would be less of an apples/oranges thing!

The most useful comparison would focus on technical aspects of
the gpio_chip abstraction itself (i.e. $SUBJECT).


>   it needs work - it doesn't adhere to your own 
> optimization scheme by using lookup table instead of list.

I thought it was more important to address the $SUBJECT first:
get a working gpio_chip abstraction which covers all the needed
functionality.  The patch had a hook for implementing such tweaks,
but it wasn't used.

The next version you'll see lets the platform code use its own
existing lookup code, as part of slimming things down a bit.
I also decided to take out the debugfs support.


>you speak about constructor
> parts which "anyone" can use to construct whatever GPIO API they like,
> whereas I'm speaking about exact API implementation which can be used
> right away.

I most certainly did not speak about "whatever GPIO API they like"!!

Quite the contrary, in fact.  Please don't put words in my mouth.
(You've been doing it quite extensively in this thread; it's rude.)

And that "core" patch I posted was clearly usable "right away";
otherwise the two examples _using_ it couldn't have worked.


> Well, besides gpio_keys we here have asic3_keys, samcop_keys,
> etc. - all that duplication just because the current GPIO API doesn't
> allow extensibility to more chips.

When I get tired of repeating myself, just remember:  the current
programming interface *DOES* allow such extensibility.  That's what it
means to be an "interface", rather than an implementation:  it defines
inputs and outputs, allowing any process that conforms to both.

In fact, the patches I sent demonstrated exactly that extensibility.
Same interface, additional chips; different implementation inside.
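
As a rough user-space illustration of "same interface, different
implementation inside": the sketch below is not the posted gpio_chip patch.
Every name in it (struct sketch_gpio_chip, sketch_gpio_get(),
sketch_gpio_set(), the simulated SOC bank and I2C expander) is hypothetical,
and the linear search merely stands in for whatever lookup a platform uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical controller descriptor, loosely following the "gpio_chip" idea. */
struct sketch_gpio_chip {
	unsigned base;    /* first GPIO number this chip handles */
	unsigned ngpio;   /* number of GPIOs on this chip */
	int (*get)(struct sketch_gpio_chip *chip, unsigned offset);
	void (*set)(struct sketch_gpio_chip *chip, unsigned offset, int value);
	void *priv;       /* implementation-specific state */
};

/* Implementation 1: an SOC bank, simulated by one 32-bit register. */
static int soc_get(struct sketch_gpio_chip *chip, unsigned offset)
{
	uint32_t *reg = chip->priv;

	return (int)((*reg >> offset) & 1u);
}

static void soc_set(struct sketch_gpio_chip *chip, unsigned offset, int value)
{
	uint32_t *reg = chip->priv;

	if (value)
		*reg |= UINT32_C(1) << offset;
	else
		*reg &= ~(UINT32_C(1) << offset);
}

/* Implementation 2: an I2C expander, simulated by a shadow byte. */
struct sketch_expander {
	uint8_t shadow;      /* last known pin state */
	unsigned transfers;  /* counts simulated bus transactions */
};

static int exp_get(struct sketch_gpio_chip *chip, unsigned offset)
{
	struct sketch_expander *e = chip->priv;

	e->transfers++;
	return (e->shadow >> offset) & 1;
}

static void exp_set(struct sketch_gpio_chip *chip, unsigned offset, int value)
{
	struct sketch_expander *e = chip->priv;

	e->transfers++;
	if (value)
		e->shadow |= (uint8_t)(1u << offset);
	else
		e->shadow &= (uint8_t)~(1u << offset);
}

uint32_t sketch_soc_bank;
struct sketch_expander sketch_exp;

static struct sketch_gpio_chip sketch_chips[] = {
	{ .base = 0,  .ngpio = 32, .get = soc_get, .set = soc_set, .priv = &sketch_soc_bank },
	{ .base = 32, .ngpio = 8,  .get = exp_get, .set = exp_set, .priv = &sketch_exp },
};

/* Linear lookup from GPIO number to chip; this is the "lookup cost". */
static struct sketch_gpio_chip *sketch_find_chip(unsigned gpio)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_chips) / sizeof(sketch_chips[0]); i++) {
		struct sketch_gpio_chip *c = &sketch_chips[i];

		if (gpio >= c->base && gpio - c->base < c->ngpio)
			return c;
	}
	return NULL;
}

/* The one interface callers see, whatever chip sits behind the number. */
int sketch_gpio_get(unsigned gpio)
{
	struct sketch_gpio_chip *c = sketch_find_chip(gpio);

	return c ? c->get(c, gpio - c->base) : -1;
}

void sketch_gpio_set(unsigned gpio, int value)
{
	struct sketch_gpio_chip *c = sketch_find_chip(gpio);

	assert(c != NULL);
	c->set(c, gpio - c->base, value);
}
```

A caller uses sketch_gpio_set(3, 1) and sketch_gpio_set(34, 1) the same way;
only the second one costs a (simulated) bus transfer. The price of the
abstraction is the lookup plus one indirect call per access.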


> > So you're agreeing that, at a technical level, what I described
> > could be augmented by a "caching" facility ... giving a programming
> > interface with all the characteristics of your "GPIODEV" thingie.
> 
> > All you're really disagreeing with is bootstrapping issues; and
> > whether there is in fact a need for such a layer.  The only argument
> > I could possibly buy is that it avoids the lookup of (b) ... but
> > that doesn't seem to matter in most cases I've looked at.


> So, now the most important question is what we all would get
> with your approach in the end.
> 
> So, if you could make sure gpiolib.c doesn't contain inefficient
> implementation,

I can make it comparable to existing implementations that work
the same way ... e.g. AT91 and OMAP code.  Of course, it's not
possible to get away from the cost of function indirection, with
a generic gpio_chip abstraction.  Or those lookup costs; but as
you agreed, those costs don't seem to matter much.  And if they
ever do matter, caching support would be easy to add.


> and make such extensible implementation available by default
> for ARM PXA/S3Cxxx/OMAP, then it would for sure cover Handhelds.org's,
> and many other people's, use cases, and that would be highly
> appreciated.
> 
> If you could do it for the 2.6.22 merge window, that would be
> straight ideal.

I think having an optional gpio_chip, not unlike what was in
that one patch, should be reasonable; also, making it work on
some platforms that I use.  But I don't think there's much
overlap between those platforms and what hh.org uses.

- Dave

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Peter Williams


Ingo Molnar wrote:
> * Peter Williams <[EMAIL PROTECTED]> wrote:
>
> > > And my scheduler for example cuts down the amount of policy code and
> > > code size significantly.
> >
> > Yours is one of the smaller patches mainly because you perpetuate (or
> > you did in the last one I looked at) the (horrible to my eyes) dual
> > array (active/expired) mechanism.  That this idea was bad should have
> > been apparent to all as soon as the decision was made to excuse some
> > tasks from being moved from the active array to the expired array.
> > This essentially meant that there would be circumstances where extreme
> > unfairness (to the extent of starvation in some cases) could occur --
> > the very things that the mechanism was originally designed to prevent
> > (as far as I can gather).  Right about then in the development of the
> > O(1) scheduler alternative solutions should have been sought.
>
> in hindsight i'd agree.


Hindsight's a wonderful place isn't it :-) and, of course, it's where I 
was making my comments from.


> But back then we were clearly not ready for
> fine-grained accurate statistics + trees (cpus are a lot faster at more
> complex arithmetics today, plus people still believed that low-res can
> be done well enough), and taking out either of these two concepts from
> CFS would result in a similarly complex runqueue implementation.


I disagree.  The single priority array with a promotion mechanism that I 
use in the SPA schedulers can do the job of avoiding starvation with no 
measurable increase in the overhead.  Fairness, nice, good interactive 
responsiveness can then be managed by how you determine tasks' dynamic 
priorities.
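
For readers who have not seen the idea, here is a toy sketch of a single
priority array with promotion. It is not the SPA scheduler code: the names
(sketch_rq, sketch_promote() and so on), the four priority levels and the
choice to promote by splicing whole queues are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NPRIO 4  /* priority levels; 0 is the best */

struct sketch_task {
	int id;
	struct sketch_task *next;
};

/* FIFO of runnable tasks at one priority level. */
struct sketch_queue {
	struct sketch_task *head, *tail;
};

/* The single priority array: one FIFO per level, no expired array. */
struct sketch_rq {
	struct sketch_queue prio[SKETCH_NPRIO];
};

void sketch_enqueue(struct sketch_rq *rq, struct sketch_task *t, int prio)
{
	struct sketch_queue *q;

	assert(prio >= 0 && prio < SKETCH_NPRIO);
	q = &rq->prio[prio];
	t->next = NULL;
	if (q->tail)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
}

/* Dequeue the first task of the best non-empty level, or NULL if idle. */
struct sketch_task *sketch_pick_next(struct sketch_rq *rq)
{
	int p;

	for (p = 0; p < SKETCH_NPRIO; p++) {
		struct sketch_queue *q = &rq->prio[p];
		struct sketch_task *t = q->head;

		if (!t)
			continue;
		q->head = t->next;
		if (!q->head)
			q->tail = NULL;
		return t;
	}
	return NULL;
}

/* Promotion pass: splice each level onto the tail of the next better one. */
void sketch_promote(struct sketch_rq *rq)
{
	int p;

	for (p = 1; p < SKETCH_NPRIO; p++) {
		struct sketch_queue *from = &rq->prio[p];
		struct sketch_queue *to = &rq->prio[p - 1];

		if (!from->head)
			continue;
		if (to->tail)
			to->tail->next = from->head;
		else
			to->head = from->head;
		to->tail = from->tail;
		from->head = from->tail = NULL;
	}
}
```

If a CPU hog is re-queued at level 0 after every pick and sketch_promote()
runs once per pick, a task queued at level 3 reaches level 0 after three
passes and is picked fifth, instead of waiting forever. Each pass is a
constant amount of pointer work per level, which is the sense in which the
overhead stays small in this toy version.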


> Also, the
> array switch was just thought of as another piece of 'if the
> heuristics go wrong, we fall back to an array switch' logic, right in
> line with the other heuristics. And you have to accept it, mainline's
> ability to auto-renice make -j jobs (and other CPU hogs) was quite a
> plus for developers, so it had (and probably still has) quite some
> inertia.


I agree, it wasn't totally useless especially for the average user.  My 
main problem with it was that the effect of "nice" wasn't consistent or 
predictable enough for reliable resource allocation.


I also agree with the aims of the various heuristics i.e. you have to be 
unfair and give some tasks preferential treatment in order to give the 
users the type of responsiveness that they want.  It's just a shame that 
it got broken in the process but as you say it's easier to see these 
things in hindsight than in the middle of the melee.


Peter
--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

Re: [NETLINK] Don't attach callback to a going-away netlink socket

2007-04-18 Thread Herbert Xu

David Miller <[EMAIL PROTECTED]> wrote:
> 
> As discussed in this thread there might be other ways to
> approach this, but this fix is good for now.
> 
> Patch applied, thank you.

Actually I was going to suggest something like this:

[NETLINK]: Kill CB only when socket is unused

Since we can still receive packets until all references to the
socket are gone, we don't need to kill the CB until that happens.
This also aligns ourselves with the receive queue purging which
happens at that point.

Original patch by Pavel Emelianov who noticed this race condition.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 0be19b7..914884c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -139,6 +139,15 @@ static struct hlist_head *nl_pid_hashfn(struct nl_pid_hash *hash, u32 pid)
 
 static void netlink_sock_destruct(struct sock *sk)
 {
+   struct netlink_sock *nlk = nlk_sk(sk);
+
+   WARN_ON(mutex_is_locked(nlk_sk(sk)->cb_mutex));
+   if (nlk->cb) {
+   if (nlk->cb->done)
+   nlk->cb->done(nlk->cb);
+   netlink_destroy_callback(nlk->cb);
+   }
+
skb_queue_purge(&sk->sk_receive_queue);
 
if (!sock_flag(sk, SOCK_DEAD)) {
@@ -147,7 +156,6 @@ static void netlink_sock_destruct(struct sock *sk)
}
BUG_TRAP(!atomic_read(&sk->sk_rmem_alloc));
BUG_TRAP(!atomic_read(&sk->sk_wmem_alloc));
-   BUG_TRAP(!nlk_sk(sk)->cb);
BUG_TRAP(!nlk_sk(sk)->groups);
 }
 
@@ -450,17 +458,7 @@ static int netlink_release(struct socket *sock)
netlink_remove(sk);
nlk = nlk_sk(sk);
 
-   mutex_lock(nlk->cb_mutex);
-   if (nlk->cb) {
-   if (nlk->cb->done)
-   nlk->cb->done(nlk->cb);
-   netlink_destroy_callback(nlk->cb);
-   nlk->cb = NULL;
-   }
-   mutex_unlock(nlk->cb_mutex);
-
-   /* OK. Socket is unlinked, and, therefore,
-  no new packets will arrive */
+   /* OK. Socket is unlinked. */
 
sock_orphan(sk);
sock->sk = NULL;

Announce - Staircase Deadline cpu scheduler v0.42

2007-04-18 Thread Con Kolivas

On Thursday 19 April 2007 10:41, Con Kolivas wrote:
> On Thursday 19 April 2007 09:59, Con Kolivas wrote:
> > Since there is so much work currently ongoing with alternative cpu
> > schedulers, as a standard for comparison with the alternative virtual
> > deadline fair designs I've addressed a few issues in the Staircase
> > Deadline cpu scheduler which improve behaviour likely in a noticeable
> > fashion and released version 0.41.
> >
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.41.patch
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.41.patch
> >
> > and an incremental for those on 0.40:
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7/sched-implement-staircase-deadline-scheduler-further-improvements.patch
> >
> > Remember to renice X to -10 for nicest desktop behaviour :)
> >
> > Have fun.
>
> Oops forgot to cc a few people
>
> Nick you said I should still have something to offer so here it is.
> Peter you said you never saw this design (it's a dual array affair sorry).
> Gene and Willy you were some of the early testers that noticed the
> advantages of the earlier designs,
> Matt you did lots of great earlier testing.
> WLI you inspired a lot of design ideas.
> Mike you were the stick.
> And a few others I've forgotten to mention and include.

Version 0.42

http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.42.patch

-- 
-ck

Re: [RFC 0/8] Cpuset aware writeback

2007-04-18 Thread Ethan Solomita


Christoph Lameter wrote:
> On Wed, 21 Mar 2007, Ethan Solomita wrote:
> > Christoph Lameter wrote:
> > > On Thu, 1 Feb 2007, Ethan Solomita wrote:
> > > >    Hi Christoph -- has anything come of resolving the NFS / OOM
> > > > concerns that Andrew Morton expressed concerning the patch? I'd be
> > > > happy to see some progress on getting this patch (i.e. the one you
> > > > posted on 1/23) through.
> > >
> > > Peter Zijlstra addressed the NFS issue. I will submit the patch again
> > > as soon as the writeback code stabilizes a bit.
> >
> > I'm pinging to see if this has gotten anywhere. Are you ready to
> > resubmit? Do we have the evidence to convince Andrew that the NFS issues
> > are resolved and so this patch won't obscure anything?
>
> The NFS patch went into Linus tree a couple of days ago and I have a new
> version ready with additional support to set per dirty ratios per cpu.
> There is some interest in adding more VM controls to this patch. I hope I
> can post the next rev by tomorrow.


   Any new ETA? I'm trying to decide whether to go back to your 
original patches or wait for the new set. Adding new knobs isn't as 
important to me as having something that fixes the core problem, so 
hopefully this isn't waiting on them. They could always be patches on 
top of your core patches.

   -- Ethan


Re: [PATCH] sched: implement staircase deadline scheduler further improvements-1

2007-04-18 Thread Con Kolivas

On Thursday 19 April 2007 09:48, Con Kolivas wrote:
> While the Staircase Deadline scheduler has not been completely killed off
> and is still in -mm I would like to fix some outstanding issues that I've
> found since it still serves for comparison with all the upcoming
> schedulers.
>
> While still in -mm can we queue this on top please?
>
> A set of staircase-deadline v 0.41 patches will make their way into the
> usual place for those willing to test it.
>
> http://ck.kolivas.org/patches/staircase-deadline/

Oops! Minor thinko! Here is a respin. Please apply this one instead.

I better make a 0.42 heh.

---
The prio_level was being inappropriately decreased if a higher priority
task was still using previous timeslice. Fix that.

Task expiration of higher priority tasks was not being taken into account
with allocating priority slots. Check the expired best_static_prio level
to facilitate that.

Explicitly check all better static priority prio_levels when deciding on
allocating slots for niced tasks.

These changes improve behaviour in many ways.

Signed-off-by: Con Kolivas <[EMAIL PROTECTED]>

---
 kernel/sched.c |   64 ++---
 1 file changed, 43 insertions(+), 21 deletions(-)

Index: linux-2.6.21-rc7-sd/kernel/sched.c
===
--- linux-2.6.21-rc7-sd.orig/kernel/sched.c 2007-04-19 08:51:54.0 +1000
+++ linux-2.6.21-rc7-sd/kernel/sched.c  2007-04-19 12:03:29.0 +1000
@@ -145,6 +145,12 @@ struct prio_array {
 */
DECLARE_BITMAP(prio_bitmap, MAX_PRIO + 1);
 
+   /*
+* The best static priority (of the dynamic priority tasks) queued
+* this array.
+*/
+   int best_static_prio;
+
 #ifdef CONFIG_SMP
/* For convenience looks back at rq */
struct rq *rq;
@@ -191,9 +197,9 @@ struct rq {
 
/*
 * The current dynamic priority level this runqueue is at per static
-* priority level, and the best static priority queued this rotation.
+* priority level.
 */
-   int prio_level[PRIO_RANGE], best_static_prio;
+   int prio_level[PRIO_RANGE];
 
/* How many times we have rotated the priority queue */
unsigned long prio_rotation;
@@ -669,7 +675,7 @@ static void task_new_array(struct task_s
 }
 
 /* Find the first slot from the relevant prio_matrix entry */
-static inline int first_prio_slot(struct task_struct *p)
+static int first_prio_slot(struct task_struct *p)
 {
if (unlikely(p->policy == SCHED_BATCH))
return p->static_prio;
@@ -682,11 +688,18 @@ static inline int first_prio_slot(struct
  * level. SCHED_BATCH tasks do not use the priority matrix. They only take
  * priority slots from their static_prio and above.
  */
-static inline int next_entitled_slot(struct task_struct *p, struct rq *rq)
+static int next_entitled_slot(struct task_struct *p, struct rq *rq)
 {
+   int search_prio = MAX_RT_PRIO, uprio = USER_PRIO(p->static_prio);
+   struct prio_array *array = rq->active;
DECLARE_BITMAP(tmp, PRIO_RANGE);
-   int search_prio, uprio = USER_PRIO(p->static_prio);
 
+   /*
+* Go straight to expiration if there are higher priority tasks
+* already expired.
+*/
+   if (p->static_prio > rq->expired->best_static_prio)
+   return MAX_PRIO;
if (!rq->prio_level[uprio])
rq->prio_level[uprio] = MAX_RT_PRIO;
/*
@@ -694,15 +707,22 @@ static inline int next_entitled_slot(str
 * static_prio are acceptable, and only if it's not better than
 * a queued better static_prio's prio_level.
 */
-   if (p->static_prio < rq->best_static_prio) {
-   search_prio = MAX_RT_PRIO;
+   if (p->static_prio < array->best_static_prio) {
if (likely(p->policy != SCHED_BATCH))
-   rq->best_static_prio = p->static_prio;
-   } else if (p->static_prio == rq->best_static_prio)
+   array->best_static_prio = p->static_prio;
+   } else if (p->static_prio == array->best_static_prio) {
search_prio = rq->prio_level[uprio];
-   else {
-   search_prio = max(rq->prio_level[uprio],
-   rq->prio_level[USER_PRIO(rq->best_static_prio)]);
+   } else {
+   int i;
+
+   search_prio = rq->prio_level[uprio];
+   /* A bound O(n) function, worst case n is 40 */
+   for (i = array->best_static_prio; i <= p->static_prio ; i++) {
+   if (!rq->prio_level[USER_PRIO(i)])
+   rq->prio_level[USER_PRIO(i)] = MAX_RT_PRIO;
+   search_prio = max(search_prio,
+ rq->prio_level[USER_PRIO(i)]);
+   }
}
if (unlikely(p->policy == SCHED_BATCH)) {
search_prio = max(search_prio,

Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Nigel Cunningham

Hi.

On Thu, 2007-04-19 at 00:02 +0200, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> 
> > > although probably your suspend2 problem is still not fixed, it's 
> > > worth a try nevertheless. Which suspend2 patch did you apply, and 
> > > was it against -rc6 or -rc7?
> > 
> > You are right again. ;-)
> > 
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> > 
> > And it still hangs on suspend.
> 
> what's the easiest way for me to try suspend2? Apply the patch, reboot 
> into the kernel, then execute what command to suspend? (there's a 
> confusing mishmash of initiators of all the suspend variants. Can i drive 
> this by echoing to /sys/power/state?)

From subsequent emails, I think you already got your answer, but just in
case...

Yes, if you enabled "Replace swsusp by default" and you already had it
set up for getting swsusp to resume. If not, and you're using an
initrd/ramfs, you'll need to modify it to run

   echo > /sys/power/suspend2/do_resume

after /sys and /proc are mounted but prior to mounting / and so on.

Regards,

Nigel



Re: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues

2007-04-18 Thread Florin Iucha

On Wed, Apr 18, 2007 at 10:11:46AM -0400, Trond Myklebust wrote:
> Do you have a copy of wireshark or ethereal on hand? If so, could you
> take a look at whether or not any NFS traffic is going between the
> client and server once the hang happens?

I used the following command 

   tcpdump -w nfs-traffic -i eth0 -vv -tt dst port nfs

to capture

   http://iucha.net/nfs/21-rc7-nfs4/nfs-traffic.bz2

I started the capture before starting the copy and left it to run for
a few minutes after the traffic slowed to a crawl.

The iostat and vmstat are at:

   http://iucha.net/nfs/21-rc7-nfs4/iostat
   http://iucha.net/nfs/21-rc7-nfs4/vmstat

It seems that my original problem report had a big mistake!  There is
no hang, but at some point the write slows down to a trickle (from
40,000 blocks/s to 22 blocks/s) as can be seen from the iostat log.

Regards,
florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163


Re: 2.6.21-rc6-mm1 ATA HPT37x regression

2007-04-18 Thread John Stoffel

> "John" == John Stoffel <[EMAIL PROTECTED]> writes:

>>> > Ok, so do I need to do anything special with the next -mm release and
>>> > the next version?
>>> 
>>> Well, let Alan decide that (2Alan: and I said that HPT code is bogus :-).

Alan> Try drivers/ide/pci/hpt366 - if that works grab a dmesg and let
Alan> me know.  It means that Sergei's DPLL sync code seems to work
Alan> better than the vendor code and its time to swap it over.

John> Ok, I'll give that a whirl under 2.6.21-rc7 tonight.  I'll build them
John> in modular so I can switch around more easily.  I hope.  :]

John> Ok, here's the dmesg output using the hpt366 old IDE driver,
John> 2.6.21-rc7, SMP: 

John> [  160.926355] HPT302: IDE controller at PCI slot :03:06.0
John> [  160.928030] ACPI: PCI Interrupt :03:06.0[A] -> GSI 18 (level, low) -> IRQ 18
John> [  160.931212] HPT302: chipset revision 1
John> [  160.932801] HPT302: DPLL base: 66 MHz, f_CNT: 100, assuming 33 MHz PCI
John> [  160.941157] HPT302: using 66 MHz DPLL clock
John> [  160.942646] HPT302: 100% native mode on irq 18
John> [  160.943918] ide2: BM-DMA at 0xe800-0xe807, BIOS settings: hde:DMA, hdf:pio
John> [  160.946636] ide3: BM-DMA at 0xe808-0xe80f, BIOS settings: hdg:DMA, hdh:pio
John> [  160.949439] Probing IDE interface ide2...
John> [  161.213560] hde: WDC WD1200JB-00CRA1, ATA DISK drive
John> [  161.828020] ide2 at 0xecf8-0xecff,0xecf2 on irq 18
John> [  161.829616] Probing IDE interface ide3...
John> [  162.094086] hdg: WDC WD1200JB-00EVA0, ATA DISK drive
John> [  162.709002] ide3 at 0xece0-0xece7,0xecda on irq 18


John> Which looks ok to me I guess.  It found my MD disks on there and
John> assembled them, eventually.  *grin*

John> I'll reboot and send out the corresponding ATA HPT37x driver dmesg...

And here's the output (much more verbose!) from the hpt37x ATA driver:

[  158.712007] hpt37x: HPT302: Bus clock 33MHz.
[  158.713390] ACPI: PCI Interrupt :03:06.0[A] -> GSI 18 (level, low) -> IRQ 18
[  158.716254] ata5: PATA max UDMA/133 cmd 0x0001ecf8 ctl 0x0001ecf2 bmdma 0x0001e800 irq 18
[  158.719019] ata6: PATA max UDMA/133 cmd 0x0001ece0 ctl 0x0001ecda bmdma 0x0001e808 irq 18
[  158.722257] scsi7 : pata_hpt37x
[  158.878133] ata5.00: ATA-5: WDC WD1200JB-00CRA1, 17.07W17, max UDMA/100
[  158.879576] ata5.00: 234441648 sectors, multi 16: LBA 
[  158.880934] Find mode for 12 reports C829C62
[  158.882240] Find mode for DMA 69 reports 1C6DDC62
[  158.888152] ata5.00: configured for UDMA/100
[  158.889437] scsi8 : pata_hpt37x
[  158.900338] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
[  158.901660] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[  159.047026] ata6.00: ATA-6: WDC WD1200JB-00EVA0, 15.05R15, max UDMA/100
[  159.048412] ata6.00: 234441648 sectors, multi 16: LBA48 
[  159.050008] Find mode for 12 reports C829C62
[  159.051371] Find mode for DMA 69 reports 1C6DDC62
[  159.057079] ata6.00: configured for UDMA/100
[  159.063655] scsi 7:0:0:0: Direct-Access ATA  WDC WD1200JB-00C 17.0 PQ: 0 ANSI: 5
[  159.067506] SCSI device sdi: 234441648 512-byte hdwr sectors (120034 MB)
[  159.069004] sdi: Write Protect is off
[  159.070412] sdi: Mode Sense: 00 3a 00 00
[  159.070487] SCSI device sdi: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  159.073427] SCSI device sdi: 234441648 512-byte hdwr sectors (120034 MB)
[  159.074882] sdi: Write Protect is off
[  159.076262] sdi: Mode Sense: 00 3a 00 00
[  159.076339] SCSI device sdi: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  159.079097]  sdi: sdi1
[  159.097634] sd 7:0:0:0: Attached scsi disk sdi
[  159.099212] sd 7:0:0:0: Attached scsi generic sg9 type 0
[  159.102344] scsi 8:0:0:0: Direct-Access ATA  WDC WD1200JB-00E 15.0 PQ: 0 ANSI: 5
[  159.106197] SCSI device sdj: 234441648 512-byte hdwr sectors (120034 MB)
[  159.107722] sdj: Write Protect is off
[  159.109188] sdj: Mode Sense: 00 3a 00 00
[  159.109271] SCSI device sdj: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  159.112455] SCSI device sdj: 234441648 512-byte hdwr sectors (120034 MB)
[  159.114094] sdj: Write Protect is off
[  159.115870] sdj: Mode Sense: 00 3a 00 00
[  159.115943] SCSI device sdj: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  159.118965]  sdj: sdj1
[  159.138036] sd 8:0:0:0: Attached scsi disk sdj
[  159.139682] sd 8:0:0:0: Attached scsi generic sg10 type 0



In both cases, my RAID1 disks are found and come up cleanly, which is
good.  Thanks for all the work you guys have done on the IDE stuff, as
well as the new libATA stuff.

Let me know if you need more testing done here, I've only got a
scratch volume on this raid set.

John

Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Nigel Cunningham

Hi.

On Wed, 2007-04-18 at 18:56 -0400, Bob Picco wrote:
> Ingo Molnar wrote:[Wed Apr 18 2007, 06:02:28PM EDT]
> > 
> > * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > 
> > > > although probably your suspend2 problem is still not fixed, it's 
> > > > worth a try nevertheless. Which suspend2 patch did you apply, and 
> > > > was it against -rc6 or -rc7?
> > > 
> > > You are right again. ;-)
> > > 
> > > Linux 2.6.21-rc7
> > > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > > CFS v3 (without any additional patches)
> > > 
> > > And it still hangs on suspend.
> > 
> > what's the easiest way for me to try suspend2? Apply the patch, reboot 
> > into the kernel, then execute what command to suspend? (there's a 
> > confusing mishmash of initiators of all the suspend variants. Can i drive 
> > this by echoing to /sys/power/state?)
> > 
> > Ingo
> I had hoped to collect more data with CFS V2. It crashes in
> scale_nice_down for s2ram when attempting to disable_nonboot_cpus. 
> So part of traceback looks like (typed by hand with obvious omissions):
> 
> scale_nice_down
> update_stats_wait_end - not shown in traceback because inlined
> pick_next_task_fair
> migration_call
> task_rq_lock
> notifier_call_chain
> _cpu_down
> disable_nonboot_cpus
> ...
> 
> This is standard -rc7 with V2 CFS applied. It could be a completely
> unrelated issue. I'll attempt to debug further tomorrow.

That - and Christian's other reply with the jpg - look to me more like
this is an interaction between CFS and cpu hotplugging than Suspend2
itself. Can you also reproduce this with swsusp?

Regards,

Nigel



PCI: Unable to handle 64-bit address space for

2007-04-18 Thread mchu

Hi all,

Does anyone have an idea about this: why is it displayed on boot, and how can I
fix it, or at least suppress the message?

Using 2.6.9-42.ELsmp.

PCI: Probing PCI hardware (bus 00)
PCI: Ignoring BAR0-3 of IDE controller :00:1f.1
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for


Thanks for the help,

Michael

Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Nigel Cunningham

Hi.

On Thu, 2007-04-19 at 00:22 +0200, Christian Hesse wrote:
> On Thursday 19 April 2007, Ingo Molnar wrote:
> > * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > > although probably your suspend2 problem is still not fixed, it's
> > > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > > was it against -rc6 or -rc7?
> > >
> > > You are right again. ;-)
> > >
> > > Linux 2.6.21-rc7
> > > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > > CFS v3 (without any additional patches)
> > >
> > > And it still hangs on suspend.
> >
> > what's the easiest way for me to try suspend2? Apply the patch, reboot
> > into the kernel, then execute what command to suspend? (there's a
> > confusing mishmash of initiators of all the suspend variants. Can i drive
> > this by echoing to /sys/power/state?)
> 
> Perhaps you have to install suspend2-userui as well for the output (I'm not 
> sure whether it works without). Then you can trigger the suspend by echoing 
> to /sys/power/suspend2/do_suspend.
> Useful information can be found in the Howto:
> 
> http://www.suspend2.net/HOWTO
> 
> I dropped some ccs to not abuse Linus and friends.

You can suspend and resume without it.

Regards,

Nigel



Re: 2.6.21-rc6-mm1 ATA HPT37x regression

2007-04-18 Thread John Stoffel

> "John" == John Stoffel <[EMAIL PROTECTED]> writes:

>>> > Ok, so do I need to do anything special with the next -mm release and
>>> > the next version?
>>> 
>>> Well, let Alan decide that (2Alan: and I said that HPT code is bogus :-).

Alan> Try drivers/ide/pci/hpt366 - if that works grab a dmesg and let
Alan> me know.  It means that Sergei's DPLL sync code seems to work
Alan> better than the vendor code and its time to swap it over.

John> Ok, I'll give that a whirl under 2.6.21-rc7 tonight.  I'll build them
John> in modular so I can switch around more easily.  I hope.  :]

Ok, here's the dmesg output using the hpt366 old IDE driver,
2.6.21-rc7, SMP: 

[  160.926355] HPT302: IDE controller at PCI slot :03:06.0
[  160.928030] ACPI: PCI Interrupt :03:06.0[A] -> GSI 18 (level, low) -> IRQ 18
[  160.931212] HPT302: chipset revision 1
[  160.932801] HPT302: DPLL base: 66 MHz, f_CNT: 100, assuming 33 MHz PCI
[  160.941157] HPT302: using 66 MHz DPLL clock
[  160.942646] HPT302: 100% native mode on irq 18
[  160.943918] ide2: BM-DMA at 0xe800-0xe807, BIOS settings: hde:DMA, hdf:pio
[  160.946636] ide3: BM-DMA at 0xe808-0xe80f, BIOS settings: hdg:DMA, hdh:pio
[  160.949439] Probing IDE interface ide2...
[  161.213560] hde: WDC WD1200JB-00CRA1, ATA DISK drive
[  161.828020] ide2 at 0xecf8-0xecff,0xecf2 on irq 18
[  161.829616] Probing IDE interface ide3...
[  162.094086] hdg: WDC WD1200JB-00EVA0, ATA DISK drive
[  162.709002] ide3 at 0xece0-0xece7,0xecda on irq 18


Which looks ok to me I guess.  It found my MD disks on there and
assembled them, eventually.  *grin*

I'll reboot and send out the corresponding ATA HPT37x driver dmesg...

John


PCI Express MMCONFIG and BIOS Bug messages..

2007-04-18 Thread Robert Hancock

I've seen a lot of systems (including brand new Xeon-based servers from 
IBM and HP) that output messages on boot like:


PCI: BIOS Bug: MCFG area at f000 is not E820-reserved
PCI: Not using MMCONFIG.

As I understand it, this is sort of a sanity check mechanism to make 
sure the MCFG address reported is remotely reasonable and intended to be 
used as such. Problem is, I doubt the BIOS authors would agree that this 
constitutes a bug. Microsoft is providing a lot of the direction for 
BIOS writers, and have a look at this presentation "PCI Express, 
Windows, And The Legacy Transition" from back in 2004:


http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/TW04047_WINHEC2004.ppt

On page 14, "Existing Windows - Reserve MMCONFIG":

Existing Windows versions won’t understand MCFG table
* Backwards-compatible range reservation must be used
Report range in ACPI "Motherboard Resources"
* _CRS of PNP0C02 node
* PNP0C02 must be at \_SB scope
* Range must be marked as consumed
Do not include range in _CRS of PCI root bus
* If included, OS will assume that this range can be allocated to devices
E820 table/EFI memory map
* Not necessary to describe MMConfig here
* For Windows, these are used to describe RAM
* No harm in including range as reserved either

So Microsoft is explicitly telling the BIOS developers that there is no 
need to reserve the MMCONFIG space in the E820 table because Windows 
doesn't care. On that basis it doesn't seem like a valid check to 
require it to be so reserved, then.
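
For readers unfamiliar with the check being discussed, here is a minimal
user-space sketch of the idea: walk a memory map and report whether an
address range is fully covered by entries of one type. The struct, the
function name and the addresses in sample_map are assumptions made up for
illustration; this is not the kernel's code, and it assumes the map is
sorted by start address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum region_type { REGION_RAM = 1, REGION_RESERVED = 2 };

/* One memory-map entry covering [start, end). Hypothetical layout. */
struct region {
    uint64_t start;
    uint64_t end;
    enum region_type type;
};

/* Hypothetical map: some low RAM plus one reserved window. */
static const struct region sample_map[] = {
    { 0x00000000ULL, 0x0009f800ULL, REGION_RAM },
    { 0xe0000000ULL, 0xf0000000ULL, REGION_RESERVED },
};

/*
 * Return 1 if every byte of [start, end) lies inside entries of the
 * given type, 0 otherwise. Entries must be sorted by start address.
 */
static int range_all_mapped(const struct region *map, size_t n,
                            uint64_t start, uint64_t end,
                            enum region_type type)
{
    size_t i;

    assert(start < end);
    for (i = 0; i < n; i++) {
        if (map[i].type != type)
            continue;
        /* Skip entries that do not overlap the range at all. */
        if (map[i].end <= start || map[i].start >= end)
            continue;
        /* Entry covers the current start: advance past it. */
        if (map[i].start <= start)
            start = map[i].end;
        if (start >= end)
            return 1;
    }
    return 0;
}
```

With this model, a window inside the reserved entry passes and a window
outside it fails, which is the situation the boot message complains about.
The proposal above would amount to running the same kind of test against
the ACPI motherboard resources instead of the E820 map.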


Really, I think we should be basing this check on whether the 
corresponding memory range is reserved in the ACPI resources, like 
Windows expects. This does require putting more fingers into ACPI from 
this early boot stage, though..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: CPU_IDLE prevents resuming from STR [was: Re: 2.6.21-rc6-mm1]

2007-04-18 Thread Shaohua Li

On Wed, 2007-04-18 at 19:00 -0400, Joshua Wise wrote:
> On Tue, 17 Apr 2007, Shaohua Li wrote:
> > Looks there is init order issue of sysfs files. The new refreshed patch
> > should fix your bug.
> 
> Yes, that did fix the hang on resume from STR -- that now works fine.
> 
> However:
> [EMAIL PROTECTED]:/sys/devices/system/cpu/cpuidle$ cat available_drivers 
> current_driver
> 
> 
> [EMAIL PROTECTED]:/sys/devices/system/cpu/cpuidle$ cat available_governors 
> current_governor
> ladder
> ladder
That's correct; it looks like you didn't compile the ACPI processor module.

Thanks,
Shaohua

Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-18 Thread Aubrey Li


On 4/18/07, David Howells <[EMAIL PROTECTED]> wrote:

Aubrey Li <[EMAIL PROTECTED]> wrote:

> Here, in the attachment I wrote a small test app. Please correct if
> there is anything wrong, and feel free to improve it.

Okay... I have that working... probably.  I don't know what output it's
supposed to produce, but I see this:

# /packet-mmap/sample_packet_mmap
00-00-00-01-00-00-00-8a-00-00-00-8a-00-42-00-50-
38-43-13-a0-00-07-ff-3c-00-00-00-00-00-00-00-00-
00-11-08-00-00-00-00-01-00-01-00-06-00-d0-b7-de-
32-7b-00-00-00-00-00-00-00-00-00-00-00-00-00-00-
00-00-00-90-cc-a2-75-6b-00-d0-b7-de-32-7b-08-00-
45-00-00-7c-00-00-40-00-40-11-b4-13-c0-a8-02-80-
c0-a8-02-8d-08-01-03-20-00-68-8e-65-7f-5b-7e-03-
00-00-00-01-00-00-00-00-00-00-00-00-00-00-00-00-
00-00-00-00-00-00-00-00-00-00-00-01-00-00-81-a4-
00-00-00-01-00-00-00-00-00-00-00-00-00-1d-b8-86-
00-00-10-00-ff-ff-ff-ff-00-00-0e-f0-00-00-09-02-
01-cb-03-16-46-26-38-0d-00-00-00-00-46-26-38-1e-
00-00-00-00-46-26-38-1e-00-00-00-00-00-00-00-00-
00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00- [repeated]

Does that look reasonable?


Yes, it's reasonable for me, as long as your
host IP is 192.168.2.128
and
target IP is 192.168.2.141
See below


00-90-cc-a2-75-6b-|___ MAC Address
00-d0-b7-de-32-7b-|
08-00            Type: IP
45-00            Ver, IHL, TOS
00-7c            IP.total.length
00-00-
40-00-
40               TTL
11               UDP protocol
b4-13            Checksum
c0-a8-02-80 ---  Source IP: 192.168.2.128
c0-a8-02-8d ---  Dest IP: 192.168.2.141
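
The same decode can be done mechanically. The sketch below is not part of
the test app; the struct and function names are made up for illustration.
It takes the 20 IPv4 header bytes from the dump (starting at "45-00"),
pulls out the fields listed above, and sums the header in one's-complement
arithmetic, where a result of 0xffff means the checksum field is
consistent with the rest of the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* The 20 IPv4 header bytes copied from the dump, starting at "45-00". */
static const uint8_t sample_ip_header[20] = {
    0x45, 0x00, 0x00, 0x7c, 0x00, 0x00, 0x40, 0x00,
    0x40, 0x11, 0xb4, 0x13, 0xc0, 0xa8, 0x02, 0x80,
    0xc0, 0xa8, 0x02, 0x8d,
};

/* Hypothetical summary of the fields discussed in the mail. */
struct ip_summary {
    unsigned version;
    unsigned ihl_words;   /* header length in 32-bit words */
    unsigned total_len;
    unsigned ttl;
    unsigned protocol;    /* 17 is UDP */
    char src[16];
    char dst[16];
};

/* One's-complement sum of the header, folded to 16 bits. */
static uint16_t ip_header_sum(const uint8_t *h, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(len % 2 == 0);
    for (i = 0; i < len; i += 2)
        sum += ((uint32_t)h[i] << 8) | h[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Extract the fields from a header without options (20 bytes). */
static struct ip_summary decode_ip_header(const uint8_t *h)
{
    struct ip_summary s;

    memset(&s, 0, sizeof(s));
    s.version = h[0] >> 4;
    s.ihl_words = h[0] & 0x0f;
    s.total_len = ((unsigned)h[2] << 8) | h[3];
    s.ttl = h[8];
    s.protocol = h[9];
    snprintf(s.src, sizeof(s.src), "%u.%u.%u.%u",
             h[12], h[13], h[14], h[15]);
    snprintf(s.dst, sizeof(s.dst), "%u.%u.%u.%u",
             h[16], h[17], h[18], h[19]);
    return s;
}
```

Calling decode_ip_header(sample_ip_header) gives version 4, a header
length of 5 words, total length 124 (0x7c), TTL 64, protocol 17 and the
two addresses shown above.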

snip--



I've attached the preliminary patch.


Thanks, I'll take a look and try to see if I can give some feedback.

-Aubrey


Note four things about it:

 (1) I've had to add the get_unmapped_area() op to the proto_ops struct, but
 I've only done it for CONFIG_MMU=n as making it available for CONFIG_MMU=y
 could cause problems.

 (2) There's a race between packet_get_unmapped_area() being called and
 packet_mmap() being called.

 (3) I've added an extra check into packet_set_ring() to make sure the caller
 isn't asking for a combination of buffer size and count that will exceed
 ULONG_MAX.  This protects a multiply done elsewhere.

 (4) The entire data buffer is allocated as one contiguous lump in NOMMU-mode.

David

---
[PATCH] NOMMU: Support mmap() on AF_PACKET sockets

From: David Howells <[EMAIL PROTECTED]>

Support mmap() on AF_PACKET sockets in NOMMU-mode kernels.

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 include/linux/net.h    |    7 +++
 include/net/sock.h     |    8 +++
 net/core/sock.c        |   10 
 net/packet/af_packet.c |  118 
 net/socket.c           |   77 +++
 5 files changed, 219 insertions(+), 1 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index 4db21e6..9e77cf6 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -161,6 +161,11 @@ struct proto_ops {
int (*recvmsg)   (struct kiocb *iocb, struct socket *sock,
  struct msghdr *m, size_t total_len,
  int flags);
+#ifndef CONFIG_MMU
+   unsigned long   (*get_unmapped_area)(struct file *file, struct socket *sock,
+                    unsigned long addr, unsigned long len,
+                    unsigned long pgoff, unsigned long flags);
+#endif
int (*mmap)  (struct file *file, struct socket *sock,
  struct vm_area_struct * vma);
ssize_t (*sendpage)  (struct socket *sock, struct page *page,
@@ -191,6 +196,8 @@ extern int   sock_sendmsg(struct socket *sock, struct msghdr *msg,
 extern int  sock_recvmsg(struct socket *sock, struct msghdr *msg,
  size_t size, int flags);
 extern int  sock_map_fd(struct socket *sock);
+extern void sock_make_mappable(struct socket *sock,
+   unsigned long prot);
 extern struct socket *sockfd_lookup(int fd, int *err);
 #define sockfd_put(sock) fput(sock->file)
 extern int  net_ratelimit(void);
diff --git a/include/net/sock.h b/include/net/sock.h
index 2c7d60c..d91edea 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -841,6 +841,14 @@ extern int  sock_no_sendmsg(struct kiocb *, struct socket *,
struct msghdr *, size_t);
 extern int  sock_no_recvmsg(struct kiocb *, struct socket *,

Re: problem with

2007-04-18 Thread Robert Hancock


liangbowen wrote:

Hi

I compiled the following code with gcc under FC2 :

#include 
main()
{
struct semaphore sum;

}

It doesn't compile, saying "storage size of `sem'
isn't known".

and I looked inside asm/semaphore.h, I saw:
#ifndef I386_SEMAPHORE_H
#define I386_SEMAPHORE_H

#include 

#endif

Did I miss something? Please guide me how to fix it.

Sincerely


You're trying to use a kernel data structure in a user-space program. 
Don't. The definitions in that header are inside #ifdef __KERNEL__ and 
so the provided userspace headers remove that part.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [ck] Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Peter Williams


Chris Friesen wrote:

Mark Glines wrote:


One minor question: is it even possible to be completely fair on SMP?
For instance, if you have a 2-way SMP box running 3 applications, one of
which has 2 threads, will the threaded app have an advantage here?  (The
current system seems to try to keep each thread on a specific CPU, to
reduce cache thrashing, which means threads and processes alike each
get 50% of the CPU.)


I think the ideal in this case would be to have both threads on one cpu, 
with the other app on the other cpu.  This gives inter-process fairness 
while minimizing the amount of task migration required.


Solving this sort of issue was one of the reasons for the smpnice patches.



More interesting is the case of three processes on a 2-cpu system.  Do 
we constantly migrate one of them back and forth to ensure that each of 
them gets 66% of a cpu?


Depends how keen you are on fairness.  Unless the processes are long-term, 
continuously active tasks that never sleep, it's probably not an issue, as 
they'll probably move around enough for each of them to get 66% over the 
long term.


Exact load balancing for real workloads (where tasks are coming and 
going, sleeping and waking semi-randomly and over relatively brief 
periods) is probably unattainable because by the time you've worked out 
the ideal placement of the currently runnable tasks on the available 
CPUs it's all changed and the solution is invalid.  The best you can 
hope for is that the change isn't so great as to completely invalidate 
the solution, and that the changes you make as a result are an 
improvement on the current allocation of processes to CPUs.


The above probably doesn't hold for some systems such as those large 
super computer jobs that run for several days but they're probably best 
served by explicit allocation of processes to CPUs using the process 
affinity mechanism.


Peter
--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

Re: Upgraded to 2.6.20.7 - positives

2007-04-18 Thread Robert Hancock


Chuck Ebbert wrote:

Denis Vlasenko wrote:

* From make menuconfig questions it looks like SATA/PATA
  rewrite (in the form of libata) is almost finished. Hehe,
  untangling IDE mess was quite a feat, and Jeff did it. Kudos.



ADMA mode on nvidia chipsets still seems broken despite massive
amount of SATA fixes backported from 2.6.21...


News to me.. please post details.

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [Patch -mm 3/3] RFC: Introduce kobject->owner for refcounting.

2007-04-18 Thread Rusty Russell

On Wed, 2007-04-18 at 11:20 -0400, Alan Stern wrote:
> On Wed, 18 Apr 2007, Rusty Russell wrote:
> 
> > Hi Alan,
> > 
> > Your assertion is correct.  I haven't studied the driver core, so I
> > might be off-base here, but you'll note that if the module references
> > the core kmalloc'ed object rather than the other way around it can be
> > done safely.  The core can also reference the module, but it must be
> > able to live without it once it's gone (eg. by returning -ENOENT).
> 
> "Live without it once it's gone..."  Do you mean once the object is gone 
> or once the module is gone?  The core in general has no way to know when 
> the module is gone; all it knows about is the object.  The trouble arises 
> when the module is gone (whether the core knows it or not) but the object 
> is still present.

Hi Alan,

I meant that the module is gone: it has told the object (via
unregister_xxx) that it's gone.

> > A really poor example is below:
...
> The example is fine as far as it goes, but it assumes that all
> interactions with the underlying r->foo object can be done under a
> spinlock.  Of course this isn't true in general.

There are certainly other ways of doing it, such as a mutex, a refcnt &
completion (for function pointers), or disabling preemption across the
access and using stop_machine().  Of course, these add complexity.
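
To make the pattern concrete, here is a user-space sketch of it. This is
not the elided example from earlier in the thread; every name in it is
hypothetical, and a pthread mutex stands in for whichever kernel lock
would be used. The long-lived "core" object holds a pointer into the
module, unregistering clears that pointer under the lock, and the core
returns -ENOENT once the module is gone.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Operations supplied by the "module". Hypothetical interface. */
struct foo_ops {
    int (*get_value)(void);
};

/* Long-lived core object; in the kernel it would be kmalloc'ed. */
struct registration {
    pthread_mutex_t lock;
    const struct foo_ops *foo;   /* NULL once the module is gone */
};

static struct registration reg = { PTHREAD_MUTEX_INITIALIZER, NULL };

/* Stand-in for something implemented inside the module. */
static int module_get_value(void)
{
    return 42;
}

static const struct foo_ops module_ops = { module_get_value };

/* Called by the module when it loads. */
static void register_foo(struct registration *r, const struct foo_ops *ops)
{
    assert(ops != NULL);
    pthread_mutex_lock(&r->lock);
    r->foo = ops;
    pthread_mutex_unlock(&r->lock);
}

/* Called by the module before it goes away. */
static void unregister_foo(struct registration *r)
{
    pthread_mutex_lock(&r->lock);
    r->foo = NULL;
    pthread_mutex_unlock(&r->lock);
}

/* Core entry point: must keep working after the module has left. */
static int core_get_value(struct registration *r, int *out)
{
    int ret = -ENOENT;

    pthread_mutex_lock(&r->lock);
    if (r->foo) {
        *out = r->foo->get_value();
        ret = 0;
    }
    pthread_mutex_unlock(&r->lock);
    return ret;
}
```

The call into the module happens with the lock held, which is exactly the
limitation pointed out above: it only works when the callback may run
under that lock, and anything that needs to sleep calls for one of the
heavier schemes.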

This is the reason that I've always disliked module removal.  We have a
lot of code to deal with it and it has awkward semantics (unless --wait
is used).  OTOH, I'm not a fan of the network approach, either: I feel
that bringing up an interface should bump the refcnt of the module which
implements that interface.  Currently taking out e1000 will just kill my
eth0.

Cheers,
Rusty.

> 
> Alan Stern


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Linus Torvalds



On Wed, 18 Apr 2007, Davide Libenzi wrote:
> 
> I know, we agree there. But that did not fit my "Pirates of the Caribbean" 
> quote :)

Ahh, I'm clearly not cultured enough, I didn't catch that reference.

Linus "yes, I've seen the movie, but it
 apparently left more of a mark in other people" Torvalds

Re: [ck] Announce - Staircase Deadline cpu scheduler v0.41

2007-04-18 Thread Con Kolivas

On Thursday 19 April 2007 09:59, Con Kolivas wrote:
> Since there is so much work currently ongoing with alternative cpu
> schedulers, as a standard for comparison with the alternative virtual
> deadline fair designs I've addressed a few issues in the Staircase Deadline
> cpu scheduler which improve behaviour likely in a noticeable fashion and
> released version 0.41.
>
> http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.41.patch
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.41.patch
>
> and an incremental for those on 0.40:
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7/sched-implement-staircase-deadline-scheduler-further-improvements.patch
>
> Remember to renice X to -10 for nicest desktop behaviour :)
>
> Have fun.

Oops forgot to cc a few people

Nick you said I should still have something to offer so here it is.
Peter you said you never saw this design (it's a dual array affair sorry).
Gene and Willy you were some of the early testers that noticed the advantages 
of the earlier designs,
Matt you did lots of great earlier testing.
WLI you inspired a lot of design ideas.
Mike you were the stick.
And a few others I've forgotten to mention and include.

-- 
-ck

Re: [PATCH][RFC] Kill off legacy power management stuff.

2007-04-18 Thread Robert P. J. Day

On Wed, 18 Apr 2007, Dave Jones wrote:

> On Wed, Apr 18, 2007 at 05:23:15PM -0400, Len Brown wrote:
>
>  > > p.p.s.  patch improvements that will let me avoid doing any of that
>  > > myself always welcome. :-)
>  >
>  > well, I'm sorry that I've known about the APM issue for a long time
>  > and done nothing about it.  I did ping davej when he broke it,
>  > but his to-do list is probably even longer than mine.
>
> ping timeout.
>
> I don't recall too many of the details surrounding those changes,
> but I certainly won't stand in the way of anyone trying to fix it.
> It sounds like you and Robert are on top of it, or do you want me to
> poke at it ?

well, before i get even more confused by what was (once upon a time) a
fairly straightforward removal patch, the first obvious question is --
are there *any* circumstances that *require* a config selection of
CONFIG_PM_LEGACY, as opposed to a selection of APM and/or ACPI?  if
there are, then it can't simply be removed.  my original patch
submission was based on the assumption that absolutely no one needed
the legacy stuff anymore and absolutely everything related to it could
be scrapped.

so, first things first:  what *needs* legacy PM at the moment?

rday

p.s.  i'm confused by the header file include/linux/pm_legacy.h,
especially this part:

#ifdef CONFIG_PM_LEGACY
...
# else /* CONFIG_PM_LEGACY */

#define PM_IS_ACTIVE() 0
...
#endif
===

  so the macro "PM_IS_ACTIVE()" represents whether *legacy* PM has
been selected.  in other words, it makes no (apparent) sense that the
value of that macro would represent some kind of contention mechanism
between APM and ACPI, which is entirely independent from the legacy
stuff.  right?

-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Peter Williams


Linus Torvalds wrote:


On Wed, 18 Apr 2007, Matt Mackall wrote:

On Wed, Apr 18, 2007 at 07:48:21AM -0700, Linus Torvalds wrote:
And "fairness by euid" is probably a hell of a lot easier to do than 
trying to figure out the wakeup matrix.

For the record, you actually don't need to track a whole NxN matrix
(or do the implied O(n**3) matrix inversion!) to get to the same
result.


I'm sure you can do things differently, but the reason I think "fairness 
by euid" is actually worth looking at is that it's pretty much the 
*identical* issue that we'll have with "fairness by virtual machine" and a 
number of other "container" issues.


The fact is:

 - "fairness" is *not* about giving everybody the same amount of CPU time 
   (scaled by some niceness level or not). Anybody who thinks that is 
   "fair" is just being silly and hasn't thought it through.


 - "fairness" is multi-level. You want to be fair to threads within a 
   thread group (where "process" may be one good approximation of what a 
   "thread group" is, but not necessarily the only one).


   But you *also* want to be fair in between those "thread groups", and 
   then you want to be fair across "containers" (where "user" may be one 
   such container).


So I claim that anything that cannot be fair by user ID is actually really 
REALLY unfair. I think it's absolutely humongously STUPID to call 
something the "Completely Fair Scheduler", and then just be fair on a 
thread level. That's not fair AT ALL! It's the antithesis of being fair!


So if you have 2 users on a machine running CPU hogs, you should *first* 
try to be fair among users. If one user then runs 5 programs, and the 
other one runs just 1, then the *one* program should get 50% of the CPU 
time (the user's fair share), and the five programs should get 10% of CPU 
time each. And if one of them uses two threads, each thread should get 5%.


So you should see one thread get 50% CPU (single thread of one user), 4 
threads get 10% CPU (their fair share of that user's time), and 2 threads 
get 5% CPU (the fair share within that thread group!).


Any scheduling argument that just considers the above to be "7 threads 
total" and gives each thread 14% of CPU time "fairly" is *anything* but 
fair. It's a joke if that kind of scheduler then calls itself CFS!
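
The arithmetic in this example can be written out as a small sketch. It
only illustrates the numbers quoted above under the assumption of equal
weights at every level (users, then programs, then threads); the names
are made up and it is not scheduler code.

```c
#include <assert.h>

/* Share given to each of n equally weighted children of a parent. */
static double split_equally(double parent_share, unsigned n_children)
{
    assert(n_children > 0);
    return parent_share / n_children;
}

/* Per-thread CPU percentages for the example in the mail. */
struct shares {
    double lone_program;     /* the single program of the first user */
    double single_threaded;  /* each 1-thread program of the second user */
    double per_thread;       /* each thread of the 2-thread program */
    double flat;             /* what a flat per-thread split would give */
};

static struct shares example_shares(void)
{
    struct shares s;
    double per_user = split_equally(100.0, 2);     /* two users */
    double per_program = split_equally(per_user, 5);

    s.lone_program = split_equally(per_user, 1);
    s.single_threaded = per_program;
    s.per_thread = split_equally(per_program, 2);  /* two threads */
    s.flat = split_equally(100.0, 7);              /* 7 threads total */
    return s;
}
```

The hierarchical split gives 50, 10 and 5 percent, and 50 + 4*10 + 2*5
adds back up to 100, while the flat split gives every thread a little
over 14 percent.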


And yes, that's largely what the current scheduler will do, but at least 
the current scheduler doesn't claim to be fair! So the current scheduler 
is a lot *better* if only in the sense that it doesn't make ridiculous 
claims that aren't true!


Linus


Sounds a lot like the PLFS (process level fair sharing) scheduler in 
Aurema's ARMTech (for whom I used to work).  The "fair" in the title is 
a bit misleading as it's all about unfair scheduling in order to meet 
specific policies.  But it's based on the principle that if you can 
allocate CPU bandwidth "fairly" (which really means in proportion to 
the entitlement each process is allocated) then you can allocate CPU 
bandwidth "fairly" between higher level entities such as process 
groups, users groups and so on by subdividing the entitlements downwards.


The tricky part of implementing this was the fact that not all entities 
at the various levels have sufficient demand for CPU bandwidth to use 
their entitlements and this in turn means that the entities above them 
will have difficulty using their entitlements even if others of their 
subordinates have sufficient demand (because their entitlements will be 
too small).  The trick is to have a measure of each entity's demand for 
CPU bandwidth and use that to modify the way entitlement is divided 
among subordinates.
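
One generic way to do that kind of demand-aware division is sketched
below. It is an illustration only, with hypothetical names, and makes no
claim to be how PLFS or ARMTech does it: capacity is split in proportion
to weight, any entity whose demand is below its share is capped at its
demand, and the surplus is divided again among the rest.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTITIES 16

/*
 * Divide capacity among n entities in proportion to weight[], never
 * giving an entity more than demand[]. Results are written to alloc[].
 */
static void divide_by_demand(double capacity, const double *weight,
                             const double *demand, double *alloc, size_t n)
{
    int satisfied[MAX_ENTITIES] = { 0 };
    int changed;
    size_t i;

    assert(n <= MAX_ENTITIES);
    do {
        double remaining = capacity;
        double wsum = 0.0;

        /* What is left after the capped entities, and who shares it. */
        for (i = 0; i < n; i++) {
            assert(weight[i] > 0.0);
            if (satisfied[i])
                remaining -= alloc[i];
            else
                wsum += weight[i];
        }

        changed = 0;
        for (i = 0; i < n; i++) {
            double share;

            if (satisfied[i])
                continue;
            share = remaining * weight[i] / wsum;
            if (demand[i] <= share) {
                /* Cannot use its share: cap it and redistribute. */
                alloc[i] = demand[i];
                satisfied[i] = 1;
                changed = 1;
            } else {
                alloc[i] = share;
            }
        }
    } while (changed);
}
```

For three equally weighted entities with demands of 10, 100 and 100
sharing a capacity of 100, the first is capped at 10 and the other two
split the remaining 90 evenly, instead of each being held to a third.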


As a first guess, an entity's CPU bandwidth usage is an indicator of 
demand but doesn't take into account unmet demand due to tasks sitting 
on a run queue waiting for access to the CPU.  On the other hand, usage 
plus time waiting on the queue isn't a good measure of demand either 
(although it's probably a good upper bound) as it's unlikely that the 
task would have used the same amount of CPU as the waiting time if it 
had gone straight to the CPU.


But my main point is that it is possible to build schedulers that can 
achieve higher level scheduling policies.  Versions of PLFS work on 
Windows from user space by twiddling process priorities.  Part of my 
more recent work at Aurema had been involved in patching Linux's 
scheduler so that nice worked more predictably so that we could release 
a user space version of PLFS for Linux.  The other part was to add hard 
CPU bandwidth caps for processes so that ARMTech could enforce hard CPU 
bandwidth caps on higher level entities (as this can't be done without 
the kernel being able to do it at that level).


Peter
--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

Re: [PATCH] fix OOM killing processes wrongly thought MPOL_BIND

2007-04-18 Thread KAMEZAWA Hiroyuki

On Wed, 18 Apr 2007 20:35:22 +0100 (BST)
Hugh Dickins <[EMAIL PROTECTED]> wrote:

> I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog
> to see lots of other processes killed with "No available memory (MPOL_BIND)".
> memhog is killed correctly once we initialize nodemask in constrained_alloc().
> 

Thank you for catching this bug.

Acked-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>


> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
> Perhaps appropriate for 2.6.20-stable too - regression since 2.6.19.
> 
>  mm/oom_kill.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> --- 2.6.21-rc7/mm/oom_kill.c  2007-03-26 07:30:54.0 +0100
> +++ linux/mm/oom_kill.c   2007-04-18 20:18:21.0 +0100
> @@ -176,6 +176,8 @@ static inline int constrained_alloc(stru
>   struct zone **z;
>   nodemask_t nodes;
>   int node;
> +
> + nodes_clear(nodes);
>   /* node has memory ? */
>   for_each_online_node(node)
>   if (NODE_DATA(node)->node_present_pages)
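
To see why the missing initialisation matters, here is a simplified
user-space model. The function name and the 64-bit mask are assumptions
for illustration, and the real constrained_alloc() differs in detail. It
assumes the function sets a bit for every node with memory, clears the
bits of the nodes reachable through the zonelist, and treats any leftover
bit as a sign that a memory policy is restricting the allocation. The
start argument stands in for whatever happened to be on the stack.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 64

/*
 * Model of the node bookkeeping: returns the bits left over after
 * removing the nodes covered by the zonelist. Nonzero is read as
 * "allocation is constrained by a memory policy".
 */
static uint64_t leftover_nodes(uint64_t start,
                               const unsigned long *present_pages,
                               unsigned n_nodes, uint64_t zonelist_nodes)
{
    uint64_t nodes = start;  /* the patch makes this start at zero */
    unsigned node;

    assert(n_nodes <= MAX_NODES);

    /* Mark every node that has memory. */
    for (node = 0; node < n_nodes; node++)
        if (present_pages[node])
            nodes |= UINT64_C(1) << node;

    /* Remove the nodes the allocation is allowed to use. */
    nodes &= ~zonelist_nodes;
    return nodes;
}
```

Starting from zero, nodes 0 and 2 are set and then cleared again, so
nothing is left over. Starting from stale bits, the leftover is nonzero
even though no policy is in force, which matches the spurious
"No available memory (MPOL_BIND)" kills described above.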


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Davide Libenzi

On Wed, 18 Apr 2007, Linus Torvalds wrote:

> On Wed, 18 Apr 2007, Davide Libenzi wrote:
> > 
> > "Perhaps on the rare occasion pursuing the right course demands an act of 
> >  unfairness, unfairness itself can be the right course?"
> 
> I don't think that's the right issue.
> 
> It's just that "fairness" != "equal".
> 
> Do you think it "fair" to pay everybody the same regardless of how good a 
> job they do? I don't think anybody really believes that. 
> 
> Equating "fair" and "equal" is simply a very fundamental mistake. They're 
> not the same thing. Never have been, and never will.

I know, we agree there. But that did not fit my "Pirates of the Caribbean" 
quote :)



- Davide



Re: [NETLINK] Don't attach callback to a going-away netlink socket

2007-04-18 Thread David Miller

From: Pavel Emelianov <[EMAIL PROTECTED]>
Date: Wed, 18 Apr 2007 12:16:18 +0400

> The proposal is to make sock_orphan before detaching the callback
> in netlink_release() and to check for the sock to be SOCK_DEAD in
> netlink_dump_start() before setting a new callback.

As discussed in this thread there might be other ways to
approach this, but this fix is good for now.

Patch applied, thank you.

Re: 2.6.20.6 vanilla does't boot

2007-04-18 Thread Michal Jaegermann

On Wed, Apr 18, 2007 at 03:39:25PM -0400, Len Brown wrote:
> On Sunday 15 April 2007 11:50, Michal Jaegermann wrote:
> > 
> > A kernel derived from 2.6.21-rc6-git1 (2.6.20-1.3053.fc7.x86_64 from
> > Fedora "rawhide" to be more precise) did boot on the hardware in
> > question, though; but only when I gave it 'acpi=off'.  Without that
> > parameter it was getting stuck apparently when starting hotplug.
> > In that kernel case disks were accessed using pata_atiixp driver.
> 
> If "acpi=off" is necessary to boot the latest kernel, please
> report an ACPI bug:
> http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI

I am travelling at the moment, so what I can do is somewhat limited.
In particular, I cannot get access to the hardware in question.
But please see
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=232490
and the most recent comments there in particular.

> Please mention in the bug report what the latest working kernel was.

This is mentioned in the referenced report as well.

Michal

Re: [PATCH] utrace: remove exports

2007-04-18 Thread Jim Keniston

Christoph Hellwig wrote:
> 
> All the exports in utrace are totally unused, and not really something
> I'd want modules to use anyway :)
> 

Please leave the exports in place.

Very early in Documentation/utrace.txt, it says:
"The UTRACE is infrastructure code for tracing and controlling user
threads.  This is the foundation for writing tracing engines, which
can be loadable kernel modules."

If we can't use utrace to write ad hoc instrumentation modules (i.e.,
because utrace_attach(), utrace_detach(), etc. are no longer exported),
then utrace's usefulness is greatly reduced.

Jim Keniston
IBM LTC

> 
> Signed-off-by: Christoph Hellwig <[EMAIL PROTECTED]>
> 
...
> Index: linux-2.6/kernel/utrace.c
> ===
> --- linux-2.6.orig/kernel/utrace.c2007-04-13 15:56:28.0 +0200
> +++ linux-2.6/kernel/utrace.c 2007-04-13 15:56:39.0 +0200
> @@ -490,7 +490,6 @@ restart:
> 
>   return engine;
>  }
> -EXPORT_SYMBOL_GPL(utrace_attach);
> 
>  /*
>   * When an engine is detached, the target thread may still see it and make
> @@ -700,8 +699,6 @@ utrace_detach(struct task_struct *target
> 
>   return 0;
>  }
> -EXPORT_SYMBOL_GPL(utrace_detach);
> -
> 
>  /*
>   * Called with utrace->lock held.
> @@ -900,8 +897,7 @@ restart:  /* See below. */
> 
>   return ret;
>  }
> -EXPORT_SYMBOL_GPL(utrace_set_flags);
> -
> +
>  /*
>   * While running an engine callback, no locks are held.
>   * If a callback updates its engine's action state, then
> @@ -1930,8 +1926,6 @@ utrace_inject_signal(struct task_struct 
> 
>   return ret;
>  }
> -EXPORT_SYMBOL_GPL(utrace_inject_signal);
> -
> 
>  const struct utrace_regset *
>  utrace_regset(struct task_struct *target,
> @@ -1946,8 +1940,6 @@ utrace_regset(struct task_struct *target
> 
>   return &view->regsets[which];
>  }
> -EXPORT_SYMBOL_GPL(utrace_regset);
> -
> 
>  /*
>   * Return the task_struct for the task using ptrace on this one, or NULL.

Re: [PATCH][RFC] Kill off legacy power management stuff.

2007-04-18 Thread Dave Jones

On Wed, Apr 18, 2007 at 05:23:15PM -0400, Len Brown wrote:

 > > p.p.s.  patch improvements that will let me avoid doing any of that
 > > myself always welcome. :-)
 > 
 > well, I'm sorry that I've known about the APM issue for a long time
 > and done nothing about it.  I did ping davej when he broke it,
 > but his to-do list is probably even longer than mine.

ping timeout.

I don't recall too many of the details surrounding those changes,
but I certainly won't stand in the way of anyone trying to fix it.
It sounds like you and Robert are on top of it, or do you want
me to poke at it ?

Dave

-- 
http://www.codemonkey.org.uk

Announce - Staircase Deadline cpu scheduler v0.41

2007-04-18 Thread Con Kolivas

Since there is so much work currently ongoing with alternative cpu schedulers, 
as a standard for comparison with the alternative virtual deadline fair 
designs I've addressed a few issues in the Staircase Deadline cpu scheduler 
which improve behaviour likely in a noticeable fashion and released version 
0.41.

http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.41.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.41.patch

and an incremental for those on 0.40:
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7/sched-implement-staircase-deadline-scheduler-further-improvements.patch

Remember to renice X to -10 for nicest desktop behaviour :)

Have fun.

-- 
-ck

NETDEV WATCHDOG, tulip, 2.6.18

2007-04-18 Thread Lou Poppler


Package: linux-kernel
Version: 2.6.18-4-686 (Debian 2.6.18.dfsg.1-12)

(Submitted to linux-kernel@vger.kernel.org && [EMAIL PROTECTED])

I also have recurrent problems with
NETDEV WATCHDOG: eth0: transmit timed out

I am running on a Pentium 3 with a Linksys LNE100TX V5.1
PCI ethernet card, which also identifies itself as  ADMtek Comet rev 17
for which the kernel uses the tulip driver module,
Linux Tulip driver version 1.1.13-NAPI (May 11, 2002)

This works fine after booting, and for a day or two after booting,
no problems with heavy net traffic or light traffic.
Eventually something happens to it though, and then it is not right again
until reboot.  The behavior then is an occasional freeze, where nothing
moves for 10 seconds or so, then full-speed network I/O for a few seconds,
then another freeze, etc.

I only got this machine recently.  I first installed Debian Sarge on it,
and had the same problem with Sarge's 2.6.8 kernel.  I read many messages
about the NETDEV WATCHDOG situation, and some writers suggested it might
be fixed in later kernels, so I upgraded to Etch with the 2.6.18 kernel.
For me at least, the problem is still the same.

I am holding the machine in the broken condition (rather than rebooting)
in case anyone wants me to test something else.

Here is some info to document the problem:

dmesg at boot:
Linux version 2.6.18-4-686 (Debian 2.6.18.dfsg.1-12) ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Mon Mar 26 17:17:36 UTC 2007
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009f800 (usable)
 BIOS-e820: 0009f800 - 000a (reserved)
 BIOS-e820: 000e7000 - 0010 (reserved)
 BIOS-e820: 0010 - 040fd800 (usable)
 BIOS-e820: 040fd800 - 040ff800 (ACPI data)
 BIOS-e820: 040ff800 - 040ffc00 (ACPI NVS)
 BIOS-e820: 040ffc00 - 1800 (usable)
 BIOS-e820: fffe7000 - 0001 (reserved)
0MB HIGHMEM available.
384MB LOWMEM available.
On node 0 totalpages: 98304
  DMA zone: 4096 pages, LIFO batch:0
  Normal zone: 94208 pages, LIFO batch:31
DMI 2.1 present.
ACPI: RSDP (v000 PTLTD ) @ 0x000f6ac0
ACPI: RSDT (v001 PTLTDRSDT   0x PTL  0x0100) @ 0x040fda87
ACPI: FADT (v001 GATEWA TABOR II 0x19990928 PTL  0x000f4240) @ 0x040ff78c
ACPI: DSDT (v001 GATEWA TABOR II 0x MSFT 0x0100) @ 0x
ACPI: PM-Timer IO Port: 0x8008
Allocating PCI resources starting at 2000 (gap: 1800:e7fe7000)
Detected 596.938 MHz processor.
Built 1 zonelists.  Total pages: 98304
Kernel command line: root=/dev/hda2 ro 
Local APIC disabled by BIOS -- you can enable it with "lapic"

mapped APIC to d000 (0130a000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 2048 (order: 11, 8192 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 382128k/393216k available (1544k kernel code, 10556k reserved, 577k data, 196k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 1194.90 BogoMIPS (lpj=2389801)
Security Framework v1.0.0 initialized
SELinux:  Disabled at boot.
Capability LSM initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383f9ff     
 
CPU: After vendor identify, caps: 0383f9ff     
 
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383f9ff   0040  
 
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Compat vDSO mapped to e000.
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
Freeing SMP alternatives: 16k freed
ACPI: Core revision 20060707
ACPI: setting ELCR to 0200 (from 1a00)
CPU0: Intel Pentium III (Katmai) stepping 03
SMP motherboard not detected.
Local APIC not detected. Using dummy APIC emulation.
Brought up 1 CPUs
migration_cost=0
checking if image is initramfs... it is
Freeing initrd memory: 4397k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfd983, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources
PCI quirk: region 8000-803f claimed by PIIX4 ACPI
PCI quirk: region 7000-700f claimed by PIIX4 SMB
Boot

Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Mark Lord


Mark Lord wrote:
> Tejun Heo wrote:
>>
>> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW
>> 2. kernel shutdown starts
>> 3. libata shutdown issues SYNCHRONIZE_CACHE
>> 4. power goes off
>
> Okay, after some experimentation, it's the STANDBY_NOW that
> is causing the Power-Off_Retract_Count to increment on my machine.
>
> Tell me again why we think we need to issue that command ?

Arghh.. okay, removing that from the code has no effect on it either.
I just don't understand the problem any more, since I don't actually
have it here (I think).

Can somebody explain again what the issue was, when it began happening,
and whatever else?  And what Tejun's fix (2.6.22) does again?

I've lost all of the early postings.

Thanks

[PATCH] sched: implement staircase deadline scheduler further improvements

2007-04-18 Thread Con Kolivas

While the Staircase Deadline scheduler has not been completely killed off and 
is still in -mm I would like to fix some outstanding issues that I've found
since it still serves for comparison with all the upcoming schedulers.

While still in -mm can we queue this on top please?

A set of staircase-deadline v 0.41 patches will make their way into the usual
place for those willing to test it.

http://ck.kolivas.org/patches/staircase-deadline/
---
The prio_level was being inappropriately decreased if a higher priority
task was still using previous timeslice. Fix that.

Task expiration of higher priority tasks was not being taken into account
with allocating priority slots. Check the expired best_static_prio level
to facilitate that.

Explicitly check all better static priority prio_levels when deciding on
allocating slots for niced tasks.

These changes improve behaviour in many ways.

Signed-off-by: Con Kolivas <[EMAIL PROTECTED]>

---
 kernel/sched.c |   61 ++---
 1 file changed, 41 insertions(+), 20 deletions(-)

Index: linux-2.6.21-rc7-sd/kernel/sched.c
===
--- linux-2.6.21-rc7-sd.orig/kernel/sched.c 2007-04-19 08:51:54.0 +1000
+++ linux-2.6.21-rc7-sd/kernel/sched.c  2007-04-19 09:30:39.0 +1000
@@ -145,6 +145,12 @@ struct prio_array {
 */
DECLARE_BITMAP(prio_bitmap, MAX_PRIO + 1);
 
+   /*
+* The best static priority (of the dynamic priority tasks) queued
+* this array.
+*/
+   int best_static_prio;
+
 #ifdef CONFIG_SMP
/* For convenience looks back at rq */
struct rq *rq;
@@ -191,9 +197,9 @@ struct rq {
 
/*
 * The current dynamic priority level this runqueue is at per static
-* priority level, and the best static priority queued this rotation.
+* priority level.
 */
-   int prio_level[PRIO_RANGE], best_static_prio;
+   int prio_level[PRIO_RANGE];
 
/* How many times we have rotated the priority queue */
unsigned long prio_rotation;
@@ -669,7 +675,7 @@ static void task_new_array(struct task_s
 }
 
 /* Find the first slot from the relevant prio_matrix entry */
-static inline int first_prio_slot(struct task_struct *p)
+static int first_prio_slot(struct task_struct *p)
 {
if (unlikely(p->policy == SCHED_BATCH))
return p->static_prio;
@@ -682,11 +688,18 @@ static inline int first_prio_slot(struct
  * level. SCHED_BATCH tasks do not use the priority matrix. They only take
  * priority slots from their static_prio and above.
  */
-static inline int next_entitled_slot(struct task_struct *p, struct rq *rq)
+static int next_entitled_slot(struct task_struct *p, struct rq *rq)
 {
+   int search_prio = MAX_RT_PRIO, uprio = USER_PRIO(p->static_prio);
+   struct prio_array *array = rq->active;
DECLARE_BITMAP(tmp, PRIO_RANGE);
-   int search_prio, uprio = USER_PRIO(p->static_prio);
 
+   /*
+* Go straight to expiration if there are higher priority tasks
+* already expired.
+*/
+   if (p->static_prio > rq->expired->best_static_prio)
+   return MAX_PRIO;
if (!rq->prio_level[uprio])
rq->prio_level[uprio] = MAX_RT_PRIO;
/*
@@ -694,15 +707,21 @@ static inline int next_entitled_slot(str
 * static_prio are acceptable, and only if it's not better than
 * a queued better static_prio's prio_level.
 */
-   if (p->static_prio < rq->best_static_prio) {
-   search_prio = MAX_RT_PRIO;
+   if (p->static_prio < array->best_static_prio) {
if (likely(p->policy != SCHED_BATCH))
-   rq->best_static_prio = p->static_prio;
-   } else if (p->static_prio == rq->best_static_prio)
+   array->best_static_prio = p->static_prio;
+   } else if (p->static_prio == array->best_static_prio) {
search_prio = rq->prio_level[uprio];
-   else {
+   } else {
+   int i;
+
+   /* A bound O(n) function, worst case n is 40 */
+   for (i = array->best_static_prio; i <= p->static_prio ; i++) {
+   if (!rq->prio_level[USER_PRIO(i)])
+   rq->prio_level[USER_PRIO(i)] = MAX_RT_PRIO;
search_prio = max(rq->prio_level[uprio],
-   rq->prio_level[USER_PRIO(rq->best_static_prio)]);
+   rq->prio_level[USER_PRIO(i)]);
+   }
}
if (unlikely(p->policy == SCHED_BATCH)) {
search_prio = max(search_prio, p->static_prio);
@@ -718,6 +737,8 @@ static void queue_expired(struct task_st
 {
task_new_array(p, rq, rq->expired);
p->prio = p->normal_prio = first_prio_slot(p);
+   if (p->static_prio < rq->expired->best_static_prio)
+   rq->expired->best_static_prio =

Re: [PATCH][RFC] Kill off legacy power management stuff.

2007-04-18 Thread Robert P. J. Day

On Wed, 18 Apr 2007, Len Brown wrote:

> On Wednesday 18 April 2007 16:23, Robert P. J. Day wrote:

> > ok, i get it now and -- correct me if i'm wrong -- all my legacy PM
> > removal patch was doing was exposing a design boo-boo in which
> > APM/ACPI contention was being handled by a macro in a subsystem even
> > older than either of them, right?
>
> yeah, it didn't start out that way, the bug was added when the
> CONFIG_PM_LEGACY #define was added.
>
> > so all that needs to be done is add back in a contention solution
> > of some kind that doesn't rely on that ancient system, yes?
>
> Yes, it is a matter of making the variable not go away when the
> #define goes away.
>
> > as for that thinkpad t30 situation, well, that's just borked, and
> > should be fixed.
>
> yes, the actual failure is that APM mode on the T30 hangs -- and
> that is independent of the issue at hand.  However, there could be
> other failures on other machines when both APM and ACPI think they
> are active.

at this point, i think the proper approach is to locate and remove all
dependencies on the legacy PM code, which includes making sure there's
a reliable contention mechanism for APM and ACPI that doesn't need
anything out of the legacy code or header files.  once that's done,
the legacy deletion itself should be trivial.

the obvious place for the contention stuff is, i would think,
include/linux/pm.h, yes?

rday
-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page


Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Mark Lord


Tejun Heo wrote:
>
> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW
> 2. kernel shutdown starts
> 3. libata shutdown issues SYNCHRONIZE_CACHE
> 4. power goes off

Okay, after some experimentation, it's the STANDBY_NOW that
is causing the Power-Off_Retract_Count to increment on my machine.

Tell me again why we think we need to issue that command ?

Thanks.

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Davide Libenzi

On Wed, 18 Apr 2007, Ingo Molnar wrote:

> That's one reason why i dont think it's necessarily a good idea to 
> group-schedule threads, we dont really want to do a per thread group 
> percpu_alloc().

It is still not clear to me how much overhead this will bring to the 
table, but I think (as Linus was pointing out) the hierarchy should look 
like:

Top (VCPU maybe?)
User
Process
Thread

The "run_queue" concept (and data) that is now bound to a CPU needs to be 
replicated in:

ROOT <- VCPUs add themselves here
VCPU <- USERs add themselves here
USER <- PROCs add themselves here
PROC <- THREADs add themselves here
THREAD (ultimate fine grained scheduling unit)

So ROOT, VCPU, USER and PROC will have their own "run_queue". Picking up a 
new task would mean:

VCPU = ROOT->lookup();
USER = VCPU->lookup();
PROC = USER->lookup();
THREAD = PROC->lookup();

Run-time statistics should propagate back the other way around.
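
The lookup chain and the reverse propagation of statistics described above
could be sketched roughly as follows. This is only an illustration, not code
from any posted patch: the node type, the fixed-size child array and the
"least accumulated runtime wins" policy are all assumptions made up for the
example, and a real implementation would need per-CPU data and locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical levels, mirroring ROOT/VCPU/USER/PROC/THREAD above. */
enum level { LVL_ROOT, LVL_VCPU, LVL_USER, LVL_PROC, LVL_THREAD };

#define MAX_CHILDREN 8  /* arbitrary bound, for illustration only */

/* Hypothetical node: every non-leaf level owns its own "run_queue". */
struct sched_node {
	enum level level;
	unsigned long runtime;                  /* accumulated run time */
	struct sched_node *parent;
	struct sched_node *child[MAX_CHILDREN]; /* this level's run_queue */
	size_t nr_children;
};

/* A child "adds itself" to its parent's run_queue; 0 on success. */
int node_add(struct sched_node *parent, struct sched_node *child)
{
	if (parent->nr_children == MAX_CHILDREN)
		return -1;
	child->parent = parent;
	parent->child[parent->nr_children++] = child;
	return 0;
}

/*
 * One level's lookup(): pick the child that has run the least so far.
 * The policy is a placeholder; ties go to the earliest queued child.
 */
struct sched_node *node_lookup(const struct sched_node *n)
{
	struct sched_node *best = NULL;

	for (size_t i = 0; i < n->nr_children; i++)
		if (!best || n->child[i]->runtime < best->runtime)
			best = n->child[i];
	return best;
}

/* Walk ROOT -> VCPU -> USER -> PROC -> THREAD; NULL if a queue is empty. */
struct sched_node *pick_next_thread(struct sched_node *root)
{
	struct sched_node *n = root;

	while (n && n->level != LVL_THREAD)
		n = node_lookup(n);
	return n;
}

/* Run-time statistics propagate back up, from THREAD to ROOT. */
void account_runtime(struct sched_node *thread, unsigned long delta)
{
	for (struct sched_node *n = thread; n; n = n->parent)
		n->runtime += delta;
}
```

With two processes under one user, charging run time to a thread of the first
process makes the next pick descend into the second one, which is the kind of
per-process fairness the hierarchy is meant to give.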

> In fact for threads the _reverse_ problem exists, threaded apps tend to 
> _strive_ for more performance - hence their desperation of using the 
> threaded programming model to begin with ;) (just think of media 
> playback apps which are typically multithreaded)

The same user nicing two different multi-threaded processes would expect a 
predictable CPU distribution too. Doing that efficiently (the old per-cpu 
run-queue is pretty nice from many POVs) is the real challenge.

- Davide


Re: Upgraded to 2.6.20.7 - positives

2007-04-18 Thread Chuck Ebbert

Denis Vlasenko wrote:
> * From make menuconfig questions it looks like SATA/PATA
>   rewrite (in the form of libata) is almost finished. Hehe,
>   untangling IDE mess was quite a feat, and Jeff did it. Kudos.
> 

ADMA mode on nvidia chipsets still seems broken despite massive
amount of SATA fixes backported from 2.6.21...


Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Chuck Ebbert

Robert Hancock wrote:
> Tejun Heo wrote:
>> This really isn't a regression.  It's been always like that with libata.
>>  libata doesn't make devices go into standby mode and shutdown(8) does
>> it for libata.  The problem here is that libata does issue
>> SYNCHRONIZE_CACHE on shutdown.  So, the sequence of event is...
>>
>> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW
> 
> This part is presumably distribution dependent. I have never seen Fedora
> or CentOS shut down drives on power down from the shutdown script/utility..
> 

Some distro shutdown scripts must be doing "halt -h" at shutdown time.

-n : don't sync cache (default is to sync)
-h : put harddrives in standby (default is no standby)

And BTW, do not put them in sleep instead of standby (whether it's
the halt program or the kernel doing it). They won't wake up from that
until they're reset.


Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Bob Picco

Ingo Molnar wrote:  [Wed Apr 18 2007, 06:02:28PM EDT]
> 
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> 
> > > although probably your suspend2 problem is still not fixed, it's 
> > > worth a try nevertheless. Which suspend2 patch did you apply, and 
> > > was it against -rc6 or -rc7?
> > 
> > You are right again. ;-)
> > 
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> > 
> > And it still hangs on suspend.
> 
> what's the easiest way for me to try suspend2? Apply the patch, reboot 
> into the kernel, then execute what command to suspend? (there's a 
> confusing mishmash of initiators of all the suspend variants. Can i drive 
> this by echoing to /sys/power/state?)
> 
>   Ingo
I had hoped to collect more data with CFS V2. It crashes in
scale_nice_down for s2ram when attempting to disable_nonboot_cpus. 
So part of traceback looks like (typed by hand with obvious omissions):

scale_nice_down
update_stats_wait_end - not shown in traceback because inlined
pick_next_task_fair
migration_call
task_rq_lock
notifier_call_chain
_cpu_down
disable_nonboot_cpus
...

This is standard -rc7 with V2 CFS applied. It could be a completely
unrelated issue. I'll attempt to debug further tomorrow.

bob

Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse

On Thursday 19 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> i just tried the same and it suspended+resumed just fine:
>
> Restarting tasks ... done.
> Suspend2 debugging info:
> - Suspend core   : 2.2.9.12
> - Kernel Version : 2.6.21-rc7-CFS-v3
> - Compiler vers. : 4.0
> - Attempt number : 2
> - Parameters : 0 81920 0 0 0 0
> - Overall expected compression percentage: 0.
> - Compressor is 'lzf'.
>   Compressed 31133696 bytes into 14880587 (52 percent compression).
> - SwapAllocator active.
>   Swap available for image: 512036 pages.
> - FileAllocator inactive.
> - I/O speed: Write 76 MB/s, Read 42 MB/s.
> - Extra pages: 18 used/500.
>
> could you send me your .config?

My config is attached.

I now got some error message from my system:

http://www.eworm.de/tmp/cfs-suspend.jpg
-- 
Regards,
Chris
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc7-r1
# Wed Apr 18 22:25:20 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_IKPATCHES=y
CONFIG_IKPATCHES_PROC=y
# CONFIG_CPUSETS is not set
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_BLOCK=y
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
# CONFIG_TICK_ONESHOT is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
CONFIG_MPENTIUMM=y
# CONFIG_MCORE2 is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
# CONFIG_HPET_TIMER is not set
CONFIG_NR_CPUS=2
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_BKL

Upgraded to 2.6.20.7 - positives

2007-04-18 Thread Denis Vlasenko

Hi kernel people,

Just upgraded my home box to 2.6.20.7. Wow.

* Reiser3 mount times are drastically reduced,
  even when journal replay is needed
  (I have a few 100GB+ reiser3 partitions mounted at boot)
* sit pseudo-interface is gone. In the previous kernel, I tried
  to disable it in kernel config to no avail. Now it was easy
  to simply compile it as a module.
* From make menuconfig questions it looks like SATA/PATA
  rewrite (in the form of libata) is almost finished. Hehe,
  untangling IDE mess was quite a feat, and Jeff did it. Kudos.

Need to check now whether losetup oopses are gone too,
or hunt them down if they are still with us :)

Thanks everybody for your amazing work.
--
vda

Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Robert Hancock


Tejun Heo wrote:
> This really isn't a regression.  It's been always like that with libata.
>  libata doesn't make devices go into standby mode and shutdown(8) does
> it for libata.  The problem here is that libata does issue
> SYNCHRONIZE_CACHE on shutdown.  So, the sequence of event is...
>
> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW

This part is presumably distribution dependent. I have never seen Fedora 
or CentOS shut down drives on power down from the shutdown script/utility..

> 2. kernel shutdown starts
> 3. libata shutdown issues SYNCHRONIZE_CACHE
> 4. power goes off
>
> Some drives seem to spin up at step #3 even when its cache is clean and
> power goes off right after the disk finishes the command.  So, it's
> really bad when it happens - spin down, spin up followed by immediate
> power off.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Robert Hancock


Stephen Clark wrote:
> So this is the pop I hear on my new laptop that is using
> libata=combined_mode when I shut my system down. I didn't get the pop
> with the same disk drive in an older laptop that was only ide. It sounds
> like a relay closing or opening, but is really my drive head doing an
> emergency retract/park?


Yes, that would be what it is, and why.

I would vote that the sd stop-on-shutdown patch should go in, and 
possibly with the new behavior enabled by default. Surely the number of 
people running Linux on a laptop (or any other system with load/unload 
head technology drives) is much greater than the number of people 
running a SAN, multi-initiator, etc. environment where you might not 
want this..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: CPU_IDLE prevents resuming from STR [was: Re: 2.6.21-rc6-mm1]

2007-04-18 Thread Joshua Wise


On Tue, 17 Apr 2007, Shaohua Li wrote:
> Looks like there is an init order issue with the sysfs files. The new
> refreshed patch should fix your bug.


Yes, that did fix the hang on resume from STR -- that now works fine.

However:
[EMAIL PROTECTED]:/sys/devices/system/cpu/cpuidle$ cat available_drivers current_driver


[EMAIL PROTECTED]:/sys/devices/system/cpu/cpuidle$ cat available_governors current_governor
ladder
ladder

Is this correct? For reference, my config is http://joshuawise.com/config.gz
-- I didn't see any options for cpuidle drivers to access ACPI states...

joshua

mkinitrd. (was Re: [RFC] [PATCH] Allow overriding module parameters from kernel command_line)

2007-04-18 Thread Dave Jones

On Thu, Apr 19, 2007 at 08:47:13AM +1000, Neil Brown wrote:

 > > Fixed by changing /etc/fstab and rebuilding initrd, but IMO rootfstype=
 > > should have worked.
 > 
 > I think these are both issues that should be solved by smarts in the
 > initrd.

This is getting away from the intent of Kyle's original patch
(Which I think is worthwhile fwiw, having recently hit the exact
 same sata_nv bug that prompted him to write it)

 > What we really need is a single reference implementation of "mkinitrd"
 > which each distro can fiddle with to their heart's content.  Then
 > sensible ideas like the above can be incorporated into the reference,
 > and all distros will ultimately pick them up.
 > 
 > But unfortunately I don't have the time to volunteer for this role...

The problem I see with such a 'one mkinitrd to rule them all', is that
it would suffer from the same thing that stopped any vendor stepping
up and getting behind hpa's klibc project...  Apathy due to "our current
stuff works, why would we throw it all away and start again"

It's a great idea in theory; in practice, however, initrd construction
for every distro now contains years of custom hacks and workarounds
(that may not even be relevant on other distros).

Given the critical nature of mkinitrd (get something wrong, and your
system doesn't boot), unsurprisingly, people are reluctant to change
away from something they're familiar with, unless there's a *really*
compelling reason.

Dave

-- 
http://www.codemonkey.org.uk

Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Robert Hancock


Stephen Clark wrote:
> I tried this on 2.6.20.2 it applied to libata with some fuzz and I had
> to manually edit libata.h
>
> When I did a shutdown I still got the click/pop.
>
> I also noticed the last thing displayed on the lcd before it goes blank is
> Synchronizing SCSI Disks - then the click/pop.
>
> HTH,
> Steve

That patch on its own will not help, you also need Tejun's 
stop-on-shutdown patch, otherwise the kernel will not try to stop the 
disk before powering off.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [RFC 1/1] Char: tty, add io tty_insert_flip_string variants

2007-04-18 Thread Jiri Slaby

Alan Cox napsal(a):
> On Thu, 19 Apr 2007 00:35:20 +0200 (CEST)
> Jiri Slaby <[EMAIL PROTECTED]> wrote:
> 
>> Hi,
>>
>> don't you consider this useful for some drivers. There are many cases, when
>> tty_insert_flip_stringio might be used.
> 
> I couldn't see anyone who really benefitted when I first looked at this
> but if you've got a case you want to use them then I've certainly got no
> problem with it.

Ah, I'm an idiot -- I should go through the code more carefully,
tty_prepare_flip_string + memcpy_fromio can do the job.

thanks,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
 B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E

Re: [RFC] [PATCH] Allow overriding module parameters from kernel command_line

2007-04-18 Thread Neil Brown

On Wednesday April 18, [EMAIL PROTECTED] wrote:
> > On Wed, 18 Apr 2007 11:55:52 -0400 Kyle McMartin <[EMAIL PROTECTED]> wrote:
> > With the move to initramfs and heavily modular configs, which include
> > loading storage drivers from early userspace, it's becoming harder
> > to provide users with a way of overriding module parameters at boot.
> > 
> > Currently, users would have to break into the initramfs, edit the
> > modprobe options, and then let boot continue. They have a much easier time
> > dealing with adding options on the command line from Grub or what have you.
> > 
> > I hacked out this patch quickly to re-parse saved_command_line[] when we
> > load a module in an attempt to rectify this.
> > 
> > (The specific use-case I was looking at here was HPA commands failing on
> >  sata_nv controllers, and needing to pass the adma=0 option to the module...
> >  Users had a hard time testing without an easy way of overriding the module.)
> > 
> > Clearly this is not entirely optimal, because we're parsing command_line
> > after the module params are parsed. This ends up being a policy decision,
> > whether the /sbin/modprobe commandline should override the kernel
> > command_line, or vice versa.
> 
> Similar-but-different: I was trying to persuade a Fedora system to use ext2
> for the root filesystem the other day.  Turns out that we somehow managed
> to break `rootfstype=' in this situation and it cheerfully continued to use
> ext3.
> 
> Fixed by changing /etc/fstab and rebuilding initrd, but IMO rootfstype=
> should have worked.

I think these are both issues that should be solved by smarts in the
initrd.

All of the (unused) kernel parameters are in the environment aren't
they? (if not, they can easily be put there).

So maybe insmod/modprobe could be updated to extract relevant options
from the environment.
And the mount of the root filesystem should be called as:

mount ${rootfstype+-t $rootfstype} $dev $mountpoint
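
As a minimal sketch of the insmod/modprobe idea, the helper below picks a
module's options out of an environment-style array. The function name and the
"<module>.<param>=<value>" naming convention are assumptions made for
illustration; nothing here is taken from an existing modprobe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Collect "<module>.<param>=<value>" entries from an environment-style
 * array and store pointers to the "<param>=<value>" part in out[].
 * Returns the number of options found (at most max).
 * The "<module>." prefix convention is an assumption for illustration.
 */
static size_t module_opts_from_env(char *const envp[], const char *module,
				   const char *out[], size_t max)
{
	size_t modlen, n = 0;

	assert(envp != NULL && module != NULL && out != NULL);
	modlen = strlen(module);

	for (; *envp != NULL && n < max; envp++) {
		const char *e = *envp;

		/* Require an exact module name followed by a dot. */
		if (strncmp(e, module, modlen) != 0 || e[modlen] != '.')
			continue;
		if (strchr(e + modlen + 1, '=') == NULL)
			continue;	/* no value given, skip */
		out[n++] = e + modlen + 1;
	}
	return n;
}
```

With an entry such as "sata_nv.adma=0" in the array, a lookup for "sata_nv"
would yield "adma=0", which could then be appended to the options read from
the modprobe configuration.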

We are depending more and more on initrd and I think it hurts not
having a reference implementation.
Currently each distro makes their own and while I'm sure they are all
quite good in their own way, the fact that they are independent makes
community input harder.

What we really need is a single reference implementation of "mkinitrd"
which each distro can fiddle with to their heart's content.  Then
sensible ideas like the above can be incorporated into the reference,
and all distros will ultimately pick them up.

But unfortunately I don't have the time to volunteer for this role...

NeilBrown

Re: [RFC 1/1] Char: tty, add io tty_insert_flip_string variants

2007-04-18 Thread Alan Cox

On Thu, 19 Apr 2007 00:35:20 +0200 (CEST)
Jiri Slaby <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> don't you consider this useful for some drivers? There are many cases where
> tty_insert_flip_stringio might be used.

I couldn't see anyone who really benefitted when I first looked at this
but if you've got a case you want to use them then I've certainly got no
problem with it.

[RFC 1/1] Char: tty, add io tty_insert_flip_string variants

2007-04-18 Thread Jiri Slaby

Hi,

don't you consider this useful for some drivers? There are many cases where
tty_insert_flip_stringio might be used.

--

tty, add io tty_insert_flip_string variants

Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>

---
commit a7dafceb31ff535b793227036f5b2b6a1e8cf233
tree 51e1bd24bfb9a2842bb12e097cec0b97a7bf3998
parent 71c2e9b72594f69e4e226006206ffa74b55c1642
author Jiri Slaby <[EMAIL PROTECTED]> Thu, 19 Apr 2007 00:27:22 +0200
committer Jiri Slaby <[EMAIL PROTECTED]> Thu, 19 Apr 2007 00:27:22 +0200

 drivers/char/tty_io.c   |  101 ---
 1 file changed, 69 insertions(+), 32 deletions(-)

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index 959a616..6b215ed 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -95,6 +95,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 
@@ -441,6 +442,25 @@ int tty_buffer_request_room(struct tty_struct *tty, size_t size)
 }
 EXPORT_SYMBOL_GPL(tty_buffer_request_room);
 
+#define __tty_insert_flip_string(tty, chars, flags, size, chrfun, flfun) ({ \
+   int copied = 0; \
+   do {\
+   int space = tty_buffer_request_room(tty, size - copied);\
+   struct tty_buffer *tb = tty->buf.tail;  \
+   /* If there is no space then tb may be NULL */  \
+   if (unlikely(space == 0))   \
+   break;  \
+   chrfun(tb->char_buf_ptr + tb->used, chars, space);  \
+   flfun(tb->flag_buf_ptr + tb->used, flags, space);   \
+   tb->used += space;  \
+   copied += space;\
+   chars += space; \
+   /* There is a small chance that we need to split the data over \
+  several buffers. If this is the case we must loop */ \
+   } while (unlikely(size > copied));  \
+   copied; \
+})
+
 /**
  * tty_insert_flip_string  -   Add characters to the tty buffer
  * @tty: tty structure
@@ -456,26 +476,33 @@ EXPORT_SYMBOL_GPL(tty_buffer_request_room);
 int tty_insert_flip_string(struct tty_struct *tty, const unsigned char *chars,
size_t size)
 {
-   int copied = 0;
-   do {
-   int space = tty_buffer_request_room(tty, size - copied);
-   struct tty_buffer *tb = tty->buf.tail;
-   /* If there is no space then tb may be NULL */
-   if(unlikely(space == 0))
-   break;
-   memcpy(tb->char_buf_ptr + tb->used, chars, space);
-   memset(tb->flag_buf_ptr + tb->used, TTY_NORMAL, space);
-   tb->used += space;
-   copied += space;
-   chars += space;
-   /* There is a small chance that we need to split the data over
-  several buffers. If this is the case we must loop */
-   } while (unlikely(size > copied));
-   return copied;
+   return __tty_insert_flip_string(tty, chars, TTY_NORMAL, size,
+   memcpy, memset);
 }
 EXPORT_SYMBOL(tty_insert_flip_string);
 
 /**
+ * tty_insert_flip_stringio-   Add characters to the tty buffer
+ * @tty: tty structure
+ * @chars: characters
+ * @size: size
+ *
+ * Queue a series of bytes to the tty buffering from io memory. All the
+ * characters passed are marked as without error. Returns the number
+ * added.
+ *
+ * Locking: Called functions may take tty->buf.lock
+ */
+
+int tty_insert_flip_stringio(struct tty_struct *tty,
+   const unsigned char __iomem *chars, size_t size)
+{
+   return __tty_insert_flip_string(tty, chars, TTY_NORMAL, size,
+   memcpy_fromio, memset);
+}
+EXPORT_SYMBOL(tty_insert_flip_stringio);
+
+/**
  * tty_insert_flip_string_flags-   Add characters to the tty buffer
  * @tty: tty structure
  * @chars: characters
@@ -492,27 +519,35 @@ EXPORT_SYMBOL(tty_insert_flip_string);
 int tty_insert_flip_string_flags(struct tty_struct *tty,
const unsigned char *chars, const char *flags, size_t size)
 {
-   int copied = 0;
-   do {
-   int space = tty_buffer_request_room(tty, size - copied);
-   struct tty_buffer *tb = tty->buf.tail;
-   /* If there is no space then tb may be NULL */
-   if(unlikely(space == 0))
-   break;
-   memcpy(tb->char_buf_ptr + tb->used, chars, space);
-   memcpy(tb->flag_buf_ptr + tb->used, flags, space);
-   tb->used += space;
-

[PATCH 2/2] wistron_btns: add led support

2007-04-18 Thread Éric Piel


This patch adds support for mail and wifi leds. It modifies the Kconfig
file to automatically pull led_class with wistron_btns, hopefully
everyone is fine with this.

It doesn't add support for bluetooth led because, so far, it seems all 
the laptops with bluetooth have led and bluetooth system linked (meaning 
it is already managed by the driver).


This was tested on a TM 610 and an Aspire 3020.

Eric
(sorry for the multiple receptions)

From: Eric Piel <[EMAIL PROTECTED]>

wistron_btns: Add led support
Add support to wistron_btns for leds that come with the multimedia keys. Mail
and wifi leds are supported, on laptops which have them. Depending on the
laptop, wifi subsystem may control just the led, or both the led and the wifi
card. Wifi led interface is activated only for the former type of laptops, as
the latter type is already managed. Leds are controlled by the interface in
/sys/class/leds.

Signed-off-by: Eric Piel <[EMAIL PROTECTED]>

--- linux-2.6.21/drivers/input/misc/wistron_btns.c.bak	2007-04-07 15:09:30.0 +0200
+++ linux-2.6.21/drivers/input/misc/wistron_btns.c	2007-04-14 12:42:38.0 +0200
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Number of attempts to read data from queue per poll;
@@ -46,11 +47,12 @@
 /* BIOS subsystem IDs */
 #define WIFI		0x35
 #define BLUETOOTH	0x34
+#define MAIL_LED	0x31
 
 MODULE_AUTHOR("Miloslav Trmac <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION("Wistron laptop button driver");
 MODULE_LICENSE("GPL v2");
-MODULE_VERSION("0.2");
+MODULE_VERSION("0.3");
 
 static int force; /* = 0; */
 module_param(force, bool, 0);
@@ -251,6 +253,7 @@
 static const struct key_entry *keymap; /* = NULL; Current key map */
 static int have_wifi;
 static int have_bluetooth;
+static int have_leds;
 
 static int __init dmi_matched(struct dmi_system_id *dmi)
 {
@@ -263,6 +266,8 @@
 		else if (key->type == KE_BLUETOOTH)
 			have_bluetooth = 1;
 	}
+	have_leds = key->code & (FE_MAIL_LED | FE_WIFI_LED);
+
 	return 1;
 }
 
@@ -1028,6 +1033,83 @@
 	input_sync(input_dev);
 }
 
+
+ /* led management */
+static void wistron_mail_led_set(struct led_classdev *led_cdev,
+enum led_brightness value)
+{
+	bios_set_state(MAIL_LED, (value != LED_OFF) ? 1 : 0);
+}
+
+/* same as setting up wifi card, but for laptops on which the led is managed */
+static void wistron_wifi_led_set(struct led_classdev *led_cdev,
+enum led_brightness value)
+{
+	bios_set_state(WIFI, (value != LED_OFF) ? 1 : 0);
+}
+
+static struct led_classdev wistron_mail_led = {
+	.name			= "mail:green",
+	.brightness_set		= wistron_mail_led_set,
+};
+
+static struct led_classdev wistron_wifi_led = {
+	.name			= "wifi:red",
+	.brightness_set		= wistron_wifi_led_set,
+};
+
+static void __devinit wistron_led_init(struct device *parent)
+{
+	if (have_leds & FE_WIFI_LED) {
+		u16 wifi = bios_get_default_setting(WIFI);
+		if (wifi & 1) {
+			wistron_wifi_led.brightness = (wifi & 2) ? LED_FULL : LED_OFF;
+			if (led_classdev_register(parent, &wistron_wifi_led))
+				have_leds &= ~FE_WIFI_LED;
+			else
+				bios_set_state(WIFI, wistron_wifi_led.brightness);
+
+		} else
+			have_leds &= ~FE_WIFI_LED;
+	}
+
+	if (have_leds & FE_MAIL_LED) {
+		/* bios_get_default_setting(MAIL) always retuns 0, so just turn the led off */
+		wistron_mail_led.brightness = LED_OFF;
+		if (led_classdev_register(parent, &wistron_mail_led))
+			have_leds &= ~FE_MAIL_LED;
+		else
+			bios_set_state(MAIL_LED, wistron_mail_led.brightness);
+	}
+}
+
+static void __devexit wistron_led_remove(void)
+{
+	if (have_leds & FE_MAIL_LED)
+		led_classdev_unregister(&wistron_mail_led);
+
+	if (have_leds & FE_WIFI_LED)
+		led_classdev_unregister(&wistron_wifi_led);
+}
+
+static inline void wistron_led_suspend(void)
+{
+	if (have_leds & FE_MAIL_LED)
+		led_classdev_suspend(&wistron_mail_led);
+
+	if (have_leds & FE_WIFI_LED)
+		led_classdev_suspend(&wistron_wifi_led);
+}
+
+static inline void wistron_led_resume(void)
+{
+	if (have_leds & FE_MAIL_LED)
+		led_classdev_resume(&wistron_mail_led);
+
+	if (have_leds & FE_WIFI_LED)
+		led_classdev_resume(&wistron_wifi_led);
+}
+
  /* Driver core */
 
 static int wifi_enabled;
@@ -1125,6 +1207,7 @@
 			bios_set_state(BLUETOOTH, bluetooth_enabled);
 	}
 
+	wistron_led_init(&dev->dev);
 	poll_bios(1); /* Flush stale event queue and arm timer */
 
 	return 0;
@@ -1133,6 +1216,7 @@
 static int __devexit wistron_remove(struct platform_device *dev)
 {
 	del_timer_sync(&poll_timer);
+	wistron_led_remove();
 	input_unregister_device(input_dev);
 	bios_detach();

@@ -1150,6 +1233,7 @@
 	if (have_bluetooth)
 		bios_set_state(BLUETOOTH, 0);
 
+	wistron_led_suspend();
 	return 0;
 }
 
@@ -1161,6 +1245,7 @@
 	if (have_bluetooth)
 		bios_set_state(BLUETOOTH, bluetooth_enabled);
 
+	wistron_led_resume();
 	poll_bios(1);
 
 	return 0;
--- linux-2.6.21/drivers/input/misc/Kconfig.bak	2007-04-09 23:18:49.0 +0200
+++ linux-2.6.21/drivers/input/misc/Kconfig	2007-04-14 02:53:01.0 +0200
@@ -43,9 +43,12 @@
 config INPUT_WISTRON_BTNS
 	tristate "x86 Wistron laptop button

[PATCH 0/2] wistron_btns: small fix and led support

2007-04-18 Thread Éric Piel


Hello,

The following two patches are against the input tree and improve the 
wistron_btns driver.
The first patch is mostly trivial, it fixes a typo that I introduced in 
the previous batch.
The second patch adds led support to the driver (and therefore also 
dependency on the led class).


See you,
Eric

[PATCH 1/2] wistron_btns: add led support

2007-04-18 Thread Éric Piel

This fixes a typo on the TM610 definition, inserted in my recent patch 
"add-acerhk-database".


Eric

From: Eric Piel <[EMAIL PROTECTED]>

wistron_btns: Fix typo for TM610
I made a typo in a previous patch for wistron_btns "add acerhk database". This
patch fixes this typo that prevented the PROG2 key from working.

Signed-off-by: Eric Piel <[EMAIL PROTECTED]>

--- linux-2.6.21/drivers/input/misc/wistron_btns.c.bak	2007-04-07 15:09:30.0 +0200
+++ linux-2.6.21/drivers/input/misc/wistron_btns.c	2007-04-07 15:09:44.0 +0200
@@ -490,7 +490,7 @@
 	{ KE_KEY, 0x01, {KEY_HELP} },
 	{ KE_KEY, 0x02, {KEY_CONFIG} },
 	{ KE_KEY, 0x11, {KEY_PROG1} },
-	{ KE_KEY, 0x12, {KEY_PROG3} },
+	{ KE_KEY, 0x12, {KEY_PROG2} },
 	{ KE_KEY, 0x13, {KEY_PROG3} },
 	{ KE_KEY, 0x14, {KEY_MAIL} },
 	{ KE_KEY, 0x15, {KEY_WWW} },

Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse

On Thursday 19 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > although probably your suspend2 problem is still not fixed, it's
> > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > was it against -rc6 or -rc7?
> >
> > You are right again. ;-)
> >
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> what's the easiest way for me to try suspend2? Apply the patch, reboot
> into the kernel, then execute what command to suspend? (there's a
> confusing mishmash of initiators of all the suspend variants. Can i drive
> this by echoing to /sys/power/state?)

Perhaps you have to install suspend2-userui as well for the output (I'm not 
sure whether it works without). Then you can trigger the suspend by echoing 
to /sys/power/suspend2/do_suspend.
Useful information can be found in the Howto:

http://www.suspend2.net/HOWTO

I dropped some ccs to not abuse Linus and friends.
-- 
Regards,
Chris



Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Ingo Molnar


* Christian Hesse <[EMAIL PROTECTED]> wrote:

> Linux 2.6.21-rc7
> Suspend2 2.2.9.11 (applies cleanly to -rc7)
> CFS v3 (without any additional patches)
> 
> And it still hangs on suspend.

i just tried the same and it suspended+resumed just fine:

Restarting tasks ... done.
Suspend2 debugging info:
- Suspend core   : 2.2.9.12
- Kernel Version : 2.6.21-rc7-CFS-v3
- Compiler vers. : 4.0
- Attempt number : 2
- Parameters : 0 81920 0 0 0 0
- Overall expected compression percentage: 0.
- Compressor is 'lzf'.
  Compressed 31133696 bytes into 14880587 (52 percent compression).
- SwapAllocator active.
  Swap available for image: 512036 pages.
- FileAllocator inactive.
- I/O speed: Write 76 MB/s, Read 42 MB/s.
- Extra pages: 18 used/500.

could you send me your .config?

Ingo

Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Ingo Molnar


* Christian Hesse <[EMAIL PROTECTED]> wrote:

> > although probably your suspend2 problem is still not fixed, it's 
> > worth a try nevertheless. Which suspend2 patch did you apply, and 
> > was it against -rc6 or -rc7?
> 
> You are right again. ;-)
> 
> Linux 2.6.21-rc7
> Suspend2 2.2.9.11 (applies cleanly to -rc7)
> CFS v3 (without any additional patches)
> 
> And it still hangs on suspend.

what's the easiest way for me to try suspend2? Apply the patch, reboot 
into the kernel, then execute what command to suspend? (there's a 
confusing mishmash of initiators of all the suspend variants. Can i drive 
this by echoing to /sys/power/state?)

Ingo

Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Christian Hesse

On Wednesday 18 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > i took a quick look at suspend2 and it makes some use of yield().
> > > There's a bug in CFS's yield code, i've attached a patch that should
> > > fix it, does it make any difference to the hang?
> >
> > This patch should apply cleanly against what? The second hunk is
> > ignored as it has already been applied. Is this correct?
>
> hm, i think you might have had one of the earlier CFS patches.

You are right.

> > But no, it does not change anything. Let me know if you have any other
> > patches to test.
>
> could you try the -v3 patch i released a few hours ago:
>
>http://redhat.com/~mingo/cfs-scheduler/
>
> although probably your suspend2 problem is still not fixed, it's worth a
> try nevertheless. Which suspend2 patch did you apply, and was it against
> -rc6 or -rc7?

You are right again. ;-)

Linux 2.6.21-rc7
Suspend2 2.2.9.11 (applies cleanly to -rc7)
CFS v3 (without any additional patches)

And it still hangs on suspend.
-- 
Regards,
Chris



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Con Kolivas

On Wednesday 18 April 2007 22:33, Con Kolivas wrote:
> On Wednesday 18 April 2007 22:14, Nick Piggin wrote:
> > On Wed, Apr 18, 2007 at 07:33:56PM +1000, Con Kolivas wrote:
> > > On Wednesday 18 April 2007 18:55, Nick Piggin wrote:
> > > > Again, for comparison 2.6.21-rc7 mainline:
> > > >
> > > > 508.87user 32.47system 2:17.82elapsed 392%CPU
> > > > 509.05user 32.25system 2:17.84elapsed 392%CPU
> > > > 508.75user 32.26system 2:17.83elapsed 392%CPU
> > > > 508.63user 32.17system 2:17.88elapsed 392%CPU
> > > > 509.01user 32.26system 2:17.90elapsed 392%CPU
> > > > 509.08user 32.20system 2:17.95elapsed 392%CPU
> > > >
> > > > So looking at elapsed time, a granularity of 100ms is just behind the
> > > > mainline score. However it is using slightly less user time and
> > > > slightly more idle time, which indicates that balancing might have
> > > > got a bit less aggressive.
> > > >
> > > > But anyway, it conclusively shows the efficiency impact of such tiny
> > > > timeslices.
> > >
> > > See test.kernel.org for how (the now defunct) SD was performing on
> > > kernbench. It had low latency _and_ equivalent throughput to mainline.
> > > Set the standard appropriately on both counts please.
> >
> > I can give it a run. Got an updated patch against -rc7?
>
> I said I wasn't pursuing it but since you're offering, the rc6 patch should
> apply ok.
>
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc6-sd-0.40.patch

Oh and if you go to the effort of trying you may as well try the timeslice 
tweak to see what effect it has on SD as well.

/proc/sys/kernel/rr_interval

100 is the highest.

-- 
-ck

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Ingo Molnar

* Davide Libenzi <[EMAIL PROTECTED]> wrote:

> I think Ingo's idea of a new sched_group to contain the generic 
> parameters needed for the "key" calculation, works better than adding 
> more fields to existing structures (that would, of course, host 
> pointers to it). Otherwise I can already see the struct_signal being 
> the target for other unrelated fields :)

yeah. Another detail is that for global containers like uids, the 
statistics will have to be percpu_alloc()-ed, both for correctness 
(runqueues are per CPU) and for performance.
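
A minimal userspace sketch of what per-CPU statistics for a UID could look
like. The struct and function names, the CPU count and the 64-byte cache line
size are all assumptions for illustration; this is not the CFS code and it does
not use the kernel's per-CPU allocator.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   4	/* assumed CPU count for the sketch */
#define CACHELINE 64	/* assumed cache line size */

/* One slot per CPU, padded to a full cache line to limit false sharing. */
struct uid_cpu_stat {
	unsigned long long exec_ns;
	char pad[CACHELINE - sizeof(unsigned long long)];
};

struct uid_stats {
	struct uid_cpu_stat cpu[NR_CPUS];
};

/* Hot path: each CPU only updates its own slot. */
static void uid_account(struct uid_stats *st, int cpu, unsigned long long ns)
{
	assert(st != NULL && cpu >= 0 && cpu < NR_CPUS);
	st->cpu[cpu].exec_ns += ns;
}

/* Slow path: a reader sums all slots to get the per-UID total. */
static unsigned long long uid_total(const struct uid_stats *st)
{
	unsigned long long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += st->cpu[cpu].exec_ns;
	return sum;
}
```

The cost is one slot per CPU for every accounted entity, which is cheap for a
handful of UIDs and much less attractive for every thread group.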

That's one reason why i dont think it's necessarily a good idea to 
group-schedule threads, we dont really want to do a per thread group 
percpu_alloc().

In fact for threads the _reverse_ problem exists, threaded apps tend to 
_strive_ for more performance - hence their desperation of using the 
threaded programming model to begin with ;) (just think of media 
playback apps which are typically multithreaded)

I dont think threads are all that different. Also, the 
resource-conserving act of using CLONE_VM to share the VM (and to use a 
different programming environment like Java) should not be 'punished' by 
forcing the thread group to be accounted as a single, shared entity 
against other 'fat' tasks.

so my current impression is that we want per UID accounting to solve the 
X problem, the kernel threads problem and the many-users problem, but 
i'd not want to do it for threads just yet because for them there's not 
really any apparent problem to be solved.

Ingo

[RFC][PATCH] fix for async scsi scan sysfs problem

2007-04-18 Thread Josef Bacik

Hello,

I'm having a problem on the newest version of Linus's git tree with my qla2xxx
card.  This is on a UP box, the problem doesn't happen on my similarly
configured SMP box.  When I unload and then try to load the qla2xxx driver again
I get this message

kobject_add failed for 3:0:0:0 with -EEXIST, don't try to register
things with the same name in the same directory.
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] kobject_shadow_add+0xcd/0x1df
 [] kobject_add+0xa/0xc
 [] device_add+0xab/0x62e
 [] scsi_sysfs_add_sdev+0x2d/0x1eb [scsi_mod]
 [] scsi_probe_and_add_lun+0x974/0xaa5 [scsi_mod]
 [] __scsi_scan_target+0xc0/0x5f1 [scsi_mod]
 [] scsi_scan_target+0x97/0xa6 [scsi_mod]
 [] fc_scsi_scan_rport+0x5a/0x76 [scsi_transport_fc]
 [] run_workqueue+0x89/0x14e
 [] worker_thread+0xf8/0x124
 [] kthread+0xb3/0xdc
 [] kernel_thread_helper+0x7/0x10
 ===

I traced this down to the async scanning doing a kobject_add for that object,
the backtrace below shows the path we took to add it.

 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] kobject_shadow_add+0xcd/0x1df
 [] kobject_add+0xa/0xc
 [] class_device_add+0x9e/0x3ad
 [] scsi_sysfs_add_sdev+0x5a/0x1eb [scsi_mod]
 [] do_scan_async+0x62/0xf8 [scsi_mod]
 [] kthread+0xb3/0xdc
 [] kernel_thread_helper+0x7/0x10
 ===

Looking through everything I came to the conclusion that we don't really need
the scsi_sysfs_add_devices in scsi_finish_async_scan, which gets run every time
we do a do_scan_async.  In doing the scanning, if we come upon anything we will
already be registering the device with sysfs so the scsi_sysfs_add_devices step
is kind of useless.  I tested this and it worked fine on my UP box (where the
problem was happening) and my SMP box (where the problem wasn't happening).  Now
I'm not entirely sure if this is correct, but I'm attaching the patch that I
used to fix it for me, please point out if I've done something wrong or if there
is a different way this needs to be fixed.  Thank you,

Josef
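
To illustrate the failure mode only: the toy model below (hypothetical names,
plain userspace C, not kobject code) shows why a second registration of
"3:0:0:0" from another code path comes back with -EEXIST.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Toy directory of names; a stand-in for a sysfs directory. */
struct toy_dir {
	const char *names[MAX_ENTRIES];
	size_t count;
};

/* Add a name; a second add of the same name fails with -EEXIST. */
static int toy_add(struct toy_dir *dir, const char *name)
{
	size_t i;

	assert(dir != NULL && name != NULL);
	for (i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return -EEXIST;
	if (dir->count == MAX_ENTRIES)
		return -ENOSPC;
	dir->names[dir->count++] = name;
	return 0;
}
```

Whichever of the two paths runs second gets the error, which would match the
two backtraces above.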

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 0949145..2c8527b 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1661,15 +1661,6 @@ int scsi_scan_host_selected(struct Scsi_
return 0;
 }
 
-static void scsi_sysfs_add_devices(struct Scsi_Host *shost)
-{
-   struct scsi_device *sdev;
-   shost_for_each_device(sdev, shost) {
-   if (scsi_sysfs_add_sdev(sdev) != 0)
-   scsi_destroy_sdev(sdev);
-   }
-}
-
 /**
  * scsi_prep_async_scan - prepare for an async scan
  * @shost: the host which will be scanned
@@ -1741,8 +1732,6 @@ static void scsi_finish_async_scan(struc
 
wait_for_completion(&data->prev_finished);
 
-   scsi_sysfs_add_devices(shost);
-
spin_lock(&async_scan_lock);
shost->async_scan = 0;
list_del(&data->list);

Re: [RFC] [PATCH] Allow overriding module parameters from kernel command_line

2007-04-18 Thread Andrew Morton

> On Wed, 18 Apr 2007 11:55:52 -0400 Kyle McMartin <[EMAIL PROTECTED]> wrote:
> With the move to initramfs and heavily modular configs, which include
> loading storage drivers from early userspace, it's becoming harder
> to provide users with a way of overriding module parameters at boot.
> 
> Currently, users would have to break into the initramfs, edit the
> modprobe options, and then let boot continue. They have a much easier time
> dealing with adding options on the command line from Grub or what have you.
> 
> I hacked out this patch quickly to re-parse saved_command_line[] when we
> load a module in an attempt to rectify this.
> 
> (The specific use-case I was looking at here was HPA commands failing on
>  sata_nv controllers, and needing to pass the adma=0 option to the module...
>  Users had a hard time testing without an easy way of overriding the module.)
> 
> Clearly this is not entirely optimal, because we're parsing command_line
> after the module params are parsed. This ends up being a policy decision,
> whether the /sbin/modprobe commandline should override the kernel
> command_line, or vice versa.

Similar-but-different: I was trying to persuade a Fedora system to use ext2
for the root filesystem the other day.  Turns out that we somehow managed
to break `rootfstype=' in this situation and it cheerfully continued to use
ext3.

Fixed by changing /etc/fstab and rebuilding initrd, but IMO rootfstype=
should have worked.


Re: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6

2007-04-18 Thread Herbert Xu

Hi Paul:

Paul Mackerras <[EMAIL PROTECTED]> wrote:
> 
> So this doesn't change process_input_packet(), which treats the case
> where the first byte is 0xff (PPP_ALLSTATIONS) but the second byte is
> 0x03 (PPP_UI) as indicating a packet with a PPP protocol number of
> 0xff.  Arguably that's wrong since PPP protocol 0xff is reserved, and
> the RFC does envision the possibility of receiving frames where the
> control field has values other than 0x03.

Your fix is probably needed too.  However, I think the issue that Patrick
was trying to fix is the case where p[0] != PPP_ALLSTATIONS and therefore
we'd still have a problem there.
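
The three cases under discussion can be sketched as a standalone check. This
is an illustration with a hypothetical function name, not the code in
process_input_packet(); only the 0xff and 0x03 values come from the thread.

```c
#include <assert.h>
#include <stddef.h>

#define PPP_ALLSTATIONS 0xff	/* address field */
#define PPP_UI          0x03	/* control field */

/*
 * Return how many address/control bytes precede the protocol field:
 *   2  frame starts with 0xff 0x03
 *   0  first byte is not 0xff (address/control absent)
 *  -1  too short, or 0xff followed by a control value other than 0x03
 */
static int ppp_addr_ctrl_len(const unsigned char *p, size_t len)
{
	assert(p != NULL);
	if (len < 1)
		return -1;
	if (p[0] != PPP_ALLSTATIONS)
		return 0;
	if (len < 2 || p[1] != PPP_UI)
		return -1;
	return 2;
}
```

The 0 case is the one I mean above: the frame does not start with
PPP_ALLSTATIONS, so whatever follows has to handle a header that was never
there.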

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [patch] CFS scheduler, v3

2007-04-18 Thread Ingo Molnar


* William Lee Irwin III <[EMAIL PROTECTED]> wrote:

> It appears to me that the following can be taken in for mainline (or 
> rejected for mainline) independently of the rest of the cfs patch.

yeah - it's a patch written by Suresh, and this should already be in the 
for-v2.6.22 -mm queue. See:

  Subject: [patch] sched: align rq to cacheline boundary

on lkml.

Ingo

Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Greg Freemyer

On 4/18/07, Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> wrote:

> On Wednesday 18 April 2007, Chuck Ebbert wrote:
> > Mark Lord wrote:
> > > Mark Lord wrote:
> > >>
> > >> With the patch applied, I don't see *any* new activity in those
> > >> S.M.A.R.T.
> > >> attributes over multiple hibernates (Linux "suspend-to-disk").
> > >
> > > Scratch that -- operator failure.  ;)
> > > The patch makes no difference over hibernates in the SMART logs.
> > >
> > > It's still logging extra Power-Off_Retract_Count pegs,
> > > which it DID NOT USED TO DO not so long ago.
> > >
> >
> > Just to add to the fun, my problems are happening with the "old"
> > IDE drivers...
>
> The issue you are experiencing results in the same problem (disk doing
> power off retract) but it has a totally different root cause - your notebook
> loses power on reboot.  It is actually a hardware problem and as you have
> reported the same problem is present when using "the other" OS.
>
> I think that the issue needs to be fixed (by detecting affected notebook(s)
> using DMI?) in Linux PM handling and not in IDE subsystem because:
>
> * there may be some other hardware devices affected by the power loss
>   (== they require shutdown sequence)
>
> * the same problem will bite if somebody decides to use libata (FC7?)
>
> Bart

OpenSUSE 10.3 is still in Alpha stage (at least a few months away from
release), but they too have switched to libata by default.  (You can
override by adding a boot param).

Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century

Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Chuck Ebbert

Bartlomiej Zolnierkiewicz wrote:
> On Wednesday 18 April 2007, Chuck Ebbert wrote:
>> Mark Lord wrote:
>>> Mark Lord wrote:
>>>> With the patch applied, I don't see *any* new activity in those
>>>> S.M.A.R.T.
>>>> attributes over multiple hibernates (Linux "suspend-to-disk").
>>> Scratch that -- operator failure.  ;)
>>> The patch makes no difference over hibernates in the SMART logs.
>>>
>>> It's still logging extra Power-Off_Retract_Count pegs,
>>> which it DID NOT USED TO DO not so long ago.
>>>
>> Just to add to the fun, my problems are happening with the "old"
>> IDE drivers...
> 
> The issue you are experiencing results in the same problem (disk doing
> power off retract) but it has a totally different root cause - your notebook
> loses power on reboot.  It is actually a hardware problem and as you have
> reported the same problem is present when using "the other" OS.
> 

My "power off retract count" increases whether I do a halt/poweroff or
a reboot. The only difference is the volume of the noise.

And I just noticed my "seek error rate" is increasing.

/me plans purchase of another drive, definitely not Seagate...

> I think that the issue needs to be fixed (by detecting affected notebook(s)
> using DMI?) in Linux PM handling and not in IDE subsystem because:
>
> * there may be some other hardware devices affected by the power loss
>   (== they require shutdown sequence)
>
> * the same problem will bite if somebody decides to use libata (FC7?)

Yeah, this needs fixing too. I've been playing with another notebook and
the power does stay on during reboot, so I wonder how widespread the problem is?

Re: [patch] CFS scheduler, v3

2007-04-18 Thread William Lee Irwin III

On Wed, Apr 18, 2007 at 07:50:17PM +0200, Ingo Molnar wrote:
> this is the third release of the CFS patchset (against v2.6.21-rc7), and 
> can be downloaded from:
>    http://redhat.com/~mingo/cfs-scheduler/
> this is a pure "fix reported regressions" release so there's much less 
> churn:
>    5 files changed, 71 insertions(+), 29 deletions(-)
> (the added lines are mostly debug related, not genuine increase in the 
> scheduler's size)

It appears to me that the following can be taken in for mainline
(or rejected for mainline) independently of the rest of the cfs patch.


-- wli

Mark the runqueues cacheline_aligned_in_smp to avoid false sharing.

Index: sched/kernel/sched.c
===
--- sched.orig/kernel/sched.c   2007-04-18 14:10:03.593207728 -0700
+++ sched/kernel/sched.c	2007-04-18 14:11:39.270660075 -0700
@@ -278,7 +278,7 @@
 	struct lock_class_key rq_lock_key;
 };
 
-static DEFINE_PER_CPU(struct rq, runqueues);
+static DEFINE_PER_CPU(struct rq, runqueues) cacheline_aligned_in_smp;
 
 static inline int cpu_of(struct rq *rq)
 {
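
A userspace sketch of what the alignment buys, not kernel code: the two
structs and the 64-byte line size are assumptions made up for illustration,
and C11 alignas stands in for the kernel's cacheline annotation. When
objects are packed back to back, two CPUs' runqueues can sit in one cache
line, so a write by one CPU invalidates the other's copy. Padding each
object out to a line boundary avoids that.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE 64 /* assumed line size; real hardware varies */

/* Hypothetical stand-in for a per-CPU runqueue, packed back to back. */
struct rq_plain {
	unsigned long nr_running;
	unsigned long nr_switches;
};

/* Same fields, but every instance starts on its own cache line. */
struct rq_aligned {
	alignas(CACHELINE) unsigned long nr_running;
	unsigned long nr_switches;
};

/*
 * For an array that starts on a line boundary, report whether the last
 * byte of element 0 and the first byte of element 1 land on the same line.
 */
static int neighbours_share_line(size_t elem_size)
{
	assert(elem_size > 0);
	return (elem_size - 1) / CACHELINE == elem_size / CACHELINE;
}
```

With these definitions, neighbours_share_line(sizeof(struct rq_plain))
reports sharing, while the aligned variant does not, at the cost of padding
every instance to a full line.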

Re: [PATCH][RFC] Kill off legacy power management stuff.

2007-04-18 Thread Len Brown

On Wednesday 18 April 2007 16:23, Robert P. J. Day wrote:
> On Wed, 18 Apr 2007, Len Brown wrote:

> > Here is how it should work. CONFIG_ACPI and CONFIG_APM should both
> > be available in a kernel build. However, at boot time, if ACPI is
> > active, then APM should be disabled.
> >
> > The pm_active flag used to handle this, but that method was BROKEN
> > when the CONFIG_PM_LEGACY #define was added.  Today, there are
> > systems (such as the Thinkpad T30) that will not boot if
> > CONFIG_PM_LEGACY is not defined.  The reason nobody is complaining
> > is because the distros are currently defining CONFIG_PM_LEGACY.
> > But when you nuke that option and everything under it, this bug will
> > be exposed and some systems will stop booting.
> 
> ok, i get it now and -- correct me if i'm wrong -- all my legacy PM
> removal patch was doing was exposing a design boo-boo in which
> APM/ACPI contention was being handled by a macro in a subsystem even
> older than either of them, right?

yeah, it didn't start out that way, the bug was added when the
CONFIG_PM_LEGACY #define was added.

> so all that needs to be done is add 
> back in a contention solution of some kind that doesn't rely on that
> ancient system, yes?

Yes, it is a matter of making the variable not go away when
the #define goes away.
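
Roughly, the arbitration looks like the sketch below. This is a simplified
userspace model and not the kernel code: only the name pm_active comes from
this thread, while the two init functions and their return values are
hypothetical. The point is that the flag has to exist no matter which
config options are set, because both subsystems consult it.

```c
/*
 * Shared "someone already owns power management" flag. The name comes
 * from the thread; the logic below is an assumed simplification.
 */
static int pm_active;

/* Hypothetical init hook: 0 on success, -1 if ACPI does not take over. */
static int acpi_init_sketch(int firmware_has_acpi)
{
	if (!firmware_has_acpi || pm_active)
		return -1;
	pm_active = 1;
	return 0;
}

/* Hypothetical init hook: 0 on success, -1 if APM backs off. */
static int apm_init_sketch(int firmware_has_apm)
{
	if (!firmware_has_apm || pm_active)
		return -1;	/* ACPI got there first: APM stays off */
	pm_active = 1;
	return 0;
}
```

If the flag is compiled out together with the legacy option, the second
check can never fire and both can end up believing they are active.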

> as for that thinkpad t30 situation, well, that's just borked, and
> should be fixed.

yes, the actual failure is that APM mode on the T30 hangs -- and that is
independent of the issue at hand.  However, there could be other
failures on other machines when both APM and ACPI think they are active.

> rday
> 
> p.s.  at the risk of repeating myself repetitively, do we now agree
> that what i was trying to remove *was* adequately ancient?  although
> it's clear that it has to be done slightly more carefully than was
> done in my initial patch.

yes, I think so.

> p.p.s.  patch improvements that will let me avoid doing any of that
> myself always welcome. :-)

well, I'm sorry that I've known about the APM issue for a long time
and done nothing about it.  I did ping davej when he broke it,
but his to-do list is probably even longer than mine.

-Len

Re: [PATCH] fix OOM killing processes wrongly thought MPOL_BIND

2007-04-18 Thread William Lee Irwin III

On Wed, Apr 18, 2007 at 08:35:22PM +0100, Hugh Dickins wrote:
> I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog
> to see lots of other processes killed with "No available memory (MPOL_BIND)".
> memhog is killed correctly once we initialize nodemask in constrained_alloc().
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
> Perhaps appropriate for 2.6.20-stable too - regression since 2.6.19.

This is a clear fix for an uninitialized variable.
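
To illustrate the bug class, here is a small model that assumes the check
works by starting from a mask of nodes and clearing every node the
zonelist covers, with leftover bits meaning "constrained by memory
policy". This is not the mm/oom_kill.c code: the function name, the plain
unsigned long mask and the enum are stand-ins. The starting mask is passed
in explicitly so that a stale value can be shown without relying on
undefined behaviour.

```c
enum constraint_sketch {
	SKETCH_CONSTRAINT_NONE,
	SKETCH_CONSTRAINT_MEMORY_POLICY,
};

/*
 * Clear each node present in the zonelist from the starting mask.
 * Any bit still set afterwards is read as "allocation was restricted".
 */
static enum constraint_sketch classify(unsigned long start_mask,
				       const int *zonelist_nodes, int n)
{
	unsigned long nodes = start_mask;

	for (int i = 0; i < n; i++)
		nodes &= ~(1UL << zonelist_nodes[i]);

	return nodes ? SKETCH_CONSTRAINT_MEMORY_POLICY
		     : SKETCH_CONSTRAINT_NONE;
}
```

Started from the real set of online nodes (say 0x3 for nodes 0 and 1), a
zonelist covering both nodes yields "none". Started from stack garbage
such as 0xF3, the same zonelist yields "memory policy", which would match
unrelated processes being killed with the MPOL_BIND message.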


Acked-by: William Irwin <[EMAIL PROTECTED]>


-- wli

Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Bartlomiej Zolnierkiewicz


On Wednesday 18 April 2007, Tejun Heo wrote:
> Mark Lord wrote:
> > Chuck Ebbert wrote:
> >> Mark Lord wrote:
> >>> I'll patch it locally on my own machines, but what about the tens
> >>> of thousands of other Seagate notebook drive owners out there?
> >>>
> >>
> >> This is a problem with Seagate specifically, spinning back up
> >> on receipt of some command after spindown?
> > 
> > No, they just seem to be affected worse by it than some other brands.
> > The bug is that libata/SCSI now spin-down the drive before the distro's
> > scripts are done with it, so it spins down, and then gets spun up again
> > by the distro, and then spun down again by the distro.
> > 
> > And along the way, one/both of the two causes a full mechanism "park",
> > which is hard on things if abused (like this).
> > 
> > Or at least that's what I recall for it.  Tejun?
> 
> This really isn't a regression.  It's been always like that with libata.

Tejun, it is a regression over IDE subsystem
(so all PATA and some SATA also).

Dave/Chuck, this also seems like a FC7 regression
(because of the libata PATA switch).

>  libata doesn't make devices go into standby mode and shutdown(8) does
> it for libata.  The problem here is that libata does issue
> SYNCHRONIZE_CACHE on shutdown.  So, the sequence of event is...
> 
> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW
> 2. kernel shutdown starts
> 3. libata shutdown issues SYNCHRONIZE_CACHE
> 4. power goes off
> 
> Some drives seem to spin up at step #3 even when its cache is clean and
> power goes off right after the disk finishes the command.  So, it's
> really bad when it happens - spin down, spin up followed by immediate
> power off.
> 
> SCSI part of the fix is queued in scsi-misc-2.6 tree and libata-dev part
> is acked and waiting to be merged, so the fix will be available in
> 2.6.22.  However, it's disabled by default to remain compatible with the
> current behavior and requires userland change to fully fix the problem.

Re: 2.6.21-rc5 from fc7-rc2 problems

2007-04-18 Thread Len Brown

On Wednesday 18 April 2007 16:23, Jeff Garzik wrote:
> Len Brown wrote:
> > < Linux version 2.6.20-1.2933.fc6
> > < ([EMAIL PROTECTED]) (gcc version 4.1.1 20070105
> > < (Red Hat 4.1.1-51)) #1 SMP Mon Mar 19 11:38:26 EDT 2007
> > ---
> >> Linux version 2.6.20-1.3023.fc7
> >> ([EMAIL PROTECTED]) (gcc version 4.1.2 20070317
> >> (Red Hat 4.1.2-5)) #1 SMP Sun Mar 25 22:12:02 EDT 2007
> > 
> > I agree that the fc7 version string looks strange, because
> > there are other things in the fc7 dmesg which are clearly from 2.6.21,
> > such as this:
> > 
> > < ACPI: Core revision 20060707
> > ---
> >> ACPI: Core revision 20070126
> > 
> > Perhaps you can try building a kernel.org 2.6.21 kernel and running
> > it on your FC6 install?
> > 
> > The ALI15X3 stuff exists only in the working FC6 dmesg:
> > 
> > < Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> > < ide: Assuming 33MHz system bus speed for PIO modes; override with 
> > idebus=xx
> > < ALI15X3: IDE controller at PCI slot :00:0f.0
> > < ACPI: Unable to derive IRQ for device :00:0f.0
> > < ACPI: PCI Interrupt :00:0f.0[A]: no GSI
> > < ALI15X3: chipset revision 195
> > < ALI15X3: not 100% native mode: will probe irqs later
> > < ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:DMA, hdb:pio
> > < ide1: BM-DMA at 0x1008-0x100f, BIOS settings: hdc:DMA, hdd:pio
> > < Probing IDE interface ide0...
> > < hda: HITACHI_DK23CA-20, ATA DISK drive
> > < ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> > < Probing IDE interface ide1...
> > < hdc: MATSHITADVD-ROM SR-8175, ATAPI CD/DVD-ROM drive
> > < ide1 at 0x170-0x177,0x376 on irq 15
> > < Probing IDE interface ide2...
> > < Probing IDE interface ide3...
> > < Probing IDE interface ide4...
> > < Probing IDE interface ide5...
> > < hda: max request size: 128KiB
> > < hda: 39070080 sectors (20003 MB) w/2048KiB Cache, CHS=38760/16/63, 
> > UDMA(66)
> > < hda: cache flushes not supported
> > <  hda: hda1 hda2
> > < ide-floppy driver 0.99.newide
> > 
> > FC7 looks like it is using libata instead:
> > -Len
> > 
> >> SCSI subsystem initialized
> >> libata version 2.20 loaded.
> >> ACPI: Unable to derive IRQ for device :00:0f.0
> >> ACPI: PCI Interrupt :00:0f.0[A]: no GSI
> >> ata1: PATA max UDMA/100 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x00011000
> >> irq 14
> >> ata2: PATA max UDMA/100 cmd 0x00010170 ctl 0x00010376 bmdma 0x00011008
> >> irq 15
> >> scsi0 : pata_ali
> >> PM: Adding info for No Bus:host0
> >> ata1.00: ATA-5: HITACHI_DK23CA-20, 00H1A0A3, max UDMA/100   <
> >> drive can do 100
> >> ata1.00: 39070080 sectors, multi 16: LBA
> >> ata1.00: configured for UDMA/33<=== configured as 33
> >> scsi1 : pata_ali
> >> PM: Adding info for No Bus:host1
> >> ata2.00: ATAPI, max UDMA/33   <=== cd can't be read now
> >> ata2.00: configured for UDMA/33
> >> PM: Adding info for No Bus:target0:0:0
> >> scsi 0:0:0:0: Direct-Access ATA  HITACHI_DK23CA-2 00H1 PQ: 0 ANSI: 
> >> 5
> >> PM: Adding info for scsi:0:0:0:0
> >> PM: Adding info for No Bus:target1:0:0
> >> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> >> ata2.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x12 data 36 in
> >>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> >> ata2: soft resetting port
> 
> It looks like interrupts are not being delivered?

Dunno, both 2.6.20 and 2.6.21 say they're looking on IRQ14 and IRQ15.
ACPI isn't involved at all with those IRQs, as it couldn't find any info
for :00:0f.0[A] and thus the legacy hard-coding for IDE must rule the day.

Is it possible to configure 2.6.21 with the driver that was running in 2.6.20?
If yes, and that works, then we know we didn't somehow otherwise break 
interrupts.

-Len

Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Ingo Molnar


* Christian Hesse <[EMAIL PROTECTED]> wrote:

> > i took a quick look at suspend2 and it makes some use of yield(). 
> > There's a bug in CFS's yield code, i've attached a patch that should 
> > fix it, does it make any difference to the hang?
> 
> This patch should apply cleanly against what? The second hunk is 
> ignored as it has already been applied. Is this correct?

hm, i think you might have had one of the earlier CFS patches.

> But no, it does not change anything. Let me know if you have any other 
> patches to test.

could you try the -v3 patch i released a few hours ago:

   http://redhat.com/~mingo/cfs-scheduler/

although probably your suspend2 problem is still not fixed, it's worth a 
try nevertheless. Which suspend2 patch did you apply, and was it against 
-rc6 or -rc7?

Ingo

Re: AppArmor FAQ

2007-04-18 Thread James Morris

On Wed, 18 Apr 2007, Crispin Cowan wrote:

> James Morris wrote:
> > On Tue, 17 Apr 2007, Alan Cox wrote:
> >   
> >> I'm not sure if AppArmor can be made good security for the general case,
> >> but it is a model that works in the limited http environment
> >> (eg .htaccess) and is something people can play with and hack on and may
> >> be possible to configure to be very secure.
> >> 
> > Perhaps -- until your httpd is compromised via a buffer overflow or 
> > simply misbehaves due to a software or configuration flaw, then the 
> > assumptions being made about its use of pathnames and their security 
> > properties are out the window.
> >   
> How is it that you think a buffer overflow in httpd could allow an
> attacker to break out of an AppArmor profile?

Because you can change the behavior of the application and then bypass 
policy entirely by utilizing any mechanism other than direct filesystem 
access: IPC, shared memory, Unix domain sockets, local IP networking, 
remote networking etc.

This not even considering object aliasing (which would allow you to 
inappropriately access objects with full blessing of policy), as I'm 
assuming that the limited environment Alan is referring to entirely 
prevents them.

Also worth noting here is that you have to consider any limited 
environment as enforcing security policy, and thus its configuration 
becomes an additional component of security policy.

So, your real security policy is actually more complicated than it appears 
to be, is not represented completely in the policy configuration file, and 
must be managed disparately.  And it's still only capable of controlling 
access to filesystem objects.

- James
-- 
James Morris
<[EMAIL PROTECTED]>

Re: Loud "pop" coming from hard drive on reboot

2007-04-18 Thread Bartlomiej Zolnierkiewicz

On Wednesday 18 April 2007, Chuck Ebbert wrote:
> Mark Lord wrote:
> > Mark Lord wrote:
> >>
> >> With the patch applied, I don't see *any* new activity in those
> >> S.M.A.R.T.
> >> attributes over multiple hibernates (Linux "suspend-to-disk").
> > 
> > Scratch that -- operator failure.  ;)
> > The patch makes no difference over hibernates in the SMART logs.
> > 
> > It's still logging extra Power-Off_Retract_Count pegs,
> > which it DID NOT USED TO DO not so long ago.
> > 
> 
> Just to add to the fun, my problems are happening with the "old"
> IDE drivers...

The issue you are experiencing results in the same problem (disk doing
power off retract) but it has a totally different root cause - your notebook
loses power on reboot.  It is actually a hardware problem and as you have
reported the same problem is present when using "the other" OS.

I think that the issue needs to be fixed (by detecting affected notebook(s)
using DMI?) in Linux PM handling and not in IDE subsystem because:

* there may be some other hardware devices affected by the power loss
  (== they require shutdown sequence)

* the same problem will bite if somebody decides to use libata (FC7?)

Bart

Re: Kaffeine problem with CFS

2007-04-18 Thread Ingo Molnar


* S.Çağlar Onur <[EMAIL PROTECTED]> wrote:

> > great! Could you please unapply the hack above and try the proper 
> > fix below, does this one solve the hangs too?
> 
> Instead of that one, i tried CFSv3 and i cannot reproduce the hang 
> anymore, Thanks!...

cool, thanks for the quick turnaround!

Ingo

Re: Kaffeine problem with CFS

2007-04-18 Thread S.Çağlar Onur

On Wed, 18 Apr 2007, Ingo Molnar wrote:
> * S.Çağlar Onur <[EMAIL PROTECTED]> wrote:
> > -   schedule();
> > +   msleep(1);
> >
> > which Ingo sends me to try also has the same effect on me. I cannot
> > reproduce hangs anymore with that patch applied top of CFS while one
> > console checks out SVN repos and other one compiles a small test
> > software.
>
> great! Could you please unapply the hack above and try the proper fix
> below, does this one solve the hangs too?

Instead of that one, i tried CFSv3 and i cannot reproduce the hang anymore, 
Thanks!...

Cheers
-- 
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



Re: [PATCH] fix OOM killing processes wrongly thought MPOL_BIND

2007-04-18 Thread Christoph Lameter

On Wed, 18 Apr 2007, Hugh Dickins wrote:

> I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog
> to see lots of other processes killed with "No available memory (MPOL_BIND)".
> memhog is killed correctly once we initialize nodemask in constrained_alloc().
> 
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>

Acked-by: Christoph Lameter <[EMAIL PROTECTED]>

Re: [RFC 1/2] Input: ff, add FF_RAW effect

2007-04-18 Thread Jiri Slaby

johann deneux wrote:
> Jiri,
> 
> Which solution did you chose to implement? From what I remember, we
> last discussed Dmitry's idea of specifying an axis for an effect, then
> combine several effects to achieve complex effects.

I think you mean motor instead of axis, because I don't push real axes to
the devices, but motor's torques...

> The implementation would specify the axis using the upper bits of the
> effect type.

Ok, if this is preferred, I'll post it with the cost of having more context
switches for a single effect.

This was just a realization of the idea as I thought of it, with the
quick'n'dirty FF_RAW.

thanks,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
 B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> > perhaps a more fitting term would be 'precise group-scheduling'. 
> > Within the lowest level task group entity (be that thread group or 
> > uid group, etc.) 'precise scheduling' is equivalent to 'fairness'.
> 
> Yes. Absolutely. Except I think that at least if you're going to name 
> something "complete" (or "perfect" or "precise"), you should also 
> admit that groups can be hierarchical.

yes. Am i correct to sum up your impression as:

 " Ingo, for you the hierarchy still appears to be an after-thought,
   while in practice it's easily the most important thing! Why are you
   so hung up about 'fairness', it makes no sense!"

right?

and you would definitely be right if you suggested that i neglected the 
'group scheduling' aspects of CFS (except for a minimalistic nice level 
implementation, which is a poor-man's-non-automatic-group-scheduling), 
but i very much know it's important and i'll definitely fix it for -v4.

But please let me explain my reasons for my different focus:

yes, group scheduling in practice is the most important first-layer 
thing, and without it any of the other 'CFS wins' can easily be useless.

Firstly, i have not neglected the group scheduling related CFS 
regressions at all, mainly because there _is_ already a quick hack to 
check whether group scheduling would solve these regressions: renice. 
And it was tried in both of the two CFS regression cases i'm aware of: 
Mike's X starvation problem and Willy's "kevents starvation with 
thousands of scheddos tasks running" problem. And in both cases, 
applying the renice hack [which should be properly and automatically 
implemented as uid group scheduling] fixed the regression for them! So i 
was not worried at all, group scheduling _provably solves_ these CFS 
regressions. I rather concentrated on the CFS regressions that were much 
less clear.

But PLEASE believe me: even with perfect cross-group CPU allocation but 
with a simple non-heuristic scheduler underlying it, you can _easily_ 
get a sucky desktop experience! I know it because i tried it and others 
tried it too. (in fact the first version of sched_fair.c was tick based 
and low-res, and it sucked)

Two more things were needed:

  - the high precision of nsec/64-bit accounting
    ('reliability of scheduling')

  - extremely even time-distribution of CPU power
    ('determinism/smoothness, human perception')

(i'm expanding on these two concepts further below)

take out any of these and group scheduling or not, you are easily going 
to have a sucky desktop! (We know that from years of experiments: many 
people tried to rip out the unfairness from the scheduler and there were 
always nasty corner cases that 'should' have worked but didnt.)

Without these we'd in essence start again at square one, just at a 
different square, this time with another group of people being 
irritated!

But the biggest and hardest to achieve _wins_ of CFS are _NOT_ achieved 
via a simple 'get rid of the unfairness of the upstream scheduler and 
apply group scheduling'. (I know that because i tried it before and 
because others tried it before, for many many years.) You will _easily_ 
get sucky desktop experience. The other two things are very much needed 
too:

 - the high precision of nsec/64-bit accounting, and the many
   corner-cases this solves. (For example on a typical desktop there are
   _lots_ of timing-driven workloads that are in essence 'invisible' to
   low-resolution, timer-tick based accounting and are heavily skewed.)

 - extremely even time-distribution of CPU power. CFS behaves pretty
   well even under the dreaded 'make -jN in an xterm' kernel build
   workload as reported by Mark Lord, because it also distributes CPU
   power in a _finegrained_ way. A shell prompt under CFS still behaves
   acceptably on a single-CPU testbox of mine with a "make -j50"
   workload. (yes, fifty) Humans react a lot more negatively to sudden
   changes in application behavior ('lags', pauses, short hangs) than
   they react to fine, gradual, all-encompassing slowdowns. This is a
   key property of CFS.

  ( Otherwise renicing X to -10 would have solved most of the
    interactivity complaints against the vanilla scheduler, otherwise
    renicing X to -10 would have fixed Mike's setup under SD (it didnt)
    while it worked much better under CFS, otherwise Gene wouldnt have
    found CFS markedly better than SD, etc., etc. So getting rid of the
    heuristics is less than 50% of the road to the perfect desktop
    scheduler. )

and i claim that these were the really hard bits, and i spent most of 
the CFS coding only on getting _these_ details 100% right under various 
workloads, and it makes a night and day difference _even without any 
group scheduling help_.
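
As a toy illustration of the accounting point (a model under stated
assumptions, not code from sched_fair.c): assume a 1 msec tick that bills
a whole tick to whichever task is on the CPU when it fires, and a task
that runs once per tick period. The struct, the function name and the
tick length are all made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NS 1000000ULL /* assumed 1 msec timer tick (HZ=1000) */

struct usage {
	uint64_t tick_ns;	/* charged by sampling at each tick */
	uint64_t precise_ns;	/* charged by measuring actual runtime */
};

/*
 * A task runs once per tick period, from start_off_ns for run_ns, and
 * must finish within the period. The tick fires at offset 0 of every
 * period and bills a whole tick to whoever is on the CPU at that instant.
 */
static struct usage account(uint64_t start_off_ns, uint64_t run_ns,
			    unsigned int periods)
{
	struct usage u = { 0, 0 };

	assert(start_off_ns + run_ns <= TICK_NS);
	for (unsigned int p = 0; p < periods; p++) {
		if (start_off_ns == 0 && run_ns > 0)
			u.tick_ns += TICK_NS;
		u.precise_ns += run_ns;
	}
	return u;
}
```

In this model a task that wakes 50 usecs after every tick and runs for
900 usecs is billed nothing by tick sampling although it used 90% of the
CPU, while a task that runs 100 usecs starting exactly at the tick is
billed for all of it. Measuring the actual runtime in nanoseconds gets
both cases right.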

and note another reason here: group scheduling _masks_ many other 
scheduling deficiencies that are possible in a scheduler. So since CFS 
doesnt do group scheduling, i get a _fuller_

Re: Stupid GIT question...

2007-04-18 Thread Andreas Schwab

[EMAIL PROTECTED] writes:

> What's the command to get a diff of "what I would merge if I said 'git pull'?"

$ git fetch
$ git diff master origin

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: Performance degradation with FFSB between 2.6.20 and 2.6.21-rc7

2007-04-18 Thread Andrew Morton

> On Wed, 18 Apr 2007 15:54:00 +0200 Valerie Clement <[EMAIL PROTECTED]> wrote:
> 
> Running benchmark tests (FFSB) on an ext4 filesystem, I noticed a 
> performance degradation (about 15-20 percent) in sequential write tests 
> between 2.6.19-rc6 and 2.6.21-rc4 kernels.
> 
> I ran the same tests on ext3 and XFS filesystems and I saw the same 
> performance difference between the two kernel versions for these two 
> filesystems.
> 
> I have also reproduced it between 2.6.20.7 and 2.6.21-rc7.
> The FFSB tests run 16 threads, each creating 1GB files. The tests were 
> done on the same x86_64 system, with the same kernel configuration and 
> on the same scsi device. Below are the throughput values given by FFSB.
> 
>   kernel      XFS         ext3
> ------------------------------------
>   2.6.20.7    48 MB/sec   44 MB/sec
> 
>   2.6.21-rc7  38 MB/sec   37 MB/sec
> 
> Did anyone else run across the problem?
> Is there a known issue?
> 

That's a new discovery, thanks.

It could be due to I/O scheduler changes.  Which one are you using?  CFQ?

Or it could be that there has been some changed behaviour at the VFS/pagecache
layer: the VFS might be submitting little hunks of lots of files, rather than
large hunks of few files.

Or it could be a block-layer thing: perhaps some driver change has caused
us to be placing less data into the queue.  Which device driver is that machine
using?

Being a simple soul, the first thing I'll try when I get near a test box
will be

for i in $(seq 1 16)
do
	time dd if=/dev/zero of=$i bs=1M count=1024 &
done

Re: 2.6.21-rc5 from fc7-rc2 problems

2007-04-18 Thread Chuck Ebbert

Stephen Clark wrote:
> Chuck Ebbert wrote:
> 
>> Stephen Clark wrote:
>>  
>>
>>> Hello,
>>>
>>> I have just tried booting the fc7-rc2 live cd on 2 of my laptops and it
>>> failed on both.
>>>
>>>   
>>
>> FC7 test4 will be out any day now. Please test that -- test2 is ancient
>> now.
>>
>>
>>  
>>
> Ok I'll try that when it comes out - I was actually using the livecd
> version will the new version
> have a livecd also?

I'm pretty sure the Live CD will be part of the release.

Re: [PATCH] [RFC] Throttle swappiness for interactive tasks

2007-04-18 Thread Rik van Riel


अभिजित भोपटकर (Abhijit Bhopatkar) wrote:

> The mm structures of interactive tasks are marked and
> the pages belonging to them are never shifted to inactive
> list in lru algorithm. Thus keeping interactive tasks in
> memory as long as possible.
> The interactivity is already determined by scheduler so
> we reuse that knowledge to mark the mm structures.


Aside from the obvious question of whether the idea is good,
there are some practical problems with your patch:

1) the mm->interactive flag is never cleared, even if the
   task stops being interactive

2) what if the interactive tasks use up more memory than
   the system has?  Will you OOM kill instead of swapping
   out part of an interactive task?

3) the scheduler can change its idea about which task is
   interactive and which task isn't very rapidly, while
   disk IO is very slow - the scheduler's classification
   may not be useful on swap timescales

4) a currently completely idle task can still be marked
   interactive in the scheduler, even if it has been
   idle for days.  Such a task is an obvious good
   candidate for swapout, isn't it?
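
Points 1 and 4 can be pictured with a small model. Everything in it is
hypothetical: struct mm_sketch, its fields and the max_age parameter do
not exist in the kernel or in the patch, and the expiring variant is only
there for contrast, not a proposed fix.

```c
/* Hypothetical per-mm state for illustration only. */
struct mm_sketch {
	int interactive;		/* sticky flag, as in the patch */
	unsigned long last_marked;	/* time of the most recent marking */
};

/* Record that the scheduler considered the task interactive at `now`. */
static void mark_interactive(struct mm_sketch *mm, unsigned long now)
{
	mm->interactive = 1;
	mm->last_marked = now;
}

/* Sticky check: once set, the pages stay protected forever. */
static int protected_sticky(const struct mm_sketch *mm)
{
	return mm->interactive;
}

/* Expiring check: protection lapses once the marking is older than max_age. */
static int protected_expiring(const struct mm_sketch *mm, unsigned long now,
			      unsigned long max_age)
{
	return mm->interactive && now - mm->last_marked <= max_age;
}
```

A task marked once and then idle for a day is still protected by the
sticky check, while any check that takes the age of the marking into
account would let its pages go.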

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
