[PATCH -mm] workqueue: debug possible lockups in flush_workqueue
Hi,

Here is my patch proposal for detecting possible lockups,
when flush_workqueue caller holds a lock (e.g. rtnl_lock)
also used in work functions.

Regards,
Jarek P.

Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nurp 2.6.21-rc6-mm1-/kernel/workqueue.c 2.6.21-rc6-mm1/kernel/workqueue.c
--- 2.6.21-rc6-mm1-/kernel/workqueue.c	2007-04-18 20:07:45.0 +0200
+++ 2.6.21-rc6-mm1/kernel/workqueue.c	2007-04-18 21:29:50.0 +0200
@@ -67,6 +67,12 @@ struct workqueue_struct {
 /* All the per-cpu workqueues on the system, for hotplug cpu to add/remove
    threads to each one as cpus come/go. */
 static DEFINE_MUTEX(workqueue_mutex);
+
+#ifdef CONFIG_PROVE_LOCKING
+/* Detect possible flush_workqueue() lockup with circular dependency check. */
+static struct lockdep_map flush_dep_map = { .name = "flush_dep_map" };
+#endif
+
 static LIST_HEAD(workqueues);
 
 static int singlethread_cpu __read_mostly;
@@ -247,8 +253,15 @@ static void run_workqueue(struct cpu_wor
 
 		BUG_ON(get_wq_data(work) != cwq);
 		work_clear_pending(work);
+#ifdef CONFIG_PROVE_LOCKING
+		/* lockdep dependency: flush_dep_map (read) before any lock: */
+		lock_acquire(&flush_dep_map, 0, 0, 1, 2, _THIS_IP_);
+#endif
 		f(work);
+#ifdef CONFIG_PROVE_LOCKING
+		lock_release(&flush_dep_map, 1, _THIS_IP_);
+#endif
 
 		if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
 			printk(KERN_ERR "BUG: workqueue leaked lock or atomic: "
 					"%s/0x%08x/%d\n",
@@ -389,6 +402,14 @@ void fastcall flush_workqueue(struct wor
 	int cpu;
 
 	might_sleep();
+#ifdef CONFIG_PROVE_LOCKING
+	/*
+	 * Add lockdep dependency: flush_dep_map (exclusive)
+	 * after any held mutex or rwsem.
+	 */
+	lock_acquire(&flush_dep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_release(&flush_dep_map, 1, _THIS_IP_);
+#endif
 	for_each_cpu_mask(cpu, *cpu_map)
 		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
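The circular dependency this patch wants lockdep to notice can be pictured
with a small userspace model. The sketch below is not lockdep and not part
of the patch; it only assumes that a lock-order checker records "B was
taken while A was held" edges and complains when a new edge closes a
cycle. The names dep_add(), dep_reachable() and the two lock constants are
made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Toy model of a lock-order graph; this is not the kernel's lockdep. */
#define MAX_LOCKS 8

enum { RTNL = 0, FLUSH_DEP_MAP = 1 };	/* illustrative lock classes */

/* edge[a][b] is true when lock b was acquired while lock a was held. */
static bool edge[MAX_LOCKS][MAX_LOCKS];

void dep_reset(void)
{
	memset(edge, 0, sizeof(edge));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
bool dep_reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++)
		if (edge[from][next] && !seen[next] &&
		    dep_reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "taken is acquired while held is held".
 * Returns true when the new edge closes a cycle, i.e. a possible deadlock.
 */
bool dep_add(int held, int taken)
{
	bool seen[MAX_LOCKS] = { false };
	bool cycle = dep_reachable(taken, held, seen);

	edge[held][taken] = true;
	return cycle;
}
```

In this model a work function that takes rtnl_lock records the edge
FLUSH_DEP_MAP -> RTNL (the pseudo-lock is "held" around f(work)), and a
caller that flushes while holding rtnl_lock then tries to add
RTNL -> FLUSH_DEP_MAP, which dep_add() reports as a cycle.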
Re: [PATCH 0/8] RSS controller based on process containers (v2)
Pavel Emelianov wrote:
> Peter Zijlstra wrote:
>> *ugh* /me no like.
>>
>> The basic premises seems to be that we can track page owners perfectly
>> (although this patch set does not yet do so), through get/release
>
> It looks like you have examined the patches not very carefully
> before concluding this. These patches DO track page owners.
>
> I know that a page may be shared among several containers and
> thus have many owners so we should track all of them. This is
> exactly what we decided not to do half-a-year ago.
>
> Page sharing accounting is performed in OpenVZ beancounters, and
> this functionality will be pushed to mainline after this simple
> container.
>
>> operations (on _mapcount).
>>
>> This is simply not true for unmapped pagecache pages. Those receive no
>> 'release' event; (the usage by find_get_page() could be seen as 'get').
>
> These patches concern the mapped pagecache only. Unmapped pagecache
> control is out of the scope of it since we do not want one container
> to track all the resources.

Unmapped pagecache control and swapcache control are part of an
independent pagecache controller that is being developed. The initial
version was posted at http://lkml.org/lkml/2007/3/06/51

I plan to post a new version based on this patchset in a couple of days.

--Vaidy

>> Also, you don't seem to balance the active/inactive scanning on a per
>> container basis. This skews the per container working set logic.
>
> This is not true. Balbir sent a patch to the first version of this
> container that added active/inactive balancing to the container.
> I have included this (a bit reworked) patch into this version and
> pointed this fact in the zeroth letter.
>

[snip]
Re: [PATCH] [RFC] Throttle swappiness for interactive tasks
> > I just wanted to know whether its worth going forward or we have
> > better reasons to discount any such direction?
>
> The reason that the wrong pages get swapped out sometimes could
> be due to a side effect of the way the swappiness policy is
> implemented.  While the VM only reclaims page cache pages, it
> will still rotate through the anonymous pages on the LRU list,
> which effectively randomizes the order of those pages on the list.

In my mind I find it fundamentally wrong to separate anon pages from
page cache. It should rather be a lot more dependent on which task
accessed them last. Although it seems that, due to some twisted
relationship between anon pages and interactive tasks, separating them
improves it. Am I missing something here?

> I need to get back to benchmarking my patch to split the lists -
> anonymous and other swap backed pages on one set of pageout
> lists, filesystem backed pages on another list.
>
> Unfortunately my main desktop system at home depends on Xen, so
> it's not as easy to use that patch there :(

Can you send me those patches please, or point me to where I can find
them?

Abhijit
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Thu, 19 Apr 2007 05:18:07 +0200 Nick Piggin <[EMAIL PROTECTED]> wrote:

> And yes, by fairly, I mean fairly among all threads as a base resource
> class, because that's what Linux has always done

Yes, there are potential compatibility problems.  Example: a machine with
100 busy httpd processes and suddenly a big gzip starts up from console or
cron.

Under current kernels, that gzip will take ages and the httpds will take a
1% slowdown, which may well be exactly the behaviour which is desired.

If we were to schedule by UID then the gzip suddenly gets 50% of the CPU
and those httpd's all take a 50% hit, which could be quite serious.

That's simple to fix via nicing, but people have to know to do that, and
there will be a transition period where some disruption is possible.
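The numbers in the httpd/gzip example can be checked with a few lines of C.
This is only a sketch of the arithmetic, not of any scheduler: it assumes
all 100 httpd processes run under one UID and the gzip under another, that
every task is always runnable, and that a per-UID scheduler would split
CPU evenly between users first and between each user's tasks second. The
function names are made up for illustration.

```c
/* Share of one task, in percent, when every runnable thread is equal. */
double per_thread_share(int nr_threads)
{
	return 100.0 / nr_threads;
}

/*
 * Share of one task, in percent, when CPU time is first split evenly
 * between users and then evenly between that user's threads.
 */
double per_uid_share(int nr_users, int threads_of_this_user)
{
	return 100.0 / nr_users / threads_of_this_user;
}
```

With 101 runnable threads, per_thread_share(101) gives each task a little
under 1%, so the httpds together lose about 1% to the gzip. With two
users, per_uid_share(2, 1) hands the lone gzip 50%, while
per_uid_share(2, 100) leaves each httpd with 0.5%, roughly half of what it
had before.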
Re: [PATCH][BUG] Fix possible NULL pointer access in 8250 serial driver
On Thu, 19 Apr 2007 11:28:37 +0900 izumi <[EMAIL PROTECTED]> wrote:

> Russell King wrote:
>
> > NAK.  This means that you change the list of ports available on the
> > machine to be limited to only those which are currently open.  Utterly
> > useless for debugging, where you normally want people to dump the
> > contents of /proc/tty/driver/*.
> >
> > The original patch was better.
> >
>
> Is the original patch sufficient? or is there anything we should
> correct?
>

Would it be better to do something like

--- a/drivers/serial/serial_core.c~a
+++ a/drivers/serial/serial_core.c
@@ -1686,9 +1686,12 @@ static int uart_line_info(char *buf, str
 		pm_state = state->pm_state;
 		if (pm_state)
 			uart_change_pm(state, 0);
-		spin_lock_irq(&port->lock);
-		status = port->ops->get_mctrl(port);
-		spin_unlock_irq(&port->lock);
+		status = 0;
+		if (port->info) {
+			spin_lock_irq(&port->lock);
+			status = port->ops->get_mctrl(port);
+			spin_unlock_irq(&port->lock);
+		}
 		if (pm_state)
 			uart_change_pm(state, pm_state);
 		mutex_unlock(&state->mutex);
_

so that

a) we treat all uart types in the same way and

b) the same problem doesn't occur later with some other driver which is
   assuming an opened device in its ->get_mctrl() handler?
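The effect of guarding the call in the core rather than in each driver can
be shown with a userspace mock. Everything here is hypothetical: the
mock_port and mock_ops types only carry the two fields the argument needs,
locking and power management are left out, and the handler is an invented
example of one that assumes an opened device.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields used. */
struct mock_port;

struct mock_ops {
	unsigned int (*get_mctrl)(struct mock_port *port);
};

struct mock_port {
	void *info;			/* NULL while the device is not open */
	const struct mock_ops *ops;
};

/* A handler that assumes an opened device: it dereferences port->info. */
unsigned int needs_open_get_mctrl(struct mock_port *port)
{
	return *(unsigned int *)port->info;
}

/* Mirrors the proposed check: report 0 instead of calling a closed port. */
unsigned int line_status(struct mock_port *port)
{
	unsigned int status = 0;

	if (port->info)
		status = port->ops->get_mctrl(port);
	return status;
}
```

Because the check sits in line_status(), every handler behind ->get_mctrl
is protected, including ones written later that make the same assumption.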
Re: [RFC 1/2] Input: ff, add FF_RAW effect
Hi,

On Thursday 19 April 2007 00:25, johann deneux wrote:
> On 4/18/07, Jiri Slaby <[EMAIL PROTECTED]> wrote:
> > johann deneux napsal(a):
> > > Jiri,
> > >
> > > Which solution did you choose to implement? From what I remember, we
> > > last discussed Dmitry's idea of specifying an axis for an effect, then
> > > combine several effects to achieve complex effects.
> >
> > I think you mean motor instead of axis, because I don't push real axes to
> > the devices, but motor's torques...
> >
>
> Yes, sorry, I meant motor.
>

I have been thinking about this and I don't think that exporting motor
data is a good idea, at least not in the case of the Phantom driver. The
fact that there are 3 motors is a hardware implementation detail and it is
not interesting for a general application. My understanding is that the
end result of controlling these 3 motors is a force vector (I don't know
if there is such an English term, this is a literal translation from
Russian) applied to the user's hand. If we are interested in using the FF
API we need to come up with a way to express this effect without exposing
implementation details of one particular device.

-- 
Dmitry
Re: [PATCH] [RFC] Throttle swappiness for interactive tasks
Abhijit Bhopatkar wrote:

> I just wanted to know whether its worth going forward or we have
> better reasons to discount any such direction?

The reason that the wrong pages get swapped out sometimes could
be due to a side effect of the way the swappiness policy is
implemented.  While the VM only reclaims page cache pages, it
will still rotate through the anonymous pages on the LRU list,
which effectively randomizes the order of those pages on the list.

I need to get back to benchmarking my patch to split the lists -
anonymous and other swap backed pages on one set of pageout
lists, filesystem backed pages on another list.

One report I got was that the system is more interactive under
very heavy load, and my desktop system at the office seems to
behave better than it used to when I get back to it after a few
days.

Unfortunately my main desktop system at home depends on Xen, so
it's not as easy to use that patch there :(

-- 
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
Success! Was: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues
On Wed, Apr 18, 2007 at 10:45:13PM -0400, Trond Myklebust wrote:
> On Wed, 2007-04-18 at 20:52 -0500, Florin Iucha wrote:
> > It seems that my original problem report had a big mistake!  There is
> > no hang, but at some point the write slows down to a trickle (from
> > 40,000 blocks/s to 22 blocks/s) as can be seen from the iostat log.
>
> Yeah. You only captured the outgoing traffic to the server, but already
> it looks as if there were 'interesting' things going on. In frames 29346
> to 29350, the traffic stops altogether for 5 seconds (I only see
> keepalives) then it starts up again. Ditto for frames 40477-40482
> (another 5 seconds).
...
> Then at around frame 92072, the client starts to send a bunch of RSTs.
> Aha I'll bet that reverting the appended patch fixes the problem.

You win!  Reverting this patch (on top of your previous 5) allowed the
big copy to complete (70GB) as well as successful log-in to gnome!

Acked-By: Florin Iucha <[EMAIL PROTECTED]>

Thanks so much for the patience with this elusive bug and stubborn
bugreporter!

Regards,
florin

> ---
> commit 43d78ef2ba5bec26d0315859e8324bfc0be23766
> Author: Chuck Lever <[EMAIL PROTECTED]>
> Date:   Tue Feb 6 18:26:11 2007 -0500
>
>     NFS: disconnect before retrying NFSv4 requests over TCP
>
>     RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
>     twice on the same connection unless it is the NULL procedure.  Section
>     3.1.1 suggests that the client should disconnect and reconnect if it
>     wants to retry a request.
>
>     Implement this by adding an rpc_clnt flag that an ULP can use to
>     specify that the underlying transport should be disconnected on a
>     major timeout.  The NFSv4 client asserts this new flag, and requests
>     no retries after a minor retransmit timeout.
>
>     Note that disconnecting on a retransmit is in general not safe to do
>     if the RPC client does not reuse the TCP port number when reconnecting.
>
>     See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6
>
>     Signed-off-by: Chuck Lever <[EMAIL PROTECTED]>
>     Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
>
> diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> index a3191f0..c46e94f 100644
> --- a/fs/nfs/client.c
> +++ b/fs/nfs/client.c
> @@ -394,7 +394,8 @@ static void nfs_init_timeout_values(struct rpc_timeout *to, int proto,
>  static int nfs_create_rpc_client(struct nfs_client *clp, int proto,
>  						unsigned int timeo,
>  						unsigned int retrans,
> -						rpc_authflavor_t flavor)
> +						rpc_authflavor_t flavor,
> +						int flags)
>  {
>  	struct rpc_timeout	timeparms;
>  	struct rpc_clnt		*clnt = NULL;
> @@ -407,6 +408,7 @@ static int nfs_create_rpc_client(struct nfs_client *clp, int proto,
>  		.program	= &nfs_program,
>  		.version	= clp->rpc_ops->version,
>  		.authflavor	= flavor,
> +		.flags		= flags,
>  	};
>  
>  	if (!IS_ERR(clp->cl_rpcclient))
> @@ -548,7 +550,7 @@ static int nfs_init_client(struct nfs_client *clp, const struct nfs_mount_data *
>  	 * - RFC 2623, sec 2.3.2
>  	 */
>  	error = nfs_create_rpc_client(clp, proto, data->timeo, data->retrans,
> -					RPC_AUTH_UNIX);
> +					RPC_AUTH_UNIX, 0);
>  	if (error < 0)
>  		goto error;
>  	nfs_mark_client_ready(clp, NFS_CS_READY);
> @@ -868,7 +870,8 @@ static int nfs4_init_client(struct nfs_client *clp,
>  	/* Check NFS protocol revision and initialize RPC op vector */
>  	clp->rpc_ops = &nfs_v4_clientops;
>  
> -	error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour);
> +	error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour,
> +					RPC_CLNT_CREATE_DISCRTRY);
>  	if (error < 0)
>  		goto error;
>  	memcpy(clp->cl_ipaddr, ip_addr, sizeof(clp->cl_ipaddr));
> diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
> index a1be89d..c7a78ee 100644
> --- a/include/linux/sunrpc/clnt.h
> +++ b/include/linux/sunrpc/clnt.h
> @@ -40,6 +40,7 @@ struct rpc_clnt {
>  
>  	unsigned int		cl_softrtry : 1,/* soft timeouts */
>  				cl_intr     : 1,/* interruptible */
> +				cl_discrtry : 1,/* disconnect before retry */
>  				cl_autobind : 1,/* use getport() */
>  				cl_oneshot  : 1,/* dispose after use */
>  				cl_dead     : 1;/* abandoned */
> @@ -111,6 +112,7 @@ struct rpc_create_args {
>  #define RPC_CLNT_CREATE_ONESHOT		(1UL << 3)
>  #define
Re: [RFC 1/2] Input: ff, add FF_RAW effect
On 4/18/07, Jiri Slaby <[EMAIL PROTECTED]> wrote:
> johann deneux napsal(a):
> > Jiri,
> >
> > Which solution did you choose to implement? From what I remember, we
> > last discussed Dmitry's idea of specifying an axis for an effect, then
> > combine several effects to achieve complex effects.
>
> I think you mean motor instead of axis, because I don't push real axes to
> the devices, but motor's torques...

Yes, sorry, I meant motor.

-- 
Johann
[KJ][PATCH]SPIN_LOCK_UNLOCKED cleanup in drivers/s390
SPIN_LOCK_UNLOCKED cleanup, use __SPIN_LOCK_UNLOCKED instead.

Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---
 char/vmlogrdr.c |    6 +++---
 cio/cmf.c       |    2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/char/vmlogrdr.c b/drivers/s390/char/vmlogrdr.c
index b87d3b0..75d61a4 100644
--- a/drivers/s390/char/vmlogrdr.c
+++ b/drivers/s390/char/vmlogrdr.c
@@ -125,7 +125,7 @@ static struct vmlogrdr_priv_t sys_ser[] = {
 	  .recording_name = "EREP",
 	  .minor_num      = 0,
 	  .buffer_free    = 1,
-	  .priv_lock      = SPIN_LOCK_UNLOCKED,
+	  .priv_lock      = __SPIN_LOCK_UNLOCKED(sys_ser[0].priv_lock),
 	  .autorecording  = 1,
 	  .autopurge      = 1,
 	},
@@ -134,7 +134,7 @@ static struct vmlogrdr_priv_t sys_ser[] = {
 	  .recording_name = "ACCOUNT",
 	  .minor_num      = 1,
 	  .buffer_free    = 1,
-	  .priv_lock      = SPIN_LOCK_UNLOCKED,
+	  .priv_lock      = __SPIN_LOCK_UNLOCKED(sys_ser[1].priv_lock),
 	  .autorecording  = 1,
 	  .autopurge      = 1,
 	},
@@ -143,7 +143,7 @@ static struct vmlogrdr_priv_t sys_ser[] = {
 	  .recording_name = "SYMPTOM",
 	  .minor_num      = 2,
 	  .buffer_free    = 1,
-	  .priv_lock      = SPIN_LOCK_UNLOCKED,
+	  .priv_lock      = __SPIN_LOCK_UNLOCKED(sys_ser[2].priv_lock),
 	  .autorecording  = 1,
 	  .autopurge      = 1,
 	}
diff --git a/drivers/s390/cio/cmf.c b/drivers/s390/cio/cmf.c
index 90b22fa..28abd69 100644
--- a/drivers/s390/cio/cmf.c
+++ b/drivers/s390/cio/cmf.c
@@ -476,7 +476,7 @@ struct cmb_area {
 };
 
 static struct cmb_area cmb_area = {
-	.lock = SPIN_LOCK_UNLOCKED,
+	.lock = __SPIN_LOCK_UNLOCKED(cmb_area.lock),
 	.list = LIST_HEAD_INIT(cmb_area.list),
 	.num_channels  = 1024,
 };

-- 
Milind Arun Choudhary
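The point of passing the lock's own name to the initializer can be seen in
a userspace imitation. The sketch below assumes, as I understand the
motivation, that the named form exists so that lock debugging can tell
statically initialized locks apart; the toy_lock type and both TOY_ macros
are invented for illustration and are far simpler than the kernel's
spinlock_t.

```c
#include <string.h>

/* Hypothetical userspace lock type, much simpler than spinlock_t. */
struct toy_lock {
	int locked;
	const char *name;	/* what a lock debugger could print */
};

/* Anonymous initializer: every lock set up this way looks the same. */
#define TOY_LOCK_UNLOCKED		{ .locked = 0, .name = "unnamed" }

/* Named initializer: the argument is stringified into the debug name. */
#define TOY_LOCK_UNLOCKED_NAMED(x)	{ .locked = 0, .name = #x }

struct toy_area {
	struct toy_lock lock;
	int num_channels;
};

/* Same shape as the cmb_area initializer in the patch above. */
struct toy_area toy_area = {
	.lock = TOY_LOCK_UNLOCKED_NAMED(toy_area.lock),
	.num_channels = 1024,
};
```

The macro argument is only turned into a string here, which is why it is
fine to mention toy_area.lock inside the initializer of toy_area itself.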
[KJ][PATCH] i2c: SPIN_LOCK_UNLOCKED cleanup
SPIN_LOCK_UNLOCKED cleanup, use __SPIN_LOCK_UNLOCKED instead

Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---
 i2c-pxa.c     |    2 +-
 i2c-s3c2410.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-pxa.c b/drivers/i2c/busses/i2c-pxa.c
index 14e83d0..d5d44ed 100644
--- a/drivers/i2c/busses/i2c-pxa.c
+++ b/drivers/i2c/busses/i2c-pxa.c
@@ -825,7 +825,7 @@ static const struct i2c_algorithm i2c_pxa_algorithm = {
 };
 
 static struct pxa_i2c i2c_pxa = {
-	.lock	= SPIN_LOCK_UNLOCKED,
+	.lock	= __SPIN_LOCK_UNLOCKED(i2c_pxa.lock),
 	.adap	= {
 		.owner		= THIS_MODULE,
 		.algo		= &i2c_pxa_algorithm,
diff --git a/drivers/i2c/busses/i2c-s3c2410.c b/drivers/i2c/busses/i2c-s3c2410.c
index 556f244..3eb5958 100644
--- a/drivers/i2c/busses/i2c-s3c2410.c
+++ b/drivers/i2c/busses/i2c-s3c2410.c
@@ -570,7 +570,7 @@ static const struct i2c_algorithm s3c24xx_i2c_algorithm = {
 };
 
 static struct s3c24xx_i2c s3c24xx_i2c = {
-	.lock	= SPIN_LOCK_UNLOCKED,
+	.lock	= __SPIN_LOCK_UNLOCKED(s3c24xx_i2c.lock),
 	.wait	= __WAIT_QUEUE_HEAD_INITIALIZER(s3c24xx_i2c.wait),
 	.adap	= {
 		.name			= "s3c2410-i2c",

-- 
Milind Arun Choudhary
Re: [PATCH][BUG] Fix possible NULL pointer access in 8250 serial driver
Russell King wrote:

> NAK.  This means that you change the list of ports available on the
> machine to be limited to only those which are currently open.  Utterly
> useless for debugging, where you normally want people to dump the
> contents of /proc/tty/driver/*.
>
> The original patch was better.

Is the original patch sufficient? or is there anything we should
correct?

Taku Izumi <[EMAIL PROTECTED]>
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Wed, Apr 18, 2007 at 10:49:45PM +1000, Con Kolivas wrote:
> On Wednesday 18 April 2007 22:13, Nick Piggin wrote:
> >
> > The kernel compile (make -j8 on 4 thread system) is doing 1800 total
> > context switches per second (450/s per runqueue) for cfs, and 670
> > for mainline. Going up to 20ms granularity for cfs brings the context
> > switch numbers similar, but user time is still a % or so higher. I'd
> > be more worried about compute heavy threads which naturally don't do
> > much context switching.
>
> While kernel compiles are nice and easy to do I've seen enough criticism of
> them in the past to wonder about their usefulness as a standard benchmark on
> their own.

Actually it is a real workload for most kernel developers including you no
doubt :)

The criticisms of kernbench for the kernel are probably fair in that
kernel compiles don't exercise a lot of kernel functionality (page
allocator and fault paths mostly, IIRC). However as far as I'm concerned,
they're great for testing the CPU scheduler, because it doesn't actually
matter whether you're running in userspace or kernel space for a context
switch to blow your caches. The results are quite stable.

You could actually make up a benchmark that hurts a whole lot more from
context switching, but I figure that kernbench is a real world thing
that shows it up quite well.

> > Some other numbers on the same system
> > Hackbench:          2.6.21-rc7  cfs-v2 1ms[*]  nicksched
> > 10 groups:  Time:        1.332       0.743       0.607
> > 20 groups:  Time:        1.197       1.100       1.241
> > 30 groups:  Time:        1.754       2.376       1.834
> > 40 groups:  Time:        3.451       2.227       2.503
> > 50 groups:  Time:        3.726       3.399       3.220
> > 60 groups:  Time:        3.548       4.567       3.668
> > 70 groups:  Time:        4.206       4.905       4.314
> > 80 groups:  Time:        4.551       6.324       4.879
> > 90 groups:  Time:        7.904       6.962       5.335
> > 100 groups: Time:        7.293       7.799       5.857
> > 110 groups: Time:       10.595       8.728       6.517
> > 120 groups: Time:        7.543       9.304       7.082
> > 130 groups: Time:        8.269      10.639       8.007
> > 140 groups: Time:       11.867       8.250       8.302
> > 150 groups: Time:       14.852       8.656       8.662
> > 160 groups: Time:        9.648       9.313       9.541
>
> Hackbench even more so. A prolonged discussion with Rusty Russell on this
> issue he suggested hackbench was more a pass/fail benchmark to ensure there
> was no starvation scenario that never ended, and very little value should be
> placed on the actual results returned from it.

Yeah, cfs seems to do a little worse than nicksched here, but I
include the numbers not because I think that is significant, but to
show mainline's poor characteristics.
Re: Announce - Staircase Deadline cpu scheduler v0.42
On Thu, Apr 19, 2007 at 12:12:14PM +1000, Con Kolivas wrote:
> On Thursday 19 April 2007 10:41, Con Kolivas wrote:
> > On Thursday 19 April 2007 09:59, Con Kolivas wrote:
> > > Since there is so much work currently ongoing with alternative cpu
> > > schedulers, as a standard for comparison with the alternative virtual
> > > deadline fair designs I've addressed a few issues in the Staircase
> > > Deadline cpu scheduler which improve behaviour likely in a noticeable
> > > fashion and released version 0.41.
> > >
> > > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.41.patch
> > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.41.patch
> > >
> > > and an incremental for those on 0.40:
> > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7/sched-impleme
> > >nt -staircase-deadline-scheduler-further-improvements.patch
> > >
> > > Remember to renice X to -10 for nicest desktop behaviour :)
> > >
> > > Have fun.
> >
> > Oops forgot to cc a few people
> >
> > Nick you said I should still have something to offer so here it is.
> > Peter you said you never saw this design (it's a dual array affair sorry).
> > Gene and Willy you were some of the early testers that noticed the
> > advantages of the earlier designs,
> > Matt you did lots of great earlier testing.
> > WLI you inspired a lot of design ideas.
> > Mike you were the stick.
> > And a few others I've forgotten to mention and include.
>
> Version 0.42
>
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.42.patch

OK, I'll run some tests later today...
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Wed, Apr 18, 2007 at 07:48:21AM -0700, Linus Torvalds wrote:
>
>
> On Wed, 18 Apr 2007, Matt Mackall wrote:
> >
> > Why is X special? Because it does work on behalf of other processes?
> > Lots of things do this. Perhaps a scheduler should focus entirely on
> > the implicit and directed wakeup matrix and optimizing that
> > instead[1].
>
> I 100% agree - the perfect scheduler would indeed take into account where
> the wakeups come from, and try to "weigh" processes that help other
> processes make progress more. That would naturally give server processes
> more CPU power, because they help others
>
> I don't believe for a second that "fairness" means "give everybody the
> same amount of CPU". That's a totally illogical measure of fairness. All
> processes are _not_ created equal.

I believe that unless the kernel is told of these inequalities, then it
must schedule fairly.

And yes, by fairly, I mean fairly among all threads as a base resource
class, because that's what Linux has always done (and if you aggregate
into higher classes, you still need that per-thread scheduling).

So I'm not excluding extra scheduling classes like per-process, per-user,
but among any class of equal schedulable entities, fair scheduling is the
only option because the alternative of unfairness is just insane.

> That said, even trying to do "fairness by effective user ID" would
> probably already do a lot. In a desktop environment, X would get as much
> CPU time as the user processes, simply because it's in a different
> protection domain (and that's really what "effective user ID" means: it's
> not about "users", it's really about "protection domains").
>
> And "fairness by euid" is probably a hell of a lot easier to do than
> trying to figure out the wakeup matrix.

Well my X server has an euid of root, which would mean my X clients can
cause X to do work and eat into root's resources. Or as Ingo said, X
may not be running as root. Seems like just another hack to try to
implicitly solve the X problem and probably create a lot of others
along the way.

All fairness issues aside, in the context of keeping a very heavily
loaded desktop interactive, X is special. That you are trying to think
up funny rules that would implicitly give X better priority is kind of
indicative of that.
Re: [RFC 0/8] Cpuset aware writeback
On Wed, 18 Apr 2007, Ethan Solomita wrote:

>    Any new ETA? I'm trying to decide whether to go back to your original
> patches or wait for the new set. Adding new knobs isn't as important to me as
> having something that fixes the core problem, so hopefully this isn't waiting
> on them. They could always be patches on top of your core patches.
>    -- Ethan

H Sorry. I got distracted and I have sent them to Kame-san who was
interested in working on them.

I have placed the most recent version at
http://ftp.kernel.org/pub/linux/kernel/people/christoph/cpuset_dirty
Re: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues
On Wed, 2007-04-18 at 20:52 -0500, Florin Iucha wrote:
> On Wed, Apr 18, 2007 at 10:11:46AM -0400, Trond Myklebust wrote:
> > Do you have a copy of wireshark or ethereal on hand? If so, could you
> > take a look at whether or not any NFS traffic is going between the
> > client and server once the hang happens?
>
> I used the following command
>
>    tcpdump -w nfs-traffic -i eth0 -vv -tt dst port nfs
>
> to capture
>
>    http://iucha.net/nfs/21-rc7-nfs4/nfs-traffic.bz2
>
> I started the capture before starting the copy and left it to run for
> a few minutes after the traffic slowed to a crawl.
>
> The iostat and vmstat are at:
>
>    http://iucha.net/nfs/21-rc7-nfs4/iostat
>    http://iucha.net/nfs/21-rc7-nfs4/vmstat
>
> It seems that my original problem report had a big mistake!  There is
> no hang, but at some point the write slows down to a trickle (from
> 40,000 blocks/s to 22 blocks/s) as can be seen from the iostat log.

Yeah. You only captured the outgoing traffic to the server, but already
it looks as if there were 'interesting' things going on. In frames 29346
to 29350, the traffic stops altogether for 5 seconds (I only see
keepalives) then it starts up again. Ditto for frames 40477-40482
(another 5 seconds).
...
Then at around frame 92072, the client starts to send a bunch of RSTs.
Aha I'll bet that reverting the appended patch fixes the problem.

The assumption Chuck makes is that if _no_ request bytes have been sent,
yet the request is on the 'receive list' then it must be a resend is
patently false in the case where the send queue just happens to be full.

A better solution would probably be to disconnect the socket following
the ETIMEDOUT handling in call_status().

Cheers
  Trond

---
commit 43d78ef2ba5bec26d0315859e8324bfc0be23766
Author: Chuck Lever <[EMAIL PROTECTED]>
Date:   Tue Feb 6 18:26:11 2007 -0500

    NFS: disconnect before retrying NFSv4 requests over TCP

    RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
    twice on the same connection unless it is the NULL procedure.  Section
    3.1.1 suggests that the client should disconnect and reconnect if it
    wants to retry a request.

    Implement this by adding an rpc_clnt flag that an ULP can use to
    specify that the underlying transport should be disconnected on a
    major timeout.  The NFSv4 client asserts this new flag, and requests
    no retries after a minor retransmit timeout.

    Note that disconnecting on a retransmit is in general not safe to do
    if the RPC client does not reuse the TCP port number when reconnecting.

    See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6

    Signed-off-by: Chuck Lever <[EMAIL PROTECTED]>
    Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index a3191f0..c46e94f 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -394,7 +394,8 @@ static void nfs_init_timeout_values(struct rpc_timeout *to, int proto,
 static int nfs_create_rpc_client(struct nfs_client *clp, int proto,
 						unsigned int timeo,
 						unsigned int retrans,
-						rpc_authflavor_t flavor)
+						rpc_authflavor_t flavor,
+						int flags)
 {
 	struct rpc_timeout	timeparms;
 	struct rpc_clnt		*clnt = NULL;
@@ -407,6 +408,7 @@ static int nfs_create_rpc_client(struct nfs_client *clp, int proto,
 		.program	= &nfs_program,
 		.version	= clp->rpc_ops->version,
 		.authflavor	= flavor,
+		.flags		= flags,
 	};
 
 	if (!IS_ERR(clp->cl_rpcclient))
@@ -548,7 +550,7 @@ static int nfs_init_client(struct nfs_client *clp, const struct nfs_mount_data *
 	 * - RFC 2623, sec 2.3.2
 	 */
 	error = nfs_create_rpc_client(clp, proto, data->timeo, data->retrans,
-					RPC_AUTH_UNIX);
+					RPC_AUTH_UNIX, 0);
 	if (error < 0)
 		goto error;
 	nfs_mark_client_ready(clp, NFS_CS_READY);
@@ -868,7 +870,8 @@ static int nfs4_init_client(struct nfs_client *clp,
 	/* Check NFS protocol revision and initialize RPC op vector */
 	clp->rpc_ops = &nfs_v4_clientops;
 
-	error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour);
+	error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour,
+					RPC_CLNT_CREATE_DISCRTRY);
 	if (error < 0)
 		goto error;
 	memcpy(clp->cl_ipaddr, ip_addr, sizeof(clp->cl_ipaddr));
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index a1be89d..c7a78ee 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -40,6 +40,7
Re: is there any generic GPIO chip framework like IRQ chips?
> >> > So, talking about what an (optional) implementation framework might > >> > look like (and which could handle the SOC, FPGA, I2C, and MFD cases > >> > I've looked at): > > > See patches in following messages ... a preliminary "gpio_chip" core > > for such a framework, plus example support for one SOC family's GPIOs, > > and then updating one board's handling of GPIOs, including over I2C. > > Just to compare, diffstats for GPIODEV: Now, if they were functionally equivalent, such a comparison would be less of an apples/oranges thing! The most useful comparison would focus on technical aspects of the gpio_chip abstraction itself (i.e. $SUBJECT). > it needs work - it doesn't adhere to your own > optimization scheme by using lookup table instead of list. I thought it was more important to address the $SUBJECT first: get a working gpio_chip abstraction which covers all the needed functionality. The patch had a hook for implementing such tweaks, but it wasn't used. The next version you'll see lets the platform code use its own existing lookup code, as part of slimming things down a bit. I also decided to take out the debugfs support. >you speak about constructor > parts which "anyone" can use to construct whatever GPIO API they like, > whereas I'm speaking about exact API implementation which can be used > right away. I most certainly did not speak about "whatever GPIO API they like"!! Quite the contrary, in fact. Please don't put words in my mouth. (You've been doing it quite extensively in this thread; it's rude.) And that "core" patch I posted was clearly usable "right away"; otherwise the two examples _using_ it couldn't have worked. > Well, besides gpio_keys we here have asic3_keys, samcop_keys, > etc. - all that duplication just because the current GPIO API doesn't > allow extensibility to more chips. When I get tired of repeating myself, just remember: the current programming interface *DOES* allow such extensibility. 
That's what it means to be an "interface", rather than an implementation: it defines inputs and outputs, allowing any process that conforms to both. In fact, the patches I sent demonstrated exactly that extensibility. Same interface, additional chips; different implementation inside. > > So you're agreeing that, at a technical level, what I described > > could be augmented by a "caching" facility ... giving a programming > > interface with all the characteristics of your "GPIODEV" thingie. > > > All you're really disagreeing with is bootstrapping issues; and > > whether there is in fact a need for such a layer. The only argument > > I could possibly buy is that it avoids the lookup of (b) ... but > > that doesn't seem to matter in most cases I've looked at. > So, now the most important question is what we all would get > with your approach in the end. > > So, if you could make sure gpiolib.c doesn't contain inefficient > implementation, I can make it comparable to existing implementations that work the same way ... e.g. AT91 and OMAP code. Of course, it's not possible to get away from the cost of function indirection, with a generic gpio_chip abstraction. Or those lookup costs; but as you agreed, those costs don't seem to matter much. And if they ever do matter, caching support would be easy to add. > and make such extensible implementation available by default > for ARM PXA/S3Cxxx/OMAP, then it's for sure cover Handhelds.org's, > and many other peoples' usecases, and that would be highly > appreciated. > > If you could do it for 2.6.22 merge window, that would > straight ideal. I think having an optional gpio_chip, not unlike what was in that one patch, should be reasonable; also, making it work on some platforms that I use. But I don't think there's much overlap between those platforms and what hh.org uses. 
- Dave - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
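A minimal sketch of the interface-versus-implementation point argued above, assuming a hypothetical `gpio_chip` with `get`/`set` hooks and a linear lookup by GPIO number. None of these names or signatures come from the patches posted in this thread; the bitmask-backed chip only stands in for an SOC bank or an I2C expander.

```c
#include <stddef.h>

/* Hypothetical chip abstraction: all names here are illustrative. */
struct gpio_chip {
	unsigned base;   /* first global GPIO number served by this chip */
	unsigned ngpio;  /* number of GPIOs the chip provides (<= 32 here) */
	int  (*get)(struct gpio_chip *chip, unsigned offset);
	void (*set)(struct gpio_chip *chip, unsigned offset, int value);
	void *priv;      /* chip-private state */
};

#define MAX_CHIPS 8
static struct gpio_chip *chips[MAX_CHIPS];
static size_t nchips;

int gpiochip_register(struct gpio_chip *chip)
{
	if (nchips == MAX_CHIPS)
		return -1;
	chips[nchips++] = chip;
	return 0;
}

/* The lookup cost discussed in the thread: a list walk per call.
 * A lookup table or a cached chip pointer could replace this. */
static struct gpio_chip *gpio_to_chip(unsigned gpio)
{
	for (size_t i = 0; i < nchips; i++) {
		struct gpio_chip *c = chips[i];
		if (gpio >= c->base && gpio - c->base < c->ngpio)
			return c;
	}
	return NULL;
}

/* Same calls for every chip; only the function indirection differs. */
int gpio_get_value(unsigned gpio)
{
	struct gpio_chip *c = gpio_to_chip(gpio);
	return c ? c->get(c, gpio - c->base) : -1;
}

int gpio_set_value(unsigned gpio, int value)
{
	struct gpio_chip *c = gpio_to_chip(gpio);
	if (!c)
		return -1;
	c->set(c, gpio - c->base, value);
	return 0;
}

/* Example backend: pins kept in a plain bitmask. */
static int mem_get(struct gpio_chip *chip, unsigned offset)
{
	unsigned *bits = chip->priv;
	return (*bits >> offset) & 1u;
}

static void mem_set(struct gpio_chip *chip, unsigned offset, int value)
{
	unsigned *bits = chip->priv;
	if (value)
		*bits |= 1u << offset;
	else
		*bits &= ~(1u << offset);
}

void mem_chip_init(struct gpio_chip *chip, unsigned base, unsigned ngpio,
		   unsigned *bits)
{
	chip->base = base;
	chip->ngpio = ngpio;
	chip->get = mem_get;
	chip->set = mem_set;
	chip->priv = bits;
}
```

Registering one chip at base 0 and a second at base 32 lets callers use `gpio_set_value(35, 1)` without knowing which chip owns the pin, which is the extensibility being debated; the per-call lookup and indirect call are the costs it trades for that.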
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Ingo Molnar wrote:
* Peter Williams <[EMAIL PROTECTED]> wrote:

And my scheduler for example cuts down the amount of policy code and code size significantly.

Yours is one of the smaller patches mainly because you perpetuate (or you did in the last one I looked at) the (horrible to my eyes) dual array (active/expired) mechanism. That this idea was bad should have been apparent to all as soon as the decision was made to excuse some tasks from being moved from the active array to the expired array. This essentially meant that there would be circumstances where extreme unfairness (to the extent of starvation in some cases) could occur -- the very things that the mechanism was originally designed to prevent (as far as I can gather). Right about then in the development of the O(1) scheduler alternative solutions should have been sought.

in hindsight i'd agree.

Hindsight's a wonderful place isn't it :-) and, of course, it's where I was making my comments from.

But back then we were clearly not ready for fine-grained accurate statistics + trees (cpus are a lot faster at more complex arithmetics today, plus people still believed that low-res can be done well enough), and taking out any of these two concepts from CFS would result in a similarly complex runqueue implementation.

I disagree. The single priority array with a promotion mechanism that I use in the SPA schedulers can do the job of avoiding starvation with no measurable increase in the overhead. Fairness, nice, good interactive responsiveness can then be managed by how you determine tasks' dynamic priorities.

Also, the array switch was just thought to be another piece of 'if the heuristics go wrong, we fall back to an array switch' logic, right in line with the other heuristics. And you have to accept it, mainline's ability to auto-renice make -j jobs (and other CPU hogs) was quite a plus for developers, so it had (and probably still has) quite some inertia.

I agree, it wasn't totally useless especially for the average user. My main problem with it was that the effect of "nice" wasn't consistent or predictable enough for reliable resource allocation. I also agree with the aims of the various heuristics i.e. you have to be unfair and give some tasks preferential treatment in order to give the users the type of responsiveness that they want. It's just a shame that it got broken in the process but as you say it's easier to see these things in hindsight than in the middle of the melee.

Peter
--
Peter Williams [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
Re: [NETLINK] Don't attach callback to a going-away netlink socket
David Miller <[EMAIL PROTECTED]> wrote:
>
> As discussed in this thread there might be other ways to
> approach this, but this fix is good for now.
>
> Patch applied, thank you.

Actually I was going to suggest something like this:

[NETLINK]: Kill CB only when socket is unused

Since we can still receive packets until all references to the
socket are gone, we don't need to kill the CB until that happens.
This also aligns ourselves with the receive queue purging which
happens at that point.

Original patch by Pavel Emelianov who noticed this race condition.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 0be19b7..914884c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -139,6 +139,15 @@ static struct hlist_head *nl_pid_hashfn(struct nl_pid_hash *hash, u32 pid)
 
 static void netlink_sock_destruct(struct sock *sk)
 {
+	struct netlink_sock *nlk = nlk_sk(sk);
+
+	WARN_ON(mutex_is_locked(nlk_sk(sk)->cb_mutex));
+	if (nlk->cb) {
+		if (nlk->cb->done)
+			nlk->cb->done(nlk->cb);
+		netlink_destroy_callback(nlk->cb);
+	}
+
 	skb_queue_purge(&sk->sk_receive_queue);
 
 	if (!sock_flag(sk, SOCK_DEAD)) {
@@ -147,7 +156,6 @@ static void netlink_sock_destruct(struct sock *sk)
 	}
 	BUG_TRAP(!atomic_read(&sk->sk_rmem_alloc));
 	BUG_TRAP(!atomic_read(&sk->sk_wmem_alloc));
-	BUG_TRAP(!nlk_sk(sk)->cb);
 	BUG_TRAP(!nlk_sk(sk)->groups);
 }
 
@@ -450,17 +458,7 @@ static int netlink_release(struct socket *sock)
 
 	netlink_remove(sk);
 	nlk = nlk_sk(sk);
 
-	mutex_lock(nlk->cb_mutex);
-	if (nlk->cb) {
-		if (nlk->cb->done)
-			nlk->cb->done(nlk->cb);
-		netlink_destroy_callback(nlk->cb);
-		nlk->cb = NULL;
-	}
-	mutex_unlock(nlk->cb_mutex);
-
-	/* OK. Socket is unlinked, and, therefore,
-	   no new packets will arrive */
+	/* OK. Socket is unlinked. */
 	sock_orphan(sk);
 	sock->sk = NULL;
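The idea in the patch description, tearing the callback down in the destructor once the last reference is gone rather than at release time, can be illustrated with a small userspace sketch. Everything here (`sock_like`, `callback`, `count_done`, the plain non-atomic counter) is a hypothetical stand-in and not the netlink code itself.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a dump callback with an optional done hook. */
struct callback {
	void (*done)(struct callback *cb);
	int *done_count;   /* lets the example observe when done runs */
};

/* Hypothetical stand-in for a socket with a reference count. */
struct sock_like {
	int refcnt;            /* non-atomic: single-threaded sketch only */
	struct callback *cb;
};

/* Example done hook: just counts how often it was invoked. */
void count_done(struct callback *cb)
{
	(*cb->done_count)++;
}

struct sock_like *sock_like_new(struct callback *cb)
{
	struct sock_like *sk = malloc(sizeof(*sk));
	if (!sk)
		return NULL;
	sk->refcnt = 1;
	sk->cb = cb;
	return sk;
}

void sock_like_hold(struct sock_like *sk)
{
	sk->refcnt++;
}

/* Runs only when no references remain, so nothing can race with it. */
static void sock_like_destruct(struct sock_like *sk)
{
	if (sk->cb) {
		if (sk->cb->done)
			sk->cb->done(sk->cb);
		sk->cb = NULL;
	}
	free(sk);
}

/* Drop one reference; returns 1 if this call destroyed the object. */
int sock_like_put(struct sock_like *sk)
{
	if (--sk->refcnt > 0)
		return 0;
	sock_like_destruct(sk);
	return 1;
}

/* Release drops the owner's reference but leaves the callback alone. */
int sock_like_release(struct sock_like *sk)
{
	return sock_like_put(sk);
}
```

With a second reference held (say, by an in-flight packet), `sock_like_release()` leaves the callback attached; the `done` hook runs exactly once, when the final `sock_like_put()` triggers the destructor.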
Announce - Staircase Deadline cpu scheduler v0.42
On Thursday 19 April 2007 10:41, Con Kolivas wrote:
> On Thursday 19 April 2007 09:59, Con Kolivas wrote:
> > Since there is so much work currently ongoing with alternative cpu
> > schedulers, as a standard for comparison with the alternative virtual
> > deadline fair designs I've addressed a few issues in the Staircase
> > Deadline cpu scheduler which improve behaviour likely in a noticeable
> > fashion and released version 0.41.
> >
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.41.patch
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.41.patch
> >
> > and an incremental for those on 0.40:
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7/sched-implement-staircase-deadline-scheduler-further-improvements.patch
> >
> > Remember to renice X to -10 for nicest desktop behaviour :)
> >
> > Have fun.
>
> Oops forgot to cc a few people
>
> Nick you said I should still have something to offer so here it is.
> Peter you said you never saw this design (it's a dual array affair sorry).
> Gene and Willy you were some of the early testers that noticed the
> advantages of the earlier designs,
> Matt you did lots of great earlier testing.
> WLI you inspired a lot of design ideas.
> Mike you were the stick.
> And a few others I've forgotten to mention and include.

Version 0.42

http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.42.patch

--
-ck
Re: [RFC 0/8] Cpuset aware writeback
Christoph Lameter wrote:
On Wed, 21 Mar 2007, Ethan Solomita wrote:
Christoph Lameter wrote:
On Thu, 1 Feb 2007, Ethan Solomita wrote:

Hi Christoph -- has anything come of resolving the NFS / OOM concerns that Andrew Morton expressed concerning the patch? I'd be happy to see some progress on getting this patch (i.e. the one you posted on 1/23) through.

Peter Zijlstra addressed the NFS issue. I will submit the patch again as soon as the writeback code stabilizes a bit.

I'm pinging to see if this has gotten anywhere. Are you ready to resubmit? Do we have the evidence to convince Andrew that the NFS issues are resolved and so this patch won't obscure anything?

The NFS patch went into Linus tree a couple of days ago and I have a new version ready with additional support to set per dirty ratios per cpu. There is some interest in adding more VM controls to this patch. I hope I can post the next rev by tomorrow.

Any new ETA? I'm trying to decide whether to go back to your original patches or wait for the new set. Adding new knobs isn't as important to me as having something that fixes the core problem, so hopefully this isn't waiting on them. They could always be patches on top of your core patches.

-- Ethan
Re: [PATCH] sched: implement staircase deadline scheduler further improvements-1
On Thursday 19 April 2007 09:48, Con Kolivas wrote: > While the Staircase Deadline scheduler has not been completely killed off > and is still in -mm I would like to fix some outstanding issues that I've > found since it still serves for comparison with all the upcoming > schedulers. > > While still in -mm can we queue this on top please? > > A set of staircase-deadline v 0.41 patches will make their way into the > usual place for those willing to test it. > > http://ck.kolivas.org/patches/staircase-deadline/ Oops! Minor thinko! Here is a respin. Please apply this one instead. I better make a 0.42 heh. --- The prio_level was being inappropriately decreased if a higher priority task was still using previous timeslice. Fix that. Task expiration of higher priority tasks was not being taken into account with allocating priority slots. Check the expired best_static_prio level to facilitate that. Explicitly check all better static priority prio_levels when deciding on allocating slots for niced tasks. These changes improve behaviour in many ways. Signed-off-by: Con Kolivas <[EMAIL PROTECTED]> --- kernel/sched.c | 64 ++--- 1 file changed, 43 insertions(+), 21 deletions(-) Index: linux-2.6.21-rc7-sd/kernel/sched.c === --- linux-2.6.21-rc7-sd.orig/kernel/sched.c 2007-04-19 08:51:54.0 +1000 +++ linux-2.6.21-rc7-sd/kernel/sched.c 2007-04-19 12:03:29.0 +1000 @@ -145,6 +145,12 @@ struct prio_array { */ DECLARE_BITMAP(prio_bitmap, MAX_PRIO + 1); + /* +* The best static priority (of the dynamic priority tasks) queued +* this array. +*/ + int best_static_prio; + #ifdef CONFIG_SMP /* For convenience looks back at rq */ struct rq *rq; @@ -191,9 +197,9 @@ struct rq { /* * The current dynamic priority level this runqueue is at per static -* priority level, and the best static priority queued this rotation. +* priority level. 
*/ - int prio_level[PRIO_RANGE], best_static_prio; + int prio_level[PRIO_RANGE]; /* How many times we have rotated the priority queue */ unsigned long prio_rotation; @@ -669,7 +675,7 @@ static void task_new_array(struct task_s } /* Find the first slot from the relevant prio_matrix entry */ -static inline int first_prio_slot(struct task_struct *p) +static int first_prio_slot(struct task_struct *p) { if (unlikely(p->policy == SCHED_BATCH)) return p->static_prio; @@ -682,11 +688,18 @@ static inline int first_prio_slot(struct * level. SCHED_BATCH tasks do not use the priority matrix. They only take * priority slots from their static_prio and above. */ -static inline int next_entitled_slot(struct task_struct *p, struct rq *rq) +static int next_entitled_slot(struct task_struct *p, struct rq *rq) { + int search_prio = MAX_RT_PRIO, uprio = USER_PRIO(p->static_prio); + struct prio_array *array = rq->active; DECLARE_BITMAP(tmp, PRIO_RANGE); - int search_prio, uprio = USER_PRIO(p->static_prio); + /* +* Go straight to expiration if there are higher priority tasks +* already expired. +*/ + if (p->static_prio > rq->expired->best_static_prio) + return MAX_PRIO; if (!rq->prio_level[uprio]) rq->prio_level[uprio] = MAX_RT_PRIO; /* @@ -694,15 +707,22 @@ static inline int next_entitled_slot(str * static_prio are acceptable, and only if it's not better than * a queued better static_prio's prio_level. 
*/ - if (p->static_prio < rq->best_static_prio) { - search_prio = MAX_RT_PRIO; + if (p->static_prio < array->best_static_prio) { if (likely(p->policy != SCHED_BATCH)) - rq->best_static_prio = p->static_prio; - } else if (p->static_prio == rq->best_static_prio) + array->best_static_prio = p->static_prio; + } else if (p->static_prio == array->best_static_prio) { search_prio = rq->prio_level[uprio]; - else { - search_prio = max(rq->prio_level[uprio], - rq->prio_level[USER_PRIO(rq->best_static_prio)]); + } else { + int i; + + search_prio = rq->prio_level[uprio]; + /* A bound O(n) function, worst case n is 40 */ + for (i = array->best_static_prio; i <= p->static_prio ; i++) { + if (!rq->prio_level[USER_PRIO(i)]) + rq->prio_level[USER_PRIO(i)] = MAX_RT_PRIO; + search_prio = max(search_prio, + rq->prio_level[USER_PRIO(i)]); + } } if (unlikely(p->policy == SCHED_BATCH)) { search_prio = max(search_prio,
Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy
Hi.

On Thu, 2007-04-19 at 00:02 +0200, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
>
> > > although probably your suspend2 problem is still not fixed, it's
> > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > was it against -rc6 or -rc7?
> >
> > You are right again. ;-)
> >
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> what's the easiest way for me to try suspend2? Apply the patch, reboot
> into the kernel, then execute what command to suspend? (there's a
> confusing mishmash of initiators of all the suspend variants. Can i drive
> this by echoing to /sys/power/state?)

From subsequent emails, I think you already got your answer, but just in case...

Yes, if you enabled "Replace swsusp by default" and you already had it set up for getting swsusp to resume. If not, and you're using an initrd/ramfs, you'll need to modify it to

echo > /sys/power/suspend2/do_resume

after /sys and /proc are mounted but prior to mounting / and so on.

Regards,

Nigel
Re: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues
On Wed, Apr 18, 2007 at 10:11:46AM -0400, Trond Myklebust wrote:
> Do you have a copy of wireshark or ethereal on hand? If so, could you
> take a look at whether or not any NFS traffic is going between the
> client and server once the hang happens?

I used the following command

    tcpdump -w nfs-traffic -i eth0 -vv -tt dst port nfs

to capture

    http://iucha.net/nfs/21-rc7-nfs4/nfs-traffic.bz2

I started the capture before starting the copy and left it to run for a few minutes after the traffic slowed to a crawl.

The iostat and vmstat are at:

    http://iucha.net/nfs/21-rc7-nfs4/iostat
    http://iucha.net/nfs/21-rc7-nfs4/vmstat

It seems that my original problem report had a big mistake! There is no hang, but at some point the write slows down to a trickle (from 40,000 blocks/s to 22 blocks/s) as can be seen from the iostat log.

Regards,
florin

--
Bruce Schneier expects the Spanish Inquisition.
http://geekz.co.uk/schneierfacts/fact/163
Re: 2.6.21-rc6-mm1 ATA HPT37x regression
> "John" == John Stoffel <[EMAIL PROTECTED]> writes: > "John" == John Stoffel <[EMAIL PROTECTED]> writes: > Ok, so do I need to do anything special with the next -mm release and > the next version? Well, let Alan decide that (2Alan: and I said that HPT code is bogus :-). Alan> Try drivers/ide/pci/hpt366 - if that works grab a dmesg and let Alan> me know. It means that Sergei's DPLL sync code seems to work Alan> better than the vendor code and its time to swap it over. John> Ok, I'll give that a whirl under 2.6.21-rc7 tonight. I'll build them John> in modular so I can switch around more easily. I hope. :] John> Ok, here's the dmesg output using the hpt366 old IDE driver, John> 2.6.21-rc7, SMP: John> [ 160.926355] HPT302: IDE controller at PCI slot :03:06.0 John> [ 160.928030] ACPI: PCI Interrupt :03:06.0[A] -> GSI 18 (level, low) -> IRQ John> 18 John> [ 160.931212] HPT302: chipset revision 1 John> [ 160.932801] HPT302: DPLL base: 66 MHz, f_CNT: 100, assuming 33 MHz PCI John> [ 160.941157] HPT302: using 66 MHz DPLL clock John> [ 160.942646] HPT302: 100% native mode on irq 18 John> [ 160.943918] ide2: BM-DMA at 0xe800-0xe807, BIOS settings: hde:DMA, hdf:pi John> o John> [ 160.946636] ide3: BM-DMA at 0xe808-0xe80f, BIOS settings: hdg:DMA, hdh:pi John> o John> [ 160.949439] Probing IDE interface ide2... John> [ 161.213560] hde: WDC WD1200JB-00CRA1, ATA DISK drive John> [ 161.828020] ide2 at 0xecf8-0xecff,0xecf2 on irq 18 John> [ 161.829616] Probing IDE interface ide3... John> [ 162.094086] hdg: WDC WD1200JB-00EVA0, ATA DISK drive John> [ 162.709002] ide3 at 0xece0-0xece7,0xecda on irq 18 John> Which looks ok to me I guess. It found my MD disks on there and John> assmebled them, eventually. *grin* John> I'll reboot and send out the corresponding ATA HPT37x driver dmesg... And here's the output (much more verbose!) from the hpt37x ATA driver: [ 158.712007] hpt37x: HPT302: Bus clock 33MHz. 
[ 158.713390] ACPI: PCI Interrupt :03:06.0[A] -> GSI 18 (level, low) -> IRQ 18 [ 158.716254] ata5: PATA max UDMA/133 cmd 0x0001ecf8 ctl 0x0001ecf2 bmdma 0x000 1e800 irq 18 [ 158.719019] ata6: PATA max UDMA/133 cmd 0x0001ece0 ctl 0x0001ecda bmdma 0x000 1e808 irq 18 [ 158.722257] scsi7 : pata_hpt37x [ 158.878133] ata5.00: ATA-5: WDC WD1200JB-00CRA1, 17.07W17, max UDMA/100 [ 158.879576] ata5.00: 234441648 sectors, multi 16: LBA [ 158.880934] Find mode for 12 reports C829C62 [ 158.882240] Find mode for DMA 69 reports 1C6DDC62 [ 158.888152] ata5.00: configured for UDMA/100 [ 158.889437] scsi8 : pata_hpt37x [ 158.900338] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 [ 158.901660] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx [ 159.047026] ata6.00: ATA-6: WDC WD1200JB-00EVA0, 15.05R15, max UDMA/100 [ 159.048412] ata6.00: 234441648 sectors, multi 16: LBA48 [ 159.050008] Find mode for 12 reports C829C62 [ 159.051371] Find mode for DMA 69 reports 1C6DDC62 [ 159.057079] ata6.00: configured for UDMA/100 [ 159.063655] scsi 7:0:0:0: Direct-Access ATA WDC WD1200JB-00C 17.0 PQ : 0 ANSI: 5 [ 159.067506] SCSI device sdi: 234441648 512-byte hdwr sectors (120034 MB) [ 159.069004] sdi: Write Protect is off [ 159.070412] sdi: Mode Sense: 00 3a 00 00 [ 159.070487] SCSI device sdi: write cache: enabled, read cache: enabled, doesn 't support DPO or FUA [ 159.073427] SCSI device sdi: 234441648 512-byte hdwr sectors (120034 MB) [ 159.074882] sdi: Write Protect is off [ 159.076262] sdi: Mode Sense: 00 3a 00 00 [ 159.076339] SCSI device sdi: write cache: enabled, read cache: enabled, doesn 't support DPO or FUA [ 159.079097] sdi: sdi1 [ 159.097634] sd 7:0:0:0: Attached scsi disk sdi [ 159.099212] sd 7:0:0:0: Attached scsi generic sg9 type 0 [ 159.102344] scsi 8:0:0:0: Direct-Access ATA WDC WD1200JB-00E 15.0 PQ : 0 ANSI: 5 [ 159.106197] SCSI device sdj: 234441648 512-byte hdwr sectors (120034 MB) [ 159.107722] sdj: Write Protect is off [ 159.109188] sdj: 
Mode Sense: 00 3a 00 00 [ 159.109271] SCSI device sdj: write cache: enabled, read cache: enabled, doesn 't support DPO or FUA [ 159.112455] SCSI device sdj: 234441648 512-byte hdwr sectors (120034 MB) [ 159.114094] sdj: Write Protect is off [ 159.115870] sdj: Mode Sense: 00 3a 00 00 [ 159.115943] SCSI device sdj: write cache: enabled, read cache: enabled, doesn 't support DPO or FUA [ 159.118965] sdj: sdj1 [ 159.138036] sd 8:0:0:0: Attached scsi disk sdj [ 159.139682] sd 8:0:0:0: Attached scsi generic sg10 type 0 In both cases, my RAID1 disks are found and come up cleanly, which is good. Thanks for all the work you guys have done on the IDE stuff, as well as the new libATA stuff. Let me know if you need more testing done here, I've only got a scratch volume on this raid set. John
Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy
Hi.

On Wed, 2007-04-18 at 18:56 -0400, Bob Picco wrote:
> Ingo Molnar wrote: [Wed Apr 18 2007, 06:02:28PM EDT]
> >
> > * Christian Hesse <[EMAIL PROTECTED]> wrote:
> >
> > > > although probably your suspend2 problem is still not fixed, it's
> > > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > > was it against -rc6 or -rc7?
> > >
> > > You are right again. ;-)
> > >
> > > Linux 2.6.21-rc7
> > > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > > CFS v3 (without any additional patches)
> > >
> > > And it still hangs on suspend.
> >
> > what's the easiest way for me to try suspend2? Apply the patch, reboot
> > into the kernel, then execute what command to suspend? (there's a
> > confusing mishmash of initiators of all the suspend variants. Can i drive
> > this by echoing to /sys/power/state?)
> >
> > Ingo
> I had hoped to collect more data with CFS V2. It crashes in
> scale_nice_down for s2ram when attempting to disable_nonboot_cpus.
> So part of traceback looks like (typed by hand with obvious omissions):
>
> scale_nice_down
> update_stats_wait_end - not shown in traceback because inlined
> pick_next_task_fair
> migration_call
> task_rq_lock
> notifier_call_chain
> _cpu_down
> disable_nonboot_cpus
> ...
>
> This is standard -rc7 with V2 CFS applied. It could be a completely
> unrelated issue. I'll attempt to debug further tomorrow.

That - and Christian's other reply with the jpg - look to me more like this is an interaction between CFS and cpu hotplugging than Suspend2 itself. Can you also reproduce this with swsusp?

Regards,

Nigel
PCI: Unable to handle 64-bit address space for
Hi all,

Does anyone have an idea about this: why is it displayed on boot? How can I fix it, or at least stop the message from being displayed? I am using 2.6.9-42.ELsmp.

PCI: Probing PCI hardware (bus 00)
PCI: Ignoring BAR0-3 of IDE controller :00:1f.1
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for
PCI: Unable to handle 64-bit address space for

Thanks for the help,
Michael
Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy
Hi.

On Thu, 2007-04-19 at 00:22 +0200, Christian Hesse wrote:
> On Thursday 19 April 2007, Ingo Molnar wrote:
> > * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > > although probably your suspend2 problem is still not fixed, it's
> > > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > > was it against -rc6 or -rc7?
> > >
> > > You are right again. ;-)
> > >
> > > Linux 2.6.21-rc7
> > > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > > CFS v3 (without any additional patches)
> > >
> > > And it still hangs on suspend.
> >
> > what's the easiest way for me to try suspend2? Apply the patch, reboot
> > into the kernel, then execute what command to suspend? (there's a
> > confusing mishmash of initiators of all the suspend variants. Can i drive
> > this by echoing to /sys/power/state?)
>
> Perhaps you have to install suspend2-userui as well for the output (I'm not
> sure whether it works without). Then you can trigger the suspend by echoing
> to /sys/power/suspend2/do_suspend.
> Useful information can be found in the Howto:
>
> http://www.suspend2.net/HOWTO
>
> I dropped some ccs to not abuse Linus and friends.

You can suspend and resume without it.

Regards,

Nigel
Re: 2.6.21-rc6-mm1 ATA HPT37x regression
> "John" == John Stoffel <[EMAIL PROTECTED]> writes:

>>> > Ok, so do I need to do anything special with the next -mm release and
>>> > the next version?
>>>
>>> Well, let Alan decide that (2Alan: and I said that HPT code is bogus :-).

Alan> Try drivers/ide/pci/hpt366 - if that works grab a dmesg and let
Alan> me know. It means that Sergei's DPLL sync code seems to work
Alan> better than the vendor code and its time to swap it over.

John> Ok, I'll give that a whirl under 2.6.21-rc7 tonight. I'll build them
John> in modular so I can switch around more easily. I hope. :]

Ok, here's the dmesg output using the hpt366 old IDE driver, 2.6.21-rc7, SMP:

[ 160.926355] HPT302: IDE controller at PCI slot :03:06.0
[ 160.928030] ACPI: PCI Interrupt :03:06.0[A] -> GSI 18 (level, low) -> IRQ 18
[ 160.931212] HPT302: chipset revision 1
[ 160.932801] HPT302: DPLL base: 66 MHz, f_CNT: 100, assuming 33 MHz PCI
[ 160.941157] HPT302: using 66 MHz DPLL clock
[ 160.942646] HPT302: 100% native mode on irq 18
[ 160.943918] ide2: BM-DMA at 0xe800-0xe807, BIOS settings: hde:DMA, hdf:pio
[ 160.946636] ide3: BM-DMA at 0xe808-0xe80f, BIOS settings: hdg:DMA, hdh:pio
[ 160.949439] Probing IDE interface ide2...
[ 161.213560] hde: WDC WD1200JB-00CRA1, ATA DISK drive
[ 161.828020] ide2 at 0xecf8-0xecff,0xecf2 on irq 18
[ 161.829616] Probing IDE interface ide3...
[ 162.094086] hdg: WDC WD1200JB-00EVA0, ATA DISK drive
[ 162.709002] ide3 at 0xece0-0xece7,0xecda on irq 18

Which looks ok to me I guess. It found my MD disks on there and assembled them, eventually. *grin*

I'll reboot and send out the corresponding ATA HPT37x driver dmesg...

John
PCI Express MMCONFIG and BIOS Bug messages..
I've seen a lot of systems (including brand new Xeon-based servers from IBM and HP) that output messages on boot like: PCI: BIOS Bug: MCFG area at f000 is not E820-reserved PCI: Not using MMCONFIG. As I understand it, this is sort of a sanity check mechanism to make sure the MCFG address reported is remotely reasonable and intended to be used as such. Problem is, I doubt the BIOS authors would agree that this constitutes a bug. Microsoft is providing a lot of the direction for BIOS writers, and have a look at this presentation "PCI Express, Windows, And The Legacy Transition" from back in 2004: http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/TW04047_WINHEC2004.ppt On page 14, "Existing Windows - Reserve MMCONFIG": Existing Windows versions won’t understand MCFG table * Backwards-compatible range reservation must be used Report range in ACPI "Motherboard Resources" *_CRS of PNP0C02 node * PNP0C02 must be at \_SB scope * Range must be marked as consumed Do not include range in _CRS of PCI root bus * If included, OS will assume that this range can be allocated to devices E820 table/EFI memory map * Not necessary to describe MMConfig here * For Windows, these are used to describe RAM * No harm in including range as reserved either So Microsoft is explicitly telling the BIOS developers that there is no need to reserve the MMCONFIG space in the E820 table because Windows doesn't care. On that basis it doesn't seem like a valid check to require it to be so reserved, then. Really, I think we should be basing this check on whether the corresponding memory range is reserved in the ACPI resources, like Windows expects. This does require putting more fingers into ACPI from this early boot stage, though.. 
-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
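For readers unfamiliar with the sanity check being questioned above, here is a minimal sketch of how one might test whether an address range is fully covered by reserved entries of an E820-style memory map. The struct layout, the function name `range_all_of_type`, and the addresses in the usage note are assumptions for illustration, not the kernel's actual code; the map is assumed to be sorted by address.

```c
#include <stddef.h>
#include <stdint.h>

/* E820 type codes: 1 is usable RAM, 2 is reserved. */
enum { E820_RAM = 1, E820_RESERVED = 2 };

/* Illustrative entry layout; field names are assumptions. */
struct e820_entry {
	uint64_t addr;   /* start of the region */
	uint64_t size;   /* length of the region in bytes */
	uint32_t type;   /* one of the E820_* codes */
};

/*
 * Return 1 if [start, end) is completely covered by entries of the
 * given type, 0 otherwise. Entries must be sorted by address; adjacent
 * entries of the same type together may cover the range.
 */
int range_all_of_type(const struct e820_entry *map, size_t n,
		      uint64_t start, uint64_t end, uint32_t type)
{
	for (size_t i = 0; i < n; i++) {
		const struct e820_entry *e = &map[i];

		if (e->type != type)
			continue;
		/* Skip entries that do not overlap the remaining range. */
		if (e->addr >= end || e->addr + e->size <= start)
			continue;
		/* Entry covers the current start: advance past it. */
		if (e->addr <= start)
			start = e->addr + e->size;
		if (start >= end)
			return 1;
	}
	return 0;
}
```

A firmware that follows the guidance quoted above and leaves the MMCONFIG window out of the E820 map would fail a check like this even though the range is properly reserved in ACPI motherboard resources, which is the mismatch the message describes.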
Re: CPU_IDLE prevents resuming from STR [was: Re: 2.6.21-rc6-mm1]
On Wed, 2007-04-18 at 19:00 -0400, Joshua Wise wrote:
> On Tue, 17 Apr 2007, Shaohua Li wrote:
> > Looks there is init order issue of sysfs files. The new refreshed patch
> > should fix your bug.
>
> Yes, that did fix the hang on resume from STR -- that now works fine.
>
> However:
> [EMAIL PROTECTED]:/sys/devices/system/cpu/cpuidle$ cat available_drivers
> current_driver
>
>
> [EMAIL PROTECTED]:/sys/devices/system/cpu/cpuidle$ cat available_governors
> current_governor
> ladder
> ladder

That output is correct; it looks like you didn't compile the ACPI processor module.

Thanks,
Shaohua
Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU
On 4/18/07, David Howells <[EMAIL PROTECTED]> wrote:
> Aubrey Li <[EMAIL PROTECTED]> wrote:
>
> > Here, in the attachment I wrote a small test app. Please correct if
> > there is anything wrong, and feel free to improve it.
>
> Okay... I have that working... probably. I don't know what output it's
> supposed to produce, but I see this:
>
> # /packet-mmap/sample_packet_mmap
> 00-00-00-01-00-00-00-8a-00-00-00-8a-00-42-00-50-
> 38-43-13-a0-00-07-ff-3c-00-00-00-00-00-00-00-00-
> 00-11-08-00-00-00-00-01-00-01-00-06-00-d0-b7-de-
> 32-7b-00-00-00-00-00-00-00-00-00-00-00-00-00-00-
> 00-00-00-90-cc-a2-75-6b-00-d0-b7-de-32-7b-08-00-
> 45-00-00-7c-00-00-40-00-40-11-b4-13-c0-a8-02-80-
> c0-a8-02-8d-08-01-03-20-00-68-8e-65-7f-5b-7e-03-
> 00-00-00-01-00-00-00-00-00-00-00-00-00-00-00-00-
> 00-00-00-00-00-00-00-00-00-00-00-01-00-00-81-a4-
> 00-00-00-01-00-00-00-00-00-00-00-00-00-1d-b8-86-
> 00-00-10-00-ff-ff-ff-ff-00-00-0e-f0-00-00-09-02-
> 01-cb-03-16-46-26-38-0d-00-00-00-00-46-26-38-1e-
> 00-00-00-00-46-26-38-1e-00-00-00-00-00-00-00-00-
> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-
> [repeated]
>
> Does that look reasonable?

Yes, it's reasonable for me, as long as your host IP is 192.168.2.128 and target IP is 192.168.2.141. See below:

00-90-cc-a2-75-6b-  |___ MAC Address
00-d0-b7-de-32-7b-  |
08-00               Type: IP
45-00               Ver, IHL, TOS
00-7c               IP.total.length
00-00-
40-00-
40                  TTL
11                  UDP protocol
b4-13               Checksum
c0-a8-02-80         Source IP: 192.168.2.128
c0-a8-02-8d         Dest IP: 192.168.2.141
snip--

> I've attached the preliminary patch.

Thanks, I'll take a look and try to see if I can give some feedback.

-Aubrey

> Note four things about it:
>
> (1) I've had to add the get_unmapped_area() op to the proto_ops struct,
>     but I've only done it for CONFIG_MMU=n as making it available for
>     CONFIG_MMU=y could cause problems.
>
> (2) There's a race between packet_get_unmapped_area() being called and
>     packet_mmap() being called.
>
> (3) I've added an extra check into packet_set_ring() to make sure the
>     caller isn't asking for a combination of buffer size and count that
>     will exceed ULONG_MAX. This protects a multiply done elsewhere.
>
> (4) The entire data buffer is allocated as one contiguous lump in
>     NOMMU-mode.
>
> David
> ---
> [PATCH] NOMMU: Support mmap() on AF_PACKET sockets
>
> From: David Howells <[EMAIL PROTECTED]>
>
> Support mmap() on AF_PACKET sockets in NOMMU-mode kernels.
>
> Signed-Off-By: David Howells <[EMAIL PROTECTED]>
> ---
>
>  include/linux/net.h    |    7 +++
>  include/net/sock.h     |    8 +++
>  net/core/sock.c        |   10
>  net/packet/af_packet.c |  118
>  net/socket.c           |   77 +++
>  5 files changed, 219 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/net.h b/include/linux/net.h
> index 4db21e6..9e77cf6 100644
> --- a/include/linux/net.h
> +++ b/include/linux/net.h
> @@ -161,6 +161,11 @@ struct proto_ops {
>      int (*recvmsg) (struct kiocb *iocb, struct socket *sock,
>                      struct msghdr *m, size_t total_len,
>                      int flags);
> +#ifndef CONFIG_MMU
> +    unsigned long (*get_unmapped_area)(struct file *file, struct socket *sock,
> +                                       unsigned long addr, unsigned long len,
> +                                       unsigned long pgoff, unsigned long flags);
> +#endif
>      int (*mmap) (struct file *file, struct socket *sock,
>                   struct vm_area_struct * vma);
>      ssize_t (*sendpage) (struct socket *sock, struct page *page,
> @@ -191,6 +196,8 @@ extern int sock_sendmsg(struct socket *sock, struct msghdr *msg,
>  extern int sock_recvmsg(struct socket *sock, struct msghdr *msg,
>                          size_t size, int flags);
>  extern int sock_map_fd(struct socket *sock);
> +extern void sock_make_mappable(struct socket *sock,
> +                               unsigned long prot);
>  extern struct socket *sockfd_lookup(int fd, int *err);
>  #define sockfd_put(sock) fput(sock->file)
>  extern int net_ratelimit(void);
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 2c7d60c..d91edea 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -841,6 +841,14 @@ extern int sock_no_sendmsg(struct kiocb *, struct socket *,
>                             struct msghdr *, size_t);
>  extern int sock_no_recvmsg(struct kiocb *, struct socket *,
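Point (3) in the note above describes guarding a multiplication of buffer size and count against exceeding ULONG_MAX. A minimal sketch of that kind of guard follows; the function name ring_size_ok() and its parameters are hypothetical, and this is not the code from the patch, which is not visible in the part of the message preserved here.

```c
#include <limits.h>

/*
 * Return 1 if block_size * block_count fits in an unsigned long,
 * 0 if the product would overflow or block_size is zero.
 * Dividing first avoids performing the overflowing multiply at all.
 */
int ring_size_ok(unsigned long block_size, unsigned long block_count)
{
	if (block_size == 0)
		return 0;
	return block_count <= ULONG_MAX / block_size;
}
```

The division form is used because an unsigned multiply that has already wrapped cannot be reliably detected afterwards by looking at the product alone.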
Re: problem with
liangbowen wrote:
> Hi
>
> I compiled the following code with gcc under FC2 :
>
> #include
> main()
> {
>     struct semaphore sum;
> }
>
> It doesn't compile, saying "storage size of `sem' isn't known".
> and I looked inside asm/semaphore.h, I saw:
>
> #ifndef I386_SEMAPHORE_H
> #define I386_SEMAPHORE_H
> #include
> #endif
>
> Did I missed something? Please guide me how to fix it.
>
> Sincerely

You're trying to use a kernel data structure in a user-space program. Don't. The definitions in that header are inside #ifdef __KERNEL__ and so the provided userspace headers remove that part.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
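For readers who land here with the same problem: the reply above only says not to use the kernel structure. One user-space alternative, offered here as an assumption about what the poster may have wanted, is the POSIX unnamed semaphore from <semaphore.h>. The helper name sem_roundtrip() is made up for this sketch.

```c
#include <semaphore.h>

/*
 * Initialise an unnamed POSIX semaphore with the given count, post
 * once, wait once, and return the resulting count (or -1 on error).
 */
int sem_roundtrip(unsigned int initial)
{
	sem_t sem;
	int value = -1;

	if (sem_init(&sem, 0, initial) != 0)
		return -1;
	if (sem_post(&sem) != 0 || sem_wait(&sem) != 0 ||
	    sem_getvalue(&sem, &value) != 0)
		value = -1;
	sem_destroy(&sem);
	return value;
}
```

On older toolchains this may need to be built with -pthread; newer C libraries provide these functions without it.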
Re: [ck] Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Chris Friesen wrote:
> Mark Glines wrote:
>
>> One minor question: is it even possible to be completely fair on SMP?
>> For instance, if you have a 2-way SMP box running 3 applications, one
>> of which has 2 threads, will the threaded app have an advantage here?
>> (The current system seems to try to keep each thread on a specific
>> CPU, to reduce cache thrashing, which means threads and processes
>> alike each get 50% of the CPU.)
>
> I think the ideal in this case would be to have both threads on one
> cpu, with the other app on the other cpu. This gives inter-process
> fairness while minimizing the amount of task migration required.

Solving this sort of issue was one of the reasons for the smpnice patches.

> More interesting is the case of three processes on a 2-cpu system. Do
> we constantly migrate one of them back and forth to ensure that each
> of them gets 66% of a cpu?

Depends how keen you are on fairness. Unless the processes are long-term, continuously active tasks that never sleep, it's probably not an issue, as they'll probably move around enough for each of them to get 66% over the long term.

Exact load balancing for real workloads (where tasks are coming and going, sleeping and waking semi-randomly and over relatively brief periods) is probably unattainable, because by the time you've worked out the ideal placement of the currently runnable tasks on the available CPUs it has all changed and the solution is invalid. The best you can hope for is that the change isn't so great as to completely invalidate the solution, and that the changes you make as a result are an improvement on the current allocation of processes to CPUs.

The above probably doesn't hold for some systems, such as those large supercomputer jobs that run for several days, but they're probably best served by explicit allocation of processes to CPUs using the process affinity mechanism.

Peter
-- 
Peter Williams [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
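The three-processes-on-two-CPUs question above can be illustrated with a toy simulation: if the task that sits out is rotated every tick, each task ends up with two thirds of a CPU. This is only an idealised sketch with a made-up function name; it assumes always-runnable, equal-weight tasks and ignores migration cost, which is the very thing the thread says makes exact balancing impractical.

```c
/*
 * Simulate `ticks` scheduling ticks of 3 always-runnable tasks on
 * 2 CPUs, rotating which task is left off-CPU on each tick.
 * On return, ran[i] holds the number of ticks task i spent running.
 */
void rotate_three_on_two(int ticks, int ran[3])
{
	int t, i;

	for (i = 0; i < 3; i++)
		ran[i] = 0;
	for (t = 0; t < ticks; t++) {
		int idle = t % 3;	/* task that sits out this tick */

		for (i = 0; i < 3; i++)
			if (i != idle)
				ran[i]++;
	}
}
```

Over 300 ticks each task runs for 200 of them, that is, about 66% of one CPU each.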
Re: Upgraded to 2.6.20.7 - positives
Chuck Ebbert wrote:
> Denis Vlasenko wrote:
>> * From make menuconfig questions it looks like SATA/PATA rewrite
>>   (in the form of libata) is almost finished. Hehe, untangling
>>   IDE mess was quite a feat, and Jeff did it. Kudos.
>
> ADMA mode on nvidia chipsets still seems broken despite massive
> amount of SATA fixes backported from 2.6.21...

News to me.. please post details.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: [Patch -mm 3/3] RFC: Introduce kobject->owner for refcounting.
On Wed, 2007-04-18 at 11:20 -0400, Alan Stern wrote:
> On Wed, 18 Apr 2007, Rusty Russell wrote:
>
> > Hi Alan,
> >
> > Your assertion is correct. I haven't studied the driver core, so I
> > might be off-base here, but you'll note that if the module references
> > the core kmalloc'ed object rather than the other way around it can be
> > done safely. The core can also reference the module, but it must be
> > able to live without it once it's gone (eg. by returning -ENOENT).
>
> "Live without it once it's gone..." Do you mean once the object is gone
> or once the module is gone? The core in general has no way to know when
> the module is gone; all it knows about is the object. The trouble arises
> when the module is gone (whether the core knows it or not) but the object
> is still present.

Hi Alan,

I meant that the module is gone: it has told the object (via unregister_xxx) that it's gone.

> > A really poor example is below:
...
> The example is fine as far as it goes, but it assumes that all
> interactions with the underlying r->foo object can be done under a
> spinlock. Of course this isn't true in general.

There are certainly other ways of doing it, such as a mutex, a refcnt & completion (for function pointers), or disabling preemption across the access and using stop_machine(). Of course, these add complexity.

This is the reason that I've always disliked module removal. We have a lot of code to deal with it and it has awkward semantics (unless --wait is used).

OTOH, I'm not a fan of the network approach, either: I feel that bringing up an interface should bump the refcnt of the module which implements that interface. Currently taking out e1000 will just kill my eth0.

Cheers,
Rusty.

>
> Alan Stern
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Wed, 18 Apr 2007, Davide Libenzi wrote:
>
> I know, we agree there. But that did not fit my "Pirates of the Caribbean"
> quote :)

Ahh, I'm clearly not cultured enough, I didn't catch that reference.

	Linus "yes, I've seen the movie, but it apparently left more of a mark in other people" Torvalds
Re: [ck] Announce - Staircase Deadline cpu scheduler v0.41
On Thursday 19 April 2007 09:59, Con Kolivas wrote:
> Since there is so much work currently ongoing with alternative cpu
> schedulers, as a standard for comparison with the alternative virtual
> deadline fair designs I've addressed a few issues in the Staircase Deadline
> cpu scheduler which improve behaviour likely in a noticeable fashion and
> released version 0.41.
>
> http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.41.patch
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.41.patch
>
> and an incremental for those on 0.40:
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7/sched-implement
>-staircase-deadline-scheduler-further-improvements.patch
>
> Remember to renice X to -10 for nicest desktop behaviour :)
>
> Have fun.

Oops, forgot to cc a few people.

Nick, you said I should still have something to offer, so here it is.
Peter, you said you never saw this design (it's a dual array affair, sorry).
Gene and Willy, you were some of the early testers that noticed the advantages of the earlier designs.
Matt, you did lots of great earlier testing.
WLI, you inspired a lot of design ideas.
Mike, you were the stick.
And a few others I've forgotten to mention and include.

-- 
-ck
Re: [PATCH][RFC] Kill off legacy power management stuff.
On Wed, 18 Apr 2007, Dave Jones wrote:

> On Wed, Apr 18, 2007 at 05:23:15PM -0400, Len Brown wrote:
>
> > > p.p.s. patch improvements that will let me avoid doing any of that
> > > myself always welcome. :-)
> >
> > well, I'm sorry that I've known about the APM issue for a long time
> > and done nothing about it. I did ping davej when he broke it,
> > but his to-do list is probably even longer than mine.
>
> ping timeout.
>
> I don't recall too many of the details surrounding those changes,
> but I certainly won't stand in the way of anyone trying to fix it.
> It sounds like you and Robert are on top of it, or do you want me to
> poke at it ?

well, before i get even more confused by what was (once upon a time) a fairly straightforward removal patch, the first obvious question is -- are there *any* circumstances that *require* a config selection of CONFIG_PM_LEGACY, as opposed to a selection of APM and/or ACPI?

if there are, then it can't simply be removed. my original patch submission was based on the assumption that absolutely no one needed the legacy stuff anymore and absolutely everything related to it could be scrapped.

so, first things first: what *needs* legacy PM at the moment?

rday

p.s. i'm confused by the header file include/linux/pm_legacy.h, especially this part:

  #ifdef CONFIG_PM_LEGACY
  ...
  #else /* CONFIG_PM_LEGACY */
  #define PM_IS_ACTIVE() 0
  ...
  #endif

so the macro "PM_IS_ACTIVE()" represents whether *legacy* PM has been selected. in other words, it makes no (apparent) sense that the value of that macro would represent some kind of contention mechanism between APM and ACPI, which is entirely independent from the legacy stuff. right?

-- 
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Linus Torvalds wrote:
>
> On Wed, 18 Apr 2007, Matt Mackall wrote:
>> On Wed, Apr 18, 2007 at 07:48:21AM -0700, Linus Torvalds wrote:
>>> And "fairness by euid" is probably a hell of a lot easier to do than
>>> trying to figure out the wakeup matrix.
>>
>> For the record, you actually don't need to track a whole NxN matrix
>> (or do the implied O(n**3) matrix inversion!) to get to the same
>> result.
>
> I'm sure you can do things differently, but the reason I think
> "fairness by euid" is actually worth looking at is that it's pretty
> much the *identical* issue that we'll have with "fairness by virtual
> machine" and a number of other "container" issues.
>
> The fact is:
>
>  - "fairness" is *not* about giving everybody the same amount of CPU
>    time (scaled by some niceness level or not). Anybody who thinks
>    that is "fair" is just being silly and hasn't thought it through.
>
>  - "fairness" is multi-level. You want to be fair to threads within a
>    thread group (where "process" may be one good approximation of what
>    a "thread group" is, but not necessarily the only one).
>
>    But you *also* want to be fair in between those "thread groups",
>    and then you want to be fair across "containers" (where "user" may
>    be one such container).
>
> So I claim that anything that cannot be fair by user ID is actually
> really REALLY unfair. I think it's absolutely humongously STUPID to
> call something the "Completely Fair Scheduler", and then just be fair
> on a thread level. That's not fair AT ALL! It's the anti-thesis of
> being fair!
>
> So if you have 2 users on a machine running CPU hogs, you should
> *first* try to be fair among users. If one user then runs 5 programs,
> and the other one runs just 1, then the *one* program should get 50%
> of the CPU time (the users fair share), and the five programs should
> get 10% of CPU time each. And if one of them uses two threads, each
> thread should get 5%.
>
> So you should see one thread get 50% CPU (single thread of one user),
> 4 threads get 10% CPU (their fair share of that users time), and 2
> threads get 5% CPU (the fair share within that thread group!).
>
> Any scheduling argument that just considers the above to be "7 threads
> total" and gives each thread 14% of CPU time "fairly" is *anything*
> but fair. It's a joke if that kind of scheduler then calls itself CFS!
>
> And yes, that's largely what the current scheduler will do, but at
> least the current scheduler doesn't claim to be fair! So the current
> scheduler is a lot *better* if only in the sense that it doesn't make
> ridiculous claims that aren't true!
>
> 		Linus

Sounds a lot like the PLFS (process level fair sharing) scheduler in Aurema's ARMTech (for whom I used to work). The "fair" in the title is a bit misleading as it's all about unfair scheduling in order to meet specific policies. But it's based on the principle that if you can allocate CPU bandwidth "fairly" (which really means in proportion to the entitlement each process is allocated) then you can allocate CPU bandwidth "fairly" between higher level entities such as process groups, user groups and so on by subdividing the entitlements downwards.

The tricky part of implementing this was the fact that not all entities at the various levels have sufficient demand for CPU bandwidth to use their entitlements, and this in turn means that the entities above them will have difficulty using their entitlements even if some of their other subordinates have sufficient demand (because their entitlements will be too small). The trick is to have a measure of each entity's demand for CPU bandwidth and use that to modify the way entitlement is divided among subordinates.

As a first guess, an entity's CPU bandwidth usage is an indicator of demand, but it doesn't take into account unmet demand due to tasks sitting on a run queue waiting for access to the CPU. On the other hand, usage plus time waiting on the queue isn't a good measure of demand either (although it's probably a good upper bound), as it's unlikely that the task would have used the same amount of CPU as the waiting time if it had gone straight to the CPU.

But my main point is that it is possible to build schedulers that can achieve higher level scheduling policies. Versions of PLFS work on Windows from user space by twiddling process priorities. Part of my more recent work at Aurema had been involved in patching Linux's scheduler so that nice worked more predictably, so that we could release a user space version of PLFS for Linux. The other part was to add hard CPU bandwidth caps for processes so that ARMTech could enforce hard CPU bandwidth caps on higher level entities (as this can't be done without the kernel being able to do it at that level).

Peter
-- 
Peter Williams [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
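The arithmetic in the quoted two-user example can be reproduced with a small sketch that divides CPU time evenly at each level (users, then each user's processes, then each process's threads) and compares it with a flat per-thread split. The function names are invented for illustration, and the sketch assumes a single CPU with every thread always runnable; it is not how any of the schedulers discussed here computes shares.

```c
/*
 * CPU percentage for one thread when time is split evenly among
 * users, then among that user's processes, then among that
 * process's threads.
 */
double hierarchical_share(int users, int procs_of_user, int threads_of_proc)
{
	if (users <= 0 || procs_of_user <= 0 || threads_of_proc <= 0)
		return 0.0;
	return 100.0 / users / procs_of_user / threads_of_proc;
}

/* CPU percentage per thread when all threads are treated alike. */
double flat_share(int total_threads)
{
	return total_threads > 0 ? 100.0 / total_threads : 0.0;
}
```

With 2 users, hierarchical_share(2, 1, 1) gives 50 for the lone program, hierarchical_share(2, 5, 1) gives 10 for each single-threaded program of the other user, and hierarchical_share(2, 5, 2) gives 5 for each of the two threads, while flat_share(7) gives roughly 14 for all seven threads.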
Re: [PATCH] fix OOM killing processes wrongly thought MPOL_BIND
On Wed, 18 Apr 2007 20:35:22 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote:

> I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog
> to see lots of other processes killed with "No available memory (MPOL_BIND)".
> memhog is killed correctly once we initialize nodemask in constrained_alloc().
>

Thank you for catching this bug.

Acked-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
> Perhaps appropriate for 2.6.20-stable too - regression since 2.6.19.
>
>  mm/oom_kill.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> --- 2.6.21-rc7/mm/oom_kill.c 2007-03-26 07:30:54.0 +0100
> +++ linux/mm/oom_kill.c 2007-04-18 20:18:21.0 +0100
> @@ -176,6 +176,8 @@ static inline int constrained_alloc(stru
>      struct zone **z;
>      nodemask_t nodes;
>      int node;
> +
> +    nodes_clear(nodes);
>      /* node has memory ? */
>      for_each_online_node(node)
>          if (NODE_DATA(node)->node_present_pages)
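The class of bug fixed above, a bitmask built up on the stack without being cleared first, can be shown with a small user-space sketch. The struct, the MAX_NODES limit, and the function name are stand-ins chosen for illustration; the kernel's nodemask_t and for_each_online_node() are not reproduced here.

```c
#define MAX_NODES 64	/* hypothetical limit for this sketch */

struct node_mask {
	unsigned long long bits;
};

/*
 * Build the set of nodes that have memory. `mask` stands in for an
 * uninitialised stack variable, so it may hold stale bits on entry.
 * Clearing it first (the role nodes_clear() plays in the patch) makes
 * the result depend only on present_pages[].
 */
void nodes_with_memory(struct node_mask *mask,
		       const unsigned long *present_pages, int nr_nodes)
{
	int node;

	mask->bits = 0;		/* without this, stale bits survive */
	for (node = 0; node < nr_nodes && node < MAX_NODES; node++)
		if (present_pages[node])
			mask->bits |= 1ULL << node;
}
```

If the clearing line is dropped, whatever happened to be in the mask beforehand leaks into the result, which is the kind of stale state that could lead to a wrong constraint decision like the one described in the report.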
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Wed, 18 Apr 2007, Linus Torvalds wrote:

> On Wed, 18 Apr 2007, Davide Libenzi wrote:
> >
> > "Perhaps on the rare occasion pursuing the right course demands an act of
> > unfairness, unfairness itself can be the right course?"
>
> I don't think that's the right issue.
>
> It's just that "fairness" != "equal".
>
> Do you think it "fair" to pay everybody the same regardless of how good a
> job they do? I don't think anybody really believes that.
>
> Equating "fair" and "equal" is simply a very fundamental mistake. They're
> not the same thing. Never have been, and never will.

I know, we agree there. But that did not fit my "Pirates of the Caribbean" quote :)

- Davide
Re: [NETLINK] Don't attach callback to a going-away netlink socket
From: Pavel Emelianov <[EMAIL PROTECTED]>
Date: Wed, 18 Apr 2007 12:16:18 +0400

> The proposal is to make sock_orphan before detaching the callback
> in netlink_release() and to check for the sock to be SOCK_DEAD in
> netlink_dump_start() before setting a new callback.

As discussed in this thread there might be other ways to approach this, but this fix is good for now.

Patch applied, thank you.
Re: 2.6.20.6 vanilla does't boot
On Wed, Apr 18, 2007 at 03:39:25PM -0400, Len Brown wrote:
> On Sunday 15 April 2007 11:50, Michal Jaegermann wrote:
> >
> > A kernel derived from 2.6.21-rc6-git1 (2.6.20-1.3053.fc7.x86_64 from
> > Fedora "rawhide" to be more precise) did boot on the hardware in
> > question, though; but only when I gave it 'acpi=off'. Without that
> > parameter it was getting stuck apparently when starting hotplug.
> > In that kernel case disks were accessed using pata_atiixp driver.
>
> If "acpi=off" is necessary to boot the latest kernel, please
> report an ACPI bug:
> http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI

I am travelling at the moment, so what I can do is somewhat limited. In particular, I cannot get access to the hardware in question. But please see
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=232490
and the most recent comments there in particular.

> Please mention in the bug report what the latest working kernel was.

This is mentioned in the referenced report as well.

   Michal
Re: [PATCH] utrace: remove exports
Christoph Hellwig wrote:
>
> All the exports in utrace are totally unused, and not really something
> I'd want modules to use anyway :)
>

Please leave the exports in place. Very early in Documentation/utrace.txt, it says:

"The UTRACE is infrastructure code for tracing and controlling user threads. This is the foundation for writing tracing engines, which can be loadable kernel modules."

If we can't use utrace to write ad hoc instrumentation modules (i.e., because utrace_attach(), utrace_detach(), etc. are no longer exported), then utrace's usefulness is greatly reduced.

Jim Keniston
IBM LTC

>
> Signed-off-by: Christoph Hellwig <[EMAIL PROTECTED]>
>
...
> Index: linux-2.6/kernel/utrace.c
> ===
> --- linux-2.6.orig/kernel/utrace.c 2007-04-13 15:56:28.0 +0200
> +++ linux-2.6/kernel/utrace.c 2007-04-13 15:56:39.0 +0200
> @@ -490,7 +490,6 @@ restart:
>
>      return engine;
>  }
> -EXPORT_SYMBOL_GPL(utrace_attach);
>
>  /*
>   * When an engine is detached, the target thread may still see it and make
> @@ -700,8 +699,6 @@ utrace_detach(struct task_struct *target
>
>      return 0;
>  }
> -EXPORT_SYMBOL_GPL(utrace_detach);
> -
>
>  /*
>   * Called with utrace->lock held.
> @@ -900,8 +897,7 @@ restart: /* See below. */
>
>      return ret;
>  }
> -EXPORT_SYMBOL_GPL(utrace_set_flags);
> -
> +
>  /*
>   * While running an engine callback, no locks are held.
>   * If a callback updates its engine's action state, then
> @@ -1930,8 +1926,6 @@ utrace_inject_signal(struct task_struct
>
>      return ret;
>  }
> -EXPORT_SYMBOL_GPL(utrace_inject_signal);
> -
>
>  const struct utrace_regset *
>  utrace_regset(struct task_struct *target,
> @@ -1946,8 +1940,6 @@ utrace_regset(struct task_struct *target
>
>      return >regsets[which];
>  }
> -EXPORT_SYMBOL_GPL(utrace_regset);
> -
>
>  /*
>   * Return the task_struct for the task using ptrace on this one, or NULL.
Re: [PATCH][RFC] Kill off legacy power management stuff.
On Wed, Apr 18, 2007 at 05:23:15PM -0400, Len Brown wrote:

 > > p.p.s. patch improvements that will let me avoid doing any of that
 > > myself always welcome. :-)
 >
 > well, I'm sorry that I've known about the APM issue for a long time
 > and done nothing about it. I did ping davej when he broke it,
 > but his to-do list is probably even longer than mine.

ping timeout.

I don't recall too many of the details surrounding those changes, but I certainly won't stand in the way of anyone trying to fix it. It sounds like you and Robert are on top of it, or do you want me to poke at it ?

	Dave

-- 
http://www.codemonkey.org.uk
Announce - Staircase Deadline cpu scheduler v0.41
Since there is so much work currently ongoing with alternative cpu schedulers, as a standard for comparison with the alternative virtual deadline fair designs I've addressed a few issues in the Staircase Deadline cpu scheduler which improve behaviour likely in a noticeable fashion and released version 0.41.

http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.41.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.41.patch

and an incremental for those on 0.40:
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7/sched-implement-staircase-deadline-scheduler-further-improvements.patch

Remember to renice X to -10 for nicest desktop behaviour :)

Have fun.

-- 
-ck
NETDEV WATCHDOG, tulip, 2.6.18
Package: linux-kernel Version: 2.6.18-4-686 (Debian 2.6.18.dfsg.1-12) (Submitted to linux-kernel@vger.kernel.org && [EMAIL PROTECTED]) I also have recurrent problems with NETDEV WATCHDOG: eth0: transmit timed out I am running on a Pentium 3 with a Linksys LNE100TX V5.1 PCI ethernet card, which also identifies itself as ADMtek Comet rev 17 for which the kernel uses the tulip driver module, Linux Tulip driver version 1.1.13-NAPI (May 11, 2002) This works fine after booting, and for a day or two after booting, no problems with heavy net traffic or light traffic. Eventually something happens to it though, and then it is not right again until reboot. The behavior then is an occasional freeze, where nothing moves for 10 seconds or so, then full-speed network I/O for a few seconds, then another freeze, etc. I only got this machine recently. I first installed Debian Sarge on it, and had the same problem with Sarge's 2.6.8 kernel. I read many messages about the NETDEV WATCHDOG situation, and some writers suggested it might be fixed in later kernels, so I upgraded to Etch with the 2.6.18 kernel. For me at least, the problem is still the same. I am holding the machine in the broken condition (rather than rebooting) in case anyone wants me to test something else. Here is some info to document the problem: dmesg at boot: Linux version 2.6.18-4-686 (Debian 2.6.18.dfsg.1-12) ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Mon Mar 26 17:17:36 UTC 2007 BIOS-provided physical RAM map: BIOS-e820: - 0009f800 (usable) BIOS-e820: 0009f800 - 000a (reserved) BIOS-e820: 000e7000 - 0010 (reserved) BIOS-e820: 0010 - 040fd800 (usable) BIOS-e820: 040fd800 - 040ff800 (ACPI data) BIOS-e820: 040ff800 - 040ffc00 (ACPI NVS) BIOS-e820: 040ffc00 - 1800 (usable) BIOS-e820: fffe7000 - 0001 (reserved) 0MB HIGHMEM available. 384MB LOWMEM available. On node 0 totalpages: 98304 DMA zone: 4096 pages, LIFO batch:0 Normal zone: 94208 pages, LIFO batch:31 DMI 2.1 present. 
ACPI: RSDP (v000 PTLTD ) @ 0x000f6ac0 ACPI: RSDT (v001 PTLTDRSDT 0x PTL 0x0100) @ 0x040fda87 ACPI: FADT (v001 GATEWA TABOR II 0x19990928 PTL 0x000f4240) @ 0x040ff78c ACPI: DSDT (v001 GATEWA TABOR II 0x MSFT 0x0100) @ 0x ACPI: PM-Timer IO Port: 0x8008 Allocating PCI resources starting at 2000 (gap: 1800:e7fe7000) Detected 596.938 MHz processor. Built 1 zonelists. Total pages: 98304 Kernel command line: root=/dev/hda2 ro Local APIC disabled by BIOS -- you can enable it with "lapic" mapped APIC to d000 (0130a000) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 PID hash table entries: 2048 (order: 11, 8192 bytes) Console: colour VGA+ 80x25 Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Memory: 382128k/393216k available (1544k kernel code, 10556k reserved, 577k data, 196k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 1194.90 BogoMIPS (lpj=2389801) Security Framework v1.0.0 initialized SELinux: Disabled at boot. Capability LSM initialized Mount-cache hash table entries: 512 CPU: After generic identify, caps: 0383f9ff CPU: After vendor identify, caps: 0383f9ff CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 512K CPU: After all inits, caps: 0383f9ff 0040 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. Compat vDSO mapped to e000. Checking 'hlt' instruction... OK. SMP alternatives: switching to UP code Freeing SMP alternatives: 16k freed ACPI: Core revision 20060707 ACPI: setting ELCR to 0200 (from 1a00) CPU0: Intel Pentium III (Katmai) stepping 03 SMP motherboard not detected. Local APIC not detected. Using dummy APIC emulation. Brought up 1 CPUs migration_cost=0 checking if image is initramfs... 
it is Freeing initrd memory: 4397k freed NET: Registered protocol family 16 ACPI: bus type pci registered PCI: PCI BIOS revision 2.10 entry at 0xfd983, last bus=1 PCI: Using configuration type 1 Setting up standard PCI resources ACPI: Interpreter enabled ACPI: Using PIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (:00) PCI: Probing PCI hardware (bus 00) ACPI: Assume root bridge [\_SB_.PCI0] bus is 0 * Found PM-Timer Bug on the chipset. Due to workarounds for a bug, * this clock source is slow. Consider trying other clock sources PCI quirk: region 8000-803f claimed by PIIX4 ACPI PCI quirk: region 7000-700f claimed by PIIX4 SMB Boot
Re: Loud "pop" coming from hard drive on reboot
Mark Lord wrote:
> Tejun Heo wrote:
>> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW
>> 2. kernel shutdown starts
>> 3. libata shutdown issues SYNCHRONIZE_CACHE
>> 4. power goes off
>
> Okay, after some experimentation, it's the STANDBY_NOW that is causing
> the Power-Off_Retract_Count to increment on my machine.
>
> Tell me again why we think we need to issue that command?

Arghh.. okay, removing that from the code has no effect on it either.

I just don't understand the problem any more, since I don't actually have it here (I think). Can somebody explain again what the issue was, when it began happening, and whatever else? And what Tejun's fix (2.6.22) does again?

I've lost all of the early postings.

Thanks
[PATCH] sched: implement staircase deadline scheduler further improvements
While the Staircase Deadline scheduler has not been completely killed off and is still in -mm I would like to fix some outstanding issues that I've found since it still serves for comparison with all the upcoming schedulers. While still in -mm can we queue this on top please? A set of staircase-deadline v 0.41 patches will make their way into the usual place for those willing to test it. http://ck.kolivas.org/patches/staircase-deadline/ --- The prio_level was being inappropriately decreased if a higher priority task was still using previous timeslice. Fix that. Task expiration of higher priority tasks was not being taken into account with allocating priority slots. Check the expired best_static_prio level to facilitate that. Explicitly check all better static priority prio_levels when deciding on allocating slots for niced tasks. These changes improve behaviour in many ways. Signed-off-by: Con Kolivas <[EMAIL PROTECTED]> --- kernel/sched.c | 61 ++--- 1 file changed, 41 insertions(+), 20 deletions(-) Index: linux-2.6.21-rc7-sd/kernel/sched.c === --- linux-2.6.21-rc7-sd.orig/kernel/sched.c 2007-04-19 08:51:54.0 +1000 +++ linux-2.6.21-rc7-sd/kernel/sched.c 2007-04-19 09:30:39.0 +1000 @@ -145,6 +145,12 @@ struct prio_array { */ DECLARE_BITMAP(prio_bitmap, MAX_PRIO + 1); + /* +* The best static priority (of the dynamic priority tasks) queued +* this array. +*/ + int best_static_prio; + #ifdef CONFIG_SMP /* For convenience looks back at rq */ struct rq *rq; @@ -191,9 +197,9 @@ struct rq { /* * The current dynamic priority level this runqueue is at per static -* priority level, and the best static priority queued this rotation. +* priority level. 
*/ - int prio_level[PRIO_RANGE], best_static_prio; + int prio_level[PRIO_RANGE]; /* How many times we have rotated the priority queue */ unsigned long prio_rotation; @@ -669,7 +675,7 @@ static void task_new_array(struct task_s } /* Find the first slot from the relevant prio_matrix entry */ -static inline int first_prio_slot(struct task_struct *p) +static int first_prio_slot(struct task_struct *p) { if (unlikely(p->policy == SCHED_BATCH)) return p->static_prio; @@ -682,11 +688,18 @@ static inline int first_prio_slot(struct * level. SCHED_BATCH tasks do not use the priority matrix. They only take * priority slots from their static_prio and above. */ -static inline int next_entitled_slot(struct task_struct *p, struct rq *rq) +static int next_entitled_slot(struct task_struct *p, struct rq *rq) { + int search_prio = MAX_RT_PRIO, uprio = USER_PRIO(p->static_prio); + struct prio_array *array = rq->active; DECLARE_BITMAP(tmp, PRIO_RANGE); - int search_prio, uprio = USER_PRIO(p->static_prio); + /* +* Go straight to expiration if there are higher priority tasks +* already expired. +*/ + if (p->static_prio > rq->expired->best_static_prio) + return MAX_PRIO; if (!rq->prio_level[uprio]) rq->prio_level[uprio] = MAX_RT_PRIO; /* @@ -694,15 +707,21 @@ static inline int next_entitled_slot(str * static_prio are acceptable, and only if it's not better than * a queued better static_prio's prio_level. 
*/ - if (p->static_prio < rq->best_static_prio) { - search_prio = MAX_RT_PRIO; + if (p->static_prio < array->best_static_prio) { if (likely(p->policy != SCHED_BATCH)) - rq->best_static_prio = p->static_prio; - } else if (p->static_prio == rq->best_static_prio) + array->best_static_prio = p->static_prio; + } else if (p->static_prio == array->best_static_prio) { search_prio = rq->prio_level[uprio]; - else { + } else { + int i; + + /* A bound O(n) function, worst case n is 40 */ + for (i = array->best_static_prio; i <= p->static_prio ; i++) { + if (!rq->prio_level[USER_PRIO(i)]) + rq->prio_level[USER_PRIO(i)] = MAX_RT_PRIO; search_prio = max(rq->prio_level[uprio], - rq->prio_level[USER_PRIO(rq->best_static_prio)]); + rq->prio_level[USER_PRIO(i)]); + } } if (unlikely(p->policy == SCHED_BATCH)) { search_prio = max(search_prio, p->static_prio); @@ -718,6 +737,8 @@ static void queue_expired(struct task_st { task_new_array(p, rq, rq->expired); p->prio = p->normal_prio = first_prio_slot(p); + if (p->static_prio < rq->expired->best_static_prio) + rq->expired->best_static_prio =
Re: [PATCH][RFC] Kill off legacy power management stuff.
On Wed, 18 Apr 2007, Len Brown wrote:
> On Wednesday 18 April 2007 16:23, Robert P. J. Day wrote:
> > ok, i get it now and -- correct me if i'm wrong -- all my legacy PM
> > removal patch was doing was exposing a design boo-boo in which
> > APM/ACPI contention was being handled by a macro in a subsystem even
> > older than either of them, right?
>
> yeah, it didn't start out that way, the bug was added when the
> CONFIG_PM_LEGACY #define was added.
>
> > so all that needs to be done is add back in a contention solution
> > of some kind that doesn't rely on that ancient system, yes?
>
> Yes, it is a matter of making the variable not go away when the
> #define goes away.
>
> > as for that thinkpad t30 situation, well, that's just borked, and
> > should be fixed.
>
> yes, the actual failure is that APM mode on the T30 hangs -- and
> that is independent of the issue at hand. However, there could be
> other failures on other machines when both APM and ACPI think they
> are active.

at this point, i think the proper approach is to locate and remove all dependencies on the legacy PM code, which includes making sure there's a reliable contention mechanism for APM and ACPI that doesn't need anything out of the legacy code or header files. once that's done, the legacy deletion itself should be trivial.

the obvious place for the contention stuff is, i would think, include/linux/pm.h, yes?

rday
-- 
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA
http://fsdev.net/wiki/index.php?title=Main_Page
Re: Loud "pop" coming from hard drive on reboot
Tejun Heo wrote:
> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW
> 2. kernel shutdown starts
> 3. libata shutdown issues SYNCHRONIZE_CACHE
> 4. power goes off

Okay, after some experimentation, it's the STANDBY_NOW that is causing the Power-Off_Retract_Count to increment on my machine.

Tell me again why we think we need to issue that command?

Thanks.
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Wed, 18 Apr 2007, Ingo Molnar wrote:
> That's one reason why i dont think it's necessarily a good idea to
> group-schedule threads, we dont really want to do a per thread group
> percpu_alloc().

It is still not clear to me how much overhead this will bring to the table, but I think (like Linus was pointing out) the hierarchy should look like:

Top (VCPU maybe?)
User
Process
Thread

The "run_queue" concept (and data) that is now bound to a CPU needs to be replicated in:

ROOT <- VCPUs add themselves here
VCPU <- USERs add themselves here
USER <- PROCs add themselves here
PROC <- THREADs add themselves here
THREAD (ultimate fine grained scheduling unit)

So ROOT, VCPU, USER and PROC will have their own "run_queue". Picking up a new task would mean:

VCPU = ROOT->lookup();
USER = VCPU->lookup();
PROC = USER->lookup();
THREAD = PROC->lookup();

Run-time statistics should propagate back the other way around.

> In fact for threads the _reverse_ problem exists, threaded apps tend to
> _strive_ for more performance - hence their desperation of using the
> threaded programming model to begin with ;) (just think of media
> playback apps which are typically multithreaded)

The same user nicing two different multi-threaded processes would expect a predictable CPU distribution too. Doing that efficiently (the old per-cpu run-queue is pretty nice from many POVs) is the real challenge.

- Davide
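The lookup chain described above can be modelled in a few lines of userspace C. This is only a toy sketch under stated assumptions: `struct node`, `node_add()`, `node_lookup()`, `pick_next()` and `account()` are names invented for illustration and are not kernel interfaces, and each level picks its next child by plain round-robin rather than by any fairness metric. It shows one "run_queue" per level, a top-down walk to pick a thread, and run-time statistics propagating back up.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* One level of the hierarchy: ROOT, VCPU, USER, PROC or THREAD. */
struct node {
	const char *name;
	struct node *children[MAX_CHILDREN];	/* this level's "run_queue" */
	size_t nr_children;
	size_t next;				/* round-robin cursor */
	unsigned long runtime;			/* statistics, propagated upward */
	struct node *parent;
};

/* A child adds itself to its parent's run queue. */
static void node_add(struct node *parent, struct node *child)
{
	assert(parent->nr_children < MAX_CHILDREN);
	parent->children[parent->nr_children++] = child;
	child->parent = parent;
}

/* Pick the next child at one level (plain round-robin in this sketch). */
static struct node *node_lookup(struct node *n)
{
	struct node *child;

	if (n->nr_children == 0)
		return NULL;
	child = n->children[n->next];
	n->next = (n->next + 1) % n->nr_children;
	return child;
}

/* Walk down from the root until a node without children is reached. */
static struct node *pick_next(struct node *root)
{
	struct node *n = root;

	while (n && n->nr_children > 0)
		n = node_lookup(n);
	return n == root ? NULL : n;
}

/* Charge runtime to a thread and propagate it back up the hierarchy. */
static void account(struct node *thread, unsigned long delta)
{
	struct node *n;

	for (n = thread; n; n = n->parent)
		n->runtime += delta;
}
```

Note that the sketch treats any childless node as a schedulable leaf, so an empty PROC would be returned as if it were a thread; a real design would have to skip or dequeue empty groups, and would need per-CPU data to stay cheap.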
Re: Upgraded to 2.6.20.7 - positives
Denis Vlasenko wrote:
> * From make menuconfig questions it looks like SATA/PATA
> rewrite (in the form of libata) is almost finished. Hehe,
> untangling IDE mess was quite a feat, and Jeff did it. Kudos.
>

ADMA mode on nvidia chipsets still seems broken despite massive amount of SATA fixes backported from 2.6.21...
Re: Loud "pop" coming from hard drive on reboot
Robert Hancock wrote:
> Tejun Heo wrote:
>> This really isn't a regression. It's been always like that with libata.
>> libata doesn't make devices go into standby mode and shutdown(8) does
>> it for libata. The problem here is that libata does issue
>> SYNCHRONIZE_CACHE on shutdown. So, the sequence of event is...
>>
>> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW
>
> This part is presumably distribution dependent. I have never seen Fedora
> or CentOS shut down drives on power down from the shutdown script/utility..
>

Some distro shutdown scripts must be doing "halt -h" at shutdown time.

-n : don't sync cache (default is to sync)
-h : put harddrives in standby (default is no standby)

And BTW not put them in sleep instead of standby (whether it's the halt program or the kernel?) They won't wake up from that until they're reset.
Re: CFS and suspend2: hang in atomic copy
Ingo Molnar wrote: [Wed Apr 18 2007, 06:02:28PM EDT]
>
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
>
> > > although probably your suspend2 problem is still not fixed, it's
> > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > was it against -rc6 or -rc7?
> >
> > You are right again. ;-)
> >
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> what's the easiest way for me to try suspend2? Apply the patch, reboot
> into the kernel, then execute what command to suspend? (there's a
> confusing mishmash of initiators of all the suspend variants. Can i drive
> this by echoing to /sys/power/state?)
>
> Ingo

I had hoped to collect more data with CFS V2. It crashes in scale_nice_down for s2ram when attempting to disable_nonboot_cpus. So part of the traceback looks like (typed by hand with obvious omissions):

scale_nice_down
update_stats_wait_end - not shown in traceback because inlined
pick_next_task_fair
migration_call
task_rq_lock
notifier_call_chain
_cpu_down
disable_nonboot_cpus
...

This is standard -rc7 with V2 CFS applied. It could be a completely unrelated issue. I'll attempt to debug further tomorrow.

bob
Re: CFS and suspend2: hang in atomic copy
On Thursday 19 April 2007, Ingo Molnar wrote: > * Christian Hesse <[EMAIL PROTECTED]> wrote: > > Linux 2.6.21-rc7 > > Suspend2 2.2.9.11 (applies cleanly to -rc7) > > CFS v3 (without any additional patches) > > > > And it still hangs on suspend. > > i just tried the same and it suspended+resumed just fine: > > Restarting tasks ... done. > Suspend2 debugging info: > - Suspend core : 2.2.9.12 > - Kernel Version : 2.6.21-rc7-CFS-v3 > - Compiler vers. : 4.0 > - Attempt number : 2 > - Parameters : 0 81920 0 0 0 0 > - Overall expected compression percentage: 0. > - Compressor is 'lzf'. > Compressed 31133696 bytes into 14880587 (52 percent compression). > - SwapAllocator active. > Swap available for image: 512036 pages. > - FileAllocator inactive. > - I/O speed: Write 76 MB/s, Read 42 MB/s. > - Extra pages: 18 used/500. > > could you send me your .config? My config is attached. I now got some error message from my system: http://www.eworm.de/tmp/cfs-suspend.jpg -- Regards, Chris # # Automatically generated make config: don't edit # Linux kernel version: 2.6.21-rc7-r1 # Wed Apr 18 22:25:20 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # Code maturity level options # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup # CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y # CONFIG_IPC_NS is not set CONFIG_SYSVIPC_SYSCTL=y # CONFIG_POSIX_MQUEUE is not set # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # CONFIG_UTS_NS is not set # CONFIG_AUDIT is not set 
CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_IKPATCHES=y CONFIG_IKPATCHES_PROC=y # CONFIG_CPUSETS is not set # CONFIG_SYSFS_DEPRECATED is not set # CONFIG_RELAY is not set # CONFIG_BLK_DEV_INITRD is not set # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y # CONFIG_EMBEDDED is not set CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SHMEM=y CONFIG_SLAB=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 # CONFIG_SLOB is not set # # Loadable module support # CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y # CONFIG_MODULE_FORCE_UNLOAD is not set # CONFIG_MODVERSIONS is not set # CONFIG_MODULE_SRCVERSION_ALL is not set CONFIG_KMOD=y CONFIG_STOP_MACHINE=y # # Block layer # CONFIG_BLOCK=y # CONFIG_LBD is not set # CONFIG_BLK_DEV_IO_TRACE is not set # CONFIG_LSF is not set # # IO Schedulers # CONFIG_IOSCHED_NOOP=y # CONFIG_IOSCHED_AS is not set # CONFIG_IOSCHED_DEADLINE is not set CONFIG_IOSCHED_CFQ=y # CONFIG_DEFAULT_AS is not set # CONFIG_DEFAULT_DEADLINE is not set CONFIG_DEFAULT_CFQ=y # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="cfq" # # Processor type and features # # CONFIG_TICK_ONESHOT is not set # CONFIG_NO_HZ is not set # CONFIG_HIGH_RES_TIMERS is not set CONFIG_SMP=y CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_NUMAQ is not set # CONFIG_X86_SUMMIT is not set # CONFIG_X86_BIGSMP is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_ES7000 is not set # CONFIG_PARAVIRT is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set CONFIG_MPENTIUMM=y # CONFIG_MCORE2 is not set # CONFIG_MPENTIUM4 
is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_X86_GENERIC is not set CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=6 CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_CMPXCHG64=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y # CONFIG_HPET_TIMER is not set CONFIG_NR_CPUS=2 # CONFIG_SCHED_SMT is not set CONFIG_SCHED_MC=y CONFIG_PREEMPT_NONE=y # CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set # CONFIG_PREEMPT_BKL
Upgraded to 2.6.20.7 - positives
Hi kernel people,

Just upgraded my home box to 2.6.20.7. Wow.

* Reiser3 mount times are drastically reduced, even when journal replay is needed (I have a few 100Gb+ reiser3 partitions mounted at boot)
* sit pseudo-interface is gone. In the previous kernel, I tried to disable it in kernel config to no avail. Now it was easy to simply compile it as a module.
* From make menuconfig questions it looks like SATA/PATA rewrite (in the form of libata) is almost finished. Hehe, untangling IDE mess was quite a feat, and Jeff did it. Kudos.

Need to check now whether losetup oopses are gone too, or hunt them down if they are still with us :)

Thanks everybody for your amazing work.
-- 
vda
Re: Loud "pop" coming from hard drive on reboot
Tejun Heo wrote:
> This really isn't a regression. It's been always like that with libata.
> libata doesn't make devices go into standby mode and shutdown(8) does
> it for libata. The problem here is that libata does issue
> SYNCHRONIZE_CACHE on shutdown. So, the sequence of event is...
>
> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW

This part is presumably distribution dependent. I have never seen Fedora or CentOS shut down drives on power down from the shutdown script/utility..

> 2. kernel shutdown starts
> 3. libata shutdown issues SYNCHRONIZE_CACHE
> 4. power goes off
>
> Some drives seem to spin up at step #3 even when its cache is clean and
> power goes off right after the disk finishes the command. So, it's
> really bad when it happens - spin down, spin up followed by immediate
> power off.

-- 
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: Loud "pop" coming from hard drive on reboot
Stephen Clark wrote:
> So this is the pop I hear on my new laptop that is using
> libata=combined_mode when I shut my system down. I didn't get the pop
> with the same disk drive in an older laptop that was only ide. It sounds
> like a relay closing or opening, but is really my drive head doing an
> emergency retract/park?

Yes, that would be what it is, and why.

I would vote that the sd stop-on-shutdown patch should go in, and possibly with the new behavior enabled by default. Surely the number of people running Linux on a laptop (or any other system with load/unload head technology drives) is much greater than the number of people running a SAN, multi-initiator, etc. environment where you might not want this..

-- 
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: CPU_IDLE prevents resuming from STR [was: Re: 2.6.21-rc6-mm1]
On Tue, 17 Apr 2007, Shaohua Li wrote:
> Looks there is init order issue of sysfs files. The new refreshed patch
> should fix your bug.

Yes, that did fix the hang on resume from STR -- that now works fine. However:

[EMAIL PROTECTED]:/sys/devices/system/cpu/cpuidle$ cat available_drivers current_driver
[EMAIL PROTECTED]:/sys/devices/system/cpu/cpuidle$ cat available_governors current_governor
ladder
ladder

Is this correct? For reference, my config is http://joshuawise.com/config.gz -- I didn't see any options for cpuidle drivers to access ACPI states...

joshua
mkinitrd. (was Re: [RFC] [PATCH] Allow overriding module parameters from kernel command_line)
On Thu, Apr 19, 2007 at 08:47:13AM +1000, Neil Brown wrote:
> > Fixed by changing /etc/fstab and rebuilding initrd, but IMO rootfstype=
> > should have worked.
>
> I think these are both issues that should be solved by smarts in the
> initrd.

This is getting away from the intent of Kyle's original patch (which I think is worthwhile fwiw, having recently hit the exact same sata_nv bug that prompted him to write it).

> What we really need is a single reference implementation of "mkinitrd"
> which each distro can fiddle with to their heart's content. Then
> sensible ideas like the above can be incorporated into the reference,
> and all distros will ultimately pick them up.
>
> But unfortunately I don't have the time to volunteer for this role...

The problem I see with such a 'one mkinitrd to rule them all' is that it would suffer from the same thing that stopped any vendor stepping up and getting behind hpa's klibc project... Apathy due to "our current stuff works, why would we throw it all away and start again".

It's a great idea in theory; in practice, however, initrd construction for every distro now contains years of custom hacks and workarounds (that may not even be relevant on other distros). Given the critical nature of mkinitrd (get something wrong, and your system doesn't boot), unsurprisingly, people are reluctant to change away from something they're familiar with, unless there's a *really* compelling reason.

Dave
-- 
http://www.codemonkey.org.uk
Re: Loud "pop" coming from hard drive on reboot
Stephen Clark wrote:
> I tried this on 2.6.20.2 it applied to libata with some fuzz and I had
> to manually edit libata.h
>
> When I did a shutdown I still got the click/pop. I also noticed the last
> thing displayed on the lcd before it goes blank is
> Synchronizing SCSI Disks - then the click/pop.
>
> HTH,
> Steve

That patch on its own will not help, you also need Tejun's stop-on-shutdown patch, otherwise the kernel will not try to stop the disk before powering off.

-- 
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: [RFC 1/1] Char: tty, add io tty_insert_flip_string variants
Alan Cox wrote:
> On Thu, 19 Apr 2007 00:35:20 +0200 (CEST)
> Jiri Slaby <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> don't you consider this useful for some drivers. There are many cases, when
>> tty_insert_flip_stringio might be used.
>
> I couldn't see anyone who really benefitted when I first looked at this
> but if you've got a case you want to use them then I've certainly got no
> problem with it.

Ah, I'm an idiot -- I should go through the code more carefully, tty_prepare_flip_string + memcpy_fromio can do the job.

thanks,
-- 
http://www.fi.muni.cz/~xslaby/ Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
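For readers following the thread, the chunked copy loop at the heart of this discussion can be modelled in userspace. This is a rough sketch, not the tty layer: `struct flip_buf`, `struct flip_queue`, `request_room()` and `insert_string()` are hypothetical stand-ins, the buffers are fixed-size arrays instead of the kernel's dynamically allocated ones, and plain `memcpy()` takes the place of `memcpy_fromio()`. The point it illustrates is only that the copy routine can be passed in as a parameter while the split-across-buffers loop stays the same.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 4	/* deliberately tiny so that data gets split */
#define MAX_BUFS 3

struct flip_buf {
	unsigned char chars[BUF_SIZE];
	size_t used;
};

struct flip_queue {
	struct flip_buf bufs[MAX_BUFS];
	size_t tail;	/* index of the buffer currently being filled */
};

/* Copy routine; memcpy() here stands in for memcpy_fromio(). */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/*
 * Return how many bytes (at most 'size') fit into the tail buffer,
 * moving on to the next buffer when the current one is full.
 * Returns 0 when no space is left at all.
 */
static size_t request_room(struct flip_queue *q, size_t size)
{
	size_t left;

	assert(q->tail < MAX_BUFS);
	if (q->bufs[q->tail].used == BUF_SIZE && q->tail + 1 < MAX_BUFS)
		q->tail++;
	left = BUF_SIZE - q->bufs[q->tail].used;
	return size < left ? size : left;
}

/* Insert 'size' bytes, looping when the data spans several buffers. */
static size_t insert_string(struct flip_queue *q, const unsigned char *chars,
			    size_t size, copy_fn copy)
{
	size_t copied = 0;

	do {
		size_t space = request_room(q, size - copied);
		struct flip_buf *fb = &q->bufs[q->tail];

		if (space == 0)
			break;
		copy(fb->chars + fb->used, chars, space);
		fb->used += space;
		copied += space;
		chars += space;
	} while (size > copied);

	return copied;
}
```

A caller zero-initialises a `struct flip_queue` and calls `insert_string(&q, data, len, memcpy)`; the return value is the number of bytes actually queued, which may be short when the model runs out of buffers.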
Re: [RFC] [PATCH] Allow overriding module parameters from kernel command_line
On Wednesday April 18, [EMAIL PROTECTED] wrote:
>
> On Wed, 18 Apr 2007 11:55:52 -0400 Kyle McMartin <[EMAIL PROTECTED]> wrote:
> > With the move to initramfs and heavily modular configs, which include
> > loading storage drivers from early userspace, it's becoming harder
> > to provide users with a way of overriding module parameters at boot.
> >
> > Currently, users would have to break into the initramfs, edit the
> > modprobe options, and then let boot continue. They have a much easier time
> > dealing with adding options on the command line from Grub or what have you.
> >
> > I hacked out this patch quickly to re-parse saved_command_line[] when we
> > load a module in an attempt to rectify this.
> >
> > (The specific use-case I was looking at here was HPA commands failing on
> > sata_nv controllers, and needing to pass the adma=0 option to the module...
> > Users had a hard time testing without an easy way of overriding the
> > module.)
> >
> > Clearly this is not entirely optimal, because we're parsing command_line
> > after the module params are parsed. This ends up being a policy decision,
> > whether the /sbin/modprobe commandline should override the kernel
> > command_line, or vice versa.
>
> Similar-but-different: I was trying to persuade a Fedora system to use ext2
> for the root filesystem the other day. Turns out that we somehow managed
> to break `rootfstype=' in this situation and it cheerfully continued to use
> ext3.
>
> Fixed by changing /etc/fstab and rebuilding initrd, but IMO rootfstype=
> should have worked.

I think these are both issues that should be solved by smarts in the initrd.

All of the (unused) kernel parameters are in the environment aren't they? (if not, they can easily be put there). So maybe insmod/modprobe could be updated to extract relevant options from the environment.
And the mount of the root filesystem should be called as:

  mount ${rootfstype+-t $rootfstype} $dev $mountpoint

We are depending more and more on initrd and I think it hurts not having a reference implementation. Currently each distro makes their own and while I'm sure they are all quite good in their own way, the fact that they are independent makes community input harder.

What we really need is a single reference implementation of "mkinitrd" which each distro can fiddle with to their heart's content. Then sensible ideas like the above can be incorporated into the reference, and all distros will ultimately pick them up.

But unfortunately I don't have the time to volunteer for this role...

NeilBrown
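Whichever side ends up doing it (the kernel re-parsing its saved command line, or insmod/modprobe looking at what was passed at boot), the core step is picking a module's options out of a command-line string. The following is a minimal userspace sketch of that step only. It assumes options are written as `<module>.<param>=<value>` tokens separated by blanks; that spelling is an assumption made for the example, since the syntax the original patch accepts is not shown in this thread, and `module_opts_from_cmdline()` is a hypothetical helper, not an existing kernel or module-init-tools function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Collect every "<module>.<param>=<value>" token from 'cmdline' and write
 * the "<param>=<value>" parts, space separated, into 'out'.
 * Returns the number of matching options, or -1 if 'out' is too small.
 */
static int module_opts_from_cmdline(const char *cmdline, const char *module,
				    char *out, size_t outlen)
{
	size_t modlen = strlen(module);
	size_t used = 0;
	int found = 0;
	const char *p = cmdline;

	assert(outlen > 0);
	out[0] = '\0';

	while (*p) {
		size_t toklen;

		p += strspn(p, " \t");		/* skip separators */
		toklen = strcspn(p, " \t");	/* length of this token */
		if (toklen == 0)
			break;

		if (toklen > modlen + 1 &&
		    strncmp(p, module, modlen) == 0 && p[modlen] == '.') {
			size_t optlen = toklen - modlen - 1;

			/* room for an optional separator, the option, and NUL */
			if (used + (found ? 1 : 0) + optlen + 1 > outlen)
				return -1;
			if (found)
				out[used++] = ' ';
			memcpy(out + used, p + modlen + 1, optlen);
			used += optlen;
			out[used] = '\0';
			found++;
		}
		p += toklen;
	}
	return found;
}
```

With a command line such as `root=/dev/hda2 ro sata_nv.adma=0`, asking for module `sata_nv` fills the output buffer with `adma=0`. The sketch does not handle quoted values containing spaces, nor treat `-` and `_` in module names as equivalent, and it leaves open the policy question raised above of which source of options should win.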
Re: [RFC 1/1] Char: tty, add io tty_insert_flip_string variants
On Thu, 19 Apr 2007 00:35:20 +0200 (CEST) Jiri Slaby <[EMAIL PROTECTED]> wrote:
> Hi,
>
> don't you consider this useful for some drivers. There are many cases, when
> tty_insert_flip_stringio might be used.

I couldn't see anyone who really benefitted when I first looked at this but if you've got a case you want to use them then I've certainly got no problem with it.
[RFC 1/1] Char: tty, add io tty_insert_flip_string variants
Hi,

would you consider this useful for some drivers? There are many cases where
tty_insert_flip_stringio might be used.

--
tty, add io tty_insert_flip_string variants

Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>

---
commit a7dafceb31ff535b793227036f5b2b6a1e8cf233
tree 51e1bd24bfb9a2842bb12e097cec0b97a7bf3998
parent 71c2e9b72594f69e4e226006206ffa74b55c1642
author Jiri Slaby <[EMAIL PROTECTED]> Thu, 19 Apr 2007 00:27:22 +0200
committer Jiri Slaby <[EMAIL PROTECTED]> Thu, 19 Apr 2007 00:27:22 +0200

 drivers/char/tty_io.c | 101 ---
 1 file changed, 69 insertions(+), 32 deletions(-)

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index 959a616..6b215ed 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -95,6 +95,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -441,6 +442,25 @@ int tty_buffer_request_room(struct tty_struct *tty, size_t size)
 }
 EXPORT_SYMBOL_GPL(tty_buffer_request_room);
 
+#define __tty_insert_flip_string(tty, chars, flags, size, chrfun, flfun) ({ \
+	int copied = 0; \
+	do { \
+		int space = tty_buffer_request_room(tty, size - copied); \
+		struct tty_buffer *tb = tty->buf.tail; \
+		/* If there is no space then tb may be NULL */ \
+		if (unlikely(space == 0)) \
+			break; \
+		chrfun(tb->char_buf_ptr + tb->used, chars, space); \
+		flfun(tb->flag_buf_ptr + tb->used, flags, space); \
+		tb->used += space; \
+		copied += space; \
+		chars += space; \
+		/* There is a small chance that we need to split the data over \
+		   several buffers. If this is the case we must loop */ \
+	} while (unlikely(size > copied)); \
+	copied; \
+})
+
 /**
  * tty_insert_flip_string - Add characters to the tty buffer
  * @tty: tty structure
@@ -456,26 +476,33 @@ EXPORT_SYMBOL_GPL(tty_buffer_request_room);
 int tty_insert_flip_string(struct tty_struct *tty, const unsigned char *chars,
 		size_t size)
 {
-	int copied = 0;
-	do {
-		int space = tty_buffer_request_room(tty, size - copied);
-		struct tty_buffer *tb = tty->buf.tail;
-		/* If there is no space then tb may be NULL */
-		if(unlikely(space == 0))
-			break;
-		memcpy(tb->char_buf_ptr + tb->used, chars, space);
-		memset(tb->flag_buf_ptr + tb->used, TTY_NORMAL, space);
-		tb->used += space;
-		copied += space;
-		chars += space;
-		/* There is a small chance that we need to split the data over
-		   several buffers. If this is the case we must loop */
-	} while (unlikely(size > copied));
-	return copied;
+	return __tty_insert_flip_string(tty, chars, TTY_NORMAL, size,
+			memcpy, memset);
 }
 EXPORT_SYMBOL(tty_insert_flip_string);
 
 /**
+ * tty_insert_flip_stringio- Add characters to the tty buffer
+ * @tty: tty structure
+ * @chars: characters
+ * @size: size
+ *
+ * Queue a series of bytes to the tty buffering from io memory. All the
+ * characters passed are marked as without error. Returns the number
+ * added.
+ *
+ * Locking: Called functions may take tty->buf.lock
+ */
+
+int tty_insert_flip_stringio(struct tty_struct *tty,
+		const unsigned char __iomem *chars, size_t size)
+{
+	return __tty_insert_flip_string(tty, chars, TTY_NORMAL, size,
+			memcpy_fromio, memset);
+}
+EXPORT_SYMBOL(tty_insert_flip_stringio);
+
+/**
  * tty_insert_flip_string_flags- Add characters to the tty buffer
  * @tty: tty structure
  * @chars: characters
@@ -492,27 +519,35 @@ EXPORT_SYMBOL(tty_insert_flip_string);
 int tty_insert_flip_string_flags(struct tty_struct *tty,
 		const unsigned char *chars, const char *flags, size_t size)
 {
-	int copied = 0;
-	do {
-		int space = tty_buffer_request_room(tty, size - copied);
-		struct tty_buffer *tb = tty->buf.tail;
-		/* If there is no space then tb may be NULL */
-		if(unlikely(space == 0))
-			break;
-		memcpy(tb->char_buf_ptr + tb->used, chars, space);
-		memcpy(tb->flag_buf_ptr + tb->used, flags, space);
-		tb->used += space;
-
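The patch above shares one copy loop between several entry points and passes in the copy routine. As a rough userspace illustration of that idea (not the kernel code), the sketch below uses a function pointer instead of a macro. The types `struct chunk` and `struct sink` and the helper `request_room()` are hypothetical stand-ins for the tty buffer structures and `tty_buffer_request_room()`, and the tiny chunk size is chosen only so that the data has to be split across buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 8	/* deliberately tiny so that splitting is exercised */
#define NCHUNKS    4

/* Hypothetical stand-in for struct tty_buffer. */
struct chunk {
	unsigned char data[CHUNK_SIZE];
	char flags[CHUNK_SIZE];
	size_t used;
};

/* Hypothetical stand-in for the tty's chain of buffers. */
struct sink {
	struct chunk chunks[NCHUNKS];
	size_t tail;	/* index of the chunk currently being filled */
};

/* Same shape as memcpy(); an I/O-memory copy routine could be passed too. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Returns how many bytes (at most want) fit into the tail chunk,
 * advancing to the next chunk when the current one is full.
 * Returns 0 once every chunk is full. */
static size_t request_room(struct sink *s, size_t want)
{
	size_t left;

	if (s->tail < NCHUNKS && s->chunks[s->tail].used == CHUNK_SIZE)
		s->tail++;
	if (s->tail >= NCHUNKS)
		return 0;
	left = CHUNK_SIZE - s->chunks[s->tail].used;
	return want < left ? want : left;
}

/* Copies up to size bytes, looping when the data spans several chunks.
 * Every copied byte gets the same flag. Returns the number copied. */
static size_t insert_string(struct sink *s, const unsigned char *chars,
			    char flag, size_t size, copy_fn copy)
{
	size_t copied = 0;

	assert(s && chars && copy);
	while (copied < size) {
		size_t space = request_room(s, size - copied);
		struct chunk *c;

		if (space == 0)
			break;	/* out of room: report a short count */
		c = &s->chunks[s->tail];
		copy(c->data + c->used, chars, space);
		memset(c->flags + c->used, flag, space);
		c->used += space;
		copied += space;
		chars += space;
	}
	return copied;
}
```

A caller would pass `memcpy` for ordinary memory. A function pointer keeps type checking on the copy routine, at the cost of an indirect call that the macro version avoids.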
[PATCH 2/2] wistron_btns: add led support
This patch adds support for mail and wifi leds. It modifies the Kconfig file to automatically pull in led_class with wistron_btns; hopefully everyone is fine with this. It doesn't add support for the bluetooth led because, so far, it seems all the laptops with bluetooth have the led and the bluetooth system linked (meaning it is already managed by the driver). This was tested on a TM 610 and an Aspire 3020.

Eric

(sorry for the multiple receptions)

From: Eric Piel <[EMAIL PROTECTED]>

wistron_btns: Add led support

Add support to wistron_btns for the leds that come with the multimedia keys. Mail and wifi leds are supported, on laptops which have them. Depending on the laptop, the wifi subsystem may control just the led, or both the led and the wifi card. The wifi led interface is activated only for the former type of laptops, as the latter type is already managed. Leds are controlled by the interface in /sys/class/leds.

Signed-off-by: Eric Piel <[EMAIL PROTECTED]>

--- linux-2.6.21/drivers/input/misc/wistron_btns.c.bak	2007-04-07 15:09:30.0 +0200
+++ linux-2.6.21/drivers/input/misc/wistron_btns.c	2007-04-14 12:42:38.0 +0200
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * Number of attempts to read data from queue per poll;
@@ -46,11 +47,12 @@
 /* BIOS subsystem IDs */
 #define WIFI 0x35
 #define BLUETOOTH 0x34
+#define MAIL_LED 0x31
 
 MODULE_AUTHOR("Miloslav Trmac <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION("Wistron laptop button driver");
 MODULE_LICENSE("GPL v2");
-MODULE_VERSION("0.2");
+MODULE_VERSION("0.3");
 
 static int force; /* = 0; */
 module_param(force, bool, 0);
@@ -251,6 +253,7 @@
 static const struct key_entry *keymap; /* = NULL; Current key map */
 static int have_wifi;
 static int have_bluetooth;
+static int have_leds;
 
 static int __init dmi_matched(struct dmi_system_id *dmi)
 {
@@ -263,6 +266,8 @@
 		else if (key->type == KE_BLUETOOTH)
 			have_bluetooth = 1;
 	}
+	have_leds = key->code & (FE_MAIL_LED | FE_WIFI_LED);
+
 	return 1;
 }
 
@@ -1028,6 +1033,83 @@
 	input_sync(input_dev);
 }
 
+
+ /* led management */
+static void wistron_mail_led_set(struct led_classdev *led_cdev,
+				enum led_brightness value)
+{
+	bios_set_state(MAIL_LED, (value != LED_OFF) ? 1 : 0);
+}
+
+/* same as setting up wifi card, but for laptops on which the led is managed */
+static void wistron_wifi_led_set(struct led_classdev *led_cdev,
+				enum led_brightness value)
+{
+	bios_set_state(WIFI, (value != LED_OFF) ? 1 : 0);
+}
+
+static struct led_classdev wistron_mail_led = {
+	.name = "mail:green",
+	.brightness_set = wistron_mail_led_set,
+};
+
+static struct led_classdev wistron_wifi_led = {
+	.name = "wifi:red",
+	.brightness_set = wistron_wifi_led_set,
+};
+
+static void __devinit wistron_led_init(struct device *parent)
+{
+	if (have_leds & FE_WIFI_LED) {
+		u16 wifi = bios_get_default_setting(WIFI);
+		if (wifi & 1) {
+			wistron_wifi_led.brightness = (wifi & 2) ? LED_FULL : LED_OFF;
+			if (led_classdev_register(parent, &wistron_wifi_led))
+				have_leds &= ~FE_WIFI_LED;
+			else
+				bios_set_state(WIFI, wistron_wifi_led.brightness);
+
+		} else
+			have_leds &= ~FE_WIFI_LED;
+	}
+
+	if (have_leds & FE_MAIL_LED) {
+		/* bios_get_default_setting(MAIL) always returns 0, so just turn the led off */
+		wistron_mail_led.brightness = LED_OFF;
+		if (led_classdev_register(parent, &wistron_mail_led))
+			have_leds &= ~FE_MAIL_LED;
+		else
+			bios_set_state(MAIL_LED, wistron_mail_led.brightness);
+	}
+}
+
+static void __devexit wistron_led_remove(void)
+{
+	if (have_leds & FE_MAIL_LED)
+		led_classdev_unregister(&wistron_mail_led);
+
+	if (have_leds & FE_WIFI_LED)
+		led_classdev_unregister(&wistron_wifi_led);
+}
+
+static inline void wistron_led_suspend(void)
+{
+	if (have_leds & FE_MAIL_LED)
+		led_classdev_suspend(&wistron_mail_led);
+
+	if (have_leds & FE_WIFI_LED)
+		led_classdev_suspend(&wistron_wifi_led);
+}
+
+static inline void wistron_led_resume(void)
+{
+	if (have_leds & FE_MAIL_LED)
+		led_classdev_resume(&wistron_mail_led);
+
+	if (have_leds & FE_WIFI_LED)
+		led_classdev_resume(&wistron_wifi_led);
+}
+
 /* Driver core */
 
 static int wifi_enabled;
@@ -1125,6 +1207,7 @@
 		bios_set_state(BLUETOOTH, bluetooth_enabled);
 	}
 
+	wistron_led_init(&dev->dev);
 	poll_bios(1); /* Flush stale event queue and arm timer */
 
 	return 0;
@@ -1133,6 +1216,7 @@
 static int __devexit wistron_remove(struct platform_device *dev)
 {
 	del_timer_sync(&poll_timer);
+	wistron_led_remove();
 	input_unregister_device(input_dev);
 	bios_detach();
 
@@ -1150,6 +1233,7 @@
 	if (have_bluetooth)
 		bios_set_state(BLUETOOTH, 0);
 
+	wistron_led_suspend();
 	return 0;
 }
 
@@ -1161,6 +1245,7 @@
 	if (have_bluetooth)
 		bios_set_state(BLUETOOTH, bluetooth_enabled);
 
+	wistron_led_resume();
 	poll_bios(1);
 
 	return 0;
--- linux-2.6.21/drivers/input/misc/Kconfig.bak	2007-04-09 23:18:49.0 +0200
+++ linux-2.6.21/drivers/input/misc/Kconfig	2007-04-14 02:53:01.0 +0200
@@ -43,9 +43,12 @@
 config INPUT_WISTRON_BTNS
 	tristate "x86 Wistron laptop button
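The patch tracks which leds are usable in a single `have_leds` bitmask and clears a bit whenever registration fails, so that the remove, suspend and resume paths only touch leds that were actually registered. Here is a minimal userspace sketch of that pattern. The flag values and the `register_fn` callbacks are assumptions made for illustration; they stand in for the driver's `FE_*` flags and for `led_classdev_register()`.

```c
#include <assert.h>
#include <string.h>

/* Assumed flag values, for illustration only. */
#define FE_MAIL_LED 0x01
#define FE_WIFI_LED 0x02

/* Hypothetical registration hook: returns 0 on success, non-zero on failure. */
typedef int (*register_fn)(const char *name);

/* Registers each led named in have_leds and returns the mask of the
 * leds that really got registered; a failed led loses its bit. */
static unsigned int leds_init(unsigned int have_leds, register_fn reg)
{
	assert(reg);
	if ((have_leds & FE_WIFI_LED) && reg("wifi:red") != 0)
		have_leds &= ~FE_WIFI_LED;
	if ((have_leds & FE_MAIL_LED) && reg("mail:green") != 0)
		have_leds &= ~FE_MAIL_LED;
	return have_leds;
}

/* Example hooks used to exercise both outcomes. */
static int reg_ok(const char *name)
{
	(void)name;
	return 0;
}

static int reg_fail_wifi(const char *name)
{
	return strcmp(name, "wifi:red") == 0 ? -1 : 0;
}
```

Later teardown code can then test the returned mask bit by bit, as `wistron_led_remove()` does with `have_leds`.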
[PATCH 0/2] wistron_btns: small fix and led support
Hello, The following two patches are against the input tree and improve the wistron_btns driver. The first patch is mostly trivial, it fixes a typo that I introduced in the previous batch. The second patch adds led support to the driver (and therefore also dependency on the led class). See you, Eric
[PATCH 1/2] wistron_btns: fix typo for TM610
This fixes a typo in the TM610 definition, introduced in my recent patch "add-acerhk-database".

Eric

From: Eric Piel <[EMAIL PROTECTED]>

wistron_btns: Fix typo for TM610

I made a typo in a previous patch for wistron_btns, "add acerhk database". This patch fixes that typo, which prevented the PROG2 key from working.

Signed-off-by: Eric Piel <[EMAIL PROTECTED]>

--- linux-2.6.21/drivers/input/misc/wistron_btns.c.bak	2007-04-07 15:09:30.0 +0200
+++ linux-2.6.21/drivers/input/misc/wistron_btns.c	2007-04-07 15:09:44.0 +0200
@@ -490,7 +490,7 @@
 	{ KE_KEY, 0x01, {KEY_HELP} },
 	{ KE_KEY, 0x02, {KEY_CONFIG} },
 	{ KE_KEY, 0x11, {KEY_PROG1} },
-	{ KE_KEY, 0x12, {KEY_PROG3} },
+	{ KE_KEY, 0x12, {KEY_PROG2} },
 	{ KE_KEY, 0x13, {KEY_PROG3} },
 	{ KE_KEY, 0x14, {KEY_MAIL} },
 	{ KE_KEY, 0x15, {KEY_WWW} },
Re: CFS and suspend2: hang in atomic copy
On Thursday 19 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > although probably your suspend2 problem is still not fixed, it's
> > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > was it against -rc6 or -rc7?
> >
> > You are right again. ;-)
> >
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> what's the easiest way for me to try suspend2? Apply the patch, reboot
> into the kernel, then execute what command to suspend? (there's a
> confusing mishmash of initiators of all the suspend variants. Can i drive
> this by echoing to /sys/power/state?)

Perhaps you have to install suspend2-userui as well for the output (I'm not sure whether it works without). Then you can trigger the suspend by echoing to /sys/power/suspend2/do_suspend. Useful information can be found in the Howto:

http://www.suspend2.net/HOWTO

I dropped some ccs to not abuse Linus and friends.

--
Regards, Chris
Re: CFS and suspend2: hang in atomic copy
* Christian Hesse <[EMAIL PROTECTED]> wrote:

> Linux 2.6.21-rc7
> Suspend2 2.2.9.11 (applies cleanly to -rc7)
> CFS v3 (without any additional patches)
>
> And it still hangs on suspend.

i just tried the same and it suspended+resumed just fine:

Restarting tasks ... done.
Suspend2 debugging info:
- Suspend core : 2.2.9.12
- Kernel Version : 2.6.21-rc7-CFS-v3
- Compiler vers. : 4.0
- Attempt number : 2
- Parameters : 0 81920 0 0 0 0
- Overall expected compression percentage: 0.
- Compressor is 'lzf'. Compressed 31133696 bytes into 14880587 (52 percent compression).
- SwapAllocator active. Swap available for image: 512036 pages.
- FileAllocator inactive.
- I/O speed: Write 76 MB/s, Read 42 MB/s.
- Extra pages: 18 used/500.

could you send me your .config?

	Ingo
Re: CFS and suspend2: hang in atomic copy
* Christian Hesse <[EMAIL PROTECTED]> wrote:

> > although probably your suspend2 problem is still not fixed, it's
> > worth a try nevertheless. Which suspend2 patch did you apply, and
> > was it against -rc6 or -rc7?
>
> You are right again. ;-)
>
> Linux 2.6.21-rc7
> Suspend2 2.2.9.11 (applies cleanly to -rc7)
> CFS v3 (without any additional patches)
>
> And it still hangs on suspend.

what's the easiest way for me to try suspend2? Apply the patch, reboot into the kernel, then execute what command to suspend? (there's a confusing mishmash of initiators of all the suspend variants. Can i drive this by echoing to /sys/power/state?)

	Ingo
Re: CFS and suspend2: hang in atomic copy
On Wednesday 18 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > > i took a quick look at suspend2 and it makes some use of yield().
> > > There's a bug in CFS's yield code, i've attached a patch that should
> > > fix it, does it make any difference to the hang?
> >
> > This patch should apply cleanly against what? The second hunk is
> > ignored as it has already been applied. Is this correct?
>
> hm, i think you might have had one of the earlier CFS patches.

You are right.

> > But no, it does not change anything. Let me know if you have any other
> > patches to test.
>
> could you try the -v3 patch i released a few hours ago:
>
>    http://redhat.com/~mingo/cfs-scheduler/
>
> although probably your suspend2 problem is still not fixed, it's worth a
> try nevertheless. Which suspend2 patch did you apply, and was it against
> -rc6 or -rc7?

You are right again. ;-)

Linux 2.6.21-rc7
Suspend2 2.2.9.11 (applies cleanly to -rc7)
CFS v3 (without any additional patches)

And it still hangs on suspend.

--
Regards, Chris
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Wednesday 18 April 2007 22:33, Con Kolivas wrote:
> On Wednesday 18 April 2007 22:14, Nick Piggin wrote:
> > On Wed, Apr 18, 2007 at 07:33:56PM +1000, Con Kolivas wrote:
> > > On Wednesday 18 April 2007 18:55, Nick Piggin wrote:
> > > > Again, for comparison 2.6.21-rc7 mainline:
> > > >
> > > > 508.87user 32.47system 2:17.82elapsed 392%CPU
> > > > 509.05user 32.25system 2:17.84elapsed 392%CPU
> > > > 508.75user 32.26system 2:17.83elapsed 392%CPU
> > > > 508.63user 32.17system 2:17.88elapsed 392%CPU
> > > > 509.01user 32.26system 2:17.90elapsed 392%CPU
> > > > 509.08user 32.20system 2:17.95elapsed 392%CPU
> > > >
> > > > So looking at elapsed time, a granularity of 100ms is just behind the
> > > > mainline score. However it is using slightly less user time and
> > > > slightly more idle time, which indicates that balancing might have
> > > > got a bit less aggressive.
> > > >
> > > > But anyway, it conclusively shows the efficiency impact of such tiny
> > > > timeslices.
> > >
> > > See test.kernel.org for how (the now defunct) SD was performing on
> > > kernbench. It had low latency _and_ equivalent throughput to mainline.
> > > Set the standard appropriately on both counts please.
> >
> > I can give it a run. Got an updated patch against -rc7?
>
> I said I wasn't pursuing it but since you're offering, the rc6 patch should
> apply ok.
>
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc6-sd-0.40.patch

Oh and if you go to the effort of trying you may as well try the timeslice tweak to see what effect it has on SD as well.

/proc/sys/kernel/rr_interval

100 is the highest.

--
-ck
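For readers comparing the quoted kernbench lines, the %CPU column can be reproduced from the other three fields as (user + system) / elapsed. The sketch below assumes the percentage is truncated rather than rounded, which matches the figures quoted; the function names are made up for this example.

```c
#include <assert.h>

/* Converts a "minutes:seconds" elapsed time into seconds. */
static double elapsed_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* CPU utilisation in percent: CPU time consumed divided by wall-clock
 * time, truncated to an integer. Values above 100 mean that more than
 * one CPU was busy on average. */
static int cpu_percent(double user, double sys, double elapsed)
{
	assert(elapsed > 0.0);
	return (int)((user + sys) * 100.0 / elapsed);
}
```

For the first quoted run, 508.87 + 32.47 = 541.34 seconds of CPU time over 137.82 seconds of wall time gives about 3.93 CPUs busy, shown as 392%.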
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Davide Libenzi <[EMAIL PROTECTED]> wrote:

> I think Ingo's idea of a new sched_group to contain the generic
> parameters needed for the "key" calculation, works better than adding
> more fields to existing structures (that would, of course, host
> pointers to it). Otherwise I can already see the struct_signal being
> the target for other unrelated fields :)

yeah. Another detail is that for global containers like uids, the statistics will have to be percpu_alloc()-ed, both for correctness (runqueues are per CPU) and for performance.

That's one reason why i dont think it's necessarily a good idea to group-schedule threads, we dont really want to do a per thread group percpu_alloc().

In fact for threads the _reverse_ problem exists, threaded apps tend to _strive_ for more performance - hence their desperation of using the threaded programming model to begin with ;) (just think of media playback apps which are typically multithreaded)

I dont think threads are all that different. Also, the resource-conserving act of using CLONE_VM to share the VM (and to use a different programming environment like Java) should not be 'punished' by forcing the thread group to be accounted as a single, shared entity against other 'fat' tasks.

so my current impression is that we want per UID accounting to solve the X problem, the kernel threads problem and the many-users problem, but i'd not want to do it for threads just yet because for them there's not really any apparent problem to be solved.

	Ingo
[RFC][PATCH] fix for async scsi scan sysfs problem
Hello,

I'm having a problem on the newest version of linus's git tree with my qla2xxx card. This is on a UP box; the problem doesn't happen on my similarly configured SMP box. When I unload and then try to load the qla2xxx driver again I get this message:

kobject_add failed for 3:0:0:0 with -EEXIST, don't try to register things with the same name in the same directory.
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] kobject_shadow_add+0xcd/0x1df
 [] kobject_add+0xa/0xc
 [] device_add+0xab/0x62e
 [] scsi_sysfs_add_sdev+0x2d/0x1eb [scsi_mod]
 [] scsi_probe_and_add_lun+0x974/0xaa5 [scsi_mod]
 [] __scsi_scan_target+0xc0/0x5f1 [scsi_mod]
 [] scsi_scan_target+0x97/0xa6 [scsi_mod]
 [] fc_scsi_scan_rport+0x5a/0x76 [scsi_transport_fc]
 [] run_workqueue+0x89/0x14e
 [] worker_thread+0xf8/0x124
 [] kthread+0xb3/0xdc
 [] kernel_thread_helper+0x7/0x10
 ===

I traced this down to the async scanning doing a kobject_add for that object; the backtrace below shows the path we took to add it.

 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] kobject_shadow_add+0xcd/0x1df
 [] kobject_add+0xa/0xc
 [] class_device_add+0x9e/0x3ad
 [] scsi_sysfs_add_sdev+0x5a/0x1eb [scsi_mod]
 [] do_scan_async+0x62/0xf8 [scsi_mod]
 [] kthread+0xb3/0xdc
 [] kernel_thread_helper+0x7/0x10
 ===

Looking through everything I came to the conclusion that we don't really need the scsi_sysfs_add_devices in scsi_finish_async_scan, which gets run every time we do a do_scan_async. In doing the scanning, if we come upon anything we will already be registering the device with sysfs, so the scsi_sysfs_add_devices step is kind of useless. I tested this and it worked fine on my UP box (where the problem was happening) and my SMP box (where the problem wasn't happening). Now I'm not entirely sure if this is correct, but I'm attaching the patch that I used to fix it for me; please point out if I've done something wrong or if there is a different way this needs to be fixed.

Thank you,
Josef

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 0949145..2c8527b 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1661,15 +1661,6 @@ int scsi_scan_host_selected(struct Scsi_
 	return 0;
 }
 
-static void scsi_sysfs_add_devices(struct Scsi_Host *shost)
-{
-	struct scsi_device *sdev;
-	shost_for_each_device(sdev, shost) {
-		if (scsi_sysfs_add_sdev(sdev) != 0)
-			scsi_destroy_sdev(sdev);
-	}
-}
-
 /**
  * scsi_prep_async_scan - prepare for an async scan
  * @shost: the host which will be scanned
@@ -1741,8 +1732,6 @@ static void scsi_finish_async_scan(struc
 
 	wait_for_completion(&data->prev_finished);
 
-	scsi_sysfs_add_devices(shost);
-
 	spin_lock(&async_scan_lock);
 	shost->async_scan = 0;
 	list_del(&data->list);
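The failure reported above is a second registration of the same name in the same directory, which is refused with -EEXIST. The toy registry below shows that behaviour in isolation. It is a userspace sketch with made-up names (`struct registry`, `registry_add()`), not the kobject code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_NAMES 16
#define NAME_LEN  32

/* Hypothetical stand-in for one sysfs directory. */
struct registry {
	char names[MAX_NAMES][NAME_LEN];
	int count;
};

/* Adds name to the registry. Returns 0 on success, -EEXIST if the
 * name is already present, -ENOSPC if it does not fit. */
static int registry_add(struct registry *r, const char *name)
{
	int i;

	assert(r && name);
	for (i = 0; i < r->count; i++)
		if (strcmp(r->names[i], name) == 0)
			return -EEXIST;
	if (r->count == MAX_NAMES || strlen(name) >= NAME_LEN)
		return -ENOSPC;
	strcpy(r->names[r->count++], name);
	return 0;
}
```

If two code paths both try to add "3:0:0:0", the second call fails. The proposed patch avoids that by dropping one of the two paths rather than by tolerating the error.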
Re: [RFC] [PATCH] Allow overriding module parameters from kernel command_line
> On Wed, 18 Apr 2007 11:55:52 -0400 Kyle McMartin <[EMAIL PROTECTED]> wrote:
> With the move to initramfs and heavily modular configs, which include
> loading storage drivers from early userspace, it's becoming harder
> to provide users with a way of overriding module parameters at boot.
>
> Currently, users would have to break into the initramfs, edit the
> modprobe options, and then let boot continue. They have a much easier time
> dealing with adding options on the command line from Grub or what have you.
>
> I hacked out this patch quickly to re-parse saved_command_line[] when we
> load a module in an attempt to rectify this.
>
> (The specific use-case I was looking at here was HPA commands failing on
> sata_nv controllers, and needing to pass the adma=0 option to the module...
> Users had a hard time testing without an easy way of overriding the module.)
>
> Clearly this is not entirely optimal, because we're parsing command_line
> after the module params are parsed. This ends of being a policy decision,
> whether the /sbin/modprobe commandline should override the kernel
> command_line, or vice versa.

Similar-but-different: I was trying to persuade a Fedora system to use ext2 for the root filesystem the other day. Turns out that we somehow managed to break `rootfstype=' in this situation and it cheerfully continued to use ext3.

Fixed by changing /etc/fstab and rebuilding initrd, but IMO rootfstype= should have worked.
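To make the idea of re-reading the boot command line at module load time concrete, here is a small userspace sketch that looks for a `module.param=value` token in a command-line string. The `module.param=value` spelling is the form the kernel already accepts for built-in code; applying it to loadable modules this way is an assumption of this example, and `cmdline_module_param()` is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scans a space-separated command line for "module.param=value".
 * Copies the value (truncated to fit) into out and returns 1 if
 * found, 0 otherwise. If the option appears more than once, the
 * last occurrence wins. */
static int cmdline_module_param(const char *cmdline, const char *module,
				const char *param, char *out, size_t outlen)
{
	size_t mlen = strlen(module), plen = strlen(param);
	const char *p = cmdline;
	int found = 0;

	assert(outlen > 0);
	while (*p) {
		const char *end;
		size_t toklen;

		while (*p == ' ')
			p++;
		end = p;
		while (*end && *end != ' ')
			end++;
		toklen = (size_t)(end - p);

		/* The token must be at least "module" "." "param" "=". */
		if (toklen > mlen + plen + 1 &&
		    strncmp(p, module, mlen) == 0 && p[mlen] == '.' &&
		    strncmp(p + mlen + 1, param, plen) == 0 &&
		    p[mlen + 1 + plen] == '=') {
			size_t vlen = toklen - (mlen + plen + 2);

			if (vlen >= outlen)
				vlen = outlen - 1;
			memcpy(out, p + mlen + plen + 2, vlen);
			out[vlen] = '\0';
			found = 1;
		}
		p = end;
	}
	return found;
}
```

Whether a value found this way should override the options passed by modprobe is exactly the policy question raised in the quoted message; the sketch does not take a side.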
Re: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6
Hi Paul:

Paul Mackerras <[EMAIL PROTECTED]> wrote:
>
> So this doesn't change process_input_packet(), which treats the case
> where the first byte is 0xff (PPP_ALLSTATIONS) but the second byte is
> not 0x03 (PPP_UI) as indicating a packet with a PPP protocol number of
> 0xff. Arguably that's wrong since PPP protocol 0xff is reserved, and
> the RFC does envision the possibility of receiving frames where the
> control field has values other than 0x03.

Your fix is probably needed too. However, I think the issue that Patrick was trying to fix is the case where p[0] != PPP_ALLSTATIONS and therefore we'd still have a problem there.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
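The header cases discussed here can be spelled out in a few lines. The sketch below is a userspace illustration, not the ppp_async code: it skips the address/control pair only when both bytes match, honours protocol-field compression (an odd first byte means a one-byte protocol number), and checks the length before every read. `ppp_frame_protocol()` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

#define PPP_ALLSTATIONS 0xff	/* address field value */
#define PPP_UI          0x03	/* control field value */

/* Returns the PPP protocol number of a received frame, or -1 if the
 * frame is too short to contain one. */
static int ppp_frame_protocol(const unsigned char *p, size_t len)
{
	assert(p);

	/* Skip the address and control fields if both are present. */
	if (len >= 2 && p[0] == PPP_ALLSTATIONS && p[1] == PPP_UI) {
		p += 2;
		len -= 2;
	}
	if (len < 1)
		return -1;
	if (p[0] & 1)		/* compressed, one-byte protocol field */
		return p[0];
	if (len < 2)
		return -1;
	return (p[0] << 8) | p[1];
}
```

With this logic a frame that starts with 0xff followed by a control byte other than 0x03 is read as protocol 0xff, and a frame that does not start with 0xff at all still has to pass the length checks before any byte is examined.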
Re: [patch] CFS scheduler, v3
* William Lee Irwin III <[EMAIL PROTECTED]> wrote: > It appears to me that the following can be taken in for mainline (or > rejected for mainline) independently of the rest of the cfs patch. yeah - it's a patch written by Suresh, and this should already be in the for-v2.6.22 -mm queue. See: Subject: [patch] sched: align rq to cacheline boundary on lkml. Ingo
Re: Loud "pop" coming from hard drive on reboot
On 4/18/07, Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> wrote:
> On Wednesday 18 April 2007, Chuck Ebbert wrote:
> > Mark Lord wrote:
> > > Mark Lord wrote:
> > >>
> > >> With the patch applied, I don't see *any* new activity in those
> > >> S.M.A.R.T.
> > >> attributes over multiple hibernates (Linux "suspend-to-disk").
> > >
> > > Scratch that -- operator failure. ;)
> > > The patch makes no difference over hibernates in the SMART logs.
> > >
> > > It's still logging extra Power-Off_Retract_Count pegs,
> > > which it DID NOT USED TO DO not so long ago.
> > >
> >
> > Just to add to the fun, my problems are happening with the "old"
> > IDE drivers...
>
> The issue you are experiencing results in the same problem (disk doing
> power off retract) but it has a totally different root cause - your notebook
> loses power on reboot. It is actually a hardware problem and as you have
> reported the same problem is present when using "the other" OS.
>
> I think that the issue needs to be fixed (by detecting affected notebook(s)
> using DMI?) in Linux PM handling and not in IDE subsystem because:
>
> * there may be some other hardware devices affected by the power loss
>   (== they require shutdown sequence)
>
> * the same problem will bite if somebody decides to use libata (FC7?)
>
> Bart

OpenSUSE 10.3 is still in Alpha stage (at least a few months away from release), but they too have switched to libata by default. (You can override by adding a boot param).

Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
Re: Loud "pop" coming from hard drive on reboot
Bartlomiej Zolnierkiewicz wrote:
> On Wednesday 18 April 2007, Chuck Ebbert wrote:
>> Mark Lord wrote:
>>> Mark Lord wrote:
>>>> With the patch applied, I don't see *any* new activity in those
>>>> S.M.A.R.T.
>>>> attributes over multiple hibernates (Linux "suspend-to-disk").
>>> Scratch that -- operator failure. ;)
>>> The patch makes no difference over hibernates in the SMART logs.
>>>
>>> It's still logging extra Power-Off_Retract_Count pegs,
>>> which it DID NOT USED TO DO not so long ago.
>>>
>> Just to add to the fun, my problems are happening with the "old"
>> IDE drivers...
>
> The issue you are experiencing results in the same problem (disk doing
> power off retract) but it has a totally different root cause - your notebook
> loses power on reboot. It is actually a hardware problem and as you have
> reported the same problem is present when using "the other" OS.
>

My "power off retract count" increases whether I do a halt/poweroff or a reboot. The only difference is the volume of the noise. And I just noticed my "seek error rate" is increasing.

/me plans purchase of another drive, definitely not Seagate...

> I think that the issue needs to be fixed (by detecting affected notebook(s)
> using DMI?) in Linux PM handling and not in IDE subsystem because:
>
> * there may be some other hardware devices affected by the power loss
>   (== they require shutdown sequence)
>
> * the same problem will bite if somebody decides to use libata (FC7?)

Yeah, this needs fixing too. I've been playing with another notebook and the power does stay on during reboot, so I wonder how widespread the problem is?
Re: [patch] CFS scheduler, v3
On Wed, Apr 18, 2007 at 07:50:17PM +0200, Ingo Molnar wrote:
> this is the third release of the CFS patchset (against v2.6.21-rc7), and
> can be downloaded from:
>    http://redhat.com/~mingo/cfs-scheduler/
> this is a pure "fix reported regressions" release so there's much less
> churn:
>    5 files changed, 71 insertions(+), 29 deletions(-)
> (the added lines are mostly debug related, not genuine increase in the
> scheduler's size)

It appears to me that the following can be taken in for mainline (or rejected for mainline) independently of the rest of the cfs patch.

-- wli

Mark the runqueues cacheline_aligned_in_smp to avoid false sharing.

Index: sched/kernel/sched.c
===
--- sched.orig/kernel/sched.c	2007-04-18 14:10:03.593207728 -0700
+++ sched/kernel/sched.c	2007-04-18 14:11:39.270660075 -0700
@@ -278,7 +278,7 @@
 	struct lock_class_key rq_lock_key;
 };
 
-static DEFINE_PER_CPU(struct rq, runqueues);
+static DEFINE_PER_CPU(struct rq, runqueues) cacheline_aligned_in_smp;
 
 static inline int cpu_of(struct rq *rq)
 {
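False sharing happens when data written by different CPUs lands in the same cache line, so aligning each per-CPU structure to a cache-line boundary keeps them apart. The following userspace sketch shows the effect of such alignment using standard C11 `alignas`. The 64-byte line size, the simplified `struct rq` and the plain array are assumptions made for illustration; the kernel uses its own per-CPU and alignment macros.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64	/* assumed cache-line size */
#define NR_CPUS   4	/* assumed CPU count for the example */

/* Simplified stand-in for the scheduler's per-CPU runqueue. */
struct rq {
	alignas(CACHELINE) unsigned long nr_running;
	unsigned long nr_switches;
};

/* Aligning the first member pads the struct to a whole number of lines. */
static_assert(sizeof(struct rq) % CACHELINE == 0,
	      "each runqueue must occupy whole cache lines");

static struct rq runqueues[NR_CPUS];

/* Distance in bytes between two neighbouring runqueues. */
static size_t rq_stride(void)
{
	return (size_t)((char *)&runqueues[1] - (char *)&runqueues[0]);
}

/* True if the runqueue of the given CPU starts on a line boundary. */
static int rq_is_aligned(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return ((uintptr_t)&runqueues[cpu] % CACHELINE) == 0;
}
```

Without the alignment, two of these 16-byte structures (on a typical 64-bit build) would share one line and every update on one CPU would invalidate the neighbour's copy.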
Re: [PATCH][RFC] Kill off legacy power management stuff.
On Wednesday 18 April 2007 16:23, Robert P. J. Day wrote:
> On Wed, 18 Apr 2007, Len Brown wrote:
> > Here is how it should work. CONFIG_ACPI and CONFIG_APM should both be
> > available in a kernel build. However, at boot time, if ACPI is
> > active, then APM should be disabled.
> >
> > The pm_active flag used to handle this, but that method was BROKEN
> > when the CONFIG_PM_LEGACY #define was added. Today, there are
> > systems (such as the Thinkpad T30) that will not boot if
> > CONFIG_PM_LEGACY is not defined. The reason nobody is complaining
> > is because the distros are currently defining CONFIG_PM_LEGACY.
> > But when you nuke that option and everything under it, this bug will
> > be exposed and some systems will stop booting.
>
> ok, i get it now and -- correct me if i'm wrong -- all my legacy PM
> removal patch was doing was exposing a design boo-boo in which
> APM/ACPI contention was being handled by a macro in a subsystem even
> older than either of them, right?

yeah, it didn't start out that way, the bug was added when the CONFIG_PM_LEGACY #define was added.

> so all that needs to be done is add
> back in a contention solution of some kind that doesn't rely on that
> ancient system, yes?

Yes, it is a matter of making the variable not go away when the #define goes away.

> as for that thinkpad t30 situation, well, that's just borked, and
> should be fixed.

yes, the actual failure is that APM mode on the T30 hangs -- and that is independent of the issue at hand. However, there could be other failures on other machines when both APM and ACPI think they are active.

> rday
>
> p.s. at the risk of repeating myself repetitively, do we now agree
> that what i was trying to remove *was* adequately ancient? although
> it's clear that it has to be done slightly more carefully than was
> done in my initial patch.

yes, I think so.

> p.p.s. patch improvements that will let me avoid doing any of that
> myself always welcome. :-)

well, I'm sorry that I've known about the APM issue for a long time and done nothing about it. I did ping davej when he broke it, but his to-do list is probably even longer than mine.

-Len
Re: [PATCH] fix OOM killing processes wrongly thought MPOL_BIND
On Wed, Apr 18, 2007 at 08:35:22PM +0100, Hugh Dickins wrote:
> I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog
> to see lots of other processes killed with "No available memory (MPOL_BIND)".
> memhog is killed correctly once we initialize nodemask in constrained_alloc().
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
> Perhaps appropriate for 2.6.20-stable too - regression since 2.6.19.

This is a clear fix for an uninitialized variable.

Acked-by: William Irwin <[EMAIL PROTECTED]>

-- wli
Re: Loud "pop" coming from hard drive on reboot
On Wednesday 18 April 2007, Tejun Heo wrote:
> Mark Lord wrote:
> > Chuck Ebbert wrote:
> >> Mark Lord wrote:
> >>> I'll patch it locally on my own machines, but what about the tens
> >>> of thousands of other Seagate notebook drive owners out there?
> >>>
> >>
> >> This is a problem with Seagate specifically, spinning back up
> >> on receipt of some command after spindown?
> >
> > No, they just seem to be affected worse by it than some other brands.
> > The bug is that libata/SCSI now spin-down the drive before the distro's
> > scripts are done with it, so it spins down, and then gets spun up again
> > by the distro, and then spun down again by the distro.
> >
> > And along the way, one/both of the two causes a full mechanism "park",
> > which is hard on things if abused (like this).
> >
> > Or at least that's what I recall for it. Tejun?
>
> This really isn't a regression. It's been always like that with libata.

Tejun, it is a regression over IDE subsystem (so all PATA and some SATA also).

Dave/Chuck, this also seems like a FC7 regression (because of the libata PATA switch).

> libata doesn't make devices go into standby mode and shutdown(8) does
> it for libata. The problem here is that libata does issue
> SYNCHRONIZE_CACHE on shutdown. So, the sequence of event is...
>
> 1. shutdown(8) issues SYNCHRONIZE_CACHE followed by STANDBY_NOW
> 2. kernel shutdown starts
> 3. libata shutdown issues SYNCHRONIZE_CACHE
> 4. power goes off
>
> Some drives seem to spin up at step #3 even when its cache is clean and
> power goes off right after the disk finishes the command. So, it's
> really bad when it happens - spin down, spin up followed by immediate
> power off.
>
> SCSI part of the fix is queued in scsi-misc-2.6 tree and libata-dev part
> is acked and waiting to be merged, so the fix will be available in
> 2.6.22. However, it's disabled by default to remain compatible with the
> current behavior and requires userland change to fully fix the problem.
Re: 2.6.21-rc5 from fc7-rc2 problems
On Wednesday 18 April 2007 16:23, Jeff Garzik wrote:
> Len Brown wrote:
> > < Linux version 2.6.20-1.2933.fc6
> > < ([EMAIL PROTECTED]) (gcc version 4.1.1 20070105
> > < (Red Hat 4.1.1-51)) #1 SMP Mon Mar 19 11:38:26 EDT 2007
> > ---
> >> Linux version 2.6.20-1.3023.fc7
> >> ([EMAIL PROTECTED]) (gcc version 4.1.2 20070317
> >> (Red Hat 4.1.2-5)) #1 SMP Sun Mar 25 22:12:02 EDT 2007
> >
> > I agree that the fc7 version string looks strange, because
> > there are other things in the fc7 dmesg which are clearly from 2.6.21,
> > such as this:
> >
> > < ACPI: Core revision 20060707
> > ---
> >> ACPI: Core revision 20070126
> >
> > Perhaps you can try building a kernel.org 2.6.21 kernel and running
> > it on your FC6 install?
> >
> > The ALI15X3 stuff exists only in the working FC6 dmesg:
> >
> > < Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> > < ide: Assuming 33MHz system bus speed for PIO modes; override with
> > idebus=xx
> > < ALI15X3: IDE controller at PCI slot :00:0f.0
> > < ACPI: Unable to derive IRQ for device :00:0f.0
> > < ACPI: PCI Interrupt :00:0f.0[A]: no GSI
> > < ALI15X3: chipset revision 195
> > < ALI15X3: not 100% native mode: will probe irqs later
> > < ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:DMA, hdb:pio
> > < ide1: BM-DMA at 0x1008-0x100f, BIOS settings: hdc:DMA, hdd:pio
> > < Probing IDE interface ide0...
> > < hda: HITACHI_DK23CA-20, ATA DISK drive
> > < ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> > < Probing IDE interface ide1...
> > < hdc: MATSHITADVD-ROM SR-8175, ATAPI CD/DVD-ROM drive
> > < ide1 at 0x170-0x177,0x376 on irq 15
> > < Probing IDE interface ide2...
> > < Probing IDE interface ide3...
> > < Probing IDE interface ide4...
> > < Probing IDE interface ide5...
> > < hda: max request size: 128KiB
> > < hda: 39070080 sectors (20003 MB) w/2048KiB Cache, CHS=38760/16/63,
> > UDMA(66)
> > < hda: cache flushes not supported
> > < hda: hda1 hda2
> > < ide-floppy driver 0.99.newide
> >
> > FC7 looks like it is using libata instead:
> >
> > -Len
> >
> >> SCSI subsystem initialized
> >> libata version 2.20 loaded.
> >> ACPI: Unable to derive IRQ for device :00:0f.0
> >> ACPI: PCI Interrupt :00:0f.0[A]: no GSI
> >> ata1: PATA max UDMA/100 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x00011000
> >> irq 14
> >> ata2: PATA max UDMA/100 cmd 0x00010170 ctl 0x00010376 bmdma 0x00011008
> >> irq 15
> >> scsi0 : pata_ali
> >> PM: Adding info for No Bus:host0
> >> ata1.00: ATA-5: HITACHI_DK23CA-20, 00H1A0A3, max UDMA/100 <===
> >> drive can do 100
> >> ata1.00: 39070080 sectors, multi 16: LBA
> >> ata1.00: configured for UDMA/33 <=== configured as 33
> >> scsi1 : pata_ali
> >> PM: Adding info for No Bus:host1
> >> ata2.00: ATAPI, max UDMA/33 <=== cd can't be read now
> >> ata2.00: configured for UDMA/33
> >> PM: Adding info for No Bus:target0:0:0
> >> scsi 0:0:0:0: Direct-Access ATA HITACHI_DK23CA-2 00H1 PQ: 0 ANSI:
> >> 5
> >> PM: Adding info for scsi:0:0:0:0
> >> PM: Adding info for No Bus:target1:0:0
> >> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> >> ata2.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x12 data 36 in
> >> res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> >> ata2: soft resetting port
>
> It looks like interrupts are not being delivered?

Dunno, both 2.6.20 and 2.6.21 say they're looking on IRQ14 and IRQ15.
ACPI isn't involved at all with those IRQs, as it couldn't find any info
for :00:0f.0[A] and thus the legacy hard-coding for IDE must rule the day.

Is it possible to configure 2.6.21 with the driver that was running in 2.6.20?
If yes, and that works, then we know we didn't somehow otherwise break interrupts.

-Len
Re: CFS and suspend2: hang in atomic copy
* Christian Hesse <[EMAIL PROTECTED]> wrote:

> > i took a quick look at suspend2 and it makes some use of yield().
> > There's a bug in CFS's yield code, i've attached a patch that should
> > fix it, does it make any difference to the hang?
>
> This patch should apply cleanly against what? The second hunk is
> ignored as it has already been applied. Is this correct?

hm, i think you might have had one of the earlier CFS patches.

> But no, it does not change anything. Let me know if you have any other
> patches to test.

could you try the -v3 patch i released a few hours ago:

  http://redhat.com/~mingo/cfs-scheduler/

although probably your suspend2 problem is still not fixed, it's worth a
try nevertheless. Which suspend2 patch did you apply, and was it against
-rc6 or -rc7?

	Ingo
Re: AppArmor FAQ
On Wed, 18 Apr 2007, Crispin Cowan wrote:

> James Morris wrote:
> > On Tue, 17 Apr 2007, Alan Cox wrote:
> >
> >> I'm not sure if AppArmor can be made good security for the general case,
> >> but it is a model that works in the limited http environment
> >> (eg .htaccess) and is something people can play with and hack on and may
> >> be possible to configure to be very secure.
> >>
> > Perhaps -- until your httpd is compromised via a buffer overflow or
> > simply misbehaves due to a software or configuration flaw, then the
> > assumptions being made about its use of pathnames and their security
> > properties are out the window.
> >
> How is it that you think a buffer overflow in httpd could allow an
> attacker to break out of an AppArmor profile?

Because you can change the behavior of the application and then bypass
policy entirely by utilizing any mechanism other than direct filesystem
access: IPC, shared memory, Unix domain sockets, local IP networking,
remote networking etc.

This is not even considering object aliasing (which would allow you to
inappropriately access objects with full blessing of policy), as I'm
assuming that the limited environment Alan is referring to entirely
prevents them.

Also worth noting here is that you have to consider any limited
environment as enforcing security policy, and thus its configuration
becomes an additional component of security policy. So, your real
security policy is actually more complicated than it appears to be, is
not represented completely in the policy configuration file, and must be
managed disparately.

And it's still only capable of controlling access to filesystem objects.

- James
--
James Morris <[EMAIL PROTECTED]>
Re: Loud "pop" coming from hard drive on reboot
On Wednesday 18 April 2007, Chuck Ebbert wrote:
> Mark Lord wrote:
> > Mark Lord wrote:
> >>
> >> With the patch applied, I don't see *any* new activity in those
> >> S.M.A.R.T.
> >> attributes over multiple hibernates (Linux "suspend-to-disk").
> >
> > Scratch that -- operator failure. ;)
> > The patch makes no difference over hibernates in the SMART logs.
> >
> > It's still logging extra Power-Off_Retract_Count pegs,
> > which it DID NOT USED TO DO not so long ago.
> >
>
> Just to add to the fun, my problems are happening with the "old"
> IDE drivers...

The issue you are experiencing results in the same problem (disk doing
power off retract) but it has a totally different root cause - your
notebook loses power on reboot. It is actually a hardware problem and as
you have reported the same problem is present when using "the other" OS.

I think that the issue needs to be fixed (by detecting affected
notebook(s) using DMI?) in Linux PM handling and not in IDE subsystem
because:

* there may be some other hardware devices affected by the power loss
  (== they require shutdown sequence)

* the same problem will bite if somebody decides to use libata (FC7?)

Bart
Re: Kaffeine problem with CFS
* S.Çağlar Onur <[EMAIL PROTECTED]> wrote:

> > great! Could you please unapply the hack above and try the proper
> > fix below, does this one solve the hangs too?
>
> Instead of that one, i tried CFSv3 and i cannot reproduce the hang
> anymore, Thanks!...

cool, thanks for the quick turnaround!

	Ingo
Re: Kaffeine problem with CFS
On Wed, 18 Apr 2007, Ingo Molnar wrote:
> * S.Çağlar Onur <[EMAIL PROTECTED]> wrote:
> > - schedule();
> > + msleep(1);
> >
> > which Ingo sends me to try also has the same effect on me. I cannot
> > reproduce hangs anymore with that patch applied top of CFS while one
> > console checks out SVN repos and other one compiles a small test
> > software.
>
> great! Could you please unapply the hack above and try the proper fix
> below, does this one solve the hangs too?

Instead of that one, i tried CFSv3 and i cannot reproduce the hang
anymore, Thanks!...

Cheers
--
S.Çağlar Onur <[EMAIL PROTECTED]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!
Re: [PATCH] fix OOM killing processes wrongly thought MPOL_BIND
On Wed, 18 Apr 2007, Hugh Dickins wrote:

> I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog
> to see lots of other processes killed with "No available memory (MPOL_BIND)".
> memhog is killed correctly once we initialize nodemask in constrained_alloc().
>
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>

Acked-by: Christoph Lameter <[EMAIL PROTECTED]>
Re: [RFC 1/2] Input: ff, add FF_RAW effect
johann deneux wrote:
> Jiri,
>
> Which solution did you choose to implement? From what I remember, we
> last discussed Dmitry's idea of specifying an axis for an effect, then
> combine several effects to achieve complex effects.

I think you mean motor instead of axis, because I don't push real axes to
the devices, but motor torques...

> The implementation would specify the axis using the upper bits of the
> effect type.

Ok, if this is preferred, I'll post it, at the cost of having more context
switches for a single effect. This was just a realization of the idea as I
thought of it, with the quick'n'dirty FF_RAW.

thanks,
--
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> > perhaps a more fitting term would be 'precise group-scheduling'.
> > Within the lowest level task group entity (be that thread group or
> > uid group, etc.) 'precise scheduling' is equivalent to 'fairness'.
>
> Yes. Absolutely. Except I think that at least if you're going to name
> something "complete" (or "perfect" or "precise"), you should also
> admit that groups can be hierarchical.

yes. Am i correct to sum up your impression as:

 " Ingo, for you the hierarchy still appears to be an after-thought,
   while in practice it's easily the most important thing! Why are you
   so hung up about 'fairness', it makes no sense!"

right?

and you would definitely be right if you suggested that i neglected the
'group scheduling' aspects of CFS (except for a minimalistic nice level
implementation, which is a poor-man's-non-automatic-group-scheduling),
but i very much know it's important and i'll definitely fix it for -v4.

But please let me explain my reasons for my different focus:

yes, group scheduling in practice is the most important first-layer
thing, and without it any of the other 'CFS wins' can easily be useless.

Firstly, i have not neglected the group scheduling related CFS
regressions at all, mainly because there _is_ already a quick hack to
check whether group scheduling would solve these regressions: renice.
And it was tried in both of the two CFS regression cases i'm aware of:
Mike's X starvation problem and Willy's "kevents starvation with
thousands of scheddos tasks running" problem. And in both cases,
applying the renice hack [which should be properly and automatically
implemented as uid group scheduling] fixed the regression for them! So i
was not worried at all, group scheduling _provably solves_ these CFS
regressions. I rather concentrated on the CFS regressions that were much
less clear.

But PLEASE believe me: even with perfect cross-group CPU allocation but
with a simple non-heuristic scheduler underlying it, you can _easily_
get a sucky desktop experience! I know it because i tried it and others
tried it too. (in fact the first version of sched_fair.c was tick based
and low-res, and it sucked)

Two more things were needed:

 - the high precision of nsec/64-bit accounting
   ('reliability of scheduling')

 - extremely even time-distribution of CPU power
   ('determinism/smoothness, human perception')

(i'm expanding on these two concepts further below)

take out any of these and group scheduling or not, you are easily going
to have a sucky desktop! (We know that from years of experiments: many
people tried to rip out the unfairness from the scheduler and there were
always nasty corner cases that 'should' have worked but didn't.)

Without these we'd in essence start again at square one, just at a
different square, this time with another group of people being
irritated!

But the biggest and hardest to achieve _wins_ of CFS are _NOT_ achieved
via a simple 'get rid of the unfairness of the upstream scheduler and
apply group scheduling'. (I know that because i tried it before and
because others tried it before, for many many years.) You will _easily_
get sucky desktop experience. The other two things are very much needed
too:

 - the high precision of nsec/64-bit accounting, and the many
   corner-cases this solves. (For example on a typical desktop there are
   _lots_ of timing-driven workloads that are in essence 'invisible' to
   low-resolution, timer-tick based accounting and are heavily skewed.)

 - extremely even time-distribution of CPU power. CFS behaves pretty
   well even under the dreaded 'make -jN in an xterm' kernel build
   workload as reported by Mark Lord, because it also distributes CPU
   power in a _finegrained_ way. A shell prompt under CFS still behaves
   acceptably on a single-CPU testbox of mine with a "make -j50"
   workload. (yes, fifty) Humans react a lot more negatively to sudden
   changes in application behavior ('lags', pauses, short hangs) than
   they react to fine, gradual, all-encompassing slowdowns. This is a
   key property of CFS.

   ( Otherwise renicing X to -10 would have solved most of the
     interactivity complaints against the vanilla scheduler, otherwise
     renicing X to -10 would have fixed Mike's setup under SD (it
     didn't) while it worked much better under CFS, otherwise Gene
     wouldn't have found CFS markedly better than SD, etc., etc. So
     getting rid of the heuristics is less than 50% of the road to the
     perfect desktop scheduler. )

and i claim that these were the really hard bits, and i spent most of
the CFS coding only on getting _these_ details 100% right under various
workloads, and it makes a night and day difference _even without any
group scheduling help_.

and note another reason here: group scheduling _masks_ many other
scheduling deficiencies that are possible in scheduler. So since CFS
doesn't do group scheduling, i get a _fuller_
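The remark about timing-driven workloads being 'invisible' to timer-tick based accounting can be illustrated with a toy simulation. This is a sketch under loud assumptions, not CFS code: it assumes a 1 ms tick, a task that starts at a fixed offset after every tick, and tick sampling that charges one whole tick to whichever task is on the CPU at the instant the tick fires. The names `TICK_NS`, `struct usage` and `simulate` are made up for the example, and `run_ns` is assumed not to exceed one tick.

```c
#include <stdint.h>

#define TICK_NS 1000000ULL /* assumed 1 ms timer tick */

struct usage {
    uint64_t tick_charged_ns;  /* CPU time attributed by tick sampling */
    uint64_t exact_charged_ns; /* CPU time attributed by nsec accounting */
};

/*
 * Simulate `periods` tick periods. In each period the task starts
 * `offset_ns` after the tick and runs for `run_ns` (<= TICK_NS).
 * Tick sampling charges a full tick only if the task is running at
 * the moment the next tick fires; exact accounting charges run_ns.
 */
struct usage simulate(uint64_t periods, uint64_t offset_ns, uint64_t run_ns)
{
    struct usage u = { 0, 0 };

    for (uint64_t p = 0; p < periods; p++) {
        uint64_t start = p * TICK_NS + offset_ns;
        uint64_t end = start + run_ns;
        uint64_t next_tick = (p + 1) * TICK_NS;

        u.exact_charged_ns += run_ns;

        /* Was the task on the CPU when the next tick fired? */
        if (start <= next_tick && next_tick < end)
            u.tick_charged_ns += TICK_NS;
    }
    return u;
}
```

In this model a task that runs 0.4 ms in the middle of every tick period is never sampled and is charged nothing by the tick-based scheme, while a task that runs 0.2 ms straddling each tick is charged a full millisecond per period. Nanosecond accounting charges each exactly what it used, which is the skew the paragraph above refers to.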
Re: Stupid GIT question...
[EMAIL PROTECTED] writes:

> What's the command to get a diff of "what I would merge if I said 'git pull'?"

$ git fetch
$ git diff master origin

Andreas.

--
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: Performance degradation with FFSB between 2.6.20 and 2.6.21-rc7
> On Wed, 18 Apr 2007 15:54:00 +0200 Valerie Clement <[EMAIL PROTECTED]> wrote:
>
> Running benchmark tests (FFSB) on an ext4 filesystem, I noticed a
> performance degradation (about 15-20 percent) in sequential write tests
> between 2.6.19-rc6 and 2.6.21-rc4 kernels.
>
> I ran the same tests on ext3 and XFS filesystems and I saw the same
> performance difference between the two kernel versions for these two
> filesystems.
>
> I have also reproduced it between 2.6.20.7 and 2.6.21-rc7.
> The FFSB tests run 16 threads, each creating 1GB files. The tests were
> done on the same x86_64 system, with the same kernel configuration and
> on the same scsi device. Below are the throughput values given by FFSB.
>
>    kernel        XFS          ext3
>    ------------------------------------
>    2.6.20.7      48 MB/sec    44 MB/sec
>
>    2.6.21-rc7    38 MB/sec    37 MB/sec
>
> Did anyone else run across the problem?
> Is there a known issue?
>

That's a new discovery, thanks.

It could be due to I/O scheduler changes. Which one are you using? CFQ?

Or it could be that there has been some changed behaviour at the
VFS/pagecache layer: the VFS might be submitting little hunks of lots of
files, rather than large hunks of few files.

Or it could be a block-layer thing: perhaps some driver change has caused
us to be placing less data into the queue. Which device driver is that
machine using?

Being a simple soul, the first thing I'll try when I get near a test box
will be

for i in $(seq 1 16)
do
	time dd if=/dev/zero of=$i bs=1M count=1024 &
done
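As a quick check of the "about 15-20 percent" figure, the relative drop can be worked out from the throughput values in Valerie's table (48 to 38 MB/sec on XFS, 44 to 37 MB/sec on ext3). A minimal sketch; the function names are made up for the example:

```c
#include <stdio.h>

/* Relative throughput drop in percent when going from `before` to `after`. */
double percent_drop(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Print the drop for the two filesystems, 2.6.20.7 vs 2.6.21-rc7. */
void print_ffsb_drops(void)
{
    printf("XFS:  %.1f%%\n", percent_drop(48.0, 38.0));
    printf("ext3: %.1f%%\n", percent_drop(44.0, 37.0));
}
```

That works out to 10/48, roughly 21 percent, for XFS and 7/44, roughly 16 percent, for ext3, which is in line with the range reported for ext4.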
Re: 2.6.21-rc5 from fc7-rc2 problems
Stephen Clark wrote:
> Chuck Ebbert wrote:
>
>> Stephen Clark wrote:
>>
>>> Hello,
>>>
>>> I have just tried booting the fc7-rc2 live cd on 2 of my laptops and it
>>> failed on both.
>>>
>>
>> FC7 test4 will be out any day now. Please test that -- test2 is ancient
>> now.
>>
> Ok I'll try that when it comes out - I was actually using the livecd
> version will the new version
> have a livecd also?

I'm pretty sure the Live CD will be part of the release.
Re: [PATCH] [RFC] Throttle swappiness for interactive tasks
अभिजित भोपटकर (Abhijit Bhopatkar) wrote:

> The mm structures of interactive tasks are marked and
> the pages belonging to them are never shifted to inactive
> list in lru algorithm. Thus keeping interactive tasks in
> memory as long as possible.
> The interactivity is already determined by scheduler so we reuse
> that knowledge to mark the mm structures.

Aside from the obvious question of whether the idea is good, there
are some practical problems with your patch:

1) the mm->interactive flag is never cleared, even if the task
   stops being interactive

2) what if the interactive tasks use up more memory than the
   system has? Will you OOM kill instead of swapping out part
   of an interactive task?

3) the scheduler can change its idea about which task is
   interactive and which task isn't very rapidly, while disk
   IO is very slow - the scheduler's classification may not
   be useful on swap timescales

4) a currently completely idle task can still be marked
   interactive in the scheduler, even if it has been idle
   for days. Such a task is an obvious good candidate for
   swapout, isn't it?

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is. Each group
calls the other unpatriotic.
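Objections 1 and 4 can be pictured with a toy model that contrasts a sticky flag with a mark that expires. This is only a sketch: `struct toy_mm`, `TOY_GRACE_TICKS` and the `toy_*` functions are hypothetical, are not the kernel's mm_struct or any kernel API, and are not what the posted patch does. The expiring variant is merely one conceivable alternative.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-mm state; not the kernel's mm_struct. */
struct toy_mm {
    unsigned long interactive_stamp; /* time of the last interactive mark */
    bool ever_marked;                /* set once, never cleared */
};

/* Assumed grace period (in arbitrary ticks) during which a mark is honoured. */
#define TOY_GRACE_TICKS 1000UL

/* Record that the scheduler classified the owning task as interactive. */
void toy_mark_interactive(struct toy_mm *mm, unsigned long now)
{
    mm->interactive_stamp = now;
    mm->ever_marked = true;
}

/* Sticky behaviour criticised in point 1: protected forever once marked. */
bool toy_protected_sticky(const struct toy_mm *mm)
{
    return mm->ever_marked;
}

/* Expiring mark: protection lapses once the task has been quiet long enough. */
bool toy_protected_expiring(const struct toy_mm *mm, unsigned long now)
{
    return mm->ever_marked &&
           (now - mm->interactive_stamp) < TOY_GRACE_TICKS;
}
```

With the sticky flag, a task marked once stays protected even after days of idleness; with the expiring mark it becomes a swapout candidate again after the grace period. Points 2 and 3 (memory overcommit, and classification flapping faster than disk IO) are not addressed by either variant.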