Re: [patch] CFS scheduler, v3
William Lee Irwin III wrote:
> William Lee Irwin III wrote:
>>> This essentially doesn't look correct because while you want to enforce
>>> the CPU bandwidth allocation, this doesn't have much to do with that
>>> apart from the CPU bandwidth appearing as a term. It's more properly
>>> a rate of service as opposed to a time at which anything should happen
>>> or a number useful for predicting such. When service should begin more
>>> properly depends on the other tasks in the system and a number of other
>>> decisions that are part of the scheduling policy.
>
> On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
>> This model takes all of those into consideration. The idea is not just
>> to predict but to use the calculated time to decide when to boot the
>> current process off the CPU (if it doesn't leave voluntarily) and put
>> this one on. This more or less removes the need to give each task a
>> predetermined chunk of CPU when they go on to the CPU. This should, in
>> general, reduce the number of context switches as tasks get to run until
>> they've finished what they're doing or another task becomes higher
>> priority rather than being booted off after an arbitrary time interval.
>> (If this ever gets tried it will be interesting to see if this
>> prediction comes true.)
>> BTW Even if Ingo doesn't choose to try this model, I'll probably make a
>> patch (way in the future after Ingo's changes are settled) to try it out
>> myself.
>
> I think I smoked out what you were doing.
>
> William Lee Irwin III wrote:
>>> If you want to choose a "quasi-inter-arrival time" to achieve the
>>> specified CPU bandwidth allocation, this would be it, but to use that
>>> to actually enforce the CPU bandwidth allocation, you would need to
>>> take into account the genuine inter-arrival time to choose an actual
>>> time for service to begin. In other words, this should be a quota for
>>> the task to have waited. If it's not waited long enough, then it should
>>> be delayed by the difference to achieve the inter-arrival time you're
>>> trying to enforce. If it's waited longer, it should be executed
>>> sooner modulo other constraints, and perhaps even credited for future
>>> scheduling cycles.
>
> On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
>> The idea isn't to enforce the bandwidth entitlement to the extent of
>> throttling tasks if they exceed their entitlement and there's no other
>> tasks ready to use the CPU. This is mainly because the bandwidth
>> entitlement isn't fixed -- it's changing constantly as the number and
>> type of runnable tasks changes.
>
> Well, a little hysteresis will end up throttling in such a manner
> anyway as a side-effect,

Think of this as a calming influence :-)

> or you'll get anomalies. Say two tasks with
> equal entitlements compete, where one sleeps for 1/3 of the time and
> the other is fully CPU-bound. If only the times when they're in direct
> competition are split 50/50, then the CPU-bound task gets 2/3 and the
> sleeper 1/3, which is not the intended effect. I don't believe this
> model will be very vulnerable to it, though.

Nor me.

> William Lee Irwin III wrote:
>>> In order to partially avoid underallocating CPU bandwidth to p, one
>>> should track the time last spent sleeping and do the following:
>
> On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
>> Yes I made a mistake in omitting to take into account sleep interval.
>> See another e-mail to Ingo correcting this problem.
>
> I took it to be less trivial of an error than it was. No big deal.

No, you were right it was definitely a NON trivial error.

> William Lee Irwin III wrote:
>>> In order to do better, longer-term history is required,
>
> On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
>> The half life of the Kalman filter (roughly equivalent to a running
>> average) used to calculate the averages determines how much history is
>> taken into account. It could be made configurable (at least, until
>> enough real life experience was available to decide on the best value to
>> use).
>
> A Kalman filter would do better than a running average. I'm all for it.

As a long time user of Kalman filters I tend to think of them as the
same thing. I use the term running average when talking about the idea
behind a scheduler because I think that more people will understand what
the general idea is. When it comes to implementation I always replace
the idea of "running average" with a roughly equivalent Kalman filter.

> William Lee Irwin III wrote:
>>> To attempt to maintain an infinite history of
>>> bandwidth underutilization to be credited too far in the future would
>>> enable potentially long-term overutilization when such credits are
>>> cashed en masse for a sustained period of time. At some point you have
>>> to say "use it or lose it;" over a shorter period of time some smoothing
>>> is still admissible and even desirable.
>
> On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
>> Yes, that's why I suggest a running average over the last few scheduling
>> cycles for the task. But thinking about it some more I'm now not so
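The "half life" idea discussed in this message can be sketched as an exponentially decaying running average. This is a minimal userspace sketch and not code from any patch in this thread: the names `run_avg`, `run_avg_init` and `run_avg_update` are hypothetical, and a fixed per-sample weight stands in for a real Kalman filter gain. Ignoring integer truncation, each update multiplies the weight of every older sample by (1 - 1/2^shift), so `shift = 1` corresponds to a half-life of one sample and larger shifts keep more history.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: running average where each new sample gets weight 1/2^shift. */
struct run_avg {
	int64_t avg;		/* smoothed value, in the same units as the samples */
	unsigned int shift;	/* larger shift = longer memory of old samples */
};

static void run_avg_init(struct run_avg *ra, unsigned int shift, int64_t initial)
{
	assert(shift < 32);	/* keep the divisor well inside int64_t range */
	ra->avg = initial;
	ra->shift = shift;
}

static int64_t run_avg_update(struct run_avg *ra, int64_t sample)
{
	/* avg += (sample - avg) / 2^shift; the division truncates toward zero */
	ra->avg += (sample - ra->avg) / ((int64_t)1 << ra->shift);
	return ra->avg;
}
```

Making `shift` a tunable is one way to get the "configurable half life" mentioned above; whether a fixed weight is good enough is exactly the open question in the thread.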
Re: 2.6.20.7 locking up hard on boot
On Fri, Apr 20, 2007 at 11:30:59PM -0500, Marcos Pinto wrote:
> Yes, I just tried 2.6.20.3 with ACPI enabled and it booted perfectly.
> I'm hoping this means you know what's wrong? :-)

Can you do a 'git bisect' on the versions between 2.6.20.3 and 2.6.20.7
to try to find the problem patch?

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: PCI bridge range sizing bug
Jesse Barnes wrote:
> On Friday, April 20, 2007 11:28 am Linus Torvalds wrote:
>> On Fri, 20 Apr 2007, Jesse Barnes wrote:
>>> Sounds good, hopefully reassigning the bridge resources won't cause too
>>> much trouble. Do you have time to hack this up? If not, I could give
>>> it a try, as long as ajax is willing to test...
>>
>> Actually, I would suggest we not do it automatically (because the need
>> for it is just so low, and the downsides are potentially huge - there
>> are just too many resources that are "hidden" from us through ACPI
>> tricks and having hardware that doesn't actually expose their PCI
>> resources fully through the normal PCI resource setup).
>
> Yeah, that's probably prudent. OTOH we should probably let the user know
> in no uncertain terms that some of the stuff behind one of their bridges
> will be inaccessible.

Something like that would have made it a lot more obvious why my
Matrox PCIe x1 video card will not work in my Dell 9150, while
a PCI video card does work.

The PCI video card directly sits on the bus, and gets its resources
assigned by the BIOS. The PCIe video card turned out to be a PCIe
to AGP bridge, and the BIOS did not assign the needed PCI resources,
making the system crash when I started X.

X seemed to have some trouble reading the ROM, too...

-- 
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is. Each group
calls the other unpatriotic.
Re: [patch] CFS scheduler, v3
William Lee Irwin III wrote:
>> This essentially doesn't look correct because while you want to enforce
>> the CPU bandwidth allocation, this doesn't have much to do with that
>> apart from the CPU bandwidth appearing as a term. It's more properly
>> a rate of service as opposed to a time at which anything should happen
>> or a number useful for predicting such. When service should begin more
>> properly depends on the other tasks in the system and a number of other
>> decisions that are part of the scheduling policy.

On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
> This model takes all of those into consideration. The idea is not just
> to predict but to use the calculated time to decide when to boot the
> current process off the CPU (if it doesn't leave voluntarily) and put
> this one on. This more or less removes the need to give each task a
> predetermined chunk of CPU when they go on to the CPU. This should, in
> general, reduce the number of context switches as tasks get to run until
> they've finished what they're doing or another task becomes higher
> priority rather than being booted off after an arbitrary time interval.
> (If this ever gets tried it will be interesting to see if this
> prediction comes true.)
> BTW Even if Ingo doesn't choose to try this model, I'll probably make a
> patch (way in the future after Ingo's changes are settled) to try it out
> myself.

I think I smoked out what you were doing.

William Lee Irwin III wrote:
>> If you want to choose a "quasi-inter-arrival time" to achieve the
>> specified CPU bandwidth allocation, this would be it, but to use that
>> to actually enforce the CPU bandwidth allocation, you would need to
>> take into account the genuine inter-arrival time to choose an actual
>> time for service to begin. In other words, this should be a quota for
>> the task to have waited. If it's not waited long enough, then it should
>> be delayed by the difference to achieve the inter-arrival time you're
>> trying to enforce. If it's waited longer, it should be executed
>> sooner modulo other constraints, and perhaps even credited for future
>> scheduling cycles.

On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
> The idea isn't to enforce the bandwidth entitlement to the extent of
> throttling tasks if they exceed their entitlement and there's no other
> tasks ready to use the CPU. This is mainly because the bandwidth
> entitlement isn't fixed -- it's changing constantly as the number and
> type of runnable tasks changes.

Well, a little hysteresis will end up throttling in such a manner
anyway as a side-effect, or you'll get anomalies. Say two tasks with
equal entitlements compete, where one sleeps for 1/3 of the time and
the other is fully CPU-bound. If only the times when they're in direct
competition are split 50/50, then the CPU-bound task gets 2/3 and the
sleeper 1/3, which is not the intended effect. I don't believe this
model will be very vulnerable to it, though.

William Lee Irwin III wrote:
>> In order to partially avoid underallocating CPU bandwidth to p, one
>> should track the time last spent sleeping and do the following:

On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
> Yes I made a mistake in omitting to take into account sleep interval.
> See another e-mail to Ingo correcting this problem.

I took it to be less trivial of an error than it was. No big deal.

William Lee Irwin III wrote:
>> In order to do better, longer-term history is required,

On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
> The half life of the Kalman filter (roughly equivalent to a running
> average) used to calculate the averages determines how much history is
> taken into account. It could be made configurable (at least, until
> enough real life experience was available to decide on the best value to
> use).

A Kalman filter would do better than a running average. I'm all for it.

William Lee Irwin III wrote:
>> To attempt to maintain an infinite history of
>> bandwidth underutilization to be credited too far in the future would
>> enable potentially long-term overutilization when such credits are
>> cashed en masse for a sustained period of time. At some point you have
>> to say "use it or lose it;" over a shorter period of time some smoothing
>> is still admissible and even desirable.

On Sat, Apr 21, 2007 at 10:23:07AM +1000, Peter Williams wrote:
> Yes, that's why I suggest a running average over the last few scheduling
> cycles for the task. But thinking about it some more I'm now not so
> sure. The lack of apparent "smoothness" when I've done this sort of
> thing with raw rather than smooth data (in modifications to the current
> dynamic priority based scheduler model) is usually noticed by running
> top and seeing wildly fluctuating dynamic priorities. I'm not sure that
> the actual responsiveness of the system reflects this. So I'm now
> willing to reserve my
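The 2/3 versus 1/3 anomaly described in this message can be checked with a little arithmetic. The sketch below is only an illustration of that reasoning, with hypothetical names (`frac`, `contention_only_split`); it models two equally entitled tasks over a long interval where only the contended time is split 50/50.

```c
#include <assert.h>

/* Hypothetical illustration: a CPU share expressed as the fraction num/den. */
struct frac {
	long num;
	long den;
};

/*
 * Two tasks with equal entitlements. The sleeper is asleep for
 * sleep_num/sleep_den of the time; the other task is always runnable.
 * Only the time spent in direct competition is split 50/50.
 */
static void contention_only_split(long sleep_num, long sleep_den,
				  struct frac *sleeper, struct frac *cpu_bound)
{
	assert(sleep_den > 0 && sleep_num >= 0 && sleep_num <= sleep_den);

	/* contended time is (1 - sleep); each task gets half of it */
	sleeper->num = sleep_den - sleep_num;
	sleeper->den = 2 * sleep_den;

	/* the CPU-bound task additionally gets all of the sleeper's sleep time */
	cpu_bound->num = (sleep_den - sleep_num) + 2 * sleep_num;
	cpu_bound->den = 2 * sleep_den;
}
```

With a sleep fraction of 1/3 this yields 2/6 for the sleeper and 4/6 for the CPU-bound task, which matches the figures quoted in the message even though both tasks were nominally entitled to half.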
sysfs: Need ability to remove all symlinks pointing to an object
How do I remove all references to an object in sysfs? The following patch
attempts to add that functionality to sysfs, but I am not that familiar
with it. Help?

SLUB: Remove alias before installing symlink

We cannot really track the aliases that are created when aliasing a slab.
kmem_cache_destroy only decrements a refcounter. This means that the
aliases are never removed. However, when the last reference to a slab is
dropped we should remove all symlinks.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc6/mm/slub.c
===================================================================
--- linux-2.6.21-rc6.orig/mm/slub.c	2007-04-20 16:44:14.0 -0700
+++ linux-2.6.21-rc6/mm/slub.c	2007-04-20 17:12:18.0 -0700
@@ -3315,6 +3315,7 @@ static int sysfs_slab_add(struct kmem_ca
 		/* Defer until later */
 		return 0;
 
+	sysfs_remove_link(&slab_subsys.kset.kobj, s->name);
 	kobj_set_kset_s(s, slab_subsys);
 	kobject_set_name(&s->kobj, s->name);
 	kobject_init(&s->kobj);
@@ -3329,8 +3330,18 @@ static int sysfs_slab_add(struct kmem_ca
 	return 0;
 }
 
+static void sysfs_remove_aliases(struct kmem_cache *s)
+{
+	/*
+	 * Remove all symlinks pointing to the kobject of
+	 * in the kmem_cache structure
+	 */
+	sysfs_remove_links(&slab_subsys.kset, &s->kobj);
+}
+
 static void sysfs_slab_remove(struct kmem_cache *s)
 {
+	sysfs_remove_aliases(s);
 	kobject_uevent(&s->kobj, KOBJ_REMOVE);
 	kobject_del(&s->kobj);
 }
@@ -3351,9 +3362,11 @@ static int sysfs_slab_alias(struct kmem_
 {
 	struct saved_alias *al;
 
-	if (slab_state == SYSFS)
+	if (slab_state == SYSFS) {
+		sysfs_remove_link(&slab_subsys.kset.kobj, name);
 		return sysfs_create_link(&slab_subsys.kset.kobj,
 						&s->kobj, name);
+	}
 
 	al = kmalloc(sizeof(struct saved_alias), GFP_KERNEL);
 	if (!al)
Index: linux-2.6.21-rc6/fs/sysfs/symlink.c
===================================================================
--- linux-2.6.21-rc6.orig/fs/sysfs/symlink.c	2007-04-20 17:05:00.0 -0700
+++ linux-2.6.21-rc6/fs/sysfs/symlink.c	2007-04-20 17:18:50.0 -0700
@@ -117,6 +117,38 @@ void sysfs_remove_link(struct kobject * 
 	sysfs_hash_and_remove(kobj->dentry,name);
 }
 
+/*
+ * Remove all symlinks pointing to the indicated object
+ */
+void sysfs_remove_links(struct kset *kset, struct kobject *needle)
+{
+	struct list_head *entry;
+
+restart:
+	spin_lock(&kset->list_lock);
+	list_for_each(entry, &kset->list) {
+		struct kobject * k = container_of(entry, struct kobject, entry);
+		struct sysfs_symlink *sl =
+			container_of(k, struct sysfs_symlink, target_kobj);
+
+		if (sl->target_kobj == needle) {
+			/* sysfs_remove_link needs the lock. sigh */
+			spin_unlock(&kset->list_lock);
+
+			sysfs_remove_link(k, sl->link_name);
+			/*
+			 * Somehow sysfs_remove_link does
+			 * not clean up after itself
+			 */
+			kfree(sl->link_name);
+			kfree(sl);
+			kobject_put(needle);
+			goto restart;
+		}
+	}
+	spin_unlock(&kset->list_lock);
+}
+
 static int sysfs_get_target_path(struct kobject * kobj, struct kobject * target,
 				 char *path)
 {
@@ -188,5 +220,6 @@ const struct inode_operations sysfs_syml
 };
 
 
-EXPORT_SYMBOL_GPL(sysfs_create_link);
 EXPORT_SYMBOL_GPL(sysfs_remove_link);
+EXPORT_SYMBOL_GPL(sysfs_remove_links);
+EXPORT_SYMBOL_GPL(sysfs_create_link);
Index: linux-2.6.21-rc6/include/linux/kobject.h
===================================================================
--- linux-2.6.21-rc6.orig/include/linux/kobject.h	2007-04-20 17:09:03.0 -0700
+++ linux-2.6.21-rc6/include/linux/kobject.h	2007-04-20 17:09:49.0 -0700
@@ -166,6 +166,9 @@ static inline struct kobj_type * get_kty
 
 extern struct kobject * kset_find_obj(struct kset *, const char *);
 
+extern void sysfs_remove_links(struct kset *kset, struct kobject *needle);
+
+
 
 /**
  * Use this when initializing an embedded kset with no other
Re: 2.6.20.7 locking up hard on boot
Yes, I just tried 2.6.20.3 with ACPI enabled and it booted perfectly.
I'm hoping this means you know what's wrong? :-)

Thanks again,
Marcos

On 4/20/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
On Fri, Apr 20, 2007 at 07:47:13PM -0500, Marcos Pinto wrote:
> I'm not subscribed, so please personally CC me any answers/comments.
> Thank you.
>
> While booting, (AMD64 Turion x2) 2.6.20.7 kernel locks up hard. The
> last kernel that I tried, 2.6.18.8, worked perfectly without any
> trickery. 2.6.20.7 only boots up with "acpi=off" being added to the
> kernel line. Note that 2.6.18.8 works perfectly with acpi on, which
> is really the only way I can run this box because with "acpi=off" it
> overheats and freezes.
> Please let me know if there's anything else that I could do to help with
> this.
>
>
> Here's what's on the screen when it happens:
>
> Brought up 2 CPUs
> testing NMI watchdog ... OK.
> Disabling vsyscall due to use of PM timer
> time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
> time.c: Detected 1808.264 MHz processor.
> migration_cost=281
> NET: Registered protocol family 16
> ACPI: bus type pci registered
> PCI: Using MMCONFIG at e000
> PCI: No mmconfig possible on device 00:18
> PCI: No mmconfig possible on device 07:05
> ACPI: Interpreter enabled
> ACPI: Using IOAPIC for interrupt routing
> ACPI: PCI Root Bridge [PCI0] (:00)
> ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
> :00:0d.0: cannot adjust BAR0 (not I/O)
> :00:0d.0: cannot adjust BAR1 (not I/O)
>...

Does 2.6.20.3 boot with ACPI enabled?

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
Re: [PATCH] Make new setting of panic_on_oom
> > >         read_lock(&tasklist_lock);
> > >
> > > +       if (sysctl_panic_on_oom == 2)
> > > +               panic("out of memory. Compulsory panic_on_oom is selected.\n");
> > > +
> >
> > Wouldn't it be safer to put the panic before the read_lock()?
>
> I agree. Otherwise the patch seems to be okay.

Ok. This is take 2.
Thanks for your comment.

-

The current panic_on_oom may not work if there is a process using
cpusets/mempolicy, because other nodes' memory may remain.
But some people want failover by panic ASAP even if they are used.
This patch adds a new setting for that request.

This is not tested yet. But it would work. Please apply.

Signed-off-by: Yasunori Goto <[EMAIL PROTECTED]>

---
 Documentation/sysctl/vm.txt |   23 +++++++++++++++++------
 mm/oom_kill.c               |    3 +++
 2 files changed, 20 insertions(+), 6 deletions(-)

Index: panic_on_oom2/Documentation/sysctl/vm.txt
===================================================================
--- panic_on_oom2.orig/Documentation/sysctl/vm.txt	2007-04-21 12:39:09.0 +0900
+++ panic_on_oom2/Documentation/sysctl/vm.txt	2007-04-21 12:39:58.0 +0900
@@ -197,11 +197,22 @@
 
 panic_on_oom
 
-This enables or disables panic on out-of-memory feature. If this is set to 1,
-the kernel panics when out-of-memory happens. If this is set to 0, the kernel
-will kill some rogue process, called oom_killer. Usually, oom_killer can kill
-rogue processes and system will survive. If you want to panic the system
-rather than killing rogue processes, set this to 1.
+This enables or disables panic on out-of-memory feature.
 
-The default value is 0.
+If this is set to 0, the kernel will kill some rogue process,
+called oom_killer. Usually, oom_killer can kill rogue processes and
+system will survive.
+
+If this is set to 1, the kernel panics when out-of-memory happens.
+However, if a process limits using nodes by mempolicy/cpusets,
+and those nodes become memory exhaustion status, one process
+may be killed by oom-killer. No panic occurs in this case.
+Because other nodes' memory may be free. This means system total status
+may be not fatal yet.
+If this is set to 2, the kernel panics compulsorily even on the
+above-mentioned.
+
+The default value is 0.
+1 and 2 are for failover of clustering. Please select either
+according to your policy of failover.
 
Index: panic_on_oom2/mm/oom_kill.c
===================================================================
--- panic_on_oom2.orig/mm/oom_kill.c	2007-04-21 12:39:09.0 +0900
+++ panic_on_oom2/mm/oom_kill.c	2007-04-21 12:40:31.0 +0900
@@ -409,6 +409,9 @@
 		show_mem();
 	}
 
+	if (sysctl_panic_on_oom == 2)
+		panic("out of memory. Compulsory panic_on_oom is selected.\n");
+
 	cpuset_lock();
 	read_lock(&tasklist_lock);
 

-- 
Yasunori Goto
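The three documented settings can be summarised as a small decision function. This is a sketch of the policy as the vm.txt text in the patch describes it, not the actual code in mm/oom_kill.c; the names `oom_action`, `oom_policy` and the `constrained` flag are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the values mirror the documented panic_on_oom settings. */
enum oom_action { OOM_KILL_TASK, OOM_PANIC };

/*
 * constrained: true when the failing allocation was restricted to a
 * subset of nodes by a mempolicy or cpuset, so other nodes may still
 * have free memory.
 */
static enum oom_action oom_policy(int panic_on_oom, bool constrained)
{
	if (panic_on_oom == 2)
		return OOM_PANIC;	/* compulsory panic, even when constrained */
	if (panic_on_oom == 1 && !constrained)
		return OOM_PANIC;	/* memory is exhausted system-wide */
	return OOM_KILL_TASK;		/* let the OOM killer pick a task */
}
```

The only difference between settings 1 and 2 is the constrained case, which is the case the cluster failover users in this thread care about.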
Re: [RFC] Extend Linux to support proportional-share scheduling
On Fri, Apr 20, 2007 at 11:30:04AM -0700, Tong Li wrote:
> This patch extends the existing Linux scheduler with support for
> proportional-share scheduling (as a new KConfig option).
> http://www.cs.duke.edu/~tongli/linux/linux-2.6.19.2-trio.patch
> It uses a scheduling algorithm, called Distributed Weighted Round-Robin
> (DWRR), which retains the existing scheduler design as much as possible,
> and extends it to achieve proportional fairness with O(1) time complexity
> and a constant error bound, compared to the ideal fair scheduling
> algorithm. The code is by no means final and has been only tested on a
> four-processor dual-core x86-64 system. Rather than focusing on coding
> issues, the intent of this RFC is to invite discussions on the proposed
> DWRR algorithm and proportional-share scheduling in general.

Very nice. I think we need this kind of functionality in mainline.


-- wli
Re: [PATCH] lazy freeing of memory through MADV_FREE
Eric Dumazet wrote:
Rik van Riel wrote:
Andrew Morton wrote:
On Fri, 20 Apr 2007 17:38:06 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:
Andrew Morton wrote:

I've also merged Nick's "mm: madvise avoid exclusive mmap_sem".

- Nick's patch also will help this problem. It could be that your patch
  no longer offers a 2x speedup when combined with Nick's patch.

  It could well be that the combination of the two is even better, but it
  would be nice to firm that up a bit.

I'll test that.

Thanks.

Well, good news.

It turns out that Nick's patch does not improve peak
performance much, but it does prevent the decline when
running with 16 threads on my quad core CPU!

We _definitely_ want both patches, there's a huge benefit
in having them both.

Here are the transactions/seconds for each combination:

threads   vanilla   new glibc   madv_free kernel   madv_free + mmap_sem

1           610        609            596                  545

545 tps versus 610 tps for one thread? It seems quite bad, no?

Could you please find an explanation for this?

I have no idea why this happens.

Especially the last one, going from a write lock to a
read lock on the mmap_sem should not make ANY difference
whatsoever since we're running single threaded!

2          1032       1136           1196                 1200
4          1070       1128           2014                 2024
8          1000       1088           1665                 2087
16          779       1073           1310                 1999

Performance with 2 database threads is way better though,
and performance with 4 or more threads more than doubles...

If you have an explanation on why single threaded performance
went down a little on my quad core system, please let me know.

Does performance suffer at all on a real UP system?

-- 
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is. Each group
calls the other unpatriotic.
[PATCH] KMEM_CACHE() simplify slab cache creation
This patch provides a new macro

	KMEM_CACHE(<struct>, <flags>)

to simplify slab creation. KMEM_CACHE creates a slab with the name of
the struct, with the size of the struct and with the alignment of
the struct. Additional slab flags may be specified if necessary.

Example

struct test_slab {
	int a,b,c;
	struct list_head;
} __cacheline_aligned_in_smp;

test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC)

will create a new slab named "test_slab" of the size sizeof(struct test_slab)
and aligned to the alignment of test slab. If it fails then we panic.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc6/include/linux/slab.h
===================================================================
--- linux-2.6.21-rc6.orig/include/linux/slab.h	2007-04-20 20:15:14.0 -0700
+++ linux-2.6.21-rc6/include/linux/slab.h	2007-04-20 20:24:03.0 -0700
@@ -55,6 +55,18 @@ unsigned int kmem_cache_size(struct kmem
 const char *kmem_cache_name(struct kmem_cache *);
 int kmem_ptr_validate(struct kmem_cache *cachep, const void *ptr);
 
+/*
+ * Please use this macro to create slab caches. Simply specify the
+ * name of the structure and maybe some flags that are listed above.
+ *
+ * The alignment of the struct determines object alignment. If you
+ * f.e. add cacheline_aligned_in_smp to the struct declaration
+ * then the objects will be properly aligned in SMP configurations.
+ */
+#define KMEM_CACHE(__struct, __flags) kmem_cache_create(#__struct,\
+		sizeof(struct __struct), __alignof__(struct __struct),\
+		(__flags), NULL, NULL)
+
 #ifdef CONFIG_NUMA
 extern void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
 #else
Index: linux-2.6.21-rc6/kernel/delayacct.c
===================================================================
--- linux-2.6.21-rc6.orig/kernel/delayacct.c	2007-04-20 20:15:14.0 -0700
+++ linux-2.6.21-rc6/kernel/delayacct.c	2007-04-20 20:17:47.0 -0700
@@ -31,11 +31,7 @@ __setup("nodelayacct", delayacct_setup_d
 
 void delayacct_init(void)
 {
-	delayacct_cache = kmem_cache_create("delayacct_cache",
-					sizeof(struct task_delay_info),
-					0,
-					SLAB_PANIC,
-					NULL, NULL);
+	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC);
 	delayacct_tsk_init(&init_task);
 }
 
Index: linux-2.6.21-rc6/kernel/pid.c
===================================================================
--- linux-2.6.21-rc6.orig/kernel/pid.c	2007-04-20 20:15:14.0 -0700
+++ linux-2.6.21-rc6/kernel/pid.c	2007-04-20 20:17:47.0 -0700
@@ -412,7 +412,5 @@ void __init pidmap_init(void)
 	set_bit(0, init_pid_ns.pidmap[0].page);
 	atomic_dec(&init_pid_ns.pidmap[0].nr_free);
 
-	pid_cachep = kmem_cache_create("pid", sizeof(struct pid),
-					__alignof__(struct pid),
-					SLAB_PANIC, NULL, NULL);
+	pid_cachep = KMEM_CACHE(pid, SLAB_PANIC);
 }
Index: linux-2.6.21-rc6/kernel/signal.c
===================================================================
--- linux-2.6.21-rc6.orig/kernel/signal.c	2007-04-20 20:15:14.0 -0700
+++ linux-2.6.21-rc6/kernel/signal.c	2007-04-20 20:17:47.0 -0700
@@ -2636,9 +2636,5 @@ __attribute__((weak)) const char *arch_v
 
 void __init signals_init(void)
 {
-	sigqueue_cachep =
-		kmem_cache_create("sigqueue",
-				  sizeof(struct sigqueue),
-				  __alignof__(struct sigqueue),
-				  SLAB_PANIC, NULL, NULL);
+	sigqueue_cachep = KMEM_CACHE(sigqueue, SLAB_PANIC);
 }
Index: linux-2.6.21-rc6/kernel/taskstats.c
===================================================================
--- linux-2.6.21-rc6.orig/kernel/taskstats.c	2007-04-20 20:15:14.0 -0700
+++ linux-2.6.21-rc6/kernel/taskstats.c	2007-04-20 20:17:47.0 -0700
@@ -524,9 +524,7 @@ void __init taskstats_init_early(void)
 {
 	unsigned int i;
 
-	taskstats_cache = kmem_cache_create("taskstats_cache",
-						sizeof(struct taskstats),
-						0, SLAB_PANIC, NULL, NULL);
+	taskstats_cache = KMEM_CACHE(taskstats, SLAB_PANIC);
 	for_each_possible_cpu(i) {
 		INIT_LIST_HEAD(&(per_cpu(listener_array, i).list));
 		init_rwsem(&(per_cpu(listener_array, i).sem));
Index: linux-2.6.21-rc6/block/cfq-iosched.c
===================================================================
--- linux-2.6.21-rc6.orig/block/cfq-iosched.c	2007-04-20 20:15:14.0 -0700
+++ linux-2.6.21-rc6/block/cfq-iosched.c	2007-04-20 20:17:47.0 -0700
@@ -,13 +,11 @@ static void cfq_slab_kill(void)
 
 static int __init cfq_slab_setup(void)
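How the macro derives the cache name, size and alignment from a single struct tag can be tried out in userspace. The following is a sketch under stated assumptions: `mock_cache`, `mock_cache_create`, `MOCK_KMEM_CACHE` and `test_obj` are hypothetical stand-ins, not kernel interfaces, and `__alignof__` is the GCC/Clang extension that the patch itself uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cache descriptor; not the kernel's struct kmem_cache. */
struct mock_cache {
	const char *name;
	size_t size;
	size_t align;
	unsigned long flags;
};

/* Hypothetical stand-in for kmem_cache_create(); it only records its arguments. */
static struct mock_cache mock_cache_create(const char *name, size_t size,
					   size_t align, unsigned long flags)
{
	struct mock_cache c = { name, size, align, flags };
	return c;
}

/*
 * Same shape as the KMEM_CACHE() macro in the patch, pointed at the mock:
 * #__struct turns the struct tag into the cache name string.
 */
#define MOCK_KMEM_CACHE(__struct, __flags) mock_cache_create(#__struct, \
		sizeof(struct __struct), __alignof__(struct __struct), \
		(__flags))

/* Example object type used only for this illustration. */
struct test_obj {
	int a, b, c;
	long d;
};
```

Calling `MOCK_KMEM_CACHE(test_obj, 0x1UL)` returns a descriptor named "test_obj" whose size and alignment follow the struct definition, so changing the struct's alignment attribute changes the cache alignment without touching the call site.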
Re: [RFC 0/8] Cpuset aware writeback
On Fri, 20 Apr 2007, Ethan Solomita wrote:

> cpuset_write_dirty_map.htm
>
>    In __set_page_dirty_nobuffers() you always call cpuset_update_dirty_nodes()
> but in __set_page_dirty_buffers() you call it only if page->mapping is still
> set after locking. Is there a reason for the difference? Also a question not
> about your patch: why do those functions call __mark_inode_dirty() even if the
> dirty page has been truncated and mapping == NULL?

If page->mapping has been cleared then the page was removed from the mapping.
__mark_inode_dirty just dirties the inode. If a truncation occurs then the
inode was modified.

> cpuset_write_throttle.htm
>
>    I noticed that several lines have leading spaces. I didn't check if other
> patches have the problem too.

Maybe download the patches? How did those strange .htm endings get
appended to the patches?

>    In get_dirty_limits(), when cpusets are configured you don't subtract highmem
> the same way that is done without cpusets. Is this intentional?

That is something in flux upstream. Linus changed it recently. Do it one
way or the other.

>    It seems that dirty_exceeded is still a global punishment across cpusets.
> Should it be addressed?

Sure. It would be best if you could place that somehow in a cpuset.
Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
Andrew Morton wrote:
> On Wed, 18 Apr 2007 11:00:00 -0400
> Jeff Mahoney <[EMAIL PROTECTED]> wrote:
>
>>> Do you think that could be a reason of the extra reference count on
>>> xattr_root dentry?
>> No, I don't think it is. Looking at the code now, it seems obvious, but
>> I didn't notice it before and nobody else has reported a problem.
>>
>> getxattr() doesn't require any VFS locking. When we get down into the
>> reiserfs code, it takes a read lock. If we get two concurrent threads
>> looking up an xattr before the root has been saved, there's a window
>> where REISERFS_SB(s)->xattr_root is NULL but we've already looked it up
>> and taken a reference on it.
>>
>> I have a patch set to clean up the extended attribute code that fixes
>> this problem along the way by killing off the xattr locks and using the
>> backing files/dirs i_mutex instead. I'll post them to the reiserfs
>> mailing list.
>
> Do we have anything suitable for 2.6.21 which will address this crash?
>
> Also, it's not clear to me how many users we can expect to be impacted by it.
> I assume that if the same bug is in 2.6.20 then the answer is "not many".
> How come Andrea is able to keep hitting it?

I have the patchset that I mentioned, but I'm not proposing it for
2.6.21. It's much too invasive to be introduced in an -rc7, but it does
include locking changes that I believe avoid this bug.

Vladimir was right in his analysis that sometimes get_xa_root() takes
the reference once and other times twice, but not for the right reasons.

I save a reference to the xattr dir to avoid a lookup later, but when
there are multiple getxattrs or listxattrs as the first xattr operation
on the file system, we can end up taking a second reference when we
shouldn't. This is because those operations are protected by read locks
and the ->xattr_root pointer isn't protected by anything else.

A quick fix would be to just extend the protection of the priv root's
i_mutex around the assignment, and test first. The right fix involves a
complete rework of the locking, and I have code to do that, it's just
too late to include it in 2.6.21.

I'd love to know what Andrea (and now Andi Kleen) are doing to hit this
now. There haven't been any changes in this code in a while, and the
shrink_dcache_for_umount() has been around since October.

I'm unable to reproduce locally so far, so if Andrea or Andi could see
if this fixes it for them, I'd appreciate it.

-Jeff

--- a/fs/reiserfs/xattr.c	2007-04-20 21:19:05.0 -0400
+++ b/fs/reiserfs/xattr.c	2007-04-20 21:41:16.0 -0400
@@ -72,14 +72,16 @@
 		err = privroot->d_inode->i_op->mkdir(privroot->d_inode,
 						      xaroot, 0700);
-		mutex_unlock(&privroot->d_inode->i_mutex);
 		if (err) {
+			mutex_unlock(&privroot->d_inode->i_mutex);
 			dput(xaroot);
 			dput(privroot);
 			return ERR_PTR(err);
 		}
-		REISERFS_SB(sb)->xattr_root = dget(xaroot);
+		if (REISERFS_SB(sb)->xattr_root == NULL)
+			REISERFS_SB(sb)->xattr_root = dget(xaroot);
+		mutex_unlock(&privroot->d_inode->i_mutex);
 	}
 
 out:

-- 
Jeff Mahoney
SUSE Labs
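The effect of "test first" on the reference count can be shown with a small single-threaded model. This is only a sketch of the idea behind the quick fix, with hypothetical names (`mock_dentry`, `mock_sb`, `mock_dget`, `cache_root_unconditional`, `cache_root_once`); it replays two lookups that both reach the assignment one after the other, and leaves out the locking that the real code needs so that the test and the store cannot be interleaved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a dentry and the superblock's cached pointer. */
struct mock_dentry {
	int refcount;
};

struct mock_sb {
	struct mock_dentry *xattr_root;
};

/* Take one more reference, like dget() does for a real dentry. */
static struct mock_dentry *mock_dget(struct mock_dentry *d)
{
	d->refcount++;
	return d;
}

/* Unconditional store: every caller that gets here pins the dentry again. */
static void cache_root_unconditional(struct mock_sb *sb, struct mock_dentry *d)
{
	sb->xattr_root = mock_dget(d);
}

/*
 * Test first, then store. In real code this must run with the lock held
 * so that two callers cannot both see NULL.
 */
static void cache_root_once(struct mock_sb *sb, struct mock_dentry *d)
{
	if (sb->xattr_root == NULL)
		sb->xattr_root = mock_dget(d);
}
```

Starting from a reference count of 1, two calls to the unconditional variant leave the count at 3, while two calls to the tested variant leave it at 2, which is the single cached reference the message describes as the intended state.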
slab allocators: Remove SLAB_DEBUG_INITIAL flag
I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB.

I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object.

I would think that it is much easier to check the object state manually before the free. That also places the check near the code that manipulates the object.

Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there were code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG, otherwise it would just be dead code. But there is no such code in the kernel. I think SLAB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand, and there are easier ways to accomplish the same effect (i.e. add debug code before kfree).

There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removal of SLAB_DEBUG_INITIAL) from the fs constructors.

This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc6/include/linux/slab.h
===
--- linux-2.6.21-rc6.orig/include/linux/slab.h 2007-04-20 18:07:16.0 -0700
+++ linux-2.6.21-rc6/include/linux/slab.h 2007-04-20 18:08:22.0 -0700
@@ -21,7 +21,6 @@ typedef struct kmem_cache kmem_cache_t _
  * The ones marked DEBUG are only valid if CONFIG_SLAB_DEBUG is set.
  */
 #define SLAB_DEBUG_FREE		0x0100UL	/* DEBUG: Perform (expensive) checks on free */
-#define SLAB_DEBUG_INITIAL	0x0200UL	/* DEBUG: Call constructor (as verifier) */
 #define SLAB_RED_ZONE		0x0400UL	/* DEBUG: Red zone objs in a cache */
 #define SLAB_POISON		0x0800UL	/* DEBUG: Poison objects */
 #define SLAB_HWCACHE_ALIGN	0x2000UL	/* Align objs on cache lines */
@@ -36,7 +35,6 @@ typedef struct kmem_cache kmem_cache_t _
 /* Flags passed to a constructor functions */
 #define SLAB_CTOR_CONSTRUCTOR	0x001UL	/* If not set, then deconstructor */
 #define SLAB_CTOR_ATOMIC	0x002UL	/* Tell constructor it can't sleep */
-#define SLAB_CTOR_VERIFY	0x004UL	/* Tell constructor it's a verify call */
 /*
  * struct kmem_cache related prototypes
Index: linux-2.6.21-rc6/mm/slab.c
===
--- linux-2.6.21-rc6.orig/mm/slab.c 2007-04-20 18:07:16.0 -0700
+++ linux-2.6.21-rc6/mm/slab.c 2007-04-20 18:08:22.0 -0700
@@ -116,8 +116,7 @@
 #include
 /*
- * DEBUG - 1 for kmem_cache_create() to honour; SLAB_DEBUG_INITIAL,
- * SLAB_RED_ZONE & SLAB_POISON.
+ * DEBUG - 1 for kmem_cache_create() to honour; SLAB_RED_ZONE & SLAB_POISON.
  * 0 for faster, smaller code (especially in the critical paths).
  *
  * STATS - 1 to collect stats for /proc/slabinfo.
@@ -172,7 +171,7 @@
 /* Legal flag mask for kmem_cache_create(). */
 #if DEBUG
-# define CREATE_MASK (SLAB_DEBUG_INITIAL | SLAB_RED_ZONE | \
+# define CREATE_MASK (SLAB_RED_ZONE | \
 			 SLAB_POISON | SLAB_HWCACHE_ALIGN | \
 			 SLAB_CACHE_DMA | \
 			 SLAB_STORE_USER | \
@@ -2182,12 +2181,6 @@ kmem_cache_create (const char *name, siz
 #if DEBUG
 	WARN_ON(strchr(name, ' '));	/* It confuses parsers */
-	if ((flags & SLAB_DEBUG_INITIAL) && !ctor) {
-		/* No constructor, but inital state check requested */
-		printk(KERN_ERR "%s: No con, but init state check "
-		       "requested - %s\n", __FUNCTION__, name);
-		flags &= ~SLAB_DEBUG_INITIAL;
-	}
 #if FORCED_DEBUG
 	/*
 	 * Enable redzoning and last user accounting, except for caches with
@@ -2892,15 +2885,6 @@ static void *cache_free_debugcheck(struc
 	BUG_ON(objnr >= cachep->num);
 	BUG_ON(objp != index_to_obj(cachep, slabp, objnr));
-	if (cachep->flags & SLAB_DEBUG_INITIAL) {
-		/*
-		 * Need to call the slab's constructor so the caller can
-		 * perform a verify of its state (debugging). Called without
-		 * the cache-lock held.
-		 */
-		cachep->ctor(objp + obj_offset(cachep),
-			     cachep, SLAB_CTOR_CONSTRUCTOR | SLAB_CTOR_VERIFY);
-	}
 	if (cachep->flags & SLAB_POISON && cachep->dtor) {
 		/* we want to cache poison the object,
 		 * call the destruction callback
Index: linux-2.6.21-rc6/drivers/mtd/ubi/eba.c
===
Re: [RFC 0/8] Cpuset aware writeback
Christoph Lameter wrote:
> H Sorry. I got distracted and I have sent them to Kame-san who was
> interested in working on them. I have placed the most recent version at
> http://ftp.kernel.org/pub/linux/kernel/people/christoph/cpuset_dirty

Hi Christoph -- a few comments on the patches:

cpuset_write_dirty_map.htm

In __set_page_dirty_nobuffers() you always call cpuset_update_dirty_nodes() but in __set_page_dirty_buffers() you call it only if page->mapping is still set after locking. Is there a reason for the difference? Also a question not about your patch: why do those functions call __mark_inode_dirty() even if the dirty page has been truncated and mapping == NULL?

cpuset_write_throttle.htm

I noticed that several lines have leading spaces. I didn't check if other patches have the problem too.

In get_dirty_limits(), when cpusets are configured you don't subtract highmem the same way that is done without cpusets. Is this intentional?

It seems that dirty_exceeded is still a global punishment across cpusets. Should it be addressed?

-- Ethan
Re: 2.6.20.7 locking up hard on boot
On Fri, Apr 20, 2007 at 07:47:13PM -0500, Marcos Pinto wrote:
> I'm not subscribed, so please personally CC me any answers/comments.
> Thank you.
>
> While booting, (AMD64 Turion x2) 2.6.20.7 kernel locks up hard. The
> last kernel that I tried, 2.6.18.8, worked perfectly without any
> trickery. 2.6.20.7 only boots up with "acpi=off" being added to the
> kernel line. Note that 2.6.18.8 works perfectly with acpi on, which
> is really the only way I can run this box because with "acpi=off" it
> overheats and freezes.
> Please let me know if there's anything else that I could do to help with
> this.
>
>
> Here's what's on the screen when it happens:
>
> Brought up 2 CPUs
> testing NMI watchdog ... OK.
> Disabling vsyscall due to use of PM timer
> time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
> time.c: Detected 1808.264 MHz processor.
> migration_cost=281
> NET: Registered protocol family 16
> ACPI: bus type pci registered
> PCI: Using MMCONFIG at e000
> PCI: No mmconfig possible on device 00:18
> PCI: No mmconfig possible on device 07:05
> ACPI: Interpreter enabled
> ACPI: Using IOAPIC for interrupt routing
> ACPI: PCI Root Bridge [PCI0] (:00)
> ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
> :00:0d.0: cannot adjust BAR0 (not I/O)
> :00:0d.0: cannot adjust BAR1 (not I/O)
>...

Does 2.6.20.3 boot with ACPI enabled?

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
[PATCH] utilities: add helper functions for safe 64-bit integer operations as 32-bit halves
From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

Add helper functions "upper_32_bits" and "lower_32_bits" to allow 64-bit integers to be separated into their 32-bit upper and lower halves without promoting integers, without stretching sign bits, and without generating compiler warnings when used with any integer not greater than 64 bits wide. High-order bits are assumed to be zero for integers with fewer than 64 of them.

Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

---

Using these functions with signed quantities is an error, especially if you read a 32-bit quantity from disk that happens to have the high bit set into an int on a 32-bit machine, then use it with a function taking a u64 which screws your data. When switching to using these functions, it's a good opportunity to check for these signedness errors. (Haven't we learned anything over the past decades of computing about assuming that one little bit doesn't matter?)

Not sure exactly who the maintainer is for this, so I added [EMAIL PROTECTED] It's certainly not limited to one subsystem anymore, and converting the whole kernel to this could be a good step for readability and correctness across architectures of any word size.

--- linux-2.6.21-rc7-git4.orig/include/linux/kernel.h 2007-04-20 20:22:13.0 -0400
+++ linux-2.6.21-rc7-git4.mod/include/linux/kernel.h 2007-04-20 20:37:41.0 -0400
@@ -40,6 +40,23 @@ extern const char linux_proc_banner[];
 #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
 #define roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
 
+/**
+ * lower_32_bits, upper_32_bits - separate the halves of a 64-bit integer
+ * @n: the integer to separate
+ *
+ * Separate a 64-bit integer into its upper and lower 32-bit halves.
+ * Designed to avoid integer promotions and compiler warnings when used
+ * with smaller integers, in which case the missing bits are assumed to
+ * be zero. Designed to treat integers as unsigned whether or not they
+ * really are. (If you are using these with signed integers, your code
+ * is almost certainly wrong. The cast is good for people too lazy to
+ * type "unsigned" in their code, since breaking things is bad.)
+ *
+ * These assume the integer used is NOT greater than 64 bits wide.
+ */
+#define upper_32_bits(n) (sizeof(n) == 8 ? (u64)(n) >> 32 : 0)
+#define lower_32_bits(n) (sizeof(n) == 8 ? (u32)(n) : (n))
+
 #define	KERN_EMERG	"<0>"	/* system is unusable */
 #define	KERN_ALERT	"<1>"	/* action must be taken immediately */
 #define	KERN_CRIT	"<2>"	/* critical conditions */
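To see what the two proposed macros do in practice, here is a small userspace sketch. The macro bodies are copied from the patch; everything else is an assumption made for illustration: `u32`/`u64` are typedef'd from `<stdint.h>` instead of coming from kernel headers, and `struct addr_halves`, `split_addr()` and `upper_of_u32()` are hypothetical helpers that do not appear in the patch.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's fixed-width types (assumption). */
typedef uint32_t u32;
typedef uint64_t u64;

/* Macros exactly as proposed in the patch. */
#define upper_32_bits(n) (sizeof(n) == 8 ? (u64)(n) >> 32 : 0)
#define lower_32_bits(n) (sizeof(n) == 8 ? (u32)(n) : (n))

/* Hypothetical pair of 32-bit values, e.g. two device registers. */
struct addr_halves {
	u32 hi;
	u32 lo;
};

/* Split an unsigned 64-bit value into its two 32-bit halves. */
static struct addr_halves split_addr(u64 addr)
{
	struct addr_halves h;

	h.hi = (u32)upper_32_bits(addr);
	h.lo = (u32)lower_32_bits(addr);
	return h;
}

/*
 * With a 32-bit argument the upper half is defined to be zero. The
 * (u64) cast inside the macro keeps the shift by 32 well-formed even
 * though the argument itself is only 32 bits wide.
 */
static u32 upper_of_u32(u32 v)
{
	return (u32)upper_32_bits(v);
}
```

Only unsigned inputs are used here, in line with the patch author's warning that signed arguments are almost certainly a bug.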
2.6.20.7 locking up hard on boot
I'm not subscribed, so please personally CC me any answers/comments. Thank you. While booting, (AMD64 Turion x2) 2.6.20.7 kernel locks up hard. The last kernel that I tried, 2.6.18.8, worked perfectly without any trickery. 2.6.20.7 only boots up with "acpi=off" being added to the kernel line. Note that 2.6.18.8 works perfectly with acpi on, which is really the only way I can run this box because with "acpi=off" it overheats and freezes. Please let me know if there's anything else that I could do to help with this. Here's what's on the screen when it happens: Brought up 2 CPUs testing NMI watchdog ... OK. Disabling vsyscall due to use of PM timer time.c: Using 3.579545 MHz WALL PM GTOD PM timer. time.c: Detected 1808.264 MHz processor. migration_cost=281 NET: Registered protocol family 16 ACPI: bus type pci registered PCI: Using MMCONFIG at e000 PCI: No mmconfig possible on device 00:18 PCI: No mmconfig possible on device 07:05 ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (:00) ACPI: Assume root bridge [\_SB_.PCI0] bus is 0 :00:0d.0: cannot adjust BAR0 (not I/O) :00:0d.0: cannot adjust BAR1 (not I/O) lspci -nn 00:00.0 RAM memory [0500]: nVidia Corporation C51 Host Bridge [10de:02f7] (rev a2) 00:00.1 RAM memory [0500]: nVidia Corporation C51 Memory Controller 0 [10de:02fa] (rev a2) 00:00.2 RAM memory [0500]: nVidia Corporation C51 Memory Controller 1 [10de:02fe] (rev a2) 00:00.3 RAM memory [0500]: nVidia Corporation C51 Memory Controller 5 [10de:02f8] (rev a2) 00:00.4 RAM memory [0500]: nVidia Corporation C51 Memory Controller 4 [10de:02f9] (rev a2) 00:00.5 RAM memory [0500]: nVidia Corporation C51 Host Bridge [10de:02ff] (rev a2) 00:00.6 RAM memory [0500]: nVidia Corporation C51 Memory Controller 3 [10de:027f] (rev a2) 00:00.7 RAM memory [0500]: nVidia Corporation C51 Memory Controller 2 [10de:027e] (rev a2) 00:02.0 PCI bridge [0604]: nVidia Corporation C51 PCI Express Bridge [10de:02fc] (rev a1) 00:03.0 PCI bridge 
[0604]: nVidia Corporation C51 PCI Express Bridge [10de:02fd] (rev a1) 00:04.0 PCI bridge [0604]: nVidia Corporation C51 PCI Express Bridge [10de:02fb] (rev a1) 00:09.0 RAM memory [0500]: nVidia Corporation MCP51 Host Bridge [10de:0270] (rev a2) 00:0a.0 ISA bridge [0601]: nVidia Corporation MCP51 LPC Bridge [10de:0260] (rev a3) 00:0a.1 SMBus [0c05]: nVidia Corporation MCP51 SMBus [10de:0264] (rev a3) 00:0a.3 Co-processor [0b40]: nVidia Corporation MCP51 PMU [10de:0271] (rev a3) 00:0b.0 USB Controller [0c03]: nVidia Corporation MCP51 USB Controller [10de:026d] (rev a3) 00:0b.1 USB Controller [0c03]: nVidia Corporation MCP51 USB Controller [10de:026e] (rev a3) 00:0d.0 IDE interface [0101]: nVidia Corporation MCP51 IDE [10de:0265] (rev f1) 00:0e.0 IDE interface [0101]: nVidia Corporation MCP51 Serial ATA Controller [10de:0266] (rev f1) 00:10.0 PCI bridge [0604]: nVidia Corporation MCP51 PCI Bridge [10de:026f] (rev a2) 00:10.1 Audio device [0403]: nVidia Corporation MCP51 High Definition Audio [10de:026c] (rev a2) 00:14.0 Bridge [0680]: nVidia Corporation MCP51 Ethernet Controller [10de:0269] (rev a3) 00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration [1022:1100] 00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map [1022:1101] 00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller [1022:1102] 00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control [1022:1103] 03:00.0 Network controller [0280]: Broadcom Corporation Dell Wireless 1390 WLAN Mini-PCI Card [14e4:4311] (rev 01) 05:00.0 VGA compatible controller [0300]: nVidia Corporation GeForce Go 7200 [10de:01d6] (rev a1) 07:05.0 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd R5C832 IEEE 1394 Controller [1180:0832] 07:05.1 Generic system peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter [1180:0822] (rev 19) 07:05.2 
System peripheral [0880]: Ricoh Co Ltd Unknown device [1180:0843] (rev 01) 07:05.3 System peripheral [0880]: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter [1180:0592] (rev 0a) 07:05.4 System peripheral [0880]: Ricoh Co Ltd xD-Picture Card Controller [1180:0852] (rev 05) lspci -vnn 00:00.0 RAM memory [0500]: nVidia Corporation C51 Host Bridge [10de:02f7] (rev a2) Subsystem: Hewlett-Packard Company Unknown device [103c:30b7] Flags: bus master, 66MHz, fast devsel, latency 0 Capabilities: [44] HyperTransport: Slave or Primary Interface Capabilities: [e0] HyperTransport: MSI Mapping 00:00.1 RAM memory [0500]: nVidia Corporation C51 Memory Controller 0 [10de:02fa] (rev a2) Subsystem: Hewlett-Packard Company Unknown device [103c:30b7] Flags: 66MHz, fast devsel 00:00.2 RAM memory [0500]: nVidia Corporation C51 Memory Controller 1 [10de:02fe] (rev a2) Subsystem: Hewlett-Packard Company Unknown device [103c:30b7]
Re: [PATCH] lazy freeing of memory through MADV_FREE
Rik van Riel wrote:
> Andrew Morton wrote:
>> On Fri, 20 Apr 2007 17:38:06 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:
>>> Andrew Morton wrote:
>>>> I've also merged Nick's "mm: madvise avoid exclusive mmap_sem".
>>>> - Nick's patch also will help this problem. It could be that your patch
>>>> no longer offers a 2x speedup when combined with Nick's patch.
>>>> It could well be that the combination of the two is even better, but it
>>>> would be nice to firm that up a bit.
>>> I'll test that.
>> Thanks.
>
> Well, good news.
>
> It turns out that Nick's patch does not improve peak performance much, but it does prevent the decline when running with 16 threads on my quad core CPU!
>
> We _definitely_ want both patches, there's a huge benefit in having them both.
>
> Here are the transactions/second for each combination:
>
> threads   vanilla   new glibc   madv_free kernel   madv_free + mmap_sem
> 1         610       609         596                545

545 tps versus 610 tps for one thread ? It seems quite bad, no ?

Could you please find an explanation for this ?

> 2         1032      1136        1196               1200
> 4         1070      1128        2014               2024
> 8         1000      1088        1665               2087
> 16        779       1073        1310               1999

Thank you
Linux 2.6.16.49-rc1
Location: ftp://ftp.kernel.org/pub/linux/kernel/people/bunk/linux-2.6.16.y/testing/
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git
RSS feed of the git tree: http://www.kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.16.y.git;a=rss

Changes since 2.6.16.48:

Adrian Bunk (1):
      Linux 2.6.16.49-rc1

Ard van Breemen (1):
      start_kernel: test if irq's got enabled early, barf, and disable them again

Aristeu Sergio Rozanski Filho (1):
      tty_io: fix race in master pty close/slave pty close path

Aubrey Li (1):
      [NET]: Fix UDP checksum issue in net poll mode.

David S. Miller (3):
      [SCSI] QLOGICPTI: Do not unmap DMA unless we actually mapped something.
      [SPARC64]: Fix SBUS IOMMU allocation code.
      [SPARC64]: Fix arg passing to compat_sys_ipc().

Jean Delvare (1):
      hwmon/w83627ehf: Fix the fan5 clock divider write

Linas Vepstas (1):
      elevator: move clearing of unplug flag earlier

Olaf Kirch (1):
      [IrDA]: Correctly handling socket error

Tom Callaway (1):
      [SPARC64]: Fix inline directive in pci_iommu.c

 Makefile                        |    2
 arch/sparc64/kernel/pci_iommu.c |    2
 arch/sparc64/kernel/sbus.c      |  560 +---
 arch/sparc64/kernel/sys32.S     |    1
 arch/sparc64/kernel/systbls.S   |    2
 block/elevator.c                |   11
 drivers/char/tty_io.c           |   14
 drivers/hwmon/w83627ehf.c       |    6
 drivers/scsi/qlogicpti.c        |    2
 init/main.c                     |    5
 net/core/netpoll.c              |    7
 net/irda/af_irda.c              |    3
 12 files changed, 272 insertions(+), 343 deletions(-)
Re: [PATCH] fix ext2 allocator overflows above 31 bit blocks
Mingming Cao wrote:
> On Fri, 2007-04-20 at 18:14 -0500, Eric Sandeen wrote:
>> It's a bug, today.
>
> They are fixed in mm tree, as part of the patches which backports ext3 block reservation code to ext2. filesystem block numbers are all ext2_fsblk_t type(i.e. unsigned long)(see ext2_new_blocks()). Maybe need a round of thorough review to see if anything left, but I think what in mm tree looks good.

Oh... oops. I didn't think to check mm, didn't expect to find those changes on ext2. Ok, I will double-check that against what I did.

> And those patches in mm tree also backports the ext3 best-effort allocates multiple blocks code (allocate multiple blocks within the block reservation window as much as possible), FYI.

Ok, thanks Mingming!

-Eric
Re: [PATCH] fixed spinlock use in hysdn_log_close()
On Sat, 14 Apr 2007 07:09:07 +0200 Matthias Kaehlcke <[EMAIL PROTECTED]> wrote:

> fixed incorrect spinlock use in hysdn_log_close(). the function
> declared a spinlock on the stack and used it to 'protect' a shared
> driver structure. the patch removes the declaration of hysdn_lock and
> uses card->hysdn_lock instead.
>

Interesting.

>
> ---
> diff --git a/drivers/isdn/hysdn/hysdn_proclog.c b/drivers/isdn/hysdn/hysdn_proclog.c
> index f7e83a8..32f0b75 100644
> --- a/drivers/isdn/hysdn/hysdn_proclog.c
> +++ b/drivers/isdn/hysdn/hysdn_proclog.c
> @@ -299,7 +299,6 @@ hysdn_log_close(struct inode *ino, struct file *filep)
>  	hysdn_card *card;
>  	int retval = 0;
>  	unsigned long flags;
> -	spinlock_t hysdn_lock = SPIN_LOCK_UNLOCKED;
>
>  	lock_kernel();
>  	if ((filep->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_WRITE) {
> @@ -309,7 +308,7 @@ hysdn_log_close(struct inode *ino, struct file *filep)
>  		/* read access -> log/debug read, mark one further file as closed */
>
>  		pd = NULL;
> -		spin_lock_irqsave(&hysdn_lock, flags);
> +		spin_lock_irqsave(&card->hysdn_lock, flags);

I guess it won't hurt - are you actually able to test this code?

afaict most of the data in there is locked with lock_kernel(), if it's locked at all.

If you had some runtime problem and this patch fixed it then fine. If however you're not able to test this code then perhaps the safest option is to simply remove that locking altogether, which is pretty much a runtime-equivalent change.
Re: Acecad USB Tablet: usbmouse takeover and odd motion
On 4/21/07, Jiri Kosina <[EMAIL PROTECTED]> wrote:
> On Fri, 20 Apr 2007, Giuseppe Bilotta wrote:
>
> > Oh, I see. I'll blacklist those modules, maybe also issue a ticket on
> > the Debian BTS.
>
> If Debian enables usbmouse and usbkbd by default in their standard kernels, would you be so kind and raise a proper ticket on them not to do so? Thanks.
>
> This also makes me to speed up with one of my items on TODO list - rename usbmouse and usbkbd to something that wouldn't be so confusing and wouldn't make people think that they should enable these drivers if they want support for USB keyboards/mice. Will queue this for 2.6.22.

Actually, I just found out that usbmouse and usbkbd are in the blacklist file (/etc/modprobe.d/blacklist), so the fact that they are being called reveals some kind of fscked up setup on my side. I'll try to fix that, sorry for the noise.

--
Giuseppe "Oblomov" Bilotta
Re: AGPGart / AMD K7
On Fri, Apr 20, 2007 at 04:26:51PM -0700, Greg Kroah-Hartman wrote:

> > We should always have a bus in bus_add_driver()
> > Instead of returning success when we don't, BUG().
>
> Nah, I don't like adding BUG() calls to the kernel if it can be helped,
> how about the version I copied you on a few hours ago, which is also
> below?

Either works for me..

Dave

--
http://www.codemonkey.org.uk
Re: [PATCH] fix ext2 allocator overflows above 31 bit blocks
On Fri, 2007-04-20 at 18:14 -0500, Eric Sandeen wrote:
> Andreas Dilger wrote:
> > On Apr 20, 2007 12:10 -0500, Eric Sandeen wrote:
> >> If ext3 can do 16T, ext2 probably should be able to as well.
> >> There are still "int" block containers in the block allocation path
> >> that need to be fixed up.
> >
> > Yeah, but who wants to do 16TB e2fsck on every boot? I think there
> > needs to be some limits imposed for the sake of usability.
>
> I figure this is in the fine tradition of "enough rope to hang oneself"
>
> If you have 16T of filesystem you probably know enough to not hang
> yourself this way.
>
> *shrug*
>
> It's a bug, today.

They are fixed in mm tree, as part of the patches which backports ext3 block reservation code to ext2. filesystem block numbers are all ext2_fsblk_t type(i.e. unsigned long)(see ext2_new_blocks()). Maybe need a round of thorough review to see if anything left, but I think what in mm tree looks good.

And those patches in mm tree also backports the ext3 best-effort allocates multiple blocks code (allocate multiple blocks within the block reservation window as much as possible), FYI.

Mingming

> If we need another change to limit ext2 to 500G or
> something, fine by me. :)
>
> -Eric
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch] CFS scheduler, v3
William Lee Irwin III wrote:
> On Fri, Apr 20, 2007 at 10:10:45AM +1000, Peter Williams wrote:
>> I have a suggestion I'd like to make that addresses both nice and fairness at the same time. As I understand the basic principle behind this scheduler is to work out a time by which a task should make it onto the CPU and then place it into an ordered list (based on this value) of tasks waiting for the CPU. I think that this is a great idea and my suggestion is with regard to a method for working out this time that takes into account both fairness and nice.
>
> Hmm. Let me take a look...
>
> On Fri, Apr 20, 2007 at 10:10:45AM +1000, Peter Williams wrote:
>> First suppose we have the following metrics available in addition to what's already provided.
>> rq->avg_weight_load /* a running average of the weighted load on the CPU */
>> p->avg_cpu_per_cycle /* the average time in nsecs that p spends on the CPU each scheduling cycle */
>
> I'm suspicious of mean service times not paired with mean inter-arrival times.
>
> On Fri, Apr 20, 2007 at 10:10:45AM +1000, Peter Williams wrote:
>> where a scheduling cycle for a task starts when it is placed on the queue after waking or being preempted and ends when it is taken off the CPU either voluntarily or after being preempted. So p->avg_cpu_per_cycle is just the average amount of time p spends on the CPU each time it gets on to the CPU. (Sorry for the long explanation here but I just wanted to make sure there was no chance that "scheduling cycle" would be construed as some mechanism being imposed on the scheduler.)
>
> I went and looked up priority queueing queueing theory garbage and re-derived various things I needed. The basics check out. Probably no one cares that I checked.
>
> On Fri, Apr 20, 2007 at 10:10:45AM +1000, Peter Williams wrote:
>> We can then define:
>> effective_weighted_load = max(rq->raw_weighted_load, rq->avg_weighted_load)
>> If p is just waking (i.e. it's not on the queue and its load_weight is not included in rq->raw_weighted_load) and we need to queue it, we say that the maximum time (in all fairness) that p should have to wait to get onto the CPU is:
>> expected_wait = p->avg_cpu_per_cycle * effective_weighted_load / p->load_weight

You're right. The time that the task spent sleeping before being woken should be subtracted from this value. If the answer is less than or equal to zero pre-emption should occur.

> This doesn't look right, probably because the scaling factor of p->avg_cpu_per_cycle is the reciprocal of its additive contribution to the ->avg_weight_load as opposed to a direct estimate of its initial delay or waiting time before completing its current requested service.
>
> p->load_weight/effective_weighted_load more properly represents an entitlement to CPU bandwidth.

Yes. But expected_wait isn't entitlement it's its inverse.

> p->avg_cpu_per_cycle/(p->load_weight/effective_weighted_load) would be more like the expected time spent on the runqueue

When I went to school that would be just another way of expressing the equation that I expressed.

> (whether waiting to run or actually running) for a time interval spent in a runnable state and the expected time runnable and waiting to run in such an interval would be p->avg_cpu_per_cycle*(1-effective_weighted_load/p->load_weight),
>
> Neither represents the initial delay between entering the runqueue and first acquiring the CPU, but that's a bit hard to figure out without deciding the scheduling policy up-front anyway.
>
> This essentially doesn't look correct because while you want to enforce the CPU bandwidth allocation, this doesn't have much to do with that apart from the CPU bandwidth appearing as a term. It's more properly a rate of service as opposed to a time at which anything should happen or a number useful for predicting such. When service should begin more properly depends on the other tasks in the system and a number of other decisions that are part of the scheduling policy.

This model takes all of those into consideration. The idea is not just to predict but to use the calculated time to decide when to boot the current process off the CPU (if it doesn't leave voluntarily) and put this one on. This more or less removes the need to give each task a predetermined chunk of CPU when they go on to the CPU. This should, in general, reduce the number of context switches as tasks get to run until they've finished what they're doing or another task becomes higher priority rather than being booted off after an arbitrary time interval. (If this ever gets tried it will be interesting to see if this prediction comes true.)

BTW Even if Ingo doesn't choose to try this model, I'll probably make a patch (way in the future after Ingo's changes are settled) to try it out myself.

> If you want to choose a "quasi-inter-arrival time" to achieve the specified CPU bandwidth allocation, this would be it, but to use that to actually enforce the CPU bandwidth allocation, you would
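As a worked illustration of the arithmetic discussed in this message, the sketch below computes effective_weighted_load and expected_wait as Peter Williams defines them, then subtracts the time already spent sleeping, as he says should be done. It is a userspace sketch under stated assumptions, not scheduler code: `struct rq_sketch`, `struct task_sketch` and `remaining_wait_ns()` are hypothetical names, the fields mirror the metrics named in the thread, and overflow of the 64-bit product is ignored.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-runqueue metrics named in the thread. */
struct rq_sketch {
	uint64_t raw_weighted_load;
	uint64_t avg_weighted_load;
};

/* Hypothetical stand-in for the per-task metrics named in the thread. */
struct task_sketch {
	uint64_t avg_cpu_per_cycle;	/* ns on the CPU per scheduling cycle */
	uint64_t load_weight;		/* must be non-zero */
};

/* effective_weighted_load = max(raw_weighted_load, avg_weighted_load) */
static uint64_t effective_weighted_load(const struct rq_sketch *rq)
{
	return rq->raw_weighted_load > rq->avg_weighted_load ?
		rq->raw_weighted_load : rq->avg_weighted_load;
}

/*
 * expected_wait = avg_cpu_per_cycle * effective_weighted_load / load_weight,
 * reduced by the time the task already spent sleeping. A result <= 0 is
 * the case where, per the message, pre-emption should occur.
 */
static int64_t remaining_wait_ns(const struct rq_sketch *rq,
				 const struct task_sketch *p,
				 uint64_t slept_ns)
{
	uint64_t expected_wait = p->avg_cpu_per_cycle *
		effective_weighted_load(rq) / p->load_weight;

	return (int64_t)expected_wait - (int64_t)slept_ns;
}
```

For example, a task averaging 1 ms of CPU per cycle with weight 1024 on a queue whose effective load is 3072 gets an expected wait of 3 ms; if it has already slept for 4 ms, the result is negative and it would be put on the CPU at once.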
[RFC PATCH - Try #2] Re: BUG in sysfs_remove_group
Updated version of the patch, which splits __lookup_hash() into normal and
kernel variants, to prevent a check of the type of lookup. Also splits
lookup_one_len().

Tests ok on my system.

Please review.

Subject: [PATCH] security: prevent permission checking of file removal via sysfs_remove_group()

Prevent permission checking from being performed when the kernel wants to
unconditionally remove a sysfs group, by introducing a kernel-only variant
of lookup_one_len(), lookup_one_len_kern().

Additionally, as sysfs_remove_group() does not check the return value of
the lookup before using it, a BUG_ON has been added to pinpoint the cause
of any problems potentially caused by this (and as a form of annotation).

Signed-off-by: James Morris <[EMAIL PROTECTED]>
---
 fs/namei.c            |   72 +++-
 fs/sysfs/group.c      |    6 +++-
 include/linux/namei.h |    1 +
 3 files changed, 57 insertions(+), 22 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ee60cc4..cabe2b8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1243,22 +1243,13 @@ int __user_path_lookup_open(const char __user *name, unsigned int lookup_flags,
 	return err;
 }

-/*
- * Restricted form of lookup. Doesn't follow links, single-component only,
- * needs parent already locked. Doesn't follow mounts.
- * SMP-safe.
- */
-static struct dentry * __lookup_hash(struct qstr *name, struct dentry * base, struct nameidata *nd)
+static inline struct dentry *__lookup_hash_kern(struct qstr *name, struct dentry *base, struct nameidata *nd)
 {
-	struct dentry * dentry;
+	struct dentry *dentry;
 	struct inode *inode;
 	int err;

 	inode = base->d_inode;
-	err = permission(inode, MAY_EXEC, nd);
-	dentry = ERR_PTR(err);
-	if (err)
-		goto out;

 	/*
 	 * See if the low-level filesystem might want
@@ -1287,35 +1278,76 @@ out:
 	return dentry;
 }

+/*
+ * Restricted form of lookup. Doesn't follow links, single-component only,
+ * needs parent already locked. Doesn't follow mounts.
+ * SMP-safe.
+ */
+static inline struct dentry * __lookup_hash(struct qstr *name, struct dentry *base, struct nameidata *nd)
+{
+	struct dentry *dentry;
+	struct inode *inode;
+	int err;
+
+	inode = base->d_inode;
+
+	err = permission(inode, MAY_EXEC, nd);
+	dentry = ERR_PTR(err);
+	if (err)
+		goto out;
+
+	dentry = __lookup_hash_kern(name, base, nd);
+out:
+	return dentry;
+}
+
 static struct dentry *lookup_hash(struct nameidata *nd)
 {
 	return __lookup_hash(&nd->last, nd->dentry, nd);
 }

 /* SMP-safe */
-struct dentry * lookup_one_len(const char * name, struct dentry * base, int len)
+static inline int __lookup_one_len(const char *name, struct qstr *this, struct dentry *base, int len)
 {
 	unsigned long hash;
-	struct qstr this;
 	unsigned int c;

-	this.name = name;
-	this.len = len;
+	this->name = name;
+	this->len = len;
 	if (!len)
-		goto access;
+		return -EACCES;

 	hash = init_name_hash();
 	while (len--) {
 		c = *(const unsigned char *)name++;
 		if (c == '/' || c == '\0')
-			goto access;
+			return -EACCES;
 		hash = partial_name_hash(c, hash);
 	}
-	this.hash = end_name_hash(hash);
+	this->hash = end_name_hash(hash);
+	return 0;
+}

+struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
+{
+	int err;
+	struct qstr this;
+
+	err = __lookup_one_len(name, &this, base, len);
+	if (err)
+		return ERR_PTR(err);
 	return __lookup_hash(&this, base, NULL);
-access:
-	return ERR_PTR(-EACCES);
+}
+
+struct dentry *lookup_one_len_kern(const char *name, struct dentry *base, int len)
+{
+	int err;
+	struct qstr this;
+
+	err = __lookup_one_len(name, &this, base, len);
+	if (err)
+		return ERR_PTR(err);
+	return __lookup_hash_kern(&this, base, NULL);
 }

 /*
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index b20951c..52eed2a 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -70,9 +70,11 @@ void sysfs_remove_group(struct kobject * kobj,
 {
 	struct dentry * dir;

-	if (grp->name)
-		dir = lookup_one_len(grp->name, kobj->dentry,
+	if (grp->name) {
+		dir = lookup_one_len_kern(grp->name, kobj->dentry,
 				strlen(grp->name));
+		BUG_ON(IS_ERR(dir));
+	}
 	else
 		dir = dget(kobj->dentry);

diff --git a/include/linux/namei.h b/include/linux/namei.h
index d39a5a6..b7dd249 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -82,6 +82,7 @@ extern struct file *nameidata_to_filp(struct nameidata *nd, int flags);
 extern void
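For readers following the split, here is a rough user-space sketch of the shape of the change, not the kernel code itself. It assumes a hypothetical struct name_component in place of struct qstr, an illustrative hash step, and a may_exec flag standing in for the result of permission(); lookup_kern() and lookup_checked() are made-up names. The point is only that both variants share the name validation, and just the normal variant is gated on the permission result.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct qstr. */
struct name_component {
	const char *name;
	size_t len;
	unsigned int hash;
};

/*
 * Validate one path component and hash it, following the shape of
 * __lookup_one_len() in the patch: empty names and names containing
 * '/' or '\0' are rejected with -EACCES. The hash step is illustrative.
 */
int component_init(const char *name, size_t len, struct name_component *out)
{
	unsigned long hash = 0;
	size_t i;

	if (len == 0)
		return -EACCES;
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];

		if (c == '/' || c == '\0')
			return -EACCES;
		hash = (hash + (c << 4) + (c >> 4)) * 11;
	}
	out->name = name;
	out->len = len;
	out->hash = (unsigned int)hash;
	return 0;
}

/* Kernel-internal variant: validates the name, no permission check. */
int lookup_kern(const char *name, size_t len, struct name_component *out)
{
	return component_init(name, len, out);
}

/* Normal variant: same validation, then gated on the permission result. */
int lookup_checked(int may_exec, const char *name, size_t len,
		   struct name_component *out)
{
	int err = component_init(name, len, out);

	if (err)
		return err;
	if (!may_exec)
		return -EACCES;
	return 0;
}
```

With this arrangement a caller such as the sysfs group removal path would use the unchecked variant, while everything else keeps going through the checked one.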
Re: AGPGart / AMD K7
On Fri, Apr 20, 2007 at 07:42:33PM -0400, Preston A. Elder wrote: > Final followup, > > If I compile EDAC out of the kernel completely, everything works now. > > This should be resolved though. > 1) dd.c should produce some kind of warning when it wants to assign a > driver to a device, but it can't because a driver is already assigned to > a device > > ie. change: > if (!dev->driver) > driver_probe_device(drv, dev); > to: > if (!dev->driver) > driver_probe_device(drv, dev); > else > printk(KERN_WARNING "__driver_attach (%s): alreay registered > with driver %s\n", > dev->bus_id, dev->driver->name); > > 2) Possibly a device should be able to have more than one driver > associated with it - so the AGP driver and EDAC could both use the > device in question here (though this would probably be a sizable change). I'm working on this change for PCI devices right now, but it's slow going due to some other external things (OLS paper that I am woefully behind on, etc...) thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6 patch] the scheduled -EINVAL for invalid timevals in setitimer
On Sun, 15 Apr 2007 05:29:22 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> On Sat, 2007-04-14 at 17:03 +0200, Adrian Bunk wrote:
> > As scheduled, do_setitimer() now returns -EINVAL for invalid timeval.
> >
> > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>
> Acked-by: Thomas Gleixner <[EMAIL PROTECTED]>

Worried-about-by: me

I guess if it starts biting people we can revert it from 2.6.22.x.
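The thread does not spell out what counts as an invalid timeval. A minimal sketch, assuming the usual canonical form (non-negative seconds, microseconds in the range 0 to 999999), might look like the following. struct tv, struct itv, tv_valid() and check_itimer() are hypothetical names used here for illustration only.

```c
#include <errno.h>

#define USEC_PER_SEC 1000000L

/* Hypothetical stand-ins for struct timeval and struct itimerval. */
struct tv {
	long tv_sec;
	long tv_usec;
};

struct itv {
	struct tv it_interval;
	struct tv it_value;
};

/* Nonzero if the value is in canonical form (assumed validity rule). */
int tv_valid(const struct tv *t)
{
	if (t->tv_sec < 0)
		return 0;
	if (t->tv_usec < 0 || t->tv_usec >= USEC_PER_SEC)
		return 0;
	return 1;
}

/* Reject the whole request with -EINVAL if either member is invalid. */
int check_itimer(const struct itv *v)
{
	if (!tv_valid(&v->it_interval) || !tv_valid(&v->it_value))
		return -EINVAL;
	return 0;
}
```

Programs that previously got away with passing, say, tv_usec == 1000000 are the ones that would start seeing failures, which is presumably the worry above.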
Re: Fwd: Fw: [2.6.20.4] BUG: dentry xattrs still in use in shrink_dcache_for_umount() with reiserfs
On Wed, 18 Apr 2007 11:00:00 -0400 Jeff Mahoney <[EMAIL PROTECTED]> wrote: > > Do you think that could be a reason of the extra reference count on > > xattr_root dentry? > > No, I don't think it is. Looking at the code now, it seems obvious, but > I didn't notice it before and nobody else has reported a problem. > > getxattr() doesn't require any VFS locking. When we get down into the > reiserfs code, it takes a read lock. If we get two concurrent threads > looking up an xattr before the root has been saved, there's a window > where REISERFS_SB(s)->xattr_root is NULL but we've already looked it up > and taken a reference on it. > > I have a patch set to clean up the extended attribute code that fixes > this problem along the way by killing off the xattr locks and using the > backing files/dirs i_mutex instead. I'll post them to the reiserfs > mailing list. Do we have anything suitable for 2.6.21 which will address this crash? Also, it's not clear to me how many users we can expect to be impacted by it. I assume that if the same bug is in 2.6.20 then the answer is "not many". How come Andrea is able to keep hitting it? Thanks. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RFD] alternative kobject release wait mechanism
On Fri, Apr 20, 2007 at 11:40:39AM -0400, Alan Stern wrote:
> Here's a patch to do what I mentioned earlier. Not tested -- it may
> expose some existing bugs. It may even break something, but I'm not aware
> of anything that depends on it explicitly.
>
> Greg, do you know of anything in particular that depends on a kobjects not
> being released before their children are released?

Yes, the whole driver model :)

When adding a new device, we always grab a reference to the parent device
so it can not go away before we do. Look at the last
kobject_put(parent); in kobject_cleanup() which ensures this.

> Index: usb-2.6/lib/kobject.c
> ===
> --- usb-2.6.orig/lib/kobject.c
> +++ usb-2.6/lib/kobject.c
> @@ -192,12 +192,15 @@ void kobject_init(struct kobject * kobj)
>
>  static void unlink(struct kobject * kobj)
>  {
> +	struct kobject *parent = kobj->parent;
> +
>  	if (kobj->kset) {
>  		spin_lock(&kobj->kset->list_lock);
>  		list_del_init(&kobj->entry);
>  		spin_unlock(&kobj->kset->list_lock);
>  	}
>  	kobject_put(kobj);
> +	kobject_put(parent);
>  }
>
>  /**
> @@ -241,7 +244,6 @@ int kobject_shadow_add(struct kobject *
>  	if (error) {
>  		/* unlink does the kobject_put() for us */
>  		unlink(kobj);
> -		kobject_put(parent);
>
>  		/* be noisy on error issues */
>  		if (error == -EEXIST)
> @@ -489,7 +491,6 @@ void kobject_cleanup(struct kobject * ko
>  {
>  	struct kobj_type * t = get_ktype(kobj);
>  	struct kset * s = kobj->kset;
> -	struct kobject * parent = kobj->parent;
>
>  	pr_debug("kobject %s: cleaning up\n",kobject_name(kobj));
>  	if (kobj->k_name != kobj->name)
> @@ -505,7 +506,6 @@ void kobject_cleanup(struct kobject * ko
>
>  	if (s)
>  		kset_put(s);
> -	kobject_put(parent);
>  }

Ick, no, I think this used to be the way things worked, but bad things
would end up happening, so we fixed it up to be the way things are
today. Read the comments for the changelog for this file for details.
Specifically, look at commit 10921a8f1305b8ec97794941db78b825db5839bc
in the history.git repo which is almost exactly what you are proposing
to be reverted...

thanks,

greg k-h
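To make the ordering argument concrete, here is a small user-space model of the rule Greg describes: a child takes a reference on its parent when it is set up and only drops it during its own cleanup. This is a sketch with hypothetical names (struct obj, obj_get(), obj_put()), not the kobject code.

```c
#include <stddef.h>

/* Hypothetical miniature of a reference-counted object with a parent. */
struct obj {
	int refcount;
	int released;		/* set once the cleanup step has run */
	struct obj *parent;
};

struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

void obj_init(struct obj *o, struct obj *parent)
{
	o->refcount = 1;
	o->released = 0;
	o->parent = obj_get(parent);	/* the child pins its parent */
}

void obj_put(struct obj *o)
{
	if (!o)
		return;
	if (--o->refcount == 0) {
		struct obj *parent = o->parent;

		o->released = 1;
		/* Drop the parent reference last, as part of cleanup. */
		obj_put(parent);
	}
}
```

In this model, dropping the creator's reference on the parent while a child still exists leaves the parent alive; it is only released once the last child has been cleaned up. Moving the parent put earlier, as in the quoted patch, would give up that guarantee.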
Re: [2/2] 2.6.21-rc7: known regressions
Dave Jones wrote: > On Fri, Apr 20, 2007 at 10:16:54AM -0700, Jeremy Fitzhardinge wrote: > > Dave Jones wrote: > > > > Andi, I think. I've got his firstfloor.org patches applied to this > kernel. > > > > > > Ah, I saw you patched in CFS too, and thought it may be related. > > > > > > > Well, I have CONFIG_FB_BACKLIGHT enabled, and it still works. > > > > Maybe there's something in Andi's queue which is making it work? > > Shrug, I'm out of ideas. I'm hoping that it'll magically start working > when people start flushing their git trees for .22 > Maybe that'll yield a clue for something that can be backported to .21.x > because right now, I'm completely puzzled. Well, it seemed reliable, but I just got a resume failure. Different from any previous symptom I've seen: Intel machine check architecture supported Intel machine check reporting enabled enabled on CPU#1 Back to C! J
Re: [PATCH] lazy freeing of memory through MADV_FREE
Andrew Morton wrote:
On Fri, 20 Apr 2007 17:38:06 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:
Andrew Morton wrote:

I've also merged Nick's "mm: madvise avoid exclusive mmap_sem".

- Nick's patch also will help this problem. It could be that your patch no longer offers a 2x speedup when combined with Nick's patch. It could well be that the combination of the two is even better, but it would be nice to firm that up a bit.

I'll test that.

Thanks.

Well, good news. It turns out that Nick's patch does not improve peak performance much, but it does prevent the decline when running with 16 threads on my quad core CPU!

We _definitely_ want both patches, there's a huge benefit in having them both.

Here are the transactions/second for each combination:

threads   vanilla   new glibc   madv_free kernel   madv_free + mmap_sem
1         610       609         596                545
2         1032      1136        1196               1200
4         1070      1128        2014               2024
8         1000      1088        1665               2087
16        779       1073        1310               1999

--
Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic.
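As a quick sanity check on the "prevents the decline" claim, the ratios can be worked out with a few lines of integer arithmetic. This assumes the flattened figures above split as 1070 (4 threads) and 779 (16 threads) for the vanilla kernel, and 2087 (8 threads) and 1999 (16 threads) for madv_free + mmap_sem; percent_of() is a helper made up for this note.

```c
/* value as an integer percentage of reference, rounded down. */
unsigned int percent_of(unsigned int value, unsigned int reference)
{
	return (unsigned int)(((unsigned long)value * 100UL) / reference);
}
```

Under that reading, percent_of(779, 1070) gives 72, so vanilla at 16 threads keeps about 72% of its peak, while percent_of(1999, 2087) gives 95 for the combined patches. percent_of(1999, 779) gives 256, i.e. roughly 2.5 times the vanilla throughput at 16 threads.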
[PATCH] cxacru: Add Documentation file
The sysfs attributes for exposing cxacru statistics/status information
with possible values are now explained in
Documentation/networking/cxacru.txt including information on the
writable adsl_state attribute's commands and a sample of the kernel
log format.
---
 Documentation/networking/00-INDEX   |    2 +
 Documentation/networking/cxacru.txt |   84 +++
 2 files changed, 86 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/networking/cxacru.txt

diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
index e06b6e3..153d84d 100644
--- a/Documentation/networking/00-INDEX
+++ b/Documentation/networking/00-INDEX
@@ -32,6 +32,8 @@ cops.txt
 	- info on the COPS LocalTalk Linux driver
 cs89x0.txt
 	- the Crystal LAN (CS8900/20-based) Ethernet ISA adapter driver
+cxacru.txt
+	- Conexant AccessRunner USB ADSL Modem
 de4x5.txt
 	- the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver
 decnet.txt
diff --git a/Documentation/networking/cxacru.txt b/Documentation/networking/cxacru.txt
new file mode 100644
index 000..2623eaa
--- /dev/null
+++ b/Documentation/networking/cxacru.txt
@@ -0,0 +1,84 @@
+Firmware is required for this device: http://accessrunner.sourceforge.net/
+
+While it is capable of managing/maintaining the ADSL connection without the
+module loaded, the device will sometimes stop responding after unloading the
+driver and it is necessary to unplug/remove power to the device to fix this.
+
+Detected devices will appear as ATM devices named "cxacru". In /sys/class/atm/
+these are directories named cxacruN where N is the device number. A symlink
+named device points to the USB interface device's directory which contains
+several sysfs attribute files for retrieving device statistics:
+
+* adsl_controller_version
+
+* adsl_headend
+* adsl_headend_environment
+	Information about the remote headend.
+
+* downstream_attenuation (dB)
+* downstream_bits_per_frame
+* downstream_rate (kbps)
+* downstream_snr_margin (dB)
+	Downstream stats.
+
+* upstream_attenuation (dB)
+* upstream_bits_per_frame
+* upstream_rate (kbps)
+* upstream_snr_margin (dB)
+* transmitter_power (dBm/Hz)
+	Upstream stats.
+
+* downstream_crc_errors
+* downstream_fec_errors
+* downstream_hec_errors
+* upstream_crc_errors
+* upstream_fec_errors
+* upstream_hec_errors
+	Error counts.
+
+* line_startable
+	Indicates that ADSL support on the device
+	is/can be enabled, see adsl_start.
+
+* line_status
+	"initialising"
+	"down"
+	"attempting to activate"
+	"training"
+	"channel analysis"
+	"exchange"
+	"waiting"
+	"up"
+
+	Changes between "down" and "attempting to activate"
+	if there is no signal.
+
+* link_status
+	"not connected"
+	"connected"
+	"lost"
+
+* mac_address
+
+* modulation
+	"ANSI T1.413"
+	"ITU-T G.992.1 (G.DMT)"
+	"ITU-T G.992.2 (G.LITE)"
+
+* startup_attempts
+	Count of total attempts to initialise ADSL.
+
+To enable/disable ADSL, the following can be written to the adsl_state file:
+	"start"
+	"stop"
+	"restart" (stops, waits 1.5s, then starts)
+	"poll" (used to resume status polling if it was disabled due to failure)
+
+Changes in adsl/line state are reported via kernel log messages:
+	[4942145.150704] ATM dev 0: ADSL state: running
+	[4942243.663766] ATM dev 0: ADSL line: down
+	[4942249.665075] ATM dev 0: ADSL line: attempting to activate
+	[4942253.654954] ATM dev 0: ADSL line: training
+	[4942255.666387] ATM dev 0: ADSL line: channel analysis
+	[4942259.656262] ATM dev 0: ADSL line: exchange
+	[2635357.696901] ATM dev 0: ADSL line: up (8128 kb/s down | 832 kb/s up)
--
1.5.0.1

--
Simon Arlott
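For anyone wanting to poke at these attributes from user space, a minimal sketch follows. It assumes the layout described in the documentation above (/sys/class/atm/cxacruN/device/<attribute>) and the four adsl_state commands listed there; the function names are made up for this example and are not part of the driver.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the path of a cxacru sysfs attribute for ATM device number n:
 *   /sys/class/atm/cxacruN/device/<attr>
 * Returns 0 on success, -1 if the buffer is too small.
 */
int cxacru_attr_path(char *buf, size_t size, unsigned int n, const char *attr)
{
	int len = snprintf(buf, size, "/sys/class/atm/cxacru%u/device/%s",
			   n, attr);

	return (len < 0 || (size_t)len >= size) ? -1 : 0;
}

/* Nonzero if cmd is one of the commands documented for adsl_state. */
int cxacru_valid_state_cmd(const char *cmd)
{
	static const char *const cmds[] = { "start", "stop", "restart", "poll" };
	size_t i;

	for (i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++)
		if (strcmp(cmd, cmds[i]) == 0)
			return 1;
	return 0;
}

/* Write a command to adsl_state; returns 0 on success, -1 on failure. */
int cxacru_set_state(unsigned int n, const char *cmd)
{
	char path[128];
	FILE *f;
	int ok;

	if (!cxacru_valid_state_cmd(cmd))
		return -1;
	if (cxacru_attr_path(path, sizeof(path), n, "adsl_state") != 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(cmd, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Something like cxacru_set_state(0, "restart") would then correspond to writing "restart" to the first device's adsl_state file, which normally needs root.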
Re: Acecad USB Tablet: usbmouse takeover and odd motion
On Fri, 20 Apr 2007, Giuseppe Bilotta wrote:

> Oh, I see. I'll blacklist those modules, maybe also issue a ticket on
> the Debian BTS.

If Debian enables usbmouse and usbkbd by default in their standard kernels, would you be so kind as to raise a proper ticket asking them not to do so? Thanks.

This also prompts me to speed up one of the items on my TODO list: renaming usbmouse and usbkbd to something that wouldn't be so confusing and wouldn't make people think that they should enable these drivers if they want support for USB keyboards/mice. Will queue this for 2.6.22.

--
Jiri Kosina
SUSE Labs
Re: AGPGart / AMD K7
Final followup, If I compile EDAC out of the kernel completely, everything works now. This should be resolved though. 1) dd.c should produce some kind of warning when it wants to assign a driver to a device, but it can't because a driver is already assigned to a device ie. change: if (!dev->driver) driver_probe_device(drv, dev); to: if (!dev->driver) driver_probe_device(drv, dev); else printk(KERN_WARNING "__driver_attach (%s): alreay registered with driver %s\n", dev->bus_id, dev->driver->name); 2) Possibly a device should be able to have more than one driver associated with it - so the AGP driver and EDAC could both use the device in question here (though this would probably be a sizable change). Either way, at least I found the culprit :) PreZ :) Preston A. Elder wrote: > Dave Jones wrote: > >> On Fri, Apr 20, 2007 at 04:22:06PM -0400, Preston A. Elder wrote: >> > Dave, Greg, >> > >> > Here is the trace with 2.6.20.6 >> > >> > I added back in my trace code, as you see. As you can also see, >> > agp_amdk7_probe is still not called. >> >> Try looking down in __driver_attach() >> The fact that we're not calling the ->probe function is quite bizarre. >> >> It could be this in __driver_attach >> >> if (!dev->driver) >> driver_probe_device(drv, dev); >> >> Though that'd be odd. >> >> Putting a #define DEBUG 1 in drivers/base/dd.c may also yield some clues. >> >> Dave >> >> >> > OK, I found it! 
> > Here is more trace: > Linux agpgart interface v0.101 (c) Dave Jones > agp_amdk7_init: In function > agp_amdk7_init: Before pci_register_driver > __pci_register_driver: In Function (driver = agpgart-amdk7, multithread = 0) > __pci_register_driver: Before Spinlock > __pci_register_driver: Before Init List Head > __pci_register_driver: Before driver_register > bus_add_driver: In Function (c0492920) > bus_add_driver: Before kobject_set_name > bus_add_driver: error = 0 > bus_add_driver: Before kobject_register > bus_add_driver: error = 0 > bus_add_driver: Before driver_attach > __driver_attach (:00:00.0,1): Before Down (parent) (c21c8600) > __driver_attach (:00:00.0, 1): Before Down > __driver_attach (:00:00.0, 1): Before Probe Device (c049fe54) > __driver_attach (:00:00.0, 1): alreay registered with driver amd76x_edac > __driver_attach (:00:00.0, 1): Before Up > __driver_attach (:00:00.0, 1): Before Up (parent) (c21c8600) > __driver_attach (:00:00.0, 1): Returning 0 > bus_add_driver: error = 0 > bus_add_driver: Before klist_add_tail > bus_add_driver: Before module_add_driver > bus_add_driver: Before driver_add_attrs > bus_add_driver: error = 0 > bus_add_driver: Before add_bind_files > bus_add_driver: error = 0 > bus_add_driver: Returning 0 > __pci_register_driver: error = 0 > __pci_register_driver: Before pci_create_newid_file > __pci_register_driver: error = 0 > __pci_register_driver: Returning 0 > > I snipped some since __driver_attach is called many times. > > But the long and short is that 00:00:00 is already associated with the > 'amd76x_edac' driver, and as such will not call the agp probe call. > What is this edac, btw? > > PreZ :) > > > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
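Suggestion (1) above can be modelled in a few lines of user-space C. This is only a sketch of the control flow, with hypothetical struct drv / struct dev types and a try_attach() helper invented for the example; the real __driver_attach() also takes semaphores and calls driver_probe_device(), which is reduced to a plain assignment here.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

struct drv {
	const char *name;
};

struct dev {
	const char *bus_id;
	const struct drv *driver;	/* NULL while unbound */
};

/*
 * Try to bind drv to dev. Returns 1 if the device was unbound and is
 * now bound, 0 if it was skipped because another driver already owns
 * it. In the skipped case a warning is formatted into msg.
 */
int try_attach(struct dev *dev, const struct drv *drv, char *msg, size_t size)
{
	if (!dev->driver) {
		dev->driver = drv;	/* stands in for driver_probe_device() */
		if (size)
			msg[0] = '\0';
		return 1;
	}
	snprintf(msg, size, "attach (%s): already bound to driver %s",
		 dev->bus_id, dev->driver->name);
	return 0;
}
```

With amd76x_edac bound first, a later attempt by agpgart-amdk7 returns 0 and leaves a message naming the current owner, which is the hint that was missing from the original trace.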
Re: Permanent Kgdb integration into the kernel - lets get with it.
On Fri, 20 Apr 2007 15:51:35 -0700 Piet Delaney <[EMAIL PROTECTED]> wrote:

> > Is there any movement on this?
>
> Hi Randy:
>
> Jason Wessel <[EMAIL PROTECTED]> is currently leading yet
> another attempt at getting kgdb permanently into the kernel. Jason
> has a linux2_6_21 patch on SourceForge:
>
> http://kgdb.cvs.sourceforge.net/kgdb/kgdb-2/8250.patch?view=log=linux2_6_21_uprev
>
> and has been working with Sergei Shtylyov <[EMAIL PROTECTED]>
> recently on getting KGDBOE Netpoll patches that got lost around the
> time of Tom's attempt. Just on Monday there were a dozen postings to
> the source forge mailing list:
>
>
>
> [EMAIL PROTECTED]
>
> on this effort.

Unfortunately there doesn't seem to be anything there which I can autopull into the -mm tree. I'm presently set up for quilt trees in open directories and for git trees. I could add plain-old-gzipped-diffs or whatever.

> I'd like to see a git repository on kernel.org that is used to update
> the mainstream kernel. Unfortunately getting accounts on kernel.org is
> next to impossible. If Jason doesn't have one yet it would be nice to
> offer him one for the kgdb developers to use.

This seems to be to do with some silly spamfilter issue or something. I sent an email off-list.

> I agree with Andi that the kgdb code seems to be getting more
> complicated than needed, though I don't find the hooks offensive.
> Here I keep my kgdb hooks completely under #ifdef KGDB, so there
> is absolutely no difference to the kernel when KGDB isn't configured.

We can address that sort of thing via review: you send the diffs out, we read them and comment on them. We do this 100 times a day - it is simply a non-issue, as long as there's actually someone who has the time/effort/inclination to push this feature to completion, which is what kgdb has sadly lacked for the past decade or so.

> I also like having debug printks, similar to the SOCK_DEBUG() macros,
> to make it easy to watch kgdb internals in action. Ya can't run kgdb
> on itself.
>
> I find these blemishes a minor concern compared to the damage/loss
> of not integrating kgdb into the kernel permanently. When developers
> can't rely on using kgdb for easy development they tend to write code
> without consideration for what it's like using their code with the
> debugger. Linux is making a major headway into $100 embedded systems;
> the recent use in the Linksys WRT54GL (DD-WRT) and Engenius EOC-3220
> for example. Making kgdb easily accessible will make the viability of
> using Linux for embedded system greatly increased, IMHO.

Lots of people want kgdb. One person is famously less keen on it, but we'll be able to talk him around, as long as the patches aren't daft.

> Perhaps with a bit of support from the kernel.org folks we could get
> the kgdb patch, with all of its blemishes, into Andrew's 2.6.21-rc7-mm1
> patch. Accounts on kernel.org for kgdb developers would be a modest
> effort. I find the CVS patch framework rather clumsy and would rather
> follow the KISS principle and just use git repositories like the rest
> of the kernel developers appear to be using.

Yes, a git tree on k.org is appropriate - let's make that happen.

But beware that it will need to be updated pretty regularly, and it'll need to be against Linux-tree-of-the-day, please. The x86 code continues to change at a tremendous rate (I count 204 x86 patches queued for 2.6.22), and kgdb supports lots of other architectures too.

So whoever is signed up to maintain this will have quite a bit of messy maintenance to do during the getting-it-ready phase. This will of course be minimised by being VERY careful to minimise the impact of the patchset on the existing code.

And if/when it is merged, there will be quite a bit of tricky maintenance work to do, if my experience of the kgdb stub is anything to go by. There inevitably will be long-term low-level impact upon the arch maintainers too.

But that's OK, because the way to merge this feature is to put the arch-neutral core into the tree first, and to then send each per-arch patch to the relevant arch maintainer for merging. That way, they get to decide whether they wish to take on the burden of participating in its long-term maintenance.
Re: AGPGart / AMD K7
On Fri, Apr 20, 2007 at 04:33:42PM -0400, Dave Jones wrote:
> On Fri, Apr 20, 2007 at 04:22:06PM -0400, Preston A. Elder wrote:
> > Dave, Greg,
> >
> > Here is the trace with 2.6.20.6
> >
> > I added back in my trace code, as you see. As you can also see,
> > agp_amdk7_probe is still not called.
>
> Try looking down in __driver_attach()
> The fact that we're not calling the ->probe function is quite bizarre.
>
> It could be this in __driver_attach
>
> 	if (!dev->driver)
> 		driver_probe_device(drv, dev);
>
> Though that'd be odd.
>
> Putting a #define DEBUG 1 in drivers/base/dd.c may also yield some clues.

Setting CONFIG_DEBUG_DRIVER automatically enables this also and might
provide some more hints.

thanks,

greg k-h
Re: AGPGart / AMD K7
On Fri, Apr 20, 2007 at 03:00:28PM -0400, Dave Jones wrote:
> On Fri, Apr 20, 2007 at 11:29:52AM -0700, Greg Kroah-Hartman wrote:
> > On Fri, Apr 20, 2007 at 02:20:29PM -0400, Dave Jones wrote:
> > >
> > > btw Greg, wtf does driver_register return a 0 as 'success' if it
> > > completes the function, and 0 as 'failure' if !bus ?
> > > That seems doomed to failure.
> >
> > I don't know why the code does that, we should always have a bus
> > assigned to a driver. I'll change that and watch to see what breaks :)
>
> Maybe this?
>
> We should always have a bus in bus_add_driver()
> Instead of returning success when we don't, BUG().

Nah, I don't like adding BUG() calls to the kernel if it can be helped,
how about the version I copied you on a few hours ago, which is also
below?

thanks,

greg k-h

--
From: Greg Kroah-Hartman <[EMAIL PROTECTED]>
Subject: driver core: bus_add_driver should return an error if no bus

As pointed out by Dave Jones.

Cc: Dave Jones <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/base/bus.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -601,7 +601,7 @@ int bus_add_driver(struct device_driver
 	int error = 0;

 	if (!bus)
-		return 0;
+		return -EINVAL;

 	pr_debug("bus %s: add driver %s\n", bus->name, drv->name);
 	error = kobject_set_name(&drv->kobj, "%s", drv->name);
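The reason the one-line change matters is easiest to see from the caller's side: when "no bus" and "registered fine" both come back as 0, the caller cannot tell them apart. A minimal user-space sketch, with hypothetical struct bus / struct driver types and an add_driver() helper standing in for bus_add_driver():

```c
#include <errno.h>
#include <stddef.h>

struct bus {
	const char *name;
	int nr_drivers;
};

struct driver {
	const char *name;
	struct bus *bus;
};

/* Returns 0 on success, -EINVAL if the driver has no bus assigned. */
int add_driver(struct driver *drv)
{
	if (!drv->bus)
		return -EINVAL;	/* with "return 0" this looked like success */
	drv->bus->nr_drivers++;
	return 0;
}
```

A caller that checks the return value now sees the orphaned driver as a failure instead of silently continuing with a driver that was never added to any bus.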
other potentially deletable, dead stuff
and while len is working on detaching APM and ACPI from legacy power
management, here's the short list of other stuff that is listed as on
its way to being dead, based on the contents of Kconfig files. any
of this stuff candidates for removal, if not scheduling for removal?

(note: i made no effort to cull this list of entries that i know
folks are already aware of or might be working on, or stuff that we've
already established is *not* really obsolete. it's just a list.)

config NET_CLS_POLICE
	bool "Traffic Policing (obsolete)"

config IP_NF_CONNTRACK_SUPPORT
	bool "Layer 3 Dependent Connection tracking (OBSOLETE)"

config IP6_NF_QUEUE
	tristate "IP6 Userspace queueing via NETLINK (OBSOLETE)"

config IP_NF_QUEUE
	tristate "IP Userspace queueing via NETLINK (OBSOLETE)"

config ARPD
	bool "IP: ARP daemon support (EXPERIMENTAL)"
	...
	This code is experimental and also obsolete...

config BRIDGE_EBT_ULOG
	tristate "ebt: ulog support (OBSOLETE)"

config PCMCIA_IOCTL
	bool "PCMCIA control ioctl (obsolete)"

config SHAPER
	tristate "Traffic Shaper (OBSOLETE)"

config SUN_BPP
	tristate "Bidirectional parallel port support (OBSOLETE)"

config I2O_CONFIG_OLD_IOCTL
	bool "Enable ioctls (OBSOLETE)"

config MOXA_SMARTIO
	tristate "Moxa SmartIO support (OBSOLETE)"

config RAW_DRIVER
	tristate "RAW driver (/dev/raw/rawN) (OBSOLETE)"

config ISDN_I4L
	tristate "Old ISDN4Linux (obsolete)"

config MODE_TT
	bool "Tracing thread support (DEPRECATED)"
	...
	This option controls whether tracing thread support is compiled
	into UML. This option is largely obsolete ...

rday

--
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page
Re: [PATCH] fix ext2 allocator overflows above 31 bit blocks
Andreas Dilger wrote:
> On Apr 20, 2007 12:10 -0500, Eric Sandeen wrote:
> > If ext3 can do 16T, ext2 probably should be able to as well.
> > There are still "int" block containers in the block allocation path
> > that need to be fixed up.
>
> Yeah, but who wants to do 16TB e2fsck on every boot? I think there needs
> to be some limits imposed for the sake of usability.

I figure this is in the fine tradition of "enough rope to hang oneself"

If you have 16T of filesystem you probably know enough to not hang yourself this way.

*shrug*

It's a bug, today. If we need another change to limit ext2 to 500G or something, fine by me. :)

-Eric
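The arithmetic behind "above 31 bit blocks" can be checked quickly. The sketch below assumes a 4 KiB block size, which the thread does not state; with that assumption a 32-bit block number covers 16T, while a signed int runs out at 8T. The helper names are made up for this note.

```c
#include <stdint.h>

/* Largest size in bytes addressable with `bits` bits of block number. */
uint64_t max_fs_bytes(unsigned int bits, uint32_t block_size)
{
	return ((uint64_t)1 << bits) * block_size;
}

/* Nonzero if a 32-bit block number still fits in a signed 32-bit int. */
int fits_signed_int(uint32_t block)
{
	return block <= (uint32_t)INT32_MAX;
}
```

So any "int" container left in the allocation path starts misbehaving for block numbers from 0x80000000 upward, which is the half of a 16T filesystem that lies beyond 8T.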
Re: [PATCH] NET: Add packet sock option to return orig_dev to userspace when bonded
From: "Waskiewicz Jr, Peter P" <[EMAIL PROTECTED]>
Date: Thu, 8 Mar 2007 18:17:39 -0800

> Peter P. Waskiewicz Jr. <[EMAIL PROTECTED]>
> NET: Add packet sock option to return orig_dev to userspace when
> bonded

I'm going to apply this patch (by hand, your email client corrupted the patch massively, adding newlines and whatnot all over the patch).

But I'm going to rename the option to be just PACKET_ORIGDEV because although bonding is the only user of this orig_dev decapsulation method, that might not always be true and it'd be a shame to give a special cased name when it is not deserved here.

But please do fix your email client for future patch submissions.

Thanks.
Re: Permanent Kgdb integration into the kernel - lets get with it.
On Tue, 2007-04-17 at 11:30 -0700, Randy Dunlap wrote:
> On Thu, 08 Mar 2007 14:24:10 -0800 Piet Delaney wrote:
>
> > On Thu, 2007-03-08 at 11:49 -0700, Tom Rini wrote:
> > > On Thu, Mar 08, 2007 at 07:37:56PM +0100, Andi Kleen wrote:
> > > > On Thursday 08 March 2007 18:44, Dave Jiang wrote:
> > > >
> > > > > In spite of kgdb, shouldn't it have that \n anyways in case some other code
> > > > > gets added in the future after the macro? Or are you saying that there should
> > > > > never be any code ever after that macro?
> > > >
> > > > Sure if there is mainline code added after that macro we add the \n.
> > > > But only if it makes sense to add code there, which it didn't in kgdb.
> > >
> > > Was that because with recent enough tools and config options there was
> > > enough annotations so GDB could finally figure out where things had
> > > stopped? Thanks.
> >
> > The reason Linus said he didn't allow George's kgdb mm patch to
> > be integrated into the kernel a year or two ago was that Amit and
> > George had significantly different implementations. So Amit, Tom,
> > George, and the rest of the kgdb development gang worked together
> > and came up with a unified version that we now support on SourceForge.
> >
> > Tom rolled up a mm patch back in December for Andrew and then the
> > integration process stopped. I suggest we work together on getting
> > the kgdb patch back into the mm series and permanently into the kernel
> > like the kexec code and then we can avoid this kernel development
> > obfuscation.
>
> Hi,
> Is there any movement on this?

Hi Randy:

Jason Wessel <[EMAIL PROTECTED]> is currently leading yet another attempt at getting kgdb permanently into the kernel. Jason has a linux2_6_21 patch on SourceForge:

http://kgdb.cvs.sourceforge.net/kgdb/kgdb-2/8250.patch?view=log=linux2_6_21_uprev

and has been working with Sergei Shtylyov <[EMAIL PROTECTED]> recently on getting KGDBOE Netpoll patches that got lost around the time of Tom's attempt. Just on Monday there were a dozen postings to the source forge mailing list:

[EMAIL PROTECTED]

on this effort.

I'd like to see a git repository on kernel.org that is used to update the mainstream kernel. Unfortunately getting accounts on kernel.org is next to impossible. If Jason doesn't have one yet it would be nice to offer him one for the kgdb developers to use.

I agree with Andi that the kgdb code seems to be getting more complicated than needed, though I don't find the hooks offensive. Here I keep my kgdb hooks completely under #ifdef KGDB, so there is absolutely no difference to the kernel when KGDB isn't configured. I also like having debug printks, similar to the SOCK_DEBUG() macros, to make it easy to watch kgdb internals in action. Ya can't run kgdb on itself.

I find these blemishes a minor concern compared to the damage/loss of not integrating kgdb into the kernel permanently. When developers can't rely on using kgdb for easy development they tend to write code without consideration for what it's like using their code with the debugger. Linux is making a major headway into $100 embedded systems; the recent use in the Linksys WRT54GL (DD-WRT) and Engenius EOC-3220 for example. Making kgdb easily accessible will make the viability of using Linux for embedded system greatly increased, IMHO.

Perhaps with a bit of support from the kernel.org folks we could get the kgdb patch, with all of its blemishes, into Andrew's 2.6.21-rc7-mm1 patch. Accounts on kernel.org for kgdb developers would be a modest effort. I find the CVS patch framework rather clumsy and would rather follow the KISS principle and just use git repositories like the rest of the kernel developers appear to be using.

-piet

>
> Thanks,
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***

--
Piet Delaney                          Phone: (408) 200-5256
Blue Lane Technologies                Fax:   (408) 200-5299
10450 Bubb Rd.
Cupertino, Ca. 95014                  Email: [EMAIL PROTECTED]
Re: [PATCH] fix ext2 allocator overflows above 31 bit blocks
On Apr 20, 2007 12:10 -0500, Eric Sandeen wrote:
> If ext3 can do 16T, ext2 probably should be able to as well.
> There are still "int" block containers in the block allocation path
> that need to be fixed up.

Yeah, but who wants to do 16TB e2fsck on every boot? I think there needs
to be some limits imposed for the sake of usability.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
[RFC PATCH 2/3] x86: new API for modifying CPU feature flags
x86: new API for modifying CPU feature flags

Use an API for setting/clearing CPU features, so the process
can be debugged.

Adds:
	set_cpu_feature()
	clear_cpu_feature()
	clear_all_cpu_features()

Todo:
	mask_boot_cpu_features()
	set_cpu_feature_word()
	more?

(Hardcoded printk for now, should be dprintk.)

Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>

---
 include/asm-i386/cpufeature.h |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

--- 2.6.21-rc7-d390.orig/include/asm-i386/cpufeature.h
+++ 2.6.21-rc7-d390/include/asm-i386/cpufeature.h
@@ -108,6 +108,23 @@
 #define cpu_has(c, bit)		test_bit(bit, (c)->x86_capability)
 #define boot_cpu_has(bit)	test_bit(bit, boot_cpu_data.x86_capability)
 
+#define alter_cpu_feature(fn, feat, c)					\
+	do {typeof(c) __c = (c);					\
+		printk("CPU: %s: %s feature %s for CPU %p",		\
+			__func__, #fn, #feat, __c);			\
+		fn ## _bit(X86_FEATURE_ ## feat, __c->x86_capability);	\
+	} while (0)
+
+#define set_cpu_feature(f, c)	alter_cpu_feature(set, f, c)
+#define clear_cpu_feature(f, c)	alter_cpu_feature(clear, f, c)
+
+#define clear_all_cpu_features(c)					\
+	do {typeof(c) __c = (c);					\
+		printk("CPU: %s: clearing all capabilities for CPU %p",	\
+			__func__, __c);					\
+		memset(&__c->x86_capability, 0, sizeof __c->x86_capability); \
+	} while (0)
+
 #define cpu_has_fpu		boot_cpu_has(X86_FEATURE_FPU)
 #define cpu_has_vme		boot_cpu_has(X86_FEATURE_VME)
 #define cpu_has_de		boot_cpu_has(X86_FEATURE_DE)
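A user-space sketch of the same macro pattern may make the token pasting easier to follow. It is not the patch itself: struct cpuinfo, NCAPINTS = 4, the non-atomic set_bit()/clear_bit()/test_bit() helpers, the init_model0() function and the use of fprintf(stderr, ...) in place of printk are all stand-ins assumed for illustration. The APIC and PGE numbers follow their CPUID.1:EDX bit positions (9 and 13).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NCAPINTS 4	/* assumed number of 32-bit capability words */

/* Feature numbers written as word*32 + bit, as in cpufeature.h. */
#define X86_FEATURE_APIC	(0*32 + 9)
#define X86_FEATURE_PGE		(0*32 + 13)

/* Stand-in for struct cpuinfo_x86; only the capability bitmap is modelled. */
struct cpuinfo {
	uint32_t x86_capability[NCAPINTS];
};

/* Non-atomic stand-ins for the kernel bit helpers. */
static inline void set_bit(int nr, uint32_t *addr)
{
	addr[nr / 32] |= UINT32_C(1) << (nr % 32);
}

static inline void clear_bit(int nr, uint32_t *addr)
{
	addr[nr / 32] &= ~(UINT32_C(1) << (nr % 32));
}

static inline int test_bit(int nr, const uint32_t *addr)
{
	return (addr[nr / 32] >> (nr % 32)) & 1;
}

/* Paste "set" or "clear" onto "_bit" and log who changed which flag. */
#define alter_cpu_feature(fn, feat, c)					\
	do {								\
		__typeof__(c) cpu_ = (c);				\
		fprintf(stderr, "CPU: %s: %s feature %s for CPU %p\n",	\
			__func__, #fn, #feat, (void *)cpu_);		\
		fn ## _bit(X86_FEATURE_ ## feat, cpu_->x86_capability);	\
	} while (0)

#define set_cpu_feature(f, c)	alter_cpu_feature(set, f, c)
#define clear_cpu_feature(f, c)	alter_cpu_feature(clear, f, c)

#define clear_all_cpu_features(c)					\
	do {								\
		__typeof__(c) cpu_ = (c);				\
		fprintf(stderr, "CPU: %s: clearing all capabilities for CPU %p\n", \
			__func__, (void *)cpu_);			\
		memset(cpu_->x86_capability, 0,				\
		       sizeof cpu_->x86_capability);			\
	} while (0)

/* Hypothetical caller shaped like the init_amd() hunk in patch 3/3. */
static void init_model0(struct cpuinfo *c)
{
	clear_cpu_feature(APIC, c);
	set_cpu_feature(PGE, c);
}
```

Because every change goes through one macro, each call leaves a log line naming the calling function and the flag, which is the tracing the changelog asks for.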
[RFC PATCH 3/3] x86: use the x86 CPU feature API
x86: use the x86 CPU feature API

Just a small demo for now.

Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>

---
 arch/i386/kernel/cpu/amd.c    |    4 ++--
 arch/i386/kernel/cpu/common.c |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

--- 2.6.21-rc7-d390.orig/arch/i386/kernel/cpu/amd.c
+++ 2.6.21-rc7-d390/arch/i386/kernel/cpu/amd.c
@@ -109,8 +109,8 @@ static void __cpuinit init_amd(struct cp
 				{
 					/* Based on AMD doc 20734R - June 2000 */
 					if ( c->x86_model == 0 ) {
-						clear_bit(X86_FEATURE_APIC, c->x86_capability);
-						set_bit(X86_FEATURE_PGE, c->x86_capability);
+						clear_cpu_feature(APIC, c);
+						set_cpu_feature(PGE, c);
 					}
 					break;
 				}
--- 2.6.21-rc7-d390.orig/arch/i386/kernel/cpu/common.c
+++ 2.6.21-rc7-d390/arch/i386/kernel/cpu/common.c
@@ -381,7 +381,7 @@ void __cpuinit identify_cpu(struct cpuin
 	c->x86_model_id[0] = '\0';  /* Unset */
 	c->x86_max_cores = 1;
 	c->x86_clflush_size = 32;
-	memset(&c->x86_capability, 0, sizeof c->x86_capability);
+	clear_all_cpu_features(c);
 
 	if (!have_cpuid_p()) {
 		/* First of all, decide if this is a 486 or higher */
[RFC PATCH 1/3] x86: use defined names for all CPU feature flags
x86: use defined names for all CPU feature flags

Don't use hard coded values for CPU flags.

Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>

---
 arch/i386/kernel/cpu/amd.c      |    2 +-
 arch/i386/kernel/cpu/centaur.c  |    2 +-
 arch/i386/kernel/cpu/cyrix.c    |    6 +++---
 arch/x86_64/kernel/setup.c      |    2 +-
 include/asm-i386/cpufeature.h   |    4 +++-
 include/asm-x86_64/cpufeature.h |    1 +
 6 files changed, 10 insertions(+), 7 deletions(-)

--- 2.6.21-rc7-d390.orig/arch/x86_64/kernel/setup.c
+++ 2.6.21-rc7-d390/arch/x86_64/kernel/setup.c
@@ -576,7 +576,7 @@ static void __cpuinit init_amd(struct cp
 
 	/* Bit 31 in normal CPUID used for nonstandard 3DNow ID;
 	   3DNow is IDd by bit 31 in extended CPUID (1*32+31) anyway */
-	clear_bit(0*32+31, &c->x86_capability);
+	clear_bit(X86_FEATURE_PBE, &c->x86_capability);
 
 	/* On C+ stepping K8 rep microcode works well for copy/memset */
 	level = cpuid_eax(1);
--- 2.6.21-rc7-d390.orig/include/asm-i386/cpufeature.h
+++ 2.6.21-rc7-d390/include/asm-i386/cpufeature.h
@@ -42,6 +42,7 @@
 #define X86_FEATURE_HT		(0*32+28) /* Hyper-Threading */
 #define X86_FEATURE_ACC		(0*32+29) /* Automatic clock control */
 #define X86_FEATURE_IA64	(0*32+30) /* IA-64 processor */
+#define X86_FEATURE_PBE		(0*32+31) /* PBE */
 
 /* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
 /* Don't duplicate feature flags which are redundant with Intel! */
@@ -49,6 +50,7 @@
 #define X86_FEATURE_MP		(1*32+19) /* MP Capable. */
 #define X86_FEATURE_NX		(1*32+20) /* Execute Disable */
 #define X86_FEATURE_MMXEXT	(1*32+22) /* AMD MMX extensions */
+#define X86_FEATURE_CXMMXORIG	(1*32+24) /* Cyrix MMX, initial location */
 #define X86_FEATURE_LM		(1*32+29) /* Long Mode (x86-64) */
 #define X86_FEATURE_3DNOWEXT	(1*32+30) /* AMD 3DNow! extensions */
 #define X86_FEATURE_3DNOW	(1*32+31) /* 3DNow! */
@@ -60,7 +62,7 @@
 
 /* Other features, Linux-defined mapping, word 3 */
 /* This range is used for feature bits which conflict or are synthesized */
-#define X86_FEATURE_CXMMX	(3*32+ 0) /* Cyrix MMX extensions */
+#define X86_FEATURE_CXMMX	(3*32+ 0) /* Cyrix MMX extensions, final location */
 #define X86_FEATURE_K6_MTRR	(3*32+ 1) /* AMD K6 nonstandard MTRRs */
 #define X86_FEATURE_CYRIX_ARR	(3*32+ 2) /* Cyrix ARRs (= MTRRs) */
 #define X86_FEATURE_CENTAUR_MCR	(3*32+ 3) /* Centaur MCRs (= MTRRs) */
--- 2.6.21-rc7-d390.orig/arch/i386/kernel/cpu/cyrix.c
+++ 2.6.21-rc7-d390/arch/i386/kernel/cpu/cyrix.c
@@ -192,11 +192,11 @@ static void __cpuinit init_cyrix(struct 
 
 	/* Bit 31 in normal CPUID used for nonstandard 3DNow ID;
 	   3DNow is IDd by bit 31 in extended CPUID (1*32+31) anyway */
-	clear_bit(0*32+31, c->x86_capability);
+	clear_bit(X86_FEATURE_PBE, c->x86_capability);
 
 	/* Cyrix used bit 24 in extended (AMD) CPUID for Cyrix MMX extensions */
-	if ( test_bit(1*32+24, c->x86_capability) ) {
-		clear_bit(1*32+24, c->x86_capability);
+	if ( test_bit(X86_FEATURE_CXMMXORIG, c->x86_capability) ) {
+		clear_bit(X86_FEATURE_CXMMXORIG, c->x86_capability);
 		set_bit(X86_FEATURE_CXMMX, c->x86_capability);
 	}
 
--- 2.6.21-rc7-d390.orig/include/asm-x86_64/cpufeature.h
+++ 2.6.21-rc7-d390/include/asm-x86_64/cpufeature.h
@@ -40,6 +40,7 @@
 #define X86_FEATURE_HT		(0*32+28) /* Hyper-Threading */
 #define X86_FEATURE_ACC		(0*32+29) /* Automatic clock control */
 #define X86_FEATURE_IA64	(0*32+30) /* IA-64 processor */
+#define X86_FEATURE_PBE		(0*32+31) /* PBE */
 
 /* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
 /* Don't duplicate feature flags which are redundant with Intel! */
--- 2.6.21-rc7-d390.orig/arch/i386/kernel/cpu/amd.c
+++ 2.6.21-rc7-d390/arch/i386/kernel/cpu/amd.c
@@ -83,7 +83,7 @@ static void __cpuinit init_amd(struct cp
 
 	/* Bit 31 in normal CPUID used for nonstandard 3DNow ID;
 	   3DNow is IDd by bit 31 in extended CPUID (1*32+31) anyway */
-	clear_bit(0*32+31, c->x86_capability);
+	clear_bit(X86_FEATURE_PBE, c->x86_capability);
 
 	r = get_model_name(c);
 
--- 2.6.21-rc7-d390.orig/arch/i386/kernel/cpu/centaur.c
+++ 2.6.21-rc7-d390/arch/i386/kernel/cpu/centaur.c
@@ -334,7 +334,7 @@ static void __cpuinit init_centaur(struc
 
 	/* Bit 31 in normal CPUID used for nonstandard 3DNow ID;
 	   3DNow is IDd by bit 31 in extended CPUID (1*32+31) anyway */
-	clear_bit(0*32+31, c->x86_capability);
+	clear_bit(X86_FEATURE_PBE, c->x86_capability);
 
 	switch (c->x86) {
[RFC PATCH 0/3] Clean up x86 CPU feature setup
x86 CPU feature flag setup has become impossible to debug. Every user just does set_bit()/clear_bit() or writes the entire set to change the flags, so there's no way to trace how they're being set. This patchset creates an API and debug messages for tracking how the flags get set. It's not nearly done, but I want to know whether or not to continue.
Re: [PATCH v3] hpet: Detect hidden HPET on NVidia motherboards
On Wed, 18 Apr 2007 00:57:48 +0300 Mikko Tiihonen <[EMAIL PROTECTED]> wrote:

> Enables HPET for NVidia motherboards with broken BIOS. The patch reads
> the HPET address from the pci config space. The patch should also work
> if ACPI is disabled.
>
> The new quirk activates use of HPET only run if
>  - CONFIG_HPET_NFORCE_DETECT is enabled
>  - nohpet boot option is not set
>  - main chipset is from NVidia
>  - ACPI tables do not list HPET
>  - matching PCI ID for device with HPET is found
>  - BIOS has set up the HPET to some address
>  - there is no other resource allocated at the HPET address
>
> This is true at least for some Asus, Gigabyte and DFI motherboards.
>
> Patch is against 2.6.21-rc6-git7 but should apply cleanly to most
> kernels.

I looked at applying this but a) there have been rather a lot of
underlying changes in Andi's devel tree and b) we still haven't heard from
Andy?
[PATCH 3/6] [RFC]mlx4_core public includes
Include files for hardware/firmware information and interface of mlx4_core module for protocol-specific drivers (such as mlx4_ib). Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- cmd.h | 178 + cq.h | 123 +++ device.h | 323 + doorbell.h | 97 ++ driver.h | 59 +++ qp.h | 288 ++ srq.h | 42 +++ 7 files changed, 1110 insertions(+) diff --git a/include/linux/mlx4/cmd.h b/include/linux/mlx4/cmd.h new file mode 100644 index 000..4fb552d --- /dev/null +++ b/include/linux/mlx4/cmd.h @@ -0,0 +1,178 @@ +/* + * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#ifndef MLX4_CMD_H +#define MLX4_CMD_H + +#include + +enum { + /* initialization and general commands */ + MLX4_CMD_SYS_EN = 0x1, + MLX4_CMD_SYS_DIS = 0x2, + MLX4_CMD_MAP_FA = 0xfff, + MLX4_CMD_UNMAP_FA= 0xffe, + MLX4_CMD_RUN_FW = 0xff6, + MLX4_CMD_MOD_STAT_CFG= 0x34, + MLX4_CMD_QUERY_DEV_CAP = 0x3, + MLX4_CMD_QUERY_FW= 0x4, + MLX4_CMD_ENABLE_LAM = 0xff8, + MLX4_CMD_DISABLE_LAM = 0xff7, + MLX4_CMD_QUERY_DDR = 0x5, + MLX4_CMD_QUERY_ADAPTER = 0x6, + MLX4_CMD_INIT_HCA= 0x7, + MLX4_CMD_CLOSE_HCA = 0x8, + MLX4_CMD_INIT_PORT = 0x9, + MLX4_CMD_CLOSE_PORT = 0xa, + MLX4_CMD_QUERY_HCA = 0xb, + MLX4_CMD_SET_PORT= 0xc, + MLX4_CMD_ACCESS_DDR = 0x2e, + MLX4_CMD_MAP_ICM = 0xffa, + MLX4_CMD_UNMAP_ICM = 0xff9, + MLX4_CMD_MAP_ICM_AUX = 0xffc, + MLX4_CMD_UNMAP_ICM_AUX = 0xffb, + MLX4_CMD_SET_ICM_SIZE= 0xffd, + + /* TPT commands */ + MLX4_CMD_SW2HW_MPT = 0xd, + MLX4_CMD_QUERY_MPT = 0xe, + MLX4_CMD_HW2SW_MPT = 0xf, + MLX4_CMD_READ_MTT= 0x10, + MLX4_CMD_WRITE_MTT = 0x11, + MLX4_CMD_SYNC_TPT= 0x2f, + + /* EQ commands */ + MLX4_CMD_MAP_EQ = 0x12, + MLX4_CMD_SW2HW_EQ= 0x13, + MLX4_CMD_HW2SW_EQ= 0x14, + MLX4_CMD_QUERY_EQ= 0x15, + + /* CQ commands */ + MLX4_CMD_SW2HW_CQ= 0x16, + MLX4_CMD_HW2SW_CQ= 0x17, + MLX4_CMD_QUERY_CQ= 0x18, + MLX4_CMD_RESIZE_CQ = 0x2c, + + /* SRQ commands */ + MLX4_CMD_SW2HW_SRQ = 0x35, + MLX4_CMD_HW2SW_SRQ = 0x36, + MLX4_CMD_QUERY_SRQ = 0x37, + MLX4_CMD_ARM_SRQ = 0x40, + + /* QP/EE commands */ + MLX4_CMD_RST2INIT_QP = 0x19, + MLX4_CMD_INIT2RTR_QP = 0x1a, + MLX4_CMD_RTR2RTS_QP = 0x1b, + MLX4_CMD_RTS2RTS_QP = 0x1c, + MLX4_CMD_SQERR2RTS_QP= 0x1d, + MLX4_CMD_2ERR_QP = 0x1e, + MLX4_CMD_RTS2SQD_QP = 0x1f, + MLX4_CMD_SQD2SQD_QP = 0x38, + MLX4_CMD_SQD2RTS_QP = 0x20, + MLX4_CMD_2RST_QP = 0x21, + MLX4_CMD_QUERY_QP= 0x22, + MLX4_CMD_INIT2INIT_QP= 0x2d, + MLX4_CMD_SUSPEND_QP = 0x32, + MLX4_CMD_UNSUSPEND_QP= 0x33, + /* special QP and management commands */ + MLX4_CMD_CONF_SPECIAL_QP = 0x23, + MLX4_CMD_MAD_IFC = 0x24, + + /* multicast commands */ + MLX4_CMD_READ_MCG= 0x25, 
+ MLX4_CMD_WRITE_MCG =
[PATCH 0/6] [RFC]IB/mlx4: Mellanox ConnectX adapter driver
As promised, here is a series of patches adding the mlx4_core and mlx4_ib drivers for the new Mellanox ConnectX adapter. These patches are split up in an ad hoc way to avoid mailing list size limits, but when this driver is finally merged, I will give it to Linus to pull in a single changeset.

The full driver is also available via git from

    master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git connectx

and it is also in my for-mm patch, so Andrew will pick it up for -mm kernels automatically.

The driver is split into two kernel modules because the ConnectX adapter can be used as an InfiniBand adapter, a 1G/10G Ethernet NIC, and a Fibre Channel HBA at the same time, and so resource management and basic tasks such as issuing commands to the firmware are handled in a mlx4_core module, while everything InfiniBand-specific is in mlx4_ib. In the not-too-distant future, an mlx4_eth module that handles Ethernet NIC stuff will be released.

My goal is to merge this for 2.6.22. If you feel that would not be appropriate, please do let me know and I will hold off. And of course all criticisms, suggestions, comments, etc. are very much appreciated. My feeling is that the driver is fairly clean already (and I will do some further cleanup before merging) and seems to be reasonably usable, and I trust myself to continue cleaning things up, so there's not much to be gained by waiting a release cycle.
The overall driver is not too huge -- 11371 insertions in the diffstat:

 drivers/infiniband/Kconfig            |    2
 drivers/infiniband/Makefile           |    1
 drivers/infiniband/hw/mlx4/Kconfig    |    9
 drivers/infiniband/hw/mlx4/Makefile   |    3
 drivers/infiniband/hw/mlx4/ah.c       |  100
 drivers/infiniband/hw/mlx4/cq.c       |  525
 drivers/infiniband/hw/mlx4/doorbell.c |  215
 drivers/infiniband/hw/mlx4/mad.c      |  339
 drivers/infiniband/hw/mlx4/main.c     |  612
 drivers/infiniband/hw/mlx4/mlx4_ib.h  |  285
 drivers/infiniband/hw/mlx4/mr.c       |  184
 drivers/infiniband/hw/mlx4/qp.c       | 1263
 drivers/infiniband/hw/mlx4/srq.c      |  334
 drivers/infiniband/hw/mlx4/user.h     |   91
 drivers/net/Kconfig                   |   14
 drivers/net/Makefile                  |    1
 drivers/net/mlx4/Makefile             |    4
 drivers/net/mlx4/alloc.c              |  179
 drivers/net/mlx4/cmd.c                |  429
 drivers/net/mlx4/cq.c                 |  254
 drivers/net/mlx4/eq.c                 |  704
 drivers/net/mlx4/fw.c                 |  758
 drivers/net/mlx4/fw.h                 |  165
 drivers/net/mlx4/icm.c                |  379
 drivers/net/mlx4/icm.h                |  135
 drivers/net/mlx4/intf.c               |  142
 drivers/net/mlx4/main.c               |  939
 drivers/net/mlx4/mcg.c                |  370
 drivers/net/mlx4/mlx4.h               |  334
 drivers/net/mlx4/mr.c                 |  482
 drivers/net/mlx4/pd.c                 |  102
 drivers/net/mlx4/profile.c            |  238
 drivers/net/mlx4/qp.c                 |  270
 drivers/net/mlx4/reset.c              |  172
 drivers/net/mlx4/srq.c                |  227
 include/linux/mlx4/cmd.h              |  178
 include/linux/mlx4/cq.h               |  123
 include/linux/mlx4/device.h           |  323
 include/linux/mlx4/doorbell.h         |   97
 include/linux/mlx4/driver.h           |   59
 include/linux/mlx4/qp.h               |  288
 include/linux/mlx4/srq.h              |   42
 42 files changed, 11371 insertions(+), 0 deletions(-)
[PATCH 6/6] [RFC]mlx4 build system stuff
Hook up mlx4_core and mlx4_ib drivers to Kconfig and Makefiles.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 infiniband/Kconfig          |    2 ++
 infiniband/Makefile         |    1 +
 infiniband/hw/mlx4/Kconfig  |    9 +++++++++
 infiniband/hw/mlx4/Makefile |    3 +++
 net/Kconfig                 |   14 ++++++++++++++
 net/Makefile                |    1 +
 net/mlx4/Makefile           |    4 ++++
 7 files changed, 34 insertions(+)

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 82afba5..37deaae 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -45,6 +45,8 @@ source "drivers/infiniband/hw/ehca/Kconfig"
 source "drivers/infiniband/hw/amso1100/Kconfig"
 source "drivers/infiniband/hw/cxgb3/Kconfig"
 
+source "drivers/infiniband/hw/mlx4/Kconfig"
+
 source "drivers/infiniband/ulp/ipoib/Kconfig"
 
 source "drivers/infiniband/ulp/srp/Kconfig"
diff --git a/drivers/infiniband/Makefile b/drivers/infiniband/Makefile
index da2066c..75f325e 100644
--- a/drivers/infiniband/Makefile
+++ b/drivers/infiniband/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_INFINIBAND_IPATH)		+= hw/ipath/
 obj-$(CONFIG_INFINIBAND_EHCA)		+= hw/ehca/
 obj-$(CONFIG_INFINIBAND_AMSO1100)	+= hw/amso1100/
 obj-$(CONFIG_INFINIBAND_CXGB3)		+= hw/cxgb3/
+obj-$(CONFIG_MLX4_INFINIBAND)		+= hw/mlx4/
 obj-$(CONFIG_INFINIBAND_IPOIB)		+= ulp/ipoib/
 obj-$(CONFIG_INFINIBAND_SRP)		+= ulp/srp/
 obj-$(CONFIG_INFINIBAND_ISER)		+= ulp/iser/
diff --git a/drivers/infiniband/hw/mlx4/Kconfig b/drivers/infiniband/hw/mlx4/Kconfig
new file mode 100644
index 0000000..b8912cd
--- /dev/null
+++ b/drivers/infiniband/hw/mlx4/Kconfig
@@ -0,0 +1,9 @@
+config MLX4_INFINIBAND
+	tristate "Mellanox ConnectX HCA support"
+	depends on INFINIBAND
+	select MLX4_CORE
+	---help---
+	  This driver provides low-level InfiniBand support for
+	  Mellanox ConnectX PCI Express host channel adapters (HCAs).
+	  This is required to use InfiniBand protocols such as
+	  IP-over-IB or SRP with these devices.
diff --git a/drivers/infiniband/hw/mlx4/Makefile b/drivers/infiniband/hw/mlx4/Makefile
new file mode 100644
index 0000000..70f09c7
--- /dev/null
+++ b/drivers/infiniband/hw/mlx4/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_MLX4_INFINIBAND)	+= mlx4_ib.o
+
+mlx4_ib-y :=	ah.o cq.o doorbell.o mad.o main.o mr.o qp.o srq.o
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index c3f9f59..842f020 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2493,6 +2493,20 @@ config PASEMI_MAC
 	  This driver supports the on-chip 1/10Gbit Ethernet controller on
 	  PA Semi's PWRficient line of chips.
 
+config MLX4_CORE
+	tristate
+	depends on PCI
+	default n
+
+config MLX4_DEBUG
+	bool "Verbose debugging output" if (MLX4_CORE && EMBEDDED)
+	default y
+	---help---
+	  This option causes debugging code to be compiled into the
+	  mlx4_core driver.  The output can be turned on via the
+	  debug_level module parameter (which can also be set after
+	  the driver is loaded through sysfs).
+
 endmenu
 
 source "drivers/net/tokenring/Kconfig"
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 33af833..1604e1a 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -197,6 +197,7 @@ obj-$(CONFIG_SMC911X) += smc911x.o
 obj-$(CONFIG_DM9000) += dm9000.o
 obj-$(CONFIG_FEC_8XX) += fec_8xx/
 obj-$(CONFIG_PASEMI_MAC) += pasemi_mac.o
+obj-$(CONFIG_MLX4_CORE) += mlx4/
 
 obj-$(CONFIG_MACB) += macb.o
 
diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
new file mode 100644
index 0000000..4f18889
--- /dev/null
+++ b/drivers/net/mlx4/Makefile
@@ -0,0 +1,4 @@
+obj-$(CONFIG_MLX4_CORE)		+= mlx4_core.o
+
+mlx4_core-y :=	alloc.o cmd.o cq.o eq.o fw.o icm.o intf.o main.o mcg.o mr.o \
+		pd.o profile.o qp.o reset.o srq.o
Re: [PATCH] Check for error returned by kthread_create on creating journal thread
On Mon, 16 Apr 2007 11:41:14 +0400 Pavel Emelianov <[EMAIL PROTECTED]> wrote:

> If the thread failed to create the subsequent wait_event
> will hang forever.
>
> This is likely to happen if kernel hits max_threads limit.
>
> Will be critical for virtualization systems that limit the
> number of tasks and kernel memory usage within the container.
>
>
> [diff-jbd-check-start-journal-thread-return-value  text/plain (1.7KB)]
> --- ./fs/jbd/journal.c.jbdthreads	2007-04-16 11:17:36.0 +0400
> +++ ./fs/jbd/journal.c	2007-04-16 11:30:09.0 +0400
> @@ -211,10 +211,16 @@ end_loop:
>  	return 0;
>  }
>
> -static void journal_start_thread(journal_t *journal)
> +static int journal_start_thread(journal_t *journal)
>  {
> -	kthread_run(kjournald, journal, "kjournald");
> +	struct task_struct *t;
> +
> +	t = kthread_run(kjournald, journal, "kjournald");
> +	if (IS_ERR(t))
> +		return PTR_ERR(t);
> +
>  	wait_event(journal->j_wait_done_commit, journal->j_task != 0);
> +	return 0;
>  }

Thanks.  Please don't forget those Signed-off-by:s

I assume that you runtime tested this and that the mount failed in an
appropriate fashion?
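For readers unfamiliar with the error-pointer convention the quoted patch relies on, here is a minimal user-space sketch. The ERR_PTR()/PTR_ERR()/IS_ERR() helpers below are simplified re-implementations written for this example, and struct task, fake_kthread_run() and start_thread() are hypothetical stand-ins; none of this is the jbd code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space versions of the kernel's error-pointer helpers:
 * a small negative errno is stored in the pointer value itself. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct task {			/* hypothetical stand-in for task_struct */
	int id;
};

/* Hypothetical thread creator: returns an encoded -ENOMEM on failure. */
static struct task *fake_kthread_run(int should_fail)
{
	static struct task t = { 1 };

	return should_fail ? ERR_PTR(-ENOMEM) : &t;
}

/* Same shape as the patched journal_start_thread(): report the failure
 * to the caller instead of waiting for a thread that never started. */
static int start_thread(int should_fail)
{
	struct task *t = fake_kthread_run(should_fail);

	if (IS_ERR(t))
		return (int)PTR_ERR(t);
	/* the real code would now wait for the thread to announce itself */
	return 0;
}
```

The point of the early return is that the wait only makes sense once a thread exists to end it; with the check in place the caller gets an errno it can propagate, for example to fail the mount.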
[PATCH 4/6] [RFC]mlx4_ib main files
Main include file and .c file for mlx4_ib. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- main.c| 612 ++ mlx4_ib.h | 285 2 files changed, 897 insertions(+) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c new file mode 100644 index 000..6f7165f --- /dev/null +++ b/drivers/infiniband/hw/mlx4/main.c @@ -0,0 +1,612 @@ +/* + * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#include +#include +#include + +#include +#include + +#include +#include + +#include "mlx4_ib.h" +#include "user.h" + +#define DRV_NAME "mlx4_ib" +#define DRV_VERSION"0.01" +#define DRV_RELDATE"May 1, 2006" + +MODULE_AUTHOR("Roland Dreier"); +MODULE_DESCRIPTION("Mellanox ConnectX HCA InfiniBand driver"); +MODULE_LICENSE("Dual BSD/GPL"); +MODULE_VERSION(DRV_VERSION); + +static const char mlx4_ib_version[] __devinitdata = + DRV_NAME ": Mellanox ConnectX InfiniBand driver v" + DRV_VERSION " (" DRV_RELDATE ")\n"; + +static void init_query_mad(struct ib_smp *mad) +{ + mad->base_version = 1; + mad->mgmt_class= IB_MGMT_CLASS_SUBN_LID_ROUTED; + mad->class_version = 1; + mad->method= IB_MGMT_METHOD_GET; +} + +static int mlx4_ib_query_device(struct ib_device *ibdev, + struct ib_device_attr *props) +{ + struct mlx4_ib_dev *dev = to_mdev(ibdev); + struct ib_smp *in_mad = NULL; + struct ib_smp *out_mad = NULL; + int err = -ENOMEM; + + in_mad = kzalloc(sizeof *in_mad, GFP_KERNEL); + out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL); + if (!in_mad || !out_mad) + goto out; + + init_query_mad(in_mad); + in_mad->attr_id = IB_SMP_ATTR_NODE_INFO; + + err = mlx4_MAD_IFC(to_mdev(ibdev), 1, 1, 1, NULL, NULL, in_mad, out_mad); + if (err) + goto out; + + memset(props, 0, sizeof *props); + + props->fw_ver = dev->dev->caps.fw_ver; + props->device_cap_flags= IB_DEVICE_CHANGE_PHY_PORT | + IB_DEVICE_PORT_ACTIVE_EVENT | + IB_DEVICE_SYS_IMAGE_GUID| + IB_DEVICE_RC_RNR_NAK_GEN; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_BAD_PKEY_CNTR) + props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_BAD_QKEY_CNTR) + props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_APM) + props->device_cap_flags |= IB_DEVICE_AUTO_PATH_MIG; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UD_AV_PORT) + props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; + + props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) 
& + 0xff; + props->vendor_part_id = be16_to_cpup((__be16 *) (out_mad->data + 30)); + props->hw_ver = be32_to_cpup((__be32 *) (out_mad->data + 32)); + memcpy(&props->sys_image_guid, out_mad->data + 4, 8); + + props->max_mr_size = ~0ull; + props->page_size_cap = dev->dev->caps.page_size_cap; + props->max_qp = dev->dev->caps.num_qps - dev->dev->caps.reserved_qps; + props->max_qp_wr = dev->dev->caps.max_wqes; + props->max_sge = min(dev->dev->caps.max_sq_sg, +dev->dev->caps.max_rq_sg); + props->max_cq
/proc/self/fd/0 is a socket ?!
Hi all...

After a big update on my systems, two of them will no longer let me ssh in; they say that stdin is not a terminal. The same happens if I try to open any terminal emulator, like aterm. I finally managed to get a shell with something like ssh [EMAIL PROTECTED] /bin/bash -i, and I saw this:

nada:/etc/rc.d# ll /proc/self/fd/
total 0
lrwx------ 1 root root 64 2007.04.21 00:13 0 -> socket:[23705]
lrwx------ 1 root root 64 2007.04.21 00:13 1 -> socket:[23705]
lrwx------ 1 root root 64 2007.04.21 00:13 2 -> socket:[23707]
lr-x------ 1 root root 64 2007.04.21 00:13 3 -> /proc/6814/fd/

What's that? Has udev really messed something up? One other box is working just fine.

Any ideas?

--
J.A. Magallon        \               Software is like sex:
                      \         It's better when it's free
Mandriva Linux release 2008.0 (Cooker) for i586
Linux 2.6.20-jam10 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #1 SMP PREEMPT
[PATCH -mm] freezer: Document task_lock in thaw_process
From: Rafael J. Wysocki <[EMAIL PROTECTED]>

The task_lock() in include/linux/freezer.h:thaw_process() looks as though it
were protecting p->flags, which is not the case.  Add a comment that explains
why it's there.

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 include/linux/freezer.h |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux-2.6.21-rc6-mm1/include/linux/freezer.h
===================================================================
--- linux-2.6.21-rc6-mm1.orig/include/linux/freezer.h	2007-04-09 15:24:25.0 +0200
+++ linux-2.6.21-rc6-mm1/include/linux/freezer.h	2007-04-21 00:17:30.0 +0200
@@ -37,6 +37,12 @@ static inline void do_not_freeze(struct 
 
 /*
  * Wake up a frozen process
+ *
+ * task_lock() is taken to prevent the race with refrigerator() which may
+ * occur if the freezing of tasks fails.  Namely, without the lock, if the
+ * freezing of tasks failed, thaw_tasks() might have run before a task in
+ * refrigerator() could call frozen_process(), in which case the task would be
+ * frozen and no one would thaw it.
  */
 static inline int thaw_process(struct task_struct *p)
 {
Sleep during spinlock in TPM driver
I've been working with the TPM driver, and I found that if I opened, used, then closed the TPM char device very frequently, I would get a kernel BUG message saying that the kernel tried to sleep while holding a spinlock. I think I've isolated the problem to this function, in drivers/char/tpm/tpm.c:

int tpm_release(struct inode *inode, struct file *file)
{
	struct tpm_chip *chip = file->private_data;

	spin_lock(&driver_lock);
	file->private_data = NULL;
	chip->num_opens--;
	del_singleshot_timer_sync(&chip->user_read_timer);
	flush_scheduled_work();
	atomic_set(&chip->data_pending, 0);
	put_device(chip->dev);
	kfree(chip->data_buffer);
	spin_unlock(&driver_lock);
	return 0;
}
EXPORT_SYMBOL_GPL(tpm_release);

I believe that flush_scheduled_work can sleep, correct? Does anyone know why this function is called while the spinlock is held?

-David
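The ordering problem described here can be modelled in a few lines of user-space C. This is only a toy model under stated assumptions: model_spin_lock(), model_spin_unlock() and model_may_sleep() are hypothetical helpers that count lock depth rather than real locking primitives, and release_reordered() shows one possible ordering, not a tested fix for the TPM driver.

```c
#include <assert.h>

static int atomic_depth;	/* > 0 while a (modelled) spinlock is held */
static int sleep_bugs;		/* times a blocking call ran under the lock */

static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	atomic_depth--;
}

/* Stand-in for a call that may block, such as flush_scheduled_work(). */
static void model_may_sleep(void)
{
	if (atomic_depth > 0)
		sleep_bugs++;	/* the kernel would complain at this point */
}

/* Ordering of the posted tpm_release(): blocking call under the lock. */
static void release_as_posted(void)
{
	model_spin_lock();
	model_may_sleep();
	model_spin_unlock();
}

/* One possible reordering: finish the blocking work, then take the lock. */
static void release_reordered(void)
{
	model_may_sleep();
	model_spin_lock();
	/* only non-blocking bookkeeping would remain in this section */
	model_spin_unlock();
}
```

Whether the state touched by the blocking call can safely be handled outside the lock is a question for the driver's maintainers; the model only shows why the posted ordering trips the check.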
Re: [PATCH] lazy freeing of memory through MADV_FREE
On Fri, 20 Apr 2007 17:38:06 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
>
> > I've also merged Nick's "mm: madvise avoid exclusive mmap_sem".
> >
> > - Nick's patch also will help this problem.  It could be that your patch
> >   no longer offers a 2x speedup when combined with Nick's patch.
> >
> >   It could well be that the combination of the two is even better, but it
> >   would be nice to firm that up a bit.
>
> I'll test that.

Thanks.

> > I do go on about that.  But we're adding page flags at about one per
> > year, and when we run out we're screwed - we'll need to grow the
> > pageframe.
>
> If you want, I can take a look at folding this into the
> ->mapping pointer.  I can guarantee you it won't be
> pretty, though :)

Well, let's see how fugly it ends up looking?

> > - I need to update your patch for Nick's patch.  Please confirm that
> >   down_read(mmap_sem) is sufficient for MADV_FREE.
>
> It is.  MADV_FREE needs no more protection than MADV_DONTNEED.
>
> > Stylistic nit:
> >
> >> +	if (PageLazyFree(page) && !migration) {
> >> +		/* There is new data in the page.  Reinstate it. */
> >> +		if (unlikely(pte_dirty(pteval))) {
> >> +			set_pte_at(mm, address, pte, pteval);
> >> +			ret = SWAP_FAIL;
> >> +			goto out_unmap;
> >> +		}
> >
> > The comment should be inside the second `if' statement.  As it is, It
> > looks like we reinstate the page if (PageLazyFree(page) && !migration).
>
> Want me to move it?

I did that, thanks.
Re: AGPGart / AMD K7
Dave Jones wrote:
> On Fri, Apr 20, 2007 at 04:22:06PM -0400, Preston A. Elder wrote:
>  > Dave, Greg,
>  >
>  > Here is the trace with 2.6.20.6
>  >
>  > I added back in my trace code, as you see.  As you can also see,
>  > agp_amdk7_probe is still not called.
>
> Try looking down in __driver_attach()
> The fact that we're not calling the ->probe function is quite bizarre.
>
> It could be this in __driver_attach
>
> 	if (!dev->driver)
> 		driver_probe_device(drv, dev);
>
> Though that'd be odd.
>
> Putting a #define DEBUG 1 in drivers/base/dd.c may also yield some clues.
>
> 	Dave
>
>

OK, I found it!  Here is more trace:

Linux agpgart interface v0.101 (c) Dave Jones
agp_amdk7_init: In function
agp_amdk7_init: Before pci_register_driver
__pci_register_driver: In Function (driver = agpgart-amdk7, multithread = 0)
__pci_register_driver: Before Spinlock
__pci_register_driver: Before Init List Head
__pci_register_driver: Before driver_register
bus_add_driver: In Function (c0492920)
bus_add_driver: Before kobject_set_name
bus_add_driver: error = 0
bus_add_driver: Before kobject_register
bus_add_driver: error = 0
bus_add_driver: Before driver_attach
__driver_attach (:00:00.0,1): Before Down (parent) (c21c8600)
__driver_attach (:00:00.0, 1): Before Down
__driver_attach (:00:00.0, 1): Before Probe Device (c049fe54)
__driver_attach (:00:00.0, 1): alreay registered with driver amd76x_edac
__driver_attach (:00:00.0, 1): Before Up
__driver_attach (:00:00.0, 1): Before Up (parent) (c21c8600)
__driver_attach (:00:00.0, 1): Returning 0
bus_add_driver: error = 0
bus_add_driver: Before klist_add_tail
bus_add_driver: Before module_add_driver
bus_add_driver: Before driver_add_attrs
bus_add_driver: error = 0
bus_add_driver: Before add_bind_files
bus_add_driver: error = 0
bus_add_driver: Returning 0
__pci_register_driver: error = 0
__pci_register_driver: Before pci_create_newid_file
__pci_register_driver: error = 0
__pci_register_driver: Returning 0

I snipped some since __driver_attach is called many times. But the long and short is that 00:00:00 is already associated with the 'amd76x_edac' driver, and as such will not call the agp probe call.

What is this edac, btw?

PreZ :)
Re: 2.6.21-rc7: HPET enabled freeze my machine at boot
On 4/19/07, guilherme <[EMAIL PROTECTED]> wrote:
> Hi,
>
> If i enable "High Resolution Timer Support", my machine stops here at boot:
>
> Clocksource tsc unstable (delta = -297340790165 ns)
> Time: hpet clocksource has been installed.
>
> If i disable HPET, it boots fine.

Hmmm.. What happens if you boot w/ clocksource=acpi_pm ?
BUG: Dentry still in use during umount in 2.6.21-rc5-git6
One of my autoboot test clients gave me this during shutdown. It used reiserfs and autofs and NFS heavily.

Unmounting file systems
BUG: Dentry 8100f3693a40{i=2352220,n=xattrs} still in use (1) [unmount of reiserfs sda9]
[ cut here ]
kernel BUG at /mnt/dm-2/newautoboot/autoboot/lsrc/mainline/linux/fs/dcache.c:623!
invalid opcode: [1] SMP
CPU 1
Modules linked in:
Pid: 15791, comm: umount Not tainted 2.6.21-rc5-git6 #44
RIP: 0010:[] [] shrink_dcache_for_umount_subtree+0x178/0x250
RSP: 0018:8100f5f67e18 EFLAGS: 00010292
RAX: 0060 RBX: 8100f3693a40 RCX: 5207
RDX: RSI: 0046 RDI: 00014661
RBP: 8100f6dc9cc0 R08: 00a0 R09: 0005
R10: R11: R12: 8100f3693aa0
R13: 00014661 R14: 0050ea70 R15: 0050ead0
FS: 2adc863a86d0() GS:8100f7fdc1c0() knlGS:b7be38d0
CS: 0010 DS: ES: CR0: 8005003b
CR2: 2adc8626a688 CR3: f628b000 CR4: 06e0
Process umount (pid: 15791, threadinfo 8100f5f66000, task 8100f7a08100)
Stack: 810004dab218 810004dab000 80558860 810004dab000
 8028815b 810004dab000 8027a1a5 8100f6c50980
 806c1600 8027a2a4
Call Trace:
 [] shrink_dcache_for_umount+0x2f/0x3d
 [] generic_shutdown_super+0x19/0xf2
 [] kill_block_super+0x26/0x3b
 [] deactivate_super+0x47/0x60
 [] sys_umount+0x1f7/0x22a
 [] sys_newstat+0x19/0x31
 [] system_call+0x7e/0x83

Code: 0f 0b eb fe 48 8b 6b 28 48 39 dd 75 04 31 ed eb 04 f0 ff 4d
RIP [] shrink_dcache_for_umount_subtree+0x178/0x250
 RSP
/etc/init.d/boot.d/K14boot.localfs: line 93: 15791 Segmentation fault umount -avt noproc,nonfs,nonfs4,nosmbfs,nocifs,notmpfs

-Andi
Re: how to tell linux (on x86) to ignore 1M or memory
Rene Herman <[EMAIL PROTECTED]> wrote:
> On 04/19/2007 04:18 PM, Bart Trojanowski wrote:

>> I need to preserve some state from the bios before entering protected
>> mode. For now I want to copy it into some ram accessible by real-mode,
>> say the last megabyte visible in real-mode.
>>
>> What's the easiest way to have linux ignore the megabyte starting at 15M?
>
> Note that real-mode can only access the first megabyte (*) and not the first
> 16. 16MB is the 16-bit protected mode (286) limit.
>
> (*) well, the first 1M + 64K - 16 bytes using segment assuming A20 is
> enabled and x > 1 in x86...

Interrupt 15h, function 87h allows copying from/to extended memory. You might like to look into Ralf Brown's Interrupt List for more details.

You could also cpio-gzip the data and append it to the initramfs.
-- 
Fun things to slip into your budget
Does that line item say 'Personal Massage System' Oops, it's supposed to be 'Message'. Go ahead and sign the authorization, Boss; I'll correct it later.
(Like Hell I will)
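The "1M + 64K - 16 bytes" figure in the quoted footnote follows from how real mode forms an address: the 16-bit segment is shifted left by four bits and added to the 16-bit offset. The sketch below is a user-space illustration of that arithmetic only; the function names are hypothetical and nothing here touches real hardware or the BIOS.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode linear address: (segment << 4) + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/*
 * With the A20 line disabled, address bit 20 is forced to zero,
 * so anything at or above 1 MiB wraps around to low memory.
 */
uint32_t real_mode_linear_a20_off(uint16_t segment, uint16_t offset)
{
    return real_mode_linear(segment, offset) & 0xFFFFFu;
}

/* Highest reachable byte is FFFF:FFFF, i.e. 1M + 64K - 16 bytes in total. */
void real_mode_check(void)
{
    uint32_t top = real_mode_linear(0xFFFF, 0xFFFF);

    assert(top == 0x10FFEFu);
    assert(top + 1u == 0x100000u + 0x10000u - 16u);
}
```

So with A20 enabled, a little under 64K above the first megabyte is addressable from real mode; everything beyond that needs a service such as the Int 15h copy mentioned above.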
Re: [PATCH] cciss: Fix warnings during compilation under 32bitenvironment
On Fri, 20 Apr 2007, Andrew Morton wrote:

> On Fri, 20 Apr 2007 16:20:59 -0400
> James Bottomley <[EMAIL PROTECTED]> wrote:
>
> > On Fri, 2007-04-20 at 12:30 -0700, Andrew Morton wrote:
> > > On Fri, 20 Apr 2007 14:50:06 -0400
> > > James Bottomley <[EMAIL PROTECTED]> wrote:
> > >
> > > > > CONFIG_LBD=y gives us an additional 3kb of instructions on i386
> > > > > allnoconfig. Other architectures might do less well. It's not a huge
> > > > > difference, but that's the way in which creeping bloatiness happens.
> > > >
> > > > OK, sure, but if we really care about this saving, then unconditionally
> > > > casting to u64 is therefore wrong as well ... this is starting to open
> > > > quite a large can of worms ...
> > > >
> > > > For the record, if we have to do this, I fancy sector_upper_32() ... we
> > > > should already have some similar accessor for dma_addr_t as well.
> > >
> > > hm. How about this?
> > >
> > > --- a/include/linux/kernel.h~upper-32-bits
> > > +++ a/include/linux/kernel.h
> > > @@ -40,6 +40,17 @@ extern const char linux_proc_banner[];
> > >  #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
> > >  #define roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
> > >
> > > +/**
> > > + * upper_32_bits - return bits 32-63 of a number
> > > + * @n: the number we're accessing
> > > + *
> > > + * A basic shift-right of a 64- or 32-bit quantity. Use this to suppress
> > > + * the "right shift count >= width of type" warning when that quantity is
> > > + * 32-bits.
> > > + */
> > > +#define upper_32_bits(n) (((u64)(n)) >> 32)
> >
> > Won't this have the unwanted side effect of promoting everything in a
> > calculation to long long on 32 bit platforms, even if n was only 32
> > bits?
>
> bummer.
>
> > > +
> > > +
> > >  #define KERN_EMERG "<0>" /* system is unusable */
> > >  #define KERN_ALERT "<1>" /* action must be taken immediately */
> > >  #define KERN_CRIT "<2>" /* critical conditions */
> > > _
> > >
> > > It seems to generate the desired code. I avoided Alan's ((n >> 31) >> 1)
> > > trick because it'll generate peculiar results with signed 64-bit
> > > quantities.
> >
> > I've seen the trick done similarly with ((n >> 16) >> 16) which
> > shouldn't have the issue.
>
> That works if we know the caller is treating the return value as 32 bits,
> but we don't know that.
>
> If we have
>
> #define upper_32_bits(x) ((x >> 16) >> 16)
>
> then
>
> upper_32_bits(0x)
>
> will return 0x if it's treated as 32-bits, but it'll return
> 0x if the caller is using 64-bits.
>
> I spose
>
> #define upper_32_bits(x) ((u32)((x >> 16) >> 16))
>
> will do the trick.

What about this?

#define upper_32_bits(x) (sizeof(x) == 8 ? (u64)(x) >> 32 : 0)

The u64 cast prevents the sign bit from being carried over and therefore eliminates the need for a subsequent cast to u32 since the upper 32 of the result will be 0. Shouldn't be any case where an integer gets promoted if 64 bits is the largest possible promotion.

Assuming, of course, I'm not an idiot. Which I somewhat frequently am.
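To compare the three variants discussed in this thread side by side, here is a small user-space sketch. The macro names carry suffixes only to keep them apart, the `u32`/`u64` typedefs stand in for the kernel types, and the test values are arbitrary; none of this is taken from the patch itself. One thing it makes visible is that, under C's usual arithmetic conversions, the `sizeof`-based expression still has a 64-bit type even when it evaluates the `0` branch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

/* Variant from the quoted patch: always widen to 64 bits, then shift. */
#define upper_32_bits_cast(n)   (((u64)(n)) >> 32)

/* Double-shift variant with a final cast to u32. */
#define upper_32_bits_shift(x)  ((u32)(((x) >> 16) >> 16))

/* sizeof-based variant proposed in the reply. */
#define upper_32_bits_sizeof(x) (sizeof(x) == 8 ? (u64)(x) >> 32 : 0)

void upper_32_bits_demo(void)
{
    u64 wide = 0x123456789abcdef0ULL;
    u32 narrow = 0xdeadbeefU;

    /* 64-bit input: all three agree on the upper half. */
    assert(upper_32_bits_cast(wide) == 0x12345678U);
    assert(upper_32_bits_shift(wide) == 0x12345678U);
    assert(upper_32_bits_sizeof(wide) == 0x12345678U);

    /* 32-bit input: there is no upper half, so the result is 0. */
    assert(upper_32_bits_cast(narrow) == 0);
    assert(upper_32_bits_shift(narrow) == 0);
    assert(upper_32_bits_sizeof(narrow) == 0);

    /* Result widths differ: only the double-shift variant yields 32 bits. */
    assert(sizeof(upper_32_bits_cast(narrow)) == 8);
    assert(sizeof(upper_32_bits_shift(narrow)) == 4);
    assert(sizeof(upper_32_bits_sizeof(narrow)) == 8);
}
```

Whether the wider result type matters in practice depends on how callers use the value, which is the point being argued in the thread.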
Re: Acecad USB Tablet: usbmouse takeover and odd motion
On 4/20/07, Vojtech Pavlik <[EMAIL PROTECTED]> wrote:
> On Fri, Apr 20, 2007 at 06:09:55PM +0200, Giuseppe Bilotta wrote:
> > On 4/20/07, Dmitry Torokhov <[EMAIL PROTECTED]> wrote:
> > >On 4/20/07, Giuseppe Bilotta <[EMAIL PROTECTED]> wrote:
> > >>
> > >> Sorry, it seems I was wrong, it's not usbhid but usbmouse taking over.
> > >> After a fresh plug (e.g. at bootup) I get the following:
> > >>
> > >
> > >Well, the question is - why do you have usbmouse module on your system?
> >
> > Stock Debian kernel 2.6.18 comes with it.
> >
> > With my custom kernels I can probably skip compiling it at all, if you
> > so suggest; should I blacklist it for the distro kernel? Or is there a
> > chance that some random USB mouse plugged in would fail to function by
> > doing so?
>
> usbmouse and usbkbd are only intended for embedded systems where the
> full usbhid doesn't fit and for testing purposes: Normal distros
> shouldn't have them enabled.

Oh, I see. I'll blacklist those modules, maybe also issue a ticket on the Debian BTS.

Thanks all.

-- 
Giuseppe "Oblomov" Bilotta
Re: [RFC PATCH(experimental) 2/2] Fix freezer-kthread_stop race
On Friday, 20 April 2007 23:20, Oleg Nesterov wrote:
> On 04/20, Gautham R Shenoy wrote:
> >
> > On Fri, Apr 20, 2007 at 10:54:36AM +0200, Rafael J. Wysocki wrote:
> > >
> > > Hmm, can't we do something like this instead:
> > >
> > > ---
> > >  kernel/kthread.c | 10 ++
> > >  1 file changed, 10 insertions(+)
> > >
> > > Index: linux-2.6.21-rc7/kernel/kthread.c
> > > ===
> > > --- linux-2.6.21-rc7.orig/kernel/kthread.c
> > > +++ linux-2.6.21-rc7/kernel/kthread.c
> > > @@ -13,6 +13,7 @@
> > >  #include
> > >  #include
> > >  #include
> > > +#include
> > >  #include
> > >
> > >  /*
> > > @@ -232,6 +233,15 @@ int kthread_stop(struct task_struct *k)
> > >
> > >          /* Now set kthread_should_stop() to true, and wake it up. */
> > >          kthread_stop_info.k = k;
> > > +        if (!(current->flags & PF_NOFREEZE)) {
> > > +                /* If we are freezable, the freezer will wait for us */
> > > +                task_lock(k);
> > > +                k->flags |= PF_NOFREEZE;
> > > +                if (frozen(k))
> > > +                        k->flags &= ~PF_FROZEN;
> > > +
> > > +                task_unlock(k);
> > > +        }
> >
> > Yes, we can do this for now since the tasks have only two freeze states,
> > namely Freezeable and Non Freezeable.
>
> No, we can't change k->flags, k owns its ->flags, and it is not atomic.

Yes, but if we move PF_FROZEN to a separate field in task_struct with appropriate locking, then it won't be a problem any more IMO.

> Rafael, may I suggest you to document task_lock() in thaw_process() ? This
> looks really confusing, as if task_lock() protects "p->flags &= ~PF_FROZEN".
>
> Actually, task_lock() is needed to prevent the race with refrigerator()
> when the freezing fails, but this is not obvious.

Sure, I will.

Greetings,
Rafael
Re: [patch] CFS scheduler, v4
Any chance of supporting 2.6.20?

On 4/21/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> i'm pleased to announce release -v4 of the CFS patchset. The patch
> against v2.6.21-rc7 can be downloaded from:
>
>     http://redhat.com/~mingo/cfs-scheduler/
>
> this CFS release too is mainly about fixing regressions and improving
> interactivity, so the rate of change is relatively low:
>
>     11 files changed, 136 insertions(+), 72 deletions(-)
>
> in particular the preemption fix could resolve the 'desktop slows down
> under IO load' reports and the 'firefox does not switch tabs fast
> enough' reports as well. The suspend2 crash and the yield related
> Kaffeine hangs should be resolved as well.
>
> Changes since -v3:
>
>  - usability fix: automatic renicing of kernel threads such as keventd,
>    OOM tasks and tasks doing privileged hardware access (such as Xorg).
>    (This is a substitute for group scheduling until the group scheduling
>    details have been worked out.)
>
>  - bugfix: buggy yield() caused suspend2 problems
>
>  - preemption fix: it caused desktop app latencies
>
> As usual, any sort of feedback, bugreport, fix and suggestion is more
> than welcome,
>
>     Ingo
Re: [patch] CFS scheduler, v4
On Friday 20 April 2007, Ingo Molnar wrote:
>i'm pleased to announce release -v4 of the CFS patchset. The patch
>against v2.6.21-rc7 can be downloaded from:
>
>    http://redhat.com/~mingo/cfs-scheduler/
>
>this CFS release too is mainly about fixing regressions and improving
>interactivity, so the rate of change is relatively low:
>
>    11 files changed, 136 insertions(+), 72 deletions(-)
>
>in particular the preemption fix could resolve the 'desktop slows down
>under IO load' reports and the 'firefox does not switch tabs fast
>enough' reports as well. The suspend2 crash and the yield related
>Kaffeine hangs should be resolved as well.
>
>Changes since -v3:
>
> - usability fix: automatic renicing of kernel threads such as keventd,
>   OOM tasks and tasks doing privileged hardware access (such as Xorg).
>   (This is a substitute for group scheduling until the group scheduling
>   details have been worked out.)
>
> - bugfix: buggy yield() caused suspend2 problems
>
> - preemption fix: it caused desktop app latencies
>
>As usual, any sort of feedback, bugreport, fix and suggestion is more
>than welcome,

I've been running this one for several hours now, with amanda running in the background due to a typo in one of my scripts, so now it's playing catchup.

This one is another keeper IMO, or as we are fond of saying around here, it's good enough for the girls I go with. If this isn't the best one so far, it's very very close and I'm getting pickier. kmail is the only thing that's lagging, and that's just kmail, which I believe is single threaded. Even with gzip eating 95% of the cpu, graphics animations like the cards in patience are moving at at least 80% speed.

Nice, keep this one and use it for the reference.

>    Ingo

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
In order to dial out, it is necessary to broaden one's dimension.
Re: [PATCH] lazy freeing of memory through MADV_FREE
Andrew Morton wrote:

> I've also merged Nick's "mm: madvise avoid exclusive mmap_sem".
>
> - Nick's patch also will help this problem. It could be that your patch
>   no longer offers a 2x speedup when combined with Nick's patch.
>
>   It could well be that the combination of the two is even better, but it
>   would be nice to firm that up a bit.

I'll test that.

> I do go on about that. But we're adding page flags at about one per
> year, and when we run out we're screwed - we'll need to grow the
> pageframe.

If you want, I can take a look at folding this into the ->mapping pointer. I can guarantee you it won't be pretty, though :)

> - I need to update your patch for Nick's patch. Please confirm that
>   down_read(mmap_sem) is sufficient for MADV_FREE.

It is. MADV_FREE needs no more protection than MADV_DONTNEED.

> Stylistic nit:
>
> +        if (PageLazyFree(page) && !migration) {
> +                /* There is new data in the page. Reinstate it. */
> +                if (unlikely(pte_dirty(pteval))) {
> +                        set_pte_at(mm, address, pte, pteval);
> +                        ret = SWAP_FAIL;
> +                        goto out_unmap;
> +                }
>
> The comment should be inside the second `if' statement. As it is, It
> looks like we reinstate the page if (PageLazyFree(page) && !migration).

Want me to move it?

-- 
Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic.
Re: [PATCH] ieee1394: update MAINTAINERS database
On Fri, 2007-04-20 at 23:21 +0200, Stefan Richter wrote:
> - update Ben's address
> - replace Ben's contact by mine as raw1394's 2nd contact
> - eth1394's and pcilynx's maintenance doesn't really differ from that
>   of other parts of the stack like video1394
>
> Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
> ---
>
> Ben, is this correct?

Looks good to me.

-- 
Ubuntu: http://www.ubuntu.com/
Linux1394: http://www.linux1394.org/
Re: Getting the new RxRPC patches upstream
On 04/20, Andrew Morton wrote:
>
> On Fri, 20 Apr 2007 11:41:46 +0100
> David Howells <[EMAIL PROTECTED]> wrote:
>
> > There are only two non-net patches that AF_RXRPC depends on:
> >
> >  (1) The key facility changes. That's all my code anyway, and shouldn't be a
> >      problem to merge unless someone else has put some changes in there that I
> >      don't know about.
> >
> >  (2) try_to_cancel_delayed_work(). I suppose I could use
> >      cancel_delayed_work() instead, but that's less efficient as it waits for
> >      the timer completion function to finish.
>
> There are significant workqueue changes in -mm and I plan to send them
> in for 2.6.22. I doubt if there's anything in there which directly
> affects cancel_delayed_work(), but making changes of this nature against
> 2.6.21 might lead to grief.

I think it is better to use cancel_delayed_work(), but change it to use del_timer(). I believe cancel_delayed_work() doesn't need del_timer_sync().

We only care when del_timer() returns true. In that case, if the timer function still runs (possible for single-threaded wqs), it has already passed __queue_work().

Oleg.
Re: [PATCH] lazy freeing of memory through MADV_FREE 2/2
On 4/20/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> OK, we need to flesh this out a lot please. People often get confused
> about what our MADV_DONTNEED behaviour is.

Well, there's not really much to flesh out. The current MADV_DONTNEED is useful in some situations. The behavior cannot be changed; even glibc will rely on it for the case when MADV_FREE is not supported.

What might be nice to have is a POSIX-compliant POSIX_MADV_DONTNEED implementation. We currently do nothing, which is OK since no test suite can detect that. But some code might want to use the real behavior and we're missing an optimization possibility.

Just for reference: the current MADV_DONTNEED behavior is to throw away data in the range. The POSIX_MADV_DONTNEED behavior is to never lose data. I.e., file backed data is written back, anon data is at most swapped out.
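The "throw away data in the range" behavior described above can be observed from user space. The following is a minimal, Linux-specific sketch; the function name is hypothetical and the fill pattern is arbitrary. It writes to a private anonymous page, calls madvise(MADV_DONTNEED) on it, and reads the first byte back. On Linux the page contents are discarded and the next access faults in a zero-filled page, so the value read is 0 rather than the pattern.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns the first byte of a private anonymous page after it was
 * filled with 0xAA and then passed to madvise(MADV_DONTNEED).
 * Returns -1 if mmap() or madvise() fails.
 */
int byte_after_dontneed(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int value;

    assert(page > 0);

    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, (size_t)page);

    if (madvise(p, (size_t)page, MADV_DONTNEED) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    value = p[0];   /* touching the page faults in a fresh one */
    munmap(p, (size_t)page);
    return value;
}
```

A POSIX-style "never lose data" implementation would have to return 0xAA here, which is the difference the message is pointing at.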
[PATCH] ieee1394: update MAINTAINERS database
- update Ben's address
- replace Ben's contact by mine as raw1394's 2nd contact
- eth1394's and pcilynx's maintenance doesn't really differ from that
  of other parts of the stack like video1394

Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
---

Ben, is this correct?

 MAINTAINERS | 22 --
 1 file changed, 4 insertions(+), 18 deletions(-)

Index: linux/MAINTAINERS
===
--- linux.orig/MAINTAINERS
+++ linux/MAINTAINERS
@@ -1681,7 +1681,7 @@ S: Maintained
 
 IEEE 1394 SUBSYSTEM
 P: Ben Collins
-M: [EMAIL PROTECTED]
+M: [EMAIL PROTECTED]
 P: Stefan Richter
 M: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
@@ -1689,25 +1689,11 @@ W: http://www.linux1394.org/
 T: git kernel.org:/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6.git
 S: Maintained
 
-IEEE 1394 IPV4 DRIVER (eth1394)
-P: Stefan Richter
-M: [EMAIL PROTECTED]
-L: [EMAIL PROTECTED]
-S: Odd Fixes
-
-IEEE 1394 PCILYNX DRIVER
-P: Jody McIntyre
-M: [EMAIL PROTECTED]
-P: Stefan Richter
-M: [EMAIL PROTECTED]
-L: [EMAIL PROTECTED]
-S: Odd Fixes
-
-IEEE 1394 RAW I/O DRIVER
-P: Ben Collins
-M: [EMAIL PROTECTED]
+IEEE 1394 RAW I/O DRIVER (raw1394)
 P: Dan Dennedy
 M: [EMAIL PROTECTED]
+P: Stefan Richter
+M: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 S: Maintained

-- 
Stefan Richter
-=-=-=== -=-- =-=--
http://arcgraph.de/sr/
Re: [RFC PATCH(experimental) 2/2] Fix freezer-kthread_stop race
On 04/20, Gautham R Shenoy wrote:
>
> On Fri, Apr 20, 2007 at 10:54:36AM +0200, Rafael J. Wysocki wrote:
> >
> > Hmm, can't we do something like this instead:
> >
> > ---
> >  kernel/kthread.c | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > Index: linux-2.6.21-rc7/kernel/kthread.c
> > ===
> > --- linux-2.6.21-rc7.orig/kernel/kthread.c
> > +++ linux-2.6.21-rc7/kernel/kthread.c
> > @@ -13,6 +13,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >
> >  /*
> > @@ -232,6 +233,15 @@ int kthread_stop(struct task_struct *k)
> >
> >          /* Now set kthread_should_stop() to true, and wake it up. */
> >          kthread_stop_info.k = k;
> > +        if (!(current->flags & PF_NOFREEZE)) {
> > +                /* If we are freezable, the freezer will wait for us */
> > +                task_lock(k);
> > +                k->flags |= PF_NOFREEZE;
> > +                if (frozen(k))
> > +                        k->flags &= ~PF_FROZEN;
> > +
> > +                task_unlock(k);
> > +        }
>
> Yes, we can do this for now since the tasks have only two freeze states,
> namely Freezeable and Non Freezeable.

No, we can't change k->flags, k owns its ->flags, and it is not atomic.

Rafael, may I suggest you to document task_lock() in thaw_process() ? This looks really confusing, as if task_lock() protects "p->flags &= ~PF_FROZEN".

Actually, task_lock() is needed to prevent the race with refrigerator() when the freezing fails, but this is not obvious.

Oleg.
Re: [PATCH] cciss: Fix warnings during compilation under 32bitenvironment
On Fri, 20 Apr 2007 16:20:59 -0400 James Bottomley <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-04-20 at 12:30 -0700, Andrew Morton wrote:
> > On Fri, 20 Apr 2007 14:50:06 -0400
> > James Bottomley <[EMAIL PROTECTED]> wrote:
> >
> > > > CONFIG_LBD=y gives us an additional 3kb of instructions on i386
> > > > allnoconfig. Other architectures might do less well. It's not a huge
> > > > difference, but that's the way in which creeping bloatiness happens.
> > >
> > > OK, sure, but if we really care about this saving, then unconditionally
> > > casting to u64 is therefore wrong as well ... this is starting to open
> > > quite a large can of worms ...
> > >
> > > For the record, if we have to do this, I fancy sector_upper_32() ... we
> > > should already have some similar accessor for dma_addr_t as well.
> >
> > hm. How about this?
> >
> > --- a/include/linux/kernel.h~upper-32-bits
> > +++ a/include/linux/kernel.h
> > @@ -40,6 +40,17 @@ extern const char linux_proc_banner[];
> >  #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
> >  #define roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
> >
> > +/**
> > + * upper_32_bits - return bits 32-63 of a number
> > + * @n: the number we're accessing
> > + *
> > + * A basic shift-right of a 64- or 32-bit quantity. Use this to suppress
> > + * the "right shift count >= width of type" warning when that quantity is
> > + * 32-bits.
> > + */
> > +#define upper_32_bits(n) (((u64)(n)) >> 32)
>
> Won't this have the unwanted side effect of promoting everything in a
> calculation to long long on 32 bit platforms, even if n was only 32
> bits?

bummer.

> > +
> > +
> >  #define KERN_EMERG "<0>" /* system is unusable */
> >  #define KERN_ALERT "<1>" /* action must be taken immediately */
> >  #define KERN_CRIT "<2>" /* critical conditions */
> > _
> >
> > It seems to generate the desired code. I avoided Alan's ((n >> 31) >> 1)
> > trick because it'll generate peculiar results with signed 64-bit
> > quantities.
>
> I've seen the trick done similarly with ((n >> 16) >> 16) which
> shouldn't have the issue.

That works if we know the caller is treating the return value as 32 bits, but we don't know that.

If we have

#define upper_32_bits(x) ((x >> 16) >> 16)

then

upper_32_bits(0x)

will return 0x if it's treated as 32-bits, but it'll return 0x if the caller is using 64-bits.

I spose

#define upper_32_bits(x) ((u32)((x >> 16) >> 16))

will do the trick.
Re: [RFC PATCH(experimental) 2/2] Fix freezer-kthread_stop race
On 04/19, Gautham R Shenoy wrote:
>
> @@ -63,12 +74,16 @@ void refrigerator(void)
>          recalc_sigpending(); /* We sent fake signal, clean it up */
>          spin_unlock_irq(&current->sighand->siglock);
>
> +        task_lock(current);
>          for (;;) {
>                  set_current_state(TASK_UNINTERRUPTIBLE);
>                  if (!frozen(current))
>                          break;
> +                task_unlock(current);
>                  schedule();
> +                task_lock(current);
>          }
> +        task_unlock(current);
>          pr_debug("%s left refrigerator\n", current->comm);
>          current->state = save;

Just curious, why this change?

> +int hold_freezer_for_task(struct task_struct *p)
> +{
> +        int ret = 0;
> +        spin_lock(&freezer_status.lock);
> +        if (freezer_status.count >= 0)
> +        {
> +                set_tsk_thread_flag(p, TIF_FREEZER_HELD);
> +                thaw_process(p);
> +                freezer_status.count++;
> +                ret = 1;
> +        }
> +        spin_unlock(&freezer_status.lock);
> +
> +        return ret;
> +}

I think this can work if it is used only in kthread_stop(). But what if another task wants to do hold_freezer_for_task(p) ? freezer_status.count is recursive, but TIF_FREEZER_HELD is not. IOW, I believe this is not generic enough.

Also, you are planning to add different freezing states (FE_HOTPLUG_CPU, FE_SUSPEND, etc). In that case each of them needs a separate .count, because it should be negative when try_to_freeze_tasks() returns. Now consider the case when we are doing freeze_processes(FE_A | FE_B) ...

Oleg.
Re: [RFC PATCH(experimental) 2/2] Fix freezer-kthread_stop race
On Friday, 20 April 2007 20:31, Ingo Molnar wrote:
>
> * Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > > > I mean, we already have four of them (PF_NOFREEZE, PF_FROZEN,
> > > > PF_FREEZER_SKIP, TIF_FREEZE), and you will need to introduce two
> > > > more for the freezer-based CPU hotplug, so if yet another one is
> > > > needed, that will make up almost a separate u8 field ...
> > >
> > > I am perfectly ok with it. But I am not sure if everybody would
> > > agree to have another field in the task struct, though in this case
> > > it does make sense :-)
> >
> > OK by me. You might want to consider making that fields's locking
> > protocol be set_bit(), clear_bit(), etc rather than task_lock().
>
> is OK to me too, the extra field isnt a problem.

OK, so I'll try to prepare a patch introducing it over the weekend. :-)

Greetings,
Rafael
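The suggestion above is to give the freezer bits their own field and update it with atomic bit operations, so that a task other than the owner can change them without racing against the owner's plain read-modify-write of ->flags. The sketch below is a user-space analogue only, written with C11 atomics instead of the kernel's set_bit()/clear_bit(); the struct, the field names and the FRZ_* bit names are hypothetical and are not taken from any posted patch.

```c
#include <assert.h>
#include <stdatomic.h>

#define FRZ_NOFREEZE (1u << 0)   /* hypothetical bit names for this sketch */
#define FRZ_FROZEN   (1u << 1)

struct task_sketch {
    unsigned int flags;          /* owner-only: plain read-modify-write */
    atomic_uint  freezer_flags;  /* shared: updated with atomic bit ops */
};

/* Analogue of set_bit(): safe to call from any thread. */
void freezer_set(struct task_sketch *t, unsigned int bit)
{
    assert(bit != 0);
    atomic_fetch_or(&t->freezer_flags, bit);
}

/* Analogue of clear_bit(). */
void freezer_clear(struct task_sketch *t, unsigned int bit)
{
    assert(bit != 0);
    atomic_fetch_and(&t->freezer_flags, ~bit);
}

/* Analogue of test_bit(). */
int freezer_test(struct task_sketch *t, unsigned int bit)
{
    return (atomic_load(&t->freezer_flags) & bit) != 0;
}
```

Because each update is a single atomic read-modify-write, two threads setting and clearing different bits cannot lose each other's changes, which is the property a non-atomic `flags |= ...` from a foreign task lacks.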
Re: [PATCH] lazy freeing of memory through MADV_FREE 2/2
On Thu, 19 Apr 2007 17:15:28 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:

> Restore MADV_DONTNEED to its original Linux behaviour. This is still
> not the same behaviour as POSIX, but applications may be depending on
> the Linux behaviour already. Besides, glibc catches POSIX_MADV_DONTNEED
> and makes sure nothing is done...

OK, we need to flesh this out a lot please. People often get confused about what our MADV_DONTNEED behaviour is. I regularly forget, then look at the code, then get it wrong. That's for mainline, let alone older kernels whose behaviour is gawd-knows-what.

So... For the changelog (and the manpage) could we please have a full description of the 2.6.21 behaviour and the 2.6.21-post-rik behaviour (and the 2.4 behaviour, if it differs at all)?

Also some code comments to demystify all of this once and for all?

Thanks.
[PATCH] dev_dbg: check dev_dbg() arguments
Duplicate what Zach Brown did for pr_debug in commit 8b2a1fd1b394c60eaa2587716102dd5e9b4e5990

Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
---

 include/linux/device.h |    6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index 5cf30e9..b6825d0 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -554,7 +554,11 @@ extern const char *dev_driver_string(struct device *dev);
 #define dev_dbg(dev, format, arg...) \
         dev_printk(KERN_DEBUG , dev , format , ## arg)
 #else
-#define dev_dbg(dev, format, arg...) do { (void)(dev); } while (0)
+static inline int __attribute__ ((format (printf, 2, 3)))
+dev_dbg(struct device * dev, const char * fmt, ...)
+{
+        return 0;
+}
 #endif
 
 #define dev_err(dev, format, arg...) \
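The point of the change above is that an empty macro hides the arguments from the compiler, while an empty inline function carrying a printf format attribute keeps them type-checked even when debugging is compiled out. Here is a user-space sketch of the same idea for GCC or Clang; `struct device` is a stand-in with one made-up field, the DEBUG branch uses printf() in place of dev_printk(), and `probe_example()` is a hypothetical caller.

```c
#include <assert.h>
#include <stdio.h>

struct device { const char *name; };   /* stand-in for the kernel's struct device */

#ifdef DEBUG
#define dev_dbg(dev, format, arg...) \
    printf("%s: " format, (dev)->name, ## arg)
#else
/*
 * Debugging disabled: the call does nothing at run time, but the format
 * attribute still lets the compiler check the arguments against fmt.
 */
static inline int __attribute__ ((format (printf, 2, 3)))
dev_dbg(struct device *dev, const char *fmt, ...)
{
    (void)dev;
    (void)fmt;
    return 0;
}
#endif

int probe_example(struct device *dev, int irq)
{
    assert(dev != NULL);
    /* Changing %d to %s here would draw a -Wformat warning in both builds. */
    dev_dbg(dev, "using irq %d\n", irq);
    return irq;
}
```

With the old `do { (void)(dev); } while (0)` form, a mismatched format string in a dev_dbg() call would go unnoticed until someone built with DEBUG defined.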
Re: [PATCH] lazy freeing of memory through MADV_FREE
On Tue, 17 Apr 2007 03:15:51 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:

> Make it possible for applications to have the kernel free memory
> lazily. This reduces a repeated free/malloc cycle from freeing
> pages and allocating them, to just marking them freeable. If the
> application wants to reuse them before the kernel needs the memory,
> not even a page fault will happen.
>
> This patch, together with Ulrich's glibc change, increases
> MySQL sysbench performance by a factor of 2 on my quad core
> test system.
>
> Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
>
> ---
> Ulrich Drepper has test glibc RPMS for this functionality at:
>
> http://people.redhat.com/drepper/rpms
>
> Andrew, I have stress tested this patch for a few days now and
> have not been able to find any more bugs. I believe it is ready
> to be merged in -mm, and upstream at the next merge window.
>
> When the patch goes upstream, I will submit a small follow-up
> patch to revert MADV_DONTNEED behaviour to what it did previously
> and have the new behaviour trigger only on MADV_FREE: at that
> point people will have to get new test RPMs of glibc.
>
> I've also merged Nick's "mm: madvise avoid exclusive mmap_sem".

- Nick's patch also will help this problem. It could be that your patch
  no longer offers a 2x speedup when combined with Nick's patch.

  It could well be that the combination of the two is even better, but it
  would be nice to firm that up a bit.

  Chewing a page flag is an expensive thing to do. I do go on about that.
  But we're adding page flags at about one per year, and when we run out
  we're screwed - we'll need to grow the pageframe.

- I need to update your patch for Nick's patch. Please confirm that
  down_read(mmap_sem) is sufficient for MADV_FREE.

Stylistic nit:

> +	if (PageLazyFree(page) && !migration) {
> +		/* There is new data in the page. Reinstate it. */
> +		if (unlikely(pte_dirty(pteval))) {
> +			set_pte_at(mm, address, pte, pteval);
> +			ret = SWAP_FAIL;
> +			goto out_unmap;
> +		}

The comment should be inside the second `if' statement. As it is, it
looks like we reinstate the page if (PageLazyFree(page) && !migration).
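For readers who want to see the lazy-free pattern from the application side, here is a hedged user-space sketch. It assumes a Linux system whose headers define `MADV_FREE` (the flag as later mainline kernels expose it, which may differ in detail from the patch discussed here) and falls back to `MADV_DONTNEED` otherwise. `lazy_free_roundtrip` is a hypothetical helper name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS and the madvise() flags */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one anonymous page, dirty it, hand it back lazily, then reuse it.
 * Returns 0 if the data written after the advice is intact, -1 otherwise.
 */
static int lazy_free_roundtrip(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	char *buf;
	int advice;
	int ret = -1;

	if (page <= 0)
		return -1;
	len = (size_t)page;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xaa, len);		/* first use: the page is now dirty */

#ifdef MADV_FREE
	advice = MADV_FREE;		/* lazy: the kernel may reclaim later */
#else
	advice = MADV_DONTNEED;		/* eager fallback for old headers */
#endif
	/* If the lazy advice is rejected at run time, retry the eager one. */
	if (madvise(buf, len, advice) != 0 &&
	    madvise(buf, len, MADV_DONTNEED) != 0)
		goto out;

	/* Reuse: data written after the advice must never be dropped. */
	memset(buf, 0x55, len);
	if ((unsigned char)buf[0] == 0x55 &&
	    (unsigned char)buf[len - 1] == 0x55)
		ret = 0;
out:
	munmap(buf, len);
	return ret;
}
```

The write after `madvise()` corresponds to the "new data in the page" case in the quoted hunk: once the page is dirtied again, the kernel has to keep it.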
Re: why UDF have so ugly filesize limit?
> from fs/udf/super.c:
> in function udf_fill_super
> sb->s_maxbytes = 1<<30; (1 GB)
>
> Why sb->s_maxbytes is not equal to MAX_LFS_FILESIZE?

Because UDF had some flaws, and a user could crash the kernel with a larger file size. The -mm kernel carries patches that fix the flaw and also raise the limit back to MAX_LFS_FILESIZE.

> So, in include/linux/fs.h written that the filesystems should put that
> (MAX_LFS_FILESIZE) into their s_maxbytes, otherwise bad things can
> happen in VM.

Bad things can happen only if you set it to more than MAX_LFS_FILESIZE. With smaller values, only users are disappointed ;).

Honza
--
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
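To make the effect of the cap concrete, here is a small sketch of how a per-filesystem byte limit bounds a write. It is only loosely modelled on what a generic write path does with sb->s_maxbytes; `UDF_OLD_MAXBYTES` and `bytes_allowed` are hypothetical names, not kernel symbols.

```c
#include <stdint.h>

/* The old UDF cap quoted above: 1 << 30 bytes, i.e. 1 GiB. */
#define UDF_OLD_MAXBYTES	((int64_t)1 << 30)

/*
 * Return how many of `count` bytes starting at offset `pos` fit below
 * `maxbytes`. A result of 0 means the write lies entirely out of range.
 */
static int64_t bytes_allowed(int64_t pos, int64_t count, int64_t maxbytes)
{
	if (pos < 0 || count < 0 || pos >= maxbytes)
		return 0;
	if (count > maxbytes - pos)
		return maxbytes - pos;	/* truncate at the limit */
	return count;
}
```

A limit that is too small only shortens or refuses writes, which matches the remark that smaller values merely disappoint users.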
[PATCH 2/2]: PCI Error Recovery: Symbios SCSI First Failure
Implement the so-called "first failure data capture" (FFDC) for the symbios PCI error recovery. After a PCI error event is reported, the driver requests that MMIO be enabled. Once enabled, it then reads and dumps assorted status registers, and concludes by requesting the usual reset sequence. (includes a whitespace fix for bad indentation). Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]> drivers/scsi/sym53c8xx_2/sym_glue.c | 15 +++ drivers/scsi/sym53c8xx_2/sym_glue.h |1 + drivers/scsi/sym53c8xx_2/sym_hipd.c | 18 ++ 3 files changed, 30 insertions(+), 4 deletions(-) Index: linux-2.6.21-rc4-git4/drivers/scsi/sym53c8xx_2/sym_glue.c === --- linux-2.6.21-rc4-git4.orig/drivers/scsi/sym53c8xx_2/sym_glue.c 2007-04-20 12:52:01.0 -0500 +++ linux-2.6.21-rc4-git4/drivers/scsi/sym53c8xx_2/sym_glue.c 2007-04-20 15:25:35.0 -0500 @@ -1987,6 +1987,20 @@ static pci_ers_result_t sym2_io_error_de disable_irq(pdev->irq); pci_disable_device(pdev); + /* Request that MMIO be enabled, so register dump can be taken. */ + return PCI_ERS_RESULT_CAN_RECOVER; +} + +/** + * sym2_io_slot_dump -- Enable MMIO and dump debug registers + * @pdev: pointer to PCI device + */ +static pci_ers_result_t sym2_io_slot_dump (struct pci_dev *pdev) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + sym_dump_registers(np); + /* Request a slot reset. 
*/ return PCI_ERS_RESULT_NEED_RESET; } @@ -2241,6 +2255,7 @@ MODULE_DEVICE_TABLE(pci, sym2_id_table); static struct pci_error_handlers sym2_err_handler = { .error_detected = sym2_io_error_detected, + .mmio_enabled = sym2_io_slot_dump, .slot_reset = sym2_io_slot_reset, .resume = sym2_io_resume, }; Index: linux-2.6.21-rc4-git4/drivers/scsi/sym53c8xx_2/sym_glue.h === --- linux-2.6.21-rc4-git4.orig/drivers/scsi/sym53c8xx_2/sym_glue.h 2007-04-20 12:15:07.0 -0500 +++ linux-2.6.21-rc4-git4/drivers/scsi/sym53c8xx_2/sym_glue.h 2007-04-20 15:21:31.0 -0500 @@ -270,5 +270,6 @@ void sym_xpt_async_bus_reset(struct sym_ void sym_xpt_async_sent_bdr(struct sym_hcb *np, int target); int sym_setup_data_and_start (struct sym_hcb *np, struct scsi_cmnd *csio, struct sym_ccb *cp); void sym_log_bus_error(struct sym_hcb *np); +void sym_dump_registers(struct sym_hcb *np); #endif /* SYM_GLUE_H */ Index: linux-2.6.21-rc4-git4/drivers/scsi/sym53c8xx_2/sym_hipd.c === --- linux-2.6.21-rc4-git4.orig/drivers/scsi/sym53c8xx_2/sym_hipd.c 2007-04-20 12:18:59.0 -0500 +++ linux-2.6.21-rc4-git4/drivers/scsi/sym53c8xx_2/sym_hipd.c 2007-04-20 15:18:01.0 -0500 @@ -1180,10 +1180,10 @@ static void sym_log_hard_error(struct sy scr_to_cpu((int) *(u32 *)(script_base + script_ofs))); } -printf ("%s: regdump:", sym_name(np)); -for (i=0; i<24;i++) -printf (" %02x", (unsigned)INB_OFF(np, i)); -printf (".\n"); + printf ("%s: regdump:", sym_name(np)); + for (i=0; i<24;i++) + printf (" %02x", (unsigned)INB_OFF(np, i)); + printf (".\n"); /* * PCI BUS error. 
@@ -1192,6 +1192,16 @@ static void sym_log_hard_error(struct sy sym_log_bus_error(np); } +void sym_dump_registers(struct sym_hcb *np) +{ + u_short sist; + u_char dstat; + + sist = INW(np, nc_sist); + dstat = INB(np, nc_dstat); + sym_log_hard_error(np, sist, dstat); +} + static struct sym_chip sym_dev_table[] = { {PCI_DEVICE_ID_NCR_53C810, 0x0f, "810", 4, 8, 4, 64, FE_ERL}
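The callback sequence this patch relies on (error detected, MMIO re-enabled for the register dump, slot reset, resume) can be sketched as a toy user-space state walk. Everything below is hypothetical scaffolding: the enum, the handler struct and `recover()` only mimic the shape of pci_ers_result_t and struct pci_error_handlers, and the real PCI core does considerably more.

```c
#include <stddef.h>

/* Hypothetical mirror of the pci_ers_result_t values used in the patch. */
enum ers_result {
	ERS_CAN_RECOVER,	/* driver wants MMIO re-enabled first */
	ERS_NEED_RESET,		/* driver wants a slot reset */
	ERS_RECOVERED,		/* device is usable again */
	ERS_DISCONNECT		/* give up on the device */
};

/* Hypothetical mirror of struct pci_error_handlers. */
struct err_handlers {
	enum ers_result (*error_detected)(void *dev);
	enum ers_result (*mmio_enabled)(void *dev);
	enum ers_result (*slot_reset)(void *dev);
	void (*resume)(void *dev);
};

/* Toy device that records the order in which callbacks ran. */
struct toy_dev {
	char trace[8];
	size_t n;
};

static void note(void *dev, char c)
{
	struct toy_dev *d = dev;

	if (d->n < sizeof(d->trace) - 1)
		d->trace[d->n++] = c;
}

static enum ers_result toy_detected(void *dev)
{
	note(dev, 'E');
	return ERS_CAN_RECOVER;	/* ask for MMIO, as the patch does */
}

static enum ers_result toy_dump(void *dev)
{
	note(dev, 'M');		/* the register dump would happen here */
	return ERS_NEED_RESET;
}

static enum ers_result toy_reset(void *dev)
{
	note(dev, 'R');
	return ERS_RECOVERED;
}

static void toy_resume(void *dev)
{
	note(dev, 'U');
}

static const struct err_handlers toy_handlers = {
	.error_detected	= toy_detected,
	.mmio_enabled	= toy_dump,
	.slot_reset	= toy_reset,
	.resume		= toy_resume,
};

/* Simplified recovery walk: call each stage the previous one asked for. */
static enum ers_result recover(const struct err_handlers *h, void *dev)
{
	enum ers_result r = h->error_detected(dev);

	if (r == ERS_CAN_RECOVER && h->mmio_enabled)
		r = h->mmio_enabled(dev);
	if (r == ERS_NEED_RESET && h->slot_reset)
		r = h->slot_reset(dev);
	if (r == ERS_RECOVERED && h->resume)
		h->resume(dev);
	return r;
}
```

Running `recover(&toy_handlers, &dev)` on a zeroed `struct toy_dev` visits the stages in the order E, M, R, U, which is the order the changelog describes.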
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Ingo Molnar wrote:

( Lets be cautious though: the jury is still out whether people actually like this more than the current approach. While CFS feedback looks promising after a whopping 3 days of it being released [ ;-) ], the test coverage of all 'fairness centric' schedulers, even considering years of availability is less than 1% i'm afraid, and that < 1% was mostly self-selecting. )

All of my testing has been on desktop machines, although in most cases they were really loaded desktops which had load avg 10..100 from time to time, and none were low memory machines. Up to CFS v3 I thought nicksched was my winner; now CFS v3 looks better, by not having stumbles under stupid loads.

I have not tested:
1 - server loads, nntp, smtp, etc
2 - low memory machines
3 - uniprocessor systems

I think this should be done before drawing conclusions. Or if someone has tried this, perhaps they would report what they saw. People are talking about smoothness, but not how many pages per second come out of their overloaded web server.

--
Bill Davidsen <[EMAIL PROTECTED]>
"We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot
Re: Acecad USB Tablet: usbmouse takeover and odd motion
On Fri, Apr 20, 2007 at 06:09:55PM +0200, Giuseppe Bilotta wrote:
> On 4/20/07, Dmitry Torokhov <[EMAIL PROTECTED]> wrote:
> >On 4/20/07, Giuseppe Bilotta <[EMAIL PROTECTED]> wrote:
> >>
> >> Sorry, it seems I was wrong, it's not usbhid but usbmouse taking over.
> >> After a fresh plug (e.g. at bootup) I get the following:
> >>
> >
> >Well, the question is - why do you have usbmouse module on your system?
>
> Stock Debian kernel 2.6.18 comes with it.
>
> With my custom kernels I can probably skip compiling it at all, if you
> so suggest; should I blacklist it for the distro kernel? Or is there a
> chance that some random USB mouse plugged in would fail to function by
> doing so?

usbmouse and usbkbd are intended only for embedded systems where the full usbhid doesn't fit, and for testing purposes. Normal distros shouldn't have them enabled.

--
Vojtech Pavlik
Director SuSE Labs
Re: [PATCH 7/8] Kconfig: silicon backplane dependency.
On Friday 20 April 2007 13:35, Martin Schwidefsky wrote:
> From: Martin Schwidefsky <[EMAIL PROTECTED]>
>
> Make the "Sonics Silicon Backplane" menu dependent on the two buses
> it can be found on.
> Goes on top of git-wireless.patch.
>
> Cc: Michael Buesch <[EMAIL PROTECTED]>
> Cc: John W. Linville <[EMAIL PROTECTED]>
> Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
> ---
>
>  drivers/ssb/Kconfig |    1 +
>  1 files changed, 1 insertion(+)
>
> diff -urpN linux-2.6/drivers/ssb/Kconfig linux-2.6-patched/drivers/ssb/Kconfig
> --- linux-2.6/drivers/ssb/Kconfig 2007-04-19 15:24:40.0 +0200
> +++ linux-2.6-patched/drivers/ssb/Kconfig 2007-04-19 15:55:44.0 +0200
> @@ -1,4 +1,5 @@
>  menu "Sonics Silicon Backplane"
> +	depends on PCI || PCMCIA

This is wrong. SSB does not depend on PCI or PCMCIA. SSB can (and does) stand very well on its own feet and can be the main system bus. Most Linksys WRT routers work that way: they have no PCI bus, but an SSB bus instead.

Nevertheless, I'm not sure what your problem really is. Does an s390 machine with a B44 card exist? I doubt it.

So what about the following: we simply add a DEPENDS ON !S390 to both SSB and B44. I really think that is the right fix for this. The patch above clearly is not, as it breaks things for embedded devices without a PCI or PCMCIA bus.

--
Greetings Michael.
Re: [d_path 0/7] Fixes to d_path: Respin
> > I gave a chroot example that showed that in the current
> > implementation, you can get pretty random clashes between mounts; there are
> > other cases with lazy unmounts as well.
>
> Irrelevant as well. If you create chroot problems it's your problem.
>
> The fact is that if you have a normal setup the code works fine. All
> other situations cannot be handled with the current kernel interface.
>
> This does not give anybody the right to say "since the code doesn't
> always work we can break it completely". That's completely
> unacceptable.

I'm not sure I understand the situation completely. What exactly is broken in libc by removing unreachable mounts from /proc/mounts? Is it the situation when

 - file descriptor is opened
 - process does chroot
 - process does fstatvfs on file descriptor

?

In that case, currently fstatvfs() _usually_ gives the correct results, but can give wrong results if mount paths accidentally clash in /proc/mounts?

Also, isn't it the case that fstatvfs() or statvfs() performed within the chroot could also give an incorrect result for a _reachable_ mount if it clashes with an unreachable mount?

If this is the case, I would think that removing the unreachable mounts from /proc/mounts would actually fix this second case, which is more likely to be used anyway.

BTW, this patch, or at least a predecessor, is in -mm, and it very much feels like the Right Thing(tm). The /proc/mounts under a chroot environment actually looks sane, instead of the random crap it was previously.

While we should make every effort to keep the kernel interfaces stable, this shouldn't prevent us from fixing bugs. And this one is clearly a bug, even if not a very serious one.

Miklos
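For reference, the user-space call at the centre of this disagreement looks roughly like the sketch below: a descriptor is opened, and fstatvfs() is later asked about the filesystem behind it. The chroot step is left out because it needs privileges, and the thread implies that libc may consult /proc/mounts to fill in some fields, which is where a path clash would matter. `fs_info_for` is a hypothetical helper name.

```c
#include <fcntl.h>
#include <sys/statvfs.h>
#include <unistd.h>

/*
 * Open `path`, then ask which filesystem the descriptor lives on.
 * Returns 0 and fills *out on success, -1 on failure.
 */
static int fs_info_for(const char *path, struct statvfs *out)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;

	/* In the scenario above, a chroot() would happen at this point. */
	ret = fstatvfs(fd, out);
	close(fd);
	return ret == 0 ? 0 : -1;
}
```

After a successful call, fields such as `out->f_bsize` and `out->f_flag` describe the filesystem. Mount flags in particular are the kind of detail that depends on matching the right mount entry.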
[PATCH 1/2]: PCI Error Recovery: Symbios SCSI base support
Hi Matthew, After a long hiatus, I took another stab at pci error recovery for the symbios. This is very nearly the same patch as before, with only an update to enable MWI, and to support chip workarounds. I think I've addressed all the other issues that came up. Thus, again, I'll ask that the patch go in (for 2.6.22 of course). To recap the only outstanding issue: >> @@ -657,6 +657,10 @@ static irqreturn_t sym53c8xx_intr(int ir >> + /* Avoid spinloop trying to handle interrupts on frozen device */ >> + if (pci_channel_offline(np->s.device)) >> + return IRQ_HANDLED; > >Just wondering ... should we really be returning HANDLED? What if the >IRQ is shared? Will the hardware de-assert the level interrupt when it >puts the device in reset (ie is this a transitory glitch?), or do we >have to cope with a screaming interrupt? This routine *always* returns HANDLED anyway, so this patch does not change semantics. For a symbios device plugged into a shared irq line, this is a problem with or without my patch. Yes, irq's will typically scream until handled. Yes, the device reset will eventually clear the irq, assuming the system doesn't deadlock on a screaming irq. --linas Here's the formal changelog entry: Various PCI bus errors can be signaled by newer PCI controllers. This patch adds the PCI error recovery callbacks to the Symbios SCSI device driver. The patch has been tested, and appears to work well. 
Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]> -- drivers/scsi/sym53c8xx_2/sym_glue.c | 136 drivers/scsi/sym53c8xx_2/sym_glue.h |4 + drivers/scsi/sym53c8xx_2/sym_hipd.c |6 + 3 files changed, 146 insertions(+) Index: linux-2.6.21-rc4-git4/drivers/scsi/sym53c8xx_2/sym_glue.c === --- linux-2.6.21-rc4-git4.orig/drivers/scsi/sym53c8xx_2/sym_glue.c 2007-04-20 12:07:38.0 -0500 +++ linux-2.6.21-rc4-git4/drivers/scsi/sym53c8xx_2/sym_glue.c 2007-04-20 12:52:01.0 -0500 @@ -657,6 +657,10 @@ static irqreturn_t sym53c8xx_intr(int ir unsigned long flags; struct sym_hcb *np = (struct sym_hcb *)dev_id; + /* Avoid spinloop trying to handle interrupts on frozen device */ + if (pci_channel_offline(np->s.device)) + return IRQ_HANDLED; + if (DEBUG_FLAGS & DEBUG_TINY) printf_debug ("["); spin_lock_irqsave(np->s.host->host_lock, flags); @@ -726,6 +730,20 @@ static int sym_eh_handler(int op, char * dev_warn(>device->sdev_gendev, "%s operation started.\n", opname); + /* We may be in an error condition because the PCI bus +* went down. In this case, we need to wait until the +* PCI bus is reset, the card is reset, and only then +* proceed with the scsi error recovery. There's no +* point in hurrying; take a leisurely wait. +*/ +#define WAIT_FOR_PCI_RECOVERY 35 + if (pci_channel_offline(np->s.device)) { + int finished_reset = wait_for_completion_timeout( + >s.io_reset_wait, WAIT_FOR_PCI_RECOVERY*HZ); + if (!finished_reset) + return SCSI_FAILED; + } + spin_lock_irq(host->host_lock); /* This one is queued in some place -> to wait for completion */ FOR_EACH_QUEUED_ELEMENT(>busy_ccbq, qp) { @@ -1510,6 +1528,7 @@ static struct Scsi_Host * __devinit sym_ np->maxoffs = dev->chip.offset_max; np->maxburst= dev->chip.burst_max; np->myaddr = dev->host_id; + init_completion(>s.io_reset_wait); /* * Edit its name. 
@@ -1948,6 +1967,116 @@ static void __devexit sym2_remove(struct attach_count--; } +/** + * sym2_io_error_detected() -- called when PCI error is detected + * @pdev: pointer to PCI device + * @state: current state of the PCI slot + */ +static pci_ers_result_t sym2_io_error_detected (struct pci_dev *pdev, + enum pci_channel_state state) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + /* If slot is permanently frozen, turn everything off */ + if (state == pci_channel_io_perm_failure) { + sym2_remove(pdev); + return PCI_ERS_RESULT_DISCONNECT; + } + + init_completion(>s.io_reset_wait); + disable_irq(pdev->irq); + pci_disable_device(pdev); + + /* Request a slot reset. */ + return PCI_ERS_RESULT_NEED_RESET; +} + +/** + * sym2_reset_workarounds -- hardware-specific work-arounds + * + * This routine is similar to sym_set_workarounds(), except + * that, at this point, we already know that the device was + * succesfully intialized at least once before, and so most + * of the steps taken there are un-needed here. + */ +static void sym2_reset_workarounds (struct pci_dev *pdev) +{ + u_char revision; + u_short status_reg; + struct
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Mike Galbraith wrote:

On Tue, 2007-04-17 at 05:40 +0200, Nick Piggin wrote:

On Tue, Apr 17, 2007 at 04:29:01AM +0200, Mike Galbraith wrote:

Yup, and progress _is_ happening now, quite rapidly.

Progress as in progress on Ingo's scheduler. I still don't know how we'd decide when to replace the mainline scheduler or with what. I don't think we can say Ingo's is better than the alternatives, can we?

No, that would require massive performance testing of all alternatives.

If there is some kind of bakeoff, then I'd like one of Con's designs to be involved, and mine, and Peter's...

The trouble with a bakeoff is that it's pretty darn hard to get people to test in the first place, and then comes weighting the subjective and hard performance numbers. If they're close in numbers, do you go with the one which starts the least flamewars or what?

Here we disagree... I picked a scheduler not by running benchmarks, but by running loads which piss me off with the mainline scheduler. And then I ran the other schedulers for a while to find the things, normal things I do, which resulted in bad behavior. And when I found one which had (so far) no such cases I called it my winner, but I haven't tested it under server load, so I can't begin to say it's "the best."

What we need is for lots of people to run every scheduler in real life, and do "worst case analysis" by finding the cases which cause bad behavior. And if there were a way to easily choose another scheduler, call it pluggable, modular, or Russian Roulette, people who found a worst case would report it (aka bitch about it) and try another. But the average user is better able to boot with an option like "sched=cfs" (or sc, or nick, or ...) than to patch and build a kernel. So if we don't get easily switched schedulers, people will not test nearly as well.

The best scheduler isn't the one 2% faster than the rest, it's the one with the fewest jackpot cases where it sucks. And if the mainline had multiple schedulers this testing would get done, authors would get more reports and have a better chance of fixing corner cases.

Note that we really need multiple schedulers to make people happy, because fairness is not the most desirable behavior on all machines, and adding knobs probably isn't the answer. I want a server to degrade gently, I want my desktop to show my movie and echo my typing, and if that's hard on compiles or the file transfer, so be it. Con doesn't want to compromise his goals; I agree, but I want to have an option if I don't share them.

--
Bill Davidsen <[EMAIL PROTECTED]>
"We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot
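As a purely illustrative sketch of the "boot with sched=cfs" idea, the fragment below picks an entry from a table based on a command-line string. No such boot parameter is claimed to exist; `sched_table`, `pick_sched` and the table contents are hypothetical and only echo the names used in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical scheduler descriptor; the names echo the thread only. */
struct sched_choice {
	const char *name;
	const char *summary;
};

static const struct sched_choice sched_table[] = {
	{ "mainline", "current mainline scheduler (default)" },
	{ "cfs",      "Ingo's Completely Fair Scheduler" },
	{ "sc",       "one of Con's designs" },
	{ "nick",     "Nick's nicksched" },
};

/*
 * Look for a "sched=<name>" word in a space-separated command line.
 * Unknown or missing names fall back to the first table entry.
 */
static const struct sched_choice *pick_sched(const char *cmdline)
{
	static const char key[] = "sched=";
	const char *p = cmdline;
	size_t i, len;

	while ((p = strstr(p, key)) != NULL) {
		/* Only accept the key at the start of a word. */
		if (p == cmdline || p[-1] == ' ') {
			p += sizeof(key) - 1;
			len = strcspn(p, " ");
			for (i = 0; i < sizeof(sched_table) / sizeof(sched_table[0]); i++) {
				if (strlen(sched_table[i].name) == len &&
				    strncmp(sched_table[i].name, p, len) == 0)
					return &sched_table[i];
			}
			break;	/* unknown name: use the default */
		}
		p++;
	}
	return &sched_table[0];
}
```

The fallback to the default entry reflects the concern above: a tester who mistypes a name should still get a bootable system rather than a failure.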
[PATCH 3/5] NFS: Fix the 'desynchronized value of nfs_i.ncommit' error
From: Trond Myklebust <[EMAIL PROTECTED]> Redirtying a request that is already marked for commit will screw up the accounting for NR_UNSTABLE_NFS as well as nfs_i.ncommit. Ensure that all requests on the commit queue are labelled with the PG_NEED_COMMIT flag, and avoid moving them onto the dirty list inside nfs_page_mark_flush(). Also inline nfs_mark_request_dirty() into nfs_page_mark_flush() for atomicity reasons. Avoid dropping the spinlock until we're done marking the request in the radix tree and have added it to the ->dirty list. Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]> --- fs/nfs/write.c | 47 ++- 1 files changed, 22 insertions(+), 25 deletions(-) diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 8e94246..ce5b4a9 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -38,7 +38,6 @@ static struct nfs_page * nfs_update_request(struct nfs_open_context*, struct page *, unsigned int, unsigned int); -static void nfs_mark_request_dirty(struct nfs_page *req); static long nfs_flush_mapping(struct address_space *mapping, struct writeback_control *wbc, int how); static const struct rpc_call_ops nfs_write_partial_ops; static const struct rpc_call_ops nfs_write_full_ops; @@ -255,7 +254,8 @@ static void nfs_end_page_writeback(struct page *page) static int nfs_page_mark_flush(struct page *page) { struct nfs_page *req; - spinlock_t *req_lock = _I(page->mapping->host)->req_lock; + struct nfs_inode *nfsi = NFS_I(page->mapping->host); + spinlock_t *req_lock = >req_lock; int ret; spin_lock(req_lock); @@ -279,11 +279,23 @@ static int nfs_page_mark_flush(struct page *page) return ret; spin_lock(req_lock); } - spin_unlock(req_lock); + if (test_bit(PG_NEED_COMMIT, >wb_flags)) { + /* This request is marked for commit */ + spin_unlock(req_lock); + nfs_unlock_request(req); + return 1; + } if (nfs_set_page_writeback(page) == 0) { nfs_list_remove_request(req); - nfs_mark_request_dirty(req); - } + /* add the request to the inode's dirty list. 
*/ + radix_tree_tag_set(>nfs_page_tree, + req->wb_index, NFS_PAGE_TAG_DIRTY); + nfs_list_add_request(req, >dirty); + nfsi->ndirty++; + spin_unlock(req_lock); + __mark_inode_dirty(page->mapping->host, I_DIRTY_PAGES); + } else + spin_unlock(req_lock); ret = test_bit(PG_NEED_FLUSH, >wb_flags); nfs_unlock_request(req); return ret; @@ -406,24 +418,6 @@ static void nfs_inode_remove_request(struct nfs_page *req) nfs_release_request(req); } -/* - * Add a request to the inode's dirty list. - */ -static void -nfs_mark_request_dirty(struct nfs_page *req) -{ - struct inode *inode = req->wb_context->dentry->d_inode; - struct nfs_inode *nfsi = NFS_I(inode); - - spin_lock(>req_lock); - radix_tree_tag_set(>nfs_page_tree, - req->wb_index, NFS_PAGE_TAG_DIRTY); - nfs_list_add_request(req, >dirty); - nfsi->ndirty++; - spin_unlock(>req_lock); - __mark_inode_dirty(inode, I_DIRTY_PAGES); -} - static void nfs_redirty_request(struct nfs_page *req) { @@ -438,7 +432,7 @@ nfs_dirty_request(struct nfs_page *req) { struct page *page = req->wb_page; - if (page == NULL) + if (page == NULL || test_bit(PG_NEED_COMMIT, >wb_flags)) return 0; return !PageWriteback(req->wb_page); } @@ -456,6 +450,7 @@ nfs_mark_request_commit(struct nfs_page *req) spin_lock(>req_lock); nfs_list_add_request(req, >commit); nfsi->ncommit++; + set_bit(PG_NEED_COMMIT, &(req)->wb_flags); spin_unlock(>req_lock); inc_zone_page_state(req->wb_page, NR_UNSTABLE_NFS); __mark_inode_dirty(inode, I_DIRTY_DATASYNC); @@ -470,7 +465,7 @@ int nfs_write_need_commit(struct nfs_write_data *data) static inline int nfs_reschedule_unstable_write(struct nfs_page *req) { - if (test_and_clear_bit(PG_NEED_COMMIT, >wb_flags)) { + if (test_bit(PG_NEED_COMMIT, >wb_flags)) { nfs_mark_request_commit(req); return 1; } @@ -557,6 +552,7 @@ static void nfs_cancel_commit_list(struct list_head *head) req = nfs_list_entry(head->next); dec_zone_page_state(req->wb_page, NR_UNSTABLE_NFS); nfs_list_remove_request(req); + clear_bit(PG_NEED_COMMIT, 
&(req)->wb_flags); nfs_inode_remove_request(req); nfs_unlock_request(req); } @@ -1295,6 +1291,7 @@ static void nfs_commit_done(struct rpc_task *task, void *calldata) while
Re: [PATCH 12/15] ide: make ide_hwif_t.ide_dma_host_on void
Hello, once I wrote:

[PATCH] ide: make ide_hwif_t.ide_dma_host_on void

* since ide_hwif_t.ide_dma_host_on is called either when drive->using_dma == 1 or when return value is discarded make it void, also drop "ide_" prefix
* make __ide_dma_host_on() void and drop "__" prefix

BTW, it would also make sense to make hwif->ide_dma_timeout() and hwif->ide_dma_lostirq void too (and possibly drop the ide_ prefix). Their results are *explicitly* ignored.

I've started preparing the patches and found out that aec62xx has completely bogus ide_dma_timeout() -- the same as ide_dma_lostirq() and it doesn't even call __ide_dma_timeout()... :-/
Don't know whether to deal with this in a separate patch...

MBR, Sergei
[PATCH 5/5] RPC: Fix the TCP resend semantics for NFSv4
From: Trond Myklebust <[EMAIL PROTECTED]> Fix a regression due to the patch "NFS: disconnect before retrying NFSv4 requests over TCP" The assumption made in xprt_transmit() that the condition "req->rq_bytes_sent == 0 and request is on the receive list" should imply that we're dealing with a retransmission is false. Firstly, it may simply happen that the socket send queue was full at the time the request was initially sent through xprt_transmit(). Secondly, doing this for each request that was retransmitted implies that we disconnect and reconnect for _every_ request that happened to be retransmitted irrespective of whether or not a disconnection has already occurred. Fix is to move this logic into the call_status request timeout handler. Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]> --- net/sunrpc/clnt.c |4 net/sunrpc/xprt.c | 10 -- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 6d7221f..396cdbe 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1046,6 +1046,8 @@ call_status(struct rpc_task *task) rpc_delay(task, 3*HZ); case -ETIMEDOUT: task->tk_action = call_timeout; + if (task->tk_client->cl_discrtry) + xprt_disconnect(task->tk_xprt); break; case -ECONNREFUSED: case -ENOTCONN: @@ -1169,6 +1171,8 @@ call_decode(struct rpc_task *task) out_retry: req->rq_received = req->rq_private_buf.len = 0; task->tk_status = 0; + if (task->tk_client->cl_discrtry) + xprt_disconnect(task->tk_xprt); } /* diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index ee6ffa0..456a145 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -735,16 +735,6 @@ void xprt_transmit(struct rpc_task *task) xprt_reset_majortimeo(req); /* Turn off autodisconnect */ del_singleshot_timer_sync(>timer); - } else { - /* If all request bytes have been sent, -* then we must be retransmitting this one */ - if (!req->rq_bytes_sent) { - if (task->tk_client->cl_discrtry) { - xprt_disconnect(xprt); - task->tk_status = -ENOTCONN; - return; 
- } - } } } else if (!req->rq_bytes_sent) return;
[PATCH 4/5] NFS: Fix race in nfs_set_page_dirty
From: Trond Myklebust <[EMAIL PROTECTED]> Protect nfs_set_page_dirty() against races with nfs_inode_add_request. Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]> --- fs/nfs/write.c | 17 ++--- 1 files changed, 14 insertions(+), 3 deletions(-) diff --git a/fs/nfs/write.c b/fs/nfs/write.c index ce5b4a9..7975589 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -388,6 +388,8 @@ static int nfs_inode_add_request(struct inode *inode, struct nfs_page *req) } SetPagePrivate(req->wb_page); set_page_private(req->wb_page, (unsigned long)req); + if (PageDirty(req->wb_page)) + set_bit(PG_NEED_FLUSH, >wb_flags); nfsi->npages++; atomic_inc(>wb_count); return 0; @@ -407,6 +409,8 @@ static void nfs_inode_remove_request(struct nfs_page *req) set_page_private(req->wb_page, 0); ClearPagePrivate(req->wb_page); radix_tree_delete(>nfs_page_tree, req->wb_index); + if (test_and_clear_bit(PG_NEED_FLUSH, >wb_flags)) + __set_page_dirty_nobuffers(req->wb_page); nfsi->npages--; if (!nfsi->npages) { spin_unlock(>req_lock); @@ -1527,15 +1531,22 @@ int nfs_wb_page(struct inode *inode, struct page* page) int nfs_set_page_dirty(struct page *page) { + spinlock_t *req_lock = _I(page->mapping->host)->req_lock; struct nfs_page *req; + int ret; - req = nfs_page_find_request(page); + spin_lock(req_lock); + req = nfs_page_find_request_locked(page); if (req != NULL) { /* Mark any existing write requests for flushing */ - set_bit(PG_NEED_FLUSH, >wb_flags); + ret = !test_and_set_bit(PG_NEED_FLUSH, >wb_flags); + spin_unlock(req_lock); nfs_release_request(req); + return ret; } - return __set_page_dirty_nobuffers(page); + ret = __set_page_dirty_nobuffers(page); + spin_unlock(req_lock); + return ret; } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
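The return-value change in nfs_set_page_dirty() hinges on the test-and-set idiom: only the caller that actually flips the flag reports "newly dirtied". Below is a user-space sketch of that idiom using C11 atomics. It deliberately leaves out the req_lock locking that the patch adds, and `NEED_FLUSH_BIT`, `test_and_set_flag`, `test_and_clear_flag` and `mark_need_flush` are hypothetical names rather than kernel functions.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NEED_FLUSH_BIT	0	/* hypothetical bit number for illustration */

/* Atomically set bit `nr`; return true if it was already set. */
static bool test_and_set_flag(atomic_ulong *flags, unsigned int nr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/* Atomically clear bit `nr`; return true if it was set before. */
static bool test_and_clear_flag(atomic_ulong *flags, unsigned int nr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_and(flags, ~mask) & mask) != 0;
}

/*
 * Same shape as `ret = !test_and_set_bit(PG_NEED_FLUSH, ...)` in the
 * patch: returns 1 only for the caller that newly marked the request.
 */
static int mark_need_flush(atomic_ulong *flags)
{
	return !test_and_set_flag(flags, NEED_FLUSH_BIT);
}
```

Calling `mark_need_flush()` twice on the same flags word yields 1 and then 0, so a second caller cannot mistake an already-marked request for a fresh one.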
Re: PCI bridge range sizing bug
On Friday, April 20, 2007 11:28 am Linus Torvalds wrote:
> On Fri, 20 Apr 2007, Jesse Barnes wrote:
> > Sounds good, hopefully reassigning the bridge resources won't cause
> > too much trouble. Do you have time to hack this up? If not, I
> > could give it a try, as long as ajax is willing to test...
>
> Actually, I would suggest we not do it automatically (because the
> need for it is just so low, and the downsides are potentially huge -
> there are just too many resources that are "hidden" from us through
> ACPI tricks and having hardware that doesn't actually expose their
> PCI resources fully through the normal PCI resource setup).

Yeah, that's probably prudent. OTOH we should probably let the user know in no uncertain terms that some of the stuff behind one of their bridges will be inaccessible.

Thanks,
Jesse
[PATCH 1/5] NFS: clean up the unstable write code
From: Trond Myklebust <[EMAIL PROTECTED]> Get rid of the inlined #ifdefs. Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]> --- fs/nfs/write.c | 117 -- include/linux/nfs_page.h | 30 2 files changed, 71 insertions(+), 76 deletions(-) diff --git a/fs/nfs/write.c b/fs/nfs/write.c index ad2e91b..3ed4feb 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -460,6 +460,43 @@ nfs_mark_request_commit(struct nfs_page *req) inc_zone_page_state(req->wb_page, NR_UNSTABLE_NFS); __mark_inode_dirty(inode, I_DIRTY_DATASYNC); } + +static inline +int nfs_write_need_commit(struct nfs_write_data *data) +{ + return data->verf.committed != NFS_FILE_SYNC; +} + +static inline +int nfs_reschedule_unstable_write(struct nfs_page *req) +{ + if (test_and_clear_bit(PG_NEED_COMMIT, >wb_flags)) { + nfs_mark_request_commit(req); + return 1; + } + if (test_and_clear_bit(PG_NEED_RESCHED, >wb_flags)) { + nfs_redirty_request(req); + return 1; + } + return 0; +} +#else +static inline void +nfs_mark_request_commit(struct nfs_page *req) +{ +} + +static inline +int nfs_write_need_commit(struct nfs_write_data *data) +{ + return 0; +} + +static inline +int nfs_reschedule_unstable_write(struct nfs_page *req) +{ + return 0; +} #endif /* @@ -746,26 +783,12 @@ int nfs_updatepage(struct file *file, struct page *page, static void nfs_writepage_release(struct nfs_page *req) { - nfs_end_page_writeback(req->wb_page); -#if defined(CONFIG_NFS_V3) || defined(CONFIG_NFS_V4) - if (!PageError(req->wb_page)) { - if (NFS_NEED_RESCHED(req)) { - nfs_redirty_request(req); - goto out; - } else if (NFS_NEED_COMMIT(req)) { - nfs_mark_request_commit(req); - goto out; - } - } - nfs_inode_remove_request(req); - -out: - nfs_clear_commit(req); - nfs_clear_reschedule(req); -#else - nfs_inode_remove_request(req); -#endif + if (PageError(req->wb_page) || !nfs_reschedule_unstable_write(req)) { + nfs_end_page_writeback(req->wb_page); + nfs_inode_remove_request(req); + } else + nfs_end_page_writeback(req->wb_page); 
nfs_clear_page_writeback(req); } @@ -1008,22 +1031,28 @@ static void nfs_writeback_done_partial(struct rpc_task *task, void *calldata) nfs_set_pageerror(page); req->wb_context->error = task->tk_status; dprintk(", error = %d\n", task->tk_status); - } else { -#if defined(CONFIG_NFS_V3) || defined(CONFIG_NFS_V4) - if (data->verf.committed < NFS_FILE_SYNC) { - if (!NFS_NEED_COMMIT(req)) { - nfs_defer_commit(req); - memcpy(>wb_verf, >verf, sizeof(req->wb_verf)); - dprintk(" defer commit\n"); - } else if (memcmp(>wb_verf, >verf, sizeof(req->wb_verf))) { - nfs_defer_reschedule(req); - dprintk(" server reboot detected\n"); - } - } else -#endif - dprintk(" OK\n"); + goto out; } + if (nfs_write_need_commit(data)) { + spinlock_t *req_lock = _I(page->mapping->host)->req_lock; + + spin_lock(req_lock); + if (test_bit(PG_NEED_RESCHED, >wb_flags)) { + /* Do nothing we need to resend the writes */ + } else if (!test_and_set_bit(PG_NEED_COMMIT, >wb_flags)) { + memcpy(>wb_verf, >verf, sizeof(req->wb_verf)); + dprintk(" defer commit\n"); + } else if (memcmp(>wb_verf, >verf, sizeof(req->wb_verf))) { + set_bit(PG_NEED_RESCHED, >wb_flags); + clear_bit(PG_NEED_COMMIT, >wb_flags); + dprintk(" server reboot detected\n"); + } + spin_unlock(req_lock); + } else + dprintk(" OK\n"); + +out: if (atomic_dec_and_test(>wb_complete)) nfs_writepage_release(req); } @@ -1064,25 +1093,21 @@ static void nfs_writeback_done_full(struct rpc_task *task, void *calldata) if (task->tk_status < 0) { nfs_set_pageerror(page); req->wb_context->error = task->tk_status; - nfs_end_page_writeback(page); - nfs_inode_remove_request(req); dprintk(", error = %d\n", task->tk_status); - goto next; + goto remove_request; } - nfs_end_page_writeback(page); -#if
[PATCH 2/5] NFS: Don't clear PG_writeback until after we've processed unstable writes
From: Trond Myklebust <[EMAIL PROTECTED]>

Ensure that we don't release the PG_writeback lock until after the page has
either been redirtied, or queued on the nfs_inode 'commit' list.

Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---

 fs/nfs/write.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 3ed4feb..8e94246 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -920,8 +920,8 @@ out_bad:
 		list_del(&data->pages);
 		nfs_writedata_release(data);
 	}
-	nfs_end_page_writeback(req->wb_page);
 	nfs_redirty_request(req);
+	nfs_end_page_writeback(req->wb_page);
 	nfs_clear_page_writeback(req);
 	return -ENOMEM;
 }
@@ -966,8 +966,8 @@ static int nfs_flush_one(struct inode *inode, struct list_head *head, int how)
 	while (!list_empty(head)) {
 		struct nfs_page *req = nfs_list_entry(head->next);
 		nfs_list_remove_request(req);
-		nfs_end_page_writeback(req->wb_page);
 		nfs_redirty_request(req);
+		nfs_end_page_writeback(req->wb_page);
 		nfs_clear_page_writeback(req);
 	}
 	return -ENOMEM;
@@ -1002,8 +1002,8 @@ out_err:
 	while (!list_empty(head)) {
 		req = nfs_list_entry(head->next);
 		nfs_list_remove_request(req);
-		nfs_end_page_writeback(req->wb_page);
 		nfs_redirty_request(req);
+		nfs_end_page_writeback(req->wb_page);
 		nfs_clear_page_writeback(req);
 	}
 	return error;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/5] 2.6.21-rc7 NFS writes: fix a series of issues
I've split the issues introduced by the 2.6.21-rcX write code up into 4
subproblems.

The first patch is just a cleanup in order to ease review.

Patch number 2 ensures that we never release the PG_writeback flag until
_after_ we've either discarded the unstable request altogether, or put it
on the nfs_inode's commit or dirty lists.

Patch number 3 fixes the 'desynchronized value of nfs_i.ncommit' error. It
uses the PG_NEED_COMMIT flag as an indicator for whether or not the request
may be redirtied.

Patch number 4 protects the NFS '.set_page_dirty' address_space operation
against races with nfs_inode_add_request.

Finally, patch number 5 fixes an issue with the RPC code that is supposed
to ensure that NFSv4 disconnects before resending a request. The current
code will disconnect for every request it resends (and has a bunch of
false positive cases), instead of just ensuring that it disconnects once
every time a timeout or a garbage reply occurs.

My thanks to the various patient victim^Wpeople who helped with extensive
testing.

Cheers
  Trond
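The commit/reschedule decision that the cleanup patch adds to nfs_writeback_done_partial() can be followed more easily in isolation. What follows is only a userspace sketch of that branch logic, not the kernel code: the names model_req, MODEL_NEED_COMMIT, MODEL_NEED_RESCHED and model_unstable_reply() are hypothetical stand-ins, the flags are plain bits rather than atomic bitops, the 8-byte verifier is an assumption of the model, and the req_lock that the patch takes around the sequence is left out because the model is single-threaded.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace stand-ins for PG_NEED_COMMIT / PG_NEED_RESCHED. */
enum {
	MODEL_NEED_COMMIT  = 1u << 0,
	MODEL_NEED_RESCHED = 1u << 1,
};

/* Hypothetical stand-in for the parts of struct nfs_page used here. */
struct model_req {
	unsigned int flags;
	unsigned char verf[8];	/* write verifier; size is an assumption */
};

/* Handle one reply for an unstable (not NFS_FILE_SYNC) write. */
void model_unstable_reply(struct model_req *req, const unsigned char *verf)
{
	if (req->flags & MODEL_NEED_RESCHED) {
		/* Already marked for resend: nothing more to record. */
	} else if (!(req->flags & MODEL_NEED_COMMIT)) {
		/* First unstable reply: remember the verifier, defer commit. */
		req->flags |= MODEL_NEED_COMMIT;
		memcpy(req->verf, verf, sizeof(req->verf));
	} else if (memcmp(req->verf, verf, sizeof(req->verf)) != 0) {
		/* Verifier changed between replies: the write must be resent. */
		req->flags |= MODEL_NEED_RESCHED;
		req->flags &= ~(unsigned int)MODEL_NEED_COMMIT;
	}
}

/* Walk one request through the three branches. */
void model_unstable_reply_demo(void)
{
	static const unsigned char boot1[8] = { 1 };
	static const unsigned char boot2[8] = { 2 };
	struct model_req req = { 0 };

	model_unstable_reply(&req, boot1);
	assert(req.flags == MODEL_NEED_COMMIT);

	model_unstable_reply(&req, boot1);	/* same verifier: unchanged */
	assert(req.flags == MODEL_NEED_COMMIT);

	model_unstable_reply(&req, boot2);	/* verifier changed */
	assert(req.flags == MODEL_NEED_RESCHED);

	model_unstable_reply(&req, boot1);	/* stays marked for resend */
	assert(req.flags == MODEL_NEED_RESCHED);
}
```

In the model the two flags are never set at the same time, which matches the set_bit()/clear_bit() pair in the patch.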
Re: qla2xxx hba crashes with older 2310 cards
On Fri, 2007-04-20 at 13:24 -0700, David Miller wrote:
> From: Robert Peterson <[EMAIL PROTECTED]>
> Date: Fri, 20 Apr 2007 10:40:30 -0500
>
> > I've seen some chatter about the qla2xxx driver but not paid attention, so
> > I'm sorry if this is a known issue. I've got an older qlogic hba, and recent
> > drivers don't seem to play nice with it. I've got the latest firmware from
> > qlogic's web site. I'm using a 2.6.21-rc6 kernel from Steve Whitehouse's
> > -nmw git tree. Reverting to an older driver (but same kernel) and it works.
> > The current driver gives this:
>
> Yes, known problem, I'm sorry these guys broke the driver for
> you as well, please see this thread:
>
> http://marc.info/?l=linux-kernel&m=117671067701124&w=2
>
> This was really a stupid change to make.

OK, OK, we heard you the first time ... the maintainers will try to fix
this in a manner acceptable to all concerned ... could we try to cool
down the public traductions now?

James
Re: AGPGart / AMD K7
On Fri, Apr 20, 2007 at 04:22:06PM -0400, Preston A. Elder wrote:
> Dave, Greg,
>
> Here is the trace with 2.6.20.6
>
> I added back in my trace code, as you see. As you can also see,
> agp_amdk7_probe is still not called.

Try looking down in __driver_attach()

The fact that we're not calling the ->probe function is quite bizarre.
It could be this in __driver_attach

	if (!dev->driver)
		driver_probe_device(drv, dev);

Though that'd be odd.

Putting a #define DEBUG 1 in drivers/base/dd.c may also yield some clues.

	Dave

--
http://www.codemonkey.org.uk
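To make the suspected check concrete, here is a minimal userspace model of the condition Dave points at. It is an illustration under stated assumptions, not code from drivers/base/dd.c: struct model_driver, struct model_device and model_driver_attach() are hypothetical names, and bus matching, locking and probe failure are not modelled at all. It only shows that a device which already has a driver bound never reaches the probe call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_driver / struct device. */
struct model_driver {
	const char *name;
	int probe_calls;		/* how often "probe" ran */
};

struct model_device {
	const char *name;
	struct model_driver *driver;	/* NULL while unbound */
};

/* Mirrors only the "if (!dev->driver)" guard; returns 1 if probe ran. */
int model_driver_attach(struct model_driver *drv, struct model_device *dev)
{
	if (dev->driver)
		return 0;		/* already bound: probe is skipped */

	drv->probe_calls++;		/* stands in for ->probe() */
	dev->driver = drv;
	return 1;
}

/* An unbound device is probed; a device claimed by another driver is not. */
void model_driver_attach_demo(void)
{
	struct model_driver wanted = { "agpgart-amdk7", 0 };
	struct model_driver other = { "some-other-driver", 0 };
	struct model_device free_dev = { "unbound-device", NULL };
	struct model_device taken_dev = { "bound-device", &other };

	assert(model_driver_attach(&wanted, &free_dev) == 1);
	assert(model_driver_attach(&wanted, &taken_dev) == 0);
	assert(wanted.probe_calls == 1);
}
```

If something like this were the cause, the registration trace would look entirely successful while the probe function stayed silent, which is consistent with what the trace shows but does not prove it.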
Re: PCI bridge range sizing bug
On Fri, Apr 20, 2007 at 11:28:42AM -0700, Linus Torvalds wrote:
> Actually, I would suggest we not do it automatically (because the need for
> it is just so low, and the downsides are potentially huge - there are just
> too many resources that are "hidden" from us through ACPI tricks and
> having hardware that doesn't actually expose their PCI resources fully
> through the normal PCI resource setup).

Definitely. I was intending to enable that *only* with some boot option.

> Ivan, want to add some way to force that allocation (something like
> "pci=assign-bus-resources")

Yes, hopefully I'll get something in the next couple of days.

Ivan.
Re: qla2xxx hba crashes with older 2310 cards
From: Robert Peterson <[EMAIL PROTECTED]>
Date: Fri, 20 Apr 2007 10:40:30 -0500

> I've seen some chatter about the qla2xxx driver but not paid attention, so
> I'm sorry if this is a known issue. I've got an older qlogic hba, and recent
> drivers don't seem to play nice with it. I've got the latest firmware from
> qlogic's web site. I'm using a 2.6.21-rc6 kernel from Steve Whitehouse's
> -nmw git tree. Reverting to an older driver (but same kernel) and it works.
> The current driver gives this:

Yes, known problem, I'm sorry these guys broke the driver for
you as well, please see this thread:

http://marc.info/?l=linux-kernel&m=117671067701124&w=2

This was really a stupid change to make.
Re: AGPGart / AMD K7
Dave, Greg,

Here is the trace with 2.6.20.6.

I added back in my trace code, as you see. As you can also see,
agp_amdk7_probe is still not called.

Linux agpgart interface v0.101 (c) Dave Jones
agp_amdk7_init: In function
agp_amdk7_init: Before pci_register_driver
__pci_register_driver: In Function (driver = agpgart-amdk7, multithread = 0)
__pci_register_driver: Before Spinlock
__pci_register_driver: Before Init List Head
__pci_register_driver: Before driver_register
bus_add_driver: In Function (c048e920)
bus_add_driver: Before kobject_set_name
bus_add_driver: error = 0
bus_add_driver: Before kobject_register
bus_add_driver: error = 0
bus_add_driver: Before driver_attach
bus_add_driver: error = 0
bus_add_driver: Before klist_add_tail
bus_add_driver: Before module_add_driver
bus_add_driver: Before driver_add_attrs
bus_add_driver: error = 0
bus_add_driver: Before add_bind_files
bus_add_driver: error = 0
bus_add_driver: Returning 0
__pci_register_driver: error = 0
__pci_register_driver: Before pci_create_newid_file
__pci_register_driver: error = 0
__pci_register_driver: Returning 0

Even when I start X (using the fglrx driver) I still do not see the probe
function being called. Everything looks successful, too :(

I will try with 2.6.21rc7, but I don't hold out too much hope.

PreZ

Dave Jones wrote:
> On Fri, Apr 20, 2007 at 02:31:01PM -0400, Preston A. Elder wrote:
>
> > Here is the code for __pci_register_driver:
> > ...
> >
> > So in the above case, we ARE saying if driver_register returns 0 then
> > pci_create_newid_file.
> >
> > Is it different to the code you have? As I said, this IS 2.6.19.
>
> Yes, .20 changed this in this way..
>
> @@ -445,9 +442,12 @@ int __pci_register_driver(struct pci_driver *drv, struct module *owner)
>
>  	/* register with core */
>  	error = driver_register(&drv->driver);
> +	if (error)
> +		return error;
>
> -	if (!error)
> -		error = pci_create_newid_file(drv);
> +	error = pci_create_newid_file(drv);
> +	if (error)
> +		driver_unregister(&drv->driver);
>
>  	return error;
>  }
>
> Retry your tracing with .20 (or better yet, .21rc7/todays git)
>
> 	Dave
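The .20 change quoted above is an instance of the usual register-then-unwind pattern: return early if the first step fails, and undo the first step if the second one fails. A minimal userspace sketch of that control flow follows. The model_* names and the injected failure flags are hypothetical and exist only for illustration; nothing here calls or reproduces the real PCI core.

```c
#include <assert.h>

/* Hypothetical state: 1 while the model driver counts as registered. */
int model_registered;

/* Hypothetical stand-ins for driver_register() and friends. */
static int model_core_register(int fail)
{
	if (fail)
		return -1;
	model_registered = 1;
	return 0;
}

static void model_core_unregister(void)
{
	model_registered = 0;
}

static int model_create_newid_file(int fail)
{
	return fail ? -2 : 0;
}

/* Same shape as the quoted hunk: bail out early, unwind on late failure. */
int model_register_driver(int register_fails, int newid_fails)
{
	int error;

	error = model_core_register(register_fails);
	if (error)
		return error;

	error = model_create_newid_file(newid_fails);
	if (error)
		model_core_unregister();	/* do not stay half-registered */

	return error;
}

/* Exercise the success path and both failure paths. */
void model_register_driver_demo(void)
{
	model_registered = 0;
	assert(model_register_driver(0, 0) == 0);
	assert(model_registered == 1);

	model_registered = 0;
	assert(model_register_driver(1, 0) == -1);
	assert(model_registered == 0);

	model_registered = 0;
	assert(model_register_driver(0, 1) == -2);
	assert(model_registered == 0);
}
```

In the trace above both steps report error = 0, so in terms of this model the success path was taken and the unwind branch never ran.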
Re: [PATCH] cciss: Fix warnings during compilation under 32bit environment
On Fri, 2007-04-20 at 12:30 -0700, Andrew Morton wrote:
> On Fri, 20 Apr 2007 14:50:06 -0400
> James Bottomley <[EMAIL PROTECTED]> wrote:
>
> > > CONFIG_LBD=y gives us an additional 3kb of instructions on i386
> > > allnoconfig. Other architectures might do less well. It's not a huge
> > > difference, but that's the way in which creeping bloatiness happens.
> >
> > OK, sure, but if we really care about this saving, then unconditionally
> > casting to u64 is therefore wrong as well ... this is starting to open
> > quite a large can of worms ...
> >
> > For the record, if we have to do this, I fancy sector_upper_32() ... we
> > should already have some similar accessor for dma_addr_t as well.
>
> hm. How about this?
>
> --- a/include/linux/kernel.h~upper-32-bits
> +++ a/include/linux/kernel.h
> @@ -40,6 +40,17 @@ extern const char linux_proc_banner[];
>  #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
>  #define roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
>
> +/**
> + * upper_32_bits - return bits 32-63 of a number
> + * @n: the number we're accessing
> + *
> + * A basic shift-right of a 64- or 32-bit quantity. Use this to suppress
> + * the "right shift count >= width of type" warning when that quantity is
> + * 32-bits.
> + */
> +#define upper_32_bits(n) (((u64)(n)) >> 32)

Won't this have the unwanted side effect of promoting everything in a
calculation to long long on 32 bit platforms, even if n was only 32
bits?

> +
> +
>  #define KERN_EMERG	"<0>"	/* system is unusable			*/
>  #define KERN_ALERT	"<1>"	/* action must be taken immediately	*/
>  #define KERN_CRIT	"<2>"	/* critical conditions			*/
> _
>
> It seems to generate the desired code. I avoided Alan's ((n >> 31) >> 1)
> trick because it'll generate peculiar results with signed 64-bit
> quantities.

I've seen the trick done similarly with ((n >> 16) >> 16) which
shouldn't have the issue.

James
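The two variants discussed here can be compared side by side in userspace. The sketch below is an illustration only: u64 is spelled uint64_t because it is not built against kernel headers, and upper_32_bits_shift is a hypothetical name for the ((n >> 16) >> 16) form, which the thread mentions but does not name. It assumes a platform where unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* The macro from the quoted patch, with u64 replaced by uint64_t. */
#define upper_32_bits(n) (((uint64_t)(n)) >> 32)

/* Hypothetical name for the double-shift variant from the reply. */
#define upper_32_bits_shift(n) (((n) >> 16) >> 16)

void upper_32_bits_demo(void)
{
	uint32_t small = 0xdeadbeefu;
	uint64_t big = 0x1122334455667788ull;

	/* Both forms agree on the value for unsigned operands. */
	assert(upper_32_bits(small) == 0);
	assert(upper_32_bits(big) == 0x11223344u);
	assert(upper_32_bits_shift(small) == 0);
	assert(upper_32_bits_shift(big) == 0x11223344u);

	/*
	 * The cast form always yields a 64-bit result, even for a 32-bit
	 * operand. The double shift keeps the operand's own width.
	 */
	assert(sizeof(upper_32_bits(small)) == sizeof(uint64_t));
	assert(sizeof(upper_32_bits_shift(small)) == sizeof(uint32_t));
}
```

The sizeof checks make the promotion question visible: with the cast, a 32-bit operand produces a 64-bit expression, so arithmetic that uses the result is carried out in 64 bits, whereas the double shift leaves a 32-bit operand at 32 bits.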