Re: [PATCH -v6 2/2] Updating ctime and mtime for memory-mapped files
Hi Linus, On Friday 18 January 2008, Linus Torvalds wrote: > On Fri, 18 Jan 2008, Miklos Szeredi wrote: > > > > What I'm saying is that the times could be left un-updated for a long > > time if program doesn't do munmap() or msync(MS_SYNC) for a long time. > > Sure. > > But in those circumstances, the programmer cannot depend on the mtime > *anyway* (because there is no synchronization), so what's the downside? Can we get "if the write to the page hits the disk, the mtime has hit the disk already no less than SOME_GRANULARITY before"? That is very important for computer forensics. Esp. in saving your ass! Ok, now back again to making that fast :-) Best Regards Ingo Oeser -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/5] x86: Add config variables for SMP_MAX
Hi Mike, On Friday 18 January 2008, [EMAIL PROTECTED] wrote: > +config THREAD_ORDER > + int "Kernel stack size (in page order)" > + range 1 3 > + depends on X86_64_SMP > + default "3" if X86_SMP_MAX > + default "1" > + help > + Increases kernel stack size. > + Could you please elaborate, why this is needed and put more info about this requirement into this patch description? People worked hard to push data allocation from stack to heap to make THREAD_ORDER of 0 and 1 possible. So why increase it again and why does this help scalability? Many thanks and Best Regards Ingo Oeser, puzzled a bit :-) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH -v6 2/2] Updating ctime and mtime for memory-mapped files
Hi Linus, On Friday 18 January 2008, Linus Torvalds wrote: On Fri, 18 Jan 2008, Miklos Szeredi wrote: What I'm saying is that the times could be left un-updated for a long time if program doesn't do munmap() or msync(MS_SYNC) for a long time. Sure. But in those circumstances, the programmer cannot depend on the mtime *anyway* (because there is no synchronization), so what's the downside? Can we get if the write to the page hits the disk, the mtime has hit the disk already no less than SOME_GRANULARITY before? That is very important for computer forensics. Esp. in saving your ass! Ok, now back again to making that fast :-) Best Regards Ingo Oeser -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/5] x86: Add config variables for SMP_MAX
Hi Mike, On Friday 18 January 2008, [EMAIL PROTECTED] wrote: +config THREAD_ORDER + int Kernel stack size (in page order) + range 1 3 + depends on X86_64_SMP + default 3 if X86_SMP_MAX + default 1 + help + Increases kernel stack size. + Could you please elaborate, why this is needed and put more info about this requirement into this patch description? People worked hard to push data allocation from stack to heap to make THREAD_ORDER of 0 and 1 possible. So why increase it again and why does this help scalability? Many thanks and Best Regards Ingo Oeser, puzzled a bit :-) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
kinit (was: sleep before boot panic)
On Wednesday 09 January 2008, H. Peter Anvin wrote: > Pavel Machek wrote: > > > >> Of course, if we'd been using kinit, "soft panic" would > >> have been done exclusively in userspace... > > > > What's the status of kinit, btw? > > Pavel > > It's bitrotted a bit since it was first rejected. It wouldn't take too > much work to bring it back up to speed, however. klibc, and some of the > kinit components, are used for the initramfs in Debian. Yes, and I like most of it. The only thing really missing for me is LVM support. Debian (?) did a evil hack to make it work. Maybe one day this itches soo much, I'll even scratch it :-) Then I'll be able to test kernels on a standard LVM installation again. So please keep up the good work! Best Regards Ingo Oeser -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
kinit (was: sleep before boot panic)
On Wednesday 09 January 2008, H. Peter Anvin wrote: Pavel Machek wrote: Of course, if we'd been using kinit, soft panic would have been done exclusively in userspace... What's the status of kinit, btw? Pavel It's bitrotted a bit since it was first rejected. It wouldn't take too much work to bring it back up to speed, however. klibc, and some of the kinit components, are used for the initramfs in Debian. Yes, and I like most of it. The only thing really missing for me is LVM support. Debian (?) did a evil hack to make it work. Maybe one day this itches soo much, I'll even scratch it :-) Then I'll be able to test kernels on a standard LVM installation again. So please keep up the good work! Best Regards Ingo Oeser -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sleep before boot panic
On Monday 07 January 2008, Andi Kleen wrote: > Bernd Schubert <[EMAIL PROTECTED]> writes: > > > Hi, > > > > I just switched to libata (pata) on my laptop and the immediate panic made > > it > > impossible to figure out why my boot partition wasn't available. > > After applying this little patch I could check boot printk output and then > > saw > > everything was properly recognized and only scsi-disk support was missing. > > The correct fix would be to make scroll back (and sysrq) still work > after panic. It's a little more complicated, but possible (essentially > it needs a polled keyboard handler) Customer: "This system could not find the root fs." Support: "Oh, yeah, just connect a (USB-) keyboard and scroll back." Hmm, device detection works after panic? I really like the "soft" panic better, where you still can operate the kernel debugging features, but just have no user space supporting it. One better hopes, that keyboards never need external firmware to be loaded at this stage :-) Best Regards Ingo Oeser, who just hit the same problem yesterday... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sleep before boot panic
On Monday 07 January 2008, Andi Kleen wrote: Bernd Schubert [EMAIL PROTECTED] writes: Hi, I just switched to libata (pata) on my laptop and the immediate panic made it impossible to figure out why my boot partition wasn't available. After applying this little patch I could check boot printk output and then saw everything was properly recognized and only scsi-disk support was missing. The correct fix would be to make scroll back (and sysrq) still work after panic. It's a little more complicated, but possible (essentially it needs a polled keyboard handler) Customer: This system could not find the root fs. Support: Oh, yeah, just connect a (USB-) keyboard and scroll back. Hmm, device detection works after panic? I really like the soft panic better, where you still can operate the kernel debugging features, but just have no user space supporting it. One better hopes, that keyboards never need external firmware to be loaded at this stage :-) Best Regards Ingo Oeser, who just hit the same problem yesterday... -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sleep before boot panic
Hi Bernd, CC'ed hpa, since I'm sure he can give useful advise on that :-) On Sunday 06 January 2008, Bernd Schubert wrote: > On Sunday 06 January 2008, Ingo Oeser wrote: > > Hi Bernd, > > > > On Sunday 06 January 2008, you wrote: > > > Index: zd1211rw.git.beno/init/do_mounts.c > > > === > > > --- zd1211rw.git.beno.orig/init/do_mounts.c 2008-01-06 > > > 18:44:23.0 > > > +0100 > > > +++ zd1211rw.git.beno/init/do_mounts.c2008-01-06 18:45:44.0 > > > +0100 @@ -330,6 +330,7 @@ > > > printk("Please append a correct \"root=\" boot option; here are > > > the > > > available partitions:\n"); > > > > > > printk_all_partitions(); > > > + msleep(60 * 1000); > > > > ssleep(60); > > feel free to replace it replace it :) Not that urgent, but if you resubmit please do it :-) > There is no dump_stack() here, but disc detection is relatively early in boot > process and on all these information are already scrolled off screen when the > panic is done. For this and any other panic it would be optimal if scrolling > still would work, but scrolling also requires kernel code, so I see there's a > reason not to this for all panics. However, for this boot problem I tend to > say there's no need to panic at all... But the kernel cannot continue from that position. You would need a "soft" panic, which allows behavior of panic=X, but let the kernel continue. Even better is to continue with the init in the builtin ramfs. That should always be available and can implement any behavior desired (like droping into a dash). > Btw, not all stack straces are useless, *most* of them are actually very > useful. I didn't say that. Just if you cannot continue due to admin error, but the kernel is in a perfect valid state otherwise, dumping stack is next to useless. Best Regards Ingo Oeser -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sleep before boot panic
Hi Bernd, On Sunday 06 January 2008, you wrote: > Index: zd1211rw.git.beno/init/do_mounts.c > === > --- zd1211rw.git.beno.orig/init/do_mounts.c 2008-01-06 18:44:23.0 > +0100 > +++ zd1211rw.git.beno/init/do_mounts.c2008-01-06 18:45:44.0 > +0100 > @@ -330,6 +330,7 @@ > printk("Please append a correct \"root=\" boot option; here are > the > available partitions:\n"); > > printk_all_partitions(); > + msleep(60 * 1000); ssleep(60); > panic("VFS: Unable to mount root fs on %s", b); > } Better would be for this and similiar panic()s (fatal user/admin errors on boot) to NOT print a stack trace+registers, since it is useless and actually hides useful information. Best Regards Ingo Oeser -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sleep before boot panic
Hi Bernd, On Sunday 06 January 2008, you wrote: Index: zd1211rw.git.beno/init/do_mounts.c === --- zd1211rw.git.beno.orig/init/do_mounts.c 2008-01-06 18:44:23.0 +0100 +++ zd1211rw.git.beno/init/do_mounts.c2008-01-06 18:45:44.0 +0100 @@ -330,6 +330,7 @@ printk(Please append a correct \root=\ boot option; here are the available partitions:\n); printk_all_partitions(); + msleep(60 * 1000); ssleep(60); panic(VFS: Unable to mount root fs on %s, b); } Better would be for this and similiar panic()s (fatal user/admin errors on boot) to NOT print a stack trace+registers, since it is useless and actually hides useful information. Best Regards Ingo Oeser -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sleep before boot panic
Hi Bernd, CC'ed hpa, since I'm sure he can give useful advise on that :-) On Sunday 06 January 2008, Bernd Schubert wrote: On Sunday 06 January 2008, Ingo Oeser wrote: Hi Bernd, On Sunday 06 January 2008, you wrote: Index: zd1211rw.git.beno/init/do_mounts.c === --- zd1211rw.git.beno.orig/init/do_mounts.c 2008-01-06 18:44:23.0 +0100 +++ zd1211rw.git.beno/init/do_mounts.c2008-01-06 18:45:44.0 +0100 @@ -330,6 +330,7 @@ printk(Please append a correct \root=\ boot option; here are the available partitions:\n); printk_all_partitions(); + msleep(60 * 1000); ssleep(60); feel free to replace it replace it :) Not that urgent, but if you resubmit please do it :-) There is no dump_stack() here, but disc detection is relatively early in boot process and on all these information are already scrolled off screen when the panic is done. For this and any other panic it would be optimal if scrolling still would work, but scrolling also requires kernel code, so I see there's a reason not to this for all panics. However, for this boot problem I tend to say there's no need to panic at all... But the kernel cannot continue from that position. You would need a soft panic, which allows behavior of panic=X, but let the kernel continue. Even better is to continue with the init in the builtin ramfs. That should always be available and can implement any behavior desired (like droping into a dash). Btw, not all stack straces are useless, *most* of them are actually very useful. I didn't say that. Just if you cannot continue due to admin error, but the kernel is in a perfect valid state otherwise, dumping stack is next to useless. Best Regards Ingo Oeser -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] add task handling notifier
Hi Jan, I like and support your idea! On Thursday 20 December 2007, Jan Beulich wrote: > With more and more sub-systems/sub-components leaving their footprint > in task handling functions, it seems reasonable to add notifiers that > these components can use instead of having them all patch themselves > directly into core files. Yes, but why export variables? Wouldn't it be better to export an API? That simplifies the callers (they all pass "current" as task and "task_notifier_list" as arguments). It also prevents exposing internal variables (notifier lists ARE internal variables) to modules. What do you think? Best Regards Ingo Oeser -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] add task handling notifier
Hi Jan, I like and support your idea! On Thursday 20 December 2007, Jan Beulich wrote: With more and more sub-systems/sub-components leaving their footprint in task handling functions, it seems reasonable to add notifiers that these components can use instead of having them all patch themselves directly into core files. Yes, but why export variables? Wouldn't it be better to export an API? That simplifies the callers (they all pass current as task and task_notifier_list as arguments). It also prevents exposing internal variables (notifier lists ARE internal variables) to modules. What do you think? Best Regards Ingo Oeser -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][RFC] dynamic pipe resizing
On Saturday 15 December 2007, Jan Engelhardt wrote: > On Aug 24 2007 10:52, Jens Axboe wrote: > >Subject: [PATCH][RFC] dynamic pipe resizing > >Like with my original splice patches from 2005, I used fcntl() > >F_GETPIPE_SZ and F_SETPIPE_SZ to change the size of the pipe. I'm not > >particularly fond of that interface, so suggestions on how to improve it > >would be appreciated. Even if fcntl() should be the preferred approach, > >I think it would be better to pass in a byte based value instead of a > >number of pages. > > > Could this patch still make it in? > Yes, I think its set() and get() parts should use bytes and convert > to/from pages. > > Perhaps just round up, and mention the rounding in the manpage, so that > noone gets a shock when the pipe is not exactly as small as requested. Yes, but document only that it is rounding and make the unit arbitrary. That reduces the ABI requirements. Best Regards Ingo Oeser -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][RFC] dynamic pipe resizing
On Saturday 15 December 2007, Jan Engelhardt wrote: On Aug 24 2007 10:52, Jens Axboe wrote: Subject: [PATCH][RFC] dynamic pipe resizing Like with my original splice patches from 2005, I used fcntl() F_GETPIPE_SZ and F_SETPIPE_SZ to change the size of the pipe. I'm not particularly fond of that interface, so suggestions on how to improve it would be appreciated. Even if fcntl() should be the preferred approach, I think it would be better to pass in a byte based value instead of a number of pages. Could this patch still make it in? Yes, I think its set() and get() parts should use bytes and convert to/from pages. Perhaps just round up, and mention the rounding in the manpage, so that noone gets a shock when the pipe is not exactly as small as requested. Yes, but document only that it is rounding and make the unit arbitrary. That reduces the ABI requirements. Best Regards Ingo Oeser -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [feature] automatically detect hung TASK_UNINTERRUPTIBLE tasks
On Saturday 01 December 2007, Ingo Molnar wrote: > maybe, but we'd have to see how often this gets triggered. An OOM is > something that could happen in any overloaded system - while a hung task > is likely due to a kernel bug. What about a client using hard mounted NFS shares here? That shouldn't be killed by the OOM killer in that situation, should it? Am I missing sth.? Best Regards Ingo Oeser -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [feature] automatically detect hung TASK_UNINTERRUPTIBLE tasks
On Saturday 01 December 2007, Ingo Molnar wrote: maybe, but we'd have to see how often this gets triggered. An OOM is something that could happen in any overloaded system - while a hung task is likely due to a kernel bug. What about a client using hard mounted NFS shares here? That shouldn't be killed by the OOM killer in that situation, should it? Am I missing sth.? Best Regards Ingo Oeser -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ata: ahci: Enable enclosure management via LED (resend)
On Thursday 25 October 2007, Kristen Carlson Accardi wrote: > I did look into using the LED class for this, but it didn't appropriate > as I wanted the leds to be associated with a particular disk, and not > with the platform as a whole. It seemed to me that the led_class was > a bit of overkill for what we needed to do here, since we just need > on/off and nothing else. Maybe. But didn't you want mdadm to control it? Then it would make sense. But you have a point in the LED API missing the ability to associate a LED to a specific device (e.g. where it is installed :-). So I'm fine either way, since I see you point. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ata: ahci: Enable enclosure management via LED (resend)
Hi Kristen, On Thursday 25 October 2007, Kristen Carlson Accardi wrote: > Enable enclosure management via LED > > As described in the AHCI spec, some AHCI controllers may support > Enclosure management via a variety of protocols. This patch > adds support for the LED message type that is specified in > AHCI 1.1 and higher. Linux has a LED subsystem for that. May I suggest, that you just register these leds and let userspace handle them via that via the LED API? The LED userspace API is described in Documentation/leds-class.txt and the headers for registering LEDs is linux/leds.h under include/ Since you explicitly WANT user space to control these, that should be the right API. Richard, what do YOU think? Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ata: ahci: Enable enclosure management via LED (resend)
Hi Kristen, On Thursday 25 October 2007, Kristen Carlson Accardi wrote: Enable enclosure management via LED As described in the AHCI spec, some AHCI controllers may support Enclosure management via a variety of protocols. This patch adds support for the LED message type that is specified in AHCI 1.1 and higher. Linux has a LED subsystem for that. May I suggest, that you just register these leds and let userspace handle them via that via the LED API? The LED userspace API is described in Documentation/leds-class.txt and the headers for registering LEDs is linux/leds.h under include/ Since you explicitly WANT user space to control these, that should be the right API. Richard, what do YOU think? Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ata: ahci: Enable enclosure management via LED (resend)
On Thursday 25 October 2007, Kristen Carlson Accardi wrote: I did look into using the LED class for this, but it didn't appropriate as I wanted the leds to be associated with a particular disk, and not with the platform as a whole. It seemed to me that the led_class was a bit of overkill for what we needed to do here, since we just need on/off and nothing else. Maybe. But didn't you want mdadm to control it? Then it would make sense. But you have a point in the LED API missing the ability to associate a LED to a specific device (e.g. where it is installed :-). So I'm fine either way, since I see you point. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 13/13] RT: Cache cpus_allowed weight for optimizing migration
Hi Gregory, On Tuesday 23 October 2007, Gregory Haskins wrote: > Calculating the weight is probably relatively expensive, so it is only > done when the cpus_allowed mask is updated (which should be relatively > infrequent, especially compared to scheduling frequency) and cached in > the task_struct. Why not make it a task flag, since according to your code, you are only interested whether this is <= 1 or > 1. Since !(x <= 1) <=> (x > 1) for any given unsigned integer x, the required data structure is a "boolean" or a flag. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 13/13] RT: Cache cpus_allowed weight for optimizing migration
Hi Gregory, On Tuesday 23 October 2007, Gregory Haskins wrote: Calculating the weight is probably relatively expensive, so it is only done when the cpus_allowed mask is updated (which should be relatively infrequent, especially compared to scheduling frequency) and cached in the task_struct. Why not make it a task flag, since according to your code, you are only interested whether this is = 1 or 1. Since !(x = 1) = (x 1) for any given unsigned integer x, the required data structure is a boolean or a flag. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] Extending kbuild syntax
Hi Sam, On Saturday 29 September 2007, Sam Ravnborg wrote: > Introducing the following new variable could make this a oneliner: > ccflags-y > > ccflags-$(DEBUG) := -DDEBUG > > grep -r -C 1 -B 1 EXTRA_CFLAGS shows that the above is a > very common pattern especially in drivers/ ACK. Also ACK for asflags, if done the same way :-) > The second is the more controversial suggestion. Yes, but please bear in mind, what the developers are trying to express in these cases. > In several Makefile we have simple if expression of the variants: > if ($(CONFIG_FOO),y) > obj-$(CONFIG_BAR) += fubar.o > endif This is "feature FOO of module BAR" where the feature itself cannot be a module. The composition scheme described in section 3.3 is at least equally useful. And that is used today. Maybe the documentation of that scheme is not prominent enough :-) > obj-y-ifn- This is the only one needed, because it is cumbersome to express negative rules in kbuild to include stubs (e.g. nommu stuff). But again this can be done with composition rules right now, but is order dependent. If we could get rid of this requirement, I would be happy already. So kbuild is just lacking an "else" clause here. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] Extending kbuild syntax
Hi Sam, On Saturday 29 September 2007, Sam Ravnborg wrote: Introducing the following new variable could make this a oneliner: ccflags-y ccflags-$(DEBUG) := -DDEBUG grep -r -C 1 -B 1 EXTRA_CFLAGS shows that the above is a very common pattern especially in drivers/ ACK. Also ACK for asflags, if done the same way :-) The second is the more controversial suggestion. Yes, but please bear in mind, what the developers are trying to express in these cases. In several Makefile we have simple if expression of the variants: if ($(CONFIG_FOO),y) obj-$(CONFIG_BAR) += fubar.o endif This is feature FOO of module BAR where the feature itself cannot be a module. The composition scheme described in section 3.3 is at least equally useful. And that is used today. Maybe the documentation of that scheme is not prominent enough :-) obj-y-ifn- This is the only one needed, because it is cumbersome to express negative rules in kbuild to include stubs (e.g. nommu stuff). But again this can be done with composition rules right now, but is order dependent. If we could get rid of this requirement, I would be happy already. So kbuild is just lacking an else clause here. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] Combine path_put and path_put_conditional
Hi Andreas, On Friday 28 September 2007, Andreas Gruenbacher wrote: > The name path_put_conditional (formerly, dput_path) is a little unclear. > Replace (path_put_conditional + path_put) with path_walk_put_both, > "put a pair of paths after a path_walk" (see the kerneldoc). ^ So why not name it path_walk_put_pair() then? Rationale: "_both" is just counting, "_pair" means they are related somehow. Best regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[OT] kbuild syntax extension for ccflags and asflags (was: [PATCH 1/3] CodingStyle updates)
Hi Sam, On Saturday 29 September 2007, Sam Ravnborg wrote: > Lately I have considered extending the kbuild syntax a bit. > > Introducing > ccflags-y > asflags-y > > [with same functionality as the EXTRA_CFLAGS, EXTRA_AFLAGS] > would allow us to do: > > ccflags-$(CONFIG_WHATEVER_DEBUG) := -DDEBUG Please do! That is very useful for testing and developing new modules. I learnt a lot from you in this regard and used that kind of syntax to the extreme in some other non-kernel project of mine. There it included also ccflags, asflags and so on. I further split that into -debug-y and -optimize-y flags, but that was just for my own convenience. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[OT] kbuild syntax extension for ccflags and asflags (was: [PATCH 1/3] CodingStyle updates)
Hi Sam, On Saturday 29 September 2007, Sam Ravnborg wrote: Lately I have considered extending the kbuild syntax a bit. Introducing ccflags-y asflags-y [with same functionality as the EXTRA_CFLAGS, EXTRA_AFLAGS] would allow us to do: ccflags-$(CONFIG_WHATEVER_DEBUG) := -DDEBUG Please do! That is very useful for testing and developing new modules. I learnt a lot from you in this regard and used that kind of syntax to the extreme in some other non-kernel project of mine. There it included also ccflags, asflags and so on. I further split that into -debug-y and -optimize-y flags, but that was just for my own convenience. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] Combine path_put and path_put_conditional
Hi Andreas, On Friday 28 September 2007, Andreas Gruenbacher wrote: The name path_put_conditional (formerly, dput_path) is a little unclear. Replace (path_put_conditional + path_put) with path_walk_put_both, put a pair of paths after a path_walk (see the kerneldoc). ^ So why not name it path_walk_put_pair() then? Rationale: _both is just counting, _pair means they are related somehow. Best regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] Fix coding style
On Tuesday 25 September 2007, Srivatsa Vaddagiri wrote: > Index: current/kernel/sched_debug.c > === > --- current.orig/kernel/sched_debug.c > +++ current/kernel/sched_debug.c > @@ -239,11 +239,7 @@ static int > root_user_share_read_proc(char *page, char **start, off_t off, int count, >int *eof, void *data) > { > - int len; > - > - len = sprintf(page, "%d\n", init_task_grp_load); > - > - return len; > + return sprintf(page, "%d\n", init_task_grp_load); > } > > static int > @@ -297,7 +293,7 @@ static int __init init_sched_debug_procf > pe->proc_fops = _debug_fops; > > #ifdef CONFIG_FAIR_USER_SCHED > - pe = create_proc_entry("root_user_share", 0644, NULL); > + pe = create_proc_entry("root_user_cpu_share", 0644, NULL); > if (!pe) > return -ENOMEM; What about moving this debug stuff under debugfs? Please consider using the functions in . They compile into nothing, if DEBUGFS is not compiled in and have already useful functions for reading/writing integers and booleans. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] Fix coding style
On Tuesday 25 September 2007, Srivatsa Vaddagiri wrote: Index: current/kernel/sched_debug.c === --- current.orig/kernel/sched_debug.c +++ current/kernel/sched_debug.c @@ -239,11 +239,7 @@ static int root_user_share_read_proc(char *page, char **start, off_t off, int count, int *eof, void *data) { - int len; - - len = sprintf(page, %d\n, init_task_grp_load); - - return len; + return sprintf(page, %d\n, init_task_grp_load); } static int @@ -297,7 +293,7 @@ static int __init init_sched_debug_procf pe-proc_fops = sched_debug_fops; #ifdef CONFIG_FAIR_USER_SCHED - pe = create_proc_entry(root_user_share, 0644, NULL); + pe = create_proc_entry(root_user_cpu_share, 0644, NULL); if (!pe) return -ENOMEM; What about moving this debug stuff under debugfs? Please consider using the functions in linux/debugfs.h . They compile into nothing, if DEBUGFS is not compiled in and have already useful functions for reading/writing integers and booleans. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Why do so many machines need "noapic"?
On Saturday 15 September 2007, Andrew Morton wrote: > There are 48 bugs in bugzilla which mention "noapic" > > http://bugzilla.kernel.org/buglist.cgi?query_format=advanced_desc_type=allwordssubstr_desc=_desc_type=substring_desc=noapic_version_type=allwordssubstr_version=_status=NEW_status=REOPENED_status=ASSIGNED_to1=1=substring=_to2=1=1=1=substring==include_id===Now==both=doit=Reuse+same+sort+as+last+time=noop=noop= > > And there are 173,000 on the internet ;) > http://www.google.com/search?hl=en=linux+noapic=Google+Search > > We screwed this pooch a long time ago - years. Perhaps if some of the many > noapic users could run a bisection search to work out when it broke we > could start fixing things. But they all have a workaround so there's no > motivation. I have 2 SMP-Boards and both need noapic. One is from 2001 (AUSUS CUR-DLS), one is from June 2006 (Gigabyte M57SLI-S4). There are many reasons: 1. Bugs which have such a simple workaround don't get much attention. 2. Usually SMP boards are used for machines, which just HAVE to work, since they have been expensive. These are not consumer boards. 3. I usually had only USB problems (no IRQ), if ommiting noapic. USB technology is a cosumer grade technology and enterprise grade developers don't have much interest in it (until now?). 4. IRQ routing setup is often a BIOS issue. You might be able to fix that by upgrading your BIOS. That often needs a Windows tool. Linux people not always (want to) have access to Windows :-) I reported the all the problems (starting 2001), no developer seemed interested. I can report them against the latest RC6 kernel tomorrow and put them into bugzilla, if we now REALLY care. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Why do so many machines need noapic?
On Saturday 15 September 2007, Andrew Morton wrote: There are 48 bugs in bugzilla which mention noapic http://bugzilla.kernel.org/buglist.cgi?query_format=advancedshort_desc_type=allwordssubstrshort_desc=long_desc_type=substringlong_desc=noapickernel_version_type=allwordssubstrkernel_version=bug_status=NEWbug_status=REOPENEDbug_status=ASSIGNEDemailassigned_to1=1emailtype1=substringemail1=emailassigned_to2=1emailreporter2=1emailcc2=1emailtype2=substringemail2=bugidtype=includebug_id=chfieldfrom=chfieldto=Nowchfieldvalue=regression=bothcmdtype=doitorder=Reuse+same+sort+as+last+timefield0-0-0=nooptype0-0-0=noopvalue0-0-0= And there are 173,000 on the internet ;) http://www.google.com/search?hl=enq=linux+noapicbtnG=Google+Search We screwed this pooch a long time ago - years. Perhaps if some of the many noapic users could run a bisection search to work out when it broke we could start fixing things. But they all have a workaround so there's no motivation. I have 2 SMP-Boards and both need noapic. One is from 2001 (AUSUS CUR-DLS), one is from June 2006 (Gigabyte M57SLI-S4). There are many reasons: 1. Bugs which have such a simple workaround don't get much attention. 2. Usually SMP boards are used for machines, which just HAVE to work, since they have been expensive. These are not consumer boards. 3. I usually had only USB problems (no IRQ), if ommiting noapic. USB technology is a cosumer grade technology and enterprise grade developers don't have much interest in it (until now?). 4. IRQ routing setup is often a BIOS issue. You might be able to fix that by upgrading your BIOS. That often needs a Windows tool. Linux people not always (want to) have access to Windows :-) I reported the all the problems (starting 2001), no developer seemed interested. I can report them against the latest RC6 kernel tomorrow and put them into bugzilla, if we now REALLY care. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] crypto: cleanup: Use max() in blkcipher_get_spot() to state the intention.
[PATCH] crypto: cleanup: Use max() in blkcipher_get_spot() to state the intention. Signed-off-by: Ingo Oeser <[EMAIL PROTECTED]> --- Hi Herbert, here is the requested patch against Linus' latest tree. It at least compiles. Best Regards Ingo Oeser crypto/blkcipher.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c index d8f8ec3..1c99d92 100644 --- a/crypto/blkcipher.c +++ b/crypto/blkcipher.c @@ -65,7 +65,7 @@ static inline void blkcipher_unmap_dst(struct blkcipher_walk *walk) static inline u8 *blkcipher_get_spot(u8 *start, unsigned int len) { u8 *end_page = (u8 *)(((unsigned long)(start + len - 1)) & PAGE_MASK); - return start > end_page ? start : end_page; + return max(start, end_page); } static inline unsigned int blkcipher_done_slow(struct crypto_blkcipher *tfm, -- 1.5.2.5 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] crypto: cleanup: Use max() in blkcipher_get_spot() to state the intention.
[PATCH] crypto: cleanup: Use max() in blkcipher_get_spot() to state the intention. Signed-off-by: Ingo Oeser [EMAIL PROTECTED] --- Hi Herbert, here is the requested patch against Linus' latest tree. It at least compiles. Best Regards Ingo Oeser crypto/blkcipher.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c index d8f8ec3..1c99d92 100644 --- a/crypto/blkcipher.c +++ b/crypto/blkcipher.c @@ -65,7 +65,7 @@ static inline void blkcipher_unmap_dst(struct blkcipher_walk *walk) static inline u8 *blkcipher_get_spot(u8 *start, unsigned int len) { u8 *end_page = (u8 *)(((unsigned long)(start + len - 1)) PAGE_MASK); - return start end_page ? start : end_page; + return max(start, end_page); } static inline unsigned int blkcipher_done_slow(struct crypto_blkcipher *tfm, -- 1.5.2.5 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] crypto: blkcipher_get_spot() handling of buffer at end of page
Hi Herbert, On Monday 10 September 2007, Herbert Xu wrote: > On Sat, Sep 08, 2007 at 12:14:23PM +0800, Herbert Xu wrote: > > > > [CRYPTO] blkcipher: Fix handling of kmalloc page straddling > > As Bob correctly noted, I had the boolean test inverted. > Here is the correction: > > [CRYPTO] blkcipher: Fix inverted test in blkcipher_get_spot > > The previous patch had the conditional inverted. This patch fixes it > so that we return the original position if it does not straddle a page. What about using max() for this to make your intention obvious? static inline u8 *blkcipher_get_spot(u8 *start, unsigned int len) { u8 *end_page = (u8 *)(((unsigned long)(start + len - 1)) & PAGE_MASK); return max(start, end_page); } Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFD] new syscalls: suspend_other/resume_other?
Hi there, at the moment implementing a mark and sweep garbage collection subsystem is quite a hack, because you always have to use up some signals for suspend/resume all threads to implement this. For runtime environments (like D system libraries or JVMs) this is a hack, since you take away flexibility from the application. A possible solution would be a syscall or a PTRACE extension to realize the suspend/resume. I best describe the possible syscall manpages here, so you get an idea. NAME suspend_other - suspends execution of all but the calling thread SYNOPSIS long suspend_other(void); RETURN VALUE Positive count suspended threads on success. If 0, then suspend_other was a no-op and there is nothing to resume, but the call should still considered successful. If the number is -1, the errno has to be checked for possible error values. ERRORS EDEADLK We run already a suspend_other() and the calling thread has just been resumed. EPERM The calling thread is not allowed to do this. (optional case due to security) DESCRIPTION After sucessful return of this call, the affected process is single threaded and only the calling thread runs in this process (==MM struct). The thread, which calls this is responsible for resuming all the suspended threads. One can iterate through "/proc/self/task/", to verify for sure that one knows all threads, if the returned count doesn't match the expected value. Any per thread queued signals are deferred until resume_other() or process destruction. NOTES This call might be restricted to the main thread. NAME resume_other - resume execution of foreign thread in this process SYNOPSIS long resume_other(pid_t tid) RETURN VALUE Returns 0, if successful. Otherwise -1 is returned, the errno has to be checked for possible error values and the call has no effect at all. Any non-blocked signals of that thread which happend during suspend/resume are deliverd now. ERRORS EINVAL The thread is running already. (this is a severe caller BUG). ESRCH The thread with tid does not exist. (or doesn't belong to this process). EPERM The calling thread is not allowed to do this. (optional case due to security) DESCRIPTION After sucessful return of this call, the affected thread is the the state it was before it was suspended by calling suspend_other(). NOTES The value -1 for tid is reserved for future extension (e.g. meaning ALL other threads). This call might be restricted to the main thread. - Any opinions? Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFD] new syscalls: suspend_other/resume_other?
Hi there, at the moment implementing a mark and sweep garbage collection subsystem is quite a hack, because you always have to use up some signals for suspend/resume all threads to implement this. For runtime environments (like D system libraries or JVMs) this is a hack, since you take away flexibility from the application. A possible solution would be a syscall or a PTRACE extension to realize the suspend/resume. I best describe the possible syscall manpages here, so you get an idea. NAME suspend_other - suspends execution of all but the calling thread SYNOPSIS long suspend_other(void); RETURN VALUE Positive count suspended threads on success. If 0, then suspend_other was a no-op and there is nothing to resume, but the call should still considered successful. If the number is -1, the errno has to be checked for possible error values. ERRORS EDEADLK We run already a suspend_other() and the calling thread has just been resumed. EPERM The calling thread is not allowed to do this. (optional case due to security) DESCRIPTION After sucessful return of this call, the affected process is single threaded and only the calling thread runs in this process (==MM struct). The thread, which calls this is responsible for resuming all the suspended threads. One can iterate through /proc/self/task/, to verify for sure that one knows all threads, if the returned count doesn't match the expected value. Any per thread queued signals are deferred until resume_other() or process destruction. NOTES This call might be restricted to the main thread. NAME resume_other - resume execution of foreign thread in this process SYNOPSIS long resume_other(pid_t tid) RETURN VALUE Returns 0, if successful. Otherwise -1 is returned, the errno has to be checked for possible error values and the call has no effect at all. Any non-blocked signals of that thread which happend during suspend/resume are deliverd now. ERRORS EINVAL The thread is running already. (this is a severe caller BUG). ESRCH The thread with tid does not exist. (or doesn't belong to this process). EPERM The calling thread is not allowed to do this. (optional case due to security) DESCRIPTION After sucessful return of this call, the affected thread is the the state it was before it was suspended by calling suspend_other(). NOTES The value -1 for tid is reserved for future extension (e.g. meaning ALL other threads). This call might be restricted to the main thread. - Any opinions? Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] memchr (trivial) optimization
On Wednesday 22 August 2007, lode leroy wrote: > While profiling something completely unrelated, I noticed > that on the workloads I used memchr for, I saw a 30%-40% improvement > in performance, with the following trivial changes... > (basically, it saves 3 operations for each call) Yes, but then you could be a bit more explicit to the compiler on what you are doing here: void *memchr(const void *s, int c, size_t n) { const unsigned char *p = s; for (; n != 0; n--, p++) { if ((unsigned char)c == *p) { return (void *)p; } return NULL; } Now the compiler should see the loop more clearly. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] memchr (trivial) optimization
On Wednesday 22 August 2007, lode leroy wrote: While profiling something completely unrelated, I noticed that on the workloads I used memchr for, I saw a 30%-40% improvement in performance, with the following trivial changes... (basically, it saves 3 operations for each call) Yes, but then you could be a bit more explicit to the compiler on what you are doing here: void *memchr(const void *s, int c, size_t n) { const unsigned char *p = s; for (; n != 0; n--, p++) { if ((unsigned char)c == *p) { return (void *)p; } return NULL; } Now the compiler should see the loop more clearly. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH]: proc: export a processes resource limits via proc/
Hi Neil, > +static struct limit_names lnames[RLIM_NLIMITS] = { static const ... may be better here. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH]: proc: export a processes resource limits via proc/pid
Hi Neil, +static struct limit_names lnames[RLIM_NLIMITS] = { static const ... may be better here. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Driver-level memory management
Hi Michael, On Sunday 12 August 2007, Michael Bourgeous wrote: > I'm working on a driver for older HDTV cards based on the TL880 chip. > These cards typically have 16MB of their own memory, which is > available to me over the PCI bus. Various functions of the card > require me to manage this memory, allocating and freeing chunks of it > as necessary. I can easily include my own allocation and management > code, Ok. > but I'm sure this is a problem that has been solved before. Yes! in your Kconfig select GENERIC_ALLOCATOR in your driver.c #include Code is in lib/genalloc.c, if you like to take a look. Memory for MANAGING free/allocated space is NOT taken from your on-card memory! That allocator is explicitly developed for such use cases. Happy hacking! Best regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Driver-level memory management
Hi Michael, On Sunday 12 August 2007, Michael Bourgeous wrote: I'm working on a driver for older HDTV cards based on the TL880 chip. These cards typically have 16MB of their own memory, which is available to me over the PCI bus. Various functions of the card require me to manage this memory, allocating and freeing chunks of it as necessary. I can easily include my own allocation and management code, Ok. but I'm sure this is a problem that has been solved before. Yes! in your Kconfig select GENERIC_ALLOCATOR in your driver.c #include linux/genalloc.h Code is in lib/genalloc.c, if you like to take a look. Memory for MANAGING free/allocated space is NOT taken from your on-card memory! That allocator is explicitly developed for such use cases. Happy hacking! Best regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 03/11] fuse: add reference counting to fuse_file
Hi Miklos, On Friday 03 August 2007, Miklos Szeredi wrote: > From: Miklos Szeredi <[EMAIL PROTECTED]> > > Make lifetime of 'struct fuse_file' independent from 'struct file' by > adding a reference counter and destructor. What about using krefs to implement that? see include/linux/kref.h and lib/kref.c Just embed that "struct kref" inside your struct fuse_file struct fuse_file { ... struct kref ref; ... } init in struct fuse_file *fuse_file_alloc(void) where you added the counter. ... kref_init(>ref); ... and implement the release function like: static void fuse_file_release(struct kref *ff_ref) { struct fuse_file *ff = container_of(ff_ref, struct fuse_file, ref); struct fuse_req *req = ff->reserved_req; struct fuse_conn *fc = get_fuse_conn(req->dentry->d_inode); request_send_background(fc, req); kfree(ff); } This will also fix the missing smp_barriers, is very simple, saves code, makes your life easier and is a well known known kernel infrastructure :-) BTW: FUSE rocks! :-) You can add my "Signed-off-by: Ingo Oeser <[EMAIL PROTECTED]>", if you want to use that suggestion. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 03/11] fuse: add reference counting to fuse_file
Hi Miklos, On Friday 03 August 2007, Miklos Szeredi wrote: From: Miklos Szeredi [EMAIL PROTECTED] Make lifetime of 'struct fuse_file' independent from 'struct file' by adding a reference counter and destructor. What about using krefs to implement that? see include/linux/kref.h and lib/kref.c Just embed that struct kref inside your struct fuse_file struct fuse_file { ... struct kref ref; ... } init in struct fuse_file *fuse_file_alloc(void) where you added the counter. ... kref_init(ff-ref); ... and implement the release function like: static void fuse_file_release(struct kref *ff_ref) { struct fuse_file *ff = container_of(ff_ref, struct fuse_file, ref); struct fuse_req *req = ff-reserved_req; struct fuse_conn *fc = get_fuse_conn(req-dentry-d_inode); request_send_background(fc, req); kfree(ff); } This will also fix the missing smp_barriers, is very simple, saves code, makes your life easier and is a well known known kernel infrastructure :-) BTW: FUSE rocks! :-) You can add my Signed-off-by: Ingo Oeser [EMAIL PROTECTED], if you want to use that suggestion. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [EXT4 set 7][PATCH 1/1]Remove 32000 subdirs limit.
On Tuesday 17 July 2007, Kalpak Shah wrote: > Index: linux-2.6.22/include/linux/ext4_fs.h > === > --- linux-2.6.22.orig/include/linux/ext4_fs.h > +++ linux-2.6.22/include/linux/ext4_fs.h > @@ -797,12 +797,18 @@ struct ext4_dir_entry_2 { >#define is_dx(dir) (EXT4_HAS_COMPAT_FEATURE(dir->i_sb, \ > EXT4_FEATURE_COMPAT_DIR_INDEX) > && \ > (EXT4_I(dir)->i_flags & EXT4_INDEX_FL)) > -#define EXT4_DIR_LINK_MAX(dir) (!is_dx(dir) && (dir)->i_nlink >= > EXT4_LINK_MAX) > -#define EXT4_DIR_LINK_EMPTY(dir) ((dir)->i_nlink == 2 || (dir)->i_nlink == 1) > +static inline int ext4_dir_link_max(struct inode *dir) > +{ > + return (!is_dx(dir) && (dir)->i_nlink >= EXT4_LINK_MAX); > +} > +static inline int ext4_dir_link_empty(struct inode *dir) > +{ > + return ((dir)->i_nlink == 2 || (dir)->i_nlink == 1); > +} even better: static inline bool ext4_is_dx(const struct inode *dir) { #ifdef FOOBAR return EXT4_HAS_COMPAT_FEATURE(dir->i_sb, EXT4_FEATURE_COMPAT_DIR_INDEX) && (EXT4_I(dir)->i_flags & EXT4_INDEX_FL)); #else return false; #endif } static inline bool ext4_dir_link_max(const struct inode *dir) { return !ext4_is_dx(dir) && (dir->i_nlink >= EXT4_LINK_MAX); } static inline bool ext4_dir_link_empty(const struct inode *dir) { #ifdef FOOBAR return dir->i_nlink == 2 || dir->i_nlink == 1; #else return dir->i_nlink == 2; #endif } FOOBAR is the define, which enables ext4_is_dx(). That is not in the patch, so left as an exercise to the reader :-) Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [EXT4 set 7][PATCH 1/1]Remove 32000 subdirs limit.
On Tuesday 17 July 2007, Kalpak Shah wrote: Index: linux-2.6.22/include/linux/ext4_fs.h === --- linux-2.6.22.orig/include/linux/ext4_fs.h +++ linux-2.6.22/include/linux/ext4_fs.h @@ -797,12 +797,18 @@ struct ext4_dir_entry_2 { #define is_dx(dir) (EXT4_HAS_COMPAT_FEATURE(dir-i_sb, \ EXT4_FEATURE_COMPAT_DIR_INDEX) \ (EXT4_I(dir)-i_flags EXT4_INDEX_FL)) -#define EXT4_DIR_LINK_MAX(dir) (!is_dx(dir) (dir)-i_nlink = EXT4_LINK_MAX) -#define EXT4_DIR_LINK_EMPTY(dir) ((dir)-i_nlink == 2 || (dir)-i_nlink == 1) +static inline int ext4_dir_link_max(struct inode *dir) +{ + return (!is_dx(dir) (dir)-i_nlink = EXT4_LINK_MAX); +} +static inline int ext4_dir_link_empty(struct inode *dir) +{ + return ((dir)-i_nlink == 2 || (dir)-i_nlink == 1); +} even better: static inline bool ext4_is_dx(const struct inode *dir) { #ifdef FOOBAR return EXT4_HAS_COMPAT_FEATURE(dir-i_sb, EXT4_FEATURE_COMPAT_DIR_INDEX) (EXT4_I(dir)-i_flags EXT4_INDEX_FL)); #else return false; #endif } static inline bool ext4_dir_link_max(const struct inode *dir) { return !ext4_is_dx(dir) (dir-i_nlink = EXT4_LINK_MAX); } static inline bool ext4_dir_link_empty(const struct inode *dir) { #ifdef FOOBAR return dir-i_nlink == 2 || dir-i_nlink == 1; #else return dir-i_nlink == 2; #endif } FOOBAR is the define, which enables ext4_is_dx(). That is not in the patch, so left as an exercise to the reader :-) Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] intel-rng: Undo mess made by an 80 column extremist
On Friday 08 June 2007, John Stoffel wrote: > Jeff> On Thu, Jun 07, 2007 at 09:56:06PM -0400, John Stoffel wrote: > >> Thinking about it more, I wonder if Krysztof is bitching more about > >> the tab width of 8 characters? I know that it ticks me off, > > Jeff> Even if he is, _that_ is definitely not getting changed. > > Oh sure... I know that part is written in stone. Yes, and as a person doing Linux code review for 12 years now, I'm really thankful for it. 8 char tab, 80 column rule and 25-50 lines of code per function actually enable effective review of code snippets. Because you can see more code flow per patch. And enables high code reuse. If you can get within 1-5min, what a functions does and match it with your actually written down last 20 code lines, you just reuse it more often. If you have more to choose from, you reuse naturally. Personally I find best candidates by code position in tree and function signature. > Jeff> If code starts creeping way right due to indentation levels, > Jeff> create a new function. > > Sure... compilers are good, us humans haven't gotten much better, make > it easier on us and harder on the computer. Yes, let compile remove all the abstraction overhead. GCC does a pretty good job there, I think. I recently analyzed some code and it took much, much longer (factor 2-3), because of laxer coding rules similiar to the ones you suggest. I even asked the developers, who wrote that code and to ones who work daily with that code base and they had the same problems. They all couldn't explain the "Why?" only the "How?". Not to mention, that this was a core component. After refactoring some big messy parts into smaller functions, identical, missing, unhandled cases became visible, inappropriate usages were identified and even some loops could be removed. Now try to find such problems within Linux. They should be a small percentage and not within core components. So a big THANKS to all the code cops here: You actually make the damn fast change rate of Linux possible by keeping the base clean and neat. Best Regards Ingo oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 1/6] Storing ipcs into radix trees
Hi, On Friday 08 June 2007, Nadia Derbey wrote: > Ingo Oeser wrote: > > ... together with this means 4*256 -> 1k of precious stack space used. > > Please consider either lowering IPCS_MAX_SCAN_ENTRIES or kmalloc() that. > You're completely right, but trying to lower the extraction size, I'm > afraid this will have an impact on performances. > > Here are the results of a small test I did: I have run ctxbench on both > the 256 and and 16 entries versions > > 1) 256 entries: > 42523679 itterations in 300.005423 seconds = 141743/sec > 2) 16 entries: > 41774255 itterations in 300.005334 seconds = 139245/sec So that is around 1.8% in a benchmark. Not bad, if one considers, that this is an expensive syncronisation primitive anyway (and thus shouldn't dominate any real workload). At least _much_ better than possible stack underflow :-) BTW: You forgot to include measurements with the unmodified code as it is in Linus' tree now. They woule be a nice data point here. > Will try with a dynamic allocation. But than you have an additional error path or have to sleep until memory becomes available. Maybe try doubling IPCS_MAX_SCAN_ENTRIES - until the performance impact is in the noise - is simpler. Up to 64 seems acceptable. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 1/6] Storing ipcs into radix trees
Hi, On Friday 08 June 2007, Nadia Derbey wrote: Ingo Oeser wrote: ... together with this means 4*256 - 1k of precious stack space used. Please consider either lowering IPCS_MAX_SCAN_ENTRIES or kmalloc() that. You're completely right, but trying to lower the extraction size, I'm afraid this will have an impact on performances. Here are the results of a small test I did: I have run ctxbench on both the 256 and and 16 entries versions 1) 256 entries: 42523679 itterations in 300.005423 seconds = 141743/sec 2) 16 entries: 41774255 itterations in 300.005334 seconds = 139245/sec So that is around 1.8% in a benchmark. Not bad, if one considers, that this is an expensive syncronisation primitive anyway (and thus shouldn't dominate any real workload). At least _much_ better than possible stack underflow :-) BTW: You forgot to include measurements with the unmodified code as it is in Linus' tree now. They woule be a nice data point here. Will try with a dynamic allocation. But than you have an additional error path or have to sleep until memory becomes available. Maybe try doubling IPCS_MAX_SCAN_ENTRIES - until the performance impact is in the noise - is simpler. Up to 64 seems acceptable. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] intel-rng: Undo mess made by an 80 column extremist
On Friday 08 June 2007, John Stoffel wrote: Jeff On Thu, Jun 07, 2007 at 09:56:06PM -0400, John Stoffel wrote: Thinking about it more, I wonder if Krysztof is bitching more about the tab width of 8 characters? I know that it ticks me off, Jeff Even if he is, _that_ is definitely not getting changed. Oh sure... I know that part is written in stone. Yes, and as a person doing Linux code review for 12 years now, I'm really thankful for it. 8 char tab, 80 column rule and 25-50 lines of code per function actually enable effective review of code snippets. Because you can see more code flow per patch. And enables high code reuse. If you can get within 1-5min, what a functions does and match it with your actually written down last 20 code lines, you just reuse it more often. If you have more to choose from, you reuse naturally. Personally I find best candidates by code position in tree and function signature. Jeff If code starts creeping way right due to indentation levels, Jeff create a new function. Sure... compilers are good, us humans haven't gotten much better, make it easier on us and harder on the computer. Yes, let compile remove all the abstraction overhead. GCC does a pretty good job there, I think. I recently analyzed some code and it took much, much longer (factor 2-3), because of laxer coding rules similiar to the ones you suggest. I even asked the developers, who wrote that code and to ones who work daily with that code base and they had the same problems. They all couldn't explain the Why? only the How?. Not to mention, that this was a core component. After refactoring some big messy parts into smaller functions, identical, missing, unhandled cases became visible, inappropriate usages were identified and even some loops could be removed. Now try to find such problems within Linux. They should be a small percentage and not within core components. So a big THANKS to all the code cops here: You actually make the damn fast change rate of Linux possible by keeping the base clean and neat. Best Regards Ingo oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 1/6] Storing ipcs into radix trees
Hi Nadia, good to see someone is pounding this old beast again :-) On Thursday 07 June 2007, [EMAIL PROTECTED] wrote: > Index: linux-2.6.21/ipc/util.h > === > --- linux-2.6.21.orig/ipc/util.h 2007-06-07 11:00:30.0 +0200 > +++ linux-2.6.21/ipc/util.h 2007-06-07 11:07:22.0 +0200 > @@ -13,6 +13,8 @@ > #define USHRT_MAX 0x > #define SEQ_MULTIPLIER (IPCMNI) > > +#define IPCS_MAX_SCAN_ENTRIES 256 That ... > Index: linux-2.6.21/ipc/util.c > === > --- linux-2.6.21.orig/ipc/util.c 2007-06-07 11:00:30.0 +0200 > +++ linux-2.6.21/ipc/util.c 2007-06-07 11:29:43.0 +0200 > @@ -252,72 +241,94 @@ void __init ipc_init_proc_interface(cons > * @key: The key to find > * > * Requires ipc_ids.mutex locked. > - * Returns the identifier if found or -1 if not. > + * Returns the LOCKED pointer to the ipc structure if found or NULL > + * if not. > + * If key is found ipc contains its ipc structure > */ > > -int ipc_findkey(struct ipc_ids* ids, key_t key) > +struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key) > { > - int id; > - struct kern_ipc_perm* p; > - int max_id = ids->max_id; > + struct kern_ipc_perm *ipc; > + struct kern_ipc_perm *ipcs[IPCS_MAX_SCAN_ENTRIES]; ... together with this means 4*256 -> 1k of precious stack space used. Please consider either lowering IPCS_MAX_SCAN_ENTRIES or kmalloc() that. Same problem with your third patch called "Changing the loops on a single ipcid into radix_tree_gang_lookup() calls" If you cannot sleep, try to lower that constant (e.g. 16-32). The current users use much smaller numbers. If you can sleep and performance goes down after lowering that constant, try to kmalloc these arrays (since kmalloc() of that small amount should succeed easily). Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 1/6] Storing ipcs into radix trees
Hi Nadia, good to see someone is pounding this old beast again :-) On Thursday 07 June 2007, [EMAIL PROTECTED] wrote: Index: linux-2.6.21/ipc/util.h === --- linux-2.6.21.orig/ipc/util.h 2007-06-07 11:00:30.0 +0200 +++ linux-2.6.21/ipc/util.h 2007-06-07 11:07:22.0 +0200 @@ -13,6 +13,8 @@ #define USHRT_MAX 0x #define SEQ_MULTIPLIER (IPCMNI) +#define IPCS_MAX_SCAN_ENTRIES 256 That ... Index: linux-2.6.21/ipc/util.c === --- linux-2.6.21.orig/ipc/util.c 2007-06-07 11:00:30.0 +0200 +++ linux-2.6.21/ipc/util.c 2007-06-07 11:29:43.0 +0200 @@ -252,72 +241,94 @@ void __init ipc_init_proc_interface(cons * @key: The key to find * * Requires ipc_ids.mutex locked. - * Returns the identifier if found or -1 if not. + * Returns the LOCKED pointer to the ipc structure if found or NULL + * if not. + * If key is found ipc contains its ipc structure */ -int ipc_findkey(struct ipc_ids* ids, key_t key) +struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key) { - int id; - struct kern_ipc_perm* p; - int max_id = ids-max_id; + struct kern_ipc_perm *ipc; + struct kern_ipc_perm *ipcs[IPCS_MAX_SCAN_ENTRIES]; ... together with this means 4*256 - 1k of precious stack space used. Please consider either lowering IPCS_MAX_SCAN_ENTRIES or kmalloc() that. Same problem with your third patch called Changing the loops on a single ipcid into radix_tree_gang_lookup() calls If you cannot sleep, try to lower that constant (e.g. 16-32). The current users use much smaller numbers. If you can sleep and performance goes down after lowering that constant, try to kmalloc these arrays (since kmalloc() of that small amount should succeed easily). Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bridge] [BUG] Dropping fragmented IP packets within VLAN frames on bridge
On Saturday 26 May 2007, Patrick McHardy wrote: > Adam Osuchowski wrote: > > if (((skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu) || > > (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN)) && > > !skb_is_gso(skb)) > > return ip_fragment ... > > > net/8021q ignores the VLAN header overhead, so we should probably do the > same here for consistency. Using IS_VLAN_IP (and IS_PPPOE_IP for current > -rc) looks fine, additionally we should probably also check for > skb->nfct != NULL to make sure that at least without connection tracking > the bridge doesn't perform fragmentation. And could we separe the conditions for that into a static helper function explaining each of these conditions? e.g. sth. like that: static bool br_nf_need_fragment(struct sk_buff *skb) { /* Plain IP packet does not fit in MTU */ if (!(skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu)) return true; /* VLAN encapsulated IP packet does not fit in MTU */ if (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN) return true; /* PPPoE encapsulated IP packet does not fit in MTU */ if (IS_PPPOE_IP(skb) && skb->len > skb->dev->mtu - PPPOE_SES_HLEN) return true; return false; } and then br_nf_dev_queue_xmit() becomes: static int br_nf_dev_queue_xmit(struct sk_buff *skb) { if (br_nf_need_fragment(skb) && !skb_is_gso(skb)) return ip_fragment(skb, br_dev_queue_push_xmit); else return br_dev_queue_push_xmit(skb); } which is much more readable, more documented and doesn't contain a condition monster :-) @Patrick: Could you check, wether the PPPoE case is correct? What do you think? Should I submit a patch for that? Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: epoll,threading
Hi Arunachalam, On Saturday 26 May 2007, Arunachalam wrote: > I want to know in detail about , what the events (epoll or /dev/poll or > select ) achieve in contrast to thread per client. > > i can have a thread per client and use send and recv system call directly > right? Why do i go for these event mechanisms? Try 30.000 clients or more on a x86 32bit box. That will show you the difference quite nicely :-) More seriously: Thread per client scales only to a certain amount of clients per RAM. If you like to scale beyond that to like to minimize your state per client. If you have a thread then you have a task structure as unswappable memory in kernel, a per-thread stack, which is reducing your virtual memory per process (you have only around 3GB of virtual memory per process in Linux x86 32bit). So one uses a process or thread pool to scale beyond that. Pool size is typically related to the amount of CPU cores in the system. Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: epoll,threading
Hi Arunachalam, On Saturday 26 May 2007, Arunachalam wrote: I want to know in detail about , what the events (epoll or /dev/poll or select ) achieve in contrast to thread per client. i can have a thread per client and use send and recv system call directly right? Why do i go for these event mechanisms? Try 30.000 clients or more on a x86 32bit box. That will show you the difference quite nicely :-) More seriously: Thread per client scales only to a certain amount of clients per RAM. If you like to scale beyond that to like to minimize your state per client. If you have a thread then you have a task structure as unswappable memory in kernel, a per-thread stack, which is reducing your virtual memory per process (you have only around 3GB of virtual memory per process in Linux x86 32bit). So one uses a process or thread pool to scale beyond that. Pool size is typically related to the amount of CPU cores in the system. Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bridge] [BUG] Dropping fragmented IP packets within VLAN frames on bridge
On Saturday 26 May 2007, Patrick McHardy wrote: Adam Osuchowski wrote: if (((skb-protocol == htons(ETH_P_IP) skb-len skb-dev-mtu) || (IS_VLAN_IP(skb) skb-len skb-dev-mtu - VLAN_HLEN)) !skb_is_gso(skb)) return ip_fragment ... net/8021q ignores the VLAN header overhead, so we should probably do the same here for consistency. Using IS_VLAN_IP (and IS_PPPOE_IP for current -rc) looks fine, additionally we should probably also check for skb-nfct != NULL to make sure that at least without connection tracking the bridge doesn't perform fragmentation. And could we separe the conditions for that into a static helper function explaining each of these conditions? e.g. sth. like that: static bool br_nf_need_fragment(struct sk_buff *skb) { /* Plain IP packet does not fit in MTU */ if (!(skb-protocol == htons(ETH_P_IP) skb-len skb-dev-mtu)) return true; /* VLAN encapsulated IP packet does not fit in MTU */ if (IS_VLAN_IP(skb) skb-len skb-dev-mtu - VLAN_HLEN) return true; /* PPPoE encapsulated IP packet does not fit in MTU */ if (IS_PPPOE_IP(skb) skb-len skb-dev-mtu - PPPOE_SES_HLEN) return true; return false; } and then br_nf_dev_queue_xmit() becomes: static int br_nf_dev_queue_xmit(struct sk_buff *skb) { if (br_nf_need_fragment(skb) !skb_is_gso(skb)) return ip_fragment(skb, br_dev_queue_push_xmit); else return br_dev_queue_push_xmit(skb); } which is much more readable, more documented and doesn't contain a condition monster :-) @Patrick: Could you check, wether the PPPoE case is correct? What do you think? Should I submit a patch for that? Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Introduce boot based time
Hi John and Tomas, On Thursday 10 May 2007, john stultz wrote: > I'm not sure I follow this. > > total_sleep_time stores seconds. So on 32bit systems that's 130some > years, so it shouldn't be an issue. > > Is the reason you want it to be a ktime is because you want a way to > keep sub-second sleep granularity? No, I'm just overworked and getting sloppy :-/ Sorry for the noise... Best regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Introduce boot based time
Hi Tomas, On Thursday 10 May 2007, Tomas Janousek wrote: > diff --git a/include/linux/time.h b/include/linux/time.h > index 8997b61..06f3eaf 100644 > --- a/include/linux/time.h > +++ b/include/linux/time.h > @@ -116,6 +116,8 @@ extern int do_setitimer(int which, struct itimerval > *value, > extern unsigned int alarm_setitimer(unsigned int seconds); > extern int do_getitimer(int which, struct itimerval *value); > extern void getnstimeofday(struct timespec *tv); > +extern void getboottime(struct timespec *ts); > +extern void monotonic_to_bootbased(struct timespec *ts); > > extern struct timespec timespec_trunc(struct timespec t, unsigned gran); > extern int timekeeping_is_continuous(void); > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c > index f9217bf..dd9647a 100644 > --- a/kernel/time/timekeeping.c > +++ b/kernel/time/timekeeping.c > @@ -36,9 +36,17 @@ EXPORT_SYMBOL(xtime_lock); > * at zero at system boot time, so wall_to_monotonic will be negative, > * however, we will ALWAYS keep the tv_nsec part positive so we can use > * the usual normalization. > + * > + * wall_to_monotonic is moved after resume from suspend for the monotonic > + * time not to jump. We need to add total_sleep_time to wall_to_monotonic > + * to get the real boot based time offset. > + * > + * - wall_to_monotonic is no longer the boot time, getboottime must be > + * used instead. > */ > struct timespec xtime __attribute__ ((aligned (16))); > struct timespec wall_to_monotonic __attribute__ ((aligned (16))); > +static unsigned long total_sleep_time; Could you make that a ktime_t (or struct ktime)? There are machines, which sleep more than they are awake. Just imagine a surveillance camera triggered by door entrance. Yes, these things might run Linux (e.g. on "cris" architecture). Or your VCR. Yes, these devices might sleep more than they are awake, if you are not a TV junkie :-) Best regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Introduce boot based time
Hi Tomas, On Thursday 10 May 2007, Tomas Janousek wrote: diff --git a/include/linux/time.h b/include/linux/time.h index 8997b61..06f3eaf 100644 --- a/include/linux/time.h +++ b/include/linux/time.h @@ -116,6 +116,8 @@ extern int do_setitimer(int which, struct itimerval *value, extern unsigned int alarm_setitimer(unsigned int seconds); extern int do_getitimer(int which, struct itimerval *value); extern void getnstimeofday(struct timespec *tv); +extern void getboottime(struct timespec *ts); +extern void monotonic_to_bootbased(struct timespec *ts); extern struct timespec timespec_trunc(struct timespec t, unsigned gran); extern int timekeeping_is_continuous(void); diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index f9217bf..dd9647a 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -36,9 +36,17 @@ EXPORT_SYMBOL(xtime_lock); * at zero at system boot time, so wall_to_monotonic will be negative, * however, we will ALWAYS keep the tv_nsec part positive so we can use * the usual normalization. + * + * wall_to_monotonic is moved after resume from suspend for the monotonic + * time not to jump. We need to add total_sleep_time to wall_to_monotonic + * to get the real boot based time offset. + * + * - wall_to_monotonic is no longer the boot time, getboottime must be + * used instead. */ struct timespec xtime __attribute__ ((aligned (16))); struct timespec wall_to_monotonic __attribute__ ((aligned (16))); +static unsigned long total_sleep_time; Could you make that a ktime_t (or struct ktime)? There are machines, which sleep more than they are awake. Just imagine a surveillance camera triggered by door entrance. Yes, these things might run Linux (e.g. on cris architecture). Or your VCR. Yes, these devices might sleep more than they are awake, if you are not a TV junkie :-) Best regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Introduce boot based time
Hi John and Tomas, On Thursday 10 May 2007, john stultz wrote: I'm not sure I follow this. total_sleep_time stores seconds. So on 32bit systems that's 130some years, so it shouldn't be an issue. Is the reason you want it to be a ktime is because you want a way to keep sub-second sleep granularity? No, I'm just overworked and getting sloppy :-/ Sorry for the noise... Best regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] LogFS proper
On Tuesday 08 May 2007, Thomas Gleixner wrote: > On Tue, 2007-05-08 at 00:00 +0200, Jörn Engel wrote: > > +#define packed __attribute__((__packed__)) > > Please use the __attribute__((__packed__)) on your structs instead of > creating some extra "needs lookup" magic. Don't worry, we have __packed predefined for this. Just look in include/linux/compiler-gcc.h I love it, because I always forget at least one brace or undescore level :-) Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] LogFS proper
On Tuesday 08 May 2007, Thomas Gleixner wrote: On Tue, 2007-05-08 at 00:00 +0200, Jörn Engel wrote: +#define packed __attribute__((__packed__)) Please use the __attribute__((__packed__)) on your structs instead of creating some extra needs lookup magic. Don't worry, we have __packed predefined for this. Just look in include/linux/compiler-gcc.h I love it, because I always forget at least one brace or undescore level :-) Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] [PATCH] DRM TTM Memory Manager patch
On Tuesday 01 May 2007, Dave Airlie wrote: > > - what's with the /proc interface? Don't add new proc code for > > non-process related things. This should all go into sysfs > > somewhere. And yes, I know /proc/dri/ is there today, but don't add > > new stuff please. > > Well we should move all that stuff to sysfs, but we have all the > infrastructure for publishing this stuff under /proc/dri and adding > new files doesn't take a major amount, as much as I appreciate sysfs, > it isn't suitable for this sort of information dump, the whole one > value per file is quite useless to provide this sort of information > which is uni-directional for users to send to us for debugging without > have to install some special tool to join all the values into one > place.. and I don't think drmfs is the answer either... or maybe it > is Ok, what about debugfs then? If it is just for debugging blobs -> debugfs, if it is crucial for operation -> sysfs and representation of one value per file. Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] [PATCH] DRM TTM Memory Manager patch
On Tuesday 01 May 2007, Dave Airlie wrote: - what's with the /proc interface? Don't add new proc code for non-process related things. This should all go into sysfs somewhere. And yes, I know /proc/dri/ is there today, but don't add new stuff please. Well we should move all that stuff to sysfs, but we have all the infrastructure for publishing this stuff under /proc/dri and adding new files doesn't take a major amount, as much as I appreciate sysfs, it isn't suitable for this sort of information dump, the whole one value per file is quite useless to provide this sort of information which is uni-directional for users to send to us for debugging without have to install some special tool to join all the values into one place.. and I don't think drmfs is the answer either... or maybe it is Ok, what about debugfs then? If it is just for debugging blobs - debugfs, if it is crucial for operation - sysfs and representation of one value per file. Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] crypto: Use padlock.ko only as a module
Hi Scott, On Sunday 29 April 2007, Simon Arlott wrote: > Ideally I'd just remove that module completely, all it does is > trigger the loading of the other two modules when modules are > used - so I'll submit a patch for that instead. That's much better! When you force a feature to be a module on a kernel without module support, it will effectivly be disabled. And if it is so simple to do the same in userspace like you suggest, than that's much better. Best Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] crypto: Use padlock.ko only as a module
Hi Scott, On Sunday 29 April 2007, Simon Arlott wrote: Ideally I'd just remove that module completely, all it does is trigger the loading of the other two modules when modules are used - so I'll submit a patch for that instead. That's much better! When you force a feature to be a module on a kernel without module support, it will effectivly be disabled. And if it is so simple to do the same in userspace like you suggest, than that's much better. Best Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: init's children list is long and slows reaping children.
On Tuesday 10 April 2007, Jeff Garzik wrote: > Thus, rather than forcing authors to make their code more complex, we > should find another solution. What about sth. like the "pre-forking" concept? So just have a thread creator thread, which checks the amount of unused threads and keeps them within certain limits. So that anything which needs a thread now simply queues up the work and specifies, that it wants a new thread, if possible. One problem seems to be, that a thread is nothing else but a statement on what other tasks I can wait before doing my current one (e.g. I don't want to mlseep() twice on the same reset timeout). But we usually use locking to order that. Do I miss anything fundamental here? Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: init's children list is long and slows reaping children.
On Tuesday 10 April 2007, Jeff Garzik wrote: > That's why I feel thread creation -- cheap under Linux -- is quite > appropriate for many of these situations. Maybe that (thread creation) can be done at open(), socket-creation, service request, syscall or whatever event triggers a driver/subsystem to actually queue work into a thread. And when there is a close(), socket-destruction, service completion or whatever these threads can be marked for destruction and destroyed by a timer or even immediately. Regards Ingo Oeser -- If something is getting cheap, it is getting wasted just because it is cheap. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: init's children list is long and slows reaping children.
On Tuesday 10 April 2007, Jeff Garzik wrote: That's why I feel thread creation -- cheap under Linux -- is quite appropriate for many of these situations. Maybe that (thread creation) can be done at open(), socket-creation, service request, syscall or whatever event triggers a driver/subsystem to actually queue work into a thread. And when there is a close(), socket-destruction, service completion or whatever these threads can be marked for destruction and destroyed by a timer or even immediately. Regards Ingo Oeser -- If something is getting cheap, it is getting wasted just because it is cheap. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: init's children list is long and slows reaping children.
On Tuesday 10 April 2007, Jeff Garzik wrote: Thus, rather than forcing authors to make their code more complex, we should find another solution. What about sth. like the pre-forking concept? So just have a thread creator thread, which checks the amount of unused threads and keeps them within certain limits. So that anything which needs a thread now simply queues up the work and specifies, that it wants a new thread, if possible. One problem seems to be, that a thread is nothing else but a statement on what other tasks I can wait before doing my current one (e.g. I don't want to mlseep() twice on the same reset timeout). But we usually use locking to order that. Do I miss anything fundamental here? Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: getting processor numbers
Hi Ulrich, On Tuesday 03 April 2007, Ulrich Drepper wrote: > So, anybody else has a proposal? This is a pressing issue and cannot > wait until someday in the distant future NUMA topology information is > easily and speedily accessible. Since for now you just need a fast and dirty hack, which will be replaced with better interfaces, I suggest creating a directory with some files in it. These should just contain, what you need to handle your most pressing cases. I propose /sys/devices/system/topology_counters/ for that. These can contain "online_cpu", "proped_cpu", "max_cpu" and maybe the same for nodes. All that as a simple file with an integer value. Since sysfs-attribute files are pollable (if the owners notifies sysfs on changes), you also have the notification system you need (select, poll, epoll etc.). If you promise to just keep the slow code around, than one day when the shiny NUMA topology stuff is ready, this directory can be completely removed and glibc (plus all their users) keeps working. It will then even work better with a new glibc version, which supports the shiny new NUMA topology stuff. The kernel can create these counters quiete easy, since most of them are the hamming weight (or population count) of some bitmaps. Does this sound like a proper hacky solution? :-) Regards Ingo Oeser pgpeUyaLE4v0G.pgp Description: PGP signature
Re: getting processor numbers
Hi Ulrich, On Tuesday 03 April 2007, Ulrich Drepper wrote: So, anybody else has a proposal? This is a pressing issue and cannot wait until someday in the distant future NUMA topology information is easily and speedily accessible. Since for now you just need a fast and dirty hack, which will be replaced with better interfaces, I suggest creating a directory with some files in it. These should just contain, what you need to handle your most pressing cases. I propose /sys/devices/system/topology_counters/ for that. These can contain online_cpu, proped_cpu, max_cpu and maybe the same for nodes. All that as a simple file with an integer value. Since sysfs-attribute files are pollable (if the owners notifies sysfs on changes), you also have the notification system you need (select, poll, epoll etc.). If you promise to just keep the slow code around, than one day when the shiny NUMA topology stuff is ready, this directory can be completely removed and glibc (plus all their users) keeps working. It will then even work better with a new glibc version, which supports the shiny new NUMA topology stuff. The kernel can create these counters quiete easy, since most of them are the hamming weight (or population count) of some bitmaps. Does this sound like a proper hacky solution? :-) Regards Ingo Oeser pgpeUyaLE4v0G.pgp Description: PGP signature
Re: [PATCH] sysctl: vfs_cache_divisor
Hi Randy, On Monday 19 March 2007, Randy Dunlap wrote: > Were there any patches written after this? If so, I missed them. > If not, does this patch help any? How is division by zero avoided? Maybe one can avoid setting it to zero. Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sysctl: vfs_cache_divisor
Hi Randy, On Monday 19 March 2007, Randy Dunlap wrote: Were there any patches written after this? If so, I missed them. If not, does this patch help any? How is division by zero avoided? Maybe one can avoid setting it to zero. Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 0/2] semaphores: add down_interruptible_timeout() and asm-generic/semaphore.h
Hi Inaky, On Tuesday, 27. February 2007, Inaky Perez-Gonzalez wrote: > On Monday 26 February 2007 18:18, Alan wrote: > > > Yeah, I need semaphore. This is a hw register that says when the hw > > > is ready to accept a new command. Code that wants to send commands has > > > to down the semaphore and then send it. When hw is ready to get a new > > > command, it sends and IRQ and the IRQ up()s the semaphore. > > > > So you need a mutex not a semaphore > > Theoretically I could use a mutex. Practically it would trigger ugly > complications. Only the owner can unlock a mutex (for example), so > I could not unlock from an IRQ handler -- not to mention that the > semantic rules outlined in Documentation/mutex-design.txt explicitly > forbid IRQ usage. > > And then, this is what semaphores where designed for, as gates :) > for once that I get to use a semaphore properly... But they are not required for that :-) I would suggest to use an irq-safe spinlock for the hardware access and a status indicator (ready for command), if this is really just a command register. If the status indicator is updated (in IRQ) and read under spinlock, that is safe. If that command sending is speed critical, please try a FIFO and batch that stuff. Timeout based locking mechanisms are flawed, because they introduce the hard to find timing sensitive bugs. Please try sth. different (e.g. like suggested above). Semaphores aren't good "busy/ready flags", as you might have already noticed. Many Thanks and Best Regards Ingo Oeser, the down{_interruptible,}_timeout() implementation of Linux :-) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 0/2] semaphores: add down_interruptible_timeout() and asm-generic/semaphore.h
Hi Inaky, On Tuesday, 27. February 2007, Inaky Perez-Gonzalez wrote: On Monday 26 February 2007 18:18, Alan wrote: Yeah, I need semaphore. This is a hw register that says when the hw is ready to accept a new command. Code that wants to send commands has to down the semaphore and then send it. When hw is ready to get a new command, it sends and IRQ and the IRQ up()s the semaphore. So you need a mutex not a semaphore Theoretically I could use a mutex. Practically it would trigger ugly complications. Only the owner can unlock a mutex (for example), so I could not unlock from an IRQ handler -- not to mention that the semantic rules outlined in Documentation/mutex-design.txt explicitly forbid IRQ usage. And then, this is what semaphores where designed for, as gates :) for once that I get to use a semaphore properly... But they are not required for that :-) I would suggest to use an irq-safe spinlock for the hardware access and a status indicator (ready for command), if this is really just a command register. If the status indicator is updated (in IRQ) and read under spinlock, that is safe. If that command sending is speed critical, please try a FIFO and batch that stuff. Timeout based locking mechanisms are flawed, because they introduce the hard to find timing sensitive bugs. Please try sth. different (e.g. like suggested above). Semaphores aren't good busy/ready flags, as you might have already noticed. Many Thanks and Best Regards Ingo Oeser, the down{_interruptible,}_timeout() implementation of Linux :-) - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/1] MM: detach_vmas_to_be_unmapped fix
Hi, On Wednesday, 21. February 2007, [EMAIL PROTECTED] wrote: > > --- > > mm/mmap.c |4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff -puN mm/mmap.c~Avoiding-mmap-fragmentation_fixup mm/mmap.c > --- linux-2.6_clean/mm/mmap.c~Avoiding-mmap-fragmentation_fixup > 2007-02-21 09:49:32.0 -0800 > +++ linux-2.6_clean-akuster/mm/mmap.c 2007-02-21 09:51:26.0 -0800 > @@ -1720,9 +1720,9 @@ detach_vmas_to_be_unmapped(struct mm_str > *insertion_point = vma; > tail_vma->vm_next = NULL; > if (mm->unmap_area == arch_unmap_area) > - addr = prev ? prev->vm_end : mm->mmap_base; > + addr = prev ? prev->vm_start : mm->mmap_base; > else > - addr = vma ? vma->vm_start : mm->mmap_base; > + addr = vma ? vma->vm_end : mm->mmap_base; > mm->unmap_area(mm, addr); > mm->mmap_cache = NULL; /* Kill the cache. */ > } Please comment, why you think this is necessary. Thanks & Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/1] MM: detach_vmas_to_be_unmapped fix
Hi, On Wednesday, 21. February 2007, [EMAIL PROTECTED] wrote: --- mm/mmap.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -puN mm/mmap.c~Avoiding-mmap-fragmentation_fixup mm/mmap.c --- linux-2.6_clean/mm/mmap.c~Avoiding-mmap-fragmentation_fixup 2007-02-21 09:49:32.0 -0800 +++ linux-2.6_clean-akuster/mm/mmap.c 2007-02-21 09:51:26.0 -0800 @@ -1720,9 +1720,9 @@ detach_vmas_to_be_unmapped(struct mm_str *insertion_point = vma; tail_vma-vm_next = NULL; if (mm-unmap_area == arch_unmap_area) - addr = prev ? prev-vm_end : mm-mmap_base; + addr = prev ? prev-vm_start : mm-mmap_base; else - addr = vma ? vma-vm_start : mm-mmap_base; + addr = vma ? vma-vm_end : mm-mmap_base; mm-unmap_area(mm, addr); mm-mmap_cache = NULL; /* Kill the cache. */ } Please comment, why you think this is necessary. Thanks Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] MTD: fix DOC2000/2001/2001PLUS build error
On Monday, 5. February 2007, Linus Torvalds wrote: > So thank God for the few selects we have, and we should add a whole lot > more! But "select" is not fine grained enough. I would like to have "require", "recommend", "suggest" for feature A. require X does not work without X, but X is way down the tree e.g. ext3 and block device or how select currently is intended recommend X it is usable but uncomfortable without X, enabled per default e.g. firewalling recommends connection tracking support or NAT recommends all NAT helpers suggest X many people use A together with X, so you might be interested in enabling it, but I disabled it per default unless you said "featuritis mode" before. e.g. highmem and SMP or a network driver and NAPI. That is what the Debian/Ubuntu package management does and maybe other too. And this also gives us new keywords to replace select with, so migration is doable :-) This would also make "EMBEDDED" superflous, because it would just mean "disable anything not required". And this would enable an individual tree for the users current configuration problem instead of a global one. Regards Ingo "and tomorrow we change the world" Oeser :-) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] MTD: fix DOC2000/2001/2001PLUS build error
On Monday, 5. February 2007, Linus Torvalds wrote: So thank God for the few selects we have, and we should add a whole lot more! But select is not fine grained enough. I would like to have require, recommend, suggest for feature A. require X does not work without X, but X is way down the tree e.g. ext3 and block device or how select currently is intended recommend X it is usable but uncomfortable without X, enabled per default e.g. firewalling recommends connection tracking support or NAT recommends all NAT helpers suggest X many people use A together with X, so you might be interested in enabling it, but I disabled it per default unless you said featuritis mode before. e.g. highmem and SMP or a network driver and NAPI. That is what the Debian/Ubuntu package management does and maybe other too. And this also gives us new keywords to replace select with, so migration is doable :-) This would also make EMBEDDED superflous, because it would just mean disable anything not required. And this would enable an individual tree for the users current configuration problem instead of a global one. Regards Ingo and tomorrow we change the world Oeser :-) - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH -mm 3/10][RFC] aio: use iov_length instead of ki_left
On Tuesday, 16. January 2007 06:37, Nate Diller wrote: > On 1/15/07, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > On Mon, Jan 15, 2007 at 05:54:50PM -0800, Nate Diller wrote: > > > Convert code using iocb->ki_left to use the more generic iov_length() > > > call. > > > > No way. We need to reduce the numer of iovec traversals, not adding > > more of them. > > ok, I can work on a version of this that uses struct iodesc. Maybe > something like this? > > struct iodesc { > struct iovec *iov; > unsigned long nr_segs; > size_t nbytes; > }; > > I suppose it's worth doing the iodesc thing along with this patchset > anyway, since it'll avoid an extra round of interface churn. What about this instead struct iodesc { struct iovec *iov; unsigned long nr_segs; unsigned long seg_limit; size_t nr_bytes; }; That will enable resizeable iodescs with partial completion state and will enable successive filling of an iodesc with iovs. This will be needed anyway. I built an complete short userspace module for that already. I can post and GPLv2 it somewhere, if people are interested. Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 31/59] sysctl: C99 convert the ctl_tables in arch/mips/au1000/common/power.c
Hi Eric, On Tuesday, 16. January 2007 17:39, Eric W. Biederman wrote: > diff --git a/arch/mips/au1000/common/power.c b/arch/mips/au1000/common/power.c > index b531ab7..31256b8 100644 > --- a/arch/mips/au1000/common/power.c > +++ b/arch/mips/au1000/common/power.c > @@ -419,15 +419,41 @@ static int pm_do_freq(ctl_table * ctl, int write, > struct file *file, > + { > + .ctl_name = CTL_UNNUMBERED, > + .procname = "suspend", > + .data = NULL, > + .maxlen = 0, > + .mode = 0600, > + .proc_handler = _do_suspend > + }, No need for zero initialization for maxlen. > + { > + .ctl_name = CTL_UNNUMBERED, > + .procname = "sleep", > + .data = NULL, > + .maxlen = 0, > + .mode = 0600, > + .proc_handler = _do_sleep > + }, dito > + { > + .ctl_name = CTL_UNNUMBERED, > + .procname = "freq", > + .data = NULL, > + .maxlen = 0, > + .mode = 0600, > + .proc_handler = _do_freq > + }, > + {} > }; dito Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 45/59] sysctl: C99 convert ctl_tables in drivers/parport/procfs.c
Hi Eric, On Tuesday, 16. January 2007 17:39, Eric W. Biederman wrote: > diff --git a/drivers/parport/procfs.c b/drivers/parport/procfs.c > index 2e744a2..5337789 100644 > --- a/drivers/parport/procfs.c > +++ b/drivers/parport/procfs.c > @@ -263,50 +263,118 @@ struct parport_sysctl_table { > + { > + .ctl_name = DEV_PARPORT_BASE_ADDR, > + .procname = "base-addr", > + .data = NULL, > + .maxlen = 0, > + .mode = 0444, > + .proc_handler = _hardware_base_addr > + }, No need to initialize to zero or NULL. Just list any variable, which is NOT zero or NULL. > + { > + .ctl_name = DEV_PARPORT_AUTOPROBE + 1, > + .procname = "autoprobe0", > + .data = NULL, > + .maxlen = 0, > + .maxlen = 0444, > + .proc_handler = _autoprobe > + }, Typo here? .mode = 0444 makes mor sense. Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH -mm 3/10][RFC] aio: use iov_length instead of ki_left
On Tuesday, 16. January 2007 06:37, Nate Diller wrote: On 1/15/07, Christoph Hellwig [EMAIL PROTECTED] wrote: On Mon, Jan 15, 2007 at 05:54:50PM -0800, Nate Diller wrote: Convert code using iocb-ki_left to use the more generic iov_length() call. No way. We need to reduce the numer of iovec traversals, not adding more of them. ok, I can work on a version of this that uses struct iodesc. Maybe something like this? struct iodesc { struct iovec *iov; unsigned long nr_segs; size_t nbytes; }; I suppose it's worth doing the iodesc thing along with this patchset anyway, since it'll avoid an extra round of interface churn. What about this instead struct iodesc { struct iovec *iov; unsigned long nr_segs; unsigned long seg_limit; size_t nr_bytes; }; That will enable resizeable iodescs with partial completion state and will enable successive filling of an iodesc with iovs. This will be needed anyway. I built an complete short userspace module for that already. I can post and GPLv2 it somewhere, if people are interested. Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 45/59] sysctl: C99 convert ctl_tables in drivers/parport/procfs.c
Hi Eric, On Tuesday, 16. January 2007 17:39, Eric W. Biederman wrote: diff --git a/drivers/parport/procfs.c b/drivers/parport/procfs.c index 2e744a2..5337789 100644 --- a/drivers/parport/procfs.c +++ b/drivers/parport/procfs.c @@ -263,50 +263,118 @@ struct parport_sysctl_table { + { + .ctl_name = DEV_PARPORT_BASE_ADDR, + .procname = base-addr, + .data = NULL, + .maxlen = 0, + .mode = 0444, + .proc_handler = do_hardware_base_addr + }, No need to initialize to zero or NULL. Just list any variable, which is NOT zero or NULL. + { + .ctl_name = DEV_PARPORT_AUTOPROBE + 1, + .procname = autoprobe0, + .data = NULL, + .maxlen = 0, + .maxlen = 0444, + .proc_handler = do_autoprobe + }, Typo here? .mode = 0444 makes mor sense. Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 31/59] sysctl: C99 convert the ctl_tables in arch/mips/au1000/common/power.c
Hi Eric, On Tuesday, 16. January 2007 17:39, Eric W. Biederman wrote: diff --git a/arch/mips/au1000/common/power.c b/arch/mips/au1000/common/power.c index b531ab7..31256b8 100644 --- a/arch/mips/au1000/common/power.c +++ b/arch/mips/au1000/common/power.c @@ -419,15 +419,41 @@ static int pm_do_freq(ctl_table * ctl, int write, struct file *file, + { + .ctl_name = CTL_UNNUMBERED, + .procname = suspend, + .data = NULL, + .maxlen = 0, + .mode = 0600, + .proc_handler = pm_do_suspend + }, No need for zero initialization for maxlen. + { + .ctl_name = CTL_UNNUMBERED, + .procname = sleep, + .data = NULL, + .maxlen = 0, + .mode = 0600, + .proc_handler = pm_do_sleep + }, dito + { + .ctl_name = CTL_UNNUMBERED, + .procname = freq, + .data = NULL, + .maxlen = 0, + .mode = 0600, + .proc_handler = pm_do_freq + }, + {} }; dito Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.
On Monday, 1. January 2007 17:25, Andreas Schwab wrote: > Ingo Oeser <[EMAIL PROTECTED]> writes: > > Then this works, because the side effect (+20) is evaluated only once. > > It's not a side effect, it's a non-lvalue, and you can't take the address > of a non-lvalue. Just verified this. So If we cannot make it work in all cases, it will cause more problems then it will solve. So we are left with a function, which will a) only be used by janitors to provide "kfree(x); x = NULL;" with an macro KFREE(x) in all the simple cases. b) be used by developers, who are aware of the fact that reusable pointer values should set to NULL after kfree(). Doing a) and b) is "running into open doors", so doesn't prevent any error, obfuscates code more and works only sometimes. I give up here and would vote for dropping that idea then. Regards Ingo Oeser pgpcCSfafJsC7.pgp Description: PGP signature
Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.
Hi, On Monday, 1. January 2007 07:37, Amit Choudhary wrote: > --- Ingo Oeser <[EMAIL PROTECTED]> wrote: > > #define kfree_nullify(x) do { \ > > if (__builtin_constant_p(x)) { \ > > kfree(x); \ > > } else { \ > > typeof(x) *__addr_x = \ Ok, I should change that line to typeof(x) *__addr_x = &(x); \ > > kfree(*__addr_x); \ > > *__addr_x = NULL; \ > > } \ > > } while (0) > > > > Regards > > > > Ingo Oeser > > > > This is a nice approach but what if someone does kfree_nullify(x+20). Then this works, because the side effect (+20) is evaluated only once. AFAIK __builtin_constant_p() and typeof() are both free of side effects. > I decided to keep it simple. If someone is calling kfree_nullify() with > anything other than a > simple variable, then they should call kfree(). kfree_nullify() has to replace kfree() to be of any use one day. So this is not an option. Anybody thinking of "Hey, this must be NULL afterwards!", will set it to NULL himself. Anybody else doesn't know or care about it, which is the case we like to catch. > But definitely an approach that takes care of all > situations is the best but I cannot think of a macro that can handle all > situations. The simple > macro that I sent earlier will catch all the other usage at compile time. The problems I see are: 1. parameter to kfree is a value not a pointer -> solved by using a macro instead of function, but generate new (the other) problems -> take the address of the value there. 2. possible side effects of macro parameter usage -> solved by assigning once only and using typeof 3. Constants don't have an address -> need to check for constant So apart from missing braces before taking the address, I don't see any problem with my solution :-) Should I send a patch? > Please let me know if I have missed something. I reviewed it and you missed side effects (kfree(x); x = NULL). Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.
Hi, On Monday, 1. January 2007 07:37, Amit Choudhary wrote: --- Ingo Oeser [EMAIL PROTECTED] wrote: #define kfree_nullify(x) do { \ if (__builtin_constant_p(x)) { \ kfree(x); \ } else { \ typeof(x) *__addr_x = x; \ Ok, I should change that line to typeof(x) *__addr_x = (x); \ kfree(*__addr_x); \ *__addr_x = NULL; \ } \ } while (0) Regards Ingo Oeser This is a nice approach but what if someone does kfree_nullify(x+20). Then this works, because the side effect (+20) is evaluated only once. AFAIK __builtin_constant_p() and typeof() are both free of side effects. I decided to keep it simple. If someone is calling kfree_nullify() with anything other than a simple variable, then they should call kfree(). kfree_nullify() has to replace kfree() to be of any use one day. So this is not an option. Anybody thinking of Hey, this must be NULL afterwards!, will set it to NULL himself. Anybody else doesn't know or care about it, which is the case we like to catch. But definitely an approach that takes care of all situations is the best but I cannot think of a macro that can handle all situations. The simple macro that I sent earlier will catch all the other usage at compile time. The problems I see are: 1. parameter to kfree is a value not a pointer - solved by using a macro instead of function, but generate new (the other) problems - take the address of the value there. 2. possible side effects of macro parameter usage - solved by assigning once only and using typeof 3. Constants don't have an address - need to check for constant So apart from missing braces before taking the address, I don't see any problem with my solution :-) Should I send a patch? Please let me know if I have missed something. I reviewed it and you missed side effects (kfree(x); x = NULL). Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.
On Monday, 1. January 2007 17:25, Andreas Schwab wrote: Ingo Oeser [EMAIL PROTECTED] writes: Then this works, because the side effect (+20) is evaluated only once. It's not a side effect, it's a non-lvalue, and you can't take the address of a non-lvalue. Just verified this. So If we cannot make it work in all cases, it will cause more problems then it will solve. So we are left with a function, which will a) only be used by janitors to provide kfree(x); x = NULL; with an macro KFREE(x) in all the simple cases. b) be used by developers, who are aware of the fact that reusable pointer values should set to NULL after kfree(). Doing a) and b) is running into open doors, so doesn't prevent any error, obfuscates code more and works only sometimes. I give up here and would vote for dropping that idea then. Regards Ingo Oeser pgpcCSfafJsC7.pgp Description: PGP signature
Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.
On Sunday, 31. December 2006 14:38, Bernd Petrovitsch wrote: > That depends on the decision/definition if (so called) "double free" is > an error or not (and "free(NULL)" must work in POSIX-compliant > environments). A double free of non-NULL is certainly an error. So the idea of setting it to NULL is ok, since then you can kfree the variable over and over again without any harm. It is just complicated to do this side effect free. Maybe one should check for builtin-constant and take the address, if this is not an builtin-constant. sth, like this #define kfree_nullify(x) do { \ if (__builtin_constant_p(x)) { \ kfree(x); \ } else { \ typeof(x) *__addr_x = \ kfree(*__addr_x); \ *__addr_x = NULL; \ } \ } while (0) Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/8] KVM: Implement a few system configuration msrs
Hi, On Thursday, 28. December 2006 11:11, Avi Kivity wrote: > Index: linux-2.6/drivers/kvm/svm.c > === > --- linux-2.6.orig/drivers/kvm/svm.c > +++ linux-2.6/drivers/kvm/svm.c > @@ -1068,6 +1068,9 @@ static int emulate_on_interception(struc > static int svm_get_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 *data) > { > switch (ecx) { > + case 0xc0010010: /* SYSCFG */ > + case 0xc0010015: /* HWCR */ > + case MSR_IA32_PLATFORM_ID: > case MSR_IA32_P5_MC_ADDR: > case MSR_IA32_P5_MC_TYPE: > case MSR_IA32_MC0_CTL: What about just defining constants for these? Then you can rip out these comments. Same for linux-2.6/drivers/kvm/vmx.c Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/8] KVM: Implement a few system configuration msrs
Hi, On Thursday, 28. December 2006 11:11, Avi Kivity wrote: Index: linux-2.6/drivers/kvm/svm.c === --- linux-2.6.orig/drivers/kvm/svm.c +++ linux-2.6/drivers/kvm/svm.c @@ -1068,6 +1068,9 @@ static int emulate_on_interception(struc static int svm_get_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 *data) { switch (ecx) { + case 0xc0010010: /* SYSCFG */ + case 0xc0010015: /* HWCR */ + case MSR_IA32_PLATFORM_ID: case MSR_IA32_P5_MC_ADDR: case MSR_IA32_P5_MC_TYPE: case MSR_IA32_MC0_CTL: What about just defining constants for these? Then you can rip out these comments. Same for linux-2.6/drivers/kvm/vmx.c Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [DISCUSS] Make the variable NULL after freeing it.
On Sunday, 31. December 2006 14:38, Bernd Petrovitsch wrote: That depends on the decision/definition if (so called) double free is an error or not (and free(NULL) must work in POSIX-compliant environments). A double free of non-NULL is certainly an error. So the idea of setting it to NULL is ok, since then you can kfree the variable over and over again without any harm. It is just complicated to do this side effect free. Maybe one should check for builtin-constant and take the address, if this is not an builtin-constant. sth, like this #define kfree_nullify(x) do { \ if (__builtin_constant_p(x)) { \ kfree(x); \ } else { \ typeof(x) *__addr_x = x; \ kfree(*__addr_x); \ *__addr_x = NULL; \ } \ } while (0) Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 009 of 14] knfsd: SUNRPC: teach svc_sendto() to deal with IPv6 addresses
On Wednesday, 13. December 2006 00:59, NeilBrown wrote: > diff .prev/net/sunrpc/svcsock.c ./net/sunrpc/svcsock.c > --- .prev/net/sunrpc/svcsock.c2006-12-13 10:31:39.0 +1100 > +++ ./net/sunrpc/svcsock.c2006-12-13 10:32:15.0 +1100 > @@ -438,6 +439,47 @@ svc_wake_up(struct svc_serv *serv) > } > } > > +union svc_pktinfo_u { > + struct in_pktinfo pkti; > +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) > + struct in6_pktinfo pkti6; > +#endif > +}; > + > +static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh) > +{ > + switch (rqstp->rq_sock->sk_sk->sk_family) { > + case AF_INET: > + do { > + struct in_pktinfo *pki = > + (struct in_pktinfo *) CMSG_DATA(cmh); struct in_pktinfo *pki = CMSG_DATA(cmh); Ugly casting not needed here, since CMSG_DATA should return "void *", which can be casted to any pointer. > + > + cmh->cmsg_level = SOL_IP; > + cmh->cmsg_type = IP_PKTINFO; > + pki->ipi_ifindex = 0; > + pki->ipi_spec_dst.s_addr = rqstp->rq_daddr.addr.s_addr; > + cmh->cmsg_len = CMSG_LEN(sizeof(*pki)); > + } while (0); > + break; > +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) > + case AF_INET6: > + do { > + struct in6_pktinfo *pki = > + (struct in6_pktinfo *) CMSG_DATA(cmh); > + No casting needed, so: struct in6_pktinfo *pki = CMSG_DATA(cmh); Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.19] e1000: replace kmalloc with kzalloc
On Tuesday, 12. December 2006 18:34, Pekka Enberg wrote: > On 12/12/06, Yan Burman <[EMAIL PROTECTED]> wrote: > > size = txdr->count * sizeof(struct e1000_buffer); > > - if (!(txdr->buffer_info = kmalloc(size, GFP_KERNEL))) { > > + if (!(txdr->buffer_info = kzalloc(size, GFP_KERNEL))) { > > ret_val = 1; > > goto err_nomem; > > } > > - memset(txdr->buffer_info, 0, size); > > No one seems to be using size elsewhere so why not convert to > kcalloc() and get rid of it? (Seems to apply to other places as well.) Because if done properly that often exceeds the 80 column limit. The intermediate variable should be optimized away from the compiler. But kcalloc() is better for another reason: Overflow checking. Regards Ingo Oeser - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.19] e1000: replace kmalloc with kzalloc
On Tuesday, 12. December 2006 18:34, Pekka Enberg wrote: On 12/12/06, Yan Burman [EMAIL PROTECTED] wrote: size = txdr-count * sizeof(struct e1000_buffer); - if (!(txdr-buffer_info = kmalloc(size, GFP_KERNEL))) { + if (!(txdr-buffer_info = kzalloc(size, GFP_KERNEL))) { ret_val = 1; goto err_nomem; } - memset(txdr-buffer_info, 0, size); No one seems to be using size elsewhere so why not convert to kcalloc() and get rid of it? (Seems to apply to other places as well.) Because if done properly that often exceeds the 80 column limit. The intermediate variable should be optimized away from the compiler. But kcalloc() is better for another reason: Overflow checking. Regards Ingo Oeser - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/