Re: Add the infamous Huawei E220 to option.c
On Thu, 29 Nov 2007 08:38:59 +0100, Oliver Neukum <[EMAIL PROTECTED]> wrote:
> On Thursday, 29 November 2007 01:13:05, Pete Zaitcev wrote:
> > The problem stems from the fact that both option and usb-storage can bind
> > to the modem when in storage mode: the former binds because of the storage
> > class, the latter binds because of VID/PID match. The modprobe loads both,
>
> Isn't it possible to fix this in option's module table?

At first thought it'll need adding a field to struct usb_serial to save
the driver_info from the ID table in usb_serial_probe.

It's something I'd like to discuss actually. I hate fields which store
information this way: filled in one place, used in another place...
From the perspective of code prettiness I would rather add another
method for usb_serial_probe to call. But I'm not sure really.

-- Pete
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
> Hi Gary, Kenji-san, et al.,
>
> * Gary Hade <[EMAIL PROTECTED]>:
>> Alex, What I was trying to suggest is a boot-time kernel
>> option, not a kernel configuration option. The basic idea is
>> to give the user (with a single binary kernel) the ability to
>> include your ACPI-PCI slot driver feature changes only when
>> they are really needed. In addition to reducing the number of
>> system/PCI hotplug driver combinations where your changes would
>> need to be validated, I believe it would also help alleviate other
>> worries (e.g. Andi Kleen's memory consumption concern). I
>> believe this goal could also be achieved with the kernel config
>> option by making the pci_slot module runtime loadable with the
>> PCI hotplug drivers only visiting your new code when the
>> pci_slot driver is loaded, although I think this would be more
>> difficult to implement.
>
> I have modified my patch series so that the final patch that
> introduces my ACPI-PCI slot driver is a full-fledged module, that
> has a tristate Kconfig option.
>

Thank you for your good job.

I tested shpchp and pciehp both with and without the pci_slot module.
There seems to be no regression from shpchp and pciehp's point of view.
(I had a little concern about the hotplug slots' names, which vary
depending on whether pci_slot functionality is enabled or disabled.
But now that we can build the pci_slot driver as a kernel module, I
don't think it is a big problem.)

The only problem is that I got Call Traces with the following error
messages when the pci_slot driver was loaded, and one strange slot
named '1023' was registered (other slots are fine). This is the same
problem I reported before.

sysfs: duplicate filename '1023' can not be created
WARNING: at fs/sysfs/dir.c:424 sysfs_add_one()
kobject_add failed for 1023 with -EEXIST, don't try to register things with the same name in the same directory.

On my system, hotplug slots themselves can be added, removed and
replaced with the other type of I/O box. The ACPI firmware tells the OS
about the presence of those slots using the _STA method (that is, it
doesn't use the 'LoadTable()' AML operator). On the other hand, the
current pci_slot driver doesn't check _STA. As a result, the pci_slot
driver tried to register the invalid (non-existing) slots. The ACPI
firmware of my system returns '1023' if the invalid slot's _SUN is
evaluated. This is the cause of the Call Traces mentioned above. To fix
this problem, the pci_slot driver needs to check _STA when scanning the
ACPI Namespace.

I'm sorry for reporting this so late. I'm attaching the patch to fix
the problem. This is against 2.6.24-rc3 with your patches applied.
Could you try it?

BTW, acpiphp also seems to have the same problem...

Thanks,
Kenji Kaneshige

---
 drivers/acpi/pci_slot.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Index: linux-2.6.24-rc3/drivers/acpi/pci_slot.c
===================================================================
--- linux-2.6.24-rc3.orig/drivers/acpi/pci_slot.c
+++ linux-2.6.24-rc3/drivers/acpi/pci_slot.c
@@ -113,10 +113,17 @@ register_slot(acpi_handle handle, u32 lv
 	int device;
 	unsigned long sun;
 	char name[KOBJ_NAME_LEN];
+	acpi_status status;
+	struct acpi_device *dummy_device;
 
 	struct pci_slot *pci_slot;
 	struct pci_bus *pci_bus = context;
 
+	/* Skip non-existing device object. */
+	status = acpi_bus_get_device(handle, &dummy_device);
+	if (ACPI_FAILURE(status))
+		return AE_OK;
+
 	if (check_slot(handle, &device, &sun))
 		return AE_OK;
 
@@ -150,12 +157,18 @@ walk_p2p_bridge(acpi_handle handle, u32
 	acpi_status status;
 	acpi_handle dummy_handle;
 	acpi_walk_callback user_function;
+	struct acpi_device *dummy_device;
 
 	struct pci_dev *dev;
 	struct pci_bus *pci_bus;
 	struct p2p_bridge_context child_context;
 	struct p2p_bridge_context *parent_context = context;
 
+	/* Skip non-existing device object. */
+	status = acpi_bus_get_device(handle, &dummy_device);
+	if (ACPI_FAILURE(status))
+		return AE_OK;
+
 	pci_bus = parent_context->pci_bus;
 	user_function = parent_context->user_function;
Re: Question regarding mutex locking
On 29-11-2007 03:34, David Schwartz wrote:
>> Thanks for the help. Someday, I hope to understand this stuff.
>>
>> Larry
>
> Any code either deals with an object or it doesn't. If it doesn't deal with
> that object, it should not be acquiring locks on that object. If it does
> deal with that object, it must know the internal details of that object,
> including when and whether locks are held, or it cannot deal with that
> object sanely.
...

Maybe it'll unnecessarily complicate the thing, but since you repeat
the need to know the object: sometimes the locking is done only to
synchronize something in time, to ensure that only one action is done
at a time, or that a few actions are done in the proper order, and/or
that they aren't broken in the meantime by other actions (so there is
no need to deal with any common data). But, of course, we can say an
action could be a kind of object too.

Regards,
Jarek P.
Re: [RFC] New kobject/kset/ktype documentation and example code
On Wed, 2007-11-28 at 22:08 -0800, Greg KH wrote:
> On Wed, Nov 28, 2007 at 06:00:27PM +0100, Kay Sievers wrote:
> > On Wed, 2007-11-28 at 17:51 +0100, Cornelia Huck wrote:
> > > On Wed, 28 Nov 2007 17:36:29 +0100, Kay Sievers <[EMAIL PROTECTED]> wrote:
> > > > On Wed, 2007-11-28 at 17:12 +0100, Cornelia Huck wrote:
> > > > > On Wed, 28 Nov 2007 16:57:48 +0100, Kay Sievers <[EMAIL PROTECTED]> wrote:
> > > > > > On Wed, 2007-11-28 at 16:48 +0100, Cornelia Huck wrote:
> > > > > > > On Wed, 28 Nov 2007 13:23:02 +0100, Kay Sievers <[EMAIL PROTECTED]> wrote:
> > > > > > > > On Wed, 2007-11-28 at 12:45 +0100, Cornelia Huck wrote:
> > > > > > > > > On Tue, 27 Nov 2007 15:02:52 -0800, Greg KH <[EMAIL PROTECTED]> wrote:
> > > > > > > > > > The uevent function will be called when the uevent is about to be sent to
> > > > > > > > > > userspace to allow more environment variables to be added to the uevent.
> > > > > > > > >
> > > > > > > > > It may be helpful to mention which uevents are by default created by
> > > > > > > > > the kobject core (KOBJ_ADD, KOBJ_DEL, KOBJ_MOVE).
> > > > > > > >
> > > > > > > > I think, we should remove all these default events from the kobject
> > > > > > > > core. We will not be able to manage the timing issues and "raw" kobject
> > > > > > > > users should request the events on their own, when they are finished
> > > > > > > > adding stuff to the kobject. I see currently no way to solve the
> > > > > > > > "attributes created after the event" problem. The new
> > > > > > > > *_create_and_register functions do not allow default attributes to be
> > > > > > > > created, which will just lead to serious trouble when someone wants to
> > > > > > > > use udev to set defaults and such things. We may just want to require an
> > > > > > > > explicit call to send the event?
> > > > > > >
> > > > > > > There will always be attributes that will show up later (for example,
> > > > > > > after a device is activated). Probably the best approach is to keep the
> > > > > > > default uevents, but have the attribute-adder send another uevent when
> > > > > > > they are done?
> > > > > >
> > > > > > Uh, that's more an exception where we can't give guarantees because of
> > > > > > very specific hardware setups, and it would be an additional "change"
> > > > > > event. There are valid cases for this, but only a _very_ few.
> > > > > >
> > > > > > There is absolutely no reason not to do it right with the "add" event,
> > > > > > just because we are too lazy to solve it proper the current code. It's
> > > > > > just so broken by design, what we are doing today. :)
> > > > >
> > > > > I'm worrying a bit about changes that impact the whole code tree in
> > > > > lots of places. I'd be fine with the device layer doing its uevent
> > > > > manually in device_add() at the very end, though. (This would allow
> > > > > drivers to add attributes in their probe function before the uevent,
> > > > > for example.)
> >
> > I think I still remember what I did 2.5 years ago :)
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e57cd73e2e844a3da25cc6b420674c81bbe1b387
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=18c3d5271b472c096adfc856e107c79f6fd30d7d
> >
> > > > The driver core does use the split already in most places, I did that
> > > > long ago. There are not too many (~20) users of kobject_register(), and
> > > > it's a pretty straight-forward change to change that to _init, _add,
> > > > _uevent, and get rid of that totally useless "convenience api".
> > > >
> > > > I think there is no longer any excuse to keep that broken code around,
> > > > and even require to document that it's broken. The whole purpose of the
> > > > uevent is userspace consumption, which just doesn't work correctly with
> > > > the code we offer. The fix is trivial, and should be done now, and we no
> > > > longer need to fiddle around timing issues, just because we are too
> > > > lazy.
> > > >
> > > > I propose the removal of _all_ functions that have *register* in their
> > > > name, and always require the following sequence:
> > > >   _init()
> > > >   _add()
> > > >   _uevent(_ADD)
> > > >
> > > >   _uevent(_REMOVE)
> > > >   _del()
> > > >   _put()
> > > >
> > > > The _create_and_register() functions would become _create_and_add()
> > > > and will need an additional _uevent() call after they populated the
> > > > object.
> > >
> > > I'm absolutely fine with doing that at the kobject level (after all,
> > > it's a quite contained change, and the uevent function explicitly
> > > works on a kobject).
> > >
> > > For the other
Re: Add the infamous Huawei E220 to option.c
On Thursday, 29 November 2007 07:33:03, Johann Wilhelm wrote:
> But in my opinion the module load order should be forced by udev...
> this should work and we only have 1 position to keep in mind if we eg.
> get a new E220, support for the E270 or something else...

No, udev cannot help here because either of the two modules may already
be loaded when you plug in your device. You also need to get the kernel
space probing corrected.

Basically you have three options.

1. Make both drivers handle the issue. That means code duplication.
2. Make the option driver fail gracefully in probe().
3. Make sure usbcore doesn't probe the devices in the wrong mode with
   the option driver.

Regards
Oliver
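
Option 2 in the list above could look roughly like the following userspace sketch. It is not kernel code: `struct fake_usb_interface` and `option_probe_sketch()` are hypothetical stand-ins invented for illustration, and the real option driver may need to inspect other fields. Only the interface class codes (0x08 for mass storage, 0xff for vendor specific) come from the USB specification.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Interface class codes defined by the USB specification. */
#define USB_CLASS_MASS_STORAGE 0x08
#define USB_CLASS_VENDOR_SPEC  0xff

/* Hypothetical stand-in for the fields a serial probe() would inspect. */
struct fake_usb_interface {
	uint16_t vendor_id;
	uint16_t product_id;
	uint8_t  interface_class;
};

/*
 * Sketch of "fail gracefully in probe()": if the device still exposes
 * its mass-storage interface, decline to bind so that another driver
 * can claim it and perform the mode switch. Returning -ENODEV is the
 * usual way for a probe routine to say "not mine".
 */
static int option_probe_sketch(const struct fake_usb_interface *intf)
{
	if (intf->interface_class == USB_CLASS_MASS_STORAGE)
		return -ENODEV;

	return 0; /* modem mode: go ahead and bind */
}
```

With this shape, load order stops mattering for correctness: whichever module is loaded first, the serial driver simply refuses an interface that is still in storage mode.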
Re: Add the infamous Huawei E220 to option.c
Hi there,

Well your code basically looks nice... but keep in mind that there are
several different E220 devices (in fact I know of 2 different PIDs...
and I would be really surprised if they only use 2 of them...). So you
should check all possible PIDs...

But in my opinion the module load order should be forced by udev...
this should work and we only have 1 position to keep in mind if we eg.
get a new E220, support for the E270 or something else...

73

Quoting Pete Zaitcev <[EMAIL PROTECTED]>:

Hi, All:

It looks like the Huawei E220 saga is not over yet. A colleague of mine,
David Russll, reported that the modem does not work reliably on Fedora 8,
which does have the initializer in usb-storage.

The problem stems from the fact that both option and usb-storage can bind
to the modem when in storage mode: the former binds because of the storage
class, the latter binds because of VID/PID match. The modprobe loads both,
it's random which wins. If usb-storage wins, everything is fine. If option
wins, it binds to modem still in storage mode and does not work.

I propose we add the same initializer that usb-storage has to the option.
This way no matter which driver wins the modem gets initialized.

The patch is tested on David's modem, but I would like someone give it
more testing.

I dunno, do we want some kind of code sharing between storage and option?
They both could use the normal usb_control_msg, I think.

Also, from archives it looks like Johann may need PID 0x1004 added.

Since we're on topic, David's modem has exactly same IDs as Norbert's,
but works fine with the length of 1. Although it's possible that the
firmware is different without different firmware reported in USB
descriptors. Does anyone know a magic AT command? ATI or something?
Norbert, please try my patch, maybe it'll work this time.

And finally, please stop using that script from the Polish website and
above all quit using the generic serial subdriver. The option must work
now with the patch. Please let me know if it fails.

Thanks in advance,
-- Pete

diff -urp -X dontdiff linux-2.6.23.1-42.fc8/drivers/usb/serial/option.c linux-2.6.23.1-42.fc8.e220.1/drivers/usb/serial/option.c
--- linux-2.6.23.1-42.fc8/drivers/usb/serial/option.c	2007-10-09 13:31:38.000000000 -0700
+++ linux-2.6.23.1-42.fc8.e220.1/drivers/usb/serial/option.c	2007-11-27 21:36:11.000000000 -0800
@@ -448,7 +448,7 @@ static void option_indat_callback(struct
 			err = usb_submit_urb(urb, GFP_ATOMIC);
 			if (err)
 				printk(KERN_ERR "%s: resubmit read urb failed. "
-					"(%d)", __FUNCTION__, err);
+					"(%d)\n", __FUNCTION__, err);
 		}
 	}
 	return;
@@ -728,6 +728,35 @@ static int option_send_setup(struct usb_
 	return 0;
 }
 
+static void option_start_huawei(struct usb_serial *serial)
+{
+	struct usb_device *dev = serial->dev;
+	char *buf;
+	int rc;
+
+	if (!(le16_to_cpu(dev->descriptor.idVendor) == HUAWEI_VENDOR_ID &&
+	    le16_to_cpu(dev->descriptor.idProduct) == HUAWEI_PRODUCT_E220))
+		return;
+
+	if ((buf = kmalloc(1, GFP_KERNEL)) == 0)
+		goto err_buf;
+
+	buf[0] = 0x1;
+	rc = usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
+	    USB_REQ_SET_FEATURE, USB_TYPE_STANDARD | USB_RECIP_DEVICE,
+	    0x01, 0x0, buf, 1, 1000);
+	if (rc) {
+		printk(KERN_ERR "%s: HUAWEI E220 setup failed (%d)\n",
+		    __FUNCTION__, rc);
+	}
+
+	kfree(buf);
+	return;
+
+err_buf:
+	;
+}
+
 static int option_startup(struct usb_serial *serial)
 {
 	int i, err;
@@ -736,6 +765,8 @@ static int option_startup(struct usb_ser
 
 	dbg("%s", __FUNCTION__);
 
+	option_start_huawei(serial);
+
 	/* Now setup per port private data */
 	for (i = 0; i < serial->num_ports; i++) {
 		port = serial->port[i];
Re: Add the infamous Huawei E220 to option.c
On Thursday, 29 November 2007 01:13:05, Pete Zaitcev wrote:
> The problem stems from the fact that both option and usb-storage can bind
> to the modem when in storage mode: the former binds because of the storage
> class, the latter binds because of VID/PID match. The modprobe loads both,

Isn't it possible to fix this in option's module table?

Regards
Oliver
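
One way to read the module-table suggestion above is to make the ID table match on the interface class as well as on VID/PID, so that a storage-mode interface never matches in the first place. The sketch below models that in plain userspace C. `struct id_entry_sketch`, `option_ids_sketch` and `id_table_matches()` are hypothetical names, the vendor and product IDs are placeholders, and this is not a claim about how the option driver's table is actually laid out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ID-table entry: VID/PID plus interface class. */
struct id_entry_sketch {
	uint16_t vendor_id;
	uint16_t product_id;
	uint8_t  interface_class;
};

/* Placeholder IDs; 0xff is the vendor-specific interface class. */
static const struct id_entry_sketch option_ids_sketch[] = {
	{ 0x1234, 0x0001, 0xff },
};

/*
 * Return 1 only when vendor, product AND interface class all match an
 * entry. A device that still shows a mass-storage interface (class
 * 0x08) under the same VID/PID therefore does not match.
 */
static int id_table_matches(uint16_t vendor, uint16_t product, uint8_t cls)
{
	size_t i;
	size_t n = sizeof(option_ids_sketch) / sizeof(option_ids_sketch[0]);

	for (i = 0; i < n; i++) {
		const struct id_entry_sketch *e = &option_ids_sketch[i];

		if (e->vendor_id == vendor && e->product_id == product &&
		    e->interface_class == cls)
			return 1;
	}
	return 0;
}
```

The trade-off against a check inside probe() is that the decision moves into static data: nothing has to be stored in the driver's private structure, but every supported device needs an entry that spells out its interface class.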
[PATCH 3/6] timekeeping: rename timekeeping_is_continuous to timekeeping_valid_for_hres
Function timekeeping_is_continuous() no longer checks flag
CLOCK_IS_CONTINUOUS, and it checks CLOCK_SOURCE_VALID_FOR_HRES now.
So rename the function accordingly.

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>
---
 include/linux/time.h      |    2 +-
 kernel/time/tick-sched.c  |    2 +-
 kernel/time/timekeeping.c |    4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index b04136d..fa21fe5 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -120,7 +120,7 @@ extern void getboottime(struct timespec *ts);
 extern void monotonic_to_bootbased(struct timespec *ts);
 
 extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
-extern int timekeeping_is_continuous(void);
+extern int timekeeping_valid_for_hres(void);
 extern void update_wall_time(void);
 
 /**
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 27a2338..fb69787 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -654,7 +654,7 @@ int tick_check_oneshot_change(int allow_nohz)
 	if (ts->nohz_mode != NOHZ_MODE_INACTIVE)
 		return 0;
 
-	if (!timekeeping_is_continuous() || !tick_is_oneshot_available())
+	if (!timekeeping_valid_for_hres() || !tick_is_oneshot_available())
 		return 0;
 
 	if (!allow_nohz)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e5e466b..e112dc4 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -211,9 +211,9 @@ static inline s64 __get_nsec_offset(void) { return 0; }
 #endif
 
 /**
- * timekeeping_is_continuous - check to see if timekeeping is free running
+ * timekeeping_valid_for_hres - Check if timekeeping is suitable for hres
  */
-int timekeeping_is_continuous(void)
+int timekeeping_valid_for_hres(void)
 {
 	unsigned long seq;
 	int ret;
-- 
1.5.3.rc7
Re: void* arithmetic
On Nov. 29, 2007, 3:19 +0200, "Ming Lei" <[EMAIL PROTECTED]> wrote:
> 2007/11/29, Jan Engelhardt <[EMAIL PROTECTED]>:
>> On Nov 29 2007 01:05, J.A. Magallón wrote:
>>> Since begin of the ages the build of the nvidia driver says things like
>>> this:
>>>
>> Explicitly adding -Wpointer-arith to ones own Makefile is like
>> admitting the code might be problematic. :->
>>
>>
>> I think sizeof(void *) == 1 is taken as granted as sizeof(int) >= 4
>> these days. Sigh.
> sizeof(void *) == 4, sizeof(void)==1, :)

well, sizeof(void *) == sizeof(unsigned long) maybe :)
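
For readers following the sizeof() back-and-forth above: GNU C treats sizeof(void) as 1 as an extension, which is what makes arithmetic on `void *` compile, and `-Wpointer-arith` is the flag that warns about it. Standard C does not define arithmetic on `void *` at all. A minimal sketch of the portable spelling follows; `advance_bytes()` is a hypothetical helper name used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Portable byte-wise pointer arithmetic: convert to unsigned char *,
 * whose element size is 1 by definition, do the addition there, and
 * let the result convert back to void * implicitly.
 *
 * GNU C would also accept "return p + n;" directly on the void *,
 * but that relies on the sizeof(void) == 1 extension.
 */
static void *advance_bytes(void *p, size_t n)
{
	return (unsigned char *)p + n;
}
```

On the last remark in the thread: sizeof(void *) == sizeof(unsigned long) holds on the ILP32 and LP64 data models that Linux uses, but the C standard does not guarantee it; `uintptr_t` from `<stdint.h>` is the type intended for holding a pointer value as an integer.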
[PATCH 4/6] time: fix typo in comments
Fix typo in comments.

BTW: I have to fix coding style in arch/ia64/kernel/time.c also,
otherwise checkpatch.pl will be complaining.

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>
---
 arch/ia64/kernel/time.c   |   14 +++++++-------
 arch/x86/kernel/time_64.c |    2 +-
 include/linux/hrtimer.h   |    2 +-
 include/linux/jiffies.h   |    6 +++---
 kernel/time.c             |    4 ++--
 kernel/time/clockevents.c |    2 +-
 kernel/time/timekeeping.c |    2 +-
 7 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 2bb8421..5fc8c89 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -49,13 +49,13 @@ EXPORT_SYMBOL(last_cli_ip);
 #endif
 
 static struct clocksource clocksource_itc = {
-        .name           = "itc",
-        .rating         = 350,
-        .read           = itc_get_cycles,
-        .mask           = CLOCKSOURCE_MASK(64),
-        .mult           = 0, /*to be caluclated*/
-        .shift          = 16,
-        .flags          = CLOCK_SOURCE_IS_CONTINUOUS,
+	.name		= "itc",
+	.rating		= 350,
+	.read		= itc_get_cycles,
+	.mask		= CLOCKSOURCE_MASK(64),
+	.mult		= 0, /*to be calculated*/
+	.shift		= 16,
+	.flags		= CLOCK_SOURCE_IS_CONTINUOUS,
 };
 static struct clocksource *itc_clocksource;
 
diff --git a/arch/x86/kernel/time_64.c b/arch/x86/kernel/time_64.c
index 368b194..2cc7570 100644
--- a/arch/x86/kernel/time_64.c
+++ b/arch/x86/kernel/time_64.c
@@ -235,7 +235,7 @@ static unsigned int __init tsc_calibrate_cpu_khz(void)
 		reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
 	}
 	local_irq_save(flags);
-	/* start meauring cycles, incrementing from 0 */
+	/* start measuring cycles, incrementing from 0 */
 	wrmsrl(MSR_K7_PERFCTR0 + i, 0);
 	wrmsrl(MSR_K7_EVNTSEL0 + i, 1 << 22 | 3 << 16 | 0x76);
 	rdtscl(tsc_start);
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 80c7e98..d42c6be 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -78,7 +78,7 @@ enum hrtimer_cb_mode {
  * as otherwise the timer could be removed before the softirq code finishes the
  * the handling of the timer.
  *
- * The HRTIMER_STATE_ENQUEUE bit is always or'ed to the current state to
+ * The HRTIMER_STATE_ENQUEUED bit is always or'ed to the current state to
  * preserve the HRTIMER_STATE_CALLBACK bit in the above scenario.
  *
  * All state transitions are protected by cpu_base->lock.
diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 8b08002..b071f46 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -36,7 +36,7 @@
 /* LATCH is used in the interval timer and ftape setup. */
 #define LATCH  ((CLOCK_TICK_RATE + HZ/2) / HZ)	/* For divider */
 
-/* Suppose we want to devide two numbers NOM and DEN: NOM/DEN, the we can
+/* Suppose we want to devide two numbers NOM and DEN: NOM/DEN, then we can
  * improve accuracy by shifting LSH bits, hence calculating:
  *     (NOM << LSH) / DEN
  * This however means trouble for large NOM, because (NOM << LSH) may no
@@ -154,7 +154,7 @@ extern unsigned long preset_lpj;
  * We want to do realistic conversions of time so we need to use the same
  * values the update wall clock code uses as the jiffies size.  This value
  * is: TICK_NSEC (which is defined in timex.h).  This
- * is a constant and is in nanoseconds.  We will used scaled math
+ * is a constant and is in nanoseconds.  We will use scaled math
  * with a set of scales defined here as SEC_JIFFIE_SC,  USEC_JIFFIE_SC and
  * NSEC_JIFFIE_SC.  Note that these defines contain nothing but
  * constants and so are computed at compile time.  SHIFT_HZ (computed in
@@ -198,7 +198,7 @@ extern unsigned long preset_lpj;
  * operator if the result is a long long AND at least one of the
  * operands is cast to long long (usually just prior to the "*" so as
  * not to confuse it into thinking it really has a 64-bit operand,
- * which, buy the way, it can do, but it take more code and at least 2
+ * which, buy the way, it can do, but it takes more code and at least 2
  * mpys).
 
  * We also need to be aware that one second in nanoseconds is only a
diff --git a/kernel/time.c b/kernel/time.c
index 09d3c45..c25f472 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -266,7 +266,7 @@ EXPORT_SYMBOL(jiffies_to_usecs);
  *
  * This function should be only used for timestamps returned by
  * current_kernel_time() or CURRENT_TIME, not with do_gettimeofday() because
- * it doesn't handle the better resolution of the later.
+ * it doesn't handle the better resolution of the latter.
  */
 struct timespec timespec_trunc(struct timespec t, unsigned gran)
 {
@@ -314,7 +314,7 @@ EXPORT_SYMBOL_GPL(getnstimeofday);
  * This algorithm was first published by Gauss (I think).
  *
  * WARNING: this function will overflow on 2106-02-07 06:28:16 on
- * machines were long is
[PATCH 5/6] time: delete comments that refer to nonexistent symbols
Function do_timer_interrupt_hook() doesn't take argument regs, and
structure hrtimer_sleeper doesn't have member cb_pending. So delete
comments referring to these symbols.

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>
---
 include/asm-x86/mach-voyager/do_timer.h |    1 -
 include/linux/hrtimer.h                 |    1 -
 2 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/include/asm-x86/mach-voyager/do_timer.h b/include/asm-x86/mach-voyager/do_timer.h
index bc2b589..9e5a459 100644
--- a/include/asm-x86/mach-voyager/do_timer.h
+++ b/include/asm-x86/mach-voyager/do_timer.h
@@ -6,7 +6,6 @@
 
 /**
  * do_timer_interrupt_hook - hook into timer tick
- * @regs:	standard registers from interrupt
  *
  * Call the pit clock event handler. see asm/i8253.h
  **/
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 0a23302..d42c6be 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -149,7 +149,6 @@ struct hrtimer_sleeper {
  * @get_time:		function to retrieve the current time of the clock
  * @get_softirq_time:	function to retrieve the current time from the softirq
  * @softirq_time:	the time when running the hrtimer queue in the softirq
- * @cb_pending:		list of timers where the callback is pending
  * @offset:		offset of this clock to the monotonic base
  * @reprogram:		function to reprogram the timer event
  */
-- 
1.5.3.rc7
[PATCH 6/6] tick: add a missing dot in printk
Add a missing '.' in printk information.

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>
---
 kernel/time/tick-oneshot.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/time/tick-oneshot.c b/kernel/time/tick-oneshot.c
index 0258d31..0b5e513 100644
--- a/kernel/time/tick-oneshot.c
+++ b/kernel/time/tick-oneshot.c
@@ -78,7 +78,7 @@ int tick_switch_to_oneshot(void (*handler)(struct clock_event_device *))
 		printk(KERN_INFO "Clockevents: "
 		       "could not switch to one-shot mode:");
 		if (!dev) {
-			printk(" no tick device\n");
+			printk(" no tick device.\n");
 		} else {
 			if (!tick_device_is_functional(dev))
 				printk(" %s is not functional.\n", dev->name);
-- 
1.5.3.rc7
[PATCH 0/6] time: small fixes and code cleanups
These patches make some small fixes and code cleanups. No actual bug is
fixed, though.

Li Zefan
[PATCH 1/6] clocksource: remove redundant code
Flag CLOCK_SOURCE_WATCHDOG is cleared twice. Note that
clocksource_change_rating() won't do anything with the cs flag.

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>
---
 kernel/time/clocksource.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index c8a9d13..0ba9fa8 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -91,7 +91,6 @@ static void clocksource_ratewd(struct clocksource *cs, int64_t delta)
 	       cs->name, delta);
 	cs->flags &= ~(CLOCK_SOURCE_VALID_FOR_HRES | CLOCK_SOURCE_WATCHDOG);
 	clocksource_change_rating(cs, 0);
-	cs->flags &= ~CLOCK_SOURCE_WATCHDOG;
 	list_del(&cs->wd_list);
 }
 
-- 
1.5.3.rc7
[PATCH 2/6] clockevent: simplify list operations
list_for_each_safe() suffices here.

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>
---
 kernel/time/clockevents.c |   11 ++++-------
 1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 822beeb..68fbe73 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -200,6 +200,8 @@ void clockevents_exchange_device(struct clock_event_device *old,
  */
 void clockevents_notify(unsigned long reason, void *arg)
 {
+	struct list_head *node, *tmp;
+
 	spin_lock(&clockevents_lock);
 	clockevents_do_notify(reason, arg);
 
@@ -209,13 +211,8 @@ void clockevents_notify(unsigned long reason, void *arg)
 		 * Unregister the clock event devices which were
 		 * released from the users in the notify chain.
 		 */
-		while (!list_empty(&clockevents_released)) {
-			struct clock_event_device *dev;
-
-			dev = list_entry(clockevents_released.next,
-					 struct clock_event_device, list);
-			list_del(&dev->list);
-		}
+		list_for_each_safe(node, tmp, &clockevents_released)
+			list_del(node);
 		break;
 	default:
 		break;
-- 
1.5.3.rc7
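
To illustrate why the "safe" iterator is enough for emptying a list, here is a small userspace model of the idiom. It re-implements a minimal circular doubly linked list in the style of the kernel's `struct list_head`. Every name carrying a `_sketch` suffix, along with `list_init()`, `list_is_empty()` and `drain_list()`, is hypothetical and written for this example rather than copied from `<linux/list.h>`.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail_sketch(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_sketch(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = NULL;	/* the entry must not be followed after removal */
	e->prev = NULL;
}

static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/*
 * "Safe" iteration: the successor is saved in n before the loop body
 * runs, so the body may unlink pos without breaking the traversal.
 */
#define list_for_each_safe_sketch(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

/* Unlink every entry and report how many were removed. */
static size_t drain_list(struct list_head *head)
{
	struct list_head *node, *tmp;
	size_t removed = 0;

	list_for_each_safe_sketch(node, tmp, head) {
		list_del_sketch(node);
		removed++;
	}
	return removed;
}
```

A plain (non-safe) iterator would read `node->next` after `node` had been unlinked, which is exactly the access that saving the successor up front avoids.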
Re: [BUG] jiffies counter leaps in 2.6.24-rc3
On Sat, 24 Nov 2007 20:31:25 +0100
"Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:

> On Saturday, 24 of November 2007, Stefano Brivio wrote:
> > On Sat, 24 Nov 2007 19:48:58 +0100
> > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> >
> > > NO_HZ? Highres timers?
> >
> > CONFIG_HZ_1000=y
> > # CONFIG_HIGH_RES_TIMERS is not set
> >
> > > I understand that the previous kernels behave correctly. All of them?
> >
> > 2.6.21 behaved correctly. Sorry but git-bisect would take a lot of time
> > (I can't reliably reproduce the jiffies jump), so I would avoid that if
> > not strictly needed.
>
> Well, it would be good to know if 2.6.23 behaves correctly, at least.

Weird, it looks like I can't boot with 2.6.23.9 because of some issues
with dm-crypt (my root filesystem is encrypted). I double-checked the
configuration (which I just took from my current one), well, no way.
Any other test I can do?

In the meanwhile, I noted another thing: sometimes it happens that I
become root and the jiffies counter jumps ahead. Then, when I close any
root session, the jiffies counter jumps back to the correct value.

Please remember that this isn't just an aesthetic issue, as some drivers
(e.g. b43 and b43legacy, but I guess a lot more) rely on jiffies. Do I
need to file a bug to bugzilla.kernel.org?

Thank you.

-- 
Ciao
Stefano
Re: [PATCH] [libata] Set proper ATA UDMA mode for bf548 according to system clock.
Any comment?

Thanks

Sonic

On Nov 27, 2007 12:47 PM, sonic zhang <[EMAIL PROTECTED]> wrote:
> UDMA Mode - Frequency compatibility
>
> UDMA5 - 100 MB/s   - SCLK  = 133 MHz
> UDMA4 - 66 MB/s    - SCLK >=  80 MHz
> UDMA3 - 44.4 MB/s  - SCLK >=  50 MHz
> UDMA2 - 33 MB/s    - SCLK >=  40 MHz
>
>
> Signed-off-by: Sonic Zhang <[EMAIL PROTECTED]>
> ---
>  drivers/ata/pata_bf54x.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/ata/pata_bf54x.c b/drivers/ata/pata_bf54x.c
> index 81db405..088a41f 100644
> --- a/drivers/ata/pata_bf54x.c
> +++ b/drivers/ata/pata_bf54x.c
> @@ -1489,6 +1489,8 @@ static int __devinit bfin_atapi_probe(st
>  	int board_idx = 0;
>  	struct resource *res;
>  	struct ata_host *host;
> +	unsigned int fsclk = get_sclk();
> +	int udma_mode = 5;
>  	const struct ata_port_info *ppi[] =
>  		{ &bfin_port_info[board_idx], NULL };
>
> @@ -1507,6 +1509,11 @@ static int __devinit bfin_atapi_probe(st
>  	if (res == NULL)
>  		return -EINVAL;
>
> +	while (bfin_port_info[board_idx].udma_mask>0 && udma_fsclk[udma_mode] > fsclk) {
> +		udma_mode--;
> +		bfin_port_info[board_idx].udma_mask >>= 1;
> +	}
> +
>  	/*
>  	 * Now that that's out of the way, wire up the port..
>  	 */
> --
> 1.4.3.4
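
The clamping loop in the quoted patch can be tried out in isolation with the userspace sketch below. The thresholds for modes 2 to 5 are taken from the table in the mail (reading the "= 133 MHz" row as "at least 133 MHz"). The zero entries for modes 0 and 1, the starting mask of 0x3f (modes 0 to 5 allowed) and the names `udma_min_sclk_sketch`, `struct udma_choice` and `clamp_udma()` are assumptions made for illustration; the driver's actual `udma_fsclk[]` table is not shown in the mail.

```c
#include <assert.h>

/*
 * Minimum SCLK in Hz per UDMA mode, indexed by mode number.
 * Modes 2..5 follow the table in the mail; modes 0 and 1 are assumed
 * to have no SCLK requirement (placeholder zeros).
 */
static const unsigned int udma_min_sclk_sketch[6] = {
	0, 0, 40000000, 50000000, 80000000, 133000000
};

struct udma_choice {
	int mode;		/* highest UDMA mode still allowed */
	unsigned int mask;	/* bit n set means UDMA mode n is allowed */
};

/*
 * Start from UDMA5 and drop one mode at a time while the system clock
 * is below that mode's threshold, shifting the mask right in step so
 * the top remaining bit always corresponds to the chosen mode.
 */
static struct udma_choice clamp_udma(unsigned int sclk_hz)
{
	struct udma_choice c = { 5, 0x3f };

	while (c.mask > 0 && udma_min_sclk_sketch[c.mode] > sclk_hz) {
		c.mode--;
		c.mask >>= 1;
	}
	return c;
}
```

For instance, under these assumptions a 100 MHz system clock drops exactly one step, to UDMA4 with mask 0x1f, while 45 MHz falls through to UDMA2 with mask 0x07.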
Re: [BUG] 2.6.24-rc3-git2 softlockup detected
Andrew Morton wrote: > On Wed, 28 Nov 2007 12:47:19 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote: > >> Andrew Morton wrote: >>> On Wed, 28 Nov 2007 11:59:00 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> >>> wrote: >>> Hi, >>> (cc linux-scsi, for sym53c8xx) >>> Soft lockup is detected while bootup with 2.6.24-rc3-git2 on powerbox >>> I assume this is a post-2.6.23 regression? >>> BUG: soft lockup - CPU#1 stuck for 11s! [insmod:375] NIP: c002f02c LR: d01414fc CTR: c002f018 REGS: c0077cbef0b0 TRAP: 0901 Not tainted (2.6.24-rc3-git2-autotest) MSR: 80009032 CR: 24022088 XER: TASK = c0077cbd8000[375] 'insmod' THREAD: c0077cbec000 CPU: 1 GPR00: d01414fc c0077cbef330 c052b930 d80080002014 GPR04: d8008000202c c0077ca1cb00 d014ce54 GPR08: c0077ca1c63c 002a c002f018 GPR12: d0143610 c0473d00 NIP [c002f02c] .ioread8+0x14/0x60 LR [d01414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx] Call Trace: [c0077cbef330] [c0077cbef3c0] 0xc0077cbef3c0 (unreliable) [c0077cbef3a0] [d01414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx] [c0077cbef470] [d01395f8] .sym2_probe+0x700/0x99c [sym53c8xx] [c0077cbef710] [c01bc118] .pci_device_probe+0x124/0x1b0 [c0077cbef7b0] [c0221138] .driver_probe_device+0x144/0x20c [c0077cbef850] [c0221450] .__driver_attach+0xcc/0x154 [c0077cbef8e0] [c021ff94] .bus_for_each_dev+0x7c/0xd4 [c0077cbef9a0] [c0220e9c] .driver_attach+0x28/0x40 [c0077cbefa20] [c02204d8] .bus_add_driver+0x90/0x228 [c0077cbefac0] [c0221858] .driver_register+0x94/0xb0 [c0077cbefb40] [c01bc430] .__pci_register_driver+0x6c/0xcc [c0077cbefbe0] [d0143428] .sym2_init+0x108/0x15b0 [sym53c8xx] [c0077cbefc80] [c008ce80] .sys_init_module+0x17c4/0x1958 [c0077cbefe30] [c000872c] syscall_exit+0x0/0x40 Instruction dump: 6000 786b0420 38210070 7d635b78 e8010010 7c0803a6 4e800020 7c0802a6 f8010010 f821ff91 7c0004ac 8923 <0c09> 4c00012c 79290620 2f8900ff >>> I see no obvious lockup sites near the end of sym_hcb_attach(). Maybe it's >>> being called lots of times from a higher level.. 
Do the traces all look >>> the same? >> Hi Andrew, >> >> I see this call trace twice and both looks similar and on another reboot >> the following trace is seen twice in different cpu >> >> BUG: soft lockup detected on CPU#3! >> Call Trace: >> [C0003FEDEDA0] [C0010220] .show_stack+0x68/0x1b0 (unreliable) >> [C0003FEDEE40] [C00A061C] .softlockup_tick+0xf0/0x13c >> [C0003FEDEEF0] [C0072E2C] .run_local_timers+0x1c/0x30 >> [C0003FEDEF70] [C0022FA0] .timer_interrupt+0xa8/0x488 >> [C0003FEDF050] [C00034EC] decrementer_common+0xec/0x100 >> --- Exception: 901 at .ioread8+0x14/0x60 >> LR = .sym_hcb_attach+0x1194/0x1384 [sym53c8xx] >> [C0003FEDF340] [D02B3BC0] 0xd02b3bc0 (unreliable) >> [C0003FEDF3B0] [D029A3C0] .sym_hcb_attach+0x1194/0x1384 >> [sym53c8xx] >> [C0003FEDF480] [D0291D30] .sym2_probe+0x75c/0x9f8 [sym53c8xx] >> [C0003FEDF710] [C01B65A4] .pci_device_probe+0x13c/0x1dc >> [C0003FEDF7D0] [C0219A0C] .driver_probe_device+0xa0/0x15c >> [C0003FEDF870] [C0219C64] .__driver_attach+0xb4/0x138 >> [C0003FEDF900] [C021913C] .bus_for_each_dev+0x7c/0xd4 >> [C0003FEDF9C0] [C02198B0] .driver_attach+0x28/0x40 >> [C0003FEDFA40] [C0218BA4] .bus_add_driver+0x98/0x18c >> [C0003FEDFAE0] [C021A064] .driver_register+0xa8/0xc4 >> [C0003FEDFB60] [C01B68AC] .__pci_register_driver+0x5c/0xa4 >> [C0003FEDFBF0] [D029C204] .sym2_init+0x104/0x1550 [sym53c8xx] >> [C0003FEDFC90] [C008D1F4] .sys_init_module+0x1764/0x1998 >> [C0003FEDFE30] [C000869C] syscall_exit+0x0/0x40 >> > > hm, odd. > > Can you look up sym_hcb_attach+0x1194/0x1384 in gdb? Something like > Hi Andrew, I tried with 2.6.24-rc3-git3 and got the following trace BUG: soft lockup - CPU#2 stuck for 11s! 
[insmod:375] NIP: c002f02c LR: d01414fc CTR: c002f018 REGS: c0077ca3b0b0 TRAP: 0901 Not tainted (2.6.24-rc3-git3-autokern1) MSR: 80009032 CR: 24022088 XER: TASK = c0077cc58000[375] 'insmod' THREAD: c0077ca38000 CPU: 2 GPR00: d01414fc c0077ca3b330 c052b880 d80080002014 GPR04: d8008000202c c0077c82eb00 d014ce54 GPR08: c0077c82e63c 002a c002f018 GPR12: d0143610 c0473f80 NIP [c002f02c]
Re: [patch 1/1] Writeback fix for concurrent large and small file writes
Thank you. Integrated the fixes in my patch.

On Nov 28, 2007 6:13 PM, Frans Pop <[EMAIL PROTECTED]> wrote:
> Two typos in comments.
>
> Cheers,
> FJP
>
> Michael Rubin wrote:
> > + * The flush tree organizes the dirtied_when keys with the rb_tree. Any
> > + * inodes with a duplicate dirtied_when value are link listed together. This
> > + * link list is sorted by the inode's i_flushed_when. When both the
> > + * dirited_when and the i_flushed_when are indentical the order in the
> > + * linked list determines the order we flush the inodes.
>
> s/dirited_when/dirtied_when/
>
> > + * Here is where we interate to find the next inode to process. The
> > + * strategy is to first look for any other inodes with the same dirtied_when
> > + * value. If we have already processed that node then we need to find
> > + * the next highest dirtied_when value in the tree.
>
> s/interate/iterate/
get block_device from name
Please CC: me on any replies, as I am not subscribed to the list.

I want to do some bio IO on a block device (*no* filesystem involved).
First I need to get hold of the struct block_device.

What is the current recommended way to get from the name of a device
(e.g. "/dev/sda1") to the corresponding struct block_device?

What is the current canonical representation of a device - dev_t?
Did kdev_t go away?

Given the struct block_device, can I immediately use it in a bio, or do
I need to prepare the device first?

I think I understand how to submit a bio, once I have the block_device
ready.

I have read that the handling of block_devices is not as straightforward
as it could be... ( http://lwn.net/Articles/247072/ )

Any suggestions would be appreciated.

Again, please CC: me on any replies.

Tim
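On the dev_t part of the question above, the in-kernel dev_t of 2.6 kernels packs a 12-bit major and a 20-bit minor into 32 bits (MAJOR(), MINOR() and MKDEV() in include/linux/kdev_t.h). The following is a userspace model of that packing only, with made-up model_* names; it does not show how to obtain a struct block_device. For that, helpers such as lookup_bdev() and open_bdev_excl() in fs/block_dev.c of kernels from that period look like the place to start, but that pointer is from memory and should be checked against the tree in use.

```c
#include <stdint.h>

/*
 * Userspace model of the kernel-internal dev_t layout:
 * upper 12 bits are the major number, lower 20 bits the minor.
 * All model_* names are invented for this sketch.
 */
#define MODEL_MINORBITS 20
#define MODEL_MINORMASK ((1U << MODEL_MINORBITS) - 1)

typedef uint32_t model_dev_t;

/* Counterpart of MKDEV(major, minor). */
static model_dev_t model_mkdev(unsigned int major, unsigned int minor)
{
    return ((model_dev_t)major << MODEL_MINORBITS) |
           (minor & MODEL_MINORMASK);
}

/* Counterpart of MAJOR(dev). */
static unsigned int model_major(model_dev_t dev)
{
    return dev >> MODEL_MINORBITS;
}

/* Counterpart of MINOR(dev). */
static unsigned int model_minor(model_dev_t dev)
{
    return dev & MODEL_MINORMASK;
}
```

For /dev/sda1, which conventionally has major 8 and minor 1, model_mkdev(8, 1) gives 0x800001.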
Re: Question regarding mutex locking
On Wed, Nov 28, 2007 at 03:33:12PM -0800, Stephen Hemminger wrote:
...
> WTF are you teaching a lesson on how NOT to do locking?
>
> Any code which has this kind of convoluted dependency on conditional
> locking is fundamentally broken.
>

As a matter of fact, I had been thinking about sending one more Re: to
myself to point out that all this is a good example of how problematic
such a solution would be, but I decided it was rather apparent.

IMHO learning needs bad examples too, to better understand why they
should be avoided.

On the other hand, I've seen quite a lot of fundamentally right but
practically broken code, so I'm not sure which is better.

And, btw., I guess this 'fundamentally broken' type of locking could be
found in the kernel too, but I'd prefer not to go looking for it now.

Thanks,
Jarek P.
Re: [RFC PATCH] LTTng instrumentation mm (using page_to_pfn)
On Wed, 2007-11-28 at 21:34 -0500, Mathieu Desnoyers wrote:
> Before I start digging deeper in checking whether it is already
> instrumented by the fs instrumentation (and would therefore be
> redundant), is there a particular data structure from mm/ that you
> suggest taking the swap file number and location in swap from ?

page_private() at this point stores a swp_entry_t. There are swp_type()
and swp_offset() helpers to decode the two bits you need after you've
turned page_private() into a swp_entry_t. See how get_swap_bio()
creates a temporary swp_entry_t from the page_private() passed into it,
then uses swp_type/offset() on it?

I don't know if there is some history behind it, but it doesn't make a
whole ton of sense to me to be passing page_private(page) into
get_swap_bio() (which happens from its only two call sites). It just
kinda obfuscates where 'index' came from.

I think we probably could just be doing

    swp_entry_t entry = { .val = page_private(page), };

in get_swap_bio() and not passing page_private(). We have the page in
there already, so we don't need to pass a derived value like
page_private(). At the least, it'll save some clutter in the function
declaration.

Or, make a helper:

    static swp_entry_t page_swp_entry(struct page *page)
    {
            swp_entry_t entry;
            VM_BUG_ON(!PageSwapCache(page));
            entry.val = page_private(page);
            return entry;
    }

I see at least 4 call sites that could use this. The
try_to_unmap_one() caller would trip over the debug check, so you'd
have to move the call inside of the if(PageSwapCache(page)) statement.

-- Dave
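The helper Dave proposes above, together with the type/offset decoding it feeds, can be modelled in userspace roughly as follows. This is a sketch, not kernel code: struct model_page, its fields and every MODEL_ or model_ name are invented for the example, the 5-bit type field is an assumption about kernels of that period, and assert() stands in for VM_BUG_ON().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

typedef struct { unsigned long val; } swp_entry_t;

/* Assumed layout: swap type in the top 5 bits, offset in the rest. */
#define MODEL_SWAPFILES_SHIFT 5
#define MODEL_TYPE_SHIFT \
    (sizeof(unsigned long) * CHAR_BIT - MODEL_SWAPFILES_SHIFT)
#define MODEL_OFFSET_MASK ((1UL << MODEL_TYPE_SHIFT) - 1)

/* Hypothetical stand-in for struct page: only what the helper needs. */
struct model_page {
    bool swap_cache;     /* models PageSwapCache(page) */
    unsigned long priv;  /* models page_private(page) */
};

/* Pack a swap type and offset into one entry. */
static swp_entry_t model_swp_entry(unsigned long type, unsigned long offset)
{
    swp_entry_t e;

    e.val = (type << MODEL_TYPE_SHIFT) | (offset & MODEL_OFFSET_MASK);
    return e;
}

/* Which swap area the entry refers to. */
static unsigned long model_swp_type(swp_entry_t e)
{
    return e.val >> MODEL_TYPE_SHIFT;
}

/* Location within that swap area. */
static unsigned long model_swp_offset(swp_entry_t e)
{
    return e.val & MODEL_OFFSET_MASK;
}

/* The proposed helper: derive the entry from the page itself. */
static swp_entry_t page_swp_entry(const struct model_page *page)
{
    swp_entry_t entry;

    assert(page->swap_cache);  /* stands in for VM_BUG_ON(!PageSwapCache(page)) */
    entry.val = page->priv;
    return entry;
}
```

A tracepoint would then call page_swp_entry() once and pass the entry to the two decoders, instead of carrying page_private(page) around as a bare integer.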
Re: nommu: Add new vmalloc_user() and remap_vmalloc_range() interfaces.
Paul Mundt wrote:
This builds on top of the earlier vmalloc_32_user() work introduced by
b50731732f926d6c49fd0724616a7344c31cd5cf, as we now have places in the
nommu allmodconfig that hit up against these missing APIs.

As vmalloc_32_user() is already implemented, this is moved over to
vmalloc_user() and simply made a wrapper. As all current nommu platforms
are 32-bit addressable, there's no special casing we have to do for
ZONE_DMA and things of that nature as per GFP_VMALLOC32.

remap_vmalloc_range() needs to check VM_USERMAP in order to figure out
whether we permit the remap or not, which means that we also have to
rework the vmalloc_user() code to grovel for the VMA and set the flag.

Signed-off-by: Paul Mundt <[EMAIL PROTECTED]>

Acked-by: Greg Ungerer <[EMAIL PROTECTED]>

 mm/nommu.c |   45 -
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/mm/nommu.c b/mm/nommu.c
index 35622c5..c4768d0 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -10,6 +10,7 @@
  *  Copyright (c) 2000-2003 David McCullough <[EMAIL PROTECTED]>
  *  Copyright (c) 2000-2001 D Jeff Dionne <[EMAIL PROTECTED]>
  *  Copyright (c) 2002      Greg Ungerer <[EMAIL PROTECTED]>
+ *  Copyright (c) 2007      Paul Mundt <[EMAIL PROTECTED]>
  */

 #include
@@ -183,6 +184,26 @@ void *__vmalloc(unsigned long size, gfp_t gfp_mask, pgprot_t prot)
 }
 EXPORT_SYMBOL(__vmalloc);

+void *vmalloc_user(unsigned long size)
+{
+        void *ret;
+
+        ret = __vmalloc(size, GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO,
+                        PAGE_KERNEL);
+        if (ret) {
+                struct vm_area_struct *vma;
+
+                down_write(&current->mm->mmap_sem);
+                vma = find_vma(current->mm, (unsigned long)ret);
+                if (vma)
+                        vma->vm_flags |= VM_USERMAP;
+                up_write(&current->mm->mmap_sem);
+        }
+
+        return ret;
+}
+EXPORT_SYMBOL(vmalloc_user);
+
 struct page * vmalloc_to_page(void *addr)
 {
         return virt_to_page(addr);
@@ -253,10 +274,17 @@ EXPORT_SYMBOL(vmalloc_32);
  *
  * The resulting memory area is 32bit addressable and zeroed so it can be
  * mapped to userspace without leaking data.
+ *
+ * VM_USERMAP is set on the corresponding VMA so that subsequent calls to
+ * remap_vmalloc_range() are permissible.
  */
 void *vmalloc_32_user(unsigned long size)
 {
-        return __vmalloc(size, GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL);
+        /*
+         * We'll have to sort out the ZONE_DMA bits for 64-bit,
+         * but for now this can simply use vmalloc_user() directly.
+         */
+        return vmalloc_user(size);
 }
 EXPORT_SYMBOL(vmalloc_32_user);

@@ -1213,6 +1241,21 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long from,
 }
 EXPORT_SYMBOL(remap_pfn_range);

+int remap_vmalloc_range(struct vm_area_struct *vma, void *addr,
+                        unsigned long pgoff)
+{
+        unsigned int size = vma->vm_end - vma->vm_start;
+
+        if (!(vma->vm_flags & VM_USERMAP))
+                return -EINVAL;
+
+        vma->vm_start = (unsigned long)(addr + (pgoff << PAGE_SHIFT));
+        vma->vm_end = vma->vm_start + size;
+
+        return 0;
+}
+EXPORT_SYMBOL(remap_vmalloc_range);
+
 void swap_unplug_io_fn(struct backing_dev_info *bdi, struct page *page)
 {
 }

--
Greg Ungerer  --  Chief Software Dude       EMAIL:     [EMAIL PROTECTED]
Secure Computing Corporation                PHONE:       +61 7 3435 2888
825 Stanley St,                             FAX:         +61 7 3891 3630
Woolloongabba, QLD, 4102, Australia         WEB: http://www.SnapGear.com
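The address arithmetic in remap_vmalloc_range() from the patch above can be tried out in userspace. The following is only a sketch: struct model_vma, MODEL_PAGE_SHIFT (4 KiB pages are assumed), MODEL_VM_USERMAP and MODEL_EINVAL are stand-ins invented for the example and are not the kernel definitions. The size is also kept in a pointer-sized integer here, whereas the patch uses unsigned int.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12      /* assumed: 4 KiB pages */
#define MODEL_VM_USERMAP 0x1UL   /* placeholder flag bit */
#define MODEL_EINVAL     22      /* placeholder errno value */

/* Hypothetical stand-in for struct vm_area_struct. */
struct model_vma {
    uintptr_t vm_start;
    uintptr_t vm_end;
    unsigned long vm_flags;
};

/*
 * On nommu there are no page tables to fill in, so "remapping" means
 * moving the VMA so that it covers the allocated memory directly,
 * keeping its length and skipping pgoff pages from the start.
 */
static int model_remap_vmalloc_range(struct model_vma *vma, uintptr_t addr,
                                     unsigned long pgoff)
{
    uintptr_t size = vma->vm_end - vma->vm_start;

    /* Only areas marked as user-mappable may be remapped. */
    if (!(vma->vm_flags & MODEL_VM_USERMAP))
        return -MODEL_EINVAL;

    vma->vm_start = addr + ((uintptr_t)pgoff << MODEL_PAGE_SHIFT);
    vma->vm_end = vma->vm_start + size;
    return 0;
}
```

An 8 KiB VMA remapped onto an allocation at 0x100000 with pgoff 2 would end up covering 0x102000 to 0x104000, and a VMA without the flag is left untouched.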
Re: [RFC] New kobject/kset/ktype documentation and example code
On Wed, Nov 28, 2007 at 02:03:28PM -0500, Alan Stern wrote:
> On Tue, 27 Nov 2007, Greg KH wrote:
>
> > Part of the difficulty in understanding the driver model - and the kobject
> > abstraction upon which it is built - is that there is no obvious starting
> > place. Dealing with kobjects requires understanding a few different types,
> > all of which make reference to each other. In an attempt to make things
> > easier, we'll take a multi-pass approach, starting with vague terms and
> > adding detail as we go. To that end, here are some quick definitions of
> > some terms we will be working with.
> >
> >  - A kobject is an object of type struct kobject. Kobjects have a name
> >    and a reference count. A kobject also has a parent pointer (allowing
> >    objects to be arranged into hierarchies), a specific type, and,
> >    usually, a representation in the sysfs virtual filesystem.
>
> As Cornelia said, it would be worthwhile mentioning krefs in this
> document as well. They are simple enough to explain, after all.

Now added, thanks.

> > Initialization of kobjects
> >
> > Code which creates a kobject must, of course, initialize that object. Some
> > of the internal fields are setup with a (mandatory) call to kobject_init():
>
> kobject_init() isn't mandatory if you use kobject_register(). But then
> Kay wants to do away with kobject_register()...
>
> > The other kobject fields which should be set, directly or indirectly, by
> > the creator are its ktype, kset, and parent. We will get to those shortly,
> > however please note that the ktype and kset must be set before the
> > kobject_init() function is called.
>
> In fact kset, ktype, and parent are optional, right? You might mention
> at this point that not all those fields are needed, and explain later
> which combinations are legal.

They are optional, but if you want to do anything, you need to set them
:)

> > When a reference is released, the call to kobject_put() will decrement the
> > reference count and, possibly, free the object. Note that kobject_init()
> > sets the reference count to one, so the code which sets up the kobject will
> > need to do a kobject_put() eventually to release that reference.
>
> It's worth mentioning here (and perhaps elsewhere too) that all of the
> function calls described here can sleep and hence must be made in
> process context, with the exception of the *_get() routines. It's
> possible to call *_put() in atomic context; the SCSI core does this
> (with device_put, not kobject_put) and has to jump through hoops to run
> the corresponding release routine in a waitqueue task. In general,
> though, it isn't safe.

Is this really needed? If anyone calls them from non-process context,
they will get a nasty run-time warning, right?

> > Because kobjects are dynamic, they must not be declared statically or on
> > the stack, but instead, always from the heap. Future versions of the
> > kernel will contain a run-time check for kobjects that are created
> > statically and will warn the developer of this improper usage.
>
> Why not? What's wrong with static kobjects? I've never understood this.

They are reference counted. Other portions of the kernel can grab them
and think they are safe to use. If you do this with a static object,
what happens when the code goes away?

Most of the nasty race conditions that require this are now cleaned up
with Tejun's great sysfs work, so you will probably not see problems if
you do this, but in general, it's not a good thing to do.

> > ktypes and release methods
> >
> > One important thing still missing from the discussion is what happens to a
> > kobject when its reference count reaches zero. The code which created the
> > kobject generally does not know when that will happen; if it did, there
> > would be little point in using a kobject in the first place. Even
> > predicatable object lifecycles become more complicated when sysfs is
>
> predictable

thanks.

> > One important point cannot be overstated: every kobject must have a
> > release() method, and the kobject must persist (in a consistent state)
> > until that method is called. If these constraints are not met, the code is
> > flawed. Note that the kernel will warn you if you forget to provide a
> > release() method. Do not try to get rid of this warning by providing an
> > "empty" release function, you will be mocked merciously by the kobject
> > maintainer if you attempt this.
>
> Not to mention that doing this will leak memory. Unless the kobject
> is static...

heh.

I think your other questions are already answered, right?

thanks,

greg k-h
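The lifetime rules discussed in this message (the count starts at one, every get needs a matching put, and the memory may only go away from the release() method) can be shown with a small userspace model. Everything named model_* below is invented for the sketch and is not the kernel API; the counter is a plain int where the kernel uses an atomic kref, and foo_released exists only so the behaviour can be observed.

```c
#include <stddef.h>
#include <stdlib.h>

struct model_kobj;

/* Models a ktype: the only thing kept here is the release method. */
struct model_ktype {
    void (*release)(struct model_kobj *kobj);
};

/* Models a kobject: a reference count plus a pointer to its type. */
struct model_kobj {
    int refcount;  /* plain int here; the kernel uses an atomic kref */
    const struct model_ktype *ktype;
};

/* Initialization leaves the creator holding one reference. */
static void model_kobj_init(struct model_kobj *k, const struct model_ktype *t)
{
    k->refcount = 1;
    k->ktype = t;
}

static struct model_kobj *model_kobj_get(struct model_kobj *k)
{
    k->refcount++;
    return k;
}

/* Dropping the last reference hands the object to release(). */
static void model_kobj_put(struct model_kobj *k)
{
    if (--k->refcount == 0)
        k->ktype->release(k);
}

/* An object that embeds the model kobject, as drivers embed kobjects. */
struct foo_obj {
    int value;
    struct model_kobj kobj;
};

static int foo_released;  /* counts release() calls, for demonstration */

/* Recover the containing object and free it: the only place that frees. */
static void foo_release(struct model_kobj *k)
{
    struct foo_obj *foo =
        (struct foo_obj *)((char *)k - offsetof(struct foo_obj, kobj));

    foo_released++;
    free(foo);
}

static const struct model_ktype foo_ktype = { .release = foo_release };

/* Heap allocation, so the object can outlive the code path that made it. */
static struct foo_obj *foo_create(int value)
{
    struct foo_obj *foo = calloc(1, sizeof(*foo));

    if (!foo)
        return NULL;
    foo->value = value;
    model_kobj_init(&foo->kobj, &foo_ktype);
    return foo;
}
```

In this model an empty release function would simply leak the allocation, and a statically allocated foo_obj could not safely be passed to free(), which is roughly the trade-off the thread is arguing about.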
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
Second portion. Add a new seg_offset macro to calculate the offset. This can be avoided if the linker relocates the per cpu area to zero. Includes a patch to read trickle count via both methods to verify that it actually works. Both patches on top of the per cpu cleanup patches that I sent today too. x86_64: Make the x86_32 percpu operations usable on x86_64 Calculate the offset relative to gs in order to be able to address per cpu data using the x86_64 per cpu macros. The subtraction of __per_cpu_start will make the offset based from the beginning of the per cpu area. That is where gs points to. Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> --- drivers/char/random.c|2 +- include/asm-x86/percpu.h | 29 ++--- init/main.c |5 + 3 files changed, 24 insertions(+), 12 deletions(-) Index: linux-2.6.24-rc3-mm2/include/asm-x86/percpu.h === --- linux-2.6.24-rc3-mm2.orig/include/asm-x86/percpu.h 2007-11-28 17:50:01.861182410 -0800 +++ linux-2.6.24-rc3-mm2/include/asm-x86/percpu.h 2007-11-28 21:22:50.845872906 -0800 @@ -16,7 +16,13 @@ #define __my_cpu_offset read_pda(data_offset) #define per_cpu_offset(x) (__per_cpu_offset(x)) +#define __percpu_seg "%%gs:" +/* Calculate the offset to use with the segment register */ +#define seg_offset(name) (*SHIFT_PTR(_cpu_var(name), - (unsigned long)__per_cpu_start)) +#else +#define __percpu_seg "" +#define seg_offset(name) per_cpu_var(name) #endif #include @@ -64,16 +70,11 @@ DECLARE_PER_CPU(struct x8664_pda, pda); *PER_CPU(cpu_gdt_descr, %ebx) */ #ifdef CONFIG_SMP - #define __my_cpu_offset x86_read_percpu(this_cpu_off) - /* fs segment starts at (positive) offset == __per_cpu_offset[cpu] */ #define __percpu_seg "%%fs:" - #else /* !SMP */ - #define __percpu_seg "" - #endif /* SMP */ #include @@ -81,6 +82,13 @@ DECLARE_PER_CPU(struct x8664_pda, pda); /* We can use this directly for local CPU (faster). 
*/ DECLARE_PER_CPU(unsigned long, this_cpu_off); +#define seg_offset(name) per_cpu_var(name) + +#endif /* __ASSEMBLY__ */ +#endif /* !CONFIG_X86_64 */ + +#ifndef __ASSEMBLY__ + /* For arch-specific code, we can use direct single-insn ops (they * don't give an lvalue though). */ extern void __bad_percpu_size(void); @@ -132,11 +140,10 @@ extern void __bad_percpu_size(void); } \ ret__; }) -#define x86_read_percpu(var) percpu_from_op("mov", per_cpu__##var) -#define x86_write_percpu(var,val) percpu_to_op("mov", per_cpu__##var, val) -#define x86_add_percpu(var,val) percpu_to_op("add", per_cpu__##var, val) -#define x86_sub_percpu(var,val) percpu_to_op("sub", per_cpu__##var, val) -#define x86_or_percpu(var,val) percpu_to_op("or", per_cpu__##var, val) +#define x86_read_percpu(var) percpu_from_op("mov", seg_offset(var)) +#define x86_write_percpu(var,val) percpu_to_op("mov", seg_offset(var), val) +#define x86_add_percpu(var,val) percpu_to_op("add", seg_offset(var), val) +#define x86_sub_percpu(var,val) percpu_to_op("sub", seg_offset(var), val) +#define x86_or_percpu(var,val) percpu_to_op("or", seg_offset(var), val) #endif /* !__ASSEMBLY__ */ -#endif /* !CONFIG_X86_64 */ #endif /* _ASM_X86_PERCPU_H_ */ Index: linux-2.6.24-rc3-mm2/drivers/char/random.c === --- linux-2.6.24-rc3-mm2.orig/drivers/char/random.c 2007-11-28 21:20:58.225804398 -0800 +++ linux-2.6.24-rc3-mm2/drivers/char/random.c 2007-11-28 21:28:38.967363573 -0800 @@ -272,7 +272,7 @@ static int random_write_wakeup_thresh = static int trickle_thresh __read_mostly = INPUT_POOL_WORDS * 28; -static DEFINE_PER_CPU(int, trickle_count) = 0; +DEFINE_PER_CPU(int, trickle_count) = 55; /* * A pool of size .poolwords is stirred with a primitive polynomial Index: linux-2.6.24-rc3-mm2/init/main.c === --- linux-2.6.24-rc3-mm2.orig/init/main.c 2007-11-28 21:10:54.245804225 -0800 +++ linux-2.6.24-rc3-mm2/init/main.c2007-11-28 21:22:17.769053628 -0800 @@ -504,6 +504,8 @@ void __init __attribute__((weak)) smp_se { } 
+DECLARE_PER_CPU(int, trickle_count); + asmlinkage void __init start_kernel(void) { char * command_line; @@ -645,6 +647,9 @@ asmlinkage void __init start_kernel(void acpi_early_init(); /* before LAPIC and SMP init */ + printk("Reading trickle cound =%lu. Is %lu\n", + x86_read_percpu(trickle_count), + __raw_get_cpu_var(trickle_count)); /* Do the rest non-__init'ed, we're now alive */ rest_init(); } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
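The offset calculation behind seg_offset() in the patch above (address of the variable minus __per_cpu_start, applied to a base that points at the current CPU's area) can be modelled in userspace. This is a sketch under assumptions: a struct stands in for the linker-built per cpu section, a plain pointer stands in for the gs base, and all model_* names are invented. The initial value 55 mirrors the test value the patch gives trickle_count.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_NR_CPUS 2

/* Stand-in for the per cpu section: the pda first, then other variables. */
struct model_percpu_template {
    long pda_stub;
    int trickle_count;
};

/* The initialised template, i.e. what starts at __per_cpu_start. */
static struct model_percpu_template model_template = { 0, 55 };

/* One private copy of the template per CPU. */
static struct model_percpu_template model_area[MODEL_NR_CPUS];

/* Copy the template into every CPU's area, as setup_per_cpu_areas() does. */
static void model_setup_per_cpu_areas(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        memcpy(&model_area[cpu], &model_template, sizeof(model_template));
}

/* Offset of a template variable from the start of the template. */
static size_t model_seg_offset(const void *var)
{
    return (size_t)((const char *)var - (const char *)&model_template);
}

/* Read an int at base + off; base plays the role of the gs segment base. */
static int model_read_int(const void *base, size_t off)
{
    int v;

    memcpy(&v, (const char *)base + off, sizeof(v));
    return v;
}
```

Because the base already points at the start of the CPU's own copy, the same offset reaches that CPU's instance of the variable without looking up a per-CPU offset table first.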
Re: [RFC] Sample kset/ktype/kobject implementation
On Wed, Nov 28, 2007 at 05:35:32PM +0100, Cornelia Huck wrote:
> On Tue, 27 Nov 2007 15:04:06 -0800,
> Greg KH <[EMAIL PROTECTED]> wrote:
>
> > static struct foo_obj *create_foo_obj(const char *name)
> > {
> >         struct foo_obj *foo;
> >         int retval;
> >
> >         /* allocate the memory for the whole object */
> >         foo = kzalloc(sizeof(*foo), GFP_KERNEL);
> >         if (!foo)
> >                 return NULL;
> >
> >         /* initialize the kobject portion of the object properly */
> >         kobject_set_name(&foo->kobj, "%s", name);
>
> Returncode not checked :)

good catch. Hm, I don't think anyone checks that function :)

> >         foo->kobj.kset = example_kset;
> >         foo->kobj.ktype = &foo_ktype;
> >
> >         /*
> >          * Register the kobject with the kernel, all the default files will
> >          * be created here and the uevent will be sent out. If we were to
> >          * call kobject_init() and then kobject_add() we would be
> >          * responsible for sending out the initial KOBJ_ADD uevent.
> >          */
> >         retval = kobject_register(&foo->kobj);
> >         if (retval) {
> >                 kfree(foo);
>
> kobject_put(foo) is needed since it gets you through kobject_cleanup()
> where the name can be freed.

No, kobject_register() should have handled that for us, right?

thanks,

greg k-h
Re: [RFC] New kobject/kset/ktype documentation and example code
On Wed, Nov 28, 2007 at 12:45:45PM +0100, Cornelia Huck wrote:
> On Tue, 27 Nov 2007 15:02:52 -0800,
> Greg KH <[EMAIL PROTECTED]> wrote:
> >  - A kset can provide a set of default attributes that all kobjects that
> >    belong to it automatically inherit and have created whenever a kobject
> >    is registered belonging to the kset.
>
> Hm, the default attributes are provided by the ktype?

Yes, now fixed.

> > The uevent function will be called when the uevent is about to be sent to
> > userspace to allow more environment variables to be added to the uevent.
>
> It may be helpful to mention which uevents are by default created by
> the kobject core (KOBJ_ADD, KOBJ_DEL, KOBJ_MOVE).

Is this really needed?

> >  - refcount is the kobject's reference count; it is initialized by
> >    kobject_init()
>
> There is no field called "refcount"; the embedded struct kref kref is
> initialized by kobject_init().

now removed, thanks.

> > Often, much of the initialization of a kobject is handled by the layer that
> > manages the containing kset. See the sample/kobject/kset-example.c for how
> > this is usually handled.
>
> Do we also want to mention kobject_rename() and kobject_move(), or are
> those functions so esoteric that most people don't want to know about
> them?

They can be found in the kerneldoc api reference if they are needed :)

thanks,

greg k-h
Re: [RFC] New kobject/kset/ktype documentation and example code
On Wed, Nov 28, 2007 at 06:00:27PM +0100, Kay Sievers wrote: > On Wed, 2007-11-28 at 17:51 +0100, Cornelia Huck wrote: > > On Wed, 28 Nov 2007 17:36:29 +0100, > > Kay Sievers <[EMAIL PROTECTED]> wrote: > > > > > > > > On Wed, 2007-11-28 at 17:12 +0100, Cornelia Huck wrote: > > > > On Wed, 28 Nov 2007 16:57:48 +0100, > > > > Kay Sievers <[EMAIL PROTECTED]> wrote: > > > > > > > > > On Wed, 2007-11-28 at 16:48 +0100, Cornelia Huck wrote: > > > > > > On Wed, 28 Nov 2007 13:23:02 +0100, > > > > > > Kay Sievers <[EMAIL PROTECTED]> wrote: > > > > > > > On Wed, 2007-11-28 at 12:45 +0100, Cornelia Huck wrote: > > > > > > > > On Tue, 27 Nov 2007 15:02:52 -0800, Greg KH <[EMAIL PROTECTED]> > > > > > > > > wrote: > > > > > > > > > > > > > > The uevent function will be called when the uevent is about > > > > > > > > > to be sent to > > > > > > > > > userspace to allow more environment variables to be added to > > > > > > > > > the uevent. > > > > > > > > > > > > > > > > It may be helpful to mention which uevents are by default > > > > > > > > created by > > > > > > > > the kobject core (KOBJ_ADD, KOBJ_DEL, KOBJ_MOVE). > > > > > > > > > > > > > > I think, we should remove all these default events from the > > > > > > > kobject > > > > > > > core. We will not be able to manage the timing issues and "raw" > > > > > > > kobject > > > > > > > users should request the events on their own, when they are > > > > > > > finished > > > > > > > adding stuff to the kobject. I see currently no way to solve the > > > > > > > "attributes created after the event" problem. The new > > > > > > > *_create_and_register functions do not allow default attributes > > > > > > > to be > > > > > > > created, which will just lead to serious trouble when someone > > > > > > > wants to > > > > > > > use udev to set defaults and such things. We may just want to > > > > > > > require an > > > > > > > explicit call to send the event? 
> > > > > > > > > > > > There will always be attributes that will show up later (for > > > > > > example, > > > > > > after a device is activated). Probably the best approach is to keep > > > > > > the > > > > > > default uevents, but have the attribute-adder send another uevent > > > > > > when > > > > > > they are done? > > > > > > > > > > Uh, that's more an exception where we can't give guarantees because of > > > > > very specific hardware setups, and it would be an additional "change" > > > > > event. There are valid cases for this, but only a _very_ few. > > > > > > > > > > There is absolutely no reason not to do it right with the "add" event, > > > > > just because we are too lazy to solve it proper the current code. It's > > > > > just so broken by design, what we are doing today. :) > > > > > > > > I'm worrying a bit about changes that impact the whole code tree in > > > > lots of places. I'd be fine with the device layer doing its uevent > > > > manually in device_add() at the very end, though. (This would allow > > > > drivers to add attributes in their probe function before the uevent, > > > > for example.) > > > > > > I think I still remember what I did 2.5 years ago :) > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e57cd73e2e844a3da25cc6b420674c81bbe1b387 > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=18c3d5271b472c096adfc856e107c79f6fd30d7d > > > > > The driver core does use the split already in most places, I did that > > > long ago. There are not too many (~20) users of kobject_register(), and > > > it's a pretty straight-forward change to change that to _init, _add, > > > _uevent, and get rid of that totally useless "convenience api". > > > > > > I think there is no longer any excuse to keep that broken code around, > > > and even require to document that it's broken. 
The whole purpose of the > > > uevent is userspace consumption, which just doesn't work correctly with > > > the code we offer. The fix is trivial, and should be done now, and we no > > > longer need to fiddle around timing issues, just because we are too > > > lazy. > > > > > > I propose the removal of _all_ funtions that have *register* in their > > > name, and always require the following sequence: > > > _init() > > > _add() > > > _uevent(_ADD) > > > > > > _uevent(_REMOVE) > > > _del() > > > _put() > > > > > > The _create_and_register() functions would become _create_ and_add() > > > and will need an additional _uevent() call after they populated the > > > object. > > > > I'm absolutely fine with doing that at the kobject level (after all, > > it's a quite contained change, and the uevent function explicitely > > works on a kobject). > > > > For the other _register()/_unregister() functions, it's a different > > piece of cake. They are: > > - distributed through lot of different code > > - at a higher level than kobjects, and kobject_uevent() acts on the > > kobject > > - usually encapsulating a sequence that wants to be used by
Re: [RFC] New kobject/kset/ktype documentation and example code
On Wed, Nov 28, 2007 at 01:23:02PM +0100, Kay Sievers wrote:
> On Wed, 2007-11-28 at 12:45 +0100, Cornelia Huck wrote:
> > On Tue, 27 Nov 2007 15:02:52 -0800, Greg KH <[EMAIL PROTECTED]> wrote:
>
> > > A kset serves these functions:
> > >
> > >  - It serves as a bag containing a group of objects. A kset can be used by
> > >    the kernel to track "all block devices" or "all PCI device drivers."
> > >
> > >  - A kset is also a subdirectory in sysfs, where the associated kobjects
> > >    with the kset can show up.
> >
> > Perhaps better wording:
> >
> > A kset is also represented via a subdirectory in sysfs, under which the
> > kobjects associated with the kset can show up.
>
> This draws a misleading picture. A member of a kset shows up where the
> "parent" pointer points to. Like /sys/block is a kset, the kset contains
> disks and partitions, but partitions do not live at the kset, and tons
> of other kset directories where this is the case.
>
> "If the kobject belonging to a kset has no parent kobject set, it will
> be added to the kset's directory. Not all members of a kset do
> necessarily live in the kset directory. If an explicit parent kobject is
> assigned before the kobject is added, the kobject is registered with the
> kset, but added below the parent kobject."

Nice, thanks, I've added this :)

greg k-h
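Kay's wording above amounts to a simple placement rule: an explicit parent wins, otherwise the kset's own directory is used. A userspace sketch of that rule follows, using the /sys/block example from the message. The model_* types and functions are invented for illustration and do not reproduce the kernel's data structures.

```c
#include <stdio.h>
#include <string.h>

struct model_kset;

/* Minimal stand-in for a kobject: a name, a parent and a kset. */
struct model_kobj {
    const char *name;
    struct model_kobj *parent;  /* explicit parent, may be NULL */
    struct model_kset *kset;    /* set membership, may be NULL */
};

/* A kset has a kobject of its own, which is its directory. */
struct model_kset {
    struct model_kobj kobj;
};

/* Directory the object is added below: parent first, else the kset. */
static struct model_kobj *model_sysfs_parent(const struct model_kobj *k)
{
    if (k->parent)
        return k->parent;
    if (k->kset)
        return &k->kset->kobj;
    return NULL;
}

/* Build the path by walking up through the chosen parents. */
static void model_path(const struct model_kobj *k, char *buf, size_t len)
{
    const struct model_kobj *up = model_sysfs_parent(k);

    if (up)
        model_path(up, buf, len);
    else
        snprintf(buf, len, "/sys");
    strncat(buf, "/", len - strlen(buf) - 1);
    strncat(buf, k->name, len - strlen(buf) - 1);
}
```

With a "block" kset, a disk "sda" that has no explicit parent, and a partition "sda1" whose parent is the disk, both objects are members of the same kset but only the disk lands directly in the kset's directory.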
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
Here is the first of two patches for x86_64 that move the pda into the per cpu area and then make the x86 percpu macros work for x86_64. This needs to be generalized for other arches. The __per_cpu_start offsets can be taken care of by the linker. We can also tell the linker to completely relocate the percpu area to 0. X86_64: Declare pda as per cpu data thereby moving it into the cpu area Declare the pda as a per cpu variable. This will have the effect of moving the pda data into the cpu area managed by cpu alloc. The boot_pdas are only needed in head64.c so move the declaration over there and make it static. Remove the code that allocates special pda data structures. The pda is moved to the beginning of the per cpu area. gs is pointing to the pda. And therefore gs: is now pointing to the per cpu area of the current processor. A per cpu variable can then be reached at %gs:[_cpu_ - __per_cpu_start] Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> --- arch/x86/kernel/head64.c |6 ++ arch/x86/kernel/setup64.c | 13 ++--- arch/x86/kernel/smpboot_64.c | 16 include/asm-generic/vmlinux.lds.h |1 + include/asm-x86/pda.h |1 - include/linux/percpu.h|4 6 files changed, 21 insertions(+), 20 deletions(-) Index: linux-2.6.24-rc3-mm2/arch/x86/kernel/setup64.c === --- linux-2.6.24-rc3-mm2.orig/arch/x86/kernel/setup64.c 2007-11-28 20:59:13.124188194 -0800 +++ linux-2.6.24-rc3-mm2/arch/x86/kernel/setup64.c 2007-11-28 21:08:50.473347382 -0800 @@ -30,7 +30,9 @@ cpumask_t cpu_initialized __cpuinitdata struct x8664_pda *_cpu_pda[NR_CPUS] __read_mostly; EXPORT_SYMBOL(_cpu_pda); -struct x8664_pda boot_cpu_pda[NR_CPUS] __cacheline_aligned; + +DEFINE_PER_CPU_FIRST(struct x8664_pda, pda); +EXPORT_PER_CPU_SYMBOL(pda); struct desc_ptr idt_descr = { 256 * 16 - 1, (unsigned long) idt_table }; @@ -109,10 +111,15 @@ void __init setup_per_cpu_areas(void) } if (!ptr) panic("Cannot allocate cpu data for CPU %d\n", i); - cpu_pda(i)->data_offset = ptr - __per_cpu_start; memcpy(ptr, __per_cpu_start, 
__per_cpu_end - __per_cpu_start); + /* Relocate the pda */ + memcpy(ptr, cpu_pda(i), sizeof(struct x8664_pda)); + cpu_pda(i) = (struct x8664_pda *)ptr; + cpu_pda(i)->data_offset = ptr - __per_cpu_start; } -} + /* Fix up pda for this processor */ + pda_init(0); +} void pda_init(int cpu) { Index: linux-2.6.24-rc3-mm2/arch/x86/kernel/smpboot_64.c === --- linux-2.6.24-rc3-mm2.orig/arch/x86/kernel/smpboot_64.c 2007-11-28 20:59:13.136188167 -0800 +++ linux-2.6.24-rc3-mm2/arch/x86/kernel/smpboot_64.c 2007-11-28 20:59:35.399937395 -0800 @@ -556,22 +556,6 @@ static int __cpuinit do_boot_cpu(int cpu return -1; } - /* Allocate node local memory for AP pdas */ - if (cpu_pda(cpu) == _cpu_pda[cpu]) { - struct x8664_pda *newpda, *pda; - int node = cpu_to_node(cpu); - pda = cpu_pda(cpu); - newpda = kmalloc_node(sizeof (struct x8664_pda), GFP_ATOMIC, - node); - if (newpda) { - memcpy(newpda, pda, sizeof (struct x8664_pda)); - cpu_pda(cpu) = newpda; - } else - printk(KERN_ERR - "Could not allocate node local PDA for CPU %d on node %d\n", - cpu, node); - } - alternatives_smp_switch(1); c_idle.idle = get_idle_for_cpu(cpu); Index: linux-2.6.24-rc3-mm2/arch/x86/kernel/head64.c === --- linux-2.6.24-rc3-mm2.orig/arch/x86/kernel/head64.c 2007-11-28 20:59:13.152187359 -0800 +++ linux-2.6.24-rc3-mm2/arch/x86/kernel/head64.c 2007-11-28 20:59:35.403937534 -0800 @@ -22,6 +22,12 @@ #include #include +/* + * Only used before the per cpu areas are setup. 
The use for the non possible + * cpus continues after boot + */ +static struct x8664_pda boot_cpu_pda[NR_CPUS] __cacheline_aligned; + static void __init zap_identity_mappings(void) { pgd_t *pgd = pgd_offset_k(0UL); Index: linux-2.6.24-rc3-mm2/include/asm-x86/pda.h === --- linux-2.6.24-rc3-mm2.orig/include/asm-x86/pda.h 2007-11-28 20:59:13.164187921 -0800 +++ linux-2.6.24-rc3-mm2/include/asm-x86/pda.h 2007-11-28 20:59:35.403937534 -0800 @@ -39,7 +39,6 @@ struct x8664_pda { } cacheline_aligned_in_smp; extern struct x8664_pda *_cpu_pda[]; -extern struct x8664_pda boot_cpu_pda[]; extern void pda_init(int); #define cpu_pda(i)
Re: [RFC] New kobject/kset/ktype documentation and example code
On Wed, Nov 28, 2007 at 10:01:08AM +0100, Cornelia Huck wrote: > On Tue, 27 Nov 2007 15:02:52 -0800, > Greg KH <[EMAIL PROTECTED]> wrote: > > So, for example, UIO code has a structure that defines the memory region > > associated with a uio device: > > > > struct uio_mem { > > struct kobject kobj; > > unsigned long addr; > > unsigned long size; > > int memtype; > > void __iomem *internal_addr; > > }; > > > > If you have a struct uio_mem structure, finding its embedded kobject is > > just a > > matter of using the kobj pointer. > > Pointer may be a confusing term, how about "structure member"? thanks, now fixed. > > Code that works with kobjects will often > > have the opposite problem, however: given a struct kobject pointer, what is > > the pointer to the containing structure? You must avoid tricks (such as > > assuming that the kobject is at the beginning of the structure) and, > > instead, use the container_of() macro, found in : > > > > container_of(pointer, type, member) > > > > where pointer is the pointer to the embedded kobject, type is the type of > > the containing structure, and member is the name of the structure field to > > which pointer points. The return value from container_of() is a pointer to > > the given type. So, for example, a pointer to a struct kobject embedded > > within a struct cdev called "kp" could be converted to a pointer to the > > "struct uio_mem", I guess. yes, now fixed. > > containing structure with: > > > > struct uio_mem *u_mem = container_of(kp, struct uio_mem, kobj); > > > > Programmers will often define a simple macro for "back-casting" kobject > > pointers to the containing type. > > > > > > Initialization of kobjects > > > > Code which creates a kobject must, of course, initialize that object. Some > > of the internal fields are setup with a (mandatory) call to kobject_init(): > > > > void kobject_init(struct kobject *kobj); > > > > Among other things, kobject_init() sets the kobject's reference count to > > one. 
Calling kobject_init() is not sufficient, however. Kobject users > > must, at a minimum, set the name of the kobject; this is the name that will > > be used in sysfs entries. > > Unless they don't register their kobject. (But they should always set a > name anyway to avoid funny debug messages, so it is probably a good > idea to call this a "must"). Yeah, I'll leave this in. > > To set the name of a kobject properly, do not > > attempt to manipulate the internal name field, but instead use: > > > > int kobject_set_name(struct kobject *kobj, const char *format, ...); > > > > This function takes a printk-style variable argument list. Believe it or > > not, it is actually possible for this operation to fail; conscientious code > > should check the return value and react accordingly. > > > > The other kobject fields which should be set, directly or indirectly, by > > the creator are its ktype, kset, and parent. We will get to those shortly, > > however please note that the ktype and kset must be set before the > > kobject_init() function is called. > > > > > > > > Reference counts > > > > One of the key functions of a kobject is to serve as a reference counter > > for the object in which it is embedded. > > Hm, I thought that was the purpose of struct kref? Yes, I'll add a reference to kref now. > > As long as references to the object > > exist, the object (and the code which supports it) must continue to exist. > > The low-level functions for manipulating a kobject's reference counts are: > > > > struct kobject *kobject_get(struct kobject *kobj); > > void kobject_put(struct kobject *kobj); > > > > A successful call to kobject_get() will increment the kobject's reference > > counter and return the pointer to the kobject. If, however, the kobject is > > already in the process of being destroyed, the operation will fail and > > kobject_get() will return NULL. > > Eh, no. We'll always return !NULL if the kobject is !NULL to start > with. 
If the reference count is already 0, the code will moan, but the > caller will still get a pointer. Good point, this was the way things used to work a long time ago, I'll remove this. > > This return value must always be tested, or > > no end of unpleasant race conditions could result. > > > > When a reference is released, the call to kobject_put() will decrement the > > reference count and, possibly, free the object. Note that kobject_init() > > sets the reference count to one, so the code which sets up the kobject will > > need to do a kobject_put() eventually to release that reference. > > > > Because kobjects are dynamic, they must not be declared statically or on > > the stack, but instead, always from the heap. Future versions of the > > kernel will contain a run-time check for kobjects that are created > > statically and will warn the developer of this improper usage. > > > > > > Hooking into sysfs > > > > An initialized
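The container_of() "back-casting" discussed in this thread can be tried out in userspace. The sketch below is an illustration, not the kernel's code: the macro is a plain offsetof() version, struct kobject is reduced to a single field, the __iomem member is dropped, and the kobject is deliberately moved away from the start of struct uio_mem to show why assuming "kobject first" is a trick to avoid. The helper name to_uio_mem() is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Minimal stand-in for struct kobject; the real one has more fields. */
struct kobject {
	const char *name;
};

struct uio_mem {
	unsigned long addr;
	unsigned long size;
	int memtype;
	struct kobject kobj;	/* deliberately not the first member */
};

/* The kind of simple "back-casting" helper the document mentions. */
static struct uio_mem *to_uio_mem(struct kobject *kp)
{
	assert(kp != NULL);
	return container_of(kp, struct uio_mem, kobj);
}
```

Given only a pointer to the embedded kobject, to_uio_mem() recovers the containing structure regardless of where the member sits.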
Re: [RFC] New kobject/kset/ktype documentation and example code
On Tue, Nov 27, 2007 at 08:50:14PM -0700, Jonathan Corbet wrote: > Greg KH <[EMAIL PROTECTED]> wrote: > > > Jonathan, I used your old lwn.net article about kobjects as the basis > > for this document, I hope you don't mind > > Certainly I have no objections, I'm glad it was useful. Thanks, it was a great framework to work with. > > It is rare (even unknown) for kernel code to create a standalone kobject; > > with one major exception explained below. > > You don't keep this promise - bet you thought we wouldn't notice... > Actually I guess you do, in the "creating simple kobjects" section. > When you get to that point, you should mention that this is a situation > where standalone kobjects make sense. Sorry, yes, that is where I tried to explain it. I'll flush it out some more. > Given that there are quite a few standalone kobjects created by this > patch set (kernel_kobj, security_kobj, s390_kobj, etc.), the "(even > unknown)" should probably come out. Ok. > > So, for example, UIO code has a structure that defines the memory region > > associated with a uio device: > > *The* UIO code, presumably. fixed. > > the given type. So, for example, a pointer to a struct kobject embedded > > within a struct cdev called "kp" could be converted to a pointer to the > > containing structure with: > > That should be "struct uio_mem", I think. fixed. > > one. Calling kobject_init() is not sufficient, however. Kobject users > > must, at a minimum, set the name of the kobject; this is the name that will > > be used in sysfs entries. > > Is setting the name mandatory now, or are there still places where > kobjects (which do not appear in sysfs) do have - and do not need - a > name? Any kobject that is registered needs to have a name. If someone tries to call kobject_register() or kobject_add() without a name set they will find out that it is not allowed :) And yes, there are a few places in the kernel with kobjects that are never registered. 
I'm working on trying to get rid of them... > > Because kobjects are dynamic, they must not be declared statically or on > > the stack, but instead, always from the heap. Future versions of the > > "always be allocated from the heap"? thanks. > > "empty" release function, you will be mocked merciously by the kobject > > maintainer if you attempt this. > > So just how should severely should we mock kobject maintainers who can't > spell "mercilessly"? :) Heh, turns out that a lot of people sent me this privately :) > > - A kset can provide a set of default attributes that all kobjects that > >belong to it automatically inherit and have created whenever a kobject > >is registered belonging to the kset. > > Can we try that one again? > > - A kset can provide a set of default attributes for all kobjects which >belong to it. No, it's the ktype that does this, I'll go fix that up... > > There is currently > > no other way to add a kobject to a kset without directly messing with the > > list pointers. > > Presumably the latter way is not recommended; I would either say so or > not mention this possibility at all. Ah, yes, now removed. Thanks for the review, I really appreciate it. greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PPC: CELLEB - fix potential NULL pointer dereference
On 11/29/07, Ishizaki Kou <[EMAIL PROTECTED]> wrote: [...snip...] > > There is no problem to use Michael's part, and I also prefer simple > one like this. > > Cyrill, would you please update your patch? > > Best regards, > Kou Ishizaki > Please see updated patch enveloped. (Can't do it inline becase I'm on my work now where I have no Linux machine) Cyrill --- From: Cyrill Gorcunov <[EMAIL PROTECTED]> Subject: [PATCH] PPC: CELLEB - fix possible NULL pointer dereference This patch adds checking for NULL returned value to prevent possible NULL pointer dereference. Signed-off-by: Cyrill Gorcunov <[EMAIL PROTECTED]> --- arch/powerpc/platforms/celleb/pci.c | 11 --- 1 files changed, 8 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/celleb/pci.c b/arch/powerpc/platforms/celleb/pci.c index 6bc32fd..13ec4a6 100644 --- a/arch/powerpc/platforms/celleb/pci.c +++ b/arch/powerpc/platforms/celleb/pci.c @@ -138,8 +138,6 @@ static void celleb_config_read_fake(unsigned char *config, int where, *val = celleb_fake_config_readl(p); break; } - - return; } static void celleb_config_write_fake(unsigned char *config, int where, @@ -158,7 +156,6 @@ static void celleb_config_write_fake(unsigned char *config, int where, celleb_fake_config_writel(val, p); break; } - return; } static int celleb_fake_pci_read_config(struct pci_bus *bus, @@ -351,6 +348,10 @@ static int __init celleb_setup_fake_pci_device(struct device_node *node, wi1 = of_get_property(node, "vendor-id", NULL); wi2 = of_get_property(node, "class-code", NULL); wi3 = of_get_property(node, "revision-id", NULL); + if (!wi0 || !wi1 || !wi2 || !wi3) { + printk(KERN_ERR "PCI: Missing device tree properties.\n"); + goto error; + } celleb_config_write_fake(*config, PCI_DEVICE_ID, 2, wi0[0] & 0x); celleb_config_write_fake(*config, PCI_VENDOR_ID, 2, wi1[0] & 0x); @@ -372,6 +373,10 @@ static int __init celleb_setup_fake_pci_device(struct device_node *node, celleb_setup_pci_base_addrs(hose, devno, fn, num_base_addr); li = 
of_get_property(node, "interrupts", ); + if (!li) { + printk(KERN_ERR "PCI: interrupts not found.\n"); + goto error; + } val = li[0]; celleb_config_write_fake(*config, PCI_INTERRUPT_PIN, 1, 1); celleb_config_write_fake(*config, PCI_INTERRUPT_LINE, 1, val);
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
Christoph Lameter wrote: > x86_64 can use a 32 bit offset instead of a 64 bit addres because it uses > the small model. A load of a 64 bit address would require much more > expensive instructions. A load of a 64 bit address is currently avoided > through the use of the pda that contains the full 64 bit address in the > data_offset field. Operations on per cpu data on x86_64 must therefore > first load data_offset via gs and then add the per cpu address to this > offset. Then the per cpu operation is performed on that address. > Hm. Certainly a non-one-instruction access would be considerably less useful than one that is, because of preemption issues. (In general you need to pin yourself to a cpu if you're using percpu data, but sometimes it doesn't matter. In particular, the reason I'm interested in this at all is because Xen puts its interrupt mask flag in per-cpu data, and a single instruction means that masking interrupts [=disable preemption] can be done in one instruction with no scope for preemption in the middle doing something unexpected.) > In order to avoid this situation through one instruction we need a small > 32 bit offset relative to gs. Otherwise we cannot get away from the PDA > and the use of data_offset. > Hm, yes, I see. Dratted large address space. What's wrong with 4G anyway? ;) Anyway, I can see the problem with my thinking about this so far. J - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
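The two-step access described above (take the per cpu base, then add the variable's offset from the start of the template area) can be modelled in userspace. This is only a sketch: the template structure, the array of per cpu copies and per_cpu_ptr() are illustrative names, and nothing here uses %gs or the real linker section.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS 2

/* Stands in for the linker section __per_cpu_start..__per_cpu_end. */
struct percpu_template {
	long pda_word;
	int counter;
};

static struct percpu_template per_cpu_start = { 0, 42 };
static struct percpu_template cpu_area[NR_CPUS];

/* Give every cpu its own copy of the template area. */
static void setup_per_cpu_areas(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		memcpy(&cpu_area[cpu], &per_cpu_start, sizeof(per_cpu_start));
}

/* Address of 'var' in the copy for 'cpu': base + (var - start). */
static void *per_cpu_ptr(void *var, int cpu)
{
	ptrdiff_t offset = (char *)var - (char *)&per_cpu_start;

	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(offset >= 0 && (size_t)offset < sizeof(per_cpu_start));
	return (char *)&cpu_area[cpu] + offset;
}
```

In this model the base lookup and the addition are separate steps, which is exactly the window in which a task could be preempted and moved to another cpu; a single instruction relative to a segment base has no such window.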
Re: git guidance
Jakub Narebski wrote: > Al Boldi wrote: > > Johannes Schindelin wrote: > >> By that definition, no SCM, not even CVS, is transparent. Nothing > >> short of unpacked directories of all versions (wasting a lot of disk > >> space) would. > > > > Who said anything about unpacking? > > > > I'm talking about GIT transparently serving a Virtual Version Control > > dir to be mounted on the client. > > Are you talking about something like (in alpha IIRC) gitfs? > > http://www.sfgoth.com/~mitch/linux/gitfs/ This looks like a good start. > Besides, you can always use "git show :". For example > gitweb (and I think other web interfaces) can show any version of a file > or a directory, accessing only repository. Sure, browsing is the easy part, but Version Control starts when things become writable. Thanks for the link! -- Al - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc3-mm2 (bugfix for memory cgroup per-zone-struct allocation.)
On Thu, 29 Nov 2007 12:23:29 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:
> I noticed CONFIG_NUMA + CONFIG_CGROUP_MEM_CONT + CONFIG_SLUB cannot boot
> because of my patch.
> (SLAB is ok.)
> I'll post workaround soon.
>
==
This is a fix, tested on my ia64/NUMA box with both SLAB and SLUB.

This patch fixes kmalloc_node() being called against a node without memory.

It would be better to add a memory hotplug callback to support possible
nodes (memory hotplug), but for now this just uses kmalloc(). It should be
revisited later.

Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

 mm/memcontrol.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

Index: linux-2.6.24-rc3-mm2/mm/memcontrol.c
===================================================================
--- linux-2.6.24-rc3-mm2.orig/mm/memcontrol.c
+++ linux-2.6.24-rc3-mm2/mm/memcontrol.c
@@ -1117,8 +1117,18 @@ static int alloc_mem_cgroup_per_zone_inf
 	struct mem_cgroup_per_node *pn;
 	struct mem_cgroup_per_zone *mz;
 	int zone;
-
-	pn = kmalloc_node(sizeof(*pn), GFP_KERNEL, node);
+	/*
+	 * This routine is called against possible nodes.
+	 * But it's BUG to call kmalloc() against offline node.
+	 *
+	 * TODO: this routine can waste much memory for nodes which will
+	 * never be onlined. It's better to use memory hotplug callback
+	 * function.
+	 */
+	if (node_state(node, N_HIGH_MEMORY))
+		pn = kmalloc_node(sizeof(*pn), GFP_KERNEL, node);
+	else
+		pn = kmalloc(sizeof(*pn), GFP_KERNEL);
 	if (!pn)
 		return 1;
 
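The fallback in the patch can be illustrated with a small userspace model. Everything here is an assumption made for illustration: node_has_memory[] stands in for node_state(node, N_HIGH_MEMORY), malloc() stands in for both kernel allocators, and alloc_per_node() with its 'path' output exists only so that the chosen branch is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_NODES 4

/* Hypothetical model of which possible nodes currently have memory. */
static bool node_has_memory[MAX_NODES] = { true, false, true, false };

/* Which allocation path was taken, recorded for illustration only. */
enum alloc_path { ALLOC_NODE_LOCAL, ALLOC_ANY_NODE };

/*
 * Allocate per-node data for 'node'.  A node-local allocation is
 * requested only when the node has memory; otherwise fall back to
 * an allocation that may come from any node.
 */
static void *alloc_per_node(size_t size, int node, enum alloc_path *path)
{
	assert(node >= 0 && node < MAX_NODES);
	assert(path != NULL);

	if (node_has_memory[node])
		*path = ALLOC_NODE_LOCAL;	/* kmalloc_node() in the patch */
	else
		*path = ALLOC_ANY_NODE;		/* plain kmalloc() in the patch */

	return malloc(size);
}
```

The cost noted in the TODO still applies to the model: memory is allocated for every possible node, including ones that may never come online.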
Re: [RFC] kmemcheck: trap uses of uninitialized memory (v2)
Vegard Nossum wrote: Hi, On Nov 28, 2007 7:51 AM, Richard Knutsson <[EMAIL PROTECTED]> wrote: Vegard Nossum wrote: +static int Not 'static bool'? +page_is_tracked(struct page *page) Why not returning 'false' and 'true'? Sorry, I am not used to using bool in C :-) I will change this if bool is preferred in kernel code. Well, why not use them since we have them (C99 standard and over a year in the kernel). ;) What is "preferred" in a group of a few thousands, is hard to say, but I believe it is the way to go. The only "resistance" to it I know, is "it is not a C idiom". A quite illogical statement, at best. However, the 0/1 vs false/true is just a preference. (I like false/true, since I also say "true AND false = false" for example... (NOT true = false, makes sense to me, NOT 1 = 0 seem strange, why can't it be 2, or -1 ;) )) +static unsigned int +opcode_get_size(const uint8_t *opcode) Are we not using 'u8' in the kernel? Actually, I don't see any reason to use u8 when uint8_t is already standard and used in other places in the kernel. I believe I have heard they can be a problem in some situations. It also have the benefit of uniforming the kernel-code. cu Richard Knutsson - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] base/class.c: prevent ooops due to insert/remove race
On Wed, Nov 28, 2007 at 11:00:36PM -0500, Mark Lord wrote:
> While doing insert/remove (quickly) tests on USB, I managed to trigger
> an Oops on 2.6.23.1 on the call to strlen() in make_class_name().
>
> This patch prevents this oops.
>
> There is still the larger problem of the overall race
> that caused this in the first place, but much of the rest
> of the code in class.c appears to also do NULL checks to
> avoid Oops'ing, so this continues the tradition.
>
> Signed-off-by: Mark Lord <[EMAIL PROTECTED]>

As this is a bandage over the real problem, I'd prefer to not apply this
one right now until we find the root cause.

thanks,

greg k-h
Re: [PATCH -mm] printk trivial optimizations fix
On Wednesday 28 November 2007 11:02, Hugh Dickins wrote:
> mm's printk has been showing "%p" in abominable upper case recently:
> its trivial optimizations have changed the default from lower to upper,
> so the 'p' case needs to enforce lower explicitly.
>
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
>
>  lib/vsprintf.c |    1 +
>  1 file changed, 1 insertion(+)
>
> --- 2.6.24-rc3-mm2/lib/vsprintf.c 2007-11-28 12:42:26.0 +
> +++ linux/lib/vsprintf.c 2007-11-28 17:01:20.0 +
> @@ -525,6 +525,7 @@ int vsnprintf(char *buf, size_t size, co
>  				continue;
>
>  			case 'p':
> +				flags |= SMALL;
>  				if (field_width == -1) {
>  					field_width = 2*sizeof(void *);
>  					flags |= ZEROPAD;

Thanks Hugh for catching this. My fault :(
--
vda
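The behaviour the one-line fix restores (lower-case hex, zero-padded to 2*sizeof(void *) digits when no width is given) can be reproduced with the standard library. This is a userspace sketch, not the kernel's vsnprintf(); format_ptr() is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a pointer as described for the patched 'p' case:
 * lower-case hex digits, zero-padded to 2 * sizeof(void *).
 * Returns the number of characters written, as snprintf() does.
 */
static int format_ptr(char *buf, size_t size, const void *ptr)
{
	int width = (int)(2 * sizeof(void *));

	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "%0*llx", width,
			(unsigned long long)(uintptr_t)ptr);
}
```

The 'x' conversion plays the role of the SMALL flag here; using 'X' instead would give the upper-case output the report complains about.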
Re: [PATCH] kconfig: Make KCONFIG_ALLCONFIG work with randconfig.
On Wed, Nov 28, 2007 at 06:08:16PM +0100, Roman Zippel wrote: > On Wed, 28 Nov 2007, Paul Mundt wrote: > > While allyes/mod/noconfigs do seem to work fine with KCONFIG_ALLCONFIG > > provisions, randconfig tramples all over the provided values at perhaps > > not surprisingly, random. > > Please be careful with such broad statements, there is only an issue with > choice values. > Ok, I'll rephrase, '100% of the provided values I tested with were being randomly clobbered'. Is that better? Broken is broken, whether it applies to a small subset of symbols or not. > > Debugging this a bit, there seemed to be two issues: > > > > - SYMBOL_DEF and SYMBOL_DEF_USER overlap, which made > > def_sym->flags the same regardless of whether we came from an > > KCONFIG_ALLCONFIG path or not. > > Look at how SYMBOL_DEF is used in confdata.c. > Ah, ok. I was just trying to find something I could test that would be different for the KCONFIG_ALLCONFIG path, but it seems like is_new is a much cleaner solution for this, thanks for pointing it out! Updated patch follows. Signed-off-by: Paul Mundt <[EMAIL PROTECTED]> --- scripts/kconfig/conf.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c index a38787a..8d6f174 100644 --- a/scripts/kconfig/conf.c +++ b/scripts/kconfig/conf.c @@ -374,7 +374,8 @@ static int conf_choice(struct menu *menu) continue; break; case set_random: - def = (random() % cnt) + 1; + if (is_new) + def = (random() % cnt) + 1; case set_default: case set_yes: case set_mod: - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] base/class.c: prevent ooops due to insert/remove race
While doing insert/remove (quickly) tests on USB, I managed to trigger
an Oops on 2.6.23.1 on the call to strlen() in make_class_name().

This patch prevents this oops.

There is still the larger problem of the overall race
that caused this in the first place, but much of the rest
of the code in class.c appears to also do NULL checks to
avoid Oops'ing, so this continues the tradition.

Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
---

Patch applies to both 2.6.24 and 2.6.23.

--- old/drivers/base/class.c	2007-11-28 22:54:59.0 -0500
+++ linux/drivers/base/class.c	2007-11-28 22:54:48.0 -0500
@@ -354,6 +354,8 @@
 	char *class_name;
 	int size;
 
+	if (!name)
+		return NULL;
 	size = strlen(name) + strlen(kobject_name(kobj)) + 2;
 
 	class_name = kmalloc(size, GFP_KERNEL);
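For readers without the surrounding source at hand, the guarded function can be sketched in userspace. Assumptions are labelled: 'kobj_name' stands in for kobject_name(kobj), malloc() stands in for kmalloc(), and the "name:kobj_name" layout is inferred from the "+ 2" in the size computation (one separator plus the terminating NUL) rather than taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace sketch of make_class_name() with the NULL guard.
 * Returns a malloc()ed string that the caller must free(), or
 * NULL on missing input or allocation failure.
 */
static char *make_class_name(const char *name, const char *kobj_name)
{
	char *class_name;
	size_t size;

	/* The guard: strlen(NULL) is what caused the Oops. */
	if (!name || !kobj_name)
		return NULL;

	size = strlen(name) + strlen(kobj_name) + 2;
	class_name = malloc(size);
	if (!class_name)
		return NULL;

	snprintf(class_name, size, "%s:%s", name, kobj_name);
	assert(strlen(class_name) == size - 1);
	return class_name;
}
```

As the reply in this thread points out, a guard like this only hides the symptom; whatever raced to leave the name NULL is still there.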
Re: [PATCH] IB/ehca: Fix static rate if path faster than link
thanks, applied
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
On Thu, 29 Nov 2007 12:33:28 +0900 (JST) [EMAIL PROTECTED] (YAMAMOTO Takashi) wrote:

> > +static inline struct mem_cgroup_per_zone *
> > +mem_cgroup_zoneinfo(struct mem_cgroup *mem, int nid, int zid)
> > +{
> > +	if (!mem->info.nodeinfo[nid])
>
> can this be true?
>
> YAMAMOTO Takashi

When I set early_init=1, I added that check. Would BUG_ON() be better?

Thanks,
-Kame
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
> +static inline struct mem_cgroup_per_zone *
> +mem_cgroup_zoneinfo(struct mem_cgroup *mem, int nid, int zid)
> +{
> +	if (!mem->info.nodeinfo[nid])

can this be true?

YAMAMOTO Takashi

> +		return NULL;
> +	return &mem->info.nodeinfo[nid]->zoneinfo[zid];
> +}
> +
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
On Thu, 29 Nov 2007, KAMEZAWA Hiroyuki wrote:

> ok, just use N_HIGH_MEMORY here and add comment for hotplugging support is
> not yet.
>
> Christoph-san, Lee-san, could you confirm following ?
>
> - when SLAB is used, kmalloc_node() against offline node will success.
> - when SLUB is used, kmalloc_node() against offline node will panic.
>
> Then, the caller should take care that node is online before kmalloc().

H... An offline node implies that the per node structure does not exist.
SLAB should fail too. If there is something wrong with the allocs then it's
likely a difference in the way hotplug was put into SLAB and SLUB.
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
On Thu, 29 Nov 2007 12:19:37 +0900 (JST) [EMAIL PROTECTED] (YAMAMOTO Takashi) wrote:

> > @@ -651,10 +758,11 @@
> >  		/* Avoid race with charge */
> >  		atomic_set(&pc->ref_cnt, 0);
> >  		if (clear_page_cgroup(page, pc) == pc) {
> > +			int active;
> >  			css_put(&mem->css);
> > +			active = pc->flags & PAGE_CGROUP_FLAG_ACTIVE;
> >  			res_counter_uncharge(&mem->res, PAGE_SIZE);
> > -			list_del_init(&pc->lru);
> > -			mem_cgroup_charge_statistics(mem, pc->flags, false);
> > +			__mem_cgroup_remove_list(pc);
> >  			kfree(pc);
> >  		} else	/* being uncharged ? ...do relax */
> >  			break;
>
> 'active' seems unused.
>

ok, I will post clean-up against -mm2.

Thanks,
-Kame
Re: [PATCH] PPC: CELLEB - fix potential NULL pointer dereference
Cyrill Gorcunov <[EMAIL PROTECTED]> wrote: > On 11/28/07, Cyrill Gorcunov <[EMAIL PROTECTED]> wrote: > > On 11/28/07, Michael Ellerman <[EMAIL PROTECTED]> wrote: > > > On Mon, 2007-11-26 at 10:46 +0300, Cyrill Gorcunov wrote: > > > > This patch adds checking for NULL value returned to prevent possible > > > > NULL pointer dereference. > > > > Also two unneeded 'return' are removed. > > > > > > > > Signed-off-by: Cyrill Gorcunov <[EMAIL PROTECTED]> > > > > --- > > > > Any comments are welcome. > > > > > > I guess it's good to be paranoid, but this is a little verbose: > > > > > >wi0 = of_get_property(node, "device-id", NULL); > > > + if (unlikely((!wi0))) { > > > + printk(KERN_ERR "PCI: device-id not found.\n"); > > > + goto error; > > > + } > > >wi1 = of_get_property(node, "vendor-id", NULL); > > > + if (unlikely((!wi1))) { > > > + printk(KERN_ERR "PCI: vendor-id not found.\n"); > > > + goto error; > > > + } > > >wi2 = of_get_property(node, "class-code", NULL); > > > + if (unlikely((!wi2))) { > > > + printk(KERN_ERR "PCI: class-code not found.\n"); > > > + goto error; > > > + } > > >wi3 = of_get_property(node, "revision-id", NULL); > > > + if (unlikely((!wi3))) { > > > + printk(KERN_ERR "PCI: revision-id not found.\n"); > > > + goto error; > > > + } > > > > > > Perhaps instead: > > > > > >wi0 = of_get_property(node, "device-id", NULL); > > >wi1 = of_get_property(node, "vendor-id", NULL); > > >wi2 = of_get_property(node, "class-code", NULL); > > >wi3 = of_get_property(node, "revision-id", NULL); > > > > > > if (!wi0 || !wi1 || !wi2 || !wi3) { > > > printk(KERN_ERR "PCI: Missing device tree properties.\n"); > > > goto error; > > > } > > > > Hi Michael, yes that is much better (actually I was doubt about what form of > > which the checking style to use - your form is much compact but mine does > > show where *exactly* the problem appeared). So 'case that is the fake driver > > your form is preferred ;) Ishizaki, could you use Michael's part then? 
> > > > > > > > > > > cheers > > > > > > -- > > > Michael Ellerman > > > OzLabs, IBM Australia Development Lab > > > > > > wwweb: http://michael.ellerman.id.au > > > phone: +61 2 6212 1183 (tie line 70 21183) > > > > > > We do not inherit the earth from our ancestors, > > > we borrow it from our children. - S.M.A.R.T Person > > > > > > > > > > Cyrill > > > Ishizaki I can update the patch if you needed. Should I? > > Cyrill There is no problem to use Michael's part, and I also prefer simple one like this. Cyrill, would you please update your patch? Best regards, Kou Ishizaki - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc3-mm2
> +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-add-scan_global_lru-macro.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-nid-zid-helper-function-for-cgroup.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-per-zone-active-inactive-counter.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-calculate-mapper_ratio-per-cgroup.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-calculate-active-inactive-imbalance-per-cgroup.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-remember-reclaim-priority-in-memory-cgroup.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-remember-reclaim-priority-in-memory-cgroup-fix.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-remember-reclaim-priority-in-memory-cgroup-fix-2.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-calculate-the-number-of-pages-to-be-scanned-per-cgroup.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-modifies-vmscanc-for-isolate-globa-cgroup-lru-activity.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-modifies-vmscanc-for-isolate-globa-cgroup-lru-activity-fix.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-per-zone-lru-for-cgroup.patch > +per-zone-and-reclaim-enhancements-for-memory-controller-take-3-per-zone-lock-for-cgroup.patch > > cgroup memeory controller updates > I noticed CONFIG_NUMA + CONFIG_CGROUP_MEM_CONT + CONFIG_SLUB cannot boot because of my patch. (SLAB is ok.) I'll post workaround soon. Sorry, -Kame - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
> @@ -651,10 +758,11 @@ > /* Avoid race with charge */ > atomic_set(>ref_cnt, 0); > if (clear_page_cgroup(page, pc) == pc) { > + int active; > css_put(>css); > + active = pc->flags & PAGE_CGROUP_FLAG_ACTIVE; > res_counter_uncharge(>res, PAGE_SIZE); > - list_del_init(>lru); > - mem_cgroup_charge_statistics(mem, pc->flags, false); > + __mem_cgroup_remove_list(pc); > kfree(pc); > } else /* being uncharged ? ...do relax */ > break; 'active' seems unused. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
On Thu, 29 Nov 2007 11:24:06 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > On Thu, 29 Nov 2007 10:37:02 +0900 > KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > > > Maybe zonelists of NODE_DATA() is not initialized. you are right. > > I think N_HIGH_MEMORY will be suitable here...(I'll consider node-hotplug > > case later.) > > > > Thank you for test! > > > Could you try this ? >
Sorry, this can be a workaround, but I noticed I missed something. OK, just use N_HIGH_MEMORY here and add a comment that hotplug support is not there yet.
Christoph-san, Lee-san, could you confirm the following?
- When SLAB is used, kmalloc_node() against an offline node will succeed.
- When SLUB is used, kmalloc_node() against an offline node will panic.
If so, the caller should make sure that the node is online before kmalloc().
Regards, -Kame
Re: [PATCH] (2.4.26-rc3-mm2) -mm Update CAP_LAST_CAP to reflect CAP_MAC_ADMIN
Quoting Casey Schaufler ([EMAIL PROTECTED]): > From: Casey Schaufler <[EMAIL PROTECTED]> > > Bump the value of CAP_LAST_CAP to reflect the current last cap value. > It appears that the patch that introduced CAP_LAST_CAP and the patch > that introduced CAP_MAC_ADMIN came in more or less at the same time. > > Signed-off-by: Casey Schaufler <[EMAIL PROTECTED]> Signed-off-by: Serge Hallyn <[EMAIL PROTECTED]> > > --- > > include/linux/capability.h |8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > > diff -uprN -X linux-2.6.24-rc3-mm2-base/Documentation/dontdiff > linux-2.6.24-rc3-mm2-base/include/linux/capability.h > linux-2.6.24-rc3-mm2-lastcap/include/linux/capability.h > --- linux-2.6.24-rc3-mm2-base/include/linux/capability.h 2007-11-27 > 16:47:02.0 -0800 > +++ linux-2.6.24-rc3-mm2-lastcap/include/linux/capability.h 2007-11-28 > 14:04:57.0 -0800 > @@ -315,10 +315,6 @@ typedef struct kernel_cap_struct { > > #define CAP_SETFCAP 31 > > -#define CAP_LAST_CAP CAP_SETFCAP > - > -#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP) > - > /* Override MAC access. > The base kernel enforces no MAC policy. > An LSM may enforce a MAC policy, and if it does and it chooses > @@ -336,6 +332,10 @@ typedef struct kernel_cap_struct { > > #define CAP_MAC_ADMIN33 > > +#define CAP_LAST_CAP CAP_MAC_ADMIN > + > +#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP) > + > /* > * Bit location of each capability (used by user-space library and kernel) > */ > > - > To unsubscribe from this list: send the line "unsubscribe > linux-security-module" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Patch](Resend) mm/sparse.c: Improve the error handling for sparse_add_one_section()
Looks good to me. Thanks. Acked-by: Yasunori Goto <[EMAIL PROTECTED]> > On Tue, Nov 27, 2007 at 10:53:45AM -0800, Dave Hansen wrote: > >On Tue, 2007-11-27 at 10:26 +0800, WANG Cong wrote: > >> > >> @@ -414,7 +418,7 @@ int sparse_add_one_section(struct zone * > >> out: > >> pgdat_resize_unlock(pgdat, ); > >> if (ret <= 0) > >> - __kfree_section_memmap(memmap, nr_pages); > >> + kfree(usemap); > >> return ret; > >> } > >> #endif > > > >Why did you get rid of the memmap free here? A bad return from > >sparse_init_one_section() indicates that we didn't use the memmap, so it > >will leak otherwise. > > Sorry, I was confused by the recursion. This one should be OK. > > Thanks. > > > > Improve the error handling for mm/sparse.c::sparse_add_one_section(). And I > see no reason to check 'usemap' until holding the 'pgdat_resize_lock'. > > Cc: Christoph Lameter <[EMAIL PROTECTED]> > Cc: Dave Hansen <[EMAIL PROTECTED]> > Cc: Rik van Riel <[EMAIL PROTECTED]> > Cc: Yasunori Goto <[EMAIL PROTECTED]> > Cc: Andy Whitcroft <[EMAIL PROTECTED]> > Signed-off-by: WANG Cong <[EMAIL PROTECTED]> > > --- > Index: linux-2.6/mm/sparse.c > === > --- linux-2.6.orig/mm/sparse.c > +++ linux-2.6/mm/sparse.c > @@ -391,9 +391,17 @@ int sparse_add_one_section(struct zone * >* no locking for this, because it does its own >* plus, it does a kmalloc >*/ > - sparse_index_init(section_nr, pgdat->node_id); > + ret = sparse_index_init(section_nr, pgdat->node_id); > + if (ret < 0) > + return ret; > memmap = kmalloc_section_memmap(section_nr, pgdat->node_id, nr_pages); > + if (!memmap) > + return -ENOMEM; > usemap = __kmalloc_section_usemap(); > + if (!usemap) { > + __kfree_section_memmap(memmap, nr_pages); > + return -ENOMEM; > + } > > pgdat_resize_lock(pgdat, ); > > @@ -403,18 +411,16 @@ int sparse_add_one_section(struct zone * > goto out; > } > > - if (!usemap) { > - ret = -ENOMEM; > - goto out; > - } > ms->section_mem_map |= SECTION_MARKED_PRESENT; > > ret = sparse_init_one_section(ms, section_nr, memmap, 
usemap); > > out: > pgdat_resize_unlock(pgdat, ); > - if (ret <= 0) > + if (ret <= 0) { > + kfree(usemap); > __kfree_section_memmap(memmap, nr_pages); > + } > return ret; > } > #endif -- Yasunori Goto - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
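The shape of the fix above (allocate before taking the lock, bail out early on allocation failure, free everything if the locked step fails) can be sketched in userspace C. This is an analogue under stated assumptions, not the kernel code: `struct section`, `section_add()` and the `-EEXIST` condition are hypothetical, and unlike the quoted patch this sketch returns 0 on success.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct section {
    pthread_mutex_t lock;
    void *memmap;
    void *usemap;
};

/*
 * Userspace analogue: allocate both buffers before taking the lock,
 * fail early with -ENOMEM, and free both if the locked step fails.
 */
static int section_add(struct section *s, size_t memmap_sz, size_t usemap_sz)
{
    void *memmap, *usemap;
    int ret = 0;

    memmap = malloc(memmap_sz);
    if (!memmap)
        return -ENOMEM;
    usemap = malloc(usemap_sz);
    if (!usemap) {
        free(memmap);           /* undo the first allocation */
        return -ENOMEM;
    }

    pthread_mutex_lock(&s->lock);
    if (s->memmap) {
        ret = -EEXIST;          /* section already present */
    } else {
        s->memmap = memmap;
        s->usemap = usemap;
    }
    pthread_mutex_unlock(&s->lock);

    if (ret < 0) {              /* nothing was installed: release both */
        free(usemap);
        free(memmap);
    }
    return ret;
}
```

Keeping the allocations outside the critical section means the lock is never held across a call that may block, and every failure path frees exactly what was allocated before it.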
Re: [PATCH] fix plip 1
On Mon, 26 Nov 2007, Linus Torvalds wrote: > > > On Thu, 22 Nov 2007, Mikulas Patocka wrote: > > > > netif_rx is meant to be called from interrupts because it doesn't wake up > > ksoftirqd. For calling from outside interrupts, netif_rx_ni exists. > > Argh. Can you _please_ use more useful subject lines than "fix plip 1/2"? > > Those subject lines are what becomes the single-line description of the > problem, used by visualizers like gitk and gitweb. So "fix plip 1" is a > singularly bad such line! OK, I see Mikulas
RE: Question regarding mutex locking
> Thanks for the help. Someday, I hope to understand this stuff. > > Larry Any code either deals with an object or it doesn't. If it doesn't deal with that object, it should not be acquiring locks on that object. If it does deal with that object, it must know the internal details of that object, including when and whether locks are held, or it cannot deal with that object sanely. So your question starts out broken, it says, "I need to lock an object, but I have no clue what's going on with that very same object." If you don't know what's going on with the object, you don't know enough about the object to lock it. If you do, you should know whether you hold the lock or not. Either architect so this function doesn't deal with that object and so doesn't need to lock it or architect it so that this function knows what's going on with that object and so knows whether it holds the lock or not. If you don't follow this rule, a lot of things can go horribly wrong. The two biggest issues are: 1) You don't know the semantic effect of locking and unlocking the mutex. So any code placed before the mutex is acquired or after its released may not do what's expected. For example, you cannot unlock the mutex and yield, because you might not actually wind up unlocking the mutex. 2) A function that acquires a lock normally expects the object it locks to be in a consistent state when it acquires the lock. However, since your code may or may not acquire the mutex, it is not assured that its lock gets the object in a consistent state. Requiring the caller to know this and call the function with the object in a consistent state creates brokenness of varying kinds. (If the object may change, why not just release the lock before calling? If the object may not change, why is the sub-function releasing the lock?) 
DS
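One common way to follow the rule described in this message is to make lock ownership part of each function's contract. The sketch below uses POSIX mutexes; `struct counter` and the `_locked` suffix are naming assumptions for illustration, not something taken from the code the original poster was working on.

```c
#include <pthread.h>

struct counter {
    pthread_mutex_t lock;
    long value;
};

/* Caller must hold c->lock. Never takes or drops it. */
static void counter_add_locked(struct counter *c, long n)
{
    c->value += n;
}

/* Caller must NOT hold c->lock. Takes and releases it itself. */
static void counter_add(struct counter *c, long n)
{
    pthread_mutex_lock(&c->lock);
    counter_add_locked(c, n);
    pthread_mutex_unlock(&c->lock);
}

/* A compound operation holds the lock once and uses only _locked helpers. */
static long counter_add_twice(struct counter *c, long n)
{
    long result;

    pthread_mutex_lock(&c->lock);
    counter_add_locked(c, n);
    counter_add_locked(c, n);
    result = c->value;
    pthread_mutex_unlock(&c->lock);
    return result;
}
```

No function here has to ask "do I already hold the mutex?"; each one states whether the lock is held on entry, so the object is in a known state at every lock and unlock.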
Re: [RFC PATCH] LTTng instrumentation mm (using page_to_pfn)
I am adding the rest.. two questions left : * Dave Hansen ([EMAIL PROTECTED]) wrote: > > > > Index: linux-2.6-lttng/mm/memory.c > > === > > --- linux-2.6-lttng.orig/mm/memory.c2007-11-28 08:42:09.0 > > -0500 > > +++ linux-2.6-lttng/mm/memory.c 2007-11-28 09:02:57.0 -0500 > > @@ -2072,6 +2072,7 @@ static int do_swap_page(struct mm_struct > > delayacct_set_flag(DELAYACCT_PF_SWAPIN); > > page = lookup_swap_cache(entry); > > if (!page) { > > + trace_mark(mm_swap_in, "pfn %lu", page_to_pfn(page)); > > grab_swap_token(); /* Contend for token _before_ read-in */ > > swapin_readahead(entry, address, vma); > > page = read_swap_cache_async(entry, vma, address); > > How about putting the swap file number and the offset as well? > [...] > > Index: linux-2.6-lttng/mm/page_io.c > > === > > --- linux-2.6-lttng.orig/mm/page_io.c 2007-11-28 08:38:47.0 > > -0500 > > +++ linux-2.6-lttng/mm/page_io.c2007-11-28 08:52:14.0 -0500 > > @@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st > > rw |= (1 << BIO_RW_SYNC); > > count_vm_event(PSWPOUT); > > set_page_writeback(page); > > + trace_mark(mm_swap_out, "pfn %lu", page_to_pfn(page)); > > unlock_page(page); > > submit_bio(rw, bio); > > I'd also like to see the swap file number and the location in swap for > this one. > Before I start digging deeper in checking whether it is already instrumented by the fs instrumentation (and would therefore be redundant), is there a particular data structure from mm/ that you suggest taking the swap file number and location in swap from ? Mathieu > -- Dave > -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
On Thu, 29 Nov 2007 10:37:02 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > Maybe zonelists of NODE_DATA() is not initialized. you are right. > I think N_HIGH_MEMORY will be suitable here...(I'll consider node-hotplug > case later.) > > Thank you for test! > Could you try this ? Thanks, -Kame == Don't call kmalloc() against possible but offline node. Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> mm/memcontrol.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) Index: test-2.6.24-rc3-mm1/mm/memcontrol.c === --- test-2.6.24-rc3-mm1.orig/mm/memcontrol.c +++ test-2.6.24-rc3-mm1/mm/memcontrol.c @@ -1117,8 +1117,14 @@ static int alloc_mem_cgroup_per_zone_inf struct mem_cgroup_per_node *pn; struct mem_cgroup_per_zone *mz; int zone; - - pn = kmalloc_node(sizeof(*pn), GFP_KERNEL, node); + /* +* This routine is called against possible nodes. +* But it's BUG to call kmalloc() against offline node. +*/ + if (node_state(N_ONLINE, node)) + pn = kmalloc_node(sizeof(*pn), GFP_KERNEL, node); + else + pn = kmalloc(sizeof(*pn), GFP_KERNEL); if (!pn) return 1; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/1] Writeback fix for concurrent large and small file writes
Two typos in comments. Cheers, FJP Michael Rubin wrote: > + * The flush tree organizes the dirtied_when keys with the rb_tree. Any > + * inodes with a duplicate dirtied_when value are link listed together. > This + * link list is sorted by the inode's i_flushed_when. When both the > + * dirited_when and the i_flushed_when are indentical the order in the > + * linked list determines the order we flush the inodes. s/dirited_when/dirtied_when/ > + * Here is where we interate to find the next inode to process. The > + * strategy is to first look for any other inodes with the same > dirtied_when + * value. If we have already processed that node then we > need to find + * the next highest dirtied_when value in the tree. s/interate/iterate/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
On Wed, 28 Nov 2007, Jeremy Fitzhardinge wrote: > > percpu references are quite frequent already (vm statistics) and will be > > more frequent after we have converted the per cpu arrays to per cpu > > allocations. > > > > Well, I think the point is moot, because x86 will always use 32-bit > offsets. Each reference will only be 1 byte bigger than a normal > variable reference. Just because i386 is not able to use it does not mean that other arches are not. F.e. IA64 can embedd offsets in the actual instruction (but of course not 64bit). x86_64 can use a 32 bit offset instead of a 64 bit addres because it uses the small model. A load of a 64 bit address would require much more expensive instructions. A load of a 64 bit address is currently avoided through the use of the pda that contains the full 64 bit address in the data_offset field. Operations on per cpu data on x86_64 must therefore first load data_offset via gs and then add the per cpu address to this offset. Then the per cpu operation is performed on that address. In order to avoid this situation through one instruction we need a small 32 bit offset relative to gs. Otherwise we cannot get away from the PDA and the use of data_offset. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
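The "base plus small offset" addressing argued for here can be modelled in ordinary C. This is a userspace sketch only: `areas`, `percpu_base()` and `percpu_ptr()` are hypothetical names, an array stands in for the per-CPU base register (gs on x86_64 in this thread), and nothing below is the kernel's actual percpu implementation.

```c
#include <stddef.h>

#define NR_CPUS_DEMO 4          /* hypothetical, small on purpose */

/* Layout of one per-CPU area; offsets are relative to its start. */
struct percpu_area {
    long vm_events;
    long irq_count;
};

static struct percpu_area areas[NR_CPUS_DEMO];

/* Stand-in for loading the per-CPU base register of a given CPU. */
static char *percpu_base(int cpu)
{
    return (char *)&areas[cpu];
}

/* One add of base and a small offset yields the variable's address. */
static long *percpu_ptr(int cpu, size_t offset)
{
    return (long *)(percpu_base(cpu) + offset);
}

static void percpu_inc_irq(int cpu)
{
    (*percpu_ptr(cpu, offsetof(struct percpu_area, irq_count)))++;
}
```

Because the offset is relative to the start of the area, it stays small regardless of where each CPU's area was allocated, which is the property that lets an architecture fold it into a single instruction.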
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
Christoph Lameter wrote: > The percpu areas need to be allocated in a NUMA aware fashion. Otherwise > you use distant memory for the most performance sensitive areas. The NUMA > subsystem must be so far up that these allocations can be performed in the > right way. And this means at least you need to know on which node each > processor is located. That is what the PDA is currently used for and i386 > has no other way of doing that. I think we could use an array [NR_CPUS] > for this one but we want to avoid these arrays because NR_CPUS may get > very big. > Oh, you mean there needs to be some percpu data mechanism operating in order to do numa-aware allocations, which would be necessary to allocate the percpu memory itself? I can see how that would be awkward. J - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] net/e1000: fix memcpy in e1000_get_strings
drivers/net/e1000/e1000_ethtool.c:113: #define E1000_TEST_LEN sizeof(e1000_gstrings_test) / ETH_GSTRING_LEN drivers/net/e1000e/ethtool.c:106: #define E1000_TEST_LEN sizeof(e1000_gstrings_test) / ETH_GSTRING_LEN E1000_TEST_LEN*ETH_GSTRING_LEN will expand to sizeof(e1000_gstrings_test) / (ETH_GSTRING_LEN * ETH_GSTRING_LEN) Please confirm that the change is as wanted. -- A lack of parentheses around defines causes unexpected results due to operator precedences. Signed-off-by: Roel Kluin <[EMAIL PROTECTED]> --- diff --git a/drivers/net/e1000/e1000_ethtool.c b/drivers/net/e1000/e1000_ethtool.c index 667f18b..b83ccce 100644 --- a/drivers/net/e1000/e1000_ethtool.c +++ b/drivers/net/e1000/e1000_ethtool.c @@ -1923,7 +1923,7 @@ e1000_get_strings(struct net_device *netdev, uint32_t stringset, uint8_t *data) switch (stringset) { case ETH_SS_TEST: memcpy(data, *e1000_gstrings_test, - E1000_TEST_LEN*ETH_GSTRING_LEN); + sizeof(e1000_gstrings_test)); break; case ETH_SS_STATS: for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++) { diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c index 6a39784..338c49d 100644 --- a/drivers/net/e1000e/ethtool.c +++ b/drivers/net/e1000e/ethtool.c @@ -1739,7 +1739,7 @@ static void e1000_get_strings(struct net_device *netdev, u32 stringset, switch (stringset) { case ETH_SS_TEST: memcpy(data, *e1000_gstrings_test, - E1000_TEST_LEN*ETH_GSTRING_LEN); + sizeof(e1000_gstrings_test)); break; case ETH_SS_STATS: for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++) { - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
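A small standalone sketch of the macro hazard may help when reviewing this. The names `GSTRING_LEN`, `test_strings`, `TEST_LEN_BAD` and `TEST_LEN_GOOD` are stand-ins assumed for the example, not the driver's definitions. One point worth checking against the description above: `/` and `*` have equal precedence and associate left to right in C, so the product form still evaluates as (sizeof / LEN) * LEN; the unparenthesized macro gives a different result when it appears as the right-hand operand of a division.

```c
#include <stddef.h>

#define GSTRING_LEN 32          /* stand-in for ETH_GSTRING_LEN */

/* Five fixed-width strings: sizeof(test_strings) is 5 * 32 = 160. */
static const char test_strings[5][GSTRING_LEN] = {
    "a", "b", "c", "d", "e",
};

/* Unparenthesized, in the style of the quoted driver lines. */
#define TEST_LEN_BAD  sizeof(test_strings) / GSTRING_LEN
/* Parenthesized: always behaves as a single value. */
#define TEST_LEN_GOOD (sizeof(test_strings) / GSTRING_LEN)

/* 160 / 32 * 32: evaluated left to right, so this is still 160. */
static size_t bytes_via_bad_macro(void)
{
    return TEST_LEN_BAD * GSTRING_LEN;
}

/* total / 160 / 32: the macro body is split by the outer division. */
static size_t per_string_bad(size_t total)
{
    return total / TEST_LEN_BAD;
}

/* total / (160 / 32): divides by the intended string count of 5. */
static size_t per_string_good(size_t total)
{
    return total / TEST_LEN_GOOD;
}
```

Parenthesizing the macro body removes the dependence on where the macro is used, and passing `sizeof(array)` directly to `memcpy()`, as the patch does, avoids the macro in that call altogether.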
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
On Wed, 28 Nov 2007, Jeremy Fitzhardinge wrote: > Don't think it matters either way. Before percpu is allocated, NUMA > issues don't matter. Once they are - by whatever mechanism - you can > set the segment bases up appropriately. The fact that you chose to put > percpu data at address X doesn't affect the percpu mechanism one way or > the other. The percpu areas need to be allocated in a NUMA aware fashion. Otherwise you use distant memory for the most performance sensitive areas. The NUMA subsystem must be so far up that these allocations can be performed in the right way. And this means at least you need to know on which node each processor is located. That is what the PDA is currently used for and i386 has no other way of doing that. I think we could use an array [NR_CPUS] for this one but we want to avoid these arrays because NR_CPUS may get very big. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] [net/wireless/iwlwifi] : iwlwifi 4965 Fix race conditional panic.
The cancel_delayed_work_sync has moved into ilw_cancel_deferred_work. Thanks Zhu Yi. [net/wireless/iwlwifi] : iwlwifi 4965 Fix race conditional panic. Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> --- diff --git a/drivers/net/wireless/iwlwifi/iwl4965-base.c b/drivers/net/wireless/iwlwifi/iwl4965-base.c index 9918780..2474eba 100644 --- a/drivers/net/wireless/iwlwifi/iwl4965-base.c +++ b/drivers/net/wireless/iwlwifi/iwl4965-base.c @@ -8864,6 +8864,7 @@ static void iwl_cancel_deferred_work(struct iwl_priv *priv) { iwl_hw_cancel_deferred_work(priv); + cancel_delayed_work_sync(>init_alive_start); cancel_delayed_work(>scan_check); cancel_delayed_work(>alive_start); cancel_delayed_work(>post_associate); --- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Out of tree module using LSM
--- Jan Engelhardt <[EMAIL PROTECTED]> wrote: > > On Nov 28 2007 18:22, [EMAIL PROTECTED] wrote: > > > >Talpa is modular itself being composed of a set of kernel modules of which > >not all are loaded simultaneously. Where possible LSM can be used and _no_ > >messing with syscall table will take place. Unfortunately where another > >LSM user is present that won't work > > SELinux supports chaining, so if talpa is loaded as a secondary to selinux, > where is the problem? For those LSMs which do not support chaining (*cough* > apparmor *cough* be one, mtadm another), fix them. Um, cough cough (I ready do have a nasty cold) SELinux supports a very limited bit of chaining. I don't think you're going to be chaining security_secid_to_secctx() or security_secctx_to_secid() with the current SELinux code, but you could prove me wrong there. Chaining is a red herring. If you want talpa it seems that you have a use case that isn't going to require the presence of another LSM. You may have other issues, but at this point I say throw caution to the wind, clean it up based on the suggestions you've seen here, and put the patch up as an RFC on the LSM list. What's the worst that could happen? Casey Schaufler [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/2] [net/wireless/iwlwifi] : iwlwifi 3945 Fix raceconditional panic.
2007/11/29, Zhu Yi <[EMAIL PROTECTED]>: > > Good catch. But it will be better if you add it into > iwl_cancel_deferred_work(). > Thanks. I agree with you. Actually, I considered it, but I was afraid of side effect. Anyway, I'm attaching a new one. Thanks. Joonwoo [net/wireless/iwlwifi] : iwlwifi 3945 Fix race conditional panic. Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> --- diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c index 465da4f..e51e872 100644 --- a/drivers/net/wireless/iwlwifi/iwl3945-base.c +++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c @@ -8270,6 +8270,7 @@ static void iwl_cancel_deferred_work(struct iwl_priv *priv) { iwl_hw_cancel_deferred_work(priv); + cancel_delayed_work_sync(>init_alive_start); cancel_delayed_work(>scan_check); cancel_delayed_work(>alive_start); cancel_delayed_work(>post_associate); --- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
Christoph Lameter wrote: > On Wed, 28 Nov 2007, Jeremy Fitzhardinge wrote: > > >> I don't see the problem. The way i386 does it inherently supports >> per-cpu data very early on (it uses the prototype percpu section until >> the real percpu values are set up). >> > > Ok so we could do that for x86_64 as well? There is more complicated > bootstrap since i386 does not support NUMA aware placement of per cpu > areas. > Don't think it matters either way. Before percpu is allocated, NUMA issues don't matter. Once they are - by whatever mechanism - you can set the segment bases up appropriately. The fact that you chose to put percpu data at address X doesn't affect the percpu mechanism one way or the other. > percpu references are quite frequent already (vm statistics) and will be > more frequent after we have converted the per cpu arrays to per cpu > allocations. > Well, I think the point is moot, because x86 will always use 32-bit offsets. Each reference will only be 1 byte bigger than a normal variable reference. J - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
On Wed, 28 Nov 2007, Jeremy Fitzhardinge wrote: > I don't see the problem. The way i386 does it inherently supports > per-cpu data very early on (it uses the prototype percpu section until > the real percpu values are set up). Ok so we could do that for x86_64 as well? There is more complicated bootstrap since i386 does not support NUMA aware placement of per cpu areas. > > The i386 way of referring to per cpu data is not optimal because it is > > always offset by __per_cpu_start. per cpu data offsets need to be relative > > to the beginning of the per cpu area. per cpu data is less than 64k so 2 > > byte offsets would be enough. > > > > I don't see that's terribly important. percpu references aren't all > that common overall, and - at least on x86 - using a 16-bit offset > (assuming its possible) would require a prefix anyway, so it would only > save 1 byte per reference. But I can't convince gas to generate a > 16-bit offset anyway. percpu references are quite frequent already (vm statistics) and will be more frequent after we have converted the per cpu arrays to per cpu allocations. > > That way the __per_cpu_offset array and the registers that are used on > > various platforms are pointing to the actual data and can be loaded > > directly into a register and then a load with a small offset to that > > register can be performed. On x86_64 this is gs, on i386 fs, on sparc g5, > > on ia64 a fixed address stands in for the register. > > The asm used to generate these references is inherently arch-specific > anyway, so the type and size of offset needed from the per-cpu base > register to the data itself can be arch-dependent without loss of > generality. Well yes that is already the case and made explicit by the percpu cleanup done so far. The offset of a base is used by multiple architectures. 
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
On Wed, 28 Nov 2007 16:19:59 -0500 Lee Schermerhorn <[EMAIL PROTECTED]> wrote: > As soon as this loop hits the first non-existent node on my platform, I > get a NULL pointer deref down in __alloc_pages. Stack trace below. > > Perhaps N_POSSIBLE should be N_HIGH_MEMORY? That would require handling > of memory/node hotplug for each memory control group, right? But, I'm > going to try N_HIGH_MEMORY as a work around. > Hmm, ok. (>_< > Call Trace: > [] show_stack+0x80/0xa0 > sp=a001008e39c0 bsp=a001008dd1b0 > [] show_regs+0x870/0x8a0 > sp=a001008e3b90 bsp=a001008dd158 > [] die+0x190/0x300 > sp=a001008e3b90 bsp=a001008dd110 > [] ia64_do_page_fault+0x8e0/0xa20 > sp=a001008e3b90 bsp=a001008dd0b8 > [] ia64_leave_kernel+0x0/0x270 > sp=a001008e3c20 bsp=a001008dd0b8 > [] __alloc_pages+0x30/0x6e0 > sp=a001008e3df0 bsp=a001008dcfe0 > [] new_slab+0x610/0x6c0 > sp=a001008e3e00 bsp=a001008dcf80 > [] get_new_slab+0x50/0x200 > sp=a001008e3e00 bsp=a001008dcf48 > [] __slab_alloc+0x2e0/0x4e0 > sp=a001008e3e00 bsp=a001008dcf00 > [] kmem_cache_alloc_node+0x180/0x200 > sp=a001008e3e10 bsp=a001008dcec0 > [] mem_cgroup_create+0x160/0x400 > sp=a001008e3e10 bsp=a001008dce78 > [] cgroup_init_subsys+0xa0/0x400 > sp=a001008e3e20 bsp=a001008dce28 > [] cgroup_init+0x90/0x160 > sp=a001008e3e20 bsp=a001008dce00 > [] start_kernel+0x700/0x820 > sp=a001008e3e20 bsp=a001008dcd80 > Maybe zonelists of NODE_DATA() is not initialized. you are right. I think N_HIGH_MEMORY will be suitable here...(I'll consider node-hotplug case later.) Thank you for test! Regards, -Kame - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] asm-arm/{arch-omap,arch-ixp23xx}: parentheses around NR_IRQS definition
Roel Kluin wrote: > Add parentheses to prevent operator precedence errors > > Signed-off-by: Roel Kluin <[EMAIL PROTECTED]> For the arch-ixp23xx part I should have added: Acked-by: Lennert Buytenhek <[EMAIL PROTECTED]>
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
> * drop support for stack-protector (does it really help? do people > use it?) AFAIK we only ever had a single classical stack buffer overflow in the kernel. It certainly doesn't seem to be solving a common security problem. -Andi
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
Christoph Lameter wrote:
> On Wed, 28 Nov 2007, Jeremy Fitzhardinge wrote:
>
>> Yes, I would like to convert x86_64 to match i386's percpu, and drop the
>> pda altogether.  The only thing preventing this is the stack canary, and
>> I'm wondering how much value there is in keeping it, given the
>> disadvantages of having this divergence between 32 and 64 bit.
>
> I think most of the PDA could be gotten rid of. The problems are
>
> 1. The stack canary

Yes, this is a biggie.  It needs one of:

    * fix gcc
    * post-process the .s file
    * drop support for stack-protector (does it really help?  do people
      use it?)

> 2. The PDA is used to store per cpu data before the per cpu areas
>    are setup.

I don't see the problem.  The way i386 does it inherently supports
per-cpu data very early on (it uses the prototype percpu section until
the real percpu values are set up).

> The i386 way of referring to per cpu data is not optimal because it is
> always offset by __per_cpu_start. per cpu data offsets need to be relative
> to the beginning of the per cpu area. per cpu data is less than 64k so 2
> byte offsets would be enough.

I don't see that that's terribly important.  percpu references aren't all
that common overall, and - at least on x86 - using a 16-bit offset
(assuming it's possible) would require a prefix anyway, so it would only
save 1 byte per reference.  But I can't convince gas to generate a 16-bit
offset anyway.

> That way the __per_cpu_offset array and the registers that are used on
> various platforms are pointing to the actual data and can be loaded
> directly into a register and then a load with a small offset to that
> register can be performed. On x86_64 this is gs, on i386 fs, on sparc g5,
> on ia64 a fixed address stands in for the register.

The asm used to generate these references is inherently arch-specific
anyway, so the type and size of offset needed from the per-cpu base
register to the data itself can be arch-dependent without loss of
generality.

I definitely see that small offsets might be useful for other
architectures, but for x86 it doesn't help and makes things more
complex.  The only difference between 32- and 64-bit is whether we
generate an offset from %fs, %gs or nothing (for the UP case).

> In loops over all per
> cpu variables this will also simplify the code.

Why's that?

> And ultimately we can get rid of the ugly RELOC_HIDE macro. It simply
> becomes the adding of the base address in a register to a per cpu offset.

I was never quite sure what that was for.

    J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
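For readers following the base-plus-offset argument in this message, here is a minimal user-space sketch of the idea: each CPU gets a copy of a template area, and a per-cpu reference is the CPU's base pointer plus an offset measured from the start of the area. Everything named here (`percpu_area`, `percpu_base`, `cpu_counter`, `SKETCH_NR_CPUS`) is hypothetical and invented for illustration; it is not kernel code, and the array of base pointers only stands in for a segment register or an offset table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count, for illustration only */

/* Layout of one per-cpu area; offsets are relative to its start. */
struct percpu_area {
	long counter;
	int state;
};

/* Initial values, playing the role of a prototype percpu section. */
static const struct percpu_area percpu_template = { .counter = 0, .state = 1 };

static struct percpu_area percpu_areas[SKETCH_NR_CPUS];

/* Stand-in for the per-cpu base register or offset array. */
static char *percpu_base[SKETCH_NR_CPUS];

/* Copy the template into every area and record each area's base. */
static void setup_percpu_areas(void)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		memcpy(&percpu_areas[cpu], &percpu_template,
		       sizeof(percpu_template));
		percpu_base[cpu] = (char *)&percpu_areas[cpu];
	}
}

/* A per-cpu reference: base address plus offset from the area start. */
static long *cpu_counter(int cpu)
{
	return (long *)(percpu_base[cpu] +
			offsetof(struct percpu_area, counter));
}

/* Looping over all CPUs is the same addition with a different base. */
static long sum_counters(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += *cpu_counter(cpu);
	return sum;
}
```

After `setup_percpu_areas()`, writing through `cpu_counter(1)` touches only CPU 1's copy, and `sum_counters()` walks every copy with the same offset. Whether the offset is 16 or 32 bits wide is an encoding detail the sketch does not model.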
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
On Thu, 29 Nov 2007, Andi Kleen wrote:

> On Wed, Nov 28, 2007 at 04:11:37PM -0800, Christoph Lameter wrote:
> > 1. The stack canary
>
> You would need to change gcc with a new option and only allow the stack
> checking when the compiler supports the new option. However the problem
> is still how to get a reasonable fixed offset. Or perhaps just change
> gcc to use a linker symbol relative to %gs that could be set to anything?

I still think we should leave the canary as is.
Re: void* arithmnetic
2007/11/29, Jan Engelhardt <[EMAIL PROTECTED]>:
>
> On Nov 29 2007 01:05, J.A. Magallón wrote:
> >
> >Since begin of the ages the build of the nvidia driver says things like
> >this:
> >
>
> Explicitly adding -Wpointer-arith to ones own Makefile is like
> admitting the code might be problematic. :->
>
>
> I think sizeof(void *) == 1 is taken as granted as sizeof(int) >= 4
> these days. Sigh.

sizeof(void *) == 4, sizeof(void)==1, :)
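The point of the correction above can be checked with a short sketch. It relies on the GNU C extension that permits arithmetic on `void *` by treating the pointed-to size as 1; ISO C does not define this, which is what -Wpointer-arith warns about. The function names are made up for illustration, and the pointer size itself (4 in the message) depends on the target, so it is not asserted here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * GNU C extension: adding 1 to a void * advances by one byte,
 * as if sizeof(void) were 1. Not valid ISO C.
 */
static ptrdiff_t void_step_bytes(void)
{
	int slots[2] = { 0, 0 };
	void *base = slots;
	void *next = base + 1;	/* moves 1 byte, not sizeof(int) bytes */

	return (char *)next - (char *)base;
}

/* Portable spelling of the same step: do the arithmetic on char *. */
static ptrdiff_t char_step_bytes(void)
{
	int slots[2] = { 0, 0 };
	char *base = (char *)slots;

	return (base + 1) - base;
}

/* For contrast, stepping an int * moves by sizeof(int) bytes. */
static ptrdiff_t int_step_bytes(void)
{
	int slots[2] = { 0, 0 };
	int *base = slots;

	return (char *)(base + 1) - (char *)base;
}
```

Casting to `char *` before the arithmetic gives the same one-byte step without depending on the extension.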
Re: [patch 05/14] percpu: Use a Kconfig variable to configure arch specific percpu setup
On Wed, Nov 28, 2007 at 04:11:37PM -0800, Christoph Lameter wrote:
> 1. The stack canary

You would need to change gcc with a new option and only allow the stack
checking when the compiler supports the new option. However the problem
is still how to get a reasonable fixed offset. Or perhaps just change
gcc to use a linker symbol relative to %gs that could be set to anything?

-Andi
Re: [PATCH 2/3] net/bonding: Return nothing for not applicable values
>The previous code returned '\n' (that is, a single empty line) >from most files, with one exception (xmit_hash_policy), where >it returned 'NA\n'. This patch consolidates each file to return >nothing at all if not applicable, not even a '\n'. > >I find this behaviour more usual, more useful, more efficient >and shorter to code from both sides. [...] >+ if ((bond->params.mode == BOND_MODE_XOR) || >+ (bond->params.mode == BOND_MODE_8023AD)) { > count = sprintf(buf, "%s %d\n", > xmit_hashtype_tbl[bond->params.xmit_policy].modename, > bond->params.xmit_policy); Rather than this (returning nothing if not in xor or 802.3ad mode), I'd prefer to see this always return whatever the xmit policy is (regardless of the mode), and remove the mode test from bonding_store_xmit_hash(). This would be consistent with the way the arp_ip_target option is treated: the actual value is always displayed, even if it is not used, and it is legal to change the value, regardless of the mode. Other than this, I'm fine with the changes. -J --- -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] [net/wireless/iwlwifi] : iwlwifi 3945 Fix race conditional panic.
On Wed, 2007-11-28 at 19:41 +0900, Joonwoo Park wrote:
> [net/wireless/iwlwifi] : iwlwifi 3945 Fix race conditional panic.
>
> Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>
> ---
> diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
> index 465da4f..ac6c4a9 100644
> --- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
> +++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
> @@ -8570,6 +8570,7 @@ static void iwl_pci_remove(struct pci_dev *pdev)
>  	IWL_DEBUG_INFO("*** UNLOAD DRIVER ***\n");
>
>  	mutex_lock(&priv->mutex);
> +	cancel_delayed_work_sync(&priv->init_alive_start);
>  	set_bit(STATUS_EXIT_PENDING, &priv->status);
>  	__iwl_down(priv);
>  	mutex_unlock(&priv->mutex);

Good catch. But it would be better to add it to
iwl_cancel_deferred_work().

Thanks,
-yi
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Wed, Nov 28, 2007 at 04:02:38PM -0800, Kristen Carlson Accardi wrote:
> On Wed, 28 Nov 2007 13:31:47 -0800
> Gary Hade <[EMAIL PROTECTED]> wrote:
>
> > FYI, the node contains 2 hotpluggable PCIe slots and 5
> > non-hotpluggable PCIe slots but 'pci_slot' only exposed
> > the 2 hotpluggable slots.  This does not appear to be due
> > to a 'pci_slot' driver problem since I looked at the DSDT
> > and SSDT and found that there are currently no _SUN methods
> > for the non-hotpluggable slots.
>
> Thanks for testing Gary.  I would think this situation would be the
> common case, since I doubt most firmware writers would bother to
> implement _SUN for non-hotpluggable slots -- at least on other DSDT
> I've seen this has been the case as well.

Yea, I was also not surprised, although features such as the one Alex is
working on may provide some motivation to change that.

Gary

--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503  IBM T/L: 775-4503
[EMAIL PROTECTED]
http://www.ibm.com/linux/ltc
Re: Error returns not handled correctly by sysfs.c:subsys_attr_store()
Andrew Patterson wrote:
> I tried with clean 2.6.24-rc3 and get the same bad behavior.  This is on
> an ia64 box, so maybe that is an issue. I can try on an x86 box as well.
> Oh, one other thing.  I tried a "uname -r" to make sure I had the
> correct kernel booted and got:
>
> # uname -r
> 2.6.24-rc3
> x
> y
> z
> #

Yeah, please try it on another machine from clean tree.  sysfs code is
definitely not endian dependent and is 64 bit clean.  Heck, all my test
machines run 64 bit these days.  I would be surprised if it's something
architecture dependent but please try on a different machine with
different userland with kernel built from fresh source tree.

Thanks.

--
tejun
[PATCH] asm-arm/{arch-omap,arch-ixp23xx}: parentheses around NR_IRQS definition
in include/asm-arm/arch-omap/board-innovator.h:40
#define NR_IRQS IH_BOARD_BASE + NR_FPGA_IRQS

in include/asm-arm/arch-ixp23xx/irqs.h:156:
#define NR_IRQS NR_IXP23XX_IRQS + NR_IXP23XX_MACH_IRQS

This could lead to problems when this definition is used in:

arch/ia64/sn/kernel/irq.c:516:
	sn_irq_lh = kmalloc(sizeof(struct list_head *) * NR_IRQS, GFP_KERNEL);
arch/x86/kernel/io_apic_32.c:693:
	irq_cpu_data[i].irq_delta = kmalloc(sizeof(unsigned long) * NR_IRQS, GFP_KERNEL);
694:	irq_cpu_data[i].last_irq = kmalloc(sizeof(unsigned long) * NR_IRQS, GFP_KERNEL);
699:	memset(irq_cpu_data[i].irq_delta,0,sizeof(unsigned long) * NR_IRQS);
700:	memset(irq_cpu_data[i].last_irq,0,sizeof(unsigned long) * NR_IRQS);
fs/proc/proc_misc.c:464:
	per_irq_sum = kzalloc(sizeof(unsigned int)*NR_IRQS, GFP_KERNEL);

I am not sure whether this definition actually is used in any of these
files. Am I being paranoid? Anyway, adding parentheses should be safe.
--
Add parentheses to prevent operator precedence errors

Signed-off-by: Roel Kluin <[EMAIL PROTECTED]>
---
diff --git a/include/asm-arm/arch-ixp23xx/irqs.h b/include/asm-arm/arch-ixp23xx/irqs.h
index e696395..27c5808 100644
--- a/include/asm-arm/arch-ixp23xx/irqs.h
+++ b/include/asm-arm/arch-ixp23xx/irqs.h
@@ -153,7 +153,7 @@
  */
 #define NR_IXP23XX_MACH_IRQS 32
 
-#define NR_IRQS NR_IXP23XX_IRQS + NR_IXP23XX_MACH_IRQS
+#define NR_IRQS (NR_IXP23XX_IRQS + NR_IXP23XX_MACH_IRQS)
 
 #define IXP23XX_MACH_IRQ(irq) (NR_IXP23XX_IRQ + (irq))
 
diff --git a/include/asm-arm/arch-omap/board-innovator.h b/include/asm-arm/arch-omap/board-innovator.h
index b3cf334..56d2c98 100644
--- a/include/asm-arm/arch-omap/board-innovator.h
+++ b/include/asm-arm/arch-omap/board-innovator.h
@@ -37,7 +37,7 @@
 #define OMAP1510P1_EMIFF_PRI_VALUE 0x00
 
 #define NR_FPGA_IRQS 24
-#define NR_IRQS IH_BOARD_BASE + NR_FPGA_IRQS
+#define NR_IRQS (IH_BOARD_BASE + NR_FPGA_IRQS)
 
 #ifndef __ASSEMBLY__
 void fpga_write(unsigned char val, int reg);
Re: Out of tree module using LSM
On Thu, Nov 29, 2007 at 01:53:46AM +0100, Jan Engelhardt wrote:
>
> On Nov 28 2007 16:38, Greg KH wrote:
> >>
> >> And if we are talking about the situation when files are written to
> >> in controlled way (i.e. we are not concerned with malware running on
> >> the box in question and just want to stop it from passing through
> >> mailsewer, etc.), then there's no damn need to play with LSM - just
> >> have e.g. coda with its commit-on-close and run the scanner on
> >> commit.  End of story.  Mind you, in such setups one would be much
> >> better off just having the mail server run the tests explicitly in
> >> the userland, along with the rest of anti-spam, etc. filters.
> >
> >I've repeated the above statements so many times to a number of the
> >anti-virus companies, and other people that really should know better,
> >that I'm really sick of it.  For some reason, they keep trying to do
> >things like this in the kernel, despite it being trivial to do in
> >userspace properly.
> >
>
> Do you mean something along the lines of FUSE?

That is one way, but not the simplest or nicest (people don't want to
run their whole fs on FUSE just yet).

The easiest way is as Al described above, just have the userspace
program that wrote the file to disk, check it then.  There are some nice
SAMBA plugins that do just that already out there...

thanks,

greg k-h
Re: [PATCH 4/3] net/bonding: Adhere to coding style: break line after the if condition
Ferenc Wágner wrote:
Signed-off-by: Ferenc Wágner <[EMAIL PROTECTED]>

Acked-by: Randy Dunlap <[EMAIL PROTECTED]>

Thanks.

---
Randy Dunlap <[EMAIL PROTECTED]> writes:

 drivers/net/bonding/bond_sysfs.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 5c31f5c..9de2c52 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -91,7 +91,8 @@ static ssize_t bonding_show_bonds(struct class *cls, char *buf)
 		}
 		res += sprintf(buf + res, "%s ", bond->dev->name);
 	}
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	up_read(&(bonding_rwsem));
 	return res;
 }
@@ -239,7 +240,8 @@ static ssize_t bonding_show_slaves(struct device *d,
 		res += sprintf(buf + res, "%s ", slave->dev->name);
 	}
 	read_unlock(&bond->lock);
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
@@ -705,7 +707,8 @@ static ssize_t bonding_show_arp_targets(struct device *d,
 			res += sprintf(buf + res, "%u.%u.%u.%u ",
 			       NIPQUAD(bond->params.arp_targets[i]));
 	}
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
--
~Randy
Re: Out of tree module using LSM
On Wed, Nov 28, 2007 at 12:42:52PM +, Tvrtko A. Ursulin wrote:
>
> Hi Linus, all,
>
> During one recent LKML discussion
> (http://marc.info/?l=linux-kernel&m=119267398722085&w=2) about LSM going
> static you called for LSM users to speak up.
>
> We here at Sophos (the fourth largest endpoint security vendor in the world)
> have such a module called Talpa which is a part of our main endpoint security
> product for Linux that protects from viruses and malware hosted on Linux,
> including those targetting Windows or other connected devices,
> (http://www.sophos.com/products/enterprise/endpoint/security-and-control/linux/index.html)
> which is GPL code and has been in the field for almost three years now. It's
> source code has been shipping with the product from the start. We also have
> a SourceForge project at http://sourceforge.net/projects/talpa/ to host it.
>
> In essence, what our module does is it intercepts file accesses and allows
> userspace daemons to vet them. One of the means we implemented that is
> through LSM and although it is not a perfect match for such use we prefer to
> use an official interface. Unfortunately, with time it became impossible to
> use LSM on some distributions (SELinux) so we had to implement other
> intercept methods which are significantly less nice, and which may also
> become unworkable over time.

Do you have a patch that shows the type of interface you would like to
see?  Like James stated, if you do not participate in the development
process, we have no way of knowing what you even want from the kernel.

What has kept you from submitting your code for inclusion in the main
kernel source tree?  Right now, your customers void their support
warranties if they run your software, as it can not be supported by the
distros as an out-of-tree kernel module.  I'm sure your customers would
like to not have this problem.

thanks,

greg k-h
Re: 2.6.24-rc3-mm2 - Build Failure on powerpc timerfd() undeclared
On Wednesday 28 November 2007 19:43:45 Andrew Morton wrote:
> > I guess all architectures except x86 are currently broken because they
> > reference the old sys_timerfd function.
>
> None of them were broken in my testing and I'm unsure why powerpc broke
> here.

PowerPC is unique in that it actually relies on the declarations in
include/{linux,asm}/syscalls.h to be present, because the spu_syscall_table
is generated from C code, not from assembly.

One reason why I did this was to be sure to find this exact type of
problem at compile-time, not at link time.

	Arnd <><
Re: Out of tree module using LSM
On Nov 28 2007 16:38, Greg KH wrote:
>>
>> And if we are talking about the situation when files are written to
>> in controlled way (i.e. we are not concerned with malware running on
>> the box in question and just want to stop it from passing through
>> mailsewer, etc.), then there's no damn need to play with LSM - just
>> have e.g. coda with its commit-on-close and run the scanner on
>> commit.  End of story.  Mind you, in such setups one would be much
>> better off just having the mail server run the tests explicitly in
>> the userland, along with the rest of anti-spam, etc. filters.
>
>I've repeated the above statements so many times to a number of the
>anti-virus companies, and other people that really should know better,
>that I'm really sick of it.  For some reason, they keep trying to do
>things like this in the kernel, despite it being trivial to do in
>userspace properly.
>

Do you mean something along the lines of FUSE?
Re: Out of tree module using LSM
On Nov 28 2007 18:22, [EMAIL PROTECTED] wrote:
>
>Talpa is modular itself being composed of a set of kernel modules of which
>not all are loaded simultaneously. Where possible LSM can be used and _no_
>messing with syscall table will take place. Unfortunately where another
>LSM user is present that won't work

SELinux supports chaining, so if talpa is loaded as a secondary
to selinux, where is the problem?

For those LSMs which do not support chaining (*cough* apparmor *cough*
be one, mtadm another), fix them.
2.6.23.9-rt12: BUGs
> I'll try rt12... > > Same problems in rt12, getting lots of "delay of xxx usecs exceeds > estimated spare time of ; restart" in jackd (on my T61 Lenovo laptop > running fc7). Does not happen with 2.6.22.10 + rt9. This is both with > the internal snd-hda-intel card and a pcmcia rme hdsp multiface. While trying out 2.6.23.9-rt12 I got the three attached bugs. Also attached is the output of dmesg for a clean boot on the machine. Jack displays timing problems, similar to when there were timing issues with dual processor machines. Still investigating as time permits. -- Fernando apparently while suspending --- Nov 27 20:06:01 localhost kernel: Stopping tasks ... done. Nov 27 20:06:01 localhost kernel: Suspending console(s) Nov 27 20:06:01 localhost kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache Nov 27 20:06:01 localhost kernel: sd 0:0:0:0: [sda] Stopping disk Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :15:00.2 disabled Nov 27 20:06:01 localhost kernel: eth%d: Going into suspend... 
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :03:00.0 disabled Nov 27 20:06:01 localhost pcscd: winscard_msg_srv.c:238:SHMProcessEventsContext() select returns with failure: Interrupted system call Nov 27 20:06:01 localhost pcscd: winscard_svc.c:222:ContextThread() Error in SHMProcessEventsContext Nov 27 20:06:01 localhost pcscd: winscard_msg_srv.c:238:SHMProcessEventsContext() select returns with failure: Interrupted system call Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1f.2 disabled Nov 27 20:06:01 localhost pcscd: winscard_svc.c:222:ContextThread() Error in SHMProcessEventsContext Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1d.7 disabled Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1d.2 disabled Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1d.1 disabled Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1d.0 disabled Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1b.0 disabled Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1a.7 disabled Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1a.1 disabled Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1a.0 disabled Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:19.0 disabled Nov 27 20:06:01 localhost kernel: Disabling non-boot CPUs ... 
Nov 27 20:06:01 localhost kernel: Breaking affinity for irq 218 Nov 27 20:06:01 localhost kernel: CPU 1 is now offline Nov 27 20:06:01 localhost kernel: SMP alternatives: switching to UP code Nov 27 20:06:01 localhost kernel: BUG: sleeping function called from invalid context pm-suspend(3740) at kernel/rtmutex.c:637 Nov 27 20:06:01 localhost gnome-power-manager: (nando) DBUS timed out, but recovering Nov 27 20:06:01 localhost kernel: in_atomic():0 [], irqs_disabled():1 Nov 27 20:06:01 localhost kernel: [] __rt_spin_lock+0x21/0x3d Nov 27 20:06:01 localhost kernel: [] free_pages_bulk+0x28/0x188 Nov 27 20:06:01 localhost kernel: [] __drain_pages+0x48/0x69 Nov 27 20:06:01 localhost kernel: [] page_alloc_cpu_notify+0x1e/0x3d Nov 27 20:06:01 localhost kernel: [] notifier_call_chain+0x2a/0x47 Nov 27 20:06:01 localhost kernel: [] raw_notifier_call_chain+0x17/0x1a Nov 27 20:06:01 localhost kernel: [] _cpu_down+0x184/0x242 Nov 27 20:06:01 localhost kernel: [] disable_nonboot_cpus+0x4e/0xd2 Nov 27 20:06:01 localhost kernel: [] acpi_sleep_prepare+0x41/0x48 Nov 27 20:06:01 localhost kernel: [] suspend_devices_and_enter+0x64/0x96 Nov 27 20:06:01 localhost kernel: [] enter_state+0x11b/0x193 Nov 27 20:06:01 localhost kernel: [] state_store+0x8e/0xa2 Nov 27 20:06:01 localhost kernel: [] state_store+0x0/0xa2 Nov 27 20:06:01 localhost kernel: [] subsys_attr_store+0x27/0x2b Nov 27 20:06:01 localhost kernel: [] sysfs_write_file+0xa6/0xd9 Nov 27 20:06:01 localhost kernel: [] sysfs_write_file+0x0/0xd9 Nov 27 20:06:01 localhost kernel: [] vfs_write+0xa8/0x15a Nov 27 20:06:01 localhost gnome-power-manager: (nando) Resuming computer Nov 27 20:06:01 localhost kernel: [] sys_write+0x41/0x67 Nov 27 20:06:01 localhost kernel: [] syscall_call+0x7/0xb Nov 27 20:06:01 localhost kernel: [] xfrm_send_policy_notify+0x44f/0x4f4 Nov 27 20:06:01 localhost NetworkManager: Waking up from sleep. Nov 27 20:06:01 localhost kernel: === Nov 27 20:06:01 localhost NetworkManager: Deactivating device eth1. 
Nov 27 20:06:01 localhost kernel: CPU1 is down Nov 27 20:06:01 localhost NetworkManager: eth1: Device is fully-supported using driver 'e1000'. Nov 27 20:06:01 localhost kernel: Intel machine check architecture supported. Nov 27 20:06:01 localhost NetworkManager: nm_device_init(): waiting for device's worker thread to start Nov 27 20:06:01 localhost kernel: Intel machine check reporting enabled on CPU#0.
[PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task
This generalizes the getreg32 and putreg32 functions so they can be used on
the current task, as well as on a task stopped in TASK_TRACED and switched
off.  This lays the groundwork to share this code for all kinds of
user-mode machine state access, not just ptrace.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/ia32/ptrace32.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/x86/ia32/ptrace32.c b/arch/x86/ia32/ptrace32.c
index c52d066..d5663e2 100644
--- a/arch/x86/ia32/ptrace32.c
+++ b/arch/x86/ia32/ptrace32.c
@@ -48,19 +48,27 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 		if (val && (val & 3) != 3)
 			return -EIO;
 		child->thread.fsindex = val & 0xffff;
+		if (child == current)
+			loadsegment(fs, child->thread.fsindex);
 		break;
 	case offsetof(struct user32, regs.gs):
 		if (val && (val & 3) != 3)
 			return -EIO;
 		child->thread.gsindex = val & 0xffff;
+		if (child == current)
+			load_gs_index(child->thread.gsindex);
 		break;
 	case offsetof(struct user32, regs.ds):
 		if (val && (val & 3) != 3)
 			return -EIO;
 		child->thread.ds = val & 0xffff;
+		if (child == current)
+			loadsegment(ds, child->thread.ds);
 		break;
 	case offsetof(struct user32, regs.es):
 		child->thread.es = val & 0xffff;
+		if (child == current)
+			loadsegment(es, child->thread.ds);
 		break;
 	case offsetof(struct user32, regs.ss):
 		if ((val & 3) != 3)
@@ -129,15 +137,23 @@ static int getreg32(struct task_struct *child, unsigned regno, u32 *val)
 	switch (regno) {
 	case offsetof(struct user32, regs.fs):
 		*val = child->thread.fsindex;
+		if (child == current)
+			asm("movl %%fs,%0" : "=r" (*val));
 		break;
 	case offsetof(struct user32, regs.gs):
 		*val = child->thread.gsindex;
+		if (child == current)
+			asm("movl %%gs,%0" : "=r" (*val));
 		break;
 	case offsetof(struct user32, regs.ds):
 		*val = child->thread.ds;
+		if (child == current)
+			asm("movl %%ds,%0" : "=r" (*val));
 		break;
 	case offsetof(struct user32, regs.es):
 		*val = child->thread.es;
+		if (child == current)
+			asm("movl %%es,%0" : "=r" (*val));
 		break;
 
 	R32(cs, cs);
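The selector check that recurs in these ptrace patches (`val && (val & 3) != 3`) can be read as: a null selector is accepted, and any other selector must request privilege level 3 in its two low bits. Below is a minimal user-space sketch of that rule. `check_user_selector()` is a hypothetical helper written for this illustration, not a kernel function, and the 16-bit mask is an assumption based on selectors being 16 bits wide.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Validate a data-segment selector supplied from user space.
 * Accepts 0 (null selector) or any value whose low two bits,
 * the requested privilege level, are both set (RPL 3).
 * On success, stores the selector truncated to 16 bits in *out.
 */
static int check_user_selector(uint32_t value, uint16_t *out)
{
	if (value && (value & 3) != 3)
		return -EIO;
	*out = (uint16_t)(value & 0xffff);
	return 0;
}
```

In the patches the stricter form `(val & 3) != 3`, without the null exemption, is applied to ss and cs; the sketch models only the data-segment variant.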
[PATCH x86/mm 5/6] x86-32 ptrace get/putreg current task
This generalizes the getreg and putreg functions so they can be used on the
current task, as well as on a task stopped in TASK_TRACED and switched off.
This lays the groundwork to share this code for all kinds of user-mode
machine state access, not just ptrace.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/ptrace_32.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/ptrace_32.c b/arch/x86/kernel/ptrace_32.c
index 5aca84e..2607130 100644
--- a/arch/x86/kernel/ptrace_32.c
+++ b/arch/x86/kernel/ptrace_32.c
@@ -55,6 +55,12 @@ static int putreg(struct task_struct *child,
 		if (value && (value & 3) != 3)
 			return -EIO;
 		child->thread.gs = value;
+		if (child == current)
+			/*
+			 * The user-mode %gs is not affected by
+			 * kernel entry, so we must update the CPU.
+			 */
+			loadsegment(gs, value);
 		return 0;
 	case DS:
 	case ES:
@@ -104,6 +110,8 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno)
 		break;
 	case GS:
 		retval = child->thread.gs;
+		if (child == current)
+			savesegment(gs, retval);
 		break;
 	case DS:
 	case ES:
[PATCH x86/mm 4/6] x86-64 ptrace get/putreg current task
This generalizes the getreg and putreg functions so they can be used on the
current task, as well as on a task stopped in TASK_TRACED and switched off.
This lays the groundwork to share this code for all kinds of user-mode
machine state access, not just ptrace.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/ptrace_64.c |   36 ++++++++++++++++++++++++++++++++++--
 1 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ptrace_64.c b/arch/x86/kernel/ptrace_64.c
index 2427548..5979dbe 100644
--- a/arch/x86/kernel/ptrace_64.c
+++ b/arch/x86/kernel/ptrace_64.c
@@ -67,21 +67,29 @@ static int putreg(struct task_struct *child,
 		if (value && (value & 3) != 3)
 			return -EIO;
 		child->thread.fsindex = value & 0xffff;
+		if (child == current)
+			loadsegment(fs, child->thread.fsindex);
 		return 0;
 	case offsetof(struct user_regs_struct,gs):
 		if (value && (value & 3) != 3)
 			return -EIO;
 		child->thread.gsindex = value & 0xffff;
+		if (child == current)
+			load_gs_index(child->thread.gsindex);
 		return 0;
 	case offsetof(struct user_regs_struct,ds):
 		if (value && (value & 3) != 3)
 			return -EIO;
 		child->thread.ds = value & 0xffff;
+		if (child == current)
+			loadsegment(ds, child->thread.ds);
 		return 0;
 	case offsetof(struct user_regs_struct,es):
 		if (value && (value & 3) != 3)
 			return -EIO;
 		child->thread.es = value & 0xffff;
+		if (child == current)
+			loadsegment(es, child->thread.es);
 		return 0;
 	case offsetof(struct user_regs_struct,ss):
 		if ((value & 3) != 3)
@@ -135,14 +143,32 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno)
 {
 	struct pt_regs *regs = task_pt_regs(child);
 	unsigned long val;
+	unsigned int seg;
 	switch (regno) {
 	case offsetof(struct user_regs_struct, fs):
+		if (child == current) {
+			/* Older gas can't assemble movq %?s,%r?? */
+			asm("movl %%fs,%0" : "=r" (seg));
+			return seg;
+		}
 		return child->thread.fsindex;
 	case offsetof(struct user_regs_struct, gs):
+		if (child == current) {
+			asm("movl %%gs,%0" : "=r" (seg));
+			return seg;
+		}
 		return child->thread.gsindex;
 	case offsetof(struct user_regs_struct, ds):
+		if (child == current) {
+			asm("movl %%ds,%0" : "=r" (seg));
+			return seg;
+		}
 		return child->thread.ds;
 	case offsetof(struct user_regs_struct, es):
+		if (child == current) {
+			asm("movl %%es,%0" : "=r" (seg));
+			return seg;
+		}
 		return child->thread.es;
 	case offsetof(struct user_regs_struct, fs_base):
 		/*
@@ -152,7 +178,10 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno)
 		 */
 		if (child->thread.fs != 0)
 			return child->thread.fs;
-		if (child->thread.fsindex != FS_TLS_SEL)
+		seg = child->thread.fsindex;
+		if (child == current)
+			asm("movl %%fs,%0" : "=r" (seg));
+		if (seg != FS_TLS_SEL)
 			return 0;
 		return get_desc_base(&child->thread.tls_array[FS_TLS]);
 	case offsetof(struct user_regs_struct, gs_base):
@@ -161,7 +190,10 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno)
 		 */
 		if (child->thread.gs != 0)
 			return child->thread.gs;
-		if (child->thread.gsindex != GS_TLS_SEL)
+		seg = child->thread.gsindex;
+		if (child == current)
+			asm("movl %%gs,%0" : "=r" (seg));
+		if (seg != GS_TLS_SEL)
 			return 0;
 		return get_desc_base(&child->thread.tls_array[GS_TLS]);
 	case offsetof(struct user_regs_struct, flags):
[PATCH x86/mm 3/6] x86-32 ptrace whitespace
This canonicalizes the indentation in the getreg and putreg functions. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- arch/x86/kernel/ptrace_32.c | 110 +- 1 files changed, 55 insertions(+), 55 deletions(-) diff --git a/arch/x86/kernel/ptrace_32.c b/arch/x86/kernel/ptrace_32.c index f81e2f1..5aca84e 100644 --- a/arch/x86/kernel/ptrace_32.c +++ b/arch/x86/kernel/ptrace_32.c @@ -51,37 +51,37 @@ static int putreg(struct task_struct *child, struct pt_regs *regs = task_pt_regs(child); regno >>= 2; switch (regno) { - case GS: - if (value && (value & 3) != 3) - return -EIO; - child->thread.gs = value; - return 0; - case DS: - case ES: - case FS: - if (value && (value & 3) != 3) - return -EIO; - value &= 0x; - break; - case SS: - case CS: - if ((value & 3) != 3) - return -EIO; - value &= 0x; - break; - case EFL: - value &= FLAG_MASK; - /* -* If the user value contains TF, mark that -* it was not "us" (the debugger) that set it. -* If not, make sure it stays set if we had. -*/ - if (value & X86_EFLAGS_TF) - clear_tsk_thread_flag(child, TIF_FORCED_TF); - else if (test_tsk_thread_flag(child, TIF_FORCED_TF)) - value |= X86_EFLAGS_TF; - value |= regs->flags & ~FLAG_MASK; - break; + case GS: + if (value && (value & 3) != 3) + return -EIO; + child->thread.gs = value; + return 0; + case DS: + case ES: + case FS: + if (value && (value & 3) != 3) + return -EIO; + value &= 0x; + break; + case SS: + case CS: + if ((value & 3) != 3) + return -EIO; + value &= 0x; + break; + case EFL: + value &= FLAG_MASK; + /* +* If the user value contains TF, mark that +* it was not "us" (the debugger) that set it. +* If not, make sure it stays set if we had. 
+*/ + if (value & X86_EFLAGS_TF) + clear_tsk_thread_flag(child, TIF_FORCED_TF); + else if (test_tsk_thread_flag(child, TIF_FORCED_TF)) + value |= X86_EFLAGS_TF; + value |= regs->flags & ~FLAG_MASK; + break; } *pt_regs_access(regs, regno) = value; return 0; @@ -94,26 +94,26 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno) regno >>= 2; switch (regno) { - case EFL: - /* -* If the debugger set TF, hide it from the readout. -*/ - retval = regs->flags; - if (test_tsk_thread_flag(child, TIF_FORCED_TF)) - retval &= ~X86_EFLAGS_TF; - break; - case GS: - retval = child->thread.gs; - break; - case DS: - case ES: - case FS: - case SS: - case CS: - retval = 0x; - /* fall through */ - default: - retval &= *pt_regs_access(regs, regno); + case EFL: + /* +* If the debugger set TF, hide it from the readout. +*/ + retval = regs->flags; + if (test_tsk_thread_flag(child, TIF_FORCED_TF)) + retval &= ~X86_EFLAGS_TF; + break; + case GS: + retval = child->thread.gs; + break; + case DS: + case ES: + case FS: + case SS: + case CS: + retval = 0x; + /* fall through */ + default: + retval &= *pt_regs_access(regs, regno); } return retval; } @@ -190,7 +190,7 @@ static int ptrace_set_debugreg(struct task_struct *child, * Make sure the single step bit is not set. */ void ptrace_disable(struct task_struct *child) -{ +{ user_disable_single_step(child); clear_tsk_thread_flag(child, TIF_SYSCALL_EMU); } @@ -203,7 +203,7 @@ long
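For readers following the putreg() cases rather than the indentation, the selector checks can be tried in user space. This is a minimal sketch under stated assumptions: selector_ok_data, selector_ok_code and selector_truncate are hypothetical names (not kernel functions), and the 0xffff mask is assumed from the 16-bit width of an x86 segment selector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Requested privilege level: the low two bits of an x86 segment selector. */
#define SEL_RPL_MASK 3u

/*
 * Hypothetical helper mirroring the DS/ES/FS/GS cases in putreg():
 * a null selector (0) is accepted, anything else must have RPL 3.
 */
static bool selector_ok_data(uint32_t value)
{
	return value == 0 || (value & SEL_RPL_MASK) == 3;
}

/* Hypothetical helper mirroring the SS/CS cases: RPL 3 is mandatory. */
static bool selector_ok_code(uint32_t value)
{
	return (value & SEL_RPL_MASK) == 3;
}

/* Selectors are 16 bits wide, so the upper bits are dropped (assumed mask). */
static uint16_t selector_truncate(uint32_t value)
{
	return (uint16_t)(value & 0xffffu);
}
```

In the patch, a failed check corresponds to returning -EIO before the value is stored.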
[PATCH x86/mm 2/6] x86-64 ptrace whitespace
This canonicalizes the indentation in the getreg and putreg functions. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- arch/x86/kernel/ptrace_64.c | 224 +- 1 files changed, 112 insertions(+), 112 deletions(-) diff --git a/arch/x86/kernel/ptrace_64.c b/arch/x86/kernel/ptrace_64.c index 56b31cd..2427548 100644 --- a/arch/x86/kernel/ptrace_64.c +++ b/arch/x86/kernel/ptrace_64.c @@ -2,7 +2,7 @@ /* * Pentium III FXSR, SSE support * Gareth Hughes <[EMAIL PROTECTED]>, May 2000 - * + * * x86-64 port 2000-2002 Andi Kleen */ @@ -48,7 +48,7 @@ * Make sure the single step bit is not set. */ void ptrace_disable(struct task_struct *child) -{ +{ user_disable_single_step(child); } @@ -63,69 +63,69 @@ static int putreg(struct task_struct *child, { struct pt_regs *regs = task_pt_regs(child); switch (regno) { - case offsetof(struct user_regs_struct,fs): - if (value && (value & 3) != 3) - return -EIO; - child->thread.fsindex = value & 0x; - return 0; - case offsetof(struct user_regs_struct,gs): - if (value && (value & 3) != 3) - return -EIO; - child->thread.gsindex = value & 0x; - return 0; - case offsetof(struct user_regs_struct,ds): - if (value && (value & 3) != 3) - return -EIO; - child->thread.ds = value & 0x; - return 0; - case offsetof(struct user_regs_struct,es): - if (value && (value & 3) != 3) - return -EIO; - child->thread.es = value & 0x; - return 0; - case offsetof(struct user_regs_struct,ss): - if ((value & 3) != 3) - return -EIO; - value &= 0x; - return 0; - case offsetof(struct user_regs_struct,fs_base): - if (value >= TASK_SIZE_OF(child)) - return -EIO; - /* -* When changing the segment base, use do_arch_prctl -* to set either thread.fs or thread.fsindex and the -* corresponding GDT slot. -*/ - if (child->thread.fs != value) - return do_arch_prctl(child, ARCH_SET_FS, value); - return 0; - case offsetof(struct user_regs_struct,gs_base): - /* -* Exactly the same here as the %fs handling above. 
-*/ - if (value >= TASK_SIZE_OF(child)) - return -EIO; - if (child->thread.gs != value) - return do_arch_prctl(child, ARCH_SET_GS, value); - return 0; - case offsetof(struct user_regs_struct,flags): - value &= FLAG_MASK; - /* -* If the user value contains TF, mark that -* it was not "us" (the debugger) that set it. -* If not, make sure it stays set if we had. -*/ - if (value & X86_EFLAGS_TF) - clear_tsk_thread_flag(child, TIF_FORCED_TF); - else if (test_tsk_thread_flag(child, TIF_FORCED_TF)) - value |= X86_EFLAGS_TF; - value |= regs->flags & ~FLAG_MASK; - break; - case offsetof(struct user_regs_struct,cs): - if ((value & 3) != 3) - return -EIO; - value &= 0x; - break; + case offsetof(struct user_regs_struct,fs): + if (value && (value & 3) != 3) + return -EIO; + child->thread.fsindex = value & 0x; + return 0; + case offsetof(struct user_regs_struct,gs): + if (value && (value & 3) != 3) + return -EIO; + child->thread.gsindex = value & 0x; + return 0; + case offsetof(struct user_regs_struct,ds): + if (value && (value & 3) != 3) + return -EIO; + child->thread.ds = value & 0x; + return 0; + case offsetof(struct user_regs_struct,es): + if (value && (value & 3) != 3) + return -EIO; + child->thread.es = value & 0x; +
Re: void* arithmetic
On Nov 29 2007 01:05, J.A. Magallón wrote:
>
>Since begin of the ages the build of the nvidia driver says things like
>this:
>

Explicitly adding -Wpointer-arith to one's own Makefile is like
admitting the code might be problematic. :->

I think sizeof(void) == 1 is taken for granted these days, just like
sizeof(int) >= 4. Sigh.
Re: Out of tree module using LSM
On Wed, Nov 28, 2007 at 06:30:40PM +0000, Al Viro wrote:
> On Wed, Nov 28, 2007 at 01:15:05PM -0500, [EMAIL PROTECTED] wrote:
> > (Note that the concept has interesting implications in the other
> > direction as well - rather than stopping you from reading a file that
> > has malware, you could in theory write an anti-export package that
> > would let you write onto external memory or outbound e-mail, but
> > prevent the write if it was corporate-sensitive data, or whatever.
>
> You _can_ _not_ do that. If shared mapping gets dirtied, you have no way to
> intercept that. At all. Especially since the page stays mapped while it is
> written out, so the next modification can come when hardware had already
> started outbound DMA and there's no way to abort it, no matter what your
> external scanner would do.
>
> Folks, really, that doesn't work. At all. You can intercept all system
> calls you want and it will not be enough to prevent the "bad" contents
> from hitting the disk.
>
> And if we are talking about the situation when files are written to in
> controlled way (i.e. we are not concerned with malware running on the box
> in question and just want to stop it from passing through mailsewer, etc.),
> then there's no damn need to play with LSM - just have e.g. coda with its
> commit-on-close and run the scanner on commit. End of story. Mind you,
> in such setups one would be much better off just having the mail server run
> the tests explicitly in the userland, along with the rest of anti-spam, etc.
> filters.

I've repeated the above statements so many times to a number of the
anti-virus companies, and other people that really should know better,
that I'm really sick of it.

For some reason, they keep trying to do things like this in the kernel,
despite it being trivial to do in userspace properly. In the end, I
even got one company to agree that it should be done in userspace
(McAfee), but they ignored this and went off to update their kernel code
again :(

Just because other operating systems require you to do things like this
within the kernel, doesn't mean that you have to do the same thing on
Linux...

so sad,

greg k-h
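The point about shared mappings can be seen from user space. This is a minimal sketch only, assuming a POSIX system with a writable /tmp: dirty_via_shared_mapping and the scratch file name are hypothetical, and the msync() call is there just to make the readback deterministic. The file content changes without a single write(2) call, so a scanner hooked on the write path sees nothing.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Dirty a file through a MAP_SHARED mapping and read the bytes back
 * through the file descriptor.  Returns 0 if the store became visible
 * in the file, -1 on any error.  No write(2) call is ever issued.
 */
static int dirty_via_shared_mapping(void)
{
	char path[] = "/tmp/shmap-demo-XXXXXX";	/* hypothetical scratch file */
	char readback[6] = { 0 };
	int ret = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);			/* file lives only as long as fd */

	if (ftruncate(fd, 4096) == 0) {
		char *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
				 MAP_SHARED, fd, 0);

		if (map != MAP_FAILED) {
			memcpy(map, "hello", 5);	/* plain store, no syscall */
			if (msync(map, 4096, MS_SYNC) == 0 &&
			    pread(fd, readback, 5, 0) == 5 &&
			    memcmp(readback, "hello", 5) == 0)
				ret = 0;
			munmap(map, 4096);
		}
	}
	close(fd);
	return ret;
}
```

The memcpy() is an ordinary memory store; the kernel only learns that the page is dirty after the fact, which is why intercepting system calls cannot gate such a modification.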
[PATCH x86/mm 1/6] x86-64 ia32 ptrace pt_regs cleanup
This cleans up the getreg32/putreg32 functions to use struct pt_regs in a
straightforward fashion, instead of equivalent ugly pointer arithmetic.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/ia32/ptrace32.c |   21 +++++++++------------
 1 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/x86/ia32/ptrace32.c b/arch/x86/ia32/ptrace32.c
index 1e382e3..c52d066 100644
--- a/arch/x86/ia32/ptrace32.c
+++ b/arch/x86/ia32/ptrace32.c
@@ -37,11 +37,11 @@
 
 #define R32(l,q) \
 	case offsetof(struct user32, regs.l): \
-		stack[offsetof(struct pt_regs, q) / 8] = val; break
+		regs->q = val; break;
 
 static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 {
-	__u64 *stack = (__u64 *)task_pt_regs(child);
+	struct pt_regs *regs = task_pt_regs(child);
 
 	switch (regno) {
 	case offsetof(struct user32, regs.fs):
@@ -65,12 +65,12 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 	case offsetof(struct user32, regs.ss):
 		if ((val & 3) != 3)
 			return -EIO;
-		stack[offsetof(struct pt_regs, ss)/8] = val & 0xffff;
+		regs->ss = val & 0xffff;
 		break;
 	case offsetof(struct user32, regs.cs):
 		if ((val & 3) != 3)
 			return -EIO;
-		stack[offsetof(struct pt_regs, cs)/8] = val & 0xffff;
+		regs->cs = val & 0xffff;
 		break;
 
 	R32(ebx, bx);
@@ -84,9 +84,7 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 	R32(eip, ip);
 	R32(esp, sp);
 
-	case offsetof(struct user32, regs.eflags): {
-		__u64 *flags = &stack[offsetof(struct pt_regs, flags)/8];
-
+	case offsetof(struct user32, regs.eflags):
 		val &= FLAG_MASK;
 		/*
 		 * If the user value contains TF, mark that
@@ -97,9 +95,8 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 			clear_tsk_thread_flag(child, TIF_FORCED_TF);
 		else if (test_tsk_thread_flag(child, TIF_FORCED_TF))
 			val |= X86_EFLAGS_TF;
-		*flags = val | (*flags & ~FLAG_MASK);
+		regs->flags = val | (regs->flags & ~FLAG_MASK);
 		break;
-	}
 
 	case offsetof(struct user32, u_debugreg[0]) ...
 		offsetof(struct user32, u_debugreg[7]):
@@ -123,11 +120,11 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 
 #define R32(l,q) \
 	case offsetof(struct user32, regs.l): \
-		*val = stack[offsetof(struct pt_regs, q)/8]; break
+		*val = regs->q; break
 
 static int getreg32(struct task_struct *child, unsigned regno, u32 *val)
 {
-	__u64 *stack = (__u64 *)task_pt_regs(child);
+	struct pt_regs *regs = task_pt_regs(child);
 
 	switch (regno) {
 	case offsetof(struct user32, regs.fs):
@@ -160,7 +157,7 @@ static int getreg32(struct task_struct *child, unsigned regno, u32 *val)
 		/*
 		 * If the debugger set TF, hide it from the readout.
 		 */
-		*val = stack[offsetof(struct pt_regs, flags)/8];
+		*val = regs->flags;
 		if (test_tsk_thread_flag(child, TIF_FORCED_TF))
 			*val &= ~X86_EFLAGS_TF;
 		break;
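The equivalence the changelog relies on can be checked outside the kernel. The following is a minimal sketch, assuming a structure made only of 64-bit slots: struct demo_regs and the helper names are hypothetical stand-ins and do not reproduce the real struct pt_regs layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs: every slot is 64 bits wide. */
struct demo_regs {
	uint64_t bx;
	uint64_t ip;
	uint64_t cs;
	uint64_t flags;
	uint64_t sp;
	uint64_t ss;
};

/* Old style: treat the structure as an array of 64-bit slots. */
static void set_cs_by_index(struct demo_regs *regs, uint32_t val)
{
	uint64_t *stack = (uint64_t *)regs;

	stack[offsetof(struct demo_regs, cs) / 8] = val & 0xffff;
}

/* New style: name the field and let the compiler compute the offset. */
static void set_cs_by_field(struct demo_regs *regs, uint32_t val)
{
	regs->cs = val & 0xffff;
}

/* Returns 1 when both styles leave the structure in the same state. */
static int styles_agree(uint32_t val)
{
	struct demo_regs a = { 0 }, b = { 0 };

	set_cs_by_index(&a, val);
	set_cs_by_field(&b, val);
	return a.cs == b.cs && a.cs == (val & 0xffff) &&
	       a.ip == 0 && b.ip == 0;
}
```

Both forms store to the same slot; the field form simply stays correct if a member is ever resized or moved, which is the readability argument made for the cleanup.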
Re: [BUG] Oops in USB / dev code plugging/unplugging multi flash reader
On Wed, Nov 28, 2007 at 06:17:39PM -0500, Mark Lord wrote:
> Greg KH wrote:
>> On Wed, Nov 28, 2007 at 03:02:35PM -0500, Mark Lord wrote:
>>> While testing a new USB reader/cable today,
>>> I was plugging/unplugging the USB multi-flash reader (22 in 1),
>>> and produced this weird oops.
>>>
>>> There's a locking problem in there somewhere, Greg.
>>>
>>> 2.6.23.8
>> Can you duplicate this without the closed source ATI graphics driver
>> loaded?
> ..
>
> I don't know if I can reproduce it easily regardless.
> But that fglrx module has ZERO users, so it was completely benign here
> (I've now deleted it from my system).
>
> The tracebacks clearly show USB/dev error.

I'm not disagreeing, but I've seen some very strange crap over the years
come from those closed source video drivers so I do not trust them at
all.

If you can reproduce this without it loaded, please send the new oops
message to the linux-usb mailing list and the developers there will be
glad to work with you to track this down.

Oh, can you also reproduce it with 2.6.24-rc3?

thanks,

greg k-h
Re: [patch 1/1] Writeback fix for concurrent large and small file writes
On Wed, Nov 28, 2007 at 11:29:57AM -0800, Michael Rubin wrote:
> >From [EMAIL PROTECTED] Wed Nov 28 11:10:06 2007
> Message-Id: <[EMAIL PROTECTED]>
> Date: Wed, 28 Nov 2007 11:01:21 -0800
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [patch 1/1] Writeback fix for concurrent large and small file writes.
>
> From: Michael Rubin <[EMAIL PROTECTED]>
>
> Fixing a bug where writing to large files while concurrently writing to
> smaller ones creates a situation where writeback cannot keep up with the

Could you demonstrate the situation? Or if I guess it right, could it be
fixed by the following patch? (not a nack: If so, your patch could also
be considered as a general purpose improvement, instead of a bug fix.)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 0fca820..62e62e2 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -301,7 +301,7 @@ __sync_single_inode(struct inode *inode, struct writeback_control *wbc)
 				 * Someone redirtied the inode while were writing back
 				 * the pages.
 				 */
-				redirty_tail(inode);
+				requeue_io(inode);
 			} else if (atomic_read(&inode->i_count)) {
 				/*
 				 * The inode is clean, inuse

Thank you,
Fengguang

> traffic and memory baloons until the we hit the threshold watermark. This
> can result in surprising latency spikes when syncing. This latency
> can take minutes on large memory systems. Upon request I can provide
> a test to reproduce this situation. The flush tree fixes this issue and
> fixes several other minor issues with fairness also.
>
> 1) Adding a data structure to guarantee fairness when writing inodes
> to disk. The flush_tree is based on an rbtree. The only difference is
> how duplicate keys are chained off the same rb_node.
>
> 2) Added a FS flag to mark file systems that are not disk backed so we
> don't have to flush them. Not sure I marked all of them. But just marking
> these improves writeback performance.
>
> 3) Added an inode flag to allow inodes to be marked so that they are
> never written back to disk. See get_pipe_inode.
>
> Under autotest this patch has passed: fsx, bonnie, and iozone. I am
> currently writing more writeback focused tests (which so far have been
> passed) to add into autotest.
>
> Signed-off-by: Michael Rubin <[EMAIL PROTECTED]>
> ---
Re: How to map user space's virtual memory into kernel logical address space
Maitre Bart wrote:
> A given app is allocating a large amount of memory (~10M) with malloc().
> It passes this pointer to the kernel (device driver) via an custom ioctl.
> I would like the driver to work on that memory with a pointer (as if it
> was allocated with vmalloc) as well as the user space too (upon return of
> the syscall).
>
> Is there a way to map a user space's virtual memory range into the kernel
> logical address space?
>
> As far as I learned from my readings, using the user-space pointer
> directly in kernel space will not work. Of course, copy_from_user() is
> out of question for efficiency purposes.
>
> ioremap() is pretty close to what I wish to do except that it accepts a
> physical address and I don't how to get it from a user space pointer. And
> since a physical address is required, I assume the range is considered
> contiguous, which is not really the case for malloc().
>
> mmap()/remap_pfn_range() are interesting but I don't know how to get a
> kernel pointer out of them.
>
> kmap() does the job for a single page (and anyway, I wouldn't know how to
> feed it with a struct page from the userland pointer).
>
> get_user_pages() looks promising but it seems I have to call kmap() on
> each page, so it looks like I cannot operate on the buffer with a single
> pointer.
>
> Does any one know if it is possible? And if so, how can I do it?

10MB is an awfully big mapping to put into kernel virtual memory space.
I suspect it might be easier to allocate the memory in the kernel and
map it in from userspace, but then you have the same problem (and 10MB
is awfully big for vmalloc).

Is there a good reason why you have to be able to do this? There's
likely a better way.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
[PATCH] (2.6.24-rc3-mm2) -mm Update CAP_LAST_CAP to reflect CAP_MAC_ADMIN
From: Casey Schaufler <[EMAIL PROTECTED]>

Bump the value of CAP_LAST_CAP to reflect the current last cap value.
It appears that the patch that introduced CAP_LAST_CAP and the patch
that introduced CAP_MAC_ADMIN came in more or less at the same time.

Signed-off-by: Casey Schaufler <[EMAIL PROTECTED]>

---

 include/linux/capability.h |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff -uprN -X linux-2.6.24-rc3-mm2-base/Documentation/dontdiff linux-2.6.24-rc3-mm2-base/include/linux/capability.h linux-2.6.24-rc3-mm2-lastcap/include/linux/capability.h
--- linux-2.6.24-rc3-mm2-base/include/linux/capability.h	2007-11-27 16:47:02.0 -0800
+++ linux-2.6.24-rc3-mm2-lastcap/include/linux/capability.h	2007-11-28 14:04:57.0 -0800
@@ -315,10 +315,6 @@ typedef struct kernel_cap_struct {
 
 #define CAP_SETFCAP	     31
 
-#define CAP_LAST_CAP         CAP_SETFCAP
-
-#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
-
 /* Override MAC access.
    The base kernel enforces no MAC policy.
    An LSM may enforce a MAC policy, and if it does and it chooses
@@ -336,6 +332,10 @@ typedef struct kernel_cap_struct {
 
 #define CAP_MAC_ADMIN        33
 
+#define CAP_LAST_CAP         CAP_MAC_ADMIN
+
+#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
+
 /*
  * Bit location of each capability (used by user-space library and kernel)
  */
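The effect of the stale bound is easy to show in isolation. This is a minimal sketch, not kernel code: cap_valid_with_last is a hypothetical helper that takes the upper bound as a parameter so both the old and the new value of CAP_LAST_CAP can be compared; the two capability numbers are the ones visible in the diff.

```c
#include <stdbool.h>

/* Capability numbers as they appear in the patch. */
#define CAP_SETFCAP	31
#define CAP_MAC_ADMIN	33

/*
 * Hypothetical helper: the cap_valid() range test with the upper bound
 * passed in explicitly instead of being taken from CAP_LAST_CAP.
 */
static bool cap_valid_with_last(int cap, int last_cap)
{
	return cap >= 0 && cap <= last_cap;
}
```

With the old bound, cap_valid_with_last(CAP_MAC_ADMIN, CAP_SETFCAP) is false, so the newest capability would be rejected as invalid; with the bumped bound it is accepted.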
Re: void* arithmetic
On Thu, Nov 29, 2007 at 01:05:31AM +0100, J.A. Magallón wrote:
> Hi all...
>
> Since begin of the ages the build of the nvidia driver says things like
> this:
>
> include/asm/compat.h:210: warning: pointer of type 'void *' used in arithmetic
>
> There are several of this warnings. The code in question for this example
> is:
>
> static __inline__ void __user *compat_alloc_user_space(long len)
> {
>     struct pt_regs *regs = task_pt_regs(current);
>     return (void __user *)regs->rsp - len;
> }
>
> As this is dealing with mem blocks, I suppose it's counting in bytes, so
> we could do something like:
>
>     return (void __user *)((u8*)regs->rsp - len);
>
> so the arithmetic knows how to inc/dec for each unity...
> I think the warning is correct and that void* arithmetic is undefined in C,
> isn't it ?

Yes, but not in gcc, the language the kernel is written in 8)

It is allowed and the size of a void is 1. -Wpointer-arith only warns
about this; the arithmetic itself still works.

[EMAIL PROTECTED] ~]$ cat voidptr.c
#include <stdio.h>

int main(int argc, char *argv[])
{
	void *ptr = argv[argc - 1];

	puts(ptr + 4);
	return 0;
}
[EMAIL PROTECTED] ~]$ gcc -Wall voidptr.c -o voidptr
[EMAIL PROTECTED] ~]$ ./a Magallón
llón
[EMAIL PROTECTED] ~]$ gcc -Wall -Wpointer-arith voidptr.c -o voidptr
voidptr.c: In function ‘main’:
voidptr.c:7: warning: pointer of type ‘void *’ used in arithmetic
[EMAIL PROTECTED] ~]$ ./a Magallón
llón
[EMAIL PROTECTED] ~]$

- Arnaldo
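For code that has to build cleanly with -Wpointer-arith, or with a compiler that lacks the GNU extension, the usual approach is to go through a character pointer, as suggested in the quoted mail. This is a minimal sketch: byte_offset and char_at are hypothetical helper names used only for illustration.

```c
#include <stddef.h>

/*
 * Advance a generic pointer by a byte count without relying on the GNU
 * void-pointer extension: cast to a character pointer first, which is
 * what ISO C requires for byte-wise arithmetic.
 */
static const void *byte_offset(const void *base, ptrdiff_t bytes)
{
	return (const char *)base + bytes;
}

/* Small usage helper: read the character that sits `bytes` into a string. */
static char char_at(const char *s, ptrdiff_t bytes)
{
	return *(const char *)byte_offset(s, bytes);
}
```

Because sizeof(char) is 1 by definition, this computes the same address that gcc produces for arithmetic on a void pointer.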
[PATCH 4/3] net/bonding: Adhere to coding style: break line after the if condition
Signed-off-by: Ferenc Wágner <[EMAIL PROTECTED]>
---
Randy Dunlap <[EMAIL PROTECTED]> writes:

> Wagner Ferenc wrote:
>> Randy Dunlap <[EMAIL PROTECTED]> writes:
>>
>>> Patches 1 & 3 use
>>>
>>> 	if (res) statement;
>>>
>>> but the preferred form is
>>>
>>> 	if (res)
>>> 		statement;
>>>
>>> Even if this style was already used in the source file, it should
>>> be cleaned up.
>>
>> No principal problem. So that I learn something useful: how should I
>> go about this? I created the patches with git-format-patch, and they
>> depend on each other, so I'd rather not git-reset, if possible...
>>
>> Can I just create a follow-up patch which fixes this stylistic issue?
>
> That's OK with me. I can't say how it might be done with git.

 drivers/net/bonding/bond_sysfs.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 5c31f5c..9de2c52 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -91,7 +91,8 @@ static ssize_t bonding_show_bonds(struct class *cls, char *buf)
 		}
 		res += sprintf(buf + res, "%s ", bond->dev->name);
 	}
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	up_read(&(bonding_rwsem));
 	return res;
 }
@@ -239,7 +240,8 @@ static ssize_t bonding_show_slaves(struct device *d,
 		res += sprintf(buf + res, "%s ", slave->dev->name);
 	}
 	read_unlock(&bond->lock);
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
@@ -705,7 +707,8 @@ static ssize_t bonding_show_arp_targets(struct device *d,
 			res += sprintf(buf + res, "%u.%u.%u.%u ",
 				       NIPQUAD(bond->params.arp_targets[i]));
 	}
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
-- 
1.4.4.4
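The idiom being restyled in all three hunks (append each item followed by a space, then turn the last space into a newline) can be sketched in user space. This is a minimal illustration only: format_name_list is a hypothetical helper, not part of bond_sysfs.c, and it assumes the caller passes a buffer that is large enough, much as the sysfs show routines assume a page-sized buffer.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the names separated by single spaces and
 * end the line with '\n' instead of a trailing space.  Returns the
 * number of bytes written (not counting the terminating NUL).  The
 * caller must supply a buffer that is large enough.
 */
static int format_name_list(char *buf, const char *const *names, size_t count)
{
	int res = 0;
	size_t i;

	for (i = 0; i < count; i++)
		res += sprintf(buf + res, "%s ", names[i]);
	if (res)
		buf[res - 1] = '\n';	/* eat the leftover space */
	return res;
}
```

The `if (res)` guard matters: with an empty list nothing was written, and buf[res - 1] would index one byte before the buffer.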
Re: named + capset = EPERM [Was: 2.6.24-rc3-mm2]
Quoting Serge E. Hallyn ([EMAIL PROTECTED]):
> Quoting Serge E. Hallyn ([EMAIL PROTECTED]):
> > Quoting Casey Schaufler ([EMAIL PROTECTED]):
> > >
> > > --- Jiri Slaby <[EMAIL PROTECTED]> wrote:
> > >
> > > > On 11/28/2007 12:41 PM, Andrew Morton wrote:
> > > > >
> > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm2/
> > > > [...]
> > > > > +capabilities-introduce-per-process-capability-bounding-set.patch
> > > >
> > > > A regression against -mm1. This patch breaks bind (9.5.0-18.a7.fc8):
> > > > capset(0x19980330, 0,
> > > > {CAP_DAC_READ_SEARCH|CAP_SETGID|CAP_SETUID|CAP_NET_BIND_SERVICE|CAP_SYS_CHROOT|CAP_SYS_RESOURCE,
> > > > CAP_DAC_READ_SEARCH|CAP_SETGID|CAP_SETUID|CAP_NET_BIND_SERVICE|CAP_SYS_CHROOT|CAP_SYS_RESOURCE,
> > > > 0}) = -1 EPERM (Operation not permitted)
> > > >
> > > > $ grep SEC .config
> > > > CONFIG_SECCOMP=y
> > > > # CONFIG_NETWORK_SECMARK is not set
> > > > CONFIG_RPCSEC_GSS_KRB5=m
> > > > # CONFIG_RPCSEC_GSS_SPKM3 is not set
> > > > # CONFIG_SECURITY is not set
> > > > # CONFIG_SECURITY_FILE_CAPABILITIES is not set
> > > >
> > > > probably this hunk?:
> > > > @@ -133,6 +119,12 @@ int cap_capset_check (struct task_struct
> > > >  		/* incapable of using this inheritable set */
> > > >  		return -EPERM;
> > > >  	}
> > > > +	if (!!cap_issubset(*inheritable,
> > > > +			   cap_combine(target->cap_inheritable,
> > > > +				       current->cap_bset))) {
> > > > +		/* no new pI capabilities outside bounding set */
> > > > +		return -EPERM;
> > > > +	}
> >
> > That shouldn't be it, since you can't lower cap_bset since
> > CONFIG_SECURITY_FILE_CAPABILITIES=n.
>
> Hmm, but sure enough that appears to be it.
>
> Still trying to figure out why.

No. Seriously. You're kidding me.

Patch attached :(

Thanks for spotting this, Jiri. I don't know where I introduced this
since I thought all my tests had passed...

thanks,
-serge

>From 70d5da610fdbd66a36886c01e27b7fb11d2de044 Mon Sep 17 00:00:00 2001
From: [EMAIL PROTECTED] <[EMAIL PROTECTED](none)>
Date: Wed, 28 Nov 2007 16:16:23 -0800
Subject: [PATCH 1/1] capabilities: correct logic at capset_check

Fix typo at capset_check introduced with capability bounding set patch.

Signed-off-by: [EMAIL PROTECTED] <[EMAIL PROTECTED](none)>
---
 security/commoncap.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/security/commoncap.c b/security/commoncap.c
index c25ad09..503e958 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -119,7 +119,7 @@ int cap_capset_check (struct task_struct *target, kernel_cap_t *effective,
 		/* incapable of using this inheritable set */
 		return -EPERM;
 	}
-	if (!!cap_issubset(*inheritable,
+	if (!cap_issubset(*inheritable,
 			   cap_combine(target->cap_inheritable,
 				       current->cap_bset))) {
 		/* no new pI capabilities outside bounding set */
-- 
1.5.1
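Why one stray "!" is enough to break named can be reproduced with plain bitmasks. This is a minimal sketch under a simplifying assumption: a capability set is modelled as a single 64-bit mask, and capset_t, set_issubset, set_combine, check_fixed and check_buggy are hypothetical names, not the kernel's kernel_cap_t API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplification: a capability set is one 64-bit mask. */
typedef uint64_t capset_t;

/* True when every bit of a is also set in b. */
static bool set_issubset(capset_t a, capset_t b)
{
	return (a & ~b) == 0;
}

/* Union of two sets. */
static capset_t set_combine(capset_t a, capset_t b)
{
	return a | b;
}

/* The check as corrected by the patch: reject bits outside pI | bset. */
static int check_fixed(capset_t new_pi, capset_t old_pi, capset_t bset)
{
	if (!set_issubset(new_pi, set_combine(old_pi, bset)))
		return -EPERM;
	return 0;
}

/* The check as it was in -mm2: "!!" leaves the result un-negated. */
static int check_buggy(capset_t new_pi, capset_t old_pi, capset_t bset)
{
	if (!!set_issubset(new_pi, set_combine(old_pi, bset)))
		return -EPERM;
	return 0;
}
```

With the double negation the test is inverted: every legitimate request (a subset of the allowed bits) gets -EPERM, which matches the capset() failure in the report, while a request that exceeds the bounding set would be let through.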