Re: [patch 14/21] Xen-paravirt: Add XEN config options and disable unsupported config options.
On 02/15/2007 11:04 PM, Jeremy Fitzhardinge wrote:
> HZ - I'm assuming dynticks will appear in the short term, and this will
> become moot

Doesn't Xen send any non-blocked domain a 100Hz alarm implicitly, without any way for the guest to disable it? I guess you'll have to break kernel/hypervisor compatibility if you want dynticks?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 12/21] Xen-paravirt: Allocate and free vmalloc areas
On Thu, 15 Feb 2007 23:30:57 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> > If you really need to run atomically, that gets ugly. Even if one were to
> > run handle_mm_fault() by hand, it still needs to allocate memory.
> >
> > Two ugly options might be:
> >
> > a) touch all the pages, then go atomic, then touch them all again. If
> >    one of them faults (ie: you raced with swapout) then go back and try
> >    again. Obviously susceptible to livelocking.
> >
> > b) Do get_user_pages() against all the pages, then go atomic, then do
> >    put_page() against them all. Of course, they can immediately get
> >    swapped out.
> >
> > But that function's already racy against swapout and I guess it works OK.
> > I don't have a clue what it is actually trying to do, so I'm guessing madly
> > here.
>
> It's for populating the pagetable in a vmalloc area. There's magic in
> the fault handler to synchronize the vmalloc mappings between different
> processes' kernel mappings, so if the mapping isn't currently present, it
> will fault and create the appropriate mapping. It's not operating on
> swappable user memory, so swapping isn't an issue; but if the fault
> handler exits immediately with preempt disabled, then there's a problem.

Oh, I see. The vmalloc fault can run atomically. In fact it can run at hard IRQ. So no probs (apart from the fact that it required an email dialogue to work this out rather than reading the code, but I do go on).
Re: [patch 12/21] Xen-paravirt: Allocate and free vmalloc areas
Andrew Morton wrote:
> On Thu, 15 Feb 2007 23:08:02 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
>
>> Andrew Morton wrote:
>>
>>> This won't work when CONFIG_PREEMPT=y. The pagefault handler will see
>>> in_atomic() and will scram.
>>
>> Is there some other way to get the pagetable populated for the address
>> range?
>
> If you really need to run atomically, that gets ugly. Even if one were to
> run handle_mm_fault() by hand, it still needs to allocate memory.
>
> Two ugly options might be:
>
> a) touch all the pages, then go atomic, then touch them all again. If
>    one of them faults (ie: you raced with swapout) then go back and try
>    again. Obviously susceptible to livelocking.
>
> b) Do get_user_pages() against all the pages, then go atomic, then do
>    put_page() against them all. Of course, they can immediately get
>    swapped out.
>
> But that function's already racy against swapout and I guess it works OK.
> I don't have a clue what it is actually trying to do, so I'm guessing madly
> here.

Well, it's only operating on kernel pages, and against a vmalloc region. So it would be immune to any sort of unmapping or swapout. Andrew's option a) should work.

What's this for, and how is it not Xen specific if nothing else in the tree needs such a weird hack?

--
SUSE Labs, Novell Inc.
Re: [PATCH] i386: irq: Kill IRQ compression
Len Brown <[EMAIL PROTECTED]> writes:

> This code makes simple systems complex:
>
> ACPI: PCI Interrupt :03:04.0[A] -> GSI 18 (level, low) -> IRQ 16
> ACPI: PCI Interrupt :00:1d.0[A] -> GSI 16 (level, low) -> IRQ 17
> ACPI: PCI Interrupt :00:1d.1[B] -> GSI 19 (level, low) -> IRQ 18
> ACPI: PCI Interrupt :00:1d.7[D] -> GSI 23 (level, low) -> IRQ 19
>
> The same code was already removed from x86_64

By itself I don't think we are going to observe any real problems with this patch. However, if we are going to be serious about this we need to do a few more things.

- kill ioapic_renumber_irq.
- Increase NR_IRQS. We will still be limited to about 208 interrupts in use at one time, but we can allow more irq sources to be described.

This reminds me. I really need to dig up my patch that doesn't allocate a vector for an irq until request irq time.

Eric
Re: [patch 14/21] Xen-paravirt: Add XEN config options and disable unsupported config options.
Zachary Amsden <[EMAIL PROTECTED]> writes:

> We do support different HZ values, although 100HZ is actually preferable for
> us, so I don't object to that. PREEMPT is supported by us, but not as tested
> as I would like, so I also don't object to dropping it for generic paravirt
> guests - Rusty - Avi any objections to dropping preempt in terms of
> lguest/KVM paravirtualization?
>
> Paravirt-ops definitely needs a hook for kexec, although we should not disable
> kexec for the natively booted paravirt-ops. Eric - is there a way to disable
> it at runtime?
>
> We do support the doublefault task gate, and it would be good to keep it, but
> I can't complain so much if it is gone from generic paravirt kernels for now,
> because it is non-essential, and generally fatal anyway. We do need it for
> native boots of paravirt-ops kernels, however, so turning off the config
> option still needs to be revisited

Have machine_kexec_prepare fail.

I think your machine description or paravirt_ops or whatever it is needs to hook both machine_kexec_prepare and machine_kexec.

I know there actually has been some work to get kexec actually working under Xen but I don't know where that has gone.

Eric
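Editor's sketch of the "have machine_kexec_prepare fail" suggestion: a minimal user-space model, assuming a hypothetical per-platform ops table. `struct platform_kexec_ops`, `do_kexec()` and the `guest_*` functions are invented for illustration and are not the real paravirt_ops layout; the real machine_kexec() also does not return, which this toy ignores.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the kernel's struct kimage. */
struct kimage { unsigned long start; };

/* Hypothetical per-platform hooks (an assumption, not kernel API). */
struct platform_kexec_ops {
    int  (*kexec_prepare)(struct kimage *image);
    void (*kexec)(struct kimage *image);
};

static bool native_kexec_ran;

static int native_kexec_prepare(struct kimage *image)
{
    (void)image;
    return 0;                  /* native boot: kexec is allowed */
}

static void native_kexec(struct kimage *image)
{
    (void)image;
    native_kexec_ran = true;   /* the real hook would jump to the new kernel */
}

static int guest_kexec_prepare(struct kimage *image)
{
    (void)image;
    return -ENOSYS;            /* runtime refusal: this platform cannot kexec */
}

static void guest_kexec(struct kimage *image)
{
    (void)image;               /* not reached while prepare keeps failing */
}

static const struct platform_kexec_ops native_ops = {
    .kexec_prepare = native_kexec_prepare,
    .kexec         = native_kexec,
};

static const struct platform_kexec_ops guest_ops = {
    .kexec_prepare = guest_kexec_prepare,
    .kexec         = guest_kexec,
};

/* Generic path: the jump only happens if the prepare hook succeeded. */
static int do_kexec(const struct platform_kexec_ops *ops, struct kimage *image)
{
    int err = ops->kexec_prepare(image);

    if (err)
        return err;
    ops->kexec(image);
    return 0;
}
```

With this shape the same kernel binary keeps kexec when booted natively and refuses it at load time when the platform hook says so, which is the runtime disable Zach asked about.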
Re: [patch 11/21] Xen-paravirt: Add apply_to_page_range() which applies a function to a pte range.
Andrew Morton wrote:
> On Thu, 15 Feb 2007 18:25:00 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
>
>> Add a new mm function apply_to_page_range() which applies a given
>> function to every pte in a given virtual address range in a given mm
>> structure. This is a generic alternative to cut-and-pasting the Linux
>> idiomatic pagetable walking code in every place that a sequence of
>> PTEs must be accessed.
>>
>> Although this interface is intended to be useful in a wide range of
>> situations, it is currently used specifically by several Xen
>> subsystems, for example: to ensure that pagetables have been allocated
>> for a virtual address range, and to construct batched special
>> pagetable update requests to map I/O memory (in ioremap()).
>
> There was some discussion about this sort of thing last week. The
> consensus was that it's better to run the callback against a whole pmd's
> worth of ptes, mainly to amortise the callback's cost (a lot).
>
> It was implemented in
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm1/broken-out/smaps-extract-pmd-walker-from-smaps-code.patch

Speaking of that patch, I missed the discussion, but I'd hope it doesn't go upstream in its current form.

We now have one way of walking a range of ptes. The code may be duplicated a few times, but it is simple, we know how it works, and it is easy to get right because everyone does the same thing.

We used to have about a dozen slightly different ways of doing this until Hugh spent the effort to standardise it all. Isn't it nice?

If we want an ever-so-slightly lower performing interface for those paths that don't care to count every cycle -- which I think is a fine idea BTW -- it should be implemented in mm/memory.c and it should use our standard form of pagetable walking.

--
SUSE Labs, Novell Inc.
Re: [patch 12/21] Xen-paravirt: Allocate and free vmalloc areas
Andrew Morton wrote:
> On Thu, 15 Feb 2007 23:08:02 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
>
>> Andrew Morton wrote:
>>
>>> This won't work when CONFIG_PREEMPT=y. The pagefault handler will see
>>> in_atomic() and will scram.
>>
>> Is there some other way to get the pagetable populated for the address
>> range?
>
> If you really need to run atomically, that gets ugly. Even if one were to
> run handle_mm_fault() by hand, it still needs to allocate memory.
>
> Two ugly options might be:
>
> a) touch all the pages, then go atomic, then touch them all again. If
>    one of them faults (ie: you raced with swapout) then go back and try
>    again. Obviously susceptible to livelocking.
>
> b) Do get_user_pages() against all the pages, then go atomic, then do
>    put_page() against them all. Of course, they can immediately get
>    swapped out.
>
> But that function's already racy against swapout and I guess it works OK.
> I don't have a clue what it is actually trying to do, so I'm guessing madly
> here.

It's for populating the pagetable in a vmalloc area. There's magic in the fault handler to synchronize the vmalloc mappings between different processes' kernel mappings, so if the mapping isn't currently present, it will fault and create the appropriate mapping. It's not operating on swappable user memory, so swapping isn't an issue; but if the fault handler exits immediately with preempt disabled, then there's a problem.

(Even though we don't support preempt, I've been generally coding in a preempt-safe way, just so that the code looks "normal". And maybe we will support preempt at some point.)

J
Re: [patch 12/21] Xen-paravirt: Allocate and free vmalloc areas
On Thu, 15 Feb 2007 23:08:02 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > This won't work when CONFIG_PREEMPT=y. The pagefault handler will see
> > in_atomic() and will scram.
>
> Is there some other way to get the pagetable populated for the address
> range?

If you really need to run atomically, that gets ugly. Even if one were to run handle_mm_fault() by hand, it still needs to allocate memory.

Two ugly options might be:

a) touch all the pages, then go atomic, then touch them all again. If one of them faults (ie: you raced with swapout) then go back and try again. Obviously susceptible to livelocking.

b) Do get_user_pages() against all the pages, then go atomic, then do put_page() against them all. Of course, they can immediately get swapped out.

But that function's already racy against swapout and I guess it works OK. I don't have a clue what it is actually trying to do, so I'm guessing madly here.
Re: [patch 14/21] Xen-paravirt: Add XEN config options and disable unsupported config options.
Andrew Morton wrote:
> On Thu, 15 Feb 2007 22:14:45 -0800 Dan Hecht <[EMAIL PROTECTED]> wrote:
>
>>> config PREEMPT
>>>     bool "Preemptible Kernel (Low-Latency Desktop)"
>>> +   depends on !XEN
>>>     help
>>>       This option reduces the latency of the kernel by making
>>>       all kernel code (that is not executing in a critical section)
>
> Oh, so that's why it doesn't break when CONFIG_PREEMPT=y. In which case
> that preempt_disable() I spotted is wrong-and-unneeded.
>
> Why doesn't Xen work with preemption??

I've forgotten the details. Ian? Keir? Steven? Maybe it can be done.

J
Re: [patch 11/21] Xen-paravirt: Add apply_to_page_range() which applies a function to a pte range.
Andrew Morton wrote:
> I guess your pte-at-a-time walker could be quite simply implemented underneath
> the smaps pmd-at-a-time walker.

Yes, converting should be pretty simple. There aren't many users in Xen, but they are moderately performance critical (fork, exec and exit), so using batching would be good.

J
Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface
Andrew Morton wrote:
> On Thu, 15 Feb 2007 18:24:49 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
>
>> This patch series implements the Linux Xen guest in terms of the
>> paravirt-ops interface.
>
> The whole patchset exports 67 symbols to modules. How come?
>
> Are they all needed?

Yep, pretty much. They're all generally to do with Xen's virtual device model, and are needed by modular frontend/backend drivers. This series only includes the basic block and network frontend devices, but there are more waiting in the wings.

The breakdown, roughly, is:

  * event channel management
  * pseudophysical <-> machine addresses
  * grant-table management
  * xenbus, which includes
      o which has a filesystem-like namespace
      o the means to monitor changes in objects in the namespace
      o shared resource management
      o suspend/resume

J
Re: [patch 11/21] Xen-paravirt: Add apply_to_page_range() which applies a function to a pte range.
On Thu, 15 Feb 2007 23:06:45 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > On Thu, 15 Feb 2007 18:25:00 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
> >
> >> Add a new mm function apply_to_page_range()[...]
> >
> > There was some discussion about this sort of thing last week. The
> > consensus was that it's better to run the callback against a whole pmd's
> > worth of ptes, mainly to amortise the callback's cost (a lot).
> >
> > It was implemented in
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm1/broken-out/smaps-extract-pmd-walker-from-smaps-code.patch
>
> Yes I was looking at that and wondering what the upshot would be. I'll
> have a closer look, but it seems like it should be usable.

It's a question of who-merges-first. I wasn't planning on merging the smaps stuff into 2.6.21. Perhaps the best approach is to proceed as-is and clean things up once it's all merged.

I guess your pte-at-a-time walker could be quite simply implemented underneath the smaps pmd-at-a-time walker.
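Editor's sketch of how a pte-at-a-time interface can sit on top of a batched, pmd-at-a-time walker, as suggested above. This is a user-space toy over a flat array, not the smaps patch or the real apply_to_page_range(); `walk_pmd_range()`, `apply_to_pte_range()`, the callback types and `PTES_PER_PMD` are all names assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PTES_PER_PMD 4          /* toy value; real tables are much larger */

typedef unsigned long pte_t;

/* Called once per pmd's worth of ptes (the batched interface). */
typedef int (*pmd_fn_t)(pte_t *ptes, size_t count, size_t first_index,
                        void *data);
/* Called once per pte (the simple interface). */
typedef int (*pte_fn_t)(pte_t *pte, size_t index, void *data);

/* Batched walker: covers [start, end), one callback per pmd-sized chunk. */
static int walk_pmd_range(pte_t *table, size_t start, size_t end,
                          pmd_fn_t fn, void *data)
{
    while (start < end) {
        size_t chunk_end = (start / PTES_PER_PMD + 1) * PTES_PER_PMD;
        int err;

        if (chunk_end > end)
            chunk_end = end;
        err = fn(&table[start], chunk_end - start, start, data);
        if (err)
            return err;
        start = chunk_end;
    }
    return 0;
}

/* Adapter state: carries the per-pte callback through the batched walker. */
struct pte_adapter {
    pte_fn_t fn;
    void *data;
};

static int pte_adapter_fn(pte_t *ptes, size_t count, size_t first_index,
                          void *data)
{
    struct pte_adapter *a = data;
    size_t i;

    for (i = 0; i < count; i++) {
        int err = a->fn(&ptes[i], first_index + i, a->data);

        if (err)
            return err;
    }
    return 0;
}

/* pte-at-a-time interface implemented underneath the batched walker. */
static int apply_to_pte_range(pte_t *table, size_t start, size_t end,
                              pte_fn_t fn, void *data)
{
    struct pte_adapter a = { fn, data };

    return walk_pmd_range(table, start, end, pte_adapter_fn, &a);
}

/* Example callbacks used to exercise both interfaces. */
static int count_batches(pte_t *ptes, size_t count, size_t first_index,
                         void *data)
{
    (void)ptes; (void)count; (void)first_index;
    ++*(int *)data;
    return 0;
}

static int mark_present(pte_t *pte, size_t index, void *data)
{
    (void)index;
    *pte |= 1UL;                /* toy "present" bit */
    ++*(int *)data;
    return 0;
}
```

Walking entries 2 through 10 of a 12-entry table costs three batched callbacks but nine per-pte callbacks, which is the amortisation argument in miniature.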
Re: [RFC PATCH(Experimental) 1/4] freezer-cpu-hotplug core
On Wed, Feb 14, 2007 at 11:22:09PM +0300, Oleg Nesterov wrote:
> > o Splits CPU_DEAD into two events namely
> >     - CPU_DEAD: which will be handled while the processes are still
> >       frozen.
> >
> >     - CPU_DEAD_KILL_THREADS: To be handled after we thaw_processes.
>
> Imho, this is not right. This changes the meaning of CPU_DEAD, and so
> we should fix all users of CPU_DEAD as well.

Why should we fix all users? Only users who were doing a kthread_stop() in CPU_DEAD need to be fixed. From my count, only 5 users (out of a total of 35) need to be fixed to not do kthread_stop in CPU_DEAD.

> How about
>
>     CPU_DEAD_WHATEVER
>         the processes are still frozen
>
>     CPU_DEAD
>         after we thaw_processes
>
> This way we can add processing of the new CPU_DEAD_WHATEVER event where
> it may help.

Well, -most- of the work needs to be done in a state when processes are frozen. The only exception is cleaning up of per-cpu threads (which is not possible with processes frozen - if we can find a way to make that possible, then everything can be done in CPU_DEAD).

If we go by the change suggested above, then we need to fix all users of CPU_DEAD to do what they are doing in CPU_DEAD_WHATEVER (when processes are frozen). I would rather avoid this invasive change and let CPU_DEAD be sent with processes frozen still.

> CPU_UP_PREPARE is called after freeze_processes()... Probably this works,
> but imho this is no good. Suppose for a moment that khelper will be frozen
> (yes, yes it can't be), then we can't do kthread_create().

Yes, I am worried about doing so many things with processes frozen. Maybe time (and more testing) will tell us if this is a bad thing or not.

The only dependency I have found so far is that kthread workqueue needs to be up (and hence its worker thread needs to be exempted from hotplug freeze). We should mark kthread workqueue accordingly as not freezable for hotplug.

--
Regards,
vatsa
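Editor's sketch of the two-event split discussed above: a toy callback that does its data cleanup on CPU_DEAD (processes still frozen) and only stops its per-cpu thread on CPU_DEAD_KILL_THREADS (after thawing). The event names come from the thread; the enum values, `struct percpu_worker`, `processes_frozen` and `toy_cpu_down()` are assumptions made for this user-space model and do not reproduce the kernel's notifier API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Event names follow the thread; the numeric values are arbitrary. */
enum hotplug_event { CPU_DEAD, CPU_DEAD_KILL_THREADS };

/* Toy per-cpu worker: some queued work plus a thread that serves it. */
struct percpu_worker {
    int  queued;
    bool thread_running;
};

static bool processes_frozen;   /* toy stand-in for the freezer state */

static int worker_cpu_callback(enum hotplug_event ev, struct percpu_worker *w)
{
    switch (ev) {
    case CPU_DEAD:
        /* Runs with processes frozen: touch data only, never wait on a thread. */
        w->queued = 0;
        return 0;
    case CPU_DEAD_KILL_THREADS:
        /* Stopping a thread needs that thread to run, so refuse while frozen. */
        if (processes_frozen)
            return -EBUSY;
        w->thread_running = false;
        return 0;
    }
    return -EINVAL;
}

/* Toy hot-unplug path: first event while frozen, second event after thaw. */
static int toy_cpu_down(struct percpu_worker *w)
{
    int err;

    processes_frozen = true;
    err = worker_cpu_callback(CPU_DEAD, w);
    processes_frozen = false;
    if (err)
        return err;
    return worker_cpu_callback(CPU_DEAD_KILL_THREADS, w);
}
```

Under this split, a callback that never calls kthread_stop() needs no change, which matches the argument that only a handful of CPU_DEAD users would have to be touched.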
Re: [linux-pm] Suspend to RAM, Sony Vaio PCG-SRX51P, lcd stays off
Mattia Dongili wrote:
> On Thu, February 15, 2007 11:36 am, Pavel Machek said:
>>> sys_vendor = "Sony Corporation"
>>> sys_product = "PCG-SRX51P(DE) "
>>> sys_version = "01 "
>>> bios_version = "R0232U2"
>
> (unrelated to your suspend problems) does the sony-laptop (formerly
> sony_acpi) module help with controlling brightness?
> (should appear soon or you can eventually grab it from the linux-acpi tree)

No, I use the sonypi driver. But I can test the sony-laptop one as soon as it is in -mm again if it would be of any help.

>>> Latest kernel I tested is 2.6.20-git11 from today.
>
> I read reports of successful suspends on that laptop, eg:
> http://freenet-homepage.de/obauer/index.html

Ok, now I feel totally dumb. 's2ram -s -f' actually works iff you disable fb support completely in the kernel. It works even from X. Don't know how many combinations I tried but that one somehow slipped through. Anyway thanks for your help.

So could this machine be added to the s2ram database?

Thanks,
Jan
Re: [patch 12/21] Xen-paravirt: Allocate and free vmalloc areas
Andrew Morton wrote:
> This won't work when CONFIG_PREEMPT=y. The pagefault handler will see
> in_atomic() and will scram.

Is there some other way to get the pagetable populated for the address range?

J
Re: [patch 14/21] Xen-paravirt: Add XEN config options and disable unsupported config options.
On Thu, 15 Feb 2007 22:14:45 -0800 Dan Hecht <[EMAIL PROTECTED]> wrote:

> > config PREEMPT
> >     bool "Preemptible Kernel (Low-Latency Desktop)"
> > +   depends on !XEN
> >     help
> >       This option reduces the latency of the kernel by making
> >       all kernel code (that is not executing in a critical section)

Oh, so that's why it doesn't break when CONFIG_PREEMPT=y. In which case that preempt_disable() I spotted is wrong-and-unneeded.

Why doesn't Xen work with preemption??

> I hate to sound like a broken record, but this really isn't the right
> way to do this. If you are going to inhibit config settings when Xen
> support is compiled, then it should be:
>
> config XEN
>     depends on !KEXEC && !DOUBLEFAULT && HZ_100 && !PREEMPT

Agree.
Re: [patch 11/21] Xen-paravirt: Add apply_to_page_range() which applies a function to a pte range.
Andrew Morton wrote:
> On Thu, 15 Feb 2007 18:25:00 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
>
>> Add a new mm function apply_to_page_range()[...]
>
> There was some discussion about this sort of thing last week. The
> consensus was that it's better to run the callback against a whole pmd's
> worth of ptes, mainly to amortise the callback's cost (a lot).
>
> It was implemented in
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm1/broken-out/smaps-extract-pmd-walker-from-smaps-code.patch

Yes I was looking at that and wondering what the upshot would be. I'll have a closer look, but it seems like it should be usable.

J
Re: [patch 14/21] Xen-paravirt: Add XEN config options and disable unsupported config options.
Dan Hecht wrote:
> 1) Complete the xen paravirt-ops backend so it can handle these
> "incompatible" options. I realize this is just a matter of time (at
> least for most of them, what is your plan for PREEMPT?).

Hope it will go away? There's some relatively deep reason to do with migration which makes preempt very difficult to support, but I've forgotten the details. I've been wondering if there's some way to make it runtime selectable without massive performance cost.

> 2) Disable the option at runtime only if the kernel is booted on Xen.
> When the kernel is booted on native, lhype, paravirt kvm, vmware, etc
> it should not be inhibited. This may not be feasible for all of
> these options, but as Zach pointed out, is easy enough for
> DOUBLEFAULT. Maybe it can be done for KEXEC? And HZ is easy to allow
> too, even though Xen will still give interrupts at 100hz -- we do this
> when you boot on VMI. PREEMPT is probably the only real compile time
> incompatible option with Xen. You just simply have to change the
> loop decrementors in your timer interrupt.
>
> You basically chose #2 for SMP: while your backend doesn't support it
> yet, it's not harmful to have the config option enabled; you just
> don't allow a second vcpu to startup when running on Xen.

Yes, this has been my preferred approach.

So for each of the restrictions:

PREEMPT - hard, will try to do something at runtime
HZ - I'm assuming dynticks will appear in the short term, and this will become moot
KEXEC - there's been some work on making KEXEC and Xen play nicely together; I was waiting to see if that matures, and/or make it runtime switchable in the meantime
DOUBLEFAULT - not really an issue; guests won't get them under Xen (I think) and we can ignore setting the gate pretty easily

J
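Editor's sketch of one way to read the "HZ is easy to allow" remark: if the hypervisor only ever delivers a 100Hz tick, a guest built with a different HZ can credit several jiffies per tick and carry the fraction forward. This is an interpretation, not code from the patch series; `struct tick_state`, `hypervisor_tick()` and `HYPERVISOR_HZ` are hypothetical names.

```c
#include <assert.h>

#define HYPERVISOR_HZ 100UL     /* tick rate the thread attributes to Xen */

struct tick_state {
    unsigned long jiffies;
    unsigned long remainder;    /* fractional jiffies, in 1/HYPERVISOR_HZ units */
};

/* Account one hypervisor tick for a guest compiled with HZ == guest_hz. */
static void hypervisor_tick(struct tick_state *ts, unsigned long guest_hz)
{
    ts->remainder += guest_hz;
    ts->jiffies   += ts->remainder / HYPERVISOR_HZ;
    ts->remainder %= HYPERVISOR_HZ;
}
```

For HZ=250 this credits 2 and 3 jiffies on alternating ticks, so one second of hypervisor ticks still advances jiffies by 250.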
Re: [PATCH] Don't probe for DDC on VBE1.2
On Thu, 15 Feb 2007 21:59:06 -0800 (PST) Zwane Mwaikambo <[EMAIL PROTECTED]> wrote:

> On Thu, 15 Feb 2007, Andrew Morton wrote:
>
> > On Thu, 15 Feb 2007 21:45:06 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote:
> >
> > > On Thu, 15 Feb 2007 21:35:35 -0800 (PST) Zwane Mwaikambo <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Thu, 15 Feb 2007, Andrew Morton wrote:
> > > >
> > > > > This makes the long-suffering-but-vigorously-defended Vaio come up with a
> > > > > black display. Everything's working OK otherwise. Sort of a Black Screen
> > > > > of Life. I wouldn't call it an improvement though.
> > > >
> > > > Bugger, what does your kernel commandline look like?
> > >
> > > Kernel command line: ro root=LABEL=/ rhgb vga=0x263 clock=pit
> >
> > Removing the vga=0x263 "fixes" it.
>
> Sorry I missed this earlier, could you also post up an Xorg.0.log (or
> equivalent for your system).

It's not an X problem - the screen is black immediately upon loading the kernel.

But I guess you knew that and you're just after display info:

http://userweb.kernel.org/~akpm/Xorg.0.log.txt
Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface
On Thu, 15 Feb 2007 18:24:49 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> This patch series implements the Linux Xen guest in terms of the
> paravirt-ops interface.

The whole patchset exports 67 symbols to modules. How come?

Are they all needed?
Re: [patch 18/21] Xen-paravirt: Add Xen grant table support
Andrew Morton wrote:
> On Thu, 15 Feb 2007 18:25:07 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
>
>> +int gnttab_grant_foreign_access(domid_t domid, unsigned long frame,
>> +                                int readonly)
>> +{
>> +    int ref;
>> +
>> +    if (unlikely((ref = get_free_entry()) == -1))
>> +        return -ENOSPC;
>> +
>> +    shared[ref].frame = frame;
>> +    shared[ref].domid = domid;
>> +    wmb();
>> +    shared[ref].flags = GTF_permit_access | (readonly ? GTF_readonly : 0);
>> +
>> +    return ref;
>> +}
>> +EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access);
>
> We have lots of open-coded mysteriously unexplained barriers in here.
>
> I assume they're not smp_wmb() because this could be a !SMP guest talking
> to an SMP host?

Yeah. The grant tables refer to pages which are shared with other domains, so they could be running on other cpus even if this domain is UP. There's a lockless protocol going on here, but I'll need to look it up and sprinkle some comments.

J
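Editor's sketch of the publish ordering in the quoted function, modelled in user space with C11 atomics: fill in the payload fields, issue a release fence (standing in for wmb()), and only then store the flags word that makes the entry live. The reader side is an assumption added for illustration, since the actual lockless protocol is not spelled out in the thread. `struct grant_entry`, `grant_publish()` and `grant_read()` are hypothetical, and the flag values are toy constants.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Toy flag values for this sketch only. */
#define GTF_permit_access 1u
#define GTF_readonly      4u

struct grant_entry {
    _Atomic uint16_t flags;     /* written last; nonzero means the entry is live */
    uint16_t domid;
    uint32_t frame;
};

/* Writer: payload first, then a fence, then the flags that publish it. */
static void grant_publish(struct grant_entry *e, uint16_t domid,
                          uint32_t frame, int readonly)
{
    e->frame = frame;
    e->domid = domid;
    /* Counterpart of wmb(): keep the two stores above ahead of the flags store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&e->flags,
                          (uint16_t)(GTF_permit_access |
                                     (readonly ? GTF_readonly : 0u)),
                          memory_order_relaxed);
}

/* Assumed reader: returns 1 and copies the payload only if the entry is live. */
static int grant_read(struct grant_entry *e, uint16_t *domid, uint32_t *frame)
{
    uint16_t flags = atomic_load_explicit(&e->flags, memory_order_relaxed);

    if (!(flags & GTF_permit_access))
        return 0;
    /* Pairs with the writer's release fence before touching the payload. */
    atomic_thread_fence(memory_order_acquire);
    *domid = e->domid;
    *frame = e->frame;
    return 1;
}
```

The fence here is unconditional, in the same spirit as the patch using wmb() rather than smp_wmb(): the other party may be running on another physical cpu even when the writing guest is built for a single processor.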
Re: [patch 10/21] Xen-paravirt: add hooks to intercept mm creation and destruction
Andrew Morton wrote:
> On Thu, 15 Feb 2007 18:24:59 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
>
>> Add hooks to allow a paravirt implementation to track the lifetime of
>> an mm.
>>
>> --- a/arch/i386/kernel/paravirt.c
>> +++ b/arch/i386/kernel/paravirt.c
>> @@ -706,6 +706,10 @@ struct paravirt_ops paravirt_ops = {
>>      .irq_enable_sysexit = native_irq_enable_sysexit,
>>      .iret = native_iret,
>>
>> +    .dup_mmap = (void *)native_nop,
>> +    .exit_mmap = (void *)native_nop,
>> +    .activate_mm = (void *)native_nop,
>> +
>>      .startup_ipi_hook = (void *)native_nop,
>>  };
>
> eww. I suppose there's a good reason for the casting.

Yeah, it's a bit ugly. The alternative is to have a separate correctly-typed nop function for each operation. But that's even more typing.

> It seems strange to call out to arch_foo() from within an arch header file.
> I mean, we implicitly know we're i386.
>
> Maybe it's just poorly named.

The other two are arch_* and are called from common code. arch_activate_mm() is either empty or a call to paravirt_ops.activate_mm. I could name it paravirt_activate_mm (as it was in earlier versions of this patch), but then it would be inconsistent with the other functions. I thought the consistency was more important, because these calls need to be properly matched.

>> +static inline void paravirt_activate_mm(struct mm_struct *prev,
>> +                                        struct mm_struct *next)
>> +{
>> +}
>> +
>> +static inline void paravirt_dup_mmap(struct mm_struct *oldmm,
>> +                                     struct mm_struct *mm)
>> +{
>> +}
>> +
>> +static inline void paravirt_exit_mmap(struct mm_struct *mm)
>> +{
>> +}
>
> These functions are unreferenced in this patchset.

OK, I'll drop them.

>>  #endif /* CONFIG_PARAVIRT */
>>  #endif /* __ASM_PARAVIRT_H */
>> ===
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -374,6 +374,12 @@ struct mm_struct {
>>      rwlock_t            ioctx_list_lock;
>>      struct kioctx       *ioctx_list;
>>  };
>> +
>> +#ifndef __HAVE_ARCH_MM_LIFETIME
>> +#define arch_activate_mm(prev, next)    do {} while(0)
>> +#define arch_dup_mmap(oldmm, mm)        do {} while(0)
>> +#define arch_exit_mmap(mm)              do {} while(0)
>> +#endif
>
> Can we lose __HAVE_ARCH_MM_LIFETIME? Just define these (preferably in C,
> not in cpp) in the appropriate include/asm-foo/ files?

I guess, if you want. For everything except i386 (and x86_64 in the not too distant future) they'll be noops. But for consistency, I/we would have to put the appropriate arch_activate_mm() into each arch's activate_mm(); I seem to remember some were not as straightforward as i386.

>>  struct sighand_struct {
>>      atomic_t            count;
>> ===
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -286,6 +286,7 @@ static inline int dup_mmap(struct mm_str
>>          if (retval)
>>              goto out;
>>      }
>> +    arch_dup_mmap(oldmm, mm);
>>      retval = 0;
>>  out:
>>      up_write(&mm->mmap_sem);
>> ===
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1976,6 +1976,8 @@ void exit_mmap(struct mm_struct *mm)
>>      struct vm_area_struct *vma = mm->mmap;
>>      unsigned long nr_accounted = 0;
>>      unsigned long end;
>> +
>> +    arch_exit_mmap(mm);
>>
>>      lru_add_drain();
>>      flush_cache_mm(mm);
>
> Perhaps some commentary telling the arch maintainer what these hooks he's
> being offered are for?

OK.

J
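Editor's sketch of the two alternatives raised in this exchange: correctly-typed no-op functions instead of casting one generic nop, and arch_* hooks written as C inline functions rather than cpp macros. It is a user-space toy; `struct mm_lifetime_ops`, the `nop_*` and `tracking_*` functions and the `live_mms` counter are invented for illustration and are not the layout used by the patch.

```c
#include <assert.h>

struct mm_struct { int id; };   /* toy stand-in for the kernel structure */

/* Hypothetical ops table holding only the three mm lifetime hooks. */
struct mm_lifetime_ops {
    void (*activate_mm)(struct mm_struct *prev, struct mm_struct *next);
    void (*dup_mmap)(struct mm_struct *oldmm, struct mm_struct *mm);
    void (*exit_mmap)(struct mm_struct *mm);
};

/* One correctly-typed no-op per hook, so no (void *) casts are needed. */
static void nop_activate_mm(struct mm_struct *prev, struct mm_struct *next)
{
    (void)prev; (void)next;
}

static void nop_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
{
    (void)oldmm; (void)mm;
}

static void nop_exit_mmap(struct mm_struct *mm)
{
    (void)mm;
}

static struct mm_lifetime_ops mm_ops = {
    .activate_mm = nop_activate_mm,
    .dup_mmap    = nop_dup_mmap,
    .exit_mmap   = nop_exit_mmap,
};

/* arch_* hooks as typed inline functions instead of do {} while(0) macros. */
static inline void arch_activate_mm(struct mm_struct *prev,
                                    struct mm_struct *next)
{
    mm_ops.activate_mm(prev, next);
}

static inline void arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
{
    mm_ops.dup_mmap(oldmm, mm);
}

static inline void arch_exit_mmap(struct mm_struct *mm)
{
    mm_ops.exit_mmap(mm);
}

/* Hypothetical backend that tracks how many mms are alive. */
static int live_mms;

static void tracking_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
{
    (void)oldmm; (void)mm;
    live_mms++;
}

static void tracking_exit_mmap(struct mm_struct *mm)
{
    (void)mm;
    live_mms--;
}
```

The cost is the extra typing Jeremy mentions; the gain is that the compiler checks every hook's signature and every call site's arguments.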
Re: [patch 18/21] Xen-paravirt: Add Xen grant table support
On Thu, 15 Feb 2007 18:25:07 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> +int gnttab_grant_foreign_access(domid_t domid, unsigned long frame,
> +                                int readonly)
> +{
> +    int ref;
> +
> +    if (unlikely((ref = get_free_entry()) == -1))
> +        return -ENOSPC;
> +
> +    shared[ref].frame = frame;
> +    shared[ref].domid = domid;
> +    wmb();
> +    shared[ref].flags = GTF_permit_access | (readonly ? GTF_readonly : 0);
> +
> +    return ref;
> +}
> +EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access);

We have lots of open-coded mysteriously unexplained barriers in here.

I assume they're not smp_wmb() because this could be a !SMP guest talking to an SMP host?
Re: [patch 17/21] Xen-paravirt: Add the Xen virtual console driver.
On Thu, 15 Feb 2007 18:25:06 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> ===
> --- a/arch/i386/kernel/early_printk.c
> +++ b/arch/i386/kernel/early_printk.c
> @@ -1,2 +1,4 @@
>
> +#ifndef CONFIG_XEN
>  #include "../../x86_64/kernel/early_printk.c"
> +#endif
> ===

This can be done in Kconfig: disable CONFIG_EARLY_PRINTK if CONFIG_XEN.
Re: [RFC PATCH(Experimental) 1/4] freezer-cpu-hotplug core
On Wed, Feb 14, 2007 at 10:47:42PM +0300, Oleg Nesterov wrote: > > for (;;) { > > - if (cwq->wq->freezeable) > > + if (cwq->wq->freezeable) { > > Else? This is wrong. The change like this should start from making all > cwq->threads freezeable, otherwise it just doesn't work. I agree we need to have all threads frozen for hotplug. Only exception I have found is kthread workqueue, which needs to be active after freeze_processes(). stop_machine and CPU_UP_PREPARE/kthread_create() depend on it to work. A worker thread (like kthread workqueue), which has exempted itself from hotplug-freeze, should essentially be prepared to get preempted any time and made to run on any cpu. If that is the case, do you see any problems in having the if () statement above? > > +wait_to_die: > > + /* Wait for kthread_stop */ > > + set_current_state(TASK_INTERRUPTIBLE); > > + while (!kthread_should_stop()) { > > + schedule(); > > + set_current_state(TASK_INTERRUPTIBLE); > > + } > > + __set_current_state(TASK_RUNNING); > > + return 0; > > } > > I believe this is not needed, see the comments for the next patch. Without this, thread cleanup (cwq->should_stop)/create(CPU_UP_PREPARE) becomes racy -- Regards, vatsa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: GPL vs non-GPL device drivers
On 2/16/07, Scott Preece <[EMAIL PROTECTED]> wrote: On 2/15/07, Miguel Ojeda <[EMAIL PROTECTED]> wrote: > Stupid, maybe. But some people just don't want closed-source > projects/companies like yours using their free work, without any kind > of feedback. Some others don't care, but they could in the future, as > it is their code, and that is your risk. > --- So, how are such companies any different from the myriad individuals and companies that use Linux on the desktop or in their server rooms without ever modifying it and who also contribute nothing back to the community? They are also, in many (most?) cases taking advantage of the free (as in beer) nature of Linux - saving money by using the work of others without returning anything, but the product builders seem to get a lot more abuse... scott Well, as I pointed out in other message, I think there is a difference between using the GPL'd code (as user), and modify it, link closed modules, create derivated closed code, and then redistribute everything. I'm not saying that we should prevent companies using Linux for money, I'm just saying that if someone modify some GPL'd code or link or something like that, he/she should release such code as GPL'd; and that is the feedback for the original authors: The improvement of the code. I don't care if this this or other company makes money, it is free to do it, but it would be better to receive feedback as opening such closed modules. Anyway, it is not clear if opening the source would break his business or help it. -- Miguel Ojeda http://maxextreme.googlepages.com/index.htm - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: GPL vs non-GPL device drivers
On 2/16/07, v j <[EMAIL PROTECTED]> wrote: > It's written in black and white, in the license. Please point me to where it says I cannot load proprietary modules in the Kernel. It doesn't. It does, however, say you can't distribute your module unless you make it available under the same terms as the kernel. It makes that really clear. I'll say that again, for everyone else who is reading this: the GPL makes it really clear that extensions to a GPL work are required to be distributed under the terms of the GPL. All this junk about "derivative works" is just the legal jargon used to implement the intent of the GPL. You can argue that a particular extension isn't a derivative work if you want, but you can't argue with the intent.. cause it is written in plain english. I know his opinion. I don't debate his opinion. It is his code. I choose not to use his code because of the license issue. That's good. No, just that the trend is disturbing. If enough Kernel Developers choose to write their Software in a way that prevents others from using it freely, then that is troubling. Especially when these Kernel Developers are substituting existing interfaces in the Kernel with ones that are NEW and require specific licenses. It's hardly a new trend.. the kernel has always been GPL.. the intent has always been that all extensions that are distributed be distributed under the GPL. This whole EXPORT_SYMBOL_GPL thing is new.. but it doesn't require your module to be under the GPL to load, it requires that your module export a license declaration that claims it is GPL - you can do that without changing your license. Frankly, I don't understand why you're willing to ignore the intent of the GPL but you don't appear to be willing to just make your module export a license declaration of "GPL". 
Trent
Re: [PATCH] Fix modular AGPGART (ia64 allmodconfig)
On Fri, 16 Feb 2007, Kyle McMartin wrote:

> On Thu, Feb 15, 2007 at 10:09:57PM -0800, Zwane Mwaikambo wrote:
> > +ifeq ($(CONFIG_COMPAT),y)
> > +agpgart-y += compat_ioctl.o
> > +endif
> > +
>
> eh?
>
> Couldn't this be?
> agpgart-$(CONFIG_COMPAT) += compat_ioctl.o

Yep, that works too and does look cleaner.
Re: [patch 12/21] Xen-paravirt: Allocate and free vmalloc areas
On Thu, 15 Feb 2007 18:25:01 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> +void lock_vm_area(struct vm_struct *area)
> +{
> +	unsigned long i;
> +	char c;
> +
> +	/*
> +	 * Prevent context switch to a lazy mm that doesn't have this area
> +	 * mapped into its page tables.
> +	 */
> +	preempt_disable();
> +
> +	/*
> +	 * Ensure that the page tables are mapped into the current mm. The
> +	 * page-fault path will copy the page directory pointers from init_mm.
> +	 */
> +	for (i = 0; i < area->size; i += PAGE_SIZE)
> +		(void)__get_user(c, (char __user *)area->addr + i);
> +}
> +EXPORT_SYMBOL_GPL(lock_vm_area);

This won't work when CONFIG_PREEMPT=y. The pagefault handler will see in_atomic() and will scram.

(pet-peeve-from-someone-who-remembers-fortran: the reader expects the variable `i' to be signed. signed int really)
Re: GPL vs non-GPL device drivers
On 2/16/07, v j <[EMAIL PROTECTED]> wrote: > It's written in black and white, in the license. Please point me to where it says I cannot load proprietary modules in the Kernel. > Apart from that, > Greg KH has made his opinion clear, and you have said you understand > and don't debate that he holds this opinion, and his code is what you > said you were linking to (the sysfs/class stuff), so why do you keep > saying that "it is not clear". I know his opinion. I don't debate his opinion. It is his code. I choose not to use his code because of the license issue. > Do you think that, somehow, Linus' opinion trumps Greg KH's opinion on > his own code? No, just that the trend is disturbing. If enough Kernel Developers choose to write their Software in a way that prevents others from using it freely, then that is troubling. Especially when these Kernel Isn't there a big difference between "use GPL code" and "modify GPL code, link closed modules to it & redistribute everything as binaries"? -- Miguel Ojeda http://maxextreme.googlepages.com/index.htm - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 11/21] Xen-paravirt: Add apply_to_page_range() which applies a function to a pte range.
On Thu, 15 Feb 2007 18:25:00 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote: > Add a new mm function apply_to_page_range() which applies a given > function to every pte in a given virtual address range in a given mm > structure. This is a generic alternative to cut-and-pasting the Linux > idiomatic pagetable walking code in every place that a sequence of > PTEs must be accessed. > > Although this interface is intended to be useful in a wide range of > situations, it is currently used specifically by several Xen > subsystems, for example: to ensure that pagetables have been allocated > for a virtual address range, and to construct batched special > pagetable update requests to map I/O memory (in ioremap()). There was some discussion about this sort of thing last week. The consensus was that it's better to run the callback against a whole pmd's worth of ptes, mainly to amortise the callback's cost (a lot). It was implemented in ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm1/broken-out/smaps-extract-pmd-walker-from-smaps-code.patch - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 10/21] Xen-paravirt: add hooks to intercept mm creation and destruction
On Thu, 15 Feb 2007 18:24:59 -0800 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote: > Add hooks to allow a paravirt implementation to track the lifetime of > an mm. > > --- a/arch/i386/kernel/paravirt.c > +++ b/arch/i386/kernel/paravirt.c > @@ -706,6 +706,10 @@ struct paravirt_ops paravirt_ops = { > .irq_enable_sysexit = native_irq_enable_sysexit, > .iret = native_iret, > > + .dup_mmap = (void *)native_nop, > + .exit_mmap = (void *)native_nop, > + .activate_mm = (void *)native_nop, > + > .startup_ipi_hook = (void *)native_nop, > }; eww. I suppose there's a good reason for the casting. > === > --- a/include/asm-i386/mmu_context.h > +++ b/include/asm-i386/mmu_context.h > @@ -5,6 +5,7 @@ > #include > #include > #include > +#include > > /* > * Used for LDT copy/destruction. > @@ -65,7 +66,10 @@ static inline void switch_mm(struct mm_s > #define deactivate_mm(tsk, mm) \ > asm("movl %0,%%gs": :"r" (0)); > > -#define activate_mm(prev, next) \ > - switch_mm((prev),(next),NULL) > +#define activate_mm(prev, next) \ > + do {\ > + arch_activate_mm(prev, next); \ > + switch_mm((prev),(next),NULL); \ > + } while(0); It seems strange to call out to arch_foo() from within an arch header file. I mean, we implicity know we're i386. Maybe it's just poorly named. > +static inline void paravirt_activate_mm(struct mm_struct *prev, > + struct mm_struct *next) > +{ > +} > + > +static inline void paravirt_dup_mmap(struct mm_struct *oldmm, > + struct mm_struct *mm) > +{ > +} > + > +static inline void paravirt_exit_mmap(struct mm_struct *mm) > +{ > +} These functions are unreferenced in this patchset. 
> #endif /* CONFIG_PARAVIRT */
> #endif /* __ASM_PARAVIRT_H */
> ===
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -374,6 +374,12 @@ struct mm_struct {
>  	rwlock_t		ioctx_list_lock;
>  	struct kioctx		*ioctx_list;
>  };
> +
> +#ifndef __HAVE_ARCH_MM_LIFETIME
> +#define arch_activate_mm(prev, next)	do {} while(0)
> +#define arch_dup_mmap(oldmm, mm)	do {} while(0)
> +#define arch_exit_mmap(mm)		do {} while(0)
> +#endif

Can we lose __HAVE_ARCH_MM_LIFETIME? Just define these (preferably in C, not in cpp) in the appropriate include/asm-foo/ files?

>  struct sighand_struct {
>  	atomic_t		count;
> ===
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -286,6 +286,7 @@ static inline int dup_mmap(struct mm_str
>  		if (retval)
>  			goto out;
>  	}
> +	arch_dup_mmap(oldmm, mm);
>  	retval = 0;
>  out:
>  	up_write(>mmap_sem);
> ===
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1976,6 +1976,8 @@ void exit_mmap(struct mm_struct *mm)
>  	struct vm_area_struct *vma = mm->mmap;
>  	unsigned long nr_accounted = 0;
>  	unsigned long end;
> +
> +	arch_exit_mmap(mm);
>
>  	lru_add_drain();
>  	flush_cache_mm(mm);

Perhaps some commentary telling the arch maintainer what these hooks he's being offered are for?
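As a sketch of the "in C, not in cpp" suggestion, the default hooks could be written as empty static inline functions with a comment on each, roughly as below. This is only an illustration, assuming an architecture that needs no special handling. The comments are inferred from where the quoted hunks call each hook, and dup_mmap_sketch() is a hypothetical caller, not kernel code.

```c
#include <stddef.h>

struct mm_struct;	/* opaque here; the real definition lives in the kernel */

/* Called from activate_mm(), before switching to the new mm. */
static inline void arch_activate_mm(struct mm_struct *prev,
				    struct mm_struct *next)
{
	(void)prev;
	(void)next;
}

/* Called at the end of dup_mmap(), once the new mm has been populated. */
static inline void arch_dup_mmap(struct mm_struct *oldmm,
				 struct mm_struct *mm)
{
	(void)oldmm;
	(void)mm;
}

/* Called at the start of exit_mmap(), before the mappings are torn down. */
static inline void arch_exit_mmap(struct mm_struct *mm)
{
	(void)mm;
}

/* Hypothetical caller shaped like the fork.c hunk: the hook runs on success only. */
static int dup_mmap_sketch(struct mm_struct *oldmm, struct mm_struct *mm,
			   int retval)
{
	if (retval)
		goto out;
	arch_dup_mmap(oldmm, mm);
	retval = 0;
out:
	return retval;
}
```

Compared with the do {} while(0) macros, the inline versions still type-check their arguments and evaluate them exactly once, and the compiler should drop the empty bodies entirely.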
Re: e1000_intr in request_irq faults in 2.6.20-git
Andrew Morton wrote: On Thu, 15 Feb 2007 18:10:53 -0800 "Brandeburg, Jesse" <[EMAIL PROTECTED]> wrote: @@ -1431,6 +1427,10 @@ e1000_open(struct net_device *netdev) e1000_update_mng_vlan(adapter); } + err = e1000_request_irq(adapter); + if (err) + goto err_req_irq; + /* If AMT is enabled, let the firmware know that the network * interface is now open */ if (adapter->hw.mac_type == e1000_82573 && @@ -1439,10 +1439,11 @@ e1000_open(struct net_device *netdev) return E1000_SUCCESS; +err_req_irq: + e1000_down(adapter); + e1000_free_irq(adapter); err_up: We don't want that e1000_free_irq(adapter) in the error path. indeed, thanks for spotting and telling me before I sent this to Jeff. Cheers, Auke - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: GPL vs non-GPL device drivers
It's written in black and white, in the license. Please point me to where it says I cannot load proprietary modules in the Kernel. Apart from that, Greg KH has made his opinion clear, and you have said you understand and don't debate that he holds this opinion, and his code is what you said you were linking to (the sysfs/class stuff), so why do you keep saying that "it is not clear". I know his opinion. I don't debate his opinion. It is his code. I choose not to use his code because of the license issue. Do you think that, somehow, Linus' opinion trumps Greg KH's opinion on his own code? No, just that the trend is disturbing. If enough Kernel Developers choose to write their Software in a way that prevents others from using it freely, then that is troubling. Especially when these Kernel Developers are substituting existing interfaces in the Kernel with ones that are NEW and require specific licenses. vj. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix modular AGPGART (ia64 allmodconfig)
On Thu, Feb 15, 2007 at 10:09:57PM -0800, Zwane Mwaikambo wrote:
> +ifeq ($(CONFIG_COMPAT),y)
> +agpgart-y	+= compat_ioctl.o
> +endif
> +

eh?

Couldn't this be?
agpgart-$(CONFIG_COMPAT)	+= compat_ioctl.o

Cheers,
Kyle M.
Re: [PATCH] Don't probe for DDC on VBE1.2
On Thu, 15 Feb 2007, Andrew Morton wrote: > On Thu, 15 Feb 2007 21:45:06 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > > On Thu, 15 Feb 2007 21:35:35 -0800 (PST) Zwane Mwaikambo <[EMAIL > > PROTECTED]> wrote: > > > > > On Thu, 15 Feb 2007, Andrew Morton wrote: > > > > > > > This makes the long-suffering-but-vigorously-defended Vaio come up with > > > > a > > > > black display. Everything's working OK otherwise. Sort of a Black > > > > Screen > > > > of Life. I wouldn't call it an improvement though. > > > > > > Bugger, what does your kernel commandline look like? > > > > Kernel command line: ro root=LABEL=/ rhgb vga=0x263 clock=pit > > Removing the vga=0x263 "fixes" it. > Sorry i missed this earlier, could you also post up an Xorg.0.log (or equivalent for your system). Thanks, Zwane - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 14/21] Xen-paravirt: Add XEN config options and disable unsupported config options.
On 02/15/2007 06:25 PM, Jeremy Fitzhardinge wrote: The XEN config option enables the Xen paravirt_ops interface, which is installed when the kernel finds itself running under Xen. (By some as-yet fully defined mechanism, implemented in a future patch.) Xen is no longer a sub-architecture, so the X86_XEN subarch config option has gone. The disabled config options are: - PREEMPT: Xen doesn't support it - HZ: set to 100Hz for now, to cut down on VCPU context switch rate. This will be adapted to use tickless later. - kexec: not yet supported config KEXEC bool "kexec system call" + depends on !XEN help kexec is a system call that implements the ability to shutdown your current kernel, and to start another kernel. It is like a reboot config DOUBLEFAULT default y bool "Enable doublefault exception handler" if EMBEDDED + depends on !XEN help This option allows trapping of rare doublefault exceptions that would otherwise cause a system to silently reboot. Disabling this +config XEN + bool "Enable support for Xen hypervisor" + depends PARAVIRT + default y + help + This is the Linux Xen port. choice - prompt "Timer frequency" + prompt "Timer frequency" if !XEN default HZ_250 help Allows the configuration of the timer frequency. It is customary @@ -49,7 +49,7 @@ endchoice config HZ int - default 100 if HZ_100 + default 100 if HZ_100 || XEN default 250 if HZ_250 default 300 if HZ_300 default 1000 if HZ_1000 config PREEMPT bool "Preemptible Kernel (Low-Latency Desktop)" + depends on !XEN help This option reduces the latency of the kernel by making all kernel code (that is not executing in a critical section) I hate to sound like a broken record, but this really isn't the right way to do this. If you are going to inhibit config settings when Xen support is compiled, then it should be: config XEN depends on !KEXEC && !DOUBLEFAULT && HZ_100 && !PREEMPT I'm really not trying to make things more difficult for you or for Xen users. 
One of the primary goals of paravirt-ops is to allow the same kernel binary to bind to multiple hypervisor interfaces (and native), at runtime. So, we should assume that kernels which are built with CONFIG_PARAVIRT will also enable all the paravirt-ops backends; and this is a good thing.

But, the problem is, then, when someone enables paravirt (and all the backends, including Xen), you are silently forcing all these options off. Compiling in paravirt support should not force you into using some set of config options; when a paravirt-ops kernel is run on native, it should be as fully featured as a non-paravirtops kernel.

So, this should really be solved (in order of preference) by:

1) Complete the xen paravirt-ops backend so it can handle these "incompatible" options. I realize this is just a matter of time (at least for most of them, what is your plan for PREEMPT?).

2) Disable the option at runtime only if the kernel is booted on Xen. When the kernel is booted on native, lhype, paravirt kvm, vmware, etc it should not be inhibited. This may not be feasible for all of these options, but as Zach pointed out, is easy enough for DOUBLEFAULT. Maybe it can be done for KEXEC? And HZ is easy to allow too, even though Xen will still give interrupts at 100hz -- we do this when you boot on VMI. PREEMPT is probably the only real compile time incompatible option with Xen. You just simply have to change the loop decrementors in your timer interrupt. You basically chose #2 for SMP: while your backend doesn't support it yet, it's not harmful to have the config option enabled; you just don't allow a second vcpu to startup when running on Xen.

3) Use the config XEN line I suggest above. The net effect is the same as your proposed change (it prevents users from compiling the incompatible options together), but at least the user understands what is going on, instead of options just silently changing on them.
Dan
Re: [patch 14/21] Xen-paravirt: Add XEN config options and disable unsupported config options.
Jeremy Fitzhardinge wrote: The XEN config option enables the Xen paravirt_ops interface, which is installed when the kernel finds itself running under Xen. (By some as-yet fully defined mechanism, implemented in a future patch.) Xen is no longer a sub-architecture, so the X86_XEN subarch config option has gone. The disabled config options are: - PREEMPT: Xen doesn't support it - HZ: set to 100Hz for now, to cut down on VCPU context switch rate. This will be adapted to use tickless later. - kexec: not yet supported Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Signed-off-by: Ian Pratt <[EMAIL PROTECTED]> Signed-off-by: Christian Limpach <[EMAIL PROTECTED]> Signed-off-by: Chris Wright <[EMAIL PROTECTED]> We do support different HZ values, although 100HZ is actually preferable for us, so I don't object to that. PREEMPT is supported by us, but not as tested as I would like, so I also don't object to dropping it for generic paravirt guests - Rusty - Avi any objections to dropping preempt in terms of lguest/KVM paravirtualization? Paravirt-ops definitely needs a hook for kexec, although we should not disable kexec for the natively booted paravirt-ops. Eric - is there a way to disable it at runtime? We do support the doublefault task gate, and it would be good to keep it, but I can't complain so much if it is gone from generic paravirt kernels for now, because it is non-essential, and generally fatal anyway. We do need it for native boots of paravirt-ops kernels, however, so turning off the config option still needs to be revisited. Zach - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] Fix modular AGPGART (ia64 allmodconfig)
My previous compat AGP patch broke modular AGPGART. Test built on; i386 CONFIG_AGP=y,m x86_64 CONFIG_AGP=y ia64 CONFIG_AGP=m Signed-off-by: Zwane Mwaikambo <[EMAIL PROTECTED]> Index: linux-2.6.20-mm1-ia64/drivers/char/agp/Makefile === RCS file: /home/cvsroot/linux-2.6.20-mm1/drivers/char/agp/Makefile,v retrieving revision 1.1.1.1 diff -u -p -B -r1.1.1.1 Makefile --- linux-2.6.20-mm1-ia64/drivers/char/agp/Makefile 15 Feb 2007 17:35:27 - 1.1.1.1 +++ linux-2.6.20-mm1-ia64/drivers/char/agp/Makefile 16 Feb 2007 05:47:27 - @@ -1,7 +1,10 @@ agpgart-y := backend.o frontend.o generic.o isoch.o +ifeq ($(CONFIG_COMPAT),y) +agpgart-y += compat_ioctl.o +endif + obj-$(CONFIG_AGP) += agpgart.o -obj-$(CONFIG_COMPAT) += compat_ioctl.o obj-$(CONFIG_AGP_ALI) += ali-agp.o obj-$(CONFIG_AGP_ATI) += ati-agp.o obj-$(CONFIG_AGP_AMD) += amd-k7-agp.o - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: GPL vs non-GPL device drivers
v j wrote: Greg KH has gone and made the basic sysfs interface, which any generic driver could use as EXPORT_SYMBOL_GPL. ...The point is that old functionality is being ripped off and new ones introduced, and their interfaces are not open anymore. Hmm...you keep using the word "open". What definition are you using? Because the new implementation is licensed under the GPL, which is an "Open Source" license. By definition, this means that it is "open". What I see you saying is that the interfaces are more restrictive than before. This is true. However, if you are confident that you are abiding by the terms of the GPL then there is nothing stopping you from patching the kernel to convert the EXPORT_SYMBOL_GPL to just EXPORT_SYMBOL. The _GPL version is just a hint as to the intent/opinion of the designer. The flip side of that is that using only items exported via EXPORT_SYMBOL doesn't make you automatically compliant to the GPL. You could still be infringing if the module is legally considered a derivative work of the kernel. Chris - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 13/21] Xen-paravirt: Add nosegneg capability to the vsyscall page notes
Jeremy Fitzhardinge wrote: Add the "nosegneg" fake capabilty to the vsyscall page notes. This is used by the runtime linker to select a glibc version which then disables negative-offset accesses to the thread-local segment via %gs. These accesses require emulation in Xen (because segments are truncated to protect the hypervisor address space) and avoiding them provides a measurable performance boost. Signed-off-by: Ian Pratt <[EMAIL PROTECTED]> Signed-off-by: Christian Limpach <[EMAIL PROTECTED]> Signed-off-by: Chris Wright <[EMAIL PROTECTED]> Acked-by: Zachary Amsden <[EMAIL PROTECTED]> We would like to see this by dynamic, but that is much more difficult to achieve, and seeing your recent linker issues, I don't think this should gate merging this code. The performance loss for us I believe to be negligible, and the fix is quite a bit more complicated than something achievable in the .21 timeframe. Zach - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 07/21] Xen-paravirt: remove ctor for pgd cache
Jeremy Fitzhardinge wrote: Remove the ctor for the pgd cache. There's no point in having the cache machinery do this via an indirect call when all pgd are freed in the one place anyway. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Acked-by: Zachary Amsden <[EMAIL PROTECTED]> This does introduce a bug for us, but that is trivially fixed, and I would like to inspect final merged code to make sure bugfix is proper. Zach - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: e1000_intr in request_irq faults in 2.6.20-git
On Thu, 15 Feb 2007 18:10:53 -0800 "Brandeburg, Jesse" <[EMAIL PROTECTED]> wrote:

> @@ -1431,6 +1427,10 @@ e1000_open(struct net_device *netdev)
>  		e1000_update_mng_vlan(adapter);
>  	}
>
> +	err = e1000_request_irq(adapter);
> +	if (err)
> +		goto err_req_irq;
> +
>  	/* If AMT is enabled, let the firmware know that the network
>  	 * interface is now open */
>  	if (adapter->hw.mac_type == e1000_82573 &&
> @@ -1439,10 +1439,11 @@ e1000_open(struct net_device *netdev)
>
>  	return E1000_SUCCESS;
>
> +err_req_irq:
> +	e1000_down(adapter);
> +	e1000_free_irq(adapter);
>  err_up:

We don't want that e1000_free_irq(adapter) in the error path.
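The point about the error path can be shown with a small stand-alone model: when the request step itself fails, nothing was acquired, so the unwind label should only undo the steps that succeeded before it. Everything below (struct fake_adapter, the fake_* helpers, the -16 error value) is hypothetical and exists only to make the sketch checkable. It is not the e1000 driver's code.

```c
#include <stdbool.h>

/* Hypothetical adapter model; the counters exist only so the sketch can be checked. */
struct fake_adapter {
	bool fail_request_irq;
	bool up;
	bool irq_held;
	int free_irq_calls;
};

static int fake_up(struct fake_adapter *a)
{
	a->up = true;
	return 0;
}

static void fake_down(struct fake_adapter *a)
{
	a->up = false;
}

static int fake_request_irq(struct fake_adapter *a)
{
	if (a->fail_request_irq)
		return -16;	/* stand-in for -EBUSY */
	a->irq_held = true;
	return 0;
}

static void fake_free_irq(struct fake_adapter *a)
{
	a->free_irq_calls++;
	a->irq_held = false;
}

static int fake_open(struct fake_adapter *a)
{
	int err;

	err = fake_up(a);
	if (err)
		goto err_up;

	err = fake_request_irq(a);
	if (err)
		goto err_req_irq;

	return 0;

err_req_irq:
	/* The request failed, so there is no IRQ to free: undo only the "up" step. */
	fake_down(a);
err_up:
	return err;
}

/* Normal teardown is the only place that releases the IRQ. */
static void fake_close(struct fake_adapter *a)
{
	fake_down(a);
	fake_free_irq(a);
}
```

Labels are ordered so that each one undoes exactly the steps that completed before the failing call, which is the usual kernel convention for goto-based unwinding.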
Re: GPL vs non-GPL device drivers
On 2/16/07, v j <[EMAIL PROTECTED]> wrote: This is only because of the terms of GPL. Morally, as many here have pointed out this should fall into the same category. I say it does. If you have the ability, and enjoy Linux, you should try and make the time to contribute some code or other assistance to the Linux project. Atleast we try and report genuine bugs and submit patches when necessary. Good stuff. We get abuse however because it is not clear what the terms of GPL are WRT loadable modules. If this were written in black and white and we knew what we were fighting against, this would not be an issue. We only get crap because no one here yet knows how to interpret proprietary modules loaded into the kernel. It's written in black and white, in the license. Apart from that, Greg KH has made his opinion clear, and you have said you understand and don't debate that he holds this opinion, and his code is what you said you were linking to (the sysfs/class stuff), so why do you keep saying that "it is not clear". Do you think that, somehow, Linus' opinion trumps Greg KH's opinion on his own code? Trent - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: GPL vs non-GPL device drivers
On Feb 16, 2007, Jeff Garzik <[EMAIL PROTECTED]> wrote: > Alexandre Oliva wrote: >> On Feb 15, 2007, Jeff Garzik <[EMAIL PROTECTED]> wrote: >> >>> Michael K. Edwards wrote: On 2/15/07, Jeff Garzik <[EMAIL PROTECTED]> wrote: >> > The /whole point/ of the GPL is to funnel contributions back. >> Bzzzt. The whole point of the GPL is to "guarantee your freedom to share and change free software--to make sure the software is free for all its users." >> >>> No, that's the FSF marketing fluff you've been taught to recite. >> >> The same FSF that wrote the GPL, no less ;-) >> >>> I'm referring to the original reason why Linus chose the GPL >> If he chose it for this reason, he chose the wrong license. > Strange, then, how its been so successful in funelling back contributions. There's nothing strange about it. Promoting (as opposed to mandating) contributions is a great possible, even probable consequence of the GPL, but it is far from being the whole point of the GPL. If Linus' whole point had been to funnel back contributions, assuming he fully understood the GPL back when he chose it, he'd probably have chosen a different license that *required* contributions to be funneled back. And then Linux would have remained non-Free Software. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org} - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: possible bug in page allocation mechanism
On Thu, 15 Feb 2007 14:11:42 -0800 (PST) Tim Cullen <[EMAIL PROTECTED]> wrote:

> There appears to be an inconsistency with reference
> counts on pages allocated with alloc_pages when order
> is greater than zero. In buffered_rmqueue when order
> != 0 then __rmqueue is called. This returns a page
> pointer that is really a pointer to the first page in
> a group of pages. Subsequently prep_new_page is called
> on the first page of the group but not on any others.
> This results in the first page having a reference
> count of 1 while all other pages in the allocation
> have a reference count of 0. I would think that all
> pages in the same allocation should all have the same
> reference count at the end of the allocation.
>
> I've looked at this in the 2.6.20, 2.6.19.1, and the
> 2.6.17.7 kernels. They contain the same code in this
> area.
>

That's as-designed. All page refcount manipulations on a higher-order page are supposed to be against the head pageframe.

Also, we have the compound page logic in there to force code which tries to manipulate the refcount of a tail page to be redirected to the head page.
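A toy model may help picture the answer above: only the head page of a higher-order allocation carries the reference count, and get/put operations on a tail page are redirected to the head. The struct and every toy_* function below are made up for illustration and are deliberately much simpler than the kernel's struct page and compound-page code.

```c
#include <stddef.h>

/* Toy model only: NOT the kernel's struct page. */
struct toy_page {
	int refcount;
	struct toy_page *head;	/* NULL for a head or order-0 page */
};

/* Resolve any page of the group to the page that holds the refcount. */
static struct toy_page *toy_compound_head(struct toy_page *p)
{
	return p->head ? p->head : p;
}

/* Mark pages[0 .. (1 << order) - 1] as one higher-order allocation. */
static void toy_prep_compound(struct toy_page *pages, unsigned int order)
{
	size_t i, n = (size_t)1 << order;

	pages[0].refcount = 1;	/* only the head carries the reference */
	pages[0].head = NULL;
	for (i = 1; i < n; i++) {
		pages[i].refcount = 0;
		pages[i].head = &pages[0];
	}
}

/* Taking a reference through a tail page lands on the head. */
static void toy_get_page(struct toy_page *p)
{
	toy_compound_head(p)->refcount++;
}

/* Returns 1 when the last reference to the whole group was dropped. */
static int toy_put_page(struct toy_page *p)
{
	struct toy_page *head = toy_compound_head(p);

	head->refcount--;
	return head->refcount == 0;
}
```

In this model the tail counts stay at zero for the lifetime of the allocation, which matches the behaviour described in the report.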
Re: [PATCH] Don't probe for DDC on VBE1.2
On Thu, 15 Feb 2007 21:45:06 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Thu, 15 Feb 2007 21:35:35 -0800 (PST) Zwane Mwaikambo <[EMAIL PROTECTED]> wrote:
>
> > On Thu, 15 Feb 2007, Andrew Morton wrote:
> >
> > > This makes the long-suffering-but-vigorously-defended Vaio come up with a
> > > black display. Everything's working OK otherwise. Sort of a Black Screen
> > > of Life. I wouldn't call it an improvement though.
> >
> > Bugger, what does your kernel commandline look like?
>
> Kernel command line: ro root=LABEL=/ rhgb vga=0x263 clock=pit

Removing the vga=0x263 "fixes" it.
Re: GPL vs non-GPL device drivers
> So, how are such companies any different from the myriad individuals
> and companies that use Linux on the desktop or in their server rooms
> without ever modifying it and who also contribute nothing back to the
> community? They are also, in many (most?) cases taking advantage of
> the free (as in beer) nature of Linux - saving money by using the work
> of others without returning anything, but the product builders seem to
> get a lot more abuse...

If they don't modify it and don't distribute it there is no issue.

This is only because of the terms of the GPL. Morally, as many here have pointed out, this should fall into the same category.

It's people who modify it (by creating a derived work) and then redistribute it that get the abuse.

At least we try to report genuine bugs and submit patches when necessary. We get abuse, however, because it is not clear what the terms of the GPL are WRT loadable modules. If this were written in black and white and we knew what we were fighting against, this would not be an issue. We only get crap because no one here yet knows how to interpret proprietary modules loaded into the kernel.
Re: [PATCH] Don't probe for DDC on VBE1.2
On Thu, 15 Feb 2007 21:35:35 -0800 (PST) Zwane Mwaikambo <[EMAIL PROTECTED]> wrote:

> On Thu, 15 Feb 2007, Andrew Morton wrote:
>
> > This makes the long-suffering-but-vigorously-defended Vaio come up with a
> > black display. Everything's working OK otherwise. Sort of a Black Screen
> > of Life. I wouldn't call it an improvement though.
>
> Bugger, what does your kernel commandline look like?

Kernel command line: ro root=LABEL=/ rhgb vga=0x263 clock=pit

http://userweb.kernel.org/~akpm/dmesg-sony.txt
Re: [PATCH] Don't probe for DDC on VBE1.2
On Thu, 15 Feb 2007, Andrew Morton wrote:

> This makes the long-suffering-but-vigorously-defended Vaio come up with a
> black display. Everything's working OK otherwise. Sort of a Black Screen
> of Life. I wouldn't call it an improvement though.

Bugger, what does your kernel commandline look like?
Re: [patch 1/3] Input: psmouse - create PS/2 protocol options for Kconfig
Dmitry Torokhov wrote:
> On Thursday 15 February 2007 20:30, Andrew Morton wrote:
>> On Thu, 15 Feb 2007 19:55:29 -0500
>> Andres Salomon <[EMAIL PROTECTED]> wrote:
[...]
>> Perhaps a nicer implementation would be to have a separate .c file for each
>> variant.
>>
>
> Having completely separate sub-drivers is very hard because of very delicate
> PS/2 protocol probing
>
> What do you think about patch below? It somewhat reduces #ifdef clutter in main
> module moving it in .h files...
>

Normally, I'm a fan of that sort of thing. However, in this case, I think it makes sense to have the #ifdefs right in the probe function; at least for me, it makes it easier to understand what's going on. The synaptics stuff is especially tricky; with a cursory glance over the code, one might assume that all the synaptics functions disappear when CONFIG_MOUSE_PS2_SYNAPTICS is unset. However, if the #ifdefs are in the probe function, it's pretty clear that some synaptics functions still get called even when CONFIG_MOUSE_PS2_SYNAPTICS is unset.

Just my opinion, anyway.
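The "move the #ifdef into the .h file" approach under discussion can be sketched as follows. This is an illustration only, not code from the patch: `CONFIG_TOY_EXT`, `toy_ext_detect` and `toy_probe` are hypothetical names, and the detection rule (device ID 42) is made up for the example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical Kconfig symbol; remove this line to compile the extension out. */
#define CONFIG_TOY_EXT 1

enum toy_proto { TOY_PROTO_GENERIC, TOY_PROTO_EXT };

/* ---- would live in toy_ext.h: the #ifdef is confined to the header ---- */
#ifdef CONFIG_TOY_EXT
static int toy_ext_detect(int device_id)
{
	/* Pretend that only device ID 42 speaks the extended protocol. */
	return device_id == 42 ? 0 : -ENODEV;
}
#else
static inline int toy_ext_detect(int device_id)
{
	(void)device_id;
	return -ENOSYS;		/* stub: extension not built in */
}
#endif

/* ---- would live in the main module: the call site has no #ifdef ---- */
static enum toy_proto toy_probe(int device_id)
{
	assert(device_id >= 0);
	if (toy_ext_detect(device_id) == 0)
		return TOY_PROTO_EXT;
	return TOY_PROTO_GENERIC;	/* fall back to the bare protocol */
}
```

The trade-off raised in the reply is visible here: `toy_probe()` reads cleanly, but nothing at the call site tells the reader whether `toy_ext_detect()` is the real function or the stub.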
Re: 2.6.20-mm1
On Fri, 16 Feb 2007 00:39:12 +0100 "J.A. Magallón" <[EMAIL PROTECTED]> wrote: > > > ee1394 usblp evdev > > > CPU:1 > > > EIP:0060:[]Tainted: P VLI > > > EFLAGS: 00010246 (2.6.20-jam01 #1) > > > EIP is at sysfs_lookup+0x5b/0x20a > > > eax: f6707118 ebx: f6b33e5c ecx: f6917d38 edx: 0004 > > > esi: edi: f670717c ebp: f6b33e24 esp: f6997db4 > > > ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 > > > Process udevd (pid: 3899, ti=f6996000 task=f7e34540 task.ti=f6996000) > > > Stack: f66e1800 f6707118 c016da12 f66e1800 f6707118 c02f75c0 f6707118 > > > f6997f04 > > >f6997e38 c0164238 f6997e44 c210d8c0 f6b39340 f6b393b4 f7a7d025 > > > f6997e38 > > >27692f8b f6997f04 c0165a6a f7a7d01d 000200d2 c037ddac > > > 0286 > > > Call Trace: > > > [] d_alloc+0x140/0x198 > > > [] do_lookup+0x128/0x165 > > > [] __link_path_walk+0x7e2/0xc9b > > > [] link_path_walk+0x45/0xbf > > > [] do_path_lookup+0x88/0x1cc > > > [] getname+0x90/0xad > > > [] __user_walk_fd+0x2f/0x47 > > > [] vfs_lstat_fd+0x16/0x3d > > > [] sys_lstat64+0xf/0x23 > > > [] do_page_fault+0x326/0x5e2 > > > [] do_page_fault+0x0/0x5e2 > > > [] sysenter_past_esp+0x5f/0x85 > > > [] wait_for_completion_interruptible+0xdf/0xee > > > > > > Oh dear. Any one of about 700 developers might have caused this. > > > > bisection-search will find this. Can you upload the .config please? > > > > Here it goes: > > http://belly.cps.unizar.es/~magallon/oops/config-2.6.20-jam01 Nope, can't reproduce (the bug, that is). Actually, the oops you have there is the fourth one, so we might be seeing downstream effects of oops #1. Can you please capture the first oops trace? Increasing the log buffer size or using netconsole might help. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/3] Input: psmouse - create PS/2 protocol options for Kconfig
On Thursday 15 February 2007 20:30, Andrew Morton wrote: > On Thu, 15 Feb 2007 19:55:29 -0500 > Andres Salomon <[EMAIL PROTECTED]> wrote: > > > Andrew Morton wrote: > > > On Thu, 15 Feb 2007 05:08:21 -0500 > > > Andres Salomon <[EMAIL PROTECTED]> wrote: > > > > > >> Initial framework for disabling PS/2 protocol extensions. The current > > >> protocols can only be disabled if CONFIG_EMBEDDED is selected. No > > >> source files are changed, merely build stuff. > > > > > > ugleee. What benefit do we get for all this additional maintenance > > > burden? > > > > On any platform where you know exactly what ps/2 device you'll have > > plugged in, you can cut the size of psmouse.ko in half by disabling > > protocol extensions that are not in use. > > hm, saving 4k or so. > > Oh well, let's leave that up to Dmitry. Yes I do want it. There are some protocols that are unlikely be in common boxes (OLPC driver, eGalax touchscreen, I also have one driver for a touchpad from a Chinise manufacturer) that I'd like see in mainline. However I want to be able to compile them out so psmouse stays manageable size-wise. And I am not sure where 4K fugure came from - it is more like 40K. > > > For future protocol extension additions, we can add them without making > > the psmouse driver even larger. We have one queued up for OLPC's > > touchpad which we'd like to get included in mainline, and it was made > > clear that being able to disable protocol extensions was a prerequisite > > for getting it included. The OLPC ps/2 protocol extension would be > > disabled unless CONFIG_EMBEDDED and CONFIG_MOUSE_PS2_OLPC were both enabled. > > > > See http://lkml.org/lkml/2006/9/11/200 > > Perhaps a nicer implementation would be to have a separate .c file for each > variant. > Having completely separate sub-drivers is very hard because of very delicate PS/2 protocol probing What do you think about patch below? It somewhat reduces #ifdef clutter in main module moving it in .h files... 
-- Dmitry From: Andres Salomon <[EMAIL PROTECTED]> Input: psmouse - allow disabing certain protocol extensions Allow ALPS, LOGIPS2PP, LIFEBOOK, TRACKPOINT and TOUCHKIT protocol extensions the psmouse to be disabled during compilation. This will allow users save some memory when they certain that they will only use a certain type of mice. Signed-off-by: Andres Salomon <[EMAIL PROTECTED]> Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]> --- drivers/input/mouse/Kconfig| 61 drivers/input/mouse/Makefile |9 ++- drivers/input/mouse/alps.h | 17 +- drivers/input/mouse/logips2pp.h|7 ++ drivers/input/mouse/psmouse-base.c | 12 drivers/input/mouse/synaptics.c| 92 - drivers/input/mouse/synaptics.h| 26 ++ drivers/input/mouse/touchkit_ps2.h |7 ++ drivers/input/mouse/trackpoint.h |9 +++ 9 files changed, 182 insertions(+), 58 deletions(-) Index: work/drivers/input/mouse/Kconfig === --- work.orig/drivers/input/mouse/Kconfig +++ work/drivers/input/mouse/Kconfig @@ -37,6 +37,65 @@ config MOUSE_PS2 To compile this driver as a module, choose M here: the module will be called psmouse. +config MOUSE_PS2_ALPS + bool "ALPS PS/2 mouse protocol extension" if EMBEDDED + default y + depends on MOUSE_PS2 + ---help--- + Say Y here if you have an ALPS PS/2 touchpad connected to + your system. + + If unsure, say Y. + +config MOUSE_PS2_LOGIPS2PP + bool "Logictech PS/2++ mouse protocol extension" if EMBEDDED + default y + depends on MOUSE_PS2 + ---help--- + Say Y here if you have a Logictech PS/2++ mouse connected to + your system. + + If unsure, say Y. + +config MOUSE_PS2_SYNAPTICS + bool "Synaptics PS/2 mouse protocol extension" if EMBEDDED + default y + depends on MOUSE_PS2 + ---help--- + Say Y here if you have a Synaptics PS/2 TouchPad connected to + your system. + + If unsure, say Y. 
+ +config MOUSE_PS2_LIFEBOOK + bool "Fujitsu Lifebook PS/2 mouse protocol extension" if EMBEDDED + default y + depends on MOUSE_PS2 + ---help--- + Say Y here if you have a Fujitsu B-series Lifebook PS/2 + TouchScreen connected to your system. + + If unsure, say Y. + +config MOUSE_PS2_TRACKPOINT + bool "IBM Trackpoint PS/2 mouse protocol extension" if EMBEDDED + default y + depends on MOUSE_PS2 + ---help--- + Say Y here if you have an IBM Trackpoint PS/2 mouse connected + to your system. + + If unsure, say Y. + +config MOUSE_PS2_TOUCHKIT + bool "eGalax TouchKit PS/2 protocol extension" + depends on MOUSE_PS2 + ---help--- + Say Y here if you have an eGalax TouchKit PS/2 touchscreen +
Re: [RFC PATCH(Experimental) 2/4] Revert changes to workqueue.c
On Wed, Feb 14, 2007 at 11:09:04PM +0300, Oleg Nesterov wrote: > What else you don't like? Why do you want to remove cwq_should_stop() and > restore an ugly (ugly for workqueue.c) kthread_stop/kthread_should_stop() ? What is ugly abt kthread_stop in workqueue.c? I feel it is nice if the cleanup is synchronous i.e when cpu_down() is complete, all the dead cpu's worker threads would have terminated. Otherwise we expose races between CPU_UP_PREPARE/kthread_create and the (old) thread exiting. > We can restore take_over_works(), although I don't see why this is needed. > But cwq_should_stop() will just work regardless, why do you want to add > this "wait_to_die" ... well, hack :) wait_to_die is not a new "hack"! Its already used in several other places .. > > -static DEFINE_MUTEX(workqueue_mutex); > > +static DEFINE_SPINLOCK(workqueue_lock); > > No. We can't do this. see below. Ok .. > > struct workqueue_struct *__create_workqueue(const char *name, > > int singlethread, int freezeable) > > { > > @@ -798,17 +756,20 @@ struct workqueue_struct *__create_workqu > > INIT_LIST_HEAD(>list); > > cwq = init_cpu_workqueue(wq, singlethread_cpu); > > err = create_workqueue_thread(cwq, singlethread_cpu); > > + if (!err) > > + wake_up_process(cwq->thread); > > } else { > > - mutex_lock(_mutex); > > + spin_lock(_lock); > > list_add(>list, ); > > - > > - for_each_possible_cpu(cpu) { > > + spin_unlock(_lock); > > + for_each_online_cpu(cpu) { > > cwq = init_cpu_workqueue(wq, cpu); > > - if (err || !(cpu_online(cpu) || cpu == embryonic_cpu)) > > - continue; > > err = create_workqueue_thread(cwq, cpu); > > + if (err) > > + break; > > No, we can't break. We are going to execute destroy_workqueue(), it will > iterate over all cwqs. and try to kthread_stop() uninitialized cwq->thread? How abt retaining the break above but setting cwq->thread = NULL in create_workqueue_thread in failure case? 
> > +static void take_over_work(struct workqueue_struct *wq, unsigned int cpu) > > +{ > > +} > > I think this is unneeded complication, but ok, should work. This is required if we want to stop per-cpu threads synchronously. > > + case CPU_DEAD: > > + list_for_each_entry(wq, , list) > > + take_over_work(wq, hotcpu); > > + break; > > + > > + case CPU_DEAD_KILL_THREADS: > > + list_for_each_entry(wq, , list) > > + cleanup_workqueue_thread(wq, hotcpu); > > } > > Both CPU_UP_CANCELED and CPU_DEAD_KILL_THREADS runs after thaw_processes(), > this means that workqueue_cpu_callback() is racy wrt create/destroy workqueue, > we should take the mutex, and it can't be spinlock_t. Ok yes ..thanks for pointing out! -- Regards, vatsa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: e1000_intr in request_irq faults in 2.6.20-git
[Adding Dimitri Mishin to the CC - he proposed the same patch earlier] Len Brown wrote: On Thursday 15 February 2007 21:10, Brandeburg, Jesse wrote: Eric W. Biederman wrote: Len Brown <[EMAIL PROTECTED]> writes: e1000 faults in 2.6.20-git, while 2.6.20 worked fine. System is a D875PBZ with LOM. clues? I'm guessing this is an old bug found by the following bit of debug coded added into since v2.6.20 +#ifdef CONFIG_DEBUG_SHIRQ + if (irqflags & IRQF_SHARED) { + /* +* It's a shared IRQ -- the driver ought to be prepared for it +* to happen immediately, so let's make sure +* We do this before actually registering it, to make sure that +* a 'real' IRQ doesn't run in parallel with our fake +*/ + if (irqflags & IRQF_DISABLED) { + unsigned long flags; + + local_irq_save(flags); + handler(irq, dev_id); + local_irq_restore(flags); + } else + handler(irq, dev_id); + } +#endif I don't have a clue why the e1000 wasn't ready though. our code is clearly calling request_irq before we have assigned the function pointer adapter->clean_rx as well as adapter->alloc_rx_buf That would be a bug, a possible patch would be (inline and attached): compile tested, *but* I couldn't test this patch to make sure it worked because I couldn't boot 2.6.20-git due to it not finding my RAID0 + lvm disk. 
[PATCH] e1000: fix shared interrupt warning message From: Jesse Brandeburg <[EMAIL PROTECTED]> Signed-off-by: Jesse Brandeburg <[EMAIL PROTECTED]> --- drivers/net/e1000/e1000_main.c | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 619c892..b8c4d5c 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -1417,10 +1417,6 @@ e1000_open(struct net_device *netdev) if ((err = e1000_setup_all_rx_resources(adapter))) goto err_setup_rx; - err = e1000_request_irq(adapter); - if (err) - goto err_req_irq; - e1000_power_up_phy(adapter); if ((err = e1000_up(adapter))) @@ -1431,6 +1427,10 @@ e1000_open(struct net_device *netdev) e1000_update_mng_vlan(adapter); } + err = e1000_request_irq(adapter); + if (err) + goto err_req_irq; + /* If AMT is enabled, let the firmware know that the network * interface is now open */ if (adapter->hw.mac_type == e1000_82573 && @@ -1439,10 +1439,11 @@ e1000_open(struct net_device *netdev) return E1000_SUCCESS; +err_req_irq: + e1000_down(adapter); + e1000_free_irq(adapter); err_up: e1000_power_down_phy(adapter); - e1000_free_irq(adapter); -err_req_irq: e1000_free_all_rx_resources(adapter); err_setup_rx: e1000_free_all_tx_resources(adapter); Works for me(tm) on latest 2.6.20-git and D875PBZ. If there are no objections I'll push this patch to Jeff Garzik together with two other changes I have for him. Cheers, Auke - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
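The ordering problem discussed in this thread can be replayed in a small userspace model. It is a sketch under stated assumptions, not driver code: `toy_adapter`, `toy_request_irq`, `toy_up` and the two `toy_open_*` functions are hypothetical, and the model counts a "fault" where the real handler would have called through an unset function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy adapter: clean_rx is only valid once the device has been brought up. */
struct toy_adapter {
	void (*clean_rx)(struct toy_adapter *a);
	int cleaned;	/* successful handler runs */
	int faults;	/* handler runs that found clean_rx unset */
};

static void toy_clean_rx(struct toy_adapter *a)
{
	a->cleaned++;
}

/* Interrupt handler: depends on clean_rx having been assigned. */
static void toy_intr(void *dev_id)
{
	struct toy_adapter *a = dev_id;

	if (a->clean_rx == NULL) {
		a->faults++;	/* the real driver would oops here */
		return;
	}
	a->clean_rx(a);
}

/* Mimics the debug check: a shared IRQ fires once during registration. */
static int toy_request_irq(void (*handler)(void *), int shared, void *dev_id)
{
	assert(handler != NULL);
	if (shared)
		handler(dev_id);
	return 0;
}

/* Stand-in for the bring-up step that assigns the function pointers. */
static void toy_up(struct toy_adapter *a)
{
	a->clean_rx = toy_clean_rx;
}

/* Original order: register the IRQ, then bring the device up. */
static int toy_open_irq_first(struct toy_adapter *a)
{
	int err = toy_request_irq(toy_intr, 1, a);

	toy_up(a);
	return err;
}

/* Patched order: bring the device up, then register the IRQ. */
static int toy_open_irq_last(struct toy_adapter *a)
{
	toy_up(a);
	return toy_request_irq(toy_intr, 1, a);
}
```

With the first ordering the early handler call finds `clean_rx` unset; with the second it runs normally, which is the effect the posted patch aims for by moving `e1000_request_irq()` after `e1000_up()`.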
Re: BUG: warning at kernel/cpu.c:51/unlock_cpu_hotplug() - 2.6.18.6
Hi Antoine,

On Thu, Feb 15, 2007 at 08:59:05PM +, Antoine Martin wrote:
> I just caught this in the log whilst running some unit tests.
> (the test was in the process of starting 900 Java threads)
>
> audit(1171565587.887:96): enforcing=0 old_enforcing=1 auid=4294967295
> BUG: warning at kernel/cpu.c:51/unlock_cpu_hotplug()
>
> Call Trace:
> [] dump_stack+0x12/0x17
> [] unlock_cpu_hotplug+0x3f/0x6c
> [] sched_getaffinity+0x7d/0xa2
> [] sys_sched_getaffinity+0x26/0x55
> [] system_call+0x7e/0x83

Looks like one of those spurious unlock_cpu_hotplug warnings. Let me know if this fix works for you or not.

regards
gautham.

Signed-off-by: Gautham R Shenoy <[EMAIL PROTECTED]>

 kernel/cpu.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.18/kernel/cpu.c
===
--- linux-2.6.18.orig/kernel/cpu.c
+++ linux-2.6.18/kernel/cpu.c
@@ -53,8 +53,8 @@ void unlock_cpu_hotplug(void)
 		recursive_depth--;
 		return;
 	}
-	mutex_unlock(&cpu_bitmask_lock);
 	recursive = NULL;
+	mutex_unlock(&cpu_bitmask_lock);
 }
 EXPORT_SYMBOL_GPL(unlock_cpu_hotplug);

--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain, because Freedom is priceless!"
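One reading of why the two statements are swapped: if the mutex is released before the owner field is cleared, another task can take the lock and record itself as owner in the gap, and the late clear then wipes that out, so the second task's own unlock trips the owner check. The single-threaded replay below walks through that interleaving. It is a sketch, not kernel code; `toy_lock` and the helper names are hypothetical, and the "tasks" are just labels.

```c
#include <assert.h>
#include <stddef.h>

struct toy_lock {
	int held;		/* stands in for the mutex */
	const char *recursive;	/* task recorded as owner, or NULL */
	int warnings;		/* number of failed owner checks */
};

static void toy_lock_acquire(struct toy_lock *l, const char *task)
{
	assert(!l->held);	/* replay is sequential: never contended */
	l->held = 1;
	l->recursive = task;
}

/* The two halves of the unlock path, callable in either order. */
static void toy_release_mutex(struct toy_lock *l)
{
	l->held = 0;
}

static void toy_clear_owner(struct toy_lock *l)
{
	l->recursive = NULL;
}

/* Owner check done on entry to unlock; counts a warning on mismatch. */
static void toy_check_owner(struct toy_lock *l, const char *task)
{
	if (l->recursive != task)
		l->warnings++;
}

/* Task A unlocks while task B locks; returns the warnings B ends up with. */
static int toy_replay(int clear_owner_first)
{
	static const char a[] = "A", b[] = "B";
	struct toy_lock l = { 0, NULL, 0 };

	toy_lock_acquire(&l, a);
	toy_check_owner(&l, a);			/* A enters unlock: check passes */
	if (clear_owner_first) {
		toy_clear_owner(&l);		/* patched order */
		toy_release_mutex(&l);
		toy_lock_acquire(&l, b);	/* B can only get in afterwards */
	} else {
		toy_release_mutex(&l);		/* original order */
		toy_lock_acquire(&l, b);	/* B slips in between the steps */
		toy_clear_owner(&l);		/* A wipes out B's ownership */
	}
	toy_check_owner(&l, b);			/* B enters unlock */
	return l.warnings;
}
```

`toy_replay(0)` models the original statement order and `toy_replay(1)` the patched one; only the former leaves B with a failed owner check.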
[PATCH] i386: irq: Kill IRQ compression
This code makes simple systems complex: ACPI: PCI Interrupt :03:04.0[A] -> GSI 18 (level, low) -> IRQ 16 ACPI: PCI Interrupt :00:1d.0[A] -> GSI 16 (level, low) -> IRQ 17 ACPI: PCI Interrupt :00:1d.1[B] -> GSI 19 (level, low) -> IRQ 18 ACPI: PCI Interrupt :00:1d.7[D] -> GSI 23 (level, low) -> IRQ 19 The same code was already removed from x86_64 Signed-off-by: Len Brown <[EMAIL PROTECTED]> Cc: Eric W. Biederman <[EMAIL PROTECTED]> Cc: Natalie Protasevich <[EMAIL PROTECTED]> --- arch/i386/kernel/io_apic.c |5 arch/i386/kernel/mpparse.c | 41 - include/asm-i386/io_apic.h |1 3 files changed, 1 insertion(+), 46 deletions(-) Index: linus/arch/i386/kernel/mpparse.c === --- linus.orig/arch/i386/kernel/mpparse.c +++ linus/arch/i386/kernel/mpparse.c @@ -1041,20 +1041,12 @@ void __init mp_config_acpi_legacy_irqs ( } } -#define MAX_GSI_NUM4096 - int mp_register_gsi(u32 gsi, int triggering, int polarity) { int ioapic = -1; int ioapic_pin = 0; int idx, bit = 0; static int pci_irq = 16; - /* -* Mapping between Global System Interrups, which -* represent all possible interrupts, and IRQs -* assigned to actual devices. -*/ - static int gsi_to_irq[MAX_GSI_NUM]; /* Don't set up the ACPI SCI because it's already set up */ if (acpi_gbl_FADT.sci_interrupt == gsi) @@ -1087,42 +1079,11 @@ int mp_register_gsi(u32 gsi, int trigger if ((1< 15), but -* avoid a problem where the 8254 timer (IRQ0) is setup -* via an override (so it's not on pin 0 of the ioapic), -* and at the same time, the pin 0 interrupt is a PCI -* type. The gsi > 15 test could cause these two pins -* to be shared as IRQ0, and they are not shareable. -* So test for this condition, and if necessary, avoid -* the pin collision. 
-*/ - if (gsi > 15 || (gsi == 0 && !timer_uses_ioapic_pin_0)) - gsi = pci_irq++; - /* -* Don't assign IRQ used by ACPI SCI -*/ - if (gsi == acpi_gbl_FADT.sci_interrupt) - gsi = pci_irq++; - gsi_to_irq[irq] = gsi; - } else { - printk(KERN_ERR "GSI %u is too high\n", gsi); - return gsi; - } - } - io_apic_set_pci_routing(ioapic, ioapic_pin, gsi, triggering == ACPI_EDGE_SENSITIVE ? 0 : 1, polarity == ACPI_ACTIVE_HIGH ? 0 : 1); Index: linus/arch/i386/kernel/io_apic.c === --- linus.orig/arch/i386/kernel/io_apic.c +++ linus/arch/i386/kernel/io_apic.c @@ -2212,8 +2212,6 @@ static inline void unlock_ExtINT_logic(v ioapic_write_entry(apic, pin, entry0); } -int timer_uses_ioapic_pin_0; - /* * This code may look a bit paranoid, but it's supposed to cooperate with * a wide range of boards and BIOS bugs. Fortunately only the timer IRQ @@ -2250,9 +2248,6 @@ static inline void __init check_timer(vo pin2 = ioapic_i8259.pin; apic2 = ioapic_i8259.apic; - if (pin1 == 0) - timer_uses_ioapic_pin_0 = 1; - printk(KERN_INFO "..TIMER: vector=0x%02X apic1=%d pin1=%d apic2=%d pin2=%d\n", vector, apic1, pin1, apic2, pin2); Index: linus/include/asm-i386/io_apic.h === --- linus.orig/include/asm-i386/io_apic.h +++ linus/include/asm-i386/io_apic.h @@ -142,7 +142,6 @@ extern int io_apic_get_unique_id (int io extern int io_apic_get_version (int ioapic); extern int io_apic_get_redir_entries (int ioapic); extern int io_apic_set_pci_routing (int ioapic, int pin, int irq, int edge_level, int active_high_low); -extern int timer_uses_ioapic_pin_0; #endif /* CONFIG_ACPI */ extern int (*ioapic_renumber_irq)(int ioapic, int irq); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
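To make the effect of the removed code concrete, here is a toy model of the renumbering next to the identity mapping that remains once it is gone. It is a simplified sketch: `toy_irq_map` and the function names are hypothetical, and the special cases in the removed code (the ACPI SCI and the timer on IOAPIC pin 0) are deliberately left out.

```c
#include <assert.h>

#define TOY_MAX_GSI 64

struct toy_irq_map {
	int next_pci_irq;		/* next compressed IRQ to hand out */
	int gsi_to_irq[TOY_MAX_GSI];	/* 0 means "not assigned yet" */
};

static void toy_map_init(struct toy_irq_map *m)
{
	m->next_pci_irq = 16;
	for (int i = 0; i < TOY_MAX_GSI; i++)
		m->gsi_to_irq[i] = 0;
}

/* Old behaviour: GSIs above 15 get IRQ numbers in order of registration. */
static int toy_compressed_irq(struct toy_irq_map *m, int gsi)
{
	assert(gsi >= 0 && gsi < TOY_MAX_GSI);
	if (gsi <= 15)
		return gsi;		/* legacy range is left alone */
	if (m->gsi_to_irq[gsi] == 0)
		m->gsi_to_irq[gsi] = m->next_pci_irq++;
	return m->gsi_to_irq[gsi];
}

/* New behaviour: the IRQ number is simply the GSI. */
static int toy_identity_irq(int gsi)
{
	return gsi;
}
```

Registering GSIs 18, 16, 19 and 23 in that order with `toy_compressed_irq()` yields IRQs 16, 17, 18 and 19, the same pairs as in the log at the top of the patch description; `toy_identity_irq()` would report 18, 16, 19 and 23.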
Re: GPL vs non-GPL device drivers
Alexandre Oliva wrote:
> On Feb 15, 2007, Jeff Garzik <[EMAIL PROTECTED]> wrote:
>
>> Michael K. Edwards wrote:
>>> On 2/15/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
>>>> The /whole point/ of the GPL is to funnel contributions back.
>>> Bzzzt. The whole point of the GPL is to "guarantee your freedom to
>>> share and change free software--to make sure the software is free
>>> for all its users."
>
>> No, that's the FSF marketing fluff you've been taught to recite.
>
> The same FSF that wrote the GPL, no less ;-)
>
>> In the context of the Linux kernel, I'm referring to the original
>> reason why Linus chose the GPL for the Linux kernel.
>
> If he chose it for this reason, he chose the wrong license. The GPL

Strange, then, how it's been so successful in funnelling back contributions.

	Jeff
Re: loosen dependancy on rtc cmos
On Thursday 15 February 2007 8:38 pm, Len Brown wrote:

> So I've taken Andi's advice and checked in the patches below.

OK; that simplifies things for me, good! I can discard that patch (broken by Andi's pcspkr change anyway), stop worrying about whether most folk will even see that driver, and make time to look at the ACPI hooks for RTC wakeup, instead. :)

But it would be nice to see the PNP bus infrastructure upgraded in various ways too, now that its availability is less iffy.

 - Add something analogous to platform_driver_probe() so that the init code can be removed after it runs.

 - Add shutdown() calls to the PNP bus. Otherwise e.g. one must use shutdown notifiers for PNP interfaces, while normal driver model code works for other interfaces to such hardware.

Transitioning from "legacy" drivers to PNP (for PCs) and platform_bus (other platforms) is a bit awkward because of differences like those. Drivers still need too many different modes; it's too complicated to have a core with different bus glues as thin veneers ... the glue must be thicker (and thus error prone). Similarly, I/O space resource reservation acts differently.

I can hope that having PNPACPI be more common will start nudging more drivers to get rid of their "legacy" modes, or at least focus on "Real Driver" modes that don't involve poking at hardware and hoping it doesn't bite back. ;)

- Dave

> commit 243b66e76ab722cdec1921d7f80c0cb808131c37
> Author: Len Brown <[EMAIL PROTECTED]>
> Date: Thu Feb 15 22:34:36 2007 -0500
>
> ACPI: always enable CONFIG_PNPACPI on CONFIG_ACPI kernels
>
> We removed the ACPI motherboard driver which handled
> the ACPI=y, PNP=n case, so now we need to enforce that
> PNP & PNPACPI are always enabled for ACPI kernels.
>
> Most major distros ship this way already.
> > Cc: Bjorn Helgaas <[EMAIL PROTECTED]> > Signed-off-by: Len Brown <[EMAIL PROTECTED]> > > commit 8d4956c201c2f7683289f70095443c59a39f94ef > Author: Len Brown <[EMAIL PROTECTED]> > Date: Thu Feb 15 22:46:42 2007 -0500 > > ACPI: remove non-PNPACPI version of get_rtc_dev() > > It isn't needed in ACPI code anymore because > now ACPI always includes PNPACPI. > > Cc: David Brownell <[EMAIL PROTECTED]> > Signed-off-by: Len Brown <[EMAIL PROTECTED]> > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Don't probe for DDC on VBE1.2
On Thu, 15 Feb 2007 08:29:49 -0800 (PST) Zwane Mwaikambo <[EMAIL PROTECTED]> wrote: > VBE1.2 doesn't support function 15h (DDC) resulting in a 'hang' whilst > uncompressing kernel with some video cards. Make sure we check VBE version > before fiddling around with DDC. > > http://bugzilla.kernel.org/show_bug.cgi?id=1458 > > Opened: 2003-10-30 09:12 Last update: 2007-02-13 22:03 > > :( > > Much thanks to Tobias Hain for help in testing and investigating the bug. > Tested on; > > i386, Chips & Technologies 65548 VESA VBE 1.2 > CONFIG_VIDEO_SELECT=Y > CONFIG_FIRMWARE_EDID=Y > > Untested on x86_64. > > Signed-off-by: Zwane Mwaikambo <[EMAIL PROTECTED]> > > Index: linux-2.6.20-rc6-mm1/arch/i386/boot/video.S > === > RCS file: /home/cvsroot/linux-2.6.20-rc6-mm1/arch/i386/boot/video.S,v > retrieving revision 1.1.1.1 > diff -u -p -B -r1.1.1.1 video.S > --- linux-2.6.20-rc6-mm1/arch/i386/boot/video.S 30 Jan 2007 05:28:31 > - 1.1.1.1 > +++ linux-2.6.20-rc6-mm1/arch/i386/boot/video.S 15 Feb 2007 16:27:32 > - > @@ -1945,6 +1945,20 @@ store_edid: > rep > stosl > > + pushw %es > + pushw %ds > + popw%es > + leawmodelist+1024, %di > + movw$0x4f00, %ax > + int $0x10 > + popw%es > + > + cmpw$0x004f, %ax > + jne no_edid > + > + cmpw$0x0102, 4(%di) # only do EDID on > 1.2 > + je no_edid > + > pushw %es # save ES > xorw%di, %di# Report Capability > pushw %di This makes the long-suffering-but-vigorously-defended Vaio come up with a black display. Everything's working OK otherwise. Sort of a Black Screen of Life. I wouldn't call it an improvement though. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
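For readers who do not follow the real-mode assembly, the check added by the quoted patch can be sketched in C. Assumptions are flagged here and in the comments: the function names are hypothetical, the layout used is the VBE controller info block with its version word at offset 4 (the same offset the patch reads via `4(%di)`), and the sketch skips DDC for every version up to 1.2, whereas the posted assembly skips it only when the version is exactly 0x0102. Whether the stricter comparison is wanted is not settled in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VBE_OK 0x004f	/* AX value the patch treats as success */

/* Read the little-endian version word at offset 4 of the info block. */
static uint16_t toy_vbe_version(const uint8_t *info)
{
	return (uint16_t)(info[4] | (info[5] << 8));
}

/*
 * Decide whether to go on to the DDC/EDID call.
 * Assumption: anything at or below version 1.2 (0x0102) is skipped;
 * the posted patch only tests for equality with 0x0102.
 */
static int toy_should_probe_ddc(uint16_t ax, const uint8_t *info)
{
	assert(info != 0);
	if (ax != TOY_VBE_OK)
		return 0;	/* the info call itself failed */
	return toy_vbe_version(info) > 0x0102;
}
```

A buffer starting with `'V','E','S','A'` followed by the bytes `0x02, 0x01` decodes to version 0x0102 and is skipped; one carrying `0x00, 0x02` decodes to 0x0200 and is probed.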
Re: [ANNOUNCE] DualFS: File System with Meta-data and Data Separation
Hi,

On Fri, 16 Feb 2007, Andi Kleen wrote:

If you stripe two disks with a standard fs versus use one of them as metadata volume and the other as data volume with dualfs i would expect the striped variant usually be faster because it will give parallelism not only to data versus metadata, but also to all data versus other data.

If you have a RAID system, both the data and meta-data devices of DualFS can be striped, and you get the same result. No problem for DualFS :)

Sure, but then you need four disks. And if your workload happens to be much more data intensive than metadata intensive the striped spindles assigned to metadata only will be more idle than the ones doing data. Striping everything from the same pool has the potential to adapt itself to any workload mix better.

Why do you need four disks? Data and meta-data devices of DualFS can be on different disks, can be two partitions of the same disk, or can be two areas of the same partition. The important thing is that data and meta-data blocks are separated and that they are managed in different ways. Please, take a look at the presentation (see below).

I can see that you win for some specific workloads, but it is hard to see how you can win over a wide range of workloads because of that.

No, we win for a wide range of common workloads. See the results in the PDF (see below).

Also I would expect your design to be slow for metadata read intensive workloads. E.g. have you tried to boot a root partition with dual fs? That's a very important IO benchmark for desktop Linux systems.

I do not think so. The performance of DualFS is superb in meta-data read intensive workloads. And it is also better than the performance of other file systems when reading a directory tree with several copies of the Linux kernel source code (I showed those results on Tuesday at the LSF07 workshop).

PDFs available?

Sure: http://www.ditec.um.es/~piernas/dualfs/presentation-lsf07-final.pdf

Is that with running a LFS style cleaner in between or without?

'With' a cleaner.

I would be interested in a "install distro with installer ; boot afterwards from it" type benchmark. Do you have something like this? -Andi

I think that the results sent by Sorin answer your question :-)

Regards,

Juan.

--
D. Juan Piernas Cánovas
Departamento de Ingeniería y Tecnología de Computadores
Facultad de Informática. Universidad de Murcia
Campus de Espinardo - 30080 Murcia (SPAIN)
Tel.: +34968367657 Fax: +34968364151
email: [EMAIL PROTECTED]
PGP public key: http://pgp.rediris.es:11371/pks/lookup?search=piernas%40ditec.um.es=index
*** Please send me your documents in text, HTML, PDF or PostScript format :-) ***
Re: sata_nv ADMA controller lockup investigation
Jeff Garzik wrote:
> Robert Hancock wrote:
>> It's curious that only the post-cache-flush command is having issues,
>> and normal NCQ operation seems fine. Maybe it's related to that tag 0
>> being reused repeatedly?
>
> If you take cache flush out of the equation, what happens when NCQ is
> enabled with a queue depth of 1 (to reproduce tag-0-used-repeatedly
> condition)?
>
> Jeff

I was able to reproduce it in the same test case with NCQ depth set to 1.
Of course, there were still some cache flushes going on there, so I'm not
certain that test really told us anything new.

I'm rather doubtful now that it's related to reusing the same tag, though,
based on the tests I've been doing. It may have something to do with
switching between NCQ and non-NCQ commands; maybe the controller isn't able
to handle doing that too rapidly.

This patch seems to fix the problem, or at least it hasn't failed the tests
that I've run so far. It's not really ideal though, so I'd like to do some
more investigation/testing before proclaiming it as a fix. Experimentally,
it appears that 10 microseconds is not enough delay, but 20 seems to work
better. Hints from the peanut gallery remain welcome.

--- linux-2.6.20-git6edit/drivers/ata/sata_nv.c.before_hacking	2007-02-15 18:19:13.0 -0600
+++ linux-2.6.20-git6edit/drivers/ata/sata_nv.c	2007-02-15 22:36:02.0 -0600
@@ -219,6 +219,7 @@
 	void __iomem *		gen_block;
 	void __iomem *		notifier_clear_block;
 	u8			flags;
+	int			last_issue_ncq;
 };
 
 struct nv_host_priv {
@@ -1260,6 +1261,7 @@
 {
 	struct nv_adma_port_priv *pp = qc->ap->private_data;
 	void __iomem *mmio = pp->ctl_block;
+	int curr_ncq = (qc->tf.protocol == ATA_PROT_NCQ);
 
 	VPRINTK("ENTER\n");
 
@@ -1274,6 +1276,14 @@
 	/* write append register, command tag in lower 8 bits
 	   and (number of cpbs to append -1) in top 8 bits */
 	wmb();
+
+	if(curr_ncq != pp->last_issue_ncq) {
+		/* Seems to need some delay before switching between NCQ and non-NCQ
+		   commands, else we get command timeouts and such. */
+		udelay(20);
+		pp->last_issue_ncq = curr_ncq;
+	}
+
 	writew(qc->tag, mmio + NV_ADMA_APPEND);
 
 	DPRINTK("Issued tag %u\n",qc->tag);
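The workaround in the patch above boils down to remembering the type of the last issued command and pausing only when the type changes. A minimal user-space sketch of that bookkeeping follows. The names port_state and switch_delay_us are made up for illustration and are not part of sata_nv; only the 20 microsecond figure comes from the mail.

```c
#include <stdbool.h>

/* Hypothetical per-port state; stands in for the last_issue_ncq field. */
struct port_state {
	bool last_issue_ncq;
};

/*
 * Return the delay (in microseconds) to apply before issuing a command
 * and record the command type for the next call.  A delay is only
 * requested when switching between NCQ and non-NCQ commands.
 */
static unsigned int switch_delay_us(struct port_state *ps, bool curr_ncq)
{
	if (curr_ncq == ps->last_issue_ncq)
		return 0;

	ps->last_issue_ncq = curr_ncq;
	return 20;	/* value found experimentally in the mail above */
}
```

A run of commands of the same type pays nothing; only the first command after a mode change is delayed.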
[patch 10/21] Xen-paravirt: add hooks to intercept mm creation and destruction
Add hooks to allow a paravirt implementation to track the lifetime of an mm. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> -- arch/i386/kernel/paravirt.c|4 include/asm-i386/mmu_context.h |8 ++-- include/asm-i386/paravirt.h| 38 ++ kernel/fork.c |3 +++ mm/mmap.c |4 5 files changed, 55 insertions(+), 2 deletions(-) === --- a/arch/i386/kernel/paravirt.c +++ b/arch/i386/kernel/paravirt.c @@ -706,6 +706,10 @@ struct paravirt_ops paravirt_ops = { .irq_enable_sysexit = native_irq_enable_sysexit, .iret = native_iret, + .dup_mmap = (void *)native_nop, + .exit_mmap = (void *)native_nop, + .activate_mm = (void *)native_nop, + .startup_ipi_hook = (void *)native_nop, }; === --- a/include/asm-i386/mmu_context.h +++ b/include/asm-i386/mmu_context.h @@ -5,6 +5,7 @@ #include #include #include +#include /* * Used for LDT copy/destruction. @@ -65,7 +66,10 @@ static inline void switch_mm(struct mm_s #define deactivate_mm(tsk, mm) \ asm("movl %0,%%gs": :"r" (0)); -#define activate_mm(prev, next) \ - switch_mm((prev),(next),NULL) +#define activate_mm(prev, next)\ + do {\ + arch_activate_mm(prev, next); \ + switch_mm((prev),(next),NULL); \ + } while(0); #endif === --- a/include/asm-i386/paravirt.h +++ b/include/asm-i386/paravirt.h @@ -126,6 +126,12 @@ struct paravirt_ops void (*io_delay)(void); void (*const_udelay)(unsigned long loops); + void (fastcall *activate_mm)(struct mm_struct *prev, +struct mm_struct *next); + void (fastcall *dup_mmap)(struct mm_struct *oldmm, + struct mm_struct *mm); + void (fastcall *exit_mmap)(struct mm_struct *mm); + #ifdef CONFIG_X86_LOCAL_APIC void (*apic_write)(unsigned long reg, unsigned long v); void (*apic_write_atomic)(unsigned long reg, unsigned long v); @@ -429,6 +435,24 @@ static inline void startup_ipi_hook(int } #endif +#define __HAVE_ARCH_MM_LIFETIME +static inline void arch_activate_mm(struct mm_struct *prev, + struct mm_struct *next) +{ + paravirt_ops.activate_mm(prev, next); +} + +static inline void arch_dup_mmap(struct mm_struct 
*oldmm, +struct mm_struct *mm) +{ + paravirt_ops.dup_mmap(oldmm, mm); +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ + paravirt_ops.exit_mmap(mm); +} + #define __flush_tlb() paravirt_ops.flush_tlb_user() #define __flush_tlb_global() paravirt_ops.flush_tlb_kernel() #define __flush_tlb_single(addr) paravirt_ops.flush_tlb_single(addr) @@ -673,5 +697,20 @@ static inline void paravirt_pagetable_se set_pgd([0], base[USER_PTRS_PER_PGD]); #endif } + +static inline void paravirt_activate_mm(struct mm_struct *prev, + struct mm_struct *next) +{ +} + +static inline void paravirt_dup_mmap(struct mm_struct *oldmm, +struct mm_struct *mm) +{ +} + +static inline void paravirt_exit_mmap(struct mm_struct *mm) +{ +} + #endif /* CONFIG_PARAVIRT */ #endif /* __ASM_PARAVIRT_H */ === --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -374,6 +374,12 @@ struct mm_struct { rwlock_tioctx_list_lock; struct kioctx *ioctx_list; }; + +#ifndef __HAVE_ARCH_MM_LIFETIME +#define arch_activate_mm(prev, next) do {} while(0) +#define arch_dup_mmap(oldmm, mm) do {} while(0) +#define arch_exit_mmap(mm) do {} while(0) +#endif struct sighand_struct { atomic_tcount; === --- a/kernel/fork.c +++ b/kernel/fork.c @@ -286,6 +286,7 @@ static inline int dup_mmap(struct mm_str if (retval) goto out; } + arch_dup_mmap(oldmm, mm); retval = 0; out: up_write(>mmap_sem); === --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1976,6 +1976,8 @@ void exit_mmap(struct mm_struct *mm) struct vm_area_struct *vma = mm->mmap; unsigned long nr_accounted = 0; unsigned long end; + + arch_exit_mmap(mm); lru_add_drain(); flush_cache_mm(mm); -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at
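The hook mechanism in the patch above is an ops table whose entries default to no-ops, with thin arch_* wrappers at the call sites. A rough stand-alone sketch of that pattern is given below. struct mm, struct mm_hooks and the counting_* backend are hypothetical stand-ins, and typed no-op functions are used here instead of the patch's (void *)native_nop casts.

```c
/* Stand-in for struct mm_struct. */
struct mm { int id; };

/* Table of lifetime hooks, mirroring the three new paravirt_ops entries. */
struct mm_hooks {
	void (*activate_mm)(struct mm *prev, struct mm *next);
	void (*dup_mmap)(struct mm *oldmm, struct mm *mm);
	void (*exit_mmap)(struct mm *mm);
};

/* Native defaults: do nothing. */
static void nop_activate(struct mm *prev, struct mm *next) { (void)prev; (void)next; }
static void nop_dup(struct mm *oldmm, struct mm *mm) { (void)oldmm; (void)mm; }
static void nop_exit(struct mm *mm) { (void)mm; }

static struct mm_hooks hooks = {
	.activate_mm = nop_activate,
	.dup_mmap    = nop_dup,
	.exit_mmap   = nop_exit,
};

/* A hypothetical backend that tracks how many mms are alive. */
static int live_mms;
static void counting_dup(struct mm *oldmm, struct mm *mm) { (void)oldmm; (void)mm; live_mms++; }
static void counting_exit(struct mm *mm) { (void)mm; live_mms--; }

/* Call sites go through wrappers, as arch_dup_mmap() etc. do in the patch. */
static void arch_activate_mm(struct mm *prev, struct mm *next) { hooks.activate_mm(prev, next); }
static void arch_dup_mmap(struct mm *oldmm, struct mm *mm) { hooks.dup_mmap(oldmm, mm); }
static void arch_exit_mmap(struct mm *mm) { hooks.exit_mmap(mm); }
```

A backend installs itself by overwriting the table entries (for example hooks.dup_mmap = counting_dup); generic code keeps calling the wrappers and never needs to know which backend is active.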
[patch 18/21] Xen-paravirt: Add Xen grant table support
Add Xen 'grant table' driver which allows granting of access to selected local memory pages by other virtual machines and, symmetrically, the mapping of remote memory pages which other virtual machines have granted access to. This driver is a prerequisite for many of the Xen virtual device drivers, which grant the 'device driver domain' restricted and temporary access to only those memory pages that are currently involved in I/O operations. Signed-off-by: Ian Pratt <[EMAIL PROTECTED]> Signed-off-by: Christian Limpach <[EMAIL PROTECTED]> Signed-off-by: Chris Wright <[EMAIL PROTECTED]> --- drivers/xen/Makefile |1 drivers/xen/core/Makefile |1 drivers/xen/core/grant_table.c | 446 +++ include/xen/grant_table.h | 107 + 4 files changed, 555 insertions(+) === --- a/drivers/xen/Makefile +++ b/drivers/xen/Makefile @@ -1,1 +1,2 @@ obj-y += console/ +obj-y += core/ obj-y += console/ === --- /dev/null +++ b/drivers/xen/core/Makefile @@ -0,0 +1,1 @@ +obj-y += grant_table.o === --- /dev/null +++ b/drivers/xen/core/grant_table.c @@ -0,0 +1,446 @@ +/** + * grant_table.c + * + * Granting foreign access to our memory reservation. 
+ * + * Copyright (c) 2005, Christopher Clark + * Copyright (c) 2004-2005, K A Fraser + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + + +/* External tools reserve first few grant table entries. 
*/
+#define NR_RESERVED_ENTRIES 8
+
+#define NR_GRANT_ENTRIES \
+	(NR_GRANT_FRAMES * PAGE_SIZE / sizeof(struct grant_entry))
+#define GNTTAB_LIST_END (NR_GRANT_ENTRIES + 1)
+
+static grant_ref_t gnttab_list[NR_GRANT_ENTRIES];
+static int gnttab_free_count;
+static grant_ref_t gnttab_free_head;
+static DEFINE_SPINLOCK(gnttab_list_lock);
+
+static struct grant_entry *shared;
+
+static struct gnttab_free_callback *gnttab_free_callback_list;
+
+static int get_free_entries(int count)
+{
+	unsigned long flags;
+	int ref;
+	grant_ref_t head;
+	spin_lock_irqsave(&gnttab_list_lock, flags);
+	if (gnttab_free_count < count) {
+		spin_unlock_irqrestore(&gnttab_list_lock, flags);
+		return -1;
+	}
+	ref = head = gnttab_free_head;
+	gnttab_free_count -= count;
+	while (count-- > 1)
+		head = gnttab_list[head];
+	gnttab_free_head = gnttab_list[head];
+	gnttab_list[head] = GNTTAB_LIST_END;
+	spin_unlock_irqrestore(&gnttab_list_lock, flags);
+	return ref;
+}
+
+#define get_free_entry() get_free_entries(1)
+
+static void do_free_callbacks(void)
+{
+	struct gnttab_free_callback *callback, *next;
+
+	callback = gnttab_free_callback_list;
+	gnttab_free_callback_list = NULL;
+
+	while (callback != NULL) {
+		next = callback->next;
+		if (gnttab_free_count >= callback->count) {
+			callback->next = NULL;
+			callback->fn(callback->arg);
+		} else {
+			callback->next = gnttab_free_callback_list;
+			gnttab_free_callback_list = callback;
+		}
+		callback = next;
+	}
+}
+
+static inline void check_free_callbacks(void)
+{
+	if (unlikely(gnttab_free_callback_list))
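The get_free_entries() routine quoted above keeps free grant references on a singly linked list threaded through an array, and hands out a chain of entries by cutting the list after the requested count. The sketch below models only that list manipulation. It is an illustration, not the driver: the table size is arbitrary, there is no locking, and freelist_init() and put_free_entry() are assumed helpers that do not appear in the quoted part of the patch.

```c
#define NR_ENTRIES 8			/* illustrative size, not the kernel's */
#define LIST_END   (NR_ENTRIES + 1)	/* sentinel, like GNTTAB_LIST_END */

static unsigned int next_free[NR_ENTRIES];	/* next_free[i] = entry after i */
static unsigned int free_head;
static int free_count;

/* Chain every entry into one free list: 0 -> 1 -> ... -> LIST_END. */
static void freelist_init(void)
{
	unsigned int i;

	for (i = 0; i < NR_ENTRIES; i++)
		next_free[i] = i + 1;
	next_free[NR_ENTRIES - 1] = LIST_END;
	free_head = 0;
	free_count = NR_ENTRIES;
}

/*
 * Detach a chain of 'count' entries from the head of the free list.
 * Returns the first entry of the chain, or -1 if too few are free.
 */
static int get_free_entries(int count)
{
	unsigned int head;
	int ref;

	if (count < 1 || free_count < count)
		return -1;

	ref = head = free_head;
	free_count -= count;
	while (count-- > 1)		/* walk to the last entry of the chain */
		head = next_free[head];
	free_head = next_free[head];	/* rest of the list stays free */
	next_free[head] = LIST_END;	/* terminate the detached chain */
	return ref;
}

/* Push a single entry back onto the free list. */
static void put_free_entry(unsigned int ref)
{
	next_free[ref] = free_head;
	free_head = ref;
	free_count++;
}
```

The caller receives only the first reference and can follow next_free[] until LIST_END to find the rest of its chain.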
[patch 11/21] Xen-paravirt: Add apply_to_page_range() which applies a function to a pte range.
Add a new mm function apply_to_page_range() which applies a given
function to every pte in a given virtual address range in a given mm
structure. This is a generic alternative to cut-and-pasting the Linux
idiomatic pagetable walking code in every place that a sequence of
PTEs must be accessed.

Although this interface is intended to be useful in a wide range of
situations, it is currently used specifically by several Xen
subsystems, for example: to ensure that pagetables have been allocated
for a virtual address range, and to construct batched special
pagetable update requests to map I/O memory (in ioremap()).

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/mm.h |    5 ++
 mm/memory.c        |   94
 2 files changed, 99 insertions(+)

===
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1130,6 +1130,11 @@ struct page *follow_page(struct vm_area_
 
 unsigned long __follow_page(void *vaddr);
 
+typedef int (*pte_fn_t)(pte_t *pte, struct page *pmd_page, unsigned long addr,
+			void *data);
+extern int apply_to_page_range(struct mm_struct *mm, unsigned long address,
+			       unsigned long size, pte_fn_t fn, void *data);
+
 #ifdef CONFIG_PROC_FS
 void vm_stat_account(struct mm_struct *, unsigned long, struct file *, long);
 #else
===
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1414,6 +1414,100 @@ int remap_pfn_range(struct vm_area_struc
 }
 EXPORT_SYMBOL(remap_pfn_range);
 
+static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
+			      unsigned long addr, unsigned long end,
+			      pte_fn_t fn, void *data)
+{
+	pte_t *pte;
+	int err;
+	struct page *pmd_page;
+	spinlock_t *ptl;
+
+	pte = (mm == &init_mm) ?
+		pte_alloc_kernel(pmd, addr) :
+		pte_alloc_map_lock(mm, pmd, addr, &ptl);
+	if (!pte)
+		return -ENOMEM;
+
+	BUG_ON(pmd_huge(*pmd));
+
+	pmd_page = pmd_page(*pmd);
+
+	do {
+		err = fn(pte, pmd_page, addr, data);
+		if (err)
+			break;
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+
+	if (mm != &init_mm)
+		pte_unmap_unlock(pte-1, ptl);
+	return err;
+}
+
+static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,
+			      unsigned long addr, unsigned long end,
+			      pte_fn_t fn, void *data)
+{
+	pmd_t *pmd;
+	unsigned long next;
+	int err;
+
+	pmd = pmd_alloc(mm, pud, addr);
+	if (!pmd)
+		return -ENOMEM;
+	do {
+		next = pmd_addr_end(addr, end);
+		err = apply_to_pte_range(mm, pmd, addr, next, fn, data);
+		if (err)
+			break;
+	} while (pmd++, addr = next, addr != end);
+	return err;
+}
+
+static int apply_to_pud_range(struct mm_struct *mm, pgd_t *pgd,
+			      unsigned long addr, unsigned long end,
+			      pte_fn_t fn, void *data)
+{
+	pud_t *pud;
+	unsigned long next;
+	int err;
+
+	pud = pud_alloc(mm, pgd, addr);
+	if (!pud)
+		return -ENOMEM;
+	do {
+		next = pud_addr_end(addr, end);
+		err = apply_to_pmd_range(mm, pud, addr, next, fn, data);
+		if (err)
+			break;
+	} while (pud++, addr = next, addr != end);
+	return err;
+}
+
+/*
+ * Scan a region of virtual memory, filling in page tables as necessary
+ * and calling a provided function on each leaf page table.
+ */
+int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
+			unsigned long size, pte_fn_t fn, void *data)
+{
+	pgd_t *pgd;
+	unsigned long next;
+	unsigned long end = addr + size;
+	int err;
+
+	BUG_ON(addr >= end);
+	pgd = pgd_offset(mm, addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		err = apply_to_pud_range(mm, pgd, addr, next, fn, data);
+		if (err)
+			break;
+	} while (pgd++, addr = next, addr != end);
+	return err;
+}
+EXPORT_SYMBOL_GPL(apply_to_page_range);
+
 /*
  * handle_pte_fault chooses page fault handler according to an entry
  * which was read non-atomically.  Before making any commitment, on
--
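The idea behind apply_to_page_range() (walk a range, allocate missing intermediate tables on the way, call a function on every leaf entry, stop at the first error) can be shown with a toy two-level table in user space. Everything in the sketch below is an assumption made for illustration: the table geometry, the two_level_table type and the count_entry callback have no counterpart in the kernel.

```c
#include <stdlib.h>

#define LEAF_BITS 4
#define LEAF_SIZE (1u << LEAF_BITS)	/* entries per leaf table */
#define DIR_SIZE  16			/* leaf tables per directory */

/* Callback type, loosely modelled on pte_fn_t. */
typedef int (*entry_fn_t)(unsigned int *entry, unsigned long index, void *data);

struct two_level_table {
	unsigned int *leaf[DIR_SIZE];	/* NULL until first touched */
};

/*
 * Call fn on every entry in [start, start + count), allocating leaf
 * tables on demand.  Stops and returns the first non-zero result of
 * fn, or -1 on a bad range or allocation failure.
 */
static int apply_to_range(struct two_level_table *t, unsigned long start,
			  unsigned long count, entry_fn_t fn, void *data)
{
	unsigned long i, end = start + count;

	if (count == 0 || end < start || end > DIR_SIZE * LEAF_SIZE)
		return -1;

	for (i = start; i < end; i++) {
		unsigned long dir = i >> LEAF_BITS;
		int err;

		if (!t->leaf[dir]) {
			t->leaf[dir] = calloc(LEAF_SIZE, sizeof(unsigned int));
			if (!t->leaf[dir])
				return -1;
		}
		err = fn(&t->leaf[dir][i & (LEAF_SIZE - 1)], i, data);
		if (err)
			return err;
	}
	return 0;
}

/* Example callback: counts the visited entries through 'data'. */
static int count_entry(unsigned int *entry, unsigned long index, void *data)
{
	(void)entry;
	(void)index;
	++*(unsigned long *)data;
	return 0;
}
```

A callback that does nothing still has a useful side effect: after the walk, every leaf table covering the range exists. That is how an empty callback can be used purely to populate tables.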
[patch 12/21] Xen-paravirt: Allocate and free vmalloc areas
Allocate/destroy a 'vmalloc' VM area: alloc_vm_area and free_vm_area
The alloc function ensures that page tables are constructed for the
region of kernel virtual address space and mapped into init_mm.

Lock an area so that PTEs are accessible in the current address space:
lock_vm_area and unlock_vm_area.
The lock function prevents context switches to a lazy mm that doesn't
have the area mapped into its page tables. It also ensures that the
page tables are mapped into the current mm by causing the page fault
handler to copy the page directory pointers from init_mm into the
current mm.

These functions are not particularly Xen-specific, so they're put into
mm/vmalloc.c.

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: "Jan Beulich" <[EMAIL PROTECTED]>

--
 include/linux/vmalloc.h |    8 +
 mm/vmalloc.c            |   62 ++
 2 files changed, 70 insertions(+)

===
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -68,6 +68,14 @@ extern int map_vm_area(struct vm_struct
 			struct page ***pages);
 extern void unmap_vm_area(struct vm_struct *area);
 
+/* Allocate/destroy a 'vmalloc' VM area. */
+extern struct vm_struct *alloc_vm_area(unsigned long size);
+extern void free_vm_area(struct vm_struct *area);
+
+/* Lock an area so that PTEs are accessible in the current address space. */
+extern void lock_vm_area(struct vm_struct *area);
+extern void unlock_vm_area(struct vm_struct *area);
+
 /*
  *	Internals.  Dont't use..
  */
===
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -747,3 +747,65 @@ out_einval_locked:
 }
 EXPORT_SYMBOL(remap_vmalloc_range);
 
+static int f(pte_t *pte, struct page *pmd_page, unsigned long addr, void *data)
+{
+	/* apply_to_page_range() does all the hard work. */
+	return 0;
+}
+
+struct vm_struct *alloc_vm_area(unsigned long size)
+{
+	struct vm_struct *area;
+
+	area = get_vm_area(size, VM_IOREMAP);
+	if (area == NULL)
+		return NULL;
+
+	/*
+	 * This ensures that page tables are constructed for this region
+	 * of kernel virtual address space and mapped into init_mm.
+	 */
+	if (apply_to_page_range(&init_mm, (unsigned long)area->addr,
+				area->size, f, NULL)) {
+		free_vm_area(area);
+		return NULL;
+	}
+
+	return area;
+}
+EXPORT_SYMBOL_GPL(alloc_vm_area);
+
+void free_vm_area(struct vm_struct *area)
+{
+	struct vm_struct *ret;
+	ret = remove_vm_area(area->addr);
+	BUG_ON(ret != area);
+	kfree(area);
+}
+EXPORT_SYMBOL_GPL(free_vm_area);
+
+void lock_vm_area(struct vm_struct *area)
+{
+	unsigned long i;
+	char c;
+
+	/*
+	 * Prevent context switch to a lazy mm that doesn't have this area
+	 * mapped into its page tables.
+	 */
+	preempt_disable();
+
+	/*
+	 * Ensure that the page tables are mapped into the current mm. The
+	 * page-fault path will copy the page directory pointers from init_mm.
+	 */
+	for (i = 0; i < area->size; i += PAGE_SIZE)
+		(void)__get_user(c, (char __user *)area->addr + i);
+}
+EXPORT_SYMBOL_GPL(lock_vm_area);
+
+void unlock_vm_area(struct vm_struct *area)
+{
+	preempt_enable();
+}
+EXPORT_SYMBOL_GPL(unlock_vm_area);
--
[patch 01/21] Xen-paravirt: Fix typo in sync_constant_test_bit()s name.
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

===
--- a/include/asm-i386/sync_bitops.h
+++ b/include/asm-i386/sync_bitops.h
@@ -130,7 +130,7 @@ static inline int sync_test_and_change_b
 	return oldbit;
 }
 
-static __always_inline int sync_const_test_bit(int nr, const volatile unsigned long *addr)
+static __always_inline int sync_constant_test_bit(int nr, const volatile unsigned long *addr)
 {
 	return ((1UL << (nr & 31)) &
 		(((const volatile unsigned int *)addr)[nr >> 5])) != 0;
--
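For readers unfamiliar with the expression in the renamed function, it treats the bitmap as an array of 32-bit words: nr >> 5 picks the word and nr & 31 picks the bit inside it. A small stand-alone version is sketched below. test_bit32 is a made-up name, and unlike the kernel helper it takes the 32-bit array directly instead of casting from unsigned long.

```c
#include <stdint.h>

/* Test bit 'nr' in a bitmap stored as an array of 32-bit words. */
static inline int test_bit32(int nr, const volatile uint32_t *addr)
{
	/* nr >> 5 selects the word, nr & 31 the bit within that word */
	return ((UINT32_C(1) << (nr & 31)) & addr[nr >> 5]) != 0;
}
```

With a two-word bitmap, bit 0 lives in the lowest bit of word 0 and bit 63 in the highest bit of word 1.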
[patch 20/21] Xen-paravirt: Add Xen virtual block device driver.
The block device frontend driver allows the kernel to access block devices exported exported by a virtual machine containing a physical block device driver. Signed-off-by: Ian Pratt <[EMAIL PROTECTED]> Signed-off-by: Christian Limpach <[EMAIL PROTECTED]> Signed-off-by: Chris Wright <[EMAIL PROTECTED]> Cc: Arjan van de Ven <[EMAIL PROTECTED]> Cc: Greg KH <[EMAIL PROTECTED]> Cc: Jens Axboe <[EMAIL PROTECTED]> --- drivers/block/Kconfig|1 drivers/block/Makefile |1 drivers/block/xen/Kconfig| 14 drivers/block/xen/Makefile |5 drivers/block/xen/blkfront.c | 844 + drivers/block/xen/block.h| 135 ++ drivers/block/xen/vbd.c | 230 +++ include/linux/major.h|2 8 files changed, 1232 insertions(+) === --- a/drivers/block/Kconfig +++ b/drivers/block/Kconfig @@ -461,6 +461,7 @@ config CDROM_PKTCDVD_WCACHE don't do deferred write error handling yet. source "drivers/s390/block/Kconfig" +source "drivers/block/xen/Kconfig" config ATA_OVER_ETH tristate "ATA over Ethernet support" === --- a/drivers/block/Makefile +++ b/drivers/block/Makefile @@ -29,3 +29,4 @@ obj-$(CONFIG_BLK_DEV_SX8) += sx8.o obj-$(CONFIG_BLK_DEV_SX8) += sx8.o obj-$(CONFIG_BLK_DEV_UB) += ub.o +obj-$(CONFIG_XEN) += xen/ === --- /dev/null +++ b/drivers/block/xen/Kconfig @@ -0,0 +1,14 @@ +menu "Xen block device drivers" +depends on XEN + +config XEN_BLKDEV_FRONTEND + tristate "Block device frontend driver" + depends on XEN + default y + help + The block device frontend driver allows the kernel to access block + devices exported from a device driver virtual machine. Unless you + are building a dedicated device driver virtual machine, then you + almost certainly want to say Y here. + +endmenu === --- /dev/null +++ b/drivers/block/xen/Makefile @@ -0,0 +1,5 @@ + +obj-$(CONFIG_XEN_BLKDEV_FRONTEND) := xenblk.o + +xenblk-objs := blkfront.o vbd.o + === --- /dev/null +++ b/drivers/block/xen/blkfront.c @@ -0,0 +1,844 @@ +/** + * blkfront.c + * + * XenLinux virtual block device driver. 
+ * + * Copyright (c) 2003-2004, Keir Fraser & Steve Hand + * Modifications by Mark A. Williamson are (c) Intel Research Cambridge + * Copyright (c) 2004, Christian Limpach + * Copyright (c) 2004, Andrew Warfield + * Copyright (c) 2005, Christopher Clark + * Copyright (c) 2005, XenSource Ltd + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ */ + +#include +#include "block.h" +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define BLKIF_STATE_DISCONNECTED 0 +#define BLKIF_STATE_CONNECTED1 +#define BLKIF_STATE_SUSPENDED2 + +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \ +(BLKIF_MAX_SEGMENTS_PER_REQUEST * BLK_RING_SIZE) +#define GRANT_INVALID_REF 0 + +static void connect(struct blkfront_info *); +static void blkfront_closing(struct xenbus_device *); +static int blkfront_remove(struct xenbus_device *); +static int talk_to_backend(struct xenbus_device *, struct blkfront_info *); +static int setup_blkring(struct xenbus_device *, struct blkfront_info *); +
[patch 21/21] Xen-paravirt: Add the Xen virtual network device driver.
The network device frontend driver allows the kernel to access network devices exported exported by a virtual machine containing a physical network device driver. Signed-off-by: Ian Pratt <[EMAIL PROTECTED]> Signed-off-by: Christian Limpach <[EMAIL PROTECTED]> Signed-off-by: Chris Wright <[EMAIL PROTECTED]> Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: netdev@vger.kernel.org --- drivers/net/Kconfig| 12 drivers/net/Makefile |2 drivers/net/xen-netfront.c | 2066 include/xen/events.h |2 4 files changed, 2082 insertions(+) === --- a/drivers/net/Kconfig +++ b/drivers/net/Kconfig @@ -2525,6 +2525,18 @@ source "drivers/atm/Kconfig" source "drivers/s390/net/Kconfig" +config XEN_NETDEV_FRONTEND + tristate "Xen network device frontend driver" + depends on XEN + default y + help + The network device frontend driver allows the kernel to + access network devices exported exported by a virtual + machine containing a physical network device driver. The + frontend driver is intended for unprivileged guest domains; + if you are compiling a kernel for a Xen guest, you almost + certainly want to enable this. + config ISERIES_VETH tristate "iSeries Virtual Ethernet driver support" depends on PPC_ISERIES === --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -218,3 +218,5 @@ obj-$(CONFIG_FS_ENET) += fs_enet/ obj-$(CONFIG_FS_ENET) += fs_enet/ obj-$(CONFIG_NETXEN_NIC) += netxen/ + +obj-$(CONFIG_XEN_NETDEV_FRONTEND) += xen-netfront.o === --- /dev/null +++ b/drivers/net/xen-netfront.c @@ -0,0 +1,2066 @@ +/** + * Virtual network driver for conversing with remote driver backends. 
+ * + * Copyright (c) 2002-2005, K A Fraser + * Copyright (c) 2005, XenSource Ltd + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#ifdef CONFIG_XEN_BALLOON +#include +#endif +#include +#include + +#include +#include +#include + +/* + * Mutually-exclusive module options to select receive data path: + * rx_copy : Packets are copied by network backend into local memory + * rx_flip : Page containing packet data is transferred to our ownership + * For fully-virtualised guests there is no option - copying must be used. + * For paravirtualised guests, flipping is the default. + */ +#ifdef CONFIG_XEN +static int MODPARM_rx_copy = 0; +module_param_named(rx_copy, MODPARM_rx_copy, bool, 0); +MODULE_PARM_DESC(rx_copy, "Copy packets from network card (rather than flip)"); +static int MODPARM_rx_flip = 0; +module_param_named(rx_flip, MODPARM_rx_flip, bool, 0); +MODULE_PARM_DESC(rx_flip, "Flip packets from network card (rather than copy)"); +#else +static const int MODPARM_rx_copy = 1; +static const int MODPARM_rx_flip = 0; +#endif + +#define RX_COPY_THRESHOLD 256 + +#define GRANT_INVALID_REF 0 + +#define NET_TX_RING_SIZE __RING_SIZE((struct netif_tx_sring *)0, PAGE_SIZE) +#define NET_RX_RING_SIZE __RING_SIZE((struct netif_rx_sring *)0, PAGE_SIZE) + +struct netfront_info { +
[patch 03/21] Xen-paravirt: Add pagetable accessors to pack and unpack pagetable entries
Add a set of accessors to pack, unpack and modify page table entries (at all levels). This allows a paravirt implementation to control the contents of pgd/pmd/pte entries. For example, Xen uses this to convert the (pseudo-)physical address into a machine address when populating a pagetable entry, and converting back to pphys address when an entry is read. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> -- arch/i386/kernel/paravirt.c | 113 arch/i386/kernel/vmlinux.lds.S|3 include/asm-i386/page.h | 18 - include/asm-i386/paravirt.h | 68 + include/asm-i386/pgtable-2level.h |5 - include/asm-i386/pgtable-3level.h | 27 6 files changed, 199 insertions(+), 35 deletions(-) === --- a/arch/i386/kernel/paravirt.c +++ b/arch/i386/kernel/paravirt.c @@ -34,7 +34,7 @@ #include /* nop stub */ -static void native_nop(void) +void native_nop(void) { } @@ -400,38 +400,76 @@ static void native_flush_tlb_single(u32 } #ifndef CONFIG_X86_PAE -static void native_set_pte(pte_t *ptep, pte_t pteval) +void native_set_pte(pte_t *ptep, pte_t pteval) { *ptep = pteval; } -static void native_set_pte_at(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pteval) +void native_set_pte_at(struct mm_struct *mm, u32 addr, + pte_t *ptep, pte_t pteval) { *ptep = pteval; } -static void native_set_pmd(pmd_t *pmdp, pmd_t pmdval) +void native_set_pmd(pmd_t *pmdp, pmd_t pmdval) { *pmdp = pmdval; } +unsigned long native_pte_val(pte_t pte) +{ + return pte.pte_low; +} + +unsigned long native_pmd_val(pmd_t pmd) +{ + BUG(); + return 0; +} + +unsigned long native_pgd_val(pgd_t pgd) +{ + return pgd.pgd; +} + +pte_t native_make_pte(unsigned long pte) +{ + return (pte_t){ pte }; +} + +pmd_t native_make_pmd(unsigned long pmd) +{ + BUG(); +} + +pgd_t native_make_pgd(unsigned long pgd) +{ + return (pgd_t){ pgd }; +} + +pte_t native_ptep_get_and_clear(pte_t *ptep) +{ + return __pte(xchg(&(ptep)->pte_low, 0)); +} + #else /* CONFIG_X86_PAE */ -static void native_set_pte(pte_t *ptep, pte_t pte) +void native_set_pte(pte_t 
*ptep, pte_t pte) { ptep->pte_high = pte.pte_high; smp_wmb(); ptep->pte_low = pte.pte_low; } -static void native_set_pte_at(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pte) +void native_set_pte_at(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pte) { ptep->pte_high = pte.pte_high; smp_wmb(); ptep->pte_low = pte.pte_low; } -static void native_set_pte_present(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) +void native_set_pte_present(struct mm_struct *mm, u32 addr, + pte_t *ptep, pte_t pte) { ptep->pte_low = 0; smp_wmb(); @@ -440,35 +478,78 @@ static void native_set_pte_present(struc ptep->pte_low = pte.pte_low; } -static void native_set_pte_atomic(pte_t *ptep, pte_t pteval) +void native_set_pte_atomic(pte_t *ptep, pte_t pteval) { set_64bit((unsigned long long *)ptep,pte_val(pteval)); } -static void native_set_pmd(pmd_t *pmdp, pmd_t pmdval) +void native_set_pmd(pmd_t *pmdp, pmd_t pmdval) { set_64bit((unsigned long long *)pmdp,pmd_val(pmdval)); } -static void native_set_pud(pud_t *pudp, pud_t pudval) +void native_set_pud(pud_t *pudp, pud_t pudval) { *pudp = pudval; } -static void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) +void native_pte_clear(struct mm_struct *mm, u32 addr, pte_t *ptep) { ptep->pte_low = 0; smp_wmb(); ptep->pte_high = 0; } -static void native_pmd_clear(pmd_t *pmd) +void native_pmd_clear(pmd_t *pmd) { u32 *tmp = (u32 *)pmd; *tmp = 0; smp_wmb(); *(tmp + 1) = 0; } + +unsigned long long native_pte_val(pte_t pte) +{ + return pte.pte_low | ((unsigned long long)pte.pte_high << 32); +} + +unsigned long long native_pmd_val(pmd_t pmd) +{ + return pmd.pmd; +} + +unsigned long long native_pgd_val(pgd_t pgd) +{ + return pgd.pgd; +} + +pte_t native_make_pte(unsigned long long pte) +{ + return (pte_t){ pte }; +} + +pmd_t native_make_pmd(unsigned long long pmd) +{ + return (pmd_t){ pmd }; +} + +pgd_t native_make_pgd(unsigned long long pgd) +{ + return (pgd_t){ pgd }; +} + +pte_t 
native_ptep_get_and_clear(pte_t *ptep)
+{
+	pte_t res;
+
+	/* xchg acts as a barrier before the setting of the high bits */
+	res.pte_low = xchg(&ptep->pte_low, 0);
+	res.pte_high = ptep->pte_high;
+	ptep->pte_high = 0;
+
+	return res;
+}
+
 #endif /* CONFIG_X86_PAE */
 
 /* These are in entry.S */
@@ -561,6 +642,9 @@ struct paravirt_ops paravirt_ops = {
 	.set_pmd = native_set_pmd,
 	.pte_update = (void *)native_nop,
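The PAE accessors quoted above pack and unpack a 64-bit entry that is stored as two 32-bit halves. The following sketch shows just that arithmetic in portable C. The pae_* names and the local pae_pte_t type are illustrative stand-ins, and the field layout is an assumption based on the pte_low/pte_high names used in the patch.

```c
#include <stdint.h>

/* Stand-in for the PAE pte_t: two 32-bit halves. */
typedef struct {
	uint32_t pte_low;
	uint32_t pte_high;
} pae_pte_t;

/* Unpack: combine the halves into one 64-bit value. */
static uint64_t pae_pte_val(pae_pte_t pte)
{
	return pte.pte_low | ((uint64_t)pte.pte_high << 32);
}

/* Pack: split a 64-bit value into the two halves. */
static pae_pte_t pae_make_pte(uint64_t val)
{
	return (pae_pte_t){ (uint32_t)val, (uint32_t)(val >> 32) };
}
```

A paravirt backend that needs to translate addresses would do so inside functions of this shape, which is why the patch routes every pack and unpack through paravirt_ops.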
[patch 02/21] Xen-paravirt: ignore vgacon if hardware not present
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

===
--- a/drivers/video/console/vgacon.c
+++ b/drivers/video/console/vgacon.c
@@ -372,7 +372,8 @@ static const char *vgacon_startup(void)
 	}
 
 	/* VGA16 modes are not handled by VGACON */
-	if ((ORIG_VIDEO_MODE == 0x0D) ||	/* 320x200/4 */
+	if ((ORIG_VIDEO_MODE == 0x00) ||	/* SCREEN_INFO not initialized */
+	    (ORIG_VIDEO_MODE == 0x0D) ||	/* 320x200/4 */
 	    (ORIG_VIDEO_MODE == 0x0E) ||	/* 640x200/4 */
 	    (ORIG_VIDEO_MODE == 0x10) ||	/* 640x350/4 */
 	    (ORIG_VIDEO_MODE == 0x12) ||	/* 640x480/4 */
--
[patch 17/21] Xen-paravirt: Add the Xen virtual console driver.
This provides a bootstrap and ongoing emergency console which is intended to be available from very early during boot and at all times thereafter, in contrast with alternatives such as UDP-based syslogd, or logging in via ssh. The protocol is based on a simple shared-memory ring buffer. Signed-off-by: Ian Pratt <[EMAIL PROTECTED]> Signed-off-by: Christian Limpach <[EMAIL PROTECTED]> Signed-off-by: Chris Wright <[EMAIL PROTECTED]> Cc: Alan <[EMAIL PROTECTED]> --- arch/i386/kernel/early_printk.c|2 drivers/Makefile |3 drivers/xen/Makefile |1 drivers/xen/console/Makefile |2 drivers/xen/console/console.c | 585 +++ drivers/xen/console/xencons_ring.c | 144 include/xen/xencons.h | 14 init/main.c|2 8 files changed, 753 insertions(+) === --- a/arch/i386/kernel/early_printk.c +++ b/arch/i386/kernel/early_printk.c @@ -1,2 +1,4 @@ +#ifndef CONFIG_XEN #include "../../x86_64/kernel/early_printk.c" +#endif === --- a/drivers/Makefile +++ b/drivers/Makefile @@ -14,6 +14,9 @@ obj-$(CONFIG_ACPI)+= acpi/ # was used and do nothing if so obj-$(CONFIG_PNP) += pnp/ obj-$(CONFIG_ARM_AMBA) += amba/ + +# Xen is the default console when running as a guest +obj-$(CONFIG_XEN) += xen/ # char/ comes before serial/ etc so that the VT console is the boot-time # default. === --- /dev/null +++ b/drivers/xen/Makefile @@ -0,0 +1,1 @@ +obj-y += console/ === --- /dev/null +++ b/drivers/xen/console/Makefile @@ -0,0 +1,2 @@ + +obj-y := console.o xencons_ring.o === --- /dev/null +++ b/drivers/xen/console/console.c @@ -0,0 +1,585 @@ +/** + * console.c + * + * Virtual console driver. + * + * Copyright (c) 2002-2004, K A Fraser. 
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +MODULE_LICENSE("Dual BSD/GPL"); + +static int xc_disabled = 0; +static int xc_num = -1; + +/* /dev/xvc0 device number allocated by lanana.org. 
*/ +#define XEN_XVC_MAJOR 204 +#define XEN_XVC_MINOR 191 + +#ifdef CONFIG_MAGIC_SYSRQ +static unsigned long sysrq_requested; +#endif + +static int __init xencons_setup(char *str) +{ + if (!strcmp(str, "off")) + xc_disabled = 1; + return 1; +} +__setup("xencons=", xencons_setup); + +/* The kernel and user-land drivers share a common transmit buffer. */ +static unsigned int wbuf_size = 4096; +#define WBUF_MASK(_i) ((_i)&(wbuf_size-1)) +static char *wbuf; +static unsigned int wc, wp; /* write_cons, write_prod */ + +static int __init xencons_bufsz_setup(char *str) +{ + unsigned int goal; + goal = simple_strtoul(str, NULL, 0); + if (goal) { + goal = roundup_pow_of_two(goal); + if (wbuf_size < goal) + wbuf_size = goal; + } + return 1; +} +__setup("xencons_bufsz=", xencons_bufsz_setup); + +/* This lock
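The transmit buffer in the patch above is addressed through WBUF_MASK() with a power-of-two size and two counters, wc and wp. A minimal user-space sketch of that style of indexing follows, assuming free-running unsigned producer/consumer counters. The names ring, ring_put, ring_get and RING_SIZE are illustrative assumptions, not part of the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's wbuf/wc/wp trio. */
#define RING_SIZE 8u                       /* must be a power of two */
#define RING_MASK(i) ((i) & (RING_SIZE - 1))

struct ring {
	char buf[RING_SIZE];
	unsigned int cons;                 /* free-running consumer index */
	unsigned int prod;                 /* free-running producer index */
};

/* Number of bytes queued; unsigned wraparound keeps the difference valid. */
static unsigned int ring_used(const struct ring *r)
{
	assert(r != NULL);
	return r->prod - r->cons;
}

/* Queue up to len bytes from s; returns how many actually fit. */
static size_t ring_put(struct ring *r, const char *s, size_t len)
{
	size_t n = 0;

	while (n < len && ring_used(r) < RING_SIZE)
		r->buf[RING_MASK(r->prod++)] = s[n++];
	return n;
}

/* Dequeue one byte into *c; returns 1 on success, 0 if the ring is empty. */
static int ring_get(struct ring *r, char *c)
{
	if (ring_used(r) == 0)
		return 0;
	*c = r->buf[RING_MASK(r->cons++)];
	return 1;
}
```

Because the size is a power of two, the mask replaces a modulo, and the counters never need to be reset: only their difference matters.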
[patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface
Hi Andi,

This patch series implements the Linux Xen guest in terms of the paravirt-ops interface. The features implemented in this patch series are:
 * domU only
 * UP only (most code is SMP-safe, but there's no way to create a new vcpu)
 * writable pagetables, with late pinning/early unpinning (no shadow pagetable support)
 * supports both PAE and non-PAE modes
 * xen console
 * virtual block device (blockfront)
 * virtual network device (netfront)

The patch series is in two parts:
 1-12: cleanups to the core kernel, either to fix outright problems, or to add appropriate hooks for Xen
 13-21: the Xen guest implementation itself

I've tried to make each patch as self-explanatory as possible.

The series is based on git changeset ec2f9d1331f658433411c58077871e1eef4ee1b4 + x86_64-2.6.20-git8-070213-1.patch.

Changes since the previous posting:
 - rebased
 - addressed review comments:
 - deal with missing vga hardware better
 - deal with Andi's comments
 - clean up header file placement
 - update netfront, and move it into drivers/net

I looked at linking in xen-head.S rather than including it into head.S, but it seems to provoke linker bugs, so I've left it as-is for now.

Thanks,
	J
[patch 14/21] Xen-paravirt: Add XEN config options and disable unsupported config options.
The XEN config option enables the Xen paravirt_ops interface, which is installed when the kernel finds itself running under Xen. (By some as-yet not fully defined mechanism, implemented in a future patch.)

Xen is no longer a sub-architecture, so the X86_XEN subarch config option has gone.

The disabled config options are:
- PREEMPT: Xen doesn't support it
- HZ: set to 100Hz for now, to cut down on VCPU context switch rate. This will be adapted to use tickless later.
- kexec: not yet supported

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>

---
 arch/i386/Kconfig       |    7 +--
 arch/i386/Kconfig.debug |    1 +
 arch/i386/xen/Kconfig   |   10 ++
 kernel/Kconfig.hz       |    4 ++--
 kernel/Kconfig.preempt  |    1 +
 5 files changed, 19 insertions(+), 4 deletions(-)

===
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -192,6 +192,8 @@ config PARAVIRT
 	  under a hypervisor, improving performance significantly.
 	  However, when run without a hypervisor the kernel is
 	  theoretically slower.  If in doubt, say N.
+
+source "arch/i386/xen/Kconfig"
 
 config ACPI_SRAT
 	bool
@@ -298,12 +300,12 @@ config X86_UP_IOAPIC
 
 config X86_LOCAL_APIC
 	bool
-	depends on X86_UP_APIC || ((X86_VISWS || SMP) && !X86_VOYAGER) || X86_GENERICARCH
+	depends on X86_UP_APIC || (((X86_VISWS || SMP) && !X86_VOYAGER) || X86_GENERICARCH)
 	default y
 
 config X86_IO_APIC
 	bool
-	depends on X86_UP_IOAPIC || (SMP && !(X86_VISWS || X86_VOYAGER)) || X86_GENERICARCH
+	depends on X86_UP_IOAPIC || ((SMP && !(X86_VISWS || X86_VOYAGER)) || X86_GENERICARCH)
 	default y
 
 config X86_VISWS_APIC
@@ -743,6 +745,7 @@ source kernel/Kconfig.hz
 
 config KEXEC
 	bool "kexec system call"
+	depends on !XEN
 	help
 	  kexec is a system call that implements the ability to shutdown your
 	  current kernel, and to start another kernel.  It is like a reboot
===
--- a/arch/i386/Kconfig.debug
+++ b/arch/i386/Kconfig.debug
@@ -79,6 +79,7 @@ config DOUBLEFAULT
 config DOUBLEFAULT
 	default y
 	bool "Enable doublefault exception handler" if EMBEDDED
+	depends on !XEN
 	help
 	  This option allows trapping of rare doublefault exceptions that
 	  would otherwise cause a system to silently reboot. Disabling this
===
--- /dev/null
+++ b/arch/i386/xen/Kconfig
@@ -0,0 +1,10 @@
+#
+# This Kconfig describes xen options
+#
+
+config XEN
+	bool "Enable support for Xen hypervisor"
+	depends PARAVIRT
+	default y
+	help
+	  This is the Linux Xen port.
===
--- a/kernel/Kconfig.hz
+++ b/kernel/Kconfig.hz
@@ -3,7 +3,7 @@
 #
 
 choice
-	prompt "Timer frequency"
+	prompt "Timer frequency" if !XEN
 	default HZ_250
 	help
 	 Allows the configuration of the timer frequency. It is customary
@@ -49,7 +49,7 @@ endchoice
 
 config HZ
 	int
-	default 100 if HZ_100
+	default 100 if HZ_100 || XEN
 	default 250 if HZ_250
 	default 300 if HZ_300
 	default 1000 if HZ_1000
===
--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -35,6 +35,7 @@ config PREEMPT_VOLUNTARY
 
 config PREEMPT
 	bool "Preemptible Kernel (Low-Latency Desktop)"
+	depends on !XEN
 	help
 	  This option reduces the latency of the kernel by making
 	  all kernel code (that is not executing in a critical section)
-- 
[patch 05/21] Xen-paravirt: paravirt_ops: hooks to set up initial pagetable
This patch introduces paravirt_ops hooks to control how the kernel's initial pagetable is set up.

In the case of a native boot, the very early bootstrap code creates a simple non-PAE pagetable to map the kernel and physical memory. When the VM subsystem is initialized, it creates a proper pagetable which respects the PAE mode, large pages, etc.

When booting under a hypervisor, there are many possibilities for what paging environment the hypervisor establishes for the guest kernel, so the construction of the kernel's pagetable depends on the hypervisor.

In the case of Xen, the hypervisor boots the kernel with a fully constructed pagetable, which is already using PAE if necessary. Also, Xen requires particular care when constructing pagetables to make sure all pagetables are always mapped read-only.

In order to make this easier, the kernel's initial pagetable construction has been changed to only allocate and initialize a pagetable page if there's no page already present in the pagetable. This allows the Xen paravirt backend to make a copy of the hypervisor-provided pagetable, allowing the kernel to establish any more mappings it needs while keeping the existing ones.

A slightly subtle point which is worth highlighting here is that Xen requires all kernel mappings to share the same pte_t pages between all pagetables, so that updating a kernel page's mapping in one pagetable is reflected in all other pagetables. This makes it possible to allocate a page and attach it to a pagetable without having to explicitly enumerate that page's mapping in all pagetables.
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> -- arch/i386/kernel/paravirt.c | 40 ++ arch/i386/mm/init.c | 89 +- include/asm-i386/paravirt.h | 54 + include/asm-i386/pgtable.h |3 + 4 files changed, 150 insertions(+), 36 deletions(-) === --- a/arch/i386/kernel/paravirt.c +++ b/arch/i386/kernel/paravirt.c @@ -379,6 +379,43 @@ static void native_io_delay(void) { asm volatile("outb %al,$0x80"); } + +void native_pagetable_setup_start(pgd_t *base) +{ +#ifdef CONFIG_X86_PAE + int i; + + /* +* Init entries of the first-level page table to the +* zero page, if they haven't already been set up. +* +* In a normal native boot, we'll be running on a +* pagetable rooted in swapper_pg_dir, but not in PAE +* mode, so this will end up clobbering the mappings +* for the lower 24Mbytes of the address space, +* without affecting the kernel address space. +*/ + for (i = 0; i < USER_PTRS_PER_PGD; i++) + set_pgd([i], + __pgd(__pa(empty_zero_page) | _PAGE_PRESENT)); + memset([USER_PTRS_PER_PGD], 0, sizeof(pgd_t)); +#endif +} + +void native_pagetable_setup_done(pgd_t *base) +{ +#ifdef CONFIG_X86_PAE + /* +* Add low memory identity-mappings - SMP needs it when +* starting up on an AP from real-mode. In the non-PAE +* case we already have these mappings through head.S. +* All user-space mappings are explicitly cleared after +* SMP startup. 
+*/ + set_pgd([0], base[USER_PTRS_PER_PGD]); +#endif +} + static void native_flush_tlb(void) { @@ -627,6 +664,9 @@ struct paravirt_ops paravirt_ops = { #endif .set_lazy_mode = (void *)native_nop, + .pagetable_setup_start = native_pagetable_setup_start, + .pagetable_setup_done = native_pagetable_setup_done, + .flush_tlb_user = native_flush_tlb, .flush_tlb_kernel = native_flush_tlb_global, .flush_tlb_single = native_flush_tlb_single, === --- a/arch/i386/mm/init.c +++ b/arch/i386/mm/init.c @@ -42,6 +42,7 @@ #include #include #include +#include unsigned int __VMALLOC_RESERVE = 128 << 20; @@ -62,6 +63,8 @@ static pmd_t * __init one_md_table_init( #ifdef CONFIG_X86_PAE pmd_table = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE); + memset(pmd_table, 0, PAGE_SIZE); + paravirt_alloc_pd(__pa(pmd_table) >> PAGE_SHIFT); set_pgd(pgd, __pgd(__pa(pmd_table) | _PAGE_PRESENT)); pud = pud_offset(pgd, 0); @@ -83,12 +86,11 @@ static pte_t * __init one_page_table_ini { if (pmd_none(*pmd)) { pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE); + memset(page_table, 0, PAGE_SIZE); + paravirt_alloc_pt(__pa(page_table) >> PAGE_SHIFT); set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE)); - if (page_table != pte_offset_kernel(pmd, 0)) - BUG(); - - return page_table; + BUG_ON(page_table != pte_offset_kernel(pmd, 0)); } return
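The description for this patch says that a pagetable page is allocated and initialized only when the slot is still empty, so entries inherited from a pre-built table survive. A minimal user-space sketch of that "populate only if absent" pattern follows, using a toy two-level table. Every name here (top, leaf, leaf_get_or_alloc, pages_allocated) is a hypothetical illustration, and calloc() stands in for the bootmem allocator plus memset().

```c
#include <assert.h>
#include <stdlib.h>

#define ENTRIES 4

/* Toy two-level table: a top level of pointers to leaf pages. */
struct leaf { unsigned long pte[ENTRIES]; };
struct top  { struct leaf *slot[ENTRIES]; };

static int pages_allocated;     /* counts fresh allocations, for illustration */

/*
 * Return the leaf for idx, allocating and zeroing one only when the slot
 * is empty.  A slot that is already populated (for example, one copied
 * from a table built by someone else) is returned untouched.
 */
static struct leaf *leaf_get_or_alloc(struct top *t, unsigned int idx)
{
	assert(idx < ENTRIES);
	if (t->slot[idx] == NULL) {
		t->slot[idx] = calloc(1, sizeof(struct leaf));
		if (t->slot[idx] != NULL)
			pages_allocated++;
	}
	return t->slot[idx];
}
```

Calling the helper twice for the same index returns the same leaf, which is the property that lets later setup code add mappings without discarding the ones already present.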
[patch 13/21] Xen-paravirt: Add nosegneg capability to the vsyscall page notes
Add the "nosegneg" fake capability to the vsyscall page notes. This is used by the runtime linker to select a glibc version which then disables negative-offset accesses to the thread-local segment via %gs. These accesses require emulation in Xen (because segments are truncated to protect the hypervisor address space) and avoiding them provides a measurable performance boost.

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>

---
 arch/i386/kernel/vsyscall-note.S |   28 +
 1 files changed, 28 insertions(+)

===
--- a/arch/i386/kernel/vsyscall-note.S
+++ b/arch/i386/kernel/vsyscall-note.S
@@ -23,3 +24,31 @@ 3: .balign 4; /* pad out section */
 ASM_ELF_NOTE_BEGIN(".note.kernel-version", "a", UTS_SYSNAME, 0)
 	.long LINUX_VERSION_CODE
 ASM_ELF_NOTE_END
+
+#ifdef CONFIG_XEN
+/*
+ * Add a special note telling glibc's dynamic linker a fake hardware
+ * flavor that it will use to choose the search path for libraries in the
+ * same way it uses real hardware capabilities like "mmx".
+ * We supply "nosegneg" as the fake capability, to indicate that we
+ * do not like negative offsets in instructions using segment overrides,
+ * since we implement those inefficiently.  This makes it possible to
+ * install libraries optimized to avoid those access patterns in someplace
+ * like /lib/i686/tls/nosegneg.  Note that an /etc/ld.so.conf.d/file
+ * corresponding to the bits here is needed to make ldconfig work right.
+ * It should contain:
+ *	hwcap 0 nosegneg
+ * to match the mapping of bit to name that we give here.
+ */
+#define NOTE_KERNELCAP_BEGIN(ncaps, mask) \
+	ASM_ELF_NOTE_BEGIN(".note.kernelcap", "a", "GNU", 2) \
+	.long ncaps, mask
+#define NOTE_KERNELCAP(bit, name) \
+	.byte bit; .asciz name
+#define NOTE_KERNELCAP_END ASM_ELF_NOTE_END
+
+NOTE_KERNELCAP_BEGIN(1, 1)
+NOTE_KERNELCAP(1, "nosegneg")
+NOTE_KERNELCAP_END
+#endif
+
-- 
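To make the layout produced by the NOTE_KERNELCAP macros easier to picture, here is a small sketch that builds the same descriptor bytes in a C buffer: two 32-bit words (ncaps, mask) followed by one record per capability, each a bit-number byte and a NUL-terminated name. It covers only what `.long ncaps, mask` and `.byte bit; .asciz name` emit, not the ELF note header that ASM_ELF_NOTE_BEGIN generates. The helper names kernelcap_begin and kernelcap_add are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write the two leading 32-bit words (ncaps, mask) in host byte order. */
static size_t kernelcap_begin(uint8_t *buf, size_t cap,
			      uint32_t ncaps, uint32_t mask)
{
	assert(buf != NULL);
	if (cap < 2 * sizeof(uint32_t))
		return 0;
	memcpy(buf, &ncaps, sizeof(ncaps));
	memcpy(buf + sizeof(ncaps), &mask, sizeof(mask));
	return 2 * sizeof(uint32_t);
}

/*
 * Append one (bit, name) record: a single byte holding the bit number,
 * then the NUL-terminated name.  Returns the new length, or 0 if the
 * record does not fit in the buffer.
 */
static size_t kernelcap_add(uint8_t *buf, size_t len, size_t cap,
			    uint8_t bit, const char *name)
{
	size_t need;

	assert(buf != NULL && name != NULL);
	need = 1 + strlen(name) + 1;
	if (len + need > cap)
		return 0;
	buf[len] = bit;
	memcpy(buf + len + 1, name, need - 1);   /* copies the NUL as well */
	return len + need;
}
```

With the values used in the patch, the descriptor is 8 bytes of header plus a 10-byte record for "nosegneg".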
[patch 04/21] Xen-paravirt: ===================================================================
-static void vmi_set_pte_present(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte)
+static void vmi_set_pte_present(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pte)
 {
 	vmi_check_page_type(__pa(ptep) >> PAGE_SHIFT, VMI_PAGE_PTE);
 	vmi_ops.set_pte(pte, ptep, vmi_flags_addr_defer(mm, addr, VMI_PAGE_PT, 1));
@@ -492,7 +492,7 @@ static void vmi_set_pud(pud_t *pudp, pud
 	vmi_ops.set_pte(pte, (pte_t *)pudp, VMI_PAGE_PDP);
 }
 
-static void vmi_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static void vmi_pte_clear(struct mm_struct *mm, u32 addr, pte_t *ptep)
 {
 	const pte_t pte = { 0 };
 	vmi_check_page_type(__pa(ptep) >> PAGE_SHIFT, VMI_PAGE_PTE);
-- 
[patch 09/21] Xen-paravirt: Allow paravirt backend to select PGD allocation alignment
Xen requires pgds to be page-aligned, so make this a parameter which can be set in the paravirt_ops structure. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> -- arch/i386/kernel/paravirt.c |1 + arch/i386/mm/init.c |2 +- include/asm-i386/paravirt.h |5 - include/asm-i386/pgtable.h |6 ++ 4 files changed, 12 insertions(+), 2 deletions(-) === --- a/arch/i386/kernel/paravirt.c +++ b/arch/i386/kernel/paravirt.c @@ -573,6 +573,7 @@ struct paravirt_ops paravirt_ops = { .paravirt_enabled = 0, .kernel_rpl = 0, .shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */ + .pgd_alignment = sizeof(pgd_t) * PTRS_PER_PGD, .patch = native_patch, .banner = default_banner, === --- a/arch/i386/mm/init.c +++ b/arch/i386/mm/init.c @@ -745,7 +745,7 @@ void __init pgtable_cache_init(void) } pgd_cache = kmem_cache_create("pgd", PTRS_PER_PGD*sizeof(pgd_t), - PTRS_PER_PGD*sizeof(pgd_t), + PGD_ALIGNMENT, 0, NULL, NULL); if (!pgd_cache) panic("pgtable_cache_init(): Cannot create pgd cache"); === --- a/include/asm-i386/paravirt.h +++ b/include/asm-i386/paravirt.h @@ -33,9 +33,12 @@ struct mm_struct; struct mm_struct; struct paravirt_ops { + int paravirt_enabled; unsigned int kernel_rpl; + int shared_kernel_pmd; - int paravirt_enabled; + int pgd_alignment; + const char *name; /* === --- a/include/asm-i386/pgtable.h +++ b/include/asm-i386/pgtable.h @@ -270,6 +270,12 @@ static inline void vmalloc_sync_all(void #define pte_update_defer(mm, addr, ptep) do { } while (0) #endif +#ifdef CONFIG_PARAVIRT +#define PGD_ALIGNMENT (paravirt_ops.pgd_alignment) +#else +#define PGD_ALIGNMENT (sizeof(pgd_t) * PTRS_PER_PGD) +#endif + /* * We only update the dirty/accessed state if we set * the dirty bit by hand in the kernel, since the hardware -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
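A short sketch of why a selectable alignment matters: the native default aligns a pgd to its own size, which is a full page without PAE but much smaller with PAE. The figures below assume the usual i386 layout (4-byte entries × 1024 without PAE, 8-byte entries × 4 with PAE). The struct toy_ops and its helpers are hypothetical stand-ins for the paravirt_ops field added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical, cut-down stand-in for the ops structure in the patch. */
struct toy_ops {
	size_t pgd_alignment;
};

/* Native default: align a pgd to its own size. */
static struct toy_ops toy_native_ops(size_t entry_size, size_t nr_entries)
{
	struct toy_ops ops = { entry_size * nr_entries };
	return ops;
}

/* A backend that needs page-aligned pgds overrides the field. */
static void toy_require_page_aligned_pgd(struct toy_ops *ops)
{
	ops->pgd_alignment = PAGE_SIZE;
}

/* Round addr up to the next multiple of a power-of-two alignment. */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~((uintptr_t)align - 1);
}
```

Under those assumptions the PAE default is only 32 bytes, so a backend with a page-alignment requirement has to raise it, while the non-PAE default already equals one page.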
[patch 08/21] Xen-paravirt: Allow paravirt backend to choose kernel PMD sharing
Xen does not allow guests to have the kernel pmd shared between page tables, so parameterize pgtable.c to allow both modes of operation. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> -- arch/i386/kernel/paravirt.c|1 arch/i386/mm/fault.c |6 +-- arch/i386/mm/pageattr.c|2 - arch/i386/mm/pgtable.c | 61 +++ include/asm-i386/page.h|7 ++- include/asm-i386/paravirt.h|1 include/asm-i386/pgtable-2level-defs.h |2 + include/asm-i386/pgtable-2level.h |2 - include/asm-i386/pgtable-3level-defs.h |6 +++ include/asm-i386/pgtable-3level.h | 16 ++-- include/asm-i386/pgtable.h |7 +++ 11 files changed, 68 insertions(+), 43 deletions(-) === --- a/arch/i386/kernel/paravirt.c +++ b/arch/i386/kernel/paravirt.c @@ -572,6 +572,7 @@ struct paravirt_ops paravirt_ops = { .name = "bare hardware", .paravirt_enabled = 0, .kernel_rpl = 0, + .shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */ .patch = native_patch, .banner = default_banner, === --- a/arch/i386/mm/fault.c +++ b/arch/i386/mm/fault.c @@ -616,8 +616,7 @@ do_sigbus: force_sig_info_fault(SIGBUS, BUS_ADRERR, address, tsk); } -#ifndef CONFIG_X86_PAE -void vmalloc_sync_all(void) +void _vmalloc_sync_all(void) { /* * Note that races in the updates of insync and start aren't @@ -628,6 +627,8 @@ void vmalloc_sync_all(void) static DECLARE_BITMAP(insync, PTRS_PER_PGD); static unsigned long start = TASK_SIZE; unsigned long address; + + BUG_ON(SHARED_KERNEL_PMD); BUILD_BUG_ON(TASK_SIZE & ~PGDIR_MASK); for (address = start; address >= TASK_SIZE; address += PGDIR_SIZE) { @@ -651,4 +652,3 @@ void vmalloc_sync_all(void) start = address + PGDIR_SIZE; } } -#endif === --- a/arch/i386/mm/pageattr.c +++ b/arch/i386/mm/pageattr.c @@ -91,7 +91,7 @@ static void set_pmd_pte(pte_t *kpte, uns unsigned long flags; set_pte_atomic(kpte, pte); /* change init_mm */ - if (PTRS_PER_PMD > 1) + if (SHARED_KERNEL_PMD) return; spin_lock_irqsave(_lock, flags); === --- a/arch/i386/mm/pgtable.c +++ b/arch/i386/mm/pgtable.c @@ -241,31 +241,42 @@ static 
void pgd_ctor(pgd_t *pgd) unsigned long flags; if (PTRS_PER_PMD == 1) { + /* !PAE, no pagetable sharing */ memset(pgd, 0, USER_PTRS_PER_PGD*sizeof(pgd_t)); + + clone_pgd_range(pgd + USER_PTRS_PER_PGD, + swapper_pg_dir + USER_PTRS_PER_PGD, + KERNEL_PGD_PTRS); + spin_lock_irqsave(_lock, flags); - } - - clone_pgd_range(pgd + USER_PTRS_PER_PGD, - swapper_pg_dir + USER_PTRS_PER_PGD, - KERNEL_PGD_PTRS); - - if (PTRS_PER_PMD > 1) - return; - - /* must happen under lock */ - paravirt_alloc_pd_clone(__pa(pgd) >> PAGE_SHIFT, - __pa(swapper_pg_dir) >> PAGE_SHIFT, - USER_PTRS_PER_PGD, PTRS_PER_PGD - USER_PTRS_PER_PGD); - - pgd_list_add(pgd); - spin_unlock_irqrestore(_lock, flags); + + /* must happen under lock */ + paravirt_alloc_pd_clone(__pa(pgd) >> PAGE_SHIFT, + __pa(swapper_pg_dir) >> PAGE_SHIFT, + USER_PTRS_PER_PGD, + PTRS_PER_PGD - USER_PTRS_PER_PGD); + + pgd_list_add(pgd); + spin_unlock_irqrestore(_lock, flags); + } else { + /* PAE, PMD may be shared */ + if (SHARED_KERNEL_PMD) { + clone_pgd_range((pgd_t *)pgd + USER_PTRS_PER_PGD, + swapper_pg_dir + USER_PTRS_PER_PGD, + KERNEL_PGD_PTRS); + } else { + spin_lock_irqsave(_lock, flags); + pgd_list_add(pgd); + spin_unlock_irqrestore(_lock, flags); + } + } } static void pgd_dtor(pgd_t *pgd) { unsigned long flags; /* can be called from interrupt context */ - if (PTRS_PER_PMD == 1) + if (SHARED_KERNEL_PMD) return; paravirt_release_pd(__pa(pgd) >> PAGE_SHIFT); @@ -279,19 +290,25 @@ pgd_t *pgd_alloc(struct mm_struct *mm) int i; pgd_t *pgd = kmem_cache_alloc(pgd_cache, GFP_KERNEL); - if (pgd) + if
[patch 07/21] Xen-paravirt: remove ctor for pgd cache
Remove the ctor for the pgd cache. There's no point in having the cache machinery do this via an indirect call when all pgd are freed in the one place anyway. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> === --- a/arch/i386/mm/init.c +++ b/arch/i386/mm/init.c @@ -739,11 +739,9 @@ void __init pgtable_cache_init(void) panic("pgtable_cache_init(): cannot create pmd cache"); } pgd_cache = kmem_cache_create("pgd", - PTRS_PER_PGD*sizeof(pgd_t), - PTRS_PER_PGD*sizeof(pgd_t), - 0, - pgd_ctor, - PTRS_PER_PMD == 1 ? pgd_dtor : NULL); + PTRS_PER_PGD*sizeof(pgd_t), + PTRS_PER_PGD*sizeof(pgd_t), + 0, NULL, NULL); if (!pgd_cache) panic("pgtable_cache_init(): Cannot create pgd cache"); } === --- a/arch/i386/mm/pgtable.c +++ b/arch/i386/mm/pgtable.c @@ -236,7 +236,7 @@ static inline void pgd_list_del(pgd_t *p set_page_private(next, (unsigned long)pprev); } -void pgd_ctor(void *pgd, struct kmem_cache *cache, unsigned long unused) +static void pgd_ctor(pgd_t *pgd) { unsigned long flags; @@ -245,7 +245,7 @@ void pgd_ctor(void *pgd, struct kmem_cac spin_lock_irqsave(_lock, flags); } - clone_pgd_range((pgd_t *)pgd + USER_PTRS_PER_PGD, + clone_pgd_range(pgd + USER_PTRS_PER_PGD, swapper_pg_dir + USER_PTRS_PER_PGD, KERNEL_PGD_PTRS); @@ -261,10 +261,12 @@ void pgd_ctor(void *pgd, struct kmem_cac spin_unlock_irqrestore(_lock, flags); } -/* never called when PTRS_PER_PMD > 1 */ -void pgd_dtor(void *pgd, struct kmem_cache *cache, unsigned long unused) +static void pgd_dtor(pgd_t *pgd) { unsigned long flags; /* can be called from interrupt context */ + + if (PTRS_PER_PMD == 1) + return; paravirt_release_pd(__pa(pgd) >> PAGE_SHIFT); spin_lock_irqsave(_lock, flags); @@ -276,6 +278,9 @@ pgd_t *pgd_alloc(struct mm_struct *mm) { int i; pgd_t *pgd = kmem_cache_alloc(pgd_cache, GFP_KERNEL); + + if (pgd) + pgd_ctor(pgd); if (PTRS_PER_PMD == 1 || !pgd) return pgd; @@ -296,6 +301,7 @@ out_oom: paravirt_release_pd(__pa(pmd) >> PAGE_SHIFT); kmem_cache_free(pmd_cache, pmd); } + pgd_dtor(pgd); 
kmem_cache_free(pgd_cache, pgd); return NULL; } @@ -313,5 +319,6 @@ void pgd_free(pgd_t *pgd) kmem_cache_free(pmd_cache, pmd); } /* in the non-PAE case, free_pgtables() clears user pgd entries */ + pgd_dtor(pgd); kmem_cache_free(pgd_cache, pgd); } === --- a/include/asm-i386/pgtable.h +++ b/include/asm-i386/pgtable.h @@ -41,8 +41,6 @@ extern struct page *pgd_list; extern struct page *pgd_list; void pmd_ctor(void *, struct kmem_cache *, unsigned long); -void pgd_ctor(void *, struct kmem_cache *, unsigned long); -void pgd_dtor(void *, struct kmem_cache *, unsigned long); void pgtable_cache_init(void); void paging_init(void); -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 06/21] Xen-paravirt: paravirt_ops: allocate a fixmap slot
Allocate a fixmap slot for use by a paravirt_ops implementation. Xen uses this to map the hypervisor's shared info page, which doesn't have a pseudo-physical page number, and therefore can't be mapped ordinarily.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

--
 include/asm-i386/fixmap.h |    3 +++
 1 file changed, 3 insertions(+)

===
--- a/include/asm-i386/fixmap.h
+++ b/include/asm-i386/fixmap.h
@@ -86,6 +86,9 @@ enum fixed_addresses {
 #ifdef CONFIG_PCI_MMCONFIG
 	FIX_PCIE_MCFG,
 #endif
+#ifdef CONFIG_PARAVIRT
+	FIX_PARAVIRT,
+#endif
 	__end_of_permanent_fixed_addresses,
 	/* temporary boot-time mappings, used before ioremap() is functional */
 #define NR_FIX_BTMAPS	16
-- 
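As a rough illustration of what adding an enumerator buys: each fixmap index corresponds to one page-sized virtual slot at a compile-time-known address, counted downward from a fixed top address. The sketch below mimics that scheme with toy names and a made-up top address. TOY_FIXADDR_TOP, TOY_CONFIG_PARAVIRT and the enum members are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define TOY_FIXADDR_TOP 0xfffff000UL   /* illustrative value, not the kernel's */
#define TOY_CONFIG_PARAVIRT 1          /* stand-in for the Kconfig symbol */

/* Each enumerator names one page-sized slot below TOY_FIXADDR_TOP. */
enum toy_fixed_addresses {
	TOY_FIX_HOLE,
	TOY_FIX_VDSO,
#if TOY_CONFIG_PARAVIRT
	TOY_FIX_PARAVIRT,              /* slot reserved for the backend */
#endif
	toy_end_of_fixed_addresses
};

/* Slot index to virtual address: slots grow downward from the top. */
static unsigned long toy_fix_to_virt(unsigned int idx)
{
	assert(idx < toy_end_of_fixed_addresses);
	return TOY_FIXADDR_TOP - ((unsigned long)idx << PAGE_SHIFT);
}
```

A backend can then map a page that has no ordinary page frame number at the address of its reserved slot, since that address does not depend on the physical memory layout.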
Re: loosen dependancy on rtc cmos
On Wednesday 14 February 2007 18:47, David Brownell wrote:
> On Wednesday 14 February 2007 3:20 pm, Len Brown wrote:
> > >
> > > I still need to resubmit the patch, for X86_PC, which defines the platform
> > > device in the (common) case where PNPACPI isn't defined.
> >
> > CONFIG_PNPACPI=y is not the common case?
>
> It's certainly not in the defconfig for x86-64.  And it's only been
> three weeks since the CONFIG_EXPERIMENTAL dependency got removed.
>
> So, no I would not think it's the common case.

Turns out that it is common. It is in i386 defconfig, ships on i386 Fedora Core 6, ships on i386 OpenSuse 10.2. For x86_64 it isn't in defconfig or OpenSuse 10.2, but is in Fedora Core 6.

So I've taken Andi's advice and checked in the patches below. (I didn't bother updating defconfig; it will generate according to this rule, and I don't want to conflict with any re-generation Andi might be checking in.)

thanks,
-Len

commit 243b66e76ab722cdec1921d7f80c0cb808131c37
Author: Len Brown <[EMAIL PROTECTED]>
Date:   Thu Feb 15 22:34:36 2007 -0500

    ACPI: always enable CONFIG_PNPACPI on CONFIG_ACPI kernels

    We removed the ACPI motherboard driver which handled the ACPI=y, PNP=n case, so now we need to enforce that PNP & PNPACPI are always enabled for ACPI kernels.

    Most major distros ship this way already.
Cc: Bjorn Helgaas <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index 20eacc2..2d21fed 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -13,6 +13,7 @@ config ACPI depends on IA64 || X86 depends on PCI depends on PM + select PNP default y ---help--- Advanced Configuration and Power Interface (ACPI) support for diff --git a/drivers/pnp/pnpacpi/Kconfig b/drivers/pnp/pnpacpi/Kconfig index ad27e5e..b04767c 100644 --- a/drivers/pnp/pnpacpi/Kconfig +++ b/drivers/pnp/pnpacpi/Kconfig @@ -2,17 +2,5 @@ # Plug and Play ACPI configuration # config PNPACPI - bool "Plug and Play ACPI support" - depends on PNP && ACPI - default y - ---help--- - Linux uses the PNPACPI to autodetect built-in - mainboard resources (e.g. parallel port resources). - - Some features (e.g. real hotplug) are not currently - implemented. - - If you would like the kernel to detect and allocate resources to - your mainboard devices (on some systems they are disabled by the - BIOS) say Y here. Also the PNPACPI can help prevent resource - conflicts between mainboard devices and other bus devices. + bool + default (PNP && ACPI) commit 8d4956c201c2f7683289f70095443c59a39f94ef Author: Len Brown <[EMAIL PROTECTED]> Date: Thu Feb 15 22:46:42 2007 -0500 ACPI: remove non-PNPACPI version of get_rtc_dev() It isn't needed in ACPI code anymore because now ACPI always includes PNPACPI. Cc: David Brownell <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> diff --git a/drivers/acpi/glue.c b/drivers/acpi/glue.c index 9950087..4334c20 100644 --- a/drivers/acpi/glue.c +++ b/drivers/acpi/glue.c @@ -255,8 +255,6 @@ arch_initcall(init_acpi_device_notify); static struct cmos_rtc_board_info rtc_info; -#ifdef CONFIG_PNPACPI - /* PNP devices are registered in a subsys_initcall(); * ACPI specifies the PNP IDs to use. 
*/ @@ -280,31 +278,6 @@ static struct device *__init get_rtc_dev(void) return bus_find_device(_bus_type, NULL, NULL, pnp_match); } -#else - -/* We expect non-PNPACPI platforms to register an RTC device, usually - * at or near arch_initcall(). That also helps for example PCs that - * aren't configured with ACPI (where this code wouldn't run, but the - * RTC would still be available). The device name matches the driver; - * that's how the platform bus works. - */ -#include - -static int __init platform_match(struct device *dev, void *data) -{ - struct platform_device *pdev; - - pdev = container_of(dev, struct platform_device, dev); - return strcmp(pdev->name, "rtc_cmos") == 0; -} - -static struct device *__init get_rtc_dev(void) -{ - return bus_find_device(_bus_type, NULL, NULL, platform_match); -} - -#endif - static int __init acpi_rtc_init(void) { struct device *dev = get_rtc_dev(); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] net, 8139too.c: fix netpoll deadlock
On Wed, 14 Feb 2007 11:30:25 +0900 (JST), Atsushi Nemoto <[EMAIL PROTECTED]> wrote:
> > hm, this isnt really about NAPI polling, but about the
> > netconsole/netpoll/netdump poll_controller() handler.
> >
> > with netconsole, printk can be called from IRQ context (and is
> > frequently from IRQ context during bootup or module initialization), so
> > a BH rule isnt enough for them.
>
> I see NAPI poll routine might be called with interrupt disabled.
>
> Many (all?) NAPI drivers call netif_receive_skb() from its poll
> routine (as described in NAPI-HOWTO.txt), but I thought
> netif_receive_skb() cannot be called from irq context or irq disabled.
> So it seems the problem is not solved completely.  Or am I missing
> something?

Any comments on this issue? If my understanding is correct, I think adding a check to netif_receive_skb() is better than fixing all the poll routines. Is this patch acceptable?

Subject: fix irq problem with NAPI + NETPOLL

It seems netif_receive_skb() was designed not to be called from irq context, but NAPI + NETPOLL breaks this rule. If netif_receive_skb() is called from irq context, redirect to netif_rx() instead of processing the skb in that context.
Signed-off-by: Atsushi Nemoto <[EMAIL PROTECTED]> --- --- linux-2.6.20/net/core/dev.c 2007-02-05 03:44:54.0 +0900 +++ linux/net/core/dev.c2007-02-16 13:19:06.0 +0900 @@ -1769,8 +1769,15 @@ int netif_receive_skb(struct sk_buff *sk __be16 type; /* if we've gotten here through NAPI, check netpoll */ - if (skb->dev->poll && netpoll_rx(skb)) - return NET_RX_DROP; +#ifdef CONFIG_NET_POLL_CONTROLLER + if (skb->dev->poll && skb->dev->poll_controller) { + /* NAPI poll might be called in irq context on NETPOLL */ + if (in_irq() || irqs_disabled()) + return netif_rx(skb); + if (netpoll_rx(skb)) + return NET_RX_DROP; + } +#endif if (!skb->tstamp.off_sec) net_timestamp(skb); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
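The idea in the patch, handing a packet to a deferred path when the caller is in a context where full processing is not allowed, can be sketched in user space as follows. This is a toy model under stated assumptions: the in_irq flag stands in for the kernel's in_irq() || irqs_disabled() test, the backlog array stands in for the queue that netif_rx() feeds, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BACKLOG_MAX 4

enum rx_result { RX_PROCESSED, RX_DEFERRED, RX_DROPPED };

struct toy_rx {
	int in_irq;                    /* stand-in for in_irq() || irqs_disabled() */
	int backlog[BACKLOG_MAX];      /* packets parked for later processing */
	size_t backlog_len;
	size_t processed;              /* packets handled in a safe context */
};

/* Park a packet for later; drop it if the backlog is full. */
static enum rx_result toy_enqueue(struct toy_rx *rx, int pkt)
{
	if (rx->backlog_len == BACKLOG_MAX)
		return RX_DROPPED;
	rx->backlog[rx->backlog_len++] = pkt;
	return RX_DEFERRED;
}

/* Process directly when safe, otherwise hand off to the backlog. */
static enum rx_result toy_receive(struct toy_rx *rx, int pkt)
{
	assert(rx != NULL);
	if (rx->in_irq)
		return toy_enqueue(rx, pkt);
	rx->processed++;
	return RX_PROCESSED;
}

/* Drain the backlog from a context where full processing is allowed. */
static void toy_drain(struct toy_rx *rx)
{
	assert(!rx->in_irq);
	rx->processed += rx->backlog_len;
	rx->backlog_len = 0;
}
```

The trade-off is the same one the patch accepts: a packet seen in the restricted context costs an extra queueing step, but the receive path itself never runs where it was not designed to run.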
Re: [PATCH 1/1] - acpi_unload_table_id() always returns error
Thanks for the fix, John.

Do you grant Intel permission to apply it to the upstream ACPICA tree (with its non-GPL license)?

-Len

On Thursday 15 February 2007 15:08, John Keller wrote:
> acpi_unload_table_id() is always returning an error status.
> Also, once the matching table is found, don't bother looking
> for another match.
>
>
> Signed-off-by: John Keller <[EMAIL PROTECTED]>
> ---
>
>
> Index: release/drivers/acpi/tables/tbxface.c
> ===
> --- release.orig/drivers/acpi/tables/tbxface.c	2007-02-13 08:20:42.0 -0600
> +++ release/drivers/acpi/tables/tbxface.c	2007-02-15 14:04:07.855248010 -0600
> @@ -338,9 +338,9 @@ acpi_status acpi_unload_table_id(acpi_ow
>  	int i;
>  	acpi_status status = AE_NOT_EXIST;
>  
> -	ACPI_FUNCTION_TRACE(acpi_unload_table);
> +	ACPI_FUNCTION_TRACE(acpi_unload_table_id);
>  
> -	/* Find table from the requested type list */
> +	/* Find table in the global table list */
>  	for (i = 0; i < acpi_gbl_root_table_list.count; ++i) {
>  		if (id != acpi_gbl_root_table_list.tables[i].owner_id) {
>  			continue;
> @@ -352,8 +352,9 @@ acpi_status acpi_unload_table_id(acpi_ow
>  		 * simply a position within the hierarchy
>  		 */
>  		acpi_tb_delete_namespace_by_owner(i);
> -		acpi_tb_release_owner_id(i);
> +		status = acpi_tb_release_owner_id(i);
>  		acpi_tb_set_table_loaded_flag(i, FALSE);
> +		break;
>  	}
>  	return_ACPI_STATUS(status);
>  }
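The two defects the quoted patch addresses, a status that is initialised to "not found" and never updated, and a loop that keeps scanning after a match, are easy to see in a reduced form. The following is a toy reconstruction with hypothetical names (toy_table, toy_unload_by_id, TOY_OK, TOY_NOT_EXIST). It does not call any ACPICA function.

```c
#include <assert.h>
#include <stddef.h>

enum toy_status { TOY_OK = 0, TOY_NOT_EXIST = 1 };

struct toy_table {
	int owner_id;
	int loaded;
};

/*
 * Unload the first table owned by id.  The status starts as "not found"
 * and is updated only on a match; the loop stops at the first hit.
 */
static enum toy_status toy_unload_by_id(struct toy_table *tables,
					size_t count, int id)
{
	enum toy_status status = TOY_NOT_EXIST;
	size_t i;

	assert(tables != NULL || count == 0);
	for (i = 0; i < count; ++i) {
		if (tables[i].owner_id != id)
			continue;
		tables[i].loaded = 0;
		status = TOY_OK;       /* without this, callers always see an error */
		break;                 /* one match is enough */
	}
	return status;
}
```

Dropping the assignment reproduces the "always returns error" symptom, and dropping the break makes the function touch every table that shares the id.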
RE: kernel porting query
1) Can any one please shed some light on precisely and exactly what are differences in different boards for which we need to port linux? The differences depend on the boards. If the boards belong to the same family, the differences are mostly in the peripherals: interrupt mappings, the serial interface, memory sizes (if different boards have different memory sizes). Sometimes devices are moved to other locations in the memory map (look at the memory map of the SoC on your board). 2) Also, I would appreciate if you could point out code portions / source files that need to be changed in the process of porting Linux? Look in linux-2.6.20\arch\mips\mips-boards for help on porting to your board. You can add a directory specific to your board here. I think you can start from the xxx_setup.c file in the above directory. Good luck. Ajay. TIA, Rick
Re: somebody dropped a (warning) bomb
On 02/15/2007 08:02 PM, Linus Torvalds wrote: Think of it this way: in science, a theory is proven to be bad by a single undeniable fact just showing that it's wrong. The same is largely true of a warning. If the warning sometimes happens for code that is perfectly fine, the warning is bad. Slight difference; if a compulsory warning sometimes happens for code that is perfectly fine, the warning is bad. I do want to be _able_ to get as many warnings as a compiler can muster though. Given char's special nature, shouldn't the conclusion of this thread have long been simply that gcc needs -Wno-char-pointer-sign? (with whatever default, as far as I'm concerned). Rene.
Re: GPL vs non-GPL device drivers
On Feb 15, 2007, Chris Snook <[EMAIL PROTECTED]> wrote: > v j wrote: >> You don't get it do you. Our source code is meaningless to the Open >> Source community at large. You don't have to offer it to the community at large. You only have to pass it on to your customers, under the terms of the GPL. > Collaborating with the competition ("coopetition") on a common > technology platform reduces costs for anyone who chooses to get > involved, giving them a collective competitive edge against anyone who > doesn't. http://www.lsd.ic.unicamp.br/~oliva/papers/free-software/BMind.pdf -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: GPL vs non-GPL device drivers
On Feb 15, 2007, "v j" <[EMAIL PROTECTED]> wrote: > On 2/14/07, Arjan van de Ven <[EMAIL PROTECTED]> wrote: >> I think you have a bit of a misunderstanding... Linux is not royalty >> free. Just the royalty is not in the form of cash, but in the form of >> having to give your improvements back to the open source world. It's not giving back, it's giving forward. Improvements don't have to go back, but whoever receives them must receive them under the same license. > Sure. But this is not legally binding. Indeed, you don't have to give it back or give it forward. It's just that, if you don't comply with the license, you don't have permission to distribute the software at all. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: [PATCH] shm: Fix the locking and cleanup error handling in do_shmat.
On Thu, 2007-02-15 at 22:34 -0500, Eric W. Biederman wrote: > > drivers/video/Kconfig:1606:warning: 'select' used by config symbol > 'FB_PS3' > > refer to undefined symbol 'PS3_PS3AV' > > /mnt/md0/devel/linux-mm/ipc/shm.c: In function 'do_shmat': > > /mnt/md0/devel/linux-mm/ipc/shm.c:945: warning: passing argument 1 > of 'IS_ERR' > > makes pointer from integer without a cast > > /mnt/md0/devel/linux-mm/ipc/shm.c:946: warning: passing argument 1 > of 'PTR_ERR' > > makes pointer from integer without a cast > > /mnt/md0/devel/linux-mm/ipc/shm.c:931: warning: label 'out_nattch' > defined but > > not used > > /mnt/md0/devel/linux-mm/ipc/shm.c:890: error: label 'out_put_path' > used but not > > defined > > make[2]: *** [ipc/shm.o] Error 1 > > make[1]: *** [ipc] Error 2 > > make: *** [_all] Error 2 > > Definitely some weird patch application problem. > > All of the calls to IS_ERR and PTR_ERR should have been removed. > Michal since it didn't seem to blow up when Andrew applied it I'm > going to assume the problem is on your end for now. > > Eric > Hi Eric, I am also having trouble with the incorrect "nattch" value in do_shmat(). I found this bug while running LTP tests on the blackfin-uClinux platform: the "nattch" value returned from the shmctl() system call is wrong. So I think your patch can solve this bug. Which Linux kernel version does your patch apply to? I want to test it on my blackfin-uClinux 2.6.20 platform. Thanks -Bryan
[PATCH] Add Cobalt button interface driver support
Hi, This patch adds support for the back panel buttons on Cobalt server. It's tested on the Cobalt Qube2. Yoichi Signed-off-by: Yoichi Yuasa <[EMAIL PROTECTED]> diff -pruN -X mips/Documentation/dontdiff mips-orig/drivers/input/misc/Kconfig mips/drivers/input/misc/Kconfig --- mips-orig/drivers/input/misc/Kconfig2007-02-14 15:35:30.566862000 +0900 +++ mips/drivers/input/misc/Kconfig 2007-02-15 17:20:28.721985250 +0900 @@ -89,4 +89,10 @@ config HP_SDC_RTC Say Y here if you want to support the built-in real time clock of the HP SDC controller. +config INPUT_COBALT_BTNS + tristate "Cobalt button interface" + depends on MIPS_COBALT + help + Say Y here if you want to support MIPS Cobalt button interface. + endif diff -pruN -X mips/Documentation/dontdiff mips-orig/drivers/input/misc/Makefile mips/drivers/input/misc/Makefile --- mips-orig/drivers/input/misc/Makefile 2007-02-14 15:35:30.566862000 +0900 +++ mips/drivers/input/misc/Makefile2007-02-15 11:55:30.856042500 +0900 @@ -12,3 +12,4 @@ obj-$(CONFIG_INPUT_WISTRON_BTNS) += wist obj-$(CONFIG_INPUT_ATLAS_BTNS) += atlas_btns.o obj-$(CONFIG_HP_SDC_RTC) += hp_sdc_rtc.o obj-$(CONFIG_INPUT_IXP4XX_BEEPER) += ixp4xx-beeper.o +obj-$(CONFIG_INPUT_COBALT_BTNS)+= cobalt_btns.o diff -pruN -X mips/Documentation/dontdiff mips-orig/drivers/input/misc/cobalt_btns.c mips/drivers/input/misc/cobalt_btns.c --- mips-orig/drivers/input/misc/cobalt_btns.c 1970-01-01 09:00:00.0 +0900 +++ mips/drivers/input/misc/cobalt_btns.c 2007-02-15 17:17:31.550912750 +0900 @@ -0,0 +1,199 @@ +/* + * Cobalt button interface driver. + * + * Copyright (C) 2007 Yoichi Yuasa <[EMAIL PROTECTED]> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ +#include +#include +#include +#include +#include +#include +#include + +#include + +#define BUTTONS_POLL_FREQUENCY 30 +#define BUTTONS_COUNT_THRESHOLD3 +#define BUTTONS_STATUS_MASK0xfe00 + +struct buttons_dev { + struct input_dev *input; + void __iomem *reg; +}; + +struct buttons_map { + uint32_tmask; + int keycode; + int count; +}; + +static struct buttons_map buttons_map[] = { + { 0x0200, KEY_RESTART, }, + { 0x0400, KEY_LEFT, }, + { 0x0800, KEY_UP, }, + { 0x1000, KEY_DOWN, }, + { 0x2000, KEY_RIGHT, }, + { 0x4000, KEY_ENTER, }, + { 0x8000, KEY_SELECT, }, +}; + +static struct resource cobalt_buttons_resource = { + .start = 0x1d00, + .end= 0x1d03, + .flags = IORESOURCE_MEM, +}; + +static struct platform_device cobalt_buttons_device = { + .name = "Cobalt buttons", + .num_resources = 1, + .resource = _buttons_resource, +}; + +static struct timer_list buttons_timer; + +static void handle_buttons(unsigned long data) +{ + struct buttons_map *button = buttons_map; + struct buttons_dev *bdev; + uint32_t status; + int i; + + bdev = (struct buttons_dev *)data; + status = readl(bdev->reg); + status = ~status & BUTTONS_STATUS_MASK; + + for (i = 0; i < ARRAY_SIZE(buttons_map); i++) { + if (status & button->mask) { + button->count++; + } else { + if (button->count >= BUTTONS_COUNT_THRESHOLD) { + input_report_key(bdev->input, button->keycode, 0); + input_sync(bdev->input); + } + button->count = 0; + } + + if (button->count == BUTTONS_COUNT_THRESHOLD) { + input_report_key(bdev->input, button->keycode, 1); + 
input_sync(bdev->input); + } + + button++; + } + + mod_timer(_timer, jiffies + HZ / BUTTONS_POLL_FREQUENCY); +} + +static int __init cobalt_buttons_probe(struct platform_device *pdev) +{ + struct buttons_dev *bdev; + struct input_dev *input; + struct resource *res; + int retval, i; + + bdev = kzalloc(sizeof(struct buttons_dev), GFP_KERNEL); + if
Re: [PATCH] Add Cobalt button interface driver support
On Thursday 15 February 2007 22:36, Yoichi Yuasa wrote: > Hi, > > This patch adds support for the back panel buttons on Cobalt server. > It's tested on the Cobalt Qube2. > Hi, Thank you for your patch. Couple of comments: > + > + button++; > + } > + > + mod_timer(_timer, jiffies + HZ / BUTTONS_POLL_FREQUENCY); May I suggest using msecs_to_jiffies() to avoid direct computations on HZ? > + > + bdev->input = input; > + bdev->reg = ioremap(res->start, res->end - res->start + 1); > + dev_set_drvdata(>dev, bdev); > + > + setup_timer(_timer, handle_buttons, (unsigned long)bdev); > + mod_timer(_timer, jiffies + HZ / BUTTONS_POLL_FREQUENCY); Please implement cobalt_buttons_open() and cobalt_buttons_close() methods and start/stop timer from there - there is no point in polling hardware if noone is listening to the events. > + > + return retval; > +} > + > +static int __devexit cobalt_buttons_remove(struct platform_device *pdev) > +{ > + struct device *dev = >dev; > + struct buttons_dev *bdev = dev_get_drvdata(dev); > + > + del_timer(_timer); del_timer_sync? Is there any possibility it may run SMP? > + > +static void __exit cobalt_buttons_exit(void) > +{ > + platform_driver_unregister(_buttons_driver); You are not allowed to unregister statically allocated devices - if there are references left and your module goes away then these references become invalid and kernel goes boom. Please convert to platform_device_alloc/platform_device_add. Is there a point of moving platform device creation code into platform-specific code? Is there a possibility of this driver being used elsewhere? > + platform_device_unregister(_buttons_device); > +} > + > +module_init(cobalt_buttons_init); > +module_exit(cobalt_buttons_exit); > -- Dmitry - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
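For readers following the review, here is a small user-space sketch of the debounce counter that the polling routine in the patch implements, plus the millisecond interval one would hand to msecs_to_jiffies() as Dmitry suggests. The names `button_sample`, `poll_interval_ms` and the event codes are made up for illustration; only the threshold of 3 and the 30 Hz poll frequency come from the patch. It is a model of the logic, not driver code.

```c
#define COUNT_THRESHOLD 3     /* BUTTONS_COUNT_THRESHOLD in the patch */

enum { EV_NONE = 0, EV_PRESS = 1, EV_RELEASE = -1 };  /* hypothetical codes */

struct button {
    int count;                /* consecutive polls that saw the button down */
};

/*
 * Feed one polled sample (nonzero = button down) and return the event
 * the driver would report for it.
 */
static int button_sample(struct button *b, int pressed)
{
    int event = EV_NONE;

    if (pressed) {
        b->count++;
    } else {
        /* Only report a release if a press was reported earlier. */
        if (b->count >= COUNT_THRESHOLD)
            event = EV_RELEASE;
        b->count = 0;
    }

    /* Report the press exactly once, when the threshold is first reached. */
    if (b->count == COUNT_THRESHOLD)
        event = EV_PRESS;

    return event;
}

/* Poll period in milliseconds for a given poll frequency in Hz. */
static unsigned int poll_interval_ms(unsigned int freq_hz)
{
    return 1000u / freq_hz;
}
```

With a 30 Hz poll frequency the helper yields 33 ms, so in the driver the re-arm could look like mod_timer(..., jiffies + msecs_to_jiffies(33)), which keeps the period independent of the configured HZ.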
Re: [PATCH] Linux Kernel Markers Documentation - fix
Linux Kernel Markers Documentation - fix Fixes from Randy's comments. Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> --- a/Documentation/marker.txt +++ b/Documentation/marker.txt @@ -18,7 +18,7 @@ code is reached. They can be used for tracing (LTTng, LKET over SystemTAP), overall performance accounting (SystemTAP). They could also be used to implement -efficient hooks for SELinux or any other subsystem the would have this +efficient hooks for SELinux or any other subsystem that would have this kind of need. Using the markers for system audit (SELinux) would require to pass a @@ -30,8 +30,9 @@ variable by address that would be later checked by the marked routine. MARK(subsystem_event, "%d %s %p[struct task_struct]", someint, somestring, current); Where : -- Subsystem is the name of your subsystem. -- event is the name of the event to mark. +- subsystem_event is an identifier unique to your event +- subsystem is the name of your subsystem. +- event is the name of the event to mark. - "%d %s %p[struct task_struct]" is the formatted string for (printk-style). - someint is an integer. - somestring is a char pointer. @@ -39,7 +40,7 @@ Where : The expression %p[struct task_struct] is a suggested marker definition standard that could eventually be used for pointer type checking in -sparse. The brackets contain the type to which the pointer refer. +sparse. The brackets contain the type to which the pointer refers. The marker mechanism supports multiple instances of the same marker. Markers can be put in inline functions, inlined static functions and @@ -104,8 +105,8 @@ static int __init probe_init(void) { int result; result = marker_set_probe("subsystem_event", - FS_CLOSE_FORMAT, - probe_fs_close); + SUBSYSTEM_EVENT_FORMAT, + probe_subsystem_event); if (!result) goto cleanup; return 0; -- Mathieu Desnoyers Computer Engineering Ph.D. 
Candidate, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
Re: e1000_intr in request_irq faults in 2.6.20-git
On Thursday 15 February 2007 21:10, Brandeburg, Jesse wrote: > Eric W. Biederman wrote: > > Len Brown <[EMAIL PROTECTED]> writes: > > > >> e1000 faults in 2.6.20-git, while 2.6.20 worked fine. > >> > >> System is a D875PBZ with LOM. > >> > >> clues? > > > > I'm guessing this is an old bug found by the following bit of > > debug coded added into since v2.6.20 > > > > +#ifdef CONFIG_DEBUG_SHIRQ > > + if (irqflags & IRQF_SHARED) { > > + /* > > +* It's a shared IRQ -- the driver ought to be > > prepared for it +* to happen immediately, so let's > > make sure +* We do this before actually > > registering it, to make sure that +* a 'real' IRQ > > doesn't run in parallel with our fake +*/ > > + if (irqflags & IRQF_DISABLED) { > > + unsigned long flags; > > + > > + local_irq_save(flags); > > + handler(irq, dev_id); > > + local_irq_restore(flags); > > + } else > > + handler(irq, dev_id); > > + } > > +#endif > > > > I don't have a clue why the e1000 wasn't ready though. > > > > our code is clearly calling request_irq before we have assigned the > function pointer adapter->clean_rx as well as adapter->alloc_rx_buf > > That would be a bug, a possible patch would be (inline and attached): > compile tested, *but* I couldn't test this patch to make sure it worked > because I couldn't boot 2.6.20-git due to it not finding my RAID0 + lvm > disk. 
> > [PATCH] e1000: fix shared interrupt warning message > > From: Jesse Brandeburg <[EMAIL PROTECTED]> > > Signed-off-by: Jesse Brandeburg <[EMAIL PROTECTED]> > --- > > drivers/net/e1000/e1000_main.c | 13 +++-- > 1 files changed, 7 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/e1000/e1000_main.c > b/drivers/net/e1000/e1000_main.c > index 619c892..b8c4d5c 100644 > --- a/drivers/net/e1000/e1000_main.c > +++ b/drivers/net/e1000/e1000_main.c > @@ -1417,10 +1417,6 @@ e1000_open(struct net_device *netdev) > if ((err = e1000_setup_all_rx_resources(adapter))) > goto err_setup_rx; > > - err = e1000_request_irq(adapter); > - if (err) > - goto err_req_irq; > - > e1000_power_up_phy(adapter); > > if ((err = e1000_up(adapter))) > @@ -1431,6 +1427,10 @@ e1000_open(struct net_device *netdev) > e1000_update_mng_vlan(adapter); > } > > + err = e1000_request_irq(adapter); > + if (err) > + goto err_req_irq; > + > /* If AMT is enabled, let the firmware know that the network > * interface is now open */ > if (adapter->hw.mac_type == e1000_82573 && > @@ -1439,10 +1439,11 @@ e1000_open(struct net_device *netdev) > > return E1000_SUCCESS; > > +err_req_irq: > + e1000_down(adapter); > + e1000_free_irq(adapter); > err_up: > e1000_power_down_phy(adapter); > - e1000_free_irq(adapter); > -err_req_irq: > e1000_free_all_rx_resources(adapter); > err_setup_rx: > e1000_free_all_tx_resources(adapter); > Works for me(tm) on latest 2.6.20-git and D875PBZ. thanks, -Len - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: GPL vs non-GPL device drivers
On Feb 15, 2007, Jeff Garzik <[EMAIL PROTECTED]> wrote: > Michael K. Edwards wrote: >> On 2/15/07, Jeff Garzik <[EMAIL PROTECTED]> wrote: >>> The /whole point/ of the GPL is to funnel contributions back. >> Bzzzt. The whole point of the GPL is to "guarantee your freedom to >> share and change free software--to make sure the software is free for >> all its users." > No, that's the FSF marketing fluff you've been taught to recite. The same FSF that wrote the GPL, no less ;-) > In the context of the Linux kernel, I'm referring to the original > reason why Linus chose the GPL for the Linux kernel. If he chose it for this reason, he chose the wrong license. The GPL does not funnel contributions back. It doesn't even require anyone to make contributions: you're free to keep your improvements only to yourself. And even if you distribute them, you can choose whom to distribute it to, and that might very well leave the 'back' out. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: [PATCH] Linux Kernel Markers Documentation
Hi Randy, * Randy Dunlap ([EMAIL PROTECTED]) wrote: > so is the MARK() supposed to be: > MARK(subsystem, event, ... > > Please make the 2 doc. lines above match the parameters... > or the parameters match the text. > Fixing the paragraph below. > > +- "%d %s %p[struct task_struct]" is the formatted string for > > (printk-style). > > +- someint is an integer. > > +- somestring is a char pointer. > > +- current is a pointer to a struct task_struct. > > + > > +The expression %p[struct task_struct] is a suggested marker definition > > +standard that could eventually be used for pointer type checking in > > +sparse. The brackets contain the type to which the pointer refer. > refers. > > > + > > +#define SUBSYSTEM_EVENT_FORMAT "%d %s %p[struct task_struct]" > > Is SUBSYSTEM_EVENT_FORMAT used implicitly below? or elsewhere? > Yes, error follows. > > +void probe_subsystem_event(const char *format, ...) > > +{ > > + va_list ap; > > + /* Declare args */ > > + unsigned int value; > > + const char *mystr; > > + struct task_struct *task; > > + > > + /* Assign args */ > > + va_start(ap, format); > > + value = va_arg(ap, typeof(value)); > > + mystr = va_arg(ap, typeof(mystr)); > > + task = va_arg(ap, typeof(task)); > > + > > + /* Call tracer */ > > + trace_subsystem_event(value, mystr, task); > > + > > + /* Or call printk */ > > + vprintk(format, ap); > > + > > + /* or count, check rights... */ > > + > > + va_end(ap); > > +} > > + > > +static int __init probe_init(void) > > +{ > > + int result; > > + result = marker_set_probe("subsystem_event", > > + FS_CLOSE_FORMAT, > > + probe_fs_close); > > Do FS_CLOSE_FORMAT and probe_fs_close() need to be defined here? > I.e., is this a complete example? > should be SUBSYSTEM_EVENT_FORMAT. Will fix, thanks. Regards, Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. 
Candidate, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
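As an illustration of the variadic probe pattern discussed in this thread, here is a user-space sketch using the standard C stdarg facilities. It is not the kernel marker API: `probe_event` and `struct seen` are hypothetical, and vsnprintf() stands in for vprintk(). One detail it makes explicit: once va_arg() has consumed arguments from a va_list, formatting from that same list would start where extraction stopped, so a probe that wants both the typed values and the formatted text can take a va_copy() first.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical record of what the probe observed. */
struct seen {
    int value;
    const char *str;
    const void *ptr;
    char text[64];
};

/*
 * Variadic probe: callers are assumed to pass an int, a const char *
 * and a const void *, in that order, after the format string.
 */
static void probe_event(struct seen *out, const char *format, ...)
{
    va_list ap, ap_print;

    va_start(ap, format);
    va_copy(ap_print, ap);          /* untouched copy, used for formatting */

    /* Typed extraction, as in the documentation example. */
    out->value = va_arg(ap, int);
    out->str = va_arg(ap, const char *);
    out->ptr = va_arg(ap, const void *);

    /* Formatting uses the copy, which still points at the first argument. */
    vsnprintf(out->text, sizeof(out->text), format, ap_print);

    va_end(ap_print);
    va_end(ap);
}
```

A call such as probe_event(&s, "%d %s", 7, (const char *)"hello", (const void *)&s) fills in the typed fields and leaves the formatted text in s.text.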
Re: [PATCH] shm: Fix the locking and cleanup error handling in do_shmat.
> drivers/video/Kconfig:1606:warning: 'select' used by config symbol 'FB_PS3' > refer to undefined symbol 'PS3_PS3AV' > /mnt/md0/devel/linux-mm/ipc/shm.c: In function 'do_shmat': > /mnt/md0/devel/linux-mm/ipc/shm.c:945: warning: passing argument 1 of 'IS_ERR' > makes pointer from integer without a cast > /mnt/md0/devel/linux-mm/ipc/shm.c:946: warning: passing argument 1 of > 'PTR_ERR' > makes pointer from integer without a cast > /mnt/md0/devel/linux-mm/ipc/shm.c:931: warning: label 'out_nattch' defined but > not used > /mnt/md0/devel/linux-mm/ipc/shm.c:890: error: label 'out_put_path' used but > not > defined > make[2]: *** [ipc/shm.o] Error 1 > make[1]: *** [ipc] Error 2 > make: *** [_all] Error 2 Definitely some weird patch application problem. All of the calls to IS_ERR and PTR_ERR should have been removed. Michal since it didn't seem to blow up when Andrew applied it I'm going to assume the problem is on your end for now. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.20-git] remove modpost false warnings on ARM
On Thu, 2007-02-15 at 19:10 -0800, David Brownell wrote: > This patch stops "modpost" from issuing erroneous modpost warnings on ARM > builds, which it's been doing since maybe last summer. A canonical > example would be driver method table entries: > > WARNING: - Section mismatch: reference to .exit.text:_remove > from .data after '$d' (at offset 0x4) Looks fine to me. Rusty.
Re: [PATCH 9/11] Panic delay fix
On Thu, 2007-02-15 at 15:42 -0800, Zachary Amsden wrote: > So Rusty, Chris, Jeremy, any objections to killing udelay() and friends > in paravirt-ops? It would simplify things a bit. I agree. Thanks! Rusty.
Re: [patch 09/14] syslets: x86, add move_user_context() method
On Thu, 15 Feb 2007, Davide Libenzi wrote: > On Thu, 15 Feb 2007, Ingo Molnar wrote: > > > /* > > + * Move user-space context from one kernel thread to another. > > + * This includes registers and FPU state. Callers must make > > + * sure that neither task is running user context at the moment: > > + */ > > +void > > +move_user_context(struct task_struct *new_task, struct task_struct > > *old_task) > > +{ > > + struct pt_regs *old_regs = task_pt_regs(old_task); > > + struct pt_regs *new_regs = task_pt_regs(new_task); > > + union i387_union *tmp; > > + > > + *new_regs = *old_regs; > > + /* > > +* Flip around the FPU state too: > > +*/ > > + tmp = new_task->thread.i387; > > + new_task->thread.i387 = old_task->thread.i387; > > + old_task->thread.i387 = tmp; > > +} > > Let's say that old_task ("prev" at the incoming schedule) has TS_USEDFPU > set. Its context gets moved to the new_task (the one returning to > userspace) *before* the __unlazy_fpu() done in __switch_to(). The > __unlazy_fpu() at the following schedule will save the state to the old > new_task context, and that fine as far as the going-to-sleep task goes. > The next fault happening in new_task (return to userspace one) will reload > a non up2date context (the one we got from old_task, but never hit by > the __unlazy_fpu() flush). Right? Yeah. Given TS_USEDFPU set, before move_user_context(): CPU => FPUc NTSK => FPUn OTSK => FPUo After move_user_context(): CPU => FPUc NTSK => FPUo OTSK => FPUn After the incoming __unlazy_fpu() in __switch_to(): CPU => FPUc NTSK => FPUo OTSK => FPUc After the first fault in NTSK: CPU => FPUo NTSK => FPUo OTSK => FPUc So NTSK loads a non up2date FPUo, instead of the FPUc that was the "dirty" context to migrate (since TS_USEDFPU was set). I think you need an early __unlazy_fpu() in that case. 
- Davide
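Davide's state trace can be replayed with a toy model. The sketch below is plain user-space C under loud assumptions: FPU contents are reduced to an integer tag, `thread.i387` is modelled as a pointer to a save area (as in the quoted patch, which swaps the pointers), and `unlazy_fpu` here is a simplified stand-in for the kernel's __unlazy_fpu(). The function `ntsk_reload` is hypothetical and returns the state the new task would load on its first FPU fault.

```c
/* Tags for the three FPU states in the trace: FPUn, FPUo, and live FPUc. */
enum { FPU_N = 1, FPU_O = 2, FPU_C = 3 };

struct fpu_area { int state; };

struct task {
    struct fpu_area *i387;  /* pointer to this task's save area */
    int used_fpu;           /* models TS_USEDFPU */
};

/* Simplified flush: write the live CPU state into the task's save area. */
static void unlazy_fpu(struct task *t, int cpu_fpu)
{
    if (t->used_fpu) {
        t->i387->state = cpu_fpu;
        t->used_fpu = 0;
    }
}

/* The pointer swap from the quoted move_user_context(). */
static void move_user_context(struct task *new_task, struct task *old_task)
{
    struct fpu_area *tmp = new_task->i387;

    new_task->i387 = old_task->i387;
    old_task->i387 = tmp;
}

/* Replay the trace; flush_early selects the early-flush variant. */
static int ntsk_reload(int flush_early)
{
    struct fpu_area area_n = { FPU_N }, area_o = { FPU_O };
    struct task ntsk = { &area_n, 0 }, otsk = { &area_o, 1 };
    int cpu_fpu = FPU_C;

    if (flush_early)
        unlazy_fpu(&otsk, cpu_fpu);   /* save FPUc before the swap */
    move_user_context(&ntsk, &otsk);
    unlazy_fpu(&otsk, cpu_fpu);       /* the flush done in __switch_to() */

    return ntsk.i387->state;          /* what NTSK's first FPU fault loads */
}
```

In this model ntsk_reload(0) yields FPU_O, the stale context from the trace, while ntsk_reload(1) yields FPU_C, which matches the suggestion that an early flush is needed when TS_USEDFPU is set.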
Re: [PATCH] input: extend EV_LED
On Thu, 15 Feb 2007, Richard Purdie wrote: > This has been discussed in several places several times. The problem > with hardware accelerated flashing is that you are often limited to > certain constraints (this case being no exception) and indicating what > these are to userspace in a generic fashion is difficult. The ability to blink at one rate is *very* common on laptops. Blinking at a few discrete rates is also common enough. They should be supported in a generic way. I want to convert ibm-acpi to the led interface, but if it means I have to provide custom attributes on top of the led class, it sort of defeats most of the purpose of using the led class to begin with -- it will NOT be generic. If I have to provide those attributes elsewhere in the sysfs tree other than somewhere in the led class, then it defeats the purpose of using the led class completely: I will just scrap the idea. I am not going to remove functionality. And I am not going to emulate in software something the hardware can do, especially when that means bothering the EC with a slow ACPI-subsystem-gated LPC bus IO port access for no good reason. Here's a suggestion for a simple, non-overengineered interface: a "blink" attribute (on/off) for leds which can hardware-blink. Having only one blink frequency is common enough that this attribute by itself is very useful (e.g. it is all a ThinkPad and most WiFi/network card leds need). For hardware-blink leds with various frequencies, there is the typical way to provide such things: give us a RO blink_available_frequencies attribute which says which discrete frequencies are allowed (space separated), and a RW blink_frequency attribute to set the frequency. If instead of blink_available_frequencies, the driver provides RO blink_frequency_min and _max attributes, then it means it can blink anywhere in that range of freqs. That is simple enough to implement and use, and generic enough. You just need to set in stone if you want the freq in Hz, or a submultiple.
You can even implement an optional "blink" software emulation that drivers can hook into for systems where the driver *knows* that led access is fast, but there is no hardware blinking support. > One way I've come up with adds a capability to the class to have LED > specific triggers and you can then expose these hardware capabilities as > an extra trigger specific to the LED. What would that look like? It doesn't sound too bad. Could you give us an example of what the tree would look like, and what the attributes would be (and do)? > Another proposal more specific to this use case was to have some > information behind the scenes which the software timer based trigger > could use to turn on the "hardware acceleration" if present and capable > of the requested mode. This might just need a function pointer in the > core so could be quite neat. This looks like a severely overengineered way to deal with the problem at first glance. -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie." -- The Silicon Valley Tarot Henrique Holschuh
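To make the proposed attribute pair concrete, here is a hedged user-space sketch of how a tool might pick a value for blink_frequency from the space-separated contents of blink_available_frequencies. Both attributes are only proposals in this thread, the helper name `nearest_blink_freq` is made up, and the unit is left open exactly as in the proposal (Hz or a submultiple).

```c
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a space-separated list of integer frequencies, as the proposed
 * blink_available_frequencies attribute would contain, and return the
 * entry closest to 'wanted'. Returns -1 if the list has no entries.
 * On a tie the earlier entry in the list wins.
 */
static long nearest_blink_freq(const char *available, long wanted)
{
    long best = -1;
    long best_diff = LONG_MAX;
    const char *p = available;

    for (;;) {
        char *end;
        long f = strtol(p, &end, 10);   /* skips leading whitespace */
        long diff;

        if (end == p)                   /* no more numbers to convert */
            break;

        diff = labs(f - wanted);
        if (diff < best_diff) {
            best = f;
            best_diff = diff;
        }
        p = end;
    }
    return best;
}
```

A caller would read the attribute text, pass it through this helper, and write the result to the RW attribute. For the min/max variant of the proposal a simple clamp would replace the search.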