Re: [PATCH/RFC] PCI prepare/activate instead of enable to avoid IRQ storm and rogue DMA access
On Thu, Mar 15, 2007 at 11:37:20AM +0900, Tejun Heo wrote:
...
> Also, the current implementation doesn't have any arch independent part.

I think you meant "arch dependent" here.

> It's wholly contained in arch independent PCI layer, but it might be
> beneficial to have arch dependent hooks (IRQ line enable/disable?) in
> the future.
>
> >What if the device with the IRQ problem is never loaded? Sometimes
> >devices aren't loaded until after boot.
>
> What do you mean by loading a device? Do you mean loading driver for
> the device?

Yes, I think that's what he meant.

> >Any change like this has to be done without changing device drivers.
> >Changing the skge/sky2 drivers as special case is not acceptable.

I don't like the idea of changing the driver API for PCI device setup.
But if it's necessary to solve this class of problem, I think it's ok.

> I dunno about that. What I'm proposing is alternative two-step PCI
> initialization step - the first step enables the device just enough for
> initialization/reset and the second one enables full access. We're
> doing part of it already for bus master. I'm proposing to expand that
> approach and make them handled by generic PCI layer. As you can see, it
> doesn't add noticeable complexity to drivers. I think it's even clearer
> than doing pci_set_master() explicitly.

Please update Documentation/pci.txt to reflect the API changes too.

> If this way of solving the problem is chosen, eventually most drivers
> should be converted to new initialization steps. And there is no way to
> do this without modifying low level driver. Only low level driver knows
> when full blown access can be enabled and such thing must happen before
> registering the device to upper layer (e.g. ATA/SCSI, netif).

Agreed. ISTR this has been discussed before but don't recall the exact
context. I'll try to find the previous thread.
When I started the parisc port on 2.4 kernels, the policy was to leave all interrupts enabled even if no interrupt handler was registered. This is useful for debugging misconfigured IRQ routing. Did the policy already change or is this a proposal to change the policy? thanks, grant > sky2/skge aren't exceptions. If this way of solving the problem is > chosen, eventually most if not all drivers should be converted to new > model. It may take two years, maybe five, but as a start just > converting ATA and network drivers shouldn't take too long and that > would help a lot of cases. > > Thanks. > > -- > tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH take3 16/20] acpi files switched
On Thursday 15 March 2007 01:13, Steven Rostedt wrote:
> Moved the shared files that were in arch/i386/kernel/acpi to the common
> area.

When I do a "make cscope" on an i386 or an x86_64 box, will it find
these files in the common area?

thanks
-Len
Re: [OT] Re: New thread RDSL, post-2.6.20 kernels and amanda (tar) miss-fires
On Thursday 15 March 2007, Willy Tarreau wrote:
[...]
>with "/bin/tar -f - >/tmp/test/", you ask bash to open the file
> "/tmp/test/" for write, then start tar and pass this file as its
> stdout. Obviously this is wrong. I think that what you're trying to do
> is send extracted files to /tmp/test, which is what '-C' is for. Also,
> you need to specify a command for tar. You didn't. I bet if you do the
> following, it will work :
>
>[EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k skip=1 |
>/bin/gzip -dc | /bin/tar -C /tmp/test/ -xf -
>
>Now, Gene, this is becoming totally off-topic right here.

My apologies, I've been corrected, thanks for your patience. And I'll
see if I can get that text in the amanda file headers amended too.

>Regards,
>Willy

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
It takes less time to do a thing right than it does to explain why you
did it wrong.
 -- H.W. Longfellow
Re: New thread RDSL, post-2.6.20 kernels and amanda (tar) miss-fires
On Thursday 15 March 2007, Ray Lee wrote:
>Gene Heskett wrote:
>> Here is an example
>> [EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k count=1
>> AMANDA: FILE 20070314104344 coyote /lib lev 1 comp .gz program
>> /bin/tar To restore, position tape at start of file and run:
>> dd if= bs=32k skip=1 | /bin/gzip -dc | /bin/tar -f - ...
>>
>> And the ellipsis is an error if not removed. Then one is supposed to
>> be able to redirect tars output with the usual >/tmp/test/ syntax
>>
>> So:
>> [EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k
>> skip=1 | /bin/gzip -dc | /bin/tar -f - >/tmp/test/
>> -bash: /tmp/test/: Is a directory
>>
>> which is the return from any variation in how the redirect is done.
>>
>> So what is it that am I doing wrong in the above command line?, so I
>> can add it to my helper scripts to be published eventually on
>> zmanda.org.
>
>One of us is confused, and it may very well be me, but...
>
>the /bin/tar -f - >/tmp/test/ looks to me like it should fail exactly as
>bash says it does. the output redirect (>) will only write out to a
>file, not a directory. (So, /tmp/file should work, /tmp/file/ won't.)
>
>Are you trying to redirect where the files get restored? That should be
>done with a cd before doing the uncompress.
>
>Or am I misunderstanding what you're telling me?
>
>Ray

No, apparently it's me that's been running with a fubar'd understanding.
I was certain that tar (or bash) should have been able to put the
recovered files IN the directory /tmp/test but that turns out to need
more options after the '-f -' section of that sample line I posted.

Thanks. A bunch..

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Mal: "You are very much lacking in imagination."
Zoe: "I imagine that's so, sir."
--Episode #8, "Out of Gas"
[PATCH] BLK_DEV_IDE_CELLEB dependency fix
It's bool and it depends on BLK_DEV_IDE => should depend on BLK_DEV_IDE=y

And move it to "if BLK_DEV_IDEDMA_PCI" block because it depends on
BLK_DEV_IDEDMA_PCI.

Signed-off-by: Al Viro <[EMAIL PROTECTED]>
Signed-off-by: Kou Ishizaki <[EMAIL PROTECTED]>
Signed-off-by: Akira Iguchi <[EMAIL PROTECTED]>
---
diff -Nrpu -X linux-2.6.21-rc3/Documentation/dontdiff linux-2.6.21-rc3/drivers/ide/Kconfig linux-2.6.21-rc3.mod/drivers/ide/Kconfig
--- linux-2.6.21-rc3/drivers/ide/Kconfig	2007-03-07 13:41:20.0 +0900
+++ linux-2.6.21-rc3.mod/drivers/ide/Kconfig	2007-03-15 23:49:33.0 +0900
@@ -769,6 +769,14 @@ config BLK_DEV_TC86C001
 	help
 	This driver adds support for Toshiba TC86C001 GOKU-S chip.
 
+config BLK_DEV_IDE_CELLEB
+	bool "Toshiba's Cell Reference Set IDE support"
+	depends on PPC_CELLEB && BLK_DEV_IDE=y
+	help
+	  This driver provides support for the built-in IDE controller on
+	  Toshiba Cell Reference Board.
+	  If unsure, say Y.
+
 endif
 
 config BLK_DEV_IDE_PMAC
@@ -800,14 +808,6 @@ config BLK_DEV_IDEDMA_PMAC
 	  to transfer data to and from memory. Saying Y is safe and improves
 	  performance.
 
-config BLK_DEV_IDE_CELLEB
-	bool "Toshiba's Cell Reference Set IDE support"
-	depends on PPC_CELLEB
-	help
-	  This driver provides support for the built-in IDE controller on
-	  Toshiba Cell Reference Board.
-	  If unsure, say Y.
-
 config BLK_DEV_IDE_SWARM
 	tristate "IDE for Sibyte evaluation boards"
 	depends on SIBYTE_SB1xxx_SOC
Re: [patch 2/5] fs: introduce new aops and infrastructure
On Thu, Mar 15, 2007 at 05:36:42AM +0100, Nick Piggin wrote:
> On Wed, Mar 14, 2007 at 09:13:29PM -0700, Mark Fasheh wrote:
> > Are we going to get rid of the file and intr arguments btw? I'm not sure
> > intr is useful, and mapping is probably enough to get whatever we inside
> > ->write_begin / ->write_end.
>
> Yeah, I was going to, but I had this version ready to go so decided
> to leave them in at the last minute. We can definitely take them out
> if people agree.

You're really going to need the file argument around. Some folks
care about file->private_data, etc. A good example is nfs_updatepage()
from nfs_commit_write(). There's a context on the filp. Mapping can get
back to the inode via ->host, but not to the struct file.

Joel

--
Life's Little Instruction Book #157

 "Take time to smell the roses."

Joel Becker
Principal Software Developer
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127
Re: [patch 2/5] fs: introduce new aops and infrastructure
On Thu, Mar 15, 2007 at 05:36:42AM +0100, Nick Piggin wrote: > > Are we going to get rid of the file and intr arguments btw? I'm not sure > > intr is useful, and mapping is probably enough to get whatever we inside > > ->write_begin / ->write_end. > > Yeah, I was going to, but I had this version ready to go so decided > to leave them in at the last minute. We can definitely take them out > if people agree. > > However a side note about intr -- I wonder if it might be wise to > include a flags argument, in case we might want to add something like > that later? (definitely if we do keep intr, then it should be done as > a flag rather than its own int). I don't see a problem with having a flags argument. It could give us some flexibility in the future which would otherwise require a much bigger update. If we found out that we needed intr, it could just be a flag. > > One interesting side effect is that we no longer pass AOP_TRUNCATE_PAGE up a > > level. This gives callers less to deal with. And it means that ocfs2 doesn't > > have to use the ocfs2_*_lock_with_page() cluster lock variants in > > ocfs2_block_write_begin() because it can order cluster locks outside of the > > page lock there. > > OK that's very cool. I was hoping that would be the case. If GFS2 can > avoid that too, then we might be able to get rid of AOP_TRUNCATE_PAGE > handling from the legacy prepare/commit_write paths, which will make > them simpler. Yeah - so long as we're not taking a page fault between write_begin / write_end, there's no reason for the cluster locks to be taken and dropped within the individual callbacks, which means we can just take them in write_begin (where page lock ordering is possible) and hold them until write_end is called. > OK, well I'll add this to my queue for now, and post the full patchset > after incorporating feedback I've had so far, and doing more testing, > so people can actually apply them and boot kernels. 
Great, thanks - it just occurred to me that I should be holding the cluster
locks across the entire copy (as I point out above), so I'll have a slightly
updated version of this patch for you soon :)
 --Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[EMAIL PROTECTED]
Re: [PATCH 10/13] BLK_DEV_IDE_CELLEB dependency fix
Al wrote:
>
>Eh... You still need dependency on IDE=y; otherwise you'll get configs
>with IDE=m, BLK_DEV_IDE_CELLEB=y and those won't link. BLK_DEV_IDEDMA_PCI
>is selectable just fine with IDE=m.
>
>It's the same problem as with ps3 fb.
>

I'm sorry I missed this case.

Using some configurations, I found BLK_DEV_IDE=y was better.
(I failed to link when IDE=y and BLK_DEV_IDE=m.)

Best regards,
Akira Iguchi
[PATCH REPOST] No need to use -traditional for processing asm in arch/i386/
No need to use -traditional for processing asm in arch/i386/ Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> --- arch/i386/boot/Makefile|4 ++-- arch/i386/boot/compressed/Makefile |1 - arch/i386/kernel/Makefile |2 -- arch/i386/kernel/entry.S |2 +- include/asm-i386/percpu.h |4 ++-- 5 files changed, 5 insertions(+), 8 deletions(-) === --- a/arch/i386/boot/Makefile +++ b/arch/i386/boot/Makefile @@ -36,9 +36,9 @@ HOSTCFLAGS_build.o := $(LINUXINCLUDE) # --- $(obj)/zImage: IMAGE_OFFSET := 0x1000 -$(obj)/zImage: EXTRA_AFLAGS := -traditional $(SVGA_MODE) $(RAMDISK) +$(obj)/zImage: EXTRA_AFLAGS := $(SVGA_MODE) $(RAMDISK) $(obj)/bzImage: IMAGE_OFFSET := 0x10 -$(obj)/bzImage: EXTRA_AFLAGS := -traditional $(SVGA_MODE) $(RAMDISK) -D__BIG_KERNEL__ +$(obj)/bzImage: EXTRA_AFLAGS := $(SVGA_MODE) $(RAMDISK) -D__BIG_KERNEL__ $(obj)/bzImage: BUILDFLAGS := -b quiet_cmd_image = BUILD $@ === --- a/arch/i386/boot/compressed/Makefile +++ b/arch/i386/boot/compressed/Makefile @@ -6,7 +6,6 @@ targets:= vmlinux vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \ vmlinux.bin.all vmlinux.relocs -EXTRA_AFLAGS := -traditional LDFLAGS_vmlinux := -T CFLAGS_misc.o += -fPIC === --- a/arch/i386/kernel/Makefile +++ b/arch/i386/kernel/Makefile @@ -44,8 +44,6 @@ obj-$(CONFIG_PARAVIRT)+= paravirt.o obj-$(CONFIG_PARAVIRT) += paravirt.o obj-y += pcspeaker.o -EXTRA_AFLAGS := -traditional - obj-$(CONFIG_SCx200) += scx200.o # vsyscall.o contains the vsyscall DSO images as __initdata. === --- a/arch/i386/kernel/entry.S +++ b/arch/i386/kernel/entry.S @@ -635,7 +635,7 @@ ENTRY(name) \ SAVE_ALL; \ TRACE_IRQS_OFF \ movl %esp,%eax; \ - call smp_/**/name; \ + call smp_##name;\ jmp ret_from_intr; \ CFI_ENDPROC;\ ENDPROC(name) === --- a/include/asm-i386/percpu.h +++ b/include/asm-i386/percpu.h @@ -20,10 +20,10 @@ #ifdef CONFIG_SMP #define PER_CPU(var, cpu) \ movl __per_cpu_offset(,cpu,4), cpu; \ - addl $per_cpu__/**/var, cpu; + addl $per_cpu__##var, cpu; #else /* ! 
SMP */ #define PER_CPU(var, cpu) \ - movl $per_cpu__/**/var, cpu; + movl $per_cpu__##var, cpu; #endif /* SMP */ #endif /* !__ASSEMBLY__ */
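A minimal sketch of why the patch above swaps `/**/` for `##`, assuming an ISO C preprocessor. With -traditional, cpp drops the empty comment and the two halves fuse into one identifier; without it, the comment becomes whitespace and `smp_/**/name` is left as two tokens, so the ISO `##` operator is needed to paste them. The handler names and the SMP_HANDLER/STR helpers below are stand-ins chosen for illustration, not kernel code.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for real interrupt handlers; the names are illustrative only. */
static int smp_reschedule_interrupt(void) { return 1; }
static int smp_call_function_interrupt(void) { return 2; }

/* ISO C token pasting: joins "smp_" and the argument into one identifier. */
#define SMP_HANDLER(name) smp_##name

/* Two-level stringification so the pasted identifier can be inspected. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Returns the identifier produced by the paste, as a string. */
static const char *pasted_name(void)
{
    return STR(SMP_HANDLER(reschedule_interrupt));
}
```

Calling `SMP_HANDLER(reschedule_interrupt)()` invokes `smp_reschedule_interrupt()`, which mirrors what `call smp_##name` does inside the entry.S macro once the file is preprocessed without -traditional.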
Re: 2.6.21-rc3-mm1
Hello,

> > Today after +- 24h of uptime I found some more page allocation
> > failures ('eth1: Can't allocate skb for Rx'). You'll find more here:
> >
> > http://tuxland.pl/misc/2.6.21-rc3-mm1-page-allocation-failure.txt
> >
> > System wasn't doing anything unusual, as usual ;-) X, some p2p
> > software, firefox+flash playing music.
> >
>
> Do other kernels do this, or is 2.6.21-rc3-mm1 worse?

I've never seen page allocation failures before 2.6.21-rc3-mm1 (first khubd
with the mouse thing now this).

> It is of course a non-fatal problem and will inevitably happen sometimes,
> but we would like the VM to be able to minimise the occurrence of this
> problem.

True. System runs as nothing happened. It just pops out from time to time.

> I think we were rather hoping that Mel's anti-fragmentation work would
> improve things.

Thanks,

 Mariusz Kozlowski
[PATCH] Clean up ELF note generation
Three cleanups: 1: ELF notes are never mapped, so there's no need to have any access flags in their phdr. 2: When generating them from asm, tell the assembler to use a SHT_NOTE section type. There doesn't seem to be a way to do this from C. 3: Use ANSI rather than traditional cpp behaviour to stringify the macro argument. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Eric W. Biederman <[EMAIL PROTECTED]> --- arch/i386/kernel/vmlinux.lds.S|2 +- include/asm-generic/vmlinux.lds.h |2 +- include/linux/elfnote.h |4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) === --- a/arch/i386/kernel/vmlinux.lds.S +++ b/arch/i386/kernel/vmlinux.lds.S @@ -34,7 +34,7 @@ PHDRS { PHDRS { text PT_LOAD FLAGS(5); /* R_E */ data PT_LOAD FLAGS(7); /* RWE */ - note PT_NOTE FLAGS(4); /* R__ */ + note PT_NOTE FLAGS(0); /* ___ */ } SECTIONS { === --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -208,7 +208,7 @@ } #define NOTES \ - .notes : { *(.note.*) } :note + .notes : { *(.note.*) } :note #define INITCALLS \ *(.initcall0.init) \ === --- a/include/linux/elfnote.h +++ b/include/linux/elfnote.h @@ -39,12 +39,12 @@ * ELFNOTE(XYZCo, 12, .long, 0xdeadbeef) */ #define ELFNOTE(name, type, desctype, descdata)\ -.pushsection .note.name; \ +.pushsection .note.name, "",@note ; \ .align 4 ; \ .long 2f - 1f/* namesz */; \ .long 4f - 3f/* descsz */; \ .long type ; \ -1:.asciz "name"; \ +1:.asciz #name ; \ 2:.align 4 ; \ 3:desctype descdata; \ 4:.align 4 ; \
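As a rough illustration of what the ELFNOTE() macro above emits, the following C sketch builds the same record layout in a byte buffer: three 32-bit words (namesz, descsz, type), then the NUL-terminated name and the descriptor, each padded to a 4-byte boundary. It also uses ISO `#` stringification in the same way the patch's `#name` does. The helper names (build_note, align4, STRINGIFY) are assumptions made for this example and do not appear in the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ISO C stringification, analogous to ".asciz #name" in the patch. */
#define STRINGIFY(x) #x

/* Round n up to the next multiple of 4, matching ".align 4". */
static size_t align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/*
 * Write one note record into buf in host byte order.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t build_note(uint8_t *buf, size_t cap, const char *name,
                         uint32_t type, const void *desc, uint32_t descsz)
{
    uint32_t namesz = (uint32_t)strlen(name) + 1;   /* includes the NUL */
    size_t total = 12 + align4(namesz) + align4(descsz);

    if (total > cap)
        return 0;
    memset(buf, 0, total);                          /* zero the padding */
    memcpy(buf, &namesz, 4);
    memcpy(buf + 4, &descsz, 4);
    memcpy(buf + 8, &type, 4);
    memcpy(buf + 12, name, namesz);
    memcpy(buf + 12 + align4(namesz), desc, descsz);
    return total;
}
```

For the example in the header comment, ELFNOTE(XYZCo, 12, .long, 0xdeadbeef), this layout comes to 12 bytes of header, 8 bytes for the padded name "XYZCo" and 4 bytes of descriptor.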
Re: [PATCH 10/13] BLK_DEV_IDE_CELLEB dependency fix
Al wrote:
>
>It's bool and it depends on IDE => should depend on IDE=y
>
>Signed-off-by: Al Viro <[EMAIL PROTECTED]>

Move to "if BLK_DEV_IDEDMA_PCI" block because it depends on
BLK_DEV_IDEDMA_PCI.

Signed-off-by: Kou Ishizaki <[EMAIL PROTECTED]>
Signed-off-by: Akira Iguchi <[EMAIL PROTECTED]>
---
diff -Nrpu -X linux-2.6.21-rc3/Documentation/dontdiff linux-2.6.21-rc3/drivers/ide/Kconfig linux-2.6.21-rc3.mod/drivers/ide/Kconfig
--- linux-2.6.21-rc3/drivers/ide/Kconfig	2007-03-07 13:41:20.0 +0900
+++ linux-2.6.21-rc3.mod/drivers/ide/Kconfig	2007-03-15 22:47:14.0 +0900
@@ -769,6 +769,14 @@ config BLK_DEV_TC86C001
 	help
 	This driver adds support for Toshiba TC86C001 GOKU-S chip.
 
+config BLK_DEV_IDE_CELLEB
+	bool "Toshiba's Cell Reference Set IDE support"
+	depends on PPC_CELLEB
+	help
+	  This driver provides support for the built-in IDE controller on
+	  Toshiba Cell Reference Board.
+	  If unsure, say Y.
+
 endif
 
 config BLK_DEV_IDE_PMAC
@@ -800,14 +808,6 @@ config BLK_DEV_IDEDMA_PMAC
 	  to transfer data to and from memory. Saying Y is safe and improves
 	  performance.
 
-config BLK_DEV_IDE_CELLEB
-	bool "Toshiba's Cell Reference Set IDE support"
-	depends on PPC_CELLEB
-	help
-	  This driver provides support for the built-in IDE controller on
-	  Toshiba Cell Reference Board.
-	  If unsure, say Y.
-
 config BLK_DEV_IDE_SWARM
 	tristate "IDE for Sibyte evaluation boards"
 	depends on SIBYTE_SB1xxx_SOC
Re: [PATCH/RFC] PCI prepare/activate instead of enable to avoid IRQ storm and rogue DMA access
[cc'ing Andi, Hi!] Hello, Russell King wrote: > On Wed, Mar 14, 2007 at 06:34:11PM -0400, Jeff Garzik wrote: >> Russell King wrote: >>> pci_enable_device() doesn't deal with this; in most PCI setups I've >>> seen, there is no control at PCI level over whether a device generates >>> an interrupt on the bus. Certainly the memory and io command enables >> PCI grew an interrupt enable while you weren't looking: >> PCI_COMMAND_INTX_DISABLE > > That's fine for devices which conform to the later PCI specs, but not > all do. > >> It was added in PCI 2.3 I think. > > Correct. > >> Older PCI devices certainly do not have this standardized bit. > > No PCI device that I have has that bit - including the raid card I > bought last year... Many recent ATA and network controllers do and most new ones will probably do. > In any case, relying on such a new control bit to implement this kind > of functionality would result in a very hit and miss result; Linux > tends to get used on things other than the bleeding edge of hardware > technology. I don't think INTX_DISABLE is on the bleeding edge of hardware technology and many common cases will benefit from using it (just think about the number of newish notebook users). The problem with INTX_DISABLE is that there doesn't seem to be any way to tell whether writing to that bit is safe or not. You are right in that turning off IRQ mechanisms in pci_enable_device() doesn't fix all the problems as PCI-wise it only enables IO and memory address space access, but to some extent it does because in the arch code, it enables the IRQ line and the physical IRQ line might not be shared even if the final IRQ number is shared (Andi, am I correct)? Anyways, I think the proper solution is to make sure all generic IRQ controls including INTX turned off early in the boot during PCI subsystem initialization (ie. 
do the disable part of pcim_prepare_device() early in the boot before any IRQ line is requested) and let each driver enable after initialization as necessary and do similar things during resume. Note that drivers still need to be modified to signify when the device is initialized enough to enable IRQ and bus mastering. We can also move the arch-dependent IRQ enabling to activation time. That will give us more protection even when INTX_DISABLE is not available.

Thanks.

--
tejun
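The two-step model discussed in this thread can be sketched in plain C against the PCI command register. This is a userspace model under stated assumptions, not the patch itself: the bit values are the standard PCI command register bits, while model_prepare() and model_activate() are hypothetical names, and the real body of pcim_prepare_device() is not shown in these messages.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI command register bits. */
#define PCI_COMMAND_IO            0x0001  /* respond to I/O space */
#define PCI_COMMAND_MEMORY        0x0002  /* respond to memory space */
#define PCI_COMMAND_MASTER        0x0004  /* allow bus mastering (DMA) */
#define PCI_COMMAND_INTX_DISABLE  0x0400  /* mask INTx (PCI 2.3 and later) */

/*
 * Step 1 (hypothetical): enable just enough to reset and initialise the
 * device. Register access is on, DMA and INTx stay off.
 */
static uint16_t model_prepare(uint16_t cmd)
{
    cmd |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_INTX_DISABLE;
    cmd &= (uint16_t)~PCI_COMMAND_MASTER;
    return cmd;
}

/*
 * Step 2 (hypothetical): the driver has quiesced the device and installed
 * its handler, so interrupts and bus mastering can be switched on.
 */
static uint16_t model_activate(uint16_t cmd)
{
    cmd |= PCI_COMMAND_MASTER;
    cmd &= (uint16_t)~PCI_COMMAND_INTX_DISABLE;
    return cmd;
}
```

As Russell King points out in the quoted text, devices older than PCI 2.3 do not implement the INTx disable bit, so on such hardware only the bus-master half of this model would have any effect.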
Re: [PATCH 1/1] Allow i386 crash kernels to handle x86_64 dumps
On Thu, Mar 15, 2007 at 02:07:56PM +0900, Horms wrote: > On Thu, Mar 15, 2007 at 10:25:36AM +0530, Vivek Goyal wrote: > > On Thu, Mar 15, 2007 at 10:46:38AM +0900, Horms wrote: > > > On Wed, Mar 14, 2007 at 05:00:09PM +, Ian Campbell wrote: > > > > The specific case I am encountering is kdump under Xen with a 64 bit > > > > hypervisor and 32 bit kernel/userspace. The dump created is a 64 bit due > > > > to the hypervisor but the dump kernel is 32 bit to match the domain 0 > > > > kernel. > > > > > > > > It's possibly less likely to be useful in a purely native scenario but I > > > > see no reason to disallow it. > > > > > > For native Linux, would this cover the case where the pre-crash kernel > > > is 64bit and the crashdump (post-crash) kernel is 32bit? > > > > > > > I think so. Though I have never tried this. > > > > > > Signed-off-by: Ian Campbell <[EMAIL PROTECTED]> > > > > > > > > --- pristine-linux-2.6.18/include/asm-i386/elf.h2006-09-20 > > > > 04:42:06.0 +0100 > > > > +++ linux-2.6.18-xen/include/asm-i386/elf.h 2007-03-14 > > > > 16:42:30.0 + > > > > @@ -36,7 +36,7 @@ > > > > * This is used to ensure we don't load something for the wrong > > > > architecture. > > > > */ > > > > #define elf_check_arch(x) \ > > > > - (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486)) > > > > + (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || > > > > ((x)->e_machine == EM_X86_64)) > > > > But I think changing this macro might run into issues. It is being used at > > few places in kernel, for example while loading module. This will > > essentially > > mean that we allow loading 64bit x86_64 modules on 32bit i386 systems? > > > > Similarly, load_elf_interp() is using it, again will we allow loading a > > interp written for X86_64 on a 32bit i386 machine? > > > > Should we create a separate macro something like elf_check_allowed_arch(), > > to take care of such corner cases? > > That sounds reasonable to me. 
Though perhaps it could just be
> kexec_elf_check_arch() for now, as I don't think there are any
> other consumers of it.

Kexec will also not allow loading an x86_64 kernel on a 32bit machine. So
how about something like vmcore_elf_allowed_cross_arch()? Vmcore code can
continue to check elf_check_arch() and if that fails it can invoke
vmcore_elf_allowed_cross_arch() to find out what cross arch are allowed
for vmcore.

Thanks
Vivek
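A small userspace sketch of the fallback Vivek describes, assuming the i386 elf_check_arch() stays unchanged and a separate helper whitelists cross-architecture dumps. Only the name vmcore_elf_allowed_cross_arch() comes from the message; its body, the struct elf_hdr_model type and the vmcore_elf_ok() wrapper are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ELF machine numbers as used by Linux. */
#define EM_386     3
#define EM_486     6
#define EM_X86_64  62

/* Reduced stand-in for the ELF header; only the field the check reads. */
struct elf_hdr_model {
    uint16_t e_machine;
};

/* The unmodified i386 check: still rejects x86_64 modules and interpreters. */
#define elf_check_arch(x) \
    (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))

/* Assumed body: an i386 crash kernel additionally accepts x86_64 dumps. */
static bool vmcore_elf_allowed_cross_arch(const struct elf_hdr_model *hdr)
{
    return hdr->e_machine == EM_X86_64;
}

/* What the vmcore code would do: native check first, then the cross check. */
static bool vmcore_elf_ok(const struct elf_hdr_model *hdr)
{
    return elf_check_arch(hdr) || vmcore_elf_allowed_cross_arch(hdr);
}
```

Keeping the cross-architecture whitelist out of elf_check_arch() means the module loader and load_elf_interp() continue to refuse 64-bit objects, which is the concern raised earlier in the thread.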
Re: [RFC][PATCH 4/7] RSS accounting hooks over the code
Nick Piggin wrote:
Kirill Korotaev wrote:

The approaches I have seen that don't have a struct page pointer, do intrusive things like try to put hooks everywhere throughout the kernel where a userspace task can cause an allocation (and of course end up missing many, so they aren't secure anyway)... and basically just nasty stuff that will never get merged.

User beancounters patch has got through all these... The approach where each charged object has a pointer to the owner container, who has charged it - is the most easy/clean way to handle all the problems with dynamic context change, races, etc. and 1 pointer in page struct is just 0.1% overhead.

The pointer in struct page approach is a decent one, which I have liked since this whole container effort came up. IIRC Linus and Alan also thought that was a reasonable way to go. I haven't reviewed the rest of the beancounters patch since looking at it quite a few months ago... I probably don't have time for a good review at the moment, but I should eventually.

This patch is not really beancounters.
1. It uses the containers framework
2. It is similar to my RSS controller (http://lkml.org/lkml/2007/2/26/8)
I would say that beancounters are changing and evolving.

Struct page overhead really isn't bad. Sure, nobody who doesn't use containers will want to turn it on, but unless you're using a big PAE system you're actually unlikely to notice.

big PAE doesn't make any difference IMHO (until struct pages are not created for non-present physical memory areas)

The issue is just that struct pages use low memory, which is a really scarce commodity on PAE. One more pointer in the struct page means 64MB less lowmem. But PAE is crap anyway. We've already made enough concessions in the kernel to support it.

I agree: struct page overhead is not really significant. The benefits of simplicity seem to outweigh the downside.
But again, I'll say the node-container approach of course does avoid this nicely (because we already can get the node from the page). So definitely that approach needs to be discredited before going with this one.

But it lacks some other features:
1. page can't be shared easily with another container

I think they could be shared. You allocate _new_ pages from your own node, but you can definitely use existing pages allocated to other nodes.

2. shared page can't be accounted honestly to containers as fraction=PAGE_SIZE/containers-using-it

Yes there would be some accounting differences. I think it is hard to say exactly what containers are "using" what page anyway, though. What do you say about unmapped pages? Kernel allocations? etc.

3. It doesn't help accounting of kernel memory structures. e.g. in OpenVZ we use exactly the same pointer on the page to track which container owns it, e.g. pages used for page tables are accounted this way.

? page_to_nid(page) ~= container that owns it.

4. I guess container destroy requires destroy of memory zone, which means write out of dirty data. Which doesn't sound good for me as well.

I haven't looked at any implementation, but I think it is fine for the zone to stay around.

5. memory reclamation in case of global memory shortage becomes a tricky/unfair task.

I don't understand why? You can much more easily target a specific container for reclaim with this approach than with others (because you have an lru per container).

Yes, but we break the global LRU. With these RSS patches, reclaim not triggered by containers still uses the global LRU, by using nodes, we would have lost the global LRU.

6. You cannot overcommit. AFAIU, the memory should be granted to node exclusive usage and cannot be used by other containers, even if it is unused. This is not an option for us.

I'm not sure about that. If you have a larger number of nodes, then you could assign more free nodes to a container on demand.
But I think there would definitely be less flexibility with nodes... I don't know... and seeing as I don't really know where the google guys are going with it, I won't misrepresent their work any further ;)

Everyone seems to have a plan ;) I don't read the containers list... does everyone still have *different* plans, or is any sort of consensus being reached?

hope we'll have it soon :)

Good luck ;)

I think we have made some forward progress on the consensus.

--
 Warm Regards,
 Balbir Singh
 Linux Technology Center
 IBM, ISTL
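The two overhead figures quoted in this thread ("0.1%" and "64MB less lowmem") can be checked with a few lines of arithmetic. The sketch below assumes 4 KB pages, 4-byte pointers and a fully populated 64 GB PAE machine; those inputs are assumptions made here, since the messages only give the results. The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of struct page growth across the whole machine when one pointer
 * is added to every struct page.
 */
static uint64_t extra_pointer_bytes(uint64_t ram_bytes, uint64_t page_size,
                                    uint64_t ptr_size)
{
    return ram_bytes / page_size * ptr_size;
}

/* Relative overhead of one pointer per page, in parts per million. */
static uint64_t overhead_ppm(uint64_t page_size, uint64_t ptr_size)
{
    return ptr_size * 1000000 / page_size;
}
```

Under those assumptions, 64 GB divided into 4 KB pages gives 16M struct pages, and one 4-byte pointer in each costs 64 MB, all of it in low memory. The per-page ratio is 4/4096, or roughly 0.1%, which matches the figure given for the pointer-in-struct-page approach.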
Re: MediaGX/GeodeGX1 requires X86_OOSTORE.
From: [EMAIL PROTECTED] (Lennart Sorensen)
Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.
Date: Tue, 20 Feb 2007 09:48:23 -0500

Hiroshi Miura posted a `Geode out-of-order store enables' patch in June 2003:
http://lkml.org/lkml/2003/6/5/57
OOSTORE was enabled at that point in time. It seems to have disappeared
somewhere since.

BTW, I use MediaGX with kernel 2.6.20 (and 2.6.20.3) and suspend2.
When I resume the PC and use the PC Card modem, the PC hangs.
However, the PC doesn't hang when I apply a WBINVD patch.
I can't tell whether the problem is in suspend2's resume, in the MediaGX,
or in both.
Many drivers lack support for resume on my PC.

> On Tue, Feb 20, 2007 at 08:34:13PM +0900, takada wrote:
> > I posted with 2.6.20 + enabled X86_OOSTORE.
> > The clflush size line is in /proc/cpuinfo. but clflush is not in flags line.
> >
> > BTW, can we use WBINVD instruction? I tested compile only.
> > Do you know a method to change dynamically without #ifdef when it works
> > with MediaGX/GeodeGX.
> >
> > diff -Narup a/include/asm-i386/io.h b/include/asm-i386/io.h
> > --- a/include/asm-i386/io.h	2007-02-20 16:23:25.0 +0900
> > +++ b/include/asm-i386/io.h	2007-02-20 17:07:14.0 +0900
> > @@ -232,7 +232,19 @@ static inline void memcpy_toio(volatile
> >   * 2. Accidentally out of order processors (PPro errata #51)
> >   */
> >
> > -#if defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
> > +#ifdef CONFIG_MGEODEGX1
> > +
> > +static inline void dma_flush_cache(void)
> > +{
> > +	__asm__ __volatile__ ("wbinvd": : :"memory");
> > +}
> > +
> > +#define dma_cache_inv(_start,_size)		dma_flush_cache()
> > +#define dma_cache_wback(_start,_size)		dma_flush_cache()
> > +#define dma_cache_wback_inv(_start,_size)	dma_flush_cache()
> > +#define flush_write_buffers()
> > +
> > +#elif defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
> >
> >  static inline void flush_write_buffers(void)
> >  {
>
> Well it is starting to look like it isn't a caching issue, but more
> likely an issue of which order writes are performed in. I think the MAC
> might be seeing the ownership bit change before the rest of the
> descriptor, which shouldn't happen. With X86_OOSTORE, wmb() is called
> between setting the fields in the descriptor and setting the ownership
> bit to the MAC. I still have to investigate a bit more to find out for
> sure, but that could certainly explain why X86_OOSTORE makes the problem
> become much less frequent. It doesn't completely eliminate it though.
> Of course maybe there are two different problems with the same symptoms.
>
> --
> Len Sorensen
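The ordering problem Len describes (the MAC seeing the ownership bit before the rest of the descriptor) can be illustrated with a userspace analogue. The sketch below uses a C11 release fence where a driver would call wmb(); it is only an analogy, because C11 fences order accesses between CPU threads and say nothing about what a DMA device observes. The descriptor layout, the DESC_OWN bit and the function name are hypothetical and not taken from any real driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ownership bit: set when the descriptor belongs to the device. */
#define DESC_OWN (1u << 31)

/* Hypothetical transmit descriptor shared with a consumer. */
struct tx_desc {
    uint32_t buf_addr;
    uint32_t length;
    _Atomic uint32_t status;
};

/*
 * Fill in the descriptor, then publish it. The fence keeps the field
 * writes from being reordered after the ownership write, which is the
 * job wmb() does in the driver when X86_OOSTORE is enabled.
 */
static void desc_hand_to_device(struct tx_desc *d, uint32_t addr, uint32_t len)
{
    d->buf_addr = addr;
    d->length = len;
    atomic_thread_fence(memory_order_release);     /* stand-in for wmb() */
    atomic_store_explicit(&d->status, DESC_OWN, memory_order_relaxed);
}
```

Without the barrier, a CPU that performs stores out of order could make the ownership bit visible first, and the consumer would then read a half-written descriptor. That would be consistent with the symptom becoming much rarer once X86_OOSTORE is set.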
Re: RSDL v0.30 cpu scheduler for mainline kernels
On Thursday 15 March 2007 13:31, Siddha, Suresh B wrote:
> Con,
>
> On Mon, Mar 12, 2007 at 10:58:11AM +1100, Con Kolivas wrote:
> > There are updated patches for 2.6.20, 2.6.20.2, 2.6.21-rc3 and
> > 2.6.21-rc3-mm2 to bring RSDL up to version 0.30 for download here:
>
> I tried this on a Core 2 Quad cpu system (system has 4 cores on a single
> package). When I run SPECjbb2000 with number of threads varying from 1-8,
> I see ~4.5% perf regression with RSDL (compared to native 2.6.21-rc3) in
> the 8 threads case. This I think, is coming from increased number of
> context switches, when we have more than one thread (at same user priority)
> on the same logical cpu.
>
> Just to see the % increase in number of context switches, I ran 8 infinite
> loops (simple while(1); 's) and with 2.6.21-rc3 I see ~70 context switches
> every second, whereas with RSDL I see ~530 context switches.

Thanks. If it's just that then scaling rr interval with cpus somewhat would
help. If you could, the following patch just to test might confirm that.

---
 kernel/sched.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.21-rc3-mm2/kernel/sched.c
===
--- linux-2.6.21-rc3-mm2.orig/kernel/sched.c	2007-03-15 17:03:17.0 +1100
+++ linux-2.6.21-rc3-mm2/kernel/sched.c	2007-03-15 17:03:30.0 +1100
@@ -104,7 +104,7 @@ unsigned long long __attribute__((weak))
  * This is the time all tasks within the same priority round robin.
  * Set to a minimum of 6ms.
  */
-#define RR_INTERVAL		((6 * HZ / 1001) + 1)
+#define RR_INTERVAL		((12 * HZ / 1001) + 1)
 #define DEF_TIMESLICE		(RR_INTERVAL * 20)
 
 #ifdef CONFIG_SMP

--
-ck
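To see what the test patch changes in practice, the RR_INTERVAL arithmetic can be worked through for a few tick rates. The sketch below uses the same integer expression as the macro in the patch; rr_interval_ticks() is a hypothetical helper written for illustration, and the HZ values are just common configurations, not ones reported in this thread.

```c
#include <assert.h>

/*
 * Same integer arithmetic as the RR_INTERVAL macro in the patch,
 * with the millisecond target and HZ passed in as parameters.
 * The result is a number of timer ticks, not milliseconds.
 */
static unsigned int rr_interval_ticks(unsigned int ms, unsigned int hz)
{
	assert(hz > 0);
	return (ms * hz / 1001) + 1;
}
```

With the original 6ms target this gives 1 tick at HZ=100, 2 ticks at HZ=250 and 6 ticks at HZ=1000; with the 12ms test value it gives 2, 3 and 12 ticks respectively. Because of the integer division, the round-robin interval only doubles exactly at HZ=100 and HZ=1000.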
Re: [PATCH 10/13] BLK_DEV_IDE_CELLEB dependency fix
On Thu, Mar 15, 2007 at 02:25:40PM +0900, Akira Iguchi wrote:
> Al wrote:
> >
> >It's bool and it depends on IDE => should depend on IDE=y
> >
> >Signed-off-by: Al Viro <[EMAIL PROTECTED]>
>
> Move to "if BLK_DEV_IDEDMA_PCI" block because it depends on
> BLK_DEV_IDEDMA_PCI.

> +config BLK_DEV_IDE_CELLEB
> +	bool "Toshiba's Cell Reference Set IDE support"
> +	depends on PPC_CELLEB
> +	help
> +	  This driver provides support for the built-in IDE controller on
> +	  Toshiba Cell Reference Board.
> +	  If unsure, say Y.
> +

Eh... You still need dependency on IDE=y; otherwise you'll get configs
with IDE=m, BLK_DEV_IDE_CELLEB=y and those won't link. BLK_DEV_IDEDMA_PCI
is selectable just fine with IDE=m.

It's the same problem as with ps3 fb.
Re: kref refcounting breakage in mainline
On Sat, Mar 10, 2007 at 04:44:06PM +0100, Mike Galbraith wrote:
> On Wed, 2007-03-07 at 06:39 +0100, Mike Galbraith wrote:
> > On Tue, 2007-03-06 at 13:04 -0800, Greg KH wrote:
> > > On Tue, Mar 06, 2007 at 06:43:22AM +0100, Mike Galbraith wrote:
> > > > On Mon, 2007-03-05 at 16:25 -0800, Greg KH wrote:
> > > > >
> > > > > Mike, I've reverted this patch, and I don't see any references
> > > > > leaking.
> > > > > And, as your patch released the reference on the driver, and the
> > > > > module_add_driver() call would not grab a reference to the driver,
> > > > > only the module kobject, I don't see what you were trying to fix
> > > > > with this patch.
> > > > >
> > > > > Do you have a test case that this fixes?
> > > >
> > > > What it fixed for me was the hard hang reported below.
> > > >
> > > > http://lkml.org/lkml/2007/2/16/96
> > >
> > > What specific module are you trying to unload that causes the hang? I
> > > think it might just be a problem with that module, and not with all
> > > others.
> >
> > It's ipmi_si that's hanging, waits for completion that never comes.
> >
> > > So, I'm going to revert your patch and work to try to find the real
> > > cause of this problem.
> >
> > Yeah, my stab at it seems busted. I'll take another poke at it to see
> > if I can find out why (post 725522b5453dd680412f2b6463a988e4fd148757)
> > I'm left with a reference.
>
> Ok, stab #2.
>
> My reference count woes stem from module_remove_driver() not removing
> the link created in module_add_driver(). With the below, my box boots
> fine. Since I obviously know spit about driver layer glue, I'll just
> call this one a diagnostic, and head for the hills :)

Does ipmi_si not have an "owner"? Ah, that makes sense, not all modules
do...

> --- linux-2.6.20-rc3/kernel/module.c.org	2007-03-10 15:16:47.0 +0100
> +++ linux-2.6.20-rc3/kernel/module.c	2007-03-10 15:43:09.0 +0100
> @@ -2411,14 +2411,28 @@ void module_remove_driver(struct device_
>  		return;
>  
>  	sysfs_remove_link(&drv->kobj, "module");
> -	if (drv->owner && drv->owner->mkobj.drivers_dir) {
> -		driver_name = make_driver_name(drv);
> -		if (driver_name) {
> -			sysfs_remove_link(drv->owner->mkobj.drivers_dir,
> +	driver_name = make_driver_name(drv);
> +	if (!driver_name)
> +		return;
> +	if (drv->owner && drv->owner->mkobj.drivers_dir)
> +		sysfs_remove_link(drv->owner->mkobj.drivers_dir,
>  					   driver_name);
> -			kfree(driver_name);
> -		}
> +	else if (drv->mod_name) {
> +		struct module_kobject *mk;
> +		struct kobject *mkobj;
> +
> +		/* Lookup built-in module entry in /sys/modules */
> +		mkobj = kset_find_obj(&module_subsys.kset, drv->mod_name);
> +		if (!mkobj)
> +			goto out_free;
> +		mk = container_of(mkobj, struct module_kobject, kobj);
> +		module_create_drivers_dir(mk);
> +		sysfs_remove_link(mk->drivers_dir, driver_name);
> +		/* Release reference taken via lookup */
> +		kobject_put(mkobj);
>  	}
> +out_free:
> +	kfree(driver_name);
>  }
>  EXPORT_SYMBOL(module_remove_driver);
>  #endif

That's pretty good for not knowing much about the subject matter here.
But can you try this version instead? It should work a bit better than
yours.

thanks for your patience,

greg k-h

Subject: modules: fix reference counting logic for drivers without module pointers.

We weren't dropping the sysfs link for the module driver name if we
didn't happen to have the "owner" pointer in the driver.

Based on a patch from Mike Galbraith <[EMAIL PROTECTED]>

Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 kernel/module.c |   24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2405,20 +2405,30 @@ EXPORT_SYMBOL(module_add_driver);
 
 void module_remove_driver(struct device_driver *drv)
 {
+	struct module_kobject *mk = NULL;
+	struct kobject *mkobj = NULL;
 	char *driver_name;
 
 	if (!drv)
 		return;
 
 	sysfs_remove_link(&drv->kobj, "module");
-	if (drv->owner && drv->owner->mkobj.drivers_dir) {
-		driver_name = make_driver_name(drv);
-		if (driver_name) {
-			sysfs_remove_link(drv->owner->mkobj.drivers_dir,
-					  driver_name);
-			kfree(driver_name);
-		}
+	driver_name = make_driver_name(drv);
+	if (!driver_name)
+		return;
+
+	if (drv->owner && drv->owner->mkobj.drivers_dir)
+		mk = &drv->owner->mkobj;
+	else {
+		/* Lookup built-in module entry in /sys/modules */
+		mkobj = kset_find_obj(&module_subsys.kset, drv->mod_name);
+
[OT] Re: New thread RDSL, post-2.6.20 kernels and amanda (tar) miss-fires
On Wed, Mar 14, 2007 at 11:12:48PM -0400, Gene Heskett wrote: > On Wednesday 14 March 2007, Ray Lee wrote: > >On 3/13/07, Gene Heskett <[EMAIL PROTECTED]> wrote: > >> On Tuesday 13 March 2007, Gene Heskett wrote: > >> >On Tuesday 13 March 2007, Gene Heskett wrote: > >> >>Greetings; > >> >>Someone suggested a fresh thread for this. > >> >> > >> >>I now have my scripts more or less under control, and I can report > >> >> that kernel-2.6.20.1 with no other patches does not exhibit the > >> >> undesirable behaviour where tar thinks its all new, even when told > >> >> to do a level 2 on a directory tree that hasn't been touched in > >> >> months to update anything. > >> >> > >> >>Next up, 2.6.20.2, plain and with the latest RDSL-0.30 patch. > >> > > >> >And amanda/tar worked normally for 2.6.20.2 plain. > >> > > >> >Next up, 2.6.21-rc1 if it will build here. > >> > >> It built, it booted, and its busted big time. First, with an amdump > >> running in the background, the machine is so close to unusable that I > >> considered rebooting, but I needed the data to show the problem. I am > >> losing the keyboard and mouse for a minute or more at a time but the > >> keystrokes seem to be being registered so it eventually catches up. > >> > >> Disk i/o seems to be the killer according to gkrellm. > >> > >> But to give one an idea of the fits this is giving tar, I'll snip a > >> line or 2 from an amstatus report here: > >> coyote:/GenesAmandaHelper-0.6 1 planner: [dumps way too big, 138200 > >> KB, must skip incremental dumps] > >> > >> Huh? 138.2GB? A 'du -h .' in that dir says 766megs. > >> > >> coyote:/root 1 4426m wait for dumping > >> du -h says 5.0GB so that's ballpark, but its also a level 1, so maybe > >> 20 megs is actually new since 15:57 this afternoon local. kmails > >> final maildir is in that dir. > >> > >> This goes on for much of the amstatus report, very few of the reported > >> sizes are close to sane. 
> >> > >> Now, can someone suggest a patch I can revert that might fix this? > >> The total number of patches between 2.6.20 and 2.6.21-rc1 will have me > >> building kernels to bisect this till the middle of June at this rate. > > > >In a previous email, you said you were using ext3. If that's the case, > >there doesn't appear to be much going on in terms of patches between > >2.6.20 and 2.6.21-rc1. The only one that even comes close to looking > >like it might have an effect would only come in to play if you have a > >filesystem that has ACL information, but is mounted by a kernel that > >doesn't have ACL support. > > > >I have to echo wli here, I'm afraid, and recommend at least a *few* > >bisections to help narrow down the list of suspect patches. > > > >There are tutorials out there for git users. I use the mercurial > >repository, as I find the mercurial interface and workflow a lot more > >intuitive, but it has the same capability. > > > >Even 2-5 bisections will greatly help others hunt the bug down. > > > >Ray > > Probably. But I've now put a week into this, and from some other clues > I've collected, I'm beginning to think tar has a tummy ache. After all, > and ls -lc reports totally sane mtimes. So why is tar going bonkers > under kernels 2.6.21-rc*, with or without Cons patches? > > I've also spent a day now looking for a valid place to put a bugzilla > entry against tar, but googles search results are sending me to > gcc.gnu.org and this is NOT the correct bugzilla for a tar problem. > > Its no secret that with all the churn in tar over the last 5 years, worse > churn than the kernel IMO in going from 2.0 to 2.6, that I'm not a fan of > yet another _new_ version of tar, when what we just need is _one_ that > works. It is not capable of executing the recovery command listed in the > first block of every amdump file it (amdump) ever built right now, and > I've played the equ of the 10,000 monkeys writing Shakespear for several > hours trying. 
Damned frustrating is what it is. > > The error it reports seems to indicate that it cannot write through the > pipes involved. But with tar's error reporting, who the hell knows for > sure. > > Here is an example > [EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k count=1 > AMANDA: FILE 20070314104344 coyote /lib lev 1 comp .gz program /bin/tar > To restore, position tape at start of file and run: > dd if= bs=32k skip=1 | /bin/gzip -dc | /bin/tar -f - ... > > And the elipsis is an error if not removed. Then one is supposed to be > able to redirect tars output with the usual >/tmp/test/ syntax > > So: > [EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k > skip=1 | /bin/gzip -dc | /bin/tar -f - >/tmp/test/ > -bash: /tmp/test/: Is a directory > > which is the return from any variation in how the redirect is done. > > So what is it that am I doing wrong in the above command line?, so I can > add it to my helper scripts to be published eventually on zmanda.org. with "/bin/tar -f
[PATCH take3 00/20] Make common x86 arch area for i386 and x86_64 - Take 3
Once again, here's an attempt to put the shared files of x86_64 and i386 into a separate directory.

This time, I took pains to make sure that each patch in this series compiles after it is applied. I did this on both x86_64 and i386, with the affected files' config options turned on.

I still stayed away from the pci shared code.

This time I moved speedstep-lib.h into include/asm-x86. Although all references to this file now need to explicitly state #include But this will also create a doorway for other shared headers to go into.

And yes, the long term goal is to perhaps make a single arch that can handle both the i386 modern CPUs as well as the x86_64 code. And then phase out the x86_64, keeping the current i386 for legacy hardware.

Used git-diff -M for the diffs, so the renames are explicitly stated as such, but no delete/create diff is made (so patch and quilt will not apply these).

Comments and flames welcome.

-- Steve
[PATCH take3 12/20] mtrr directory switch
Move the mtrr directory over to the common area. Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]> diff --git a/arch/i386/kernel/cpu/Makefile b/arch/i386/kernel/cpu/Makefile index 010aecf..f8eaef8 100644 --- a/arch/i386/kernel/cpu/Makefile +++ b/arch/i386/kernel/cpu/Makefile @@ -15,5 +15,4 @@ obj-y += umc.o obj-$(CONFIG_X86_MCE) += mcheck/ -obj-$(CONFIG_MTRR) += mtrr/ obj-$(CONFIG_CPU_FREQ) += cpufreq/ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 3e15c9e..c1a2b58 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -1,5 +1,7 @@ obj-y += bootflag.o quirks.o i8237.o topology.o alternative.o +obj-y += cpu/ + obj-$(CONFIG_X86_MSR) += msr.o obj-$(CONFIG_X86_CPUID)+= cpuid.o obj-$(CONFIG_MICROCODE)+= microcode.o diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile new file mode 100644 index 000..3e59ae7 --- /dev/null +++ b/arch/x86/kernel/cpu/Makefile @@ -0,0 +1,2 @@ + +obj-$(CONFIG_MTRR) += mtrr/ diff --git a/arch/i386/kernel/cpu/mtrr/Makefile b/arch/x86/kernel/cpu/mtrr/Makefile similarity index 100% rename from arch/i386/kernel/cpu/mtrr/Makefile rename to arch/x86/kernel/cpu/mtrr/Makefile diff --git a/arch/i386/kernel/cpu/mtrr/amd.c b/arch/x86/kernel/cpu/mtrr/amd.c similarity index 100% rename from arch/i386/kernel/cpu/mtrr/amd.c rename to arch/x86/kernel/cpu/mtrr/amd.c diff --git a/arch/i386/kernel/cpu/mtrr/centaur.c b/arch/x86/kernel/cpu/mtrr/centaur.c similarity index 100% rename from arch/i386/kernel/cpu/mtrr/centaur.c rename to arch/x86/kernel/cpu/mtrr/centaur.c diff --git a/arch/i386/kernel/cpu/mtrr/cyrix.c b/arch/x86/kernel/cpu/mtrr/cyrix.c similarity index 100% rename from arch/i386/kernel/cpu/mtrr/cyrix.c rename to arch/x86/kernel/cpu/mtrr/cyrix.c diff --git a/arch/i386/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c similarity index 100% rename from arch/i386/kernel/cpu/mtrr/generic.c rename to arch/x86/kernel/cpu/mtrr/generic.c diff --git a/arch/i386/kernel/cpu/mtrr/if.c 
b/arch/x86/kernel/cpu/mtrr/if.c similarity index 100% rename from arch/i386/kernel/cpu/mtrr/if.c rename to arch/x86/kernel/cpu/mtrr/if.c diff --git a/arch/i386/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c similarity index 100% rename from arch/i386/kernel/cpu/mtrr/main.c rename to arch/x86/kernel/cpu/mtrr/main.c diff --git a/arch/i386/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h similarity index 100% rename from arch/i386/kernel/cpu/mtrr/mtrr.h rename to arch/x86/kernel/cpu/mtrr/mtrr.h diff --git a/arch/i386/kernel/cpu/mtrr/state.c b/arch/x86/kernel/cpu/mtrr/state.c similarity index 100% rename from arch/i386/kernel/cpu/mtrr/state.c rename to arch/x86/kernel/cpu/mtrr/state.c diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile index 3fae694..60918ad 100644 --- a/arch/x86_64/kernel/Makefile +++ b/arch/x86_64/kernel/Makefile @@ -14,7 +14,6 @@ obj-$(CONFIG_STACKTRACE) += stacktrace.o obj-$(CONFIG_X86_MCE) += mce.o therm_throt.o obj-$(CONFIG_X86_MCE_INTEL)+= mce_intel.o obj-$(CONFIG_X86_MCE_AMD) += mce_amd.o -obj-$(CONFIG_MTRR) += ../../i386/kernel/cpu/mtrr/ obj-$(CONFIG_ACPI) += acpi/ obj-$(CONFIG_SMP) += smp.o smpboot.o trampoline.o obj-y += apic.o nmi.o -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm/filemap.c: unconditionally call mark_page_accessed
On Wed, 14 Mar 2007, Xiaoning Ding wrote:

Dave Kleikamp wrote:

On Wed, 2007-03-14 at 22:33 +0100, Andreas Mohr wrote:

Hi,

On Wed, Mar 14, 2007 at 03:55:41PM -0500, Dave Kleikamp wrote:

On Wed, 2007-03-14 at 15:58 -0400, Ashif Harji wrote:

This patch unconditionally calls mark_page_accessed to prevent pages, especially for small files, from being evicted from the page cache despite frequent access.

I guess the downside to this is if a reader is reading a large file, or several files, sequentially with a small read size (smaller than PAGE_SIZE), the pages will be marked active after just one read pass. My gut says the benefits of this patch outweigh the cost. I would expect real-world backup apps, etc. to read at least PAGE_SIZE.

I also think that the patch is somewhat problematic, since the original intention seems to have been a reduction of the number of (expensive?) mark_page_accessed() calls,

mark_page_accessed() isn't expensive. If called repeatedly, starting with the third call, it will check two page flags and return. The only real expense is that the page appears busier than it may be and will be retained in memory longer than it should.

If we allow mark_page_accessed() to be called multiple times for a single page, a scan of a large file with small-size reads would flush the buffer cache. mark_page_accessed() also requests lru_lock when moving a page from the inactive_list to the active_list. It may also increase lock contention.

The problem with the existing logic is that it is too coarse. In trying to deal with one usage pattern it is negatively impacting performance for other reasonable access patterns. Further, consider the extreme case of scanning a file 1 byte at a time. In this case, you are going to access a page over 4000 times, but that page is not going to be marked as active and hence that page is likely to be evicted from the cache. Clearly, there are cases when scanning a file that you would like the pages to be kept in the cache.

Finally, the existing code is problematic as there is no reasonable way to circumvent the negative impact for small files. Hence, I think a change is necessary. The question is whether the intent of conditionally calling mark_page_accessed() is still reasonable and whether the amount of bookkeeping required to detect that usage pattern but not create a problem for other usage patterns is reasonable.

I would tend to agree with David that: "Any application doing many tiny-sized reads isn't exactly asking for great performance." As well, applications concerned with performance and caching problems can read in a file in PAGE_SIZE chunks.

I still think the simple fix of removing the condition is the best approach, but I'm certainly open to alternatives.

ashif.
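The two behaviours being argued over can be compared with a small userspace model. This is a toy sketch and not the kernel code: struct toy_page, mark_accessed() and scan_one_page() are hypothetical names, the two-flag promotion only approximates what the thread says about mark_page_accessed(), and the "mark only when the page index changes" condition is my reading of the existing logic under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096u	/* assumed page size */

/* Toy stand-in for the two page flags discussed above. */
struct toy_page {
	bool referenced;
	bool active;
	unsigned int marks;	/* how often mark_accessed() ran */
};

/* Two-touch promotion: the first call sets referenced, the second activates. */
static void mark_accessed(struct toy_page *p)
{
	p->marks++;
	if (!p->active && p->referenced) {
		p->active = true;
		p->referenced = false;
	} else if (!p->referenced) {
		p->referenced = true;
	}
}

/*
 * Read one page in chunks of read_size bytes.  With conditional == true
 * the page is marked only when the page index differs from that of the
 * previous read; otherwise every read marks it (the proposed change).
 */
static void scan_one_page(struct toy_page *p, size_t read_size, bool conditional)
{
	size_t prev_index = (size_t)-1;
	size_t pos;

	assert(read_size > 0);
	for (pos = 0; pos < PAGE_SIZE; pos += read_size) {
		size_t index = pos / PAGE_SIZE;

		if (!conditional || prev_index != index)
			mark_accessed(p);
		prev_index = index;
	}
}
```

In this model a 1-byte-at-a-time scan marks the page once under the conditional rule, so it stays inactive, and 4096 times under the unconditional rule, so it is promoted after the second read. That is both the benefit Ashif wants for small files and the cache-flushing risk Xiaoning describes for large sequential scans.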
Re: [PATCH 1/1] Allow i386 crash kernels to handle x86_64 dumps
On Thu, Mar 15, 2007 at 10:25:36AM +0530, Vivek Goyal wrote:
> On Thu, Mar 15, 2007 at 10:46:38AM +0900, Horms wrote:
> > On Wed, Mar 14, 2007 at 05:00:09PM +, Ian Campbell wrote:
> > > The specific case I am encountering is kdump under Xen with a 64 bit
> > > hypervisor and 32 bit kernel/userspace. The dump created is a 64 bit due
> > > to the hypervisor but the dump kernel is 32 bit to match the domain 0
> > > kernel.
> > >
> > > It's possibly less likely to be useful in a purely native scenario but I
> > > see no reason to disallow it.
> >
> > For native Linux, would this cover the case where the pre-crash kernel
> > is 64bit and the crashdump (post-crash) kernel is 32bit?
> >
>
> I think so. Though I have never tried this.
>
> > > Signed-off-by: Ian Campbell <[EMAIL PROTECTED]>
> > >
> > > --- pristine-linux-2.6.18/include/asm-i386/elf.h	2006-09-20 04:42:06.0 +0100
> > > +++ linux-2.6.18-xen/include/asm-i386/elf.h	2007-03-14 16:42:30.0 +
> > > @@ -36,7 +36,7 @@
> > >   * This is used to ensure we don't load something for the wrong architecture.
> > >   */
> > >  #define elf_check_arch(x) \
> > > -	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
> > > +	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || ((x)->e_machine == EM_X86_64))
>
> But I think changing this macro might run into issues. It is being used in
> a few places in the kernel, for example while loading a module. This will
> essentially mean that we allow loading 64bit x86_64 modules on 32bit i386
> systems?
>
> Similarly, load_elf_interp() is using it, again will we allow loading an
> interp written for X86_64 on a 32bit i386 machine?
>
> Should we create a separate macro something like elf_check_allowed_arch(),
> to take care of such corner cases?

That sounds reasonable to me. Though perhaps it could just be
kexec_elf_check_arch() for now, as I don't think there are any
other consumers of it.

--
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/
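The split suggested above, keeping elf_check_arch() strict and adding a relaxed check used only by the crash-dump path, might look roughly like the following. It is a userspace sketch under stated assumptions: struct toy_ehdr is a hypothetical stand-in for the ELF header, the EM_* values are defined locally so the block is self-contained, and the checks are written as functions rather than macros for readability. Only the name kexec_elf_check_arch() comes from the thread; its body here is a guess at the intent, not a posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ELF machine numbers, defined locally so the sketch is self-contained. */
#define EM_386     3
#define EM_486     6
#define EM_X86_64 62

/* Only the header field this sketch needs; a hypothetical stand-in. */
struct toy_ehdr {
	uint16_t e_machine;
};

/* Strict check, as in the unpatched macro: 32-bit x86 objects only. */
static bool elf_check_arch(const struct toy_ehdr *x)
{
	assert(x != NULL);
	return x->e_machine == EM_386 || x->e_machine == EM_486;
}

/* Relaxed check for the crash-dump path only: also accepts x86_64. */
static bool kexec_elf_check_arch(const struct toy_ehdr *x)
{
	return elf_check_arch(x) || x->e_machine == EM_X86_64;
}
```

Module loading and load_elf_interp() would keep calling the strict check, so a 64-bit module or interpreter is still rejected on a 32-bit kernel, while the dump reader alone accepts a 64-bit core.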
Re: Stolen and degraded time and schedulers
Jeremy Fitzhardinge writes:

> Sure. But on a given machine, the CPUs are likely to be closely enough
> matched that a cycle on one CPU is more or less equivalent to a cycle on
> another CPU. The fact that a cycle represents a different amount of

A cycle on one thread of a machine with SMT/hyperthreading when the
other thread is idle *isn't* equivalent to a cycle when the other
thread is busy.

We run into this on POWER5, where we have hardware that counts cycles
when each of the two threads in each core gets to dispatch
instructions (on each cycle, one thread or the other gets to
dispatch). That helps but still doesn't give a totally accurate
estimate of how much computation a given process has managed to do.

> Not at all. You might have an unimportant but cpu-bound process which
> doesn't merit increasing the cpu speed, but should also be scheduled
> properly compared to other processes. I often nice my kernel builds
> (which cpufreq takes as a hint to not ramp up the cpu speed) on my
> laptop so as to save power.

Just as a side note - that's probably actually a bad strategy; you
almost certainly consume less total energy by running the cpu at full
speed until the build is done and then going to the deepest sleep mode
you can achieve.

> That's true. But this is a case of the left brain not talking to the
> right brain: cpufreq might decide to slow a cpu down, but the scheduler
> doesn't take that into account. Making the timebase of sched_clock
> reflect the current cpu speed (or more specifically, the integral of the
> cpu speed over a time interval) is a good way of communicating between
> the two subsystems.

What was the original proposal? I came into this discussion late...

Paul.
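The "integral of the cpu speed over a time interval" that Jeremy mentions can be illustrated with a few lines of arithmetic. This is a sketch only: struct freq_interval and scaled_time_ns() are hypothetical names, the model assumes the frequency is constant within each interval, and it ignores the SMT effect Paul raises, where a cycle is worth less when the sibling thread is busy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One interval during which the CPU ran at a constant frequency. */
struct freq_interval {
	uint64_t duration_ns;	/* wall-clock length of the interval */
	uint32_t freq_khz;	/* CPU frequency during the interval */
};

/*
 * Sum freq * dt over the intervals and express the result as
 * nanoseconds at max_khz, i.e. time scaled by relative CPU speed.
 */
static uint64_t scaled_time_ns(const struct freq_interval *iv, size_t n,
			       uint32_t max_khz)
{
	uint64_t work = 0;	/* accumulated in kHz * ns */
	size_t i;

	assert(max_khz > 0);
	for (i = 0; i < n; i++)
		work += iv[i].duration_ns * iv[i].freq_khz;
	return work / max_khz;
}
```

For example, 1ms at 2GHz followed by 1ms at 1GHz on a CPU whose maximum is 2GHz counts as 1.5ms of full-speed time, so a task that ran only during the slow interval would be charged half as much as one that ran during the fast one.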
[PATCH take3 02/20] tsc_sync.c switch
Move tsc_sync.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index a57040d..c8fe439 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -18,7 +18,7 @@ obj-$(CONFIG_X86_MSR)		+= msr.o
 obj-$(CONFIG_X86_CPUID)	+= cpuid.o
 obj-$(CONFIG_MICROCODE)	+= microcode.o
 obj-$(CONFIG_APM)		+= apm.o
-obj-$(CONFIG_X86_SMP)		+= smp.o smpboot.o tsc_sync.o
+obj-$(CONFIG_X86_SMP)		+= smp.o smpboot.o
 obj-$(CONFIG_X86_TRAMPOLINE)	+= trampoline.o
 obj-$(CONFIG_X86_MPPARSE)	+= mpparse.o
 obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o nmi.o
diff --git a/arch/i386/kernel/tsc_sync.c b/arch/i386/kernel/tsc_sync.c
deleted file mode 100644
index 1242462..000
--- a/arch/i386/kernel/tsc_sync.c
+++ /dev/null
@@ -1 +0,0 @@
-#include "../../x86_64/kernel/tsc_sync.c"
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 55f268f..bd548e6 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,2 +1,7 @@
 
 obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
+
+# i386 defines CONFIG_X86_SMP when CONFIG_SMP and !CONFIG_X86_VOYAGER
+ifeq ($(CONFIG_X86_VOYAGER), )
+obj-$(CONFIG_SMP)		+= tsc_sync.o
+endif
diff --git a/arch/x86_64/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
similarity index 100%
rename from arch/x86_64/kernel/tsc_sync.c
rename to arch/x86/kernel/tsc_sync.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 8b2535c..54fe500 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -19,7 +19,7 @@ obj-$(CONFIG_ACPI)		+= acpi/
 obj-$(CONFIG_X86_MSR)		+= msr.o
 obj-$(CONFIG_MICROCODE)	+= microcode.o
 obj-$(CONFIG_X86_CPUID)	+= cpuid.o
-obj-$(CONFIG_SMP)		+= smp.o smpboot.o trampoline.o tsc_sync.o
+obj-$(CONFIG_SMP)		+= smp.o smpboot.o trampoline.o
 obj-y				+= apic.o nmi.o
 obj-y				+= io_apic.o mpparse.o \
 		genapic.o genapic_cluster.o genapic_flat.o
--
[PATCH take3 15/20] cpufreq files switched
Moved the shared files that were in the arch/i386/kernel/cpu/cpufreq to the common area. Since the speedstep-lib.h file was used by files that were moved as well as files that were not moved, a new directory was created to hold this shared header, called include/asm-x86. Since this directory is not full featured yet (no x86 arch fully defined) all references to this file must be of #include But this allows for a stepping stone approach to a generic x86 arch and a place to put more asm-x86 headers. The Kconfig for cpufreq in the x86_64 arch directory is not moved to simplify this patch. Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]> Cc: Chris Wright <[EMAIL PROTECTED]> diff --git a/arch/i386/kernel/cpu/cpufreq/Makefile b/arch/i386/kernel/cpu/cpufreq/Makefile index 560f776..49c4ca4 100644 --- a/arch/i386/kernel/cpu/cpufreq/Makefile +++ b/arch/i386/kernel/cpu/cpufreq/Makefile @@ -1,6 +1,6 @@ +# See also arch/x86/kernel/cpu/cpufreq/Makefile obj-$(CONFIG_X86_POWERNOW_K6) += powernow-k6.o obj-$(CONFIG_X86_POWERNOW_K7) += powernow-k7.o -obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o obj-$(CONFIG_X86_LONGHAUL) += longhaul.o obj-$(CONFIG_X86_E_POWERSAVER) += e_powersaver.o obj-$(CONFIG_ELAN_CPUFREQ) += elanfreq.o @@ -8,9 +8,5 @@ obj-$(CONFIG_SC520_CPUFREQ) += sc520_freq.o obj-$(CONFIG_X86_LONGRUN) += longrun.o obj-$(CONFIG_X86_GX_SUSPMOD) += gx-suspmod.o obj-$(CONFIG_X86_SPEEDSTEP_ICH)+= speedstep-ich.o -obj-$(CONFIG_X86_SPEEDSTEP_LIB)+= speedstep-lib.o obj-$(CONFIG_X86_SPEEDSTEP_SMI)+= speedstep-smi.o -obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o -obj-$(CONFIG_X86_SPEEDSTEP_CENTRINO) += speedstep-centrino.o -obj-$(CONFIG_X86_P4_CLOCKMOD) += p4-clockmod.o obj-$(CONFIG_X86_CPUFREQ_NFORCE2) += cpufreq-nforce2.o diff --git a/arch/i386/kernel/cpu/cpufreq/speedstep-ich.c b/arch/i386/kernel/cpu/cpufreq/speedstep-ich.c index b425cd3..97c14b3 100644 --- a/arch/i386/kernel/cpu/cpufreq/speedstep-ich.c +++ b/arch/i386/kernel/cpu/cpufreq/speedstep-ich.c @@ -25,7 +25,7 @@ 
#include #include -#include "speedstep-lib.h" +#include /* speedstep_chipset: diff --git a/arch/i386/kernel/cpu/cpufreq/speedstep-smi.c b/arch/i386/kernel/cpu/cpufreq/speedstep-smi.c index ff0d898..093d7d0 100644 --- a/arch/i386/kernel/cpu/cpufreq/speedstep-smi.c +++ b/arch/i386/kernel/cpu/cpufreq/speedstep-smi.c @@ -22,7 +22,7 @@ #include #include -#include "speedstep-lib.h" +#include /* speedstep system management interface port/command. * diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile index 6557e4a..4728c89 100644 --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -2,3 +2,4 @@ obj-y += intel_cacheinfo.o obj-$(CONFIG_X86_MCE) += mcheck/ obj-$(CONFIG_MTRR) += mtrr/ +obj-$(CONFIG_CPU_FREQ) += cpufreq/ diff --git a/arch/x86/kernel/cpu/cpufreq/Makefile b/arch/x86/kernel/cpu/cpufreq/Makefile new file mode 100644 index 000..883fae4 --- /dev/null +++ b/arch/x86/kernel/cpu/cpufreq/Makefile @@ -0,0 +1,6 @@ + +obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o +obj-$(CONFIG_X86_SPEEDSTEP_LIB) += speedstep-lib.o +obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o +obj-$(CONFIG_X86_SPEEDSTEP_CENTRINO) += speedstep-centrino.o +obj-$(CONFIG_X86_P4_CLOCKMOD) += p4-clockmod.o diff --git a/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c similarity index 100% rename from arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c rename to arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c diff --git a/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c b/arch/x86/kernel/cpu/cpufreq/p4-clockmod.c similarity index 100% rename from arch/i386/kernel/cpu/cpufreq/p4-clockmod.c rename to arch/x86/kernel/cpu/cpufreq/p4-clockmod.c index 4786fed..5024ea8 100644 --- a/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c +++ b/arch/x86/kernel/cpu/cpufreq/p4-clockmod.c @@ -33,7 +33,7 @@ #include #include -#include "speedstep-lib.h" +#include #define PFX"p4-clockmod: " #define dprintk(msg...) 
cpufreq_debug_printk(CPUFREQ_DEBUG_DRIVER, "p4-clockmod", msg) diff --git a/arch/i386/kernel/cpu/cpufreq/powernow-k8.c b/arch/x86/kernel/cpu/cpufreq/powernow-k8.c similarity index 100% rename from arch/i386/kernel/cpu/cpufreq/powernow-k8.c rename to arch/x86/kernel/cpu/cpufreq/powernow-k8.c diff --git a/arch/i386/kernel/cpu/cpufreq/powernow-k8.h b/arch/x86/kernel/cpu/cpufreq/powernow-k8.h similarity index 100% rename from arch/i386/kernel/cpu/cpufreq/powernow-k8.h rename to arch/x86/kernel/cpu/cpufreq/powernow-k8.h diff --git a/arch/i386/kernel/cpu/cpufreq/speedstep-centrino.c b/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c similarity index 100% rename from arch/i386/kernel/cpu/cpufreq/speedstep-centrino.c rename to arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c diff
Re: kswapd & 2.4.21-47.0.0.1
Well, I expected a similar answer :) But unfortunately it's not my decision to use CentOS. Also, I couldn't get RH customer support for various reasons. So anyway, thank you for the answer.

Regards, Kostya.

Willy Tarreau wrote:

Hello,

On Wed, Mar 14, 2007 at 04:35:55PM +0300, Konstantin Kalin wrote:

Hello, All

I have the following configuration: CentOS 3.8, kernel 2.4.21-41.0.01.EL, Dialogic boards. Sometimes a kernel panic happens. I set up netdump and got several crash dumps and logs. The backtrace shows that kswapd called BUG in the try_to_unmap function. Unfortunately I couldn't upgrade the kernel because of the proprietary Dialogic drivers, which are precompiled. Could somebody help me? I tried to find similar issues on the mailing list and failed. There are a few messages but they describe another case.

Well, I think you're trying to get both the cake and the money for it. You use a vendor-specific stable kernel in order to get high reliability and good hardware support, but without paying for the customer support associated with it, and when you have a problem you ask for free help here where people don't know much about it (except for those who worked on it). By trying to get all advantages, you're in the worst situation: you have a bug with a kernel that nobody knows except the vendor, and you can't beat the vendor for this. I don't know if CentOS offers community-based support through mailing lists or such, but maybe you'd lose less time and money by buying the smallest support contract from RH and asking them to help you with this problem.

As I understand it, rmap.c is under active development and has been changing substantially with each kernel version. Also, if I understand correctly, rmap.c appeared in kernel 2.6.x and my version of the kernel is a backport by RedHat from 2.6 to 2.4.

Nope, it was initially written for 2.4 by Rik van Riel, and supported for a long time as a patch for these kernels. Later it got merged in 2.4-ac which became a base for RHEL3. It was also merged in 2.6 but I believe that it got important changes, though I'm not sure.

Information about the crash is below. The specific thing about my system is a large number of java threads (up to 1500).

I'm not sure that many people here will be able to provide you with much help, unfortunately.

Regards, Willy
[PATCH take3 20/20] oprofile files switched
Move the oprofile files from arch/i386/oprofile to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/Kconfig b/arch/i386/Kconfig
index 53d6237..137c063 100644
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -1226,7 +1226,7 @@ source "fs/Kconfig"
 menu "Instrumentation Support"
  depends on EXPERIMENTAL
-source "arch/i386/oprofile/Kconfig"
+source "arch/x86/oprofile/Kconfig"
 config KPROBES
  bool "Kprobes (EXPERIMENTAL)"
diff --git a/arch/i386/Makefile b/arch/i386/Makefile
index 06dd07e..6e537be 100644
--- a/arch/i386/Makefile
+++ b/arch/i386/Makefile
@@ -108,7 +108,7 @@ core-y += arch/i386/kernel/ \
 drivers-$(CONFIG_MATH_EMULATION) += arch/i386/math-emu/
 drivers-$(CONFIG_PCI) += arch/i386/pci/
 # must be linked after kernel/
-drivers-$(CONFIG_OPROFILE) += arch/i386/oprofile/
+drivers-$(CONFIG_OPROFILE) += arch/x86/oprofile/
 drivers-$(CONFIG_PM) += arch/i386/power/
 CFLAGS += $(mflags-y)
diff --git a/arch/i386/oprofile/Kconfig b/arch/x86/oprofile/Kconfig
similarity index 100%
rename from arch/i386/oprofile/Kconfig
rename to arch/x86/oprofile/Kconfig
diff --git a/arch/i386/oprofile/Makefile b/arch/x86/oprofile/Makefile
similarity index 100%
rename from arch/i386/oprofile/Makefile
rename to arch/x86/oprofile/Makefile
diff --git a/arch/i386/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
similarity index 100%
rename from arch/i386/oprofile/backtrace.c
rename to arch/x86/oprofile/backtrace.c
diff --git a/arch/i386/oprofile/init.c b/arch/x86/oprofile/init.c
similarity index 100%
rename from arch/i386/oprofile/init.c
rename to arch/x86/oprofile/init.c
diff --git a/arch/i386/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
similarity index 100%
rename from arch/i386/oprofile/nmi_int.c
rename to arch/x86/oprofile/nmi_int.c
diff --git a/arch/i386/oprofile/nmi_timer_int.c b/arch/x86/oprofile/nmi_timer_int.c
similarity index 100%
rename from arch/i386/oprofile/nmi_timer_int.c
rename to arch/x86/oprofile/nmi_timer_int.c
diff --git a/arch/i386/oprofile/op_counter.h b/arch/x86/oprofile/op_counter.h
similarity index 100%
rename from arch/i386/oprofile/op_counter.h
rename to arch/x86/oprofile/op_counter.h
diff --git a/arch/i386/oprofile/op_model_athlon.c b/arch/x86/oprofile/op_model_athlon.c
similarity index 100%
rename from arch/i386/oprofile/op_model_athlon.c
rename to arch/x86/oprofile/op_model_athlon.c
diff --git a/arch/i386/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c
similarity index 100%
rename from arch/i386/oprofile/op_model_p4.c
rename to arch/x86/oprofile/op_model_p4.c
diff --git a/arch/i386/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c
similarity index 100%
rename from arch/i386/oprofile/op_model_ppro.c
rename to arch/x86/oprofile/op_model_ppro.c
diff --git a/arch/i386/oprofile/op_x86_model.h b/arch/x86/oprofile/op_x86_model.h
similarity index 100%
rename from arch/i386/oprofile/op_x86_model.h
rename to arch/x86/oprofile/op_x86_model.h
diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index 56eb14c..12e9fc4 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -738,7 +738,7 @@ source fs/Kconfig
 menu "Instrumentation Support"
  depends on EXPERIMENTAL
-source "arch/x86_64/oprofile/Kconfig"
+source "arch/x86/oprofile/Kconfig"
 config KPROBES
  bool "Kprobes (EXPERIMENTAL)"
diff --git a/arch/x86_64/Makefile b/arch/x86_64/Makefile
index abf1829..0c7e0fa 100644
--- a/arch/x86_64/Makefile
+++ b/arch/x86_64/Makefile
@@ -85,7 +85,7 @@ core-y+= arch/x86_64/kernel/ \
  arch/x86_64/crypto/
 core-$(CONFIG_IA32_EMULATION) += arch/x86_64/ia32/
 drivers-$(CONFIG_PCI) += arch/x86_64/pci/
-drivers-$(CONFIG_OPROFILE) += arch/x86_64/oprofile/
+drivers-$(CONFIG_OPROFILE) += arch/x86/oprofile/
 boot := arch/x86_64/boot
diff --git a/arch/x86_64/oprofile/Kconfig b/arch/x86_64/oprofile/Kconfig
deleted file mode 100644
index d8a8408..000
--- a/arch/x86_64/oprofile/Kconfig
+++ /dev/null
@@ -1,17 +0,0 @@
-config PROFILING
- bool "Profiling support (EXPERIMENTAL)"
- help
- Say Y here to enable the extended profiling support mechanisms used
- by profilers such as OProfile.
-
-
-config OPROFILE
- tristate "OProfile system profiling (EXPERIMENTAL)"
- depends on PROFILING
- help
- OProfile is a profiling system capable of profiling the
- whole system, include the kernel, kernel modules, libraries,
- and applications.
-
- If unsure, say N.
-
diff --git a/arch/x86_64/oprofile/Makefile b/arch/x86_64/oprofile/Makefile
deleted file mode 100644
index 6be3268..000
--- a/arch/x86_64/oprofile/Makefile
+++ /dev/null
@@ -1,19 +0,0 @@
-#
-# oprofile for x86-64.
-# Just reuse the one from i386.
[PATCH take3 09/20] cpuid.c switch
Move the cpuid.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 5276349..4437181 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -14,7 +14,6 @@ obj-y += cpu/
 obj-y += acpi/
 obj-$(CONFIG_X86_BIOS_REBOOT) += reboot.o
 obj-$(CONFIG_MCA) += mca.o
-obj-$(CONFIG_X86_CPUID)+= cpuid.o
 obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_APM) += apm.o
 obj-$(CONFIG_X86_SMP) += smp.o smpboot.o
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 4e5a88f..912421a 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,6 +1,7 @@
 obj-y += bootflag.o quirks.o i8237.o topology.o alternative.o
 obj-$(CONFIG_X86_MSR) += msr.o
+obj-$(CONFIG_X86_CPUID)+= cpuid.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 # i386 defines CONFIG_X86_SMP when CONFIG_SMP and !CONFIG_X86_VOYAGER
diff --git a/arch/i386/kernel/cpuid.c b/arch/x86/kernel/cpuid.c
similarity index 100%
rename from arch/i386/kernel/cpuid.c
rename to arch/x86/kernel/cpuid.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 248dbe8..f5997f3 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -17,7 +17,6 @@ obj-$(CONFIG_X86_MCE_AMD) += mce_amd.o
 obj-$(CONFIG_MTRR) += ../../i386/kernel/cpu/mtrr/
 obj-$(CONFIG_ACPI) += acpi/
 obj-$(CONFIG_MICROCODE)+= microcode.o
-obj-$(CONFIG_X86_CPUID)+= cpuid.o
 obj-$(CONFIG_SMP) += smp.o smpboot.o trampoline.o
 obj-y += apic.o nmi.o
 obj-y += io_apic.o mpparse.o \
@@ -45,7 +44,6 @@ obj-y += pcspeaker.o
 CFLAGS_vsyscall.o := $(PROFILING) -g0
 therm_throt-y += ../../i386/kernel/cpu/mcheck/therm_throt.o
-cpuid-$(subst m,y,$(CONFIG_X86_CPUID)) += ../../i386/kernel/cpuid.o
 microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
 pcspeaker-y+= ../../i386/kernel/pcspeaker.o
[PATCH take3 11/20] pcspeaker.c switch
Move the pcspeaker.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index ac925bc..ce1f742 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -37,7 +37,6 @@ obj-$(CONFIG_K8_NB) += k8.o
 obj-$(CONFIG_VMI) += vmi.o vmitime.o
 obj-$(CONFIG_PARAVIRT) += paravirt.o
-obj-y += pcspeaker.o
 EXTRA_AFLAGS := -traditional
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index f1c6b2e..3e15c9e 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -5,6 +5,8 @@ obj-$(CONFIG_X86_CPUID) += cpuid.o
 obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
+obj-y += pcspeaker.o
+
 # i386 defines CONFIG_X86_SMP when CONFIG_SMP and !CONFIG_X86_VOYAGER
 ifeq ($(CONFIG_X86_VOYAGER), )
 obj-$(CONFIG_SMP) += tsc_sync.o
diff --git a/arch/i386/kernel/pcspeaker.c b/arch/x86/kernel/pcspeaker.c
similarity index 100%
rename from arch/i386/kernel/pcspeaker.c
rename to arch/x86/kernel/pcspeaker.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 08795d8..3fae694 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -38,10 +38,8 @@ obj-$(CONFIG_MODULES)+= module.o
 obj-$(CONFIG_PCI) += early-quirks.o
 obj-y += intel_cacheinfo.o
-obj-y += pcspeaker.o
 CFLAGS_vsyscall.o := $(PROFILING) -g0
 therm_throt-y += ../../i386/kernel/cpu/mcheck/therm_throt.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
-pcspeaker-y+= ../../i386/kernel/pcspeaker.o
[PATCH take3 13/20] therm_throt.c switch
Move the therm_throt.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/cpu/mcheck/Makefile b/arch/i386/kernel/cpu/mcheck/Makefile
index f1ebe1c..30808f3 100644
--- a/arch/i386/kernel/cpu/mcheck/Makefile
+++ b/arch/i386/kernel/cpu/mcheck/Makefile
@@ -1,2 +1,2 @@
-obj-y = mce.o k7.o p4.o p5.o p6.o winchip.o therm_throt.o
+obj-y = mce.o k7.o p4.o p5.o p6.o winchip.o
 obj-$(CONFIG_X86_MCE_NONFATAL) += non-fatal.o
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 3e59ae7..e439cc1 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -1,2 +1,3 @@
+obj-$(CONFIG_X86_MCE) += mcheck/
 obj-$(CONFIG_MTRR) += mtrr/
diff --git a/arch/x86/kernel/cpu/mcheck/Makefile b/arch/x86/kernel/cpu/mcheck/Makefile
new file mode 100644
index 000..4018cde
--- /dev/null
+++ b/arch/x86/kernel/cpu/mcheck/Makefile
@@ -0,0 +1 @@
+obj-y = therm_throt.o
diff --git a/arch/i386/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
similarity index 100%
rename from arch/i386/kernel/cpu/mcheck/therm_throt.c
rename to arch/x86/kernel/cpu/mcheck/therm_throt.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 60918ad..ef1585d 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -11,7 +11,7 @@ obj-y := process.o signal.o entry.o traps.o irq.o \
  pci-dma.o pci-nommu.o hpet.o tsc.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
-obj-$(CONFIG_X86_MCE) += mce.o therm_throt.o
+obj-$(CONFIG_X86_MCE) += mce.o
 obj-$(CONFIG_X86_MCE_INTEL)+= mce_intel.o
 obj-$(CONFIG_X86_MCE_AMD) += mce_amd.o
 obj-$(CONFIG_ACPI) += acpi/
@@ -40,5 +40,4 @@ obj-y += intel_cacheinfo.o
 CFLAGS_vsyscall.o := $(PROFILING) -g0
-therm_throt-y += ../../i386/kernel/cpu/mcheck/therm_throt.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
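The `obj-$(CONFIG_X86_MCE) += mcheck/` line added by the therm_throt.c patch relies on the option's value becoming part of the variable name. The standalone GNU make sketch below is not part of the patch and is not a kbuild Makefile; the hard-coded option values and the `all` target are assumptions for illustration only, since a real kernel build reads the values from .config.

```make
# Standalone illustration; not a kbuild Makefile.
# Assumption: these values would normally come from .config (y, m, or empty).
CONFIG_X86_MCE := y
CONFIG_MTRR :=

# The option value is substituted into the variable name, so each entry is
# appended to obj-y, obj-m, or (for a disabled option) the unused obj-.
obj-$(CONFIG_X86_MCE) += mcheck/
obj-$(CONFIG_MTRR) += mtrr/

# Show which list each entry ended up in while the makefile is parsed.
$(info obj-y = $(obj-y))
$(info obj-  = $(obj-))

# Empty default target so that "make" has something to do.
all: ;
```

With these assumed values, `mcheck/` lands in `obj-y` and `mtrr/` in the ignored `obj-` list. In kbuild, a trailing slash on an `obj-y` entry tells the build to descend into that directory, which is why the patch also adds a new arch/x86/kernel/cpu/mcheck/Makefile.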
[PATCH take3 14/20] intel_cacheinfo.c switch
Move the intel_cacheinfo.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/cpu/Makefile b/arch/i386/kernel/cpu/Makefile
index f8eaef8..e484d74 100644
--- a/arch/i386/kernel/cpu/Makefile
+++ b/arch/i386/kernel/cpu/Makefile
@@ -8,7 +8,7 @@ obj-y += amd.o
 obj-y += cyrix.o
 obj-y += centaur.o
 obj-y += transmeta.o
-obj-y += intel.o intel_cacheinfo.o
+obj-y += intel.o
 obj-y += rise.o
 obj-y += nexgen.o
 obj-y += umc.o
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index e439cc1..6557e4a 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -1,3 +1,4 @@
+obj-y += intel_cacheinfo.o
 obj-$(CONFIG_X86_MCE) += mcheck/
 obj-$(CONFIG_MTRR) += mtrr/
diff --git a/arch/i386/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
similarity index 100%
rename from arch/i386/kernel/cpu/intel_cacheinfo.c
rename to arch/x86/kernel/cpu/intel_cacheinfo.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index ef1585d..0a33b03 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -36,8 +36,5 @@ obj-$(CONFIG_AUDIT) += audit.o
 obj-$(CONFIG_MODULES) += module.o
 obj-$(CONFIG_PCI) += early-quirks.o
-obj-y += intel_cacheinfo.o
-
 CFLAGS_vsyscall.o := $(PROFILING) -g0
-intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
[PATCH take3 19/20] hugetlbpage.c switch
Move the hugetlbpage.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/Makefile b/arch/i386/Makefile
index d73a830..06dd07e 100644
--- a/arch/i386/Makefile
+++ b/arch/i386/Makefile
@@ -102,6 +102,7 @@ libs-y += arch/i386/lib/
 core-y += arch/i386/kernel/ \
  arch/x86/kernel/ \
  arch/i386/mm/ \
+ arch/x86/mm/ \
  arch/i386/$(mcore-y)/ \
  arch/i386/crypto/
 drivers-$(CONFIG_MATH_EMULATION) += arch/i386/math-emu/
diff --git a/arch/i386/mm/Makefile b/arch/i386/mm/Makefile
index 80908b5..0cb01e6 100644
--- a/arch/i386/mm/Makefile
+++ b/arch/i386/mm/Makefile
@@ -5,6 +5,5 @@
 obj-y := init.o pgtable.o fault.o ioremap.o extable.o pageattr.o mmap.o
 obj-$(CONFIG_NUMA) += discontig.o
-obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
 obj-$(CONFIG_HIGHMEM) += highmem.o
 obj-$(CONFIG_BOOT_IOREMAP) += boot_ioremap.o
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
new file mode 100644
index 000..1b6e922
--- /dev/null
+++ b/arch/x86/mm/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
diff --git a/arch/i386/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
similarity index 100%
rename from arch/i386/mm/hugetlbpage.c
rename to arch/x86/mm/hugetlbpage.c
diff --git a/arch/x86_64/Makefile b/arch/x86_64/Makefile
index 3cf9198..abf1829 100644
--- a/arch/x86_64/Makefile
+++ b/arch/x86_64/Makefile
@@ -81,6 +81,7 @@ libs-y+= arch/x86_64/lib/
 core-y += arch/x86_64/kernel/ \
  arch/x86/kernel/ \
  arch/x86_64/mm/ \
+ arch/x86/mm/ \
  arch/x86_64/crypto/
 core-$(CONFIG_IA32_EMULATION) += arch/x86_64/ia32/
 drivers-$(CONFIG_PCI) += arch/x86_64/pci/
diff --git a/arch/x86_64/mm/Makefile b/arch/x86_64/mm/Makefile
index d25ac86..b6f1f43 100644
--- a/arch/x86_64/mm/Makefile
+++ b/arch/x86_64/mm/Makefile
@@ -3,9 +3,7 @@
 #
 obj-y := init.o fault.o ioremap.o extable.o pageattr.o mmap.o
-obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
 obj-$(CONFIG_NUMA) += numa.o
 obj-$(CONFIG_K8_NUMA) += k8topology.o
 obj-$(CONFIG_ACPI_NUMA) += srat.o
-hugetlbpage-y = ../../i386/mm/hugetlbpage.o
[PATCH take3 05/20] i8237.c switch
Move the i8237.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index c5c62af..1052659 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -7,7 +7,7 @@ extra-y := head.o init_task.o vmlinux.lds
 obj-y := process.o signal.o entry.o traps.o irq.o \
  ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_i386.o \
  pci-dma.o i386_ksyms.o i387.o e820.o\
- i8237.o topology.o alternative.o i8253.o tsc.o
+ topology.o alternative.o i8253.o tsc.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
 obj-y += cpu/
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 26feab4..19921b9 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,4 +1,4 @@
-obj-y += bootflag.o quirks.o
+obj-y += bootflag.o quirks.o i8237.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
diff --git a/arch/i386/kernel/i8237.c b/arch/x86/kernel/i8237.c
similarity index 100%
rename from arch/i386/kernel/i8237.c
rename to arch/x86/kernel/i8237.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 533d4bb..c04f7a6 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -7,7 +7,7 @@ EXTRA_AFLAGS:= -traditional
 obj-y := process.o signal.o entry.o traps.o irq.o \
  ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_x86_64.o \
  x8664_ksyms.o i387.o syscall.o vsyscall.o \
- setup64.o e820.o reboot.o i8237.o \
+ setup64.o e820.o reboot.o \
  pci-dma.o pci-nommu.o alternative.o hpet.o tsc.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
@@ -51,7 +51,6 @@ cpuid-$(subst m,y,$(CONFIG_X86_CPUID)) += ../../i386/kernel/cpuid.o
 topology-y += ../../i386/kernel/topology.o
 microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
-i8237-y+= ../../i386/kernel/i8237.o
 msr-$(subst m,y,$(CONFIG_X86_MSR)) += ../../i386/kernel/msr.o
 alternative-y += ../../i386/kernel/alternative.o
 pcspeaker-y+= ../../i386/kernel/pcspeaker.o
[PATCH take3 06/20] topology.c switch
Move the topology.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 1052659..556da60 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -7,7 +7,7 @@ extra-y := head.o init_task.o vmlinux.lds
 obj-y := process.o signal.o entry.o traps.o irq.o \
  ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_i386.o \
  pci-dma.o i386_ksyms.o i387.o e820.o\
- topology.o alternative.o i8253.o tsc.o
+ alternative.o i8253.o tsc.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
 obj-y += cpu/
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 19921b9..d70dbf3 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,4 +1,4 @@
-obj-y += bootflag.o quirks.o i8237.o
+obj-y += bootflag.o quirks.o i8237.o topology.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
diff --git a/arch/i386/kernel/topology.c b/arch/x86/kernel/topology.c
similarity index 100%
rename from arch/i386/kernel/topology.c
rename to arch/x86/kernel/topology.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index c04f7a6..3dc4c18 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -40,7 +40,6 @@ obj-$(CONFIG_AUDIT) += audit.o
 obj-$(CONFIG_MODULES) += module.o
 obj-$(CONFIG_PCI) += early-quirks.o
-obj-y += topology.o
 obj-y += intel_cacheinfo.o
 obj-y += pcspeaker.o
@@ -48,7 +47,6 @@ CFLAGS_vsyscall.o := $(PROFILING) -g0
 therm_throt-y += ../../i386/kernel/cpu/mcheck/therm_throt.o
 cpuid-$(subst m,y,$(CONFIG_X86_CPUID)) += ../../i386/kernel/cpuid.o
-topology-y += ../../i386/kernel/topology.o
 microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
 msr-$(subst m,y,$(CONFIG_X86_MSR)) += ../../i386/kernel/msr.o
[PATCH take3 17/20] k8.c switch
Move the k8.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index ce1f742..72e11f7 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -33,7 +33,6 @@ obj-$(CONFIG_EFI) += efi.o efi_stub.o
 obj-$(CONFIG_DOUBLEFAULT) += doublefault.o
 obj-$(CONFIG_VM86) += vm86.o
 obj-$(CONFIG_HPET_TIMER) += hpet.o
-obj-$(CONFIG_K8_NB)+= k8.o
 obj-$(CONFIG_VMI) += vmi.o vmitime.o
 obj-$(CONFIG_PARAVIRT) += paravirt.o
@@ -78,6 +77,5 @@ $(obj)/vsyscall-syms.o: $(src)/vsyscall.lds \
  $(obj)/vsyscall-sysenter.o $(obj)/vsyscall-note.o FORCE
  $(call if_changed,syscall)
-k8-y += ../../x86_64/kernel/k8.o
 stacktrace-y += ../../x86_64/kernel/stacktrace.o
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 1167962..06c335d 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_X86_CPUID)+= cpuid.o
 obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
+obj-$(CONFIG_K8_NB)+= k8.o
 obj-y += pcspeaker.o
diff --git a/arch/x86_64/kernel/k8.c b/arch/x86/kernel/k8.c
similarity index 100%
rename from arch/x86_64/kernel/k8.c
rename to arch/x86/kernel/k8.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 3d90462..0510887 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -29,7 +29,6 @@ obj-$(CONFIG_SWIOTLB) += pci-swiotlb.o
 obj-$(CONFIG_KPROBES) += kprobes.o
 obj-$(CONFIG_X86_PM_TIMER) += pmtimer.o
 obj-$(CONFIG_X86_VSMP) += vsmp.o
-obj-$(CONFIG_K8_NB)+= k8.o
 obj-$(CONFIG_AUDIT)+= audit.o
 obj-$(CONFIG_MODULES) += module.o
[PATCH take3 01/20] early_printk.c switch
Move the early_printk.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 4ae3dcf..a57040d 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -35,7 +35,6 @@ obj-$(CONFIG_ACPI_SRAT) += srat.o
 obj-$(CONFIG_EFI) += efi.o efi_stub.o
 obj-$(CONFIG_DOUBLEFAULT) += doublefault.o
 obj-$(CONFIG_VM86) += vm86.o
-obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 obj-$(CONFIG_HPET_TIMER) += hpet.o
 obj-$(CONFIG_K8_NB)+= k8.o
diff --git a/arch/i386/kernel/early_printk.c b/arch/i386/kernel/early_printk.c
deleted file mode 100644
index 92f812b..000
--- a/arch/i386/kernel/early_printk.c
+++ /dev/null
@@ -1,2 +0,0 @@
-
-#include "../../x86_64/kernel/early_printk.c"
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
new file mode 100644
index 000..55f268f
--- /dev/null
+++ b/arch/x86/kernel/Makefile
@@ -0,0 +1,2 @@
+
+obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
diff --git a/arch/x86_64/kernel/early_printk.c b/arch/x86/kernel/early_printk.c
similarity index 100%
rename from arch/x86_64/kernel/early_printk.c
rename to arch/x86/kernel/early_printk.c
diff --git a/arch/x86_64/Makefile b/arch/x86_64/Makefile
index 2941a91..3cf9198 100644
--- a/arch/x86_64/Makefile
+++ b/arch/x86_64/Makefile
@@ -79,6 +79,7 @@ head-y := arch/x86_64/kernel/head.o arch/x86_64/kernel/head64.o arch/x86_64/kern
 libs-y += arch/x86_64/lib/
 core-y += arch/x86_64/kernel/ \
+ arch/x86/kernel/ \
  arch/x86_64/mm/ \
  arch/x86_64/crypto/
 core-$(CONFIG_IA32_EMULATION) += arch/x86_64/ia32/
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index bb47e86..8b2535c 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -28,7 +28,6 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
 obj-$(CONFIG_PM) += suspend.o
 obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend_asm.o
 obj-$(CONFIG_CPU_FREQ) += cpufreq/
-obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 obj-$(CONFIG_IOMMU)+= pci-gart.o aperture.o
 obj-$(CONFIG_CALGARY_IOMMU)+= pci-calgary.o tce.o
 obj-$(CONFIG_SWIOTLB) += pci-swiotlb.o
diff --git a/arch/i386/Makefile b/arch/i386/Makefile
index bd28f9f..d73a830 100644
--- a/arch/i386/Makefile
+++ b/arch/i386/Makefile
@@ -100,6 +100,7 @@ head-y := arch/i386/kernel/head.o arch/i386/kernel/init_task.o
 libs-y += arch/i386/lib/
 core-y += arch/i386/kernel/ \
+ arch/x86/kernel/ \
  arch/i386/mm/ \
  arch/i386/$(mcore-y)/ \
  arch/i386/crypto/
[PATCH take3 18/20] stacktrace.c switch
Move the stacktrace.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 72e11f7..a5cf2e7 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -9,7 +9,6 @@ obj-y := process.o signal.o entry.o traps.o irq.o \
  pci-dma.o i386_ksyms.o i387.o e820.o\
  i8253.o tsc.o
-obj-$(CONFIG_STACKTRACE) += stacktrace.o
 obj-y += cpu/
 obj-y += acpi/
 obj-$(CONFIG_X86_BIOS_REBOOT) += reboot.o
@@ -77,5 +76,3 @@ $(obj)/vsyscall-syms.o: $(src)/vsyscall.lds \
  $(obj)/vsyscall-sysenter.o $(obj)/vsyscall-note.o FORCE
  $(call if_changed,syscall)
-stacktrace-y += ../../x86_64/kernel/stacktrace.o
-
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 06c335d..297cde9 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,5 +1,7 @@
 obj-y += bootflag.o quirks.o i8237.o topology.o alternative.o
+obj-$(CONFIG_STACKTRACE) += stacktrace.o
+
 obj-y += cpu/
 obj-$(CONFIG_ACPI) += acpi/
diff --git a/arch/x86_64/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
similarity index 100%
rename from arch/x86_64/kernel/stacktrace.c
rename to arch/x86/kernel/stacktrace.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 0510887..7477cb1 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -10,7 +10,6 @@ obj-y := process.o signal.o entry.o traps.o irq.o \
  setup64.o e820.o reboot.o \
  pci-dma.o pci-nommu.o hpet.o tsc.o
-obj-$(CONFIG_STACKTRACE) += stacktrace.o
 obj-$(CONFIG_X86_MCE) += mce.o
 obj-$(CONFIG_X86_MCE_INTEL)+= mce_intel.o
 obj-$(CONFIG_X86_MCE_AMD) += mce_amd.o
[PATCH take3 04/20] quirks.c switch
Move the quirks.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 4622355..c5c62af 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -7,7 +7,7 @@ extra-y := head.o init_task.o vmlinux.lds
 obj-y := process.o signal.o entry.o traps.o irq.o \
  ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_i386.o \
  pci-dma.o i386_ksyms.o i387.o e820.o\
- quirks.o i8237.o topology.o alternative.o i8253.o tsc.o
+ i8237.o topology.o alternative.o i8253.o tsc.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
 obj-y += cpu/
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index fe2e4ea..26feab4 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,4 +1,4 @@
-obj-y += bootflag.o
+obj-y += bootflag.o quirks.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
diff --git a/arch/i386/kernel/quirks.c b/arch/x86/kernel/quirks.c
similarity index 100%
rename from arch/i386/kernel/quirks.c
rename to arch/x86/kernel/quirks.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 1ffc4ea..533d4bb 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -7,7 +7,7 @@ EXTRA_AFLAGS:= -traditional
 obj-y := process.o signal.o entry.o traps.o irq.o \
  ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_x86_64.o \
  x8664_ksyms.o i387.o syscall.o vsyscall.o \
- setup64.o e820.o reboot.o quirks.o i8237.o \
+ setup64.o e820.o reboot.o i8237.o \
  pci-dma.o pci-nommu.o alternative.o hpet.o tsc.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
@@ -51,7 +51,6 @@ cpuid-$(subst m,y,$(CONFIG_X86_CPUID)) += ../../i386/kernel/cpuid.o
 topology-y += ../../i386/kernel/topology.o
 microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
-quirks-y += ../../i386/kernel/quirks.o
 i8237-y+= ../../i386/kernel/i8237.o
 msr-$(subst m,y,$(CONFIG_X86_MSR)) += ../../i386/kernel/msr.o
 alternative-y += ../../i386/kernel/alternative.o
[PATCH take3 16/20] acpi files switched
Move the shared files that were in arch/i386/kernel/acpi to the common area. Note that files still exist in the acpi directories of both archs, since some of the code there is unique to each arch.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/acpi/Makefile b/arch/i386/kernel/acpi/Makefile
index 7f7be01..3de22c2 100644
--- a/arch/i386/kernel/acpi/Makefile
+++ b/arch/i386/kernel/acpi/Makefile
@@ -1,10 +1,5 @@
-obj-$(CONFIG_ACPI) += boot.o
 ifneq ($(CONFIG_PCI),)
 obj-$(CONFIG_X86_IO_APIC) += earlyquirk.o
 endif
 obj-$(CONFIG_ACPI_SLEEP) += sleep.o wakeup.o
-ifneq ($(CONFIG_ACPI_PROCESSOR),)
-obj-y += cstate.o processor.o
-endif
-
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index c1a2b58..1167962 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,6 +1,7 @@
 obj-y += bootflag.o quirks.o i8237.o topology.o alternative.o
 obj-y += cpu/
+obj-$(CONFIG_ACPI) += acpi/
 obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_X86_CPUID)+= cpuid.o
diff --git a/arch/x86/kernel/acpi/Makefile b/arch/x86/kernel/acpi/Makefile
new file mode 100644
index 000..3aa3d16
--- /dev/null
+++ b/arch/x86/kernel/acpi/Makefile
@@ -0,0 +1,5 @@
+obj-y += boot.o
+
+ifneq ($(CONFIG_ACPI_PROCESSOR),)
+obj-y += processor.o cstate.o
+endif
diff --git a/arch/i386/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
similarity index 100%
rename from arch/i386/kernel/acpi/boot.c
rename to arch/x86/kernel/acpi/boot.c
diff --git a/arch/i386/kernel/acpi/cstate.c b/arch/x86/kernel/acpi/cstate.c
similarity index 100%
rename from arch/i386/kernel/acpi/cstate.c
rename to arch/x86/kernel/acpi/cstate.c
diff --git a/arch/i386/kernel/acpi/processor.c b/arch/x86/kernel/acpi/processor.c
similarity index 100%
rename from arch/i386/kernel/acpi/processor.c
rename to arch/x86/kernel/acpi/processor.c
diff --git a/arch/x86_64/kernel/acpi/Makefile b/arch/x86_64/kernel/acpi/Makefile
index 080b996..eb4bc11 100644
--- a/arch/x86_64/kernel/acpi/Makefile
+++ b/arch/x86_64/kernel/acpi/Makefile
@@ -1,9 +1,2 @@
-obj-y := boot.o
-boot-y := ../../../i386/kernel/acpi/boot.o
 obj-$(CONFIG_ACPI_SLEEP) += sleep.o wakeup.o
-ifneq ($(CONFIG_ACPI_PROCESSOR),)
-obj-y += processor.o
-processor-y:= ../../../i386/kernel/acpi/processor.o ../../../i386/kernel/acpi/cstate.o
-endif
-
[PATCH take3 08/20] msr.c switch
Move the msr.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 44c7d89..5276349 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -14,7 +14,6 @@ obj-y += cpu/
 obj-y += acpi/
 obj-$(CONFIG_X86_BIOS_REBOOT) += reboot.o
 obj-$(CONFIG_MCA) += mca.o
-obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_X86_CPUID)+= cpuid.o
 obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_APM) += apm.o
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b63f832..4e5a88f 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,5 +1,6 @@
 obj-y += bootflag.o quirks.o i8237.o topology.o alternative.o
+obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 # i386 defines CONFIG_X86_SMP when CONFIG_SMP and !CONFIG_X86_VOYAGER
diff --git a/arch/i386/kernel/msr.c b/arch/x86/kernel/msr.c
similarity index 100%
rename from arch/i386/kernel/msr.c
rename to arch/x86/kernel/msr.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index b12901c..248dbe8 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -16,7 +16,6 @@ obj-$(CONFIG_X86_MCE_INTEL) += mce_intel.o
 obj-$(CONFIG_X86_MCE_AMD) += mce_amd.o
 obj-$(CONFIG_MTRR) += ../../i386/kernel/cpu/mtrr/
 obj-$(CONFIG_ACPI) += acpi/
-obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_X86_CPUID)+= cpuid.o
 obj-$(CONFIG_SMP) += smp.o smpboot.o trampoline.o
@@ -49,5 +48,4 @@ therm_throt-y += ../../i386/kernel/cpu/mcheck/therm_throt.o
 cpuid-$(subst m,y,$(CONFIG_X86_CPUID)) += ../../i386/kernel/cpuid.o
 microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
-msr-$(subst m,y,$(CONFIG_X86_MSR)) += ../../i386/kernel/msr.o
 pcspeaker-y+= ../../i386/kernel/pcspeaker.o
[PATCH take3 10/20] microcode.c switch
Move the microcode.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 4437181..ac925bc 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -14,7 +14,6 @@ obj-y += cpu/
 obj-y += acpi/
 obj-$(CONFIG_X86_BIOS_REBOOT) += reboot.o
 obj-$(CONFIG_MCA) += mca.o
-obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_APM) += apm.o
 obj-$(CONFIG_X86_SMP) += smp.o smpboot.o
 obj-$(CONFIG_X86_TRAMPOLINE) += trampoline.o
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 912421a..f1c6b2e 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -2,6 +2,7 @@ obj-y += bootflag.o quirks.o i8237.o topology.o alternative.o
 obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_X86_CPUID)+= cpuid.o
+obj-$(CONFIG_MICROCODE)+= microcode.o
 
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 
 # i386 defines CONFIG_X86_SMP when CONFIG_SMP and !CONFIG_X86_VOYAGER
diff --git a/arch/i386/kernel/microcode.c b/arch/x86/kernel/microcode.c
similarity index 100%
rename from arch/i386/kernel/microcode.c
rename to arch/x86/kernel/microcode.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index f5997f3..08795d8 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -16,7 +16,6 @@ obj-$(CONFIG_X86_MCE_INTEL) += mce_intel.o
 obj-$(CONFIG_X86_MCE_AMD) += mce_amd.o
 obj-$(CONFIG_MTRR) += ../../i386/kernel/cpu/mtrr/
 obj-$(CONFIG_ACPI) += acpi/
-obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_SMP) += smp.o smpboot.o trampoline.o
 obj-y += apic.o nmi.o
 obj-y += io_apic.o mpparse.o \
@@ -44,6 +43,5 @@ obj-y += pcspeaker.o
 CFLAGS_vsyscall.o := $(PROFILING) -g0
 
 therm_throt-y += ../../i386/kernel/cpu/mcheck/therm_throt.o
-microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
 pcspeaker-y+= ../../i386/kernel/pcspeaker.o
[PATCH take3 03/20] bootflag.c switch
Move the bootflag.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index c8fe439..4622355 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -6,7 +6,7 @@ extra-y := head.o init_task.o vmlinux.lds
 
 obj-y := process.o signal.o entry.o traps.o irq.o \
 	ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_i386.o \
-	pci-dma.o i386_ksyms.o i387.o bootflag.o e820.o\
+	pci-dma.o i386_ksyms.o i387.o e820.o\
 	quirks.o i8237.o topology.o alternative.o i8253.o tsc.o
 
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index bd548e6..fe2e4ea 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,3 +1,4 @@
+obj-y += bootflag.o
 
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 
diff --git a/arch/i386/kernel/bootflag.c b/arch/x86/kernel/bootflag.c
similarity index 100%
rename from arch/i386/kernel/bootflag.c
rename to arch/x86/kernel/bootflag.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 54fe500..1ffc4ea 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -7,7 +7,7 @@ EXTRA_AFLAGS:= -traditional
 obj-y := process.o signal.o entry.o traps.o irq.o \
 	ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_x86_64.o \
 	x8664_ksyms.o i387.o syscall.o vsyscall.o \
-	setup64.o bootflag.o e820.o reboot.o quirks.o i8237.o \
+	setup64.o e820.o reboot.o quirks.o i8237.o \
 	pci-dma.o pci-nommu.o alternative.o hpet.o tsc.o
 
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
@@ -47,7 +47,6 @@ obj-y += pcspeaker.o
 CFLAGS_vsyscall.o := $(PROFILING) -g0
 
 therm_throt-y += ../../i386/kernel/cpu/mcheck/therm_throt.o
-bootflag-y += ../../i386/kernel/bootflag.o
 cpuid-$(subst m,y,$(CONFIG_X86_CPUID)) += ../../i386/kernel/cpuid.o
 topology-y += ../../i386/kernel/topology.o
 microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
[PATCH take3 07/20] alternative.c switch
Move the alternative.c to the common area.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 556da60..44c7d89 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -7,7 +7,7 @@ extra-y := head.o init_task.o vmlinux.lds
 obj-y := process.o signal.o entry.o traps.o irq.o \
 	ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_i386.o \
 	pci-dma.o i386_ksyms.o i387.o e820.o\
-	alternative.o i8253.o tsc.o
+	i8253.o tsc.o
 
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
 obj-y += cpu/
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index d70dbf3..b63f832 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -1,4 +1,4 @@
-obj-y += bootflag.o quirks.o i8237.o topology.o
+obj-y += bootflag.o quirks.o i8237.o topology.o alternative.o
 
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 
diff --git a/arch/i386/kernel/alternative.c b/arch/x86/kernel/alternative.c
similarity index 100%
rename from arch/i386/kernel/alternative.c
rename to arch/x86/kernel/alternative.c
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 3dc4c18..b12901c 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -8,7 +8,7 @@ obj-y := process.o signal.o entry.o traps.o irq.o \
 	ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_x86_64.o \
 	x8664_ksyms.o i387.o syscall.o vsyscall.o \
 	setup64.o e820.o reboot.o \
-	pci-dma.o pci-nommu.o alternative.o hpet.o tsc.o
+	pci-dma.o pci-nommu.o hpet.o tsc.o
 
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
 obj-$(CONFIG_X86_MCE) += mce.o therm_throt.o
@@ -50,5 +50,4 @@ cpuid-$(subst m,y,$(CONFIG_X86_CPUID)) += ../../i386/kernel/cpuid.o
 microcode-$(subst m,y,$(CONFIG_MICROCODE)) += ../../i386/kernel/microcode.o
 intel_cacheinfo-y += ../../i386/kernel/cpu/intel_cacheinfo.o
 msr-$(subst m,y,$(CONFIG_X86_MSR)) += ../../i386/kernel/msr.o
-alternative-y += ../../i386/kernel/alternative.o
 pcspeaker-y+= ../../i386/kernel/pcspeaker.o
Re: New thread RDSL, post-2.6.20 kernels and amanda (tar) miss-fires
Gene Heskett wrote:
> Here is an example
> [EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k count=1
> AMANDA: FILE 20070314104344 coyote /lib lev 1 comp .gz program /bin/tar
> To restore, position tape at start of file and run:
> dd if= bs=32k skip=1 | /bin/gzip -dc | /bin/tar -f - ...
>
> And the elipsis is an error if not removed. Then one is supposed to be
> able to redirect tars output with the usual >/tmp/test/ syntax
>
> So:
> [EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k
> skip=1 | /bin/gzip -dc | /bin/tar -f - >/tmp/test/
> -bash: /tmp/test/: Is a directory
>
> which is the return from any variation in how the redirect is done.
>
> So what is it that am I doing wrong in the above command line?, so I can
> add it to my helper scripts to be published eventually on zmanda.org.

One of us is confused, and it may very well be me, but...

The /bin/tar -f - >/tmp/test/ looks to me like it should fail exactly as bash says it does. The output redirect (>) will only write out to a file, not a directory. (So, /tmp/file should work, /tmp/file/ won't.)

Are you trying to redirect where the files get restored? That should be done with a cd before doing the uncompress. Or am I misunderstanding what you're telling me?

Ray
Re: [RFC][PATCH 4/7] RSS accounting hooks over the code
Kirill Korotaev wrote:

The approaches I have seen that don't have a struct page pointer, do intrusive things like try to put hooks everywhere throughout the kernel where a userspace task can cause an allocation (and of course end up missing many, so they aren't secure anyway)... and basically just nasty stuff that will never get merged.

User beancounters patch has got through all these... The approach where each charged object has a pointer to the owner container, who has charged it - is the most easy/clean way to handle all the problems with dynamic context change, races, etc. and 1 pointer in page struct is just 0.1% overhead.

The pointer in struct page approach is a decent one, which I have liked since this whole container effort came up. IIRC Linus and Alan also thought that was a reasonable way to go. I haven't reviewed the rest of the beancounters patch since looking at it quite a few months ago... I probably don't have time for a good review at the moment, but I should eventually.

Struct page overhead really isn't bad. Sure, nobody who doesn't use containers will want to turn it on, but unless you're using a big PAE system you're actually unlikely to notice.

big PAE doesn't make any difference IMHO (until struct pages are not created for non-present physical memory areas)

The issue is just that struct pages use low memory, which is a really scarce commodity on PAE. One more pointer in the struct page means 64MB less lowmem. But PAE is crap anyway. We've already made enough concessions in the kernel to support it.

I agree: struct page overhead is not really significant. The benefits of simplicity seem to outweigh the downside.

But again, I'll say the node-container approach of course does avoid this nicely (because we already can get the node from the page). So definitely that approach needs to be discredited before going with this one.

But it lacks some other features:

1. page can't be shared easily with another container

I think they could be shared. You allocate _new_ pages from your own node, but you can definitely use existing pages allocated to other nodes.

2. shared page can't be accounted honestly to containers as fraction=PAGE_SIZE/containers-using-it

Yes there would be some accounting differences. I think it is hard to say exactly what containers are "using" what page anyway, though. What do you say about unmapped pages? Kernel allocations? etc.

3. It doesn't help accounting of kernel memory structures. e.g. in OpenVZ we use exactly the same pointer on the page to track which container owns it, e.g. pages used for page tables are accounted this way.

? page_to_nid(page) ~= container that owns it.

4. I guess container destroy requires destroy of memory zone, which means write out of dirty data. Which doesn't sound good for me as well.

I haven't looked at any implementation, but I think it is fine for the zone to stay around.

5. memory reclamation in case of global memory shortage becomes a tricky/unfair task.

I don't understand why? You can much more easily target a specific container for reclaim with this approach than with others (because you have an lru per container).

6. You cannot overcommit. AFAIU, the memory should be granted to node exclusive usage and cannot be used by another containers, even if it is unused. This is not an option for us.

I'm not sure about that. If you have a larger number of nodes, then you could assign more free nodes to a container on demand. But I think there would definitely be less flexibility with nodes... I don't know... and seeing as I don't really know where the google guys are going with it, I won't misrepresent their work any further ;)

Everyone seems to have a plan ;) I don't read the containers list... does everyone still have *different* plans, or is any sort of consensus being reached?

hope we'll have it soon :)

Good luck ;)

--
SUSE Labs, Novell Inc.
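For reference, the two overhead figures quoted in this message (0.1% and 64MB) can be reproduced with a little arithmetic. The sketch below is only an illustration: the 4KB page size, the 4-byte pointer of a 32-bit PAE kernel, and the 64GB of RAM are assumptions chosen to match the numbers, not values stated in the thread, and the function names are made up.

```c
#include <stdint.h>

/*
 * Total growth of the struct page array when every page descriptor
 * gains one extra field of field_size bytes.
 * All parameters are assumptions supplied by the caller.
 */
static uint64_t extra_struct_page_bytes(uint64_t ram_bytes,
					uint64_t page_size,
					uint64_t field_size)
{
	return (ram_bytes / page_size) * field_size;
}

/* The same cost expressed as a percentage of the memory being described. */
static double overhead_percent(uint64_t page_size, uint64_t field_size)
{
	return 100.0 * (double)field_size / (double)page_size;
}
```

With 4096-byte pages and a 4-byte pointer, overhead_percent() comes to roughly 0.098, and extra_struct_page_bytes() for 64GB of RAM comes to 64MB, all of which would have to live in lowmem on a PAE machine.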
Re: [PATCH 1/1] Allow i386 crash kernels to handle x86_64 dumps
On Thu, Mar 15, 2007 at 10:46:38AM +0900, Horms wrote:
> On Wed, Mar 14, 2007 at 05:00:09PM +, Ian Campbell wrote:
> > The specific case I am encountering is kdump under Xen with a 64 bit
> > hypervisor and 32 bit kernel/userspace. The dump created is a 64 bit due
> > to the hypervisor but the dump kernel is 32 bit to match the domain 0
> > kernel.
> >
> > It's possibly less likely to be useful in a purely native scenario but I
> > see no reason to disallow it.
>
> For native Linux, would this cover the case where the pre-crash kernel
> is 64bit and the crashdump (post-crash) kernel is 32bit?
>

I think so. Though I have never tried this.

> > Signed-off-by: Ian Campbell <[EMAIL PROTECTED]>
> >
> > --- pristine-linux-2.6.18/include/asm-i386/elf.h 2006-09-20 04:42:06.0 +0100
> > +++ linux-2.6.18-xen/include/asm-i386/elf.h 2007-03-14 16:42:30.0 +
> > @@ -36,7 +36,7 @@
> > * This is used to ensure we don't load something for the wrong architecture.
> > */
> > #define elf_check_arch(x) \
> > - (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
> > + (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || ((x)->e_machine == EM_X86_64))

But I think changing this macro might run into issues. It is being used in a few places in the kernel, for example while loading a module. This will essentially mean that we allow loading 64bit x86_64 modules on 32bit i386 systems?

Similarly, load_elf_interp() is using it; again, will we allow loading an interp written for X86_64 on a 32bit i386 machine?

Should we create a separate macro, something like elf_check_allowed_arch(), to take care of such corner cases?

Thanks
Vivek
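A rough userspace sketch of the "separate macro" idea raised in this reply, to make the distinction concrete. It is not kernel code: the struct is a cut-down stand-in for the ELF header, and elf_core_check_arch() is a hypothetical name picked for this illustration (the message itself floats elf_check_allowed_arch()). Only the EM_* values are taken from the ELF machine number assignments.

```c
#include <stdint.h>

/* ELF e_machine values for the architectures under discussion. */
#define EM_386      3
#define EM_486      6
#define EM_X86_64  62

/* Stand-in for the ELF header; only the field the macros look at. */
struct elf_hdr_sketch {
	uint16_t e_machine;
};

/* Strict check, unchanged: still used for modules and interpreters. */
#define elf_check_arch(x) \
	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))

/*
 * Hypothetical relaxed check, meant only for the crash dump path:
 * accepts everything the strict check accepts, plus x86_64.
 */
#define elf_core_check_arch(x) \
	(elf_check_arch(x) || ((x)->e_machine == EM_X86_64))
```

With this split, a header whose e_machine is EM_X86_64 would pass elf_core_check_arch() but still fail elf_check_arch(), so the module loader and load_elf_interp() would keep rejecting 64 bit objects on a 32 bit kernel.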
[PATCH] Kprobes: Make kprobe.symbol_name const
From: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>

Kprobes doesn't scribble the kprobe.symbol_name field. It's only set by the module when registering the probe. Modules that exercise good hygiene using the "const" qualifier will see warnings...

warning: assignment discards qualifiers from pointer target type

Make struct kprobe.symbol_name const char *

Signed-off-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
Signed-off-by: Jim Keniston <[EMAIL PROTECTED]>
---
 include/linux/kprobes.h |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.21-rc3/include/linux/kprobes.h
===================================================================
--- linux-2.6.21-rc3.orig/include/linux/kprobes.h
+++ linux-2.6.21-rc3/include/linux/kprobes.h
@@ -78,7 +78,7 @@ struct kprobe {
 	kprobe_opcode_t *addr;
 
 	/* Allow user to indicate symbol name of the probe point */
-	char *symbol_name;
+	const char *symbol_name;
 
 	/* Offset into the symbol */
 	unsigned int offset;
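To illustrate the warning described in this changelog, here is a small standalone sketch. The struct is a made-up two-field stand-in for struct kprobe, and the symbol string is only an example; nothing here is taken from the kernel headers beyond the field name.

```c
/* Made-up stand-in for struct kprobe, reduced to the relevant fields. */
struct probe_sketch {
	const char *symbol_name;	/* const, as in the patched header */
	unsigned int offset;
};

/* A module that keeps its symbol name in read-only storage. */
static const char probed_symbol[] = "do_fork";

static struct probe_sketch make_probe(const char *name)
{
	struct probe_sketch p;

	/*
	 * With a plain "char *symbol_name" this assignment is what draws
	 * "assignment discards qualifiers from pointer target type".
	 */
	p.symbol_name = name;
	p.offset = 0;
	return p;
}
```

Calling make_probe(probed_symbol) compiles without a diagnostic once the field is const-qualified, and the caller keeps the guarantee that the string is never written through the probe.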
Re: [patch 2/5] fs: introduce new aops and infrastructure
On Wed, Mar 14, 2007 at 09:13:29PM -0700, Mark Fasheh wrote:
> Hi Nick,
>
> On Wed, Mar 14, 2007 at 02:38:22PM +0100, Nick Piggin wrote:
> > Introduce write_begin, write_end, and perform_write aops.
> >
> > These are intended to replace prepare_write and commit_write with more
> > flexible alternatives that are also able to avoid the buffered write
> > deadlock problems efficiently (which prepare_write is unable to do).
>
> > Index: linux-2.6/include/linux/fs.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/fs.h
> > +++ linux-2.6/include/linux/fs.h
> > @@ -449,6 +449,17 @@ struct address_space_operations {
> > */
> > int (*prepare_write)(struct file *, struct page *, unsigned, unsigned);
> > int (*commit_write)(struct file *, struct page *, unsigned, unsigned);
> > +
> > + int (*write_begin)(struct file *, struct address_space *mapping,
> > + loff_t pos, unsigned len, int intr,
> > + struct page **pagep, void **fsdata);
> > + int (*write_end)(struct file *, struct address_space *mapping,
> > + loff_t pos, unsigned len, unsigned copied,
> > + struct page *page, void *fsdata);
>
> Are we going to get rid of the file and intr arguments btw? I'm not sure
> intr is useful, and mapping is probably enough to get whatever we inside
> ->write_begin / ->write_end.

Yeah, I was going to, but I had this version ready to go so decided to leave them in at the last minute. We can definitely take them out if people agree.

However a side note about intr -- I wonder if it might be wise to include a flags argument, in case we might want to add something like that later? (definitely if we do keep intr, then it should be done as a flag rather than its own int).

> Also, I noticed that you didn't export block_write_begin(),
> simple_write_begin(), block_write_end() and simple_write_end() - I think we
> want those for client modules.

Yep, simple oversight on my part.

> Attached is a quick patch to hook up the existing ocfs2 write code. This has
> been compile tested only for now - one of my test machines isn't
> cooperating, so a runtime test will have to wait until tommorrow.
>
> One interesting side effect is that we no longer pass AOP_TRUNCATE_PAGE up a
> level. This gives callers less to deal with. And it means that ocfs2 doesn't
> have to use the ocfs2_*_lock_with_page() cluster lock variants in
> ocfs2_block_write_begin() because it can order cluster locks outside of the
> page lock there.

OK that's very cool. I was hoping that would be the case. If GFS2 can avoid that too, then we might be able to get rid of AOP_TRUNCATE_PAGE handling from the legacy prepare/commit_write paths, which will make them simpler.

> My ocfs2 write rework will be a more serious user of these stuff, including
> the fsdata backpointer. That code will also eliminate the entire
> ocfs2_*_lock_with_page() cluster locking workarounds for write (they'll have
> to remain for ->readpage()). I'm beginning work on cleaning those ocfs2
> patches up and getting them plugged into this stuff. I hope to post them in
> the next day or two.

OK, well I'll add this to my queue for now, and post the full patchset after incorporating feedback I've had so far, and doing more testing, so people can actually apply them and boot kernels.

Thanks,
Nick
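The "flag rather than its own int" remark in this reply could look roughly like the sketch below. Everything in it is hypothetical: the flag name, the argument struct and the helper are invented for illustration and are not the prototypes from the patch under review.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for the separate "int intr" argument. */
#define WRITE_BEGIN_FLAG_INTR	(1u << 0)

/* Cut-down stand-in for the write_begin arguments; not the kernel prototype. */
struct write_begin_args {
	long long pos;
	unsigned len;
	unsigned flags;		/* room for further bits later on */
};

/* A filesystem would test the bit instead of reading a dedicated int. */
static bool write_begin_interruptible(const struct write_begin_args *args)
{
	return (args->flags & WRITE_BEGIN_FLAG_INTR) != 0;
}
```

The point of the design is that a later behaviour switch becomes one more bit in flags, so the aop signature does not need to change again.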
[patch 5/5] eventfd+KAIO - KAIO eventfd support (example/maybe-broken) ...
This is another example about how to add eventfd support to the current KAIO code. The KAIO code simply signals the eventfd fd when events are ready, and this triggers a POLLIN in the fd. I made a quick test program to verify the patch, and it runs fine here: http://www.xmailserver.org/eventfd-aio-test.c The test program uses poll(2), but it'd, of course, work with epoll too. This can allow to schedule both block I/O and other poll-able devices requests, and wait for results using select/poll/epoll. Signed-off-by: Davide Libenzi - Davide Index: linux-2.6.20.ep2/fs/aio.c === --- linux-2.6.20.ep2.orig/fs/aio.c 2007-03-14 20:51:32.0 -0700 +++ linux-2.6.20.ep2/fs/aio.c 2007-03-14 20:54:37.0 -0700 @@ -30,6 +30,7 @@ #include #include #include +#include #include #include @@ -422,6 +423,7 @@ req->private = NULL; req->ki_iovec = NULL; INIT_LIST_HEAD(>ki_run_list); + req->ki_eventfd = ERR_PTR(-EINVAL); /* Check if the completion queue has enough free space to * accept an event from this io. @@ -463,6 +465,8 @@ { assert_spin_locked(>ctx_lock); + if (!IS_ERR(req->ki_eventfd)) + fput(req->ki_eventfd); if (req->ki_dtor) req->ki_dtor(req); if (req->ki_iovec != >ki_inline_vec) @@ -947,6 +951,13 @@ return 1; } + /* +* Check if the user asked us to deliver the result through an +* eventfd. +*/ + if (unlikely(!IS_ERR(iocb->ki_eventfd))) + eventfd_signal(iocb->ki_eventfd, 1); + info = >ring_info; /* add a completion event to the ring buffer. @@ -1556,6 +1567,18 @@ fput(file); return -EAGAIN; } + if (iocb->aio_resfd != 0) { + /* +* If the aio_resfd field of the iocb is not zero, get an +* instance of the file* now. This will be the place to deliver +* AIO results to. 
+*/ + req->ki_eventfd = eventfd_fget((int) iocb->aio_resfd); + if (IS_ERR(req->ki_eventfd)) { + ret = PTR_ERR(req->ki_eventfd); + goto out_put_req; + } + } req->ki_filp = file; ret = put_user(req->ki_key, _iocb->aio_key); Index: linux-2.6.20.ep2/include/linux/aio.h === --- linux-2.6.20.ep2.orig/include/linux/aio.h 2007-03-14 20:51:32.0 -0700 +++ linux-2.6.20.ep2/include/linux/aio.h2007-03-14 20:54:37.0 -0700 @@ -119,6 +119,12 @@ struct list_headki_list;/* the aio core uses this * for cancellation */ + + /* +* If the aio_resfd field of the userspace iocb is not zero, +* this is the underlying file* to deliver event to. +*/ + struct file *ki_eventfd; }; #define is_sync_kiocb(iocb)((iocb)->ki_key == KIOCB_SYNC_KEY) Index: linux-2.6.20.ep2/include/linux/aio_abi.h === --- linux-2.6.20.ep2.orig/include/linux/aio_abi.h 2007-03-14 20:51:32.0 -0700 +++ linux-2.6.20.ep2/include/linux/aio_abi.h2007-03-14 20:56:00.0 -0700 @@ -84,7 +84,11 @@ /* extra parameters */ __u64 aio_reserved2; /* TODO: use this for a (struct sigevent *) */ - __u64 aio_reserved3; + __u32 aio_reserved3; + /* +* If different from 0, this is an eventfd to deliver AIO results to +*/ + __u32 aio_resfd; }; /* 64 bytes */ #undef IFBIG - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 1/5] eventfd+KAIO - anonymous inode source ...
This patch add an anonymous inode source, to be used for files that need and inode only in order to create a file*. We do not care of having an inode for each file, and we do not even care of having different names in the associated dentries (dentry names will be same for classes of file*). This allow code reuse, and will be used by epoll, signalfd and timerfd (and whatever else there'll be). Signed-off-by: Davide Libenzi - Davide Index: linux-2.6.20.ep2/fs/anon_inodes.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-2.6.20.ep2/fs/anon_inodes.c 2007-03-10 15:57:47.0 -0800 @@ -0,0 +1,203 @@ +/* + * fs/anon_inodes.c + * + * Copyright (C) 2007 Davide Libenzi + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + + + +static int ainofs_delete_dentry(struct dentry *dentry); +static struct inode *aino_getinode(void); +static struct inode *aino_mkinode(void); +static int ainofs_get_sb(struct file_system_type *fs_type, int flags, +const char *dev_name, void *data, struct vfsmount *mnt); + + + +static struct vfsmount *aino_mnt __read_mostly; +static struct inode *aino_inode; +static struct file_operations aino_fops = { }; +static struct file_system_type aino_fs_type = { + .name = "ainofs", + .get_sb = ainofs_get_sb, + .kill_sb= kill_anon_super, +}; +static struct dentry_operations ainofs_dentry_operations = { + .d_delete = ainofs_delete_dentry, +}; + + + +int aino_getfd(int *pfd, struct inode **pinode, struct file **pfile, + char const *name, const struct file_operations *fops, void *priv) +{ + struct qstr this; + struct dentry *dentry; + struct inode *inode; + struct file *file; + int error, fd; + + error = -ENFILE; + file = get_empty_filp(); + if (!file) + goto eexit_1; + + inode = aino_getinode(); + if (IS_ERR(inode)) { + error = PTR_ERR(inode); + goto eexit_2; + } + + error = get_unused_fd(); + if (error < 0) + goto eexit_3; + fd = error; + + /* +* Link the inode to a directory entry by creating 
a unique name +* using the inode sequence number. +*/ + error = -ENOMEM; + this.name = name; + this.len = strlen(name); + this.hash = 0; + dentry = d_alloc(aino_mnt->mnt_sb->s_root, ); + if (!dentry) + goto eexit_4; + dentry->d_op = _dentry_operations; + /* Do not publish this dentry inside the global dentry hash table */ + dentry->d_flags &= ~DCACHE_UNHASHED; + d_instantiate(dentry, inode); + + file->f_path.mnt = mntget(aino_mnt); + file->f_path.dentry = dentry; + file->f_mapping = inode->i_mapping; + + file->f_pos = 0; + file->f_flags = O_RDONLY; + file->f_op = fops; + file->f_mode = FMODE_READ; + file->f_version = 0; + file->private_data = priv; + + fd_install(fd, file); + + *pfd = fd; + *pinode = inode; + *pfile = file; + return 0; + +eexit_4: + put_unused_fd(fd); +eexit_3: + iput(inode); +eexit_2: + put_filp(file); +eexit_1: + return error; +} + + +static int ainofs_delete_dentry(struct dentry *dentry) +{ + /* +* We faked vfs to believe the dentry was hashed when we created it. +* Now we restore the flag so that dput() will work correctly. +*/ + dentry->d_flags |= DCACHE_UNHASHED; + return 1; +} + + +static struct inode *aino_getinode(void) +{ + return igrab(aino_inode); +} + + +/* + * A single inode exist for all aino files. On the contrary of pipes, + * aino inodes has no per-instance data associated, so we can avoid + * the allocation of multiple of them. + */ +static struct inode *aino_mkinode(void) +{ + int error = -ENOMEM; + struct inode *inode = new_inode(aino_mnt->mnt_sb); + + if (!inode) + goto eexit_1; + + inode->i_fop = _fops; + + /* +* Mark the inode dirty from the very beginning, +* that way it will never be moved to the dirty +* list because mark_inode_dirty() will think +* that it already _is_ on the dirty list. 
+*/ + inode->i_state = I_DIRTY; + inode->i_mode = S_IRUSR | S_IWUSR; + inode->i_uid = current->fsuid; + inode->i_gid = current->fsgid; + inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + return inode; + +eexit_1: + return ERR_PTR(error); +} + + +static int ainofs_get_sb(struct file_system_type *fs_type, int flags, +const char *dev_name, void *data, struct vfsmount *mnt) +{ + return get_sb_pseudo(fs_type, "aino:", NULL, AINOFS_MAGIC, mnt); +} + +
[patch 3/5] eventfd+KAIO - eventfd wire up i386 arch ...
This patch wires the eventfd system call to the i386 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.20.ep2/arch/i386/kernel/syscall_table.S
===================================================================
--- linux-2.6.20.ep2.orig/arch/i386/kernel/syscall_table.S 2007-03-14 20:51:36.0 -0700
+++ linux-2.6.20.ep2/arch/i386/kernel/syscall_table.S 2007-03-14 20:54:34.0 -0700
@@ -321,3 +321,4 @@
 	.long sys_epoll_pwait
 	.long sys_signalfd /* 320 */
 	.long sys_timerfd
+	.long sys_eventfd
Index: linux-2.6.20.ep2/include/asm-i386/unistd.h
===================================================================
--- linux-2.6.20.ep2.orig/include/asm-i386/unistd.h 2007-03-14 20:51:36.0 -0700
+++ linux-2.6.20.ep2/include/asm-i386/unistd.h 2007-03-14 20:54:34.0 -0700
@@ -327,10 +327,11 @@
 #define __NR_epoll_pwait 319
 #define __NR_signalfd 320
 #define __NR_timerfd 321
+#define __NR_eventfd 322
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 322
+#define NR_syscalls 323
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
[patch 2/5] eventfd+KAIO - eventfd core ...
This is a very simple and light file descriptor, that can be used as event wait/dispatch by userspace (both wait and dispatch) and by the kernel (dispatch only). When used in the kernel, it can offer an fd-bridge to enable functionalities like KAIO or syslets/threadlets to signal to an fd the completion of certain operations.

The API is:

int eventfd(unsigned int count);

The eventfd API accepts an initial "count" parameter, and returns an eventfd fd. It supports poll(2) (POLLIN), read(2) and write(2). The read(2) function reads the __u64 counter value, and resets the internal value to zero. The write(2) call writes a __u64 count value, and adds it to the current counter. The eventfd fd supports O_NONBLOCK also.

On the kernel side, we have:

struct file *eventfd_fget(int fd);
int eventfd_signal(struct file *file, unsigned int n);

The eventfd_fget() should be called to get a struct file* from an eventfd fd (this is an fget() + check of f_op being an eventfd fops pointer). The kernel can then call eventfd_signal() every time it wants to post an event to userspace. The eventfd_signal() function can be called from any context.
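A small userspace sketch of the read/write/poll semantics described above. One loud assumption: it uses the eventfd() wrapper from <sys/eventfd.h> as shipped by current glibc, which takes an extra flags argument and so differs from the one-argument form proposed in this patch. The helper names are made up for the illustration.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Add n to the eventfd counter. Returns 0 on success, -1 on error. */
static int post_events(int efd, uint64_t n)
{
	return write(efd, &n, sizeof(n)) == (ssize_t)sizeof(n) ? 0 : -1;
}

/* Fetch the accumulated counter; the read also resets it to zero. */
static int drain_events(int efd, uint64_t *out)
{
	return read(efd, out, sizeof(*out)) == (ssize_t)sizeof(*out) ? 0 : -1;
}

/* Non-blocking check: POLLIN is reported while the counter is non-zero. */
static int events_pending(int efd)
{
	struct pollfd pfd = { .fd = efd, .events = POLLIN };

	return poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN) != 0;
}
```

A caller would create the descriptor with eventfd(0, 0), post from one thread or process, and either block in read(2) or add the fd to a poll/select/epoll set alongside other descriptors, which is the use case the patch series is after.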
Signed-off-by: Davide Libenzi - Davide Index: linux-2.6.20.ep2/fs/Makefile === --- linux-2.6.20.ep2.orig/fs/Makefile 2007-03-12 11:27:58.0 -0700 +++ linux-2.6.20.ep2/fs/Makefile2007-03-14 17:31:35.0 -0700 @@ -11,7 +11,7 @@ attr.o bad_inode.o file.o filesystems.o namespace.o aio.o \ seq_file.o xattr.o libfs.o fs-writeback.o \ pnode.o drop_caches.o splice.o sync.o utimes.o \ - stack.o anon_inodes.o signalfd.o timerfd.o + stack.o anon_inodes.o signalfd.o timerfd.o eventfd.o ifeq ($(CONFIG_BLOCK),y) obj-y += buffer.o bio.o block_dev.o direct-io.o mpage.o ioprio.o Index: linux-2.6.20.ep2/include/linux/syscalls.h === --- linux-2.6.20.ep2.orig/include/linux/syscalls.h 2007-03-13 16:40:46.0 -0700 +++ linux-2.6.20.ep2/include/linux/syscalls.h 2007-03-14 19:31:56.0 -0700 @@ -605,6 +605,7 @@ asmlinkage long sys_signalfd(int ufd, sigset_t __user *user_mask, size_t sizemask); asmlinkage long sys_timerfd(int ufd, int clockid, int flags, const struct itimerspec __user *utmr); +asmlinkage long sys_eventfd(unsigned int count); int kernel_execve(const char *filename, char *const argv[], char *const envp[]); Index: linux-2.6.20.ep2/fs/eventfd.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-2.6.20.ep2/fs/eventfd.c 2007-03-14 20:42:33.0 -0700 @@ -0,0 +1,259 @@ +/* + * fs/eventfd.c + * + * Copyright (C) 2007 Davide Libenzi + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + + + +struct eventfd_ctx { + spinlock_t lock; + wait_queue_head_t wqh; + __u64 count; +}; + + +static void eventfd_cleanup(struct eventfd_ctx *ctx); +static int eventfd_close(struct inode *inode, struct file *file); +static unsigned int eventfd_poll(struct file *file, poll_table *wait); +static ssize_t eventfd_read(struct file *file, char __user *buf, size_t count, + loff_t *ppos); +static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t count, +loff_t *ppos); + + + +static const struct 
file_operations eventfd_fops = { + .release= eventfd_close, + .poll = eventfd_poll, + .read = eventfd_read, + .write = eventfd_write, +}; +static struct kmem_cache *eventfd_ctx_cachep; + + + + +struct file *eventfd_fget(int fd) +{ + struct file *file; + + file = fget(fd); + if (!file) + return ERR_PTR(-EBADF); + if (file->f_op != _fops) { + fput(file); + return ERR_PTR(-EINVAL); + } + + return file; +} + + +int eventfd_signal(struct file *file, unsigned int n) +{ + struct eventfd_ctx *ctx = file->private_data; + int res = 0; + unsigned long flags; + + spin_lock_irqsave(>lock, flags); + if (ULLONG_MAX - ctx->count <= n) + res = -EINVAL; + else + ctx->count += n; + if (waitqueue_active(>wqh)) + wake_up_locked(>wqh); + spin_unlock_irqrestore(>lock, flags); + + return res; +} + + +asmlinkage long sys_eventfd(unsigned int count) +{ + int error, fd; + struct eventfd_ctx *ctx; + struct file *file; + struct inode *inode; + + ctx = kmem_cache_alloc(eventfd_ctx_cachep, GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + init_waitqueue_head(>wqh); +
[patch 4/5] eventfd+KAIO - eventfd wire up x86_64 arch ...
This patch wires the eventfd system call to the x86_64 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.20.ep2/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.20.ep2.orig/arch/x86_64/ia32/ia32entry.S 2007-03-14 20:51:34.0 -0700
+++ linux-2.6.20.ep2/arch/x86_64/ia32/ia32entry.S 2007-03-14 20:54:36.0 -0700
@@ -721,4 +721,5 @@
 	.quad sys_epoll_pwait
 	.quad sys_signalfd /* 320 */
 	.quad sys_timerfd
+	.quad sys_eventfd
 ia32_syscall_end:
Index: linux-2.6.20.ep2/include/asm-x86_64/unistd.h
===
--- linux-2.6.20.ep2.orig/include/asm-x86_64/unistd.h 2007-03-14 20:51:34.0 -0700
+++ linux-2.6.20.ep2/include/asm-x86_64/unistd.h 2007-03-14 20:54:36.0 -0700
@@ -623,8 +623,10 @@ __SYSCALL(__NR_signalfd, sys_signalfd)
 #define __NR_timerfd 281
 __SYSCALL(__NR_timerfd, sys_timerfd)
+#define __NR_eventfd 282
+__SYSCALL(__NR_eventfd, sys_eventfd)
-#define __NR_syscall_max __NR_timerfd
+#define __NR_syscall_max __NR_eventfd
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
Re: [patch 2/5] fs: introduce new aops and infrastructure
Hi Nick, On Wed, Mar 14, 2007 at 02:38:22PM +0100, Nick Piggin wrote: > Introduce write_begin, write_end, and perform_write aops. > > These are intended to replace prepare_write and commit_write with more > flexible alternatives that are also able to avoid the buffered write > deadlock problems efficiently (which prepare_write is unable to do). > Index: linux-2.6/include/linux/fs.h > === > --- linux-2.6.orig/include/linux/fs.h > +++ linux-2.6/include/linux/fs.h > @@ -449,6 +449,17 @@ struct address_space_operations { >*/ > int (*prepare_write)(struct file *, struct page *, unsigned, unsigned); > int (*commit_write)(struct file *, struct page *, unsigned, unsigned); > + > + int (*write_begin)(struct file *, struct address_space *mapping, > + loff_t pos, unsigned len, int intr, > + struct page **pagep, void **fsdata); > + int (*write_end)(struct file *, struct address_space *mapping, > + loff_t pos, unsigned len, unsigned copied, > + struct page *page, void *fsdata); Are we going to get rid of the file and intr arguments btw? I'm not sure intr is useful, and mapping is probably enough to get whatever we inside ->write_begin / ->write_end. Also, I noticed that you didn't export block_write_begin(), simple_write_begin(), block_write_end() and simple_write_end() - I think we want those for client modules. Attached is a quick patch to hook up the existing ocfs2 write code. This has been compile tested only for now - one of my test machines isn't cooperating, so a runtime test will have to wait until tommorrow. One interesting side effect is that we no longer pass AOP_TRUNCATE_PAGE up a level. This gives callers less to deal with. And it means that ocfs2 doesn't have to use the ocfs2_*_lock_with_page() cluster lock variants in ocfs2_block_write_begin() because it can order cluster locks outside of the page lock there. My ocfs2 write rework will be a more serious user of these stuff, including the fsdata backpointer. 
That code will also eliminate the entire ocfs2_*_lock_with_page() cluster locking workarounds for write (they'll have to remain for ->readpage()). I'm beginning work on cleaning those ocfs2 patches up and getting them plugged into this stuff. I hope to post them in the next day or two. --Mark -- Mark Fasheh Senior Software Developer, Oracle [EMAIL PROTECTED] ocfs2: Convert to new aops Turn ocfs2_prepare_write() and ocfs2_commit_write() into ocfs2_write_begin() and ocfs2_write_end(). Signed-off-by: Mark Fasheh <[EMAIL PROTECTED]> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index 93628b0..e7bcbbd 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -293,29 +293,30 @@ int ocfs2_prepare_write_nolock(struct in } /* - * ocfs2_prepare_write() can be an outer-most ocfs2 call when it is called - * from loopback. It must be able to perform its own locking around - * ocfs2_get_block(). + * ocfs2_write_begin() can be an outer-most ocfs2 call when it is + * called from elsewhere in the kernel. It must be able to perform its + * own locking around ocfs2_get_block(). 
*/ -static int ocfs2_prepare_write(struct file *file, struct page *page, - unsigned from, unsigned to) +static int ocfs2_write_begin(struct file *file, struct address_space *mapping, +loff_t pos, unsigned len, int intr, +struct page **pagep, void **fsdata) { - struct inode *inode = page->mapping->host; + struct inode *inode = mapping->host; int ret; - mlog_entry("(0x%p, 0x%p, %u, %u)\n", file, page, from, to); - - ret = ocfs2_meta_lock_with_page(inode, NULL, 0, page); + ret = ocfs2_meta_lock(inode, NULL, 0); if (ret != 0) { mlog_errno(ret); goto out; } - ret = ocfs2_prepare_write_nolock(inode, page, from, to); + down_read(_I(inode)->ip_alloc_sem); + ret = block_write_begin(file, mapping, pos, len, intr, pagep, fsdata, + ocfs2_get_block); + up_read(_I(inode)->ip_alloc_sem); ocfs2_meta_unlock(inode, 0); out: - mlog_exit(ret); return ret; } @@ -388,16 +389,21 @@ out: return handle; } -static int ocfs2_commit_write(struct file *file, struct page *page, - unsigned from, unsigned to) +static int ocfs2_write_end(struct file *file, struct address_space *mapping, + loff_t pos, unsigned len, unsigned copied, + struct page *page, void *fsdata) { int ret; + unsigned from, to; struct buffer_head *di_bh = NULL; struct inode *inode = page->mapping->host; handle_t *handle = NULL; struct ocfs2_dinode *di; - mlog_entry("(0x%p, 0x%p, %u,
Re: [stable] [PATCH] Fix COMPAT_VDSO regression bug
On Thu, Mar 15, 2007 at 12:38:40AM +0100, Leroy van Logchem wrote:
>
> Revert "[PATCH] Fix CONFIG_COMPAT_VDSO"
> This reverts commit a1f3bb9ae4497a2ed3eac773fd7798ac33a0371f.
>
> Several systems couldn't boot using CONFIG_HIGHMEM64G=y as
> reported in bug #8040. Reverting the above patch solved the problem.

What stable version did you revert this in that solved your problem?

thanks,

greg k-h
Re: [patch 2/5] fs: introduce new aops and infrastructure
On Wed, Mar 14, 2007 at 10:46:25PM +0100, Mariusz Kozlowski wrote:
> Hello,
>
> I guess no need to define 'ret' twice here.

[...]

Hi Mariusz,

Thanks, I'll clean that up.

Nick
Re: [patch 2/5] fs: introduce new aops and infrastructure
On Thu, Mar 15, 2007 at 12:28:04AM +0300, Dmitriy Monakhov wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > > + > > +int pagecache_write_end(struct file *file, struct address_space *mapping, > > + loff_t pos, unsigned len, unsigned copied, > > + struct page *page, void *fsdata) > > +{ > > + const struct address_space_operations *aops = mapping->a_ops; > > + int ret; > > + > > + if (aops->write_begin) > > + ret = aops->write_end(file, mapping, pos, len, copied, page, > > fsdata); > > + else { > > + int ret; > > + unsigned offset = pos & (PAGE_CACHE_SIZE - 1); > > + struct inode *inode = mapping->host; > > + > > + flush_dcache_page(page); > > + ret = aops->commit_write(file, page, offset, offset+len); > > + if (ret < 0) { > > + unlock_page(page); > > + page_cache_release(page); > > + if (pos + len > inode->i_size) > > + vmtruncate(inode, inode->i_size); > > + } else > > + ret = copied; > What about AOP_TRUNCATED_PAGE? Off corse we can't just "goto retry" here :) , > but we may return it to caller and let's caller handle it. Yeah AOP_TRUNCATED_PAGE... I'm _hoping_ that OCFS2 and GFS2 will be able to avoid that using write_begin/write_end, so the caller will not have to know anything about it. I don't know that commit_write can even return AOP_TRUNCATED_PAGE... we should have gathered all our locks in prepare_write. > > + } > > + > > + return copied; > if ->commit_write return non negative value we return with sill locked page > look above at [1] > may be it will be unlocked by caller? I guess no it was just forgoten. Yeah, thanks. I think I converted all my filesystems to use write_begin / write_end, so I probably didn't test this path :P. I do plan to go through and try to individually test error cases and stress test it over the next couple of days. 
> > +void page_zero_new_buffers(struct page *page, unsigned from, unsigned to) > > +{ > > + unsigned int block_start, block_end; > > + struct buffer_head *head, *bh; > > + > > + BUG_ON(!PageLocked(page)); > > + if (!page_has_buffers(page)) > > + return;__block_prepare_write > > + > > + bh = head = page_buffers(page); > > block_start = 0; > > do { > > - block_end = block_start+blocksize; > > - if (block_end <= from) > > - goto next_bh; > > - if (block_start >= to) > > - break; > > + block_end = block_start + bh->b_size; > > + > > if (buffer_new(bh)) { > > - void *kaddr; > > + if (block_end > from && block_start < to) { > > + if (!PageUptodate(page)) { > > + unsigned start, end; > > + void *kaddr; > > + > > + start = max(from, block_start); > > + end = min(to, block_end); > > + > > + kaddr = kmap_atomic(page, KM_USER0); > > + memset(kaddr+start, 0, block_end-end); > <<< At least this result in information leak in case of (stat == from) > just imagine fs with blocksize == 1k conains file with i_size == 4096 and > fist two blocks not mapped (hole), now invoke write op from 1023 to 2048. > For example we succeed in allocating first block, but faile while allocating > second > , then we call page_zero_new_buffers(...from == 1023, to == 2048) > and then zerro only one last byte for first block, and set is uptodate > After this we just do read( from == 0, to == 1023) and steal old block > content. When we first invoke the write op, it should see were doing a partial write into the first buffer and bring it uptodate first. I don't see the problem, but again I do need to go through and exercise various cases like this. 
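Dmitriy's scenario (1k blocks, a write from 1023 to 2048) can be worked through by hand. The sketch below is a userspace model of the range arithmetic only; struct zero_span and zero_span_for_block() are hypothetical names, not part of the patch. It reports both the length written in the quoted hunk (block_end - end) and the plain overlap length (end - start), without claiming which of the two was intended.

```c
#include <assert.h>

/* Hypothetical result type: what one buffer would have zeroed. */
struct zero_span {
	unsigned start;         /* first byte offset that would be zeroed */
	unsigned len_as_quoted; /* block_end - end, as in the quoted hunk */
	unsigned len_overlap;   /* end - start, overlap with [from, to) */
};

/* Models the start/end computation from the quoted hunk for one buffer. */
static struct zero_span zero_span_for_block(unsigned from, unsigned to,
					    unsigned block_start,
					    unsigned block_end)
{
	struct zero_span s;
	unsigned end;

	/* The quoted code only reaches memset() for overlapping buffers. */
	assert(block_end > from && block_start < to);

	s.start = from > block_start ? from : block_start; /* max() */
	end = to < block_end ? to : block_end;             /* min() */
	s.len_as_quoted = block_end - end;
	s.len_overlap = end - s.start;
	return s;
}
```

For the first buffer, [0, 1024), this gives start = 1023 with an overlap of one byte, which matches the "only one last byte" in Dmitriy's description; the expression as quoted evaluates to 0 for that buffer. For the second buffer, [1024, 2048), the overlap is 1024 bytes and the quoted expression is again 0.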
> > @@ -222,67 +221,47 @@ static int do_lo_send_aops(struct loop_d > > len = bvec->bv_len; > > while (len > 0) { > > sector_t IV; > > - unsigned size; > > + unsigned size, copied; > > int transfer_result; > > + struct page *page; > > + void *fsdata; > > > > IV = ((sector_t)index << (PAGE_CACHE_SHIFT - 9))+(offset >> 9); > > size = PAGE_CACHE_SIZE - offset; > > if (size > len) > > size = len; > > - page = grab_cache_page(mapping, index); > > - if (unlikely(!page)) > > + > > + ret = pagecache_write_begin(file, mapping, pos, size, 1, > > + , ); > > + if (ret) > > goto fail; > > - ret = aops->prepare_write(file, page, offset, > > - offset +
Re: New thread RDSL, post-2.6.20 kernels and amanda (tar) miss-fires
On Wednesday 14 March 2007, Ray Lee wrote: >On 3/13/07, Gene Heskett <[EMAIL PROTECTED]> wrote: >> On Tuesday 13 March 2007, Gene Heskett wrote: >> >On Tuesday 13 March 2007, Gene Heskett wrote: >> >>Greetings; >> >>Someone suggested a fresh thread for this. >> >> >> >>I now have my scripts more or less under control, and I can report >> >> that kernel-2.6.20.1 with no other patches does not exhibit the >> >> undesirable behaviour where tar thinks its all new, even when told >> >> to do a level 2 on a directory tree that hasn't been touched in >> >> months to update anything. >> >> >> >>Next up, 2.6.20.2, plain and with the latest RDSL-0.30 patch. >> > >> >And amanda/tar worked normally for 2.6.20.2 plain. >> > >> >Next up, 2.6.21-rc1 if it will build here. >> >> It built, it booted, and its busted big time. First, with an amdump >> running in the background, the machine is so close to unusable that I >> considered rebooting, but I needed the data to show the problem. I am >> losing the keyboard and mouse for a minute or more at a time but the >> keystrokes seem to be being registered so it eventually catches up. >> >> Disk i/o seems to be the killer according to gkrellm. >> >> But to give one an idea of the fits this is giving tar, I'll snip a >> line or 2 from an amstatus report here: >> coyote:/GenesAmandaHelper-0.6 1 planner: [dumps way too big, 138200 >> KB, must skip incremental dumps] >> >> Huh? 138.2GB? A 'du -h .' in that dir says 766megs. >> >> coyote:/root 1 4426m wait for dumping >> du -h says 5.0GB so that's ballpark, but its also a level 1, so maybe >> 20 megs is actually new since 15:57 this afternoon local. kmails >> final maildir is in that dir. >> >> This goes on for much of the amstatus report, very few of the reported >> sizes are close to sane. >> >> Now, can someone suggest a patch I can revert that might fix this? 
>> The total number of patches between 2.6.20 and 2.6.21-rc1 will have me >> building kernels to bisect this till the middle of June at this rate. > >In a previous email, you said you were using ext3. If that's the case, >there doesn't appear to be much going on in terms of patches between >2.6.20 and 2.6.21-rc1. The only one that even comes close to looking >like it might have an effect would only come in to play if you have a >filesystem that has ACL information, but is mounted by a kernel that >doesn't have ACL support. > >I have to echo wli here, I'm afraid, and recommend at least a *few* >bisections to help narrow down the list of suspect patches. > >There are tutorials out there for git users. I use the mercurial >repository, as I find the mercurial interface and workflow a lot more >intuitive, but it has the same capability. > >Even 2-5 bisections will greatly help others hunt the bug down. > >Ray Probably. But I've now put a week into this, and from some other clues I've collected, I'm beginning to think tar has a tummy ache. After all, and ls -lc reports totally sane mtimes. So why is tar going bonkers under kernels 2.6.21-rc*, with or without Cons patches? I've also spent a day now looking for a valid place to put a bugzilla entry against tar, but googles search results are sending me to gcc.gnu.org and this is NOT the correct bugzilla for a tar problem. Its no secret that with all the churn in tar over the last 5 years, worse churn than the kernel IMO in going from 2.0 to 2.6, that I'm not a fan of yet another _new_ version of tar, when what we just need is _one_ that works. It is not capable of executing the recovery command listed in the first block of every amdump file it (amdump) ever built right now, and I've played the equ of the 10,000 monkeys writing Shakespear for several hours trying. Damned frustrating is what it is. The error it reports seems to indicate that it cannot write through the pipes involved. 
But with tar's error reporting, who the hell knows for sure. Here is an example [EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k count=1 AMANDA: FILE 20070314104344 coyote /lib lev 1 comp .gz program /bin/tar To restore, position tape at start of file and run: dd if= bs=32k skip=1 | /bin/gzip -dc | /bin/tar -f - ... And the elipsis is an error if not removed. Then one is supposed to be able to redirect tars output with the usual >/tmp/test/ syntax So: [EMAIL PROTECTED] data]# dd if=00010.coyote._lib.1 bs=32k skip=1 | /bin/gzip -dc | /bin/tar -f - >/tmp/test/ -bash: /tmp/test/: Is a directory which is the return from any variation in how the redirect is done. So what is it that am I doing wrong in the above command line?, so I can add it to my helper scripts to be published eventually on zmanda.org. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Trying to establish voice contact ... please yell into keyboard. - To unsubscribe from this list: send the line
Re: [kvm-devel] [PATCH 01/15] KVM: Use a shared page for kernel/user communication when runing a vcpu
On Wed, 2007-03-14 at 21:38 -0500, Hollis Blanchard wrote:
> On Sun, 2007-03-11 at 15:53 +0200, Avi Kivity wrote:
> > Instead of passing a 'struct kvm_run' back and forth between the
> > kernel and userspace, allocate a page and allow the user to mmap() it.
> > This reduces needless copying and makes the interface expandable by
> > providing lots of free space.
>
> Do you provide for another means of accessing guest memory from host
> userspace? For example, how do you attach a host debugger to the guest?
>
> Xen uses an ioctl followed by mmap for this purpose, which is why I
> wonder about using mmap(/dev/kvm) for another purpose.

Never mind: I see now that you have separate device nodes for the VM vs each vcpu.

-Hollis
Re: [PATCH/RFC] PCI prepare/activate instead of enable to avoid IRQ storm and rogue DMA access
Andi Kleen wrote:
> Tejun Heo <[EMAIL PROTECTED]> writes:
>> Let's assume there's a device which shares its INTX IRQ line with
>> another device and the other one is already initialized. During boot,
>> due to BIOS's fault, bad hardware design or sheer bad luck, the device
>> has got a pending IRQ.
>
> This seems to be also common after kexec during kexec crashdumps where
> the device just continues doing what it did before the crash.
>
>> This patch expands the pci_set_master() approach. Instead of enabling
>> the device in one go, it's done in two steps - prepare and activate.
>> 'prepare' enables access to PCI configuration,
>
> I hope there aren't any new erratas triggered by this. Perhaps it would
> make sense to add some paranoia sleeps at least before touching other
> state?

Do you mean between disabling IRQ mechanisms and enabling PCI device in pcim_prepare_device()?

Thanks.

--
tejun
Re: [kvm-devel] [PATCH 01/15] KVM: Use a shared page for kernel/user communication when runing a vcpu
On Sun, 2007-03-11 at 15:53 +0200, Avi Kivity wrote:
> Instead of passing a 'struct kvm_run' back and forth between the
> kernel and userspace, allocate a page and allow the user to mmap() it.
> This reduces needless copying and makes the interface expandable by
> providing lots of free space.

Do you provide for another means of accessing guest memory from host userspace? For example, how do you attach a host debugger to the guest?

Xen uses an ioctl followed by mmap for this purpose, which is why I wonder about using mmap(/dev/kvm) for another purpose.

-Hollis
Re: [PATCH/RFC] PCI prepare/activate instead of enable to avoid IRQ storm and rogue DMA access
Stephen Hemminger wrote: The problem is the BIOS is busted on these machines. How much effort do we want to put into dealing with systems with broken BIOS? I would rather have the root cause fixed than creating a bandaid that has to be maintained for all the other architectures and platforms. For sky2/skge, it might be caused by broken BIOS. For some ATA devices, it's just the hardware which is designed that way. Also, under non-x86 machines and during resume, there's no BIOS to nudge chips into sane state. This is an existing problem which has to be solved. How much effort we are gonna put into it is certainly debatable. Also, the current implementation doesn't have any arch independent part. It's wholly contained in arch independent PCI layer, but it might be beneficial to have arch dependent hooks (IRQ line enable/disable?) in the future. What if the device with the IRQ problem is never loaded? Sometimes devices aren't loaded until after boot. What do you mean by loading a device? Do you mean loading driver for the device? The patch as posted is probably not a complete solution. We probably need to make sure during early boot and resume that all IRQ / bus master are turned off where possible and let low level drivers enable them as needed and after certain amount of initialization is performed. If you use MSI interrupts, they aren't shared so there isn't a problem. Maybe the root cause of this is bad MSI emulation handling in BIOS. Yes, if MSI is used things are better. Any change like this has to be done without changing device drivers. Changing the skge/sky2 drivers as special case is not acceptable. I dunno about that. What I'm proposing is alternative two-step PCI initialization step - the first step enables the device just enough for initialization/reset and the second one enables full access. We're doing part of it already for bus master. I'm proposing to expand that approach and make them handled by generic PCI layer. 
As you can see, it doesn't add noticeable complexity to drivers. I think it's even clearer than doing pci_set_master() explicitly. If this way of solving the problem is chosen, eventually most drivers should be converted to new initialization steps. And there is no way to do this without modifying low level driver. Only low level driver knows when full blown access can be enabled and such thing must happen before registering the device to upper layer (e.g. ATA/SCSI, netif). sky2/skge aren't exceptions. If this way of solving the problem is chosen, eventually most if not all drivers should be converted to new model. It may take two years, maybe five, but as a start just converting ATA and network drivers shouldn't take too long and that would help a lot of cases. Thanks. -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RSDL v0.30 cpu scheduler for mainline kernels
Con,

On Mon, Mar 12, 2007 at 10:58:11AM +1100, Con Kolivas wrote:
> There are updated patches for 2.6.20, 2.6.20.2, 2.6.21-rc3 and 2.6.21-rc3-mm2
> to bring RSDL up to version 0.30 for download here:

I tried this on a Core 2 Quad cpu system (the system has 4 cores on a single package). When I run SPECjbb2000 with the number of threads varying from 1-8, I see a ~4.5% perf regression with RSDL (compared to native 2.6.21-rc3) in the 8 threads case.

This, I think, is coming from the increased number of context switches when we have more than one thread (at the same user priority) on the same logical cpu.

Just to see the % increase in the number of context switches, I ran 8 infinite loops (simple while(1); 's) and with 2.6.21-rc3 I see ~70 context switches every second, whereas with RSDL I see ~530 context switches.

thanks,
suresh
Geode cs5530a magic (Was: Re: [PATCH] clean up mach_reboot_fixups)
Andi Kleen wrote:
> On Wednesday 14 March 2007 23:24, Jeremy Fitzhardinge wrote:
>
>> The reboot_fixups stuff seems to be a bit of a mess, specifically the
>> header is in linux/ when it's a purely i386-specific piece of code. I'm
>> not sure why it has its config option; it's only currently needed for
>> "geode-gx1/cs5530a", so perhaps whatever config option controls that
>> hardware should enable this?
>>
>
> Thanks. Looks good.

It looks like a cs5530a is a PATA driver in drivers/ata/pata_cs5530.c. Seems to me the cleanest fix is to register a reboot notifier in the driver and have it do the magic rather than have the special mach_reboot_fixups mechanism at all. Assuming it needs to be done at all...

Alan? Jaya?

J
Re: [PATCH 1/1] Allow i386 crash kernels to handle x86_64 dumps
On Wed, Mar 14, 2007 at 05:00:09PM +, Ian Campbell wrote:
> The specific case I am encountering is kdump under Xen with a 64 bit
> hypervisor and 32 bit kernel/userspace. The dump created is a 64 bit due
> to the hypervisor but the dump kernel is 32 bit to match the domain 0
> kernel.
>
> It's possibly less likely to be useful in a purely native scenario but I
> see no reason to disallow it.

For native Linux, would this cover the case where the pre-crash kernel is 64bit and the crashdump (post-crash) kernel is 32bit?

> Signed-off-by: Ian Campbell <[EMAIL PROTECTED]>
>
> --- pristine-linux-2.6.18/include/asm-i386/elf.h 2006-09-20 04:42:06.0 +0100
> +++ linux-2.6.18-xen/include/asm-i386/elf.h 2007-03-14 16:42:30.0 +
> @@ -36,7 +36,7 @@
>  * This is used to ensure we don't load something for the wrong architecture.
>  */
> #define elf_check_arch(x) \
> - (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
> + (((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || ((x)->e_machine == EM_X86_64))

I think it would be a bit nicer if this was < 80col wide, though obviously this doesn't affect the functionality.

diff --git a/include/asm-i386/elf.h b/include/asm-i386/elf.h
index 8d33c9b..cd894dd 100644
--- a/include/asm-i386/elf.h
+++ b/include/asm-i386/elf.h
@@ -36,7 +36,8 @@ typedef struct user_fxsr_struct elf_fpxregset_t;
  * This is used to ensure we don't load something for the wrong architecture.
  */
 #define elf_check_arch(x) \
-	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
+	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || \
+	 ((x)->e_machine == EM_X86_64))

 /*
  * These are used to set parameters in the core dumps.
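The effect of widening elf_check_arch() can be checked outside the kernel. This is a minimal sketch, assuming the e_machine values from Linux's ELF headers (EM_386 = 3, EM_486 = 6, EM_X86_64 = 62); struct elf_hdr_model, the _old/_new macro names and accepts() are hypothetical and exist only to compare the two variants side by side.

```c
#include <assert.h>
#include <stddef.h>

/* e_machine values as defined in Linux's ELF headers. */
#define EM_386    3
#define EM_486    6
#define EM_X86_64 62

/* Hypothetical stand-in for the one ELF header field the macro reads. */
struct elf_hdr_model {
	unsigned short e_machine;
};

/* The check before the patch. */
#define elf_check_arch_old(x) \
	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))

/* The check after the patch, wrapped to stay under 80 columns. */
#define elf_check_arch_new(x) \
	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || \
	 ((x)->e_machine == EM_X86_64))

/* Returns 1 if the header passes the chosen variant, 0 otherwise. */
static int accepts(const struct elf_hdr_model *h, int patched)
{
	assert(h != NULL);
	return patched ? elf_check_arch_new(h) : elf_check_arch_old(h);
}
```

An x86_64 header is rejected by the old check and accepted by the new one; i386 headers pass both, and any other machine type still fails both.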
Reiser4: Transparent compression support. Further development and compatibility.
Reiser4 file system: Transparent compression support. Further development and compatibility. A. Reiser4 cryptcompress file plugin(*) and its conversion(**) This is the second file plugin that realizes regular files in reiser4. Unlike previous one (unix-file plugin), cryptcompress plugin manages files with encrypted and(or) compressed bodies packed to metadata pages, so plain text is cached in data pages (pinned to inode's mapping), which don't participate in IO: at background commit their data get compressed with the following update of old compressed bodies. This update is going in so-called "squalloc" phase of the flush algorithm, so eventually everything will be tightly packed. And yes, metadata pages are supposed to be writebacked. Roughly speaking, cryptcompress file occupies more memory and smaller disk space then ordinary file (managed by unix-file plugin). In contrast with unix-file plugin, the smallest addressable unit is page cluster (in memory) and item cluster (on disk). Also cryptcompress plugin implements another, more economic approach in representing holes. However it calls the same low-level (node, etc) plugins, so you can have a "mixed" fileset on your reiser4 partition. See below about backward compatibility. To reduce cpu and memory usage when handling incompressible data one should assign proper compression mode plugin. The default one activates special hook in ->write() method of cryptcompress file plugin (only once per file's life, when starting to write from special offset in some iteration) which tries to estimate whether a file is compressible by testing its first logical cluster (64K by default). If evaluation result is negative, then fragments will be converted to extents, and management will be passed to unix-file plugin. Back conversion does not take place. 
If the evaluation result is positive, the file stays under cryptcompress plugin control, but compression will be dynamically switched by the flush manager in accordance with the policy implemented by the compression mode plugin. This heuristic looks mostly like improvisation and might be improved by modifying the compression mode plugin (***) (some statistical analysis is needed here to make sure we don't worsen the situation).

So let's summarize what we have in the cases where the primary evaluation performed by the default mode gets it wrong:

1. The file is incompressible, but its first logical cluster is compressible. In this case compression will be "turned off" at flush time, so we save only cpu, whereas memory consumption is wasteful, as the file stays under cryptcompress plugin control. Also, deleting a huge file built of fragments is not the fastest operation.

2. The file is compressible, but its first logical cluster is incompressible. In this case management will be passed to the unix-file plugin forever (not the worst situation).

---

(*) "plugins" means "internal reiser4 modules". Perhaps "plugin" is a bad name, but let us use it in the context of reiser4 (at least for now). Each plugin is labeled by a unique pair (type, id), so a plugin's name is composed of the id name (first) and the type name. For example, "extent item plugin" means the plugin of item type that manages extent pointers in reiser4. Plugins of file type service the VFS entry points.

(**) plugin conversion means passing management to another plugin of the same plugin type: (type, id1) -> (type, id2), followed (or preceded) by conversion of the controlled objects (tail conversion is a classic example of such an operation).

(***) when modifying an existing plugin we should be careful (see below about backward compatibility).

B. Getting started with cryptcompress plugin

**
Warning! Warning! Warning! This stuff is experimental. Do not store important data in the files managed by cryptcompress plugin.
It can be lost with no chance of recovering it. Also, creating even one such file on your production Reiser4 partition can cause its unrecoverable crash. It is not a joke!
**

NOTE: We don't consider using the pseudo interface (metas), as it is still deprecated.

1. Build and boot the latest kernel of the -mm series.
2. Build and install the latest version of reiser4progs (1.0.6 for now).
3. Have a free partition (not one in production use).
4. Format it with mkfs.reiser4. Use the option -o to override "create" and maybe other related plugins that mkfs installs to the root directory by default. The list of default settings is available via option -p. The list of all possible settings is available via option -l.

For example:

"mkfs.reiser4 -o create=ccreg40 /dev/xxx" specifies the cryptcompress file plugin with (default) lzo1 compression.

"mkfs.reiser4 -o create=ccreg40,compress=gzip1 /dev/xxx" specifies the cryptcompress file plugin with gzip1 compression.

Description of all cryptcompress-related settings can be found
Re: [PATCH 0/8] x86 boot, pda and gdt cleanups
Rusty Russell wrote:
> Hmm, this invalidated my assumption that write_gdt_entry is always a
> write to this cpu's active gdt. Better fix is not to call it twice
> anyway...
>

No, I don't think that's true. I implemented the write_*_entry functions with the assumption they could be called either on setup or on an in-use entry. I think it's good policy to use it all the time anyway, since the pv_ops backend might want to fiddle with the values on the way through.

I tried to avoid calling init_gdt twice, but it seemed cleaner to just let it happen.

> Getting rid of the call in smp_prepare_boot_cpu currently works, but
> it's fragile: __get_cpu_var(x) && per_cpu(x, smp_processor_id()) will
> differ, and changes made to __get_cpu_var(x) will vanish...
>

Yes. I think it's definitely a good idea to call init_gdt asap after doing the percpu setup.

> Fortunately, UP doesn't have to call init_gdt at all, so I think it's
> better to place it in smp_prepare_boot_cpu only and then clean up the UP
> code. I'll try now...
>

It doesn't? The per-cpu gdt is the same as the boot gdt?

J
Re: [PATCH] mm/filemap.c: unconditionally call mark_page_accessed
Dave Kleikamp wrote: On Wed, 2007-03-14 at 22:33 +0100, Andreas Mohr wrote: Hi, On Wed, Mar 14, 2007 at 03:55:41PM -0500, Dave Kleikamp wrote: On Wed, 2007-03-14 at 15:58 -0400, Ashif Harji wrote: This patch unconditionally calls mark_page_accessed to prevent pages, especially for small files, from being evicted from the page cache despite frequent access. I guess the downside to this is if a reader is reading a large file, or several files, sequentially with a small read size (smaller than PAGE_SIZE), the pages will be marked active after just one read pass. My gut says the benefits of this patch outweigh the cost. I would expect real-world backup apps, etc. to read at least PAGE_SIZE. I also think that the patch is somewhat problematic, since the original intention seems to have been a reduction of the number of (expensive?) mark_page_accessed() calls, mark_page_accessed() isn't expensive. If called repeatedly, starting with the third call, it will check two page flags and return. The only real expense is that the page appears busier than it may be and will be retained in memory longer than it should. If we allow mark_page_accessed() called multiple times for a single page, a scan of large file with small-size reads would flush the buffer cache. mark_page_accessed() also requests lru_lock when moving page from inactive_list to active_list. It may also increase lock contention. but this of course falls flat on its face in case of permanent single-page accesses or accesses with progressing but very small read size (single-byte reads or so), since the cached page content will expire eventually due to lack of mark_page_accessed() updates; thus this patch decided to call mark_page_accessed() unconditionally which may be a large performance penalty for subsequent tiny-sized reads. Any application doing many tiny-sized reads isn't exactly asking for great performance. 
I've been thinking hard how to avoid the mark_page_accessed() starvation in case of a fixed, (almost) non-changing access state, but this seems hard since it'd seem we need some kind of state management here to figure out good intervals of when to call mark_page_accessed() *again* for this page. E.g. despite non-changing access patterns you could still call mark_page_accessed() every 32 calls or so to avoid expiry, but this would need extra helper variables. A rather ugly way to do it may be to abuse ra.cache_hit or ra.mmap_hit content with a if ((prev_index != index) || (ra.cache_hit % 32 == 0)) mark_page_accessed(page); This assumes that ra.cache_hit gets incremented for every access (haven't checked whether this is the case). That way (combined with an enhanced comment properly explaining the dilemma) you would avoid most mark_page_accessed() invocations of subsequent same-page reads but still do page status updates from time to time to avoid page deprecation. Does anyone think this would be acceptable? Any better idea? I wouldn't go looking for anything more complicated than Ashif's patch, unless testing shows it to be harmful in some realistic workload. Andreas Mohr P.S.: since I'm not too familiar with this area I could be rather wrong after all... I could be missing something as well. :-) Shaggy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
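A minimal sketch of the "re-mark every 32 same-page reads" idea discussed above, written as standalone C rather than kernel code. The names struct read_state, same_page_hits, MARK_INTERVAL and should_mark_accessed() are hypothetical and exist only for illustration; the thread itself only floats reusing ra.cache_hit and notes that nobody had checked whether it is incremented on every access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file read state; not the kernel's file_ra_state. */
struct read_state {
	unsigned long prev_index;      /* page index of the previous read */
	unsigned long same_page_hits;  /* consecutive reads that hit prev_index */
};

/* Assumed interval, taken from the "every 32 calls or so" suggestion. */
#define MARK_INTERVAL 32

/*
 * Decide whether this read should refresh the page's "accessed" status.
 * A read of a new page always does; repeated reads of the same page do so
 * only on every MARK_INTERVAL-th hit, so the page does not age out but the
 * marking path is not taken on every tiny read.
 */
static bool should_mark_accessed(struct read_state *st, unsigned long index)
{
	assert(st != NULL);

	if (st->prev_index != index) {
		st->prev_index = index;
		st->same_page_hits = 0;
		return true;
	}

	st->same_page_hits++;
	return (st->same_page_hits % MARK_INTERVAL) == 0;
}
```

In this sketch a caller would invoke mark_page_accessed(page) only when should_mark_accessed() returns true. Whether the extra counter is worth it compared with the unconditional call in Ashif's patch is exactly the open question in the thread.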
Re: mm: migrate_pages using
On Mon, 12 Mar 2007 19:57:58 +0100 Michal Hocko <[EMAIL PROTECTED]> wrote:

> What do you think about that. Is this way correct?
>

If you are sure that your "original" page is never freed while you are migrating it, maybe.

-Kame
Re: PROBLEM: 2.6.20-1 not working on ibook g4 (BUG/Oops)
2007/3/13, Benjamin Herrenschmidt <[EMAIL PROTECTED]>:

On Tue, 2007-03-13 at 01:49 +, young dave wrote:
> Hi,
> I have tested on my mac mini g4.
>
> The 2.6.21-rc2 will cause oops like the above post.
>
> And for the new 2.6.21-rc3-git7 , the kernel load ok, penguin pixmap
> appears, but then it stopped, there's no error messages also.

-rc3 should have the bug fixed... it might be something else wrong. Have you use a pmac32_defconfig ?

Ben.

Hi,

I have tested the pmac32_defconfig with make menuconfig, and there are some warnings:

.config:380:warning: trying to assign nonexistent symbol IP_NF_TARGET_TCPMSS
.config:808:warning: trying to assign nonexistent symbol IEEE1394_OUI_DB
.config:811:warning: trying to assign nonexistent symbol IEEE1394_EXPORT_FULL_API
.config:1308:warning: trying to assign nonexistent symbol BACKLIGHT_DEVICE
.config:1310:warning: trying to assign nonexistent symbol LCD_DEVICE
.config:1461:warning: trying to assign nonexistent symbol USB_BANDWIDTH
.config:1464:warning: trying to assign nonexistent symbol USB_MULTITHREAD_PROBE
.config:1476:warning: trying to assign nonexistent symbol USB_OHCI_BIG_ENDIAN
.config:1734:warning: trying to assign nonexistent symbol ZISOFS_FS
.config:1894:warning: trying to assign nonexistent symbol IOMAP_COPY
.config:1920:warning: trying to assign nonexistent symbol DEBUG_RWSEMS

Then I modified the config file, but it still can't boot. It just stops. The keyboard is active, so it seems the kernel is running, but there are no init messages. I don't know why. The distribution I use is Yellowdog 4.0; the original 2.6.17 works fine.

Could you please help to check the configs? Thanks.

config Description: Binary data
do_acct_process bypasses vfs_write?
do_acct_process (in kernel/acct.c) bypasses vfs_write and calls file->f_op->write directly. It therefore bypasses various sanity checks, some of which appear applicable (notably inode->i_flock && MANDATORY_LOCK(inode)) and others of which do not (oversize request, access_ok, etc.). It also neglects to call fsnotify_modify(file->f_path.dentry) after a successful write, which may or may not matter. Perhaps someone more knowledgeable than I could go through vfs_read and vfs_write, distinguishing between those checks which are only applicable to requests initiated from userspace and those which should also be performed for in-kernel uses of f_op->read/write? Cheers, - Michael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 13/13] fix ps3fb glue allowing a modular build
On Wed, 2007-03-14 at 10:50 +0100, Geert Uytterhoeven wrote:
> On Wed, 14 Mar 2007, Al Viro wrote:
> > Signed-off-by: Al Viro <[EMAIL PROTECTED]>
>
> And finally, make sure CONFIG_LOGO=n, as there's a bug in the logo code: logos
> are __initdata but the logo code still tries to draw them for a modular fbdev.
> Originally (eons ago) this case was handled by the flag initmem_freed, which no
> longer exists.
>

True, I tried to prevent the logo from being drawn if the driver is loaded first prior to fbcon, but the code will still draw the logo if the load order is reversed.

Can you try this patch? It will only permit the drawing of the logo if both the driver and fbcon are compiled statically.

Tony

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index bd131d4..12e8a3b 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -107,7 +107,9 @@ static struct display fb_display[MAX_NR_
 static signed char con2fb_map[MAX_NR_CONSOLES];
 static signed char con2fb_map_boot[MAX_NR_CONSOLES];
+#ifndef MODULE
 static int logo_height;
+#endif
 static int logo_lines;
 /* logo_shown is an index to vc_cons when >= 0; otherwise follows FBCON_LOGO enums. */
@@ -576,6 +578,13 @@ static int fbcon_takeover(int show_logo)
 	return err;
 }
+#ifdef MODULE
+static void fbcon_prepare_logo(struct vc_data *vc, struct fb_info *info,
+			       int cols, int rows, int new_cols, int new_rows)
+{
+	logo_shown = FBCON_LOGO_DONTSHOW;
+}
+#else
 static void fbcon_prepare_logo(struct vc_data *vc, struct fb_info *info,
 			       int cols, int rows, int new_cols, int new_rows)
 {
@@ -584,6 +593,11 @@ static void fbcon_prepare_logo(struct vc
 	int cnt, erase = vc->vc_video_erase_char, step;
 	unsigned short *save = NULL, *r, *q;
+	if (info->flags & FBINFO_MODULE) {
+		logo_shown = FBCON_LOGO_DONTSHOW;
+		goto done;
+	}
+
 	/*
 	 * remove underline attribute from erase character
 	 * if black and white framebuffer.
@@ -654,7 +668,10 @@ static void fbcon_prepare_logo(struct vc
 		logo_shown = FBCON_LOGO_DRAW;
 		vc->vc_top = logo_lines;
 	}
+
+done:
 }
+#endif /* MODULE */
 #ifdef CONFIG_FB_TILEBLITTING
 static void set_blitting_type(struct vc_data *vc, struct fb_info *info)
diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
index 45f3839..08c292d 100644
--- a/drivers/video/fbmem.c
+++ b/drivers/video/fbmem.c
@@ -418,7 +418,8 @@ int fb_prepare_logo(struct fb_info *info
 	memset(&fb_logo, 0, sizeof(struct logo_data));
-	if (info->flags & FBINFO_MISC_TILEBLITTING)
+	if (info->flags & FBINFO_MISC_TILEBLITTING ||
+	    info->flags & FBINFO_MODULE)
 		return 0;
 	if (info->fix.visual == FB_VISUAL_DIRECTCOLOR) {
@@ -483,7 +484,8 @@ int fb_show_logo(struct fb_info *info, i
 	struct fb_image image;
 	/* Return if the frame buffer is not mapped or suspended */
-	if (fb_logo.logo == NULL || info->state != FBINFO_STATE_RUNNING)
+	if (fb_logo.logo == NULL || info->state != FBINFO_STATE_RUNNING ||
+	    info->flags & FBINFO_MODULE)
 		return 0;
 	image.depth = 8;
Re: [PATCH] Fix COMPAT_VDSO regression bug
I built a CONFIG_COMPAT_VDSO=y, CONFIG_HIGHMEM64G=y kernel and it has no problems with FC-6 userland. Everything looks fine with the vDSO. So either some more details of your kernel config are relevant, or something about the userland usage pattern.

Thanks,
Roland
Re: 2.6.21-rc3-mm2 (BUG in pci_restore_state())
Bjorn Helgaas <[EMAIL PROTECTED]> writes:

> In 2.6.21-rc3-mm2 (plus some move_freepages() bugfixes), I hit one
> of the warnings added by Eric's msi-debug-code.patch. This is on an
> ia64 box, an HP rx2600. Let me know if I can collect more information.

I think we are good.

How pci_save_state and pci_restore_state were implemented and how they were used were out of sync. tg3 was one of the drivers where pci_save_state and pci_restore_state were used as part of the reset routine and were not used in pairs. When combined with a pci-x or a pci-express capability, that resulted in a memory leak (which is what I was warning about).

This has now been corrected upstream, and the condition I was warning about, non-paired pci_save_state and pci_restore_state, is no longer a problem.

Eric
Re: [PATCH 6/17] sparc: nr_free_pages() is unsigned long
From: William Lee Irwin III <[EMAIL PROTECTED]>
Date: Wed, 14 Mar 2007 08:06:12 -0700

> On Wed, Mar 14, 2007 at 09:18:50AM +, Al Viro wrote:
> > Signed-off-by: Al Viro <[EMAIL PROTECTED]>
> > ---
> > arch/sparc/mm/init.c |2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
>
> Dave, I trust you'll pick it up until I get a git tree going.
>
> Acked-by: William Irwin <[EMAIL PROTECTED]>

What usually happens when Al sends a set like this is that Linus picks it up directly, and I've just verified that this is in fact what has happened this time too :-)
Re: oops in __nodemgr_remove_host_dev (was Re: Ooops with suspend to RAM)
On Thursday 15 March 2007 02:08:43 Stefan Richter wrote:
[...]
>
> Ismail, if you have the opportunity, the next thing you could test would
> be to unload eth1394 explicitly before ohci1394 on 2.6.21-rc3. This
> would _not_ oops according to my observation.

On a clean reboot it works as expected:

southpark cartman # rmmod eth1394
southpark cartman # rmmod ohci1394
southpark cartman #

No oops. Thanks.

--
Happiness in intelligent people is the rarest thing I know. (Ernest Hemingway)
Ismail Donmez ismail (at) pardus.org.tr
GPG Fingerprint: 7ACD 5836 7827 5598 D721 DF0D 1A9D 257A 5B88 F54C
Pardus Linux / KDE developer
Re: oops in __nodemgr_remove_host_dev (was Re: Ooops with suspend to RAM)
On Thursday 15 March 2007 02:08:43 Stefan Richter wrote:
[...]
> Ismail, if you have the opportunity, the next thing you could test would
> be to unload eth1394 explicitly before ohci1394 on 2.6.21-rc3. This
> would _not_ oops according to my observation.

rmmod eth1394 and modprobe -r eth1394 both hang here; no oops, nothing.

Regards.

--
Happiness in intelligent people is the rarest thing I know. (Ernest Hemingway)
Ismail Donmez ismail (at) pardus.org.tr
GPG Fingerprint: 7ACD 5836 7827 5598 D721 DF0D 1A9D 257A 5B88 F54C
Pardus Linux / KDE developer
Re: [PATCH 1/1] hotplug cpu: migrate a task within its cpuset
On Fri, Mar 09, 2007 at 05:58:59PM -0600, Nathan Lynch wrote: > Hello- > > Cliff Wickman wrote: > > This patch would insert a preference to migrate such a task to a cpu within > > its cpuset (and set its cpus_allowed to its cpuset). > > > > With this patch, migrate the task to: > > 1) to any cpu on the same node as the disabled cpu, which is both online > > and among that task's cpus_allowed > > 2) to any online cpu within the task's cpuset > > 3) to any cpu which is both online and among that task's cpus_allowed > > I think I disagree with this change. > > The kernel shouldn't have to be any smarter than it already is about > moving tasks off an offlined cpu. The only way case 2) can be reached > is if the user has changed a task's cpu affinity. If the user is > sophisticated enough to manipulate tasks' cpu affinity then they can > arrange to migrate tasks as they see fit before offlining a cpu. You are assuming some sort of interlock between the admin and the user. While this may be true on your own personal desktop, I don't think you can expect this to be true on a development machine shared by hundreds of users and admin'd by a group of people. Additionally, ia64 is gaining support for offlining a cpu which is giving cache errors. Thanks, Robin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, 14 Mar 2007, Davide Libenzi wrote: > On Wed, 14 Mar 2007, Benjamin LaHaise wrote: > > > On Wed, Mar 14, 2007 at 04:41:58PM -0700, Davide Libenzi wrote: > > > Yeah, of course. I do not plan revolutions. Just asking if it's a > > > possible > > > thing to do. I can mlock the userspace ring, if imposing that burden over > > > aio_complete() is seen as too heavy. > > > > I'm not sure I follow what you're doing -- why isn't asyncfd merely calling > > io_getevents() instead of reinventing everything the ringbuffer does? The > > aio ringbuffer is already locked in memory. Fwiw, the aio ringbuffer was > > originally wired up to a file descriptor, but that gave way to the actual > > syscall in order to enforce proper typechecking and typical usage scenarios > > with timeouts. > > The purpose of asyncfd is to provide a pollable (by the mean of > f_op->poll) device that can be hosted inside a standard select/poll/epoll > wait subsystem, and that, at the same time, provide a zero-copy way for > kernel code (KAIO and syslets/threadlets were my thought) to deliver > results to userspace. But, yeah. It can end up calling io_getevents() instead of doing it's own thing. That'd make it even slimmer ;) - Davide - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, 14 Mar 2007, Linus Torvalds wrote:

> On Wed, 14 Mar 2007, Davide Libenzi wrote:
> > >
> > > That won't work. aio_complete() is supposed to be irq safe.
> >
> > Can you point me to a kernel path that ends up calling aio_complete() in a
> > do-not-sleep mode?
>
> All of them.
>
> It's called from dio_bio_end_aio(), which is the bi_end_io function for an
> AIO action. Which in turn is called at IO completion time.
>
> Which is basically _always_ interrupt context.
>
> So you cannot sleep. It's not about holding spinlocks (which it might well
> do as well). It's about a much more fundamental issue: you can only sleep
> in process context, not from interrupts.

Ack! Gotcha. Sigh! :)

- Davide
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, 14 Mar 2007, Davide Libenzi wrote:
> >
> > That won't work. aio_complete() is supposed to be irq safe.
>
> Can you point me to a kernel path that ends up calling aio_complete() in a
> do-not-sleep mode?

All of them.

It's called from dio_bio_end_aio(), which is the bi_end_io function for an AIO action. Which in turn is called at IO completion time.

Which is basically _always_ interrupt context.

So you cannot sleep. It's not about holding spinlocks (which it might well do as well). It's about a much more fundamental issue: you can only sleep in process context, not from interrupts.

Linus
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, 14 Mar 2007, Benjamin LaHaise wrote: > On Wed, Mar 14, 2007 at 04:41:58PM -0700, Davide Libenzi wrote: > > Yeah, of course. I do not plan revolutions. Just asking if it's a possible > > thing to do. I can mlock the userspace ring, if imposing that burden over > > aio_complete() is seen as too heavy. > > I'm not sure I follow what you're doing -- why isn't asyncfd merely calling > io_getevents() instead of reinventing everything the ringbuffer does? The > aio ringbuffer is already locked in memory. Fwiw, the aio ringbuffer was > originally wired up to a file descriptor, but that gave way to the actual > syscall in order to enforce proper typechecking and typical usage scenarios > with timeouts. The purpose of asyncfd is to provide a pollable (by the mean of f_op->poll) device that can be hosted inside a standard select/poll/epoll wait subsystem, and that, at the same time, provide a zero-copy way for kernel code (KAIO and syslets/threadlets were my thought) to deliver results to userspace. > Also, there have been patches floating around for aio_poll and a way to get > epoll wakeups into the aio event queue. They deserve serious consideration > if this asyncfd seems necessary. I don't want to talk about the AIO poll code, because last time I saw it, it did not look shiny. But I think we can agree that ppl needs to have a way to wait for both block I/O (covered by either KAIO or syslets/threadlets) and all the other world (covered by epoll). This has been pretty clear for me, looking at the continuous request I got to provide block I/O completions through epoll, and looking at the hackage that ppl has currently to do in userspace to achieve that. Now that I'm seeing I can wait for both block and net I/O, I got excited ;) - Davide - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bug 8040] Hang before INIT when CONFIG_HIGHMEM4G=y [Fix CONFIG_COMPAT_VDSO] <- Bad
On Thursday 15 March 2007 02:01, Andrew Morton wrote: > > On Wed, 14 Mar 2007 17:52:01 + (UTC) Leroy van Logchem <[EMAIL > > PROTECTED]> wrote: > > Leroy van Logchem wldelft.nl> writes: > > Where does it hang exactly? Do you have a boot log? > > > > > None whatsoever. Three people are reporting this and it's a drop-dead > > > > > showstopper for a 2.6.21 release so we just have to wait until someone > > > > > wakes up and thinks about it. > > > > > > The topic should be "when CONFIG_HIGHMEM64G=y" imo. > > > > > > I'll try to do my first bi-sect today. > > Thanks. Please always do reply-to-all. Cc's restored (and added..) > > > Bisecting went well, after 13 compiles this commit was found: > > > > a1f3bb9ae4497a2ed3eac773fd7798ac33a0371f is first bad commit > > commit a1f3bb9ae4497a2ed3eac773fd7798ac33a0371f > > Author: Roland McGrath <[EMAIL PROTECTED]> > > Date: Fri Jan 26 00:56:46 2007 -0800 Can you please double check this by trying with/without again -- sometimes bisects go bad. -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: oops in __nodemgr_remove_host_dev (was Re: Ooops with suspend to RAM)
I wrote:
> according to a quick test I made right now it is a regression post 2.6.20.
> # modprobe ohci1394 # wait a bit, eth1394 is auto-loaded
> # modprobe -r eth1394
> # modprobe -r ohci1394
> works.
> # modprobe ohci1394 # wait a bit, eth1394 is auto-loaded
> # modprobe -r ohci1394
> oopses with the same trace as Ismael posted. And indeed, looking at his
> trace once more I now also spot eth1394 among his linked-in modules.

To avoid any misunderstandings: Both the former and the latter sequence work under 2.6.20 and earlier.

--
Stefan Richter
-=-=-=== --== -
http://arcgraph.de/sr/
[PATCH] Fix COMPAT_VDSO regression bug
Revert "[PATCH] Fix CONFIG_COMPAT_VDSO"

This reverts commit a1f3bb9ae4497a2ed3eac773fd7798ac33a0371f.

Several systems couldn't boot using CONFIG_HIGHMEM64G=y, as reported in bug #8040. Reverting the above patch solved the problem.

Cc: Randy Dunlap <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Roland McGrath <[EMAIL PROTECTED]>
Bisected-by: Leroy Raymond van Logchem <[EMAIL PROTECTED]>

 arch/i386/kernel/entry.S    |4
 arch/i386/kernel/sysenter.c |2 --
 include/asm-i386/elf.h      |7 ---
 include/asm-i386/fixmap.h   |2 --
 include/asm-i386/page.h     |2 --
 5 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/arch/i386/kernel/entry.S b/arch/i386/kernel/entry.S
index 5e47683..06461b8 100644
--- a/arch/i386/kernel/entry.S
+++ b/arch/i386/kernel/entry.S
@@ -302,16 +302,12 @@ sysenter_past_esp:
 	pushl $(__USER_CS)
 	CFI_ADJUST_CFA_OFFSET 4
 	/*CFI_REL_OFFSET cs, 0*/
-#ifndef CONFIG_COMPAT_VDSO
 	/*
 	 * Push current_thread_info()->sysenter_return to the stack.
 	 * A tiny bit of offset fixup is necessary - 4*4 means the 4 words
 	 * pushed above; +8 corresponds to copy_thread's esp0 setting.
 	 */
 	pushl (TI_sysenter_return-THREAD_SIZE+8+4*4)(%esp)
-#else
-	pushl $SYSENTER_RETURN
-#endif
 	CFI_ADJUST_CFA_OFFSET 4
 	CFI_REL_OFFSET eip, 0
diff --git a/arch/i386/kernel/sysenter.c b/arch/i386/kernel/sysenter.c
index 666f70d..a1090e1 100644
--- a/arch/i386/kernel/sysenter.c
+++ b/arch/i386/kernel/sysenter.c
@@ -95,7 +95,6 @@ int __init sysenter_setup(void)
 	return 0;
 }
-#ifndef CONFIG_COMPAT_VDSO
 static struct page *syscall_nopage(struct vm_area_struct *vma,
 				unsigned long adr, int *type)
 {
@@ -190,4 +189,3 @@ int in_gate_area_no_task(unsigned long addr)
 {
 	return 0;
 }
-#endif
diff --git a/include/asm-i386/elf.h b/include/asm-i386/elf.h
index 369035d..157bb7a 100644
--- a/include/asm-i386/elf.h
+++ b/include/asm-i386/elf.h
@@ -143,9 +143,12 @@ extern int dump_task_extended_fpu (struct task_struct *, struct user_fxsr_struct
 # define VDSO_PRELINK 0
 #endif
-#define VDSO_SYM(x) \
+#define VDSO_COMPAT_SYM(x) \
 	(VDSO_COMPAT_BASE + (unsigned long)(x) - VDSO_PRELINK)
+#define VDSO_SYM(x) \
+	(VDSO_BASE + (unsigned long)(x) - VDSO_PRELINK)
+
 #define VDSO_HIGH_EHDR ((const struct elfhdr *) VDSO_HIGH_BASE)
 #define VDSO_EHDR ((const struct elfhdr *) VDSO_COMPAT_BASE)
@@ -153,12 +156,10 @@ extern void __kernel_vsyscall;
 #define VDSO_ENTRY VDSO_SYM(&__kernel_vsyscall)
-#ifndef CONFIG_COMPAT_VDSO
 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES
 struct linux_binprm;
 extern int arch_setup_additional_pages(struct linux_binprm *bprm,
 				       int executable_stack);
-#endif
 extern unsigned int vdso_enabled;
diff --git a/include/asm-i386/fixmap.h b/include/asm-i386/fixmap.h
index 3e9f610..02428cb 100644
--- a/include/asm-i386/fixmap.h
+++ b/include/asm-i386/fixmap.h
@@ -23,8 +23,6 @@ extern unsigned long __FIXADDR_TOP;
 #else
 #define __FIXADDR_TOP 0xf000
-#define FIXADDR_USER_START __fix_to_virt(FIX_VDSO)
-#define FIXADDR_USER_END __fix_to_virt(FIX_VDSO - 1)
 #endif
 #ifndef __ASSEMBLY__
diff --git a/include/asm-i386/page.h b/include/asm-i386/page.h
index 7b19f45..fd3f64a 100644
--- a/include/asm-i386/page.h
+++ b/include/asm-i386/page.h
@@ -143,9 +143,7 @@ extern int page_is_ram(unsigned long pagenr);
 #include
 #include
-#ifndef CONFIG_COMPAT_VDSO
 #define __HAVE_ARCH_GATE_AREA 1
-#endif
 #endif /* __KERNEL__ */
 #endif /* _I386_PAGE_H */
[PATCH] cosmetic adaption of drivers/ide/Kconfig concerning SATA
Hello,

since Serial ATA has its own menu entry now, I guess we can change the description of the deprecated SATA driver as well, since the new S-ATA subsystem is no longer configured through a SCSI low-level driver.

The following patch is against 2.6.21-rc3:

--- linux-2.6.20.orig/drivers/ide/Kconfig	2007-03-12 01:34:38.0 +0100
+++ linux-2.6.20/drivers/ide/Kconfig	2007-03-12 01:47:10.0 +0100
@@ -103,7 +103,7 @@
 	---help---
 	  There are two drivers for Serial ATA controllers.
-	  The main driver, "libata", exists inside the SCSI subsystem
+	  The main driver, "libata", exists in the "Serial ATA subsystem"
 	  and supports most modern SATA controllers.
 	  The IDE driver (which you are currently configuring) supports

Since I am not subscribed to the list, I'd appreciate being personally CC'ed. :-)

Best regards
Patrick
Re: oops in __nodemgr_remove_host_dev (was Re: Ooops with suspend to RAM)
Ismail Dönmez wrote: > On Wednesday 14 March 2007 20:25:24 Stefan Richter wrote: >> Ismail Dönmez wrote: >> > Are you able to rmmod it? >> >> Yes, but on 2.6.20 and earlier kernels, most of the time with >> development versions of the 1394 drivers. I still haven't tried >> 2.6.21-rc, will hopefully get to it tonight. > > Ok then that explains a bit, without suspend if I rmmod ohci1394 module I got > the exact oops. Elsewhere, Adrian Bunk wrote: | Is this an old problem, or what was the last kernel that worked | for you? Adrian, according to a quick test I made right now it is a regression post 2.6.20. # modprobe ohci1394 # wait a bit, eth1394 is auto-loaded # modprobe -r eth1394 # modprobe -r ohci1394 works. # modprobe ohci1394 # wait a bit, eth1394 is auto-loaded # modprobe -r ohci1394 oopses with the same trace as Ismael posted. And indeed, looking at his trace once more I now also spot eth1394 among his linked-in modules. Ismail, if you have the opportunity, the next thing you could test would be to unload eth1394 explicitly before ohci1394 on 2.6.21-rc3. This would _not_ oops according to my observation. Thanks to Ismail's link to the similar report on 2.6.19-rc5-mm2 we already have a hot candidate to be the trigger (not necessarily to be the actual bug): http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=43cb76d91ee85f579a69d42bc8efc08bac560278 "Network: convert network devices to use struct device instead of class_device" Alas I didn't remember that older 2.6.19-rc5-mm2 discussion when I saw Greg's pull request with this conversion patch (February 7) and didn't react and test Linus' newest. Advice would be appreciated... -- Stefan Richter -=-=-=== --== - http://arcgraph.de/sr/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21-rc3-mm1
> On Wed, 14 Mar 2007 20:06:02 +0100 Mariusz Kozlowski <[EMAIL PROTECTED]> > wrote: > Hello, > > Today after +- 24h of uptime I found some more page allocation > failures ('eth1: Can't allocate skb for Rx'). You'll find more here: > > http://tuxland.pl/misc/2.6.21-rc3-mm1-page-allocation-failure.txt > > System wasn't doing anything unusual, as usual ;-) X, some p2p > software, firefox+flash playing music. > Do other kernels do this, or is 2.6.21-rc3-mm1 worse? It is of course a non-fatal problem and will inevitably happen sometimes, but we would like the VM to be able to minimise the occurrence of this problem. I think we were rather hoping that Mel's anti-fragmentation work would improve things. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] change futex_wait() to hrtimers
> BTW. my futex man page says timeout's contents "describe the maximum duration > of the wait". Surely that should be *minimum*? Michael cc'ed. Er, the intent of the wording is to say "futex will wait until uaddr no longer contains val, or the timeout expires, whichever happens first". One option for selecting different clock resolutions is to use the clockid_t from the POSIX clock_gettime() family. That is, specify the clock that a wait uses, and then have a separate mechanism for turning a resolution requirement into a clockid_t. (And there can be default clocks for interfaces that don't specify one explicitly.) Although clockid_t is pretty generic, it's biased toward an enumerated list of clocks rather than a continuous resolution. Fortunately, that seems to match the implementation ideas. The question is how much the timeout gets rounded, and the choices are currently jiffies or microseconds. A related option may be whether rounding down is acceptable. For some applications (periodic polling for events), it's fine. For others, it's not. Thus, while it's okay to specify such clocks explicitly, it'd probably be a good idea to forbid selecting them as the default for interfaces that don't specify a clock explicitly. I had some code that suffered 1 ms buzz-loops on Solaris because poll(2) would round the timeout interval down, but the loop calling it would explicitly check whether the timeout had expired using gettimeofday() and would keep re-invoking poll(pollfds, npollfds, 1) until the timeout really did expire. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
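The rounding concern above can be illustrated with a small caller-side sketch: when converting a remaining interval into a coarser timeout unit, round up so that the wait is never shorter than the time actually remaining. The helper name remaining_ms_round_up() is hypothetical, and this only shows the round-up policy in the caller; it does not change how any particular kernel rounds the timeout it is handed.

```c
#include <assert.h>
#include <limits.h>

/*
 * Convert a remaining interval in microseconds into a poll(2)-style
 * millisecond timeout, rounding up. A partial millisecond becomes a full
 * one, so a wait that lasts the returned time cannot end before the
 * deadline. Intervals that are already over (<= 0) map to 0, i.e.
 * "do not block".
 */
static int remaining_ms_round_up(long long remaining_us)
{
	/* Keep the result representable as an int millisecond count. */
	assert(remaining_us <= (long long)INT_MAX * 1000);

	if (remaining_us <= 0)
		return 0;

	return (int)((remaining_us + 999) / 1000);
}
```

With round-down in the same place, a remaining interval of 999 microseconds would become a 0 ms timeout, and a loop that rechecks the deadline with gettimeofday() would spin until the deadline really passed, which is the buzz-loop pattern described in the message.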
Re: [Bug 8040] Hang before INIT when CONFIG_HIGHMEM4G=y [Fix CONFIG_COMPAT_VDSO] <- Bad
> On Wed, 14 Mar 2007 17:52:01 +0000 (UTC) Leroy van Logchem <[EMAIL PROTECTED]> wrote:
> Leroy van Logchem wldelft.nl> writes:
>
> > > > None whatsoever. Three people are reporting this and it's a drop-dead
> > > > showstopper for a 2.6.21 release so we just have to wait until someone
> > > > wakes up and thinks about it.
> >
> > The topic should be "when CONFIG_HIGHMEM64G=y" imo.
> >
> > I'll try to do my first bi-sect today.

Thanks. Please always do reply-to-all. Cc's restored (and added..)

> Bisecting went well, after 13 compiles this commit was found:
>
> a1f3bb9ae4497a2ed3eac773fd7798ac33a0371f is first bad commit
> commit a1f3bb9ae4497a2ed3eac773fd7798ac33a0371f
> Author: Roland McGrath <[EMAIL PROTECTED]>
> Date: Fri Jan 26 00:56:46 2007 -0800
>
> [PATCH] Fix CONFIG_COMPAT_VDSO
>
> I wouldn't mind if CONFIG_COMPAT_VDSO went away entirely. But if it's there,
> it should work properly. Currently it's quite haphazard: both real vma and
> fixmap are mapped, both are put in the two different AT_* slots, sysenter
> returns to the vma address rather than the fixmap address, and core dumps yet
> are another story.
>
> This patch makes CONFIG_COMPAT_VDSO disable the real vma and use the fixmap
> area consistently. This makes it actually compatible with what the old vdso
> implementation did.
>
> Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
> Cc: Ingo Molnar <[EMAIL PROTECTED]>
> Cc: Paul Mackerras <[EMAIL PROTECTED]>
> Cc: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> Cc: Andi Kleen <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
>
> :040000 040000 802ab3366a651ecba28c8677fa84a9f7c506392b f44adc4dcdab733e5965b68ccd0d643f0a550a80 M arch
> :040000 040000 be1e217152d8b3fcd05f09aa2b3f4f9dcb8208aa 46cc86427e861350dd3fef9469474c55119f27ce M include
>
> I had both CONFIG_COMPAT_VDSO=y and CONFIG_HIGHMEM64G=y configured.
> Using a 4GB Supermicro 7044 SMP dual Xeon. Details upon request.
>
> --
> Leroy
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, Mar 14, 2007 at 04:41:58PM -0700, Davide Libenzi wrote:
> Yeah, of course. I do not plan revolutions. Just asking if it's a possible
> thing to do. I can mlock the userspace ring, if imposing that burden over
> aio_complete() is seen as too heavy.

I'm not sure I follow what you're doing -- why isn't asyncfd merely calling io_getevents() instead of reinventing everything the ringbuffer does? The aio ringbuffer is already locked in memory.

Fwiw, the aio ringbuffer was originally wired up to a file descriptor, but that gave way to the actual syscall in order to enforce proper typechecking and typical usage scenarios with timeouts.

Also, there have been patches floating around for aio_poll and a way to get epoll wakeups into the aio event queue. They deserve serious consideration if this asyncfd seems necessary.

-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.
Re: [PATCH 0/8] x86 boot, pda and gdt cleanups
On Tue, 2007-03-13 at 13:48 -0700, Jeremy Fitzhardinge wrote:
> * init_gdt should always use write_gdt_entry when touching the gdt;
>   if it doesn't and it ends up touching an already-installed gdt
>   under Xen, it will get a write fault. This happens because
>   init_gdt ends up getting called twice in SMP (see below).

Hmm, this invalidated my assumption that write_gdt_entry is always a write to this cpu's active gdt. Better fix is not to call it twice anyway...

> * init_gdt should always be called before bringing up the cpu,
>   rather than by the cpu itself (and therefore, cpu_init() shouldn't
>   call it). Obviously the boot cpu is an exception.

Makes sense.

> * secondary_cpu_init stops being necessary.

Indeed.

> * On SMP, init_gdt can get called twice: first time in
>   smp_prepare_boot_cpu, and a second time in trap_init. On UP,
>   trap_init is the only caller.

Getting rid of the call in smp_prepare_boot_cpu currently works, but it's fragile: __get_cpu_var(x) && per_cpu(x, smp_processor_id()) will differ, and changes made to __get_cpu_var(x) will vanish...

Fortunately, UP doesn't have to call init_gdt at all, so I think it's better to place it in smp_prepare_boot_cpu only and then clean up the UP code.

I'll try now...
Rusty.
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, 14 Mar 2007, Davide Libenzi wrote:
> On Wed, 14 Mar 2007, Benjamin LaHaise wrote:
>
> > On Wed, Mar 14, 2007 at 04:24:54PM -0700, Davide Libenzi wrote:
> > > Can you point me to a kernel path that ends up calling aio_complete() in a
> > > do-not-sleep mode?
> >
> > If you remove that invariant, then it is very difficult for device drivers
> > and other code to make use of aio_complete().
> >
> > > The offender I see is drivers/usb/gadget/inode.c that calls it with a
> > > spinlock held.
> >
> > Which was from irq context last time I checked.

The drivers/usb/gadget/inode.c case seems to be easily fixable AFAICS, in the ep_aio_complete() function. I was more under the impression that aio_complete() was more of a tasklet kind of domain.

- Davide
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, 14 Mar 2007, Benjamin LaHaise wrote:
> On Wed, Mar 14, 2007 at 04:24:54PM -0700, Davide Libenzi wrote:
> > Can you point me to a kernel path that ends up calling aio_complete() in a
> > do-not-sleep mode?
>
> If you remove that invariant, then it is very difficult for device drivers
> and other code to make use of aio_complete().
>
> > The offender I see is drivers/usb/gadget/inode.c that calls it with a
> > spinlock held.
>
> Which was from irq context last time I checked.
>
> > The aio_run_iocb function seems to release/reacquire the lock before
> > calling aio_complete().
>
> That implies nothing -- aio_complete() has to acquire ctx_lock and cannot
> be called holding the lock. Sure, it could probably be split into
> __aio_complete() and have aio_complete() wrap it acquiring the lock.

Yeah, of course. I do not plan revolutions. Just asking if it's a possible thing to do. I can mlock the userspace ring, if imposing that burden over aio_complete() is seen as too heavy.

- Davide
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, Mar 14, 2007 at 04:24:54PM -0700, Davide Libenzi wrote:
> Can you point me to a kernel path that ends up calling aio_complete() in a
> do-not-sleep mode?

If you remove that invariant, then it is very difficult for device drivers and other code to make use of aio_complete().

> The offender I see is drivers/usb/gadget/inode.c that calls it with a
> spinlock held.

Which was from irq context last time I checked.

> The aio_run_iocb function seems to release/reacquire the lock before
> calling aio_complete().

That implies nothing -- aio_complete() has to acquire ctx_lock and cannot be called holding the lock. Sure, it could probably be split into __aio_complete() and have aio_complete() wrap it acquiring the lock.

-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.
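[Editorial sketch, not from the original poster.] The split suggested above (an inner function that expects the lock to be held, plus a wrapper that takes it) is a common pattern, and a userspace analogue might look like the following. This is only an illustration under stated assumptions: struct ctx, complete_locked() and complete() are hypothetical names, a pthread mutex stands in for the kernel's ctx_lock, and a plain counter stands in for the completion ring. It is not the kernel's aio code, which would need an irq-safe spinlock rather than a sleeping mutex.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the aio context. */
struct ctx {
    pthread_mutex_t lock;   /* plays the role of ctx_lock */
    int completed;          /* plays the role of the completion ring */
};

/* Inner variant: the caller must ALREADY hold ctx->lock.
 * (In kernel style this would carry the "__" prefix.) */
static void complete_locked(struct ctx *ctx)
{
    ctx->completed++;
}

/* Outer variant: takes the lock itself, so it must NOT be called
 * with ctx->lock held (a non-recursive mutex would deadlock). */
static void complete(struct ctx *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    complete_locked(ctx);
    pthread_mutex_unlock(&ctx->lock);
}
```

Code paths that already hold the lock would call complete_locked() directly, while everything else goes through complete(). The design cost is that every caller has to know which of the two situations it is in.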
Re: /sys/devices/system/cpu/cpuX/online are missing
On Tue, 13 Mar 2007 09:56:52 +0000 Russell King <[EMAIL PROTECTED]> wrote:
> Right, here's the ARM fix which is now in the ARM tree:
> [...]

The following patch seems to fix the issue (+ minor style fix). I'm not sure it's ok due to my poor knowledge of this code.

Signed-off-by: Giuliano Pochini <[EMAIL PROTECTED]>

--- linux-2.6.21rc3/arch/powerpc/kernel/setup_32.c__orig	2007-03-15 00:05:02.000000000 +0100
+++ linux-2.6.21rc3/arch/powerpc/kernel/setup_32.c	2007-03-15 00:07:02.000000000 +0100
@@ -195,18 +195,22 @@
 EXPORT_SYMBOL(nvram_sync);
 #endif /* CONFIG_NVRAM */
 
-static struct cpu cpu_devices[NR_CPUS];
+static DEFINE_PER_CPU(struct cpu, cpu_devices);
 
 int __init ppc_init(void)
 {
-	int i;
+	int cpu;
 
 	/* clear the progress line */
-	if ( ppc_md.progress ) ppc_md.progress(" ", 0xffff);
+	if (ppc_md.progress)
+		ppc_md.progress(" ", 0xffff);
 
 	/* register CPU devices */
-	for_each_possible_cpu(i)
-		register_cpu(&cpu_devices[i], i);
+	for_each_possible_cpu(cpu) {
+		struct cpu *c = &per_cpu(cpu_devices, cpu);
+		c->hotpluggable = 1;
+		register_cpu(c, cpu);
+	}
 
 	/* call platform init */
 	if (ppc_md.init != NULL) {

--
Giuliano.
Re: SMP performance degradation with sysbench
On Tue, Mar 13, 2007 at 05:08:59AM -0700, Nick Piggin wrote:
> I would agree that it points to MySQL scalability issues, however the
> fact that such large gains come from tcmalloc is still interesting.

What glibc version are you, Anton and others using? Does that version have this fix included?

Dynamically size mmap threshold if the program frees mmaped blocks.
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/malloc/malloc.c.diff?r1=1.158&r2=1.159&cvsroot=glibc

thanks,
suresh
Re: [patch 13/13] signalfd/timerfd/asyncfd v5 - KAIO asyncfd support (example/maybe-broken) ...
On Wed, 14 Mar 2007, Benjamin LaHaise wrote:
> On Wed, Mar 14, 2007 at 03:19:21PM -0700, Davide Libenzi wrote:
> > +	/*
> > +	 * Check if the user asked us to deliver the result through an
> > +	 * asyncfd. Note that asyncfd_add_results() may sleep. It seems
> > +	 * OK looking at the code, but I'm not sure since inside a USB driver,
> > +	 * aio_complete() is called with a spinlock held. !!CHECK
> > +	 */
>
> That won't work. aio_complete() is supposed to be irq safe.

Can you point me to a kernel path that ends up calling aio_complete() in a do-not-sleep mode? The offender I see is drivers/usb/gadget/inode.c that calls it with a spinlock held. The aio_run_iocb function seems to release/reacquire the lock before calling aio_complete().

- Davide