Re: the usage of DEBUG_DRIVER seems ambiguous
On Fri, 9 Mar 2007, Artem Bityutskiy wrote:

> Randy Dunlap wrote:
> > > it's clearly a configuration variable, but it's also being used by
> > > itself in a few drivers/net/ source files. is that deliberate?
> >
> > The ones in drivers/net/ are just local driver debug controls.
> > They happen to have the same name as a (likely newer) kconfig symbol.
> >
> > Is there a real problem that needs to be fixed?
>
> Renaming them just for the sake of being less confusing makes sense.

that's kind of what i had in mind. i have a script that peruses the
source tree, checking for apparent typos in preprocessor directives
when someone forgets the leading "CONFIG_", and as long as that macro
name is the way it is, that example is going to be flagged every time
for no good reason.

if someone wants to make a suggestion, i can submit a simple renaming
patch.

rday

--
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
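For readers wondering what such a check might look like, here is a minimal
sketch in C. It is not the script mentioned above (which was not posted);
the symbol table known_kconfig[] and the function names are hypothetical,
and only plain #ifdef/#ifndef lines are handled.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sample of Kconfig symbol names, without the CONFIG_ prefix. */
static const char *const known_kconfig[] = { "DEBUG_DRIVER", "SMP", "PM" };

static bool is_known_kconfig(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(known_kconfig) / sizeof(known_kconfig[0]); i++)
		if (strcmp(name, known_kconfig[i]) == 0)
			return true;
	return false;
}

/*
 * Return true if `line` is an #ifdef/#ifndef that tests a bare name which
 * also exists as a Kconfig symbol, i.e. a likely missing "CONFIG_".
 * On a match the macro name is copied into `macro` (buffer size `len`).
 */
static bool missing_config_prefix(const char *line, char *macro, size_t len)
{
	char directive[16], name[128];

	/* Accept optional whitespace before and after the '#'. */
	if (sscanf(line, " # %15s %127s", directive, name) != 2)
		return false;
	if (strcmp(directive, "ifdef") != 0 && strcmp(directive, "ifndef") != 0)
		return false;
	if (strncmp(name, "CONFIG_", 7) == 0 || !is_known_kconfig(name))
		return false;
	snprintf(macro, len, "%s", name);
	return true;
}
```

A checker of this shape cannot tell a forgotten prefix from a driver-local
macro that merely shares a name with a Kconfig symbol, which is why the
local DEBUG_DRIVER in drivers/net/ keeps getting flagged.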
Re: [patch 1/3] Input: psmouse - create PS/2 protocol options for Kconfig
On 3/9/07, Andres Salomon <[EMAIL PROTECTED]> wrote:
> I haven't seen patches in your tree; are you waiting for me to do the
> cleanups and resend?

Still in my private tree; will try to push out over the weekend.

--
Dmitry
Re: Keyboard stops working after *lock [Was: 2.6.21-rc2-mm1]
On Fri, 9 Mar 2007, Dmitry Torokhov wrote:

> > > > > (II) evdev brain: Rescanning devices (12).
> > > > > (II) evdev brain: Rescanning devices (13).
> > > > > (II) evdev brain: Rescanning devices (14).
> > > > > in this kernel, but I don't know if this is relevant.
> > > > > After booting back to .20-mm2 everything is OK.
> > > Thanks. Cc's added.
> > Remains unsolved in 2.6.21-rc3-mm2.
> Does a PS/2 keyboard behave for you?
> Nowadays I forward all USB HID related issues to Jiri Kosina ;) (CCed).

Hi,

more importantly, does 2.6.21-rc3 work for you? There are not that many
USB HID/hidinput specific patches in -mm, so it would show clearly
whether the problem is in USB HID/hidinput or somewhere else.

What keyboard is that, please? (vendor/product IDs)

Also, if it turns out to be a HID problem, could you please send the
output of both the working and the non-working kernel with hid/usbhid
debugging enabled?

If this is also present in vanilla and not only in -mm, could you please
try reverting commits 4237081e573b99a48991aa71364b0682c444651c and
d4ae650a904612ffb7edd3f28b69b022988d2466 and let me know if the
situation gets any better?

Thanks,

--
Jiri Kosina
Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler
Linus Torvalds wrote:
> On Thu, 8 Mar 2007, Bill Davidsen wrote:
> > Please, could you now rethink pluggable scheduler as well? Even if one
> > had to be chosen at boot time and couldn't be changed thereafter, it
> > would still allow a few new thoughts to be included.
>
> No. Really.
>
> I absolutely *detest* pluggable schedulers. They have a huge downside:
> they allow people to think that it's ok to make special-case schedulers.

But it IS okay for people to make special-case schedulers. Because it's
MY machine, and how it behaves under mixed load is not a technical
issue, it's a POLICY issue, and therefore the only way you can allow the
admin to implement that policy is to either provide several schedulers
or to provide all sorts of tunable knobs. And by having a few schedulers
which have been heavily tested and reviewed, you can define the policy
the scheduler implements and document it. Instead of people writing
their own, or hacking the code, they could have a few well-tested
choices, with known policy goals.

> And I simply very fundamentally disagree.
>
> If you want to play with a scheduler of your own, go wild. It's easy
> (well, you'll find out that getting good results isn't, but that's a
> different thing). But actual pluggable schedulers just cause people to
> think that "oh, the scheduler performs badly under circumstance X, so
> let's tell people to use special scheduler Y for that case".

And has that been a problem with io schedulers? I don't see any vast
proliferation of them, I don't see contentious exchanges on LKML, or
people asking how to get yet another into mainline. In fact, I would say
that the io scheduler situation is as right as anything can be, choices
for special cases, lack of requests for something else.

> And CPU scheduling really isn't that complicated. It's *way* simpler
> than IO scheduling. There simply is *no*excuse* for not trying to do it
> well enough for all cases, or for having special-case stuff.

This supposes that the desired behavior, the policy, is the same on all
machines or that there is currently a way to set the target. If I want
interactive response with no consideration to batch (and can't trust
users to use nice), I want one policy. If I want a compromise, the
current scheduler or RSDL are candidates, but they do different things.

> But even IO scheduling actually ends up being largely the same. Yes, we
> have pluggable schedulers, and we even allow switching them, but in the
> end, we don't want people to actually do it. It's much better to have a
> scheduler that is "good enough" than it is to have five that are
> "perfect" for five particular cases.

We not only have multiple io schedulers, we have many tunable io
parameters, all of which allow people to make their system behave the
way they think is best. It isn't causing complaint, confusion, or
instability. We have many people requesting a different scheduler, so
obviously what we have isn't "good enough" and I doubt any one scheduler
can be, given that the target behavior is driven by non-technical
choices.

--
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: Sleeping thread not receive signal until it wakes up
On Thu, 8 Mar 2007 14:52:07 -0800 Luong Ngo wrote:

[...]
> static irqreturn board_isr(int irq, void *dev_id, struct pt_regs* regs)
> {
>   spin_lock(&dev->lock);
>    if (dev->irqMask & (1 << irqBit)) {
>        // Set the interrupt event mask
>        dev->irqEvent |= (1 << irqBit);
>
>        // Disable this irq, it will be reenabled after processed by board task
>        disable_irq(irq);

I assume that your device does not support shared interrupts? If it
does (and a PCI device is required to support them), you cannot use
disable_irq() here (and you need to check a register in the device to
find out if it really did generate an IRQ)...

>        // Wake up Board thread that calling IOCTL
>        wake_up(&(dev->boardIRQWaitQueue));
>    }
>   spin_unlock(&dev->lock);
>
>    return IRQ_HANDLED;

...and return IRQ_NONE here if the IRQ is not from your device.

>
> }
>
> static int ats89_ioctl(struct inode *inode, struct file *file, u_int
> cmd, u_long arg)
> {
>
>    switch(cmd){
>    case GET_IRQ_CMD: {
>        u32 regMask32;
>
>        spin_lock_irq(dev->lock);
>        while ((dev->irqMask & dev->irqEvent) == 0) {
>            // Sleep until board interrupt happens
>            spin_unlock_irq(dev->lock);
>            interruptible_sleep_on(&(dev->boardIRQWaitQueue));
>            if (uncond_wakeup) {
>                /* don't go back to loop */
>                break;
>            }
>            spin_lock_irq(dev->lock);
>        }
>
>        uncond_wakeup = 0;
>
>        // Board interrupt happened
>        regMask32 = dev->irqMask & dev->irqEvent;
>        if(copy_to_user(&(((ATS89_IOCTL_S *)arg)->mask32),
>                        &regMask32, sizeof(u32))) {
>            spin_unlock_irq(dev->lock);
>            return -EAGAIN;
>        }
>
>        // Clear the event mask
>        dev->irqEvent = 0;
>        spin_unlock_irq(dev->lock);
>    }
>    break;
>
>
>    }
> }

And this code is full of bugs:

1) As you have been told already, interruptible_sleep_on() and
   sleep_on() functions are broken and should not be used (they are
   left in the kernel only to support some obsolete code). Either use
   wait_event_interruptible() or work with wait queues directly
   (prepare_to_wait(), finish_wait(), ...).

2) The code to handle pending signals is missing - you need to have
   this after wait_event_interruptible():

	if (signal_pending(current))
		return -ERESTARTSYS;

   (but be careful - you might need to clean up something before
   returning). This is what causes your problem -
   interruptible_sleep_on() returns if a signal is pending, but your
   code does not check for signals and therefore invokes
   interruptible_sleep_on() again; but if a signal is pending,
   interruptible_sleep_on() returns immediately, causing your driver
   to eat 100% CPU looping in kernel mode until some device event
   finally happens.

3) If uncond_wakeup is set, you break out of the loop with dev->lock
   unlocked; however, if dev->irqEvent gets set, you exit the loop
   with dev->lock locked. The subsequent code always unlocks
   dev->lock, so in the uncond_wakeup case you have double unlock.

4) You are doing copy_to_user() while holding a spinlock - this is
   prohibited (as any other form of sleep inside a spinlock).

5) The return code for the copy_to_user() failure is wrong - it should
   be -EFAULT (this is not a fatal bug, but an annoyance for users of
   your driver, who might get such nonstandard error codes while
   debugging their programs and wonder what is going on).
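The locking shape asked for in points 3 to 5 can be sketched as a
user-space model. This is only an analogy under stated assumptions, not
driver code: a pthread mutex and condition variable stand in for the
spinlock and the wait queue, copy_out() is a made-up stand-in for
copy_to_user(), and struct board and all function names are hypothetical.
Signal handling (point 2) has no counterpart here; in the kernel that is
the wait_event_interruptible() plus -ERESTARTSYS path described above.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

struct board {
	pthread_mutex_t lock;      /* models dev->lock */
	pthread_cond_t  irq_wait;  /* models dev->boardIRQWaitQueue */
	uint32_t irq_mask;
	uint32_t irq_event;
};

/* Made-up stand-in for copy_to_user(): nonzero means the copy failed. */
static int copy_out(uint32_t *dst, const uint32_t *src)
{
	if (dst == NULL)
		return 1;
	memcpy(dst, src, sizeof(*dst));
	return 0;
}

/* "Interrupt" side: record the event under the lock, then wake waiters. */
static void board_post_event(struct board *b, unsigned int bit)
{
	pthread_mutex_lock(&b->lock);
	b->irq_event |= UINT32_C(1) << bit;
	pthread_cond_broadcast(&b->irq_wait);
	pthread_mutex_unlock(&b->lock);
}

/* "ioctl" side: every path leaves with the lock released exactly once. */
static int board_get_events(struct board *b, uint32_t *user_mask)
{
	uint32_t pending;

	pthread_mutex_lock(&b->lock);
	/* Re-check the condition after every wakeup. */
	while ((b->irq_mask & b->irq_event) == 0)
		pthread_cond_wait(&b->irq_wait, &b->lock);
	pending = b->irq_mask & b->irq_event;  /* snapshot under the lock */
	b->irq_event &= ~pending;              /* consume what we report */
	pthread_mutex_unlock(&b->lock);

	/* The copy that may fault or sleep happens outside the lock. */
	if (copy_out(user_mask, &pending))
		return -EFAULT;
	return 0;
}
```

One design choice to be aware of: the events are consumed before the copy,
so a failed copy loses them. The quoted driver cleared irqEvent only after
a successful copy; which behaviour is wanted depends on the device.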
Re: sys_write() racy for multi-threaded append?
On Fri, Mar 09, 2007 at 04:19:55AM -0800, Michael K. Edwards wrote:
> On 3/8/07, Benjamin LaHaise <[EMAIL PROTECTED]> wrote:
> >Any number of things can cause a short write to occur, and rewinding the
> >file position after the fact is just as bad. A sane app has to either
> >serialise the writes itself or use a thread safe API like pwrite().
>
> Not on a pipe/FIFO. Short writes there are flat out verboten by
> 1003.1 unless O_NONBLOCK is set. (Not that f_pos is interesting on a
> pipe except as a "bytes sent" indicator -- and in the multi-threaded
> scenario, if you do the speculative update that I'm suggesting, you
> can't 100% trust it unless you ensure that you are not in
> mid-read/write in some other thread at the moment you sample f_pos.
> But that doesn't make it useless.)

Writes to a pipe/FIFO are atomic, so long as they fit within the pipe
buffer size, while f_pos on a pipe is undefined -- what exactly is the
issue here? The semantics you're assuming are not defined by POSIX.
Heck, even looking at a man page for one of the *BSDs states "Some
devices are incapable of seeking. The value of the pointer associated
with such a device is undefined." What part of undefined is problematic?

> As to what a "sane app" has to do: it's just not that unusual to write
> application code that treats a short read/write as a catastrophic
> error, especially when the fd is of a type that is known never to
> produce a short read/write unless something is drastically wrong. For
> instance, I bomb on short write in audio applications where the driver
> is known to block until enough bytes have been read/written, period.
> When switching from reading a stream of audio frames from thread A to
> reading them from thread B, I may be willing to omit app
> serialization, because I can tolerate an imperfect hand-off in which
> thread A steals one last frame after thread B has started reading --
> as long as the fd doesn't get screwed up. There is no reason for the
> generic sys_read code to leave a race open in which the same frame is
> read by both threads and a hardware buffer overrun results later.

I hope I don't have to run any of your software. Short writes can and
do happen for a variety of reasons: signals, memory allocation
failures, quota being exceeded. These are all error conditions the
kernel has to provide well defined semantics for, as well behaved
applications will try to handle them gracefully.

> In short, I'm not proposing that the kernel perfectly serialize
> concurrent reads and writes to arbitrary fd types. I'm proposing that
> it not do something blatantly stupid and easily avoided in generic
> code that makes it impossible for any fd type to guarantee that, after
> 10 successful pipelined 100-byte reads or writes, f_pos will have
> advanced by 1000.

The semantics you're looking for are defined for regular files with
O_APPEND. Anything else is asking for synchronization that other
applications do not require and do not desire.

		-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.
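As an illustration of the "handle short writes gracefully, or use
pwrite()" advice, here is a minimal user-space sketch. The helper name
pwrite_all() is made up for this example; pwrite() itself is the POSIX
call, which takes an explicit offset and leaves the shared file position
alone. Treating a zero-byte result as EIO is an assumption of the sketch,
made only so the loop cannot spin forever.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write exactly `len` bytes at offset `off`, resuming after short writes
 * and retrying after EINTR. Returns 0 on success, -1 with errno set on
 * failure. The file position of `fd` is never touched, so concurrent
 * callers do not race on it.
 */
static int pwrite_all(int fd, const void *buf, size_t len, off_t off)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = pwrite(fd, p, len, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;       /* interrupted: just retry */
			return -1;          /* real error: errno is set */
		}
		if (n == 0) {
			errno = EIO;        /* no progress: give up (assumption) */
			return -1;
		}
		/* Short write: advance and continue with the remainder. */
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return 0;
}
```

Each thread can call this with its own offset; whether the resulting
layout makes sense is still up to the application, which is the point
being made above.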
Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4
Note that I am amazed that the kernbench even worked. The results
without slub_debug were not good except for IA64. x86_64 and ppc64 both
blew up for a variety of reasons. The IA64 results were

KernBench Comparison
--------------------
                2.6.21-rc2-mm2-clean   2.6.21-rc2-mm2-slub    %diff
User CPU time                1084.64               1032.93    4.77%
System CPU time                73.38                 63.14   13.95%
Total CPU time               1158.02               1096.07    5.35%
Elapsed time                  307.00                285.62    6.96%

AIM9 Comparison
---------------
               2.6.21-rc2-mm2-clean   2.6.21-rc2-mm2-slub
 1 creat-clo              425460.75             438809.64    13348.89   3.14% File Creations and Closes/second
 2 page_test             2097119.26            3398259.27  1301140.01  62.04% System Allocations & Pages/second
 3 brk_test              7008395.33            6728755.72  -279639.61  -3.99% System Memory Allocations/second
 4 jmp_test             12226295.31           12254966.21    28670.90   0.23% Non-local gotos/second
 5 signal_test           1271126.28            1235510.96   -35615.32  -2.80% Signal Traps/second
 6 exec_test                 395.54                381.18      -14.36  -3.63% Program Loads/second
 7 fork_test               13218.23              13211.41       -6.82  -0.05% Task Creations/second
 8 link_test               64776.04               7488.13   -57287.91 -88.44% Link/Unlink Pairs/second

An example console log from x86_64 is below. It's not particularly clear
why it went blamo and I haven't had a chance all day to kick it around
for a bit due to a variety of other hilarity floating around.

Linux version 2.6.21-rc2-mm2-autokern1 ([EMAIL PROTECTED]) (gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)) #1 SMP Thu Mar 8 12:13:27 CST 2007
Command line: ro root=/dev/VolGroup00/LogVol00 rhgb console=tty0 console=ttyS1,19200 selinux=no autobench_args: root=30726124 ABAT:1173378546 loglevel=8
BIOS-provided physical RAM map:
 BIOS-e820: - 0009d400 (usable)
 BIOS-e820: 0009d400 - 000a (reserved)
 BIOS-e820: 000e - 0010 (reserved)
 BIOS-e820: 0010 - 3ffcddc0 (usable)
 BIOS-e820: 3ffcddc0 - 3ffd (ACPI data)
 BIOS-e820: 3ffd - 4000 (reserved)
 BIOS-e820: fec0 - 0001 (reserved)
Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Entering add_active_range(0, 256, 262093) 1 entries of 3200 used
end_pfn_map = 1048576
DMI 2.3 present.
ACPI: RSDP 000FDFC0, 0014 (r0 IBM )
ACPI: RSDT 3FFCFF80, 0034 (r1 IBMSERBLADE 1000 IBM 45444F43)
ACPI: FACP 3FFCFEC0, 0084 (r2 IBMSERBLADE 1000 IBM 45444F43)
ACPI: DSDT 3FFCDDC0, 1EA6 (r1 IBMSERBLADE 1000 INTL 2002025)
ACPI: FACS 3FFCFCC0, 0040
ACPI: APIC 3FFCFE00, 009C (r1 IBMSERBLADE 1000 IBM 45444F43)
ACPI: SRAT 3FFCFD40, 0098 (r1 IBMSERBLADE 1000 IBM 45444F43)
ACPI: HPET 3FFCFD00, 0038 (r1 IBMSERBLADE 1000 IBM 45444F43)
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 0 PXM 0 0-4000
Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Entering add_active_range(0, 256, 262093) 1 entries of 3200 used
NUMA: Using 63 for the hash shift.
Bootmem setup node 0 -3ffcd000
Node 0 memmap at 0x81003efcd000 size 16773952 first pfn 0x81003efcd000
sizeof(struct page) = 64
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1048576
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
    0:        0 ->      157
    0:      256 ->   262093
On node 0 totalpages: 261994
  DMA zone: 64 pages used for memmap
  DMA zone: 2017 pages reserved
  DMA zone: 1916 pages, LIFO batch:0
  DMA32 zone: 4031 pages used for memmap
  DMA32 zone: 253966 pages, LIFO batch:31
  Normal zone: 0 pages used for memmap
  Movable zone: 0 pages used for memmap
ACPI: PM-Timer IO Port: 0x2208
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
Processor #2
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
Processor #3
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0e] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 14, address 0xfec0, GSI 0-23
ACPI: IOAPIC (id[0x0d] address[0xfec1] gsi_base[24])
IOAPIC[1]: apic_id 13, address 0xfe
[PATCH 0/4 TRY#3] improve alternative instruction code and optimize get_cycles_sync
This series of patches extends the alternative instructions framework on
the i386 and x86_64 architectures to support two alternative instruction
replacements. This code is used together with the introduction of the
X86_FEATURE_SYNC_RDTSC flag on i386 to simplify and optimize the
get_cycles_sync() function. The optimization changes this function to
use RDTSCP instead of CPUID;RDTSC if this instruction is available.
Avoiding CPUID there is really important if the kernel runs as a KVM
guest, because this instruction is intercepted and causes an expensive
VMEXIT.

Changes to the previous submit:

 * rebased to current linus git tree
 * replaced RDTSCP usage in get_cycles_sync with the opcode to make it
   compile with older binutils

--
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG
Re: [PATCH 7/7] revoke: wire up s390 system calls
On Friday 09 March 2007, Pekka J Enberg wrote:
>
> From: Serge E. Hallyn <[EMAIL PROTECTED]>
>
> Make revokeat and frevoke system calls available to user-space on s390.
>
> Signed-off-by: Serge E. Hallyn <[EMAIL PROTECTED]>
> Signed-off-by: Pekka Enberg <[EMAIL PROTECTED]>

Looks good to me, but you really should go through Martin, since he
has an overview of what syscall numbers may already be assigned by
some other patch he has queued up.

	Arnd <><
[PATCH 1/4 TRY#3] i386: extend alternative instructions framework
From: Joerg Roedel <[EMAIL PROTECTED]> This patch extends the alternative instructions framework to support 2 alternative instructions. Signed-off-by: Joerg Roedel <[EMAIL PROTECTED]> -- Joerg Roedel Operating System Research Center AMD Saxony LLC & Co. KG diff --git a/arch/i386/kernel/alternative.c b/arch/i386/kernel/alternative.c index 9eca21b..59f1770 100644 --- a/arch/i386/kernel/alternative.c +++ b/arch/i386/kernel/alternative.c @@ -153,14 +153,23 @@ extern u8 __smp_alt_begin[], __smp_alt_end[]; void apply_alternatives(struct alt_instr *start, struct alt_instr *end) { struct alt_instr *a; - u8 *instr; + u8 *instr, *replacement; + u8 replacementlen; int diff; DPRINTK("%s: alt table %p -> %p\n", __FUNCTION__, start, end); for (a = start; a < end; a++) { - BUG_ON(a->replacementlen > a->instrlen); - if (!boot_cpu_has(a->cpuid)) + if (boot_cpu_has(a->cpuid)) { + replacement = a->replacement; + replacementlen = a->replacementlen; + } else if ((a->replacementlen2 > 0) && + (boot_cpu_has(a->cpuid2))) { + replacement = a->replacement2; + replacementlen = a->replacementlen2; + } else continue; + + BUG_ON(replacementlen > a->instrlen); instr = a->instr; #ifdef CONFIG_X86_64 /* vsyscall code is not mapped yet. resolve it manually. 
*/ @@ -170,9 +179,9 @@ void apply_alternatives(struct alt_instr *start, struct alt_instr *end) __FUNCTION__, a->instr, instr); } #endif - memcpy(instr, a->replacement, a->replacementlen); - diff = a->instrlen - a->replacementlen; - nop_out(instr + a->replacementlen, diff); + memcpy(instr, replacement, replacementlen); + diff = a->instrlen - replacementlen; + nop_out(instr + replacementlen, diff); } } diff --git a/include/asm-i386/alternative.h b/include/asm-i386/alternative.h index b8fa955..4a77e93 100644 --- a/include/asm-i386/alternative.h +++ b/include/asm-i386/alternative.h @@ -10,11 +10,14 @@ struct alt_instr { u8 *instr; /* original instruction */ u8 *replacement; + u8 *replacement2; u8 cpuid; /* cpuid bit set for replacement */ + u8 cpuid2; /* cpuid bit set for replacement2 */ u8 instrlen; /* length of original instruction */ u8 replacementlen; /* length of new instruction, <= instrlen */ - u8 pad; -}; + u8 replacementlen2; + u8 pad[3]; +} __attribute__ ((packed)); extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end); @@ -36,6 +39,12 @@ static inline void alternatives_smp_switch(int smp) {} #endif /* + * use this macro(s) if you need more than one output parameter + * in alternative_io_* + */ +#define ASM_OUTPUT2(a, b) a, b + +/* * Alternative instructions for different CPU types or capabilities. 
* * This allows to use optimized instructions even on generic binary @@ -53,9 +62,12 @@ static inline void alternatives_smp_switch(int smp) {} " .align 4\n"\ " .long 661b\n"/* label */ \ " .long 663f\n"/* new instruction */ \ + " .long 0x00\n" \ " .byte %c0\n" /* feature bit */ \ + " .byte 0x00\n" \ " .byte 662b-661b\n" /* sourcelen */ \ " .byte 664f-663f\n" /* replacementlen */ \ + " .byte 0x00\n" \ ".previous\n" \ ".section .altinstr_replacement,\"ax\"\n" \ "663:\n\t" newinstr "\n664:\n" /* replacement */\ @@ -77,14 +89,38 @@ static inline void alternatives_smp_switch(int smp) {} " .align 4\n"\ " .long 661b\n"/* label */ \ " .long 663f\n"/* new instruction */ \ + " .long 0x00\n" \ " .byte %c0\n" /* feature bit */ \ + " .byte 0x00\n" \ " .byte 662b-661b\n" /* sourcelen */ \ " .byte 664f-663f\n" /* replacementlen */ \ + " .byte 0x00\n" \ ".previ
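The selection order that the patched apply_alternatives() implements
(first replacement if its feature bit is set, otherwise the second one if
it exists and its bit is set, otherwise leave the original code alone) can
be modelled in user space roughly as follows. This is a sketch, not the
kernel code: struct alt_entry, cpu_has() and apply_one() are hypothetical
names, the capability bitmap is a plain array, and padding uses the
single-byte NOP 0x90 where the kernel's nop_out() picks longer NOPs.

```c
#include <stdint.h>
#include <string.h>

#define NOP_BYTE 0x90  /* single-byte x86 NOP, a simplification */

/* User-space model of the extended alt_instr record. */
struct alt_entry {
	uint8_t *instr;               /* original instruction bytes */
	const uint8_t *replacement;   /* first choice */
	const uint8_t *replacement2;  /* second choice, may be unused */
	uint8_t feature;              /* feature bit for replacement */
	uint8_t feature2;             /* feature bit for replacement2 */
	uint8_t instrlen;
	uint8_t replacementlen;
	uint8_t replacementlen2;      /* 0 means "no second replacement" */
};

/* Feature numbers are word*32+bit, as in cpufeature.h; caps has 8 words. */
static int cpu_has(const uint32_t *caps, uint8_t bit)
{
	return (caps[bit / 32] >> (bit % 32)) & 1;
}

/*
 * Patch one site. Returns 0 if left untouched, 1 or 2 for the replacement
 * used, -1 if the replacement does not fit (the kernel uses BUG_ON here).
 */
static int apply_one(struct alt_entry *a, const uint32_t *caps)
{
	const uint8_t *repl;
	uint8_t len;
	int which;

	if (cpu_has(caps, a->feature)) {
		repl = a->replacement;
		len = a->replacementlen;
		which = 1;
	} else if (a->replacementlen2 > 0 && cpu_has(caps, a->feature2)) {
		repl = a->replacement2;
		len = a->replacementlen2;
		which = 2;
	} else {
		return 0;
	}

	if (len > a->instrlen)
		return -1;
	memcpy(a->instr, repl, len);
	memset(a->instr + len, NOP_BYTE, a->instrlen - len);
	return which;
}
```

Note that the first replacement always wins when both feature bits are
set, so the order in which a caller lists the two alternatives matters.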
[PATCH 2/4 TRY#3] x86_64: changes to x86_64 architecture for alternative instruction improvements
From: Joerg Roedel <[EMAIL PROTECTED]> In this patch updates the x86_64 architecture to work with the changes to alternative instructions in i386 Signed-off-by: Joerg Roedel <[EMAIL PROTECTED]> -- Joerg Roedel Operating System Research Center AMD Saxony LLC & Co. KG diff --git a/arch/x86_64/lib/clear_page.S b/arch/x86_64/lib/clear_page.S index 9a10a78..ab525ee 100644 --- a/arch/x86_64/lib/clear_page.S +++ b/arch/x86_64/lib/clear_page.S @@ -53,7 +53,10 @@ ENDPROC(clear_page) .align 8 .quad clear_page .quad 1b + .quad 0 .byte X86_FEATURE_REP_GOOD + .byte 0 .byte .Lclear_page_end - clear_page .byte 2b - 1b + .byte 0 .previous diff --git a/arch/x86_64/lib/copy_page.S b/arch/x86_64/lib/copy_page.S index 727a5d4..b4d0329 100644 --- a/arch/x86_64/lib/copy_page.S +++ b/arch/x86_64/lib/copy_page.S @@ -113,7 +113,10 @@ ENDPROC(copy_page) .align 8 .quad copy_page .quad 1b + .quad 0 .byte X86_FEATURE_REP_GOOD + .byte 0 .byte .Lcopy_page_end - copy_page .byte 2b - 1b + .byte 0 .previous diff --git a/arch/x86_64/lib/copy_user.S b/arch/x86_64/lib/copy_user.S index 70bebd3..d505df3 100644 --- a/arch/x86_64/lib/copy_user.S +++ b/arch/x86_64/lib/copy_user.S @@ -27,9 +27,12 @@ .align 8 .quad 0b .quad 2b + .quad 0 .byte \feature /* when feature is set */ + .byte 0 .byte 5 .byte 5 + .byte 0 .previous .endm diff --git a/arch/x86_64/lib/memcpy.S b/arch/x86_64/lib/memcpy.S index 0ea0ddc..b1e1686 100644 --- a/arch/x86_64/lib/memcpy.S +++ b/arch/x86_64/lib/memcpy.S @@ -123,7 +123,10 @@ ENDPROC(__memcpy) .align 8 .quad memcpy .quad 1b + .quad 0 .byte X86_FEATURE_REP_GOOD + .byte 0 .byte .Lfinal - memcpy .byte 2b - 1b + .byte 0 .previous diff --git a/arch/x86_64/lib/memset.S b/arch/x86_64/lib/memset.S index 2c59481..566e179 100644 --- a/arch/x86_64/lib/memset.S +++ b/arch/x86_64/lib/memset.S @@ -127,7 +127,10 @@ ENDPROC(__memset) .align 8 .quad memset .quad 1b + .quad 0 .byte X86_FEATURE_REP_GOOD + .byte 0 .byte .Lfinal - memset .byte 2b - 1b + .byte 0 .previous diff --git 
a/include/asm-x86_64/alternative.h b/include/asm-x86_64/alternative.h index a6657b4..63cd8e5 100644 --- a/include/asm-x86_64/alternative.h +++ b/include/asm-x86_64/alternative.h @@ -10,11 +10,14 @@ struct alt_instr { u8 *instr; /* original instruction */ u8 *replacement; + u8 *replacement2; u8 cpuid; /* cpuid bit set for replacement */ + u8 cpuid2; /* cpuid bit set for replacement2 */ u8 instrlen; /* length of original instruction */ u8 replacementlen; /* length of new instruction, <= instrlen */ - u8 pad[5]; -}; + u8 replacementlen2; + u8 pad[3]; +} __attribute__ ((packed)); extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end); @@ -36,6 +39,12 @@ static inline void alternatives_smp_switch(int smp) {} #endif +/* + * use this macro(s) if you need more than one output parameter + * in alternative_io_* + */ +#define ASM_OUTPUT2(a, b) a, b + /* * Alternative instructions for different CPU types or capabilities. * @@ -54,9 +63,12 @@ static inline void alternatives_smp_switch(int smp) {} " .align 8\n" \ " .quad 661b\n"/* label */ \ " .quad 663f\n"/* new instruction */ \ + " .quad 0x00\n" \ " .byte %c0\n" /* feature bit */\ + " .byte 0x00\n" \ " .byte 662b-661b\n" /* sourcelen */ \ " .byte 664f-663f\n" /* replacementlen */ \ + " .byte 0x00\n" \ ".previous\n" \ ".section .altinstr_replacement,\"ax\"\n" \ "663:\n\t" newinstr "\n664:\n" /* replacement */ \ @@ -78,9 +90,12 @@ static inline void alternatives_smp_switch(int smp) {} " .align 8\n"\ " .quad 661b\n"/* label */ \ " .quad 663f\n"/* new instruction */ \ + " .quad 0x00\n" \ " .byte %c0\n" /* feature bit */ \ + " .byte 0x00\n" \ " .byte 662b-661b\n"
[PATCH 3/4 TRY#3] i386: add the X86_FEATURE_SYNC_RDTSC flag
From: Joerg Roedel <[EMAIL PROTECTED]> This patch adds the X86_FEATURE_SYNC_RDTSC to the i386 architecture. This is very helpfull to simplify the get_cycles_sync() function and remove the #ifdefs from it. Signed-off-by: Joerg Roedel <[EMAIL PROTECTED]> -- Joerg Roedel Operating System Research Center AMD Saxony LLC & Co. KG diff --git a/arch/i386/kernel/cpu/amd.c b/arch/i386/kernel/cpu/amd.c index 41cfea5..11f5730 100644 --- a/arch/i386/kernel/cpu/amd.c +++ b/arch/i386/kernel/cpu/amd.c @@ -241,6 +241,8 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c) if (cpuid_eax(0x8000) >= 0x8006) num_cache_leaves = 3; + + clear_bit(X86_FEATURE_SYNC_RDTSC, c->x86_capability); } static unsigned int __cpuinit amd_size_cache(struct cpuinfo_x86 * c, unsigned int size) diff --git a/arch/i386/kernel/cpu/intel.c b/arch/i386/kernel/cpu/intel.c index 56fe265..403a495 100644 --- a/arch/i386/kernel/cpu/intel.c +++ b/arch/i386/kernel/cpu/intel.c @@ -188,8 +188,11 @@ static void __cpuinit init_intel(struct cpuinfo_x86 *c) } #endif - if (c->x86 == 15) + if (c->x86 == 15) { set_bit(X86_FEATURE_P4, c->x86_capability); + set_bit(X86_FEATURE_SYNC_RDTSC, c->x86_capability); + } else + clear_bit(X86_FEATURE_SYNC_RDTSC, c->x86_capability); if (c->x86 == 6) set_bit(X86_FEATURE_P3, c->x86_capability); if ((c->x86 == 0xf && c->x86_model >= 0x03) || diff --git a/include/asm-i386/cpufeature.h b/include/asm-i386/cpufeature.h index 3f92b94..a9f1f01 100644 --- a/include/asm-i386/cpufeature.h +++ b/include/asm-i386/cpufeature.h @@ -75,6 +76,7 @@ #define X86_FEATURE_ARCH_PERFMON (3*32+11) /* Intel Architectural PerfMon */ #define X86_FEATURE_PEBS (3*32+12) /* Precise-Event Based Sampling */ #define X86_FEATURE_BTS(3*32+13) /* Branch Trace Store */ +#define X86_FEATURE_SYNC_RDTSC (3*32+14) /* RDTSC is serializing */ /* Intel-defined CPU features, CPUID level 0x0001 (ecx), word 4 */ #define X86_FEATURE_XMM3 (4*32+ 0) /* Streaming SIMD Extensions-3 */
[PATCH 4/4 TRY#3] optimize and simplify get_cycles_sync()
From: Joerg Roedel <[EMAIL PROTECTED]> This patch simplifies the get_cycles_sync() function by removing the #ifdefs from it. Further it introduces an optimization for AMD processors. There the RDTSCP instruction is used instead of CPUID;RDTSC which is helpfull if the kernel runs as a KVM guest. Running as a guest makes CPUID very expensive because it causes an intercept of the guest. Signed-off-by: Joerg Roedel <[EMAIL PROTECTED]> -- Joerg Roedel Operating System Research Center AMD Saxony LLC & Co. KG diff --git a/include/asm-i386/cpufeature.h b/include/asm-i386/cpufeature.h index 3f92b94..a9f1f01 100644 --- a/include/asm-i386/cpufeature.h +++ b/include/asm-i386/cpufeature.h @@ -49,6 +49,7 @@ #define X86_FEATURE_MP (1*32+19) /* MP Capable. */ #define X86_FEATURE_NX (1*32+20) /* Execute Disable */ #define X86_FEATURE_MMXEXT (1*32+22) /* AMD MMX extensions */ +#define X86_FEATURE_RDTSCP (1*32+27) /* RDTSCP */ #define X86_FEATURE_LM (1*32+29) /* Long Mode (x86-64) */ #define X86_FEATURE_3DNOWEXT (1*32+30) /* AMD 3DNow! extensions */ #define X86_FEATURE_3DNOW (1*32+31) /* 3DNow! */ diff --git a/include/asm-i386/tsc.h b/include/asm-i386/tsc.h index 84016ff..0b769ad 100644 --- a/include/asm-i386/tsc.h +++ b/include/asm-i386/tsc.h @@ -7,6 +7,7 @@ #define _ASM_i386_TSC_H #include +#include /* * Standard way to access the cycle counter. @@ -34,22 +35,16 @@ static inline cycles_t get_cycles(void) /* Like get_cycles, but make sure the CPU is synchronized. 
*/ static __always_inline cycles_t get_cycles_sync(void) { - unsigned long long ret; -#ifdef X86_FEATURE_SYNC_RDTSC - unsigned eax; + unsigned int a, d; - /* -* Don't do an additional sync on CPUs where we know -* RDTSC is already synchronous: -*/ - alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC, - "=a" (eax), "0" (1) : "ebx","ecx","edx","memory"); -#else - sync_core(); -#endif - rdtscll(ret); +#define RDTSCP ".byte 0x0f, 0x01, 0xf9" + alternative_io_two("cpuid\nrdtsc", + "rdtsc", X86_FEATURE_SYNC_RDTSC, + ".byte 0x0f, 0x01, 0xf9", X86_FEATURE_RDTSCP, + ASM_OUTPUT2("=a" (a), "=d" (d)), + "0" (1) : "ecx", "memory"); - return ret; + return ((unsigned long long)a) | (((unsigned long long)d)<<32); } extern void tsc_init(void);
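The return statement above assembles the 64-bit counter from the two
32-bit halves that RDTSC and RDTSCP leave in EAX and EDX. A stand-alone
sketch of just that step is shown below; the names tsc_from_halves and
rdtscp_opcode are made up for illustration, while the three opcode bytes
are the ones the patch emits for RDTSCP.

```c
#include <stdint.h>

/* Raw encoding of RDTSCP, spelled out for assemblers that lack the mnemonic. */
static const uint8_t rdtscp_opcode[] = { 0x0f, 0x01, 0xf9 };

/*
 * Combine the low half (EAX) and high half (EDX) into one 64-bit value.
 * Both operands are widened before the shift; shifting a 32-bit value
 * by 32 would be undefined behaviour in C.
 */
static inline uint64_t tsc_from_halves(uint32_t eax, uint32_t edx)
{
	return (uint64_t)eax | ((uint64_t)edx << 32);
}
```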
dev_printk and new-style class devices
Hi Greg, all,

As the new-style class devices (as opposed to old-style struct
class_device) are becoming more widely used, I noticed that the
dev_printk-based functions are not working properly with these.
New-style class devices have no driver nor bus, almost by definition,
and as a result dev_driver_string(), which is used as the first
parameter of dev_printk, resolves to an empty string. This causes
entries like the following to show in my logs:

 i2c-2: adapter [SMBus stub driver] registered

Notice the unaesthetical leading whitespace. In order to fix this
problem, I suggest that we extend dev_driver_string to deal with
new-style class devices:

Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
---
 drivers/base/core.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-2.6.21-rc3.orig/drivers/base/core.c	2007-02-28 09:48:19.0 +0100
+++ linux-2.6.21-rc3/drivers/base/core.c	2007-03-09 16:01:07.0 +0100
@@ -57,7 +57,8 @@ bool is_lanana_major(unsigned int major)
 const char *dev_driver_string(struct device *dev)
 {
 	return dev->driver ? dev->driver->name :
-			(dev->bus ? dev->bus->name : "");
+			(dev->bus ? dev->bus->name :
+			(dev->class ? dev->class->name : ""));
 }
 EXPORT_SYMBOL(dev_driver_string);

In the case above, the message in the logs now looks like:

i2c-adapter i2c-2: adapter [SMBus stub driver] registered

Which is much better IMHO. Greg, what do you think?

Thanks,
--
Jean Delvare
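The fallback chain proposed above (driver name, then bus name, then class
name, then the empty string) can be tried out in user space with a small
model. The struct and function names below are hypothetical stand-ins,
not the kernel's struct device; the member is called class_ only to keep
the sketch valid as C++ too.

```c
#include <stddef.h>

/* Minimal stand-ins: only the name of each object matters here. */
struct named {
	const char *name;
};

struct device_model {
	const struct named *driver;
	const struct named *bus;
	const struct named *class_;
};

/* Same precedence as the patched dev_driver_string(). */
static const char *driver_string(const struct device_model *dev)
{
	if (dev->driver)
		return dev->driver->name;
	if (dev->bus)
		return dev->bus->name;
	if (dev->class_)
		return dev->class_->name;
	return "";
}
```

With a device that has only a class named "i2c-adapter", the prefix
becomes "i2c-adapter" instead of the empty string that produced the
leading space in the log line.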
Re: [PATCH] Software Suspend: Fix suspend when console is in VT_AUTO/KD_GRAPHICS mode
On Fri, Mar 09, 2007 at 10:08:05AM +0100, Pavel Machek wrote:
> So... if current console is graphical, we leave X accessing the
> console... That's bad, because video state is not going to be
> restored...?

A graphical console is not necessarily X. Is there any requirement for there to be a single VT that isn't in text mode? The vt switching is a hack, we shouldn't make life difficult for people who have their own userspace code that's entirely capable of restoring video state on its own.

--
Matthew Garrett | [EMAIL PROTECTED]
Re: [PATCH 7/7] revoke: wire up s390 system calls
Hi Martin,

Martin Schwidefsky wrote:
> Yes, please put me or Heiko on CC if you add system calls to s390.

Ok, sorry about that. I would expect akpm to send it to you guys though whenever revoke graduates from -mm and not merge it to mainline.

Pekka
Re: [PATCH 7/7] revoke: wire up s390 system calls
On Fri, 2007-03-09 at 16:11 +0100, Arnd Bergmann wrote:
> > Make revokeat and frevoke system calls available to user-space on s390.
> >
> > Signed-off-by: Serge E. Hallyn <[EMAIL PROTECTED]>
> > Signed-off-by: Pekka Enberg <[EMAIL PROTECTED]>
>
> Looks good to me, but you really should go through Martin, since he
> has an overview of what syscall numbers may already be assigned
> by some other patch he has queued up.

Yes, please put me or Heiko on CC if you add system calls to s390.

--
blue skies,
  Martin

Martin Schwidefsky
Linux on zSeries Development
IBM Deutschland Entwicklung GmbH
Chairman of the Supervisory Board: Johann Weihen
Management: Herbert Kircher
Registered office: Böblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294

"Reality continues to ruin my life." - Calvin.
Re: [PATCH] Use more gcc extensions in the Linux headers
Rusty Russell <[EMAIL PROTECTED]> writes:
> __builtin_types_compatible_p() has been around since gcc 2.95, and we
> don't use it anywhere. This patch quietly fixes that.

Using BUILD_BUG_ON_ZERO() would have been somewhat cleaner.

-Andi
Re: Keyboard stops working after *lock [Was: 2.6.21-rc2-mm1]
On Fri, 9 Mar 2007, Jiri Kosina wrote:
> If this is present also in vanilla and not only in -mm, could you please
> try reverting commits 4237081e573b99a48991aa71364b0682c444651c and
> d4ae650a904612ffb7edd3f28b69b022988d2466 and let me know if the
> situation gets any better?

Hi Jiri,

or even better, does the patch below (against 2.6.21-rc3) fix the problem with your keyboard? I can see possibilities of report fields not aligned to a byte boundary, which might be causing the problems. (the original patch author added to cc)

Thanks.

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index f4ee1af..f571513 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -873,10 +873,6 @@ static void hid_output_field(struct hid_field *field, __u8 *data)
        unsigned size = field->report_size;
        unsigned n;

-       /* make sure the unused bits in the last byte are zeros */
-       if (count > 0 && size > 0)
-               data[(offset+count*size-1)/8] = 0;
-
        for (n = 0; n < count; n++) {
                if (field->logical_minimum < 0) /* signed values */
                        implement(data, offset + n * size, size, s32ton(field->value[n], size));
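One way the removed lines could misbehave with fields that are not byte-aligned can be shown in userspace. This is a sketch under assumptions, not the kernel code: `put_bits` is a hypothetical simplification of what `implement()` does, `write_field` only mimics the shape of `hid_output_field()`, and the field layout (a 3-bit field followed by a 5-bit field in the same byte) is made up for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for implement(): store the low `size` bits of
 * `value` starting at bit `offset`, least significant bit first.
 */
static void put_bits(uint8_t *data, unsigned offset, unsigned size,
                     uint32_t value)
{
    for (unsigned i = 0; i < size; i++) {
        unsigned bit = offset + i;
        uint8_t mask = (uint8_t)(1u << (bit % 8));

        if (value & (1u << i))
            data[bit / 8] |= mask;
        else
            data[bit / 8] &= (uint8_t)~mask;
    }
}

/*
 * Mimics the shape of hid_output_field(). With clear_last_byte set,
 * it also zeroes the byte holding the field's last bit, which is what
 * the lines removed by the patch did.
 */
static void write_field(uint8_t *data, unsigned offset, unsigned size,
                        unsigned count, const uint32_t *values,
                        int clear_last_byte)
{
    if (clear_last_byte && count > 0 && size > 0)
        data[(offset + count * size - 1) / 8] = 0;

    for (unsigned n = 0; n < count; n++)
        put_bits(data, offset + n * size, size, values[n]);
}
```

If field A occupies bits 0-2 and field B occupies bits 3-7 of the same byte, writing A and then B with the clearing step enabled wipes A's bits, because B's last byte is also A's byte. Without the clearing step both fields survive.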
RE: [PATCH] Software Suspend: Fix suspend when console is in VT_AUTO/KD_GRAPHICS mode
Matthew Garrett wrote:
> On Fri, Mar 09, 2007 at 10:08:05AM +0100, Pavel Machek wrote:
>
> > So... if current console is graphical, we leave X accessing the
> > console... That's bad, because video state is not going to be
> > restored...?
>
> A graphical console is not necessarily X. Is there any requirement for
> there to be a single VT that isn't in text mode? The vt switching is
> a hack, we shouldn't make life difficult for people who have their own
> userspace code that's entirely capable of restoring video state on its
> own.

The problem actually comes about when using Qtopia Phone Edition (QPE) on a PXA270. QPE puts the console into VT_AUTO+KD_GRAPHICS mode and writes directly to the framebuffer from then on. In this mode the kernel correctly disallows a console change, as QPE is not getting notification of a console change and thus does not know when to repaint the screen.

AFAIK, X uses VT_PROCESS+KD_GRAPHICS mode, so it gets notification of a change to and from the X console, thus it knows when to repaint the screen.

I think you can test this by changing the mode of a text console to KD_GRAPHICS using the KDSETMODE ioctl, then attempting to change to another text console using chvt.

--
Andrew
Re: [PATCH 2/7] revoke: add f_light flag for struct file
On Fri, Mar 09, 2007 at 12:13:35PM +0100, Eric Dumazet wrote:
> Then just drop the fget_light() 'optimisation' and always take a reference
> (atomic on f_count) regardless of single-thread or not. Instead of dirtying
> f_light, just do the straightforward thing and be with it.
>
> (that is : fget_light() = fget() = no more keeping fput_needed everywhere, and
> convoluted things in some dark sides of the kernel.

And it makes things rather slower for a lot of single threaded applications on modern systems. Yes, fget_light can be done much more cleanly, but please don't go around ripping out optimizations just because.

-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.
Re: [PATCH 7/7] revoke: wire up s390 system calls
On Fri, 2007-03-09 at 17:41 +0200, Pekka Enberg wrote:
> Martin Schwidefsky wrote:
> > Yes, please put me or Heiko on CC if you add system calls to s390.
>
> Ok, sorry about that. I would expect akpm to send it to you guys though
> whenever revoke graduates from -mm and not merge it to mainline.

Yes, but nobody is perfect. Even Andrew sometimes forgets to add people to CC who should know about "stuff". It would be nice if the CC-line is added from the start.

--
blue skies,
  Martin

Martin Schwidefsky
Linux on zSeries Development
IBM Deutschland Entwicklung GmbH
Chairman of the Supervisory Board: Johann Weihen
Management: Herbert Kircher
Registered office: Böblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294

"Reality continues to ruin my life." - Calvin.
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> "Catalin Marinas" <[EMAIL PROTECTED]> writes: > > I think it's only the pid_chain and rcu member that could be placed in > a list and kmemleak scans the memory for these two offsets as well. > I'll check those lists anyway but I doubt it's a more fundamental > problem with how kmemleak handles struct pid as I should've probably > got more reports. Right. I was pointing out the possibilities but because we do some tricky things. Mostly I was wondering about the hlist for the list of tasks. Now if a task is on that list we should have a struct pid_link pointing at our struct pid, so it shouldn't fool kmemleak but I'm still a little curious if all of those hlist_heads are NULL pointers. >> In most any other layer we cache pids indefinitely and a situation >> where we have a pointer to a struct pid with a ref count of 1 long >> after the process goes away is expected. > > Yes, indeed, but what kmemleak reports is that the pid structure > wasn't freed yet and there is no way to determine its pointer directly > or via container_of on members (by scanning the memory), hence it is > considered a leak. Yes that sounds like a leak. >> I don't understand your situation enough to guess what is going wrong >> yet. Hopefully I have given you enough information to get started. > > Yes, many thanks. I'll dig further and let you know. Thanks Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
"No handler for vector" patches don't work on some systems
So far I've tried the simple "survive having no handler for a vector" patch and the preliminary 3-patch series that was in -mm for a while, and neither works on the Dell PowerEdge 29xx and 19xx systems. These servers have the Intel 5000X chipset with the 6700PXH PCI Hub with dual independent PCI-X busses, each with its own I/OxAPIC with 24 interrupts. The fixes do work on "simple" systems but not on these high-end ones.
"No handler for vector" patches don't work on some systems
[sorry for the dup: this time to the right recipient]

So far I've tried the simple "survive having no handler for a vector" patch and the preliminary 3-patch series that was in -mm for a while, and neither works on the Dell PowerEdge 29xx and 19xx systems. These servers have the Intel 5000X chipset with the 6700PXH PCI Hub with dual independent PCI-X busses, each with its own I/OxAPIC with 24 interrupts. The fixes do work on "simple" systems but not on these high-end ones.
Re: [PATCH] chaostables
jimmy wrote on Fri, 09 Mar 2007 at 13:37 +0530:
> Alan Cox wrote:
> >> Also note that the word 'chaostables' does not even appear in the patch,
> >> though xt_CHAOS does. Since we know that {xt,ipt}_[A-Z]+ are targets, we
> >> can safely assume that CHAOS does what it says - make fun of nmap.
> >
> > "entropy" ?
> > "randomness"
>
> fuzztables?

confuztables!

Petr
Re: the usage of DEBUG_DRIVER seems ambiguous
Robert P. J. Day wrote:
> On Fri, 9 Mar 2007, Artem Bityutskiy wrote:
>> Randy Dunlap wrote:
>> > The ones in drivers/net/ are just local driver debug controls.
>> > They happen to have the same name as a (likely newer) kconfig symbol.
>> >
>> > Is there a real problem that needs to be fixed?
>>
>> Renaming them just for the sake of being less confusing makes sense.
...
> if someone wants to make a suggestion, i can submit a simple renaming
> patch.

If a driver or subsystem already uses a prefix to give its macros, functions, structs and so on a namespace of their own, a local DEBUG_DRIVER could become something like LOCALPREFIX_DEBUG. If the macro has a narrow use, e.g. to indicate a debug level, it could get a more descriptive name like LOCALPREFIX_DEBUG_LEVEL.

---

However, after looking at the actual occurrences of DEBUG_DRIVER, I see that this recommendation doesn't really apply that well.

From your initial post:
| $ grep -rw DEBUG_DRIVER *
| drivers/net/sunlance.c:#undef DEBUG_DRIVER

This is a forgotten remnant of earlier debug code. See here for evidence:
http://lxr.linux.no/source/drivers/net/sunlance.c?v=2.2.26#L791
791 #ifdef DEBUG_DRIVER
792         printk (KERN_DEBUG "Lance restart=%d\n", status);
793 #endif

This usage of DEBUG_DRIVER isn't there anymore. Therefore simply delete the remaining occurrence:
-#undef DEBUG_DRIVER

| drivers/net/a2065.c:#ifdef DEBUG_DRIVER
| drivers/net/a2065.c:#ifdef DEBUG_DRIVER

Rename to A2065_DEBUG or LANCE_DEBUG. Two more alternatives:
-#ifdef DEBUG_DRIVER
+#if 0 /* debug */
or
-#ifdef DEBUG_DRIVER
-       years_old_debug_cruft_nobody_enables_anymore();
-#endif
Needless to say, the maintainer certainly wants to ACK/NAK this.

| drivers/net/7990.c:#ifdef DEBUG_DRIVER
| drivers/net/7990.c:#ifdef DEBUG_DRIVER

Exactly like with a2065.c.

| drivers/base/Kconfig:config DEBUG_DRIVER

According to where it is defined, CONFIG_DEBUG_DRIVER should only occur in drivers/base/* (and some defconfigs outside of drivers/base/).

| ...

More hits from LXR:

drivers/isdn/hardware/eicon/dbgioctl.h
Where is this header file used anyway? Can the entire file be deleted?

drivers/isdn/gigaset/gigaset.h (definition as enum item)
drivers/isdn/gigaset/common.c (multiple uses)
If the overlap with CONFIG_DEBUG_DRIVER bothers you, rename it to DEBUG_DRIVER_STRUCT or whatever.

--
Stefan Richter
-=-=-=== --== --===
http://arcgraph.de/sr/
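The renaming suggested for a2065.c could look roughly like the following userspace sketch. Everything here is an assumption made for illustration: `a2065_dbg`, `a2065_debug_enabled` and `lance_restart_sketch` are hypothetical names, and `fprintf` stands in for the kernel's `printk`. The point is only that a driver-prefixed macro cannot be mistaken for a Kconfig symbol that lost its CONFIG_ prefix.

```c
#include <stdio.h>

/* Driver-local switch; uncomment to enable the debug output. */
/* #define A2065_DEBUG */

#ifdef A2065_DEBUG
#define a2065_dbg(...) fprintf(stderr, "a2065: " __VA_ARGS__)
#else
#define a2065_dbg(...) do { } while (0)
#endif

/* Reports whether the driver-local debug switch was compiled in. */
static int a2065_debug_enabled(void)
{
#ifdef A2065_DEBUG
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical call site, modelled on the old sunlance.c debug line. */
static int lance_restart_sketch(int status)
{
    a2065_dbg("Lance restart=%d\n", status);
    return status;
}
```

With the switch left commented out, the debug call compiles to nothing and the call site needs no #ifdef of its own.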
Re: [PATCH 7/7] revoke: wire up s390 system calls
Quoting Martin Schwidefsky ([EMAIL PROTECTED]):
> On Fri, 2007-03-09 at 17:41 +0200, Pekka Enberg wrote:
> > Martin Schwidefsky wrote:
> > > Yes, please put me or Heiko on CC if you add system calls to s390.
> >
> > Ok, sorry about that. I would expect akpm to send it to you guys though
> > whenever revoke graduates from -mm and not merge it to mainline.
>
> Yes, but nobody is perfect. Even Andrew sometimes forgets to add people
> to CC who should know about "stuff". It would be nice if the CC-line is
> added from the start.

Sorry, I should have cc:d you when I sent my testing patch to Pekka.

-serge
Re: passing function pointers through platform devices?
On Wednesday 07 March 2007 11:55 am, David Brownell wrote:
> > I'm developing an SPI- bus >MMC/SD block driver translation layer.
>
> Another one? There's already been significant work in that area. See for
> example
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=117000652529003&w=2

Nice, I'll build on that. My previous work ignored the SPI/MMC layers (because they didn't exist at the time) and just built a stacked driver on a character SPI driver. This gives me some direction as to how to proceed.

> Which admittedly didn't behave when I just put it onto my test rig,
> but seems nonetheless to be a significant step forward. It's not like
> everyone has hardware that can use such a driver after all!

True, but it's fairly common right now, until every microcontroller gets a hardware SD controller (which seems to be the trend).

> That's how it's done in that patch. The model being what the PXA MMC/SD
> card driver does, since that's the most generic model I found ... handling
> for example systems which need to poll for card detect, as well as ones
> that can use real gpio based IRQs. The mmc_spi driver doesn't need to know
> which kind of platform it's got.

Sounds good, thanks
NZG
Re: refcounting drivers' data structures used in sysfs buffers
On Fri, 9 Mar 2007, Oliver Neukum wrote:

> On Thursday, 8 March 2007 17:02, Alan Stern wrote:
> > On Thu, 8 Mar 2007, Oliver Neukum wrote:
> >
> > > Hi,
> > >
> > > after a lightning bolt from high above I've been looking into refcounting
> > > the data structures drivers use to provide the data used to refill sysfs
> > > buffers. I've come to the following conclusion.
> > >
> > > 1. struct sysfs_buffer must have a struct kref * and probably a destructor
> > > pointer
> > > 2. drivers must be able to pass these pointers through an extended
> > > device_create_file()
> > > 3. Drivers must use refcounting if they want to use attributes
> > > 4. read/write/poll must do refcounting
> > >
> > > I am not sure where to store the pointers. struct sysfs_dirent() looks
> > > like the obvious choice. Comments?
> >
> > Can you explain the reasoning that led to these conclusions? And what
> > exactly was your lightning bolt?
>
> The old race between disconnect and IO to attribute via sysfs again.
> If I cannot disassociate the drivers from the buffers in the buffers, drivers
> must not deallocate the data necessary to answer sysfs callbacks while
> a buffer exists.

Why wouldn't you be able to dissociate a driver from a buffer? That was the whole point of adding .orphan to sysfs_buffer and creating sysfs_buffer_collection -- it was supposed to solve exactly this race.

Alan Stern
Re: [RFC][PATCH 1/7] Resource counters
On Wed, Mar 07, 2007 at 10:19:05AM +0300, Pavel Emelianov wrote: > Balbir Singh wrote: > > Pavel Emelianov wrote: > >> Introduce generic structures and routines for > >> resource accounting. > >> > >> Each resource accounting container is supposed to > >> aggregate it, container_subsystem_state and its > >> resource-specific members within. > >> > >> > >> > >> > >> diff -upr linux-2.6.20.orig/include/linux/res_counter.h > >> linux-2.6.20-0/include/linux/res_counter.h > >> --- linux-2.6.20.orig/include/linux/res_counter.h2007-03-06 > >> 13:39:17.0 +0300 > >> +++ linux-2.6.20-0/include/linux/res_counter.h2007-03-06 > >> 13:33:28.0 +0300 > >> @@ -0,0 +1,83 @@ > >> +#ifndef __RES_COUNTER_H__ > >> +#define __RES_COUNTER_H__ > >> +/* > >> + * resource counters > >> + * > >> + * Copyright 2007 OpenVZ SWsoft Inc > >> + * > >> + * Author: Pavel Emelianov <[EMAIL PROTECTED]> > >> + * > >> + */ > >> + > >> +#include > >> + > >> +struct res_counter { > >> +unsigned long usage; > >> +unsigned long limit; > >> +unsigned long failcnt; > >> +spinlock_t lock; > >> +}; > >> + > >> +enum { > >> +RES_USAGE, > >> +RES_LIMIT, > >> +RES_FAILCNT, > >> +}; > >> + > >> +ssize_t res_counter_read(struct res_counter *cnt, int member, > >> +const char __user *buf, size_t nbytes, loff_t *pos); > >> +ssize_t res_counter_write(struct res_counter *cnt, int member, > >> +const char __user *buf, size_t nbytes, loff_t *pos); > >> + > >> +static inline void res_counter_init(struct res_counter *cnt) > >> +{ > >> +spin_lock_init(&cnt->lock); > >> +cnt->limit = (unsigned long)LONG_MAX; > >> +} > >> + > > > > Is there any way to indicate that there are no limits on this container. 
> > Yes - LONG_MAX is essentially a "no limit" value as no > container will ever have such many files :) -1 or ~0 is a viable choice for userspace to communicate 'infinite' or 'unlimited' > > LONG_MAX is quite huge, but still when the administrator wants to > > configure a container to *un-limited usage*, it becomes hard for > > the administrator. > > > >> +static inline int res_counter_charge_locked(struct res_counter *cnt, > >> +unsigned long val) > >> +{ > >> +if (cnt->usage <= cnt->limit - val) { > >> +cnt->usage += val; > >> +return 0; > >> +} > >> + > >> +cnt->failcnt++; > >> +return -ENOMEM; > >> +} > >> + > >> +static inline int res_counter_charge(struct res_counter *cnt, > >> +unsigned long val) > >> +{ > >> +int ret; > >> +unsigned long flags; > >> + > >> +spin_lock_irqsave(&cnt->lock, flags); > >> +ret = res_counter_charge_locked(cnt, val); > >> +spin_unlock_irqrestore(&cnt->lock, flags); > >> +return ret; > >> +} > >> + > > > > Will atomic counters help here. > > I'm afraid no. We have to atomically check for limit and alter > one of usage or failcnt depending on the checking result. Making > this with atomic_xxx ops will require at least two ops. Linux-VServer does the accounting with atomic counters, so that works quite fine, just do the checks at the beginning of whatever resource allocation and the accounting once the resource is acquired ... > If we'll remove failcnt this would look like >while (atomic_cmpxchg(...)) > which is also not that good. > > Moreover - in RSS accounting patches I perform page list > manipulations under this lock, so this also saves one atomic op. it still hasn't been shown that this kind of RSS limit doesn't add big time overhead to normal operations (inside and outside of such a resource container) note that the 'usual' memory accounting is much more lightweight and serves similar purposes ... 
best, Herbert > >> +static inline void res_counter_uncharge_locked(struct res_counter *cnt, > >> +unsigned long val) > >> +{ > >> +if (unlikely(cnt->usage < val)) { > >> +WARN_ON(1); > >> +val = cnt->usage; > >> +} > >> + > >> +cnt->usage -= val; > >> +} > >> + > >> +static inline void res_counter_uncharge(struct res_counter *cnt, > >> +unsigned long val) > >> +{ > >> +unsigned long flags; > >> + > >> +spin_lock_irqsave(&cnt->lock, flags); > >> +res_counter_uncharge_locked(cnt, val); > >> +spin_unlock_irqrestore(&cnt->lock, flags); > >> +} > >> + > >> +#endif > >> diff -upr linux-2.6.20.orig/init/Kconfig linux-2.6.20-0/init/Kconfig > >> --- linux-2.6.20.orig/init/Kconfig2007-03-06 13:33:28.0 +0300 > >> +++ linux-2.6.20-0/init/Kconfig2007-03-06 13:33:28.0 +0300 > >> @@ -265,6 +265,10 @@ config CPUSETS > >> > >>Say N if unsure. > >> > >> +config RESOURCE_COUNTERS > >> +bool > >> +select CONTAINERS > >> + > >> config SYSFS_DEPRECATED > >> bool "Create deprecated sysfs files" > >> default y > >> diff -upr linux-2.6.20.orig/kernel/Makefile > >> linux-2.6.20-0/kernel/Makefile > >> --- linux-2.6.20.or
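The exchange above turns on whether the limit check and the usage update can be done without the spinlock. The following userspace sketch shows one way the `while (atomic_cmpxchg(...))` idea Pavel mentions could look, using C11 atomics instead of the kernel's atomic API. It is an illustration under assumptions, not code from either patch set: `res_counter_sketch` and the `rc_*` functions are hypothetical names, and the sketch covers only usage, limit and failcnt, not the page-list manipulation Pavel does under the same lock. It also checks `val > limit` first, since the unsigned expression `limit - val` would otherwise wrap around for large `val`.

```c
#include <stdatomic.h>
#include <errno.h>

/* Hypothetical lock-free variant of the counter discussed above. */
struct res_counter_sketch {
    atomic_ulong usage;
    unsigned long limit;
    atomic_ulong failcnt;
};

static void rc_init(struct res_counter_sketch *c, unsigned long limit)
{
    atomic_init(&c->usage, 0);
    atomic_init(&c->failcnt, 0);
    c->limit = limit;
}

/* Charge val units; returns 0 on success, -ENOMEM if over the limit. */
static int rc_charge(struct res_counter_sketch *c, unsigned long val)
{
    unsigned long old = atomic_load(&c->usage);

    for (;;) {
        /* Test val > limit first so that limit - val cannot wrap. */
        if (val > c->limit || old > c->limit - val) {
            atomic_fetch_add(&c->failcnt, 1);
            return -ENOMEM;
        }
        /* On failure the CAS reloads `old` and the check is redone. */
        if (atomic_compare_exchange_weak(&c->usage, &old, old + val))
            return 0;
    }
}

/* Uncharge val units, clamping at zero like the original helper. */
static void rc_uncharge(struct res_counter_sketch *c, unsigned long val)
{
    unsigned long old = atomic_load(&c->usage);
    unsigned long dec;

    do {
        dec = val > old ? old : val;
    } while (!atomic_compare_exchange_weak(&c->usage, &old, old - dec));
}
```

As Pavel notes, this needs a retry loop plus a separate atomic operation for failcnt, so it is not obviously cheaper than taking the lock once; the sketch is only meant to make that trade-off concrete.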
Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4
On Fri, 9 Mar 2007, Mel Gorman wrote:

> I'm not sure what you mean by per-order queues. The buddy allocator already
> has per-order lists.

Somehow they do not seem to work right. SLAB (and now SLUB too) can avoid (or defer) fragmentation by keeping its own queues.
Re: [PATCH 1/2] rcfs core patch
> nobody actually cares about a precise accounting and > calculating shares or partitions of whatever resource, > all that matters is that you have a way to prevent a > potential hostile environment from sucking up all your > resources (or even a single one) resulting in a DoS This is not true. People care. Reasons: - resource planning - fairness - guarantees What you talk is about security only. Not the above issues. So good precision is required. If there is no precision at all, security sucks as well and can be exploited, e.g. for CPU schedulers doing an accounting based on jiffies accounting in scheduler_tick() it is easy to build an application consuming 90% of CPU, but ~0% from scheduler POV. Kirill - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] rcfs core patch
>>There have been various projects attempting to provide resource >>management support in Linux, including CKRM/Resource Groups and UBC. > > > let me note here, once again, that you forgot Linux-VServer > which does quite non-intrusive resource management ... Herbert, do you care to send patches except for ask others to do something that works for you? Looks like your main argument is non-intrusive... "working", "secure", "flexible" are not required to people any more? :/ >> Each had its own task-grouping mechanism. > > > the basic 'context' (pid space) is the grouping mechanism > we use for resource management too > > >>Paul Menage observed [1] that cpusets in the kernel already has a >>grouping mechanism which was working well for cpusets. He went ahead >>and generalized the grouping code in cpusets so that it could be used >>for overall resource management purpose. > > >>With his patches, it is possible to even create multiple hierarchies >>of groups (see [2] on why multiple hierarchies) as follows: > > > do we need or even want that? IMHO the hierarchical > concept CKRM was designed with, was also the reason > for it being slow, unuseable and complicated 1. cpusets are hierarchical already. So hierarchy is required. 2. As it was discussed on the call controllers which are flat can just prohibit creation of hierarchy on the filesystem. i.e. allow only 1 depth and continue being fast. >>mount -t container -o cpuset none /dev/cpuset <- cpuset hierarchy >>mount -t container -o mem,cpu none /dev/mem <- memory/cpu hierarchy >>mount -t container -o disk none /dev/disk <- disk hierarchy >> >>In each hierarchy, you can create task groups and manipulate the >>resource parameters of each group. You can also move tasks between >>groups at run-time (see [3] on why this is required). > > >>Each hierarchy is also manipulated independent of the other. 
> > >>Paul's patches also introduced a 'struct container' in the kernel, >>which serves these key purposes: >> >>- Task-grouping >> 'struct container' represents a task-group created in each hierarchy. >> So every directory created under /dev/cpuset or /dev/mem above will >> have a corresponding 'struct container' inside the kernel. All tasks >> pointing to the same 'struct container' are considered to be part of >> a group >> >> The 'struct container' in turn has pointers to resource objects which >> store actual resource parameters for that group. In above example, >> 'struct container' created under /dev/cpuset will have a pointer to >> 'struct cpuset' while 'struct container' created under /dev/disk will >> have pointer to 'struct disk_quota_or_whatever'. >> >>- Maintain hierarchical information >> The 'struct container' also keeps track of hierarchical relationship >> between groups. >> >>The filesystem interface in the patches essentially serves these >>purposes: >> >> - Provide an interface to manipulate task-groups. This includes >>creating/deleting groups, listing tasks present in a group and >>moving tasks across groups >> >> - Provdes an interface to manipulate the resource objects >>(limits etc) pointed to by 'struct container'. >> >>As you know, the introduction of 'struct container' was objected >>to and was felt redundant as a means to group tasks. Thats where I >>took a shot at converting over Paul Menage's patch to avoid 'struct >>container' abstraction and insead work with 'struct nsproxy'. > > > which IMHO isn't a step in the right direction, as > you will need to handle different nsproxies within > the same 'resource container' (see previous email) tend to agree. Looks like Paul's original patch was in the right way. [...] >>A separate filesystem would give us more flexibility like the >>implementing multi-hierarchy support described above. > > > why is the filesystem approach so favored for this > kind of manipulations? 
> > IMHO it is one of the worst interfaces I can imagine > (to move tasks between spaces and/or assign resources) > but yes, I'm aware that filesystems are 'in' nowadays I also hate filesystems approach being used nowdays everywhere. But, looks like there are reasons still: 1. cpusets already use fs interface. 2. each controller can have a bit of specific information/controls exported easily. Can you suggest any other extensible/flexible interface for these? Thanks, Kirill - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] rcfs core patch
Kirill, responding to Herbert:
> > do we need or even want that? IMHO the hierarchical
> > concept CKRM was designed with, was also the reason
> > for it being slow, unuseable and complicated
> 1. cpusets are hierarchical already. So hierarchy is required.

I think that CKRM has a harder time doing a hierarchy than cpusets.

CKRM is trying to account for and control how much of an amorphous resource is used, whereas cpusets is trying to control whether a specifically identifiable resource is used, or not used, not how much of it is used.

A child cpuset gets configured to allow certain CPUs and Nodes, and then does not need to dynamically pass back any information about what is actually used - it's a one-way control with no feedback. That's a relatively easier problem.

CKRM (as I recall it, from long ago ...) has to track the amount of usage dynamically, across parent and child groups (whatever they were called.) That's a harder problem.

So, yes, as Kirill observes, we need the hierarchy because cpusets has it, cpuset users make good use of the hierarchy, and the hierarchy works fine in that case, even if a hierarchy is more difficult for CKRM.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: [RFC] [Patch 1/1] IBAC Patch
On Thu, 2007-03-08 at 15:08 -0800, Randy Dunlap wrote: > On Thu, 08 Mar 2007 17:58:16 -0500 Mimi Zohar wrote: > > > This is a request for comments for a new Integrity Based Access > > Control(IBAC) LSM module which bases access control decisions > > on the new integrity framework services. > > > > (Hopefully this will help clarify the interaction between an LSM > > module and LIM module.) > > > > Index: linux-2.6.21-rc3-mm2/security/ibac/Kconfig > > === > > --- /dev/null > > +++ linux-2.6.21-rc3-mm2/security/ibac/Kconfig > > @@ -0,0 +1,36 @@ > > +config SECURITY_IBAC > > + boolean "IBAC support" > > + depends on SECURITY && SECURITY_NETWORK && INTEGRITY > > + help > > + Integrity Based Access Control(IBAC) implements integrity > > + based access control. > > Please make the help text do more than repeat the words I B A C... > Put a short explanation or say something like: > See Documentation/security/foobar.txt for more information. > (and add that file) Agreed. Perhaps something like: Integrity Based Access Control(IBAC) uses the Linux Integrity Module(LIM) API calls to verify an executable's metadata and data's integrity. Based on the results, execution permission is permitted/denied. Integrity providers may implement the LIM hooks differently. For more information on integrity verification refer to the specific integrity provider documentation. > > +config SECURITY_IBAC_BOOTPARAM > > + bool "IBAC boot parameter" > > + depends on SECURITY_IBAC > > + default y > > + help > > + This option adds a kernel parameter 'ibac', which allows IBAC > > + to be disabled at boot. If this option is selected, IBAC > > + functionality can be disabled with ibac=0 on the kernel > > + command line. The purpose of this option is to allow a > > + single kernel image to be distributed with IBAC built in, > > + but not necessarily enabled. > > + > > + If you are unsure how to answer this question, answer N. 
> > What's the downside to having this always builtin instead of > yet another config option? The ability of changing LSM modules at runtime might be perceived as problematic. > > +static struct security_operations ibac_security_ops = { > > + .bprm_check_security = ibac_bprm_check_security > > +}; > > + > > +static int __init init_ibac(void) > > +{ > > + int rc; > > + > > + if (!ibac_enabled) > > + return 0; > > + > > + rc = register_security(&ibac_security_ops); > > + if (rc != 0) > > + panic("IBAC: Unable to register with kernel\n"); > > Normally we would not want to see a panic() from a register_xyz() > failure, but I guess you are arguing that an ibac register_security() > failure needs to halt everything?? Yes, as this implies that another LSM module registered the hooks first, preventing IBAC from registering itself. Thank you for your other comments. They'll be addressed in the next ibac patch release. Mimi Zohar - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: should RTS init in serial core be tied to CRTSCTS
2007/3/8, Russell King <[EMAIL PROTECTED]>:
> ... which occurs /after/ userspace is up and running, when sysfs is
> available. So putting it in sysfs is reasonable.

Is sysfs the right place for serial settings, i.e. /sys/class/tty/ttySN/?

How far is it reasonable to split the termios settings into attributes?

1) /sys/class/tty/ttyS0/termios

2) /sys/class/tty/ttyS0/c_iflag
   /sys/class/tty/ttyS0/c_oflag
   /sys/class/tty/ttyS0/c_cflag
   /sys/class/tty/ttyS0/c_lflag
   /sys/class/tty/ttyS0/c_cc

3) /sys/class/tty/ttyS0/speed
   /sys/class/tty/ttyS0/eof
   /sys/class/tty/ttyS0/eon
   /sys/class/tty/ttyS0/erase
   and so on

-Oleksiy
Re: [RFC] [Patch 1/1] IBAC Patch
Quoting [EMAIL PROTECTED] ([EMAIL PROTECTED]): > On Thu, 08 Mar 2007 17:58:16 EST, Mimi Zohar said: > > This is a request for comments for a new Integrity Based Access > > Control(IBAC) LSM module which bases access control decisions > > on the new integrity framework services. > > > > (Hopefully this will help clarify the interaction between an LSM > > module and LIM module.) > > OK, between this and the additional LIM hooks I didn't notice in an earlier > patch, we're starting to see the API. The only problem is that although > it may be the right API for *your* code, I suspect it's a non-starter without > a discussion about whether it's the right *generic* API for an LIM (which will > require at least one dramatic bun fight about what "Integrity" means). Casey's earlier message suggested this too. 'Integrity' here in particular does not mean online integrity guarantees through, i.e., information flow control. So perhaps instead of 'integrity' we should make sure to always say 'integrity measurement'. Of course then there is already the 'integrity measurement architecture' which is only one implementation of a LIM module, right? So it would need to be renamed to TIMA (TPM-enabled IMA) or something I guess. -serge - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Complain about missing system calls.
David Woodhouse <[EMAIL PROTECTED]> writes: > Most system calls seem to get added to i386 first. This patch > automatically generates a warning for any new system call which is > implemented on i386 but not the architecture currently being compiled. > On PowerPC at the moment, for example, it results in these warnings: > init/missing_syscalls.h:935:3: warning: #warning syscall sync_file_range not > implemented > init/missing_syscalls.h:947:3: warning: #warning syscall getcpu not > implemented > init/missing_syscalls.h:950:3: warning: #warning syscall epoll_pwait not > implemented I think a better solution would be to finally switch to auto generated system call tables for newer system calls. The original reason why the architectures have different system call numbers -- compatibility with another "native" Unix -- is completely obsolete now. This leaves only minor differences of compat stub vs non compat stub and a few architecture specific calls. Of course the existing syscall numbers can't be changed, but for all new calls one could just add automatically for everybody. A global table with two entries (compat and non compat) and a per arch override table should be sufficient. Comments? -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
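The proposal above (a global table with a native and a compat entry per call, plus a per-arch override table) could look roughly like the following user-space sketch. Every name here is hypothetical (`struct syscall_entry`, `resolve_syscall`, the `demo_*` calls), calls are looked up by name rather than by number to keep the model short, and this is not how any existing kernel table is generated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long (*syscall_fn)(void);

/* One row: a native handler and a compat handler for the same call. */
struct syscall_entry {
    const char *name;
    syscall_fn native;
    syscall_fn compat; /* NULL in an override row means "keep the generic one" */
};

/* Stand-in handlers so the lookup result can be observed. */
long sys_demo(void)        { return 1; }
long compat_sys_demo(void) { return 2; }
long arch_sys_demo(void)   { return 3; }

/* Global table shared by every architecture (hypothetical contents). */
const struct syscall_entry generic_table[] = {
    { "demo_a", sys_demo, compat_sys_demo },
    { "demo_b", sys_demo, compat_sys_demo },
};

/* Per-arch table: this architecture replaces only the native demo_b. */
const struct syscall_entry arch_override_table[] = {
    { "demo_b", arch_sys_demo, NULL },
};

#define TABLE_LEN(t) (sizeof(t) / sizeof((t)[0]))

static const struct syscall_entry *
find_entry(const struct syscall_entry *tbl, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tbl[i].name, name) == 0)
            return &tbl[i];
    return NULL;
}

/* The override wins where it supplies a handler; otherwise fall back. */
syscall_fn resolve_syscall(const char *name, int compat)
{
    const struct syscall_entry *gen, *ovr;
    syscall_fn fn = NULL;

    assert(name != NULL);
    gen = find_entry(generic_table, TABLE_LEN(generic_table), name);
    ovr = find_entry(arch_override_table, TABLE_LEN(arch_override_table), name);

    if (ovr)
        fn = compat ? ovr->compat : ovr->native;
    if (!fn && gen)
        fn = compat ? gen->compat : gen->native;
    return fn;
}
```

In this model an architecture that needs nothing special ships an empty override table and picks up every new call automatically.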
Re: [PATCH] Complain about missing system calls.
On Fri, 2007-03-09 17:11:10 +0100, Andi Kleen <[EMAIL PROTECTED]> wrote: > David Woodhouse <[EMAIL PROTECTED]> writes: > > Most system calls seem to get added to i386 first. This patch > > automatically generates a warning for any new system call which is > > implemented on i386 but not the architecture currently being compiled. > > On PowerPC at the moment, for example, it results in these warnings: > > init/missing_syscalls.h:935:3: warning: #warning syscall sync_file_range > > not implemented > > init/missing_syscalls.h:947:3: warning: #warning syscall getcpu not > > implemented > > init/missing_syscalls.h:950:3: warning: #warning syscall epoll_pwait not > > implemented > > I think a better solution would be to finally switch to auto generated > system call tables for newer system calls. The original reason why the > architectures have different system call numbers -- compatibility with > another "native" Unix -- is completely obsolete now. This leaves only > minor differences of compat stub vs non compat stub and a few > architecture specific calls. > > Of course the existing syscall numbers can't be changed, but for all new > calls one could just add automatically for everybody. > > A global table with two entries (compat and non compat) and a per arch > override table should be sufficient. Not everybody has a simple indexed list of pointers :) For example, for vax-linux, we use a struct per syscall with the expected number of on-stack longwords for the call. So if something "new" is coming up, please keep in mind that it should be flexible enough to represent that. :) MfG, JBG -- Jan-Benedict Glaw [EMAIL PROTECTED] +49-172-7608481 Signature of: "really soon now": an unspecified period of time, likly to the second : be greater than any reasonable definition of "soon". signature.asc Description: Digital signature
Re: Possible "struct pid" leak from tty_io.c
Eric, For a longer explanation, see the second part of this e-mail. In short, the patch below seems to fix this particular leak. I'm not sure that's the correct/complete fix as I seem to still get a 2nd report. Any info is welcomed. diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c index e453268..4e33dc2 100644 --- a/drivers/char/tty_io.c +++ b/drivers/char/tty_io.c @@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work) } read_unlock(&tasklist_lock); + put_pid(tty->session); + put_pid(tty->pgrp); + tty->flags = 0; tty->session = NULL; tty->pgrp = NULL; On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: "Catalin Marinas" <[EMAIL PROTECTED]> writes: > The /sbin/init application calls sys_clone() a few times but only one > leak is reported (see below). Looking at the reported pid object (at > 0xc7c14500), count is 2 and nr is 296 but no process with pid 296 > exists any more. [...] > unreferenced object 0xc7c14500 (size 36): > comm "init", pid 245, jiffies 4294939289 > backtrace: >[] kmem_cache_alloc >[] alloc_pid >[] do_fork >[] sys_clone >[] ret_fast_syscall I think this is the path that all pid structures come from so unfortunately that doesn't help tracing this problem down. No, indeed, but that's the only thing kmemleak can report. Anyway, I got some more information now, after adding several printk's: The difference from other pid objects is that this one (with nr 296) is passed as a parameter to proc_set_tty(). The __proc_set_tty() function increments the pid->count twice via get_pid(), and, with two other get_pid calls, the pid->count for this object gets to 5 (1 being the initial value). The prints below are function name, struct pid address (different from the runs yesterday though), pid->nr and pid->count (after get_pid incrementing). 
It also show the return address and symbol (the calling function): alloc_pid: c7c149d8, 296, 1 get_pid: c7c149d8, 296, 2 return: c0122d64 (proc_set_tty+0x34/0x54) get_pid: c7c149d8, 296, 3 return: c0122d64 (proc_set_tty+0x34/0x54) get_pid: c7c149d8, 296, 4 return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid in disassociate_ctty but it is reported like this because of get_pid inlining get_pid: c7c149d8, 296, 5 return: c0124a0c (tty_vhangup+0x14/0x18) On the exit path (see below), however, put_pid is called twice before free_pid and once via release_task -> detach_pid -> free_pid -> ... -> __rcu_process_callbacks -> delayed_put_pid -> put_pid. Note that free_pid is called with pid->nr == 3 and the last put_pid gets called with nr == 3 as well (but it decrements it to 2 and that's what I find at that memory location). In the trace below, the pid->count is printed before put_pid modifies it: put_pid: c7c149d8, 296, 5 return: c0124b5c (disassociate_ctty+0x14c/0x230) put_pid: c7c149d8, 296, 4 return: c0124ba8 (disassociate_ctty+0x198/0x230) detach_pid: c7c149d8, 296, 3 return: c002a230 (release_task+0x1c0/0x358) detach_pid: c7c149d8, 296, 3 return: c002a248 (release_task+0x1d8/0x358) detach_pid: c7c149d8, 296, 3 return: c002a254 (release_task+0x1e4/0x358) free_pid: c7c149d8, 296, 3 return: c003a990 (detach_pid+0xac/0xc8) ... delayed_put_pid: c7c149d8, 296, 3 return: c003af68 (__rcu_process_callbacks+0x19c/0x25c) put_pid: c7c149d8, 296, 3 return: c003a8cc (delayed_put_pid+0x54/0x6c) In the above disassociate_ctty() function the code below (line 1542) doesn't seem to get called: tty = get_current_tty(); if (tty) { put_pid(tty->session); put_pid(tty->pgrp); tty->session = NULL; tty->pgrp = NULL; } else { and I get the following error if TTY_DEBUG_HANGUP is defined - "error attempted to write to tty [0x] = NULL". 
It looks like the tty_vhangup() call in in disassociate_ctty() sets current->signal->tty to NULL in the do_each_pid_task loop in do_tty_hangup (p->signal->tty = NULL). The second call to get_current_tty() in disassociate_ctty() return NULL and therefore no put_pid on tty->session and tty->pgrp (which are also set to NULL in the previous function). Regards. -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
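The leak described above can be reproduced in miniature with a plain counter. The sketch below is a user-space model only: `struct pid_model`, `struct tty_model` and the two hangup variants are hypothetical stand-ins, the counter is not atomic, and the object is never actually freed. It shows why clearing tty->session and tty->pgrp without a matching put leaves two references behind, and how the two added put_pid() calls from the patch balance them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pid: just a reference count. */
struct pid_model {
    int count;
};

struct pid_model *get_pid_model(struct pid_model *pid)
{
    if (pid)
        pid->count++;
    return pid;
}

void put_pid_model(struct pid_model *pid)
{
    if (pid) {
        assert(pid->count > 0);
        pid->count--;
    }
}

/* Hypothetical stand-in for the two pid pointers a tty holds. */
struct tty_model {
    struct pid_model *session;
    struct pid_model *pgrp;
};

/* Mirrors the two get_pid() calls seen under proc_set_tty in the trace. */
void tty_model_set(struct tty_model *tty, struct pid_model *pid)
{
    tty->session = get_pid_model(pid);
    tty->pgrp = get_pid_model(pid);
}

/* Hangup as reported: pointers cleared, references never dropped. */
void tty_model_hangup_leaky(struct tty_model *tty)
{
    tty->session = NULL;
    tty->pgrp = NULL;
}

/* Hangup with the proposed fix: drop both references first. */
void tty_model_hangup_fixed(struct tty_model *tty)
{
    put_pid_model(tty->session);
    put_pid_model(tty->pgrp);
    tty->session = NULL;
    tty->pgrp = NULL;
}
```

Once the pointers are NULL, a later `if (tty)`-style cleanup path has nothing left to put, which matches the observation that the block in disassociate_ctty() is never reached.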
Re: refcounting drivers' data structures used in sysfs buffers
On Friday, 9 March 2007 17:32, Alan Stern wrote:
> On Fri, 9 Mar 2007, Oliver Neukum wrote:
>
> > On Thursday, 8 March 2007 17:02, Alan Stern wrote:
> > > On Thu, 8 Mar 2007, Oliver Neukum wrote:
> > >
> > > > Hi,
> > > >
> > > > after a lightning bolt from high above I've been looking into
> > > > refcounting the data structures drivers use to provide the data
> > > > used to refill sysfs buffers. I've come to the following conclusion.
> > > >
> > > > 1. struct sysfs_buffer must have a struct kref * and probably a
> > > >    destructor pointer
> > > > 2. drivers must be able to pass these pointers through an extended
> > > >    device_create_file()
> > > > 3. Drivers must use refcounting if they want to use attributes
> > > > 4. read/write/poll must do refcounting
> > > >
> > > > I am not sure where to store the pointers. struct sysfs_dirent() looks
> > > > like the obvious choice. Comments?
> > >
> > > Can you explain the reasoning that led to these conclusions? And what
> > > exactly was your lightning bolt?
> >
> > The old race between disconnect and IO to attribute via sysfs again.
> > If I cannot disassociate the drivers from the buffers in the buffers,
> > drivers must not deallocate the data necessary to answer sysfs callbacks
> > while a buffer exists.
>
> Why wouldn't you be able to dissociate a driver from a buffer? That was
> the whole point of adding .orphan to sysfs_buffer and creating
> sysfs_buffer_collection -- it was supposed to solve exactly this race.

It did solve the race but deadlocked when unbinding devices through sysfs.
Linus therefore asked for the patch to be reverted and wants the issue
solved with refcounting.

	Regards
		Oliver
Re: "No handler for vector" patches don't work on some systems
Chuck Ebbert <[EMAIL PROTECTED]> writes: > [sorry for the dup: this time to the right recipient] > > So far I've tried the simple "survive having no handler > for a vector" patch and the preliminary 3-patch series > that was in -mm for a while, and neither work on the > Dell PowerEdge 29xx and 19xx systems. These servers > have the Intel 5000X chipset with the 6700PXH PCI Hub > with dual independent PCI-X busses, each with its own > I/OxAPIC with 24 interrupts. The fixes do work on > "simple" systems but not on these high-end ones. Ok thanks for the report. It sounds like there is another cause for the problem in the Dell case. The simple patch drops the interrupt handler but acknowledges the hardware so if the driver can survive missing an interrupt we should be ok. With level triggered interrupts this should pretty much be guaranteed as after the acknowledgement the unhandled interrupt will be refired. One of my internal test systems had a 6700PXH PCI hub (at least I think that was the part) the E7520 chipset. So I don't think it is just a matter of the hardware. Although I do recall Intel having an errata out on that class of hardware for occasionally reordering interrupt messages with the end of interrupt coming before the interrupt message itself. Causing various things to get confused. It would not surprise me if we were tickling some errata like that. I would very much like to know if what I merged linus's tree helps. It is a little more conservative, than my earlier patches. I need a way to reproduce this or to work closely with someone who is, because this sounds like it has a different cause and I need to start with that assumption. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Keyboard stops working after *lock [Was: 2.6.21-rc2-mm1]
On 3/9/07, Jiri Kosina <[EMAIL PROTECTED]> wrote:
> On Fri, 9 Mar 2007, Jiri Kosina wrote:
>
> > If this is present also in vanilla and not only in -mm, could you please
> > try reverting commits 4237081e573b99a48991aa71364b0682c444651c and
> > d4ae650a904612ffb7edd3f28b69b022988d2466 and let me know if the
> > situation gets any better?
>
> Hi Jiri,

Hi.

> or even better, does the patch below (against 2.6.21-rc3) fix the problem
> with your keyboard? I can see possibilities of report fields unaligned to
> the byte boundary, which this might be causing problems.

I'll try it all.

I don't know if this is related, but my notebook keyboard doesn't emit
numbers with numlock (not even directly Fn+blue number) anymore with -rc3
(note that LED is flashing when numlock is on). I think -rc2 worked fine
(I'm going to check this too). It's Asus M6R, similar (except wi-fi) to for
example yenya's model here: http://www.fi.muni.cz/~kas/m6r/

thanks,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E
Re: [RFC][PATCH 2/7] RSS controller core
On Tue, Mar 06, 2007 at 02:00:36PM -0800, Andrew Morton wrote: > On Tue, 06 Mar 2007 17:55:29 +0300 > Pavel Emelianov <[EMAIL PROTECTED]> wrote: > > > +struct rss_container { > > + struct res_counter res; > > + struct list_head page_list; > > + struct container_subsys_state css; > > +}; > > + > > +struct page_container { > > + struct page *page; > > + struct rss_container *cnt; > > + struct list_head list; > > +}; > > ah. This looks good. I'll find a hunk of time to go through this work > and through Paul's patches. It'd be good to get both patchsets lined > up in -mm within a couple of weeks. But.. doesn't look so good for me, mainly becaus of the additional per page data and per page processing on 4GB memory, with 100 guests, 50% shared for each guest, this basically means ~1mio pages, 500k shared and 1500k x sizeof(page_container) entries, which roughly boils down to ~25MB of wasted memory ... increase the amount of shared pages and it starts getting worse, but maybe I'm missing something here > We need to decide whether we want to do per-container memory > limitation via these data structures, or whether we do it via a > physical scan of some software zone, possibly based on Mel's patches. why not do simple page accounting (as done currently in Linux) and use that for the limits, without keeping the reference from container to page? best, Herbert > ___ > Containers mailing list > [EMAIL PROTECTED] > https://lists.osdl.org/mailman/listinfo/containers - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
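The overhead estimate above can be checked with a few lines of arithmetic. The sketch below assumes 4 KiB pages and takes the 1500k entry count straight from the mail; the helper names are hypothetical. The only structural fact used is that struct page_container, as quoted, holds two pointers plus a list_head (itself two pointers), so four pointers per entry.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages in a given amount of memory. */
uint64_t page_count(uint64_t mem_bytes, uint64_t page_size)
{
    assert(page_size != 0);
    return mem_bytes / page_size;
}

/*
 * Size of one page_container as quoted in the mail: struct page *,
 * struct rss_container *, and a list_head (next + prev), so four
 * pointers, ignoring any padding.
 */
uint64_t page_container_size(uint64_t ptr_size)
{
    return 4 * ptr_size;
}

/* Bytes spent on tracking entries alone. */
uint64_t tracking_overhead(uint64_t entries, uint64_t ptr_size)
{
    return entries * page_container_size(ptr_size);
}
```

With 4-byte pointers, 1500k entries come to 24,000,000 bytes, in line with the "~25MB" quoted above; with 8-byte pointers the same entry count doubles to 48,000,000 bytes.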
Re: [PATCH] z85230: Fix FIFO handling
Alan Cox wrote:
> We must exit immediately on a FIFO fill not take the end of packet path
> otherwise each underrun in PIO transmit mode causes a runt packet and the
> data is lost.
>
> Signed-off-by: Alan Cox <[EMAIL PROTECTED]>

applied
Re: Possible "struct pid" leak from tty_io.c
On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: "Catalin Marinas" <[EMAIL PROTECTED]> writes: > On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> "Catalin Marinas" <[EMAIL PROTECTED]> writes: > > I think it's only the pid_chain and rcu member that could be placed in > a list and kmemleak scans the memory for these two offsets as well. > I'll check those lists anyway but I doubt it's a more fundamental > problem with how kmemleak handles struct pid as I should've probably > got more reports. Right. I was pointing out the possibilities but because we do some tricky things. Mostly I was wondering about the hlist for the list of tasks. Now if a task is on that list we should have a struct pid_link pointing at our struct pid, so it shouldn't fool kmemleak but I'm still a little curious if all of those hlist_heads are NULL pointers. Yes, all the 3 hlist_head tasks are NULL pointers on the reported object. -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i2c-core: i2c bitbang gpio structure
Hi Bryan, On Fri, 09 Mar 2007 18:13:21 +0800, Wu, Bryan wrote: > Hi folks, > > A new structure is added to i2c-core for GPIO-based I2C interface > adapter. My latest GPIO based I2C adapter driver for Blackfin system > will use this stuff. And also IXP4XX GPIO based I2C driver can also be > moved to this. > > Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> > --- > include/linux/i2c.h | 20 > 1 file changed, 20 insertions(+) > > Index: include/linux/i2c.h > === > --- include/linux/i2c.h (revision 2813) > +++ include/linux/i2c.h (working copy) > @@ -201,6 +201,26 @@ struct i2c_algorithm { > }; > > /* > + * Some chips do not have an I2C unit, so GPIO lines are just used to > + * Used as platform_data to provide GPIO pin information to this kind GPIO > + * based I2C driver. > + */ > +struct i2c_bitbang_gpio { > + int sda; > + int scl; > +}; Why would this be included in the generic i2c.h header file? As far as I can see this structure only makes sense for bit-banged I2C busses, so this structure should be declared in i2c-algo-bit.h. Also, this structure alone isn't very useful. I'm waiting to see drivers actually making use of it before I will consider merging this patch at all. > + > +static inline int i2c_bitbang_gpio_sda(struct i2c_bitbang_gpio *gpio) > +{ > + return (gpio->sda); > +} > + > +static inline int i2c_bitbang_gpio_scl(struct i2c_bitbang_gpio *gpio) > +{ > + return (gpio->scl); > +} What's the point of these? -- Jean Delvare - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: refcounting drivers' data structures used in sysfs buffers
On 3/9/07, Oliver Neukum <[EMAIL PROTECTED]> wrote: Am Freitag, 9. März 2007 17:32 schrieb Alan Stern: > On Fri, 9 Mar 2007, Oliver Neukum wrote: > > > Am Donnerstag, 8. März 2007 17:02 schrieb Alan Stern: > > > On Thu, 8 Mar 2007, Oliver Neukum wrote: > > > > > > > Hi, > > > > > > > > after a lightning bolt from high above I've been looking into refcounting > > > > the data structures drivers use to provide the data used to refill sysfs > > > > buffers. I've come to the following conclusion. > > > > > > > > 1. struct sysfs_buffer must have a struct kref * and probably a destructor > > > > pointer > > > > 2. drivers must be able to pass these pointers through an extended > > > > device_create_file() > > > > 3. Drivers must use refcounting if they want to use attributes > > > > 4. read/write/poll must do refcounting > > > > > > > > I am not sure where to store the pointers. struct sysfs_dirent() looks > > > > like the obvious choice. Comments? > > > > > > Can you explain the reasoning that led to these conclusions? And what > > > exactly was your lightning bolt? > > > > The old race between disconnect and IO to attribute via sysfs again. > > If I cannot disassociate the drivers from the buffers in the buffers, drivers > > must not deallocate the data necessary to answer sysfs callbacks while > > a buffer exists. > > Why wouldn't you be able to dissociate a driver from a buffer? That was > the whole point of adding .orphan to sysfs_buffer and creating > sysfs_buffer_collection -- it was supposed to solve exactly this race. It did solve the race but deadlocked when unbinding devices through sysfs. Linux therefore asked for the patch to be reverted and wants the isue solved with refcounting. I think we already have all refcounting that is needed. What is missing is subsystem-provided ->release() hooks for drivers to release driver-specific resources when a device finally goes away. 
-- 
Dmitry
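The idea discussed in this thread, that driver data must outlive any open sysfs buffer and is torn down by a release hook when the last user goes away, can be sketched in user space as follows. This is a minimal model, not the kernel's kref API: `struct ref_model`, `struct drv_data_model` and the helpers are hypothetical, the counter is a plain int rather than an atomic, and the release hook only sets a flag instead of freeing memory so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference counter with a release hook. */
struct ref_model {
    int count;
    void (*release)(struct ref_model *ref);
};

void ref_model_init(struct ref_model *ref,
                    void (*release)(struct ref_model *ref))
{
    ref->count = 1; /* the creator (e.g. probe) holds the first reference */
    ref->release = release;
}

void ref_model_get(struct ref_model *ref)
{
    assert(ref->count > 0);
    ref->count++;
}

/* Returns 1 if this put dropped the last reference and ran release(). */
int ref_model_put(struct ref_model *ref)
{
    assert(ref->count > 0);
    if (--ref->count == 0) {
        ref->release(ref);
        return 1;
    }
    return 0;
}

/* Hypothetical driver-private data that a sysfs attribute reads from. */
struct drv_data_model {
    struct ref_model ref;
    int released;
};

/* Release hook: recover the containing structure and tear it down. */
void drv_data_release(struct ref_model *ref)
{
    struct drv_data_model *data = (struct drv_data_model *)
        ((char *)ref - offsetof(struct drv_data_model, ref));

    data->released = 1; /* a real driver would free its resources here */
}
```

In this model an open attribute buffer takes a reference, disconnect drops the creator's reference, and the data survives until the buffer is closed, at which point the hook runs exactly once.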
Re: [PATCH 2/7] revoke: add f_light flag for struct file
On Fri, Mar 09, 2007 at 12:13:35PM +0100, Eric Dumazet wrote: > Then just drop the fget_light() 'optimisation' and always take a reference > (atomic on f_count) regardless of single-thread or not. Instead of dirtying > f_light, just do the straightforward thing and be with it. > > (that is : fget_light() = fget() = no more keeping fput_needed everywhere, and > convoluted things in some dark sides of the kernel. On 3/9/07, Benjamin LaHaise <[EMAIL PROTECTED]> wrote: And it makes things rather slower for a lot of single threaded applications on modern systems. Yes, fget_light can be done much more cleanly, but please don't go around ripping out optimizations just because. Don't worry, the fget_light() bits are no longer needed: http://lkml.org/lkml/2007/3/9/151 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
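For readers unfamiliar with the optimisation being argued over, here is a user-space sketch of the idea behind fget_light(): when the file table has a single user, nobody else can close the descriptor underneath the caller, so the reference count bump can be skipped, and the caller is told through fput_needed whether a matching drop is required. All names are hypothetical models (`struct files_model`, `fget_light_model`, `fput_light_model`); the real kernel code uses atomic counters and RCU, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_FDS 4

/* Hypothetical stand-in for struct file: only the reference count. */
struct file_model {
    int f_count;
};

/* Hypothetical stand-in for the per-process file table. */
struct files_model {
    int users; /* how many tasks share this table */
    struct file_model *fd[MODEL_MAX_FDS];
};

/*
 * Look up a descriptor. If the table is shared, take a reference and
 * set *fput_needed; if it has a single user, skip the reference.
 */
struct file_model *fget_light_model(struct files_model *files,
                                    unsigned int fd, int *fput_needed)
{
    struct file_model *file;

    *fput_needed = 0;
    if (fd >= MODEL_MAX_FDS)
        return NULL;

    file = files->fd[fd];
    if (file && files->users != 1) {
        file->f_count++;
        *fput_needed = 1;
    }
    return file;
}

/* Drop the reference only if fget_light_model() actually took one. */
void fput_light_model(struct file_model *file, int fput_needed)
{
    assert(!fput_needed || file != NULL);
    if (fput_needed)
        file->f_count--;
}
```

The cost of the trick is visible in the signature: every caller has to carry fput_needed from the lookup to the release, which is the bookkeeping the first mail above wants to remove.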
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > Eric, > > For a longer explanation, see the second part of this e-mail. In > short, the patch below seems to fix this particular leak. I'm not sure > that's the correct/complete fix as I seem to still get a 2nd report. > Any info is welcomed. Sure. I was starting to suspect that location myself. > diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c > index e453268..4e33dc2 100644 > --- a/drivers/char/tty_io.c > +++ b/drivers/char/tty_io.c > @@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work) > } > read_unlock(&tasklist_lock); > > + put_pid(tty->session); > + put_pid(tty->pgrp); > + > tty->flags = 0; > tty->session = NULL; > tty->pgrp = NULL; > > On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> "Catalin Marinas" <[EMAIL PROTECTED]> writes: >> > The /sbin/init application calls sys_clone() a few times but only one >> > leak is reported (see below). Looking at the reported pid object (at >> > 0xc7c14500), count is 2 and nr is 296 but no process with pid 296 >> > exists any more. > [...] >> > unreferenced object 0xc7c14500 (size 36): >> > comm "init", pid 245, jiffies 4294939289 >> > backtrace: >> >[] kmem_cache_alloc >> >[] alloc_pid >> >[] do_fork >> >[] sys_clone >> >[] ret_fast_syscall >> >> I think this is the path that all pid structures come from so >> unfortunately that doesn't help tracing this problem down. > > No, indeed, but that's the only thing kmemleak can report. Anyway, I > got some more information now, after adding several printk's: > > The difference from other pid objects is that this one (with nr 296) > is passed as a parameter to proc_set_tty(). The __proc_set_tty() > function increments the pid->count twice via get_pid(), and, with two > other get_pid calls, the pid->count for this object gets to 5 (1 being > the initial value). 
The prints below are function name, struct pid > address (different from the runs yesterday though), pid->nr and > pid->count (after get_pid incrementing). It also show the return > address and symbol (the calling function): > > alloc_pid: c7c149d8, 296, 1 > get_pid: c7c149d8, 296, 2 >return: c0122d64 (proc_set_tty+0x34/0x54) > get_pid: c7c149d8, 296, 3 >return: c0122d64 (proc_set_tty+0x34/0x54) > get_pid: c7c149d8, 296, 4 >return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid > in disassociate_ctty but it is reported like this because of get_pid > inlining > get_pid: c7c149d8, 296, 5 >return: c0124a0c (tty_vhangup+0x14/0x18) > > On the exit path (see below), however, put_pid is called twice before > free_pid and once via release_task -> detach_pid -> free_pid -> ... -> > __rcu_process_callbacks -> delayed_put_pid -> put_pid. Note that > free_pid is called with pid->nr == 3 and the last put_pid gets called > with nr == 3 as well (but it decrements it to 2 and that's what I find > at that memory location). In the trace below, the pid->count is > printed before put_pid modifies it: > > put_pid: c7c149d8, 296, 5 >return: c0124b5c (disassociate_ctty+0x14c/0x230) > put_pid: c7c149d8, 296, 4 >return: c0124ba8 (disassociate_ctty+0x198/0x230) > detach_pid: c7c149d8, 296, 3 >return: c002a230 (release_task+0x1c0/0x358) > detach_pid: c7c149d8, 296, 3 >return: c002a248 (release_task+0x1d8/0x358) > detach_pid: c7c149d8, 296, 3 >return: c002a254 (release_task+0x1e4/0x358) > free_pid: c7c149d8, 296, 3 >return: c003a990 (detach_pid+0xac/0xc8) > ... 
> delayed_put_pid: c7c149d8, 296, 3 >return: c003af68 (__rcu_process_callbacks+0x19c/0x25c) > put_pid: c7c149d8, 296, 3 >return: c003a8cc (delayed_put_pid+0x54/0x6c) > > In the above disassociate_ctty() function the code below (line 1542) > doesn't seem to get called: > > tty = get_current_tty(); > if (tty) { > put_pid(tty->session); > put_pid(tty->pgrp); > tty->session = NULL; > tty->pgrp = NULL; > } else { > > and I get the following error if TTY_DEBUG_HANGUP is defined - "error > attempted to write to tty [0x] = NULL". > > It looks like the tty_vhangup() call in in disassociate_ctty() sets > current->signal->tty to NULL in the do_each_pid_task loop in > do_tty_hangup (p->signal->tty = NULL). The second call to > get_current_tty() in disassociate_ctty() return NULL and therefore no > put_pid on tty->session and tty->pgrp (which are also set to NULL in > the previous function). Thanks. If I can manage to focus on this, it looks like the information I need to start fixing this. Adding the reference counting when we didn't have any before is always interesting. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Keyboard stops working after *lock [Was: 2.6.21-rc2-mm1]
On 3/9/07, Jiri Slaby <[EMAIL PROTECTED]> wrote:
> I don't know if this is related, but my notebook keyboard doesn't emit
> numbers with numlock (not even directly Fn+blue number) anymore with -rc3
> (note that LED is flashing when numlock is on). I think -rc2 worked fine
> (I'm going to check this too). It's Asus M6R, similar (except wi-fi) to for
> example yenya's model here: http://www.fi.muni.cz/~kas/m6r/

Ignore this, it's deus ex machina, it works now.

regards,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E
Re: [PATCH] fs/jffs2/scan.c: Fix error-path leak
Please do not forget to look at MAINTAINERS and CC the maintainer. David is
CCed.

Amit Choudhary wrote:
> Description: Fix error-path leak in function jffs2_scan_medium(), in file
> fs/jffs2/scan.c
>
> Signed-off-by: Amit Choudhary <[EMAIL PROTECTED]>
>
> diff --git a/fs/jffs2/scan.c b/fs/jffs2/scan.c
> index e241346..cd9ed6e 100644
> --- a/fs/jffs2/scan.c
> +++ b/fs/jffs2/scan.c
> @@ -130,6 +130,8 @@ #endif
>  	if (jffs2_sum_active()) {
>  		s = kmalloc(sizeof(struct jffs2_summary), GFP_KERNEL);
>  		if (!s) {
> +			free(flashbuf);
> +			flashbuf = NULL;
>  			JFFS2_WARNING("Can't allocate memory for summary\n");
>  			return -ENOMEM;
>  		}

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
Re: refcounting drivers' data structures used in sysfs buffers
On Friday, 9 March 2007 18:02, Dmitry Torokhov wrote:
> I think we already have all refcounting that is needed. What is
> missing is subsystem-provided ->release() hooks for drivers to release
> driver-specific resources when a device finally goes away.

This is an interesting idea. Is it nice to pass through release() but not open()?

Regards
Oliver
Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4
On Fri, 9 Mar 2007, Mel Gorman wrote:

> The results without slub_debug were not good except for IA64. x86_64 and ppc64
> both blew up for a variety of reasons. The IA64 results were

Yuck that is the dst issue that Adrian is also looking at. Likely an issue with slab merging and RCU frees.

> KernBench Comparison
>
>                   2.6.21-rc2-mm2-clean  2.6.21-rc2-mm2-slub   %diff
> User   CPU time                1084.64              1032.93   4.77%
> System CPU time                  73.38                63.14  13.95%
> Total  CPU time                1158.02              1096.07   5.35%
> Elapsed    time                 307.00               285.62   6.96%

Wow! The first indication that we are on the right track with this.

> AIM9 Comparison
>  2 page_test   2097119.26  3398259.27  1301140.01   62.04% System Allocations & Pages/second

Wow! Must have all stayed within slab boundaries.

>  8 link_test     64776.04     7488.13   -57287.91  -88.44% Link/Unlink Pairs/second

Crap. Maybe we straddled a slab boundary here?
Re: Trouble using some (fast) compact flash as ide device on an embedded system
Hallo! :-) Bartlomiej Zolnierkiewicz ha scritto: > Czesc! > > On Tuesday 06 March 2007, Marco Lazzarotto wrote: > >>Ciao! >> >>Bartlomiej Zolnierkiewicz ha scritto: >> >>>On Friday 02 March 2007, Pavel Machek wrote: >>> >>> Hi! >As I reported in bug 8036 in bugzilla.kernel.org, > >Hardware Environment: > >- Use a compact flash SanDisk SDCFB-128 Firmware revision HDX 2.15 > (we used other compact flashes with the same hw ad sw for years > with no trouble) > >It happens on both etx boards: >- VIA SOM-ETX (4475) >- Gene-4312 >> >>ERRATA CORRIGE: Gene-4312 is not a etx board ;-) but a pc/104 > > > What IDE hardware / host driver is used by this system? NB: I'm usign the VIA SOM-ETX (4475) for debugging Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller at PCI slot :00:07.1 PCI: Calling quirk c01dc1e8 for :00:07.1 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci:00:07.1 ide1: BM-DMA at 0xe408-0xe40f, BIOS settings: hdc:DMA, hdd:pio (I disabled DMA in the bios, why is saying it is enabled?) >Doing the command >sfdisk -R /dev/hdc > >gives: > >* * * >ide1: start_request: current=0xc6ebe754 (rq->sect=0,block 0) >hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } >ide: failed opcode was: unknown >hdc: drive not ready for command >ide1: start_request: current=0xc6ebe754 (rq->sect=0,block 0) >hdc: do_special: 0x02 >hdc: do_special: recalibrate >ide1: start_request: current=0xc6ebe754 (rq->sect=0,block 0) >hdc: reading: block=0 sectors=8, buffer = 0xc6cd4 >ide1: end_request: current=0xc6ebe754 >* * * > >the 'bad bit' in status error is DataRequest > > > Seems like the device wants data from/to host and I have no idea why this > is happening. 
It might be that this particular CF has problems with one > of the commands that IDE driver issues during device initialization. > > I assume that device is recognized properly by the driver during probe, right? > If so probably adding some debugging printks (i.e. dumping status register) > to ide-disk.c:idedisk_setup() would shed some more light at the problem... The device seems to be recognized properly. Here's (part of) the dmesg output: * * * ide1: BM-DMA at 0xe408-0xe40f, BIOS settings: hdc:DMA, hdd:pio Probing IDE interface ide1... probing for hdc: present=0, media=32, probetype=ATA hdc: SanDisk SDCFB-128, CFA DISK drive () Before ide_disk_init_chs() and ide_disk_init_mult_count() IDE_STATUS_REG=0x50 After ide_disk_init_chs() and ide_disk_init_mult_count() IDE_STATUS_REG=0x50 probing for hdd: present=0, media=32, probetype=ATA probing for hdd: present=0, media=32, probetype=ATAPI ide_init_queue() ide1 at 0x170-0x177,0x376 on irq 15 Probing IDE interface ide0... probing for hda: present=0, media=32, probetype=ATA probing for hda: present=0, media=32, probetype=ATAPI probing for hdb: present=0, media=32, probetype=ATA probing for hdb: present=0, media=32, probetype=ATAPI Probing IDE interface ide2... probing for hde: present=0, media=32, probetype=ATA probing for hde: present=0, media=32, probetype=ATAPI probing for hdf: present=0, media=32, probetype=ATA probing for hdf: present=0, media=32, probetype=ATAPI Probing IDE interface ide3... 
probing for hdg: present=0, media=32, probetype=ATA probing for hdg: present=0, media=32, probetype=ATAPI probing for hdh: present=0, media=32, probetype=ATA probing for hdh: present=0, media=32, probetype=ATAPI hdc: max request size: 128KiB After init_idedisk_capacity() IDE_STATUS_REG=0x50 After idedisk_capacity() IDE_STATUS_REG=0x50 hdc: 250880 sectors (128 MB) w/1KiB Cache (buf_size=2), CHS=980/8/32 After write_cache(drive,1) IDE_STATUS_REG=0x50 hdc: ide1: start_request: current=0xc1190804 (rq->sect=0,block 0, SECTOR_SIZE=512 hdc: do_special: 0x03 hdc: do_special: set_geometry ide1: start_request: current=0xc1190804 (rq->sect=0,block 0, SECTOR_SIZE=512 hdc: do_special: 0x02 hdc: do_special: recalibrate hdc : recal_intr() IDE_STATUS_REG=50 ide1: start_request: current=0xc1190804 (rq->sect=0,block 0, SECTOR_SIZE=512 hdc: reading: block=0, sectors=8, buffer=0xc6c2d000 ide1: end_request: current=0xc1190804 hdc1 * * * I dump IDE_STATUS_REG with e.g. 'printk("%s : recal_intr() IDE_STATUS_REG=%02x\n",drive->name,stat)' where stat was assigned as 'u8 stat=hwif->INB(IDE_STATUS_REG)' It seems to me that the status is good until it tries to read the partition table... In fact, after I do sfdisk -R /dev/hdc every other reading from compact flash (if ever does not get 'lost interrupt) generates the message hdc: status error: status=0x58 {...} > >doing >sfdisk -l /dev/hdc > >gives: > >* * * >ide1: start_request: curre
Re: "No handler for vector" patches don't work on some systems
Eric W. Biederman wrote:
> Chuck Ebbert <[EMAIL PROTECTED]> writes:
>>
>> So far I've tried the simple "survive having no handler
>> for a vector" patch and the preliminary 3-patch series
>> that was in -mm for a while, and neither work on the
>> Dell PowerEdge 29xx and 19xx systems. These servers
>> have the Intel 5000X chipset with the 6700PXH PCI Hub
>> with dual independent PCI-X busses, each with its own
>> I/OxAPIC with 24 interrupts. The fixes do work on
>> "simple" systems but not on these high-end ones.
>
> I would very much like to know if what I merged linus's tree helps.
> It is a little more conservative, than my earlier patches. I need
> a way to reproduce this or to work closely with someone who is, because
> this sounds like it has a different cause and I need to start with
> that assumption.

Was that merged or is it still in -mm? The last thing I see in arch/x86_64/irq.c is:

[PATCH] x86-64: survive having no irq mapping for a vector

And we tried that one.
Re: [PATCH] spi subsystem: destroy the spi_bitbang workqueue only after the spi master is unregistered
From: Chris Lesiak <[EMAIL PROTECTED]> This patch fixes a bug in the cleanup of an spi_bitbang bus. The workqueue associated with the bus was destroyed before the call to spi_unregister_master. That meant that spi devices on that bus would be unable to do IO in their remove method. The shutdown flag should have been able to prevent a segfault, but was never getting set. By waiting to destroy the workqueue until after the master is unregistered, devices are able to do IO in their remove methods. An added benefit is that neither the shutdown flag nor a wait for the queue of messages to empty is needed. Signed-off-by: Chris Lesiak <[EMAIL PROTECTED]> --- diff -uprN -X linux-2.6.20-vanilla/Documentation/dontdiff linux-2.6.20-vanilla/drivers/spi/spi_bitbang.c linux-2.6.20/drivers/spi/spi_bitbang.c --- linux-2.6.20-vanilla/drivers/spi/spi_bitbang.c 2007-02-04 12:44:54.0 -0600 +++ linux-2.6.20/drivers/spi/spi_bitbang.c 2007-03-09 11:23:42.0 -0600 @@ -302,10 +302,6 @@ static void bitbang_work(struct work_str setup_transfer = NULL; list_for_each_entry (t, &m->transfers, transfer_list) { - if (bitbang->shutdown) { - status = -ESHUTDOWN; - break; - } /* override or restore speed and wordsize */ if (t->speed_hz || t->bits_per_word) { @@ -410,8 +406,6 @@ int spi_bitbang_transfer(struct spi_devi m->status = -EINPROGRESS; bitbang = spi_master_get_devdata(spi->master); - if (bitbang->shutdown) - return -ESHUTDOWN; spin_lock_irqsave(&bitbang->lock, flags); if (!spi->max_speed_hz) @@ -506,28 +500,12 @@ EXPORT_SYMBOL_GPL(spi_bitbang_start); */ int spi_bitbang_stop(struct spi_bitbang *bitbang) { - unsignedlimit = 500; - - spin_lock_irq(&bitbang->lock); - bitbang->shutdown = 0; - while (!list_empty(&bitbang->queue) && limit--) { - spin_unlock_irq(&bitbang->lock); + spi_unregister_master(bitbang->master); - dev_dbg(bitbang->master->cdev.dev, "wait for queue\n"); - msleep(10); - - spin_lock_irq(&bitbang->lock); - } - spin_unlock_irq(&bitbang->lock); - if (!list_empty(&bitbang->queue)) { 
- dev_err(bitbang->master->cdev.dev, "queue didn't empty\n"); - return -EBUSY; - } + WARN_ON(!list_empty(&bitbang->queue)); destroy_workqueue(bitbang->workqueue); - spi_unregister_master(bitbang->master); - return 0; } EXPORT_SYMBOL_GPL(spi_bitbang_stop); diff -uprN -X linux-2.6.20-vanilla/Documentation/dontdiff linux-2.6.20-vanilla/include/linux/spi/spi_bitbang.h linux-2.6.20/include/linux/spi/spi_bitbang.h --- linux-2.6.20-vanilla/include/linux/spi/spi_bitbang.h2007-02-04 12:44:54.0 -0600 +++ linux-2.6.20/include/linux/spi/spi_bitbang.h2007-03-09 11:23:42.0 -0600 @@ -25,7 +25,6 @@ struct spi_bitbang { spinlock_t lock; struct list_headqueue; u8 busy; - u8 shutdown; u8 use_dma; struct spi_master *master; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
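The ordering argument in the patch description can be modelled with a small userspace sketch. Everything here is hypothetical (`bus_sketch`, `bus_submit`, `stop_old_order`, `stop_new_order`); it is not the spi_bitbang code, only an illustration of why tearing down the work queue before unregistering the master prevents a device from doing IO in its remove method.

```c
#include <errno.h>

/* Toy model of a bus: a work queue plus a registered master. */
struct bus_sketch {
    int queue_alive;   /* 1 while the work queue exists */
    int registered;    /* 1 while the master is registered */
    int io_done;       /* number of transfers that completed */
};

/* A transfer can only run while the queue still exists. */
int bus_submit(struct bus_sketch *b)
{
    if (!b->queue_alive)
        return -ESHUTDOWN;
    b->io_done++;
    return 0;
}

/*
 * Unregistering the master removes its devices; here the device's
 * remove step is modelled as one final transfer.
 */
int unregister_master_sketch(struct bus_sketch *b)
{
    int ret = bus_submit(b);

    b->registered = 0;
    return ret;
}

void destroy_queue_sketch(struct bus_sketch *b)
{
    b->queue_alive = 0;
}

/* Old order: the queue is gone before devices get to say goodbye. */
int stop_old_order(struct bus_sketch *b)
{
    destroy_queue_sketch(b);
    return unregister_master_sketch(b);
}

/* New order: devices are removed first, then the queue is destroyed. */
int stop_new_order(struct bus_sketch *b)
{
    int ret = unregister_master_sketch(b);

    destroy_queue_sketch(b);
    return ret;
}
```

In this model the old order makes the final transfer fail with `-ESHUTDOWN`, while the new order lets it complete before the queue goes away.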
Re: [PATCH] chaostables
Hello,

On Mar 9 2007 11:54, Amin Azez wrote:
>> Adding a member to the ip_conntrack/nf_conntrack and sk_buff struct
>> would increase the struct sizes, and that would penalize users who do
>> not intend to use xt_portscan.
>
>I understand what you say but it sounds a bit like saying: "but we didn't
>make it very good because so few people would use it anyway" which of
>course makes it even less attractive. I realise you have your own
>interpretation but this is how it reads to me.

I just gave the reason why I designed it the way it is now. If you really feel it needs to be changed, well, I don't really object to that. chaostables has only seen about 1 1/2 version announcements (urls to tarballs, no patches) to mailing lists, and except for the few users who definitely tried it (based on questions I received), there have not been any suggestions for changes yet, which either tells me that nobody is interested or everything is fine.

>> I do not see why the packet/connection marks should not be used to record
>> additional information
>...
>> Almost never I required connection marking myself
>
>I guessed as much. I use it heavily, with my xml rule generators.
>
>> except for this
>> portscanning automaton and perhaps a little MARK here and there for
>> finely-tuned SNAT. Again, things might look different on your side(s).
>
>There's too many things fighting over the same few bits of the mark, and
>in your case you are using it to track internal state of a connection
>that has no relevance to the rest of the iptables/ebtables rules.
>
>I'm suggesting that some of the people who would want to use the chaos
>match, won't because of the mark issue.
>
>This is not a new problem.
>
>http://article.gmane.org/gmane.comp.security.firewalls.netfilter.devel/16217

"""netfilter marks are the solution of last resort. This is becoming very painful for those of us who produce general Netfilter configuration tools.""" -Tom Eastep

I see. Thank you for the link. I think you are well on the way to convincing me.

Jan
--
Re: refcounting drivers' data structures used in sysfs buffers
On 3/9/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:
> On Friday, 9 March 2007 18:02, Dmitry Torokhov wrote:
> > I think we already have all refcounting that is needed. What is
> > missing is subsystem-provided ->release() hooks for drivers to release
> > driver-specific resources when a device finally goes away.
>
> This is an interesting idea. Is it nice to pass through release() but not open()?

Not sure if I follow... Generally speaking, open is not a mandatory operation; however, every object in the driver model has a release method. What I am saying is that certain drivers need to have their disconnect method split in two parts: one that shuts down the device, and a second that releases resources that might be accessed through sysfs (and other kernel parts). That second part will have to be called from the subsystem core's ->release() method, so we need a release() hook.

--
Dmitry
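The split described above (shut the device down at disconnect, free driver data only when the last reference goes away) can be sketched in plain C. This is a single-threaded userspace illustration with hypothetical names (`dev_obj`, `dev_get`, `dev_put`, `dev_disconnect`, `drv_release`); kernel code would use its own reference counting primitives and locking rather than a bare `int`.

```c
#include <stdlib.h>

/* Toy device object with a reference count and a release hook. */
struct dev_obj {
    int refcount;
    int hw_active;                      /* 1 while the hardware is in use */
    void (*release)(struct dev_obj *);  /* called when refcount hits zero */
};

/* Hypothetical counter so a caller can see when release() ran. */
int released_count;

/* Part two of teardown: free driver-specific resources. */
void drv_release(struct dev_obj *d)
{
    released_count++;
    free(d);
}

struct dev_obj *dev_create(void)
{
    struct dev_obj *d = calloc(1, sizeof(*d));

    if (!d)
        return NULL;
    d->refcount = 1;
    d->hw_active = 1;
    d->release = drv_release;
    return d;
}

void dev_get(struct dev_obj *d)
{
    d->refcount++;
}

void dev_put(struct dev_obj *d)
{
    if (--d->refcount == 0)
        d->release(d);
}

/*
 * Part one of teardown: stop the hardware and drop the driver's own
 * reference. The memory survives while other holders (for example an
 * open attribute file) still have references.
 */
void dev_disconnect(struct dev_obj *d)
{
    d->hw_active = 0;
    dev_put(d);
}
```

A reader that took a reference before `dev_disconnect()` can still inspect the object safely; the release hook only runs on that reader's final `dev_put()`.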
Re: [PATCH 1/2] ibmebus: dynamic addition/removal of adapters, uevent, root device based on struct device
John Rose <[EMAIL PROTECTED]> wrote on 06.03.2007 22:51:42:

> We are seeing several build errors when attempting to apply this to
> 2.6.21-rc2:

Hot Damn! I did my test compiles with gcc 3.3, and you obviously compiled with gcc 4.1 - I only got a warning where you got an error, and that warning escaped me. Sorry about that.

I fixed the error and all warnings and will post a fresh set of patches right after this reply. If you could give the new patch another go and ack it if it works, I would be delighted! =)

Thanks for pointing this out!
  Joachim

---
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED] -- Phone: +49 7031 16 1239
2.6.20-rc3: Clocksource tsc unstable
Hi.

I got this message after suspend;resume on my notebook:

Clocksource tsc unstable (delta = -154983451 ns)

What other info should I post, who should I Cc?

regards,
--
http://www.fi.muni.cz/~xslaby/ Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
Re: [PATCH] i2c-core: i2c bitbang gpio structure
On Friday 09 March 2007 8:55 am, Jean Delvare wrote:
> > +struct i2c_bitbang_gpio {
> > +	int sda;
> > +	int scl;
> > +};
> > ...
>
> Also, this structure alone isn't very useful. I'm waiting to see
> drivers actually making use of it before I will consider merging this
> patch at all.

The notion would be that we could have one i2c bitbanger using the CONFIG_GENERIC_GPIO interfaces that could work on most platforms, using that struct for platform_data and the usual convention for platform device naming.

I'd expect that struct would be merged as part of such a generic GPIO bitbang driver, and would only be used by that one driver.

SPI could use such a generic bitbanger too. Until 2.6.21 it's been missing that last step: it's needed platform-specific GPIO calls, so the bitbangers were generic except for those lowest-level hooks.

- Dave
Re: "No handler for vector" patches don't work on some systems
Chuck Ebbert <[EMAIL PROTECTED]> writes:

> Eric W. Biederman wrote:
>> Chuck Ebbert <[EMAIL PROTECTED]> writes:
>>>
>>> So far I've tried the simple "survive having no handler
>>> for a vector" patch and the preliminary 3-patch series
>>> that was in -mm for a while, and neither work on the
>>> Dell PowerEdge 29xx and 19xx systems. These servers
>>> have the Intel 5000X chipset with the 6700PXH PCI Hub
>>> with dual independent PCI-X busses, each with its own
>>> I/OxAPIC with 24 interrupts. The fixes do work on
>>> "simple" systems but not on these high-end ones.
>>
>> I would very much like to know if what I merged into Linus's tree helps.
>> It is a little more conservative than my earlier patches. I need
>> a way to reproduce this or to work closely with someone who is, because
>> this sounds like it has a different cause and I need to start with
>> that assumption.
>
> Was that merged or is it still in -mm? The last thing I see in
> arch/x86_64/irq.c is:
>
> [PATCH] x86-64: survive having no irq mapping for a vector
>
> And we tried that one.

Look in arch/x86_64/io_apic.c. That is where most of the work happened. If you can extract that patch series for a backport more power to you.

Eric

commit 610142927b5bc149da92b03c7ab08b8b5f205b74
Author: Eric W. Biederman <[EMAIL PROTECTED]>
Date: Fri Feb 23 04:40:58 2007 -0700

    [PATCH] x86_64 irq: Safely cleanup an irq after moving it.

    The problem: After moving an interrupt when is it safe to teardown
    the data structures for receiving the interrupt at the old location?

    With a normal pci device it is possible to issue a read to a device
    to flush all posted writes. This does not work for the oldest ioapics
    because they are on a 3-wire apic bus which is a completely different
    data path. For some more modern ioapics when everything is using
    front side bus delivery you can flush interrupts by simply issuing a
    read to the ioapic. For other modern ioapics empirical testing has
    shown that this does not work.

    So it appears the only reliable way to know the last of the irqs from an
    ioapic have been received from before the ioapic was reprogrammed is to
    receive the first irq from the ioapic from after it was reprogrammed.

    Once we know the last irq message has been received from an ioapic
    into a local apic we then need to know that irq message has been
    processed through the local apics.

    Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
    Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Re: [PATCH 2/7] revoke: add f_light flag for struct file
On Friday 09 March 2007 17:11, Benjamin LaHaise wrote:
> On Fri, Mar 09, 2007 at 12:13:35PM +0100, Eric Dumazet wrote:
> > Then just drop the fget_light() 'optimisation' and always take a
> > reference (atomic on f_count) regardless of single-thread or not. Instead
> > of dirtying f_light, just do the straightforward thing and be done with it.
> >
> > (that is : fget_light() = fget() = no more keeping fput_needed
> > everywhere, and convoluted things in some dark sides of the kernel.
>
> And it makes things rather slower for a lot of single threaded applications
> on modern systems. Yes, fget_light can be done much more cleanly, but
> please don't go around ripping out optimizations just because.

Sure. But I apparently was the only guy to react to the f_light horror story. And it seems a solution was found, after some mail exchanges.

In French we have this expression: "Precher le faux pour savoir le vrai"

You could translate to "make false statements in order to discover the truth" or "to tell a lie in order to get at the truth" or maybe "playing the devil's advocate", but really the French one is better :)
Re: [PATCH] Use more gcc extensions in the Linux headers
On Fri, 9 Mar 2007, Christoph Hellwig wrote:
>
> It was only put in under the premise that they'll fix whatever breaks,
> we're not going to put any maintenance burden on us to hack around
> broken proprietary compilers.

Well, since Rusty's macro was horrible *anyway*, I don't think I'd apply it as-is. Breaking icc for something that ugly and not-very-important simply makes no sense. There are better ways to do this.

For one, you could (and should!) abstract these kinds of things out, rather than put them in another macro that really does something totally different. Then, the macro could have become

	#define ARRAY_SIZE (sizeof_expression + 0*error_if_not_array)

which would already be a hell of a lot more readable. But more importantly, it's also now suddenly much easier to abstract out for different compilers.

We *already* support different compilers through , and there just isn't any reason for bad code just for bad code's sake!

		Linus
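One possible reading of the "abstract it out" suggestion above is sketched here. It is an assumption-laden illustration, not the macro that was proposed or merged: `MUST_BE_ARRAY`, `ARRAY_SIZE_SKETCH` and `table_len` are hypothetical names. The array check lives in its own macro that evaluates to zero for a real array and fails to compile for a pointer, using the GCC/Clang extensions `__typeof__` and `__builtin_types_compatible_p`; other compilers get a no-op fallback, which is where a per-compiler header would come in.

```c
#include <stddef.h>

#if defined(__GNUC__)
/*
 * Evaluates to 0 when 'a' is an array. If 'a' is a pointer, its type is
 * compatible with the type of &(a)[0], the bit-field width becomes
 * negative, and compilation fails.
 */
#define MUST_BE_ARRAY(a)                                              \
    (sizeof(struct {                                                  \
        int must_be_array : 1 - 2 * __builtin_types_compatible_p(     \
            __typeof__(a), __typeof__(&(a)[0]));                      \
    }) * 0)
#else
/* Fallback for compilers without these extensions: no check at all. */
#define MUST_BE_ARRAY(a) 0
#endif

/* Element count plus a term that is always 0 but enforces the check. */
#define ARRAY_SIZE_SKETCH(a) \
    (sizeof(a) / sizeof((a)[0]) + MUST_BE_ARRAY(a))

/* Small usage example: number of entries in a static table. */
size_t table_len(void)
{
    static const int table[] = { 3, 1, 4, 1, 5 };

    return ARRAY_SIZE_SKETCH(table);
}
```

Passing a pointer (for example an `int *` function parameter) to `ARRAY_SIZE_SKETCH` is rejected at compile time under GCC or Clang, instead of silently yielding a wrong count.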
[PATCH 1/3] ibmebus: whitespace fixes
This fixes a lot of whitespace in ibmebus.[ch] Signed-off-by: Joachim Fenkes <[EMAIL PROTECTED]> --- This patchset applies on top of a vanilla 2.6.20 kernel. No dependencies on other patches except for part 3/3. This is a repost of my earlier patchset and fixes a stupid compile error. arch/powerpc/kernel/ibmebus.c | 126 +- include/asm-powerpc/ibmebus.h | 42 +++--- 2 files changed, 84 insertions(+), 84 deletions(-) diff -Nurp 01.original/arch/powerpc/kernel/ibmebus.c 02.whitespace-fixes/arch/powerpc/kernel/ibmebus.c --- 01.original/arch/powerpc/kernel/ibmebus.c 2007-02-22 05:26:24.0 +0100 +++ 02.whitespace-fixes/arch/powerpc/kernel/ibmebus.c 2007-02-22 06:57:18.0 +0100 @@ -3,35 +3,35 @@ * * Copyright (c) 2005 IBM Corporation * Heiko J Schick <[EMAIL PROTECTED]> - * + * * All rights reserved. * - * This source code is distributed under a dual license of GPL v2.0 and OpenIB - * BSD. + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. * * OpenIB BSD License * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions are met: + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: * - * Redistributions of source code must retain the above copyright notice, this - * list of conditions and the following disclaimer. + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. * - * Redistributions in binary form must reproduce the above copyright notice, - * this list of conditions and the following disclaimer in the documentation + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation * and/or other materials - * provided with the distribution. + * provided with the distribution. 
* - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" - * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE - * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR - * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER - * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) - * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. 
*/ @@ -55,7 +55,7 @@ static void *ibmebus_alloc_coherent(stru gfp_t flag) { void *mem; - + mem = kmalloc(size, flag); *dma_handle = (dma_addr_t)mem; @@ -63,7 +63,7 @@ static void *ibmebus_alloc_coherent(stru } static void ibmebus_free_coherent(struct device *dev, - size_t size, void *vaddr, + size_t size, void *vaddr, dma_addr_t dma_handle) { kfree(vaddr); @@ -79,7 +79,7 @@ static dma_addr_t ibmebus_map_single(str static void ibmebus_unmap_single(struct device *dev, dma_addr_t dma_addr, -size_t size, +size_t size, enum dma_data_direction direction) { return; @@ -90,13 +90,13 @@ static int ibmebus_map_sg(struct device int nents, enum dma_data_direction direction) { int i; - + for (i = 0; i < nents; i++) { - sg[i].dma_address = (dma_addr_t)page_address(sg[i].page) + sg[i].dma_address = (dma_addr_t)page_address(sg[i].page) + sg[i].offset; sg[i].dma_length = sg[i].length; } - + return nents; } @@ -128,15 +128,15 @@ static int ibmebus_bus_probe(struct devi struct ibmeb
[PATCH 2/3] ibmebus: dynamic addition/removal of adapters, some code cleanup
This adds two sysfs attributes to /sys/bus/ibmebus which can be used to notify the ebus driver of added / removed ebus devices in the OF device tree. Echoing the device's location code (as found in the OFDT "ibm,loc-code" property) into the "probe" attribute will notify ebus of addition of the device and cause the appropriate device driver's probe function to be called on the device. Likewise, echoing the location code into the "remove" attribute will cause the device to be removed from the system. The writes will block until the respective operation has finished and return an error code if the operation failed. In addition, two minor tidbits are fixed: - The fake root device used to provide a common parent for all ebus devices is now based on device instead of of_device - it had no associated devtree node. This saves several checks throughout the ebus driver. - The sysfs attributes are now generated automagically by device_register() instead of by the ibmebus code, which saves a few compiler warnings about unused return codes. Signed-off-by: Joachim Fenkes <[EMAIL PROTECTED]> --- This is a repost of my earlier patch, fixing a stupid compile error and some warnings. arch/powerpc/kernel/ibmebus.c | 167 +- include/asm-powerpc/ibmebus.h |2 2 files changed, 134 insertions(+), 35 deletions(-) diff -Nurp 02.whitespace-fixes/arch/powerpc/kernel/ibmebus.c 03.almost-all/arch/powerpc/kernel/ibmebus.c --- 02.whitespace-fixes/arch/powerpc/kernel/ibmebus.c 2007-02-22 06:57:18.0 +0100 +++ 03.almost-all/arch/powerpc/kernel/ibmebus.c 2007-03-09 17:37:08.309979440 +0100 @@ -2,6 +2,7 @@ * IBM PowerPC IBM eBus Infrastructure Support. * * Copyright (c) 2005 IBM Corporation + * Joachim Fenkes <[EMAIL PROTECTED]> * Heiko J Schick <[EMAIL PROTECTED]> * * All rights reserved. 
@@ -43,12 +44,14 @@ #include #include -static struct ibmebus_dev ibmebus_bus_device = { /* fake "parent" device */ - .name = ibmebus_bus_device.ofdev.dev.bus_id, - .ofdev.dev.bus_id = "ibmebus", - .ofdev.dev.bus= &ibmebus_bus_type, +#define MAX_LOC_CODE_LENGTH 80 + +static struct device ibmebus_bus_device = { /* fake "parent" device */ + .bus_id = "ibmebus", }; +struct bus_type ibmebus_bus_type; + static void *ibmebus_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, @@ -158,21 +161,12 @@ static void __devinit ibmebus_dev_releas kfree(to_ibmebus_dev(dev)); } -static ssize_t ibmebusdev_show_name(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return sprintf(buf, "%s\n", to_ibmebus_dev(dev)->name); -} -static DEVICE_ATTR(name, S_IRUSR | S_IRGRP | S_IROTH, ibmebusdev_show_name, - NULL); - -static struct ibmebus_dev* __devinit ibmebus_register_device_common( +static int __devinit ibmebus_register_device_common( struct ibmebus_dev *dev, const char *name) { int err = 0; - dev->name = name; - dev->ofdev.dev.parent = &ibmebus_bus_device.ofdev.dev; + dev->ofdev.dev.parent = &ibmebus_bus_device; dev->ofdev.dev.bus = &ibmebus_bus_type; dev->ofdev.dev.release = ibmebus_dev_release; @@ -186,12 +180,10 @@ static struct ibmebus_dev* __devinit ibm if ((err = of_device_register(&dev->ofdev)) != 0) { printk(KERN_ERR "%s: failed to register device (%d).\n", __FUNCTION__, err); - return NULL; + return -ENODEV; } - device_create_file(&dev->ofdev.dev, &dev_attr_name); - - return dev; + return 0; } static struct ibmebus_dev* __devinit ibmebus_register_device_node( @@ -205,18 +197,18 @@ static struct ibmebus_dev* __devinit ibm if (!loc_code) { printk(KERN_WARNING "%s: node %s missing 'ibm,loc-code'\n", __FUNCTION__, dn->name ? 
dn->name : ""); - return NULL; + return ERR_PTR(-EINVAL); } if (strlen(loc_code) == 0) { printk(KERN_WARNING "%s: 'ibm,loc-code' is invalid\n", __FUNCTION__); - return NULL; + return ERR_PTR(-EINVAL); } dev = kzalloc(sizeof(struct ibmebus_dev), GFP_KERNEL); if (!dev) { - return NULL; + return ERR_PTR(-ENOMEM); } dev->ofdev.node = of_node_get(dn); @@ -227,9 +219,9 @@ static struct ibmebus_dev* __devinit ibm min(length, BUS_ID_SIZE - 1)); /* Register with generic device framework. */ - if (ibmebus_register_device_common(dev, dn->name) == NULL) { + if (ibmebus_register_device_common(dev, dn->name) != 0) { kfree(dev); - return NULL; + return ERR_PTR(-ENODEV); }
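The hunks above replace `return NULL` with `ERR_PTR(-E...)` so that callers can tell the failure reasons apart. The idea behind such error pointers can be sketched in userspace as follows. The lowercase helpers `err_ptr`, `ptr_err`, `is_err` and the function `register_node_sketch` are hypothetical stand-ins modeled on the kernel's helpers, and the sketch assumes that the top 4095 addresses never hold a valid object and that pointer/integer casts round-trip on the target platform.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Largest errno value that is folded into a pointer in this sketch. */
#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value as a pointer. */
void *err_ptr(intptr_t error)
{
    return (void *)error;
}

/* Recover the errno value from an error pointer. */
intptr_t ptr_err(const void *ptr)
{
    return (intptr_t)ptr;
}

/* True if the pointer lies in the range reserved for error codes. */
int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/*
 * Shaped like the registration path in the patch: reject a missing or
 * empty location code, report allocation failure, else return the object.
 */
void *register_node_sketch(const char *loc_code)
{
    void *dev;

    if (!loc_code || loc_code[0] == '\0')
        return err_ptr(-EINVAL);

    dev = calloc(1, 64);
    if (!dev)
        return err_ptr(-ENOMEM);

    return dev;
}
```

A caller checks `is_err()` first and only then either uses the object or converts the pointer back to an error code with `ptr_err()`.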
ABI coupling to hypervisors via CONFIG_PARAVIRT
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> > Sure, that's clean, From that perspective the apic is a bunch of
> > registers backed by a state machine or something.
>
> I think you could do much worse than just decide to pick the
> IO-APIC/lapic as your "virtual interrupt controller model". So I do
> *not* think that APICRead/APICWrite are in any way horrible interfaces
> for a virtual interrupt controller. In many ways, you then have a
> tested and known interface to work with.

yes - but we already support the raw hardware ABI, in the native kernel. paravirt_ops is not 'just another PC sub-arch'. It is not 'just another hardware driver'. It is not 'just another x86 CPU'. paravirt_ops is much wider than that, it hooks everywhere and has effect on everything!

Let's take a look at the raw numbers. Here's a typical distro kernel vmlinux, with and without CONFIG_PARAVIRT [with no paravirt backend enabled]:

   text    data     bss     dec     hex filename
 139863   49010   57672  246545   3c311 x86-kernel-built-in.o.noparavirt
 148865   49310   57672  255847   3e767 x86-kernel-built-in.o.paravirt

   text    data     bss     dec     hex filename
5154975  586932  221184 5963091  5afd53 vmlinux.noparavirt
5189197  587504  221184 5997885  5b853d vmlinux.paravirt

why did code size increase by +6.4% in arch/i386/ (+0.7% in the vmlinux)? It is purely because CONFIG_PARAVIRT adds more than _1400_ function call hooks to the x86 arch:

 c05c8e60 D paravirt_ops

 c0102602: ff 15 9c 8e 5c c0    call *0xc05c8e9c
 c0102d37: ff 15 94 8e 5c c0    call *0xc05c8e94
 c0102d45: ff 15 94 8e 5c c0    call *0xc05c8e94
 c0102d53: ff 15 94 8e 5c c0    call *0xc05c8e94
 c0102d61: ff 15 94 8e 5c c0    call *0xc05c8e94
 c0102d6f: ff 15 94 8e 5c c0    call *0xc05c8e94
 [...]

 $ objdump -d vmlinux | grep c05c8e | wc -l
 1463

_1463_ hooks, spread out all around the x86 arch. Are these only trivial hooks a.k.a. alternatives.h? Not at all, these are full-blown function hooks freely modifiable by a paravirt_ops implementation, spread throughout the architecture in a fine-grained way.
(see my arguments and specific demonstration about the bad effects of this, four paragraphs below.)

As a comparison: people argued about CONFIG_SECURITY hooks and flamed about them no end. The reality is, there's only _269_ calls to security_ops in this same kernel, and i've got CONFIG_SECURITY + SELINUX enabled. And the only functional modification that security_ops does to native behavior is "deny the syscall". Not 'full control over behavior'... In terms of coupling, CONFIG_SECURITY hooks are a walk in the park, relative to CONFIG_PARAVIRT.

we don't even give /real silicon/ that many hooks! If an x86 CPU came along that required the addition of 1400+ function hooks then we'd say: 'you must be joking, that's not an x86 CPU! Make it more compatible!'.

please don't get me wrong - 1463 hooks spread out might be fine in the end, but _if and only if_ there are safeguards in place to make sure they are just a trivial variation of the hardware ABI - a.k.a. asm/alternatives.h.

But there is _no_ such safeguard in place today and we are seeing the bad effects of that _already_, with just a _single_ hypervisor and a _single_ abstraction topic (time), so i'm very strongly convinced that it's a serious issue that cannot just be glossed over with "relax, it will work out fine". If there's one thing we learned in the past 15 years, it is that ABI issues will haunt us forever.

Let me demonstrate some of the bad effects, and how far we've _already_ deviated from the 'hardware ABI'. An example: one assumes that paravirt_ops.safe_halt() is a trivial variation of the 'halt instruction', right? But vmi.c and vmitimer.c do much more than that. Take a look at vmi_safe_halt() which calls vmi_stop_hz_timer(): it hacks back a jiffies assumption into its code via paravirt_ops.safe_halt() - purely via changes local to vmitimer.c, by using next_timer_interrupt()! Thus it has created a _dual layer_ of dynticks that we specifically objected to.
It does so in spite of our warning about why that is bad, it does so in spite of Xen having implemented a clockevents driver in 2 hours, and it does so under the cover of 'oh, this is only a vmitimer.c local change'. It circumvents the native dynticks framework and in essence brings in the bad NO_IDLE_HZ technique that we worked so hard for 2 years not to ever enable for the i386 arch! so one of my very real problems with paravirt_ops is that due to its sheer hook-based impact it allows the modification of the hardware ABI on a _very_ wide scale: both unintentionally and intentionally. Furthermore, it allows the introduction of hard-to-remove hardwired quirks that bind one particular paravirt_ops method to the hypervisor ABI - quirks that are not present in any real silicon! Quirks _guaranteed by Linux_, by virtu
[PATCH 3/3] ibmebus: uevent support
This adds uevent support to ibmebus using the generic of_device_uevent() function.

Signed-off-by: Joachim Fenkes <[EMAIL PROTECTED]>
---
I split this change into a separate patch because it depends on another patch against 2.6.20, submitted by Sylvain Munaut:
http://patchwork.ozlabs.org/linuxppc/patch?id=9558

 ibmebus.c |    1 +
 1 file changed, 1 insertion(+)

diff -Nurp 03.almost-all/arch/powerpc/kernel/ibmebus.c 04.uevent/arch/powerpc/kernel/ibmebus.c
--- 03.almost-all/arch/powerpc/kernel/ibmebus.c 2007-03-09 17:37:08.309979440 +0100
+++ 04.uevent/arch/powerpc/kernel/ibmebus.c 2007-03-07 19:07:53.0 +0100
@@ -460,6 +460,7 @@ static struct bus_attribute ibmebus_bus_
 struct bus_type ibmebus_bus_type = {
 	.name = "ibmebus",
+	.uevent= of_device_uevent,
 	.match = ibmebus_bus_match,
 	.dev_attrs = ibmebus_dev_attrs,
 	.bus_attrs = ibmebus_bus_attrs
Re: 2.6.21-rc3-mm1 RSDL results
Mmm.. when it's good, it's *really* good. My desktop feels snappier and all of that. No noticeable jerkiness of windows/scrolling, which I *do* observe with the stock scheduler.

But when it's bad, it stinks. Like when a "make -j2" kernel rebuild is happening in a background window.

This is on a Pentium-M 760 single-core, w/2GB SDRAM (notebook).

JADP (Just Another Data Point).

Cheers

Mark
Re: [PATCH 4/4 TRY#3] optimize and simplify get_cycles_sync()
Joerg Roedel wrote:
> From: Joerg Roedel <[EMAIL PROTECTED]>
>
> This patch simplifies the get_cycles_sync() function by removing the
> #ifdefs from it. Further it introduces an optimization for AMD
> processors. There the RDTSCP instruction is used instead of CPUID;RDTSC
> which is helpful if the kernel runs as a KVM guest. Running as a guest
> makes CPUID very expensive because it causes an intercept of the guest.

> +#define RDTSCP ".byte 0x0f, 0x01, 0xf9"
> +

> +	alternative_io_two("cpuid\nrdtsc",
> +			   "rdtsc", X86_FEATURE_SYNC_RDTSC,
> +			   ".byte 0x0f, 0x01, 0xf9", X86_FEATURE_RDTSCP,

why not use the RDTSCP macro here?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
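[Editor's sketch, not part of the thread.] The review question can be illustrated in plain C: adjacent string literals are concatenated by the compiler, so a macro holding the opcode bytes can stand in wherever the raw literal is spelled out. The snippet below is portable userspace code and does not execute any assembly. Only the RDTSCP define comes from the quoted patch; RDTSCP_THEN_NOP and rdtscp_template() are hypothetical names made up for this illustration.

```c
/* Opcode bytes for RDTSCP, exactly as defined in the quoted patch. */
#define RDTSCP ".byte 0x0f, 0x01, 0xf9"

/*
 * Hypothetical asm template built by string-literal concatenation:
 * the macro expands in place, so the bytes are written only once.
 */
#define RDTSCP_THEN_NOP RDTSCP "\n" "nop"

/* Return the assembled template so it can be inspected as a string. */
static const char *rdtscp_template(void)
{
	return RDTSCP_THEN_NOP;
}
```

Passing RDTSCP as an argument instead of the literal would hand the compiler the very same string, which is presumably why the reviewer asks for it.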
Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler
On Fri, 9 Mar 2007, Bill Davidsen wrote:
>
> But it IS okay for people to make special-case schedulers. Because it's MY
> machine,

Sure. Go wild. It's what open-source is all about. I'm not stopping you.

I'm just not merging code that makes the scheduler unreadable, even hard to understand, and slows things down. I'm also not merging code that sets some scheduler policy limits by having specific "pluggable scheduler interfaces".

Different schedulers tend to need different data structures in some *very* core data, like the per-cpu run-queues, in "struct task_struct", in "struct thread_struct" etc etc. Those are some of *the* most low-level structures in the kernel. And those are things that get set up to have as little cache footprint as possible etc.

IO schedulers have basically none of those issues. Once you need to do IO, you'll happily use a few indirect pointers, it's not going to show up anywhere. But in the scheduler, 10 cycles here and there will be a big deal.

And hey, you can try to prove me wrong. Code talks. So far, nobody has really ever come close.

So go and code it up, and show the end result. So far, nobody who actually *does* CPU schedulers has really wanted to do it, because they all want to muck around with their own private versions of the data structures.

		Linus
Re: [PATCH, take2] VFS : Delay the dentry name generation on sockets and pipes.
On Fri, 9 Mar 2007, Eric Dumazet wrote:
>
> CAUTION : d_path() logic is quite tricky.
> The correct way to return for example "Hello" is to put it
> at the end of the buffer, and return a pointer to the first char.

Yeah, it's subtle, since it wants to use a single buffer, and not copy things around too much.

But can I ask you to do a take3, and simply have a helper function like

	char *dynamic_dname(struct dentry *dentry, char *buffer, int len,
			    const char *fmt, ...)
	{
		va_list args;
		char tmp[64];
		int i;

		va_start(args, fmt);
		i = vsnprintf(tmp, sizeof(tmp), fmt, args) + 1;
		va_end(args);

		if (i > len)
			return ERR_PTR(-ENAMETOOLONG);

		buffer += len - i;
		memcpy(buffer, tmp, i);
		return buffer;
	}

and just require that everybody use that function. Then the pipe code would just become

	static char *pipefs_dname(struct dentry *dentry, char *buffer, int buflen)
	{
		return dynamic_dname(dentry, buffer, buflen, "pipe:[%lu]",
				     dentry->d_inode->i_ino);
	}

and you're done, and you have only *one* place in the VFS layer (preferably right next to d_path() itself) that cares about the subtle issues that we have.

		Linus
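[Editor's sketch, not part of the thread.] A minimal userspace version of the end-of-buffer convention discussed above, assuming nothing beyond standard C. The name tail_name() is hypothetical. It returns NULL where the kernel helper would return ERR_PTR(-ENAMETOOLONG), drops the dentry argument, and additionally rejects output that was truncated by the 64-byte temporary, a check the helper suggested in the mail does not make.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a name, then place it at the END of 'buffer' (of 'len' bytes)
 * and return a pointer to its first character, as d_path() callers expect.
 * Returns NULL on formatting error, truncation, or if it does not fit.
 */
static char *tail_name(char *buffer, int len, const char *fmt, ...)
{
	va_list args;
	char tmp[64];
	int n;

	va_start(args, fmt);
	n = vsnprintf(tmp, sizeof(tmp), fmt, args);
	va_end(args);

	/* vsnprintf() returns the untruncated length, excluding the NUL. */
	if (n < 0 || n >= (int)sizeof(tmp))
		return NULL;

	n++;			/* account for the terminating NUL */
	if (n > len)
		return NULL;

	buffer += len - n;	/* start so the NUL lands on the last byte */
	memcpy(buffer, tmp, n);
	return buffer;
}
```

With a 32-byte buffer and the format "pipe:[%lu]" for inode 1234, the returned pointer is buffer + 20 and the terminating NUL occupies the last byte of the buffer.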
Re: 2.6.21-rc3-mm1 RSDL results
Mark Lord wrote:
> Mmm.. when it's good, it's *really* good. My desktop feels snappier and
> all of that. No noticeable jerkiness of windows/scrolling, which I *do*
> observe with the stock scheduler.
>
> But when it's bad, it stinks. Like when a "make -j2" kernel rebuild is
> happening in a background window

Would you please do that same "make -j2" niced? Tell us how that feels.

-- 
Jeffrey Hundstad
2.6.21-rc3-rt0
Hi, I get a lot of "NOHZ: local_softirq_pending 02" and I have noticed some swsuspend problems. Disabling non-boot CPUs ... CPU1 playing dead [] dump_trace+0x7f/0x229 [] show_trace_log_lvl+0x35/0x54 [] show_trace+0x2c/0x2e [] dump_stack+0x29/0x2b [] cpu_idle+0x91/0x126 [] start_secondary+0x30d/0x315 === --- | preempt count: 0001 ] | 1-level deep critical section nesting: .. [] cpu_idle+0x11b/0x126 .[] .. ( <= start_secondary+0x30d/0x315) l *0xc01021c4 0xc01021c4 is in cpu_idle (arch/i386/kernel/process.c:204). 199 tick_nohz_restart_sched_tick(); 200 local_irq_disable(); 201 __preempt_enable_no_resched(); 202 __schedule(); 203 preempt_disable(); 204 local_irq_enable(); 205 } 206 } 207 208 void cpu_idle_wait(void) l *0xc01156ec 0xc01156ec is in start_secondary (arch/i386/kernel/smpboot.c:432). 427 /* We can take interrupts now: we're officially "up". */ 428 local_irq_enable(); 429 430 wmb(); 431 cpu_idle(); 432 } 433 434 /* 435 * Everything has been set up for the secondary 436 * CPUs - they just need to reload everything CPU 1 is now offline lockdep: not fixing up alternatives. stopped custom tracer. = [ INFO: possible recursive locking detected ] [ 2.6.21-rc3-rt0 #1 - swsusp_shutdown/3406 is trying to acquire lock: ((raw_spinlock_t *)(&lock->wait_lock)){--..}, at: [] migrate_timers+0x8b/0x16e but task is already holding lock: ((raw_spinlock_t *)(&lock->wait_lock)){--..}, at: [] migrate_timers+0x77/0x16e l *0xc01329cd 0xc01329cd is in migrate_timers (kernel/timer.c:1854). 1849 1850local_irq_disable_nort(); 1851double_spin_lock(&new_base->lock, &old_base->lock, 1852 smp_processor_id() < cpu); 1853 1854BUG_ON(old_base->running_timer); 1855 1856for (i = 0; i < TVR_SIZE; i++) 1857migrate_timer_list(new_base, old_base->tv1.vec + i); 1858for (i = 0; i < TVN_SIZE; i++) { l *0xc01329b9 0xc01329b9 is in migrate_timers (include/linux/spinlock.h:711). 
706 __acquires(l1) 707 __acquires(l2) 708 { 709 if (l1_first) { 710 spin_lock(l1); 711 spin_lock(l2); 712 } else { 713 spin_lock(l2); 714 spin_lock(l1); 715 } other info that might help us debug this: 5 locks held by swsusp_shutdown/3406: #0: (pm_mutex){--..}, at: [] enter_state+0x40/0xbf #1: (cpu_add_remove_lock){--..}, at: [] disable_nonboot_cpus+0x1a/0x11c #2: (cache_chain_mutex){--..}, at: [] cpuup_callback+0x214/0x3c1 #3: (workqueue_mutex){--..}, at: [] workqueue_cpu_callback+0x11f/0x1a6 #4: ((raw_spinlock_t *)(&lock->wait_lock)){--..}, at: [] migrate_timers+0x77/0x16e l *0xc0155c98 0xc0155c98 is in enter_state (kernel/power/main.c:197). 192 int error; 193 194 if (!valid_state(state)) 195 return -ENODEV; 196 if (!mutex_trylock(&pm_mutex)) 197 return -EBUSY; 198 199 if (state == PM_SUSPEND_DISK) { 200 error = pm_suspend_disk(); 201 goto Unlock; l *0xc014f781 0xc014f781 is in disable_nonboot_cpus (kernel/cpu.c:264). 259 int disable_nonboot_cpus(void) 260 { 261 int cpu, first_cpu, error = 0; 262 263 mutex_lock(&cpu_add_remove_lock); 264 first_cpu = first_cpu(cpu_present_map); 265 if (!cpu_online(first_cpu)) { 266 error = _cpu_up(first_cpu); 267 if (error) { 268 printk(KERN_ERR "Could not bring CPU%d up.\n", l *0xc01872b7 0xc01872b7 is in cpuup_callback (mm/slab.c:1342). 1337start_cpu_timer(cpu); 1338break; 1339#ifdef CONFIG_HOTPLUG_CPU 1340case CPU_DOWN_PREPARE: 1341mutex_lock(&cache_chain_mutex); 1342break; 1343case CPU_DOWN_FAILED: 1344mutex_unlock(&cache_chain_mutex); 1345break; 1346case CPU_DEAD: l *0xc0139fb0 0xc0139fb0 is in workqueue_cpu_callback (kernel/workqueue.c:883). 878 mutex_unlock(&workqueue_mutex); 879 break; 880 881 case CPU_DOWN_PREPARE: 882 mutex_lock(&workqueue_mutex); 883 break; 884 885 case CPU_DOWN_FAILED: 886 mutex_unlock(&workqueue_mutex); 887 break; stack backtrace: [] dump_trace+0x7f/0x229 [] show_trace_log_lvl+0x35/0x5
Re: ABI coupling to hypervisors via CONFIG_PARAVIRT
On Fri, 9 Mar 2007, Ingo Molnar wrote:
>
> yes - but we already support the raw hardware ABI, in the native kernel.

Why do you continue to call paravirt an ABI?

We got over that. It's not. It's an API. VMI is an ABI. As long as you try to confuse the two, there's no point to the discussion.

Yeah, paravirt is ugly. Yeah, the calls should be moved higher in the stack. But you don't help by confusing the issue by mixing the different parts up and calling something an ABI that simply *isn't*.

Paravirt already acts on a higher level than the ioapic. It does do the "irq_disable()" kind of "highlevel" callbacks. Yeah, the "apic_write()" ones should go away, and they're just hacky, but there's nothing there that is an ABI.

So just *fix* it or tell others to fix it, instead of just confusing the issue.

And trust me, if "apic_write" causes bugs because it interacts with real APIC usage, we don't care ONE WHIT. That paravirt_ops entry goes out the window so fast you can't say "Whaa?!??".

		Linus
Re: 2.6.21-rc3-mm1 RSDL results
On Fri, Mar 09, 2007 at 07:39:05PM +1100, Con Kolivas wrote:
> On Friday 09 March 2007 19:20, Matt Mackall wrote:
> > And I've just rebooted with NO_HZ and things are greatly improved. At
> > idle, Beryl effects are silky smooth (possibly better than stock) and
> > shows less load. Under 'make', Beryl is still responsive as is Galeon.
> > No sign of lagging mouse or typing.
> >
> > Under make -j 5, things are intermittent. Galeon scrolling is
> > sometimes still responsive, but Beryl, terminals and mouse still drag
> > quite a bit.
>
> I just replied before you sent this one out I think our messages passed each
> other across the ocean somewhere. I don't quite get what combination of
> factors you're saying here caused great improvement. Was it enabling NO_HZ on
> mainline cpu scheduler or disabling NO_HZ or on RSDL?

Turning on NO_HZ on RSDL greatly improved it. I have not tried NO_HZ on mainline. The first test was with NO_HZ=n, the second was with NO_HZ=y. My baseline test was with mainline NO_HZ=y.

As an aside, we should not name config options NO_* or DISABLE_* because of the potential for double negation.

-- 
Mathematics is the supreme nostalgia of our time.
Re: ABI coupling to hypervisors via CONFIG_PARAVIRT
On Friday 09 March 2007 19:02, Ingo Molnar wrote:

> _1463_ hooks, spread out all around the x86 arch.

They are not all different hooks though, just many call sites of the same hooks. Also most of them are well defined to just match what the instructions do.

paravirt_ops has under a hundred entries right now and i intend to not expand it much further after the Xen bits are in.

> Let me demonstrate some of the bad effects, and how far we've _already_
> deviated from the 'hardware ABI'. An example: one assumes that
> paravirt_ops.safe_halt()

The vmi maintainers already agreed to fix that.

> i claim that when the 'API cut' is done at the right level

Can you make a proposal? Would you be willing to write code for that?

> then no more
> than say 100 hooks would be needed

Well we have less than 100 hooks right now, just with many call sites @)

> - with virtually zero kernel size
> increase.

I'll believe that when I see it.

> We've got all the right highlevel abstractions: genirq, gtod,
> clockevents. Whatever is missing at the moment from the framework (say
> smp_send_reschedule()) we can abstract away.

smp_send_reschedule() is just an IPI instance which is already abstracted with genapic. Xen has a genapic_xen. VMI is still where ->apic_read/->apic_write and their relatively harmless timer interrupt change make sense -- if they needed more changes they would just need to bite the bullet and provide a custom genapic vmware apic driver.

> Unfortunately, with the current paravirt_ops policy we might end up
> seeing none of that unification.

I am open to concrete incremental proposals for improvements.

>
> And that is why the "paravirt_ops is just virtual hardware" argument is
> totally wrong. _Nothing_ limits hypervisors from adding arbitrary ABI
> bindings to Linux. For example, VMI does this already and none of the
> following are hardware ABIs:
>
> #define VMI_CALL_SetAlarm 68
> #define VMI_CALL_CancelAlarm 69
> #define VMI_CALL_GetWallclockTime 70
> #define VMI_CALL_WallclockUpdated 71

That's VMI internal, not exposed above paravirt ops.

> Firstly, i think this has been over-rushed. After years of being happy
> with forks of the Linux kernel,

The code has been posted for a long time, open for review for everybody.

> Secondly, i'd like to see a paravirt approach that has /implicit/
> safeguards against the following type of crap:

I don't think you can use an API to force the underlying implementation in a practical way. If code wants to do something wrong, no API in the world will stop it. That is why we have code review instead.

>it has a hardwired assumption that 'cycles' makes sense as a way to
>communicate time units:
>
> vmi_timer_ops.set_alarm(
>	VMI_ALARM_WIRED_LVTT | VMI_ALARM_IS_PERIODIC |
>	VMI_CYCLES_AVAILABLE,
>	per_cpu(process_times_cycles_accounted_cpu, cpu) +
>	cycles_per_alarm,
>	cycles_per_alarm);

That's because VMI is defined this way? If paravirt chose to not pass cycles anymore they would just add a simple conversion function. SMOP.

>it has a hardwired assumption that Linux keeps time in units of
>'jiffies':

Well a lot of drivers have that, but it can be all fixed.

> Granted, some of these are just harmless quirks that are fixable in
> Linux only,

All of them.

> but some of these are stifling because they bind Linux to
> the hypervisor ABI.

I haven't seen a concrete example of that yet to be honest and I don't really believe it.

-Andi
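[Editor's sketch, not part of the thread.] The distinction drawn above between hooks (entries in an ops table) and call sites (places that call through an entry) can be shown with a toy function-pointer table. Everything here is hypothetical and only loosely modelled on paravirt_ops: struct demo_ops, the native_* functions and the site_* callers are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical ops table with exactly two hooks. */
struct demo_ops {
	void (*irq_disable)(void);
	void (*irq_enable)(void);
};

static int irqs_enabled = 1;	/* toy stand-in for the CPU interrupt flag */
static int hook_calls;		/* how many times any hook was invoked */

static void native_irq_disable(void) { irqs_enabled = 0; hook_calls++; }
static void native_irq_enable(void)  { irqs_enabled = 1; hook_calls++; }

/* The "native" backend; a hypervisor backend would swap these pointers. */
static struct demo_ops demo_ops = {
	.irq_disable = native_irq_disable,
	.irq_enable  = native_irq_enable,
};

/* Two unrelated callers: four indirect call sites, still only two hooks. */
static void site_a(void) { demo_ops.irq_disable(); demo_ops.irq_enable(); }
static void site_b(void) { demo_ops.irq_disable(); demo_ops.irq_enable(); }

/* Number of hooks = number of function pointers in the table. */
static size_t hook_count(void)
{
	return sizeof(struct demo_ops) / sizeof(void (*)(void));
}
```

A disassembly-based count such as the objdump pipeline quoted earlier in the thread would report the call sites (four here), while the table itself has two entries. Which of the two numbers better measures coupling is exactly what the thread is arguing about.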
Re: [PATCH] Fix building kernel under Solaris 11_snv
On Thu, Mar 08, 2007 at 11:01:57PM +0100, Jan Engelhardt wrote: > > On Mar 8 2007 22:25, Sam Ravnborg wrote: > >Subject: Re: [PATCH] Fix building kernel under Solaris > > Since Solaris seems to be on the run, I did myself try compile it. > However, unlike the original poster who said he did so on SunOS 4.8, I > did it on 5.11_snv39, yielding a bigger changeset. I thought I just > share the diff that piled up so far. It needs a lot of hacks on the > Solaris side - prioritizing GNU names, then, second, gnu ld has a > glitch, then, gcc has a missing file... it's fun fun fun! Can I please have a signed-off version of this patch. Thanks, Sam > > Well, I will iterate the key problem with the missing file: > > * include/linux/kernel.h (and many others) include > BUT - since we are using -nostdinc, /usr/include/stdarg.h is not > considered. And gcc's stdarg.h (which lives at > /usr/lib/gcc/i586-suse-linux/4.1.2/include/stdarg.h in Linux land) > is missing in Solaris' GCC (which is version 3.4.3). > > Hack #1: > ln -s \ > > /usr/sfw/lib/gcc/i386-pc-solaris2.11/3.4.3/install-tools/include/stdarg.h \ > /usr/sfw/lib/gcc/i386-pc-solaris2.11/3.4.3/include/stdarg.h > > Hack #2: GNU programs... > > mkdir -p ~/gnulink; > for i in addr2line ar as egrep grep ld make nm objcopy objdump \ > ranlib readelf size string strip tar; do > ln -s "/usr/sfw/bin/$i" "~/gnulink/$i"; > done; > > for i in cat chgrp chmod chown chroot cksum cmp cp cut date \ > dd df diff du echo env expand expr false fgrep find fold \ > getopt groups head hostid install join ln locate ls mkdir \ > mkfifo mknod mv nice nohup od pwd rm rmdir sed seq shred \ > sleep sort split stty tac tail tee touch tr true uname \ > uniq uptime wc who whoami xargs yes gawk; do > ln -s "/opt/csw/bin/$i" "~/gnulink/$i"; > done; > > Hack #3: Diff file... > > Hack #4: GNU ld glitch workaround (GNU ld looks in the current dir...) 
> > cd linux-2.6.21-rc3 > ln -s /usr/sfw/i386-sun-solaris2.11/lib/ldscripts ldscripts > > Fun #1: > > export PATH="$HOME/gnulink:$PATH"; > make ARCH=i386 > > Oddity #1: > > ARCH=i386 required because the Makefiles seem to use `uname -m` > (which returns "i86pc") rather than `uname -p`. I think we are > at odds here though... > > uname -muname -p > SOL i86pc i386 > LINUX i686athlon > > > Expect compiler failures, especially with assembler code. > > > Jan > > <<< PATCH BELOW <<< > > Index: linux-2.6.21-rc3/include/linux/input.h > === > --- linux-2.6.21-rc3.orig/include/linux/input.h 2007-03-07 > 05:41:20.0 +0100 > +++ linux-2.6.21-rc3/include/linux/input.h2007-03-07 23:40:39.417339000 > +0100 > @@ -16,7 +16,9 @@ > #include > #include > #include > -#include > +#ifndef __sun__ > +#include > +#endif > #endif > > /* > Index: linux-2.6.21-rc3/scripts/genksyms/genksyms.c > === > --- linux-2.6.21-rc3.orig/scripts/genksyms/genksyms.c 2007-03-07 > 05:41:20.0 +0100 > +++ linux-2.6.21-rc3/scripts/genksyms/genksyms.c 2007-03-07 > 23:28:35.659555000 +0100 > @@ -21,6 +21,7 @@ > along with this program; if not, write to the Free Software Foundation, > Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */ > > +#include > #include > #include > #include > Index: linux-2.6.21-rc3/scripts/kallsyms.c > === > --- linux-2.6.21-rc3.orig/scripts/kallsyms.c 2007-03-07 05:41:20.0 > +0100 > +++ linux-2.6.21-rc3/scripts/kallsyms.c 2007-03-07 23:46:46.249005000 > +0100 > @@ -378,6 +378,40 @@ > table_cnt = pos; > } > > +#ifdef __sun__ > +/* Return the first occurrence of NEEDLE in HAYSTACK. 
*/ > +void * > +memmem (haystack, haystack_len, needle, needle_len) > + const void *haystack; > + size_t haystack_len; > + const void *needle; > + size_t needle_len; > +{ > + const char *begin; > + const char *const last_possible > += (const char *) haystack + haystack_len - needle_len; > + > + if (needle_len == 0) > +/* The first occurrence of the empty string is deemed to occur at > + the beginning of the string. */ > +return (void *) haystack; > + > + /* Sanity check, otherwise the loop might search through the whole > + memory. */ > + if (__builtin_expect (haystack_len < needle_len, 0)) > +return NULL; > + > + for (begin = (const char *) haystack; begin <= last_possible; ++begin) > +if (begin[0] == ((const char *) needle)[0] && > +!memcmp ((const void *) &begin[1], > + (const void *) ((const char *) needle + 1), > + needle_l
Re: ABI coupling to hypervisors via CONFIG_PARAVIRT
* Ingo Molnar ([EMAIL PROTECTED]) wrote:
> i claim that when the 'API cut' is done at the right level then no more
> than say 100 hooks would be needed - with virtually zero kernel size
> increase. We've got all the right highlevel abstractions: genirq, gtod,
> clockevents. Whatever is missing at the moment from the framework (say
> smp_send_reschedule()) we can abstract away. The bonus? It would be
> almost directly applicable to other architectures as well. It would also
> work with /any/ hypervisor.

Oddly enough, that's really what we are trying to achieve. There is definitely some tension between the VMI model which is modeled very directly on hardware and something like the Xen model which prefers higher level interfaces.

I don't really agree with your metrics w.r.t. hooks. My point is you take callsites == hooks to arrive at 1463 hooks, but then above say 100 hooks is sufficient. But we have on the order of 100 hooks (I believe it's ~75 in Linus' tree). Put it another way. Do you believe that something like irq_{en,dis}able() is appropriate to hook (as that's > 1400 callsites already)?

> Firstly, i think this has been over-rushed. After years of being happy
> with forks of the Linux kernel, all the hypervisors woke up at once and
> want to have their stuff upstream /now/. This rush created a hodgepodge
> of APIs/ABIs that we now in the end promise to support /all/. (if we
> take CONFIG_VMI i can see little ethical reason to not take Xen's
> paravirt_ops, lguest's paravirt_ops, KVM's paravirt_ops and i'm sure
> Microsoft/Novell will have something nice and different for us too.)

It would be eminently helpful if you helped with some specific ideas on where the paravirt_ops interface needs to be adjusted.

> Secondly, i'd like to see a paravirt approach that has /implicit/
> safeguards against the following type of crap:

How would you propose doing that? Typically that's done with code review and patches.

thanks,
-chris
Re: [PATCH] Software Suspend: Fix suspend when console is in VT_AUTO/KD_GRAPHICS mode
On Fri, 2007-09-03 at 15:34 +, Matthew Garrett wrote: > On Fri, Mar 09, 2007 at 10:08:05AM +0100, Pavel Machek wrote: > > > So... if current console is graphical, we leave X accessing the > > console... That's bad, because video state is not going to be > > restored...? > > A graphical console is not necessarily X. Is there any requirement for > there to be a single VT that isn't in text mode? The vt switching is > a hack, we shouldn't make life difficult for people who have their own > userspace code that's entirely capable of restoring video state on its > own. I realised that the previous patch would disallow a console switch while running X. Attached is an updated patch with this scenario fixed. Another approach might be to fail in vt_waitactive() if a console switch is not going to occur. -- Andrew Signed-off-by: Andrew Johnson <[EMAIL PROTECTED]> --- diff -rup linux-2.6.20.1/drivers/char/vt.c linux/drivers/char/vt.c --- linux-2.6.20.1/drivers/char/vt.c2007-02-19 22:34:32.0 -0800 +++ linux/drivers/char/vt.c 2007-03-09 10:53:32.0 -0800 @@ -2188,10 +2188,22 @@ static void console_callback(struct work release_console_sem(); } -void set_console(int nr) +extern char vt_dont_switch; + +int set_console(int nr) { + struct vc_data *vc = vc_cons[fg_console].d; + + if(!vc_cons_allocated(nr) || vt_dont_switch || + (vc->vt_mode.mode != VT_PROCESS && vc->vc_mode == KD_GRAPHICS)) { + + return -EINVAL; + } + want_console = nr; schedule_console_callback(); + + return 0; } struct tty_driver *console_driver; diff -rup linux-2.6.20.1/drivers/char/vt_ioctl.c linux/drivers/char/vt_ioctl.c --- linux-2.6.20.1/drivers/char/vt_ioctl.c 2007-02-19 22:34:32.0 -0800 +++ linux/drivers/char/vt_ioctl.c 2007-03-08 14:15:41.0 -0800 @@ -34,7 +34,7 @@ #include #include -static char vt_dont_switch; +char vt_dont_switch; extern struct tty_driver *console_driver; #define VT_IS_IN_USE(i)(console_driver->ttys[i] && console_driver->ttys[i]->count) diff -rup linux-2.6.20.1/include/linux/kbd_kern.h 
linux/include/linux/kbd_kern.h --- linux-2.6.20.1/include/linux/kbd_kern.h 2007-02-19 22:34:32.0 -0800 +++ linux/include/linux/kbd_kern.h 2007-03-08 14:15:41.0 -0800 @@ -75,7 +75,7 @@ extern int do_poke_blanked_console; extern void (*kbd_ledfunc)(unsigned int led); -extern void set_console(int nr); +extern int set_console(int nr); extern void schedule_console_callback(void); static inline void set_leds(void) diff -rup linux-2.6.20.1/kernel/power/console.c linux/kernel/power/console.c --- linux-2.6.20.1/kernel/power/console.c 2007-02-19 22:34:32.0 -0800 +++ linux/kernel/power/console.c2007-03-08 14:15:41.0 -0800 @@ -27,7 +27,11 @@ int pm_prepare_console(void) return 1; } - set_console(SUSPEND_CONSOLE); + if (set_console(SUSPEND_CONSOLE)) { + /* Unable to change to the new console */ + release_console_sem(); + return 1; + } release_console_sem(); if (vt_waitactive(SUSPEND_CONSOLE)) { - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Linux 2.6.20.2
We (the -stable team) are announcing the release of the 2.6.20.2 kernel. It contains a metric buttload of bugfixes and security updates, so all 2.6.20 users are recommended to upgrade. The diffstat and short summary of the fixes are below. I'll also be replying to this message with a copy of the patch between 2.6.20.1 and 2.6.20.2. The updated 2.6.20.y git tree can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.20.y.git and can be browsed at the normal kernel.org git web browser: www.kernel.org/git/ thanks, greg k-h Makefile |2 arch/i386/kernel/cpu/mtrr/if.c | 33 +++- arch/i386/kernel/signal.c |6 +- arch/i386/kernel/sysenter.c|2 arch/ia64/Kconfig |1 arch/ia64/kernel/crash.c | 11 ++-- arch/ia64/kernel/machine_kexec.c |2 arch/m32r/kernel/process.c |2 arch/m32r/kernel/signal.c | 26 + arch/powerpc/kernel/head_64.S |2 arch/ppc/kernel/ppc_ksyms.c|2 arch/sparc64/kernel/of_device.c| 40 ++- arch/um/os-Linux/sigio.c | 38 +++--- arch/x86_64/ia32/ia32_signal.c |7 ++ arch/x86_64/ia32/ptrace32.c|1 arch/x86_64/kernel/irq.c | 12 +++- block/ll_rw_blk.c |2 drivers/Makefile |2 drivers/ata/ahci.c | 14 + drivers/ata/ata_generic.c |4 + drivers/ata/ata_piix.c |4 + drivers/ata/pata_ali.c |6 ++ drivers/ata/pata_amd.c | 10 +++ drivers/ata/pata_atiixp.c |4 + drivers/ata/pata_cmd64x.c |6 ++ drivers/ata/pata_cs5520.c |7 ++ drivers/ata/pata_cs5530.c |6 ++ drivers/ata/pata_cs5535.c |4 + drivers/ata/pata_cypress.c |4 + drivers/ata/pata_efar.c|4 + drivers/ata/pata_hpt366.c |7 ++ drivers/ata/pata_hpt3x3.c |6 ++ drivers/ata/pata_it821x.c |6 ++ drivers/ata/pata_jmicron.c |8 +++ drivers/ata/pata_marvell.c |4 + drivers/ata/pata_mpiix.c |4 + drivers/ata/pata_netcell.c |4 + drivers/ata/pata_ns87410.c |4 + drivers/ata/pata_oldpiix.c |4 + drivers/ata/pata_opti.c|4 + drivers/ata/pata_optidma.c |4 + drivers/ata/pata_pdc202xx_old.c|4 + drivers/ata/pata_radisys.c |4 + drivers/ata/pata_rz1000.c |6 ++ drivers/ata/pata_sc1200.c |4 + drivers/ata/pata_serverworks.c |6 ++ 
drivers/ata/pata_sil680.c |8 +++ drivers/ata/pata_sis.c |4 + drivers/ata/pata_triflex.c |4 + drivers/ata/pata_via.c |6 ++ drivers/ata/sata_sil.c | 10 +++ drivers/ata/sata_sil24.c |2 drivers/block/pktcdvd.c|2 drivers/char/agp/intel-agp.c | 14 +++-- drivers/char/pcmcia/cm4040_cs.c|3 - drivers/char/specialix.c |2 drivers/char/tty_io.c | 14 + drivers/hid/hid-core.c |5 - drivers/ide/ide-iops.c |2 drivers/ieee1394/nodemgr.c | 24 ++--- drivers/ieee1394/video1394.c |8 +++ drivers/input/mouse/psmouse-base.c | 28 ++ drivers/input/mouse/psmouse.h |1 drivers/input/mouse/synaptics.c|1 drivers/kvm/kvm.h |2 drivers/macintosh/Kconfig |2 drivers/md/bitmap.c| 22 +++- drivers/md/raid10.c| 38 +++--- drivers/md/raid5.c | 42 ++- drivers/media/dvb/dvb-core/dvbdev.c| 13 drivers/media/dvb/dvb-usb/cxusb.c |4 - drivers/media/dvb/dvb-usb/digitv.c |2 drivers/media/video/cx25840/cx25840-core.c |4 - drivers/media/video/cx25840/cx25840-firmware.c |2 drivers/media/video/cx88/cx88-blackbird.c | 14 +++-- drivers/media/vid
[PATCH] Bitbanging i2c bus driver using the GPIO API
This is a very simple bitbanging i2c bus driver utilizing the new
arch-neutral GPIO API. Useful for chips that don't have a built-in
i2c controller, additional i2c busses, or testing purposes.

To use, include something similar to the following in the
board-specific setup code:

	#include

	static struct i2c_gpio_platform_data i2c_gpio_data = {
		.sda_pin	= GPIO_PIN_FOO,
		.scl_pin	= GPIO_PIN_BAR,
	};
	static struct platform_device i2c_gpio_device = {
		.name		= "i2c-gpio",
		.id		= 0,
		.dev		= {
			.platform_data	= &i2c_gpio_data,
		},
	};

Register this platform_device, set up the i2c pins as GPIO if
required and you're ready to go.

Signed-off-by: Haavard Skinnemoen <[EMAIL PROTECTED]>
---
I wrote this driver for testing purposes a couple of weeks ago. Figured
I might as well post it since it looks like something like this is
needed.

This driver hasn't yet been updated for the latest change to the GPIO
API. I'll update the patch when the GPIO change makes it into mainline.

Haavard

 drivers/i2c/busses/Kconfig    |    8 ++
 drivers/i2c/busses/Makefile   |    1 +
 drivers/i2c/busses/i2c-gpio.c |  164 +
 include/linux/i2c-gpio.h      |   18 +
 include/linux/i2c-id.h        |    1 +
 5 files changed, 192 insertions(+), 0 deletions(-)

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index fb19dbb..52f79d1 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -102,6 +102,14 @@ config I2C_ELEKTOR
 	  This support is also available as a module. If so, the module
 	  will be called i2c-elektor.
 
+config I2C_GPIO
+	tristate "GPIO-based bitbanging i2c driver"
+	depends on I2C && GENERIC_GPIO
+	select I2C_ALGOBIT
+	help
+	  This is a very simple bitbanging i2c driver utilizing the
+	  arch-neutral GPIO API to control the SCL and SDA lines.
+
 config I2C_HYDRA
 	tristate "CHRP Apple Hydra Mac I/O I2C interface"
 	depends on I2C && PCI && PPC_CHRP && EXPERIMENTAL
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 290b540..68f2b05 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_I2C_AMD8111)	+= i2c-amd8111.o
 obj-$(CONFIG_I2C_AT91)		+= i2c-at91.o
 obj-$(CONFIG_I2C_AU1550)	+= i2c-au1550.o
 obj-$(CONFIG_I2C_ELEKTOR)	+= i2c-elektor.o
+obj-$(CONFIG_I2C_GPIO)		+= i2c-gpio.o
 obj-$(CONFIG_I2C_HYDRA)		+= i2c-hydra.o
 obj-$(CONFIG_I2C_I801)		+= i2c-i801.o
 obj-$(CONFIG_I2C_I810)		+= i2c-i810.o
diff --git a/drivers/i2c/busses/i2c-gpio.c b/drivers/i2c/busses/i2c-gpio.c
new file mode 100644
index 000..f5ed64e
--- /dev/null
+++ b/drivers/i2c/busses/i2c-gpio.c
@@ -0,0 +1,164 @@
+/*
+ * Bitbanging i2c bus driver using the GPIO API
+ *
+ * Copyright (C) 2006 Atmel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+void i2c_gpio_setsda(void *data, int state)
+{
+	struct i2c_gpio_platform_data *pdata = data;
+
+	if (state)
+		gpio_direction_input(pdata->sda_pin);
+	else
+		gpio_direction_output(pdata->sda_pin);
+}
+
+void i2c_gpio_setscl(void *data, int state)
+{
+	struct i2c_gpio_platform_data *pdata = data;
+
+	if (state)
+		gpio_direction_input(pdata->scl_pin);
+	else
+		gpio_direction_output(pdata->scl_pin);
+}
+
+int i2c_gpio_getsda(void *data)
+{
+	struct i2c_gpio_platform_data *pdata = data;
+
+	return gpio_get_value(pdata->sda_pin);
+}
+
+int i2c_gpio_getscl(void *data)
+{
+	struct i2c_gpio_platform_data *pdata = data;
+
+	return gpio_get_value(pdata->scl_pin);
+}
+
+static int __init i2c_gpio_probe(struct platform_device *pdev)
+{
+	struct i2c_gpio_platform_data *pdata;
+	struct i2c_algo_bit_data *bit_data;
+	struct i2c_adapter *adap;
+	int ret;
+
+	pdata = pdev->dev.platform_data;
+	if (!pdata)
+		return -ENXIO;
+
+	ret = -ENOMEM;
+	adap = kzalloc(sizeof(struct i2c_adapter), GFP_KERNEL);
+	if (!adap)
+		goto err_alloc_adap;
+	bit_data = kzalloc(sizeof(struct i2c_algo_bit_data), GFP_KERNEL);
+	if (!bit_data)
+		goto err_alloc_bit_data;
+
+	ret = gpio_request(pdata->sda_pin, "sda");
+	if (ret)
+		goto err_request_sda;
+	ret = gpio_request(pdata->scl_pin, "scl");
+	if (ret)
+		goto err_request_scl;
+
+	gpio_direction_input(pdata->sda_pin);
+	gpio_direction_input(pdata->scl_pin);
+	gpio_set_value(pdata->sda_pin, 0);
+
Re: [4/6] 2.6.21-rc2: known regressions
On Thu, March 8, 2007 11:28 pm, Len Brown wrote: > On Monday 05 March 2007 05:35, Antonino A. Daplas wrote: > > > Looks like I got fooled by the negative logic for the nvidia_bugs(). > Please test this patch -- it should fix it, > as well as simplify the code a bit. > > thanks, -Len > Yep. You can knock this one off the regression list :) Thanks, Andrew - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: ABI coupling to hypervisors via CONFIG_PARAVIRT
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > On Fri, 9 Mar 2007, Ingo Molnar wrote: > > > > yes - but we already support the raw hardware ABI, in the native > > kernel. > > Why do you continue to call paravirt an ABI? > > We got over that. It's not. It's an API. > > VMI is an ABI. Unfortunately i still dont see where i'm wrong, and i'm really trying to understand your argument. Is your argument that as long as an ABI (VMI) is never directly used but only used via wrapper functions (paravirt_ops), it has no effects whatsoever on the flexibility of the rest of the software and ceases to have any negative ABI effects? In my opinion that is an absurd (and incorrect) point so i guess you must mean something else, but i really cannot think what that is. I never said paravirt_ops is an ABI. I say that the ABI(s) _behind_ paravirt_ops [in the backend] /does/ limit Linux, even if wrapped, inevitably, and that i'm simply worried about having 4-5 independent ABIs behind each paravirt_ops variant each creating a web of design constraints on the rest of the kernel. To quote a past email of mine: || 'paravirt ops can take care of it' - but that is just blatantly || _FALSE_: the ABI 'behind' the paravirt_ops 'shines through' via || functional coupling it doesnt matter in how big letters the wrapper functions have 'freedom' written on them, the _real_ constraint is the user's expectation to have the hypervisor work with Linux that worked with that particular VMI ABI in v2.6.21. So the user wants to have its hypervisor 1.12 work with Linux v2.6.22 - without having to update the hypervisor. And Linux v2.6.23. Etc. /That/ is the 'ABI effect' i'm worried about. It is a "compatibility web" that gets more and more entangled with every new paravirt_ops implementation added. In practice, when a problem comes up during code rewrite, 90% of the time we can probably find a way around it via paravirt_ops and the backend, but i'm simply worried about the remaining 10%. 
And that 10% is not hypothetical at all, should i cite specific examples of problems that i think cannot be solved via Linux-only modifications? I'm also worried about the sheer QA inertia of having an additional 4-5 hypervisor-ABI constraints on the correctness of the kernel, in addition to the 2 main CPU variants we have at the moment. If we said "paravirt_ops must behave like real hardware" then we'd probably remove some of that risk (although enforcement is still an issue). But we _specifically_ say that no, it doesnt have to behave like real hardware. We allow shortcuts, we allow modifications of behavior - and that's good in quite many cases. But we allow really weird hacks like the .safe_halt() thing. Our only present requirement it appears is that "it works with today's hypervisor" - and that requirement automatically transforms itself into: "all future kernels will work with all past versions of the hypervisor". Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Bitbanging i2c bus driver using the GPIO API
On Friday 09 March 2007 10:48 am, Haavard Skinnemoen wrote:
> This is a very simple bitbanging i2c bus driver utilizing the new
> arch-neutral GPIO API. Useful for chips that don't have a built-in
> i2c controller, additional i2c busses, or testing purposes.

That's the right idea!

But remember that not all GPIOs support reading back the actual value
on SCL (it's an OUT pin, so lacking multidrive capability the values
"should" be what you wrote), so getscl() support should depend on a
flag in platform data.

In the same vein, if SCL is an output-only pin, you won't be able to
change its direction ... but then, I'm not sure why you were changing
its direction in setscl() rather than just its value.

I2C has another interesting special case. at91_set_multi_drive() would
be appropriate (yes?) for ARCH_AT91 to use on SCL, to best support both
clock stretching and multi-master configurations.

> +	gpio_direction_input(pdata->sda_pin);
> +	gpio_direction_input(pdata->scl_pin);
> +	gpio_set_value(pdata->sda_pin, 0);
> +	gpio_set_value(pdata->scl_pin, 0);

Surely you mean "output" in both cases. So you can set the value.
Setting the value on an input pin is undefined. ;)

> +	printk(KERN_INFO "i2c-gpio: using pins 0x%x (sda) 0x%x (scl)\n",
> +	       pdata->sda_pin, pdata->scl_pin);

Please, no hex there. I think dev_info() would be better; and it might
be nice to report whether clock stretching is supported.

> --- a/include/linux/i2c-id.h
> +++ b/include/linux/i2c-id.h
> @@ -194,6 +194,7 @@
>  #define I2C_HW_B_EM28XX	0x01001f /* em28xx video capture cards */
>  #define I2C_HW_B_CX2341X	0x010020 /* Conexant CX2341X MPEG encoder cards */
>  #define I2C_HW_B_INTELFB	0x010021 /* intel framebuffer driver */
> +#define I2C_HW_B_GPIO	0x010022 /* Generic GPIO-based driver */

It'd be nice to completely abolish those IDs, starting by not adding
new ones. Especially, not adding unused ones!

- Dave
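The review points above (drive the line low as an output instead of writing a value to an input pin, and make getscl() depend on a platform-data flag) can be sketched in a userspace mock. This is only an illustration under stated assumptions: the gpio_* functions below are local stand-ins that reuse the kernel names, gpio_direction_output() is given a value argument as an assumption about the updated GPIO API, and the scl_is_output_only field and i2c_gpio_pick_getscl() helper are hypothetical names that are not in the posted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock of a GPIO bank: each pin has a direction and an output latch. */
enum { NPINS = 8 };
struct mock_pin { int is_output; int latch; };
static struct mock_pin pins[NPINS];

static void gpio_direction_input(unsigned pin) { pins[pin].is_output = 0; }
static void gpio_direction_output(unsigned pin, int value)
{
	pins[pin].latch = value;
	pins[pin].is_output = 1;
}
static void gpio_set_value(unsigned pin, int value) { pins[pin].latch = value; }
/* In this mock an input pin floats high through the bus pull-up. */
static int gpio_get_value(unsigned pin)
{
	return pins[pin].is_output ? pins[pin].latch : 1;
}

/* Hypothetical platform data; scl_is_output_only is not in the posted patch. */
struct i2c_gpio_platform_data {
	unsigned sda_pin;
	unsigned scl_pin;
	int scl_is_output_only;	/* SCL can be neither read back nor tristated */
};

/* SDA emulates open drain: release the line for 1, drive it low for 0. */
static void i2c_gpio_setsda(void *data, int state)
{
	struct i2c_gpio_platform_data *pdata = data;

	if (state)
		gpio_direction_input(pdata->sda_pin);
	else
		gpio_direction_output(pdata->sda_pin, 0);
}

static void i2c_gpio_setscl(void *data, int state)
{
	struct i2c_gpio_platform_data *pdata = data;

	if (pdata->scl_is_output_only)
		gpio_set_value(pdata->scl_pin, state);	/* push-pull, value only */
	else if (state)
		gpio_direction_input(pdata->scl_pin);
	else
		gpio_direction_output(pdata->scl_pin, 0);
}

static int i2c_gpio_getsda(void *data)
{
	struct i2c_gpio_platform_data *pdata = data;

	return gpio_get_value(pdata->sda_pin);
}

static int i2c_gpio_getscl(void *data)
{
	struct i2c_gpio_platform_data *pdata = data;

	return gpio_get_value(pdata->scl_pin);
}

/* Returns the getscl hook to install, or NULL when SCL cannot be read back. */
typedef int (*getscl_fn)(void *);
static getscl_fn i2c_gpio_pick_getscl(const struct i2c_gpio_platform_data *pdata)
{
	return pdata->scl_is_output_only ? NULL : i2c_gpio_getscl;
}
```

In the in-kernel bit-banging algorithm a NULL getscl hook means the adapter does not wait for clock stretching, which is why the sketch leaves the hook out for an output-only SCL.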
Re: refcounting drivers' data structures used in sysfs buffers
On Fri, 9 Mar 2007, Dmitry Torokhov wrote:

> On 3/9/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:
> > Am Freitag, 9. März 2007 18:02 schrieb Dmitry Torokhov:
> >
> > > I think we already have all refcounting that is needed. What is
> > > missing is subsystem-provided ->release() hooks for drivers to release
> > > driver-specific resources when a device finally goes away.
> >
> > This is an interesting idea. Is it nice to pass through release()
> > but not open() ?
> >
>
> Not sure if I follow... Generally speaking open is not a mandatory
> operation; however every object in driver model has a release method.
> What I am saying is that certain drivers need to have their disconnect
> method split in 2 parts - one that shuts down the device and a second
> that releases resources that might be accessed through sysfs (and other
> kernel parts). That second part will have to be called from
> subsystem's core ->release() method so we need a release() hook.

Dmitry, you're not viewing this correctly.

Adding a new release() callback would solve the problem by creating
another. Drivers need to release their data as soon as possible after
they unbind from a device, not when the device itself goes away. Think
about what would happen if you tried to rmmod a driver. The rmmod process
would block until the device was unregistered.

Oliver, your idea won't work either. Think about what would happen if
someone did

	rmmod driver_module < ...
Re: [PATCH] swsusp: Disable nonboot CPUs before entering platform suspend
On Friday, 9 March 2007 13:29, Heiko Carstens wrote: > On Wed, Mar 07, 2007 at 09:07:17PM +, Pavel Machek wrote: > > Hi! > > > > > Prevent the WARN_ON() in > > > arch/x86_64/kernel/acpi/sleep.c:init_low_mapping() > > > from triggering by disabling nonboot CPUs before we finally enter the > > > platform > > > suspend. > > > > > > Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> > > > --- > > > kernel/power/disk.c |1 + > > > kernel/power/user.c |2 +- > > > 2 files changed, 2 insertions(+), 1 deletion(-) > > > > > > Index: linux-2.6.21-rc2-mm2/kernel/power/disk.c > > > === > > > --- linux-2.6.21-rc2-mm2.orig/kernel/power/disk.c > > > +++ linux-2.6.21-rc2-mm2/kernel/power/disk.c > > > @@ -61,6 +61,7 @@ static void power_down(suspend_disk_meth > > > switch(mode) { > > > case PM_DISK_PLATFORM: > > > if (pm_ops && pm_ops->enter) { > > > + disable_nonboot_cpus(); > > > kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK); > > > pm_ops->enter(PM_SUSPEND_DISK); > > > break; > > > > ...so, if pm_ops is non-null, power_down does nonboot cpu disabling, > > otherwise we proceed with cpus enabled? > > > > That looks ugly. > > > > Is the warning bogus? Or maybe we should *always* disable nonboot cpus > > in powerdown path? > > Is disable_nonboot_cpus() assuming that first_cpu(cpu_present_map) is > the boot cpu? Just wondering why disable_nonboot_cpus() isn't using just > any_online_cpu(cpu_online_map)... Is your question related to the code in kernel/cpu.c? Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/1] hotplug cpu: migrate a task within its cpuset
From: Cliff Wickman <[EMAIL PROTECTED]>

(this is a second submission -- the first was from a work area back porting
 to an older release)

When a cpu is disabled, move_task_off_dead_cpu() is called for tasks
that have been running on that cpu.

Currently, such a task is migrated:
 1) to any cpu on the same node as the disabled cpu, which is both online
    and among that task's cpus_allowed
 2) to any cpu which is both online and among that task's cpus_allowed

But the task's cpus_allowed may have been a single cpu.

This patch would insert a preference to migrate such a task to a cpu within
its cpuset (and set its cpus_allowed to its cpuset).

With this patch, migrate the task to:
 1) to any cpu on the same node as the disabled cpu, which is both online
    and among that task's cpus_allowed
 2) to any online cpu within the task's cpuset
 3) to any cpu which is both online and among that task's cpus_allowed

Diffed against 2.6.21-rc3 (Andrew's current top of tree)

Signed-off-by: Cliff Wickman <[EMAIL PROTECTED]>

---
 kernel/sched.c |    6 ++
 1 file changed, 6 insertions(+)

Index: morton.070123/kernel/sched.c
===
--- morton.070123.orig/kernel/sched.c
+++ morton.070123/kernel/sched.c
@@ -5170,6 +5170,12 @@ restart:
 	if (dest_cpu == NR_CPUS)
 		dest_cpu = any_online_cpu(p->cpus_allowed);
 
+	/* try to stay on the same cpuset */
+	if (dest_cpu == NR_CPUS) {
+		p->cpus_allowed = cpuset_cpus_allowed(p);
+		dest_cpu = any_online_cpu(p->cpus_allowed);
+	}
+
 	/* No more Mr. Nice Guy. */
 	if (dest_cpu == NR_CPUS) {
 		rq = task_rq_lock(p, &flags);
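The fallback chain in the hunk above can be modelled in userspace with plain bitmasks. This is a rough sketch, not the scheduler code: cpumask_t is reduced to an unsigned int, struct task_sketch and pick_dest_cpu() are hypothetical names, and the node mask is passed in directly. It follows the order of the diff as posted (same node, then cpus_allowed, then the cpuset), which appears to differ from the numbered list in the description, where the cpuset step comes second.

```c
#include <assert.h>

#define NR_CPUS 8
typedef unsigned int cpumask_t;		/* bit n set means cpu n is in the mask */

static cpumask_t cpu_online_map = 0xffu;

/* Lowest online cpu in mask, or NR_CPUS if there is none. */
static int any_online_cpu(cpumask_t mask)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & cpu_online_map & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/* Hypothetical stand-in for the fields of task_struct used here. */
struct task_sketch {
	cpumask_t cpus_allowed;
	cpumask_t cpuset_cpus;	/* what cpuset_cpus_allowed(p) would return */
};

static int pick_dest_cpu(struct task_sketch *p, cpumask_t dead_node_cpus)
{
	int dest_cpu;

	/* Assumption of this sketch: cpus_allowed is a subset of the cpuset. */
	assert((p->cpus_allowed & ~p->cpuset_cpus) == 0);

	/* On the same node as the dead cpu, online and allowed. */
	dest_cpu = any_online_cpu(p->cpus_allowed & dead_node_cpus);

	/* On any node, online and allowed. */
	if (dest_cpu == NR_CPUS)
		dest_cpu = any_online_cpu(p->cpus_allowed);

	/* Try to stay on the same cpuset; this widens cpus_allowed. */
	if (dest_cpu == NR_CPUS) {
		p->cpus_allowed = p->cpuset_cpus;
		dest_cpu = any_online_cpu(p->cpus_allowed);
	}

	/* NR_CPUS here is where the kernel reaches "No more Mr. Nice Guy". */
	return dest_cpu;
}
```

With a task pinned to the cpu that just went offline, only the third step finds a destination, and the task's cpus_allowed is rewritten to its cpuset as a side effect.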
Re: Sleeping thread not receive signal until it wakes up
On 3/9/07, Sergey Vlasov <[EMAIL PROTECTED]> wrote:
> On Thu, 8 Mar 2007 14:52:07 -0800 Luong Ngo wrote:
>
> [...]
> > static irqreturn board_isr(int irq, void *dev_id, struct pt_regs* regs)
> > {
> >     spin_lock(&dev->lock);
> >     if (dev->irqMask & (1 << irqBit)) {
> >         // Set the interrupt event mask
> >         dev->irqEvent |= (1 << irqBit);
> >
> >         // Disable this irq, it will be reenabled after processed by board task
> >         disable_irq(irq);
>
> I assume that your device does not support shared interrupts? If it
> does (and a PCI device is required to support them), you cannot use
> disable_irq() here (and you need to check a register in the device to
> find out if it really did generate an IRQ)...

Yes, the device does not share its interrupt.

> > static int ats89_ioctl(struct inode *inode, struct file *file, u_int cmd, u_long arg)
> > {
> >
> >     switch(cmd){
> >     case GET_IRQ_CMD: {
> >         u32 regMask32;
> >
> >         spin_lock_irq(dev->lock);
> >         while ((dev->irqMask & dev->irqEvent) == 0) {
> >             // Sleep until board interrupt happens
> >             spin_unlock_irq(dev->lock);
> >             interruptible_sleep_on(&(dev->boardIRQWaitQueue));
> >             if (uncond_wakeup) {
> >                 /* don't go back to loop */
> >                 break;
> >             }
> >             spin_lock_irq(dev->lock);
> >         }
> >
> >         uncond_wakeup = 0;
> >
> >         // Board interrupt happened
> >         regMask32 = dev->irqMask & dev->irqEvent;
> >         if(copy_to_user(&(((ATS89_IOCTL_S *)arg)->mask32),
> >                         &regMask32, sizeof(u32))) {
> >             spin_unlock_irq(dev->lock);
> >             return -EAGAIN;
> >         }
> >
> >         // Clear the event mask
> >         dev->irqEvent = 0;
> >         spin_unlock_irq(dev->lock);
> >     }
> >     break;
> >
> >
> >     }
> > }
>
> And this code is full of bugs:
>
> 1) As you have been told already, interruptible_sleep_on() and
>    sleep_on() functions are broken and should not be used (they are
>    left in the kernel only to support some obsolete code). Either
>    use wait_event_interruptible() or work with wait queues directly
>    (prepare_to_wait(), finish_wait(), ...).

I agree, but as I said, our hardware keeps raising interrupts until it
is serviced, so a missed wakeup would be repeated and should still wake
up the sleep_on call. But we will definitely change it.

> 2) The code to handle pending signals is missing - you need to have
>    this after wait_event_interruptible():
>
>        if (signal_pending(current))
>            return -ERESTARTSYS;
>
>    (but be careful - you might need to clean up something before
>    returning). This is what causes your problem -
>    interruptible_sleep_on() returns if a signal is pending, but your
>    code does not check for signals and therefore invokes
>    interruptible_sleep_on() again; but if a signal is pending,
>    interruptible_sleep_on() returns immediately, causing your driver
>    to eat 100% CPU looping in kernel mode until some device event
>    finally happens.

As pointed out by Robert, I added the check

    if (signal_pending(current))
        return -ERESTARTSYS;

right after the interruptible_sleep_on line, but I don't see any
difference yet.

> 3) If uncond_wakeup is set, you break out of the loop with dev->lock
>    unlocked; however, if dev->irqEvent gets set, you exit the loop
>    with dev->lock locked. The subsequent code always unlocks
>    dev->lock, so in the uncond_wakeup case you have double unlock.

Thanks for catching it.

> 4) You are doing copy_to_user() while holding a spinlock - this is
>    prohibited (as any other form of sleep inside a spinlock).

Thanks again. But may I ask: if it is prohibited, how come it has been
running without any error?

> 5) The return code for the copy_to_user() failure is wrong - it
>    should be -EFAULT (this is not a fatal bug, but an annoyance for
>    users of your driver, who might get such nonstandard error codes
>    while debugging their programs and wonder what is going on).

Changed.

Thank you for your input.

-LNgo
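Points 2 to 5 of the review above come down to three rules: give up when a signal interrupts the wait, leave every exit path with the lock released exactly once, and copy to user space only after dropping the lock. A userspace model can check that shape. This is only a sketch under assumptions: the lock is a depth counter, mock_wait_for_irq() and mock_copy_to_user() are hypothetical stand-ins for wait_event_interruptible() and copy_to_user(), and -EINTR is used where the driver would return -ERESTARTSYS, which is not available in userspace headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the driver state discussed in the thread. */
struct board_dev {
	unsigned int irq_mask;
	unsigned int irq_event;
	int lock_depth;		/* 1 while the mock spinlock is held */
	int signal_pending;	/* stands in for signal_pending(current) */
};

static void dev_lock(struct board_dev *dev)   { dev->lock_depth++; }
static void dev_unlock(struct board_dev *dev) { dev->lock_depth--; }

/* Stand-in for the interruptible wait: nonzero means a signal arrived. */
static int mock_wait_for_irq(struct board_dev *dev)
{
	assert(dev->lock_depth == 0);		/* never sleep with the lock held */
	if (dev->signal_pending)
		return -EINTR;
	dev->irq_event |= dev->irq_mask;	/* pretend the interrupt fired */
	return 0;
}

/* Stand-in for copy_to_user(): returns the number of items not copied. */
static int mock_copy_to_user(struct board_dev *dev, unsigned int *dst,
			     unsigned int val)
{
	assert(dev->lock_depth == 0);		/* may sleep, so no lock here */
	if (dst == NULL)
		return 1;
	*dst = val;
	return 0;
}

static int get_irq_cmd(struct board_dev *dev, unsigned int *user_mask)
{
	unsigned int pending;

	dev_lock(dev);
	while ((dev->irq_mask & dev->irq_event) == 0) {
		dev_unlock(dev);
		if (mock_wait_for_irq(dev))
			return -EINTR;	/* the driver would return -ERESTARTSYS */
		dev_lock(dev);
	}

	/* Snapshot and clear the event mask under the lock, then drop it. */
	pending = dev->irq_mask & dev->irq_event;
	dev->irq_event = 0;
	dev_unlock(dev);

	/* Note: in this sketch a failed copy loses the already-cleared event. */
	if (mock_copy_to_user(dev, user_mask, pending))
		return -EFAULT;
	return 0;
}
```

The uncond_wakeup flag from the original code is left out on purpose; whatever replaces it would need to follow the same rule of exiting with the lock released.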
Re: ABI coupling to hypervisors via CONFIG_PARAVIRT
On Fri, 9 Mar 2007, Ingo Molnar wrote:
>
> Unfortunately i still dont see where i'm wrong, and i'm really trying to
> understand your argument. Is your argument that as long as an ABI (VMI)
> is never directly used but only used via wrapper functions
> (paravirt_ops)

No. My argument is utterly and *purely* that you've been confusing the
discussion by using the wrong terms, and as a result, you've been
discussing things that aren't *relevant*.

You haven't been saying anything constructive.

For example, here's a *constructive* thing you could have said, and never
actually did:

 - paravirt_op->write_apic should not exist

   anything that needs to write to the apic should either

    (a) have been caught much earlier in the paravirt stack, ie it's a
        "disable interrupt" kind of operation, and should never even have
        gotten to the APIC write in the first place, but been handled by
        the paravirtualized handler.

    (b) just be emulated as an APIC write (and if the emulation isn't good
        enough, screw it)

In other words, to be *constructive*, you need to point out particular and
practical problem spots, instead of just ranting about any
"paravirtualized ABI".

Of *course* there is an ABI at some point behind any API. Why do you harp
on that? It's irrelevant. Any API will always end up being instantiated
into a binary thing at some point, and that binary thing will have to work
with some particular version of a hardware/infrastructure combination, but
that has *nothing* to do with anything.

The x86 instruction set is an ABI. Our API's eventually tend to be
compiled to something like that ABI, and yes, some ABI's may not be able
to do certain things. For example, on the 32-bit x86 ABI, there are no
interfaces for address space identifiers, and the wrappers become a no-op
that just don't do anything, and if you want fast context switches between
two contexts, you're screwed.

Similarly, maybe the VMI ABI doesn't allow for something that the kernel
wants to do efficiently. Big deal. What relevance does that have to do
with anything, except the fact that if true, the VMWare people are
screwed? It's *their* problem.

So please

 - point out things that are badly done. I agree that apic_write() simply
   shouldn't be an ABI point at all. But do so *directly* without some
   ranting about other things that aren't relevant. Your "1400 hooks" rant
   was pointless - there aren't 1400 hooks at all. There are 1400
   call-sites, but that's like saying that the "mov" operation is a bad
   instruction, because there are 5 million mov instructions in the
   kernel.

 - Realize that if VMI has problems, it's not *your* problem, or even the
   kernel's problem. It's purely a VMI problem. I don't understand why you
   care, or why you think we should care.

 - and I guess we can also stop cc'ing me in the first place. I don't even
   think virtualization is very interesting. I'd much rather flame people
   about bad taste in more important areas ;)

Thanks,

		Linus
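The hooks versus call-sites distinction drawn above can be made concrete with a small ops table. This is a minimal sketch, not the real paravirt_ops: struct pv_ops_sketch, its two fields and both backends are hypothetical. The struct plays the role of the API that kernel code is written against, and whatever a backend does behind its function pointers is where an ABI would sit.

```c
#include <assert.h>

/* The API: a table of hooks. Two hooks here, however many call sites use them. */
struct pv_ops_sketch {
	void (*irq_disable)(void);
	void (*irq_enable)(void);
};

/* Backend 1: stands in for running on bare hardware. */
static int native_irqs_on = 1;
static void native_irq_disable(void) { native_irqs_on = 0; }
static void native_irq_enable(void)  { native_irqs_on = 1; }

/* Backend 2: stands in for a hypervisor; every hook costs one "hypercall". */
static int hv_irqs_on = 1;
static int hv_calls;
static void hv_irq_disable(void) { hv_calls++; hv_irqs_on = 0; }
static void hv_irq_enable(void)  { hv_calls++; hv_irqs_on = 1; }

static const struct pv_ops_sketch native_ops = {
	.irq_disable = native_irq_disable,
	.irq_enable  = native_irq_enable,
};
static const struct pv_ops_sketch hv_ops = {
	.irq_disable = hv_irq_disable,
	.irq_enable  = hv_irq_enable,
};

/* Selected once at boot in a real system; switched by hand in this sketch. */
static const struct pv_ops_sketch *pv = &native_ops;

/* Two call sites sharing the same two hooks. Neither knows the backend. */
static int counter;
static void call_site_a(void)
{
	pv->irq_disable();
	counter += 1;
	pv->irq_enable();
}
static void call_site_b(void)
{
	pv->irq_disable();
	counter += 2;
	pv->irq_enable();
}
```

Adding a third or a thousandth call site changes nothing about the table, which is the sense in which the number of call sites says little about the number of hooks.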
Re: [patch 2.6.20-1] radeonfb: Add support for Radeon xpress 200m
Benjamin Herrenschmidt wrote:
> > -	radeonfb_pm_init(rinfo, rinfo->is_mobility ? 1 : -1, ignore_devlist, force_sleep);
> > +	radeonfb_pm_init(rinfo, rinfo->is_mobility && rinfo->family != CHIP_FAMILY_RS480 ? 1 : -1, ignore_devlist, force_sleep);
>
> I'd rather you add a check for RS480 inside radeonfb_pm_*
>
> Ben.

Something like this?

---

diff -upr linux-2.6.20.1-vanilla/drivers/video/aty/ati_ids.h linux-2.6.20.1/drivers/video/aty/ati_ids.h
--- linux-2.6.20.1-vanilla/drivers/video/aty/ati_ids.h	Tue Feb 20 07:34:32 2007
+++ linux-2.6.20.1/drivers/video/aty/ati_ids.h	Fri Mar 9 20:30:09 2007
@@ -209,4 +209,4 @@
 #define PCI_CHIP_R423_5D57	0x5D57
 #define PCI_CHIP_RS350_7834	0x7834
 #define PCI_CHIP_RS350_7835	0x7835
-
+#define PCI_CHIP_RS480_5955	0x5955
diff -upr linux-2.6.20.1-vanilla/drivers/video/aty/radeon_base.c linux-2.6.20.1/drivers/video/aty/radeon_base.c
--- linux-2.6.20.1-vanilla/drivers/video/aty/radeon_base.c	Tue Feb 20 07:34:32 2007
+++ linux-2.6.20.1/drivers/video/aty/radeon_base.c	Fri Mar 9 20:42:31 2007
@@ -100,6 +100,8 @@
 	{ PCI_VENDOR_ID_ATI, id, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (flags) | (CHIP_FAMILY_##family) }
 
 static struct pci_device_id radeonfb_pci_table[] = {
+	/* Radeon Xpress 200m */
+	CHIP_DEF(PCI_CHIP_RS480_5955,	RS480,	CHIP_HAS_CRTC2 | CHIP_IS_IGP | CHIP_IS_MOBILITY),
 	/* Mobility M6 */
 	CHIP_DEF(PCI_CHIP_RADEON_LY,	RV100,	CHIP_HAS_CRTC2 | CHIP_IS_MOBILITY),
 	CHIP_DEF(PCI_CHIP_RADEON_LZ,	RV100,	CHIP_HAS_CRTC2 | CHIP_IS_MOBILITY),
@@ -1990,7 +1992,8 @@ static void radeon_identify_vram(struct
 	/* framebuffer size */
 	if ((rinfo->family == CHIP_FAMILY_RS100) ||
 	    (rinfo->family == CHIP_FAMILY_RS200) ||
-	    (rinfo->family == CHIP_FAMILY_RS300)) {
+	    (rinfo->family == CHIP_FAMILY_RS300) ||
+	    (rinfo->family == CHIP_FAMILY_RS480) ) {
 		u32 tom = INREG(NB_TOM);
 		tmp = ((((tom >> 16) - (tom & 0xffff) + 1) << 6) * 1024);
diff -upr linux-2.6.20.1-vanilla/drivers/video/aty/radeon_pm.c linux-2.6.20.1/drivers/video/aty/radeon_pm.c
--- linux-2.6.20.1-vanilla/drivers/video/aty/radeon_pm.c	Tue Feb 20 07:34:32 2007
+++ linux-2.6.20.1/drivers/video/aty/radeon_pm.c	Fri Mar 9 20:39:54 2007
@@ -2826,11 +2826,15 @@ void radeonfb_pm_init(struct radeonfb_in
 	rinfo->pm_reg = pci_find_capability(rinfo->pdev, PCI_CAP_ID_PM);
 
 	/* Enable/Disable dynamic clocks: TODO add sysfs access */
-	rinfo->dynclk = dynclk;
-	if (dynclk == 1) {
+	if (rinfo->family == CHIP_FAMILY_RS480)
+		rinfo->dynclk = -1;
+	else
+		rinfo->dynclk = dynclk;
+
+	if (rinfo->dynclk == 1) {
 		radeon_pm_enable_dynamic_mode(rinfo);
 		printk("radeonfb: Dynamic Clock Power Management enabled\n");
-	} else if (dynclk == 0) {
+	} else if (rinfo->dynclk == 0) {
 		radeon_pm_disable_dynamic_mode(rinfo);
 		printk("radeonfb: Dynamic Clock Power Management disabled\n");
 	}
diff -upr linux-2.6.20.1-vanilla/drivers/video/aty/radeonfb.h linux-2.6.20.1/drivers/video/aty/radeonfb.h
--- linux-2.6.20.1-vanilla/drivers/video/aty/radeonfb.h	Tue Feb 20 07:34:32 2007
+++ linux-2.6.20.1/drivers/video/aty/radeonfb.h	Fri Mar 9 20:30:09 2007
@@ -48,6 +48,7 @@ enum radeon_family {
 	CHIP_FAMILY_RV350,
 	CHIP_FAMILY_RV380,	/* RV370/RV380/M22/M24 */
 	CHIP_FAMILY_R420,	/* R420/R423/M18 */
+	CHIP_FAMILY_RS480,
 	CHIP_FAMILY_LAST,
 };
 
@@ -64,7 +65,8 @@ enum radeon_family {
 			     ((rinfo)->family == CHIP_FAMILY_RV350) || \
 			     ((rinfo)->family == CHIP_FAMILY_R350)  || \
 			     ((rinfo)->family == CHIP_FAMILY_RV380) || \
-			     ((rinfo)->family == CHIP_FAMILY_R420))
+			     ((rinfo)->family == CHIP_FAMILY_R420)  || \
+			     ((rinfo)->family == CHIP_FAMILY_RS480) )
 
 /*
  * Chip flags
Re: [PATCH] swsusp: Disable nonboot CPUs before entering platform suspend
Hi,

On Friday, 9 March 2007 09:54, Pavel Machek wrote:
> Hi!
>
> > > > Index: linux-2.6.21-rc2-mm2/kernel/power/disk.c
> > > > ===
> > > > --- linux-2.6.21-rc2-mm2.orig/kernel/power/disk.c
> > > > +++ linux-2.6.21-rc2-mm2/kernel/power/disk.c
> > > > @@ -61,6 +61,7 @@ static void power_down(suspend_disk_meth
> > > >  	switch(mode) {
> > > >  	case PM_DISK_PLATFORM:
> > > >  		if (pm_ops && pm_ops->enter) {
> > > > +			disable_nonboot_cpus();
> > > >  			kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
> > > >  			pm_ops->enter(PM_SUSPEND_DISK);
> > > >  			break;
> > >
> > > ...so, if pm_ops is non-null, power_down does nonboot cpu disabling,
> > > otherwise we proceed with cpus enabled?
> > >
> > > That looks ugly.
> > >
> > > Is the warning bogus?
> >
> > Well, maybe. I'm not sure.
> >
> > > Or maybe we should *always* disable nonboot cpus in powerdown path?
> >
> > I think we should do that.
>
> That would be acceptable.
>
> > > > Index: linux-2.6.21-rc2-mm2/kernel/power/user.c
> > > > ===
> > > > --- linux-2.6.21-rc2-mm2.orig/kernel/power/user.c
> > > > +++ linux-2.6.21-rc2-mm2/kernel/power/user.c
> > > > @@ -398,9 +398,9 @@ static int snapshot_ioctl(struct inode *
> > > >
> > > >  	case PMOPS_ENTER:
> > > >  		if (data->platform_suspend) {
> > > > +			disable_nonboot_cpus();
> > > >  			kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
> > > >  			error = pm_ops->enter(PM_SUSPEND_DISK);
> > > > -			error = 0;
> > > >  		}
> > > >  		break;
> > >
> > > For a userland application, disabling cpus during pmops_enter is at
> > > least surprising...
> >
> > Yes, but this is not a usual ioctl(). OTOH, we can call enable_nonboot_cpus()
> > if pm_ops->enter(PM_SUSPEND_DISK) returns an error (otherwise it shouldn't
> > return at all, no?).
>
> Ok.

Well, does the appended patch look better?

Rafael

---
 kernel/power/disk.c |    1 +
 kernel/power/user.c |    3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6.21-rc3/kernel/power/disk.c
===
--- linux-2.6.21-rc3.orig/kernel/power/disk.c
+++ linux-2.6.21-rc3/kernel/power/disk.c
@@ -58,6 +58,7 @@ static inline int platform_prepare(void)
 
 static void power_down(suspend_disk_method_t mode)
 {
+	disable_nonboot_cpus();
 	switch(mode) {
 	case PM_DISK_PLATFORM:
 		if (pm_ops && pm_ops->enter) {
Index: linux-2.6.21-rc3/kernel/power/user.c
===
--- linux-2.6.21-rc3.orig/kernel/power/user.c
+++ linux-2.6.21-rc3/kernel/power/user.c
@@ -402,9 +402,10 @@ static int snapshot_ioctl(struct inode *
 
 	case PMOPS_ENTER:
 		if (data->platform_suspend) {
+			disable_nonboot_cpus();
 			kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
 			error = pm_ops->enter(PM_SUSPEND_DISK);
-			error = 0;
+			enable_nonboot_cpus();
 		}
 		break;
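The PMOPS_ENTER change above has a simple shape: take the nonboot CPUs down before calling the platform hook, and bring them back only on the path where the hook returns, which should mean it failed. A userspace mock can exercise that path. This is a sketch under assumptions: the two CPU functions are simplified stand-ins that reuse the kernel names but return nothing, and struct pm_ops_sketch, failing_enter() and pmops_enter() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { PM_SUSPEND_DISK = 4 };	/* arbitrary value for this mock */

/* Mock CPU hotplug state: how many nonboot CPUs are currently online. */
static int nonboot_cpus_online = 3;
static int nonboot_cpus_saved;

static void disable_nonboot_cpus(void)
{
	nonboot_cpus_saved = nonboot_cpus_online;
	nonboot_cpus_online = 0;
}

static void enable_nonboot_cpus(void)
{
	nonboot_cpus_online = nonboot_cpus_saved;
}

/* Hypothetical reduction of the platform hooks to the one used here. */
struct pm_ops_sketch {
	int (*enter)(int state);
};

/* A platform hook that fails and records how many nonboot CPUs it saw. */
static int cpus_seen_by_enter = -1;
static int failing_enter(int state)
{
	(void)state;
	cpus_seen_by_enter = nonboot_cpus_online;
	return -5;	/* some error; a successful enter would not return */
}

static int pmops_enter(const struct pm_ops_sketch *ops, int platform_suspend)
{
	int error = 0;

	if (platform_suspend) {
		disable_nonboot_cpus();
		error = ops->enter(PM_SUSPEND_DISK);
		/* Only reached if enter() came back, so undo the disabling. */
		enable_nonboot_cpus();
	}
	return error;
}
```

The sketch also keeps the error code from enter(), matching the removal of the `error = 0;` line in the patch.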
Re: refcounting drivers' data structures used in sysfs buffers
Am Freitag, 9. März 2007 20:32 schrieb Alan Stern:
> On Fri, 9 Mar 2007, Dmitry Torokhov wrote:
>
> > On 3/9/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:
> > > Am Freitag, 9. März 2007 18:02 schrieb Dmitry Torokhov:
> > >
> > > > I think we already have all refcounting that is needed. What is
> > > > missing is subsystem-provided ->release() hooks for drivers to release
> > > > driver-specific resources when a device finally goes away.
> > >
> > > This is an interesting idea. Is it nice to pass through release()
> > > but not open() ?
> > >
> >
> > Not sure if I follow... Generally speaking open is not a mandatory
> > operation; however every object in driver model has a release method.
> > What I am saying is that certain drivers need to have their disconnect
> > method split in 2 parts - one that shuts down the device and a second
> > that releases resources that might be accessed through sysfs (and other
> > kernel parts). That second part will have to be called from
> > subsystem's core ->release() method so we need a release() hook.
>
> Dmitry, you're not viewing this correctly.
>
> Adding a new release() callback would solve the problem by creating
> another. Drivers need to release their data as soon as possible after
> they unbind from a device, not when the device itself goes away. Think

Wait, the callback from closing the file in sysfs is the earliest
we can safely free the data structure. How do you want to free earlier?

> about what would happen if you tried to rmmod a driver. The rmmod process
> would block until the device was unregistered.
>
> Oliver, your idea won't work either. Think about what would happen if
> someone did
>
> 	rmmod driver_module < ...
>
> The rmmod process would never actually read the attribute, so until it
> exited the private data structure would have a positive refcount. But
> rmmod can't exit until the driver has been unloaded from memory, and it
> can't be unloaded while its data structure is still allocated. Thus we
> would end up with deadlock; rmmod would hang forever.
>
> It might be better to keep your earlier patch and fix the deadlock you
> mentioned earlier, the one that occurs when unbinding a driver through
> sysfs. How exactly does that deadlock work?

http://lkml.org/lkml/2007/3/6/364
http://lkml.org/lkml/2007/3/6/528

	Regards
		Oliver
Re: refcounting drivers' data structures used in sysfs buffers
On Fri, 9 Mar 2007, Alan Stern wrote: > Oliver, your idea won't work either. Think about what would happen if > someone did > > rmmod driver_module > The rmmod process would never actually read the attribute, so until it > exited the private data structure would have a positive refcount. But > rmmod can't exit until the driver has been unloaded from memory, and it > can't be unloaded while its data structure is still allocated. Thus we > would end up with deadlock; rmmod would hang forever. I take this back. Redirecting stdin to the attribute file would increase the module's refcount and cause rmmod to exit immediately with an error. After some more thought, I basically agree with what Oliver wrote originally. sysfs_dirent is indeed the logical place to store the kref pointer. However it needs to be used during open and release, not during read, write, and poll. Another point, which Oliver didn't think of, is that the kref pointer needs to be passed to the driver as an argument in the show() and store() method calls. Implementing this will be difficult. One possibility is to change the definition of sysfs_ops, adding the new struct kref * argument to the prototypes. This will involve changing _lots_ of source files, adding an unused argument to many functions, which isn't attractive. The other possibility is to test at runtime whether the kref pointer is NULL, and if it is, don't pass it. This would work, but it isn't type-safe. Finally, there's added complexity in each driver which wants to use the new facility. The module_exit routine will need to be smart enough to block until all the private data structures have been released. usb-storage does something like that now; it's kind of ugly (although it could be improved if appropriate support were added to the core kernel). 
Alan Stern
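The open/release refcounting described in the message above can be modelled in plain userspace C. This is only a sketch under stated assumptions: struct ref stands in for struct kref, and drv_bind(), drv_unbind(), attr_open(), attr_show() and attr_release() are hypothetical names, not sysfs or driver-core functions. The counter is not atomic, so the model is single-threaded. It shows the reference being taken once at open and dropped at release, with show() handed the reference pointer, so the private data outlives unbind for as long as a file stays open.

```c
/* Userspace model only; none of these names are real kernel APIs. */
#include <assert.h>
#include <stdlib.h>

/* Stand-in for struct kref (non-atomic, single-threaded model). */
struct ref {
	int count;
	void (*release)(struct ref *ref);
};

static void ref_get(struct ref *ref)
{
	ref->count++;
}

static void ref_put(struct ref *ref)
{
	assert(ref->count > 0);
	if (--ref->count == 0)
		ref->release(ref);
}

/* Hypothetical driver-private data exposed through an attribute. */
struct drv_data {
	struct ref ref;		/* first member, so release() can cast back */
	int value;
};

/* Number of private structures still allocated; a module_exit
 * routine would have to wait for this to reach zero. */
static int live_objects;

static void drv_release(struct ref *ref)
{
	free((struct drv_data *)ref);
	live_objects--;
}

/* Bind: allocate the data; the driver holds the initial reference. */
static struct drv_data *drv_bind(int value)
{
	struct drv_data *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->ref.count = 1;
	d->ref.release = drv_release;
	d->value = value;
	live_objects++;
	return d;
}

/* Unbind: drop the driver's reference; open files may keep the data alive. */
static void drv_unbind(struct drv_data *d)
{
	ref_put(&d->ref);
}

/* Models one open attribute file carrying the reference pointer. */
struct attr_file {
	struct ref *ref;
};

static void attr_open(struct attr_file *f, struct drv_data *d)
{
	ref_get(&d->ref);	/* taken once at open, not per read */
	f->ref = &d->ref;
}

/* show() receives the reference pointer and recovers the private data. */
static int attr_show(const struct attr_file *f)
{
	return ((const struct drv_data *)f->ref)->value;
}

static void attr_release(struct attr_file *f)
{
	ref_put(f->ref);	/* the last put frees the data */
	f->ref = NULL;
}
```

In this model a caller would bind, open the attribute, unbind, and still be able to call attr_show() safely until attr_release() drops the last reference. The live_objects counter is where a real module_exit routine would need some way to wait, which is the added complexity the message mentions.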
Re: [PATCH] Bitbanging i2c bus driver using the GPIO API
On Fri, Mar 09, 2007 at 11:30:12AM -0800, David Brownell wrote: > On Friday 09 March 2007 10:48 am, Haavard Skinnemoen wrote: > > This is a very simple bitbanging i2c bus driver utilizing the new > > arch-neutral GPIO API. Useful for chips that don't have a built-in > > i2c controller, additional i2c busses, or testing purposes. > > That's the right idea! But remember that not all GPIOs support > reading back the actual value on SCL (it's an OUT pin, so lacking > multidrive capability the values "should" be what you wrote), so > getscl() support should depend on a flag in platform data. In > the same vein, if SCL is an output-only pin, you won't be able > to change its direction ... but then, I'm not sure why you were > changing its direction in setscl() rather than just its value. That's a more correct I2C implementation. If you read the specs, the SDA and SCL signals are supposed to be driven by open-collector or open-drain drivers, such that devices only pull the bus low. Pull-up resistors pull the signals high when undriven. This avoids the possibility of damage caused when one device drives a signal low and another device tries to drive it high. Therefore, the correct I2C GPIO implementation is one where you drive both SDA and SCL low by using a combination of the data direction register and the output level register, but avoid driving the output high. -- Russell King Linux kernel2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of:
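The open-drain behaviour described above can be illustrated with a small simulated pin. This is a sketch under stated assumptions, not the driver from the patch and not the kernel GPIO API: struct sim_pin, od_set(), pushpull_set(), line_level() and bus_contention() are hypothetical names, and the other_low field models another device on the bus holding the line low. It contrasts emulating open-drain through the direction register with driving the pin high directly.

```c
/* Simulation only; hypothetical names, not the kernel GPIO API. */
#include <assert.h>
#include <stdbool.h>

struct sim_pin {
	bool is_output;	/* data direction register bit */
	bool out_level;	/* output level register bit */
	bool other_low;	/* another device is pulling the line low */
};

/* Wire level: low if anyone pulls low, otherwise the pull-up wins. */
static bool line_level(const struct sim_pin *p)
{
	if (p->is_output && !p->out_level)
		return false;
	if (p->other_low)
		return false;
	return true;
}

/* True when we actively drive high against a device pulling low. */
static bool bus_contention(const struct sim_pin *p)
{
	return p->is_output && p->out_level && p->other_low;
}

/* Open-drain emulation: only ever drive low; "high" means release. */
static void od_set(struct sim_pin *p, bool high)
{
	if (high) {
		p->is_output = false;	/* input: pull-up raises the line */
	} else {
		p->out_level = false;	/* latch low before enabling output */
		p->is_output = true;
	}
}

/* Push-pull: always an output; drives both levels actively. */
static void pushpull_set(struct sim_pin *p, bool high)
{
	p->is_output = true;
	p->out_level = high;
}
```

With od_set() the pin is never an output while its latch is high, so bus_contention() cannot become true whatever the other device does. With pushpull_set(p, true) it becomes true as soon as the other device pulls the line low, which is the damage scenario the message warns about.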
Re: ABI coupling to hypervisors via CONFIG_PARAVIRT
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > Similarly, maybe the VMI ABI doesn't allow for something that the > kernel wants to do efficiently. Big deal. What relevance does that > have to do with anything, except the fact that if true, the VMware > people are screwed? It's *their* problem. i won't hold you up for long, but i think this is the key difference, and if i understand your point correctly i think you are really wrong here. This is 'enterprise Linux compatibility and ABI 101', really - i dare to bet blindly that you won't see anyone here from distros arguing against this simple point. Once this thing is released upstream, it creates a new compatibility rule: _new kernel must not break on an older hypervisor_ due to a new paravirt_ops design. Ever. It's really that simple. (I think i never said this explicitly because this requirement of backwards compatibility was so obvious to me.) And it doesn't matter whether we think that it was VMware who messed up. Users/customers _will_ blame us: "v2.6.25 regresses, it won't run under ESX v1.12 anymore". Distros will yield and will undo whatever change breaks backwards compatibility with older hypervisors. (most likely it will be undone upstream already) Backwards compatibility acts as a very heavy barrier against certain types of paravirt_ops design changes. Once v2.6.21 is released, and a bigger distro releases a kernel with CONFIG_PARAVIRT+CONFIG_VMI enabled: backwards compatibility in future versions becomes mainly /that/ distro's problem (and upstream's problem), _NOT_ VMware's problem. That's why i mentioned CONFIG_COMPAT_VDSO as an example. One major distro (SuSE 9.0) came out with that particular glibc version that had a bug that depended on a particular and totally unintentional ABI detail in the vDSO. As a result we had to do several iterations of CONFIG_COMPAT_VDSO to keep backwards compatibility. And glibc is perhaps _the_ most kernel-friendly external software project in existence.
Still, the ABI dependency was there, and we cannot break users who run old userspace. The same rule holds here: we cannot break users who run an old hypervisor. Ingo
Re: 2.6.21-rc3-mm1 RSDL results
On Saturday 10 March 2007 05:27, Matt Mackall wrote: > On Fri, Mar 09, 2007 at 07:39:05PM +1100, Con Kolivas wrote: > > On Friday 09 March 2007 19:20, Matt Mackall wrote: > > > And I've just rebooted with NO_HZ and things are greatly improved. At > > > idle, Beryl effects are silky smooth (possibly better than stock) and > > > shows less load. Under 'make', Beryl is still responsive as is Galeon. > > > No sign of lagging mouse or typing. > > > > > > Under make -j 5, things are intermittent. Galeon scrolling is > > > sometimes still responsive, but Beryl, terminals and mouse still drag > > > quite a bit. > > > > I just replied before you sent this one out I think our messages passed > > each other across the ocean somewhere. I don't quite get what combination > > of factors you're saying here caused great improvement. Was it enabling > > NO_HZ on mainline cpu scheduler or disabling NO_HZ or on RSDL? > > Turning on NO_HZ on RSDL greatly improved it. I have not tried NO_HZ > on mainline. The first test was with NO_HZ=n, the second was with > NO_HZ=y. How odd. I would have thought that if an interaction was to occur it would have been without the new feature. Clearly what you describe without NO_HZ is not the expected behaviour with RSDL. I wonder what went wrong. Are you on 100HZ on that laptop? While I expect 100HZ should be ok, it might just not be... My laptop is about the same performance and works fine with 100HZ under load of all sorts BUT I don't have Beryl (which I would have thought swayed things in the opposite direction also). > As an aside, we should not name config options NO_* or DISABLE_* > because of the potential for double negation. Case in point, I couldn't figure out what you were saying :) -- -ck
Re: [PATCH] Fix building kernel under Solaris 11_snv
On Mar 9 2007 20:00, Sam Ravnborg wrote: >On Thu, Mar 08, 2007 at 11:01:57PM +0100, Jan Engelhardt wrote: >> >> Since Solaris seems to be on the run, I did myself try compile it. >> However, unlike the original poster who said he did so on SunOS 4.8, I >> did it on 5.11_snv39, yielding a bigger changeset. I thought I just >> share the diff that piled up so far. It needs a lot of hacks on the >> Solaris side - prioritizing GNU names, then, second, gnu ld has a >> glitch, then, gcc has a missing file... it's fun fun fun! > >Can I please have a signed-off version of this patch. _Are you sure_ you want all these hacks without further review from other people? Also note the patch is incomplete, for example I could not compile the acpi pieces because acsolaris.h -- which is referenced in the acpi includes -- does not exist. (Yet another piece of software that has crossplatform compatibility stuff, like XFS.) >> --- linux-2.6.21-rc3.orig/include/linux/input.h 2007-03-07 >> 05:41:20.0 +0100 >> +++ linux-2.6.21-rc3/include/linux/input.h 2007-03-07 23:40:39.417339000 >> +0100 >> @@ -16,7 +16,9 @@ >> #include >> #include >> #include >> -#include >> +#ifndef __sun__ >> +# include >> +#endif >> #endif This is not a proper fix for sure. The problem lies in file2alias.c, see (your own) http://lkml.org/lkml/2007/3/8/339 >> Index: linux-2.6.21-rc3/scripts/genksyms/genksyms.c >> === >> --- linux-2.6.21-rc3.orig/scripts/genksyms/genksyms.c 2007-03-07 >> 05:41:20.0 +0100 >> +++ linux-2.6.21-rc3/scripts/genksyms/genksyms.c 2007-03-07 >> 23:28:35.659555000 +0100 >> @@ -21,6 +21,7 @@ >> along with this program; if not, write to the Free Software Foundation, >> Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */ >> >> +#include >> #include >> #include >> #include This one, however, is valid. Can I give sign-offs for single hunks?
>> Index: linux-2.6.21-rc3/scripts/kallsyms.c >> === >> --- linux-2.6.21-rc3.orig/scripts/kallsyms.c 2007-03-07 05:41:20.0 >> +0100 >> +++ linux-2.6.21-rc3/scripts/kallsyms.c 2007-03-07 23:46:46.249005000 >> +0100 >> @@ -378,6 +378,40 @@ >> table_cnt = pos; >> } >> >> +#ifdef __sun__ >> +/* Return the first occurrence of NEEDLE in HAYSTACK. */ >> +void * >> +memmem (haystack, haystack_len, needle, needle_len) >> + const void *haystack; >> + size_t haystack_len; >> + const void *needle; >> + size_t needle_len; >> +{ >> + const char *begin; >> + const char *const last_possible >> += (const char *) haystack + haystack_len - needle_len; >> + >> + if (needle_len == 0) >> +/* The first occurrence of the empty string is deemed to occur at >> + the beginning of the string. */ >> +return (void *) haystack; >> + >> + /* Sanity check, otherwise the loop might search through the whole >> + memory. */ >> + if (__builtin_expect (haystack_len < needle_len, 0)) >> +return NULL; >> + >> + for (begin = (const char *) haystack; begin <= last_possible; ++begin) >> +if (begin[0] == ((const char *) needle)[0] && >> +!memcmp ((const void *) &begin[1], >> + (const void *) ((const char *) needle + 1), >> + needle_len - 1)) >> + return (void *) begin; >> + >> + return NULL; >> +} >> +#endif >> + >> /* replace a given token in all the valid symbols. Use the sampled symbols >> * to update the counts */ >> static void compress_symbols(unsigned char *str, int idx) This one, I am just waiting for someone to object to the extra #if-#endif. 
>> Index: linux-2.6.21-rc3/scripts/kconfig/Makefile >> === >> --- linux-2.6.21-rc3.orig/scripts/kconfig/Makefile 2007-03-07 >> 05:41:20.0 +0100 >> +++ linux-2.6.21-rc3/scripts/kconfig/Makefile2007-03-07 >> 23:21:19.730679000 +0100 >> @@ -88,7 +88,7 @@ >> HOST_EXTRACFLAGS = $(shell $(CONFIG_SHELL) $(check-lxdialog) -ccflags) >> HOST_LOADLIBES = $(shell $(CONFIG_SHELL) $(check-lxdialog) -ldflags >> $(HOSTCC)) >> >> -HOST_EXTRACFLAGS += -DLOCALE >> +HOST_EXTRACFLAGS += -DLOCALE -std=c99 -D__EXTENSIONS__ >> >> PHONY += $(obj)/dochecklxdialog >> $(obj)/dochecklxdialog: The error message for this one was: only valid in C99 mode. Linux GCC 4.1.2 does not print that, Solaris GCC 3.4.3 does. I do not know offhand who is right. >> Index: linux-2.6.21-rc3/scripts/kconfig/lxdialog/dialog.h >> === >> --- linux-2.6.21-rc3.orig/scripts/kconfig/lxdialog/dialog.h 2007-03-07 >> 05:41:20.0 +0100 >> +++ linux-2.6.21-rc3/scripts/kconfig/lxdialog/dialog.h 2007-03-07 >> 23:14:48.462956000 +0100 >> @@ -222,3 +222,7 @@ >> * -- uppercase chars are used to invoke the button (M_EVENT + 'O') >> */ >> #de