Re: [patch] Real-Time Preemption, -RT-2.6.11-rc3-V0.7.38-01
On Fri, 04 Feb 2005 11:03:47 +0100, Ingo Molnar said:
>
> i have released the -V0.7.38-01 Real-Time Preemption patch, which can be
> downloaded from the usual place:

Hey Ingo.. Sorry to keep breaking stuff on you, but.. ;)

Summary: Looks like CONFIG_NET_PKTGEN=y gives -V0.7.38-03 indigestion.

I retrofitted 0.7.38-03 onto -rc3-mm1, and at boot it wedged up hard, scrolling an error message. It looked like a 'scheduling while atomic' error coming from net/pktgen.o. Sorry for the incomplete traceback, but it locked up before userspace came up, and I don't have hardware handy for a serial console..

I found CONFIG_NET_PKTGEN=y in the config, rebuilt with =n, and the resulting kernel boots fine (am using it as I type). Vanilla -rc3-mm1 also boots fine with the PKTGEN=y setting (as did 2.6.10-mm1-V0.7.34-01, the last -mm I built with a -RT patch). I haven't tried a vanilla -rc3-V0.7.38-03, but I don't see anyplace where -mm1 hits pktgen.c.

If the above isn't enough to track down the issue, feel free to let me know what you'd like me to try next.
Re: [PATCH 2.4.19-bk8] arch/i386/kernel/pci-irq.c: Wrong message output
Mark F. Haigh wrote: Apologies. Patch now -p1-able. [Apologies yet again, now includes description] I'd submitted a patch earlier for this file, fixing a warning. When I looked at it further, I noticed it can output an incorrect warning message under certain circumstances. I've confirmed that this can and does happen in the wild: PCI: Enabling device :00:0a.0 ( -> 0001) PCI: No IRQ known for interrupt pin @ of device :00:0a.0. Probably buggy MP table. It should read "No IRQ known for interrupt pin A", but the 'pin' variable has already been decremented (from 1 to 0), so the line: printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device %s.%s\n", 'A' + pin - 1, dev->slot_name, msg); causes "pin @" to be output, because 'A' + 0 - 1 == '@'. This patch also fixes the original warning: pci-irq.c: In function `pcibios_enable_irq': pci-irq.c:1128: warning: 'msg' might be used uninitialized in this function Thanks, Mark Haigh [EMAIL PROTECTED] Signed-off-by: Mark F. Haigh <[EMAIL PROTECTED]> --- linux-2.4.29-bk8/arch/i386/kernel/pci-irq.c.orig2005-02-07 19:55:23.0 -0800 +++ linux-2.4.29-bk8/arch/i386/kernel/pci-irq.c 2005-02-07 20:13:38.0 -0800 @@ -1127,6 +1127,8 @@ if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) { char *msg; + pin--; /* interrupt pins are numbered starting from 1 */ + /* With IDE legacy devices the IRQ lookup failure is not a problem.. */ if (dev->class >> 8 == PCI_CLASS_STORAGE_IDE && !(dev->class & 0x5)) return; @@ -1134,42 +1136,39 @@ if (io_apic_assign_pci_irqs) { int irq; - if (pin) { - pin--; /* interrupt pins are numbered starting from 1 */ - irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); - /* -* Busses behind bridges are typically not listed in the MP-table. -* In this case we have to look up the IRQ based on the parent bus, -* parent slot, and pin number. The SMP code detects such bridged -* busses itself so we should get into this branch reliably. 
-*/ - temp_dev = dev; - while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ - struct pci_dev * bridge = dev->bus->self; + irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); + /* +* Busses behind bridges are typically not listed in the MP-table. +* In this case we have to look up the IRQ based on the parent bus, +* parent slot, and pin number. The SMP code detects such bridged +* busses itself so we should get into this branch reliably. +*/ + temp_dev = dev; + while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ + struct pci_dev * bridge = dev->bus->self; - pin = (pin + PCI_SLOT(dev->devfn)) % 4; - irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, - PCI_SLOT(bridge->devfn), pin); - if (irq >= 0) - printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get irq %d\n", - bridge->bus->number, PCI_SLOT(bridge->devfn), pin, irq); - dev = bridge; - } - dev = temp_dev; - if (irq >= 0) { - printk(KERN_INFO "PCI->APIC IRQ transform: (B%d,I%d,P%d) -> %d\n", - dev->bus->number, PCI_SLOT(dev->devfn), pin, irq); - dev->irq = irq; - return; - } else - msg = " Probably buggy MP table."; + pin = (pin + PCI_SLOT(dev->devfn)) % 4; + irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, + PCI_SLOT(bridge->devfn), pin); + if (irq >= 0) + printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get irq %d\n", +
Marvell Yukon 2 PCI Express 88E8050 is not supported by the EXPERIMENTAL skge driver
The description that the kernel 2.6.11-rc3-mm1 make configuration gives for the skge driver states the following:

"New SysKonnect GigaEthernet support (EXPERIMENTAL) (SKGE)
This driver support the Marvell Yukon or SysKonnect SK-98xx/SK-95xx and related Gigabit Ethernet adapters. It is a new smaller driver driver with better performance and more complete ethtool support. It does not support the link failover and network management features that "portable" vendor supplied sk98lin driver does."

My PCI Express motherboard's on-board controller,

04:00.0 Ethernet controller: Marvell Technology Group Ltd. Gigabit Ethernet Controller (rev 17)

is *not* supported by the skge driver because it is a new-generation chip, the Marvell Yukon 2. I have SysKonnect's sk98lin driver working under a custom-built 2.6.9 kernel, with SysKonnect's driver version 7.09 patched into the kernel.

The motherboard I'm using is a new Intel D915GEV. The specs for the LAN read: Gigabit (10/100/1000 Mbits/sec) LAN subsystem using the Marvell* Yukon* 88E8050 PCI Express* Gigabit Ethernet Controller.

Don't try Stephen's skge driver with this controller; it isn't supported.

RaXeT
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.4.29-bk8] Resend: sym53c8xx.c: Add ULL suffix to fix warning
Mark F. Haigh wrote:
> Apologies. Patch now -p1-able.

[Apologies yet again, description included now]

Noticed that in drivers/scsi/sym53c8xx.c:

sym53c8xx.c:13185: warning: integer constant is too large for "long" type

Since we're not dealing with C99 (yet), this 64 bit integer constant needs to be suffixed with ULL. Patch included.

Mark F. Haigh
[EMAIL PROTECTED]

Signed-off-by: Mark F. Haigh <[EMAIL PROTECTED]>

--- linux-2.4.29-bk8/drivers/scsi/sym53c8xx.c.orig	2005-02-07 19:53:05.0 -0800
+++ linux-2.4.29-bk8/drivers/scsi/sym53c8xx.c	2005-02-07 19:53:36.0 -0800
@@ -13182,7 +13182,7 @@
 	** descriptors.
 	*/
 	if (chip && (chip->features & FE_DAC)) {
-		if (pci_set_dma_mask(pdev, (u64) 0xff))
+		if (pci_set_dma_mask(pdev, (u64) 0xffULL))
 			chip->features &= ~FE_DAC_IN_USE;
 		else
 			chip->features |= FE_DAC_IN_USE;
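For readers wondering why the (u64) cast alone does not silence the warning, here is a minimal sketch. It assumes a pre-C99 compiler on a target where "long" is 32 bits, as on i386. The macro name, the 40-bit mask value and the helper function are hypothetical, picked only to illustrate the point; they are not taken from the driver.

```c
#include <stdint.h>

/*
 * Hypothetical 40-bit mask, for illustration only (not the driver's value).
 * An unsuffixed constant this wide does not fit in a 32-bit "long", so a
 * pre-C99 compiler warns about it. A cast is applied only after the
 * constant already has its type, so the cast cannot help. The ULL suffix
 * gives the constant type unsigned long long from the start.
 */
#define EXAMPLE_DMA_MASK_40BIT 0xffffffffffULL

/* Returns nonzero if addr has no bits set outside of mask. */
static int addr_fits_mask(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) == 0;
}
```

With the suffix in place the cast becomes redundant, although keeping it does no harm.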
Re: [PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output
Greg KH wrote: Oops, this time you forgot the whole description of the patch :( Third time's the charm... The following has been reported in the wild for kernel 2.6.8-24: PCI: Enabling device :00:05.0 ( -> 0002) PCI: No IRQ known for interrupt pin @ of device :00:05.0. Probably buggy MP table. It should read "No IRQ known for interrupt pin A", but the 'pin' variable has already been decremented (from 1 to 0), so the line: printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device %s.%s\n", 'A' + pin - 1, dev->slot_name, msg); causes "pin @" to be output, because 'A' + 0 - 1 == '@'. The supplied patch should fix it. It also removes a redundant check for a nonzero pin. Mark F. Haigh [EMAIL PROTECTED] Signed-off-by: Mark F. Haigh <[EMAIL PROTECTED]> --- linux-2.6.11-rc3-bk4/arch/i386/pci/irq.c.orig 2005-02-07 20:40:58.0 -0800 +++ linux-2.6.11-rc3-bk4/arch/i386/pci/irq.c2005-02-07 21:39:15.091239272 -0800 @@ -1031,56 +1031,55 @@ pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) { - char *msg; - msg = ""; + char *msg = ""; + + pin--; /* interrupt pins are numbered starting from 1 */ + if (io_apic_assign_pci_irqs) { int irq; - if (pin) { - pin--; /* interrupt pins are numbered starting from 1 */ - irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); - /* -* Busses behind bridges are typically not listed in the MP-table. -* In this case we have to look up the IRQ based on the parent bus, -* parent slot, and pin number. The SMP code detects such bridged -* busses itself so we should get into this branch reliably. 
-*/ - temp_dev = dev; - while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ - struct pci_dev * bridge = dev->bus->self; - - pin = (pin + PCI_SLOT(dev->devfn)) % 4; - irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, - PCI_SLOT(bridge->devfn), pin); - if (irq >= 0) - printk(KERN_WARNING "PCI: using PPB %s[%c] to get irq %d\n", - pci_name(bridge), 'A' + pin, irq); - dev = bridge; - } - dev = temp_dev; - if (irq >= 0) { + irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); + /* +* Busses behind bridges are typically not listed in the MP-table. +* In this case we have to look up the IRQ based on the parent bus, +* parent slot, and pin number. The SMP code detects such bridged +* busses itself so we should get into this branch reliably. +*/ + temp_dev = dev; + while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ + struct pci_dev * bridge = dev->bus->self; + + pin = (pin + PCI_SLOT(dev->devfn)) % 4; + irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, + PCI_SLOT(bridge->devfn), pin); + if (irq >= 0) + printk(KERN_WARNING "PCI: using PPB %s[%c] to get irq %d\n", + pci_name(bridge), 'A' + pin, irq); + dev = bridge; + } + dev = temp_dev; + if (irq >= 0) { #ifdef CONFIG_PCI_MSI - if (!platform_legacy_irq(irq)) - irq = IO_APIC_VECTOR(irq); + if (!platform_legacy_irq(irq)) + irq = IO_APIC_VECTOR(irq); #endif - printk(KERN_INFO "PCI->APIC IRQ transform: %s[%c] -> IRQ %d\n", - pci_name(dev), 'A' + pin, irq); - dev->irq = irq; - return 0; -
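The arithmetic behind the "pin @" message can be shown in isolation. This is only a sketch: the two helper names below are hypothetical and do not exist in the kernel. It assumes ASCII, where '@' is the character immediately before 'A'.

```c
/*
 * PCI config space reports the interrupt pin as 1..4 (INTA..INTD), with 0
 * meaning "no pin". Code that converts the value to 0-based and then
 * formats it with the 1-based formula ends up one character too low.
 */

/* Letter for a pin that is still 1-based, as read from config space. */
static char pin_letter_one_based(unsigned char pin)
{
	return (char)('A' + pin - 1);
}

/* Letter for a pin that has already been converted to 0-based. */
static char pin_letter_zero_based(unsigned char pin)
{
	return (char)('A' + pin);
}
```

Passing an already decremented pin (0) to the 1-based formula gives '@', which is the symptom reported above; whichever convention is chosen, the decrement and the format expression have to agree.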
Re: [PATCH 2.4.19-bk8] arch/i386/kernel/pci-irq.c: Wrong message output
Mark F. Haigh wrote: --- arch/i386/kernel/pci-irq.c.orig 2005-02-07 19:55:23.852531544 -0800 +++ arch/i386/kernel/pci-irq.c 2005-02-07 20:13:38.835068896 -0800 Apologies. Patch now -p1-able. Mark F. Haigh [EMAIL PROTECTED] Signed-off-by: Mark F. Haigh <[EMAIL PROTECTED]> --- linux-2.4.29-bk8/arch/i386/kernel/pci-irq.c.orig2005-02-07 19:55:23.0 -0800 +++ linux-2.4.29-bk8/arch/i386/kernel/pci-irq.c 2005-02-07 20:13:38.0 -0800 @@ -1127,6 +1127,8 @@ if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) { char *msg; + pin--; /* interrupt pins are numbered starting from 1 */ + /* With IDE legacy devices the IRQ lookup failure is not a problem.. */ if (dev->class >> 8 == PCI_CLASS_STORAGE_IDE && !(dev->class & 0x5)) return; @@ -1134,42 +1136,39 @@ if (io_apic_assign_pci_irqs) { int irq; - if (pin) { - pin--; /* interrupt pins are numbered starting from 1 */ - irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); - /* -* Busses behind bridges are typically not listed in the MP-table. -* In this case we have to look up the IRQ based on the parent bus, -* parent slot, and pin number. The SMP code detects such bridged -* busses itself so we should get into this branch reliably. -*/ - temp_dev = dev; - while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ - struct pci_dev * bridge = dev->bus->self; + irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); + /* +* Busses behind bridges are typically not listed in the MP-table. +* In this case we have to look up the IRQ based on the parent bus, +* parent slot, and pin number. The SMP code detects such bridged +* busses itself so we should get into this branch reliably. 
+*/ + temp_dev = dev; + while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ + struct pci_dev * bridge = dev->bus->self; - pin = (pin + PCI_SLOT(dev->devfn)) % 4; - irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, - PCI_SLOT(bridge->devfn), pin); - if (irq >= 0) - printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get irq %d\n", - bridge->bus->number, PCI_SLOT(bridge->devfn), pin, irq); - dev = bridge; - } - dev = temp_dev; - if (irq >= 0) { - printk(KERN_INFO "PCI->APIC IRQ transform: (B%d,I%d,P%d) -> %d\n", - dev->bus->number, PCI_SLOT(dev->devfn), pin, irq); - dev->irq = irq; - return; - } else - msg = " Probably buggy MP table."; + pin = (pin + PCI_SLOT(dev->devfn)) % 4; + irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, + PCI_SLOT(bridge->devfn), pin); + if (irq >= 0) + printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get irq %d\n", + bridge->bus->number, PCI_SLOT(bridge->devfn), pin, irq); + dev = bridge; } + dev = temp_dev; + if (irq >= 0) { + printk(KERN_INFO "PCI->APIC IRQ transform: (B%d,I%d,P%d) -> %d\n", + dev->bus->number, PCI_SLOT(dev->devfn), pin, irq); + dev->irq = irq; + return; + } else + msg = " Probably buggy MP table."; } else if (pci_probe & PCI_BIOS_IRQ_SCAN) msg = ""; else msg = " Please try using pci=biosirq.";
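The bridge walk in the patch above relies on rotating the pin by the device's slot number at each step towards the root. A minimal sketch of that one step follows, assuming 0-based pins (0 = INTA). The function names are hypothetical; example_pci_slot() mirrors what the kernel's PCI_SLOT() macro computes, namely bits 7..3 of devfn.

```c
/* Hypothetical stand-in for PCI_SLOT(): slot number is bits 7..3 of devfn. */
static unsigned int example_pci_slot(unsigned int devfn)
{
	return (devfn >> 3) & 0x1f;
}

/*
 * Pin (0-based) that a device's interrupt appears on at the parent side of
 * a PCI-to-PCI bridge: the device's own pin rotated by its slot number.
 */
static unsigned int swizzle_pin(unsigned int pin, unsigned int devfn)
{
	return (pin + example_pci_slot(devfn)) % 4;
}
```

For example, INTA (pin 0) of a device in slot 5 shows up as pin 1 (INTB) on the bridge. The loop in the patch repeats this step, looking up the bridge's own bus, slot and swizzled pin in the MP table, until a lookup succeeds or no parent bus is left.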
Re: [PATCH 2.4.29-bk8] Resend: sym53c8xx.c: Add ULL suffix to fix warning
Mark F. Haigh wrote:
> --- drivers/scsi/sym53c8xx.c.orig 2005-02-07 19:53:05.741527608 -0800
> +++ drivers/scsi/sym53c8xx.c 2005-02-07 19:53:36.782808616 -0800

Apologies. Patch now -p1-able.

Mark F. Haigh
[EMAIL PROTECTED]

Signed-off-by: Mark F. Haigh <[EMAIL PROTECTED]>

--- linux-2.4.29-bk8/drivers/scsi/sym53c8xx.c.orig	2005-02-07 19:53:05.0 -0800
+++ linux-2.4.29-bk8/drivers/scsi/sym53c8xx.c	2005-02-07 19:53:36.0 -0800
@@ -13182,7 +13182,7 @@
 	** descriptors.
 	*/
 	if (chip && (chip->features & FE_DAC)) {
-		if (pci_set_dma_mask(pdev, (u64) 0xff))
+		if (pci_set_dma_mask(pdev, (u64) 0xffULL))
 			chip->features &= ~FE_DAC_IN_USE;
 		else
 			chip->features |= FE_DAC_IN_USE;
Re: [PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output
On Mon, Feb 07, 2005 at 09:42:02PM -0800, Mark F. Haigh wrote:
> Greg KH wrote:
> >On Mon, Feb 07, 2005 at 09:06:18PM -0800, Mark F. Haigh wrote:
> >
> > > --- arch/i386/pci/irq.c.orig 2005-02-07 20:40:58.140856536 -0800
> > > +++ arch/i386/pci/irq.c 2005-02-07 20:46:06.713946296 -0800
> >
> >Can you resend this so it can be applied with -p1 to patch, and a
> >Signed-off-by: line?
> >
>
> Ack, my fault.
>
> Mark F. Haigh
> [EMAIL PROTECTED]
>
>
> Signed-off-by: Mark F. Haigh <[EMAIL PROTECTED]>

Oops, this time you forgot the whole description of the patch :(

Third time's the charm...

greg k-h
Re: [PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output
Greg KH wrote: On Mon, Feb 07, 2005 at 09:06:18PM -0800, Mark F. Haigh wrote: > --- arch/i386/pci/irq.c.orig 2005-02-07 20:40:58.140856536 -0800 > +++ arch/i386/pci/irq.c 2005-02-07 20:46:06.713946296 -0800 Can you resend this so it can be applied with -p1 to patch, and a Signed-off-by: line? Ack, my fault. Mark F. Haigh [EMAIL PROTECTED] Signed-off-by: Mark F. Haigh <[EMAIL PROTECTED]> --- linux-2.6.11-rc3-bk4/arch/i386/pci/irq.c.orig 2005-02-07 20:40:58.0 -0800 +++ linux-2.6.11-rc3-bk4/arch/i386/pci/irq.c2005-02-07 20:46:06.0 -0800 @@ -1031,56 +1031,55 @@ pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) { - char *msg; - msg = ""; + char *msg = ""; + + pin--; /* interrupt pins are numbered starting from 1 */ + if (io_apic_assign_pci_irqs) { int irq; - if (pin) { - pin--; /* interrupt pins are numbered starting from 1 */ - irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); - /* -* Busses behind bridges are typically not listed in the MP-table. -* In this case we have to look up the IRQ based on the parent bus, -* parent slot, and pin number. The SMP code detects such bridged -* busses itself so we should get into this branch reliably. -*/ - temp_dev = dev; - while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ - struct pci_dev * bridge = dev->bus->self; - - pin = (pin + PCI_SLOT(dev->devfn)) % 4; - irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, - PCI_SLOT(bridge->devfn), pin); - if (irq >= 0) - printk(KERN_WARNING "PCI: using PPB %s[%c] to get irq %d\n", - pci_name(bridge), 'A' + pin, irq); - dev = bridge; - } - dev = temp_dev; - if (irq >= 0) { + irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); + /* +* Busses behind bridges are typically not listed in the MP-table. +* In this case we have to look up the IRQ based on the parent bus, +* parent slot, and pin number. 
The SMP code detects such bridged +* busses itself so we should get into this branch reliably. +*/ + temp_dev = dev; + while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ + struct pci_dev * bridge = dev->bus->self; + + pin = (pin + PCI_SLOT(dev->devfn)) % 4; + irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, + PCI_SLOT(bridge->devfn), pin); + if (irq >= 0) + printk(KERN_WARNING "PCI: using PPB %s[%c] to get irq %d\n", + pci_name(bridge), 'A' + pin, irq); + dev = bridge; + } + dev = temp_dev; + if (irq >= 0) { #ifdef CONFIG_PCI_MSI - if (!platform_legacy_irq(irq)) - irq = IO_APIC_VECTOR(irq); + if (!platform_legacy_irq(irq)) + irq = IO_APIC_VECTOR(irq); #endif - printk(KERN_INFO "PCI->APIC IRQ transform: %s[%c] -> IRQ %d\n", - pci_name(dev), 'A' + pin, irq); - dev->irq = irq; - return 0; - } else - msg = " Probably buggy MP table."; - } + printk(KERN_INFO "PCI->APIC IRQ transform: %s[%c] -> IRQ %d\n", + pci_name(dev), 'A' + pin, irq); + dev->irq = irq; + return 0; +
Re: [PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output
On Mon, Feb 07, 2005 at 09:06:18PM -0800, Mark F. Haigh wrote:
>
> (Same basic problem I just reported in a seperate thread against 2.4.29-bk8)
>
> The following has been reported in the wild for kernel 2.6.8-24:
>
> PCI: Enabling device :00:05.0 ( -> 0002)
> PCI: No IRQ known for interrupt pin @ of device :00:05.0. Probably
> buggy MP table.
>
> It should read "No IRQ known for interrupt pin A", but the 'pin'
> variable has already been decremented (from 1 to 0), so the line:
>
> printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device
> %s.%s\n", 'A' + pin - 1, dev->slot_name, msg);
>
> causes "pin @" to be output, because 'A' + 0 - 1 == '@'.
>
> The supplied patch should fix it. It also removes a redundant check for
> a nonzero pin.
>
>
> Mark F. Haigh
> [EMAIL PROTECTED]
>
> --- arch/i386/pci/irq.c.orig 2005-02-07 20:40:58.140856536 -0800
> +++ arch/i386/pci/irq.c 2005-02-07 20:46:06.713946296 -0800

Can you resend this so it can be applied with -p1 to patch, and a
Signed-off-by: line?

thanks,

greg k-h
[PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output
(Same basic problem I just reported in a seperate thread against 2.4.29-bk8) The following has been reported in the wild for kernel 2.6.8-24: PCI: Enabling device :00:05.0 ( -> 0002) PCI: No IRQ known for interrupt pin @ of device :00:05.0. Probably buggy MP table. It should read "No IRQ known for interrupt pin A", but the 'pin' variable has already been decremented (from 1 to 0), so the line: printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device %s.%s\n", 'A' + pin - 1, dev->slot_name, msg); causes "pin @" to be output, because 'A' + 0 - 1 == '@'. The supplied patch should fix it. It also removes a redundant check for a nonzero pin. Mark F. Haigh [EMAIL PROTECTED] --- arch/i386/pci/irq.c.orig2005-02-07 20:40:58.140856536 -0800 +++ arch/i386/pci/irq.c 2005-02-07 20:46:06.713946296 -0800 @@ -1031,56 +1031,55 @@ pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) { - char *msg; - msg = ""; + char *msg = ""; + + pin--; /* interrupt pins are numbered starting from 1 */ + if (io_apic_assign_pci_irqs) { int irq; - if (pin) { - pin--; /* interrupt pins are numbered starting from 1 */ - irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); - /* -* Busses behind bridges are typically not listed in the MP-table. -* In this case we have to look up the IRQ based on the parent bus, -* parent slot, and pin number. The SMP code detects such bridged -* busses itself so we should get into this branch reliably. 
-*/ - temp_dev = dev; - while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ - struct pci_dev * bridge = dev->bus->self; - - pin = (pin + PCI_SLOT(dev->devfn)) % 4; - irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, - PCI_SLOT(bridge->devfn), pin); - if (irq >= 0) - printk(KERN_WARNING "PCI: using PPB %s[%c] to get irq %d\n", - pci_name(bridge), 'A' + pin, irq); - dev = bridge; - } - dev = temp_dev; - if (irq >= 0) { + irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); + /* +* Busses behind bridges are typically not listed in the MP-table. +* In this case we have to look up the IRQ based on the parent bus, +* parent slot, and pin number. The SMP code detects such bridged +* busses itself so we should get into this branch reliably. +*/ + temp_dev = dev; + while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ + struct pci_dev * bridge = dev->bus->self; + + pin = (pin + PCI_SLOT(dev->devfn)) % 4; + irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, + PCI_SLOT(bridge->devfn), pin); + if (irq >= 0) + printk(KERN_WARNING "PCI: using PPB %s[%c] to get irq %d\n", + pci_name(bridge), 'A' + pin, irq); + dev = bridge; + } + dev = temp_dev; + if (irq >= 0) { #ifdef CONFIG_PCI_MSI - if (!platform_legacy_irq(irq)) - irq = IO_APIC_VECTOR(irq); + if (!platform_legacy_irq(irq)) + irq = IO_APIC_VECTOR(irq); #endif - printk(KERN_INFO "PCI->APIC IRQ transform: %s[%c] -> IRQ %d\n", - pci_name(dev), 'A' + pin, irq); - dev->irq = irq; - return 0; - } else - msg = " Probably buggy MP table."; - } +
[PATCH 2.4.29-bk8] Resend: sym53c8xx.c: Add ULL suffix to fix warning
Same patch, now against 2.4.29-bk8:

Noticed that in drivers/scsi/sym53c8xx.c:

sym53c8xx.c:13185: warning: integer constant is too large for "long" type

Since we're not dealing with C99 (yet), this 64 bit integer constant needs to be suffixed with ULL. Patch included.

Mark F. Haigh
[EMAIL PROTECTED]

--- drivers/scsi/sym53c8xx.c.orig	2005-02-07 19:53:05.741527608 -0800
+++ drivers/scsi/sym53c8xx.c	2005-02-07 19:53:36.782808616 -0800
@@ -13182,7 +13182,7 @@
 	** descriptors.
 	*/
 	if (chip && (chip->features & FE_DAC)) {
-		if (pci_set_dma_mask(pdev, (u64) 0xff))
+		if (pci_set_dma_mask(pdev, (u64) 0xffULL))
 			chip->features &= ~FE_DAC_IN_USE;
 		else
 			chip->features |= FE_DAC_IN_USE;
[PATCH 2.4.19-bk8] arch/i386/kernel/pci-irq.c: Wrong message output
I'd submitted a patch earlier for this file, fixing a warning. When I looked at it further, I noticed it can output an incorrect warning message under certain circumstances. I've confirmed that this can and does happen in the wild: PCI: Enabling device :00:0a.0 ( -> 0001) PCI: No IRQ known for interrupt pin @ of device :00:0a.0. Probably buggy MP table. It should read "No IRQ known for interrupt pin A", but the 'pin' variable has already been decremented (from 1 to 0), so the line: printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device %s.%s\n", 'A' + pin - 1, dev->slot_name, msg); causes "pin @" to be output, because 'A' + 0 - 1 == '@'. This patch also fixes the original warning: pci-irq.c: In function `pcibios_enable_irq': pci-irq.c:1128: warning: 'msg' might be used uninitialized in this function Thanks, Mark Haigh [EMAIL PROTECTED] --- arch/i386/kernel/pci-irq.c.orig 2005-02-07 19:55:23.852531544 -0800 +++ arch/i386/kernel/pci-irq.c 2005-02-07 20:13:38.835068896 -0800 @@ -1127,6 +1127,8 @@ if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) { char *msg; + pin--; /* interrupt pins are numbered starting from 1 */ + /* With IDE legacy devices the IRQ lookup failure is not a problem.. */ if (dev->class >> 8 == PCI_CLASS_STORAGE_IDE && !(dev->class & 0x5)) return; @@ -1134,42 +1136,39 @@ if (io_apic_assign_pci_irqs) { int irq; - if (pin) { - pin--; /* interrupt pins are numbered starting from 1 */ - irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); - /* -* Busses behind bridges are typically not listed in the MP-table. -* In this case we have to look up the IRQ based on the parent bus, -* parent slot, and pin number. The SMP code detects such bridged -* busses itself so we should get into this branch reliably. 
-*/ - temp_dev = dev; - while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ - struct pci_dev * bridge = dev->bus->self; + irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin); + /* +* Busses behind bridges are typically not listed in the MP-table. +* In this case we have to look up the IRQ based on the parent bus, +* parent slot, and pin number. The SMP code detects such bridged +* busses itself so we should get into this branch reliably. +*/ + temp_dev = dev; + while (irq < 0 && dev->bus->parent) { /* go back to the bridge */ + struct pci_dev * bridge = dev->bus->self; - pin = (pin + PCI_SLOT(dev->devfn)) % 4; - irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, - PCI_SLOT(bridge->devfn), pin); - if (irq >= 0) - printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get irq %d\n", - bridge->bus->number, PCI_SLOT(bridge->devfn), pin, irq); - dev = bridge; - } - dev = temp_dev; - if (irq >= 0) { - printk(KERN_INFO "PCI->APIC IRQ transform: (B%d,I%d,P%d) -> %d\n", - dev->bus->number, PCI_SLOT(dev->devfn), pin, irq); - dev->irq = irq; - return; - } else - msg = " Probably buggy MP table."; + pin = (pin + PCI_SLOT(dev->devfn)) % 4; + irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, + PCI_SLOT(bridge->devfn), pin); + if (irq >= 0) + printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get irq %d\n", + bridge->bus->number, PCI_SLOT(bridge->devfn), pin, irq); + dev = bridge; } + dev = temp_dev; +
Kernel panic while executing init. (2.6.11-rc3)
The kernel panicked while booting 2.6.11-rc3 (on an HP rx5670, 2 CPUs), configured and compiled with the zx1_defconfig target.

I want to follow the steps given below to understand and debug the problem. Please correct me if this is not the right way of attacking problems of this kind.

1. Disassemble "create_elf_tables" from vmlinux.
2. Locate the code with the help of the IP offset available in the panic dump.
3. Use the register values to see what might have gone wrong.

I am not sure how I will be able to do the following:

1. How do I get the kernel data structure values at the time of the panic?
2. How do I know what fault caused the problem (data page fault, instruction fault, etc.)?

--
vishwas

ELILO boot: test2611rc3
Uncompressing Linux... done
Linux version 2.6.11-rc3 ([EMAIL PROTECTED]) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-42)) #1 SMP Mon Feb 7 12:37:59 IST 2005
EFI v1.10 by HP: SALsystab=0x3ff88000 ACPI 2.0=0x3fdf6000 SMBIOS=0x3ff8a000 HCDP=0x3fdf5000
PCDP: v0 at 0x3fdf5000
Early serial console at MMIO 0x80006000 (options '9600n8')
warning: skipping physical page 0
SAL 0.20: INTEL MSL REF SAL version 2.0
SAL: AP wakeup using external interrupt vector 0xff
ACPI: Local APIC address c000fee0
GSI 20 (level, low) -> CPU 0 (0x) vector 48
2 CPUs available, 2 CPUs total
MCA related initialization done
Virtual mem_map starts at 0xa0007fffc720
Built 1 zonelists
Kernel command line: BOOT_IMAGE=scsi2:EFI\redhat\vmlinuz-2611rc3 root=/dev/sdb3 ro
PID hash table entries: 4096 (order: 12, 131072 bytes)
Console: colour dummy device 80x25
Dentry cache hash table entries: 1048576 (order: 9, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 8, 4194304 bytes)
Memory: 6236480k/6284976k available (8001k code, 48160k reserved, 3681k data, 272k init)
Leaving McKinley Errata 9 workaround enabled
Mount-cache hash table entries: 1024 (order: 0, 16384 bytes)
Boot processor id 0x0/0x0
task migration cache decay timeout: 10 msecs.
CPU 1: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 435 cycles)
Brought up 2 CPUs
Total of 2 processors activated (2694.04 BogoMIPS).
NET: Registered protocol family 16
ACPI: Subsystem revision 20050125
ACPI: Interpreter enabled
ACPI: Using IOSAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
ACPI: PCI Root Bridge [PCI1] (00:20)
ACPI: PCI Root Bridge [PCI2] (00:40)
ACPI: PCI Root Bridge [PCI3] (00:60)
ACPI: PCI Root Bridge [PCI4] (00:80)
ACPI: PCI Root Bridge [PCI5] (00:a0)
ACPI: PCI Root Bridge [PCI6] (00:c0)
ACPI: PCI Root Bridge [PCI7] (00:e0)
SCSI subsystem initialized
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
** PCI interrupts are no longer routed automatically. If this
** causes a device to stop working, it is probably because the
** driver failed to call pci_enable_device(). As a temporary
** workaround, the "pci=routeirq" argument restores the old
** behavior. If this argument makes the device work again,
** please email the output of "lspci" to [EMAIL PROTECTED]
** so I can fix the driver.
IOC: zx1 2.3 HPA 0xfed01000 IOVA space 1024Mb at 0x4000
perfmon: version 2.0 IRQ 238
perfmon: Itanium 2 PMU detected, 16 PMCs, 18 PMDs, 4 counters (47 bits)
PAL Information Facility v0.5
perfmon: added sampling format default_format
perfmon_default_smpl: default_format v2.0 registered
Total HugeTLB memory allocated, 0
Installing knfsd (copyright (C) 1996 [EMAIL PROTECTED]).
Initializing Cryptographic API
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.4
ACPI: Power Button (FF) [PWRF]
ACPI: Sleep Button (FF) [SLPF]
ACPI: Thermal Zone [THM0] (27 C)
EFI Time Services Driver v0.4
Linux agpgart interface v0.100 (c) Dave Jones
[drm] Initialized drm 1.0.0 20040925
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
GSI 16 (level, low) -> CPU 1 (0x0100) vector 49
ACPI: PCI interrupt :00:01.0[A] -> GSI 16 (level, low) -> IRQ 49
ttyS0 at MMIO 0x80007000 (irq = 49) is a 16550A
ACPI: PCI interrupt :00:01.1[A] -> GSI 16 (level, low) -> IRQ 49
ttyS1 at MMIO 0x80006000 (irq = 49) is a 16550A
ttyS2 at MMIO 0x80006010 (irq = 49) is a 16550A
ttyS3 at MMIO 0x80006038 (irq = 49) is a 16550A
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
loop: loaded (max 8 devices)
Intel(R) PRO/1000 Network Driver - version 5.6.10.1-k2
Copyright (c) 1999-2004 Intel Corporation.
e100: Intel(R) PRO/100 Network Driver, 3.3.6-k2-NAPI
e100: Copyright(c) 1999-2004 Intel Corporation
tg3.c:v3.19 (January 26, 2005)
GSI 27 (level, low) -> CPU 0 (0x) vector 50
ACPI: PCI interrupt :21:04.0[A] -> GSI 27 (level, low) -> IRQ 50
eth0: Tigon3 [partno(A6794-60001) rev 0105 PHY(5701)] (PCI:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:30:6e:49:1f:a2
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1]
[OSDL] email gateway for STP available
It is still somewhat beta; we're working on ease of use.

Release 3.0.19 of STP, available at Sourceforge (http://sourceforge.net/projects/stp) and via BK (bk://developer.osdl.org tag: release_3.0.19), adds an email gateway, so you can submit test requests without the Web.

I am looking for a few beta testers. Grab the kit and email me if you're interested in using it.

cliffw
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
PCI Error reporting & recovery
Hi Seto!

I was reading the list archives for the discussion back in September about PCI error reporting. Has there been any further progress on this since then?

I'm looking into adapting something for the needs of ppc64 as well (which, btw, has 1 slot = 1 bridge in most cases, but not all of them :), which uses quite different low-level mechanisms. (Basically, we have to go through the firmware to get to the errors.)

Also, our bridges automatically isolate slots that had any error on them (including DMA), and we have the ability to recover by triggering a reset on a given segment and that sort of thing, which I would like to give drivers an API to control as well.

Finally, I was thinking about some richer semantics for the errors themselves. For example, on a DMA error we can sometimes get good details about the faulting address etc., which may be interesting for the driver to log, for diagnostic purposes at least.

So I'd like to start from what you did back then and discuss possible APIs for the above ideas / changes. What is the status of that stuff? Did it evolve since then?

Regards,
Ben.
Re: 2.6.11-rc3: Kylix application no longer works?
Grzegorz Kulewski <[EMAIL PROTECTED]> wrote:
>
> On Mon, 7 Feb 2005, Andrew Morton wrote:
>
> > Daniel Drake <[EMAIL PROTECTED]> wrote:
> >>
> >>> # fs/binfmt_elf.c
> >>> # 2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19
> >>> # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> >>> #
> >>
> >> I think so. For a short period we applied this patch to the Gentoo 2.6.10
> >> kernel...
> >>
> >> http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch
> >>
> >> ...but removed it once users complained it stopped kylix binaries from running.
> >
> > Bah. That's what happens when you fix stuff.
> >
> > What's kylix? The Borland C++ builder thing?
>
> Rather Delphi (== Object Pascal) thing.
>
>
> > How should one set about reproducing this problem?
>
> IIRC, Some minimal "personal" version can be downloaded from borland.com.

Well I'd prefer that we not back out the whole patch. Could someone please test with something like the below, let us know exactly where it's falling over?
--- 25/fs/binfmt_elf.c~a	2005-02-07 20:01:16.0 -0800
+++ 25-akpm/fs/binfmt_elf.c	2005-02-07 20:03:51.0 -0800
@@ -44,6 +44,8 @@
 
 #include
 
+#define D() do { printk("%s:%d\n", __FILE__, __LINE__); dump_stack(); } while (0)
+
 static int load_elf_binary(struct linux_binprm * bprm, struct pt_regs * regs);
 static int load_elf_library(struct file*);
 static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int);
@@ -181,8 +183,10 @@ create_elf_tables(struct linux_binprm *b
 			STACK_ALLOC(p, ((current->pid % 64) << 7));
 #endif
 		u_platform = (elf_addr_t __user *)STACK_ALLOC(p, len);
-		if (__copy_to_user(u_platform, k_platform, len))
+		if (__copy_to_user(u_platform, k_platform, len)) {
+			D();
 			return -EFAULT;
+		}
 	}
 
 	/* Create the ELF interpreter info */
@@ -244,8 +248,10 @@ create_elf_tables(struct linux_binprm *b
 #endif
 
 	/* Now, let's put argc (and argv, envp if appropriate) on the stack */
-	if (__put_user(argc, sp++))
+	if (__put_user(argc, sp++)) {
+		D();
 		return -EFAULT;
+	}
 	if (interp_aout) {
 		argv = sp + 2;
 		envp = argv + argc + 1;
@@ -266,8 +272,10 @@ create_elf_tables(struct linux_binprm *b
 			return 0;
 		p += len;
 	}
-	if (__put_user(0, argv))
+	if (__put_user(0, argv)) {
+		D();
 		return -EFAULT;
+	}
 	current->mm->arg_end = current->mm->env_start = p;
 	while (envc-- > 0) {
 		size_t len;
@@ -277,14 +285,18 @@ create_elf_tables(struct linux_binprm *b
 			return 0;
 		p += len;
 	}
-	if (__put_user(0, envp))
+	if (__put_user(0, envp)) {
+		D();
 		return -EFAULT;
+	}
 	current->mm->env_end = p;
 
 	/* Put the elf_info on the stack in the right place. */
 	sp = (elf_addr_t __user *)envp + 1;
-	if (copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t)))
+	if (copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t))) {
+		D();
 		return -EFAULT;
+	}
 	return 0;
 }
 
_
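The debugging patch above tags every -EFAULT exit with a D() call so that the log shows which check failed. A minimal user-space sketch of the same technique follows, assuming fprintf() as a stand-in for printk()/dump_stack(); the names TRACE_FAIL, last_fail_line, and build_tables are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <stdio.h>

/* Line number of the most recent failure site (hypothetical helper state). */
static int last_fail_line;

/*
 * User-space stand-in for the D() macro in the patch: report the file and
 * line of the failing check.  The do { } while (0) wrapper makes the macro
 * behave like a single statement, so it stays safe inside an unbraced
 * if/else.  __LINE__ expands at the place the macro is used.
 */
#define TRACE_FAIL() do {                                   \
        last_fail_line = __LINE__;                          \
        fprintf(stderr, "%s:%d\n", __FILE__, __LINE__);     \
} while (0)

/*
 * Toy function with two exits that both return -EFAULT.  Without the
 * trace, a caller cannot tell which check failed.
 */
static int build_tables(const char *platform, const char *argv0)
{
        if (platform == NULL) {
                TRACE_FAIL();
                return -EFAULT;
        }
        if (argv0 == NULL) {
                TRACE_FAIL();
                return -EFAULT;
        }
        return 0;
}
```

Calling build_tables(NULL, "kylix") and build_tables("i686", NULL) both return -EFAULT, but the line printed to stderr differs, which is the information the kernel patch is trying to collect.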
[RFC][PATCH] remove "base" argument from __free_pages_bulk()
Appended is a patch which stops using zone->zone_mem_map to calculate the buddy and combined page pointers. It uses the fact that the mem_map array is guaranteed to be contiguous for the surrounding (1 << MAX_ORDER) pages, and it relies on the relative positions of the pages in the physical address space to provide the alignment, which coincidentally fixes the issue where zones are not aligned at MAX_ORDER. There is a very comprehensive comment in the new code explaining the mathematical relationship between a page and its buddy, so I won't reproduce it here.

This kind of approach is required for CONFIG_NONLINEAR systems, where the mem_map is not contiguous within a zone and zone->zone_mem_map is not used at all.

This patch has been boot-tested on a large variety of systems and architectures: my P4 laptop, 16-way NUMAQ, 16-way Summit, 4-way x86 SMP, ppc64 LPAR, x86_64, and several ia64 configurations.

It has been performance-tested on a 16-way NUMAQ. SDET shows a very slight (within margin of error) performance gain. Kernbench shows an approximately 1% decrease in system time with this patch applied. So, it likely has a positive performance impact.

However, the patch has the potential to have a negative performance impact on systems with an expensive page_to_pfn() implementation. But I think the NUMAQ has one of the more expensive ones around, and it doesn't seem to mind too much.
Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]>
Signed-off-by: Dave Hansen <[EMAIL PROTECTED]>
---

 memhotplug-dave/mm/page_alloc.c |   57 +---
 1 files changed, 42 insertions(+), 15 deletions(-)

diff -puN mm/page_alloc.c~B-sparse-120-free-pages-no-base mm/page_alloc.c
--- memhotplug/mm/page_alloc.c~B-sparse-120-free-pages-no-base	2005-02-04 15:21:59.0 -0800
+++ memhotplug-dave/mm/page_alloc.c	2005-02-07 20:02:25.0 -0800
@@ -192,6 +192,35 @@ static inline void rmv_page_order(struct
 }
 
 /*
+ * Locate the struct page for both the matching buddy in our
+ * pair (buddy1) and the combined O(n+1) page they form (page).
+ *
+ * 1) Any buddy B1 will have an order O twin B2 which satisfies
+ * the following equation:
+ *     B2 = B1 ^ (1 << O)
+ * For example, if the starting buddy (buddy2) is #8 its order
+ * 1 buddy is #10:
+ *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
+ *
+ * 2) Any buddy B will have an order O+1 parent P which
+ * satisfies the following equation:
+ *     P = B & ~(1 << O)
+ *
+ * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
+ */
+static inline struct page *__page_find_buddy(struct page *page, unsigned long page_idx, unsigned int order)
+{
+	unsigned long buddy_idx = page_idx ^ (1 << order);
+
+	return page + (buddy_idx - page_idx);
+}
+
+static inline unsigned long __find_combined_index(unsigned long page_idx, unsigned int order)
+{
+	return (page_idx & ~(1 << order));
+}
+
+/*
  * This function checks whether a page is free && is the buddy
  * we can do coalesce a page and its buddy if
  * (a) the buddy is free &&
@@ -234,44 +263,43 @@ static inline int page_is_buddy(struct p
  * -- wli
  */
 
-static inline void __free_pages_bulk (struct page *page, struct page *base,
+static inline void __free_pages_bulk (struct page *page,
 		struct zone *zone, unsigned int order)
 {
 	unsigned long page_idx;
-	struct page *coalesced;
 	int order_size = 1 << order;
 
 	if (unlikely(order))
 		destroy_compound_page(page, order);
 
-	page_idx = page - base;
+	page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
 
 	BUG_ON(page_idx & (order_size - 1));
 	BUG_ON(bad_range(zone, page));
 
 	zone->free_pages += order_size;
 	while (order < MAX_ORDER-1) {
+		unsigned long combined_idx;
 		struct free_area *area;
 		struct page *buddy;
-		int buddy_idx;
 
-		buddy_idx = (page_idx ^ (1 << order));
-		buddy = base + buddy_idx;
+		combined_idx = __find_combined_index(page_idx, order);
+		buddy = __page_find_buddy(page, page_idx, order);
+
 		if (bad_range(zone, buddy))
 			break;
 		if (!page_is_buddy(buddy, order))
-			break;
-		/* Move the buddy up one level. */
+			break;		/* Move the buddy up one level. */
 		list_del(&buddy->lru);
 		area = zone->free_area + order;
 		area->nr_free--;
 		rmv_page_order(buddy);
-		page_idx &= buddy_idx;
+		page = page + (combined_idx - page_idx);
+		page_idx = combined_idx;
 		order++;
 	}
-	coalesced = base + page_idx;
-	set_page_order(coalesced, order);
-	list_add(&coalesced->lru, &zone->free_area[order].free_list);
+	set_page_order(page, order);
+	list_add(&page->lru, &zone->
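The index arithmetic described in the patch comment can be checked outside the kernel. The sketch below reimplements just the two index calculations on plain integers; the function names buddy_index, combined_index, and old_combined_index are hypothetical and are not the kernel's, and no struct page handling is attempted.

```c
/*
 * Index of the order-`order` buddy of the block that starts at page_idx:
 * flip the one bit that distinguishes the two halves of the parent block.
 */
static unsigned long buddy_index(unsigned long page_idx, unsigned int order)
{
        return page_idx ^ (1UL << order);
}

/*
 * Index of the merged block of order + 1: clear that same bit, which
 * yields the lower of the two buddies.
 */
static unsigned long combined_index(unsigned long page_idx, unsigned int order)
{
        return page_idx & ~(1UL << order);
}

/*
 * The calculation the patch replaces (page_idx &= buddy_idx), kept here
 * only to show that both forms give the same result.
 */
static unsigned long old_combined_index(unsigned long page_idx,
                                        unsigned int order)
{
        return page_idx & buddy_index(page_idx, order);
}
```

For the example in the comment, buddy_index(8, 1) is 10 and combined_index(10, 1) is 8. The old and new ways of computing the combined index agree because ANDing an index with its buddy clears exactly the bit in which the two differ.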
Re: linux-2.6.11-rc3: XFS internal error xfs_da_do_buf(1) at line 2176 of file fs/xfs/xfs_da_btree.c.
There are some corrections for my message... Sorry.

At Tue, 08 Feb 2005 12:53:29 +0900,
SATOH Fumiyasu wrote:
> Host3:
> --
> OS: Debian GNU/Linux testing version (sarge)
> Kernel: kernel-image-2.6.10-1-686-smp
>   (compiled by gcc version 3.3.5 (Debian 1:3.3.5-6))
> Filesystem: / (/dev/md0 (RAID1, /dev/hda1, /dev/hdd1))
> CPU: Intel(R) Xeon(TM) CPU 2.40GHz x 2 (SMP)

Filesystem: / (/dev/md0 (RAID1, /dev/sda1, /dev/sdb1))
CPU: Intel(R) Pentium(R) III CPU family 1133MHz x 2 (SMP)
SCSI-HBA: Adaptec AIC-7899P U160/m (rev 01)

--
-- Name: SATOH Fumiyasu
-- Home: http://www.sfo.jp (in Japanese only)
-- Mail: fumiya at net-thrust.com, samba.gr.jp, namazu.org or ...
Re: linux-2.6.11-rc3: XFS internal error xfs_da_do_buf(1) at line 2176 of file fs/xfs/xfs_da_btree.c.
At Mon, 07 Feb 2005 09:38:28 -0600,
Jeffrey E. Hundstad wrote:
> I'm sorry for this truncated report... but it's all I've got. If you
> need .config or system configuration, etc. let me know and I'll send'em
> ASAP. I don't believe this is hardware related; ide-smart shows all fine.
>
> From dmesg:
>
> xfs_da_do_buf: bno 8388608
> dir: inode 117526252
> Filesystem "hda4": XFS internal error xfs_da_do_buf(1) at line 2176 of
> file fs/xfs/xfs_da_btree.c. Caller 0xc01bda27

I've seen similar problems on Debian GNU/Linux testing (sarge) with kernel-image-2.6.8-1-686-smp and kernel-image-2.6.10-1-686-smp (kernel-image-* are Debian's Linux kernel binary packages).

I think this is NOT hardware related. These problems have occurred on three different machines.

Host1
-
OS: Debian GNU/Linux testing (sarge)
Kernel: kernel-image-2.6.8-1-686-smp
  (compiled by gcc version 3.3.5 (Debian 1:3.3.5-2))
Filesystem: / (/dev/md0 (RAID1, /dev/hda1, /dev/hdd1))
CPU: Intel(R) Xeon(TM) CPU 2.40GHz x 2 (SMP)

# xfs_info /
meta-data=/                      isize=256    agcount=8, agsize=244232 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=1953856, imaxpct=25
         =                       sunit=8      swidth=16 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=2560, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

Log: Not found in /var/log/*

Host2
-
OS: Debian GNU/Linux testing (sarge)
Kernel: kernel-image-2.6.10-1-686-smp
  (compiled by gcc version 3.3.5 (Debian 1:3.3.5-6))
Filesystem: / (/dev/md0 (RAID1, /dev/hda1, /dev/hdd1))
CPU: Intel(R) Xeon(TM) CPU 2.40GHz x 2 (SMP)

# xfs_info /
meta-data=/                      isize=256    agcount=8, agsize=244232 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=1953856, imaxpct=25
         =                       sunit=8      swidth=16 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=2560, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

Log1 from /var/log/kern.log*:
Jan 28 21:11:50 host2 kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c. Caller 0xf89d02a5
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+782770/5290900] xfs_free_ag_extent+0x471/0x7a0 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+787494/5290900] xfs_free_extent+0xe5/0x110 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+787494/5290900] xfs_free_extent+0xe5/0x110 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+1189885/5290900] kmem_zone_alloc+0x4c/0xa0 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+997143/5290900] xfs_efd_init+0x86/0x90 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+1142537/5290900] xfs_trans_get_efd+0x38/0x50 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+867408/5290900] xfs_bmap_finish+0x13f/0x1e0 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+1039636/5290900] xfs_itruncate_finish+0x233/0x460 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+1167530/5290900] xfs_inactive+0x509/0x570 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+1237072/5290900] vn_rele+0xff/0x120 [xfs]
Jan 28 21:11:50 host2 kernel: [__crc_pm_idle+1230441/5290900] linvfs_clear_inode+0x18/0x30 [xfs]
Jan 28 21:11:50 host2 kernel: [clear_inode+230/288] clear_inode+0xe6/0x120
Jan 28 21:11:50 host2 kernel: [generic_delete_inode+362/416] generic_delete_inode+0x16a/0x1a0
Jan 28 21:11:50 host2 kernel: [iput+99/144] iput+0x63/0x90
Jan 28 21:11:50 host2 kernel: [sys_unlink+275/320] sys_unlink+0x113/0x140
Jan 28 21:11:50 host2 kernel: [sysenter_past_esp+82/113] sysenter_past_esp+0x52/0x71
Jan 28 21:11:50 host2 kernel: xfs_force_shutdown(md0,0x8) called from line 4049 of file fs/xfs/xfs_bmap.c. Return address = 0xf8a3d2db
Jan 28 21:11:50 host2 kernel: Filesystem "md0": Corruption of in-memory data detected. Shutting down filesystem: md0
Jan 28 21:11:50 host2 kernel: Please umount the filesystem, and rectify the problem(s)

Log2 from /var/log/kern.log*:
Feb 3 14:35:18 host2 kernel: xfs_force_shutdown(md0,0x8) called from line 1091 of file fs/xfs/xfs_trans.c. Return address = 0xf8a8b23b
Feb 3 14:35:18 host2 kernel: Filesystem "md0": Corruption of in-memory data detected. Shutting down filesystem: md0
Feb 3 14:35:18 host2 kernel: Please umount the filesystem, and rectify the problem(s)

Host3:
--
OS: Debian GNU/Linux testing version (sarge)
Kernel: kernel-image-2.6.10-1-686-smp
  (compiled by gcc version 3.3.5 (Debian 1:3.3.5-6))
Filesystem: / (/dev/md0 (RAID1, /dev/hda1, /dev/hdd1))
CPU: Intel(R) Xeon(TM) CPU 2.40GHz x 2 (SMP)

# xfs_info /
meta-data=/                      isize=256
Question about sendfile
Hi,

I am trying to beat the I/O bottleneck in order to speed up bulk data transfers on high-speed networks. It seems that the sendfile() system call can help reduce CPU utilization and speed up data transfers, but I have one question about it.

Linux sendfile() requires that the input file descriptor not be a network socket. What are the reasons for this restriction? Sending from a socket to a file via zero copy would definitely be useful; it is one approach I am trying in order to improve performance.

Some discussions of Linux zero copy say the reason is that this direction is harder: sending from a socket to a file via zero copy needs support from the NIC. I do not understand this explanation. FreeBSD appears to have implemented bidirectional zero copy (http://people.freebsd.org/~ken/zero_copy/#Download), so why does Linux not support it? What would I need to do to lift the restriction that Linux enforces on sendfile()?

Any hints will be highly appreciated. Thanks.

Xiuduan Fang
University of Virginia, Computer Science Dept
Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)
On Mon, 07 Feb 2005 18:20:36 PST, Chris Wright said:
> * [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> > open("/tmp/sh-thd-1107848098", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL|O_LARGEFILE, 0600) = 3
>
> O_EXCL
>
> > Wow - if my /tmp was on the same partition, and I'd hard-linked that
> > file to /etc/passwd, it would be toast now if root had run it.
>
> So, in fact, it wouldn't ;-)

Well.. Yeah. bash gets it right, a lot of programs botch it. ;)
[PATCH] Makefiles are not built using a Fortran compiler
David Holland pointed out that Make has a lot of implicit suffix rules built in and you can disable them by setting ".SUFFIXES:". As an example, checking the debugging information shows we no longer try to compile anything from a '.f' suffix.

This turns out to be good for a 15% speedup on a build with nothing to do; down from 29.1 seconds to 24.7 seconds on my K6.

Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]>

Index: Makefile
===
RCS file: /var/cvs/linux-2.6/Makefile,v
retrieving revision 1.338
diff -u -p -r1.338 Makefile
--- Makefile	6 Feb 2005 06:43:49 -	1.338
+++ Makefile	8 Feb 2005 02:39:28 -
@@ -4,6 +4,8 @@ SUBLEVEL = 11
 EXTRAVERSION =-rc3-pa3
 NAME=Woozy Numbat
 
+.SUFFIXES:
+
 # *DOCUMENTATION*
 # To see a list of typical targets execute "make help"
 # More info can be located in ./README
Index: scripts/Makefile.build
===
RCS file: /var/cvs/linux-2.6/scripts/Makefile.build,v
retrieving revision 1.9
diff -u -p -r1.9 Makefile.build
--- scripts/Makefile.build	12 Jan 2005 20:18:19 -	1.9
+++ scripts/Makefile.build	8 Feb 2005 02:39:28 -
@@ -4,6 +4,8 @@
 
 src := $(obj)
 
+.SUFFIXES:
+
 .PHONY: __build
 __build:
 

--
"Next the statesmen will invent cheap lies, putting the blame upon the nation that is attacked, and every man will be glad of those conscience-soothing falsities, and will diligently study them, and refuse to examine any refutations of them; and thus he will by and by convince himself that the war is just, and will thank God for the better sleep he enjoys after this process of grotesque self-deception." -- Mark Twain
Re: Memory leak in 2.6.11-rc1? (also here)
On Mon, Feb 07, 2005 at 07:38:12AM -0800, Linus Torvalds wrote:
>
> Whee. You've got 5 _million_ bio's "active". Which account for about 750MB
> of your 860MB of slab usage.

Same situation here, at different rates on two different platforms, both running same kernel build. Both show steadily increasing biovec-1.

uglybox was previously running Ingo's 2.6.11-rc2-RT-V0.7.36-03, and was well over 3,000,000 bios after about a week of uptime. With only 512M of memory, it was pretty sluggish.

Interesting that the 4-disk RAID5 seems to be growing about 4 times as fast as the RAID1.

If there's anything else that could help, or patches you want me to try, just ask.

Details:
=
#1: Soyo KT600 Platinum, Athlon 2500+, 512MB
    2 SATA, 2 PATA (all on 8237)
    RAID1 and RAID5
    on-board tg3

>uname -a
Linux uglybox 2.6.11-rc3 #2 Thu Feb 3 16:19:44 EST 2005 i686 GNU/Linux

>uptime
 21:27:47 up 7:04, 4 users, load average: 1.06, 1.03, 1.02

>grep '^bio' /proc/slabinfo
biovec-(256)     256    256   3072    2    2 : tunables   24   12    0 : slabdata    128    128      0
biovec-128       256    260   1536    5    2 : tunables   24   12    0 : slabdata     52     52      0
biovec-64        256    260    768    5    1 : tunables   54   27    0 : slabdata     52     52      0
biovec-16        256    260    192   20    1 : tunables  120   60    0 : slabdata     13     13      0
biovec-4         256    305     64   61    1 : tunables  120   60    0 : slabdata      5      5      0
biovec-1       64547  64636     16  226    1 : tunables  120   60    0 : slabdata    286    286      0
bio            64551  64599     64   61    1 : tunables  120   60    0 : slabdata   1059   1059      0

>lsmod
Module                  Size  Used by
ppp_deflate             4928  2
zlib_deflate           21144  1 ppp_deflate
bsd_comp                5376  0
ppp_async               9280  1
crc_ccitt               1728  1 ppp_async
ppp_generic            21396  7 ppp_deflate,bsd_comp,ppp_async
slhc                    6720  1 ppp_generic
radeon                 76224  1
ipv6                  235456  27
pcspkr                  3300  0
tg3                    84932  0
ohci1394               31748  0
ieee1394               94196  1 ohci1394
snd_cmipci             30112  1
snd_pcm_oss            48480  0
snd_mixer_oss          17728  1 snd_pcm_oss
usbhid                 31168  0
snd_pcm                83528  2 snd_cmipci,snd_pcm_oss
snd_page_alloc          7620  1 snd_pcm
snd_opl3_lib            9472  1 snd_cmipci
snd_timer              21828  2 snd_pcm,snd_opl3_lib
snd_hwdep               7456  1 snd_opl3_lib
snd_mpu401_uart         6528  1 snd_cmipci
snd_rawmidi            20704  1 snd_mpu401_uart
snd_seq_device          7116  2 snd_opl3_lib,snd_rawmidi
snd                    48996  12 snd_cmipci,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_opl3_lib,snd_timer,snd_hwdep,snd_mpu401_uart,snd_rawmidi,snd_seq_device
soundcore               7648  1 snd
uhci_hcd               29968  0
ehci_hcd               29000  0
usbcore               106744  4 usbhid,uhci_hcd,ehci_hcd
dm_mod                 52796  0
it87                   23900  0
eeprom                  5776  0
lm90                   11044  0
i2c_sensor              2944  3 it87,eeprom,lm90
i2c_isa                 1728  0
i2c_viapro              6412  0
i2c_core               18512  6 it87,eeprom,lm90,i2c_sensor,i2c_isa,i2c_viapro

>lspci
:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge (rev 80)
:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
:00:07.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705 Gigabit Ethernet (rev 03)
:00:0d.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46)
:00:0e.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
:00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
:00:10.0 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.1 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [K8T800 South]
:00:13.0 RAID bus controller: Silicon Image, Inc. (formerly CMD Technology Inc) SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV200 QW [Radeon 7500]

>cat /proc/mdstat
Personalities : [raid0] [ra
Re: [PATCH] Re: msdos/vfat defaults are annoying
On 02/08/2005 09:23 AM, Horst von Brand wrote:
> Clemens Schwaighofer <[EMAIL PROTECTED]> said:
>
> [...]
>>but to be honest, most times I need vfat, and I actually haven't
>>encountered a time when I need msdos.
>
> But writing MSDOS on a VFAT filesystem is a sure way to screw it up, and
> AFAIU vice-versa.

well it doesn't screw it up if you write MS DOS on a VFAT, you just lose a lot of data.

I was kinda surprised when I came home and plugged in my USB stick to see just A3.CB instead of a nice long filename :)

--
[ Clemens Schwaighofer -=:~ ]
[ TBWA\ && TEQUILA\ Japan IT Group ]
[ 6-17-2 Ginza Chuo-ku, Tokyo 104-0061, JAPAN ]
[ Tel: +81-(0)3-3545-7703  Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.co.jp  http://www.tbwajapan.co.jp ]
Re: [PATCH] Re: msdos/vfat defaults are annoying
Clemens Schwaighofer <[EMAIL PROTECTED]> said:

[...]

> but to be honest, most times I need vfat, and I actually haven't
> encountered a time when I need msdos.

But writing MSDOS on a VFAT filesystem is a sure way to screw it up, and AFAIU vice-versa.
--
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica              Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria       +56 32 654239
Casilla 110-V, Valparaiso, Chile         Fax:  +56 32 797513
Re: Please open sysfs symbols to proprietary modules
"Randy.Dunlap" <[EMAIL PROTECTED]> said:
> Chris Friesen wrote:

[...]

> > If you look at the big chip manufacturers (TI, Maxim, Analog Devices,
> > etc.) they publish specs on everything. It would be nice if others did
> > the same.

> One of the arguments that I have heard is fairly old and debatable as
> well. This was the subject of a panel discussion at LWE in 2000 or
> 2001, chaired by journalist Nicholas Petreley. The panel was composed
> of vendors from (mostly) audio devices IIRC, but I'm not sure.

A friend of mine got to sign an NDA for access to the official specs to a device. Turned out to be some handwritten sheets, scribbled over... Shame might have something to do with it too ;-)
--
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica              Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria       +56 32 654239
Casilla 110-V, Valparaiso, Chile         Fax:  +56 32 797513
Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)
* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> open("/tmp/sh-thd-1107848098", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL|O_LARGEFILE, 0600) = 3

O_EXCL

> Wow - if my /tmp was on the same partition, and I'd hard-linked that
> file to /etc/passwd, it would be toast now if root had run it.

So, in fact, it wouldn't ;-)

thanks,
-chris
--
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net
Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)
On Tue, 08 Feb 2005 01:48:40 GMT, David Wagner said:

> How would /etc/passwd get clobbered? Are you thinking that a tmp
> cleaner run by cron might delete /tmp/whatever (i.e., delete the hardlink
> you created above)? But deleting /tmp/whatever is safe; it doesn't affect
> /etc/passwd. I'm guessing I'm probably missing something.

The attack is to hardlink some tempfile name to some file you want over-written. This usually involves just a little bit of work, such as recognizing that a given root cronjob uses an unsafe predictable filename in /tmp (look at the Bugtraq or Full-Disclosure archives, there's plenty). Then you set a little program that sleep()s till a few seconds before the cronjob runs, does a getpid(), and then sprays hardlinks into the next 15 or 20 things that mktemp() will generate...

Consider how bash implements "here" scripts:

#!/bin/bash
echo << EOF
some trash
EOF

Now let's look at the strace (snipped for brevity..)

statfs("/tmp", {f_type="EXT2_SUPER_MAGIC", f_bsize=1024, f_blocks=253871, f_bfree=213773, f_bavail=200666, f_files=65536, f_ffree=65445, f_fsid={0, 0}, f_namelen=255, f_frsize=1024}) = 0
time(NULL)                              = 1107828098
open("/tmp/sh-thd-1107848098", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL|O_LARGEFILE, 0600) = 3
dup(3)                                  = 4
fcntl64(4, F_GETFL)                     = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fstat64(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d71000
_llseek(4, 0, [0], SEEK_CUR)            = 0
write(4, "some trash\n", 11)            = 11
close(4)                                = 0
munmap(0xb7d71000, 4096)                = 0
open("/tmp/sh-thd-1107848098", O_RDONLY|O_LARGEFILE) = 4
close(3)                                = 0
unlink("/tmp/sh-thd-1107848098")        = 0
fcntl64(0, F_GETFD)                     = 0
fcntl64(0, F_DUPFD, 10)                 = 10
fcntl64(0, F_GETFD)                     = 0
fcntl64(10, F_SETFD, FD_CLOEXEC)        = 0
dup2(4, 0)                              = 0
close(4)                                = 0

Wow - if my /tmp was on the same partition, and I'd hard-linked that file to /etc/passwd, it would be toast now if root had run it.

You usually can't control what gets written - but often it's sufficient for the attacker to simply get a file clobbered.
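The difference between deleting a planted hard link (harmless) and writing through it (destructive) can be reproduced with a short sketch. The helper names file_size and careless_overwrite are hypothetical; the second one stands in for any program that opens a predictable temp name without O_EXCL.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Size of the file at `path`, or -1 if it cannot be stat()ed. */
static off_t file_size(const char *path)
{
        struct stat st;

        return stat(path, &st) == 0 ? st.st_size : -1;
}

/*
 * What a careless program does with a predictable temp name: open it for
 * writing with O_TRUNC but without O_EXCL.  If `path` is a hard link to
 * another file, both names refer to the same inode, so the other file is
 * truncated and overwritten as well.
 */
static int careless_overwrite(const char *path, const char *text, size_t len)
{
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

        if (fd < 0)
                return -1;
        if (write(fd, text, len) != (ssize_t)len) {
                close(fd);
                return -1;
        }
        return close(fd);
}
```

With a victim file and a hard link to it, unlink() on the link leaves file_size() of the victim unchanged, which matches the observation that a tmp cleaner is harmless. Calling careless_overwrite() on the link instead shrinks the victim to the length of the new text.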
Re: [PATCH] Filesystem linking protections
Chris Wright wrote:
> * John Richard Moser ([EMAIL PROTECTED]) wrote:
>
>>Yes, mkdtemp() and mkstemp().
>>
>>Of course we can't always rely on programmers to get it right, so the
>>idea here is to make sure we ask broken code to behave nicely, and stab
>>it in the face if it doesn't. Please try to examine this in that scope.
>
>
> It's fine for hardened distro. But still inappropriate for mainline.
>

Perhaps in mainline as an option? The [*] notations next to things are really nice, they let you turn kernel stuff on and off :)

It's appropriate for mainline to support added security isn't it? I think following the path of supporting-but-not-forcing is the best route, because it encourages people to account for systems which may take advantage of such options, and thus leads to a software base in which it's quite sane to actually enable those options globally.

That's just how I think though.

> thanks,
> -chris

--
All content of all messages exchanged herein are left in the Public Domain, unless otherwise explicitly stated.
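For programs that want to get it right, a minimal sketch of the mkstemp() pattern mentioned above looks like this. The helper name scratch_file and the /tmp/scratch- template are hypothetical; mkstemp() itself chooses an unused name and creates the file exclusively with mode 0600.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Write `len` bytes of `data` to a fresh temporary file and return a
 * descriptor positioned at the start of the file, or -1 on failure.
 * The name is unlinked right away, so nothing predictable is left in
 * /tmp for another user to race against.
 */
static int scratch_file(const void *data, size_t len)
{
        char path[] = "/tmp/scratch-XXXXXX";    /* XXXXXX is filled in */
        int fd = mkstemp(path);

        if (fd < 0)
                return -1;
        unlink(path);   /* the open descriptor keeps the file alive */
        if (write(fd, data, len) != (ssize_t)len ||
            lseek(fd, 0, SEEK_SET) < 0) {
                close(fd);
                return -1;
        }
        return fd;
}
```

The returned descriptor can be read back or handed to a child process as standard input, much as the bash here-document code in this thread does with its own temporary file.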
Re: kernel 2.6.9 failure
Hi,

On Mon, 2005-02-07 at 16:51 -0800, [EMAIL PROTECTED] wrote:
[...]
> On a K6-2 box the 2.6.9 kernel starts to load : "Loading" then the PC
> resets.
> The kernel compiled and everything installed OK. Lilo is OK. I've tried four
> times different configs with the same result. Box resets. My 2.4.28 kernel
> works OK.
> I've tried rm'ing and re-unpacking the 2.6.9 source and starting afresh. Box
[...]

Is there any special reason why you don't use 2.6.10? I think it would be a
good idea to give it a try!

Greets, Lars Strojny
--
name: Lars Strojny          web: http://strojny.net
street: Yorckstrasse 22     blog: http://usrportage.de
city: D-71636 Ludwigsburg   mail/jabber: [EMAIL PROTECTED]
f-print: 6663 1055 543E 3106 3FD3 4F40 AC74 CD1F C327 14BD
Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)
>For those systems that have everything on one big partition, you can often
>do stuff like:
>
>ln /etc/passwd /tmp/
>
>and wait for /etc/passwd to get clobbered by a cron job run by root...

How would /etc/passwd get clobbered? Are you thinking that a tmp
cleaner run by cron might delete /tmp/whatever (i.e., delete the hardlink
you created above)? But deleting /tmp/whatever is safe; it doesn't affect
/etc/passwd. I'm guessing I'm probably missing something.
Re: prezeroing V6 [2/3]: ScrubD
Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> On Mon, 7 Feb 2005, Andrew Morton wrote:
>
> > > Look at the early posts. I plan to put that up on the web. I have some
> > > stats attached to the end of this message from an earlier post.
> >
> > But that's a patch-specific microbenchmark, isn't it? Has this work been
> > benchmarked against real-world stuff?
>
> No its a page fault benchmark. Dave Miller has done some kernel compiles
> and I have some benchmarks here that I never posted because they do not
> show any material change as far as I can see. I will be posting that soon
> when this is complete (also need to do the same for the atomic page fault
> ops and the prefaulting patch).

OK, thanks. That's important work. After all, this patch is a performance
optimisation.

> > > > Should we be managing the kernel threads with the kthread() API?
> > >
> > > What would you like to manage?
> >
> > Startup, perhaps binding the threads to their cpus too.
>
> That is all already controllable in the same way as the swapper.

kswapd uses an old API.

> Each
> memory node is bound to a set of cpus. This may be controlled by the
> NUMA node configuration. F.e. for nodes without cpus.

kthread_bind() should be able to do this. From a quick read it appears to
have shortcomings in this department (it expects to be bound to a single
CPU).

We should fix kthread_bind() so that it can accommodate the kscrub/kswapd
requirement. That's one of the _reasons_ for using the provided
infrastructure rather than open-coding around it.
Re: [IPSEC] Move dst->child loop from dst_ifdown to xfrm_dst_ifdown
On Tue, Feb 08, 2005 at 12:29:29PM +1100, herbert wrote: > > This one moves the dst->child processing from dst_ifdown into > xfrm_dst_ifdown. This patch adds a net_device argument to ifdown. After all, it's a bit silly to notify someone of an ifdown event without telling them what which device it was for :) Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt = include/net/dst.h 1.25 vs edited = --- 1.25/include/net/dst.h 2005-02-06 14:23:59 +11:00 +++ edited/include/net/dst.h2005-02-08 12:14:10 +11:00 @@ -89,7 +89,8 @@ int (*gc)(void); struct dst_entry * (*check)(struct dst_entry *, __u32 cookie); void(*destroy)(struct dst_entry *); - void(*ifdown)(struct dst_entry *, int how); + void(*ifdown)(struct dst_entry *, + struct net_device *dev, int how); struct dst_entry * (*negative_advice)(struct dst_entry *); void(*link_failure)(struct sk_buff *); void(*update_pmtu)(struct dst_entry *dst, u32 mtu); = net/core/dst.c 1.27 vs edited = --- 1.27/net/core/dst.c 2005-02-08 12:12:21 +11:00 +++ edited/net/core/dst.c 2005-02-08 12:15:03 +11:00 @@ -220,12 +220,14 @@ * * Commented and originally written by Alexey. 
*/ -static inline void dst_ifdown(struct dst_entry *dst, int unregister) +static inline void dst_ifdown(struct dst_entry *dst, struct net_device *dev, + int unregister) { - struct net_device *dev = dst->dev; - if (dst->ops->ifdown) - dst->ops->ifdown(dst, unregister); + dst->ops->ifdown(dst, dev, unregister); + + if (dev != dst->dev) + return; if (!unregister) { dst->input = dst_discard_in; @@ -252,8 +254,7 @@ case NETDEV_DOWN: spin_lock_bh(&dst_lock); for (dst = dst_garbage_list; dst; dst = dst->next) { - if (dst->dev == dev) - dst_ifdown(dst, event != NETDEV_DOWN); + dst_ifdown(dst, dev, event != NETDEV_DOWN); } spin_unlock_bh(&dst_lock); break; = net/ipv4/route.c 1.101 vs edited = --- 1.101/net/ipv4/route.c 2005-02-03 07:43:48 +11:00 +++ edited/net/ipv4/route.c 2005-02-08 12:14:11 +11:00 @@ -138,7 +138,8 @@ static struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie); static void ipv4_dst_destroy(struct dst_entry *dst); -static void ipv4_dst_ifdown(struct dst_entry *dst, int how); +static void ipv4_dst_ifdown(struct dst_entry *dst, +struct net_device *dev, int how); static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst); static void ipv4_link_failure(struct sk_buff *skb); static void ip_rt_update_pmtu(struct dst_entry *dst, u32 mtu); @@ -1342,11 +1343,12 @@ } } -static void ipv4_dst_ifdown(struct dst_entry *dst, int how) +static void ipv4_dst_ifdown(struct dst_entry *dst, struct net_device *dev, + int how) { struct rtable *rt = (struct rtable *) dst; struct in_device *idev = rt->idev; - if (idev && idev->dev != &loopback_dev) { + if (dev != &loopback_dev && idev && idev->dev == dev) { struct in_device *loopback_idev = in_dev_get(&loopback_dev); if (loopback_idev) { rt->idev = loopback_idev; = net/ipv6/route.c 1.105 vs edited = --- 1.105/net/ipv6/route.c 2005-01-15 19:44:48 +11:00 +++ edited/net/ipv6/route.c 2005-02-08 12:14:11 +11:00 @@ -84,7 +84,8 @@ static struct dst_entry*ip6_dst_check(struct dst_entry *dst, u32 cookie); static 
struct dst_entry *ip6_negative_advice(struct dst_entry *); static voidip6_dst_destroy(struct dst_entry *); -static voidip6_dst_ifdown(struct dst_entry *, int how); +static voidip6_dst_ifdown(struct dst_entry *, + struct net_device *dev, int how); static int ip6_dst_gc(void); static int ip6_pkt_discard(struct sk_buff *skb); @@ -153,12 +154,13 @@ } } -static void ip6_dst_ifdown(struct dst_entry *dst, int how) +static void ip6_dst_ifdown(struct dst_entry *dst, struct net_device *dev, + int how) { struct rt6_info *rt = (struct rt6_info *)dst; struct inet6_dev *idev = rt->rt6i_idev; - if (idev != NULL && idev->dev != &loopback_dev) { + if (dev != &loopback_dev && idev != NULL &&
[PATCH] resend: compat ioctl for submiting URB
Here is a resend of the patch to support the compat URB ioctls on 64-bit
systems. This version already incorporates the feedback I got from the
list; I have not received any new input since.

Change Log:
- Let usbdevfs handle the 32-bit URB ioctls directly, specifically
  USBDEVFS_SUBMITURB32, USBDEVFS_REAPURB32 and USBDEVFS_REAPURBNDELAY32.
  These asynchronous ioctls are too complicated for the compat layer to
  handle.

Thanks

Chris

Index: linux-2.5/include/linux/compat_ioctl.h
===
--- linux-2.5.orig/include/linux/compat_ioctl.h 2005-01-26 17:23:57.0 -0800
+++ linux-2.5/include/linux/compat_ioctl.h 2005-02-07 15:10:54.0 -0800
@@ -692,6 +692,9 @@
 COMPATIBLE_IOCTL(USBDEVFS_CONNECTINFO)
 COMPATIBLE_IOCTL(USBDEVFS_HUB_PORTINFO)
 COMPATIBLE_IOCTL(USBDEVFS_RESET)
+COMPATIBLE_IOCTL(USBDEVFS_SUBMITURB32)
+COMPATIBLE_IOCTL(USBDEVFS_REAPURB32)
+COMPATIBLE_IOCTL(USBDEVFS_REAPURBNDELAY32)
 COMPATIBLE_IOCTL(USBDEVFS_CLEAR_HALT)
 /* MTD */
 COMPATIBLE_IOCTL(MEMGETINFO)
Index: linux-2.5/include/linux/usbdevice_fs.h
===
--- linux-2.5.orig/include/linux/usbdevice_fs.h 2005-01-25 12:08:02.0 -0800
+++ linux-2.5/include/linux/usbdevice_fs.h 2005-02-07 15:10:54.0 -0800
@@ -32,6 +32,7 @@
 #define _LINUX_USBDEVICE_FS_H
 #include
+#include
 /* - */
@@ -123,6 +124,22 @@
 char port [127];/* e.g.
port 3 connects to device 27 */ }; +struct usbdevfs_urb32 { + unsigned char type; + unsigned char endpoint; + compat_int_t status; + compat_uint_t flags; + compat_caddr_t buffer; + compat_int_t buffer_length; + compat_int_t actual_length; + compat_int_t start_frame; + compat_int_t number_of_packets; + compat_int_t error_count; + compat_uint_t signr; + compat_caddr_t usercontext; /* unused */ + struct usbdevfs_iso_packet_desc iso_frame_desc[0]; +}; + #define USBDEVFS_CONTROL _IOWR('U', 0, struct usbdevfs_ctrltransfer) #define USBDEVFS_BULK _IOWR('U', 2, struct usbdevfs_bulktransfer) #define USBDEVFS_RESETEP _IOR('U', 3, unsigned int) @@ -130,9 +147,12 @@ #define USBDEVFS_SETCONFIGURATION _IOR('U', 5, unsigned int) #define USBDEVFS_GETDRIVER _IOW('U', 8, struct usbdevfs_getdriver) #define USBDEVFS_SUBMITURB _IOR('U', 10, struct usbdevfs_urb) +#define USBDEVFS_SUBMITURB32 _IOR('U', 10, struct usbdevfs_urb32) #define USBDEVFS_DISCARDURB_IO('U', 11) #define USBDEVFS_REAPURB _IOW('U', 12, void *) +#define USBDEVFS_REAPURB32 _IOW('U', 12, u32) #define USBDEVFS_REAPURBNDELAY _IOW('U', 13, void *) +#define USBDEVFS_REAPURBNDELAY32 _IOW('U', 13, u32) #define USBDEVFS_DISCSIGNAL_IOR('U', 14, struct usbdevfs_disconnectsignal) #define USBDEVFS_CLAIMINTERFACE_IOR('U', 15, unsigned int) #define USBDEVFS_RELEASEINTERFACE _IOR('U', 16, unsigned int) @@ -143,5 +163,4 @@ #define USBDEVFS_CLEAR_HALT_IOR('U', 21, unsigned int) #define USBDEVFS_DISCONNECT_IO('U', 22) #define USBDEVFS_CONNECT _IO('U', 23) - #endif /* _LINUX_USBDEVICE_FS_H */ Index: linux-2.5/fs/compat_ioctl.c === --- linux-2.5.orig/fs/compat_ioctl.c2005-01-25 12:08:12.0 -0800 +++ linux-2.5/fs/compat_ioctl.c 2005-02-07 15:18:38.0 -0800 @@ -2570,229 +2570,11 @@ return sys_ioctl(fd, USBDEVFS_BULK, (unsigned long)p); } -/* This needs more work before we can enable it. 
Unfortunately - * because of the fancy asynchronous way URB status/error is written - * back to userspace, we'll need to fiddle with USB devio internals - * and/or reimplement entirely the frontend of it ourselves. -DaveM - * - * The issue is: - * - * When an URB is submitted via usbdevicefs it is put onto an - * asynchronous queue. When the URB completes, it may be reaped - * via another ioctl. During this reaping the status is written - * back to userspace along with the length of the transfer. - * - * We must translate into 64-bit kernel types so we pass in a kernel - * space copy of the usbdevfs_urb structure. This would mean that we - * must do something to deal with the async entry reaping. First we - * have to deal somehow with this transitory memory we've allocated. - * This is problematic since there are many call sites from which the - * async entries can be destroyed (and thus when we'd need to free up - * this kernel memory). One of which is the close() op of usbdevicefs. - * To handle that we'd need to make our own file_operations struct which - * overrides usbdevicefs's release op with our own which runs usbdevicefs's - * real release op then frees up the kernel memory. - * - *
[IPSEC] Move dst->child loop from dst_ifdown to xfrm_dst_ifdown
On Sun, Feb 06, 2005 at 05:51:17PM +1100, herbert wrote: > > The idea is to move the check into dst->ops->ifdown. By definition > ipv6_dst_ifdown will only see rt6_info entries. So dst_dev_event > will become Here are the patches to do this. Do they look sane? This one moves the dst->child processing from dst_ifdown into xfrm_dst_ifdown. Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt = net/core/dst.c 1.26 vs edited = --- 1.26/net/core/dst.c 2005-02-06 14:23:59 +11:00 +++ edited/net/core/dst.c 2005-02-08 12:11:39 +11:00 @@ -220,31 +220,26 @@ * * Commented and originally written by Alexey. */ -static void dst_ifdown(struct dst_entry *dst, int unregister) +static inline void dst_ifdown(struct dst_entry *dst, int unregister) { struct net_device *dev = dst->dev; + if (dst->ops->ifdown) + dst->ops->ifdown(dst, unregister); + if (!unregister) { dst->input = dst_discard_in; dst->output = dst_discard_out; - } - - do { - if (unregister) { - dst->dev = &loopback_dev; - dev_hold(&loopback_dev); + } else { + dst->dev = &loopback_dev; + dev_hold(&loopback_dev); + dev_put(dev); + if (dst->neighbour && dst->neighbour->dev == dev) { + dst->neighbour->dev = &loopback_dev; dev_put(dev); - if (dst->neighbour && dst->neighbour->dev == dev) { - dst->neighbour->dev = &loopback_dev; - dev_put(dev); - dev_hold(&loopback_dev); - } + dev_hold(&loopback_dev); } - - if (dst->ops->ifdown) - dst->ops->ifdown(dst, unregister); - } while ((dst = dst->child) && dst->flags & DST_NOHASH && -dst->dev == dev); + } } static int dst_dev_event(struct notifier_block *this, unsigned long event, void *ptr) = net/xfrm/xfrm_policy.c 1.63 vs edited = --- 1.63/net/xfrm/xfrm_policy.c 2005-01-19 07:08:19 +11:00 +++ edited/net/xfrm/xfrm_policy.c 2005-02-08 12:10:47 +11:00 @@ -1027,6 +1027,20 @@ dst->xfrm = NULL; 
} +static void xfrm_dst_ifdown(struct dst_entry *dst, int unregister) +{ + struct net_device *dev = dst->dev; + + if (!unregister) + return; + + while ((dst = dst->child) && dst->xfrm && dst->dev == dev) { + dst->dev = &loopback_dev; + dev_hold(&loopback_dev); + dev_put(dev); + } +} + static void xfrm_link_failure(struct sk_buff *skb) { /* Impossible. Such dst must be popped before reaches point of failure. */ @@ -1150,6 +1164,8 @@ dst_ops->check = xfrm_dst_check; if (likely(dst_ops->destroy == NULL)) dst_ops->destroy = xfrm_dst_destroy; + if (likely(dst_ops->ifdown == NULL)) + dst_ops->ifdown = xfrm_dst_ifdown; if (likely(dst_ops->negative_advice == NULL)) dst_ops->negative_advice = xfrm_negative_advice; if (likely(dst_ops->link_failure == NULL)) @@ -1181,6 +1197,7 @@ dst_ops->kmem_cachep = NULL; dst_ops->check = NULL; dst_ops->destroy = NULL; + dst_ops->ifdown = NULL; dst_ops->negative_advice = NULL; dst_ops->link_failure = NULL; dst_ops->get_mss = NULL;
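The child-walking loop that this patch moves into xfrm_dst_ifdown can be mimicked in user space. What follows is only a sketch under stated assumptions: `struct dev`, `struct dst`, the plain integer refcount and `toy_ifdown` are toy stand-ins invented here, not the kernel's structures, and the `is_xfrm` flag merely stands in for the dst->xfrm test.

```c
#include <stddef.h>

/* Toy stand-ins, NOT the kernel structures. */
struct dev { int refcnt; };
struct dst {
	struct dev *dev;
	struct dst *child;
	int is_xfrm;		/* stands in for dst->xfrm != NULL */
};

static void dev_hold(struct dev *d) { d->refcnt++; }
static void dev_put(struct dev *d)  { d->refcnt--; }

/* Walk the children of `dst` while they are xfrm entries still bound to
 * the same device as the parent, and rebind each one to `loopback`,
 * moving one reference per entry. The parent itself is left alone here;
 * in the patch it is dst_ifdown that deals with it. Returns the number
 * of children rebound. */
static int toy_ifdown(struct dst *dst, struct dev *loopback, int unregister)
{
	struct dev *dev = dst->dev;
	int n = 0;

	if (!unregister)
		return 0;

	while ((dst = dst->child) && dst->is_xfrm && dst->dev == dev) {
		dst->dev = loopback;
		dev_hold(loopback);
		dev_put(dev);
		n++;
	}
	return n;
}
```

The point of the sketch is the reference bookkeeping: every child that is rebound takes one reference on the loopback device and drops one on the device going away, and the walk stops at the first child that is not an xfrm entry or is bound elsewhere.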
Re: prezeroing V6 [2/3]: ScrubD
On Mon, 7 Feb 2005, Andrew Morton wrote:

> > Look at the early posts. I plan to put that up on the web. I have some
> > stats attached to the end of this message from an earlier post.
>
> But that's a patch-specific microbenchmark, isn't it? Has this work been
> benchmarked against real-world stuff?

No its a page fault benchmark. Dave Miller has done some kernel compiles
and I have some benchmarks here that I never posted because they do not
show any material change as far as I can see. I will be posting that soon
when this is complete (also need to do the same for the atomic page fault
ops and the prefaulting patch).

> > > Should we be managing the kernel threads with the kthread() API?
> >
> > What would you like to manage?
>
> Startup, perhaps binding the threads to their cpus too.

That is all already controllable in the same way as the swapper. Each
memory node is bound to a set of cpus. This may be controlled by the
NUMA node configuration. F.e. for nodes without cpus.
Re: out-of-line x86 "put_user()" implementation
On Mon, 7 Feb 2005, Ingo Molnar wrote: > > boots fine and shrinks the image size quite noticeably: > > [Nr] Name TypeAddr OffSize > [ 1] .textPROGBITSc010 001000 2771a9 [vmlinux-orig] > [ 1] .textPROGBITSc010 001000 2742dd [vmlinux-patched] > > that's 11980 bytes off a 2585001 bytes .text, a 0.5% size reduction. > This patch we want ... Goodie. Here's a slightly more recent version, where I cleaned up the assembly code (no need to save %ecx if we just update %ebx instead, which makes the code a bit more readable too - and doing it this way should hopefully make it easier for an out-of-order CPU to start the memops earlier too. Who knows..) I'm not going to put this into 2.6.11, since I worry about compiler interactions, but the more people who test it anyway, the better. Linus - # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2005/02/07 08:14:28-08:00 [EMAIL PROTECTED] # x86: make "put_user()" be out-of-line # # It's really too big to be inlined. # # Ingo tests and reports: this shrinks his kernel text size by # about 12kB (roughly 0.5%) # # arch/i386/lib/putuser.S # 2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +86 -0 # # include/asm-i386/uaccess.h # 2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +27 -3 # x86: make "put_user()" be out-of-line # # arch/i386/lib/putuser.S # 2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +0 -0 # BitKeeper file /home/torvalds/v2.6/linux/arch/i386/lib/putuser.S # # arch/i386/lib/Makefile # 2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +1 -1 # x86: make "put_user()" be out-of-line # # arch/i386/kernel/i386_ksyms.c # 2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +5 -0 # x86: make "put_user()" be out-of-line # diff -Nru a/arch/i386/kernel/i386_ksyms.c b/arch/i386/kernel/i386_ksyms.c --- a/arch/i386/kernel/i386_ksyms.c 2005-02-07 17:16:32 -08:00 +++ b/arch/i386/kernel/i386_ksyms.c 2005-02-07 17:16:32 -08:00 @@ -97,6 +97,11 @@ EXPORT_SYMBOL(__get_user_2); EXPORT_SYMBOL(__get_user_4); +EXPORT_SYMBOL(__put_user_1); 
+EXPORT_SYMBOL(__put_user_2); +EXPORT_SYMBOL(__put_user_4); +EXPORT_SYMBOL(__put_user_8); + EXPORT_SYMBOL(strpbrk); EXPORT_SYMBOL(strstr); diff -Nru a/arch/i386/lib/Makefile b/arch/i386/lib/Makefile --- a/arch/i386/lib/Makefile2005-02-07 17:16:32 -08:00 +++ b/arch/i386/lib/Makefile2005-02-07 17:16:32 -08:00 @@ -3,7 +3,7 @@ # -lib-y = checksum.o delay.o usercopy.o getuser.o memcpy.o strstr.o \ +lib-y = checksum.o delay.o usercopy.o getuser.o putuser.o memcpy.o strstr.o \ bitops.o lib-$(CONFIG_X86_USE_3DNOW) += mmx.o diff -Nru a/arch/i386/lib/putuser.S b/arch/i386/lib/putuser.S --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/arch/i386/lib/putuser.S 2005-02-07 17:16:32 -08:00 @@ -0,0 +1,86 @@ +/* + * __put_user functions. + * + * (C) Copyright 2005 Linus Torvalds + * + * These functions have a non-standard call interface + * to make them more efficient, especially as they + * return an error value in addition to the "real" + * return value. + */ +#include + + +/* + * __put_user_X + * + * Inputs: %eax[:%edx] contains the data + * %ecx contains the address + * + * Outputs:%eax is error code (0 or -EFAULT) + * + * These functions should not modify any other registers, + * as they get called from within inline assembly. 
+ */ + +#define ENTER pushl %ebx ; GET_THREAD_INFO(%ebx) +#define EXIT popl %ebx ; ret + +.text +.align 4 +.globl __put_user_1 +__put_user_1: + ENTER + cmpl TI_addr_limit(%ebx),%ecx + jae bad_put_user +1: movb %al,(%ecx) + xorl %eax,%eax + EXIT + +.align 4 +.globl __put_user_2 +__put_user_2: + ENTER + movl TI_addr_limit(%ebx),%ebx + subl $1,%ebx + cmpl %ebx,%ecx + jae bad_put_user +2: movw %ax,(%ecx) + xorl %eax,%eax + EXIT + +.align 4 +.globl __put_user_4 +__put_user_4: + ENTER + movl TI_addr_limit(%ebx),%ebx + subl $3,%ebx + cmpl %ebx,%ecx + jae bad_put_user +3: movl %eax,(%ecx) + xorl %eax,%eax + EXIT + +.align 4 +.globl __put_user_8 +__put_user_8: + ENTER + movl TI_addr_limit(%ebx),%ebx + subl $7,%ebx + cmpl %ebx,%ecx + jae bad_put_user +3: movl %eax,(%ecx) +4: movl %edx,4(%ecx) + xorl %eax,%eax + EXIT + +bad_put_user: + movl $-14,%eax + EXIT + +.section __ex_table,"a" + .long 1b,bad_put_user + .long 2b,bad_put_user + .long 3b,bad_put_user + .long 4b,bad_put_user +.previous diff -Nru a/include/asm-i386/uaccess.h b/include/asm-i386/uaccess.h --- a/include/asm-i386/uaccess.h2005-02-07 17:16:32 -08:00 +++ b/include/asm-i386/uaccess.h2005-02-07 17:16:32 -08:00 @@ -185,6 +185,21 @@ extern void __put_user_bad(void); +/* + * Strange magic calling convention: pointer in %ecx, + * value in %eax(:%edx), return value in %eax, no clobbers. + */
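The range checks in the __put_user_N stubs come down to a small piece of unsigned arithmetic. Here is a rough user-space sketch, assuming 32-bit addresses and a caller-supplied limit; `toy_put_user_check` and `TOY_EFAULT` are hypothetical names, and the real code additionally relies on the exception table to catch faults on addresses below the limit.

```c
#include <stdint.h>

#define TOY_EFAULT 14	/* the assembly loads $-14 on the error path */

/* Mirrors the compare done by the stubs: a store of `size` bytes
 * (1, 2, 4 or 8) at `addr` is rejected when
 *     addr >= limit - (size - 1)
 * in unsigned 32-bit arithmetic. Returns 0 if the store would be
 * attempted, -TOY_EFAULT otherwise. */
static int toy_put_user_check(uint32_t addr, uint32_t limit, unsigned size)
{
	uint32_t bound = limit - (uint32_t)(size - 1);

	return addr >= bound ? -TOY_EFAULT : 0;
}
```

With an assumed limit of 0xC0000000, a 4-byte store at 0xBFFFFFFC passes because its last byte is still below the limit, while one at 0xBFFFFFFD is rejected because its last byte would land on the limit itself.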
Re: [PATCH] hot-swapping support for PSX controllers
Eric Piel wrote:
> Note that this is a re-send of a previous patch now that the patch of
> Peter (which had to be applied before this one) has been integrated in
> the vanilla kernel. It's Peter's version modified to apply cleanly
> against 2.6.11-rc3 plus a fix in the comment.

I was actually just about to re-post this patch. I've tested it and it
works for me, plus it saves a few bytes of kernel memory fixing the array
sizes.

-Peter

--
Fixes hotplug support for PSX controllers and some mis-sized arrays.

Signed-off-by: Eric Piel <[EMAIL PROTECTED]>
Signed-off-by: Peter Nelson <[EMAIL PROTECTED]>
Re: prezeroing V6 [2/3]: ScrubD
Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> > What were the benchmarking results for this work? I think you had some,
> > but this is pretty vital info, so it should be retained in the changelogs.
>
> Look at the early posts. I plan to put that up on the web. I have some
> stats attached to the end of this message from an earlier post.

But that's a patch-specific microbenchmark, isn't it? Has this work been
benchmarked against real-world stuff?

> > Should we be managing the kernel threads with the kthread() API?
>
> What would you like to manage?

Startup, perhaps binding the threads to their cpus too.
Re: 2.6.11-rc3: Kylix application no longer works?
On Mon, 7 Feb 2005, Andrew Morton wrote:

> Daniel Drake <[EMAIL PROTECTED]> wrote:
>
>>> # fs/binfmt_elf.c
>>> # 2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19
>>> # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
>>> #
>>
>> I think so. For a short period we applied this patch to the Gentoo 2.6.10
>> kernel...
>>
>> http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch
>>
>> ...but removed it once users complained it stopped kylix binaries from
>> running.
>
> Bah. That's what happens when you fix stuff.
>
> What's kylix? The Borland C++ builder thing?

Rather Delphi (== Object Pascal) thing.

> How should one set about reproducing this problem?

IIRC, Some minimal "personal" version can be downloaded from borland.com.

Grzegorz Kulewski
kernel 2.6.9 failure
Hi all,

On a K6-2 box the 2.6.9 kernel starts to load : "Loading" then the PC
resets.

The kernel compiled and everything installed OK. Lilo is OK. I've tried four
times different configs with the same result. Box resets. My 2.4.28 kernel
works OK.

I've tried rm'ing and re-unpacking the 2.6.9 source and starting afresh. Box
resets.

The only clue, if that's what it is, is when I tried to upgrade
module-init-tools and quota-tools I got an error, can't find
../asm-generic/errno.h. True enough, there's no ../asm-generic dir in the
includes. The closest is ../mach-generic. And there *is* a errno.h in the
include files. So I just made an ../asm-generic dir and put a copy of
errno.h in it. No luck.

-Gil
Re: prezeroing V6 [2/3]: ScrubD
On Mon, 7 Feb 2005, Andrew Morton wrote:

> Christoph Lameter <[EMAIL PROTECTED]> wrote:
> >
> > Adds management of ZEROED and NOT_ZEROED pages and a background daemon
> > called scrubd.
>
> What were the benchmarking results for this work? I think you had some,
> but this is pretty vital info, so it should be retained in the changelogs.

Look at the early posts. I plan to put that up on the web. I have some
stats attached to the end of this message from an earlier post.

> Having one kscrubd per node seems like the right thing to do.

Yes that is what is happening. Otherwise our NUMA stuff would not work
right ;-)

> Should we be managing the kernel threads with the kthread() API?

What would you like to manage?

-- Earlier post

The scrub daemon is invoked when a unzeroed page of a certain order has
been generated so that its worth running it. If no higher order pages are
present then the logic will favor hot zeroing rather than simply shifting
processing around. kscrubd typically runs only for a fraction of a second
and sleeps for long periods of time even under memory benchmarking. kscrubd
performs short bursts of zeroing when needed and tries to stay out off the
processor as much as possible.

The result is a significant increase of the page fault performance even for
single threaded applications (i386 2x PIII-450 384M RAM allocating 256M in
each run):

w/o patch:
 Gb Rep Threads    User   System    Wall   flt/cpu/s  fault/wsec
  0   1       1  0.006s   0.389s  0.039s  157455.320  157070.694
  0   1       2  0.007s   0.607s  0.032s  101476.689  190350.885

w/patch
 Gb Rep Threads    User   System    Wall   flt/cpu/s  fault/wsec
  0   1       1  0.008s   0.083s  0.009s  672151.422  664045.899
  0   1       2  0.005s   0.129s  0.008s  459629.796  741857.373

The performance can only be upheld if enough zeroed pages are available.
In a heavy memory intensive benchmark the system may run out of these very
fast but the efficient algorithm for page zeroing still makes this a winner
(2 way system with 384MB RAM, no hardware zeroing support). In the
following measurement the test is repeated 10 times allocating 256M each
in rapid succession which would deplete the pool of zeroed pages quickly):

w/o patch:
 Gb Rep Threads    User   System    Wall   flt/cpu/s  fault/wsec
  0  10       1  0.058s   3.913s  3.097s  157335.774  157076.932
  0  10       2  0.063s   6.139s  3.027s  100756.788  190572.486

w/patch
 Gb Rep Threads    User   System    Wall   flt/cpu/s  fault/wsec
  0  10       1  0.059s   1.828s  1.089s  330913.517  330225.515
  0  10       2  0.082s   1.951s  1.094s  307172.100  320680.232

Note that zeroing of pages makes no sense if the application touches all
cache lines of a page allocated (there is no influence of prezeroing on
benchmarks like lmbench for that reason) since the extensive caching of
modern cpus means that the zeroes written to a hot zeroed page will then be
overwritten by the application in the cpu cache and thus the zeros will
never make it to memory! The test program used above only touches one
128 byte cache line of a 16k page (ia64).

Sparsely populated and accessed areas are typical for lots of applications.

Here is another test in order to gauge the influence of the number of cache
lines touched on the performance of the prezero enhancements:

 Gb Rep Thr CLine   User  System   Wall   flt/cpu/s  fault/wsec
  1   1   1     1  0.01s   0.12s  0.01s  500813.853  497925.891
  1   1   1     2  0.01s   0.11s  0.01s  493453.103  472877.725
  1   1   1     4  0.02s   0.10s  0.01s  479351.658  471507.415
  1   1   1     8  0.01s   0.13s  0.01s  424742.054  416725.013
  1   1   1    16  0.05s   0.12s  0.01s  347715.359  336983.834
  1   1   1    32  0.12s   0.13s  0.02s  258112.286  256246.731
  1   1   1    64  0.24s   0.14s  0.03s  169896.381  168189.283
  1   1   1   128  0.49s   0.14s  0.06s  102300.257  101674.435

The benefits of prezeroing are reduced to minimal quantities if all
cachelines of a page are touched. Prezeroing can only be effective if the
whole page is not immediately used after the page fault.
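The access pattern described above, touching only some cache lines of each freshly allocated page, could look roughly like the following. This is a sketch and not the benchmark that produced the numbers: the 16k page and 128-byte line sizes are taken from the text, while `touch_lines`, the TOY_ constants and the parameters are made up here.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE  16384	/* 16k page, as on the ia64 box in the text */
#define TOY_LINE_SIZE  128	/* one cache line */

/* Write one byte into the first `lines` cache lines of every page in
 * `buf`, which must span at least `pages` pages. Returns the number of
 * writes made. With lines == 1 most of each page is never written by
 * the caller, which is the case where the text reports the largest
 * benefit from prezeroing. */
static size_t touch_lines(unsigned char *buf, size_t pages, size_t lines)
{
	size_t p, l, writes = 0;

	/* A page only has so many lines; clamp oversized requests. */
	if (lines > TOY_PAGE_SIZE / TOY_LINE_SIZE)
		lines = TOY_PAGE_SIZE / TOY_LINE_SIZE;

	for (p = 0; p < pages; p++)
		for (l = 0; l < lines; l++) {
			buf[p * TOY_PAGE_SIZE + l * TOY_LINE_SIZE] = 1;
			writes++;
		}
	return writes;
}
```

A driver for a measurement like the CLine column would presumably allocate a large anonymous region, call something like this with increasing `lines`, and time the page faults. That part is left out here since the original test program is not shown.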
Re: [PATCH] pci_raw_ops should use unsigned args
On Thu, Feb 03, 2005 at 12:08:05PM -0700, Bjorn Helgaas wrote:
> Convert pci_raw_ops to use unsigned segment (aka domain),
> bus, and devfn. With the previous code, various ia64 config
> accesses fail due to segment sign-extension problems.
>
> ia64:
> - With a signed seg >= 0x8, unwanted sign-extension occurs when
>   "seg << 28" is cast to u64 in PCI_SAL_EXT_ADDRESS()
> - PCI_SAL_EXT_ADDRESS(): cast to u64 *before* shifting; otherwise
>   "seg << 28" is evaluated as unsigned int (32 bits) and gets
>   truncated when seg > 0xf
> - pci_sal_read(): validate "value" ptr as other arches do
> - pci_sal_{read,write}(): return -EINVAL rather than SAL error status
>
> arch/i386/pci/direct.c     | 12 ++
> arch/i386/pci/mmconfig.c   |  6 +++--
> arch/i386/pci/numa.c       |  6 +++--
> arch/i386/pci/pcbios.c     |  6 +++--
> arch/ia64/pci/pci.c        | 53 ++---
> arch/x86_64/pci/mmconfig.c |  8 --
> include/linux/pci.h        |  6 +++--
> 7 files changed, 51 insertions(+), 46 deletions(-)
>
> Signed-off-by: Bjorn Helgaas <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
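The two ia64 problems listed in the changelog are easy to reproduce in plain C. Below is a minimal sketch, assuming a 32-bit int and a two's-complement compiler; the three function names are invented for illustration and only the segment part of the address is modelled, not the full PCI_SAL_EXT_ADDRESS() macro.

```c
#include <stdint.h>

/* What a signed 32-bit "seg << 28" followed by a cast to u64 ends up
 * as: the 32-bit result is sign-extended. The shift is done on an
 * unsigned value here to keep the sketch free of undefined behaviour;
 * the conversion back to int32_t is implementation-defined but wraps
 * on the usual two's-complement compilers. */
static uint64_t seg_addr_signed(int seg)
{
	int32_t shifted = (int32_t)((uint32_t)seg << 28);

	return (uint64_t)shifted;
}

/* Unsigned seg, but still shifted in 32 bits before widening:
 * everything above bit 31 is lost once seg > 0xf. */
static uint64_t seg_addr_narrow(uint32_t seg)
{
	return (uint64_t)(seg << 28);
}

/* Widen first, then shift: nothing is sign-extended or truncated. */
static uint64_t seg_addr_fixed(uint32_t seg)
{
	return (uint64_t)seg << 28;
}
```

For seg 0x8 the signed variant yields 0xFFFFFFFF80000000 where 0x80000000 was wanted, and for seg 0x10 the narrow variant yields 0 where 0x100000000 was wanted. Casting before the shift, as the changelog describes, avoids both.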
Re: [PATCH] PCI: fix pci_remove_legacy_files() crash
On Fri, Feb 04, 2005 at 12:28:36PM +0900, MUNEDA Takahiro wrote:
> Hi,
>
> The legacy_io which is the member of pci_bus struct might be
> NULL. It should be checked.
>
> This patch checks 'b->legacy_io', NULL or not.
>
> Signed-off-by: MUNEDA Takahiro <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement
Andrew wrote:
> OK, I'll add cpusets to the 2.6.12 queue.

I'd like that ;).

Thank-you, Matthew, for the work you put into making sense of this.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401
[PATCH] convert /proc/driver/rtc to seq_file.
The /proc/driver/rtc interface didn't have any module owner hook. The simplest fix is to just convert this to the single version of seq_file. Also, fix initialization of rtc_dev to use C99 form. Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> diff -Nru a/drivers/char/rtc.c b/drivers/char/rtc.c --- a/drivers/char/rtc.c2005-02-07 16:08:10 -08:00 +++ b/drivers/char/rtc.c2005-02-07 16:08:10 -08:00 @@ -73,6 +73,7 @@ #include #include #include +#include #include #include #include @@ -151,8 +152,7 @@ static void mask_rtc_irq_bit(unsigned char bit); #endif -static int rtc_read_proc(char *page, char **start, off_t off, - int count, int *eof, void *data); +static int rtc_proc_open(struct inode *inode, struct file *file); /* * Bits in rtc_status. (6 bits of room for future expansion) @@ -871,11 +871,18 @@ .fasync = rtc_fasync, }; -static struct miscdevice rtc_dev= -{ - RTC_MINOR, - "rtc", - &rtc_fops +static struct miscdevice rtc_dev = { + .minor = RTC_MINOR, + .name = "rtc", + .fops = &rtc_fops, +}; + +static struct file_operations rtc_proc_fops = { + .owner = THIS_MODULE, + .open = rtc_proc_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, }; #if defined(RTC_IRQ) && !defined(__sparc__) @@ -884,6 +891,7 @@ static int __init rtc_init(void) { + struct proc_dir_entry *ent; #if defined(__alpha__) || defined(__mips__) unsigned int year, ctrl; unsigned long uip_watchdog; @@ -974,7 +982,9 @@ release_region(RTC_PORT(0), RTC_IO_EXTENT); return -ENODEV; } - if (!create_proc_read_entry ("driver/rtc", 0, NULL, rtc_read_proc, NULL)) { + + ent = create_proc_entry("driver/rtc", 0, NULL); + if (!ent) { #ifdef RTC_IRQ free_irq(RTC_IRQ, NULL); #endif @@ -982,6 +992,7 @@ misc_deregister(&rtc_dev); return -ENOMEM; } + ent->proc_fops = &rtc_proc_fops; #if defined(__alpha__) || defined(__mips__) rtc_freq = HZ; @@ -1119,11 +1130,10 @@ * Info exported via "/proc/driver/rtc". 
*/ -static int rtc_proc_output (char *buf) +static int rtc_proc_show(struct seq_file *seq, void *v) { #define YN(bit) ((ctrl & bit) ? "yes" : "no") #define NY(bit) ((ctrl & bit) ? "no" : "yes") - char *p; struct rtc_time tm; unsigned char batt, ctrl; unsigned long freq; @@ -1134,7 +1144,6 @@ freq = rtc_freq; spin_unlock_irq(&rtc_lock); - p = buf; rtc_get_rtc_time(&tm); @@ -1142,12 +1151,12 @@ * There is no way to tell if the luser has the RTC set for local * time or for Universal Standard Time (GMT). Probably local though. */ - p += sprintf(p, -"rtc_time\t: %02d:%02d:%02d\n" -"rtc_date\t: %04d-%02d-%02d\n" -"rtc_epoch\t: %04lu\n", -tm.tm_hour, tm.tm_min, tm.tm_sec, -tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday, epoch); + seq_printf(seq, + "rtc_time\t: %02d:%02d:%02d\n" + "rtc_date\t: %04d-%02d-%02d\n" + "rtc_epoch\t: %04lu\n", + tm.tm_hour, tm.tm_min, tm.tm_sec, + tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday, epoch); get_rtc_alm_time(&tm); @@ -1156,57 +1165,50 @@ * match any value for that particular field. Values that are * greater than a valid time, but less than 0xc0 shouldn't appear. */ - p += sprintf(p, "alarm\t\t: "); + seq_puts(seq, "alarm\t\t: "); if (tm.tm_hour <= 24) - p += sprintf(p, "%02d:", tm.tm_hour); + seq_printf(seq, "%02d:", tm.tm_hour); else - p += sprintf(p, "**:"); + seq_puts(seq, "**:"); if (tm.tm_min <= 59) - p += sprintf(p, "%02d:", tm.tm_min); + seq_printf(seq, "%02d:", tm.tm_min); else - p += sprintf(p, "**:"); + seq_puts(seq, "**:"); if (tm.tm_sec <= 59) - p += sprintf(p, "%02d\n", tm.tm_sec); + seq_printf(seq, "%02d\n", tm.tm_sec); else - p += sprintf(p, "**\n"); + seq_puts(seq, "**\n"); - p += sprintf(p, -"DST_enable\t: %s\n" -"BCD\t\t: %s\n" -"24hr\t\t: %s\n" -"square_wave\t: %s\n" -"alarm_IRQ\t: %s\n" -"update_IRQ\t: %s\n" -"periodic_IRQ\t: %s\n" -"periodic_freq\t: %ld\n" -"batt_status\t: %s\n", -YN(RTC_DST_EN), -NY(RTC_DM_BINARY), -YN(RTC_24H), -
RE: BIOS Bug
>-Original Message- >From: [EMAIL PROTECTED] >[mailto:[EMAIL PROTECTED] On Behalf Of Enrico Bartky >Sent: Monday, February 07, 2005 7:12 AM >To: linux-kernel@vger.kernel.org >Subject: BIOS Bug > >Hello, > >on my notebook, when I plugged in my USB keyboard the kernel >doesnt boot correctly, ... > >... >BIOS hangoff failed ( 112, 1010001 ) >continuing after BIOS bug >irq 192, pci mem 0xfebff000 >new usb device registered, assigned bus number 1 >... > >then the notebook hangs. If I boot without the plugged >keyboard and plug in when the kernel is ready, there are no >problems. I have a SiS USB chipset. > >Can you help me? What kernel version are you using ? Try 2.6.10 with the following command line parameter: usb-handoff Aleks. > >Thanx, EnricoB
Re: prezeroing V6 [2/3]: ScrubD
Christoph Lameter <[EMAIL PROTECTED]> wrote: > > Adds management of ZEROED and NOT_ZEROED pages and a background daemon > called scrubd. What were the benchmarking results for this work? I think you had some, but this is pretty vital info, so it should be retained in the changelogs. Having one kscrubd per node seems like the right thing to do. Should we be managing the kernel threads with the kthread() API?
Re: [PATCH] PCI Hotplug: remove incorrect rpaphp firmware dependency
> Er, use the result of the get_children_props() call only if it _failed_? > I suspect that wasn't your intention. This makes my G5 boot again: Here's an alternate fix for the ppc64 crash during boot. This corrects the offending function to use more conventional error codes. I'll follow up with return code cleanups for the entire module, and for RTAS code, since these are probably too big for 2.6.11. Please apply, if appropriate. Thanks- John Signed-off-by: John Rose <[EMAIL PROTECTED]> diff -puN drivers/pci/hotplug/rpaphp_core.c~01_rpaphp_is_php_fix drivers/pci/hotplug/rpaphp_core.c --- 2_6_linus/drivers/pci/hotplug/rpaphp_core.c~01_rpaphp_is_php_fix 2005-02-07 18:06:29.0 -0600 +++ 2_6_linus-johnrose/drivers/pci/hotplug/rpaphp_core.c2005-02-07 18:10:15.0 -0600 @@ -224,7 +224,7 @@ static int get_children_props(struct dev if (!indexes || !names || !types || !domains) { /* Slot does not have dynamically-removable children */ - return 1; + return -EINVAL; } if (drc_indexes) *drc_indexes = indexes; @@ -260,7 +260,7 @@ int rpaphp_get_drc_props(struct device_n } rc = get_children_props(dn->parent, &indexes, &names, &types, &domains); - if (rc) { + if (rc < 0) { return 1; } @@ -307,7 +307,7 @@ static int is_php_dn(struct device_node int rc; rc = get_children_props(dn, indexes, names, &drc_types, power_domains); - if (rc) { + if (rc >= 0) { if (is_php_type((char *) &drc_types[1])) { *types = drc_types; return 1; @@ -331,7 +331,7 @@ static int is_dr_dn(struct device_node * rc = get_children_props(dn->parent, indexes, names, types, power_domains); - return (rc == 0); + return (rc >= 0); } static inline int is_vdevice_root(struct device_node *dn) _
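The return-code convention this patch moves toward (a negative errno value on failure, zero or positive on success, callers testing rc < 0) can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct node, lookup_props() and is_hotpluggable() are hypothetical stand-ins and are not the rpaphp code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a device node; not the kernel's struct device_node. */
struct node {
    const int *indexes;   /* NULL when the node lacks the property */
};

/* Returns 0 on success, or a negative errno value on failure. */
static int lookup_props(const struct node *n, const int **indexes)
{
    if (n == NULL || n->indexes == NULL)
        return -EINVAL;   /* failure is always reported as a negative value */
    if (indexes)
        *indexes = n->indexes;
    return 0;
}

/* Returns 1 if the node carries the property, 0 otherwise. */
static int is_hotpluggable(const struct node *n)
{
    const int *indexes;
    int rc = lookup_props(n, &indexes);

    return rc >= 0;       /* treat the lookup as usable only when it succeeded */
}
```

With this shape a caller can no longer confuse "1 means failure" with "1 means found", which appears to be the mix-up described at the top of the message.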
Merging the Suspend2 freezer implementation.
Hi Pavel. I'm keen to see if we can merge Suspend2's freezer implementation after 2.6.11. Does that conflict with any of your intended changes? If it doesn't, I'll submit a patch for review/merge as quickly as I can. The main change involves the introduction of a new SYNCTHREAD flag. We use this to avoid deadlocking over processes that are running sys_sync and siblings. Processes that enter those routines get the flag added, and it's removed when they exit the sync routine. We then freeze in four stages: 1) Freeze user space threads without SYNCTHREAD set; 2) Freeze user space threads with SYNCTHREAD set; 3) Run our own sys_sync in case no one else was syncing 4) Freeze kernel space threads without NOFREEZE set. I'd also like to look at your SMP support and see if we can improve compatibility there at the same time. Finally I'd like to merge the support for freezer flags on workqueues. Regards, Nigel -- Nigel Cunningham Software Engineer, Canberra, Australia http://www.cyclades.com Ph: +61 (2) 6292 8028 Mob: +61 (417) 100 574
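The four-stage ordering described above can be sketched as a small userspace model. Everything here is an assumption made for illustration: the T_* flag values, struct task and the helper names are hypothetical, and a real freezer has to signal tasks and wait for them to stop rather than just set a bit.

```c
#include <stddef.h>

/* Hypothetical task flags for illustration; not the kernel's PF_* values. */
enum {
    T_KERNEL     = 1 << 0,  /* kernel thread */
    T_SYNCTHREAD = 1 << 1,  /* currently inside a sync routine */
    T_NOFREEZE   = 1 << 2,  /* must never be frozen */
    T_FROZEN     = 1 << 3   /* already frozen */
};

struct task {
    unsigned flags;
};

/* Freeze every not-yet-frozen task whose (flags & mask) equals want. */
static void freeze_matching(struct task *t, size_t n, unsigned mask, unsigned want)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (!(t[i].flags & T_FROZEN) && (t[i].flags & mask) == want)
            t[i].flags |= T_FROZEN;
}

static void freeze_all(struct task *t, size_t n, void (*do_sync)(void))
{
    /* 1) user space threads without SYNCTHREAD set */
    freeze_matching(t, n, T_KERNEL | T_SYNCTHREAD, 0);
    /* 2) user space threads with SYNCTHREAD set */
    freeze_matching(t, n, T_KERNEL | T_SYNCTHREAD, T_SYNCTHREAD);
    /* 3) run our own sync in case no one else was syncing */
    if (do_sync)
        do_sync();
    /* 4) kernel threads without NOFREEZE set */
    freeze_matching(t, n, T_KERNEL | T_NOFREEZE, T_KERNEL);
}
```

The point of the ordering is that tasks which might be needed to finish a sync are left running until the tasks that could start new I/O have been stopped.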
Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement
Matthew Dobson <[EMAIL PROTECTED]> wrote: > > Sorry to reply a long quiet thread, Is appreciated, thanks. > but I've been trading emails with Paul > Jackson on this subject recently, and I've been unable to convince either him > or myself that merging CPUSETs and CKRM is as easy as I once believed. I'm > still convinced the CPU side is doable, but I haven't managed as much success > with the memory binding side of CPUSETs. In light of this, I'd like to > remove > my previous objections to CPUSETs moving forward. If others still have > things > they want discussed before CPUSETs moves into mainline, that's fine, but it > seems to me that CPUSETs offer legitimate functionality and that the code has > certainly "done its time" in -mm to convince me it's stable and usable. OK, I'll add cpusets to the 2.6.12 queue. going once, going twice...
Re: 2.6.11-rc3: Kylix application no longer works?
Daniel Drake <[EMAIL PROTECTED]> wrote: > > > # fs/binfmt_elf.c > > # 2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19 > > # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c > > and fs/compat.c > > # > > I think so. For a short period we applied this patch to the Gentoo 2.6.10 > kernel... > > http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch > > ...but removed it once users complained it stopped kylix binaries from > running. Bah. That's what happens when you fix stuff. What's kylix? The Borland C++ builder thing? How should one set about reproducing this problem?
Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement
Matthew Dobson wrote: On Sun, 2004-10-03 at 16:53, Martin J. Bligh wrote: Martin wrote: Matt had proposed having a separate sched_domain tree for each cpuset, which made a lot of sense, but seemed harder to do in practice because "exclusive" in cpusets doesn't really mean exclusive at all. See my comments on this from yesterday on this thread. I suspect we don't want a distinct sched_domain for each cpuset, but rather a sched_domain for each of several entire subtrees of the cpuset hierarchy, such that every CPU is in exactly one such sched domain, even though it be in several cpusets in that sched_domain. Mmmm. The fundamental problem I think we ran across (just whilst pondering, not in code) was that some things (eg ... init) are bound to ALL cpus (or no cpus, depending how you word it); i.e. they're created before the cpusets are, and are a member of the grand-top-level-uber-master-thingummy. How do you service such processes? That's what I meant by the exclusive domains aren't really exclusive. Perhaps Matt can recall the problems better. I really liked his idea, aside from the small problem that it didn't seem to work ;-) Well that doesn't seem like a fair statement. It's potentially true, but it's really hard to say without an implementation! ;) I think that the idea behind cpusets is really good, essentially creating isolated areas of CPUs and memory for tasks to run undisturbed. I feel that the actual implementation, however, is taking a wrong approach, because it attempts to use the cpus_allowed mask to override the scheduler in the general case. cpus_allowed, in my estimation, is meant to be used as the exception, not the rule. If we wish to change that, we need to make the scheduler more aware of it, so it can do the right thing(tm) in the presence of numerous tasks with varying cpus_allowed masks. The other option is to implement cpusets in a way that doesn't use cpus_allowed. That is the option that I am pursuing. 
My idea is to make sched_domains much more flexible and dynamic. By adding locking and reference counting, and simplifying the way in which sched_domains are created, linked, unlinked and eventually destroyed we can use sched_domains as the implementation of cpusets. IA64 already allows multiple sched_domains trees without a shared top-level domain. My proposal is to make this functionality more generally available. Extending the "isolated domains" concept a little further will buy us most (all?) the functionality of "exclusive" cpusets without the need to use cpus_allowed at all. I've got some code. I'm in the midst of pushing it forward to rc3-mm2. I'll post an RFC later today or tomorrow when it's cleaned up. -Matt Sorry to reply a long quiet thread, but I've been trading emails with Paul Jackson on this subject recently, and I've been unable to convince either him or myself that merging CPUSETs and CKRM is as easy as I once believed. I'm still convinced the CPU side is doable, but I haven't managed as much success with the memory binding side of CPUSETs. In light of this, I'd like to remove my previous objections to CPUSETs moving forward. If others still have things they want discussed before CPUSETs moves into mainline, that's fine, but it seems to me that CPUSETs offer legitimate functionality and that the code has certainly "done its time" in -mm to convince me it's stable and usable. -Matt
Re: 2.6.11-rc3: Kylix application no longer works?
Andrew Morton wrote: I wonder if reverting the patch will restore the old behaviour? # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2005/01/21 13:42:18-08:00 [EMAIL PROTECTED] # Merge nuts.davemloft.net:/disk1/BK/sparcwork-2.6 # into nuts.davemloft.net:/disk1/BK/sparc-2.6 # # fs/binfmt_elf.c # 2005/01/21 13:42:06-08:00 [EMAIL PROTECTED] +0 -0 # Auto merged # # ChangeSet # 2005/01/17 13:38:38-08:00 [EMAIL PROTECTED] # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c # # Signed-off-by: David S. Miller <[EMAIL PROTECTED]> # # fs/compat_ioctl.c # 2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +12 -5 # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c # # fs/binfmt_elf.c # 2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19 # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c # I think so. For a short period we applied this patch to the Gentoo 2.6.10 kernel... http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch ...but removed it once users complained it stopped kylix binaries from running. Daniel
Re: question on symbol exports
On Feb 7, 2005, at 4:35 PM, Benjamin Herrenschmidt wrote: Interesting... more than no swap, you must also make sure you have no r/w mmap'ed file (which are technically equivalent to swap). Yeah, I kinda had a similar thought. Just because you aren't swapping doesn't mean the VM subsystem isn't looking at dirty bits, too. It could potentially steal a page that it thinks can be replaced from either a zero-fill or reading again from persistent storage. -- Dan
Re: [PATCH] sys_chroot() hook for additional chroot() jails enforcing
On Mon, 07-02-2005 at 16:50 -0600, Serge E. Hallyn wrote: > Hi, > > If I understood you correct earlier, the only policy you needed to > enforce was to prevent double-chrooting. If that is the case, why is it > not sufficient to keep a "process-has-used-chroot" flag in > current->security which is set on the first call to > capable(CAP_SYS_CHROOT) and inherited by forked children, after which > calls to capable(CAP_SYS_CHROOT) are refused? > > Of course if you need to do more, then a hook might be necessary. Yeah, checking via the current macro whether the process is chrooted, and denying the request when capable() sees it asking for CAP_SYS_CHROOT, is how vSecurity currently does it. But the hook will also have to handle some chdir enforcement that can't be done with the current hooks; I will explain it further tomorrow. It's too late here ;) Cheers, -- Lorenzo Hernández García-Hierro <[EMAIL PROTECTED]> [1024D/6F2B2DEC] & [2048g/9AE91A22][http://tuxedo-es.org]
Re: [RFC][PATCH 2.6.11-rc3-mm1] Relay Fork Module
Guillaume Thouvenin <[EMAIL PROTECTED]> wrote: > > Hello, > >This module sends a signal to one or several processes (in user > space) when a fork occurs in the kernel. It relays information about > forks (parent and child pid) to a user space application. > > ... >This patch is used by the Enhanced Linux System Accounting tool that > can be downloaded from http://elsa.sf.net So this permits ELSA to maintain a complete picture of the process/thread hierarchy? I guess that fits into the "do it in userspace" mantra - certainly hooking into fork() is a minimal way of doing this, although I wonder what the limitations are. Implementation-wise: there's a lot of code there and the interface is a bit awkward. Why not just feed that kobject you have there into kobject_uevent()? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Real-Time Preemption and UML?
Well, I keep trying a little bit more. In the meantime you can get some of the stuff I needed to change to at least get it to compile: One of the problems was the direct use of architecture-specific semaphores (which don't work under PREEMPT_REALTIME) in places where a quick (maybe too quick) look at the code told me that completions ought to be used. Therefore I changed two semaphores to completions, which compiled fine. I have tried the change on 2.6.11-rc2, and it seemed to work, but I have not tested it heavily. The patch is in an attachment - I hope the mailing list will allow that. It is simply too troublesome otherwise when I am using Pine as mail client. Esben On Mon, 7 Feb 2005, Jeff Dike wrote: > [EMAIL PROTECTED] said: > > Hi, I am trying to compile and run UM-Linux with PREEMPT_REALTIME. I > > managed to get it to compile but it wont start - it simply stops > > somewhere in start_kernel() :-( > > I've never played with preemption on UML. No doubt it needs some work... > > Jeff --- linux-2.6.11-rc2-um/arch/um/drivers/port_kern.c.orig2005-01-23 15:53:29.0 +0100 +++ linux-2.6.11-rc2-um/arch/um/drivers/port_kern.c 2005-02-06 19:54:52.0 +0100 @@ -23,7 +23,7 @@ struct port_list { struct list_head list; int has_connection; - struct semaphore sem; + struct completion done; int port; int fd; spinlock_t lock; @@ -66,7 +66,7 @@ conn->fd = fd; list_add(&conn->list, &conn->port->connections); - up(&conn->port->sem); + complete(&conn->port->done); return(IRQ_HANDLED); } @@ -183,13 +183,14 @@ *port = ((struct port_list) { .list = LIST_HEAD_INIT(port->list), .has_connection = 0, - .sem = __SEMAPHORE_INITIALIZER(port->sem, - 0), .lock = SPIN_LOCK_UNLOCKED, .port = port_num, .fd = fd, .pending = LIST_HEAD_INIT(port->pending),
.connections = LIST_HEAD_INIT(port->connections) }); + + init_completion(&port->done), + list_add(&port->list, &ports); found: @@ -221,7 +222,7 @@ int fd; while(1){ - if(down_interruptible(&port->sem)) + if(wait_for_completion_interruptible(&port->done)) return(-ERESTARTSYS); spin_lock(&port->lock); --- linux-2.6.11-rc2-um/arch/um/drivers/xterm_kern.c.orig 2005-01-23 15:53:29.0 +0100 +++ linux-2.6.11-rc2-um/arch/um/drivers/xterm_kern.c2005-02-06 19:54:58.0 +0100 @@ -16,7 +16,7 @@ #include "xterm.h" struct xterm_wait { - struct semaphore sem; + struct completion ready; int fd; int pid; int new_fd; @@ -32,7 +32,7 @@ return(IRQ_NONE); xterm->new_fd = fd; - up(&xterm->sem); + complete(&xterm->ready); return(IRQ_HANDLED); } @@ -49,10 +49,10 @@ /* This is a locked semaphore... */ *data = ((struct xterm_wait) - { .sem = __SEMAPHORE_INITIALIZER(data->sem, 0), - .fd = socket, + { .fd = socket, .pid = -1, .new_fd = -1 }); + init_completion(&data->ready); err = um_request_irq(XTERM_IRQ, socket, IRQ_READ, xterm_interrupt, SA_INTERRUPT | SA_SHIRQ | SA_SAMPLE_RANDOM, @@ -68,7 +68,7 @@ * * XXX Note, if the xterm doesn't work for some reason (eg. DISPLAY * isn't set) this will hang... */ - down(&data->sem); + wait_for_completion(&data->ready); free_irq_by_irq_and_dev(XTERM_IRQ, data); free_irq(XTERM_IRQ, data);
Re: question on symbol exports
Benjamin Herrenschmidt wrote: Interesting... more than no swap, you must also make sure you have no r/w mmap'ed file (which are technically equivalent to swap). Ah...thanks for the warning. We want to eventually make it work with swap as well, but that's substantially more complicated. I'm not too fan about exporting those symbols, but I'll talk to paulus, it should be possible at least to EXPORT_SYMBOL_GPL them... I understand the reluctance. I'm perfectly willing to export it GPL in my private branch as long as you guys don't consider it evil--the module is going to be GPL anyways. The alternative would be for me to build my code directly in to the kernel...just makes it harder for me to debug. Chris
Re: [PATCH 1/1] PCI: Dynids - passing driver data
Martin Mares wrote: Hello! Which is a good thing, right? "driver_data" is usually a pointer to somewhere. Having userspace specify it would not be a good thing. That depends on the driver usage, and the patch allows it to be configurable and defaults to not being used. Maybe we could just define the operation as cloning of an entry for another device ID, including its driver_data. Possibly. That would potentially require a lot of parameters to userspace. We would really need to duplicate all the currently existing sysfs parms to accomplish this. -- Brian King eServer Storage I/O IBM Linux Technology Center
Re: Suggestion for CD filesystem for Backups
On Fri, Sep 24, 2004 at 01:18:19AM -0400, Ali Bayazit wrote: > > On Thu, 2004-09-23 at 17:16 +0100, Alan Cox wrote: > > On Iau, 2004-09-23 at 00:04, Judith und Mirko Kloppstech wrote: > > > Why not write a file system on top of ISO9660 which uses the rest of the > > > CD to write error correction. If a sector becomes unreadable, the error > > > correction saves the data. Besides, a tool for testing the error rate > > > and the safety of the data can be easily written for a normal CD-ROM > > > drive. > > > > > > The data for error correction might be written into a file so that the > > > CD can be read using any System, but Linux provides error correction. > > > > Send patches, or possibly if you are dumping tars and the like just > > write yourself an app to generate a second file of ECC data. > > Wouldn't it be safer to do ECC on meta-data also? > That probably means replacing ISO9660 though. There seems to be a good user space alternative for this purpose: http://dvdisaster.berlios.de Regards, Toon. -- "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan
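As a rough idea of what "a second file of ECC data" could hold, here is a minimal sketch using plain XOR parity over fixed-size blocks. It is an assumption-laden toy: the BLOCK size and the function names are made up for illustration, the scheme can rebuild only one lost block per parity group, and dedicated tools use considerably stronger codes.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 8   /* tiny block size, for illustration only */

/* parity[j] becomes the XOR of byte j across all n blocks. */
static void make_parity(unsigned char blocks[][BLOCK], size_t n,
                        unsigned char parity[BLOCK])
{
    size_t i, j;

    memset(parity, 0, BLOCK);
    for (i = 0; i < n; i++)
        for (j = 0; j < BLOCK; j++)
            parity[j] ^= blocks[i][j];
}

/* Rebuild block `lost` from the parity and the surviving blocks. */
static void rebuild(unsigned char blocks[][BLOCK], size_t n, size_t lost,
                    const unsigned char parity[BLOCK])
{
    size_t i, j;

    memcpy(blocks[lost], parity, BLOCK);
    for (i = 0; i < n; i++)
        if (i != lost)
            for (j = 0; j < BLOCK; j++)
                blocks[lost][j] ^= blocks[i][j];
}
```

The parity block would be written to the separate file; if one data sector in the group later becomes unreadable, XOR-ing the parity with the remaining sectors yields the missing one.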
Re: move-accounting-function-calls-out-of-critical-vm-code-paths.patch
Jay Lan <[EMAIL PROTECTED]> wrote: > > I have tested Christoph's patch before the leave. It did work for CSA > and showed performance improvement on certain configuration. OK, thanks. > Should i propose to include the CSA module in > the kernel then, Andrew? :) Sure, if such an action is suitable for all the other parties who are interested in enhanced system accounting. What this ballgame needs is for someone to grab the bull by the horns and run with it. This thing obviously requires a lot more cliches!
Re: [PATCH] sys_chroot() hook for additional chroot() jails enforcing
Hi, If I understood you correct earlier, the only policy you needed to enforce was to prevent double-chrooting. If that is the case, why is it not sufficient to keep a "process-has-used-chroot" flag in current->security which is set on the first call to capable(CAP_SYS_CHROOT) and inherited by forked children, after which calls to capable(CAP_SYS_CHROOT) are refused? Of course if you need to do more, then a hook might be necessary. -serge
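The flag-based policy suggested here can be modelled in a few lines of userspace C. This is a sketch only: struct proc_sec, check_chroot_cap() and fork_sec() are hypothetical names standing in for whatever would hang off current->security, and no real LSM interface is being shown.

```c
#include <errno.h>

/* Hypothetical per-process security blob; not a real LSM structure. */
struct proc_sec {
    int used_chroot;   /* set once the process has been granted CAP_SYS_CHROOT */
};

/* Models the capability check: grant the first request, refuse later ones. */
static int check_chroot_cap(struct proc_sec *sec)
{
    if (sec->used_chroot)
        return -EPERM;
    sec->used_chroot = 1;
    return 0;
}

/* Models fork(): the child starts with a copy of the parent's flag. */
static struct proc_sec fork_sec(const struct proc_sec *parent)
{
    struct proc_sec child = *parent;

    return child;
}
```

Because the flag is copied at fork time, a process that has already chrooted cannot escape the restriction by spawning a child to do the second chroot.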
Re: Irix NFS server usual problem
On Mon, 07.02.2005 at 23:16 (+0100), Olivier Galibert wrote: > I'm starting to install some fedora core 3 systems in an environment > where 64bits SGIs are still serving the home directories. They have > the bug/feature that required the 2.4 patch to hack the 64bits > cookies[1]. The 2.6 kernel I just found still can't compensate by > itself for the issue. > > Is there an easy way to fix that? Have you applied SGI's IRIX patches to your server (the one that makes the cookies take 32-bit values)? Alternatively, you can forward-port the old 2.4.x cookie hack to 2.6.x (that should be fairly trivial to do). You can find the patch on http://client.linux-nfs.org/Linux-2.4.x/2.4.26/linux-2.4.26-02-seekdir.dif Cheers, Trond -- Trond Myklebust <[EMAIL PROTECTED]>
Re: [PATCH] Filesystem linking protections
* John Richard Moser ([EMAIL PROTECTED]) wrote: > Yes, mkdtemp() and mkstemp(). > > Of course we can't always rely on programmers to get it right, so the > idea here is to make sure we ask broken code to behave nicely, and stab > it in the face if it doesn't. Please try to examine this in that scope. It's fine for hardened distro. But still inappropriate for mainline. thanks, -chris -- Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
Re: 2.6.11-rc3: Kylix application no longer works?
Pavel Machek <[EMAIL PROTECTED]> wrote: > > I have some obscure Kylix application here... It started gets > misteriously killed in 2.6.11-rc3 and -rc3-mm1... > > [EMAIL PROTECTED]:~/slovnik/bin$ strace ./Slovnik > execve("./Slovnik", ["./Slovnik"], [/* 32 vars */]) = 0 > +++ killed by SIGKILL +++ > [EMAIL PROTECTED]:~/slovnik/bin$ ldd ./Slovnik > /usr/bin/ldd: line 1: 8759 Killed > LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW= > LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$file" > [EMAIL PROTECTED]:~/slovnik/bin$ > > I get this in 2.6.10-rc3: > > [EMAIL PROTECTED]:~/slovnik/bin$ ./Slovnik > ./Slovnik: relocation error: ./Slovnik: undefined symbol: > initPAnsiStrings > [EMAIL PROTECTED]:~/slovnik/bin$ ldd ./Slovnik > libz.so.1 => /usr/lib/libz.so.1 (0xb7fc2000) > libX11.so.6 => /usr/X11/lib/libX11.so.6 (0xb7efa000) > libpthread.so.0 => /lib/libpthread.so.0 (0xb7ea9000) > libdl.so.2 => /lib/libdl.so.2 (0xb7ea6000) > libc.so.6 => /lib/libc.so.6 (0xb7d73000) > /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000) > [EMAIL PROTECTED]:~/slovnik/bin$ Does it work correctly under earlier kernels? If so, when did it break? > When I set LD_LIBRARY_PATH right, it will actually work. Any ideas? Presumably you're picking up a different library without LD_LIBRARY_PATH. Perhaps that library is mucked up and the new uaccess checking code in binfmt_elf.c is now doing the right thing, and we were previously forgetting to report some error. I wonder if reverting the patch will restore the old behaviour? # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2005/01/21 13:42:18-08:00 [EMAIL PROTECTED] # Merge nuts.davemloft.net:/disk1/BK/sparcwork-2.6 # into nuts.davemloft.net:/disk1/BK/sparc-2.6 # # fs/binfmt_elf.c # 2005/01/21 13:42:06-08:00 [EMAIL PROTECTED] +0 -0 # Auto merged # # ChangeSet # 2005/01/17 13:38:38-08:00 [EMAIL PROTECTED] # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c # # Signed-off-by: David S. 
Miller <[EMAIL PROTECTED]> # # fs/compat_ioctl.c # 2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +12 -5 # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c # # fs/binfmt_elf.c # 2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19 # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c # diff -Nru a/fs/binfmt_elf.c b/fs/binfmt_elf.c --- a/fs/binfmt_elf.c 2005-02-07 14:50:07 -08:00 +++ b/fs/binfmt_elf.c 2005-02-07 14:50:07 -08:00 @@ -110,15 +110,17 @@ be in memory */ -static void padzero(unsigned long elf_bss) +static int padzero(unsigned long elf_bss) { unsigned long nbyte; nbyte = ELF_PAGEOFFSET(elf_bss); if (nbyte) { nbyte = ELF_MIN_ALIGN - nbyte; - clear_user((void __user *) elf_bss, nbyte); + if (clear_user((void __user *) elf_bss, nbyte)) + return -EFAULT; } + return 0; } /* Let's use some macros to make this stack manipulation a litle clearer */ @@ -134,7 +136,7 @@ #define STACK_ALLOC(sp, len) ({ sp -= len ; sp; }) #endif -static void +static int create_elf_tables(struct linux_binprm *bprm, struct elfhdr * exec, int interp_aout, unsigned long load_addr, unsigned long interp_load_addr) @@ -179,7 +181,8 @@ STACK_ALLOC(p, ((current->pid % 64) << 7)); #endif u_platform = (elf_addr_t __user *)STACK_ALLOC(p, len); - __copy_to_user(u_platform, k_platform, len); + if (__copy_to_user(u_platform, k_platform, len)) + return -EFAULT; } /* Create the ELF interpreter info */ @@ -241,7 +244,8 @@ #endif /* Now, let's put argc (and argv, envp if appropriate) on the stack */ - __put_user(argc, sp++); + if (__put_user(argc, sp++)) + return -EFAULT; if (interp_aout) { argv = sp + 2; envp = argv + argc + 1; @@ -259,25 +263,29 @@ __put_user((elf_addr_t)p, argv++); len = strnlen_user((void __user *)p, PAGE_SIZE*MAX_ARG_PAGES); if (!len || len > PAGE_SIZE*MAX_ARG_PAGES) - return; + return 0; p += len; } - __put_user(0, argv); + if (__put_user(0, argv)) + return -EFAULT; current->mm->arg_end = current->mm->env_start = 
p; while (envc-- > 0) { size_t len; __put_user((elf_addr_t)p, envp++); len = strnlen_user((void __user *)p, PAGE_SIZE*MAX_ARG_PAGES); if (!len || len > PAGE_SIZE*MAX_ARG_PAGES) - return; + return 0; p += len; } - __put_user(0, envp); + if (__put_user(0, envp)) +
Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)
On Mon, 07 Feb 2005 14:26:03 PST, Chris Wright said: > * Michael Halcrow ([EMAIL PROTECTED]) wrote: > > This is the third in a series of eight patches to the BSD Secure > > Levels LSM. It moves the claim on the block device from the inode > > struct to the file struct in order to address a potential > > circumvention of the control via hard links to block devices. Thanks > > to Serge Hallyn for pointing this out. > > Hard links still point to same inode, what's the issue that this > addresses? Ignore that last - I thought it was the "filesystem linking permissions" thread rather than the BSD Secure linking permissions thread. ;)
Re: [linux-usb-devel] 2.6: USB disk unusable level of data corruption
David Brownell wrote: > On Sunday 06 February 2005 7:59 am, Giuseppe Bilotta wrote: > > > > I have a MAGNEX/ViPower USB/FireWire external HD enclosure. I > > found that it works pretty fine (albeit slowly) when connected > > to the USB 1.1 ports built in my Dell Inspiron 8200, but trying > > to connect it via the Hamlet PCMCIA USB2 Card Adapter doesn't > > work (it seems it gets assigned minors 1,2,3,4,5,6,... and so > > on forever until I unplug it). > > What do you mean "minors"? Addresses or actual /dev/sdN numbers? > > If it's addresses, that would be an enumeration problem. Some > recent changes have caused problems there, 2.6.11-rc3-mm2 ought to > have a patch making it better. (Well, working around one of the > two problems that'd suggest.) Sorry, it's addresses. usb 5-1: new high speed USB device using ehci_hcd and address 4 usb 5-1: new high speed USB device using ehci_hcd and address 5 usb 5-1: new high speed USB device using ehci_hcd and address 6 blah blah blah, neverending. So yes, it's probably the enumeration problem. 
Also, when I plug in the PCMCIA card I get (sorry for the wrapping, Gravity sucks) PCI: Enabling device :07:00.0 ( -> 0002) ACPI: PCI interrupt :07:00.0[A] -> GSI 11 (level, low) -> IRQ 11 ohci_hcd :07:00.0: NEC Corporation USB PCI: Setting latency timer of device :07:00.0 to 64 ohci_hcd :07:00.0: irq 11, pci mem 0x2900 ohci_hcd :07:00.0: new USB bus registered, assigned bus number 3 hub 3-0:1.0: USB hub found hub 3-0:1.0: 3 ports detected PCI: Enabling device :07:00.1 ( -> 0002) ACPI: PCI interrupt :07:00.1[B] -> GSI 11 (level, low) -> IRQ 11 ohci_hcd :07:00.1: NEC Corporation USB (#2) PCI: Setting latency timer of device :07:00.1 to 64 ohci_hcd :07:00.1: irq 11, pci mem 0x29001000 ohci_hcd :07:00.1: new USB bus registered, assigned bus number 4 hub 4-0:1.0: USB hub found hub 4-0:1.0: 2 ports detected PCI: Enabling device :07:00.2 ( -> 0002) ACPI: PCI interrupt :07:00.2[C] -> GSI 11 (level, low) -> IRQ 11 ehci_hcd :07:00.2: NEC Corporation USB 2.0 ehci_hcd :07:00.2: irq 11, pci mem 0x29002000 ehci_hcd :07:00.2: new USB bus registered, assigned bus number 5 ehci_hcd :07:00.2: USB 2.0 initialized, EHCI 0.95, driver 26 Oct 2004 hub 5-0:1.0: USB hub found hub 5-0:1.0: 5 ports detected The card only has 2 USB ports .. why 5 ports here? Is this the same bug? Another interesting tidbit is that I get: USB Universal Host Controller Interface driver v2.2 ACPI: PCI interrupt :00:1d.0[A] -> GSI 11 (level, low) -> IRQ 11 uhci_hcd :00:1d.0: Intel Corp. 82801CA/CAM USB (Hub #1) PCI: Setting latency timer of device :00:1d.0 to 64 uhci_hcd :00:1d.0: irq 11, io base 0xbf80 uhci_hcd :00:1d.0: new USB bus registered, assigned bus number 1 hub 1-0:1.0: USB hub found hub 1-0:1.0: 2 ports detected ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11 ACPI: PCI interrupt :00:1d.2[C] -> GSI 11 (level, low) -> IRQ 11 uhci_hcd :00:1d.2: Intel Corp. 
82801CA/CAM USB (Hub #3) PCI: Setting latency timer of device :00:1d.2 to 64 uhci_hcd :00:1d.2: irq 11, io base 0xbf20 uhci_hcd :00:1d.2: new USB bus registered, assigned bus number 2 hub 2-0:1.0: USB hub found hub 2-0:1.0: 2 ports detected for the built-in ports ... I only have two USB ports on this machine though, why does it see 4 of them? (Do you also need the lspci and/or lsusb and/or dmesg of the error that happens when I disable the EHCI driver and only let the OHCI manage the PCMCIA card?) -- Giuseppe "Oblomov" Bilotta Can't you see It all makes perfect sense Expressed in dollar and cents Pounds shillings and pence (Roger Waters) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)
On Mon, 07 Feb 2005 14:26:03 PST, Chris Wright said: > Hard links still point to same inode, what's the issue that this > addresses? For those systems that have everything on one big partition, you can often do stuff like: ln /etc/passwd /tmp/ and wait for /etc/passwd to get clobbered by a cron job run by root...
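Both the question quoted here and the /etc/passwd scenario rest on the same fact: a hard link is just a second name for one inode. A minimal userspace sketch of that, assuming a POSIX system whose /tmp supports hard links; the function name and the scratch file names are made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a scratch file, hard-link it, and report whether both names
 * resolve to the same inode.
 * Returns 1 if they do, 0 if they do not, -1 if the setup failed.
 * (Hypothetical helper, not taken from any patch in this thread.)
 */
int hard_link_shares_inode(void)
{
	char path[] = "/tmp/hl_demo_XXXXXX";
	char link_path[sizeof(path) + 8];
	struct stat a, b;
	int fd, result = -1;

	fd = mkstemp(path);		/* fills in the XXXXXX part */
	if (fd < 0)
		return -1;
	close(fd);

	snprintf(link_path, sizeof(link_path), "%s.link", path);
	if (link(path, link_path) == 0) {
		if (stat(path, &a) == 0 && stat(link_path, &b) == 0)
			result = (a.st_dev == b.st_dev &&
				  a.st_ino == b.st_ino &&
				  a.st_nlink == 2);
		unlink(link_path);
	}
	unlink(path);
	return result;
}
```

Anything written through one name is visible through the other, which is why a privileged job that writes to a predictable name in a world-writable directory can end up writing to a file it never meant to touch.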
Re: [PATCH 1/1] PCI: Dynids - passing driver data
Hello! > >Which is a good thing, right? "driver_data" is usually a pointer to > >somewhere. Having userspace specify it would not be a good thing. > > That depends on the driver usage, and the patch allows it to be > configurable and defaults to not being used. Maybe we could just define the operation as cloning of an entry for another device ID, including its driver_data. Have a nice fortnight -- Martin `MJ' Mares <[EMAIL PROTECTED]> http://atrey.karlin.mff.cuni.cz/~mj/ Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth Only dead fish swim with the stream.
Re: Sabotaged PaXtest (was: Re: Patch 4/6 randomize the stack pointer)
btw., do you consider PaX as a 100% sure solution against 'code injection' attacks (meaning that the attacker wants to execute an arbitrary piece of code, and assuming the attacked application has a stack overflow)? I.e. does PaX avoid all such attacks in a guaranteed way? Ingo
Re: [PATCH] sys_chroot() hook for additional chroot() jails enforcing
* Lorenzo Hernández García-Hierro ([EMAIL PROTECTED]) wrote: > Attached you can find a patch which adds a new hook for the sys_chroot() > syscall, and makes us able to add additional enforcing and security > checks by using the Linux Security Modules framework (ie. chdir > enforcing, etc). If you want to make a change like this, collapse the capable(CAP_SYS_CHROOT) check behind this hook; there is no point having two outcalls from the same call site. What logic do you expect to put behind the chroot() hook? thanks, -chris -- Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
Re: [PATCH 1/1] PCI: Dynids - passing driver data
Greg KH wrote: On Mon, Feb 07, 2005 at 04:00:27PM -0600, [EMAIL PROTECTED] wrote: Currently, code exists in the pci layer to allow userspace to specify driver data when adding a pci dynamic id from sysfs. However, this data is never used and there exists no way in the existing code to use it. Which is a good thing, right? "driver_data" is usually a pointer to somewhere. Having userspace specify it would not be a good thing. That depends on the driver usage, and the patch allows it to be configurable and defaults to not being used. This patch allows device drivers to indicate that they want driver data passed to them on dynamic id adds by initializing use_driver_data in their pci_driver->pci_dynids struct. The documentation has also been updated to reflect this. What driver wants to use this? I am in the process of adding dynids support into the ipr scsi driver. I originally was using driver_data as a pointer, but am changing it to be an index instead, so that it can be specified by the user. There are essentially 2 different types of chipsets that ipr controls, the primary difference being the register offsets. I am using driver_data to figure that out today. My other option is to somehow change the driver to cope with having no driver data, but that will result in more driver code and will ultimately be less flexible in the new chipsets that can be added using dynids. -Brian -- Brian King eServer Storage I/O IBM Linux Technology Center - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
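The index-instead-of-pointer idea described above can be sketched in plain C. This is only a userspace model under stated assumptions: the struct, the table contents, the register offsets and the function name are all hypothetical and are not taken from the ipr driver. The point it shows is that a value supplied from userspace is safe to use once it is bounds-checked against a table the driver owns, whereas a raw pointer would not be.

```c
#include <stddef.h>

/* Hypothetical per-chipset description; the offsets are made-up values. */
struct chip_cfg {
	const char *name;
	unsigned long regs_offset;
};

static const struct chip_cfg chip_cfg_table[] = {
	{ "chip-type-0", 0x000 },
	{ "chip-type-1", 0x280 },
};

#define CHIP_CFG_COUNT (sizeof(chip_cfg_table) / sizeof(chip_cfg_table[0]))

/*
 * Treat driver_data as an index into a driver-owned table.
 * Returns NULL when the index is out of range, so a bad value written
 * from userspace is rejected instead of being dereferenced.
 */
const struct chip_cfg *lookup_chip_cfg(unsigned long driver_data)
{
	if (driver_data >= CHIP_CFG_COUNT)
		return NULL;
	return &chip_cfg_table[driver_data];
}
```

A probe routine in this model would call lookup_chip_cfg() on the driver_data it was handed and fail the probe on NULL.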
[PATCH] out-of-tree builds: preserve ARCH and CROSS_COMPILE settings
[I am not subscribed, please CC: any replies] When you build the 2.6 kernel outside of its source directory, using the O= option like so: make -C linux-2.6.10 O=../builddir this conveniently produces a top-level Makefile in "builddir" which can be used to update/clean/rebuild the tree with a simple "make". It also uses the ".config" file from "builddir", which makes it very convenient for managing multiple builds for different target systems. However if you are cross-compiling, you must also set ARCH and CROSS_COMPILE variables as appropriate. Unfortunately these settings are not recorded in the generated Makefile in "builddir", so one cannot simply do "make" anymore. The attached patch fixes the script that generates the Makefile, so as to pass ARCH and CROSS_COMPILE settings, only when they are defined. Otherwise behaviour is exactly as it was before. Since the contents of "builddir" are specific to ARCH and CROSS_COMPILER I see no reason why the values should not become fixed in "builddir". Signed-off-by: Ralph Siemsen <[EMAIL PROTECTED]> diff -u mkmakefile --- linux-2.6.10.orig/scripts/mkmakefile 27 Jan 2005 15:53:54 - +++ linux-2.6.10/scripts/mkmakefile 7 Feb 2005 21:20:19 - @@ -9,6 +9,8 @@ # $3 - version # $4 - patchlevel +test "$ARCH" != "" && ARCH="ARCH=$ARCH" +test "$CROSS_COMPILE" != "" && CROSS="CROSS_COMPILE=$CROSS_COMPILE" cat << EOF # Automatically generated by $0: don't edit @@ -22,10 +24,10 @@ MAKEFLAGS += --no-print-directory all: - \$(MAKE) -C \$(KERNELSRC) O=\$(KERNELOUTPUT) + \$(MAKE) $ARCH $CROSS -C \$(KERNELSRC) O=\$(KERNELOUTPUT) %:: - \$(MAKE) -C \$(KERNELSRC) O=\$(KERNELOUTPUT) \$@ + \$(MAKE) $ARCH $CROSS -C \$(KERNELSRC) O=\$(KERNELOUTPUT) \$@ EOF
2.6.11-rc3: Kylix application no longer works?
Hi! I have some obscure Kylix application here... It started getting mysteriously killed in 2.6.11-rc3 and -rc3-mm1... [EMAIL PROTECTED]:~/slovnik/bin$ strace ./Slovnik execve("./Slovnik", ["./Slovnik"], [/* 32 vars */]) = 0 +++ killed by SIGKILL +++ [EMAIL PROTECTED]:~/slovnik/bin$ ldd ./Slovnik /usr/bin/ldd: line 1: 8759 Killed LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW= LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$file" [EMAIL PROTECTED]:~/slovnik/bin$ I get this in 2.6.10-rc3: [EMAIL PROTECTED]:~/slovnik/bin$ ./Slovnik ./Slovnik: relocation error: ./Slovnik: undefined symbol: initPAnsiStrings [EMAIL PROTECTED]:~/slovnik/bin$ ldd ./Slovnik libz.so.1 => /usr/lib/libz.so.1 (0xb7fc2000) libX11.so.6 => /usr/X11/lib/libX11.so.6 (0xb7efa000) libpthread.so.0 => /lib/libpthread.so.0 (0xb7ea9000) libdl.so.2 => /lib/libdl.so.2 (0xb7ea6000) libc.so.6 => /lib/libc.so.6 (0xb7d73000) /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000) [EMAIL PROTECTED]:~/slovnik/bin$ When I set LD_LIBRARY_PATH right, it will actually work. Any ideas? Pavel -- People were complaining that M$ turns users into beta-testers... ...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
Re: [PATCH] Filesystem linking protections
Chris Wright wrote: > * John Richard Moser ([EMAIL PROTECTED]) wrote: > >>I've yet to see this break anything on Ubuntu or Gentoo; Brad Spengler >>claims this breaks nothing on Debian. On the other hand, this could >>potentially squash the second most prevalent security bug. > > > Yes I know, I've worked on distro with it as well in the past. And it > has broken atd and courier in the past. This is something that also > can be done in userspace using sane subdirs in +t world writable dirs, > or O_EXCL so there's work to be done in userspace. > Yes, mkdtemp() and mkstemp(). Of course we can't always rely on programmers to get it right, so the idea here is to make sure we ask broken code to behave nicely, and stab it in the face if it doesn't. Please try to examine this in that scope. > thanks, > -chris -- All content of all messages exchanged herein are left in the Public Domain, unless otherwise explicitly stated.
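For readers who have not used the userspace side mentioned here, a minimal sketch of the two mechanisms follows: mkstemp() to get a unique name, and O_CREAT | O_EXCL to refuse a name that already exists (including one that is a symlink). It assumes a POSIX system with a writable /tmp; the helper names are invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Create path only if nothing exists under that name yet. */
int create_exclusive(const char *path)
{
	return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}

/*
 * Returns 1 if a second exclusive create on an existing name is refused
 * with EEXIST, 0 if it was not refused, -1 if the setup failed.
 */
int exclusive_create_refuses_existing(void)
{
	char path[] = "/tmp/excl_demo_XXXXXX";
	int fd, second, refused;

	fd = mkstemp(path);	/* unique name, created exclusively */
	if (fd < 0)
		return -1;

	errno = 0;
	second = create_exclusive(path);
	refused = (second < 0 && errno == EEXIST);
	if (second >= 0)
		close(second);

	close(fd);
	unlink(path);
	return refused;
}
```

A program that creates its scratch files this way never follows a link planted under a predictable name, which is the class of bug the kernel-side restriction is meant to catch when programs do not do this.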
Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)
* Michael Halcrow ([EMAIL PROTECTED]) wrote: > This is the third in a series of eight patches to the BSD Secure > Levels LSM. It moves the claim on the block device from the inode > struct to the file struct in order to address a potential > circumvention of the control via hard links to block devices. Thanks > to Serge Hallyn for pointing this out. Hard links still point to same inode, what's the issue that this addresses? thanks, -chris -- Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
[ANNOUNCE] February release of LTP
The February release of LTP is now available.
LTP-20050207
- runltp now exports $TMPDIR as a copy of $TMP, certain exceptions caused these to be different.
- extra functions for LTP libs are to make these tests fail with a more informative message when attempts to create swap on tmpfs are made.
- IPV6 testcase updates from David Stevens
- Applied patch from Jacky Malcles that fixes an inconsistency regarding synchronization.
- Make proc01 skip kcore
- Fix gives a hint to the probable solution if capset01 test fails
- Fix for race conditions in synchronization between children and parent on fcntl15.
- Applied patch from Jacky Malcles to allow test to run on ia64.
- The test llseek sets RLIMIT_FSIZE to a small number, this fix restores it to its original value.
- Fix IPV6 Makefile install path problem
Linux Test Project Linux Technology Center IBM Corporation Internet E-Mail : [EMAIL PROTECTED] IBM, 11501 Burnet Rd, Austin, TX 78758 Phone (512) 838-1356 - T/L 678-1356 - Bldg. 908/1C005 Austin, TX.
Re: [PATCH 1/1] PCI: Dynids - passing driver data
On Mon, Feb 07, 2005 at 04:00:27PM -0600, [EMAIL PROTECTED] wrote: > > Currently, code exists in the pci layer to allow userspace to specify > driver data when adding a pci dynamic id from sysfs. However, this data > is never used and there exists no way in the existing code to use it. Which is a good thing, right? "driver_data" is usually a pointer to somewhere. Having userspace specify it would not be a good thing. > This patch allows device drivers to indicate that they want driver data > passed to them on dynamic id adds by initializing use_driver_data in their > pci_driver->pci_dynids struct. The documentation has also been updated > to reflect this. What driver wants to use this? thanks, greg k-h
[PATCH] sys_chroot() hook for additional chroot() jails enforcing
Hi, Attached you can find a patch which adds a new hook for the sys_chroot() syscall, and makes us able to add additional enforcing and security checks by using the Linux Security Modules framework (ie. chdir enforcing, etc). Current user of the hook is the forthcoming 0.2 revision of vSecurity. With it, and used within an LSM module, we can achieve the goal of enforcing and apply some hardening to the sys_chroot() syscall. Even if chroot jails are broken by design, in terms of security, with a few changes to their base and some syscalls that it relies with, we can achieve the goal of preventing some of the already known attacks against them. I will make available some patches for other syscalls as well (sys_fchmod(), sys_chmod(), ...), that will add a few more hooks to the LSM framework, in the hope that they will be useful. The patch can be retrieved too from: http://pearls.tuxedo-es.org/patches/sys_chroot_lsm-hook-2.6.11-rc3.patch Thanks in advance, and, again, I will appreciate any suggestions on which hooks are good candidates to be added. Feel free to edit tuxedo-es.org wiki at http://wiki.tuxedo-es.org/LSM and put suggestions & comments there. 
Cheers, -- Lorenzo Hernández García-Hierro <[EMAIL PROTECTED]> [1024D/6F2B2DEC] & [2048g/9AE91A22][http://tuxedo-es.org] diff -Nur linux-2.6.11-rc3/fs/open.c linux-2.6.11-rc3.chroot-lsm/fs/open.c --- linux-2.6.11-rc3/fs/open.c 2005-02-06 21:40:40.0 +0100 +++ linux-2.6.11-rc3.chroot-lsm/fs/open.c 2005-02-07 21:42:45.0 +0100 @@ -582,6 +582,10 @@ error = -EPERM; if (!capable(CAP_SYS_CHROOT)) goto dput_and_out; + + error = security_chroot(&nd); + if (error) + goto dput_and_out; set_fs_root(current->fs, nd.mnt, nd.dentry); set_fs_altroot(); diff -Nur linux-2.6.11-rc3/include/linux/security.h linux-2.6.11-rc3.chroot-lsm/include/linux/security.h --- linux-2.6.11-rc3/include/linux/security.h 2005-02-06 21:40:27.0 +0100 +++ linux-2.6.11-rc3.chroot-lsm/include/linux/security.h 2005-02-07 21:10:05.0 +0100 @@ -1008,6 +1008,10 @@ * @ts contains new time * @tz contains new timezone * Return 0 if permission is granted. + * @chroot: + * Check permission to change the current root by sys_chroot() syscall. + * @nd contains the nameidata struct passed by sys_chroot() + * Return 0 if permission is granted. * @vm_enough_memory: * Check permissions for allocating a new virtual mapping. * @pages contains the number of pages. 
@@ -1040,6 +1044,7 @@ int (*acct) (struct file * file); int (*sysctl) (struct ctl_table * table, int op); int (*capable) (struct task_struct * tsk, int cap); + int (*chroot) (struct nameidata * nd); int (*quotactl) (int cmds, int type, int id, struct super_block * sb); int (*quota_on) (struct dentry * dentry); int (*syslog) (int type); @@ -1304,6 +1309,10 @@ return security_ops->settime(ts, tz); } +static inline int security_chroot(struct nameidata *nd) +{ + return security_ops->chroot(nd); +} static inline int security_vm_enough_memory(long pages) { @@ -1986,6 +1995,11 @@ return cap_settime(ts, tz); } +static inline int security_chroot(struct nameidata *nd) +{ + return 0; +} + static inline int security_vm_enough_memory(long pages) { return cap_vm_enough_memory(pages); diff -Nur linux-2.6.11-rc3/security/dummy.c linux-2.6.11-rc3.chroot-lsm/security/dummy.c --- linux-2.6.11-rc3/security/dummy.c 2005-02-06 21:40:57.0 +0100 +++ linux-2.6.11-rc3.chroot-lsm/security/dummy.c 2005-02-07 21:12:01.0 +0100 @@ -101,6 +101,11 @@ return 0; } +static int dummy_chroot(struct nameidata *nd) +{ + return 0; +} + static int dummy_settime(struct timespec *ts, struct timezone *tz) { if (!capable(CAP_SYS_TIME)) @@ -858,6 +863,7 @@ set_to_dummy_if_null(ops, sysctl); set_to_dummy_if_null(ops, syslog); set_to_dummy_if_null(ops, settime); + set_to_dummy_if_null(ops, chroot); set_to_dummy_if_null(ops, vm_enough_memory); set_to_dummy_if_null(ops, bprm_alloc_security); set_to_dummy_if_null(ops, bprm_free_security); signature.asc Description: Esta parte del mensaje =?ISO-8859-1?Q?est=E1?= firmada digitalmente
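The shape of the patch above (an ops table with a new function pointer, a permissive dummy that fills the slot when no module sets it, and a thin wrapper at the call site) can be modelled in userspace. The sketch below is only an analogy under stated assumptions: struct path_info stands in for struct nameidata, and every name in it is hypothetical rather than kernel API.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for struct nameidata; only used to have something to pass. */
struct path_info {
	const char *name;
};

/* Model of the security_operations table, reduced to the one new hook. */
struct sec_ops {
	int (*chroot)(struct path_info *p);
};

/* Permissive default, playing the role of dummy_chroot() in the patch. */
static int dummy_chroot(struct path_info *p)
{
	(void)p;
	return 0;
}

/* Fill any unset hook with its default, like set_to_dummy_if_null(). */
void fixup_ops(struct sec_ops *ops)
{
	if (!ops->chroot)
		ops->chroot = dummy_chroot;
}

/* Example policy a module might install: refuse every chroot request. */
int deny_all_chroot(struct path_info *p)
{
	(void)p;
	return -EPERM;
}

/* Call-site wrapper, playing the role of security_chroot(). */
int check_chroot(struct sec_ops *ops, struct path_info *p)
{
	return ops->chroot(p);
}
```

Because the table is always fixed up before use, the call site never has to test the pointer for NULL; a module that does not care about chroot simply leaves the slot empty and gets the permissive default.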
Irix NFS server usual problem
I'm starting to install some fedora core 3 systems in an environment where 64bits SGIs are still serving the home directories. They have the bug/feature that required the 2.4 patch to hack the 64bits cookies[1]. The 2.6 kernel I just found still can't compensate by itself for the issue. Is there an easy way to fix that? OG.
Re: [PATCH] Filesystem linking protections
On Mon, 07 Feb 2005 23:00:33 +0100, Lorenzo Hernández García-Hierro said: > A sysctl can be a good option, creating a CTL_SECURITY and then > registering stuff under it, but this requires to have the kernel hackers > agree with implementing a new security suite and such. > In short, re-inventing the wheel. No, you can do this from within an LSM and the kernel hackers don't have to deal with it (tech note - don't call register_sysctl_table() from within a security_initcall(). Use a separate __initcall() that gets called later - security_initcall() happens before the kernel has the sysctl infrastructure in place. Guess how I know that? ;)
Re: [PATCH] Dynamic tick, version 050127-1
Pavel Machek wrote: Hi! I do have CONFIG_X86_PM_TIMER enabled, but it seems my board does not have such piece of hardware: [EMAIL PROTECTED]:/usr/src/linux-mm$ dmesg | grep -i "time\|tick\|apic" PCI: Setting latency timer of device :00:11.5 to 64 [EMAIL PROTECTED]:/usr/src/linux-mm$ If you are sure that machine supports ACPI, maybe this is your problem (from the POSIX high res timer patch): If you enable the ACPI pm timer and it cannot be found, it is possible that your BIOS is not producing the ACPI table or that your machine does not support ACPI. In the former case, see "Default ACPI pm timer address". If the timer is not found the boot will fail when trying to calibrate the 'delay' loop. Well, but how do I get the address? I'll try looking at BIOS options... Pavel In my machine, if I turned off the PM code (in the BIOS) (or possibly turning on the ACPI, again in the BIOS) it did produce the address. Booting then would put that address in the dmesg file. You can then change the BIOS back to what it was and use the address found in the dmesg file. -- George Anzinger george@mvista.com High-res-timers: http://sourceforge.net/projects/high-res-timers/
Re: [PATCH] Filesystem linking protections
El lun, 07-02-2005 a las 16:45 -0500, [EMAIL PROTECTED] escribió: > On Mon, 07 Feb 2005 20:34:33 +0100, Lorenzo =?ISO-8859-1?Q?Hern=E1ndez_?= > =?ISO-8859-1?Q?Garc=EDa-Hierro?= said: > > > But It's better to give users a "secure-by-default" status, at least on > > those parts that don't affect negatively the stability or the > > performance itself. > > It's still policy, and should be put someplace where users can manage it. > You're changing the behavior from what POSIX specifies, and that's in general > a no-no for mainline kernel code. A sysctl can be a good option, creating a CTL_SECURITY and then registering stuff under it, but this requires to have the kernel hackers agree with implementing a new security suite and such. In short, re-inventing the wheel. > Like an LSM, which happens to be there so users can impose policy without > making any code changes to the kernel. Implementing a policy that results in > non-POSIXy behavior in an LSM is perfectly OK.. ;) It's currently made in vSecurity :) > > The LSM hook call is before the check, so, LSM framework still has the > > control over it, until it releases the operation giving control back to > > the standard function. > > Right.. Which means LSM can stop that particular attack even faster than > your patch.. ;) At least I don't interfere with LSM, so, if no LSM hook adds it's own security checks, then it gets used. > > If users must rely on LSM or other external solutions for applying basic > > security checks (as the framework itself only provides the way to apply > > them, the checks need to be implemented in a module), then we are making > > them unable to be protected using the "default" configuration. 
> > You're making the very rash assumption that a hard-coded one-size-fits all > "default" that behaves differently than POSIX is suitable for all sites, > including sites that run software that gets broken by this change, and > things like embedded systems where it's not a concern at all, and sites that > already implement some *other* system to ensure that it's not an issue (for > instance, by using an SELinux policy...) Good point, then the solution is to make it config-dependent, and that's a thing that kernel hackers seem to dislike. Lemme know what's the final thought on this, so, I could work out it and give what you want, without time loss and we all can feel happy with it :) Cheers and thanks for the comments, -- Lorenzo Hernández García-Hierro <[EMAIL PROTECTED]> [1024D/6F2B2DEC] & [2048g/9AE91A22][http://tuxedo-es.org] signature.asc Description: Esta parte del mensaje =?ISO-8859-1?Q?est=E1?= firmada digitalmente
[PATCH 1/1] PCI: Dynids - passing driver data
Currently, code exists in the pci layer to allow userspace to specify driver data when adding a pci dynamic id from sysfs. However, this data is never used and there exists no way in the existing code to use it. This patch allows device drivers to indicate that they want driver data passed to them on dynamic id adds by initializing use_driver_data in their pci_driver->pci_dynids struct. The documentation has also been updated to reflect this. Signed-off-by: Brian King <[EMAIL PROTECTED]> --- linux-2.6.11-rc3-bk4-bjking1/Documentation/pci.txt|8 linux-2.6.11-rc3-bk4-bjking1/drivers/pci/pci-driver.c |1 - 2 files changed, 4 insertions(+), 5 deletions(-) diff -puN drivers/pci/pci-driver.c~pci_dynids_driver_data drivers/pci/pci-driver.c --- linux-2.6.11-rc3-bk4/drivers/pci/pci-driver.c~pci_dynids_driver_data 2005-02-07 15:58:21.0 -0600 +++ linux-2.6.11-rc3-bk4-bjking1/drivers/pci/pci-driver.c 2005-02-07 15:58:21.0 -0600 @@ -115,7 +115,6 @@ static DRIVER_ATTR(new_id, S_IWUSR, NULL static inline void pci_init_dynids(struct pci_dynids *dynids) { - memset(dynids, 0, sizeof(*dynids)); spin_lock_init(&dynids->lock); INIT_LIST_HEAD(&dynids->list); } diff -puN Documentation/pci.txt~pci_dynids_driver_data Documentation/pci.txt --- linux-2.6.11-rc3-bk4/Documentation/pci.txt~pci_dynids_driver_data 2005-02-07 15:58:21.0 -0600 +++ linux-2.6.11-rc3-bk4-bjking1/Documentation/pci.txt 2005-02-07 15:58:21.0 -0600 @@ -99,10 +99,10 @@ where all fields are passed in as hexade Users need pass only as many fields as necessary; vendor, device, subvendor, and subdevice fields default to PCI_ANY_ID (), class and classmask fields default to 0, and driver_data defaults to -0UL. Device drivers must call - pci_dynids_set_use_driver_data(pci_driver *, 1) -in order for the driver_data field to get passed to the driver. -Otherwise, only a 0 is passed in that field. +0UL. 
Device drivers must initialize use_driver_data in the dynids struct +in their pci_driver struct prior to calling pci_register_driver in order +for the driver_data field to get passed to the driver. Otherwise, only a +0 is passed in that field. When the driver exits, it just calls pci_unregister_driver() and the PCI layer automatically calls the remove hook for all devices handled by the driver. _ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Re: msdos/vfat defaults are annoying
Michelle Konzack wrote: > On 2005-02-07 09:47:09, Pozsár Balázs wrote: > > See? I _have_ that patch applied, that's why it tried vfat and not msdos > > first. > > With this, you will never mount a Filesystem "msdos". > > Because "vfat" IS "msdos" + "lfn". > > You can attach to ALL "msdos" media "lfn" and you will have "vfat". So msdos is vfat WITHOUT lfn, which is a restriction like noatime or mounting ext3 as ext2. That's why the default should be vfat indeed and the restriction should be "nolfn", which will not allow lfns to be created and is what you actually intend, right? But this will break API today, so it should be added to list of features that will change. Regards Ingo Oeser
Re: [PATCH] Filesystem linking protections
On Mon, 07 Feb 2005 20:34:33 +0100, Lorenzo =?ISO-8859-1?Q?Hern=E1ndez_?= =?ISO-8859-1?Q?Garc=EDa-Hierro?= said: > But It's better to give users a "secure-by-default" status, at least on > those parts that don't affect negatively the stability or the > performance itself. It's still policy, and should be put someplace where users can manage it. You're changing the behavior from what POSIX specifies, and that's in general a no-no for mainline kernel code. Like an LSM, which happens to be there so users can impose policy without making any code changes to the kernel. Implementing a policy that results in non-POSIXy behavior in an LSM is perfectly OK.. ;) > The LSM hook call is before the check, so, LSM framework still has the > control over it, until it releases the operation giving control back to > the standard function. Right.. Which means LSM can stop that particular attack even faster than your patch.. ;) > If users must rely on LSM or other external solutions for applying basic > security checks (as the framework itself only provides the way to apply > them, the checks need to be implemented in a module), then we are making > them unable to be protected using the "default" configuration. You're making the very rash assumption that a hard-coded one-size-fits all "default" that behaves differently than POSIX is suitable for all sites, including sites that run software that gets broken by this change, and things like embedded systems where it's not a concern at all, and sites that already implement some *other* system to ensure that it's not an issue (for instance, by using an SELinux policy...) pgpan5ep3gfVq.pgp Description: PGP signature
Re: ioremap() and port of linux to MPC7400 based SBC (VME board)
him wrote: > I have run into a problem I am having a hard time figuring out. > > I have an MPC7400 SBC (PCI bus based) that has a device X residing > at the following locations in memory: > > 0x1860 - 0x186f device control register space > 0xb000 - 0xbfff device memory space > > Now assume for a moment that NOTHING special needs to be done to > access either space once the system has booted and bus enumerator > have set things up. > > ioremap() of the first physical address returns a VALID virtual > address ... that I can read and write to. It works as expected > because there are signature values at various offsets in the control > register space. > The virtual address returned is EQUAL to the physical address > > ioremap() of the second physical address also returns what appears to > be a VALID virtual address although WRITES go nowhere and READS return > all ff's. > The virtual address returned is 0xc100 > > > > Now my question ... I have the source for the port. Where should I focus > my efforts in trying to figure this out? > > I have read the device drivers book and certain that I am following > the rules. > > I should also mention that there is an IO controller separate from the > MPC7400 that I use to verify that the device X control and memory exist > in THAT physical range. > > If Only I can access them through ioremap() > > Thanks No idea up to now, but what kernel, what linux? is it VM linux or uClinux? Heinz -- with best regards / mit freundlichen Grüßen Heinz-Jürgen Oertel | Heinz-Jürgen Oertel port GmbH http://www.port.de | mailto:[EMAIL PROTECTED] | phone +49 345 77755-0 fax +49 345 77755-20 | Regensburger Str. 7b, D-06132 Halle/Saale, Germany | CAN Wiki http://www.CAN-Wiki.info | Newsletter: http://www.port.de/engl/company/content/abo_form.html
Re: question on symbol exports
On Mon, 2005-02-07 at 08:44 -0600, Chris Friesen wrote:
> Benjamin Herrenschmidt wrote:
> >>It turns out that to call ptep_clear_flush_dirty() on ppc64 from a
> >>module I needed to export the following symbols:
> >>
> >>__flush_tlb_pending
> >>ppc64_tlb_batch
> >>hpte_update
> >
> >
> > Any reason why you need to call that from a module ? Is the module
> > GPL'd ?
>
> I explained this at the beginning of the thread, but I'll do so again.
> The module will be released under the GPL.
>
> The basic idea is that we want to be able to track pages dirtied by a
> userspace process. The system has no swap, so we use the dirty bit for
> this. On demand we look up the page tables for an address range
> specified by the caller, store the addresses of any dirty pages, then
> mark them clean so that the next write causes them to get marked dirty
> again. It is this act of marking them clean that requires the
> additional exports.
>
> I've included the current code below. If there is any way to accomplish
> this without the additional exports, I'd love to hear about it.

Interesting... beyond having no swap, you must also make sure you have no r/w mmap'ed files (which are technically equivalent to swap).

I'm not a big fan of exporting those symbols, but I'll talk to paulus; it should at least be possible to EXPORT_SYMBOL_GPL them...

Ben.
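The tracking scheme Chris describes (walk a range, record the dirty pages, mark them clean again) can be modelled in plain userspace C. This is only a toy sketch: a bool array stands in for the per-page dirty bit, `collect_and_clean` and `NPAGES` are hypothetical names, and nothing here touches real page tables or the ppc64 symbols discussed in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8   /* hypothetical size of the tracked range, in pages */

/*
 * Toy model: dirty[i] stands in for the hardware dirty bit of page i.
 * Records the index of every dirty page in [first, last] into out[],
 * clears the bit so that a later write would set it again, and
 * returns how many pages were recorded.
 */
static size_t collect_and_clean(bool *dirty, size_t first, size_t last,
                                size_t *out, size_t out_cap)
{
    size_t n = 0;

    for (size_t i = first; i <= last && n < out_cap; i++) {
        if (dirty[i]) {
            out[n++] = i;
            dirty[i] = false;
        }
    }
    return n;
}
```

A second scan straight after the first finds nothing, which is the property the scheme relies on: only pages written in between show up again.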
Re: [RFC] Changing COW detection to be memory hotplug friendly
On Thu, 3 Feb 2005, IWAMOTO Toshihiro wrote:
> The current implementation of memory hotremoval relies on that pages
> can be unmapped from process spaces. After successful unmapping,
> subsequent accesses to the pages are blocked and don't interfere
> the hotremoval operation.
>
> However, this code
>
>	if (PageSwapCache(page) &&
>	    page_count(page) != page_mapcount(page) + 2) {
>		ret = SWAP_FAIL;
>		goto out_unmap;
>	}

Yes, that is odd code. It would be nice to have a solution without it.

> in try_to_unmap_one() prevents unmapping pages that are referenced via
> get_user_pages(), and such references can be held for a long time if
> they are due to such as direct IO.
> I've made a test program that issues multiple direct IO read requests
> against a single read buffer, and pages that belong to the buffer
> cannot be hotremoved because they aren't unmapped.

I haven't looked at the rest of your hotremoval, so it's not obvious to me how a change here would help you - obviously you wouldn't want to be migrating pages while direct IO to them was in progress.

I presume your patch works for you by letting the page count fall to a point where migration moves it automatically as soon as the got_user_pages are put, where without your patch the count is held too high, and you keep doing scans which tend to miss the window in which those pages are put?

> The following patch, which is against linux-2.6.11-rc1-mm1 and also
> tested witch linux-2.6.11-rc2-mm2, fixes this issue. The purpose of
> this patch is to be able to unmap pages that have incremented
> page_count. To do that consistently, the COW detection logic needs to
> be modified to not to rely on page_count. I'm aware that such
> extensive use of page_mapcount is discouraged and there is a plan to
> kill page_mapcount (*), but I cannot think of a better alternative
> solution.
>
> (*) c.f. http://www.ussg.iu.edu/hypermail/linux/kernel/0406.0/0483.html

I apologize for scaring you off page mapcount. I have no current plans to scrap it, and feel a lot more satisfied with it than at the time of that comment. Partly because it's now manipulated atomically rather than under bitspin lock. Partly because I realize that although 64-bit systems are overdue for an atomic64 page count and page mapcount, we can actually just use one atomic64 for them both, keeping, say, lower 24 bits for count and upper 40 for mapcount (and not repeating mapcount in count on these arches), so mapcount won't increase struct page size. Go right ahead and use page mapcount if it's appropriate.

> Some notes about my code:
>
>   - I think it's safe to rely on page_mapcount in do_swap_page(),
>     because its use is protected by lock_page().

I think so too.

>   - The can_share_swap_page() call in do_swap_page() always returns
>     false. It is inefficient but should be harmless. Incrementing
>     page_mapcount before calling that function should fix the problem,
>     but it may cause bad side effects.

Odd that your patch moves it if it now doesn't even work! But I think some more movement should be able to solve that.

>   - Another obvious solution to this issue is to find the "offending"
>     process from a un-unmappable page and suspend it until the page is
>     unmapped. I'm afraid the implementation would be much more complicated.

Agreed, let's not get into that.

>   - I could not test the following situation. It should be possible
>     to write some kernel code to do that, but please let me know if
>     you know any such test cases.
>       - A page_count is incremented by get_user_pages().
>       - The page gets unmapped.
>       - The process causes a write fault for the page, before the
>         incremented page_count is dropped.

I confess I don't have such a test case ready myself.

> Also, while I've tried carefully not to make mistakes and done some
> testing, I'm not very sure this is bug free. Please comment.
>
> --- mm/memory.c.orig	2005-01-17 14:47:11.0 +0900
> +++ mm/memory.c	2005-01-17 14:55:51.0 +0900
> @@ -1786,10 +1786,6 @@ static int do_swap_page(struct mm_struct
>  	}
>  
>  	/* The page isn't present yet, go ahead with the fault. */
> -
> -	swap_free(entry);
> -	if (vm_swap_full())
> -		remove_exclusive_swap_page(page);
>  
>  	mm->rss++;
>  	acct_update_integrals();
> @@ -1800,6 +1796,10 @@ static int do_swap_page(struct mm_struct
>  		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
>  		write_access = 0;
>  	}
> +
> +	swap_free(entry);
> +	if (vm_swap_full())
> +		remove_exclusive_swap_page(page);
>  	unlock_page(page);
>  
>  	flush_icache_page(vma, page);
> --- mm/rmap.c.orig	2005-01-17 14:40:08.0 +0900
> +++ mm/rmap.c	2005-01-21 12:34:06.0 +0900
> @@ -569,8 +569,11 @@ static int try_to_unmap_one(struct page
>  	 */
>
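Hugh's aside in the message above, about keeping page count and page mapcount in one 64-bit word (lower 24 bits for count, upper 40 for mapcount), can be sketched in userspace C. This is an illustration only: the helper names are hypothetical, and it uses plain arithmetic on a `uint64_t`, whereas real kernel code would need atomic operations and overflow handling.

```c
#include <stdint.h>

#define COUNT_BITS 24   /* lower 24 bits hold the page count */
#define COUNT_MASK ((UINT64_C(1) << COUNT_BITS) - 1)

/* Pack a 24-bit count and a 40-bit mapcount into one 64-bit word. */
static uint64_t pack_counts(uint64_t count, uint64_t mapcount)
{
    return (mapcount << COUNT_BITS) | (count & COUNT_MASK);
}

/* Extract the lower (count) field. */
static uint64_t unpack_count(uint64_t word)
{
    return word & COUNT_MASK;
}

/* Extract the upper (mapcount) field. */
static uint64_t unpack_mapcount(uint64_t word)
{
    return word >> COUNT_BITS;
}

/* Adding a mapping bumps only the upper field; count is untouched. */
static uint64_t add_mapping(uint64_t word)
{
    return word + (UINT64_C(1) << COUNT_BITS);
}
```

Because the two fields live in one word, a single 64-bit add can adjust either one, which is what makes the single-atomic64 idea attractive.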
[Patch] only unmap what intersects a direct_IO op
Now that we're only invalidating the pages that intersected a direct IO write we might as well only unmap the intersecting bytes as well. This passed a light fsx load with page cache, direct, and mmap IO.

Signed-off-by: Zach Brown <[EMAIL PROTECTED]>
---

 filemap.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

Index: 2.6-bk-odirinv/mm/filemap.c
===================================================================
--- 2.6-bk-odirinv.orig/mm/filemap.c	2005-02-07 12:42:50.0 -0800
+++ 2.6-bk-odirinv/mm/filemap.c	2005-02-07 12:43:16.244253441 -0800
@@ -2285,22 +2285,26 @@
 	struct file *file = iocb->ki_filp;
 	struct address_space *mapping = file->f_mapping;
 	ssize_t retval;
+	size_t write_len = 0;
 
 	/*
 	 * If it's a write, unmap all mmappings of the file up-front. This
 	 * will cause any pte dirty bits to be propagated into the pageframes
 	 * for the subsequent filemap_write_and_wait().
 	 */
-	if (rw == WRITE && mapping_mapped(mapping))
-		unmap_mapping_range(mapping, 0, -1, 0);
+	if (rw == WRITE) {
+		write_len = iov_length(iov, nr_segs);
+		if (mapping_mapped(mapping))
+			unmap_mapping_range(mapping, offset, write_len, 0);
+	}
 
 	retval = filemap_write_and_wait(mapping);
 	if (retval == 0) {
 		retval = mapping->a_ops->direct_IO(rw, iocb, iov,
 						offset, nr_segs);
 		if (rw == WRITE && mapping->nrpages) {
-			pgoff_t end = (offset + iov_length(iov, nr_segs) - 1)
-				      >> PAGE_CACHE_SHIFT;
+			pgoff_t end = (offset + write_len - 1)
+						>> PAGE_CACHE_SHIFT;
 			int err = invalidate_inode_pages2_range(mapping,
 					offset >> PAGE_CACHE_SHIFT, end);
 			if (err)
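The page-index arithmetic the patch above relies on (first page is offset >> shift, last page is (offset + len - 1) >> shift) can be checked in a small userspace sketch. The 4 KiB page size and the names `PAGE_SHIFT_ASSUMED`, `first_page` and `last_page` are assumptions made for illustration; they are not taken from the kernel source.

```c
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 12   /* assumption: 4 KiB pages */

/* Index of the first page touched by a write starting at byte `offset`. */
static uint64_t first_page(uint64_t offset)
{
    return offset >> PAGE_SHIFT_ASSUMED;
}

/*
 * Index of the last page touched by `len` bytes starting at `offset`.
 * The caller must ensure len > 0; the "- 1" makes the end inclusive,
 * so a write that stops exactly on a page boundary does not spill
 * into the following page.
 */
static uint64_t last_page(uint64_t offset, uint64_t len)
{
    return (offset + len - 1) >> PAGE_SHIFT_ASSUMED;
}
```

For example, a 2-byte write at offset 4095 straddles pages 0 and 1, while a full 4096-byte write at offset 0 stays within page 0.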
Re: M7101
On Sun, Feb 06, 2005 at 03:26:15PM +0100, Jean Delvare wrote:
> Hi Enrico,
>
> Sorry for the delay.
>
> > I have a board with the ALI M7101 chip, but I can't activate it in
> > BIOS. I tried to compile the prog/hotplug/m7101.c but I seen that
> > this is only for 2.4 Kernels. Is there a module for 2.6?
>
> The prog/hotplug/m7101.c (from the lm_sensors project) was a quick hack
> and only works with 2.4 kernels, as you noticed. For 2.6 kernels, the
> prefered solution is known as PCI quirks (drivers/pci/quirks.c). I can
> see that you already found that and proposed a patch for the 2.6 kernel
> here:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110606482902883
>
> Maarten Deprez then converted it to the proper kernel coding-style:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110726276414532
>
> I invite you to test the new patch and confirm that it works for you.
>
> Any chance we could get the PCI folks to review the code and push it
> upwards if it is OK?

I need it resent with the fixes, and a "Signed-off-by:" line to do that :)

Also, a pci_get_* is called without a matching put.

thanks,

greg k-h
[Patch] write and wait on range before direct io read
This adds filemap_write_and_wait_range(mapping, lstart, lend) which starts writeback and waits on a range of pages. We call this from __blkdev_direct_IO with just the range that is going to be read by the direct_IO read. It was lightly tested with fsx and ext3 and passed.

Signed-off-by: Zach Brown <[EMAIL PROTECTED]>
---

 fs/direct-io.c     |    7 +++++--
 include/linux/fs.h |    2 ++
 mm/filemap.c       |   16 ++++++++++++++++
 3 files changed, 23 insertions(+), 2 deletions(-)

Index: 2.6-bk-odirinv/fs/direct-io.c
===================================================================
--- 2.6-bk-odirinv.orig/fs/direct-io.c	2005-02-07 11:19:40.0 -0800
+++ 2.6-bk-odirinv/fs/direct-io.c	2005-02-07 12:43:09.572259133 -0800
@@ -1206,7 +1206,8 @@
 	 */
 	dio->lock_type = dio_lock_type;
 	if (dio_lock_type != DIO_NO_LOCKING) {
-		if (rw == READ) {
+		/* watch out for a 0 len io from a tricksy fs */
+		if (rw == READ && end > offset) {
 			struct address_space *mapping;
 
 			mapping = iocb->ki_filp->f_mapping;
@@ -1214,7 +1215,9 @@
 				down(&inode->i_sem);
 				reader_with_isem = 1;
 			}
-			retval = filemap_write_and_wait(mapping);
+
+			retval = filemap_write_and_wait_range(mapping, offset,
+							      end - 1);
 			if (retval) {
 				kfree(dio);
 				goto out;
Index: 2.6-bk-odirinv/include/linux/fs.h
===================================================================
--- 2.6-bk-odirinv.orig/include/linux/fs.h	2005-02-07 11:26:23.0 -0800
+++ 2.6-bk-odirinv/include/linux/fs.h	2005-02-07 12:26:43.030749241 -0800
@@ -1359,6 +1359,8 @@
 extern int filemap_flush(struct address_space *);
 extern int filemap_fdatawait(struct address_space *);
 extern int filemap_write_and_wait(struct address_space *mapping);
+extern int filemap_write_and_wait_range(struct address_space *mapping,
+				        loff_t lstart, loff_t lend);
 extern void sync_supers(void);
 extern void sync_filesystems(int wait);
 extern void emergency_sync(void);
Index: 2.6-bk-odirinv/mm/filemap.c
===================================================================
--- 2.6-bk-odirinv.orig/mm/filemap.c	2005-02-07 11:26:23.0 -0800
+++ 2.6-bk-odirinv/mm/filemap.c	2005-02-07 13:00:29.723440763 -0800
@@ -336,6 +336,22 @@
 	return retval;
 }
 
+int filemap_write_and_wait_range(struct address_space *mapping,
+				 loff_t lstart, loff_t lend)
+{
+	int retval = 0;
+
+	if (mapping->nrpages) {
+		retval = __filemap_fdatawrite_range(mapping, lstart, lend,
+						    WB_SYNC_ALL);
+		if (retval == 0)
+			retval = wait_on_page_writeback_range(mapping,
+						lstart >> PAGE_CACHE_SHIFT,
+						lend >> PAGE_CACHE_SHIFT);
+	}
+	return retval;
+}
+
 /*
  * This function is used to add newly allocated pagecache pages:
  * the page is new, so we can just run SetPageLocked() against it.
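The "0 len io" guard in the patch above matters because the new helper takes an inclusive end (`end - 1`), which has no sensible value for an empty range. A minimal userspace sketch of that conversion follows; `inclusive_range` is a hypothetical name, and unsigned 64-bit integers stand in for the kernel's signed loff_t, so treat it as a model of the idea, not of the kernel types.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert a half-open byte range [offset, end) into an inclusive
 * range [*lstart, *lend]. Returns false for an empty range, where
 * end - 1 would not describe any byte (and, in this unsigned model,
 * would wrap around when end == 0).
 */
static bool inclusive_range(uint64_t offset, uint64_t end,
                            uint64_t *lstart, uint64_t *lend)
{
    if (end <= offset)
        return false;
    *lstart = offset;
    *lend = end - 1;
    return true;
}
```

Checking `end > offset` before computing `end - 1` mirrors the order of operations in the patched read path.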
Re: [Patch] invalidate range of pages after direct IO write
>> But this won't happen if next
>>started as 0 and we didn't update it. I don't know if retrying is the
>>intended behaviour or if we care that the start == 0 case doesn't do it.
>
>
> Good point. Let's make it explicit?

Looks great. I briefly had visions of some bitfield to pack the three boolean ints we have and then quickly came to my senses. :)

I threw together those other two patches that work with ranges around direct IO (unmapping before r/w, and writing and waiting before reads). rc3-mm1 is angry with my test machine so they're actually against current -bk with this first invalidation patch applied. I hope that doesn't make life harder than it needs to be. I'll send them under separate cover.

- z
Re: Memory leak in 2.6.11-rc1?
Jan Kasprzak wrote:
: I think I have been running 2.6.10-rc3 before. I've copied
: the fs/bio.c from 2.6.10-rc3 to my 2.6.11-rc2 sources and booted the
: resulting kernel. I hope it will not eat my filesystems :-) I will send
: my /proc/slabinfo in a few days.

Hmm, after 3h35min of uptime I have

biovec-1   92157  92250   16  225  1 : tunables  120  60  8 : slabdata   410   410   60
bio        92163  92163  128   31  1 : tunables  120  60  8 : slabdata  2973  2973   60

so it is probably still leaking - about half an hour ago it was

biovec-1   77685  77850   16  225  1 : tunables  120  60  8 : slabdata   346   346    0
bio        77841  77841  128   31  1 : tunables  120  60  8 : slabdata  2511  2511  180

-Yenya

-- 
| Jan "Yenya" Kasprzak |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
> Whatever the Java applications and desktop dances may lead to, Unix will <
> still be pushing the packets around for a quite a while. --Rob Pike <
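A rough growth rate can be read off the two slabinfo samples in the message above. The sketch below assumes the samples are exactly 30 minutes apart (the message only says "about half an hour"); `growth` and `per_minute` are hypothetical helper names.

```c
#include <stdint.h>

/* Objects gained between two samples of a slab's active-object count. */
static int64_t growth(int64_t before, int64_t after)
{
    return after - before;
}

/* Average gain per minute, truncated toward zero; minutes must be > 0. */
static int64_t per_minute(int64_t before, int64_t after, int64_t minutes)
{
    return growth(before, after) / minutes;
}
```

With the figures from the message, biovec-1 grew by 92157 - 77685 = 14472 objects and bio by 92163 - 77841 = 14322, roughly 480 objects per minute each under the 30-minute assumption. The two caches growing at nearly the same pace would be consistent with the pair leaking together, though the message itself does not establish that.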
Re: Sabotaged PaXtest (was: Re: Patch 4/6 randomize the stack pointer)
* [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> > still wrong. What you get this way is a nice, complicated NOP.
>
> not only a nop but also a likely crash given that i didn't adjust
> the declaration of some_function appropriately ;-). let's cater
> for less complexity too with the following payload (of the 'many
> other ways' kind):
>
> [field1 and other locals replaced with shellcode]
> [space to cover the locals of __libc_dlopen_mode()]

yes, i agree with you, __libc_dlopen_mode() is an easier target (but not _that_ easy of a target, see further down), and your code looks right - but what this discussion was about was the _dl_make_stack_executable() function. Similar 'protection' techniques can be used for __libc_dlopen_mode() too, and it's being fixed.

(you'd be correct to point out that what cannot be 'fixed' even this way are libdl.so using applications and the dlopen() symbol - for them, if randomization is not enough, PaX or SELinux is the fix.)

> one disadvantage of this approach is that now not only the randomness
> in libc.so has to be found but also that of the stack (repeating parts
> of the payload would help reduce it though), and if user_input itself
> is on the heap (and there're no copies on the stack), we'll need that
> randomness too.

such an attack needs to get 2 or 3 random values right - which, considering 13-bits randomization per value is still 26-39 bits (minus the constant number of bits you can get away via replication). If the stack wasn't nonexec then the attack would need to get only 1 random value right. In that sense it still makes quite a difference in increasing the complexity of the attack, do you agree?

Yes, the drastic method is to disable the adding of code to a process image altogether (PaX did this first, and does a nice job in that, and SELinux is catching up as well), but that clearly was not a product option when PT_GNU_STACK was written. As you can see on lkml, people are resisting changes hard that affect 2-3 apps. What chances do changes have that break dozens of common applications? PT_GNU_STACK is not perfect, but it was the maximum we could get away on the non-selinux side of the distribution, mapping many of the dependencies and assumptions of apps.

So PT_GNU_STACK is certainly a beginning, and as the end result (hopefully soon) we can do away with libraries having any RWE PT_GNU_STACK markings (so that only binaries can carry RWE) and can move make_stacks_executable() from libc.so. You seem to consider these steps of how Fedora 'morphs' into a productized version of SELinux as 'fully vulnerable' (and despise it), but there's no way around walking that walk and persuading users to actually follow - which is the hardest part.

Ingo
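The "26-39 bits" figure in Ingo's message is simply the number of independently randomized values times 13 bits each, less whatever payload replication saves. A small sketch of that arithmetic follows; the 13-bit figure is the one quoted in the message, while `total_bits`, `search_space` and the `saved_bits` parameter are hypothetical names introduced here for illustration.

```c
#include <stdint.h>

#define BITS_PER_VALUE 13   /* per-value randomization quoted in the message */

/*
 * Bits an attacker must guess when `values` independently randomized
 * addresses are needed, minus `saved_bits` recovered through tricks
 * such as repeating parts of the payload. Clamped at zero.
 */
static unsigned total_bits(unsigned values, unsigned saved_bits)
{
    unsigned bits = values * BITS_PER_VALUE;

    return saved_bits >= bits ? 0 : bits - saved_bits;
}

/* Size of the search space, 2^bits; bits must be below 64. */
static uint64_t search_space(unsigned bits)
{
    return UINT64_C(1) << bits;
}
```

One value gives 2^13 = 8192 candidates, two give 2^26 (about 67 million) and three give 2^39, which is the jump in attack complexity the message argues for.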
Re: 2.6.11-rc2-mm1
On Mon, 2005-02-07 at 12:30 -0500, Robert Love wrote:
> Well, I don't share the hatred for ioctl, at least compared to another
> type unsafe interface like write().
>
> But John and I are open to doing whatever is the consensus. If there is
> an agreed alternative, and that is the requirement for merging, I'll do
> it.

Yes, if ioctl is unacceptable, then providing a write() interface is what we will do.

>
> I'd like to keep the user-space interface and simple, and absolutely
> want to keep the single file descriptor approach. How the fd is
> obtained is up for discussion.

I would still like to keep the character device as the interface for getting the fd. I don't see what benefit could be gained by converting to a syscall based interface for getting the fd.

-- 
John McCutchan <[EMAIL PROTECTED]>
Re: [RFC] Reliable video POSTing on resume (was: Re: [ACPI] Samsung P35, S3, black screen (radeon))
Hi!

> > > > We already try to do that, but it hangs on 70% of machines. See
> > > > Documentation/power/video.txt.
> > >
> > > We know that all of these ROMs are run at power on so they have to
> > > work. This implies that there must be something wrong with the
> > > environment the ROM are being run in. Video ROMs make calls into the
> > > INT vectors of the system BIOS. If these haven't been set up yet
> > > running the VBIOS is sure to hang. Has someone with ROM source and
> > > the appropriate debugging tools tried to debug one of these hangs?
> > > Alternatively code could be added to wakeup.S to try and set these up
> > > or dump the ones that are there and see if they are sane.
> >
> > Rumors say that notebooks no longer have video bios at C000h:0; rumors
> > say that video BIOS on notebooks is simply integrated into main system
> > BIOS. I personaly do not know if rumors are true, but PCs are ugly
> > machines
>
> The state of current hardware has already been mentioned but let
> me clarify. This is not a laptop problem anytime you have onboard
> video you are unlikely to have a separate video ROM. This includes
> many recent server boards as well as laptops. When the board boots
> up there will be a video option ROM shadowed into the usually location
> at C000h:0 but what becomes of it afterwards is a good question.
>
> For server boards most commonly this seems to be a flavor of the ATI
> Rage XL chip. It is a low end part that I doubt getting documentation
> for will be very hard. And according to
> Documentation/power/video.txt this is one of the cases that actually
> works.

I do not see Rage XL mentioned in video.txt; can you give me details and/or suggest a patch?

> What is happening in those POST routines of a video card is typically
> the code to initialize the memory controller on the video card. Plus
> a little bit of code to set the video mode. If I read the
> documentation correctly in a S3 power state only the RAM is preserved.
> So it does look like the video post is needed.

On some machines, video state is preserved over S3... Some BIOSes are good enough to POST video for you...

Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!