date:20050207

On Fri, 04 Feb 2005 11:03:47 +0100, Ingo Molnar said:
> 
> i have released the -V0.7.38-01 Real-Time Preemption patch, which can be
> downloaded from the usual place:

Hey Ingo.. Sorry to keep breaking stuff on you, but.. ;)

Summary: Looks like CONFIG_NET_PKTGEN=y gives -V0.7.38-03 indigestion.

I retrofitted 0.7.38-03 onto -rc3-mm1, and at boot it wedged up hard scrolling
an error message.  Looked like a 'scheduling while atomic' error coming from
net/pktgen.o.   Sorry for the incomplete traceback, but it locked before
userspace came up, and I don't have hardware handy for a serial console..

I found CONFIG_NET_PKTGEN=y in the config, rebuilt with =n, and the resulting
kernel boots fine (am using it as I type). Vanilla -rc3-mm1 also boots fine
with the PKTGEN=y setting (as did 2.6.10-mm1-V0.7.34-01, the last -mm I built
with a -RT patch).  I haven't tried a vanilla -rc3-V0.7.38-03, but I don't see
anyplace -mm1 hits pktgen.c.

If the above isn't enough to track down the issue, feel free to let me know
what you'd like me to try next.



Re: [PATCH 2.4.29-bk8] arch/i386/kernel/pci-irq.c: Wrong message output

Mark F. Haigh wrote:
Apologies.  Patch now -p1-able.
[Apologies yet again, now includes description]
I'd submitted a patch earlier for this file, fixing a warning.  When I 
looked at it further, I noticed it can output an incorrect warning 
message under certain circumstances.  I've confirmed that this can and 
does happen in the wild:

PCI: Enabling device :00:0a.0 ( -> 0001)
PCI: No IRQ known for interrupt pin @ of device :00:0a.0. Probably 
buggy MP table.

It should read "No IRQ known for interrupt pin A", but the 'pin' 
variable has already been decremented (from 1 to 0), so the line:

printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device 
%s.%s\n", 'A' + pin - 1, dev->slot_name, msg);

causes "pin @" to be output, because 'A' + 0 - 1 == '@'.
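The arithmetic behind the stray '@' can be reproduced in user space. The sketch below is not kernel code; the helper names are made up for illustration, and it only assumes ASCII, where '@' is the character just before 'A'.

```c
#include <assert.h>

/* Label for a 1-based interrupt pin value (1..4 -> 'A'..'D'). */
static char label_from_one_based(int pin)
{
    return (char)('A' + pin - 1);
}

/* Label for a 0-based pin index (0..3 -> 'A'..'D'). */
static char label_from_zero_based(int pin)
{
    assert(pin >= 0 && pin <= 3);
    return (char)('A' + pin);
}

/* Models the reported bug: decrement first, then format as if still 1-based. */
static char label_as_reported(int raw_pin)
{
    int pin = raw_pin;

    pin--;                            /* pin is now 0-based */
    return label_from_one_based(pin); /* subtracts 1 a second time */
}
```

With a raw pin value of 1, label_as_reported() yields '@', while either helper used consistently with its own numbering yields 'A'.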
This patch also fixes the original warning:
pci-irq.c: In function `pcibios_enable_irq':
pci-irq.c:1128: warning: 'msg' might be used uninitialized in this function
Thanks,
Mark Haigh
[EMAIL PROTECTED]
Signed-off-by:  Mark F. Haigh  <[EMAIL PROTECTED]>
--- linux-2.4.29-bk8/arch/i386/kernel/pci-irq.c.orig  2005-02-07 19:55:23.0 -0800
+++ linux-2.4.29-bk8/arch/i386/kernel/pci-irq.c  2005-02-07 20:13:38.0 -0800
@@ -1127,6 +1127,8 @@
if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) {
char *msg;
 
+   pin--;  /* interrupt pins are numbered starting from 1 
*/
+
/* With IDE legacy devices the IRQ lookup failure is not a 
problem.. */
if (dev->class >> 8 == PCI_CLASS_STORAGE_IDE && !(dev->class & 
0x5))
return;
@@ -1134,42 +1136,39 @@
if (io_apic_assign_pci_irqs) {
int irq;
 
-   if (pin) {
-   pin--;  /* interrupt pins are numbered 
starting from 1 */
-   irq = 
IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin);
-   /*
-* Busses behind bridges are typically not 
listed in the MP-table.
-* In this case we have to look up the IRQ 
based on the parent bus,
-* parent slot, and pin number. The SMP code 
detects such bridged
-* busses itself so we should get into this 
branch reliably.
-*/
-   temp_dev = dev;
-   while (irq < 0 && dev->bus->parent) { /* go 
back to the bridge */
-   struct pci_dev * bridge = 
dev->bus->self;
+   irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, 
PCI_SLOT(dev->devfn), pin);
+   /*
+* Busses behind bridges are typically not listed in 
the MP-table.
+* In this case we have to look up the IRQ based on the 
parent bus,
+* parent slot, and pin number. The SMP code detects 
such bridged
+* busses itself so we should get into this branch 
reliably.
+*/
+   temp_dev = dev;
+   while (irq < 0 && dev->bus->parent) { /* go back to the 
bridge */
+   struct pci_dev * bridge = dev->bus->self;
 
-   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
-   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
-   
PCI_SLOT(bridge->devfn), pin);
-   if (irq >= 0)
-   printk(KERN_WARNING "PCI: using 
PPB(B%d,I%d,P%d) to get irq %d\n", 
-   bridge->bus->number, 
PCI_SLOT(bridge->devfn), pin, irq);
-   dev = bridge;
-   }
-   dev = temp_dev;
-   if (irq >= 0) {
-   printk(KERN_INFO "PCI->APIC IRQ 
transform: (B%d,I%d,P%d) -> %d\n",
-   dev->bus->number, 
PCI_SLOT(dev->devfn), pin, irq);
-   dev->irq = irq;
-   return;
-   } else
-   msg = " Probably buggy MP table.";
+   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
+   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
+   PCI_SLOT(bridge->devfn), pin);
+   if (irq >= 0)
+   printk(KERN_WARNING "PCI: using 
PPB(B%d,I%d,P%d) to get irq %d\n", 
+

Marvell Yukon 2 PCI Express 88E8050 is not supported in the EXPERIMENTAL skge driver

2005-02-07 Thread maxer1

The help text for the skge driver in the kernel 2.6.11-rc3-mm1 configuration
states the following:

"*New SysKonnect GigaEthernet support (EXPERIMENTAL) (SKGE)
This driver support the Marvell Yukon or SysKonnect SK-98xx/SK-95xx
and related Gigabit Ethernet adapters. It is a new smaller driver
driver with better performance and more complete ethtool support.
It does not support the link failover and network management
features that "portable" vendor supplied sk98lin driver does.* "
My PCI Express motherboard has an on-board controller that lspci reports as
"04:00.0 Ethernet controller: Marvell Technology Group Ltd. Gigabit Ethernet
Controller (rev 17)". It is *not* supported by the skge driver because it is a
new-generation Marvell Yukon 2 part.

I have SysKonnect's sk98lin driver working for me under a custom built 
2.6.9 kernel using SysKonnect's driver version 7.09 patched in the kernel.

The motherboard I'm using is a new Intel D915GEV. The specs on the lan 
show as:

Gigabit (10/100/1000 Mbits/sec) LAN subsystem using the Marvell* Yukon* 
88E8050 PCI Express* Gigabit Ethernet Controller.

Don't try Stephen's skge driver with this controller; it isn't supported.
RaXeT
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2.4.29-bk8] Resend: sym53c8xx.c: Add ULL suffix to fix warning

Mark F. Haigh wrote:
Apologies.  Patch now -p1-able.
[Apologies yet again, description included now]
Noticed that in drivers/scsi/sym53c8xx.c:
sym53c8xx.c:13185: warning: integer constant is too large for "long" type
Since we're not dealing with C99 (yet), this 64 bit integer constant
needs to be suffixed with ULL.  Patch included.
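For readers unfamiliar with the warning: under pre-C99 rules an unsuffixed constant is typed before the (u64) cast is applied, so on a target with a 32-bit long a wider constant does not fit and gcc complains. A minimal user-space sketch follows. The type name and the 40-bit mask are assumptions chosen for illustration; they are not the literal from the driver.

```c
#include <assert.h>

/* Stand-in for the kernel's u64 type; the name is hypothetical. */
typedef unsigned long long u64_sketch;

/*
 * Hypothetical 40-bit mask, used only to illustrate the suffix.
 * The ULL suffix gives the constant a 64-bit type up front, so no
 * cast is needed and no "too large for long" warning is issued.
 */
#define SKETCH_MASK_40BIT 0xffffffffffULL

static u64_sketch sketch_mask_40bit(void)
{
    return SKETCH_MASK_40BIT;
}
```

The suffix matters more than the cast here, because the cast only converts a constant whose type has already been chosen.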
Mark F. Haigh
[EMAIL PROTECTED]
Signed-off-by: Mark F. Haigh  <[EMAIL PROTECTED]>
--- linux-2.4.29-bk8/drivers/scsi/sym53c8xx.c.orig  2005-02-07 19:53:05.0 -0800
+++ linux-2.4.29-bk8/drivers/scsi/sym53c8xx.c  2005-02-07 19:53:36.0 -0800
@@ -13182,7 +13182,7 @@
** descriptors.
*/
if (chip && (chip->features & FE_DAC)) {
-   if (pci_set_dma_mask(pdev, (u64) 0xff))
+   if (pci_set_dma_mask(pdev, (u64) 0xffULL))
chip->features &= ~FE_DAC_IN_USE;
else
chip->features |= FE_DAC_IN_USE;

Re: [PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output

Greg KH wrote:

Oops, this time you forgot the whole description of the patch :(
Third time's the charm...
The following has been reported in the wild for kernel 2.6.8-24:
PCI: Enabling device :00:05.0 ( -> 0002)
PCI: No IRQ known for interrupt pin @ of device :00:05.0. Probably 
buggy MP table.

It should read "No IRQ known for interrupt pin A", but the 'pin' 
variable has already been decremented (from 1 to 0), so the line:

printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device 
%s.%s\n", 'A' + pin - 1, dev->slot_name, msg);

causes "pin @" to be output, because 'A' + 0 - 1 == '@'.
The supplied patch should fix it.  It also removes a redundant check for 
a nonzero pin.
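A small user-space model of the patched flow may help show why the inner check is redundant. The outer condition already requires a nonzero pin, so the value can be decremented exactly once and treated as zero-based from then on. The function name and the device string in the usage note are hypothetical, and the 'A' + pin form of the final message is an assumption, since the archived diff is cut off before that line.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical model: raw_pin is the value read from the PCI
 * interrupt pin register and is known to be nonzero by the caller.
 */
static int format_no_irq_warning(char *buf, size_t len, int raw_pin,
                                 const char *slot, const char *msg)
{
    int pin;

    assert(raw_pin >= 1 && raw_pin <= 4);
    pin = raw_pin - 1;  /* decremented once; zero-based from here on */

    return snprintf(buf, len,
                    "PCI: No IRQ known for interrupt pin %c of device %s.%s",
                    'A' + pin, slot, msg);
}
```

Called with a raw pin of 1 and a made-up slot string such as "00:05.0", the formatted text names pin A rather than pin @.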

Mark F. Haigh
[EMAIL PROTECTED]
Signed-off-by: Mark F. Haigh  <[EMAIL PROTECTED]>
--- linux-2.6.11-rc3-bk4/arch/i386/pci/irq.c.orig  2005-02-07 20:40:58.0 -0800
+++ linux-2.6.11-rc3-bk4/arch/i386/pci/irq.c  2005-02-07 21:39:15.091239272 -0800
@@ -1031,56 +1031,55 @@
 
pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) {
-   char *msg;
-   msg = "";
+   char *msg = "";
+
+   pin--;  /* interrupt pins are numbered starting from 1 
*/
+
if (io_apic_assign_pci_irqs) {
int irq;
 
-   if (pin) {
-   pin--;  /* interrupt pins are numbered 
starting from 1 */
-   irq = 
IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin);
-   /*
-* Busses behind bridges are typically not 
listed in the MP-table.
-* In this case we have to look up the IRQ 
based on the parent bus,
-* parent slot, and pin number. The SMP code 
detects such bridged
-* busses itself so we should get into this 
branch reliably.
-*/
-   temp_dev = dev;
-   while (irq < 0 && dev->bus->parent) { /* go 
back to the bridge */
-   struct pci_dev * bridge = 
dev->bus->self;
-
-   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
-   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
-   
PCI_SLOT(bridge->devfn), pin);
-   if (irq >= 0)
-   printk(KERN_WARNING "PCI: using 
PPB %s[%c] to get irq %d\n",
-   pci_name(bridge), 'A' + 
pin, irq);
-   dev = bridge;
-   }
-   dev = temp_dev;
-   if (irq >= 0) {
+   irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, 
PCI_SLOT(dev->devfn), pin);
+   /*
+* Busses behind bridges are typically not listed in 
the MP-table.
+* In this case we have to look up the IRQ based on the 
parent bus,
+* parent slot, and pin number. The SMP code detects 
such bridged
+* busses itself so we should get into this branch 
reliably.
+*/
+   temp_dev = dev;
+   while (irq < 0 && dev->bus->parent) { /* go back to the 
bridge */
+   struct pci_dev * bridge = dev->bus->self;
+
+   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
+   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
+   PCI_SLOT(bridge->devfn), pin);
+   if (irq >= 0)
+   printk(KERN_WARNING "PCI: using PPB 
%s[%c] to get irq %d\n",
+   pci_name(bridge), 'A' + pin, 
irq);
+   dev = bridge;
+   }
+   dev = temp_dev;
+   if (irq >= 0) {
 #ifdef CONFIG_PCI_MSI
-   if (!platform_legacy_irq(irq))
-   irq = IO_APIC_VECTOR(irq);
+   if (!platform_legacy_irq(irq))
+   irq = IO_APIC_VECTOR(irq);
 #endif
-   printk(KERN_INFO "PCI->APIC IRQ 
transform: %s[%c] -> IRQ %d\n",
-   pci_name(dev), 'A' + pin, irq);
-   dev->irq = irq;
-   return 0;
-

Re: [PATCH 2.4.29-bk8] arch/i386/kernel/pci-irq.c: Wrong message output

Mark F. Haigh wrote:
--- arch/i386/kernel/pci-irq.c.orig 2005-02-07 19:55:23.852531544 -0800
+++ arch/i386/kernel/pci-irq.c  2005-02-07 20:13:38.835068896 -0800
Apologies.  Patch now -p1-able.
Mark F. Haigh
[EMAIL PROTECTED]
Signed-off-by: Mark F. Haigh  <[EMAIL PROTECTED]>
--- linux-2.4.29-bk8/arch/i386/kernel/pci-irq.c.orig  2005-02-07 19:55:23.0 -0800
+++ linux-2.4.29-bk8/arch/i386/kernel/pci-irq.c  2005-02-07 20:13:38.0 -0800
@@ -1127,6 +1127,8 @@
 if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) {
 char *msg;
 
+   pin--;  /* interrupt pins are numbered starting from 1 */
+
 /* With IDE legacy devices the IRQ lookup failure is not a problem.. */
 if (dev->class >> 8 == PCI_CLASS_STORAGE_IDE && !(dev->class & 0x5))
 return;
@@ -1134,42 +1136,39 @@
 if (io_apic_assign_pci_irqs) {
 int irq;
 
-   if (pin) {
-   pin--;  /* interrupt pins are numbered starting from 1 */
-   irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin);
-   /*
-* Busses behind bridges are typically not listed in the MP-table.
-* In this case we have to look up the IRQ based on the parent bus,
-* parent slot, and pin number. The SMP code detects such bridged
-* busses itself so we should get into this branch reliably.
-*/
-   temp_dev = dev;
-   while (irq < 0 && dev->bus->parent) { /* go back to the bridge */
-   struct pci_dev * bridge = dev->bus->self;
+   irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin);
+   /*
+* Busses behind bridges are typically not listed in the MP-table.
+* In this case we have to look up the IRQ based on the parent bus,
+* parent slot, and pin number. The SMP code detects such bridged
+* busses itself so we should get into this branch reliably.
+*/
+   temp_dev = dev;
+   while (irq < 0 && dev->bus->parent) { /* go back to the bridge */
+   struct pci_dev * bridge = dev->bus->self;
 
-   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
-   irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
-   PCI_SLOT(bridge->devfn), pin);
-   if (irq >= 0)
-   printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get irq %d\n", 
-   bridge->bus->number, PCI_SLOT(bridge->devfn), pin, irq);
-   dev = bridge;
-   }
-   dev = temp_dev;
-   if (irq >= 0) {
-   printk(KERN_INFO "PCI->APIC IRQ transform: (B%d,I%d,P%d) -> %d\n",
-   dev->bus->number, PCI_SLOT(dev->devfn), pin, irq);
-   dev->irq = irq;
-   return;
-   } else
-   msg = " Probably buggy MP table.";
+   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
+   irq = IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
+   PCI_SLOT(bridge->devfn), pin);
+   if (irq >= 0)
+   printk(KERN_WARNING "PCI: using PPB(B%d,I%d,P%d) to get irq %d\n", 
+   bridge->bus->number, PCI_SLOT(bridge->devfn), pin, irq);
+   dev = bridge;
 }
+   dev = temp_dev;
+   if (irq >= 0) {
+   printk(KERN_INFO "PCI->APIC IRQ transform: (B%d,I%d,P%d) -> %d\n",
+   dev->bus->number, PCI_SLOT(dev->devfn), pin, irq);
+   dev->irq = irq;
+   return;
+   } else
+   msg = " Probably buggy MP table.";
 } else if (pci_probe & PCI_BIOS_IRQ_SCAN)
 msg = "";
 else
 msg = " Please try using pci=biosirq.";
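The bridge walk in the patch above rotates the pin by the device's slot number each time it steps up to the parent bus. A minimal user-space sketch of that one step follows. SKETCH_PCI_SLOT mirrors the usual definition of the kernel's PCI_SLOT() macro, and swizzle_pin is a name invented for this example.

```c
#include <assert.h>

/* Slot number is bits 7..3 of devfn; function number is bits 2..0. */
#define SKETCH_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)

/*
 * One step of the walk: rotate a zero-based pin (0..3 = INTA..INTD)
 * by the slot number of the device behind the bridge.
 */
static int swizzle_pin(int pin, unsigned int devfn)
{
    assert(pin >= 0 && pin <= 3);
    return (int)((pin + SKETCH_PCI_SLOT(devfn)) % 4);
}
```

For example, INTA (pin 0) on slot 5, function 0 (devfn 0x28) appears as pin 1, INTB, on the bridge. The rotation only works on a zero-based pin, which is why the decrement has to happen before the loop.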

Re: [PATCH 2.4.29-bk8] Resend: sym53c8xx.c: Add ULL suffix to fix warning

Mark F. Haigh wrote:
--- drivers/scsi/sym53c8xx.c.orig   2005-02-07 19:53:05.741527608 -0800
+++ drivers/scsi/sym53c8xx.c  2005-02-07 19:53:36.782808616 -0800
Apologies.  Patch now -p1-able.
Mark F. Haigh
[EMAIL PROTECTED]
Signed-off-by: Mark F. Haigh  <[EMAIL PROTECTED]>
--- linux-2.4.29-bk8/drivers/scsi/sym53c8xx.c.orig  2005-02-07 19:53:05.0 -0800
+++ linux-2.4.29-bk8/drivers/scsi/sym53c8xx.c  2005-02-07 19:53:36.0 -0800
@@ -13182,7 +13182,7 @@
** descriptors.
*/
if (chip && (chip->features & FE_DAC)) {
-   if (pci_set_dma_mask(pdev, (u64) 0xff))
+   if (pci_set_dma_mask(pdev, (u64) 0xffULL))
chip->features &= ~FE_DAC_IN_USE;
else
chip->features |= FE_DAC_IN_USE;

Re: [PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output

On Mon, Feb 07, 2005 at 09:42:02PM -0800, Mark F. Haigh wrote:
> Greg KH wrote:
> >On Mon, Feb 07, 2005 at 09:06:18PM -0800, Mark F. Haigh wrote:
> 
> > > --- arch/i386/pci/irq.c.orig  2005-02-07 20:40:58.140856536 -0800
> > > +++ arch/i386/pci/irq.c   2005-02-07 20:46:06.713946296 -0800
> >
> >Can you resend this so it can be applied with -p1 to patch, and a
> >Signed-off-by: line?
> >
> 
> Ack, my fault.
> 
> Mark F. Haigh
> [EMAIL PROTECTED]
> 
> 
> Signed-off-by: Mark F. Haigh  <[EMAIL PROTECTED]>

Oops, this time you forgot the whole description of the patch :(

Third time's the charm...

greg k-h

Re: [PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output

Greg KH wrote:
On Mon, Feb 07, 2005 at 09:06:18PM -0800, Mark F. Haigh wrote:

 > --- arch/i386/pci/irq.c.orig  2005-02-07 20:40:58.140856536 -0800
 > +++ arch/i386/pci/irq.c   2005-02-07 20:46:06.713946296 -0800
Can you resend this so it can be applied with -p1 to patch, and a
Signed-off-by: line?
Ack, my fault.
Mark F. Haigh
[EMAIL PROTECTED]
Signed-off-by: Mark F. Haigh  <[EMAIL PROTECTED]>
--- linux-2.6.11-rc3-bk4/arch/i386/pci/irq.c.orig  2005-02-07 20:40:58.0 -0800
+++ linux-2.6.11-rc3-bk4/arch/i386/pci/irq.c  2005-02-07 20:46:06.0 -0800
@@ -1031,56 +1031,55 @@
 
pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) {
-   char *msg;
-   msg = "";
+   char *msg = "";
+
+   pin--;  /* interrupt pins are numbered starting from 1 
*/
+
if (io_apic_assign_pci_irqs) {
int irq;
 
-   if (pin) {
-   pin--;  /* interrupt pins are numbered 
starting from 1 */
-   irq = 
IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin);
-   /*
-* Busses behind bridges are typically not 
listed in the MP-table.
-* In this case we have to look up the IRQ 
based on the parent bus,
-* parent slot, and pin number. The SMP code 
detects such bridged
-* busses itself so we should get into this 
branch reliably.
-*/
-   temp_dev = dev;
-   while (irq < 0 && dev->bus->parent) { /* go 
back to the bridge */
-   struct pci_dev * bridge = 
dev->bus->self;
-
-   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
-   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
-   
PCI_SLOT(bridge->devfn), pin);
-   if (irq >= 0)
-   printk(KERN_WARNING "PCI: using 
PPB %s[%c] to get irq %d\n",
-   pci_name(bridge), 'A' + 
pin, irq);
-   dev = bridge;
-   }
-   dev = temp_dev;
-   if (irq >= 0) {
+   irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, 
PCI_SLOT(dev->devfn), pin);
+   /*
+* Busses behind bridges are typically not listed in 
the MP-table.
+* In this case we have to look up the IRQ based on the 
parent bus,
+* parent slot, and pin number. The SMP code detects 
such bridged
+* busses itself so we should get into this branch 
reliably.
+*/
+   temp_dev = dev;
+   while (irq < 0 && dev->bus->parent) { /* go back to the 
bridge */
+   struct pci_dev * bridge = dev->bus->self;
+
+   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
+   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
+   PCI_SLOT(bridge->devfn), pin);
+   if (irq >= 0)
+   printk(KERN_WARNING "PCI: using PPB 
%s[%c] to get irq %d\n",
+   pci_name(bridge), 'A' + pin, 
irq);
+   dev = bridge;
+   }
+   dev = temp_dev;
+   if (irq >= 0) {
 #ifdef CONFIG_PCI_MSI
-   if (!platform_legacy_irq(irq))
-   irq = IO_APIC_VECTOR(irq);
+   if (!platform_legacy_irq(irq))
+   irq = IO_APIC_VECTOR(irq);
 #endif
-   printk(KERN_INFO "PCI->APIC IRQ 
transform: %s[%c] -> IRQ %d\n",
-   pci_name(dev), 'A' + pin, irq);
-   dev->irq = irq;
-   return 0;
-   } else
-   msg = " Probably buggy MP table.";
-   }
+   printk(KERN_INFO "PCI->APIC IRQ transform: 
%s[%c] -> IRQ %d\n",
+   pci_name(dev), 'A' + pin, irq);
+   dev->irq = irq;
+   return 0;
+

Re: [PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output

On Mon, Feb 07, 2005 at 09:06:18PM -0800, Mark F. Haigh wrote:
> 
> (Same basic problem I just reported in a separate thread against 2.4.29-bk8)
> 
> The following has been reported in the wild for kernel 2.6.8-24:
> 
> PCI: Enabling device :00:05.0 ( -> 0002)
> PCI: No IRQ known for interrupt pin @ of device :00:05.0. Probably 
> buggy MP table.
> 
> It should read "No IRQ known for interrupt pin A", but the 'pin' 
> variable has already been decremented (from 1 to 0), so the line:
> 
> printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device 
> %s.%s\n", 'A' + pin - 1, dev->slot_name, msg);
> 
> causes "pin @" to be output, because 'A' + 0 - 1 == '@'.
> 
> The supplied patch should fix it.  It also removes a redundant check for 
> a nonzero pin.
> 
> 
> Mark F. Haigh
> [EMAIL PROTECTED]
> 

> --- arch/i386/pci/irq.c.orig  2005-02-07 20:40:58.140856536 -0800
> +++ arch/i386/pci/irq.c   2005-02-07 20:46:06.713946296 -0800

Can you resend this so it can be applied with -p1 to patch, and a
Signed-off-by: line?

thanks,

greg k-h

[PATCH 2.6.11-rc3-bk4] arch/i386/kernel/pci/irq.c: Wrong message output

(Same basic problem I just reported in a separate thread against 2.4.29-bk8)
The following has been reported in the wild for kernel 2.6.8-24:
PCI: Enabling device :00:05.0 ( -> 0002)
PCI: No IRQ known for interrupt pin @ of device :00:05.0. Probably 
buggy MP table.

It should read "No IRQ known for interrupt pin A", but the 'pin' 
variable has already been decremented (from 1 to 0), so the line:

printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device 
%s.%s\n", 'A' + pin - 1, dev->slot_name, msg);

causes "pin @" to be output, because 'A' + 0 - 1 == '@'.
The supplied patch should fix it.  It also removes a redundant check for 
a nonzero pin.

Mark F. Haigh
[EMAIL PROTECTED]
--- arch/i386/pci/irq.c.orig  2005-02-07 20:40:58.140856536 -0800
+++ arch/i386/pci/irq.c  2005-02-07 20:46:06.713946296 -0800
@@ -1031,56 +1031,55 @@
 
pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) {
-   char *msg;
-   msg = "";
+   char *msg = "";
+
+   pin--;  /* interrupt pins are numbered starting from 1 
*/
+
if (io_apic_assign_pci_irqs) {
int irq;
 
-   if (pin) {
-   pin--;  /* interrupt pins are numbered 
starting from 1 */
-   irq = 
IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin);
-   /*
-* Busses behind bridges are typically not 
listed in the MP-table.
-* In this case we have to look up the IRQ 
based on the parent bus,
-* parent slot, and pin number. The SMP code 
detects such bridged
-* busses itself so we should get into this 
branch reliably.
-*/
-   temp_dev = dev;
-   while (irq < 0 && dev->bus->parent) { /* go 
back to the bridge */
-   struct pci_dev * bridge = 
dev->bus->self;
-
-   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
-   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
-   
PCI_SLOT(bridge->devfn), pin);
-   if (irq >= 0)
-   printk(KERN_WARNING "PCI: using 
PPB %s[%c] to get irq %d\n",
-   pci_name(bridge), 'A' + 
pin, irq);
-   dev = bridge;
-   }
-   dev = temp_dev;
-   if (irq >= 0) {
+   irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, 
PCI_SLOT(dev->devfn), pin);
+   /*
+* Busses behind bridges are typically not listed in 
the MP-table.
+* In this case we have to look up the IRQ based on the 
parent bus,
+* parent slot, and pin number. The SMP code detects 
such bridged
+* busses itself so we should get into this branch 
reliably.
+*/
+   temp_dev = dev;
+   while (irq < 0 && dev->bus->parent) { /* go back to the 
bridge */
+   struct pci_dev * bridge = dev->bus->self;
+
+   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
+   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
+   PCI_SLOT(bridge->devfn), pin);
+   if (irq >= 0)
+   printk(KERN_WARNING "PCI: using PPB 
%s[%c] to get irq %d\n",
+   pci_name(bridge), 'A' + pin, 
irq);
+   dev = bridge;
+   }
+   dev = temp_dev;
+   if (irq >= 0) {
 #ifdef CONFIG_PCI_MSI
-   if (!platform_legacy_irq(irq))
-   irq = IO_APIC_VECTOR(irq);
+   if (!platform_legacy_irq(irq))
+   irq = IO_APIC_VECTOR(irq);
 #endif
-   printk(KERN_INFO "PCI->APIC IRQ 
transform: %s[%c] -> IRQ %d\n",
-   pci_name(dev), 'A' + pin, irq);
-   dev->irq = irq;
-   return 0;
-   } else
-   msg = " Probably buggy MP table.";
-   }
+

[PATCH 2.4.29-bk8] Resend: sym53c8xx.c: Add ULL suffix to fix warning

Same patch, now against 2.4.29-bk8:
Noticed that in drivers/scsi/sym53c8xx.c:
sym53c8xx.c:13185: warning: integer constant is too large for "long" type
Since we're not dealing with C99 (yet), this 64 bit integer constant
needs to be suffixed with ULL.  Patch included.
Mark F. Haigh
[EMAIL PROTECTED]

--- drivers/scsi/sym53c8xx.c.orig   2005-02-07 19:53:05.741527608 -0800
+++ drivers/scsi/sym53c8xx.c  2005-02-07 19:53:36.782808616 -0800
@@ -13182,7 +13182,7 @@
** descriptors.
*/
if (chip && (chip->features & FE_DAC)) {
-   if (pci_set_dma_mask(pdev, (u64) 0xff))
+   if (pci_set_dma_mask(pdev, (u64) 0xffULL))
chip->features &= ~FE_DAC_IN_USE;
else
chip->features |= FE_DAC_IN_USE;

[PATCH 2.4.29-bk8] arch/i386/kernel/pci-irq.c: Wrong message output

I'd submitted a patch earlier for this file, fixing a warning.  When I 
looked at it further, I noticed it can output an incorrect warning 
message under certain circumstances.  I've confirmed that this can and 
does happen in the wild:

PCI: Enabling device :00:0a.0 ( -> 0001)
PCI: No IRQ known for interrupt pin @ of device :00:0a.0. Probably 
buggy MP table.

It should read "No IRQ known for interrupt pin A", but the 'pin' 
variable has already been decremented (from 1 to 0), so the line:

printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device 
%s.%s\n", 'A' + pin - 1, dev->slot_name, msg);

causes "pin @" to be output, because 'A' + 0 - 1 == '@'.
This patch also fixes the original warning:
pci-irq.c: In function `pcibios_enable_irq':
pci-irq.c:1128: warning: 'msg' might be used uninitialized in this function
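The warning arises because, before the patch, the io_apic branch assigned msg only inside the inner "if (pin)" test, so gcc saw a path on which msg was never set. The sketch below models the shape after that inner test is removed. The function name and its two flag parameters are hypothetical stand-ins, not kernel interfaces; the message strings are the ones from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the message selection in pcibios_enable_irq(). */
static const char *irq_failure_hint(int ioapic_in_use, int bios_irq_scan)
{
    const char *msg;

    if (ioapic_in_use)
        msg = " Probably buggy MP table.";
    else if (bios_irq_scan)
        msg = "";
    else
        msg = " Please try using pci=biosirq.";

    /* Every branch above assigns msg, so no path leaves it unset. */
    assert(msg != NULL);
    return msg;
}
```

Initializing msg at its declaration would also silence the warning, which is what the 2.6 version of the patch does with char *msg = "".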
Thanks,
Mark Haigh
[EMAIL PROTECTED]
--- arch/i386/kernel/pci-irq.c.orig 2005-02-07 19:55:23.852531544 -0800
+++ arch/i386/kernel/pci-irq.c  2005-02-07 20:13:38.835068896 -0800
@@ -1127,6 +1127,8 @@
if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) {
char *msg;
 
+   pin--;  /* interrupt pins are numbered starting from 1 
*/
+
/* With IDE legacy devices the IRQ lookup failure is not a 
problem.. */
if (dev->class >> 8 == PCI_CLASS_STORAGE_IDE && !(dev->class & 
0x5))
return;
@@ -1134,42 +1136,39 @@
if (io_apic_assign_pci_irqs) {
int irq;
 
-   if (pin) {
-   pin--;  /* interrupt pins are numbered 
starting from 1 */
-   irq = 
IO_APIC_get_PCI_irq_vector(dev->bus->number, PCI_SLOT(dev->devfn), pin);
-   /*
-* Busses behind bridges are typically not 
listed in the MP-table.
-* In this case we have to look up the IRQ 
based on the parent bus,
-* parent slot, and pin number. The SMP code 
detects such bridged
-* busses itself so we should get into this 
branch reliably.
-*/
-   temp_dev = dev;
-   while (irq < 0 && dev->bus->parent) { /* go 
back to the bridge */
-   struct pci_dev * bridge = 
dev->bus->self;
+   irq = IO_APIC_get_PCI_irq_vector(dev->bus->number, 
PCI_SLOT(dev->devfn), pin);
+   /*
+* Busses behind bridges are typically not listed in 
the MP-table.
+* In this case we have to look up the IRQ based on the 
parent bus,
+* parent slot, and pin number. The SMP code detects 
such bridged
+* busses itself so we should get into this branch 
reliably.
+*/
+   temp_dev = dev;
+   while (irq < 0 && dev->bus->parent) { /* go back to the 
bridge */
+   struct pci_dev * bridge = dev->bus->self;
 
-   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
-   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
-   
PCI_SLOT(bridge->devfn), pin);
-   if (irq >= 0)
-   printk(KERN_WARNING "PCI: using 
PPB(B%d,I%d,P%d) to get irq %d\n", 
-   bridge->bus->number, 
PCI_SLOT(bridge->devfn), pin, irq);
-   dev = bridge;
-   }
-   dev = temp_dev;
-   if (irq >= 0) {
-   printk(KERN_INFO "PCI->APIC IRQ 
transform: (B%d,I%d,P%d) -> %d\n",
-   dev->bus->number, 
PCI_SLOT(dev->devfn), pin, irq);
-   dev->irq = irq;
-   return;
-   } else
-   msg = " Probably buggy MP table.";
+   pin = (pin + PCI_SLOT(dev->devfn)) % 4;
+   irq = 
IO_APIC_get_PCI_irq_vector(bridge->bus->number, 
+   PCI_SLOT(bridge->devfn), pin);
+   if (irq >= 0)
+   printk(KERN_WARNING "PCI: using 
PPB(B%d,I%d,P%d) to get irq %d\n", 
+   bridge->bus->number, 
PCI_SLOT(bridge->devfn), pin, irq);
+   dev = bridge;
}
+   dev = temp_dev;
+

Kernel panic while executing init. (2.6.11-rc3)

2005-02-07 Thread Vishwas Pai

The kernel panicked while booting 2.6.11-rc3 on an HP rx5670 (2 CPUs). The kernel
was configured and compiled with the zx1_defconfig target.

I want to follow the steps given below to understand and debug the problem. Please
correct me if this is not the right way to attack problems of this kind.

1. Disassemble "create_elf_tables" from vmlinux.
2. Locate the code with the help of the IP offset available in the panic dump.
3. Use the register values to see what might have gone wrong.

I am not sure how to do the following.

1. How do I get the kernel data structure values at the time of the panic?
2. How do I find out which fault caused the problem (data page fault,
instruction fault, etc.)?
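Step 2 of the first list comes down to simple address arithmetic, sketched below. The helper names and the addresses in the usage note are made up. The bundle and slot split rests on the assumption that ia64 instructions live in 16-byte bundles and that the IP printed in the dump carries the slot number in its low bits.

```c
#include <assert.h>

typedef unsigned long long addr_t;

/* Offset of the faulting IP from the start of the function (e.g. from System.map). */
static addr_t offset_in_function(addr_t ip, addr_t func_start)
{
    assert(ip >= func_start);
    return ip - func_start;
}

/* Assumption: bundles are 16-byte aligned, so masking the low 4 bits gives the bundle. */
static addr_t bundle_address(addr_t ip)
{
    return ip & ~(addr_t)0xf;
}

/* Assumption: the low bits of the printed IP hold the slot (0, 1 or 2). */
static unsigned int slot_in_bundle(addr_t ip)
{
    return (unsigned int)(ip & 0xf);
}
```

With a hypothetical function start of 0xa000000100100000 and a reported IP of 0xa000000100100121, the offset is 0x121, which points at slot 1 of the bundle at offset 0x120 in the disassembly.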

-- vishwas

ELILO boot: test2611rc3
Uncompressing Linux... done
Linux version 2.6.11-rc3 ([EMAIL PROTECTED]) (gcc version 3.2.3 20030502
(Red Hat Linux 3.2.3-42)) #1 SMP Mon Feb 7 12:37:59 IST 2005
EFI v1.10 by HP: SALsystab=0x3ff88000 ACPI 2.0=0x3fdf6000
SMBIOS=0x3ff8a000 HCDP=0x3fdf5000
PCDP: v0 at 0x3fdf5000
Early serial console at MMIO 0x80006000 (options '9600n8')
warning: skipping physical page 0
SAL 0.20: INTEL   MSL REF SAL  version 2.0
SAL: AP wakeup using external interrupt vector 0xff
ACPI: Local APIC address c000fee0
GSI 20 (level, low) -> CPU 0 (0x) vector 48
2 CPUs available, 2 CPUs total
MCA related initialization done
Virtual mem_map starts at 0xa0007fffc720
Built 1 zonelists
Kernel command line: BOOT_IMAGE=scsi2:EFI\redhat\vmlinuz-2611rc3 
root=/dev/sdb3 ro
PID hash table entries: 4096 (order: 12, 131072 bytes)
Console: colour dummy device 80x25
Dentry cache hash table entries: 1048576 (order: 9, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 8, 4194304 bytes)
Memory: 6236480k/6284976k available (8001k code, 48160k reserved,
3681k data, 272k init)
Leaving McKinley Errata 9 workaround enabled
Mount-cache hash table entries: 1024 (order: 0, 16384 bytes)
Boot processor id 0x0/0x0
task migration cache decay timeout: 10 msecs.
CPU 1: synchronized ITC with CPU 0 (last diff 0 cycles, maxerr 435 cycles)
Brought up 2 CPUs
Total of 2 processors activated (2694.04 BogoMIPS).
NET: Registered protocol family 16
ACPI: Subsystem revision 20050125
ACPI: Interpreter enabled
ACPI: Using IOSAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
ACPI: PCI Root Bridge [PCI1] (00:20)
ACPI: PCI Root Bridge [PCI2] (00:40)
ACPI: PCI Root Bridge [PCI3] (00:60)
ACPI: PCI Root Bridge [PCI4] (00:80)
ACPI: PCI Root Bridge [PCI5] (00:a0)
ACPI: PCI Root Bridge [PCI6] (00:c0)
ACPI: PCI Root Bridge [PCI7] (00:e0)
SCSI subsystem initialized
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
** PCI interrupts are no longer routed automatically.  If this
** causes a device to stop working, it is probably because the
** driver failed to call pci_enable_device().  As a temporary
** workaround, the "pci=routeirq" argument restores the old
** behavior.  If this argument makes the device work again,
** please email the output of "lspci" to [EMAIL PROTECTED]
** so I can fix the driver.
IOC: zx1 2.3 HPA 0xfed01000 IOVA space 1024Mb at 0x4000
perfmon: version 2.0 IRQ 238
perfmon: Itanium 2 PMU detected, 16 PMCs, 18 PMDs, 4 counters (47 bits)
PAL Information Facility v0.5
perfmon: added sampling format default_format
perfmon_default_smpl: default_format v2.0 registered
Total HugeTLB memory allocated, 0
Installing knfsd (copyright (C) 1996 [EMAIL PROTECTED]).
Initializing Cryptographic API
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.4
ACPI: Power Button (FF) [PWRF]
ACPI: Sleep Button (FF) [SLPF]
ACPI: Thermal Zone [THM0] (27 C)
EFI Time Services Driver v0.4
Linux agpgart interface v0.100 (c) Dave Jones
[drm] Initialized drm 1.0.0 20040925
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
GSI 16 (level, low) -> CPU 1 (0x0100) vector 49
ACPI: PCI interrupt :00:01.0[A] -> GSI 16 (level, low) -> IRQ 49
ttyS0 at MMIO 0x80007000 (irq = 49) is a 16550A
ACPI: PCI interrupt :00:01.1[A] -> GSI 16 (level, low) -> IRQ 49
ttyS1 at MMIO 0x80006000 (irq = 49) is a 16550A
ttyS2 at MMIO 0x80006010 (irq = 49) is a 16550A
ttyS3 at MMIO 0x80006038 (irq = 49) is a 16550A
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
loop: loaded (max 8 devices)
Intel(R) PRO/1000 Network Driver - version 5.6.10.1-k2
Copyright (c) 1999-2004 Intel Corporation.
e100: Intel(R) PRO/100 Network Driver, 3.3.6-k2-NAPI
e100: Copyright(c) 1999-2004 Intel Corporation
tg3.c:v3.19 (January 26, 2005)
GSI 27 (level, low) -> CPU 0 (0x) vector 50
ACPI: PCI interrupt :21:04.0[A] -> GSI 27 (level, low) -> IRQ 50
eth0: Tigon3 [partno(A6794-60001) rev 0105 PHY(5701)] (PCI:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:30:6e:49:1f:a2
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1]

[OSDL] email gateway for STP available

2005-02-07 Thread Cliff White


Still somewhat beta; we're working on our ease of use.
Release 3.0.19 of STP, available at Sourceforge
(http://sourceforge.net/projects/stp )
and via BK
( bk://developer.osdl.org tag: release_3.0.19 )

adds an email gateway, so you can submit test requests without
the Web. 
I am looking for a few beta testers. 
Grab the kit and email me if you're interested in using it.
cliffw
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

PCI Error reporting & recovery

2005-02-07 Thread Benjamin Herrenschmidt

Hi Seto !

I was reading the list archives for the discussion back in September
about PCI error reporting. Has there been any further progress on this
since then ?

I'm looking into adapting something for the needs of ppc64 as well
(which, btw, has 1 slot = 1 bridge in most cases, but not all of them :)
which uses quite different low-level mechanisms. (Basically, we have to
go through the firmware to get to the errors).

Also, our bridges automatically isolate slots that had any error
on them (including DMA) and we have the ability to recover, by
triggering a reset on a given segment and that sort of thing, which
I would like to provide drivers with an API to control as well.

Finally, I was thinking about some richer semantics for the errors
themselves. For example, on DMA error, we can sometimes get good details
about the faulting address etc... which may be interesting for the driver
to log, for diagnostic purposes at least.

So I'd like to start from what you did back then and discuss possible
APIs for the above ideas / changes. What is the status of that stuff ?
Did it evolve since then ?

Regards,
Ben.



Re: 2.6.11-rc3: Kylix application no longer works?

Grzegorz Kulewski <[EMAIL PROTECTED]> wrote:
>
> On Mon, 7 Feb 2005, Andrew Morton wrote:
> 
>  > Daniel Drake <[EMAIL PROTECTED]> wrote:
>  >>
>  >>> # fs/binfmt_elf.c
>  >>> #   2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19
>  >>> #   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
>  >>> #
>  >>
>  >> I think so. For a short period we applied this patch to the Gentoo 2.6.10
>  >> kernel...
>  >>
>  >> http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch
>  >>
>  >> ...but removed it once users complained it stopped kylix binaries from running.
>  >
>  > Bah.  That's what happens when you fix stuff.
>  >
>  > What's kylix?  The Borland C++ builder thing?
> 
>  Rather Delphi (== Object Pascal) thing.
> 
> 
>  > How should one set about reproducing this problem?
> 
>  IIRC, Some minimal "personal" version can be downloaded from borland.com.

Well I'd prefer that we not back out the whole patch.  Could someone please
test with something like the below and let us know exactly where it's falling
over?


--- 25/fs/binfmt_elf.c~a2005-02-07 20:01:16.0 -0800
+++ 25-akpm/fs/binfmt_elf.c 2005-02-07 20:03:51.0 -0800
@@ -44,6 +44,8 @@
 
 #include 
 
+#define D() do { printk("%s:%d\n", __FILE__, __LINE__); dump_stack(); } while (0)
+
 static int load_elf_binary(struct linux_binprm * bprm, struct pt_regs * regs);
 static int load_elf_library(struct file*);
 static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int);
@@ -181,8 +183,10 @@ create_elf_tables(struct linux_binprm *b
STACK_ALLOC(p, ((current->pid % 64) << 7));
 #endif
u_platform = (elf_addr_t __user *)STACK_ALLOC(p, len);
-   if (__copy_to_user(u_platform, k_platform, len))
+   if (__copy_to_user(u_platform, k_platform, len)) {
+   D();
return -EFAULT;
+   }
}
 
/* Create the ELF interpreter info */
@@ -244,8 +248,10 @@ create_elf_tables(struct linux_binprm *b
 #endif
 
/* Now, let's put argc (and argv, envp if appropriate) on the stack */
-   if (__put_user(argc, sp++))
+   if (__put_user(argc, sp++)) {
+   D();
return -EFAULT;
+   }
if (interp_aout) {
argv = sp + 2;
envp = argv + argc + 1;
@@ -266,8 +272,10 @@ create_elf_tables(struct linux_binprm *b
return 0;
p += len;
}
-   if (__put_user(0, argv))
+   if (__put_user(0, argv)) {
+   D();
return -EFAULT;
+   }
current->mm->arg_end = current->mm->env_start = p;
while (envc-- > 0) {
size_t len;
@@ -277,14 +285,18 @@ create_elf_tables(struct linux_binprm *b
return 0;
p += len;
}
-   if (__put_user(0, envp))
+   if (__put_user(0, envp)) {
+   D();
return -EFAULT;
+   }
current->mm->env_end = p;
 
/* Put the elf_info on the stack in the right place.  */
sp = (elf_addr_t __user *)envp + 1;
-   if (copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t)))
+   if (copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t))) {
+   D();
return -EFAULT;
+   }
return 0;
 }
 
_
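
For anyone who wants to try the same trick outside the kernel, here is a
minimal user-space sketch of the tracing macro, not part of the patch above.
It assumes plain C with stdio: fprintf() stands in for printk()/dump_stack(),
and checked_copy() and trace_hits are hypothetical names used only for
illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Counts how many times D() fired (hypothetical, for illustration). */
static int trace_hits;

/* do { ... } while (0) makes the macro behave as a single statement,
 * so it is safe inside an unbraced if/else. */
#define D() do { \
	fprintf(stderr, "%s:%d\n", __FILE__, __LINE__); \
	trace_hits++; \
} while (0)

/* Hypothetical stand-in for a copy that can fail: refuses to copy
 * when the destination is too small, and reports where it gave up. */
static int checked_copy(char *dst, size_t dst_len, const char *src)
{
	size_t need = strlen(src) + 1;

	if (need > dst_len) {
		D();
		return -EFAULT;
	}
	memcpy(dst, src, need);
	return 0;
}
```

Each failing path gets its own D() call, so the file:line printed on stderr
says which check returned the error.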


[RFC][PATCH] remove "base" argument from __free_pages_bulk()

2005-02-07 Thread Dave Hansen


Appended is a patch which stops using the zone->zone_mem_map to
calculate the buddy and combined page pointers.  It uses the fact
that the mem_map array is guaranteed to be contiguous for the
surrounding (1 << MAX_ORDER) pages, and uses the relative positions of
the pages in the physical address space to provide the
alignment; this coincidentally fixes the issue where zones are
not aligned at MAX_ORDER.  There is a very comprehensive comment
in the new code explaining the mathematical relationship between
a page and its buddy so I won't reproduce it here.

This kind of approach is required for CONFIG_NONLINEAR systems
where the mem_map is not contiguous within a zone, and the 
zone->zone_mem_map is not used at all.
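
The index arithmetic can also be tried outside the kernel.  What follows is
a minimal user-space sketch, not the patch itself: find_buddy_index(),
find_combined_index() and show_buddy() are hypothetical stand-ins that work
on plain page indices rather than on struct page pointers.

```c
#include <stdio.h>

/* Hypothetical user-space stand-ins for the helpers in the patch;
 * they work on plain page indices instead of struct page pointers. */

/* Index of the order-`order` buddy: flip the bit for that order. */
static unsigned long find_buddy_index(unsigned long page_idx, unsigned int order)
{
	return page_idx ^ (1UL << order);
}

/* Index of the combined order+1 block: clear the bit for that order. */
static unsigned long find_combined_index(unsigned long page_idx, unsigned int order)
{
	return page_idx & ~(1UL << order);
}

/* Print the buddy and the combined block of one page index. */
static void show_buddy(unsigned long page_idx, unsigned int order)
{
	printf("page %lu order %u: buddy %lu, combined %lu\n",
	       page_idx, order,
	       find_buddy_index(page_idx, order),
	       find_combined_index(page_idx, order));
}
```

Calling show_buddy(8, 1) reports buddy 10 and combined index 8, and
show_buddy(10, 1) reports buddy 8 and the same combined index, which is the
symmetry the coalescing loop relies on.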

This patch has been boot-tested on a large variety of systems and
architectures: my P4 laptop, 16-way NUMAQ, 16-way Summit, 4-way
x86 SMP, ppc64 LPAR, x86_64, and several ia64 configurations.

It has been performance-tested on a 16-way NUMAQ. SDET shows a
very slight (within margin of error) performance gain.  Kernbench
shows an approximately 1% decrease in system time with this
patch applied.  So, it has a likely positive performance impact.

However, the patch has the potential to have a negative performance
impact on systems with an expensive page_to_pfn() implementation.
But, I think the NUMAQ has one of the more expensive ones around,
and it doesn't seem to mind too much.

Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]>
Signed-off-by: Dave Hansen <[EMAIL PROTECTED]>
---

 memhotplug-dave/mm/page_alloc.c |   57 +---
 1 files changed, 42 insertions(+), 15 deletions(-)

diff -puN mm/page_alloc.c~B-sparse-120-free-pages-no-base mm/page_alloc.c
--- memhotplug/mm/page_alloc.c~B-sparse-120-free-pages-no-base  2005-02-04 
15:21:59.0 -0800
+++ memhotplug-dave/mm/page_alloc.c 2005-02-07 20:02:25.0 -0800
@@ -192,6 +192,35 @@ static inline void rmv_page_order(struct
 }
 
 /*
+ * Locate the struct page for both the matching buddy in our
+ * pair (buddy1) and the combined O(n+1) page they form (page).
+ *
+ * 1) Any buddy B1 will have an order O twin B2 which satisfies
+ * the following equation:
+ * B2 = B1 ^ (1 << O)
+ * For example, if the starting buddy (buddy2) is #8 its order
+ * 1 buddy is #10:
+ * B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
+ *
+ * 2) Any buddy B will have an order O+1 parent P which
+ * satisfies the following equation:
+ * P = B & ~(1 << O)
+ *
+ * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
+ */
+static inline struct page *__page_find_buddy(struct page *page, unsigned long page_idx, unsigned int order)
+{
+   unsigned long buddy_idx = page_idx ^ (1 << order);
+
+   return page + (buddy_idx - page_idx);
+}
+
+static inline unsigned long __find_combined_index(unsigned long page_idx, unsigned int order)
+{
+   return (page_idx & ~(1 << order));
+}
+
+/*
  * This function checks whether a page is free && is the buddy
  * we can do coalesce a page and its buddy if
  * (a) the buddy is free &&
@@ -234,44 +263,43 @@ static inline int page_is_buddy(struct p
  * -- wli
  */
 
-static inline void __free_pages_bulk (struct page *page, struct page *base,
+static inline void __free_pages_bulk (struct page *page,
struct zone *zone, unsigned int order)
 {
unsigned long page_idx;
-   struct page *coalesced;
int order_size = 1 << order;
 
if (unlikely(order))
destroy_compound_page(page, order);
 
-   page_idx = page - base;
+   page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
 
BUG_ON(page_idx & (order_size - 1));
BUG_ON(bad_range(zone, page));
 
zone->free_pages += order_size;
while (order < MAX_ORDER-1) {
+   unsigned long combined_idx;
struct free_area *area;
struct page *buddy;
-   int buddy_idx;
 
-   buddy_idx = (page_idx ^ (1 << order));
-   buddy = base + buddy_idx;
+   combined_idx = __find_combined_index(page_idx, order);
+   buddy = __page_find_buddy(page, page_idx, order);
+
if (bad_range(zone, buddy))
break;
if (!page_is_buddy(buddy, order))
-   break;
-   /* Move the buddy up one level. */
+   break;  /* Move the buddy up one level. */
list_del(&buddy->lru);
area = zone->free_area + order;
area->nr_free--;
rmv_page_order(buddy);
-   page_idx &= buddy_idx;
+   page = page + (combined_idx - page_idx);
+   page_idx = combined_idx;
order++;
}
-   coalesced = base + page_idx;
-   set_page_order(coalesced, order);
-   list_add(&coalesced->lru, &zone->free_area[order].free_list);
+   set_page_order(page, order);
+   list_add(&page->lru, &zone->

Re: linux-2.6.11-rc3: XFS internal error xfs_da_do_buf(1) at line 2176 of file fs/xfs/xfs_da_btree.c.

2005-02-07 Thread SATOH Fumiyasu

There are some corrections for my message... Sorry.

At Tue, 08 Feb 2005 12:53:29 +0900,
SATOH Fumiyasu wrote:
> Host3:
> --
> OS: Debian GNU/Linux testing version (sarge)
> Kernel: kernel-image-2.6.10-1-686-smp
> (compiled by gcc version 3.3.5 (Debian 1:3.3.5-6))
> Filesystem: / (/dev/md0 (RAID1, /dev/hda1, /dev/hdd1))
> CPU: Intel(R) Xeon(TM) CPU 2.40GHz x 2 (SMP)

Filesystem: / (/dev/md0 (RAID1, /dev/sda1, /dev/sdb1))
CPU: Intel(R) Pentium(R) III CPU family 1133MHz x 2 (SMP)
SCSI-HBA: Adaptec AIC-7899P U160/m (rev 01)

-- 
-- Name: SATOH Fumiyasu  -- Home: http://www.sfo.jp (in Japanese only)
-- Mail: fumiya at net-thrust.com, samba.gr.jp, namazu.org or ...

Re: linux-2.6.11-rc3: XFS internal error xfs_da_do_buf(1) at line 2176 of file fs/xfs/xfs_da_btree.c.

2005-02-07 Thread SATOH Fumiyasu

At Mon, 07 Feb 2005 09:38:28 -0600,
Jeffrey E. Hundstad wrote:
> I'm sorry for this truncated report... but it's all I've got.  If you 
> need .config or system configuration, etc. let me know and I'll send'em 
> ASAP.  I don't believe this is hardware related; ide-smart shows all fine.
> 
>  From dmesg:
> 
> xfs_da_do_buf: bno 8388608
> dir: inode 117526252
> Filesystem "hda4": XFS internal error xfs_da_do_buf(1) at line 2176 of 
> file fs/xfs/xfs_da_btree.c.  Caller 0xc01bda27

I've seen similar problems on Debian GNU/Linux testing (sarge)
with kernel-image-2.6.8-1-686-smp and kernel-image-2.6.10-1-686-smp
(kernel-image-* are Debian's Linux kernel binary packages).

I think this is NOT hardware related. These problems have occurred
on three different machines.

Host1
-
OS: Debian GNU/Linux testing (sarge)
Kernel: kernel-image-2.6.8-1-686-smp
   (compiled by gcc version 3.3.5 (Debian 1:3.3.5-2))
Filesystem: / (/dev/md0 (RAID1, /dev/hda1, /dev/hdd1))
CPU: Intel(R) Xeon(TM) CPU 2.40GHz x 2 (SMP)

# xfs_info /
meta-data=/                      isize=256    agcount=8, agsize=244232 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=1953856, imaxpct=25
         =                       sunit=8      swidth=16 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=2560, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

Log: Not found in /var/log/*

Host2
-
OS: Debian GNU/Linux testing (sarge)
Kernel: kernel-image-2.6.10-1-686-smp
(compiled by gcc version 3.3.5 (Debian 1:3.3.5-6))
Filesystem: / (/dev/md0 (RAID1, /dev/hda1, /dev/hdd1))
CPU: Intel(R) Xeon(TM) CPU 2.40GHz x 2 (SMP)

# xfs_info /
meta-data=/                      isize=256    agcount=8, agsize=244232 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=1953856, imaxpct=25
         =                       sunit=8      swidth=16 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=2560, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

Log1 from /var/log/kern.log*:
Jan 28 21:11:50 host2 kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c.  Caller 0xf89d02a5
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+782770/5290900] xfs_free_ag_extent+0x471/0x7a0 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+787494/5290900] xfs_free_extent+0xe5/0x110 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+787494/5290900] xfs_free_extent+0xe5/0x110 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+1189885/5290900] kmem_zone_alloc+0x4c/0xa0 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+997143/5290900] xfs_efd_init+0x86/0x90 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+1142537/5290900] xfs_trans_get_efd+0x38/0x50 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+867408/5290900] xfs_bmap_finish+0x13f/0x1e0 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+1039636/5290900] xfs_itruncate_finish+0x233/0x460 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+1167530/5290900] xfs_inactive+0x509/0x570 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+1237072/5290900] vn_rele+0xff/0x120 [xfs]
Jan 28 21:11:50 host2 kernel:  [__crc_pm_idle+1230441/5290900] linvfs_clear_inode+0x18/0x30 [xfs]
Jan 28 21:11:50 host2 kernel:  [clear_inode+230/288] clear_inode+0xe6/0x120
Jan 28 21:11:50 host2 kernel:  [generic_delete_inode+362/416] generic_delete_inode+0x16a/0x1a0
Jan 28 21:11:50 host2 kernel:  [iput+99/144] iput+0x63/0x90
Jan 28 21:11:50 host2 kernel:  [sys_unlink+275/320] sys_unlink+0x113/0x140
Jan 28 21:11:50 host2 kernel:  [sysenter_past_esp+82/113] sysenter_past_esp+0x52/0x71
Jan 28 21:11:50 host2 kernel: xfs_force_shutdown(md0,0x8) called from line 4049 of file fs/xfs/xfs_bmap.c.  Return address = 0xf8a3d2db
Jan 28 21:11:50 host2 kernel: Filesystem "md0": Corruption of in-memory data detected.  Shutting down filesystem: md0
Jan 28 21:11:50 host2 kernel: Please umount the filesystem, and rectify the problem(s)

Log2 from /var/log/kern.log*:
Feb  3 14:35:18 host2 kernel: xfs_force_shutdown(md0,0x8) called from line 1091 of file fs/xfs/xfs_trans.c.  Return address = 0xf8a8b23b
Feb  3 14:35:18 host2 kernel: Filesystem "md0": Corruption of in-memory data detected.  Shutting down filesystem: md0
Feb  3 14:35:18 host2 kernel: Please umount the filesystem, and rectify the problem(s)

Host3:
--
OS: Debian GNU/Linux testing version (sarge)
Kernel: kernel-image-2.6.10-1-686-smp
(compiled by gcc version 3.3.5 (Debian 1:3.3.5-6))
Filesystem: / (/dev/md0 (RAID1, /dev/hda1, /dev/hdd1))
CPU: Intel(R) Xeon(TM) CPU 2.40GHz x 2 (SMP)

# xfs_info /
meta-data=/  isize=256

Question about sendfile

2005-02-07 Thread Xiuduan Fang

Hi,
I am trying to beat the I/O bottleneck so as to speed up bulk data transfers
on high-speed networks. It seems that the system call sendfile() can help to
reduce CPU utilization and speed up data transfers. But I have one question
about the system call.

Linux sendfile requires that the input file descriptor not be a
network socket. What are the reasons for such a restriction? Sending from a
socket to a file via zero copy is definitely useful. Actually this is one
approach I am trying in order to improve performance. Some discussions on
Linux zero copy said this is because it is harder: sending from a socket to a
file via zero copy needs the support of NICs. I cannot understand this
explanation. It seems that FreeBSD has implemented bidirectional zero
copy (http://people.freebsd.org/~ken/zero_copy/#Download). So why does Linux
not support it? What can I do to lift the restriction that Linux
enforces on sendfile?

Any hints will be highly appreciated. Thanks.
Xiuduan Fang

Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)

On Mon, 07 Feb 2005 18:20:36 PST, Chris Wright said:
> * [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> > open("/tmp/sh-thd-1107848098", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL|O_LARGEFILE, 0600) = 3
> 
> O_EXCL
> 
> > Wow - if my /tmp was on the same partition, and I'd hard-linked that
> > file to /etc/passwd, it would be toast now if root had run it.
> 
> So, in fact, it wouldn't ;-)

Well.. Yeah.  bash gets it right, a lot of programs botch it. ;)
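
To make the difference concrete: with O_CREAT|O_EXCL, open() fails with
EEXIST when the name already exists, and a hard link planted by an attacker
is just another existing name.  A minimal sketch, with
create_private_file() and create_careless_file() as hypothetical helper
names:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create `path` only if no such name exists yet.
 * Returns a writable descriptor, or -1 with errno set (EEXIST if the
 * name is already there, whether as a file, symlink or hard link). */
static int create_private_file(const char *path)
{
	return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}

/* Hypothetical counter-example: the pattern that gets programs into
 * trouble. An existing name (including a planted hard link) is opened
 * and truncated without complaint. */
static int create_careless_file(const char *path)
{
	return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
}
```

The first helper leaves the target of a planted link untouched; the second
truncates it, which is the clobbering described in this thread.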



[PATCH] Makefiles are not built using a Fortran compiler

2005-02-07 Thread Matthew Wilcox


David Holland pointed out that Make has a lot of implicit suffix rules
built in and you can disable them by setting ".SUFFIXES:".  As an
example, checking the debugging information shows we no longer try to
compile anything from a '.f' suffix.  This turns out to be good for a 15%
speedup on a build with nothing to do; down from 29.1 seconds to 24.7
seconds on my K6.

Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]>

Index: Makefile
===
RCS file: /var/cvs/linux-2.6/Makefile,v
retrieving revision 1.338
diff -u -p -r1.338 Makefile
--- Makefile6 Feb 2005 06:43:49 -   1.338
+++ Makefile8 Feb 2005 02:39:28 -
@@ -4,6 +4,8 @@ SUBLEVEL = 11
 EXTRAVERSION =-rc3-pa3
 NAME=Woozy Numbat
 
+.SUFFIXES:
+
 # *DOCUMENTATION*
 # To see a list of typical targets execute "make help"
 # More info can be located in ./README
Index: scripts/Makefile.build
===
RCS file: /var/cvs/linux-2.6/scripts/Makefile.build,v
retrieving revision 1.9
diff -u -p -r1.9 Makefile.build
--- scripts/Makefile.build  12 Jan 2005 20:18:19 -  1.9
+++ scripts/Makefile.build  8 Feb 2005 02:39:28 -
@@ -4,6 +4,8 @@
 
 src := $(obj)
 
+.SUFFIXES:
+
 .PHONY: __build
 __build:
 

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain

Re: Memory leak in 2.6.11-rc1? (also here)

2005-02-07 Thread Noel Maddy

On Mon, Feb 07, 2005 at 07:38:12AM -0800, Linus Torvalds wrote:
> 
> Whee. You've got 5 _million_ bio's "active". Which account for about 750MB
> of your 860MB of slab usage.

Same situation here, at different rates on two different platforms,
both running same kernel build. Both show steadily increasing biovec-1.

uglybox was previously running Ingo's 2.6.11-rc2-RT-V0.7.36-03, and was
well over 3,000,000 bios after about a week of uptime. With only 512M of
memory, it was pretty sluggish.

Interesting that the 4-disk RAID5 seems to be growing about 4 times as
fast as the RAID1.

If there's anything else that could help, or patches you want me to try,
just ask.

Details:

=
#1: Soyo KT600 Platinum, Athlon 2500+, 512MB
2 SATA, 2 PATA (all on 8237)
RAID1 and RAID5
on-board tg3


>uname -a
Linux uglybox 2.6.11-rc3 #2 Thu Feb 3 16:19:44 EST 2005 i686 GNU/Linux
>uptime
 21:27:47 up  7:04,  4 users,  load average: 1.06, 1.03, 1.02
>grep '^bio' /proc/slabinfo
biovec-(256)   256    256   3072    2    2 : tunables   24   12    0 : slabdata    128    128      0
biovec-128     256    260   1536    5    2 : tunables   24   12    0 : slabdata     52     52      0
biovec-64      256    260    768    5    1 : tunables   54   27    0 : slabdata     52     52      0
biovec-16      256    260    192   20    1 : tunables  120   60    0 : slabdata     13     13      0
biovec-4       256    305     64   61    1 : tunables  120   60    0 : slabdata      5      5      0
biovec-1     64547  64636     16  226    1 : tunables  120   60    0 : slabdata    286    286      0
bio          64551  64599     64   61    1 : tunables  120   60    0 : slabdata   1059   1059      0
>lsmod
Module                  Size  Used by
ppp_deflate             4928  2
zlib_deflate           21144  1 ppp_deflate
bsd_comp                5376  0
ppp_async               9280  1
crc_ccitt               1728  1 ppp_async
ppp_generic            21396  7 ppp_deflate,bsd_comp,ppp_async
slhc                    6720  1 ppp_generic
radeon                 76224  1
ipv6                  235456  27
pcspkr                  3300  0
tg3                    84932  0
ohci1394               31748  0
ieee1394               94196  1 ohci1394
snd_cmipci             30112  1
snd_pcm_oss            48480  0
snd_mixer_oss          17728  1 snd_pcm_oss
usbhid                 31168  0
snd_pcm                83528  2 snd_cmipci,snd_pcm_oss
snd_page_alloc          7620  1 snd_pcm
snd_opl3_lib            9472  1 snd_cmipci
snd_timer              21828  2 snd_pcm,snd_opl3_lib
snd_hwdep               7456  1 snd_opl3_lib
snd_mpu401_uart         6528  1 snd_cmipci
snd_rawmidi            20704  1 snd_mpu401_uart
snd_seq_device          7116  2 snd_opl3_lib,snd_rawmidi
snd                    48996  12 snd_cmipci,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_opl3_lib,snd_timer,snd_hwdep,snd_mpu401_uart,snd_rawmidi,snd_seq_device
soundcore               7648  1 snd
uhci_hcd               29968  0
ehci_hcd               29000  0
usbcore               106744  4 usbhid,uhci_hcd,ehci_hcd
dm_mod                 52796  0
it87                   23900  0
eeprom                  5776  0
lm90                   11044  0
i2c_sensor              2944  3 it87,eeprom,lm90
i2c_isa                 1728  0
i2c_viapro              6412  0
i2c_core               18512  6 it87,eeprom,lm90,i2c_sensor,i2c_isa,i2c_viapro
>lspci
:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge (rev 80)
:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
:00:07.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705 Gigabit Ethernet (rev 03)
:00:0d.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46)
:00:0e.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
:00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
:00:10.0 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.1 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [K8T800 South]
:00:13.0 RAID bus controller: Silicon Image, Inc. (formerly CMD Technology Inc) SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV200 QW [Radeon 7500]
>cat /proc/mdstat
Personalities : [raid0] [ra

Re: [PATCH] Re: msdos/vfat defaults are annoying

2005-02-07 Thread Clemens Schwaighofer

On 02/08/2005 09:23 AM, Horst von Brand wrote:
> Clemens Schwaighofer <[EMAIL PROTECTED]> said:
> 
> [...]
>>but to be honest, most times I need vfat, and I actually haven't
>>encountered a time when I need msdos.
> 
> But writing MSDOS on a VFAT filesystem is a sure way to screw it up, and
> AFAIU vice-versa.

well it doesn't screw it up if you write MS DOS on a VFAT, you just
lose a lot of data.

I was kinda surprised when I came home and plugged in my USB stick to
see just A3.CB instead of a nice long filename :)

-- 
[ Clemens Schwaighofer  -=:~ ]
[ TBWA\ && TEQUILA\ Japan IT Group   ]
[6-17-2 Ginza Chuo-ku, Tokyo 104-0061, JAPAN ]
[ Tel: +81-(0)3-3545-7703  Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.co.jp  http://www.tbwajapan.co.jp ]

Re: [PATCH] Re: msdos/vfat defaults are annoying

2005-02-07 Thread Horst von Brand

Clemens Schwaighofer <[EMAIL PROTECTED]> said:

[...]

> but to be honest, most times I need vfat, and I actually haven't
> encountered a time when I need msdos.

But writing MSDOS on a VFAT filesystem is a sure way to screw it up, and
AFAIU vice-versa.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

Re: Please open sysfs symbols to proprietary modules

2005-02-07 Thread Horst von Brand

"Randy.Dunlap" <[EMAIL PROTECTED]> said:
> Chris Friesen wrote:

[...]

> > If you look at the big chip manufacturers (TI, Maxim, Analog Devices, 
> > etc.) they publish specs on everything.  It would be nice if others did 
> > the same.

> One of the arguments that I have heard is fairly old and debatable as
> well.  This was the subject of a panel discussion at LWE in 2000 or
> 2001, chaired by journalist Nicholas Petreley.  The panel was composed
> of vendors from (mostly) audio devices IIRC, but I'm not sure.

A friend of mine got to sign an NDA for access to the official specs to a
device. Turned out to be some handwritten sheets, scribbled over...

Shame might have something to do with it too ;-)
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)

* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
> open("/tmp/sh-thd-1107848098", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL|O_LARGEFILE, 
> 0600) = 3

O_EXCL

> Wow - if my /tmp was on the same partition, and I'd hard-linked that
> file to /etc/passwd, it would be toast now if root had run it.

So, in fact, it wouldn't ;-)

thanks,
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net

Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)

On Tue, 08 Feb 2005 01:48:40 GMT, David Wagner said:

> How would /etc/passwd get clobbered?  Are you thinking that a tmp
> cleaner run by cron might delete /tmp/whatever (i.e., delete the hardlink
> you created above)?  But deleting /tmp/whatever is safe; it doesn't affect
> /etc/passwd.  I'm guessing I'm probably missing something.

The attack is to hardlink some tempfile name to some file you want over-written.
This usually involves just a little bit of work, such as recognizing that a
given root cronjob uses an unsafe predictable filename in /tmp (look at the
Bugtraq or Full-Disclosure archives, there's plenty).  Then you set a little
program that sleep()s till a few seconds before the cronjob runs, does a
getpid(), and then sprays hardlinks into the next 15 or 20 things that
mktemp() will generate...

Consider how bash implements "here" scripts:

#!/bin/bash
echo << EOF
some trash
EOF

Now let's look at the strace (snipped for brevity..)

statfs("/tmp", {f_type="EXT2_SUPER_MAGIC", f_bsize=1024, f_blocks=253871, f_bfree=213773, f_bavail=200666, f_files=65536, f_ffree=65445, f_fsid={0, 0}, f_namelen=255, f_frsize=1024}) = 0
time(NULL)  = 1107828098
open("/tmp/sh-thd-1107848098", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL|O_LARGEFILE, 0600) = 3
dup(3)  = 4
fcntl64(4, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fstat64(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d71000
_llseek(4, 0, [0], SEEK_CUR)= 0
write(4, "some trash\n", 11)= 11
close(4)= 0
munmap(0xb7d71000, 4096)= 0
open("/tmp/sh-thd-1107848098", O_RDONLY|O_LARGEFILE) = 4
close(3)= 0
unlink("/tmp/sh-thd-1107848098")= 0
fcntl64(0, F_GETFD) = 0
fcntl64(0, F_DUPFD, 10) = 10
fcntl64(0, F_GETFD) = 0
fcntl64(10, F_SETFD, FD_CLOEXEC)= 0
dup2(4, 0)  = 0
close(4)= 0

Wow - if my /tmp was on the same partition, and I'd hard-linked that
file to /etc/passwd, it would be toast now if root had run it.

You usually can't control what gets written, but often it's sufficient for the
attacker to simply get a file clobbered.



Re: [PATCH] Filesystem linking protections

2005-02-07 Thread John Richard Moser


Chris Wright wrote:
> * John Richard Moser ([EMAIL PROTECTED]) wrote:
> 
>>Yes, mkdtemp() and mkstemp().
>>
>>Of course we can't always rely on programmers to get it right, so the
>>idea here is to make sure we ask broken code to behave nicely, and stab
>>it in the face if it doesn't.  Please try to examine this in that scope.
> 
> 
> It's fine for hardened distro.  But still inappropriate for mainline.
> 

Perhaps in mainline as an option?  The [*] notations next to things are
really nice, they let you turn kernel stuff on and off :)  It's
appropriate for mainline to support added security isn't it?  I think
following the path of supporting-but-not-forcing is the best route,
because it encourages people to account for systems which may take
advantage of such options, and thus leads to a software base in which
it's quite sane to actually enable those options globally.

That's just how I think though.

> thanks,
> -chris

- --
All content of all messages exchanged herein are left in the
Public Domain, unless otherwise explicitly stated.


Re: kernel 2.6.9 failure

2005-02-07 Thread Lars Strojny

Hi,

On Mon, 2005-02-07 at 16:51 -0800, [EMAIL PROTECTED] wrote:
[...]
> On a K6-2 box the 2.6.9 kernel starts to load : "Loading" then the PC 
> resets. 
> The kernel compiled and everything installed OK. Lilo is OK.  I've tried four 
> times different configs with the same result. Box resets. My 2.4.28 kernel 
> works OK. 
> I've tried rm'ing and re-unpacking the 2.6.9 source and starting afresh.  Box 
[...]^

Is there any special reason why you don't use 2.6.10? I think it would
be a good idea to give it a try!

Greets, Lars Strojny
-- 
name: Lars Strojny           web: http://strojny.net
street: Yorckstrasse 22      blog: http://usrportage.de
city: D-71636 Ludwigsburg    mail/jabber: [EMAIL PROTECTED]
f-print: 6663 1055 543E 3106 3FD3  4F40 AC74 CD1F C327 14BD



Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)

2005-02-07 Thread David Wagner

>For those systems that have everything on one big partition, you can often
>do stuff like:
>
>ln /etc/passwd /tmp/
>
>and wait for /etc/passwd to get clobbered by a cron job run by root...

How would /etc/passwd get clobbered?  Are you thinking that a tmp
cleaner run by cron might delete /tmp/whatever (i.e., delete the hardlink
you created above)?  But deleting /tmp/whatever is safe; it doesn't affect
/etc/passwd.  I'm guessing I'm probably missing something.

Re: prezeroing V6 [2/3]: ScrubD

Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> On Mon, 7 Feb 2005, Andrew Morton wrote:
> 
> > > Look at the early posts. I plan to put that up on the web. I have some
> > > stats attached to the end of this message from an earlier post.
> >
> > But that's a patch-specific microbenchmark, isn't it?  Has this work been
> > benchmarked against real-world stuff?
> 
> No, it's a page fault benchmark. Dave Miller has done some kernel compiles
> and I have some benchmarks here that I never posted because they do not
> show any material change as far as I can see. I will be posting that soon
> when this is complete (also need to do the same for the atomic page fault
> ops and the prefaulting patch).

OK, thanks.  That's important work.  After all, this patch is a performance
optimisation.

> > > > Should we be managing the kernel threads with the kthread() API?
> > >
> > > What would you like to manage?
> >
> > Startup, perhaps binding the threads to their cpus too.
> 
> That is all already controllable in the same way as the swapper.

kswapd uses an old API.

> Each
> memory node is bound to a set of cpus. This may be controlled by the
> NUMA node configuration. F.e. for nodes without cpus.

kthread_bind() should be able to do this.  From a quick read it appears to
have shortcomings in this department (it expects to be bound to a single
CPU).

We should fix kthread_bind() so that it can accommodate the kscrub/kswapd
requirement.  That's one of the _reasons_ for using the provided
infrastructure rather than open-coding around it.

Re: [IPSEC] Move dst->child loop from dst_ifdown to xfrm_dst_ifdown

2005-02-07 Thread Herbert Xu

On Tue, Feb 08, 2005 at 12:29:29PM +1100, herbert wrote:
> 
> This one moves the dst->child processing from dst_ifdown into
> xfrm_dst_ifdown.

This patch adds a net_device argument to ifdown.  After all,
it's a bit silly to notify someone of an ifdown event without
telling them which device it was for :)

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
===== include/net/dst.h 1.25 vs edited =====
--- 1.25/include/net/dst.h  2005-02-06 14:23:59 +11:00
+++ edited/include/net/dst.h2005-02-08 12:14:10 +11:00
@@ -89,7 +89,8 @@
int (*gc)(void);
struct dst_entry *  (*check)(struct dst_entry *, __u32 cookie);
void(*destroy)(struct dst_entry *);
-   void(*ifdown)(struct dst_entry *, int how);
+   void(*ifdown)(struct dst_entry *,
+ struct net_device *dev, int how);
struct dst_entry *  (*negative_advice)(struct dst_entry *);
void(*link_failure)(struct sk_buff *);
void(*update_pmtu)(struct dst_entry *dst, u32 mtu);
===== net/core/dst.c 1.27 vs edited =====
--- 1.27/net/core/dst.c 2005-02-08 12:12:21 +11:00
+++ edited/net/core/dst.c   2005-02-08 12:15:03 +11:00
@@ -220,12 +220,14 @@
  *
  * Commented and originally written by Alexey.
  */
-static inline void dst_ifdown(struct dst_entry *dst, int unregister)
+static inline void dst_ifdown(struct dst_entry *dst, struct net_device *dev,
+ int unregister)
 {
-   struct net_device *dev = dst->dev;
-
if (dst->ops->ifdown)
-   dst->ops->ifdown(dst, unregister);
+   dst->ops->ifdown(dst, dev, unregister);
+
+   if (dev != dst->dev)
+   return;
 
if (!unregister) {
dst->input = dst_discard_in;
@@ -252,8 +254,7 @@
case NETDEV_DOWN:
spin_lock_bh(&dst_lock);
for (dst = dst_garbage_list; dst; dst = dst->next) {
-   if (dst->dev == dev)
-   dst_ifdown(dst, event != NETDEV_DOWN);
+   dst_ifdown(dst, dev, event != NETDEV_DOWN);
}
spin_unlock_bh(&dst_lock);
break;
===== net/ipv4/route.c 1.101 vs edited =====
--- 1.101/net/ipv4/route.c  2005-02-03 07:43:48 +11:00
+++ edited/net/ipv4/route.c 2005-02-08 12:14:11 +11:00
@@ -138,7 +138,8 @@
 
 static struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie);
 static void ipv4_dst_destroy(struct dst_entry *dst);
-static void ipv4_dst_ifdown(struct dst_entry *dst, int how);
+static void ipv4_dst_ifdown(struct dst_entry *dst,
+struct net_device *dev, int how);
 static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst);
 static void ipv4_link_failure(struct sk_buff *skb);
 static void ip_rt_update_pmtu(struct dst_entry *dst, u32 mtu);
@@ -1342,11 +1343,12 @@
}
 }
 
-static void ipv4_dst_ifdown(struct dst_entry *dst, int how)
+static void ipv4_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
+   int how)
 {
struct rtable *rt = (struct rtable *) dst;
struct in_device *idev = rt->idev;
-   if (idev && idev->dev != &loopback_dev) {
+   if (dev != &loopback_dev && idev && idev->dev == dev) {
struct in_device *loopback_idev = in_dev_get(&loopback_dev);
if (loopback_idev) {
rt->idev = loopback_idev;
===== net/ipv6/route.c 1.105 vs edited =====
--- 1.105/net/ipv6/route.c  2005-01-15 19:44:48 +11:00
+++ edited/net/ipv6/route.c 2005-02-08 12:14:11 +11:00
@@ -84,7 +84,8 @@
 static struct dst_entry*ip6_dst_check(struct dst_entry *dst, u32 
cookie);
 static struct dst_entry *ip6_negative_advice(struct dst_entry *);
 static voidip6_dst_destroy(struct dst_entry *);
-static voidip6_dst_ifdown(struct dst_entry *, int how);
+static voidip6_dst_ifdown(struct dst_entry *,
+  struct net_device *dev, int how);
 static int  ip6_dst_gc(void);
 
 static int ip6_pkt_discard(struct sk_buff *skb);
@@ -153,12 +154,13 @@
}   
 }
 
-static void ip6_dst_ifdown(struct dst_entry *dst, int how)
+static void ip6_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
+  int how)
 {
struct rt6_info *rt = (struct rt6_info *)dst;
struct inet6_dev *idev = rt->rt6i_idev;
 
-   if (idev != NULL && idev->dev != &loopback_dev) {
+   if (dev != &loopback_dev && idev != NULL &&

[PATCH] resend: compat ioctl for submiting URB

2005-02-07 Thread Christopher Li

Here is the resend of the patch to support the compat URB ioctls
on 64-bit systems. This version already incorporates the feedback
I got from the list; I have not received any new input since.

Change Log:
- Let usbdevfs directly handle the 32-bit URB ioctls. More specifically:
  USBDEVFS_SUBMITURB32, USBDEVFS_REAPURB32 and USBDEVFS_REAPURBNDELAY32.
  Those asynchronous ioctls are too complicated to be handled by the
  compat layer.
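
As background on why the 32-bit variants can be told apart from the native ones at all: the _IOR()/_IOW() macros fold the size of the argument type into the command number. The sketch below is illustrative only. The two structs are simplified stand-ins (no iso_frame_desc[] tail), and the *_model names are made up here, not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Stand-in for the native layout: two real pointers. */
struct urb_native_model {
	unsigned char type;
	unsigned char endpoint;
	int status;
	unsigned int flags;
	void *buffer;
	int buffer_length;
	int actual_length;
	int start_frame;
	int number_of_packets;
	int error_count;
	unsigned int signr;
	void *usercontext;
};

/* Stand-in for the 32-bit layout: pointers carried as 32-bit values. */
struct urb32_model {
	unsigned char type;
	unsigned char endpoint;
	int32_t status;
	uint32_t flags;
	uint32_t buffer;
	int32_t buffer_length;
	int32_t actual_length;
	int32_t start_frame;
	int32_t number_of_packets;
	int32_t error_count;
	uint32_t signr;
	uint32_t usercontext;
};

/* Same ('U', 10) pair, different argument type, as in the patch. */
static unsigned long submiturb_cmd_native(void)
{
	return _IOR('U', 10, struct urb_native_model);
}

static unsigned long submiturb_cmd_32(void)
{
	return _IOR('U', 10, struct urb32_model);
}
```

On a 64-bit build the two structs differ in size, so the two command values differ and usbdevfs can dispatch on them directly. On a 32-bit build the layouts coincide and so do the numbers.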

Thanks

Chris

Index: linux-2.5/include/linux/compat_ioctl.h
===
--- linux-2.5.orig/include/linux/compat_ioctl.h 2005-01-26 17:23:57.0 
-0800
+++ linux-2.5/include/linux/compat_ioctl.h  2005-02-07 15:10:54.0 
-0800
@@ -692,6 +692,9 @@
 COMPATIBLE_IOCTL(USBDEVFS_CONNECTINFO)
 COMPATIBLE_IOCTL(USBDEVFS_HUB_PORTINFO)
 COMPATIBLE_IOCTL(USBDEVFS_RESET)
+COMPATIBLE_IOCTL(USBDEVFS_SUBMITURB32)
+COMPATIBLE_IOCTL(USBDEVFS_REAPURB32)
+COMPATIBLE_IOCTL(USBDEVFS_REAPURBNDELAY32)
 COMPATIBLE_IOCTL(USBDEVFS_CLEAR_HALT)
 /* MTD */
 COMPATIBLE_IOCTL(MEMGETINFO)
Index: linux-2.5/include/linux/usbdevice_fs.h
===
--- linux-2.5.orig/include/linux/usbdevice_fs.h 2005-01-25 12:08:02.0 
-0800
+++ linux-2.5/include/linux/usbdevice_fs.h  2005-02-07 15:10:54.0 
-0800
@@ -32,6 +32,7 @@
 #define _LINUX_USBDEVICE_FS_H
 
 #include 
+#include 
 
 /* - */
 
@@ -123,6 +124,22 @@
char port [127];/* e.g. port 3 connects to device 27 */
 };
 
+struct usbdevfs_urb32 {
+   unsigned char type;
+   unsigned char endpoint;
+   compat_int_t status;
+   compat_uint_t flags;
+   compat_caddr_t buffer;
+   compat_int_t buffer_length;
+   compat_int_t actual_length;
+   compat_int_t start_frame;
+   compat_int_t number_of_packets;
+   compat_int_t error_count;
+   compat_uint_t signr;
+   compat_caddr_t usercontext; /* unused */
+   struct usbdevfs_iso_packet_desc iso_frame_desc[0];
+};
+
 #define USBDEVFS_CONTROL   _IOWR('U', 0, struct usbdevfs_ctrltransfer)
 #define USBDEVFS_BULK  _IOWR('U', 2, struct usbdevfs_bulktransfer)
 #define USBDEVFS_RESETEP   _IOR('U', 3, unsigned int)
@@ -130,9 +147,12 @@
 #define USBDEVFS_SETCONFIGURATION  _IOR('U', 5, unsigned int)
 #define USBDEVFS_GETDRIVER _IOW('U', 8, struct usbdevfs_getdriver)
 #define USBDEVFS_SUBMITURB _IOR('U', 10, struct usbdevfs_urb)
+#define USBDEVFS_SUBMITURB32   _IOR('U', 10, struct usbdevfs_urb32)
 #define USBDEVFS_DISCARDURB_IO('U', 11)
 #define USBDEVFS_REAPURB   _IOW('U', 12, void *)
+#define USBDEVFS_REAPURB32 _IOW('U', 12, u32)
 #define USBDEVFS_REAPURBNDELAY _IOW('U', 13, void *)
+#define USBDEVFS_REAPURBNDELAY32   _IOW('U', 13, u32)
 #define USBDEVFS_DISCSIGNAL_IOR('U', 14, struct 
usbdevfs_disconnectsignal)
 #define USBDEVFS_CLAIMINTERFACE_IOR('U', 15, unsigned int)
 #define USBDEVFS_RELEASEINTERFACE  _IOR('U', 16, unsigned int)
@@ -143,5 +163,4 @@
 #define USBDEVFS_CLEAR_HALT_IOR('U', 21, unsigned int)
 #define USBDEVFS_DISCONNECT_IO('U', 22)
 #define USBDEVFS_CONNECT   _IO('U', 23)
-
 #endif /* _LINUX_USBDEVICE_FS_H */
Index: linux-2.5/fs/compat_ioctl.c
===
--- linux-2.5.orig/fs/compat_ioctl.c2005-01-25 12:08:12.0 -0800
+++ linux-2.5/fs/compat_ioctl.c 2005-02-07 15:18:38.0 -0800
@@ -2570,229 +2570,11 @@
 return sys_ioctl(fd, USBDEVFS_BULK, (unsigned long)p);
 }
 
-/* This needs more work before we can enable it.  Unfortunately
- * because of the fancy asynchronous way URB status/error is written
- * back to userspace, we'll need to fiddle with USB devio internals
- * and/or reimplement entirely the frontend of it ourselves. -DaveM
- *
- * The issue is:
- *
- * When an URB is submitted via usbdevicefs it is put onto an
- * asynchronous queue.  When the URB completes, it may be reaped
- * via another ioctl.  During this reaping the status is written
- * back to userspace along with the length of the transfer.
- *
- * We must translate into 64-bit kernel types so we pass in a kernel
- * space copy of the usbdevfs_urb structure.  This would mean that we
- * must do something to deal with the async entry reaping.  First we
- * have to deal somehow with this transitory memory we've allocated.
- * This is problematic since there are many call sites from which the
- * async entries can be destroyed (and thus when we'd need to free up
- * this kernel memory).  One of which is the close() op of usbdevicefs.
- * To handle that we'd need to make our own file_operations struct which
- * overrides usbdevicefs's release op with our own which runs usbdevicefs's
- * real release op then frees up the kernel memory.
- *
- *

[IPSEC] Move dst->child loop from dst_ifdown to xfrm_dst_ifdown

2005-02-07 Thread Herbert Xu

On Sun, Feb 06, 2005 at 05:51:17PM +1100, herbert wrote:
> 
> The idea is to move the check into dst->ops->ifdown.  By definition
> ipv6_dst_ifdown will only see rt6_info entries.  So dst_dev_event
> will become

Here are the patches to do this.  Do they look sane?

This one moves the dst->child processing from dst_ifdown into
xfrm_dst_ifdown.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
===== net/core/dst.c 1.26 vs edited =====
--- 1.26/net/core/dst.c 2005-02-06 14:23:59 +11:00
+++ edited/net/core/dst.c   2005-02-08 12:11:39 +11:00
@@ -220,31 +220,26 @@
  *
  * Commented and originally written by Alexey.
  */
-static void dst_ifdown(struct dst_entry *dst, int unregister)
+static inline void dst_ifdown(struct dst_entry *dst, int unregister)
 {
struct net_device *dev = dst->dev;
 
+   if (dst->ops->ifdown)
+   dst->ops->ifdown(dst, unregister);
+
if (!unregister) {
dst->input = dst_discard_in;
dst->output = dst_discard_out;
-   }
-
-   do {
-   if (unregister) {
-   dst->dev = &loopback_dev;
-   dev_hold(&loopback_dev);
+   } else {
+   dst->dev = &loopback_dev;
+   dev_hold(&loopback_dev);
+   dev_put(dev);
+   if (dst->neighbour && dst->neighbour->dev == dev) {
+   dst->neighbour->dev = &loopback_dev;
dev_put(dev);
-   if (dst->neighbour && dst->neighbour->dev == dev) {
-   dst->neighbour->dev = &loopback_dev;
-   dev_put(dev);
-   dev_hold(&loopback_dev);
-   }
+   dev_hold(&loopback_dev);
}
-
-   if (dst->ops->ifdown)
-   dst->ops->ifdown(dst, unregister);
-   } while ((dst = dst->child) && dst->flags & DST_NOHASH &&
-dst->dev == dev);
+   }
 }
 
 static int dst_dev_event(struct notifier_block *this, unsigned long event, 
void *ptr)
===== net/xfrm/xfrm_policy.c 1.63 vs edited =====
--- 1.63/net/xfrm/xfrm_policy.c 2005-01-19 07:08:19 +11:00
+++ edited/net/xfrm/xfrm_policy.c   2005-02-08 12:10:47 +11:00
@@ -1027,6 +1027,20 @@
dst->xfrm = NULL;
 }
 
+static void xfrm_dst_ifdown(struct dst_entry *dst, int unregister)
+{
+   struct net_device *dev = dst->dev;
+
+   if (!unregister)
+   return;
+
+   while ((dst = dst->child) && dst->xfrm && dst->dev == dev) {
+   dst->dev = &loopback_dev;
+   dev_hold(&loopback_dev);
+   dev_put(dev);
+   }
+}
+
 static void xfrm_link_failure(struct sk_buff *skb)
 {
/* Impossible. Such dst must be popped before reaches point of failure. 
*/
@@ -1150,6 +1164,8 @@
dst_ops->check = xfrm_dst_check;
if (likely(dst_ops->destroy == NULL))
dst_ops->destroy = xfrm_dst_destroy;
+   if (likely(dst_ops->ifdown == NULL))
+   dst_ops->ifdown = xfrm_dst_ifdown;
if (likely(dst_ops->negative_advice == NULL))
dst_ops->negative_advice = xfrm_negative_advice;
if (likely(dst_ops->link_failure == NULL))
@@ -1181,6 +1197,7 @@
dst_ops->kmem_cachep = NULL;
dst_ops->check = NULL;
dst_ops->destroy = NULL;
+   dst_ops->ifdown = NULL;
dst_ops->negative_advice = NULL;
dst_ops->link_failure = NULL;
dst_ops->get_mss = NULL;

Re: prezeroing V6 [2/3]: ScrubD

2005-02-07 Thread Christoph Lameter

On Mon, 7 Feb 2005, Andrew Morton wrote:

> > Look at the early posts. I plan to put that up on the web. I have some
> > stats attached to the end of this message from an earlier post.
>
> But that's a patch-specific microbenchmark, isn't it?  Has this work been
> benchmarked against real-world stuff?

No, it's a page fault benchmark. Dave Miller has done some kernel compiles
and I have some benchmarks here that I never posted because they do not
show any material change as far as I can see. I will be posting that soon
when this is complete (also need to do the same for the atomic page fault
ops and the prefaulting patch).

> > > Should we be managing the kernel threads with the kthread() API?
> >
> > What would you like to manage?
>
> Startup, perhaps binding the threads to their cpus too.

That is all already controllable in the same way as the swapper. Each
memory node is bound to a set of cpus. This may be controlled by the
NUMA node configuration. F.e. for nodes without cpus.

Re: out-of-line x86 "put_user()" implementation

2005-02-07 Thread Linus Torvalds



On Mon, 7 Feb 2005, Ingo Molnar wrote:
>
> boots fine and shrinks the image size quite noticeably:
> 
>   [Nr] Name   Type      Addr  Off     Size
>   [ 1] .text  PROGBITS  c010  001000  2771a9   [vmlinux-orig]
>   [ 1] .text  PROGBITS  c010  001000  2742dd   [vmlinux-patched]
> 
> that's 11980 bytes off a 2585001 bytes .text, a 0.5% size reduction.
> This patch we want ...
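
The figures quoted above can be checked directly from the two hex Size values. A small sketch of the arithmetic (the helper names are made up for illustration):

```c
/* Bytes of .text removed by the patch. */
static unsigned long text_saved(unsigned long before, unsigned long after)
{
	return before - after;
}

/* The same saving as a percentage of the original .text size. */
static double text_saved_percent(unsigned long before, unsigned long after)
{
	return 100.0 * (double)(before - after) / (double)before;
}
```

Called with 0x2771a9 (2585001) and 0x2742dd, text_saved() gives 11980 and text_saved_percent() gives about 0.46, which rounds to the 0.5% quoted.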

Goodie. Here's a slightly more recent version, where I cleaned up the
assembly code (no need to save %ecx if we just update %ebx instead, which
makes the code a bit more readable too - and doing it this way should
hopefully make it easier for an out-of-order CPU to start the memops
earlier too. Who knows..)

I'm not going to put this into 2.6.11, since I worry about compiler 
interactions, but the more people who test it anyway, the better.

Linus

-
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2005/02/07 08:14:28-08:00 [EMAIL PROTECTED] 
#   x86: make "put_user()" be out-of-line
#   
#   It's really too big to be inlined.
#   
#   Ingo tests and reports: this shrinks his kernel text size by
#   about 12kB (roughly 0.5%)
# 
# arch/i386/lib/putuser.S
#   2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +86 -0
# 
# include/asm-i386/uaccess.h
#   2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +27 -3
#   x86: make "put_user()" be out-of-line
# 
# arch/i386/lib/putuser.S
#   2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +0 -0
#   BitKeeper file /home/torvalds/v2.6/linux/arch/i386/lib/putuser.S
# 
# arch/i386/lib/Makefile
#   2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +1 -1
#   x86: make "put_user()" be out-of-line
# 
# arch/i386/kernel/i386_ksyms.c
#   2005/02/07 08:14:17-08:00 [EMAIL PROTECTED] +5 -0
#   x86: make "put_user()" be out-of-line
# 
diff -Nru a/arch/i386/kernel/i386_ksyms.c b/arch/i386/kernel/i386_ksyms.c
--- a/arch/i386/kernel/i386_ksyms.c 2005-02-07 17:16:32 -08:00
+++ b/arch/i386/kernel/i386_ksyms.c 2005-02-07 17:16:32 -08:00
@@ -97,6 +97,11 @@
 EXPORT_SYMBOL(__get_user_2);
 EXPORT_SYMBOL(__get_user_4);
 
+EXPORT_SYMBOL(__put_user_1);
+EXPORT_SYMBOL(__put_user_2);
+EXPORT_SYMBOL(__put_user_4);
+EXPORT_SYMBOL(__put_user_8);
+
 EXPORT_SYMBOL(strpbrk);
 EXPORT_SYMBOL(strstr);
 
diff -Nru a/arch/i386/lib/Makefile b/arch/i386/lib/Makefile
--- a/arch/i386/lib/Makefile2005-02-07 17:16:32 -08:00
+++ b/arch/i386/lib/Makefile2005-02-07 17:16:32 -08:00
@@ -3,7 +3,7 @@
 #
 
 
-lib-y = checksum.o delay.o usercopy.o getuser.o memcpy.o strstr.o \
+lib-y = checksum.o delay.o usercopy.o getuser.o putuser.o memcpy.o strstr.o \
bitops.o
 
 lib-$(CONFIG_X86_USE_3DNOW) += mmx.o
diff -Nru a/arch/i386/lib/putuser.S b/arch/i386/lib/putuser.S
--- /dev/null   Wed Dec 31 16:00:00 196900
+++ b/arch/i386/lib/putuser.S   2005-02-07 17:16:32 -08:00
@@ -0,0 +1,86 @@
+/*
+ * __put_user functions.
+ *
+ * (C) Copyright 2005 Linus Torvalds
+ *
+ * These functions have a non-standard call interface
+ * to make them more efficient, especially as they
+ * return an error value in addition to the "real"
+ * return value.
+ */
+#include 
+
+
+/*
+ * __put_user_X
+ *
+ * Inputs: %eax[:%edx] contains the data
+ * %ecx contains the address
+ *
+ * Outputs:%eax is error code (0 or -EFAULT)
+ *
+ * These functions should not modify any other registers,
+ * as they get called from within inline assembly.
+ */
+
+#define ENTER  pushl %ebx ; GET_THREAD_INFO(%ebx)
+#define EXIT   popl %ebx ; ret
+
+.text
+.align 4
+.globl __put_user_1
+__put_user_1:
+   ENTER
+   cmpl TI_addr_limit(%ebx),%ecx
+   jae bad_put_user
+1: movb %al,(%ecx)
+   xorl %eax,%eax
+   EXIT
+
+.align 4
+.globl __put_user_2
+__put_user_2:
+   ENTER
+   movl TI_addr_limit(%ebx),%ebx
+   subl $1,%ebx
+   cmpl %ebx,%ecx
+   jae bad_put_user
+2: movw %ax,(%ecx)
+   xorl %eax,%eax
+   EXIT
+
+.align 4
+.globl __put_user_4
+__put_user_4:
+   ENTER
+   movl TI_addr_limit(%ebx),%ebx
+   subl $3,%ebx
+   cmpl %ebx,%ecx
+   jae bad_put_user
+3: movl %eax,(%ecx)
+   xorl %eax,%eax
+   EXIT
+
+.align 4
+.globl __put_user_8
+__put_user_8:
+   ENTER
+   movl TI_addr_limit(%ebx),%ebx
+   subl $7,%ebx
+   cmpl %ebx,%ecx
+   jae bad_put_user
+3: movl %eax,(%ecx)
+4: movl %edx,4(%ecx)
+   xorl %eax,%eax
+   EXIT
+
+bad_put_user:
+   movl $-14,%eax
+   EXIT
+
+.section __ex_table,"a"
+   .long 1b,bad_put_user
+   .long 2b,bad_put_user
+   .long 3b,bad_put_user
+   .long 4b,bad_put_user
+.previous
diff -Nru a/include/asm-i386/uaccess.h b/include/asm-i386/uaccess.h
--- a/include/asm-i386/uaccess.h2005-02-07 17:16:32 -08:00
+++ b/include/asm-i386/uaccess.h2005-02-07 17:16:32 -08:00
@@ -185,6 +185,21 @@
 
 extern void __put_user_bad(void);
 
+/*
+ * Strange magic calling convention: pointer in %ecx,
+ * value in %eax(:%edx), return value in %eax, no clobbers.
+ */
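
The address check that each __put_user_N routine in putuser.S performs before the store can be modelled in plain C as below. This is only a sketch of the compare-and-branch logic. It assumes 32-bit addresses and a limit that is at least size - 1, the function name is made up, and it does not model the exception-table fixup that catches faults on unmapped pages.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Mirrors "movl TI_addr_limit,%ebx ; subl $(size-1),%ebx ;
 * cmpl %ebx,%ecx ; jae bad_put_user": the store is allowed only
 * if addr is below limit - (size - 1).
 */
static int put_user_check(uint32_t addr, uint32_t limit, uint32_t size)
{
	if (addr >= limit - (size - 1))
		return -EFAULT;
	return 0;
}
```

Subtracting from the limit, rather than adding size to the address, means the comparison cannot be fooled by an address near 0xffffffff wrapping around to a small value.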

Re: [PATCH] hot-swapping support for PSX controllers

2005-02-07 Thread Peter Nelson

Eric Piel wrote:
> Note that this is a re-send of a previous patch now that the patch of
> Peter (which had to be applied before this one) has been integrated
> in the vanilla kernel. It's Peter's version modified to apply cleanly
> against 2.6.11-rc3 plus a fix in the comment.

I was actually just about to re-post this patch.  I've tested it and it
works for me, plus it saves a few bytes of kernel memory by fixing the
array sizes.

-Peter
--
Fixes hotplug support for PSX controllers and some mis-sized arrays.
Signed-off-by: Eric Piel <[EMAIL PROTECTED]>
Signed-off-by: Peter Nelson <[EMAIL PROTECTED]>

Re: prezeroing V6 [2/3]: ScrubD

Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> > What were the benchmarking results for this work?  I think you had some,
> > but this is pretty vital info, so it should be retained in the changelogs.
> 
> Look at the early posts. I plan to put that up on the web. I have some
> stats attached to the end of this message from an earlier post.

But that's a patch-specific microbenchmark, isn't it?  Has this work been
benchmarked against real-world stuff?

> > Should we be managing the kernel threads with the kthread() API?
> 
> What would you like to manage?

Startup, perhaps binding the threads to their cpus too.

Re: 2.6.11-rc3: Kylix application no longer works?

2005-02-07 Thread Grzegorz Kulewski

On Mon, 7 Feb 2005, Andrew Morton wrote:
> Daniel Drake <[EMAIL PROTECTED]> wrote:
> > > # fs/binfmt_elf.c
> > > #   2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19
> > > #   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> > > #
> >
> > I think so. For a short period we applied this patch to the Gentoo 2.6.10
> > kernel...
> > http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch
> > ...but removed it once users complained it stopped kylix binaries from running.
>
> Bah.  That's what happens when you fix stuff.
>
> What's kylix?  The Borland C++ builder thing?

Rather a Delphi (== Object Pascal) thing.

> How should one set about reproducing this problem?

IIRC, a minimal "personal" version can be downloaded from borland.com.

Grzegorz Kulewski

kernel 2.6.9 failure

2005-02-07 Thread gl34


Hi all,
 
On a K6-2 box the 2.6.9 kernel starts to load: "Loading", then the PC resets.
The kernel compiled and everything installed OK. Lilo is OK.  I've tried four
times with different configs with the same result. Box resets. My 2.4.28 kernel
works OK.
I've tried rm'ing and re-unpacking the 2.6.9 source and starting afresh.  Box
resets.

The only clue, if that's what it is, is when I tried to upgrade module-init-tools
and quota-tools I got an error, can't find ../asm-generic/errno.h. True enough,
there's no ../asm-generic dir in the includes. The closest is ../mach-generic.
And there *is* an errno.h in the include files. So I just made an ../asm-generic
dir and put a copy of errno.h in it. No luck.
 
-Gil






Re: prezeroing V6 [2/3]: ScrubD

2005-02-07 Thread Christoph Lameter

On Mon, 7 Feb 2005, Andrew Morton wrote:

> Christoph Lameter <[EMAIL PROTECTED]> wrote:
> >
> > Adds management of ZEROED and NOT_ZEROED pages and a background daemon
> > called scrubd.
>
> What were the benchmarking results for this work?  I think you had some,
> but this is pretty vital info, so it should be retained in the changelogs.

Look at the early posts. I plan to put that up on the web. I have some
stats attached to the end of this message from an earlier post.

> Having one kscrubd per node seems like the right thing to do.

Yes that is what is happening. Otherwise our NUMA stuff would not work
right ;-)

> Should we be managing the kernel threads with the kthread() API?

What would you like to manage?

-- Earlier post
The scrub daemon is invoked when an unzeroed page of a certain order has
been generated, so that it's worth running it. If no higher-order pages are
present then the logic will favor hot zeroing rather than simply shifting
processing around. kscrubd typically runs only for a fraction of a second
and sleeps for long periods of time, even under memory benchmarking. kscrubd
performs short bursts of zeroing when needed and tries to stay off the
processor as much as possible.

The result is a significant increase in page fault performance, even for
single-threaded applications (i386 2x PIII-450 384M RAM allocating 256M in
each run):

w/o patch:
 Gb Rep Threads   User    System   Wall     flt/cpu/s   fault/wsec
  0   1       1   0.006s  0.389s   0.039s  157455.320  157070.694
  0   1       2   0.007s  0.607s   0.032s  101476.689  190350.885

w/patch
 Gb Rep Threads   User    System   Wall     flt/cpu/s   fault/wsec
  0   1       1   0.008s  0.083s   0.009s  672151.422  664045.899
  0   1       2   0.005s  0.129s   0.008s  459629.796  741857.373

The performance can only be upheld if enough zeroed pages are available.
In a heavily memory-intensive benchmark the system may run out of these very
fast, but the efficient algorithm for page zeroing still makes this a winner
(2-way system with 384MB RAM, no hardware zeroing support). In the following
measurement the test is repeated 10 times, allocating 256M each in rapid
succession, which depletes the pool of zeroed pages quickly:

w/o patch:
 Gb Rep Threads   User    System   Wall     flt/cpu/s   fault/wsec
  0  10       1   0.058s  3.913s   3.097s  157335.774  157076.932
  0  10       2   0.063s  6.139s   3.027s  100756.788  190572.486

w/patch
 Gb Rep Threads   User    System   Wall     flt/cpu/s   fault/wsec
  0  10       1   0.059s  1.828s   1.089s  330913.517  330225.515
  0  10       2   0.082s  1.951s   1.094s  307172.100  320680.232
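
To put a number on "significant", the single-threaded flt/cpu/s figures from the two measurements above can be turned into ratios. A trivial sketch of the arithmetic (the helper name is made up for illustration):

```c
/* Ratio of patched to unpatched fault rate; above 1.0 the patch is faster. */
static double speedup(double faults_per_sec_before, double faults_per_sec_after)
{
	return faults_per_sec_after / faults_per_sec_before;
}
```

For the single run, speedup(157455.320, 672151.422) comes out at roughly 4.3. For the ten-repetition run that drains the zeroed-page pool, speedup(157335.774, 330913.517) is roughly 2.1.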

Note that zeroing of pages makes no sense if the application
touches all cache lines of a page allocated (there is no influence of
prezeroing on benchmarks like lmbench for that reason) since the extensive
caching of modern cpus means that the zeroes written to a hot zeroed page
will then be overwritten by the application in the cpu cache and thus
the zeros will never make it to memory! The test program used above only
touches one 128 byte cache line of a 16k page (ia64). Sparsely
populated and accessed areas are typical for lots of applications.

Here is another test in order to gauge the influence of the number of cache
lines touched on the performance of the prezero enhancements:

 Gb Rep Thr CLine  User   System  Wall    flt/cpu/s   fault/wsec
  1   1   1     1  0.01s  0.12s   0.01s  500813.853  497925.891
  1   1   1     2  0.01s  0.11s   0.01s  493453.103  472877.725
  1   1   1     4  0.02s  0.10s   0.01s  479351.658  471507.415
  1   1   1     8  0.01s  0.13s   0.01s  424742.054  416725.013
  1   1   1    16  0.05s  0.12s   0.01s  347715.359  336983.834
  1   1   1    32  0.12s  0.13s   0.02s  258112.286  256246.731
  1   1   1    64  0.24s  0.14s   0.03s  169896.381  168189.283
  1   1   1   128  0.49s  0.14s   0.06s  102300.257  101674.435

The benefits of prezeroing are reduced to minimal quantities if all
cachelines of a page are touched. Prezeroing can only be effective
if the whole page is not immediately used after the page fault.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] pci_raw_ops should use unsigned args

On Thu, Feb 03, 2005 at 12:08:05PM -0700, Bjorn Helgaas wrote:
> Convert pci_raw_ops to use unsigned segment (aka domain),
> bus, and devfn.  With the previous code, various ia64 config
> accesses fail due to segment sign-extension problems.
> 
> ia64:
> - With a signed seg >= 0x8, unwanted sign-extension occurs when
>   "seg << 28" is cast to u64 in PCI_SAL_EXT_ADDRESS()
> - PCI_SAL_EXT_ADDRESS(): cast to u64 *before* shifting; otherwise
>   "seg << 28" is evaluated as unsigned int (32 bits) and gets
>   truncated when seg > 0xf
> - pci_sal_read(): validate "value" ptr as other arches do
> - pci_sal_{read,write}(): return -EINVAL rather than SAL error status
> 
>  arch/i386/pci/direct.c     |   12 ++
>  arch/i386/pci/mmconfig.c   |    6 +++--
>  arch/i386/pci/numa.c       |    6 +++--
>  arch/i386/pci/pcbios.c     |    6 +++--
>  arch/ia64/pci/pci.c        |   53 ++---
>  arch/x86_64/pci/mmconfig.c |    8 --
>  include/linux/pci.h        |    6 +++--
>  7 files changed, 51 insertions(+), 46 deletions(-)
> 
> Signed-off-by: Bjorn Helgaas <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
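
The two ia64 problems named in the quoted changelog can be reproduced in
plain userspace C. The sketch below models only the "seg << 28" term and is
not the kernel macro itself; the function names are hypothetical. The signed
case is written with explicit conversions so that it avoids undefined
behaviour, and it assumes the usual two's complement conversion to int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Signed 32-bit seg: bit 31 of the shifted value is sign-extended to 64 bits. */
static uint64_t seg_term_signed(int32_t seg)
{
    /* Assumes two's complement wrap on the uint32_t -> int32_t conversion. */
    int32_t shifted = (int32_t)((uint32_t)seg << 28);

    return (uint64_t)(int64_t)shifted;
}

/* Unsigned seg, but shifted as a 32-bit value: bits above 31 are lost. */
static uint64_t seg_term_unsigned32(uint32_t seg)
{
    return (uint64_t)(seg << 28);
}

/* Cast to 64 bits before shifting: no sign extension, no truncation. */
static uint64_t seg_term_fixed(uint32_t seg)
{
    return (uint64_t)seg << 28;
}

/* Example values for the cases mentioned in the changelog. */
static void seg_term_examples(void)
{
    assert(seg_term_signed(0x8)      == 0xFFFFFFFF80000000ULL);
    assert(seg_term_unsigned32(0x10) == 0);
    assert(seg_term_fixed(0x8)       == 0x80000000ULL);
    assert(seg_term_fixed(0x10)      == 0x100000000ULL);
}
```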

Re: [PATCH] PCI: fix pci_remove_legacy_files() crash

On Fri, Feb 04, 2005 at 12:28:36PM +0900, MUNEDA Takahiro wrote:
> Hi,
> 
> The legacy_io which is the member of pci_bus struct might be
> NULL. It should be checked.
> 
> This patch checks 'b->legacy_io', NULL or not.
> 
> Signed-off-by: MUNEDA Takahiro <[EMAIL PROTECTED]>

Applied, thanks.

greg k-h
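
For readers without the patch at hand, the shape of the fix is roughly the
following userspace model. The struct and function names are stand-ins, not
the kernel's definitions; only the idea of testing legacy_io before using it
comes from the message above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the relevant part of struct pci_bus. */
struct bus_model {
    int *legacy_io;     /* NULL if the legacy files were never created */
};

/* Returns 1 if something was torn down, 0 if there was nothing to do. */
static int remove_legacy_files_model(struct bus_model *b)
{
    assert(b != NULL);

    if (!b->legacy_io)
        return 0;       /* the check the patch adds */

    /* The kernel code would use the pointer here before freeing it. */
    free(b->legacy_io);
    b->legacy_io = NULL;
    return 1;
}
```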

Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

2005-02-07 Thread Paul Jackson

Andrew wrote:
> OK, I'll add cpusets to the 2.6.12 queue.

I'd like that ;).

Thank-you, Matthew, for the work you put into making sense of this.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401

[PATCH] convert /proc/driver/rtc to seq_file.

2005-02-07 Thread Stephen Hemminger

The /proc/driver/rtc interface didn't have any module owner hook.
The simplest fix is to just convert this to the single version of seq_file.
Also, fix initialization of rtc_dev to use C99 form.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
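
The diff below replaces "p += sprintf(p, ...)" bookkeeping with calls that
append to a seq_file. As a rough userspace illustration of why that is
simpler, here is a bounded append helper plus the alarm formatting from the
patch. struct seq_model and both functions are hypothetical stand-ins, not
the kernel's seq_file API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct seq_file. */
struct seq_model {
    char   buf[128];
    size_t count;       /* bytes used so far, always < sizeof(buf) */
};

/* Append formatted text; returns 0 on success, -1 if it did not fit. */
static int seq_model_printf(struct seq_model *m, const char *fmt, ...)
{
    va_list ap;
    size_t room;
    int n;

    assert(m->count < sizeof(m->buf));
    room = sizeof(m->buf) - m->count;

    va_start(ap, fmt);
    n = vsnprintf(m->buf + m->count, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        m->buf[m->count] = '\0';    /* drop the partial write */
        return -1;
    }
    m->count += (size_t)n;
    return 0;
}

/* Same field checks as the patch: out-of-range fields print "**". */
static void alarm_show_model(struct seq_model *m, int hour, int min, int sec)
{
    seq_model_printf(m, "alarm\t\t: ");
    if (hour <= 24)
        seq_model_printf(m, "%02d:", hour);
    else
        seq_model_printf(m, "**:");
    if (min <= 59)
        seq_model_printf(m, "%02d:", min);
    else
        seq_model_printf(m, "**:");
    if (sec <= 59)
        seq_model_printf(m, "%02d\n", sec);
    else
        seq_model_printf(m, "**\n");
}
```

The caller never handles a raw output pointer, so there is no pointer
arithmetic to get wrong and an overlong write is rejected instead of
running past the buffer.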


diff -Nru a/drivers/char/rtc.c b/drivers/char/rtc.c
--- a/drivers/char/rtc.c 2005-02-07 16:08:10 -08:00
+++ b/drivers/char/rtc.c 2005-02-07 16:08:10 -08:00
@@ -73,6 +73,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -151,8 +152,7 @@
 static void mask_rtc_irq_bit(unsigned char bit);
 #endif
 
-static int rtc_read_proc(char *page, char **start, off_t off,
- int count, int *eof, void *data);
+static int rtc_proc_open(struct inode *inode, struct file *file);
 
 /*
  * Bits in rtc_status. (6 bits of room for future expansion)
@@ -871,11 +871,18 @@
.fasync = rtc_fasync,
 };
 
-static struct miscdevice rtc_dev=
-{
-   RTC_MINOR,
-   "rtc",
-   &rtc_fops
+static struct miscdevice rtc_dev = {
+   .minor  = RTC_MINOR,
+   .name   = "rtc",
+   .fops   = &rtc_fops,
+};
+
+static struct file_operations rtc_proc_fops = {
+   .owner = THIS_MODULE,
+   .open = rtc_proc_open,
+   .read  = seq_read,
+   .llseek = seq_lseek,
+   .release = single_release,
 };
 
 #if defined(RTC_IRQ) && !defined(__sparc__)
@@ -884,6 +891,7 @@
 
 static int __init rtc_init(void)
 {
+   struct proc_dir_entry *ent;
 #if defined(__alpha__) || defined(__mips__)
unsigned int year, ctrl;
unsigned long uip_watchdog;
@@ -974,7 +982,9 @@
release_region(RTC_PORT(0), RTC_IO_EXTENT);
return -ENODEV;
}
-   if (!create_proc_read_entry ("driver/rtc", 0, NULL, rtc_read_proc, NULL)) {
+
+   ent = create_proc_entry("driver/rtc", 0, NULL);
+   if (!ent) {
 #ifdef RTC_IRQ
free_irq(RTC_IRQ, NULL);
 #endif
@@ -982,6 +992,7 @@
misc_deregister(&rtc_dev);
return -ENOMEM;
}
+   ent->proc_fops = &rtc_proc_fops;
 
 #if defined(__alpha__) || defined(__mips__)
rtc_freq = HZ;
@@ -1119,11 +1130,10 @@
  * Info exported via "/proc/driver/rtc".
  */
 
-static int rtc_proc_output (char *buf)
+static int rtc_proc_show(struct seq_file *seq, void *v)
 {
 #define YN(bit) ((ctrl & bit) ? "yes" : "no")
 #define NY(bit) ((ctrl & bit) ? "no" : "yes")
-   char *p;
struct rtc_time tm;
unsigned char batt, ctrl;
unsigned long freq;
@@ -1134,7 +1144,6 @@
freq = rtc_freq;
spin_unlock_irq(&rtc_lock);
 
-   p = buf;
 
rtc_get_rtc_time(&tm);
 
@@ -1142,12 +1151,12 @@
 * There is no way to tell if the luser has the RTC set for local
 * time or for Universal Standard Time (GMT). Probably local though.
 */
-   p += sprintf(p,
-"rtc_time\t: %02d:%02d:%02d\n"
-"rtc_date\t: %04d-%02d-%02d\n"
-"rtc_epoch\t: %04lu\n",
-tm.tm_hour, tm.tm_min, tm.tm_sec,
-tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday, epoch);
+   seq_printf(seq, 
+  "rtc_time\t: %02d:%02d:%02d\n"
+  "rtc_date\t: %04d-%02d-%02d\n"
+  "rtc_epoch\t: %04lu\n",
+  tm.tm_hour, tm.tm_min, tm.tm_sec,
+  tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday, epoch);
 
get_rtc_alm_time(&tm);
 
@@ -1156,57 +1165,50 @@
 * match any value for that particular field. Values that are
 * greater than a valid time, but less than 0xc0 shouldn't appear.
 */
-   p += sprintf(p, "alarm\t\t: ");
+   seq_puts(seq, "alarm\t\t: ");
if (tm.tm_hour <= 24)
-   p += sprintf(p, "%02d:", tm.tm_hour);
+   seq_printf(seq, "%02d:", tm.tm_hour);
else
-   p += sprintf(p, "**:");
+   seq_puts(seq, "**:");
 
if (tm.tm_min <= 59)
-   p += sprintf(p, "%02d:", tm.tm_min);
+   seq_printf(seq, "%02d:", tm.tm_min);
else
-   p += sprintf(p, "**:");
+   seq_puts(seq, "**:");
 
if (tm.tm_sec <= 59)
-   p += sprintf(p, "%02d\n", tm.tm_sec);
+   seq_printf(seq, "%02d\n", tm.tm_sec);
else
-   p += sprintf(p, "**\n");
+   seq_puts(seq, "**\n");
 
-   p += sprintf(p,
-"DST_enable\t: %s\n"
-"BCD\t\t: %s\n"
-"24hr\t\t: %s\n"
-"square_wave\t: %s\n"
-"alarm_IRQ\t: %s\n"
-"update_IRQ\t: %s\n"
-"periodic_IRQ\t: %s\n"
-"periodic_freq\t: %ld\n"
-"batt_status\t: %s\n",
-YN(RTC_DST_EN),
-NY(RTC_DM_BINARY),
-YN(RTC_24H),
-

RE: BIOS Bug

2005-02-07 Thread Aleksey Gorelov

>-Original Message-
>From: [EMAIL PROTECTED] 
>[mailto:[EMAIL PROTECTED] On Behalf Of Enrico Bartky
>Sent: Monday, February 07, 2005 7:12 AM
>To: linux-kernel@vger.kernel.org
>Subject: BIOS Bug
>
>Hello,
>
>on my notebook, when I plugged in my USB keyboard the kernel 
>doesnt boot correctly, ...
>
>... 
>BIOS hangoff failed ( 112, 1010001 )
>continuing after BIOS bug
>irq 192, pci mem 0xfebff000
>new usb device registered, assigned bus number 1
>...
>
>then the notebook hangs. If I boot without the plugged 
>keyboard and plug in when the kernel is ready, there are no 
>problems. I have a SiS USB chipset.
>
>Can you help me?

What kernel version are you using ?
Try 2.6.10 with the following command line parameter:
usb-handoff

Aleks.

>
>Thanx, EnricoB

Re: prezeroing V6 [2/3]: ScrubD

Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> Adds management of ZEROED and NOT_ZEROED pages and a background daemon
> called scrubd.

What were the benchmarking results for this work?  I think you had some,
but this is pretty vital info, so it should be retained in the changelogs.

Having one kscrubd per node seems like the right thing to do.

Should we be managing the kernel threads with the kthread() API?

Re: [PATCH] PCI Hotplug: remove incorrect rpaphp firmware dependency

2005-02-07 Thread John Rose

> Er, use the result of the get_children_props() call only if it _failed_?
> I suspect that wasn't your intention. This makes my G5 boot again:

Here's an alternate fix for the ppc64 crash during boot.  This corrects
the offending function to use more conventional error codes.  I'll
follow up with return code cleanups for the entire module, and for RTAS
code, since these are probably too big for 2.6.11.

Please apply, if appropriate.

Thanks-
John

Signed-off-by: John Rose <[EMAIL PROTECTED]>

diff -puN drivers/pci/hotplug/rpaphp_core.c~01_rpaphp_is_php_fix drivers/pci/hotplug/rpaphp_core.c
--- 2_6_linus/drivers/pci/hotplug/rpaphp_core.c~01_rpaphp_is_php_fix 2005-02-07 18:06:29.0 -0600
+++ 2_6_linus-johnrose/drivers/pci/hotplug/rpaphp_core.c 2005-02-07 18:10:15.0 -0600
@@ -224,7 +224,7 @@ static int get_children_props(struct dev
 
if (!indexes || !names || !types || !domains) {
/* Slot does not have dynamically-removable children */
-   return 1;
+   return -EINVAL;
}
if (drc_indexes)
*drc_indexes = indexes;
@@ -260,7 +260,7 @@ int rpaphp_get_drc_props(struct device_n
}
 
rc = get_children_props(dn->parent, &indexes, &names, &types, &domains);
-   if (rc) {
+   if (rc < 0) {
return 1;
}
 
@@ -307,7 +307,7 @@ static int is_php_dn(struct device_node 
int rc;
 
rc = get_children_props(dn, indexes, names, &drc_types, power_domains);
-   if (rc) {
+   if (rc >= 0) {
if (is_php_type((char *) &drc_types[1])) {
*types = drc_types;
return 1;
@@ -331,7 +331,7 @@ static int is_dr_dn(struct device_node *
 
rc = get_children_props(dn->parent, indexes, names, types,
power_domains);
-   return (rc == 0);
+   return (rc >= 0);
 }
 
 static inline int is_vdevice_root(struct device_node *dn)
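
The return-code convention the patch moves to can be shown in a small
userspace model. Everything here is a simplified stand-in (one property
instead of four, hypothetical names); the only things taken from the patch
are "-EINVAL on failure" and callers testing "rc < 0" or "rc >= 0".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a device node with one optional property. */
struct node_model {
    const char *types;      /* NULL: no dynamically-removable children */
};

/* Old convention: 0 on success, 1 on failure. */
static int get_props_old(const struct node_model *dn, const char **types)
{
    if (!dn->types)
        return 1;
    if (types)
        *types = dn->types;
    return 0;
}

/* New convention: 0 on success, negative errno on failure. */
static int get_props_new(const struct node_model *dn, const char **types)
{
    if (!dn->types)
        return -EINVAL;
    if (types)
        *types = dn->types;
    return 0;
}

/* Old caller: "if (rc)" is true only when the lookup FAILED. */
static int caller_uses_props_old(const struct node_model *dn)
{
    return get_props_old(dn, NULL) ? 1 : 0;
}

/* New caller: the properties are used only when the lookup succeeded. */
static int caller_uses_props_new(const struct node_model *dn)
{
    return get_props_new(dn, NULL) >= 0;
}
```

With a node that has no properties, the old caller reports that it would use
them and the new caller does not, which is the inversion pointed out in the
quoted remark at the top of this message.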

_


Merging the Suspend2 freezer implementation.

2005-02-07 Thread Nigel Cunningham

Hi Pavel.

I'm keen to see if we can merge Suspend2's freezer implementation after
2.6.11. Does that conflict with any of your intended changes? If it
doesn't, I'll submit a patch for review/merge as quickly as I can.

The main change involves the introduction of a new SYNCTHREAD flag. We
use this to avoid deadlocking over processes that are running sys_sync
and siblings. Processes that enter those routines get the flag added,
and it's removed when they exit the sync routine. We then freeze in four
stages: 

1) Freeze user space threads without SYNCTHREAD set;
2) Freeze user space threads with SYNCTHREAD set;
3) Run our own sys_sync in case no one else was syncing;
4) Freeze kernel space threads without NOFREEZE set.
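
The four stages above can be sketched as a classification of tasks. This is
only a model of the ordering: struct task_model and freeze_stage are made-up
names, and treating NOFREEZE as relevant to kernel threads only is an
assumption read off stage 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task view of the flags mentioned in the list above. */
struct task_model {
    bool kernel_thread;
    bool syncthread;    /* currently inside sys_sync or a sibling */
    bool nofreeze;
};

/*
 * Returns the stage (1, 2 or 4) in which the task would be frozen, or 0
 * if it is never frozen. Stage 3 is the freezer's own sys_sync and does
 * not freeze any task.
 */
static int freeze_stage(const struct task_model *t)
{
    if (!t->kernel_thread)
        return t->syncthread ? 2 : 1;
    return t->nofreeze ? 0 : 4;
}

/* Example classifications. */
static void freeze_stage_examples(void)
{
    struct task_model shell   = { false, false, false };
    struct task_model syncer  = { false, true,  false };
    struct task_model kworker = { true,  false, false };
    struct task_model exempt  = { true,  false, true  };

    assert(freeze_stage(&shell)   == 1);
    assert(freeze_stage(&syncer)  == 2);
    assert(freeze_stage(&kworker) == 4);
    assert(freeze_stage(&exempt)  == 0);
}
```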

I'd also like to look at your SMP support and see if we can improve
compatibility there at the same time.

Finally I'd like to merge the support for freezer flags on workqueues.

Regards,

Nigel
-- 
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com

Ph: +61 (2) 6292 8028  Mob: +61 (417) 100 574


Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

Matthew Dobson <[EMAIL PROTECTED]> wrote:
>
> Sorry to reply to a long-quiet thread,

Is appreciated, thanks.

> but I've been trading emails with Paul Jackson on this subject recently,
> and I've been unable to convince either him or myself that merging CPUSETs
> and CKRM is as easy as I once believed.  I'm still convinced the CPU side
> is doable, but I haven't managed as much success with the memory binding
> side of CPUSETs.  In light of this, I'd like to remove my previous
> objections to CPUSETs moving forward.  If others still have things they
> want discussed before CPUSETs moves into mainline, that's fine, but it
> seems to me that CPUSETs offer legitimate functionality and that the code
> has certainly "done its time" in -mm to convince me it's stable and usable.

OK, I'll add cpusets to the 2.6.12 queue.

going once, going twice...

Re: 2.6.11-rc3: Kylix application no longer works?

Daniel Drake <[EMAIL PROTECTED]> wrote:
>
> > # fs/binfmt_elf.c
> > #   2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19
> > #   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> > # 
> 
> I think so. For a short period we applied this patch to the Gentoo 2.6.10 
> kernel...
> 
> http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch
> 
> ...but removed it once users complained it stopped kylix binaries from 
> running.

Bah.  That's what happens when you fix stuff.

What's kylix?  The Borland C++ builder thing?

How should one set about reproducing this problem?

Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

2005-02-07 Thread Matthew Dobson

Matthew Dobson wrote:
> On Sun, 2004-10-03 at 16:53, Martin J. Bligh wrote:
> > > Martin wrote:
> > > > Matt had proposed having a separate sched_domain tree for each cpuset, which
> > > > made a lot of sense, but seemed harder to do in practice because "exclusive"
> > > > in cpusets doesn't really mean exclusive at all.
> > >
> > > See my comments on this from yesterday on this thread.
> > >
> > > I suspect we don't want a distinct sched_domain for each cpuset, but
> > > rather a sched_domain for each of several entire subtrees of the cpuset
> > > hierarchy, such that every CPU is in exactly one such sched domain, even
> > > though it be in several cpusets in that sched_domain.
> >
> > Mmmm. The fundamental problem I think we ran across (just whilst pondering,
> > not in code) was that some things (eg ... init) are bound to ALL cpus (or
> > no cpus, depending how you word it); i.e. they're created before the cpusets
> > are, and are a member of the grand-top-level-uber-master-thingummy.
> >
> > How do you service such processes? That's what I meant by the exclusive
> > domains aren't really exclusive.
> >
> > Perhaps Matt can recall the problems better. I really liked his idea, aside
> > from the small problem that it didn't seem to work ;-)
>
> Well that doesn't seem like a fair statement.  It's potentially true,
> but it's really hard to say without an implementation! ;)
>
> I think that the idea behind cpusets is really good, essentially
> creating isolated areas of CPUs and memory for tasks to run
> undisturbed.  I feel that the actual implementation, however, is taking
> a wrong approach, because it attempts to use the cpus_allowed mask to
> override the scheduler in the general case.  cpus_allowed, in my
> estimation, is meant to be used as the exception, not the rule.  If we
> wish to change that, we need to make the scheduler more aware of it, so
> it can do the right thing(tm) in the presence of numerous tasks with
> varying cpus_allowed masks.  The other option is to implement cpusets in
> a way that doesn't use cpus_allowed.  That is the option that I am
> pursuing.
>
> My idea is to make sched_domains much more flexible and dynamic.  By
> adding locking and reference counting, and simplifying the way in which
> sched_domains are created, linked, unlinked and eventually destroyed we
> can use sched_domains as the implementation of cpusets.  IA64 already
> allows multiple sched_domains trees without a shared top-level domain.
> My proposal is to make this functionality more generally available.
> Extending the "isolated domains" concept a little further will buy us
> most (all?) the functionality of "exclusive" cpusets without the need to
> use cpus_allowed at all.
>
> I've got some code.  I'm in the midst of pushing it forward to rc3-mm2.
> I'll post an RFC later today or tomorrow when it's cleaned up.
>
> -Matt

Sorry to reply to a long-quiet thread, but I've been trading emails with Paul 
Jackson on this subject recently, and I've been unable to convince either him 
or myself that merging CPUSETs and CKRM is as easy as I once believed.  I'm 
still convinced the CPU side is doable, but I haven't managed as much success 
with the memory binding side of CPUSETs.  In light of this, I'd like to remove 
my previous objections to CPUSETs moving forward.  If others still have things 
they want discussed before CPUSETs moves into mainline, that's fine, but it 
seems to me that CPUSETs offer legitimate functionality and that the code has 
certainly "done its time" in -mm to convince me it's stable and usable.

-Matt

Re: 2.6.11-rc3: Kylix application no longer works?

2005-02-07 Thread Daniel Drake

Andrew Morton wrote:
> I wonder if reverting the patch will restore the old behaviour?
>
> # This is a BitKeeper generated diff -Nru style patch.
> #
> # ChangeSet
> #   2005/01/21 13:42:18-08:00 [EMAIL PROTECTED]
> #   Merge nuts.davemloft.net:/disk1/BK/sparcwork-2.6
> #   into nuts.davemloft.net:/disk1/BK/sparc-2.6
> #
> # fs/binfmt_elf.c
> #   2005/01/21 13:42:06-08:00 [EMAIL PROTECTED] +0 -0
> #   Auto merged
> #
> # ChangeSet
> #   2005/01/17 13:38:38-08:00 [EMAIL PROTECTED]
> #   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> #
> #   Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
> #
> # fs/compat_ioctl.c
> #   2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +12 -5
> #   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> #
> # fs/binfmt_elf.c
> #   2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19
> #   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> #

I think so. For a short period we applied this patch to the Gentoo 2.6.10
kernel...

http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch

...but removed it once users complained it stopped kylix binaries from running.

Daniel

Re: question on symbol exports

2005-02-07 Thread Dan Malek

On Feb 7, 2005, at 4:35 PM, Benjamin Herrenschmidt wrote:

> Interesting... more than no swap, you must also make sure you have no
> r/w mmap'ed file (which are technically equivalent to swap).

Yeah, I kinda had a similar thought.  Just because you aren't
swapping doesn't mean the VM subsystem isn't looking at dirty bits,
too.  It could potentially steal a page that it thinks can be replaced
from either a zero-fill or reading again from persistent storage.

-- Dan

Re: [PATCH] sys_chroot() hook for additional chroot() jails enforcing

2005-02-07 Thread Lorenzo Hernández García-Hierro

On Mon, 07-02-2005 at 16:50 -0600, Serge E. Hallyn wrote:
> Hi,
> 
> If I understood you correctly earlier, the only policy you needed to
> enforce was to prevent double-chrooting.  If that is the case, why is it
> not sufficient to keep a "process-has-used-chroot" flag in
> current->security which is set on the first call to
> capable(CAP_SYS_CHROOT) and inherited by forked children, after which
> calls to capable(CAP_SYS_CHROOT) are refused?
> 
> Of course if you need to do more, then a hook might be necessary.

Yeah, checking whether the process is chrooted using the current macro, and
denying the request when capable() sees it asking for CAP_SYS_CHROOT, is
how vSecurity currently does it.

But the hook will have to handle some chdir enforcing that can't be done
with current hooks; I will explain it further tomorrow.

It's too late here ;)

Cheers,
-- 
Lorenzo Hernández García-Hierro <[EMAIL PROTECTED]> 
[1024D/6F2B2DEC] & [2048g/9AE91A22][http://tuxedo-es.org]


signature.asc
Description: This part of the message is digitally signed

Re: [RFC][PATCH 2.6.11-rc3-mm1] Relay Fork Module

Guillaume Thouvenin <[EMAIL PROTECTED]> wrote:
>
> Hello,
>   
>This module sends a signal to one or several processes (in user
> space) when a fork occurs in the kernel. It relays information about
> forks (parent and child pid) to a user space application.
>
> ...
>This patch is used by the Enhanced Linux System Accounting tool that
> can be downloaded from http://elsa.sf.net

So this permits ELSA to maintain a complete picture of the process/thread
hierarchy?  I guess that fits into the "do it in userspace" mantra -
certainly hooking into fork() is a minimal way of doing this, although I
wonder what the limitations are.

Implementation-wise: there's a lot of code there and the interface is a bit
awkward.  Why not just feed that kobject you have there into
kobject_uevent()?

Re: Real-Time Preemption and UML?

2005-02-07 Thread Esben Nielsen

Well, I'll keep trying a little bit more. In the meantime, here is some
of the stuff I needed to change to at least get it to compile:

One of the problems was the direct use of architecture-specific semaphores
(which don't work under PREEMPT_REALTIME) in places where a quick
(maybe too quick) look at the code told me that completions ought to be
used. Therefore I changed two semaphores to completions which compiled
fine. I have tried the change on 2.6.11-rc2, and it seemed to work, but I
have not tested it heavily.
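
For anyone unfamiliar with the two primitives, here is a userspace model of
the completion semantics the patch relies on, built on pthreads. It is a
sketch only: the _model names are made up, and it mirrors just the basic
"complete() lets one waiter through" behaviour, not the kernel
implementation.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace model of a completion: a counted wakeup. */
struct completion_model {
    pthread_mutex_t lock;
    pthread_cond_t  wait;
    unsigned int    done;   /* completions not yet consumed by a waiter */
};

static void init_completion_model(struct completion_model *c)
{
    int rc;

    rc = pthread_mutex_init(&c->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&c->wait, NULL);
    assert(rc == 0);
    (void)rc;
    c->done = 0;
}

/* Counterpart of up() in the old code: record one event, wake a waiter. */
static void complete_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

/* Counterpart of down(): block until an event is available, consume it. */
static void wait_for_completion_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->wait, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}
```

Used the way the patch uses it (initialised to "not done", signalled from
the interrupt handler, waited on by the opener), this behaves like the
semaphore initialised to 0 that it replaces.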

The patch is in an attachment - I hope the mailing list will allow that. It is
simply too troublesome otherwise when I am using Pine as mail client.

Esben


On Mon, 7 Feb 2005, Jeff Dike wrote:

> [EMAIL PROTECTED] said:
> > Hi, I am trying to compile and run UM-Linux with PREEMPT_REALTIME. I
> > managed to get it to compile but it wont start - it simply stops
> > somewhere in start_kernel() :-( 
> 
> I've never played with preemption on UML.  No doubt it needs some work...
> 
>   Jeff
> 
--- linux-2.6.11-rc2-um/arch/um/drivers/port_kern.c.orig 2005-01-23 15:53:29.0 +0100
+++ linux-2.6.11-rc2-um/arch/um/drivers/port_kern.c 2005-02-06 19:54:52.0 +0100
@@ -23,7 +23,7 @@
 struct port_list {
struct list_head list;
int has_connection;
-   struct semaphore sem;
+   struct completion done;
int port;
int fd;
spinlock_t lock;
@@ -66,7 +66,7 @@
conn->fd = fd;
list_add(&conn->list, &conn->port->connections);
 
-   up(&conn->port->sem);
+   complete(&conn->port->done);
return(IRQ_HANDLED);
 }
 
@@ -183,13 +183,14 @@
*port = ((struct port_list) 
{ .list = LIST_HEAD_INIT(port->list),
  .has_connection   = 0,
- .sem  = __SEMAPHORE_INITIALIZER(port->sem, 
- 0),
  .lock = SPIN_LOCK_UNLOCKED,
  .port = port_num,
  .fd   = fd,
  .pending  = LIST_HEAD_INIT(port->pending),
  .connections  = LIST_HEAD_INIT(port->connections) });
+
+   init_completion(&port->done);
+
list_add(&port->list, &ports);
 
  found:
@@ -221,7 +222,7 @@
int fd;
 
while(1){
-   if(down_interruptible(&port->sem))
+   if(wait_for_completion_interruptible(&port->done))
return(-ERESTARTSYS);
 
spin_lock(&port->lock);
--- linux-2.6.11-rc2-um/arch/um/drivers/xterm_kern.c.orig 2005-01-23 15:53:29.0 +0100
+++ linux-2.6.11-rc2-um/arch/um/drivers/xterm_kern.c 2005-02-06 19:54:58.0 +0100
@@ -16,7 +16,7 @@
 #include "xterm.h"
 
 struct xterm_wait {
-   struct semaphore sem;
+   struct completion ready;
int fd;
int pid;
int new_fd;
@@ -32,7 +32,7 @@
return(IRQ_NONE);
 
xterm->new_fd = fd;
-   up(&xterm->sem);
+   complete(&xterm->ready);
return(IRQ_HANDLED);
 }
 
@@ -49,10 +49,10 @@
 
/* This is a locked semaphore... */
*data = ((struct xterm_wait) 
-   { .sem  = __SEMAPHORE_INITIALIZER(data->sem, 0),
- .fd   = socket,
+   { .fd   = socket,
  .pid  = -1,
  .new_fd   = -1 });
+   init_completion(&data->ready);
 
err = um_request_irq(XTERM_IRQ, socket, IRQ_READ, xterm_interrupt, 
 SA_INTERRUPT | SA_SHIRQ | SA_SAMPLE_RANDOM, 
@@ -68,7 +68,7 @@
 *
 * XXX Note, if the xterm doesn't work for some reason (eg. DISPLAY
 * isn't set) this will hang... */
-   down(&data->sem);
+   wait_for_completion(&data->ready);
 
free_irq_by_irq_and_dev(XTERM_IRQ, data);
free_irq(XTERM_IRQ, data);

Re: question on symbol exports

2005-02-07 Thread Chris Friesen

Benjamin Herrenschmidt wrote:

> Interesting... more than no swap, you must also make sure you have no
> r/w mmap'ed file (which are technically equivalent to swap).

Ah...thanks for the warning.

We want to eventually make it work with swap as well, but that's
substantially more complicated.

> I'm not too fan about exporting those symbols, but I'll talk to paulus,
> it should be possible at least to EXPORT_SYMBOL_GPL them...

I understand the reluctance.  I'm perfectly willing to export it GPL in
my private branch as long as you guys don't consider it evil--the module
is going to be GPL anyways.

The alternative would be for me to build my code directly in to the 
kernel...just makes it harder for me to debug.

Chris

Re: [PATCH 1/1] PCI: Dynids - passing driver data

2005-02-07 Thread Brian King

Martin Mares wrote:
Hello!

Which is a good thing, right?  "driver_data" is usually a pointer to
somewhere.  Having userspace specify it would not be a good thing.
That depends on the driver usage, and the patch allows it to be 
configurable and defaults to not being used.

Maybe we could just define the operation as cloning of an entry
for another device ID, including its driver_data.
Possibly. That would potentially require a lot of parameters to 
userspace. We would really need to duplicate all the currently existing 
sysfs parms to accomplish this.

--
Brian King
eServer Storage I/O
IBM Linux Technology Center

Re: Suggestion for CD filesystem for Backups

2005-02-07 Thread Toon van der Pas

On Fri, Sep 24, 2004 at 01:18:19AM -0400, Ali Bayazit wrote:
> 
> On Thu, 2004-09-23 at 17:16 +0100, Alan Cox wrote:
> > On Thu, 2004-09-23 at 00:04, Judith und Mirko Kloppstech wrote:
> > > Why not write a file system on top of ISO9660 which uses the rest of the 
> > > CD to write error correction. If a sector becomes unreadable, the error 
> > > correction saves the data. Besides, a tool for testing the error rate 
> > > and the safety of the data can be easily written for a normal CD-ROM 
> > > drive.
> > > 
> > > The data for error correction might be written into a file so that the 
> > > CD can be read using any System, but Linux provides error correction.
> > 
> > Send patches, or possibly if you are dumping tars and the like just
> > write yourself an app to generate a second file of ECC data.
> 
> Wouldn't it be safer to do ECC on meta-data also?
> That probably means replacing ISO9660 though.

There seems to be a good user space alternative for this purpose:

http://dvdisaster.berlios.de
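
To make the "second file of ECC data" idea from the quoted mail concrete,
here is about the simplest possible scheme: a single XOR parity block that
can rebuild any one lost block. This is only an illustration with made-up
function names; it is plain XOR parity, not a real error-correcting code,
and it says nothing about how the tool linked above works.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compute one parity block as the XOR of all data blocks. */
static void xor_parity(const unsigned char *data, size_t nblocks,
                       size_t block_size, unsigned char *parity)
{
    memset(parity, 0, block_size);
    for (size_t b = 0; b < nblocks; b++)
        for (size_t i = 0; i < block_size; i++)
            parity[i] ^= data[b * block_size + i];
}

/* Rebuild block `lost` in place from the parity and the other blocks. */
static void xor_recover(unsigned char *data, size_t nblocks,
                        size_t block_size, const unsigned char *parity,
                        size_t lost)
{
    unsigned char *dst;

    assert(lost < nblocks);
    dst = data + lost * block_size;

    memcpy(dst, parity, block_size);
    for (size_t b = 0; b < nblocks; b++) {
        if (b == lost)
            continue;
        for (size_t i = 0; i < block_size; i++)
            dst[i] ^= data[b * block_size + i];
    }
}
```

The parity would be written to a separate file next to the archive; it
survives the loss of one known-bad block per parity group and nothing more.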

Regards,
Toon.
-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

Re: move-accounting-function-calls-out-of-critical-vm-code-paths.patch

Jay Lan <[EMAIL PROTECTED]> wrote:
>
> I have tested Christoph's patch before the leave. It did work for CSA
> and showed performance improvement on certain configuration.

OK, thanks.

> Should i propose to include the CSA module in
> the kernel then, Andrew? :)

Sure, if such an action is suitable for all the other parties who are
interested in enhanced system accounting.

What this ballgame needs is for someone to grab the bull by the horns and
run with it.  This thing obviously requires a lot more cliches!

Re: [PATCH] sys_chroot() hook for additional chroot() jails enforcing

2005-02-07 Thread Serge E. Hallyn

Hi,

If I understood you correctly earlier, the only policy you needed to
enforce was to prevent double-chrooting.  If that is the case, why is it
not sufficient to keep a "process-has-used-chroot" flag in
current->security which is set on the first call to
capable(CAP_SYS_CHROOT) and inherited by forked children, after which
calls to capable(CAP_SYS_CHROOT) are refused?

Of course if you need to do more, then a hook might be necessary.
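
A small userspace model of the flag scheme suggested above, in case it helps
the discussion. The struct stands in for whatever would hang off
current->security, and all names are hypothetical; only the behaviour
(first request allowed and recorded, later requests refused, flag inherited
on fork) comes from the text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task security blob. */
struct task_sec_model {
    bool used_chroot;   /* the "process-has-used-chroot" flag */
};

/* Models the capable(CAP_SYS_CHROOT) check: 0 = allowed, -EPERM = refused. */
static int capable_chroot_model(struct task_sec_model *sec)
{
    assert(sec != NULL);

    if (sec->used_chroot)
        return -EPERM;      /* second and later requests are refused */
    sec->used_chroot = true;
    return 0;
}

/* Models fork(): the child starts with a copy of the parent's flag. */
static struct task_sec_model fork_model(const struct task_sec_model *parent)
{
    return *parent;
}
```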

-serge


Re: Irix NFS server usual problem

2005-02-07 Thread Trond Myklebust

On Mon, 07.02.2005 at 23:16 (+0100), Olivier Galibert wrote:
> I'm starting to install some fedora core 3 systems in an environment
> where 64bits SGIs are still serving the home directories.  They have
> the bug/feature that required the 2.4 patch to hack the 64bits
> cookies[1].  The 2.6 kernel I just found still can't compensate by
> itself for the issue.
> 
> Is there an easy way to fix that?

Have you applied SGI's IRIX patches to your server (the one that makes
the cookies take 32-bit values)?

Alternatively, you can forward-port the old 2.4.x cookie hack to 2.6.x
(that should be fairly trivial to do). You can find the patch on

http://client.linux-nfs.org/Linux-2.4.x/2.4.26/linux-2.4.26-02-seekdir.dif

Cheers,
  Trond

-- 
Trond Myklebust <[EMAIL PROTECTED]>


Re: [PATCH] Filesystem linking protections

* John Richard Moser ([EMAIL PROTECTED]) wrote:
> Yes, mkdtemp() and mkstemp().
> 
> Of course we can't always rely on programmers to get it right, so the
> idea here is to make sure we ask broken code to behave nicely, and stab
> it in the face if it doesn't.  Please try to examine this in that scope.

It's fine for a hardened distro.  But still inappropriate for mainline.

thanks,
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net

Re: 2.6.11-rc3: Kylix application no longer works?

Pavel Machek <[EMAIL PROTECTED]> wrote:
>
> I have some obscure Kylix application here... It started getting
> mysteriously killed in 2.6.11-rc3 and -rc3-mm1...
> 
> [EMAIL PROTECTED]:~/slovnik/bin$ strace ./Slovnik
> execve("./Slovnik", ["./Slovnik"], [/* 32 vars */]) = 0
> +++ killed by SIGKILL +++
> [EMAIL PROTECTED]:~/slovnik/bin$ ldd ./Slovnik
> /usr/bin/ldd: line 1:  8759 Killed
> LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW=
> LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$file"
> [EMAIL PROTECTED]:~/slovnik/bin$
> 
> I get this in 2.6.10-rc3:
> 
> [EMAIL PROTECTED]:~/slovnik/bin$ ./Slovnik
> ./Slovnik: relocation error: ./Slovnik: undefined symbol:
> initPAnsiStrings
> [EMAIL PROTECTED]:~/slovnik/bin$ ldd ./Slovnik
> libz.so.1 => /usr/lib/libz.so.1 (0xb7fc2000)
> libX11.so.6 => /usr/X11/lib/libX11.so.6 (0xb7efa000)
> libpthread.so.0 => /lib/libpthread.so.0 (0xb7ea9000)
> libdl.so.2 => /lib/libdl.so.2 (0xb7ea6000)
> libc.so.6 => /lib/libc.so.6 (0xb7d73000)
> /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)
> [EMAIL PROTECTED]:~/slovnik/bin$

Does it work correctly under earlier kernels?  If so, when did it break?

> When I set LD_LIBRARY_PATH right, it will actually work. Any ideas?

Presumably you're picking up a different library without LD_LIBRARY_PATH. 
Perhaps that library is mucked up and the new uaccess checking code in
binfmt_elf.c is now doing the right thing, and we were previously
forgetting to report some error.
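
To make that hypothesis concrete, here is a small user-space sketch modelled on padzero() from the changeset under discussion. The 4096-byte ELF_MIN_ALIGN and the clear_fn callback standing in for clear_user() are assumptions for illustration; the stand-ins mimic the kernel convention that clear_user() returns the number of bytes it could not clear.

```c
#include <errno.h>

#define ELF_MIN_ALIGN 4096UL                        /* assumption: 4 KiB */
#define ELF_PAGEOFFSET(v) ((v) & (ELF_MIN_ALIGN - 1))

/* Hypothetical stand-in for clear_user(): returns bytes NOT cleared. */
typedef unsigned long (*clear_fn)(unsigned long addr, unsigned long n);

/* Number of bytes between elf_bss and the next alignment boundary. */
static unsigned long padzero_len(unsigned long elf_bss)
{
    unsigned long nbyte = ELF_PAGEOFFSET(elf_bss);

    return nbyte ? ELF_MIN_ALIGN - nbyte : 0;
}

/* New-style padzero(): a failed clear is reported instead of ignored. */
static int padzero_checked(unsigned long elf_bss, clear_fn clear)
{
    unsigned long nbyte = padzero_len(elf_bss);

    if (nbyte && clear(elf_bss, nbyte))
        return -EFAULT;
    return 0;
}

/* Two fake back ends: one always succeeds, one always faults. */
static unsigned long clear_ok(unsigned long addr, unsigned long n)
{
    (void)addr; (void)n;
    return 0;
}

static unsigned long clear_fault(unsigned long addr, unsigned long n)
{
    (void)addr;
    return n;
}
```

With the old void version both back ends looked the same to the caller; with the checked version a faulting clear surfaces as -EFAULT, which would be consistent with a binary that used to load now being rejected.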

I wonder if reverting the patch will restore the old behaviour?

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2005/01/21 13:42:18-08:00 [EMAIL PROTECTED] 
#   Merge nuts.davemloft.net:/disk1/BK/sparcwork-2.6
#   into nuts.davemloft.net:/disk1/BK/sparc-2.6
# 
# fs/binfmt_elf.c
#   2005/01/21 13:42:06-08:00 [EMAIL PROTECTED] +0 -0
#   Auto merged
# 
# ChangeSet
#   2005/01/17 13:38:38-08:00 [EMAIL PROTECTED] 
#   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
#   
#   Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
# 
# fs/compat_ioctl.c
#   2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +12 -5
#   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
# 
# fs/binfmt_elf.c
#   2005/01/17 13:37:56-08:00 [EMAIL PROTECTED] +43 -19
#   [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
# 
diff -Nru a/fs/binfmt_elf.c b/fs/binfmt_elf.c
--- a/fs/binfmt_elf.c   2005-02-07 14:50:07 -08:00
+++ b/fs/binfmt_elf.c   2005-02-07 14:50:07 -08:00
@@ -110,15 +110,17 @@
be in memory */
 
 
-static void padzero(unsigned long elf_bss)
+static int padzero(unsigned long elf_bss)
 {
unsigned long nbyte;
 
nbyte = ELF_PAGEOFFSET(elf_bss);
if (nbyte) {
nbyte = ELF_MIN_ALIGN - nbyte;
-   clear_user((void __user *) elf_bss, nbyte);
+   if (clear_user((void __user *) elf_bss, nbyte))
+   return -EFAULT;
}
+   return 0;
 }
 
 /* Let's use some macros to make this stack manipulation a litle clearer */
@@ -134,7 +136,7 @@
 #define STACK_ALLOC(sp, len) ({ sp -= len ; sp; })
 #endif
 
-static void
+static int
 create_elf_tables(struct linux_binprm *bprm, struct elfhdr * exec,
int interp_aout, unsigned long load_addr,
unsigned long interp_load_addr)
@@ -179,7 +181,8 @@
STACK_ALLOC(p, ((current->pid % 64) << 7));
 #endif
u_platform = (elf_addr_t __user *)STACK_ALLOC(p, len);
-   __copy_to_user(u_platform, k_platform, len);
+   if (__copy_to_user(u_platform, k_platform, len))
+   return -EFAULT;
}
 
/* Create the ELF interpreter info */
@@ -241,7 +244,8 @@
 #endif
 
/* Now, let's put argc (and argv, envp if appropriate) on the stack */
-   __put_user(argc, sp++);
+   if (__put_user(argc, sp++))
+   return -EFAULT;
if (interp_aout) {
argv = sp + 2;
envp = argv + argc + 1;
@@ -259,25 +263,29 @@
__put_user((elf_addr_t)p, argv++);
len = strnlen_user((void __user *)p, PAGE_SIZE*MAX_ARG_PAGES);
if (!len || len > PAGE_SIZE*MAX_ARG_PAGES)
-   return;
+   return 0;
p += len;
}
-   __put_user(0, argv);
+   if (__put_user(0, argv))
+   return -EFAULT;
current->mm->arg_end = current->mm->env_start = p;
while (envc-- > 0) {
size_t len;
__put_user((elf_addr_t)p, envp++);
len = strnlen_user((void __user *)p, PAGE_SIZE*MAX_ARG_PAGES);
if (!len || len > PAGE_SIZE*MAX_ARG_PAGES)
-   return;
+   return 0;
p += len;
}
-   __put_user(0, envp);
+   if (__put_user(0, envp))
+

Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)

On Mon, 07 Feb 2005 14:26:03 PST, Chris Wright said:
> * Michael Halcrow ([EMAIL PROTECTED]) wrote:
> > This is the third in a series of eight patches to the BSD Secure
> > Levels LSM.  It moves the claim on the block device from the inode
> > struct to the file struct in order to address a potential
> > circumvention of the control via hard links to block devices.  Thanks
> > to Serge Hallyn for pointing this out.
> 
> Hard links still point to same inode, what's the issue that this
> addresses?

Ignore that last - I thought it was the "filesystem linking permissions"
thread rather than the BSD Secure linking permissions thread. ;)




Re: [linux-usb-devel] 2.6: USB disk unusable level of data corruption

2005-02-07 Thread Giuseppe Bilotta

David Brownell wrote:
> On Sunday 06 February 2005 7:59 am, Giuseppe Bilotta wrote:
> > 
> > I have a MAGNEX/ViPower USB/FireWire external HD enclosure. I 
> > found that it works pretty fine (albeit slowly) when connected 
> > to the USB 1.1 ports built in my Dell Inspiron 8200, but trying 
> > to connect it via the Hamlet PCMCIA USB2 Card Adapter doesn't 
> > work (it seems it gets assigned minors 1,2,3,4,5,6,... and so 
> > on forever until I unplug it).
> 
> What do you mean "minors"?  Addresses or actual /dev/sdN numbers?
> 
> If it's addresses, that would be an enumeration problem.  Some
> recent changes have caused problems there, 2.6.11-rc3-mm2 ought to
> have a patch making it better.  (Well, working around one of the
> two problems that'd suggest.)

Sorry, it's addresses.

usb 5-1: new high speed USB device using ehci_hcd and address 4
usb 5-1: new high speed USB device using ehci_hcd and address 5
usb 5-1: new high speed USB device using ehci_hcd and address 6

blah blah blah, neverending. So yes, it's probably the 
enumeration problem.

Also, when I plug in the PCMCIA card I get (sorry for the 
wrapping, Gravity sucks)

PCI: Enabling device :07:00.0 ( -> 0002)
ACPI: PCI interrupt :07:00.0[A] -> GSI 11 (level, low) -> 
IRQ 11
ohci_hcd :07:00.0: NEC Corporation USB
PCI: Setting latency timer of device :07:00.0 to 64
ohci_hcd :07:00.0: irq 11, pci mem 0x2900
ohci_hcd :07:00.0: new USB bus registered, assigned bus 
number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 3 ports detected
PCI: Enabling device :07:00.1 ( -> 0002)
ACPI: PCI interrupt :07:00.1[B] -> GSI 11 (level, low) -> 
IRQ 11
ohci_hcd :07:00.1: NEC Corporation USB (#2)
PCI: Setting latency timer of device :07:00.1 to 64
ohci_hcd :07:00.1: irq 11, pci mem 0x29001000
ohci_hcd :07:00.1: new USB bus registered, assigned bus 
number 4
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
PCI: Enabling device :07:00.2 ( -> 0002)
ACPI: PCI interrupt :07:00.2[C] -> GSI 11 (level, low) -> 
IRQ 11
ehci_hcd :07:00.2: NEC Corporation USB 2.0
ehci_hcd :07:00.2: irq 11, pci mem 0x29002000
ehci_hcd :07:00.2: new USB bus registered, assigned bus 
number 5
ehci_hcd :07:00.2: USB 2.0 initialized, EHCI 0.95, driver 
26 Oct 2004
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 5 ports detected 

The card only has 2 USB ports .. why 5 ports here? Is this the 
same bug?

Another interesting tidbit is that I get:

USB Universal Host Controller Interface driver v2.2
ACPI: PCI interrupt :00:1d.0[A] -> GSI 11 (level, low) -> 
IRQ 11
uhci_hcd :00:1d.0: Intel Corp. 82801CA/CAM USB (Hub #1)
PCI: Setting latency timer of device :00:1d.0 to 64
uhci_hcd :00:1d.0: irq 11, io base 0xbf80
uhci_hcd :00:1d.0: new USB bus registered, assigned bus 
number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
ACPI: PCI interrupt :00:1d.2[C] -> GSI 11 (level, low) -> 
IRQ 11
uhci_hcd :00:1d.2: Intel Corp. 82801CA/CAM USB (Hub #3)
PCI: Setting latency timer of device :00:1d.2 to 64
uhci_hcd :00:1d.2: irq 11, io base 0xbf20
uhci_hcd :00:1d.2: new USB bus registered, assigned bus 
number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected 

for the built-in ports ... I only have two USB ports on this 
machine though, why does it see 4 of them?

(Do you also need the lspci and/or lsusb and/or dmesg of the 
error that happens when I disable the EHCI driver and only let 
the OHCI manage the PCMCIA card?)

-- 
Giuseppe "Oblomov" Bilotta

Can't you see
It all makes perfect sense
Expressed in dollar and cents
Pounds shillings and pence
  (Roger Waters)


Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)

On Mon, 07 Feb 2005 14:26:03 PST, Chris Wright said:

> Hard links still point to same inode, what's the issue that this
> addresses?

For those systems that have everything on one big partition, you can often
do stuff like:

ln /etc/passwd /tmp/

and wait for /etc/passwd to get clobbered by a cron job run by root...
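
The reason this works is that a hard link is just a second name for the same inode, so a write through either name lands in the same file. A minimal sketch that demonstrates this with two scratch files (the function name and the paths passed to it are made up for the example, and it deliberately does not touch /etc/passwd):

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create `target`, hard-link it as `alias`, write 5 bytes through the
 * alias, then check that both names share one inode and that the write
 * is visible through the original name.
 * Returns 1 if so, 0 if not, -1 on any system call failure.
 * Both names are removed again before returning.
 */
static int hardlink_shares_inode(const char *target, const char *alias)
{
    struct stat st_target, st_alias;
    int fd, result = -1;

    /* Start from a clean slate; errors here are harmless. */
    unlink(target);
    unlink(alias);

    fd = open(target, O_CREAT | O_WRONLY | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    if (link(target, alias) != 0)
        goto out;

    fd = open(alias, O_WRONLY);
    if (fd < 0)
        goto out;
    if (write(fd, "hello", 5) != 5) {
        close(fd);
        goto out;
    }
    close(fd);

    if (stat(target, &st_target) != 0 || stat(alias, &st_alias) != 0)
        goto out;

    result = st_target.st_dev == st_alias.st_dev &&
             st_target.st_ino == st_alias.st_ino &&
             st_target.st_nlink == 2 &&
             st_target.st_size == 5;
out:
    unlink(alias);
    unlink(target);
    return result;
}
```

Note that link() only works within one filesystem, which is why the trick depends on /etc and /tmp living on the same partition.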



Re: [PATCH 1/1] PCI: Dynids - passing driver data

2005-02-07 Thread Martin Mares

Hello!

> >Which is a good thing, right?  "driver_data" is usually a pointer to
> >somewhere.  Having userspace specify it would not be a good thing.
> 
> That depends on the driver usage, and the patch allows it to be 
> configurable and defaults to not being used.

Maybe we could just define the operation as cloning of an entry
for another device ID, including its driver_data.
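
As a rough illustration of that idea, here is a user-space sketch in which a new ID is added by copying an existing table entry, so that driver_data never comes from userspace directly. struct dev_id and clone_id() are simplified, hypothetical stand-ins and not the kernel's struct pci_device_id or any existing PCI API.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for a PCI ID table entry. */
struct dev_id {
    unsigned int vendor;
    unsigned int device;
    unsigned long driver_data;
};

/*
 * Append an entry for (vendor, device) that inherits driver_data from the
 * existing entry at index `src`.
 * Returns the index of the new entry, or -1 if `src` does not name an
 * existing entry or the table is full.
 */
static int clone_id(struct dev_id *table, size_t *count, size_t capacity,
                    size_t src, unsigned int vendor, unsigned int device)
{
    size_t slot = *count;

    if (src >= *count || *count >= capacity)
        return -1;

    table[slot] = table[src];       /* copies driver_data as well */
    table[slot].vendor = vendor;
    table[slot].device = device;
    *count = slot + 1;
    return (int)slot;
}
```

The user only chooses which known entry to clone, so whatever driver_data means to the driver (pointer or index) stays under the driver's control.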

Have a nice fortnight
-- 
Martin `MJ' Mares   <[EMAIL PROTECTED]>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Only dead fish swim with the stream.

Re: Sabotaged PaXtest (was: Re: Patch 4/6 randomize the stack pointer)

2005-02-07 Thread Ingo Molnar


btw., do you consider PaX as a 100% sure solution against 'code
injection' attacks (meaning that the attacker wants to execute an
arbitrary piece of code, and assuming the attacked application has a
stack overflow)? I.e. does PaX avoid all such attacks in a guaranteed
way?

Ingo

Re: [PATCH] sys_chroot() hook for additional chroot() jails enforcing

* Lorenzo Hernández García-Hierro ([EMAIL PROTECTED]) wrote:
> Attached you can find a patch which adds a new hook for the sys_chroot()
> syscall, and makes us able to add additional enforcing and security
> checks by using the Linux Security Modules framework (ie. chdir
> enforcing, etc).

If you want to make a change like this, collapse the
capable(CAP_SYS_CHROOT) check behind this hook, no point having two
outcalls from same call site.  What logic do you expect to put behind
the chroot() hook?
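
One way to read this suggestion, as a user-space sketch only: the default hook carries the capability check, so the call site makes a single outcall, and a module that wants more simply replaces the hook. All names here (struct task, struct sec_ops, do_chroot() and the example strict_chroot() policy) are hypothetical and not part of the LSM interface.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical task model: only the capability we care about. */
struct task {
    bool cap_sys_chroot;
};

/* Hypothetical hook table with a single chroot hook. */
struct sec_ops {
    int (*chroot)(const struct task *t, const char *path);
};

/* Default hook: the capability check that used to sit at the call site. */
static int default_chroot(const struct task *t, const char *path)
{
    (void)path;
    return t->cap_sys_chroot ? 0 : -EPERM;
}

/* Example module hook: capability check plus one made-up extra rule. */
static int strict_chroot(const struct task *t, const char *path)
{
    int error = default_chroot(t, path);

    if (error)
        return error;
    return path[0] == '/' ? 0 : -EACCES;   /* refuse relative paths */
}

static struct sec_ops ops = { default_chroot };

/* Call site: exactly one outcall, whatever policy is loaded. */
static int do_chroot(const struct task *t, const char *path)
{
    int error = ops.chroot(t, path);

    if (error)
        return error;
    /* ... the real syscall would switch the root directory here ... */
    return 0;
}
```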

thanks,
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net

Re: [PATCH 1/1] PCI: Dynids - passing driver data

2005-02-07 Thread Brian King

Greg KH wrote:
> On Mon, Feb 07, 2005 at 04:00:27PM -0600, [EMAIL PROTECTED] wrote:
> 
>>Currently, code exists in the pci layer to allow userspace to specify
>>driver data when adding a pci dynamic id from sysfs. However, this data
>>is never used and there exists no way in the existing code to use it.
> 
> Which is a good thing, right?  "driver_data" is usually a pointer to
> somewhere.  Having userspace specify it would not be a good thing.

That depends on the driver usage, and the patch allows it to be 
configurable and defaults to not being used.

>>This patch allows device drivers to indicate that they want driver data
>>passed to them on dynamic id adds by initializing use_driver_data in their
>>pci_driver->pci_dynids struct. The documentation has also been updated
>>to reflect this.
> 
> What driver wants to use this?

I am in the process of adding dynids support into the ipr scsi driver. I 
originally was using driver_data as a pointer, but am changing it to be 
an index instead, so that it can be specified by the user.

There are essentially 2 different types of chipsets that ipr controls, 
the primary difference being the register offsets. I am using 
driver_data to figure that out today.
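
A minimal sketch of the index approach, assuming a small table of per-chip settings. The struct, the table contents and the offsets are invented for illustration and are not taken from the ipr driver.

```c
#include <stddef.h>

/* Hypothetical per-chipset settings; the offsets are made-up values. */
struct chip_cfg {
    const char *name;
    unsigned long reg_base_offset;
};

static const struct chip_cfg chip_cfgs[] = {
    { "chip-type-0", 0x0000 },
    { "chip-type-1", 0x0280 },
};

#define NUM_CHIP_CFGS (sizeof(chip_cfgs) / sizeof(chip_cfgs[0]))

/*
 * Treat driver_data as an index into chip_cfgs. Because the value may come
 * from userspace, it is range-checked before use.
 * Returns NULL for an out-of-range index.
 */
static const struct chip_cfg *chip_cfg_lookup(unsigned long driver_data)
{
    if (driver_data >= NUM_CHIP_CFGS)
        return NULL;
    return &chip_cfgs[driver_data];
}
```

Unlike a pointer, a bad index can be detected and rejected, which is what makes it reasonable to accept the value from sysfs.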

My other option is to somehow change the driver to cope with having no 
driver data, but that will result in more driver code and will 
ultimately be less flexible in the new chipsets that can be added using 
dynids.

-Brian
--
Brian King
eServer Storage I/O
IBM Linux Technology Center

[PATCH] out-of-tree builds: preserve ARCH and CROSS_COMPILE settings

2005-02-07 Thread Ralph Siemsen

[I am not subscribed, please CC: any replies]

When you build the 2.6 kernel outside of its source directory, using the 
O= option like so:

make -C linux-2.6.10 O=../builddir

this conveniently produces a top-level Makefile in "builddir" which can 
be used to update/clean/rebuild the tree with a simple "make".  It also 
uses the ".config" file from "builddir", which makes it very convenient 
for managing multiple builds for different target systems.

However if you are cross-compiling, you must also set ARCH and 
CROSS_COMPILE variables as appropriate.  Unfortunately these settings 
are not recorded in the generated Makefile in "builddir", so one cannot 
simply do "make" anymore.

The attached patch fixes the script that generates the Makefile, so as 
to pass ARCH and CROSS_COMPILE settings, only when they are defined. 
Otherwise behaviour is exactly as it was before.

Since the contents of "builddir" are specific to ARCH and CROSS_COMPILE, 
I see no reason why the values should not become fixed in "builddir".

Signed-off-by: Ralph Siemsen <[EMAIL PROTECTED]>
diff -u mkmakefile
--- linux-2.6.10.orig/scripts/mkmakefile	27 Jan 2005 15:53:54 -
+++ linux-2.6.10/scripts/mkmakefile	7 Feb 2005 21:20:19 -
@@ -9,6 +9,8 @@
 # $3 - version
 # $4 - patchlevel
 
+test "$ARCH" != "" && ARCH="ARCH=$ARCH"
+test "$CROSS_COMPILE" != "" && CROSS="CROSS_COMPILE=$CROSS_COMPILE"
 
 cat << EOF
 # Automatically generated by $0: don't edit
@@ -22,10 +24,10 @@
 MAKEFLAGS += --no-print-directory
 
 all:
-	\$(MAKE) -C \$(KERNELSRC) O=\$(KERNELOUTPUT)
+	\$(MAKE) $ARCH $CROSS -C \$(KERNELSRC) O=\$(KERNELOUTPUT)
 
 %::
-	\$(MAKE) -C \$(KERNELSRC) O=\$(KERNELOUTPUT) \$@
+	\$(MAKE) $ARCH $CROSS -C \$(KERNELSRC) O=\$(KERNELOUTPUT) \$@
 
 EOF

2.6.11-rc3: Kylix application no longer works?

2005-02-07 Thread Pavel Machek

Hi!

I have some obscure Kylix application here... It started getting
mysteriously killed in 2.6.11-rc3 and -rc3-mm1...

[EMAIL PROTECTED]:~/slovnik/bin$ strace ./Slovnik
execve("./Slovnik", ["./Slovnik"], [/* 32 vars */]) = 0
+++ killed by SIGKILL +++
[EMAIL PROTECTED]:~/slovnik/bin$ ldd ./Slovnik
/usr/bin/ldd: line 1:  8759 Killed
LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW=
LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$file"
[EMAIL PROTECTED]:~/slovnik/bin$

I get this in 2.6.10-rc3:

[EMAIL PROTECTED]:~/slovnik/bin$ ./Slovnik
./Slovnik: relocation error: ./Slovnik: undefined symbol:
initPAnsiStrings
[EMAIL PROTECTED]:~/slovnik/bin$ ldd ./Slovnik
libz.so.1 => /usr/lib/libz.so.1 (0xb7fc2000)
libX11.so.6 => /usr/X11/lib/libX11.so.6 (0xb7efa000)
libpthread.so.0 => /lib/libpthread.so.0 (0xb7ea9000)
libdl.so.2 => /lib/libdl.so.2 (0xb7ea6000)
libc.so.6 => /lib/libc.so.6 (0xb7d73000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)
[EMAIL PROTECTED]:~/slovnik/bin$

When I set LD_LIBRARY_PATH right, it will actually work. Any ideas?

Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

Re: [PATCH] Filesystem linking protections

2005-02-07 Thread John Richard Moser


Chris Wright wrote:
> * John Richard Moser ([EMAIL PROTECTED]) wrote:
> 
>>I've yet to see this break anything on Ubuntu or Gentoo; Brad Spengler
>>claims this breaks nothing on Debian.  On the other hand, this could
>>potentially squash the second most prevalent security bug.
> 
> 
> Yes I know, I've worked on distro with it as well in the past.  And it
> has broken atd and courier in the past.  This is something that also
> can be done in userspace using sane subdirs in +t world writable dirs,
> or O_EXCL so there's work to be done in userspace.
> 

Yes, mkdtemp() and mkstemp().

Of course we can't always rely on programmers to get it right, so the
idea here is to make sure we ask broken code to behave nicely, and stab
it in the face if it doesn't.  Please try to examine this in that scope.

> thanks,
> -chris

-- 
All content of all messages exchanged herein are left in the
Public Domain, unless otherwise explicitly stated.


Re: [PATCH] BSD Secure Levels: claim block dev in file struct rather than inode struct, 2.6.11-rc2-mm1 (3/8)

* Michael Halcrow ([EMAIL PROTECTED]) wrote:
> This is the third in a series of eight patches to the BSD Secure
> Levels LSM.  It moves the claim on the block device from the inode
> struct to the file struct in order to address a potential
> circumvention of the control via hard links to block devices.  Thanks
> to Serge Hallyn for pointing this out.

Hard links still point to same inode, what's the issue that this
addresses?

thanks,
-chris
-- 
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net

[ANNOUNCE] February release of LTP

2005-02-07 Thread Marty Ridgeway





The February release of LTP is now available.

LTP-20050207
- runltp now exports $TMPDIR as a copy of $TMP; certain exceptions caused
  these to be different.
- Extra functions added to the LTP libs so that tests fail with a more
  informative message when attempts to create swap on tmpfs are made.
- IPV6 testcase updates from David Stevens.
- Applied patch from Jacky Malcles that fixes an inconsistency regarding
  synchronization.
- Make proc01 skip kcore.
- Fix that gives a hint at the probable solution if the capset01 test fails.
- Fix for race conditions in synchronization between children and parent in
  fcntl15.
- Applied patch from Jacky Malcles to allow test to run on ia64.
- The llseek test sets RLIMIT_FSIZE to a small number; this fix restores
  it to its original value.
- Fix IPV6 Makefile install path problem.


Linux Test Project
Linux Technology Center
IBM Corporation


Internet E-Mail : [EMAIL PROTECTED]
IBM, 11501 Burnet Rd, Austin, TX  78758
Phone (512) 838-1356 - T/L 678-1356 - Bldg. 908/1C005
Austin, TX.


Re: [PATCH 1/1] PCI: Dynids - passing driver data

On Mon, Feb 07, 2005 at 04:00:27PM -0600, [EMAIL PROTECTED] wrote:
> 
> Currently, code exists in the pci layer to allow userspace to specify
> driver data when adding a pci dynamic id from sysfs. However, this data
> is never used and there exists no way in the existing code to use it.

Which is a good thing, right?  "driver_data" is usually a pointer to
somewhere.  Having userspace specify it would not be a good thing.

> This patch allows device drivers to indicate that they want driver data
> passed to them on dynamic id adds by initializing use_driver_data in their
> pci_driver->pci_dynids struct. The documentation has also been updated
> to reflect this.

What driver wants to use this?

thanks,

greg k-h

[PATCH] sys_chroot() hook for additional chroot() jails enforcing

2005-02-07 Thread Lorenzo Hernández García-Hierro

Hi,

Attached you can find a patch which adds a new hook for the sys_chroot()
syscall, and makes us able to add additional enforcing and security
checks by using the Linux Security Modules framework (ie. chdir
enforcing, etc).

The current user of the hook is the forthcoming 0.2 revision of vSecurity.

With it, used within an LSM module, we can enforce additional checks on
the sys_chroot() syscall and apply some hardening to it.
Even if chroot jails are broken by design in terms of security, with a
few changes to their base and to some of the syscalls they rely on, we
can prevent some of the already known attacks against them.

I will make available some patches for other syscalls as well
(sys_fchmod(), sys_chmod(), ...), that will add a few more hooks to the
LSM framework, in the hope that they will be useful.

The patch can be retrieved too from:
http://pearls.tuxedo-es.org/patches/sys_chroot_lsm-hook-2.6.11-rc3.patch

Thanks in advance, and, again, I will appreciate any suggestions on
which hooks are good candidates to be added.
Feel free to edit tuxedo-es.org wiki at http://wiki.tuxedo-es.org/LSM
and put suggestions & comments there.

Cheers,
-- 
Lorenzo Hernández García-Hierro <[EMAIL PROTECTED]> 
[1024D/6F2B2DEC] & [2048g/9AE91A22][http://tuxedo-es.org]
diff -Nur linux-2.6.11-rc3/fs/open.c linux-2.6.11-rc3.chroot-lsm/fs/open.c
--- linux-2.6.11-rc3/fs/open.c	2005-02-06 21:40:40.0 +0100
+++ linux-2.6.11-rc3.chroot-lsm/fs/open.c	2005-02-07 21:42:45.0 +0100
@@ -582,6 +582,10 @@
 	error = -EPERM;
 	if (!capable(CAP_SYS_CHROOT))
 		goto dput_and_out;
+		
+	error = security_chroot(&nd);
+	if (error)
+		goto dput_and_out;
 
 	set_fs_root(current->fs, nd.mnt, nd.dentry);
 	set_fs_altroot();
diff -Nur linux-2.6.11-rc3/include/linux/security.h linux-2.6.11-rc3.chroot-lsm/include/linux/security.h
--- linux-2.6.11-rc3/include/linux/security.h	2005-02-06 21:40:27.0 +0100
+++ linux-2.6.11-rc3.chroot-lsm/include/linux/security.h	2005-02-07 21:10:05.0 +0100
@@ -1008,6 +1008,10 @@
  *	@ts contains new time
  *	@tz contains new timezone
  *	Return 0 if permission is granted.
+ * @chroot:
+ *	Check permission to change the current root by sys_chroot() syscall.
+ *	@nd contains the nameidata struct passed by sys_chroot()
+ *	Return 0 if permission is granted.
  * @vm_enough_memory:
  *	Check permissions for allocating a new virtual mapping.
  *  @pages contains the number of pages.
@@ -1040,6 +1044,7 @@
 	int (*acct) (struct file * file);
 	int (*sysctl) (struct ctl_table * table, int op);
 	int (*capable) (struct task_struct * tsk, int cap);
+	int (*chroot) (struct nameidata * nd);
 	int (*quotactl) (int cmds, int type, int id, struct super_block * sb);
 	int (*quota_on) (struct dentry * dentry);
 	int (*syslog) (int type);
@@ -1304,6 +1309,10 @@
 	return security_ops->settime(ts, tz);
 }
 
+static inline int security_chroot(struct nameidata *nd)
+{
+	return security_ops->chroot(nd);
+}
 
 static inline int security_vm_enough_memory(long pages)
 {
@@ -1986,6 +1995,11 @@
 	return cap_settime(ts, tz);
 }
 
+static inline int security_chroot(struct nameidata *nd)
+{
+	return 0;
+}
+
 static inline int security_vm_enough_memory(long pages)
 {
 	return cap_vm_enough_memory(pages);
diff -Nur linux-2.6.11-rc3/security/dummy.c linux-2.6.11-rc3.chroot-lsm/security/dummy.c
--- linux-2.6.11-rc3/security/dummy.c	2005-02-06 21:40:57.0 +0100
+++ linux-2.6.11-rc3.chroot-lsm/security/dummy.c	2005-02-07 21:12:01.0 +0100
@@ -101,6 +101,11 @@
 	return 0;
 }
 
+static int dummy_chroot(struct nameidata *nd)
+{
+	return 0;
+}
+
 static int dummy_settime(struct timespec *ts, struct timezone *tz)
 {
 	if (!capable(CAP_SYS_TIME))
@@ -858,6 +863,7 @@
 	set_to_dummy_if_null(ops, sysctl);
 	set_to_dummy_if_null(ops, syslog);
 	set_to_dummy_if_null(ops, settime);
+	set_to_dummy_if_null(ops, chroot);
 	set_to_dummy_if_null(ops, vm_enough_memory);
 	set_to_dummy_if_null(ops, bprm_alloc_security);
 	set_to_dummy_if_null(ops, bprm_free_security);



Irix NFS server usual problem

2005-02-07 Thread Olivier Galibert

I'm starting to install some fedora core 3 systems in an environment
where 64bits SGIs are still serving the home directories.  They have
the bug/feature that required the 2.4 patch to hack the 64bits
cookies[1].  The 2.6 kernel I just found still can't compensate by
itself for the issue.

Is there an easy way to fix that?

  OG.

Re: [PATCH] Filesystem linking protections

On Mon, 07 Feb 2005 23:00:33 +0100, Lorenzo Hernández García-Hierro said:

> A sysctl can be a good option, creating a CTL_SECURITY and then
> registering stuff under it, but this requires to have the kernel hackers
> agree with implementing a new security suite and such.
> In short, re-inventing the wheel.

No, you can do this from within an LSM, and the kernel hackers don't have to
deal with it.

(tech note - don't call register_sysctl_table() from within a
security_initcall().  Use a separate __initcall() that gets called later -
security_initcall() happens before the kernel has the sysctl infrastructure
in place.  Guess how I know that? ;)



Re: [PATCH] Dynamic tick, version 050127-1

2005-02-07 Thread George Anzinger

Pavel Machek wrote:
> Hi!
> 
>>>I do have CONFIG_X86_PM_TIMER enabled, but it seems my board does not
>>>have such a piece of hardware:
>>>
>>>[EMAIL PROTECTED]:/usr/src/linux-mm$ dmesg | grep -i "time\|tick\|apic"
>>>PCI: Setting latency timer of device :00:11.5 to 64
>>>[EMAIL PROTECTED]:/usr/src/linux-mm$ 
>>
>>If you are sure that machine supports ACPI, maybe this is your problem
>>(from the POSIX high res timer patch):
>>
>>  If you enable the ACPI pm timer and it cannot be found, it is
>>  possible that your BIOS is not producing the ACPI table or
>>  that your machine does not support ACPI.  In the former case,
>>  see "Default ACPI pm timer address".  If the timer is not
>>  found the boot will fail when trying to calibrate the 'delay'
>>  loop.
> 
> Well, but how do I get the address? I'll try looking at BIOS
> options...
> 	Pavel

In my machine, if I turned off the PM code (in the BIOS) (or possibly turned
on the ACPI, again in the BIOS) it did produce the address.  Booting then
would put that address in the dmesg file.  You can then change the BIOS back
to what it was and use the address found in the dmesg file.
--
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/


Re: [PATCH] Filesystem linking protections

2005-02-07 Thread Lorenzo Hernández García-Hierro

On Mon, 07-02-2005 at 16:45 -0500, [EMAIL PROTECTED] wrote:
> On Mon, 07 Feb 2005 20:34:33 +0100, Lorenzo Hernández García-Hierro said:
> 
> > But It's better to give users a "secure-by-default" status, at least on
> > those parts that don't affect negatively the stability or the
> > performance itself.
> 
> It's still policy, and should be put someplace where users can manage it.
> You're changing the behavior from what POSIX specifies, and that's in general
> a no-no for mainline kernel code.

A sysctl can be a good option, creating a CTL_SECURITY and then
registering stuff under it, but this requires to have the kernel hackers
agree with implementing a new security suite and such.
In short, re-inventing the wheel.

> Like an LSM, which happens to be there so users can impose policy without
> making any code changes to the kernel.  Implementing a policy that results in
> non-POSIXy behavior in an LSM is perfectly OK.. ;)

It's currently made in vSecurity :)

> > The LSM hook call is before the check, so, LSM framework still has the
> > control over it, until it releases the operation giving control back to
> > the standard function.
> 
> Right.. Which means LSM can stop that particular attack even faster than
> your patch.. ;)

At least I don't interfere with LSM, so if no LSM hook adds its own
security checks, then my check gets used.

> > If users must rely on LSM or other external solutions for applying basic
> > security checks (as the framework itself only provides the way to apply
> > them, the checks need to be implemented in a module), then we are making
> > them unable to be protected using the "default" configuration.
> 
> You're making the very rash assumption that a hard-coded one-size-fits all
> "default" that behaves differently than POSIX is suitable for all sites,
> including sites that run software that gets broken by this change, and
> things like embedded systems where it's not a concern at all, and sites that
> already implement some *other* system to ensure that it's not an issue (for
> instance, by using an SELinux policy...)

Good point, then the solution is to make it config-dependent, and that's
a thing that kernel hackers seem to dislike.

Let me know what the final thought on this is, so I can work it out and
give you what you want without wasting time, and we can all be happy
with it :)

Cheers and thanks for the comments,
-- 
Lorenzo Hernández García-Hierro <[EMAIL PROTECTED]> 
[1024D/6F2B2DEC] & [2048g/9AE91A22][http://tuxedo-es.org]


signature.asc
Description: This part of the message is digitally signed

[PATCH 1/1] PCI: Dynids - passing driver data

2005-02-07 Thread brking


Currently, code exists in the pci layer to allow userspace to specify
driver data when adding a pci dynamic id from sysfs. However, this data
is never used and there exists no way in the existing code to use it.
This patch allows device drivers to indicate that they want driver data
passed to them on dynamic id adds by initializing use_driver_data in their
pci_driver->pci_dynids struct. The documentation has also been updated
to reflect this.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
---

 linux-2.6.11-rc3-bk4-bjking1/Documentation/pci.txt    |    8 ++++----
 linux-2.6.11-rc3-bk4-bjking1/drivers/pci/pci-driver.c |    1 -
 2 files changed, 4 insertions(+), 5 deletions(-)

diff -puN drivers/pci/pci-driver.c~pci_dynids_driver_data drivers/pci/pci-driver.c
--- linux-2.6.11-rc3-bk4/drivers/pci/pci-driver.c~pci_dynids_driver_data	2005-02-07 15:58:21.0 -0600
+++ linux-2.6.11-rc3-bk4-bjking1/drivers/pci/pci-driver.c	2005-02-07 15:58:21.0 -0600
@@ -115,7 +115,6 @@ static DRIVER_ATTR(new_id, S_IWUSR, NULL
 static inline void
 pci_init_dynids(struct pci_dynids *dynids)
 {
-	memset(dynids, 0, sizeof(*dynids));
 	spin_lock_init(&dynids->lock);
 	INIT_LIST_HEAD(&dynids->list);
 }
diff -puN Documentation/pci.txt~pci_dynids_driver_data Documentation/pci.txt
--- linux-2.6.11-rc3-bk4/Documentation/pci.txt~pci_dynids_driver_data	2005-02-07 15:58:21.0 -0600
+++ linux-2.6.11-rc3-bk4-bjking1/Documentation/pci.txt	2005-02-07 15:58:21.0 -0600
@@ -99,10 +99,10 @@ where all fields are passed in as hexade
 Users need pass only as many fields as necessary; vendor, device,
 subvendor, and subdevice fields default to PCI_ANY_ID (),
 class and classmask fields default to 0, and driver_data defaults to
-0UL.  Device drivers must call
-   pci_dynids_set_use_driver_data(pci_driver *, 1)
-in order for the driver_data field to get passed to the driver.
-Otherwise, only a 0 is passed in that field.
+0UL.  Device drivers must initialize use_driver_data in the dynids struct
+in their pci_driver struct prior to calling pci_register_driver in order
+for the driver_data field to get passed to the driver. Otherwise, only a
+0 is passed in that field.
 
 When the driver exits, it just calls pci_unregister_driver() and the PCI layer
 automatically calls the remove hook for all devices handled by the driver.
_
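
As a rough illustration of the opt-in described in this patch, the following
userspace sketch models a driver that sets use_driver_data before
registration.  The mock_* types and both functions are hypothetical stand-ins,
not kernel API; only the use_driver_data field name comes from the patch.  The
sketch assumes the memset() is dropped so that a flag set by the driver
beforehand is not wiped when the dynids struct is initialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct mock_dynids {
	bool use_driver_data;	/* driver opts in before registering */
	bool initialized;	/* stands in for the lock/list setup */
};

struct mock_pci_driver {
	const char *name;
	struct mock_dynids dynids;
};

/* Stand-in for pci_init_dynids() after the patch: it sets up internal
 * state but does not zero the whole struct, so the flag survives. */
void mock_init_dynids(struct mock_dynids *dynids)
{
	assert(dynids != NULL);
	dynids->initialized = true;
}

/* Value the driver would see for a dynamically added ID: the
 * user-supplied driver_data if the driver opted in, otherwise 0. */
unsigned long effective_driver_data(const struct mock_pci_driver *drv,
				    unsigned long user_data)
{
	assert(drv != NULL);
	return drv->dynids.use_driver_data ? user_data : 0UL;
}
```

A driver that leaves use_driver_data unset keeps the old behaviour and only
ever sees 0 in that field.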

Re: [PATCH] Re: msdos/vfat defaults are annoying

2005-02-07 Thread Ingo Oeser

Michelle Konzack wrote:
> On 2005-02-07 09:47:09, Pozsár Balázs wrote:
> > See? I _have_ that patch applied, that's why it tried vfat and not msdos
> > first.
>
> With this, you will never mount a Filesystem "msdos".
>
> Because "vfat" IS "msdos" + "lfn".
>
> You can attach to ALL "msdos" media "lfn" and you will have "vfat".

So msdos is vfat WITHOUT lfn, which is a restriction like noatime
or mounting ext3 as ext2.

That's why the default should be vfat indeed and the restriction should be
"nolfn", which will not allow lfns to be created and is what you actually 
intend, right?

But this will break API today, so it should be added to list of
features that will change.

Regards

Ingo Oeser


Re: [PATCH] Filesystem linking protections

On Mon, 07 Feb 2005 20:34:33 +0100, Lorenzo Hernández García-Hierro said:

> But It's better to give users a "secure-by-default" status, at least on
> those parts that don't affect negatively the stability or the
> performance itself.

It's still policy, and should be put someplace where users can manage it.
You're changing the behavior from what POSIX specifies, and that's in general
a no-no for mainline kernel code.

Like an LSM, which happens to be there so users can impose policy without
making any code changes to the kernel.  Implementing a policy that results in
non-POSIXy behavior in an LSM is perfectly OK.. ;)

> The LSM hook call is before the check, so, LSM framework still has the
> control over it, until it releases the operation giving control back to
> the standard function.

Right.. Which means LSM can stop that particular attack even faster than
your patch.. ;)

> If users must rely on LSM or other external solutions for applying basic
> security checks (as the framework itself only provides the way to apply
> them, the checks need to be implemented in a module), then we are making
> them unable to be protected using the "default" configuration.

You're making the very rash assumption that a hard-coded one-size-fits all
"default" that behaves differently than POSIX is suitable for all sites,
including sites that run software that gets broken by this change, and
things like embedded systems where it's not a concern at all, and sites that
already implement some *other* system to ensure that it's not an issue (for
instance, by using an SELinux policy...)


pgpan5ep3gfVq.pgp
Description: PGP signature

Re: ioremap() and port of linux to MPC7400 based SBC (VME board)

2005-02-07 Thread Heinz-Jürgen Oertel

him wrote:

> I have run into a problem I am having a hard time figuring out.
> 
> I have an MPC7400 SBC (PCI bus based) that has a device X residing
> at the following locations in memory:
> 
> 0x1860  - 0x186f  device control register space
> 0xb000  - 0xbfff  device memory space
> 
> Now assume for a moment that NOTHING special needs to be done to
> access either space once the system has booted and bus enumerator
> have set things up.
> 
> ioremap() of the first physical address returns a VALID virtual
> address  ... that I can read and write to. It works as expected
> because there are signature values at various offsets in the control
> register space.
> The virtual address returned is EQUAL to the physical address
> 
> ioremap() of the second physical address also returns what appears to
> be a VALID virtual address although WRITES go nowhere and READS return
> all ff's.
> The virtual address returned is 0xc100 
> 
> 
> 
> Now my question ... I have the source for the port. Where should I focus
> my efforts in trying to figure this out?
> 
> I have read the device drivers book and am certain that I am following
> the rules.
> 
> I should also mention that there is an IO controller separate from the
> MPC7400 that I use to verify that the device X control and memory exist
> in THAT physical range.
> 
> If Only I can access them through ioremap()
> 
> Thanks

No idea so far, but which kernel and which Linux? Is it VM Linux or uClinux?

Heinz
-- 

with best regards / mit freundlichen Grüßen

   Heinz-Jürgen Oertel
+===
| Heinz-Jürgen Oertel  port GmbH  http://www.port.de
| mailto:[EMAIL PROTECTED]
| phone +49 345 77755-0 fax   +49 345 77755-20
| Regensburger Str. 7b, D-06132 Halle/Saale,  Germany 
| CAN Wikihttp://www.CAN-Wiki.info
| Newsletter: http://www.port.de/engl/company/content/abo_form.html
+===

Re: question on symbol exports

2005-02-07 Thread Benjamin Herrenschmidt

On Mon, 2005-02-07 at 08:44 -0600, Chris Friesen wrote:
> Benjamin Herrenschmidt wrote:
> >>It turns out that to call ptep_clear_flush_dirty() on ppc64 from a 
> >>module I needed to export the following symbols:
> >>
> >>__flush_tlb_pending
> >>ppc64_tlb_batch
> >>hpte_update
> > 
> > 
> > Any reason why you need to call that from a module ? Is the module
> > GPL'd ?
> 
> I explained this at the beginning of the thread, but I'll do so again. 
> The module will be released under the GPL.
> 
> The basic idea is that we want to be able to track pages dirtied by a 
> userspace process.  The system has no swap, so we use the dirty bit for 
> this.  On demand we look up the page tables for an address range 
> specified by the caller, store the addresses of any dirty pages, then 
> mark them clean so that the next write causes them to get marked dirty 
> again.  It is this act of marking them clean that requires the 
> additional exports.
> 
> I've included the current code below.  If there is any way to accomplish 
> this without the additional exports, I'd love to hear about it.

Interesting... more than no swap, you must also make sure you have no
r/w mmap'ed file (which are technically equivalent to swap).

I'm not a big fan of exporting those symbols, but I'll talk to paulus,
it should be possible at least to EXPORT_SYMBOL_GPL them...

Ben.
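
The collect-and-clean scheme Chris describes can be sketched in userspace
with a toy page table.  Everything here (toy_pagetable, toy_collect_dirty,
TOY_NPAGES) is a hypothetical model for illustration only; the real module
would walk the process page tables and clear the bit with
ptep_clear_flush_dirty(), which is what requires the extra exports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Toy page table: one dirty flag per page. */
struct toy_pagetable {
	bool dirty[TOY_NPAGES];
};

/* Record the indices of dirty pages in [start, end) into out[], mark
 * each recorded page clean, and return how many were found.  Pages
 * beyond out_cap entries are left dirty for a later call. */
size_t toy_collect_dirty(struct toy_pagetable *pt, size_t start, size_t end,
			 size_t *out, size_t out_cap)
{
	size_t n = 0;

	assert(pt != NULL && out != NULL);
	assert(start <= end && end <= TOY_NPAGES);
	for (size_t i = start; i < end && n < out_cap; i++) {
		if (pt->dirty[i]) {
			out[n++] = i;
			pt->dirty[i] = false;	/* next write re-dirties it */
		}
	}
	return n;
}
```

Calling toy_collect_dirty() twice in a row with no writes in between returns
zero pages the second time, which is the property the tracking relies on.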



Re: [RFC] Changing COW detection to be memory hotplug friendly

2005-02-07 Thread Hugh Dickins

On Thu, 3 Feb 2005, IWAMOTO Toshihiro wrote:
> The current implementation of memory hotremoval relies on pages being
> unmappable from process address spaces.  After successful unmapping,
> subsequent accesses to the pages are blocked and don't interfere with
> the hotremoval operation.
> 
> However, this code
> 
> 	if (PageSwapCache(page) &&
> 	    page_count(page) != page_mapcount(page) + 2) {
> 		ret = SWAP_FAIL;
> 		goto out_unmap;
> 	}

Yes, that is odd code.  It would be nice to have a solution without it.

> in try_to_unmap_one() prevents unmapping pages that are referenced via
> get_user_pages(), and such references can be held for a long time if
> they are due to, for example, direct IO.
> I've made a test program that issues multiple direct IO read requests
> against a single read buffer, and pages that belong to the buffer
> cannot be hotremoved because they aren't unmapped.

I haven't looked at the rest of your hotremoval, so it's not obvious
to me how a change here would help you - obviously you wouldn't want
to be migrating pages while direct IO to them was in progress.

I presume your patch works for you by letting the page count fall
to a point where migration moves it automatically as soon as the
got_user_pages are put, where without your patch the count is held
too high, and you keep doing scans which tend to miss the window
in which those pages are put?
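
To make the effect of the quoted check concrete, here is a small userspace
model of it.  The toy_page struct and toy_unmap_check() are hypothetical; the
"+ 2" is copied from the quoted code, and the model makes no claim about
which two references it stands for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_result { TOY_SWAP_OK, TOY_SWAP_FAIL };

/* Hypothetical stand-in for the fields the quoted test looks at. */
struct toy_page {
	bool in_swap_cache;
	int count;	/* all references to the page */
	int mapcount;	/* references from page tables only */
};

/* Mirrors the quoted condition: a swap-cache page whose count is not
 * exactly mapcount + 2 is refused. */
enum toy_result toy_unmap_check(const struct toy_page *page)
{
	assert(page != NULL);
	if (page->in_swap_cache && page->count != page->mapcount + 2)
		return TOY_SWAP_FAIL;
	return TOY_SWAP_OK;
}
```

With mapcount 1 and count 3 the check passes; one extra reference, such as a
get_user_pages() pin held by direct IO, raises count to 4 and the unmap is
refused for as long as the pin is held.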

> The following patch, which is against linux-2.6.11-rc1-mm1 and also
> tested with linux-2.6.11-rc2-mm2, fixes this issue.  The purpose of
> this patch is to be able to unmap pages that have incremented
> page_count.  To do that consistently, the COW detection logic needs to
> be modified not to rely on page_count.  I'm aware that such
> extensive use of page_mapcount is discouraged and there is a plan to
> kill page_mapcount (*), but I cannot think of a better alternative
> solution.
> 
> (*) c.f. http://www.ussg.iu.edu/hypermail/linux/kernel/0406.0/0483.html

I apologize for scaring you off page mapcount.  I have no current
plans to scrap it, and feel a lot more satisfied with it than at the
time of that comment.  Partly because it's now manipulated atomically
rather than under bitspin lock.  Partly because I realize that although
64-bit systems are overdue for an atomic64 page count and page mapcount,
we can actually just use one atomic64 for them both, keeping, say, lower
24 bits for count and upper 40 for mapcount (and not repeating mapcount
in count on these arches), so mapcount won't increase struct page size.

Go right ahead and use page mapcount if it's appropriate.
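
The shared-counter idea above (lower 24 bits for count, upper 40 for
mapcount) can be sketched with plain 64-bit arithmetic.  The names below are
hypothetical, and a real implementation would operate on an atomic 64-bit
type rather than a plain uint64_t.

```c
#include <assert.h>
#include <stdint.h>

#define COUNT_BITS	24
#define COUNT_MASK	((UINT64_C(1) << COUNT_BITS) - 1)
#define MAPCOUNT_MAX	((UINT64_C(1) << 40) - 1)

/* Pack count into the low 24 bits and mapcount into the high 40. */
uint64_t pack_counts(uint64_t count, uint64_t mapcount)
{
	assert(count <= COUNT_MASK && mapcount <= MAPCOUNT_MAX);
	return (mapcount << COUNT_BITS) | count;
}

uint64_t unpack_count(uint64_t v)
{
	return v & COUNT_MASK;
}

uint64_t unpack_mapcount(uint64_t v)
{
	return v >> COUNT_BITS;
}
```

Adding 1 to the packed value bumps count, and adding 1 << COUNT_BITS bumps
mapcount, so each counter could be updated with a single atomic add as long
as count never overflows its 24 bits.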

> Some notes about my code:
> 
>   - I think it's safe to rely on page_mapcount in do_swap_page(),
> because its use is protected by lock_page().

I think so too.

>   - The can_share_swap_page() call in do_swap_page() always returns
> false.  It is inefficient but should be harmless.  Incrementing
> page_mapcount before calling that function should fix the problem,
> but it may cause bad side effects.

Odd that your patch moves it if it now doesn't even work!
But I think some more movement should be able to solve that.

>   - Another obvious solution to this issue is to find the "offending"
> process from a un-unmappable page and suspend it until the page is
> unmapped.  I'm afraid the implementation would be much more complicated.

Agreed, let's not get into that.

>   - I could not test the following situation.  It should be possible
> to write some kernel code to do that, but please let me know if
> you know any such test cases.
>       - A page_count is incremented by get_user_pages().
>       - The page gets unmapped.
>       - The process causes a write fault for the page, before the
>         incremented page_count is dropped.

I confess I don't have such a test case ready myself.

> Also, while I've tried carefully not to make mistakes and done some
> testing, I'm not very sure this is bug free.  Please comment.
> 
> --- mm/memory.c.orig  2005-01-17 14:47:11.0 +0900
> +++ mm/memory.c   2005-01-17 14:55:51.0 +0900
> @@ -1786,10 +1786,6 @@ static int do_swap_page(struct mm_struct
>   }
>  
>   /* The page isn't present yet, go ahead with the fault. */
> - 
> - swap_free(entry);
> - if (vm_swap_full())
> - remove_exclusive_swap_page(page);
>  
>   mm->rss++;
>   acct_update_integrals();
> @@ -1800,6 +1796,10 @@ static int do_swap_page(struct mm_struct
>   pte = maybe_mkwrite(pte_mkdirty(pte), vma);
>   write_access = 0;
>   }
> + 
> + swap_free(entry);
> + if (vm_swap_full())
> + remove_exclusive_swap_page(page);
>   unlock_page(page);
>  
>   flush_icache_page(vma, page);
> --- mm/rmap.c.orig2005-01-17 14:40:08.0 +0900
> +++ mm/rmap.c 2005-01-21 12:34:06.0 +0900
> @@ -569,8 +569,11 @@ static int try_to_unmap_one(struct page 
>*/
>

[Patch] only unmap what intersects a direct_IO op

2005-02-07 Thread Zach Brown


Now that we're only invalidating the pages that intersected a direct IO write
we might as well only unmap the intersecting bytes as well.  This passed a
light fsx load with page cache, direct, and mmap IO.

Signed-off-by: Zach Brown <[EMAIL PROTECTED]>

---

 filemap.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

Index: 2.6-bk-odirinv/mm/filemap.c
===
--- 2.6-bk-odirinv.orig/mm/filemap.c2005-02-07 12:42:50.0 -0800
+++ 2.6-bk-odirinv/mm/filemap.c 2005-02-07 12:43:16.244253441 -0800
@@ -2285,22 +2285,26 @@
 	struct file *file = iocb->ki_filp;
 	struct address_space *mapping = file->f_mapping;
 	ssize_t retval;
+	size_t write_len = 0;
 
 	/*
 	 * If it's a write, unmap all mmappings of the file up-front.  This
 	 * will cause any pte dirty bits to be propagated into the pageframes
 	 * for the subsequent filemap_write_and_wait().
 	 */
-	if (rw == WRITE && mapping_mapped(mapping))
-		unmap_mapping_range(mapping, 0, -1, 0);
+	if (rw == WRITE) {
+		write_len = iov_length(iov, nr_segs);
+		if (mapping_mapped(mapping))
+			unmap_mapping_range(mapping, offset, write_len, 0);
+	}
 
 	retval = filemap_write_and_wait(mapping);
 	if (retval == 0) {
 		retval = mapping->a_ops->direct_IO(rw, iocb, iov,
 						offset, nr_segs);
 		if (rw == WRITE && mapping->nrpages) {
-			pgoff_t end = (offset + iov_length(iov, nr_segs) - 1)
-					>> PAGE_CACHE_SHIFT;
+			pgoff_t end = (offset + write_len - 1)
+						>> PAGE_CACHE_SHIFT;
 			int err = invalidate_inode_pages2_range(mapping,
 					offset >> PAGE_CACHE_SHIFT, end);
 			if (err)
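
The page-index arithmetic used in the patch for the invalidated range can be
checked in isolation with a small userspace sketch.  TOY_PAGE_SHIFT = 12
(4 KiB pages) is an assumption, and toy_page_range and toy_write_pages() are
hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

struct toy_page_range {
	uint64_t first;	/* index of the first page the write touches */
	uint64_t last;	/* index of the last page the write touches */
};

/* Same arithmetic as the patch: the last byte written is at
 * offset + write_len - 1, and shifting a byte offset right by the page
 * shift gives its page index. */
struct toy_page_range toy_write_pages(uint64_t offset, size_t write_len)
{
	struct toy_page_range r;

	assert(write_len > 0);
	r.first = offset >> TOY_PAGE_SHIFT;
	r.last = (offset + write_len - 1) >> TOY_PAGE_SHIFT;
	return r;
}
```

A 2-byte write at offset 4095 straddles a page boundary and covers pages 0
and 1, while a page-aligned 4096-byte write at offset 4096 covers only
page 1.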

Re: M7101