Re: [patch 20/20] XEN-paravirt: Add Xen virtual block device driver.

2007-01-13 Thread Greg KH
On Sat, Jan 13, 2007 at 05:07:28PM -0800, Arjan van de Ven wrote:
> > +
> > +#define DPRINTK(_f, _a...) pr_debug(_f, ## _a)
> 
> why this silly abstraction? Just use pr_debug in the code directly

Actually, for drivers, like this one, you should use the dev_printk()
and friends (dev_dbg, dev_err, etc.) instead so that userspace knows
exactly which device and driver the message comes from.
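
As a rough illustration of what that buys you, here is a userspace mock
(not kernel code; struct mock_device and mock_dev_msg() are made-up
names for this sketch).  It assumes the "<driver> <device>: " prefix
that dev_printk() puts in front of every message:

```c
#include <stdio.h>

/* Hypothetical stand-in for struct device: only the two names that
 * the dev_*() helpers print are modelled here. */
struct mock_device {
	const char *driver_name;	/* made-up example: "vbd" */
	const char *bus_id;		/* made-up example: "vbd-768" */
};

/* Format msg with a "<driver> <device>: " prefix, as dev_printk()
 * is assumed to do.  Returns what snprintf() returns. */
static int mock_dev_msg(char *out, size_t len,
			const struct mock_device *dev, const char *msg)
{
	return snprintf(out, len, "%s %s: %s",
			dev->driver_name, dev->bus_id, msg);
}
```

A bare pr_debug() call carries none of that context, which is why the
dev_dbg()/dev_err() family is preferred inside drivers.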

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: No more "device" symlinks for classes

2007-01-13 Thread Greg KH
On Sun, Jan 14, 2007 at 09:10:59AM +0300, Andrey Borzenkov wrote:
> Pierre Ossman wrote:
> 
> > Hi guys,
> > 
> > I just wanted to know the rationale behind
> > 99ef3ef8d5f2f5b5312627127ad63df27c0d0d05 (no more "device" symlink in
> > class devices). I thought that was a rather convenient way of finding
> > which physical device the class device was coupled to.
> > 
> 
> Actually I wonder why those links are still present even though I told
> the system not to create them?
> 
> {pts/1}% grep DEPRE /boot/config
> # CONFIG_SYSFS_DEPRECATED is not set
> # CONFIG_PM_SYSFS_DEPRECATED is not set
> {pts/1}% find /sys/class -name device
> /sys/class/pcmcia_socket/pcmcia_socket2/device
> /sys/class/pcmcia_socket/pcmcia_socket1/device
> /sys/class/pcmcia_socket/pcmcia_socket0/device
> /sys/class/usb_device/usbdev1.1/device
> /sys/class/usb_host/usb_host1/device
> /sys/class/scsi_disk/0:0:0:0/device
> /sys/class/scsi_device/1:0:0:0/device
> /sys/class/scsi_device/0:0:0:0/device
> /sys/class/scsi_host/host1/device
> /sys/class/scsi_host/host0/device
> /sys/class/net/eth0/device
> /sys/class/net/eth1/device
> /sys/class/input/input1/ts0/device
> /sys/class/input/input1/mouse0/device
> /sys/class/input/input1/event1/device
> /sys/class/input/input1/device
> /sys/class/input/input0/event0/device
> /sys/class/input/input0/device
> {pts/1}% uname -a
> Linux cooker 2.6.20-rc5-1avb #10 Sat Jan 13 14:05:34 MSK 2007 i686 Pentium
> III (Coppermine) GNU/Linux

Because I haven't finished converting all of the different usages of
struct class_device to struct device just yet.  When that happens, those
links go away, as the /sys/class/foo_class/foo is a symlink itself into
the /sys/devices/ tree.
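
For anyone who wants to check where a given entry points, a minimal
userspace sketch (link_target() is a made-up helper name; it only wraps
readlink(2) and knows nothing about sysfs itself):

```c
#include <stdio.h>
#include <unistd.h>

/* Read the target of the symlink at path into buf, NUL-terminated.
 * Returns 0 on success, -1 on error or if the target may have been
 * truncated to fit the buffer. */
static int link_target(const char *path, char *buf, size_t len)
{
	ssize_t n;

	if (len < 2)
		return -1;
	n = readlink(path, buf, len - 1);
	if (n < 0 || (size_t)n >= len - 1)
		return -1;
	buf[n] = '\0';
	return 0;
}
```

Called on something like /sys/class/net/eth0 on a converted class, it
would be expected to return a relative path into /sys/devices/; on an
unconverted class the entry is a real directory and readlink() fails.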

If you look in the -mm tree there is a patch for the network devices,
and I have patches in my tree (but not -mm) for pcmcia, usb_host,
usb_device, and input.  These patches still need a bit of work before
sending them on to their respective maintainers for acceptance.

Hope this helps explain things,

greg k-h


Re: [patch 18/20] XEN-paravirt: Add Xen driver utility functions.

2007-01-13 Thread Greg KH
On Fri, Jan 12, 2007 at 05:45:57PM -0800, Jeremy Fitzhardinge wrote:
> Allocate/destroy a 'vmalloc' VM area: alloc_vm_area and free_vm_area
> The alloc function ensures that page tables are constructed for the
> region of kernel virtual address space and mapped into init_mm.

Shouldn't these functions go into the core mm/ kernel code if they are
needed, instead of living in the xen directories?

thanks,

greg k-h


Re: Linux v2.6.20-rc5

2007-01-13 Thread Adrian Bunk
On Sun, Jan 14, 2007 at 03:38:24PM +0800, Jeff Chua wrote:
> On 1/13/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:
> >On Jan 13 2007 06:01, Adrian Bunk wrote:
> >>On Fri, Jan 12, 2007 at 02:26:45PM -0800, Andrew Morton wrote:
> 
> >*cough*vmware*cough*
> 
> setting CONFIG_PARAVIRT=y will result in ...
> 
>   vmmon.ko module unknown symbol paravirt_ops
> 
> Without it, vmware runs. Any fix?

Please send the 2.6.20-rc5 .config you saw this with.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: Linux v2.6.20-rc5

2007-01-13 Thread Jeff Chua

On 1/13/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:

On Jan 13 2007 06:01, Adrian Bunk wrote:
>On Fri, Jan 12, 2007 at 02:26:45PM -0800, Andrew Morton wrote:



*cough*vmware*cough*


setting CONFIG_PARAVIRT=y will result in ...

  vmmon.ko module unknown symbol paravirt_ops

Without it, vmware runs. Any fix?


[patch-mm 3/3] Scheduled removal of SA_xxx interrupt flags fixups 2 (mm)

2007-01-13 Thread Thomas Gleixner
The obsolete SA_xxx interrupt flags have been used despite the scheduled
removal. Fixup the remaining users in -mm.
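
Most hunks are one-for-one flag renames.  The vmitime.c hunk also moves
struct irqaction from positional to designated initializers.  A small
userspace sketch of why that is the safer form (struct mock_irqaction
is a simplified stand-in, and the 0x20 flag value is assumed for
illustration only):

```c
#include <stddef.h>

typedef int (*mock_handler_t)(int irq, void *dev_id);

/* Simplified stand-in for struct irqaction; field order is only an
 * assumption of this sketch. */
struct mock_irqaction {
	mock_handler_t handler;
	unsigned long flags;
	unsigned long mask;
	const char *name;
	void *dev_id;
	struct mock_irqaction *next;
};

static int mock_timer_interrupt(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return 1;
}

/* Fields that are not named are zero-initialized, so the trailing
 * NULL, NULL of the positional form is no longer needed and a later
 * reordering of the struct cannot silently shift the values. */
static struct mock_irqaction mock_timer_irq = {
	.handler = mock_timer_interrupt,
	.flags = 0x20,		/* assumed value, stands in for IRQF_DISABLED */
	.name = "VMI-alarm",
};
```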

Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

Index: linux-2.6.20-rc4-mm1/arch/i386/kernel/vmitime.c
===
--- linux-2.6.20-rc4-mm1.orig/arch/i386/kernel/vmitime.c
+++ linux-2.6.20-rc4-mm1/arch/i386/kernel/vmitime.c
@@ -121,12 +121,10 @@ static struct clocksource clocksource_vm
 static irqreturn_t vmi_timer_interrupt(int irq, void *dev_id);
 
 struct irqaction vmi_timer_irq  = {
-   vmi_timer_interrupt,
-   SA_INTERRUPT,
-   CPU_MASK_NONE,
-   "VMI-alarm",
-   NULL,
-   NULL
+   .handler = vmi_timer_interrupt,
+   .flags = IRQF_DISABLED,
+   .mask = CPU_MASK_NONE,
+   .name = "VMI-alarm",
 };
 
 /* Alarm rate */
Index: linux-2.6.20-rc4-mm1/drivers/char/nozomi.c
===
--- linux-2.6.20-rc4-mm1.orig/drivers/char/nozomi.c
+++ linux-2.6.20-rc4-mm1/drivers/char/nozomi.c
@@ -1378,7 +1378,7 @@ static int nozomi_setup_interrupt(struct
 {
int rval;
 
-   rval = request_irq(dc->pdev->irq, _handler, SA_SHIRQ,
+   rval = request_irq(dc->pdev->irq, _handler, IRQF_SHARED,
   NOZOMI_NAME, dc);
if (rval)
dev_err(>pdev->dev, "Cannot open because IRQ %d "
Index: linux-2.6.20-rc4-mm1/drivers/firewire/fw-ohci.c
===
--- linux-2.6.20-rc4-mm1.orig/drivers/firewire/fw-ohci.c
+++ linux-2.6.20-rc4-mm1/drivers/firewire/fw-ohci.c
@@ -714,7 +714,7 @@ static int ohci_enable(struct fw_card *c
reg_write(ohci, OHCI1394_AsReqFilterHiSet, 0x8000);
 
if (request_irq(dev->irq, irq_handler,
-   SA_SHIRQ, ohci_driver_name, ohci)) {
+   IRQF_SHARED, ohci_driver_name, ohci)) {
fw_error("Failed to allocate shared interrupt %d.\n",
 dev->irq);
dma_free_coherent(ohci->card.device, CONFIG_ROM_SIZE,
Index: linux-2.6.20-rc4-mm1/drivers/input/keyboard/gpio_keys.c
===
--- linux-2.6.20-rc4-mm1.orig/drivers/input/keyboard/gpio_keys.c
+++ linux-2.6.20-rc4-mm1/drivers/input/keyboard/gpio_keys.c
@@ -78,7 +78,7 @@ static int __devinit gpio_keys_probe(str
int irq = IRQ_GPIO(pdata->buttons[i].gpio);
 
set_irq_type(irq, IRQ_TYPE_EDGE_BOTH);
-   error = request_irq(irq, gpio_keys_isr, SA_SAMPLE_RANDOM,
+   error = request_irq(irq, gpio_keys_isr, IRQF_SAMPLE_RANDOM,
 pdata->buttons[i].desc ? 
pdata->buttons[i].desc : "gpio_keys",
 pdev);
if (error) {
Index: linux-2.6.20-rc4-mm1/drivers/mtd/nand/cafe.c
===
--- linux-2.6.20-rc4-mm1.orig/drivers/mtd/nand/cafe.c
+++ linux-2.6.20-rc4-mm1/drivers/mtd/nand/cafe.c
@@ -596,7 +596,8 @@ static int __devinit cafe_nand_probe(str
cafe_writel(cafe, 0x, NAND_TIMING3);
}
cafe_writel(cafe, 0x, NAND_IRQ_MASK);
-   err = request_irq(pdev->irq, _nand_interrupt, SA_SHIRQ, "CAFE NAND", mtd);
+   err = request_irq(pdev->irq, _nand_interrupt, IRQF_SHARED,
+ "CAFE NAND", mtd);
if (err) {
dev_warn(>dev, "Could not register IRQ %d\n", pdev->irq);
 
Index: linux-2.6.20-rc4-mm1/drivers/net/cxgb3/cxgb3_main.c
===
--- linux-2.6.20-rc4-mm1.orig/drivers/net/cxgb3/cxgb3_main.c
+++ linux-2.6.20-rc4-mm1/drivers/net/cxgb3/cxgb3_main.c
@@ -709,7 +709,8 @@ static int cxgb_up(struct adapter *adap)
  t3_intr_handler(adap,
  adap->sge.qs[0].rspq.
  polling),
- (adap->flags & USING_MSI) ? 0 : SA_SHIRQ,
+ (adap->flags & USING_MSI) ?
+  0 : IRQF_SHARED,
  adap->name, adap)))
goto irq_err;
 
Index: linux-2.6.20-rc4-mm1/drivers/net/sc92031.c
===
--- linux-2.6.20-rc4-mm1.orig/drivers/net/sc92031.c
+++ linux-2.6.20-rc4-mm1/drivers/net/sc92031.c
@@ -1035,7 +1035,7 @@ static int sc92031_open(struct net_devic
priv->tx_head = priv->tx_tail = 0;
 
err = request_irq(pdev->irq, sc92031_interrupt,
-   SA_SHIRQ, dev->name, dev);
+   IRQF_SHARED, dev->name, dev);
if (unlikely(err < 0))
goto out_request_irq;
 
Index: linux-2.6.20-rc4-mm1/kernel/irq/manage.c

[patch 2/3] Scheduled removal of SA_xxx interrupt flags fixups

2007-01-13 Thread Thomas Gleixner
The obsolete SA_xxx interrupt flags have been used despite the scheduled
removal. Fixup the remaining users.

Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

Index: linux-2.6.20-rc5/kernel/irq/manage.c
===
--- linux-2.6.20-rc5.orig/kernel/irq/manage.c
+++ linux-2.6.20-rc5/kernel/irq/manage.c
@@ -442,7 +442,7 @@ int request_irq(unsigned int irq, irq_ha
/*
 * Lockdep wants atomic interrupt handlers:
 */
-   irqflags |= SA_INTERRUPT;
+   irqflags |= IRQF_DISABLED;
 #endif
/*
 * Sanity-check: shared interrupts must pass in a real dev-ID,
Index: linux-2.6.20-rc5/drivers/usb/host/ohci-ep93xx.c
===
--- linux-2.6.20-rc5.orig/drivers/usb/host/ohci-ep93xx.c
+++ linux-2.6.20-rc5/drivers/usb/host/ohci-ep93xx.c
@@ -78,7 +78,7 @@ static int usb_hcd_ep93xx_probe(const st
 
ohci_hcd_init(hcd_to_ohci(hcd));
 
-   retval = usb_add_hcd(hcd, pdev->resource[1].start, SA_INTERRUPT);
+   retval = usb_add_hcd(hcd, pdev->resource[1].start, IRQF_DISABLED);
if (retval == 0)
return retval;
 
Index: linux-2.6.20-rc5/drivers/usb/host/ohci-pnx4008.c
===
--- linux-2.6.20-rc5.orig/drivers/usb/host/ohci-pnx4008.c
+++ linux-2.6.20-rc5/drivers/usb/host/ohci-pnx4008.c
@@ -421,7 +421,7 @@ static int __devinit usb_hcd_pnx4008_pro
ohci_hcd_init(ohci);
 
dev_info(>dev, "at 0x%p, irq %d\n", hcd->regs, hcd->irq);
-   ret = usb_add_hcd(hcd, irq, SA_INTERRUPT);
+   ret = usb_add_hcd(hcd, irq, IRQF_DISABLED);
if (ret == 0)
return ret;
 
Index: linux-2.6.20-rc5/drivers/usb/host/ohci-pnx8550.c
===
--- linux-2.6.20-rc5.orig/drivers/usb/host/ohci-pnx8550.c
+++ linux-2.6.20-rc5/drivers/usb/host/ohci-pnx8550.c
@@ -107,7 +107,7 @@ int usb_hcd_pnx8550_probe (const struct 
 
ohci_hcd_init(hcd_to_ohci(hcd));
 
-   retval = usb_add_hcd(hcd, dev->resource[1].start, SA_INTERRUPT);
+   retval = usb_add_hcd(hcd, dev->resource[1].start, IRQF_DISABLED);
if (retval == 0)
return retval;
 
Index: linux-2.6.20-rc5/drivers/usb/gadget/pxa2xx_udc.c
===
--- linux-2.6.20-rc5.orig/drivers/usb/gadget/pxa2xx_udc.c
+++ linux-2.6.20-rc5/drivers/usb/gadget/pxa2xx_udc.c
@@ -2614,7 +2614,7 @@ lubbock_fail0:
 #endif
if (vbus_irq) {
retval = request_irq(vbus_irq, udc_vbus_irq,
-   SA_INTERRUPT | SA_SAMPLE_RANDOM,
+   IRQF_DISABLED | IRQF_SAMPLE_RANDOM,
driver_name, dev);
if (retval != 0) {
printk(KERN_ERR "%s: can't get irq %i, err %d\n",
Index: linux-2.6.20-rc5/drivers/net/qla3xxx.c
===
--- linux-2.6.20-rc5.orig/drivers/net/qla3xxx.c
+++ linux-2.6.20-rc5/drivers/net/qla3xxx.c
@@ -2999,7 +2999,7 @@ static int ql_adapter_up(struct ql3_adap
 {
struct net_device *ndev = qdev->ndev;
int err;
-   unsigned long irq_flags = SA_SAMPLE_RANDOM | SA_SHIRQ;
+   unsigned long irq_flags = IRQF_SAMPLE_RANDOM | IRQF_SHARED;
unsigned long hw_flags;
 
if (ql_alloc_mem_resources(qdev)) {
@@ -3018,7 +3018,7 @@ static int ql_adapter_up(struct ql3_adap
} else {
printk(KERN_INFO PFX "%s: MSI Enabled...\n", 
qdev->ndev->name);
set_bit(QL_MSI_ENABLED,>flags);
-   irq_flags &= ~SA_SHIRQ;
+   irq_flags &= ~IRQF_SHARED;
}
}
 
Index: linux-2.6.20-rc5/drivers/scsi/aic94xx/aic94xx_init.c
===
--- linux-2.6.20-rc5.orig/drivers/scsi/aic94xx/aic94xx_init.c
+++ linux-2.6.20-rc5/drivers/scsi/aic94xx/aic94xx_init.c
@@ -646,7 +646,7 @@ static int __devinit asd_pci_probe(struc
if (use_msi)
pci_enable_msi(asd_ha->pcidev);
 
-   err = request_irq(asd_ha->pcidev->irq, asd_hw_isr, SA_SHIRQ,
+   err = request_irq(asd_ha->pcidev->irq, asd_hw_isr, IRQF_SHARED,
  ASD_DRIVER_NAME, asd_ha);
if (err) {
asd_printk("couldn't get irq %d for %s\n",
Index: linux-2.6.20-rc5/drivers/net/netxen/netxen_nic_main.c
===
--- linux-2.6.20-rc5.orig/drivers/net/netxen/netxen_nic_main.c
+++ linux-2.6.20-rc5/drivers/net/netxen/netxen_nic_main.c
@@ -619,8 +619,8 @@ static int netxen_nic_open(struct net_de
}
adapter->irq = adapter->ahw.pdev->irq;
err = request_irq(adapter->ahw.pdev->irq, _intr,
- SA_SHIRQ 

[patch 1/3] Scheduled removal of SA_xxx interrupt flags

2007-01-13 Thread Thomas Gleixner
The name space cleanup of the interrupt request flags (SA_xxx -> IRQF_xxx)
left a 6 month grace period for the old deprecated flags. Remove them.
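
Since the helpers being removed are plain aliases, converting a caller
is mechanical.  A userspace sketch of that equivalence (the numeric
flag values are assumed here for illustration; the authoritative ones
are in include/linux/interrupt.h, and the two helper functions are
made-up names):

```c
/* Assumed values, for illustration only. */
#define IRQF_DISABLED		0x00000020
#define IRQF_SAMPLE_RANDOM	0x00000040
#define IRQF_SHARED		0x00000080

/* The migration helpers, as in the hunk removed below. */
#define SA_INTERRUPT		IRQF_DISABLED
#define SA_SAMPLE_RANDOM	IRQF_SAMPLE_RANDOM
#define SA_SHIRQ		IRQF_SHARED

/* Flags as an unconverted caller would spell them. */
static unsigned long old_style_flags(void)
{
	return SA_SAMPLE_RANDOM | SA_SHIRQ;
}

/* The same flags after conversion; the resulting word is identical. */
static unsigned long new_style_flags(void)
{
	return IRQF_SAMPLE_RANDOM | IRQF_SHARED;
}
```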

Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

Index: linux-2.6.20-rc3-mm1/Documentation/feature-removal-schedule.txt
===
--- linux-2.6.20-rc3-mm1.orig/Documentation/feature-removal-schedule.txt
+++ linux-2.6.20-rc3-mm1/Documentation/feature-removal-schedule.txt
@@ -182,15 +182,6 @@ Who:   Nick Piggin <[EMAIL PROTECTED]>
 
 ---
 
-What:  Interrupt only SA_* flags
-When:  Januar 2007
-Why:   The interrupt related SA_* flags are replaced by IRQF_* to move them
-   out of the signal namespace.
-
-Who:   Thomas Gleixner <[EMAIL PROTECTED]>
-

-
 What:  PHYSDEVPATH, PHYSDEVBUS, PHYSDEVDRIVER in the uevent environment
 When:  October 2008
 Why:   The stacking of class devices makes these values misleading and
Index: linux-2.6.20-rc3-mm1/include/linux/interrupt.h
===
--- linux-2.6.20-rc3-mm1.orig/include/linux/interrupt.h
+++ linux-2.6.20-rc3-mm1/include/linux/interrupt.h
@@ -49,22 +49,6 @@
 #define IRQF_TIMER 0x0200
 #define IRQF_PERCPU0x0400
 
-/*
- * Migration helpers. Scheduled for removal in 1/2007
- * Do not use for new code !
- */
-#define SA_INTERRUPT   IRQF_DISABLED
-#define SA_SAMPLE_RANDOM   IRQF_SAMPLE_RANDOM
-#define SA_SHIRQ   IRQF_SHARED
-#define SA_PROBEIRQIRQF_PROBE_SHARED
-#define SA_PERCPU  IRQF_PERCPU
-
-#define SA_TRIGGER_LOW IRQF_TRIGGER_LOW
-#define SA_TRIGGER_HIGHIRQF_TRIGGER_HIGH
-#define SA_TRIGGER_FALLING IRQF_TRIGGER_FALLING
-#define SA_TRIGGER_RISING  IRQF_TRIGGER_RISING
-#define SA_TRIGGER_MASKIRQF_TRIGGER_MASK
-
 typedef irqreturn_t (*irq_handler_t)(int, void *);
 
 struct irqaction {

-- 



[patch 0/3] Scheduled removal of SA_xxx interrupt flags

2007-01-13 Thread Thomas Gleixner
Andrew,

the following series removes the deprecated SA_xx interrupt flags as scheduled.
There are some new users of those flags since the initial cleanup patch. The
fixup of those users is split into two parts:
 1) mainline fixups
 2) -mm fixups

tglx

-- 



[patch] faster vgetcpu using sidt (take 2)

2007-01-13 Thread dean gaudet
ok here is the latest rev of this patch (against 2.6.20-rc4).

timings in cycles:

                 baseline   patched    baseline   patched
                 no cache   no cache   cache      cache
k8 pre-revF      21         16         14         17
k8 revF          31         17         14         17
core2            38         16         12         14
p4               49         41         24         24

the degradation in cached timings appears to be due to the 16 byte stack
frame set up for the sidt instruction.  apparently due to -mno-red-zone...
would you accept a patch which re-enables the red-zone for vsyscalls?

here is the slightly updated description:

below is a patch which improves vgetcpu latency on all x86_64 
implementations i've tested.

Nathan Laredo pointed out the sgdt/sidt/sldt instructions are 
userland-accessible and we could use their limit fields to tuck away a few 
bits of per-cpu information.

vgetcpu generally uses lsl at present, but all of sgdt/sidt/sldt are
faster than lsl on all x86_64 processors i've tested.  lsl requires
microcoded permission testing whereas s*dt are free of any such hassle.

sldt is the least expensive of the three instructions however it's a 
hassle to use because processes may want to adjust their ldt.  sidt/sgdt 
have essentially the same performance across all the major architectures 
-- however sidt has the advantage that its limit field is 16-bits, yet any 
value >= 0xfff is essentially "infinite" because there are only 256 (16 
byte) descriptors.  so sidt is probably the best choice of the three.
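
to make the idea concrete, here is a userspace sketch of tucking a
cpu/node pair into the spare range of the 16-bit limit.  the 0x1000
base matches the "idt.size - 0x1000" in the patch below; the 8-bit cpu
field and the function names are assumptions of this sketch only and
not necessarily the layout the patch uses:

```c
#include <stdint.h>

#define IDT_LIMIT_BASE	0x1000u	/* from "p = idt.size - 0x1000" */
#define CPU_BITS	8u	/* hypothetical field width */
#define CPU_MASK	((1u << CPU_BITS) - 1u)
#define NODE_MAX	((0xffffu - IDT_LIMIT_BASE) >> CPU_BITS)

/* Returns the 16-bit limit to load, or 0 if (cpu, node) does not fit
 * into the range above IDT_LIMIT_BASE. */
static uint16_t pack_idt_limit(unsigned cpu, unsigned node)
{
	if (cpu > CPU_MASK || node > NODE_MAX)
		return 0;
	return (uint16_t)(IDT_LIMIT_BASE + ((node << CPU_BITS) | cpu));
}

/* Inverse of pack_idt_limit(): what a reader of the limit would do
 * with the value sidt hands back. */
static void unpack_idt_limit(uint16_t limit, unsigned *cpu, unsigned *node)
{
	unsigned p = limit - IDT_LIMIT_BASE;

	*cpu = p & CPU_MASK;
	*node = p >> CPU_BITS;
}
```

every packed value is at least 0x1000, so it stays above the 0xfff
limit that the 256 16-byte descriptors need.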

in benchmarking i've discovered the rdtscp implementation of vgetcpu is 
slower than even the lsl-based implementation on opteron revF.  so i've 
dropped the rdtscp implementation in this patch.  however i've left the 
rdtscp_aux register initialized because i'm sure it's the right choice for 
various proposed vgettimeofday / per-cpu tsc state improvements which need 
the atomic nature of the rdtscp instruction and i hope it'll be used in 
those situations.

at compile time this patch detects if 0x1000 + 
(CONFIG_NR_CPUS

-dean

Signed-off-by: dean gaudet <[EMAIL PROTECTED]>

Index: linux/arch/x86_64/kernel/time.c
===
--- linux.orig/arch/x86_64/kernel/time.c 2007-01-13 22:20:46.0 -0800
+++ linux/arch/x86_64/kernel/time.c 2007-01-13 22:21:01.0 -0800
@@ -957,11 +957,6 @@
if (unsynchronized_tsc())
notsc = 1;
 
-   if (cpu_has(_cpu_data, X86_FEATURE_RDTSCP))
-   vgetcpu_mode = VGETCPU_RDTSCP;
-   else
-   vgetcpu_mode = VGETCPU_LSL;
-
if (vxtime.hpet_address && notsc) {
timetype = hpet_use_timer ? "HPET" : "PIT/HPET";
if (hpet_use_timer)
Index: linux/arch/x86_64/kernel/vsyscall.c
===
--- linux.orig/arch/x86_64/kernel/vsyscall.c 2007-01-13 22:20:46.0 -0800
+++ linux/arch/x86_64/kernel/vsyscall.c 2007-01-13 22:21:01.0 -0800
@@ -46,7 +46,11 @@
 
 int __sysctl_vsyscall __section_sysctl_vsyscall = 1;
 seqlock_t __xtime_lock __section_xtime_lock = SEQLOCK_UNLOCKED;
-int __vgetcpu_mode __section_vgetcpu_mode;
+
+/* is this necessary? */
+#ifndef CONFIG_NODES_SHIFT
+#define CONFIG_NODES_SHIFT 0
+#endif
 
 #include 
 
@@ -147,11 +151,11 @@
 long __vsyscall(2)
 vgetcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache)
 {
-   unsigned int dummy, p;
+   unsigned int p;
unsigned long j = 0;
 
/* Fast cache - only recompute value once per jiffies and avoid
-  relatively costly rdtscp/cpuid otherwise.
+  relatively costly lsl/sidt otherwise.
   This works because the scheduler usually keeps the process
   on the same CPU and this syscall doesn't guarantee its
   results anyways.
@@ -160,21 +164,30 @@
   If you don't like it pass NULL. */
if (tcache && tcache->blob[0] == (j = __jiffies)) {
p = tcache->blob[1];
-   } else if (__vgetcpu_mode == VGETCPU_RDTSCP) {
-   /* Load per CPU data from RDTSCP */
-   rdtscp(dummy, dummy, p);
-   } else {
+   }
+   else {
+#ifdef VGETCPU_USE_SIDT
+struct {
+char pad[6];   /* avoid unaligned stores */
+u16 size;
+u64 address;
+} idt;
+
+asm("sidt %0" : "=m" (idt.size));
+p = idt.size - 0x1000;
+#else
/* Load per CPU data from GDT */
asm("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
-   }
-   if (tcache) {
-   tcache->blob[0] = j;
-   tcache->blob[1] = p;
+#endif
+   if (tcache) {
+   tcache->blob[0] = j;
+   tcache->blob[1] = p;
+   

Re: No more "device" symlinks for classes

2007-01-13 Thread Andrey Borzenkov
Pierre Ossman wrote:

> Hi guys,
> 
> I just wanted to know the rationale behind
> 99ef3ef8d5f2f5b5312627127ad63df27c0d0d05 (no more "device" symlink in
> class devices). I thought that was a rather convenient way of finding
> which physical device the class device was coupled to.
> 

Actually I wonder why those links are still present even though I told
the system not to create them?

{pts/1}% grep DEPRE /boot/config
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_PM_SYSFS_DEPRECATED is not set
{pts/1}% find /sys/class -name device
/sys/class/pcmcia_socket/pcmcia_socket2/device
/sys/class/pcmcia_socket/pcmcia_socket1/device
/sys/class/pcmcia_socket/pcmcia_socket0/device
/sys/class/usb_device/usbdev1.1/device
/sys/class/usb_host/usb_host1/device
/sys/class/scsi_disk/0:0:0:0/device
/sys/class/scsi_device/1:0:0:0/device
/sys/class/scsi_device/0:0:0:0/device
/sys/class/scsi_host/host1/device
/sys/class/scsi_host/host0/device
/sys/class/net/eth0/device
/sys/class/net/eth1/device
/sys/class/input/input1/ts0/device
/sys/class/input/input1/mouse0/device
/sys/class/input/input1/event1/device
/sys/class/input/input1/device
/sys/class/input/input0/event0/device
/sys/class/input/input0/device
{pts/1}% uname -a
Linux cooker 2.6.20-rc5-1avb #10 Sat Jan 13 14:05:34 MSK 2007 i686 Pentium
III (Coppermine) GNU/Linux

-andrey



[PATCH 10/11] x86_64 ia32 vDSO: use install_special_mapping

2007-01-13 Thread Roland McGrath

This patch uses install_special_mapping for the ia32 vDSO setup,
consolidating duplicated code.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86_64/ia32/syscall32.c |   75 --
 include/asm-x86_64/proto.h   |1 -
 2 files changed, 21 insertions(+), 55 deletions(-)

diff --git a/arch/x86_64/ia32/syscall32.c b/arch/x86_64/ia32/syscall32.c
index 59f1fa1..3939f10 100644  
--- a/arch/x86_64/ia32/syscall32.c
+++ b/arch/x86_64/ia32/syscall32.c
@@ -18,68 +18,34 @@ extern unsigned char syscall32_syscall[]
 extern unsigned char syscall32_sysenter[], syscall32_sysenter_end[];
 extern int sysctl_vsyscall32;
 
-char *syscall32_page; 
+static struct page *syscall32_pages[1];
 static int use_sysenter = -1;
 
-static struct page *
-syscall32_nopage(struct vm_area_struct *vma, unsigned long adr, int *type)
-{
-   struct page *p = virt_to_page(adr - vma->vm_start + syscall32_page);
-   get_page(p);
-   return p;
-}
-
-/* Prevent VMA merging */
-static void syscall32_vma_close(struct vm_area_struct *vma)
-{
-}
-
-static struct vm_operations_struct syscall32_vm_ops = {
-   .close = syscall32_vma_close,
-   .nopage = syscall32_nopage,
-};
-
 struct linux_binprm;
 
 /* Setup a VMA at program startup for the vsyscall page */
 int syscall32_setup_pages(struct linux_binprm *bprm, int exstack)
 {
-   int npages = (VSYSCALL32_END - VSYSCALL32_BASE) >> PAGE_SHIFT;
-   struct vm_area_struct *vma;
struct mm_struct *mm = current->mm;
int ret;
 
-   vma = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
-   if (!vma)
-   return -ENOMEM;
-
-   memset(vma, 0, sizeof(struct vm_area_struct));
-   /* Could randomize here */
-   vma->vm_start = VSYSCALL32_BASE;
-   vma->vm_end = VSYSCALL32_END;
-   /* MAYWRITE to allow gdb to COW and set breakpoints */
-   vma->vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYEXEC|VM_MAYWRITE;
+   down_write(>mmap_sem);
/*
+* MAYWRITE to allow gdb to COW and set breakpoints
+*
 * Make sure the vDSO gets into every core dump.
 * Dumping its contents makes post-mortem fully interpretable later
 * without matching up the same kernel and hardware config to see
 * what PC values meant.
 */
-   vma->vm_flags |= VM_ALWAYSDUMP;
-   vma->vm_flags |= mm->def_flags;
-   vma->vm_page_prot = protection_map[vma->vm_flags & 7];
-   vma->vm_ops = _vm_ops;
-   vma->vm_mm = mm;
-
-   down_write(>mmap_sem);
-   if ((ret = insert_vm_struct(mm, vma))) {
-   up_write(>mmap_sem);
-   kmem_cache_free(vm_area_cachep, vma);
-   return ret;
-   }
-   mm->total_vm += npages;
+   /* Could randomize here */
+   ret = install_special_mapping(mm, VSYSCALL32_BASE, PAGE_SIZE,
+ VM_READ|VM_EXEC|
+ VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC|
+ VM_ALWAYSDUMP,
+ syscall32_pages);
up_write(>mmap_sem);
-   return 0;
+   return ret;
 }
 
 const char *arch_vma_name(struct vm_area_struct *vma)
@@ -92,9 +58,10 @@ const char *arch_vma_name(struct vm_area
 
 static int __init init_syscall32(void)
 { 
-   syscall32_page = (void *)get_zeroed_page(GFP_KERNEL); 
+   char *syscall32_page = (void *)get_zeroed_page(GFP_KERNEL);
if (!syscall32_page) 
panic("Cannot allocate syscall32 page"); 
+   syscall32_pages[0] = virt_to_page(syscall32_page);
if (use_sysenter > 0) {
memcpy(syscall32_page, syscall32_sysenter,
   syscall32_sysenter_end - syscall32_sysenter);
diff --git a/include/asm-x86_64/proto.h b/include/asm-x86_64/proto.h
index 6d324b8..a6d2ff5 100644  
--- a/include/asm-x86_64/proto.h
+++ b/include/asm-x86_64/proto.h
@@ -81,7 +81,6 @@ extern void swap_low_mappings(void);
 extern void __show_regs(struct pt_regs * regs);
 extern void show_regs(struct pt_regs * regs);
 
-extern char *syscall32_page;
 extern void syscall32_cpu_init(void);
 
 extern void setup_node_bootmem(int nodeid, unsigned long start, unsigned long end);


[PATCH 11/11] powerpc vDSO: use install_special_mapping

2007-01-13 Thread Roland McGrath

This patch uses install_special_mapping for the powerpc vDSO setup,
consolidating duplicated code.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/powerpc/kernel/vdso.c |  104 +++
 1 files changed, 27 insertions(+), 77 deletions(-)

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index ae0ede1..50149ec 100644  
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -49,9 +49,13 @@
 /* Max supported size for symbol names */
 #define MAX_SYMNAME64
 
+#define VDSO32_MAXPAGES(((0x3000 + PAGE_MASK) >> PAGE_SHIFT) + 2)
+#define VDSO64_MAXPAGES(((0x3000 + PAGE_MASK) >> PAGE_SHIFT) + 2)
+
 extern char vdso32_start, vdso32_end;
 static void *vdso32_kbase = _start;
 unsigned int vdso32_pages;
+static struct page *vdso32_pagelist[VDSO32_MAXPAGES];
 unsigned long vdso32_sigtramp;
 unsigned long vdso32_rt_sigtramp;
 
@@ -59,6 +63,7 @@ unsigned long vdso32_rt_sigtramp;
 extern char vdso64_start, vdso64_end;
 static void *vdso64_kbase = _start;
 unsigned int vdso64_pages;
+static struct page *vdso64_pagelist[VDSO64_MAXPAGES];
 unsigned long vdso64_rt_sigtramp;
 #endif /* CONFIG_PPC64 */
 
@@ -165,55 +170,6 @@ static void dump_vdso_pages(struct vm_ar
 #endif /* DEBUG */
 
 /*
- * Keep a dummy vma_close for now, it will prevent VMA merging.
- */
-static void vdso_vma_close(struct vm_area_struct * vma)
-{
-}
-
-/*
- * Our nopage() function, maps in the actual vDSO kernel pages, they will
- * be mapped read-only by do_no_page(), and eventually COW'ed, either
- * right away for an initial write access, or by do_wp_page().
- */
-static struct page * vdso_vma_nopage(struct vm_area_struct * vma,
-unsigned long address, int *type)
-{
-   unsigned long offset = address - vma->vm_start;
-   struct page *pg;
-#ifdef CONFIG_PPC64
-   void *vbase = (vma->vm_mm->task_size > TASK_SIZE_USER32) ?
-   vdso64_kbase : vdso32_kbase;
-#else
-   void *vbase = vdso32_kbase;
-#endif
-
-   DBG("vdso_vma_nopage(current: %s, address: %016lx, off: %lx)\n",
-   current->comm, address, offset);
-
-   if (address < vma->vm_start || address > vma->vm_end)
-   return NOPAGE_SIGBUS;
-
-   /*
-* Last page is systemcfg.
-*/
-   if ((vma->vm_end - address) <= PAGE_SIZE)
-   pg = virt_to_page(vdso_data);
-   else
-   pg = virt_to_page(vbase + offset);
-
-   get_page(pg);
-   DBG(" ->page count: %d\n", page_count(pg));
-
-   return pg;
-}
-
-static struct vm_operations_struct vdso_vmops = {
-   .close  = vdso_vma_close,
-   .nopage = vdso_vma_nopage,
-};
-
-/*
  * This is called from binfmt_elf, we create the special vma for the
  * vDSO and insert it into the mm struct tree
  */
@@ -221,20 +177,23 @@ int arch_setup_additional_pages(struct l
int executable_stack)
 {
struct mm_struct *mm = current->mm;
-   struct vm_area_struct *vma;
+   struct page **vdso_pagelist;
unsigned long vdso_pages;
unsigned long vdso_base;
int rc;
 
 #ifdef CONFIG_PPC64
if (test_thread_flag(TIF_32BIT)) {
+   vdso_pagelist = vdso32_pagelist;
vdso_pages = vdso32_pages;
vdso_base = VDSO32_MBASE;
} else {
+   vdso_pagelist = vdso64_pagelist;
vdso_pages = vdso64_pages;
vdso_base = VDSO64_MBASE;
}
 #else
+   vdso_pagelist = vdso32_pagelist;
vdso_pages = vdso32_pages;
vdso_base = VDSO32_MBASE;
 #endif
@@ -262,17 +221,6 @@ int arch_setup_additional_pages(struct l
goto fail_mmapsem;
}
 
-
-   /* Allocate a VMA structure and fill it up */
-   vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
-   if (vma == NULL) {
-   rc = -ENOMEM;
-   goto fail_mmapsem;
-   }
-   vma->vm_mm = mm;
-   vma->vm_start = vdso_base;
-   vma->vm_end = vma->vm_start + (vdso_pages << PAGE_SHIFT);
-
/*
 * our vma flags don't have VM_WRITE so by default, the process isn't
 * allowed to write those pages.
@@ -282,32 +230,26 @@ int arch_setup_additional_pages(struct l
 * and your nice userland gettimeofday will be totally dead.
 * It's fine to use that for setting breakpoints in the vDSO code
 * pages though
-*/
-   vma->vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC;
-   /*
+*
 * Make sure the vDSO gets into every core dump.
 * Dumping its contents makes post-mortem fully interpretable later
 * without matching up the same kernel and hardware config to see
 * what PC values meant.
 */
-   vma->vm_flags |= VM_ALWAYSDUMP;
-   vma->vm_flags |= mm->def_flags;
-   vma->vm_page_prot = protection_map[vma->vm_flags & 0x7];
-   vma->vm_ops = 

[PATCH 9/11] i386 vDSO: use install_special_mapping

2007-01-13 Thread Roland McGrath

This patch uses install_special_mapping for the i386 vDSO setup,
consolidating duplicated code.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/i386/kernel/sysenter.c |   53 +--
 1 files changed, 11 insertions(+), 42 deletions(-)

diff --git a/arch/i386/kernel/sysenter.c b/arch/i386/kernel/sysenter.c
index 5da7442..bc882a2 100644  
--- a/arch/i386/kernel/sysenter.c
+++ b/arch/i386/kernel/sysenter.c
@@ -70,11 +70,12 @@ void enable_sep_cpu(void)
  */
 extern const char vsyscall_int80_start, vsyscall_int80_end;
 extern const char vsyscall_sysenter_start, vsyscall_sysenter_end;
-static void *syscall_page;
+static struct page *syscall_pages[1];
 
 int __init sysenter_setup(void)
 {
-   syscall_page = (void *)get_zeroed_page(GFP_ATOMIC);
+   void *syscall_page = (void *)get_zeroed_page(GFP_ATOMIC);
+   syscall_pages[0] = virt_to_page(syscall_page);
 
 #ifdef CONFIG_COMPAT_VDSO
__set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_READONLY);
@@ -96,31 +97,12 @@ int __init sysenter_setup(void)
 }
 
 #ifndef CONFIG_COMPAT_VDSO
-static struct page *syscall_nopage(struct vm_area_struct *vma,
-   unsigned long adr, int *type)
-{
-   struct page *p = virt_to_page(adr - vma->vm_start + syscall_page);
-   get_page(p);
-   return p;
-}
-
-/* Prevent VMA merging */
-static void syscall_vma_close(struct vm_area_struct *vma)
-{
-}
-
-static struct vm_operations_struct syscall_vm_ops = {
-   .close = syscall_vma_close,
-   .nopage = syscall_nopage,
-};
-
 /* Defined in vsyscall-sysenter.S */
 extern void SYSENTER_RETURN;
 
 /* Setup a VMA at program startup for the vsyscall page */
 int arch_setup_additional_pages(struct linux_binprm *bprm, int exstack)
 {
-   struct vm_area_struct *vma;
struct mm_struct *mm = current->mm;
unsigned long addr;
int ret;
@@ -132,38 +114,25 @@ int arch_setup_additional_pages(struct l
goto up_fail;
}
 
-   vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
-   if (!vma) {
-   ret = -ENOMEM;
-   goto up_fail;
-   }
-
-   vma->vm_start = addr;
-   vma->vm_end = addr + PAGE_SIZE;
-   /* MAYWRITE to allow gdb to COW and set breakpoints */
-   vma->vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYEXEC|VM_MAYWRITE;
/*
+* MAYWRITE to allow gdb to COW and set breakpoints
+*
 * Make sure the vDSO gets into every core dump.
 * Dumping its contents makes post-mortem fully interpretable later
 * without matching up the same kernel and hardware config to see
 * what PC values meant.
 */
-   vma->vm_flags |= VM_ALWAYSDUMP;
-   vma->vm_flags |= mm->def_flags;
-   vma->vm_page_prot = protection_map[vma->vm_flags & 7];
-   vma->vm_ops = &syscall_vm_ops;
-   vma->vm_mm = mm;
-
-   ret = insert_vm_struct(mm, vma);
-   if (unlikely(ret)) {
-   kmem_cache_free(vm_area_cachep, vma);
+   ret = install_special_mapping(mm, addr, PAGE_SIZE,
+ VM_READ|VM_EXEC|
+ VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC|
+ VM_ALWAYSDUMP,
+ syscall_pages);
+   if (ret)
goto up_fail;
-   }
 
current->mm->context.vdso = (void *)addr;
current_thread_info()->sysenter_return =
(void *)VDSO_SYM(&SYSENTER_RETURN);
-   mm->total_vm++;
 up_fail:
up_write(&mm->mmap_sem);
return ret;


[PATCH 8/11] Add install_special_mapping

2007-01-13 Thread Roland McGrath

This patch adds a utility function install_special_mapping, for creating a
special vma using a fixed set of preallocated pages as backing, such as for
a vDSO.  This consolidates some nearly identical code used for vDSO mapping
reimplemented for different architectures.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 include/linux/mm.h |3 ++
 mm/mmap.c  |   72 
 2 files changed, 75 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2d2c08d..bb793a4 100644  
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1030,6 +1030,9 @@ extern struct vm_area_struct *copy_vma(s
unsigned long addr, unsigned long len, pgoff_t pgoff);
 extern void exit_mmap(struct mm_struct *);
 extern int may_expand_vm(struct mm_struct *mm, unsigned long npages);
+extern int install_special_mapping(struct mm_struct *mm,
+  unsigned long addr, unsigned long len,
+  unsigned long flags, struct page **pages);
 
 extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 9717337..b540fb2 100644  
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2094,3 +2094,75 @@ int may_expand_vm(struct mm_struct *mm, 
return 0;
return 1;
 }
+
+
+static struct page *special_mapping_nopage(struct vm_area_struct *vma,
+  unsigned long address, int *type)
+{
+   struct page **pages;
+
+   BUG_ON(address < vma->vm_start || address >= vma->vm_end);
+
+   address -= vma->vm_start;
+   for (pages = vma->vm_private_data; address > 0 && *pages; ++pages)
+   address -= PAGE_SIZE;
+
+   if (*pages) {
+   struct page *page = *pages;
+   get_page(page);
+   return page;
+   }
+
+   return NOPAGE_SIGBUS;
+}
+
+/*
+ * Having a close hook prevents vma merging regardless of flags.
+ */
+static void special_mapping_close(struct vm_area_struct *vma)
+{
+}
+
+static struct vm_operations_struct special_mapping_vmops = {
+   .close = special_mapping_close,
+   .nopage = special_mapping_nopage,
+};
+
+/*
+ * Called with mm->mmap_sem held for writing.
+ * Insert a new vma covering the given region, with the given flags.
+ * Its pages are supplied by the given array of struct page *.
+ * The array can be shorter than len >> PAGE_SHIFT if it's null-terminated.
+ * The region past the last page supplied will always produce SIGBUS.
+ * The array pointer and the pages it points to are assumed to stay alive
+ * for as long as this mapping might exist.
+ */
+int install_special_mapping(struct mm_struct *mm,
+   unsigned long addr, unsigned long len,
+   unsigned long vm_flags, struct page **pages)
+{
+   struct vm_area_struct *vma;
+
+   vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
+   if (unlikely(vma == NULL))
+   return -ENOMEM;
+
+   vma->vm_mm = mm;
+   vma->vm_start = addr;
+   vma->vm_end = addr + len;
+
+   vma->vm_flags = vm_flags | mm->def_flags;
+   vma->vm_page_prot = protection_map[vma->vm_flags & 7];
+
+   vma->vm_ops = &special_mapping_vmops;
+   vma->vm_private_data = pages;
+
+   if (unlikely(insert_vm_struct(mm, vma))) {
+   kmem_cache_free(vm_area_cachep, vma);
+   return -ENOMEM;
+   }
+
+   mm->total_vm += len >> PAGE_SHIFT;
+
+   return 0;
+}
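
For anyone reading along, the page lookup in special_mapping_nopage can be
modelled in userspace roughly as below. This is only a sketch: struct page
here is a hypothetical stand-in, PAGE_SIZE is assumed to be 4096, and
special_lookup is a made-up name. NULL takes the place of NOPAGE_SIGBUS, and
an out-of-range address returns NULL where the kernel code would BUG().

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL        /* assumed page size for this model */

/* Hypothetical stand-in for the kernel's struct page. */
struct page { int id; };

/*
 * Model of the lookup in special_mapping_nopage(): walk a NULL-terminated
 * array of page pointers, one entry per PAGE_SIZE of offset into the vma.
 * Returns NULL where the kernel code would return NOPAGE_SIGBUS.
 */
static struct page *special_lookup(struct page **pages,
                                   unsigned long vm_start,
                                   unsigned long vm_end,
                                   unsigned long address)
{
    assert(pages != NULL);
    if (address < vm_start || address >= vm_end)
        return NULL;            /* the kernel version BUG()s here */

    /* Round the offset down to a page boundary, then count pages. */
    address = (address - vm_start) & ~(PAGE_SIZE - 1);
    for (; address > 0 && *pages; ++pages)
        address -= PAGE_SIZE;

    return *pages;              /* NULL once past the last supplied page */
}
```

The point of the NULL terminator is that the array may be shorter than the
vma: any fault past the last supplied page falls through to the SIGBUS case
without reading beyond the array.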


[PATCH 6/11] powerpc vDSO: use VM_ALWAYSDUMP

2007-01-13 Thread Roland McGrath

This patch fixes core dumps to include the vDSO vma, which is left out now.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/powerpc/kernel/vdso.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index a4b28c7..ae0ede1 100644  
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -284,6 +284,13 @@ int arch_setup_additional_pages(struct l
 * pages though
 */
vma->vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC;
+   /*
+* Make sure the vDSO gets into every core dump.
+* Dumping its contents makes post-mortem fully interpretable later
+* without matching up the same kernel and hardware config to see
+* what PC values meant.
+*/
+   vma->vm_flags |= VM_ALWAYSDUMP;
vma->vm_flags |= mm->def_flags;
vma->vm_page_prot = protection_map[vma->vm_flags & 0x7];
vma->vm_ops = _vmops;


[PATCH 7/11] x86_64 ia32 vDSO: define arch_vma_name

2007-01-13 Thread Roland McGrath

This patch makes x86_64 define arch_vma_name for CONFIG_IA32_EMULATION.
This makes the ia32 vDSO mapping appear in /proc/PID/maps with "[vdso]"
for ia32 processes, as it does on native i386.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86_64/ia32/syscall32.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86_64/ia32/syscall32.c b/arch/x86_64/ia32/syscall32.c
index 3ac9355..59f1fa1 100644  
--- a/arch/x86_64/ia32/syscall32.c
+++ b/arch/x86_64/ia32/syscall32.c
@@ -82,6 +82,14 @@ int syscall32_setup_pages(struct linux_b
return 0;
 }
 
+const char *arch_vma_name(struct vm_area_struct *vma)
+{
+   if (vma->vm_start == VSYSCALL32_BASE &&
+   vma->vm_mm && vma->vm_mm->task_size == IA32_PAGE_OFFSET)
+   return "[vdso]";
+   return NULL;
+}
+
 static int __init init_syscall32(void)
 { 
syscall32_page = (void *)get_zeroed_page(GFP_KERNEL); 
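
A quick way to see the effect from userspace is to look for the "[vdso]"
name in /proc/PID/maps. The helper below is a sketch only: the name
maps_line_is_vdso is made up for illustration, and it assumes the mapping
name is the last space-separated field on the line.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if a /proc/PID/maps line carries the "[vdso]" name.
 * Assumes the name, when present, is the last space-separated field.
 */
static int maps_line_is_vdso(const char *line)
{
    assert(line != NULL);
    const char *name = strrchr(line, ' ');

    name = name ? name + 1 : line;
    return strncmp(name, "[vdso]", 6) == 0;
}
```

Feeding it each line read from /proc/self/maps in a 32-bit process on an
x86_64 kernel should find one match once arch_vma_name is defined.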


[PATCH 5/11] x86_64 ia32 vDSO: use VM_ALWAYSDUMP

2007-01-13 Thread Roland McGrath

This patch fixes ia32 core dumps on x86_64 to include just one phdr for the
vDSO vma.  Currently it writes a confused format with two phdrs for the
address, one without contents and one with.  This patch removes the
special-case core writing macros for the ia32 vDSO.  Instead, it uses
VM_ALWAYSDUMP in the vma.  This changes core dumps so they no longer
include the non-PT_LOAD phdrs from the vDSO, consistent with fixed native
i386 core dumps.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86_64/ia32/ia32_binfmt.c |   49 
 arch/x86_64/ia32/syscall32.c   |7 +
 2 files changed, 7 insertions(+), 49 deletions(-)

diff --git a/arch/x86_64/ia32/ia32_binfmt.c b/arch/x86_64/ia32/ia32_binfmt.c
index 543ef4f..5ce0bd4 100644  
--- a/arch/x86_64/ia32/ia32_binfmt.c
+++ b/arch/x86_64/ia32/ia32_binfmt.c
@@ -64,55 +64,6 @@ typedef unsigned int elf_greg_t;
 #define ELF_NGREG (sizeof (struct user_regs_struct32) / sizeof(elf_greg_t))
 typedef elf_greg_t elf_gregset_t[ELF_NGREG];
 
-/*
- * These macros parameterize elf_core_dump in fs/binfmt_elf.c to write out
- * extra segments containing the vsyscall DSO contents.  Dumping its
- * contents makes post-mortem fully interpretable later without matching up
- * the same kernel and hardware config to see what PC values meant.
- * Dumping its extra ELF program headers includes all the other information
- * a debugger needs to easily find how the vsyscall DSO was being used.
- */
-#define ELF_CORE_EXTRA_PHDRS   (find_vma(current->mm, VSYSCALL32_BASE) ? \
-(VSYSCALL32_EHDR->e_phnum) : 0)
-#define ELF_CORE_WRITE_EXTRA_PHDRS   \
-do { \
-   if (find_vma(current->mm, VSYSCALL32_BASE)) { \
-   const struct elf32_phdr *const vsyscall_phdrs =   \
-   (const struct elf32_phdr *) (VSYSCALL32_BASE  \
-  + VSYSCALL32_EHDR->e_phoff);\
-   int i;\
-   Elf32_Off ofs = 0;\
-   for (i = 0; i < VSYSCALL32_EHDR->e_phnum; ++i) {  \
-   struct elf32_phdr phdr = vsyscall_phdrs[i];   \
-   if (phdr.p_type == PT_LOAD) { \
-   BUG_ON(ofs != 0); \
-   ofs = phdr.p_offset = offset; \
-   phdr.p_memsz = PAGE_ALIGN(phdr.p_memsz);  \
-   phdr.p_filesz = phdr.p_memsz; \
-   offset += phdr.p_filesz;  \
-   } \
-   else  \
-   phdr.p_offset += ofs; \
-   phdr.p_paddr = 0; /* match other core phdrs */\
-   DUMP_WRITE(&phdr, sizeof(phdr));  \
-   } \
-   } \
-} while (0)
-#define ELF_CORE_WRITE_EXTRA_DATA\
-do { \
-   if (find_vma(current->mm, VSYSCALL32_BASE)) { \
-   const struct elf32_phdr *const vsyscall_phdrs =   \
-   (const struct elf32_phdr *) (VSYSCALL32_BASE  \
-  + VSYSCALL32_EHDR->e_phoff);\
-   int i;\
-   for (i = 0; i < VSYSCALL32_EHDR->e_phnum; ++i) {  \
-   if (vsyscall_phdrs[i].p_type == PT_LOAD)  \
-   DUMP_WRITE((void *) (u64) vsyscall_phdrs[i].p_vaddr,\
-   PAGE_ALIGN(vsyscall_phdrs[i].p_memsz));   \
-   } \
-   } \
-} while (0)
-
 struct elf_siginfo
 {
int si_signo;   /* signal number */
diff --git a/arch/x86_64/ia32/syscall32.c b/arch/x86_64/ia32/syscall32.c
index 3e5ed20..3ac9355 100644  
--- a/arch/x86_64/ia32/syscall32.c
+++ b/arch/x86_64/ia32/syscall32.c
@@ -59,6 +59,13 @@ int syscall32_setup_pages(struct linux_b
vma->vm_end = VSYSCALL32_END;
/* MAYWRITE to allow gdb to COW and set breakpoints */
vma->vm_flags = 

[PATCH 4/11] i386 vDSO: use VM_ALWAYSDUMP

2007-01-13 Thread Roland McGrath

This patch fixes core dumps to include the vDSO vma, which is left out now.
It removes the special-case core writing macros, which were not doing the
right thing for the vDSO vma anyway.  Instead, it uses VM_ALWAYSDUMP in the
vma; there is no need for the fixmap page to be installed.  It handles the
CONFIG_COMPAT_VDSO case by making elf_core_dump use the fake vma from
get_gate_vma after real vmas in the same way the /proc/PID/maps code does.

This changes core dumps so they no longer include the non-PT_LOAD phdrs
from the vDSO.  I made the change to add them in the first place, but it
turned out that nothing ever wanted them there since the advent of NT_AUXV.
It's cleaner to leave them out, and just let the phdrs inside the vDSO
image speak for themselves.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/i386/kernel/sysenter.c |   12 ++
 fs/binfmt_elf.c |   12 --
 include/asm-i386/elf.h  |   44 ---
 mm/memory.c |7 ++
 4 files changed, 23 insertions(+), 52 deletions(-)

diff --git a/arch/i386/kernel/sysenter.c b/arch/i386/kernel/sysenter.c
index 454d12d..5da7442 100644  
--- a/arch/i386/kernel/sysenter.c
+++ b/arch/i386/kernel/sysenter.c
@@ -79,11 +79,6 @@ int __init sysenter_setup(void)
 #ifdef CONFIG_COMPAT_VDSO
__set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_READONLY);
printk("Compat vDSO mapped to %08lx.\n", __fix_to_virt(FIX_VDSO));
-#else
-   /*
-* In the non-compat case the ELF coredumping code needs the fixmap:
-*/
-   __set_fixmap(FIX_VDSO, __pa(syscall_page), PAGE_KERNEL_RO);
 #endif
 
if (!boot_cpu_has(X86_FEATURE_SEP)) {
@@ -147,6 +142,13 @@ int arch_setup_additional_pages(struct l
vma->vm_end = addr + PAGE_SIZE;
/* MAYWRITE to allow gdb to COW and set breakpoints */
vma->vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYEXEC|VM_MAYWRITE;
+   /*
+* Make sure the vDSO gets into every core dump.
+* Dumping its contents makes post-mortem fully interpretable later
+* without matching up the same kernel and hardware config to see
+* what PC values meant.
+*/
+   vma->vm_flags |= VM_ALWAYSDUMP;
vma->vm_flags |= mm->def_flags;
vma->vm_page_prot = protection_map[vma->vm_flags & 7];
vma->vm_ops = &syscall_vm_ops;
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 6fec8bf..4ee7cf5 100644  
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1443,7 +1443,7 @@ static int elf_core_dump(long signr, str
int segs;
size_t size = 0;
int i;
-   struct vm_area_struct *vma;
+   struct vm_area_struct *vma, *gate_vma;
struct elfhdr *elf = NULL;
loff_t offset = 0, dataoff, foffset;
unsigned long limit = current->signal->rlim[RLIMIT_CORE].rlim_cur;
@@ -1529,6 +1529,10 @@ static int elf_core_dump(long signr, str
segs += ELF_CORE_EXTRA_PHDRS;
 #endif
 
+   gate_vma = get_gate_vma(current);
+   if (gate_vma != NULL)
+   segs++;
+
/* Set up header */
fill_elf_header(elf, segs + 1); /* including notes section */
 
@@ -1596,7 +1600,8 @@ static int elf_core_dump(long signr, str
dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);
 
/* Write program headers for segments dump */
-   for (vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) {
+   for (vma = current->mm->mmap; vma != NULL;
+vma = vma->vm_next ?: vma == gate_vma ? NULL : gate_vma) {
struct elf_phdr phdr;
size_t sz;
 
@@ -1645,7 +1650,8 @@ static int elf_core_dump(long signr, str
/* Align to page */
DUMP_SEEK(dataoff - foffset);
 
-   for (vma = current->mm->mmap; vma != NULL; vma = vma->vm_next) {
+   for (vma = current->mm->mmap; vma != NULL;
+vma = vma->vm_next ?: vma == gate_vma ? NULL : gate_vma) {
unsigned long addr;
 
if (!maydump(vma))
diff --git a/include/asm-i386/elf.h b/include/asm-i386/elf.h
index 0515d61..369035d 100644  
--- a/include/asm-i386/elf.h
+++ b/include/asm-i386/elf.h
@@ -168,50 +168,6 @@ do if (vdso_enabled) { \
NEW_AUX_ENT(AT_SYSINFO_EHDR, VDSO_COMPAT_BASE); \
 } while (0)
 
-/*
- * These macros parameterize elf_core_dump in fs/binfmt_elf.c to write out
- * extra segments containing the vsyscall DSO contents.  Dumping its
- * contents makes post-mortem fully interpretable later without matching up
- * the same kernel and hardware config to see what PC values meant.
- * Dumping its extra ELF program headers includes all the other information
- * a debugger needs to easily find how the vsyscall DSO was being used.
- */
-#define ELF_CORE_EXTRA_PHDRS   (VDSO_HIGH_EHDR->e_phnum)
-#define ELF_CORE_WRITE_EXTRA_PHDRS   \
-do { 
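
The loop change in the elf_core_dump hunks above walks the real vmas and
then visits the gate vma once. Spelled out without the GNU "?:" shorthand,
the stepping rule looks roughly like this. It is a userspace sketch: struct
vma is a hypothetical stand-in for vm_area_struct, the helper names are
invented, and it assumes the gate vma is not itself linked into the list.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for vm_area_struct; only the list link matters here. */
struct vma { struct vma *vm_next; int id; };

/*
 * Step to the next vma to dump: follow the list, and after the last real
 * vma visit gate_vma once (if there is one), then stop.
 */
static struct vma *next_dump_vma(struct vma *vma, struct vma *gate_vma)
{
    assert(vma != NULL);
    if (vma->vm_next)
        return vma->vm_next;
    return vma == gate_vma ? NULL : gate_vma;
}

/* Count how many vmas the dump loop would visit. */
static int count_dump_vmas(struct vma *first, struct vma *gate_vma)
{
    int n = 0;

    for (struct vma *v = first; v != NULL; v = next_dump_vma(v, gate_vma))
        n++;
    return n;
}
```

With a NULL gate_vma the rule degenerates to the plain list walk, which
matches the case where get_gate_vma returns nothing.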

[PATCH 2/11] Fix gate_vma.vm_flags

2007-01-13 Thread Roland McGrath

This patch fixes the initialization of gate_vma.vm_flags and
gate_vma.vm_page_prot to reflect reality.  This makes the "[vdso]" line in
/proc/PID/maps correctly show r-xp instead of ---p, when gate_vma is used
(CONFIG_COMPAT_VDSO on i386).

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 mm/memory.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index af227d2..5beb4b8 100644  
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2606,8 +2606,8 @@ static int __init gate_vma_init(void)
gate_vma.vm_mm = NULL;
gate_vma.vm_start = FIXADDR_USER_START;
gate_vma.vm_end = FIXADDR_USER_END;
-   gate_vma.vm_page_prot = PAGE_READONLY;
-   gate_vma.vm_flags = 0;
+   gate_vma.vm_flags = VM_READ | VM_MAYREAD | VM_EXEC | VM_MAYEXEC;
+   gate_vma.vm_page_prot = __P101;
return 0;
 }
 __initcall(gate_vma_init);
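
To see why vm_flags = 0 shows up as "---p", here is a rough userspace model
of how the permission column of /proc/PID/maps is derived from vm_flags.
The flag values are the usual ones from include/linux/mm.h as far as I
know; maps_perms is a made-up helper name, so treat this as a sketch rather
than the actual fs/proc code.

```c
#include <assert.h>
#include <string.h>

/* Flag values assumed to match include/linux/mm.h. */
#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_EXEC     0x00000004UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYEXEC  0x00000040UL
#define VM_MAYSHARE 0x00000080UL

/* Fill out[] with the four-character permission string plus a NUL. */
static void maps_perms(unsigned long flags, char out[5])
{
    assert(out != NULL);
    out[0] = (flags & VM_READ)     ? 'r' : '-';
    out[1] = (flags & VM_WRITE)    ? 'w' : '-';
    out[2] = (flags & VM_EXEC)     ? 'x' : '-';
    out[3] = (flags & VM_MAYSHARE) ? 's' : 'p';
    out[4] = '\0';
}
```

With the old initialization (flags 0) this yields "---p"; with
VM_READ | VM_MAYREAD | VM_EXEC | VM_MAYEXEC it yields "r-xp".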


[PATCH 3/11] Add VM_ALWAYSDUMP

2007-01-13 Thread Roland McGrath

This patch adds the VM_ALWAYSDUMP flag for vm_flags in vm_area_struct.
This provides a clean explicit way to have a vma always included in core
dumps, as is needed for vDSOs.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 fs/binfmt_elf.c|4 
 include/linux/mm.h |1 +
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 7cb2872..6fec8bf 100644  
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1178,6 +1178,10 @@ static int dump_seek(struct file *file, 
  */
 static int maydump(struct vm_area_struct *vma)
 {
+   /* The vma can be set up to tell us the answer directly.  */
+   if (vma->vm_flags & VM_ALWAYSDUMP)
+   return 1;
+
/* Do not dump I/O mapped devices or special mappings */
if (vma->vm_flags & (VM_IO | VM_RESERVED))
return 0;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7691223..2d2c08d 100644  
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -168,6 +168,7 @@ extern unsigned int kobjsize(const void 
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_MAPPED_COPY 0x0100  /* T if mapped copy of data (nommu mmap) */
 #define VM_INSERTPAGE  0x0200  /* The vma has had "vm_insert_page()" done on it */
+#define VM_ALWAYSDUMP  0x0400  /* Always include in core dumps */
 
 #ifndef VM_STACK_DEFAULT_FLAGS /* arch can override this */
 #define VM_STACK_DEFAULT_FLAGS VM_DATA_DEFAULT_FLAGS
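
The ordering of the checks in maydump is what matters: VM_ALWAYSDUMP is
tested first, so it wins even over VM_IO or VM_RESERVED. A small userspace
model of just the two checks visible in the hunk above follows. The flag
values are assumptions for illustration, struct vma and maydump_model are
hypothetical names, and the final "return 1" is a placeholder for the
remaining checks of the real function, which are not shown in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed flag values, for illustration only. */
#define VM_IO         0x00004000UL
#define VM_RESERVED   0x00080000UL
#define VM_ALWAYSDUMP 0x04000000UL

/* Hypothetical stand-in for vm_area_struct. */
struct vma { unsigned long vm_flags; };

static int maydump_model(const struct vma *vma)
{
    assert(vma != NULL);

    /* The vma can be set up to tell us the answer directly. */
    if (vma->vm_flags & VM_ALWAYSDUMP)
        return 1;

    /* Do not dump I/O mapped devices or special mappings. */
    if (vma->vm_flags & (VM_IO | VM_RESERVED))
        return 0;

    return 1;   /* placeholder: the real maydump() has further checks */
}
```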


[PATCH 1/11] Fix CONFIG_COMPAT_VDSO

2007-01-13 Thread Roland McGrath

I wouldn't mind if CONFIG_COMPAT_VDSO went away entirely.
But if it's there, it should work properly.  Currently
it's quite haphazard: both real vma and fixmap are
mapped, both are put in the two different AT_* slots,
sysenter returns to the vma address rather than the
fixmap address, and core dumps are yet another story.

This patch makes CONFIG_COMPAT_VDSO disable the real vma
and use the fixmap area consistently.  This makes it
actually compatible with what the old vdso implementation did.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/i386/kernel/entry.S|4 
 arch/i386/kernel/sysenter.c |2 ++
 include/asm-i386/elf.h  |7 +++
 include/asm-i386/fixmap.h   |2 ++
 include/asm-i386/page.h |2 ++
 5 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/i386/kernel/entry.S b/arch/i386/kernel/entry.S
index 06461b8..5e47683 100644  
--- a/arch/i386/kernel/entry.S
+++ b/arch/i386/kernel/entry.S
@@ -302,12 +302,16 @@ sysenter_past_esp:
pushl $(__USER_CS)
CFI_ADJUST_CFA_OFFSET 4
/*CFI_REL_OFFSET cs, 0*/
+#ifndef CONFIG_COMPAT_VDSO
/*
 * Push current_thread_info()->sysenter_return to the stack.
 * A tiny bit of offset fixup is necessary - 4*4 means the 4 words
 * pushed above; +8 corresponds to copy_thread's esp0 setting.
 */
pushl (TI_sysenter_return-THREAD_SIZE+8+4*4)(%esp)
+#else
+   pushl $SYSENTER_RETURN
+#endif
CFI_ADJUST_CFA_OFFSET 4
CFI_REL_OFFSET eip, 0
 
diff --git a/arch/i386/kernel/sysenter.c b/arch/i386/kernel/sysenter.c
index 7de9117..454d12d 100644  
--- a/arch/i386/kernel/sysenter.c
+++ b/arch/i386/kernel/sysenter.c
@@ -100,6 +100,7 @@ int __init sysenter_setup(void)
return 0;
 }
 
+#ifndef CONFIG_COMPAT_VDSO
 static struct page *syscall_nopage(struct vm_area_struct *vma,
unsigned long adr, int *type)
 {
@@ -187,3 +188,4 @@ int in_gate_area_no_task(unsigned long a
 {
return 0;
 }
+#endif
diff --git a/include/asm-i386/elf.h b/include/asm-i386/elf.h
index 45d21a0..0515d61 100644  
--- a/include/asm-i386/elf.h
+++ b/include/asm-i386/elf.h
@@ -143,11 +143,8 @@ extern int dump_task_extended_fpu (struc
 # define VDSO_PRELINK  0
 #endif
 
-#define VDSO_COMPAT_SYM(x) \
-   (VDSO_COMPAT_BASE + (unsigned long)(x) - VDSO_PRELINK)
-
 #define VDSO_SYM(x) \
-   (VDSO_BASE + (unsigned long)(x) - VDSO_PRELINK)
+   (VDSO_COMPAT_BASE + (unsigned long)(x) - VDSO_PRELINK)
 
 #define VDSO_HIGH_EHDR ((const struct elfhdr *) VDSO_HIGH_BASE)
 #define VDSO_EHDR  ((const struct elfhdr *) VDSO_COMPAT_BASE)
@@ -156,10 +153,12 @@ extern void __kernel_vsyscall;
 
 #define VDSO_ENTRY VDSO_SYM(&__kernel_vsyscall)
 
+#ifndef CONFIG_COMPAT_VDSO
 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES
 struct linux_binprm;
 extern int arch_setup_additional_pages(struct linux_binprm *bprm,
int executable_stack);
+#endif
 
 extern unsigned int vdso_enabled;
 
diff --git a/include/asm-i386/fixmap.h b/include/asm-i386/fixmap.h
index 02428cb..3e9f610 100644  
--- a/include/asm-i386/fixmap.h
+++ b/include/asm-i386/fixmap.h
@@ -23,6 +23,8 @@
 extern unsigned long __FIXADDR_TOP;
 #else
 #define __FIXADDR_TOP  0xf000
+#define FIXADDR_USER_START __fix_to_virt(FIX_VDSO)
+#define FIXADDR_USER_END   __fix_to_virt(FIX_VDSO - 1)
 #endif
 
 #ifndef __ASSEMBLY__
diff --git a/include/asm-i386/page.h b/include/asm-i386/page.h
index fd3f64a..7b19f45 100644  
--- a/include/asm-i386/page.h
+++ b/include/asm-i386/page.h
@@ -143,7 +143,9 @@ extern int page_is_ram(unsigned long pag
 #include 
 #include 
 
+#ifndef CONFIG_COMPAT_VDSO
 #define __HAVE_ARCH_GATE_AREA 1
+#endif
 #endif /* __KERNEL__ */
 
 #endif /* _I386_PAGE_H */


Re: [RFC][PATCH] MMC: Major restructuring and cleanup

2007-01-13 Thread Philip Langdale
Pierre Ossman wrote:
> Hi everyone,
> 
> As I've mentioned to some of you, I've been working on restructuring the MMC
> layer in order to make it more easily maintained and to allow extensions
> like SDIO support. A first draft of this is now ready for public review.
> I've cc:d those who have been waiting for this patch set (and Russell since
> he always gives blunt, but valuable feedback ;)).

So, I think I'm a bit too much of a kernel newbie to be able to provide a
definitive review, but I've looked over the changes and they look good to me.

I fully agree with the rearchitecturing - it makes it a lot easier to see
what's going on and it'll scale for SDIO (as you mention) and CE-ATA as well,
if we ever get a hold of any of those :-)

One concrete observation I'd make is that we should probably try and detect
MMC first instead of SD. Up until today, I'd have said it didn't really
matter, but I've been doing some reading and discovered that Pretec make
some very strange cards they call "SuperSD" which can talk mmc4 and sd 1.1.
These will happily go along with either initialisation sequence - and as mmc4
is either the same or better than sd 1.1 from a performance point of view,
we should prefer it. This is independent of your restructuring, but as you're
fiddling with this code... :-)

http://www.hjreggel.net/cardspeed/special-sd.html#supersd

http://www.jactron.co.uk/pretec/ssd/consumer/super-sd.htm

--phil


How to flush the disk write cache from userspace

2007-01-13 Thread Ricardo Correia
Hi, (please CC: to my email address, I'm not subscribed)

Quick question: how can I flush the disk write cache from userspace?

Long question:

I'm porting the Solaris ZFS filesystem to the FUSE/Linux filesystem framework.
This is a copy-on-write, transactional filesystem and so it needs to ensure 
correct ordering of writes when transactions are written to disk.

At the moment, when transactions end, I'm using an fsync() on the block device 
followed by an ioctl(BLKFLSBUF).

This is because, according to the fsync manpage, even after fsync() returns, 
data might still be in the disk write cache, so fsync by itself doesn't 
guarantee data safety on power failure.

I was looking for something like the Solaris ioctl(DKIOCFLUSHWRITECACHE), 
which does exactly what I need.

The most similar thing I could find was ioctl(BLKFLSBUF), however a search for 
BLKFLSBUF on the Linux 2.6.15 source doesn't seem to return anything related 
to IDE or SCSI disks.

Can I trust ioctl(BLKFLSBUF) to flush disks' write caches (for disks that 
follow the specs)?

What about block devices of disk partitions, LVM logical volumes and EVMS 
volumes, do they propagate flush commands to the respective disks?

What about loop devices?

Thanks in advance.
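
For reference, the sequence described above amounts to roughly the
following. This is a sketch of what I'm doing today, not a claim that it is
sufficient: whether BLKFLSBUF actually reaches the drive's write cache is
exactly the open question. flush_block_device is just an illustrative name,
and fd is assumed to be an open descriptor for the block device.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKFLSBUF */

/*
 * Flush a block device as described in the message: fsync() first,
 * then ioctl(BLKFLSBUF).
 * Returns 0 on success, -1 if fsync() fails, -2 if the ioctl fails;
 * errno is left as set by the failing call.
 */
static int flush_block_device(int fd)
{
    if (fsync(fd) != 0)
        return -1;
    if (ioctl(fd, BLKFLSBUF, 0) != 0)
        return -2;
    return 0;
}
```

Keeping the two failure codes apart at least shows which step a given
device type rejects when experimenting with partitions, LVM volumes or
loop devices.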


Re: [patch 10/10] mm: fix pagecache write deadlocks

2007-01-13 Thread Nick Piggin

Nick Piggin wrote:
> @@ -1878,31 +1889,88 @@ generic_file_buffered_write(struct kiocb
> break;
> }
>  
> +   /*
> +* non-uptodate pages cannot cope with short copies, and we
> +* cannot take a pagefault with the destination page locked.
> +* So pin the source page to copy it.
> +*/
> +   if (!PageUptodate(page)) {
> +   unlock_page(page);
> +
> +   bytes = min(bytes, PAGE_CACHE_SIZE -
> +((unsigned long)buf & ~PAGE_CACHE_MASK));
> +
> +   /*
> +* Cannot get_user_pages with a page locked for the
> +* same reason as we can't take a page fault with a
> +* page locked (as explained below).
> +*/
> +   status = get_user_pages(current, current->mm,
> +   (unsigned long)buf & PAGE_CACHE_MASK, 1,
> +   0, 0, _page, NULL);


Thinko... get_user_pages needs to be called with mmap_sem held, obviously.

--
SUSE Labs, Novell Inc.


Re: ahci_softreset prevents acpi_power_off

2007-01-13 Thread Jeff Garzik

Robert Hancock wrote:
> Faik Uygur wrote:
>>> What happens when you try to shutdown?
>>
>> Does not shutdown and freezes.
>>
>> Hand copied last messages seen on console:
>>
>> Synchronizing SCSI cache for disk sda:
>> ACPI: PCI Interrupt for device :06:08.0 disabled
>> Power down.
>> acpi_power_off called
>>   hwsleep-0285 [01] enter_sleep_state: Entering sleep state [S5]
>
> Since you're getting to this point I think this has to be some kind of
> BIOS interaction causing this. The only thing that happens after the
> "Entering sleep state" is that the kernel writes to some ACPI registers
> to tell the hardware to power down. I think some laptop BIOSes do things
> on ACPI power down like try to park the drive heads, etc. and maybe this
> change that you found from git bisecting is somehow interfering with it
> doing this?
>
> Might want to check for a BIOS update first of all..

It would be interesting to try -mm, which includes ACPI support for ATA...

Jeff





Re: Choosing a HyperThreading/SMP/MultiCore kernel ?

2007-01-13 Thread Valdis . Kletnieks
On Sat, 13 Jan 2007 12:54:43 EST, Lennart Sorensen said:
> On Fri, Jan 12, 2007 at 10:38:43PM -0500, [EMAIL PROTECTED] wrote:
> > CONFIG_MCORE2=y
> 
> Oh good.  Makes life much simpler for users.

After writing that, I actually went back and *checked* the fine print.
It turns out that unless you have installed a not-yet-escaped release of
gcc, -mtune=core2 doesn't work, so it punts to -mtune=generic.  Wandering
over to http://gcc.gnu.org and searching the mailing lists, it seems that
on most of the benchmarks, -mtune=core2 was only a 0.5% or so win on most
stuff in its current form.




Re: Choosing a HyperThreading/SMP/MultiCore kernel ?

2007-01-13 Thread Valdis . Kletnieks
On Sat, 13 Jan 2007 15:18:31 EST, Bill Davidsen said:
> [EMAIL PROTECTED] wrote:
> > On Fri, 12 Jan 2007 10:03:49 EST, Lennart Sorensen said:
> >> I would expect any distribution should work on these (as long as the
> >> kernel they use isn't too old.).  Of course if it is a Mac, you need a
> >> distribution that supports their firmware (which is of course not a PC
> >> bios).  As long as you can boot it, any i386 or amd64 kernel with smp
> >> enabled should use all the processors present (well amd64 on the
> >> core2duo and on the p4 if it is em64t enabled).
> > 
> > amd64 will only work on a core2duo if it's a T7200 or higher - the
> > lower numbers are 32-bit-only chipsets.  I admit not knowing what
> > exact variant the Mac has.
> 
> I don't believe that's correct, the Intel features page indicates all 
> core2 have both 64bit and virtualization. Perhaps some of the core (no 
> 2) models didn't? Even the old 930 had those features by my notes.

My screwup - the chart I looked at managed to get the Core and Core2 series
mixed up. Here's a hopefully more canonical one:

http://www.intel.com/products/processor_number/proc_info_table.pdf

Does however list some Core2 that don't do virtualization (page 3, the
T5600 and T5500), which is what I think confused the author of the table
that I misread. ;)




Re: [PATCH] libata: PIIX3 support

2007-01-13 Thread Robert Hancock

Mikael Pettersson wrote:
> On Wed, 10 Jan 2007 17:13:38 +, Alan <[EMAIL PROTECTED]> wrote:
>> This I believe completes the PIIX range of support for libata
>>
>> This adds the table entries needed for the PIIX3, both a new PCI
>> identifier and a new mode list. It also fixes an erroneous access to PCI
>> configuration 0x48 on non UDMA capable chips.
>
> Works fine here on a 430HX box (ASUS T2P4).
> I'm appending kernel messages for boots with the IDE driver and
> with the updated libata driver, in case you want to compare them.
>
> I did notice that ata_piix identified the disk as
> "QUANTUM FIREBALL A5U." when IDE correctly identified it as
> "QUANTUM FIREBALL CR8.4A".


I believe libata truncates the ATA device ID string to fit the max 
allowable for SCSI. The A5U. part is presumably the drive's firmware 
revision.
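
As a rough illustration of the effect (not libata's actual code, which I
haven't quoted here): the SCSI INQUIRY product identification field is 16
bytes wide, so a longer ATA model string gets cut off there. The function
name below is made up, and the real field is space-padded rather than
NUL-terminated.

```c
#include <assert.h>
#include <string.h>

#define SCSI_PRODUCT_LEN 16     /* width of the INQUIRY product field */

/*
 * Copy at most SCSI_PRODUCT_LEN characters of the ATA model string.
 * The output is NUL-terminated here for convenience only.
 */
static void scsi_product_from_ata_model(const char *ata_model,
                                        char out[SCSI_PRODUCT_LEN + 1])
{
    assert(ata_model != NULL && out != NULL);
    size_t n = strlen(ata_model);

    if (n > SCSI_PRODUCT_LEN)
        n = SCSI_PRODUCT_LEN;
    memcpy(out, ata_model, n);
    out[n] = '\0';
}
```

"QUANTUM FIREBALL" is exactly 16 characters, so "QUANTUM FIREBALL CR8.4A"
loses the " CR8.4A" part, and the 4-character revision field would then
account for the "A5U." that gets printed after it.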


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: ahci_softreset prevents acpi_power_off

2007-01-13 Thread Robert Hancock

Faik Uygur wrote:
>> What happens when you try to shutdown?
>
> Does not shutdown and freezes.
>
> Hand copied last messages seen on console:
>
> Synchronizing SCSI cache for disk sda:
> ACPI: PCI Interrupt for device :06:08.0 disabled
> Power down.
> acpi_power_off called
>   hwsleep-0285 [01] enter_sleep_state: Entering sleep state [S5]


Since you're getting to this point I think this has to be some kind of 
BIOS interaction causing this. The only thing that happens after the 
"Entering sleep state" is that the kernel writes to some ACPI registers 
to tell the hardware to power down. I think some laptop BIOSes do things 
on ACPI power down like try to park the drive heads, etc. and maybe this 
change that you found from git bisecting is somehow interfering with it 
doing this?


Might want to check for a BIOS update first of all..

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: [patch 20/20] XEN-paravirt: Add Xen virtual block device driver.

2007-01-13 Thread Arjan van de Ven
> +#include "../../../arch/i386/paravirt-xen/events.h"
> +#include "../../../arch/i386/paravirt-xen/xen-page.h"

this shows the headers are clearly in the wrong place...
> +
> + err = xenbus_printf(xbt, dev->nodename,
> + "ring-ref","%u", info->ring_ref);

why do you need your own printf?

> +static inline int GET_ID_FROM_FREELIST(

does this really need screaming?


> +
> +int blkif_ioctl(struct inode *inode, struct file *filep,
> + unsigned command, unsigned long argument)
> +{
> + int i;
> +
> + DPRINTK_IOCTL("command: 0x%x, argument: 0x%lx, dev: 0x%04x\n",
> +   command, (long)argument, inode->i_rdev);
> +
> + switch (command) {
> + case CDROMMULTISESSION:
> + DPRINTK("FIXME: support multisession CDs later\n");
> + for (i = 0; i < sizeof(struct cdrom_multisession); i++)
> + if (put_user(0, (char __user *)(argument + i)))
> + return -EFAULT;
> + return 0;
> +
> + default:
> + /*printk(KERN_ALERT "ioctl %08x not supported by Xen blkdev\n",
> +   command);*/
> + return -EINVAL; /* same return as native Linux */
> + }

eh so you implement no ioctls.. why then implement the ioctl method at
all?


> +static struct xenbus_driver blkfront = {
> + .name = "vbd",
> + .owner = THIS_MODULE,
> + .ids = blkfront_ids,
> + .probe = blkfront_probe,
> + .remove = blkfront_remove,
> + .resume = blkfront_resume,
> + .otherend_changed = backend_changed,
> +};

this can be const

> +
> +#define DPRINTK(_f, _a...) pr_debug(_f, ## _a)

why this silly abstraction? Just use pr_debug in the code directly





[patch 12/12] mark struct inode_operations const 3

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 12/12] mark struct inode_operations const

Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.
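
As a rough illustration (not part of the patch), the effect can be
sketched in plain userspace C. The names demo_ops, demo_entry and
demo_call_lookup are made up for this example and stand in for an ops
table and a holder of a pointer to it. Where the object ends up in the
binary depends on the toolchain, .rodata being the usual case.

```c
#include <stddef.h>

/* Hypothetical stand-in for an ops table such as struct inode_operations. */
struct demo_ops {
	int (*lookup)(int key);
	int (*setattr)(int key, int value);
};

static int demo_lookup(int key)
{
	return key * 2;
}

/*
 * Declared const: a typical toolchain places this object in .rodata
 * rather than .data, and any assignment to its members is rejected at
 * compile time.
 */
static const struct demo_ops demo_file_ops = {
	.lookup = demo_lookup,
	/* .setattr is left NULL, as many real tables leave methods unset */
};

/* Holders keep a pointer-to-const, like the "const ... *iop" change. */
struct demo_entry {
	const char *name;
	const struct demo_ops *ops;
};

/* Call through the table, tolerating a missing table or method. */
static int demo_call_lookup(const struct demo_entry *e, int key)
{
	if (!e->ops || !e->ops->lookup)
		return -1;
	return e->ops->lookup(key);
}

/* demo_file_ops.lookup = NULL;  <- would not compile: object is const */
```

Callers only ever read through the pointer, so switching the holder to a
pointer-to-const costs nothing at the call sites. Code that really does
patch a table at runtime is the case that cannot be converted this way.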

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6.20-rc4/fs/proc/base.c
===
--- linux-2.6.20-rc4.orig/fs/proc/base.c
+++ linux-2.6.20-rc4/fs/proc/base.c
@@ -93,7 +93,7 @@ struct pid_entry {
int len;
char *name;
mode_t mode;
-   struct inode_operations *iop;
+   const struct inode_operations *iop;
const struct file_operations *fop;
union proc_op op;
 };
@@ -352,7 +352,7 @@ static int proc_setattr(struct dentry *d
return error;
 }
 
-static struct inode_operations proc_def_inode_operations = {
+static const struct inode_operations proc_def_inode_operations = {
.setattr= proc_setattr,
 };
 
@@ -978,7 +978,7 @@ out:
return error;
 }
 
-static struct inode_operations proc_pid_link_inode_operations = {
+static const struct inode_operations proc_pid_link_inode_operations = {
.readlink   = proc_pid_readlink,
.follow_link= proc_pid_follow_link,
.setattr= proc_setattr,
@@ -1414,7 +1414,7 @@ static const struct file_operations proc
 /*
  * proc directories can do almost nothing..
  */
-static struct inode_operations proc_fd_inode_operations = {
+static const struct inode_operations proc_fd_inode_operations = {
.lookup = proc_lookupfd,
.setattr= proc_setattr,
 };
@@ -1654,7 +1654,7 @@ static struct dentry *proc_attr_dir_look
  attr_dir_stuff, ARRAY_SIZE(attr_dir_stuff));
 }
 
-static struct inode_operations proc_attr_dir_inode_operations = {
+static const struct inode_operations proc_attr_dir_inode_operations = {
.lookup = proc_attr_dir_lookup,
.getattr= pid_getattr,
.setattr= proc_setattr,
@@ -1680,7 +1680,7 @@ static void *proc_self_follow_link(struc
return ERR_PTR(vfs_follow_link(nd,tmp));
 }
 
-static struct inode_operations proc_self_inode_operations = {
+static const struct inode_operations proc_self_inode_operations = {
.readlink   = proc_self_readlink,
.follow_link= proc_self_follow_link,
 };
@@ -1829,7 +1829,7 @@ static int proc_pid_io_accounting(struct
  * Thread groups
  */
 static const struct file_operations proc_task_operations;
-static struct inode_operations proc_task_inode_operations;
+static const struct inode_operations proc_task_inode_operations;
 
 static struct pid_entry tgid_base_stuff[] = {
DIR("task",   S_IRUGO|S_IXUGO, task),
@@ -1898,7 +1898,7 @@ static struct dentry *proc_tgid_base_loo
  tgid_base_stuff, ARRAY_SIZE(tgid_base_stuff));
 }
 
-static struct inode_operations proc_tgid_base_inode_operations = {
+static const struct inode_operations proc_tgid_base_inode_operations = {
.lookup = proc_tgid_base_lookup,
.getattr= pid_getattr,
.setattr= proc_setattr,
@@ -2176,7 +2176,7 @@ static const struct file_operations proc
.readdir= proc_tid_base_readdir,
 };
 
-static struct inode_operations proc_tid_base_inode_operations = {
+static const struct inode_operations proc_tid_base_inode_operations = {
.lookup = proc_tid_base_lookup,
.getattr= pid_getattr,
.setattr= proc_setattr,
@@ -2392,7 +2392,7 @@ static int proc_task_getattr(struct vfsm
return 0;
 }
 
-static struct inode_operations proc_task_inode_operations = {
+static const struct inode_operations proc_task_inode_operations = {
.lookup = proc_task_lookup,
.getattr= proc_task_getattr,
.setattr= proc_setattr,
Index: linux-2.6.20-rc4/fs/proc/generic.c
===
--- linux-2.6.20-rc4.orig/fs/proc/generic.c
+++ linux-2.6.20-rc4/fs/proc/generic.c
@@ -265,7 +265,7 @@ static int proc_getattr(struct vfsmount 
return 0;
 }
 
-static struct inode_operations proc_file_inode_operations = {
+static const struct inode_operations proc_file_inode_operations = {
.setattr= proc_notify_change,
 };
 
@@ -357,7 +357,7 @@ static void *proc_follow_link(struct den
return NULL;
 }
 
-static struct inode_operations proc_link_inode_operations = {
+static const struct inode_operations proc_link_inode_operations = {
.readlink   = generic_readlink,
.follow_link= proc_follow_link,
 };
@@ -505,7 +505,7 @@ static const struct file_operations proc
 /*
  * proc directories can do almost nothing..
  */
-static struct inode_operations proc_dir_inode_operations = {

[patch 11/12] mark struct inode_operations const 2

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 11/12] mark struct inode_operations const

Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6.20-rc4/fs/gfs2/ops_inode.c
===
--- linux-2.6.20-rc4.orig/fs/gfs2/ops_inode.c
+++ linux-2.6.20-rc4/fs/gfs2/ops_inode.c
@@ -1084,7 +1084,7 @@ static int gfs2_removexattr(struct dentr
return gfs2_ea_remove(GFS2_I(dentry->d_inode), );
 }
 
-struct inode_operations gfs2_file_iops = {
+const struct inode_operations gfs2_file_iops = {
.permission = gfs2_permission,
.setattr = gfs2_setattr,
.getattr = gfs2_getattr,
@@ -1094,7 +1094,7 @@ struct inode_operations gfs2_file_iops =
.removexattr = gfs2_removexattr,
 };
 
-struct inode_operations gfs2_dev_iops = {
+const struct inode_operations gfs2_dev_iops = {
.permission = gfs2_permission,
.setattr = gfs2_setattr,
.getattr = gfs2_getattr,
@@ -1104,7 +1104,7 @@ struct inode_operations gfs2_dev_iops = 
.removexattr = gfs2_removexattr,
 };
 
-struct inode_operations gfs2_dir_iops = {
+const struct inode_operations gfs2_dir_iops = {
.create = gfs2_create,
.lookup = gfs2_lookup,
.link = gfs2_link,
@@ -1123,7 +1123,7 @@ struct inode_operations gfs2_dir_iops = 
.removexattr = gfs2_removexattr,
 };
 
-struct inode_operations gfs2_symlink_iops = {
+const struct inode_operations gfs2_symlink_iops = {
.readlink = gfs2_readlink,
.follow_link = gfs2_follow_link,
.permission = gfs2_permission,
Index: linux-2.6.20-rc4/fs/gfs2/ops_inode.h
===
--- linux-2.6.20-rc4.orig/fs/gfs2/ops_inode.h
+++ linux-2.6.20-rc4/fs/gfs2/ops_inode.h
@@ -12,9 +12,9 @@
 
 #include 
 
-extern struct inode_operations gfs2_file_iops;
-extern struct inode_operations gfs2_dir_iops;
-extern struct inode_operations gfs2_symlink_iops;
-extern struct inode_operations gfs2_dev_iops;
+extern const struct inode_operations gfs2_file_iops;
+extern const struct inode_operations gfs2_dir_iops;
+extern const struct inode_operations gfs2_symlink_iops;
+extern const struct inode_operations gfs2_dev_iops;
 
 #endif /* __OPS_INODE_DOT_H__ */
Index: linux-2.6.20-rc4/fs/hfs/dir.c
===
--- linux-2.6.20-rc4.orig/fs/hfs/dir.c
+++ linux-2.6.20-rc4/fs/hfs/dir.c
@@ -320,7 +320,7 @@ const struct file_operations hfs_dir_ope
.release= hfs_dir_release,
 };
 
-struct inode_operations hfs_dir_inode_operations = {
+const struct inode_operations hfs_dir_inode_operations = {
.create = hfs_create,
.lookup = hfs_lookup,
.unlink = hfs_unlink,
Index: linux-2.6.20-rc4/fs/hfs/hfs_fs.h
===
--- linux-2.6.20-rc4.orig/fs/hfs/hfs_fs.h
+++ linux-2.6.20-rc4/fs/hfs/hfs_fs.h
@@ -170,7 +170,7 @@ extern void hfs_cat_build_key(struct sup
 
 /* dir.c */
 extern const struct file_operations hfs_dir_operations;
-extern struct inode_operations hfs_dir_inode_operations;
+extern const struct inode_operations hfs_dir_inode_operations;
 
 /* extent.c */
 extern int hfs_ext_keycmp(const btree_key *, const btree_key *);
Index: linux-2.6.20-rc4/fs/hfs/inode.c
===
--- linux-2.6.20-rc4.orig/fs/hfs/inode.c
+++ linux-2.6.20-rc4/fs/hfs/inode.c
@@ -18,7 +18,7 @@
 #include "btree.h"
 
 static const struct file_operations hfs_file_operations;
-static struct inode_operations hfs_file_inode_operations;
+static const struct inode_operations hfs_file_inode_operations;
 
 /* Variable-like macros */
 
@@ -612,7 +612,7 @@ static const struct file_operations hfs_
.release= hfs_file_release,
 };
 
-static struct inode_operations hfs_file_inode_operations = {
+static const struct inode_operations hfs_file_inode_operations = {
.lookup = hfs_file_lookup,
.truncate   = hfs_file_truncate,
.setattr= hfs_inode_setattr,
Index: linux-2.6.20-rc4/fs/hfsplus/dir.c
===
--- linux-2.6.20-rc4.orig/fs/hfsplus/dir.c
+++ linux-2.6.20-rc4/fs/hfsplus/dir.c
@@ -471,7 +471,7 @@ static int hfsplus_rename(struct inode *
return res;
 }
 
-struct inode_operations hfsplus_dir_inode_operations = {
+const struct inode_operations hfsplus_dir_inode_operations = {
.lookup = hfsplus_lookup,
.create = hfsplus_create,
.link   = hfsplus_link,
Index: linux-2.6.20-rc4/fs/hfsplus/inode.c

[patch 09/12] mark struct file_operations const 9

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 09/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/security/inode.c
===
--- linux-2.6.orig/security/inode.c
+++ linux-2.6/security/inode.c
@@ -50,7 +50,7 @@ static int default_open(struct inode *in
return 0;
 }
 
-static struct file_operations default_file_ops = {
+static const struct file_operations default_file_ops = {
.read = default_read_file,
.write =default_write_file,
.open = default_open,
@@ -215,7 +215,7 @@ static int create_by_name(const char *na
  */
 struct dentry *securityfs_create_file(const char *name, mode_t mode,
   struct dentry *parent, void *data,
-  struct file_operations *fops)
+  const struct file_operations *fops)
 {
struct dentry *dentry = NULL;
int error;
Index: linux-2.6/security/keys/proc.c
===
--- linux-2.6.orig/security/keys/proc.c
+++ linux-2.6/security/keys/proc.c
@@ -33,7 +33,7 @@ static struct seq_operations proc_keys_o
.show   = proc_keys_show,
 };
 
-static struct file_operations proc_keys_fops = {
+static const struct file_operations proc_keys_fops = {
.open   = proc_keys_open,
.read   = seq_read,
.llseek = seq_lseek,
@@ -54,7 +54,7 @@ static struct seq_operations proc_key_us
.show   = proc_key_users_show,
 };
 
-static struct file_operations proc_key_users_fops = {
+static const struct file_operations proc_key_users_fops = {
.open   = proc_key_users_open,
.read   = seq_read,
.llseek = seq_lseek,
Index: linux-2.6/security/selinux/selinuxfs.c
===
--- linux-2.6.orig/security/selinux/selinuxfs.c
+++ linux-2.6/security/selinux/selinuxfs.c
@@ -161,7 +161,7 @@ out:
 #define sel_write_enforce NULL
 #endif
 
-static struct file_operations sel_enforce_ops = {
+static const struct file_operations sel_enforce_ops = {
.read   = sel_read_enforce,
.write  = sel_write_enforce,
 };
@@ -211,7 +211,7 @@ out:
 #define sel_write_disable NULL
 #endif
 
-static struct file_operations sel_disable_ops = {
+static const struct file_operations sel_disable_ops = {
.write  = sel_write_disable,
 };
 
@@ -225,7 +225,7 @@ static ssize_t sel_read_policyvers(struc
return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
 }
 
-static struct file_operations sel_policyvers_ops = {
+static const struct file_operations sel_policyvers_ops = {
.read   = sel_read_policyvers,
 };
 
@@ -242,7 +242,7 @@ static ssize_t sel_read_mls(struct file 
return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
 }
 
-static struct file_operations sel_mls_ops = {
+static const struct file_operations sel_mls_ops = {
.read   = sel_read_mls,
 };
 
@@ -294,7 +294,7 @@ out:
return length;
 }
 
-static struct file_operations sel_load_ops = {
+static const struct file_operations sel_load_ops = {
.write  = sel_write_load,
 };
 
@@ -374,7 +374,7 @@ out:
free_page((unsigned long) page);
return length;
 }
-static struct file_operations sel_checkreqprot_ops = {
+static const struct file_operations sel_checkreqprot_ops = {
.read   = sel_read_checkreqprot,
.write  = sel_write_checkreqprot,
 };
@@ -423,7 +423,7 @@ out:
free_page((unsigned long) page);
return length;
 }
-static struct file_operations sel_compat_net_ops = {
+static const struct file_operations sel_compat_net_ops = {
.read   = sel_read_compat_net,
.write  = sel_write_compat_net,
 };
@@ -467,7 +467,7 @@ static ssize_t selinux_transaction_write
return rv;
 }
 
-static struct file_operations transaction_ops = {
+static const struct file_operations transaction_ops = {
.write  = selinux_transaction_write,
.read   = simple_transaction_read,
.release= simple_transaction_release,
@@ -875,7 +875,7 @@ out:
return length;
 }
 
-static struct file_operations sel_bool_ops = {
+static const struct file_operations sel_bool_ops = {
.read   = sel_read_bool,
.write  = sel_write_bool,
 };
@@ -932,7 +932,7 @@ out:
return length;
 }
 
-static struct file_operations sel_commit_bools_ops = {
+static const struct file_operations 

[patch 10/12] mark struct inode_operations const 1

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 10/12] mark struct inode_operations const

Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6.20-rc4/arch/powerpc/platforms/cell/spufs/inode.c
===
--- linux-2.6.20-rc4.orig/arch/powerpc/platforms/cell/spufs/inode.c
+++ linux-2.6.20-rc4/arch/powerpc/platforms/cell/spufs/inode.c
@@ -220,7 +220,7 @@ static int spufs_dir_close(struct inode 
return dcache_dir_close(inode, file);
 }
 
-struct inode_operations spufs_dir_inode_operations = {
+const struct inode_operations spufs_dir_inode_operations = {
.lookup = simple_lookup,
 };
 
Index: linux-2.6.20-rc4/fs/9p/vfs_inode.c
===
--- linux-2.6.20-rc4.orig/fs/9p/vfs_inode.c
+++ linux-2.6.20-rc4/fs/9p/vfs_inode.c
@@ -41,10 +41,10 @@
 #include "v9fs_vfs.h"
 #include "fid.h"
 
-static struct inode_operations v9fs_dir_inode_operations;
-static struct inode_operations v9fs_dir_inode_operations_ext;
-static struct inode_operations v9fs_file_inode_operations;
-static struct inode_operations v9fs_symlink_inode_operations;
+static const struct inode_operations v9fs_dir_inode_operations;
+static const struct inode_operations v9fs_dir_inode_operations_ext;
+static const struct inode_operations v9fs_file_inode_operations;
+static const struct inode_operations v9fs_symlink_inode_operations;
 
 /**
  * unixmode2p9mode - convert unix mode bits to plan 9
@@ -1274,7 +1274,7 @@ v9fs_vfs_mknod(struct inode *dir, struct
return retval;
 }
 
-static struct inode_operations v9fs_dir_inode_operations_ext = {
+static const struct inode_operations v9fs_dir_inode_operations_ext = {
.create = v9fs_vfs_create,
.lookup = v9fs_vfs_lookup,
.symlink = v9fs_vfs_symlink,
@@ -1289,7 +1289,7 @@ static struct inode_operations v9fs_dir_
.setattr = v9fs_vfs_setattr,
 };
 
-static struct inode_operations v9fs_dir_inode_operations = {
+static const struct inode_operations v9fs_dir_inode_operations = {
.create = v9fs_vfs_create,
.lookup = v9fs_vfs_lookup,
.unlink = v9fs_vfs_unlink,
@@ -1301,12 +1301,12 @@ static struct inode_operations v9fs_dir_
.setattr = v9fs_vfs_setattr,
 };
 
-static struct inode_operations v9fs_file_inode_operations = {
+static const struct inode_operations v9fs_file_inode_operations = {
.getattr = v9fs_vfs_getattr,
.setattr = v9fs_vfs_setattr,
 };
 
-static struct inode_operations v9fs_symlink_inode_operations = {
+static const struct inode_operations v9fs_symlink_inode_operations = {
.readlink = v9fs_vfs_readlink,
.follow_link = v9fs_vfs_follow_link,
.put_link = v9fs_vfs_put_link,
Index: linux-2.6.20-rc4/fs/adfs/adfs.h
===
--- linux-2.6.20-rc4.orig/fs/adfs/adfs.h
+++ linux-2.6.20-rc4/fs/adfs/adfs.h
@@ -84,7 +84,7 @@ void __adfs_error(struct super_block *sb
  */
 
 /* dir_*.c */
-extern struct inode_operations adfs_dir_inode_operations;
+extern const struct inode_operations adfs_dir_inode_operations;
 extern const struct file_operations adfs_dir_operations;
 extern struct dentry_operations adfs_dentry_operations;
 extern struct adfs_dir_ops adfs_f_dir_ops;
@@ -93,7 +93,7 @@ extern struct adfs_dir_ops adfs_fplus_di
 extern int adfs_dir_update(struct super_block *sb, struct object_info *obj);
 
 /* file.c */
-extern struct inode_operations adfs_file_inode_operations;
+extern const struct inode_operations adfs_file_inode_operations;
 extern const struct file_operations adfs_file_operations;
 
 static inline __u32 signed_asl(__u32 val, signed int shift)
Index: linux-2.6.20-rc4/fs/adfs/dir.c
===
--- linux-2.6.20-rc4.orig/fs/adfs/dir.c
+++ linux-2.6.20-rc4/fs/adfs/dir.c
@@ -295,7 +295,7 @@ adfs_lookup(struct inode *dir, struct de
 /*
  * directories can handle most operations...
  */
-struct inode_operations adfs_dir_inode_operations = {
+const struct inode_operations adfs_dir_inode_operations = {
.lookup = adfs_lookup,
.setattr= adfs_notify_change,
 };
Index: linux-2.6.20-rc4/fs/adfs/file.c
===
--- linux-2.6.20-rc4.orig/fs/adfs/file.c
+++ linux-2.6.20-rc4/fs/adfs/file.c
@@ -36,6 +36,6 @@ const struct file_operations adfs_file_o
.sendfile   = generic_file_sendfile,
 };
 
-struct inode_operations adfs_file_inode_operations = {
+const struct inode_operations adfs_file_inode_operations = {
.setattr= adfs_notify_change,
 };
Index: linux-2.6.20-rc4/fs/affs/affs.h

Re: [RFC] How to (automatically) find the correct maintainer(s)

2007-01-13 Thread Stefan Richter
I wrote:
> gcc -o test3.o test.c test.c
   ^^ typo
gcc -o test3.o test.c test2.c
-- 
Stefan Richter
-=-=-=== ---= -===-
http://arcgraph.de/sr/



[patch 07/12] mark struct file_operations const 7

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 07/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/ipc/mqueue.c
===
--- linux-2.6.orig/ipc/mqueue.c
+++ linux-2.6/ipc/mqueue.c
@@ -85,7 +85,7 @@ struct mqueue_inode_info {
 };
 
 static struct inode_operations mqueue_dir_inode_operations;
-static struct file_operations mqueue_file_operations;
+static const struct file_operations mqueue_file_operations;
 static struct super_operations mqueue_super_ops;
 static void remove_notification(struct mqueue_inode_info *info);
 
@@ -1166,7 +1166,7 @@ static struct inode_operations mqueue_di
.unlink = mqueue_unlink,
 };
 
-static struct file_operations mqueue_file_operations = {
+static const struct file_operations mqueue_file_operations = {
.flush = mqueue_flush_file,
.poll = mqueue_poll_file,
.read = mqueue_read_file,
Index: linux-2.6/ipc/shm.c
===
--- linux-2.6.orig/ipc/shm.c
+++ linux-2.6/ipc/shm.c
@@ -42,7 +42,7 @@
 
 #include "util.h"
 
-static struct file_operations shm_file_operations;
+static const struct file_operations shm_file_operations;
 static struct vm_operations_struct shm_vm_ops;
 
 static struct ipc_ids init_shm_ids;
@@ -249,7 +249,7 @@ static int shm_release(struct inode *ino
return 0;
 }
 
-static struct file_operations shm_file_operations = {
+static const struct file_operations shm_file_operations = {
.mmap   = shm_mmap,
.release= shm_release,
 #ifndef CONFIG_MMU
Index: linux-2.6/ipc/util.c
===
--- linux-2.6.orig/ipc/util.c
+++ linux-2.6/ipc/util.c
@@ -205,7 +205,7 @@ void __ipc_init ipc_init_ids(struct ipc_
 }
 
 #ifdef CONFIG_PROC_FS
-static struct file_operations sysvipc_proc_fops;
+static const struct file_operations sysvipc_proc_fops;
 /**
  * ipc_init_proc_interface -  Create a proc interface for sysipc types
  *using a seq_file interface.
@@ -848,7 +848,7 @@ static int sysvipc_proc_open(struct inod
return ret;
 }
 
-static struct file_operations sysvipc_proc_fops = {
+static const struct file_operations sysvipc_proc_fops = {
.open= sysvipc_proc_open,
.read= seq_read,
.llseek  = seq_lseek,
Index: linux-2.6/kernel/cpuset.c
===
--- linux-2.6.orig/kernel/cpuset.c
+++ linux-2.6/kernel/cpuset.c
@@ -2656,7 +2656,7 @@ static int cpuset_open(struct inode *ino
return single_open(file, proc_cpuset_show, pid);
 }
 
-struct file_operations proc_cpuset_operations = {
+const struct file_operations proc_cpuset_operations = {
.open   = cpuset_open,
.read   = seq_read,
.llseek = seq_lseek,
Index: linux-2.6/net/802/tr.c
===
--- linux-2.6.orig/net/802/tr.c
+++ linux-2.6/net/802/tr.c
@@ -576,7 +576,7 @@ static int rif_seq_open(struct inode *in
return seq_open(file, _seq_ops);
 }
 
-static struct file_operations rif_seq_fops = {
+static const struct file_operations rif_seq_fops = {
.owner   = THIS_MODULE,
.open= rif_seq_open,
.read= seq_read,
Index: linux-2.6/net/8021q/vlanproc.c
===
--- linux-2.6.orig/net/8021q/vlanproc.c
+++ linux-2.6/net/8021q/vlanproc.c
@@ -81,7 +81,7 @@ static int vlan_seq_open(struct inode *i
return seq_open(file, _seq_ops);
 }
 
-static struct file_operations vlan_fops = {
+static const struct file_operations vlan_fops = {
.owner   = THIS_MODULE,
.open= vlan_seq_open,
.read= seq_read,
@@ -98,7 +98,7 @@ static int vlandev_seq_open(struct inode
return single_open(file, vlandev_seq_show, PDE(inode)->data);
 }
 
-static struct file_operations vlandev_fops = {
+static const struct file_operations vlandev_fops = {
.owner = THIS_MODULE,
.open= vlandev_seq_open,
.read= seq_read,
Index: linux-2.6/net/appletalk/aarp.c
===
--- linux-2.6.orig/net/appletalk/aarp.c
+++ linux-2.6/net/appletalk/aarp.c
@@ -1048,7 +1048,7 @@ out_kfree:
goto out;
 }
 
-struct file_operations atalk_seq_arp_fops = {
+const struct file_operations atalk_seq_arp_fops = {
.owner  = THIS_MODULE,
.open   = aarp_seq_open,
.read   = seq_read,
Index: linux-2.6/net/appletalk/atalk_proc.c

[patch 08/12] mark struct file_operations const 8

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 08/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/net/irda/ircomm/ircomm_core.c
===
--- linux-2.6.orig/net/irda/ircomm/ircomm_core.c
+++ linux-2.6/net/irda/ircomm/ircomm_core.c
@@ -56,7 +56,7 @@ static void ircomm_control_indication(st
 extern struct proc_dir_entry *proc_irda;
 static int ircomm_seq_open(struct inode *, struct file *);
 
-static struct file_operations ircomm_proc_fops = {
+static const struct file_operations ircomm_proc_fops = {
.owner  = THIS_MODULE,
.open   = ircomm_seq_open,
.read   = seq_read,
Index: linux-2.6/net/irda/iriap.c
===
--- linux-2.6.orig/net/irda/iriap.c
+++ linux-2.6/net/irda/iriap.c
@@ -1080,7 +1080,7 @@ static int irias_seq_open(struct inode *
return seq_open(file, _seq_ops);
 }
 
-struct file_operations irias_seq_fops = {
+const struct file_operations irias_seq_fops = {
.owner  = THIS_MODULE,
.open   = irias_seq_open,
.read   = seq_read,
Index: linux-2.6/net/irda/irlan/irlan_common.c
===
--- linux-2.6.orig/net/irda/irlan/irlan_common.c
+++ linux-2.6/net/irda/irlan/irlan_common.c
@@ -93,7 +93,7 @@ extern struct proc_dir_entry *proc_irda;
 
 static int irlan_seq_open(struct inode *inode, struct file *file);
 
-static struct file_operations irlan_fops = {
+static const struct file_operations irlan_fops = {
.owner   = THIS_MODULE,
.open= irlan_seq_open,
.read= seq_read,
Index: linux-2.6/net/irda/irlap.c
===
--- linux-2.6.orig/net/irda/irlap.c
+++ linux-2.6/net/irda/irlap.c
@@ -1244,7 +1244,7 @@ out_kfree:
goto out;
 }
 
-struct file_operations irlap_seq_fops = {
+const struct file_operations irlap_seq_fops = {
.owner  = THIS_MODULE,
.open   = irlap_seq_open,
.read   = seq_read,
Index: linux-2.6/net/irda/irlmp.c
===
--- linux-2.6.orig/net/irda/irlmp.c
+++ linux-2.6/net/irda/irlmp.c
@@ -2026,7 +2026,7 @@ out_kfree:
goto out;
 }
 
-struct file_operations irlmp_seq_fops = {
+const struct file_operations irlmp_seq_fops = {
.owner  = THIS_MODULE,
.open   = irlmp_seq_open,
.read   = seq_read,
Index: linux-2.6/net/irda/irttp.c
===
--- linux-2.6.orig/net/irda/irttp.c
+++ linux-2.6/net/irda/irttp.c
@@ -1895,7 +1895,7 @@ out_kfree:
goto out;
 }
 
-struct file_operations irttp_seq_fops = {
+const struct file_operations irttp_seq_fops = {
.owner  = THIS_MODULE,
.open   = irttp_seq_open,
.read   = seq_read,
Index: linux-2.6/net/llc/llc_proc.c
===
--- linux-2.6.orig/net/llc/llc_proc.c
+++ linux-2.6/net/llc/llc_proc.c
@@ -208,7 +208,7 @@ static int llc_seq_core_open(struct inod
return seq_open(file, _seq_core_ops);
 }
 
-static struct file_operations llc_seq_socket_fops = {
+static const struct file_operations llc_seq_socket_fops = {
.owner  = THIS_MODULE,
.open   = llc_seq_socket_open,
.read   = seq_read,
@@ -216,7 +216,7 @@ static struct file_operations llc_seq_so
.release= seq_release,
 };
 
-static struct file_operations llc_seq_core_fops = {
+static const struct file_operations llc_seq_core_fops = {
.owner  = THIS_MODULE,
.open   = llc_seq_core_open,
.read   = seq_read,
Index: linux-2.6/net/netfilter/nf_conntrack_expect.c
===
--- linux-2.6.orig/net/netfilter/nf_conntrack_expect.c
+++ linux-2.6/net/netfilter/nf_conntrack_expect.c
@@ -435,7 +435,7 @@ static int exp_open(struct inode *inode,
return seq_open(file, _seq_ops);
 }
 
-struct file_operations exp_file_ops = {
+const struct file_operations exp_file_ops = {
.owner   = THIS_MODULE,
.open= exp_open,
.read= seq_read,
Index: linux-2.6/net/netfilter/nf_conntrack_standalone.c
===
--- linux-2.6.orig/net/netfilter/nf_conntrack_standalone.c
+++ linux-2.6/net/netfilter/nf_conntrack_standalone.c
@@ -229,7 +229,7 @@ out_free:
return ret;
 }
 

[patch 05/12] mark struct file_operations const 5

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 05/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/message/i2o/i2o_config.c
===
--- linux-2.6.orig/drivers/message/i2o/i2o_config.c
+++ linux-2.6/drivers/message/i2o/i2o_config.c
@@ -,7 +,7 @@ static int cfg_release(struct inode *ino
return 0;
 }
 
-static struct file_operations config_fops = {
+static const struct file_operations config_fops = {
.owner = THIS_MODULE,
.llseek = no_llseek,
.ioctl = i2o_cfg_ioctl,
Index: linux-2.6/drivers/message/i2o/i2o_proc.c
===
--- linux-2.6.orig/drivers/message/i2o/i2o_proc.c
+++ linux-2.6/drivers/message/i2o/i2o_proc.c
@@ -1703,133 +1703,133 @@ static int i2o_seq_open_dev_name(struct 
return single_open(file, i2o_seq_show_dev_name, PDE(inode)->data);
 };
 
-static struct file_operations i2o_seq_fops_lct = {
+static const struct file_operations i2o_seq_fops_lct = {
.open = i2o_seq_open_lct,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_hrt = {
+static const struct file_operations i2o_seq_fops_hrt = {
.open = i2o_seq_open_hrt,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_status = {
+static const struct file_operations i2o_seq_fops_status = {
.open = i2o_seq_open_status,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_hw = {
+static const struct file_operations i2o_seq_fops_hw = {
.open = i2o_seq_open_hw,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_ddm_table = {
+static const struct file_operations i2o_seq_fops_ddm_table = {
.open = i2o_seq_open_ddm_table,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_driver_store = {
+static const struct file_operations i2o_seq_fops_driver_store = {
.open = i2o_seq_open_driver_store,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_drivers_stored = {
+static const struct file_operations i2o_seq_fops_drivers_stored = {
.open = i2o_seq_open_drivers_stored,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_groups = {
+static const struct file_operations i2o_seq_fops_groups = {
.open = i2o_seq_open_groups,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_phys_device = {
+static const struct file_operations i2o_seq_fops_phys_device = {
.open = i2o_seq_open_phys_device,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_claimed = {
+static const struct file_operations i2o_seq_fops_claimed = {
.open = i2o_seq_open_claimed,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_users = {
+static const struct file_operations i2o_seq_fops_users = {
.open = i2o_seq_open_users,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_priv_msgs = {
+static const struct file_operations i2o_seq_fops_priv_msgs = {
.open = i2o_seq_open_priv_msgs,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_authorized_users = {
+static const struct file_operations i2o_seq_fops_authorized_users = {
.open = i2o_seq_open_authorized_users,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_dev_name = {
+static const struct file_operations i2o_seq_fops_dev_name = {
.open = i2o_seq_open_dev_name,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
 };
 
-static struct file_operations i2o_seq_fops_dev_identity = {
+static const struct file_operations i2o_seq_fops_dev_identity = {
.open = i2o_seq_open_dev_identity,
 

[patch 06/12] mark struct file_operations const 6

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 06/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/sbus/char/bpp.c
===
--- linux-2.6.orig/drivers/sbus/char/bpp.c
+++ linux-2.6/drivers/sbus/char/bpp.c
@@ -846,7 +846,7 @@ static int bpp_ioctl(struct inode *inode
   return errno;
 }
 
-static struct file_operations bpp_fops = {
+static const struct file_operations bpp_fops = {
.owner =THIS_MODULE,
.read = bpp_read,
.write =bpp_write,
Index: linux-2.6/drivers/sbus/char/cpwatchdog.c
===
--- linux-2.6.orig/drivers/sbus/char/cpwatchdog.c
+++ linux-2.6/drivers/sbus/char/cpwatchdog.c
@@ -459,7 +459,7 @@ static irqreturn_t wd_interrupt(int irq,
return IRQ_HANDLED;
 }
 
-static struct file_operations wd_fops = {
+static const struct file_operations wd_fops = {
.owner =THIS_MODULE,
.ioctl =wd_ioctl,
.compat_ioctl = wd_compat_ioctl,
Index: linux-2.6/drivers/sbus/char/display7seg.c
===
--- linux-2.6.orig/drivers/sbus/char/display7seg.c
+++ linux-2.6/drivers/sbus/char/display7seg.c
@@ -166,7 +166,7 @@ static long d7s_ioctl(struct file *file,
return error;
 }
 
-static struct file_operations d7s_fops = {
+static const struct file_operations d7s_fops = {
.owner =THIS_MODULE,
.unlocked_ioctl =   d7s_ioctl,
.compat_ioctl = d7s_ioctl,
Index: linux-2.6/drivers/sbus/char/envctrl.c
===
--- linux-2.6.orig/drivers/sbus/char/envctrl.c
+++ linux-2.6/drivers/sbus/char/envctrl.c
@@ -705,7 +705,7 @@ envctrl_release(struct inode *inode, str
return 0;
 }
 
-static struct file_operations envctrl_fops = {
+static const struct file_operations envctrl_fops = {
.owner =THIS_MODULE,
.read = envctrl_read,
.unlocked_ioctl =   envctrl_ioctl,
Index: linux-2.6/drivers/sbus/char/flash.c
===
--- linux-2.6.orig/drivers/sbus/char/flash.c
+++ linux-2.6/drivers/sbus/char/flash.c
@@ -142,7 +142,7 @@ flash_release(struct inode *inode, struc
return 0;
 }
 
-static struct file_operations flash_fops = {
+static const struct file_operations flash_fops = {
/* no write to the Flash, use mmap
 * and play flash dependent tricks.
 */
Index: linux-2.6/drivers/sbus/char/jsflash.c
===
--- linux-2.6.orig/drivers/sbus/char/jsflash.c
+++ linux-2.6/drivers/sbus/char/jsflash.c
@@ -431,7 +431,7 @@ static int jsf_release(struct inode *ino
return 0;
 }
 
-static struct file_operations jsf_fops = {
+static const struct file_operations jsf_fops = {
.owner =THIS_MODULE,
.llseek =   jsf_lseek,
.read = jsf_read,
Index: linux-2.6/drivers/sbus/char/openprom.c
===
--- linux-2.6.orig/drivers/sbus/char/openprom.c
+++ linux-2.6/drivers/sbus/char/openprom.c
@@ -704,7 +704,7 @@ static int openprom_release(struct inode
return 0;
 }
 
-static struct file_operations openprom_fops = {
+static const struct file_operations openprom_fops = {
.owner =THIS_MODULE,
.llseek =   no_llseek,
.ioctl =openprom_ioctl,
Index: linux-2.6/drivers/sbus/char/riowatchdog.c
===
--- linux-2.6.orig/drivers/sbus/char/riowatchdog.c
+++ linux-2.6/drivers/sbus/char/riowatchdog.c
@@ -193,7 +193,7 @@ static ssize_t riowd_write(struct file *
return 0;
 }
 
-static struct file_operations riowd_fops = {
+static const struct file_operations riowd_fops = {
.owner =THIS_MODULE,
.ioctl =riowd_ioctl,
.open = riowd_open,
Index: linux-2.6/drivers/sbus/char/rtc.c
===
--- linux-2.6.orig/drivers/sbus/char/rtc.c
+++ linux-2.6/drivers/sbus/char/rtc.c
@@ -233,7 +233,7 @@ static int rtc_release(struct inode *ino
return 0;
 }
 
-static struct file_operations rtc_fops = {
+static const struct file_operations rtc_fops = {
.owner =THIS_MODULE,
.llseek =   no_llseek,
.ioctl =rtc_ioctl,
Index: linux-2.6/drivers/sbus/char/uctrl.c

Re: [RFC] How to (automatically) find the correct maintainer(s)

2007-01-13 Thread Stefan Richter
On 14 Jan, Richard Knutsson wrote:
> Stefan Richter wrote:
[getting a wrong contact from looking at the MAINTAINERS file]
> Hopefully, but I think it is asking much of the maintainer and then 
> there will certainly be confused/frustrated submitters who don't know why 
> they don't get any answer nor patches included. We have already seen a 
> few asking about what happened with their patches.

Sure. But such glitches occur due to lack of research by the submitter
or due to missing information about maintainers. Neither one would be
made worse nor cured by adding script-readable references to sources or
config options to the MAINTAINERS file.

>>> Can you make a object-file out of 2 c-files? Using Makefile?
>>
>> Yes, you can, although I don't know if it is directly done in the
>> kernel build system.
[...]
> How?:
> gcc -c test.c test2.c -o test3.o
> gcc: cannot specify -o with -c or -S with multiple files
> (with only -c i got test.o and test2.o)

gcc -o test3.o test.c test2.c

> In the kernel building system, an object-file is made from a c- or 
> s-file with the same name. Then, of course, they can be put together to 
> a larger object-file.
[...]
[multiple references in one maintainer record]
> What about possibility to replace it with:
> 
> C:IEEE1394*
> 
> and use the same system as with the path-approach, "longest wins". (I 
> don't think just IEEE1394 is appropriate, since then there is 
> possibility with false-positives again)

I doubt that wildcards (or maybe regular expressions) are really needed.
But this can only be found out by going through some non-trivial cases.

>> On the other hand, we could write
>>
>> IEEE 1394 SUBSYSTEM
>> F:   drivers/ieee1394
>> L:   [EMAIL PROTECTED]
>> P:   Ben and me
>> [...]
>> IEEE 1394 IPV4 DRIVER (eth1394)
>> F:   drivers/ieee1394/eth1394
>> [...]
>>
>> If it was done the latter way, i.e. using F: not C:, it could be
>> made a rule that the more specific entries come after more generic
>> entries. Thus the last match of multiple matches is the proper one.
>> In any case, the longest match is the proper one.
>>   
> As I wrote in the initial mail, my first idea was like that. But how to 
> solve when different drivers (with of course different maintainers) lie 
> in the same directory?

To continue my above example:

IEEE 1394 PCILYNX DRIVER
F:  drivers/ieee1394/pcilynx

Should work. Note, the substrings "eth1394" and "pcilynx" do not denote
subdirectories. They are substrings of the paths to these drivers'
sources nonetheless.
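
The "longest match wins" rule for F: entries can be sketched in a few
lines of C. This is only an illustration under assumptions: the table
below hard-codes the three example records from this mail, and
demo_entry / demo_find_maintainer are made-up names, not part of any
existing script.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of a MAINTAINERS record with one F: line. */
struct demo_entry {
	const char *title;	/* record heading */
	const char *prefix;	/* F: path substring, matched as a prefix */
};

static const struct demo_entry demo_entries[] = {
	{ "IEEE 1394 SUBSYSTEM",             "drivers/ieee1394" },
	{ "IEEE 1394 IPV4 DRIVER (eth1394)", "drivers/ieee1394/eth1394" },
	{ "IEEE 1394 PCILYNX DRIVER",        "drivers/ieee1394/pcilynx" },
};

/*
 * Return the title of the record whose F: prefix is the longest one
 * matching the start of path, or NULL if no record matches.
 */
static const char *demo_find_maintainer(const char *path)
{
	const char *best = NULL;
	size_t best_len = 0;
	size_t i;

	for (i = 0; i < sizeof(demo_entries) / sizeof(demo_entries[0]); i++) {
		size_t len = strlen(demo_entries[i].prefix);

		if (strncmp(path, demo_entries[i].prefix, len) == 0 &&
		    len > best_len) {
			best = demo_entries[i].title;
			best_len = len;
		}
	}
	return best;
}
```

With this table, drivers/ieee1394/eth1394.c resolves to the eth1394
record and drivers/ieee1394/nodemgr.c falls back to the subsystem
record. A plain prefix compare can still produce false positives
(for instance a sibling directory whose name merely starts with
"ieee1394"), so the result remains something to double-check by hand.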

> I thought something like include/linux/config.h,autoconf.h could be used 
> when referring to a few specific files in a directory. But there is also 
> the problem that all mails where the maintainer has no F: will fall in 
> the lap of the "good" maintainer with the shorter pathway, and I'm 
> afraid this might make people hesitant to add the F:.

1. The same can be said about the C: method, or about the status quo.
2. The patches will typically be Cced to the respective mailinglist
   where the driver maintainer can harvest the patch or can send an ACK
   or NAK as a signal to the subsystem maintainer whether to pick it up.
3. When people notice that patches are misdirected too often, they will
   update MAINTAINERS.

[...]
> It is just the problems with false-positives and picking out specific
> files that made me reconsider.

May I remind that whoever uses scripts to figure out contacts should
better double-check what the script found out for him. (Regardless
whether the script grepped for config options or for path components.)
There are carbon-based lifeforms on the receiving end.

BTW, it seems to me like the F: approach is easier than the C: one when
it comes to patches which touch only .h files. It is already somewhat
costly to backtrack .c files from .o files from config options, but
considerably more so with .h files.
-- 
Stefan Richter
-=-=-=== ---= -===-
http://arcgraph.de/sr/





[patch 04/12] mark struct file_operations const 4

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 04/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/macintosh/adb.c
===
--- linux-2.6.orig/drivers/macintosh/adb.c
+++ linux-2.6/drivers/macintosh/adb.c
@@ -885,7 +885,7 @@ out:
return ret;
 }
 
-static struct file_operations adb_fops = {
+static const struct file_operations adb_fops = {
.owner  = THIS_MODULE,
.llseek = no_llseek,
.read   = adb_read,
Index: linux-2.6/drivers/macintosh/ans-lcd.c
===
--- linux-2.6.orig/drivers/macintosh/ans-lcd.c
+++ linux-2.6/drivers/macintosh/ans-lcd.c
@@ -121,7 +121,7 @@ anslcd_open( struct inode * inode, struc
return 0;
 }
 
-struct file_operations anslcd_fops = {
+const struct file_operations anslcd_fops = {
.write  = anslcd_write,
.ioctl  = anslcd_ioctl,
.open   = anslcd_open,
Index: linux-2.6/drivers/macintosh/apm_emu.c
===
--- linux-2.6.orig/drivers/macintosh/apm_emu.c
+++ linux-2.6/drivers/macintosh/apm_emu.c
@@ -501,7 +501,7 @@ static int apm_emu_get_info(char *buf, c
return p - buf;
 }
 
-static struct file_operations apm_bios_fops = {
+static const struct file_operations apm_bios_fops = {
.owner  = THIS_MODULE,
.read   = do_read,
.poll   = do_poll,
Index: linux-2.6/drivers/macintosh/nvram.c
===
--- linux-2.6.orig/drivers/macintosh/nvram.c
+++ linux-2.6/drivers/macintosh/nvram.c
@@ -100,7 +100,7 @@ static int nvram_ioctl(struct inode *ino
return 0;
 }
 
-struct file_operations nvram_fops = {
+const struct file_operations nvram_fops = {
.owner  = THIS_MODULE,
.llseek = nvram_llseek,
.read   = read_nvram,
Index: linux-2.6/drivers/macintosh/smu.c
===
--- linux-2.6.orig/drivers/macintosh/smu.c
+++ linux-2.6/drivers/macintosh/smu.c
@@ -1277,7 +1277,7 @@ static int smu_release(struct inode *ino
 }
 
 
-static struct file_operations smu_device_fops = {
+static const struct file_operations smu_device_fops = {
.llseek = no_llseek,
.read   = smu_read,
.write  = smu_write,
Index: linux-2.6/drivers/macintosh/via-pmu68k.c
===
--- linux-2.6.orig/drivers/macintosh/via-pmu68k.c
+++ linux-2.6/drivers/macintosh/via-pmu68k.c
@@ -1040,7 +1040,7 @@ static int pmu_ioctl(struct inode * inod
return -EINVAL;
 }
 
-static struct file_operations pmu_device_fops = {
+static const struct file_operations pmu_device_fops = {
.read   = pmu_read,
.write  = pmu_write,
.ioctl  = pmu_ioctl,
Index: linux-2.6/drivers/macintosh/via-pmu.c
===
--- linux-2.6.orig/drivers/macintosh/via-pmu.c
+++ linux-2.6/drivers/macintosh/via-pmu.c
@@ -2673,7 +2673,7 @@ pmu_ioctl(struct inode * inode, struct f
return error;
 }
 
-static struct file_operations pmu_device_fops = {
+static const struct file_operations pmu_device_fops = {
.read   = pmu_read,
.write  = pmu_write,
.poll   = pmu_fpoll,
Index: linux-2.6/drivers/md/dm-ioctl.c
===
--- linux-2.6.orig/drivers/md/dm-ioctl.c
+++ linux-2.6/drivers/md/dm-ioctl.c
@@ -1473,7 +1473,7 @@ static int ctl_ioctl(struct inode *inode
return r;
 }
 
-static struct file_operations _ctl_fops = {
+static const struct file_operations _ctl_fops = {
.ioctl   = ctl_ioctl,
.owner   = THIS_MODULE,
 };
Index: linux-2.6/drivers/md/md.c
===
--- linux-2.6.orig/drivers/md/md.c
+++ linux-2.6/drivers/md/md.c
@@ -4917,7 +4917,7 @@ static unsigned int mdstat_poll(struct f
return mask;
 }
 
-static struct file_operations md_seq_fops = {
+static const struct file_operations md_seq_fops = {
.owner  = THIS_MODULE,
.open   = md_seq_open,
.read   = seq_read,
Index: linux-2.6/drivers/media/common/saa7146_fops.c
===
--- linux-2.6.orig/drivers/media/common/saa7146_fops.c
+++ linux-2.6/drivers/media/common/saa7146_fops.c
@@ -416,7 +416,7 @@ static ssize_t fops_write(struct 

[patch 03/12] mark struct file_operations const 3

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 03/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/block/blktrace.c
===
--- linux-2.6.orig/block/blktrace.c
+++ linux-2.6/block/blktrace.c
@@ -264,7 +264,7 @@ static ssize_t blk_dropped_read(struct f
return simple_read_from_buffer(buffer, count, ppos, buf, strlen(buf));
 }
 
-static struct file_operations blk_dropped_fops = {
+static const struct file_operations blk_dropped_fops = {
.owner =THIS_MODULE,
.open = blk_dropped_open,
.read = blk_dropped_read,
Index: linux-2.6/crypto/proc.c
===
--- linux-2.6.orig/crypto/proc.c
+++ linux-2.6/crypto/proc.c
@@ -101,7 +101,7 @@ static int crypto_info_open(struct inode
return seq_open(file, _seq_ops);
 }
 
-static struct file_operations proc_crypto_ops = {
+static const struct file_operations proc_crypto_ops = {
.open   = crypto_info_open,
.read   = seq_read,
.llseek = seq_lseek,
Index: linux-2.6/drivers/acorn/char/i2c.c
===
--- linux-2.6.orig/drivers/acorn/char/i2c.c
+++ linux-2.6/drivers/acorn/char/i2c.c
@@ -238,7 +238,7 @@ static int rtc_ioctl(struct inode *inode
return -EINVAL;
 }
 
-static struct file_operations rtc_fops = {
+static const struct file_operations rtc_fops = {
.ioctl  = rtc_ioctl,
 };
 
Index: linux-2.6/drivers/block/acsi_slm.c
===
--- linux-2.6.orig/drivers/block/acsi_slm.c
+++ linux-2.6/drivers/block/acsi_slm.c
@@ -269,7 +269,7 @@ static int slm_get_pagesize( int device,
 
 static DEFINE_TIMER(slm_timer, slm_test_ready, 0, 0);
 
-static struct file_operations slm_fops = {
+static const struct file_operations slm_fops = {
.owner =THIS_MODULE,
.read = slm_read,
.write =slm_write,
Index: linux-2.6/drivers/block/aoe/aoechr.c
===
--- linux-2.6.orig/drivers/block/aoe/aoechr.c
+++ linux-2.6/drivers/block/aoe/aoechr.c
@@ -233,7 +233,7 @@ loop:
}
 }
 
-static struct file_operations aoe_fops = {
+static const struct file_operations aoe_fops = {
.write = aoechr_write,
.read = aoechr_read,
.open = aoechr_open,
Index: linux-2.6/drivers/block/DAC960.c
===
--- linux-2.6.orig/drivers/block/DAC960.c
+++ linux-2.6/drivers/block/DAC960.c
@@ -7025,7 +7025,7 @@ static int DAC960_gam_ioctl(struct inode
   return -EINVAL;
 }
 
-static struct file_operations DAC960_gam_fops = {
+static const struct file_operations DAC960_gam_fops = {
.owner  = THIS_MODULE,
.ioctl  = DAC960_gam_ioctl
 };
Index: linux-2.6/drivers/block/paride/pg.c
===
--- linux-2.6.orig/drivers/block/paride/pg.c
+++ linux-2.6/drivers/block/paride/pg.c
@@ -227,7 +227,7 @@ static struct class *pg_class;
 
 /* kernel glue structures */
 
-static struct file_operations pg_fops = {
+static const struct file_operations pg_fops = {
.owner = THIS_MODULE,
.read = pg_read,
.write = pg_write,
Index: linux-2.6/drivers/block/paride/pt.c
===
--- linux-2.6.orig/drivers/block/paride/pt.c
+++ linux-2.6/drivers/block/paride/pt.c
@@ -232,7 +232,7 @@ static char pt_scratch[512];/* scratch 
 
 /* kernel glue structures */
 
-static struct file_operations pt_fops = {
+static const struct file_operations pt_fops = {
.owner = THIS_MODULE,
.read = pt_read,
.write = pt_write,
Index: linux-2.6/drivers/block/pktcdvd.c
===
--- linux-2.6.orig/drivers/block/pktcdvd.c
+++ linux-2.6/drivers/block/pktcdvd.c
@@ -447,7 +447,7 @@ static int pkt_debugfs_fops_open(struct 
return single_open(file, pkt_debugfs_seq_show, inode->i_private);
 }
 
-static struct file_operations debug_fops = {
+static const struct file_operations debug_fops = {
.open   = pkt_debugfs_fops_open,
.read   = seq_read,
.llseek = seq_lseek,
@@ -2737,7 +2737,7 @@ static int pkt_seq_open(struct inode *in
return single_open(file, pkt_seq_show, PDE(inode)->data);
 }
 
-static struct file_operations pkt_proc_fops = {
+static const struct file_operations pkt_proc_fops = {
.open   

[patch 02/12] mark struct file_operations const 2

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 02/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/arch/arm/common/rtctime.c
===
--- linux-2.6.orig/arch/arm/common/rtctime.c
+++ linux-2.6/arch/arm/common/rtctime.c
@@ -329,7 +329,7 @@ static int rtc_fasync(int fd, struct fil
return fasync_helper(fd, file, on, _async_queue);
 }
 
-static struct file_operations rtc_fops = {
+static const struct file_operations rtc_fops = {
.owner  = THIS_MODULE,
.llseek = no_llseek,
.read   = rtc_read,
Index: linux-2.6/arch/arm/kernel/apm.c
===
--- linux-2.6.orig/arch/arm/kernel/apm.c
+++ linux-2.6/arch/arm/kernel/apm.c
@@ -446,7 +446,7 @@ static int apm_open(struct inode * inode
return as ? 0 : -ENOMEM;
 }
 
-static struct file_operations apm_bios_fops = {
+static const struct file_operations apm_bios_fops = {
.owner  = THIS_MODULE,
.read   = apm_read,
.poll   = apm_poll,
Index: linux-2.6/arch/arm/mach-at91rm9200/clock.c
===
--- linux-2.6.orig/arch/arm/mach-at91rm9200/clock.c
+++ linux-2.6/arch/arm/mach-at91rm9200/clock.c
@@ -407,7 +407,7 @@ static int at91_clk_open(struct inode *i
return single_open(file, at91_clk_show, NULL);
 }
 
-static struct file_operations at91_clk_operations = {
+static const struct file_operations at91_clk_operations = {
.open   = at91_clk_open,
.read   = seq_read,
.llseek = seq_lseek,
Index: linux-2.6/arch/avr32/mm/tlb.c
===
--- linux-2.6.orig/arch/avr32/mm/tlb.c
+++ linux-2.6/arch/avr32/mm/tlb.c
@@ -360,7 +360,7 @@ static int tlb_open(struct inode *inode,
return seq_open(file, _ops);
 }
 
-static struct file_operations proc_tlb_operations = {
+static const struct file_operations proc_tlb_operations = {
.open   = tlb_open,
.read   = seq_read,
.llseek = seq_lseek,
Index: linux-2.6/arch/cris/arch-v10/drivers/ds1302.c
===
--- linux-2.6.orig/arch/cris/arch-v10/drivers/ds1302.c
+++ linux-2.6/arch/cris/arch-v10/drivers/ds1302.c
@@ -499,7 +499,7 @@ print_rtc_status(void)
 
 /* The various file operations we support. */
 
-static struct file_operations rtc_fops = {
+static const struct file_operations rtc_fops = {
.owner =THIS_MODULE,
.ioctl =rtc_ioctl,
 }; 
Index: linux-2.6/arch/cris/arch-v10/drivers/eeprom.c
===
--- linux-2.6.orig/arch/cris/arch-v10/drivers/eeprom.c
+++ linux-2.6/arch/cris/arch-v10/drivers/eeprom.c
@@ -172,7 +172,7 @@ static const char eeprom_name[] = "eepro
 static struct eeprom_type eeprom;
 
 /* This is the exported file-operations structure for this device. */
-struct file_operations eeprom_fops =
+const struct file_operations eeprom_fops =
 {
   .llseek  = eeprom_lseek,
   .read= eeprom_read,
Index: linux-2.6/arch/cris/arch-v10/drivers/gpio.c
===
--- linux-2.6.orig/arch/cris/arch-v10/drivers/gpio.c
+++ linux-2.6/arch/cris/arch-v10/drivers/gpio.c
@@ -838,7 +838,7 @@ gpio_leds_ioctl(unsigned int cmd, unsign
return 0;
 }
 
-struct file_operations gpio_fops = {
+const struct file_operations gpio_fops = {
.owner   = THIS_MODULE,
.poll= gpio_poll,
.ioctl   = gpio_ioctl,
Index: linux-2.6/arch/cris/arch-v10/drivers/i2c.c
===
--- linux-2.6.orig/arch/cris/arch-v10/drivers/i2c.c
+++ linux-2.6/arch/cris/arch-v10/drivers/i2c.c
@@ -692,7 +692,7 @@ i2c_ioctl(struct inode *inode, struct fi
return 0;
 }
 
-static struct file_operations i2c_fops = {
+static const struct file_operations i2c_fops = {
.owner= THIS_MODULE,
.ioctl= i2c_ioctl,
.open = i2c_open,
Index: linux-2.6/arch/cris/arch-v10/drivers/pcf8563.c
===
--- linux-2.6.orig/arch/cris/arch-v10/drivers/pcf8563.c
+++ linux-2.6/arch/cris/arch-v10/drivers/pcf8563.c
@@ -56,7 +56,7 @@ static const unsigned char days_in_month
 
 int pcf8563_ioctl(struct inode *, struct file *, unsigned int, unsigned long);
 
-static struct file_operations pcf8563_fops = {
+static const struct file_operations pcf8563_fops = {

[patch 00/12] Fix ppc64's writing to struct file_operations

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 00/12] Fix ppc64's writing to struct file_operations

the ppc64 code needlessly wrote to a struct file_operations variable;
this patch turns this into a compile time initialization instead.
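
The pattern can be sketched in user-space C as follows. This is a
simplified model, not the lparcfg code: the demo_* names and mode
constants are invented for illustration, and the sketch assumes the
permission bits are the only thing deciding whether a write gets
through.

```c
#include <stddef.h>

#define DEMO_S_IRUGO 0444	/* hypothetical: readable by everyone */
#define DEMO_S_IWUSR 0200	/* hypothetical: writable by the owner */

/* Stand-in for struct file_operations. */
struct demo_ops {
	long (*read)(void);
	long (*write)(long val);
};

static long demo_state;

static long demo_read(void)
{
	return demo_state;
}

static long demo_write(long val)
{
	demo_state = val;
	return 0;
}

/* Every handler is set at build time, so the table can be const. */
static const struct demo_ops demo_fops = {
	.read  = demo_read,
	.write = demo_write,
};

/* Only the permission bits are decided at run time. */
static unsigned int demo_mode(int can_write)
{
	unsigned int mode = DEMO_S_IRUGO;

	if (can_write)
		mode |= DEMO_S_IWUSR;
	return mode;
}

/* Stand-in for the permission check made before the handler is called. */
static long demo_try_write(const struct demo_ops *ops, unsigned int mode,
			   long val)
{
	if (!(mode & DEMO_S_IWUSR))
		return -1;
	return ops->write(val);
}
```

The write handler is always present in the table; whether it is
reachable depends on the mode chosen at init time rather than on
patching the table.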


Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>

Index: linux-2.6/arch/powerpc/kernel/lparcfg.c
===
--- linux-2.6.orig/arch/powerpc/kernel/lparcfg.c
+++ linux-2.6/arch/powerpc/kernel/lparcfg.c
@@ -570,6 +570,7 @@ static int lparcfg_open(struct inode *in
 struct file_operations lparcfg_fops = {
.owner  = THIS_MODULE,
.read   = seq_read,
+   .write  = lparcfg_write,
.open   = lparcfg_open,
.release= single_release,
 };
@@ -581,10 +582,8 @@ int __init lparcfg_init(void)
 
/* Allow writing if we have FW_FEATURE_SPLPAR */
if (firmware_has_feature(FW_FEATURE_SPLPAR) &&
-   !firmware_has_feature(FW_FEATURE_ISERIES)) {
-   lparcfg_fops.write = lparcfg_write;
+   !firmware_has_feature(FW_FEATURE_ISERIES))
mode |= S_IWUSR;
-   }
 
ent = create_proc_entry("ppc64/lparcfg", mode, NULL);
if (ent) {




[patch 01/12] mark struct file_operations const 1

2007-01-13 Thread Arjan van de Ven
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: [patch 01/12] mark struct file_operations const

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with
potential dirty data. In addition it'll catch accidental writes at compile
time to these shared resources.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>
Index: linux-2.6/include/linux/atalk.h
===
--- linux-2.6.orig/include/linux/atalk.h
+++ linux-2.6/include/linux/atalk.h
@@ -182,7 +182,7 @@ extern rwlock_t atalk_interfaces_lock;
 
 extern struct atalk_route atrtr_default;
 
-extern struct file_operations atalk_seq_arp_fops;
+extern const struct file_operations atalk_seq_arp_fops;
 
 extern int sysctl_aarp_expiry_time;
 extern int sysctl_aarp_tick_time;
Index: linux-2.6/include/linux/cpuset.h
===
--- linux-2.6.orig/include/linux/cpuset.h
+++ linux-2.6/include/linux/cpuset.h
@@ -55,7 +55,7 @@ extern int cpuset_excl_nodes_overlap(con
 extern int cpuset_memory_pressure_enabled;
 extern void __cpuset_memory_pressure_bump(void);
 
-extern struct file_operations proc_cpuset_operations;
+extern const struct file_operations proc_cpuset_operations;
 extern char *cpuset_task_status_allowed(struct task_struct *task, char *buffer);
 
 extern void cpuset_lock(void);
Index: linux-2.6/include/linux/random.h
===
--- linux-2.6.orig/include/linux/random.h
+++ linux-2.6/include/linux/random.h
@@ -63,7 +63,7 @@ extern u64 secure_dccp_sequence_number(_
   __be16 sport, __be16 dport);
 
 #ifndef MODULE
-extern struct file_operations random_fops, urandom_fops;
+extern const struct file_operations random_fops, urandom_fops;
 #endif
 
 unsigned int get_random_int(void);
Index: linux-2.6/include/linux/security.h
===
--- linux-2.6.orig/include/linux/security.h
+++ linux-2.6/include/linux/security.h
@@ -2130,7 +2130,7 @@ extern int mod_reg_security   (const char 
 extern int mod_unreg_security  (const char *name, struct security_operations *ops);
 extern struct dentry *securityfs_create_file(const char *name, mode_t mode,
  struct dentry *parent, void *data,
-struct file_operations *fops);
+const struct file_operations *fops);
 extern struct dentry *securityfs_create_dir(const char *name, struct dentry *parent);
 extern void securityfs_remove(struct dentry *dentry);
 
Index: linux-2.6/include/net/ax25.h
===
--- linux-2.6.orig/include/net/ax25.h
+++ linux-2.6/include/net/ax25.h
@@ -377,7 +377,7 @@ extern int  ax25_check_iframes_acked(ax2
 /* ax25_route.c */
 extern void ax25_rt_device_down(struct net_device *);
 extern int  ax25_rt_ioctl(unsigned int, void __user *);
-extern struct file_operations ax25_route_fops;
+extern const struct file_operations ax25_route_fops;
 extern ax25_route *ax25_get_route(ax25_address *addr, struct net_device *dev);
 extern int  ax25_rt_autobind(ax25_cb *, ax25_address *);
 extern struct sk_buff *ax25_rt_build_path(struct sk_buff *, ax25_address *, ax25_address *, ax25_digi *);
@@ -430,7 +430,7 @@ extern unsigned long ax25_display_timer(
 extern int  ax25_uid_policy;
 extern ax25_uid_assoc *ax25_findbyuid(uid_t);
 extern int __must_check ax25_uid_ioctl(int, struct sockaddr_ax25 *);
-extern struct file_operations ax25_uid_fops;
+extern const struct file_operations ax25_uid_fops;
 extern void ax25_uid_free(void);
 
 /* sysctl_net_ax25.c */
Index: linux-2.6/include/net/netfilter/nf_conntrack_expect.h
===
--- linux-2.6.orig/include/net/netfilter/nf_conntrack_expect.h
+++ linux-2.6/include/net/netfilter/nf_conntrack_expect.h
@@ -8,7 +8,7 @@
 
 extern struct list_head nf_conntrack_expect_list;
 extern struct kmem_cache *nf_conntrack_expect_cachep;
-extern struct file_operations exp_file_ops;
+extern const struct file_operations exp_file_ops;
 
 struct nf_conntrack_expect
 {
Index: linux-2.6/include/net/netrom.h
===
--- linux-2.6.orig/include/net/netrom.h
+++ linux-2.6/include/net/netrom.h
@@ -215,8 +215,8 @@ extern struct net_device *nr_dev_get(ax2
 extern int  nr_rt_ioctl(unsigned int, void __user *);
 extern void nr_link_failed(ax25_cb *, int);
 extern int  nr_route_frame(struct sk_buff *, ax25_cb *);
-extern struct file_operations nr_nodes_fops;
-extern struct file_operations nr_neigh_fops;
+extern const struct file_operations nr_nodes_fops;
+extern const struct file_operations nr_neigh_fops;
 extern void nr_rt_free(void);
 
 /* nr_subr.c */

Patch series to mark struct file_operations and struct inode_operations const

2007-01-13 Thread Arjan van de Ven
Hi,

today a sizable portion of the "struct file_operations" variables in the
kernel are const, but by far not all. Nor are any of the struct
inode_operations const. Marking these read-only datastructures const has
the advantage of reducing false sharing of these, often hot,
datastructures. In addition there have been cases where drivers or
filesystems accidentally and incorrectly wrote to such a struct
forgetting that it's a shared datastructure. By marking these const, the
compiler will warn/error on such instances.
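
As a minimal user-space sketch of the effect (struct demo_ops and the
demo_* functions are made up for illustration and are not kernel
code):

```c
#include <stddef.h>

/* Hypothetical stand-in for struct file_operations: a table of callbacks. */
struct demo_ops {
	int (*open)(void);
	long (*read)(char *buf, size_t len);
};

static int demo_open(void)
{
	return 0;
}

static long demo_read(char *buf, size_t len)
{
	size_t i;

	/* Fill the buffer with a fixed byte so the result is easy to check. */
	for (i = 0; i < len; i++)
		buf[i] = 'x';
	return (long)len;
}

/*
 * Because the table is const, the compiler is free to place it in a
 * read-only section, and any assignment to one of its members is
 * rejected at compile time.
 */
static const struct demo_ops demo_fops = {
	.open = demo_open,
	.read = demo_read,
};

/* Callers only ever need a pointer to const. */
static long demo_call_read(const struct demo_ops *ops, char *buf, size_t len)
{
	/* ops->read = NULL;   <- does not compile: the object is read-only */
	if (!ops->read)
		return -1;
	return ops->read(buf, len);
}
```

Uncommenting the assignment in demo_call_read() is the kind of
accidental write that the const qualifier turns into a build failure.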

The series is split up for size, there isn't really any logical order
for such a simple search-and-replace operation.

Greetings,
   Arjan van de Ven



Re: [patch 09/20] XEN-paravirt: dont export paravirt_ops structure, do individual functions

2007-01-13 Thread Rusty Russell
On Fri, 2007-01-12 at 17:45 -0800, Jeremy Fitzhardinge wrote:
> Wrap the paravirt_ops members we want to export in wrapper functions.

Andrew, the removal of paravirt_ops export here will break kvm.  Feel
free to re-add "EXPORT_SYMBOL_GPL(paravirt_ops)" at the bottom of
arch/i386/kernel/paravirt.c; I'm working on a cleaner way for modules
like kvm / lguest (which want to use the native versions directly
anyway).

Thanks,
Rusty.




Re: No more "device" symlinks for classes

2007-01-13 Thread Kay Sievers
On Sun, 2007-01-14 at 01:29 +0100, Pierre Ossman wrote:
> Kay Sievers wrote:
> >
> > The plan is to have a single unified tree at /sys/devices, where all
> > device-directories live below their parents, and /sys/class contains
> > only symlinks pointing into this single tree, just like /sys/bus.
> >
> > People want to stack class-devices, but this leads to a /sys/devices
> > tree and several small trees spread around in /sys/class. These trees
> > need to be connected by "device"-links and the "class:"-links, which
> > just doesn't make much sense if you can have one single tree with the
> > same information.
> >
> > In the unified tree, the "device"-link will always just point to the
> > parent device, that's why there is a config option to disable these
> > links and test current software not to depend on it.
> >
> I'm not sure I completely follow. Should an application look at the
> symlink (e.g. /sys/class/fooclass/foodev -> /sys/devices/...) and follow
> that one level up? If so, then this sounds a bit complicated. Especially
> from shell scripts.

We would have one single tree at /sys/devices, and always flat
classification without hierarchy at /sys/class and /sys/bus. If you
enter the device-tree by starting at /sys/class, you get the full path
to the device by reading the link, and get all the device's
dependencies (parents) in the devpath of the device.

I can't see any problem stripping the last element of a path with a
shell script. It's all implemented in current udev and HAL for quite
some time and it's pretty easy.
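
For a C program the same step is a few lines. This is just a sketch,
not the udev or HAL code: demo_parent_devpath is a made-up helper, and
it assumes the devpath has already been obtained by resolving the
class link.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the parent of devpath (everything before the last '/') into out.
 * Returns 0 on success, -1 if there is no parent component or the
 * output buffer is too small.
 */
static int demo_parent_devpath(const char *devpath, char *out, size_t outlen)
{
	const char *slash = strrchr(devpath, '/');
	size_t len;

	/* No slash at all, or only the leading one: nothing to strip. */
	if (!slash || slash == devpath)
		return -1;

	len = (size_t)(slash - devpath);
	if (len + 1 > outlen)
		return -1;

	memcpy(out, devpath, len);
	out[len] = '\0';
	return 0;
}
```

Calling it repeatedly on its own output walks up the chain of parents
one device at a time.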

Kay



Re: [patch 16/20] XEN-paravirt: Add the Xen virtual console driver.

2007-01-13 Thread Jeremy Fitzhardinge
Alan wrote:
> Andrew: No objection to this tty stuff being merged provided the bugs
> noted above (not worried about the sign stuff) are fixed before it goes
> on to Linus.
>   

Thanks for the comments.  I'll see if I can put together a fixup patch
before LCA, but possibly not.

J


How to mmap a shadow framebuffer in virtual memory

2007-01-13 Thread Bernardo Innocenti
Hello,

This is driving me crazy.  I wrote this custom fb driver for an
organic LED display for an embedded ARM system.

The display is connected through an I2C bus, therefore the display
buffer is not memory mapped.

Anyway, I want to support mmap() and my driver allocates a shadow
buffer with __get_free_pages() which gets periodically copied
to the display by a thread. This is unlike most fb drivers which
just point smem_start to the physical address of their framebuffer.

>From user space, opening /dev/fb0 and writing to it works just
fine.  mmap()'ing the file and writing to it does not have any
effect.

Writing the phisical address in smem_start and letting the
fbgen code do the rest didn't seem to work, so I reimplemented
the fb_mmap hook.

I don't feel confident with the Linux VM, so I tried several
strategies to allocate the shadow buffer, including vmalloc()
and kmalloc().

The virtual framebuffer (vfb) also uses vmalloc() but crashes
calling processes because it confuses physical and virtual
addresses, or so it seems.

Maybe it's just my kernel or my platform... does anybody use
a similar technique?  Can anybody point me to known-good code
that approximates my needs?

If you want to review the code below, look for the allocation in
oledfb_init() and usage in oledfb_mmap().  This code runs on
2.4.19-rmk7 because I can't upgrade to a newer kernel on this
target.



/*
 * linux/drivers/video/oledfb.c -- STV8102 OLED frame buffer device
 *
 * Copyright 2006 Develer S.r.l. (http://www.develer.com/)
 * Author: Bernardo Innocenti <[EMAIL PROTECTED]>
 * Author: Stefano Fedrigo <[EMAIL PROTECTED]>
 *
 * This file is subject to the terms and conditions of the GNU General Public
 * License.  See the file COPYING in the main directory of this archive
 * for more details.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#include 

/* Define to either 0 or 1 */
#define OLED_DEBUG  1

/* Driver name used in several places */
#define OLED_NAME "oledfb"

/* Fully qualified driver name for users */
#define OLED_FRIENDLY_NAME "STV8102 OLED"

/* Dimensions in pixels */
#define OLED_WIDTH  128
#define OLED_HEIGHT 33

/* Dimensions in millimeters */
#define OLED_WIDTH_MM  55
#define OLED_HEIGHT_MM 14

/* Size of a horizontal line in bytes */
#define OLED_WIDTH_BYTES  ((OLED_WIDTH + 7) / 8)

/* Framebuffer size in bytes */
#define OLED_MEMSIZE (OLED_WIDTH_BYTES * OLED_HEIGHT)

#define OLED_MEMORDER (get_order(PAGE_ALIGN(OLED_MEMSIZE)))

/* OLED refresh delay in milliseconds */
#define OLED_REFRESH_DELAY 300
#define OLED_REFRESH_JIFFIES ((OLED_REFRESH_DELAY * HZ)/1000)

/* Set to 1 to enable an X11-like backfilling pattern */
#define OLED_BACKFILL_PATTERN 0

/* I2C address of the OLED command/data registers */
#define I2C_DRIVERID_STV8102_CMD  0x3C
#define I2C_DRIVERID_STV8102_DATA 0x3D

/* Use a kernel thread to refresh the OLED periodically */
#define CONFIG_OLED_REFRESH_THREAD 1

/* BROKEN: i2c code sleeps in timer context */
#define CONFIG_OLED_REFRESH_TIMER 0


#define OLED_CMD_XSTART   0x00 /* address in lower 4 bits */
#define OLED_CMD_YSTART   0x40 /* address in lower 4 bits */
#define OLED_CMD_DISPON   0xAF
#define OLED_CMD_DISPOFF  0xAE
#define OLED_CMD_MOVE 0x80 /* effects in lower 4 bits */
#define OLED_CMD_HSPEED   0x90 /* speed in lower 3 bits */
#define OLED_CMD_VSPEED   0x98 /* speed in lower 3 bits */
#define OLED_CMD_HMIN 0xC0
#define OLED_CMD_HMAX 0xC2
#define OLED_CMD_VMIN 0xC6
#define OLED_CMD_VMAX 0xC8

/* Missing utility macros */
#define countof(x) (sizeof(x) / sizeof(x[0]))
#ifndef bool
#   define bool  int
#   define false 0
#   define true  1
#endif

#if OLED_DEBUG == 1
#define OLED_TRACE printk(KERN_DEBUG "%s:%s()\n", OLED_NAME, __FUNCTION__)
#define OLED_TRACEMSG(msg,...) printk(KERN_DEBUG "%s:%s(): " msg "\n", \
OLED_NAME, __FUNCTION__, ## __VA_ARGS__)
#elif OLED_DEBUG == 0
#define OLED_TRACE do {} while (0)
#define OLED_TRACEMSG(msg,...)
#else
#error Define OLED_DEBUG to either 0 or 1
#endif


struct oledfb_info {
struct fb_info_gen gen;

/* Shadow buffer for the memory mapped framebuffer */
uint8_t *shadow;

/* Second copy of shadow buffer for optimized refresh */
uint8_t *shadow2;

/* Physical address of shadow buffer as required by fbmem */
unsigned long shadow_phys;

/* I2C client we talk to for OLED command register read/write */
struct i2c_client i2c_cmd;

/* I2C client we talk to for OLED data register write */
struct i2c_client i2c_data;

bool screensaver_running;

atomic_t open_cnt;

#if CONFIG_OLED_REFRESH_THREAD
int thread_pid;

/* Used to tell our refresh thread to quit asap */
/*bool*/ int quitting;


Re: No more "device" symlinks for classes

2007-01-13 Thread Pierre Ossman
Kay Sievers wrote:
>
> The plan is to have a single unified tree at /sys/devices, where all
> device-directories live below their parents, and /sys/class contains
> only symlinks pointing into this single tree, just like /sys/bus.
>
> People want to stack class-devices, but this leads to a /sys/devices
> tree and several small trees spread around in /sys/class. These trees
> need to be connected by "device"-links and the "class:"-links, which
> just doesn't make much sense if you can have one single tree with the
> same information.
>
> In the unified tree, the "device"-link will always just point to the
> parent device, that's why there is a config option to disable these
> links and test current software not to depend on it.
>
>   

I'm not sure I completely follow. Should an application look at the
symlink (e.g. /sys/class/fooclass/foodev -> /sys/devices/...) and follow
that one level up? If so, then this sounds a bit complicated. Especially
from shell scripts.

Rgds

-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org



Re: [patch 16/20] XEN-paravirt: Add the Xen virtual console driver.

2007-01-13 Thread Alan
> +#endif
> + tty_insert_flip_char(xencons_tty, buf[i], 0);

Please use the defines like TTY_NORMAL not just 0. 

> + if ((xencons_tty->flags & (1 << TTY_DO_WRITE_WAKEUP)) &&
> + (xencons_tty->ldisc.write_wakeup != NULL))
> + (xencons_tty->ldisc.write_wakeup)(xencons_tty);
> + }

You aren't allowed to dereference and call ldisc methods without
holding the lock. You can replace that chunk with a call to the helper
function tty_wakeup(tty). Otherwise there is a small but real race
condition, as xencons_tty->ldisc may be changing as you call it.

> +static inline int __xencons_put_char(int ch)
> +{
> + char _ch = (char)ch;
> + if ((wp - wc) == wbuf_size)
> + return 0;
> + wbuf[WBUF_MASK(wp++)] = _ch;
> + return 1;
> +}

A lot of very confusing sign stuff here - you turn an int into a char and
put it into a uchar array

> + for (i = 0; i < count; i++)
> + if (!__xencons_put_char(buf[i]))
> + break;

The int coming from a uchar array

Don't think it's wrong - just acutely weird and perhaps could be tidier
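
To make the sign issue concrete, here is a small user-space sketch. The
helper names are hypothetical and it only illustrates the C conversion
rules, not what xencons does beyond the quoted code: a byte that travels
through a signed char usually comes back as a negative int once it is
0x80 or above, while an unsigned char keeps it in the range 0..255.

```c
#include <assert.h>

/*
 * Round-trip a byte through a signed char. For values >= 0x80 the
 * conversion is implementation-defined and typically yields a
 * negative result.
 */
static int through_signed_char(int ch)
{
    signed char c = (signed char)ch;
    return c;
}

/* Round-trip a byte through an unsigned char: always yields 0..255. */
static int through_unsigned_char(int ch)
{
    unsigned char c = (unsigned char)ch;

    assert(c == (ch & 0xff));   /* assumes 8-bit bytes, two's complement */
    return c;
}
```

Storing into an unsigned char array, as the driver does, brings the value
back into 0..255, which is presumably why the code works despite the
detour through char.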

> +static void xencons_close(struct tty_struct *tty, struct file *filp)
> +{
> + unsigned long flags;
> +
> + mutex_lock(&tty_mutex);

It would be good in future if you could avoid using tty_mutex and use a
private lock. At the moment vt "borrows" it and there are a couple of
incestuous spots but the plan is to eventually fix them and make it
private to tty_io.


Andrew: No objection to this tty stuff being merged provided the bugs
noted above (not worried about the sign stuff) are fixed before it goes
on to Linus.

Alan


Re: [Xen-devel] [patch 02/20] XEN-paravirt: Add a flag to allow the VGA console to be disabled

2007-01-13 Thread Alan
On Fri, 12 Jan 2007 17:45:41 -0800
Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> Add a flag to allow the VGA console to be disabled.  The VGA code will
> spin forever if there isn't any real VGA hardware, which will happen
> under Xen.

If it is doing this then the real bug should be fixed so that it doesn't
hang in the same way on a physical system which has no VGA.


Re: [RFC] How to (automatically) find the correct maintainer(s)

2007-01-13 Thread Matthias Schniedermeyer
Richard Knutsson wrote:
> Matthias Schniedermeyer wrote:
> 
>> Richard Knutsson wrote:
>>
>>  
>>
>>> Any thoughts on this are very much appreciated (are there any flaws with
>>> this?).
>>> 
>>
>>
>> The thought that crossed my mind was:
>>
>> Why not do the same thing that was done to the "Help"-file. (Before it
>> was superseded by Kconfig).
>>
>> Originally there was a central Help-file, with all the texts. Then it was
>> split and placed in each sub-dir. And later it was superseded by Kconfig.
>>
>> On the other hand you could skip the intermediate step and just fold the
>> Maintainer-data directly into Kconfig, that way everything is "in one
>> place" and you could place a "Maintainers"-Button next to the
>> "Help"-Button in *config, or just display it alongside the help.
>>
>> And MAYBE that would also lessen the "up-to-date"-problem, as you
>> can just write the MAINTAINERS-data when you create/update the
>> Kconfig-file. Which is a thing that creates much bigger pain when you
>> forget it accidentally. ;-)
>>
>> Oh, and it neatly solves the mapping-problem, for at least all
>> kernel-parts that have a Kconfig-option/Sub-Tree.
>>   
> 
> I'm all for splitting up the MAINTAINERS! :)
> 
> Just, do you have any ideas how to solve the possible duplication of the
> same entries, when handling multiple sub-directories and when many
> different drivers with different maintainers are in the same directory
> and a maintainer has more than one driver?

Handles.
If a maintainer maintains several subsystems/drivers, a "handle" could be
used to refer to a handle list (hello MAINTAINERS) or to the place
where the full maintainer entry is placed.





See you

-- 
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.



Re: No more "device" symlinks for classes

2007-01-13 Thread Kay Sievers
On Sun, 2007-01-14 at 00:51 +0100, Pierre Ossman wrote:
> I just wanted to know the rationale behind
> 99ef3ef8d5f2f5b5312627127ad63df27c0d0d05 (no more "device" symlink in
> class devices). I thought that was a rather convenient way of finding
> which physical device the class device was coupled to.

The plan is to have a single unified tree at /sys/devices, where all
device-directories live below their parents, and /sys/class contains
only symlinks pointing into this single tree, just like /sys/bus.

People want to stack class-devices, but this leads to a /sys/devices
tree and several small trees spread around in /sys/class. These trees
need to be connected by "device"-links and the "class:"-links, which
just doesn't make much sense if you can have one single tree with the
same information.

In the unified tree, the "device"-link will always just point to the
parent device, that's why there is a config option to disable these
links and test current software not to depend on it.

There was a long discussion on lkml about all that, maybe a year ago,
while converting "input".

Thanks,
Kay



Re: tuning/tweaking VM settings for low memory (preventing OOM)

2007-01-13 Thread Toon van der Pas
On Sat, Jan 13, 2007 at 03:30:27PM +0100, Willy Tarreau wrote:
> On Sat, Jan 13, 2007 at 02:16:01PM +0100, Toon van der Pas wrote:
> > On Sat, Jan 13, 2007 at 08:22:18AM +0100, Willy Tarreau wrote:
> > > > 
> > > > Which makes me think that we aren't writing back fast enough.  If I  
> > > > mount the drive "sync" the issue clearly goes away.
> > > > 
> > > > It appears from an strace we are doing ftruncate64(5, 178257920) when  
> > > > we OOM.
> > > > 
> > > > Any ideas on VM parameters to tweak so we throttle this from occurring?
> > > 
> > > Take a look at /proc/sys/vm/bdflush. There are several useful parameters
> > > there (doc is in linux-xxx/Documentation). For instance, the first column
> > > is the percentage of memory used by writes before starting to write on
> > > disk. When using tcpdump intensively, I lower this one to about 1%.
> > 
> > Hi Willy,
> > 
> > I know you're doing a great job on keeping the 2.4 kernel in shape,
> > but do you also have a good advice for people with more recent
> > kernels?  (hint: the file /proc/sys/vm/bdflush is missing)
> 
> OK OK OK... Next time I will have coffee *before* replying :-)
> 
> Check /proc/sys/vm/dirty_ratio and dirty_background_ratio. Both are
> percentage of total memory. The first one is for "foreground" writes
> (ie the writing process may block) while the second one is for
> "background" writes :
> 
> $ uname -a
> Linux hp 2.6.16-rc2-pa1 #1 Fri Feb 3 23:34:56 MST 2006 parisc unknown
> $ cat /proc/sys/vm/dirty_ratio 
> 40
> $ cat /proc/sys/vm/dirty_background_ratio 
> 10
> 
> Again, lowering those values should help writing data to disk
> sooner. Also you should take a look at min_free_kbytes (although
> I've not played with it yet) :

Ahh, okay, I didn't really understand these parameters before.
Now I think I understand what they are supposed to do.
I'll do some experiments with them.

Thanks for your help.
Toon.

BTW: That's pretty exotic hardware you have there (parisc).  ;-)
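
For a feel of what those percentages mean in bytes, here is a minimal
sketch that takes the description above at face value (a percentage of
total memory). The function name dirty_threshold_bytes() is made up, and
the kernel's own accounting may use a different base than total memory,
so treat the numbers as rough estimates only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Amount of dirty data, in bytes, that corresponds to `ratio_percent`
 * of `total_bytes`. Hypothetical helper: it assumes the ratio applies
 * to total memory, as described in the message above.
 */
static uint64_t dirty_threshold_bytes(uint64_t total_bytes,
                                      unsigned int ratio_percent)
{
    assert(ratio_percent <= 100);

    /* Split the division so large sizes do not overflow 64 bits. */
    return (total_bytes / 100) * ratio_percent
         + (total_bytes % 100) * ratio_percent / 100;
}
```

With 512 MB of RAM and the defaults shown above, this gives roughly 205 MB
of dirty data before a writer may block (dirty_ratio = 40) and about 51 MB
before background writeback starts (dirty_background_ratio = 10), which
shows why lowering both helps on a small-memory box.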


Re: Linux v2.6.20-rc5

2007-01-13 Thread Segher Boessenkool

>  CC [M]  drivers/kvm/vmx.o
> {standard input}: Assembler messages:
> {standard input}:3257: Error: bad register name `%sil'
> make[2]: *** [drivers/kvm/vmx.o] Error 1
> make[1]: *** [drivers/kvm] Error 2
> make: *** [drivers] Error 2
>
> Am I missing something or is this a real problem?

What's on (and around) that line #3257?


Segher



Re: [RFC] How to (automatically) find the correct maintainer(s)

2007-01-13 Thread Richard Knutsson

Matthias Schniedermeyer wrote:
> Richard Knutsson wrote:
>
>> Any thoughts on this are very much appreciated (are there any flaws with
>> this?).
>
> The thought that crossed my mind was:
>
> Why not do the same thing that was done to the "Help"-file. (Before it
> was superseded by Kconfig).
>
> Originally there was a central Help-file, with all the texts. Then it was
> split and placed in each sub-dir. And later it was superseded by Kconfig.
>
> On the other hand you could skip the intermediate step and just fold the
> Maintainer-data directly into Kconfig, that way everything is "in one
> place" and you could place a "Maintainers"-Button next to the
> "Help"-Button in *config, or just display it alongside the help.
>
> And MAYBE that would also lessen the "up-to-date"-problem, as you
> can just write the MAINTAINERS-data when you create/update the
> Kconfig-file. Which is a thing that creates much bigger pain when you
> forget it accidentally. ;-)
>
> Oh, and it neatly solves the mapping-problem, for at least all
> kernel-parts that have a Kconfig-option/Sub-Tree.

I'm all for splitting up the MAINTAINERS! :)

Just, do you have any ideas how to solve the possible duplication of the
same entries, when handling multiple sub-directories and when many
different drivers with different maintainers are in the same directory
and a maintainer has more than one driver?




Re: [RFC] How to (automatically) find the correct maintainer(s)

2007-01-13 Thread Richard Knutsson

Stefan Richter wrote:
> On 13 Jan, Richard Knutsson wrote:
>> Stefan Richter wrote:
>>> On 13 Jan, Richard Knutsson wrote:
>>> [...]
>>>> SUPERCOOL ALPHA CARD
>>>>
>>>> P:  Clark Kent
>>>> M:  [EMAIL PROTECTED]
>>>> L:  [EMAIL PROTECTED]
>>>> C:  SUPER_A
>>>> S:  Maintained
>>>> (C: for CONFIG. Any better idea?)
>>>>
>>>> then if someone changes a file that is built with CONFIG_SUPER_A, they
>>>> can easily backtrack it to the correct maintainer(s).
>>> [...]
>>>> My first idea was to use the pathway and define that directories above
>>>> the specified (if not specified by another) would fall to the current
>>>> maintainer. It would work, but requires that all pathways be specified
>>>> at once, or a few maintainers with "short" pathways would get much of
>>>> the patches (and it is not as correct/easy to maintain as looking for
>>>> the CONFIG_flag).
>>>>
>>>> Any thoughts on this are very much appreciated (are there any flaws
>>>> with this?).
>>>
>>>  - What about drivers which have no MAINTAINER entry but reside in a
>>>    subsystem with MAINTAINER entry?
>>
>> Hmm, how are those drivers built? Can you please point me to one?
>
> I believe you read too quickly what I wrote, didn't you? :-)
> The MAINTAINER file doesn't influence how drivers are built.

What the... now I have no idea why I deleted the previous text... oh
well, I tried 'grep -Er "^M\:" */*' but did not find any such entries.
Or did you mean files just stating "I am maintaining this file"?
At least I know so much about the building-process that I don't think
MAINTAINER is involved :). It was meant as: how is a driver built
without some CONFIG_-flag set, but I am not sure now what I wanted with
that (blaming low blood sugar, got a pizza since then).

>>>  - What if these drivers depend on two subsystems?
>>
>> Not sure if I understand the problem. I don't see the maintainers for
>> the subsystems too interested in a driver, and it is the driver's
>> maintainer we want.
>
> I am specifically thinking of drivers which are maintained by the
> subsystem maintainers. (Well, see below...)

More than one subsystem maintainer maintaining the same driver? I
would think one of those would quite naturally become the maintainer of
the driver and then accept patches from the rest.

> Besides, the subsystem maintainer could point the submitter to a
> more appropriate channel or ignore the submitter. (A submitter who
> feels ignored is hopefully doing some more research then.) Also,
> a driver maintainer certainly reads the mailinglist to which the
> submitter posted.

Hopefully, but I think it is asking much of the maintainer, and then
there will certainly be confused/frustrated submitters who don't know why
they get no answer and their patches are not included. We have already
seen a few asking about what happened with their patches.

>>>  - Config options map to object files but do not map directly to source
>>>    files. Diffstats show source files.
>>
>> Can you make an object-file out of 2 c-files? Using Makefile?
>
> Yes, you can, although I don't know if it is directly done in the
> kernel build system. Of course what is often done is to make n object
> files out of n c files, then link them to make 1 object file.

How?:
gcc -c test.c test2.c -o test3.o
gcc: cannot specify -o with -c or -S with multiple files
(with only -c I got test.o and test2.o)

In the kernel building system, an object-file is made from a c- or
s-file with the same name. Then, of course, they can be put together to
a larger object-file.

>>> Example: The sbp2 driver is an IEEE 1394 driver and a SCSI driver.
>>> sbp2.o is enabled by CONFIG_IEEE1394_SBP2 which depends on
>>> CONFIG_IEEE1394 and CONFIG_SCSI. sbp2.c resides in drivers/ieee1394/.
>>> What is the algorithm to look up sbp2's maintainers?
>>
>> The one listed for CONFIG_IEEE1394_SBP2 :)
>
> ...OK, we /could/ write
>
> IEEE 1394 SUBSYSTEM
> C:  IEEE1394
> C:  IEEE1394_OHCI1394
> C:  IEEE1394_SBP2
> C:  IEEE1394_DV1394  /* would better be put into a new own entry due to
>                         different status of maintenance level */
> C:  IEEE1394_VIDEO1394  /* that one perhaps too */
> L:  [EMAIL PROTECTED]
> P:  Ben and me
> [...]
> IEEE 1394 IPV4 DRIVER (eth1394)
> C:  IEEE1394_ETH1394
> [...]

What about the possibility of replacing it with:

C:  IEEE1394*

and use the same system as with the path-approach, "longest wins". (I
don't think just IEEE1394 is appropriate, since then there is the
possibility of false positives again)

> On the other hand, we could write
>
> IEEE 1394 SUBSYSTEM
> F:  drivers/ieee1394
> L:  [EMAIL PROTECTED]
> P:  Ben and me
> [...]
> IEEE 1394 IPV4 DRIVER (eth1394)
> F:  drivers/ieee1394/eth1394
> [...]
>
> If it was done the latter way, i.e. using F: not C:, it could be
> made a rule that the more specific entries come after more generic
> entries. Thus the last match of multiple matches is the proper one.
> In any case, the longest match is the proper one.
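
The "longest match wins" rule for F: entries could look roughly like the
sketch below. It is only an illustration under assumptions: struct entry,
find_maintainer() and the two-row table are invented here, no such tool
existed at the time of this thread, and the F: values are treated as plain
string prefixes of the changed file's path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One hypothetical MAINTAINERS section: an F: prefix and its title. */
struct entry {
    const char *path;
    const char *name;
};

/* The two example sections from the message above. */
static const struct entry example_entries[] = {
    { "drivers/ieee1394",         "IEEE 1394 SUBSYSTEM" },
    { "drivers/ieee1394/eth1394", "IEEE 1394 IPV4 DRIVER (eth1394)" },
};

/*
 * Return the entry whose F: prefix is the longest one matching `file`,
 * or NULL if none matches. On equal length the later entry wins, which
 * agrees with the "more specific entries come later" rule.
 */
static const struct entry *find_maintainer(const struct entry *tab,
                                           size_t n, const char *file)
{
    const struct entry *best = NULL;
    size_t best_len = 0;

    assert(tab != NULL && file != NULL);

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(tab[i].path);

        if (strncmp(tab[i].path, file, len) == 0 && len >= best_len) {
            best = &tab[i];
            best_len = len;
        }
    }
    return best;
}
```

With this table, drivers/ieee1394/eth1394.c resolves to the eth1394 entry
and drivers/ieee1394/sbp2.c falls back to the subsystem entry, without
depending on the order in which the sections are listed.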
  
As I wrote in the initial mail, my first idea was like that. But how to 
solve when different 

[patch 03/20] XEN-paravirt: paravirt: page-table accessors

2007-01-13 Thread Jeremy Fitzhardinge
Add a set of accessors to pack, unpack and modify page table entries
(at all levels).  This allows a paravirt implementation to control the
contents of pgd/pmd/pte entries.  For example, Xen uses this to
convert the (pseudo-)physical address into a machine address when
populating a pagetable entry, and converting back to pphys address
when an entry is read.
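
As a rough user-space illustration of what "pack" and "unpack" mean for the
PAE case (a sketch only: struct pae_pte and the pae_* helpers are stand-ins
invented here, mirroring the pte_low/pte_high layout used by the patch, and
no Xen address translation is modelled):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the PAE pte_t: a 64-bit entry kept as two 32-bit halves. */
struct pae_pte {
    uint32_t pte_low;
    uint32_t pte_high;
};

/* Unpack: combine both halves into one 64-bit value. */
static uint64_t pae_pte_val(struct pae_pte pte)
{
    return pte.pte_low | ((uint64_t)pte.pte_high << 32);
}

/* Pack: split a 64-bit value back into the two halves. */
static struct pae_pte pae_make_pte(uint64_t val)
{
    struct pae_pte pte = { (uint32_t)val, (uint32_t)(val >> 32) };
    return pte;
}

/* Sanity check: packing and then unpacking must be lossless. */
static void pae_check_roundtrip(uint64_t val)
{
    assert(pae_pte_val(pae_make_pte(val)) == val);
}
```

A paravirt backend would hook exactly these two points, translating the
frame number on the way in and on the way out, while the rest of the kernel
keeps calling the same accessors.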

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Chris Wright <[EMAIL PROTECTED]>
Cc: Zachary Amsden <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>

--
 arch/i386/kernel/paravirt.c   |  113 +
 arch/i386/kernel/vmlinux.lds.S|3 
 include/asm-i386/page.h   |   18 -
 include/asm-i386/paravirt.h   |   68 +-
 include/asm-i386/pgtable-2level.h |5 -
 include/asm-i386/pgtable-3level.h |   27 
 6 files changed, 199 insertions(+), 35 deletions(-)

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -34,7 +34,7 @@
 #include 
 
 /* nop stub */
-static void native_nop(void)
+void native_nop(void)
 {
 }
 
@@ -400,38 +400,74 @@ static fastcall void native_flush_tlb_si
 }
 
 #ifndef CONFIG_X86_PAE
-static fastcall void native_set_pte(pte_t *ptep, pte_t pteval)
+fastcall void native_set_pte(pte_t *ptep, pte_t pteval)
 {
*ptep = pteval;
 }
 
-static fastcall void native_set_pte_at(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pteval)
+fastcall void native_set_pte_at(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pteval)
 {
*ptep = pteval;
 }
 
-static fastcall void native_set_pmd(pmd_t *pmdp, pmd_t pmdval)
+fastcall void native_set_pmd(pmd_t *pmdp, pmd_t pmdval)
 {
*pmdp = pmdval;
 }
 
+fastcall unsigned long native_pte_val(pte_t pte)
+{
+   return pte.pte_low;
+}
+
+fastcall unsigned long native_pmd_val(pmd_t pmd)
+{
+   BUG();
+   return 0;
+}
+
+fastcall unsigned long native_pgd_val(pgd_t pgd)
+{
+   return pgd.pgd;
+}
+
+fastcall pte_t native_make_pte(unsigned long pte)
+{
+   return (pte_t){ pte };
+}
+
+fastcall pmd_t native_make_pmd(unsigned long pmd)
+{
+   BUG();
+}
+
+fastcall pgd_t native_make_pgd(unsigned long pgd)
+{
+   return (pgd_t){ pgd };
+}
+
+fastcall pte_t native_ptep_get_and_clear(pte_t *ptep)
+{
+   return __pte(xchg(&(ptep)->pte_low, 0));
+}
+
 #else /* CONFIG_X86_PAE */
 
-static fastcall void native_set_pte(pte_t *ptep, pte_t pte)
+fastcall void native_set_pte(pte_t *ptep, pte_t pte)
 {
ptep->pte_high = pte.pte_high;
smp_wmb();
ptep->pte_low = pte.pte_low;
 }
 
-static fastcall void native_set_pte_at(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pte)
+fastcall void native_set_pte_at(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pte)
 {
ptep->pte_high = pte.pte_high;
smp_wmb();
ptep->pte_low = pte.pte_low;
 }
 
-static fastcall void native_set_pte_present(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte)
+fastcall void native_set_pte_present(struct mm_struct *mm, u32 addr, pte_t *ptep, pte_t pte)
 {
ptep->pte_low = 0;
smp_wmb();
@@ -440,34 +476,76 @@ static fastcall void native_set_pte_pres
ptep->pte_low = pte.pte_low;
 }
 
-static fastcall void native_set_pte_atomic(pte_t *ptep, pte_t pteval)
+fastcall void native_set_pte_atomic(pte_t *ptep, pte_t pteval)
 {
set_64bit((unsigned long long *)ptep,pte_val(pteval));
 }
 
-static fastcall void native_set_pmd(pmd_t *pmdp, pmd_t pmdval)
+fastcall void native_set_pmd(pmd_t *pmdp, pmd_t pmdval)
 {
set_64bit((unsigned long long *)pmdp,pmd_val(pmdval));
 }
 
-static fastcall void native_set_pud(pud_t *pudp, pud_t pudval)
+fastcall void native_set_pud(pud_t *pudp, pud_t pudval)
 {
*pudp = pudval;
 }
 
-static fastcall void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+fastcall void native_pte_clear(struct mm_struct *mm, u32 addr, pte_t *ptep)
 {
ptep->pte_low = 0;
smp_wmb();
ptep->pte_high = 0;
 }
 
-static fastcall void native_pmd_clear(pmd_t *pmd)
+fastcall void native_pmd_clear(pmd_t *pmd)
 {
u32 *tmp = (u32 *)pmd;
*tmp = 0;
smp_wmb();
*(tmp + 1) = 0;
+}
+
+fastcall unsigned long long native_pte_val(pte_t pte)
+{
+   return pte.pte_low | ((unsigned long long)pte.pte_high << 32);
+}
+
+fastcall unsigned long long native_pmd_val(pmd_t pmd)
+{
+   return pmd.pmd;
+}
+
+fastcall unsigned long long native_pgd_val(pgd_t pgd)
+{
+   return pgd.pgd;
+}
+
+fastcall pte_t native_make_pte(unsigned long long pte)
+{
+   return (pte_t){ pte };
+}
+
+fastcall pmd_t native_make_pmd(unsigned long long pmd)
+{
+   return (pmd_t){ pmd };
+}
+
+fastcall pgd_t native_make_pgd(unsigned long long pgd)
+{
+   return (pgd_t){ pgd };
+}
+
+fastcall pte_t 

[patch 18/20] XEN-paravirt: Add Xen driver utility functions.

2007-01-13 Thread Jeremy Fitzhardinge
Allocate/destroy a 'vmalloc' VM area: alloc_vm_area and free_vm_area
The alloc function ensures that page tables are constructed for the
region of kernel virtual address space and mapped into init_mm.

Lock an area so that PTEs are accessible in the current address space:
lock_vm_area and unlock_vm_area
The lock function prevents context switches to a lazy mm that doesn't
have the area mapped into its page tables.  It also ensures that the
page tables are mapped into the current mm by causing the page fault
handler to copy the page directory pointers from init_mm into the
current mm.

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
Cc: "Jan Beulich" <[EMAIL PROTECTED]>
---

 drivers/xen/Makefile  |2 +
 drivers/xen/util.c|   70 ++
 include/xen/driver_util.h |   16 ++
 3 files changed, 88 insertions(+)

===
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,2 +1,4 @@ obj-y   += core/
+obj-y  += util.o
+
 obj-y  += core/
 obj-y  += console/
===
--- /dev/null
+++ b/drivers/xen/util.c
@@ -0,0 +1,69 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int f(pte_t *pte, struct page *pmd_page, unsigned long addr, void *data)
+{
+   /* apply_to_page_range() does all the hard work. */
+   return 0;
+}
+
+struct vm_struct *alloc_vm_area(unsigned long size)
+{
+   struct vm_struct *area;
+
+   area = get_vm_area(size, VM_IOREMAP);
+   if (area == NULL)
+   return NULL;
+
+   /*
+* This ensures that page tables are constructed for this region
+* of kernel virtual address space and mapped into init_mm.
+*/
+   if (apply_to_page_range(&init_mm, (unsigned long)area->addr,
+   area->size, f, NULL)) {
+   free_vm_area(area);
+   return NULL;
+   }
+
+   return area;
+}
+EXPORT_SYMBOL_GPL(alloc_vm_area);
+
+void free_vm_area(struct vm_struct *area)
+{
+   struct vm_struct *ret;
+   ret = remove_vm_area(area->addr);
+   BUG_ON(ret != area);
+   kfree(area);
+}
+EXPORT_SYMBOL_GPL(free_vm_area);
+
+void lock_vm_area(struct vm_struct *area)
+{
+   unsigned long i;
+   char c;
+
+   /*
+* Prevent context switch to a lazy mm that doesn't have this area
+* mapped into its page tables.
+*/
+   preempt_disable();
+
+   /*
+* Ensure that the page tables are mapped into the current mm. The
+* page-fault path will copy the page directory pointers from init_mm.
+*/
+   for (i = 0; i < area->size; i += PAGE_SIZE)
+   (void)__get_user(c, (char __user *)area->addr + i);
+}
+EXPORT_SYMBOL_GPL(lock_vm_area);
+
+void unlock_vm_area(struct vm_struct *area)
+{
+   preempt_enable();
+}
+EXPORT_SYMBOL_GPL(unlock_vm_area);
===
--- /dev/null
+++ b/include/xen/driver_util.h
@@ -0,0 +1,15 @@
+
+#ifndef __ASM_XEN_DRIVER_UTIL_H__
+#define __ASM_XEN_DRIVER_UTIL_H__
+
+#include 
+
+/* Allocate/destroy a 'vmalloc' VM area. */
+extern struct vm_struct *alloc_vm_area(unsigned long size);
+extern void free_vm_area(struct vm_struct *area);
+
+/* Lock an area so that PTEs are accessible in the current address space. */
+extern void lock_vm_area(struct vm_struct *area);
+extern void unlock_vm_area(struct vm_struct *area);
+
+#endif /* __ASM_XEN_DRIVER_UTIL_H__ */

-- 



[patch 05/20] XEN-paravirt: paravirt: reserve fixmap slot

2007-01-13 Thread Jeremy Fitzhardinge
Reserve a new fixmap slot for paravirt backends.  Xen uses this for
mapping the hypervisor shared-info page, which doesn't really exist in
the guest address space.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Chris Wright <[EMAIL PROTECTED]>
Cc: Zachary Amsden <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>

===
--- a/include/asm-i386/fixmap.h
+++ b/include/asm-i386/fixmap.h
@@ -86,6 +86,9 @@ enum fixed_addresses {
 #ifdef CONFIG_PCI_MMCONFIG
FIX_PCIE_MCFG,
 #endif
+#ifdef CONFIG_PARAVIRT
+   FIX_PARAVIRT,
+#endif
__end_of_permanent_fixed_addresses,
/* temporary boot-time mappings, used before ioremap() is functional */
 #define NR_FIX_BTMAPS  16

-- 



[patch 20/20] XEN-paravirt: Add Xen virtual block device driver.

2007-01-13 Thread Jeremy Fitzhardinge
The block device frontend driver allows the kernel to access block
devices exported by a virtual machine containing a physical
block device driver.

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/block/Kconfig        |    1
 drivers/block/Makefile       |    1
 drivers/block/xen/Kconfig    |   14
 drivers/block/xen/Makefile   |    5
 drivers/block/xen/blkfront.c |  812 +++
 drivers/block/xen/block.h    |  155
 drivers/block/xen/vbd.c      |  214 +++
 7 files changed, 1202 insertions(+)

===
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -461,6 +461,7 @@ config CDROM_PKTCDVD_WCACHE
  don't do deferred write error handling yet.
 
 source "drivers/s390/block/Kconfig"
+source "drivers/block/xen/Kconfig"
 
 config ATA_OVER_ETH
tristate "ATA over Ethernet support"
===
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -30,3 +30,4 @@ obj-$(CONFIG_BLK_DEV_SX8) += sx8.o
 obj-$(CONFIG_BLK_DEV_SX8)  += sx8.o
 obj-$(CONFIG_BLK_DEV_UB)   += ub.o
 
+obj-$(CONFIG_XEN)  += xen/
===
--- a/include/linux/major.h
+++ b/include/linux/major.h
@@ -156,6 +156,8 @@
 #define VXSPEC_MAJOR   200 /* VERITAS volume config driver */
 #define VXDMP_MAJOR201 /* VERITAS volume multipath driver */
 
+#define XENVBD_MAJOR   202 /* Xen virtual block device */
+
 #define MSR_MAJOR  202
 #define CPUID_MAJOR203
 
===
--- /dev/null
+++ b/drivers/block/xen/Kconfig
@@ -0,0 +1,14 @@
+menu "Xen block device drivers"
+depends on XEN
+
+config XEN_BLKDEV_FRONTEND
+   tristate "Block device frontend driver"
+   depends on XEN
+   default y
+   help
+ The block device frontend driver allows the kernel to access block
+ devices exported from a device driver virtual machine. Unless you
+ are building a dedicated device driver virtual machine, you
+ almost certainly want to say Y here.
+
+endmenu
===
--- /dev/null
+++ b/drivers/block/xen/Makefile
@@ -0,0 +1,5 @@
+
+obj-$(CONFIG_XEN_BLKDEV_FRONTEND)  := xenblk.o
+
+xenblk-objs := blkfront.o vbd.o
+
===
--- /dev/null
+++ b/drivers/block/xen/blkfront.c
@@ -0,0 +1,870 @@
+/**
+ * blkfront.c
+ * 
+ * XenLinux virtual block device driver.
+ * 
+ * Copyright (c) 2003-2004, Keir Fraser & Steve Hand
+ * Modifications by Mark A. Williamson are (c) Intel Research Cambridge
+ * Copyright (c) 2004, Christian Limpach
+ * Copyright (c) 2004, Andrew Warfield
+ * Copyright (c) 2005, Christopher Clark
+ * Copyright (c) 2005, XenSource Ltd
+ * 
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ * 
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ * 
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include 
+#include "block.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../../../arch/i386/paravirt-xen/events.h"
+#include "../../../arch/i386/paravirt-xen/xen-page.h"
+#include 
+
+#define BLKIF_STATE_DISCONNECTED 0
+#define BLKIF_STATE_CONNECTED1
+#define BLKIF_STATE_SUSPENDED2
+
+#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
+

[patch 13/20] XEN-paravirt: Xen: Add config options and disable unsupported config options.

2007-01-13 Thread Jeremy Fitzhardinge
The XEN config option enables the Xen paravirt_ops interface, which is
installed when the kernel finds itself running under Xen. (By some
as-yet not fully defined mechanism, implemented in a future patch.)

Xen is no longer a sub-architecture, so the X86_XEN subarch config
option has gone.

The disabled config options are:
- PREEMPT: Xen doesn't support it
- HZ: set to 100Hz for now, to cut down on VCPU context switch rate.
  This will be adapted to use tickless later.
- kexec: not yet supported

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>

---
 arch/i386/Kconfig  |8 +---
 arch/i386/Kconfig.debug|1 +
 arch/i386/paravirt-xen/Kconfig |   17 +
 kernel/Kconfig.hz  |4 ++--
 kernel/Kconfig.preempt |1 +
 5 files changed, 26 insertions(+), 5 deletions(-)

===
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -192,6 +192,8 @@ config PARAVIRT
  under a hypervisor, improving performance significantly.
  However, when run without a hypervisor the kernel is
  theoretically slower.  If in doubt, say N.
+
+source "arch/i386/paravirt-xen/Kconfig"
 
 config ACPI_SRAT
bool
@@ -298,12 +300,12 @@ config X86_UP_IOAPIC
 
 config X86_LOCAL_APIC
bool
-   depends on X86_UP_APIC || ((X86_VISWS || SMP) && !X86_VOYAGER) || X86_GENERICARCH
+   depends on X86_UP_APIC || (((X86_VISWS || SMP) && !X86_VOYAGER) || X86_GENERICARCH)
default y
 
 config X86_IO_APIC
bool
-   depends on X86_UP_IOAPIC || (SMP && !(X86_VISWS || X86_VOYAGER)) || X86_GENERICARCH
+   depends on X86_UP_IOAPIC || ((SMP && !(X86_VISWS || X86_VOYAGER)) || X86_GENERICARCH)
default y
 
 config X86_VISWS_APIC
@@ -743,6 +745,7 @@ source kernel/Kconfig.hz
 
 config KEXEC
bool "kexec system call"
+   depends on !XEN
help
  kexec is a system call that implements the ability to shutdown your
  current kernel, and to start another kernel.  It is like a reboot
===
--- a/arch/i386/Kconfig.debug
+++ b/arch/i386/Kconfig.debug
@@ -79,6 +79,7 @@ config DOUBLEFAULT
 config DOUBLEFAULT
default y
bool "Enable doublefault exception handler" if EMBEDDED
+   depends on !XEN
help
   This option allows trapping of rare doublefault exceptions that
   would otherwise cause a system to silently reboot. Disabling this
===
--- /dev/null
+++ b/arch/i386/paravirt-xen/Kconfig
@@ -0,0 +1,10 @@
+#
+# This Kconfig describes xen options
+#
+
+config XEN
+   bool "Enable support for Xen hypervisor"
+   depends on PARAVIRT
+   default y
+   help
+ This is the Linux Xen port.
===
--- a/kernel/Kconfig.hz
+++ b/kernel/Kconfig.hz
@@ -3,7 +3,7 @@
 #
 
 choice
-   prompt "Timer frequency"
+   prompt "Timer frequency" if !XEN
default HZ_250
help
 Allows the configuration of the timer frequency. It is customary
@@ -49,7 +49,7 @@ endchoice
 
 config HZ
int
-   default 100 if HZ_100
+   default 100 if HZ_100 || XEN
default 250 if HZ_250
default 300 if HZ_300
default 1000 if HZ_1000
===
--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -35,6 +35,7 @@ config PREEMPT_VOLUNTARY
 
 config PREEMPT
bool "Preemptible Kernel (Low-Latency Desktop)"
+   depends on !XEN
help
  This option reduces the latency of the kernel by making
  all kernel code (that is not executing in a critical section)

-- 



[patch 12/20] XEN-paravirt: Xen: Add nosegneg capability to the vsyscall page notes

2007-01-13 Thread Jeremy Fitzhardinge
Add the "nosegneg" fake capability to the vsyscall page notes. This is
used by the runtime linker to select a glibc version which then
disables negative-offset accesses to the thread-local segment via
%gs. These accesses require emulation in Xen (because segments are
truncated to protect the hypervisor address space) and avoiding them
provides a measurable performance boost.

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/i386/kernel/vsyscall-note.S |   29 +
 1 files changed, 29 insertions(+)

===
--- a/arch/i386/kernel/vsyscall-note.S
+++ b/arch/i386/kernel/vsyscall-note.S
@@ -23,3 +24,31 @@ 3:   .balign 4;  /* pad out section */   
 
ASM_ELF_NOTE_BEGIN(".note.kernel-version", "a", UTS_SYSNAME, 0)
.long LINUX_VERSION_CODE
ASM_ELF_NOTE_END
+
+#ifdef CONFIG_XEN
+/*
+ * Add a special note telling glibc's dynamic linker a fake hardware
+ * flavor that it will use to choose the search path for libraries in the
+ * same way it uses real hardware capabilities like "mmx".
+ * We supply "nosegneg" as the fake capability, to indicate that we
+ * do not like negative offsets in instructions using segment overrides,
+ * since we implement those inefficiently.  This makes it possible to
+ * install libraries optimized to avoid those access patterns in someplace
+ * like /lib/i686/tls/nosegneg.  Note that an /etc/ld.so.conf.d/file
+ * corresponding to the bits here is needed to make ldconfig work right.
+ * It should contain:
+ * hwcap 0 nosegneg
+ * to match the mapping of bit to name that we give here.
+ */
+#define NOTE_KERNELCAP_BEGIN(ncaps, mask) \
+   ASM_ELF_NOTE_BEGIN(".note.kernelcap", "a", "GNU", 2) \
+   .long ncaps, mask
+#define NOTE_KERNELCAP(bit, name) \
+   .byte bit; .asciz name
+#define NOTE_KERNELCAP_END ASM_ELF_NOTE_END
+
+NOTE_KERNELCAP_BEGIN(1, 1)
+NOTE_KERNELCAP(1, "nosegneg")
+NOTE_KERNELCAP_END
+#endif
+

-- 



[patch 10/20] XEN-paravirt: mm lifetime hooks

2007-01-13 Thread Jeremy Fitzhardinge
Add hooks to allow a paravirt implementation to track the lifetime of
an mm.  Unfortunately dup_mmap and exit_mmap are in generic code, so
we need to #ifdef CONFIG_PARAVIRT.
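
To make the idea concrete, here is a small userspace sketch, not kernel
code, of a hook table whose defaults do nothing and which a backend can
override to follow address-space creation and teardown. All toy_ and hv_
names are invented for the example, and the counting backend is a
hypothetical stand-in for whatever bookkeeping a real hypervisor port
would do.

```c
#include <assert.h>

struct toy_mm {
	int id;
};

/* Hook table, loosely modelled on the three new paravirt_ops members. */
struct toy_mm_ops {
	void (*activate_mm)(struct toy_mm *prev, struct toy_mm *next);
	void (*dup_mmap)(struct toy_mm *oldmm, struct toy_mm *mm);
	void (*exit_mmap)(struct toy_mm *mm);
};

/* Native case: every hook is a no-op. */
static void nop_activate_mm(struct toy_mm *prev, struct toy_mm *next)
{
	(void)prev;
	(void)next;
}

static void nop_dup_mmap(struct toy_mm *oldmm, struct toy_mm *mm)
{
	(void)oldmm;
	(void)mm;
}

static void nop_exit_mmap(struct toy_mm *mm)
{
	(void)mm;
}

static struct toy_mm_ops toy_ops = {
	.activate_mm = nop_activate_mm,
	.dup_mmap = nop_dup_mmap,
	.exit_mmap = nop_exit_mmap,
};

/* Hypothetical backend: count the address spaces it has been told about. */
static int toy_live_mms;

static void hv_dup_mmap(struct toy_mm *oldmm, struct toy_mm *mm)
{
	(void)oldmm;
	(void)mm;
	toy_live_mms++;
}

static void hv_exit_mmap(struct toy_mm *mm)
{
	(void)mm;
	assert(toy_live_mms > 0);	/* every exit must pair with a dup */
	toy_live_mms--;
}

static void toy_install_backend(void)
{
	toy_ops.dup_mmap = hv_dup_mmap;
	toy_ops.exit_mmap = hv_exit_mmap;
}

/* Stand-ins for the generic call sites in dup_mmap() and exit_mmap(). */
static void toy_fork(struct toy_mm *oldmm, struct toy_mm *mm)
{
	mm->id = oldmm->id + 1;
	toy_ops.dup_mmap(oldmm, mm);
}

static void toy_exit(struct toy_mm *mm)
{
	toy_ops.exit_mmap(mm);
}
```

Before toy_install_backend() runs, toy_fork() and toy_exit() go through
the no-op hooks and the counter stays at zero; afterwards each fork/exit
pair raises and lowers it. The generic callers never need to know which
backend is installed.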

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Chris Wright <[EMAIL PROTECTED]>
Cc: Zachary Amsden <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>

--
 arch/i386/kernel/paravirt.c|4 
 include/asm-i386/mmu_context.h |8 ++--
 include/asm-i386/paravirt.h|   38 ++
 kernel/fork.c  |3 +++
 mm/mmap.c  |4 
 5 files changed, 55 insertions(+), 2 deletions(-)

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -706,6 +706,10 @@ struct paravirt_ops paravirt_ops = {
.irq_enable_sysexit = native_irq_enable_sysexit,
.iret = native_iret,
 
+   .dup_mmap = (void *)native_nop,
+   .exit_mmap = (void *)native_nop,
+   .activate_mm = (void *)native_nop,
+
.startup_ipi_hook = (void *)native_nop,
 };
 
===
--- a/include/asm-i386/mmu_context.h
+++ b/include/asm-i386/mmu_context.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Used for LDT copy/destruction.
@@ -65,7 +66,10 @@ static inline void switch_mm(struct mm_s
 #define deactivate_mm(tsk, mm) \
asm("movl %0,%%gs": :"r" (0));
 
-#define activate_mm(prev, next) \
-   switch_mm((prev),(next),NULL)
+#define activate_mm(prev, next)\
+   do {\
+   paravirt_activate_mm(prev, next);   \
+   switch_mm((prev),(next),NULL);  \
+   } while (0)
 
 #endif
===
--- a/include/asm-i386/paravirt.h
+++ b/include/asm-i386/paravirt.h
@@ -126,6 +126,12 @@ struct paravirt_ops
void (fastcall *io_delay)(void);
void (*const_udelay)(unsigned long loops);
 
+   void (fastcall *activate_mm)(struct mm_struct *prev,
+struct mm_struct *next);
+   void (fastcall *dup_mmap)(struct mm_struct *oldmm, 
+ struct mm_struct *mm);
+   void (fastcall *exit_mmap)(struct mm_struct *mm);
+
 #ifdef CONFIG_X86_LOCAL_APIC
void (fastcall *apic_write)(unsigned long reg, unsigned long v);
void (fastcall *apic_write_atomic)(unsigned long reg, unsigned long v);
@@ -429,6 +435,23 @@ static inline void startup_ipi_hook(int 
 }
 #endif
 
+static inline void paravirt_activate_mm(struct mm_struct *prev,
+   struct mm_struct *next)
+{
+   paravirt_ops.activate_mm(prev, next);
+}
+
+static inline void paravirt_dup_mmap(struct mm_struct *oldmm,
+struct mm_struct *mm)
+{
+   paravirt_ops.dup_mmap(oldmm, mm);
+}
+
+static inline void paravirt_exit_mmap(struct mm_struct *mm)
+{
+   paravirt_ops.exit_mmap(mm);
+}
+
 #define __flush_tlb() paravirt_ops.flush_tlb_user()
 #define __flush_tlb_global() paravirt_ops.flush_tlb_kernel()
 #define __flush_tlb_single(addr) paravirt_ops.flush_tlb_single(addr)
@@ -673,5 +696,20 @@ static inline void paravirt_pagetable_se
set_pgd(&base[0], base[USER_PTRS_PER_PGD]);
 #endif
 }
+
+static inline void paravirt_activate_mm(struct mm_struct *prev,
+   struct mm_struct *next)
+{
+}
+
+static inline void paravirt_dup_mmap(struct mm_struct *oldmm,
+struct mm_struct *mm)
+{
+}
+
+static inline void paravirt_exit_mmap(struct mm_struct *mm)
+{
+}
+
 #endif /* CONFIG_PARAVIRT */
 #endif /* __ASM_PARAVIRT_H */
===
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -287,6 +287,9 @@ static inline int dup_mmap(struct mm_str
if (retval)
goto out;
}
+#ifdef CONFIG_PARAVIRT
+   paravirt_dup_mmap(oldmm, mm);
+#endif
retval = 0;
 out:
up_write(&mm->mmap_sem);
===
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1969,6 +1969,10 @@ void exit_mmap(struct mm_struct *mm)
struct vm_area_struct *vma = mm->mmap;
unsigned long nr_accounted = 0;
unsigned long end;
+
+#ifdef CONFIG_PARAVIRT
+   paravirt_exit_mmap(mm);
+#endif
 
lru_add_drain();
flush_cache_mm(mm);

-- 



[patch 08/20] XEN-paravirt: paravirt pgd allocation alignment

2007-01-13 Thread Jeremy Fitzhardinge
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Chris Wright <[EMAIL PROTECTED]>
Cc: Zachary Amsden <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -573,6 +573,7 @@ struct paravirt_ops paravirt_ops = {
.paravirt_enabled = 0,
.kernel_rpl = 0,
.shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */
+   .pgd_alignment = sizeof(pgd_t) * PTRS_PER_PGD,
 
.patch = native_patch,
.banner = default_banner,
===
--- a/arch/i386/mm/init.c
+++ b/arch/i386/mm/init.c
@@ -745,7 +745,7 @@ void __init pgtable_cache_init(void)
}
pgd_cache = kmem_cache_create("pgd",
  PTRS_PER_PGD*sizeof(pgd_t),
- PTRS_PER_PGD*sizeof(pgd_t),
+ PGD_ALIGNMENT,
  0, NULL, NULL);
if (!pgd_cache)
panic("pgtable_cache_init(): Cannot create pgd cache");
===
--- a/include/asm-i386/paravirt.h
+++ b/include/asm-i386/paravirt.h
@@ -33,9 +33,12 @@ struct mm_struct;
 struct mm_struct;
 struct paravirt_ops
 {
+   int paravirt_enabled;
unsigned int kernel_rpl;
+
int shared_kernel_pmd;
-   int paravirt_enabled;
+   int pgd_alignment;
+
const char *name;
 
/*
===
--- a/include/asm-i386/pgtable.h
+++ b/include/asm-i386/pgtable.h
@@ -270,6 +270,12 @@ static inline void vmalloc_sync_all(void
 #define pte_update_defer(mm, addr, ptep)   do { } while (0)
 #endif
 
+#ifdef CONFIG_PARAVIRT
+#define PGD_ALIGNMENT  (paravirt_ops.pgd_alignment)
+#else
+#define PGD_ALIGNMENT  (sizeof(pgd_t) * PTRS_PER_PGD)
+#endif
+
 /*
  * We only update the dirty/accessed state if we set
  * the dirty bit by hand in the kernel, since the hardware

-- 



[patch 07/20] XEN-paravirt: paravirt shared kernel pmd flag

2007-01-13 Thread Jeremy Fitzhardinge
Xen does not allow guests to have the kernel pmd shared between page
tables, so parameterize pgtable.c to allow both modes of operation.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Chris Wright <[EMAIL PROTECTED]>
Cc: Zachary Amsden <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>

--
 arch/i386/kernel/paravirt.c|1 
 arch/i386/mm/fault.c   |6 +--
 arch/i386/mm/pageattr.c|2 -
 arch/i386/mm/pgtable.c |   61 
 include/asm-i386/page.h|7 ++-
 include/asm-i386/paravirt.h|1 
 include/asm-i386/pgtable-2level-defs.h |2 +
 include/asm-i386/pgtable-2level.h  |2 -
 include/asm-i386/pgtable-3level-defs.h |6 +++
 include/asm-i386/pgtable-3level.h  |   16 ++--
 include/asm-i386/pgtable.h |7 +++
 11 files changed, 68 insertions(+), 43 deletions(-)

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -572,6 +572,7 @@ struct paravirt_ops paravirt_ops = {
.name = "bare hardware",
.paravirt_enabled = 0,
.kernel_rpl = 0,
+   .shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */
 
.patch = native_patch,
.banner = default_banner,
===
--- a/arch/i386/mm/fault.c
+++ b/arch/i386/mm/fault.c
@@ -616,8 +616,7 @@ do_sigbus:
force_sig_info_fault(SIGBUS, BUS_ADRERR, address, tsk);
 }
 
-#ifndef CONFIG_X86_PAE
-void vmalloc_sync_all(void)
+void _vmalloc_sync_all(void)
 {
/*
 * Note that races in the updates of insync and start aren't
@@ -628,6 +627,8 @@ void vmalloc_sync_all(void)
static DECLARE_BITMAP(insync, PTRS_PER_PGD);
static unsigned long start = TASK_SIZE;
unsigned long address;
+
+   BUG_ON(SHARED_KERNEL_PMD);
 
BUILD_BUG_ON(TASK_SIZE & ~PGDIR_MASK);
for (address = start; address >= TASK_SIZE; address += PGDIR_SIZE) {
@@ -651,4 +652,3 @@ void vmalloc_sync_all(void)
start = address + PGDIR_SIZE;
}
 }
-#endif
===
--- a/arch/i386/mm/pageattr.c
+++ b/arch/i386/mm/pageattr.c
@@ -91,7 +91,7 @@ static void set_pmd_pte(pte_t *kpte, uns
unsigned long flags;
 
set_pte_atomic(kpte, pte);  /* change init_mm */
-   if (PTRS_PER_PMD > 1)
+   if (SHARED_KERNEL_PMD)
return;
 
spin_lock_irqsave(&pgd_lock, flags);
===
--- a/arch/i386/mm/pgtable.c
+++ b/arch/i386/mm/pgtable.c
@@ -241,31 +241,42 @@ static void pgd_ctor(pgd_t *pgd)
unsigned long flags;
 
if (PTRS_PER_PMD == 1) {
+   /* !PAE, no pagetable sharing */
memset(pgd, 0, USER_PTRS_PER_PGD*sizeof(pgd_t));
+
+   clone_pgd_range(pgd + USER_PTRS_PER_PGD,
+   swapper_pg_dir + USER_PTRS_PER_PGD,
+   KERNEL_PGD_PTRS);
+
spin_lock_irqsave(&pgd_lock, flags);
-   }
-
-   clone_pgd_range(pgd + USER_PTRS_PER_PGD,
-   swapper_pg_dir + USER_PTRS_PER_PGD,
-   KERNEL_PGD_PTRS);
-
-   if (PTRS_PER_PMD > 1)
-   return;
-
-   /* must happen under lock */
-   paravirt_alloc_pd_clone(__pa(pgd) >> PAGE_SHIFT,
-   __pa(swapper_pg_dir) >> PAGE_SHIFT,
-   USER_PTRS_PER_PGD, PTRS_PER_PGD - USER_PTRS_PER_PGD);
-
-   pgd_list_add(pgd);
-   spin_unlock_irqrestore(&pgd_lock, flags);
+
+   /* must happen under lock */
+   paravirt_alloc_pd_clone(__pa(pgd) >> PAGE_SHIFT,
+   __pa(swapper_pg_dir) >> PAGE_SHIFT,
+   USER_PTRS_PER_PGD,
+   PTRS_PER_PGD - USER_PTRS_PER_PGD);
+
+   pgd_list_add(pgd);
+   spin_unlock_irqrestore(&pgd_lock, flags);
+   } else {
+   /* PAE, PMD may be shared */
+   if (SHARED_KERNEL_PMD) {
+   clone_pgd_range((pgd_t *)pgd + USER_PTRS_PER_PGD,
+   swapper_pg_dir + USER_PTRS_PER_PGD,
+   KERNEL_PGD_PTRS);
+   } else {
+   spin_lock_irqsave(&pgd_lock, flags);
+   pgd_list_add(pgd);
+   spin_unlock_irqrestore(&pgd_lock, flags);
+   }
+   }
 }
 
 static void pgd_dtor(pgd_t *pgd)
 {
unsigned long flags; /* can be called from interrupt context */
 
-   if (PTRS_PER_PMD == 1)
+   if (SHARED_KERNEL_PMD)
return;
 

[patch 11/20] XEN-paravirt: Add apply_to_page_range() which applies a function to a pte range.

2007-01-13 Thread Jeremy Fitzhardinge
Add a new mm function apply_to_page_range() which applies a given
function to every pte in a given virtual address range in a given mm
structure. This is a generic alternative to cut-and-pasting the Linux
idiomatic pagetable walking code in every place that a sequence of
PTEs must be accessed.

Although this interface is intended to be useful in a wide range of
situations, it is currently used specifically by several Xen
subsystems, for example: to ensure that pagetables have been allocated
for a virtual address range, and to construct batched special
pagetable update requests to map I/O memory (in ioremap()).
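
As a rough illustration of the walking pattern, here is a userspace toy
model, not the kernel implementation. It uses a two-level table instead of
the real pgd/pud/pmd/pte hierarchy, takes no locks, and every toy_ name
and constant is an assumption made up for this sketch. The address and
size are assumed to be page-aligned, as the loops compare against the end
address for equality.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_PAGE_SHIFT   12
#define TOY_PAGE_SIZE    (1UL << TOY_PAGE_SHIFT)
#define TOY_PTRS_PER_PT  16UL
#define TOY_PTRS_PER_DIR 16UL
#define TOY_DIR_SPAN     (TOY_PTRS_PER_PT * TOY_PAGE_SIZE)

typedef unsigned long toy_pte_t;
typedef int (*toy_pte_fn_t)(toy_pte_t *pte, unsigned long addr, void *data);

struct toy_mm {
	toy_pte_t *dir[TOY_PTRS_PER_DIR];	/* leaf tables, allocated lazily */
};

/* Visit the entries of one leaf table in [addr, end), allocating it if absent. */
static int toy_apply_to_pte_range(struct toy_mm *mm, unsigned long addr,
				  unsigned long end, toy_pte_fn_t fn, void *data)
{
	unsigned long idx = addr / TOY_DIR_SPAN;
	toy_pte_t *pte;
	int err;

	if (!mm->dir[idx]) {
		mm->dir[idx] = calloc(TOY_PTRS_PER_PT, sizeof(toy_pte_t));
		if (!mm->dir[idx])
			return -1;
	}
	pte = &mm->dir[idx][(addr >> TOY_PAGE_SHIFT) % TOY_PTRS_PER_PT];
	do {
		err = fn(pte, addr, data);
		if (err)
			break;
	} while (pte++, addr += TOY_PAGE_SIZE, addr != end);
	return err;
}

/* Top level: split [addr, addr + size) at leaf-table boundaries. */
static int toy_apply_to_page_range(struct toy_mm *mm, unsigned long addr,
				   unsigned long size, toy_pte_fn_t fn, void *data)
{
	unsigned long end = addr + size;
	unsigned long next;
	int err;

	assert(addr < end);	/* plays the role of BUG_ON(addr >= end) */
	assert(end <= TOY_PTRS_PER_DIR * TOY_DIR_SPAN);
	do {
		next = (addr + TOY_DIR_SPAN) & ~(TOY_DIR_SPAN - 1);
		if (next > end)
			next = end;
		err = toy_apply_to_pte_range(mm, addr, next, fn, data);
		if (err)
			break;
	} while (addr = next, addr != end);
	return err;
}

/* Example callback: fill in the entry and count how many were visited. */
static int toy_count_pte(toy_pte_t *pte, unsigned long addr, void *data)
{
	*pte = addr | 1;
	(*(unsigned long *)data)++;
	return 0;
}
```

A caller passes a callback plus an opaque data pointer, for example
toy_apply_to_page_range(&mm, start, len, toy_count_pte, &count), and any
non-zero return from the callback stops the walk early. The toy never
frees the leaf tables it allocates.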

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/mm.h |5 ++
 mm/memory.c|   94 
 2 files changed, 99 insertions(+)

===
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1130,6 +1130,11 @@ struct page *follow_page(struct vm_area_
 
 unsigned long __follow_page(void *vaddr);
 
+typedef int (*pte_fn_t)(pte_t *pte, struct page *pmd_page, unsigned long addr,
+   void *data);
+extern int apply_to_page_range(struct mm_struct *mm, unsigned long address,
+  unsigned long size, pte_fn_t fn, void *data);
+
 #ifdef CONFIG_PROC_FS
 void vm_stat_account(struct mm_struct *, unsigned long, struct file *, long);
 #else
===
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1414,6 +1414,100 @@ int remap_pfn_range(struct vm_area_struc
 }
 EXPORT_SYMBOL(remap_pfn_range);
 
+static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
+unsigned long addr, unsigned long end,
+pte_fn_t fn, void *data)
+{
+   pte_t *pte;
+   int err;
+   struct page *pmd_page;
+   spinlock_t *ptl;
+
+   pte = (mm == &init_mm) ?
+   pte_alloc_kernel(pmd, addr) :
+   pte_alloc_map_lock(mm, pmd, addr, &ptl);
+   if (!pte)
+   return -ENOMEM;
+
+   BUG_ON(pmd_huge(*pmd));
+
+   pmd_page = pmd_page(*pmd);
+
+   do {
+   err = fn(pte, pmd_page, addr, data);
+   if (err)
+   break;
+   } while (pte++, addr += PAGE_SIZE, addr != end);
+
+   if (mm != &init_mm)
+   pte_unmap_unlock(pte-1, ptl);
+   return err;
+}
+
+static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,
+unsigned long addr, unsigned long end,
+pte_fn_t fn, void *data)
+{
+   pmd_t *pmd;
+   unsigned long next;
+   int err;
+
+   pmd = pmd_alloc(mm, pud, addr);
+   if (!pmd)
+   return -ENOMEM;
+   do {
+   next = pmd_addr_end(addr, end);
+   err = apply_to_pte_range(mm, pmd, addr, next, fn, data);
+   if (err)
+   break;
+   } while (pmd++, addr = next, addr != end);
+   return err;
+}
+
+static int apply_to_pud_range(struct mm_struct *mm, pgd_t *pgd,
+unsigned long addr, unsigned long end,
+pte_fn_t fn, void *data)
+{
+   pud_t *pud;
+   unsigned long next;
+   int err;
+
+   pud = pud_alloc(mm, pgd, addr);
+   if (!pud)
+   return -ENOMEM;
+   do {
+   next = pud_addr_end(addr, end);
+   err = apply_to_pmd_range(mm, pud, addr, next, fn, data);
+   if (err)
+   break;
+   } while (pud++, addr = next, addr != end);
+   return err;
+}
+
+/*
+ * Scan a region of virtual memory, filling in page tables as necessary
+ * and calling a provided function on each leaf page table.
+ */
+int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
+   unsigned long size, pte_fn_t fn, void *data)
+{
+   pgd_t *pgd;
+   unsigned long next;
+   unsigned long end = addr + size;
+   int err;
+
+   BUG_ON(addr >= end);
+   pgd = pgd_offset(mm, addr);
+   do {
+   next = pgd_addr_end(addr, end);
+   err = apply_to_pud_range(mm, pgd, addr, next, fn, data);
+   if (err)
+   break;
+   } while (pgd++, addr = next, addr != end);
+   return err;
+}
+EXPORT_SYMBOL_GPL(apply_to_page_range);
+
 /*
  * handle_pte_fault chooses page fault handler according to an entry
  * which was read non-atomically.  Before making any commitment, on

-- 


[patch 06/20] XEN-paravirt: remove pgd ctor

2007-01-13 Thread Jeremy Fitzhardinge
Remove the ctor for the pgd cache.  There's no point in having the
cache machinery do this via an indirect call when all pgd are freed in
the one place anyway.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Chris Wright <[EMAIL PROTECTED]>
Cc: Zachary Amsden <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>

--
 arch/i386/mm/init.c|8 +++-
 arch/i386/mm/pgtable.c |   15 +++
 include/asm-i386/pgtable.h |2 --
 3 files changed, 14 insertions(+), 11 deletions(-)

===
--- a/arch/i386/mm/init.c
+++ b/arch/i386/mm/init.c
@@ -739,11 +739,9 @@ void __init pgtable_cache_init(void)
panic("pgtable_cache_init(): cannot create pmd cache");
}
pgd_cache = kmem_cache_create("pgd",
-   PTRS_PER_PGD*sizeof(pgd_t),
-   PTRS_PER_PGD*sizeof(pgd_t),
-   0,
-   pgd_ctor,
-   PTRS_PER_PMD == 1 ? pgd_dtor : NULL);
+ PTRS_PER_PGD*sizeof(pgd_t),
+ PTRS_PER_PGD*sizeof(pgd_t),
+ 0, NULL, NULL);
if (!pgd_cache)
panic("pgtable_cache_init(): Cannot create pgd cache");
 }
===
--- a/arch/i386/mm/pgtable.c
+++ b/arch/i386/mm/pgtable.c
@@ -236,7 +236,7 @@ static inline void pgd_list_del(pgd_t *p
set_page_private(next, (unsigned long)pprev);
 }
 
-void pgd_ctor(void *pgd, struct kmem_cache *cache, unsigned long unused)
+static void pgd_ctor(pgd_t *pgd)
 {
unsigned long flags;
 
@@ -245,7 +245,7 @@ void pgd_ctor(void *pgd, struct kmem_cac
spin_lock_irqsave(&pgd_lock, flags);
}
 
-   clone_pgd_range((pgd_t *)pgd + USER_PTRS_PER_PGD,
+   clone_pgd_range(pgd + USER_PTRS_PER_PGD,
swapper_pg_dir + USER_PTRS_PER_PGD,
KERNEL_PGD_PTRS);
 
@@ -261,10 +261,12 @@ void pgd_ctor(void *pgd, struct kmem_cac
spin_unlock_irqrestore(&pgd_lock, flags);
 }
 
-/* never called when PTRS_PER_PMD > 1 */
-void pgd_dtor(void *pgd, struct kmem_cache *cache, unsigned long unused)
+static void pgd_dtor(pgd_t *pgd)
 {
unsigned long flags; /* can be called from interrupt context */
+
+   if (PTRS_PER_PMD == 1)
+   return;
 
paravirt_release_pd(__pa(pgd) >> PAGE_SHIFT);
spin_lock_irqsave(&pgd_lock, flags);
@@ -276,6 +278,9 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 {
int i;
pgd_t *pgd = kmem_cache_alloc(pgd_cache, GFP_KERNEL);
+
+   if (pgd)
+   pgd_ctor(pgd);
 
if (PTRS_PER_PMD == 1 || !pgd)
return pgd;
@@ -296,6 +301,7 @@ out_oom:
paravirt_release_pd(__pa(pmd) >> PAGE_SHIFT);
kmem_cache_free(pmd_cache, pmd);
}
+   pgd_dtor(pgd);
kmem_cache_free(pgd_cache, pgd);
return NULL;
 }
@@ -313,5 +319,6 @@ void pgd_free(pgd_t *pgd)
kmem_cache_free(pmd_cache, pmd);
}
/* in the non-PAE case, free_pgtables() clears user pgd entries */
+   pgd_dtor(pgd);
kmem_cache_free(pgd_cache, pgd);
 }
===
--- a/include/asm-i386/pgtable.h
+++ b/include/asm-i386/pgtable.h
@@ -41,8 +41,6 @@ extern struct page *pgd_list;
 extern struct page *pgd_list;
 
 void pmd_ctor(void *, struct kmem_cache *, unsigned long);
-void pgd_ctor(void *, struct kmem_cache *, unsigned long);
-void pgd_dtor(void *, struct kmem_cache *, unsigned long);
 void pgtable_cache_init(void);
 void paging_init(void);
 

-- 



[patch 17/20] XEN-paravirt: Add Xen grant table support

2007-01-13 Thread Jeremy Fitzhardinge
Add Xen 'grant table' driver which allows granting of access to
selected local memory pages by other virtual machines and,
symmetrically, the mapping of remote memory pages which other virtual
machines have granted access to.

This driver is a prerequisite for many of the Xen virtual device
drivers, which grant the 'device driver domain' restricted and
temporary access to only those memory pages that are currently
involved in I/O operations.
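
For readers unfamiliar with how free grant references are tracked, the
following userspace sketch shows the array-threaded free list that
get_free_entries() relies on. It is a toy model under stated assumptions:
the toy_ names and the table size are invented, there is no locking, and
toy_put_free_entry() is a hypothetical counterpart that is not taken from
the patch text shown here.

```c
#include <assert.h>

#define TOY_NR_ENTRIES 8
#define TOY_LIST_END   (TOY_NR_ENTRIES + 1)

/* toy_list[i] holds the index of the free entry that follows entry i. */
static unsigned int toy_list[TOY_NR_ENTRIES];
static unsigned int toy_free_head;
static int toy_free_count;

/* Chain every entry onto the free list: 0 -> 1 -> ... -> END. */
static void toy_init(void)
{
	unsigned int i;

	for (i = 0; i < TOY_NR_ENTRIES; i++)
		toy_list[i] = i + 1;
	toy_list[TOY_NR_ENTRIES - 1] = TOY_LIST_END;
	toy_free_head = 0;
	toy_free_count = TOY_NR_ENTRIES;
}

/* Detach a chain of count entries; return its head, or -1 if too few. */
static int toy_get_free_entries(int count)
{
	unsigned int head;
	int ref;

	assert(count >= 1);
	if (toy_free_count < count)
		return -1;
	ref = head = toy_free_head;
	toy_free_count -= count;
	while (count-- > 1)
		head = toy_list[head];
	toy_free_head = toy_list[head];
	toy_list[head] = TOY_LIST_END;	/* terminate the detached chain */
	return ref;
}

/* Push one entry back onto the front of the free list. */
static void toy_put_free_entry(unsigned int ref)
{
	toy_list[ref] = toy_free_head;
	toy_free_head = ref;
	toy_free_count++;
}
```

Because the links live in a plain array indexed by reference number,
allocating a batch costs one walk of the chain and no memory allocation.
The driver itself does this under gnttab_list_lock with interrupts
disabled.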

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/xen/core/Makefile |2
 drivers/xen/core/gnttab.c |  422 ++
 include/xen/gnttab.h  |  105 +++
 3 files changed, 528 insertions(+), 1 deletion(-)

===
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,1 +1,2 @@ obj-y   += console/
+obj-y  += core/
 obj-y  += console/
===
--- /dev/null
+++ b/drivers/xen/core/Makefile
@@ -0,0 +1,1 @@
+obj-y  += grant_table.o
===
--- /dev/null
+++ b/drivers/xen/core/grant_table.c
@@ -0,0 +1,445 @@
+/**
+ * grant_table.c
+ *
+ * Granting foreign access to our memory reservation.
+ *
+ * Copyright (c) 2005, Christopher Clark
+ * Copyright (c) 2004-2005, K A Fraser
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../../../arch/i386/paravirt-xen/xen-page.h"
+
+/* External tools reserve first few grant table entries. */
+#define NR_RESERVED_ENTRIES 8
+
+#define NR_GRANT_ENTRIES \
+   (NR_GRANT_FRAMES * PAGE_SIZE / sizeof(struct grant_entry))
+#define GNTTAB_LIST_END (NR_GRANT_ENTRIES + 1)
+
+static grant_ref_t gnttab_list[NR_GRANT_ENTRIES];
+static int gnttab_free_count;
+static grant_ref_t gnttab_free_head;
+static DEFINE_SPINLOCK(gnttab_list_lock);
+
+static struct grant_entry *shared;
+
+static struct gnttab_free_callback *gnttab_free_callback_list;
+
+static int get_free_entries(int count)
+{
+   unsigned long flags;
+   int ref;
+   grant_ref_t head;
+   spin_lock_irqsave(&gnttab_list_lock, flags);
+   if (gnttab_free_count < count) {
+   spin_unlock_irqrestore(&gnttab_list_lock, flags);
+   return -1;
+   }
+   ref = head = gnttab_free_head;
+   gnttab_free_count -= count;
+   while (count-- > 1)
+   head = gnttab_list[head];
+   gnttab_free_head = gnttab_list[head];
+   gnttab_list[head] = GNTTAB_LIST_END;
+   spin_unlock_irqrestore(&gnttab_list_lock, flags);
+   return ref;
+}
+
+#define get_free_entry() get_free_entries(1)
+
+static void do_free_callbacks(void)
+{
+   struct gnttab_free_callback *callback, *next;
+
+   callback = gnttab_free_callback_list;
+   gnttab_free_callback_list = NULL;
+
+   while (callback != NULL) {
+   next = callback->next;
+   if (gnttab_free_count >= callback->count) {
+   callback->next = NULL;
+   callback->fn(callback->arg);
+   } else {
+   callback->next = gnttab_free_callback_list;
+   gnttab_free_callback_list = callback;
+   }
+   callback = next;
+   }
+}
+
+static inline void check_free_callbacks(void)
+{
+   if 

[patch 09/20] XEN-paravirt: dont export paravirt_ops structure, do individual functions

2007-01-13 Thread Jeremy Fitzhardinge
Wrap the paravirt_ops members we want to export in wrapper functions.
Since we binary-patch the critical ones, this has no speed
impact.

I moved drm_follow_page into the core, to avoid having to wrap the
various pte ops.  Uninlining kernel_fpu_end and using that in the RAID6
code would remove the need to export clts/read_cr0/write_cr0 too.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -596,6 +596,123 @@ static int __init print_banner(void)
return 0;
 }
 core_initcall(print_banner);
+
+unsigned long paravirt_save_flags(void)
+{
+   return paravirt_ops.save_fl();
+}
+EXPORT_SYMBOL(paravirt_save_flags);
+
+void paravirt_restore_flags(unsigned long flags)
+{
+   paravirt_ops.restore_fl(flags);
+}
+EXPORT_SYMBOL(paravirt_restore_flags);
+
+void paravirt_irq_disable(void)
+{
+   paravirt_ops.irq_disable();
+}
+EXPORT_SYMBOL(paravirt_irq_disable);
+
+void paravirt_irq_enable(void)
+{
+   paravirt_ops.irq_enable();
+}
+EXPORT_SYMBOL(paravirt_irq_enable);
+
+void paravirt_io_delay(void)
+{
+   paravirt_ops.io_delay();
+}
+EXPORT_SYMBOL(paravirt_io_delay);
+
+void paravirt_const_udelay(unsigned long loops)
+{
+   paravirt_ops.const_udelay(loops);
+}
+EXPORT_SYMBOL(paravirt_const_udelay);
+
+u64 paravirt_read_msr(unsigned int msr, int *err)
+{
+   return paravirt_ops.read_msr(msr, err);
+}
+EXPORT_SYMBOL(paravirt_read_msr);
+
+int paravirt_write_msr(unsigned int msr, u64 val)
+{
+   return paravirt_ops.write_msr(msr, val);
+}
+EXPORT_SYMBOL(paravirt_write_msr);
+
+u64 paravirt_read_tsc(void)
+{
+   return paravirt_ops.read_tsc();
+}
+EXPORT_SYMBOL(paravirt_read_tsc);
+
+int paravirt_enabled(void)
+{
+   return paravirt_ops.paravirt_enabled;
+}
+EXPORT_SYMBOL(paravirt_enabled);
+
+void clts(void)
+{
+   paravirt_ops.clts();
+}
+EXPORT_SYMBOL(clts);
+
+unsigned long read_cr0(void)
+{
+   return paravirt_ops.read_cr0();
+}
+EXPORT_SYMBOL_GPL(read_cr0);
+
+void write_cr0(unsigned long cr0)
+{
+   paravirt_ops.write_cr0(cr0);
+}
+EXPORT_SYMBOL_GPL(write_cr0);
+
+void wbinvd(void)
+{
+   paravirt_ops.wbinvd();
+}
+EXPORT_SYMBOL(wbinvd);
+
+void raw_safe_halt(void)
+{
+   paravirt_ops.safe_halt();
+}
+EXPORT_SYMBOL_GPL(raw_safe_halt);
+
+void halt(void)
+{
+   paravirt_ops.safe_halt();
+}
+EXPORT_SYMBOL_GPL(halt);
+
+#ifdef CONFIG_X86_LOCAL_APIC
+void apic_write(unsigned long reg, unsigned long v)
+{
+   paravirt_ops.apic_write(reg,v);
+}
+EXPORT_SYMBOL_GPL(apic_write);
+
+unsigned long apic_read(unsigned long reg)
+{
+   return paravirt_ops.apic_read(reg);
+}
+EXPORT_SYMBOL_GPL(apic_read);
+#endif
+
+void __cpuid(unsigned int *eax, unsigned int *ebx,
+unsigned int *ecx, unsigned int *edx)
+{
+   paravirt_ops.cpuid(eax, ebx, ecx, edx);
+}
+EXPORT_SYMBOL(__cpuid);
 
 /* We simply declare start_kernel to be the paravirt probe of last resort. */
 paravirt_probe(start_kernel);
@@ -712,11 +829,3 @@ struct paravirt_ops paravirt_ops = {
 
.startup_ipi_hook = (void *)native_nop,
 };
-
-/*
- * NOTE: CONFIG_PARAVIRT is experimental and the paravirt_ops
- * semantics are subject to change. Hence we only do this
- * internal-only export of this, until it gets sorted out and
- * all lowlevel CPU ops used by modules are separately exported.
- */
-EXPORT_SYMBOL_GPL(paravirt_ops);
===
--- a/include/asm-i386/delay.h
+++ b/include/asm-i386/delay.h
@@ -17,9 +17,9 @@ extern void __delay(unsigned long loops)
 extern void __delay(unsigned long loops);
 
 #if defined(CONFIG_PARAVIRT) && !defined(USE_REAL_TIME_DELAY)
-#define udelay(n) paravirt_ops.const_udelay((n) * 0x10c7ul)
+#define udelay(n) paravirt_const_udelay((n) * 0x10c7ul)
 
-#define ndelay(n) paravirt_ops.const_udelay((n) * 5ul)
+#define ndelay(n) paravirt_const_udelay((n) * 5ul)
 
 #else /* !PARAVIRT || USE_REAL_TIME_DELAY */
 
===
--- a/include/asm-i386/paravirt.h
+++ b/include/asm-i386/paravirt.h
@@ -218,8 +218,6 @@ fastcall pgd_t native_make_pgd(unsigned 
 fastcall pgd_t native_make_pgd(unsigned long pgd);
 #endif
 
-#define paravirt_enabled() (paravirt_ops.paravirt_enabled)
-
 static inline void load_esp0(struct tss_struct *tss,
 struct thread_struct *thread)
 {
@@ -243,11 +241,8 @@ static inline void do_time_init(void)
 }
 
 /* The paravirtualized CPUID instruction. */
-static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
-  unsigned int *ecx, unsigned int *edx)
-{
-   paravirt_ops.cpuid(eax, ebx, ecx, edx);
-}
+void __cpuid(unsigned int *eax, unsigned int *ebx,
+unsigned int *ecx, unsigned int *edx);
 
 /*
  * These special macros can be used to get or set a debugging register

[patch 16/20] XEN-paravirt: Add the Xen virtual console driver.

2007-01-13 Thread Jeremy Fitzhardinge
This provides a bootstrap and ongoing emergency console which is
intended to be available from very early during boot and at all times
thereafter, in contrast with alternatives such as UDP-based syslogd,
or logging in via ssh. The protocol is based on a simple shared-memory
ring buffer.
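
As a rough illustration of the shared-memory ring idea (this is not the
driver's code), here is a minimal single-producer, single-consumer byte
ring with free-running indices.  The names struct byte_ring, ring_put()
and ring_get() and the 16-byte size are made up for this sketch; the
real ring layout comes from the Xen interface headers, and a real shared
ring also needs memory barriers and an event-channel notification, both
omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u                 /* hypothetical; must be a power of two */
#define RING_MASK(idx) ((idx) & (RING_SIZE - 1))

struct byte_ring {
	char     buf[RING_SIZE];
	uint32_t cons;                /* free-running consumer index */
	uint32_t prod;                /* free-running producer index */
};

/* Copy up to len bytes into the ring; returns the number actually queued. */
static size_t ring_put(struct byte_ring *r, const char *data, size_t len)
{
	size_t sent = 0;

	assert(r != NULL && data != NULL);
	/* prod - cons is the fill level; it stays valid across wraparound. */
	while (sent < len && (r->prod - r->cons) < RING_SIZE) {
		r->buf[RING_MASK(r->prod)] = data[sent++];
		r->prod++;
	}
	return sent;
}

/* Drain up to len bytes from the ring; returns the number actually read. */
static size_t ring_get(struct byte_ring *r, char *out, size_t len)
{
	size_t got = 0;

	assert(r != NULL && out != NULL);
	while (got < len && r->cons != r->prod) {
		out[got++] = r->buf[RING_MASK(r->cons)];
		r->cons++;
	}
	return got;
}
```

Because the indices are never reduced modulo the ring size when stored,
"full" (prod - cons == RING_SIZE) and "empty" (prod == cons) stay
distinguishable without sacrificing a slot.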

Signed-off-by: Ian Pratt <[EMAIL PROTECTED]>
Signed-off-by: Christian Limpach <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>

---
 arch/i386/kernel/early_printk.c|2 
 arch/i386/paravirt-xen/enlighten.c |3 
 drivers/Makefile   |3 
 drivers/xen/Makefile   |1 
 drivers/xen/console/Makefile   |2 
 drivers/xen/console/console.c  |  588 
 drivers/xen/console/xencons_ring.c |  144 
 include/xen/xencons.h  |   14 
 init/main.c|2 
 9 files changed, 759 insertions(+)

===
--- a/arch/i386/kernel/early_printk.c
+++ b/arch/i386/kernel/early_printk.c
@@ -1,2 +1,4 @@
 
+#ifndef CONFIG_XEN
 #include "../../x86_64/kernel/early_printk.c"
+#endif
===
--- a/arch/i386/paravirt-xen/enlighten.c
+++ b/arch/i386/paravirt-xen/enlighten.c
@@ -798,6 +798,9 @@ static asmlinkage void __init xen_start_
INITRD_START = xen_start_info->mod_start ? 
__pa(xen_start_info->mod_start) : 0;
INITRD_SIZE = xen_start_info->mod_len;
 
+   /* use Xen console */
+   vgacon_enabled = 0;
+
/* Start the world */
start_kernel();
 }
===
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -14,6 +14,9 @@ obj-$(CONFIG_ACPI)+= acpi/
 # was used and do nothing if so
 obj-$(CONFIG_PNP)  += pnp/
 obj-$(CONFIG_ARM_AMBA) += amba/
+
+# Xen is the default console when running as a guest
+obj-$(CONFIG_XEN)  += xen/
 
 # char/ comes before serial/ etc so that the VT console is the boot-time
 # default.
===
--- /dev/null
+++ b/drivers/xen/Makefile
@@ -0,0 +1,1 @@
+obj-y  += console/
===
--- /dev/null
+++ b/drivers/xen/console/Makefile
@@ -0,0 +1,2 @@
+
+obj-y  := console.o xencons_ring.o
===
--- /dev/null
+++ b/drivers/xen/console/console.c
@@ -0,0 +1,588 @@
+/**
+ * console.c
+ * 
+ * Virtual console driver.
+ * 
+ * Copyright (c) 2002-2004, K A Fraser.
+ * 
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ * 
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ * 
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../../../arch/i386/paravirt-xen/events.h"
+#include 
+
+MODULE_LICENSE("Dual BSD/GPL");
+
+static int xc_disabled = 0;
+static int xc_num = -1;
+
+/* /dev/xvc0 device number allocated by lanana.org. */
+#define XEN_XVC_MAJOR 204
+#define XEN_XVC_MINOR 191
+
+#ifdef CONFIG_MAGIC_SYSRQ
+static unsigned long sysrq_requested;
+#endif
+
+static int __init xencons_setup(char *str)
+{
+   if (!strcmp(str, "off"))
+   xc_disabled = 1;
+   return 1;
+}
+__setup("xencons=", xencons_setup);
+
+/* The kernel and user-land drivers share a common transmit buffer. */
+static 

[patch 04/20] XEN-paravirt: paravirt pagetable init

2007-01-13 Thread Jeremy Fitzhardinge
Add paravirt hooks into the initial pagetable setup.  In the native
case, the kernel builds itself a new initial pagetable from scratch.
In the Xen case, the kernel starts with a pagetable provided by the
hypervisor, which is used as the prototype for the kernel-generated
pagetable.  The hooks added in this patch allow either mode of
operation without having special cases (the main change to the
pagetable construction logic is a test to make sure a pagetable
slot is actually empty before populating it).
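
To illustrate the "only populate an empty slot" rule, here is a toy
model in plain C; it is not kernel code.  The entry type, the
TOY_PRESENT bit and toy_populate() are all invented for the example,
whereas the real code works on pgd_t/pmd_t entries with the kernel's own
accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PRESENT 0x1u   /* stand-in for a "present" bit; hypothetical */
#define TOY_SLOTS   4

typedef uint32_t toy_entry_t;

/*
 * Fill slot idx with new_val only if nothing is installed there yet.
 * Returns 1 if the slot was populated, 0 if an existing entry was kept.
 */
static int toy_populate(toy_entry_t *table, size_t idx, toy_entry_t new_val)
{
	assert(table != NULL && idx < TOY_SLOTS);

	if (table[idx] & TOY_PRESENT)
		return 0;          /* e.g. an entry inherited from the hypervisor */

	table[idx] = new_val | TOY_PRESENT;
	return 1;
}
```

With a table that starts out all zero (the native case) every call
populates; with a pre-filled table (the Xen case) existing entries are
left alone, so the same loop serves both.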

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Chris Wright <[EMAIL PROTECTED]>
Cc: Zachary Amsden <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -379,6 +379,43 @@ static fastcall void native_io_delay(voi
 {
asm volatile("outb %al,$0x80");
 }
+
+void native_pagetable_setup_start(pgd_t *base)
+{
+#ifdef CONFIG_X86_PAE
+   int i;
+
+   /*
+* Init entries of the first-level page table to the
+* zero page, if they haven't already been set up.
+*
+* In a normal native boot, we'll be running on a
+* pagetable rooted in swapper_pg_dir, but not in PAE
+* mode, so this will end up clobbering the mappings
+* for the lower 24Mbytes of the address space,
+* without affecting the kernel address space.
+*/
+   for (i = 0; i < USER_PTRS_PER_PGD; i++)
+       set_pgd(&base[i],
+               __pgd(__pa(empty_zero_page) | _PAGE_PRESENT));
+   memset(&base[USER_PTRS_PER_PGD], 0, sizeof(pgd_t));
+#endif
+}
+
+void native_pagetable_setup_done(pgd_t *base)
+{
+#ifdef CONFIG_X86_PAE
+   /*
+* Add low memory identity-mappings - SMP needs it when
+* starting up on an AP from real-mode. In the non-PAE
+* case we already have these mappings through head.S.
+* All user-space mappings are explicitly cleared after
+* SMP startup.
+*/
+   set_pgd(&base[0], base[USER_PTRS_PER_PGD]);
+#endif
+}
+
 
 static fastcall void native_flush_tlb(void)
 {
@@ -627,6 +664,9 @@ struct paravirt_ops paravirt_ops = {
 #endif
.set_lazy_mode = (void *)native_nop,
 
+   .pagetable_setup_start = native_pagetable_setup_start,
+   .pagetable_setup_done = native_pagetable_setup_done,
+
.flush_tlb_user = native_flush_tlb,
.flush_tlb_kernel = native_flush_tlb_global,
.flush_tlb_single = native_flush_tlb_single,
===
--- a/arch/i386/mm/init.c
+++ b/arch/i386/mm/init.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 unsigned int __VMALLOC_RESERVE = 128 << 20;
 
@@ -62,6 +63,8 @@ static pmd_t * __init one_md_table_init(

 #ifdef CONFIG_X86_PAE
pmd_table = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE);
+   memset(pmd_table, 0, PAGE_SIZE);
+
paravirt_alloc_pd(__pa(pmd_table) >> PAGE_SHIFT);
set_pgd(pgd, __pgd(__pa(pmd_table) | _PAGE_PRESENT));
pud = pud_offset(pgd, 0);
@@ -83,12 +86,11 @@ static pte_t * __init one_page_table_ini
 {
if (pmd_none(*pmd)) {
pte_t *page_table = (pte_t *) 
alloc_bootmem_low_pages(PAGE_SIZE);
+   memset(page_table, 0, PAGE_SIZE);
+
paravirt_alloc_pt(__pa(page_table) >> PAGE_SHIFT);
set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE));
-   if (page_table != pte_offset_kernel(pmd, 0))
-   BUG();  
-
-   return page_table;
+   BUG_ON(page_table != pte_offset_kernel(pmd, 0));
}

return pte_offset_kernel(pmd, 0);
@@ -119,7 +121,7 @@ static void __init page_table_range_init
pgd = pgd_base + pgd_idx;
 
for ( ; (pgd_idx < PTRS_PER_PGD) && (vaddr != end); pgd++, pgd_idx++) {
-   if (pgd_none(*pgd)) 
+   if (!(pgd_val(*pgd) & _PAGE_PRESENT)) 
one_md_table_init(pgd);
pud = pud_offset(pgd, vaddr);
pmd = pmd_offset(pud, vaddr);
@@ -158,7 +160,11 @@ static void __init kernel_physical_mappi
pfn = 0;
 
for (; pgd_idx < PTRS_PER_PGD; pgd++, pgd_idx++) {
-   pmd = one_md_table_init(pgd);
+   if (!(pgd_val(*pgd) & _PAGE_PRESENT))
+   pmd = one_md_table_init(pgd);
+   else
+   pmd = pmd_offset(pud_offset(pgd, PAGE_OFFSET), PAGE_OFFSET);
+
if (pfn >= max_low_pfn)
continue;
for (pmd_idx = 0; pmd_idx < PTRS_PER_PMD && pfn < max_low_pfn; 
pmd++, pmd_idx++) {
@@ -167,20 +173,26 @@ static void __init kernel_physical_mappi
/* Map with big pages if possible, otherwise create 
normal page tables. */

[patch 02/20] XEN-paravirt: Add a flag to allow the VGA console to be disabled

2007-01-13 Thread Jeremy Fitzhardinge
Add a flag to allow the VGA console to be disabled.  The VGA code will
spin forever if there isn't any real VGA hardware, which will happen
under Xen.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Signed-off-by: Gerd Hoffmann <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>

===
--- a/arch/i386/kernel/setup.c
+++ b/arch/i386/kernel/setup.c
@@ -71,6 +71,9 @@ unsigned long init_pg_tables_end __initd
 
 int disable_pse __devinitdata = 0;
 
+/* used to disable vga_con at runtime */
+int vgacon_enabled = 1;
+ 
 /*
  * Machine setup..
  */
@@ -652,7 +655,7 @@ void __init setup_arch(char **cmdline_p)
 
 #ifdef CONFIG_VT
 #if defined(CONFIG_VGA_CONSOLE)
-   if (!efi_enabled || (efi_mem_type(0xa0000) != EFI_CONVENTIONAL_MEMORY))
+   if (vgacon_enabled && (!efi_enabled || (efi_mem_type(0xa0000) != EFI_CONVENTIONAL_MEMORY)))
        conswitchp = &vga_con;
 #elif defined(CONFIG_DUMMY_CONSOLE)
    conswitchp = &dummy_con;
===
--- a/include/asm-i386/setup.h
+++ b/include/asm-i386/setup.h
@@ -77,6 +77,8 @@ void __init add_memory_region(unsigned l
 void __init add_memory_region(unsigned long long start,
  unsigned long long size, int type);
 
+extern int vgacon_enabled;
+
 #endif /* __ASSEMBLY__ */
 
 #endif  /*  __KERNEL__  */



[patch 01/20] XEN-paravirt: Fix typo in sync_constant_test_bit()s name.

2007-01-13 Thread Jeremy Fitzhardinge
Fix typo in sync_constant_test_bit()'s name.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

===
--- a/include/asm-i386/sync_bitops.h
+++ b/include/asm-i386/sync_bitops.h
@@ -130,7 +130,7 @@ static inline int sync_test_and_change_b
return oldbit;
 }
 
-static __always_inline int sync_const_test_bit(int nr, const volatile unsigned long *addr)
+static __always_inline int sync_constant_test_bit(int nr, const volatile unsigned long *addr)
 {
return ((1UL << (nr & 31)) &
(((const volatile unsigned int *)addr)[nr >> 5])) != 0;



[patch 00/20] XEN-paravirt: Xen guest implementation for paravirt_ops interface

2007-01-13 Thread Jeremy Fitzhardinge
This patch series implements the Linux Xen guest in terms of the
paravirt-ops interface.  The features implemented in this patch series
are:
 * domU only
 * UP only (most code is SMP-safe, but there's no way to create a new vcpu)
 * writable pagetables, with late pinning/early unpinning
   (no shadow pagetable support)
 * supports both PAE and non-PAE modes
 * xen console
 * virtual block device (blockfront)

(Netfront needs a bit of updating, and will be in a separate patch later.)

The patch series is in two parts:

1-11: cleanups to the core kernel, either to fix outright problems,
  or to add appropriate hooks for Xen
12-20: the Xen guest implementation itself

I've tried to make each patch as self-explanatory as possible.  The
series is based on 2.6.20-rc4-mm1.

Thanks,
J


Re: [Linux-fbdev-devel] Display class

2007-01-13 Thread James Simmons

Andrew please apply this patch.

Signed-off-by: James Simmons <[EMAIL PROTECTED]>

> 
> > Hi,
> > 
> > On Tuesday 05 December 2006 13:03, James Simmons wrote:
> > > +int probe_edid(struct display_device *dev, void *data)
> > > +{
> > > +   struct fb_monspecs spec;
> > > +   ssize_t size = 45;
> 
> That code was only for testing. I do have new core code. Andrew, could
> you merge this patch? It is against the -mm tree.
> 
> This new class provides a common interface for various types of
> displays such as LCD, CRT, LVDS etc. It is an expansion of the lcd
> class to include other types of displays.
>   
> diff -urN linux-2.6.20-rc3/drivers/video/display/display-sysfs.c linux-2.6.20-rc3-display/drivers/video/display/display-sysfs.c
> --- linux-2.6.20-rc3/drivers/video/display/display-sysfs.c	1969-12-31 19:00:00.0 -0500
> +++ linux-2.6.20-rc3-display/drivers/video/display/display-sysfs.c	2007-01-13 16:22:54.0 -0500
> @@ -0,0 +1,208 @@
> +/*
> + *  display.c - Display output driver
> + *
> + *  Copyright (C) 2007 James Simmons <[EMAIL PROTECTED]>
> + *
> + * ~~
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or (at
> + *  your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful, but
> + *  WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *  General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License along
> + *  with this program; if not, write to the Free Software Foundation, Inc.,
> + *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
> + *
> + * ~~
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static ssize_t display_show_name(struct class_device *cdev, char *buf)
> +{
> + struct display_device *dsp = to_display_device(cdev);
> + return snprintf(buf, PAGE_SIZE, "%s\n", dsp->name);
> +}
> +
> +static ssize_t display_show_type(struct class_device *cdev, char *buf)
> +{
> + struct display_device *dsp = to_display_device(cdev);
> + return snprintf(buf, PAGE_SIZE, "%s\n", dsp->driver->type);
> +}
> +
> +static ssize_t display_show_power(struct class_device *cdev, char *buf)
> +{
> + struct display_device *dsp = to_display_device(cdev);
> + ssize_t ret = -ENXIO;
> +
> + mutex_lock(&dsp->lock);
> + if (likely(dsp->driver->get_power))
> +     ret = sprintf(buf,"%.8x\n", dsp->driver->get_power(dsp));
> + mutex_unlock(&dsp->lock);
> + return ret;
> +}
> +
> +static ssize_t display_store_power(struct class_device *cdev,
> + const char *buf, size_t count)
> +{
> + struct display_device *dsp = to_display_device(cdev);
> + ssize_t size;
> + char *endp;
> + int power;
> +
> + power = simple_strtoul(buf, &endp, 0);
> + size = endp - buf;
> + if (*endp && isspace(*endp))
> +     size++;
> + if (size != count)
> +     return -EINVAL;
> +
> + mutex_lock(&dsp->lock);
> + if (likely(dsp->driver->set_power)) {
> +     dsp->request_state = power;
> +     dsp->driver->set_power(dsp);
> + }
> + mutex_unlock(&dsp->lock);
> + return count;
> +}
> +
> +static ssize_t display_show_contrast(struct class_device *cdev, char *buf)
> +{
> + struct display_device *dsp = to_display_device(cdev);
> + ssize_t rc = -ENXIO;
> +
> + mutex_lock(&dsp->lock);
> + if (likely(dsp->driver) && dsp->driver->get_contrast)
> +     rc = sprintf(buf, "%d\n", dsp->driver->get_contrast(dsp));
> + mutex_unlock(&dsp->lock);
> + return rc;
> +}
> +
> +static ssize_t display_store_contrast(struct class_device *cdev, const char *buf, size_t count)
> +{
> + struct display_device *dsp = to_display_device(cdev);
> + ssize_t ret = -EINVAL, size;
> + int contrast;
> + char *endp;
> +
> + contrast = simple_strtoul(buf, &endp, 0);
> + size = endp - buf;
> +
> + if (*endp && isspace(*endp))
> +     size++;
> +
> + if (size != count)
> +     return ret;
> +
> + mutex_lock(&dsp->lock);
> + if (likely(dsp->driver && dsp->driver->set_contrast)) {
> +     pr_debug("display: set contrast to %d\n", contrast);
> +     dsp->driver->set_contrast(dsp, contrast);
> +     ret = count;
> + }
> + mutex_unlock(&dsp->lock);
> + return ret;
> +}
> +
> +static ssize_t display_show_max_contrast(struct class_device *cdev, char *buf)
> +{
> + struct display_device *dsp = to_display_device(cdev);
> + ssize_t rc = -ENXIO;
> +
> + mutex_lock(&dsp->lock);
> + if (likely(dsp->driver))
> 

Re: [-mm patch] make mmc_sysfs.c:mmc_key_type static

2007-01-13 Thread Pierre Ossman
Adrian Bunk wrote:
> On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote:
>> ...
>> Changes since 2.6.20-rc3-mm1:
>> ...
>>  git-mmc.patch
>> ...
>>  git trees
>> ...
> 
> 
> This patch makes the needlessly global struct mmc_key_type static.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> 

Thanks, applied.

Rgds
Pierre






Re: Display class

2007-01-13 Thread James Simmons

> Hi,
> 
> On Tuesday 05 December 2006 13:03, James Simmons wrote:
> > +int probe_edid(struct display_device *dev, void *data)
> > +{
> > +   struct fb_monspecs spec;
> > +   ssize_t size = 45;

That code was only for testing. I do have new core code. Andrew, could
you merge this patch? It is against the -mm tree.

This new class provides a common interface for various types of
displays such as LCD, CRT, LVDS etc. It is an expansion of the lcd
class to include other types of displays.
  
diff -urN linux-2.6.20-rc3/drivers/video/display/display-sysfs.c linux-2.6.20-rc3-display/drivers/video/display/display-sysfs.c
--- linux-2.6.20-rc3/drivers/video/display/display-sysfs.c	1969-12-31 19:00:00.0 -0500
+++ linux-2.6.20-rc3-display/drivers/video/display/display-sysfs.c	2007-01-13 16:22:54.0 -0500
@@ -0,0 +1,208 @@
+/*
+ *  display.c - Display output driver
+ *
+ *  Copyright (C) 2007 James Simmons <[EMAIL PROTECTED]>
+ *
+ * ~~
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or (at
+ *  your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but
+ *  WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, write to the Free Software Foundation, Inc.,
+ *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * ~~
+ */
+#include 
+#include 
+#include 
+#include 
+
+static ssize_t display_show_name(struct class_device *cdev, char *buf)
+{
+   struct display_device *dsp = to_display_device(cdev);
+   return snprintf(buf, PAGE_SIZE, "%s\n", dsp->name);
+}
+
+static ssize_t display_show_type(struct class_device *cdev, char *buf)
+{
+   struct display_device *dsp = to_display_device(cdev);
+   return snprintf(buf, PAGE_SIZE, "%s\n", dsp->driver->type);
+}
+
+static ssize_t display_show_power(struct class_device *cdev, char *buf)
+{
+   struct display_device *dsp = to_display_device(cdev);
+   ssize_t ret = -ENXIO;
+
+   mutex_lock(&dsp->lock);
+   if (likely(dsp->driver->get_power))
+       ret = sprintf(buf,"%.8x\n", dsp->driver->get_power(dsp));
+   mutex_unlock(&dsp->lock);
+   return ret;
+}
+
+static ssize_t display_store_power(struct class_device *cdev,
+   const char *buf, size_t count)
+{
+   struct display_device *dsp = to_display_device(cdev);
+   ssize_t size;
+   char *endp;
+   int power;
+
+   power = simple_strtoul(buf, &endp, 0);
+   size = endp - buf;
+   if (*endp && isspace(*endp))
+       size++;
+   if (size != count)
+       return -EINVAL;
+
+   mutex_lock(&dsp->lock);
+   if (likely(dsp->driver->set_power)) {
+       dsp->request_state = power;
+       dsp->driver->set_power(dsp);
+   }
+   mutex_unlock(&dsp->lock);
+   return count;
+}
+
+static ssize_t display_show_contrast(struct class_device *cdev, char *buf)
+{
+   struct display_device *dsp = to_display_device(cdev);
+   ssize_t rc = -ENXIO;
+
+   mutex_lock(&dsp->lock);
+   if (likely(dsp->driver) && dsp->driver->get_contrast)
+       rc = sprintf(buf, "%d\n", dsp->driver->get_contrast(dsp));
+   mutex_unlock(&dsp->lock);
+   return rc;
+}
+
+static ssize_t display_store_contrast(struct class_device *cdev, const char *buf, size_t count)
+{
+   struct display_device *dsp = to_display_device(cdev);
+   ssize_t ret = -EINVAL, size;
+   int contrast;
+   char *endp;
+
+   contrast = simple_strtoul(buf, &endp, 0);
+   size = endp - buf;
+
+   if (*endp && isspace(*endp))
+       size++;
+
+   if (size != count)
+       return ret;
+
+   mutex_lock(&dsp->lock);
+   if (likely(dsp->driver && dsp->driver->set_contrast)) {
+       pr_debug("display: set contrast to %d\n", contrast);
+       dsp->driver->set_contrast(dsp, contrast);
+       ret = count;
+   }
+   mutex_unlock(&dsp->lock);
+   return ret;
+}
+
+static ssize_t display_show_max_contrast(struct class_device *cdev, char *buf)
+{
+   struct display_device *dsp = to_display_device(cdev);
+   ssize_t rc = -ENXIO;
+
+   mutex_lock(&dsp->lock);
+   if (likely(dsp->driver))
+       rc = sprintf(buf, "%d\n", dsp->driver->max_contrast);
+   mutex_unlock(&dsp->lock);
+   return rc;
+}
+
+static void display_class_release(struct class_device *dev)
+{
+   struct display_device *dsp = to_display_device(dev);
+   

[PATCH -mm] MMCONFIG: Reject a broken MCFG tables on Asus etc

2007-01-13 Thread OGAWA Hirofumi
This rejects broken MCFG tables on Asus etc.
Arjan and Andi suggested this.

Signed-off-by: OGAWA Hirofumi <[EMAIL PROTECTED]>
---

 arch/i386/pci/mmconfig-shared.c |   24 ++-
 arch/i386/pci/mmconfig.c|9 ---
 arch/x86_64/pci/mmconfig.c  |   50 +++-
 3 files changed, 37 insertions(+), 46 deletions(-)

diff -puN arch/i386/pci/mmconfig-shared.c~pci-mmconfig-reject-mcfg_broken arch/i386/pci/mmconfig-shared.c
--- linux-2.6/arch/i386/pci/mmconfig-shared.c~pci-mmconfig-reject-mcfg_broken	2007-01-12 23:15:58.0 +0900
+++ linux-2.6-hirofumi/arch/i386/pci/mmconfig-shared.c	2007-01-12 23:15:58.0 +0900
@@ -207,6 +207,26 @@ static void __init pci_mmcfg_insert_reso
}
 }
 
+static void __init pci_mmcfg_reject_broken(void)
+{
+   struct acpi_table_mcfg_config *cfg = &pci_mmcfg_config[0];
+
+   /*
+    * Handle more broken MCFG tables on Asus etc.
+    * They only contain a single entry for bus 0-0.
+    */
+   if (pci_mmcfg_config_num == 1 &&
+       cfg->pci_segment_group_number == 0 &&
+       (cfg->start_bus_number | cfg->end_bus_number) == 0) {
+       kfree(pci_mmcfg_config);
+       pci_mmcfg_config = NULL;
+       pci_mmcfg_config_num = 0;
+
+       printk(KERN_ERR "PCI: start and end of bus number is 0. "
+              "Rejected as broken MCFG.");
+   }
+}
+
 void __init pci_mmcfg_init(int type)
 {
int known_bridge = 0;
@@ -217,8 +237,10 @@ void __init pci_mmcfg_init(int type)
if (type == 1 && pci_mmcfg_check_hostbridge())
known_bridge = 1;
 
-   if (!known_bridge)
+   if (!known_bridge) {
acpi_table_parse(ACPI_MCFG, acpi_parse_mcfg);
+   pci_mmcfg_reject_broken();
+   }
 
if ((pci_mmcfg_config_num == 0) ||
(pci_mmcfg_config == NULL) ||
diff -puN arch/i386/pci/mmconfig.c~pci-mmconfig-reject-mcfg_broken arch/i386/pci/mmconfig.c
--- linux-2.6/arch/i386/pci/mmconfig.c~pci-mmconfig-reject-mcfg_broken	2007-01-12 23:15:58.0 +0900
+++ linux-2.6-hirofumi/arch/i386/pci/mmconfig.c	2007-01-12 23:15:58.0 +0900
@@ -42,15 +42,6 @@ static u32 get_base_addr(unsigned int se
return cfg->base_address;
}
 
-   /* Handle more broken MCFG tables on Asus etc.
-  They only contain a single entry for bus 0-0. Assume
-  this applies to all busses. */
-   cfg = &pci_mmcfg_config[0];
-   if (pci_mmcfg_config_num == 1 &&
-   cfg->pci_segment_group_number == 0 &&
-   (cfg->start_bus_number | cfg->end_bus_number) == 0)
-   return cfg->base_address;
-
/* Fall back to type 0 */
return 0;
 }
diff -puN arch/x86_64/pci/mmconfig.c~pci-mmconfig-reject-mcfg_broken arch/x86_64/pci/mmconfig.c
--- linux-2.6/arch/x86_64/pci/mmconfig.c~pci-mmconfig-reject-mcfg_broken	2007-01-12 23:15:58.0 +0900
+++ linux-2.6-hirofumi/arch/x86_64/pci/mmconfig.c	2007-01-12 23:20:25.0 +0900
@@ -20,39 +20,6 @@ struct mmcfg_virt {
 };
 static struct mmcfg_virt *pci_mmcfg_virt;
 
-static inline int mcfg_broken(void)
-{
-   struct acpi_table_mcfg_config *cfg = &pci_mmcfg_config[0];
-
-   /* Handle more broken MCFG tables on Asus etc.
-  They only contain a single entry for bus 0-0. Assume
-  this applies to all busses. */
-   if (pci_mmcfg_config_num == 1 &&
-   cfg->pci_segment_group_number == 0 &&
-   (cfg->start_bus_number | cfg->end_bus_number) == 0)
-   return 1;
-   return 0;
-}
-
-static void __iomem *mcfg_ioremap(struct acpi_table_mcfg_config *cfg)
-{
-   void __iomem *addr;
-   u32 size;
-
-   if (mcfg_broken())
-   size = 256 << 20;
-   else
-   size = (cfg->end_bus_number + 1) << 20;
-
-   addr = ioremap_nocache(cfg->base_address, size);
-   if (addr) {
-   printk(KERN_INFO "PCI: Using MMCONFIG at %x - %x\n",
-  cfg->base_address,
-  cfg->base_address + size - 1);
-   }
-   return addr;
-}
-
 static char __iomem *get_virt(unsigned int seg, unsigned bus)
 {
struct acpi_table_mcfg_config *cfg;
@@ -66,9 +33,6 @@ static char __iomem *get_virt(unsigned i
return pci_mmcfg_virt[cfg_num].virt;
}
 
-   if (mcfg_broken())
-   return pci_mmcfg_virt[0].virt;
-
/* Fall back to type 0 */
return NULL;
 }
@@ -154,6 +118,20 @@ int __init pci_mmcfg_arch_reachable(unsi
return pci_dev_base(seg, bus, devfn) != NULL;
 }
 
+static void __iomem * __init mcfg_ioremap(struct acpi_table_mcfg_config *cfg)
+{
+   void __iomem *addr;
+   u32 size;
+
+   size = (cfg->end_bus_number + 1) << 20;
+   addr = ioremap_nocache(cfg->base_address, size);
+   if (addr) {
+   printk(KERN_INFO "PCI: Using MMCONFIG at %x - 

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-13 Thread Andrew Morton
> On Sat, 13 Jan 2007 11:53:34 -0800 Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote:
> On Sat, Jan 13, 2007 at 12:00:17AM -0800, Andrew Morton wrote:
> > > On Fri, 12 Jan 2007 23:36:43 -0800 Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote:
> > > > >void __lockfunc _spin_lock_irq(spinlock_t *lock)
> > > > >{
> > > > >local_irq_disable();
> > > > >> rdtsc(t1);
> > > > >preempt_disable();
> > > > >spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> > > > >_raw_spin_lock(lock);
> > > > >> rdtsc(t2);
> > > > >if (lock->spin_time < (t2 - t1))
> > > > >lock->spin_time = t2 - t1;
> > > > >}
> > > > >
> > > > >On some runs, we found that the zone->lru_lock spun for 33 seconds or more
> > > > >while the maximal CS time was 3 seconds or so.
> > > > 
> > > > What is the "CS time"?
> > > 
> > > Critical Section :).  This is the maximal time interval I measured  from 
> > > t2 above to the time point we release the spin lock.  This is the hold 
> > > time I guess.
> > 
> > By no means.  The theory here is that CPUA is taking and releasing the
> > lock at high frequency, but CPUB never manages to get in and take it.  In
> > which case the maximum-acquisition-time is much larger than the
> > maximum-hold-time.
> > 
> > I'd suggest that you use a similar trick to measure the maximum hold time:
> > start the timer after we got the lock, stop it just before we release the
> > lock (assuming that the additional rdtsc delay doesn't "fix" things, of
> > course...)
> 
> Well, that is exactly what I described above  as CS time.

Seeing the code helps.

>  The
> instrumentation goes like this:
> 
> void __lockfunc _spin_lock_irq(spinlock_t *lock)
> {
>         unsigned long long t1, t2;
>         local_irq_disable();
>         t1 = get_cycles_sync();
>         preempt_disable();
>         spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
>         _raw_spin_lock(lock);
>         t2 = get_cycles_sync();
>         lock->raw_lock.htsc = t2;
>         if (lock->spin_time < (t2 - t1))
>                 lock->spin_time = t2 - t1;
> }
> ...
> 
> void __lockfunc _spin_unlock_irq(spinlock_t *lock)
> {
>         unsigned long long t1;
>         spin_release(&lock->dep_map, 1, _RET_IP_);
>         t1 = get_cycles_sync();
>         if (lock->cs_time < (t1 - lock->raw_lock.htsc))
>                 lock->cs_time = t1 - lock->raw_lock.htsc;
>         _raw_spin_unlock(lock);
>         local_irq_enable();
>         preempt_enable();
> }
> 
> Am I missing something?  Is this not what you just described? (The
> synchronizing rdtsc might not be really required at all locations, but I 
> doubt if it would contribute a significant fraction to 33s  or even 
> the 3s hold time on a 2.6 GHZ opteron).

OK, now we need to do a dump_stack() each time we discover a new max hold
time.  That might be a bit tricky: the printk code does spinlocking too so
things could go recursively deadlocky.  Maybe make spin_unlock_irq() return
the hold time then do:

void lru_spin_unlock_irq(struct zone *zone)
{
        long this_time;

        this_time = spin_unlock_irq(&zone->lru_lock);
        if (this_time > zone->max_time) {
                zone->max_time = this_time;
                dump_stack();
        }
}

or similar.
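
The bookkeeping discussed in this thread can be sketched in plain C, outside the kernel. This is only an illustration under stated assumptions: the struct and function names (lock_stats, stats_on_acquire, stats_on_release, stats_note_hold) are hypothetical, the timestamps are passed in by the caller rather than read from the TSC, and no real locking is performed. It shows how the maximum acquisition time and the maximum hold time are tracked separately, and how the release path can hand the hold time back so that reporting happens after the lock is dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-lock statistics; field names are illustrative only. */
struct lock_stats {
        uint64_t acquired_at;   /* timestamp taken right after the lock was obtained */
        uint64_t max_wait;      /* longest request-to-acquire interval seen */
        uint64_t max_hold;      /* longest acquire-to-release interval seen */
};

/* Record one acquisition: t_request is read before spinning, t_acquired after. */
static void stats_on_acquire(struct lock_stats *s, uint64_t t_request,
                             uint64_t t_acquired)
{
        assert(t_acquired >= t_request);
        s->acquired_at = t_acquired;
        if (s->max_wait < t_acquired - t_request)
                s->max_wait = t_acquired - t_request;
}

/* Return the hold time of the critical section that ends at t_release. */
static uint64_t stats_on_release(const struct lock_stats *s, uint64_t t_release)
{
        assert(t_release >= s->acquired_at);
        return t_release - s->acquired_at;
}

/*
 * Meant to be called after the lock has been dropped: update the maximum and
 * return 1 if this hold time is a new record, so the caller can report it
 * (for example with a stack dump) without holding the lock.
 */
static int stats_note_hold(struct lock_stats *s, uint64_t hold)
{
        if (hold <= s->max_hold)
                return 0;
        s->max_hold = hold;
        return 1;
}
```

With a wait of 33 units and a hold of 3 units, max_wait ends up far larger than max_hold, which is the pattern reported in this thread.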





Re: 2.6 Partitioning bug with LILO

2007-01-13 Thread Andries Brouwer
On Fri, Jan 12, 2007 at 06:18:00PM -0500, Kris Karas wrote:
> Hello Andries,
> 
> I noticed you're listed as the maintainer for the disk
> geometry/partitioning logic in the 2.6 kernel, so I'm sending this to
> you, as I think this bug is most likely in that part of the code, ...
> 
> I've been bug-hunting with John Coffman to solve an issue where running
> LILO trashes the ext2 metadata on my /boot partition.  The consensus so
> far is that it's not LILO at all, but rather some subtle bug in the
> kernel that's the culprit.  I can reproduce it easily enough on 2.6, but
> not on 2.4, which further suggests it's kernel-related.
> 
> If one does:
> 
>   umount /boot
>   e2fsck -f -y /dev/hda1
>   mount /dev/hda1 /boot
>   lilo
>   umount /boot
>   e2fsck -f -y /dev/hda1
> 
> the second run of e2fsck will report fixable block bitmap errors.

It is easy to see the cause of this.
There is an old problem with the Linux whole-disk device /dev/hda,
namely that there is aliasing with /dev/hdaN.
I don't know whether there exist kernels that fix this problem -
as far as I know this has been a problem since ancient times.

But, given the fact that this is a well-known problem of the kernel,
a utility should be careful to avoid problems and flush buffers
at appropriate times.

Since lilo is one of the few utilities that write to /dev/hda,
it should be fixed.

[And yes, also the kernel should be fixed.]

Andries

[let me cc linux-kernel]
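
As an illustration of "flush buffers at appropriate times", here is a minimal sketch of what a utility could do after writing through the whole-disk device. It is not taken from lilo; the function name flush_disk_buffers is hypothetical. It assumes a Linux system where the BLKFLSBUF ioctl from <linux/fs.h> is available (the same request that blockdev --flushbufs issues) and that the caller has sufficient privileges.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>   /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Write back and drop the cached buffers of a block device such as /dev/hda.
 * Returns 0 on success, -1 if any step fails (errno is left set by the
 * failing call).
 */
static int flush_disk_buffers(const char *dev_path)
{
        int fd, rc = 0;

        assert(dev_path != NULL);
        fd = open(dev_path, O_RDONLY);
        if (fd < 0)
                return -1;
        if (fsync(fd) != 0)                     /* push dirty buffers to disk */
                rc = -1;
        if (ioctl(fd, BLKFLSBUF, 0) != 0)       /* then invalidate the cache */
                rc = -1;
        if (close(fd) != 0)
                rc = -1;
        return rc;
}
```

A caller would run this on the whole-disk device after writing the boot sector, and possibly on the partition device as well, so that neither view of the disk keeps stale blocks.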


Re: 2.6.20-rc5: known unfixed regressions

2007-01-13 Thread Aaron Sethman


On Sat, 13 Jan 2007, Adrian Bunk wrote:


> On Sat, Jan 13, 2007 at 04:51:36PM +0100, Damien Wyart wrote:
>> * Adrian Bunk <[EMAIL PROTECTED]> [070113 08:11]:
>>> This still leaves the old regressions we have not yet fixed...
>>> This email lists some known regressions in 2.6.20-rc5 compared to 2.6.19.
>>>
>>> Subject: BUG: scheduling while atomic: hald-addon-stor/...
>>>  cdrom_{open,release,ioctl} in trace
>>> References : http://lkml.org/lkml/2006/12/26/105
>>>  http://lkml.org/lkml/2006/12/29/22
>>>  http://lkml.org/lkml/2006/12/31/133
>>> Submitter  : Jon Smirl <[EMAIL PROTECTED]>
>>>  Damien Wyart <[EMAIL PROTECTED]>
>>>  Aaron Sethman <[EMAIL PROTECTED]>
>>> Status : unknown
>>
>> I have not seen the problem since using rc3, so I guess it is ok now.
>> Maybe the commit 9414232fa0cc28e2f51b8c76d260f2748f7953fc has fixed the
>> problem, but I am not 100% sure.
>
> Thanks for this information.
>
> Jon, Aaron, can you confirm it's fixed in -rc5?


I haven't seen it in a while anyways, fwiw.

-Aaron


2.6.20-rc4-mm1 USB (asix) problem

2007-01-13 Thread Eric Buddington
The following problem occurred on an Athlon64 X2 under 2.6.20-rc4-mm1,
but not 2.6.20-rc3-mm1.

I'm using two D-Link DUB-E100 USB ethernet adapters, using the 'asix'
driver. When I upgraded to 2.6.20-rc4-mm1, they were still recognized,
but various ifconfig operations on them (up/down, changing IP) caused
a system freeze (including caps lock/num lock lights) for many seconds.
I do not believe there was anything new in dmesg when the system
resumed. USB debugging was not turned on at the time, though the
problem is repeatable.

Also, no packets actually made it out of the adapters (watching from
other systems on the network).

Since this is a system we need running and networked, I can't do
extensive testing on it, but I might be able to bring it down for a few
quick tests if that would help.

-Eric



Re: O_DIRECT question

2007-01-13 Thread Michael Tokarev
Bill Davidsen wrote:
> Linus Torvalds wrote:
>>
[]
>> But what O_DIRECT does right now is _not_ really sensible, and the
>> O_DIRECT propeller-heads seem to have some problem even admitting that
>> there _is_ a problem, because they don't care. 
> 
> You say that as if it were a failing. Currently if you mix access via
> O_DIRECT and non-DIRECT you can get unexpected results. You can screw
> yourself, mangle your data, or have no problems at all if you avoid
> trying to access the same bytes in multiple ways. There are lots of ways
> to get or write stale data, not all involve O_DIRECT in any way, and the
> people actually using O_DIRECT now are managing very well.
> 
> I don't regard it as a system failing that I am allowed to shoot myself
> in the foot, it's one of the benefits of Linux over Windows. Using
> O_DIRECT now is like being your own lawyer, room for both creativity and
> serious error. But what's there appears portable, which is important as
> well.

If I got it right (and please someone tell me if I *really* got it right!),
the problem is elsewhere.

Suppose you have a filesystem, not at all related to databases and stuff.
Your usual root filesystem, with your /etc/ /var and so on directories.

Some time ago you edited /etc/shadow, updating it by writing a new file and
renaming it into place.  So the old content of your shadow file (now
deleted) is still somewhere on the disk, but no longer accessible through
the filesystem.

Now, a bad guy deliberately tries to open some file on this filesystem, using
O_DIRECT flag, ftruncates() it to some huge size (or does seek+write), and
at the same time tries to use O_DIRECT read of the data.

Due to all the races etc, it is possible for him to read that old content of
/etc/shadow file you've deleted before.

> I do have one thought, WRT reading uninitialized disk data. I would hope
> that sparse files are handled right, and that when doing a write with
> O_DIRECT the metadata is not updated until the write is done.

"hope that sparse files are handled right" is a high hope.  Exactly because
this very place IS racy.

Again, *IF* I got it correctly.

/mjt


Re: Kernel command line for a specific framebuffer console driver

2007-01-13 Thread Juergen Beisert
Hi Alexey,

On Friday 12 January 2007 20:36, Alexey Dobriyan wrote:
> On Fri, Jan 12, 2007 at 01:43:42PM +0100, Juergen Beisert wrote:
> > does someone know how to forward a kernel command line option to
> > configure the AMD Geode GX1 framebuffer?
> >
> > I tried with "video=gx1fb:[EMAIL PROTECTED]" but it does not work. On
> > another machine with an SIS framebuffer the line
> > "video=sisfb:[EMAIL PROTECTED]" works as expected.
> >
> > Any ideas?
>
> Yes. You try this patch and report whether it works or not.

Thank you very much. Yes it works. I tried these kernel parameters:

1) video=gx1fb:mode:[EMAIL PROTECTED],crt:1
  -> CRT was active, 160x64 console
2) video=gx1fb:mode:[EMAIL PROTECTED],crt:1
  -> CRT was active, 128x48 console
3) video=gx1fb:mode:[EMAIL PROTECTED],crt:0,panel:800x600
  -> CRT was disabled, 100x37 console
4) video=gx1fb:mode:[EMAIL PROTECTED],crt:0,panel:800x600
  -> CRT was disabled, 80x25 console

Sorry, I have no flat panel, so I cannot test whether the "panel" option
works correctly. But something does change when I try different values
(see 3 and 4).

Regards
Juergen


Re: Linux v2.6.20-rc5

2007-01-13 Thread Bill Davidsen

Jeff Chua wrote:

> On 1/13/07, Jeff Chua <[EMAIL PROTECTED]> wrote:
>> On 1/13/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
>>> On Fri, 12 Jan 2007 14:27:48 -0500 (EST)
>>> Linus Torvalds <[EMAIL PROTECTED]> wrote:
>>
>>   CC [M]  drivers/kvm/vmx.o
>> {standard input}: Assembler messages:
>> {standard input}:3257: Error: bad register name `%sil'
>> make[2]: *** [drivers/kvm/vmx.o] Error 1
>> make[1]: *** [drivers/kvm] Error 2
>> make: *** [drivers] Error 2
>>
>> Am I missing something or this is a real problem?
>> Applied 2.6.20-rc5-mm-fixes and got this problem.
>> Using gcc version 3.4.5, binutils-2.17.50.0.8
>
> Same problem with vanilla linux-2.6.20-rc5.


What target? I had no such problem with x86, haven't tried the x86_64 
build yet. Haven't even been able to try a boot, but the build was fine ;-)


--
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979


Re: Choosing a HyperThreading/SMP/MultiCore kernel ?

2007-01-13 Thread Bill Davidsen

[EMAIL PROTECTED] wrote:

> On Fri, 12 Jan 2007 10:03:49 EST, Lennart Sorensen said:
>> I would expect any distribution should work on these (as long as the
>> kernel they use isn't too old.).  Of course if it is a Mac, you need a
>> distribution that supports their firmware (which is of course not a PC
>> bios).  As long as you can boot it, any i386 or amd64 kernel with smp
>> enabled should use all the processors present (well amd64 on the
>> core2duo and on the p4 if it is em64t enabled).
>
> amd64 will only work on a core2duo if it's a T7200 or higher - the
> lower numbers are 32-bit-only chipsets.  I admit not knowing what
> exact variant the Mac has.


I don't believe that's correct, the Intel features page indicates all 
core2 have both 64bit and virtualization. Perhaps some of the core (no 
2) models didn't? Even the old 930 had those features by my notes.



>> I believe the closest optimization for a Core2 is probably the Pentium M
>> (certainly not the P4/netburst).  Not entirely sure though.
>
> CONFIG_MCORE2=y
>
> That's probably even closer :)  At least in 2.6.20-rc4-mm1.




Re: [RFC] How to (automatically) find the correct maintainer(s)

2007-01-13 Thread Stefan Richter
On 13 Jan, Richard Knutsson wrote:
> Stefan Richter wrote:
>> On 13 Jan, Richard Knutsson wrote:
>> [...]
>>   
>>> SUPERCOOL ALPHA CARD
>>>
>>> P:  Clark Kent
>>> M:  [EMAIL PROTECTED]
>>> L:  [EMAIL PROTECTED]
>>> C:  SUPER_A
>>> S:  Maintained
>>> (C: for CONFIG. Any better idea?)
>>>
>>> then if someone changes a file who are built with CONFIG_SUPER_A, can 
>>> easily backtrack it to the correct maintainer(s).
>>> 
>> [...]
>>   
>>> My first idea was to use the pathway and define that directories above 
>>> the specified (if not specified by another) would fall to the current 
>>> maintainer. It would work, but requires that all pathways be specified 
>>> at once, or a few maintainers with "short" pathways would get much of 
>>> the patches (and it is not as correct/easy to maintain as looking for 
>>> the CONFIG_flag).
>>>
>>>
>>> Any thoughts on this is very much appreciated (is there any flaws with 
>>> this?).
>>> 
>>
>>  - What about drivers which have no MAINTAINER entry but reside in a
>>subsystem with MAINTAINER entry?
>>   
> Hmm, how are those drivers built? Can you please point me to one?

I believe you read too quickly what I wrote, didn't you? :-)
The MAINTAINERS file doesn't influence how drivers are built.

>>  - What if these drivers depend on two subsystems?
>>   
> Not sure if I understand the problem. I don't see the maintainers for 
> the subsystems too interested in a driver, and it is the drivers 
> maintainer we want.

I am specifically thinking of drivers which are maintained by the
subsystem maintainers. (Well, see below...)

Besides, the subsystem maintainer could point the submitter to a
more appropriate channel or ignore the submitter. (A submitter who
feels ignored is hopefully doing some more research then.) Also,
a driver maintainer certainly reads the mailinglist to which the
submitter posted.

>>  - Config options map to object files but do not map directly to source
>>files. Diffstats show source files.
>>   
> Can you make a object-file out of 2 c-files? Using Makefile?

Yes, you can, although I don't know if it is directly done in the
kernel build system. Of course what is often done is to make n object
files out of n c files, then link them to make 1 object file.

>> Example: The sbp2 driver is an IEEE 1394 driver and a SCSI driver.
>> sbp2.o is enabled by CONFIG_IEEE1394_SBP2 which depends on
>> CONFIG_IEEE1394 and CONFIG_SCSI. sbp2.c resides in drivers/ieee1394/.
>> What is the algorithm to look up sbp2's maintainers?
>>   
> The one listed for CONFIG_IEEE1394_SBP2 :)

...OK, we /could/ write

IEEE 1394 SUBSYSTEM
C:  IEEE1394
C:  IEEE1394_OHCI1394
C:  IEEE1394_SBP2
C:  IEEE1394_DV1394  /* would better be put into a new own entry due to 
different status of maintenance level */
C:  IEEE1394_VIDEO1394  /* that one perhaps too */
L:  [EMAIL PROTECTED]
P:  Ben and me
[...]
IEEE 1394 IPV4 DRIVER (eth1394)
C:  IEEE1394_ETH1394
[...]

On the other hand, we could write

IEEE 1394 SUBSYSTEM
F:  drivers/ieee1394
L:  [EMAIL PROTECTED]
P:  Ben and me
[...]
IEEE 1394 IPV4 DRIVER (eth1394)
F:  drivers/ieee1394/eth1394
[...]

If it was done the latter way, i.e. using F: not C:, it could be
made a rule that the more specific entries come after more generic
entries. Thus the last match of multiple matches is the proper one.
In any case, the longest match is the proper one.
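
The longest-match rule for F: entries can be sketched as follows. This is only an illustration: struct maint_entry and find_maintainer are hypothetical names, the table is built by hand rather than parsed from MAINTAINERS, and matching is a plain string-prefix comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of a MAINTAINERS entry with one F: line. */
struct maint_entry {
        const char *title;      /* section heading, e.g. "IEEE 1394 SUBSYSTEM" */
        const char *path;       /* F: path prefix */
};

/*
 * Return the entry whose F: prefix is the longest one matching 'file',
 * or NULL if no entry matches.
 */
static const struct maint_entry *
find_maintainer(const struct maint_entry *tab, size_t n, const char *file)
{
        const struct maint_entry *best = NULL;
        size_t best_len = 0;
        size_t i;

        assert(tab != NULL && file != NULL);
        for (i = 0; i < n; i++) {
                size_t len = strlen(tab[i].path);

                if (strncmp(file, tab[i].path, len) != 0)
                        continue;       /* not a prefix of the changed file */
                if (best == NULL || len > best_len) {
                        best = &tab[i];
                        best_len = len;
                }
        }
        return best;
}
```

With the two entries above, drivers/ieee1394/sbp2.c resolves to the subsystem entry and drivers/ieee1394/eth1394.c to the eth1394 entry. A plain prefix test would also match an unrelated sibling directory whose name merely starts with the same characters, so a real tool would probably want to compare on path-component boundaries.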

> But what about ex ieee1394_core.o? Is ieee1394-objs "equal" to 
> ieee1394.o? (Seems I need to read some Makefile docs...)

Yes and yes. (Documentation/kbuild/makefiles.txt)

>> Don't get me wrong though. Easier lookup of maintainers and mailinglists
>> sounds to me like a desirable feature, not just from the point of view
>> of submitters but also of maintainers.
>>   
> Well, as they say: "If it is too good to be true, it usually is" (but I 
> don't think it is too far fetched)

No, it probably isn't.

> (Btw, what I can see, there is no possibility to get the wrong 
> maintainer. Just that sometime it can't give you an answer and you have 
> to do it in the old way).

Your approach could give a wrong answer if someone implements a
very "clever" mapping. My approach could give a wrong answer if
someone takes a generic match while there was a more specific
match.

Your approach requires evaluating the diffstat, one or more
Makefiles (taking the Linux Makefile syntax into account), and the
MAINTAINERS file. My approach just requires evaluating the
diffstat and the MAINTAINERS file.
-- 
Stefan Richter
-=-=-=== ---= -==-=
http://arcgraph.de/sr/



Re: O_DIRECT question

2007-01-13 Thread Bill Davidsen

Linus Torvalds wrote:


> On Sat, 13 Jan 2007, Michael Tokarev wrote:
>> (No, really - this load isn't entirely synthetic.  It's a typical database
>> workload - random I/O all over, on a large file.  If it can, it combines
>> several I/Os into one, by requesting more than a single block at a time,
>> but overall it is random.)
>
> My point is that you can get basically ALL THE SAME GOOD BEHAVIOUR without
> having all the BAD behaviour that O_DIRECT adds.
>
> For example, just the requirement that O_DIRECT can never create a file
> mapping, and can never interact with ftruncate would actually make
> O_DIRECT a lot more palatable to me. Together with just the requirement
> that an O_DIRECT open would literally disallow any non-O_DIRECT accesses,
> and flush the page cache entirely, would make all the aliases go away.
>
> At that point, O_DIRECT would be a way of saying "we're going to do
> uncached accesses to this pre-allocated file". Which is a half-way
> sensible thing to do.


But it's not necessary, it would break existing programs, would be 
incompatible with other o/s like AIX, BSD, Solaris. And it doesn't 
provide the legitimate use for O_DIRECT in avoiding cache pollution when 
writing a LARGE file.


> But what O_DIRECT does right now is _not_ really sensible, and the
> O_DIRECT propeller-heads seem to have some problem even admitting that
> there _is_ a problem, because they don't care.


You say that as if it were a failing. Currently if you mix access via 
O_DIRECT and non-DIRECT you can get unexpected results. You can screw 
yourself, mangle your data, or have no problems at all if you avoid 
trying to access the same bytes in multiple ways. There are lots of ways 
to get or write stale data, not all involve O_DIRECT in any way, and the 
people actually using O_DIRECT now are managing very well.


I don't regard it as a system failing that I am allowed to shoot myself 
in the foot, it's one of the benefits of Linux over Windows. Using 
O_DIRECT now is like being your own lawyer, room for both creativity and 
serious error. But what's there appears portable, which is important as 
well.


I do have one thought, WRT reading uninitialized disk data. I would hope 
that sparse files are handled right, and that when doing a write with 
O_DIRECT the metadata is not updated until the write is done.


> A lot of DB people seem to simply not care about security or anything
> else. I'm trying to tell you that quoting numbers is
> pointless, when simply the CORRECTNESS of O_DIRECT is very much in doubt.


The guiding POSIX standard appears dead, and major DB programs which 
work on Linux run on AIX, Solaris, and BSD. That sounds like a good 
level of compatibility. I'm not sure what more correctness you would 
want beyond a proposed standard and common practice. It's tricky to use, 
like many other neat features.


I confess I have abused O_DIRECT by opening a file with O_DIRECT, 
fdopen()ing it for C, supplying my own large aligned buffer, and using 
that with an otherwise unmodified large program which uses fprintf(). 
That worked on all of the major UNIX variants as well.
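
For readers unfamiliar with the "large aligned buffer" requirement, here is a minimal sketch of a single O_DIRECT write. It is not the stdio trick described above, just the underlying constraint: buffer address and transfer length are both kept at a multiple of an assumed 4096-byte block size. The function name direct_write_block and the DIO_ALIGN constant are hypothetical, and the fallback to a normal open is an assumption made so the sketch also runs on filesystems that reject O_DIRECT.

```c
#define _GNU_SOURCE             /* for O_DIRECT on Linux */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIO_ALIGN 4096          /* assumed alignment and transfer size */

/*
 * Write one zero-padded DIO_ALIGN-byte block that starts with 'text'.
 * Returns 0 on success, -1 on failure.
 */
static int direct_write_block(const char *path, const char *text)
{
        void *buf;
        int fd, rc = -1;

        assert(path != NULL && text != NULL);
        if (posix_memalign(&buf, DIO_ALIGN, DIO_ALIGN) != 0)
                return -1;
        memset(buf, 0, DIO_ALIGN);
        strncpy(buf, text, DIO_ALIGN - 1);

        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
        if (fd < 0 && errno == EINVAL)  /* filesystem without O_DIRECT support */
                fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd >= 0) {
                /* Aligned address, aligned length: what O_DIRECT expects. */
                if (write(fd, buf, DIO_ALIGN) == DIO_ALIGN)
                        rc = 0;
                close(fd);
        }
        free(buf);
        return rc;
}
```

The exact alignment a given device or filesystem needs may differ from 4096 bytes; a write that is not a whole number of blocks typically fails with EINVAL under O_DIRECT, which is one reason mixing it with stdio buffering takes care.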


--
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979


Re: [RFC] How to (automatically) find the correct maintainer(s)

2007-01-13 Thread Matthias Schniedermeyer
Richard Knutsson wrote:

> Any thoughts on this is very much appreciated (is there any flaws with
> this?).

The thought that crossed my mind was:

Why not do the same thing that was done to the "Help"-file (before it
was superseded by Kconfig)?

Originally there was a central Help-file, with all the texts. Then it was
split and placed in each sub-dir. And later it was superseded by Kconfig.

On the other hand you could skip the intermediate step and just fold the
Maintainer-data directly into Kconfig, that way everything is "in one
place" and you could place a "Maintainers"-Button next to the
"Help"-Button in *config, or just display it alongside the help.

And MAYBE that would also lessen the "up-to-date"-problem, as you
can just write the MAINTAINERS-data when you create/update the
Kconfig-file. Which is a thing that creates much bigger pain when you
forget it accidentally. ;-)

Oh, and it neatly solves the mapping-problem, for at least all
kernel-parts that have a Kconfig-option/Sub-Tree.





Until then

-- 
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.



Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-13 Thread Ravikiran G Thirumalai
On Sat, Jan 13, 2007 at 12:00:17AM -0800, Andrew Morton wrote:
> > On Fri, 12 Jan 2007 23:36:43 -0800 Ravikiran G Thirumalai <[EMAIL 
> > PROTECTED]> wrote:
> > > >void __lockfunc _spin_lock_irq(spinlock_t *lock)
> > > >{
> > > >local_irq_disable();
> > > >> rdtsc(t1);
> > > >preempt_disable();
> > > >spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> > > >_raw_spin_lock(lock);
> > > >> rdtsc(t2);
> > > >if (lock->spin_time < (t2 - t1))
> > > >lock->spin_time = t2 - t1;
> > > >}
> > > >
> > > >On some runs, we found that the zone->lru_lock spun for 33 seconds or 
> > > >more
> > > >while the maximal CS time was 3 seconds or so.
> > > 
> > > What is the "CS time"?
> > 
> > Critical Section :).  This is the maximal time interval I measured  from 
> > t2 above to the time point we release the spin lock.  This is the hold 
> > time I guess.
> 
> By no means.  The theory here is that CPUA is taking and releasing the
> lock at high frequency, but CPUB never manages to get in and take it.  In
> which case the maximum-acquisition-time is much larger than the
> maximum-hold-time.
> 
> I'd suggest that you use a similar trick to measure the maximum hold time:
> start the timer after we got the lock, stop it just before we release the
> lock (assuming that the additional rdtsc delay doesn't "fix" things, of
> course...)

Well, that is exactly what I described above  as CS time.  The
instrumentation goes like this:

void __lockfunc _spin_lock_irq(spinlock_t *lock)
{
        unsigned long long t1, t2;
        local_irq_disable();
        t1 = get_cycles_sync();
        preempt_disable();
        spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
        _raw_spin_lock(lock);
        t2 = get_cycles_sync();
        lock->raw_lock.htsc = t2;
        if (lock->spin_time < (t2 - t1))
                lock->spin_time = t2 - t1;
}
...

void __lockfunc _spin_unlock_irq(spinlock_t *lock)
{
        unsigned long long t1;
        spin_release(&lock->dep_map, 1, _RET_IP_);
        t1 = get_cycles_sync();
        if (lock->cs_time < (t1 - lock->raw_lock.htsc))
                lock->cs_time = t1 - lock->raw_lock.htsc;
        _raw_spin_unlock(lock);
        local_irq_enable();
        preempt_enable();
}

Am I missing something?  Is this not what you just described? (The
synchronizing rdtsc might not be really required at all locations, but I 
doubt if it would contribute a significant fraction to 33s  or even 
the 3s hold time on a 2.6 GHZ opteron).
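
To put the figures in perspective, the conversion between TSC cycles and seconds can be sketched as below. The helper names are hypothetical, and the sketch assumes the TSC ticks at a constant 2.6 GHz, which may not hold on every machine. Under that assumption 33 s is about 8.58e10 cycles and 3 s is about 7.8e9 cycles, so the cost of one extra timestamp read, which should be many orders of magnitude smaller, is unlikely to matter.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a duration in seconds to cycles at a fixed clock rate in Hz. */
static uint64_t seconds_to_cycles(double seconds, double hz)
{
        assert(seconds >= 0.0 && hz > 0.0);
        return (uint64_t)(seconds * hz);
}

/* Convert a cycle count back to seconds at a fixed clock rate in Hz. */
static double cycles_to_seconds(uint64_t cycles, double hz)
{
        assert(hz > 0.0);
        return (double)cycles / hz;
}
```

The same helpers can turn the recorded spin_time and cs_time values into seconds when reporting them.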


Re: O_DIRECT question

2007-01-13 Thread Bill Davidsen

Bodo Eggert wrote:


> (*) This would allow fadvise_size(), too, which could reduce fragmentation
> (and give an early warning on full disks) without forcing e.g. fat to
> zero all blocks. OTOH, fadvise_size() would allow users to reserve the
> complete disk space without his filesizes reflecting this.


Please clarify how this would interact with quota, and why it wouldn't 
allow someone to run me out of disk.


--
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979


Re: [RFC] How to (automatically) find the correct maintainer(s)

2007-01-13 Thread Richard Knutsson

Stefan Richter wrote:

> On 13 Jan, Richard Knutsson wrote:
> [...]
>> SUPERCOOL ALPHA CARD
>>
>> P:  Clark Kent
>> M:  [EMAIL PROTECTED]
>> L:  [EMAIL PROTECTED]
>> C:  SUPER_A
>> S:  Maintained
>> (C: for CONFIG. Any better idea?)
>>
>> then if someone changes a file who are built with CONFIG_SUPER_A, can
>> easily backtrack it to the correct maintainer(s).
> [...]
>> My first idea was to use the pathway and define that directories above
>> the specified (if not specified by another) would fall to the current
>> maintainer. It would work, but requires that all pathways be specified
>> at once, or a few maintainers with "short" pathways would get much of
>> the patches (and it is not as correct/easy to maintain as looking for
>> the CONFIG_flag).
>>
>> Any thoughts on this is very much appreciated (is there any flaws with
>> this?).
>
>  - What about drivers which have no MAINTAINER entry but reside in a
>    subsystem with MAINTAINER entry?
  

Hmm, how are those drivers built? Can you please point me to one?

>  - What if these drivers depend on two subsystems?
  
Not sure if I understand the problem. I don't see the maintainers for 
the subsystems too interested in a driver, and it is the drivers 
maintainer we want.
 
>  - Config options map to object files but do not map directly to source
>    files. Diffstats show source files.
  

Can you make a object-file out of 2 c-files? Using Makefile?

> Example: The sbp2 driver is an IEEE 1394 driver and a SCSI driver.
> sbp2.o is enabled by CONFIG_IEEE1394_SBP2 which depends on
> CONFIG_IEEE1394 and CONFIG_SCSI. sbp2.c resides in drivers/ieee1394/.
> What is the algorithm to look up sbp2's maintainers?
  

The one listed for CONFIG_IEEE1394_SBP2 :)

But what about ex ieee1394_core.o? Is ieee1394-objs "equal" to 
ieee1394.o? (Seems I need to read some Makefile docs...)

> Don't get me wrong though. Easier lookup of maintainers and mailinglists
> sounds to me like a desirable feature, not just from the point of view
> of submitters but also of maintainers.
  
Well, as they say: "If it is too good to be true, it usually is" (but I 
don't think it is too far fetched)


(Btw, what I can see, there is no possibility to get the wrong 
maintainer. Just that sometime it can't give you an answer and you have 
to do it in the old way).


Thanks for all the pointers!


[-mm] reiserfs4 still hangs

2007-01-13 Thread Marc Dietrich

Hi,

using 2.6.20-rc3-mm1 and 2.6.20-rc4-mm1, I get reiserfs4 related processes in 
down state (not only using googleearth...). Any hints?

sysrq-t shows:

Jan 13 19:32:57 fb07-iapwap2 kernel: googleearth-b D 0001 0  6089   
6072  6109   (NOTLB)
Jan 13 19:32:57 fb07-iapwap2 kernel:c45f3a94 0086 c4d7a050 
0001 c02bb6b5 c013a38b c02bb6b5 
Jan 13 19:32:57 fb07-iapwap2 kernel:c4d7a050 0004 c4d7a050 
e9eb4e3d 0044 0001cd83 c4d7a15c c7bae8d4
Jan 13 19:32:57 fb07-iapwap2 kernel:0282 c7bae8d4 c7bae884 
c7bae8d4  c987ad20 dcdd223a 
Jan 13 19:32:57 fb07-iapwap2 kernel: Call Trace:
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
_spin_unlock_irqrestore+0x45/0x60
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] mark_held_locks+0x6b/0x90
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
_spin_unlock_irqrestore+0x45/0x60
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
reiser4_go_to_sleep+0x5a/0x90 [reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
autoremove_wake_function+0x0/0x50
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
capture_fuse_wait+0x164/0x190 [reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] wait_for_fusion+0x0/0x30 
[reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
reiser4_try_capture+0xa04/0xa30 [reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] _spin_lock+0x2a/0x40
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
longterm_lock_znode+0x2bb/0x470 [reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] _spin_lock+0x2a/0x40
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] coord_by_handle+0x40a/0xcf0 
[reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
nfs_lookup_revalidate+0x1c/0x4a0 [nfs]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
reiser4_object_lookup+0xc6/0x110 [reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] unit_key_cde+0x49/0x70 
[reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] reiser4_seal_init+0x20/0x60 
[reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] coord_by_key+0x9e/0xc0 
[reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] lookup_sd+0x61/0xa0 
[reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] reiser4_iget+0x15b/0x330 
[reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
reiser4_lookup_common+0x6a/0x120 [reiser4]
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] do_lookup+0x148/0x190
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
__link_path_walk+0x7cc/0xe20
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
_atomic_dec_and_lock+0x31/0x60
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] mntput_no_expire+0x13/0x70
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] link_path_walk+0x63/0xc0
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] link_path_walk+0x43/0xc0
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] restore_nocheck+0x12/0x15
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
trace_hardirqs_on+0xc7/0x170
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] do_path_lookup+0x84/0x210
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] getname+0x9a/0xf0
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] __user_walk_fd+0x3b/0x60
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] sys_faccessat+0x99/0x160
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] restore_nocheck+0x12/0x15
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
trace_hardirqs_on+0xc7/0x170
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] sys_access+0x1f/0x30
Jan 13 19:32:57 fb07-iapwap2 kernel:  [] syscall_call+0x7/0xb
Jan 13 19:32:57 fb07-iapwap2 kernel:  ===


locks:
Jan 13 19:32:57 fb07-iapwap2 kernel: Showing all locks held in the system:
Jan 13 19:32:57 fb07-iapwap2 kernel: 3 locks held by pdflush/117:
Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>s_umount_key#17){}, at: 
[] writeback_inodes+0x9a/0xe0
Jan 13 19:32:57 fb07-iapwap2 kernel:  #1:  (>commit_mutex){--..}, at: 
[] reiser4_txn_end+0x3bc/0x510 [reiser4
]
Jan 13 19:32:57 fb07-iapwap2 kernel:  #2:  (>mutex){--..}, at: 
[] synchronize_qrcu+0x13/0xb0
Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5432:
Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, at: 
[] read_chan+0x414/0x610
Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5433:
Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, at: 
[] read_chan+0x414/0x610
Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5434:
Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, at: 
[] read_chan+0x414/0x610
Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5435:
Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, at: 
[] read_chan+0x414/0x610
Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5455:
Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, at: 
[] read_chan+0x414/0x610
Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by bash/5487:
Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, at: 
[] read_chan+0x414/0x610
Jan 13 19:32:57 fb07-iapwap2 kernel: 2 locks held by googleearth-bin/6089:
Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>i_mutex){--..}, at: 
[] do_lookup+0xb3/0x190
Jan 13 19:32:57 fb07-iapwap2 kernel:  #1:  

Re: [-mm] reiserfs4 still hangs

2007-01-13 Thread Andrew Morton
> On Sat, 13 Jan 2007 19:54:53 +0100 Marc Dietrich <[EMAIL PROTECTED]> wrote:
> 
> Hi,
> 
> using 2.6.20-rc3-mm1 and 2.6.20-rc4-mm1, I get reiserfs4 related processes in 
> down state (not only using googleearth...). Any hints?
>
> sysrq-t shows:
> 
> Jan 13 19:32:57 fb07-iapwap2 kernel: googleearth-b D 0001 0  6089   
> 6072  6109   (NOTLB)
> Jan 13 19:32:57 fb07-iapwap2 kernel:c45f3a94 0086 c4d7a050 
> 0001 c02bb6b5 c013a38b c02bb6b5 
> Jan 13 19:32:57 fb07-iapwap2 kernel:c4d7a050 0004 c4d7a050 
> e9eb4e3d 0044 0001cd83 c4d7a15c c7bae8d4
> Jan 13 19:32:57 fb07-iapwap2 kernel:0282 c7bae8d4 c7bae884 
> c7bae8d4  c987ad20 dcdd223a 
> Jan 13 19:32:57 fb07-iapwap2 kernel: Call Trace:
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> _spin_unlock_irqrestore+0x45/0x60
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] mark_held_locks+0x6b/0x90
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> _spin_unlock_irqrestore+0x45/0x60
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> reiser4_go_to_sleep+0x5a/0x90 [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> autoremove_wake_function+0x0/0x50
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> capture_fuse_wait+0x164/0x190 [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] wait_for_fusion+0x0/0x30 
> [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> reiser4_try_capture+0xa04/0xa30 [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] _spin_lock+0x2a/0x40
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> longterm_lock_znode+0x2bb/0x470 [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] _spin_lock+0x2a/0x40
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> coord_by_handle+0x40a/0xcf0 
> [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> nfs_lookup_revalidate+0x1c/0x4a0 [nfs]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> reiser4_object_lookup+0xc6/0x110 [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] unit_key_cde+0x49/0x70 
> [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> reiser4_seal_init+0x20/0x60 
> [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] coord_by_key+0x9e/0xc0 
> [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] lookup_sd+0x61/0xa0 
> [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] reiser4_iget+0x15b/0x330 
> [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> reiser4_lookup_common+0x6a/0x120 [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] do_lookup+0x148/0x190
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> __link_path_walk+0x7cc/0xe20
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> _atomic_dec_and_lock+0x31/0x60
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] mntput_no_expire+0x13/0x70
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] link_path_walk+0x63/0xc0
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] link_path_walk+0x43/0xc0
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] restore_nocheck+0x12/0x15
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> trace_hardirqs_on+0xc7/0x170
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] do_path_lookup+0x84/0x210
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] getname+0x9a/0xf0
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] __user_walk_fd+0x3b/0x60
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] sys_faccessat+0x99/0x160
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] restore_nocheck+0x12/0x15
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] 
> trace_hardirqs_on+0xc7/0x170
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] sys_access+0x1f/0x30
> Jan 13 19:32:57 fb07-iapwap2 kernel:  [] syscall_call+0x7/0xb
> Jan 13 19:32:57 fb07-iapwap2 kernel:  ===
> 
> 
> locks:
> Jan 13 19:32:57 fb07-iapwap2 kernel: Showing all locks held in the system:
> Jan 13 19:32:57 fb07-iapwap2 kernel: 3 locks held by pdflush/117:
> Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>s_umount_key#17){}, 
> at: 
> [] writeback_inodes+0x9a/0xe0
> Jan 13 19:32:57 fb07-iapwap2 kernel:  #1:  (>commit_mutex){--..}, at:
> [] reiser4_txn_end+0x3bc/0x510 [reiser4]
> Jan 13 19:32:57 fb07-iapwap2 kernel:  #2:  (>mutex){--..}, at: 
> [] synchronize_qrcu+0x13/0xb0
> Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5432:
> Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, 
> at: 
> [] read_chan+0x414/0x610
> Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5433:
> Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, 
> at: 
> [] read_chan+0x414/0x610
> Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5434:
> Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, 
> at: 
> [] read_chan+0x414/0x610
> Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5435:
> Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, 
> at: 
> [] read_chan+0x414/0x610
> Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by mingetty/5455:
> Jan 13 19:32:57 fb07-iapwap2 kernel:  #0:  (>atomic_read_lock){--..}, 
> at: 
> [] read_chan+0x414/0x610
> Jan 13 19:32:57 fb07-iapwap2 kernel: 1 lock held by bash/5487:

Re: ieee1394 feature needed: overwrite SPLIT_TIMEOUT from userspace

2007-01-13 Thread Stefan Richter
(full quote for linux1394-devel)

On 12 Jan, Philipp Beyer wrote to linux-kernel:
> Hi,
> 
> I'm investigating an unwanted behaviour of our firewire devices in 
> connection with the ieee1394 kernel module.
> 
> The problem is caused by non-standard-conforming behaviour of our
> devices. Anyway, changes on the device side don't seem to be the
> best solution, so I'm looking for a workaround in terms of a
> kernel patch.
> 
> The problem:
> Our devices exceed the SPLIT_TIMEOUT for write requests in some
> situations, where write accesses to the device's flash memory are
> triggered.

There are certainly a number of ways to implement this in your
device in a conforming way. For example, if it is too costly to
avoid the transaction timeout, you could add a register to your
device to be polled by the requester after it initiated a lengthy
operation. The extra register would become responsive when the
operation finished and could even show whether the operation
succeeded.

But then, why not support lengthier timeouts in Linux if it can be
done with minimum overhead?

> The SPLIT_TIMEOUT could be adjusted as it's part of the
> CSR layout, but the longest interval possible is 8 seconds. We need
> a substantially longer interval to ensure failure-free operation.
> (the maximum timeout needed may be around 120 seconds)
> 
> The presumed solution:
> These long timeouts are only needed in a few rare situations like
> writing user presets to flash or firmware updates. As far as I've
> examined the kernel code it would be the best thing to have a
> function (ioctl?) accessible from userspace that overwrites the
> stored SPLIT_TIMEOUT for a certain connected device. This way
> there should not be any interference with normal operation.
> Until (rare) write accesses to the flash memory are performed, a
> reasonable short timeout could be used.

I have an alternative suggestion:

 - Keep a global timeout for split transactions for all nodes.
   Tracking different timeouts per node would add significant code
   footprint.
 - Control the timeout like before via a write request to the
   SPLIT_TIMEOUT CSR.
 - Allow the local node to write a nonstandard value of >7 to
   SPLIT_TIMEOUT_HI. This would not be compliant with IEEE 1394(a)
   but at least with IEEE 1212.
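
To illustrate the idea, here is a minimal sketch, not the actual
ieee1394 code, of how a single global timeout could be derived from the
two SPLIT_TIMEOUT registers. It assumes the usual field layout (seconds
in the low bits of SPLIT_TIMEOUT_HI, 1/8000 s cycles in the top 13 bits
of SPLIT_TIMEOUT_LO). The function name, the allow_nonstandard switch,
the 8-bit relaxed mask and the 100 ms floor are assumptions made for
this example only.

```c
#include <stdint.h>

/* Standard SPLIT_TIMEOUT_HI: only the low 3 bits (0..7 seconds) count. */
#define SPLIT_HI_STD_MASK   0x7u
/* Hypothetical relaxed mask for the nonstandard case: 0..255 seconds. */
#define SPLIT_HI_WIDE_MASK  0xffu
/* SPLIT_TIMEOUT_LO counts 1/8000 s cycles, i.e. 125 us each. */
#define USEC_PER_CYCLE      125u
/* Assumed lower bound so that a zeroed register does not mean "no timeout". */
#define MIN_TIMEOUT_USEC    100000u

/*
 * Convert the register pair into a timeout in microseconds.
 * allow_nonstandard != 0 accepts seconds values > 7 in SPLIT_TIMEOUT_HI.
 */
uint32_t split_timeout_usecs(uint32_t hi, uint32_t lo, int allow_nonstandard)
{
	uint32_t mask = allow_nonstandard ? SPLIT_HI_WIDE_MASK : SPLIT_HI_STD_MASK;
	uint32_t secs = hi & mask;
	uint32_t cycles = lo >> 19;	/* top 13 bits of the quadlet */
	uint32_t usecs = secs * 1000000u + cycles * USEC_PER_CYCLE;

	return usecs < MIN_TIMEOUT_USEC ? MIN_TIMEOUT_USEC : usecs;
}
```

With the standard mask, a write of 120 to SPLIT_TIMEOUT_HI is truncated
to its low 3 bits. With the relaxed mask it yields the roughly 120
seconds asked for above, while all other nodes keep sharing the same
global value.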

This suggestion may fall short if a bus manager is present. Also,
I have concerns about adding such a non-conforming feature to mainline.
(Not that our drivers are fully compliant with the spec now or that
100% by-the-book behavior would be desirable in the first place...)

> Since I don't have any real experience in kernel hacking yet,
> this should be interpreted as a feature request at first:
> If the described feature is easy to implement I would appreciate
> if someone could do this.

I could post a patch which works as I outlined if it fits your
requirements.

> Otherwise I'm confident that I'm able to write a patch on my own.
> In this case the critical part would be to meet the standards
> of the kernel community, since we would like to have the patch
> included in the mainline.
> 
> Therefore I'm also interested in any advice on how
> to realize an appropriate patch.
> 
> Thanks,
> 
> Philipp Beyer
> 
> Software Development
> Allied Vision Technologies

See Documentation/SubmittingPatches in the Linux kernel source
distribution for advice on code submission.
-- 
Stefan Richter
-=-=-=== ---= -==-=
http://arcgraph.de/sr/



Re: Choosing a HyperThreading/SMP/MultiCore kernel ?

2007-01-13 Thread Lennart Sorensen
On Fri, Jan 12, 2007 at 10:38:43PM -0500, [EMAIL PROTECTED] wrote:
> amd64 will only work on a core2duo if it's a T7200 or higher - the
> lower numbers are 32-bit-only chipsets.  I admit not knowing what
> exact variant the Mac has.

The Core Duo was 32-bit only (being a Pentium M), but the Core 2 Duo
should always be 64-bit capable (at least that is what this list says:
http://en.wikipedia.org/wiki/List_of_Intel_Core_2_microprocessors#Core_2_Duo_2)

> CONFIG_MCORE2=y

Oh good.  Makes life much simpler for users.

--
Len Sorensen

