Re: usb+sysfs: duplicate filename 'bInterfaceNumber'

2007-10-14 Thread Dave Young
On 10/14/07, Borislav Petkov <[EMAIL PROTECTED]> wrote:
> Hi,
>
> i get the following warning on yesterday's git tree (v2.6.23-2840-g752097c):
>
> Oct 14 09:07:15 zmei kernel: [   49.368030] sysfs: duplicate filename 'bInterfaceNumber' can not be created
> Oct 14 09:07:15 zmei kernel: [   49.368086] WARNING: at fs/sysfs/dir.c:425 sysfs_add_one()
> Oct 14 09:07:15 zmei kernel: [   49.368134]  [] show_trace_log_lvl+0x1a/0x2f
> Oct 14 09:07:15 zmei kernel: [   49.368220]  [] show_trace+0x12/0x14
> Oct 14 09:07:15 zmei kernel: [   49.368300]  [] dump_stack+0x16/0x18
> Oct 14 09:07:15 zmei kernel: [   49.368379]  [] sysfs_add_one+0x57/0xbc
> Oct 14 09:07:15 zmei kernel: [   49.368461]  [] sysfs_add_file+0x49/0x71
> Oct 14 09:07:15 zmei kernel: [   49.368541]  [] sysfs_create_group+0x86/0xe8
> Oct 14 09:07:15 zmei kernel: [   49.368621]  [] usb_create_sysfs_intf_files+0x27/0x9b
> Oct 14 09:07:15 zmei kernel: [   49.368704]  [] usb_set_configuration+0x454/0x466
> Oct 14 09:07:15 zmei kernel: [   49.368787]  [] generic_probe+0x53/0x94
> Oct 14 09:07:15 zmei kernel: [   49.368867]  [] usb_probe_device+0x35/0x3b
> Oct 14 09:07:15 zmei kernel: [   49.368947]  [] driver_probe_device+0xcb/0x14f
> Oct 14 09:07:15 zmei kernel: [   49.369039]  [] __device_attach+0x8/0xa
> Oct 14 09:07:15 zmei kernel: [   49.369119]  [] bus_for_each_drv+0x3b/0x63
> Oct 14 09:07:15 zmei kernel: [   49.369199]  [] device_attach+0x70/0x85
> Oct 14 09:07:15 zmei kernel: [   49.369279]  [] bus_attach_device+0x29/0x77
> Oct 14 09:07:15 zmei kernel: [   49.369359]  [] device_add+0x28c/0x445
> Oct 14 09:07:15 zmei kernel: [   49.369439]  [] usb_new_device+0x44/0x82
> Oct 14 09:07:15 zmei kernel: [   49.369519]  [] hub_thread+0x666/0x9c2
> Oct 14 09:07:15 zmei kernel: [   49.369598]  [] kthread+0x3b/0x62
> Oct 14 09:07:15 zmei kernel: [   49.369679]  [] kernel_thread_helper+0x7/0x10
> Oct 14 09:07:15 zmei kernel: [   49.369759]  ===
>
> The usb hub in question is named 4-1:1.0 and it has an extension connected to
> it which is used to activate the 2 usb connectors at the side of the pc's
> monitor. Correct me if i'm wrong but from what i've understood so far from
> reading the code, i think, it adds the bInterfaceNumber-file after calling
> usb_create_sysfs_intf_files(intf). However, the currently active usbhost
> interface alternate setting is the only one active so the bInterfaceNumber
> exists already and therefore the warning, but this is just a guess since i'm
> not that fluent in the usb internals.
Hi,
I have encountered the same problem, which was reported in
http://lkml.org/lkml/2007/9/29/45

For the first one, "usbcore duplicated sysfs filename", I have submitted
a patch to fix it.

For the "bInterfaceNumber" one, I have no idea; the same problem still
exists in the latest 23-mm1 tree.


>
> .config attached.
>
> --
> Regards/Gruß,
> Boris.
>
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] jiffies_round -> jiffies_round_relative conversion - b43/b43legacy

2007-10-14 Thread Anton Blanchard

When rounding a relative timeout we need to use round_jiffies_relative(). 

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>
---

diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
index c141a26..41049a4 100644
--- a/drivers/net/wireless/b43/main.c
+++ b/drivers/net/wireless/b43/main.c
@@ -2392,7 +2392,7 @@ out_requeue:
if (b43_debug(dev, B43_DBG_PWORK_FAST))
delay = msecs_to_jiffies(50);
else
-   delay = round_jiffies(HZ * 15);
+   delay = round_jiffies_relative(HZ * 15);
queue_delayed_work(wl->hw->workqueue, >periodic_work, delay);
 out:
mutex_unlock(>mutex);
diff --git a/drivers/net/wireless/b43legacy/main.c b/drivers/net/wireless/b43legacy/main.c
index f074951..bd0bd9b 100644
--- a/drivers/net/wireless/b43legacy/main.c
+++ b/drivers/net/wireless/b43legacy/main.c
@@ -2260,7 +2260,7 @@ out_requeue:
if (b43legacy_debug(dev, B43legacy_DBG_PWORK_FAST))
delay = msecs_to_jiffies(50);
else
-   delay = round_jiffies(HZ);
+   delay = round_jiffies_relative(HZ);
queue_delayed_work(dev->wl->hw->workqueue,
   >periodic_work, delay);
 out:
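
For illustration only (not part of the patch): round_jiffies() expects an
absolute jiffies value, while the delay passed to queue_delayed_work() is
relative to now. The sketch below is a simplified userspace model of that
difference. HZ = 250, the quarter-second rounding threshold and the model_*
names are assumptions made for the sketch; the real kernel helpers also apply
a per-CPU skew, which is left out here.

```c
#include <assert.h>

#define HZ 250UL	/* assumed tick rate for this sketch */

/*
 * Model of rounding an ABSOLUTE tick count to a whole-second boundary:
 * round down when less than a quarter second past the boundary,
 * round up otherwise.
 */
static unsigned long model_round_abs(unsigned long j)
{
	unsigned long rem = j % HZ;

	if (rem < HZ / 4)
		return j - rem;
	return j - rem + HZ;
}

/*
 * Model of rounding a RELATIVE delay: convert it to an absolute time,
 * round that, and convert back to a delay measured from 'now'.
 */
static unsigned long model_round_rel(unsigned long delta, unsigned long now)
{
	unsigned long rounded;

	assert(delta > 0);
	rounded = model_round_abs(now + delta);
	if (rounded <= now)	/* never hand back a time that is not in the future */
		return delta;
	return rounded - now;
}
```

In this model, with now = 1000100 and a delay of HZ, model_round_abs(HZ)
returns 250 unchanged, so the work would still run 100 ticks past a second
boundary; model_round_rel(HZ, now) returns 400, which lands exactly on one.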


Re: What still uses the block layer?

2007-10-14 Thread Stefan Richter
Rob Landley wrote:
> I was at least attempting to ask a serious question.
...
> Actually, I was going through Documentation/block thinking about making a 
> 00-INDEX for it, but my earlier questions of the scsi guys left me with the 
> impression that the block layer is _not_ used by the SCSI layer.

Ah, so it was about your documentation work.  I already forgot the
context of your previous inquiries.  Alas the tone of them already did
some damage, leading to responses like these.

...
> since 
> every non-embedded modern storage device I'm aware of has been consumed by 
> the SCSI layer (despite none of them actually having a discernably closer 
> relationship to SCSI than ATA did)
...

The Linux SCSI subsystems don't consume, they provide services; nowadays
not only for SCSI hardware and SCSI protocols but also for a number of
subsystems whose tasks are similar enough to SCSI subsystems to make the
SCSI core and upper SCSI layer useful to them too.

BTW:
| Now that IDE disks have been rerouted through the scsi layer, SATA goes
| through the scsi layer, USB goes through the scsi layer, firewire goes
| through the scsi layer...

As a side note, SBP-2 is a SCSI transport protocol, hence ieee1394/sbp2
and firewire/fw-sbp2 are Linux SCSI low-level drivers.  Anything else
would be just wrong and infeasible in this particular case.
-- 
Stefan Richter
-=-=-=== =-=- -
http://arcgraph.de/sr/


[PATCH] jiffies_round -> jiffies_round_relative conversion - rt2x00

2007-10-14 Thread Anton Blanchard

When rounding a relative timeout we need to use round_jiffies_relative(). 

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>
---

diff --git a/drivers/net/wireless/rt2x00/rt2x00lib.h b/drivers/net/wireless/rt2x00/rt2x00lib.h
index 298faa9..06d9bc0 100644
--- a/drivers/net/wireless/rt2x00/rt2x00lib.h
+++ b/drivers/net/wireless/rt2x00/rt2x00lib.h
@@ -30,7 +30,7 @@
  * Interval defines
  * Both the link tuner as the rfkill will be called once per second.
  */
-#define LINK_TUNE_INTERVAL ( round_jiffies(HZ) )
+#define LINK_TUNE_INTERVAL ( round_jiffies_relative(HZ) )
 #define RFKILL_POLL_INTERVAL   ( 1000 )
 
 /*


[PATCH] jiffies_round -> jiffies_round_relative conversion - ipw2100/ipw2200

2007-10-14 Thread Anton Blanchard

When rounding a relative timeout we need to use round_jiffies_relative(). 

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>
---

diff --git a/drivers/net/wireless/ipw2100.c b/drivers/net/wireless/ipw2100.c
index 2d46a16..739d060 100644
--- a/drivers/net/wireless/ipw2100.c
+++ b/drivers/net/wireless/ipw2100.c
@@ -1769,7 +1769,7 @@ static int ipw2100_up(struct ipw2100_priv *priv, int deferred)
if (priv->stop_rf_kill) {
priv->stop_rf_kill = 0;
queue_delayed_work(priv->workqueue, >rf_kill,
-  round_jiffies(HZ));
+  round_jiffies_relative(HZ));
}
 
deferred = 1;
@@ -2102,7 +2102,8 @@ static void isr_indicate_rf_kill(struct ipw2100_priv *priv, u32 status)
/* Make sure the RF Kill check timer is running */
priv->stop_rf_kill = 0;
cancel_delayed_work(>rf_kill);
-   queue_delayed_work(priv->workqueue, >rf_kill, round_jiffies(HZ));
+   queue_delayed_work(priv->workqueue, >rf_kill,
+  round_jiffies_relative(HZ));
 }
 
 static void isr_scan_complete(struct ipw2100_priv *priv, u32 status)
@@ -4237,7 +4238,7 @@ static int ipw_radio_kill_sw(struct ipw2100_priv *priv, int disable_radio)
priv->stop_rf_kill = 0;
cancel_delayed_work(>rf_kill);
queue_delayed_work(priv->workqueue, >rf_kill,
-  round_jiffies(HZ));
+  round_jiffies_relative(HZ));
} else
schedule_reset(priv);
}
@@ -5975,7 +5976,7 @@ static void ipw2100_rf_kill(struct work_struct *work)
IPW_DEBUG_RF_KILL("RF Kill active, rescheduling GPIO check\n");
if (!priv->stop_rf_kill)
queue_delayed_work(priv->workqueue, >rf_kill,
-  round_jiffies(HZ));
+  round_jiffies_relative(HZ));
goto exit_unlock;
}
 
diff --git a/drivers/net/wireless/ipw2200.c b/drivers/net/wireless/ipw2200.c
index feb8fcb..88b0f81 100644
--- a/drivers/net/wireless/ipw2200.c
+++ b/drivers/net/wireless/ipw2200.c
@@ -1753,7 +1753,7 @@ static int ipw_radio_kill_sw(struct ipw_priv *priv, int disable_radio)
/* Make sure the RF_KILL check timer is running */
cancel_delayed_work(>rf_kill);
queue_delayed_work(priv->workqueue, >rf_kill,
-  round_jiffies(2 * HZ));
+  round_jiffies_relative(2 * HZ));
} else
queue_work(priv->workqueue, >up);
}
@@ -4364,7 +4364,7 @@ static void handle_scan_event(struct ipw_priv *priv)
if (!priv->user_requested_scan) {
if (!delayed_work_pending(>scan_event))
queue_delayed_work(priv->workqueue, >scan_event,
-round_jiffies(msecs_to_jiffies(4000)));
+round_jiffies_relative(msecs_to_jiffies(4000)));
} else {
union iwreq_data wrqu;
 
@@ -4728,7 +4728,7 @@ static void ipw_rx_notification(struct ipw_priv *priv,
 && priv->status & STATUS_ASSOCIATED)
queue_delayed_work(priv->workqueue,
   >request_scan,
-  round_jiffies(HZ));
+  round_jiffies_relative(HZ));
 
/* Send an empty event to user space.
 * We don't send the received data on the event because


[PATCH][resend] param_sysfs_builtin memchr argument fix

2007-10-14 Thread Dave Young
If the length argument passed to memchr is longer than strlen(kp->name), the
search runs past the end of the name and gives weird results.

This causes duplicate filenames in sysfs for "nousb". The kernel
warning messages are as below:

sysfs: duplicate filename 'usbcore' can not be created
WARNING: at fs/sysfs/dir.c:416 sysfs_add_one()
 [] sysfs_add_one+0xa0/0xe0
 [] create_dir+0x48/0xb0
 [] sysfs_create_dir+0x29/0x50
 [] create_dir+0x1b/0x50
 [] kobject_add+0x46/0x150
 [] kobject_init+0x3a/0x80
 [] kernel_param_sysfs_setup+0x50/0xb0
 [] param_sysfs_builtin+0xee/0x130
 [] param_sysfs_init+0x23/0x60
 [] __next_cpu+0x12/0x20
 [] kernel_init+0x0/0xb0
 [] kernel_init+0x0/0xb0
 [] do_initcalls+0x46/0x1e0
 [] create_proc_entry+0x52/0x90
 [] register_irq_proc+0x9c/0xc0
 [] proc_mkdir_mode+0x34/0x50
 [] kernel_init+0x0/0xb0
 [] kernel_init+0x62/0xb0
 [] kernel_thread_helper+0x7/0x14
 ===
kobject_add failed for usbcore with -EEXIST, don't try to register things with 
the same name in the same directory.
 [] kobject_add+0xf6/0x150
 [] kernel_param_sysfs_setup+0x50/0xb0
 [] param_sysfs_builtin+0xee/0x130
 [] param_sysfs_init+0x23/0x60
 [] __next_cpu+0x12/0x20
 [] kernel_init+0x0/0xb0
 [] kernel_init+0x0/0xb0
 [] do_initcalls+0x46/0x1e0
 [] create_proc_entry+0x52/0x90
 [] register_irq_proc+0x9c/0xc0
 [] proc_mkdir_mode+0x34/0x50
 [] kernel_init+0x0/0xb0
 [] kernel_init+0x62/0xb0
 [] kernel_thread_helper+0x7/0x14
 ===
Module 'usbcore' failed to be added to sysfs, error number -17
The system will be unstable now.

Signed-off-by: Dave Young <[EMAIL PROTECTED]> 

---
kernel/params.c |8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff -upr linux/kernel/params.c linux.new/kernel/params.c
--- linux/kernel/params.c   2007-10-08 14:30:06.0 +0800
+++ linux.new/kernel/params.c   2007-10-09 09:16:55.0 +0800
@@ -592,11 +592,17 @@ static void __init param_sysfs_builtin(v
 
for (i=0; i < __stop___param - __start___param; i++) {
char *dot;
+   size_t kplen;
 
kp = &__start___param[i];
+   kplen = strlen(kp->name);
 
/* We do not handle args without periods. */
-   dot = memchr(kp->name, '.', MAX_KBUILD_MODNAME);
+   if (kplen > MAX_KBUILD_MODNAME) {
+   DEBUGP("kernel parameter name is too long: %s\n", kp->name);
+   continue;
+   }
+   dot = memchr(kp->name, '.', kplen);
if (!dot) {
DEBUGP("couldn't find period in %s\n", kp->name);
continue;
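
For illustration only (not part of the patch): the sketch below shows in
userspace how memchr() with a fixed length can run past the terminating NUL
of a short name and find a '.' that belongs to whatever follows it in memory.
The buffer layout, the neighbouring string "usbcore.autosuspend", the value
32 and the model_/dot_ names are all assumptions for the sketch, not taken
from kernel/params.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_NAME 32	/* stand-in for MAX_KBUILD_MODNAME; value assumed */

/* Two names stored back to back, the first one without a period. */
static const char model_names[MODEL_MAX_NAME] = "nousb\0usbcore.autosuspend";

/*
 * Old behaviour: always scan MODEL_MAX_NAME bytes.
 * Only safe here because the caller passes a buffer of at least that size.
 */
static const char *dot_fixed_len(const char *name)
{
	return memchr(name, '.', MODEL_MAX_NAME);
}

/* Patched behaviour: never look beyond the end of the name itself. */
static const char *dot_bounded(const char *name)
{
	size_t len = strlen(name);

	if (len > MODEL_MAX_NAME)
		return NULL;	/* name too long, skip it */
	return memchr(name, '.', len);
}
```

Here dot_fixed_len(model_names) returns a pointer into the second string even
though "nousb" has no period, while dot_bounded(model_names) returns NULL.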


[PATCH] jiffies_round -> jiffies_round_relative conversion in EDAC drivers

2007-10-14 Thread Anton Blanchard

When rounding a relative timeout we need to use round_jiffies_relative(). 

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>
---

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index f3690a6..46400ec 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -436,7 +436,7 @@ static void edac_device_workq_function(struct work_struct *work_req)
 */
if (edac_dev->poll_msec == 1000)
queue_delayed_work(edac_workqueue, _dev->work,
-   round_jiffies(edac_dev->delay));
+   round_jiffies_relative(edac_dev->delay));
else
queue_delayed_work(edac_workqueue, _dev->work,
edac_dev->delay);
@@ -468,7 +468,7 @@ void edac_device_workq_setup(struct edac_device_ctl_info *edac_dev,
 */
if (edac_dev->poll_msec == 1000)
queue_delayed_work(edac_workqueue, _dev->work,
-   round_jiffies(edac_dev->delay));
+   round_jiffies_relative(edac_dev->delay));
else
queue_delayed_work(edac_workqueue, _dev->work,
edac_dev->delay);
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index 5dee9f5..7573e07 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -246,7 +246,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
/* if we are on a one second period, then use round */
msec = edac_pci_get_poll_msec();
if (msec == 1000)
-   delay = round_jiffies(msecs_to_jiffies(msec));
+   delay = round_jiffies_relative(msecs_to_jiffies(msec));
else
delay = msecs_to_jiffies(msec);
 


[ patch .24-rc0 5/5 ] SuperIO locks coordinator - use in other hwmon/*.c

2007-10-14 Thread Jim Cromie

05 - use superio-locks in rest of drivers/hwmon/*.c

This patch is compile-tested only; please review it for sanity before you
try running it.  Things to look for: missing superio_release() calls and
opportunities to use superio_devid(), superio_inw(), etc.


Signed-off-by:  Jim Cromie <[EMAIL PROTECTED]>
---
hwmon-superio-others
Kconfig  |6 +++
f71805f.c|   86 +++---
it87.c   |   80 +++---
smsc47b397.c |   63 -
smsc47m1.c   |   69 +
vt1211.c |   70 +
w83627ehf.c  |  110 +--
7 files changed, 188 insertions(+), 296 deletions(-)

diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/hwmon/f71805f.c hwmon-superio.old/drivers/hwmon/f71805f.c
--- hwmon-fan-push-offset/drivers/hwmon/f71805f.c   2007-10-14 13:00:24.0 -0600
+++ hwmon-superio.old/drivers/hwmon/f71805f.c   2007-10-14 17:22:23.0 -0600
@@ -39,6 +39,7 @@
#include 
#include 
#include 
+#include 
#include 

static struct platform_device *pdev;
@@ -52,8 +53,6 @@ enum kinds { f71805f, f71872f };

#define F71805F_LD_HWM  0x04

-#define SIO_REG_LDSEL  0x07/* Logical device select */
-#define SIO_REG_DEVID  0x20/* Device ID (2 bytes) */
#define SIO_REG_DEVREV  0x22/* Device revision */
#define SIO_REG_MANID   0x23/* Fintek ID (2 bytes) */
#define SIO_REG_FNSEL1  0x29/* Multi Function Select 1 (F71872F) */
@@ -64,43 +63,15 @@ enum kinds { f71805f, f71872f };
#define SIO_F71805F_ID  0x0406
#define SIO_F71872F_ID  0x0341

-static inline int
-superio_inb(int base, int reg)
-{
-   outb(reg, base);
-   return inb(base + 1);
-}
-
-static int
-superio_inw(int base, int reg)
-{
-   int val;
-   outb(reg++, base);
-   val = inb(base + 1) << 8;
-   outb(reg, base);
-   val |= inb(base + 1);
-   return val;
-}
+static struct superio* gate;

-static inline void
-superio_select(int base, int ld)
-{
-   outb(SIO_REG_LDSEL, base);
-   outb(ld, base + 1);
-}
-
-static inline void
-superio_enter(int base)
-{
-   outb(0x87, base);
-   outb(0x87, base);
-}
-
-static inline void
-superio_exit(int base)
-{
-   outb(0xaa, base);
-}
+static __devinit struct superio_search where = {
+   .cmdreg_addrs = { 0x2e, 0x4e },
+   .device_ids   = { SIO_F71805F_ID, SIO_F71872F_ID, 0 },
+   .devid_word   = 1,
+   .enter_seq= { 0x87, 0x87 },
+   .exit_seq = { 0xAA }
+};

/*
 * ISA constants
@@ -1480,31 +1451,33 @@ exit:
return err;
}

-static int __init f71805f_find(int sioaddr, unsigned short *address,
-  struct f71805f_sio_data *sio_data)
+static int __init f71805f_find(struct f71805f_sio_data *sio_data)
{
int err = -ENODEV;
-   u16 devid;
+   u16 devid, address;

static const char *names[] = {
"F71805F/FG",
"F71872F/FG or F71806F/FG",
};
-
-   superio_enter(sioaddr);
-
-   devid = superio_inw(sioaddr, SIO_REG_MANID);
+   gate = superio_find();
+   if (!gate) {
+   printk(KERN_WARNING DRVNAME ": superio port not detected, "
+  "module not installed.\n");
+   return -ENODEV;
+   }
+   superio_enter(gate);
+   devid = superio_inw(gate, SIO_REG_MANID);
if (devid != SIO_FINTEK_ID)
goto exit;

-   devid = superio_inw(sioaddr, SIO_REG_DEVID);
switch (devid) {
case SIO_F71805F_ID:
sio_data->kind = f71805f;
break;
case SIO_F71872F_ID:
sio_data->kind = f71872f;
-   sio_data->fnsel1 = superio_inb(sioaddr, SIO_REG_FNSEL1);
+   sio_data->fnsel1 = superio_inb(gate, SIO_REG_FNSEL1);
break;
default:
printk(KERN_INFO DRVNAME ": Unsupported Fintek device, "
@@ -1512,28 +1485,28 @@ static int __init f71805f_find(int sioad
goto exit;
}

-   superio_select(sioaddr, F71805F_LD_HWM);
-   if (!(superio_inb(sioaddr, SIO_REG_ENABLE) & 0x01)) {
+   superio_select(gate, F71805F_LD_HWM);
+   if (!(superio_inb(gate, SIO_REG_ENABLE) & 0x01)) {
printk(KERN_WARNING DRVNAME ": Device not activated, "
   "skipping\n");
goto exit;
}

-   *address = superio_inw(sioaddr, SIO_REG_ADDR);
-   if (*address == 0) {
+   address = superio_inw(gate, SIO_REG_ADDR);
+   if (address == 0) {
printk(KERN_WARNING DRVNAME ": Base address not set, "
   "skipping\n");
goto exit;
}
-   *address &= ~(REGION_LENGTH - 1);   /* Ignore 3 LSB */
+   address &= ~(REGION_LENGTH - 1);/* Ignore 3 

[ patch .24-rc0 3/5 ] SuperIO locks coordinator - use in hwmon/pc87360

2007-10-14 Thread Jim Cromie

03 - use superio-locks in drivers/hwmon/pc87360

This driver keeps the slot only during __init, since it
only needs the sio-port to read the ISA addresses of the
Logical Devices in the chip, which are then used exclusively.



Signed-off-by:  Jim Cromie <[EMAIL PROTECTED]>
---
hwmon-superio-pc87360
Kconfig   |1 
pc87360.c |   73 +-

2 files changed, 31 insertions(+), 43 deletions(-)



diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/hwmon/Kconfig hwmon-superio.old/drivers/hwmon/Kconfig
--- hwmon-fan-push-offset/drivers/hwmon/Kconfig 2007-10-14 13:00:24.0 -0600
+++ hwmon-superio.old/drivers/hwmon/Kconfig 2007-10-14 17:22:23.0 -0600
@@ -489,6 +495,7 @@ config SENSORS_MAX6650
config SENSORS_PC87360
tristate "National Semiconductor PC87360 family"
select HWMON_VID
+   select SUPERIO_LOCKS
help
  If you say yes here you get access to the hardware monitoring
  functions of the National Semiconductor PC8736x Super-I/O chips.
diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/hwmon/pc87360.c hwmon-superio.old/drivers/hwmon/pc87360.c
--- hwmon-fan-push-offset/drivers/hwmon/pc87360.c   2007-10-14 13:00:24.0 -0600
+++ hwmon-superio.old/drivers/hwmon/pc87360.c   2007-10-14 17:22:23.0 -0600
@@ -41,6 +41,7 @@
#include 
#include 
#include 
+#include 
#include 
#include 
#include 
@@ -64,10 +65,14 @@ MODULE_PARM_DESC(init,
 */

#define DEV 0x07/* Register: Logical device select */
-#define DEVID  0x20/* Register: Device ID */
#define ACT 0x30/* Register: Device activation */
#define BASE0x60/* Register: Base address */

+static __devinit struct superio_search where = {
+   .cmdreg_addrs = { 0x2E, 0x4E },
+   .device_ids   = { 0xE1, 0xE8, 0xE4, 0xE5, 0xE9, 0 },
+};
+
#define FSCM0x09/* Logical device: fans */
#define VLM 0x0d/* Logical device: voltages */
#define TMS 0x0e/* Logical device: temperatures */
@@ -77,24 +82,6 @@ static const u8 logdev[3] = { FSCM, VLM,
#define LD_IN   1
#define LD_TEMP 2

-static inline void superio_outb(int sioaddr, int reg, int val)
-{
-   outb(reg, sioaddr);
-   outb(val, sioaddr+1);
-}
-
-static inline int superio_inb(int sioaddr, int reg)
-{
-   outb(reg, sioaddr);
-   return inb(sioaddr+1);
-}
-
-static inline void superio_exit(int sioaddr)
-{
-   outb(0x02, sioaddr);
-   outb(0x02, sioaddr+1);
-}
-
/*
 * Logical devices
 */
@@ -817,17 +804,22 @@ static DEVICE_ATTR(name, S_IRUGO, show_n
 * Device detection, registration and update
 */

-static int __init pc87360_find(int sioaddr, u8 *devid, unsigned short *addresses)
+static int __init pc87360_find(unsigned short *addresses)
{
u16 val;
-   int i;
-   int nrdev; /* logical device count */
+   int i, nrdev; /* logical device count */

-   /* No superio_enter */
+   struct superio* const gate = superio_find();
+   if (!gate) {
+   printk(KERN_WARNING "pc87360: superio port not detected, "
+  "module not installed.\n");
+   return -ENODEV;
+   }
+   superio_enter(gate);
+   devid = gate->devid; /* Remember the device id */

/* Identify device */
-   val = superio_inb(sioaddr, DEVID);
-   switch (val) {
+   switch (devid) {
case 0xE1: /* PC87360 */
case 0xE8: /* PC87363 */
case 0xE4: /* PC87364 */
@@ -838,25 +830,23 @@ static int __init pc87360_find(int sioad
nrdev = 3;
break;
default:
-   superio_exit(sioaddr);
+   superio_exit(gate);
return -ENODEV;
}
-   /* Remember the device id */
-   *devid = val;

for (i = 0; i < nrdev; i++) {
/* select logical device */
-   superio_outb(sioaddr, DEV, logdev[i]);
+   superio_select(gate, logdev[i]);

-   val = superio_inb(sioaddr, ACT);
+   val = superio_inb(gate, ACT);
if (!(val & 0x01)) {
printk(KERN_INFO "pc87360: Device 0x%02x not "
   "activated\n", logdev[i]);
continue;
}

-   val = (superio_inb(sioaddr, BASE) << 8)
-   | superio_inb(sioaddr, BASE + 1);
+   val = (superio_inb(gate, BASE) << 8)
+   | superio_inb(gate, BASE + 1);
if (!val) {
printk(KERN_INFO "pc87360: Base address not set for "
   "device 0x%02x\n", logdev[i]);
@@ -866,8 +856,8 @@ static int __init pc87360_find(int sioad
addresses[i] = val;

if (i==0) { /* Fans */
-   confreg[0] = superio_inb(sioaddr, 0xF0);
-   confreg[1] = 

[ patch .24-rc0 4/5 ] SuperIO locks coordinator - use in drivers/char/pc8736x-gpio

2007-10-14 Thread Jim Cromie

04 - use superio-locks in drivers/char/pc8736x_gpio

This driver keeps the slot for the lifetime of the driver
(__init until __exit), since the driver needs the sio-port
to change pin configurations.

Patches 03 and 04 were tested on a Soekris 4801 a year ago;
the box is currently busy.  Together they sanity-test
the sharing of a reservation with 2 different life-cycles.


Signed-off-by:  Jim Cromie <[EMAIL PROTECTED]>
---
hwmon-superio-pc8736x-gpio
Kconfig|1 
pc8736x_gpio.c |   83 ++---

2 files changed, 34 insertions(+), 50 deletions(-)

diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/char/Kconfig hwmon-superio.old/drivers/char/Kconfig
--- hwmon-fan-push-offset/drivers/char/Kconfig  2007-10-14 13:00:24.0 -0600
+++ hwmon-superio.old/drivers/char/Kconfig  2007-10-14 17:22:23.0 -0600
@@ -943,6 +943,7 @@ config PC8736x_GPIO
depends on X86
default SCx200_GPIO # mostly N
select NSC_GPIO # needed for support routines
+   select SUPERIO_LOCKS
help
  Give userspace access to the GPIO pins on the National
  Semiconductor PC-8736x (x=[03456]) SuperIO chip.  The chip
diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/char/pc8736x_gpio.c hwmon-superio.old/drivers/char/pc8736x_gpio.c
--- hwmon-fan-push-offset/drivers/char/pc8736x_gpio.c   2007-10-14 13:00:24.0 -0600
+++ hwmon-superio.old/drivers/char/pc8736x_gpio.c   2007-10-14 17:22:23.0 -0600
@@ -20,6 +20,7 @@
#include 
#include 
#include 
+#include 
#include 

#define DEVNAME "pc8736x_gpio"
@@ -36,13 +37,12 @@ static DEFINE_MUTEX(pc8736x_gpio_config_
static unsigned pc8736x_gpio_base;
static u8 pc8736x_gpio_shadow[4];

-#define SIO_BASE1   0x2E   /* 1st command-reg to check */
-#define SIO_BASE2   0x4E   /* alt command-reg to check */
+static struct superio* gate;

-#define SIO_SID0x20/* SuperI/O ID Register */
-#define SIO_SID_VALUE  0xe9/* Expected value in SuperI/O ID Register */
-
-#define SIO_CF10x21/* chip config, bit0 is chip enable */
+static __devinit struct superio_search where = {
+   .cmdreg_addrs = { 0x2E, 0x4E },
+   .device_ids   = { 0xE1, 0xE8, 0xE4, 0xE5, 0xE9, 0 },
+};

#define PC8736X_GPIO_RANGE  16 /* ioaddr range */
#define PC8736X_GPIO_CT 32 /* minors matching 4 8 bit ports */
@@ -52,6 +52,7 @@ static u8 pc8736x_gpio_shadow[4];
#define SIO_GPIO_UNIT   0x7 /* unit number of GPIO */
#define SIO_VLM_UNIT0x0D
#define SIO_TMS_UNIT0x0E
+#define SIO_CF10x21/* chip config, bit0 is chip enable */

/* config-space addrs to read/write each unit's runtime addr */
#define SIO_BASE_HADDR  0x60
@@ -62,7 +63,6 @@ static u8 pc8736x_gpio_shadow[4];
#define SIO_GPIO_PIN_CONFIG 0xF1
#define SIO_GPIO_PIN_EVENT  0xF2

-static unsigned char superio_cmd = 0;
static unsigned char selected_device = 0xFF;/* bogus start val */

/* GPIO port runtime access, functionality */
@@ -76,35 +76,9 @@ static int port_offset[] = { 0, 4, 8, 10

static struct platform_device *pdev;  /* use in dev_*() */

-static inline void superio_outb(int addr, int val)
-{
-   outb_p(addr, superio_cmd);
-   outb_p(val, superio_cmd + 1);
-}
-
-static inline int superio_inb(int addr)
-{
-   outb_p(addr, superio_cmd);
-   return inb_p(superio_cmd + 1);
-}
-
-static int pc8736x_superio_present(void)
-{
-   /* try the 2 possible values, read a hardware reg to verify */
-   superio_cmd = SIO_BASE1;
-   if (superio_inb(SIO_SID) == SIO_SID_VALUE)
-   return superio_cmd;
-
-   superio_cmd = SIO_BASE2;
-   if (superio_inb(SIO_SID) == SIO_SID_VALUE)
-   return superio_cmd;
-
-   return 0;
-}
-
static void device_select(unsigned devldn)
{
-   superio_outb(SIO_UNIT_SEL, devldn);
+   superio_select(gate, devldn);
selected_device = devldn;
}

@@ -112,7 +86,7 @@ static void select_pin(unsigned iminor)
{
/* select GPIO port/pin from device minor number */
device_select(SIO_GPIO_UNIT);
-   superio_outb(SIO_GPIO_PIN_SELECT,
+   superio_outb(gate, SIO_GPIO_PIN_SELECT,
 ((iminor << 1) & 0xF0) | (iminor & 0x7));
}

@@ -121,19 +95,19 @@ static inline u32 pc8736x_gpio_configure
{
u32 config, new_config;

+   superio_enter(gate);
mutex_lock(_gpio_config_lock);

-   device_select(SIO_GPIO_UNIT);
+   /* read pin's current config value */
select_pin(index);
-
-   /* read current config value */
-   config = superio_inb(func_slct);
+   config = superio_inb(gate, func_slct);

/* set new config */
new_config = (config & mask) | bits;
-   superio_outb(func_slct, new_config);
+   superio_outb(gate, func_slct, new_config);

mutex_unlock(_gpio_config_lock);
+   superio_exit(gate);


[ patch .24-rc0 2/5 ] SuperIO locks coordinator - use in hwmon/w83627hf

2007-10-14 Thread Jim Cromie

02 - use superio-locks in drivers/hwmon/w83627hf.c

tested on an AMD-Barton mobo.


Signed-off-by:  Jim Cromie <[EMAIL PROTECTED]>
---
hwmon-superio-w83627hf
Kconfig|1 
w83627hf.c |  140 -

2 files changed, 58 insertions(+), 83 deletions(-)

diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/hwmon/Kconfig hwmon-superio.old/drivers/hwmon/Kconfig
--- hwmon-fan-push-offset/drivers/hwmon/Kconfig 2007-10-14 13:00:24.0 -0600
+++ hwmon-superio.old/drivers/hwmon/Kconfig 2007-10-14 17:22:23.0 -0600
@@ -675,6 +688,7 @@ config SENSORS_W83L785TS
config SENSORS_W83627HF
tristate "Winbond W83627HF, W83627THF, W83637HF, W83687THF, W83697HF"
select HWMON_VID
+   select SUPERIO_LOCKS
help
  If you say yes here you get support for the Winbond W836X7 series
  of sensor chips: the W83627HF, W83627THF, W83637HF, W83687THF and
diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/hwmon/w83627hf.c hwmon-superio.old/drivers/hwmon/w83627hf.c
--- hwmon-fan-push-offset/drivers/hwmon/w83627hf.c  2007-10-14 17:13:47.0 -0600
+++ hwmon-superio.old/drivers/hwmon/w83627hf.c  2007-10-14 17:22:23.0 -0600
@@ -50,6 +50,7 @@
#include 
#include 
#include 
+#include 
#include 
#include "lm75.h"

@@ -75,11 +76,6 @@ static int init = 1;
module_param(init, bool, 0);
MODULE_PARM_DESC(init, "Set to zero to bypass chip initialization");

-/* modified from kernel/include/traps.c */
-static int REG;/* The register to read/write */
-#defineDEV 0x07/* Register: Logical device select */
-static int VAL;/* The value to read/write */
-
/* logical device numbers for superio_select (below) */
#define W83627HF_LD_FDC 0x00
#define W83627HF_LD_PRT 0x01
@@ -97,8 +93,6 @@ static int VAL;   /* The value to read/wr
#define W83627HF_LD_ACPI0x0a
#define W83627HF_LD_HWM 0x0b

-#defineDEVID   0x20/* Register: Device ID */
-
#define W83627THF_GPIO5_EN  0x30 /* w83627thf only */
#define W83627THF_GPIO5_IOSR0xf3 /* w83627thf only */
#define W83627THF_GPIO5_DR  0xf4 /* w83627thf only */
@@ -107,47 +101,25 @@ static int VAL;   /* The value to read/wr
#define W83687THF_VID_CFG   0xF0 /* w83687thf only */
#define W83687THF_VID_DATA  0xF1 /* w83687thf only */

-static inline void
-superio_outb(int reg, int val)
-{
-   outb(reg, REG);
-   outb(val, VAL);
-}
-
-static inline int
-superio_inb(int reg)
-{
-   outb(reg, REG);
-   return inb(VAL);
-}
-
-static inline void
-superio_select(int ld)
-{
-   outb(DEV, REG);
-   outb(ld, VAL);
-}
-
-static inline void
-superio_enter(void)
-{
-   outb(0x87, REG);
-   outb(0x87, REG);
-}
-
-static inline void
-superio_exit(void)
-{
-   outb(0xAA, REG);
-}
+#define W627_DEVID 0x52
+#define W627THF_DEVID  0x82
+#define W697_DEVID 0x60
+#define W637_DEVID 0x70
+#define W687THF_DEVID  0x85
+
+#define WINB_ACT_REG   0x30
+#define WINB_BASE_REG  0x60
+
+static struct superio* gate;
+
+static __devinit struct superio_search where = {
+   .cmdreg_addrs = { 0x2e, 0x4e },
+   .device_ids   = { W627_DEVID, W627THF_DEVID, W697_DEVID,
+ W637_DEVID, W687THF_DEVID, 0 },
+   .enter_seq= { 0x87, 0x87, 0 },
+   .exit_seq = { 0xAA, 0 }
+};

-#define W627_DEVID 0x52
-#define W627THF_DEVID 0x82
-#define W697_DEVID 0x60
-#define W637_DEVID 0x70
-#define W687THF_DEVID 0x85
-#define WINB_ACT_REG 0x30
-#define WINB_BASE_REG 0x60
/* Constants specified below */

/* Alignment of the base address */
@@ -995,12 +967,10 @@ show_name(struct device *dev, struct dev
return sprintf(buf, "%s\n", data->name);
}
static DEVICE_ATTR(name, S_IRUGO, show_name, NULL);
-
-static int __init w83627hf_find(int sioaddr, unsigned short *addr,
-   struct w83627hf_sio_data *sio_data)
+   
+static u16 __init w83627hf_find(struct w83627hf_sio_data *sio_data)
{
-   int err = -ENODEV;
-   u16 val;
+   u16 val, addr = 0;

static const __initdata char *names[] = {
"W83627HF",
@@ -1010,11 +980,15 @@ static int __init w83627hf_find(int sioa
"W83687THF",
};

-   REG = sioaddr;
-   VAL = sioaddr + 1;
+   gate = superio_find();
+   if (!gate) {
+   printk(KERN_WARNING DRVNAME ": superio port not detected, "
+  "module not installed.\n");
+   return 0;
+   }
+   superio_enter(gate);
+   val = superio_devid(gate);

-   superio_enter();
-   val= superio_inb(DEVID);
switch (val) {
case W627_DEVID:
sio_data->type = w83627hf;
@@ -1038,36 +1012,34 @@ static int __init w83627hf_find(int sioa

[ patch .24-rc0 1/5 ] SuperIO locks coordinator - add the module

2007-10-14 Thread Jim Cromie

01 - adds superio_locks module

User-drivers specify the sio-port characteristics they can support
(device-ids, sio-port-addrs, enter & exit sequences, etc) in
a struct superio_search (in __devinit, preferably).

superio_find() then searches existing slots/shared-reservations
for a matching sio-port, and returns it if found.
Otherwise it probes the port-addrs specified by the find() caller,
and makes and returns a new reservation.


   superio_find()      finds and reserves the slot,
                       returned as ptr or null

   superio_release()   relinquishes the slot (ref-counted)

Once they've got the reservation in struct superio * gate
(as named in patches 2-5) they *may* use


   superio_lock(gate)
   superio_enter/exit(gate)
   superio_inb/w(gate, regaddr),
   superio_outb/w(gate, regaddr, val)

or they can do it themselves with inb/outb, by using gate->sioaddr, etc.
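
The find/release behaviour described above (share an existing reservation,
otherwise make a new one, with ref-counted release) can be sketched in
user-space C.  This is only an illustration under stated assumptions:
struct sio_slot, sio_find() and sio_release() are hypothetical names, not
the module's code, and the real superio_find() also probes the hardware
and takes a struct superio_search rather than a bare address.

```c
#include <stddef.h>

#define MAX_SLOTS 3   /* mirrors the max_locks default of 3 in the patch */

/* Hypothetical user-space model of one shared reservation. */
struct sio_slot {
    unsigned short sioaddr;  /* command-port address, e.g. 0x2e */
    int users;               /* reference count; 0 means the slot is free */
};

struct sio_slot slots[MAX_SLOTS];

/* Return the slot already reserved for sioaddr, or claim a free one. */
struct sio_slot *sio_find(unsigned short sioaddr)
{
    struct sio_slot *free_slot = NULL;
    int i;

    for (i = 0; i < MAX_SLOTS; i++) {
        if (slots[i].users && slots[i].sioaddr == sioaddr) {
            slots[i].users++;        /* share the existing reservation */
            return &slots[i];
        }
        if (!slots[i].users && !free_slot)
            free_slot = &slots[i];   /* remember the first free slot */
    }
    if (!free_slot)
        return NULL;                 /* all slots are taken */
    free_slot->sioaddr = sioaddr;
    free_slot->users = 1;
    return free_slot;
}

/* Drop one reference; the slot is reusable once the count reaches zero. */
void sio_release(struct sio_slot *slot)
{
    if (slot && slot->users > 0)
        slot->users--;
}
```

Two drivers calling sio_find(0x2e) get the same slot back, which is what
lets them share the lock stored in it.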

The API names (superio_find etc) were chosen to fit the idiom
used in hwmon/*.c; patches 2-5 remove the per-user-driver
copies of the superio_*() fns.

I added the module to /drivers/hwmon, mostly cuz that's where
I've used it - perhaps drivers/isa is better?


Signed-off-by:  Jim Cromie <[EMAIL PROTECTED]>
---
hwmon-superio-module
 drivers/hwmon/Kconfig         |    9 +
 drivers/hwmon/Makefile        |    1
 drivers/hwmon/superio_locks.c |  235 ++
 include/linux/superio-locks.h |  112
 4 files changed, 357 insertions(+)

diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/hwmon/Kconfig hwmon-superio.old/drivers/hwmon/Kconfig
--- hwmon-fan-push-offset/drivers/hwmon/Kconfig 2007-10-14 13:00:24.0 -0600
+++ hwmon-superio.old/drivers/hwmon/Kconfig 2007-10-14 17:22:23.0 -0600
@@ -750,4 +765,13 @@ config HWMON_DEBUG_CHIP
  a problem with I2C support and want to see more of what is going
  on.

+config SUPERIO_LOCKS
+   tristate "Super-IO port sharing"
+   default n
+   help
+ This module provides a shared reservation for use by drivers
+ which need to share access to a multi-function device via
+ its superio port, and which register that port.
+
endif # HWMON
+
diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/hwmon/Makefile hwmon-superio.old/drivers/hwmon/Makefile
--- hwmon-fan-push-offset/drivers/hwmon/Makefile 2007-10-14 13:00:24.0 -0600
+++ hwmon-superio.old/drivers/hwmon/Makefile 2007-10-14 17:22:23.0 -0600
@@ -72,3 +72,4 @@ ifeq ($(CONFIG_HWMON_DEBUG_CHIP),y)
EXTRA_CFLAGS += -DDEBUG
endif

+obj-$(CONFIG_SUPERIO_LOCKS)+= superio_locks.o
diff -ruNp -X dontdiff -X exclude-diffs hwmon-fan-push-offset/drivers/hwmon/superio_locks.c hwmon-superio.old/drivers/hwmon/superio_locks.c
--- hwmon-fan-push-offset/drivers/hwmon/superio_locks.c 1969-12-31 17:00:00.0 -0700
+++ hwmon-superio.old/drivers/hwmon/superio_locks.c 2007-10-14 20:27:49.0 -0600
@@ -0,0 +1,235 @@
+
+#define DRVNAME "superio_locks"
+#define DEBUG 1
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+MODULE_AUTHOR("Jim Cromie <[EMAIL PROTECTED]");
+MODULE_LICENSE("GPL");
+
+/*
+ * This module allows multiple driver modules to coordinate their use
+ * of a Super-IO port to control the multiple logical devices behind
+ * it.  Drivers will superio_find() their port, and 2 such modules
+ * will share a slot containing the lock.
+ */
+static int max_locks = 3;  /* 3 is enough for 90% uses */
+module_param(max_locks, int, 0);
+MODULE_PARM_DESC(max_locks,
+" Number of sio-lock clients to serve (default=3)");
+
+static int paranoid;
+module_param(paranoid, int, 0);
+MODULE_PARM_DESC(paranoid,
+" when true, fails if superio-port region is claimed");
+
+static struct superio *sio_locks;
+static struct mutex reservation_lock;
+
+#define SIO_DEVID_ADDR_STD 0x20
+#define SIO_LDN_ADDR_STD 0x07
+
+struct superio* superio_probe_reserve(const struct superio_search * const where,
+ int cmd_addr, int want_devid)
+{
+   struct superio *gate;
+   u16 devid_addr = where->devid_addr;
+   u16 mydevid;
+   int slot, rc, i;
+
+   if (!devid_addr)
+   devid_addr = SIO_DEVID_ADDR_STD;
+
+
+   /* send superio-enter sequence for devices which need them */
+   for (i = 0; i < SEQ_SZ && where->enter_seq[i]; i++)
+   outb(where->enter_seq[i], cmd_addr);
+
+   outb(devid_addr, cmd_addr);
+
+   /* sanity check that cmd-reg remembers the val just written
+   rc = inb(cmd_addr);
+   if (rc != (devid_addr & 0xFF)) {
+   pr_debug("cmd-reg absent at %x, got %x\n", cmd_addr, rc);
+   return NULL;
+   }
+   */
+   /* Read the device-id register(s), using cmd written above */
+   if (!where->devid_word) {
+   mydevid = inb(cmd_addr+1);
+   } else {
+   /* want 16 bit devid, so get it 

[ patch .24-rc0 0/5 ] SuperIO locks coordinator

2007-10-14 Thread Jim Cromie
this patchset (on hwmon-git) re-introduces superio_locks module, 
previously RFC'd here, where I 'borrowed' another thread..


http://marc.info/?l=linux-kernel&m=115821759424601&w=2

The module shares out slots/shared-reservations containing
a mutex, so that multiple modules can coordinate access to
the sio-port.

I'm crossposting - LKML for more reviewers, lm-sensors for
folks with the hardware and (perhaps) more interest.

If it's not too late, please consider for 2.6.24-rc0

01 - adds superio_locks module

User-drivers specify the sio-port characteristics they can support
(device-ids, sio-port-addrs, enter & exit sequences, etc) in
a struct superio_search (in __devinit, preferably).

superio_find() then searches existing slots/shared-reservations
for a matching sio-port, and returns it if found.
Otherwise it probes the port-addrs specified by the find() caller,
and makes and returns a new reservation.


   superio_find()      finds and reserves the slot,
                       returned as ptr or null

   superio_release()   relinquishes the slot (ref-counted)

Once they've got the reservation in struct superio * gate
(as named in patches 2-5) they *may* use


   superio_lock(gate)
   superio_enter/exit(gate)
   superio_inb/w(gate, regaddr),
   superio_outb/w(gate, regaddr, val)

or they can do it themselves with inb/outb, by using gate->sioaddr, etc.
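
The enter & exit sequences mentioned above are zero-terminated byte lists
written to the command port.  A minimal user-space sketch, assuming a
recording stand-in for outb() so it runs without hardware: fake_outb(),
send_seq(), write_log and the SEQ_SZ value of 3 are hypothetical (the
patch's own SEQ_SZ is not shown in this posting).

```c
#include <stddef.h>

#define SEQ_SZ 3    /* assumed capacity; the patch's SEQ_SZ value is not shown */
#define LOG_SZ 16

/* Recorded port writes, standing in for real outb() calls. */
struct port_write {
    unsigned char val;
    unsigned short port;
};

struct port_write write_log[LOG_SZ];
size_t write_count;

/* Stand-in for outb(): record the write instead of touching hardware. */
void fake_outb(unsigned char val, unsigned short port)
{
    if (write_count < LOG_SZ) {
        write_log[write_count].val = val;
        write_log[write_count].port = port;
        write_count++;
    }
}

/* Send a zero-terminated command sequence to the command port. */
void send_seq(const unsigned char seq[SEQ_SZ], unsigned short cmd_addr)
{
    int i;

    /* Stop at the first zero byte or at the end of the array. */
    for (i = 0; i < SEQ_SZ && seq[i]; i++)
        fake_outb(seq[i], cmd_addr);
}
```

With the Winbond values from patch 2 (enter 0x87, 0x87; exit 0xAA) this
issues two writes on enter and one on exit; a chip that needs no sequence
simply supplies an all-zero array.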

The API names (superio_find etc) were chosen to fit the idiom
used in hwmon/*.c; patches 2-5 remove the per-user-driver
copies of the superio_*() fns.

I added the module to /drivers/hwmon, mostly cuz that's where
I've used it - perhaps drivers/isa is better?


02 - use superio-locks in drivers/hwmon/w83627hf.c

tested on an AMD-Barton mobo.


03 - use superio-locks in drivers/hwmon/pc87360

this driver keeps the slot only during __init, since it
only needs the sio-port to read the ISA addresses of the 
Logical Devices in the chip, which are then used exclusively.



04 - use superio-locks in drivers/char/pc8736x_gpio

this driver keeps the slot for the lifetime of the driver
( __init til __exit ), since the driver needs the sio-port
to change pin configurations.

patches 03,04 were tested on a soekris 4801 a year ago,
the box is currently busy.  Together they sanity-test
the sharing of a reservation with 2 different life-cycles.


05 - use superio-locks in rest of drivers/hwmon/*.c

this patch is compile-tested only, please review for sanity before you
try running the drivers.  Things to look for - missing superio_release(),
opportunities to use superio_devid(), superio_inw(), etc.


Driver sizes:
without/with DEBUG
   text    data     bss     dec     hex filename
   2364     664      96    3124     c34 drivers/hwmon/superio_locks.ko
   2938     664      96    3698     e72 drivers/hwmon/superio_locks.ko

effect on user-driver sizes:
before/after
   text    data     bss     dec     hex filename
  12209    4004      36   16249    3f79 drivers/hwmon/pc87360.ko
  12434    4068      36   16538    409a drivers/hwmon/pc87360.ko

I hope this is small enough; per the link, there apparently are 
actual problems (not just theoretical) that this should help with.


OPERATION

The previous thread includes cut-paste from dmesg, from when 
I tested on the soekris.  No point in repasting here..


RANDOM POINTS/CAVEATS

- I've not tested the 16-bit device-id check (no hardware)

- superio_(enter|exit) and superio_(un)?lock are nearly redundant.
The former also sends the active/idle command sequences, with a tiny
overhead when none are needed (e.g. the pc87360 chip).  One driver
in patch 5 needed a my_superio_enter() due to an odd unlock sequence.

- uses request_region - this might detect a 'rogue' sio-port user
(which requests region, but doesn't use this module).
It did detect a missing superio_release(), which I hacked around
with the paranoid mod-option (easier than rebooting)

- superio-locks.h has static-inline fns, probably not enough
code to do an export for, since >2 sio-ports is probably rare.

- superio_locks may be relevant elsewhere (ACPI ?)
but I haven't thought about them, hwmon seemed like a good testcase.


- implements several defaults (byte-sized devid, ldn-addr=7,
devid-addr=0x20).  Perhaps more are possible, as long as there's
an override.  I avoided setting the defaults into struct superio_search,
so that it could be const'd.

Please identify those that are standard enough.
This could include #define LPC_* constants, for use in
superio_inb(gate, LPC_LDN), superio_inw(gate, LPC_ISA_ADDR) 
then user drivers could just #define SIO_* for non-standard registers.


- Hans, I think I've added all your suggestions, thanks.
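
On the defaults caveat above: the search descriptor can stay const if a
zero field is read as "use the standard value" at lookup time, as the
devid_addr handling in patch 1 does.  A minimal sketch, assuming a
trimmed-down descriptor; struct sio_search, its ldn_addr field and the
two effective_*() helpers are hypothetical names for illustration.

```c
#define SIO_DEVID_ADDR_STD 0x20   /* default device-id register, per patch 1 */
#define SIO_LDN_ADDR_STD   0x07   /* default logical-device-number register */

/* Hypothetical, trimmed-down search descriptor; 0 means "use the default". */
struct sio_search {
    unsigned short devid_addr;
    unsigned short ldn_addr;
};

/* Resolve the device-id register without writing to the descriptor. */
unsigned short effective_devid_addr(const struct sio_search *where)
{
    return where->devid_addr ? where->devid_addr : SIO_DEVID_ADDR_STD;
}

/* Resolve the LDN register the same way. */
unsigned short effective_ldn_addr(const struct sio_search *where)
{
    return where->ldn_addr ? where->ldn_addr : SIO_LDN_ADDR_STD;
}
```

The cost of this choice is that a register at address 0 cannot be
expressed as an override.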

Signed-off-by:  Jim Cromie <[EMAIL PROTECTED]>
---


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 005 of 5] md: Fix type that is stopping raid5 grow from working.

2007-10-14 Thread NeilBrown

This kmem_cache_create is creating a cache that already exists.  We
could use the alternate name, just like we do a few lines up.
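
A minimal sketch of the two-name scheme the fix relies on, assuming the
new cache is created under whichever name is not currently in use.
struct cache_names, alternate_name() and switch_active() are hypothetical
stand-ins, and the flip after a successful resize is an assumption rather
than something shown in this diff.

```c
/* Hypothetical model: two cache names, only one in use at a time. */
struct cache_names {
    const char *name[2];
    int active_name;      /* index of the name currently in use: 0 or 1 */
};

/* Name for the replacement cache: the one that is NOT active. */
const char *alternate_name(const struct cache_names *c)
{
    return c->name[1 - c->active_name];
}

/* Assumed follow-up once the new cache has replaced the old one. */
void switch_active(struct cache_names *c)
{
    c->active_name = 1 - c->active_name;
}
```

Indexing with active_name itself, as the old line did, asks for a cache
under the name that is still in use.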

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Cc: "Dan Williams" <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/raid5.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c2007-10-15 14:12:03.0 +1000
+++ ./drivers/md/raid5.c2007-10-15 14:12:06.0 +1000
@@ -1380,7 +1380,7 @@ static int resize_stripes(raid5_conf_t *
if (!sc)
return -ENOMEM;
 
-   sc_q = kmem_cache_create(conf->sq_cache_name[conf->active_name],
+   sc_q = kmem_cache_create(conf->sq_cache_name[1-conf->active_name],
   (sizeof(struct stripe_queue)+(newsize-1) *
sizeof(struct r5_queue_dev)) +
r5_io_weight_size(newsize) +


[PATCH 003 of 5] md: Expose the degraded status of an assembled array through sysfs

2007-10-14 Thread NeilBrown

From: Iustin Pop <[EMAIL PROTECTED]>

The 'degraded' attribute is useful to quickly determine if the array is
degraded, instead of parsing 'mdadm -D' output or relying on the other
techniques (number of working devices against number of defined devices, etc.).
The md code already keeps track of this attribute, so it's useful to export it.
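
For illustration, a user-space consumer only has to parse one integer.
This sketch assumes the attribute text is a decimal count followed by a
newline (as the sprintf in the patch produces), that 0 means not degraded,
and that the file shows up somewhere like /sys/block/md0/md/degraded; the
path and both helper names are assumptions.

```c
#include <stdio.h>

/*
 * Parse the text of the 'degraded' attribute ("<count>\n").
 * Returns 0 and stores the count in *out, or -1 on malformed input.
 */
int parse_degraded(const char *page, int *out)
{
    int n;

    if (sscanf(page, "%d", &n) != 1 || n < 0)
        return -1;
    *out = n;
    return 0;
}

/* Assumption: any non-zero count means the array is degraded. */
int is_degraded(int degraded_count)
{
    return degraded_count > 0;
}
```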

Signed-off-by: Iustin Pop <[EMAIL PROTECTED]>
Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/md.c |7 +++
 1 file changed, 7 insertions(+)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2007-10-15 14:06:32.0 +1000
+++ ./drivers/md/md.c   2007-10-15 14:06:52.0 +1000
@@ -2833,6 +2833,12 @@ sync_max_store(mddev_t *mddev, const cha
 static struct md_sysfs_entry md_sync_max =
 __ATTR(sync_speed_max, S_IRUGO|S_IWUSR, sync_max_show, sync_max_store);
 
+static ssize_t
+degraded_show(mddev_t *mddev, char *page)
+{
+   return sprintf(page, "%d\n", mddev->degraded);
+}
+static struct md_sysfs_entry md_degraded = __ATTR_RO(degraded);
 
 static ssize_t
 sync_speed_show(mddev_t *mddev, char *page)
@@ -2976,6 +2982,7 @@ static struct attribute *md_redundancy_a
 	&md_suspend_lo.attr,
 	&md_suspend_hi.attr,
 	&md_bitmap.attr,
+	&md_degraded.attr,
NULL,
 };
 static struct attribute_group md_redundancy_group = {


[PATCH 004 of 5] md: Make sure read errors are auto-corrected during a 'check' resync in raid1

2007-10-14 Thread NeilBrown

Whenever a read error is found, we should attempt to overwrite with
correct data to 'fix' it.

However, when we do a 'check' pass (which compares data blocks that are
successfully read, but doesn't normally overwrite) we don't do that.
We should.
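
The changed condition can be read as a small predicate.  A sketch,
assuming j < 0 in the diff means the compared blocks matched;
skip_writeback() and its parameter names are hypothetical.

```c
#include <stdbool.h>

/*
 * Decide whether the write-back to a mirror can be skipped.
 *   blocks_match : the comparison found no difference (j < 0 in the diff)
 *   check_mode   : MD_RECOVERY_CHECK is set
 *   read_ok      : the read from this mirror succeeded (BIO_UPTODATE)
 */
bool skip_writeback(bool blocks_match, bool check_mode, bool read_ok)
{
    return blocks_match || (check_mode && read_ok);
}
```

The case that changes is a failed read during 'check': before the patch
it was skipped like any other check-mode block, now it is written back.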

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/raid1.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c2007-10-15 14:07:17.0 +1000
+++ ./drivers/md/raid1.c2007-10-15 14:08:55.0 +1000
@@ -1214,7 +1214,8 @@ static void sync_request_write(mddev_t *
j = 0;
if (j >= 0)
mddev->resync_mismatches += r1_bio->sectors;
-   if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
+   if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
+ && test_bit(BIO_UPTODATE, &sbio->bi_flags))) {
sbio->bi_end_io = NULL;
rdev_dec_pending(conf->mirrors[i].rdev, mddev);
} else {


[PATCH 002 of 5] md: 'sync_action' in sysfs returns wrong value for readonly arrays

2007-10-14 Thread NeilBrown

When an array is started read-only, MD_RECOVERY_NEEDED can be set but
no recovery will be running.  This causes 'sync_action' to report the
wrong value.

We could remove the test for MD_RECOVERY_NEEDED, but doing so would
leave a small gap after requesting a sync action, where 'sync_action'
would still report the old value.

So make sure that for a read-only array, 'sync_action' always returns 'idle'.
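
As a sketch, the new test in action_show() reduces to the predicate
below; sync_action_is_active() and its parameter names are hypothetical,
and a false result corresponds to reporting 'idle'.

```c
#include <stdbool.h>

/*
 * True when 'sync_action' should report something other than "idle".
 *   running   : MD_RECOVERY_RUNNING is set
 *   needed    : MD_RECOVERY_NEEDED is set
 *   read_only : the array was started read-only (mddev->ro)
 */
bool sync_action_is_active(bool running, bool needed, bool read_only)
{
    return running || (!read_only && needed);
}
```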


Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/md.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2007-10-15 14:06:32.0 +1000
+++ ./drivers/md/md.c   2007-10-15 14:06:32.0 +1000
@@ -2714,7 +2714,7 @@ action_show(mddev_t *mddev, char *page)
 {
char *type = "idle";
if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
-   test_bit(MD_RECOVERY_NEEDED, &mddev->recovery)) {
+   (!mddev->ro && test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))) {
if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
type = "reshape";
else if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {


[PATCH 001 of 5] md: Fix a bug in some never-used code.

2007-10-14 Thread NeilBrown

http://bugzilla.kernel.org/show_bug.cgi?id=3277

There is a seq_printf here that isn't being passed a 'seq'.
However, as the code is inside #ifdef MD_DEBUG, nobody noticed.

Also remove some extra spaces.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/raid0.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
--- .prev/drivers/md/raid0.c2007-10-15 14:05:58.0 +1000
+++ ./drivers/md/raid0.c2007-10-15 14:06:05.0 +1000
@@ -472,7 +472,7 @@ bad_map:
bio_io_error(bio);
return 0;
 }
-  
+
 static void raid0_status (struct seq_file *seq, mddev_t *mddev)
 {
 #undef MD_DEBUG
@@ -480,18 +480,18 @@ static void raid0_status (struct seq_fil
int j, k, h;
char b[BDEVNAME_SIZE];
raid0_conf_t *conf = mddev_to_conf(mddev);
-  
+
h = 0;
for (j = 0; j < conf->nr_strip_zones; j++) {
seq_printf(seq, "  z%d", j);
if (conf->hash_table[h] == conf->strip_zone+j)
-   seq_printf("(h%d)", h++);
+   seq_printf(seq, "(h%d)", h++);
seq_printf(seq, "=[");
for (k = 0; k < conf->strip_zone[j].nb_dev; k++)
-   seq_printf (seq, "%s/", bdevname(
+   seq_printf(seq, "%s/", bdevname(
conf->strip_zone[j].dev[k]->bdev,b));
 
-   seq_printf (seq, "] zo=%d do=%d s=%d\n",
+   seq_printf(seq, "] zo=%d do=%d s=%d\n",
conf->strip_zone[j].zone_offset,
conf->strip_zone[j].dev_offset,
conf->strip_zone[j].size);


[PATCH 000 of 5] md: Five minor md patch, some for 2.6.24.

2007-10-14 Thread NeilBrown
Following are 5 minor patches for md in current -mm.

The first 4 are suitable to flow into 2.6.24.

The last fixes a small bug in Dan Williams' patches currently in -mm,
which are not scheduled for 2.6.24.

Thanks,
NeilBrown


 [PATCH 001 of 5] md: Fix a bug in some never-used code.
 [PATCH 002 of 5] md: 'sync_action' in sysfs returns wrong value for readonly arrays
 [PATCH 003 of 5] md: Expose the degraded status of an assembled array through sysfs
 [PATCH 004 of 5] md: Make sure read errors are auto-corrected during a 'check' resync in raid1
 [PATCH 005 of 5] md: Fix type that is stopping raid5 grow from working.


Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread David Chinner
On Sun, Oct 14, 2007 at 09:18:17PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > With defaults - little effect as vmap should never be used. It's
> > only when you start using larger block sizes for metadata that this
> > becomes an issue. The CONFIG_XEN workaround should be fine until we
> > get a proper vmap cache
> 
> Hm, well I saw the problem with a filesystem made with mkfs.xfs with no
> options, so there must be at least *some* vmapping going on there.

Sorry - I should have been more precise - vmap should never be used in
performance critical paths on default configs.  Log recovery will
trigger vmap/vunmap usage, so this is probably what you are seeing.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread Jeremy Fitzhardinge
David Chinner wrote:
> With defaults - little effect as vmap should never be used. It's
> only when you start using larger block sizes for metadata that this
> becomes an issue. The CONFIG_XEN workaround should be fine until we
> get a proper vmap cache

Hm, well I saw the problem with a filesystem made with mkfs.xfs with no
options, so there must be at least *some* vmapping going on there.

J


Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread David Chinner
On Sun, Oct 14, 2007 at 08:42:34PM -0700, Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> >  That's not going to
> > happen for at least a cycle or two though, so in the meantime maybe
> > an ifdef for that XFS vmap batching code would help?
> >   
> 
> For now I've proposed a patch to simply eagerly vunmap everything when
> CONFIG_XEN is set.  It certainly works, but I don't have a good feel for
> how much of a performance hit that imposes on XFS.

With defaults - little effect as vmap should never be used. It's
only when you start using larger block sizes for metadata that this
becomes an issue. The CONFIG_XEN workaround should be fine until we
get a proper vmap cache

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


[GIT PATCH] SCSI updates for 2.6.24

2007-10-14 Thread James Bottomley
This is the accumulated updates queued for 2.6.24.  It contains the
usual slew of driver updates, plus some gdth and advansys rewrites.  We
still have some outstanding bugs in gdth and fc4 for which I'm hoping to
sweep fixes into the next update.

The patch is available here:

master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git

The short changelog is:

Adrian Bunk (5):
  esp_scsi: remove __dev{init,exit}
  imm: fix check-after-use
  nsp_cs: remove kernel 2.4 code
  make scsi_decode_sense_buffer and scsi_decode_sense_extras static
  scsi_error.c should #include "scsi_transport_api.h"

Alan Cox (3):
  dtc: Fix typo
  eata_pio: Clean up proc handling, bracketing and use cpu_relax()
  dtc: clean up indent damage and add printk levels

Andrew Morton (3):
  arcmsr: build fix
  ips: warning fix
  aacraid: rename check_reset

Andrew Vasquez (15):
  qla2xxx: Update version number to 8.02.00-k4.
  qla2xxx: Limit iIDMA speed adjustments.
  qla2xxx: Rework MSI-X handlers.
  qla2xxx: Clear options-flags while staging firmware-execution.
  qla2xxx: Sparse cleanups in qla_mid.c
  qla2xxx: Cleanup several 'sparse' warnings.
  qla2xxx: Use shost_priv().
  qla2xxx: Remove unused member (list) from srb_t structure.
  qla2xxx: Use the correct pointer-address during NVRAM writes.
  qla2xxx: Set correct attribute count during FDMI RPA.
  qla2xxx: Query additional RISC registers during ISP25XX firmware dump.
  qla2xxx: Correct staging of RISC while attempting to pause.
  qla2xxx: Query additional RISC information during a pause.
  qla2xxx: Add flash burst-read/write support.
  qla2xxx: Collapse and simplify ISP2XXX firmware dump routines.

Bartlomiej Zolnierkiewicz (1):
  MAINTAINERS: mark ide-scsi as Orphan

Bernhard Walle (1):
  ips: Update version information

Boaz Harrosh (12):
  gdth: !use_sg cleanup and use of scsi accessors
  gdth: Move members from SCp to gdth_cmndinfo, stage 2
  gdth: Setup proper per-command private data
  gdth: Remove gdth_ctr_tab[]
  gdth: gdth_interrupt() gdth_get_status() & gdth_wait() fixes
  gdth: clean up host private data
  NCR5380: Use scsi_eh API for REQUEST_SENSE invocation
  usb storage: use scsi_eh API in REQUEST_SENSE execution
  scsi_error: Refactoring scsi_error to facilitate in synchronous REQUEST_SE
  scsi_error: code cleanup before refactoring of scsi_send_eh_cmnd()
  ide-scsi.: convert to data accessors and !use_sg cleanup
  microtek: use data accessors and !use_sg cleanup

Christof Schmitt (5):
  zfcp: Enable debug feature before setting adapter online
  scsi_transport_fc: Introduce disable_target_scan flag
  zfcp: Remove braces for only one statement
  zfcp: Remove unnecessary assignment
  zfcp: correct indentation for nested if-else

Christoph Hellwig (5):
  gdth: switch to modern scsi host registration
  gdth: Remove virt hosts
  gdth: split out pci probing
  gdth: split out eisa probing
  gdth: split out isa probing

David Woodhouse (1):
  Fix ibmvscsi client for multiplatform iSeries+pSeries kernel

Dhaval Giani (1):
  gdth: fix CONFIG_ISA build failure

Eric Moore (16):
  mptctl : shutup uninitialized variable warnings
  mptlan: bug fix, only half the message frame is dma'd resulting in corrupt
  mpt fusion: fix up fusion prints using the sdev_printk, dev_printk, and sh
  mpt fusion: lock down ScsiLookup
  mpt fusion: Fix sparse warnings
  mpt fusion: add use of shost_priv and remove all the typecasting
  MAINTAINERS : mpt fusion mailing list change
  mpt fusion: bump version to 3.04.06
  mpt fusion: Kconfig cleanup
  mpt fusion: removing Dell copyright
  mpt fusion: removing references to hd->ioc
  mpt fusion: rename vdev to vdevice
  mpt fusion: adding/removing white space
  mpt fusion: standardize printks and debug info
  mpt fusion: Add support for ATTO 4LD: Rebranded LSI 53C1030
  Addition to pci_ids.h for ATTO Technology, Inc.

FUJITA Tomonori (17):
  srp_transport: convert to use supported_mode attribute
  fc_transport: add target driver support
  add supported_mode and active_mode attributes to the host
  tgt: fix can_queue bug
  fc4: convert to use the data buffer accessors
  sg: increase sglist_len of the sg_scatter_hold structure
  ps3rom: convert to use the data buffer accessors
  scsi_transport_srp: remove tgt dependencies
  tgt: convert ibmvstgt to use transport tsk_mgmt_response callback
  tgt: move tsk_mgmt_response callback to transport class
  tgt: convert libsrp and ibmvstgt to use srp_transport
  srp_transport: add target driver support
  tgt: add I_T nexus support
  transport_srp: add rport roles attribute
  ib_srp: convert to use the srp transport class
  ibmvscsi: convert to use the srp transport class
 

Re: [2.6 patch] __inet6_csk_dst_store(): fix check-after-use

2007-10-14 Thread Noriaki TAKAMIYA
Hi,

>> Mon, 15 Oct 2007 11:45:10 +0900
>> [Subject: Re: [2.6 patch] __inet6_csk_dst_store(): fix check-after-use]
>> Masahide NAKAMURA <[EMAIL PROTECTED]> wrote...

> 
> On Sun, 14 Oct 2007 19:52:12 +0200
> Adrian Bunk <[EMAIL PROTECTED]> wrote:
> 
> > The Coverity checker spotted that we have already oops'ed if "dst"
> > was NULL.
> > 
> > Since "dst" being NULL doesn't seem to be possible at this point this 
> > patch removes the NULL check.
> > 
> > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> 
> Agreed.
> 
> Acked-by: Masahide NAKAMURA <[EMAIL PROTECTED]>

  I also agree.
  
Acked-by: Noriaki TAKAMIYA <[EMAIL PROTECTED]>

--
Noriaki TAKAMIYA


Re: [PATCH] doc: add uio document to docbook compilation target

2007-10-14 Thread Randy Dunlap
On Mon, 15 Oct 2007 08:21:01 +0900 Satoru Takeuchi wrote:

> Add uio document to DocBook compilation target.
> 
> `make *docs' doesn't generate "The Userspace I/O HOWTO", the user space
> I/O document written in DocBook.
> 
> Signed-off-by: Satoru Takeuchi <[EMAIL PROTECTED]>
> 
> Index: linux/Documentation/DocBook/Makefile
> ===
> --- linux.orig/Documentation/DocBook/Makefile 2007-10-12 23:54:19.0 
> +0900
> +++ linux/Documentation/DocBook/Makefile  2007-10-12 23:55:14.0 
> +0900
> @@ -11,7 +11,7 @@ DOCBOOKS := wanbook.xml z8530book.xml mc
>   procfs-guide.xml writing_usb_driver.xml \
>   kernel-api.xml filesystems.xml lsm.xml usb.xml \
>   gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
> - genericirq.xml
> + genericirq.xml uio-howto.xml
>  
>  ###
>  # The build process is as follows (targets):

Hi,

Thanks, looks reasonable, but patch does not apply cleanly to
mainline or -mm.  (trivial to fix)

---
~Randy


Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread Jeremy Fitzhardinge
Nick Piggin wrote:
> Yeah, it would be possible. The easiest way would just be to shoot down
> all lazy vmaps (because you're doing the global IPIs anyway, which are
> the expensive thing, at which point you may as well purge the rest of
> your lazy mappings).
>   

Sure.

> If it is sufficiently rare, then it could be the simplest thing to do.
>   

Yes.  If there's some way to tell whether a particular page is in a lazy
mapping then that would help, since we could easily tell whether we need
to do the whole shootdown thing.  I would expect the population of
lazily mapped pages in the whole freepage pool to be pretty small, but
if the allocator tends to return the most recently freed pages you might
hit them fairly regularly (shoving them at the other end of the freelist
might be useful).

> OK, I see. Because even though it is technically safe where we are
> using it (because nothing writes through the mappings after the page
> is freed), a corrupted guest could use the same window to do bad
> things with the pagetables?
>   

That's right.  The hypervisor doesn't trust the guests, so it prevents
them from getting into a state where they can do bad things.

> For Xen -- shouldn't be a big deal. We can have a single Linux mm API
> to call, and we can do the right thing WRT vmap/kmap. I should try to
> merge my current lazy vmap patches which replace the XFS stuff, so we
> can implement such an API and fix your XFS issue?

Sounds good.

>  That's not going to
> happen for at least a cycle or two though, so in the meantime maybe
> an ifdef for that XFS vmap batching code would help?
>   

For now I've proposed a patch to simply eagerly vunmap everything when
CONFIG_XEN is set.  It certainly works, but I don't have a good feel for
how much of a performance hit that imposes on XFS.  A slightly more
subtle change would be to test to see if we're actually running under
Xen before taking the eager path, so at least the performance burden
only affects actual Xen users (and I presume xfs+xen is a fairly rare
combination, or this problem would have turned up earlier, or perhaps
the old xenified kernels have some other workaround for it).

J


Re: [PATCH] Version 7 (2.6.23) Smack: Simplified Mandatory Access Control Kernel

2007-10-14 Thread Casey Schaufler

--- "Ahmed S. Darwish" <[EMAIL PROTECTED]> wrote:

> Hi Casey,
> 
> On Sun, Oct 14, 2007 at 10:15:42AM -0700, Casey Schaufler wrote:
> > 
> > +
> > +CIPSO Configuration
> > +
> > +It is normally unnecessary to specify the CIPSO configuration. The default
> > +values used by the system handle all internal cases. Smack will compose CIPSO
> > +label values to match the Smack labels being used without administrative
> > +intervention. 
> >
> 
> I have two issues with CIPSO and Smack:
> 
> 1-
> 
> Using default configuration (system startup script + smackfs fstab line),
> system can't access any service outside the LAN. "ICMP parameter problem
> message" always appears from the first WAN router (traceroute + tcpdump at [1]).
> 
> Services inside the LAN can be accessed normally. System can connect to a LAN
> Windows share. It also connects to the gateway HTTP server easily.
> 
> After some tweaking, I discovered that using CIPSOv6 solves all above
> problems:
> $ echo -n "NLBL_CIPSOv6" > /smack/nltype
> 
> Is this a normal behaviour ?

Well ... sort of. CIPSOv6 isn't actually implemented in the
labeled networking code. What you're seeing is unlabeled packets.

As for CIPSOv4 and your WAN router, it is possible that it is
configured either to reject CIPSO packets or to allow only CIPSO
packets in a particular DOI or to enforce a CIPSO policy of its
own.

> 2-
> 
> > 4. Any access requested on an object labeled "*" is permitted.
> [...]
> > +Unlabeled packets that come into the system will be given the
> > +ambient label.
> 
> Default conf let the ambient attribute = _ which works fine. Setting
> ambient = * stops all external (non lo) network traffic. Did I miss another
> use of "ambient" or is this normal behaviour?

An IP operation is considered a write from the sender to the receiver.
The packet label is the label of the sender. Thus, in the unlabeled
packet case, the ambient label ("*" in your case) is attached to the
packet, and the access check always denies access because of the first
access rule, which is that a subject with a star label will always be
denied access.
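
The access check described above can be modelled in a few lines. This is
only a toy sketch, not the kernel code: rules 1 and 4 are the ones quoted
in this thread, the remaining rules follow the Smack documentation as best
recalled, and the function names and the `rules` dictionary are
hypothetical.

```python
# Toy model of the Smack access check, for illustration only.
# Rules 1 and 4 are stated in this thread; the others are assumptions
# based on the Smack documentation.

def smack_access(subject, obj, access, rules=None):
    """Return True if `subject` may perform `access` on `obj`.

    `access` is a single letter ("r", "w", "x" or "a").
    `rules` is a hypothetical explicit rule set: {(subject, object): "rwxa"}.
    """
    rules = rules or {}
    if subject == "*":                      # rule 1: a star subject is always denied
        return False
    if subject == "^" and access in "rx":   # rule 2: hat may read/execute anything
        return True
    if obj == "_" and access in "rx":       # rule 3: floor objects are readable
        return True
    if obj == "*":                          # rule 4: star objects permit everything
        return True
    if subject == obj:                      # rule 5: same label
        return True
    # rule 6: explicit rule set; rule 7: everything else is denied
    return access in rules.get((subject, obj), "")


def deliver_unlabeled_packet(ambient, receiver):
    # An IP operation is a write from the sender to the receiver; an
    # unlabeled packet carries the ambient label as its sender label.
    return smack_access(ambient, receiver, "w")
```

With ambient set to "*", `deliver_unlabeled_packet("*", "_")` is denied by
rule 1 regardless of the receiver, which matches the behaviour reported
above; with ambient "_" and a receiver labeled "_", rule 5 permits the
write.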

> > +Administration
> > +
> > +Smack supports some mount options:
> > +
> > +   smackfsdef=label: specifies the label to give files that lack
> > +   the Smack label extended attribute.
> > +
> 
> Although I use smackfsdef=* as a mount option, all my system files have the
> floor attribute. Most of the /dev files have the * attribute though.

The smackfsdef mount option applies only to files that don't actually
have the security.SMACK64 attribute. If a file has the attribute,
whatever value is associated with it will be used.


Thank you.


Casey Schaufler
[EMAIL PROTECTED]


Re: [RFC] vivi, videobuf_to_vmalloc() and related breakage

2007-10-14 Thread Nick Piggin
On Monday 15 October 2007 12:01, Al Viro wrote:
>   AFAICS, videobuf-vmalloc use of mem->vma and mem->vmalloc is
> bogus.
>
> You obtain the latter with vmalloc_user(); so far, so good.  Then you have
> retval=remap_vmalloc_range(vma, mem->vmalloc,0);
> where vma is given to you by mmap(); again, fine - we get the memory
> pointed to by mem->vmalloc() mapped at vma->vm_start.
>
> Now we get the trouble: things like
>
> static void vivi_fillbuff(struct vivi_dev *dev,struct vivi_buffer *buf)
> {
>   ...
>   void *vbuf=videobuf_to_vmalloc (&buf->vb);
>   ...
>   copy_to_user(vbuf + ..., ..., ...)
>
> get vbuf equal to ->vmalloc of buf->vp.priv and that is _not_ a userland
> address.  Giving it to copy_to_user() is not going to do anything good.
> On some targets it'll fail, on some - write to unrelated user memory.
> What is going on there?  If that's an attempt to copy into that buffer
> allocated by vmalloc_user(), why are we doing copy_to_user() at all?

Right you are. remap_vmalloc_range doesn't turn the passed vmalloc
area into user memory (it creates a completely new mapping).

Presumably it either wants to copy_to_user to that new mapping, or
memcpy to ->vmalloc? Would the former be an attempt to avoid some
virtual aliasing issues?


> But there's more; we have made a copy of vma (kmalloc+memcpy), stored it in
> mem->vma and later we cheerfully do remap_vmalloc_range(mem->vma,).
> And kfree that mem->vma immediately afterwards.  What the hell?  It might
> not break now, but that seems to be playing very fast and loose with the
> warranties provided by VM.

I don't know why one would find remap_vmalloc_range failing
at mmap time but not later. You should just do it at mmap time, and if
that fails, work out why (or ask linux-mm for help).
Actually there is probably a window where we can get subsequent
anonymous pages faulted into the empty vma there if we haven't
remapped it, then the subsequent attempt to remap will hit the
BUG_ON in remap_pte_range.

(that's aside from the big conceptual problem with passing in an
"invented" vma... don't do that! (: )


Re: git-sched patch won't boot on SN arch, 2.6.23-mm1

2007-10-14 Thread Paul Jackson
Update to list - Ingo sent me offlist a broken out patch set of this
sched work, and I'm working with him to isolate the change that is
causing this problem.

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401


Re: [2.6 patch] __inet6_csk_dst_store(): fix check-after-use

2007-10-14 Thread Masahide NAKAMURA

On Sun, 14 Oct 2007 19:52:12 +0200
Adrian Bunk <[EMAIL PROTECTED]> wrote:

> The Coverity checker spotted that we have already oops'ed if "dst"
> was NULL.
> 
> Since "dst" being NULL doesn't seem to be possible at this point this 
> patch removes the NULL check.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Agreed.

Acked-by: Masahide NAKAMURA <[EMAIL PROTECTED]>

-- 
Masahide NAKAMURA



Re: Which companies are helping developing the kernel

2007-10-14 Thread Greg KH
On Sun, Oct 14, 2007 at 11:28:46PM +0100, Alistair John Strachan wrote:
> On Sunday 14 October 2007 23:06:22 Stefan Heinrichsen wrote:
> > Hello,
> >
> > I posted this question at comp.linux.misc and was told this would be a
> > better place for it. I would like to do an internship in the field of the
> > Linux kernel.
> > Can someone tell me where to find a list of companies (no matter in
> > which country) that employ kernel developers?
> 
> I think Greg wrote a paper on this subject, so I've added him to CC in case
> he has the link handy.

Yeah, but my paper didn't really track companies very well.  The lwn.net
article is the best, and below is my version of who did things in
2.6.23.  Note, the lack of a company is not an indicator that they did
nothing, just that I could not easily determine someone worked for them.
I'll try to send out my "who are you working for" emails in a week or so
to see if I can further categorize the "unknowns".

thanks,

greg k-h
-

Processed 7075 csets from 992 developers
126 employers found

Top changeset contributors by employer
(Unknown)                   1116 (15.8%)
(None)                       843 (11.9%)
Red Hat                      827 (11.7%)
IBM                          557 (7.9%)
Linux Foundation             528 (7.5%)
Novell                       449 (6.3%)
Intel                        242 (3.4%)
Oracle                       158 (2.2%)
MIPS Technologies            143 (2.0%)
Nokia                        133 (1.9%)
NetApp                       119 (1.7%)
NTT                           99 (1.4%)
Astaro                        97 (1.4%)
MontaVista                    90 (1.3%)
(Consultant)                  86 (1.2%)
SGI                           84 (1.2%)
Qumranet                      74 (1.0%)
QLogic                        70 (1.0%)
(Academia)                    70 (1.0%)
SWsoft                        64 (0.9%)
Analog Devices                61 (0.9%)
HP                            60 (0.8%)
Sony                          59 (0.8%)
rPath                         56 (0.8%)
XenSource                     53 (0.7%)
CERN                          49 (0.7%)
CC Computer Consultants       48 (0.7%)
Freescale                     47 (0.7%)
Fujitsu                       47 (0.7%)
Tripeaks                      46 (0.7%)
linutronix                    44 (0.6%)
Snapgear                      39 (0.6%)
Simtec                        34 (0.5%)
Atmel                         28 (0.4%)
Google                        28 (0.4%)
Cisco                         27 (0.4%)
Toshiba                       25 (0.4%)
Broadcom                      25 (0.4%)
SteelEye                      24 (0.3%)
Renesas Technology            23 (0.3%)
Mellanox                      21 (0.3%)
LSI Logic                     17 (0.2%)
Adaptec                       16 (0.2%)
Wipro                         15 (0.2%)
Marvell                       14 (0.2%)
Miracle Linux                 14 (0.2%)
Solid Boot Ltd.               14 (0.2%)
AMD                           12 (0.2%)
Hitachi                       11 (0.2%)
ARM                           11 (0.2%)
Canonical                     10 (0.1%)
XIV Information Systems        9 (0.1%)
OpenedHand                     9 (0.1%)
Open Grid Computing            9 (0.1%)
Veritas                        8 (0.1%)
Secretlab                      7 (0.1%)
Neterion                       7 (0.1%)
Katalix Systems                7 (0.1%)
SANPeople                      7 (0.1%)
Digi International             7 (0.1%)
Znyx Networks                  6 (0.1%)
Wind River                     6 (0.1%)
NEC                            6 (0.1%)
Wolfson Microelectronics       6 (0.1%)
SUNY Computer Science          6 (0.1%)
NetXen                         6 (0.1%)
NVidia                         6 (0.1%)
Myricom                        6 (0.1%)
Chelsio                        5 (0.1%)
Realtek                        5 (0.1%)
IPUnity-Glenayre               5 (0.1%)
Linux Networx                  5 (0.1%)
Barco                          4 (0.1%)
SIOS Technology                4 (0.1%)
MIPS                           4 (0.1%)
University of Aberdeen         4 (0.1%)
PiKRON s.r.o                   4 (0.1%)
Pardus                         4 (0.1%)
Crash Barrier                  4 (0.1%)
Tresys                         4 (0.1%)
ClusterFS                      4 (0.1%)
Macq Electronique              3 (0.0%)
OLPC                           3 (0.0%)
CompuLab                       3 (0.0%)
DENX Software Engineering      3 (0.0%)
Embedded Alley Solutions       3 (0.0%)
Twin Sun                       3 (0.0%)
Transmode Systems              3 (0.0%)
Cosmosbay~Vectis               3 (0.0%)
Pengutronix                    2 (0.0%)
Sierra Wireless                2 (0.0%)
Real-Time Remedies             2 (0.0%)
Bull                           2 (0.0%)
ScaleMP Inc.                   2 (0.0%)
Samsung                        2 (0.0%)
MicroGate Systems              2 (0.0%)
Atomide                        2 (0.0%)
US National Security Agency    2 (0.0%)
VMWare                         2 (0.0%)
Verismo                        2 (0.0%)
Akamai Technologies            2 (0.0%)
CE Linux Forum                 2 (0.0%)
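
The percentages in the list above appear to be each employer's changeset
count divided by the 7075 changesets processed. A minimal sketch of that
arithmetic, assuming one-decimal rounding (the `share` helper is a
hypothetical name, not part of the script that produced the list):

```python
TOTAL_CSETS = 7075  # "Processed 7075 csets" in the summary above


def share(changesets, total=TOTAL_CSETS):
    """Percentage of all changesets, rounded to one decimal place."""
    return round(100.0 * changesets / total, 1)


for employer, count in [("(Unknown)", 1116), ("Red Hat", 827), ("IBM", 557)]:
    print(f"{employer:<12}{count:>5} ({share(count)}%)")
```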

Re: 2.6.23.1 x86 hardware monitoring bug?

2007-10-14 Thread Mark M. Hoffman
Hi Justin:

(added some CCs)

* Justin Piszcz <[EMAIL PROTECTED]> [2007-10-14 15:30:18 -0400]:
> As a regular user, I cannot see the sensors on the A-bit board, but I can
> see the CPU temperature. How come I can see one but not the other?
> 
> Kernel: $ uname -a
> Linux mybox 2.6.23.1 #4 SMP PREEMPT Sun Oct 14 15:20:53 EDT 2007 i686 GNU/Linux
> Distribution: Debian Lenny
> 
> $ sensors
> abituguru3-isa-00e0
> Adapter: ISA adapter
> 
> coretemp-isa-
> Adapter: ISA adapter
> temp1:   +35°C  (high =   +85°C)
> 
> coretemp-isa-0001
> Adapter: ISA adapter
> temp1:   +36°C  (high =   +85°C)
> 
> As root:
> 
> # sensors
> abituguru3-isa-00e0
> Adapter: ISA adapter
> CPU Core:           +1.35 V  (min  +0.00 V, max  +1.60 V)
> DDR2:               +2.02 V  (min  +1.60 V, max  +2.40 V)
> DDR2 VTT:           +1.01 V  (min  +0.80 V, max  +1.20 V)
> CPU VTT 1.2V:       +1.22 V  (min  +0.95 V, max  +1.45 V)
> MCH & PCIE 1.5V:    +1.50 V  (min  +1.20 V, max  +1.80 V)
> MCH 2.5V:           +2.58 V  (min  +2.00 V, max  +3.00 V)
> ICH 1.05V:          +1.04 V  (min  +0.85 V, max  +1.25 V)
> ATX +12V (24-Pin): +12.18 V  (min  +9.60 V, max +14.40 V)
> ATX +12V (4-pin):  +12.24 V  (min  +9.60 V, max +14.40 V)
> ATX +5V:            +5.07 V  (min  +3.99 V, max  +6.00 V)
> +3.3V:              +3.40 V  (min  +2.64 V, max  +3.94 V)
> 5VSB:               +5.10 V  (min  +3.99 V, max  +6.00 V)
> CPU:                  +43°C  (high =   +75°C, crit =   +85°C)
> System :              +38°C  (high =   +55°C, crit =   +65°C)
> PWM1:                 +43°C  (high =   +80°C, crit =   +90°C)
> PWM2:                 +43°C  (high =   +80°C, crit =   +90°C)
> PWM3:                 +46°C  (high =   +80°C, crit =   +90°C)
> PWM4:                 +40°C  (high =   +80°C, crit =   +90°C)
> CPU Fan:           1380 RPM  (min  300 RPM)
> NB Fan:               0 RPM  (min  300 RPM)
> SYS Fan:              0 RPM  (min  300 RPM)
> AUX1 Fan:             0 RPM  (min  300 RPM)
> AUX2 Fan:             0 RPM  (min  300 RPM)
> AUX3 Fan:             0 RPM  (min  300 RPM)
> OTES1 Fan:            0 RPM  (min  300 RPM)
> 
> coretemp-isa-
> Adapter: ISA adapter
> Core 0:  +39°C  (high =   +85°C)
> 
> coretemp-isa-0001
> Adapter: ISA adapter
> Core 1:  +39°C  (high =   +85°C)

Strange.  What does 'ls -lH /sys/class/hwmon/hwmon*/device' say?
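
A scripted variant of the same check, in case it helps: the sketch below
lists attribute files under each hwmon device directory that lack the
world-read bit. It assumes file permissions are the reason a regular user
sees nothing, which this thread has not established; the function name is
hypothetical and the sysfs layout is taken from the `ls` command above.

```python
import glob
import os
import stat


def attrs_not_world_readable(root="/sys/class/hwmon"):
    """Return files directly under <root>/hwmon*/device that others cannot read."""
    hidden = []
    for path in sorted(glob.glob(os.path.join(root, "hwmon*", "device", "*"))):
        try:
            mode = os.stat(path).st_mode   # follows symlinks, like `ls -H`
        except OSError:
            continue                       # dangling link or vanished file
        if stat.S_ISREG(mode) and not mode & stat.S_IROTH:
            hidden.append(path)
    return hidden


if __name__ == "__main__":
    for path in attrs_not_world_readable():
        print(path)
```

The scan is deliberately not recursive, since sysfs contains symlink
cycles.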

Regards,

-- 
Mark M. Hoffman
[EMAIL PROTECTED]



Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread Nick Piggin
On Monday 15 October 2007 10:57, Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> > Yes, as Dave said, vmap (more specifically: vunmap) is very expensive
> > because it generally has to invalidate TLBs on all CPUs.
>
> I see.
>
> > I'm looking at some more general solutions to this (already have some
> > batching / lazy unmapping that replaces the XFS specific one), however
> > they are still likely going to leave vmap mappings around after freeing
> > the page.
>
> Hm.  Could there be a call to shoot down any lazy mappings of a page, so
> the Xen pagetable code could use it on any pagetable page?  Ideally one
> that could be used on any page, but only causes expensive operations
> where needed.

Yeah, it would be possible. The easiest way would just be to shoot down
all lazy vmaps (because you're doing the global IPIs anyway, which are
the expensive thing, at which point you may as well purge the rest of
your lazy mappings).

If it is sufficiently rare, then it could be the simplest thing to do.
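
To make the trade-off concrete, here is a toy userspace model of lazy
unmapping with a "purge everything" hook. It is not kernel code: the class,
the batch limit and `prepare_pagetable_page` are hypothetical names, and a
counter stands in for the global IPI/TLB flush.

```python
class LazyVmap:
    """Toy model of lazy vunmap batching; not kernel code."""

    def __init__(self, batch_limit=4):
        self.batch_limit = batch_limit   # hypothetical batching threshold
        self.live = {}                   # addr -> page, currently mapped
        self.lazy = {}                   # addr -> page, unmapped but not yet flushed
        self.global_flushes = 0          # stands in for cross-CPU TLB shootdowns

    def vmap(self, addr, page):
        self.live[addr] = page

    def vunmap(self, addr):
        # Defer the expensive flush: the stale mapping stays around for now.
        self.lazy[addr] = self.live.pop(addr)
        if len(self.lazy) >= self.batch_limit:
            self.purge_all()

    def purge_all(self):
        """One global flush retires every lazy mapping at once."""
        if self.lazy:
            self.lazy.clear()
            self.global_flushes += 1

    def has_stray_mapping(self, page):
        return page in self.lazy.values()

    def prepare_pagetable_page(self, page):
        # What a hypothetical Xen hook might do before using a page as a pagetable.
        if self.has_stray_mapping(page):
            self.purge_all()
```

In this model the extra flush is only paid when the page in question still
has a stray lazy mapping, which is the "sufficiently rare" case.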


> > We _could_ hold on to the pages as well, but that's pretty inefficient.
> > The memory cost of keeping the mappings around tends to be well under
> > 1% the cost of the page itself. OTOH we could also avoid lazy flushes
> > on architectures where it is not costly. Either way, it probably would
> > require an arch hook or even a couple of ifdefs in mm/vmalloc.c for
> > Xen. Although... it would be nice if Xen could take advantage of some
> > of these optimisations as well.
>
> In general the lazy unmappings won't worry Xen.  It's only for the
> specific case of allocating memory for pagetables.  Xen can do a bit of
> extra optimisation for cross-cpu tlb flushes (if the target vcpus are
> not currently running, then you don't need to do anything), but they're
> still an expensive operation, so the optimisation is definitely useful.

OK.


> > What's the actual problem for Xen? Anything that can be changed?
>
> Not easily.  Xen doesn't use shadow pagetables.  Instead, it gives the
> guest domains direct access to the real CPU's pagetable, but makes sure
> they're always mapped RO so that the hypervisor can control updates to
> the pagetables (either by trapping writes or via explicit hypercalls).
> This means that when constructing a new pagetable, Xen will verify that
> all the mappings of pages making up the new pagetable are RO before
> allowing it to be used.  If there are stray RW mappings of those pages,
> pagetable construction will fail.

OK, I see. Because even though it is technically safe where we are
using it (because nothing writes through the mappings after the page
is freed), a corrupted guest could use the same window to do bad
things with the pagetables?


> Aside from XFS, the only other case I've found where there could be
> stray RW mappings is when using high pages which are still in the kmap
> cache; I added an explicit call to flush the kmap cache to handle this.
> If vmap and kmap can be unified (at least the lazy unmap aspects of
> them), then that would be a nice little cleanup.

vmap is slightly harder than kmap in some respects. However it would
be really nice to get vmap fast and general enough to completely
replace all the kmap crud -- that's one goal, but the first thing
I'm doing is to concentrate on just vmap to work out how to make it
as fast as possible.

For Xen -- shouldn't be a big deal. We can have a single Linux mm API
to call, and we can do the right thing WRT vmap/kmap. I should try to
merge my current lazy vmap patches which replace the XFS stuff, so we
can implement such an API and fix your XFS issue? That's not going to
happen for at least a cycle or two though, so in the meantime maybe
an ifdef for that XFS vmap batching code would help?


[RFC] vivi, videobuf_to_vmalloc() and related breakage

2007-10-14 Thread Al Viro
AFAICS, videobuf-vmalloc use of mem->vma and mem->vmalloc is
bogus.

You obtain the latter with vmalloc_user(); so far, so good.  Then you have
retval=remap_vmalloc_range(vma, mem->vmalloc,0);
where vma is given to you by mmap(); again, fine - we get the memory
pointed to by mem->vmalloc() mapped at vma->vm_start.

Now we get the trouble: things like

static void vivi_fillbuff(struct vivi_dev *dev,struct vivi_buffer *buf)
{
...
void *vbuf=videobuf_to_vmalloc (&buf->vb);
...
copy_to_user(vbuf + ..., ..., ...)

get vbuf equal to ->vmalloc of buf->vp.priv and that is _not_ a userland
address.  Giving it to copy_to_user() is not going to do anything good.
On some targets it'll fail, on some - write to unrelated user memory.
What is going on there?  If that's an attempt to copy into that buffer
allocated by vmalloc_user(), why are we doing copy_to_user() at all?

But there's more; we have made a copy of vma (kmalloc+memcpy), stored it in
mem->vma and later we cheerfully do remap_vmalloc_range(mem->vma,).
And kfree that mem->vma immediately afterwards.  What the hell?  It might
not break now, but that seems to be playing very fast and loose with the
warranties provided by VM.


Re: [PATCH 01/52] CRED: Introduce a COW credentials record

2007-10-14 Thread David Chinner
On Fri, Oct 12, 2007 at 05:05:24PM +0100, David Howells wrote:
> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> index 4ca4beb..a460508 100644
> --- a/fs/xfs/xfs_acl.c
> +++ b/fs/xfs/xfs_acl.c
> @@ -383,7 +383,7 @@ xfs_acl_allow_set(
>   error = bhv_vop_getattr(vp, &va, 0, NULL);
>   if (error)
>   return error;
> - if (va.va_uid != current->fsuid && !capable(CAP_FOWNER))
> + if (va.va_uid != current->cred->uid && !capable(CAP_FOWNER))

current_fsuid() should be used here.

>   return EPERM;
>   return error;
>  }
> @@ -457,13 +457,13 @@ xfs_acl_access(
>   switch (fap->acl_entry[i].ae_tag) {
>   case ACL_USER_OBJ:
>   seen_userobj = 1;
> - if (fuid != current->fsuid)
> + if (fuid != current->cred->uid)
>   continue;
>   matched.ae_tag = ACL_USER_OBJ;
>   matched.ae_perm = allows;
>   break;
>   case ACL_USER:
> - if (fap->acl_entry[i].ae_id != current->fsuid)
> + if (fap->acl_entry[i].ae_id != current->cred->uid)

and here as well

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: [PATCH -mm -v5 0/3] i386/x86_64 boot: 32-bit boot protocol

2007-10-14 Thread Huang, Ying
Hi, Peter and Andi,

Do you think this patch set is ready for merging? If not, what can I
do to make it ready?

Best Regards,
Huang Ying

On Fri, 2007-10-12 at 13:52 +0800, Huang, Ying wrote:
> This patchset defines a 32-bit boot protocol for the i386/x86_64 platform,
> adds an extensible boot parameter passing mechanism, and exports the boot
> parameters via sysfs.
> 
> The patchset has been tested against 2.6.23-rc8-mm2 kernel on x86_64
> and i386.
> 
> This patchset is based on the proposal of Peter Anvin.
> 
> 
> Known Issues:
> 
> - Where is it safe to place the linked list of setup_data?  Because the
>   length of the linked list of setup_data is variable, it cannot be
>   copied into the BSS segment of the kernel the way the "zero page" is.
>   We must find a safe place for it, where it will not be overwritten by
>   the kernel during boot. The i386 kernel will overwrite some pages after
>   _end. The x86_64 kernel will overwrite some pages from 0x1000 on.
> 
> - The fields in zero page are fairly complex (such as struct
>   edd_info). Is it necessary to document every field inside the first
>   level fields, until the primary data type? Or is it sufficient to
>   provide the C struct name only?
> 
> 
> v5:
> 
> - Use bt_ioremap/bt_iounmap in copy_setup_data.
> 
> v4:
> 
> - Reserve setup_data and boot parameters for accessing during
>   runtime.
> - Export boot parameters via sysfs.
> 
> v3:
> 
> - Move hd0_info and hd1_info back to zero page for compatibility.
> 
> v2:
> 
> - Increase the boot protocol version number
> - Check version number before parsing setup data.
> - Revise zero page description according to the source code and move
>   them to zero-page.txt.


Re: What still uses the block layer?

2007-10-14 Thread Theodore Tso
On Sun, Oct 14, 2007 at 06:45:44PM -0500, Rob Landley wrote:
> I admit a certain amount of personal annoyance that once the SCSI
> layer consumes a category of device (USB, SATA, PATA), they can
> often _only_ be used by going through the SCSI midlayer.  (This
> strikes me as analogous to TCP/IP claiming ethernet and PPP devices
> so thoroughly that you can no longer address them as eth1 or
> /dev/ttyS0.)

That's because modern USB, ATAPI (what was once known as IDE), and SATA
are really *all* using the SCSI command protocols at the low level, just
as Ethernet and PPP interfaces really are fundamentally the same
thing.  You can rail against it, but that's the mark of someone who
refuses to accept reality.

> This has the annoying effect of bundling together different types of
> devices and making device enumeration unnecessarily difficult: my
> laptop only has one SATA hard drive and can't gain another without a
> soldering iron, but that drive could move from /dev/sda to /dev/sdb
> if I reboot the system with a USB key plugged in.  This seems like a
> regrettable loss of orthogonality to me.  I remember back when
> /dev/usb0 and /dev/hda were separate devices that showed up in /dev,
> but these days "it's SCSI" seems to trump "it's USB", "it's ATA", or
> "it's SATA".  (Even though none of those are actually SCSI hardware,
> they just send a similar packet protocol across the wire.)

You're showing your ignorance here.  In fact in the past few years,
ATA and SCSI have been converging significantly, with the ATAPI
specification essentially incorporating the SCSI protocol by
reference and by value --- to the point that SAS was developed by
the SCSI Trade Association, and SAS is effectively a superset of SATA,
so that with care, you can actually mix SAS and SATA drives
in the same enclosure (SAS and SATA are physically compatible at
the connector level).

More to the point, with SATA, hot plugging has been designed in, so
probing order is not going to be well defined, just as with USB
devices.  And there are already relatively common situations where the
same disk can show up via multiple different interfaces.

For example, if you have a modern Thinkpad with a secondary SATA hard
drive in an Ultrabay, and you plug it into the Ultrabay in your T60,
it will show up as a SATA drive.  However, if you plug it into the
Advanced Dock, it shows up as a USB device.  Not only can you
encapsulate a SCSI command stream over USB; with iSCSI you can do so
over IP as well.  In any case, regardless of how the physical SATA drive
is attached to the system, you want it to show up as the same device and
be mounted in the same location.

That's why identifying filesystems by UUIDs or labels is so critical.
This is not a new concept; we've had the capability to do this for
over a decade, and I always knew it would be necessary for us to do
this sooner or later --- which is why I added the UUID support to ext2
back in 1996.
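
As an illustration of UUID-based identification, the sketch below resolves
filesystem UUIDs to whatever device node they currently sit on. It assumes
the `/dev/disk/by-uuid` symlink directory that udev commonly maintains,
which is not mentioned in this thread, and `devices_by_uuid` is a
hypothetical helper name.

```python
import os


def devices_by_uuid(by_uuid_dir="/dev/disk/by-uuid"):
    """Map filesystem UUID -> resolved device node path."""
    mapping = {}
    for uuid in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, uuid)
        # The symlink target changes with probing order; the UUID does not.
        mapping[uuid] = os.path.realpath(link)
    return mapping


if __name__ == "__main__" and os.path.isdir("/dev/disk/by-uuid"):
    for uuid, dev in devices_by_uuid().items():
        print(uuid, "->", dev)
```

Whether the drive shows up as /dev/sda or /dev/sdb after a reboot, a lookup
keyed on the UUID still finds the same filesystem.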

> The fact that udev can theoretically unwind this hairball is not an
> excuse for conflating different categories of devices in the first
> place.

See the Thinkpad Ultrabay drive example above.  You address hosts by
IP address; it doesn't matter whether you access them via a PPP
interface, a wireless interface, or an Ethernet interface.
Similarly, a disk could in theory be accessible over USB, SATA, or
iSCSI, and the Thinkpad example is only one such case where the same
filesystem might be accessible over multiple interfaces.  And with
multipath Fibre Channel SANs (and I hate to break it to you, but FC
also uses SCSI protocols) storage is very much looking more and more
like networking.

- Ted


Re: [2.6.24 PATCH 02/25] dm io:ctl use constant struct size

2007-10-14 Thread Alasdair G Kergon
On Sat, Oct 13, 2007 at 12:16:29AM +0200, Arnd Bergmann wrote:
> This change seems rather bogus, you're changing the ABI just to work
> around a bug in the compat_ioctl layer. Why not just do the compat
> code the right way, like the patch below?

The underlying ABI is not changing, I hope - the trailing padding in the
struct should not affect the processing of the data by dm, and I see no
reason to continue maintaining the fiction that the 32-bit and 64-bit
ioctls are in some way incompatible with each other when they aren't
AFAIK.

And yes, a follow-up patch can clean up our use of the compatibility
mechanism, going a little bit further than the patch you attached, I
hope.

Alasdair
-- 
[EMAIL PROTECTED]


Re: wierd file perms

2007-10-14 Thread vignesh babu
ls --version
ls (GNU coreutils) 5.97

An fsck did it :) and had the source restored by checkout -f

Think Jan is right, there is a diff between the two...

On 10/15/07, Mark Lord <[EMAIL PROTECTED]> wrote:
> Jan Engelhardt wrote:
> > On Oct 14 2007 09:27, Mark Lord wrote:
> >> Jan-Benedict Glaw wrote:
> >>> On Sat, 2007-10-13 22:40:23 +0530, vignesh babu <[EMAIL PROTECTED]>
> >>> wrote:
> >>>> I was surprised and did an ls -l on the files and guess what I found:
> >>>>
> >>>> total 0
> >>>> ?- ? ? ? ?? fcntl.c
> >>>> ?- ? ? ? ?? fifo.c
> >>>> ?- ? ? ? ?? filesystems.c
> >>>> ?- ? ? ? ?? file_table.c
> >>>> ?- ? ? ? ?? freevxfs
> >>>>
> >>>> So the end result is that I'm not able to delete the files or change perms
> >>>> or ownership even as root.
> >>> Most probably, your filesystem is broken and needs a fsck.
> >> No, this is perfectly normal behaviour, for when a directory
> >> has READ permissions but not EXECUTE permissions.
> >
> > Er, close.
> >
> >   16:02 ichi:/dev/shm > md a
> >   16:02 ichi:/dev/shm > touch a/b
> >   16:02 ichi:/dev/shm > chmod 644 a
> >   16:02 ichi:/dev/shm > ls -l a
> >   /bin/ls: cannot access a/b: Permission denied
> >   total 0
> >   -? ? ? ? ?? b
> >   16:02 ichi:/dev/shm > ls --version
> >   ls (GNU coreutils) 6.9
> >
> >
> > There is a difference ..  "-?" vs "?-".
>
> That's just a version difference for GNU ls.
> Here, with ls (GNU coreutils) 5.97 it gives this:
>
> ?- ? ? ? ?? a/b
>
>
>
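
The read-without-execute case discussed above can be reproduced from a
script. This is a sketch under the assumption of ordinary POSIX permission
semantics; the helper name is hypothetical, and a privileged user will
usually still be able to stat the file.

```python
import os
import shutil
import tempfile


def list_without_search_permission():
    """Create a directory with read but no execute (search) permission and
    report what listing and stat() can still see."""
    top = tempfile.mkdtemp()
    d = os.path.join(top, "a")
    os.mkdir(d)
    open(os.path.join(d, "b"), "w").close()
    os.chmod(d, 0o644)                     # names readable, inodes not reachable
    try:
        names = os.listdir(d)              # needs only read permission
        try:
            os.stat(os.path.join(d, "b"))  # needs search (x) permission
            stat_ok = True
        except PermissionError:
            stat_ok = False
    finally:
        os.chmod(d, 0o755)                 # restore so cleanup can proceed
        shutil.rmtree(top)
    return names, stat_ok


if __name__ == "__main__":
    print(list_without_search_permission())
```

The name "b" is listed, but for an unprivileged user the stat() fails,
which is why ls can only print question marks for the metadata.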


-- 

--
"Why is it that every time I'm with you, makes me believe in magic?"


Re: What still uses the block layer?

2007-10-14 Thread Neil Brown
On Sunday October 14, [EMAIL PROTECTED] wrote:
> On Sunday 14 October 2007 12:46:12 pm Stefan Richter wrote:
> > David Newall wrote:
> > > That is so rude.
> 
> When a reply contains as a reply to the first paragraph "you're wrong" with 
> no 
> elaboration, and as a reply to the second paragraph nothing but expletives 
> and personal insults, I tend to stop reading.  It really doesn't come across 
> as a serious reply.
> 
> I was at least attempting to ask a serious question.

Indeed you were, and let me try to answer it as best I can.

I like to think of the "block layer" as two main parts.

Firstly there is the "interface" which it defines, embodied primarily
in generic_make_request() and 'struct bio'.  There are various other
small routines in ll_rw_blk.c, and there is 'struct request_queue'
which is also involved in the other half of the block layer.

This interface defines how requests are passed down, how their
completion is acknowledged, and various other little details.

Any block device can register a make_request_fn function and get the
requests (struct bio) almost exactly as the client (filesystem or
whatever) sent them down - just with a few sanity checks and some
translation (for partitions) applied.

The other half of the "block layer" is the io scheduler code.
This involves the 'struct request' and __make_request() and the various
routines it calls.
This collects bios (passed down from clients) and produces 'requests'
which devices can handle.  One of the important differences between
bios and requests is the amount of parallelism.
A filesystem can send down as many concurrent bios as it likes (or as
it can allocate memory for).
A device can only handle a limited number of requests at a time,
depending on the limit of the 'tagged command queueing' mechanism
particular to that device.
The scheduler bridges this parallelism gap by scheduling.
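
A toy model of that bridging step, for illustration only: bios are
represented as (start_sector, nr_sectors) tuples, and sorting plus merging
of adjacent bios is assumed as a simplification of what an io scheduler
does. None of these names exist in the kernel.

```python
def merge_bios(bios):
    """Merge bios (start_sector, nr_sectors) into contiguous requests."""
    requests = []
    for start, length in sorted(bios):
        if requests and requests[-1][0] + requests[-1][1] == start:
            prev_start, prev_len = requests[-1]
            requests[-1] = (prev_start, prev_len + length)   # back-merge
        else:
            requests.append((start, length))
    return requests


def dispatch(requests, queue_depth):
    """Yield batches of at most `queue_depth` requests, modelling the device limit."""
    for i in range(0, len(requests), queue_depth):
        yield requests[i:i + queue_depth]


if __name__ == "__main__":
    bios = [(8, 8), (0, 8), (100, 4), (16, 8), (200, 1)]
    for batch in dispatch(merge_bios(bios), queue_depth=2):
        print(batch)
```

Five bios from the client become three requests, and the device never sees
more than two of them at once.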

So the "block layer" consists of the "block interface" and the "io scheduler".

All block devices use the "block interface" - they have no choice.
Many block devices use the "io scheduler", but many don't.
md and dm, loop, umem, and others do their own scheduling as they have
needs that are specific to the devices, or that otherwise don't
benefit from the io scheduler (which is really designed for
rotating-media style devices).

SCSI devices can be both block devices and non-block devices
(traditionally 'char devices').

The 'scsi generic' or 'sg' interface to SCSI devices allows arbitrary
SCSI commands to be sent to a SCSI device.  There are many SCSI
devices that are not block devices at all (media robots, etc).

When a SCSI device is being used as a block device, the block
interface is used.  When it is being used as a 'generic device', the
block interface is not used.

Now we get to the heart of the matter, and to where my knowledge
becomes a little less detailed - so please forgive if I say something
silly.

I believe that the SCSI-generic handling still uses the IO scheduler,
even though it doesn't use the block interface.
It is probable that the IO scheduler is not a perfect match for the
needs of SCSI-generic handling.  Given its origin, that should not be
surprising.

I believe the linux-scsi email that you referred to was addressing this
issue.  When the author says:

That approach makes the Linux block layer either a nuisance,
irrelevant or a complete anachronism 

I believe he is referring to what I would call the IO scheduler, and is
observing that it is not a perfect fit.  He is probably right.

So to answer your question:

  SCSI block devices use both the "block interface" and the "io
  scheduler" and I believe that when people talk about "the block layer"
  they refer to these two things.
  i.e. the SCSI layer provides "scsi_request_fn".  The block interface
  calls __make_request which performs IO scheduling and calls
  scsi_request_fn for each request.

Hope that helps.

NeilBrown


[git pull] agp patches for 2.6.24-rc1

2007-10-14 Thread Dave Airlie


Hi Linus,

Please pull from 'agp-patches' branch of
master.kernel.org:/pub/scm/linux/kernel/git/airlied/agp-2.6.git agp-patches

to receive the following updates:

 drivers/char/agp/agp.h|7 +--
 drivers/char/agp/ali-agp.c|   27 ---
 drivers/char/agp/amd-k7-agp.c |9 ++---
 drivers/char/agp/backend.c|   12 
 drivers/char/agp/generic.c|   19 +--
 drivers/char/agp/i460-agp.c   |4 ++--
 drivers/char/agp/intel-agp.c  |6 --
 7 files changed, 50 insertions(+), 34 deletions(-)

Dave Airlie (1):
  AGP fix race condition between unmapping and freeing pages

Jesper Juhl (1):
  fix use after free in amd create gatt pages

diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h
index 8955e7f..b83824c 100644
--- a/drivers/char/agp/agp.h
+++ b/drivers/char/agp/agp.h
@@ -58,6 +58,9 @@ struct gatt_mask {
 * devices this will probably be ignored */
 };

+#define AGP_PAGE_DESTROY_UNMAP 1
+#define AGP_PAGE_DESTROY_FREE 2
+
 struct aper_size_info_8 {
int size;
int num_entries;
@@ -113,7 +116,7 @@ struct agp_bridge_driver {
struct agp_memory *(*alloc_by_type) (size_t, int);
void (*free_by_type)(struct agp_memory *);
void *(*agp_alloc_page)(struct agp_bridge_data *);
-   void (*agp_destroy_page)(void *);
+   void (*agp_destroy_page)(void *, int flags);
 int (*agp_type_to_mask_type) (struct agp_bridge_data *, int);
 };

@@ -267,7 +270,7 @@ int agp_generic_remove_memory(struct agp_memory *mem, off_t pg_start, int type);
 struct agp_memory *agp_generic_alloc_by_type(size_t page_count, int type);
 void agp_generic_free_by_type(struct agp_memory *curr);
 void *agp_generic_alloc_page(struct agp_bridge_data *bridge);
-void agp_generic_destroy_page(void *addr);
+void agp_generic_destroy_page(void *addr, int flags);
 void agp_free_key(int key);
 int agp_num_entries(void);
 u32 agp_collect_device_status(struct agp_bridge_data *bridge, u32 mode, u32 command);
diff --git a/drivers/char/agp/ali-agp.c b/drivers/char/agp/ali-agp.c
index 4941ddb..aa5ddb7 100644
--- a/drivers/char/agp/ali-agp.c
+++ b/drivers/char/agp/ali-agp.c
@@ -156,29 +156,34 @@ static void *m1541_alloc_page(struct agp_bridge_data *bridge)
return addr;
 }

-static void ali_destroy_page(void * addr)
+static void ali_destroy_page(void * addr, int flags)
 {
if (addr) {
-   global_cache_flush();   /* is this really needed?  --hch */
-   agp_generic_destroy_page(addr);
-   global_flush_tlb();
+   if (flags & AGP_PAGE_DESTROY_UNMAP) {
+   global_cache_flush();   /* is this really needed?  --hch */
+   agp_generic_destroy_page(addr, flags);
+   global_flush_tlb();
+   } else
+   agp_generic_destroy_page(addr, flags);
}
 }

-static void m1541_destroy_page(void * addr)
+static void m1541_destroy_page(void * addr, int flags)
 {
u32 temp;

if (addr == NULL)
return;

-   global_cache_flush();
+   if (flags & AGP_PAGE_DESTROY_UNMAP) {
+   global_cache_flush();

-   pci_read_config_dword(agp_bridge->dev, ALI_CACHE_FLUSH_CTRL, &temp);
-   pci_write_config_dword(agp_bridge->dev, ALI_CACHE_FLUSH_CTRL,
-   (((temp & ALI_CACHE_FLUSH_ADDR_MASK) |
- virt_to_gart(addr)) | ALI_CACHE_FLUSH_EN));
-   agp_generic_destroy_page(addr);
+   pci_read_config_dword(agp_bridge->dev, ALI_CACHE_FLUSH_CTRL, &temp);
+   pci_write_config_dword(agp_bridge->dev, ALI_CACHE_FLUSH_CTRL,
+  (((temp & ALI_CACHE_FLUSH_ADDR_MASK) |
+virt_to_gart(addr)) | ALI_CACHE_FLUSH_EN));
+   }
+   agp_generic_destroy_page(addr, flags);
 }


diff --git a/drivers/char/agp/amd-k7-agp.c b/drivers/char/agp/amd-k7-agp.c
index f60bca7..1405a42 100644
--- a/drivers/char/agp/amd-k7-agp.c
+++ b/drivers/char/agp/amd-k7-agp.c
@@ -100,21 +100,16 @@ static int amd_create_gatt_pages(int nr_tables)

for (i = 0; i < nr_tables; i++) {
entry = kzalloc(sizeof(struct amd_page_map), GFP_KERNEL);
+   tables[i] = entry;
if (entry == NULL) {
-   while (i > 0) {
-   kfree(tables[i-1]);
-   i--;
-   }
-   kfree(tables);
retval = -ENOMEM;
break;
}
-   tables[i] = entry;
retval = amd_create_page_map(entry);
if (retval != 0)
break;
}
-   amd_irongate_private.num_tables = nr_tables;
+   amd_irongate_private.num_tables = i;
amd_irongate_private.gatt_pages = tables;

if (retval != 0)
diff --git 

[git pull] drm patches for 2.6.24-rc1

2007-10-14 Thread Dave Airlie


Hi Linus,

This contains a major macro removal and ioctl related usercopy cleanups, 
it also fixes a bug in the intel interrupt code with a dodgy calloc size.


Please pull the 'drm-patches' branch from
ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-patches

Dave.

 drivers/char/drm/drm.h|   20 +-
 drivers/char/drm/drmP.h   |  237 +++--
 drivers/char/drm/drm_agpsupport.c |  130 +++-
 drivers/char/drm/drm_auth.c   |   48 ++--
 drivers/char/drm/drm_bufs.c   |  203 ---
 drivers/char/drm/drm_context.c|  177 --
 drivers/char/drm/drm_dma.c|   11 +-
 drivers/char/drm/drm_drawable.c   |   67 ++---
 drivers/char/drm/drm_drv.c|  186 ++-
 drivers/char/drm/drm_fops.c   |   34 +-
 drivers/char/drm/drm_ioc32.c  |2 +-
 drivers/char/drm/drm_ioctl.c  |  196 +---
 drivers/char/drm/drm_irq.c|   98 +++
 drivers/char/drm/drm_lock.c   |   75 ++---
 drivers/char/drm/drm_os_linux.h   |   10 -
 drivers/char/drm/drm_pciids.h |2 -
 drivers/char/drm/drm_scatter.c|   48 +--
 drivers/char/drm/drm_vm.c |4 +-
 drivers/char/drm/i810_dma.c   |  312 ++
 drivers/char/drm/i810_drm.h   |5 -
 drivers/char/drm/i810_drv.h   |9 +-
 drivers/char/drm/i830_dma.c   |  210 +---
 drivers/char/drm/i830_drv.h   |   15 +-
 drivers/char/drm/i830_irq.c   |   30 +--
 drivers/char/drm/i915_dma.c   |  214 ++---
 drivers/char/drm/i915_drv.h   |   36 ++-
 drivers/char/drm/i915_irq.c   |  128 +++
 drivers/char/drm/i915_mem.c   |  125 +++
 drivers/char/drm/mga_dma.c|  140 -
 drivers/char/drm/mga_drv.h|   21 +-
 drivers/char/drm/mga_state.c  |  197 +---
 drivers/char/drm/mga_warp.c   |8 +-
 drivers/char/drm/r128_cce.c   |  138 -
 drivers/char/drm/r128_drm.h   |   18 -
 drivers/char/drm/r128_drv.h   |   23 +-
 drivers/char/drm/r128_state.c |  351 +---
 drivers/char/drm/r300_cmdbuf.c|   68 ++--
 drivers/char/drm/radeon_cp.c  |  146 -
 drivers/char/drm/radeon_drv.h |   43 ++--
 drivers/char/drm/radeon_irq.c |   34 +--
 drivers/char/drm/radeon_mem.c |  108 +++
 drivers/char/drm/radeon_state.c   |  683 +
 drivers/char/drm/savage_bci.c |  145 -
 drivers/char/drm/savage_drv.h |9 +-
 drivers/char/drm/savage_state.c   |  200 ++--
 drivers/char/drm/sis_drv.c|2 +-
 drivers/char/drm/sis_drv.h|5 +-
 drivers/char/drm/sis_mm.c |  112 +++
 drivers/char/drm/via_dma.c|  144 -
 drivers/char/drm/via_dmablit.c|   54 ++--
 drivers/char/drm/via_drv.h|   22 +-
 drivers/char/drm/via_irq.c|   47 ++--
 drivers/char/drm/via_map.c|   14 +-
 drivers/char/drm/via_mm.c |   83 ++---
 drivers/char/drm/via_verifier.c   |8 +-
 drivers/char/drm/via_video.c  |   20 +-
 56 files changed, 2359 insertions(+), 3116 deletions(-)

commit ace3dff5b7f0bf5a647e60dcd0c0a7d46792f5d9
Author: Xavier Bachelot <[EMAIL PROTECTED]>
Date:   Mon Oct 15 11:09:35 2007 +1000

via invalid device ids removal

0x1106, 0x7204 is unknown and thus is not an IGP/GPU.
0x1106, 0x3304 is K8M800 hostbridge, not an IGP/GPU.
None of them are in drm git tree.

Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

commit eed0f722b3fccb1eb2706b5f484cb511d46f70b8
Author: chaohong guo <[EMAIL PROTECTED]>
Date:   Mon Oct 15 10:45:49 2007 +1000

radeon: Commit the ring after each partial texture upload blit.

This makes sure each blit starts as early as possible, which may improve
texture upload performance in some cases.

Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

commit 54583bf4efda79388fc13163e35c016c8bc5de81
Author: Dave Airlie <[EMAIL PROTECTED]>
Date:   Sun Oct 14 21:21:30 2007 +1000

i915: fix vbl swap allocation size.

Oops...

Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

commit c153f45f9b7e30289157bba3ff5682291df16caa
Author: Eric Anholt <[EMAIL PROTECTED]>
Date:   Mon Sep 3 12:06:45 2007 +1000

drm: Replace DRM_IOCTL_ARGS with (dev, data, file_priv) and remove DRM_DEVICE.

The data is now in kernel space, copied in/out as appropriate according to t
This results in DRM_COPY_{TO,FROM}_USER going away, and error paths to deal
with those failures.  This also means that XFree86 4.2.0 support for i810 DR
is lost.

Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

commit b589ee5943a9610ebaea6e4e3433f2ae4d812b0b
Author: Dave Airlie <[EMAIL PROTECTED]>
Date:   Tue Aug 28 15:16:47 2007 +1000

drm: remove XFREE86_VERSION macros.

These are no longer needed or being used.

Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

commit 6c340eac0285f3d62406d2d902d0e96fbf2a5dc0
Author: Eric 

Re: [PATCH]: drm: cleanup DRM_DEBUG() parameters

2007-10-14 Thread Dave Airlie
On 10/14/07, Németh Márton <[EMAIL PROTECTED]> wrote:
> From: Márton Németh <[EMAIL PROTECTED]>
>
> As DRM_DEBUG macro already prints out the __FUNCTION__ string (see
> drivers/char/drm/drmP.h), it is not worth doing this again. At some
> other places the ending "\n" was added.
>
> Signed-off-by: Márton Németh <[EMAIL PROTECTED]>

Hi Márton,

Could you rebase the patch against the drm-mm tree or against the -mm
kernel?  It conflicts with the stuff I'm about to push to Linus.

Dave.


Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread Jeremy Fitzhardinge
Nick Piggin wrote:
> Yes, as Dave said, vmap (more specifically: vunmap) is very expensive
> because it generally has to invalidate TLBs on all CPUs.
>   

I see.

> I'm looking at some more general solutions to this (already have some
> batching / lazy unmapping that replaces the XFS specific one), however
> they are still likely going to leave vmap mappings around after freeing
> the page.
>   

Hm.  Could there be a call to shoot down any lazy mappings of a page, so
the Xen pagetable code could use it on any pagetable page?  Ideally one
that could be used on any page, but only causes expensive operations
where needed.

> We _could_ hold on to the pages as well, but that's pretty inefficient.
> The memory cost of keeping the mappings around tends to be well under
> 1% the cost of the page itself. OTOH we could also avoid lazy flushes
> on architectures where it is not costly. Either way, it probably would
> require an arch hook or even a couple of ifdefs in mm/vmalloc.c for
> Xen. Although... it would be nice if Xen could take advantage of some
> of these optimisations as well.
>   

In general the lazy unmappings won't worry Xen.  It's only for the
specific case of allocating memory for pagetables.  Xen can do a bit of
extra optimisation for cross-cpu tlb flushes (if the target vcpus are
not currently running, then you don't need to do anything), but they're
still an expensive operation, so the optimisation is definitely useful.

> What's the actual problem for Xen? Anything that can be changed?
>   

Not easily.  Xen doesn't use shadow pagetables.  Instead, it gives the
guest domains direct access to the real CPU's pagetable, but makes sure
they're always mapped RO so that the hypervisor can control updates to
the pagetables (either by trapping writes or via explicit hypercalls). 
This means that when constructing a new pagetable, Xen will verify that
all the mappings of pages making up the new pagetable are RO before
allowing it to be used.  If there are stray RW mappings of those pages,
pagetable construction will fail.
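
To make the failure mode concrete, here is a toy model of the check described above. Everything in it is an assumption made for illustration: the mapping table, the page number, the virtual addresses and the function names are hypothetical, and the flush function merely stands in for the "shoot down any lazy mappings of a page" call asked about earlier in this mail. It is not Xen or kernel code.

```python
# Toy model: page numbers, addresses and function names are made up.
class StrayWritableMapping(Exception):
    pass


def writable_aliases(mappings, page):
    """Return the virtual addresses that still map `page` read-write."""
    return [va for va, (pfn, writable) in mappings.items()
            if pfn == page and writable]


def pin_pagetable(mappings, pagetable_pages):
    """Refuse to use pages as a pagetable while any RW mapping of them exists."""
    for page in pagetable_pages:
        stray = writable_aliases(mappings, page)
        if stray:
            raise StrayWritableMapping(f"page {page} still mapped RW at {stray}")
    return True


def flush_lazy_mappings(mappings, lazy_vas):
    """Stand-in for a call that tears down lazily-kept mappings."""
    for va in lazy_vas:
        mappings.pop(va, None)


# virtual address -> (page frame, writable)
mappings = {
    0xC0001000: (42, False),   # the normal, read-only view of page 42
    0xF8000000: (42, True),    # stale alias left behind by a lazy unmap
}
lazy_vas = [0xF8000000]

try:
    pin_pagetable(mappings, [42])
    pinned_before_flush = True
except StrayWritableMapping:
    pinned_before_flush = False

flush_lazy_mappings(mappings, lazy_vas)
pinned_after_flush = pin_pagetable(mappings, [42])
```

In this model the first attempt fails because of the stale alias, and the second succeeds once the lazy mapping has been flushed, which is the behaviour a targeted flush hook would need to provide.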

Aside from XFS, the only other case I've found where there could be
stray RW mappings is when using high pages which are still in the kmap
cache; I added an explicit call to flush the kmap cache to handle this. 
If vmap and kmap can be unified (at least the lazy unmap aspects of
them), then that would be a nice little cleanup.

J


WTF is HIDIOCGRDESC supposed to do (aside of being a roothole)?

2007-10-14 Thread Al Viro
This

+   if (get_user(len, (int __user *)arg))
+   return -EFAULT;
+   if (copy_to_user(*((__u8 **)(user_arg +
+   sizeof(__u32))),
+   dev->hid->rdesc, len))

is an instant trouble - you dereference userland-supplied address and
expect it to be OK; then you take the obtained value and use it as
address to shove the data into.

Now,
a) the dereference is Not Safe(tm), even if get_user() succeeded
just before (and that success might be completely unrelated to the
userland data at that address).
b) copying an arbitrary amount of data?  Without any sanity checks on
len, when we've just got it from userland?
c) just WTF is that thing supposed to do?


Re: What still uses the block layer?

2007-10-14 Thread Luben Tuikov
--- James Bottomley <[EMAIL PROTECTED]> wrote:
> On Sat, 2007-10-13 at 16:05 -0600, Matthew Wilcox wrote:
> > On Thu, Oct 11, 2007 at 08:11:21PM -0500, Rob Landley wrote:
> > > My impression from asking questions on the linux-scsi mailing list is
> > > that the scsi upper/middle/lower layers don't use the block layer
> > > described in Documentation/block/*.
> > 
> > Entirely incorrect.
> 
> OK, right ... could we please get a sense of decorum back on this list.
> 
> Rob, if you didn't ask your alleged questions in such a pejorative
> manner, we'd get a lot further; and Matthew, if you didn't rise to the
> bait so spectacularly it wouldn't prolong these threads.
> 
> Really, both of you, I have better things to do with my time than
> mediate behaviours that should have been educated out of you in the
> kindergarten sand pit.

I really didn't find Rob's email "pejorative" at all.  It seems to me
he was just asking for clarification, information and trying to
understand how it all works and ties together.  His email seemed
genuine enough of a person just asking to understand how it all works.

Matthew's expletive and extremely rude response really shows
the general attitude of the linux-scsi people.

Heck, I got a similar response just a week ago here on the
list, trying to convince Garzik and his band, that storage nodes
SHOULD NOT be SAS WWN generators.  Should I have even tried?  That's
the question.

Good luck everyone,
   Luben



Re: RFC: reviewer's statement of oversight

2007-10-14 Thread Neil Brown
On Tuesday October 9, [EMAIL PROTECTED] wrote:
> Hi Neil.
> > 
> > From: The Author, Primary Author, or Authors of the patch.
> > Authors should also provide a Signed-off-by: tag.
> > 
> > Purpose: to give credit to authors
> The SCM should include this info and we should not duplicate this
> in the changelog's.
> I know some tools require this format but that's something else.

If the SCM stores some tags in special places, that is fine with me.
The remove the need for the tag and an understanding of why it exists.
Can 'git' store a list of Authors?  Do we want to allow a list?

> 
> > > +
> > > +Signed-off-by:  A person adding a Signed-off-by tag is attesting that the
> > > + patch is, to the best of his or her knowledge, legally able
> > > + to be merged into the mainline and distributed under the
> > > + terms of the GNU General Public License, version 2.  See
> > > + the Developer's Certificate of Origin, found in
> > > + Documentation/SubmittingPatches, for the precise meaning of
> > > + Signed-off-by.
> > 
> > Purpose: to allow subsequent review of the originality of 
> > the contribution should copyright questions arise.
> 
> We often use s-o-b to document the path a patch took from origin (the
> top-most s-o-b) to tree apply (lowest s-o-b).
> This is IIUC part of the intended behaviour of s-o-b but it is not
> clear from the above text.

My understanding of Andrew Morton's position on s-o-b is that it is an
unordered set.  I know this because when I have sent him patches with
a proper From: line, he has complained and begrudgingly took the first
s-o-b, but said he didn't like to.
So there seems to be disagreement on this (I think it looks like a
path too, but apparently not to everyone).
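
The two readings can be put side by side with a small sketch. It assumes a commit message whose tags sit one per line; the regular expression, the function names and the example message (people and addresses included) are hypothetical and only serve the illustration.

```python
import re

# Only the three tags under discussion; other tags are ignored in this sketch.
TAG_RE = re.compile(r"^(Signed-off-by|Acked-by|Reviewed-by):\s*(.+)$")


def parse_tags(message):
    """Return (tag, value) pairs in the order they appear in the message."""
    tags = []
    for line in message.splitlines():
        match = TAG_RE.match(line.strip())
        if match:
            tags.append((match.group(1), match.group(2)))
    return tags


def signoff_path(message):
    """Reading 1: the s-o-b lines are an ordered path, origin first."""
    return [who for tag, who in parse_tags(message) if tag == "Signed-off-by"]


def signoff_set(message):
    """Reading 2: the s-o-b lines are an unordered set of attestations."""
    return set(signoff_path(message))


# Hypothetical commit message; the names and addresses are placeholders.
example = """\
frob: fix the widget

Signed-off-by: Author One <author@example.org>
Acked-by: Area Maintainer <maint@example.org>
Signed-off-by: Tree Owner <owner@example.org>
"""

print(signoff_path(example))
print(signoff_set(example))
```

Under the first reading the topmost entry identifies the origin of the patch; under the second, nothing can be inferred from position, so authorship has to come from a From: line or from the SCM.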

> 
> 
> > > +
> > > +Acked-by:  The person named (who should be an active developer in the
> > > + area addressed by the patch) is aware of the patch and has
> > > + no objection to its inclusion.  An Acked-by tag does not
> > > + imply any involvement in the development of the patch or
> > > + that a detailed review was done.
> > 
> > Purpose:  to inform upstream aggregators that
> > consensus was achieved for the change.  This is
> > particularly relevant for changes that affect multiple
> > Maintenance Domains.
> > 
> consensus seems too strong a wording here. consensus imply more than one
> that agree on the patch where I often see people give their "Acked-by:" by
> simple changelog reading.

I'm failing to follow your logic.
You seem to be contrasting:
  "consensus imply more than one that agree"
 which I agree with:  "From" plus all "Acked-By" will be more than
 one in all cases that "Acked-By" is used
with
  "people give their "Acked-by:" by simple changelog reading"
 which I also agree with but this just highlights that "Acked-by"
 is different from "Reviewed-by" 

Confused.

Thanks,
NeilBrown


Re: RFC: reviewer's statement of oversight

2007-10-14 Thread Neil Brown
On Wednesday October 10, [EMAIL PROTECTED] wrote:
> On Tue, Oct 09, 2007 at 10:49:20AM -0600, Jonathan Corbet wrote:
> > Neil Brown <[EMAIL PROTECTED]> wrote:
> > > > + (b) Any problems, concerns, or questions relating to the patch have been
> > > > + communicated back to the submitter.  I am satisfied with how the
> > > > + submitter has responded to my comments.
> > > 
> > > This seems more detailed than necessary.  The process (communicated
> > > back / responded) is not really relevant.
> > 
> > Instead, it seems to me that the process is crucially important.
> > Reviewed-by shouldn't be a rubber stamp that somebody applies to a
> > patch; I think it should really imply that issues of interest have been
> > communicated to the developers.  If we are setting expectations for what
> > Reviewed-by means, I would prefer to leave an explicit mention of
> > communication in there. 
> 
> I couldn't agree more, Jon.
> 
> If we are to have a meaningful reviewed-by tag, it has to be clearly
> documented as to what responsibilities it places on the reviewer. If
> someone doesn't want to perform a well conducted review, then they
> haven't earned the right to issue a Reviewed-by tag - they can use
> the Acked-by rubber stamp instead.

Maybe I'm making a mountain out of a molehill but...

Clearly documented responsibilities?  Yes.
Prescribed process?  No.

If someone sends me a patch, and I review it, and I find a couple of
problems, do I need to negotiate with the submitter before correcting
them and putting a "Reviewed-by" tag on it (along with my
Signed-off-by before sending it upstream)?

The above clause (b) seems to say that I do.  Is that something we
want to mandate?

My take on the responsibilities implied by Reviewed-by: is that the
code has been inspected, comprehended, considered, and found to be
both appropriate and without discernible error.  The process by which
the code got to that state is not relevant to the tag (though it
probably is relevant to the general health of the community).

NeilBrown


nfsd updates for 2.6.24

2007-10-14 Thread J. Bruce Fields
You can pull the following nfs server changes from

  git://linux-nfs.org/~bfields/linux.git nfs-server-stable

Nothing earth-shaking this time; mainly small bugfixes and cleanups.

--b.

Andrew Morton (1):
  nfsd warning fix

Christoph Hellwig (1):
  nfsd: fix horrible indentation in nfsd_setattr

Dr. David Alan Gilbert (1):
  knfsd: Add source address to sunrpc svc errors

J. Bruce Fields (15):
  nfsd: tone down inaccurate dprintk
  nfsd: remove unused cache_for_each macro
  knfsd: delete code made redundant by map_new_errors
  knfsd: cleanup of nfsd4 cmp_* functions
  knfsd: demote some printk()s to dprintk()s
  knfsd: nfs4 name->id mapping not correctly parsing negative downcall
  knfsd: spawn kernel thread to probe callback channel
  knfsd: move nfsv4 slab creation/destruction to module init/exit
  knfsd: fix callback rpc cred
  knfsd: remove code duplication in nfsd4_setclientid()
  svcgss: move init code into separate function
  knfsd: let nfsd manage timing out its own leases
  knfsd: don't shutdown callbacks until nfsv4 client is freed
  knfsd: nfsv4 delegation recall should take reference on client
  knfsd: query filesystem for NFSv4 getattr of FATTR4_MAXNAME

Peter Staubach (1):
  knfsd: 64 bit ino support for NFS server

 fs/nfsd/nfs3xdr.c |   59 +--
 fs/nfsd/nfs4callback.c|   89 +
 fs/nfsd/nfs4idmap.c   |8 +-
 fs/nfsd/nfs4proc.c|4 +-
 fs/nfsd/nfs4state.c   |  200 +---
 fs/nfsd/nfs4xdr.c |   22 ++--
 fs/nfsd/nfsctl.c  |7 +-
 fs/nfsd/nfssvc.c  |8 +-
 fs/nfsd/nfsxdr.c  |4 +
 fs/nfsd/vfs.c |   43 +---
 include/linux/nfsd/nfsd.h |   18 ++--
 include/linux/nfsd/nfsfh.h|   42 +---
 include/linux/nfsd/xdr4.h |4 +-
 include/linux/sunrpc/cache.h  |   10 --
 net/sunrpc/auth_gss/svcauth_gss.c |  144 ++
 net/sunrpc/svc.c  |   40 ++--
 16 files changed, 326 insertions(+), 376 deletions(-)


Re: hdparm standby timeout not working for WD raptors?

2007-10-14 Thread Mark Weber
On 10/14/07, Mark Lord <[EMAIL PROTECTED]> wrote:
> 1. How are you forcing the drives into standby?

hdparm -y



> 2. Have you reproduced this with a stock kernel.org kernel yet?

No; maybe later this week.



> One possibility is that these drives may be the kind that
> generate a "spurious" interrupt when they spin-down via the timer,
> and perhaps libata is "processing" that interrupt and sending
> additional command(s) that then wake the drive up again immediately.
>
> To rule this out, you could try using drivers/ide for a moment or two,
> and see if the same problem persists with those drives.
>
> You could also try dumping /proc/interrupts in conjunction with "hdparm -S1",
> and we can compare that with a "known good" system.
>
> Something like this:
>
> hdparm -B255 /dev/sda
> hdparm -S1 /dev/sda
> cat /proc/interrupts
> sleep 6
> cat /proc/interrupts




Here's the log for the second suggestion:



narf ~ # hdparm -B255 /dev/sda

/dev/sda:
 setting Advanced Power Management level to disabled
 HDIO_DRIVE_CMD failed: Input/output error

narf ~ # hdparm -S1 /dev/sda

/dev/sda:
 setting standby to 1 (5 seconds)
narf ~ # cat /proc/interrupts
   CPU0
  0:   2268   IO-APIC-edge  timer
  1:  2   IO-APIC-edge  i8042
  8:  5   IO-APIC-edge  rtc
  9:  0   IO-APIC-fasteoi   acpi
 12:  4   IO-APIC-edge  i8042
 14: 206827   IO-APIC-edge  ide0
 16: 350813   IO-APIC-fasteoi   sata_promise, uhci_hcd:usb5
 17:   39596029   IO-APIC-fasteoi   eth0
 18:  0   IO-APIC-fasteoi   uhci_hcd:usb4
 19: 728947   IO-APIC-fasteoi   libata, uhci_hcd:usb3
 20:  0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
NMI:  0
LOC:2736793
ERR:  0
MIS:  0
narf ~ # sleep 6
narf ~ # cat /proc/interrupts
   CPU0
  0:   2268   IO-APIC-edge  timer
  1:  2   IO-APIC-edge  i8042
  8:  5   IO-APIC-edge  rtc
  9:  0   IO-APIC-fasteoi   acpi
 12:  4   IO-APIC-edge  i8042
 14: 206828   IO-APIC-edge  ide0
 16: 350813   IO-APIC-fasteoi   sata_promise, uhci_hcd:usb5
 17:   39596069   IO-APIC-fasteoi   eth0
 18:  0   IO-APIC-fasteoi   uhci_hcd:usb4
 19: 728947   IO-APIC-fasteoi   libata, uhci_hcd:usb3
 20:  0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
NMI:  0
LOC:2736881
ERR:  0
MIS:  0
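
For anyone wanting to compare such dumps mechanically, a minimal sketch follows. It assumes the /proc/interrupts layout shown above (a label, one count per CPU, then the controller and device names); the function names are made up for the example, and the sample text is a subset of the two dumps in this mail.

```python
def parse_interrupts(text):
    """Map each IRQ label to its total count, summed over all CPU columns."""
    counts = {}
    for line in text.splitlines():
        if ":" not in line:
            continue                      # the "CPU0 CPU1 ..." header line
        label, rest = line.split(":", 1)
        total = 0
        for token in rest.split():
            if not token.isdigit():
                break                     # reached the controller/device names
            total += int(token)
        counts[label.strip()] = total
    return counts


def changed(before, after):
    """Return {label: increase} for every line whose count differs."""
    b, a = parse_interrupts(before), parse_interrupts(after)
    return {irq: a[irq] - b[irq] for irq in a if irq in b and a[irq] != b[irq]}


before = """\
           CPU0
 14:     206827   IO-APIC-edge      ide0
 16:     350813   IO-APIC-fasteoi   sata_promise, uhci_hcd:usb5
 17:   39596029   IO-APIC-fasteoi   eth0
 19:     728947   IO-APIC-fasteoi   libata, uhci_hcd:usb3
LOC:    2736793
"""

after = """\
           CPU0
 14:     206828   IO-APIC-edge      ide0
 16:     350813   IO-APIC-fasteoi   sata_promise, uhci_hcd:usb5
 17:   39596069   IO-APIC-fasteoi   eth0
 19:     728947   IO-APIC-fasteoi   libata, uhci_hcd:usb3
LOC:    2736881
"""

delta = changed(before, after)
print(delta)
```

On these samples only ide0, eth0 and the local timer advance; the sata_promise and libata lines show the same counts before and after the sleep.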


NO_HZ and cpu monitoring tools

2007-10-14 Thread Anton Blanchard

Hi,

When using a NO_HZ kernel on ppc64, I noticed top gives some interesting
results:

Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st
Cpu5  :  1.1%us,  0.0%sy,  0.0%ni, 98.9%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st

Notice how only 2 cpus report idle time. I'm guessing this happens if 
a core sleeps for longer than the update period in top. Where should
this be fixed?

It would be possible for the proc read method to add in the right number
of idle jiffies, or top could just assume no increment means 100% idle.
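
The second option could look roughly like the sketch below. It assumes two samples of one per-CPU line from /proc/stat, given as tuples of cumulative jiffies in the usual field order; the function name and the sample numbers are invented, and this is not how top is actually implemented.

```python
# Field order of a "cpuN" line in /proc/stat (first eight columns).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def cpu_percentages(prev, curr):
    """prev/curr: cumulative jiffies for one CPU, in FIELDS order."""
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    if total == 0:
        # Nothing was accounted during the interval: assume the CPU
        # slept the whole time instead of reporting 0% everywhere.
        return {name: (100.0 if name == "idle" else 0.0) for name in FIELDS}
    return {name: 100.0 * d / total for name, d in zip(FIELDS, deltas)}


# A CPU whose counters did not move between the two samples.
sleeping = cpu_percentages((10, 0, 5, 1000, 0, 0, 0, 0),
                           (10, 0, 5, 1000, 0, 0, 0, 0))

# A CPU that accounted 90 jiffies, one of them in user mode.
busy = cpu_percentages((10, 0, 5, 1000, 0, 0, 0, 0),
                       (11, 0, 5, 1089, 0, 0, 0, 0))

print(sleeping["idle"], round(busy["user"], 1), round(busy["idle"], 1))
```

The trade-off is that a userspace fix has to be repeated in every monitoring tool, whereas adding the missing idle jiffies in the proc read method would fix all of them at once.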

Anton


Re: What still uses the block layer?

2007-10-14 Thread Rob Landley
On Sunday 14 October 2007 5:24:32 pm James Bottomley wrote:
> On Sat, 2007-10-13 at 16:05 -0600, Matthew Wilcox wrote:
> > On Thu, Oct 11, 2007 at 08:11:21PM -0500, Rob Landley wrote:
> > > My impression from asking questions on the linux-scsi mailing list is
> > > > that the scsi upper/middle/lower layers don't use the block layer
> > > described in Documentation/block/*.
> >
> > Entirely incorrect.
>
> OK, right ... could we please get a sense of decorum back on this list.

Did I reply to the insult?

> Rob, if you didn't ask your alleged questions in such a pejorative
> manner, we'd get a lot further

I'm not attempting to be pejorative.

I admit a certain amount of personal annoyance that once the SCSI layer 
consumes a category of device (USB, SATA, PATA), they can often _only_ be 
used by going through the SCSI midlayer.  (This strikes me as analogous to 
TCP/IP claiming ethernet and PPP devices so thoroughly that you can no longer 
address them as eth1 or /dev/ttyS0.)

This has the annoying effect of bundling together different types of devices 
and making device enumeration unnecessarily difficult: my laptop only has one 
SATA hard drive and can't gain another without a soldering iron, but that 
drive could move from /dev/sda to /dev/sdb if I reboot the system with a USB 
key plugged in.  This seems like a regrettable loss of orthogonality to me.  
I remember back when /dev/usb0 and /dev/hda were separate devices that showed 
up in /dev, but these days "it's SCSI" seems to trump "it's USB", "it's ATA", 
or "it's SATA".  (Even though none of those are actually SCSI hardware, they 
just send a similar packet protocol across the wire.)

The fact that udev can theoretically unwind this hairball is not an excuse for 
conflating different categories of devices in the first place.  Avoiding an 
unnecessary problem seems superior to trying to get udev to solve it.  Note 
that Ubuntu 7.04 solves it by sticking a UUID on every _partition_, and then 
spinning up my external USB hard drive trying to find the root partition on a 
reboot.  Tell me how this can be considered progress:

> # /etc/fstab: static file system information.
> #
> #
> proc            /proc           proc    defaults        0       0
> # /dev/sda1
> UUID=04d1b984-bd65-46f1-9a77-c158cf4bed1b /               ext3    defaults,errors=remount-ro,noatime 0       1
> # /dev/sda5
> UUID=cdf0936d-9f19-42c6-b131-9fefcf1321ef none            swap    sw              0       0
> /dev/scd0       /media/cdrom0   udf,iso9660 user,noauto     0       0
> UUID=86bbb512-ab7e-4a12-8618-1190f032c082 /boot           ext3    defaults        0       0

Conflating categories of hardware that cannot easily be enumerated (USB) with 
categories that can (the SATA hard drive in my laptop, of which there can be 
only one) strikes me as a bad thing.  Putting them in a common "scsi device 
pool" within which they do not enumerate consistently is not something I 
enjoy dealing with.

However, the response to my attempts to express this dissatisfaction on the 
SCSI list a few months ago came too close to a flamewar for me to consider 
continuing it productive.  I'd still love to update the "2.4 scsi howto" and 
corresponding sg howto, but lack the expertise.  The SCSI layer really isn't 
my area, and I was much happier back when I could avoid using it at all.

The question I was trying to ask _here_ was about the block layer.  I seem not 
to have asked it very well.  Sorry 'bout that.

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.


Re: hdparm standby timeout not working for WD raptors?

2007-10-14 Thread Mark Lord

1. How are you forcing the drives into standby?

2. Have you reproduced this with a stock kernel.org kernel yet?

One possibility is that these drives may be the kind that
generate a "spurious" interrupt when they spin-down via the timer,
and perhaps libata is "processing" that interrupt and sending
additional command(s) that then wake the drive up again immediately.

To rule this out, you could try using drivers/ide for a moment or two,
and see if the same problem persists with those drives.

You could also try dumping /proc/interrupts in conjunction with "hdparm -S1",
and we can compare that with a "known good" system.

Something like this:

hdparm -B255 /dev/sda
hdparm -S1 /dev/sda
cat /proc/interrupts
sleep 6
cat /proc/interrupts

Cheers


Re: What still uses the block layer?

2007-10-14 Thread Rob Landley
On Sunday 14 October 2007 12:46:12 pm Stefan Richter wrote:
> David Newall wrote:
> > That is so rude.

When a reply contains as a reply to the first paragraph "you're wrong" with no 
elaboration, and as a reply to the second paragraph nothing but expletives 
and personal insults, I tend to stop reading.  It really doesn't come across 
as a serious reply.

I was at least attempting to ask a serious question.

> Such responses sometimes happen after provocative posts like the thread
> starter's.  He could have asked straight away for help with fixing his
> boot environment instead of wrapping his question into a feigned design
> discussion.  It appeared as if he is out for a fight rather than
> interested in help.

Actually, I was going through Documentation/block thinking about making a 
00-INDEX for it, but my earlier questions of the scsi guys left me with the 
impression that the block layer is _not_ used by the SCSI layer.  And since 
every non-embedded modern storage device I'm aware of has been consumed by 
the SCSI layer (despite none of them actually having a discernibly closer 
relationship to SCSI than ATA did), I didn't know whether or not it was more 
appropriate to index this directory or request its deletion.  So I asked.

Back when I asked the scsi guys about this, I got no direct answer.  I 
asked "where does the block layer work into this" in the context of questions 
about the relationship between the scsi upper, middle, and lower layers, and 
I never got a reply, even though the question was quoted back at me here:
http://www.mail-archive.com/linux-scsi%40vger.kernel.org/msg09086.html

The closest I got to an answer was later in the thread:
http://www.mail-archive.com/linux-scsi%40vger.kernel.org/msg09131.html

Which said:
> That approach makes the Linux block layer either a nuisance,
> irrelevant or a complete anachronism (in the case of OSD).
> IMO the linux block layer should be morphed into a library
> of internal queue handling routines. Storage upper level
> drivers such as sd can continue to present the "block"
> view ** of storage devices such as disks.

The gist of the thread (and the documentation I was referred to) is that the 
scsi "upper layer" presents /dev nodes and ioctls, the scsi mid-layer is a 
routing layer very roughly analogous to a TCP/IP stack, and the scsi low-layer
drivers interface with specific pieces of hardware.  Apparently, the block 
layer is not between any of these, they talk directly to each other.  This 
would seem to indicate that I/O requests made to scsi devices are never 
routed through a common block I/O request handling layer shared with non-SCSI 
block devices.  I was not, however, certain of this, hence my attempt to 
bring the topic back up.

Oh, and I'm sending a patch correcting Jens Axboe's address in this old
documentation.  He's apparently at Oracle now...

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: MSI interrupts and disable_irq

2007-10-14 Thread Benjamin Herrenschmidt

On Sun, 2007-10-14 at 16:15 -0700, Yinghai Lu wrote:
> On 10/14/07, Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:
> >
> > On Sun, 2007-10-14 at 09:15 +0200, Manfred Spraul wrote:
> > > Yinghai Lu wrote:
> > > > On 10/13/07, Manfred Spraul <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> Someone around with a MSI capable board? The forcedeth driver does
> > > >> dev->irq = pci_dev->irq
> > > >> in nv_probe(), especially before pci_enable_msi().
> > > >> Does pci_enable_msi() change pci_dev->irq? Then we would disable the
> > > >> wrong interrupt
> > > >>
> > > >
> > > > the request_irq==>setup_irq will make dev->irq = pci_dev->irq.
> > > >
> > > >
> > > Where is that?
> > > Otherwise I would propose the attached patch. My board is not
> > > MSI-capable, thus I can't test it myself.
> >
> > Why not just copy pcidev->irq to dev->irq once ?
> 
> it seems e1000 is using np->pci_dev->irq directly too.

Heh, all right, doesn't matter, I was just proposing to avoid one more
indirection :-)
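
A toy model of the concern discussed above, in plain userspace C rather
than kernel code. It assumes, as Manfred suspects, that enabling MSI
replaces the IRQ number stored in the PCI device. The struct and function
names (toy_pci_dev, toy_enable_msi and so on) are made up for illustration
and are not the forcedeth or PCI core API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct pci_dev and struct net_device. */
struct toy_pci_dev {
	int irq;            /* IRQ currently assigned to the device */
	bool msi_enabled;
};

struct toy_net_dev {
	struct toy_pci_dev *pci_dev;
	int irq;            /* copy taken at probe time */
};

/* Assumption: switching to MSI assigns a different IRQ number. */
static void toy_enable_msi(struct toy_pci_dev *pdev, int msi_irq)
{
	assert(!pdev->msi_enabled);
	pdev->irq = msi_irq;
	pdev->msi_enabled = true;
}

/* Probe order under discussion: copy first, enable MSI afterwards. */
static void toy_probe(struct toy_net_dev *dev, struct toy_pci_dev *pdev,
		      int msi_irq)
{
	dev->pci_dev = pdev;
	dev->irq = pdev->irq;   /* goes stale once MSI is enabled below */
	toy_enable_msi(pdev, msi_irq);
}

/* IRQ that a disable_irq(dev->irq) style call would act on. */
static int toy_irq_from_copy(const struct toy_net_dev *dev)
{
	return dev->irq;
}

/* IRQ obtained by always following the pci_dev pointer. */
static int toy_irq_from_pci_dev(const struct toy_net_dev *dev)
{
	return dev->pci_dev->irq;
}
```

In this model the early copy keeps the pre-MSI number while the pci_dev
field holds the new one. Either reading through pci_dev every time or
copying once after MSI setup would avoid the mismatch.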

Ben.




Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread David Chinner
On Sun, Oct 14, 2007 at 04:12:20PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > You mean xfs_buf.c.
> >   
> 
> Yes, sorry.
> 
> > And yes, we delay unmapping pages until we have a batch of them
> > to unmap. vmap and vunmap do not scale, so this batching helps
> > alleviate some of the worst of the problems.
> >   
> 
> How much performance does it cost?

Every vunmap() call causes a global TLB sync, and the region lists
are global with a spin lock protecting them. I think Nick has shown
a 64p altix with ~60 cpus spinning on the vmap locks under a
simple workload.

> What kind of workloads would it show
> up under?

A directory traversal when using large directory block sizes
with large directories.


> > Realistically, if this delayed release of vmaps is a problem for
> > Xen, then I think that some generic VM solution is needed to this
> > problem as vmap() is likely to become more common in future (think
> > large blocks in filesystems). Nick - any comments?
> >   
> 
> Well, the only real problem is that the pages are returned to the free
> pool and reallocated while still being part of a mapping.  If the pages
> are still owned by the filesystem/pagecache, then there's no problem.

The pages are still attached to the blockdev address space mapping,
but there's nothing stopping them from being reclaimed before they are
unmapped.

> What's the lifetime of things being vmapped/unmapped in xfs?  Are they
> necessarily being freed when they're unmapped, or could unmapping of
> freed memory be more immediate than other memory?

It's all "freed memory". At the time we pull the buffer down, there are
no further references to the buffer. The pages are released and the mapping
is never used again until it is torn down. It is torn down either on the
next xfsbufd run (triggered by memory pressure or every 15s) or on every
64th new vmap() call to map new buffers.
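
The batching described above can be sketched as a small userspace model.
This is not the xfs_buf.c code; the names are invented, and only the
"flush on every 64th map call" rule is taken from the text. The periodic
xfsbufd run and the shrinker path would simply call toy_flush() as well.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BATCH_LIMIT 64   /* "every 64th new vmap() call" in the text */

struct toy_lazy_unmapper {
	size_t pending;      /* released mappings not yet torn down */
	size_t map_calls;    /* map calls since the last forced flush */
	size_t flushes;      /* number of expensive global flushes */
};

/* Tear down everything that is pending in one pass. */
static void toy_flush(struct toy_lazy_unmapper *u)
{
	if (u->pending == 0)
		return;
	u->pending = 0;
	u->flushes++;        /* one global TLB sync for the whole batch */
}

/* Releasing a buffer only queues its mapping for later teardown. */
static void toy_release(struct toy_lazy_unmapper *u)
{
	u->pending++;
}

/* Mapping a new buffer forces a flush on every 64th call. */
static void toy_map(struct toy_lazy_unmapper *u)
{
	assert(u->map_calls < TOY_BATCH_LIMIT);
	if (++u->map_calls == TOY_BATCH_LIMIT) {
		u->map_calls = 0;
		toy_flush(u);
	}
}
```

In this model, 128 map/release pairs cost two global flushes rather than
one per release. The price is that stale mappings stay around until the
next flush, which is the window Xen is running into.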

> Maybe it just needs a notifier chain or something.

We've already got a memory shrinker hook that triggers this reclaim.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


WANTED: kernel projects for CS students

2007-10-14 Thread Rik van Riel
The kernel newbies community often gets inquiries from CS students who
need a project for their studies and would like to do something with
the Linux kernel, but would also like their code to be useful to the
community afterwards.

In order to make it easier for them, I am trying to put together a
page with projects that:
- Are self contained enough that the students can implement the
  project by themselves, since that is often a university requirement.
- Are self contained enough that Linux could merge the code (maybe
  with additional changes) after the student has been working on it
  for a few months.
- Are large enough to qualify as a student project, luckily there is
  flexibility here since we get inquiries for anything from 6 week
  projects to 6 month projects.

If you have ideas on what projects would be useful, please add them
to this page (or email me):

http://kernelnewbies.org/KernelProjects

thanks,

Rik
-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread Nick Piggin
On Monday 15 October 2007 09:12, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > You mean xfs_buf.c.
>
> Yes, sorry.
>
> > And yes, we delay unmapping pages until we have a batch of them
> > to unmap. vmap and vunmap do not scale, so this batching helps
> > alleviate some of the worst of the problems.
>
> How much performance does it cost?  What kind of workloads would it show
> up under?
>
> > Realistically, if this delayed release of vmaps is a problem for
> > Xen, then I think that some generic VM solution is needed to this
> > problem as vmap() is likely to become more common in future (think
> > large blocks in filesystems). Nick - any comments?
>
> Well, the only real problem is that the pages are returned to the free
> pool and reallocated while still being part of a mapping.  If the pages
> are still owned by the filesystem/pagecache, then there's no problem.
>
> What's the lifetime of things being vmapped/unmapped in xfs?  Are they
> necessarily being freed when they're unmapped, or could unmapping of
> freed memory be more immediate than other memory?

Yes, as Dave said, vmap (more specifically: vunmap) is very expensive
because it generally has to invalidate TLBs on all CPUs.

I'm looking at some more general solutions to this (already have some
batching / lazy unmapping that replaces the XFS specific one), however
they are still likely going to leave vmap mappings around after freeing
the page.

We _could_ hold on to the pages as well, but that's pretty inefficient.
The memory cost of keeping the mappings around tends to be well under
1% of the cost of the page itself. OTOH we could also avoid lazy flushes
on architectures where it is not costly. Either way, it probably would
require an arch hook or even a couple of ifdefs in mm/vmalloc.c for
Xen. Although... it would be nice if Xen could take advantage of some
of these optimisations as well.
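
A rough back-of-the-envelope check of the "well under 1%" figure, as a
sketch only. It assumes 4096-byte pages and counts a single page table
entry per lingering mapping (8 bytes on typical 64-bit configurations, 4
bytes on classic 32-bit ones). Upper-level page tables and vmap area
bookkeeping are ignored. The helper name is invented.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Cost of one lingering mapping relative to the page it maps,
 * in hundredths of a percent (basis points), rounded down.
 */
static unsigned long toy_mapping_cost_bp(size_t pte_bytes, size_t page_bytes)
{
	assert(page_bytes > 0);
	return (unsigned long)(pte_bytes * 10000 / page_bytes);
}
```

Under these assumptions an 8-byte entry for a 4096-byte page comes to 19
basis points, i.e. roughly 0.19% of the page, and a 4-byte entry to about
half of that.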

What's the actual problem for Xen? Anything that can be changed?


Re: [PATCH] Hook compat_sys_nanosleep up to high res timer code

2007-10-14 Thread Anton Blanchard

Hi Arnd,
 
> The code looks correct, but I think it would be nicer to change 
> hrtimer_nanosleep to take a kernel pointer and have all three
> callers (common_nsleep, sys_nanosleep and compat_sys_nanosleep)
> do the copy_to_user/put_compat_timespec in the caller.

Good idea, I had considered that but thought a larger cleanup might run
afoul of the merge rules :)

Regardless, here it is. I'd appreciate a once-over since it does affect
more code than the previous patch :)

Anton

--

Now we have high res timers on ppc64 I thought I'd test them. It turns
out compat_sys_nanosleep hasn't been converted to the hrtimer code and so
is limited to HZ resolution.

The following patch pulls the copy_to_user out of hrtimer_nanosleep and
into the callers (common_nsleep, sys_nanosleep and compat_sys_nanosleep)
thus avoiding any set_fs(KERNEL_DS) or compat_alloc_userspace tricks.
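
The shape of that change can be illustrated with a userspace sketch. It
is not the kernel code: the toy_ names are invented, plain struct
assignment stands in for copy_to_user()/put_compat_timespec(), and the
"interrupted_after_sec" knob merely simulates a signal cutting the sleep
short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_timespec        { long tv_sec; long tv_nsec; };
struct toy_compat_timespec { int32_t tv_sec; int32_t tv_nsec; };

#define TOY_EINTR 4   /* placeholder error code for "interrupted" */

/*
 * Core helper: knows nothing about user space layouts and reports the
 * unslept time through an ordinary pointer.
 */
static long toy_core_nanosleep(const struct toy_timespec *req,
			       struct toy_timespec *rem,
			       long interrupted_after_sec)
{
	assert(req != NULL);
	if (interrupted_after_sec >= req->tv_sec)
		return 0;               /* slept the full interval */
	if (rem) {
		rem->tv_sec = req->tv_sec - interrupted_after_sec;
		rem->tv_nsec = req->tv_nsec;
	}
	return -TOY_EINTR;
}

/* Native caller: copies the result out in the native layout. */
static long toy_sys_nanosleep(const struct toy_timespec *req,
			      struct toy_timespec *user_rem, long knob)
{
	struct toy_timespec rem;
	long ret = toy_core_nanosleep(req, user_rem ? &rem : NULL, knob);

	if (ret && user_rem)
		*user_rem = rem;        /* stands in for copy_to_user() */
	return ret;
}

/* Compat caller: same core call, different copy-out format. */
static long toy_compat_sys_nanosleep(const struct toy_compat_timespec *req32,
				     struct toy_compat_timespec *user_rem32,
				     long knob)
{
	struct toy_timespec req = { req32->tv_sec, req32->tv_nsec };
	struct toy_timespec rem;
	long ret = toy_core_nanosleep(&req, user_rem32 ? &rem : NULL, knob);

	if (ret && user_rem32) {        /* put_compat_timespec() analogue */
		user_rem32->tv_sec = (int32_t)rem.tv_sec;
		user_rem32->tv_nsec = (int32_t)rem.tv_nsec;
	}
	return ret;
}
```

Because the core helper only ever sees a kernel-side struct, each entry
point can do its own copy-out in its own format and no address-space
tricks are needed in between.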
 
Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>
---

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 540799b..7a9398e 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -300,7 +300,7 @@ hrtimer_forward(struct hrtimer *timer, ktime_t now, ktime_t interval);
 
 /* Precise sleep: */
 extern long hrtimer_nanosleep(struct timespec *rqtp,
-			      struct timespec __user *rmtp,
+			      struct timespec *rmtp,
 			      const enum hrtimer_mode mode,
 			      const clockid_t clockid);
 extern long hrtimer_nanosleep_restart(struct restart_block *restart_block);
diff --git a/kernel/compat.c b/kernel/compat.c
index 3bae374..44abfce 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -40,62 +40,26 @@ int put_compat_timespec(const struct timespec *ts, struct compat_timespec __user
 			__put_user(ts->tv_nsec, &cts->tv_nsec)) ? -EFAULT : 0;
 }
 
-static long compat_nanosleep_restart(struct restart_block *restart)
-{
-	unsigned long expire = restart->arg0, now = jiffies;
-	struct compat_timespec __user *rmtp;
-
-	/* Did it expire while we handled signals? */
-	if (!time_after(expire, now))
-		return 0;
-
-	expire = schedule_timeout_interruptible(expire - now);
-	if (expire == 0)
-		return 0;
-
-	rmtp = (struct compat_timespec __user *)restart->arg1;
-	if (rmtp) {
-		struct compat_timespec ct;
-		struct timespec t;
-
-		jiffies_to_timespec(expire, &t);
-		ct.tv_sec = t.tv_sec;
-		ct.tv_nsec = t.tv_nsec;
-		if (copy_to_user(rmtp, &ct, sizeof(ct)))
-			return -EFAULT;
-	}
-	/* The 'restart' block is already filled in */
-	return -ERESTART_RESTARTBLOCK;
-}
-
 asmlinkage long compat_sys_nanosleep(struct compat_timespec __user *rqtp,
-		struct compat_timespec __user *rmtp)
+				     struct compat_timespec __user *rmtp)
 {
-	struct timespec t;
-	struct restart_block *restart;
-	unsigned long expire;
+	struct timespec tu, rmt;
+	long ret;
 
-	if (get_compat_timespec(&t, rqtp))
+	if (get_compat_timespec(&tu, rqtp))
 		return -EFAULT;
 
-	if ((t.tv_nsec >= 1000000000L) || (t.tv_nsec < 0) || (t.tv_sec < 0))
+	if (!timespec_valid(&tu))
 		return -EINVAL;
 
-	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
-	expire = schedule_timeout_interruptible(expire);
-	if (expire == 0)
-		return 0;
+	ret = hrtimer_nanosleep(&tu, &rmt, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 
-	if (rmtp) {
-		jiffies_to_timespec(expire, &t);
-		if (put_compat_timespec(&t, rmtp))
+	if (ret) {
+		if (put_compat_timespec(&rmt, rmtp))
 			return -EFAULT;
 	}
-	restart = &current_thread_info()->restart_block;
-	restart->fn = compat_nanosleep_restart;
-	restart->arg0 = jiffies + expire;
-	restart->arg1 = (unsigned long) rmtp;
-	return -ERESTART_RESTARTBLOCK;
+
+	return ret;
 }
 
 static inline long get_compat_itimerval(struct itimerval *o,
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index dc8a445..095e09e 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1286,8 +1286,7 @@ static int __sched do_nanosleep(struct hrtimer_sleeper *t, enum hrtimer_mode mod
 long __sched hrtimer_nanosleep_restart(struct restart_block *restart)
 {
 	struct hrtimer_sleeper t;
-	struct timespec __user *rmtp;
-	struct timespec tu;
+	struct timespec *rmtp;
 	ktime_t time;
 
 	restart->fn = do_no_restart_syscall;
@@ -1298,14 +1297,12 @@ long __sched hrtimer_nanosleep_restart(struct restart_block *restart)
 	if (do_nanosleep(&t, HRTIMER_MODE_ABS))
 		return 0;
 
-	rmtp = (struct timespec __user *) restart->arg1;
+	rmtp = (struct timespec *)restart->arg1;
 	if (rmtp) {
time 

[PATCH] doc: add uio document to docbook compilation target

2007-10-14 Thread Satoru Takeuchi
Add uio document to DocBook compilation target.

`make *docs' doesn't generate "The Userspace I/O HOWTO", the user space
I/O document written in DocBook.

Signed-off-by: Satoru Takeuchi <[EMAIL PROTECTED]>

Index: linux/Documentation/DocBook/Makefile
===
--- linux.orig/Documentation/DocBook/Makefile	2007-10-12 23:54:19.0 +0900
+++ linux/Documentation/DocBook/Makefile	2007-10-12 23:55:14.0 +0900
@@ -11,7 +11,7 @@ DOCBOOKS := wanbook.xml z8530book.xml mc
procfs-guide.xml writing_usb_driver.xml \
kernel-api.xml filesystems.xml lsm.xml usb.xml \
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
-   genericirq.xml
+   genericirq.xml uio-howto.xml
 
 ###
 # The build process is as follows (targets):


Re: MSI interrupts and disable_irq

2007-10-14 Thread Yinghai Lu
On 10/14/07, Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:
>
> On Sun, 2007-10-14 at 09:15 +0200, Manfred Spraul wrote:
> > Yinghai Lu wrote:
> > > On 10/13/07, Manfred Spraul <[EMAIL PROTECTED]> wrote:
> > >
> > >> Someone around with a MSI capable board? The forcedeth driver does
> > >> dev->irq = pci_dev->irq
> > >> in nv_probe(), especially before pci_enable_msi().
> > >> Does pci_enable_msi() change pci_dev->irq? Then we would disable the
> > >> wrong interrupt
> > >>
> > >
> > > the request_irq==>setup_irq will make dev->irq = pci_dev->irq.
> > >
> > >
> > Where is that?
> > Otherwise I would propose the attached patch. My board is not
> > MSI-capable, thus I can't test it myself.
>
> Why not just copy pcidev->irq to dev->irq once ?

it seems e1000 is using np->pci_dev->irq directly too.

YH


PROBLEM: kernel memory subsystem incorrectly invokes OOM killer under certain situations

2007-10-14 Thread Chris Drake
Hi linux-kernel,



[1.] One line summary of the problem:

kernel memory subsystem incorrectly invokes OOM killer under certain situations


[2.] Full description of the problem/report:

My guess is that whatever invokes the OOM killer is incorrectly
"deciding" that memory allocated for disk cache operations cannot be
"reclaimed", or, the oom killer code itself is incorrectly killing
processes when the cause of the memory exhaustion is the disk cache
subsystem (and not a runaway process).

Specifically - I have a RedHat AS4u5 2.6.9-55.0.6.ELsmp system with
4gigs RAM, running vmware 1.0.4, and another AS4 guest, which has 3
virtual SCSI drives.  The following guest command reliably causes the
host OOM killer to terminate my vmware process:

dd if=/dev/sdb of=/dev/sdc

(to clone the contents of a 16gb virtual disk).  The host has one 2TB
file system only.

While it's easiest to use vmware to demonstrate the problem, this does
not appear to be a problem with vmware itself.


[3.] Keywords (i.e., modules, networking, kernel):

/usr/src/redhat/BUILD/kernel-2.6.9/linux-2.6.9/mm/oom_kill.c

OOM killer


[4.] Kernel version (from /proc/version):

Linux version 2.6.9-55.0.6.ELsmp ([EMAIL PROTECTED]) (gcc version 3.4.6 
20060404 (Red Hat 3.4.6-8)) #1 SMP Thu Aug 23 11:11:20 EDT 2007


[5.] Output of Oops.. message (if applicable) with symbolic information 
 resolved (see Documentation/oops-tracing.txt)

Here's the messages output showing the offending oom-kill.

Oct 14 21:05:14 dor kernel: oom-killer: gfp_mask=0xd0
Oct 14 21:05:14 dor kernel: Mem-info:
Oct 14 21:05:14 dor kernel: DMA per-cpu:
Oct 14 21:05:14 dor kernel: cpu 0 hot: low 2, high 6, batch 1
Oct 14 21:05:14 dor kernel: cpu 0 cold: low 0, high 2, batch 1
Oct 14 21:05:14 dor kernel: cpu 1 hot: low 2, high 6, batch 1
Oct 14 21:05:14 dor kernel: cpu 1 cold: low 0, high 2, batch 1
Oct 14 21:05:14 dor kernel: cpu 2 hot: low 2, high 6, batch 1
Oct 14 21:05:14 dor kernel: cpu 2 cold: low 0, high 2, batch 1
Oct 14 21:05:14 dor kernel: cpu 3 hot: low 2, high 6, batch 1
Oct 14 21:05:14 dor kernel: cpu 3 cold: low 0, high 2, batch 1
Oct 14 21:05:14 dor kernel: Normal per-cpu:
Oct 14 21:05:14 dor kernel: cpu 0 hot: low 32, high 96, batch 16
Oct 14 21:05:20 dor kernel: cpu 0 cold: low 0, high 32, batch 16
Oct 14 21:05:20 dor kernel: cpu 1 hot: low 32, high 96, batch 16
Oct 14 21:05:21 dor kernel: cpu 1 cold: low 0, high 32, batch 16
Oct 14 21:05:21 dor kernel: cpu 2 hot: low 32, high 96, batch 16
Oct 14 21:05:21 dor kernel: cpu 2 cold: low 0, high 32, batch 16
Oct 14 21:05:21 dor kernel: cpu 3 hot: low 32, high 96, batch 16
Oct 14 21:05:21 dor kernel: cpu 3 cold: low 0, high 32, batch 16
Oct 14 21:05:21 dor kernel: HighMem per-cpu:
Oct 14 21:05:21 dor kernel: cpu 0 hot: low 32, high 96, batch 16
Oct 14 21:05:21 dor kernel: cpu 0 cold: low 0, high 32, batch 16
Oct 14 21:05:21 dor kernel: cpu 1 hot: low 32, high 96, batch 16
Oct 14 21:05:21 dor kernel: cpu 1 cold: low 0, high 32, batch 16
Oct 14 21:05:21 dor kernel: cpu 2 hot: low 32, high 96, batch 16
Oct 14 21:05:21 dor kernel: cpu 2 cold: low 0, high 32, batch 16
Oct 14 21:05:21 dor kernel: cpu 3 hot: low 32, high 96, batch 16
Oct 14 21:05:21 dor kernel: cpu 3 cold: low 0, high 32, batch 16
Oct 14 21:05:21 dor kernel: 
Oct 14 21:05:21 dor kernel: Free pages:   26152kB (3584kB HighMem)
Oct 14 21:05:21 dor kernel: Active:599689 inactive:398895 dirty:429 
writeback:15 unstable:0 free:6538 slab:13298 mapped:369678 pagetables:6087
Oct 14 21:05:21 dor kernel: DMA free:12544kB min:180kB low:360kB high:540kB 
active:0kB inactive:0kB present:16384kB pages_scanned:871 all_unreclaimable? yes
Oct 14 21:05:27 dor kernel: protections[]: 0 0 0
Oct 14 21:05:28 dor kernel: Normal free:10024kB min:10056kB low:20112kB 
high:30168kB active:928kB inactive:775024kB present:901120kB 
pages_scanned:5812455 all_unreclaimable? yes
Oct 14 21:05:28 dor kernel: protections[]: 0 0 0
Oct 14 21:05:28 dor kernel: HighMem free:3584kB min:512kB low:1024kB 
high:1536kB active:2397828kB inactive:820556kB present:3538944kB 
pages_scanned:0 all_unreclaimable? no
Oct 14 21:05:28 dor kernel: protections[]: 0 0 0
Oct 14 21:05:28 dor kernel: DMA: 4*4kB 4*8kB 3*16kB 3*32kB 3*64kB 3*128kB 
2*256kB 0*512kB 1*1024kB 1*2048kB 2*4096kB = 12544kB
Oct 14 21:05:28 dor kernel: Normal: 0*4kB 1*8kB 0*16kB 1*32kB 0*64kB 10*128kB 
6*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 10024kB
Oct 14 21:05:28 dor kernel: HighMem: 52*4kB 68*8kB 85*16kB 26*32kB 2*64kB 
0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 3584kB
Oct 14 21:05:28 dor kernel: Swap cache: add 678557, delete 673525, find 
277514/347205, race 0+5
Oct 14 21:05:28 dor kernel: 0 bounce buffer pages
Oct 14 21:05:28 dor kernel: Free swap:   20303648kB
Oct 14 21:05:28 dor kernel: 1114112 pages of RAM
Oct 14 21:05:28 dor kernel: 819184 pages of HIGHMEM
Oct 14 21:05:28 dor kernel: 75731 reserved pages
Oct 14 21:05:28 dor kernel: 1013077 pages shared
Oct 14 21:05:28 dor kernel: 5040 pages swap cached
Oct 
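
For reference, the zone lines in the log above can be cross-checked with
a few lines of arithmetic. The sketch below (plain userspace C, helper
names invented for illustration) sums each buddy free list as printed
and compares a zone's free memory against its "min" watermark. It only
restates numbers from the log; which check in the 2.6.9 allocator
actually fired is not something this sketch claims to know.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ORDER 11   /* orders 0..10, i.e. 4kB..4096kB blocks */

/* Sum a buddy free list: counts[i] free blocks of (4 << i) kB each. */
static unsigned long toy_buddy_free_kb(const unsigned int counts[TOY_MAX_ORDER])
{
	unsigned long total = 0;

	assert(counts != NULL);
	for (size_t order = 0; order < TOY_MAX_ORDER; order++)
		total += (unsigned long)counts[order] * (4UL << order);
	return total;
}

/* Nonzero when a zone's free memory is below its "min" watermark. */
static int toy_below_min(unsigned long free_kb, unsigned long min_kb)
{
	return free_kb < min_kb;
}
```

Fed with the figures above, the Normal list sums to 10024kB, which is
below its min of 10056kB, while DMA (12544kB against 180kB) and HighMem
(3584kB against 512kB) are above theirs.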

Re: [PATCH] Version 7 (2.6.23) Smack: Simplified Mandatory Access Control Kernel

2007-10-14 Thread Ahmed S. Darwish
Hi Casey,

On Sun, Oct 14, 2007 at 10:15:42AM -0700, Casey Schaufler wrote:
> 
> +
> +CIPSO Configuration
> +
> +It is normally unnecessary to specify the CIPSO configuration. The default
> +values used by the system handle all internal cases. Smack will compose CIPSO
> +label values to match the Smack labels being used without administrative
> +intervention. 
>

I have two issues with CIPSO and Smack:

1-

Using the default configuration (system startup script + smackfs fstab line),
the system can't access any service outside the LAN. An "ICMP parameter
problem" message always appears from the first WAN router (traceroute +
tcpdump at [1]).

Services inside the LAN can be accessed normally. The system can connect to a
LAN Windows share. It also connects to the gateway HTTP server easily.

After some tweaking, I discovered that using CIPSOv6 solves all the above
problems:
$ echo -n "NLBL_CIPSOv6" > /smack/nltype

Is this normal behaviour?

2-

> 4. Any access requested on an object labeled "*" is permitted.
[...]
> +Unlabeled packets that come into the system will be given the
> +ambient label.

The default configuration sets the ambient attribute to _, which works fine.
Setting ambient = * stops all external (non-lo) network traffic. Did I miss
another use of "ambient", or is this normal behaviour?

> +Administration
> +
> +Smack supports some mount options:
> +
> + smackfsdef=label: specifies the label to give files that lack
> + the Smack label extended attribute.
> +

Although I am using smackfsdef=* as a mount option, all my system files have
the floor attribute. Most of the /dev files have the * attribute, though.


[1]

traceroute to google.com (64.233.187.99), 30 hops max, 40 byte packets
 1  host-196.218.207.17.tedata.net (196.218.207.17)  1.976 ms  1.850 ms  2.127 ms
 2  DOKKI-R03C-GZ-EG (163.121.170.78)  27.429 ms  28.091 ms  23.336 ms

Here's the tcpdump for accessing google.com:

22:51:26.008883 IP host-196.218.207.18.tedata.net.54011 > 
host-196.218.207.17.tedata.net.domain:  11001+ A? google.com. (28)
22:51:26.011066 IP host-196.218.207.18.tedata.net.45317 > 
host-196.218.207.17.tedata.net.domain:  44913+[|domain]
22:51:26.052154 IP host-196.218.207.17.tedata.net.domain > 
host-196.218.207.18.tedata.net.54011:  11001 3/0/0 A 
py-in-f99.google.com,[|domain]
22:51:26.052700 IP host-196.218.207.18.tedata.net.57180 > 
py-in-f99.google.com.www: S 282373541:282373541(0) win 5840 
22:51:26.090608 IP host-196.218.207.17.tedata.net.domain > 
host-196.218.207.18.tedata.net.45317:  44913 1/0/0 (89)
22:51:26.091473 IP host-196.218.207.18.tedata.net.34417 > 
host-196.218.207.17.tedata.net.domain:  49202+[|domain]
--> 22:51:26.105443 IP DOKKI-R03C-GZ-EG > host-196.218.207.18.tedata.net: ICMP 
parameter problem - octet 20, length 48

Best Regards,

-- 
Ahmed S. Darwish
HomePage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread Jeremy Fitzhardinge
David Chinner wrote:
> You mean xfs_buf.c.
>   

Yes, sorry.

> And yes, we delay unmapping pages until we have a batch of them
> to unmap. vmap and vunmap do not scale, so this batching helps
> alleviate some of the worst of the problems.
>   

How much performance does it cost?  What kind of workloads would it show
up under?

> Realistically, if this delayed release of vmaps is a problem for
> Xen, then I think that some generic VM solution is needed to this
> problem as vmap() is likely to become more common in future (think
> large blocks in filesystems). Nick - any comments?
>   

Well, the only real problem is that the pages are returned to the free
pool and reallocated while still being part of a mapping.  If the pages
are still owned by the filesystem/pagecache, then there's no problem.

What's the lifetime of things being vmapped/unmapped in xfs?  Are they
necessarily being freed when they're unmapped, or could unmapping of
freed memory be more immediate than other memory?

Maybe it just needs a notifier chain or something.

J


Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread David Chinner
On Fri, Oct 12, 2007 at 09:58:43AM -0700, Jeremy Fitzhardinge wrote:
> Hi Dave & other XFS folk,
> 
> I'm tracking down a bug which appears to be a bad interaction between XFS
> and Xen.  It looks like XFS is holding RW mappings on free pages, which Xen
> is trying to get an exclusive RO mapping on so it can turn them into
> pagetables.
> 
> I'm assuming the pages are actually free, and this isn't a use after free
> problem.  From a quick poke around, the most likely piece of XFS code is
> the stuff in xfs_map.c which creates a virtually contiguous mapping of pages
> with vmap, and seems to delay unmapping them.

You mean xfs_buf.c.

And yes, we delay unmapping pages until we have a batch of them
to unmap. vmap and vunmap do not scale, so this batching helps
alleviate some of the worst of the problems.

> When pinning a pagetable, Xen tries to eliminate any RW aliases of the pages
> it's using.  This is generally trivial because pages returned by
> get_free_page don't have any mappings aside from the normal kernel mapping.
> High pages, when using CONFIG_HIGHPTE, may have a residual kmap mapping, so
> we clear out the kmap cache if we encounter a highpage in the pagetable.
> 
> I guess we could create a special-case interface to do the same thing with
> XFS mappings, but it would be nicer to have something more generic.

*nod*

> Is my analysis correct?  Or should XFS not be holding stray mappings?  Or is
> there already some kind of generic mechanism I can use to get it to release
> its mappings?

The xfsbufd cleans out any stale mappings - it's woken by the memory
shrinker interface (i.e. calls xfsbufd_wakeup()). Otherwise every
64th buffer being vmap()d will flush out stale mappings.

Realistically, if this delayed release of vmaps is a problem for
Xen, then I think that some generic VM solution is needed to this
problem as vmap() is likely to become more common in future (think
large blocks in filesystems). Nick - any comments?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: Which companies are helping developing the kernel

2007-10-14 Thread Jiri Kosina
On Mon, 15 Oct 2007, Stefan Heinrichsen wrote:

> I posted this question at comp.linux.misc and was told this would be a
> better place for it. I would like to do an internship in the field of
> the Linux kernel. Can someone tell me where to find a list of companies
> (no matter in which country) that employ kernel developers?

Look at Greg's talk from the last OLS: 
https://ols2006.108.redhat.com/2007/Reprints/kroah-hartman-Reprint.pdf

-- 
Jiri Kosina


Re: Which companies are helping developing the kernel

2007-10-14 Thread Guilherme Amadio
On Mon, Oct 15, 2007 at 12:06:22AM +0200, Stefan Heinrichsen wrote:
> Hello,
> 
> I posted this question at comp.linux.misc and was told this would be a
> better place for it.
> I would like to do an internship in the field of the Linux kernel.
> Can someone tell me where to find a list of companies (no matter in which
> country) that employ kernel developers?
> 
  Hello,

  Adding to his question, I am interested in doing a PhD in Operating Systems.
  Would anybody have some information on a similar list of labs and/or
  universities that host projects related to Linux?
  
  Thanks in advance.

  Guilherme



Re: [2.4 patch] Port of adutux driver from 2.6 kernel to 2.4.

2007-10-14 Thread Willy Tarreau
On Sun, Oct 14, 2007 at 11:45:36PM +0300, Vitaliy Ivanov wrote:
> > Also, while I understand you would be very glad to get your work merged
> > (we all once had our first piece of code), I'd like to mention that you
> > seem to be the only user of this hardware under 2.4 (since it is currently
> > not supported). I'm not sure it's very reasonable to merge a driver in 2.4
> > right now for just one user. Even more, I understand that you finally moved
> > to other hardware, so my feeling is that you did this work as an exercise
> > (which was cleanly performed, BTW), but that it will not get any real use
> > in 2.4.
> 
> Yes, I would like it to be merged... But not just to see my name in
> the kernel sources.
> I'm the only one user of this hardware under 2.4 because it's some
> kind of trick to make it work under 2.4 w/o its support in the kernel.
> We moved to the other hardware because of our reasons and some
> customers can move to the other OS where this hardware will be
> supported.

But something's strange. If people were using 2.4, this hardware has
never worked for them, right? Why would they suddenly decide that
they have to switch OS or hardware? Or maybe this hardware replaces
an old one which was supported?

> I'm sure that we're not the first who tried to make it work in 2.4.

Perhaps, but if you were the last, interest is pretty limited :-)

> Original driver was created for 2.5 and > because interrupt out urbs
> were not supported in 2.4. Now it's not an issue.
> 
> >
> > Since 2.4 is moving very slowly, there should be no problem applying
> > this patch to any version if you really need to use it. Maybe it would
> > even work with your 2.4 enterprise kernel.
> >
> > Note that I'm not radically opposed to merge support for new drivers.
> > If you provide us with really good arguments for a merge, maybe I'll
> > change my opinion, but I doubt it, since the only users of this
> > device must currently be running 2.6.
> 
> I'm not going to force you to do this, also I can't do this:). But
> after going through the recent news where Greg proposed to create
> drivers for companies for free it looks really reasonable. I
> understand that we are talking about 2.4 but if you will simply run
> diff for adutux of 2.4 and 2.6 you will see that changes are really
> trivial.

That's what I've seen. I can propose something (unless someone
else raises his hand saying "no"): you update your patch with a
short description of what the hardware module is supposed to be used
for, and you agree to step up as the maintainer for this backport,
which will imply that you put your name and mail in the MAINTAINERS
file. That way, if you're the only user, nobody will be annoyed, and
if there are other users and some of them have problems, I don't waste
my time on something I don't know at all. If you agree with this deal
(which I think is fair), then I'm willing to merge your patch into
2.4.36-pre.

> Also IMHO the more drivers are in the tree the more users will use it.

Not necessarily. 2.4 is currently used by people who already are in 2.4
and cannot/do not want to switch, and by people who are looking for close
to zero maintenance. Drivers are often a reason to switch away from 2.4,
but not to stay in 2.4.

> Once it will be merged in the mainline then it will be backported to
> enterprise kernels and would gain wide usage.

I don't believe that. Enterprise kernels will not evolve much and will
probably not enable it as long as they have not tested it.

Regards,
Willy

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: What still uses the block layer?

2007-10-14 Thread Tilman Schmidt
On 14.10.2007 19:46, Stefan Richter wrote:
> David Newall wrote:
>> That is so rude.
> 
> Such responses sometimes happen after provocative posts like the thread
> starter's.

Provocation is often in the eye of the beholder, and basic manners
should be observed nevertheless.

>  He could have asked straight away for help with fixing his
> boot environment instead of wrapping his question into a feigned design
> discussion.

No, he couldn't have. He quite obviously didn't even know enough
to understand his boot environment might be at fault, and hence
was unable to conceive the question you're demanding from him.

>  It appeared as if he is out for a fight rather than
> interested in help.

It may have appeared like that from the highly antagonistic mindset
that seems so prevalent in LKML. But if one just stepped back and
took a breath before answering it should have been quite obvious
that he wasn't. (out for a fight, that is)

Granted, it can be difficult to comprehend the point of view of
someone who does not know or understand something you yourself know
or understand well. But you should at least be aware of that
inability, and consequently refrain from accusing of provocation
where there may be none. Hanlon's razor, cynical as it may sound at
first, is an eminently humanistic principle.

-- 
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)





Re: 2.6.23-mm1: BUG in reiserfs_delete_xattrs

2007-10-14 Thread Laurent Riffard
On 12.10.2007 06:31, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23/2.6.23-mm1/

/home is mounted with the following options:
   /dev/mapper/vglinux1-lvhome on /home type reiserfs (rw,noatime,nodiratime,user_xattr)

I guess that beagled (the Beagle desktop search daemon) has populated user
xattrs on almost all files. Now, when I delete a file, two BUGs occur
and the system hangs. Here is the stack for the first BUG (the second
one is very similar):

[partially hand copied stack]
_fput
fput
reiserfs_delete_xattrs
reiserfs_delete_inode
generic_delete_inode
generic_drop_inode
iput
do_unlinkat
sys_unlink
sys_enter_past_esp

I reported a similar BUG in 2.6.22-rc8-mm2 (see
http://lkml.org/lkml/2007/9/27/235). Dave Hansen sent a patch for it, I
tested it and it was OK for 2.6.22-rc8-mm2.

I tried this patch on 2.6.23-mm1, and it fixed the BUGs here too.


From: Dave Hansen <[EMAIL PROTECTED]>

The bug is caused by reiserfs creating a special 'struct file' with a
NULL vfsmount.  

/* Opens a file pointer to the attribute associated with inode */
static struct file *open_xa_file(const struct inode *inode, const char *name,
				 int flags)
{
...
	fp = dentry_open(xafile, NULL, O_RDWR);
	/* dentry_open dputs the dentry if it fails */


As Christoph just said, this is somewhat of a bandaid.  But, it
shouldn't hurt anything.
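
A minimal userspace sketch of the guard the patch below adds, for readers
who want to see the pattern in isolation. The names struct fake_mount,
fake_mnt_drop_write() and fake_fput_write() are hypothetical stand-ins and
not the kernel's types or functions; the only point is that the write count
is dropped only when a mount is actually attached.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vfsmount: only tracks active writers. */
struct fake_mount {
    int writers;
};

/* Hypothetical stand-in for mnt_drop_write(): it dereferences mnt, so a
 * NULL mount would crash here, which is the failure being worked around. */
static void fake_mnt_drop_write(struct fake_mount *mnt)
{
    mnt->writers--;
}

/* Models the patched branch: drop write access only for a non-special
 * file that has a mount attached.
 * Returns 1 if the write count was dropped, 0 if the drop was skipped. */
static int fake_fput_write(struct fake_mount *mnt, int is_special)
{
    if (!is_special && mnt != NULL) {
        fake_mnt_drop_write(mnt);
        return 1;
    }
    return 0;
}
```

Calling fake_fput_write(NULL, 0) simply skips the drop, which mirrors why
the change is described as a bandaid: the NULL mount is tolerated rather
than avoided.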

---

 lxc-dave/fs/file_table.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/open.c~fix-reiserfs-oops fs/open.c
diff -puN fs/file_table.c~fix-reiserfs-oops fs/file_table.c
--- lxc/fs/file_table.c~fix-reiserfs-oops   2007-09-27 13:32:20.0 -0700
+++ lxc-dave/fs/file_table.c    2007-09-27 13:33:11.0 -0700
@@ -236,7 +236,7 @@ void fastcall __fput(struct file *file)
 	fops_put(file->f_op);
 	if (file->f_mode & FMODE_WRITE) {
 		put_write_access(inode);
-		if (!special_file(inode->i_mode))
+		if (!special_file(inode->i_mode) && mnt)
 			mnt_drop_write(mnt);
 	}
 	put_pid(file->f_owner.pid);
diff -puN include/linux/mount.h~fix-reiserfs-oops include/linux/mount.h
_




Re: Which companies are helping developing the kernel

2007-10-14 Thread Benoit Boissinot
On 10/15/07, Alistair John Strachan <[EMAIL PROTECTED]> wrote:
> On Sunday 14 October 2007 23:06:22 Stefan Heinrichsen wrote:
> > Hello,
> >
> > I posted this question at comp.linux.misc and was told this would be a
> > better place for it. I would like to do an internship in the field of the
> > Linux kernel.
> > Can someone tell me where to find a list of companies (it doesn't matter in
> > which country) that employ kernel developers?
>
perhaps this helps:

http://lwn.net/Articles/247582/

regards,

Benoit


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-14 Thread Erez Zadok
In message <[EMAIL PROTECTED]>, Pekka J Enberg writes:
> Hi Erez,
> 
> On Sun, 14 Oct 2007, Erez Zadok wrote:
> > In unionfs_writepage() I tried to emulate as best possible what the lower
> > f/s will have returned to the VFS.  Since tmpfs's ->writepage can return
> > AOP_WRITEPAGE_ACTIVATE and re-mark its page as dirty, I did the same in
> > unionfs: mark again my page as dirty, and return AOP_WRITEPAGE_ACTIVATE.
> > 
> > Should I be doing something different when unionfs stacks on top of tmpfs?
> > (BTW, this is probably also relevant to ecryptfs.)
> 
> Look at mm/filemap.c:__filemap_fdatawrite_range(). You shouldn't be 
> calling unionfs_writepage() _at all_ if the lower mapping has 
> BDI_CAP_NO_WRITEBACK capability set. Perhaps something like the totally 
> untested patch below?
> 
>   Pekka
[...]

Pekka, with a small change to your patch (to handle time-based cache
coherency), your patch worked well and passed all my tests.  Thanks.

So now I wonder if we still need the patch to prevent AOP_WRITEPAGE_ACTIVATE
from being returned to userland.  I guess we still need it, b/c even with
your patch, generic_writepages() can return AOP_WRITEPAGE_ACTIVATE back to
the VFS and we need to ensure that doesn't "leak" outside the kernel.
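
A rough sketch of what "not leaking" the code could look like at the point
where a writeback result is about to be handed to a caller. The helper name
sanitize_writeback_result() is hypothetical, and mapping the internal code
to 0 is an assumption about the desired behaviour, not the content of any
patch in this thread.

```c
/* Value of AOP_WRITEPAGE_ACTIVATE as defined in include/linux/fs.h. */
#define AOP_WRITEPAGE_ACTIVATE 0x80000

/* Hypothetical helper: hide the kernel-internal "page was re-activated"
 * code from callers that expect 0 or a negative errno.
 * Any other value is passed through unchanged. */
static int sanitize_writeback_result(int ret)
{
    if (ret == AOP_WRITEPAGE_ACTIVATE)
        return 0;
    return ret;
}
```

Real error codes such as -EIO pass through untouched; only the positive
internal value is filtered.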

Erez.


Re: Don't leak 'listeners' in netlink_kernel_create()

2007-10-14 Thread Eric W. Biederman
Jesper Juhl <[EMAIL PROTECTED]> writes:

> From: Jesper Juhl <[EMAIL PROTECTED]>
> Subject: Don't leak 'listeners' in netlink_kernel_create()
>
> The Coverity checker spotted that we'll leak the storage allocated 
> to 'listeners' in netlink_kernel_create() when the
>   if (!nl_table[unit].registered)
> check is false.
>
> This patch avoids the leak.
>
>
> Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>

This patch appears trivially correct to me.
Acked-by: "Eric W. Biederman" <[EMAIL PROTECTED]>
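
For illustration, the shape of the leak and of the fix can be reproduced in
a few lines of userspace C. This is a simplified sketch: struct unit_slot,
table[] and create_unit() are hypothetical names that only mimic the
allocate-first, register-later flow described above.

```c
#include <stdlib.h>

#define MAX_UNITS 4

/* Hypothetical, simplified stand-in for one nl_table[] slot. */
struct unit_slot {
    int registered;
    unsigned long *listeners;
};

static struct unit_slot table[MAX_UNITS];

/* Allocates a listeners bitmap up front, then either hands ownership to
 * the table slot or frees it when the slot is already registered.
 * Returns 0 on success, -1 on a bad unit or allocation failure. */
static int create_unit(int unit)
{
    unsigned long *listeners;

    if (unit < 0 || unit >= MAX_UNITS)
        return -1;

    listeners = calloc(1, sizeof(*listeners));
    if (!listeners)
        return -1;

    if (!table[unit].registered) {
        table[unit].listeners = listeners;  /* the slot now owns it */
        table[unit].registered = 1;
    } else {
        free(listeners);                    /* the branch the patch adds */
    }
    return 0;
}
```

Without the else branch, a second create_unit() call for the same unit
would drop the only pointer to the fresh allocation.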

> ---
>
>  af_netlink.c |2 ++
>  1 file changed, 2 insertions(+)
>
> --- linux-2.6/net/netlink/af_netlink.c~   2007-10-14 23:29:50.0 +0200
> +++ linux-2.6/net/netlink/af_netlink.c    2007-10-14 23:29:50.0 +0200
> @@ -1378,6 +1378,8 @@ netlink_kernel_create(struct net *net, i
>   nl_table[unit].cb_mutex = cb_mutex;
>   nl_table[unit].module = module;
>   nl_table[unit].registered = 1;
> + } else {
> + kfree(listeners);
>   }
>   netlink_table_ungrab();
>  


Re: Which companies are helping developing the kernel

2007-10-14 Thread Alistair John Strachan
On Sunday 14 October 2007 23:06:22 Stefan Heinrichsen wrote:
> Hello,
>
> I posted this question at comp.linux.misc and was told this would be a
> better place for it. I would like to do an internship in the field of the
> Linux kernel.
> Can someone tell me where to find a list of companies (it doesn't matter in
> which country) that employ kernel developers?

I think Greg wrote a paper on this subject, so I've added him to CC in case he 
has the link handy.

-- 
Cheers,
Alistair.

137/1 Warrender Park Road, Edinburgh, UK.


Re: [PATCH] Hook compat_sys_nanosleep up to high res timer code

2007-10-14 Thread Arnd Bergmann
On Sunday 14 October 2007, Anton Blanchard wrote:
> Now we have high res timers on ppc64 I thought I'd test them. It turns
> out compat_sys_nanosleep hasn't been converted to the hrtimer code and so
> is limited to HZ resolution.
> 
> The following patch makes compat_sys_nanosleep call hrtimer_nanosleep
> and uses compat_alloc_user_space to avoid setting KERNEL_DS.
> 
> Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>

The code looks correct, but I think it would be nicer to change 
hrtimer_nanosleep to take a kernel pointer and have all three
callers (common_nsleep, sys_nanosleep and compat_sys_nanosleep)
do the copy_to_user/put_compat_timespec in the caller.

This would also make it possible to get rid of set_fs() in
compat_sys_clock_nanosleep().
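
A userspace sketch of the split being suggested, under stated assumptions:
the core routine works only on a plain native structure, and the 32-bit
wrapper does the validation and the conversion of the remaining time
itself. All names here (compat_ts, native_ts, core_sleep, compat_sleep) are
hypothetical, and the "sleep" is simulated by an elapsed_ns argument.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical 32-bit layout, modelled on struct compat_timespec. */
struct compat_ts {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/* Hypothetical native layout with 64-bit fields. */
struct native_ts {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Same idea as the kernel's timespec_valid() check. */
static int ts_valid(const struct native_ts *ts)
{
    return ts->tv_sec >= 0 && ts->tv_nsec >= 0 && ts->tv_nsec < NSEC_PER_SEC;
}

/* Core routine: pretends the sleep was interrupted after elapsed_ns and
 * reports the remainder through a plain pointer, with no user copy.
 * Returns 0 if the full time elapsed, -1 if time remains. */
static int core_sleep(const struct native_ts *req, struct native_ts *rem,
                      int64_t elapsed_ns)
{
    int64_t left = req->tv_sec * NSEC_PER_SEC + req->tv_nsec - elapsed_ns;

    if (left <= 0)
        return 0;
    rem->tv_sec = left / NSEC_PER_SEC;
    rem->tv_nsec = left % NSEC_PER_SEC;
    return -1;
}

/* 32-bit caller: converts in, validates, calls the core, converts out.
 * Returns -22 (EINVAL) for an invalid request. */
static int compat_sleep(const struct compat_ts *req32, struct compat_ts *rem32,
                        int64_t elapsed_ns)
{
    struct native_ts req = { req32->tv_sec, req32->tv_nsec };
    struct native_ts rem = { 0, 0 };
    int ret;

    if (!ts_valid(&req))
        return -22;

    ret = core_sleep(&req, &rem, elapsed_ns);
    if (ret && rem32) {
        rem32->tv_sec = (int32_t)rem.tv_sec;
        rem32->tv_nsec = (int32_t)rem.tv_nsec;
    }
    return ret;
}
```

With this shape the core never needs to know which user-space layout the
caller uses, which is what would make a set_fs() style workaround
unnecessary.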

Arnd <><


Re: In response to kernel compression e-mail a few months ago.

2007-10-14 Thread Justin Piszcz



On Sun, 14 Oct 2007, Jan Engelhardt wrote:

> On Oct 14 2007 16:58, Justin Piszcz wrote:
>>
>> compress:
>>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 10544 war   20   0  700m 681m 1632 S  141 20.7   1:41.46 7z
>
> Just how you can utilize a CPU to 141% remains a mystery..
> [ to be noted this is sqrt(2)*100 ]

It uses 2 cores (multi-thread/multi-core). I believe the author of 7z (I
asked him about this before) said the compression algorithm can use
1.8-2.2 CPUs.
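
As a rough illustration of how a figure above 100% arises, assuming the
per-process value is the CPU time of all threads summed and divided by the
wall-clock interval (cpu_percent() is a hypothetical helper, not code from
top or 7z):

```c
/* Hypothetical helper: percentage of one CPU used by a process, given the
 * CPU seconds each of its threads consumed during wall_sec seconds.
 * Returns 0.0 for a non-positive interval. */
static double cpu_percent(const double *thread_cpu_sec, int nthreads,
                          double wall_sec)
{
    double total = 0.0;
    int i;

    if (wall_sec <= 0.0)
        return 0.0;
    for (i = 0; i < nthreads; i++)
        total += thread_cpu_sec[i];
    return 100.0 * total / wall_sec;
}
```

Two threads that together consume about 1.41 CPU seconds per second of wall
time would show up as roughly 141%, so the value is a sum across cores
rather than a share of a single one.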


Justin.


Re: linux-2.6.23-mm1 crashed

2007-10-14 Thread James Bottomley
On Sun, 2007-10-14 at 12:21 -0700, Andrew Morton wrote:
> On Sun, 14 Oct 2007 22:45:47 +0400 "Dave Milter" <[EMAIL PROTECTED]> wrote:
> 
> > I build linux-2.6.23-mm1 and try to boot it using qemu,
> > and it crashed with trace like this:
> > do_page_fault
> > error_code
> > lock_acquire
> > _spin_lock_irqsave
> > gdth_timeout
> > run_timer_softirq
> > __do_softirq
> > do_softirq
> > 
> > I have screenshot, but have no idea, is it legal to include it, if I
> > sent copy to lkml.
> > config of kernel in attachment,
> > I apply all three patches from hot-fixes.
> > 
> 
> The screenshot is here:  http://userweb.kernel.org/~akpm/crash.png
> 
> It would appear that gdth_timeout() is passing a bad pointer into
> spin_lock_irqsave().

There's a bug in the gdth rework in that the instance can be deleted
from the list before the actual timer is stopped.  This can be worked
around I think by the following patch; although we really should be
stopping the timer from firing when the list goes empty.
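
The workaround can be pictured with a small userspace model. This is a
sketch only: the list below is a simplified re-implementation loosely
modelled on the kernel's struct list_head, and struct instance and
on_timeout() are hypothetical names. On an empty circular list the "first
entry" is the list head itself, so treating it as an instance would touch
memory that is not an instance at all.

```c
#include <stddef.h>

/* Minimal circular doubly linked list (simplified re-implementation). */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

static int list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_del_node(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
}

/* Hypothetical controller instance; 'link' is the first member so that a
 * node pointer can be converted back to the instance with a cast. */
struct instance {
    struct list_node link;
    int timeouts_seen;
};

/* Models the guarded timer callback.
 * Returns 1 if an instance was handled, 0 if the list was empty. */
static int on_timeout(struct list_node *instances)
{
    struct instance *ha;

    if (list_is_empty(instances))
        return 0;               /* the check the patch adds */

    ha = (struct instance *)instances->next;
    ha->timeouts_seen++;
    return 1;
}
```

The guard only makes a late timer harmless; it does not stop the timer,
which is the more complete fix mentioned above.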

James

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index e8010a7..7fa22be 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -3793,6 +3793,9 @@ static void gdth_timeout(ulong data)
 gdth_ha_str *ha;
 ulong flags;
 
+if (list_empty(_instances))
+   return;
+
 ha = list_first_entry(_instances, gdth_ha_str, list);
 spin_lock_irqsave(>smp_lock, flags);
 





Re: What still uses the block layer?

2007-10-14 Thread James Bottomley
On Sat, 2007-10-13 at 16:05 -0600, Matthew Wilcox wrote:
> On Thu, Oct 11, 2007 at 08:11:21PM -0500, Rob Landley wrote:
> > My impression from asking questions on the linux-scsi mailing list is that
> > the scsi upper/middle/lower layers don't use the block layer described in
> > Documentation/block/*.
> 
> Entirely incorrect.

OK, right ... could we please get a sense of decorum back on this list.

Rob, if you didn't ask your alleged questions in such a pejorative
manner, we'd get a lot further; and Matthew, if you didn't rise to the
bait so spectacularly it wouldn't prolong these threads.

Really, both of you, I have better things to do with my time than
mediate behaviours that should have been educated out of you in the
kindergarten sand pit.

James




Which companies are helping developing the kernel

2007-10-14 Thread Stefan Heinrichsen
Hello,

I posted this question at comp.linux.misc and was told this would be a better
place for it.
I would like to do an internship in the field of the Linux kernel.
Can someone tell me where to find a list of companies (it doesn't matter in
which country) that employ kernel developers?

Stefan



Re: 2.6.23-mm1

2007-10-14 Thread Milan Broz
Andrew Morton wrote:
> On Sun, 14 Oct 2007 21:12:08 +0200 "Torsten Kaiser" <[EMAIL PROTECTED]> wrote:
...
>> 354036 Page allocated via order 0, mask 0x11202
>> 1 (PFN/Block always differ) PFN 3072 Block 6 type 0 Flags
>> 354338 [0x80266373] mempool_alloc+83
>> 354338 [0x80266373] mempool_alloc+83
>> 354025 [0x802bb389] bio_alloc_bioset+185
>> 354058 [0x804d2b40] kcryptd_do_crypt+0
>> 354052 [0x804d2cc7] kcryptd_do_crypt+391
>> 354058 [0x804d2b40] kcryptd_do_crypt+0
>> 354052 [0x80245d3c] run_workqueue+204
>> 354062 [0x802467b0] worker_thread+0
>>
>> I'm using dm-crypt with CONFIG_CRYPTO_TWOFISH_X86_64
>>
>>> The other info shows a tremendous memory leak, not via slab.  Looks like
>>> someone is running alloc_pages() directly and isn't giving them back.
>> Blaming it on dm-crypt looks right, as the leak seems to happen if
>> there is (heavy) disk activity.
>> (updatedb just ate ~500 MB)
>>
> 
> Yup, it does appear that dm-crypt is leaking.  Let's add some cc's.

More precisely: the change below, from the git-block.patch update,
means that the pages are not deallocated at all.
(cc-ing Jens)

-static int crypt_endio(struct bio *clone, unsigned int done, int error)
+static void crypt_endio(struct bio *clone, int error)
...
-* free the processed pages, even if
-* it's only a partially completed write
+* free the processed pages
 */
-   if (!read_io)
-   crypt_free_buffer_pages(cc, clone, done);
-
-   /* keep going - not finished yet */
-   if (unlikely(clone->bi_size))
-   return 1;
-
-   if (!read_io)
+   if (!read_io) {
+   crypt_free_buffer_pages(cc, clone, clone->bi_size);
goto out;
+   }

clone->bi_size is now zero here, so crypt_free_buffer_pages will not
work correctly (previously it was passed the count of processed bytes).

But because it seems that a bio can no longer be processed partially, we can
simplify crypt_free_buffer_pages to always remove all allocated pages.
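
A small model of why a zero byte count leaks, and of the proposed
simplification. Everything here is hypothetical (struct clone_buf,
free_pages_by_count(), free_all_pages(), a 4096-byte page); it only mirrors
the arithmetic, not the dm-crypt code.

```c
#define PAGE_BYTES 4096u

/* Hypothetical model of a write clone: pages still held by it. */
struct clone_buf {
    unsigned int allocated;
};

/* Frees only the pages covered by 'bytes', as a byte-count based helper
 * would. Returns the number of pages released. */
static unsigned int free_pages_by_count(struct clone_buf *c, unsigned int bytes)
{
    unsigned int pages = (bytes + PAGE_BYTES - 1) / PAGE_BYTES;

    if (pages > c->allocated)
        pages = c->allocated;
    c->allocated -= pages;
    return pages;
}

/* The simplification: release everything, no byte count needed.
 * Returns the number of pages released. */
static unsigned int free_all_pages(struct clone_buf *c)
{
    unsigned int pages = c->allocated;

    c->allocated = 0;
    return pages;
}
```

Passing 0 bytes to the count-based helper releases nothing, which matches
the observed leak; the unconditional variant cannot be starved that way.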

Milan
--
[EMAIL PROTECTED]


Re: hdparm standby timeout not working for WD raptors?

2007-10-14 Thread Mark Weber
On 10/14/07, Bart Samwel <[EMAIL PROTECTED]> wrote:
>
> Just to be sure: you did use -S 60 to get 5 minutes, right?

Yes. And hdparm is kind enough to print:

/dev/sda:
 setting standby to 60 (5 minutes)
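
For reference, the mapping from the -S value to a duration can be sketched
as below, assuming the encoding described in the hdparm man page.
standby_seconds() is a hypothetical helper written for this illustration
and is not part of hdparm.

```c
/* Decodes an hdparm -S value into seconds, following the man page:
 *   0        timeout disabled (returned as 0)
 *   1..240   multiples of 5 seconds
 *   241..251 1..11 units of 30 minutes
 *   252      21 minutes
 *   253      vendor-defined, 254 reserved (no fixed duration: -1)
 *   255      21 minutes 15 seconds
 * Returns -1 for values outside 0..255 as well. */
static long standby_seconds(int value)
{
    if (value < 0 || value > 255)
        return -1;
    if (value == 0)
        return 0;
    if (value <= 240)
        return value * 5L;
    if (value <= 251)
        return (value - 240) * 30L * 60L;
    if (value == 252)
        return 21L * 60L;
    if (value == 255)
        return 21L * 60L + 15L;
    return -1;
}
```

A value of 60 falls in the first range, 60 * 5 = 300 seconds, which is the
5 minutes hdparm reports.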

Here's a bizarre sequence which I just noticed:
[extraneous blank lines removed for clarity]

>> hdparm -C /dev/sd[abcde]
/dev/sda:  drive state is:  standby
/dev/sdb:  drive state is:  standby
/dev/sdc:  drive state is:  standby
/dev/sdd:  drive state is:  standby
/dev/sde:  drive state is:  standby

>> hdparm -S 60 /dev/sda
/dev/sda: setting standby to 60 (5 minutes)

>> hdparm -C /dev/sd[abcde]
/dev/sda: drive state is:  active/idle
/dev/sdb: drive state is:  active/idle
/dev/sdc: drive state is:  standby
/dev/sdd: drive state is:  standby
/dev/sde: drive state is:  standby

Note that the -S 60 on /dev/sda affected
/dev/sdb too! This is repeatable.

I have these drives as RAID5 (software RAID).
I don't know if that has anything to do with the
failure of -S or not. Don't know if hdparm bypasses
the RAID or not.


[PATCH] Hook compat_sys_nanosleep up to high res timer code

2007-10-14 Thread Anton Blanchard

Now we have high res timers on ppc64 I thought I'd test them. It turns
out compat_sys_nanosleep hasn't been converted to the hrtimer code and so
is limited to HZ resolution.

The following patch makes compat_sys_nanosleep call hrtimer_nanosleep
and uses compat_alloc_user_space to avoid setting KERNEL_DS.

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>
---

diff --git a/kernel/compat.c b/kernel/compat.c
index 3bae374..46795ac 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -40,62 +40,29 @@ int put_compat_timespec(const struct timespec *ts, struct compat_timespec __user
__put_user(ts->tv_nsec, >tv_nsec)) ? -EFAULT : 0;
 }
 
-static long compat_nanosleep_restart(struct restart_block *restart)
-{
-   unsigned long expire = restart->arg0, now = jiffies;
-   struct compat_timespec __user *rmtp;
-
-   /* Did it expire while we handled signals? */
-   if (!time_after(expire, now))
-   return 0;
-
-   expire = schedule_timeout_interruptible(expire - now);
-   if (expire == 0)
-   return 0;
-
-   rmtp = (struct compat_timespec __user *)restart->arg1;
-   if (rmtp) {
-   struct compat_timespec ct;
-   struct timespec t;
-
-   jiffies_to_timespec(expire, );
-   ct.tv_sec = t.tv_sec;
-   ct.tv_nsec = t.tv_nsec;
-   if (copy_to_user(rmtp, , sizeof(ct)))
-   return -EFAULT;
-   }
-   /* The 'restart' block is already filled in */
-   return -ERESTART_RESTARTBLOCK;
-}
-
 asmlinkage long compat_sys_nanosleep(struct compat_timespec __user *rqtp,
-   struct compat_timespec __user *rmtp)
+struct compat_timespec __user *rmtp)
 {
-   struct timespec t;
-   struct restart_block *restart;
-   unsigned long expire;
+   struct timespec tu;
+   struct timespec __user *rmtp64;
+   long ret;
 
-   if (get_compat_timespec(, rqtp))
+   if (get_compat_timespec(, rqtp))
return -EFAULT;
 
-   if ((t.tv_nsec >= 1000000000L) || (t.tv_nsec < 0) || (t.tv_sec < 0))
+   if (!timespec_valid())
return -EINVAL;
 
-   expire = timespec_to_jiffies() + (t.tv_sec || t.tv_nsec);
-   expire = schedule_timeout_interruptible(expire);
-   if (expire == 0)
-   return 0;
+   rmtp64 = compat_alloc_user_space(sizeof(*rmtp64));
+   ret = hrtimer_nanosleep(, rmtp64, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 
-   if (rmtp) {
-   jiffies_to_timespec(expire, );
-   if (put_compat_timespec(, rmtp))
+   if (ret) {
+   if (copy_from_user(, rmtp64, sizeof(*rmtp64)) ||
+   put_compat_timespec(, rmtp))
return -EFAULT;
}
-   restart = _thread_info()->restart_block;
-   restart->fn = compat_nanosleep_restart;
-   restart->arg0 = jiffies + expire;
-   restart->arg1 = (unsigned long) rmtp;
-   return -ERESTART_RESTARTBLOCK;
+
+   return ret;
 }
 
 static inline long get_compat_itimerval(struct itimerval *o,


Re: {GIT pull] x86 bugfixes

2007-10-14 Thread Thomas Gleixner
On Sun, 14 Oct 2007, Jeff Garzik wrote:
> Unless the size is overlarge (currently 400k, on lkml), any chance I could
> talk you into appending the associated patch onto the end of future emails?

Sure. See below.
 
> If it helps, I use the attached script when I send stuff upstream.

Cute. 

Thanks,

tglx
---

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index bd72d94..11b03d3 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define MAX_PATCH_LEN (255-1)
 
diff --git a/arch/x86/kernel/apic_64.c b/arch/x86/kernel/apic_64.c
index 395928d..09b8209 100644
--- a/arch/x86/kernel/apic_64.c
+++ b/arch/x86/kernel/apic_64.c
@@ -964,8 +964,34 @@ void __init setup_boot_APIC_clock (void)
setup_APIC_timer();
 }
 
+/*
+ * AMD C1E enabled CPUs have a real nasty problem: Some BIOSes set the
+ * C1E flag only in the secondary CPU, so when we detect the wreckage
+ * we already have enabled the boot CPU local apic timer. Check, if
+ * disable_apic_timer is set and the DUMMY flag is cleared. If yes,
+ * set the DUMMY flag again and force the broadcast mode in the
+ * clockevents layer.
+ */
+void __cpuinit check_boot_apic_timer_broadcast(void)
+{
+   struct clock_event_device *levt = _cpu(lapic_events, boot_cpu_id);
+
+   if (!disable_apic_timer ||
+   (lapic_clockevent.features & CLOCK_EVT_FEAT_DUMMY))
+   return;
+
+   printk(KERN_INFO "AMD C1E detected late. Force timer broadcast.\n");
+   lapic_clockevent.features |= CLOCK_EVT_FEAT_DUMMY;
+   levt->features |= CLOCK_EVT_FEAT_DUMMY;
+
+   local_irq_enable();
+   clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_FORCE, _cpu_id);
+   local_irq_disable();
+}
+
 void __cpuinit setup_secondary_APIC_clock(void)
 {
+   check_boot_apic_timer_broadcast();
setup_APIC_timer();
 }
 
diff --git a/arch/x86/kernel/smpboot_64.c b/arch/x86/kernel/smpboot_64.c
index 57ccf7c..720a7d1 100644
--- a/arch/x86/kernel/smpboot_64.c
+++ b/arch/x86/kernel/smpboot_64.c
@@ -335,11 +335,6 @@ void __cpuinit start_secondary(void)
 */
check_tsc_sync_target();
 
-   Dprintk("cpu %d: setting up apic clock\n", smp_processor_id()); 
-   setup_secondary_APIC_clock();
-
-   Dprintk("cpu %d: enabling apic timer\n", smp_processor_id());
-
if (nmi_watchdog == NMI_IO_APIC) {
disable_8259A_irq(0);
enable_NMI_through_LVT0(NULL);
@@ -374,6 +369,8 @@ void __cpuinit start_secondary(void)
 
unlock_ipi_call_lock();
 
+   setup_secondary_APIC_clock();
+
cpu_idle();
 }
 
diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index d2ddea9..c33b0dc 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -31,6 +31,7 @@ enum clock_event_nofitiers {
CLOCK_EVT_NOTIFY_ADD,
CLOCK_EVT_NOTIFY_BROADCAST_ON,
CLOCK_EVT_NOTIFY_BROADCAST_OFF,
+   CLOCK_EVT_NOTIFY_BROADCAST_FORCE,
CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
CLOCK_EVT_NOTIFY_SUSPEND,
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 298bc7c..fc3fc79 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -217,26 +217,43 @@ static void tick_do_broadcast_on_off(void *why)
bc = tick_broadcast_device.evtdev;
 
/*
-* Is the device in broadcast mode forever or is it not
-* affected by the powerstate ?
+* Is the device not affected by the powerstate ?
 */
-   if (!dev || !tick_device_is_functional(dev) ||
-   !(dev->features & CLOCK_EVT_FEAT_C3STOP))
+   if (!dev || !(dev->features & CLOCK_EVT_FEAT_C3STOP))
goto out;
 
-   if (*reason == CLOCK_EVT_NOTIFY_BROADCAST_ON) {
+   /*
+* Defect device ?
+*/
+   if (!tick_device_is_functional(dev)) {
+   /*
+* AMD C1E wreckage fixup:
+*
+* Device was registered functional in the first
+* place. Now the secondary CPU detected the C1E
+* misfeature and notifies us to fix it up
+*/
+   if (*reason != CLOCK_EVT_NOTIFY_BROADCAST_FORCE)
+   goto out;
+   }
+
+   switch (*reason) {
+   case CLOCK_EVT_NOTIFY_BROADCAST_ON:
+   case CLOCK_EVT_NOTIFY_BROADCAST_FORCE:
if (!cpu_isset(cpu, tick_broadcast_mask)) {
cpu_set(cpu, tick_broadcast_mask);
if (td->mode == TICKDEV_MODE_PERIODIC)
clockevents_set_mode(dev,
 CLOCK_EVT_MODE_SHUTDOWN);
}
-   } else {
+   break;
+   case CLOCK_EVT_NOTIFY_BROADCAST_OFF:
if (cpu_isset(cpu, tick_broadcast_mask)) {
   

Re: In response to kernel compression e-mail a few months ago.

2007-10-14 Thread Jan Engelhardt

On Oct 14 2007 16:58, Justin Piszcz wrote:
>
> compress:
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 10544 war   20   0  700m 681m 1632 S  141 20.7   1:41.46 7z

Just how you can utilize a CPU to 141% remains a mystery..
[ to be noted this is sqrt(2)*100 ]


Re: MSI interrupts and disable_irq

2007-10-14 Thread Benjamin Herrenschmidt

On Sun, 2007-10-14 at 09:15 +0200, Manfred Spraul wrote:
> Yinghai Lu wrote:
> > On 10/13/07, Manfred Spraul <[EMAIL PROTECTED]> wrote:
> >   
> >> Someone around with a MSI capable board? The forcedeth driver does
> >> dev->irq = pci_dev->irq
> >> in nv_probe(), especially before pci_enable_msi().
> >> Does pci_enable_msi() change pci_dev->irq? Then we would disable the
> >> wrong interrupt
> >> 
> >
> > the request_irq==>setup_irq will make dev->irq = pci_dev->irq.
> >
> >   
> Where is that?
> Otherwise I would propose the attached patch. My board is not 
> MSI-capable, thus I can't test it myself.

Why not just copy pcidev->irq to dev->irq once ?

Ben.




Re: {GIT pull] x86 bugfixes

2007-10-14 Thread Jeff Garzik

Thomas Gleixner wrote:
> Linus,
>
> please pull from
>
>   ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86.git
>
> Thanks,
>
> tglx
> --
> Dave Jones (1):
>   x86: fix missing include for vsyscall
>
> Thomas Gleixner (3):
>   clockevents: introduce force broadcast notifier
>   x86: move local APIC timer init to the end of start_secondary()
>   x86: force timer broadcast on late AMD C1E detection
>
>  arch/x86/kernel/alternative.c |1 +
>  arch/x86/kernel/apic_64.c |   26 ++
>  arch/x86/kernel/smpboot_64.c  |7 ++-
>  include/linux/clockchips.h|1 +
>  kernel/time/tick-broadcast.c  |   29 +++--
>  kernel/time/tick-common.c |1 +
>  6 files changed, 54 insertions(+), 11 deletions(-)


Unless the size is overlarge (currently 400k, on lkml), any chance I 
could talk you into appending the associated patch onto the end of 
future emails?


If it helps, I use the attached script when I send stuff upstream.

Jeff





mkmsg.sh
Description: application/shellscript


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-14 Thread Mark Lord

Rafael J. Wysocki wrote:
> On Sunday, 14 October 2007 22:13, Mark Lord wrote:
>> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been
>> misbehaving here.
>>
>> It takes much (+5-7 seconds) longer to resume *sometimes*, but not
>> all/most of the time.
>> And sometimes I get flashing keyboard LEDs and have to hold the power
>> button in for a full hard reset.
>>
>> With 2.6.23-rc8/rc9, no such troubles.
>>
>> Difficult to reproduce, other than perhaps once a day.
>> Anybody want to fess up with a likely candidate?
>
> Not really, but if you rule out all of the POWERPC and MIPS patches, there's
> not much left ...


Yeah, I didn't see much there either.

I'll keep an eye on things over the next few days,
and post again if it persists.

I was using the powertop patches with -rc9, but not with 2.6.23.1.
Maybe they helped (???).

-ml


Re: PROBLEM: kernel panic

2007-10-14 Thread Scott Petler

Ok, I'll do that
Thanks

Sent from my iPhone

On Oct 14, 2007, at 2:32 PM, Trond Myklebust <[EMAIL PROTECTED]> wrote:

> On Sun, 2007-10-14 at 14:00 -0700, Scott Petler wrote:
>> Doug,
>>
>> I thought that might do it, it does seem to work.  I edited the driver
>> line in my xorg.conf
>> from nvidia to nv and then,
>>  F1
>> login as root
>> /etc/init.d/gdm stop
>> /etc/init.d/gdm start
>>
>> It came back up with the same flashing crap on the second monitor, so I
>> did it again with
>> the 2nd monitor turned off altogether, and I'll see how long it will run
>> on the single monitor
>> without the nvidia driver without crashing (maybe forever...).
>>
>> Thanks,
>> Scott
>
> Please don't forget to reboot, though. The kernel will remain tainted
> until you've rebooted without loading the nvidia module.
>
> Cheers
>   Trond



Re: PROBLEM: kernel panic

2007-10-14 Thread Trond Myklebust

On Sun, 2007-10-14 at 14:00 -0700, Scott Petler wrote:
> Doug,
> 
> I thought that might do it, it does seem to work.  I edited the driver 
> line in my xorg.conf
> from nvidia to nv and then,
>  F1
> login as root
> /etc/init.d/gdm stop
> /etc/init.d/gdm start
> 
> It came back up with the same flashing crap on the second monitor, so I 
> did it again with
> the 2nd monitor turned off altogether, and I'll see how long it will run 
> on the single monitor
> without the nvidia driver without crashing (maybe forever...).
> 
> Thanks, 
> Scott

Please don't forget to reboot, though. The kernel will remain tainted
until you've rebooted without loading the nvidia module.

Cheers
  Trond



{GIT pull] x86 bugfixes

2007-10-14 Thread Thomas Gleixner
Linus,

please pull from

  ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86.git

Thanks,

tglx
--
Dave Jones (1):
  x86: fix missing include for vsyscall

Thomas Gleixner (3):
  clockevents: introduce force broadcast notifier
  x86: move local APIC timer init to the end of start_secondary()
  x86: force timer broadcast on late AMD C1E detection

 arch/x86/kernel/alternative.c |    1 +
 arch/x86/kernel/apic_64.c     |   26 ++
 arch/x86/kernel/smpboot_64.c  |    7 ++-
 include/linux/clockchips.h    |    1 +
 kernel/time/tick-broadcast.c  |   29 +++--
 kernel/time/tick-common.c     |    1 +
 6 files changed, 54 insertions(+), 11 deletions(-)




Re: PROBLEM: kernel panic

2007-10-14 Thread Scott Petler

Doug,

I thought that might do it, it does seem to work.  I edited the driver
line in my xorg.conf
from nvidia to nv and then,
 F1
login as root
/etc/init.d/gdm stop
/etc/init.d/gdm start

It came back up with the same flashing crap on the second monitor, so I
did it again with
the 2nd monitor turned off altogether, and I'll see how long it will run
on the single monitor
without the nvidia driver without crashing (maybe forever...).

Thanks,
Scott

Doug Whitesell (LKML) wrote:
> On Oct 14, 2007, at 1:44 PM, Scott Petler wrote:
>
>> Trond,
>>
>> I'm not exactly sure how to go back to not using the nvidia driver
>> and select the xorg one.  I do know that I wasn't able to use both
>> monitors with the xorg driver, but I'm willing to try that to isolate
>> the problem.
>
> If memory serves, you change your xorg.conf's "Device" section's
> "Driver" entry to read:
>
> "nv"
> instead of
> "nvidia",
>
> although I would consult the xorg documentation/your distribution's
> documentation (and back up your working xorg.conf) for specific
> details. (Your boot sequence may also automatically load the nvidia
> module, again, consult your distribution's documentation for details.)
>
> Cheers/hope this helps,
> dcw





undefined symbol 'APM_EMULATION' during 'make oldconfig' on HPPA

2007-10-14 Thread Frans Pop
$ git describe
v2.6.23-3345-g52d4e66

$ make oldconfig >/dev/null
drivers/macintosh/Kconfig:121:warning: 'select' used by config symbol 'PMAC_APM_EMU' refers to undefined symbol 'APM_EMULATION'

Cheers,
FJP


menuconfig: fail with clearer error if curses.h N/A

2007-10-14 Thread Frans Pop
It would be nice if 'make menuconfig' could fail earlier or with a clearer
error if curses.h is not available. The actual error is currently rather
buried in a huge amount of indirect errors.

After installing libncurses-dev (Debian) everything was fine.

$ make menuconfig
  HOSTCC  scripts/kconfig/lxdialog/checklist.o
In file included from scripts/kconfig/lxdialog/checklist.c:24:
scripts/kconfig/lxdialog/dialog.h:32:20: error: curses.h: No such file or directory
In file included from scripts/kconfig/lxdialog/checklist.c:24:
scripts/kconfig/lxdialog/dialog.h:97: error: expected specifier-qualifier-list before ‘chtype’
scripts/kconfig/lxdialog/dialog.h:187: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/dialog.h:194: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/dialog.h:196: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/dialog.h:197: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/dialog.h:198: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/dialog.h:199: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/dialog.h:201: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/checklist.c:31: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/checklist.c:59: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/checklist.c:95: error: expected ‘)’ before ‘*’ token
scripts/kconfig/lxdialog/checklist.c: In function ‘dialog_checklist’:
[another 30 or so lines with warnings/errors omitted]

Cheers,
FJP


Re: [PATCH] remove GPL restriction from set_dumpable()

2007-10-14 Thread Jiri Kosina
On Sun, 14 Oct 2007, Christoph Hellwig wrote:

> > Commit 6c5d5238 introduced a set_dumpable() function that replaced the 
> > direct access to mm_struct->dumpable. I don't think there is any 
> > reason to restrict this function to EXPORT_SYMBOL_GPL() -- previously 
> > any module could modify current->mm->dumpable without any 
> > restrictions, so it makes little sense to turn this into 'internal 
> > interface' at once.
> Nack, they just shouldn't do such things at all.  I start to get really 
> sick of patches adding random exports everywhere.

I actually don't care that much whether this is merged or not. In fact the 
function itself is pretty trivial and standalone, so any 3rd party module 
willing to modify current->mm->flags can just reimplement it line-by-line 
themselves without breaking the license anyway ...

My main point here was that we should probably better document somewhere 
what are the intended usage scenarios for EXPORT_SYMBOL() vs. 
EXPORT_SYMBOL_GPL(). "Really internal interface" seems a little bit vague 
to me.

-- 
Jiri Kosina
SUSE Labs


Re: tuner-core.c:fe_has_signal() can return uninitialized value

2007-10-14 Thread Michael Krufky
Adrian Bunk wrote:
> Commit 1f5ef19779df2c2f75870332b37dd3004c08a515 added the following 
> function to drivers/media/video/tuner-core.c:
>
> <--  snip  -->
>
> static int fe_has_signal(struct tuner *t)
> {
> 	struct dvb_tuner_ops *fe_tuner_ops = &t->fe.ops.tuner_ops;
> 	u16 strength;
>
> 	if (fe_tuner_ops->get_rf_strength)
> 		fe_tuner_ops->get_rf_strength(&t->fe, &strength);
>
> 	return strength;
> }
>
> <--  snip  -->
>
>
> If (!fe_tuner_ops->get_rf_strength) this function returns the value of 
> an uninitialized variable.
>
> Spotted by the Coverity checker.
>   
Thank you, Adrian.  I've fixed this in my tree:

http://linuxtv.org/hg/~mkrufky/v4l-dvb/rev/101ca558a777

Mauro, please pull from:

http://linuxtv.org/hg/~mkrufky/v4l-dvb

for:

- tuner-core.c: fe_has_signal() can return uninitialized value

 tuner-core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Please send this to Linus for 2.6.24

Regards,

Mike
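
The changeset itself is not quoted in this thread. As a minimal sketch only,
assuming the one-line fix is to give strength an initial value (this would
match the "1 insertion(+), 1 deletion(-)" summary, but the source does not
confirm it), the function could look like the code below. The struct
definitions and fake_get_rf_strength() are hypothetical, stripped-down
stand-ins so that the snippet compiles outside the kernel tree:

```c
#include <stdint.h>

typedef uint16_t u16;

/* Hypothetical stand-ins for the kernel structures, reduced to the
 * members that fe_has_signal() touches. */
struct dvb_frontend;

struct dvb_tuner_ops {
	int (*get_rf_strength)(struct dvb_frontend *fe, u16 *strength);
};

struct dvb_frontend {
	struct {
		struct dvb_tuner_ops tuner_ops;
	} ops;
};

struct tuner {
	struct dvb_frontend fe;
};

static int fe_has_signal(struct tuner *t)
{
	struct dvb_tuner_ops *fe_tuner_ops = &t->fe.ops.tuner_ops;
	u16 strength = 0;	/* assumed fix: defined result without the hook */

	if (fe_tuner_ops->get_rf_strength)
		fe_tuner_ops->get_rf_strength(&t->fe, &strength);

	return strength;
}

/* Hypothetical hook, present only to exercise the non-NULL path. */
static int fake_get_rf_strength(struct dvb_frontend *fe, u16 *strength)
{
	(void)fe;
	*strength = 1234;
	return 0;
}
```

With the initializer in place, a tuner whose get_rf_strength hook is NULL
reports 0 instead of whatever happened to be on the stack.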


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-14 Thread Rafael J. Wysocki
On Sunday, 14 October 2007 22:13, Mark Lord wrote:
> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been 
> misbehaving here.
> 
> It takes much (+5-7 seconds) longer to resume *sometimes*, but not all/most 
> of the time.
> And sometimes I get flashing keyboard LEDs and have to hold the power 
> button in for a full hard reset.
> 
> With 2.6.23-rc8/rc9, no such troubles.
> 
> Difficult to reproduce, other than perhaps once a day.
> Anybody want to fess up with a likely candidate?

Not really, but if you rule out all of the POWERPC and MIPS patches, there's
not much left ...

Greetings,
Rafael


Re: [git patches] IDE updates (part 2)

2007-10-14 Thread Benjamin Herrenschmidt

> How's about this patch?
> 
> [PATCH] ide-pmac: fix pmac_ide_init_hwif_ports()
> 
> * pmac_ide_init_hwif_ports() can be called by ide_init_hwif_ports()
>   (through ppc_ide_md.ide_init_hwif hook) for non IDE PMAC interfaces.
>   If this is the case the hw->io_ports[] should be already setup by
>   ide_init_hwif_ports()->ide_std_init_ports() so remove redundant code
>   from pmac_ide_init_hwif_ports().
> 
>   As side-effect this change fixes ctl_addr == 0 special handling in
>   ide_init_hwif_ports().
> 
> * Fix misleading comment while at it.

I would have to try it. Problem is, I don't actually have any powermac
with a PCI IDE controller at hand.. ouch. I'll see what I can find.

Ben.




Re: hdparm standby timeout not working for WD raptors?

2007-10-14 Thread Bart Samwel

Mark Weber wrote:
> On 10/14/07, Bart Samwel <[EMAIL PROTECTED]> wrote:
>> Some things to check:
>>
>> * Run "hdparm -I" on your drive. In the "Capabilities" section there is
>> a line "Standby timer values", for some drives this mentions a device
>> specific minimum. I know some drives that ignore any setting below 60
>> seconds.
>>
>> * I also know of quite a number of drives where hdparm -B settings
>> override the -S settings, even if you set the -S settings after the
>> hdparm -B settings. You could try combinations with various values of
>> hdparm -B, especially 1 and 255.
>
> Thanks for the suggestions.
>
> The -I command prints out a bunch of stuff including:
> Standby timer values: spec'd by Standard, with device specific minimum

Ahhh. Spec'd by standard means that each -S unit is worth 5 seconds (for 
values up to 240 = 20 minutes), and the second part means that there is 
a minimum (which is not specified in this report, unfortunately). 
Perhaps you can get hold of the full drive manual; the exact minimum 
value is probably specified there.

> I tried setting -B to 1 and then set -S to 5 minutes.
> Also, -B 255 and then set -S to 5 minutes.
> No luck with either. These drives want to keep running.

Just to be sure: you did use -S 60 to get 5 minutes, right?
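
The -S arithmetic can be sketched in C as follows. This is an illustration
only: both function names are made up, and the code covers just the range
described in this thread (values 1 to 240, worth 5 seconds each). Other -S
values and any device specific minimum are deliberately not modeled:

```c
/* Illustrative helpers, not part of hdparm. */

/* Convert a desired standby timeout in seconds to an -S value.
 * Rounds up to the next 5-second unit; returns -1 when the timeout
 * falls outside the 5-second-unit range (5 s to 20 minutes). */
int standby_value_for_seconds(int seconds)
{
	if (seconds < 5 || seconds > 240 * 5)
		return -1;
	return (seconds + 4) / 5;
}

/* Convert an -S value in the range 1..240 back to seconds.
 * Returns -1 for values this sketch does not cover. */
int standby_seconds_for_value(int value)
{
	if (value < 1 || value > 240)
		return -1;
	return value * 5;
}
```

For example, standby_value_for_seconds(300) gives 60, which is the "-S 60"
for 5 minutes asked about above, and standby_seconds_for_value(240) gives
1200 seconds, i.e. 20 minutes.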


> One thing of possible interest: The -B command printed
> the following message:
>
> /dev/sda:
>  setting Advanced Power Management level to 0x01 (1)
>  HDIO_DRIVE_CMD failed: Input/output error
>
> I would guess that the first line came out just before
> hdparm tried to do the set, and the second line indicates
> that the set failed.

Yes, that seems correct. Nothing too weird there: it simply seems that 
the drive doesn't support the power management knob. (AFAIK you should 
be able to confirm this using the feature sets listed in the output of 
hdparm -I.)

> Perhaps -S is failing too, just without the diagnostic?

Perhaps, but I'd expect it to print a diagnostic if it fails. I do seem 
to remember that (at least for some drives that I've seen) there isn't a 
diagnostic if you go below the device specific minimum, the value is 
simply ignored.


Cheers,
Bart


Re: [PATCH] remove GPL restriction from set_dumpable()

2007-10-14 Thread Christoph Hellwig
On Sun, Oct 14, 2007 at 01:04:31PM +0200, Jiri Kosina wrote:
> From: Jiri Kosina <[EMAIL PROTECTED]>
> 
> remove GPL restriction from set_dumpable()
> 
> Commit 6c5d5238 introduced a set_dumpable() function that replaced the 
> direct access to mm_struct->dumpable. I don't think there is any reason to 
> restrict this function to EXPORT_SYMBOL_GPL() -- previously any module 
> could modify current->mm->dumpable without any restrictions, so it makes 
> little sense to turn this into 'internal interface' at once.
> 
> There in fact are 3rd party modules that modify the dumpable flag, and 
> this patch should fix the situation for them once again (for example 
> vmware).

Nack, they just shouldn't do such things at all.  I start to get really
sick of patches adding random exports everywhere.



Re: In response to kernel compression e-mail a few months ago.

2007-10-14 Thread Justin Piszcz



On Sun, 14 Oct 2007, Al Viro wrote:

> On Sun, Oct 14, 2007 at 09:46:15PM +0200, Jan Engelhardt wrote:
>> (Obviously we shall pick .7z)
>
> The hell it is.  Take a look at memory footprint of those suckers...



For compression with -mx=9 it does use 500-900 MiB of RAM, that is true.
For decompression, 50-70 MiB.

Each has its pros and cons, but nothing compresses the kernel further 
than 7z, which also supports stdin/stdout and has a native Windows port.  
I used to use bzip2 strictly for backups and such, but if I can save an 
additional 20-30% over bzip2 on backups that I will not access often, 
7zip seems to be the winner for space savings and possibly for 
bandwidth/cost savings.


compress:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10544 war       20   0  700m 681m 1632 S  141 20.7   1:41.46 7z

decompress:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11927 war       20   0 71256  66m 1536 R   88  2.0   0:04.07 7z

Justin.

