[PATCH 0/13] cxgb3 - driver updates

2007-08-10 Thread divy
Hi Jeff,

I'm submitting a patch series for inclusion in netdev#upstream.

Here is a brief description:
-   MAC hang workaround update
-   Modify max HW Rx coalescing size
-   Log SGE doorbell FIFO overflow
-   Use Tx immediate data for offload packets whenever possible
-   RDMA can get internal mem info to work around HW issues
-   More validity checks on connection ids
-   Stop MAC when a fatal error is detected
-   Log HW serial number
-   Update internal mem operating mode
-   Update engine microcode management, version is now 1.1.0
-   Update FW management, version is now 4.6.0
-   Ignore some HW errors until the HW is initialized
-   Check MSI/MSI-X after it has been enabled

Cheers,
Divy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] do not export /usr/include/scsi in make headers_install

2007-08-10 Thread David Woodhouse
On Mon, 2007-08-06 at 15:02 +0200, Olaf Hering wrote:
> On Mon, Aug 06, Christoph Hellwig wrote:
> 
> > On Mon, Aug 06, 2007 at 02:45:46PM +0200, Olaf Hering wrote:
> > > 
> > > glibc and make headers_install_all provide /usr/include/scsi
> > > One of them has to go.
> > > 
> > > A quick diff shows no differences, except:
> > 
> > ..
> > 
> > > Which copy should be provided by a distributor?
> > 
> > The glibc one of course.  The kernel scsi.h should never have been
> > added to the list of exportable headers.  
> 
> /usr/include/scsi is provided by glibc.
> Remove the scsi export from make headers_install target.
> 
> 
> Signed-off-by: Olaf Hering <[EMAIL PROTECTED]>

Acked-by: David Woodhouse <[EMAIL PROTECTED]>

> ---
>  include/Kbuild  |1 -
>  include/scsi/Kbuild |4 
>  2 files changed, 5 deletions(-)
> 
> --- a/include/Kbuild
> +++ b/include/Kbuild
> @@ -1,6 +1,5 @@
>  header-y += asm-generic/
>  header-y += linux/
> -header-y += scsi/
>  header-y += sound/
>  header-y += mtd/
>  header-y += rdma/
> --- a/include/scsi/Kbuild
> +++ /dev/null
> @@ -1,4 +0,0 @@
> -header-y += scsi.h
> -
> -unifdef-y += scsi_ioctl.h
> -unifdef-y += sg.h
-- 
dwmw2



Re: [SCSI] aic94xx: new driver

2007-08-10 Thread David Woodhouse
On Sat, 2007-08-11 at 04:49 +0100, Christoph Hellwig wrote:
> On Fri, Aug 10, 2007 at 11:09:22PM +0800, David Woodhouse wrote:
> > The files in /usr/include/scsi are actually shipped by glibc, and most
> > distributions use glibc's version instead of the one from the kernel --
> > so this additional userspace interface is automatically incompatible
> > with most people's installations.
> 
> Stop here right now.  You just noticed the real bug, and that's exporting
> scsi.h at all.  I think Olaf sent a patch to fix this already. 

That's a good enough answer for me, certainly.

-- 
dwmw2



[PATCH 12/13] cxgb3 - log and clear PEX errors

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Clear PCIe PEX errors late at module load time.
Log details when PEX errors occur.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/t3_hw.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 3d47627..538b254 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -1355,6 +1355,10 @@ static void pcie_intr_handler(struct adapter *adapter)
{0}
};
 
+   if (t3_read_reg(adapter, A_PCIE_INT_CAUSE) & F_PEXERR)
+   CH_ALERT(adapter, "PEX error code 0x%x\n",
+t3_read_reg(adapter, A_PCIE_PEX_ERR));
+
if (t3_handle_intr_status(adapter, A_PCIE_INT_CAUSE, PCIE_INTR_MASK,
  pcie_intr_info, adapter->irq_stats))
t3_fatal_err(adapter);
@@ -1806,6 +1810,8 @@ void t3_intr_clear(struct adapter *adapter)
for (i = 0; i < ARRAY_SIZE(cause_reg_addr); ++i)
t3_write_reg(adapter, cause_reg_addr[i], 0x);
 
+   if (is_pcie(adapter))
+   t3_write_reg(adapter, A_PCIE_PEX_ERR, 0x);
t3_write_reg(adapter, A_PL_INT_CAUSE0, 0x);
t3_read_reg(adapter, A_PL_INT_CAUSE0);  /* flush */
 }


Re: Documentation files in html format?

2007-08-10 Thread Stefan Richter
Rene Herman wrote:
> On 08/10/2007 10:12 PM, Sam Ravnborg wrote:
> 
>>> What primary requirements does in-tree Linux kernel documentation have
>>> to fulfill in general?
>>
>> Skipping the obvious ones such as correct, up-to-date etc.
>> o Readable as-is
>> o Grepable
>> o buildable as structured documents or almost like a single book
>> o Easy to replicate structure
>> o Maintainable in any decent text-editor (emacs, vim, whatever)

Low entry barrier for patches from unsuspecting occasional contributors?

> Easy to put online?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=Documentation
http://lxr.linux.no/source/Documentation/
http://users.sosdg.org/~qiyong/lxr/source/Documentation/
http://www.linux-m32r.org/lxr/http/source/Documentation/
http://lxr.free-electrons.com/source/Documentation/


(I admit though that formats like asciidoc or docbook are beneficial for
larger documentation files which want chapters, table of contents, and
internal crossreferences.)
-- 
Stefan Richter
-=-=-=== =--- -=-==
http://arcgraph.de/sr/


[PATCH 13/13] cxgb3 - test MSI capabilities

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Check that the HW is really in MSI/MSI-X mode
after MSI/MSI-X was successfully enabled.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_main.c |   42 
 drivers/net/cxgb3/regs.h   |4 
 2 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index eaebd7f..1449692 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -2318,6 +2318,46 @@ void t3_fatal_err(struct adapter *adapter)
 
 }
 
+/*
+ * Interrupt handler used to check if MSI/MSI-X works on this platform.
+ */
+static irqreturn_t check_intr_handler(int irq, void *cookie)
+{
+   struct adapter *adap = cookie;
+
+   t3_set_reg_field(adap, A_PL_INT_ENABLE0, F_MI1, 0);
+   return IRQ_HANDLED;
+}
+
+static void __devinit check_msi(struct adapter *adap)
+{
+   int vec, mi1;
+
+   if (!(t3_read_reg(adap, A_PL_INT_CAUSE0) & F_MI1))
+   return;
+
+   vec = (adap->flags & USING_MSI) ? adap->pdev->irq :
+ adap->msix_info[0].vec;
+
+   if (request_irq(vec, check_intr_handler, 0, adap->name, adap))
+   return;
+
+   t3_set_reg_field(adap, A_PL_INT_ENABLE0, 0, F_MI1);
+   msleep(10);
+   mi1 = t3_read_reg(adap, A_PL_INT_ENABLE0) & F_MI1;
+   if (mi1)
+   t3_set_reg_field(adap, A_PL_INT_ENABLE0, F_MI1, 0);
+   free_irq(vec, adap);
+
+   if (mi1) {
+   cxgb_disable_msi(adap);
+   dev_info(&adap->pdev->dev,
+"the kernel believes that MSI is available on this "
+"platform\nbut the driver's MSI test has failed.  "
+"Proceeding with INTx interrupts.\n");
+   }
+}
+
 static int __devinit cxgb_enable_msix(struct adapter *adap)
 {
struct msix_entry entries[SGE_QSETS + 1];
@@ -2554,6 +2594,8 @@ static int __devinit init_one(struct pci_dev *pdev,
adapter->flags |= USING_MSIX;
else if (msi > 0 && pci_enable_msi(pdev) == 0)
adapter->flags |= USING_MSI;
+   if (adapter->flags & (USING_MSIX | USING_MSI))
+   check_msi(adapter);
 
err = sysfs_create_group(&adapter->port[0]->dev.kobj,
 &cxgb3_attr_group);
diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index 5e1bc0d..f97f8ab 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -1639,6 +1639,10 @@
 #define V_MC5A(x) ((x) << S_MC5A)
 #define F_MC5A    V_MC5A(1U)
 
+#define S_MI1    13
+#define V_MI1(x) ((x) << S_MI1)
+#define F_MI1    V_MI1(1U)
+
 #define S_CPL_SWITCH    12
 #define V_CPL_SWITCH(x) ((x) << S_CPL_SWITCH)
 #define F_CPL_SWITCH    V_CPL_SWITCH(1U)
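
The self-test in check_msi() can be illustrated outside the kernel with a simulated register. This is only a sketch under stated assumptions: struct fake_adapter, its irq_delivered flag, and the local set_reg_field() helper are hypothetical stand-ins for the adapter, the platform's interrupt delivery, and t3_set_reg_field(). The bit position is the one that appears in the regs.h hunk. The real function also returns early unless the MI1 cause bit is already pending, which this sketch does not model.

```c
#include <stdint.h>

/* Bit-field macros in the style of regs.h. */
#define S_MI1    13
#define V_MI1(x) ((x) << S_MI1)
#define F_MI1    V_MI1(1U)

/* Hypothetical stand-in for the adapter state the test touches. */
struct fake_adapter {
	uint32_t int_enable;	/* plays the role of A_PL_INT_ENABLE0 */
	int irq_delivered;	/* 1 if the platform really delivers MSI */
};

/* Clear the bits in mask, then set the bits in val. */
static void set_reg_field(uint32_t *reg, uint32_t mask, uint32_t val)
{
	*reg = (*reg & ~mask) | val;
}

/* Like check_intr_handler(): the handler disables the MI1 source. */
static void check_intr_handler(struct fake_adapter *a)
{
	set_reg_field(&a->int_enable, F_MI1, 0);
}

/* Returns 1 if the interrupt arrived, 0 if the caller should use INTx. */
static int check_msi(struct fake_adapter *a)
{
	uint32_t mi1;

	/* Enable the interrupt source that is assumed to be pending. */
	set_reg_field(&a->int_enable, 0, F_MI1);

	/* Stands in for the msleep(10) window in which the IRQ may fire. */
	if (a->irq_delivered)
		check_intr_handler(a);

	/* If the handler never ran, the enable bit is still set. */
	mi1 = a->int_enable & F_MI1;
	if (mi1)
		set_reg_field(&a->int_enable, F_MI1, 0);

	return !mi1;
}
```

The point of the design is that the handler leaves a visible side effect (the cleared enable bit), so the caller can detect a missing interrupt without any extra state.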


[PATCH 11/13] cxgb3 - Firmware update

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Update the firmware version to 4.6.0.
Allow the driver to be up and running with an older FW image.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h |2 +-
 drivers/net/cxgb3/cxgb3_main.c |9 +
 drivers/net/cxgb3/t3_hw.c  |   20 +++-
 drivers/net/cxgb3/version.h|2 +-
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index b665b20..ff867c2 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -691,7 +691,7 @@ int t3_read_flash(struct adapter *adapter, unsigned int addr,
  unsigned int nwords, u32 *data, int byte_oriented);
 int t3_load_fw(struct adapter *adapter, const u8 * fw_data, unsigned int size);
 int t3_get_fw_version(struct adapter *adapter, u32 *vers);
-int t3_check_fw_version(struct adapter *adapter);
+int t3_check_fw_version(struct adapter *adapter, int *must_load);
 int t3_init_hw(struct adapter *adapter, u32 fw_params);
 void mac_prep(struct cmac *mac, struct adapter *adapter, int index);
 void early_hw_init(struct adapter *adapter, const struct adapter_info *ai);
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 65ded16..eaebd7f 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -814,11 +814,12 @@ static int cxgb_up(struct adapter *adap)
int must_load;
 
if (!(adap->flags & FULL_INIT_DONE)) {
-   err = t3_check_fw_version(adap);
-   if (err == -EINVAL)
+   err = t3_check_fw_version(adap, &must_load);
+   if (err == -EINVAL) {
err = upgrade_fw(adap);
-   if (err)
-   goto out;
+   if (err && must_load)
+   goto out;
+   }
 
err = t3_check_tpsram_version(adap, &must_load);
if (err == -EINVAL) {
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 63032e8..3d47627 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -957,16 +957,18 @@ int t3_get_fw_version(struct adapter *adapter, u32 *vers)
 /**
  * t3_check_fw_version - check if the FW is compatible with this driver
  * @adapter: the adapter
- *
+ * @must_load: set to 1 if loading a new FW image is required
+
  * Checks if an adapter's FW is compatible with the driver.  Returns 0
  * if the versions are compatible, a negative error otherwise.
  */
-int t3_check_fw_version(struct adapter *adapter)
+int t3_check_fw_version(struct adapter *adapter, int *must_load)
 {
int ret;
u32 vers;
unsigned int type, major, minor;
 
+   *must_load = 1;
ret = t3_get_fw_version(adapter, &vers);
if (ret)
return ret;
@@ -979,9 +981,17 @@ int t3_check_fw_version(struct adapter *adapter)
minor == FW_VERSION_MINOR)
return 0;
 
-   CH_ERR(adapter, "found wrong FW version(%u.%u), "
-  "driver needs version %u.%u\n", major, minor,
-  FW_VERSION_MAJOR, FW_VERSION_MINOR);
+   if (major != FW_VERSION_MAJOR)
+   CH_ERR(adapter, "found wrong FW version(%u.%u), "
+  "driver needs version %u.%u\n", major, minor,
+  FW_VERSION_MAJOR, FW_VERSION_MINOR);
+   else {
+   *must_load = 0;
+   CH_WARN(adapter, "found wrong FW minor version(%u.%u), "
+   "driver compiled for version %u.%u\n", major, minor,
+   FW_VERSION_MAJOR, FW_VERSION_MINOR);
+   }
+
return -EINVAL;
 }
 
diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
index eb508bf..ef1c633 100644
--- a/drivers/net/cxgb3/version.h
+++ b/drivers/net/cxgb3/version.h
@@ -39,6 +39,6 @@
 
 /* Firmware version */
 #define FW_VERSION_MAJOR 4
-#define FW_VERSION_MINOR 3
+#define FW_VERSION_MINOR 6
 #define FW_VERSION_MICRO 0
 #endif /* __CHELSIO_VERSION_H */
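
The must_load logic above can be sketched as a small stand-alone program. It is an illustration, not the driver code: check_fw_version() takes the detected version as plain integers instead of reading it from the adapter, the FW type check is left out, and bring_up() with its upgrade_err parameter is a hypothetical stand-in for the cxgb_up() hunk and the result of upgrade_fw().

```c
#include <errno.h>
#include <stdio.h>

/* Version the driver is built against (values from version.h above). */
#define FW_VERSION_MAJOR 4u
#define FW_VERSION_MINOR 6u

/*
 * Returns 0 on an exact match and -EINVAL otherwise.  *must_load is
 * cleared only when the major version matches and just the minor
 * version differs, so the caller may keep running on the older image.
 */
static int check_fw_version(unsigned int major, unsigned int minor,
			    int *must_load)
{
	*must_load = 1;

	if (major == FW_VERSION_MAJOR && minor == FW_VERSION_MINOR)
		return 0;

	if (major != FW_VERSION_MAJOR) {
		fprintf(stderr, "found wrong FW version(%u.%u), "
			"driver needs version %u.%u\n", major, minor,
			FW_VERSION_MAJOR, FW_VERSION_MINOR);
	} else {
		*must_load = 0;
		fprintf(stderr, "found wrong FW minor version(%u.%u), "
			"driver compiled for version %u.%u\n", major, minor,
			FW_VERSION_MAJOR, FW_VERSION_MINOR);
	}
	return -EINVAL;
}

/*
 * Hypothetical caller: upgrade_err is what an upgrade attempt would
 * have returned.  A failed upgrade is fatal only if must_load is set.
 */
static int bring_up(unsigned int major, unsigned int minor, int upgrade_err)
{
	int must_load;
	int err = check_fw_version(major, minor, &must_load);

	if (err == -EINVAL) {
		err = upgrade_err;
		if (err && must_load)
			return err;
	}
	return 0;
}
```

With these definitions, a 4.3 image whose upgrade fails still brings the interface up, while a 3.x image whose upgrade fails does not.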


[PATCH 6/13] cxgb3 - tighten checks on TID values

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Enforce validity checks on connection ids

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_defs.h|   20 ++--
 drivers/net/cxgb3/cxgb3_offload.c |   28 +++-
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_defs.h b/drivers/net/cxgb3/cxgb3_defs.h
index 483a594..45e9216 100644
--- a/drivers/net/cxgb3/cxgb3_defs.h
+++ b/drivers/net/cxgb3/cxgb3_defs.h
@@ -79,9 +79,17 @@ static inline struct t3c_tid_entry *lookup_tid(const struct tid_info *t,
 static inline struct t3c_tid_entry *lookup_stid(const struct tid_info *t,
unsigned int tid)
 {
+   union listen_entry *e;
+
if (tid < t->stid_base || tid >= t->stid_base + t->nstids)
return NULL;
-   return &(stid2entry(t, tid)->t3c_tid);
+
+   e = stid2entry(t, tid);
+   if ((void *)e->next >= (void *)t->tid_tab &&
+   (void *)e->next < (void *)&t->atid_tab[t->natids])
+   return NULL;
+
+   return &e->t3c_tid;
 }
 
 /*
@@ -90,9 +98,17 @@ static inline struct t3c_tid_entry *lookup_stid(const struct tid_info *t,
 static inline struct t3c_tid_entry *lookup_atid(const struct tid_info *t,
unsigned int tid)
 {
+   union active_open_entry *e;
+
if (tid < t->atid_base || tid >= t->atid_base + t->natids)
return NULL;
-   return &(atid2entry(t, tid)->t3c_tid);
+
+   e = atid2entry(t, tid);
+   if ((void *)e->next >= (void *)t->tid_tab &&
+   (void *)e->next < (void *)&t->atid_tab[t->natids])
+   return NULL;
+
+   return &e->t3c_tid;
 }
 
 int process_rx(struct t3cdev *dev, struct sk_buff **skbs, int n);
diff --git a/drivers/net/cxgb3/cxgb3_offload.c b/drivers/net/cxgb3/cxgb3_offload.c
index 522c1be..7fb526a 100644
--- a/drivers/net/cxgb3/cxgb3_offload.c
+++ b/drivers/net/cxgb3/cxgb3_offload.c
@@ -57,7 +57,7 @@ static DEFINE_RWLOCK(adapter_list_lock);
 static LIST_HEAD(adapter_list);
 
 static const unsigned int MAX_ATIDS = 64 * 1024;
-static const unsigned int ATID_BASE = 0x10;
+static const unsigned int ATID_BASE = 0x1;
 
 static inline int offload_activated(struct t3cdev *tdev)
 {
@@ -684,10 +684,19 @@ static int do_cr(struct t3cdev *dev, struct sk_buff *skb)
 {
struct cpl_pass_accept_req *req = cplhdr(skb);
unsigned int stid = G_PASS_OPEN_TID(ntohl(req->tos_tid));
+   struct tid_info *t = &(T3C_DATA(dev))->tid_maps;
struct t3c_tid_entry *t3c_tid;
+   unsigned int tid = GET_TID(req);
 
-   t3c_tid = lookup_stid(&(T3C_DATA(dev))->tid_maps, stid);
-   if (t3c_tid->ctx && t3c_tid->client->handlers &&
+   if (unlikely(tid >= t->ntids)) {
+   printk("%s: passive open TID %u too large\n",
+  dev->name, tid);
+   t3_fatal_err(tdev2adap(dev));
+   return CPL_RET_BUF_DONE;
+   }
+   
+   t3c_tid = lookup_stid(t, stid);
+   if (t3c_tid && t3c_tid->ctx && t3c_tid->client->handlers &&
t3c_tid->client->handlers[CPL_PASS_ACCEPT_REQ]) {
return t3c_tid->client->handlers[CPL_PASS_ACCEPT_REQ]
(dev, skb, t3c_tid->ctx);
@@ -769,16 +778,25 @@ static int do_act_establish(struct t3cdev *dev, struct sk_buff *skb)
 {
struct cpl_act_establish *req = cplhdr(skb);
unsigned int atid = G_PASS_OPEN_TID(ntohl(req->tos_tid));
+   struct tid_info *t = &(T3C_DATA(dev))->tid_maps;
struct t3c_tid_entry *t3c_tid;
+   unsigned int tid = GET_TID(req);
 
-   t3c_tid = lookup_atid(&(T3C_DATA(dev))->tid_maps, atid);
+   if (unlikely(tid >= t->ntids)) {
+   printk("%s: active establish TID %u too large\n",
+  dev->name, tid);
+   t3_fatal_err(tdev2adap(dev));
+   return CPL_RET_BUF_DONE;
+   }
+
+   t3c_tid = lookup_atid(t, atid);
if (t3c_tid && t3c_tid->ctx && t3c_tid->client->handlers &&
t3c_tid->client->handlers[CPL_ACT_ESTABLISH]) {
return t3c_tid->client->handlers[CPL_ACT_ESTABLISH]
(dev, skb, t3c_tid->ctx);
} else {
printk(KERN_ERR "%s: received clientless CPL command 0x%x\n",
-  dev->name, CPL_PASS_ACCEPT_REQ);
+  dev->name, CPL_ACT_ESTABLISH);
return CPL_RET_BUF_DONE | CPL_RET_BAD_MSG;
}
 }
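
The extra test in lookup_stid()/lookup_atid() appears to rely on each table slot being a union of a live entry and a free-list link: a slot whose next pointer lands inside the table is taken to be on the free list. A minimal sketch of that idea follows. The names struct tid_entry, union slot, struct tid_table and lookup() are hypothetical, and a single table replaces the driver's tid_tab..atid_tab range.

```c
#include <stddef.h>
#include <stdint.h>

struct tid_entry {
	void *ctx;			/* client context when in use */
};

union slot {
	struct tid_entry used;		/* valid while the slot is allocated */
	union slot *next;		/* valid while the slot is free */
};

struct tid_table {
	union slot *slots;
	unsigned int base;		/* first ID served by this table */
	unsigned int nslots;
};

/*
 * Returns the entry for tid, or NULL if tid is out of range or the
 * slot still links into the table, i.e. is on the free list.
 * Addresses are compared as integers to keep the comparison well
 * defined for pointers into unrelated objects.
 */
static struct tid_entry *lookup(const struct tid_table *t, unsigned int tid)
{
	union slot *e;
	uintptr_t lo, hi, p;

	if (tid < t->base || tid >= t->base + t->nslots)
		return NULL;

	e = &t->slots[tid - t->base];
	lo = (uintptr_t)t->slots;
	hi = (uintptr_t)(t->slots + t->nslots);
	p = (uintptr_t)e->next;
	if (p >= lo && p < hi)
		return NULL;

	return &e->used;
}
```

The last slot of a free list carries a NULL link, which is outside the table, so lookup() still returns it. Callers therefore keep the ctx check, as do_cr() and do_act_establish() do.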


[PATCH 10/13] cxgb3 - engine microcode update

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Load the engine microcode when the interface
is configured up.
Bump up version to 1.1.0.
Allow the driver to be up and running with
older microcode images.
Allow ethtool to log the microcode version.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h |8 ++-
 drivers/net/cxgb3/cxgb3_main.c |  116 
 drivers/net/cxgb3/t3_hw.c  |   43 +--
 3 files changed, 113 insertions(+), 54 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index d54446f..b665b20 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -127,8 +127,8 @@ enum {  /* adapter interrupt-maintained statistics */
 
 enum {
TP_VERSION_MAJOR= 1,
-   TP_VERSION_MINOR= 0,
-   TP_VERSION_MICRO= 44
+   TP_VERSION_MINOR= 1,
+   TP_VERSION_MICRO= 0
 };
 
 #define S_TP_VERSION_MAJOR 16
@@ -438,6 +438,7 @@ enum {  /* chip revisions */
T3_REV_A  = 0,
T3_REV_B  = 2,
T3_REV_B2 = 3,
+   T3_REV_C  = 4,
 };
 
 struct trace_params {
@@ -682,7 +683,8 @@ const struct adapter_info *t3_get_adapter_info(unsigned int board_id);
 int t3_seeprom_read(struct adapter *adapter, u32 addr, u32 *data);
 int t3_seeprom_write(struct adapter *adapter, u32 addr, u32 data);
 int t3_seeprom_wp(struct adapter *adapter, int enable);
-int t3_check_tpsram_version(struct adapter *adapter);
+int t3_get_tp_version(struct adapter *adapter, u32 *vers);
+int t3_check_tpsram_version(struct adapter *adapter, int *must_load);
 int t3_check_tpsram(struct adapter *adapter, u8 *tp_ram, unsigned int size);
 int t3_set_proto_sram(struct adapter *adap, u8 *data);
 int t3_read_flash(struct adapter *adapter, unsigned int addr,
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index e5744e7..65ded16 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -721,6 +721,7 @@ static void bind_qsets(struct adapter *adap)
 }
 
 #define FW_FNAME "t3fw-%d.%d.%d.bin"
+#define TPSRAM_NAME "t3%c_protocol_sram-%d.%d.%d.bin"
 
 static int upgrade_fw(struct adapter *adap)
 {
@@ -742,6 +743,61 @@ static int upgrade_fw(struct adapter *adap)
return ret;
 }
 
+static inline char t3rev2char(struct adapter *adapter)
+{
+   char rev = 0;
+
+   switch(adapter->params.rev) {
+   case T3_REV_A:
+   rev = 'a';
+   break;
+   case T3_REV_B:
+   case T3_REV_B2:
+   rev = 'b';
+   break;
+   case T3_REV_C:
+   rev = 'c';
+   break;
+   }
+   return rev;
+}
+
+int update_tpsram(struct adapter *adap)
+{
+   const struct firmware *tpsram;
+   char buf[64];
+   struct device *dev = &adap->pdev->dev;
+   int ret;
+   char rev;
+   
+   rev = t3rev2char(adap);
+   if (!rev)
+   return 0;
+
+   snprintf(buf, sizeof(buf), TPSRAM_NAME, rev,
+TP_VERSION_MAJOR, TP_VERSION_MINOR, TP_VERSION_MICRO);
+
+   ret = request_firmware(&tpsram, buf, dev);
+   if (ret < 0) {
+   dev_err(dev, "could not load TP SRAM: unable to load %s\n",
+   buf);
+   return ret;
+   }
+   
+   ret = t3_check_tpsram(adap, tpsram->data, tpsram->size);
+   if (ret)
+   goto release_tpsram;
+
+   ret = t3_set_proto_sram(adap, tpsram->data);
+   if (ret)
+   dev_err(dev, "loading protocol SRAM failed\n");
+
+release_tpsram:
+   release_firmware(tpsram);
+   
+   return ret;
+}
+
 /**
  * cxgb_up - enable the adapter
  * @adapter: adapter being enabled
@@ -755,6 +811,7 @@ static int upgrade_fw(struct adapter *adap)
 static int cxgb_up(struct adapter *adap)
 {
int err = 0;
+   int must_load;
 
if (!(adap->flags & FULL_INIT_DONE)) {
err = t3_check_fw_version(adap);
@@ -763,6 +820,13 @@ static int cxgb_up(struct adapter *adap)
if (err)
goto out;
 
+   err = t3_check_tpsram_version(adap, &must_load);
+   if (err == -EINVAL) {
+   err = update_tpsram(adap);
+   if (err && must_load)
+   goto out;
+   }
+
err = init_dummy_netdevs(adap);
if (err)
goto out;
@@ -1097,9 +1161,11 @@ static int get_eeprom_len(struct net_device *dev)
 static void get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 {
u32 fw_vers = 0;
+   u32 tp_vers = 0;
struct adapter *adapter = dev->priv;
 
t3_get_fw_version(adapter, &fw_vers);
+   t3_get_tp_version(adapter, &tp_vers);
 
strcpy(info->driver, DRV_NAME);
strcpy(info->ver
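
The way update_tpsram() derives the image file name can be tried in user space. The sketch below reuses the constants and the format string from the hunks above. tpsram_filename() is a hypothetical helper that takes the chip revision as a plain integer, and it returns -1 for an unknown revision where update_tpsram() returns 0 and skips the load.

```c
#include <stdio.h>

enum { T3_REV_A = 0, T3_REV_B = 2, T3_REV_B2 = 3, T3_REV_C = 4 };
enum { TP_VERSION_MAJOR = 1, TP_VERSION_MINOR = 1, TP_VERSION_MICRO = 0 };

#define TPSRAM_NAME "t3%c_protocol_sram-%d.%d.%d.bin"

/* Map a chip revision to the letter used in the file name, 0 if unknown. */
static char t3rev2char(int rev)
{
	switch (rev) {
	case T3_REV_A:
		return 'a';
	case T3_REV_B:
	case T3_REV_B2:
		return 'b';
	case T3_REV_C:
		return 'c';
	}
	return 0;
}

/* Hypothetical helper: fills buf with the image name, -1 if rev is unknown. */
static int tpsram_filename(int rev, char *buf, size_t len)
{
	char c = t3rev2char(rev);

	if (!c)
		return -1;

	snprintf(buf, len, TPSRAM_NAME, c,
		 TP_VERSION_MAJOR, TP_VERSION_MINOR, TP_VERSION_MICRO);
	return 0;
}
```

Revisions B and B2 share one image, so both map to the same file name.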

[PATCH 9/13] cxgb3 - Update internal memory management

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Set PM1 internal memory to round robin mode

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/regs.h  |2 ++
 drivers/net/cxgb3/t3_hw.c |2 ++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index 2824278..5e1bc0d 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -1326,6 +1326,7 @@
 #define V_D0_WEIGHT(x) ((x) << S_D0_WEIGHT)
 
 #define A_PM1_RX_CFG 0x5c0
+#define A_PM1_RX_MODE 0x5c4
 
 #define A_PM1_RX_INT_ENABLE 0x5d8
 
@@ -1394,6 +1395,7 @@
 #define A_PM1_RX_INT_CAUSE 0x5dc
 
 #define A_PM1_TX_CFG 0x5e0
+#define A_PM1_TX_MODE 0x5e4
 
 #define A_PM1_TX_INT_ENABLE 0x5f8
 
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 23b1a16..13bfbec 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -3189,6 +3189,8 @@ int t3_init_hw(struct adapter *adapter, u32 fw_params)
t3_set_reg_field(adapter, A_PCIX_CFG, 0, F_CLIDECEN);
 
t3_write_reg(adapter, A_PM1_RX_CFG, 0x);
+   t3_write_reg(adapter, A_PM1_RX_MODE, 0);
+   t3_write_reg(adapter, A_PM1_TX_MODE, 0);
init_hw_for_avail_ports(adapter, adapter->params.nports);
t3_sge_init(adapter, &adapter->params.sge);
 


[PATCH 8/13] cxgb3 - log adapter serial number

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Log HW serial number when cxgb3 module is loaded.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h |2 ++
 drivers/net/cxgb3/cxgb3_main.c |6 --
 drivers/net/cxgb3/t3_hw.c  |3 ++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index 55922ed..d54446f 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -97,6 +97,7 @@ enum {
MAX_NPORTS = 2, /* max # of ports */
MAX_FRAME_SIZE = 10240, /* max MAC frame size, including header + FCS */
EEPROMSIZE = 8192,  /* Serial EEPROM size */
+   SERNUM_LEN = 16,/* Serial # length */
RSS_TABLE_SIZE = 64,/* size of RSS lookup and mapping tables */
TCB_SIZE = 128, /* TCB size */
NMTUS = 16, /* size of MTU table */
@@ -391,6 +392,7 @@ struct vpd_params {
unsigned int uclk;
unsigned int mdc;
unsigned int mem_timing;
+   u8 sn[SERNUM_LEN + 1];
u8 eth_base[6];
u8 port_type[MAX_NPORTS];
unsigned short xauicfg[2];
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index a1f94cf..e5744e7 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -2333,10 +2333,12 @@ static void __devinit print_port_info(struct adapter *adap,
   (adap->flags & USING_MSIX) ? " MSI-X" :
   (adap->flags & USING_MSI) ? " MSI" : "");
if (adap->name == dev->name && adap->params.vpd.mclk)
-   printk(KERN_INFO "%s: %uMB CM, %uMB PMTX, %uMB PMRX\n",
+   printk(KERN_INFO
+  "%s: %uMB CM, %uMB PMTX, %uMB PMRX, S/N: %s\n",
   adap->name, t3_mc7_size(&adap->cm) >> 20,
   t3_mc7_size(&adap->pmtx) >> 20,
-  t3_mc7_size(&adap->pmrx) >> 20);
+  t3_mc7_size(&adap->pmrx) >> 20,
+  adap->params.vpd.sn);
}
 }
 
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index dd3149d..23b1a16 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -505,7 +505,7 @@ struct t3_vpd {
u8 vpdr_len[2];
VPD_ENTRY(pn, 16);  /* part number */
VPD_ENTRY(ec, 16);  /* EC level */
-   VPD_ENTRY(sn, 16);  /* serial number */
+   VPD_ENTRY(sn, SERNUM_LEN); /* serial number */
VPD_ENTRY(na, 12);  /* MAC address base */
VPD_ENTRY(cclk, 6); /* core clock */
VPD_ENTRY(mclk, 6); /* mem clock */
@@ -648,6 +648,7 @@ static int get_vpd_params(struct adapter *adapter, struct vpd_params *p)
p->uclk = simple_strtoul(vpd.uclk_data, NULL, 10);
p->mdc = simple_strtoul(vpd.mdc_data, NULL, 10);
p->mem_timing = simple_strtoul(vpd.mt_data, NULL, 10);
+   memcpy(p->sn, vpd.sn_data, SERNUM_LEN);
 
/* Old eeproms didn't have port information */
if (adapter->params.rev == 0 && !vpd.port0_data[0]) {
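
The sn field is declared one byte longer than the VPD field so that the copied serial number can be printed with %s. A small sketch of that pattern, using a reduced struct vpd_params and a hypothetical copy_serial() helper. The sketch writes the terminator explicitly. The patch itself only does the memcpy(), so it presumably relies on the structure having been zeroed beforehand, which is an assumption here.

```c
#include <string.h>

enum { SERNUM_LEN = 16 };		/* serial number length, as in common.h */

/* Reduced version of the driver structure: only the field used here. */
struct vpd_params {
	unsigned char sn[SERNUM_LEN + 1];	/* +1 for the terminating NUL */
};

/*
 * Copy a fixed-width, not NUL-terminated VPD field into a printable
 * string.  vpd_sn must point to at least SERNUM_LEN bytes.
 */
static void copy_serial(struct vpd_params *p, const unsigned char *vpd_sn)
{
	memcpy(p->sn, vpd_sn, SERNUM_LEN);
	p->sn[SERNUM_LEN] = '\0';	/* explicit here, see the note above */
}
```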


[PATCH 4/13] cxgb3 - use immediate data for offload Tx

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Send small TX_DATA work requests as immediate data even when
there are fragments.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/sge.c |   17 +++--
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 9213cda..dca2716 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -1182,8 +1182,8 @@ int t3_eth_xmit(struct sk_buff *skb, struct net_device *dev)
  *
  * Writes a packet as immediate data into a Tx descriptor.  The packet
  * contains a work request at its beginning.  We must write the packet
- * carefully so the SGE doesn't read accidentally before it's written in
- * its entirety.
+ * carefully so the SGE doesn't read it accidentally before it's written
+ * in its entirety.
  */
 static inline void write_imm(struct tx_desc *d, struct sk_buff *skb,
 unsigned int len, unsigned int gen)
@@ -1191,7 +1191,11 @@ static inline void write_imm(struct tx_desc *d, struct sk_buff *skb,
struct work_request_hdr *from = (struct work_request_hdr *)skb->data;
struct work_request_hdr *to = (struct work_request_hdr *)d;
 
-   memcpy(&to[1], &from[1], len - sizeof(*from));
+   if (likely(!skb->data_len))
+   memcpy(&to[1], &from[1], len - sizeof(*from));
+   else
+   skb_copy_bits(skb, sizeof(*from), &to[1], len - sizeof(*from));
+
to->wr_hi = from->wr_hi | htonl(F_WR_SOP | F_WR_EOP |
V_WR_BCNTLFLT(len & 7));
wmb();
@@ -1261,7 +1265,7 @@ static inline void reclaim_completed_tx_imm(struct sge_txq *q)
 
 static inline int immediate(const struct sk_buff *skb)
 {
-   return skb->len <= WR_LEN && !skb->data_len;
+   return skb->len <= WR_LEN;
 }
 
 /**
@@ -1467,12 +1471,13 @@ static void write_ofld_wr(struct adapter *adap, struct sk_buff *skb,
  */
 static inline unsigned int calc_tx_descs_ofld(const struct sk_buff *skb)
 {
-   unsigned int flits, cnt = skb_shinfo(skb)->nr_frags;
+   unsigned int flits, cnt;
 
-   if (skb->len <= WR_LEN && cnt == 0)
+   if (skb->len <= WR_LEN)
return 1;   /* packet fits as immediate data */
 
flits = skb_transport_offset(skb) / 8;  /* headers */
+   cnt = skb_shinfo(skb)->nr_frags;
if (skb->tail != skb->transport_header)
cnt++;
return flits_to_desc(flits + sgl_len(cnt));
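
The effect of this change can be shown with a toy packet that has a linear head and one fragment. Everything below is a hypothetical user-space model: struct pkt stands in for an sk_buff, pkt_copy_bits() for skb_copy_bits(), and the WR_LEN value is made up for the example. The sketch only shows that a small packet now qualifies as immediate data regardless of fragments and that the copy gathers bytes from both parts.

```c
#include <stddef.h>
#include <string.h>

enum { WR_LEN = 128 };		/* made-up limit for this sketch */

struct pkt {
	const unsigned char *head;	/* linear part */
	size_t head_len;
	const unsigned char *frag;	/* single fragment, may be unused */
	size_t frag_len;		/* plays the role of skb->data_len */
};

static size_t pkt_len(const struct pkt *p)
{
	return p->head_len + p->frag_len;
}

/* After the patch: only the total length decides. */
static int immediate(const struct pkt *p)
{
	return pkt_len(p) <= WR_LEN;
}

/* Gather len bytes starting at offset into dst; -1 if out of range. */
static int pkt_copy_bits(const struct pkt *p, size_t offset,
			 unsigned char *dst, size_t len)
{
	size_t n;

	if (offset > pkt_len(p) || len > pkt_len(p) - offset)
		return -1;

	if (offset < p->head_len) {
		n = p->head_len - offset;
		if (n > len)
			n = len;
		memcpy(dst, p->head + offset, n);
		dst += n;
		len -= n;
		offset = 0;		/* continue at the fragment start */
	} else {
		offset -= p->head_len;
	}

	if (len)
		memcpy(dst, p->frag + offset, len);
	return 0;
}

/*
 * Copy everything after the hdr_len-byte header into dst and return
 * the number of bytes copied.  Requires hdr_len <= pkt_len(p).
 */
static size_t write_imm(unsigned char *dst, const struct pkt *p,
			size_t hdr_len)
{
	size_t len = pkt_len(p) - hdr_len;

	if (!p->frag_len)
		memcpy(dst, p->head + hdr_len, len);	/* fast path */
	else
		pkt_copy_bits(p, hdr_len, dst, len);
	return len;
}
```

The linear case keeps the plain memcpy(), so only fragmented packets pay for the gather copy.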


[PATCH 5/13] cxgb3 - Expose HW memory page info

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Let the RDMA driver get HW page info to work around HW issues.
Assign explicit enum values.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_ctl_defs.h |   52 +---
 drivers/net/cxgb3/cxgb3_offload.c  |7 +
 2 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_ctl_defs.h b/drivers/net/cxgb3/cxgb3_ctl_defs.h
index 2095dda..6c4f320 100644
--- a/drivers/net/cxgb3/cxgb3_ctl_defs.h
+++ b/drivers/net/cxgb3/cxgb3_ctl_defs.h
@@ -33,27 +33,29 @@
 #define _CXGB3_OFFLOAD_CTL_DEFS_H
 
 enum {
-   GET_MAX_OUTSTANDING_WR,
-   GET_TX_MAX_CHUNK,
-   GET_TID_RANGE,
-   GET_STID_RANGE,
-   GET_RTBL_RANGE,
-   GET_L2T_CAPACITY,
-   GET_MTUS,
-   GET_WR_LEN,
-   GET_IFF_FROM_MAC,
-   GET_DDP_PARAMS,
-   GET_PORTS,
-
-   ULP_ISCSI_GET_PARAMS,
-   ULP_ISCSI_SET_PARAMS,
-
-   RDMA_GET_PARAMS,
-   RDMA_CQ_OP,
-   RDMA_CQ_SETUP,
-   RDMA_CQ_DISABLE,
-   RDMA_CTRL_QP_SETUP,
-   RDMA_GET_MEM,
+   GET_MAX_OUTSTANDING_WR  = 0,
+   GET_TX_MAX_CHUNK= 1,
+   GET_TID_RANGE   = 2,
+   GET_STID_RANGE  = 3,
+   GET_RTBL_RANGE  = 4,
+   GET_L2T_CAPACITY= 5,
+   GET_MTUS= 6,
+   GET_WR_LEN  = 7,
+   GET_IFF_FROM_MAC= 8,
+   GET_DDP_PARAMS  = 9,
+   GET_PORTS   = 10,
+
+   ULP_ISCSI_GET_PARAMS= 11,
+   ULP_ISCSI_SET_PARAMS= 12,
+
+   RDMA_GET_PARAMS = 13,
+   RDMA_CQ_OP  = 14,
+   RDMA_CQ_SETUP   = 15,
+   RDMA_CQ_DISABLE = 16,
+   RDMA_CTRL_QP_SETUP  = 17,
+   RDMA_GET_MEM= 18,
+
+   GET_RX_PAGE_INFO= 50,
 };
 
 /*
@@ -161,4 +163,12 @@ struct rdma_ctrlqp_setup {
unsigned long long base_addr;
unsigned int size;
 };
+
+/*
+ * Offload TX/RX page information.
+ */
+struct ofld_page_info {
+   unsigned int page_size;  /* Page size, should be a power of 2 */
+   unsigned int num;/* Number of pages */
+};
 #endif /* _CXGB3_OFFLOAD_CTL_DEFS_H */
diff --git a/drivers/net/cxgb3/cxgb3_offload.c b/drivers/net/cxgb3/cxgb3_offload.c
index e620ed4..522c1be 100644
--- a/drivers/net/cxgb3/cxgb3_offload.c
+++ b/drivers/net/cxgb3/cxgb3_offload.c
@@ -317,6 +317,8 @@ static int cxgb_offload_ctl(struct t3cdev *tdev, unsigned int req, void *data)
struct iff_mac *iffmacp;
struct ddp_params *ddpp;
struct adap_ports *ports;
+   struct ofld_page_info *rx_page_info;
+   struct tp_params *tp = &adapter->params.tp;
int i;
 
switch (req) {
@@ -382,6 +384,11 @@ static int cxgb_offload_ctl(struct t3cdev *tdev, unsigned 
int req, void *data)
if (!offload_running(adapter))
return -EAGAIN;
return cxgb_rdma_ctl(adapter, req, data);
+   case GET_RX_PAGE_INFO:
+   rx_page_info = data;
+   rx_page_info->page_size = tp->rx_pg_size;
+   rx_page_info->num = tp->rx_num_pgs;
+   break;
default:
return -EOPNOTSUPP;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
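
A minimal userspace sketch of how an upper-layer driver might consume the new request. Only GET_RX_PAGE_INFO, struct ofld_page_info and the -EOPNOTSUPP default come from the patch; mock_tp_params, mock_offload_ctl() and rx_payload_bytes() are hypothetical stand-ins for the adapter state and for cxgb_offload_ctl().

```c
#include <errno.h>
#include <stddef.h>

/* Request code and reply structure as added by the patch. */
enum { GET_RX_PAGE_INFO = 50 };

struct ofld_page_info {
	unsigned int page_size;	/* Page size, should be a power of 2 */
	unsigned int num;	/* Number of pages */
};

/* Hypothetical stand-in for the adapter's TP parameters. */
struct mock_tp_params {
	unsigned int rx_pg_size;
	unsigned int rx_num_pgs;
};

/* Hypothetical model of the dispatch done in cxgb_offload_ctl(). */
static int mock_offload_ctl(const struct mock_tp_params *tp,
			    unsigned int req, void *data)
{
	struct ofld_page_info *rx_page_info;

	switch (req) {
	case GET_RX_PAGE_INFO:
		rx_page_info = data;
		rx_page_info->page_size = tp->rx_pg_size;
		rx_page_info->num = tp->rx_num_pgs;
		break;
	default:
		return -EOPNOTSUPP;
	}
	return 0;
}

/* Caller side: total Rx payload memory in bytes, or 0 if the query fails. */
static unsigned long long rx_payload_bytes(const struct mock_tp_params *tp)
{
	struct ofld_page_info info;

	if (mock_offload_ctl(tp, GET_RX_PAGE_INFO, &info))
		return 0;
	return (unsigned long long)info.page_size * info.num;
}
```

The explicit enum values in the patch matter for exactly this kind of caller: a request code built into a separately compiled module keeps meaning the same thing if new entries are added later.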


[PATCH 7/13] cxgb3 - Fatal error update

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Stop the MAC when a fatal error is detected.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_main.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index dc5d269..a1f94cf 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -2270,6 +2270,10 @@ void t3_fatal_err(struct adapter *adapter)
 
if (adapter->flags & FULL_INIT_DONE) {
t3_sge_stop(adapter);
+   t3_write_reg(adapter, A_XGM_TX_CTRL, 0);
+   t3_write_reg(adapter, A_XGM_RX_CTRL, 0);
+   t3_write_reg(adapter, XGM_REG(A_XGM_TX_CTRL, 1), 0);
+   t3_write_reg(adapter, XGM_REG(A_XGM_RX_CTRL, 1), 0);
t3_intr_disable(adapter);
}
CH_ALERT(adapter, "encountered fatal error, operation suspended\n");


[PATCH 2/13] cxgb3 - Update rx coalescing length

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Set max Rx coalescing length to 12288

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index c46c249..55922ed 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -104,7 +104,7 @@ enum {
PROTO_SRAM_LINES = 128, /* size of TP sram */
 };
 
-#define MAX_RX_COALESCING_LEN 16224U
+#define MAX_RX_COALESCING_LEN 12288U
 
 enum {
PAUSE_RX = 1 << 0,


[PATCH 3/13] cxgb3 - SGE doorbell overflow warning

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Log doorbell Fifo overflow

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/regs.h |8 
 drivers/net/cxgb3/sge.c  |4 
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index aa80313..2824278 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -172,6 +172,14 @@
 
 #define A_SG_INT_CAUSE 0x5c
 
+#define S_HIPIODRBDROPERR    11
+#define V_HIPIODRBDROPERR(x) ((x) << S_HIPIODRBDROPERR)
+#define F_HIPIODRBDROPERR    V_HIPIODRBDROPERR(1U)
+
+#define S_LOPIODRBDROPERR    10
+#define V_LOPIODRBDROPERR(x) ((x) << S_LOPIODRBDROPERR)
+#define F_LOPIODRBDROPERR    V_LOPIODRBDROPERR(1U)
+
 #define S_RSPQDISABLED    3
 #define V_RSPQDISABLED(x) ((x) << S_RSPQDISABLED)
 #define F_RSPQDISABLED    V_RSPQDISABLED(1U)
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index a2cfd68..9213cda 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -2476,6 +2476,10 @@ void t3_sge_err_intr_handler(struct adapter *adapter)
 "(0x%x)\n", (v >> S_RSPQ0DISABLED) & 0xff);
}
 
+   if (status & (F_HIPIODRBDROPERR | F_LOPIODRBDROPERR))
+   CH_ALERT(adapter, "SGE dropped %s priority doorbell\n",
+status & F_HIPIODRBDROPERR ? "high" : "lo");
+
t3_write_reg(adapter, A_SG_INT_CAUSE, status);
if (status & (F_RSPQCREDITOVERFOW | F_RSPQDISABLED))
t3_fatal_err(adapter);
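
The S_/V_/F_ macro triplet and the new status check can be tried outside the kernel. This is a sketch only: the bit positions are those in the patch, while dropped_doorbell_prio() is a hypothetical helper standing in for the CH_ALERT() call.

```c
#include <stddef.h>

/* Shift, value and flag macros as added to regs.h by the patch. */
#define S_HIPIODRBDROPERR    11
#define V_HIPIODRBDROPERR(x) ((x) << S_HIPIODRBDROPERR)
#define F_HIPIODRBDROPERR    V_HIPIODRBDROPERR(1U)

#define S_LOPIODRBDROPERR    10
#define V_LOPIODRBDROPERR(x) ((x) << S_LOPIODRBDROPERR)
#define F_LOPIODRBDROPERR    V_LOPIODRBDROPERR(1U)

/*
 * Hypothetical helper: returns the priority string the handler would log
 * for a given interrupt cause value, or NULL if no doorbell was dropped.
 */
static const char *dropped_doorbell_prio(unsigned int status)
{
	if (!(status & (F_HIPIODRBDROPERR | F_LOPIODRBDROPERR)))
		return NULL;
	return status & F_HIPIODRBDROPERR ? "high" : "lo";
}
```

Note that when both bits are set in the same cause word, the ternary reports only the high priority drop.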


[PATCH 1/13] cxgb3 - MAC workaround update

2007-08-10 Thread Divy Le Ray
From: Divy Le Ray <[EMAIL PROTECTED]>

Update the MAC workaround to deal with switches that do not
honor pause frames.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h |1 +
 drivers/net/cxgb3/xgmac.c  |   22 +++---
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index 1637800..c46c249 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -507,6 +507,7 @@ struct cmac {
unsigned int tx_xcnt;
u64 tx_mcnt;
unsigned int rx_xcnt;
+   unsigned int rx_ocnt;
u64 rx_mcnt;
unsigned int toggle_cnt;
unsigned int txen;
diff --git a/drivers/net/cxgb3/xgmac.c b/drivers/net/cxgb3/xgmac.c
index c302b1a..1d1c391 100644
--- a/drivers/net/cxgb3/xgmac.c
+++ b/drivers/net/cxgb3/xgmac.c
@@ -437,12 +437,13 @@ int t3_mac_enable(struct cmac *mac, int which)
struct mac_stats *s = &mac->stats;

if (which & MAC_DIRECTION_TX) {
-   t3_write_reg(adap, A_XGM_TX_CTRL + oft, F_TXEN);
t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_CFG_CH0 + idx);
t3_write_reg(adap, A_TP_PIO_DATA, 0xc0ede401);
t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_MODE);
t3_set_reg_field(adap, A_TP_PIO_DATA, 1 << idx, 1 << idx);
 
+   t3_write_reg(adap, A_XGM_TX_CTRL + oft, F_TXEN);
+
t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_CNT_CH0 + idx);
mac->tx_mcnt = s->tx_frames;
mac->tx_tcnt = (G_TXDROPCNTCH0RCVD(t3_read_reg(adap,
@@ -454,6 +455,7 @@ int t3_mac_enable(struct cmac *mac, int which)
mac->rx_xcnt = (G_TXSPI4SOPCNT(t3_read_reg(adap,
A_XGM_RX_SPI4_SOP_EOP_CNT +
oft)));
+   mac->rx_ocnt = s->rx_fifo_ovfl;
mac->txen = F_TXEN;
mac->toggle_cnt = 0;
}
@@ -464,24 +466,19 @@ int t3_mac_enable(struct cmac *mac, int which)
 
 int t3_mac_disable(struct cmac *mac, int which)
 {
-   int idx = macidx(mac);
struct adapter *adap = mac->adapter;
-   int val;
 
if (which & MAC_DIRECTION_TX) {
t3_write_reg(adap, A_XGM_TX_CTRL + mac->offset, 0);
-   t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_CFG_CH0 + idx);
-   t3_write_reg(adap, A_TP_PIO_DATA, 0xc01f);
-   t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_MODE);
-   t3_set_reg_field(adap, A_TP_PIO_DATA, 1 << idx, 1 << idx);
mac->txen = 0;
}
if (which & MAC_DIRECTION_RX) {
+   int val = F_MAC_RESET_;
+
t3_set_reg_field(mac->adapter, A_XGM_RESET_CTRL + mac->offset,
 F_PCS_RESET_, 0);
msleep(100);
t3_write_reg(adap, A_XGM_RX_CTRL + mac->offset, 0);
-   val = F_MAC_RESET_;
if (is_10G(adap))
val |= F_PCS_RESET_;
else if (uses_xaui(adap))
@@ -541,11 +538,14 @@ int t3b2_mac_watchdog_task(struct cmac *mac)
}
 
 rxcheck:
-   if (rx_mcnt != mac->rx_mcnt)
+   if (rx_mcnt != mac->rx_mcnt) {
rx_xcnt = (G_TXSPI4SOPCNT(t3_read_reg(adap,
A_XGM_RX_SPI4_SOP_EOP_CNT +
-   mac->offset)));
-   else
+   mac->offset))) +
+   (s->rx_fifo_ovfl -
+mac->rx_ocnt);
+   mac->rx_ocnt = s->rx_fifo_ovfl;
+   } else
goto out;
 
if (mac->rx_mcnt != s->rx_frames && rx_xcnt == 0 &&
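
As I read the rxcheck hunk, frames dropped on Rx FIFO overflow now count as Rx activity, so a port flooded by a switch that ignores pause frames is not mistaken for a hung MAC. A userspace sketch of that bookkeeping, with hypothetical names (mock_mac, rx_activity()) and plain unsigned int counters assumed for simplicity:

```c
/* Hypothetical subset of the per-MAC watchdog state. */
struct mock_mac {
	unsigned int rx_ocnt;	/* rx_fifo_ovfl value seen at the last check */
};

/*
 * Rx activity since the last check: the SPI4 SOP count read from the
 * hardware plus the FIFO overflows counted since the previous call.
 * The overflow snapshot is refreshed so each overflow is counted once.
 */
static unsigned int rx_activity(struct mock_mac *mac, unsigned int sop_cnt,
				unsigned int rx_fifo_ovfl)
{
	unsigned int rx_xcnt = sop_cnt + (rx_fifo_ovfl - mac->rx_ocnt);

	mac->rx_ocnt = rx_fifo_ovfl;
	return rx_xcnt;
}
```

A result of zero is what keeps the watchdog's hang check armed; any overflow since the last pass makes the result non-zero.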


Re: [PATCH 0/2] writeback dirty inodes fixes

2007-08-10 Thread Fengguang Wu
On Sat, Aug 11, 2007 at 02:02:02PM +0800, Fengguang Wu wrote:
> Andrew,
> 
> Now the patches are simplified and rebased to 2.6.23-rc2-mm2.
> 
> The following two patches should be put immediately after
> writeback-fix-periodic-superblock-dirty-inode-flushing.patch:
> 
>  writeback: fix time ordering of the per superblock inode lists 8   
>  writeback: fix ntfs with sb_has_dirty_inodes() 

The following three patches should be updated to resolve merge conflicts:

sync_sb_inodes-propagate-errors.patch 
reiser4-sb_sync_inodes.patch
check_dirty_inode_list.patch (extended to check s_io/s_more_io)

They are attached in this mail.
From: Andrew Morton <[EMAIL PROTECTED]>

Guillaume points out that sync_sb_inodes() is failing to propagate error codes
back.  Fix that, and make several other void-returning functions not drop
reportable error codes.

Cc: Guillaume Chazarain <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/fs-writeback.c |   56 +++-
 include/linux/writeback.h |6 +--
 2 files changed, 45 insertions(+), 17 deletions(-)

--- linux-2.6.23-rc2-mm2.orig/fs/fs-writeback.c
+++ linux-2.6.23-rc2-mm2/fs/fs-writeback.c
@@ -392,13 +392,17 @@ __writeback_single_inode(struct inode *i
  * on the writer throttling path, and we get decent balancing between many
  * throttled threads: we don't want them all piling up on inode_sync_wait.
  */
-static void
+static int
 sync_sb_inodes(struct super_block *sb, struct writeback_control *wbc)
 {
+	int ret = 0;
+
 	if (!wbc->for_kupdate || list_empty(&sb->s_io))
 		queue_io(sb, wbc->older_than_this);
 
 	while (!list_empty(&sb->s_io)) {
+		int err;
+
 		struct inode *inode = list_entry(sb->s_io.prev,
 		struct inode, i_list);
 		struct address_space *mapping = inode->i_mapping;
@@ -444,7 +448,9 @@ sync_sb_inodes(struct super_block *sb, s
 		BUG_ON(inode->i_state & I_FREEING);
 		__iget(inode);
 		pages_skipped = wbc->pages_skipped;
-		__writeback_single_inode(inode, wbc);
+		err = __writeback_single_inode(inode, wbc);
+		if (!ret)
+			ret = err;
 		if (wbc->sync_mode == WB_SYNC_HOLD) {
 			inode->dirtied_when = jiffies;
 			list_move(&inode->i_list, &sb->s_dirty);
@@ -469,7 +475,7 @@ sync_sb_inodes(struct super_block *sb, s
 	if (list_empty(&sb->s_io))
 		list_splice_init(&sb->s_more_io, &sb->s_io);
 
-	return;		/* Leave any unwritten inodes on s_io */
+	return ret;		/* Leave any unwritten inodes on s_io */
 }
 
 /*
@@ -491,10 +497,10 @@ sync_sb_inodes(struct super_block *sb, s
  * sync_sb_inodes will seekout the blockdev which matches `bdi'.  Maybe not
  * super-efficient but we're about to do a ton of I/O...
  */
-void
-writeback_inodes(struct writeback_control *wbc)
+int writeback_inodes(struct writeback_control *wbc)
 {
 	struct super_block *sb;
+	int ret = 0;
 
 	might_sleep();
 	spin_lock(&sb_lock);
@@ -512,9 +518,13 @@ restart:
 			 */
 			if (down_read_trylock(&sb->s_umount)) {
 if (sb->s_root) {
+	int err;
+
 	spin_lock(&inode_lock);
-	sync_sb_inodes(sb, wbc);
+	err = sync_sb_inodes(sb, wbc);
 	spin_unlock(&inode_lock);
+	if (!ret)
+		ret = err;
 }
 up_read(&sb->s_umount);
 			}
@@ -526,6 +536,7 @@ restart:
 			break;
 	}
 	spin_unlock(&sb_lock);
+	return ret;
 }
 
 /*
@@ -539,7 +550,7 @@ restart:
  * We add in the number of potentially dirty inodes, because each inode write
  * can dirty pagecache in the underlying blockdev.
  */
-void sync_inodes_sb(struct super_block *sb, int wait)
+int sync_inodes_sb(struct super_block *sb, int wait)
 {
 	struct writeback_control wbc = {
 		.sync_mode	= wait ? WB_SYNC_ALL : WB_SYNC_HOLD,
@@ -548,14 +559,16 @@ void sync_inodes_sb(struct super_block *
 	};
 	unsigned long nr_dirty = global_page_state(NR_FILE_DIRTY);
 	unsigned long nr_unstable = global_page_state(NR_UNSTABLE_NFS);
+	int ret;
 
 	wbc.nr_to_write = nr_dirty + nr_unstable +
 			(inodes_stat.nr_inodes - inodes_stat.nr_unused) +
 			nr_dirty + nr_unstable;
 	wbc.nr_to_write += wbc.nr_to_write / 2;		/* Bit more for luck */
 	spin_lock(&inode_lock);
-	sync_sb_inodes(sb, &wbc);
+	ret = sync_sb_inodes(sb, &wbc);
 	spin_unlock(&inode_lock);
+	return ret;
 }
 
 /*
@@ -591,13 +604,16 @@ static void set_sb_syncing(int val)
  * outstanding dirty inodes, the writeback goes block-at-a-time within the
  * filesystem's write_inode().  This is extremely slow.
  */
-static void __sync_inodes(int wait)
+static int __sync_inodes(int wait)
 {
 	struct super_block *sb;
+	int ret = 0;
 
 	spin_lock(&sb_lock);
 restart:
 	list_for_each_entry(sb, &super_blocks, s_list) {
+		int err;
+
 		if (sb->s_syncing)
 			continue;
 		sb->s_syncing = 1;
@@ -605,8 +621,12 @@ restart:
 		spin_unlock(&sb_lock);
 		down_read(&sb->s_umount);
 		if (sb->s_root) {
-			sync_inodes_sb(sb, wait);
-			sync_blockdev(sb->s_bdev);
+			err = sync_inodes_sb(sb, wait);
+			if (!ret)
+ret = err;
+			err = sync_blockdev(sb->s_bdev);
+			if (!ret)
+ret = err;
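
The "if (!ret) ret = err;" idiom used throughout this patch keeps the first error while still running every remaining step. A self-contained sketch of the pattern; run_all() and mock_write() are hypothetical and not part of the patch:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical step: a negative item simulates a failing write. */
static int mock_write(int item)
{
	return item < 0 ? item : 0;
}

/*
 * Run every step, remember only the first non-zero result, and report
 * how many steps actually ran.
 */
static int run_all(int (*step)(int), const int *items, size_t n, size_t *ran)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int err = step(items[i]);

		if (!ret)
			ret = err;	/* first error wins, later ones are dropped */
	}
	*ran = n;
	return ret;
}
```

The design choice is that an early failure does not stop writeback of the remaining inodes or superblocks; the caller just learns that at least one step failed and what the first failure was.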
 	

Re: Software based ECC ?

2007-08-10 Thread Valdis . Kletnieks
On Fri, 10 Aug 2007 23:16:45 +0200, roland said:

> http://pdos.csail.mit.edu/papers/softecc:ddopson-meng/softecc_ddopson-meng.pdf
> 
> "SoftECC : A System for Software Memory Integrity Checking"
> 
> Is it possible to implement something like this within the Linux virtual
> memory subsystem ?

Anything that can be simulated with a Turing machine is *possible*.

The question is how many rocket boosters the pig needs for takeoff.

Hint: The thesis talks about why he didn't implement it for Linux.

> If it can be done, wouldn`t this be a great feature ?

Read section 5.2 of that thesis, particularly this quote from 5.2.2:

"For random word writes, this implies that SoftECC will need an order of
magnitude more compute time than the user-mode code"

Basically, on every single memory page that gets dirtied, we have to then
re-checksum the page (blowing away cache lines in the process).  If you want
to get a feel for it, find the kernel code that recognizes that a page is
dirtied, and just add a few lines there:

int foo = 0, i;
for (i = 0; i < 1024; i++) { // adjust for non-4K pages
	foo ^= *(page+i);
}

and see how much your system crawls.
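
For anyone who wants to time this outside the kernel first, here is a self-contained userspace version of the same loop. It assumes a 4K page read as 32-bit words; page_xor() is a hypothetical name, and a plain XOR is only a stand-in for the cost of touching every word, not an actual error-correcting code.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096u			/* adjust for non-4K pages */
#define PAGE_WORDS (PAGE_BYTES / sizeof(uint32_t))

/* XOR every 32-bit word of one page together. */
static uint32_t page_xor(const uint32_t *page)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < PAGE_WORDS; i++)
		sum ^= page[i];
	return sum;
}
```

Calling this once per dirtied page reads the whole page through the cache, which is the overhead being described.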

Personally, I'd recommend just shelling out the bucks for hardware ECC if
the reliability matters.



pgp59H6a1oMSE.pgp
Description: PGP signature


[PATCH 2/2] writeback: fix ntfs with sb_has_dirty_inodes()

2007-08-10 Thread Fengguang Wu
NTFS's if-condition on dirty inodes is not complete.
Fix it with sb_has_dirty_inodes().

Cc: Anton Altaparmakov <[EMAIL PROTECTED]>
Cc: Ken Chen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 fs/fs-writeback.c  |9 -
 fs/ntfs/super.c|4 ++--
 include/linux/fs.h |1 +
 3 files changed, 11 insertions(+), 3 deletions(-)

--- linux-2.6.23-rc2-mm2.orig/fs/ntfs/super.c
+++ linux-2.6.23-rc2-mm2/fs/ntfs/super.c
@@ -2381,14 +2381,14 @@ static void ntfs_put_super(struct super_
 */
ntfs_commit_inode(vol->mft_ino);
write_inode_now(vol->mft_ino, 1);
-   if (!list_empty(&sb->s_dirty)) {
+   if (sb_has_dirty_inodes(sb)) {
const char *s1, *s2;
 
mutex_lock(&vol->mft_ino->i_mutex);
truncate_inode_pages(vol->mft_ino->i_mapping, 0);
mutex_unlock(&vol->mft_ino->i_mutex);
write_inode_now(vol->mft_ino, 1);
-   if (!list_empty(&sb->s_dirty)) {
+   if (sb_has_dirty_inodes(sb)) {
static const char *_s1 = "inodes";
static const char *_s2 = "";
s1 = _s1;
--- linux-2.6.23-rc2-mm2.orig/include/linux/fs.h
+++ linux-2.6.23-rc2-mm2/include/linux/fs.h
@@ -1712,6 +1712,7 @@ extern int bdev_read_only(struct block_d
 extern int set_blocksize(struct block_device *, int);
 extern int sb_set_blocksize(struct super_block *, int);
 extern int sb_min_blocksize(struct super_block *, int);
+extern int sb_has_dirty_inodes(struct super_block *);
 
 extern int generic_file_mmap(struct file *, struct vm_area_struct *);
 extern int generic_file_readonly_mmap(struct file *, struct vm_area_struct *);
--- linux-2.6.23-rc2-mm2.orig/fs/fs-writeback.c
+++ linux-2.6.23-rc2-mm2/fs/fs-writeback.c
@@ -188,6 +188,13 @@ static void queue_io(struct super_block 
}
 }
 
+int sb_has_dirty_inodes(struct super_block *sb)
+{
+   return !list_empty(&sb->s_dirty) ||
+  !list_empty(&sb->s_io);
+}
+EXPORT_SYMBOL(sb_has_dirty_inodes);
+
 /*
  * Write a single inode's dirty pages and inode data out to disk.
  * If `wait' is set, wait on the writeout.
@@ -485,7 +492,7 @@ writeback_inodes(struct writeback_contro
 restart:
sb = sb_entry(super_blocks.prev);
for (; sb != sb_entry(&super_blocks); sb = sb_entry(sb->s_list.prev)) {
-   if (!list_empty(&sb->s_dirty) || !list_empty(&sb->s_io)) {
+   if (sb_has_dirty_inodes(sb)) {
/* we're making our own get_super here */
sb->s_count++;
spin_unlock(&sb_lock);
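
A userspace model of the new helper, for illustration. The list primitives below are simplified re-implementations rather than the kernel's list.h, and mock_sb is a hypothetical stand-in for struct super_block that carries only the two lists the patch tests.

```c
/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical stand-in for the relevant part of struct super_block. */
struct mock_sb {
	struct list_head s_dirty;	/* dirty inodes */
	struct list_head s_io;		/* inodes queued for writeback */
};

/* Same test as the helper in the patch: dirty if either list is non-empty. */
static int sb_has_dirty_inodes(struct mock_sb *sb)
{
	return !list_empty(&sb->s_dirty) ||
	       !list_empty(&sb->s_io);
}
```

The NTFS check this replaces looked only at s_dirty, so an inode already moved to s_io went unnoticed.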



[PATCH 0/2] writeback dirty inodes fixes

2007-08-10 Thread Fengguang Wu
Andrew,

Now the patches are simplified and rebased to 2.6.23-rc2-mm2.

The following two patches should be put immediately after
writeback-fix-periodic-superblock-dirty-inode-flushing.patch:

 writeback: fix time ordering of the per superblock inode lists 8   
 writeback: fix ntfs with sb_has_dirty_inodes() 

Thank you,
Fengguang


[PATCH 1/2] writeback: fix time ordering of the per superblock inode lists 8

2007-08-10 Thread Fengguang Wu
Fix the time ordering bug re-introduced by
writeback-fix-periodic-superblock-dirty-inode-flushing.patch.

It works by never moving not-yet-expired dirty inodes from s_dirty to s_io
*only to* move them back. The move-inodes-back-and-forth thing is a mess.

Cc: Ken Chen <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 fs/fs-writeback.c |   40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

--- linux-2.6.23-rc2-mm2.orig/fs/fs-writeback.c
+++ linux-2.6.23-rc2-mm2/fs/fs-writeback.c
@@ -172,6 +172,23 @@ static void requeue_io(struct inode *ino
 }
 
 /*
+ * Queue expired dirty inodes for io.
+ */
+static void queue_io(struct super_block *sb,
+   unsigned long *older_than_this)
+{
+   while (!list_empty(&sb->s_dirty)) {
+   struct inode *inode = list_entry(sb->s_dirty.prev,
+   struct inode, i_list);
+   /* Was this inode dirtied too recently? */
+   if (older_than_this &&
+   time_after(inode->dirtied_when, *older_than_this))
+   break;
+   list_move(&inode->i_list, &sb->s_io);
+   }
+}
+
+/*
  * Write a single inode's dirty pages and inode data out to disk.
  * If `wait' is set, wait on the writeout.
  *
@@ -295,10 +312,10 @@ __writeback_single_inode(struct inode *i
 
/*
 * We're skipping this inode because it's locked, and we're not
-* doing writeback-for-data-integrity.  Move it to the head of
-* s_dirty so that writeback can proceed with the other inodes
-* on s_io.  We'll have another go at writing back this inode
-* when the s_dirty iodes get moved back onto s_io.
+* doing writeback-for-data-integrity.  Move it to s_more_io so
+* that writeback can proceed with the other inodes on s_io.
+* We'll have another go at writing back this inode when we
+* completed a full scan of s_io.
 */
requeue_io(inode);
 
@@ -362,10 +379,8 @@ __writeback_single_inode(struct inode *i
 static void
 sync_sb_inodes(struct super_block *sb, struct writeback_control *wbc)
 {
-   const unsigned long start = jiffies;/* livelock avoidance */
-
if (!wbc->for_kupdate || list_empty(&sb->s_io))
-   list_splice_init(&sb->s_dirty, &sb->s_io);
+   queue_io(sb, wbc->older_than_this);
 
while (!list_empty(&sb->s_io)) {
struct inode *inode = list_entry(sb->s_io.prev,
@@ -406,17 +421,6 @@ sync_sb_inodes(struct super_block *sb, s
continue;   /* blockdev has wrong queue */
}
 
-   /* Was this inode dirtied after sync_sb_inodes was called? */
-   if (time_after(inode->dirtied_when, start))
-   break;
-
-   /* Was this inode dirtied too recently? */
-   if (wbc->older_than_this && time_after(inode->dirtied_when,
-   *wbc->older_than_this)) {
-   list_splice_init(&sb->s_io, sb->s_dirty.prev);
-   break;
-   }
-
/* Is another pdflush already flushing this queue? */
if (current_is_pdflush() && !writeback_acquire(bdi))
break;
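
The cutoff logic in queue_io() can be modeled in userspace. This is a sketch under stated assumptions: an array ordered oldest-first stands in for walking s_dirty from its tail, time_after() is a simplified version of the kernel macro, and count_expired() is a hypothetical name.

```c
#include <stddef.h>

/* Simplified, wraparound-safe version of the kernel's time_after(). */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/*
 * Count how many entries would be moved to s_io. Entries are ordered
 * oldest first; the scan stops at the first one dirtied after the
 * cutoff. A NULL cutoff means "queue everything".
 */
static size_t count_expired(const unsigned long *dirtied_when, size_t n,
			    const unsigned long *older_than_this)
{
	size_t moved = 0;

	while (moved < n) {
		/* Was this inode dirtied too recently? */
		if (older_than_this &&
		    time_after(dirtied_when[moved], *older_than_this))
			break;
		moved++;
	}
	return moved;
}
```

Because the scan stops at the first too-recent entry, this relies on the list staying ordered by dirtied_when, which is the ordering the patch sets out to restore.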



Re: CFS review

2007-08-10 Thread Willy Tarreau
On Sat, Aug 11, 2007 at 12:50:08AM +0200, Roman Zippel wrote:
> Hi,
> 
> On Fri, 10 Aug 2007, Ingo Molnar wrote:
> 
> > achieve that. It probably won't make a real difference, but it's really
> > easy for you to send and it's still very useful when one tries to 
> > eliminate possibilities and when one wants to concentrate on the 
> > remaining possibilities alone.
> 
> The thing I'm afraid about CFS is its possible unpredictability, which 
> would make it hard to reproduce problems and we may end up with users with 
> unexplainable weird problems. That's the main reason I'm trying so hard to 
> push for a design discussion.

You may be interested in looking at the very early CFS versions. The design
was much more naive and understandable. After that, a lot of tricks have
been added to take into account a lot of uses and corner cases, which may
not help in understanding it globally.

> Just to give an idea here are two more examples of irregular behaviour, 
> which are hopefully easier to reproduce.
> 
> 1. Two simple busy loops, one of them is reniced to 15, according to my 
> calculations the reniced task should get about 3.4% (1/(1.25^15+1)), but I 
> get this:
> 
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4433 roman 20   0  1532  300  244 R 99.2  0.2   5:05.51 l
>  4434 roman 35  15  1532   72   16 R  0.7  0.1   0:10.62 l

Could this be caused by typos in some tables like you have found in wmult ?
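
For reference, the expected figure quoted above is easy to reproduce. The sketch below takes the 1.25 weight ratio per nice level from Roman's formula, 1/(1.25^15+1); expected_share() is a hypothetical helper and makes no claim about how the scheduler itself computes weights.

```c
/*
 * Expected CPU share of the reniced task when it competes with one
 * nice-0 task, assuming each nice level scales the weight by 1.25.
 */
static double expected_share(int nice_delta)
{
	double weight_ratio = 1.0;
	int i;

	for (i = 0; i < nice_delta; i++)
		weight_ratio *= 1.25;
	return 1.0 / (weight_ratio + 1.0);
}
```

With nice_delta = 15 this comes out at roughly 0.034, i.e. the 3.4% in the report, against the 0.7% that top shows.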

> OTOH up to nice level 12 I get what I expect.
> 
> 2. If I start 20 busy loops, initially I see in top that every task gets 
> 5% and time increments equally (as it should):
(...)

> But if I renice all of them to -15, the time every task gets is rather 
> random:
> 
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4492 roman  5 -15  1532   68   16 R  1.0  0.1   0:07.95 l
>  4491 roman  5 -15  1532   68   16 R  4.3  0.1   0:07.62 l
>  4490 roman  5 -15  1532   68   16 R  3.3  0.1   0:07.50 l
>  4489 roman  5 -15  1532   68   16 R  7.6  0.1   0:07.80 l
>  4488 roman  5 -15  1532   68   16 R  9.6  0.1   0:08.31 l
>  4487 roman  5 -15  1532   68   16 R  3.3  0.1   0:07.59 l
>  4486 roman  5 -15  1532   68   16 R  6.6  0.1   0:07.08 l
>  4485 roman  5 -15  1532   68   16 R 10.0  0.1   0:07.31 l
>  4484 roman  5 -15  1532   68   16 R  8.0  0.1   0:07.30 l
>  4483 roman  5 -15  1532   68   16 R  7.0  0.1   0:07.34 l
>  4482 roman  5 -15  1532   68   16 R  1.0  0.1   0:05.84 l
>  4481 roman  5 -15  1532   68   16 R  1.0  0.1   0:07.16 l
>  4480 roman  5 -15  1532   68   16 R  3.3  0.1   0:07.00 l
>  4479 roman  5 -15  1532   68   16 R  1.0  0.1   0:06.66 l
>  4478 roman  5 -15  1532   68   16 R  8.6  0.1   0:06.96 l
>  4477 roman  5 -15  1532   68   16 R  8.6  0.1   0:07.63 l
>  4476 roman  5 -15  1532   68   16 R  9.6  0.1   0:07.38 l
>  4475 roman  5 -15  1532   68   16 R  1.3  0.1   0:07.09 l
>  4474 roman  5 -15  1532   68   16 R  2.3  0.1   0:07.97 l
>  4473 roman  5 -15  1532  296  244 R  1.0  0.2   0:07.73 l

Do you see this only at -15, or starting with -15 and below ?

Willy



Re: Documentation files in html format?

2007-08-10 Thread Willy Tarreau
On Sat, Aug 11, 2007 at 01:08:30AM +0200, Sam Ravnborg wrote:
> > 
> > The problem I have with asciidoc is that it's a nightmare to get it
> > to work. It's what GIT uses, and after spending a whole day trying
> > to *build* that thing, I finally resigned and asked Junio if he could
> > publish the pre-formatted manpages himself, which he agreed to.
> 
> Git uses in addition to asciidoc also docbook and a bit more.
> As asciidoc is some python scripts it should be trivial to
> install with no build required.

I remember it relied on some tools to process xml, but I don't know
exactly what. It was those tools that I could not build.

> Maybe it was the docbook stuff you had trouble with?

possible, I don't remember that much, it was a painful day one year ago.

> My Kbuild example were made without using other tools than asciidoc but
> if pdf is desired some additional tools are needed.

It was just needed to build the man pages, so I would have expected it
to be pretty straight-forward too.

Willy



Re: CFS review

2007-08-10 Thread Willy Tarreau
On Fri, Aug 10, 2007 at 11:15:55PM +0200, Roman Zippel wrote:
> Hi,
> 
> On Fri, 10 Aug 2007, Willy Tarreau wrote:
> 
> > fortunately all bug reporters are not like you. It's amazing how long
> > you can resist sending a simple bug report to a developer!
> 
> I'm more amazed how long Ingo can resist providing some explanations (not 
> just about this problem).

It's a matter of time balance. It takes a short time to send the output
of a script, and it takes a very long time to explain how things work.
I often encounter the same situation with haproxy. People ask me to
explain to them in detail how this or that would apply to their context, and
it's often easier for me to provide them with a 5-line patch to add the
feature they need than to spend half an hour explaining why and how it
would behave badly.

> It's not like I haven't given him anything, he already has the test 
> programs, he already knows the system configuration.
> Well, I've sent him the stuff now...

fine, thanks.

> > Maybe you
> > consider that you need to fix the bug by yourself after you understand
> > the code,
> 
> Fixing the bug requires some knowledge what the code is intended to do.
> 
> > Please try to be a little bit more transparent if you really want the
> > bugs fixed, and don't behave as if you wanted this bug to survive
> > till -final.
> 
> Could you please ask Ingo the same? I'm simply trying to get some 
> transparency into the CFS design. Without further information it's
> difficult to tell, whether something is supposed to work this way or it's 
> a bug.

I know that Ingo tends to reply to a question with another question. But
as I said, imagine if he has to explain the same things to each person
who asks him for it. I think that a more constructive approach would be
to point out what is missing/unclear/inexact in the doc so that he adds some
paragraphs for you and everyone else. If you need this information to debug,
most likely other people will need it too.

> In this case it's quite possible that due to a recent change my testcase 
> doesn't work anymore. Should I consider the problem fixed or did it just 
> go into hiding? Without more information it's difficult to verify this 
> independently.

generally, problems that appear only on one person's side and which suddenly
disappear are either caused by some random buggy patch left in the tree (not
your case it seems), or by an obscure bug of the feature being tested which
will resurface from time to time as long as it's not identified.

Willy



Re: [PATCH 00/23] per device dirty throttling -v8

2007-08-10 Thread Valdis . Kletnieks
On Fri, 10 Aug 2007 00:04:45 EDT, Bill Davidsen said:

> > I never imagined that it was the 20%+ hit that is being described, and
> > with so little impact, or I would have switched to it across the board 
> > years ago.
> > 
> To get that magnitude you need slow disk with very fast CPU. It helps 
> most of systems where the disk hardware is marginal or worse for the i/o 
> load. Don't take that as typical.

I suspect that almost every single laptop with a Core2 Duo in it falls into
that classification, and it's getting worse every year, as we see more
disparity between CPU speeds (increasing) and disk seek times (basically nailed
to the floor for the last decade).



pgpSAQlmGIEyL.pgp
Description: PGP signature


Re: [patch 3/4] Enable link power management for ata drivers

2007-08-10 Thread Valdis . Kletnieks
On Thu, 09 Aug 2007 14:24:16 PDT, Kristen Carlson Accardi said:
> +++ 2.6-git/drivers/ata/libata-scsi.c
> @@ -2904,6 +2976,52 @@ void ata_scsi_simulate(struct ata_device

> + if ((dev->horkage & ATA_HORKAGE_IPM) ||
> + !(dev->flags & ATA_DFLAG_IPM)) {
> + ata_dev_printk(dev, KERN_ERR,
> + "Unable to set Link PM policy\n");
> + ap->pm_policy = MAX_PERFORMANCE;
> + }

KERN_INFO please, or KERN_WARNING at the highest, at least until such time
as enough drivers support enough hardware that it really *does* qualify for
"this should not fail" status.

(OK, so I'm just cranky because I'm tired of seeing a KERN_ERR thrown at every
reboot, just because the ata_piix driver doesn't know how to set this stuff up
for the DVD±RW drive in my laptop.  But when this goes upstream, lots of
*other* people are going to get hit by the exact same thing and think there's
something actually *wrong* with their hardware.)


pgpOEeHllsO1p.pgp
Description: PGP signature


Re: Serial ports rearranged in 2.6.22?

2007-08-10 Thread Yinghai Lu
On 8/10/07, Michael Mauch <[EMAIL PROTECTED]> wrote:
> Hi,
>
> until 2.6.21, I had the normal assignments for ttyS0 and ttyS1:
>
> 00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> 00:09: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
>
> With 2.6.22 I get the names <-> ports/irqs the other way around:
>
> 00:08: ttyS0 at I/O 0x2f8 (irq = 3) is a 16550A
> 00:09: ttyS1 at I/O 0x3f8 (irq = 4) is a 16550A
>
> Is this supposed to be that way? Should we reassign these names with
> udev? udev-114 doesn't seem to have built-in rules to assign the
> traditional names.
>
> Or could it be related to some brokenness in my BIOS (ACPI/PNP)?
>
> I'm using the 8250_pnp module (and it's the same with builtin serial
> modules). I made sure that I did not accidentally change the BIOS
> settings for the serial ports.
>
> I'm using Gentoo, but on the lirc list was a Fedora user with the same
> symptoms.

http://lkml.org/lkml/2007/7/25/455

YH


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 05:29:49PM -0700, Paul E. McKenney wrote:
> 
> Errmmm...  No joy.
> 
>   ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!
> 
> Turns out that cpu_clock also ain't exported, and rcutorture.c is
> a module.  Would adding an EXPORT_SYMBOL_GPL() as in the patch below
> be acceptable?

Except that the old xtime symbol was EXPORT_SYMBOL() rather than my
proposed EXPORT_SYMBOL_GPL() for the equivalent new cpu_clock().

Sigh!!!  I will leave this one for others to sort out.

Andrew, please consider this patch withdrawn and apply the version that
does not rely on time for entropy.  Please let me know if you would like
me to resend it.

Thanx, Paul

> If not, I have a tested patch to rcutorture.c that leverages statistical
> counters.  Your choice.
> 
>   Thanx, Paul
> 
> Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> Compiles, but not yet tested.
> 
> Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
> ---
> 
>  rcutorture.c |8 ++--
>  sched.c  |2 ++
>  2 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c 
> linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
> --- linux-2.6.23-rc2/kernel/rcutorture.c  2007-08-03 19:49:55.0 
> -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c  2007-08-10 
> 17:15:22.0 -0700
> @@ -42,7 +42,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -166,16 +165,13 @@ struct rcu_random_state {
>  
>  /*
>   * Crude but fast random-number generator.  Uses a linear congruential
> - * generator, with occasional help from get_random_bytes().
> + * generator, with occasional help from cpu_clock().
>   */
>  static unsigned long
>  rcu_random(struct rcu_random_state *rrsp)
>  {
> - long refresh;
> -
>   if (--rrsp->rrs_count < 0) {
> - get_random_bytes(&refresh, sizeof(refresh));
> - rrsp->rrs_state += refresh;
> + rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
>   rrsp->rrs_count = RCU_RANDOM_REFRESH;
>   }
>   rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
> diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c 
> linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
> --- linux-2.6.23-rc2/kernel/sched.c   2007-08-03 19:49:55.0 -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c   2007-08-10 
> 17:22:57.0 -0700
> @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
>   return now;
>  }
>  
> +EXPORT_SYMBOL_GPL(cpu_clock);
> +
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  /* Change a task's ->cfs_rq if it moves across CPUs */
>  static inline void set_task_cfs_rq(struct task_struct *p)
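
For readers unfamiliar with the generator touched by the quoted patch, here is a minimal userspace sketch of a linear congruential generator that folds in outside entropy every so often. The constants and the names `lcg_state`, `lcg_next` and `LCG_REFRESH` are assumptions chosen for illustration, not copied from rcutorture.c. The entropy value is passed in by the caller, so a clock, a statistical counter or anything else could stand in for it.

```c
#include <stdint.h>

/* Illustrative constants; assumptions, not taken from the kernel source. */
#define LCG_MULT    39916801u
#define LCG_ADD     479001599u
#define LCG_REFRESH 4

struct lcg_state {
	uint32_t state;   /* current generator state */
	int      count;   /* steps left before the next reseed */
};

/*
 * Advance the generator one step.  When the countdown expires, add
 * caller-supplied entropy to the state and restart the countdown.
 */
uint32_t lcg_next(struct lcg_state *s, uint32_t entropy)
{
	if (--s->count < 0) {
		s->state += entropy;
		s->count = LCG_REFRESH;
	}
	s->state = s->state * LCG_MULT + LCG_ADD;
	return s->state;
}
```

Between reseeds the sequence is fully determined by the state, which is why the choice of entropy source only matters once per refresh interval.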


Re: early boot lockup with 2.6.23-rc1

2007-08-10 Thread Mikko Rapeli
On Fri, Aug 10, 2007 at 10:20:31PM +0300, Mikko Rapeli wrote:
> I've bisected thus far, if it helps:

Bisect came to this conclusion:

git-bisect start
# good: [4eb6bf6bfb580afaf1e1a1d30cba17a078530cf4] lots-of-architectures: 
enable arbitary speed tty support
git-bisect good 4eb6bf6bfb580afaf1e1a1d30cba17a078530cf4
# bad: [773208946a132fb733ba273ee8562814f828cc28] Revert "USB: fix 
gregkh-usb-usb-use-menuconfig-objects"
git-bisect bad 773208946a132fb733ba273ee8562814f828cc28
# bad: [dc690d8ef842b464f1c429a376ca16cb8dbee6ae] Merge 
master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
git-bisect bad dc690d8ef842b464f1c429a376ca16cb8dbee6ae
# good: [15028aad00ddf241581fbe74a02ec89cbb28d35d] [TG3]: Update version to 
3.78.
git-bisect good 15028aad00ddf241581fbe74a02ec89cbb28d35d
# bad: [82afee684fe3badaf5ee3fc5b6fda687d558bfb5] Merge 
master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6
git-bisect bad 82afee684fe3badaf5ee3fc5b6fda687d558bfb5
# bad: [c39736823232bc3ca113c8228fa852c09fba300e] Remove old i386 setup code
git-bisect bad c39736823232bc3ca113c8228fa852c09fba300e
# good: [5be865661516263d90317a6b35b588a2d7c3cb55] String-handling functions 
for the new x86 setup code.
git-bisect good 5be865661516263d90317a6b35b588a2d7c3cb55
# good: [3b53d3045bbb8ea3c9dce663b102eab0903817c5] MCA support for new x86 
setup code
git-bisect good 3b53d3045bbb8ea3c9dce663b102eab0903817c5
# good: [7052fdd890bda0b3904674b69a1d24aec0a10d67] Code for actual 
protected-mode entry
git-bisect good 7052fdd890bda0b3904674b69a1d24aec0a10d67
# good: [f2d98ae63dc64dedb00499289e13a50677f771f9] Linker script for the new 
x86 setup code
git-bisect good f2d98ae63dc64dedb00499289e13a50677f771f9
# bad: [91a6c462b02d8dc02dbe95e5a407d78078a38d01] Use the new x86 setup code 
for x86-64; unify with i386
git-bisect bad 91a6c462b02d8dc02dbe95e5a407d78078a38d01
# bad: [4fd06960f120e02e9abc802a09f9511c400042a5] Use the new x86 setup code 
for i386
git-bisect bad 4fd06960f120e02e9abc802a09f9511c400042a5




Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-10 Thread Valdis . Kletnieks
On Sat, 11 Aug 2007 02:38:40 +0200, Segher Boessenkool said:
> >> That means GCC cannot compile Linux; it already optimises
> >> some accesses to scalars to smaller accesses when it knows
> >> it is allowed to.  Not often though, since it hardly ever
> >> helps in the cost model it employs.
> >
> > Please give an example code snippet + gcc version + arch
> > to back this up.
> 
>   unsigned char f(unsigned long *p)
>   {
>   return *p & 1;
>   }

Not really valid, because it's still able to do one atomic access to
compute the result.

Now, if you had found an example where it converts a 32-bit atomic access into
2 separate 16-bit accesses that weren't atomic as a whole
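
To make the distinction concrete, here is a small sketch; the function names are made up for illustration. In the plain version the compiler is free to load only the low byte of `*p`, since that cannot change the result. The second version goes through a volatile-qualified pointer, which GCC typically treats as a request for one access of the declared width. Neither form shows a split into several narrower loads, which is the case being asked about.

```c
/* Plain access: the compiler may narrow this to a single byte load. */
unsigned char low_bit_plain(unsigned long *p)
{
	return *p & 1;
}

/* Volatile access: typically emitted as one full-width load. */
unsigned char low_bit_volatile(unsigned long *p)
{
	return *(volatile unsigned long *)p & 1;
}
```

Both functions return the same value; any difference is only visible in the generated code, for example in the output of `gcc -O2 -S`.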




Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-10 Thread Paul E. McKenney
On Sat, Aug 11, 2007 at 08:54:46AM +0800, Herbert Xu wrote:
> Chris Snook <[EMAIL PROTECTED]> wrote:
> > 
> > cpu_relax() contains a barrier, so it should do the right thing.  For 
> > non-smp architectures, I'm concerned about interacting with interrupt 
> > handlers.  Some drivers do use atomic_* operations.
> 
> What problems with interrupt handlers? Access to int/long must
> be atomic or we're in big trouble anyway.

Reordering due to compiler optimizations.  CPU reordering does not
affect interactions with interrupt handlers on a given CPU, but
reordering due to compiler code-movement optimization does.  Since
volatile can in some cases suppress code-movement optimizations,
it can affect interactions with interrupt handlers.
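
A userspace analogue of this point, with a signal handler standing in for the interrupt handler. The names `handler_ran`, `on_signal` and `wait_for_handler` are made up for this sketch. Without the volatile qualifier the compiler could legitimately hoist the load of the flag out of the loop, because nothing in the loop body appears to change it.

```c
#include <signal.h>

/* Written by the handler, polled by the main flow on the same CPU. */
static volatile sig_atomic_t handler_ran;

static void on_signal(int sig)
{
	(void)sig;
	handler_ran = 1;
}

/*
 * Install the handler, trigger it, then poll the flag.  Returns 1 once
 * the handler's store has been observed, or -1 if setup failed.
 */
int wait_for_handler(void)
{
	handler_ran = 0;
	if (signal(SIGTERM, on_signal) == SIG_ERR)
		return -1;
	if (raise(SIGTERM) != 0)
		return -1;
	while (!handler_ran)
		;	/* volatile forces a fresh load on every iteration */
	return 1;
}
```

No CPU memory barrier is involved here: handler and poller run on the same CPU, so only the compiler's freedom to move or cache the access is at stake.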

Thanx, Paul


Re: [SCSI] aic94xx: new driver

2007-08-10 Thread Christoph Hellwig
On Fri, Aug 10, 2007 at 11:09:22PM +0800, David Woodhouse wrote:
> The files in /usr/include/scsi are actually shipped by glibc, and most
> distributions use glibc's version instead of the one from the kernel --
> so this additional userspace interface is automatically incompatible
> with most people's installations.

Stop here right now.  You just noticed the real bug, and that's exporting
scsi.h at all.  I think Olaf sent a patch to fix this already.



Re: [PATCH 3/20] Introduce MS_KERNMOUNT flag

2007-08-10 Thread Christoph Hellwig
On Fri, Aug 10, 2007 at 03:47:55PM +0400, [EMAIL PROTECTED] wrote:
> This flag tells the .get_sb callback that this is a kern_mount() call
> so that it can trust *data pointer to be valid in-kernel one. If this
> flag is passed from the user process, it is cleared since the *data
> pointer is not a valid kernel object.
> 
> Running a few steps forward - this will be needed for proc to create the
> superblock and store a valid pid namespace on it during the namespace
> creation. The reason, why the namespace cannot live without proc mount
> is described in the appropriate patch.

I don't like this at all.  We should never pass kernel and userspace
addresses through the same pointer.  Maybe add an additional argument
to the get_sb prototype instead.  But this whole idea of mounting /proc
from kernelspace sounds like a really bad idea to me.  /proc should
never be mounted from the kernel but always normally from userspace.
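
For reference, the masking that the quoted patch relies on can be sketched as follows. The flag value is the one in the quoted fs.h hunk; `strip_kernel_only_flags` is a made-up name, and the real do_mount() clears more bits than shown here.

```c
#define MS_KERNMOUNT (1UL << 25)  /* value from the quoted fs.h hunk */

/*
 * Clear the flag that only in-kernel callers may set, so that a user
 * process cannot claim its data pointer is a kernel object.
 */
unsigned long strip_kernel_only_flags(unsigned long flags)
{
	return flags & ~MS_KERNMOUNT;
}
```

The objection above still applies to this scheme: the same pointer carries either a kernel or a userspace address, and only the flag says which.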

> 
> Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
> Cc: Oleg Nesterov <[EMAIL PROTECTED]>
> 
> ---
> 
>  fs/namespace.c |3 ++-
>  fs/super.c |6 +++---
>  include/linux/fs.h |4 +++-
>  3 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff -upr linux-2.6.23-rc1-mm1.orig/fs/namespace.c 
> linux-2.6.23-rc1-mm1-7/fs/namespace.c
> --- linux-2.6.23-rc1-mm1.orig/fs/namespace.c  2007-07-26 16:34:45.0 
> +0400
> +++ linux-2.6.23-rc1-mm1-7/fs/namespace.c 2007-07-26 16:36:36.0 
> +0400
> @@ -1579,7 +1579,8 @@ long do_mount(char *dev_name, char *dir_
>   mnt_flags |= MNT_NOMNT;
>  
>   flags &= ~(MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_ACTIVE |
> -MS_NOATIME | MS_NODIRATIME | MS_RELATIME | MS_NOMNT);
> +MS_NOATIME | MS_NODIRATIME | MS_RELATIME |
> +MS_NOMNT | MS_KERNMOUNT);
>  
>   /* ... and get the mountpoint */
>   retval = path_lookup(dir_name, LOOKUP_FOLLOW, &nd);
> diff -upr linux-2.6.23-rc1-mm1.orig/fs/super.c 
> linux-2.6.23-rc1-mm1-7/fs/super.c
> --- linux-2.6.23-rc1-mm1.orig/fs/super.c  2007-07-26 16:34:45.0 
> +0400
> +++ linux-2.6.23-rc1-mm1-7/fs/super.c 2007-07-26 16:36:36.0 +0400
> @@ -944,9 +944,9 @@ do_kern_mount(const char *fstype, int fl
>   return mnt;
>  }
>  
> -struct vfsmount *kern_mount(struct file_system_type *type)
> +struct vfsmount *kern_mount_data(struct file_system_type *type, void *data)
>  {
> - return vfs_kern_mount(type, 0, type->name, NULL);
> + return vfs_kern_mount(type, MS_KERNMOUNT, type->name, data);
>  }
>  
> -EXPORT_SYMBOL(kern_mount);
> +EXPORT_SYMBOL_GPL(kern_mount_data);
> diff -upr linux-2.6.23-rc1-mm1.orig/include/linux/fs.h 
> linux-2.6.23-rc1-mm1-7/include/linux/fs.h
> --- linux-2.6.23-rc1-mm1.orig/include/linux/fs.h  2007-07-26 
> 16:34:45.0 +0400
> +++ linux-2.6.23-rc1-mm1-7/include/linux/fs.h 2007-07-26 16:36:36.0 
> +0400
> @@ -129,6 +129,7 @@ extern int dir_notify_enable;
>  #define MS_RELATIME  (1<<21) /* Update atime relative to mtime/ctime. */
>  #define MS_SETUSER   (1<<23) /* set mnt_uid to current user */
>  #define MS_NOMNT (1<<24) /* don't allow unprivileged submounts */
> +#define MS_KERNMOUNT (1<<25) /* this is a kern_mount call */
>  #define MS_ACTIVE(1<<30)
>  #define MS_NOUSER(1<<31)
>  
> @@ -1459,7 +1460,8 @@ void unnamed_dev_init(void);
>  
>  extern int register_filesystem(struct file_system_type *);
>  extern int unregister_filesystem(struct file_system_type *);
> -extern struct vfsmount *kern_mount(struct file_system_type *);
> +extern struct vfsmount *kern_mount_data(struct file_system_type *, void 
> *data);
> +#define kern_mount(type) kern_mount_data(type, NULL)
>  extern int may_umount_tree(struct vfsmount *);
>  extern int may_umount(struct vfsmount *);
>  extern void umount_tree(struct vfsmount *, int, struct list_head *);
> 


Re: [PATCH 0/7] Modify lguest32 to make room for lguest64

2007-08-10 Thread Rusty Russell
On Wed, 2007-08-08 at 20:32 -0400, Steven Rostedt wrote:
> Hi all,
> 
> I've been working on lguest64 and in order to do this, I had to move
> a lot of the i386 specific out of the way.  Well, the lguest64 port
> is still not ready to display, but before Rusty makes too many changes
> I would like this in upstream so I don't have to keep repeating my
> changes :-)
> 
> 
> So this patch series moves lguest32 out of the way for other archs.

Yeah, after some more thought I've not applied most of this.  We really
don't want to move everything then move it back; I prefer Jes' more
cautious approach of moving a little bit at a time.

We really have three parts: (1) bits that are generic, (2) bits that
should be generic but my implementation is naive, (3) bits that really
are i386-specific.

Patches which move 2 to 1 are gratefully accepted: I realize a mass move
is easier and this requires thought, but that's what we need.

Since I can't build a module over two directories, that seems to destroy
the idea of an i386/ subdir.  Instead I've done a patch which renames
the *clearly* i386-specific things to i386_, which at least works.
I've pushed it into the repository http://lguest.ozlabs.org/patches/

Cheers,
Rusty.



Re: [2.6 patch] cpqphp_ctrl.c: remove dead code

2007-08-10 Thread Christoph Hellwig
On Thu, Aug 09, 2007 at 03:47:02PM -0700, Kristen Carlson Accardi wrote:
> fine by me - let's NAK this patch (and all future ones for this driver) until 
> someone with hardware steps up to maintain this driver.  Eventually it
> will just die I guess.

Very bad idea.  For example, I sent a patch ages ago to remove kernel_thread
usage from the driver.  We need to get that patch in sooner or later because
the kernel_thread export will have to go away.  We're not going to block that
on this driver.


[RFC] Adding a TIF_KERNEL_TRACE to thread_info.h, s390 and ia64 8 bit limit

2007-08-10 Thread Mathieu Desnoyers
Hi,

I would like to add a TIF_KERNEL_TRACE that would have the same effect
as TIF_SYSCALL_TRACE, which is to call into do_syscall_trace when
enabled. It would be enabled by setting it in each thread info
structure (and protected against racy thread creation with proper flag
copy from the parent thread upon thread creation). The particularity of
TIF_KERNEL_TRACE is that it would be enabled dynamically system-wide
when kernel tracing is active.

The current similar flags that exist are TIF_SYSCALL_TRACE (for ptrace)
and TIF_SYSCALL_AUDIT (set by audit_alloc() at process creation if
auditing is enabled). However, touching these flags system-wide would
conflict with either syscall audit or ptrace, therefore the
introduction of a new thread flag looks like a plausible solution.

However, since the instructions used to test these flags are limited to
8 bits on some architectures, we run out of free flags at least on s390
and ia64.
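
The 289 in the s390 assembler errors is consistent with a three-flag mask that no longer fits an 8-bit immediate. A minimal sketch, assuming TIF_SYSCALL_TRACE is bit 0 on s390 (that value does not appear in the hunk in this mail; bits 5 and 8 do), with `trace_mask` and `fits_imm8` as made-up helper names:

```c
/* Bit numbers: 5 and 8 come from the s390 hunk; 0 is an assumption. */
#define TIF_SYSCALL_TRACE 0
#define TIF_SYSCALL_AUDIT 5
#define TIF_KERNEL_TRACE  8

/* Combined mask tested on the syscall entry path. */
unsigned int trace_mask(void)
{
	return (1u << TIF_SYSCALL_TRACE) |
	       (1u << TIF_SYSCALL_AUDIT) |
	       (1u << TIF_KERNEL_TRACE);
}

/* Nonzero if the mask can be encoded as an 8-bit immediate operand. */
int fits_imm8(unsigned int mask)
{
	return mask <= 0xffu;
}
```

With these values the mask is 0x121 = 289, which matches the reported operand. Any flag at bit 8 or above pushes the mask past 255.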

I would appreciate some comments about the idea in general, and how the
bitfield limitation should be overcome for s390 and ia64.

Some details about the problematic patches below.

Thanks,

Mathieu Desnoyers


On s390:

/home/compudj/git/linux-2.6-lttng/arch/s390/kernel/entry.S: Assembler messages:
/home/compudj/git/linux-2.6-lttng/arch/s390/kernel/entry.S:252: Error: operand 
out of range (289 is not between 0 and 255)
/home/compudj/git/linux-2.6-lttng/arch/s390/kernel/entry.S:362: Error: operand 
out of range (289 is not between 0 and 255)
make[2]: *** [arch/s390/kernel/entry.o] Error 1

when adding:

---
 include/asm-s390/thread_info.h |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/asm-s390/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-s390/thread_info.h 2007-07-30 
18:53:20.0 -0400
+++ linux-2.6-lttng/include/asm-s390/thread_info.h  2007-07-30 
18:53:24.0 -0400
@@ -96,6 +96,7 @@ static inline struct thread_info *curren
 #define TIF_SYSCALL_AUDIT  5   /* syscall auditing active */
 #define TIF_SINGLE_STEP6   /* deliver sigtrap on return to 
user */
 #define TIF_MCCK_PENDING   7   /* machine check handling is pending */
+#define TIF_KERNEL_TRACE   8   /* kernel trace active */
 #define TIF_USEDFPU16  /* FPU was used by this task this 
quantum (SMP) */
 #define TIF_POLLING_NRFLAG 17  /* true if poll_idle() is polling 
   TIF_NEED_RESCHED */
@@ -110,6 +111,7 @@ static inline struct thread_info *curren
 #define _TIF_SYSCALL_AUDIT (1

Re: [PATCH 5/7] Change lguest launcher to use asm generic include

2007-08-10 Thread Rusty Russell
On Wed, 2007-08-08 at 20:32 -0400, Steven Rostedt wrote:
> plain text document attachment
> (0005-Change-lguest-launcher-to-use-asm-generic-include-instead-of-explicitly.txt)
> Have the lguest launcher include e820.h via asm/e820.h instead of explicitly
> saying i386.
> 
> Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

Applied, thanks,

Rusty.




2.6.23-rc2-mm2 build error on MIPS and ARM

2007-08-10 Thread Mathieu Desnoyers
Hi Andrew,

I got the following errors when building 2.6.23-rc2-mm2 on both mips and
arm. Both errors are very much alike.

MIPS:

 
/opt/crosstool/gcc-3.4.5-glibc-2.3.6/mips-unknown-linux-gnu/lib/gcc/mips-unknown-linux-gnu/3.4.5/include
 -D__KERNEL__ -Iinclude -Iinclude2 -I/home/compudj/git/linux-2.6-lttng/include 
-include include/linux/autoconf.h -I/home/compudj/git/linux-2.6-lttng/. -I. 
-Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing 
-fno-common -Werror-implicit-function-declaration -Os -mabi=32 -G 0 
-mno-abicalls -fno-pic -pipe -msoft-float -ffreestanding -march=r5000 
-Wa,--trap -I/home/compudj/git/linux-2.6-lttng/include/asm-mips/mach-ip22 
-Iinclude/asm-mips/mach-ip22 
-I/home/compudj/git/linux-2.6-lttng/include/asm-mips/mach-generic 
-Iinclude/asm-mips/mach-generic -D"VMLINUX_LOAD_ADDRESS=0x88002000" 
-fomit-frame-pointer -Wdeclaration-after-statement  -D"KBUILD_STR(s)=#s" 
-D"KBUILD_BASENAME=KBUILD_STR(asm_offsets)"  
-D"KBUILD_MODNAME=KBUILD_STR(asm_offsets)" -fverbose-asm -S -o 
arch/mips/kernel/asm-offsets.s 
/home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c
In file included from 
/home/compudj/git/linux-2.6-lttng/include/linux/sched.h:58,
 from 
/home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:13:
/home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:115: error: syntax 
error before "pgprot_t"
/home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:115: warning: no 
semicolon at end of struct or union
/home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:161: error: syntax 
error before '}' token
/home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:175: error: syntax 
error before "pgd_t"
/home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:175: warning: no 
semicolon at end of struct or union
/home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:229: error: syntax 
error before '}' token
In file included from 
/home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:13:
/home/compudj/git/linux-2.6-lttng/include/linux/sched.h: In function `mmdrop':
/home/compudj/git/linux-2.6-lttng/include/linux/sched.h:1509: error: 
dereferencing pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/include/linux/sched.h: In function 
`arch_pick_mmap_layout':
/home/compudj/git/linux-2.6-lttng/include/linux/sched.h:1762: error: 
dereferencing pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/include/linux/sched.h:1763: error: 
dereferencing pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/include/linux/sched.h:1764: error: 
dereferencing pointer to incomplete type
In file included from 
/home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:14:
/home/compudj/git/linux-2.6-lttng/include/linux/mm.h: In function 
`vma_nonlinear_insert':
/home/compudj/git/linux-2.6-lttng/include/linux/mm.h:968: error: dereferencing 
pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/include/linux/mm.h:969: error: dereferencing 
pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/include/linux/mm.h: In function 
`find_vma_intersection':
/home/compudj/git/linux-2.6-lttng/include/linux/mm.h:1078: error: dereferencing 
pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/include/linux/mm.h: In function `vma_pages':
/home/compudj/git/linux-2.6-lttng/include/linux/mm.h:1085: error: dereferencing 
pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/include/linux/mm.h:1085: error: dereferencing 
pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c: In function 
`output_mm_defines':
/home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:220: error: 
dereferencing pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:221: error: 
dereferencing pointer to incomplete type
/home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:222: error: 
dereferencing pointer to incomplete type
make[2]: *** [arch/mips/kernel/asm-offsets.s] Error 1
make[1]: *** [prepare0] Error 2
make: *** [_all] Error 2


ARM:


  
/opt/crosstool/gcc-4.0.2-glibc-2.3.6/arm-unknown-linux-gnu/bin/arm-unknown-linux-gnu-gcc
 -Wp,-MD,arch/arm/kernel/.asm-offsets.s.d  -nostdinc -isystem 
/opt/crosstool/gcc-4.0.2-glibc-2.3.6/arm-unknown-linux-gnu/lib/gcc/arm-unknown-linux-gnu/4.0.2/include
 -D__KERNEL__ -Iinclude -Iinclude2 -I/home/compudj/git/linux-2.6-lttng/include 
-include include/linux/autoconf.h -mlittle-endian 
-I/home/compudj/git/linux-2.6-lttng/. -I. -Wall -Wundef -Wstrict-prototypes 
-Wno-trigraphs -fno-strict-aliasing -fno-common 
-Werror-implicit-function-declaration -Os -marm -fno-omit-frame-pointer -mapcs 
-mno-sched-prolog -mabi=apcs-gnu -mno-thumb-interwork -D__LINUX_ARM_ARCH__=4 
-march=armv4 -mtune=strongarm110 -msoft-float -Uarm -fno-omit-frame-pointer 
-fno-optimize-sibling-calls -Wdeclaration-after-statement -Wno-pointer-sign  
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_S

Re: [PATCH 00/25] move handling of setuid/gid bits from VFS into individual setattr functions (RESEND)

2007-08-10 Thread Christoph Hellwig
On Fri, Aug 10, 2007 at 04:47:52PM -0400, Jeff Layton wrote:
> attr->ia_valid after the setattr operation returns. If either ATTR_KILL_*
> bit is set then BUG(). The helper function already clears those bits
> so anything using it should automatically be ok. We'd have to fix
> up NFS and a few others that don't implement suid/sgid.
> 
> This is not as certain as changing the name of the inode operation. It
> would only pop when someone is attempting to change a setuid/setgid
> file on these filesystems. Still, it should conceivably catch most if
> not all offenders. Would that be sufficient to take care of everyone's
> concerns?

I like the idea of checking ia_valid after return a lot.  But instead of
going BUG() it should just do the default action, so that we can avoid
touching all the filesystems and only need to change those that need
special care.  I also have plans to add some new AT_ flags for implementing
some filesystem ioctls in generic code that would benefit greatly from
the ia_valid checking after return, to return ENOTTY for filesystems not
implementing those ioctls.
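
A rough sketch of that idea, in plain C so it can be compiled outside the kernel. Everything here is a stand-in: the structure, the flag values, the callback type and `notify_change_sketch` are hypothetical and only illustrate the control flow of "call the filesystem, then handle whatever bits it left set". The default action is also simplified compared to what the VFS really does for setgid.

```c
#define ATTR_KILL_SUID (1u << 11)  /* illustrative value */
#define ATTR_KILL_SGID (1u << 12)  /* illustrative value */

struct attr_sketch {
	unsigned int ia_valid;   /* bits the filesystem has not handled yet */
	unsigned int mode;       /* file mode, including setuid/setgid bits */
};

typedef int (*setattr_fn)(struct attr_sketch *attr);

/*
 * Call the filesystem's setattr, then apply the default action for any
 * ATTR_KILL_* bit it did not clear itself.
 */
int notify_change_sketch(setattr_fn fs_setattr, struct attr_sketch *attr)
{
	int err = fs_setattr(attr);

	if (err)
		return err;
	if (attr->ia_valid & ATTR_KILL_SUID)
		attr->mode &= ~04000u;   /* default: drop setuid */
	if (attr->ia_valid & ATTR_KILL_SGID)
		attr->mode &= ~02000u;   /* default: drop setgid (simplified) */
	attr->ia_valid &= ~(ATTR_KILL_SUID | ATTR_KILL_SGID);
	return 0;
}

/* A filesystem that ignores the kill bits entirely. */
int lazy_setattr(struct attr_sketch *attr)
{
	(void)attr;
	return 0;
}
```

A filesystem that needs special handling would clear the bits itself before returning, and the wrapper would then leave the mode alone.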


Process stuck in md_wakeup_thread

2007-08-10 Thread Russ Dill
On 2.6.22 from debian (stock), I have a process (dpkg) stuck with the following
calltrace:

SysRq : Show Blocked State

                         free                        sibling
  task             PC    stack   pid father child younger older
dpkg  D 0003 0 26040  20765 (NOTLB)
   e57d5e30 00200082  0003 dfc48ba8  dfc48ba8  
   0007 e0af45c0 e8ce17aa 0002827f 00051ec2 e0af46cc c1809980  
   e8ce1324 0002827f 00200082 f881cd4c 00200286 f8ba2c85 c1809980 e57d5e60 
Call Trace:
 [] md_wakeup_thread+0x26/0x28 [md_mod]
 [] raid5_unplug_device+0x4e/0x5a [raid456]
 [] io_schedule+0x1d/0x27
 [] sync_page+0x0/0x3b
 [] sync_page+0x38/0x3b
 [] __wait_on_bit_lock+0x2a/0x52
 [] __lock_page+0x58/0x5e
 [] wake_bit_function+0x0/0x3c
 [] truncate_inode_pages_range+0x201/0x256
 [] truncate_inode_pages+0x17/0x1a
 [] reiserfs_delete_inode+0x36/0xdd [reiserfs]
 [] reiserfs_delete_inode+0x0/0xdd [reiserfs]
 [] generic_delete_inode+0xa0/0x105
 [] iput+0x60/0x62
 [] do_unlinkat+0xb6/0x126
 [] syscall_call+0x7/0xb
 ===

My system is still up and running.




Re: [lm-sensors] 2.6.23-rc1 regression: hwmon/w83627ehf: wrong fan speed

2007-08-10 Thread David Hubbard
Hi Stefan, (Replying to everyone on the list, sorry!)

On 8/10/07, Stefan Richter <[EMAIL PROTECTED]> wrote:
> Should I hardwire correct dividers or pulse per rev in sensors.conf or
> is the driver supposed to work the correct dividers out --- like it did
> before 2.6.23-rc?

The dividers are read-only in userspace. The driver manages the
dividers automatically. The dividers are needed because the w83627ehf
chip only has an 8 bit register to count pulses for each fan. So if
the fan is moving slowly, you want the divider to be 128 so that every
pulse gets counted. If the fan is moving fast, you want the divider to
be 1 so that the register doesn't overflow. Once the register is read
in by the driver, the effect of the divider is cancelled out in
software so that you get an RPM reading from the fan. One side effect
of this is that a fast moving fan reports the RPM more quickly than a
slow moving fan.
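
As a rough illustration of how the divider is cancelled out in software, here is a sketch of the conversion. The 1350000 constant and the treatment of 0 and 255 as invalid readings are assumptions based on how Winbond-style hwmon drivers commonly do this; `fan_rpm` is a made-up name, not the driver's function.

```c
/*
 * Convert an 8-bit count register and its divider into RPM.
 * A count of 0 or 255 is treated as "no valid reading".
 */
unsigned int fan_rpm(unsigned char count, unsigned int divider)
{
	if (count == 0 || count == 255 || divider == 0)
		return 0;
	return 1350000u / (count * divider);
}
```

Under these assumptions, doubling the divider halves the count for the same fan speed, so the product of the two, and with it the reported RPM, stays the same.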

If you turn on HWMON debugging, the driver will report when it is
changing the divider in dmesg.

Hope that helps,
David


Re: Disk spin down issue on shut down/suspend to disk

2007-08-10 Thread Robert Hancock

Thomas Renninger wrote:

On Thu, 2007-08-09 at 15:16 +, Pavel Machek wrote:

Hi!


firmwarekit-discuss <[EMAIL PROTECTED]> (added to CC list)
see: http://linuxfirmwarekit.org/

But if I understand this problem right, this won't be easy.
The ACPI tables are just parsed with system ("iasl ...") and syntactical
errors/warnings are printed out.
I also thought about a test, interpreting the DSDT and read out values
of cpufreq tables and sanity check them. AFAIK the linuxfirmwarekit is
not designed for that atm. You need to compile in most parts of the
acpica code and parse and interpret DSDT/SSDT code yourself in the
firmwarekit core or inside a plugin, then do a walk_namespace call or
whatever to find the functions/parts you would like to examine. This is a lot
of work and needs a proper design (providing an interface to plugins to let
them easily check specific AML/ASL code).

Furthermore, we don't really know what we're looking for.  How can you
tell a given write to an ioport is issuing STANDBYNOW to an ATA disk or
trying to power the machine off?  Adding to the fun, many modern ATA
controllers have more than one way to issue a command.  Maybe we can
match accesses inside regions specified by PCI BARs  :-(

Hmmm... perhaps we should do it the other way. ACPI is allowed to
touch the embedded controller, what else? Maybe we should warn as soon
as ACPI touches a non-EC I/O port?


This is not working...
ACPI can and does access all kinds of other I/O ports and other
resources.
Hmm, are the disk accesses done by ACPI via OperationRegion/Field
declared variables?
I am trying to get a check in for those clashing with native drivers (hopefully
this approach is successful for 2.6.24, can't say for sure yet). I
wonder whether this one would give a warning like "Libata driver is
using the same SystemIO/SystemMem resources as ACPI OperationRegion
declaration XY".
This would not solve the problem, but at least show the need of such a
test. Such ACPI vs native driver interference problems are very hard
nuts (in identifying and solving).

Can someone please post an ASL code snippet showing how ACPI actually
accesses the disk, and in which parts/functions?


Again, it's not believed that this is being done via AML, but via a BIOS 
SMM trap on the ACPI sleep state hardware IO port. We have no real 
ability to find out what the BIOS is doing or prevent it in this case.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: [PATCH 3/4] Embed zone_id information within the zonelist->zones pointer

2007-08-10 Thread Andi Kleen
On Friday 10 August 2007 21:02, Christoph Lameter wrote:
> On Fri, 10 Aug 2007, Andi Kleen wrote:
> > > x86_64 does not support ZONE_HIGHMEM.
> >
> > I also plan to eliminate ZONE_DMA soon (and replace all its users
> > with a new allocator that sits outside the normal fallback lists)
>
> Hallelujah. You are my hero! x86_64 will switch off CONFIG_ZONE_DMA?

Yes. i386 too actually.

The DMA zone will be still there, but only reachable with special functions.

This is fine because the default zone protection heuristics keep DMA
nearly always free from !GFP_DMA allocations anyway -- so it doesn't make much
difference if it's totally unreachable. swiotlb will also use the same pool.

Also all callers are going to pass masks around so it's always clear
what address range they really need. Actually a lot of them
still pass 16MB simply because it is hard to find out what masks
old undocumented hardware really needs. But this could change.

This also means the DMA support in sl[a-z]b is not needed anymore.

I went through nearly all GFP_DMA users and found they're usually
happy enough with pages. If someone comes up who really needs
lots of subobjects, the right way for them would likely be extending
the pci pool allocator for this case. But I haven't found a need for this yet.

-Andi


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-10 Thread Herbert Xu
Chris Snook <[EMAIL PROTECTED]> wrote:
> 
> cpu_relax() contains a barrier, so it should do the right thing.  For 
> non-smp architectures, I'm concerned about interacting with interrupt 
> handlers.  Some drivers do use atomic_* operations.

What problems with interrupt handlers? Access to int/long must
be atomic or we're in big trouble anyway.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH] x86: make io-apic not connected pin print complete

2007-08-10 Thread Yinghai Lu
[PATCH] x86: make io-apic not connected pin print complete

Normally there are two segments of not-connected pins:
pin 0, and the pins after 15.

So we need to print " not connected.\n" for the previous segment
before printing the connected pins' info.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 050141c..a591679 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -874,6 +874,10 @@ static void __init setup_IO_APIC_irqs(void)
apic_printk(APIC_VERBOSE, ", %d-%d", 
mp_ioapics[apic].mpc_apicid, pin);
continue;
}
+   if (!first_notcon) {
+   apic_printk(APIC_VERBOSE, " not connected.\n");
+   first_notcon = 1;
+   }
 
irq = pin_2_irq(idx, apic, pin);
add_pin_to_irq(irq, apic, pin);
@@ -884,7 +888,7 @@ static void __init setup_IO_APIC_irqs(void)
}
 
if (!first_notcon)
-   apic_printk(APIC_VERBOSE," not connected.\n");
+   apic_printk(APIC_VERBOSE, " not connected.\n");
 }
 
 /*
diff --git a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
index 893df82..39cf860 100644
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -1301,6 +1301,11 @@ static void __init setup_IO_APIC_irqs(void)
continue;
}
 
+   if (!first_notcon) {
+   apic_printk(APIC_VERBOSE, " not connected.\n");
+   first_notcon = 1;
+   }
+
entry.trigger = irq_trigger(idx);
entry.polarity = irq_polarity(idx);
 


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-10 Thread Segher Boessenkool

>>>> That means GCC cannot compile Linux; it already optimises
>>>> some accesses to scalars to smaller accesses when it knows
>>>> it is allowed to.  Not often though, since it hardly ever
>>>> helps in the cost model it employs.
>>>
>>> Please give an example code snippet + gcc version + arch
>>> to back this up.
>>
>> unsigned char f(unsigned long *p)
>> {
>> return *p & 1;
>> }
>
> This doesn't really matter since we only care about the LSB.

It is exactly what I claimed, and what you asked proof of.

> Do you have an example where gcc reads it non-atomically and
> we care about all parts?

Like I explained in the original mail; no, I suspect such
a testcase will be really hard to construct, esp. as a small
testcase.  I have no reason to believe it is impossible to
do so though -- maybe someone else can write trickier code
than I can, in which case, please do so.


Segher



Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-10 Thread Herbert Xu
On Sat, Aug 11, 2007 at 02:38:40AM +0200, Segher Boessenkool wrote:
> >>That means GCC cannot compile Linux; it already optimises
> >>some accesses to scalars to smaller accesses when it knows
> >>it is allowed to.  Not often though, since it hardly ever
> >>helps in the cost model it employs.
> >
> >Please give an example code snippet + gcc version + arch
> >to back this up.
> 
>   unsigned char f(unsigned long *p)
>   {
>   return *p & 1;
>   }

This doesn't really matter since we only care about the LSB.
Do you have an example where gcc reads it non-atomically and
we care about all parts?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH] Make rcutorture RNG use locally grown entropy

2007-08-10 Thread Paul E. McKenney
This patch converts rcutorture's random-number generator from
get_random_bytes() (which has locking issues in some builds with patches)
to instead use local-to-rcutorture statistical counters.  This involves
reading other CPUs' statistics, so the frequency of entropy addition
is simultaneously decreased by an order of magnitude.
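
For readers who want to play with the idea outside the kernel, here is a
userspace sketch of the scheme just described. The multiplier and increment
are the ones in the patch; counters[], NSOURCES, RNG_REFRESH and rng_next()
are hypothetical stand-ins for the per-CPU statistics, the online-CPU walk,
RCU_RANDOM_REFRESH and rcu_random().

```c
#include <assert.h>
#include <stddef.h>

#define RNG_MULT	39916801UL	/* multiplier used by the patch */
#define RNG_ADD		479001701UL	/* increment used by the patch */
#define RNG_REFRESH	10		/* draws between top-ups (hypothetical value) */
#define NSOURCES	4		/* stand-in for the number of online CPUs */

struct rng_state {
	unsigned long state;
	long count;
	int src;	/* which counter the last refresh read */
};

/* Stand-in for the per-CPU statistics; other code would bump these. */
static unsigned long counters[NSOURCES];

static unsigned long rng_next(struct rng_state *r)
{
	assert(r != NULL);
	if (--r->count < 0) {
		/* Move to the next source so no single counter dominates. */
		r->src = (r->src + 1) % NSOURCES;
		r->state += counters[r->src];
		r->count = RNG_REFRESH;
	}
	/* Linear congruential step; unsigned overflow wraps by definition. */
	r->state = r->state * RNG_MULT + RNG_ADD;
	return r->state;
}
```

The refresh only perturbs the state, so the quality of the sequence still
rests on the linear congruential step; that is fine for a torture test and
not meant for anything security-related.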

This patch is an alternative to adding an EXPORT_SYMBOL_GPL() for the new
cpu_clock() API.

Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
---

 rcutorture.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff -urpNa -X dontdiff linux-2.6.22.1-rt4/kernel/rcutorture.c 
linux-2.6.22.1-rt4-rcutorturesched/kernel/rcutorture.c
--- linux-2.6.22.1-rt4/kernel/rcutorture.c  2007-07-21 16:58:22.0 
-0700
+++ linux-2.6.22.1-rt4-rcutorturesched/kernel/rcutorture.c  2007-08-10 
08:42:41.0 -0700
@@ -155,26 +155,27 @@ rcu_torture_free(struct rcu_torture *p)
 struct rcu_random_state {
unsigned long rrs_state;
long rrs_count;
+   int rrs_cpu;
 };
 
 #define RCU_RANDOM_MULT 39916801  /* prime */
 #define RCU_RANDOM_ADD 479001701 /* prime */
-#define RCU_RANDOM_REFRESH 1
+#define RCU_RANDOM_REFRESH 10
 
 #define DEFINE_RCU_RANDOM(name) struct rcu_random_state name = { 0, 0 }
 
 /*
  * Crude but fast random-number generator.  Uses a linear congruential
- * generator, with occasional help from get_random_bytes().
+ * generator, with occasional help from other CPUs' fast-running statistics.
  */
 static unsigned long
 rcu_random(struct rcu_random_state *rrsp)
 {
-   long refresh;
-
if (--rrsp->rrs_count < 0) {
-   get_random_bytes(&refresh, sizeof(refresh));
-   rrsp->rrs_state += refresh;
+   rrsp->rrs_cpu = next_cpu(rrsp->rrs_cpu, cpu_online_map);
+   if (rrsp->rrs_cpu >= NR_CPUS)
+   rrsp->rrs_cpu = 0;
+   rrsp->rrs_state += per_cpu(rcu_torture_count, rrsp->rrs_cpu)[0];
rrsp->rrs_count = RCU_RANDOM_REFRESH;
}
rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;


Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-10 Thread Segher Boessenkool

>> That means GCC cannot compile Linux; it already optimises
>> some accesses to scalars to smaller accesses when it knows
>> it is allowed to.  Not often though, since it hardly ever
>> helps in the cost model it employs.
>
> Please give an example code snippet + gcc version + arch
> to back this up.


unsigned char f(unsigned long *p)
{
return *p & 1;
}

with both

powerpc64-linux-gcc (GCC) 4.3.0 20070731 (experimental)

and

powerpc64-linux-gcc-4.2.0 (GCC) 4.2.0

(sorry, I don't have anything newer or older right now; if you
really care, I can test with those too)

generate (in 64-bit mode):

.L.f:
lbz 3,7(3)
rldicl 3,3,0,63
blr

and in 32-bit mode:

f:
stwu 1,-16(1)
nop
nop
lbz 3,3(3)
addi 1,1,16
rlwinm 3,3,0,31,31
blr

(the nops are because I use --with-cpu=970).


But perhaps you do not care for PowerPC, in which case:

i686-linux-gcc (GCC) 4.2.0 20060410 (experimental)

(sorry for the old version, I don't build x86 compilers
all that often; also I don't have a 64-bit version right
now):

f:
pushl   %ebp
movl    %esp, %ebp
movl    8(%ebp), %eax
popl    %ebp
movzbl  (%eax), %eax
andl    $1, %eax
ret


If you want testing with any other versions, and/or for
any other target architecture, I can do that; it takes a
few minutes to build a compiler.

It is quite hard to build a testcase that reads more than
one part of the "long", since for small testcases the
compiler will almost always be smart enough to do one
bigger read instead; but it certainly isn't inconceivable,
and anyway the compiler would be fully in its right to do
reads non-atomically if not instructed otherwise.
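
To reproduce the comparison, the testcase can be put next to a variant that
asks for the full-width load. This is a sketch: f_full() is a hypothetical
name, and whether a volatile-qualified access is a strong enough guarantee
is exactly what this thread is arguing about. Compile with -O2 -S and
compare the load instructions.

```c
#include <assert.h>
#include <stddef.h>

/* The testcase from this mail: only bit 0 is needed, so the compiler
 * is free to load just the byte that holds it. */
unsigned char f(unsigned long *p)
{
	return *p & 1;
}

/* Hypothetical variant: the access goes through a volatile-qualified
 * lvalue, which compilers conventionally perform at the declared width. */
unsigned char f_full(unsigned long *p)
{
	assert(p != NULL);
	return *(volatile unsigned long *)p & 1;
}
```

Both functions return the same value; the difference, if any, is only in
the width of the load the compiler emits.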


Segher



Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Paul Mackerras
Chris Snook writes:

> I'll do this for the whole patchset.  Stay tuned for the resubmit.

Could you incorporate Segher's patch to turn atomic_{read,set} into
asm on powerpc?  Segher claims that using asm is really the only
reliable way to ensure that gcc does what we want, and he seems to
have a point.

Paul.


Re: [PATCH] powerpc: Implement atomic{,64}_{read,write}() without volatile

2007-08-10 Thread Paul Mackerras
Segher Boessenkool writes:

> Instead, use asm() like all other atomic operations already do.
> 
> Also use inline functions instead of macros; this actually
> improves code generation (some code becomes a little smaller,
> probably because of improved alias information -- just a few
> hundred bytes total on a default kernel build, nothing shocking).
> 
> Signed-off-by: Segher Boessenkool <[EMAIL PROTECTED]>

Looks OK to me.  In the hope that Chris Snook will pick it up and
include it with his other atomic changes:

Acked-by: Paul Mackerras <[EMAIL PROTECTED]>


Re: CFS review

2007-08-10 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> * Roman Zippel <[EMAIL PROTECTED]> wrote:
> 
> > Well, I've sent him the stuff now...
> 
> received it - thanks a lot, looking at it!

everything looks good in your debug output and the TSC dump data, except 
for the wait_runtime values, they are quite out of balance - and that 
balance cannot be explained with jiffies granularity or with any sort of 
sched_clock() artifact. So this clearly looks like a CFS regression that 
should be fixed.

the only relevant thing that comes to mind at the moment is that last 
week Peter noticed a buggy aspect of sleeper bonuses (in that we do not 
rate-limit their output, hence we 'waste' them instead of redistributing 
them), and i've got the small patch below in my queue to fix that - 
could you give it a try?

this is just a blind stab into the dark - i couldn't see any real impact 
from that patch in various workloads (and it's not upstream yet), so it 
might not make a big difference. The trace you did (could you send the 
source for that?) seems to implicate sleeper bonuses though.

if this patch doesn't help, could you check the general theory whether 
it's related to sleeper-fairness, via turning it off:

   echo 30 > /proc/sys/kernel/sched_features

does the bug go away if you do that? If sleeper bonuses are showing too 
many artifacts then we could turn it off for final .23.

Ingo

->
Subject: sched: fix sleeper bonus
From: Ingo Molnar <[EMAIL PROTECTED]>

Peter Zijlstra noticed that the sleeper bonus deduction code was not 
properly rate-limited: a task that scheduled more frequently would get a 
disproportionately large deduction. So limit the deduction to delta_exec 
and limit production to runtime_limit.
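
The two limits can be sketched in plain C as follows. This is not the
scheduler code: scale_by_weight() is a hypothetical stand-in for
calc_delta_mine(), the function names are made up, and the
sysctl_sched_granularity guard around the deduction is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for calc_delta_mine(): scale by weight / total. */
static uint64_t scale_by_weight(uint64_t delta, uint64_t weight, uint64_t total)
{
	assert(total != 0);
	return delta * weight / total;
}

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Deduction side: returns what was taken out of *bonus. */
static uint64_t deduct_bonus(uint64_t *bonus, uint64_t delta_exec,
			     uint64_t weight, uint64_t total)
{
	/* Never deduct more than the time that was actually run. */
	uint64_t delta = min_u64(*bonus, delta_exec);

	delta = scale_by_weight(delta, weight, total);
	/* Never deduct more than the pool holds. */
	delta = min_u64(delta, *bonus);
	*bonus -= delta;
	return delta;
}

/* Production side: the pool is capped at limit. */
static void add_bonus(uint64_t *bonus, uint64_t delta_fair, uint64_t limit)
{
	*bonus += delta_fair;
	if (*bonus > limit)
		*bonus = limit;
}
```

With the first clamp in place, the deduction per call is bounded by
delta_exec before scaling, no matter how large the pool has grown.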

Not-Yet-Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched_fair.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -75,7 +75,7 @@ enum {
 
 unsigned int sysctl_sched_features __read_mostly =
SCHED_FEAT_FAIR_SLEEPERS*1 |
-   SCHED_FEAT_SLEEPER_AVG  *1 |
+   SCHED_FEAT_SLEEPER_AVG  *0 |
SCHED_FEAT_SLEEPER_LOAD_AVG *1 |
SCHED_FEAT_PRECISE_CPU_LOAD *1 |
SCHED_FEAT_START_DEBIT  *1 |
@@ -304,11 +304,9 @@ __update_curr(struct cfs_rq *cfs_rq, str
delta_mine = calc_delta_mine(delta_exec, curr->load.weight, lw);
 
if (cfs_rq->sleeper_bonus > sysctl_sched_granularity) {
-   delta = calc_delta_mine(cfs_rq->sleeper_bonus,
-   curr->load.weight, lw);
-   if (unlikely(delta > cfs_rq->sleeper_bonus))
-   delta = cfs_rq->sleeper_bonus;
-
+   delta = min(cfs_rq->sleeper_bonus, (u64)delta_exec);
+   delta = calc_delta_mine(delta, curr->load.weight, lw);
+   delta = min((u64)delta, cfs_rq->sleeper_bonus);
cfs_rq->sleeper_bonus -= delta;
delta_mine -= delta;
}
@@ -521,6 +519,8 @@ static void __enqueue_sleeper(struct cfs
 * Track the amount of bonus we've given to sleepers:
 */
cfs_rq->sleeper_bonus += delta_fair;
+   if (unlikely(cfs_rq->sleeper_bonus > sysctl_sched_runtime_limit))
+   cfs_rq->sleeper_bonus = sysctl_sched_runtime_limit;
 
schedstat_add(cfs_rq, wait_runtime, se->wait_runtime);
 }


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 01:30:55PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
> > On Fri, 10 Aug 2007 08:12:08 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> 
> > wrote:
> > 
> > > > One used to use sched_clock() for this, then get frowned at.  Now we
> > > > have cpu_clock()...
> > > 
> > > Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
> > > release.  Which means that the rate of API change in this area is a
> > > bit high, so I should avoid it like the plague.
> > 
> > eh, it's been there for weeks.  It is dust-encrusted.
> > 
> > >  Therefore, I should
> > > look for some other convenient source of entropy.
> > > 
> > > One convenient source would the per-CPU statistics that rcutorture
> > > maintains.  Of course, a given CPU's RNG is nearly in lock-step with
> > > its own statistics, but not with the adjacent CPU's statistics...
> > > 
> > > I will send a patch.
> > 
> > Please use cpu_clock().  It ain't going away.
> 
> D'accord...

Errmmm...  No joy.

ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!

Turns out that cpu_clock also ain't exported, and rcutorture.c is
a module.  Would adding an EXPORT_SYMBOL_GPL() as in the patch below
be acceptable?

If not, I have a tested patch to rcutorture.c that leverages statistical
counters.  Your choice.

Thanx, Paul

Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
Compiles, but not yet tested.

Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
---

 rcutorture.c |8 ++--
 sched.c  |2 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c 
linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
--- linux-2.6.23-rc2/kernel/rcutorture.c2007-08-03 19:49:55.0 
-0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c2007-08-10 
17:15:22.0 -0700
@@ -42,7 +42,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -166,16 +165,13 @@ struct rcu_random_state {
 
 /*
  * Crude but fast random-number generator.  Uses a linear congruential
- * generator, with occasional help from get_random_bytes().
+ * generator, with occasional help from cpu_clock().
  */
 static unsigned long
 rcu_random(struct rcu_random_state *rrsp)
 {
-   long refresh;
-
if (--rrsp->rrs_count < 0) {
-   get_random_bytes(&refresh, sizeof(refresh));
-   rrsp->rrs_state += refresh;
+   rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
rrsp->rrs_count = RCU_RANDOM_REFRESH;
}
rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c 
linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
--- linux-2.6.23-rc2/kernel/sched.c 2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c 2007-08-10 
17:22:57.0 -0700
@@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
return now;
 }
 
+EXPORT_SYMBOL_GPL(cpu_clock);
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 /* Change a task's ->cfs_rq if it moves across CPUs */
 static inline void set_task_cfs_rq(struct task_struct *p)


[PATCHv2] Fix to keep watchdog disabled by default for i386/x86_64

2007-08-10 Thread Daniel Gollub
Fix a wrong expression that enabled the watchdogs even when the nmi_watchdog 
kernel parameter was not set. This regression was introduced by commit 
b7471c6da94d30d3deadc55986cc38d1ff57f9ca.

Introduce NMI_DISABLED (-1), which makes it possible to change the value of 
NMI_DEFAULT without breaking the APIC NMI watchdog code (again).
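
The effect of the new constant can be checked in isolation. This is a
sketch: the constants are the ones from the patched nmi.h, but the two
helper functions are hypothetical condensations of the tests in
check_nmi_watchdog() and detect_init_APIC(), not kernel functions.

```c
#include <assert.h>

#define NMI_DISABLED	-1
#define NMI_NONE	0
#define NMI_IO_APIC	1
#define NMI_LOCAL_APIC	2
#define NMI_INVALID	3
#define NMI_DEFAULT	NMI_DISABLED

/* nmi_watchdog is an unsigned int, so -1 is compared after conversion. */
static int is_known_mode(unsigned int mode)
{
	return mode <= NMI_INVALID || mode == (unsigned int)NMI_DISABLED;
}

/* Nonzero: the watchdog self-check is skipped. */
static int watchdog_check_skipped(unsigned int mode)
{
	assert(is_known_mode(mode));
	return mode == NMI_NONE || mode == (unsigned int)NMI_DISABLED;
}

/* A disabled watchdog must stay disabled when the local APIC is found. */
static unsigned int mode_after_apic_detect(unsigned int mode)
{
	assert(is_known_mode(mode));
	if (mode != NMI_NONE && mode != (unsigned int)NMI_DISABLED)
		return NMI_LOCAL_APIC;
	return mode;
}
```

Because the tests name NMI_DISABLED rather than NMI_DEFAULT, pointing
NMI_DEFAULT at another mode later would not change what "disabled" means.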

Fixes:
https://bugzilla.novell.com/show_bug.cgi?id=298084
http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.

Resubmit: x86_64 changes compiled but untested. Shame on me!

Signed-off-by: Daniel Gollub <[EMAIL PROTECTED]>
---
 arch/i386/kernel/apic.c  |2 +-
 arch/i386/kernel/nmi.c   |4 ++--
 arch/x86_64/kernel/nmi.c |4 ++--
 include/asm-i386/nmi.h   |3 ++-
 include/asm-x86_64/nmi.h |3 ++-
 5 files changed, 9 insertions(+), 7 deletions(-)

diff -rup a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
--- a/arch/i386/kernel/apic.c   2007-08-04 04:49:55.0 +0200
+++ b/arch/i386/kernel/apic.c   2007-08-10 21:38:37.0 +0200
@@ -1087,7 +1087,7 @@ static int __init detect_init_APIC (void
if (l & MSR_IA32_APICBASE_ENABLE)
mp_lapic_addr = l & MSR_IA32_APICBASE_BASE;
 
-   if (nmi_watchdog != NMI_NONE)
+   if (nmi_watchdog != NMI_NONE && nmi_watchdog != NMI_DISABLED)
nmi_watchdog = NMI_LOCAL_APIC;
 
printk(KERN_INFO "Found and enabled local APIC!\n");
diff -rup a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
--- a/arch/i386/kernel/nmi.c2007-08-04 04:49:55.0 +0200
+++ b/arch/i386/kernel/nmi.c2007-08-10 22:00:40.0 +0200
@@ -77,7 +77,7 @@ static int __init check_nmi_watchdog(voi
unsigned int *prev_nmi_count;
int cpu;
 
-   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED))
return 0;
 
if (!atomic_read(&nmi_active))
@@ -424,7 +424,7 @@ int proc_nmi_enabled(struct ctl_table *t
if (!!old_state == !!nmi_watchdog_enabled)
return 0;
 
-   if (atomic_read(&nmi_active) < 0) {
+   if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
return -EIO;
}
diff -rup a/arch/x86_64/kernel/nmi.c b/arch/x86_64/kernel/nmi.c
--- a/arch/x86_64/kernel/nmi.c  2007-08-04 04:49:55.0 +0200
+++ b/arch/x86_64/kernel/nmi.c  2007-08-10 21:59:36.0 +0200
@@ -85,7 +85,7 @@ int __init check_nmi_watchdog (void)
int *counts;
int cpu;
 
-   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED)) 
return 0;
 
if (!atomic_read(&nmi_active))
@@ -442,7 +442,7 @@ int proc_nmi_enabled(struct ctl_table *t
if (!!old_state == !!nmi_watchdog_enabled)
return 0;
 
-   if (atomic_read(&nmi_active) < 0) {
+   if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
return -EIO;
}
diff -rup a/include/asm-i386/nmi.h b/include/asm-i386/nmi.h
--- a/include/asm-i386/nmi.h2007-08-04 04:49:55.0 +0200
+++ b/include/asm-i386/nmi.h2007-08-10 22:04:51.0 +0200
@@ -33,11 +33,12 @@ extern int nmi_watchdog_tick (struct pt_
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DISABLED	-1
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
+#define NMI_DEFAULT	NMI_DISABLED
 
 struct ctl_table;
 struct file;
diff -rup a/include/asm-x86_64/nmi.h b/include/asm-x86_64/nmi.h
--- a/include/asm-x86_64/nmi.h  2007-08-04 04:49:55.0 +0200
+++ b/include/asm-x86_64/nmi.h  2007-08-10 22:04:41.0 +0200
@@ -64,11 +64,12 @@ extern int setup_nmi_watchdog(char *);
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DISABLED	-1
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
+#define NMI_DEFAULT	NMI_DISABLED
 
 struct ctl_table;
 struct file;


[PATCH] powerpc: Implement atomic{,64}_{read,write}() without volatile

2007-08-10 Thread Segher Boessenkool
Instead, use asm() like all other atomic operations already do.

Also use inline functions instead of macros; this actually
improves code generation (some code becomes a little smaller,
probably because of improved alias information -- just a few
hundred bytes total on a default kernel build, nothing shocking).
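
A standalone version of the 32-bit accessors, for trying the idea outside a
kernel tree. The PowerPC asm is the one from the patch below; the
#else branches are an assumption added only so the sketch builds on other
targets, and they fall back to the volatile-style access this patch removes.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int counter; } atomic_t;

static inline int atomic_read(const atomic_t *v)
{
	int t;

	assert(v != NULL);
#if defined(__powerpc__)
	/* One lwz, with the update/indexed form picked by %U1%X1. */
	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
#else
	/* Assumed fallback for non-PowerPC builds of this sketch. */
	t = *(const volatile int *)&v->counter;
#endif
	return t;
}

static inline void atomic_set(atomic_t *v, int i)
{
	assert(v != NULL);
#if defined(__powerpc__)
	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
#else
	/* Assumed fallback for non-PowerPC builds of this sketch. */
	*(volatile int *)&v->counter = i;
#endif
}
```

Since the counter itself is no longer declared volatile, only these two
accessors constrain the compiler; plain uses of v->counter elsewhere would
be optimised like any other int.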

Signed-off-by: Segher Boessenkool <[EMAIL PROTECTED]>
---
 include/asm-powerpc/atomic.h |   34 --
 1 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/include/asm-powerpc/atomic.h b/include/asm-powerpc/atomic.h
index c44810b..bc17506 100644
--- a/include/asm-powerpc/atomic.h
+++ b/include/asm-powerpc/atomic.h
@@ -5,7 +5,7 @@
  * PowerPC atomic operations
  */
 
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
 
 #ifdef __KERNEL__
 #include 
@@ -15,8 +15,19 @@ typedef struct { volatile int counter; } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
 
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v,i)(((v)->counter) = (i))
+static __inline__ int atomic_read(const atomic_t *v)
+{
+   int t;
+
+   __asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
+
+   return t;
+}
+
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+   __asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
+}
 
 static __inline__ void atomic_add(int a, atomic_t *v)
 {
@@ -240,12 +251,23 @@ static __inline__ int atomic_dec_if_positive(atomic_t *v)
 
 #ifdef __powerpc64__
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i)   { (i) }
 
-#define atomic64_read(v)   ((v)->counter)
-#define atomic64_set(v,i)  (((v)->counter) = (i))
+static __inline__ long atomic64_read(const atomic64_t *v)
+{
+   long t;
+
+   __asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
+
+   return t;
+}
+
+static __inline__ void atomic64_set(atomic64_t *v, long i)
+{
+   __asm__ __volatile__("std%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
+}
 
 static __inline__ void atomic64_add(long a, atomic64_t *v)
 {
-- 
1.5.2.1.144.gabc40-dirty



Re: Use of directories to hold root?

2007-08-10 Thread H. Peter Anvin
Jan Engelhardt wrote:
> On Aug 10 2007 17:24, Mark Cannon wrote:
>> You pass the kernel the root option to specify the root partition.
>> Is there a way to identify a directory in that partition that holds the
>> root or something equivalent to this?
> 
> No, but you can use pivot_root.

Or better yet, use an initramfs with MS_MOVE; same as you would with the
"normal" use of initramfs.

-hpa


Re: [patch 1/2] spinlock: lockbreak cleanup

2007-08-10 Thread Andi Kleen

Nick,

These two patches make my P4 (single socket HT) test box not boot. I dropped 
them for now.

Some oopses

-Andi


NMI Watchdog detected LOCKUP on CPU 1
CPU 1 
Modules linked in:
Pid: 1648, comm: sh Not tainted 2.6.23-rc2-git3 #472
RIP: 0010:[]  [] _spin_lock+0x10/0x18
RSP: 0018:810001127f20  EFLAGS: 0097
RAX: df84 RBX: 8100398de040 RCX: 810001105850
RDX: 810080852000 RSI:  RDI: 810001017180
RBP: 810001127f58 R08: 1001 R09: 807c5180
R10: 0001 R11: 8030ed1e R12: 810001017180
R13: 8100398de040 R14: 0001 R15: 81003a6c3b48
FS:  2b0f1abcef60() GS:81003e0ffcc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0045b090 CR3: 3db5c000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process sh (pid: 1648, threadinfo 81003948, task 8100398de040)
Stack:  8022fd10 000104ab 8100398de040 8100398de040
  81003d086ac0 810001017180 0001
 8023b91a 807c25c8 810039481cd0 
 8021ae58 807c25c8 8021b461 81003a51b408
  0001 8020bfd6 810039481cd0  
 810039481de8 8030ed1e 8001 0206
 81003948 810001017180 81003948 81003da4c850
 8100398de040  ff10 80545e8a
 0010 0246 810039481d58 0018
 0086 80311570 8100398d91c0 81003ad83e50
 8100398de040 81003e0e0790 8100398de248 39481dc8
 81003da4c850 142e 8100398de040 81000100e208
 81003e0e0790  810039481e88 810039481e90
 81000100e180 810039481e68 0059a4f0 810039481df8
 8022f56c 810039481e08 80545f79 810039481e58
 80545fb3 0002 0292 81003db92000
 81003e0e0790 0001 81000100e180 81003e0e0790
 0001 810039481ed8 8022fa8f 810039481e68
 810039481e68 8100398de040 0001 0001
 81000101 810039481e98 810039481e98 81003db10be0
 0202 8100398d91c0 398d91c0 81003db92000
 0059a920 81003affd380 8028399b 810039481f58
 81003db92000 0059a920 0059a4f0 81003db92000
 0059a920 0059a8b0 8020a1ec 2b0f1a9a7628
 00594e20 0059a4f0 00599c01 00594e20
 8020b767 0059a8b0 0059a920 00594e20
 00599c01 0059a4f0 00594e20 0202
   2b0f1b28 003b
  0059a920 0059a4f0 00594e20
 003b 2b0f1aa33d97 0033 0202
 7fff906c42c8 002b
Call Trace:
   [] scheduler_tick+0x3e/0x149
 [] update_process_times+0x5c/0x68
 [] smp_local_timer_interrupt+0x34/0x55
 [] smp_apic_timer_interrupt+0x44/0x5b
 [] apic_timer_interrupt+0x66/0x70
   [] nfs_permission+0x0/0x1d1
 [] thread_return+0x58/0xd0
 [] nfs_file_open+0x0/0x7c
 [] __cond_resched+0x1c/0x44
 [] cond_resched+0x2e/0x39
 [] wait_for_completion+0x17/0xbe
 [] sched_exec+0xb3/0xce
 [] do_execve+0x5d/0x1a6
 [] sys_execve+0x36/0x8b
 [] stub_execve+0x67/0xb0


Code: 8a 07 0f ae e8 eb f3 c3 f0 81 2f 00 00 00 01 74 05 e8 a8 62 
Kernel panic - not syncing: Aiee, killing interrupt handler!


(another boot) 


NMI Watchdog detected LOCKUP on CPU 0
CPU 0 
Modules linked in:
Pid: 1193, comm: udevstart Not tainted 2.6.23-rc2-git3 #474
RIP: 0010:[]  [] _spin_lock+0x15/0x18
RSP: 0018:81003a6cf8d0  EFLAGS: 0002
RAX: 6a6b RBX: 807c6180 RCX: 
RDX:  RSI: 81003a6cf930 RDI: 81000100e180
RBP: 81003a6cf8f8 R08: 81003a7f9680 R09: 81003a628b48
R10: 0053b31b R11: 8030eb02 R12: 81000100e180
R13: 81003a6cf930 R14: 810001118100 R15: 
FS:  2b85cee96b00() GS:8072d000() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 2b6f5d8d6310 CR3: 3a524000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process udevstart (pid: 1193, threadinfo 81003a6ce000, task 
810001263890)
Stack:  8022c09b 0003 0001 810001118100
 806d6188 81003a6cf968 8022d166 3a6cf928
 0003 810001017180 810001017180 81003a6cf948
 0092 81003a6cf978 81003a043d20 00

Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-10 Thread Herbert Xu
On Fri, Aug 10, 2007 at 10:07:27PM +0200, Segher Boessenkool wrote:
> 
> That means GCC cannot compile Linux; it already optimises
> some accesses to scalars to smaller accesses when it knows
> it is allowed to.  Not often though, since it hardly ever
> helps in the cost model it employs.

Please give an example code snippet + gcc version + arch
to back this up.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Luck, Tony
> Here are the functions in which they occur in the object file. You
> may have to chase down some inlining to find the function that
> actually uses atomic_*().

Ignore this ... Andreas' patch was only two lines so I
thought I'd "save time" by just hand-editing the source over
on my build machine.  I managed to goof that by editing the
wrong  function for one of the cases. :-(

New result.  With Andreas's patch correctly applied, the generated
vmlinux is identical with/without your patch.

-Tony


Re: [PATCH] Fix to keep watchdog disabled by default for i386/x86_64

2007-08-10 Thread Andi Kleen

> +#deifne NMI_DEFAULT  NMI_DISABLED

Actually tested?

-Andi


Re: Documentation files in html format?

2007-08-10 Thread Rene Herman

On 08/10/2007 10:12 PM, Sam Ravnborg wrote:


What primary requirements does in-tree Linux kernel documentation have
to fulfill in general?


Skipping the obvious ones such as correct, up-to-date etc.
o Readable as-is
o Grepable
o buildable as structured documents or almost like a single book
o Easy to replicate structure
o Maintainable in any decent text-editor (emacs, vim, whatever)


Easy to put online?


Asciidoc is quite close to plaintext and it looks to me that the
formatting possibilities are quite good.

I spent an hour experimenting a little with
Documentation/kbuild/makefiles.txt.

Diff below shows quite a lot of changes but for the most
this is removal of the indent tab.
Most likely I could have tweaked asciidoc to accept this
but wanted to use default config.

The resulting html page can be seen here:
http://www.ravnborg.org/kbuild/makefiles.html


FWIW, this looks very good to me...

Rene.



Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls

2007-08-10 Thread Rick Jones

David Miller wrote:

From: Ben Greear <[EMAIL PROTECTED]>
Date: Fri, 10 Aug 2007 15:40:02 -0700



For GSO on output, is there a generic fallback for any driver that
does not specifically implement GSO?



Absolutely, in fact that's mainly what it's there for.

I don't think there is any issue.  The knob is there via
ethtool for people who really want to disable it.


Just to be paranoid (who me?) we are then at a point where what happened 
a couple months ago with forwarding between 10G and IPoIB won't happen 
again - where things failed because a 10G NIC had LRO enabled by default?


rick jones


Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK

2007-08-10 Thread Daniel Phillips
On 8/10/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> The idea of adding code to deal with "I have no memory" situations
> in a kernel that is based on having as much memory as possible in use at all
> times is plainly the wrong approach.

No.  It is you who have read the patches wrongly, because what you
imply here is exactly backwards.

> If you need memory then memory needs
> to be reclaimed. That is the basic way that things work

Wrong.  A naive reading of your comment would suggest you do not
understand how PF_MEMALLOC works, and that it has worked that way from
day one (well, since long before I arrived) and that we just do more
of the same, except better.

> and following that
> through brings about a much less invasive solution without all the issues
> that the proposed solution creates.

What issues?  Test case please, a real one that you have run yourself.
 Please, no more theoretical issues that cannot be demonstrated in
practice because they do not exist.

Regards,

Daniel


Re: [v4l-dvb-maintainer] [2.6 patch] dvb_frontend_ioctl(): fix check-after-use

2007-08-10 Thread Trent Piepho
On Fri, 10 Aug 2007, Markus Rechberger wrote:
> On 8/1/07, Manu Abraham <[EMAIL PROTECTED]> wrote:
> > On 7/31/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > > The Coverity checker spotted that we have already oops'ed if "fe" was
> > > NULL.
> > >
> > > --- linux-2.6.23-rc1-mm1/drivers/media/dvb/dvb-core/dvb_frontend.c.old
> > > +++ linux-2.6.23-rc1-mm1/drivers/media/dvb/dvb-core/dvb_frontend.c
> > > @@ -706,11 +706,11 @@ static int dvb_frontend_ioctl(struct ino
> > > -   if (!fe || fepriv->exit)
> > > +   if (fepriv->exit)
> > > return -ENODEV;
>
> This issue has been known for a while including some other problems at
> that part.
>
> http://article.gmane.org/gmane.linux.drivers.dvb/35351/match=patch+dvb_net+hotplugging+support
>
> this includes a link where this and more got discussed in May.

For dvb_net_close, I like the patch I already posted better.  To fix the
check-after-use, it's not the "use" part that's the problem, it's the "check"
part that isn't necessary.
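
To make the check-after-use pattern concrete, here is a minimal userspace sketch. The struct and function names (fake_frontend, fake_priv, ioctl_checked_late, ioctl_fixed) are hypothetical stand-ins, not the real dvb-core types. The point is only that once fe->priv has been read, a later "!fe" test cannot protect anything, so dropping the test is the right fix when callers guarantee a non-NULL pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the frontend and its private data. */
struct fake_priv {
	int exit;
};

struct fake_frontend {
	struct fake_priv *priv;
};

/*
 * Pattern the checker complains about: fe is dereferenced first, so
 * the "!fe" test below can never catch a NULL pointer.
 */
static int ioctl_checked_late(struct fake_frontend *fe)
{
	struct fake_priv *priv = fe->priv;	/* use */

	if (!fe || priv->exit)			/* check, too late */
		return -ENODEV;
	return 0;
}

/*
 * Fixed variant: the caller guarantees fe != NULL (stated here as an
 * assertion), so only the meaningful test remains.
 */
static int ioctl_fixed(struct fake_frontend *fe)
{
	struct fake_priv *priv;

	assert(fe != NULL);
	priv = fe->priv;
	if (priv->exit)
		return -ENODEV;
	return 0;
}
```

Both variants behave the same for valid pointers; the fixed one simply stops pretending to handle a case it cannot handle.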

I traced the dvb-net code, http://article.gmane.org/gmane.linux.kernel/543689,
and I'm sure that dvbdev can't be NULL.

My patch also deletes a few pieces of duplicated code by calling
dvb_generic_release().

The only problem is that practically no one uses dvb-net, so it's very hard to
test these patches.

In all the dvb code, where is the locking for device open and release?  I don't
see it.  What is preventing two threads from trying to open and/or close the
same dvb device at the same time?


Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Chris Snook

Linus Torvalds wrote:
> On Fri, 10 Aug 2007, Luck, Tony wrote:
> > Here are the functions in which they occur in the object file. You
> > may have to chase down some inlining to find the function that
> > actually uses atomic_*().
>
> Could you just make the "atomic_read()" and "atomic_set()" functions be
> inline functions instead?
>
> That way you get nice compiler warnings when you pass the wrong kind of
> object around.  So
>
>	static void atomic_set(atomic_t *p, int value)
>	{
>		*(volatile int *)&p->value = value;
>	}
>
>	static int atomic_read(atomic_t *p)
>	{
>		return *(volatile int *)&p->value;
>	}
>
> etc...


I'll do this for the whole patchset.  Stay tuned for the resubmit.

-- Chris


RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Linus Torvalds


On Fri, 10 Aug 2007, Luck, Tony wrote:
> 
> Here are the functions in which they occur in the object file. You
> may have to chase down some inlining to find the function that
> actually uses atomic_*().

Could you just make the "atomic_read()" and "atomic_set()" functions be 
inline functions instead?

That way you get nice compiler warnings when you pass the wrong kind of 
object around.  So

	static void atomic_set(atomic_t *p, int value)
	{
		*(volatile int *)&p->value = value;
	}

	static int atomic_read(atomic_t *p)
	{
		return *(volatile int *)&p->value;
	}

etc...
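
A self-contained sketch of why the function form helps, assuming a userspace stand-in for atomic_t with a field named "value" as in the snippet above. The kernel's real definitions differ per architecture, and the atomic64_t here is likewise a hypothetical stand-in. With functions, handing an atomic64_t pointer to atomic_read() draws an incompatible-pointer diagnostic from the compiler, whereas a macro that casts the field accepts it silently.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the field name follows the snippet above. */
typedef struct { int value; } atomic_t;
typedef struct { long long value; } atomic64_t;

static inline void atomic_set(atomic_t *p, int value)
{
	assert(p != NULL);
	*(volatile int *)&p->value = value;
}

static inline int atomic_read(atomic_t *p)
{
	assert(p != NULL);
	return *(volatile int *)&p->value;
}

/* Macro form for comparison: any struct with a "value" field is accepted. */
#define ATOMIC_READ_MACRO(p)	(*(volatile int *)&(p)->value)

/*
 * atomic64_t big;
 * atomic_read(&big);        -- compiler diagnostic: incompatible pointer type
 * ATOMIC_READ_MACRO(&big);  -- compiles, but reads only an int-sized part
 */
```

The commented-out calls are left as comments on purpose: the first is meant to be rejected by the compiler, and the second is exactly the kind of silent narrowing the functions are supposed to expose.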

Linus


Re: Documentation files in html format?

2007-08-10 Thread Sam Ravnborg
> 
> The problem I have with asciidoc is that it's a nightmare to get it
> to work. It's what GIT uses, and after spending a whole day trying
> to *build* that thing, I finally resigned and asked Junio if he could
> publish the pre-formatted manpages himself, which he agreed to.

Git uses docbook and a bit more in addition to asciidoc.
As asciidoc is just some Python scripts, it should be trivial to
install with no build required.
Maybe it was the docbook stuff you had trouble with?

My Kbuild example was made without using tools other than asciidoc, but
if PDF output is desired some additional tools are needed.

Sam


[PATCH] powerpc fix for assembler -g

2007-08-10 Thread Roland McGrath
ppc64 does the unusual thing of using #include on a compiler-generated
assembly file (lparmap.s) from an assembly source file (head_64.S).
This runs afoul of my recent patch to pass -gdwarf2 to the assembler
under CONFIG_DEBUG_INFO.  This patch avoids the problem by disabling
DWARF generation (-g0) when producing lparmap.s.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/powerpc/kernel/Makefile |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index f39a72f..b0cb2e6 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -81,6 +81,7 @@ obj-y += iomap.o
 endif
 
 ifeq ($(CONFIG_PPC_ISERIES),y)
+CFLAGS_lparmap.s   += -g0
 extra-y += lparmap.s
 $(obj)/head_64.o:  $(obj)/lparmap.s
 AFLAGS_head_64.o += -I$(obj)


RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Luck, Tony
> Possibly.  Either that or we've uncovered some latent bugs.  Maybe a 
> combination of the two.  Can you list those 19 changes so we can 
> evaluate them?

Here are the functions in which they occur in the object file. You
may have to chase down some inlining to find the function that
actually uses atomic_*().

freeque
do_msgrcv
sk_free
sock_wfree
sock_rfree
sock_kmalloc
sock_kfree_s
sock_setsockopt
skb_release_data
__sk_stream_mem_reclaim
sk_stream_mem_schedule
sk_stream_rfree
sk_attach_filter
ip_frag_destroy * 2
ip_frag_queue * 2
ip_frag_reasm * 2

-Tony



Re: Noatime vs relatime

2007-08-10 Thread Rene Herman

On 08/10/2007 05:10 PM, Matti Aarnio wrote:

> On Fri, Aug 10, 2007 at 07:26:46AM -0700, Vlad wrote:
> ...
> > "Warning: Atime will be disabled by default in future kernel versions,
> > but you will still be able to turn it on when configuring the kernel."
> >
> > This should give a heads-up to the 0.001% of people who still use
> > atime so that they know to customize this option or start using modern
> > file-monitoring techniques like inotify.
>
> NO for two reasons:
>   - atime semantics are just fine in server environments
>   - inotify IS NOT scalable to millions of files, nor
>     to situations where we want to check alteration weeks
>     or months after the fact
>
> In reality I would perhaps prefer mount-behaviour being altered
> from 'by default do atime' to 'by default do noatime'.

I must say I've been wondering about relatime a bit as well. Are there
actually users who do really want atime, but not badly enough to want real
atime?

I've been running with noatime for years now and do not plan on changing
that so have been shrugging this entire discussion off with "no care of
mine", but whose care _is_ it?

> There MUST be an easy way to tell system that "yes, I want to track
> last accesstime."

mount -o atime. Or as far as I'm concerned, keep the default as posixly
compliant as one wants and teach people and distributions to mount "noatime"
as I hear some have already been doing. I may be wrong, but to me, relatime
sounds like compromising for the sake of compromising...


Rene.



Re: early boot lockup with 2.6.23-rc1

2007-08-10 Thread H. Peter Anvin
Mikko Rapeli wrote:
> 
> Oops, I was wrong and bad enough to think nesting #ifdef's would work;
> 2.6.23-rc2 with query_mca() to query_edd() in arch/i386/boot/main.c
> commented out works.
> 
> Sorry about that one.
> 

OK, good.  That would be consistent with the current analysis.

Let me know what you get out of the test patch I sent.

-hpa


Re: CFS review

2007-08-10 Thread Roman Zippel
Hi,

On Fri, 10 Aug 2007, Ingo Molnar wrote:

> achieve that. It probably wont make a real difference, but it's really 
> easy for you to send and it's still very useful when one tries to 
> eliminate possibilities and when one wants to concentrate on the 
> remaining possibilities alone.

What I'm afraid of with CFS is its possible unpredictability, which
would make problems hard to reproduce, so we may end up with users who have
unexplainable weird problems. That's the main reason I'm trying so hard to
push for a design discussion.

Just to give an idea here are two more examples of irregular behaviour, 
which are hopefully easier to reproduce.

1. Two simple busy loops, one of them reniced to 15. According to my
calculations the reniced task should get about 3.4% (1/(1.25^15+1)), but I
get this:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4433 roman 20   0  1532  300  244 R 99.2  0.2   5:05.51 l
 4434 roman 35  15  1532   72   16 R  0.7  0.1   0:10.62 l

OTOH up to nice level 12 I get what I expect.
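
For reference, the expected figure can be reproduced with a few lines of C. This is only a sketch of the arithmetic above, assuming each nice level scales a task's weight by a factor of 1.25 (the factor used in the formula). The helper names expected_share and pow_int are illustrative, not scheduler code.

```c
#include <assert.h>

/* base raised to a non-negative integer power, avoiding libm */
static double pow_int(double base, int exp)
{
	double result = 1.0;

	assert(exp >= 0);
	while (exp-- > 0)
		result *= base;
	return result;
}

/*
 * CPU share of a task reniced by nice_delta levels competing with one
 * nice-0 task, assuming a weight ratio of 1.25 per nice level:
 * share = 1 / (1.25^nice_delta + 1)
 */
static double expected_share(int nice_delta)
{
	return 1.0 / (pow_int(1.25, nice_delta) + 1.0);
}
```

expected_share(15) comes out at roughly 0.034, i.e. the 3.4% quoted above, against the 0.7% that top reports for the reniced task.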

2. If I start 20 busy loops, initially I see in top that every task gets 
5% and time increments equally (as it should):

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4492 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4491 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4490 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4489 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4488 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4487 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4486 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4485 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4484 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4483 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4482 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4481 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4480 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4479 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4478 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4477 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4476 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4475 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4474 roman 20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4473 roman 20   0  1532  296  244 R  5.0  0.2   0:02.86 l

But if I renice all of them to -15, the time every task gets is rather 
random:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4492 roman  5 -15  1532   68   16 R  1.0  0.1   0:07.95 l
 4491 roman  5 -15  1532   68   16 R  4.3  0.1   0:07.62 l
 4490 roman  5 -15  1532   68   16 R  3.3  0.1   0:07.50 l
 4489 roman  5 -15  1532   68   16 R  7.6  0.1   0:07.80 l
 4488 roman  5 -15  1532   68   16 R  9.6  0.1   0:08.31 l
 4487 roman  5 -15  1532   68   16 R  3.3  0.1   0:07.59 l
 4486 roman  5 -15  1532   68   16 R  6.6  0.1   0:07.08 l
 4485 roman  5 -15  1532   68   16 R 10.0  0.1   0:07.31 l
 4484 roman  5 -15  1532   68   16 R  8.0  0.1   0:07.30 l
 4483 roman  5 -15  1532   68   16 R  7.0  0.1   0:07.34 l
 4482 roman  5 -15  1532   68   16 R  1.0  0.1   0:05.84 l
 4481 roman  5 -15  1532   68   16 R  1.0  0.1   0:07.16 l
 4480 roman  5 -15  1532   68   16 R  3.3  0.1   0:07.00 l
 4479 roman  5 -15  1532   68   16 R  1.0  0.1   0:06.66 l
 4478 roman  5 -15  1532   68   16 R  8.6  0.1   0:06.96 l
 4477 roman  5 -15  1532   68   16 R  8.6  0.1   0:07.63 l
 4476 roman  5 -15  1532   68   16 R  9.6  0.1   0:07.38 l
 4475 roman  5 -15  1532   68   16 R  1.3  0.1   0:07.09 l
 4474 roman  5 -15  1532   68   16 R  2.3  0.1   0:07.97 l
 4473 roman  5 -15  1532  296  244 R  1.0  0.2   0:07.73 l

bye, Roman


Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls

2007-08-10 Thread David Miller
From: Ben Greear <[EMAIL PROTECTED]>
Date: Fri, 10 Aug 2007 15:40:02 -0700

> For GSO on output, is there a generic fallback for any driver that
> does not specifically implement GSO?

Absolutely, in fact that's mainly what it's there for.

I don't think there is any issue.  The knob is there via
ethtool for people who really want to disable it.


Re: Partition information lost on reboot.

2007-08-10 Thread Rene Herman

On 08/10/2007 02:30 PM, Michal Piotrowski wrote:

> [Adding linux-scsi and Adaptec support to CC]
>
> On 10/08/07, Jegadeesh <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I have a scsi disk on Adaptec ASC-29320 U320. I have created a linux
> > partition and ext3 filesystem over it.
> > Now the problem is, whenever the machine is rebooted, the partition
> > information to the OS is lost and I get an error saying it as a not valid
> > block device.
> > But fdisk tool shows the partitions, but "cat /proc/partitions" doesnt have
> > this. I need to do a "partprobe" and then have to mount it explicitly.
> >
> > What could be causing this problem. Given below are some of the command
> > outputs.

Is that the "Adaptec AIC79xx U320 support" (CONFIG_SCSI_AIC79XX) driver? If
so, did you lower the "Initial bus reset delay" (default is 5000 ms) in the
kernel configuration?


Rene.


Re: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Chris Snook

Luck, Tony wrote:
> > Use atomic64_read to read an atomic64_t.
>
> Thanks Andreas!
>
> Chris: This bug is why the 8-byte loads got changed to 4-byte + sign-extend
> by your change to atomic_read().

I figured as much.  Thanks for confirming this.

> With this applied together with shuffling the volatile from the
> declaration to the usage (in both atomic_read() and atomic_set()
> the generated code *almost* reverts to the original.
>
> There are some differences where ld4 have turned into ld8 though.
> Are these bugs in the use of atomic_add() and atomic_sub().  E.g.
> the first of these changes is in: ipc/msg.c:freeque() where we have:
>
>	atomic_sub(msg->q_cbytes, &msg_bytes);
>
> Now the type of msg->q_cbytes is "unsigned long" ... so it seems a
> poor idea to subtract such a large typed object from "msg_bytes"
> which is a mere slip of an atomic_t.
>
> Or is there some other type-wrangling that needs to happen in
> include/asm-ia64/atomic.h?  There are a total of nineteen of
> these ld4->ld8 transforms.

Possibly.  Either that or we've uncovered some latent bugs.  Maybe a
combination of the two.  Can you list those 19 changes so we can
evaluate them?  I'm told there were some *(volatile *) bugs fixed in gcc
recently, so it's also possible your 3.4.6 is showing those.  I can test
that on a more recent gcc on ia64 if it's inconvenient for you to do so
on your test box.


-- Chris


Re: 2.6.23-rc1 regression: hwmon/w83627ehf: wrong fan speed

2007-08-10 Thread Stefan Richter
I wrote:
> # sensors
> w83627ehf-isa-0290
> Adapter: ISA adapter
> VCore: +0.95 V  (min =  +0.00 V, max =  +1.74 V)
> in1:  +12.30 V  (min =  +1.64 V, max =  +3.22 V) ALARM
> AVCC:  +3.28 V  (min =  +1.89 V, max =  +1.94 V) ALARM
> 3VCC:  +3.26 V  (min =  +0.18 V, max =  +0.72 V) ALARM
> in4:   +1.58 V  (min =  +0.57 V, max =  +0.90 V) ALARM
> in5:   +1.70 V  (min =  +0.41 V, max =  +1.19 V) ALARM
> in6:   +3.43 V  (min =  +0.31 V, max =  +3.05 V) ALARM
> VSB:   +3.25 V  (min =  +0.37 V, max =  +3.01 V) ALARM
> VBAT:  +3.18 V  (min =  +3.94 V, max =  +0.74 V) ALARM
> in9:   +1.88 V  (min =  +0.79 V, max =  +1.40 V) ALARM
> Case Fan:0 RPM  (min =  753 RPM, div = 128) ALARM
> CPU Fan:88 RPM  (min =  659 RPM, div = 64) ALARM
> Aux Fan: 0 RPM  (min = 10546 RPM, div = 128) ALARM
> fan5:0 RPM  (min =  753 RPM, div = 128) ALARM
> Sys Temp:+44 C  (high =-5 C, hyst =   -34 C)   ALARM
> CPU Temp:  +38.0 C  (high = +80.0 C, hyst = +75.0 C)
> AUX Temp:  +43.5 C  (high = +80.0 C, hyst = +75.0 C)
> 
> coretemp-isa-
> Adapter: ISA adapter
> 
> coretemp-isa-0001
> Adapter: ISA adapter
...
> I'll reboot in a minute into 2.6.22(-rc5) and post the "sensors" output.

# sensors
w83627ehf-i2c-9191-290
 ERROR: Can't get adapter or algorithm?!?
VCore: +0.95 V  (min =  +0.00 V, max =  +1.74 V)
in1:  +12.20 V  (min =  +1.64 V, max =  +3.22 V) ALARM
AVCC:  +3.26 V  (min =  +1.89 V, max =  +1.94 V) ALARM
3VCC:  +3.26 V  (min =  +0.18 V, max =  +0.72 V) ALARM
in4:   +1.58 V  (min =  +0.57 V, max =  +0.90 V) ALARM
in5:   +1.71 V  (min =  +0.41 V, max =  +1.19 V) ALARM
in6:   +3.43 V  (min =  +0.31 V, max =  +3.05 V) ALARM
VSB:   +3.26 V  (min =  +0.37 V, max =  +3.01 V) ALARM
VBAT:  +3.18 V  (min =  +3.94 V, max =  +0.74 V) ALARM
in9:   +1.88 V  (min =  +0.79 V, max =  +1.40 V) ALARM
Case Fan:  484 RPM  (min = 84375 RPM, div = 16) ALARM
CPU Fan:  1424 RPM  (min = 21093 RPM, div = 4) ALARM
Aux Fan: 0 RPM  (min = 10546 RPM, div = 128) ALARM
fan5:0 RPM  (min = 10546 RPM, div = 128) ALARM
Sys Temp:+45 C  (high =-5 C, hyst =   -34 C)   ALARM
CPU Temp:  +39.5 C  (high = +80.0 C, hyst = +75.0 C)
AUX Temp:  +44.5 C  (high = +80.0 C, hyst = +75.0 C)

coretemp-isa-
Adapter: ISA adapter

coretemp-isa-0001
Adapter: ISA adapter

-- 
Stefan Richter
-=-=-=== =--- -=-==
http://arcgraph.de/sr/


Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls

2007-08-10 Thread Ben Greear

David Miller wrote:
> From: Ben Greear <[EMAIL PROTECTED]>
>
> > I believe LRO is going to have to be disabled for routing/bridging,
> > so the stack will probably need to become aware of it at some point...
>
> The packet will be GSO'd on output I believe, so it won't
> break anything.
>
> Alternatively, we could make the driver only LRO accumulate if the
> packet is unicast and matches one of the MAC's programmed into the
> chip.

I think even this would fail if you are doing something clever with
NAT or other iptables stuff.  Probably we're going to have to put this
in the hands of the users, who hopefully can determine whether they
can allow LRO or not...

For GSO on output, is there a generic fallback for any driver that
does not specifically implement GSO?

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com



Re: early boot lockup with 2.6.23-rc1

2007-08-10 Thread Mikko Rapeli
On Fri, Aug 10, 2007 at 10:20:31PM +0300, Mikko Rapeli wrote:
> On Fri, Aug 10, 2007 at 09:45:31AM -0700, H. Peter Anvin wrote:
> > Let me get this straight... "edd=skipmbr" boots fine, but commenting out
> > the call to query_edd() didn't?  Could you please try that (and, I
> > guess, only that), and make sure everything necessary is rebuild.
> >
> > 2.6.23-*rc2* you say boots fine with "edd=skipmbr", but not without?
> 
> Yes, vanilla 2.6.23-rc2 with edd=skipmbr boots fine.
> 
> > Did you try the above commenting-out on rc2?
> 
> Yes, didn't work with 2.6.23-rc2 but printed one dot in the upper left
> corner after grub stuff.

Oops, I was wrong and bad enough to think nesting #ifdef's would work;
2.6.23-rc2 with query_mca() to query_edd() in arch/i386/boot/main.c
commented out works.

Sorry about that one.

-Mikko


Re: 2.6.23-rc2-mm2

2007-08-10 Thread John W. Linville
On Fri, Aug 10, 2007 at 01:20:19PM -0700, Andrew Morton wrote:

> git-wireless now has the usual git catastrophe when merging it against the
> recently-discovered net-2.6.24 tree, so I'll need to do something about
> that first.

I have rebased the wireless-dev tree, and the mm-master branch there
should specifically avoid these merge conflicts.

Hth!

John
-- 
John W. Linville
[EMAIL PROTECTED]


RE: [PATCH 9/24] make atomic_read() behave consistently on ia64

2007-08-10 Thread Luck, Tony
> Use atomic64_read to read an atomic64_t.

Thanks Andreas!

Chris: This bug is why the 8-byte loads got changed to 4-byte + sign-extend
by your change to atomic_read().

With this applied together with shuffling the volatile from the
declaration to the usage (in both atomic_read() and atomic_set()
the generated code *almost* reverts to the original.

There are some differences where ld4 have turned into ld8 though.
Are these bugs in the use of atomic_add() and atomic_sub().  E.g.
the first of these changes is in: ipc/msg.c:freeque() where we have:

atomic_sub(msg->q_cbytes, &msg_bytes);

Now the type of msg->q_cbytes is "unsigned long" ... so it seems a
poor idea to subtract such a large typed object from "msg_bytes"
which is a mere slip of an atomic_t.

Or is there some other type-wrangling that needs to happen in
include/asm-ia64/atomic.h?  There are a total of nineteen of
these ld4->ld8 transforms.
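
The concern can be illustrated in plain C. The sketch below is a model under stated assumptions, not the ia64 implementation: fake_atomic_t stands in for an int-sized atomic_t, and fits_in_atomic and fake_atomic_sub are hypothetical helpers showing that an "unsigned long" amount has to be range-checked (or the counter widened) before it is subtracted from an int-sized counter.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for an int-sized atomic counter. */
typedef struct { int counter; } fake_atomic_t;

/* Nonzero if an unsigned long amount is representable as an int. */
static int fits_in_atomic(unsigned long amount)
{
	return amount <= (unsigned long)INT_MAX;
}

/*
 * Subtract amount from v only when no narrowing occurs.
 * Returns 0 on success, -1 if the amount does not fit in an int.
 * (Not actually atomic; this models the types only.)
 */
static int fake_atomic_sub(unsigned long amount, fake_atomic_t *v)
{
	assert(v != NULL);
	if (!fits_in_atomic(amount))
		return -1;
	v->counter -= (int)amount;
	return 0;
}
```

In real code the alternative to rejecting the value is to make the counter itself 64 bits wide, which is what atomic64_t is for.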

-Tony


Re: serial patches from -mm

2007-08-10 Thread Andrew Morton
On Fri, 10 Aug 2007 14:38:35 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:

> 
> I'll send these
> 
> dont-optimise-away-baud-rate-changes-when-bother-is-used.patch
> serial-add-support-for-ite-887x-chips.patch
> serial_txx9-fix-modem-control-line-handling.patch
> serial_txx9-cleanup-includes.patch
> serial-8250-handle-saving-the-clear-on-read-bits-from-the-lsr.patch
> add-blacklisting-capability-to-serial_pci-to-avoid-misdetection.patch
> 
> for review, please.
> 
> I've identified these as not-for-2.6.23 which may of course have been
> incorrect.
> 

Based on an Alan ack and my own review I have queued these:

dont-optimise-away-baud-rate-changes-when-bother-is-used.patch
serial-add-support-for-ite-887x-chips.patch
serial_txx9-fix-modem-control-line-handling.patch
serial-8250-handle-saving-the-clear-on-read-bits-from-the-lsr.patch
add-blacklisting-capability-to-serial_pci-to-avoid-misdetection.patch

for 2.6.23 and this:

serial_txx9-cleanup-includes.patch

for 2.6.24.



[PATCH] Fix to keep watchdog disabled by default for i386/x86_64

2007-08-10 Thread Daniel Gollub
Fix a wrong expression which enabled the watchdogs even if the nmi_watchdog
kernel parameter wasn't set. This regression was introduced by commit
b7471c6da94d30d3deadc55986cc38d1ff57f9ca.

Introduce NMI_DISABLED (-1), which allows switching the value of NMI_DEFAULT
without breaking the APIC NMI watchdog code (again).

Fixes:
https://bugzilla.novell.com/show_bug.cgi?id=298084
http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.

Signed-off-by: Daniel Gollub <[EMAIL PROTECTED]>
---
diff -rup a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
--- a/arch/i386/kernel/apic.c   2007-08-04 04:49:55.0 +0200
+++ b/arch/i386/kernel/apic.c   2007-08-10 21:38:37.0 +0200
@@ -1087,7 +1087,7 @@ static int __init detect_init_APIC (void
if (l & MSR_IA32_APICBASE_ENABLE)
mp_lapic_addr = l & MSR_IA32_APICBASE_BASE;
 
-   if (nmi_watchdog != NMI_NONE)
+   if (nmi_watchdog != NMI_NONE && nmi_watchdog != NMI_DISABLED)
nmi_watchdog = NMI_LOCAL_APIC;
 
printk(KERN_INFO "Found and enabled local APIC!\n");
diff -rup a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
--- a/arch/i386/kernel/nmi.c2007-08-04 04:49:55.0 +0200
+++ b/arch/i386/kernel/nmi.c2007-08-10 22:00:40.0 +0200
@@ -77,7 +77,7 @@ static int __init check_nmi_watchdog(voi
unsigned int *prev_nmi_count;
int cpu;
 
-   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED))
return 0;
 
if (!atomic_read(&nmi_active))
@@ -424,7 +424,7 @@ int proc_nmi_enabled(struct ctl_table *t
if (!!old_state == !!nmi_watchdog_enabled)
return 0;
 
-   if (atomic_read(&nmi_active) < 0) {
+   if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
return -EIO;
}
diff -rup a/arch/x86_64/kernel/nmi.c b/arch/x86_64/kernel/nmi.c
--- a/arch/x86_64/kernel/nmi.c  2007-08-04 04:49:55.0 +0200
+++ b/arch/x86_64/kernel/nmi.c  2007-08-10 21:59:36.0 +0200
@@ -85,7 +85,7 @@ int __init check_nmi_watchdog (void)
int *counts;
int cpu;
 
-   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+   if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED)) 
return 0;
 
if (!atomic_read(&nmi_active))
@@ -442,7 +442,7 @@ int proc_nmi_enabled(struct ctl_table *t
if (!!old_state == !!nmi_watchdog_enabled)
return 0;
 
-   if (atomic_read(&nmi_active) < 0) {
+   if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
return -EIO;
}
diff -rup a/include/asm-i386/nmi.h b/include/asm-i386/nmi.h
--- a/include/asm-i386/nmi.h2007-08-04 04:49:55.0 +0200
+++ b/include/asm-i386/nmi.h2007-08-10 22:04:51.0 +0200
@@ -33,11 +33,12 @@ extern int nmi_watchdog_tick (struct pt_
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT -1
+#define NMI_DISABLED	-1
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
+#define NMI_DEFAULT	NMI_DISABLED
 
 struct ctl_table;
 struct file;
diff -rup a/include/asm-x86_64/nmi.h b/include/asm-x86_64/nmi.h
--- a/include/asm-x86_64/nmi.h  2007-08-04 04:49:55.0 +0200
+++ b/include/asm-x86_64/nmi.h  2007-08-10 22:04:41.0 +0200
@@ -64,11 +64,12 @@ extern int setup_nmi_watchdog(char *);
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DISABLED	-1
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
+#define NMI_DEFAULT	NMI_DISABLED
 
 struct ctl_table;
 struct file;
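
The effect of the apic.c hunk can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: the constants mirror the patched header, pick_watchdog is a hypothetical helper standing in for the test in detect_init_APIC(), and a plain int is used where the kernel declares nmi_watchdog as unsigned int.

```c
#include <assert.h>

#define NMI_DISABLED	(-1)
#define NMI_NONE	0
#define NMI_IO_APIC	1
#define NMI_LOCAL_APIC	2
#define NMI_INVALID	3
#define NMI_DEFAULT	NMI_DISABLED

/*
 * Model of the patched test: a detected local APIC only becomes the
 * NMI watchdog source when the watchdog is neither switched off
 * (NMI_NONE) nor left at the disabled default (NMI_DISABLED).
 */
static int pick_watchdog(int nmi_watchdog)
{
	assert(nmi_watchdog >= NMI_DISABLED && nmi_watchdog <= NMI_INVALID);
	if (nmi_watchdog != NMI_NONE && nmi_watchdog != NMI_DISABLED)
		return NMI_LOCAL_APIC;
	return nmi_watchdog;
}
```

With NMI_DEFAULT aliased to NMI_DISABLED, a system booted without any nmi_watchdog parameter keeps its watchdog off, which is the behaviour the patch description asks for.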



Re: 2.6.23-rc1 regression: hwmon/w83627ehf: wrong fan speed

2007-08-10 Thread Stefan Richter
Jean Delvare wrote:
> I just tried 2.6.23-rc2 on a system where I use the w83627ehf hardware
> monitoring driver, and was not able to reproduce the problem you
> described. Fan speeds are reported properly for me. Which I kind of
> expected, as I tested all my w83627ehf patches on this system before
> submitting them.

Thanks for still looking into this.  I was busy with other stuff the
whole week, hence no git bisect result from me yet.

> Please try using sensors instead of ksensors, and confirm that the
> behavior is the same. I'd like to rule out a problem in ksensors
> itself. sensors will also report the fan divs, this is a useful
> information given the problem you have.

# sensors
w83627ehf-isa-0290
Adapter: ISA adapter
VCore: +0.95 V  (min =  +0.00 V, max =  +1.74 V)
in1:  +12.30 V  (min =  +1.64 V, max =  +3.22 V) ALARM
AVCC:  +3.28 V  (min =  +1.89 V, max =  +1.94 V) ALARM
3VCC:  +3.26 V  (min =  +0.18 V, max =  +0.72 V) ALARM
in4:   +1.58 V  (min =  +0.57 V, max =  +0.90 V) ALARM
in5:   +1.70 V  (min =  +0.41 V, max =  +1.19 V) ALARM
in6:   +3.43 V  (min =  +0.31 V, max =  +3.05 V) ALARM
VSB:   +3.25 V  (min =  +0.37 V, max =  +3.01 V) ALARM
VBAT:  +3.18 V  (min =  +3.94 V, max =  +0.74 V) ALARM
in9:   +1.88 V  (min =  +0.79 V, max =  +1.40 V) ALARM
Case Fan:0 RPM  (min =  753 RPM, div = 128) ALARM
CPU Fan:88 RPM  (min =  659 RPM, div = 64) ALARM
Aux Fan: 0 RPM  (min = 10546 RPM, div = 128) ALARM
fan5:0 RPM  (min =  753 RPM, div = 128) ALARM
Sys Temp:+44 C  (high =-5 C, hyst =   -34 C)   ALARM
CPU Temp:  +38.0 C  (high = +80.0 C, hyst = +75.0 C)
AUX Temp:  +43.5 C  (high = +80.0 C, hyst = +75.0 C)

coretemp-isa-
Adapter: ISA adapter

coretemp-isa-0001
Adapter: ISA adapter

(The aux fan and fan5 are not connected.)

> Your original post suggests that the fan speed is supposed to change
> depending on the system load? Or temperature? Please describe the
> mechanism used to achieve this. Could it be that this mechanism isn't
> working properly, and the reported (low) speeds are actually true?

The motherboard controls the CPU fan and I believe also the case fan,
probably based on temperatures.  (The manual is buried somewhere and
MSI's download site is down right in this moment.)

Either the low speeds or the dividers are incorrect.  I'll reboot in a minute
into 2.6.22(-rc5) and post the "sensors" output.

> What fan inputs are used by your CPU and system fans? "sensors
> -c /dev/null" will tell.

...
fan1:  484 RPM  (min = 12053 RPM, div = 16) ALARM
fan2:   89 RPM  (min =  659 RPM, div = 64) ALARM
fan3:0 RPM  (min = 10546 RPM, div = 128) ALARM
fan5:0 RPM  (min = 1506 RPM, div = 128) ALARM
...

Hmm, interesting.  When I now re-run sensors I get
...
Case Fan:  484 RPM  (min = 12053 RPM, div = 16) ALARM
CPU Fan:    89 RPM  (min =   659 RPM, div = 64) ALARM
Aux Fan:     0 RPM  (min = 10546 RPM, div = 128) ALARM
fan5:        0 RPM  (min =  1506 RPM, div = 128) ALARM
...

(I'm still in 2.6.23-rc2.  Ksensors picked the 484 RPM of the case fan
up too, and that's most certainly the correct speed.  Just the CPU fan's
speed is still wrong; or rather its divider should be 16 rather than 64.)
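
To illustrate why a wrong divider skews the reading: a minimal sketch of
the tachometer-to-RPM conversion, assuming the usual Winbond-style formula
(1350000 / (count * div)). The constant and the helper name fan_rpm() are
assumptions for illustration, not taken from this thread, although the
"min" values in the output above are consistent with this formula.

```c
#include <stdint.h>

/* Assumed tachometer clock, in counts per minute (illustrative value). */
#define FAN_CLOCK_PER_MIN 1350000U

/*
 * Convert a raw 8-bit tachometer count and a clock divider into RPM.
 * A count of 0 or 255 is treated as "no valid reading" and yields 0.
 */
static unsigned int fan_rpm(uint8_t count, unsigned int div)
{
	if (count == 0 || count == 255 || div == 0)
		return 0;
	return FAN_CLOCK_PER_MIN / (count * div);
}
```

With the same raw count, a divider of 16 instead of 64 gives a reading
four times higher: fan_rpm(239, 64) is 88, while fan_rpm(239, 16) is 353.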

> Other than that, I can only ask for the same things Mark already
> suggested: compile with HWMON debugging and provide the logs (this will
> show what fan div the driver is trying to select), and try bisecting
> using git to find out which patch exactly caused the problem.

How come the divider of one of the fans changed from one minute to the
next?

FWIW, the ``chip "w83627ehf-*"'' section in Gentoo's /etc/sensors.conf
provides only labels for fan{1,2,3}. It is titled:
# Winbond W83627EHF configuration originally contributed by Leon Moonen
# This is for an Asus P5P800, voltages for A8V-E SE.

Should I hardwire correct dividers or pulse per rev in sensors.conf or
is the driver supposed to work the correct dividers out --- like it did
before 2.6.23-rc?
-- 
Stefan Richter
-=-=-=== =--- -=-==
http://arcgraph.de/sr/


BUG: when using 'brctl stp'

2007-08-10 Thread Daniel K.

I get this on the latest GIT, it was also present shortly after -rc1.
I have not tested with earlier kernels.

# brctl stp br0 on
[  169.672008] BUG: sleeping function called from invalid context at kernel/mutex.c:86
[  169.672532] in_atomic():1, irqs_disabled():0
[  169.672832] 
[  169.672832] Call Trace:

[  169.673406]  [] mutex_lock+0x19/0x2f
[  169.673696]  [] __alloc_pages+0x71/0x2d3
[  169.673996]  [] :bridge:set_stp_state+0x12/0x37
[  169.674293]  [] :bridge:store_bridge_parm+0x5f/0x79
[  169.674587]  [] sysfs_write_file+0xf2/0x134
[  169.674879]  [] vfs_write+0xce/0x177
[  169.675170]  [] sys_write+0x45/0x6e
[  169.675463]  [] system_call+0x7e/0x83
[  169.675769] 
[  169.676139] br0: starting userspace STP failed, staring kernel STP


# brctl stp br0 off
[  171.774500] BUG: sleeping function called from invalid context at kernel/mutex.c:86
[  171.775040] in_atomic():1, irqs_disabled():0
[  171.775327] 
[  171.775328] Call Trace:

[  171.775906]  [] mutex_lock+0x19/0x2f
[  171.776195]  [] __alloc_pages+0x71/0x2d3
[  171.776496]  [] :bridge:set_stp_state+0x12/0x37
[  171.776792]  [] :bridge:store_bridge_parm+0x5f/0x79
[  171.777086]  [] sysfs_write_file+0xf2/0x134
[  171.777378]  [] vfs_write+0xce/0x177
[  171.777669]  [] sys_write+0x45/0x6e
[  171.777958]  [] system_call+0x7e/0x83
[  171.778250] 



Daniel K.


Re: [PATCH] flush icache before set_pte() on ia64 take9 [2/2] flush icache at set_pte

2007-08-10 Thread KAMEZAWA Hiroyuki
On Fri, 10 Aug 2007 11:17:30 -0700
"Luck, Tony" <[EMAIL PROTECTED]> wrote:

> 1) In arch/ia64/mm/init.c: __ia64_sync_icache_dcache()
> 
> - if (!pte_exec(pte))
> - return; /* not an executable page... */
> + BUG_ON(!pte_exec(pte));
> 
> In this latest version the only route to this routine is from set_pte()
> inside the test :
> 
>   if (pte_exec(pteval) && ) {
>   }
> 
> So this BUG_ON is now redundant.
> 
I see.

> 2) In include/asm-ia64/pgtable.h
> 
> + if (pte_exec(pteval) &&// flush only new executable page.
> + pte_present(pteval) && // swap out ?
> + pte_user(pteval) &&// ignore kernel page
> + (!pte_present(*ptep) ||// do_no_page or swap in, migration,
> + pte_pfn(*ptep) != pte_pfn(pteval))) // do_wp_page(), page copy
> + /* load_module() calles flush_icache_range() explicitly*/
> + __ia64_sync_icache_dcache(pteval);
> 
> Just above this there is a comment saying that pte_exec() only works
> when pte_present() is true.  So we must re-order the conditions so that
> we check that the pteval satisfies pte_present() before using either of
> pte_exec() or pte_user() on it like this:
> 
>   if (pte_present(pteval) &&
>   pte_exec(pteval) &&
>   pte_user(pteval) &&
> 
> I put in some crude counters to see whether we should check pte_exec() or
> pte_user() next ... and it was very clear that the pte_exec() check gets
> us out of the if() faster (at least during a kernel build).
> 
ok.
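
For reference, the ordering constraint can be sketched in plain C. The bit
layout and the demo_* names below are hypothetical and are not the ia64
definitions; the only point is that the present check has to come first so
that short-circuit evaluation keeps the other predicates from being applied
to a non-present pte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only (not the ia64 layout). */
#define DEMO_PRESENT 0x1u
#define DEMO_EXEC    0x2u
#define DEMO_USER    0x4u

static bool demo_present(uint32_t pte) { return pte & DEMO_PRESENT; }
/* The next two are only meaningful when demo_present(pte) is true. */
static bool demo_exec(uint32_t pte)    { return pte & DEMO_EXEC; }
static bool demo_user(uint32_t pte)    { return pte & DEMO_USER; }

/*
 * Decide whether the icache has to be synced for a new pte value.
 * old_present and old_pfn describe the entry that is being replaced.
 */
static bool needs_icache_sync(uint32_t pte, uint64_t pfn,
			      bool old_present, uint64_t old_pfn)
{
	return demo_present(pte) &&	/* must be tested first */
	       demo_exec(pte) &&	/* cheapest way out of the test */
	       demo_user(pte) &&
	       (!old_present || old_pfn != pfn);
}
```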

I'm sorry, but I'll be offline until next Wednesday, so I'll post the above
fix in a week or so.


> I also compared how often the old code called lazy_mmu_prot_update()
> with how often the new code calls __ia64_sync_icache_dcache() (again
> using kernel build as my workload) ... and the answer is about the
> same (less than 0.2% change ... probably less than run-to-run variation).
> 
> 
> So now the only remaining task is to convince myself that this
> new version covers all the cases.
> 
Yes. I'd like more eyes on this for review.

Thanks,
-Kame



Re: Software based ECC ?

2007-08-10 Thread Alan Cox
On Fri, 10 Aug 2007 23:16:45 +0200
"roland" <[EMAIL PROTECTED]> wrote:

> Hello !
> 
> Since ECC (speaking in terms of RAM/memory) is a widespread hardware
> technology within server/enterprise computing for protection against
> memory failure, I wonder:
> 
> Can't this be done in software, too?

Only one way to find out. If it interests you, have a go at it.


Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls

2007-08-10 Thread David Miller
From: Ben Greear <[EMAIL PROTECTED]>
Date: Fri, 10 Aug 2007 14:11:24 -0700

> Jeff Garzik wrote:
> 
> > This patch copies Auke in adding NETIF_F_LRO.  Is that just for 
> > temporary merging, or does the net core really not touch it at all?
> > 
> > Because, logically, if NETIF_F_LRO exists nowhere else but this patch, 
> > we should not add it to dev->features.  LRO knowledge can be contained 
> > entirely within the driver, if the net core never tests NETIF_F_LRO.
> > 
> > I haven't reviewed the other NETIF_F_XXX flags, but, that logic can be 
> > applied to any other NETIF_F_XXX flag:  if the net stack isn't using it, 
> > it's a piece of information specific to that driver.
> 
> I believe LRO is going to have to be disabled for routing/bridging,
> so the stack will probably need to become aware of it at some point...

The packet will be GSO'd on output I believe, so it won't
break anything.

Alternatively, we could make the driver only LRO accumulate if the
packet is unicast and matches one of the MAC addresses programmed into the
chip.
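
A rough sketch of that alternative, as plain user-space C. The function and
macro names are hypothetical and do not come from any driver; the check is
simply "destination is unicast and equals one of the programmed addresses".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* A unicast Ethernet address has the low bit of its first octet clear. */
static bool demo_is_unicast(const uint8_t *addr)
{
	return (addr[0] & 0x01) == 0;
}

/*
 * Return true if a packet with destination address dst may be LRO
 * accumulated: it is unicast and matches one of the n programmed MACs.
 */
static bool demo_lro_may_accumulate(const uint8_t *dst,
				    const uint8_t (*macs)[DEMO_ETH_ALEN],
				    size_t n)
{
	size_t i;

	if (!demo_is_unicast(dst))
		return false;
	for (i = 0; i < n; i++)
		if (memcmp(dst, macs[i], DEMO_ETH_ALEN) == 0)
			return true;
	return false;
}
```

Packets that are merely being routed or bridged would carry a destination
MAC that is not in the list only in the bridging case, so this is a sketch
of the filter itself and not a complete answer to the routing concern.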


Re: [PATCH 00/16] Permit filesystem local caching [try #3]

2007-08-10 Thread Casey Schaufler

--- David Howells <[EMAIL PROTECTED]> wrote:

> These patches add local caching for network filesystems such as NFS and AFS.
> 
> FS-Cache now runs fully asynchronously as required by Trond Myklebust for
> NFS.
> 
> --
> Changes:
> [try #3]:
> 
>  (*) Added missing file to CacheFiles patch.
> 
>  (*) Made new security functions return errors and pass actual return data
>  via argument pointer.
> 
>  (*) Cleaned up NFS patch.
> 
>  (*) The 'fsc' flag must now be passed to NFS mount by the string options.
> 
>  (*) Split the NFS patch into three as requested by Trond.
> 
> [try #2]:
> 
>  (*) The CacheFiles module no longer accepts directory fds in its cull and
>  inuse commands from cachefilesd.  Instead it uses the current working
>  directory of the calling process as the basis for looking up the object.
>  Corollary to this, fget_light() no longer needs to be exported.
> 

How would you expect an LSM that is not SELinux to interface with
CacheFiles? You have gone to a great deal of effort to support the
requirements of an SELinux system, and that's good, but you have
extended the LSM interface to expose SELinux data structures (secids)
and require them for the operation of CacheFiles, and that's bad.
The data used within an LSM is private to the LSM, and this applies
to SELinux as well as to any other LSM that may come along, such
as the Smack LSM I'm working on. This applies to task data as well
as file data. Further, the behavior of the system in the presence
of an LSM should be controlled by the LSM; it is more than a little
scary that CacheFiles is enforcing SELinux policy based on secids
that may be coming from a different LSM.

I applaud the integration of CacheFiles with SELinux. Unfortunately,
you've done so using the LSM interface in such a way that an LSM
other than SELinux is likely to demonstrate inappropriate behaviors
in the presence of CacheFiles because you have so carefully integrated
the SELinux requirements.

If the integration with SELinux is important to you, and I would
expect that it is given the work you've put into it, I suggest that
the SELinux specific behaviors be identified so that another LSM
can provide the behavior appropriate to the policy it chooses to
enforce and put that into SELinux with an LSM interface. I know
that you're looking at a significant effort to do that, but I
wouldn't think that you'd want CacheFiles to behave badly in the
presence of an LSM that doesn't happen to be SELinux.

I also know it's tempting to point out that SELinux is the only
upstream LSM. I hope to change that before too long, and I know
there are others with ambitions as well. I would not like to see
CacheFiles have to get excluded in the presence of other LSMs
and I doubt you would either.


Casey Schaufler
[EMAIL PROTECTED]


[PATCH 25/25 -v2] add paravirtualization support for x86_64

2007-08-10 Thread Glauber de Oliveira Costa
This is, finally, the patch we were all looking for. This
patch adds a paravirt.h header with the definition of paravirt_ops
struct. Also, it defines a bunch of inline functions that will
replace, or hook, the other calls. Every one of those functions
adds an entry in the parainstructions section (see vmlinux.lds.S).
Those entries can then be used to runtime-patch the paravirt_ops
functions.

paravirt.c contains implementations of paravirt functions that
are used natively, such as the native_patch. It also fills the
paravirt_ops structure with the whole lot of functions that
were (re)defined throughout this patch set.

There are also changes in asm-offsets.c. paravirt.h needs it
to find out the offsets into the structure of functions
such as irq_enable, used in assembly files.

[  updates from v1
   * make PARAVIRT hidden in Kconfig (Andi Kleen)
   * cleanups in paravirt.h (Andi Kleen)
   * modifications needed to accommodate other parts of the
   patch that changed, such as getting rid of ebda_info
   * put the integers at struct paravirt_ops at the end
   (Jeremy)
]
Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/Kconfig  |   11 +++
 arch/x86_64/kernel/Makefile  |1 +
 arch/x86_64/kernel/asm-offsets.c |   14 ++
 arch/x86_64/kernel/vmlinux.lds.S |6 ++
 include/asm-x86_64/smp.h |2 +-
 5 files changed, 33 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index ffa0364..00b2fc9 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -373,6 +373,17 @@ config NODES_SHIFT
 
 # Dummy CONFIG option to select ACPI_NUMA from drivers/acpi/Kconfig.
 
+config PARAVIRT
+   bool
+   depends on EXPERIMENTAL
+   help
+ Paravirtualization is a way of running multiple instances of
+ Linux on the same machine, under a hypervisor.  This option
+ changes the kernel so it can modify itself when it is run
+ under a hypervisor, improving performance significantly.
+ However, when run without a hypervisor the kernel is
+ theoretically slower.  If in doubt, say N.
+
 config X86_64_ACPI_NUMA
bool "ACPI NUMA detection"
depends on NUMA
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index ff5d8c9..120467f 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -38,6 +38,7 @@ obj-$(CONFIG_X86_VSMP)+= vsmp.o
 obj-$(CONFIG_K8_NB)+= k8.o
 obj-$(CONFIG_AUDIT)+= audit.o
 
+obj-$(CONFIG_PARAVIRT) += paravirt.o
 obj-$(CONFIG_MODULES)  += module.o
 obj-$(CONFIG_PCI)  += early-quirks.o
 
diff --git a/arch/x86_64/kernel/asm-offsets.c b/arch/x86_64/kernel/asm-offsets.c
index 778953b..f5eff70 100644
--- a/arch/x86_64/kernel/asm-offsets.c
+++ b/arch/x86_64/kernel/asm-offsets.c
@@ -15,6 +15,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_PARAVIRT
+#include 
+#endif
 
 #define DEFINE(sym, val) \
 asm volatile("\n->" #sym " %0 " #val : : "i" (val))
@@ -72,6 +75,17 @@ int main(void)
   offsetof (struct rt_sigframe32, uc.uc_mcontext));
BLANK();
 #endif
+#ifdef CONFIG_PARAVIRT
+#define ENTRY(entry) DEFINE(PARAVIRT_ ## entry, offsetof(struct paravirt_ops, entry))
+   ENTRY(paravirt_enabled);
+   ENTRY(irq_disable);
+   ENTRY(irq_enable);
+   ENTRY(syscall_return);
+   ENTRY(iret);
+   ENTRY(read_cr2);
+   ENTRY(swapgs);
+   BLANK();
+#endif
DEFINE(pbe_address, offsetof(struct pbe, address));
DEFINE(pbe_orig_address, offsetof(struct pbe, orig_address));
DEFINE(pbe_next, offsetof(struct pbe, next));
diff --git a/arch/x86_64/kernel/vmlinux.lds.S b/arch/x86_64/kernel/vmlinux.lds.S
index ba8ea97..c3fce85 100644
--- a/arch/x86_64/kernel/vmlinux.lds.S
+++ b/arch/x86_64/kernel/vmlinux.lds.S
@@ -185,6 +185,12 @@ SECTIONS
   .altinstr_replacement : AT(ADDR(.altinstr_replacement) - LOAD_OFFSET) {
*(.altinstr_replacement)
   }
+  . = ALIGN(8);
+  .parainstructions : AT(ADDR(.parainstructions) - LOAD_OFFSET) {
+  __parainstructions = .;
+   *(.parainstructions)
+  __parainstructions_end = .;
+  }
   /* .exit.text is discard at runtime, not link time, to deal with references
  from .altinstructions and .eh_frame */
   .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) { *(.exit.text) }
diff --git a/include/asm-x86_64/smp.h b/include/asm-x86_64/smp.h
index 6b4..403901b 100644
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -22,7 +22,7 @@ extern int disable_apic;
 #ifdef CONFIG_PARAVIRT
 #include 
 void native_flush_tlb_others(cpumask_t cpumask, struct mm_struct *mm,
-   unsigned long va);
+   unsigned long va);
 #else
 #define startup_ipi_hook(apicid, rip, rsp) do { } while (0)
 #endif
-- 
1.4.
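
As a side note on the asm-offsets.c part: a small user-space sketch of what
an offsetof()-based entry boils down to. struct demo_ops and its members
are hypothetical stand-ins and not the real paravirt_ops layout; the idea
is only that code which cannot see the C struct (such as assembly) reaches
a function pointer through its byte offset into the table.

```c
#include <stddef.h>

/* Hypothetical ops table; not the real struct paravirt_ops. */
struct demo_ops {
	int paravirt_enabled;
	void (*irq_disable)(void);
	void (*irq_enable)(void);
};

/* What an ENTRY()-style macro computes: a member's byte offset. */
#define DEMO_OFFSET(entry) offsetof(struct demo_ops, entry)

static int disable_calls, enable_calls;
static void demo_irq_disable(void) { disable_calls++; }
static void demo_irq_enable(void)  { enable_calls++; }

/*
 * Call the function pointer stored off bytes into the table, which is
 * roughly what an indirect call through a generated offset does.
 */
static void demo_call_at_offset(struct demo_ops *ops, size_t off)
{
	void (**slot)(void) = (void (**)(void))((char *)ops + off);

	(*slot)();
}
```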

[PATCH 15/25 -v2] introducing paravirt_activate_mm

2007-08-10 Thread Glauber de Oliveira Costa
This function/macro allows a paravirt guest to be notified that we changed
the current task's cr3, and to act upon it. What it does is up to the guest.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/mmu_context.h |   17 ++---
 1 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/asm-x86_64/mmu_context.h b/include/asm-x86_64/mmu_context.h
index 9592698..77ce047 100644
--- a/include/asm-x86_64/mmu_context.h
+++ b/include/asm-x86_64/mmu_context.h
@@ -7,7 +7,16 @@
 #include 
 #include 
 #include 
+
+#ifdef CONFIG_PARAVIRT
+#include 
+#else
 #include 
+static inline void paravirt_activate_mm(struct mm_struct *prev,
+   struct mm_struct *next)
+{
+}
+#endif /* CONFIG_PARAVIRT */
 
 /*
  * possibly do the LDT unload here?
@@ -67,8 +76,10 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
asm volatile("movl %0,%%fs"::"r"(0));  \
 } while(0)
 
-#define activate_mm(prev, next) \
-   switch_mm((prev),(next),NULL)
-
+#define activate_mm(prev, next)\
+do {   \
+   paravirt_activate_mm(prev, next);   \
+   switch_mm((prev),(next),NULL);  \
+} while (0)
 
 #endif
-- 
1.4.4.2
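
For readers unfamiliar with the idiom used in activate_mm() above, here is
a stand-alone sketch with hypothetical demo_* names. It shows the two
ingredients: an empty inline stub when the hook is compiled out, and a
do { } while (0) wrapper so the two-statement macro behaves like a single
statement.

```c
struct demo_mm { int id; };

static int hook_calls, switch_calls;

#ifdef DEMO_PARAVIRT
/* Hook variant: a guest implementation would be notified here. */
static inline void demo_paravirt_activate_mm(struct demo_mm *prev,
					     struct demo_mm *next)
{
	(void)prev; (void)next;
	hook_calls++;
}
#else
/* Native variant: an empty stub that the compiler can drop entirely. */
static inline void demo_paravirt_activate_mm(struct demo_mm *prev,
					     struct demo_mm *next)
{
	(void)prev; (void)next;
}
#endif

static inline void demo_switch_mm(struct demo_mm *prev, struct demo_mm *next)
{
	(void)prev; (void)next;
	switch_calls++;
}

/* do/while(0) keeps both calls together, even in an unbraced if/else. */
#define demo_activate_mm(prev, next)			\
do {							\
	demo_paravirt_activate_mm((prev), (next));	\
	demo_switch_mm((prev), (next));			\
} while (0)
```

Without the do/while wrapper, using the macro as the body of an unbraced
"if" followed by "else" would not compile.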



[PATCH 9/25 -v2] report ring kernel is running without paravirt

2007-08-10 Thread Glauber de Oliveira Costa
When paravirtualization is disabled, the kernel is always
running at ring 0. So report it in the appropriate macro.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/segment.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86_64/segment.h b/include/asm-x86_64/segment.h
index 04b8ab2..240c1bf 100644
--- a/include/asm-x86_64/segment.h
+++ b/include/asm-x86_64/segment.h
@@ -50,4 +50,8 @@
 #define GDT_SIZE (GDT_ENTRIES * 8)
 #define TLS_SIZE (GDT_ENTRY_TLS_ENTRIES * 8) 
 
+#ifndef CONFIG_PARAVIRT
+#define get_kernel_rpl()  0
+#endif
+
 #endif
-- 
1.4.4.2



[PATCH 20/25 -v2] replace syscall_init

2007-08-10 Thread Glauber de Oliveira Costa
This patch renames the existing syscall_init to x86_64_syscall_init.
syscall_init itself becomes a weak wrapper around it, so that it can
later be replaced by a paravirt implementation in case paravirt is on.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/setup64.c |8 +++-
 include/asm-x86_64/proto.h   |3 +++
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/kernel/setup64.c b/arch/x86_64/kernel/setup64.c
index 49f7342..723822c 100644
--- a/arch/x86_64/kernel/setup64.c
+++ b/arch/x86_64/kernel/setup64.c
@@ -153,7 +153,7 @@ __attribute__((section(".bss.page_aligned")));
 extern asmlinkage void ignore_sysret(void);
 
 /* May not be marked __init: used by software suspend */
-void syscall_init(void)
+void x86_64_syscall_init(void)
 {
/* 
 * LSTAR and STAR live in a bit strange symbiosis.
@@ -172,6 +172,12 @@ void syscall_init(void)
wrmsrl(MSR_SYSCALL_MASK, EF_TF|EF_DF|EF_IE|0x3000); 
 }
 
+/* Overriden in paravirt.c if CONFIG_PARAVIRT */
+void __attribute__((weak)) syscall_init(void)
+{
+   x86_64_syscall_init();
+}
+
 void __cpuinit check_efer(void)
 {
unsigned long efer;
diff --git a/include/asm-x86_64/proto.h b/include/asm-x86_64/proto.h
index 31f20ad..77ed2de 100644
--- a/include/asm-x86_64/proto.h
+++ b/include/asm-x86_64/proto.h
@@ -18,6 +18,9 @@ extern void init_memory_mapping(unsigned long start, unsigned long end);
 
 extern void system_call(void); 
 extern int kernel_syscall(void);
+#ifdef CONFIG_PARAVIRT
+extern void x86_64_syscall_init(void);
+#endif
 extern void syscall_init(void);
 
 extern void ia32_syscall(void);
-- 
1.4.4.2
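
The weak-wrapper pattern from setup64.c can be tried outside the kernel
with GCC or Clang on an ELF target. The demo_* names below are hypothetical.
As long as no other object file provides a strong demo_syscall_init(), the
weak default is linked in and forwards to the native routine.

```c
static int native_calls;

void demo_native_syscall_init(void);
void demo_syscall_init(void);

/* The "native" routine; always available under its own name. */
void demo_native_syscall_init(void)
{
	native_calls++;
}

/*
 * Weak default: a strong definition of demo_syscall_init() in another
 * object file would replace this one at link time.
 */
void __attribute__((weak)) demo_syscall_init(void)
{
	demo_native_syscall_init();
}
```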



[PATCH 24/25 -v2] paravirt hooks for arch initialization

2007-08-10 Thread Glauber de Oliveira Costa
This patch adds paravirtualization hooks to the arch initialization
process. paravirt_arch_setup() lets the guest run any specific
initialization routine.

Also, there is memory_setup(), so guests can handle it their way.

[  updates from v1
   * Don't use a separate ebda pv hook (Jeremy/Andi)
   * Make paravirt_setup_arch() void (Andi)
]

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/setup.c |   32 +++-
 include/asm-x86_64/e820.h  |6 ++
 include/asm-x86_64/page.h  |1 +
 3 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/kernel/setup.c b/arch/x86_64/kernel/setup.c
index af838f6..19e0d90 100644
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -65,6 +66,12 @@
 #include 
 #include 
 
+#ifdef CONFIG_PARAVIRT
+#include 
+#else
+#define paravirt_arch_setup()  do {} while (0)
+#endif
+
 /*
  * Machine setup..
  */
@@ -208,6 +215,16 @@ static void discover_ebda(void)
 * 4K EBDA area at 0x40E
 */
ebda_addr = *(unsigned short *)__va(EBDA_ADDR_POINTER);
+   /*
+* There can be some situations, like paravirtualized guests,
+* in which there is no available ebda information. In such
+* case, just skip it
+*/
+   if (!ebda_addr) {
+   ebda_size = 0;
+   return;
+   }
+
ebda_addr <<= 4;
 
ebda_size = *(unsigned short *)__va(ebda_addr);
@@ -221,6 +238,13 @@ static void discover_ebda(void)
ebda_size = 64*1024;
 }
 
+/* Overridden in paravirt.c if CONFIG_PARAVIRT */
+void __attribute__((weak)) memory_setup(void)
+{
+   return setup_memory_region();
+}
+
+
 void __init setup_arch(char **cmdline_p)
 {
printk(KERN_INFO "Command line: %s\n", boot_command_line);
@@ -231,12 +255,18 @@ void __init setup_arch(char **cmdline_p)
saved_video_mode = SAVED_VIDEO_MODE;
bootloader_type = LOADER_TYPE;
 
+   /*
+* By returning non-zero here, a paravirt impl can choose to
+* skip the rest of the setup process
+*/
+   paravirt_arch_setup();
+
 #ifdef CONFIG_BLK_DEV_RAM
rd_image_start = RAMDISK_FLAGS & RAMDISK_IMAGE_START_MASK;
rd_prompt = ((RAMDISK_FLAGS & RAMDISK_PROMPT_FLAG) != 0);
rd_doload = ((RAMDISK_FLAGS & RAMDISK_LOAD_FLAG) != 0);
 #endif
-   setup_memory_region();
+   memory_setup();
copy_edd();
 
if (!MOUNT_ROOT_RDONLY)
diff --git a/include/asm-x86_64/e820.h b/include/asm-x86_64/e820.h
index 3486e70..2ced3ba 100644
--- a/include/asm-x86_64/e820.h
+++ b/include/asm-x86_64/e820.h
@@ -20,7 +20,12 @@
 #define E820_ACPI  3
 #define E820_NVS   4
 
+#define MAP_TYPE_STR   "BIOS-e820"
+
 #ifndef __ASSEMBLY__
+
+void native_ebda_info(unsigned *addr, unsigned *size);
+
 struct e820entry {
u64 addr;   /* start of memory segment */
u64 size;   /* size of memory segment */
@@ -56,6 +61,7 @@ extern struct e820map e820;
 
 extern unsigned ebda_addr, ebda_size;
 extern unsigned long nodemap_addr, nodemap_size;
+
 #endif/*!__ASSEMBLY__*/
 
 #endif/*__E820_HEADER*/
diff --git a/include/asm-x86_64/page.h b/include/asm-x86_64/page.h
index ec8b245..8c40fb2 100644
--- a/include/asm-x86_64/page.h
+++ b/include/asm-x86_64/page.h
@@ -149,6 +149,7 @@ extern unsigned long __phys_addr(unsigned long);
 #define __boot_pa(x)   __pa(x)
 #ifdef CONFIG_FLATMEM
 #define pfn_valid(pfn) ((pfn) < end_pfn)
+
 #endif
 
 #define virt_to_page(kaddr)pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
-- 
1.4.4.2
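
The EBDA handling in discover_ebda() above can be sketched as a pure
function. This is a simplified illustration with hypothetical names: it
assumes the BIOS pointer is a real-mode segment (hence the shift by 4) and
that the size word is in KiB, and it leaves out the page rounding that the
real code performs.

```c
#include <stdint.h>

#define DEMO_EBDA_MAX (64u * 1024u)

struct demo_ebda {
	uint32_t addr;	/* physical address, 0 if no EBDA was reported */
	uint32_t size;	/* size in bytes, capped at 64 KiB */
};

/*
 * segment: the 16-bit value read from the BIOS data area pointer.
 * size_kb: the 16-bit size word found at the start of the EBDA.
 */
static struct demo_ebda demo_discover_ebda(uint16_t segment, uint16_t size_kb)
{
	struct demo_ebda e = { 0, 0 };

	/* No EBDA information, e.g. in a paravirtualized guest: skip it. */
	if (segment == 0)
		return e;

	e.addr = (uint32_t)segment << 4;
	e.size = (uint32_t)size_kb << 10;
	if (e.size > DEMO_EBDA_MAX)
		e.size = DEMO_EBDA_MAX;
	return e;
}
```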



[PATCH 17/25 -v2] introduce paravirt_release_pgd()

2007-08-10 Thread Glauber de Oliveira Costa
This patch introduces a new macro/function that informs a paravirt
guest when its page table is no longer in use and can be released.
In case we're not paravirt, it does nothing.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/pgalloc.h |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86_64/pgalloc.h b/include/asm-x86_64/pgalloc.h
index b467be6..dbe1267 100644
--- a/include/asm-x86_64/pgalloc.h
+++ b/include/asm-x86_64/pgalloc.h
@@ -9,6 +9,12 @@
 #define QUICK_PGD 0/* We preserve special mappings over free */
 #define QUICK_PT 1 /* Other page table pages that are zero on free */
 
+#ifdef CONFIG_PARAVIRT
+#include 
+#else
+#define paravirt_release_pgd(pgd) do { } while (0)
+#endif
+
 #define pmd_populate_kernel(mm, pmd, pte) \
set_pmd(pmd, __pmd(_PAGE_TABLE | __pa(pte)))
 #define pud_populate(mm, pud, pmd) \
@@ -100,6 +106,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 static inline void pgd_free(pgd_t *pgd)
 {
BUG_ON((unsigned long)pgd & (PAGE_SIZE-1));
+   paravirt_release_pgd(pgd);
quicklist_free(QUICK_PGD, pgd_dtor, pgd);
 }
 
-- 
1.4.4.2



[PATCH 22/25 -v2] turn privileged operation into a macro

2007-08-10 Thread Glauber de Oliveira Costa
Under paravirt, a read of cr2 cannot be issued directly anymore.
So wrap it in a macro, which is defined as the operation itself when
paravirt is off, and as something else when paravirt is in the game.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/head.S |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/kernel/head.S b/arch/x86_64/kernel/head.S
index e89abcd..1bb6c55 100644
--- a/arch/x86_64/kernel/head.S
+++ b/arch/x86_64/kernel/head.S
@@ -18,6 +18,12 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_PARAVIRT
+#include 
+#include 
+#else
+#define GET_CR2_INTO_RCX mov %cr2, %rcx
+#endif
 
 /* we are not able to switch in one step to the final KERNEL ADRESS SPACE
  * because we need identity-mapped pages.
@@ -267,7 +273,9 @@ ENTRY(early_idt_handler)
xorl %eax,%eax
movq 8(%rsp),%rsi   # get rip
movq (%rsp),%rdx
-   movq %cr2,%rcx
+   /* When PARAVIRT is on, this operation may clobber rax. It is
+ something safe to do, because we've just zeroed rax. */
+   GET_CR2_INTO_RCX
leaq early_idt_msg(%rip),%rdi
call early_printk
cmpl $2,early_recursion_flag(%rip)
-- 
1.4.4.2



[PATCH 23/25 -v2] provide paravirt patching function

2007-08-10 Thread Glauber de Oliveira Costa
This patch introduces apply_paravirt(), a function that shall
be called by i386/alternative.c to apply replacements to
paravirt_functions. It is defined as a do-nothing function
if paravirt is not enabled.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/alternative.h |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/asm-x86_64/alternative.h b/include/asm-x86_64/alternative.h
index ab161e8..e69a141 100644
--- a/include/asm-x86_64/alternative.h
+++ b/include/asm-x86_64/alternative.h
@@ -143,12 +143,14 @@ static inline void alternatives_smp_switch(int smp) {}
  */
 #define ASM_OUTPUT2(a, b) a, b
 
-struct paravirt_patch;
+struct paravirt_patch_site;
 #ifdef CONFIG_PARAVIRT
-void apply_paravirt(struct paravirt_patch *start, struct paravirt_patch *end);
+void apply_paravirt(struct paravirt_patch_site *start,
+   struct paravirt_patch_site *end);
 #else
 static inline void
-apply_paravirt(struct paravirt_patch *start, struct paravirt_patch *end)
+apply_paravirt(struct paravirt_patch_site *start,
+   struct paravirt_patch_site *end)
 {}
 #define __parainstructions NULL
 #define __parainstructions_end NULL
-- 
1.4.4.2
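
To make the start/end calling convention concrete, here is a user-space
sketch of walking a half-open range of patch sites. struct demo_patch_site
and the "fill with NOPs" action are hypothetical stand-ins; what the real
apply_paravirt() writes at each site is not shown in this patch.

```c
#include <stddef.h>

/* Hypothetical patch-site record; not the real paravirt_patch_site. */
struct demo_patch_site {
	unsigned char *instr;	/* location of the code to patch */
	unsigned char len;	/* number of bytes available there */
};

/*
 * Walk the half-open range [start, end) and patch every site.
 * As a stand-in for a real replacement, each site is filled with
 * 0x90, the one-byte x86 NOP. Returns the number of sites visited.
 */
static size_t demo_apply_paravirt(struct demo_patch_site *start,
				  struct demo_patch_site *end)
{
	struct demo_patch_site *p;
	size_t patched = 0;
	unsigned char i;

	for (p = start; p != end; p++) {
		for (i = 0; i < p->len; i++)
			p->instr[i] = 0x90;
		patched++;
	}
	return patched;
}
```

Passing NULL for both start and end, as the non-paravirt #defines of
__parainstructions and __parainstructions_end would, makes the loop body
never run.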



[PATCH 16/25 -v2] turn page operations into native versions

2007-08-10 Thread Glauber de Oliveira Costa
This patch turns the page operations (set and make a page table)
into native_ versions. The operations themselves will later be
overridden by paravirt.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/page.h |   36 +++-
 1 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/include/asm-x86_64/page.h b/include/asm-x86_64/page.h
index 88adf1a..ec8b245 100644
--- a/include/asm-x86_64/page.h
+++ b/include/asm-x86_64/page.h
@@ -64,16 +64,42 @@ typedef struct { unsigned long pgprot; } pgprot_t;
 
 extern unsigned long phys_base;
 
-#define pte_val(x) ((x).pte)
-#define pmd_val(x) ((x).pmd)
-#define pud_val(x) ((x).pud)
-#define pgd_val(x) ((x).pgd)
-#define pgprot_val(x)  ((x).pgprot)
+static inline unsigned long native_pte_val(pte_t pte)
+{
+   return pte.pte;
+}
+
+static inline unsigned long native_pud_val(pud_t pud)
+{
+   return pud.pud;
+}
+
+
+static inline unsigned long native_pmd_val(pmd_t pmd)
+{
+   return pmd.pmd;
+}
+
+static inline unsigned long native_pgd_val(pgd_t pgd)
+{
+   return pgd.pgd;
+}
+
+#ifdef CONFIG_PARAVIRT
+#include 
+#else
+#define pte_val(x) native_pte_val(x)
+#define pmd_val(x) native_pmd_val(x)
+#define pud_val(x) native_pud_val(x)
+#define pgd_val(x) native_pgd_val(x)
 
 #define __pte(x) ((pte_t) { (x) } )
 #define __pmd(x) ((pmd_t) { (x) } )
 #define __pud(x) ((pud_t) { (x) } )
 #define __pgd(x) ((pgd_t) { (x) } )
+#endif /* CONFIG_PARAVIRT */
+
+#define pgprot_val(x)  ((x).pgprot)
 #define __pgprot(x)((pgprot_t) { (x) } )
 
 #endif /* !__ASSEMBLY__ */
-- 
1.4.4.2
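
The shape of this change, reduced to a stand-alone sketch with hypothetical
demo_* names: the field access moves into a typed inline function, and the
public name becomes a macro that can be pointed somewhere else (for example
at a paravirt implementation) without touching any caller.

```c
/* Hypothetical wrapper type, mirroring the style of pte_t. */
typedef struct { unsigned long pte; } demo_pte_t;

/* Native accessor: type-checked, unlike a bare ((x).pte) macro. */
static inline unsigned long demo_native_pte_val(demo_pte_t pte)
{
	return pte.pte;
}

/* Public names; a different backend could redefine these two. */
#define demo_pte_val(x)	demo_native_pte_val(x)
#define demo_mk_pte(x)	((demo_pte_t) { (x) })
```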



[PATCH 7/25 -v2] interrupt related native paravirt functions.

2007-08-10 Thread Glauber de Oliveira Costa
The interrupt initialization routine becomes native_init_IRQ and will
be overridden later in case paravirt is on.

[  updates from v1
   * After a talk with Jeremy Fitzhardinge, it turned out that making the
   interrupt vector global was not a good idea. So it is removed in this
   patch
]
Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/i8259.c |5 -
 include/asm-x86_64/irq.h   |2 ++
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/kernel/i8259.c b/arch/x86_64/kernel/i8259.c
index 948cae6..048e3cb 100644
--- a/arch/x86_64/kernel/i8259.c
+++ b/arch/x86_64/kernel/i8259.c
@@ -484,7 +484,10 @@ static int __init init_timer_sysfs(void)
 
 device_initcall(init_timer_sysfs);
 
-void __init init_IRQ(void)
+/* Overridden in paravirt.c */
+void init_IRQ(void) __attribute__((weak, alias("native_init_IRQ")));
+
+void __init native_init_IRQ(void)
 {
int i;
 
diff --git a/include/asm-x86_64/irq.h b/include/asm-x86_64/irq.h
index 5006c6e..be55299 100644
--- a/include/asm-x86_64/irq.h
+++ b/include/asm-x86_64/irq.h
@@ -46,6 +46,8 @@ static __inline__ int irq_canonicalize(int irq)
 extern void fixup_irqs(cpumask_t map);
 #endif
 
+void native_init_IRQ(void);
+
 #define __ARCH_HAS_DO_SOFTIRQ 1
 
 #endif /* _ASM_IRQ_H */
-- 
1.4.4.2
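
The weak alias used for init_IRQ() can be reproduced in a few lines,
assuming GCC or Clang on an ELF target (alias attributes are not available
everywhere). The demo_* names are hypothetical. Unlike a weak wrapper
function, the alias adds no extra call: both names refer to the same code
unless a strong definition of the aliased name is linked in.

```c
static int native_irq_inits;

void demo_native_init_IRQ(void);

/* Weak alias: demo_init_IRQ resolves to the native routine by default. */
void demo_init_IRQ(void) __attribute__((weak, alias("demo_native_init_IRQ")));

/* The alias target has to be defined in the same translation unit. */
void demo_native_init_IRQ(void)
{
	native_irq_inits++;
}
```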


