[PATCH 0/13] cxgb3 - driver updates
Hi Jeff,

I'm submitting a patch series for inclusion in netdev#upstream.
Here is a brief description:
- MAC hang workaround update
- Modify max HW Rx coalescing size
- Log SGE doorbell Fifo overflow
- Use Tx immediate data for offload packets whenever possible
- RDMA can get internal mem info to workaround HW issues
- More validity checks on connection ids
- Stop MAC when a fatal error is detected
- Log HW serial number
- Update internal mem operating mode
- Update engine microcode management, version is now 1.1.0
- Update FW management, version is now 4.6.0
- Ignore some HW errors until the HW is initialized
- Check MSI/MSI-X after it got enabled

Cheers,
Divy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] do not export /usr/include/scsi in make headers_install
On Mon, 2007-08-06 at 15:02 +0200, Olaf Hering wrote:
> On Mon, Aug 06, Christoph Hellwig wrote:
>
> > On Mon, Aug 06, 2007 at 02:45:46PM +0200, Olaf Hering wrote:
> > >
> > > glibc and make headers_install_all provide /usr/include/scsi
> > > One of them has to go.
> > >
> > > A quick diff shows no differences, except:
> >
> > ..
> >
> > > Which copy should be provided by a distributor?
> >
> > The glibc one of course. The kernel scsi.h should never have been
> > added to the list of exportable headers.
>
> /usr/include/scsi is provided by glibc.
> Remove the scsi export from make headers_install target.
>
>
> Signed-off-by: Olaf Hering <[EMAIL PROTECTED]>

Acked-by: David Woodhouse <[EMAIL PROTECTED]>

> ---
>  include/Kbuild      |1 -
>  include/scsi/Kbuild |4
>  2 files changed, 5 deletions(-)
>
> --- a/include/Kbuild
> +++ b/include/Kbuild
> @@ -1,6 +1,5 @@
>  header-y += asm-generic/
>  header-y += linux/
> -header-y += scsi/
>  header-y += sound/
>  header-y += mtd/
>  header-y += rdma/
> --- a/include/scsi/Kbuild
> +++ /dev/null
> @@ -1,4 +0,0 @@
> -header-y += scsi.h
> -
> -unifdef-y += scsi_ioctl.h
> -unifdef-y += sg.h

-- 
dwmw2
Re: [SCSI] aic94xx: new driver
On Sat, 2007-08-11 at 04:49 +0100, Christoph Hellwig wrote:
> On Fri, Aug 10, 2007 at 11:09:22PM +0800, David Woodhouse wrote:
> > The files in /usr/include/scsi are actually shipped by glibc, and most
> > distributions use glibc's version instead of the one from the kernel --
> > so this additional userspace interface is automatically incompatible
> > with most people's installations.
>
> Stop here right now. You just noticed the real bug, and that's exporting
> scsi.h at all. I think Olaf sent a patch to fix this already.

That's a good enough answer for me, certainly.

-- 
dwmw2
[PATCH 12/13] cxgb3 - log and clear PEX errors
From: Divy Le Ray <[EMAIL PROTECTED]>

Clear pciE PEX errors late at module load time.
Log details when PEX errors occur.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/t3_hw.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 3d47627..538b254 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -1355,6 +1355,10 @@ static void pcie_intr_handler(struct adapter *adapter)
 		{0}
 	};
 
+	if (t3_read_reg(adapter, A_PCIE_INT_CAUSE) & F_PEXERR)
+		CH_ALERT(adapter, "PEX error code 0x%x\n",
+			 t3_read_reg(adapter, A_PCIE_PEX_ERR));
+
 	if (t3_handle_intr_status(adapter, A_PCIE_INT_CAUSE, PCIE_INTR_MASK,
 				  pcie_intr_info, adapter->irq_stats))
 		t3_fatal_err(adapter);
@@ -1806,6 +1810,8 @@ void t3_intr_clear(struct adapter *adapter)
 	for (i = 0; i < ARRAY_SIZE(cause_reg_addr); ++i)
 		t3_write_reg(adapter, cause_reg_addr[i], 0x);
 
+	if (is_pcie(adapter))
+		t3_write_reg(adapter, A_PCIE_PEX_ERR, 0x);
 	t3_write_reg(adapter, A_PL_INT_CAUSE0, 0x);
 	t3_read_reg(adapter, A_PL_INT_CAUSE0);	/* flush */
 }
Re: Documentation files in html format?
Rene Herman wrote:
> On 08/10/2007 10:12 PM, Sam Ravnborg wrote:
>
>>> What primary requirements does in-tree Linux kernel documentation have
>>> to fulfill in general?
>>
>> Skipping the obvious ones such as correct, up-to-date etc.
>> o Readable as-is
>> o Grepable
>> o buildable as structured documents or almost like a single book
>> o Easy to replicate structure
>> o Maintainable in any decent text-editor (emacs, vim, whatever)

Low entry barrier for patches from unsuspecting occasional contributors?

> Easy to put online?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=Documentation
http://lxr.linux.no/source/Documentation/
http://users.sosdg.org/~qiyong/lxr/source/Documentation/
http://www.linux-m32r.org/lxr/http/source/Documentation/
http://lxr.free-electrons.com/source/Documentation/

(I admit though that formats like asciidoc or docbook are beneficial for
larger documentation files which want chapters, table of contents, and
internal crossreferences.)
-- 
Stefan Richter
-=-=-=== =--- -=-==
http://arcgraph.de/sr/
[PATCH 13/13] cxgb3 - test MSI capabilities
From: Divy Le Ray <[EMAIL PROTECTED]>

Check that the HW is really in MSI/MSI-X mode when it was successfully
enabled.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_main.c | 42
 drivers/net/cxgb3/regs.h       |4
 2 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index eaebd7f..1449692 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -2318,6 +2318,46 @@ void t3_fatal_err(struct adapter *adapter)
 
 }
 
+/*
+ * Interrupt handler used to check if MSI/MSI-X works on this platform.
+ */
+static irqreturn_t check_intr_handler(int irq, void *cookie)
+{
+	struct adapter *adap = cookie;
+
+	t3_set_reg_field(adap, A_PL_INT_ENABLE0, F_MI1, 0);
+	return IRQ_HANDLED;
+}
+
+static void __devinit check_msi(struct adapter *adap)
+{
+	int vec, mi1;
+
+	if (!(t3_read_reg(adap, A_PL_INT_CAUSE0) & F_MI1))
+		return;
+
+	vec = (adap->flags & USING_MSI) ? adap->pdev->irq :
+					  adap->msix_info[0].vec;
+
+	if (request_irq(vec, check_intr_handler, 0, adap->name, adap))
+		return;
+
+	t3_set_reg_field(adap, A_PL_INT_ENABLE0, 0, F_MI1);
+	msleep(10);
+	mi1 = t3_read_reg(adap, A_PL_INT_ENABLE0) & F_MI1;
+	if (mi1)
+		t3_set_reg_field(adap, A_PL_INT_ENABLE0, F_MI1, 0);
+	free_irq(vec, adap);
+
+	if (mi1) {
+		cxgb_disable_msi(adap);
+		dev_info(&adap->pdev->dev,
+			 "the kernel believes that MSI is available on this "
+			 "platform\nbut the driver's MSI test has failed.  "
+			 "Proceeding with INTx interrupts.\n");
+	}
+}
+
 static int __devinit cxgb_enable_msix(struct adapter *adap)
 {
 	struct msix_entry entries[SGE_QSETS + 1];
@@ -2554,6 +2594,8 @@ static int __devinit init_one(struct pci_dev *pdev,
 		adapter->flags |= USING_MSIX;
 	else if (msi > 0 && pci_enable_msi(pdev) == 0)
 		adapter->flags |= USING_MSI;
+	if (adapter->flags & (USING_MSIX | USING_MSI))
+		check_msi(adapter);
 
 	err = sysfs_create_group(&adapter->port[0]->dev.kobj,
 				 &cxgb3_attr_group);
diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index 5e1bc0d..f97f8ab 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -1639,6 +1639,10 @@
 #define V_MC5A(x) ((x) << S_MC5A)
 #define F_MC5A    V_MC5A(1U)
 
+#define S_MI1    13
+#define V_MI1(x) ((x) << S_MI1)
+#define F_MI1    V_MI1(1U)
+
 #define S_CPL_SWITCH    12
 #define V_CPL_SWITCH(x) ((x) << S_CPL_SWITCH)
 #define F_CPL_SWITCH    V_CPL_SWITCH(1U)
[PATCH 11/13] cxgb3 - Firmware update
From: Divy Le Ray <[EMAIL PROTECTED]>

Update firmware version
Allow the driver to be up and running with older FW image

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h     |2 +-
 drivers/net/cxgb3/cxgb3_main.c |9 +
 drivers/net/cxgb3/t3_hw.c      | 20 +++-
 drivers/net/cxgb3/version.h    |2 +-
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index b665b20..ff867c2 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -691,7 +691,7 @@ int t3_read_flash(struct adapter *adapter, unsigned int addr,
 		  unsigned int nwords, u32 *data, int byte_oriented);
 int t3_load_fw(struct adapter *adapter, const u8 * fw_data, unsigned int size);
 int t3_get_fw_version(struct adapter *adapter, u32 *vers);
-int t3_check_fw_version(struct adapter *adapter);
+int t3_check_fw_version(struct adapter *adapter, int *must_load);
 int t3_init_hw(struct adapter *adapter, u32 fw_params);
 void mac_prep(struct cmac *mac, struct adapter *adapter, int index);
 void early_hw_init(struct adapter *adapter, const struct adapter_info *ai);
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 65ded16..eaebd7f 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -814,11 +814,12 @@ static int cxgb_up(struct adapter *adap)
 	int must_load;
 
 	if (!(adap->flags & FULL_INIT_DONE)) {
-		err = t3_check_fw_version(adap);
-		if (err == -EINVAL)
+		err = t3_check_fw_version(adap, &must_load);
+		if (err == -EINVAL) {
 			err = upgrade_fw(adap);
-		if (err)
-			goto out;
+			if (err && must_load)
+				goto out;
+		}
 
 		err = t3_check_tpsram_version(adap, &must_load);
 		if (err == -EINVAL) {
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 63032e8..3d47627 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -957,16 +957,18 @@ int t3_get_fw_version(struct adapter *adapter, u32 *vers)
 /**
  *	t3_check_fw_version - check if the FW is compatible with this driver
  *	@adapter: the adapter
- *
+ *	@must_load: set to 1 if loading a new FW image is required
+
  *	Checks if an adapter's FW is compatible with the driver.  Returns 0
  *	if the versions are compatible, a negative error otherwise.
  */
-int t3_check_fw_version(struct adapter *adapter)
+int t3_check_fw_version(struct adapter *adapter, int *must_load)
 {
 	int ret;
 	u32 vers;
 	unsigned int type, major, minor;
 
+	*must_load = 1;
 	ret = t3_get_fw_version(adapter, &vers);
 	if (ret)
 		return ret;
@@ -979,9 +981,17 @@ int t3_check_fw_version(struct adapter *adapter)
 	    minor == FW_VERSION_MINOR)
 		return 0;
 
-	CH_ERR(adapter, "found wrong FW version(%u.%u), "
-	       "driver needs version %u.%u\n", major, minor,
-	       FW_VERSION_MAJOR, FW_VERSION_MINOR);
+	if (major != FW_VERSION_MAJOR)
+		CH_ERR(adapter, "found wrong FW version(%u.%u), "
+		       "driver needs version %u.%u\n", major, minor,
+		       FW_VERSION_MAJOR, FW_VERSION_MINOR);
+	else {
+		*must_load = 0;
+		CH_WARN(adapter, "found wrong FW minor version(%u.%u), "
+			"driver compiled for version %u.%u\n", major, minor,
+			FW_VERSION_MAJOR, FW_VERSION_MINOR);
+	}
+
 	return -EINVAL;
 }
 
diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
index eb508bf..ef1c633 100644
--- a/drivers/net/cxgb3/version.h
+++ b/drivers/net/cxgb3/version.h
@@ -39,6 +39,6 @@
 
 /* Firmware version */
 #define FW_VERSION_MAJOR 4
-#define FW_VERSION_MINOR 3
+#define FW_VERSION_MINOR 6
 #define FW_VERSION_MICRO 0
 #endif				/* __CHELSIO_VERSION_H */
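For readers following the must_load logic in the patch above, here is a
minimal standalone sketch of the same policy. It is an illustration only,
not the driver code: check_version(), bring_up(), struct fw_version and the
WANT_* constants are hypothetical names, the hardware read is replaced by a
plain struct argument, and the result of upgrade_fw() is passed in as a
parameter.

```c
#include <errno.h>

/* Hypothetical stand-ins for FW_VERSION_MAJOR/MINOR in version.h. */
enum { WANT_MAJOR = 4, WANT_MINOR = 6 };

struct fw_version {
	unsigned int major;
	unsigned int minor;
};

/*
 * Returns 0 when the running image matches the version the driver was
 * built for, -EINVAL otherwise.  *must_load tells the caller whether a
 * failed upgrade is fatal: a major mismatch is, a minor mismatch is not.
 */
static int check_version(const struct fw_version *v, int *must_load)
{
	*must_load = 1;

	if (v->major == WANT_MAJOR && v->minor == WANT_MINOR)
		return 0;

	if (v->major == WANT_MAJOR)
		*must_load = 0;	/* minor mismatch only: warn and keep going */

	return -EINVAL;
}

/* Mirrors the caller in cxgb_up(): bail out only when loading is required. */
static int bring_up(const struct fw_version *v, int upgrade_result)
{
	int must_load;
	int err = check_version(v, &must_load);

	if (err == -EINVAL) {
		err = upgrade_result;	/* stands in for upgrade_fw() */
		if (err && must_load)
			return err;
	}
	return 0;
}
```

With this shape, a card running 4.3 still comes up when no 4.6 image can be
loaded, while a card with a different major version does not.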
[PATCH 6/13] cxgb3 - tighten checks on TID values
From: Divy Le Ray <[EMAIL PROTECTED]>

Enforce validity checks on connection ids

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_defs.h    | 20 ++--
 drivers/net/cxgb3/cxgb3_offload.c | 28 +++-
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_defs.h b/drivers/net/cxgb3/cxgb3_defs.h
index 483a594..45e9216 100644
--- a/drivers/net/cxgb3/cxgb3_defs.h
+++ b/drivers/net/cxgb3/cxgb3_defs.h
@@ -79,9 +79,17 @@ static inline struct t3c_tid_entry *lookup_tid(const struct tid_info *t,
 static inline struct t3c_tid_entry *lookup_stid(const struct tid_info *t,
 						 unsigned int tid)
 {
+	union listen_entry *e;
+
 	if (tid < t->stid_base || tid >= t->stid_base + t->nstids)
 		return NULL;
-	return &(stid2entry(t, tid)->t3c_tid);
+
+	e = stid2entry(t, tid);
+	if ((void *)e->next >= (void *)t->tid_tab &&
+	    (void *)e->next < (void *)&t->atid_tab[t->natids])
+		return NULL;
+
+	return &e->t3c_tid;
 }
 
 /*
@@ -90,9 +98,17 @@ static inline struct t3c_tid_entry *lookup_stid(const struct tid_info *t,
 static inline struct t3c_tid_entry *lookup_atid(const struct tid_info *t,
 						 unsigned int tid)
 {
+	union active_open_entry *e;
+
 	if (tid < t->atid_base || tid >= t->atid_base + t->natids)
 		return NULL;
-	return &(atid2entry(t, tid)->t3c_tid);
+
+	e = atid2entry(t, tid);
+	if ((void *)e->next >= (void *)t->tid_tab &&
+	    (void *)e->next < (void *)&t->atid_tab[t->natids])
+		return NULL;
+
+	return &e->t3c_tid;
 }
 
 int process_rx(struct t3cdev *dev, struct sk_buff **skbs, int n);
diff --git a/drivers/net/cxgb3/cxgb3_offload.c b/drivers/net/cxgb3/cxgb3_offload.c
index 522c1be..7fb526a 100644
--- a/drivers/net/cxgb3/cxgb3_offload.c
+++ b/drivers/net/cxgb3/cxgb3_offload.c
@@ -57,7 +57,7 @@ static DEFINE_RWLOCK(adapter_list_lock);
 static LIST_HEAD(adapter_list);
 
 static const unsigned int MAX_ATIDS = 64 * 1024;
-static const unsigned int ATID_BASE = 0x10;
+static const unsigned int ATID_BASE = 0x1;
 
 static inline int offload_activated(struct t3cdev *tdev)
 {
@@ -684,10 +684,19 @@ static int do_cr(struct t3cdev *dev, struct sk_buff *skb)
 {
 	struct cpl_pass_accept_req *req = cplhdr(skb);
 	unsigned int stid = G_PASS_OPEN_TID(ntohl(req->tos_tid));
+	struct tid_info *t = &(T3C_DATA(dev))->tid_maps;
 	struct t3c_tid_entry *t3c_tid;
+	unsigned int tid = GET_TID(req);
 
-	t3c_tid = lookup_stid(&(T3C_DATA(dev))->tid_maps, stid);
-	if (t3c_tid->ctx && t3c_tid->client->handlers &&
+	if (unlikely(tid >= t->ntids)) {
+		printk("%s: passive open TID %u too large\n",
+		       dev->name, tid);
+		t3_fatal_err(tdev2adap(dev));
+		return CPL_RET_BUF_DONE;
+	}
+
+	t3c_tid = lookup_stid(t, stid);
+	if (t3c_tid && t3c_tid->ctx && t3c_tid->client->handlers &&
 	    t3c_tid->client->handlers[CPL_PASS_ACCEPT_REQ]) {
 		return t3c_tid->client->handlers[CPL_PASS_ACCEPT_REQ]
 		    (dev, skb, t3c_tid->ctx);
@@ -769,16 +778,25 @@ static int do_act_establish(struct t3cdev *dev, struct sk_buff *skb)
 {
 	struct cpl_act_establish *req = cplhdr(skb);
 	unsigned int atid = G_PASS_OPEN_TID(ntohl(req->tos_tid));
+	struct tid_info *t = &(T3C_DATA(dev))->tid_maps;
 	struct t3c_tid_entry *t3c_tid;
+	unsigned int tid = GET_TID(req);
 
-	t3c_tid = lookup_atid(&(T3C_DATA(dev))->tid_maps, atid);
+	if (unlikely(tid >= t->ntids)) {
+		printk("%s: active establish TID %u too large\n",
+		       dev->name, tid);
+		t3_fatal_err(tdev2adap(dev));
+		return CPL_RET_BUF_DONE;
+	}
+
+	t3c_tid = lookup_atid(t, atid);
 	if (t3c_tid && t3c_tid->ctx && t3c_tid->client->handlers &&
 	    t3c_tid->client->handlers[CPL_ACT_ESTABLISH]) {
 		return t3c_tid->client->handlers[CPL_ACT_ESTABLISH]
 		    (dev, skb, t3c_tid->ctx);
 	} else {
 		printk(KERN_ERR "%s: received clientless CPL command 0x%x\n",
-		       dev->name, CPL_PASS_ACCEPT_REQ);
+		       dev->name, CPL_ACT_ESTABLISH);
 		return CPL_RET_BUF_DONE | CPL_RET_BAD_MSG;
 	}
 }
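The lookup_stid()/lookup_atid() change above combines a range check with a
"is this slot still on the free list" check. A simplified standalone sketch
of that idea follows. The types and names (union entry, struct id_table,
lookup()) are hypothetical, and the sketch checks a single table, whereas
the patch compares against the span from tid_tab to the end of atid_tab.
Like the patch, it is a heuristic: a free slot whose next pointer is NULL
is not detected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified version of the driver's id tables. */
struct entry_data {
	void *ctx;
};

union entry {
	struct entry_data data;	/* meaningful while the id is allocated */
	union entry *next;	/* meaningful while the id is on the free list */
};

struct id_table {
	unsigned int base;	/* first id served by this table */
	unsigned int n;		/* number of ids */
	union entry *tab;
};

/* Returns the entry for id, or NULL if id is out of range or unallocated. */
static struct entry_data *lookup(const struct id_table *t, unsigned int id)
{
	union entry *e;
	uintptr_t next, lo, hi;

	if (id < t->base || id >= t->base + t->n)
		return NULL;

	e = &t->tab[id - t->base];

	/* A slot whose link points back into the table is still free. */
	next = (uintptr_t)e->next;
	lo = (uintptr_t)t->tab;
	hi = (uintptr_t)(t->tab + t->n);
	if (next >= lo && next < hi)
		return NULL;

	return &e->data;
}
```

Callers then only need a NULL test before dereferencing, which is what the
added "t3c_tid &&" in do_cr() provides.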
[PATCH 10/13] cxgb3 - engine microcode update
From: Divy Le Ray <[EMAIL PROTECTED]>

Load microcode engine when the interface is configured up.
Bump up version to 1.1.0.
Allow the driver to be up and running with older microcode images.
Allow ethtool to log the microcode version.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h     |8 ++-
 drivers/net/cxgb3/cxgb3_main.c | 116
 drivers/net/cxgb3/t3_hw.c      | 43 +--
 3 files changed, 113 insertions(+), 54 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index d54446f..b665b20 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -127,8 +127,8 @@ enum {			/* adapter interrupt-maintained statistics */
 
 enum {
 	TP_VERSION_MAJOR	= 1,
-	TP_VERSION_MINOR	= 0,
-	TP_VERSION_MICRO	= 44
+	TP_VERSION_MINOR	= 1,
+	TP_VERSION_MICRO	= 0
 };
 
 #define S_TP_VERSION_MAJOR		16
@@ -438,6 +438,7 @@ enum {			/* chip revisions */
 	T3_REV_A  = 0,
 	T3_REV_B  = 2,
 	T3_REV_B2 = 3,
+	T3_REV_C  = 4,
 };
 
 struct trace_params {
@@ -682,7 +683,8 @@ const struct adapter_info *t3_get_adapter_info(unsigned int board_id);
 int t3_seeprom_read(struct adapter *adapter, u32 addr, u32 *data);
 int t3_seeprom_write(struct adapter *adapter, u32 addr, u32 data);
 int t3_seeprom_wp(struct adapter *adapter, int enable);
-int t3_check_tpsram_version(struct adapter *adapter);
+int t3_get_tp_version(struct adapter *adapter, u32 *vers);
+int t3_check_tpsram_version(struct adapter *adapter, int *must_load);
 int t3_check_tpsram(struct adapter *adapter, u8 *tp_ram, unsigned int size);
 int t3_set_proto_sram(struct adapter *adap, u8 *data);
 int t3_read_flash(struct adapter *adapter, unsigned int addr,
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index e5744e7..65ded16 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -721,6 +721,7 @@ static void bind_qsets(struct adapter *adap)
 }
 
 #define FW_FNAME "t3fw-%d.%d.%d.bin"
+#define TPSRAM_NAME "t3%c_protocol_sram-%d.%d.%d.bin"
 
 static int upgrade_fw(struct adapter *adap)
 {
@@ -742,6 +743,61 @@ static int upgrade_fw(struct adapter *adap)
 	return ret;
 }
 
+static inline char t3rev2char(struct adapter *adapter)
+{
+	char rev = 0;
+
+	switch(adapter->params.rev) {
+	case T3_REV_A:
+		rev = 'a';
+		break;
+	case T3_REV_B:
+	case T3_REV_B2:
+		rev = 'b';
+		break;
+	case T3_REV_C:
+		rev = 'c';
+		break;
+	}
+	return rev;
+}
+
+int update_tpsram(struct adapter *adap)
+{
+	const struct firmware *tpsram;
+	char buf[64];
+	struct device *dev = &adap->pdev->dev;
+	int ret;
+	char rev;
+
+	rev = t3rev2char(adap);
+	if (!rev)
+		return 0;
+
+	snprintf(buf, sizeof(buf), TPSRAM_NAME, rev,
+		 TP_VERSION_MAJOR, TP_VERSION_MINOR, TP_VERSION_MICRO);
+
+	ret = request_firmware(&tpsram, buf, dev);
+	if (ret < 0) {
+		dev_err(dev, "could not load TP SRAM: unable to load %s\n",
+			buf);
+		return ret;
+	}
+
+	ret = t3_check_tpsram(adap, tpsram->data, tpsram->size);
+	if (ret)
+		goto release_tpsram;
+
+	ret = t3_set_proto_sram(adap, tpsram->data);
+	if (ret)
+		dev_err(dev, "loading protocol SRAM failed\n");
+
+release_tpsram:
+	release_firmware(tpsram);
+
+	return ret;
+}
+
 /**
  *	cxgb_up - enable the adapter
  *	@adapter: adapter being enabled
@@ -755,6 +811,7 @@ static int upgrade_fw(struct adapter *adap)
 static int cxgb_up(struct adapter *adap)
 {
 	int err = 0;
+	int must_load;
 
 	if (!(adap->flags & FULL_INIT_DONE)) {
 		err = t3_check_fw_version(adap);
@@ -763,6 +820,13 @@ static int cxgb_up(struct adapter *adap)
 		if (err)
 			goto out;
 
+		err = t3_check_tpsram_version(adap, &must_load);
+		if (err == -EINVAL) {
+			err = update_tpsram(adap);
+			if (err && must_load)
+				goto out;
+		}
+
 		err = init_dummy_netdevs(adap);
 		if (err)
 			goto out;
@@ -1097,9 +1161,11 @@ static int get_eeprom_len(struct net_device *dev)
 static void get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 {
 	u32 fw_vers = 0;
+	u32 tp_vers = 0;
 	struct adapter *adapter = dev->priv;
 
 	t3_get_fw_version(adapter, &fw_vers);
+	t3_get_tp_version(adapter, &tp_vers);
 
 	strcpy(info->driver, DRV_NAME);
 	strcpy(info->ver
[PATCH 9/13] cxgb3 - Update internal memory management
From: Divy Le Ray <[EMAIL PROTECTED]>

Set PM1 internal memory to round robin mode

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/regs.h  |2 ++
 drivers/net/cxgb3/t3_hw.c |2 ++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index 2824278..5e1bc0d 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -1326,6 +1326,7 @@
 #define V_D0_WEIGHT(x) ((x) << S_D0_WEIGHT)
 
 #define A_PM1_RX_CFG 0x5c0
+#define A_PM1_RX_MODE 0x5c4
 
 #define A_PM1_RX_INT_ENABLE 0x5d8
 
@@ -1394,6 +1395,7 @@
 #define A_PM1_RX_INT_CAUSE 0x5dc
 
 #define A_PM1_TX_CFG 0x5e0
+#define A_PM1_TX_MODE 0x5e4
 
 #define A_PM1_TX_INT_ENABLE 0x5f8
 
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 23b1a16..13bfbec 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -3189,6 +3189,8 @@ int t3_init_hw(struct adapter *adapter, u32 fw_params)
 		t3_set_reg_field(adapter, A_PCIX_CFG, 0, F_CLIDECEN);
 
 	t3_write_reg(adapter, A_PM1_RX_CFG, 0x);
+	t3_write_reg(adapter, A_PM1_RX_MODE, 0);
+	t3_write_reg(adapter, A_PM1_TX_MODE, 0);
 	init_hw_for_avail_ports(adapter, adapter->params.nports);
 	t3_sge_init(adapter, &adapter->params.sge);
 
[PATCH 8/13] cxgb3 - log adapter serial number
From: Divy Le Ray <[EMAIL PROTECTED]>

Log HW serial number when cxgb3 module is loaded.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h     |2 ++
 drivers/net/cxgb3/cxgb3_main.c |6 --
 drivers/net/cxgb3/t3_hw.c      |3 ++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index 55922ed..d54446f 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -97,6 +97,7 @@ enum {
 	MAX_NPORTS = 2,		/* max # of ports */
 	MAX_FRAME_SIZE = 10240,	/* max MAC frame size, including header + FCS */
 	EEPROMSIZE = 8192,	/* Serial EEPROM size */
+	SERNUM_LEN = 16,	/* Serial # length */
 	RSS_TABLE_SIZE = 64,	/* size of RSS lookup and mapping tables */
 	TCB_SIZE = 128,		/* TCB size */
 	NMTUS = 16,		/* size of MTU table */
@@ -391,6 +392,7 @@ struct vpd_params {
 	unsigned int uclk;
 	unsigned int mdc;
 	unsigned int mem_timing;
+	u8 sn[SERNUM_LEN + 1];
 	u8 eth_base[6];
 	u8 port_type[MAX_NPORTS];
 	unsigned short xauicfg[2];
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index a1f94cf..e5744e7 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -2333,10 +2333,12 @@ static void __devinit print_port_info(struct adapter *adap,
 		       (adap->flags & USING_MSIX) ? " MSI-X" :
 		       (adap->flags & USING_MSI) ? " MSI" : "");
 		if (adap->name == dev->name && adap->params.vpd.mclk)
-			printk(KERN_INFO "%s: %uMB CM, %uMB PMTX, %uMB PMRX\n",
+			printk(KERN_INFO
+			       "%s: %uMB CM, %uMB PMTX, %uMB PMRX, S/N: %s\n",
 			       adap->name, t3_mc7_size(&adap->cm) >> 20,
 			       t3_mc7_size(&adap->pmtx) >> 20,
-			       t3_mc7_size(&adap->pmrx) >> 20);
+			       t3_mc7_size(&adap->pmrx) >> 20,
+			       adap->params.vpd.sn);
 	}
 }
 
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index dd3149d..23b1a16 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -505,7 +505,7 @@ struct t3_vpd {
 	u8 vpdr_len[2];
 	VPD_ENTRY(pn, 16);	/* part number */
 	VPD_ENTRY(ec, 16);	/* EC level */
-	VPD_ENTRY(sn, 16);	/* serial number */
+	VPD_ENTRY(sn, SERNUM_LEN);	/* serial number */
 	VPD_ENTRY(na, 12);	/* MAC address base */
 	VPD_ENTRY(cclk, 6);	/* core clock */
 	VPD_ENTRY(mclk, 6);	/* mem clock */
@@ -648,6 +648,7 @@ static int get_vpd_params(struct adapter *adapter, struct vpd_params *p)
 	p->uclk = simple_strtoul(vpd.uclk_data, NULL, 10);
 	p->mdc = simple_strtoul(vpd.mdc_data, NULL, 10);
 	p->mem_timing = simple_strtoul(vpd.mt_data, NULL, 10);
+	memcpy(p->sn, vpd.sn_data, SERNUM_LEN);
 
 	/* Old eeproms didn't have port information */
 	if (adapter->params.rev == 0 && !vpd.port0_data[0]) {
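The patch above copies a fixed-width VPD field into a buffer one byte
larger, so that it can later be printed with %s. A small standalone sketch
of that pattern follows. copy_serial() is a hypothetical helper, and unlike
the patch it writes the terminator explicitly; the patch appears to rely on
the destination structure already being zeroed, which is an assumption here
and not something this thread states.

```c
#include <string.h>

enum { SERNUM_LEN = 16 };	/* same width as in the patch */

/*
 * Copy a fixed-width field that is not necessarily NUL-terminated into a
 * C string.  dst must have room for SERNUM_LEN + 1 bytes.
 */
static void copy_serial(char *dst, const unsigned char *field)
{
	memcpy(dst, field, SERNUM_LEN);
	dst[SERNUM_LEN] = '\0';	/* explicit terminator (differs from the patch) */
}
```

The extra byte in "u8 sn[SERNUM_LEN + 1]" is what makes the later
printk("... S/N: %s\n", ...) safe even when the serial number uses all 16
characters.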
[PATCH 4/13] cxgb3 - use immediate data for offload Tx
From: Divy Le Ray <[EMAIL PROTECTED]>

Send small TX_DATA work requests as immediate data even when
there are fragments.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/sge.c | 17 +++--
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 9213cda..dca2716 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -1182,8 +1182,8 @@ int t3_eth_xmit(struct sk_buff *skb, struct net_device *dev)
  *
  *	Writes a packet as immediate data into a Tx descriptor.  The packet
  *	contains a work request at its beginning.  We must write the packet
- *	carefully so the SGE doesn't read accidentally before it's written in
- *	its entirety.
+ *	carefully so the SGE doesn't read it accidentally before it's written
+ *	in its entirety.
  */
 static inline void write_imm(struct tx_desc *d, struct sk_buff *skb,
 			     unsigned int len, unsigned int gen)
@@ -1191,7 +1191,11 @@ static inline void write_imm(struct tx_desc *d, struct sk_buff *skb,
 	struct work_request_hdr *from = (struct work_request_hdr *)skb->data;
 	struct work_request_hdr *to = (struct work_request_hdr *)d;
 
-	memcpy(&to[1], &from[1], len - sizeof(*from));
+	if (likely(!skb->data_len))
+		memcpy(&to[1], &from[1], len - sizeof(*from));
+	else
+		skb_copy_bits(skb, sizeof(*from), &to[1], len - sizeof(*from));
+
 	to->wr_hi = from->wr_hi | htonl(F_WR_SOP | F_WR_EOP |
 					V_WR_BCNTLFLT(len & 7));
 	wmb();
@@ -1261,7 +1265,7 @@ static inline void reclaim_completed_tx_imm(struct sge_txq *q)
 
 static inline int immediate(const struct sk_buff *skb)
 {
-	return skb->len <= WR_LEN && !skb->data_len;
+	return skb->len <= WR_LEN;
 }
 
 /**
@@ -1467,12 +1471,13 @@ static void write_ofld_wr(struct adapter *adap, struct sk_buff *skb,
  */
 static inline unsigned int calc_tx_descs_ofld(const struct sk_buff *skb)
 {
-	unsigned int flits, cnt = skb_shinfo(skb)->nr_frags;
+	unsigned int flits, cnt;
 
-	if (skb->len <= WR_LEN && cnt == 0)
+	if (skb->len <= WR_LEN)
 		return 1;	/* packet fits as immediate data */
 
 	flits = skb_transport_offset(skb) / 8;	/* headers */
+	cnt = skb_shinfo(skb)->nr_frags;
 	if (skb->tail != skb->transport_header)
 		cnt++;
 	return flits_to_desc(flits + sgl_len(cnt));
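The effect of the patch above is that packet size alone decides whether
immediate data is used, and the copy gathers fragments when there are any.
Here is a rough userspace sketch of that decision. struct pkt, struct frag
and IMM_MAX are hypothetical stand-ins for sk_buff, its page fragments and
WR_LEN (the value 128 is made up), and the gather loop only approximates
what skb_copy_bits() does for the driver.

```c
#include <stddef.h>
#include <string.h>

enum { IMM_MAX = 128 };	/* hypothetical immediate-data capacity, in bytes */

struct frag {
	const unsigned char *data;
	size_t len;
};

struct pkt {
	const unsigned char *head;	/* linear part */
	size_t head_len;
	const struct frag *frags;	/* non-linear part, may be empty */
	size_t nr_frags;
};

static size_t pkt_len(const struct pkt *p)
{
	size_t i, len = p->head_len;

	for (i = 0; i < p->nr_frags; i++)
		len += p->frags[i].len;
	return len;
}

/* Rule after the patch: size alone decides, fragments no longer matter. */
static int immediate(const struct pkt *p)
{
	return pkt_len(p) <= IMM_MAX;
}

/* Gather the whole packet into the descriptor; returns bytes written. */
static size_t write_imm(unsigned char *desc, const struct pkt *p)
{
	size_t i, off = p->head_len;

	memcpy(desc, p->head, p->head_len);	/* linear data */
	for (i = 0; i < p->nr_frags; i++) {	/* fragments, if any */
		memcpy(desc + off, p->frags[i].data, p->frags[i].len);
		off += p->frags[i].len;
	}
	return off;
}
```

The caller is expected to check immediate() first; write_imm() itself does
not bound the copy.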
[PATCH 5/13] cxgb3 - Expose HW memory page info
From: Divy Le Ray <[EMAIL PROTECTED]>

Let the RDMA driver get HW page info to work around HW issues.
Assign explicit enum values.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_ctl_defs.h | 52 +---
 drivers/net/cxgb3/cxgb3_offload.c  |7 +
 2 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_ctl_defs.h b/drivers/net/cxgb3/cxgb3_ctl_defs.h
index 2095dda..6c4f320 100644
--- a/drivers/net/cxgb3/cxgb3_ctl_defs.h
+++ b/drivers/net/cxgb3/cxgb3_ctl_defs.h
@@ -33,27 +33,29 @@
 #define _CXGB3_OFFLOAD_CTL_DEFS_H
 
 enum {
-	GET_MAX_OUTSTANDING_WR,
-	GET_TX_MAX_CHUNK,
-	GET_TID_RANGE,
-	GET_STID_RANGE,
-	GET_RTBL_RANGE,
-	GET_L2T_CAPACITY,
-	GET_MTUS,
-	GET_WR_LEN,
-	GET_IFF_FROM_MAC,
-	GET_DDP_PARAMS,
-	GET_PORTS,
-
-	ULP_ISCSI_GET_PARAMS,
-	ULP_ISCSI_SET_PARAMS,
-
-	RDMA_GET_PARAMS,
-	RDMA_CQ_OP,
-	RDMA_CQ_SETUP,
-	RDMA_CQ_DISABLE,
-	RDMA_CTRL_QP_SETUP,
-	RDMA_GET_MEM,
+	GET_MAX_OUTSTANDING_WR	= 0,
+	GET_TX_MAX_CHUNK	= 1,
+	GET_TID_RANGE		= 2,
+	GET_STID_RANGE		= 3,
+	GET_RTBL_RANGE		= 4,
+	GET_L2T_CAPACITY	= 5,
+	GET_MTUS		= 6,
+	GET_WR_LEN		= 7,
+	GET_IFF_FROM_MAC	= 8,
+	GET_DDP_PARAMS		= 9,
+	GET_PORTS		= 10,
+
+	ULP_ISCSI_GET_PARAMS	= 11,
+	ULP_ISCSI_SET_PARAMS	= 12,
+
+	RDMA_GET_PARAMS		= 13,
+	RDMA_CQ_OP		= 14,
+	RDMA_CQ_SETUP		= 15,
+	RDMA_CQ_DISABLE		= 16,
+	RDMA_CTRL_QP_SETUP	= 17,
+	RDMA_GET_MEM		= 18,
+
+	GET_RX_PAGE_INFO	= 50,
 };
 
 /*
@@ -161,4 +163,12 @@ struct rdma_ctrlqp_setup {
 	unsigned long long base_addr;
 	unsigned int size;
 };
+
+/*
+ * Offload TX/RX page information.
+ */
+struct ofld_page_info {
+	unsigned int page_size;	/* Page size, should be a power of 2 */
+	unsigned int num;	/* Number of pages */
+};
 #endif				/* _CXGB3_OFFLOAD_CTL_DEFS_H */
diff --git a/drivers/net/cxgb3/cxgb3_offload.c b/drivers/net/cxgb3/cxgb3_offload.c
index e620ed4..522c1be 100644
--- a/drivers/net/cxgb3/cxgb3_offload.c
+++ b/drivers/net/cxgb3/cxgb3_offload.c
@@ -317,6 +317,8 @@ static int cxgb_offload_ctl(struct t3cdev *tdev, unsigned int req, void *data)
 	struct iff_mac *iffmacp;
 	struct ddp_params *ddpp;
 	struct adap_ports *ports;
+	struct ofld_page_info *rx_page_info;
+	struct tp_params *tp = &adapter->params.tp;
 	int i;
 
 	switch (req) {
@@ -382,6 +384,11 @@ static int cxgb_offload_ctl(struct t3cdev *tdev, unsigned int req, void *data)
 		if (!offload_running(adapter))
 			return -EAGAIN;
 		return cxgb_rdma_ctl(adapter, req, data);
+	case GET_RX_PAGE_INFO:
+		rx_page_info = data;
+		rx_page_info->page_size = tp->rx_pg_size;
+		rx_page_info->num = tp->rx_num_pgs;
+		break;
 	default:
 		return -EOPNOTSUPP;
 	}
[PATCH 7/13] cxgb3 - Fatal error update
From: Divy Le Ray <[EMAIL PROTECTED]>

Stop the MAC when a fatal error is detected.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/cxgb3_main.c |4
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index dc5d269..a1f94cf 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -2270,6 +2270,10 @@ void t3_fatal_err(struct adapter *adapter)
 
 	if (adapter->flags & FULL_INIT_DONE) {
 		t3_sge_stop(adapter);
+		t3_write_reg(adapter, A_XGM_TX_CTRL, 0);
+		t3_write_reg(adapter, A_XGM_RX_CTRL, 0);
+		t3_write_reg(adapter, XGM_REG(A_XGM_TX_CTRL, 1), 0);
+		t3_write_reg(adapter, XGM_REG(A_XGM_RX_CTRL, 1), 0);
 		t3_intr_disable(adapter);
 	}
 	CH_ALERT(adapter, "encountered fatal error, operation suspended\n");
[PATCH 2/13] cxgb3 - Update rx coalescing length
From: Divy Le Ray <[EMAIL PROTECTED]>

Set max Rx coalescing length to 12288

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/common.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index c46c249..55922ed 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -104,7 +104,7 @@ enum {
 	PROTO_SRAM_LINES = 128, /* size of TP sram */
 };
 
-#define MAX_RX_COALESCING_LEN 16224U
+#define MAX_RX_COALESCING_LEN 12288U
 
 enum {
 	PAUSE_RX = 1 << 0,
[PATCH 3/13] cxgb3 - SGE doorbell overflow warning
From: Divy Le Ray <[EMAIL PROTECTED]>

Log doorbell Fifo overflow

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/regs.h |    8 ++++++++
 drivers/net/cxgb3/sge.c  |    4 ++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index aa80313..2824278 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -172,6 +172,14 @@
 
 #define A_SG_INT_CAUSE 0x5c
 
+#define S_HIPIODRBDROPERR    11
+#define V_HIPIODRBDROPERR(x) ((x) << S_HIPIODRBDROPERR)
+#define F_HIPIODRBDROPERR    V_HIPIODRBDROPERR(1U)
+
+#define S_LOPIODRBDROPERR    10
+#define V_LOPIODRBDROPERR(x) ((x) << S_LOPIODRBDROPERR)
+#define F_LOPIODRBDROPERR    V_LOPIODRBDROPERR(1U)
+
 #define S_RSPQDISABLED    3
 #define V_RSPQDISABLED(x) ((x) << S_RSPQDISABLED)
 #define F_RSPQDISABLED    V_RSPQDISABLED(1U)
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index a2cfd68..9213cda 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -2476,6 +2476,10 @@ void t3_sge_err_intr_handler(struct adapter *adapter)
 			 "(0x%x)\n", (v >> S_RSPQ0DISABLED) & 0xff);
 	}
 
+	if (status & (F_HIPIODRBDROPERR | F_LOPIODRBDROPERR))
+		CH_ALERT(adapter, "SGE dropped %s priority doorbell\n",
+			 status & F_HIPIODRBDROPERR ? "high" : "lo");
+
 	t3_write_reg(adapter, A_SG_INT_CAUSE, status);
 	if (status & (F_RSPQCREDITOVERFOW | F_RSPQDISABLED))
 		t3_fatal_err(adapter);
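For readers unfamiliar with the S_/V_/F_ register macro convention used in regs.h, here is a minimal userspace sketch. The bit positions 11 and 10 are the ones given in the patch; doorbell_drop_priority() is a hypothetical helper written for illustration and is not a driver function.

```c
#include <stddef.h>

/* S_ = bit shift, V_ = value placed in the field, F_ = single-bit flag. */
#define S_HIPIODRBDROPERR    11
#define V_HIPIODRBDROPERR(x) ((x) << S_HIPIODRBDROPERR)
#define F_HIPIODRBDROPERR    V_HIPIODRBDROPERR(1U)

#define S_LOPIODRBDROPERR    10
#define V_LOPIODRBDROPERR(x) ((x) << S_LOPIODRBDROPERR)
#define F_LOPIODRBDROPERR    V_LOPIODRBDROPERR(1U)

/*
 * Hypothetical helper: returns the word the alert message would print
 * ("high" or "lo"), or NULL when neither drop bit is set in status.
 * As in the patch, "high" wins when both bits are set.
 */
const char *doorbell_drop_priority(unsigned int status)
{
	if (!(status & (F_HIPIODRBDROPERR | F_LOPIODRBDROPERR)))
		return NULL;
	return (status & F_HIPIODRBDROPERR) ? "high" : "lo";
}
```

Note that a status word with both bits set reports only the high priority drop, which matches the ternary in the handler above.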
[PATCH 1/13] cxgb3 - MAC workaround update
From: Divy Le Ray <[EMAIL PROTECTED]> Update the MAC workaround to deal with switches that do not honor pause frames. Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]> --- drivers/net/cxgb3/common.h |1 + drivers/net/cxgb3/xgmac.c | 22 +++--- 2 files changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h index 1637800..c46c249 100644 --- a/drivers/net/cxgb3/common.h +++ b/drivers/net/cxgb3/common.h @@ -507,6 +507,7 @@ struct cmac { unsigned int tx_xcnt; u64 tx_mcnt; unsigned int rx_xcnt; + unsigned int rx_ocnt; u64 rx_mcnt; unsigned int toggle_cnt; unsigned int txen; diff --git a/drivers/net/cxgb3/xgmac.c b/drivers/net/cxgb3/xgmac.c index c302b1a..1d1c391 100644 --- a/drivers/net/cxgb3/xgmac.c +++ b/drivers/net/cxgb3/xgmac.c @@ -437,12 +437,13 @@ int t3_mac_enable(struct cmac *mac, int which) struct mac_stats *s = &mac->stats; if (which & MAC_DIRECTION_TX) { - t3_write_reg(adap, A_XGM_TX_CTRL + oft, F_TXEN); t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_CFG_CH0 + idx); t3_write_reg(adap, A_TP_PIO_DATA, 0xc0ede401); t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_MODE); t3_set_reg_field(adap, A_TP_PIO_DATA, 1 << idx, 1 << idx); + t3_write_reg(adap, A_XGM_TX_CTRL + oft, F_TXEN); + t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_CNT_CH0 + idx); mac->tx_mcnt = s->tx_frames; mac->tx_tcnt = (G_TXDROPCNTCH0RCVD(t3_read_reg(adap, @@ -454,6 +455,7 @@ int t3_mac_enable(struct cmac *mac, int which) mac->rx_xcnt = (G_TXSPI4SOPCNT(t3_read_reg(adap, A_XGM_RX_SPI4_SOP_EOP_CNT + oft))); + mac->rx_ocnt = s->rx_fifo_ovfl; mac->txen = F_TXEN; mac->toggle_cnt = 0; } @@ -464,24 +466,19 @@ int t3_mac_enable(struct cmac *mac, int which) int t3_mac_disable(struct cmac *mac, int which) { - int idx = macidx(mac); struct adapter *adap = mac->adapter; - int val; if (which & MAC_DIRECTION_TX) { t3_write_reg(adap, A_XGM_TX_CTRL + mac->offset, 0); - t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_CFG_CH0 + idx); - t3_write_reg(adap, 
A_TP_PIO_DATA, 0xc01f); - t3_write_reg(adap, A_TP_PIO_ADDR, A_TP_TX_DROP_MODE); - t3_set_reg_field(adap, A_TP_PIO_DATA, 1 << idx, 1 << idx); mac->txen = 0; } if (which & MAC_DIRECTION_RX) { + int val = F_MAC_RESET_; + t3_set_reg_field(mac->adapter, A_XGM_RESET_CTRL + mac->offset, F_PCS_RESET_, 0); msleep(100); t3_write_reg(adap, A_XGM_RX_CTRL + mac->offset, 0); - val = F_MAC_RESET_; if (is_10G(adap)) val |= F_PCS_RESET_; else if (uses_xaui(adap)) @@ -541,11 +538,14 @@ int t3b2_mac_watchdog_task(struct cmac *mac) } rxcheck: - if (rx_mcnt != mac->rx_mcnt) + if (rx_mcnt != mac->rx_mcnt) { rx_xcnt = (G_TXSPI4SOPCNT(t3_read_reg(adap, A_XGM_RX_SPI4_SOP_EOP_CNT + - mac->offset))); - else + mac->offset))) + + (s->rx_fifo_ovfl - +mac->rx_ocnt); + mac->rx_ocnt = s->rx_fifo_ovfl; + } else goto out; if (mac->rx_mcnt != s->rx_frames && rx_xcnt == 0 && - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/2] writeback dirty inodes fixes
On Sat, Aug 11, 2007 at 02:02:02PM +0800, Fengguang Wu wrote: > Andrew, > > Now the patches are simplified and rebased to 2.6.23-rc2-mm2. > > The following two patches should be put immediately after > writeback-fix-periodic-superblock-dirty-inode-flushing.patch: > > writeback: fix time ordering of the per superblock inode lists 8 > writeback: fix ntfs with sb_has_dirty_inodes() The following tree patches should be updated to resolve merge conflicts: sync_sb_inodes-propagate-errors.patch reiser4-sb_sync_inodes.patch check_dirty_inode_list.patch (extended to check s_io/s_more_io) They are attached in this mail. From: Andrew Morton <[EMAIL PROTECTED]> Guillame points out that sync_sb_inodes() is failing to propagate error codes back. Fix that, and make several other void-returning functions not drop reportable error codes. Cc: Guillaume Chazarain <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- fs/fs-writeback.c | 56 +++- include/linux/writeback.h |6 +-- 2 files changed, 45 insertions(+), 17 deletions(-) --- linux-2.6.23-rc2-mm2.orig/fs/fs-writeback.c +++ linux-2.6.23-rc2-mm2/fs/fs-writeback.c @@ -392,13 +392,17 @@ __writeback_single_inode(struct inode *i * on the writer throttling path, and we get decent balancing between many * throttled threads: we don't want them all piling up on inode_sync_wait. 
*/ -static void +static int sync_sb_inodes(struct super_block *sb, struct writeback_control *wbc) { + int ret = 0; + if (!wbc->for_kupdate || list_empty(&sb->s_io)) queue_io(sb, wbc->older_than_this); while (!list_empty(&sb->s_io)) { + int err; + struct inode *inode = list_entry(sb->s_io.prev, struct inode, i_list); struct address_space *mapping = inode->i_mapping; @@ -444,7 +448,9 @@ sync_sb_inodes(struct super_block *sb, s BUG_ON(inode->i_state & I_FREEING); __iget(inode); pages_skipped = wbc->pages_skipped; - __writeback_single_inode(inode, wbc); + err = __writeback_single_inode(inode, wbc); + if (!ret) + ret = err; if (wbc->sync_mode == WB_SYNC_HOLD) { inode->dirtied_when = jiffies; list_move(&inode->i_list, &sb->s_dirty); @@ -469,7 +475,7 @@ sync_sb_inodes(struct super_block *sb, s if (list_empty(&sb->s_io)) list_splice_init(&sb->s_more_io, &sb->s_io); - return; /* Leave any unwritten inodes on s_io */ + return ret; /* Leave any unwritten inodes on s_io */ } /* @@ -491,10 +497,10 @@ sync_sb_inodes(struct super_block *sb, s * sync_sb_inodes will seekout the blockdev which matches `bdi'. Maybe not * super-efficient but we're about to do a ton of I/O... */ -void -writeback_inodes(struct writeback_control *wbc) +int writeback_inodes(struct writeback_control *wbc) { struct super_block *sb; + int ret = 0; might_sleep(); spin_lock(&sb_lock); @@ -512,9 +518,13 @@ restart: */ if (down_read_trylock(&sb->s_umount)) { if (sb->s_root) { + int err; + spin_lock(&inode_lock); - sync_sb_inodes(sb, wbc); + err = sync_sb_inodes(sb, wbc); spin_unlock(&inode_lock); + if (!ret) + ret = err; } up_read(&sb->s_umount); } @@ -526,6 +536,7 @@ restart: break; } spin_unlock(&sb_lock); + return ret; } /* @@ -539,7 +550,7 @@ restart: * We add in the number of potentially dirty inodes, because each inode write * can dirty pagecache in the underlying blockdev. 
*/ -void sync_inodes_sb(struct super_block *sb, int wait) +int sync_inodes_sb(struct super_block *sb, int wait) { struct writeback_control wbc = { .sync_mode = wait ? WB_SYNC_ALL : WB_SYNC_HOLD, @@ -548,14 +559,16 @@ void sync_inodes_sb(struct super_block * }; unsigned long nr_dirty = global_page_state(NR_FILE_DIRTY); unsigned long nr_unstable = global_page_state(NR_UNSTABLE_NFS); + int ret; wbc.nr_to_write = nr_dirty + nr_unstable + (inodes_stat.nr_inodes - inodes_stat.nr_unused) + nr_dirty + nr_unstable; wbc.nr_to_write += wbc.nr_to_write / 2; /* Bit more for luck */ spin_lock(&inode_lock); - sync_sb_inodes(sb, &wbc); + ret = sync_sb_inodes(sb, &wbc); spin_unlock(&inode_lock); + return ret; } /* @@ -591,13 +604,16 @@ static void set_sb_syncing(int val) * outstanding dirty inodes, the writeback goes block-at-a-time within the * filesystem's write_inode(). This is extremely slow. */ -static void __sync_inodes(int wait) +static int __sync_inodes(int wait) { struct super_block *sb; + int ret = 0; spin_lock(&sb_lock); restart: list_for_each_entry(sb, &super_blocks, s_list) { + int err; + if (sb->s_syncing) continue; sb->s_syncing = 1; @@ -605,8 +621,12 @@ restart: spin_unlock(&sb_lock); down_read(&sb->s_umount); if (sb->s_root) { - sync_inodes_sb(sb, wait); - sync_blockdev(sb->s_bdev); + err = sync_inodes_sb(sb, wait); + if (!ret) +ret = err; + err = sync_blockdev(sb->s_bdev); + if (!ret) +ret = err;
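The "if (!ret) ret = err;" idiom that this patch threads through the writeback functions keeps going after a failure but reports the first error seen. A small userspace sketch of that pattern follows; sync_all() and fake_sync() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*sync_fn)(void *item);

/* Hypothetical stand-in for a per-item sync: the item holds its own result. */
int fake_sync(void *item)
{
	return *(int *)item;
}

/*
 * Call fn on every item, even after one of them fails, and return the
 * first non-zero error code (0 if all succeeded).
 */
int sync_all(void **items, size_t n, sync_fn fn)
{
	int ret = 0;
	size_t i;

	assert(items != NULL && fn != NULL);
	for (i = 0; i < n; i++) {
		int err = fn(items[i]);

		if (!ret)	/* remember only the first failure */
			ret = err;
	}
	return ret;
}
```

The design choice is that later work is never skipped because of an earlier error, while the caller still learns that something went wrong.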
Re: Software based ECC ?
On Fri, 10 Aug 2007 23:16:45 +0200, roland said:
> http://pdos.csail.mit.edu/papers/softecc:ddopson-meng/softecc_ddopson-meng.pdf
>
> "SoftECC : A System for Software Memory Integrity Checking"
>
> Is it possible to implement something like this within the Linux virtual
> memory subsystem ?

Anything that can be simulated with a Turing machine is *possible*. The question is how many rocket boosters the pig needs for takeoff. Hint: The thesis talks about why he didn't implement it for Linux.

> If it can be done, wouldn't this be a great feature ?

Read section 5.2 of that thesis, particularly this quote from 5.2.2:

"For random word writes, this implies that SoftECC will need an order of magnitude more compute time than the user-mode code"

Basically, on every single memory page that gets dirtied, we have to then re-checksum the page (blowing away cache lines in the process). If you want to get a feel for it, find the kernel code that recognizes that a page is dirtied, and just add a few lines there:

	int foo = 0, i;
	for (i = 0; i < 1024; i++) {	/* adjust for non-4K pages */
		foo ^= *(page + i);
	}

and see how much your system crawls.

Personally, I'd recommend just shelling out the bucks for hardware ECC if the reliability matters.
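The XOR pass suggested in this message can be tried in userspace before touching any kernel code. The sketch below assumes 4 KiB pages and 32-bit words; page_xor() and PAGE_BYTES are illustrative names. It is only a parity-style checksum for getting a feel for the cost, not the scheme from the thesis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096u	/* assumption: 4 KiB pages */

/* XOR together every 32-bit word of one page. */
uint32_t page_xor(const uint32_t *page)
{
	uint32_t acc = 0;
	size_t i;

	assert(page != NULL);
	for (i = 0; i < PAGE_BYTES / sizeof(uint32_t); i++)
		acc ^= page[i];
	return acc;
}
```

Timing this over a large buffer that does not fit in cache gives a rough idea of what re-checksumming every dirtied page would cost.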
[PATCH 2/2] writeback: fix ntfs with sb_has_dirty_inodes()
NTFS's if-condition on dirty inodes is not complete. Fix it with sb_has_dirty_inodes(). Cc: Anton Altaparmakov <[EMAIL PROTECTED]> Cc: Ken Chen <[EMAIL PROTECTED]> Cc: Andrew Morton <[EMAIL PROTECTED]> Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]> --- --- fs/fs-writeback.c |9 - fs/ntfs/super.c|4 ++-- include/linux/fs.h |1 + 3 files changed, 11 insertions(+), 3 deletions(-) --- linux-2.6.23-rc2-mm2.orig/fs/ntfs/super.c +++ linux-2.6.23-rc2-mm2/fs/ntfs/super.c @@ -2381,14 +2381,14 @@ static void ntfs_put_super(struct super_ */ ntfs_commit_inode(vol->mft_ino); write_inode_now(vol->mft_ino, 1); - if (!list_empty(&sb->s_dirty)) { + if (sb_has_dirty_inodes(sb)) { const char *s1, *s2; mutex_lock(&vol->mft_ino->i_mutex); truncate_inode_pages(vol->mft_ino->i_mapping, 0); mutex_unlock(&vol->mft_ino->i_mutex); write_inode_now(vol->mft_ino, 1); - if (!list_empty(&sb->s_dirty)) { + if (sb_has_dirty_inodes(sb)) { static const char *_s1 = "inodes"; static const char *_s2 = ""; s1 = _s1; --- linux-2.6.23-rc2-mm2.orig/include/linux/fs.h +++ linux-2.6.23-rc2-mm2/include/linux/fs.h @@ -1712,6 +1712,7 @@ extern int bdev_read_only(struct block_d extern int set_blocksize(struct block_device *, int); extern int sb_set_blocksize(struct super_block *, int); extern int sb_min_blocksize(struct super_block *, int); +extern int sb_has_dirty_inodes(struct super_block *); extern int generic_file_mmap(struct file *, struct vm_area_struct *); extern int generic_file_readonly_mmap(struct file *, struct vm_area_struct *); --- linux-2.6.23-rc2-mm2.orig/fs/fs-writeback.c +++ linux-2.6.23-rc2-mm2/fs/fs-writeback.c @@ -188,6 +188,13 @@ static void queue_io(struct super_block } } +int sb_has_dirty_inodes(struct super_block *sb) +{ + return !list_empty(&sb->s_dirty) || + !list_empty(&sb->s_io); +} +EXPORT_SYMBOL(sb_has_dirty_inodes); + /* * Write a single inode's dirty pages and inode data out to disk. * If `wait' is set, wait on the writeout. 
@@ -485,7 +492,7 @@ writeback_inodes(struct writeback_contro restart: sb = sb_entry(super_blocks.prev); for (; sb != sb_entry(&super_blocks); sb = sb_entry(sb->s_list.prev)) { - if (!list_empty(&sb->s_dirty) || !list_empty(&sb->s_io)) { + if (sb_has_dirty_inodes(sb)) { /* we're making our own get_super here */ sb->s_count++; spin_unlock(&sb_lock); -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
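To see why checking s_dirty alone is incomplete, here is a userspace sketch with a minimal doubly linked list. struct fake_sb and the list helpers are simplified stand-ins written for this example, not the kernel's struct super_block or list.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert n right after the list head h. */
void list_add_front(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/* Simplified stand-in for the two per-superblock lists named in the patch. */
struct fake_sb {
	struct list_head s_dirty;
	struct list_head s_io;
};

/* Mirrors the helper's logic: dirty inodes may sit on either list. */
bool fake_sb_has_dirty_inodes(const struct fake_sb *sb)
{
	assert(sb != NULL);
	return !list_is_empty(&sb->s_dirty) || !list_is_empty(&sb->s_io);
}
```

An inode that has already been moved to s_io leaves s_dirty empty, so a test on s_dirty alone would report a clean superblock while writeback is still pending.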
[PATCH 0/2] writeback dirty inodes fixes
Andrew,

Now the patches are simplified and rebased to 2.6.23-rc2-mm2.

The following two patches should be put immediately after
writeback-fix-periodic-superblock-dirty-inode-flushing.patch:

	writeback: fix time ordering of the per superblock inode lists 8
	writeback: fix ntfs with sb_has_dirty_inodes()

Thank you,
Fengguang
[PATCH 1/2] writeback: fix time ordering of the per superblock inode lists 8
Fix the time ordering bug re-introduced by writeback-fix-periodic-superblock-dirty-inode-flushing.patch. It works by never move not-yet-expired dirty inodes from s_dirty to s_io, *only to* move them back. The move-inodes-back-and-forth thing is a mess. Cc: Ken Chen <[EMAIL PROTECTED]> Cc: Andrew Morton <[EMAIL PROTECTED]> Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]> --- fs/fs-writeback.c | 40 ++-- 1 file changed, 22 insertions(+), 18 deletions(-) --- linux-2.6.23-rc2-mm2.orig/fs/fs-writeback.c +++ linux-2.6.23-rc2-mm2/fs/fs-writeback.c @@ -172,6 +172,23 @@ static void requeue_io(struct inode *ino } /* + * Queue expired dirty inodes for io. + */ +static void queue_io(struct super_block *sb, + unsigned long *older_than_this) +{ + while (!list_empty(&sb->s_dirty)) { + struct inode *inode = list_entry(sb->s_dirty.prev, + struct inode, i_list); + /* Was this inode dirtied too recently? */ + if (older_than_this && + time_after(inode->dirtied_when, *older_than_this)) + break; + list_move(&inode->i_list, &sb->s_io); + } +} + +/* * Write a single inode's dirty pages and inode data out to disk. * If `wait' is set, wait on the writeout. * @@ -295,10 +312,10 @@ __writeback_single_inode(struct inode *i /* * We're skipping this inode because it's locked, and we're not -* doing writeback-for-data-integrity. Move it to the head of -* s_dirty so that writeback can proceed with the other inodes -* on s_io. We'll have another go at writing back this inode -* when the s_dirty iodes get moved back onto s_io. +* doing writeback-for-data-integrity. Move it to s_more_io so +* that writeback can proceed with the other inodes on s_io. +* We'll have another go at writing back this inode when we +* completed a full scan of s_io. 
*/ requeue_io(inode); @@ -362,10 +379,8 @@ __writeback_single_inode(struct inode *i static void sync_sb_inodes(struct super_block *sb, struct writeback_control *wbc) { - const unsigned long start = jiffies;/* livelock avoidance */ - if (!wbc->for_kupdate || list_empty(&sb->s_io)) - list_splice_init(&sb->s_dirty, &sb->s_io); + queue_io(sb, wbc->older_than_this); while (!list_empty(&sb->s_io)) { struct inode *inode = list_entry(sb->s_io.prev, @@ -406,17 +421,6 @@ sync_sb_inodes(struct super_block *sb, s continue; /* blockdev has wrong queue */ } - /* Was this inode dirtied after sync_sb_inodes was called? */ - if (time_after(inode->dirtied_when, start)) - break; - - /* Was this inode dirtied too recently? */ - if (wbc->older_than_this && time_after(inode->dirtied_when, - *wbc->older_than_this)) { - list_splice_init(&sb->s_io, sb->s_dirty.prev); - break; - } - /* Is another pdflush already flushing this queue? */ if (current_is_pdflush() && !writeback_acquire(bdi)) break; -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
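The core of queue_io() is a scan from the oldest inode that stops at the first one dirtied after the cutoff. A userspace sketch of that idea, using an array ordered oldest first instead of a list; tick_after() and count_expired() are illustrative names, and tick_after() follows the same wrap-safe signed-difference idea as the kernel's time_after(), assuming two's complement arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Wrap-safe "a is later than b" for a free-running tick counter. */
int tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * dirtied_when[] is ordered oldest first. Return how many leading
 * entries are expired, i.e. not dirtied after *older_than_this.
 * A NULL cutoff means everything qualifies.
 */
size_t count_expired(const unsigned long *dirtied_when, size_t n,
		     const unsigned long *older_than_this)
{
	size_t i;

	assert(dirtied_when != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Was this entry dirtied too recently? Then stop here. */
		if (older_than_this &&
		    tick_after(dirtied_when[i], *older_than_this))
			break;
	}
	return i;
}
```

Because the scan stops at the first too-recent entry, nothing that is not yet expired ever leaves the source list, so there is nothing to move back afterwards.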
Re: CFS review
On Sat, Aug 11, 2007 at 12:50:08AM +0200, Roman Zippel wrote:
> Hi,
>
> On Fri, 10 Aug 2007, Ingo Molnar wrote:
>
> > achieve that. It probably won't make a real difference, but it's really
> > easy for you to send and it's still very useful when one tries to
> > eliminate possibilities and when one wants to concentrate on the
> > remaining possibilities alone.
>
> The thing I'm afraid about CFS is its possible unpredictability, which
> would make it hard to reproduce problems and we may end up with users with
> unexplainable weird problems. That's the main reason I'm trying so hard to
> push for a design discussion.

You may be interested in looking at the very early CFS versions. The design was much more naive and understandable. After that, a lot of tricks were added to take into account a lot of uses and corner cases, which may not help in understanding it globally.

> Just to give an idea here are two more examples of irregular behaviour,
> which are hopefully easier to reproduce.
>
> 1. Two simple busy loops, one of them is reniced to 15, according to my
> calculations the reniced task should get about 3.4% (1/(1.25^15+1)), but I
> get this:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4433 roman     20   0  1532  300  244 R 99.2  0.2   5:05.51 l
>  4434 roman     35  15  1532   72   16 R  0.7  0.1   0:10.62 l

Could this be caused by typos in some tables like you have found in wmult ?

> OTOH up to nice level 12 I get what I expect.
>
> 2. If I start 20 busy loops, initially I see in top that every task gets
> 5% and time increments equally (as it should):

(...)

> But if I renice all of them to -15, the time every task gets is rather
> random:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4492 roman      5 -15  1532   68   16 R  1.0  0.1   0:07.95 l
>  4491 roman      5 -15  1532   68   16 R  4.3  0.1   0:07.62 l
>  4490 roman      5 -15  1532   68   16 R  3.3  0.1   0:07.50 l
>  4489 roman      5 -15  1532   68   16 R  7.6  0.1   0:07.80 l
>  4488 roman      5 -15  1532   68   16 R  9.6  0.1   0:08.31 l
>  4487 roman      5 -15  1532   68   16 R  3.3  0.1   0:07.59 l
>  4486 roman      5 -15  1532   68   16 R  6.6  0.1   0:07.08 l
>  4485 roman      5 -15  1532   68   16 R 10.0  0.1   0:07.31 l
>  4484 roman      5 -15  1532   68   16 R  8.0  0.1   0:07.30 l
>  4483 roman      5 -15  1532   68   16 R  7.0  0.1   0:07.34 l
>  4482 roman      5 -15  1532   68   16 R  1.0  0.1   0:05.84 l
>  4481 roman      5 -15  1532   68   16 R  1.0  0.1   0:07.16 l
>  4480 roman      5 -15  1532   68   16 R  3.3  0.1   0:07.00 l
>  4479 roman      5 -15  1532   68   16 R  1.0  0.1   0:06.66 l
>  4478 roman      5 -15  1532   68   16 R  8.6  0.1   0:06.96 l
>  4477 roman      5 -15  1532   68   16 R  8.6  0.1   0:07.63 l
>  4476 roman      5 -15  1532   68   16 R  9.6  0.1   0:07.38 l
>  4475 roman      5 -15  1532   68   16 R  1.3  0.1   0:07.09 l
>  4474 roman      5 -15  1532   68   16 R  2.3  0.1   0:07.97 l
>  4473 roman      5 -15  1532  296  244 R  1.0  0.2   0:07.73 l

Do you see this only at -15, or starting with -15 and below ?

Willy
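The 3.4% figure quoted in the first example is easy to check. The sketch below assumes, as the quoted estimate does, that each nice level scales the weight by a factor of 1.25 and that exactly one nice-0 task competes with one reniced task; reniced_share() is an illustrative name, not a scheduler function.

```c
#include <assert.h>

/*
 * Expected CPU share of the reniced task when one nice-0 busy loop and
 * one nice-N busy loop compete: 1 / (1.25^N + 1).
 */
double reniced_share(int nice)
{
	double ratio = 1.0;
	int i;

	assert(nice >= 0);
	for (i = 0; i < nice; i++)
		ratio *= 1.25;	/* weight ratio grows 25% per nice level */
	return 1.0 / (ratio + 1.0);
}
```

For nice 15 the ratio is about 28.4, so the expected share is about 0.034, against the 0.7% that top reports for the reniced task above.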
Re: Documentation files in html format?
On Sat, Aug 11, 2007 at 01:08:30AM +0200, Sam Ravnborg wrote:
> >
> > The problem I have with asciidoc is that it's a nightmare to get it
> > to work. It's what GIT uses, and after spending a whole day trying
> > to *build* that thing, I finally resigned and asked Junio if he could
> > publish the pre-formatted manpages himself, which he agreed to.
>
> Git uses in addition to asciidoc also docbook and a bit more.
> As asciidoc is some python scripts it should be trivial to
> install with no build required.

I remember it relied on some tools to process xml, but I don't know exactly what. It was those tools which I could not build.

> Maybe it was the docbook stuff you had trouble with?

Possible, I don't remember that much, it was a painful day one year ago.

> My Kbuild example were made without using other tools than asciidoc but
> if pdf is desired some additional tools are needed.

It was just needed to build the man pages, so I would have expected it to be pretty straightforward too.

Willy
Re: CFS review
On Fri, Aug 10, 2007 at 11:15:55PM +0200, Roman Zippel wrote:
> Hi,
>
> On Fri, 10 Aug 2007, Willy Tarreau wrote:
>
> > fortunately all bug reporters are not like you. It's amazing how long
> > you can resist sending a simple bug report to a developer!
>
> I'm more amazed how long Ingo can resist providing some explanations (not
> just about this problem).

It's a matter of time balance. It takes a short time to send the output of a script, and it takes a very long time to explain how things work. I often encounter the same situation with haproxy. People ask me to explain to them in detail how this or that would apply to their context, and it's often easier for me to provide them with a 5-line patch to add the feature they need, than to spend half an hour explaining why and how it would behave badly.

> It's not like I haven't given him anything, he already has the test
> programs, he already knows the system configuration.
> Well, I've sent him the stuff now...

fine, thanks.

> > Maybe you
> > consider that you need to fix the bug by yourself after you understand
> > the code,
>
> Fixing the bug requires some knowledge what the code is intended to do.
>
> > Please try to be a little bit more transparent if you really want the
> > bugs fixed, and don't behave as if you wanted this bug to survive
> > till -final.
>
> Could you please ask Ingo the same? I'm simply trying to get some
> transparency into the CFS design. Without further information it's
> difficult to tell, whether something is supposed to work this way or it's
> a bug.

I know that Ingo tends to reply to a question with another question. But as I said, imagine if he has to explain the same things to each person who asks him for it. I think that a more constructive approach would be to point out what is missing/unclear/inexact in the doc so that he adds some paragraphs for you and everyone else. If you need this information to debug, most likely other people will need it too.

> In this case it's quite possible that due to a recent change my testcase
> doesn't work anymore. Should I consider the problem fixed or did it just
> go into hiding? Without more information it's difficult to verify this
> independently.

Generally, problems that appear only on one person's side and which suddenly disappear are either caused by some random buggy patch left in the tree (not your case it seems), or by an obscure bug of the feature being tested which will resurface from time to time as long as it's not identified.

Willy
Re: [PATCH 00/23] per device dirty throttling -v8
On Fri, 10 Aug 2007 00:04:45 EDT, Bill Davidsen said:
> > I never imagined that it was the 20%+ hit that is being described, and
> > with so little impact, or I would have switched to it across the board
> > years ago.
> >
> To get that magnitude you need slow disk with very fast CPU. It helps
> most of systems where the disk hardware is marginal or worse for the i/o
> load. Don't take that as typical.

I suspect that almost every single laptop with a Core2 Duo in it falls into that classification, and it's getting worse every year, as we see more disparity between CPU speeds (increasing) and disk seek times (basically nailed to the floor for the last decade).
Re: [patch 3/4] Enable link power management for ata drivers
On Thu, 09 Aug 2007 14:24:16 PDT, Kristen Carlson Accardi said:

> +++ 2.6-git/drivers/ata/libata-scsi.c
> @@ -2904,6 +2976,52 @@ void ata_scsi_simulate(struct ata_device
> +	if ((dev->horkage & ATA_HORKAGE_IPM) ||
> +	    !(dev->flags & ATA_DFLAG_IPM)) {
> +		ata_dev_printk(dev, KERN_ERR,
> +			"Unable to set Link PM policy\n");
> +		ap->pm_policy = MAX_PERFORMANCE;
> +	}

KERN_INFO please, or KERN_WARNING at the highest, at least until such time as enough drivers support enough hardware that it really *does* qualify for "this should not fail" status.

(OK, so I'm just cranky because I'm tired of seeing a KERN_ERR thrown at every reboot, just because the ata_piix driver doesn't know how to set this stuff up for the DVD±RW drive in my laptop. But when this goes upstream, lots of *other* people are going to get hit by the exact same thing and think there's something actually *wrong* with their hardware.)
Re: Serial ports rearranged in 2.6.22?
On 8/10/07, Michael Mauch <[EMAIL PROTECTED]> wrote:
> Hi,
>
> until 2.6.21, I had the normal assignments for ttyS0 and ttyS1:
>
> 00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> 00:09: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
>
> With 2.6.22 I get the names <-> ports/irqs the other way around:
>
> 00:08: ttyS0 at I/O 0x2f8 (irq = 3) is a 16550A
> 00:09: ttyS1 at I/O 0x3f8 (irq = 4) is a 16550A
>
> Is this supposed to be that way? Should we reassign these names with
> udev? udev-114 doesn't seem to have built-in rules to assign the
> traditional names.
>
> Or could it be related to some brokenness in my BIOS (ACPI/PNP)?
>
> I'm using the 8250_pnp module (and it's the same with builtin serial
> modules). I made sure that I did not accidentally change the BIOS
> settings for the serial ports.
>
> I'm using Gentoo, but on the lirc list was a Fedora user with the same
> symptoms.

http://lkml.org/lkml/2007/7/25/455

YH
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 05:29:49PM -0700, Paul E. McKenney wrote: > > Errmmm... No joy. > > ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined! > > Turns out that cpu_clock also ain't exported, and rcutorture.c is > a module. Would adding an EXPORT_SYMBOL_GPL() as in the patch below > be acceptable? Except that the old xtime symbol was EXPORT_SYMBOL() rather than my proposed EXPORT_SYMBOL_GPL() for the equivalent new cpu_clock(). Sigh!!! I will leave this one for others to sort out. Andrew, please consider this patch withdrawn and apply the version that does not rely on time for entropy. Please let me know if you would like me to resend it. Thanx, Paul > If not, I have a tested patch to rcutorture.c that leverages statistical > counters. Your choice. > > Thanx, Paul > > Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it. > Compiles, but not yet tested. > > Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]> > --- > > rcutorture.c |8 ++-- > sched.c |2 ++ > 2 files changed, 4 insertions(+), 6 deletions(-) > > diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c > linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c > --- linux-2.6.23-rc2/kernel/rcutorture.c 2007-08-03 19:49:55.0 > -0700 > +++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c 2007-08-10 > 17:15:22.0 -0700 > @@ -42,7 +42,6 @@ > #include > #include > #include > -#include > #include > #include > #include > @@ -166,16 +165,13 @@ struct rcu_random_state { > > /* > * Crude but fast random-number generator. Uses a linear congruential > - * generator, with occasional help from get_random_bytes(). > + * generator, with occasional help from cpu_clock(). 
> */ > static unsigned long > rcu_random(struct rcu_random_state *rrsp) > { > - long refresh; > - > if (--rrsp->rrs_count < 0) { > - get_random_bytes(&refresh, sizeof(refresh)); > - rrsp->rrs_state += refresh; > + rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id()); > rrsp->rrs_count = RCU_RANDOM_REFRESH; > } > rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD; > diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c > linux-2.6.23-rc2-rcutorturesched/kernel/sched.c > --- linux-2.6.23-rc2/kernel/sched.c 2007-08-03 19:49:55.0 -0700 > +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c 2007-08-10 > 17:22:57.0 -0700 > @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu) > return now; > } > > +EXPORT_SYMBOL_GPL(cpu_clock); > + > #ifdef CONFIG_FAIR_GROUP_SCHED > /* Change a task's ->cfs_rq if it moves across CPUs */ > static inline void set_task_cfs_rq(struct task_struct *p) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
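The generator being patched is a linear congruential generator that periodically folds in outside entropy. A userspace sketch of that structure follows; the constants and the names lcg_next() and struct lcg_state are illustrative assumptions for this example, and the entropy source is passed in as a parameter rather than tied to get_random_bytes() or cpu_clock().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants; not taken from the quoted patch text. */
#define LCG_MULT    39916801u
#define LCG_ADD     479001599u
#define LCG_REFRESH 10000

struct lcg_state {
	uint32_t state;
	long count;	/* steps left before the next entropy refresh */
};

/*
 * One step of state = state * MULT + ADD (mod 2^32). When the refresh
 * counter runs out, fresh_entropy is mixed into the state first.
 */
uint32_t lcg_next(struct lcg_state *s, uint32_t fresh_entropy)
{
	assert(s != NULL);
	if (--s->count < 0) {
		s->state += fresh_entropy;
		s->count = LCG_REFRESH;
	}
	s->state = s->state * LCG_MULT + LCG_ADD;
	return s->state;
}
```

Between refreshes the sequence is fully determined by the state, which is why the choice of refresh source matters mainly for varying runs, not for the quality of individual values.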
Re: early boot lockup with 2.6.23-rc1
On Fri, Aug 10, 2007 at 10:20:31PM +0300, Mikko Rapeli wrote: > I've bisected thus far, if it helps: Bisect came to this conclusion: git-bisect start # good: [4eb6bf6bfb580afaf1e1a1d30cba17a078530cf4] lots-of-architectures: enable arbitary speed tty support git-bisect good 4eb6bf6bfb580afaf1e1a1d30cba17a078530cf4 # bad: [773208946a132fb733ba273ee8562814f828cc28] Revert "USB: fix gregkh-usb-usb-use-menuconfig-objects" git-bisect bad 773208946a132fb733ba273ee8562814f828cc28 # bad: [dc690d8ef842b464f1c429a376ca16cb8dbee6ae] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 git-bisect bad dc690d8ef842b464f1c429a376ca16cb8dbee6ae # good: [15028aad00ddf241581fbe74a02ec89cbb28d35d] [TG3]: Update version to 3.78. git-bisect good 15028aad00ddf241581fbe74a02ec89cbb28d35d # bad: [82afee684fe3badaf5ee3fc5b6fda687d558bfb5] Merge master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6 git-bisect bad 82afee684fe3badaf5ee3fc5b6fda687d558bfb5 # bad: [c39736823232bc3ca113c8228fa852c09fba300e] Remove old i386 setup code git-bisect bad c39736823232bc3ca113c8228fa852c09fba300e # good: [5be865661516263d90317a6b35b588a2d7c3cb55] String-handling functions for the new x86 setup code. 
git-bisect good 5be865661516263d90317a6b35b588a2d7c3cb55 # good: [3b53d3045bbb8ea3c9dce663b102eab0903817c5] MCA support for new x86 setup code git-bisect good 3b53d3045bbb8ea3c9dce663b102eab0903817c5 # good: [7052fdd890bda0b3904674b69a1d24aec0a10d67] Code for actual protected-mode entry git-bisect good 7052fdd890bda0b3904674b69a1d24aec0a10d67 # good: [f2d98ae63dc64dedb00499289e13a50677f771f9] Linker script for the new x86 setup code git-bisect good f2d98ae63dc64dedb00499289e13a50677f771f9 # bad: [91a6c462b02d8dc02dbe95e5a407d78078a38d01] Use the new x86 setup code for x86-64; unify with i386 git-bisect bad 91a6c462b02d8dc02dbe95e5a407d78078a38d01 # bad: [4fd06960f120e02e9abc802a09f9511c400042a5] Use the new x86 setup code for i386 git-bisect bad 4fd06960f120e02e9abc802a09f9511c400042a5 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
On Sat, 11 Aug 2007 02:38:40 +0200, Segher Boessenkool said:
> >> That means GCC cannot compile Linux; it already optimises
> >> some accesses to scalars to smaller accesses when it knows
> >> it is allowed to. Not often though, since it hardly ever
> >> helps in the cost model it employs.
> >
> > Please give an example code snippet + gcc version + arch
> > to back this up.
>
> unsigned char f(unsigned long *p)
> {
> 	return *p & 1;
> }

Not really valid, because it's still able to do one atomic access to
compute the result. Now, if you had found an example where it converts
a 32-bit atomic access into 2 separate 16-bit accesses that weren't
atomic as a whole
Re: [PATCH 6/24] make atomic_read() behave consistently on frv
On Sat, Aug 11, 2007 at 08:54:46AM +0800, Herbert Xu wrote:
> Chris Snook <[EMAIL PROTECTED]> wrote:
> >
> > cpu_relax() contains a barrier, so it should do the right thing. For
> > non-smp architectures, I'm concerned about interacting with interrupt
> > handlers. Some drivers do use atomic_* operations.
>
> What problems with interrupt handlers? Access to int/long must
> be atomic or we're in big trouble anyway.

Reordering due to compiler optimizations. CPU reordering does not
affect interactions with interrupt handlers on a given CPU, but
reordering due to compiler code-movement optimization does. Since
volatile can in some cases suppress code-movement optimizations,
it can affect interactions with interrupt handlers.

Thanx, Paul
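The point about compiler code movement can be illustrated with a small user-space sketch, in which a signal handler stands in for an interrupt handler on the same CPU. This is only an analogy under stated assumptions: the names `fake_irq_count`, `fake_irq_handler`, `deliver_fake_irqs` and `compiler_barrier` are hypothetical and not from the thread, and the barrier macro is a GCC/Clang-style construct modelled on the kernel's barrier().

```c
#include <signal.h>

/* GCC/Clang-style compiler barrier (assumption: modelled on the kernel's
 * barrier()); it emits no instructions but stops the compiler from moving
 * memory accesses across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical counter shared between mainline code and the handler.
 * volatile makes the compiler perform every access to it as written. */
static volatile sig_atomic_t fake_irq_count;

/* Stand-in for an interrupt handler running on the same CPU. */
static void fake_irq_handler(int sig)
{
	(void)sig;
	fake_irq_count++;
}

/*
 * Deliver n fake interrupts to ourselves and return how many the
 * mainline code observed, or -1 if the handler could not be installed.
 */
int deliver_fake_irqs(int n)
{
	int i;

	fake_irq_count = 0;
	for (i = 0; i < n; i++) {
		/* Re-install each time: some signal() variants reset the handler. */
		if (signal(SIGINT, fake_irq_handler) == SIG_ERR)
			return -1;
		raise(SIGINT);
	}
	/* Keep the compiler from hoisting later memory accesses above here. */
	compiler_barrier();
	return (int)fake_irq_count;
}
```

No CPU memory barrier is involved here, which matches the claim that only compiler reordering matters between a handler and the code it interrupts on one CPU.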
Re: [SCSI] aic94xx: new driver
On Fri, Aug 10, 2007 at 11:09:22PM +0800, David Woodhouse wrote:
> The files in /usr/include/scsi are actually shipped by glibc, and most
> distributions use glibc's version instead of the one from the kernel --
> so this additional userspace interface is automatically incompatible
> with most people's installations.

Stop here right now. You just noticed the real bug, and that's exporting
scsi.h at all. I think Olaf sent a patch to fix this already.
Re: [PATCH 3/20] Introduce MS_KERNMOUNT flag
On Fri, Aug 10, 2007 at 03:47:55PM +0400, [EMAIL PROTECTED] wrote:
> This flag tells the .get_sb callback that this is a kern_mount() call
> so that it can trust *data pointer to be valid in-kernel one. If this
> flag is passed from the user process, it is cleared since the *data
> pointer is not a valid kernel object.
>
> Running a few steps forward - this will be needed for proc to create the
> superblock and store a valid pid namespace on it during the namespace
> creation. The reason, why the namespace cannot live without proc mount
> is described in the appropriate patch.

I don't like this at all. We should never pass kernel and userspace
addresses through the same pointer. Maybe add an additional argument
to the get_sb prototype instead.

But this whole idea of mounting /proc from kernelspace sounds like
a really bad idea to me. /proc should never be mounted from the kernel
but always normally from userspace.

>
> Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
> Cc: Oleg Nesterov <[EMAIL PROTECTED]>
>
> ---
>
> fs/namespace.c |3 ++-
> fs/super.c |6 +++---
> include/linux/fs.h |4 +++-
> 3 files changed, 8 insertions(+), 5 deletions(-)
>
> diff -upr linux-2.6.23-rc1-mm1.orig/fs/namespace.c
> linux-2.6.23-rc1-mm1-7/fs/namespace.c
> --- linux-2.6.23-rc1-mm1.orig/fs/namespace.c 2007-07-26 16:34:45.0
> +0400
> +++ linux-2.6.23-rc1-mm1-7/fs/namespace.c 2007-07-26 16:36:36.0
> +0400
> @@ -1579,7 +1579,8 @@ long do_mount(char *dev_name, char *dir_
> 		mnt_flags |= MNT_NOMNT;
>
> 	flags &= ~(MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_ACTIVE |
> -		   MS_NOATIME | MS_NODIRATIME | MS_RELATIME | MS_NOMNT);
> +		   MS_NOATIME | MS_NODIRATIME | MS_RELATIME |
> +		   MS_NOMNT | MS_KERNMOUNT);
>
> 	/* ... and get the mountpoint */
> 	retval = path_lookup(dir_name, LOOKUP_FOLLOW, &nd);
> diff -upr linux-2.6.23-rc1-mm1.orig/fs/super.c
> linux-2.6.23-rc1-mm1-7/fs/super.c
> --- linux-2.6.23-rc1-mm1.orig/fs/super.c 2007-07-26 16:34:45.0
> +0400
> +++ linux-2.6.23-rc1-mm1-7/fs/super.c 2007-07-26 16:36:36.0 +0400
> @@ -944,9 +944,9 @@ do_kern_mount(const char *fstype, int fl
> 	return mnt;
> }
>
> -struct vfsmount *kern_mount(struct file_system_type *type)
> +struct vfsmount *kern_mount_data(struct file_system_type *type, void *data)
> {
> -	return vfs_kern_mount(type, 0, type->name, NULL);
> +	return vfs_kern_mount(type, MS_KERNMOUNT, type->name, data);
> }
>
> -EXPORT_SYMBOL(kern_mount);
> +EXPORT_SYMBOL_GPL(kern_mount_data);
> diff -upr linux-2.6.23-rc1-mm1.orig/include/linux/fs.h
> linux-2.6.23-rc1-mm1-7/include/linux/fs.h
> --- linux-2.6.23-rc1-mm1.orig/include/linux/fs.h 2007-07-26
> 16:34:45.0 +0400
> +++ linux-2.6.23-rc1-mm1-7/include/linux/fs.h 2007-07-26 16:36:36.0
> +0400
> @@ -129,6 +129,7 @@ extern int dir_notify_enable;
> #define MS_RELATIME (1<<21) /* Update atime relative to mtime/ctime. */
> #define MS_SETUSER (1<<23) /* set mnt_uid to current user */
> #define MS_NOMNT (1<<24) /* don't allow unprivileged submounts */
> +#define MS_KERNMOUNT (1<<25) /* this is a kern_mount call */
> #define MS_ACTIVE (1<<30)
> #define MS_NOUSER (1<<31)
>
> @@ -1459,7 +1460,8 @@ void unnamed_dev_init(void);
>
> extern int register_filesystem(struct file_system_type *);
> extern int unregister_filesystem(struct file_system_type *);
> -extern struct vfsmount *kern_mount(struct file_system_type *);
> +extern struct vfsmount *kern_mount_data(struct file_system_type *, void
> *data);
> +#define kern_mount(type) kern_mount_data(type, NULL)
> extern int may_umount_tree(struct vfsmount *);
> extern int may_umount(struct vfsmount *);
> extern void umount_tree(struct vfsmount *, int, struct list_head *);
>
---end quoted text---
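The flag handling in the quoted patch can be sketched in user space roughly as follows. The bit positions come from the quoted fs.h hunk; the helper names `sanitize_user_mount_flags`, `kernel_mount_flags` and `data_is_trusted` are hypothetical and exist only for illustration. The sketch shows the mechanism being proposed, not an answer to the objection about sharing one pointer between kernel and user addresses.

```c
/* Bit positions as given in the quoted include/linux/fs.h hunk. */
#define MS_NOMNT     (1UL << 24)
#define MS_KERNMOUNT (1UL << 25)

/* Hypothetical helper: what the do_mount() hunk does for flags that came
 * from user space, so a user process can never claim to be kern_mount(). */
unsigned long sanitize_user_mount_flags(unsigned long flags)
{
	return flags & ~MS_KERNMOUNT;
}

/* Hypothetical helper: what kern_mount_data() does by passing MS_KERNMOUNT. */
unsigned long kernel_mount_flags(unsigned long flags)
{
	return flags | MS_KERNMOUNT;
}

/* Hypothetical helper: how a get_sb callback would decide whether *data
 * may be treated as an in-kernel pointer. */
int data_is_trusted(unsigned long flags)
{
	return (flags & MS_KERNMOUNT) != 0;
}
```

The design relies on every user-originated path going through the masking step; a separate get_sb argument, as suggested in the reply, would avoid that dependency.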
Re: [PATCH 0/7] Modify lguest32 to make room for lguest64
On Wed, 2007-08-08 at 20:32 -0400, Steven Rostedt wrote:
> Hi all,
>
> I've been working on lguest64 and in order to do this, I had to move
> a lot of the i386 specific out of the way. Well, the lguest64 port
> is still not ready to display, but before Rusty makes too many changes
> I would like this in upstream so I don't have to keep repeating my
> changes :-)
>
>
> So this patch series moves lguest32 out of the way for other archs.

Yeah, after some more thought I've not applied most of this. We really
don't want to move everything then move it back; I prefer Jes' more
cautious approach of moving a little bit at a time.

We really have three parts: (1) bits that are generic, (2) bits that
should be generic but my implementation is naive, (3) bits that really
are i386-specific. Patches which move 2 to 1 are gratefully accepted: I
realize a mass move is easier and this requires thought, but that's
what we need.

Since I can't build a module over two directories, that seems to
destroy the idea of an i386/ subdir. Instead I've done a patch which
renames the *clearly* i386-specific things to i386_, which at least
works. I've pushed it into the repository http://lguest.ozlabs.org/patches/

Cheers,
Rusty.
Re: [2.6 patch] cpqphp_ctrl.c: remove dead code
On Thu, Aug 09, 2007 at 03:47:02PM -0700, Kristen Carlson Accardi wrote:
> fine by me - let's NAK this patch (and all future ones for this driver) until
> someone with hardware steps up to maintain this driver. Eventually it
> will just die I guess.

Very bad idea. For example I sent a patch ages ago to remove kernel_thread
usage from the driver. We need to get that patch in sooner or later
because the kernel_thread export will have to go away. We're not going
to block that on this driver.
[RFC] Adding a TIF_KERNEL_TRACE to thread_info.h, s390 and ia64 8 bit limit
Hi,

I would like to add a TIF_KERNEL_TRACE that would have the same effect as
TIF_SYSCALL_TRACE, which is to call into do_syscall_trace when enabled. It
would be enabled by setting it in each thread info structure (and
protected against racy thread creation with proper flag copy from the
parent thread upon thread creation). The particularity of TIF_KERNEL_TRACE
is that it would be enabled dynamically system-wide when kernel tracing is
active.

The current similar flags that exist are TIF_SYSCALL_TRACE (for ptrace)
and TIF_SYSCALL_AUDIT (set by audit_alloc() at process creation if
auditing is enabled). However, touching these flags system-wide would
conflict with either syscall audit or ptrace, therefore the introduction
of a new thread flag looks like a plausible solution.

However, since the instructions used to test these flags are limited to
8 bits on some architectures, we run out of free flags at least on s390
and ia64.

I would appreciate some comments about the idea in general, and how the
bitfield limitation should be overcome for s390 and ia64. Some details
about the problematic patches below.

Thanks,

Mathieu Desnoyers

On s390:

/home/compudj/git/linux-2.6-lttng/arch/s390/kernel/entry.S: Assembler messages:
/home/compudj/git/linux-2.6-lttng/arch/s390/kernel/entry.S:252: Error: operand out of range (289 is not between 0 and 255)
/home/compudj/git/linux-2.6-lttng/arch/s390/kernel/entry.S:362: Error: operand out of range (289 is not between 0 and 255)
make[2]: *** [arch/s390/kernel/entry.o] Error 1

when adding:

---
 include/asm-s390/thread_info.h |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/asm-s390/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-s390/thread_info.h 2007-07-30 18:53:20.0 -0400
+++ linux-2.6-lttng/include/asm-s390/thread_info.h 2007-07-30 18:53:24.0 -0400
@@ -96,6 +96,7 @@ static inline struct thread_info *curren
 #define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */
 #define TIF_SINGLE_STEP 6 /* deliver sigtrap on return to user */
 #define TIF_MCCK_PENDING 7 /* machine check handling is pending */
+#define TIF_KERNEL_TRACE 8 /* kernel trace active */
 #define TIF_USEDFPU 16 /* FPU was used by this task this quantum (SMP) */
 #define TIF_POLLING_NRFLAG 17 /* true if poll_idle() is polling TIF_NEED_RESCHED */
@@ -110,6 +111,7 @@ static inline struct thread_info *curren
 #define _TIF_SYSCALL_AUDIT (1
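The assembler's figure of 289 can be reproduced with a little arithmetic. The sketch below takes bit numbers 5 and 8 from the quoted patch; bit 0 for TIF_SYSCALL_TRACE is an assumption, chosen because it is the value consistent with the reported operand (1 + 32 + 256 = 289). The helper names are hypothetical.

```c
/* Bit numbers: 5 and 8 are from the quoted thread_info.h hunk.
 * 0 for TIF_SYSCALL_TRACE is an assumption that matches the 289 above. */
#define TIF_SYSCALL_TRACE 0
#define TIF_SYSCALL_AUDIT 5
#define TIF_KERNEL_TRACE  8

#define TIF_MASK(bit) (1U << (bit))

/* Nonzero if the mask can be encoded as an 8-bit immediate operand. */
int fits_in_imm8(unsigned int mask)
{
	return mask <= 0xffU;
}

/* Mask tested on syscall entry before the patch (hypothetical helper). */
unsigned int trace_mask_old(void)
{
	return TIF_MASK(TIF_SYSCALL_TRACE) | TIF_MASK(TIF_SYSCALL_AUDIT);
}

/* Same mask with the new flag ORed in; bit 8 pushes it past 255. */
unsigned int trace_mask_new(void)
{
	return trace_mask_old() | TIF_MASK(TIF_KERNEL_TRACE);
}
```

Any flag numbered 8 or above has this effect on a combined mask, which is why the question is about freeing a bit in the low byte or testing a second byte.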
Re: [PATCH 5/7] Change lguest launcher to use asm generic include
On Wed, 2007-08-08 at 20:32 -0400, Steven Rostedt wrote:
> plain text document attachment
> (0005-Change-lguest-launcher-to-use-asm-generic-include-instead-of-explicitly.txt)
> Have the lguest launcher include e820.h via asm/e820.h instead of explicitly
> saying i386.
>
> Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

Applied, thanks,
Rusty.
2.6.23-rc2-mm2 build error on MIPS and ARM
Hi Andrew, I got the following errors when building 2.6.23-rc2-mm2 on both mips and arm. Both errors are very much alike. MIPS: /opt/crosstool/gcc-3.4.5-glibc-2.3.6/mips-unknown-linux-gnu/lib/gcc/mips-unknown-linux-gnu/3.4.5/include -D__KERNEL__ -Iinclude -Iinclude2 -I/home/compudj/git/linux-2.6-lttng/include -include include/linux/autoconf.h -I/home/compudj/git/linux-2.6-lttng/. -I. -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -mabi=32 -G 0 -mno-abicalls -fno-pic -pipe -msoft-float -ffreestanding -march=r5000 -Wa,--trap -I/home/compudj/git/linux-2.6-lttng/include/asm-mips/mach-ip22 -Iinclude/asm-mips/mach-ip22 -I/home/compudj/git/linux-2.6-lttng/include/asm-mips/mach-generic -Iinclude/asm-mips/mach-generic -D"VMLINUX_LOAD_ADDRESS=0x88002000" -fomit-frame-pointer -Wdeclaration-after-statement -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(asm_offsets)" -D"KBUILD_MODNAME=KBUILD_STR(asm_offsets)" -fverbose-asm -S -o arch/mips/kernel/asm-offsets.s /home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c In file included from /home/compudj/git/linux-2.6-lttng/include/linux/sched.h:58, from /home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:13: /home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:115: error: syntax error before "pgprot_t" /home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:115: warning: no semicolon at end of struct or union /home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:161: error: syntax error before '}' token /home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:175: error: syntax error before "pgd_t" /home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:175: warning: no semicolon at end of struct or union /home/compudj/git/linux-2.6-lttng/include/linux/mm_types.h:229: error: syntax error before '}' token In file included from /home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:13: 
/home/compudj/git/linux-2.6-lttng/include/linux/sched.h: In function `mmdrop': /home/compudj/git/linux-2.6-lttng/include/linux/sched.h:1509: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/include/linux/sched.h: In function `arch_pick_mmap_layout': /home/compudj/git/linux-2.6-lttng/include/linux/sched.h:1762: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/include/linux/sched.h:1763: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/include/linux/sched.h:1764: error: dereferencing pointer to incomplete type In file included from /home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:14: /home/compudj/git/linux-2.6-lttng/include/linux/mm.h: In function `vma_nonlinear_insert': /home/compudj/git/linux-2.6-lttng/include/linux/mm.h:968: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/include/linux/mm.h:969: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/include/linux/mm.h: In function `find_vma_intersection': /home/compudj/git/linux-2.6-lttng/include/linux/mm.h:1078: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/include/linux/mm.h: In function `vma_pages': /home/compudj/git/linux-2.6-lttng/include/linux/mm.h:1085: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/include/linux/mm.h:1085: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c: In function `output_mm_defines': /home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:220: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:221: error: dereferencing pointer to incomplete type /home/compudj/git/linux-2.6-lttng/arch/mips/kernel/asm-offsets.c:222: error: dereferencing pointer to incomplete type make[2]: *** [arch/mips/kernel/asm-offsets.s] 
Error 1 make[1]: *** [prepare0] Error 2 make: *** [_all] Error 2 ARM: /opt/crosstool/gcc-4.0.2-glibc-2.3.6/arm-unknown-linux-gnu/bin/arm-unknown-linux-gnu-gcc -Wp,-MD,arch/arm/kernel/.asm-offsets.s.d -nostdinc -isystem /opt/crosstool/gcc-4.0.2-glibc-2.3.6/arm-unknown-linux-gnu/lib/gcc/arm-unknown-linux-gnu/4.0.2/include -D__KERNEL__ -Iinclude -Iinclude2 -I/home/compudj/git/linux-2.6-lttng/include -include include/linux/autoconf.h -mlittle-endian -I/home/compudj/git/linux-2.6-lttng/. -I. -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -marm -fno-omit-frame-pointer -mapcs -mno-sched-prolog -mabi=apcs-gnu -mno-thumb-interwork -D__LINUX_ARM_ARCH__=4 -march=armv4 -mtune=strongarm110 -msoft-float -Uarm -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wdeclaration-after-statement -Wno-pointer-sign -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_S
Re: [PATCH 00/25] move handling of setuid/gid bits from VFS into individual setattr functions (RESEND)
On Fri, Aug 10, 2007 at 04:47:52PM -0400, Jeff Layton wrote:
> attr->ia_valid after the setattr operation returns. If either ATTR_KILL_*
> bit is set then BUG(). The helper function already clears those bits
> so anything using it should automatically be ok. We'd have to fix
> up NFS and a few others that don't implement suid/sgid.
>
> This is not as certain as changing the name of the inode operation. It
> would only pop when someone is attempting to change a setuid/setgid
> file on these filesystems. Still, it should conceivably catch most if
> not all offenders. Would that be sufficient to take care of everyone's
> concerns?

I like the idea of checking ia_valid after return a lot. But instead of
going BUG() it should just do the default action, so that we can avoid
touching all the filesystems and only need to change those that need
special care.

I also have plans to add some new AT_ flags for implementing some
filesystem ioctls in generic code that would benefit greatly from the
ia_valid checking after return, to return ENOTTY for filesystems not
implementing those ioctls.
Process stuck in md_wakeup_thread
On 2.6.22 from debian (stock), I have a process (dpkg) stuck with the
following calltrace:

SysRq : Show Blocked State
 free sibling
 task PC stack pid father child younger older
dpkg D 0003 0 26040 20765 (NOTLB)
 e57d5e30 00200082 0003 dfc48ba8 dfc48ba8 0007 e0af45c0 e8ce17aa 0002827f 00051ec2 e0af46cc c1809980 e8ce1324 0002827f 00200082 f881cd4c 00200286 f8ba2c85 c1809980 e57d5e60
Call Trace:
 [] md_wakeup_thread+0x26/0x28 [md_mod]
 [] raid5_unplug_device+0x4e/0x5a [raid456]
 [] io_schedule+0x1d/0x27
 [] sync_page+0x0/0x3b
 [] sync_page+0x38/0x3b
 [] __wait_on_bit_lock+0x2a/0x52
 [] __lock_page+0x58/0x5e
 [] wake_bit_function+0x0/0x3c
 [] truncate_inode_pages_range+0x201/0x256
 [] truncate_inode_pages+0x17/0x1a
 [] reiserfs_delete_inode+0x36/0xdd [reiserfs]
 [] reiserfs_delete_inode+0x0/0xdd [reiserfs]
 [] generic_delete_inode+0xa0/0x105
 [] iput+0x60/0x62
 [] do_unlinkat+0xb6/0x126
 [] syscall_call+0x7/0xb
 ===

My system is still up and running.
Re: [lm-sensors] 2.6.23-rc1 regression: hwmon/w83627ehf: wrong fan speed
Hi Stefan,

(Replying to everyone on the list, sorry!)

On 8/10/07, Stefan Richter <[EMAIL PROTECTED]> wrote:
> Should I hardwire correct dividers or pulse per rev in sensors.conf or
> is the driver supposed to work the correct dividers out --- like it did
> before 2.6.23-rc?

The dividers are read-only in userspace. The driver manages the
dividers automatically.

The dividers are needed because the w83627ehf chip only has an 8 bit
register to count pulses for each fan. So if the fan is moving slowly,
you want the divider to be 128 so that every pulse gets counted. If
the fan is moving fast, you want the divider to be 1 so that the
register doesn't overflow.

Once the register is read in by the driver, the effect of the divider
is cancelled out in software so that you get an RPM reading from the
fan. One side effect of this is that a fast moving fan reports the
RPM more quickly than a slow moving fan.

If you turn on HWMON debugging, the driver will report when it is
changing the divider in dmesg.

Hope that helps,
David
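The way a divider is cancelled out in software, and how a driver might adjust it, can be sketched as below. This is an illustration under assumptions, not the w83627ehf driver's code: the constant 1350000, the convention that a saturated register (255) means the fan is too slow for the current divider, the threshold 0x30, and the function names are all assumptions introduced here.

```c
/* Assumed scaling constant relating (register value * divider) to RPM. */
#define FAN_RPM_CONSTANT 1350000U
/* Largest divider mentioned in the mail. */
#define FAN_DIV_MAX      128U
/* Hypothetical threshold below which resolution is considered poor. */
#define FAN_REG_LOW      0x30U

/* Cancel the divider out in software to get an RPM figure.
 * 0 and 255 are treated as "no usable reading" (assumption). */
unsigned int fan_rpm_from_reg(unsigned char reg, unsigned int div)
{
	if (reg == 0 || reg == 255)
		return 0;
	return FAN_RPM_CONSTANT / (reg * div);
}

/* Pick the divider to use for the next measurement. */
unsigned int next_divider(unsigned char reg, unsigned int div)
{
	if (reg == 255 && div < FAN_DIV_MAX)
		return div * 2;	/* register saturated: fan too slow for this divider */
	if (reg < FAN_REG_LOW && div > 1)
		return div / 2;	/* small reading: fan fast, gain resolution */
	return div;
}
```

With these assumptions a reading of 100 at divider 8 works out to 1350000 / 800, i.e. 1687 RPM after integer truncation.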
Re: Disk spin down issue on shut down/suspend to disk
Thomas Renninger wrote:
On Thu, 2007-08-09 at 15:16 +, Pavel Machek wrote:

Hi!

firmwarekit-discuss <[EMAIL PROTECTED]> (added to CC list)
see: http://linuxfirmwarekit.org/

But if I understand this problem right, this won't be easy. The ACPI
tables are just parsed with system ("iasl ...") and syntactical
errors/warnings are printed out. I also thought about a test,
interpreting the DSDT and read out values of cpufreq tables and sanity
check them. AFAIK the linuxfirmwarekit is not designed for that atm. You
need to compile in most parts of the acpica code and parse and interpret
DSDT/SSDT code yourself in the firmwarekit core or inside a plugin, then
do a walk_namespace call or whatever to find the functions/parts you
like to examine. This is a lot of work and needs a proper design
(providing an interface to plugins to let them easily check specific
AML/ASL code).

Furthermore, we don't really know what we're looking for. How can you
tell a given write to an ioport is issuing STANDBYNOW to an ATA disk or
trying to power the machine off? Adding to the fun, many modern ATA
controllers have more than one way to issue a command. Maybe we can match
accesses inside regions specified by PCI BARs :-(

Hmmm... perhaps we should do it the other way. ACPI is allowed to touch
the embedded controller, what else? Maybe we should warn as soon as ACPI
touches non-EC I/O port?

This is not working... ACPI can and does access all kinds of other I/O
ports and other resources.

Hmm, are the disk accesses done by ACPI via OperationRegion/Field
declared variables? I try to get a check for those clashing with native
drivers (hopefully this approach is successful for 2.6.24, can't say for
sure yet), I wonder whether this one would give a warning like "Libata
driver is using the same SystemIO/SystemMem resources as ACPI
OperationRegion declaration XY". This would not solve the problem, but
at least show the need of such a test. Such ACPI vs native driver
interference problems are very hard nuts (in identifying and solving).
Can someone post an ASL code snippet how ACPI actually accesses the disk
and in which parts/functions, pls.

Again, it's not believed that this is being done via AML, but via a BIOS
SMM trap on the ACPI sleep state hardware IO port. We have no real
ability to find out what the BIOS is doing or prevent it in this case.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: [PATCH 3/4] Embed zone_id information within the zonelist->zones pointer
On Friday 10 August 2007 21:02, Christoph Lameter wrote:
> On Fri, 10 Aug 2007, Andi Kleen wrote:
> > > x86_64 does not support ZONE_HIGHMEM.
> >
> > I also plan to eliminate ZONE_DMA soon (and replace all its users
> > with a new allocator that sits outside the normal fallback lists)
>
> Hallelujah. You are my hero! x86_64 will switch off CONFIG_ZONE_DMA?

Yes. i386 too actually.

The DMA zone will be still there, but only reachable with special
functions. This is fine because the default zone protection heuristics
keep DMA near always free from !GFP_DMA allocations anyways -- so it
doesn't make much difference if it's totally unreachable. swiotlb will
also use the same pool.

Also all callers are going to pass masks around so it's always clear
what address range they really need. Actually a lot of them pass still
16MB simply because it is hard to find out what masks old undocumented
hardware really needs. But this could change.

This also means the DMA support in sl[a-z]b is not needed anymore.
I went through near all GFP_DMA users and found they're usually happy
enough with pages. If someone comes up who really needs lots of
subobjects the right way for them would be likely extending the pci
pool allocator for this case. But I haven't found a need for this yet.

-Andi
Re: [PATCH 6/24] make atomic_read() behave consistently on frv
Chris Snook <[EMAIL PROTECTED]> wrote:
>
> cpu_relax() contains a barrier, so it should do the right thing. For
> non-smp architectures, I'm concerned about interacting with interrupt
> handlers. Some drivers do use atomic_* operations.

What problems with interrupt handlers? Access to int/long must
be atomic or we're in big trouble anyway.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH] x86: make io-apic not connected pin print complete
[PATCH] x86: make io-apic not connected pin print complete

Normally there are two segments of not-connected pins: pin 0, and the
pins after 15. So we need to print out "not connected\n" for the
previous segment before printing out the connected pins' info.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 050141c..a591679 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -874,6 +874,10 @@ static void __init setup_IO_APIC_irqs(void)
 				apic_printk(APIC_VERBOSE, ", %d-%d", mp_ioapics[apic].mpc_apicid, pin);
 			continue;
 		}
+		if (!first_notcon) {
+			apic_printk(APIC_VERBOSE, " not connected.\n");
+			first_notcon = 1;
+		}
 		irq = pin_2_irq(idx, apic, pin);
 		add_pin_to_irq(irq, apic, pin);
@@ -884,7 +888,7 @@ static void __init setup_IO_APIC_irqs(void)
 	}
 	if (!first_notcon)
-		apic_printk(APIC_VERBOSE," not connected.\n");
+		apic_printk(APIC_VERBOSE, " not connected.\n");
 }
 /*
diff --git a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
index 893df82..39cf860 100644
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -1301,6 +1301,11 @@ static void __init setup_IO_APIC_irqs(void)
 			continue;
 		}
+		if (!first_notcon) {
+			apic_printk(APIC_VERBOSE, " not connected.\n");
+			first_notcon = 1;
+		}
+
 		entry.trigger = irq_trigger(idx);
 		entry.polarity = irq_polarity(idx);
Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
>>>> That means GCC cannot compile Linux; it already optimises
>>>> some accesses to scalars to smaller accesses when it knows
>>>> it is allowed to. Not often though, since it hardly ever
>>>> helps in the cost model it employs.
>>>
>>> Please give an example code snippet + gcc version + arch
>>> to back this up.
>>
>> unsigned char f(unsigned long *p)
>> {
>> 	return *p & 1;
>> }
>
> This doesn't really matter since we only care about the LSB.

It is exactly what I claimed, and what you asked proof of.

> Do you have an example where gcc reads it non-atomically and
> we care about all parts?

Like I explained in the original mail; no, I suspect such
a testcase will be really hard to construct, esp. as a small
testcase. I have no reason to believe it is impossible to
do so though -- maybe someone else can write trickier code
than I can, in which case, please do so.

Segher
Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
On Sat, Aug 11, 2007 at 02:38:40AM +0200, Segher Boessenkool wrote:
> >>That means GCC cannot compile Linux; it already optimises
> >>some accesses to scalars to smaller accesses when it knows
> >>it is allowed to. Not often though, since it hardly ever
> >>helps in the cost model it employs.
> >
> >Please give an example code snippet + gcc version + arch
> >to back this up.
>
> unsigned char f(unsigned long *p)
> {
> 	return *p & 1;
> }

This doesn't really matter since we only care about the LSB.
Do you have an example where gcc reads it non-atomically and
we care about all parts?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH] Make rcutorture RNG use locally grown entropy
This patch converts rcutorture's random-number generator from
get_random_bytes() (which has locking issues in some builds with
patches) to instead use local-to-rcutorture statistical counters.
This involves reading other CPUs' statistics, so the frequency of
entropy addition is simultaneously decreased by an order of magnitude.

This patch is an alternative to adding an EXPORT_SYMBOL_GPL() for
the new cpu_clock() API.

Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
---

 rcutorture.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff -urpNa -X dontdiff linux-2.6.22.1-rt4/kernel/rcutorture.c linux-2.6.22.1-rt4-rcutorturesched/kernel/rcutorture.c
--- linux-2.6.22.1-rt4/kernel/rcutorture.c 2007-07-21 16:58:22.0 -0700
+++ linux-2.6.22.1-rt4-rcutorturesched/kernel/rcutorture.c 2007-08-10 08:42:41.0 -0700
@@ -155,26 +155,27 @@ rcu_torture_free(struct rcu_torture *p)
 struct rcu_random_state {
 	unsigned long rrs_state;
 	long rrs_count;
+	int rrs_cpu;
 };

 #define RCU_RANDOM_MULT 39916801 /* prime */
 #define RCU_RANDOM_ADD 479001701 /* prime */
-#define RCU_RANDOM_REFRESH 1
+#define RCU_RANDOM_REFRESH 10

 #define DEFINE_RCU_RANDOM(name) struct rcu_random_state name = { 0, 0 }

 /*
  * Crude but fast random-number generator. Uses a linear congruential
- * generator, with occasional help from get_random_bytes().
+ * generator, with occasional help from other CPUs' fast-running statistics.
  */
 static unsigned long
 rcu_random(struct rcu_random_state *rrsp)
 {
-	long refresh;
-
 	if (--rrsp->rrs_count < 0) {
-		get_random_bytes(&refresh, sizeof(refresh));
-		rrsp->rrs_state += refresh;
+		rrsp->rrs_cpu = next_cpu(rrsp->rrs_cpu, cpu_online_map);
+		if (rrsp->rrs_cpu >= NR_CPUS)
+			rrsp->rrs_cpu = 0;
+		rrsp->rrs_state += per_cpu(rcu_torture_count, rrsp->rrs_cpu)[0];
 		rrsp->rrs_count = RCU_RANDOM_REFRESH;
 	}
 	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
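The generator's structure can be tried out in user space with a sketch along the following lines. The multiplier and increment are the ones shown in the patch; everything else is an assumption made for illustration: the names `lcg_state`, `lcg_next` and `LCG_REFRESH`, the refresh interval value, passing the fresh entropy in as a plain argument (where the patch reads another CPU's counter), and returning the raw state, since the quoted hunk ends before the function's return statement.

```c
/* Hypothetical user-space analogue of struct rcu_random_state. */
struct lcg_state {
	unsigned long state;	/* current generator state */
	long count;		/* steps left before fresh entropy is mixed in */
};

#define LCG_MULT    39916801UL	/* multiplier from the patch */
#define LCG_ADD     479001701UL	/* increment from the patch */
#define LCG_REFRESH 10000	/* hypothetical refresh interval */

/*
 * Advance the generator one step.  Every LCG_REFRESH steps the caller's
 * "fresh" value is added to the state; otherwise it is ignored.
 */
unsigned long lcg_next(struct lcg_state *s, unsigned long fresh)
{
	if (--s->count < 0) {
		s->state += fresh;
		s->count = LCG_REFRESH;
	}
	s->state = s->state * LCG_MULT + LCG_ADD;
	return s->state;
}
```

Starting from an all-zero state, the first call mixes in the fresh value straight away, and later calls ignore it until the counter runs out again, which mirrors the "occasional help" described in the patch comment.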
Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
>> That means GCC cannot compile Linux; it already optimises
>> some accesses to scalars to smaller accesses when it knows
>> it is allowed to. Not often though, since it hardly ever
>> helps in the cost model it employs.
>
> Please give an example code snippet + gcc version + arch
> to back this up.

unsigned char f(unsigned long *p)
{
	return *p & 1;
}

with both

powerpc64-linux-gcc (GCC) 4.3.0 20070731 (experimental)

and

powerpc64-linux-gcc-4.2.0 (GCC) 4.2.0

(sorry, I don't have anything newer or older right now; if you
really care, I can test with those too)

generate (in 64-bit mode):

.L.f:
	lbz 3,7(3)
	rldicl 3,3,0,63
	blr

and in 32-bit mode:

f:
	stwu 1,-16(1)
	nop
	nop
	lbz 3,3(3)
	addi 1,1,16
	rlwinm 3,3,0,31,31
	blr

(the nops are because I use --with-cpu=970).

But perhaps you do not care for PowerPC, in which case:

i686-linux-gcc (GCC) 4.2.0 20060410 (experimental)

(sorry for the old version, I don't build x86 compilers all that
often; also I don't have a 64-bit version right now):

f:
	pushl %ebp
	movl %esp, %ebp
	movl 8(%ebp), %eax
	popl %ebp
	movzbl (%eax), %eax
	andl $1, %eax
	ret

If you want testing with any other versions, and/or for any
other target architecture, I can do that; it takes a few minutes
to build a compiler.

It is quite hard to build a testcase that reads more than one
part of the "long", since for small testcases the compiler will
almost always be smart enough to do one bigger read instead; but
it certainly isn't inconceivable, and anyway the compiler would
be fully within its rights to do reads non-atomically if not
instructed otherwise.


Segher
Re: [PATCH 9/24] make atomic_read() behave consistently on ia64
Chris Snook writes:

> I'll do this for the whole patchset. Stay tuned for the resubmit.

Could you incorporate Segher's patch to turn atomic_{read,set} into
asm on powerpc? Segher claims that using asm is really the only
reliable way to ensure that gcc does what we want, and he seems to
have a point.

Paul.
Re: [PATCH] powerpc: Implement atomic{,64}_{read,write}() without volatile
Segher Boessenkool writes:

> Instead, use asm() like all other atomic operations already do.
>
> Also use inline functions instead of macros; this actually
> improves code generation (some code becomes a little smaller,
> probably because of improved alias information -- just a few
> hundred bytes total on a default kernel build, nothing shocking).
>
> Signed-off-by: Segher Boessenkool <[EMAIL PROTECTED]>

Looks OK to me. In the hope that Chris Snook will pick it up and
include it with his other atomic changes:

Acked-by: Paul Mackerras <[EMAIL PROTECTED]>
Re: CFS review
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> * Roman Zippel <[EMAIL PROTECTED]> wrote:
>
> > Well, I've sent him the stuff now...
>
> received it - thanks a lot, looking at it!

everything looks good in your debug output and the TSC dump data, except
for the wait_runtime values, they are quite out of balance - and that
balance cannot be explained with jiffies granularity or with any sort of
sched_clock() artifact. So this clearly looks like a CFS regression that
should be fixed.

the only relevant thing that comes to mind at the moment is that last
week Peter noticed a buggy aspect of sleeper bonuses (in that we do not
rate-limit their output, hence we 'waste' them instead of redistributing
them), and i've got the small patch below in my queue to fix that -
could you give it a try?

this is just a blind stab in the dark - i couldn't see any real impact
from that patch in various workloads (and it's not upstream yet), so it
might not make a big difference. The trace you did (could you send the
source for that?) seems to implicate sleeper bonuses though.

if this patch doesn't help, could you check the general theory whether
it's related to sleeper-fairness, via turning it off:

   echo 30 > /proc/sys/kernel/sched_features

does the bug go away if you do that? If sleeper bonuses are showing too
many artifacts then we could turn it off for final .23.

	Ingo

->
Subject: sched: fix sleeper bonus
From: Ingo Molnar <[EMAIL PROTECTED]>

Peter Zijlstra noticed that the sleeper bonus deduction code was not
properly rate-limited: a task that scheduled more frequently would get a
disproportionately large deduction. So limit the deduction to delta_exec
and limit production to runtime_limit.

Not-Yet-Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched_fair.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: linux/kernel/sched_fair.c
===================================================================
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -75,7 +75,7 @@ enum {

 unsigned int sysctl_sched_features __read_mostly =
 		SCHED_FEAT_FAIR_SLEEPERS	*1 |
-		SCHED_FEAT_SLEEPER_AVG		*1 |
+		SCHED_FEAT_SLEEPER_AVG		*0 |
 		SCHED_FEAT_SLEEPER_LOAD_AVG	*1 |
 		SCHED_FEAT_PRECISE_CPU_LOAD	*1 |
 		SCHED_FEAT_START_DEBIT		*1 |
@@ -304,11 +304,9 @@ __update_curr(struct cfs_rq *cfs_rq, str
 	delta_mine = calc_delta_mine(delta_exec, curr->load.weight, lw);

 	if (cfs_rq->sleeper_bonus > sysctl_sched_granularity) {
-		delta = calc_delta_mine(cfs_rq->sleeper_bonus,
-					curr->load.weight, lw);
-		if (unlikely(delta > cfs_rq->sleeper_bonus))
-			delta = cfs_rq->sleeper_bonus;
-
+		delta = min(cfs_rq->sleeper_bonus, (u64)delta_exec);
+		delta = calc_delta_mine(delta, curr->load.weight, lw);
+		delta = min((u64)delta, cfs_rq->sleeper_bonus);
 		cfs_rq->sleeper_bonus -= delta;
 		delta_mine -= delta;
 	}
@@ -521,6 +519,8 @@ static void __enqueue_sleeper(struct cfs
 	 * Track the amount of bonus we've given to sleepers:
 	 */
 	cfs_rq->sleeper_bonus += delta_fair;
+	if (unlikely(cfs_rq->sleeper_bonus > sysctl_sched_runtime_limit))
+		cfs_rq->sleeper_bonus = sysctl_sched_runtime_limit;

 	schedstat_add(cfs_rq, wait_runtime, se->wait_runtime);
 }
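To make the two clamps in this patch easier to reason about, here is a small user-space sketch of the arithmetic only. It is an illustration under assumptions, not the scheduler code: struct fake_rq, scale_by_weight() (a simplified stand-in for calc_delta_mine()) and the parameter names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one cfs_rq field we care about. */
struct fake_rq {
	uint64_t sleeper_bonus;
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Simplified stand-in for calc_delta_mine(): scale delta by weight/total. */
static uint64_t scale_by_weight(uint64_t delta, uint64_t weight, uint64_t total)
{
	return total ? delta * weight / total : delta;
}

/*
 * Deduction side: never take more than the task actually ran (delta_exec)
 * and never more than the bonus that is left. Returns the amount deducted.
 */
static uint64_t deduct_bonus(struct fake_rq *rq, uint64_t delta_exec,
			     uint64_t weight, uint64_t total,
			     uint64_t granularity)
{
	uint64_t delta;

	assert(rq != NULL);
	if (rq->sleeper_bonus <= granularity)
		return 0;
	delta = min_u64(rq->sleeper_bonus, delta_exec);
	delta = scale_by_weight(delta, weight, total);
	delta = min_u64(delta, rq->sleeper_bonus);
	rq->sleeper_bonus -= delta;
	return delta;
}

/* Production side: the accumulated bonus is capped at runtime_limit. */
static void add_bonus(struct fake_rq *rq, uint64_t delta_fair,
		      uint64_t runtime_limit)
{
	assert(rq != NULL);
	rq->sleeper_bonus += delta_fair;
	if (rq->sleeper_bonus > runtime_limit)
		rq->sleeper_bonus = runtime_limit;
}
```

The point of the first min() is that a task which schedules often runs for a short delta_exec each time, so each deduction is bounded by that short run instead of by the whole outstanding bonus.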
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 01:30:55PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
> > On Fri, 10 Aug 2007 08:12:08 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> wrote:
> >
> > > > One used to use sched_clock() for this, then get frowned at. Now we
> > > > have cpu_clock()...
> > >
> > > Hmmm... And cpu_clock() is not in 2.6.22, so must appear in some later
> > > release. Which means that the rate of API change in this area is a
> > > bit high, so I should avoid it like the plague.
> >
> > eh, it's been there for weeks. It is dust-encrusted.
> >
> > > Therefore, I should
> > > look for some other convenient source of entropy.
> > >
> > > One convenient source would be the per-CPU statistics that rcutorture
> > > maintains. Of course, a given CPU's RNG is nearly in lock-step with
> > > its own statistics, but not with the adjacent CPU's statistics...
> > >
> > > I will send a patch.
> >
> > Please use cpu_clock(). It ain't going away.
>
> D'accord...

Errmmm... No joy.

ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!

Turns out that cpu_clock also ain't exported, and rcutorture.c is
a module. Would adding an EXPORT_SYMBOL_GPL() as in the patch below
be acceptable? If not, I have a tested patch to rcutorture.c that
leverages statistical counters.

Your choice.

						Thanx, Paul

Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
Compiles, but not yet tested.

Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
---

 rcutorture.c |8 ++--
 sched.c |2 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
--- linux-2.6.23-rc2/kernel/rcutorture.c	2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c	2007-08-10 17:15:22.0 -0700
@@ -42,7 +42,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -166,16 +165,13 @@ struct rcu_random_state {

 /*
  * Crude but fast random-number generator. Uses a linear congruential
- * generator, with occasional help from get_random_bytes().
+ * generator, with occasional help from cpu_clock().
  */
 static unsigned long
 rcu_random(struct rcu_random_state *rrsp)
 {
-	long refresh;
-
 	if (--rrsp->rrs_count < 0) {
-		get_random_bytes(&refresh, sizeof(refresh));
-		rrsp->rrs_state += refresh;
+		rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
 		rrsp->rrs_count = RCU_RANDOM_REFRESH;
 	}
 	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
--- linux-2.6.23-rc2/kernel/sched.c	2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
@@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
 	return now;
 }

+EXPORT_SYMBOL_GPL(cpu_clock);
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 /* Change a task's ->cfs_rq if it moves across CPUs */
 static inline void set_task_cfs_rq(struct task_struct *p)
[PATCHv2] Fix to keep watchdog disabled by default for i386/x86_64
Fix a wrong expression that enabled the watchdogs even when the
nmi_watchdog kernel parameter wasn't set. This regression was introduced
with commit b7471c6da94d30d3deadc55986cc38d1ff57f9ca.

Introduce NMI_DISABLED (-1), which allows switching the value of
NMI_DEFAULT without breaking the APIC NMI watchdog code (again).

Fixes:
https://bugzilla.novell.com/show_bug.cgi?id=298084
http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.

Resubmit: x86_64 changes compiled but untested. Shame on me!

Signed-off-by: Daniel Gollub <[EMAIL PROTECTED]>

---
 arch/i386/kernel/apic.c |2 +-
 arch/i386/kernel/nmi.c |4 ++--
 arch/x86_64/kernel/nmi.c |4 ++--
 include/asm-i386/nmi.h |3 ++-
 include/asm-x86_64/nmi.h |3 ++-
 5 files changed, 9 insertions(+), 7 deletions(-)

diff -rup a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
--- a/arch/i386/kernel/apic.c	2007-08-04 04:49:55.0 +0200
+++ b/arch/i386/kernel/apic.c	2007-08-10 21:38:37.0 +0200
@@ -1087,7 +1087,7 @@ static int __init detect_init_APIC (void
 	if (l & MSR_IA32_APICBASE_ENABLE)
 		mp_lapic_addr = l & MSR_IA32_APICBASE_BASE;

-	if (nmi_watchdog != NMI_NONE)
+	if (nmi_watchdog != NMI_NONE && nmi_watchdog != NMI_DISABLED)
 		nmi_watchdog = NMI_LOCAL_APIC;

 	printk(KERN_INFO "Found and enabled local APIC!\n");
diff -rup a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
--- a/arch/i386/kernel/nmi.c	2007-08-04 04:49:55.0 +0200
+++ b/arch/i386/kernel/nmi.c	2007-08-10 22:00:40.0 +0200
@@ -77,7 +77,7 @@ static int __init check_nmi_watchdog(voi
 	unsigned int *prev_nmi_count;
 	int cpu;

-	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED))
 		return 0;

 	if (!atomic_read(&nmi_active))
@@ -424,7 +424,7 @@ int proc_nmi_enabled(struct ctl_table *t
 	if (!!old_state == !!nmi_watchdog_enabled)
 		return 0;

-	if (atomic_read(&nmi_active) < 0) {
+	if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
 		printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
 		return -EIO;
 	}
diff -rup a/arch/x86_64/kernel/nmi.c b/arch/x86_64/kernel/nmi.c
--- a/arch/x86_64/kernel/nmi.c	2007-08-04 04:49:55.0 +0200
+++ b/arch/x86_64/kernel/nmi.c	2007-08-10 21:59:36.0 +0200
@@ -85,7 +85,7 @@ int __init check_nmi_watchdog (void)
 	int *counts;
 	int cpu;

-	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED))
 		return 0;

 	if (!atomic_read(&nmi_active))
@@ -442,7 +442,7 @@ int proc_nmi_enabled(struct ctl_table *t
 	if (!!old_state == !!nmi_watchdog_enabled)
 		return 0;

-	if (atomic_read(&nmi_active) < 0) {
+	if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
 		printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
 		return -EIO;
 	}
diff -rup a/include/asm-i386/nmi.h b/include/asm-i386/nmi.h
--- a/include/asm-i386/nmi.h	2007-08-04 04:49:55.0 +0200
+++ b/include/asm-i386/nmi.h	2007-08-10 22:04:51.0 +0200
@@ -33,11 +33,12 @@ extern int nmi_watchdog_tick (struct pt_

 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DISABLED	-1
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
+#define NMI_DEFAULT	NMI_DISABLED

 struct ctl_table;
 struct file;
diff -rup a/include/asm-x86_64/nmi.h b/include/asm-x86_64/nmi.h
--- a/include/asm-x86_64/nmi.h	2007-08-04 04:49:55.0 +0200
+++ b/include/asm-x86_64/nmi.h	2007-08-10 22:04:41.0 +0200
@@ -64,11 +64,12 @@ extern int setup_nmi_watchdog(char *);

 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DISABLED	-1
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
+#define NMI_DEFAULT	NMI_DISABLED

 struct ctl_table;
 struct file;
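The decision logic this patch is after can be checked in isolation. Below is a user-space sketch, assuming the constants from the patched nmi.h; the helper names watchdog_wanted() and mode_after_apic_detect() are hypothetical and only mirror the two conditions touched in check_nmi_watchdog() and detect_init_APIC(). The kernel variable is an unsigned int; a plain int is used here to keep the sketch simple.

```c
#include <assert.h>

/* Constants as they look after the patch. */
enum nmi_mode {
	NMI_DISABLED   = -1,
	NMI_NONE       = 0,
	NMI_IO_APIC    = 1,
	NMI_LOCAL_APIC = 2,
	NMI_INVALID    = 3
};
#define NMI_DEFAULT NMI_DISABLED

/* Hypothetical helper: should the watchdog check run at all? */
static int watchdog_wanted(int mode)
{
	assert(mode >= NMI_DISABLED && mode <= NMI_INVALID);
	return mode != NMI_NONE && mode != NMI_DISABLED;
}

/*
 * Hypothetical helper mirroring the detect_init_APIC() hunk: only a
 * watchdog that was asked for gets switched to the local APIC.
 */
static int mode_after_apic_detect(int mode)
{
	return watchdog_wanted(mode) ? NMI_LOCAL_APIC : mode;
}
```

With the old condition (mode != NMI_NONE only), a default of -1 would have been promoted to NMI_LOCAL_APIC, which is the behaviour the bug reports describe.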
[PATCH] powerpc: Implement atomic{,64}_{read,write}() without volatile
Instead, use asm() like all other atomic operations already do.

Also use inline functions instead of macros; this actually
improves code generation (some code becomes a little smaller,
probably because of improved alias information -- just a few
hundred bytes total on a default kernel build, nothing shocking).

Signed-off-by: Segher Boessenkool <[EMAIL PROTECTED]>

---
 include/asm-powerpc/atomic.h | 34 --
 1 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/include/asm-powerpc/atomic.h b/include/asm-powerpc/atomic.h
index c44810b..bc17506 100644
--- a/include/asm-powerpc/atomic.h
+++ b/include/asm-powerpc/atomic.h
@@ -5,7 +5,7 @@
  * PowerPC atomic operations
  */

-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;

 #ifdef __KERNEL__
 #include
@@ -15,8 +15,19 @@ typedef struct { volatile int counter; } atomic_t;

 #define ATOMIC_INIT(i)		{ (i) }

-#define atomic_read(v)		((v)->counter)
-#define atomic_set(v,i)		(((v)->counter) = (i))
+static __inline__ int atomic_read(const atomic_t *v)
+{
+	int t;
+
+	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
+
+	return t;
+}
+
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
+}

 static __inline__ void atomic_add(int a, atomic_t *v)
 {
@@ -240,12 +251,23 @@ static __inline__ int atomic_dec_if_positive(atomic_t *v)

 #ifdef __powerpc64__

-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;

 #define ATOMIC64_INIT(i)	{ (i) }

-#define atomic64_read(v)	((v)->counter)
-#define atomic64_set(v,i)	(((v)->counter) = (i))
+static __inline__ long atomic64_read(const atomic64_t *v)
+{
+	long t;
+
+	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
+
+	return t;
+}
+
+static __inline__ void atomic64_set(atomic64_t *v, long i)
+{
+	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
+}

 static __inline__ void atomic64_add(long a, atomic64_t *v)
 {
-- 
1.5.2.1.144.gabc40-dirty
Re: Use of directories to hold root?
Jan Engelhardt wrote:
> On Aug 10 2007 17:24, Mark Cannon wrote:
>> You pass the kernel the root option to specify the root partition.
>> Is there a way to identify a directory in that partition that holds the
>> root or something equivalent to this?
>
> No, but you can use pivot_root.

Or better yet, use an initramfs with MS_MOVE; same as you would with the
"normal" use of initramfs.

	-hpa
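For reference, the MS_MOVE approach usually boils down to a short sequence of system calls run from the initramfs /init. The following is only a sketch under assumptions: switch_root_sketch() and its newroot parameter are hypothetical names, the directory that should become "/" is assumed to be a mount point already, and a real /init would go on to clean up the initramfs and exec the new init, both of which are left out here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Make the filesystem mounted at newroot the root of this process.
 * Returns 0 on success and -1 at the first failing step (errno is set
 * by the failing call). Needs root privileges to get past chdir().
 */
static int switch_root_sketch(const char *newroot)
{
	assert(newroot != NULL);
	if (chdir(newroot) < 0)
		return -1;
	/* Move the mount we are standing in on top of "/". */
	if (mount(".", "/", NULL, MS_MOVE, NULL) < 0)
		return -1;
	/* Make the moved mount the root directory of this process. */
	if (chroot(".") < 0)
		return -1;
	return chdir("/");
}
```

To use a plain directory inside the root partition, it would first have to be turned into a mount point of its own (for example with a bind mount) before a sequence like this is applied to it.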
Re: [patch 1/2] spinlock: lockbreak cleanup
Nick, These two patches make my P4 (single socket HT) test box not boot. I dropped them for now. Some oopses -Andi NMI Watchdog detected LOCKUP on CPU 1 CPU 1 Modules linked in: Pid: 1648, comm: sh Not tainted 2.6.23-rc2-git3 #472 RIP: 0010:[] [] _spin_lock+0x10/0x18 RSP: 0018:810001127f20 EFLAGS: 0097 RAX: df84 RBX: 8100398de040 RCX: 810001105850 RDX: 810080852000 RSI: RDI: 810001017180 RBP: 810001127f58 R08: 1001 R09: 807c5180 R10: 0001 R11: 8030ed1e R12: 810001017180 R13: 8100398de040 R14: 0001 R15: 81003a6c3b48 FS: 2b0f1abcef60() GS:81003e0ffcc0() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 0045b090 CR3: 3db5c000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process sh (pid: 1648, threadinfo 81003948, task 8100398de040) Stack: 8022fd10 000104ab 8100398de040 8100398de040 81003d086ac0 810001017180 0001 8023b91a 807c25c8 810039481cd0 8021ae58 807c25c8 8021b461 81003a51b408 0001 8020bfd6 810039481cd0 810039481de8 8030ed1e 8001 0206 81003948 810001017180 81003948 81003da4c850 8100398de040 ff10 80545e8a 0010 0246 810039481d58 0018 0086 80311570 8100398d91c0 81003ad83e50 8100398de040 81003e0e0790 8100398de248 39481dc8 81003da4c850 142e 8100398de040 81000100e208 81003e0e0790 810039481e88 810039481e90 81000100e180 810039481e68 0059a4f0 810039481df8 8022f56c 810039481e08 80545f79 810039481e58 80545fb3 0002 0292 81003db92000 81003e0e0790 0001 81000100e180 81003e0e0790 0001 810039481ed8 8022fa8f 810039481e68 810039481e68 8100398de040 0001 0001 81000101 810039481e98 810039481e98 81003db10be0 0202 8100398d91c0 398d91c0 81003db92000 0059a920 81003affd380 8028399b 810039481f58 81003db92000 0059a920 0059a4f0 81003db92000 0059a920 0059a8b0 8020a1ec 2b0f1a9a7628 00594e20 0059a4f0 00599c01 00594e20 8020b767 0059a8b0 0059a920 00594e20 00599c01 0059a4f0 00594e20 0202 2b0f1b28 003b 0059a920 0059a4f0 00594e20 003b 2b0f1aa33d97 0033 0202 7fff906c42c8 002b Call Trace: [] scheduler_tick+0x3e/0x149 [] update_process_times+0x5c/0x68 [] smp_local_timer_interrupt+0x34/0x55 [] 
smp_apic_timer_interrupt+0x44/0x5b [] apic_timer_interrupt+0x66/0x70 [] nfs_permission+0x0/0x1d1 [] thread_return+0x58/0xd0 [] nfs_file_open+0x0/0x7c [] __cond_resched+0x1c/0x44 [] cond_resched+0x2e/0x39 [] wait_for_completion+0x17/0xbe [] sched_exec+0xb3/0xce [] do_execve+0x5d/0x1a6 [] sys_execve+0x36/0x8b [] stub_execve+0x67/0xb0 Code: 8a 07 0f ae e8 eb f3 c3 f0 81 2f 00 00 00 01 74 05 e8 a8 62 Kernel panic - not syncing: Aiee, killing interrupt handler! (another boot) NMI Watchdog detected LOCKUP on CPU 0 CPU 0 Modules linked in: Pid: 1193, comm: udevstart Not tainted 2.6.23-rc2-git3 #474 RIP: 0010:[] [] _spin_lock+0x15/0x18 RSP: 0018:81003a6cf8d0 EFLAGS: 0002 RAX: 6a6b RBX: 807c6180 RCX: RDX: RSI: 81003a6cf930 RDI: 81000100e180 RBP: 81003a6cf8f8 R08: 81003a7f9680 R09: 81003a628b48 R10: 0053b31b R11: 8030eb02 R12: 81000100e180 R13: 81003a6cf930 R14: 810001118100 R15: FS: 2b85cee96b00() GS:8072d000() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 2b6f5d8d6310 CR3: 3a524000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process udevstart (pid: 1193, threadinfo 81003a6ce000, task 810001263890) Stack: 8022c09b 0003 0001 810001118100 806d6188 81003a6cf968 8022d166 3a6cf928 0003 810001017180 810001017180 81003a6cf948 0092 81003a6cf978 81003a043d20 00
Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
On Fri, Aug 10, 2007 at 10:07:27PM +0200, Segher Boessenkool wrote:
>
> That means GCC cannot compile Linux; it already optimises
> some accesses to scalars to smaller accesses when it knows
> it is allowed to. Not often though, since it hardly ever
> helps in the cost model it employs.

Please give an example code snippet + gcc version + arch
to back this up.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
RE: [PATCH 9/24] make atomic_read() behave consistently on ia64
> Here are the functions in which they occur in the object file. You
> may have to chase down some inlining to find the function that
> actually uses atomic_*().

Ignore this ... Andreas' patch was only two lines so I
thought I'd "save time" by just hand-editing the source
over on my build machine. I managed to goof that by
editing the wrong function for one of the cases. :-(

New result. With Andreas's patch correctly applied, the
generated vmlinux is identical with/without your patch.

-Tony
Re: [PATCH] Fix to keep watchdog disabled by default for i386/x86_64
> +#deifne NMI_DEFAULT NMI_DISABLED

Actually tested?

-Andi
Re: Documentation files in html format?
On 08/10/2007 10:12 PM, Sam Ravnborg wrote:

>> What primary requirements does in-tree Linux kernel documentation have
>> to fulfill in general? Skipping the obvious ones such as correct,
>> up-to-date etc.
>
> o Readable as-is
> o Grepable
> o buildable as structured documents or almost like a single book
> o Easy to replicate structure
> o Maintainable in any decent text-editor (emacs, vim, whatever)

Easy to put online?

> Asciidoc is quite close to plaintext and it looks to me that the
> formatting possibilities are quite good.
>
> I spend an hour experimenting a little with
> Documentation/kbuild/makefiles.txt.
> Diff below shows quite a lot of changes but for the most this is
> removal of the indent tab. Most likely I could have tweaked asciidoc
> to accept this but wanted to use default config.
>
> The resulting html page can be seen here:
> http://www.ravnborg.org/kbuild/makefiles.html

FWIW, this looks very good to me...

Rene.
Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls
David Miller wrote:
> From: Ben Greear <[EMAIL PROTECTED]>
> Date: Fri, 10 Aug 2007 15:40:02 -0700
>
>> For GSO on output, is there a generic fallback for any driver that
>> does not specifically implement GSO?
>
> Absolutely, in fact that's mainly what it's there for.
>
> I don't think there is any issue. The knob is there via
> ethtool for people who really want to disable it.

Just to be paranoid (who me?) we are then at a point where what happened
a couple months ago with forwarding between 10G and IPoIB won't happen
again - where things failed because a 10G NIC had LRO enabled by default?

rick jones
Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK
On 8/10/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> The idea of adding code to deal with "I have no memory" situations
> in a kernel that is based on having as much memory as possible in use
> at all times is plainly the wrong approach.

No. It is you who have read the patches wrongly, because what you
imply here is exactly backwards.

> If you need memory then memory needs
> to be reclaimed. That is the basic way that things work

Wrong. A naive reading of your comment would suggest you do not
understand how PF_MEMALLOC works, and that it has worked that way from
day one (well, since long before I arrived) and that we just do more
of the same, except better.

> and following that
> through brings about a much less invasive solution without all the issues
> that the proposed solution creates.

What issues? Test case please, a real one that you have run yourself.
Please, no more theoretical issues that cannot be demonstrated in
practice because they do not exist.

Regards,

Daniel
Re: [v4l-dvb-maintainer] [2.6 patch] dvb_frontend_ioctl(): fix check-after-use
On Fri, 10 Aug 2007, Markus Rechberger wrote:
> On 8/1/07, Manu Abraham <[EMAIL PROTECTED]> wrote:
> > On 7/31/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > > The Coverity checker spotted that we have already oops'ed if "fe" was
> > > NULL.
> > >
> > > --- linux-2.6.23-rc1-mm1/drivers/media/dvb/dvb-core/dvb_frontend.c.old
> > > +++ linux-2.6.23-rc1-mm1/drivers/media/dvb/dvb-core/dvb_frontend.c
> > > @@ -706,11 +706,11 @@ static int dvb_frontend_ioctl(struct ino
> > > -	if (!fe || fepriv->exit)
> > > +	if (fepriv->exit)
> > > 		return -ENODEV;
>
> This issue has been known for a while including some other problems at
> that part.
>
> http://article.gmane.org/gmane.linux.drivers.dvb/35351/match=patch+dvb_net+hotplugging+support
>
> this includes a link where this and more got discussed in May.

For dvb_net_close, I like the patch I already posted better. To fix the
check-after-use, it's not the "use" part that's the problem, it's the
"check" part that isn't necessary. I traced the dvb-net code,
http://article.gmane.org/gmane.linux.kernel/543689, and I'm sure that
dvbdev can't be NULL. My patch also deletes a few pieces of duplicated
code by calling dvb_generic_release().

The only problem is that practically no one uses dvb-net, so it's very
hard to test these patches.

In all the dvb code, where is the locking for device open and release?
I don't see it. What is preventing two threads from trying to open
and/or close the same dvb device at the same time?
Re: [PATCH 9/24] make atomic_read() behave consistently on ia64
Linus Torvalds wrote:
>
> On Fri, 10 Aug 2007, Luck, Tony wrote:
>> Here are the functions in which they occur in the object file. You
>> may have to chase down some inlining to find the function that
>> actually uses atomic_*().
>
> Could you just make the "atomic_read()" and "atomic_set()" functions be
> inline functions instead?
>
> That way you get nice compiler warnings when you pass the wrong kind of
> object around. So
>
> 	static void atomic_set(atomic_t *p, int value)
> 	{
> 		*(volatile int *)&p->value = value;
> 	}
>
> 	static int atomic_read(atomic_t *p)
> 	{
> 		return *(volatile int *)&p->value;
> 	}
>
> etc...

I'll do this for the whole patchset. Stay tuned for the resubmit.

	-- Chris
RE: [PATCH 9/24] make atomic_read() behave consistently on ia64
On Fri, 10 Aug 2007, Luck, Tony wrote:
>
> Here are the functions in which they occur in the object file. You
> may have to chase down some inlining to find the function that
> actually uses atomic_*().

Could you just make the "atomic_read()" and "atomic_set()" functions be
inline functions instead?

That way you get nice compiler warnings when you pass the wrong kind of
object around. So

	static void atomic_set(atomic_t *p, int value)
	{
		*(volatile int *)&p->value = value;
	}

	static int atomic_read(atomic_t *p)
	{
		return *(volatile int *)&p->value;
	}

etc...

		Linus
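A compilable user-space rendering of the suggestion, for anyone who wants to see the type checking in action. This is a sketch under assumptions and not any architecture's actual header: sketch_atomic_t and the sketch_ prefixed functions are hypothetical names, and the field is called counter here as in the existing atomic_t definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for atomic_t; note there is no volatile on the field. */
typedef struct {
	int counter;
} sketch_atomic_t;

/* The volatile cast forces a real store on each call. */
static inline void sketch_atomic_set(sketch_atomic_t *p, int value)
{
	assert(p != NULL);
	*(volatile int *)&p->counter = value;
}

/* The volatile cast forces a real load on each call. */
static inline int sketch_atomic_read(const sketch_atomic_t *p)
{
	assert(p != NULL);
	return *(const volatile int *)&p->counter;
}
```

Because these are functions rather than macros, passing something like a plain int pointer instead of a sketch_atomic_t pointer is diagnosed by the compiler, whereas a macro expanding to (v)->counter accepts any struct that happens to have a counter member.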
Re: Documentation files in html format?
>
> The problem I have with asciidoc is that it's a nightmare to get it
> to work. It's what GIT uses, and after spending a whole day trying
> to *build* that thing, I finally resigned and asked Junio if he could
> publish the pre-formatted manpages himself, which he agreed to.

Git uses, in addition to asciidoc, also docbook and a bit more.
As asciidoc is just some Python scripts, it should be trivial to install
with no build required.
Maybe it was the docbook stuff you had trouble with?

My Kbuild example was made without using any tools other than asciidoc,
but if pdf is desired some additional tools are needed.

	Sam
[PATCH] powerpc fix for assembler -g
ppc64 does the unusual thing of using #include on a compiler-generated
assembly file (lparmap.s) from an assembly source file (head_64.S).
This runs afoul of my recent patch to pass -gdwarf2 to the assembler
under CONFIG_DEBUG_INFO. This patch avoids the problem by disabling
DWARF generation (-g0) when producing lparmap.s.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/powerpc/kernel/Makefile |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index f39a72f..b0cb2e6 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -81,6 +81,7 @@ obj-y += iomap.o
 endif

 ifeq ($(CONFIG_PPC_ISERIES),y)
+CFLAGS_lparmap.s += -g0
 extra-y += lparmap.s
 $(obj)/head_64.o: $(obj)/lparmap.s
 AFLAGS_head_64.o += -I$(obj)
RE: [PATCH 9/24] make atomic_read() behave consistently on ia64
> Possibly. Either that or we've uncovered some latent bugs. Maybe a
> combination of the two. Can you list those 19 changes so we can
> evaluate them?

Here are the functions in which they occur in the object file. You
may have to chase down some inlining to find the function that
actually uses atomic_*().

freeque
do_msgrcv
sk_free
sock_wfree
sock_rfree
sock_kmalloc
sock_kfree_s
sock_setsockopt
skb_release_data
__sk_stream_mem_reclaim
sk_stream_mem_schedule
sk_stream_rfree
sk_attach_filter
ip_frag_destroy * 2
ip_frag_queue * 2
ip_frag_reasm * 2

-Tony
Re: Noatime vs relatime
On 08/10/2007 05:10 PM, Matti Aarnio wrote:

On Fri, Aug 10, 2007 at 07:26:46AM -0700, Vlad wrote:
...

"Warning: Atime will be disabled by default in future kernel versions, but you will still be able to turn it on when configuring the kernel."

This should give a heads-up to the 0.001% of people who still use atime so that they know to customize this option or start using modern file-monitoring techniques like inotify.

NO for two reasons:
 - atime semantics are just fine in server environments
 - inotify IS NOT scalable to millions of files, nor to situations where we want to check alteration weeks or months after the fact

In reality I would perhaps prefer mount-behaviour being altered from 'by default do atime' to 'by default do noatime'.

I must say I've been wondering about relatime a bit as well. Are there actually users who do really want atime, but not badly enough to want real atime? I've been running with noatime for years now and do not plan on changing that so have been shrugging this entire discussion off with "no care of mine", but whose care _is_ it?

There MUST be an easy way to tell system that "yes, I want to track last accesstime."

mount -o atime. Or as far as I'm concerned, keep the default as posixly compliant as one wants and teach people and distributions to mount "noatime" as I hear some have already been doing.

I may be wrong, but to me, relatime sounds like compromising for the sake of compromising...

Rene.
Re: early boot lockup with 2.6.23-rc1
Mikko Rapeli wrote:
>
> Oops, I was wrong and bad enough to think nesting #ifdef's would work;
> 2.6.23-rc2 with query_mca() to query_edd() in arch/i386/boot/main.c
> commented out works.
>
> Sorry about that one.
>

OK, good. That would be consistent with the current analysis. Let me know what you get out of the test patch I sent.

	-hpa
Re: CFS review
Hi,

On Fri, 10 Aug 2007, Ingo Molnar wrote:

> achieve that. It probably wont make a real difference, but it's really
> easy for you to send and it's still very useful when one tries to
> eliminate possibilities and when one wants to concentrate on the
> remaining possibilities alone.

The thing I'm afraid of with CFS is its possible unpredictability, which would make it hard to reproduce problems, and we may end up with users with unexplainable weird problems. That's the main reason I'm trying so hard to push for a design discussion.

Just to give an idea, here are two more examples of irregular behaviour, which are hopefully easier to reproduce.

1. Two simple busy loops, one of them reniced to 15. According to my calculations the reniced task should get about 3.4% (1/(1.25^15+1)), but I get this:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4433 roman     20   0  1532  300  244 R 99.2  0.2   5:05.51 l
 4434 roman     35  15  1532   72   16 R  0.7  0.1   0:10.62 l

OTOH up to nice level 12 I get what I expect.

2. If I start 20 busy loops, initially I see in top that every task gets 5% and time increments equally (as it should):

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4492 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4491 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4490 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4489 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4488 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4487 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4486 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4485 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4484 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4483 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4482 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4481 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4480 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4479 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4478 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4477 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4476 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4475 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4474 roman     20   0  1532   68   16 R  5.0  0.1   0:02.86 l
 4473 roman     20   0  1532  296  244 R  5.0  0.2   0:02.86 l

But if I renice all of them to -15, the time every task gets is rather random:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4492 roman      5 -15  1532   68   16 R  1.0  0.1   0:07.95 l
 4491 roman      5 -15  1532   68   16 R  4.3  0.1   0:07.62 l
 4490 roman      5 -15  1532   68   16 R  3.3  0.1   0:07.50 l
 4489 roman      5 -15  1532   68   16 R  7.6  0.1   0:07.80 l
 4488 roman      5 -15  1532   68   16 R  9.6  0.1   0:08.31 l
 4487 roman      5 -15  1532   68   16 R  3.3  0.1   0:07.59 l
 4486 roman      5 -15  1532   68   16 R  6.6  0.1   0:07.08 l
 4485 roman      5 -15  1532   68   16 R 10.0  0.1   0:07.31 l
 4484 roman      5 -15  1532   68   16 R  8.0  0.1   0:07.30 l
 4483 roman      5 -15  1532   68   16 R  7.0  0.1   0:07.34 l
 4482 roman      5 -15  1532   68   16 R  1.0  0.1   0:05.84 l
 4481 roman      5 -15  1532   68   16 R  1.0  0.1   0:07.16 l
 4480 roman      5 -15  1532   68   16 R  3.3  0.1   0:07.00 l
 4479 roman      5 -15  1532   68   16 R  1.0  0.1   0:06.66 l
 4478 roman      5 -15  1532   68   16 R  8.6  0.1   0:06.96 l
 4477 roman      5 -15  1532   68   16 R  8.6  0.1   0:07.63 l
 4476 roman      5 -15  1532   68   16 R  9.6  0.1   0:07.38 l
 4475 roman      5 -15  1532   68   16 R  1.3  0.1   0:07.09 l
 4474 roman      5 -15  1532   68   16 R  2.3  0.1   0:07.97 l
 4473 roman      5 -15  1532  296  244 R  1.0  0.2   0:07.73 l

bye, Roman
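As a rough check of the 3.4% figure in example 1, the expected share can be computed directly. This is only a minimal sketch in Python, assuming (as the message does) that each nice level scales a task's weight by a factor of 1.25; the function name `expected_share` is made up for illustration and is not part of the scheduler.

```python
def expected_share(nice_a, nice_b, step=1.25):
    """Expected CPU share of task A when it competes with task B alone.

    Assumption: weight = step ** (-nice), i.e. every nice level changes
    the weight by a factor of 1.25, as in the calculation quoted above.
    """
    weight_a = step ** (-nice_a)
    weight_b = step ** (-nice_b)
    return weight_a / (weight_a + weight_b)


# Task reniced to 15 against a task at nice 0: 1 / (1.25**15 + 1)
print(f"nice 15 vs nice 0: {expected_share(15, 0):.1%}")
# Same comparison at nice 12, the last level reported to behave as expected
print(f"nice 12 vs nice 0: {expected_share(12, 0):.1%}")
```

Under that assumption the reniced task should receive roughly five times the 0.7% that top reports for it.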
Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls
From: Ben Greear <[EMAIL PROTECTED]>
Date: Fri, 10 Aug 2007 15:40:02 -0700

> For GSO on output, is there a generic fallback for any driver that
> does not specifically implement GSO?

Absolutely, in fact that's mainly what it's there for.

I don't think there is any issue. The knob is there via ethtool for people who really want to disable it.
Re: Partition information lost on reboot.
On 08/10/2007 02:30 PM, Michal Piotrowski wrote:

[Adding linux-scsi and Adaptec support to CC]

On 10/08/07, Jegadeesh <[EMAIL PROTECTED]> wrote:

Hi, I have a scsi disk on Adaptec ASC-29320 U320. I have created a linux partition and ext3 filesystem over it. Now the problem is, whenever the machine is rebooted, the partition information to the OS is lost and I get an error saying it as a not valid block device. But fdisk tool shows the partitions, but "cat /proc/partitions" doesnt have this. I need to do a "partprobe" and then have to mount it explicitly. What could be causing this problem. Given below are some of the command outputs.

Is that the "Adaptec AIC79xx U320 support" (CONFIG_SCSI_AIC79XX) driver? If so, did you lower the "Initial bus reset delay" (default is 5000 ms) in the kernel configuration?

Rene.
Re: [PATCH 9/24] make atomic_read() behave consistently on ia64
Luck, Tony wrote:

Use atomic64_read to read an atomic64_t.

Thanks Andreas! Chris: This bug is why the 8-byte loads got changed to 4-byte + sign-extend by your change to atomic_read().

I figured as much. Thanks for confirming this.

With this applied together with shuffling the volatile from the declaration to the usage (in both atomic_read() and atomic_set()), the generated code *almost* reverts to the original. There are some differences where ld4 have turned into ld8 though. Are these bugs in the use of atomic_add() and atomic_sub()? E.g. the first of these changes is in: ipc/msg.c:freeque() where we have:

	atomic_sub(msg->q_cbytes, &msg_bytes);

Now the type of msg->q_cbytes is "unsigned long" ... so it seems a poor idea to subtract such a large typed object from "msg_bytes" which is a mere slip of an atomic_t. Or is there some other type-wrangling that needs to happen in include/asm-ia64/atomic.h? There are a total of nineteen of these ld4->ld8 transforms.

Possibly. Either that or we've uncovered some latent bugs. Maybe a combination of the two. Can you list those 19 changes so we can evaluate them?

I'm told there were some *(volatile *) bugs fixed in gcc recently, so it's also possible your 3.4.6 is showing those. I can test that on a more recent gcc on ia64 if it's inconvenient for you to do so on your test box.

	-- Chris
Re: 2.6.23-rc1 regression: hwmon/w83627ehf: wrong fan speed
I wrote:
> # sensors
> w83627ehf-isa-0290
> Adapter: ISA adapter
> VCore:     +0.95 V  (min =  +0.00 V, max =  +1.74 V)
> in1:      +12.30 V  (min =  +1.64 V, max =  +3.22 V) ALARM
> AVCC:      +3.28 V  (min =  +1.89 V, max =  +1.94 V) ALARM
> 3VCC:      +3.26 V  (min =  +0.18 V, max =  +0.72 V) ALARM
> in4:       +1.58 V  (min =  +0.57 V, max =  +0.90 V) ALARM
> in5:       +1.70 V  (min =  +0.41 V, max =  +1.19 V) ALARM
> in6:       +3.43 V  (min =  +0.31 V, max =  +3.05 V) ALARM
> VSB:       +3.25 V  (min =  +0.37 V, max =  +3.01 V) ALARM
> VBAT:      +3.18 V  (min =  +3.94 V, max =  +0.74 V) ALARM
> in9:       +1.88 V  (min =  +0.79 V, max =  +1.40 V) ALARM
> Case Fan:    0 RPM  (min =   753 RPM, div = 128) ALARM
> CPU Fan:    88 RPM  (min =   659 RPM, div = 64) ALARM
> Aux Fan:     0 RPM  (min = 10546 RPM, div = 128) ALARM
> fan5:        0 RPM  (min =   753 RPM, div = 128) ALARM
> Sys Temp:    +44 C  (high =    -5 C, hyst =   -34 C) ALARM
> CPU Temp:  +38.0 C  (high = +80.0 C, hyst = +75.0 C)
> AUX Temp:  +43.5 C  (high = +80.0 C, hyst = +75.0 C)
>
> coretemp-isa-
> Adapter: ISA adapter
>
> coretemp-isa-0001
> Adapter: ISA adapter
...
> I'll reboot in a minute into 2.6.22(-rc5) and post the "sensors" output.

# sensors
w83627ehf-i2c-9191-290
ERROR: Can't get adapter or algorithm?!?
VCore:     +0.95 V  (min =  +0.00 V, max =  +1.74 V)
in1:      +12.20 V  (min =  +1.64 V, max =  +3.22 V) ALARM
AVCC:      +3.26 V  (min =  +1.89 V, max =  +1.94 V) ALARM
3VCC:      +3.26 V  (min =  +0.18 V, max =  +0.72 V) ALARM
in4:       +1.58 V  (min =  +0.57 V, max =  +0.90 V) ALARM
in5:       +1.71 V  (min =  +0.41 V, max =  +1.19 V) ALARM
in6:       +3.43 V  (min =  +0.31 V, max =  +3.05 V) ALARM
VSB:       +3.26 V  (min =  +0.37 V, max =  +3.01 V) ALARM
VBAT:      +3.18 V  (min =  +3.94 V, max =  +0.74 V) ALARM
in9:       +1.88 V  (min =  +0.79 V, max =  +1.40 V) ALARM
Case Fan:  484 RPM  (min = 84375 RPM, div = 16) ALARM
CPU Fan:  1424 RPM  (min = 21093 RPM, div = 4) ALARM
Aux Fan:     0 RPM  (min = 10546 RPM, div = 128) ALARM
fan5:        0 RPM  (min = 10546 RPM, div = 128) ALARM
Sys Temp:    +45 C  (high =    -5 C, hyst =   -34 C) ALARM
CPU Temp:  +39.5 C  (high = +80.0 C, hyst = +75.0 C)
AUX Temp:  +44.5 C  (high = +80.0 C, hyst = +75.0 C)

coretemp-isa-
Adapter: ISA adapter

coretemp-isa-0001
Adapter: ISA adapter

--
Stefan Richter
-=-=-=== =--- -=-==
http://arcgraph.de/sr/
Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls
David Miller wrote:

From: Ben Greear <[EMAIL PROTECTED]>

I believe LRO is going to have to be disabled for routing/bridging, so the stack will probably need to become aware of it at some point...

The packet will be GSO'd on output I believe, so it won't break anything. Alternatively, we could make the driver only LRO accumulate if the packet is unicast and matches one of the MAC's programmed into the chip.

I think even this would fail if you are doing something clever with NAT or other iptables stuff. Probably we're going to have to put this in the hands of the users, who hopefully can determine whether they can allow LRO or not...

For GSO on output, is there a generic fallback for any driver that does not specifically implement GSO?

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com
Re: early boot lockup with 2.6.23-rc1
On Fri, Aug 10, 2007 at 10:20:31PM +0300, Mikko Rapeli wrote:
> On Fri, Aug 10, 2007 at 09:45:31AM -0700, H. Peter Anvin wrote:
> > Let me get this straight... "edd=skipmbr" boots fine, but commenting out
> > the call to query_edd() didn't? Could you please try that (and, I
> > guess, only that), and make sure everything necessary is rebuild.
> >
> > 2.6.23-*rc2* you say boots fine with "edd=skipmbr", but not without?
>
> Yes, vanilla 2.6.23-rc2 with edd=skipmbr boots fine.
>
> > Did you try the above commenting-out on rc2?
>
> Yes, didn't work with 2.6.23-rc2 but printed one dot in the upper left
> corner after grub stuff.

Oops, I was wrong and bad enough to think nesting #ifdef's would work;
2.6.23-rc2 with query_mca() to query_edd() in arch/i386/boot/main.c
commented out works.

Sorry about that one.

-Mikko
Re: 2.6.23-rc2-mm2
On Fri, Aug 10, 2007 at 01:20:19PM -0700, Andrew Morton wrote:

> git-wireless now has the usual git catastrophe when merging it against the
> recently-discovered net-2.6.24 tree, so I'll need to do something about
> that first.

I have rebased the wireless-dev tree, and the mm-master branch there should specifically avoid these merge conflicts.

Hth!

John
--
John W. Linville
[EMAIL PROTECTED]
RE: [PATCH 9/24] make atomic_read() behave consistently on ia64
> Use atomic64_read to read an atomic64_t.

Thanks Andreas!

Chris: This bug is why the 8-byte loads got changed to 4-byte + sign-extend by your change to atomic_read().

With this applied together with shuffling the volatile from the declaration to the usage (in both atomic_read() and atomic_set()), the generated code *almost* reverts to the original.

There are some differences where ld4 have turned into ld8 though. Are these bugs in the use of atomic_add() and atomic_sub()? E.g. the first of these changes is in: ipc/msg.c:freeque() where we have:

	atomic_sub(msg->q_cbytes, &msg_bytes);

Now the type of msg->q_cbytes is "unsigned long" ... so it seems a poor idea to subtract such a large typed object from "msg_bytes" which is a mere slip of an atomic_t. Or is there some other type-wrangling that needs to happen in include/asm-ia64/atomic.h?

There are a total of nineteen of these ld4->ld8 transforms.

-Tony
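To make the width concern concrete, here is a small Python model of what can happen when a 64-bit "unsigned long" is fed into an operation on a 32-bit signed counter. It is a sketch under stated assumptions only: that atomic_t wraps a 32-bit int and that the amount is narrowed to int on the way in. The helper names `to_s32` and `atomic_sub_32` are hypothetical and do not correspond to kernel functions.

```python
def to_s32(value):
    """Keep the low 32 bits of value and reinterpret them as a signed int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def atomic_sub_32(amount, counter):
    """Model of subtracting `amount` from a 32-bit signed counter.

    Assumption: the amount is narrowed to 32 bits before the subtraction,
    and the result wraps to 32 bits as well.
    """
    return to_s32(counter - to_s32(amount))


# A byte count that fits in 32 bits behaves as expected.
print(atomic_sub_32(40, 100))
# A count above 2**32 silently loses its upper bits before the subtraction.
print(atomic_sub_32(0x100000005, 100))
```

Whether such large values can actually occur for msg->q_cbytes is a separate question; the sketch only shows why mixing the two widths deserves a look.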
Re: serial patches from -mm
On Fri, 10 Aug 2007 14:38:35 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote:

>
> I'll send these
>
> dont-optimise-away-baud-rate-changes-when-bother-is-used.patch
> serial-add-support-for-ite-887x-chips.patch
> serial_txx9-fix-modem-control-line-handling.patch
> serial_txx9-cleanup-includes.patch
> serial-8250-handle-saving-the-clear-on-read-bits-from-the-lsr.patch
> add-blacklisting-capability-to-serial_pci-to-avoid-misdetection.patch
>
> for review, please.
>
> I've identified these as not-for-2.6.23 which may of course have been
> incorrect.
>

Based on an Alan ack and my own review I have queued these:

 dont-optimise-away-baud-rate-changes-when-bother-is-used.patch
 serial-add-support-for-ite-887x-chips.patch
 serial_txx9-fix-modem-control-line-handling.patch
 serial-8250-handle-saving-the-clear-on-read-bits-from-the-lsr.patch
 add-blacklisting-capability-to-serial_pci-to-avoid-misdetection.patch

for 2.6.23 and this:

 serial_txx9-cleanup-includes.patch

for 2.6.24.
[PATCH] Fix to keep watchdog disabled by default for i386/x86_64
Fix a wrong expression which enabled the watchdogs even if the nmi_watchdog kernel parameter wasn't set. This regression was introduced with commit b7471c6da94d30d3deadc55986cc38d1ff57f9ca.

Introduce NMI_DISABLED (-1), which allows switching the value of NMI_DEFAULT without breaking the APIC NMI watchdog code (again).

Fixes:
 https://bugzilla.novell.com/show_bug.cgi?id=298084
 http://bugzilla.kernel.org/show_bug.cgi?id=7839
And likely some more nmi_watchdog=0 related issues.

Signed-off-by: Daniel Gollub <[EMAIL PROTECTED]>
---
diff -rup a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
--- a/arch/i386/kernel/apic.c	2007-08-04 04:49:55.0 +0200
+++ b/arch/i386/kernel/apic.c	2007-08-10 21:38:37.0 +0200
@@ -1087,7 +1087,7 @@ static int __init detect_init_APIC (void
 	if (l & MSR_IA32_APICBASE_ENABLE)
 		mp_lapic_addr = l & MSR_IA32_APICBASE_BASE;
-	if (nmi_watchdog != NMI_NONE)
+	if (nmi_watchdog != NMI_NONE && nmi_watchdog != NMI_DISABLED)
 		nmi_watchdog = NMI_LOCAL_APIC;
 	printk(KERN_INFO "Found and enabled local APIC!\n");
diff -rup a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
--- a/arch/i386/kernel/nmi.c	2007-08-04 04:49:55.0 +0200
+++ b/arch/i386/kernel/nmi.c	2007-08-10 22:00:40.0 +0200
@@ -77,7 +77,7 @@ static int __init check_nmi_watchdog(voi
 	unsigned int *prev_nmi_count;
 	int cpu;
-	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED))
 		return 0;
 	if (!atomic_read(&nmi_active))
@@ -424,7 +424,7 @@ int proc_nmi_enabled(struct ctl_table *t
 	if (!!old_state == !!nmi_watchdog_enabled)
 		return 0;
-	if (atomic_read(&nmi_active) < 0) {
+	if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
 		printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
 		return -EIO;
 	}
diff -rup a/arch/x86_64/kernel/nmi.c b/arch/x86_64/kernel/nmi.c
--- a/arch/x86_64/kernel/nmi.c	2007-08-04 04:49:55.0 +0200
+++ b/arch/x86_64/kernel/nmi.c	2007-08-10 21:59:36.0 +0200
@@ -85,7 +85,7 @@ int __init check_nmi_watchdog (void)
 	int *counts;
 	int cpu;
-	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
+	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DISABLED))
 		return 0;
 	if (!atomic_read(&nmi_active))
@@ -442,7 +442,7 @@ int proc_nmi_enabled(struct ctl_table *t
 	if (!!old_state == !!nmi_watchdog_enabled)
 		return 0;
-	if (atomic_read(&nmi_active) < 0) {
+	if (atomic_read(&nmi_active) < 0 || nmi_watchdog == NMI_DISABLED) {
 		printk( KERN_WARNING "NMI watchdog is permanently disabled\n");
 		return -EIO;
 	}
diff -rup a/include/asm-i386/nmi.h b/include/asm-i386/nmi.h
--- a/include/asm-i386/nmi.h	2007-08-04 04:49:55.0 +0200
+++ b/include/asm-i386/nmi.h	2007-08-10 22:04:51.0 +0200
@@ -33,11 +33,12 @@ extern int nmi_watchdog_tick (struct pt_
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DISABLED	-1
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
+#define NMI_DEFAULT	NMI_DISABLED
 struct ctl_table;
 struct file;
diff -rup a/include/asm-x86_64/nmi.h b/include/asm-x86_64/nmi.h
--- a/include/asm-x86_64/nmi.h	2007-08-04 04:49:55.0 +0200
+++ b/include/asm-x86_64/nmi.h	2007-08-10 22:04:41.0 +0200
@@ -64,11 +64,12 @@ extern int setup_nmi_watchdog(char *);
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DISABLED	-1
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
+#define NMI_DEFAULT	NMI_DISABLED
 struct ctl_table;
 struct file;
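The effect of the changed test in detect_init_APIC() can be illustrated with a small model. The following Python sketch is not kernel code: it copies the constants from the patch and mirrors the two versions of the condition, while the function names `detect_old` and `detect_new` are invented for the illustration. It also treats nmi_watchdog as a plain integer, whereas the kernel variable is an unsigned int compared against -1.

```python
NMI_DISABLED = -1
NMI_NONE = 0
NMI_IO_APIC = 1
NMI_LOCAL_APIC = 2
NMI_DEFAULT = NMI_DISABLED  # default when no nmi_watchdog= parameter is given


def detect_old(nmi_watchdog):
    """Pre-patch test: only NMI_NONE counts as 'off'."""
    if nmi_watchdog != NMI_NONE:
        nmi_watchdog = NMI_LOCAL_APIC
    return nmi_watchdog


def detect_new(nmi_watchdog):
    """Patched test: the -1 default is left alone as well."""
    if nmi_watchdog != NMI_NONE and nmi_watchdog != NMI_DISABLED:
        nmi_watchdog = NMI_LOCAL_APIC
    return nmi_watchdog


# With the old test the untouched default gets turned into NMI_LOCAL_APIC.
print(detect_old(NMI_DEFAULT) == NMI_LOCAL_APIC)
# With the new test the default stays disabled.
print(detect_new(NMI_DEFAULT) == NMI_DISABLED)
```

An explicitly requested mode such as NMI_IO_APIC is handled the same way by both versions of the test.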
Re: 2.6.23-rc1 regression: hwmon/w83627ehf: wrong fan speed
Jean Delvare wrote:
> I just tried 2.6.23-rc2 on a system where I use the w83627ehf hardware
> monitoring driver, and was not able to reproduce the problem you
> described. Fan speeds are reported properly for me. Which I kind of
> expected, as I tested all my w83627ehf patches on this system before
> submitting them.

Thanks for still looking into this. I was busy with other stuff the whole week, hence no git bisect result from me yet.

> Please try using sensors instead of ksensors, and confirm that the
> behavior is the same. I'd like to rule out a problem in ksensors
> itself. sensors will also report the fan divs, this is a useful
> information given the problem you have.

# sensors
w83627ehf-isa-0290
Adapter: ISA adapter
VCore:     +0.95 V  (min =  +0.00 V, max =  +1.74 V)
in1:      +12.30 V  (min =  +1.64 V, max =  +3.22 V) ALARM
AVCC:      +3.28 V  (min =  +1.89 V, max =  +1.94 V) ALARM
3VCC:      +3.26 V  (min =  +0.18 V, max =  +0.72 V) ALARM
in4:       +1.58 V  (min =  +0.57 V, max =  +0.90 V) ALARM
in5:       +1.70 V  (min =  +0.41 V, max =  +1.19 V) ALARM
in6:       +3.43 V  (min =  +0.31 V, max =  +3.05 V) ALARM
VSB:       +3.25 V  (min =  +0.37 V, max =  +3.01 V) ALARM
VBAT:      +3.18 V  (min =  +3.94 V, max =  +0.74 V) ALARM
in9:       +1.88 V  (min =  +0.79 V, max =  +1.40 V) ALARM
Case Fan:    0 RPM  (min =   753 RPM, div = 128) ALARM
CPU Fan:    88 RPM  (min =   659 RPM, div = 64) ALARM
Aux Fan:     0 RPM  (min = 10546 RPM, div = 128) ALARM
fan5:        0 RPM  (min =   753 RPM, div = 128) ALARM
Sys Temp:    +44 C  (high =    -5 C, hyst =   -34 C) ALARM
CPU Temp:  +38.0 C  (high = +80.0 C, hyst = +75.0 C)
AUX Temp:  +43.5 C  (high = +80.0 C, hyst = +75.0 C)

coretemp-isa-
Adapter: ISA adapter

coretemp-isa-0001
Adapter: ISA adapter

(The aux fan and fan5 are not connected.)

> Your original post suggests that the fan speed is supposed to change
> depending on the system load? Or temperature? Please describe the
> mechanism used to achieve this. Could it be that this mechanism isn't
> working properly, and the reported (low) speeds are actually true?

The motherboard controls the CPU fan and I believe also the case fan, probably based on temperatures. (The manual is buried somewhere and MSI's download site is down right at this moment.) Either the low speeds or the dividers are incorrect. I'll reboot in a minute into 2.6.22(-rc5) and post the "sensors" output.

> What fan inputs are used by your CPU and system fans? "sensors
> -c /dev/null" will tell.

...
fan1:      484 RPM  (min = 12053 RPM, div = 16) ALARM
fan2:       89 RPM  (min =   659 RPM, div = 64) ALARM
fan3:        0 RPM  (min = 10546 RPM, div = 128) ALARM
fan5:        0 RPM  (min =  1506 RPM, div = 128) ALARM
...

Hmm, interesting. When I now re-run sensors I get

...
Case Fan:  484 RPM  (min = 12053 RPM, div = 16) ALARM
CPU Fan:    89 RPM  (min =   659 RPM, div = 64) ALARM
Aux Fan:     0 RPM  (min = 10546 RPM, div = 128) ALARM
fan5:        0 RPM  (min =  1506 RPM, div = 128) ALARM
...

(I'm still in 2.6.23-rc2. Ksensors picked up the 484 RPM of the case fan too, and that's most certainly the correct speed. Just the CPU fan's speed is still wrong; or rather its divider should be 16 rather than 64.)

> Other than that, I can only ask for the same things Mark already
> suggested: compile with HWMON debugging and provide the logs (this will
> show what fan div the driver is trying to select), and try bisecting
> using git to find out which patch exactly caused the problem.

How come the divider of one of the fans changed from one minute to the next?

FWIW, the ``chip "w83627ehf-*"'' section in Gentoo's /etc/sensors.conf provides only labels for fan{1,2,3}. It is titled

# Winbond W83627EHF configuration originally contributed by Leon Moonen
# This is for an Asus P5P800, voltages for A8V-E SE.

Should I hardwire correct dividers or pulses per rev in sensors.conf, or is the driver supposed to work the correct dividers out, like it did before 2.6.23-rc?

--
Stefan Richter
-=-=-=== =--- -=-==
http://arcgraph.de/sr/
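The "min" values in the sensors output above hint at how the divider enters the reported speed. The Python sketch below assumes the conversion RPM = 1350000 / (count * div), with a count of 0 or 255 reported as 0 RPM. That formula is an assumption chosen because it reproduces the minimum values shown in this thread; the register counts used here are inferred from those values, not read from the hardware, and `fan_rpm` is a name made up for the illustration.

```python
def fan_rpm(count, div, clock=1350000):
    """Convert a raw fan tachometer count and a divider to RPM.

    Assumed conversion: clock / (count * div), truncated to an integer.
    A count of 0 or 255 is treated as "no reading" and reported as 0.
    """
    if count == 0 or count == 255:
        return 0
    return clock // (count * div)


# The same stored count reads very differently once the divider changes:
print(fan_rpm(1, 128))   # consistent with "min = 10546 RPM, div = 128"
print(fan_rpm(1, 16))    # consistent with "min = 84375 RPM, div = 16"
print(fan_rpm(16, 4))    # consistent with "min = 21093 RPM, div = 4"
```

If this conversion holds, a wrong divider scales the reported speed by the ratio between the wrong and the right divider, which fits the observation that the CPU fan reading looks far too low at div = 64.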
BUG: when using 'brctl stp'
I get this on the latest GIT, it was also present shortly after -rc1. I have not tested with earlier kernels.

# brctl stp br0 on
[ 169.672008] BUG: sleeping function called from invalid context at kernel/mutex.c:86
[ 169.672532] in_atomic():1, irqs_disabled():0
[ 169.672832]
[ 169.672832] Call Trace:
[ 169.673406] [] mutex_lock+0x19/0x2f
[ 169.673696] [] __alloc_pages+0x71/0x2d3
[ 169.673996] [] :bridge:set_stp_state+0x12/0x37
[ 169.674293] [] :bridge:store_bridge_parm+0x5f/0x79
[ 169.674587] [] sysfs_write_file+0xf2/0x134
[ 169.674879] [] vfs_write+0xce/0x177
[ 169.675170] [] sys_write+0x45/0x6e
[ 169.675463] [] system_call+0x7e/0x83
[ 169.675769]
[ 169.676139] br0: starting userspace STP failed, staring kernel STP

# brctl stp br0 off
[ 171.774500] BUG: sleeping function called from invalid context at kernel/mutex.c:86
[ 171.775040] in_atomic():1, irqs_disabled():0
[ 171.775327]
[ 171.775328] Call Trace:
[ 171.775906] [] mutex_lock+0x19/0x2f
[ 171.776195] [] __alloc_pages+0x71/0x2d3
[ 171.776496] [] :bridge:set_stp_state+0x12/0x37
[ 171.776792] [] :bridge:store_bridge_parm+0x5f/0x79
[ 171.777086] [] sysfs_write_file+0xf2/0x134
[ 171.777378] [] vfs_write+0xce/0x177
[ 171.777669] [] sys_write+0x45/0x6e
[ 171.777958] [] system_call+0x7e/0x83
[ 171.778250]

Daniel K.
Re: [PATCH] flush icache before set_pte() on ia64 take9 [2/2] flush icache at set_pte
On Fri, 10 Aug 2007 11:17:30 -0700 "Luck, Tony" <[EMAIL PROTECTED]> wrote:

> 1) In arch/ia64/mm/init.c: __ia64_sync_icache_dcache()
>
> -	if (!pte_exec(pte))
> -		return;		/* not an executable page... */
> +	BUG_ON(!pte_exec(pte));
>
> In this latest version the only route to this routine is from set_pte()
> inside the test :
>
> 	if (pte_exec(pteval) && ) {
> 	}
>
> So this BUG_ON is now redundant.
>
I see.

> 2) In include/asm-ia64/pgtable.h
>
> +	if (pte_exec(pteval) &&		// flush only new executable page.
> +	    pte_present(pteval) &&	// swap out ?
> +	    pte_user(pteval) &&		// ignore kernel page
> +	    (!pte_present(*ptep) ||	// do_no_page or swap in, migration,
> +	     pte_pfn(*ptep) != pte_pfn(pteval)))	// do_wp_page(), page copy
> +		/* load_module() calles flush_icache_range() explicitly*/
> +		__ia64_sync_icache_dcache(pteval);
>
> Just above this there is a comment saying that pte_exec() only works
> when pte_present() is true. So we must re-order the conditions so that
> we check that the pteval satisfies pte_present() before using either of
> pte_exec() or pte_user() on it like this:
>
> 	if (pte_present(pteval) &&
> 	    pte_exec(pteval) &&
> 	    pte_user(pteval) &&
>
> I put in some crude counters to see whether we should check pte_exec() or
> pte_user() next ... and it was very clear that the pte_exec() check gets
> us out of the if() faster (at least during a kernel build).
>
OK. I'm sorry, but I'll be offline until next Wednesday, so I'll post the above fix in a week or so.

> I also compared how often the old code called lazy_mmu_prot_update()
> with how often the new code calls __ia64_sync_icache_dcache() (again
> using kernel build as my workload) ... and the answer is about the
> same (less than 0.2% change ... probably less than run-to-run variation).
>
>
> So now the only remaining task is to convince myself that this
> new version covers all the cases.
>
Yes. I'd like more eyes on this for review.

Thanks,
-Kame
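The re-ordered condition discussed above can be written out as a small truth-function. This is only an illustrative Python model, not the ia64 code: the `Pte` class and its fields are hypothetical stand-ins for pte_present(), pte_exec(), pte_user() and pte_pfn(), and `needs_icache_sync` is an invented name.

```python
from dataclasses import dataclass


@dataclass
class Pte:
    """Hypothetical stand-in for a page table entry."""
    present: bool
    executable: bool = False
    user: bool = False
    pfn: int = 0


def needs_icache_sync(new, old):
    """Return True when the icache should be flushed for the new entry.

    'present' is tested first because the other attributes are only
    meaningful for a present entry; 'executable' comes next since it was
    reported to reject most entries soonest.
    """
    return (new.present
            and new.executable
            and new.user
            and (not old.present or old.pfn != new.pfn))


old = Pte(present=False)
print(needs_icache_sync(Pte(True, True, True, pfn=7), old))   # new user text page
print(needs_icache_sync(Pte(True, False, True, pfn=7), old))  # data page, no flush
```

Because `and` short-circuits, placing the cheapest and most selective tests first keeps the common non-executable case fast, which is the point of the counter experiment mentioned in the mail.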
Re: Software based ECC ?
On Fri, 10 Aug 2007 23:16:45 +0200 "roland" <[EMAIL PROTECTED]> wrote:

> Hello !
>
> since ECC (speaking in terms of ram/memory) is some widespread hardware
> technology
> within server/enterprise computing for protection of memory failure, i
> wonder:
>
> Can`t this be done in software, too ?

Only one way to find out. If it interests you, have a go at it.
Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls
From: Ben Greear <[EMAIL PROTECTED]>
Date: Fri, 10 Aug 2007 14:11:24 -0700

> Jeff Garzik wrote:
>
> > This patch copies Auke in adding NETIF_F_LRO. Is that just for
> > temporary merging, or does the net core really not touch it at all?
> >
> > Because, logically, if NETIF_F_LRO exists nowhere else but this patch,
> > we should not add it to dev->features. LRO knowledge can be contained
> > entirely within the driver, if the net core never tests NETIF_F_LRO.
> >
> > I haven't reviewed the other NETIF_F_XXX flags, but, that logic can be
> > applied to any other NETIF_F_XXX flag: if the net stack isn't using it,
> > it's a piece of information specific to that driver.
>
> I believe LRO is going to have to be disabled for routing/bridging,
> so the stack will probably need to become aware of it at some point...

The packet will be GSO'd on output I believe, so it won't break anything.

Alternatively, we could make the driver only LRO accumulate if the packet is unicast and matches one of the MAC's programmed into the chip.
Re: [PATCH 00/16] Permit filesystem local caching [try #3]
--- David Howells <[EMAIL PROTECTED]> wrote:

> These patches add local caching for network filesystems such as NFS and AFS.
>
> FS-Cache now runs fully asynchronously as required by Trond Myklebust for
> NFS.
>
> --
> Changes:
> [try #3]:
>
>  (*) Added missing file to CacheFiles patch.
>
>  (*) Made new security functions return errors and pass actual return data
>      via argument pointer.
>
>  (*) Cleaned up NFS patch.
>
>  (*) The 'fsc' flag must now be passed to NFS mount by the string options.
>
>  (*) Split the NFS patch into three as requested by Trond.
>
> [try #2]:
>
>  (*) The CacheFiles module no longer accepts directory fds in its cull and
>      inuse commands from cachefilesd. Instead it uses the current working
>      directory of the calling process as the basis for looking up the object.
>      Corollary to this, fget_light() no longer needs to be exported.
>

How would you expect an LSM that is not SELinux to interface with CacheFiles? You have gone to a great deal of effort to support the requirements of an SELinux system, and that's good, but you have extended the LSM interface to expose SELinux data structures (secids) and require them for the operation of CacheFiles, and that's bad.

The data used within an LSM is private to the LSM, and this applies to SELinux as well as to any other LSM that may come along, such as the Smack LSM I'm working on. This applies to task data as well as file data. Further, the behavior of the system in the presence of an LSM should be controlled by the LSM; it is more than a little scary that CacheFiles is enforcing SELinux policy based on secids that may be coming from a different LSM.

I applaud the integration of CacheFiles with SELinux. Unfortunately, you've done so using the LSM interface in such a way that an LSM other than SELinux is likely to demonstrate inappropriate behaviors in the presence of CacheFiles because you have so carefully integrated the SELinux requirements.

If the integration with SELinux is important to you, and I would expect that it is given the work you've put into it, I suggest that the SELinux specific behaviors be identified so that another LSM can provide the behavior appropriate to the policy it chooses to enforce, and that they be put into SELinux behind an LSM interface.

I know that you're looking at a significant effort to do that, but I wouldn't think that you'd want CacheFiles to behave badly in the presence of an LSM that doesn't happen to be SELinux. I also know it's tempting to point out that SELinux is the only upstream LSM. I hope to change that before too long, and I know there are others with ambitions as well. I would not like to see CacheFiles have to get excluded in the presence of other LSMs and I doubt you would either.

Casey Schaufler
[EMAIL PROTECTED]
[PATCH 25/25 -v2] add paravirtualization support for x86_64
This is, finally, the patch we were all looking for. This patch adds a paravirt.h header with the definition of the paravirt_ops struct. It also defines a bunch of inline functions that will replace, or hook, the other calls. Every one of those functions adds an entry in the parainstructions section (see vmlinux.lds.S). Those entries can then be used to runtime-patch the paravirt_ops functions.

paravirt.c contains implementations of paravirt functions that are used natively, such as native_patch. It also fills the paravirt_ops structure with the whole lot of functions that were (re)defined throughout this patch set.

There are also changes in asm-offsets.c. paravirt.h needs it to find out the offsets into the structure of functions such as irq_enable, used in assembly files.

[ updates from v1
 * make PARAVIRT hidden in Kconfig (Andi Kleen)
 * cleanups in paravirt.h (Andi Kleen)
 * modifications needed to accommodate other parts of the patch that changed, such as getting rid of ebda_info
 * put the integers at struct paravirt_ops at the end (Jeremy)
]

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/Kconfig              | 11 +++
 arch/x86_64/kernel/Makefile      |  1 +
 arch/x86_64/kernel/asm-offsets.c | 14 ++
 arch/x86_64/kernel/vmlinux.lds.S |  6 ++
 include/asm-x86_64/smp.h         |  2 +-
 5 files changed, 33 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index ffa0364..00b2fc9 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -373,6 +373,17 @@ config NODES_SHIFT
 # Dummy CONFIG option to select ACPI_NUMA from drivers/acpi/Kconfig.
+config PARAVIRT
+       bool
+       depends on EXPERIMENTAL
+       help
+         Paravirtualization is a way of running multiple instances of
+         Linux on the same machine, under a hypervisor. This option
+         changes the kernel so it can modify itself when it is run
+         under a hypervisor, improving performance significantly.
+         However, when run without a hypervisor the kernel is
+         theoretically slower. If in doubt, say N.
+
 config X86_64_ACPI_NUMA
        bool "ACPI NUMA detection"
        depends on NUMA
diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index ff5d8c9..120467f 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -38,6 +38,7 @@
 obj-$(CONFIG_X86_VSMP) += vsmp.o
 obj-$(CONFIG_K8_NB) += k8.o
 obj-$(CONFIG_AUDIT) += audit.o
+obj-$(CONFIG_PARAVIRT) += paravirt.o
 obj-$(CONFIG_MODULES) += module.o
 obj-$(CONFIG_PCI) += early-quirks.o
diff --git a/arch/x86_64/kernel/asm-offsets.c b/arch/x86_64/kernel/asm-offsets.c
index 778953b..f5eff70 100644
--- a/arch/x86_64/kernel/asm-offsets.c
+++ b/arch/x86_64/kernel/asm-offsets.c
@@ -15,6 +15,9 @@
 #include
 #include
 #include
+#ifdef CONFIG_PARAVIRT
+#include
+#endif
 #define DEFINE(sym, val) \
        asm volatile("\n->" #sym " %0 " #val : : "i" (val))
@@ -72,6 +75,17 @@ int main(void)
        offsetof (struct rt_sigframe32, uc.uc_mcontext));
        BLANK();
 #endif
+#ifdef CONFIG_PARAVIRT
+#define ENTRY(entry) DEFINE(PARAVIRT_ ## entry, offsetof(struct paravirt_ops, entry))
+       ENTRY(paravirt_enabled);
+       ENTRY(irq_disable);
+       ENTRY(irq_enable);
+       ENTRY(syscall_return);
+       ENTRY(iret);
+       ENTRY(read_cr2);
+       ENTRY(swapgs);
+       BLANK();
+#endif
        DEFINE(pbe_address, offsetof(struct pbe, address));
        DEFINE(pbe_orig_address, offsetof(struct pbe, orig_address));
        DEFINE(pbe_next, offsetof(struct pbe, next));
diff --git a/arch/x86_64/kernel/vmlinux.lds.S b/arch/x86_64/kernel/vmlinux.lds.S
index ba8ea97..c3fce85 100644
--- a/arch/x86_64/kernel/vmlinux.lds.S
+++ b/arch/x86_64/kernel/vmlinux.lds.S
@@ -185,6 +185,12 @@ SECTIONS
   .altinstr_replacement : AT(ADDR(.altinstr_replacement) - LOAD_OFFSET) {
        *(.altinstr_replacement)
   }
+  . = ALIGN(8);
+  .parainstructions : AT(ADDR(.parainstructions) - LOAD_OFFSET) {
+       __parainstructions = .;
+       *(.parainstructions)
+       __parainstructions_end = .;
+  }
   /* .exit.text is discard at runtime, not link time, to deal with references
      from .altinstructions and .eh_frame */
   .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) { *(.exit.text) }
diff --git a/include/asm-x86_64/smp.h b/include/asm-x86_64/smp.h
index 6b4..403901b 100644
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -22,7 +22,7 @@ extern int disable_apic;
 #ifdef CONFIG_PARAVIRT
 #include
 void native_flush_tlb_others(cpumask_t cpumask, struct mm_struct *mm,
-       unsigned long va);
+       unsigned long va);
 #else
 #define startup_ipi_hook(apicid, rip, rsp) do { } while (0)
 #endif
--
1.4.
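As a rough illustration of the two ideas described above (an ops table filled with native functions, and named offsets into it for assembly code), here is a minimal user-space sketch. The struct `pv_ops_sketch`, the `PV_OFFSET` macro and `call_at_offset()` are assumptions made up for this example; the real struct paravirt_ops lives in paravirt.h, which is not part of the diff shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for struct paravirt_ops. */
struct pv_ops_sketch {
    int paravirt_enabled;
    void (*irq_disable)(void);
    void (*irq_enable)(void);
};

static int irqs_on = 1;

/* "Native" implementations, used when nothing overrides them. */
static void native_irq_disable(void) { irqs_on = 0; }
static void native_irq_enable(void)  { irqs_on = 1; }

/* The table starts out filled with the native functions. */
static struct pv_ops_sketch pv_ops = {
    .paravirt_enabled = 0,
    .irq_disable = native_irq_disable,
    .irq_enable  = native_irq_enable,
};

/* Same idea as ENTRY() in asm-offsets.c: give a member's offset a name. */
#define PV_OFFSET(entry) offsetof(struct pv_ops_sketch, entry)

/* Call through the table knowing only a byte offset, as assembly would. */
static void call_at_offset(size_t off)
{
    void (**slot)(void);

    assert(off + sizeof(*slot) <= sizeof(pv_ops));
    slot = (void (**)(void))((char *)&pv_ops + off);
    (*slot)();
}

static int irqs_enabled(void) { return irqs_on; }
```

A guest implementation would repoint the function pointers in the table; callers that go through the table (or through an offset into it) would not need to change.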
[PATCH 15/25 -v2] introducing paravirt_activate_mm
This function/macro allows a paravirt guest to be notified that we changed the current task's cr3, and to act upon it. What they do with it is up to them.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/mmu_context.h | 17 ++---
 1 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/asm-x86_64/mmu_context.h b/include/asm-x86_64/mmu_context.h
index 9592698..77ce047 100644
--- a/include/asm-x86_64/mmu_context.h
+++ b/include/asm-x86_64/mmu_context.h
@@ -7,7 +7,16 @@
 #include
 #include
 #include
+
+#ifdef CONFIG_PARAVIRT
+#include
+#else
 #include
+static inline void paravirt_activate_mm(struct mm_struct *prev,
+                                        struct mm_struct *next)
+{
+}
+#endif /* CONFIG_PARAVIRT */
 /*
  * possibly do the LDT unload here?
@@ -67,8 +76,10 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
        asm volatile("movl %0,%%fs"::"r"(0)); \
 } while(0)
-#define activate_mm(prev, next) \
-       switch_mm((prev),(next),NULL)
-
+#define activate_mm(prev, next) \
+do { \
+       paravirt_activate_mm(prev, next); \
+       switch_mm((prev),(next),NULL); \
+} while (0)
 #endif
--
1.4.4.2
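To show why the new activate_mm() is wrapped in do { } while (0), and how the empty inline stub costs nothing when paravirt is off, here is a small stand-alone sketch. `struct mm_sketch`, `SKETCH_PARAVIRT`, the counters and the two-argument `switch_mm()` are all assumptions for illustration only; they are not the kernel's definitions.

```c
#include <assert.h>

struct mm_sketch { int id; };   /* hypothetical stand-in for struct mm_struct */

static int hook_calls;          /* times the paravirt hook did real work */
static int switch_calls;        /* times the address space was "switched" */

#ifdef SKETCH_PARAVIRT
/* With paravirt, the guest is told about the switch. */
static inline void paravirt_activate_mm(struct mm_sketch *prev,
                                        struct mm_sketch *next)
{
    (void)prev; (void)next;
    hook_calls++;
}
#else
/* Without paravirt, the hook is an empty inline and compiles away. */
static inline void paravirt_activate_mm(struct mm_sketch *prev,
                                        struct mm_sketch *next)
{
    (void)prev; (void)next;
}
#endif

static inline void switch_mm(struct mm_sketch *prev, struct mm_sketch *next)
{
    assert(prev && next);
    switch_calls++;
}

/* Two statements behave as one, so the macro is safe in an unbraced if/else. */
#define activate_mm(prev, next)            \
do {                                       \
    paravirt_activate_mm(prev, next);      \
    switch_mm((prev), (next));             \
} while (0)

static int demo_activate(int cond)
{
    struct mm_sketch a = { 1 }, b = { 2 };

    if (cond)
        activate_mm(&a, &b);
    else
        switch_calls = -1;
    return switch_calls;
}
```

Without the do/while wrapper, the unbraced if/else in demo_activate() would not compile, because the macro would expand to two separate statements before the else.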
[PATCH 9/25 -v2] report ring kernel is running without paravirt
When paravirtualization is disabled, the kernel is always running at ring 0. So report it in the appropriate macro.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/segment.h |  4
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86_64/segment.h b/include/asm-x86_64/segment.h
index 04b8ab2..240c1bf 100644
--- a/include/asm-x86_64/segment.h
+++ b/include/asm-x86_64/segment.h
@@ -50,4 +50,8 @@
 #define GDT_SIZE (GDT_ENTRIES * 8)
 #define TLS_SIZE (GDT_ENTRY_TLS_ENTRIES * 8)
+#ifndef CONFIG_PARAVIRT
+#define get_kernel_rpl() 0
+#endif
+
 #endif
--
1.4.4.2
[PATCH 20/25 -v2] replace syscall_init
This patch renames syscall_init to x86_64_syscall_init and adds a weak syscall_init that calls it. The weak version will later be overridden by a paravirt replacement in case paravirt is on.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/setup64.c |  8 +++-
 include/asm-x86_64/proto.h   |  3 +++
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/kernel/setup64.c b/arch/x86_64/kernel/setup64.c
index 49f7342..723822c 100644
--- a/arch/x86_64/kernel/setup64.c
+++ b/arch/x86_64/kernel/setup64.c
@@ -153,7 +153,7 @@ __attribute__((section(".bss.page_aligned")));
 extern asmlinkage void ignore_sysret(void);
 /* May not be marked __init: used by software suspend */
-void syscall_init(void)
+void x86_64_syscall_init(void)
 {
        /*
         * LSTAR and STAR live in a bit strange symbiosis.
@@ -172,6 +172,12 @@ void syscall_init(void)
        wrmsrl(MSR_SYSCALL_MASK, EF_TF|EF_DF|EF_IE|0x3000);
 }
+/* Overriden in paravirt.c if CONFIG_PARAVIRT */
+void __attribute__((weak)) syscall_init(void)
+{
+       x86_64_syscall_init();
+}
+
 void __cpuinit check_efer(void)
 {
        unsigned long efer;
diff --git a/include/asm-x86_64/proto.h b/include/asm-x86_64/proto.h
index 31f20ad..77ed2de 100644
--- a/include/asm-x86_64/proto.h
+++ b/include/asm-x86_64/proto.h
@@ -18,6 +18,9 @@ extern void init_memory_mapping(unsigned long start, unsigned long end);
 extern void system_call(void);
 extern int kernel_syscall(void);
+#ifdef CONFIG_PARAVIRT
+extern void x86_64_syscall_init(void);
+#endif
 extern void syscall_init(void);
 extern void ia32_syscall(void);
--
1.4.4.2
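The override mechanism used here relies on weak symbols: the linker picks a strong definition of the same name if one exists in another object file, and otherwise falls back to the weak one. A minimal single-file sketch, assuming GCC or Clang on an ELF target; the `_sketch` names and the counter are made up for the example.

```c
#include <assert.h>

static int native_inits;   /* counts calls to the native routine */

/* Plays the role of the renamed native routine (x86_64_syscall_init). */
void native_syscall_init_sketch(void)
{
    native_inits++;
}

/*
 * Weak default: used unless another object file in the link provides a
 * strong definition of syscall_init_sketch().
 */
void __attribute__((weak)) syscall_init_sketch(void)
{
    native_syscall_init_sketch();
}

/* Callers keep using the one public name and never see which one they got. */
int boot_sketch(void)
{
    int before = native_inits;

    syscall_init_sketch();
    /* Holds in this single-file build, where no override is linked in. */
    assert(native_inits == before + 1);
    return native_inits;
}
```

To see the override take effect, one would add a second source file that defines a non-weak syscall_init_sketch() and link both together; the assert in boot_sketch() would then no longer hold unless the override also calls the native routine.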
[PATCH 24/25 -v2] paravirt hooks for arch initialization
This patch adds paravirtualization hooks to the arch initialization process. paravirt_arch_setup() lets the guest run any specific initialization routine it needs. There is also memory_setup(), so guests can handle memory setup their own way.

[ updates from v1
 * Don't use a separate ebda pv hook (Jeremy/Andi)
 * Make paravirt_setup_arch() void (Andi)
]

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/setup.c | 32 +++-
 include/asm-x86_64/e820.h  |  6 ++
 include/asm-x86_64/page.h  |  1 +
 3 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/kernel/setup.c b/arch/x86_64/kernel/setup.c
index af838f6..19e0d90 100644
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -44,6 +44,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -65,6 +66,12 @@
 #include
 #include
+#ifdef CONFIG_PARAVIRT
+#include
+#else
+#define paravirt_arch_setup() do {} while (0)
+#endif
+
 /*
  * Machine setup..
  */
@@ -208,6 +215,16 @@ static void discover_ebda(void)
         * 4K EBDA area at 0x40E
         */
        ebda_addr = *(unsigned short *)__va(EBDA_ADDR_POINTER);
+       /*
+        * There can be some situations, like paravirtualized guests,
+        * in which there is no available ebda information. In such
+        * case, just skip it
+        */
+       if (!ebda_addr) {
+               ebda_size = 0;
+               return;
+       }
+
        ebda_addr <<= 4;
        ebda_size = *(unsigned short *)__va(ebda_addr);
@@ -221,6 +238,13 @@ static void discover_ebda(void)
                ebda_size = 64*1024;
 }
+/* Overridden in paravirt.c if CONFIG_PARAVIRT */
+void __attribute__((weak)) memory_setup(void)
+{
+       return setup_memory_region();
+}
+
+
 void __init setup_arch(char **cmdline_p)
 {
        printk(KERN_INFO "Command line: %s\n", boot_command_line);
@@ -231,12 +255,18 @@ void __init setup_arch(char **cmdline_p)
        saved_video_mode = SAVED_VIDEO_MODE;
        bootloader_type = LOADER_TYPE;
+       /*
+        * By returning non-zero here, a paravirt impl can choose to
+        * skip the rest of the setup process
+        */
+       paravirt_arch_setup();
+
 #ifdef CONFIG_BLK_DEV_RAM
        rd_image_start = RAMDISK_FLAGS & RAMDISK_IMAGE_START_MASK;
        rd_prompt = ((RAMDISK_FLAGS & RAMDISK_PROMPT_FLAG) != 0);
        rd_doload = ((RAMDISK_FLAGS & RAMDISK_LOAD_FLAG) != 0);
 #endif
-       setup_memory_region();
+       memory_setup();
        copy_edd();
        if (!MOUNT_ROOT_RDONLY)
diff --git a/include/asm-x86_64/e820.h b/include/asm-x86_64/e820.h
index 3486e70..2ced3ba 100644
--- a/include/asm-x86_64/e820.h
+++ b/include/asm-x86_64/e820.h
@@ -20,7 +20,12 @@
 #define E820_ACPI 3
 #define E820_NVS 4
+#define MAP_TYPE_STR "BIOS-e820"
+
 #ifndef __ASSEMBLY__
+
+void native_ebda_info(unsigned *addr, unsigned *size);
+
 struct e820entry {
        u64 addr; /* start of memory segment */
        u64 size; /* size of memory segment */
@@ -56,6 +61,7 @@ extern struct e820map e820;
 extern unsigned ebda_addr, ebda_size;
 extern unsigned long nodemap_addr, nodemap_size;
+
 #endif/*!__ASSEMBLY__*/
 #endif/*__E820_HEADER*/
diff --git a/include/asm-x86_64/page.h b/include/asm-x86_64/page.h
index ec8b245..8c40fb2 100644
--- a/include/asm-x86_64/page.h
+++ b/include/asm-x86_64/page.h
@@ -149,6 +149,7 @@ extern unsigned long __phys_addr(unsigned long);
 #define __boot_pa(x) __pa(x)
 #ifdef CONFIG_FLATMEM
 #define pfn_valid(pfn) ((pfn) < end_pfn)
+
 #endif
 #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
--
1.4.4.2
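The EBDA check added to discover_ebda() can be sketched in isolation as follows. This is only an approximation under stated assumptions: the function takes the segment value and the size word as plain arguments instead of reading them from memory, it assumes the size word is in KiB, and it leaves out any page rounding the real function may do between the hunks shown. `struct ebda_info` and `discover_ebda_sketch()` are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define EBDA_MAX_SIZE (64u * 1024u)   /* same 64 KiB cap as in the patch context */

struct ebda_info {
    uint32_t addr;   /* physical address, 0 if there is no EBDA */
    uint32_t size;   /* size in bytes, 0 if there is no EBDA */
};

/*
 * seg:      16-bit real-mode segment read from the EBDA pointer.
 * size_kib: hypothetical stand-in for the size word stored in the EBDA,
 *           assumed here to be expressed in KiB.
 */
static struct ebda_info discover_ebda_sketch(uint16_t seg, uint16_t size_kib)
{
    struct ebda_info info = { 0, 0 };

    /* No EBDA information (e.g. a paravirtualized guest): skip it. */
    if (!seg)
        return info;

    /* A real-mode segment becomes a physical address when shifted by 4. */
    info.addr = (uint32_t)seg << 4;
    assert(info.addr < (1u << 20));   /* always below 1 MiB */

    info.size = (uint32_t)size_kib << 10;
    if (info.size > EBDA_MAX_SIZE)
        info.size = EBDA_MAX_SIZE;
    return info;
}
```

For instance, a segment value of 0x9FC0 maps to physical address 0x9FC00, while a segment value of 0 yields an empty result without touching any memory.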
[PATCH 17/25 -v2] introduce paravirt_release_pgd()
This patch introduces a new macro/function that informs a paravirt guest when its page table is no longer in use and can be released. In case we're not paravirt, just do nothing.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/pgalloc.h |  7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86_64/pgalloc.h b/include/asm-x86_64/pgalloc.h
index b467be6..dbe1267 100644
--- a/include/asm-x86_64/pgalloc.h
+++ b/include/asm-x86_64/pgalloc.h
@@ -9,6 +9,12 @@
 #define QUICK_PGD 0 /* We preserve special mappings over free */
 #define QUICK_PT 1 /* Other page table pages that are zero on free */
+#ifdef CONFIG_PARAVIRT
+#include
+#else
+#define paravirt_release_pgd(pgd) do { } while (0)
+#endif
+
 #define pmd_populate_kernel(mm, pmd, pte) \
        set_pmd(pmd, __pmd(_PAGE_TABLE | __pa(pte)))
 #define pud_populate(mm, pud, pmd) \
@@ -100,6 +106,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 static inline void pgd_free(pgd_t *pgd)
 {
        BUG_ON((unsigned long)pgd & (PAGE_SIZE-1));
+       paravirt_release_pgd(pgd);
        quicklist_free(QUICK_PGD, pgd_dtor, pgd);
 }
--
1.4.4.2
[PATCH 22/25 -v2] turn privileged operation into a macro
Under paravirt, a read of cr2 can no longer be issued directly. So wrap it in a macro, defined to the operation itself in case paravirt is off, but to something else if we have paravirt in the game.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/head.S | 10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/kernel/head.S b/arch/x86_64/kernel/head.S
index e89abcd..1bb6c55 100644
--- a/arch/x86_64/kernel/head.S
+++ b/arch/x86_64/kernel/head.S
@@ -18,6 +18,12 @@
 #include
 #include
 #include
+#ifdef CONFIG_PARAVIRT
+#include
+#include
+#else
+#define GET_CR2_INTO_RCX mov %cr2, %rcx
+#endif
 /* we are not able to switch in one step to the final KERNEL ADRESS SPACE
  * because we need identity-mapped pages.
@@ -267,7 +273,9 @@ ENTRY(early_idt_handler)
        xorl %eax,%eax
        movq 8(%rsp),%rsi # get rip
        movq (%rsp),%rdx
-       movq %cr2,%rcx
+       /* When PARAVIRT is on, this operation may clobber rax. It is
+          something safe to do, because we've just zeroed rax. */
+       GET_CR2_INTO_RCX
        leaq early_idt_msg(%rip),%rdi
        call early_printk
        cmpl $2,early_recursion_flag(%rip)
--
1.4.4.2
[PATCH 23/25 -v2] provide paravirt patching function
This patch introduces apply_paravirt(), a function that shall be called by i386/alternative.c to apply replacements to paravirt_functions. It is defined as a do-nothing function if paravirt is not enabled.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/alternative.h |  8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/asm-x86_64/alternative.h b/include/asm-x86_64/alternative.h
index ab161e8..e69a141 100644
--- a/include/asm-x86_64/alternative.h
+++ b/include/asm-x86_64/alternative.h
@@ -143,12 +143,14 @@ static inline void alternatives_smp_switch(int smp) {}
  */
 #define ASM_OUTPUT2(a, b) a, b
-struct paravirt_patch;
+struct paravirt_patch_site;
 #ifdef CONFIG_PARAVIRT
-void apply_paravirt(struct paravirt_patch *start, struct paravirt_patch *end);
+void apply_paravirt(struct paravirt_patch_site *start,
+                    struct paravirt_patch_site *end);
 #else
 static inline void
-apply_paravirt(struct paravirt_patch *start, struct paravirt_patch *end)
+apply_paravirt(struct paravirt_patch_site *start,
+               struct paravirt_patch_site *end)
 {}
 #define __parainstructions NULL
 #define __parainstructions_end NULL
--
1.4.4.2
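The start/end calling convention of apply_paravirt() can be illustrated with a small user-space sketch that walks a half-open range of patch-site records. Everything below is an assumption for illustration: the layout of `struct patch_site_sketch` is invented (the real struct paravirt_patch_site is not shown in this patch), and filling each site with NOP bytes merely stands in for whatever replacement the real patcher emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical patch-site record: where the patchable bytes are, and how many. */
struct patch_site_sketch {
    unsigned char *instr;
    size_t len;
};

#define NOP_BYTE 0x90   /* x86 single-byte NOP, used only as filler here */

#ifndef SKETCH_NO_PARAVIRT
/* Visit every record in [start, end) and rewrite the bytes it points at. */
static void apply_paravirt_sketch(struct patch_site_sketch *start,
                                  struct patch_site_sketch *end)
{
    struct patch_site_sketch *p;

    assert(start <= end);
    for (p = start; p < end; p++)
        memset(p->instr, NOP_BYTE, p->len);
}
#else
/* Paravirt disabled: same signature, empty body, so callers need no #ifdef. */
static inline void apply_paravirt_sketch(struct patch_site_sketch *start,
                                         struct patch_site_sketch *end)
{
    (void)start; (void)end;
}
#endif

/* Pretend "kernel text" that the two sites below point into. */
static unsigned char text_sketch[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

static int patch_demo(void)
{
    struct patch_site_sketch sites[2] = {
        { &text_sketch[0], 2 },
        { &text_sketch[5], 1 },
    };

    apply_paravirt_sketch(sites, sites + 2);
    return text_sketch[0] == NOP_BYTE && text_sketch[1] == NOP_BYTE &&
           text_sketch[2] == 3 && text_sketch[5] == NOP_BYTE;
}
```

In the kernel, the start and end pointers come from the __parainstructions and __parainstructions_end symbols, which is why the disabled case defines both as NULL.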
[PATCH 16/25 -v2] turn page operations into native versions
This patch turns the page operations (set and make a page table) into native_ versions. The operations themselves will later be overridden by paravirt.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 include/asm-x86_64/page.h | 36 +++-
 1 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/include/asm-x86_64/page.h b/include/asm-x86_64/page.h
index 88adf1a..ec8b245 100644
--- a/include/asm-x86_64/page.h
+++ b/include/asm-x86_64/page.h
@@ -64,16 +64,42 @@ typedef struct { unsigned long pgprot; } pgprot_t;
 extern unsigned long phys_base;
-#define pte_val(x) ((x).pte)
-#define pmd_val(x) ((x).pmd)
-#define pud_val(x) ((x).pud)
-#define pgd_val(x) ((x).pgd)
-#define pgprot_val(x) ((x).pgprot)
+static inline unsigned long native_pte_val(pte_t pte)
+{
+       return pte.pte;
+}
+
+static inline unsigned long native_pud_val(pud_t pud)
+{
+       return pud.pud;
+}
+
+
+static inline unsigned long native_pmd_val(pmd_t pmd)
+{
+       return pmd.pmd;
+}
+
+static inline unsigned long native_pgd_val(pgd_t pgd)
+{
+       return pgd.pgd;
+}
+
+#ifdef CONFIG_PARAVIRT
+#include
+#else
+#define pte_val(x) native_pte_val(x)
+#define pmd_val(x) native_pmd_val(x)
+#define pud_val(x) native_pud_val(x)
+#define pgd_val(x) native_pgd_val(x)
 #define __pte(x) ((pte_t) { (x) } )
 #define __pmd(x) ((pmd_t) { (x) } )
 #define __pud(x) ((pud_t) { (x) } )
 #define __pgd(x) ((pgd_t) { (x) } )
+#endif /* CONFIG_PARAVIRT */
+
+#define pgprot_val(x) ((x).pgprot)
 #define __pgprot(x) ((pgprot_t) { (x) } )
 #endif /* !__ASSEMBLY__ */
--
1.4.4.2
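The pattern here is to move the raw field access into a typed native_ function and then let a macro decide whether callers reach it directly or through a hook. A reduced sketch for a single page-table type follows; `SKETCH_PARAVIRT`, `struct pte_ops_sketch`, `mk_pte_sketch` and `roundtrip()` are assumptions for illustration, not names from the patch.

```c
#include <assert.h>

typedef struct { unsigned long pte; } pte_t;

/* The wrapper type should cost nothing compared to the raw value. */
static_assert(sizeof(pte_t) == sizeof(unsigned long),
              "pte_t adds no storage over unsigned long");

/* Typed accessor: unlike ((x).pte), it rejects arguments that are not pte_t. */
static inline unsigned long native_pte_val(pte_t pte)
{
    return pte.pte;
}

#ifdef SKETCH_PARAVIRT
/* Hypothetical hook table that a guest could repoint at its own accessor. */
struct pte_ops_sketch {
    unsigned long (*pte_val)(pte_t);
};
static struct pte_ops_sketch pte_ops = { native_pte_val };
#define pte_val(x) pte_ops.pte_val(x)
#else
/* Native build: the generic name maps straight to the native function. */
#define pte_val(x) native_pte_val(x)
#endif

/* Build a pte_t from a raw value (the role __pte() plays in the patch). */
#define mk_pte_sketch(x) ((pte_t) { (x) })

static unsigned long roundtrip(unsigned long raw)
{
    pte_t p = mk_pte_sketch(raw);

    return pte_val(p);
}
```

Callers only ever use pte_val(), so switching between the native path and the hooked path is a build-time decision that leaves them untouched.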
[PATCH 7/25 -v2] interrupt related native paravirt functions.
The interrupt initialization routine becomes native_init_IRQ and will be overridden later in case paravirt is on.

[ updates from v1
 * After a talk with Jeremy Fitzhardinge, it turned out that making the interrupt vector global was not a good idea. So it is removed in this patch
]

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/i8259.c |  5 -
 include/asm-x86_64/irq.h   |  2 ++
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/kernel/i8259.c b/arch/x86_64/kernel/i8259.c
index 948cae6..048e3cb 100644
--- a/arch/x86_64/kernel/i8259.c
+++ b/arch/x86_64/kernel/i8259.c
@@ -484,7 +484,10 @@ static int __init init_timer_sysfs(void)
 device_initcall(init_timer_sysfs);
-void __init init_IRQ(void)
+/* Overridden in paravirt.c */
+void init_IRQ(void) __attribute__((weak, alias("native_init_IRQ")));
+
+void __init native_init_IRQ(void)
 {
        int i;
diff --git a/include/asm-x86_64/irq.h b/include/asm-x86_64/irq.h
index 5006c6e..be55299 100644
--- a/include/asm-x86_64/irq.h
+++ b/include/asm-x86_64/irq.h
@@ -46,6 +46,8 @@ static __inline__ int irq_canonicalize(int irq)
 extern void fixup_irqs(cpumask_t map);
 #endif
+void native_init_IRQ(void);
+
 #define __ARCH_HAS_DO_SOFTIRQ 1
 #endif /* _ASM_IRQ_H */
--
1.4.4.2