Re: 2.6.21-rc5-mm3 - no boot, "address not 2M aligned"
On Sat, 31 Mar 2007 09:12:20 +0200 Helge Hafting <[EMAIL PROTECTED]> wrote:

> A new error for me:
>
> loading 2.6.21rc5mm3
> Bios data check successful
> Destination address not 2M aligned
> -- System halted
>
> This is using the same lilo that loads 2.6.18rc5mm1 fine.
> x86-64

That's new. Does changing the value of CONFIG_RELOCATABLE change anything?

Please send the .config.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: add command line option "local_apic_timer_c2_ok"
On Fri, 30 Mar 2007 00:05:39 +0200, Andi Kleen wrote
> On Thursday 29 March 2007 23:16, Linus Torvalds wrote:
> >
> > On Thu, 29 Mar 2007, Andi Kleen wrote:
> > >
> > > Here's a patch. I don't have a system with C1E, so i only tested that
> > > the apic timer still works on a older AMD box.
> >
> > I think this looks better than what we have now, but it would look even
> > better if the core CPUID stuff was in arch/i386/kernel/cpu/amd.c, and we
> > simply had X86_FEATURE_BROKEN_C1_LAPIC etc..
> >
> > And then the apic.c code would just check
> >
> > 	if (boot_cpu_has(X86_FEATURE_BROKEN_C1_LAPIC))
> > 		return -1;
> >
> > or similar.
>
> Ok fair point. Here's an updated patch.

I've tested this patch a little bit more on my nx6325 and I've found a scenario in which my box is slow. When I boot the HP with AC connected (it boots fast), then unplug AC after boot and try to power the HP off, it works very slowly (the power-off process takes a few minutes). On battery it always boots and powers off fast.

Can anybody with an nx6325 confirm this?

--
Greetings - CeHo - Grzegorz Chwesewicz
[PATCH] VMI paravirt-ops bugfix for 2.6.21
So lazy MMU mode is vulnerable to interrupts coming in and issuing kmap_atomic, which does not work when under lazy MMU mode. The window for this is small, but it means highmem kernels, especially with heavy network, USB, or AIO workloads, are vulnerable to getting invariably fatal pagefaults in interrupt handlers.

For now, the best fix is to simply disable and re-enable interrupts when entering and exiting lazy mode (which, btw, is already guaranteed to have preempt disabled). For the future, a better fix is to simply exit lazy mode when issuing kmap_atomic, but I do not want to touch any generic code now for 2.6.21.

Hopefully there is still time to apply it. Thanks to Jeremy Fitzhardinge for pointing this out.

Zach

Critical bugfix; when using software RAID, potentially USB or AIO in highmem configurations, drivers are allowed to use kmap_atomic from interrupt context. This is incompatible with the current implementation of lazy MMU mode, and means the kmap will silently fail, causing either memory corruption or kernel panics. This bug is only visible with >970 megs of RAM and extreme memory pressure, but is nonetheless extremely serious.

The fix is to disable interrupts on the CPU when entering a lazy MMU state; this is totally safe, as preemption is already disabled, and lazy update state can neither be nested nor overlapping. Thus per-cpu variables to track the state and flags can be used to disable interrupts during this critical region.
Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r be8c61492e28 arch/i386/kernel/vmi.c
--- a/arch/i386/kernel/vmi.c	Fri Mar 30 14:13:45 2007 -0700
+++ b/arch/i386/kernel/vmi.c	Fri Mar 30 14:18:16 2007 -0700
@@ -69,6 +69,7 @@ struct {
 	void (*flush_tlb)(int);
 	void (*set_initial_ap_state)(int, int);
 	void (*halt)(void);
+	void (*set_lazy_mode)(int mode);
 } vmi_ops;
 
 /* XXX move this to alternative.h */
@@ -574,6 +575,31 @@ vmi_startup_ipi_hook(int phys_apicid, un
 }
 #endif
 
+static void vmi_set_lazy_mode(int new_mode)
+{
+	static DEFINE_PER_CPU(int, mode);
+	static DEFINE_PER_CPU(unsigned long, flags);
+	int cpu = smp_processor_id();
+
+	if (!vmi_ops.set_lazy_mode)
+		return;
+
+	/*
+	 * Modes do not nest or overlap, so we can simply disable
+	 * irqs when entering a mode and re-enable when leaving.
+	 */
+	BUG_ON(per_cpu(mode, cpu) && new_mode);
+	BUG_ON(!new_mode && !per_cpu(mode, cpu));
+
+	if (new_mode)
+		local_irq_save(per_cpu(flags, cpu));
+	else
+		local_irq_restore(per_cpu(flags, cpu));
+
+	vmi_ops.set_lazy_mode(new_mode);
+	per_cpu(mode, cpu) = new_mode;
+}
+
 static inline int __init check_vmi_rom(struct vrom_header *rom)
 {
 	struct pci_header *pci;
@@ -804,7 +830,7 @@ static inline int __init activate_vmi(vo
 	para_wrap(load_esp0, vmi_load_esp0, set_kernel_stack, UpdateKernelStack);
 	para_fill(set_iopl_mask, SetIOPLMask);
 	para_fill(io_delay, IODelay);
-	para_fill(set_lazy_mode, SetLazyMode);
+	para_wrap(set_lazy_mode, vmi_set_lazy_mode, set_lazy_mode, SetLazyMode);
 
 	/* user and kernel flush are just handled with different flags to FlushTLB */
 	para_wrap(flush_tlb_user, vmi_flush_tlb_user, flush_tlb, FlushTLB);
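
[Editor's sketch] The enter/leave bookkeeping described in the changelog can be modelled in userspace. This is only an illustration under stated assumptions: a single CPU, a boolean standing in for the interrupt flag, and hypothetical sim_* names that do not exist in the kernel. It is not the VMI code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model: true means interrupts are enabled. */
static bool sim_irqs_enabled = true;

/* Stand-in for local_irq_save(): remember the old state, then disable. */
static unsigned long sim_irq_save(void)
{
	unsigned long flags = sim_irqs_enabled ? 1UL : 0UL;

	sim_irqs_enabled = false;
	return flags;
}

/* Stand-in for local_irq_restore(): put back the remembered state. */
static void sim_irq_restore(unsigned long flags)
{
	sim_irqs_enabled = (flags != 0);
}

/* Model of the per-cpu "mode" and "flags" variables from the patch. */
static int sim_mode;
static unsigned long sim_flags;

static void sim_set_lazy_mode(int new_mode)
{
	/* Mirrors the two BUG_ON checks: no nesting, no leave without enter. */
	assert(!(sim_mode && new_mode));
	assert(new_mode || sim_mode);

	if (new_mode)
		sim_flags = sim_irq_save();
	else
		sim_irq_restore(sim_flags);

	sim_mode = new_mode;
}
```

Calling sim_set_lazy_mode(1) leaves the modelled interrupt flag cleared until the matching sim_set_lazy_mode(0), which is the property the patch relies on to keep interrupt handlers from calling kmap_atomic inside the lazy region.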
[PATCH 2/2] kconfig/kbuild: fix dependency problem
From bbc89026f3e5d9e437ce4cd26d3013fe226103e2 Mon Sep 17 00:00:00 2001
From: Sam Ravnborg <[EMAIL PROTECTED]>
Date: Sat, 31 Mar 2007 09:34:46 +0200
Subject: [PATCH] kconfig/kbuild: fix dependency problem

Commit 2e3646e51b2d6415549b310655df63e7e0d7a080 changed the way the split config tree is built, but failed to also adjust fixdep accordingly - if changing a config option from or to m, files referencing the respective CONFIG_..._MODULE (but not the corresponding CONFIG_...) didn't get rebuilt.

This happens because a tristate symbol has three values represented by different CONFIG_ symbols:

=n => CONFIG_SYMBOL undefined
=y => CONFIG_SYMBOL equals 1
=m => CONFIG_SYMBOL_MODULE equals 1

But conf_split_config did not support the _MODULE syntax and therefore no include/config/symbol/module.h file was generated/touched when changing a symbol to/from m. Thus make did not pick up the change and the rebuild failed.

This patch teaches conf_split_config to support the _MODULE variant.

This fixes a problem reported by Randy Dunlap <[EMAIL PROTECTED]>. arch/i386/kernel/apm.o revealed this bug.

Original fix was posted by: "Jan Beulich" <[EMAIL PROTECTED]> which inspired this better fix.
Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 scripts/kconfig/confdata.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
index ff6b39b..4137961 100644
--- a/scripts/kconfig/confdata.c
+++ b/scripts/kconfig/confdata.c
@@ -646,6 +646,12 @@ int conf_split_config(void)
 		*d = '\0';
 		if (touch_file(path))
 			return 1;
+		/* For tristate symbols we need to touch symbol/module.h too */
+		if (sym->type == S_TRISTATE) {
+			strcat(path, "/module");
+			if (touch_file(path))
+				return 1;
+		}
 	}
 	if (chdir("../.."))
 		return 1;
-- 
1.5.1.rc3.gaa453
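
[Editor's sketch] The mapping from a symbol name to the files touched under include/config can be illustrated in isolation. The helper below is hypothetical (config_symbol_path is not a kconfig function), and it writes the "include/config/" prefix explicitly for readability; the name handling itself follows the lower-case and '_' to '/' rule visible in the patches.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: build the split-config path for a symbol name
 * given without its CONFIG_ prefix. With module != 0 the tristate
 * companion file (symbol/module.h) is produced instead of symbol.h.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int config_symbol_path(const char *sym, int module,
			      char *out, size_t outlen)
{
	static const char prefix[] = "include/config/";
	const char *suffix = module ? "/module.h" : ".h";
	size_t need = strlen(prefix) + strlen(sym) + strlen(suffix) + 1;
	char *d;

	if (need > outlen)
		return -1;

	strcpy(out, prefix);
	d = out + strlen(prefix);
	for (; *sym; sym++) {
		unsigned char c = (unsigned char)*sym;

		/* Lower-case the name and turn '_' into a directory level. */
		*d++ = (c == '_') ? '/' : (char)tolower(c);
	}
	strcpy(d, suffix);
	return 0;
}
```

For the symbol APM this yields include/config/apm.h and, with module set, include/config/apm/module.h, the second being the file that was previously never touched.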
[PATCH 1/2] kconfig: factor out code in conf_split_config
From e9fcc3bf8d1c71df1ae650d5291c5d4b15d71656 Mon Sep 17 00:00:00 2001
From: Sam Ravnborg <[EMAIL PROTECTED]>
Date: Sat, 31 Mar 2007 09:15:12 +0200
Subject: [PATCH] kconfig: factor out code in conf_split_config

This patch simply factors out code and does not introduce any functional changes.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 scripts/kconfig/confdata.c | 75
 1 files changed, 41 insertions(+), 34 deletions(-)

diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
index 664fe29..ff6b39b 100644
--- a/scripts/kconfig/confdata.c
+++ b/scripts/kconfig/confdata.c
@@ -533,13 +533,48 @@ int conf_write(const char *name)
 	return 0;
 }
 
+/* Touch the file specified (adding .h to the name) */
+static int touch_file(const char *file)
+{
+	struct stat sb;
+	int fd;
+	char *d;
+	char name[128];
+
+	strcpy(name, file);
+	strcat(name, ".h");
+
+	/* Open existing file */
+	fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
+	if (fd == -1) {
+		if (errno != ENOENT)
+			return 1;
+		/*
+		 * Create directory components,
+		 * unless they exist already.
+		 */
+		d = name;
+		while ((d = strchr(d, '/'))) {
+			*d = 0;
+			if (stat(name, &sb) && mkdir(name, 0755))
+				return 1;
+			*d++ = '/';
+		}
+		/* Directories created, now create file. */
+		fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
+		if (fd == -1)
+			return 1;
+	}
+	close(fd);
+	return 0;
+}
+
 int conf_split_config(void)
 {
 	char *name, path[128];
 	char *s, *d, c;
 	struct symbol *sym;
-	struct stat sb;
-	int res, i, fd;
+	int res, i;
 
 	name = getenv("KCONFIG_AUTOCONFIG");
 	if (!name)
@@ -601,45 +636,17 @@ int conf_split_config(void)
 		 *	different from 'no').
 		 */
 
-		/* Replace all '_' and append ".h" */
+		/* Replace all '_' with '/' */
 		s = sym->name;
 		d = path;
 		while ((c = *s++)) {
 			c = tolower(c);
 			*d++ = (c == '_') ? '/' : c;
 		}
-		strcpy(d, ".h");
-
-		/* Assume directory path already exists. */
-		fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
-		if (fd == -1) {
-			if (errno != ENOENT) {
-				res = 1;
-				break;
-			}
-			/*
-			 * Create directory components,
-			 * unless they exist already.
-			 */
-			d = path;
-			while ((d = strchr(d, '/'))) {
-				*d = 0;
-				if (stat(path, &sb) && mkdir(path, 0755)) {
-					res = 1;
-					goto out;
-				}
-				*d++ = '/';
-			}
-			/* Try it again. */
-			fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
-			if (fd == -1) {
-				res = 1;
-				break;
-			}
-		}
-		close(fd);
+		*d = '\0';
+		if (touch_file(path))
+			return 1;
 	}
-out:
 	if (chdir("../.."))
 		return 1;
-- 
1.5.1.rc3.gaa453
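
[Editor's sketch] The "try to open, create missing directories on ENOENT, then open again" pattern factored out above can be tried standalone. The variant below is an illustration, not the kconfig code: touch_file_bounded is a hypothetical name, and it adds a length check on the 128-byte buffer that the patch itself does not perform.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 on success and 1 on failure, like touch_file() in the patch. */
static int touch_file_bounded(const char *file)
{
	char name[128];
	char *d;
	int fd;
	int n = snprintf(name, sizeof(name), "%s.h", file);

	/* Assumption of this sketch: reject names that would not fit. */
	if (n < 0 || (size_t)n >= sizeof(name))
		return 1;

	/* Fast path: the directory already exists. */
	fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1) {
		if (errno != ENOENT)
			return 1;
		/* Create each missing directory component in turn. */
		for (d = strchr(name, '/'); d; d = strchr(d + 1, '/')) {
			*d = '\0';
			if (name[0] && mkdir(name, 0755) && errno != EEXIST)
				return 1;
			*d = '/';
		}
		/* Directories created, now create the file. */
		fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fd == -1)
			return 1;
	}
	close(fd);
	return 0;
}
```

A call such as touch_file_bounded("apm/module") would create apm/ if needed and then an empty apm/module.h relative to the current directory, which is the behaviour the second patch depends on.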
[PATCH 2.6.21 3/4] cxgb3 - Tighten xgmac workaround
From: Divy Le Ray <[EMAIL PROTECTED]>

Run the watchdog task when the link is up.
Flush the XGMAC Tx FIFO when the link drops.
Also remove a statistics update that should have gone in the previous modification of xgmac.c.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/cxgb3_main.c | 16 +---
 drivers/net/cxgb3/regs.h |4
 drivers/net/cxgb3/xgmac.c |1 -
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 145b67c..512daf7 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -185,16 +185,26 @@ void t3_os_link_changed(struct adapter *
 			int speed, int duplex, int pause)
 {
 	struct net_device *dev = adapter->port[port_id];
+	struct port_info *pi = netdev_priv(dev);
+	struct cmac *mac = &pi->mac;
 
 	/* Skip changes from disabled ports. */
 	if (!netif_running(dev))
 		return;
 
 	if (link_stat != netif_carrier_ok(dev)) {
-		if (link_stat)
+		if (link_stat) {
+			t3_set_reg_field(adapter,
+					 A_XGM_TXFIFO_CFG + mac->offset,
+					 F_ENDROPPKT, 0);
 			netif_carrier_on(dev);
-		else
+		} else {
 			netif_carrier_off(dev);
+			t3_set_reg_field(adapter,
+					 A_XGM_TXFIFO_CFG + mac->offset,
+					 F_ENDROPPKT, F_ENDROPPKT);
+		}
+
 		link_report(dev);
 	}
 }
@@ -2119,7 +2129,7 @@ static void check_t3b2_mac(struct adapte
 			continue;
 
 		status = 0;
-		if (netif_running(dev))
+		if (netif_running(dev) && netif_carrier_ok(dev))
 			status = t3b2_mac_watchdog_task(&p->mac);
 		if (status == 1)
 			p->mac.stats.num_toggled++;
diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index b38629a..f8be41c 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -1940,6 +1940,10 @@
 
 #define V_TXFIFOTHRESH(x) ((x) << S_TXFIFOTHRESH)
 
+#define S_ENDROPPKT    21
+#define V_ENDROPPKT(x) ((x) << S_ENDROPPKT)
+#define F_ENDROPPKT    V_ENDROPPKT(1U)
+
 #define A_XGM_SERDES_CTRL 0x890
 
 #define A_XGM_SERDES_CTRL0 0x8e0
diff --git a/drivers/net/cxgb3/xgmac.c b/drivers/net/cxgb3/xgmac.c
index 2b42c13..94aaff0 100644
--- a/drivers/net/cxgb3/xgmac.c
+++ b/drivers/net/cxgb3/xgmac.c
@@ -471,7 +471,6 @@ const struct mac_stats *t3_mac_update_st
 	RMON_UPDATE(mac, rx_symbol_errs, RX_SYM_CODE_ERR_FRAMES);
 
 	RMON_UPDATE(mac, rx_too_long, RX_OVERSIZE_FRAMES);
-	mac->stats.rx_too_long += RMON_READ(mac, A_XGM_RX_MAX_PKT_SIZE_ERR_CNT);
 
 	v = RMON_READ(mac, A_XGM_RX_MAX_PKT_SIZE_ERR_CNT);
 	if (mac->adapter->params.rev == T3_REV_B2)
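
[Editor's sketch] The link-change hunk toggles one bit of the Tx FIFO configuration register. The fragment below models that on a plain variable, assuming that t3_set_reg_field() is a read-modify-write that clears the masked bits and ORs in the new value. The sim_* names are hypothetical and no hardware access is involved.

```c
#include <stdint.h>

/* Field definitions as introduced by the patch in regs.h. */
#define S_ENDROPPKT    21
#define V_ENDROPPKT(x) ((x) << S_ENDROPPKT)
#define F_ENDROPPKT    V_ENDROPPKT(1U)

/* Hypothetical stand-in for the A_XGM_TXFIFO_CFG register of one MAC. */
static uint32_t sim_txfifo_cfg;

/* Assumed semantics: clear the bits in mask, then set the bits in val. */
static void sim_set_reg_field(uint32_t *reg, uint32_t mask, uint32_t val)
{
	*reg = (*reg & ~mask) | val;
}

/* Link up: stop dropping Tx packets. Link down: drop them to flush the FIFO. */
static void sim_link_changed(int link_up)
{
	sim_set_reg_field(&sim_txfifo_cfg, F_ENDROPPKT,
			  link_up ? 0 : F_ENDROPPKT);
}
```

Only bit 21 changes on a link transition; whatever else is configured in the register is preserved, which is why a masked update is used rather than a plain write.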
[PATCH 2.6.21 4/4] cxgb3 - Firmware update
From: Divy Le Ray <[EMAIL PROTECTED]>

Introduce FW micro version.
Bump up FW version to 3.3.0

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/cxgb3_main.c |4 ++--
 drivers/net/cxgb3/version.h|5 -
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 512daf7..26240fd 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -721,7 +721,7 @@ static void bind_qsets(struct adapter *a
 	}
 }
 
-#define FW_FNAME "t3fw-%d.%d.bin"
+#define FW_FNAME "t3fw-%d.%d.%d.bin"
 
 static int upgrade_fw(struct adapter *adap)
 {
@@ -731,7 +731,7 @@ static int upgrade_fw(struct adapter *ad
 	struct device *dev = &adap->pdev->dev;
 
 	snprintf(buf, sizeof(buf), FW_FNAME, FW_VERSION_MAJOR,
-		 FW_VERSION_MINOR);
+		 FW_VERSION_MINOR, FW_VERSION_MICRO);
 	ret = request_firmware(&fw, buf, dev);
 	if (ret < 0) {
 		dev_err(dev, "could not upgrade firmware: unable to load %s\n",
diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
index 82278f8..042e27e 100644
--- a/drivers/net/cxgb3/version.h
+++ b/drivers/net/cxgb3/version.h
@@ -36,6 +36,9 @@
 #define DRV_NAME "cxgb3"
 /* Driver version */
 #define DRV_VERSION "1.0-ko"
+
+/* Firmware version */
 #define FW_VERSION_MAJOR 3
-#define FW_VERSION_MINOR 2
+#define FW_VERSION_MINOR 3
+#define FW_VERSION_MICRO 0
 #endif /* __CHELSIO_VERSION_H */
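
[Editor's sketch] The effect of the new format string on the requested firmware file name can be checked in userspace. The macros are copied from the patch; fw_file_name is a hypothetical wrapper added here only for illustration.

```c
#include <stdio.h>

/* Values as set by the patch. */
#define FW_FNAME "t3fw-%d.%d.%d.bin"
#define FW_VERSION_MAJOR 3
#define FW_VERSION_MINOR 3
#define FW_VERSION_MICRO 0

/* Hypothetical wrapper: returns 0 on success, -1 if buf is too small. */
static int fw_file_name(char *buf, size_t len)
{
	int n = snprintf(buf, len, FW_FNAME, FW_VERSION_MAJOR,
			 FW_VERSION_MINOR, FW_VERSION_MICRO);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With these values the driver would ask for t3fw-3.3.0.bin rather than the two-component name used before the change.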
[PATCH 2.6.21 1/4] cxgb3 - Safeguard TCAM size usage
From: Divy Le Ray <[EMAIL PROTECTED]>

Ensure that the TCAM active region size is at least 16.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/common.h|3 +++
 drivers/net/cxgb3/cxgb3_main.c|7 +--
 drivers/net/cxgb3/cxgb3_offload.c |4 +++-
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index 85e5543..38a0565 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -358,6 +358,9 @@ enum {
 	MC5_MODE_72_BIT = 2
 };
 
+/* MC5 min active region size */
+enum { MC5_MIN_TIDS = 16 };
+
 struct vpd_params {
 	unsigned int cclk;
 	unsigned int mclk;
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index d553836..b82544e 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -485,12 +485,14 @@ static ssize_t show_##name(struct device
 static ssize_t set_nfilters(struct net_device *dev, unsigned int val)
 {
 	struct adapter *adap = dev->priv;
+	int min_tids = is_offload(adap) ? MC5_MIN_TIDS : 0;
 
 	if (adap->flags & FULL_INIT_DONE)
 		return -EBUSY;
 	if (val && adap->params.rev == 0)
 		return -EINVAL;
-	if (val > t3_mc5_size(&adap->mc5) - adap->params.mc5.nservers)
+	if (val > t3_mc5_size(&adap->mc5) - adap->params.mc5.nservers -
+	    min_tids)
 		return -EINVAL;
 	adap->params.mc5.nfilters = val;
 	return 0;
@@ -508,7 +510,8 @@ static ssize_t set_nservers(struct net_d
 
 	if (adap->flags & FULL_INIT_DONE)
 		return -EBUSY;
-	if (val > t3_mc5_size(&adap->mc5) - adap->params.mc5.nfilters)
+	if (val > t3_mc5_size(&adap->mc5) - adap->params.mc5.nfilters -
+	    MC5_MIN_TIDS)
 		return -EINVAL;
 	adap->params.mc5.nservers = val;
 	return 0;
diff --git a/drivers/net/cxgb3/cxgb3_offload.c b/drivers/net/cxgb3/cxgb3_offload.c
index f6ed033..eed7a48 100644
--- a/drivers/net/cxgb3/cxgb3_offload.c
+++ b/drivers/net/cxgb3/cxgb3_offload.c
@@ -553,7 +553,9 @@ int cxgb3_alloc_atid(struct t3cdev *tdev
 	struct tid_info *t = &(T3C_DATA(tdev))->tid_maps;
 
 	spin_lock_bh(&t->atid_lock);
-	if (t->afree) {
+	if (t->afree &&
+	    t->atids_in_use + atomic_read(&t->tids_in_use) + MC5_MIN_TIDS <=
+	    t->ntids) {
 		union active_open_entry *p = t->afree;
 
 		atid = (p - t->atid_tab) + t->atid_base;
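
[Editor's sketch] The filter-count check can be looked at on its own. The helper below is a hypothetical rearrangement (nfilters_ok is not a driver function): it compares against the reserved total first so that the unsigned subtraction cannot wrap, a guard the patch itself does not add. MC5_MIN_TIDS is taken from the patch; the sizes in any call are made-up inputs.

```c
/* From the patch: minimum size of the TCAM active region. */
enum { MC5_MIN_TIDS = 16 };

/*
 * Hypothetical helper: return 1 if val filters fit into a TCAM of
 * tcam_size entries once nservers entries and, on offload-capable
 * adapters, MC5_MIN_TIDS active entries have been set aside.
 */
static int nfilters_ok(unsigned int val, unsigned int tcam_size,
		       unsigned int nservers, int offload)
{
	unsigned int reserved = nservers + (offload ? MC5_MIN_TIDS : 0);

	/* Assumption of this sketch: refuse instead of wrapping around. */
	if (reserved > tcam_size)
		return 0;
	return val <= tcam_size - reserved;
}
```

With a 256-entry TCAM and 140 server entries, an offload adapter accepts at most 100 filters, while a NIC-only adapter would accept 116, because it does not reserve the 16-entry active region.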
[PATCH 2.6.21 2/4] cxgb3 - detect NIC only adapters
From: Divy Le Ray <[EMAIL PROTECTED]> Differentiate NIC only adapters from RNICs. Initialize offload capabilities for RNICs only. Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]> --- drivers/net/cxgb3/common.h |6 +++--- drivers/net/cxgb3/cxgb3_main.c |8 drivers/net/cxgb3/mc5.c|3 +++ drivers/net/cxgb3/sge.c|2 +- drivers/net/cxgb3/t3_hw.c | 24 ++-- 5 files changed, 29 insertions(+), 14 deletions(-) diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h index 38a0565..97128d8 100644 --- a/drivers/net/cxgb3/common.h +++ b/drivers/net/cxgb3/common.h @@ -112,8 +112,7 @@ enum { }; enum { - SUPPORTED_OFFLOAD = 1 << 24, - SUPPORTED_IRQ = 1 << 25 + SUPPORTED_IRQ = 1 << 24 }; enum { /* adapter interrupt-maintained statistics */ @@ -405,6 +404,7 @@ struct adapter_params { unsigned int stats_update_period; /* MAC stats accumulation period */ unsigned int linkpoll_period; /* link poll period in 0.1s */ unsigned int rev; /* chip revision */ + unsigned int offload; }; enum { /* chip revisions */ @@ -605,7 +605,7 @@ static inline int is_10G(const struct ad static inline int is_offload(const struct adapter *adap) { - return adapter_info(adap)->caps & SUPPORTED_OFFLOAD; + return adap->params.offload; } static inline unsigned int core_ticks_per_usec(const struct adapter *adap) diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c index b82544e..145b67c 100644 --- a/drivers/net/cxgb3/cxgb3_main.c +++ b/drivers/net/cxgb3/cxgb3_main.c @@ -407,7 +407,7 @@ static void quiesce_rx(struct adapter *a static int setup_sge_qsets(struct adapter *adap) { int i, j, err, irq_idx = 0, qset_idx = 0, dummy_dev_idx = 0; - unsigned int ntxq = is_offload(adap) ? 
SGE_TXQ_PER_SET : 1; + unsigned int ntxq = SGE_TXQ_PER_SET; if (adap->params.rev > 0 && !(adap->flags & USING_MSI)) irq_idx = -1; @@ -922,7 +922,7 @@ static int cxgb_open(struct net_device * return err; set_bit(pi->port_id, &adapter->open_device_map); - if (!ofld_disable) { + if (is_offload(adapter) && !ofld_disable) { err = offload_open(dev); if (err) printk(KERN_WARNING @@ -2270,9 +2270,9 @@ static void __devinit print_port_info(st if (!test_bit(i, &adap->registered_device_map)) continue; - printk(KERN_INFO "%s: %s %s RNIC (rev %d) %s%s\n", + printk(KERN_INFO "%s: %s %s %sNIC (rev %d) %s%s\n", dev->name, ai->desc, pi->port_type->desc, - adap->params.rev, buf, + is_offload(adap) ? "R" : "", adap->params.rev, buf, (adap->flags & USING_MSIX) ? " MSI-X" : (adap->flags & USING_MSI) ? " MSI" : ""); if (adap->name == dev->name && adap->params.vpd.mclk) diff --git a/drivers/net/cxgb3/mc5.c b/drivers/net/cxgb3/mc5.c index 644d62e..84c1ffa 100644 --- a/drivers/net/cxgb3/mc5.c +++ b/drivers/net/cxgb3/mc5.c @@ -328,6 +328,9 @@ int t3_mc5_init(struct mc5 *mc5, unsigne unsigned int tcam_size = mc5->tcam_size; struct adapter *adap = mc5->adapter; + if (!tcam_size) + return 0; + if (nroutes > MAX_ROUTES || nroutes + nservers + nfilters > tcam_size) return -EINVAL; diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c index c237834..027ab2c 100644 --- a/drivers/net/cxgb3/sge.c +++ b/drivers/net/cxgb3/sge.c @@ -2631,7 +2631,7 @@ int t3_sge_alloc_qset(struct adapter *ad q->txq[TXQ_ETH].stop_thres = nports * flits_to_desc(sgl_len(MAX_SKB_FRAGS + 1) + 3); - if (ntxq == 1) { + if (!is_offload(adapter)) { #ifdef USE_RX_PAGE q->fl[0].buf_size = RX_PAGE_SIZE; #else diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c index 791ed6d..d83f075 100644 --- a/drivers/net/cxgb3/t3_hw.c +++ b/drivers/net/cxgb3/t3_hw.c @@ -438,23 +438,23 @@ static const struct adapter_info t3_adap {2, 0, 0, 0, F_GPIO2_OEN | F_GPIO4_OEN | F_GPIO2_OUT_VAL | F_GPIO4_OUT_VAL, F_GPIO3 | F_GPIO5, 
-SUPPORTED_OFFLOAD, +0, &mi1_mdio_ops, "Chelsio PE9000"}, {2, 0, 0, 0, F_GPIO2_OEN | F_GPIO4_OEN | F_GPIO2_OUT_VAL | F_GPIO4_OUT_VAL, F_GPIO3 | F_GPIO5, -SUPPORTED_OFFLOAD, +0, &mi1_mdio_ops, "Chelsio T302"}, {1, 0, 0, 0, F_GPIO1_OEN | F_GPIO6_OEN | F_GPIO7_OEN | F_GPIO10_OEN | F_GPIO1_OUT_VAL | F_GPIO6_OUT_VAL | F_GPIO10_OUT_VAL, 0, -SUPPORTED_1baseT_Full | SUPPORTED_AUI | SUPPORTED_OFFLOAD, +SUPPORTED_1baseT_Full | SUPPORTED_AUI, &mi1_mdio_ext_ops, "C
Re: 2.6.21-rc5-mm3 - no boot, "address not 2M aligned"
A new error for me:

loading 2.6.21rc5mm3
Bios data check successful
Destination address not 2M aligned
-- System halted

This is using the same lilo that loads 2.6.18rc5mm1 fine.
x86-64

Helge Hafting
[PATCH 2.6.21 0/4] cxgb3 - bug fixes
Hi Jeff,

I'm submitting a set of bug fixes for inclusion in 2.6.21. The patches are built against Linus' git tree.

Here is a brief description:
- ensure that the on-board TCAM's active region size is always at least 16
- the driver now recognizes NIC only adapters
- tighten the MAC hang workaround
- bump up firmware version

Cheers,
Divy
Re: 2.6.21-rc5-mm2 - compile error on x86-64
The patch did not apply, but mm3 compiled so I'll try that instead.

Helge Hafting
Re: 2.6.21-rc5-mm2 - compile error on x86-64
Helge Hafting <[EMAIL PROTECTED]> writes:

> Correct. I seem to remember that the latter is considered
> "deprecated, but some programs may still depend on it". So I disabled it to
> see what broke. udev complained about the missing /proc/sys/kernel/hotplug,
> but was happy to use /sys/kernel/uevent_helper instead. I didn't
> notice other problems, so I left things like that.

Well if anything it is the other way around. The preferred interface to sysctls is /proc/sys. There is the whole thing where people aren't too happy with non-process related things in /proc, so in that sense there is a bit of deprecation, but /proc and /proc/sys are fully supported.

The plethora of configuration is what remains from when I dug into the binary sys_sysctl interface and tested the assertion that no one uses it, that it has been deprecated for years, and that we could just kill it.

We can now remove the binary sys_sysctl syscall while keeping /proc/sys support. Someday I might even get ambitious and add the appropriate deprecation warnings so we can kill the binary interface.

I got as far as seeing that there were a small handful of real programs that use sys_sysctl. I looked at how we were giving notice and realized that was insufficient to tell users we were deprecating the thing. I didn't see much point (except being able to immediately drop support) in removing sys_sysctl, and since we would have to go a couple of years still supporting it to remove it properly, I got lazy and stopped.

Maybe myself or someone else can get ambitious and deprecate sys_sysctl properly and we can remove it one of these years...

Eric
Re: mcdx -- do_request(): non-read command to cd!!
On Fri, Mar 30 2007, Rene Herman wrote:
> Hi Al.
>
> GIT doesn't remember, it's been too long, but IIRC you were the last one
> to do some work on mcdx (the old proprietary mitsumi cd-rom driver). The
> thing builds without warnings on 2.6.20.4, unlike most other proprietary
> CD-ROM drivers, so someone did...
>
> In any case, I just bet you're positively thrilled receiving bug-reports
> for the thing right? Mmm?
>
> I dug up a 1-speed Mitsumi CRMC-LU005S today. Brilliant drive! You push
> on the front, after which it comes loose and you then yank the entire
> drive, mechanism and all, out of its casing over some kind of magnetic
> resistance it seems and then open a _second_ top-loading door, put in
> the CD and follow the procedure backwards again. I've done that at least
> 20 times now and I'm not by any means done yet. Brilliant.
>
> The drive works fine under DOS (*), with both IRQ-less and IRQ-enabled
> controllers. The linux driver does not work though:
>
> [EMAIL PROTECTED]:~# modprobe mcdx
>
> [EMAIL PROTECTED]:~# dmesg | tail -4
> mcdx Version 2.14(hs)
> mcdx $Id: mcdx.c,v 1.21 1997/01/26 07:12:59 davem Exp $
> Uniform CD-ROM driver Revision: 3.20
> mcdx: Mitsumi CD-ROM installed at 0x300, irq 15. (Firmware version M 4)
>
> [EMAIL PROTECTED]:~# mount /dev/mcdx0 /mnt/cdrom
> mount: block device /dev/mcdx0 is write-protected, mounting read-only
> mount: /dev/mcdx0: can't read superblock
>
> [EMAIL PROTECTED]:~# dmesg | tail -4
> mcdx: Mitsumi CD-ROM installed at 0x300, irq 15. (Firmware version M 4)
> mcdx do_request(): non-read command to cd!!
> end_request: I/O error, dev mcdx0, sector 0
> FAT: unable to read boot sector
> [EMAIL PROTECTED]:~#
>
> This same 300/15 pair works under DOS in the same machine and IRQ15 is
> firing. The error sounds very block-ish. Would you happen to know?
>
> I'll happily test patches :-)

Try this.

diff --git a/drivers/cdrom/mcdx.c b/drivers/cdrom/mcdx.c
index f574962..7086313 100644
--- a/drivers/cdrom/mcdx.c
+++ b/drivers/cdrom/mcdx.c
@@ -577,6 +577,11 @@ static void do_mcdx_request(request_queue_t * q)
 	if (!req)
 		return;
 
+	if (!blk_fs_request(req)) {
+		end_request(req, 0);
+		goto again;
+	}
+
 	stuffp = req->rq_disk->private_data;
 
 	if (!stuffp->present) {
@@ -596,7 +601,7 @@ static void do_mcdx_request(request_queue_t * q)
 	xtrace(REQUEST, "do_request() (%lu + %lu)\n",
 	       req->sector, req->nr_sectors);
 
-	if (req->cmd != READ) {
+	if (rq_data_dir(req) != READ) {
 		xwarn("do_request(): non-read command to cd!!\n");
 		xtrace(REQUEST, "end_request(0): write\n");
 		end_request(req, 0);

-- 
Jens Axboe
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
On Sat, 2007-03-31 at 08:31 +0200, Mike Galbraith wrote:
> On Fri, 2007-03-30 at 22:41 -0700, Xenofon Antidides wrote:
>
> > Patch makes X yuck with any load. I stick with SD.

General comment directed at nobody in particular: If anyone thinks the current scheduler sucks rocks, maybe they should try to fix it. If they think SD is the best thing since sliced bread, maybe they should help Con fix that. Code talks...

	-Mike
Re: [4/4] 2.6.21-rc5: known regressions (v2)
On Friday 30 March 2007 at 23:49 +0200, Adrian Bunk wrote:
> Subject    : MacMini doesn't come out of suspend to ram (i386 clockevents)
>              (CONFIG_HPET_TIMER)
> References : http://lkml.org/lkml/2007/3/21/374
> Submitter  : Frédéric Riss <[EMAIL PROTECTED]>
>              Tino Keitel <[EMAIL PROTECTED]>
> Caused-By  : Thomas Gleixner <[EMAIL PROTECTED]>
>              commit e9e2cdb412412326c4827fc78ba27f410d837e6e
> Status     : unknown

This one has been fixed by 399afa4fc9238fbae42116cf25a54671c0e8f56e. Suspend to ram now works with HPET enabled (and regardless of the NO_HZ setting).

Thanks!
Fred.
Re: [PATCH] fix dependency generation
On Thu, Mar 29, 2007 at 10:27:14AM +0100, Jan Beulich wrote:
> Commit 2e3646e51b2d6415549b310655df63e7e0d7a080 changed the way
> the split config tree is built, but failed to also adjust fixdep
> accordingly - if changing a config option from or to m, files
> referencing the respective CONFIG_..._MODULE (but not the
> corresponding CONFIG_...) didn't get rebuilt.

The problem is that a tristate symbol represents three values:

=n => CONFIG_SYMBOL is undefined
=y => CONFIG_SYMBOL is defined
=m => CONFIG_SYMBOL_MODULE is defined

The function split_config does not take the different values into account, and 'fixing' this in fixdep is wrong, because fixdep does not know whether the variable is a tristate symbol or not. It can therefore either blindly remove _MODULE (your patch) or, each time it encounters _MODULE, check for a symbol both with and without _MODULE.

The better fix is to teach the split_config function that for tristate symbols two files shall be created in the include/config hierarchy. So for apm this gives:

include/config/apm.h
include/config/apm/module.h

This will make kconfig behave correctly the day that someone adds a config symbol with a _MODULE suffix.

I will follow up with two patches that implement the changes to split_config. The first is a pure code refactoring preparing for the second patch.

Roman - please ack/nack these since they touch the kconfig backend.

	Sam

>
> Once at it, also eliminate false dependencies due to use of
> ...CONFIG_... identifiers.
>
> Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>
>
> --- linux-2.6.21-rc5/scripts/basic/fixdep.c 2007-02-04 19:44:54.0 +0100
> +++ 2.6.21-rc5-fixdep-mod/scripts/basic/fixdep.c 2007-03-29 11:11:10.0 +0200
> @@ -29,8 +29,7 @@
>   * option which is mentioned in any of the listed prequisites.
>   *
>   * To be exact, split-include populates a tree in include/config/,
> - * e.g. include/config/his/driver.h, which contains the #define/#undef
> - * for the CONFIG_HIS_DRIVER option.
> + * e.g. include/config/his/driver.h, consiting of empty files.
>   *
>   * So if the user changes his CONFIG_HIS_DRIVER option, only the objects
>   * which depend on "include/linux/config/his/driver.h" will be rebuilt,
> @@ -223,7 +222,7 @@ void use_config(char *m, int slen)
>  void parse_config_file(char *map, size_t len)
>  {
>  	int *end = (int *) (map + len);
> -	/* start at +1, so that p can never be < map */
> +	/* start at +1, so that p can never be <= map */
>  	int *m = (int *) map + 1;
>  	char *p, *q;
>
> @@ -235,6 +234,8 @@ void parse_config_file(char *map, size_t
>  			continue;
>  	conf:
>  		if (p > map + len - 7)
> +			break;
> +		if (isalnum(p[-1]) || p[-1] == '_')
>  			continue;
>  		if (memcmp(p, "CONFIG_", 7))
>  			continue;
> @@ -245,6 +246,8 @@ void parse_config_file(char *map, size_t
>  			continue;
>
>  	found:
> +		if (!memcmp(q - 7, "_MODULE", 7))
> +			q -= 7;
>  		use_config(p+7, q-p-7);
>  	}
>  }
2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns
When I run 2.6.21-rc5 + Andi's x86 patches + paravirt_ops patches, I've been getting my machine shut down with critical thermal shutdown messages:

Mar 30 23:19:03 localhost kernel: ACPI: Critical trip point
Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), shutting down.
Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), shutting down.
Mar 30 23:19:03 localhost shutdown[19417]: shutting down for system halt

and the machine does feel pretty hot. Interestingly, when the machine reboots, the fan spins up to a noticeably higher speed, so it seems that maybe something is getting fan speed control wrong.

The machine is a Thinkpad X60, with a 1.8GHz Core Duo. I can run it indefinitely with the FC6 2.6.20-1.2933.fc6 kernel, so I don't think there's anything wrong with the hardware. And it was sitting on a desktop plugged into mains, so there's no problems with obstructed airflow.

I was running a normal email/browsing/editing/compiling workload, and I don't think there was anything particularly CPU intensive running at the time. I run cpufreq with the conservative governor.

Running now with the FC6 kernel, I get:

: ezr:pts/2; cat /proc/acpi/thermal_zone/THM?/temperature
temperature: 69 C
temperature: 82 C

Config attached.
J CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="-paravirt" CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y CONFIG_SYSFS_DEPRECATED=y CONFIG_RELAY=y CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL=y CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y CONFIG_KALLSYMS_EXTRA_PASS=y CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SHMEM=y CONFIG_SLAB=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_RT_MUTEXES=y CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODVERSIONS=y CONFIG_MODULE_SRCVERSION_ALL=y CONFIG_KMOD=y CONFIG_STOP_MACHINE=y CONFIG_BLOCK=y CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y CONFIG_DEFAULT_CFQ=y CONFIG_DEFAULT_IOSCHED="cfq" CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_SMP=y CONFIG_X86_PC=y CONFIG_PARAVIRT=y CONFIG_VMI=y CONFIG_MPENTIUMM=y CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=6 CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_CMPXCHG64=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_HPET_TIMER=y CONFIG_NR_CPUS=8 CONFIG_SCHED_MC=y CONFIG_PREEMPT_VOLUNTARY=y 
CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_MCE=y CONFIG_X86_MCE_P4THERMAL=y CONFIG_VM86=y CONFIG_X86_CPUID=m CONFIG_EDD=m CONFIG_HIGHMEM64G=y CONFIG_PAGE_OFFSET=0xC000 CONFIG_HIGHMEM=y CONFIG_X86_PAE=y CONFIG_ARCH_FLATMEM_ENABLE=y CONFIG_ARCH_SPARSEMEM_ENABLE=y CONFIG_ARCH_SELECT_MEMORY_MODEL=y CONFIG_ARCH_POPULATES_NODE_MAP=y CONFIG_SELECT_MEMORY_MODEL=y CONFIG_FLATMEM_MANUAL=y CONFIG_FLATMEM=y CONFIG_FLAT_NODE_MEM_MAP=y CONFIG_SPARSEMEM_STATIC=y CONFIG_SPLIT_PTLOCK_CPUS=4 CONFIG_RESOURCES_64BIT=y CONFIG_ZONE_DMA_FLAG=1 CONFIG_HIGHPTE=y CONFIG_MATH_EMULATION=y CONFIG_MTRR=y CONFIG_IRQBALANCE=y CONFIG_HZ_1000=y CONFIG_HZ=1000 CONFIG_PHYSICAL_START=0x10 CONFIG_PHYSICAL_ALIGN=0x10 CONFIG_HOTPLUG_CPU=y CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y CONFIG_PM=y CONFIG_PM_DEBUG=y CONFIG_SOFTWARE_SUSPEND=y CONFIG_PM_STD_PARTITION="" CONFIG_SUSPEND_SMP=y CONFIG_ACPI=y CONFIG_ACPI_SLEEP=y CONFIG_ACPI_SLEEP_PROC_FS=y CONFIG_ACPI_PROCFS=y CONFIG_ACPI_AC=y CONFIG_ACPI_BATTERY=y CONFIG_ACPI_BUTTON=y CONFIG_ACPI_VIDEO=m CONFIG_ACPI_FAN=y CONFIG_ACPI_PROCESSOR=y CONFIG_ACPI_HOTPLUG_CPU=y CONFIG_ACPI_THERMAL=y CONFIG_ACPI_IBM=m CONFIG_ACPI_IBM_BAY=y CONFIG_ACPI_BLACKLIST_YEAR=1999 CONFIG_ACPI_EC=y CONFIG_ACPI_POWER=y CONFIG_ACPI_SYSTEM=y CONFIG_X86_PM_TIMER=y CONFIG_ACPI_CONTAINER=y CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y CONFIG_CPU_FREQ_GOV_PERFORMANCE=y CONFIG_CPU_FREQ_GOV_POWERSAVE=y CONFIG_CPU_FREQ_GOV_USERSPACE=y CONFIG_CPU_FREQ_GOV_ONDEMAND=y CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y CONFIG_X86_ACPI_CPUFREQ=y CONFIG_X86_SPEEDSTEP_CENTRINO=y CONFIG_X86_SPEEDSTEP_CENTRINO_ACPI=y CONFIG_X86_SPEEDSTEP_CENTRINO_TABLE=y CONFI
Re: 2.6.21-rc5-mm2 - compile error on x86-64
On Thu, Mar 29, 2007 at 02:28:16PM -0700, Andrew Morton wrote:
> On Thu, 29 Mar 2007 20:20:20 +0200
> Helge Hafting <[EMAIL PROTECTED]> wrote:
> [...]
> yup, people will presumably work on fixing these things up after the
> feature hits mainline.
>
> > LD init/built-in.o
> > LD .tmp_vmlinux1
> > fs/built-in.o: In function `proc_root_init':
> > /usr/src/linux/fs/proc/root.c:83: undefined reference to `proc_sys_init'
>
> Ah. I assume you have CONFIG_SYSCTL=y, CONFIG_PROC_SYSCTL=n?

Correct. I seem to remember that the latter is considered "deprecated, but some programs may still depend on it". So I disabled it to see what broke.

udev complained about the missing /proc/sys/kernel/hotplug, but was happy to use /sys/kernel/uevent_helper instead. I didn't notice other problems, so I left things like that.

Helge Hafting
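As an aside on the link error above: an undefined reference like this usually means a caller is built unconditionally while the callee is only built when an option is set. A common way to keep such a configuration linking is a no-op stub selected by the same option. The sketch below shows that pattern in plain C; it is not the fix that went into the tree, and DEMO_PROC_SYSCTL, demo_proc_sys_init and demo_proc_root_init are hypothetical stand-ins for the real symbols.

```c
/*
 * Hypothetical stand-in for the Kconfig symbol. Leaving it undefined
 * models the CONFIG_PROC_SYSCTL=n case from the report.
 */
/* #define DEMO_PROC_SYSCTL 1 */

#ifdef DEMO_PROC_SYSCTL
/* The real implementation would live in another object file. */
int demo_proc_sys_init(void);
#else
/* Option disabled: provide a stub so callers still compile and link. */
static inline int demo_proc_sys_init(void)
{
	return 0;	/* nothing to register */
}
#endif

/* Caller that is always built, like proc_root_init() in the report. */
int demo_proc_root_init(void)
{
	return demo_proc_sys_init();
}
```

With the option left undefined the caller simply gets 0 back from the stub, so no symbol is missing at link time.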
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
On Fri, 2007-03-30 at 22:41 -0700, Xenofon Antidides wrote:

> Patch makes X yuck with any load. I stick with SD.

Shrug. My mileage is different, but hey, it's a work in progress. If SD ever gets to the point that it actually delivers what it claims, I may join you. In the meantime, IMHO mainline is MUCH better in the general case. If the general case was that which the various sleep exploits do, the history mechanism in mainline wouldn't have survived its first day.

        -Mike
Re: 2.6.21-rc5-mm1
Hello,

> > > 2) This was found a couple minutes later when the system was
> > >    really busy and close to oom condition.
> > >
> > > INFO: lockdep is turned off.
> > > BUG: soft lockup detected on CPU#0!
> > >  [] show_trace_log_lvl+0x1a/0x30
> > >  [] show_trace+0x12/0x14
> > >  [] dump_stack+0x16/0x18
> > >  [] softlockup_tick+0x81/0xa8
> > >  [] run_local_timers+0x12/0x14
> > >  [] update_process_times+0x2b/0x63
> > >  [] tick_sched_timer+0x4d/0x9e
> > >  [] hrtimer_interrupt+0x12e/0x1a6
> > >  [] timer_interrupt+0xe/0x15
> > >  [] handle_IRQ_event+0x28/0x59
> > >  [] handle_level_irq+0x6e/0xe7
> > >  [] do_IRQ+0x3d/0x7f
> > >  [] common_interrupt+0x2e/0x34
> > >  [] do_softirq+0x4d/0x50
> > >  [] irq_exit+0x7e/0x80
> > >  [] do_IRQ+0x42/0x7f
> > >  [] common_interrupt+0x2e/0x34
> > >  [] core_sys_select+0x1c6/0x310
> > >  [] sys_select+0x39/0x18f
> > >  [] sysenter_past_esp+0x5d/0x99
> > >  ===
> > > Clocksource tsc unstable (delta = 9372804176 ns)
> > > Time: acpi_pm clocksource has been installed.
>
> Hmm.. No clue right off. Does booting w/ clocksource=acpi_pm avoid the
> issue?

Sorry. Can't reproduce it either way.

Mariusz
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
On Sat, 2007-03-31 at 05:42 +0200, Mike Galbraith wrote: > Yesterday, I piddled around with tracking interactive backlog as a way > to detect when the load isn't really an interactive load, that's very > simple and has potential. Kinda like the patch below (though it can all be done slow path), or something like my old throttling patches do (for grins I revived one, and watched it yawn at your exploit)... top - 07:49:36 up 6 min, 13 users, load average: 4.42, 3.11, 1.40 PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ P COMMAND 6027 root 20 0 1564 104 24 R 45 0.0 0:09.47 1 fiftypercent 6028 root 19 0 1564 104 24 R 40 0.0 0:09.43 1 fiftypercent 6025 root 25 0 2892 1240 1032 R 32 0.1 0:09.04 1 sh 6024 root 16 0 1564 436 356 S 30 0.0 0:10.45 0 fiftypercent 6026 root 15 0 1564 104 24 R 27 0.0 0:09.52 0 fiftypercent 6029 root 16 0 1564 104 24 R 18 0.0 0:09.33 0 fiftypercent ...or both, or maybe something clever instead :) --- kernel/sched.c.org 2007-03-27 15:47:49.0 +0200 +++ kernel/sched.c 2007-03-31 06:56:57.0 +0200 @@ -109,6 +109,7 @@ unsigned long long __attribute__((weak)) #define MAX_SLEEP_AVG (DEF_TIMESLICE * MAX_BONUS) #define STARVATION_LIMIT (MAX_SLEEP_AVG) #define NS_MAX_SLEEP_AVG (JIFFIES_TO_NS(MAX_SLEEP_AVG)) +#define INTERACTIVE_LIMIT (DEF_TIMESLICE * 4) /* * If a task is 'interactive' then we reinsert it in the active @@ -167,6 +168,9 @@ unsigned long long __attribute__((weak)) (JIFFIES_TO_NS(MAX_SLEEP_AVG * \ (MAX_BONUS / 2 + DELTA((p)) + 1) / MAX_BONUS - 1)) +#define INTERACTIVE_BACKLOG_EXCEEDED(array) \ + ((array)->interactive_ticks > INTERACTIVE_LIMIT) + #define TASK_PREEMPTS_CURR(p, rq) \ ((p)->prio < (rq)->curr->prio) @@ -201,6 +205,7 @@ static inline unsigned int task_timeslic struct prio_array { unsigned int nr_active; + int interactive_ticks; DECLARE_BITMAP(bitmap, MAX_PRIO+1); /* include 1 bit for delimiter */ struct list_head queue[MAX_PRIO]; }; @@ -234,6 +239,7 @@ struct rq { */ unsigned long nr_uninterruptible; + unsigned long switch_timestamp; 
unsigned long expired_timestamp; /* Cached timestamp set by update_cpu_clock() */ unsigned long long most_recent_timestamp; @@ -691,6 +697,8 @@ static void dequeue_task(struct task_str list_del(&p->run_list); if (list_empty(array->queue + p->prio)) __clear_bit(p->prio, array->bitmap); + if (TASK_INTERACTIVE(p)) + array->interactive_ticks -= p->time_slice; } static void enqueue_task(struct task_struct *p, struct prio_array *array) @@ -700,6 +708,8 @@ static void enqueue_task(struct task_str __set_bit(p->prio, array->bitmap); array->nr_active++; p->array = array; + if (TASK_INTERACTIVE(p)) + array->interactive_ticks += p->time_slice; } /* @@ -882,7 +892,11 @@ static int recalc_task_prio(struct task_ /* Caller must always ensure 'now >= p->timestamp' */ unsigned long sleep_time = now - p->timestamp; - if (batch_task(p)) + /* +* Migration timestamp adjustment may induce negative time. +* Ignore unquantifiable values as well as SCHED_BATCH tasks. +*/ + if (now < p->timestamp || batch_task(p)) sleep_time = 0; if (likely(sleep_time > 0)) { @@ -3051,9 +3065,9 @@ static inline int expired_starving(struc { if (rq->curr->static_prio > rq->best_expired_prio) return 1; - if (!STARVATION_LIMIT || !rq->expired_timestamp) + if (!STARVATION_LIMIT) return 0; - if (jiffies - rq->expired_timestamp > STARVATION_LIMIT * rq->nr_running) + if (jiffies - rq->switch_timestamp > STARVATION_LIMIT * rq->nr_running) return 1; return 0; } @@ -3131,8 +3145,74 @@ void account_steal_time(struct task_stru cpustat->steal = cputime64_add(cpustat->steal, tmp); } +/* + * Promote and requeue the next lower priority task. If no task + * is available in the active array, switch to the expired array. + * @rq: runqueue to search. + * @prio: priority at which to begin search. 
+ */ +static inline void promote_next_lower(struct rq *rq, int prio) +{ + struct prio_array *array = rq->active; + struct task_struct *p = NULL; + unsigned long long now = rq->most_recent_timestamp; + unsigned long *bitmap; + unsigned long starving = JIFFIES_TO_NS(rq->nr_running * DEF_TIMESLICE); + int idx = prio + 1, found_noninteractive = 0; + +repeat: + bitmap = array->bitmap; + idx = find_next_bit(bitmap, MAX_PRIO, idx); + if (idx < MAX_PRIO) { + struct list_head *queue = array->queue + idx; + + p = list_entry(queue->next, struct task_struct,
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
--- Mike Galbraith <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-03-30 at 19:36 -0700, Xenofon Antidides wrote:
>
> > Something different on many cpus? Sorry I was thinking
> > something other. I try 50% run + 50% sleep on one cpu
> > and mainline has big problem. Sorry for bad code I
> > copy bits to make it work. Start program first then
> > run bash 100% cpu (while : ; do : ; done). Try change
> > program forks from 1 till 3 or more mainline kernel
> > and bash gets 0%.

Mainline hangs with program. SD does not have problem with program and is more responsive than mainline.

> That's mainline with the below (which I'm trying
> various ideas to improve).

Patch makes X yuck with any load. I stick with SD.

Xant
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
Xenofon Antidides wrote:
> - Original Message
> From: Ingo Molnar <[EMAIL PROTECTED]>
> To: Con Kolivas <[EMAIL PROTECTED]>
> Cc: linux list ; Andrew Morton <[EMAIL PROTECTED]>; Mike Galbraith <[EMAIL PROTECTED]>
> Sent: Thursday, March 29, 2007 9:22:49 PM
> Subject: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
>
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Con Kolivas <[EMAIL PROTECTED]> wrote:
> I'm cautiously optimistic that we're at the thin edge of the bugfix wedge now.
> [...]
>
> and the numbers he posted:
> http://marc.info/?l=linux-kernel&m=117448900626028&w=2
>
> We been staring at these numbers for while now and we come to the conclusion they wrong.
>
> The test is f is 3 tasks, two on different and one on same cpu as sh here:
>
> virgin 2.6.21-rc3-rsdl-smp
> top - 13:52:50 up 7 min, 12 users, load average: 3.45, 2.89, 1.51
>
>  PID USER PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ P COMMAND
> 6560 root 31  0 2892 1236 1032 R   82  0.1 1:50.24 1 sh
> 6558 root 28  0 1428  276  228 S   42  0.0 1:00.09 1 f
> 6557 root 30  0 1424  280  228 R   35  0.0 1:00.25 0 f
> 6559 root 39  0 1424  276  228 R   33  0.0 0:58.36 0 f
>
> 6560 sh is asking for 100% cpu on cpu number 1
> 6558 f is asking for 50% cpu on cpu number 1
> 6557 f is asking for 50% cpu on cpu number 0
> 6559 f is asking for 50% cpu on cpu number 0
>
> So if 6560 and 6558 are asking for cpu from cpu number 1:
> 6560 wants 100% and 6558 wants 50%.
> 6560 should get 2/3 cpu
> 6558 should get 1/3 cpu

I don't think you can say that. If the 50% task alternated between long periods of running and sleeping, then the end result should approach a task that is sleeping for 50% of the time, and on the CPU 25% of the time.

As the periods get shorter, then the schedulers will favour the 50% task relatively more, but details will depend on implementation. You could have an implementation that always runs the 50% task when it becomes runnable, because it is decided that its priority is higher because it has been sleeping.
The only thing you can really say is that the 50% task should get between 25% and 50% (inclusive) CPU time.

--
SUSE Labs, Novell Inc.
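To put numbers on the 25% to 50% range, here is a small sketch of the arithmetic. It assumes the simple model from the message: the task wants the CPU for a fixed fraction of wall time, and while it is runnable it either shares the CPU evenly with the always-running tasks (long periods) or always wins it (short periods with a sleep bonus). The names demo_bounds, demo_share_bounds and demo_proportional_share are made up for illustration, and extending the lower bound to more than one CPU hog is my own assumption, not something stated in the thread.

```c
/*
 * Illustration only: bounds on the CPU share of a task that wants to run
 * for a fraction `duty` of wall time while competing with `hogs` tasks
 * that are always runnable on the same CPU.
 */
struct demo_bounds {
	double low;	/* long periods: even round-robin while runnable */
	double high;	/* sleeper always gets the CPU when it wakes up */
};

static struct demo_bounds demo_share_bounds(double duty, int hogs)
{
	struct demo_bounds b;

	b.low = duty / (hogs + 1);
	b.high = duty;
	return b;
}

/* Share if CPU time were split in proportion to what each task asks for. */
static double demo_proportional_share(double demand, double total_demand)
{
	return demand / total_demand;
}
```

For duty = 0.5 and one hog this gives the 0.25 and 0.5 limits quoted above. The 1/3 figure from the earlier mail is what demo_proportional_share(0.5, 1.5) returns, and it sits inside that range rather than being the only valid answer.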
[patch 4/6] Convert PDA into the percpu section
Currently x86 (similar to x86-64) has a special per-cpu structure called "i386_pda" which can be easily and efficiently referenced via the %fs register. An ELF section is more flexible than a structure, allowing any piece of code to use this area. Indeed, such a section already exists: the per-cpu area.

So this patch:
(1) Removes the PDA and uses per-cpu variables for each current member.
(2) Replaces the __KERNEL_PDA segment with __KERNEL_PERCPU.
(3) Creates a per-cpu mirror of __per_cpu_offset called this_cpu_off, which can be used to calculate addresses for this CPU's variables.
(4) Simplifies startup, because %fs doesn't need to be loaded with a special segment at early boot; it can be deferred until the first percpu area is allocated (or never for UP).

The result is less code and one less x86-specific concept.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>

---
 arch/i386/kernel/asm-offsets.c |    5 -
 arch/i386/kernel/cpu/common.c  |   17 -
 arch/i386/kernel/entry.S       |    5 -
 arch/i386/kernel/head.S        |   31 +
 arch/i386/kernel/i386_ksyms.c  |    2
 arch/i386/kernel/irq.c         |    3
 arch/i386/kernel/process.c     |   12 ++-
 arch/i386/kernel/smpboot.c     |   34 --
 arch/i386/kernel/vmi.c         |    6 -
 arch/i386/kernel/vmlinux.lds.S |    1
 include/asm-i386/current.h     |    5 -
 include/asm-i386/irq_regs.h    |   12 ++-
 include/asm-i386/pda.h         |   99 --
 include/asm-i386/percpu.h      |  132 +---
 include/asm-i386/processor.h   |    2
 include/asm-i386/segment.h     |    6 -
 include/asm-i386/smp.h         |    4 -
 17 files changed, 179 insertions(+), 197 deletions(-)

===
--- a/arch/i386/kernel/asm-offsets.c
+++ b/arch/i386/kernel/asm-offsets.c
@@ -15,7 +15,6 @@
 #include
 #include
 #include
-#include

 #define DEFINE(sym, val) \
         asm volatile("\n->" #sym " %0 " #val : : "i" (val))
@@ -101,10 +100,6 @@ void foo(void)
         OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);

-        BLANK();
-        OFFSET(PDA_cpu, i386_pda, cpu_number);
-        OFFSET(PDA_pcurrent, i386_pda, pcurrent);
-
 #ifdef CONFIG_PARAVIRT
BLANK(); OFFSET(PARAVIRT_enabled, paravirt_ops, paravirt_enabled); === --- a/arch/i386/kernel/cpu/common.c +++ b/arch/i386/kernel/cpu/common.c @@ -18,7 +18,6 @@ #include #include #endif -#include #include "cpu.h" @@ -47,12 +46,9 @@ DEFINE_PER_CPU(struct gdt_page, gdt_page [GDT_ENTRY_APMBIOS_BASE+2] = { 0x, 0x00409200 }, /* data */ [GDT_ENTRY_ESPFIX_SS] = { 0x, 0x00c09200 }, - [GDT_ENTRY_PDA] = { 0x, 0x00c09200 }, /* set in setup_pda */ + [GDT_ENTRY_PERCPU] = { 0x, 0x }, } }; EXPORT_PER_CPU_SYMBOL_GPL(gdt_page); - -DEFINE_PER_CPU(struct i386_pda, _cpu_pda); -EXPORT_PER_CPU_SYMBOL(_cpu_pda); static int cachesize_override __cpuinitdata = -1; static int disable_x86_fxsr __cpuinitdata; @@ -627,20 +623,13 @@ void __init early_cpu_init(void) #endif } -/* Make sure %gs is initialized properly in idle threads */ +/* Make sure %fs is initialized properly in idle threads */ struct pt_regs * __devinit idle_regs(struct pt_regs *regs) { memset(regs, 0, sizeof(struct pt_regs)); - regs->xfs = __KERNEL_PDA; + regs->xfs = __KERNEL_PERCPU; return regs; } - -/* Initial PDA used by boot CPU */ -struct i386_pda boot_pda = { - ._pda = &boot_pda, - .cpu_number = 0, - .pcurrent = &init_task, -}; /* * cpu_init() initializes state that is per-CPU. 
Some data is already === --- a/arch/i386/kernel/entry.S +++ b/arch/i386/kernel/entry.S @@ -132,7 +132,7 @@ 1: movl $(__USER_DS), %edx; \ movl %edx, %ds; \ movl %edx, %es; \ - movl $(__KERNEL_PDA), %edx; \ + movl $(__KERNEL_PERCPU), %edx; \ movl %edx, %fs #define RESTORE_INT_REGS \ @@ -560,7 +560,6 @@ END(syscall_badsys) #define FIXUP_ESPFIX_STACK \ /* since we are on a wrong stack, we cant make it a C code :( */ \ - movl %fs:PDA_cpu, %ebx; \ PER_CPU(gdt_page, %ebx); \ GET_DESC_BASE(GDT_ENTRY_ESPFIX_SS, %ebx, %eax, %ax, %al, %ah); \ addl %esp, %eax; \ @@ -685,7 +684,7 @@ error_code: pushl %fs CFI_ADJUST_CFA_OFFSET 4 /*CFI_REL_OFFSET fs, 0*/ - movl $(__KERNEL_PDA), %ecx + movl $(__KERNEL_PERCPU), %ecx movl %ecx, %fs UNWIND_ESPFIX_STACK popl %ecx === --- a/arch/i386/kernel/head.S +++ b/arch/i386/kernel/head.S @@ -317,12 +317,12 @@ 2:movl %cr0,%eax movl %eax,%cr0
[patch 6/6] Define per_cpu_offset
Define per_cpu_offset in asm-i386/percpu.h when SMP defined, like asm-generic/percpu.h does for UP.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>

---
 include/asm-i386/percpu.h |    2 ++
 1 file changed, 2 insertions(+)

===
--- a/include/asm-i386/percpu.h
+++ b/include/asm-i386/percpu.h
@@ -34,6 +34,8 @@
 /* This is used for other cpus to find our section. */
 extern unsigned long __per_cpu_offset[];
+
+#define per_cpu_offset(x) (__per_cpu_offset[x])

 /* Separate out the type, so (int[3], foo) works. */
 #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
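For readers new to the offset scheme this macro exposes, the following userspace sketch mimics the idea: one template copy of the per-cpu data, one private copy per CPU, and a table of byte offsets from the template to each copy. It is only a model. Every demo_* name is hypothetical, the kernel sets up its areas from a linker section rather than a struct, and the integer-cast pointer arithmetic plus __typeof__ assume a GCC-compatible compiler on an ordinary flat-address platform.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_NR_CPUS 4

/* Stand-in for the .data.percpu section: initial values for every CPU. */
struct demo_section {
	int counter;
	long ticks;
};

static struct demo_section demo_template = { 7, 0 };

/* One private copy of the section per CPU. */
static struct demo_section demo_areas[DEMO_NR_CPUS];

/* Byte distance from the template to each CPU's copy. */
static intptr_t demo_offsets[DEMO_NR_CPUS];

#define demo_per_cpu_offset(cpu) (demo_offsets[cpu])

/* Address of the template variable plus that CPU's offset. */
#define demo_per_cpu(field, cpu) \
	(*(__typeof__(demo_template.field) *) \
	  ((intptr_t)&demo_template.field + demo_per_cpu_offset(cpu)))

/* Copy the template once per CPU and record where each copy landed. */
static void demo_setup_per_cpu_areas(void)
{
	for (int i = 0; i < DEMO_NR_CPUS; i++) {
		memcpy(&demo_areas[i], &demo_template, sizeof(demo_template));
		demo_offsets[i] = (intptr_t)&demo_areas[i] -
				  (intptr_t)&demo_template;
	}
}
```

After demo_setup_per_cpu_areas(), writing through demo_per_cpu(counter, 2) changes only CPU 2's copy; the template and the other CPUs keep their initial value.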
[patch 5/6] cleanups to help using per-cpu variables from asm
This patch does a few small cleanups:
- use PER_CPU_NAME to generate the names of per-cpu variables
- use lea to add the per_cpu offset in PER_CPU(), because it doesn't affect condition flags
- add PER_CPU_VAR which allows direct access to per-cpu variables with the %fs: prefix on SMP.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>

---
 include/asm-i386/percpu.h |   12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

===
--- a/include/asm-i386/percpu.h
+++ b/include/asm-i386/percpu.h
@@ -16,12 +16,14 @@
  *    PER_CPU(cpu_gdt_descr, %ebx)
  */
 #ifdef CONFIG_SMP
+#define PER_CPU(var, reg) \
+        movl %fs:per_cpu__##this_cpu_off, reg; \
+        lea per_cpu__##var(reg), reg
+#define PER_CPU_VAR(var) %fs:per_cpu__##var
+#else /* ! SMP */
 #define PER_CPU(var, reg) \
-        movl %fs:per_cpu__this_cpu_off, reg;\
-        addl $per_cpu__##var, reg
-#else /* ! SMP */
-#define PER_CPU(var, reg) \
-        movl $per_cpu__##var, reg;
+        movl $per_cpu__##var, reg
+#define PER_CPU_VAR(var) per_cpu__##var
 #endif /* SMP */

 #else /* ...!ASSEMBLY */
Re: [test] hackbench.c interactivity results: vanilla versus
Mike Galbraith wrote:
> Yesterday, I piddled around with tracking interactive backlog as a way
> to detect when the load isn't really an interactive load, that's very
> simple and has potential.

You may want to consider fixing latencies per nice relative to load, as the biggest problem with iab are huge latency delays, which exhibit themselves as starvation, caused by unfair timeslice management.

Thanks!

--
Al
[patch 2/6] Allow percpu variables to be page-aligned
Let's allow page-alignment in general for per-cpu data (wanted by Xen, and Ingo suggested KVM as well). Because larger alignments can use more room, we increase the max per-cpu memory to 64k rather than 32k: it's getting a little tight. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> Acked-by: Ingo Molnar <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/alpha/kernel/vmlinux.lds.S |2 +- arch/arm/kernel/vmlinux.lds.S |2 +- arch/cris/arch-v32/vmlinux.lds.S |1 + arch/frv/kernel/vmlinux.lds.S |1 + arch/i386/kernel/vmlinux.lds.S|2 +- arch/m32r/kernel/vmlinux.lds.S|2 +- arch/mips/kernel/vmlinux.lds.S|2 +- arch/parisc/kernel/vmlinux.lds.S |2 +- arch/powerpc/kernel/setup_64.c|4 ++-- arch/powerpc/kernel/vmlinux.lds.S |6 +- arch/ppc/kernel/vmlinux.lds.S |2 +- arch/s390/kernel/vmlinux.lds.S|2 +- arch/sh/kernel/vmlinux.lds.S |2 +- arch/sh64/kernel/vmlinux.lds.S|2 +- arch/sparc/kernel/vmlinux.lds.S |2 +- arch/sparc64/kernel/smp.c |6 +++--- arch/x86_64/kernel/setup64.c |4 ++-- arch/x86_64/kernel/vmlinux.lds.S |2 +- arch/xtensa/kernel/vmlinux.lds.S |2 +- init/main.c |4 ++-- kernel/module.c |8 21 files changed, 29 insertions(+), 31 deletions(-) === --- a/arch/alpha/kernel/vmlinux.lds.S +++ b/arch/alpha/kernel/vmlinux.lds.S @@ -69,7 +69,7 @@ SECTIONS . = ALIGN(8); SECURITY_INIT - . = ALIGN(64); + . = ALIGN(8192); __per_cpu_start = .; .data.percpu : { *(.data.percpu) } __per_cpu_end = .; === --- a/arch/arm/kernel/vmlinux.lds.S +++ b/arch/arm/kernel/vmlinux.lds.S @@ -59,7 +59,7 @@ SECTIONS usr/built-in.o(.init.ramfs) __initramfs_end = .; #endif - . = ALIGN(64); + . = ALIGN(4096); __per_cpu_start = .; *(.data.percpu) __per_cpu_end = .; === --- a/arch/cris/arch-v32/vmlinux.lds.S +++ b/arch/cris/arch-v32/vmlinux.lds.S @@ -91,6 +91,7 @@ SECTIONS } SECURITY_INIT + . 
= ALIGN (8192); __per_cpu_start = .; .data.percpu : { *(.data.percpu) } __per_cpu_end = .; === --- a/arch/frv/kernel/vmlinux.lds.S +++ b/arch/frv/kernel/vmlinux.lds.S @@ -57,6 +57,7 @@ SECTIONS __alt_instructions_end = .; .altinstr_replacement : { *(.altinstr_replacement) } + . = ALIGN(4096); __per_cpu_start = .; .data.percpu : { *(.data.percpu) } __per_cpu_end = .; === --- a/arch/i386/kernel/vmlinux.lds.S +++ b/arch/i386/kernel/vmlinux.lds.S @@ -194,7 +194,7 @@ SECTIONS __initramfs_end = .; } #endif - . = ALIGN(L1_CACHE_BYTES); + . = ALIGN(4096); .data.percpu : AT(ADDR(.data.percpu) - LOAD_OFFSET) { __per_cpu_start = .; *(.data.percpu) === --- a/arch/m32r/kernel/vmlinux.lds.S +++ b/arch/m32r/kernel/vmlinux.lds.S @@ -110,7 +110,7 @@ SECTIONS __initramfs_end = .; #endif - . = ALIGN(32); + . = ALIGN(4096); __per_cpu_start = .; .data.percpu : { *(.data.percpu) } __per_cpu_end = .; === --- a/arch/mips/kernel/vmlinux.lds.S +++ b/arch/mips/kernel/vmlinux.lds.S @@ -119,7 +119,7 @@ SECTIONS .init.ramfs : { *(.init.ramfs) } __initramfs_end = .; #endif - . = ALIGN(32); + . = ALIGN(_PAGE_SIZE); __per_cpu_start = .; .data.percpu : { *(.data.percpu) } __per_cpu_end = .; === --- a/arch/parisc/kernel/vmlinux.lds.S +++ b/arch/parisc/kernel/vmlinux.lds.S @@ -181,7 +181,7 @@ SECTIONS .init.ramfs : { *(.init.ramfs) } __initramfs_end = .; #endif - . = ALIGN(32); + . = ALIGN(ASM_PAGE_SIZE); __per_cpu_start = .; .data.percpu : { *(.data.percpu) } __per_cpu_end = .; === --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -583,14 +583,14 @@ void __init setup_per_cpu_areas(void) char *ptr; /* Copy section for each CPU (we discard the original) */ - size = ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES); + size = ALIGN(__per_cpu_end - __per_cpu_start, PAGE_SIZE); #ifdef CONFIG_MODULES if (size < PERCPU_ENOUGH_ROOM) size = PERCPU_ENOUGH_ROOM; #endif for_each_possible_cpu(i) { - ptr = alloc_bootmem_node(NODE_DATA(cpu_to_node(i)), size); +
[patch 0/6] i386 gdt and percpu cleanups
Hi Andi,

This is a series of patches based on your latest queue (as of the other day, at least). It includes:

- the most recent patch to compute the appropriate amount of percpu space to allocate, using a separate reservation for modules where needed.
- make the percpu sections page-aligned, so that percpu variables can be page aligned if needed (which is used by gdt_page)
- page-align the gdt
- remove the pda and convert all pda usages into percpu variables (percpu variables still use the %fs prefix mechanism the pda used)
- some improvements to asm-i386/percpu.h to make asm access to percpu variables easy
- define per_cpu_offset in asm-i386/percpu.h, to match asm-generic/

Thanks,
        J
[patch 3/6] Page-align the GDT
Xen wants a dedicated page for the GDT. I believe VMI likes it too. lguest, KVM and native don't care. Simple transformation to page-aligned "struct gdt_page". Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> Acked-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> --- arch/i386/kernel/cpu/common.c |6 +++--- arch/i386/kernel/entry.S |2 +- arch/i386/kernel/head.S |2 +- arch/i386/kernel/traps.c |2 +- include/asm-i386/desc.h |9 +++-- 5 files changed, 13 insertions(+), 8 deletions(-) === --- a/arch/i386/kernel/cpu/common.c +++ b/arch/i386/kernel/cpu/common.c @@ -22,7 +22,7 @@ #include "cpu.h" -DEFINE_PER_CPU(struct desc_struct, cpu_gdt[GDT_ENTRIES]) = { +DEFINE_PER_CPU(struct gdt_page, gdt_page) = { .gdt = { [GDT_ENTRY_KERNEL_CS] = { 0x, 0x00cf9a00 }, [GDT_ENTRY_KERNEL_DS] = { 0x, 0x00cf9200 }, [GDT_ENTRY_DEFAULT_USER_CS] = { 0x, 0x00cffa00 }, @@ -48,8 +48,8 @@ DEFINE_PER_CPU(struct desc_struct, cpu_g [GDT_ENTRY_ESPFIX_SS] = { 0x, 0x00c09200 }, [GDT_ENTRY_PDA] = { 0x, 0x00c09200 }, /* set in setup_pda */ -}; -EXPORT_PER_CPU_SYMBOL_GPL(cpu_gdt); +} }; +EXPORT_PER_CPU_SYMBOL_GPL(gdt_page); DEFINE_PER_CPU(struct i386_pda, _cpu_pda) = { ._pda = &per_cpu___cpu_pda, === --- a/arch/i386/kernel/entry.S +++ b/arch/i386/kernel/entry.S @@ -558,7 +558,7 @@ END(syscall_badsys) #define FIXUP_ESPFIX_STACK \ /* since we are on a wrong stack, we cant make it a C code :( */ \ movl %fs:PDA_cpu, %ebx; \ - PER_CPU(cpu_gdt, %ebx); \ + PER_CPU(gdt_page, %ebx); \ GET_DESC_BASE(GDT_ENTRY_ESPFIX_SS, %ebx, %eax, %ax, %al, %ah); \ addl %esp, %eax; \ pushl $__KERNEL_DS; \ === --- a/arch/i386/kernel/head.S +++ b/arch/i386/kernel/head.S @@ -599,7 +599,7 @@ idt_descr: .word 0 # 32 bit align gdt_desc.address ENTRY(early_gdt_descr) .word GDT_ENTRIES*8-1 - .long per_cpu__cpu_gdt /* Overwritten for secondary CPUs */ + .long per_cpu__gdt_page /* Overwritten for secondary CPUs */ /* * The boot_gdt_table must mirror the equivalent in setup.S and is === --- a/arch/i386/kernel/traps.c +++ b/arch/i386/kernel/traps.c 
@@ -1037,7 +1037,7 @@ fastcall unsigned long patch_espfix_desc
 fastcall unsigned long patch_espfix_desc(unsigned long uesp, unsigned long kesp)
 {
-        struct desc_struct *gdt = __get_cpu_var(cpu_gdt);
+        struct desc_struct *gdt = __get_cpu_var(gdt_page).gdt;
         unsigned long base = (kesp - uesp) & -THREAD_SIZE;
         unsigned long new_kesp = kesp - base;
         unsigned long lim_pages = (new_kesp | (THREAD_SIZE - 1)) >> PAGE_SHIFT;
===
--- a/include/asm-i386/desc.h
+++ b/include/asm-i386/desc.h
@@ -18,10 +18,15 @@ struct Xgt_desc_struct {
         unsigned short pad;
 } __attribute__ ((packed));

-DECLARE_PER_CPU(struct desc_struct, cpu_gdt[GDT_ENTRIES]);
+struct gdt_page
+{
+        struct desc_struct gdt[GDT_ENTRIES];
+} __attribute__((aligned(PAGE_SIZE)));
+DECLARE_PER_CPU(struct gdt_page, gdt_page);
+
 static inline struct desc_struct *get_cpu_gdt_table(unsigned int cpu)
 {
-        return per_cpu(cpu_gdt, cpu);
+        return per_cpu(gdt_page, cpu).gdt;
 }

 extern struct Xgt_desc_struct idt_descr;
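The effect of wrapping the table in an aligned struct can be checked in userspace. The sketch below assumes a GCC-compatible compiler, 4 KiB pages and 32 eight-byte descriptors (the i386 values as far as I know, but treat them as assumptions). All demo_* names are hypothetical and this is not the kernel's definition.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE   4096	/* assumption: 4 KiB pages */
#define DEMO_GDT_ENTRIES 32	/* assumption: number of GDT slots */

/* Stand-in for struct desc_struct: two 32-bit words per descriptor. */
struct demo_desc {
	uint32_t a, b;
};

/*
 * The aligned attribute raises both the alignment and the size of the
 * struct to a full page, so nothing else can share the page with it.
 */
struct demo_gdt_page {
	struct demo_desc gdt[DEMO_GDT_ENTRIES];
} __attribute__((aligned(DEMO_PAGE_SIZE)));

static struct demo_gdt_page demo_gdt;

/* Nonzero when p sits on a page boundary. */
static int demo_is_page_aligned(const void *p)
{
	return ((uintptr_t)p & (DEMO_PAGE_SIZE - 1)) == 0;
}
```

The descriptors themselves only need 256 bytes here, yet sizeof(struct demo_gdt_page) comes out as a whole page. That padding is what gives a hypervisor a dedicated page to map or protect, which is the property the changelog says Xen wants.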
[patch 1/6] i386: Account for module percpu space separately from kernel percpu
Rather than using a single constant PERCPU_ENOUGH_ROOM, compute it as the sum of kernel_percpu + PERCPU_MODULE_RESERVE. This is now common to all architectures; if an architecture wants to set PERCPU_ENOUGH_ROOM to something special, then it may do so (ia64 is the only one which does).

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>
Cc: Eric W. Biederman <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>

---
 include/asm-alpha/percpu.h   | 14 --
 include/asm-sparc64/percpu.h | 10 --
 include/asm-x86_64/percpu.h  | 10 --
 include/linux/percpu.h       |9 -
 init/main.c                  |7 ++-
 kernel/module.c              |2 +-
 6 files changed, 11 insertions(+), 41 deletions(-)

===
--- a/include/asm-alpha/percpu.h
+++ b/include/asm-alpha/percpu.h
@@ -1,19 +1,5 @@
 #ifndef __ALPHA_PERCPU_H
 #define __ALPHA_PERCPU_H
-
-/*
- * Increase the per cpu area for Alpha so that
- * modules using percpu area can load.
- */
-#ifdef CONFIG_MODULES
-# define PERCPU_MODULE_RESERVE 8192
-#else
-# define PERCPU_MODULE_RESERVE 0
-#endif
-
-#define PERCPU_ENOUGH_ROOM \
-	(ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES) + \
-	 PERCPU_MODULE_RESERVE)
 
 #include
===
--- a/include/asm-sparc64/percpu.h
+++ b/include/asm-sparc64/percpu.h
@@ -4,16 +4,6 @@
 #include
 
 #ifdef CONFIG_SMP
-
-#ifdef CONFIG_MODULES
-# define PERCPU_MODULE_RESERVE 8192
-#else
-# define PERCPU_MODULE_RESERVE 0
-#endif
-
-#define PERCPU_ENOUGH_ROOM \
-	(ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES) + \
-	 PERCPU_MODULE_RESERVE)
 
 extern void setup_per_cpu_areas(void);
===
--- a/include/asm-x86_64/percpu.h
+++ b/include/asm-x86_64/percpu.h
@@ -10,16 +10,6 @@
 #ifdef CONFIG_SMP
 
 #include
-
-#ifdef CONFIG_MODULES
-# define PERCPU_MODULE_RESERVE 8192
-#else
-# define PERCPU_MODULE_RESERVE 0
-#endif
-
-#define PERCPU_ENOUGH_ROOM \
-	(ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES) + \
-	 PERCPU_MODULE_RESERVE)
 
 #define __per_cpu_offset(cpu) (cpu_pda(cpu)->data_offset)
 #define __my_cpu_offset() read_pda(data_offset)
===
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -11,8 +11,15 @@
 
 /* Enough to cover all DEFINE_PER_CPUs in kernel, including modules. */
 #ifndef PERCPU_ENOUGH_ROOM
-#define PERCPU_ENOUGH_ROOM 32768
+#ifdef CONFIG_MODULES
+#define PERCPU_MODULE_RESERVE 8192
+#else
+#define PERCPU_MODULE_RESERVE 0
 #endif
+
+#define PERCPU_ENOUGH_ROOM \
+	(__per_cpu_end - __per_cpu_start + PERCPU_MODULE_RESERVE)
+#endif /* PERCPU_ENOUGH_ROOM */
 
 /*
  * Must be an lvalue. Since @var must be a simple identifier,
===
--- a/init/main.c
+++ b/init/main.c
@@ -369,11 +369,8 @@ static void __init setup_per_cpu_areas(v
 	unsigned long nr_possible_cpus = num_possible_cpus();
 
 	/* Copy section for each CPU (we discard the original) */
-	size = ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES);
-#ifdef CONFIG_MODULES
-	if (size < PERCPU_ENOUGH_ROOM)
-		size = PERCPU_ENOUGH_ROOM;
-#endif
+
+	size = ALIGN(PERCPU_ENOUGH_ROOM, SMP_CACHE_BYTES);
 	ptr = alloc_bootmem(size * nr_possible_cpus);
 
 	for_each_possible_cpu(i) {
===
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -430,7 +430,7 @@ static int percpu_modinit(void)
 	pcpu_size = kmalloc(sizeof(pcpu_size[0]) * pcpu_num_allocated,
 			    GFP_KERNEL);
 	/* Static in-kernel percpu data (used). */
-	pcpu_size[0] = -ALIGN(__per_cpu_end-__per_cpu_start, SMP_CACHE_BYTES);
+	pcpu_size[0] = -(__per_cpu_end-__per_cpu_start);
 	/* Free room. */
 	pcpu_size[1] = PERCPU_ENOUGH_ROOM + pcpu_size[0];
 	if (pcpu_size[1] < 0) {
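For readers following the arithmetic, here is a minimal userspace sketch of the sizing before and after the patch. It is only a model: the function names and the 64-byte cache line are assumptions made for illustration, while 32768 and 8192 are the constants that appear in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins; SMP_CACHE_BYTES is architecture-specific in the kernel. */
#define SMP_CACHE_BYTES        64u
#define PERCPU_MODULE_RESERVE  8192u   /* reserve used by the patch with CONFIG_MODULES */
#define OLD_PERCPU_ENOUGH_ROOM 32768u  /* fixed constant the patch removes */

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* PERCPU_ENOUGH_ROOM after the patch: kernel percpu data plus the module reserve. */
static size_t percpu_enough_room(size_t kernel_percpu)
{
	return kernel_percpu + PERCPU_MODULE_RESERVE;
}

/* Per-CPU allocation size, modelled on the patched setup_per_cpu_areas(). */
static size_t percpu_area_size(size_t kernel_percpu)
{
	return align_up(percpu_enough_room(kernel_percpu), SMP_CACHE_BYTES);
}

/* Room left for modules, modelled on the patched percpu_modinit(). */
static long new_module_room(size_t kernel_percpu)
{
	return (long)percpu_enough_room(kernel_percpu) - (long)kernel_percpu;
}

/* Room left for modules with the old fixed constant (can shrink or go negative). */
static long old_module_room(size_t kernel_percpu)
{
	return (long)OLD_PERCPU_ENOUGH_ROOM
		- (long)align_up(kernel_percpu, SMP_CACHE_BYTES);
}
```

In this model the room left for modules no longer depends on how much percpu data the kernel itself uses, which appears to be the point of the change.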
[powerpc] RS/6000 43p-150 no longer boots as of 2.6.18
I know this is a bit late to be reporting this, as it happened before 2.6.18, but my PowerPC CHRP machine (RS/6000 43p-150, 604e CPU) no longer boots. From the console:

instantiating rtas at 0x1ffe5000 ... done
copying OF device tree ...
Building dt strings...
Building dt structure...
Device tree strings 0x008dc000 -> 0x008dcd97
Device tree struct 0x008dd000 -> 0x008e2000
Calling quiesce ...
returning from prom_init

...and here it hangs. This happened between 2.6.17-git21 and -git22. .config is attached. I'd be happy to test patches and provide more information.

Thanks,
Peter

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.17-git21
# Sat Feb 3 23:58:54 2007
#
# CONFIG_PPC64 is not set
CONFIG_PPC32=y
CONFIG_PPC_MERGE=y
CONFIG_MMU=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_IRQ_PER_CPU=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_PPC=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_NVRAM=y
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_PPC_OF=y
CONFIG_PPC_UDBG_16550=y
# CONFIG_GENERIC_TBSYNC is not set
# CONFIG_DEFAULT_UIMAGE is not set

#
# Processor support
#
CONFIG_CLASSIC32=y
# CONFIG_PPC_52xx is not set
# CONFIG_PPC_82xx is not set
# CONFIG_PPC_83xx is not set
# CONFIG_PPC_85xx is not set
# CONFIG_PPC_86xx is not set
# CONFIG_40x is not set
# CONFIG_44x is not set
# CONFIG_8xx is not set
# CONFIG_E200 is not set
CONFIG_6xx=y
CONFIG_PPC_FPU=y
# CONFIG_ALTIVEC is not set
CONFIG_PPC_STD_MMU=y
CONFIG_PPC_STD_MMU_32=y
# CONFIG_SMP is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION="-wire"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
CONFIG_AUDIT=y
# CONFIG_AUDITSYSCALL is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_RELAY is not set
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_RT_MUTEXES=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y

#
# Block layer
#
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"

#
# Platform support
#
CONFIG_PPC_MULTIPLATFORM=y
# CONFIG_PPC_ISERIES is not set
# CONFIG_EMBEDDED6xx is not set
# CONFIG_APUS is not set
CONFIG_PPC_CHRP=y
# CONFIG_PPC_PMAC is not set
# CONFIG_PPC_CELL is not set
# CONFIG_PPC_CELL_NATIVE is not set
# CONFIG_UDBG_RTAS_CONSOLE is not set
CONFIG_MPIC=y
CONFIG_PPC_RTAS=y
# CONFIG_RTAS_ERROR_LOGGING is not set
CONFIG_RTAS_PROC=y
# CONFIG_MMIO_NVRAM is not set
CONFIG_PPC_MPC106=y
# CONFIG_PPC_970_NAP is not set
# CONFIG_CPU_FREQ is not set
CONFIG_TAU=y
# CONFIG_TAU_INT is not set
# CONFIG_TAU_AVERAGE is not set
# CONFIG_WANT_EARLY_SERIAL is not set

#
# Kernel options
#
# CONFIG_HIGHMEM is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=m
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
# CONFIG_KEXEC is not set
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
# CONFIG_SPARSEMEM_STATIC is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_RESOURCES_64BIT is not set
CONFIG_PROC_DEVICETREE=y
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE="console=ttyS0,9600 console=tty0 root=/dev/hda1"
CONFIG_PM=y
CONFIG_PM_LEGACY=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set
CONFIG_SECCOMP=y
CONFIG_ISA_DMA_API=y

#
# Bus options
#
CONFIG_ISA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_PPC_I8259=y
CONFIG_PPC_INDIRECT_PCI=y
CONFIG_PCI=y
CONFIG_PCI_DOMAINS=y
# CONFIG_PCIEPORTBUS is not set
# CONFIG_PCI_DEBUG is not set

#
# PCCARD (PCMCIA/CardBus) support
#
# CONFIG_PCCARD is not set

#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set

#
# Advanced setup
#
# CONFIG_ADVANCED_OPTIONS is not set

#
# Default settings
Re: [patch 32/37] CRYPTO: api: scatterwalk_copychunks() fails to advance through scatterlist
On Fri, Mar 30, 2007 at 08:11:29PM -0700, Greg KH wrote:
>
> Is this an "add-on" patch, or a replacement one?

This is an add-on. In case you want a replacement, here it is:

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/crypto/scatterwalk.c b/crypto/scatterwalk.c
index 35172d3..0f76175 100644
--- a/crypto/scatterwalk.c
+++ b/crypto/scatterwalk.c
@@ -91,6 +91,8 @@ void scatterwalk_copychunks(void *buf, struct scatter_walk *walk,
 		memcpy_dir(buf, vaddr, len_this_page, out);
 		scatterwalk_unmap(vaddr, out);
 
+		scatterwalk_advance(walk, len_this_page);
+
 		if (nbytes == len_this_page)
 			break;
 
@@ -99,7 +101,5 @@ void scatterwalk_copychunks(void *buf, struct scatter_walk *walk,
 
 		scatterwalk_pagedone(walk, out, 1);
 	}
-
-	scatterwalk_advance(walk, nbytes);
 }
 EXPORT_SYMBOL_GPL(scatterwalk_copychunks);
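The effect of moving the advance inside the loop can be modelled in userspace. The sketch below is not the crypto API: struct seg, struct walk and the helper names are hypothetical stand-ins for a scatterlist walk, and it only illustrates why the cursor has to move by the per-chunk length rather than by the total.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a scatterlist: an array of byte segments. */
struct seg {
	const unsigned char *data;
	size_t len;
};

/* Cursor into the segment array: current segment and offset inside it. */
struct walk {
	const struct seg *segs;
	size_t nsegs;
	size_t idx;
	size_t off;
};

/* Bytes that can be taken from the current segment, capped at n. */
static size_t walk_clamp(const struct walk *w, size_t n)
{
	size_t left = w->segs[w->idx].len - w->off;

	return n < left ? n : left;
}

/* Move the cursor within the current segment. */
static void walk_advance(struct walk *w, size_t n)
{
	w->off += n;
	assert(w->off <= w->segs[w->idx].len);
}

/* Step to the next segment once the current one is used up. */
static void walk_segment_done(struct walk *w)
{
	if (w->off == w->segs[w->idx].len) {
		w->idx++;
		w->off = 0;
	}
}

/* Copy up to nbytes out of the walk; returns the number of bytes copied. */
static size_t copy_chunks(unsigned char *buf, struct walk *w, size_t nbytes)
{
	size_t copied = 0;

	while (nbytes > 0 && w->idx < w->nsegs) {
		size_t len_this_seg = walk_clamp(w, nbytes);

		memcpy(buf, w->segs[w->idx].data + w->off, len_this_seg);
		/* Advance by this chunk only, never by the total request. */
		walk_advance(w, len_this_seg);
		walk_segment_done(w);

		buf += len_this_seg;
		nbytes -= len_this_seg;
		copied += len_this_seg;
	}
	return copied;
}
```

Advancing by the total after the loop, as the earlier version did, would leave the cursor pointing past data that a later call still needs to read.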
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
On Sat, 2007-03-31 at 05:23 +0200, Mike Galbraith wrote:
> On Fri, 2007-03-30 at 19:36 -0700, Xenofon Antidides wrote:
>
> > Something different on many cpus? Sorry I was thinking
> > something other. I try 50% run + 50% sleep on one cpu
> > and mainline has big problem. Sorry for bad code I
> > copy bits to make it work. Start program first then
> > run bash 100% cpu (while : ; do : ; done). Try change
> > program forks from 1 till 3 or more mainline kernel
> > and bash gets 0%.
>
> top - 05:16:41 up 43 min, 13 users, load average: 9.51, 4.32, 5.67
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
>  7146 root      15   0  1564  104   24 R   43  0.0   0:20.74 0 fiftypercent
>  7142 root      15   0  1564  104   24 S   37  0.0   0:18.08 0 fiftypercent
>  7140 root      15   0  1564  436  356 R   21  0.0   0:18.94 1 fiftypercent
>  7144 root      15   0  1564  104   24 R   21  0.0   0:18.75 1 fiftypercent
>  7143 root      15   0  1564  104   24 R   20  0.0   0:18.85 1 fiftypercent
>  7145 root      15   0  1564  104   24 R   19  0.0   0:18.30 1 fiftypercent
>  7147 root      15   0  1564  104   24 R   19  0.0   0:18.03 1 fiftypercent
>  7141 root      16   0  1564  104   24 R   10  0.0   0:18.29 0 fiftypercent
>  6245 root      16   0  3368 1876 1376 R    7  0.2   0:49.81 0 bash
>
> That's mainline with the below (which I'm trying various ideas to improve).

Note: that's not an sh -c started at the same time as the 50% duty cycle dos, the pertinent data is that bash is getting into the loop.

Yesterday, I piddled around with tracking interactive backlog as a way to detect when the load isn't really an interactive load, that's very simple and has potential.

	-Mike
Re: [patch 32/37] CRYPTO: api: scatterwalk_copychunks() fails to advance through scatterlist
On Sat, Mar 31, 2007 at 12:14:37PM +1000, Herbert Xu wrote:
> On Sat, Mar 31, 2007 at 03:41:32AM +0200, Patrick McHardy wrote:
> >
> > > [CRYPTO] api: scatterwalk_copychunks() fails to advance through
> > > scatterlist
> >
> > This patch seems to cause some problems, I get reproducible freezes
> > on the receiving system with net-2.6.22 when sending IPsec packets
> > larger than the mtu (reproduced about 10 times). Reverting this
> > patch seems to fix it. In a few cases the oops also occurred on the
> > sending system.
> >
> > Backtrace from UML (sending system):
> >
> > uml:~# ping 10.0.0.1 -s 2
> > PING 10.0.0.1 (10.0.0.1) 2(20028) bytes of data.
> > BUG: soft lockup detected on CPU#0!
> > Call Trace:
>
> Indeed. That patch was buggy. Sorry for not catching this earlier.
>
> This should fix the problem.
>
> [CRYPTO] api: Use the right value when advancing scatterwalk_copychunks
>
> In the scatterwalk_copychunks loop, we should be advancing by
> len_this_page and not nbytes. The latter is the total length.
>
> Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Is this an "add-on" patch, or a replacement one?

thanks,

greg k-h
Re: libata bugfix: preserve LBA bit for HDIO_DRIVE_TASK
Mark Lord wrote:
> Ideally, this would go into linux-2.6.21.
>
> Preserve the LBA bit in the DevSel/Head register for HDIO_DRIVE_TASK.
>
> Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
> ---
> --- linux/drivers/ata/libata-scsi.c.orig	2007-03-21 13:35:02.0 -0400
> +++ linux/drivers/ata/libata-scsi.c	2007-03-30 17:40:58.0 -0400
> @@ -333,7 +333,7 @@
>  	scsi_cmd[8] = args[3];
>  	scsi_cmd[10] = args[4];
>  	scsi_cmd[12] = args[5];
> -	scsi_cmd[13] = args[6] & 0x0f;
> +	scsi_cmd[13] = args[6] & 0x4f;
>  	scsi_cmd[14] = args[0];
>
>  	/* Good values for timeout and retries? Values below

IDE seems to be just overriding devsel (0x10) and leaving the rest alone. Maybe we should do (args[6] & ~0x10) here? Or is it safer this way?

Thanks.

-- 
tejun
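To make the three masks under discussion concrete, here is a small sketch that applies each to a sample Device/Head value. The macro and function names are invented for this illustration; the bit values 0x10 (device select) and 0x40 (LBA) are the ones named in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bits of the ATA Device/Head register referred to in the thread. */
#define DEVHEAD_DEV_SEL 0x10u  /* bit 4: device select, owned by the driver */
#define DEVHEAD_LBA     0x40u  /* bit 6: LBA addressing mode */

/* Apply a pass-through mask; the device select bit must never pass through. */
static uint8_t passthru_devhead(uint8_t devhead, uint8_t mask)
{
	assert((mask & DEVHEAD_DEV_SEL) == 0);
	return devhead & mask;
}

/* Before the patch: only the low nibble (head / LBA bits 27:24) survives. */
static uint8_t mask_before_patch(uint8_t devhead)
{
	return passthru_devhead(devhead, 0x0f);
}

/* After the patch: the low nibble plus the LBA bit survive. */
static uint8_t mask_after_patch(uint8_t devhead)
{
	return passthru_devhead(devhead, 0x4f);
}

/* The alternative raised in the reply: clear only device select. */
static uint8_t mask_clear_devsel_only(uint8_t devhead)
{
	return passthru_devhead(devhead, (uint8_t)~DEVHEAD_DEV_SEL);
}
```

With an input of 0xF5, the three variants yield 0x05, 0x45 and 0xE5 respectively: the patched mask keeps the LBA bit, while the alternative would additionally pass bits 5 and 7 through unchanged.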
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
On Fri, 2007-03-30 at 19:36 -0700, Xenofon Antidides wrote:

> Something different on many cpus? Sorry I was thinking
> something other. I try 50% run + 50% sleep on one cpu
> and mainline has big problem. Sorry for bad code I
> copy bits to make it work. Start program first then
> run bash 100% cpu (while : ; do : ; done). Try change
> program forks from 1 till 3 or more mainline kernel
> and bash gets 0%.

top - 05:16:41 up 43 min, 13 users, load average: 9.51, 4.32, 5.67

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
 7146 root      15   0  1564  104   24 R   43  0.0   0:20.74 0 fiftypercent
 7142 root      15   0  1564  104   24 S   37  0.0   0:18.08 0 fiftypercent
 7140 root      15   0  1564  436  356 R   21  0.0   0:18.94 1 fiftypercent
 7144 root      15   0  1564  104   24 R   21  0.0   0:18.75 1 fiftypercent
 7143 root      15   0  1564  104   24 R   20  0.0   0:18.85 1 fiftypercent
 7145 root      15   0  1564  104   24 R   19  0.0   0:18.30 1 fiftypercent
 7147 root      15   0  1564  104   24 R   19  0.0   0:18.03 1 fiftypercent
 7141 root      16   0  1564  104   24 R   10  0.0   0:18.29 0 fiftypercent
 6245 root      16   0  3368 1876 1376 R    7  0.2   0:49.81 0 bash

That's mainline with the below (which I'm trying various ideas to improve).

--- linux-2.6.21-rc5/kernel/sched.c.org	2007-03-27 15:47:49.0 +0200
+++ linux-2.6.21-rc5/kernel/sched.c	2007-03-30 18:21:12.0 +0200
@@ -234,6 +234,7 @@ struct rq {
 	 */
 	unsigned long nr_uninterruptible;
 
+	unsigned long switch_timestamp;
 	unsigned long expired_timestamp;
 	/* Cached timestamp set by update_cpu_clock() */
 	unsigned long long most_recent_timestamp;
@@ -882,7 +883,11 @@ static int recalc_task_prio(struct task_
 	/* Caller must always ensure 'now >= p->timestamp' */
 	unsigned long sleep_time = now - p->timestamp;
 
-	if (batch_task(p))
+	/*
+	 * Migration timestamp adjustment may induce negative time.
+	 * Ignore unquantifiable values as well as SCHED_BATCH tasks.
+	 */
+	if (now < p->timestamp || batch_task(p))
 		sleep_time = 0;
 
 	if (likely(sleep_time > 0)) {
@@ -3051,9 +3056,9 @@ static inline int expired_starving(struc
 {
 	if (rq->curr->static_prio > rq->best_expired_prio)
 		return 1;
-	if (!STARVATION_LIMIT || !rq->expired_timestamp)
+	if (!STARVATION_LIMIT)
 		return 0;
-	if (jiffies - rq->expired_timestamp > STARVATION_LIMIT * rq->nr_running)
+	if (jiffies - rq->switch_timestamp > STARVATION_LIMIT * rq->nr_running)
 		return 1;
 	return 0;
 }
@@ -3131,6 +3136,67 @@ void account_steal_time(struct task_stru
 	cpustat->steal = cputime64_add(cpustat->steal, tmp);
 }
 
+/*
+ * Promote and requeue the next lower priority task. If no task
+ * is available in the active array, switch to the expired array.
+ * @rq: runqueue to search.
+ * @prio: priority at which to begin search.
+ */
+static inline void promote_next_lower(struct rq *rq, int prio)
+{
+	struct prio_array *array = rq->active;
+	struct task_struct *p = NULL;
+	unsigned long long now = rq->most_recent_timestamp;
+	unsigned long *bitmap;
+	unsigned long starving = JIFFIES_TO_NS(rq->nr_running * DEF_TIMESLICE);
+	int idx = prio + 1, found_noninteractive = 0;
+
+repeat:
+	bitmap = array->bitmap;
+	idx = find_next_bit(bitmap, MAX_PRIO, idx);
+	if (idx < MAX_PRIO) {
+		struct list_head *queue = array->queue + idx;
+
+		p = list_entry(queue->next, struct task_struct, run_list);
+		if (!TASK_INTERACTIVE(p))
+			found_noninteractive = 1;
+
+		/* Skip non-starved queues. */
+		if (now < p->last_ran + starving) {
+			idx++;
+			p = NULL;
+			goto repeat;
+		}
+	} else if (!found_noninteractive && array == rq->active) {
+		/* Nobody home, check the expired array. */
+		array = rq->expired;
+		idx = 0;
+		p = NULL;
+		goto repeat;
+	}
+
+	/* Found one, requeue it. */
+	if (p) {
+		dequeue_task(p, p->array);
+		if (array == rq->active)
+			p->prio--;
+		/*
+		 * If we pulled a task from the expired array, correct
+		 * expired array info. We can't afford a full search
+		 * for best_expired_prio, but do the best we can.
+		 */
+		else {
+			idx = sched_find_first_bit(array->bitmap);
+			if (idx < MAX_PRIO) {
+				if (rq->best_expired_prio > idx)
+					rq->best_expired_prio = idx;
+			} else
+
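Two of the smaller hunks in the patch above lend themselves to a userspace illustration: the guard against a "negative" sleep time, and the time-based part of the starvation check. The sketch below is a model under stated assumptions, not scheduler code: the function names are invented, 64-bit unsigned timestamps are assumed, and the static_prio comparison in expired_starving() is deliberately left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Unsigned subtraction wraps around when the timestamp is ahead of 'now'. */
static uint64_t sleep_time_unguarded(uint64_t now, uint64_t timestamp)
{
	return now - timestamp;
}

/* Modelled on the patched check: treat unquantifiable or batch cases as zero. */
static uint64_t sleep_time_guarded(uint64_t now, uint64_t timestamp, int is_batch)
{
	if (now < timestamp || is_batch)
		return 0;
	return now - timestamp;
}

/*
 * Time-based part of the starvation test only: true once more than
 * limit * nr_running ticks have passed since the last array switch.
 * A limit of zero disables the test, as in the patch.
 */
static int starved_since_switch(unsigned long jiffies_now,
				unsigned long switch_timestamp,
				unsigned long limit,
				unsigned long nr_running)
{
	if (!limit)
		return 0;
	assert(nr_running <= ULONG_MAX / limit);  /* keep the product from overflowing */
	return jiffies_now - switch_timestamp > limit * nr_running;
}
```

Without the guard, a timestamp slightly ahead of now turns into an enormous sleep time, which is presumably the "negative time" the patch comment refers to.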
[PATCH] Clean up ELF note generation
Three cleanups:

1: ELF notes are never mapped, so there's no need to have any access flags in their phdr.

2: When generating them from asm, tell the assembler to use a SHT_NOTE section type. There doesn't seem to be a way to do this from C.

3: Use ANSI rather than traditional cpp behaviour to stringify the macro argument.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Eric W. Biederman <[EMAIL PROTECTED]>

---
 arch/i386/kernel/vmlinux.lds.S    |2 +-
 include/asm-generic/vmlinux.lds.h |2 +-
 include/linux/elfnote.h           |4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

===
--- a/arch/i386/kernel/vmlinux.lds.S
+++ b/arch/i386/kernel/vmlinux.lds.S
@@ -34,7 +34,7 @@ PHDRS {
 PHDRS {
 	text PT_LOAD FLAGS(5);	/* R_E */
 	data PT_LOAD FLAGS(7);	/* RWE */
-	note PT_NOTE FLAGS(4);	/* R__ */
+	note PT_NOTE FLAGS(0);	/* ___ */
 }
 SECTIONS
 {
===
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -208,7 +208,7 @@
 	}
 
 #define NOTES \
-	.notes : { *(.note.*) } :note
+	.notes : { *(.note.*) } :note
 
 #define INITCALLS \
 	*(.initcall0.init) \
===
--- a/include/linux/elfnote.h
+++ b/include/linux/elfnote.h
@@ -39,12 +39,12 @@
  * ELFNOTE(XYZCo, 12, .long, 0xdeadbeef)
  */
 #define ELFNOTE(name, type, desctype, descdata)\
-.pushsection .note.name; \
+.pushsection .note.name, "",@note ; \
 .align 4 ; \
 .long 2f - 1f/* namesz */; \
 .long 4f - 3f/* descsz */; \
 .long type ; \
-1:.asciz "name"; \
+1:.asciz #name ; \
 2:.align 4 ; \
 3:desctype descdata; \
 4:.align 4 ; \
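As a rough illustration of what the ELFNOTE macro lays out, and of the ANSI stringification used in the third cleanup, here is a C sketch. build_elf_note() and NOTE_NAME() are hypothetical helpers written for this note and are not part of the patch; the layout (namesz, descsz, type, then name and descriptor each padded to 4 bytes) follows the macro body above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ANSI stringification: #x becomes the string literal "x". */
#define NOTE_NAME(x) #x

#define NOTE_ALIGN 4u

/* Round n up to the 4-byte alignment used between note fields. */
static size_t note_pad(size_t n)
{
	return (n + NOTE_ALIGN - 1) & ~(size_t)(NOTE_ALIGN - 1);
}

/*
 * Write one note record into out (host byte order) and return its size,
 * or 0 if it does not fit in cap bytes. namesz counts the terminating
 * NUL, matching ".long 2f - 1f" around an .asciz string.
 */
static size_t build_elf_note(unsigned char *out, size_t cap, const char *name,
			     uint32_t type, const void *desc, uint32_t descsz)
{
	uint32_t namesz;
	size_t total;

	assert(out != NULL && name != NULL);
	namesz = (uint32_t)strlen(name) + 1;
	total = 12 + note_pad(namesz) + note_pad(descsz);
	if (total > cap)
		return 0;

	memset(out, 0, total);                  /* zero fill provides the padding */
	memcpy(out, &namesz, 4);
	memcpy(out + 4, &descsz, 4);
	memcpy(out + 8, &type, 4);
	memcpy(out + 12, name, namesz);
	if (descsz)
		memcpy(out + 12 + note_pad(namesz), desc, descsz);
	return total;
}
```

For the example in the header comment, ELFNOTE(XYZCo, 12, .long, 0xdeadbeef), this gives namesz 6 (padded to 8), descsz 4, and a 24-byte record. Under ANSI rules a macro parameter inside a string literal is not substituted, which is why the patch switches from "name" to #name.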
Re: [3/4] 2.6.21-rc5: known regressions (v2)
On Sat, Mar 31, 2007 at 10:52:59AM +0800, Jeff Chua wrote:
> On 3/31/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
>
> >Subject    : ThinkPad doesn't resume from suspend to RAM
> >References : http://lkml.org/lkml/2007/2/27/80
> >             http://lkml.org/lkml/2007/2/28/348
> >Submitter  : Jens Axboe <[EMAIL PROTECTED]>
> >             Jeff Chua <[EMAIL PROTECTED]>
> >Status     : unknown
>
> Fixed with CONFIG_NO_HZ unset and patch from Maxim
> (http://lkml.org/lkml/2007/3/29/108).

Thanks for this information.

Jens, does suspend to RAM also work for you with the latest -git?

> Thanks,
> Jeff,

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
Re: [RFD driver-core] Lifetime problems of the current driver model
Tejun Heo wrote:
> Cornelia Huck wrote:
>> On Sat, 31 Mar 2007 00:08:19 +0900,
>> Tejun Heo <[EMAIL PROTECTED]> wrote:
>>
>>> (3) make sure all existing kobjects are released by module exit function.
>>>
>>> For example, let's say there is a hypothetical disk device /dev/dk0
>>> driven by a hypothetical driver mydrv. /dev/dk0 is represented like the
>>> following in the sysfs tree.
>>>
>>> /sys/devices/pci:00/:00:1f.0/dk0/{myknob0,myknob1}
>>>
>>> Owner of both attrs myknob0 and myknob1 is mydrv and opening either
>>> increases the reference counts of dk0 and mydrv and closing does the
>>> opposite.
>>>
>>> * When there is no opener of either knob and the /dev/dk0 isn't used by
>>> anyone. Reference count of dk0 is 1, mydrv 0.
>> Hm, but as long as dk0 is registered, it can be looked up and someone
>> could get a reference on it.
>
> Yeah, exactly. That's why any getting any kobject reference backed by a
> module must be accompanied by try_module_get().
>
> int mydrv_get_dk(struct dk *dk)
> {
> 	rc = try_module_get(mydrv);
> 	if (rc)
> 		return rc;
> 	kobject_get(&dk->kobj);
> 	return 0;
> }

And one more thing just in case. In the above code, try_module_get() and kobject_get() must be, and are, atomic w.r.t. try_stop_module(). That's why we do the following.

	stop_machine_run(__try_stop_module, &sref, NR_CPUS);

-- 
tejun
Re: [patch 0/6] i386 gdt and percpu cleanups
Jeremy Fitzhardinge wrote:
> This is a series of patches based on your latest queue (as of the
> other day, at least).
>

BTW, the From: line attributions got dropped from a few of these. These:

    Allow percpu variables to be page-aligned
    Page-align the GDT

should be From: Rusty. He did most of the work on the others, but I changed them enough that he shouldn't be saddled with the blame when things break ;)

    J
Re: [PATCH] [PATCH] uml: fix static linking for real
On Sat, Mar 31, 2007 at 03:20:27AM +0200, Paolo 'Blaisorblade' Giarrusso wrote:
> There was a typo in commit 7632fc8f809a97f9d82ce125e8e3e579390ce2e5, preventing
> it from working - 32bit binaries crashed hopelessly before the below fix and
> work perfectly now.
> Merge for 2.6.21, please.

ACK.

Jeff

-- 
Work email - jdike at linux dot intel dot com
Re: [RFD driver-core] Lifetime problems of the current driver model
Cornelia Huck wrote:
> On Sat, 31 Mar 2007 00:08:19 +0900,
> Tejun Heo <[EMAIL PROTECTED]> wrote:
>
>> (3) make sure all existing kobjects are released by module exit function.
>>
>> For example, let's say there is a hypothetical disk device /dev/dk0
>> driven by a hypothetical driver mydrv. /dev/dk0 is represented like the
>> following in the sysfs tree.
>>
>> /sys/devices/pci:00/:00:1f.0/dk0/{myknob0,myknob1}
>>
>> Owner of both attrs myknob0 and myknob1 is mydrv and opening either
>> increases the reference counts of dk0 and mydrv and closing does the
>> opposite.
>>
>> * When there is no opener of either knob and the /dev/dk0 isn't used by
>> anyone. Reference count of dk0 is 1, mydrv 0.
>
> Hm, but as long as dk0 is registered, it can be looked up and someone
> could get a reference on it.

Yeah, exactly. That's why getting any kobject reference backed by a module must be accompanied by try_module_get().

int mydrv_get_dk(struct dk *dk)
{
	rc = try_module_get(mydrv);
	if (rc)
		return rc;
	kobject_get(&dk->kobj);
	return 0;
}

>> * User issues rmmod mydrv. As mydrv's reference count is zero, unload
>> proceeds and mydrv's exit function is called.
>>
>> * mydrv's exit function looks like the following.
>>
>> mydrv_exit()
>> {
>> 	sysfs_remove_file(dk0, myknob0);
>> 	sysfs_remove_file(dk1, myknob1);
>> 	device_del(dk0);
>> 	deinit controller;
>> 	release all resources;
>> }
>>
>> The device_del(dk0) drops dk0's reference count to zero and its
>> ->release is invoked immediately.
>
> And here is the problem if someone else still has a reference. The
> module will be unloaded, but ->release will not be called until the
> "someone else" gives up the reference...

Exactly, in that case, module reference count must not be zero. You and I are saying the same thing. Why are we running in circles?

-- 
tejun
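The pairing described in this message can be modelled in userspace as follows. Everything prefixed mock_ is a hypothetical stand-in, not kernel API, and the model is single-threaded, so it says nothing about the stop_machine-based atomicity mentioned elsewhere in the thread. One detail worth flagging: the kernel's try_module_get() returns nonzero on success, so the sketch tests for failure with a negation, whereas the pseudocode in the message reads as if zero meant success.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for struct module and struct kobject. */
struct mock_module {
	int refcnt;
	bool going;     /* set once unload has started */
};

struct mock_kobject {
	int refcnt;
};

struct dk {
	struct mock_kobject kobj;
};

/* Same convention as the kernel helper: true on success, false if unloading. */
static bool mock_try_module_get(struct mock_module *m)
{
	if (m->going)
		return false;
	m->refcnt++;
	return true;
}

static void mock_module_put(struct mock_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/* Take a device reference only together with a reference on its module. */
static int mydrv_get_dk(struct mock_module *mydrv, struct dk *dk)
{
	if (!mock_try_module_get(mydrv))
		return -ENODEV;
	dk->kobj.refcnt++;
	return 0;
}

/* Drop both references in the reverse order. */
static void mydrv_put_dk(struct mock_module *mydrv, struct dk *dk)
{
	assert(dk->kobj.refcnt > 0);
	dk->kobj.refcnt--;
	mock_module_put(mydrv);
}

/* Unload may start only while nobody holds a module reference. */
static bool mock_try_stop_module(struct mock_module *m)
{
	if (m->refcnt != 0)
		return false;
	m->going = true;
	return true;
}
```

In this model an outstanding device reference always implies a nonzero module reference count, so unload cannot begin while someone could still trigger ->release later.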
Re: fs/block_dev.c:953: warning: 'found' might be used uninitialized in this function
2007/3/31, Adrian Bunk <[EMAIL PROTECTED]>:
> On Thu, Mar 29, 2007 at 11:16:39PM -0400, Kyle Moffett wrote:
> > On Mar 28, 2007, at 16:14:54, Andrew Morton wrote:
> > >On Wed, 28 Mar 2007 19:23:32 +0200 (CEST)
> > >Jiri Kosina <[EMAIL PROTECTED]> wrote:
> > >
> > >>blockdev: bd_claim_by_kobject() could check value of uninitialized
> > >>pointer
> > >>
> > >>Fixes this warning:
> > >>
> > >>fs/block_dev.c: In function `bd_claim_by_kobject':
> > >>fs/block_dev.c:953: warning: 'found' might be used uninitialized
> > >>in this function
> > >>
> > >>struct bd_holder *found is initialized only when bd_claim()
> > >>returns zero. If it returns nonzero, ptr stays uninitialized.
> > >>Later the value of the pointer is checked.
> > >
> > >that generates extra code and people get upset.
> > >
> > >One approach which we could use in here is
> > >
> > > struct bd_holder *found = found; /* Suppress bogus gcc warning */
> >
> > Well, that would be correct except the warning is an actual kernel
> > bug. Read Jiri's message (which you also quoted):
> > >struct bd_holder *found is initialized only when bd_claim() returns
> > >zero. If it returns nonzero, ptr stays uninitialized. Later the
> > >value of the pointer is checked.
> >
> > So in this case it has to be initialized to NULL or there's a
> > potential BUG() lurking.
>
> No, the code is correct and it's impossible that the variable ever gets
> read uninitialized.
>
> And BTW, i386 gcc 4.1 doesn't give me a warning for this.
>
> Toralf, which gcc version and architecture did you see this with?

I am also using i386 gcc 4.1.1, and I received many warnings of this kind yesterday. I think we should fix them.

The reason such code exists is that we intend to write to these variables before ever reading them, and so skip the initialization.

-- 
So Dark The Con Of Man.
Re: [patch 0/6] i386 gdt and percpu cleanups
Rusty Russell wrote:
> One nitpick: I'd really like PER_CPU() renamed to PER_CPU_ADDR().
> That's a separate patch, but I think would be far clearer.
>

Seems pretty simple, given that it has precisely one use.

    J
Re: [patch 4/6] Convert PDA into the percpu section
Andi Kleen wrote:
> On Saturday 31 March 2007 04:00, Jeremy Fitzhardinge wrote:
>
>> Currently x86 (similar to x86-64) has a special per-cpu structure
>> called "i386_pda" which can be easily and efficiently referenced via
>> the %fs register. An ELF section is more flexible than a structure,
>> allowing any piece of code to use this area. Indeed, such a section
>> already exists: the per-cpu area.
>>
>
> Hmm, I'm a little reluctant. This moves i386 more away from x86-64
> again. If we ever merge them it would mean more work. Do you really need it?

It cleans things up a fair bit:

   1. At initialization, it doesn't require %fs to be loaded before
      being able to use per-cpu variables, since you can use percpu with
      %fs set to a plain 0-based 4G segment; you can defer initialization
      until SMP bringup (which is never on a UP kernel). PDA requires %fs
      to be specially set up to point to an initial PDA, which includes
      setting up a gdt entry, generally before C code is run. For
      paravirtualized boot, this setup needs to be replicated by each
      hypervisor startup sequence; without the PDA, it becomes a
      non-issue (especially since hypervisors typically start up with %fs
      as a flat segment anyway). Overall, both UP and SMP boot are
      simpler and less fragile.
   2. Adding things to the pda requires changing , which often means
      including extra headers to allow added definitions. Since pda.h is
      used to implement things like "current" and "smp_processor_id", it
      gets included everywhere. Any header included in effectively gets
      included everywhere in the kernel. Also, it turns pda.h into a
      concentrated nest of patch conflicts. percpu requires no central
      modifications to add a new percpu variable.
   3. There's no disadvantage to using a percpu at all, especially if
      you can use the x86_*_percpu functions which allow direct access to
      the variable via %fs. If one construct will do, why have two?
      Removing the pda removes quite a bit of unnecessary code.
   4. I think, ultimately, it would be better to migrate x86_64 away
      from using the pda to all percpu too, though this has some tricky
      bits for now.

Certainly, not having this patch at this stage will require me to rework quite a few of the later patches. I was going to put off sending out this patch until later, but reworking everything to work both with pda and percpu was so fragile and tricky-bug-prone that I decided to push it early and save myself a lot of work.

    J
Re: [patch 0/6] i386 gdt and percpu cleanups
> - some improvements to asm-i386/percpu.h to make asm access to percpu
>   variables easy

One nitpick: I'd really like PER_CPU() renamed to PER_CPU_ADDR(). That's a separate patch, but I think would be far clearer.

Thanks,
Rusty.
Re: [patch 4/6] Convert PDA into the percpu section
On Sat, 2007-03-31 at 04:35 +0200, Andi Kleen wrote:
> On Saturday 31 March 2007 04:00, Jeremy Fitzhardinge wrote:
> > Currently x86 (similar to x86-64) has a special per-cpu structure
> > called "i386_pda" which can be easily and efficiently referenced via
> > the %fs register. An ELF section is more flexible than a structure,
> > allowing any piece of code to use this area. Indeed, such a section
> > already exists: the per-cpu area.
>
> Hmm, I'm a little reluctant. This moves i386 more away from x86-64
> again. If we ever merge them it would mean more work. Do you really need it?

Well, I think the merge should go the other way in this case: this really does simplify things.

The only thing stopping x86-64 from doing the same as i386 is the stack-protector stuff. And that can be fixed (unfortunately requires a gcc patch to change the %gs:40 to %gs:__gcc_stack_protector_offset and emit a weak absolute symbol __gcc_stack_protector_offset = 40).

I shall prepare a patch for that next week; I've been busy Kleening up lguest 8)

Rusty.
Re: [3/4] 2.6.21-rc5: known regressions (v2)
On 3/31/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> Subject    : ThinkPad doesn't resume from suspend to RAM
> References : http://lkml.org/lkml/2007/2/27/80
>              http://lkml.org/lkml/2007/2/28/348
> Submitter  : Jens Axboe <[EMAIL PROTECTED]>
>              Jeff Chua <[EMAIL PROTECTED]>
> Status     : unknown

Fixed with CONFIG_NO_HZ unset and patch from Maxim (http://lkml.org/lkml/2007/3/29/108).

Thanks,
Jeff,
Re: [PATCH 9/9] clocksource: refactor duplicate registration checking
On Fri, 2007-03-30 at 21:59 -0400, James Morris wrote:
> On Fri, 30 Mar 2007, Daniel Walker wrote:
>
> > /**
> >  * clocksource_register - Used to install new clocksources
> >  * @t: clocksource to be registered
> >  *
> > - * Returns -EBUSY if registration fails, zero otherwise.
> > + * Always returns zero.
> >  */
> > int clocksource_register(struct clocksource *c)
>
> Return should be void, then.

Yeah, that's another patch tho ..

Daniel
Re: [PATCH] CPUSETS: add mems to basic usage documentation
On Fri, Mar 30, 2007 at 02:30:47AM -0700, Paul Jackson wrote:
> Simon Horman wrote:
> > +++ linux-2.6/Documentation/cpusets.txt	2007-03-30 13:03:19.000000000 +0900
> > ...
> > +Add some mems:
> > +# /bin/echo 0-7 > mems
>
> Nice change - thanks.
>
> Acked-by: Paul Jackson <[EMAIL PROTECTED]>

Thanks

> (I probably would not add a dmesg complaint; we don't usually
> do that for ordinary system call failures. Pay close attention
> to the resulting errno - in this case ENOSPC.)

Understood

--
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/
Re: [4/4] 2.6.21-rc5: known regressions (v2)
On 3/31/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
Subject    : suspend to disk hangs (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/3/25/217
Submitter  : Jeff Chua <[EMAIL PROTECTED]>
Status     : unknown

Still broken on 2.6.21-rc5.

Jeff.
Re: [ckrm-tech] [PATCH 7/7] containers (V7): Container interface to nsproxy subsystem
On Mon, Feb 12, 2007 at 12:15:28AM -0800, [EMAIL PROTECTED] wrote:
> +int ns_container_clone(struct task_struct *tsk)
> +{
> +	return container_clone(tsk, &ns_subsys);
> +}

This function is a no-op if ns hierarchy is not mounted at this point.
This would mean that we will miss out on some directories in ns
hierarchy if it happened to be mounted later. It would be nice to
recreate such missing directories upon mount. However I suspect it would
not be easy. Maybe we need to scan the task list and (re-)invoke
ns_container_clone() for every new tsk->nsproxy we find in the list.

Alternately perhaps we could auto mount (kern_mount) ns hierarchy very
early at bootup? On the flip side that would require remount support so
that additional controllers (like cpuset, mem) can be bound to
(non-empty) ns hierarchy after bootup.

--
Regards,
vatsa
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
--- Mike Galbraith <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-03-30 at 15:05 +0000, Xenofon Antidides wrote:
> > - Original Message
> > From: Ingo Molnar <[EMAIL PROTECTED]>
> > To: Con Kolivas <[EMAIL PROTECTED]>
> > Cc: linux list ; Andrew Morton <[EMAIL PROTECTED]>; Mike Galbraith <[EMAIL PROTECTED]>
> > Sent: Thursday, March 29, 2007 9:22:49 PM
> > Subject: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
> >
> >
> > * Ingo Molnar <[EMAIL PROTECTED]> wrote:
> >
> > > * Con Kolivas <[EMAIL PROTECTED]> wrote:
> > >
> > > > I'm cautiously optimistic that we're at the thin edge of the bugfix
> > > > wedge now.
> > [...]
> >
> > > and the numbers he posted:
> > >
> > > > http://marc.info/?l=linux-kernel&m=117448900626028&w=2
> >
> > We been staring at these numbers for while now and we come to the
> > conclusion they wrong.
> >
> > The test is f is 3 tasks, two on different and one on same cpu as sh here:
> > virgin 2.6.21-rc3-rsdl-smp
> > top - 13:52:50 up 7 min, 12 users, load average: 3.45, 2.89, 1.51
> >
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
> >  6560 root      31   0  2892 1236 1032 R   82  0.1   1:50.24 1 sh
> >  6558 root      28   0  1428  276  228 S   42  0.0   1:00.09 1 f
> >  6557 root      30   0  1424  280  228 R   35  0.0   1:00.25 0 f
> >  6559 root      39   0  1424  276  228 R   33  0.0   0:58.36 0 f
>
> This is a 1 second sample, tasks migrate.
>
> -Mike

Something different on many cpus? Sorry I was thinking something other.
I try 50% run + 50% sleep on one cpu and mainline has big problem. Sorry
for bad code I copy bits to make it work. Start program first then run
bash 100% cpu (while : ; do : ; done). Try change program forks from 1
till 3 or more mainline kernel and bash gets 0%.

Xant

// gcc -O2 -o fiftyp fiftyp.c -lrt
// code from interbench.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <time.h>
#include <sys/types.h>

int forks=1;
int runus,sleepus=7000;
unsigned long loops_per_ms;

void terminal_error(const char *name)
{
	fprintf(stderr, "\n");
	perror(name);
	exit (1);
}

unsigned long long get_nsecs(struct timespec *myts)
{
	if (clock_gettime(CLOCK_REALTIME, myts))
		terminal_error("clock_gettime");
	return (myts->tv_sec * 1000000000 + myts->tv_nsec );
}

void burn_loops(unsigned long loops)
{
	unsigned long i;

	/*
	 * We need some magic here to prevent the compiler from optimising
	 * this loop away. Otherwise trying to emulate a fixed cpu load
	 * with this loop will not work.
	 */
	for (i = 0 ; i < loops ; i++)
		asm volatile("" : : : "memory");
}

/* Use this many usecs of cpu time */
void burn_usecs(unsigned long usecs)
{
	unsigned long ms_loops;

	ms_loops = loops_per_ms / 1000 * usecs;
	burn_loops(ms_loops);
}

void microsleep(unsigned long long usecs)
{
	struct timespec req, rem;

	rem.tv_sec = rem.tv_nsec = 0;

	req.tv_sec = usecs / 1000000;
	req.tv_nsec = (usecs - (req.tv_sec * 1000000)) * 1000;
continue_sleep:
	if ((nanosleep(&req, &rem)) == -1) {
		if (errno == EINTR) {
			if (rem.tv_sec || rem.tv_nsec) {
				req.tv_sec = rem.tv_sec;
				req.tv_nsec = rem.tv_nsec;
				goto continue_sleep;
			}
			goto out;
		}
		terminal_error("nanosleep");
	}
out:
	return;
}

/*
 * In an unoptimised loop we try to benchmark how many meaningless loops
 * per second we can perform on this hardware to fairly accurately
 * reproduce certain percentage cpu usage
 */
void calibrate_loop(void)
{
	unsigned long long start_time, loops_per_msec, run_time = 0;
	unsigned long loops;
	struct timespec myts;

	loops_per_msec = 1000000;
redo:
	/* Calibrate to within 1% accuracy */
	while (run_time > 1010000 || run_time < 990000) {
		loops = loops_per_msec;
		start_time = get_nsecs(&myts);
		burn_loops(loops);
		run_time = get_nsecs(&myts) - start_time;
		loops_per_msec = (1000000 * loops_per_msec / run_time ? :
			loops_per_msec);
	}

	/* Rechecking after a pause increases reproducibility */
	sleep(1);
	loops = loops_per_msec;
	start_time = get_nsecs(&myts);
	burn_loops(loops);
	run_time = get_nsecs(&myts) - start_time;

	/* Tolerate 5% difference on checking */
	if (run_time > 1050000 || run_time < 950000)
		goto redo;
	loops_per_ms=loops_per_msec;
	sleep(1);
	start_time=get_nsecs(&myts);
	microsleep(sleepus);
	run_time=get_nsecs(&myts)-start_time;
	runus=run_time/1000;
}

int main(void){
	int i;
	calibrate_loop();
	printf("starting %d forks\n",forks);
	for(i=1;i
Re: [patch 4/6] Convert PDA into the percpu section
On Saturday 31 March 2007 04:00, Jeremy Fitzhardinge wrote:
> Currently x86 (similar to x86-64) has a special per-cpu structure
> called "i386_pda" which can be easily and efficiently referenced via
> the %fs register. An ELF section is more flexible than a structure,
> allowing any piece of code to use this area. Indeed, such a section
> already exists: the per-cpu area.

Hmm, I'm a little reluctant. This moves i386 more away from x86-64
again. If we ever merge them it would mean more work. Do you really need it?

-Andi
Re: [bug] hung bootup in various drivers, was: "2.6.21-rc5: known regressions"
On Fri, 2007-03-30 at 12:32 -0700, Greg KH wrote:
> On Fri, Mar 30, 2007 at 07:46:19PM +0200, Ingo Molnar wrote:
> >
> > * Greg KH <[EMAIL PROTECTED]> wrote:
> >
> > > > BUG: at drivers/base/driver.c:187 driver_unregister()
> > > >  [] show_trace_log_lvl+0x19/0x2e
> > > >  [] show_trace+0x12/0x14
> > > >  [] dump_stack+0x14/0x16
> > > >  [] driver_unregister+0x3d/0x43
> > > >  [] pci_unregister_driver+0x10/0x5f
> > > >  [] slgt_init+0x9b/0x1ca
> > > >  [] init+0x15d/0x2bd
> > > >  [] kernel_thread_helper+0x7/0x10
> >
> > > Yes, we should allow the ability to call unregister_driver from within
> > > the module_init function.
> > >
> > > But I don't understand what is causing you to see this problem. Who
> > > is holding the reference on the struct device at this point in time?
> > > Is it the fact that userspace has some files open and it hasn't
> > > released them yet?
> >
> > at least in the slgt_init() case the affected codepath is trivial:
> >
> >         if ((rc = pci_register_driver(&pci_driver)) < 0) {
> >                 printk("%s pci_register_driver error=%d\n", driver_name, rc);
> >                 return rc;
> >         }
> >         pci_registered = 1;
> >
> >         if (!slgt_device_list) {
> >                 printk("%s no devices found\n",driver_name);
> >                 pci_unregister_driver(&pci_driver);
> >                 return -ENODEV;
> >
> > slgt_device_list is NULL because no matching PCI ID is on my system (i
> > dont have this hardware), so the ->probe() function did not get called
> > at all.
>
> Sorry, no, I realize how this could happen in the driver, I just don't
> see what in the driver core would be keeping this driver from having
> it's release function called at the unregister() time.
>
> Something has grabbed a reference to the driver...
>
> Oh wait, is this code a module or built into the kernel?
>
> If it's built in, there's still a reference counting bug in the
> module/driver hookup logic as we really don't have a "module" yet we are
> still thinking we do as we represent it in /sys/module and create the
> linkages.
>
> I created some horrible patches to try to track this down, as it was
> reported on lkml (look for "Subject: kref refcounting breakage in mainline" )
> but never got it working correctly.
>
> I bet if you build that code as a module, it will work just fine, can
> you try it?
>
> Kay, did you ever get a chance to look into this reference counting
> issue?

Does the attached work for you?

Thanks,
Kay

diff --git a/include/linux/device.h b/include/linux/device.h
index caad9bb..5cf30e9 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -128,6 +128,7 @@ struct device_driver {
 
 	struct module		* owner;
 	const char		* mod_name;	/* used for built-in modules */
+	struct module_kobject	* mkobj;
 
 	int	(*probe)	(struct device * dev);
 	int	(*remove)	(struct device * dev);
diff --git a/kernel/module.c b/kernel/module.c
index fbc51de..dcdb32b 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2384,8 +2384,13 @@ void module_add_driver(struct module *mo
 
 		/* Lookup built-in module entry in /sys/modules */
 		mkobj = kset_find_obj(&module_subsys.kset, drv->mod_name);
-		if (mkobj)
+		if (mkobj) {
 			mk = container_of(mkobj, struct module_kobject, kobj);
+			/* remember our module structure */
+			drv->mkobj = mk;
+			/* kset_find_obj took a reference */
+			kobject_put(mkobj);
+		}
 	}
 
 	if (!mk)
@@ -2405,17 +2410,22 @@ EXPORT_SYMBOL(module_add_driver);
 
 void module_remove_driver(struct device_driver *drv)
 {
+	struct module_kobject *mk = NULL;
 	char *driver_name;
 
 	if (!drv)
 		return;
 
 	sysfs_remove_link(&drv->kobj, "module");
-	if (drv->owner && drv->owner->mkobj.drivers_dir) {
+
+	if (drv->owner)
+		mk = &drv->owner->mkobj;
+	else if (drv->mkobj)
+		mk = drv->mkobj;
+	if (mk && mk->drivers_dir) {
 		driver_name = make_driver_name(drv);
 		if (driver_name) {
-			sysfs_remove_link(drv->owner->mkobj.drivers_dir,
-					  driver_name);
+			sysfs_remove_link(mk->drivers_dir, driver_name);
 			kfree(driver_name);
 		}
 	}
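The comment "kset_find_obj took a reference" in the patch above reflects a general rule: a lookup that returns a counted object hands the caller one extra reference, and the caller has to drop it when done. A minimal userspace sketch of that rule follows. The names `demo_obj`, `demo_find` and `demo_put` are made up for illustration and are not the kobject API; the plain integer counter is likewise only an assumption for the demo.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical refcounted object; stands in for a kobject. */
struct demo_obj {
	const char *name;
	int refcount;
};

/* Look up by name. On success the caller owns one extra reference. */
static struct demo_obj *demo_find(struct demo_obj *set, size_t n,
				  const char *name)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(set[i].name, name) == 0) {
			set[i].refcount++;
			return &set[i];
		}
	}
	return NULL;
}

/* Drop a reference previously taken by demo_find(). */
static void demo_put(struct demo_obj *obj)
{
	obj->refcount--;
}
```

A caller that forgets `demo_put()` leaves the count one too high forever, which is the kind of imbalance that can keep a release function from ever running.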
Re: [patch 32/37] CRYPTO: api: scatterwalk_copychunks() fails to advance through scatterlist
Herbert Xu wrote:
> Indeed. That patch was buggy. Sorry for not catching this earlier.
>
> This should fix the problem.

Works fine, thanks Herbert.
[-mm3 patch]Warning fix: check the return value of kobject_add etc.
kobject_add, sysfs_create_link and sysfs_create_file are marked
'__must_check', so we must always check their return values, or gcc
will give us warnings.

Signed-off-by: Cong WANG <[EMAIL PROTECTED]>

---
--- fs/partitions/check.c.orig	2007-03-30 21:35:45.000000000 +0800
+++ fs/partitions/check.c	2007-03-30 21:49:53.000000000 +0800
@@ -385,10 +385,16 @@ void add_partition(struct gendisk *disk,
 	p->kobj.parent = &disk->kobj;
 	p->kobj.ktype = &ktype_part;
 	kobject_init(&p->kobj);
-	kobject_add(&p->kobj);
+	if (kobject_add(&p->kobj)) {
+		kfree(p);
+		return;
+	}
 	if (!disk->part_uevent_suppress)
 		kobject_uevent(&p->kobj, KOBJ_ADD);
-	sysfs_create_link(&p->kobj, &block_subsys.kset.kobj, "subsystem");
+	if (sysfs_create_link(&p->kobj, &block_subsys.kset.kobj, "subsystem")) {
+		kfree(p);
+		return;
+	}
 	if (flags & ADDPART_FLAG_WHOLEDISK) {
 		static struct attribute addpartattr = {
 			.name = "whole_disk",
@@ -396,7 +402,10 @@ void add_partition(struct gendisk *disk,
 			.owner = THIS_MODULE,
 		};
 
-		sysfs_create_file(&p->kobj, &addpartattr);
+		if (sysfs_create_file(&p->kobj, &addpartattr)) {
+			kfree(p);
+			return;
+		}
 	}
 	partition_sysfs_add_subdir(p);
 	disk->part[part-1] = p;
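For readers unfamiliar with the annotation: in the kernel, `__must_check` maps to GCC's `warn_unused_result` attribute, which is why ignoring these return values produces warnings. The userspace sketch below shows the general check-and-unwind pattern under that attribute. All `demo_*` names are hypothetical stand-ins, not kernel functions, and undoing the earlier step on a later failure is an assumption about what fuller cleanup could look like, not a description of the posted patch.

```c
#include <errno.h>

/* Userspace stand-in for the kernel's __must_check annotation. */
#define must_check __attribute__((warn_unused_result))

/* Hypothetical object with two setup steps. */
struct demo_part {
	int registered;
	int linked;
};

/* Step 1: register; fails with -ENOMEM when asked to. */
static must_check int demo_add(struct demo_part *p, int fail)
{
	if (fail)
		return -ENOMEM;
	p->registered = 1;
	return 0;
}

/* Undo step 1. */
static void demo_del(struct demo_part *p)
{
	p->registered = 0;
}

/* Step 2: create a link; fails with -EIO when asked to. */
static must_check int demo_link(struct demo_part *p, int fail)
{
	if (fail)
		return -EIO;
	p->linked = 1;
	return 0;
}

/* Check every result; unwind step 1 if step 2 fails. */
static int demo_add_partition(struct demo_part *p, int fail_add, int fail_link)
{
	int err;

	err = demo_add(p, fail_add);
	if (err)
		return err;

	err = demo_link(p, fail_link);
	if (err) {
		demo_del(p);
		return err;
	}
	return 0;
}
```

Calling `demo_add(p, 0);` as a bare statement would trigger the same class of warning the patch silences.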
Re: [patch 1/13] signal/timer/event fds v8 - anonymous inode source ...
On Fri, 30 Mar 2007, Andrew Morton wrote:
> >
> > Ok, it was panicking, and someone made me change it. Would you please
> > agree?
> > The system can survive w/out, but it'll be a broken system WRT userspace.
>
> I'd say panic. There's not much point in limping along with an
> incorrectly-working kernel, only to have some small number of apps fail
> mysteriously later on.

Well, in this case (since it's at bootup only), I'd agree with panic(),
but generally I disagree - it's actually much better to have a broken
system limping along and allowing things like syslogd to write the
problem to log-files and generally working as well as possible.

If people can do a "dmesg" and send it out as an email, we're much more
likely to get good bug-reports.

But for early boot, and for something that can't really happen anyway,
panic() sounds fine.

		Linus
Re: [patch 32/37] CRYPTO: api: scatterwalk_copychunks() fails to advance through scatterlist
On Sat, Mar 31, 2007 at 03:41:32AM +0200, Patrick McHardy wrote:
>
> > [CRYPTO] api: scatterwalk_copychunks() fails to advance through scatterlist
>
> This patch seems to cause some problems, I get reproducable freezes
> on the receiving system with net-2.6.22 when sending IPsec packets
> larger than the mtu (reproduced about 10 times). Reverting this
> patch seems to fix it. In a few cases the oops also occured on the
> sending system.
>
> Backtrace from UML (sending system):
>
> uml:~# ping 10.0.0.1 -s 20000
> PING 10.0.0.1 (10.0.0.1) 20000(20028) bytes of data.
> BUG: soft lockup detected on CPU#0!
> Call Trace:

Indeed. That patch was buggy. Sorry for not catching this earlier.

This should fix the problem.

[CRYPTO] api: Use the right value when advancing scatterwalk_copychunks

In the scatterwalk_copychunks loop, We should be advancing by
len_this_page and not nbytes. The latter is the total length.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/crypto/scatterwalk.c b/crypto/scatterwalk.c
index a664231..0f76175 100644
--- a/crypto/scatterwalk.c
+++ b/crypto/scatterwalk.c
@@ -91,7 +91,7 @@ void scatterwalk_copychunks(void *buf, struct scatter_walk *walk,
 		memcpy_dir(buf, vaddr, len_this_page, out);
 		scatterwalk_unmap(vaddr, out);
 
-		scatterwalk_advance(walk, nbytes);
+		scatterwalk_advance(walk, len_this_page);
 
 		if (nbytes == len_this_page)
 			break;
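The one-line fix is easier to see in a stripped-down model. The sketch below is a userspace simplification, not the kernel code: it copies a request in page-sized chunks and advances the walk by the chunk just copied (`len_this_page`) rather than by the total still outstanding (`nbytes`). The `demo_*` names and the tiny 8-byte "page" are assumptions chosen so a small buffer crosses several page boundaries.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE 8	/* tiny "page" so a short copy spans several pages */

/* Hypothetical walk state: a flat source buffer and a position in it. */
struct demo_walk {
	const unsigned char *base;
	size_t offset;
};

/* Bytes left before the next page boundary. */
static size_t demo_pagelen(const struct demo_walk *w)
{
	return DEMO_PAGE - (w->offset % DEMO_PAGE);
}

static void demo_advance(struct demo_walk *w, size_t n)
{
	w->offset += n;
}

/* Copy nbytes out of the walk, one page-bounded chunk at a time. */
static void demo_copychunks(unsigned char *buf, struct demo_walk *w,
			    size_t nbytes)
{
	while (nbytes) {
		size_t len_this_page = demo_pagelen(w);

		if (len_this_page > nbytes)
			len_this_page = nbytes;

		memcpy(buf, w->base + w->offset, len_this_page);

		/* Advance by the chunk copied, not by the total length. */
		demo_advance(w, len_this_page);

		buf += len_this_page;
		nbytes -= len_this_page;
	}
}
```

With `demo_advance(w, nbytes)` instead, the second chunk would be read from the wrong offset and the walk would end up past the data it was meant to cover.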
Re: strange high system cpu usage.
Lee,

Thanks for your help.

In testing different kernels we found that using an unpatched kernel
from kernel.org seems to fix the problem. I'm assuming that a patch
added in the gentoo-sources patch set was creating the problem. Our once
8 minute untar is now down to 7-8 seconds with a vanilla 2.6.18.6
kernel.

If anyone is interested in our oprofile code or other info, just ask and
I'll post it. Otherwise I'll be reporting this to the gentoo developers.

-E

> - Original Message -
> From: "Elliott Johnson" <[EMAIL PROTECTED]>
> To: linux-kernel@vger.kernel.org
> Subject: Re: strange high system cpu usage.
> Date: Fri, 30 Mar 2007 11:54:57 +0800
>
> > What problem are you trying to solve? IOW, how do you know it's not
> > just an artifact of different load average calculation between 2.4 and
> > 2.6?
> >
> > Are you actually seeing reduced throughput/performance? Or are you
> > just looking at load average?
> >
> > Lee
>
> Well the problem is apparent, we are having abnormally high cpu usage.
> It's about a 20-40% performance hit.
>
> The load calculations were not between 2.4 and 2.6 kernel versions, but
> between 2.6.8 and 2.6.19. Sorry if this wasn't very clear from my last
> email.
>
> In trying to diagnose the problem I also looked at memory stats (vmstat)
> and found the 'buffered' memory statistic way off from the comparable
> debian (2.6.8) install (0-300kb versus 500mb).
>
> The vmstat man page has little information on this statistic and there
> seems to be varying explanations on the web. I was hoping for a decisive
> explanation (or link) and possibly advice in toggling this value (or
> reasons not to).
>
> I'm still trying to work on this at my end. Some recent tests show that
> it might be related to the megasas driver or the large number of small
> files we are using on a xfs formatted 10T array. I'll keep at it.
>
> Thanks for your response,
>
> -Elliott
Re: [PATCH 9/9] clocksource: refactor duplicate registration checking
On Fri, 30 Mar 2007, Daniel Walker wrote:

> /**
>  * clocksource_register - Used to install new clocksources
>  * @t: clocksource to be registered
>  *
> - * Returns -EBUSY if registration fails, zero otherwise.
> + * Always returns zero.
>  */
> int clocksource_register(struct clocksource *c)

Return should be void, then.

- James
--
James Morris <[EMAIL PROTECTED]>
Re: [patch 32/37] CRYPTO: api: scatterwalk_copychunks() fails to advance through scatterlist
Greg KH wrote:
> -stable review patch. If anyone has any objections, please let us know.
>
> --
> From: J. Bruce Fields <[EMAIL PROTECTED]>
>
> [CRYPTO] api: scatterwalk_copychunks() fails to advance through scatterlist

This patch seems to cause some problems, I get reproducable freezes
on the receiving system with net-2.6.22 when sending IPsec packets
larger than the mtu (reproduced about 10 times). Reverting this
patch seems to fix it. In a few cases the oops also occured on the
sending system.

Backtrace from UML (sending system):

uml:~# ping 10.0.0.1 -s 20000
PING 10.0.0.1 (10.0.0.1) 20000(20028) bytes of data.
BUG: soft lockup detected on CPU#0!
Call Trace:
61787408: [<602b346f>] _spin_lock+0x9/0xb
61787418: [<6004f7b7>] softlockup_tick+0xa1/0xaf
61787438: [<6003c9d3>] run_local_timers+0x13/0x15
61787448: [<6003c7e8>] update_process_times+0x49/0x73
61787478: [<6001926e>] timer_handler+0x21/0x4f
617874a8: [<60029327>] sig_handler_common_skas+0xff/0x118
617874e8: [<6002625f>] real_alarm_handler+0x37/0x3b
61787508: [<600262b6>] alarm_handler+0x53/0x63
61787538: [<60027e65>] hard_handler+0x15/0x18
617875f8: [<6015bfd9>] scatterwalk_copychunks+0x6d/0xb4
617876d8: [<6001adda>] maybe_map+0x32/0x9f
61787728: [<6015d332>] blkcipher_walk_next+0x11d/0x30f
61787738: [<6006b58c>] poison_obj+0x27/0x32
61787740: [<6015d332>] blkcipher_walk_next+0x11d/0x30f
61787758: [<6006cc92>] cache_alloc_debugcheck_after+0xe5/0x12e
61787780: [<6015bfbf>] scatterwalk_copychunks+0x53/0xb4
61787788: [<6006d14e>] __kmalloc+0xb7/0xc4
617877c8: [<6015d3b6>] blkcipher_walk_next+0x1a1/0x30f
61787828: [<6015d186>] blkcipher_walk_done+0x12e/0x1bd
61787838: [<6002dae3>] aes_encrypt+0x0/0xb
61787850: [<601643d8>] xor_128+0x0/0x1c
61787878: [<6016416d>] crypto_cbc_encrypt+0x7a/0x8b
61787918: [<60244183>] esp_output+0x32b/0x44c
61787948: [<602b34dc>] _spin_unlock_bh+0x12/0x14
617879c8: [<60257051>] xfrm4_output_one+0xaa/0x16a
61787a08: [<60257234>] xfrm4_output_finish2+0x123/0x131
61787a28: [<6025728f>] xfrm4_output_finish+0x3d/0xb9
61787a58: [<60257366>] xfrm4_output+0x5b/0x5d
61787a78: [<602183a9>] ip_push_pending_frames+0x374/0x442
61787ac8: [<6023008d>] raw_sendmsg+0x2d0/0x396
61787b78: [<60237edd>] inet_sendmsg+0x46/0x53
61787ba8: [<601bb5ca>] sock_sendmsg+0xea/0x103
61787c18: [<600473b9>] autoremove_wake_function+0x0/0x39
61787c38: [<600192d3>] add_mmap+0x37/0x149
61787c98: [<6001b10d>] buffer_op+0x2e/0x5f
61787cd8: [<6001b1d5>] copy_from_user_skas+0x7a/0x7c
61787d08: [<601c2747>] verify_iovec+0x4f/0x90
61787d38: [<601bcc32>] sys_sendmsg+0x172/0x1db
61787d68: [<602b34b5>] _spin_unlock_irqrestore+0x18/0x1d
61787d88: [<601924bf>] __up_read+0x76/0x7f
61787db8: [<60049ae1>] up_read+0x9/0xb
61787dc8: [<60019d78>] handle_page_fault+0x1f4/0x224
61787e28: [<60019f29>] segv+0xa7/0x27e
61787ef8: [<6001ab91>] handle_syscall+0x65/0x80
61787f08: [<60019e7c>] segv_handler+0x68/0x6e
61787f28: [<600287ab>] handle_trap+0xd0/0xdb
61787f68: [<60028c2d>] userspace+0x139/0x181
61787fc8: [<6001a8ba>] fork_handler+0x86/0x8d
Re: exposing FSB clock speed in /sys
Stephane Eranian <[EMAIL PROTECTED]> writes:

> It seems that the kernel does not expose the Front-Side Bus (FSB) Clock
> speed to user applications.

You mean the APIC timer frequency which happens to match the FSB on some
CPUs?

> Knowledge of the FSB speed is very useful to monitoring tools. It is used
> to compute certain bus-related metrics.

Can you describe those metrics in detail?

-Andi
Re: [patch 1/13] signal/timer/event fds v8 - anonymous inode source ...
On Fri, 30 Mar 2007, Andrew Morton wrote:

> I'd say panic. There's not much point in limping along with an
> incorrectly-working kernel, only to have some small number of apps fail
> mysteriously later on.

Panic it is ...

> > > Can we make this optional if CONFIG_EMBEDDED? You plan on converting epoll
> > > to use this facility, but with CONFIG_EPOLL=n, this is all dead code?
> >
> > Hmmm, the whole point is that all this stuff works with or without epoll.
> > And epoll need no changes to support this.
>
> I'm suggesting that all known clients of anon_inode be made optional.
> Hence anon_inode can become optional too.
>
> It's a desirable objective, at least. The default, really.

Ok, I'll put them under Kconf.

- Davide
Re: [patch 10/13] signal/timer/event fds v8 - eventfd core ...
On Fri, 30 Mar 2007 18:11:55 -0700 (PDT) Davide Libenzi wrote:

> > > + */
> >
> > So it is the caller's responsibility to ensure that *file refers to an
> > eventfd file?
>
> In which function? I lost you ...
>

eventfd_signal() assumes that the passed in file* refers to an eventfd
file. So if a caller passes in a file* for /etc/passwd, the kernel will
go splat.

I guess that's caveat emptor, and any violations of that will show up
quickly in testing.

My main concern would be that there might be some way for a naughty user
to force the kernel to pass a non-eventfd file* into this function. That
depends upon as-yet-unwritten code - is there a risk of this happening,
and how do we prevent it?

>
> > > +int eventfd_signal(struct file *file, int n)
> > > +{
> > > +	struct eventfd_ctx *ctx = file->private_data;
> > > +	unsigned long flags;
> > > +
> > > +	if (n < 0)
> > > +		return -EINVAL;
> > > +	spin_lock_irqsave(&ctx->lock, flags);
> > > +	if (ULLONG_MAX - ctx->count < n)
> > > +		n = (int) (ULLONG_MAX - ctx->count);
> > > +	ctx->count += n;
> > > +	if (waitqueue_active(&ctx->wqh))
> > > +		wake_up_locked(&ctx->wqh);
> > > +	spin_unlock_irqrestore(&ctx->lock, flags);
> > > +
> > > +	return n;
> > > +}
> > >
> > > +	DECLARE_WAITQUEUE(wait, current);
> > > +
> > > +	if (count < sizeof(ucnt))
> > > +		return -EINVAL;
> > > +	if (get_user(ucnt, (const __u64 __user *) buf))
> > > +		return -EFAULT;
> >
> > Some architectures do not implement 64-bit get_user()
>
> copy_from_user it is, then ...
>

spose so. I think architectures _should_ implement 64-bit get_user() and
put_user() nowadays. So you could leave the code as-is and inform the
arch maintainers, if you're feeling keen.

If all this code has its own Kconfig options then the architectures
won't break until their maintainers come along to enable the new
features, so they'll implement 64-bit get_user() at that time and things
will all unfold in a nicely non-chaotic fashion.
Re: drivers/video/aty/atyfb_base.c: array overruns
On Mon, 2007-03-19 at 10:22 +0100, Adrian Bunk wrote:
> The Coverity checker spotted the following two array overruns in
> drivers/video/aty/atyfb_base.c:
>
> <-- snip -->
>
> ...
> static const u32 lt_lcd_regs[] = {
>	CONFIG_PANEL_LG,
>	LCD_GEN_CNTL_LG,
>	DSTN_CONTROL_LG,
>	HFB_PITCH_ADDR_LG,
>	HORZ_STRETCHING_LG,
>	VERT_STRETCHING_LG,
>	0, /* EXT_VERT_STRETCH */
>	LT_GIO_LG,
>	POWER_MANAGEMENT_LG
> };

We can pad this array with zeroes, as a stop-gap measure. Ville, what do
you think?

Tony

>
> void aty_st_lcd(int index, u32 val, const struct atyfb_par *par)
> {
>	if (M64_HAS(LT_LCD_REGS)) {
>		aty_st_le32(lt_lcd_regs[index], val, par);
> ...
> }
> ...
> u32 aty_ld_lcd(int index, const struct atyfb_par *par)
> {
>	if (M64_HAS(LT_LCD_REGS)) {
>		return aty_ld_le32(lt_lcd_regs[index], par);
> ...
> }
> ...
> static int aty_bl_update_status(struct backlight_device *bd)
> {
>	struct atyfb_par *par = class_get_devdata(&bd->class_dev);
>	unsigned int reg = aty_ld_lcd(LCD_MISC_CNTL, par);
> ...
>	aty_st_lcd(LCD_MISC_CNTL, reg, par);
>
>	return 0;
> }
> ...
>
> <-- snip -->
>
> LCD_MISC_CNTL = 0x14 = 20 > 8
>
> cu
> Adrian
>
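The overrun reported above comes from indexing a 9-entry table with 0x14. Padding the table is the stop-gap suggested in the thread; a different option, shown here only as an illustration, is to reject any index that has no entry. In this sketch the table contents are placeholder values, not the real Mach64 register offsets, and `demo_lcd_reg` is a hypothetical helper rather than anything in atyfb.

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder offsets for a 9-entry table (indices 0..8). */
static const uint32_t demo_lcd_regs[] = {
	0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0, 0x17, 0x18
};

/*
 * Store the offset for index in *reg and return 0,
 * or return -1 if the table has no entry for that index.
 */
static int demo_lcd_reg(size_t index, uint32_t *reg)
{
	if (index >= ARRAY_SIZE(demo_lcd_regs))
		return -1;
	*reg = demo_lcd_regs[index];
	return 0;
}
```

With a check like this, a lookup for index 0x14 fails cleanly instead of reading past the end of the array.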
Re: [uml-devel] [patch 06/37] UML - Fix static linking
On Friday 30 March 2007, Greg KH wrote:
> -stable review patch. If anyone has any objections, please let us know.

I have one objection, the fix has a typo! This is the additional fix
(note '.note' instead of 'note'):

--- linux-2.6.git.orig/include/asm-um/common.lds.S
+++ linux-2.6.git/include/asm-um/common.lds.S
@@ -15,7 +15,7 @@
   PROVIDE (_unprotected_end = .);
 
   . = ALIGN(4096);
-  .note : { *(note.*) }
+  .note : { *(.note.*) }
   __start___ex_table = .;
   __ex_table : { *(__ex_table) }
   __stop___ex_table = .;

With this, the fix should be merged - I just re-hit this bug and
rechecked everything, now it's ok.

> --
> From: Jeff Dike <[EMAIL PROTECTED]>
>
> During a static link, ld has started putting a .note section in the
> .uml.setup.init section. This has the result that the UML setups
> begin with 32 bytes of garbage and UML crashes immediately on boot.
>
> This patch creates a specific .note section for ld to drop this stuff
> into.
>
> Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>
>
> ---
>  include/asm-um/common.lds.S |    1 +
>  1 file changed, 1 insertion(+)
>
> --- a/include/asm-um/common.lds.S
> +++ b/include/asm-um/common.lds.S
> @@ -15,6 +15,7 @@
>    PROVIDE (_unprotected_end = .);
>
>    . = ALIGN(4096);
> +  .note : { *(note.*) }
>    __start___ex_table = .;
>    __ex_table : { *(__ex_table) }
>    __stop___ex_table = .;

--
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
[PATCH] [PATCH] uml: fix static linking for real
There was a typo in commit 7632fc8f809a97f9d82ce125e8e3e579390ce2e5,
preventing it from working - 32bit binaries crashed hopelessly before
the below fix and work perfectly now.

Merge for 2.6.21, please.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]>
---

 include/asm-um/common.lds.S |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/asm-um/common.lds.S b/include/asm-um/common.lds.S
index b16222b..f5de80c 100644
--- a/include/asm-um/common.lds.S
+++ b/include/asm-um/common.lds.S
@@ -15,7 +15,7 @@
   PROVIDE (_unprotected_end = .);
 
   . = ALIGN(4096);
-  .note : { *(note.*) }
+  .note : { *(.note.*) }
   __start___ex_table = .;
   __ex_table : { *(__ex_table) }
   __stop___ex_table = .;
Re: [patch 10/13] signal/timer/event fds v8 - eventfd core ...
On Fri, 30 Mar 2007, Davide Libenzi wrote: > > Some architectures do not implement 64-bit get_user() > > copy_from_user it is, then ... That's messed up though. We do have put_user and we miss get_user. Bah... - Davide
Re: [patch 10/13] signal/timer/event fds v8 - eventfd core ...
On Fri, 30 Mar 2007, Andrew Morton wrote: > > +struct eventfd_ctx { > > + spinlock_t lock; > > + wait_queue_head_t wqh; > > + __u64 count; > > +}; > > Again, can we borrow wqh.lock? > > `count' needs documentation - these things are key to understanding the > code. Added. > > + */ > > So it is the caller's responsibility to ensure that *file refers to an > eventfd file? In which function? I lost you ... > > +int eventfd_signal(struct file *file, int n) > > +{ > > + struct eventfd_ctx *ctx = file->private_data; > > + unsigned long flags; > > + > > + if (n < 0) > > + return -EINVAL; > > + spin_lock_irqsave(&ctx->lock, flags); > > + if (ULLONG_MAX - ctx->count < n) > > + n = (int) (ULLONG_MAX - ctx->count); > > + ctx->count += n; > > + if (waitqueue_active(&ctx->wqh)) > > + wake_up_locked(&ctx->wqh); > > + spin_unlock_irqrestore(&ctx->lock, flags); > > + > > + return n; > > +} > > Neither the incoming arg (usefully named "n") nor the return value are > documented. Documented now. > Needs interface documentation, please. Even the changelog doesn't tell us > what an EAGAIN return from read() means. I'll be adding the errno documentation to all of them. > > +static ssize_t eventfd_write(struct file *file, const char __user *buf, > > size_t count, > > +loff_t *ppos) > > +{ > > + struct eventfd_ctx *ctx = file->private_data; > > + ssize_t res; > > + __u64 ucnt; > > + DECLARE_WAITQUEUE(wait, current); > > + > > + if (count < sizeof(ucnt)) > > + return -EINVAL; > > + if (get_user(ucnt, (const __u64 __user *) buf)) > > + return -EFAULT; > > Some architectures do not implement 64-bit get_user() copy_from_user it is, then ... - Davide - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
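The clamping done in the quoted eventfd_signal() can be tried outside the kernel. What follows is a minimal userspace sketch, not the patch's code: counter_add_saturating is a made-up name, the spinlock and the wait-queue wakeup are left out, and only the arithmetic on the counter is kept.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical userspace stand-in for the counter update in the quoted
 * eventfd_signal(): add n to *count without ever wrapping past ULLONG_MAX.
 * Returns how much was actually added, or -EINVAL for a negative n.
 */
int counter_add_saturating(unsigned long long *count, int n)
{
	unsigned long long room = ULLONG_MAX - *count;

	if (n < 0)
		return -EINVAL;
	if (room < (unsigned long long)n)
		n = (int)room;	/* room < n <= INT_MAX here, so the cast fits */
	assert((unsigned long long)n <= room);
	*count += (unsigned long long)n;
	return n;
}
```

A caller that sees a return value smaller than the n it passed in knows the counter has hit its ceiling, which is presumably the kind of thing the requested documentation of "n" and of the return value would spell out.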
Re: [patch 6/13] signal/timer/event fds v8 - timerfd core ...
On Fri, 30 Mar 2007 17:47:28 -0700 (PDT) Davide Libenzi wrote: > On Fri, 30 Mar 2007, Andrew Morton wrote: > > > > +struct timerfd_ctx { > > > + struct hrtimer tmr; > > > + ktime_t tintv; > > > + spinlock_t lock; > > > + wait_queue_head_t wqh; > > > + unsigned long ticks; > > > +}; > > > > Did you consider using the (presently unused) lock inside wqh instead of > > adding a new one? That's a little bit rude, poking into waitqueue > > internals like that, but we do it elsewhere and tricks like that are > > acceptable in core-kernel, I guess. > > Please, no. Gain is not worth the plug into the structure design IMO. > The decision is not that obvious - your patch's main use of timerfd_ctx.lock is to provide locking for wqh - ie: to duplicate the function of the existing lock which is there for that purpose. So I think it's a legitimate optimisation to borrow it. > > > > +static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr) > > > +{ > > > + struct timerfd_ctx *ctx = container_of(htmr, struct timerfd_ctx, tmr); > > > + enum hrtimer_restart rval = HRTIMER_NORESTART; > > > + unsigned long flags; > > > + > > > + spin_lock_irqsave(&ctx->lock, flags); > > > + ctx->ticks++; > > > + wake_up_locked(&ctx->wqh); > > > + if (ctx->tintv.tv64 != 0) { > > > + hrtimer_forward(htmr, hrtimer_cb_get_time(htmr), ctx->tintv); > > > + rval = HRTIMER_RESTART; > > > + } > > > + spin_unlock_irqrestore(&ctx->lock, flags); > > > + > > > + return rval; > > > +} > > > > What's this do? > > Really, do we need to comment such trivial code? There is *nothing* that > is worth a line of comment in there. IMO useless comment are more annoying > than blank lines. > Look at it from the point of view of someone who knows kernel code but does not specifically know this subsystem. That describes the great majority of people who will be reading your code. 
> > > > > +static void timerfd_setup(struct timerfd_ctx *ctx, int clockid, int > > > flags, > > > + const struct itimerspec *ktmr) > > > +{ > > > + enum hrtimer_mode htmode; > > > + ktime_t texp; > > > + > > > + htmode = (flags & TFD_TIMER_ABSTIME) ? HRTIMER_MODE_ABS: > > > HRTIMER_MODE_REL; > > > + > > > + texp = timespec_to_ktime(ktmr->it_value); > > > + ctx->ticks = 0; > > > + ctx->tintv = timespec_to_ktime(ktmr->it_interval); > > > + hrtimer_init(&ctx->tmr, clockid, htmode); > > > + ctx->tmr.expires = texp; > > > + ctx->tmr.function = timerfd_tmrproc; > > > + if (texp.tv64 != 0) > > > + hrtimer_start(&ctx->tmr, texp, htmode); > > > +} > > > > What does the special case texp.tv64 == 0 signify? Is that obvious to > > anyone who understands hrtimers? Is it something which we can expect > > Michael to immediately understand? Should it be documented somewhere? > > Michael should not read the code, but the patch description that comes > with it ;) > To some extent, yes - there's a lot of material which is relevant to a complex system call like this which isn't appropriate to code comments. But a description of the role of texp.tv64 in here is an aid to understanding the implementation and hence is appropriate and needed. > > > > +asmlinkage long sys_timerfd(int ufd, int clockid, int flags, > > > + const struct itimerspec __user *utmr) > > > > Somehow we need to get from this to a manpage. > > Again, the patch description describes (modulo returned errno's) the API > pretty well. > A basic description of the inputs, outputs and return value is appropriate to most high-level kernel functions. One here won't hurt. > > > > OK, this is briefly documented in the patch changelog. That interface > > documentation should be fleshed out and moved into the .c file. a) because > > it is easier to find and b) if we change it, it's a bit hard to go back and > > alter that changelog! > > I think it's better to leave it out of the code, and keep it in the patch > header. 
> Patch headers are not maintainable. Nobody wants to have to go off and wade through the git repo to understand the design intent behind each function. Look, I'm just providing feedback as an experienced kernel developer who is reading your code for the first time. I had questions, and I saw things which I felt were not adequately communicated. You are the last person who can judge what is obvious and what is not, because you already understand it! I do err on the make-it-easy-for-them side, but that's not a bad thing, I think. Very large numbers of people read core kernel code and the actual change rate of this code will be low. So we can afford to put the effort into making these people's code-reading as productive as we can.
Re: [patch 6/13] signal/timer/event fds v8 - timerfd core ...
On Fri, 30 Mar 2007, Andrew Morton wrote: > > +struct timerfd_ctx { > > + struct hrtimer tmr; > > + ktime_t tintv; > > + spinlock_t lock; > > + wait_queue_head_t wqh; > > + unsigned long ticks; > > +}; > > Did you consider using the (presently unused) lock inside wqh instead of > adding a new one? That's a little bit rude, poking into waitqueue > internals like that, but we do it elsewhere and tricks like that are > acceptable in core-kernel, I guess. Please, no. Gain is not worth the plug into the structure design IMO. > I find that the key to understanding kernel code is to understand the data > structures and the relationships between them. Once you have that in your > head, the code tends to just fall out. Hence there is good maintainability > payoff in putting work into documenting the struct, its fields, the > relationship between this struct and other structs, and any and all locking > requirements. > > Seemed obvious to me, but comment added. > > +static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr); > > +static void timerfd_setup(struct timerfd_ctx *ctx, int clockid, int flags, > > + const struct itimerspec *ktmr); > > +static int timerfd_close(struct inode *inode, struct file *file); > > +static unsigned int timerfd_poll(struct file *file, poll_table *wait); > > +static ssize_t timerfd_read(struct file *file, char __user *buf, size_t > > count, > > + loff_t *ppos); > > It'd be nice to find a way to make these declarations go away. Gone. > > > + > > + > > + > > blankness. You blank freak! :) > > +static const struct file_operations timerfd_fops = { > > + .release= timerfd_close, > > Rename to timerfd_release Done. 
> > +static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr) > > +{ > > + struct timerfd_ctx *ctx = container_of(htmr, struct timerfd_ctx, tmr); > > + enum hrtimer_restart rval = HRTIMER_NORESTART; > > + unsigned long flags; > > + > > + spin_lock_irqsave(&ctx->lock, flags); > > + ctx->ticks++; > > + wake_up_locked(&ctx->wqh); > > + if (ctx->tintv.tv64 != 0) { > > + hrtimer_forward(htmr, hrtimer_cb_get_time(htmr), ctx->tintv); > > + rval = HRTIMER_RESTART; > > + } > > + spin_unlock_irqrestore(&ctx->lock, flags); > > + > > + return rval; > > +} > > What's this do? Really, do we need to comment such trivial code? There is *nothing* that is worth a line of comment in there. IMO useless comment are more annoying than blank lines. > > +static void timerfd_setup(struct timerfd_ctx *ctx, int clockid, int flags, > > + const struct itimerspec *ktmr) > > +{ > > + enum hrtimer_mode htmode; > > + ktime_t texp; > > + > > + htmode = (flags & TFD_TIMER_ABSTIME) ? HRTIMER_MODE_ABS: > > HRTIMER_MODE_REL; > > + > > + texp = timespec_to_ktime(ktmr->it_value); > > + ctx->ticks = 0; > > + ctx->tintv = timespec_to_ktime(ktmr->it_interval); > > + hrtimer_init(&ctx->tmr, clockid, htmode); > > + ctx->tmr.expires = texp; > > + ctx->tmr.function = timerfd_tmrproc; > > + if (texp.tv64 != 0) > > + hrtimer_start(&ctx->tmr, texp, htmode); > > +} > > What does the special case texp.tv64 == 0 signify? Is that obvious to > anyone who understands hrtimers? Is it something which we can expect > Micheal to immediately understand? Should it be documented somewhere? Michael should not read the code, but the patch description that comes with it ;) > > +asmlinkage long sys_timerfd(int ufd, int clockid, int flags, > > + const struct itimerspec __user *utmr) > > Somehow we need to get from this to a manpage. Again, the patch description describes (modulo returned errno's) the API pretty well. > OK, this is briefly documented in the patch changelog. 
That interface > documentation should be fleshed out and moved into the .c file. a) because > it is easier to find and b) if we change it, it's a bit hard to go back and > alter that changelog! I think it's better to leave it out of the code, and keep it in the patch header. > How come it's OK to truncate 64-bit timerfd_ctx.ticks to 32-bit like this? 2^32 ticks should be fine. I could make it a 64 bit thing, but IMO 32 bit is OK. - Davide - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
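On the question of what a zero it_value means: in the timerfd interface that Linux later shipped (timerfd_create() and timerfd_settime(), which is not the sys_timerfd() signature in this patch), an all-zero it_value leaves the timer disarmed. The sketch below assumes a Linux system with glibc's <sys/timerfd.h>; the two function names are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Returns 1 if a timer set with it_value == 0 has nothing to read,
 * 0 if it unexpectedly fired, -1 on error.
 */
int zero_value_leaves_timer_disarmed(void)
{
	struct itimerspec its = { .it_value = { 0, 0 }, .it_interval = { 0, 0 } };
	uint64_t ticks;
	ssize_t n;
	int fd, disarmed;

	fd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);
	if (fd < 0)
		return -1;
	if (timerfd_settime(fd, 0, &its, NULL) < 0) {
		close(fd);
		return -1;
	}
	n = read(fd, &ticks, sizeof(ticks));	/* non-blocking: EAGAIN if idle */
	disarmed = (n < 0 && errno == EAGAIN);
	close(fd);
	return disarmed;
}

/*
 * Arms a one-shot relative timer and blocks until it fires.
 * Returns the expiration count read from the fd, or 0 on error.
 */
uint64_t one_shot_ticks(long delay_ns)
{
	struct itimerspec its = { .it_value = { .tv_sec = 0, .tv_nsec = delay_ns } };
	uint64_t ticks = 0;
	int fd;

	/* A zero delay would disarm the timer and the read would never return. */
	assert(delay_ns > 0 && delay_ns < 1000000000L);
	fd = timerfd_create(CLOCK_MONOTONIC, 0);
	if (fd < 0)
		return 0;
	if (timerfd_settime(fd, 0, &its, NULL) < 0 ||
	    read(fd, &ticks, sizeof(ticks)) != (ssize_t)sizeof(ticks))
		ticks = 0;
	close(fd);
	return ticks;
}
```

The value read back is a 64-bit expiration count in that later interface, so the 32-bit truncation discussed above is specific to this version of the patch.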
Re: Intel DP965LT Mainboard running?
Grant Coady wrote: On Sat, 31 Mar 2007 00:31:38 +0200, Oliver Joa <[EMAIL PROTECTED]> wrote: Hi, does anyone have a running Intel DP965LT Mainboard? I can not get this Board running. You can see the Problems in the Thread "Corrupt XFS-Filesystems on new Hardware and Kernel". Please can you give me a running Kernel-Config? http://bugsplatter.mine.nu/system/dp965lt.html some notes and gotchas http://bugsplatter.mine.nu/test/boxen/silly/configs and dmesgs I've only had reiserfs and ext3 going, not XFS. that page mentions that the onboard NIC has problems linking at 100mbit. Have you tried debugging the issue with us? If you can, open up a mail to [EMAIL PROTECTED] or file a ticket at e1000.sf.net?! Cheers, Auke - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2/13] signal/timer/event fds v7 - signalfd core ...
On Fri, 30 Mar 2007, Andrew Morton wrote: > General comments: > > - All these patches will be considered a 100% regression by the > linux-on-a-cellphone people. What do we have to do to make all of this > stuff Kconfigurable? I guess we can, yes. > - All this code is moving us toward being able to unify all asynchronous > event handling under epoll, yes? > > If so, it is a competitor to kevent, only it's coming from the other > direction. > > I personally find it an attractive competitor, because it is much more > incremental and is easier from a design POV. But what are its > shortcomings wrt kevent? Do we have a feel for what the performance > difference will be? > > Which other kernel subsystems need to be wired up for this approach to > reach the same level of capability as kevent? Those patches are not tied to any one interface, that's the whole point of it. You can use it with POSIX select/poll if you want. Epoll was there, and was already covering the huge set of pollable devices. Timers, signals and event fds complement this set (and you don't need epoll to use them). The KAIO notification coming to an eventfd (last patch of the series - like 30 lines), allows you to listen for KAIO readiness on an event fd (hence using either select/poll/epoll). > - Some poor schmuck needs to document all this stuff. Other poor schmucks > need to program to it, and to develop libraries which talk to it, etc. > Other schmucks need to understand and maintain it. I judge the code and > the patches to be inadequately documented. Well, many Linux man pages are smaller than the API description that comes with those patches ;) > Apart from general code commentary, which I will point out at the > relevant sites, I wonder about things like: > > - What are the sharing semantics? > > - Across dup(), dup2() and fork()? They work. But you'd still be "listening" to the sighand that created the signalfd, until that sighand gets detached. 
After that you read(2) zero bytes, which tells you that the "remote disconnected" ;) > - If two !CLONE_SIGHAND, CLONE_FS threads are sharing a signalfd and one > alters its signal mask? The "mask" is private to the file*, and does not alter the sighand one. A thread can fetch another thread's signals if you like. > - If two processes are sharing a signalfd across fork() and one > alters its signal mask or something? Which signal mask? The process one or the signalfd one? > - What are the effects upon the signalfd if the process alters its > signal state? As I said, the signalfd and the process compete over dequeue_signal for the signal fetch. You get a given signal once, either on the fd or with std async delivery. If you want to be sure to always get it on the fd, you need to block it. > - What happens if a task has multiple signalfds open? Does one > signal get delivered to all of the fds? Signalfds compete over dequeue_signal(), so only one of them will get it. > IMO all combinations and permutations should be documented for > posterity and it should be done now so we can review this design. Ok. > > +static int signalfd_lock(struct signalfd_ctx *ctx, struct signalfd_lockctx > > *lk); > > +static void signalfd_unlock(struct signalfd_lockctx *lk); > > +static void signalfd_cleanup(struct signalfd_ctx *ctx); > > +static int signalfd_close(struct inode *inode, struct file *file); > > +static unsigned int signalfd_poll(struct file *file, poll_table *wait); > > +static int signalfd_copyinfo(struct signalfd_siginfo __user *uinfo, > > +siginfo_t const *kinfo); > > `const siginfo_t *', please. > > I dunno, I find all these forward declarations to be a fugly waste of > space, and a maintenance hassle. I think a lot of them can be made to go > away with some very simple code reorganisations. Done. 
> > +static ssize_t signalfd_read(struct file *file, char __user *buf, size_t > > count, > > +loff_t *ppos); > > + > > + > > + > > +static const struct file_operations signalfd_fops = { > > + .release= signalfd_close, > > Please rename signalfd_close to signalfd_release. Done. > > +static int signalfd_lock(struct signalfd_ctx *ctx, struct signalfd_lockctx > > *lk) > > +{ > > + struct sighand_struct *sighand = NULL; > > + > > + rcu_read_lock(); > > + lk->tsk = rcu_dereference(ctx->tsk); > > + if (likely(lk->tsk != NULL)) > > + sighand = lock_task_sighand(lk->tsk, &lk->flags); > > + rcu_read_unlock(); > > + > > + if (sighand && !ctx->tsk) { > > + unlock_task_sighand(lk->tsk, &lk->flags); > > + sighand = NULL; > > + } > > + > > + return sighand != NULL; > > +} > > This function needs documentation - it really is quite obscure. What does > its return value mean? Why does it sometimes do lock_task_sighand() and > sometimes does not? I assume that it's handling exitted
Re: [1/4] 2.6.21-rc5: known regressions (v2)
On Fri, Mar 30, 2007 at 11:32:09PM +0200, Adrian Bunk wrote: > > Subject: kernels fail to boot with drives on ATIIXP controller > (ACPI/IRQ related) > References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229621 > http://lkml.org/lkml/2007/3/4/257 > Submitter : Michal Jaegermann <[EMAIL PROTECTED]> > Status : unknown I now have an even better one with pata_via. A kernel which for all practical purposes is 2.6.21-rc5 not only refuses to boot (and I cannot find any combination of options which would allow me to do so anyway) but simply refuses to read _any_ data from the media. This includes the partitioning information. An earlier kernel on the same hardware boots without raising any fuss. Details are collected at https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=234650 Michal
Re: [PATCH 1/4] [SCSI]stex: fix id mapping issue
Ed Lin wrote: The internal id/lun mapping of st_vsc and st_vsc1 controllers is different from st_shasta. The original driver code can only map first 16 'entities' for st_vsc and st_vsc1 while there are actually 128 available. Also the ST_MAX_LUN_PER_TARGET should be 8, although this can do no harm because inquiries beyond boundary are discarded by firmware. The correct internal mapping should be: id:0~15, lun:0~7 (st_shasta) id:0, lun:0~127 (st_yosemite) id:0~127, lun:0 (st_vsc and st_vsc1) To scsi mid layer they are all channel:0~7, id:0~15, lun:0, with a maximum 'entity' number of 128. The RAID console only interfaces to scsi mid layer and is always mapped at channel:0, id:16, lun:0. Signed-off-by: Ed Lin <[EMAIL PROTECTED]> ACK patches 1-4. I presume James will apply them to scsi-fixes... Jeff
Re: [patch 2/5] signalfd v2 - signalfd core ...
On Thursday 08 March 2007 18:28, Linus Torvalds wrote: > The sad part is that there really is no reason why the BSD crowd couldn't > have done recvmsg() as an "extended read with per-system call flags", > which would have made things like O_NONBLOCK etc unnecessary, because you > could do it just with MSG_DONTWAIT.. Wait a second here... O_NONBLOCK is not just unnecessary - it's buggy! Try to do nonblocking read from stdin (fd #0) - * setting O_NONBLOCK with fcntl will set it for all other processes which has the same stdin! * trying to reset O_NONBLOCK after the read doesn't help (think kill -9) * duping fd #0 doesn't help because O_NONBLOCK is not per-fd, it's shared just like filepos. I really like that trick with recvmsg + MSG_DONTWAIT instead. -- vda - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
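The sharing described above can be checked with a few system calls. This is a small sketch on a POSIX system, with made-up function names; it uses a pipe and a socketpair rather than the real stdin so that it has no side effects on the terminal. Note that recv() only works on sockets, so the MSG_DONTWAIT trick does not apply to a tty or pipe as things stand.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if O_NONBLOCK set through one descriptor shows up on a dup()
 * of it, 0 if it does not, -1 on error.
 */
int nonblock_leaks_through_dup(void)
{
	int p[2], copy, flags, leaked = -1;

	if (pipe(p) < 0)
		return -1;
	copy = dup(p[0]);
	flags = fcntl(p[0], F_GETFL);
	if (copy >= 0 && flags >= 0 &&
	    fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == 0) {
		/* Both descriptors refer to one open file description. */
		int seen = fcntl(copy, F_GETFL);

		if (seen >= 0)
			leaked = (seen & O_NONBLOCK) != 0;
	}
	if (copy >= 0)
		close(copy);
	close(p[0]);
	close(p[1]);
	return leaked;
}

/*
 * Returns 1 if recv(MSG_DONTWAIT) on an empty socket fails with
 * EAGAIN/EWOULDBLOCK while the descriptor itself stays blocking,
 * 0 otherwise, -1 on error.
 */
int dontwait_leaves_flags_alone(void)
{
	int sv[2], ok, flags;
	char c;
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	n = recv(sv[0], &c, 1, MSG_DONTWAIT);	/* per-call, not per-file */
	ok = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
	flags = fcntl(sv[0], F_GETFL);
	assert(flags >= 0);
	ok = ok && !(flags & O_NONBLOCK);
	close(sv[0]);
	close(sv[1]);
	return ok;
}
```

The same sharing applies to descriptors inherited across fork(), which is why a process that sets O_NONBLOCK on its stdin can surprise whoever else holds that terminal.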
Re: Intel DP965LT Mainboard running?
On Sat, 31 Mar 2007 00:31:38 +0200, Oliver Joa <[EMAIL PROTECTED]> wrote: >Hi, > >does anyone have a running Intel DP965LT Mainboard? I can not get this >Board running. You can see the Problems in the Thread "Corrupt >XFS-Filesystems on new Hardware and Kernel". Please can you give me a >running Kernel-Config? http://bugsplatter.mine.nu/system/dp965lt.html some notes and gotchas http://bugsplatter.mine.nu/test/boxen/silly/configs and dmesgs I've only had reiserfs and ext3 going, not XFS. Grant. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/4] [SCSI]stex: minor cleanup and version update
Add debug information into abort and host_reset routine. Change ioremap to ioremap_nocache. Version updated to 3.6..1. Signed-off-by: Ed Lin <[EMAIL PROTECTED]> --- diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c index 9465f35..5a10cfa 100644 --- a/drivers/scsi/stex.c +++ b/drivers/scsi/stex.c @@ -32,11 +32,12 @@ #include #include #include #include +#include #define DRV_NAME "stex" -#define ST_DRIVER_VERSION "3.1.0.1" +#define ST_DRIVER_VERSION "3.6..1" #define ST_VER_MAJOR 3 -#define ST_VER_MINOR 1 +#define ST_VER_MINOR 6 #define ST_OEM 0 #define ST_BUILD_VER 1 @@ -1007,6 +1008,11 @@ static int stex_abort(struct scsi_cmnd * u32 data; int result = SUCCESS; unsigned long flags; + + printk(KERN_INFO DRV_NAME + "(%s): aborting command\n", pci_name(hba->pdev)); + scsi_print_command(cmd); + base = hba->mmio_base; spin_lock_irqsave(host->host_lock, flags); if (tag < host->can_queue && hba->ccb[tag].cmd == cmd) @@ -1092,6 +1098,10 @@ static int stex_reset(struct scsi_cmnd * unsigned long before; hba = (struct st_hba *) &cmd->device->host->hostdata[0]; + printk(KERN_INFO DRV_NAME + "(%s): resetting host\n", pci_name(hba->pdev)); + scsi_print_command(cmd); + hba->mu_status = MU_STATE_RESETTING; if (hba->cardtype == st_shasta) @@ -1211,7 +1221,7 @@ stex_probe(struct pci_dev *pdev, const s goto out_scsi_host_put; } - hba->mmio_base = ioremap(pci_resource_start(pdev, 0), + hba->mmio_base = ioremap_nocache(pci_resource_start(pdev, 0), pci_resource_len(pdev, 0)); if ( !hba->mmio_base) { printk(KERN_ERR DRV_NAME "(%s): memory map failed\n", - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/4] [SCSI]stex: extend hard reset wait time
During hard bus reset of st_shasta controllers, 1 ms is not enough for 16-port controllers, although it's good for 8-port controllers. Extend the wait time to 100 ms to allow bus resets to finish successfully. Signed-off-by: Ed Lin <[EMAIL PROTECTED]> --- diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c index 4d68533..1e8d7ac 100644 --- a/drivers/scsi/stex.c +++ b/drivers/scsi/stex.c @@ -1055,7 +1055,12 @@ static void stex_hard_reset(struct st_hb pci_read_config_byte(bus->self, PCI_BRIDGE_CONTROL, &pci_bctl); pci_bctl |= PCI_BRIDGE_CTL_BUS_RESET; pci_write_config_byte(bus->self, PCI_BRIDGE_CONTROL, pci_bctl); - msleep(1); + + /* +* 1 ms may be enough for 8-port controllers. But 16-port controllers +* require more time to finish bus reset. Use 100 ms here for safety +*/ + msleep(100); pci_bctl &= ~PCI_BRIDGE_CTL_BUS_RESET; pci_write_config_byte(bus->self, PCI_BRIDGE_CONTROL, pci_bctl);
[PATCH 3/4] [SCSI]stex: fix reset recovery for console device
After a reset completes, the scsi error handler sends out START_STOP and TEST_UNIT_READY to the device. For 'normal' devices these commands will be handled by firmware. However, because the RAID console only interfaces to scsi mid layer, the firmware will not process these commands for it. As a result, the console is taken offline right after reset. Handle these commands in the driver to fix this problem. Signed-off-by: Ed Lin <[EMAIL PROTECTED]> --- diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c index 1e8d7ac..9465f35 100644 --- a/drivers/scsi/stex.c +++ b/drivers/scsi/stex.c @@ -605,6 +605,14 @@ stex_queuecommand(struct scsi_cmnd *cmd, stex_invalid_field(cmd, done); return 0; } + case TEST_UNIT_READY: + case START_STOP: + if (id == ST_MAX_ARRAY_SUPPORTED) { + cmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8; + done(cmd); + return 0; + } + break; case INQUIRY: if (id != ST_MAX_ARRAY_SUPPORTED) break;
[PATCH 1/4] [SCSI]stex: fix id mapping issue
The internal id/lun mapping of st_vsc and st_vsc1 controllers is different from st_shasta. The original driver code can only map first 16 'entities' for st_vsc and st_vsc1 while there are actually 128 available. Also the ST_MAX_LUN_PER_TARGET should be 8, although this can do no harm because inquiries beyond boundary are discarded by firmware. The correct internal mapping should be: id:0~15, lun:0~7 (st_shasta) id:0, lun:0~127 (st_yosemite) id:0~127, lun:0 (st_vsc and st_vsc1) To scsi mid layer they are all channel:0~7, id:0~15, lun:0, with a maximum 'entity' number of 128. The RAID console only interfaces to scsi mid layer and is always mapped at channel:0, id:16, lun:0. Signed-off-by: Ed Lin <[EMAIL PROTECTED]> --- diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c index 69be132..4d68533 100644 --- a/drivers/scsi/stex.c +++ b/drivers/scsi/stex.c @@ -115,7 +115,7 @@ enum { ST_MAX_ARRAY_SUPPORTED = 16, ST_MAX_TARGET_NUM = (ST_MAX_ARRAY_SUPPORTED+1), - ST_MAX_LUN_PER_TARGET = 16, + ST_MAX_LUN_PER_TARGET = 8, st_shasta = 0, st_vsc = 1, @@ -645,12 +645,16 @@ stex_queuecommand(struct scsi_cmnd *cmd, req = stex_alloc_req(hba); - if (hba->cardtype == st_yosemite) { - req->lun = lun * (ST_MAX_TARGET_NUM - 1) + id; - req->target = 0; - } else { + if (hba->cardtype == st_shasta) { req->lun = lun; req->target = id; + } else if (hba->cardtype == st_yosemite){ + req->lun = id * ST_MAX_LUN_PER_TARGET + lun; + req->target = 0; + } else { + /* st_vsc and st_vsc1 */ + req->lun = 0; + req->target = id * ST_MAX_LUN_PER_TARGET + lun; } /* cdb */
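The three-way mapping in the diff above can be written as a standalone function to check the ranges by hand. This is only a sketch of the arithmetic in the patch: the enum, the struct and the function name are stand-ins invented here, not the driver's own definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins; the driver's own enum values are not reproduced. */
enum card_type { CARD_SHASTA, CARD_VSC, CARD_VSC1, CARD_YOSEMITE };

enum { MAX_LUN_PER_TARGET = 8 };	/* ST_MAX_LUN_PER_TARGET after the patch */

struct fw_addr {
	unsigned int target;
	unsigned int lun;
};

/* Translate the (id, lun) seen by the caller into the firmware's address. */
struct fw_addr map_to_firmware(enum card_type type, unsigned int id,
			       unsigned int lun)
{
	struct fw_addr a;

	assert(id < 16 && lun < MAX_LUN_PER_TARGET);
	if (type == CARD_SHASTA) {
		/* id:0~15, lun:0~7 pass straight through */
		a.target = id;
		a.lun = lun;
	} else if (type == CARD_YOSEMITE) {
		/* everything is folded into lun:0~127 of target 0 */
		a.target = 0;
		a.lun = id * MAX_LUN_PER_TARGET + lun;
	} else {
		/* vsc and vsc1: everything is folded into target:0~127, lun 0 */
		a.target = id * MAX_LUN_PER_TARGET + lun;
		a.lun = 0;
	}
	return a;
}
```

With 16 ids and 8 luns per id, the largest folded index is 15 * 8 + 7 = 127, which matches the 128 entities named in the description.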
[PATCH 3/3] slab: avoid __initdata warning (may be a bogus one)
set_up_list3s is not __init and references initkmem_list3. Also, kmem_cache_create calls setup_cpu_cache which calls set_up_list3s. The state machine _may_ prevent the code from accessing this data after freeing initdata (it makes sure it's used only up to boot), so this warning may be a false positive. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]> --- mm/slab.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/mm/slab.c b/mm/slab.c index 0934f8d..0772faf 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -305,7 +305,7 @@ struct kmem_list3 { * Need this for bootstrapping a per node allocator. */ #define NUM_INIT_LISTS (2 * MAX_NUMNODES + 1) -struct kmem_list3 __initdata initkmem_list3[NUM_INIT_LISTS]; +struct kmem_list3 initkmem_list3[NUM_INIT_LISTS]; #defineCACHE_CACHE 0 #defineSIZE_AC 1 #defineSIZE_L3 (1 + MAX_NUMNODES) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/3] utrace - uml: make UML compile with utrace enabled
* The prototype of arch_ptrace doesn't match the one in include/linux/ptrace.h. * utrace_um_native is referred to by utrace_native_view but never defined. Cc: Jeff Dike <[EMAIL PROTECTED]> Cc: Roland McGrath <[EMAIL PROTECTED]> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]> --- arch/um/kernel/ptrace.c |7 ++- 1 files changed, 6 insertions(+), 1 deletions(-) diff --git a/arch/um/kernel/ptrace.c b/arch/um/kernel/ptrace.c index f66d01c..a42caf3 100644 --- a/arch/um/kernel/ptrace.c +++ b/arch/um/kernel/ptrace.c @@ -16,7 +16,12 @@ void ptrace_disable(struct task_struct *child) { } -long arch_ptrace(struct task_struct *child, long request, long addr, long data) +const struct utrace_regset_view utrace_um_native; + +int arch_ptrace(long *request, struct task_struct *child, + struct utrace_attached_engine *engine, + unsigned long addr, unsigned long data, + long *retval) { return -ENOSYS; } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [uml-devel] [PATCH 1/2] UML - Fix umid in xterm titles
On Friday 30 March 2007, Jeff Dike wrote: > From: Davide Brini <[EMAIL PROTECTED]> > > Calls lines_init() *after* xterm_title is modified to include umid. > > Signed-off-by: Davide Brini <[EMAIL PROTECTED]> > Signed-off-by: Jeff Dike <[EMAIL PROTECTED]> Acked-by: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]> > -- > arch/um/drivers/ssl.c |4 ++-- > arch/um/drivers/stdio_console.c |4 ++-- > 2 files changed, 4 insertions(+), 4 deletions(-) > > Index: linux-2.6.21-mm/arch/um/drivers/ssl.c > === > --- linux-2.6.21-mm.orig/arch/um/drivers/ssl.c2007-03-30 > 10:11:01.0 -0400 +++ > linux-2.6.21-mm/arch/um/drivers/ssl.c 2007-03-30 10:28:51.0 -0400 > @@ -191,12 +191,12 @@ static int ssl_init(void) > ssl_driver = register_lines(&driver, &ssl_ops, serial_lines, > ARRAY_SIZE(serial_lines)); > > - lines_init(serial_lines, ARRAY_SIZE(serial_lines), &opts); > - > new_title = add_xterm_umid(opts.xterm_title); > if (new_title != NULL) > opts.xterm_title = new_title; > > + lines_init(serial_lines, ARRAY_SIZE(serial_lines), &opts); > + > ssl_init_done = 1; > register_console(&ssl_cons); > return 0; > Index: linux-2.6.21-mm/arch/um/drivers/stdio_console.c > === > --- linux-2.6.21-mm.orig/arch/um/drivers/stdio_console.c 2007-03-30 > 10:11:01.0 -0400 +++ > linux-2.6.21-mm/arch/um/drivers/stdio_console.c 2007-03-30 > 10:28:51.0 -0400 @@ -166,12 +166,12 @@ int stdio_init(void) > return -1; > printk(KERN_INFO "Initialized stdio console driver\n"); > > - lines_init(vts, ARRAY_SIZE(vts), &opts); > - > new_title = add_xterm_umid(opts.xterm_title); > if(new_title != NULL) > opts.xterm_title = new_title; > > + lines_init(vts, ARRAY_SIZE(vts), &opts); > + > con_init_done = 1; > register_console(&stdiocons); > return 0; > -- Inform me of my mistakes, so I can add them to my list! Paolo Giarrusso, aka Blaisorblade http://www.user-mode-linux.org/~blaisorblade
[PATCH 2/3] sys_futex64-allows-64bit-futexes-workaround for uml
Copy sys_futex64-allows-64bit-futexes-workaround.patch to UML (to unbreak the
UML build). Note however that in include/asm-generic/futex.h we have:

static inline int
futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
{
	return -ENOSYS;
}

Which is a better solution. Pierre Peiffer, please consider that.

Cc: Pierre Peiffer <[EMAIL PROTECTED]>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]>
---
 include/asm-um/futex.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/asm-um/futex.h b/include/asm-um/futex.h
index 6a332a9..e875d3e 100644
--- a/include/asm-um/futex.h
+++ b/include/asm-um/futex.h
@@ -3,4 +3,17 @@
 
 #include
 
+static inline u64
+futex_atomic_cmpxchg_inatomic64(u64 __user *uaddr, u64 oldval, u64 newval)
+{
+	return 0;
+}
+
+static inline int
+futex_atomic_op_inuser64 (int encoded_op, u64 __user *uaddr)
+{
+	return 0;
+}
+
+
 #endif
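For comparison, here is a minimal userspace sketch of what the 64-bit stubs might look like if they followed the asm-generic approach and reported -ENOSYS instead of 0. It is an illustration only, not part of the patch: the u64 typedef stands in for the kernel type, the __user annotation is dropped, and carrying an error code in the u64 return value of the cmpxchg stub is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's u64; an assumption for illustration. */
typedef uint64_t u64;

/*
 * Hypothetical stub: reports "not implemented" instead of pretending the
 * compare-and-exchange succeeded. The kernel's __user annotation is
 * omitted here because it has no meaning outside the kernel tree.
 * Encoding the error in a u64 return value is an assumption.
 */
static inline u64
futex_atomic_cmpxchg_inatomic64(u64 *uaddr, u64 oldval, u64 newval)
{
	(void)uaddr;
	(void)oldval;
	(void)newval;
	return (u64)-ENOSYS;
}

/* Hypothetical stub: same idea for the futex operation helper. */
static inline int
futex_atomic_op_inuser64(int encoded_op, u64 *uaddr)
{
	(void)encoded_op;
	(void)uaddr;
	return -ENOSYS;
}
```

With stubs like these a caller that checks for a negative result sees the operation as unsupported, whereas a stub returning 0 is indistinguishable from success.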
[PATCH 0/3] mm-only patches
Patch-arounds for mm-only compile errors/warnings, found on 2.6.21-rc5-mm2; they
still apply on 2.6.21-rc5-mm3.

--
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
Re: [patch 1/13] signal/timer/event fds v8 - anonymous inode source ...
On Fri, 30 Mar 2007 15:44:15 -0700 (PDT) Davide Libenzi wrote:

> On Fri, 30 Mar 2007, Andrew Morton wrote:
>
> > > +#include
> > > +
> > > +
> > > +
> >
> > Too many blank lines
>
> It'd be interesting to know how much is enough. You use one, ppl says it
> is too dense. You use more, ppl says it's too much.
> There's the one-line rule for inter-function spacing, but what's the
> include-functions ones? Or the functions-data ones?
>

less ;)

> > > +static int __init aino_init(void)
> > > +{
> > > +	int error;
> > > +
> > > +	error = register_filesystem(&aino_fs_type);
> > > +	if (error)
> > > +		goto err_exit;
> > > +	aino_mnt = kern_mount(&aino_fs_type);
> > > +	if (IS_ERR(aino_mnt)) {
> > > +		error = PTR_ERR(aino_mnt);
> > > +		goto err_unregister_filesystem;
> > > +	}
> > > +	aino_inode = aino_mkinode();
> > > +	if (IS_ERR(aino_inode)) {
> > > +		error = PTR_ERR(aino_inode);
> > > +		goto err_mntput;
> > > +	}
> > > +
> > > +	return 0;
> > > +
> > > +err_mntput:
> > > +	mntput(aino_mnt);
> > > +err_unregister_filesystem:
> > > +	unregister_filesystem(&aino_fs_type);
> > > +err_exit:
> > > +	printk(KERN_ERR "aino_init() failed (%d)\n", error);
> >
> > I suspect this is panic time?
>
> Ok, it was panicking, and someone made me change it. Would you please
> agree?
> The system can survive w/out, but it'll be a broken system WRT userspace.

I'd say panic. There's not much point in limping along with an
incorrectly-working kernel, only to have some small number of apps fail
mysteriously later on.

> > Can we make this optional if CONFIG_EMBEDDED? You plan on converting epoll
> > to use this facility, but with CONFIG_EPOLL=n, this is all dead code?
>
> Hmmm, the whole point is that all this stuff works with or without epoll.
> And epoll need no changes to support this.

I'm suggesting that all known clients of anon_inode be made optional. Hence
anon_inode can become optional too.

It's a desirable objective, at least. The default, really.
Re: [PATCH] fix page leak during core dump
On Fri, 30 Mar 2007, Andrew Morton wrote:
>
> again?>

Oooh, yes please.

> diff -puN fs/binfmt_elf_fdpic.c~fix-page-leak-during-core-dump fs/binfmt_elf_fdpic.c
> --- a/fs/binfmt_elf_fdpic.c~fix-page-leak-during-core-dump
> +++ a/fs/binfmt_elf_fdpic.c
> @@ -1480,8 +1480,10 @@ static int elf_fdpic_dump_segments(struc
>  				DUMP_SEEK(file->f_pos + PAGE_SIZE);
>  			}
>  			else if (page == ZERO_PAGE(addr)) {
> -				DUMP_SEEK(file->f_pos + PAGE_SIZE);
> -				page_cache_release(page);
> +				if (!dump_seek(file, file->f_pos + PAGE_SIZE)) {
> +					page_cache_release(page);
> +					return 0;
> +				}
>  			}
>  			else {
>  				void *kaddr;
> _

No, I think that's wrong: whereas the binfmt_elf one did its
page_cache_release down below at the bottom of the block, this version
does it in each subblock, so there you're removing the dump_seek success
one. Can't we preserve that beauteous macro here and just do...

--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -1480,8 +1480,8 @@ static int elf_fdpic_dump_segments(struc
 				DUMP_SEEK(file->f_pos + PAGE_SIZE);
 			}
 			else if (page == ZERO_PAGE(addr)) {
-				DUMP_SEEK(file->f_pos + PAGE_SIZE);
 				page_cache_release(page);
+				DUMP_SEEK(file->f_pos + PAGE_SIZE);
 			}
 			else {
 				void *kaddr;
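The ordering issue above can be reproduced in a small userspace sketch. Everything here is a hypothetical stand-in (struct page, page_release(), dump_seek() and the DUMP_SEEK macro are simplified models, not the kernel's definitions); the only property carried over is that the macro returns from the calling function when the seek fails.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted page. */
struct page { int refcount; };

static void page_release(struct page *p) { p->refcount--; }

/* Pretend seek whose outcome is chosen by the caller. */
static bool dump_seek(bool succeed) { return succeed; }

/* Modelled on the macro discussed above: a failed seek returns
 * from the *calling* function. */
#define DUMP_SEEK(ok) do { if (!dump_seek(ok)) return 0; } while (0)

/* Leaky order: the early return inside DUMP_SEEK skips the release. */
static int dump_leaky(struct page *p, bool seek_ok)
{
	p->refcount++;		/* reference assumed to be taken earlier */
	DUMP_SEEK(seek_ok);
	page_release(p);
	return 1;
}

/* Fixed order: drop the reference first, then seek. */
static int dump_fixed(struct page *p, bool seek_ok)
{
	p->refcount++;		/* reference assumed to be taken earlier */
	page_release(p);
	DUMP_SEEK(seek_ok);
	return 1;
}
```

In the leaky variant a failed seek leaves the count at one; in the fixed variant the count is back to zero whether or not the seek succeeds, which is the effect of swapping the two lines in the second diff.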
[PATCH 2.6.21-rc5 2/3] msi: fix ARM compile
In file included from drivers/pci/msi.c:22:
include/asm/smp.h:17:26: asm/arch/smp.h: No such file or directory
include/asm/smp.h:20:3: #error " included in non-SMP build"
include/asm/smp.h:23:1: warning: "raw_smp_processor_id" redefined
In file included from include/linux/sched.h:65,
                 from include/linux/mm.h:4,
                 from drivers/pci/msi.c:10:
include/linux/smp.h:85:1: warning: this is the location of the previous definition

Tested on powerpc, i386, and x86_64.

Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
Acked-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 drivers/pci/msi.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index ad33e01..7a7152b 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -16,10 +16,10 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
-#include
 
 #include "pci.h"
 #include "msi.h"
[PATCH 2.6.21-rc5 3/3] iop13xx: msi support (rev6)
From: Daniel Wolstenholme <[EMAIL PROTECTED]>

Enable devices to signal interrupts via PCI memory cycles.

rev6:
* fix enable/disable typo, Michael Ellerman
rev5:
* fix up ack, enable, and disable for iop13xx_msi_chip
rev4:
* move smp compile fix to separate patch
* use dynamic_irq_init in create_irq()
* hookup mask/unmask routines in iop13xx_msi_chip
rev3:
* change msi.c to use linux/smp.h instead of asm/smp.h
* call dynamic_irq_cleanup at destroy_irq time
rev2:
* destroy_irq did not take the full 128 bits of msi_irq_in_use into account
* added missing '&' for calls to test_and_set_bit and clear_bit

[EMAIL PROTECTED]: review comments/suggestions]
[EMAIL PROTECTED]: cleanups/forward port to 2.6-git]
Signed-off-by: Daniel Wolstenholme <[EMAIL PROTECTED]>
Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
Acked-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 arch/arm/mach-iop13xx/Makefile         |    1
 arch/arm/mach-iop13xx/irq.c            |    5 +
 arch/arm/mach-iop13xx/msi.c            |  193 
 arch/arm/mach-iop13xx/pci.c            |   16 +++
 include/asm-arm/arch-iop13xx/iop13xx.h |   29 +
 include/asm-arm/arch-iop13xx/irqs.h    |    8 +
 include/asm-arm/arch-iop13xx/msi.h     |   11 ++
 7 files changed, 261 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mach-iop13xx/Makefile b/arch/arm/mach-iop13xx/Makefile
index 4185e05..02bd511 100644
--- a/arch/arm/mach-iop13xx/Makefile
+++ b/arch/arm/mach-iop13xx/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_ARCH_IOP13XX) += pci.o
 obj-$(CONFIG_ARCH_IOP13XX) += io.o
 obj-$(CONFIG_MACH_IQ81340SC) += iq81340sc.o
 obj-$(CONFIG_MACH_IQ81340MC) += iq81340mc.o
+obj-$(CONFIG_PCI_MSI) += msi.o
diff --git a/arch/arm/mach-iop13xx/irq.c b/arch/arm/mach-iop13xx/irq.c
index b2eb0b9..5791add 100644
--- a/arch/arm/mach-iop13xx/irq.c
+++ b/arch/arm/mach-iop13xx/irq.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 
 /* INTCTL0 CP6 R0 Page 4
  */
@@ -258,7 +259,7 @@ void __init iop13xx_init_irq(void)
 	write_intbase(INTBASE);
 	write_intsize(INTSIZE_4);
 
-	for(i = 0; i < NR_IOP13XX_IRQS; i++) {
+	for(i = 0; i <= IRQ_IOP13XX_HPI; i++) {
 		if (i < 32)
 			set_irq_chip(i, &iop13xx_irqchip1);
 		else if (i < 64)
@@ -271,4 +272,6 @@ void __init iop13xx_init_irq(void)
 		set_irq_handler(i, handle_level_irq);
 		set_irq_flags(i, IRQF_VALID | IRQF_PROBE);
 	}
+
+	iop13xx_msi_init();
 }
diff --git a/arch/arm/mach-iop13xx/msi.c b/arch/arm/mach-iop13xx/msi.c
new file mode 100644
index 000..f620675
--- /dev/null
+++ b/arch/arm/mach-iop13xx/msi.c
@@ -0,0 +1,193 @@
+/*
+ * arch/arm/mach-iop13xx/msi.c
+ *
+ * PCI MSI support for the iop13xx processor
+ *
+ * Copyright (c) 2006, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+#include
+#include
+#include
+#include
+
+
+static unsigned long msi_irq_in_use[4];
+
+/* IMIPR0 CP6 R8 Page 1
+ */
+static inline u32 read_imipr_0(void)
+{
+	u32 val;
+	asm volatile("mrc p6, 0, %0, c8, c1, 0":"=r" (val));
+	return val;
+}
+static inline void write_imipr_0(u32 val)
+{
+	asm volatile("mcr p6, 0, %0, c8, c1, 0"::"r" (val));
+}
+
+/* IMIPR1 CP6 R9 Page 1
+ */
+static inline u32 read_imipr_1(void)
+{
+	u32 val;
+	asm volatile("mrc p6, 0, %0, c9, c1, 0":"=r" (val));
+	return val;
+}
+static inline void write_imipr_1(u32 val)
+{
+	asm volatile("mcr p6, 0, %0, c9, c1, 0"::"r" (val));
+}
+
+/* IMIPR2 CP6 R10 Page 1
+ */
+static inline u32 read_imipr_2(void)
+{
+	u32 val;
+	asm volatile("mrc p6, 0, %0, c10, c1, 0":"=r" (val));
+	return val;
+}
+static inline void write_imipr_2(u32 val)
+{
+	asm volatile("mcr p6, 0, %0, c10, c1, 0"::"r" (val));
+}
+
+/* IMIPR3 CP6 R11 Page 1
+ */
+static inline u32 read_imipr_3(void)
+{
+	u32 val;
+	asm volatile("mrc p6, 0, %0, c11, c1, 0":"=r" (val));
+	return val;
+}
+static inline void write_imipr_3(u32 val)
+{
+	asm volatile("mcr p6, 0, %0, c11, c1, 0"::"r" (val));
+}
+
+static u32 (*read_imipr[])(void) = {
+	read_imipr_0,
+	read_imipr_1,
+	read_imipr_2,
+	read_imipr_3,
+};
+
+static void (*write_imipr[])(u32) = {
+	write_imipr_0,
+	write_imipr_1,
+	write_imipr_2,
+	write_imipr
[PATCH 2.6.21-rc5 1/3] msi: introduce ARCH_SUPPORTS_MSI Kconfig option (rev2)
Allows architectures to advertise that they support MSI rather than listing
each architecture as a PCI_MSI dependency.

rev2:
* update i386 and x86_64 as well

Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
Acked-by: "Eric W. Biederman" <[EMAIL PROTECTED]>
---
 arch/arm/Kconfig     |    1 +
 arch/i386/Kconfig    |    1 +
 arch/ia64/Kconfig    |    1 +
 arch/sparc64/Kconfig |    1 +
 arch/x86_64/Kconfig  |    1 +
 drivers/pci/Kconfig  |    6 +-
 6 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e7baca2..db00376 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -255,6 +255,7 @@ config ARCH_IOP13XX
 	depends on MMU
 	select PLAT_IOP
 	select PCI
+	select ARCH_SUPPORTS_MSI
 	help
 	  Support for Intel's IOP13XX (XScale) family of processors.
 
diff --git a/arch/i386/Kconfig b/arch/i386/Kconfig
index 53d6237..bcf2fc4 100644
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -1073,6 +1073,7 @@ config PCI
 	bool "PCI support" if !X86_VISWS
 	depends on !X86_VOYAGER
 	default y if X86_VISWS
+	select ARCH_SUPPORTS_MSI if (X86_LOCAL_APIC && X86_IO_APIC)
 	help
 	  Find out whether you have a PCI motherboard. PCI is the name of a
 	  bus system, i.e. the way the CPU talks to the other stuff inside
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index e19185d..3b71f97 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -14,6 +14,7 @@ config IA64
 	select PCI if (!IA64_HP_SIM)
 	select ACPI if (!IA64_HP_SIM)
 	select PM if (!IA64_HP_SIM)
+	select ARCH_SUPPORTS_MSI
 	default y
 	help
 	  The Itanium Processor Family is Intel's 64-bit successor to
diff --git a/arch/sparc64/Kconfig b/arch/sparc64/Kconfig
index 1a6348b..b9b2b52 100644
--- a/arch/sparc64/Kconfig
+++ b/arch/sparc64/Kconfig
@@ -299,6 +299,7 @@ config SUN_IO
 
 config PCI
 	bool "PCI support"
+	select ARCH_SUPPORTS_MSI
 	help
 	  Find out whether you have a PCI motherboard. PCI is the name of a
 	  bus system, i.e. the way the CPU talks to the other stuff inside
diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index 56eb14c..e9b4f05 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -676,6 +676,7 @@ menu "Bus options (PCI etc.)"
 
 config PCI
 	bool "PCI support"
+	select ARCH_SUPPORTS_MSI if (X86_LOCAL_APIC && X86_IO_APIC)
 
 # x86-64 doesn't support PCI BIOS access from long mode so always go direct.
 config PCI_DIRECT
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 5ea5bc7..70efe8f 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -1,10 +1,14 @@
 #
 # PCI configuration
 #
+config ARCH_SUPPORTS_MSI
+	bool
+	default n
+
 config PCI_MSI
 	bool "Message Signaled Interrupts (MSI and MSI-X)"
 	depends on PCI
-	depends on (X86_LOCAL_APIC && X86_IO_APIC) || IA64 || SPARC64
+	depends on ARCH_SUPPORTS_MSI
 	help
 	   This allows device drivers to enable MSI (Message Signaled
 	   Interrupts). Message Signaled Interrupts enable a device to
[PATCH 2.6.21-rc5 0/3] iop13xx msi support and a couple msi cleanups
Here is the latest revision of some patches that have been bouncing around
linux-pci for a while. linux-kernel is copied to get a few more eyes on the
ARCH_SUPPORTS_MSI change. To my knowledge these patches have not yet been
queued into a maintainer tree.

Dan Williams (2):
      msi: introduce ARCH_SUPPORTS_MSI Kconfig option (rev2)
      msi: fix ARM compile

Daniel Wolstenholme (1):
      iop13xx: msi support (rev6)

git pull git://lost.foo-projects.org/~dwillia2/git/iop msi

--
Dan
Re: [patch 34/37] libata bugfix: HDIO_DRIVE_TASK
Greg KH wrote:
-stable review patch. If anyone has any objections, please let us know.

--
From: Mark Lord <[EMAIL PROTECTED]>

libata bugfix: HDIO_DRIVE_TASK

I was trying to use HDIO_DRIVE_TASK for something today, and discovered that
the libata implementation does not copy over the upper four LBA bits from
args[6].

This is serious, as any tools using this ioctl would have their commands
applied to the wrong sectors on the drive, possibly resulting in disk
corruption.

Ideally, newer apps should use SG_IO/ATA_16 directly, avoiding this bug. But
with libata poised to displace drivers/ide, better compatibility here is a
must.

This patch fixes libata to use the upper four LBA bits passed in from the
ioctl.

The original drivers/ide implementation copies over all bits except for the
master/slave select bit. With this patch, libata will copy only the four
high-order LBA bits, just in case there are assumptions elsewhere in libata
(?).

Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
Cc: Chuck Ebbert <[EMAIL PROTECTED]>
Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

..

Mmmm.. I've just noticed another bit we should be preserving there, both for
*stable* and current mainline. Instead of:

+	scsi_cmd[13] = args[6] & 0x0f;

We should be doing:

+	scsi_cmd[13] = args[6] & 0x4f;

As-is, the patch still helps, but it is not as useful as it could be. Here's
the fixed version. I'm also sending out a 2.6.21 patch via Jeff.

Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
---
 drivers/ata/libata-scsi.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -295,6 +295,7 @@ int ata_task_ioctl(struct scsi_device *s
 	scsi_cmd[8]  = args[3];
 	scsi_cmd[10] = args[4];
 	scsi_cmd[12] = args[5];
+	scsi_cmd[13] = args[6] & 0x4f;
 	scsi_cmd[14] = args[0];
 
 	/* Good values for timeout and retries?  Values below

--
Mark Lord
Real-Time Remedies Inc.
[EMAIL PROTECTED]
Re: [patch 1/13] signal/timer/event fds v8 - anonymous inode source ...
On Fri, 30 Mar 2007 15:44:15 -0700 (PDT) Davide Libenzi wrote:

> On Fri, 30 Mar 2007, Andrew Morton wrote:
>
> > > +#include
> > > +
> > > +
> > > +
> >
> > Too many blank lines
>
> It'd be interesting to know how much is enough. You use one, ppl says it
> is too dense. You use more, ppl says it's too much.
> There's the one-line rule for inter-function spacing, but what's the
> include-functions ones? Or the functions-data ones?

1 :)

> > > +static int ainofs_delete_dentry(struct dentry *dentry);
> > > +static struct inode *aino_mkinode(void);
> >
> > Unneeded forward declaration.
>
> Same here. You're the third says this, so I'm gonna change it. But pls
> consider adding it to the coding style.
>
> > > +static struct file_system_type aino_fs_type = {
> > > +	.name		= "ainofs",
> > > +	.get_sb		= ainofs_get_sb,
> > > +	.kill_sb	= kill_anon_super,
> > > +};
> > > +static struct dentry_operations ainofs_dentry_operations = {
> > > +	.d_delete	= ainofs_delete_dentry,
> > > +};
> >
> > If this is moved elsewhere we can perhaps remove some or all of the
> > unpleasing static function forward-declarations.
>
> Grrr :) you are a puttycat
>
> > > +/**
> > > + * aino_getfd - creates a new file instance by hooking it up to and anonymous
> > > + *              inode, and a dentry that describe the "class" of the file
> > > + * @pfd:     [out]   pointer to the file descriptor
> > > + * @dpinode: [out]   pointer to the inode
> > > + * @pfile:   [out]   pointer to the file struct
> > > + * @name:    [in]    name of the "class" of the new file
> > > + * @fops     [in]    file operations for the new file
> > > + * @priv     [in]    private data for the new file (will be file's private_data)
> >
> > The [in] and [out] thing is nice - does kerneldoc handle it appropriately?
>
> No idea. It should come out as text at least.

Yes, it's just [nice] text. But the function description needs to fit on
one line. If that's not enough, put more description after the @params
lines, separated by a * "blank" line.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
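The layout described above (one-line summary, then the @param lines, then a " *" separator and the longer description) might look like the following sketch. The function example_getfd() and its behaviour are hypothetical and exist only to carry the comment; this is not the aino_getfd() from the patch.

```c
#include <stddef.h>

/**
 * example_getfd - create a file descriptor for an anonymous inode
 * @pfd:  [out] pointer to the file descriptor
 * @name: [in]  name of the "class" of the new file
 *
 * The longer description goes here, after the parameter lines and
 * separated from them by a " *" line. This hypothetical helper returns 0
 * on success and -1 if @pfd or @name is NULL.
 */
static int example_getfd(int *pfd, const char *name)
{
	if (pfd == NULL || name == NULL)
		return -1;
	*pfd = 3;	/* placeholder value; a real implementation would allocate one */
	return 0;
}
```

The [in] and [out] tags are kept as plain text inside each parameter description, which is how they are said to come out in the reply above.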
Re: [patch] remove artificial software max_loop limit
On Fri, 30 Mar 2007 15:06:03 -0700 "Ken Chen" <[EMAIL PROTECTED]> wrote:

> On 3/30/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > So.. this change will cause a fatal error for anyone who is presently
> > using max_loop, won't it? If they're doing that within their
> > initramfs/initrd/etc then things could get rather ugly for them.
>
> probably, if they access loop device non-sequentially.
>

My point is that the modprobe will fail if it is passed an unrecognised
module parameter (won't it?)

So if we're worried about not breaking existing setups, we should retain this
module parameter as a do-nothing thing, maybe with a this-is-going-away
warning printk, too.

> > I don't know how much of a problem this will be in practice - do people use
> > max_loop much?
>
> I don't know either.

hm.

> > btw, did you test this change as both a module and as linked-into-vmlinux?
>
> as linked-into-vmlinux. why do you ask? It breaks if it is module?
> I made last minute change to a mutex name and shamely posted without
> doing a compile test. Besides that, is there something else breaks?

Just idle curiosity regarding how much testing it had seen. Generally one
would expect things to be OK, but there can be startup ordering problems. The
most common problem is that the module simply doesn't load because it's using
some not-exported-to-modules symbol.
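The "do-nothing parameter with a warning" idea can be sketched in plain userspace C. This is only a model of the behaviour being proposed: parse_param() is a hypothetical stand-in for module parameter handling, not a kernel API, and the warning text is invented for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace sketch, not kernel code. parse_param() is a hypothetical
 * stand-in for module parameter handling.
 * Returns 0 if the parameter is accepted, -1 if it is unrecognised.
 */
static int parse_param(const char *arg)
{
	static const char obsolete[] = "max_loop=";

	if (strncmp(arg, obsolete, sizeof(obsolete) - 1) == 0) {
		/* Accepted but ignored, so existing setups keep loading. */
		fprintf(stderr, "loop: max_loop is obsolete and will be removed\n");
		return 0;
	}

	/* Anything else is unknown, which models the load attempt failing. */
	return -1;
}
```

Under this model a setup that still passes max_loop=64 keeps working and gets a notice, while a genuinely unknown parameter is still rejected.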
RE: [PATCH 1/5] RT kernel: force detect HPET from PCI space
Anyone got the same thing for CK804? I had my hopes high, and then I saw the
DECLARE_PCI_FIXUP_HEADER values [and the thread title was misleading]

I have an A8N-E motherboard with AthlonX2 and the ACPI definitions are missing
the HPET (standard feature of Asus motherboards). I too got interested in
getting my motherboard working. Luckily I found this

http://lkml.org/lkml/2006/12/17/69

from which I generated the following patch:

--- arch/i386/kernel/quirks.c.orig	2007-03-30 23:43:06.0 +0300
+++ arch/i386/kernel/quirks.c	2007-03-30 23:26:47.0 +0300
@@ -101,5 +101,39 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_I
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_1, force_enable_hpet);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_31, force_enable_hpet);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH8_1, force_enable_hpet);
+
+static void __init force_enable_nvidia_hpet(struct pci_dev *dev)
+{
+	u8 enabled;
+	u32 addr;
+
+	if (hpet_address)
+		return;
+
+	pci_read_config_dword(dev, 0x44, &addr);
+	if (addr != 0xfefff000L) {
+		printk(KERN_INFO "Unsafe HPET address 0x%08x. Cannot force enable HPET\n", addr);
+		return;
+	}
+
+	pci_read_config_byte(dev, 0xA3, &enabled);
+	if ((enabled & 4) == 0) {
+		if (enabled != 0xc1) {
+			printk(KERN_INFO "Unsafe HPET enable 0x%02x. Cannot force enable HPET\n", enabled);
+			return;
+		}
+		pci_write_config_byte(dev, 0xA3, enabled | 4);
+		pci_read_config_byte(dev, 0xA3, &enabled);
+		if ((enabled & 4) == 0) {
+			printk(KERN_INFO "Failed to force enable HPET\n");
+			return;
+		}
+	}
+
+	force_hpet_address = addr;
+	printk(KERN_INFO "Force enabled HPET. Base address 0x%08lx\n", force_hpet_address);
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, 0x0050, force_enable_nvidia_hpet); // NForce4
 
 #endif

Now Linux seems to detect HPET and it passes at least the basic sanity checks:

Force enabled HPET. Base address 0xfefff000
HPET: hpet_period 4000, hpet_tick 8
Successfully registered HPET clocksource

Unfortunately the 2.6.20-mm2 kernel to which I tried to apply the patch series
seems to hang a few seconds later, about halfway through udev startup event
processing. It could either be something totally different in 2.6.20-mm2 that
just happens to fail or, more likely, some interrupt setup that still needs to
be done. I have no idea how to continue from here.

-Mikko
libata bugfix: preserve LBA bit for HDIO_DRIVE_TASK
Ideally, this would go into linux-2.6.21.

Preserve the LBA bit in the DevSel/Head register for HDIO_DRIVE_TASK.

Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
---
--- linux/drivers/ata/libata-scsi.c.orig	2007-03-21 13:35:02.0 -0400
+++ linux/drivers/ata/libata-scsi.c	2007-03-30 17:40:58.0 -0400
@@ -333,7 +333,7 @@
 	scsi_cmd[8]  = args[3];
 	scsi_cmd[10] = args[4];
 	scsi_cmd[12] = args[5];
-	scsi_cmd[13] = args[6] & 0x0f;
+	scsi_cmd[13] = args[6] & 0x4f;
 	scsi_cmd[14] = args[0];
 
 	/* Good values for timeout and retries?  Values below

--
Mark Lord
Real-Time Remedies Inc.
[EMAIL PROTECTED]
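To make the effect of the mask change concrete, here is a small userspace sketch comparing the two masks on a DevSel/Head byte. The macro and function names are illustrative assumptions, not libata identifiers; the only values taken from the patch are 0x0f and 0x4f.

```c
#include <stdint.h>

/* Illustrative names, not libata identifiers. */
#define DEV_LBA_HIGH_MASK 0x0fu	/* bits 3:0, the upper four LBA bits */
#define DEV_LBA_MODE_BIT  0x40u	/* bit 6, the LBA bit */

/* Earlier mask: keeps the LBA nibble but drops the LBA bit. */
static uint8_t devhead_old(uint8_t arg6)
{
	return (uint8_t)(arg6 & DEV_LBA_HIGH_MASK);
}

/* Corrected mask: 0x0f | 0x40 == 0x4f, so the LBA bit survives. */
static uint8_t devhead_new(uint8_t arg6)
{
	return (uint8_t)(arg6 & (DEV_LBA_HIGH_MASK | DEV_LBA_MODE_BIT));
}
```

For an input of 0xe5, the old mask yields 0x05 while the new one yields 0x45; bit 4 (the master/slave select bit mentioned in the earlier patch description) is dropped by both.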