Re: CFS review
* Roman Zippel <[EMAIL PROTECTED]> wrote:

> > 4544 roman 20 0 1796 520 432 S 32.1 0.4 0:21.08 lt
> > 4545 roman 20 0 1796 344 256 R 32.1 0.3 0:21.07 lt
> > 4546 roman 20 0 1796 344 256 R 31.7 0.3 0:21.07 lt
> > 4547 roman 20 0 1532 272 216 R 3.3 0.2 0:01.94 l
> >
> > and i'm still wondering how that output was possible.
>
> I disabled the jiffies logic and the result is still the same, so this
> problem isn't related to resolution at all.

how did you disable the jiffies logic? Also, could you please send me
the cfs-debug-info.sh:

  http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

captured _while_ the above workload is running. This is the third time
i've asked for that :-)

to establish that the basic sched_clock() behavior is sound on that box,
could you please also run this tool:

  http://people.redhat.com/mingo/cfs-scheduler/tools/tsc-dump.c

please run it both while the system is idle, and while there's a CPU hog
running:

  while :; do :; done &

and send me that output too? (it's 2x 60 lines only)

Thanks!

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
2.6.23-rc2-mm2
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm2/

- Various problems from 2.6.23-rc2-mm1 were fixed


Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the mm-commits
  mailing list.

        echo "subscribe mm-commits" | mail [EMAIL PROTECTED]

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug. Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug. Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.
Changes since 2.6.23-rc1-mm2:

 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-agpgart.patch
 git-audit-master.patch
 git-cifs.patch
 git-cpufreq.patch
 git-dma.patch
 git-drm.patch
 git-dvb.patch
 git-hwmon.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ieee1394.patch
 git-infiniband.patch
 git-input.patch
 git-jfs.patch
 git-jg-misc.patch
 git-kvm.patch
 git-libata-all.patch
 git-m32r.patch
 git-md-accel.patch
 git-mips.patch
 git-mmc.patch
 git-mtd.patch
 git-ubi.patch
 git-netdev-all.patch
 git-ixgbe.patch
 git-nfsd.patch
 git-ocfs2.patch
 git-r8169.patch
 git-s390.patch
 git-sh.patch
 git-sh64.patch
 git-scsi-misc.patch
 git-unionfs.patch
 git-v9fs.patch
 git-watchdog.patch
 git-wireless.patch
 git-ipwireless_cs.patch
 git-newsetup.patch
 git-xfs.patch
 git-cryptodev.patch
 git-kgdb.patch

 git trees

-genirq-temporary-fix-for-level-triggered-irq-resend-fix.patch
-allow-rcutorture-to-handle-synchronize_sched.patch
-alpha-o_cloexec-definition.patch
-sound-pci-ioremap-iounmap-balancing.patch
-fix-ide-ide-add-platform-ide-driver.patch
-mmc-make-it-build.patch
-mmc-fix-section-mismatch-warnings-for-drivers-mmc-host.patch
-git-nfsd-build-fixes.patch
-nfsd-warning-fix.patch
-dvb-remove-bogus-bug_on-in-videobuf_dvb_thread.patch

 Merged into mainline or a subsystem tree

+hex_dump-add-missing-const-qualifiers.patch
+rcu-remove-prototype-for-nonexistent-function.patch
+cris-drivers-cdrom-kconfig-no-longer.patch
+spidev-warning-fix.patch
+timerremove-clockevents_unregister_notifier.patch
+fix-compilation-with-gcc-42.patch
+fix-compilation-with-gcc-42-fix.patch
+lguest-files-should-explicitly-include-asm-paravirth.patch
+alpha-werror-fixes-for-sys_titanc.patch
+readahead-docbook-fix.patch

 More 2.6.23 queue

+pm-fix-dependencies-of-config_suspend-and-config_hibernation-updated-3x.patch

 Might be 2.6.23

+acpi-ec-remove-potential-deadlock-from-ec.patch

 ACPI fix

+sound-pci-ioremap-iounmap-balancing.patch

 ALSA fix

+make-power-supply-class-available-for-arm-architecture.patch

 ARM fix

+git-dma-up-fix.patch

 Fix git-dma
+jdelvare-i2c-i2c-mpc-dont-disable-i2c-module-on-stop-condition.patch
+jdelvare-i2c-i2c-core-make-some-code-static.patch

 I2C tree updates

-drivers-i2c-i2c-corec-make-code-static.patch

 Dropped

+alpm-increase-number-of-allowable-device-flags.patch

 ALPM patches were updated again

+st340823a-hpa-and-libata.patch
+pata_cmd64x-set-up-mwdma-modes-properly.patch
+ata_piix-disallow-udma-133-on-ich5-ich7.patch

 ata/pata things

+ide-hpt366-fix-pci-clock-detection-for-hpt374.patch
+ide-hpt366-ultradma-filtering-for-sata-cards.patch
+ide-atiixp-dma-setup-fixes.patch
+ide-it8213-piix-slc90e66-remove-dma-2-pio.patch
+ide-au1xxx-use-ide-tune-dma.patch
+ide-hpt34x-fix-config-hpt34x-autodma-n-handling.patch
+ide-ide-remove-drive-init-speed-zeroing.patch
+ide-ide-remove-ide-use-fast-pio.patch
+ide-cs5530-sc1200-add-pio-autotune-fallback-to-ide-dma-check.patch
+ide-sl82c105-add-pio-autotune-fallback-to-ide-dma-check.patch
+ide-ide-cris-add-pio-autotune-fallback-to-ide-dma-check.patch
+ide-ide-pmac-add-pio-autotune-fallback-to-ide-dma-check.patch
+ide-ide-remove-ide-dma-check.patch

 IDE tree updates

+mips-detect-bcm947xx-cpus.patch
+mips-bcm947xx-support.patch
+rfc-add-bcm947xx-to-kconfig.patch
+mips-add-bcm947xx-to-makefile.patch

 MIPS things

-8139too-force-media-setting-fix.patch

 Dropped - it broke
Re: [RFC PATCH 1/4] pass open file to ->setattr()
> >> > This is needed to be able to correctly implement open-unlink-fsetattr
> >> > semantics in some filesystem such as sshfs, without having to resort
> >> > to "silly-renaming".
> >>
> >> How do you plan to do that?
> >
> > Easy: the SFTP protocol has stateful opens and defines an FSTAT call.
>
> Is it possible to reconnect without umounting?

Yes, but open files and in-progress requests are lost at reconnect.

> If yes, the unlinked files would be lost in spite of being opened,
> wouldn't they?

Sure.

Obviously one of the drawbacks of a stateful protocol is that the
server state can't survive a reconnect.

But that sort of reliability has never been the goal of sshfs. And
even if that was needed, it could probably be much better handled in a
lower layer.

Miklos
Re: [PATCH][Take2] PCI legacy I/O port free driver - Making Intel e1000 driver legacy I/O port free
Dear Auke

Sorry I sent the wrong patch.
I resubmit the patch.

Tomohiro Kusumi

Signed-off-by: Tomohiro Kusumi <[EMAIL PROTECTED]>

---

diff -Nurp linux-2.6.22.org/drivers/net/e1000/e1000.h linux-2.6.22/drivers/net/e1000/e1000.h
--- linux-2.6.22.org/drivers/net/e1000/e1000.h 2007-07-09 08:32:17.0 +0900
+++ linux-2.6.22/drivers/net/e1000/e1000.h 2007-08-10 09:56:03.0 +0900
@@ -342,6 +342,9 @@ struct e1000_adapter {
 	boolean_t quad_port_a;
 	unsigned long flags;
 	uint32_t eeprom_wol;
+
+	int use_ioport;
+	int bars;
 };
 
 enum e1000_state_t {
diff -Nurp linux-2.6.22.org/drivers/net/e1000/e1000_main.c linux-2.6.22/drivers/net/e1000/e1000_main.c
--- linux-2.6.22.org/drivers/net/e1000/e1000_main.c 2007-07-09 08:32:17.0 +0900
+++ linux-2.6.22/drivers/net/e1000/e1000_main.c 2007-08-10 14:27:41.0 +0900
@@ -222,6 +222,11 @@ static pci_ers_result_t e1000_io_error_d
 static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev);
 static void e1000_io_resume(struct pci_dev *pdev);
 
+static unsigned int enable_legacy_ioport_free = 0;
+module_param(enable_legacy_ioport_free, uint, 0644);
+MODULE_PARM_DESC(enable_legacy_ioport_free, "Enable legacy I/O port free (default:0)");
+static int e1000_test_legacy_ioport(struct pci_dev *pdev);
+
 static struct pci_error_handlers e1000_err_handler = {
 	.error_detected = e1000_io_error_detected,
 	.slot_reset = e1000_io_slot_reset,
@@ -868,8 +873,25 @@ e1000_probe(struct pci_dev *pdev,
 	int i, err, pci_using_dac;
 	uint16_t eeprom_data = 0;
 	uint16_t eeprom_apme_mask = E1000_EEPROM_APME;
-	if ((err = pci_enable_device(pdev)))
+	int bars = 0;
+	int use_ioport = 0;
+
+	if (enable_legacy_ioport_free) {
+		if ((use_ioport = e1000_test_legacy_ioport(pdev)) < 0) {
+			E1000_ERR("e1000_test_legacy_ioport failed, aborting\n");
+			return -1;
+		}
+		if (use_ioport)
+			bars = pci_select_bars(pdev, IORESOURCE_MEM | IORESOURCE_IO);
+		else
+			bars = pci_select_bars(pdev, IORESOURCE_MEM);
+
+		if ((err = pci_enable_device_bars(pdev, bars)))
+			return err;
+	}
+	else if ((err = pci_enable_device(pdev))) {
 		return err;
+	}
 
 	if (!(err = pci_set_dma_mask(pdev, DMA_64BIT_MASK)) &&
 	    !(err = pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK))) {
@@ -883,7 +905,8 @@ e1000_probe(struct pci_dev *pdev,
 		pci_using_dac = 0;
 	}
 
-	if ((err = pci_request_regions(pdev, e1000_driver_name)))
+	if ((enable_legacy_ioport_free && (err = pci_request_selected_regions(pdev, bars, e1000_driver_name))) ||
+	    (err = pci_request_regions(pdev, e1000_driver_name)))
 		goto err_pci_reg;
 
 	pci_set_master(pdev);
@@ -902,6 +925,10 @@ e1000_probe(struct pci_dev *pdev,
 	adapter->pdev = pdev;
 	adapter->hw.back = adapter;
 	adapter->msg_enable = (1 << debug) - 1;
+	if (enable_legacy_ioport_free) {
+		adapter->use_ioport = use_ioport;
+		adapter->bars = bars;
+	}
 
 	mmio_start = pci_resource_start(pdev, BAR_0);
 	mmio_len = pci_resource_len(pdev, BAR_0);
@@ -911,12 +938,14 @@ e1000_probe(struct pci_dev *pdev,
 	if (!adapter->hw.hw_addr)
 		goto err_ioremap;
 
-	for (i = BAR_1; i <= BAR_5; i++) {
-		if (pci_resource_len(pdev, i) == 0)
-			continue;
-		if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
-			adapter->hw.io_base = pci_resource_start(pdev, i);
-			break;
+	if (!enable_legacy_ioport_free || adapter->use_ioport) {
+		for (i = BAR_1; i <= BAR_5; i++) {
+			if (pci_resource_len(pdev, i) == 0)
+				continue;
+			if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
+				adapter->hw.io_base = pci_resource_start(pdev, i);
+				break;
+			}
 		}
 	}
 
@@ -1182,7 +1211,10 @@ err_sw_init:
 err_ioremap:
 	free_netdev(netdev);
 err_alloc_etherdev:
-	pci_release_regions(pdev);
+	if (enable_legacy_ioport_free)
+		pci_release_selected_regions(pdev, bars);
+	else
+		pci_release_regions(pdev);
 err_pci_reg:
 err_dma:
 	pci_disable_device(pdev);
@@ -1234,7 +1266,10 @@ e1000_remove(struct pci_dev *pdev)
 	iounmap(adapter->hw.hw_addr);
 	if (adapter->hw.flash_address)
 		iounmap(adapter->hw.flash_address);
-	pci_release_regions(pdev);
+	if (enable_legacy_ioport_free)
+		pci_release_selected_regions(pdev, adapter->bars);
+	else
+
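A note on the pci_request_regions hunk in this patch. By C's short-circuit
rules, when enable_legacy_ioport_free is set and
pci_request_selected_regions() returns 0 (success), the left operand of ||
is false, so pci_request_regions() on the right is evaluated as well. The
userspace sketch below shows only that evaluation order. The stub names
(request_selected, request_all), the counters and request_exactly_one are
hypothetical and are not kernel API; what the second call would do on real
hardware is not tested here.

```c
#include <assert.h>

/* Hypothetical stand-ins for pci_request_selected_regions() and
 * pci_request_regions(): each returns 0 (success) and counts its calls. */
int selected_calls, all_calls;

int request_selected(void) { selected_calls++; return 0; }
int request_all(void)      { all_calls++;      return 0; }

/* Same shape as the condition in the patch. */
int request_as_posted(int ioport_free)
{
	int err = 0;

	if ((ioport_free && (err = request_selected())) ||
	    (err = request_all()))
		return err;
	return 0;
}

/* One possible alternative: make exactly one of the two calls. */
int request_exactly_one(int ioport_free)
{
	return ioport_free ? request_selected() : request_all();
}

/* Self-check: with the option on, the posted form calls both stubs. */
void request_demo(void)
{
	selected_calls = all_calls = 0;
	assert(request_as_posted(1) == 0);
	assert(selected_calls == 1 && all_calls == 1);

	selected_calls = all_calls = 0;
	assert(request_exactly_one(1) == 0);
	assert(selected_calls == 1 && all_calls == 0);
}
```

Calling request_demo() from a small main() is enough to see the difference.
Splitting the condition into an if/else in the driver would be one way to
get the "exactly one call" behaviour.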
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
Andrew Morton wrote:
> On Fri, 10 Aug 2007 01:23:07 +0200
> Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:
>
>> Hello,
>>
>> This probably doesn't have great impact ;) but ...
>> To reproduce: run torture tests for RCU and then sysrq+q.
>>
>> SysRq : Show Pending Timers
>> Timer List Version: v0.3
>> HRTIMER_MAX_CLOCK_BASES: 2
>> now at 1764338760370 nsecs
>>
>> cpu: 0
>> clock 0:
>> .index: 0
>> .resolution: 1 nsecs
>> .get_time: ktime_get_real
>> .offset: 1186699025823815427 nsecs
>> active timers:
>> clock 1:
>> .index: 1
>> .resolution: 1 nsecs
>> .get_time: ktime_get
>> .offset: 0 nsecs
>> active timers:
>> #0: <3>BUG: sleeping function called from invalid context at
>> kernel/mutex.c:86
>> in_atomic():1, irqs_disabled():1
>> INFO: lockdep is turned off.
>> irq event stamp: 0
>> hardirqs last enabled at (0): [<>] 0x0
>> hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c
>> softirqs last enabled at (0): [] copy_process+0x4c6/0x144c
>> softirqs last disabled at (0): [<>] 0x0
>> [] show_trace_log_lvl+0x1a/0x30
>> [] show_trace+0x12/0x14
>> [] dump_stack+0x15/0x17
>> [] __might_sleep+0xb7/0xc9
>> [] mutex_lock+0x15/0x1f
>> [] lookup_module_symbol_name+0x17/0xc0
>> [] lookup_symbol_name+0x3f/0x43
>> [] print_name_offset+0x1f/0x96
>> [] timer_list_show+0x802/0xcbd
>> [] sysrq_timer_list_show+0xc/0xe
>> [] sysrq_handle_show_timers+0x8/0xa
>> [] __handle_sysrq+0x7b/0x115
>> [] handle_sysrq+0x20/0x24
>> [] kbd_event+0x3a8/0x5c7
>> [] input_pass_event+0x8f/0x91
>> [] input_handle_event+0x98/0x38d
>> [] input_event+0x54/0x67
>> [] atkbd_interrupt+0x200/0x59e
>> [] serio_interrupt+0x7c/0x80
>> [] i8042_interrupt+0x17a/0x289
>> [] handle_IRQ_event+0x28/0x59
>> [] handle_level_irq+0xad/0x10b
>> [] do_IRQ+0x93/0xd0
>> [] common_interrupt+0x2e/0x34
>> [] rcu_read_delay+0x8/0x36 [rcutorture]
>> [] rcu_torture_reader+0x6e/0x169 [rcutorture]
>> [] kthread+0x36/0x58
>> [] kernel_thread_helper+0x7/0x1c
>> ===
>
> We seem to have made a mess in there. timer_list_show() ends up calling
> lookup_module_symbol_name(), which takes a mutex. However print_symbol()
> (which is called at oops time, interrupt time, etc) calls
> module_address_lookup(), which is basically the same, only it doesn't take
> the mutex.
>
> I guess a quicky fix would be to switch
> kernel/time/timer_list.c:print_name_offset() from
> lookup_module_symbol_name() to module_address_lookup(). But we'd still
> have a mess in there.
>
> (adds ccs, runs away)

I don't think rcutorture matters for this bug. As far as I can tell, Andrew's
description of this problem will always apply to this particular sysrq: the
keyboard interrupt leads to handle_sysrq, which leads to timer_list_show,
which leads to lookup_module_symbol_name, which acquires a mutex.

- Josh Triplett
[GIT PULL] SLUB fixes
The following changes are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/christoph/slab.git to-linus

Christoph Lameter (2):
      SLUB: Remove checks for MAX_PARTIAL from kmem_cache_shrink
      SLUB: Fix dynamic dma kmalloc cache creation

Jesper Juhl (1):
      SLUB: Fix format specifier in Documentation/vm/slabinfo.c

 Documentation/vm/slabinfo.c |    2 +-
 mm/slub.c                   |   68 +-
 2 files changed, 48 insertions(+), 22 deletions(-)
[PATCH][Take2] PCI legacy I/O port free driver - Making Intel e1000 driver legacy I/O port free
Dear Auke

http://lkml.org/lkml/2007/5/16/275
> I'm ok with the bottom part of the patch, but I do not like the modification of
> the pci device ID table in this way. As Arjan van de Ven previously commented
> as well, this makes it hard for future device ID's to be bound to the driver.
>
> On top of that, there is no logical correlation between the mapping and
> chipsets, so a lot of information is lost in that table. It really does not show
> which _chipsets_ support this functionality.
>
> I think if we want to work with this, we need some way of mapping the device
> ID's back to chipsets, and enable the feature on that basis.

This patch should meet your need in that it shows the correlation between
device IDs and chipsets. It does not use the existing macro
INTEL_E1000_ETHERNET_DEVICE() to see whether the device uses I/O port or not.
Instead of modifying the PCI device table, I've added another function called
e1000_test_legacy_ioport() which tells you whether the PCI device uses legacy
I/O port or not by checking its chipset.

One caveat: I am not sure which chipsets require legacy I/O port. I found the
following code in the e1000 driver and it seems to be the only part that uses
I/O port, so I assumed the chipsets listed there are the only ones using
legacy I/O port. I might be wrong, so any comments would be helpful.

drivers/net/e1000/e1000_hw.c

int32_t
e1000_reset_hw(struct e1000_hw *hw)
{
...
    switch (hw->mac_type) {
        case e1000_82544:
        case e1000_82540:
        case e1000_82545:
        case e1000_82546:
        case e1000_82541:
        case e1000_82541_rev_2:
            /* These controllers can't ack the 64-bit write when issuing the
             * reset, so use IO-mapping as a workaround to issue the reset */
            E1000_WRITE_REG_IO(hw, CTRL, (ctrl | E1000_CTRL_RST));

> I also would like this option to be non-default, IOW use legacy IO by default,
> and allow the user to specify a module load option to disable use of this
> feature:

I've also added a module parameter so that the user can select whether to
enable the legacy I/O port free mode. The option is off by default.

The rest has not been changed since my previous patch.
Any comments would be helpful.

Tomohiro Kusumi

Signed-off-by: Tomohiro Kusumi <[EMAIL PROTECTED]>

---

diff -Nurp linux-2.6.22.org/drivers/net/e1000/e1000.h linux-2.6.22/drivers/net/e1000/e1000.h
--- linux-2.6.22.org/drivers/net/e1000/e1000.h 2007-07-09 08:32:17.0 +0900
+++ linux-2.6.22/drivers/net/e1000/e1000.h 2007-08-10 09:56:03.0 +0900
@@ -342,6 +342,9 @@ struct e1000_adapter {
 	boolean_t quad_port_a;
 	unsigned long flags;
 	uint32_t eeprom_wol;
+
+	int use_ioport;
+	int bars;
 };
 
 enum e1000_state_t {
diff -Nurp linux-2.6.22.org/drivers/net/e1000/e1000_main.c linux-2.6.22/drivers/net/e1000/e1000_main.c
--- linux-2.6.22.org/drivers/net/e1000/e1000_main.c 2007-07-09 08:32:17.0 +0900
+++ linux-2.6.22/drivers/net/e1000/e1000_main.c 2007-08-10 13:09:25.0 +0900
@@ -222,6 +222,11 @@ static pci_ers_result_t e1000_io_error_d
 static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev);
 static void e1000_io_resume(struct pci_dev *pdev);
 
+static unsigned int enable_legacy_ioport_free = 0;
+module_param(enable_legacy_ioport_free, uint, 0644);
+MODULE_PARM_DESC(enable_legacy_ioport_free, "Enable legacy I/O port free (default:0)");
+static int e1000_test_legacy_ioport(struct pci_dev *pdev);
+
 static struct pci_error_handlers e1000_err_handler = {
 	.error_detected = e1000_io_error_detected,
 	.slot_reset = e1000_io_slot_reset,
@@ -868,8 +873,22 @@ e1000_probe(struct pci_dev *pdev,
 	int i, err, pci_using_dac;
 	uint16_t eeprom_data = 0;
 	uint16_t eeprom_apme_mask = E1000_EEPROM_APME;
-	if ((err = pci_enable_device(pdev)))
+	int bars = 0;
+	int use_ioport = 0;
+
+	if (enable_legacy_ioport_free) {
+		if ((use_ioport = e1000_test_legacy_ioport(pdev)) < 0) {
+			E1000_ERR("e1000_test_legacy_ioport failed, aborting\n");
+			return -1;
+		}
+		if (use_ioport)
+			bars = pci_select_bars(pdev, IORESOURCE_MEM | IORESOURCE_IO);
+		else
+			bars = pci_select_bars(pdev, IORESOURCE_MEM);
+	}
+	else if ((err = pci_enable_device(pdev))) {
 		return err;
+	}
 
 	if (!(err = pci_set_dma_mask(pdev, DMA_64BIT_MASK)) &&
 	    !(err = pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK))) {
@@ -883,7 +902,8 @@ e1000_probe(struct pci_dev *pdev,
 		pci_using_dac = 0;
 	}
 
-	if ((err =
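For readers unfamiliar with the pci_select_bars() call in the probe path of
this patch: as I understand it, it returns a bitmask with one bit per BAR
whose resource flags match the requested type, and that mask is what later
gets passed to the *_selected_regions() calls. Here is a minimal userspace
sketch of that idea. The flag values, the array-of-flags device model and
the names select_bars and select_bars_demo are assumptions made for
illustration; they are not the kernel's definitions.

```c
#include <assert.h>

#define RES_IO   0x1u	/* illustrative values, not the kernel's IORESOURCE_* */
#define RES_MEM  0x2u
#define NUM_BARS 6

/* Return a bitmask with bit i set when BAR i carries any wanted flag. */
int select_bars(const unsigned int bar_flags[NUM_BARS], unsigned int wanted)
{
	int i, bars = 0;

	for (i = 0; i < NUM_BARS; i++)
		if (bar_flags[i] & wanted)
			bars |= 1 << i;
	return bars;
}

/* Hypothetical device: BAR0 is MMIO, BAR2 is an I/O port range. */
void select_bars_demo(void)
{
	const unsigned int dev[NUM_BARS] = { RES_MEM, 0, RES_IO, 0, 0, 0 };

	assert(select_bars(dev, RES_MEM) == 0x1);		/* MMIO only */
	assert(select_bars(dev, RES_MEM | RES_IO) == 0x5);	/* MMIO + I/O */
}
```

With the MMIO-only mask the I/O BAR is simply never claimed, which is the
point of the "legacy I/O port free" mode.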
Question about PF_NOFREEZE
If a thread sets PF_NOFREEZE in current->flags, it is unfreezable. Does this
mean that when the system enters a suspended state, this thread stays awake
even though all the other threads have already gone to sleep?

One thing that confuses me: if all the other threads go to sleep, can this
one thread (assume only one thread has marked itself unfreezable) still work
properly?

Regards
Jason
Re: EFI e820 map handling
On 8/9/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
>
> Hallo,
>
> I thought a bit about the zero page problem. I really would prefer to not
> having it used in a boot loader right now because it's not extensible anymore
> when external users start (ab)using it.
>
> When I asked for separate EFI->e820 functions I was really thinking
> of the kernel to do the conversion; not the boot loader.
>
> Could you move that code into the kernel early boot code please?
> e.g. on x86-64 it could be in head64.c. It could stuff the result
> into the zero page to pass it cleanly on without special cases later.
>
> On i386 a head32.c that runs before start_kernel() could be also
> introduced for this.
>
> As long as it's localized there it is fine.
>
> This would also allow to define new private e820 types and extend
> the string decoding in e820; so that dmesg will correctly contain
>
> EFI:
>
> instead of
>
> BIOS-e820: ...
>

How about elilo to load freebsd or opensolaris?

YH
Re: [PATCH][RFC] 4K stacks default, not a debug thing any more...?
On Friday August 10, [EMAIL PROTECTED] wrote:
> On 8/1/07, Neil Brown <[EMAIL PROTECTED]> wrote:
>
> > No, this does not use indefinite stack.
> >
> > loop will schedule each request to be handled by a kernel thread, so
> > requests to 'loop' are serialised, never stacked.
> >
> > In 2.6.22, generic_make_request detects and serialises recursive calls,
> > so unlimited recursion is not possible there either.
>
> Is that saying "before 2.6.22, a read/write on a deeply layered device
> would use a lot of stack?"

Before 2.6.22, a stack of dm and/or md devices (not loop, and not
md/raid0 or md/linear) would use more stack the more devices were
involved. If you made a very deep stack, you could push the stack
over any limit you chose.

I won't say "a lot of stack" as I haven't measured the exact amount,
just "more stack as you add more devices".

NeilBrown
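To make the "detects and serialises recursive calls" point concrete, here is
a small userspace model. It is only a sketch of the general technique
(turning nested submission into a loop over a pending queue) and makes no
claim about how the 2.6.22 block layer actually keeps its bookkeeping. The
names struct request, submit, make_request, the fixed-size queue and
max_depth_for are all invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

struct request { int level; };	/* level 0 = bottom device of the stack */

static int depth, max_depth;	/* current and deepest make_request nesting */
static struct request pending[MAX_PENDING];
static size_t head, tail;
static int active;		/* a submit loop is already running */
static int serialise;		/* 0 = plain recursion, 1 = queue and loop */

void submit(struct request r);

/* A stacked device: remap the request and pass it to the device below. */
static void make_request(struct request r)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	if (r.level > 0) {
		r.level--;
		submit(r);
	}
	depth--;
}

void submit(struct request r)
{
	if (!serialise) {
		make_request(r);	/* nests once per layer */
		return;
	}
	if (active) {			/* called from inside make_request */
		assert(tail < MAX_PENDING);
		pending[tail++] = r;	/* defer instead of recursing */
		return;
	}
	active = 1;
	make_request(r);
	while (head < tail)		/* drain deferred requests in a loop */
		make_request(pending[head++]);
	head = tail = 0;
	active = 0;
}

/* Submit one request to the top of a stack of `layers` devices and
 * report the deepest nesting that was reached. */
int max_depth_for(int layers, int use_serialise)
{
	struct request r = { layers - 1 };

	depth = max_depth = 0;
	serialise = use_serialise;
	submit(r);
	return max_depth;
}
```

In this model max_depth_for(8, 0) reaches a nesting depth of 8, one frame per
layer, while max_depth_for(8, 1) stays at 1 however many layers are stacked.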
Re: [PATCH][RFC] 4K stacks default, not a debug thing any more...?
On 8/1/07, Neil Brown <[EMAIL PROTECTED]> wrote:

> No, this does not use indefinite stack.
>
> loop will schedule each request to be handled by a kernel thread, so
> requests to 'loop' are serialised, never stacked.
>
> In 2.6.22, generic_make_request detects and serialises recursive calls,
> so unlimited recursion is not possible there either.

Is that saying "before 2.6.22, a read/write on a deeply layered device
would use a lot of stack?"
Re: [-mm PATCH 0/9] Memory controller introduction (v4)
KAMEZAWA Hiroyuki wrote:
> On Wed, 8 Aug 2007 12:51:39 +0900
> KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:
>
>> On Sat, 28 Jul 2007 01:39:37 +0530
>> Balbir Singh <[EMAIL PROTECTED]> wrote:
>>> At OLS, the resource management BOF, it was discussed that we need to manage
>>> RSS and unmapped page cache together. This patchset is a step towards that
>>>
>> Can I make a question ? Why limiting RSS instead of # of used pages per
>> container ? Maybe because of shared pages between container
> Sorry. Ignore above question.
> I didn't understand what mem_container_charge() accounts and limits.
> It controls # of meta_pages.

Hi Kame,

Actually the number of pages resident in memory brought in by a
container is charged. However each such page will have a meta_page
allocated to keep the extra data.

Yes, the accounting counts the number of meta_pages, which is the same
as the number of mapped and unmapped (pagecache) pages brought into
system memory by this container. Whether pagecache pages should be
included or not is configurable per container through the 'type' file
in containerfs.

--Vaidy
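The accounting rule described above can be modelled in a few lines. This is
a rough userspace sketch, not the patchset's code: struct container_model,
model_charge and the count_pagecache field are made-up names, with
count_pagecache standing in for the per-container 'type' setting. What the
real controller does when the limit is hit is not covered in this message,
so the sketch just refuses the charge.

```c
#include <assert.h>

enum page_kind { PAGE_MAPPED, PAGE_CACHE };

struct container_model {
	unsigned long usage;	/* pages charged so far, one meta_page each */
	unsigned long limit;	/* most pages this container may hold */
	int count_pagecache;	/* stands in for the per-container 'type' file */
};

/* Charge one page. Returns 0 when charged or exempt, -1 when at the limit. */
int model_charge(struct container_model *c, enum page_kind kind)
{
	if (kind == PAGE_CACHE && !c->count_pagecache)
		return 0;		/* pagecache not accounted here */
	if (c->usage >= c->limit)
		return -1;
	c->usage++;
	return 0;
}

void model_charge_demo(void)
{
	struct container_model rss_only = { 0, 2, 0 };
	struct container_model with_cache = { 0, 2, 1 };

	assert(model_charge(&rss_only, PAGE_MAPPED) == 0);
	assert(model_charge(&rss_only, PAGE_CACHE) == 0);	/* exempt */
	assert(rss_only.usage == 1);

	assert(model_charge(&with_cache, PAGE_MAPPED) == 0);
	assert(model_charge(&with_cache, PAGE_CACHE) == 0);	/* counted */
	assert(with_cache.usage == 2);
	assert(model_charge(&with_cache, PAGE_MAPPED) == -1);	/* at limit */
}
```

The two containers in model_charge_demo() differ only in whether pagecache
pages count, which is the per-container choice the message describes.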
Re: Problems with USB disk
Greg KH wrote:
> On Tue, Aug 07, 2007 at 10:26:15PM +0200, Niels wrote:
>> Hi,
>>
>> I'm having problems with a new 500 GB USB disk. It works, but sometimes I
>> get these in dmesg:
>>
>> usb 1-3: reset high speed USB device using ehci_hcd and address 2
>> usb 5-1: USB disconnect, address 2
>> drivers/usb/class/usblp.c: usblp0: removed
>> sd 0:0:0:0: Device not ready: <6>: Sense Key : 0x2 [current]
>> : ASC=0x4 ASCQ=0x2
>> end_request: I/O error, dev sda, sector 254148215
>> sd 0:0:0:0: Device not ready: <6>: Sense Key : 0x2 [current]
>> : ASC=0x4 ASCQ=0x2
>> end_request: I/O error, dev sda, sector 252434023
>> EXT3-fs error (device sda1): ext3_find_entry: reading directory #15761836 offset 0
>>
>> There's also a printer connected. This is on a pci/usb2 card.
>>
>> When the above happens, I get I/O errors. When I mount the drive next, there
>> are errors and often missing files. Quite annoying!
>>
>> Kernel is 2.6.21
>>
>> What's going on?
>
> You have a low voltage issue, or a bad cable. The device is
> electronically disconnecting itself. Try using an externally-powered
> hub, or a new cable.

I see the external drive becoming read-only, although I haven't checked
the dmesg for the events, since other things in my system generate a
bunch of output I have to wade through. New cable, separate power,
doesn't do it under 2.6.20-* Fedora or 2.6.21.x kernel.org kernels.

I'll check the dmesg next time it happens, but I doubt a kernel version
change would heal the hardware issues you mention.

-- 
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
Re: rtc max frequency setting
H. Peter Anvin wrote:
> Jan Engelhardt wrote:
>> Hi,
>>
>> with the old rtc.ko module, there was a /proc/sys/dev/rtc/max-user-freq
>> that could be set. With rtc_cmos.ko (or the new rtc infrastructure in
>> general), I am missing this file. Where can I set the max-user-freq now,
>> or is this obsolete now?
>> (mplayer prefers to have user-freq to be >= 1024.)
>
> Qemu wants something like this too.
>
> Both of these really want something else, which is a high-frequency
> userspace timer. What is the best way to do that on modern kernels?

/proc/sys/dev/hpet/max-user-freq? But I notice that some kernels provide
both values (my 2.6.15, was where I looked), so maybe the rtc went away.

-- 
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
Unable to Handle kernel paging request at virtual address during rmmod/insmod
Hi,

I'm kind of lost in my debugging. Hope someone out there can help me
figure out the problem.

I am loading 4 modules: class driver, device controller, interface layer
(working on a dual-core), and abstraction layer. The device has to work as
a mass storage class device. After doing some file transfers and more
complex testing, I unload the modules/driver, and this is where my problem
shows up. Most of the time I get "unable to handle kernel paging request",
and it happens at different points in time.

I implemented a memory counter in all the modules to ensure that every
kmalloc'd buffer is kfreed, so I am confident there is no memory leak in
any module.

My understanding from the call trace is that it is failing somewhere in
free_block/sys_delete_module, etc. These routines are internal to the
kernel and are called on rmmod. Is there anything I can do in my modules
to resolve this issue?

I am using vmalloc to allocate a 64K buffer twice in one of the modules. In
the HAL abstraction layer, ioremap_nocache is also used. Could the problem
be related to this?

I am using linux-2.6.18.

Any help will be greatly appreciated.

grace
Re: [PATCH 00/23] per device dirty throttling -v8
Andi Kleen wrote:
> richard kennedy <[EMAIL PROTECTED]> writes:
>> This is on a standard desktop machine so there are lots of other
>> processes running on it, and although there is a degree of variability
>> in the numbers, they are very repeatable and your patch always
>> outperforms the stock mm2.
>> looks good to me
>
> iirc the goal of this is less to get better performance, but to avoid long
> user visible latencies. Of course if it's faster it's great too, but that's
> only secondary.

What a trade-off, if you want to get rid of long latency you have to
live with better throughput. I can live with that. ;-)

Your point well taken, not the intent of the patch, but it may indicate
where a performance bottleneck happens as well.

-- 
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
Oops in 2.6.21-gentoo-r4 during rsync
Hi,

I get the following error when doing an 'rsync -avz --progress.' Any
help would be appreciated.

Unable to handle kernel NULL pointer dereference at RIP:
 [] __rmqueue+0x6c/0x140
PGD 7b24f067 PUD 71ab9067 PMD 0
Oops: [1] SMP
CPU 1
Modules linked in: rfcomm snd_seq snd_seq_device bridge llc acpi_memhotplug pciehp pci_hotplug kvm saa7134_alsa saa7134_dvb dvb_pll tda826x mt352 tda10086 video_buf_dvb dvb_core nxt200x isl6421 tda1004x udf ext4dev jbd2 dm_bbr ntfs vfat msdos fat i2c_amd756 i2c_dev usblp usb_storage uhci_hcd quickcam_messenger usbvideo nbd snd_pcm_oss snd_mixer_oss ieee80211_crypt_tkip ieee80211_crypt_ccmp ieee80211_crypt_wep ieee80211 ieee80211_crypt fuse aes bfusb bcm203x bnep sco hidp l2cap pwc hci_usb usbhid tuner saa7134 video_buf compat_ioctl32 ir_kbd_i2c ir_common videodev v4l2_common v4l1_compat i2c_nforce2 i2c_core ehci_hcd ohci_hcd k8temp sg
Pid: 9731, comm: rsync Not tainted 2.6.21-gentoo-r4 #1
RIP: 0010:[] [] __rmqueue+0x6c/0x140
RSP: 0018:81007266d4d8 EFLAGS: 00010013
RAX: RBX: RCX:
RDX: RSI: RDI: 8100c600
RBP: 8100c698 R08: R09:
R10: 0929 R11: 0001 R12:
R13: 81007e3a0d40 R14: 8100c600 R15: 001f
FS: 2b548bc18ae0() GS:81007e3a0cc0() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: CR3: 702d4000 CR4: 06e0
Process rsync (pid: 9731, threadinfo 81007266c000, task 8100711a4a20)
Stack: 0082 81007e3a0d50 81007e3a0d40 802645f0 81007266d558 0001 8100d868 00018022d104 8100d860 0050
Call Trace:
 [] get_page_from_freelist+0x1e0/0x4e0
 [] __alloc_pages+0x180/0x300
 [] kmem_getpages+0x70/0x140
 [] fallback_alloc+0x104/0x1c0
 [] kmem_cache_alloc+0x97/0xb0
 [] journal_add_journal_head+0x27/0x170
 [] journal_dirty_data+0x37/0x210
 [] ext3_journal_dirty_data+0x1d/0x50
 [] walk_page_buffers+0x68/0xb0
 [] journal_dirty_data_fn+0x0/0x20
 [] ext3_ordered_writepage+0x108/0x190
 [] shrink_inactive_list+0x442/0x8a0
 [] __split_bio+0x3cc/0x3f0
 [] __up_read+0x21/0xb0
 [] dm_request+0x119/0x120
 [] generic_make_request+0x159/0x170
 [] shrink_zone+0xf6/0x130
 [] try_to_free_pages+0x183/0x280
 [] __alloc_pages+0x1df/0x300
 [] __do_page_cache_readahead+0xe4/0x290
 [] __pollwait+0x0/0x120
 [] transfer_objects+0x52/0x80
 [] __activate_task+0x33/0x50
 [] try_to_wake_up+0x3dc/0x400
 [] dm_table_any_congested+0x15/0x70
 [] blockable_page_cache_readahead+0x6d/0xe0
 [] make_ahead_window+0x86/0xb0
 [] page_cache_readahead+0x195/0x1e0
 [] do_generic_mapping_read+0x127/0x410
 [] file_read_actor+0x0/0x140
 [] generic_file_aio_read+0x16c/0x1b0
 [] do_sync_read+0xcf/0x120
 [] autoremove_wake_function+0x0/0x30
 [] vfs_read+0xdb/0x180
 [] sys_read+0x53/0x90
 [] system_call+0x7e/0x83

Code: 48 8b 08 48 8b 50 08 4c 8d 68 d8 48 89 51 08 48 89 0a 48 c7
RIP [] __rmqueue+0x6c/0x140
 RSP
CR2:

Thanks,
Sol
Re: [PATCH 00/23] per device dirty throttling -v8
[EMAIL PROTECTED] wrote:
> On Sun, 5 Aug 2007, Diego Calleja wrote:
> > On Sun, 5 Aug 2007 09:13:20 +0200, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > > Measurements show that noatime helps 20-30% on regular desktop workloads, easily 50% for kernel builds and much more than that (in excess of 100%) for file-read-intense workloads. We cannot just walk
> >
> > And as everybody knows, on servers it is a popular practice to disable it. According to an interview with the kernel.org admins:
> >
> > "Beyond that, Peter noted, "very little fancy is going on, and that is good because fancy is hard to maintain." He explained that the only fancy thing being done is that all filesystems are mounted noatime meaning that the system doesn't have to make writes to the filesystem for files which are simply being read, "that cut the load average in half."
> >
> > I bet that some people would consider such a performance hit a bug...
>
> actually, it's popular practice to disable it by people who know how big a hit it is and know how few programs use it.
>
> i've been a linux sysadmin for 10 years, and have known about noatime for at least 7 years, but I always thought of it in the category of 'use it only on your performance critical machines where you are trying to extract every ounce of performance, and keep an eye out for things misbehaving'
>
> I never imagined that it was the 20%+ hit that is being described, and with so little impact, or I would have switched to it across the board years ago.

To get that magnitude you need a slow disk with a very fast CPU. It helps most on systems where the disk hardware is marginal or worse for the I/O load. Don't take that as typical.

> I'll bet there are a lot of admins out there in the same boat.
>
> adding an option in the kernel to change the default sounds like a very good first step, even if the default isn't changed today.

-- 
Bill Davidsen <[EMAIL PROTECTED]>
"We have more to fear from the bungling of the incompetent than from the machinations of the wicked."
- from Slashdot - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] [MTD] Fix CFI build error with meaningless nonfunctional .config
On Mon, 2007-08-06 at 18:15 +0200, Ingo Molnar wrote:
> randconfig testing on .23-rc2 triggered the following build error:

When building NOR flash support, you have compile-time options for the bus width and the number of individual chips which are interleaved together onto that bus. The code to deal with arbitrary geometry is a bit convoluted, and people want to just configure it for the specific hardware they have, to avoid the runtime overhead.

Selecting _none_ of the available options doesn't make any sense. You should have at least one. This makes it build though, since people persist in trying.

Signed-off-by: David Woodhouse <[EMAIL PROTECTED]>

diff --git a/include/linux/mtd/cfi.h b/include/linux/mtd/cfi.h
index 123948b..e17c534 100644
--- a/include/linux/mtd/cfi.h
+++ b/include/linux/mtd/cfi.h
@@ -57,6 +57,15 @@
 #define cfi_interleave_is_8(cfi) (0)
 #endif
 
+#ifndef cfi_interleave
+#warning No CONFIG_MTD_CFI_Ix selected. No NOR chip support can work.
+static inline int cfi_interleave(void *cfi)
+{
+	BUG();
+	return 0;
+}
+#endif
+
 static inline int cfi_interleave_supported(int i)
 {
 	switch (i) {
diff --git a/include/linux/mtd/map.h b/include/linux/mtd/map.h
index 81f3a31..a9fae03 100644
--- a/include/linux/mtd/map.h
+++ b/include/linux/mtd/map.h
@@ -125,7 +125,15 @@
 #endif
 
 #ifndef map_bankwidth
-#error "No bus width supported. What's the point?"
+#warning "No CONFIG_MTD_MAP_BANK_WIDTH_xx selected. No NOR chip support can work"
+static inline int map_bankwidth(void *map)
+{
+	BUG();
+	return 0;
+}
+#define map_bankwidth_is_large(map) (0)
+#define map_words(map) (0)
+#define MAX_MAP_BANKWIDTH 1
 #endif
 
 static inline int map_bankwidth_supported(int w)

-- 
dwmw2
Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK
On Thu, 9 Aug 2007, Daniel Phillips wrote:

> If you believe that the deadlock problems we address here can be
> better fixed by making reclaim more intelligent then please post a
> patch and we will test it. I am highly skeptical, but the proof is in
> the patch.

Then please test the patch that I posted here earlier to reclaim even if PF_MEMALLOC is set. It may require some fixups but it should address your issues in most vm load situations.
[PATCH] Minor fix to Documentation/powerpc/00-INDEX
Signed-off-by: Rob Landley <[EMAIL PROTECTED]>

I have a python script to convert 00-INDEX files into index.html files, and a second script to show 404 errors in the result as well as files/directories nothing links to. (It's not very useful yet, but in case you're wondering: http://kernel.org/doc/docdiridx.py and http://kernel.org/doc/doclinkcheck.py .)

Anyway, my simple index.html generator breaks on the Documentation/powerpc directory because one of the description lines is two lines long. This patch joins those two lines together into one line. This is the only instance (so far) of this problem.
---

In case you're wondering, here are the current 404 errors in the various 00-INDEX files. Fixing all this is on my todo list:

Documentation/ecryptfs.txt
Documentation/time_interpolators.txt
Documentation/arm/SA1100
Documentation/arm/XScale
Documentation/arm/empeg
Documentation/arm/nwfpe
Documentation/isdn/README.eicon
Documentation/fb/clgenfb.txt
Documentation/networking/ethertap.txt
Documentation/filesystems/reiser4.txt
Documentation/scsi/AM53C974.txt
Documentation/scsi/ChangeLog

The "files and directories not linked to" list is 679 lines long.

diff -r /dev/null Documentation/powerpc/00-INDEX
--- a/Documentation/powerpc/00-INDEX	Thu Aug 09 08:40:21 2007 -0700
+++ b/Documentation/powerpc/00-INDEX	Thu Aug 09 20:49:03 2007 -0500
@@ -6,8 +6,7 @@ 00-INDEX
 00-INDEX
 	- this file
 cpu_features.txt
-	- info on how we support a variety of CPUs with minimal compile-time
-	options.
+	- info on how we support a variety of CPUs with minimal compile-time options.
 eeh-pci-error-recovery.txt
 	- info on PCI Bus EEH Error Recovery
 hvcs.txt

-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.
mlock() is working?
Hi guys,

I'm trying to lock a piece of my program's memory using mlock(). I wrote a simple program to test it, and to verify the result I am using my own simple page fault notifier [0]. The program is below.

-- cut --
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

#define SIZE 1

int mlock_all = 0;

int f(void)
{
	int c[SIZE];	/* automatic (stack) array we try to lock */
	int i;

	if (mlock_all) {
		if (!mlockall(MCL_CURRENT))
			fprintf(stderr, "mlockall'ed successfully\n");
		else
			perror("mlockall");
	} else {
		if (!mlock(&c[0], SIZE))
			fprintf(stderr, "mlock'ed successfully\n");
		else
			perror("mlock");
	}

	fprintf(stderr, "start: %p, end: %p\n", (void *)&c[0], (void *)&c[SIZE]);

	/* touch the array so any page fault shows up in the notifier */
	for (i = 0; i < SIZE; i++)
		c[i] = i;

	return 0;
}

int main(int argc, char **argv)
{
	/* any argument selects mlockall() instead of mlock() */
	if (argv[1])
		mlock_all = 1;

	while (1) {
		f();
		sleep(15);
	}

	return 0;
}
-- cut --

So, if I use mlockall() I always obtain the desired result, i.e., 'c[SIZE]' is locked. But when I switch to mlock() it never works, and my page fault notifier prints all pages concerning 'c[SIZE]'.

Am I missing something? Is it possible to lock automatic variables? My Linux is 2.6.22.2.

my regards

[0] http://lkml.org/lkml/2007/7/27/11
    http://lkml.org/lkml/2007/7/27/8

-- 
Tiago Vignatti
C3SL - Centro de Computação Científica e Software Livre
www.c3sl.ufpr.br
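One thing worth keeping in mind when reading the program above: mlock() takes a length in bytes and works on whole pages. On Linux the start address is rounded down to a page boundary, and every page containing part of the range is locked. The following is only a sketch of that rounding, not anything taken from the kernel; `struct span` and `locked_span` are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a page-aligned start plus a length in bytes. */
struct span {
	uintptr_t start;
	size_t len;
};

/*
 * Given a byte range [addr, addr + len) and a page size that is a power
 * of two, return the smallest page-aligned range that covers it. This
 * is the set of pages a page-granular lock would end up touching.
 */
static struct span locked_span(uintptr_t addr, size_t len, size_t page)
{
	struct span s;
	uintptr_t mask, end;

	assert(page != 0 && (page & (page - 1)) == 0);

	mask = ~(uintptr_t)(page - 1);
	s.start = addr & mask;				/* round start down */
	end = (addr + len + page - 1) & mask;		/* round end up */
	s.len = (size_t)(end - s.start);
	return s;
}
```

For the array in the test program one would pass `(uintptr_t)c`, `sizeof c` (bytes, not elements) and the value of `sysconf(_SC_PAGESIZE)`. A range that straddles a page boundary covers two pages even when it is only a few bytes long.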
Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK
On 8/9/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> The allocation problems that this patch addresses can be fixed by making
> reclaim more intelligent.

If you believe that the deadlock problems we address here can be better fixed by making reclaim more intelligent then please post a patch and we will test it. I am highly skeptical, but the proof is in the patch.

> If we can reclaim in an emergency even in ATOMIC contexts then things get much
> easier.

It is already easy, and it is already fixed in this patch series. Sure, we can pare these patches down a little more, but you are going to have a really hard time coming up with something simpler that actually works.

Regards,

Daniel
[PATCH] obsolete fragment from ext4
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Once ext4 will not implement fragment, it is believed it will never be implement in future. Therefore fragment related source code in ext4 should be obsoleted -- no one will use it. This patch obsolete fragment from ext4. Another patch posted on linux-ext4 removing fragment supporting from e2fsprogs. I tested both patch. Signed-Off-By: Coly Li <[EMAIL PROTECTED]> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 427f830..b8b538d 100644 - --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -576,11 +576,6 @@ got: /* dirsync only applies to directories */ if (!S_ISDIR(mode)) ei->i_flags &= ~EXT4_DIRSYNC_FL; - -#ifdef EXT4_FRAGMENTS - - ei->i_faddr = 0; - - ei->i_frag_no = 0; - - ei->i_frag_size = 0; - -#endif ei->i_file_acl = 0; ei->i_dir_acl = 0; ei->i_dtime = 0; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index a4848e0..f283522 100644 - --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2645,11 +2645,6 @@ void ext4_read_inode(struct inode * inode) } inode->i_blocks = le32_to_cpu(raw_inode->i_blocks); ei->i_flags = le32_to_cpu(raw_inode->i_flags); - -#ifdef EXT4_FRAGMENTS - - ei->i_faddr = le32_to_cpu(raw_inode->i_faddr); - - ei->i_frag_no = raw_inode->i_frag; - - ei->i_frag_size = raw_inode->i_fsize; - -#endif ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl); if (EXT4_SB(inode->i_sb)->s_es->s_creator_os != cpu_to_le32(EXT4_OS_HURD)) @@ -2794,11 +2789,6 @@ static int ext4_do_update_inode(handle_t *handle, raw_inode->i_blocks = cpu_to_le32(inode->i_blocks); raw_inode->i_dtime = cpu_to_le32(ei->i_dtime); raw_inode->i_flags = cpu_to_le32(ei->i_flags); - -#ifdef EXT4_FRAGMENTS - - raw_inode->i_faddr = cpu_to_le32(ei->i_faddr); - - raw_inode->i_frag = ei->i_frag_no; - - raw_inode->i_fsize = ei->i_frag_size; - -#endif if (EXT4_SB(inode->i_sb)->s_es->s_creator_os != cpu_to_le32(EXT4_OS_HURD)) raw_inode->i_file_acl_high = diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 75adbb6..5e04d68 100644 - --- a/fs/ext4/super.c +++ 
b/fs/ext4/super.c @@ -1655,14 +1655,6 @@ static int ext4_fill_super (struct super_block *sb, void *data, int silent) if (sbi->s_inode_size > EXT4_GOOD_OLD_INODE_SIZE) sb->s_time_gran = 1 << (EXT4_EPOCH_BITS - 2); } - - sbi->s_frag_size = EXT4_MIN_FRAG_SIZE << - -le32_to_cpu(es->s_log_frag_size); - - if (blocksize != sbi->s_frag_size) { - - printk(KERN_ERR - -"EXT4-fs: fragsize %lu != blocksize %u (unsupported)\n", - -sbi->s_frag_size, blocksize); - - goto failed_mount; - - } sbi->s_desc_size = le16_to_cpu(es->s_desc_size); if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_64BIT)) { if (sbi->s_desc_size < EXT4_MIN_DESC_SIZE_64BIT || @@ -1676,7 +1668,6 @@ static int ext4_fill_super (struct super_block *sb, void *data, int silent) } else sbi->s_desc_size = EXT4_MIN_DESC_SIZE; sbi->s_blocks_per_group = le32_to_cpu(es->s_blocks_per_group); - - sbi->s_frags_per_group = le32_to_cpu(es->s_frags_per_group); sbi->s_inodes_per_group = le32_to_cpu(es->s_inodes_per_group); if (EXT4_INODE_SIZE(sb) == 0) goto cantfind_ext4; @@ -1700,12 +1691,6 @@ static int ext4_fill_super (struct super_block *sb, void *data, int silent) sbi->s_blocks_per_group); goto failed_mount; } - - if (sbi->s_frags_per_group > blocksize * 8) { - - printk (KERN_ERR - - "EXT4-fs: #fragments per group too big: %lu\n", - - sbi->s_frags_per_group); - - goto failed_mount; - - } if (sbi->s_inodes_per_group > blocksize * 8) { printk (KERN_ERR "EXT4-fs: #inodes per group too big: %lu\n", diff --git a/include/linux/ext4_fs.h b/include/linux/ext4_fs.h index cdee7aa..3baeb99 100644 - --- a/include/linux/ext4_fs.h +++ b/include/linux/ext4_fs.h @@ -105,20 +105,6 @@ #define EXT4_BLOCK_ALIGN(size, blkbits)ALIGN((size), (1 << (blkbits))) /* - - * Macro-instructions used to manage fragments - - */ - -#define EXT4_MIN_FRAG_SIZE 1024 - -#define EXT4_MAX_FRAG_SIZE 4096 - -#define EXT4_MIN_FRAG_LOG_SIZE 10 - -#ifdef __KERNEL__ - -# define EXT4_FRAG_SIZE(s) (EXT4_SB(s)->s_frag_size) - -# define EXT4_FRAGS_PER_BLOCK(s) 
(EXT4_SB(s)->s_frags_per_block) - -#else - -# define EXT4_FRAG_SIZE(s) (EXT4_MIN_FRAG_SIZE << (s)->s_log_frag_size) - -# define EXT4_FRAGS_PER_BLOCK(s) (EXT4_BLOCK_SIZE(s) / EXT4_FRAG_SIZE(s)) - -#endif - - - -/* * Structure of a blocks group descriptor */ struct
Re: JFFS2/mtdsuper modprobe "unknown symbol" in 2.6.23-rc1
In message <[EMAIL PROTECTED]>, Adrian Bunk writes:
> On Thu, Aug 09, 2007 at 10:38:18PM -0400, Erez Zadok wrote:
> > I'm getting an error modprobing jffs2 due to mtdsuper failing to insmod:
> >...
> > Does anyone know what I am missing?
>
> You miss that 2.6.23-rc2 with this bug fixed has already been released.

Great, I'll upgrade to rc2 (I've had this problem since .22-rc). Thanks for the quick response.

Erez.
Re: JFFS2/mtdsuper modprobe "unknown symbol" in 2.6.23-rc1
On Thu, Aug 09, 2007 at 10:38:18PM -0400, Erez Zadok wrote:
> I'm getting an error modprobing jffs2 due to mtdsuper failing to insmod:
>...
> Does anyone know what I am missing?

You miss that 2.6.23-rc2 with this bug fixed has already been released.

> Thanks,
> Erez.

cu
Adrian

-- 
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
JFFS2/mtdsuper modprobe "unknown symbol" in 2.6.23-rc1
I'm getting an error modprobing jffs2 due to mtdsuper failing to insmod:

# modprobe jffs2
WARNING: Error inserting mtdsuper (/lib/modules/2.6.23-rc1/kernel/drivers/mtd/mtdsuper.ko): Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error inserting jffs2 (/lib/modules/2.6.23-rc1/kernel/fs/jffs2/jffs2.ko): Unknown symbol in module, or unknown parameter (see dmesg)

# dmesg | tail
mtdsuper: Unknown symbol get_mtd_device
mtdsuper: Unknown symbol put_mtd_device
jffs2: Unknown symbol get_sb_mtd
jffs2: Unknown symbol kill_mtd_super

My relevant .config is:

CONFIG_MTD=m
CONFIG_MTD_BLKDEVS=m
CONFIG_MTD_BLOCK=m
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
CONFIG_MTD_BLOCK2MTD=m
CONFIG_JFFS2_FS=m
CONFIG_JFFS2_FS_DEBUG=0
CONFIG_JFFS2_FS_WRITEBUFFER=y
CONFIG_JFFS2_SUMMARY=y
CONFIG_JFFS2_FS_XATTR=y
CONFIG_JFFS2_FS_POSIX_ACL=y
CONFIG_JFFS2_FS_SECURITY=y
CONFIG_JFFS2_COMPRESSION_OPTIONS=y
CONFIG_JFFS2_ZLIB=y
CONFIG_JFFS2_RTIME=y
CONFIG_JFFS2_CMODE_PRIORITY=y

A "quick hack" around this which I found is to add

MODULE_LICENSE("GPL");

to the end of drivers/mtd/mtdsuper.c, but that doesn't sound like the right fix. Does anyone know what I am missing?

Thanks,
Erez.
Re: [PATCH 3/3] sysctl: Error on bad sysctl tables
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:

> Hello.
>
> In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 14:09:29 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:
>
>> After going through the kernel's sysctl tables several times it has
>> become clear that code review and testing is just not effective in
>> preventing problematic sysctl tables from being used in the stable
>> kernel. I certainly can't seem to fix the problems as fast as
>> they are introduced.
> :
>> The biggest part of the code is the table of valid binary sysctl
>> entries, but since we have frozen our set of binary sysctls this table
>> should not need to change, and it makes it much easier to detect
>> when someone unintentionally adds a new binary sysctl value.
>
> I don't think everyone needs to have this code, so
> it is better to make it configurable via
> CONFIG_SYSCTL_DEBUG or something..., ...no?

I guess the other thing is that, except for code size, it doesn't matter, as register_sysctl_table gets called very rarely.

Eric
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:

> In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 20:23:16 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:
>
>> YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:
>>
>> > Would you explain why it does not work properly
>> > for those cases?
>>
>> Mostly no appropriate strategy routine was setup to
>> report the data to the caller of sys_sysctl.
>
> I assume that default strategy have been existing for it, no?!
> Maybe, I do miss something...

I'd have to go through it case by case. But in general, unless your proc_handler is proc_dointvec, the default strategy routine, which does a raw binary copy of your data out, will generally do the wrong thing. So especially if your data is jiffies or otherwise needs processing, you don't want to use the default strategy routine.

Until relatively recently no one was really policing the sysctl interfaces, and even now it isn't too serious.

Eric
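To illustrate the point about jiffies: a value stored internally in clock ticks means little to user space if its bytes are copied out unchanged, so a handler has to convert it first. What follows is a minimal user-space sketch, not kernel code. `EXAMPLE_HZ`, `report_raw` and `report_ms` are hypothetical names, and the tick rate is an assumed value chosen only for the arithmetic.

```c
#include <assert.h>

/* Assumed tick rate, for illustration only; the real rate is a kernel build option. */
#define EXAMPLE_HZ 250UL

/* What a raw binary copy hands back: the internal tick count, unchanged. */
static unsigned long report_raw(unsigned long ticks)
{
	return ticks;
}

/* What a converting handler hands back: the same interval in milliseconds. */
static unsigned long report_ms(unsigned long ticks)
{
	/* refuse inputs where the multiplication below would overflow */
	assert(ticks <= (unsigned long)-1 / 1000UL);
	return ticks * 1000UL / EXAMPLE_HZ;
}
```

With the assumed rate, one second is stored as 250 ticks: the raw path reports 250, which a caller expecting milliseconds would misread, while the converting path reports 1000.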
Re: [PATCH 3/3] sysctl: Error on bad sysctl tables
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:

> Hello.
>
> In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 14:09:29 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:
>
>> After going through the kernel's sysctl tables several times it has
>> become clear that code review and testing is just not effective in
>> preventing problematic sysctl tables from being used in the stable
>> kernel. I certainly can't seem to fix the problems as fast as
>> they are introduced.
> :
>> The biggest part of the code is the table of valid binary sysctl
>> entries, but since we have frozen our set of binary sysctls this table
>> should not need to change, and it makes it much easier to detect
>> when someone unintentionally adds a new binary sysctl value.
>
> I don't think everyone needs to have this code, so
> it is better to make it configurable via
> CONFIG_SYSCTL_DEBUG or something..., ...no?

I wouldn't reject such a patch. We are a ways out from the next stable kernel merge window and I'd love to see what else falls out, so I'd like to have it on by default for a bit.

Eric
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 20:23:16 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:

> YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:
>
> > Would you explain why it does not work properly
> > for those cases?
>
> Mostly no appropriate strategy routine was setup to
> report the data to the caller of sys_sysctl.

I assume that default strategy have been existing for it, no?!
Maybe, I do miss something...

--yoshfuji
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:

> Would you explain why it does not work properly
> for those cases?

Mostly, no appropriate strategy routine was set up to report the data to the caller of sys_sysctl.

Eric
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Thu, Aug 09, 2007 at 07:00:40PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc2-mm1:
> > >...
> > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > >...
> > > 2.6.23 queue
> > >...
> >
> > All drivers were converted to no longer use xtime directly since it
> > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > as RNG...
>
> This code doesn't care if the time is outdated, as it is simply
> periodically perturbing an RNG, but OK.
>...

I should have been a bit more concrete:

I have a patch pending to unexport xtime for catching unsafe usages, and you added an (ab)user.

> Thanx, Paul

cu
Adrian

-- 
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 18:56:09 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:

>
> - In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary
>   sysctl names for a function that works with proc.
:

Well, retrans_time_ms and base_reachable_time_ms supersede retrans_time and base_reachable_time, and we have warned about their deprecation for a long time.

So maybe it is time to remove the old interfaces (retrans_time and base_reachable_time) and simplify ndisc_ifinfo_syctl_change().

--yoshfuji
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
Andrew Morton <[EMAIL PROTECTED]> writes:

> But it is good to remove bad interfaces, if we possibly can.
>
> It is worth making the attempt. Does anyone know of anything which will
> break? I fed NET_NEIGH_ANYCAST_DELAY at random into
> http://www.google.com/codesearch and came up with nothing...

My current policy is that, since I could only find 5 real-world Linux programs that even call sys_sysctl, if I find a broken sysctl binary interface I'm lazy and just remove it. The only networking one I know of is radvd.

Added to that, I just pushed an autochecking sysctl patch to Andrew that fails register_sysctl_table if the sysctl table is broken. And all of these showed up. So some fix was needed or things would have been even worse.

Eric
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 18:49:21 -0700 (PDT)), David Miller <[EMAIL PROTECTED]> says:

> From: YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]>
> Date: Fri, 10 Aug 2007 10:47:10 +0900 (JST)
>
> > I disagree. It is bad to remove existing interface.
> > Ditto for other patches.
>
> I think perhaps you misunderstand what Eric is doing.
>
> sys_sysctl() isn't working properly for these cases and it is both a
> deprecated interface and not worth the pain of adding support
> in these cases.

Would you explain why it does not work properly for those cases?

--yoshfuji
Re: [PATCH] Fix typo in arch/i386/kernel/tsc.c
On Thu, 09 Aug 2007 18:58:09 -0700 Josh Triplett <[EMAIL PROTECTED]> wrote:

> - * We can use khz divisor instead of mhz to keep a better percision, since
> + * We can use khz divisor instead of mhz to keep a better precision, since

I have an arbitrary i-dont-do-typos policy (unless they're in a printk or in documentation). [EMAIL PROTECTED] is the home for patches such as this, please.
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Thu, 9 Aug 2007 19:00:40 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc2-mm1:
> > >...
> > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > >...
> > > 2.6.23 queue
> > >...
> >
> > All drivers were converted to no longer use xtime directly since it
> > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > as RNG...
>
> This code doesn't care if the time is outdated, as it is simply
> periodically perturbing an RNG, but OK.
>
> So, what interface are we supposed to be using instead? I cannot use
> get_random_bytes() due to locking issues. This is not a cryptographically
> secure usage, so the perturbation does not need to be extremely high
> quality.
>
> On x86, I would just grab the low-order bits of the TSC, but all of the
> world is not an x86. ;-)
>

One used to use sched_clock() for this, then get frowned at. Now we have cpu_clock()...
Re: [PATCH 04/10] mm: slub: add knowledge of reserve pages
On Thu, 9 Aug 2007, Daniel Phillips wrote:

> No matter how you look at this problem, you still need to have _some_
> sort of reserve, and limit access to it. We extend existing methods,

The reserve is in the memory in the zone, and reclaim can guarantee that there are a sufficient number of easily reclaimable pages in it.

> you are proposing to what seems like an entirely new reserve

The reserve has always been managed by per-zone counters. Nothing new there.

> management system. Great idea, maybe, but it does not solve the
> deadlocks. You still need some organized way of being sure that your
> reserve is as big as you need (hopefully not an awful lot bigger) and
> you still have to make sure that nobody dips into that reserve further
> than they are allowed to.

Nope, there is no need to have additional reserves. You delay the writeout until you are finished with reclaim. Then you do the writeout. During writeout, reclaim may be called as needed. After the writeout is complete, you recheck the vm counters to be sure that the dirty ratio / easily reclaimable ratio and the mem low / high boundaries are still okay. If not, go back to reclaim.

> So translation: reclaim from "easily freeable" lists is an
> optimization, maybe a great one. Probably great. Reclaim from atomic
> context is also a great idea, probably. But you are talking about a
> whole nuther patch set. Neither of those are in themselves a fix for
> these deadlocks.

Yes, they are a much better fix and may allow code cleanup by getting rid of checks for PF_MEMALLOC. They integrate in a straightforward way into the existing reclaim methods.
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.23-rc2-mm1:
> >...
> > +allow-rcutorture-to-handle-synchronize_sched.patch
> >...
> > 2.6.23 queue
> >...
>
> All drivers were converted to no longer use xtime directly since it
> might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> as RNG...

This code doesn't care if the time is outdated, as it is simply periodically perturbing an RNG, but OK.

So, what interface are we supposed to be using instead? I cannot use get_random_bytes() due to locking issues. This is not a cryptographically secure usage, so the perturbation does not need to be extremely high quality.

On x86, I would just grab the low-order bits of the TSC, but all of the world is not an x86. ;-)

						Thanx, Paul
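For readers unfamiliar with the idea of "periodically perturbing an RNG": a cheap pseudo-random generator can have a few bits from some clock mixed into its state every so often, so that runs do not repeat exactly. The sketch below is not the rcutorture code. `struct lcg`, `lcg_next` and `RESEED_EVERY` are hypothetical names, the multiplier and increment are the well-known Numerical Recipes LCG constants picked purely for illustration, and the caller is assumed to supply the clock sample from whatever source is safe in its context.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative reseed interval; a real user would pick something far larger. */
#define RESEED_EVERY 4u

/* Generator state plus a countdown to the next perturbation. */
struct lcg {
	uint32_t state;
	uint32_t count;
};

/*
 * Return the next pseudo-random value. When the countdown has run out,
 * fold the caller-supplied noise (for example the low bits of a clock)
 * into the state first; otherwise the noise argument is ignored.
 */
static uint32_t lcg_next(struct lcg *r, uint32_t noise)
{
	assert(r != 0);

	if (r->count == 0) {
		r->state += noise;
		r->count = RESEED_EVERY;
	}
	r->count--;

	/* linear congruential step, arithmetic wraps modulo 2^32 */
	r->state = r->state * 1664525u + 1013904223u;
	return r->state;
}
```

Because the noise is only added, never relied on for quality, a stale or coarse clock value is harmless here, which is the property the message above describes. This is not suitable for anything security related.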
Re: [PATCH 3/3] sysctl: Error on bad sysctl tables
Hello.

In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 14:09:29 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:

> After going through the kernel's sysctl tables several times it has
> become clear that code review and testing is just not effective in
> preventing problematic sysctl tables from being used in the stable
> kernel. I certainly can't seem to fix the problems as fast as
> they are introduced.
:
> The biggest part of the code is the table of valid binary sysctl
> entries, but since we have frozen our set of binary sysctls this table
> should not need to change, and it makes it much easier to detect
> when someone unintentionally adds a new binary sysctl value.

I don't think everyone needs to have this code, so it would be better to make it configurable via CONFIG_SYSCTL_DEBUG or something..., ...no?

--yoshfuji
[PATCH] Fix typo in arch/i386/kernel/tsc.c
Signed-off-by: Josh Triplett <[EMAIL PROTECTED]>
---
 arch/i386/kernel/tsc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/i386/kernel/tsc.c b/arch/i386/kernel/tsc.c
index debd7db..8a58d30 100644
--- a/arch/i386/kernel/tsc.c
+++ b/arch/i386/kernel/tsc.c
@@ -80,7 +80,7 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
  * And since SC is a constant power of two, we can convert the div
  * into a shift.
  *
- * We can use khz divisor instead of mhz to keep a better percision, since
+ * We can use khz divisor instead of mhz to keep a better precision, since
  * cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits.
  * ([EMAIL PROTECTED])
  *
-- 
1.5.2.1
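For readers unfamiliar with the comment being patched, the scaling it describes can be sketched in plain C. This is only an illustration built from the comment's own numbers (SC = 2^10, a kHz divisor, a scale bounded by 10^6 * 2^10); the macro and function names below are assumptions chosen for the example, not a copy of tsc.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed name for the exponent of SC: SC = 2^10, as in the comment. */
#define CYC2NS_SCALE_FACTOR 10

/*
 * Hypothetical helper: nanoseconds per cycle, pre-multiplied by SC.
 * Dividing by kHz rather than MHz keeps three more decimal digits.
 * The largest possible result (cpu_khz == 1) is 10^6 * 2^10 =
 * 1,024,000,000, which still fits in 32 bits.
 */
static inline uint32_t cyc2ns_scale_from_khz(uint32_t cpu_khz)
{
	return (uint32_t)((UINT64_C(1000000) << CYC2NS_SCALE_FACTOR) / cpu_khz);
}

/*
 * Hypothetical helper: convert cycles to nanoseconds. Because SC is a
 * power of two, the division by SC becomes a right shift. The multiply
 * can overflow 64 bits for very large cycle counts; this sketch does
 * not guard against that.
 */
static inline uint64_t cycles_to_ns(uint64_t cycles, uint32_t scale)
{
	return (cycles * scale) >> CYC2NS_SCALE_FACTOR;
}
```

With a 1 GHz clock (cpu_khz = 1000000) the scale is exactly 1024, so one cycle maps to one nanosecond.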
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
On Fri, 10 Aug 2007 10:47:10 +0900 (JST) YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> wrote:

> Hello.
>
> In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 18:56:09 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:
>
> >
> > - In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary
> >   sysctl names for a function that works with proc.
> >
> > - In neighbour.c reorder the table to put the possibly unused entries
> >   at the end so we can remove them by terminating the table early.
> >
> > - In neighbour.c kill the entries with questionable binary sysctl
> >   handling behavior.
> >
> > - In neighbour.c if we don't have a strategy routine remove the
> >   binary path. So we don't the default sysctl strategy routine
> >   on data that is not ready for it.
> >
>
> I disagree. It is bad to remove existing interface.

But it is good to remove bad interfaces, if we possibly can. It is worth
making the attempt.

Does anyone know of anything which will break? I fed
NET_NEIGH_ANYCAST_DELAY at random into http://www.google.com/codesearch
and came up with nothing...
Re: [PATCH 04/10] mm: slub: add knowledge of reserve pages
On 8/8/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 8 Aug 2007 10:57:13 -0700 (PDT)
> Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> > I think in general irq context reclaim is doable. Cannot see obvious
> > issues on a first superficial pass through rmap.c. The irq holdoff would
> > be pretty long though which may make it unacceptable.
>
> The IRQ holdoff could be tremendous. But if it is sufficiently infrequent
> and if the worst effect is merely a network rx ring overflow then the tradeoff
> might be a good one.

Hi Andrew,

No matter how you look at this problem, you still need to have _some_
sort of reserve, and limit access to it. We extend existing methods;
you are proposing what seems like an entirely new reserve management
system. Great idea, maybe, but it does not solve the deadlocks. You
still need some organized way of being sure that your reserve is as big
as you need (hopefully not an awful lot bigger) and you still have to
make sure that nobody dips into that reserve further than they are
allowed to.

So translation: reclaim from "easily freeable" lists is an optimization,
maybe a great one. Probably great. Reclaim from atomic context is also a
great idea, probably. But you are talking about a whole nuther patch
set. Neither of those is in itself a fix for these deadlocks.

Regards,

Daniel
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
From: YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]>
Date: Fri, 10 Aug 2007 10:47:10 +0900 (JST)

> I disagree. It is bad to remove existing interface.
> Ditto for other patches.

I think perhaps you misunderstand what Eric is doing.

sys_sysctl() isn't working properly for these cases and it is both a
deprecated interface and not worth the pain of adding support in these
cases.

The fact that nobody complains that none of this stuff works via
sys_sysctl() to me proves that it is never used.
Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK
On Thu, 9 Aug 2007, Daniel Phillips wrote:

> You can fix reclaim as much as you want and the basic deadlock will
> still not go away. When you finally do get to writing something out,
> memory consumers in the writeout path are going to cause problems,
> which this patch set fixes.

We currently also do *not* write out immediately. I/O is queued when
submitted so it does *not* reduce memory. It is better to actually delay
writeout until you have thrown out clean pages. At that point the free
memory is at its high point. If memory goes below the high point again
by these writes then we can again reclaim until things are right.

> Agreed that the idea of mempool always sounded strange, and we show
> how to get rid of them, but that is not the immediate purpose of this
> patch set.

Ok mempools are unrelated. The allocation problems that this patch
addresses can be fixed by making reclaim more intelligent. This may well
make mempools less of an issue in the kernel. If we can reclaim in an
emergency even in ATOMIC contexts then things get much easier.
Re: 2.6.22 x86_64 : kernel initial decompression hangs on vmware
Zachary Amsden wrote:
> Avi Kivity wrote:
>> We haven't seen any issue with the 2.6.22 boot decompressor. Which of
>> the four (fs, gs, ldt, or tr) were proving problematic and why?
>
> It was tr that was affecting Workstation, since we boot through normal
> BIOS path, and only a 16-bit task was loaded at this point.

Ah. Maybe we didn't have an exit while we were in long mode with the
16-bit tss, so VT didn't notice the illegal combination.
Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.
Hello.

In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 18:56:09 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:

>
> - In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary
>   sysctl names for a function that works with proc.
>
> - In neighbour.c reorder the table to put the possibly unused entries
>   at the end so we can remove them by terminating the table early.
>
> - In neighbour.c kill the entries with questionable binary sysctl
>   handling behavior.
>
> - In neighbour.c if we don't have a strategy routine remove the
>   binary path. So we don't the default sysctl strategy routine
>   on data that is not ready for it.
>

I disagree. It is bad to remove existing interface.
Ditto for other patches.

Regards,

--yoshfuji
Re: [PATCH 3/4] Embed zone_id information within the zonelist->zones pointer
On Fri, 10 Aug 2007, Mel Gorman wrote:

> > > +#if defined(CONFIG_SMP) && INTERNODE_CACHE_SHIFT > ZONES_SHIFT
> >
> > Is this necessary? ZONES_SHIFT is always <= 2 so it will work with
> > any pointer. Why disable this for UP?
> >
>
> Caution in case the number of zones increases. There was no guarantee of
> zone alignment. It's the same reason I have a BUG_ON in the encode
> function so that if we don't catch problems at compile-time, it'll go
> BANG in a nice predictable fashion.

Caution would lead to a BUG_ON but why the #if? Why exclude UP?
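For context, the technique under discussion (stashing a small zone id in the low bits of an aligned pointer, with a runtime check in the encode step) can be sketched as follows. This is a user-space illustration only: the struct layout and the helper names are made up for the example, ZONES_SHIFT is fixed at 2 as mentioned in the exchange, and assert() stands in for the kernel's BUG_ON.

```c
#include <assert.h>
#include <stdint.h>

#define ZONES_SHIFT  2
#define ZONE_ID_MASK ((((uintptr_t)1) << ZONES_SHIFT) - 1)

/* Placeholder type; a long member gives at least 4-byte alignment
 * on mainstream ABIs, which is what the low two bits rely on. */
struct zone {
	long pad[2];
};

/* Hypothetical encode step: refuse pointers whose low bits are not free. */
static inline struct zone *encode_zone(struct zone *z, unsigned int zone_id)
{
	uintptr_t bits = (uintptr_t)z;

	assert((bits & ZONE_ID_MASK) == 0);	/* alignment check, like BUG_ON */
	assert(zone_id <= ZONE_ID_MASK);	/* id must fit in the spare bits */
	return (struct zone *)(bits | zone_id);
}

/* Recover the real pointer by masking off the id bits. */
static inline struct zone *decode_zone(struct zone *encoded)
{
	return (struct zone *)((uintptr_t)encoded & ~ZONE_ID_MASK);
}

/* Recover the id without dereferencing the zone. */
static inline unsigned int decode_zone_id(struct zone *encoded)
{
	return (unsigned int)((uintptr_t)encoded & ZONE_ID_MASK);
}
```

The runtime check fires regardless of SMP or UP, which is the point of the question above: the check alone already catches a misaligned zone.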
2.6.23-rc2-mm1: rcutorture xtime usage
On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc2-mm1:
>...
> +allow-rcutorture-to-handle-synchronize_sched.patch
>...
> 2.6.23 queue
>...

All drivers were converted to no longer use xtime directly since it
might be quite outdated, but this patch adds a usage of xtime.tv_nsec
as RNG...

cu
Adrian

-- 
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
On Thu, Aug 09, 2007 at 03:24:40PM -0400, Chris Snook wrote:
> Paul E. McKenney wrote:
> >On Thu, Aug 09, 2007 at 02:13:52PM -0400, Chris Snook wrote:
> >>Paul E. McKenney wrote:
> >>>On Thu, Aug 09, 2007 at 01:14:35PM -0400, Chris Snook wrote:
> >>>>If you're depending on volatile writes
> >>>>being visible to other CPUs, you're screwed either way, because the CPU
> >>>>can hold that data in cache as long as it wants before it writes it to
> >>>>memory. When this finally does happen, it will happen atomically,
> >>>>which is all that atomic_set guarantees. If you need to guarantee that
> >>>>the value is written to memory at a particular time in your execution
> >>>>sequence, you either have to read it from memory to force the compiler
> >>>>to store it first (and a volatile cast in atomic_read will suffice for
> >>>>this) or you have to use LOCK_PREFIX instructions which will invalidate
> >>>>remote cache lines containing the same variable. This patch doesn't
> >>>>change either of these cases.
> >>>The case that it -can- change is interactions with interrupt handlers.
> >>>And NMI/SMI handlers, for that matter.
> >>You have a point here, but only if you can guarantee that the interrupt
> >>handler is running on a processor sharing the cache that has the
> >>not-yet-written volatile value. That implies a strictly non-SMP
> >>architecture. At the moment, none of those have volatile in their
> >>declaration of atomic_t, so this patch can't break any of them.
> >
> >This can also happen when using per-CPU variables. And there are a
> >number of per-CPU variables that are either atomic themselves or are
> >structures containing atomic fields.
>
> Accessing per-CPU variables in this fashion reliably already requires a
> suitable smp/non-smp read/write memory barrier. I maintain that if we
> break anything with this change, it was really already broken, if less
> obviously. Can you give a real or synthetic example of legitimate code
> that could break?

My main concern is actually the lack of symmetry -- I would expect
that an atomic_set() would have the same properties as atomic_read().
It is easy and cheap to provide them with similar properties, so why
not? Debugging even a single problem would consume far more time than
simply giving them corresponding semantics.

But you asked for examples. These are synthetic, and of course
legitimacy is in the eye of the beholder.

1.	Watchdog variable.

	atomic_t watchdog = ATOMIC_INIT(0);

	...

	int i;
	while (!done) {

		/* Do some stuff that doesn't take more than a few us. */

		/* Could do atomic increment, but throughput penalty. */
		i++;
		atomic_set(&watchdog, i);
	}
	do_something_with();

	/* Every so often on some other CPU... */

	if ((new_watchdog = atomic_read(&watchdog)) == old_watchdog)
		die_horribly();
	old_watchdog = new_watchdog;

	If atomic_set() did not have volatile semantics, the compiler
	would be within its rights optimizing it to simply get the
	final value of "i" after exit from the loop. This would cause
	the watchdog check to fail spuriously. Memory barriers are
	not required in this case, because the CPU cannot hang onto
	the value for very long -- we don't care about the exact value,
	or about exact synchronization, but rather about whether or
	not the value is changing.

	In this (toy) example, one might replace the atomic_set() with
	an atomic increment (though that might be too expensive in some
	cases) or with something like:

		atomic_set(&watchdog, atomic_read(&watchdog) + 1);

	However, other cases might not permit this transformation,
	for example, an existing heavily used API might take int rather
	than atomic_t.

	Some will no doubt argue that this example should use a
	macro or an asm similar to the "forget()" asm put forward
	elsewhere in this thread.

2.	Communicating both with interrupt handler and with other CPUs.
	For example, data elements that are built up in a location
	visible to interrupts and NMIs, and then added as a unit
	to a data structure visible to other CPUs. This more-realistic
	example is abbreviated to the point of pointlessness as follows:

	struct foo {
		atomic_t a;
		atomic_t b;
	};

	DEFINE_PER_CPU(struct foo *, staging) = NULL;

	/* Create element in staging area. */

	__get_cpu_var(staging) = kzalloc(sizeof(*p), GFP_WHATEVER);
	if (__get_cpu_var(staging) == NULL)
		die_horribly();
	/* allocate an element of some per-CPU array, get the result in "i" */
	atomic_set(__get_cpu_var(staging).a, i);
	/* allocate another element of a per-CPU array, with result in "i" */
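As a rough user-space illustration of the symmetry argued for in example 1, the sketch below gives atomic_set() the same volatile cast that atomic_read() gets, so the compiler has to emit a store on every loop iteration instead of only the final value. The atomic_t layout, both accessors and spin_work() are assumptions written for this example; they are not the kernel's definitions for any architecture, and the sketch says nothing about ordering between CPUs.

```c
#include <assert.h>

/* Assumed minimal atomic_t, modelled on the usual "struct with one int". */
typedef struct {
	int counter;
} atomic_t;

#define ATOMIC_INIT(i) { (i) }

/* The volatile cast forces an actual load each time this is called. */
static inline int atomic_read(const atomic_t *v)
{
	return *(const volatile int *)&v->counter;
}

/* The same cast on the store side forces an actual store each time. */
static inline void atomic_set(atomic_t *v, int i)
{
	*(volatile int *)&v->counter = i;
}

static atomic_t watchdog = ATOMIC_INIT(0);

/* Hypothetical worker loop: publishes progress on every iteration. */
static int spin_work(int iterations)
{
	int i = 0;

	while (iterations-- > 0) {
		i++;
		atomic_set(&watchdog, i);	/* cannot be hoisted out of the loop */
	}
	return i;
}
```

A monitor on another CPU would then compare successive atomic_read(&watchdog) results, as in the example above; whether and when it observes each store still depends on the hardware.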
[RFC 3/3] SGI Altix cross partition memory (XPMEM)
This patch provides cross partition access to user memory (XPMEM) when running multiple partitions on a single SGI Altix. Signed-off-by: Dean Nelson <[EMAIL PROTECTED]> xpmem-module.v002.bz2 Description: BZip2 compressed data
[RFC 2/3] SGI Altix cross partition memory (XPMEM)
This patch exports zap_page_range as it is needed by XPMEM.

Signed-off-by: Dean Nelson <[EMAIL PROTECTED]>

---

XPMEM would have used sys_madvise() except that madvise_dontneed()
returns an -EINVAL if VM_PFNMAP is set, which is always true for the pages
XPMEM imports from other partitions and is also true for uncached pages
allocated locally via the mspec allocator. XPMEM needs zap_page_range()
functionality for these types of pages as well as 'normal' pages.

Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c	2007-08-09 07:07:55.762651612 -0500
+++ linux-2.6/mm/memory.c	2007-08-09 07:15:43.226389312 -0500
@@ -894,6 +894,7 @@
 	tlb_finish_mmu(tlb, address, end);
 	return end;
 }
+EXPORT_SYMBOL_GPL(zap_page_range);
 
 /*
  * Do a quick page-table lookup for a single page.
Re: [RFC 1/3] SGI Altix cross partition memory (XPMEM)
This patch exports __put_task_struct as it is needed by XPMEM.

Signed-off-by: Dean Nelson <[EMAIL PROTECTED]>

---

One struct file_operations registered by XPMEM, xpmem_open(), calls
'get_task_struct(current->group_leader)' and another, xpmem_flush(), calls
'put_task_struct(tg->group_leader)'. The reason for this is given in the
comment block that appears in xpmem_open().

	/*
	 * Increment 'usage' and 'mm->mm_users' for the current task's thread
	 * group leader. This ensures that both its task_struct and mm_struct
	 * will still be around when our thread group exits. (The Linux kernel
	 * normally tears down the mm_struct prior to calling a module's
	 * 'flush' function.) Since all XPMEM thread groups must go through
	 * this path, this extra reference to mm_users also allows us to
	 * directly inc/dec mm_users in xpmem_ensure_valid_PFNs() and avoid
	 * mmput() which has a scaling issue with the mmlist_lock.
	 */

Index: linux-2.6/kernel/fork.c
===================================================================
--- linux-2.6.orig/kernel/fork.c	2007-08-09 07:07:55.426611601 -0500
+++ linux-2.6/kernel/fork.c	2007-08-09 07:15:43.246391700 -0500
@@ -127,6 +127,7 @@
 	if (!profile_handoff_task(tsk))
 		free_task(tsk);
 }
+EXPORT_SYMBOL_GPL(__put_task_struct);
 
 void __init fork_init(unsigned long mempages)
 {
[RFC 0/3] SGI Altix cross partition memory (XPMEM)
Terminology

  The term 'partition', adopted by the SGI hardware designers and which
  percolated up into the software, is used in reference to a single SSI
  when multiple SSIs are running on a single Altix. An Altix running
  multiple SSIs is said to be 'partitioned', whereas one that is running
  only a single SSI is said to be 'unpartitioned'.

  The term '[a]cross partition' refers to a functionality that spans
  between two SSIs on a multi-SSI Altix. ('XP' is its abbreviation.)

Introduction

  This feature provides cross partition access to user memory (XPMEM) when
  running multiple partitions on a single SGI Altix. XPMEM, like XPNET,
  utilizes XPC to communicate between the partitions.

  XPMEM allows a user process to identify portion(s) of its address space
  that other user processes can attach (i.e. map) into their own address
  spaces. These processes can be running on the same or a different
  partition from the one whose memory they are attaching.

Known Issues

  XPMEM is not currently using the kthread API (which is also true for XPC)
  because it was in the process of being changed to require a kthread_stop()
  be done for every kthread_create() and the kthread_stop() couldn't be called
  for a thread that had already exited. In talking with Eric Biederman, there
  was some thought of creating a kthread_orphan() which would eliminate the
  need for a call to kthread_stop().
Re: [PATCH][RFC] 4K stacks default, not a debug thing any more...?
On 8/1/07, Alan Cox <[EMAIL PROTECTED]> wrote:
> On Wed, 1 Aug 2007 15:33:58 +0200
> Andrea Arcangeli <[EMAIL PROTECTED]> wrote:
> > Tweaking kernel ptes is prohibitive during clone() because that's
> > kernel memory and it would require a flush tlb all with IPIs that
> > won't scale (IPIs are really the blocker)
>
> Agreed - except when doing debug work then its an acceptable cost. You
> still have to sort the debug side out because you are going to fault the
> kernel stack which will probably then cause a triple fault and reboot on
> the spot.

I was assuming debugging work, yes. I was also thinking it wouldn't be
done at clone() time, but mapped (on a single CPU) at the time of a
context switch. It would eliminate the IPI, but would probably make the
rest of the TLB handling much too ugly to contemplate.

As an alternative, could the TLB flush and associated IPI be deferred
until the process migrates? First migration would trigger flush/IPI,
further migration would be as now, no?

I'd happily run it with various dm/md layers underneath

On 8/1/07, Denis Vlasenko <[EMAIL PROTECTED]> wrote:
> Hmm, neat. Why do you need to _allocate second page_ at all?
> Just mark it "not present"...

Because the kernel mapping covers all physical memory contiguously, so
if the page isn't allocated, it could be used by a kernel data structure
you need to access. Same reason the kernel stack has to be contiguous
pages. Well, for non-highmem at least. Either way, you don't want to
mark an in-use page as inaccessible, you never know what's under there.
[PATCH 10/10] sysctl: Remove the binary interface for aio-nr, aio-max-nr, acpi_video_flags
aio-nr, aio-max-nr, acpi_video_flags are unsigned long values which
sysctl does not handle properly with a 64bit kernel and a 32bit user
space.

Since no one is likely to be using the binary sysctl values
and the ascii interface still works, this patch just removes
support for the binary sysctl interface from the kernel.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 kernel/sysctl.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ccae8da..03759ab 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -688,7 +688,6 @@ static struct ctl_table kern_table[] = {
 #endif
 #if defined(CONFIG_ACPI_SLEEP) && defined(CONFIG_X86)
 	{
-		.ctl_name = KERN_ACPI_VIDEO_FLAGS,
 		.procname = "acpi_video_flags",
 		.data = &acpi_realmode_flags,
 		.maxlen = sizeof (unsigned long),
@@ -1148,7 +1147,6 @@ static struct ctl_table fs_table[] = {
 		.extra2 = ,
 	},
 	{
-		.ctl_name = FS_AIO_NR,
 		.procname = "aio-nr",
 		.data = &aio_nr,
 		.maxlen = sizeof(aio_nr),
@@ -1156,7 +1154,6 @@ static struct ctl_table fs_table[] = {
 		.proc_handler = &proc_doulongvec_minmax,
 	},
 	{
-		.ctl_name = FS_AIO_MAX_NR,
 		.procname = "aio-max-nr",
 		.data = &aio_max_nr,
 		.maxlen = sizeof(aio_max_nr),
-- 
1.5.1.1.181.g2de0
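The width mismatch behind this patch can be illustrated in isolation. The sketch below is not the kernel's sysctl code path; it only assumes a generic "copy at most as many bytes as the caller's buffer holds" routine (copy_out() is a made-up name) and uses fixed-width types to stand in for a 64-bit kernel's unsigned long and a 32-bit process's unsigned long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a binary read: copy the kernel-side value
 * into the caller's buffer, truncating to the caller's buffer size.
 * Returns the number of bytes copied.
 */
static size_t copy_out(void *user_buf, size_t user_len,
		       const void *kernel_data, size_t kernel_len)
{
	size_t n = user_len < kernel_len ? user_len : kernel_len;

	memcpy(user_buf, kernel_data, n);
	return n;
}
```

If the kernel-side value is a uint64_t such as 0x0000000100000001 and the caller supplies a uint32_t, only four of the eight bytes arrive, and which four depends on byte order. The ascii interface under /proc avoids the problem because the value travels as text.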
[PATCH 09/10] sysctl: ipv4 remove binary sysctl paths where they are broken.
Currently tcp_available_congestion_control does not even attempt to
support being read via sys_sysctl, and ipfrag_max_dist, while it works,
allows invalid values to be set using sys_sysctl.

So just kill the binary sys_sysctl support for these sysctls. If the
support is not important enough to test and get right it probably
isn't important enough to keep.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/ipv4/sysctl_net_ipv4.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 53ef0f4..282eb7e 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -672,7 +672,6 @@ ctl_table ipv4_table[] = {
 		.strategy = &sysctl_jiffies
 	},
 	{
-		.ctl_name = NET_IPV4_IPFRAG_MAX_DIST,
 		.procname = "ipfrag_max_dist",
 		.data = &sysctl_ipfrag_max_dist,
 		.maxlen = sizeof(int),
@@ -797,7 +796,6 @@ ctl_table ipv4_table[] = {
 	},
 #endif /* CONFIG_NETLABEL */
 	{
-		.ctl_name = NET_TCP_AVAIL_CONG_CONTROL,
 		.procname = "tcp_available_congestion_control",
 		.maxlen = TCP_CA_BUF_MAX,
 		.mode = 0444,
-- 
1.5.1.1.181.g2de0
[PATCH 08/10] sysctl: Remove broken cdrom binary sysctls
The binary interface for the cdrom sysctls can't possibly work.
So remove the binary sysctls and update the test for finding out which
sysctl table entry we are dealing with to use the procname and not
the ctl_name (which I am removing).

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 drivers/cdrom/cdrom.c |   31 +++++++++----------------------
 1 files changed, 9 insertions(+), 22 deletions(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 67ee3d4..f0c6318 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -3468,32 +3468,25 @@ static int cdrom_sysctl_handler(ctl_table *ctl, int write, struct file * filp,
 		else
 			*valp = 0;
 
-		switch (ctl->ctl_name) {
-		case DEV_CDROM_AUTOCLOSE: {
+		if (strcmp(ctl->procname, "autoclose") == 0) {
 			if (valp == &cdrom_sysctl_settings.autoclose)
 				autoclose = cdrom_sysctl_settings.autoclose;
-			break;
-			}
-		case DEV_CDROM_AUTOEJECT: {
+		}
+		else if (strcmp(ctl->procname, "autoeject") == 0) {
 			if (valp == &cdrom_sysctl_settings.autoeject)
 				autoeject = cdrom_sysctl_settings.autoeject;
-			break;
-			}
-		case DEV_CDROM_DEBUG: {
+		}
+		else if (strcmp(ctl->procname, "debug") == 0) {
 			if (valp == &cdrom_sysctl_settings.debug)
 				debug = cdrom_sysctl_settings.debug;
-			break;
-			}
-		case DEV_CDROM_LOCK: {
+		}
+		else if (strcmp(ctl->procname, "lock") == 0) {
 			if (valp == &cdrom_sysctl_settings.lock)
 				lockdoor = cdrom_sysctl_settings.lock;
-			break;
-			}
-		case DEV_CDROM_CHECK_MEDIA: {
+		}
+		else if (strcmp(ctl->procname, "check_media") == 0) {
 			if (valp == &cdrom_sysctl_settings.check)
 				check_media_type = cdrom_sysctl_settings.check;
-			break;
-			}
 		}
 
 		/* update the option flags according to the changes.
 		   we don't have per device options through sysctl yet,
@@ -3507,7 +3500,6 @@ static int cdrom_sysctl_handler(ctl_table *ctl, int write, struct file * filp,
 /* Place files in /proc/sys/dev/cdrom */
 static ctl_table cdrom_table[] = {
 	{
-		.ctl_name = DEV_CDROM_INFO,
 		.procname = "info",
 		.data = &cdrom_sysctl_settings.info,
 		.maxlen = CDROM_STR_SIZE,
@@ -3515,7 +3507,6 @@ static ctl_table cdrom_table[] = {
 		.proc_handler = &cdrom_sysctl_info,
 	},
 	{
-		.ctl_name = DEV_CDROM_AUTOCLOSE,
 		.procname = "autoclose",
 		.data = &cdrom_sysctl_settings.autoclose,
 		.maxlen = sizeof(int),
@@ -3523,7 +3514,6 @@ static ctl_table cdrom_table[] = {
 		.proc_handler = &cdrom_sysctl_handler,
 	},
 	{
-		.ctl_name = DEV_CDROM_AUTOEJECT,
 		.procname = "autoeject",
 		.data = &cdrom_sysctl_settings.autoeject,
 		.maxlen = sizeof(int),
@@ -3531,7 +3521,6 @@ static ctl_table cdrom_table[] = {
 		.proc_handler = &cdrom_sysctl_handler,
 	},
 	{
-		.ctl_name = DEV_CDROM_DEBUG,
 		.procname = "debug",
 		.data = &cdrom_sysctl_settings.debug,
 		.maxlen = sizeof(int),
@@ -3539,7 +3528,6 @@ static ctl_table cdrom_table[] = {
 		.proc_handler = &cdrom_sysctl_handler,
 	},
 	{
-		.ctl_name = DEV_CDROM_LOCK,
 		.procname = "lock",
 		.data = &cdrom_sysctl_settings.lock,
 		.maxlen = sizeof(int),
@@ -3547,7 +3535,6 @@ static ctl_table cdrom_table[] = {
 		.proc_handler = &cdrom_sysctl_handler,
 	},
 	{
-		.ctl_name = DEV_CDROM_CHECK_MEDIA,
 		.procname = "check_media",
 		.data = &cdrom_sysctl_settings.check,
 		.maxlen = sizeof(int),
-- 
1.5.1.1.181.g2de0
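The change of dispatch key in this patch, from a numeric ctl_name to the procname string, can be shown in a self-contained form. The sketch below is a simplified stand-in rather than the driver's code: both structs and sync_option() are invented for illustration, and it omits the valp pointer comparison the real handler also performs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the values written through /proc. */
struct cdrom_settings {
	int autoclose, autoeject, debug, lock, check;
};

/* Hypothetical mirror of the driver-wide option variables. */
struct cdrom_options {
	int autoclose, autoeject, debug, lockdoor, check_media_type;
};

/*
 * Pick the option to update by comparing the entry's procname, so no
 * binary sysctl number is needed. Returns 1 if the name matched one of
 * the integer options, 0 otherwise (for example the read-only "info").
 */
static int sync_option(const char *procname,
		       const struct cdrom_settings *s,
		       struct cdrom_options *o)
{
	if (strcmp(procname, "autoclose") == 0)
		o->autoclose = s->autoclose;
	else if (strcmp(procname, "autoeject") == 0)
		o->autoeject = s->autoeject;
	else if (strcmp(procname, "debug") == 0)
		o->debug = s->debug;
	else if (strcmp(procname, "lock") == 0)
		o->lockdoor = s->lock;
	else if (strcmp(procname, "check_media") == 0)
		o->check_media_type = s->check;
	else
		return 0;
	return 1;
}
```

A chain of string comparisons is slower than a switch on an integer, but these handlers run only when an administrator touches a /proc file, so the cost is unlikely to matter.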
[PATCH 07/10] sysctl: x86_64 remove unnecessary binary paths.
Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 arch/x86_64/ia32/ia32_binfmt.c |    1 -
 arch/x86_64/kernel/vsyscall.c  |   10 +---------
 2 files changed, 1 insertions(+), 10 deletions(-)

diff --git a/arch/x86_64/ia32/ia32_binfmt.c b/arch/x86_64/ia32/ia32_binfmt.c
index dffd2ac..c80c3f1 100644
--- a/arch/x86_64/ia32/ia32_binfmt.c
+++ b/arch/x86_64/ia32/ia32_binfmt.c
@@ -291,7 +291,6 @@ static void elf32_init(struct pt_regs *regs)
 
 static ctl_table abi_table2[] = {
 	{
-		.ctl_name = 99,
 		.procname = "vsyscall32",
 		.data = &sysctl_vsyscall32,
 		.maxlen = sizeof(int),
diff --git a/arch/x86_64/kernel/vsyscall.c b/arch/x86_64/kernel/vsyscall.c
index 06c3494..69918b5 100644
--- a/arch/x86_64/kernel/vsyscall.c
+++ b/arch/x86_64/kernel/vsyscall.c
@@ -260,18 +260,10 @@ out:
 	return ret;
 }
 
-static int vsyscall_sysctl_nostrat(ctl_table *t, int __user *name, int nlen,
-				void __user *oldval, size_t __user *oldlenp,
-				void __user *newval, size_t newlen)
-{
-	return -ENOSYS;
-}
-
 static ctl_table kernel_table2[] = {
-	{ .ctl_name = 99, .procname = "vsyscall64",
+	{ .procname = "vsyscall64",
 	  .data = &vsyscall_gtod_data.sysctl_enabled, .maxlen = sizeof(int),
 	  .mode = 0644,
-	  .strategy = vsyscall_sysctl_nostrat,
 	  .proc_handler = vsyscall_sysctl_change },
 	{}
 };
-- 
1.5.1.1.181.g2de0
[PATCH 06/10] sysctl: Remove broken sunrpc debug binary sysctls
>From ddf280c9de903f1fb5d4ecf9c68df0c479d7c7d2 Mon Sep 17 00:00:00 2001
From: Eric W. Biederman <[EMAIL PROTECTED]>
Date: Thu, 9 Aug 2007 16:00:00 -0600
Subject:

This is debug code so no need to support binary sysctl,
and the binary sysctls as they were written were not
consistent with what showed up in /proc so remove the binary
sysctl support.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/sunrpc/sysctl.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/sysctl.c b/net/sunrpc/sysctl.c
index 738db32..864b541 100644
--- a/net/sunrpc/sysctl.c
+++ b/net/sunrpc/sysctl.c
@@ -114,7 +114,6 @@ done:
 
 static ctl_table debug_table[] = {
 	{
-		.ctl_name = CTL_RPCDEBUG,
 		.procname = "rpc_debug",
 		.data = &rpc_debug,
 		.maxlen = sizeof(int),
@@ -122,7 +121,6 @@ static ctl_table debug_table[] = {
 		.proc_handler = &proc_dodebug
 	},
 	{
-		.ctl_name = CTL_NFSDEBUG,
 		.procname = "nfs_debug",
 		.data = &nfs_debug,
 		.maxlen = sizeof(int),
@@ -130,7 +128,6 @@ static ctl_table debug_table[] = {
 		.proc_handler = &proc_dodebug
 	},
 	{
-		.ctl_name = CTL_NFSDDEBUG,
 		.procname = "nfsd_debug",
 		.data = &nfsd_debug,
 		.maxlen = sizeof(int),
@@ -138,7 +135,6 @@ static ctl_table debug_table[] = {
 		.proc_handler = &proc_dodebug
 	},
 	{
-		.ctl_name = CTL_NLMDEBUG,
 		.procname = "nlm_debug",
 		.data = &nlm_debug,
 		.maxlen = sizeof(int),
-- 
1.5.1.1.181.g2de0
[PATCH 05/10] sysctl: ipv6 route flushing (kill binary path)
We don't properly support the sysctl binary path for flushing
the ipv6 routes. So remove support for a binary path.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/ipv6/route.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 55ea80f..0d23a46 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2458,7 +2458,6 @@ int ipv6_sysctl_rtcache_flush(ctl_table *ctl, int write, struct file * filp,
 
 ctl_table ipv6_route_table[] = {
 	{
-		.ctl_name = NET_IPV6_ROUTE_FLUSH,
 		.procname = "flush",
 		.data = &flush_delay,
 		.maxlen = sizeof(int),
-- 
1.5.1.1.181.g2de0
[PATCH 04/10] sysctl: Fix neighbour table sysctls.
- In ipv6, fix ndisc_ifinfo_sysctl_change so it doesn't depend on binary sysctl names for a function that works with proc. - In neighbour.c reorder the table to put the possibly unused entries at the end so we can remove them by terminating the table early. - In neighbour.c kill the entries with questionable binary sysctl handling behavior. - In neighbour.c if we don't have a strategy routine remove the binary path, so we don't call the default sysctl strategy routine on data that is not ready for it. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> --- net/core/neighbour.c | 75 ++ net/ipv6/ndisc.c | 24 ++- 2 files changed, 49 insertions(+), 50 deletions(-) diff --git a/net/core/neighbour.c b/net/core/neighbour.c index ca2a153..27c3f4e 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -2498,7 +2498,6 @@ static struct neigh_sysctl_table { .proc_handler = _dointvec, }, { - .ctl_name = NET_NEIGH_RETRANS_TIME, .procname = "retrans_time", .maxlen = sizeof(int), .mode = 0644, @@ -2543,27 +2542,40 @@ static struct neigh_sysctl_table { .proc_handler = _dointvec, }, { - .ctl_name = NET_NEIGH_ANYCAST_DELAY, .procname = "anycast_delay", .maxlen = sizeof(int), .mode = 0644, .proc_handler = _dointvec_userhz_jiffies, }, { - .ctl_name = NET_NEIGH_PROXY_DELAY, .procname = "proxy_delay", .maxlen = sizeof(int), .mode = 0644, .proc_handler = _dointvec_userhz_jiffies, }, { - .ctl_name = NET_NEIGH_LOCKTIME, .procname = "locktime", .maxlen = sizeof(int), .mode = 0644, .proc_handler = _dointvec_userhz_jiffies, }, { + .ctl_name = NET_NEIGH_RETRANS_TIME_MS, + .procname = "retrans_time_ms", + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = _dointvec_ms_jiffies, + .strategy = _ms_jiffies, + }, + { + .ctl_name = NET_NEIGH_REACHABLE_TIME_MS, + .procname = "base_reachable_time_ms", + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = _dointvec_ms_jiffies, + .strategy = _ms_jiffies, + }, + { .ctl_name = NET_NEIGH_GC_INTERVAL, .procname = "gc_interval", .maxlen = 
sizeof(int), @@ -2592,22 +2604,7 @@ static struct neigh_sysctl_table { .mode = 0644, .proc_handler = _dointvec, }, - { - .ctl_name = NET_NEIGH_RETRANS_TIME_MS, - .procname = "retrans_time_ms", - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = _dointvec_ms_jiffies, - .strategy = _ms_jiffies, - }, - { - .ctl_name = NET_NEIGH_REACHABLE_TIME_MS, - .procname = "base_reachable_time_ms", - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = _dointvec_ms_jiffies, - .strategy = _ms_jiffies, - }, + {} }, .neigh_dev = { { @@ -2660,42 +2657,48 @@ int neigh_sysctl_register(struct net_device *dev, struct neigh_parms *p, t->neigh_vars[9].data = >anycast_delay; t->neigh_vars[10].data = >proxy_delay; t->neigh_vars[11].data = >locktime; + t->neigh_vars[12].data = >retrans_time; + t->neigh_vars[13].data = >base_reachable_time; if (dev) { dev_name_source = dev->name; t->neigh_dev[0].ctl_name = dev->ifindex; - t->neigh_vars[12].procname = NULL; - t->neigh_vars[13].procname = NULL; -
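The commit message above relies on the table being walked until an empty entry is reached, so that trailing entries can be dropped by ending the table early. A minimal userspace sketch of that idea follows. The struct `var_entry` and the helpers `table_len` and `terminate_at` are hypothetical names for illustration, and the termination test (both `ctl_name` and `procname` unset) is an assumption about how the sysctl core walks a table, not code taken from this patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two struct ctl_table fields that matter here. */
struct var_entry {
    int ctl_name;          /* 0 means "no binary sysctl number" */
    const char *procname;  /* NULL means "no /proc name" */
};

/*
 * Count the entries before the first empty one.
 * Assumption: an entry with neither ctl_name nor procname ends the table.
 */
static size_t table_len(const struct var_entry *table)
{
    size_t n = 0;

    while (table[n].ctl_name || table[n].procname)
        n++;
    return n;
}

/* Hide every entry from 'index' onwards by turning that slot into a terminator. */
static void terminate_at(struct var_entry *table, size_t index)
{
    table[index].ctl_name = 0;
    table[index].procname = NULL;
}
```

With the optional entries grouped at the end, one call such as `terminate_at(vars, 2)` hides all of them, whereas entries scattered through the table would each need separate handling.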
[PATCH 03/10] sysctl: Remove binary sysctl support where it clearly doesn't work.
These functions all have wrapper functions for the proc interface that are needed for them to work correctly. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> --- kernel/sysctl.c |7 --- 1 files changed, 0 insertions(+), 7 deletions(-) diff --git a/kernel/sysctl.c b/kernel/sysctl.c index d6257ee..ccae8da 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -350,7 +350,6 @@ static struct ctl_table kern_table[] = { }, #ifdef CONFIG_PROC_SYSCTL { - .ctl_name = KERN_TAINTED, .procname = "tainted", .data = , .maxlen = sizeof(int), @@ -359,7 +358,6 @@ static struct ctl_table kern_table[] = { }, #endif { - .ctl_name = KERN_CAP_BSET, .procname = "cap-bound", .data = _bset, .maxlen = sizeof(kernel_cap_t), @@ -635,7 +633,6 @@ static struct ctl_table kern_table[] = { .proc_handler = _dointvec, }, { - .ctl_name = KERN_NMI_WATCHDOG, .procname = "nmi_watchdog", .data = _watchdog_enabled, .maxlen = sizeof (int), @@ -818,7 +815,6 @@ static struct ctl_table vm_table[] = { .extra2 = _hundred, }, { - .ctl_name = VM_DIRTY_WB_CS, .procname = "dirty_writeback_centisecs", .data = _writeback_interval, .maxlen = sizeof(dirty_writeback_interval), @@ -826,7 +822,6 @@ static struct ctl_table vm_table[] = { .proc_handler = _writeback_centisecs_handler, }, { - .ctl_name = VM_DIRTY_EXPIRE_CS, .procname = "dirty_expire_centisecs", .data = _expire_interval, .maxlen = sizeof(dirty_expire_interval), @@ -854,7 +849,6 @@ static struct ctl_table vm_table[] = { }, #ifdef CONFIG_HUGETLB_PAGE { - .ctl_name = VM_HUGETLB_PAGES, .procname = "nr_hugepages", .data = _huge_pages, .maxlen = sizeof(unsigned long), @@ -1079,7 +1073,6 @@ static struct ctl_table fs_table[] = { .proc_handler = _dointvec, }, { - .ctl_name = FS_NRFILE, .procname = "file-nr", .data = _stat, .maxlen = 3*sizeof(int), -- 1.5.1.1.181.g2de0
[PATCH 02/10] sysctl mqueue: Remove the binary sysctl numbers
Because of a conflict with FS_INODE_NR none of the binary sysctl numbers used by mqueue were available to user space. So just remove them. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> --- ipc/mqueue.c | 10 -- 1 files changed, 0 insertions(+), 10 deletions(-) diff --git a/ipc/mqueue.c b/ipc/mqueue.c index 145d5a0..13fdf67 100644 --- a/ipc/mqueue.c +++ b/ipc/mqueue.c @@ -44,12 +44,6 @@ #define STATE_PENDING 1 #define STATE_READY 2 -/* used by sysctl */ -#define FS_MQUEUE 1 -#define CTL_QUEUESMAX 2 -#define CTL_MSGMAX 3 -#define CTL_MSGSIZEMAX 4 - /* default values */ #define DFLT_QUEUESMAX 256 /* max number of message queues */ #define DFLT_MSGMAX 10 /* max number of messages in each queue */ @@ -1197,7 +1191,6 @@ static int msg_maxsize_limit_max = INT_MAX; static ctl_table mq_sysctls[] = { { - .ctl_name = CTL_QUEUESMAX, .procname = "queues_max", .data = _max, .maxlen = sizeof(int), @@ -1205,7 +1198,6 @@ static ctl_table mq_sysctls[] = { .proc_handler = _dointvec, }, { - .ctl_name = CTL_MSGMAX, .procname = "msg_max", .data = _max, .maxlen = sizeof(int), @@ -1215,7 +1207,6 @@ static ctl_table mq_sysctls[] = { .extra2 = _max_limit_max, }, { - .ctl_name = CTL_MSGSIZEMAX, .procname = "msgsize_max", .data = _max, .maxlen = sizeof(int), @@ -1229,7 +1220,6 @@ static ctl_table mq_sysctls[] = { static ctl_table mq_sysctl_dir[] = { { - .ctl_name = FS_MQUEUE, .procname = "mqueue", .mode = 0555, .child = mq_sysctls, -- 1.5.1.1.181.g2de0
[PATCH 01/10] sysctl: Update sysctl_check_table
Well it turns out after I dug into the problems a little more I was returning a few false positives so this patch updates my logic to remove them. - Don't complain about 0 ctl_names in sysctl_check_binary_path It is valid for someone to remove the sysctl binary interface and still keep the same sysctl proc interface. - Count ctl_names and procnames as matching if they both don't exist. - Only warn about missing min when the generic functions care. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> --- kernel/sysctl_check.c | 30 -- 1 files changed, 16 insertions(+), 14 deletions(-) diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c index 389c4ba..930a514 100644 --- a/kernel/sysctl_check.c +++ b/kernel/sysctl_check.c @@ -1420,12 +1420,14 @@ static int sysctl_check_dir(struct ctl_table *table) ref = sysctl_check_lookup(table); if (ref) { int match = 0; - if (table->procname && ref->procname && - (strcmp(table->procname, ref->procname) == 0)) + if ((!table->procname && !ref->procname) || + (table->procname && ref->procname && +(strcmp(table->procname, ref->procname) == 0))) match++; - if (table->ctl_name && ref->ctl_name && - (table->ctl_name == ref->ctl_name)) + if ((!table->ctl_name && !ref->ctl_name) || + (table->ctl_name && ref->ctl_name && +(table->ctl_name == ref->ctl_name))) match++; if (match != 2) { @@ -1462,8 +1464,8 @@ static void sysctl_check_bin_path(struct ctl_table *table, const char **fail) (strcmp(table->procname, ref->procname) != 0))) set_fail(fail, table, "procname does not match binary path procname"); - if (ref->ctl_name && - (!table->ctl_name || table->ctl_name != ref->ctl_name)) + if (ref->ctl_name && table->ctl_name && + (table->ctl_name != ref->ctl_name)) set_fail(fail, table, "ctl_name does not match binary path ctl_name"); } } @@ -1499,7 +1501,7 @@ int sysctl_check_table(struct ctl_table *table) if (table->extra2) set_fail(, table, "Directory with extra2"); if (sysctl_check_dir(table)) - set_fail(, table, "Inconsistent directory"); 
+ set_fail(, table, "Inconsistent directory names"); } else { if ((table->strategy == sysctl_data) || (table->strategy == sysctl_string) || @@ -1520,14 +1522,14 @@ int sysctl_check_table(struct ctl_table *table) if (!table->maxlen) set_fail(, table, "No maxlen"); } - if ((table->strategy == sysctl_intvec) || - (table->proc_handler == proc_dointvec_minmax) || - (table->proc_handler == proc_doulongvec_minmax) || + if ((table->proc_handler == proc_doulongvec_minmax) || (table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) { - if (!table->extra1) - set_fail(, table, "No min"); - if (!table->extra2) - set_fail(, table, "No max"); + if (table->maxlen > sizeof (unsigned long)) { + if (!table->extra1) + set_fail(, table, "No min"); + if (!table->extra2) + set_fail(, table, "No max"); + } } if (table->ctl_name && !table->strategy) set_fail(, table, "Missing strategy"); -- 1.5.1.1.181.g2de0 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
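The relaxed comparison in sysctl_check_dir from the patch above can be exercised outside the kernel. The sketch below copies the two conditions from the added lines of the diff. The struct `dir_entry` and the function name `names_match` are hypothetical stand-ins for illustration and are not part of the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two struct ctl_table fields compared here. */
struct dir_entry {
    int ctl_name;          /* 0 means "no binary sysctl number" */
    const char *procname;  /* NULL means "no /proc name" */
};

/*
 * Returns 1 when both the procname and the ctl_name agree.
 * A field counts as matching when it is absent on both sides,
 * or present on both sides with equal values.
 */
static int names_match(const struct dir_entry *table, const struct dir_entry *ref)
{
    int match = 0;

    if ((!table->procname && !ref->procname) ||
        (table->procname && ref->procname &&
         strcmp(table->procname, ref->procname) == 0))
        match++;

    if ((!table->ctl_name && !ref->ctl_name) ||
        (table->ctl_name && ref->ctl_name &&
         table->ctl_name == ref->ctl_name))
        match++;

    return match == 2;
}
```

Under the old conditions, two directories that both had ctl_name 0 never scored the second match and were reported as inconsistent. With the added "both absent" clauses they compare equal, which is what lets the later patches in this series drop ctl_name values without tripping the checker.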
Re: [-mm patch] DMA engine kconfig improvements
On Fri, Aug 03, 2007 at 07:15:31PM -0700, Dan Williams wrote: > On 7/25/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: > > On Wed, Jul 25, 2007 at 04:03:04AM -0700, Andrew Morton wrote: > > >... > > > Changes since 2.6.22-rc6-mm1: > > >... > > > +dma-arch-fix.patch > > > > > > Fix git-dma.patch > > >... > > > > This results in an ARM-only driver in an X86-only menu... > > > > What about the patch below instead that also improves a few other things? > I like it, just a few nits: > > > -menu "DMA Engine support" > > - depends on HAS_DMA > > +menuconfig DMADEVICES > > + bool "DMA Engine support" > > + depends on (PCI && X86) || ARCH_IOP32X || ARCH_IOP33X || > > ARCH_IOP13XX > > + help > > + Intel(R) DMA engines > > + > Perhaps we should go ahead and define ARCH_HAS_DMA_OFFLOAD and have > DMADEVICES depend on that option. A ppc32 driver is in the works: > http://marc.info/?l=linux-raid=117400143317440=2 >... That would be overkill - what my patch does here is just a minor cosmetic thing that could be dropped if it became a problem. > Regards, > Dan (for Shannon while he is on vacation) cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed
Re: SLUB doesn't work with kdump kernel on Cell
On 8/9/07, Lucio Correia <[EMAIL PROTECTED]> wrote: > On Wed, 2007-08-08 at 23:10 +0200, Arnd Bergmann wrote: > > On Wednesday 08 August 2007, Lucio Correia wrote: > > > DMA 0 ->12288 > > > Normal 12288 ->12288 > > > early_node_map[2] active PFN ranges > > > 0:0 -> 2560 > > > 1:12287 ->12288 > > > > As Christoph found, this memory map is really strange. Other machines > > have something like > > > > Zone PFN ranges: > > DMA 0 ->16384 > > Normal 16384 ->16384 > > early_node_map[2] active PFN ranges > > 0:0 -> 8192 > > 1: 8192 ->16384 > > > > Lucio, > > What code builds the memory map that gets passed to the kdump kernel? It comes out of the device tree, just like a regular kernel. The device tree for the kdump kernel is built by kexec-tools, it parses /proc/device-tree and does a bunch of logic to avoid various reserved regions: the kernel, TCE tables, RTAS etc. > I also tried to pass maxcpus=1 for the command line of second kernel, > and it didn't work. How can I alternatively disable the node? maxcpus is poorly tested and is known to be broken on Cell, please don't use it, or fix it first :) cheers - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!
On Thu, Aug 09, 2007 at 04:37:35PM +0100, Hugh Dickins wrote: > On Thu, 9 Aug 2007, Mariusz Kozlowski wrote: > > Hello, > > > > Nothing unusual happening, allmodconfig compiling etc. > > Not sure why it says kernel was tainted though ... hmmm. > > > > [ cut here ] > > kernel BUG at mm/swap_state.c:78! > > invalid opcode: [#1] > > PREEMPT > > Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia > > 8250_pci 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too > > CPU:0 > > EIP:0060:[]Tainted: PVLI > > EFLAGS: 00010246 (2.6.23-rc2-mm1 #1) > > EIP is at __add_to_swap_cache+0xc6/0xd7 > > eax: 4000 ebx: c11285c0 ecx: 00d0 edx: 0283 > > esi: c11285c0 edi: 0283 ebp: c1858f90 esp: c1858f84 > > ds: 007b es: 007b fs: gs: ss: 0068 > > Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000) > > Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0 > > c1858fcc > >c015307c 0001 0007 0002 0002 0283 > > fffc > > c0152d5c c1858fe0 c0127f2e c0127ef8 > > > > Call Trace: > > [] show_trace_log_lvl+0x1a/0x30 > > [] show_stack_log_lvl+0xa9/0xd5 > > [] show_registers+0x219/0x38d > > [] die+0x104/0x23e > > [] do_trap+0x83/0xad > > [] do_invalid_op+0x88/0x92 > > [] error_code+0x6a/0x70 > > [] add_to_swap_cache+0x22/0x58 > > [] kprefetchd+0x320/0x364 > > [] kthread+0x36/0x58 > > [] kernel_thread_helper+0x7/0x14 > > === > > INFO: lockdep is turned off. > > Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 > > 03 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe > > 0f 0b eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 > > EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84 > > Don't worry about reproducing untainted, I got the same earlier > and was just preparing and testing the hotfix: here it is... > > > Nick's mm-clarify-__add_to_swap_cache-locking.patch is fine for mainline, > but soon generates a "kernel BUG at mm/swap_state.c:78!" 
when it meets > mm-implement-swap-prefetching.patch in 2.6.23-rc2-mm1. We could add a > fix to the latter, but I think it's better to adjust Nick's, so that > it's right for whichever tree it's in: move the responsibility to > SetPageLocked from read_swap_cache_async to add_to_swap_cache. Hmm, yeah I like this better, it is more like add_to_page_cache now. Thanks. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SLUB doesn't work with kdump kernel on Cell
On 8/9/07, Arnd Bergmann <[EMAIL PROTECTED]> wrote: > On Wednesday 08 August 2007, Lucio Correia wrote: > > DMA 0 -> 12288 > > Normal 12288 -> 12288 > > early_node_map[2] active PFN ranges > > 0: 0 -> 2560 > > 1: 12287 -> 12288 > > As Christoph found, this memory map is really strange. Other machines > have something like > > Zone PFN ranges: > DMA 0 ->16384 > Normal 16384 ->16384 > early_node_map[2] active PFN ranges > 0:0 -> 8192 > 1: 8192 ->16384 > > Lucio, > What code builds the memory map that gets passed to the kdump kernel? > Does the original kernel see the same map on your machine? I'd have to check, but I'd guess it's the reserved region for RTAS. You can confirm by checking the boot log to see where prom_init instantiated RTAS. cheers
Re: [PATCH 00/23] per device dirty throttling -v8
Andrew Morton wrote: On Wed, 08 Aug 2007 14:10:15 -0700 "Martin J. Bligh" <[EMAIL PROTECTED]> wrote: Why isn't this easily fixable by just adding an additional dirty flag that says atime has changed? Then we only cause a write when we remove the inode from the inode cache, if only atime is updated. I think that could be made to work, and it would fix the performance issue. It is a behaviour change. At present ext3 (for example) commits everything every five seconds. After a change like this, a crash+recovery could cause a file's atime to go backwards by an arbitrarily large time interval - it could easily be months. I would think that (really) updating atime on open would be enough, hopefully without being too much. The "lazyatime" thing I was playing with only updated on open, final close, write, and fork. I like the idea of updating once in a while, but one of the benefits of noatime is allowing drives to spin down via inactivity. If something does get done in the area of less but non-zero atime tracking, perhaps that could be taken into account. I have to check what "laptop_mode" actually does, since my laptops are old installs. -- Bill Davidsen <[EMAIL PROTECTED]> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot
Re: Documentation files in html format?
On 08/10/2007 12:27 AM, Francois Romieu wrote: Andi Kleen <[EMAIL PROTECTED]> : [...] I don't think that is used by Linuxdoc. Try a make pdfdocs and see for yourself. It reminds me of an old PII but it does not really make clear how html to pdf conversion would improve the situation. With HTML the source format is itself the preferred object format for many purposes (something which I assume you wouldn't want to claim of DocBook source) meaning that for those uses there is no conversion. Which given the number of times "make *docs" has bombed out on me through the years I find a definite improvement. Add in that it's much easier to produce HTML, that it covers most all formatting needs something like the kernel documentation directory needs, integrates unchanged, directly and nicely into the effort Rob Landley is doing with collecting documentation online and is a format you can read with a program most users have open and available 100% of the time rather than requiring a complete stack of semi-obscure external software -- and I just don't see why anyone would want to argue that DocBook and its associated crapola should _not_ be buried in that same dark, desolate place where other abortive attempts at improvement such as GNU info already reside. Rene. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK
On 8/9/07, Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Thu, 9 Aug 2007, Daniel Phillips wrote: > > On 8/8/07, Christoph Lameter <[EMAIL PROTECTED]> wrote: > > > On Wed, 8 Aug 2007, Daniel Phillips wrote: > > > Maybe we need to kill PF_MEMALLOC > > Shrink_caches needs to be able to recurse into filesystems at least, > > and for the duration of the recursion the filesystem must have > > privileged access to reserves. Consider the difficulty of handling > > that with anything other than a process flag. > > Shrink_caches needs to allocate memory? Hmmm... Maybe we can only limit > the PF_MEMALLOC use. PF_MEMALLOC is not such a bad thing. It will usually be less code than mempool for the same use case, besides being able to handle a wider range of problems. We introduce __GPF_MEMALLOC for situations where the need for reserve memory is locally known, as in the network stack, which is similar or identical to the use case for mempool. One could reasonably ask why we need mempool with a lighter alternative available. But this is a case of to each their own I think. Either technique will work for reserve management. > > In theory, we could reduce the size of the global memalloc pool by > > including "easily freeable" memory in it. This is just an > > optimization and does not belong in this patch set, which fixes a > > system integrity issue. > > I think the main thing would be to fix reclaim to not do stupid things > like triggering writeout early in the reclaim pass and to allow reentry > into reclaim. The idea of memory pools always sounded strange to me given > that you have a lot of memory in a zone that is reclaimable as needed. You can fix reclaim as much as you want and the basic deadlock will still not go away. When you finally do get to writing something out, memory consumers in the writeout path are going to cause problems, which this patch set fixes. 
Agreed that the idea of mempool always sounded strange, and we show how to get rid of them, but that is not the immediate purpose of this patch set. Regards, Daniel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
On Fri, 10 Aug 2007 01:23:07 +0200 Mariusz Kozlowski <[EMAIL PROTECTED]> wrote: > Hello, > > This probably doesn't have great impact ;) but ... > To reproduce: run torture tests for RCU and then sysrq+q. > > SysRq : Show Pending Timers > Timer List Version: v0.3 > HRTIMER_MAX_CLOCK_BASES: 2 > now at 1764338760370 nsecs > > cpu: 0 > clock 0: > .index: 0 > .resolution: 1 nsecs > .get_time: ktime_get_real > .offset: 1186699025823815427 nsecs > active timers: > clock 1: > .index: 1 > .resolution: 1 nsecs > .get_time: ktime_get > .offset: 0 nsecs > active timers: > #0: <3>BUG: sleeping function called from invalid context at > kernel/mutex.c:86 > in_atomic():1, irqs_disabled():1 > INFO: lockdep is turned off. > irq event stamp: 0 > hardirqs last enabled at (0): [<>] 0x0 > hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c > softirqs last enabled at (0): [] copy_process+0x4c6/0x144c > softirqs last disabled at (0): [<>] 0x0 > [] show_trace_log_lvl+0x1a/0x30 > [] show_trace+0x12/0x14 > [] dump_stack+0x15/0x17 > [] __might_sleep+0xb7/0xc9 > [] mutex_lock+0x15/0x1f > [] lookup_module_symbol_name+0x17/0xc0 > [] lookup_symbol_name+0x3f/0x43 > [] print_name_offset+0x1f/0x96 > [] timer_list_show+0x802/0xcbd > [] sysrq_timer_list_show+0xc/0xe > [] sysrq_handle_show_timers+0x8/0xa > [] __handle_sysrq+0x7b/0x115 > [] handle_sysrq+0x20/0x24 > [] kbd_event+0x3a8/0x5c7 > [] input_pass_event+0x8f/0x91 > [] input_handle_event+0x98/0x38d > [] input_event+0x54/0x67 > [] atkbd_interrupt+0x200/0x59e > [] serio_interrupt+0x7c/0x80 > [] i8042_interrupt+0x17a/0x289 > [] handle_IRQ_event+0x28/0x59 > [] handle_level_irq+0xad/0x10b > [] do_IRQ+0x93/0xd0 > [] common_interrupt+0x2e/0x34 > [] rcu_read_delay+0x8/0x36 [rcutorture] > [] rcu_torture_reader+0x6e/0x169 [rcutorture] > [] kthread+0x36/0x58 > [] kernel_thread_helper+0x7/0x1c > === We seem to have made a mess in there. timer_list_show() ends up calling lookup_module_symbol_name(), which takes a mutex. 
However print_symbol() (which is called at oops time, interrupt time, etc) calls module_address_lookup(), which is basically the same, only it doesn't take the mutex. I guess a quick fix would be to switch kernel/time/timer_list.c:print_name_offset() from lookup_module_symbol_name() to module_address_lookup(). But we'd still have a mess in there. (adds ccs, runs away)
Re: Unable to handle kernel paging request at virtual address
On Fri, Aug 10, 2007 at 01:44:24AM +0200, shacky wrote: > > You snipped the most important part. Even a digital photo of the > > crash would be more useful than what we have above. > > So far, there's not really much to go on. > > Could you tell me what is the most important part, so I try to rewrite > it by hand? It's hard to blindly guess because there's so little to go on. At the least, a complete list of the modules loaded, the EIP/RIP and the call trace. (This makes up 90% of the output, hence the suggestion to take a photograph). > I don't think a digital photo will be much useful because the whole > error is writed on a simple black screen with white caracters. That's expected. We usually cope with that quite well. Dave -- http://www.codemonkey.org.uk - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: problems while mounting /boot partition
On Aug 8 2007 18:28, Michal Piotrowski wrote: > >Hi Brian, > >Brian J. Murrell wrote: >> I am using Ubuntu Gutsy, which is the in-development branch heading for >> their next stable release. > >You forgot about message subject, so no one has read this report. Actually, given the volume on LKML, a message without a subject gets the most attention, since all the others do have one. :) Jan
Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state
Hello, = [ INFO: inconsistent lock state ] 2.6.23-rc2-mm1 #7 - inconsistent {in-hardirq-W} -> {hardirq-on-W} usage. ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes: (>lock){+...}, at: [] rtl8139_interrupt+0x27/0x46b [8139too] {in-hardirq-W} state was registered at: [] __lock_acquire+0x949/0x11ac [] lock_acquire+0x99/0xb2 [] _spin_lock+0x35/0x42 [] rtl8139_interrupt+0x27/0x46b [8139too] [] handle_IRQ_event+0x28/0x59 [] handle_level_irq+0xad/0x10b [] do_IRQ+0x93/0xd0 [] common_interrupt+0x2e/0x34 [] cpuidle_idle_call+0x74/0x99 [] cpu_idle+0x87/0x89 [] rest_init+0x60/0x62 [] start_kernel+0x23a/0x2c5 [<>] 0x0 [] 0x irq event stamp: 1777 hardirqs last enabled at (1777): [] kfree+0xee/0x105 hardirqs last disabled at (1776): [] kfree+0x87/0x105 softirqs last enabled at (1756): [] dev_deactivate+0x86/0xa5 softirqs last disabled at (1754): [] _spin_lock_bh+0xe/0x47 other info that might help us debug this: 1 lock held by ifconfig/5492: #0: (rtnl_mutex){--..}, at: [] mutex_lock+0x1c/0x1f stack backtrace: [] show_trace_log_lvl+0x1a/0x30 [] show_trace+0x12/0x14 [] dump_stack+0x15/0x17 [] print_usage_bug+0x145/0x14f [] mark_lock+0x61f/0x70c [] __lock_acquire+0x73e/0x11ac [] lock_acquire+0x99/0xb2 [] _spin_lock+0x35/0x42 [] rtl8139_interrupt+0x27/0x46b [8139too] [] free_irq+0x11b/0x146 [] rtl8139_close+0x8a/0x14a [8139too] [] dev_close+0x57/0x74 [] dev_change_flags+0x8e/0x190 [] devinet_ioctl+0x4af/0x652 [] inet_ioctl+0x56/0x71 [] sock_ioctl+0xa5/0x1d4 [] do_ioctl+0x22/0x71 [] vfs_ioctl+0x55/0x29e [] sys_ioctl+0x33/0x69 [] sysenter_past_esp+0x5f/0x99 === Regards, Mariusz - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] pata_artop: fix UDMA5 for AEC6280[R] and UDMA6 for AEC6880[R]
On Friday 10 August 2007, Alan Cox wrote: > On Thu, 9 Aug 2007 23:19:34 +0200 > Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> wrote: > > > > > Maximum supported UDMA mode for AEC6280[R] is UDMA5 (not UDMA4) > > and for AEC6880[R] it is UDMA6 (not UDMA5): > > > > * Fix the problem by adding missing struct ata_port_info to > > artop_init_one(). > > > > * Use the right naming (s/626/628/). > > > > * Bump driver version. > > > > Fixes IDE->libata regression, problem was never present in IDE aec62xx > > driver. > > Have you tested this ?? -ENODEV so no and testing is welcomed. However I went over both drivers to make sure that this change is safe and correct. BTW presence of the above bugs would strongly indicate that pata_artop has never been tested (properly) with AEC6x80[R], otherwise these bugs should have been noticed and fixed much earlier. Thanks, Bart - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Unable to handle kernel paging request at virtual address
> You snipped the most important part. Even a digital photo of the > crash would be more useful than what we have above. > So far, there's not really much to go on. Could you tell me what the most important part is, so I can try to rewrite it by hand? I don't think a digital photo will be very useful because the whole error is written on a simple black screen with white characters.
Re: 2.6.22 x86_64 : kernel initial decompression hangs on vmware
Avi Kivity wrote: We haven't seen any issue with the 2.6.22 boot decompressor. Which of the four (fs, gs, ldt, or tr) were proving problematic and why? It was tr that was affecting Workstation, since we boot through normal BIOS path, and only a 16-bit task was loaded at this point. Just to make the state comprehensive, I opted to reload everything. Zach - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Unable to handle kernel paging request at virtual address
On Fri, Aug 10, 2007 at 01:17:14AM +0200, shacky wrote: > [87.935473] BUG: unable to handle kernel paging request at virtual > address 6d207972 > [...] printing eip: > [...] 6d207972 > [...] *pde = > [...] Oops: 000 [#2] > [...] SMP > [...] Modules linked in: bluetooth capability lirc_dev > speedstep_lib cpufreq_powersave cpufreq_stats cpufreq_userspace > cpufreq_ondemand cpufreq_conservative freq_table video container sbs > button dock ac battery ipv6 sbp2 lp fuse snd_emu10k1_synth > snd_emux_synth snd_seq_virmidi snd_seq_midi_emul [] etc. > > I'm using the kernel 2.6.22. > > I'm omitting the rest of the error because it is very very long and I > have to rewrite it because I can't copy it. If you need some > other information please ask. :-) You snipped the most important part. Even a digital photo of the crash would be more useful than what we have above. So far, there's not really much to go on. Dave -- http://www.codemonkey.org.uk - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6 patch] cpqphp_ctrl.c: remove dead code
On Thu, Aug 09, 2007 at 04:20:01PM -0700, Kristen Carlson Accardi wrote: > On Fri, 10 Aug 2007 01:04:36 +0200 > Adrian Bunk <[EMAIL PROTECTED]> wrote: > > > On Thu, Aug 09, 2007 at 03:47:02PM -0700, Kristen Carlson Accardi wrote: > > > > > > fine by me - let's NAK this patch (and all future ones for this driver) > > > until > > > someone with hardware steps up to maintain this driver. Eventually it > > > will just die I guess. > > > > We have tons of unmaintained drivers and none of them has such a silly > > auto-NAK policy. > > > > cu > > Adrian > > OK - "all future ones" was too extreme. I'll take trivial patches (of > which this one is not). As I wrote in the patch description, all it does is remove an if() check that could never be false (which is easily verifiable if you look at the source code). I've also verified that my patch does not change a single bit in the object file (after compilation with gcc 4.2.1). What's your definition of a trivial patch? cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed
Re: i386 doublefault handler is broken with CONFIG_DEBUG_SPINLOCK
On 08/09/2007 07:16 PM, Andi Kleen wrote: > > I tested it. Even on a box without spin lock debugging I get a hard > hang after > > double fault, gdt at c1404000 [255 bytes] > > even though it should have printed the registers. > So it looks like there is more broken in the DF handler than just > this. Looks like it just fails the ptr_ok() test: #define ptr_ok(x) ((x) > PAGE_OFFSET && (x) < PAGE_OFFSET + 0x1000000) PAGE_OFFSET c0000000 + 1000000 < c1404000 What should that be changed to, or is there some easier way to test that?
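For reference, the range check discussed in the message above can be reproduced in user space. This is only a sketch under stated assumptions: PAGE_OFFSET is taken to be 0xC0000000 (the usual i386 3G/1G split) and the accepted window to be 0x1000000 bytes. The helper names are made up for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: conventional i386 PAGE_OFFSET and a 16 MB window. */
#define PAGE_OFFSET_ASSUMED 0xC0000000u
#define WINDOW_ASSUMED      0x01000000u

/* Same shape as the quoted ptr_ok() test, with the window size as a parameter. */
static bool ptr_in_window(uint32_t x, uint32_t window)
{
    return x > PAGE_OFFSET_ASSUMED && x < PAGE_OFFSET_ASSUMED + window;
}

/* Repeats the comparison from the report for the GDT address c1404000. */
static void ptr_ok_demo(void)
{
    const uint32_t gdt = 0xC1404000u;

    /* c0000000 + 1000000 = c1000000, which is below c1404000: rejected. */
    assert(!ptr_in_window(gdt, WINDOW_ASSUMED));

    /* A purely hypothetical 32 MB window would accept the same address. */
    assert(ptr_in_window(gdt, 0x02000000u));
}
```

The second assertion only shows how the outcome depends on the window size; it is not a proposed fix, and what the limit should actually be is the open question in the message.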
Re: 2.6.23-rc2-mm1
Alan Cox wrote: >>> [ 28.828484] 0000:00:1f.1: cannot adjust BAR0 (not I/O) >>> [ 28.828487] 0000:00:1f.1: cannot adjust BAR1 (not I/O) >>> [ 28.828489] 0000:00:1f.1: cannot adjust BAR2 (not I/O) >>> [ 28.828491] 0000:00:1f.1: cannot adjust BAR3 (not I/O) > > This means it didn't do anything. (wrongly because its checking I/O bits > on a BAR which are ignored according to the spec) > >>> Region 0: [virtual] Memory at 000001f0 (32-bit, non-prefetchable) >>> [disabled] [size=8] >>> Region 1: [virtual] Memory at 000003f0 (type 3, non-prefetchable) >>> [disabled] [size=1] >>> Region 2: [virtual] Memory at 00000170 (32-bit, non-prefetchable) >>> [disabled] [size=8] >>> Region 3: [virtual] Memory at 00000370 (type 3, non-prefetchable) >>> [disabled] [size=1] > > The controller is disabled and when disabled it seems to think its > memory. Valid but interesting. > > The box is a Dell Precision WorkStation 530 MT. Actually I have an ATA-7 disc on the primary EIDE connector ( one port free ) and an oldish CDROM on the secondary EIDE connector ( one port free ). http://194.231.229.228/lara/lara.dmesg ( from 2.6.23-rc2-mm1 with the 2 patches reverted ) http://194.231.229.228/lara/lara.lspci ( lspci - -nn ) http://194.231.229.228/lara/lara.html ( lshw html output ) If you want me to do/try something let me know. Gabriel
Re: [PATCH 3/4] Embed zone_id information within the zonelist->zones pointer
On (09/08/07 14:37), Christoph Lameter didst pronounce: > On Thu, 9 Aug 2007, Mel Gorman wrote: > > > } > > > > +#if defined(CONFIG_SMP) && INTERNODE_CACHE_SHIFT > ZONES_SHIFT > > Is this necessary? ZONES_SHIFT is always <= 2 so it will work with > any pointer. Why disable this for UP? > Caution in case the number of zones increases. There was no guarantee of zone alignment. It's the same reason I have a BUG_ON in the encode function so that if we don't catch problems at compile-time, it'll go BANG in a nice predictable fashion. > > --- linux-2.6.23-rc1-mm2-010_use_zonelist/mm/vmstat.c 2007-08-07 > > 14:45:11.000000000 +0100 > > +++ linux-2.6.23-rc1-mm2-015_zoneid_zonelist/mm/vmstat.c 2007-08-09 > > 15:52:12.000000000 +0100 > > @@ -365,11 +365,11 @@ void refresh_cpu_vm_stats(int cpu) > > */ > > void zone_statistics(struct zonelist *zonelist, struct zone *z) > > { > > - if (z->zone_pgdat == zonelist->zones[0]->zone_pgdat) { > > + if (z->zone_pgdat == zonelist_zone(zonelist->_zones[0])->zone_pgdat) { > > __inc_zone_state(z, NUMA_HIT); > > } else { > > __inc_zone_state(z, NUMA_MISS); > > - __inc_zone_state(zonelist->zones[0], NUMA_FOREIGN); > > + __inc_zone_state(zonelist_zone(zonelist->_zones[0]), > > NUMA_FOREIGN); > > } > > if (z->node == numa_node_id()) > > __inc_zone_state(z, NUMA_LOCAL); > > H. I hope the compiler does subexpression optimization on > > zonelist_zone(zonelist->_zones[0]) > I'll check > Acked-by: Christoph Lameter <[EMAIL PROTECTED]> > -- -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab
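To make the alignment concern in the message above concrete, here is a minimal user-space sketch of storing a small index in the low bits of an aligned pointer. It is not the patch's code: the struct, the function names, and the 2-bit shift are assumptions chosen for illustration (the thread only says ZONES_SHIFT is <= 2), and the asserts stand in for the BUG_ON mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed width of the embedded index; the thread says ZONES_SHIFT <= 2. */
#define ZONES_SHIFT_ASSUMED 2
#define ZONE_IDX_MASK ((uintptr_t)((1u << ZONES_SHIFT_ASSUMED) - 1))

/* Hypothetical stand-in for struct zone, aligned like a cache line. */
struct zone_sketch {
    _Alignas(64) int pad;
};

/* Pack idx into the low bits; fails loudly if the pointer is not aligned. */
static uintptr_t encode_zone_ptr(struct zone_sketch *z, unsigned idx)
{
    uintptr_t p = (uintptr_t)z;

    assert((p & ZONE_IDX_MASK) == 0);   /* low bits must be free */
    assert(idx <= ZONE_IDX_MASK);       /* index must fit in those bits */
    return p | idx;
}

/* Recover the original pointer by masking the index bits off. */
static struct zone_sketch *decode_zone(uintptr_t encoded)
{
    return (struct zone_sketch *)(encoded & ~ZONE_IDX_MASK);
}

/* Recover the index without touching the zone itself. */
static unsigned decode_zone_idx(uintptr_t encoded)
{
    return (unsigned)(encoded & ZONE_IDX_MASK);
}
```

The point of the scheme is that the index can be read without dereferencing the pointer; the cost is that every use of the pointer needs the mask, which is why the repeated zonelist_zone() call draws the comment about subexpression optimization.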
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
Hello, This probably doesn't have great impact ;) but ... To reproduce: run torture tests for RCU and then sysrq+q. SysRq : Show Pending Timers Timer List Version: v0.3 HRTIMER_MAX_CLOCK_BASES: 2 now at 1764338760370 nsecs cpu: 0 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1186699025823815427 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <3>BUG: sleeping function called from invalid context at kernel/mutex.c:86 in_atomic():1, irqs_disabled():1 INFO: lockdep is turned off. irq event stamp: 0 hardirqs last enabled at (0): [<>] 0x0 hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c softirqs last enabled at (0): [] copy_process+0x4c6/0x144c softirqs last disabled at (0): [<>] 0x0 [] show_trace_log_lvl+0x1a/0x30 [] show_trace+0x12/0x14 [] dump_stack+0x15/0x17 [] __might_sleep+0xb7/0xc9 [] mutex_lock+0x15/0x1f [] lookup_module_symbol_name+0x17/0xc0 [] lookup_symbol_name+0x3f/0x43 [] print_name_offset+0x1f/0x96 [] timer_list_show+0x802/0xcbd [] sysrq_timer_list_show+0xc/0xe [] sysrq_handle_show_timers+0x8/0xa [] __handle_sysrq+0x7b/0x115 [] handle_sysrq+0x20/0x24 [] kbd_event+0x3a8/0x5c7 [] input_pass_event+0x8f/0x91 [] input_handle_event+0x98/0x38d [] input_event+0x54/0x67 [] atkbd_interrupt+0x200/0x59e [] serio_interrupt+0x7c/0x80 [] i8042_interrupt+0x17a/0x289 [] handle_IRQ_event+0x28/0x59 [] handle_level_irq+0xad/0x10b [] do_IRQ+0x93/0xd0 [] common_interrupt+0x2e/0x34 [] rcu_read_delay+0x8/0x36 [rcutorture] [] rcu_torture_reader+0x6e/0x169 [rcutorture] [] kthread+0x36/0x58 [] kernel_thread_helper+0x7/0x1c === , tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0 # expires at 176433900 nsecs [in 239630 nsecs] #1: , it_real_fn, S:01, do_setitimer, artsd/7461 # expires at 1764742781512 nsecs [in 404021142 nsecs] #2: , hrtimer_wakeup, S:01, do_nanosleep, kwrapper/7452 # expires at 1764922105491 nsecs [in 583345121 nsecs] #3: , it_real_fn, S:01, 
do_setitimer, syslogd/6719 # expires at 1790027922194 nsecs [in 25689161824 nsecs] .expires_next : 176433900 nsecs .hres_active: 1 .nr_events : 1422687 .nohz_mode : 2 .idle_tick : 46585900 nsecs .tick_stopped : 0 .idle_jiffies : 165857 .idle_calls : 1812679 .idle_sleeps: 1761361 .idle_entrytime : 466865075138 nsecs .idle_sleeptime : 357976883572 nsecs .last_jiffies : 166865 .next_jiffies : 166866 .idle_expires : 46595100 nsecs jiffies: 1464338 Tick Device: mode: 1 Clock Event Device: pit max_delta_ns: 27461866 min_delta_ns: 12571 mult: 5124677 shift: 32 mode: 3 next_event: 176433900 nsecs set_next_event: pit_next_event set_mode: init_pit_timer event_handler: hrtimer_interrupt Regards, Mariusz # # Automatically generated make config: don't edit # Linux kernel version: 2.6.23-rc2-mm1 # Fri Aug 10 00:12:50 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_NONIRQ_WAKEUP=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_ON_SMP=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SWAP_PREFETCH=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y # CONFIG_BSD_PROCESS_ACCT_V3 is not set # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=17 # CONFIG_CONTAINERS is not set CONFIG_SYSFS_DEPRECATED=y # CONFIG_RELAY is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y # 
CONFIG_EMBEDDED is not set CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLUB_DEBUG=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_PROC_PAGE_MONITOR=y CONFIG_PROC_KPAGEMAP=y CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y # CONFIG_MODULE_FORCE_UNLOAD is not set # CONFIG_MODVERSIONS is not set # CONFIG_MODULE_SRCVERSION_ALL is
Re: [2.6 patch] cpqphp_ctrl.c: remove dead code
> Silly is in the eye of the beholder. I don't want to take this patch > because it needs to be reviewed by someone who really knows the intent > of the driver. Seems silly to me to blindly take patches. For unmaintained code we usually work on wackipedia theory ("its probably right but if not we can revert it/update it cheaply") Alan
Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
How about we just remove the RDMA stack altogether? I am not at all kidding. If you guys can't stay in your sand box and need to cause problems for the normal network stack, it's unacceptable. We were told all along the if RDMA went into the tree none of this kind of stuff would be an issue. There are currently two RDMA solutions available. Each solution has different requirements and uses the normal network stack differently. Infiniband uses its own transport. iWarp runs over TCP. We have tried to leverage the existing infrastructure where it makes sense. After TCP port reservation, what's next? It seems an at least bi-monthly event that the RDMA folks need to put their fingers into something else in the normal networking stack. No more. Currently, the RDMA stack uses its own port space. This causes a problem for iWarp, and is what Steve is looking for a solution for. I'm not an iWarp guru, so I don't know what options exist. Can iWarp use its own address family? Identify specific IP addresses for iWarp use? Restrict iWarp to specific port numbers? Let the app control the correct operation? I don't know. Steve merely defined a problem and suggested a possible solution. He's looking for constructive help trying to solve the problem. - Sean
Re: [2.6 patch] cpqphp_ctrl.c: remove dead code
On Fri, 10 Aug 2007 01:04:36 +0200 Adrian Bunk <[EMAIL PROTECTED]> wrote: > On Thu, Aug 09, 2007 at 03:47:02PM -0700, Kristen Carlson Accardi wrote: > > > > fine by me - let's NAK this patch (and all future ones for this driver) > > until > > someone with hardware steps up to maintain this driver. Eventually it > > will just die I guess. > > We have tons of unmaintained drivers and none of them has such a silly > auto-NAK policy. > > cu > Adrian OK - "all future ones" was too extreme. I'll take trivial patches (of which this one is not).
Re: [2.6 patch] cpqphp_ctrl.c: remove dead code
On Fri, 10 Aug 2007 01:04:36 +0200 Adrian Bunk <[EMAIL PROTECTED]> wrote: > We have tons of unmaintained drivers and none of them has such a silly > auto-NAK policy. > > cu > Adrian Silly is in the eye of the beholder. I don't want to take this patch because it needs to be reviewed by someone who really knows the intent of the driver. Seems silly to me to blindly take patches.
Unable to handle kernel paging request at virtual address
Hi. I installed Ubuntu 7.04 and then upgraded to the upcoming 7.10 on my system, which is based on an Asus Pundit barebone with 512 MB of RAM and a 120 GB IDE hard disk. The system works without any problem, but when I try to shut down or restart it, after a while during the shutdown process the system hangs and I see this error: [87.935473] BUG: unable to handle kernel paging request at virtual address 6d207972 [...] printing eip: [...] 6d207972 [...] *pde = [...] Oops: 000 [#2] [...] SMP [...] Modules linked in: bluetooth capability lirc_dev speedstep_lib cpufreq_powersave cpufreq_stats cpufreq_userspace cpufreq_ondemand cpufreq_conservative freq_table video container sbs button dock ac battery ipv6 sbp2 lp fuse snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul [] etc. I'm using kernel 2.6.22. I'm omitting the rest of the error because it is very long and I would have to retype it by hand, since I can't copy it. If you need any other information, please ask. :-) Could you help me, please? What could be the problem? Thank you very much! Bye.
Re: i386 doublefault handler is broken with CONFIG_DEBUG_SPINLOCK
On Thu, Aug 09, 2007 at 02:40:27PM -0400, Chuck Ebbert wrote: > On 08/09/2007 01:49 PM, Andi Kleen wrote: > > Chuck Ebbert <[EMAIL PROTECTED]> writes: > >> Initializing FS in the doublefault_tss should fix it. > >> > >> Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]> > >> > >> --- > >> > >> NOTE: not even compile tested. > > > > Can you please test it? > > > > It compiles but I can't really test it further right now. I tested it. Even on a box without spin lock debugging I get a hard hang after double fault, gdt at c1404000 [255 bytes] even though it should have printed the registers. So it looks like there is more broken in the DF handler than just this. -Andi
Re: [2.6 patch] cpqphp_ctrl.c: remove dead code
> fine by me - let's NAK this patch (and all future ones for this driver) until > someone with hardware steps up to maintain this driver. Eventually it > will just die I guess. If you want to NAK it perhaps you should become maintainer ;) Alan
Re: CFS review
Hi, On Wed, 1 Aug 2007, Ingo Molnar wrote: > just to make sure, how does 'top' output of the l + "lt 3" testcase look > like now on your laptop? Yesterday it was this: > > 4544 roman 20 0 1796 520 432 S 32.1 0.4 0:21.08 lt > 4545 roman 20 0 1796 344 256 R 32.1 0.3 0:21.07 lt > 4546 roman 20 0 1796 344 256 R 31.7 0.3 0:21.07 lt > 4547 roman 20 0 1532 272 216 R 3.3 0.2 0:01.94 l > > and i'm still wondering how that output was possible. I disabled the jiffies logic and the result is still the same, so this problem isn't related to resolution at all. I traced it a little and what's happening is that the busy loop really only gets little time; it only runs in between the timer tasks. When the timer task is woken up, __enqueue_sleeper() updates sleeper_bonus, and a little later, when the busy loop is preempted, __update_curr() is called a last time and it's fully hit by the sleeper_bonus. So the timer tasks use less time than they actually get and thus produce overflows, while the busy loop OTOH is punished and underflows. So it seems my initial suspicion was right and this logic is dodgy. What is it actually supposed to do? Why is some random task accounted with the sleeper_bonus? bye, Roman PS: Can I still expect an answer about all the other stuff?
Re: [2.6 patch] cpqphp_ctrl.c: remove dead code
On Thu, Aug 09, 2007 at 03:47:02PM -0700, Kristen Carlson Accardi wrote: > > fine by me - let's NAK this patch (and all future ones for this driver) until > someone with hardware steps up to maintain this driver. Eventually it > will just die I guess. We have tons of unmaintained drivers and none of them has such a silly auto-NAK policy. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed
Re: [PATCH 1/24] make atomic_read() behave consistently on alpha
So, why not use the well-defined alternative? Because we don't need to, and it hurts performance. It hurts performance by implementing 32-bit atomic reads in assembler? No, I misunderstood the question. Implementing 32-bit atomic reads in assembler is redundant, because any sane compiler, *particularly* an optimizing compiler (and we're only in this mess because of optimizing compilers) Oh please, don't tell me you don't want an optimising compiler. And if you _do_ want one, well you're in this mess because you chose C as implementation language and C has some pretty strange rules. Trying to use not-all-that-well-defined-and-completely-misunderstood features of the language doesn't make things easier; trying to use something that isn't even part of the language and that your particular compiler originally supported by accident, and that isn't yet an officially supported feature, and that on top of it all has a track record of problems -- well it makes me wonder if you're in this game for fun or what. will give us that automatically without the assembler. No, it does *not* give it to you automatically; you have to do either the asm() thing, or the not-defined-at-all *(volatile *)& thing. Yes, it is legal for a compiler to violate this assumption. It is also legal for us to refuse to maintain compatibility with compilers that suck this badly. So that's rm include/linux/compiler-gcc*.h then. Good luck with the intel compiler, maybe it works more to your liking. Segher
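For readers who have not seen it, the "*(volatile *)& thing" referred to above looks roughly like the following. This is only an illustration of the idiom under discussion, with made-up type and function names; whether the language actually guarantees anything for it is exactly what the thread disputes, so it should not be read as the recommended form.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct {
    int counter;
} atomic_sketch_t;

/*
 * Read the counter through a volatile-qualified lvalue, so that a
 * compiler honouring the access emits one real load per call instead
 * of reusing a value it cached in a register earlier.
 */
static inline int atomic_read_volatile(const atomic_sketch_t *v)
{
    return *(const volatile int *)&v->counter;
}
```

The alternative mentioned in the message, an asm() statement, pushes the guarantee onto a compiler extension rather than onto the cast; this sketch does not attempt to show that variant.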
Re: [PATCH] at91 pm: Compilation fix for at91sam926x
> > +#if defined(CONFIG_ARCH_AT91RM9200) > > at91_sys_write(AT91_SDRAMC_SRR, 1); /* > > self-refresh mode */ > Why don't use: > if (cpu_is_at91rm9200()) > at91_sys_write(AT91_SDRAMC_SRR, 1); What is the benefit? Will the optimizer remove the code if the CPU is not the at91rm9200? Best Regards Ulf Samuelsson
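On the optimizer question above, a small user-space sketch may help. It assumes the cpu_is_...() test is a macro that expands to a constant 0 when support for that CPU is configured out; that is an assumption about the headers, not something stated in this thread. Every name below, including the register offset, is hypothetical.

```c
#include <assert.h>

/* Hypothetical build-time switch standing in for CONFIG_ARCH_AT91RM9200. */
#ifndef SKETCH_ARCH_AT91RM9200
#define SKETCH_ARCH_AT91RM9200 0
#endif

#define SKETCH_ID_RM9200 0x1u           /* made-up CPU id */
#define SKETCH_SDRAMC_SRR 0x0cu         /* made-up register offset */

static unsigned sketch_cpu_id = SKETCH_ID_RM9200;
static int sketch_writes;               /* counts simulated register writes */

#if SKETCH_ARCH_AT91RM9200
#define sketch_cpu_is_at91rm9200() (sketch_cpu_id == SKETCH_ID_RM9200)
#else
#define sketch_cpu_is_at91rm9200() (0)  /* constant: the branch below is dead */
#endif

/* Simulated register write; only records that it was called. */
static void sketch_sys_write(unsigned reg, unsigned val)
{
    (void)reg;
    (void)val;
    sketch_writes++;
}

static void sketch_enter_self_refresh(void)
{
    /* With the macro expanding to (0), a compiler can drop this call. */
    if (sketch_cpu_is_at91rm9200())
        sketch_sys_write(SKETCH_SDRAMC_SRR, 1);
}
```

One difference from the #if form is worth keeping in mind: dead-code elimination happens after the statement has been parsed and type-checked, so every symbol used inside the if() (here the register offset) must still be declared, whereas code inside a false #if is never seen by the compiler at all.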
Re: [2.6 patch] cpqphp_ctrl.c: remove dead code
On Thu, 9 Aug 2007 15:24:27 -0700 Greg KH <[EMAIL PROTECTED]> wrote: > On Thu, Aug 09, 2007 at 02:51:40PM -0700, Kristen Carlson Accardi wrote: > > On Mon, 23 Jul 2007 16:51:05 +0200 > > Adrian Bunk <[EMAIL PROTECTED]> wrote: > > > > > If !mem_node we did already return -ENOMEM above in the function. > > > > > > Spotted by the Coverity checker. > > > > > > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]> > > > > Greg - you are listed as the maintainer for this driver. > > Not anymore, look at 2.6.23-rc1 :) > > > Can you either > > point me to someone who can review this patch or review it yourself? > > Looking at the code, it looks like it's possible that the driver writer > > wanted this code patch to be able to be taken if it got IO resources > > and not MEM resources, and if they didn't there's other cleanups that > > should be done for the no iomem case. > > Hm, I agree that this looks like the way the code was intended to work, > but as this code has been working just fine so far the way it is, I'm > not inclined to change it much, if any. > > Especially as I no longer even have the hardware to test it on :( > > So, how about we just leave it alone? > > thanks, > > greg k-h > fine by me - let's NAK this patch (and all future ones for this driver) until someone with hardware steps up to maintain this driver. Eventually it will just die I guess.
Re: [PATCH] pata_artop: fix UDMA5 for AEC6280[R] and UDMA6 for AEC6880[R]
On Thu, 9 Aug 2007 23:19:34 +0200 Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> wrote: > > Maximum supported UDMA mode for AEC6280[R] is UDMA5 (not UDMA4) > and for AEC6880[R] it is UDMA6 (not UDMA5): > > * Fix the problem by adding missing struct ata_port_info to artop_init_one(). > > * Use the right naming (s/626/628/). > > * Bump driver version. > > Fixes IDE->libata regression, problem was never present in IDE aec62xx driver. Have you tested this ??