Re: CFS review

2007-08-09 Thread Ingo Molnar

* Roman Zippel <[EMAIL PROTECTED]> wrote:

> >  4544 roman 20   0  1796  520  432 S 32.1  0.4   0:21.08 lt
> >  4545 roman 20   0  1796  344  256 R 32.1  0.3   0:21.07 lt
> >  4546 roman 20   0  1796  344  256 R 31.7  0.3   0:21.07 lt
> >  4547 roman 20   0  1532  272  216 R  3.3  0.2   0:01.94 l
> > 
> > and i'm still wondering how that output was possible.
> 
> I disabled the jiffies logic and the result is still the same, so this 
> problem isn't related to resolution at all.

how did you disable the jiffies logic? Also, could you please send me 
the cfs-debug-info.sh:

  http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

captured _while_ the above workload is running. This is the third time 
i've asked for that :-)

to establish that the basic sched_clock() behavior is sound on that box, 
could you please also run this tool:

   http://people.redhat.com/mingo/cfs-scheduler/tools/tsc-dump.c

please run it both while the system is idle, and while there's a CPU hog 
running:

  while :; do :; done &

and send me that output too? (it's 2x 60 lines only) Thanks!

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


2.6.23-rc2-mm2

2007-08-09 Thread Andrew Morton

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm2/

- Various problems from 2.6.23-rc2-mm1 were fixed



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

  echo "subscribe mm-commits" | mail [EMAIL PROTECTED]

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.
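
The bisection search mentioned above can be sketched in a few lines of C. This is only an illustration, not the procedure from bisecting-mm-trees.txt: the callback `is_bad_fn`, the helper `bad_from` and the function name `bisect_first_bad` are all made up for the example. The sketch assumes the tree with no patches applied is good and the tree with all patches applied is bad. Each loop iteration stands for one rebuild and reboot, so a series of roughly a thousand patches needs about ten of them, which would match the "around ten rebuilds and reboots" estimate if the series is of that size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero if a kernel built with the first
 * `count` patches of the series applied shows the bug. */
typedef int (*is_bad_fn)(size_t count, void *ctx);

/*
 * Return the 1-based index of the first patch that introduces the bug.
 * Precondition: zero patches applied is good, all n patches applied is bad.
 * If steps is non-NULL it receives the number of build-and-boot tests used.
 */
size_t bisect_first_bad(size_t n, is_bad_fn is_bad, void *ctx, size_t *steps)
{
    size_t good = 0;    /* largest prefix known to be good */
    size_t bad = n;     /* smallest prefix known to be bad */
    size_t used = 0;

    assert(n > 0);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        used++;         /* one rebuild and reboot */
        if (is_bad(mid, ctx))
            bad = mid;
        else
            good = mid;
    }
    if (steps)
        *steps = used;
    return bad;
}

/* Demonstration predicate: the bug arrives with patch number *ctx. */
int bad_from(size_t count, void *ctx)
{
    return count >= *(size_t *)ctx;
}
```

Calling `bisect_first_bad(1000, bad_from, &culprit, &steps)` with `culprit` set to 700 returns 700 and uses no more than ten tests.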



Changes since 2.6.23-rc1-mm2:

 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-agpgart.patch
 git-audit-master.patch
 git-cifs.patch
 git-cpufreq.patch
 git-dma.patch
 git-drm.patch
 git-dvb.patch
 git-hwmon.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ieee1394.patch
 git-infiniband.patch
 git-input.patch
 git-jfs.patch
 git-jg-misc.patch
 git-kvm.patch
 git-libata-all.patch
 git-m32r.patch
 git-md-accel.patch
 git-mips.patch
 git-mmc.patch
 git-mtd.patch
 git-ubi.patch
 git-netdev-all.patch
 git-ixgbe.patch
 git-nfsd.patch
 git-ocfs2.patch
 git-r8169.patch
 git-s390.patch
 git-sh.patch
 git-sh64.patch
 git-scsi-misc.patch
 git-unionfs.patch
 git-v9fs.patch
 git-watchdog.patch
 git-wireless.patch
 git-ipwireless_cs.patch
 git-newsetup.patch
 git-xfs.patch
 git-cryptodev.patch
 git-kgdb.patch

 git trees

-genirq-temporary-fix-for-level-triggered-irq-resend-fix.patch
-allow-rcutorture-to-handle-synchronize_sched.patch
-alpha-o_cloexec-definition.patch
-sound-pci-ioremap-iounmap-balancing.patch
-fix-ide-ide-add-platform-ide-driver.patch
-mmc-make-it-build.patch
-mmc-fix-section-mismatch-warnings-for-drivers-mmc-host.patch
-git-nfsd-build-fixes.patch
-nfsd-warning-fix.patch
-dvb-remove-bogus-bug_on-in-videobuf_dvb_thread.patch

 Merged into mainline or a subsystem tree

+hex_dump-add-missing-const-qualifiers.patch
+rcu-remove-prototype-for-nonexistent-function.patch
+cris-drivers-cdrom-kconfig-no-longer.patch
+spidev-warning-fix.patch
+timerremove-clockevents_unregister_notifier.patch
+fix-compilation-with-gcc-42.patch
+fix-compilation-with-gcc-42-fix.patch
+lguest-files-should-explicitly-include-asm-paravirth.patch
+alpha-werror-fixes-for-sys_titanc.patch
+readahead-docbook-fix.patch

 More 2.6.23 queue

+pm-fix-dependencies-of-config_suspend-and-config_hibernation-updated-3x.patch

 Might be 2.6.23

+acpi-ec-remove-potential-deadlock-from-ec.patch

 ACPI fix

+sound-pci-ioremap-iounmap-balancing.patch

 ALSA fix

+make-power-supply-class-available-for-arm-architecture.patch

 ARM fix

+git-dma-up-fix.patch

 Fix git-dma

+jdelvare-i2c-i2c-mpc-dont-disable-i2c-module-on-stop-condition.patch
+jdelvare-i2c-i2c-core-make-some-code-static.patch

 I2C tree updates

-drivers-i2c-i2c-corec-make-code-static.patch

 Dropped

+alpm-increase-number-of-allowable-device-flags.patch

 ALPM patches were updated again

+st340823a-hpa-and-libata.patch
+pata_cmd64x-set-up-mwdma-modes-properly.patch
+ata_piix-disallow-udma-133-on-ich5-ich7.patch

 ata/pata things

+ide-hpt366-fix-pci-clock-detection-for-hpt374.patch
+ide-hpt366-ultradma-filtering-for-sata-cards.patch
+ide-atiixp-dma-setup-fixes.patch
+ide-it8213-piix-slc90e66-remove-dma-2-pio.patch
+ide-au1xxx-use-ide-tune-dma.patch
+ide-hpt34x-fix-config-hpt34x-autodma-n-handling.patch
+ide-ide-remove-drive-init-speed-zeroing.patch
+ide-ide-remove-ide-use-fast-pio.patch
+ide-cs5530-sc1200-add-pio-autotune-fallback-to-ide-dma-check.patch
+ide-sl82c105-add-pio-autotune-fallback-to-ide-dma-check.patch
+ide-ide-cris-add-pio-autotune-fallback-to-ide-dma-check.patch
+ide-ide-pmac-add-pio-autotune-fallback-to-ide-dma-check.patch
+ide-ide-remove-ide-dma-check.patch

 IDE tree updates

+mips-detect-bcm947xx-cpus.patch
+mips-bcm947xx-support.patch
+rfc-add-bcm947xx-to-kconfig.patch
+mips-add-bcm947xx-to-makefile.patch

 MIPS things

-8139too-force-media-setting-fix.patch

 Dropped - it broke


Re: [RFC PATCH 1/4] pass open file to ->setattr()

2007-08-09 Thread Miklos Szeredi
> >> > This is needed to be able to correctly implement open-unlink-fsetattr
> >> > semantics in some filesystem such as sshfs, without having to resort
> >> > to "silly-renaming".
> >> 
> >> How do you plan to do that?
> > 
> > Easy: the SFTP protocol has stateful opens and defines an FSTAT call.
> 
> Is it possible to reconnect without umounting?

Yes, but open files and in-progress requests are lost at reconnect.

> If yes, the unlinked files would be lost in spite of being opened,
> wouldn't they?

Sure.  Obviously one of the drawbacks of a stateful protocol is that
the server state can't survive a reconnect.

But that sort of reliability has never been the goal of sshfs.  And
even if that was needed, it could probably be much better handled in a
lower layer.

Miklos


Re: [PATCH][Take2] PCI legacy I/O port free driver - Making Intel e1000 driver legacy I/O port free

2007-08-09 Thread Tomohiro Kusumi
Dear Auke

Sorry, I sent the wrong patch.
I am resubmitting the corrected one here.

Tomohiro Kusumi

Signed-off-by: Tomohiro Kusumi <[EMAIL PROTECTED]>

---
diff -Nurp linux-2.6.22.org/drivers/net/e1000/e1000.h linux-2.6.22/drivers/net/e1000/e1000.h
--- linux-2.6.22.org/drivers/net/e1000/e1000.h  2007-07-09 08:32:17.0 +0900
+++ linux-2.6.22/drivers/net/e1000/e1000.h  2007-08-10 09:56:03.0 +0900
@@ -342,6 +342,9 @@ struct e1000_adapter {
boolean_t quad_port_a;
unsigned long flags;
uint32_t eeprom_wol;
+
+   int use_ioport;
+   int bars;
 };

 enum e1000_state_t {
diff -Nurp linux-2.6.22.org/drivers/net/e1000/e1000_main.c linux-2.6.22/drivers/net/e1000/e1000_main.c
--- linux-2.6.22.org/drivers/net/e1000/e1000_main.c 2007-07-09 08:32:17.0 +0900
+++ linux-2.6.22/drivers/net/e1000/e1000_main.c 2007-08-10 14:27:41.0 +0900
@@ -222,6 +222,11 @@ static pci_ers_result_t e1000_io_error_d
 static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev);
 static void e1000_io_resume(struct pci_dev *pdev);

+static unsigned int enable_legacy_ioport_free = 0;
+module_param(enable_legacy_ioport_free, uint, 0644);
+MODULE_PARM_DESC(enable_legacy_ioport_free, "Enable legacy I/O port free (default:0)");
+static int e1000_test_legacy_ioport(struct pci_dev *pdev);
+
 static struct pci_error_handlers e1000_err_handler = {
.error_detected = e1000_io_error_detected,
.slot_reset = e1000_io_slot_reset,
@@ -868,8 +873,25 @@ e1000_probe(struct pci_dev *pdev,
int i, err, pci_using_dac;
uint16_t eeprom_data = 0;
uint16_t eeprom_apme_mask = E1000_EEPROM_APME;
-   if ((err = pci_enable_device(pdev)))
+   int bars = 0;
+   int use_ioport = 0;
+
+   if (enable_legacy_ioport_free) {
+   if ((use_ioport = e1000_test_legacy_ioport(pdev)) < 0) {
+   E1000_ERR("e1000_test_legacy_ioport failed, aborting\n");
+   return -1;
+   }
+   if (use_ioport)
+   bars = pci_select_bars(pdev, IORESOURCE_MEM | IORESOURCE_IO);
+   else
+   bars = pci_select_bars(pdev, IORESOURCE_MEM);
+
+   if ((err = pci_enable_device_bars(pdev, bars)))
+   return err;
+   }
+   else if ((err = pci_enable_device(pdev))) {
return err;
+   }

if (!(err = pci_set_dma_mask(pdev, DMA_64BIT_MASK)) &&
!(err = pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK))) {
@@ -883,7 +905,8 @@ e1000_probe(struct pci_dev *pdev,
pci_using_dac = 0;
}

-   if ((err = pci_request_regions(pdev, e1000_driver_name)))
+   if ((enable_legacy_ioport_free && (err = pci_request_selected_regions(pdev, bars, e1000_driver_name))) ||
+   (err = pci_request_regions(pdev, e1000_driver_name)))
goto err_pci_reg;

pci_set_master(pdev);
@@ -902,6 +925,10 @@ e1000_probe(struct pci_dev *pdev,
adapter->pdev = pdev;
adapter->hw.back = adapter;
adapter->msg_enable = (1 << debug) - 1;
+   if (enable_legacy_ioport_free) {
+   adapter->use_ioport = use_ioport;
+   adapter->bars = bars;
+   }

mmio_start = pci_resource_start(pdev, BAR_0);
mmio_len = pci_resource_len(pdev, BAR_0);
@@ -911,12 +938,14 @@ e1000_probe(struct pci_dev *pdev,
if (!adapter->hw.hw_addr)
goto err_ioremap;

-   for (i = BAR_1; i <= BAR_5; i++) {
-   if (pci_resource_len(pdev, i) == 0)
-   continue;
-   if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
-   adapter->hw.io_base = pci_resource_start(pdev, i);
-   break;
+   if (!enable_legacy_ioport_free || adapter->use_ioport) {
+   for (i = BAR_1; i <= BAR_5; i++) {
+   if (pci_resource_len(pdev, i) == 0)
+   continue;
+   if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
+   adapter->hw.io_base = pci_resource_start(pdev, i);
+   break;
+   }
}
}

@@ -1182,7 +1211,10 @@ err_sw_init:
 err_ioremap:
free_netdev(netdev);
 err_alloc_etherdev:
-   pci_release_regions(pdev);
+   if (enable_legacy_ioport_free)
+   pci_release_selected_regions(pdev, bars);
+   else
+   pci_release_regions(pdev);
 err_pci_reg:
 err_dma:
pci_disable_device(pdev);
@@ -1234,7 +1266,10 @@ e1000_remove(struct pci_dev *pdev)
iounmap(adapter->hw.hw_addr);
if (adapter->hw.flash_address)
iounmap(adapter->hw.flash_address);
-   pci_release_regions(pdev);
+   if (enable_legacy_ioport_free)
+   pci_release_selected_regions(pdev, adapter->bars);
+   else
+   

Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-09 Thread Josh Triplett
Andrew Morton wrote:
> On Fri, 10 Aug 2007 01:23:07 +0200
> Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:
> 
>> Hello,
>>
>>  This probably doesn't have great impact ;) but ...
>> To reproduce: run torture tests for RCU and then sysrq+q.
>>
>> SysRq : Show Pending Timers
>> Timer List Version: v0.3
>> HRTIMER_MAX_CLOCK_BASES: 2
>> now at 1764338760370 nsecs
>>
>> cpu: 0
>>  clock 0:
>>   .index:  0
>>   .resolution: 1 nsecs
>>   .get_time:   ktime_get_real
>>   .offset: 1186699025823815427 nsecs
>> active timers:
>>  clock 1:
>>   .index:  1
>>   .resolution: 1 nsecs
>>   .get_time:   ktime_get
>>   .offset: 0 nsecs
>> active timers:
>>  #0: <3>BUG: sleeping function called from invalid context at kernel/mutex.c:86
>> in_atomic():1, irqs_disabled():1
>> INFO: lockdep is turned off.
>> irq event stamp: 0
>> hardirqs last  enabled at (0): [<>] 0x0
>> hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c
>> softirqs last  enabled at (0): [] copy_process+0x4c6/0x144c
>> softirqs last disabled at (0): [<>] 0x0
>>  [] show_trace_log_lvl+0x1a/0x30
>>  [] show_trace+0x12/0x14
>>  [] dump_stack+0x15/0x17
>>  [] __might_sleep+0xb7/0xc9
>>  [] mutex_lock+0x15/0x1f
>>  [] lookup_module_symbol_name+0x17/0xc0
>>  [] lookup_symbol_name+0x3f/0x43
>>  [] print_name_offset+0x1f/0x96
>>  [] timer_list_show+0x802/0xcbd
>>  [] sysrq_timer_list_show+0xc/0xe
>>  [] sysrq_handle_show_timers+0x8/0xa
>>  [] __handle_sysrq+0x7b/0x115
>>  [] handle_sysrq+0x20/0x24
>>  [] kbd_event+0x3a8/0x5c7
>>  [] input_pass_event+0x8f/0x91
>>  [] input_handle_event+0x98/0x38d
>>  [] input_event+0x54/0x67
>>  [] atkbd_interrupt+0x200/0x59e
>>  [] serio_interrupt+0x7c/0x80
>>  [] i8042_interrupt+0x17a/0x289
>>  [] handle_IRQ_event+0x28/0x59
>>  [] handle_level_irq+0xad/0x10b
>>  [] do_IRQ+0x93/0xd0
>>  [] common_interrupt+0x2e/0x34
>>  [] rcu_read_delay+0x8/0x36 [rcutorture]
>>  [] rcu_torture_reader+0x6e/0x169 [rcutorture]
>>  [] kthread+0x36/0x58
>>  [] kernel_thread_helper+0x7/0x1c
>>  ===
> 
> We seem to have made a mess in there.  timer_list_show() ends up calling
> lookup_module_symbol_name(), which takes a mutex.  However print_symbol()
> (which is called at oops time, interrupt time, etc) calls
> module_address_lookup(), which is basically the same, only it doesn't take
> the mutex.
> 
> I guess a quicky fix would be to switch
> kernel/time/timer_list.c:print_name_offset() from
> lookup_module_symbol_name() to module_address_lookup().  But we'd still
> have a mess in there.
> 
> (adds ccs, runs away)

I don't think rcutorture matters for this bug.  As far as I can tell, Andrew's
description of this problem will always apply to this particular sysrq: the
keyboard interrupt leads to handle_sysrq, which leads to timer_list_show,
which leads to lookup_module_symbol_name, which acquires a mutex.
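
The failure mode described above can be modelled in user space. The following is a toy sketch, not kernel code: every name in it (`model_might_sleep`, `model_lookup_with_mutex`, `model_lookup_lockless`, `model_sysrq_from_irq`) is invented for illustration, and the atomic-context tracking is a plain counter. It only shows why a lookup that takes a mutex trips the check when reached from an interrupt handler, while a lockless lookup does not.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-ins for kernel state; these are not the real definitions. */
static int atomic_depth;    /* > 0 while "in an interrupt" */
static int sleep_bugs;      /* how many times the check has fired */

void enter_atomic(void) { atomic_depth++; }
void leave_atomic(void) { assert(atomic_depth > 0); atomic_depth--; }

/* Models the debug check run by functions that may sleep. */
void model_might_sleep(const char *where)
{
    if (atomic_depth > 0) {
        sleep_bugs++;
        fprintf(stderr,
                "BUG: sleeping function called from invalid context at %s\n",
                where);
    }
}

/* Lookup variant that takes a mutex and therefore may sleep. */
void model_lookup_with_mutex(void)
{
    model_might_sleep("model mutex_lock");
    /* ... symbol lookup under the mutex would go here ... */
}

/* Lookup variant that takes no mutex and is usable in any context. */
void model_lookup_lockless(void)
{
    /* ... symbol lookup without sleeping would go here ... */
}

/* Simulates the sysrq path running inside the keyboard interrupt.
 * Returns how many "sleeping in atomic context" reports it caused. */
int model_sysrq_from_irq(int use_lockless)
{
    int before = sleep_bugs;

    enter_atomic();
    if (use_lockless)
        model_lookup_lockless();
    else
        model_lookup_with_mutex();
    leave_atomic();
    return sleep_bugs - before;
}
```

In this model `model_sysrq_from_irq(0)` reports one violation and `model_sysrq_from_irq(1)` reports none, which mirrors the quick fix of switching to the lookup that does not take the mutex.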

- Josh Triplett


[GIT PULL] SLUB fixes

2007-08-09 Thread Christoph Lameter
The following changes are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/christoph/slab.git to-linus

Christoph Lameter (2):
  SLUB: Remove checks for MAX_PARTIAL from kmem_cache_shrink
  SLUB: Fix dynamic dma kmalloc cache creation

Jesper Juhl (1):
  SLUB: Fix format specifier in Documentation/vm/slabinfo.c

 Documentation/vm/slabinfo.c |2 +-
 mm/slub.c   |   68 +-
 2 files changed, 48 insertions(+), 22 deletions(-)


[PATCH][Take2] PCI legacy I/O port free driver - Making Intel e1000 driver legacy I/O port free

2007-08-09 Thread Tomohiro Kusumi
Dear Auke

http://lkml.org/lkml/2007/5/16/275
> I'm ok with the bottom part of the patch, but I do not like the modification
> of the pci device ID table in this way. As Arjan van de Ven previously
> commented as well, this makes it hard for future device ID's to be bound to
> the driver.
>
> On top of that, there is no logical correlation between the mapping and
> chipsets, so a lot of information is lost in that table. It really does not
> show which _chipsets_ support this functionality.
>
> I think if we want to work with this, we need some way of mapping the device
> ID's back to chipsets, and enable the feature on that basis.

 This patch should meet your need, in that it shows the correlation
 between the device IDs and the chipsets. It does not use the existing
 macro INTEL_E1000_ETHERNET_DEVICE() to decide whether the device uses an
 I/O port. Instead of modifying the PCI device table, I've added a
 function called e1000_test_legacy_ioport(), which tells you whether the
 PCI device uses a legacy I/O port by checking its chipset.

 One thing I should say is that I am not sure which chipsets require a
 legacy I/O port. I found the following code in the e1000 driver, and it
 seems to be the only part that uses an I/O port, so I assumed that the
 following chipsets are the only ones using a legacy I/O port.
 I might be wrong, so any comments would be helpful.

drivers/net/e1000/e1000_hw.c:

 int32_t
 e1000_reset_hw(struct e1000_hw *hw)
 {
 ...
     switch (hw->mac_type) {
         case e1000_82544:
         case e1000_82540:
         case e1000_82545:
         case e1000_82546:
         case e1000_82541:
         case e1000_82541_rev_2:
             /* These controllers can't ack the 64-bit write when issuing the
              * reset, so use IO-mapping as a workaround to issue the reset */
             E1000_WRITE_REG_IO(hw, CTRL, (ctrl | E1000_CTRL_RST));
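
The chipset check described in this message could look roughly like the sketch below. The body of e1000_test_legacy_ioport() is not shown here, so this is a guess at its shape rather than the patch's code: the enum `model_mac_type` and the function `model_needs_legacy_ioport` are hypothetical, and the list of MAC types is taken only from the e1000_reset_hw() excerpt above, under the stated assumption that the reset workaround is the only user of I/O ports.

```c
#include <assert.h>

/* Hypothetical subset of MAC types, for illustration only. The real driver
 * has its own enum with more entries and different values. */
enum model_mac_type {
    MODEL_82543,
    MODEL_82544,
    MODEL_82540,
    MODEL_82545,
    MODEL_82546,
    MODEL_82541,
    MODEL_82541_REV_2,
    MODEL_82571,
    MODEL_MAC_COUNT
};

/*
 * Returns 1 if the MAC needs a legacy I/O port (it is one of the types
 * that reset through IO-mapping), 0 if it does not, and -1 if the type
 * is unknown, mirroring the "< 0 means failure" convention in the patch.
 */
int model_needs_legacy_ioport(int mac_type)
{
    if (mac_type < 0 || mac_type >= MODEL_MAC_COUNT)
        return -1;

    switch (mac_type) {
    case MODEL_82544:
    case MODEL_82540:
    case MODEL_82545:
    case MODEL_82546:
    case MODEL_82541:
    case MODEL_82541_REV_2:
        return 1;
    default:
        return 0;
    }
}

/* Small self-check showing the intended use. */
void model_ioport_self_check(void)
{
    assert(model_needs_legacy_ioport(MODEL_82546) == 1);
    assert(model_needs_legacy_ioport(MODEL_82571) == 0);
}
```

A probe routine would then request both memory and I/O BARs when the result is 1, and memory BARs only when it is 0.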

> I also would like this option to be non-default, IOW use legacy IO by default,
> and allow the user to specify a module load option to disable use of this
> feature:

 I've also added a module parameter so that the user can choose whether
 to enable legacy I/O port free operation. The option is off by default.

 The rest of the patch is unchanged from my previous version. Any comments
 would be helpful.

Tomohiro Kusumi

Signed-off-by: Tomohiro Kusumi <[EMAIL PROTECTED]>

---
diff -Nurp linux-2.6.22.org/drivers/net/e1000/e1000.h linux-2.6.22/drivers/net/e1000/e1000.h
--- linux-2.6.22.org/drivers/net/e1000/e1000.h  2007-07-09 08:32:17.0 +0900
+++ linux-2.6.22/drivers/net/e1000/e1000.h  2007-08-10 09:56:03.0 +0900
@@ -342,6 +342,9 @@ struct e1000_adapter {
boolean_t quad_port_a;
unsigned long flags;
uint32_t eeprom_wol;
+
+   int use_ioport;
+   int bars;
 };

 enum e1000_state_t {
diff -Nurp linux-2.6.22.org/drivers/net/e1000/e1000_main.c linux-2.6.22/drivers/net/e1000/e1000_main.c
--- linux-2.6.22.org/drivers/net/e1000/e1000_main.c 2007-07-09 08:32:17.0 +0900
+++ linux-2.6.22/drivers/net/e1000/e1000_main.c 2007-08-10 13:09:25.0 +0900
@@ -222,6 +222,11 @@ static pci_ers_result_t e1000_io_error_d
 static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev);
 static void e1000_io_resume(struct pci_dev *pdev);

+static unsigned int enable_legacy_ioport_free = 0;
+module_param(enable_legacy_ioport_free, uint, 0644);
+MODULE_PARM_DESC(enable_legacy_ioport_free, "Enable legacy I/O port free (default:0)");
+static int e1000_test_legacy_ioport(struct pci_dev *pdev);
+
 static struct pci_error_handlers e1000_err_handler = {
.error_detected = e1000_io_error_detected,
.slot_reset = e1000_io_slot_reset,
@@ -868,8 +873,22 @@ e1000_probe(struct pci_dev *pdev,
int i, err, pci_using_dac;
uint16_t eeprom_data = 0;
uint16_t eeprom_apme_mask = E1000_EEPROM_APME;
-   if ((err = pci_enable_device(pdev)))
+   int bars = 0;
+   int use_ioport = 0;
+
+   if (enable_legacy_ioport_free) {
+   if ((use_ioport = e1000_test_legacy_ioport(pdev)) < 0) {
+   E1000_ERR("e1000_test_legacy_ioport failed, aborting\n");
+   return -1;
+   }
+   if (use_ioport)
+   bars = pci_select_bars(pdev, IORESOURCE_MEM | IORESOURCE_IO);
+   else
+   bars = pci_select_bars(pdev, IORESOURCE_MEM);
+   }
+   else if ((err = pci_enable_device(pdev))) {
return err;
+   }

if (!(err = pci_set_dma_mask(pdev, DMA_64BIT_MASK)) &&
!(err = pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK))) {
@@ -883,7 +902,8 @@ e1000_probe(struct pci_dev *pdev,
pci_using_dac = 0;
}

-   if ((err = 

Question about PF_NOFREEZE

2007-08-09 Thread jidong xiao
If a thread sets PF_NOFREEZE in current->flags, the thread is
unfreezable. Does this mean that when the system enters a suspended
state, this thread stays awake even though all the other threads have
already gone to sleep?

What confuses me is this: if all the other threads go to sleep, can this
one thread (assuming only one thread has marked itself unfreezable)
still work properly?

Regards
Jason


Re: EFI e820 map handling

2007-08-09 Thread Yinghai Lu
On 8/9/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
>
> Hallo,
>
> I thought a bit about the zero page problem. I really would prefer to not
> having it used in a boot loader right now because it's not extensible anymore
> when external users start (ab)using it.
>
> When I asked for separate EFI->e820 functions I was really thinking
> of the kernel to do the conversion; not the boot loader.
>
> Could you move that code into the kernel early boot code please?
> e.g. on x86-64 it could be in head64.c.  It could stuff the result
> into the zero page to pass it cleanly on without special cases later.
>
> On i386 a head32.c that runs before start_kernel() could be also
> introduced for this.
>
> As long as it's localized there it is fine.
>
> This would also allow to define new private e820 types and extend
> the string decoding in e820; so that dmesg will correctly contain
>
> EFI: 
>
> instead of
>
> BIOS-e820: ...
>
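
To make the proposal above concrete, here is a minimal sketch of what an in-kernel EFI to e820 conversion step might do for one memory descriptor. It is not the code under discussion. All identifiers are hypothetical (`model_` prefix); the EFI type numbers follow the UEFI specification and the e820 type numbers follow the BIOS convention. Treating loader and boot-services memory as usable RAM is an assumption that only holds once boot services have been exited, and the 4 KiB page size is the EFI page size.

```c
#include <assert.h>
#include <stdint.h>

/* EFI memory types, numbered as in the UEFI specification. */
enum model_efi_type {
    MODEL_EFI_RESERVED = 0,
    MODEL_EFI_LOADER_CODE = 1,
    MODEL_EFI_LOADER_DATA = 2,
    MODEL_EFI_BOOT_SERVICES_CODE = 3,
    MODEL_EFI_BOOT_SERVICES_DATA = 4,
    MODEL_EFI_RUNTIME_SERVICES_CODE = 5,
    MODEL_EFI_RUNTIME_SERVICES_DATA = 6,
    MODEL_EFI_CONVENTIONAL_MEMORY = 7,
    MODEL_EFI_UNUSABLE_MEMORY = 8,
    MODEL_EFI_ACPI_RECLAIM_MEMORY = 9,
    MODEL_EFI_ACPI_MEMORY_NVS = 10
};

/* e820 range types. */
enum model_e820_type {
    MODEL_E820_RAM = 1,
    MODEL_E820_RESERVED = 2,
    MODEL_E820_ACPI = 3,
    MODEL_E820_NVS = 4
};

struct model_efi_desc {
    uint32_t type;
    uint64_t phys_addr;
    uint64_t num_pages;     /* in 4 KiB EFI pages */
};

struct model_e820_entry {
    uint64_t addr;
    uint64_t size;          /* in bytes */
    uint32_t type;
};

/* Map one EFI memory type to an e820 type; anything unknown is reserved. */
uint32_t model_efi_to_e820_type(uint32_t efi_type)
{
    switch (efi_type) {
    case MODEL_EFI_LOADER_CODE:
    case MODEL_EFI_LOADER_DATA:
    case MODEL_EFI_BOOT_SERVICES_CODE:
    case MODEL_EFI_BOOT_SERVICES_DATA:
    case MODEL_EFI_CONVENTIONAL_MEMORY:
        return MODEL_E820_RAM;
    case MODEL_EFI_ACPI_RECLAIM_MEMORY:
        return MODEL_E820_ACPI;
    case MODEL_EFI_ACPI_MEMORY_NVS:
        return MODEL_E820_NVS;
    default:
        return MODEL_E820_RESERVED;
    }
}

/* Convert one EFI descriptor into an e820-style entry. */
struct model_e820_entry model_convert(const struct model_efi_desc *d)
{
    struct model_e820_entry e;

    assert(d);
    e.addr = d->phys_addr;
    e.size = d->num_pages << 12;
    e.type = model_efi_to_e820_type(d->type);
    return e;
}
```

Keeping the mapping in one function is what would let early boot code stuff the result into the zero page without special cases later.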

What about the case where elilo is used to load FreeBSD or OpenSolaris?

YH


Re: [PATCH][RFC] 4K stacks default, not a debug thing any more...?

2007-08-09 Thread Neil Brown
On Friday August 10, [EMAIL PROTECTED] wrote:
> On 8/1/07, Neil Brown <[EMAIL PROTECTED]> wrote:
> 
> > No, this does not use indefinite stack.
> >
> > loop will schedule each request to be handled by a kernel thread, so
> > requests to 'loop' are serialised, never stacked.
> >
> > In 2.6.22, generic_make_request detects and serialises recursive calls,
> > so unlimited recursion is not possible there either.
> 
> Is that saying "before 2.6.22, a read/write on a deeply layered device
> would use a lot of stack?"

before 2.6.22, a stack of dm and/or md devices (not loop, and not
md/raid0 or md/linear) would use more stack the more devices were
involved.  If you made a very deep stack, you could push the stack
over any limit you chose.

I won't say "a lot of stack" as I haven't measured the exact amount,
just "more stack as you add more devices".
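
The difference between the two behaviours can be shown with a small user-space model. This is a sketch of the general technique (turning nested submission into a queue drained by the outermost caller), not the generic_make_request() source. The names `model_req`, `model_submit_recursive`, `model_submit_serialised` and `model_max_depth` are invented, and the fixed-size queue is a simplification.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_PENDING 64

struct model_req {
    int level;      /* layer this request targets; 0 is the bottom device */
};

static int cur_depth, max_depth;    /* nesting depth of handler calls */

static void note_enter(void)
{
    cur_depth++;
    if (cur_depth > max_depth)
        max_depth = cur_depth;
}

/* Naive version: every layer calls straight into the layer below it. */
void model_submit_recursive(struct model_req r)
{
    note_enter();
    if (r.level > 0) {
        struct model_req lower = { r.level - 1 };
        model_submit_recursive(lower);
    }
    cur_depth--;
}

/* Serialised version: nested submissions are queued, not nested. */
static struct model_req pending[MODEL_MAX_PENDING];
static size_t head, tail;
static int active;      /* nonzero while the outermost call is draining */

void model_submit_serialised(struct model_req r);

static void model_handle(struct model_req r)
{
    note_enter();
    if (r.level > 0) {
        struct model_req lower = { r.level - 1 };
        model_submit_serialised(lower);     /* returns at once */
    }
    cur_depth--;
}

void model_submit_serialised(struct model_req r)
{
    if (active) {
        /* Called from inside a handler: just remember the request. */
        assert(tail < MODEL_MAX_PENDING);
        pending[tail++] = r;
        return;
    }
    active = 1;
    head = tail = 0;
    pending[tail++] = r;
    while (head < tail)
        model_handle(pending[head++]);
    active = 0;
}

/* Submit one request to the top of a stack of `layers` devices and
 * report the deepest handler nesting that was reached. */
int model_max_depth(int layers, int serialised)
{
    struct model_req top = { layers - 1 };

    assert(layers > 0 && layers <= MODEL_MAX_PENDING);
    cur_depth = max_depth = 0;
    if (serialised)
        model_submit_serialised(top);
    else
        model_submit_recursive(top);
    return max_depth;
}
```

With ten layers the recursive variant nests ten handlers deep, while the serialised variant never nests more than one, whatever the number of layers.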

NeilBrown


Re: [PATCH][RFC] 4K stacks default, not a debug thing any more...?

2007-08-09 Thread Dan Merillat
On 8/1/07, Neil Brown <[EMAIL PROTECTED]> wrote:

> No, this does not use indefinite stack.
>
> loop will schedule each request to be handled by a kernel thread, so
> requests to 'loop' are serialised, never stacked.
>
> In 2.6.22, generic_make_request detects and serialises recursive calls,
> so unlimited recursion is not possible there either.

Is that saying "before 2.6.22, a read/write on a deeply layered device
would use a lot of stack?"


Re: [-mm PATCH 0/9] Memory controller introduction (v4)

2007-08-09 Thread Vaidyanathan Srinivasan

KAMEZAWA Hiroyuki wrote:
> On Wed, 8 Aug 2007 12:51:39 +0900
> KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:
> 
>> On Sat, 28 Jul 2007 01:39:37 +0530
>> Balbir Singh <[EMAIL PROTECTED]> wrote:
>>> At OLS, the resource management BOF, it was discussed that we need to manage
>>> RSS and unmapped page cache together. This patchset is a step towards that
>>>
>> Can I ask a question? Why limit RSS instead of the # of used pages per
>> container? Maybe because of shared pages between containers
> Sorry, ignore the above question.
> I didn't understand what mem_container_charge() accounts and limits.
> It controls the # of meta_pages.

Hi Kame,

Actually, what is charged is the number of pages resident in memory that
were brought in by a container.  However, each such page has a meta_page
allocated to hold the extra data.

Yes, the accounting counts the number of meta_pages, which is the same as
the number of mapped and unmapped (pagecache) pages brought into system
memory by this container.  Whether pagecache pages are included is
configurable per container through the 'type' file in containerfs.
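
The accounting rule described above can be sketched as a simple counter. This is a toy model and not the patchset's code: `model_container`, `model_charge` and `model_uncharge` are invented names, the `count_pagecache` field merely stands in for the per-container 'type' setting, and the sketch simply refuses a charge at the limit where a real controller would presumably try to reclaim first.

```c
#include <assert.h>

enum model_page_kind { MODEL_MAPPED, MODEL_PAGECACHE };

struct model_container {
    unsigned long usage;    /* pages charged, one meta_page each */
    unsigned long limit;    /* maximum pages this container may hold */
    int count_pagecache;    /* stand-in for the 'type' file setting */
};

/* Decide whether a page of this kind is accounted for the container. */
static int model_is_accounted(const struct model_container *c,
                              enum model_page_kind kind)
{
    return kind == MODEL_MAPPED || c->count_pagecache;
}

/* Charge one page. Returns 0 on success (or if the page kind is not
 * accounted), -1 if the charge would exceed the limit. */
int model_charge(struct model_container *c, enum model_page_kind kind)
{
    assert(c);
    if (!model_is_accounted(c, kind))
        return 0;
    if (c->usage >= c->limit)
        return -1;
    c->usage++;
    return 0;
}

/* Uncharge one page that was previously charged with the same kind and
 * the same count_pagecache setting. */
void model_uncharge(struct model_container *c, enum model_page_kind kind)
{
    assert(c);
    if (!model_is_accounted(c, kind))
        return;
    assert(c->usage > 0);
    c->usage--;
}
```

With `count_pagecache` set to 0 only mapped pages move the counter; setting it to 1 makes pagecache pages count against the same limit.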

--Vaidy



Re: Problems with USB disk

2007-08-09 Thread Bill Davidsen

Greg KH wrote:
> On Tue, Aug 07, 2007 at 10:26:15PM +0200, Niels wrote:
>> Hi,
>>
>> I'm having problems with a new 500 GB USB disk. It works, but sometimes I
>> get these in dmesg:
>>
>> usb 1-3: reset high speed USB device using ehci_hcd and address 2
>> usb 5-1: USB disconnect, address 2
>> drivers/usb/class/usblp.c: usblp0: removed
>> sd 0:0:0:0: Device not ready: <6>: Sense Key : 0x2 [current]
>> : ASC=0x4 ASCQ=0x2
>> end_request: I/O error, dev sda, sector 254148215
>> sd 0:0:0:0: Device not ready: <6>: Sense Key : 0x2 [current]
>> : ASC=0x4 ASCQ=0x2
>> end_request: I/O error, dev sda, sector 252434023
>> EXT3-fs error (device sda1): ext3_find_entry: reading directory #15761836
>> offset 0
>>
>> There's also a printer connected. This is on a pci/usb2 card. When the above
>> happens, I get I/O errors. When I mount the drive next, there are errors
>> and often missing files. Quite annoying!
>>
>> Kernel is 2.6.21
>>
>> What's going on?
>
> You have a low voltage issue, or a bad cable.  The device is
> electronically disconnecting itself.  Try using an externally-powered
> hub, or a new cable.

I see the external drive becoming read-only, although I haven't checked
the dmesg for the events, since other things in my system generate a
bunch of output I have to wade through.

New cable, separate power, doesn't do it under 2.6.20-* Fedora or
2.6.21.x kernel.org kernels.

I'll check the dmesg next time it happens, but I doubt a kernel version
change would heal the hardware issues you mention.

--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


Re: rtc max frequency setting

2007-08-09 Thread Bill Davidsen

H. Peter Anvin wrote:
> Jan Engelhardt wrote:
>> Hi,
>>
>> with the old rtc.ko module, there was a /proc/sys/dev/rtc/max-user-freq
>> that could be set. With rtc_cmos.ko (or the new rtc infrastructure in
>> general), I am missing this file. Where can I set the max-user-freq now,
>> or is this obsolete now? (mplayer prefers to have user-freq to be >= 1024.)
>
> Qemu wants something like this too.  Both of these really want something
> else, which is a high-frequency userspace timer.
>
> What is the best way to do that on modern kernels?

/proc/sys/dev/hpet/max-user-freq? But I notice that some kernels provide
both values (my 2.6.15, was where I looked), so maybe the rtc went away.

--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot



Unable to Handle kernel paging request at virtual address during rmmod/insmod

2007-08-09 Thread Grace Baldonasa
Hi,

I'm somewhat lost in my debugging and hope someone out there can help me
figure out the problem.
I am loading 4 modules: class driver, device controller, interface
layer (working on a dual-core) and abstraction layer.
The device has to work as a mass storage class device. After doing some
file transfers and more complex testing, I unload the modules, and this
is where the problem appears. Most of the time I get "unable to handle
kernel paging request", and it happens at different times. I implemented
a memory counter in all the modules to ensure that everything kmalloc'd
is kfree'd, so I am confident there is no memory leak in any module.

My understanding from the call trace is that it fails somewhere in
free_block/sys_delete_module, etc. These routines are internal to the
kernel and are called on rmmod.
Is there anything I can do in my modules to resolve this issue?

I use vmalloc to allocate a 64K buffer twice in one of the modules. In
the HAL abstraction layer, ioremap_nocache is also used.
Could the problem be related to this?
I am using linux-2.6.18.


Any help will be greatly appreciated.


grace


Re: [PATCH 00/23] per device dirty throttling -v8

2007-08-09 Thread Bill Davidsen

Andi Kleen wrote:
> richard kennedy <[EMAIL PROTECTED]> writes:
>> This is on a standard desktop machine so there are lots of other
>> processes running on it, and although there is a degree of variability
>> in the numbers, they are very repeatable and your patch always
>> outperforms the stock mm2.
>> looks good to me
>
> iirc the goal of this is less to get better performance, but to avoid long
> user visible latencies.  Of course if it's faster it's great too, but that's
> only secondary.

What a trade-off, if you want to get rid of long latency you have to
live with better throughput. I can live with that. ;-)

Your point is well taken: that is not the intent of the patch, but it may
also indicate where a performance bottleneck occurs.

--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


Oops in 2.6.21-gentoo-r4 during rsync

2007-08-09 Thread Sol Jerome
Hi, I get the following error when doing an 'rsync -avz --progress.' Any
help would be appreciated.

Unable to handle kernel NULL pointer dereference at  RIP:
[] __rmqueue+0x6c/0x140
PGD 7b24f067 PUD 71ab9067 PMD 0
Oops:  [1] SMP
CPU 1
Modules linked in: rfcomm snd_seq snd_seq_device bridge llc
acpi_memhotplug pciehp pci_hotplug kvm saa7134_alsa saa7134_dvb dvb_pll
tda826x mt352 tda10086 video_buf_dvb dvb_core nxt200x isl6421 tda1004x
udf ext4dev jbd2 dm_bbr ntfs vfat msdos fat i2c_amd756 i2c_dev usblp
usb_storage uhci_hcd quickcam_messenger usbvideo nbd snd_pcm_oss
snd_mixer_oss ieee80211_crypt_tkip ieee80211_crypt_ccmp
ieee80211_crypt_wep ieee80211 ieee80211_crypt fuse aes bfusb bcm203x
bnep sco hidp l2cap pwc hci_usb usbhid tuner saa7134 video_buf
compat_ioctl32 ir_kbd_i2c ir_common videodev v4l2_common v4l1_compat
i2c_nforce2 i2c_core ehci_hcd ohci_hcd k8temp sg
Pid: 9731, comm: rsync Not tainted 2.6.21-gentoo-r4 #1
RIP: 0010:[] [] __rmqueue+0x6c/0x140
RSP: 0018:81007266d4d8 EFLAGS: 00010013
RAX:  RBX:  RCX: 
RDX:  RSI:  RDI: 8100c600
RBP: 8100c698 R08:  R09: 
R10: 0929 R11: 0001 R12: 
R13: 81007e3a0d40 R14: 8100c600 R15: 001f
FS: 2b548bc18ae0() GS:81007e3a0cc0() knlGS:
CS: 0010 DS:  ES:  CR0: 8005003b
CR2:  CR3: 702d4000 CR4: 06e0
Process rsync (pid: 9731, threadinfo 81007266c000, task
8100711a4a20)
Stack: 0082 81007e3a0d50  
81007e3a0d40 802645f0 81007266d558 0001
8100d868 00018022d104 8100d860 0050
Call Trace:
[] get_page_from_freelist+0x1e0/0x4e0
[] __alloc_pages+0x180/0x300
[] kmem_getpages+0x70/0x140
[] fallback_alloc+0x104/0x1c0
[] kmem_cache_alloc+0x97/0xb0
[] journal_add_journal_head+0x27/0x170
[] journal_dirty_data+0x37/0x210
[] ext3_journal_dirty_data+0x1d/0x50
[] walk_page_buffers+0x68/0xb0
[] journal_dirty_data_fn+0x0/0x20
[] ext3_ordered_writepage+0x108/0x190
[] shrink_inactive_list+0x442/0x8a0
[] __split_bio+0x3cc/0x3f0
[] __up_read+0x21/0xb0
[] dm_request+0x119/0x120
[] generic_make_request+0x159/0x170
[] shrink_zone+0xf6/0x130
[] try_to_free_pages+0x183/0x280
[] __alloc_pages+0x1df/0x300
[] __do_page_cache_readahead+0xe4/0x290
[] __pollwait+0x0/0x120
[] transfer_objects+0x52/0x80
[] __activate_task+0x33/0x50
[] try_to_wake_up+0x3dc/0x400
[] dm_table_any_congested+0x15/0x70
[] blockable_page_cache_readahead+0x6d/0xe0
[] make_ahead_window+0x86/0xb0
[] page_cache_readahead+0x195/0x1e0
[] do_generic_mapping_read+0x127/0x410
[] file_read_actor+0x0/0x140
[] generic_file_aio_read+0x16c/0x1b0
[] do_sync_read+0xcf/0x120
[] autoremove_wake_function+0x0/0x30
[] vfs_read+0xdb/0x180
[] sys_read+0x53/0x90
[] system_call+0x7e/0x83


Code: 48 8b 08 48 8b 50 08 4c 8d 68 d8 48 89 51 08 48 89 0a 48 c7
RIP [] __rmqueue+0x6c/0x140
RSP 
CR2: 

Thanks,
Sol


oops_dmesg
Description: Binary data


Re: [PATCH 00/23] per device dirty throttling -v8

2007-08-09 Thread Bill Davidsen

[EMAIL PROTECTED] wrote:

On Sun, 5 Aug 2007, Diego Calleja wrote:


El Sun, 5 Aug 2007 09:13:20 +0200, Ingo Molnar <[EMAIL PROTECTED]> escribió:


Measurements show that noatime helps 20-30% on regular desktop
workloads, easily 50% for kernel builds and much more than that (in
excess of 100%) for file-read-intense workloads. We cannot just walk



And as everybody knows, on servers it is popular practice to disable it.
According to an interview with the kernel.org admins:

"Beyond that, Peter noted, "very little fancy is going on, and that is good
because fancy is hard to maintain." He explained that the only fancy thing
being done is that all filesystems are mounted noatime meaning that the
system doesn't have to make writes to the filesystem for files which are
simply being read, "that cut the load average in half."

I bet that some people would consider such performance hit a bug...



actually, it's popular practice to disable it by people who know how big 
a hit it is and know how few programs use it.


i've been a linux sysadmin for 10 years, and have known about noatime 
for at least 7 years, but I always thought of it in the category of 'use 
it only on your performance critical machines where you are trying to 
extract every ounce of performance, and keep an eye out for things 
misbehaving'


I never imagined that it was the 20%+ hit that is being described, and 
with so little impact, or I would have switched to it across the board 
years ago.


To get that magnitude you need a slow disk with a very fast CPU. It helps 
most on systems where the disk hardware is marginal or worse for the I/O 
load. Don't take that as typical.



I'll bet there are a lot of admins out there in the same boat.

adding an option in the kernel to change the default sounds like a very 
good first step, even if the default isn't changed today.




--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
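
For admins who want to see which filesystems on a box are already mounted
noatime, here is a minimal C sketch. It only tests a comma-separated option
string of the kind found in the fourth field of a /proc/mounts line. The
name demo_has_mount_option() is made up for this illustration; it is not a
kernel or libc interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the comma-separated option list `opts`
 * contains an option that is exactly `name`, otherwise 0.
 */
int demo_has_mount_option(const char *opts, const char *name)
{
    size_t n;
    const char *p;

    assert(opts != NULL && name != NULL);
    n = strlen(name);
    p = opts;

    while (*p) {
        const char *comma = strchr(p, ',');
        size_t len = comma ? (size_t)(comma - p) : strlen(p);

        /* Compare whole options only, so "nodiratime" != "noatime". */
        if (len == n && strncmp(p, name, n) == 0)
            return 1;
        p = comma ? comma + 1 : p + len;
    }
    return 0;
}
```

Feeding it the option field of each /proc/mounts line shows which mounts
would still pay the atime update cost discussed above.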



[PATCH] [MTD] Fix CFI build error with meaningless nonfunctional .config

2007-08-09 Thread David Woodhouse
On Mon, 2007-08-06 at 18:15 +0200, Ingo Molnar wrote:
> randconfig testing on .23-rc2 triggered the following build error:

When building NOR flash support, you have compile-time options for the
bus width and the number of individual chips which are interleaved
together onto that bus. The code to deal with arbitrary geometry is a
bit convoluted, and people want to just configure it for the specific
hardware they have, to avoid the runtime overhead.

Selecting _none_ of the available options doesn't make any sense. You
should have at least one. This makes it build though, since people
persist in trying.

Signed-off-by: David Woodhouse <[EMAIL PROTECTED]>

diff --git a/include/linux/mtd/cfi.h b/include/linux/mtd/cfi.h
index 123948b..e17c534 100644
--- a/include/linux/mtd/cfi.h
+++ b/include/linux/mtd/cfi.h
@@ -57,6 +57,15 @@
 #define cfi_interleave_is_8(cfi) (0)
 #endif
 
+#ifndef cfi_interleave
+#warning No CONFIG_MTD_CFI_Ix selected. No NOR chip support can work.
+static inline int cfi_interleave(void *cfi)
+{
+   BUG();
+   return 0;
+}
+#endif
+
 static inline int cfi_interleave_supported(int i)
 {
switch (i) {
diff --git a/include/linux/mtd/map.h b/include/linux/mtd/map.h
index 81f3a31..a9fae03 100644
--- a/include/linux/mtd/map.h
+++ b/include/linux/mtd/map.h
@@ -125,7 +125,15 @@
 #endif
 
 #ifndef map_bankwidth
-#error "No bus width supported. What's the point?"
+#warning "No CONFIG_MTD_MAP_BANK_WIDTH_xx selected. No NOR chip support can work"
+static inline int map_bankwidth(void *map)
+{
+   BUG();
+   return 0;
+}
+#define map_bankwidth_is_large(map) (0)
+#define map_words(map) (0)
+#define MAX_MAP_BANKWIDTH 1
 #endif
 
 static inline int map_bankwidth_supported(int w)


-- 
dwmw2
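
The fallback pattern used in the patch can be tried outside the kernel. The
sketch below is a user-space illustration only: the DEMO_* option macros,
struct demo_map and demo_bankwidth() are hypothetical stand-ins for the
real config symbols and accessors, and abort() stands in for BUG().

```c
#include <assert.h>
#include <stdlib.h>

/* Pretend exactly one width was selected at configuration time. */
#define DEMO_BANK_WIDTH_2 1

struct demo_map {
    int bankwidth;   /* run-time value, used when several widths are built in */
};

#ifdef DEMO_BANK_WIDTH_1
# define demo_bankwidth(map) ((void)(map), 1)
#endif

#ifdef DEMO_BANK_WIDTH_2
# ifdef demo_bankwidth
   /* More than one width selected: fall back to the run-time field. */
#  undef demo_bankwidth
#  define demo_bankwidth(map) ((map)->bankwidth)
# else
   /* Only this width selected: the accessor is a compile-time constant. */
#  define demo_bankwidth(map) ((void)(map), 2)
# endif
#endif

#ifndef demo_bankwidth
/* Nothing selected: keep the build working, but fail loudly if ever called. */
static inline int demo_bankwidth(struct demo_map *map)
{
    (void)map;
    abort();
    return 0;
}
#endif
```

With a single option selected the accessor folds to a constant, which is the
run-time saving the commit message refers to. With none selected the code
still compiles, but any caller aborts.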



Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK

2007-08-09 Thread Christoph Lameter
On Thu, 9 Aug 2007, Daniel Phillips wrote:

> If you believe that the deadlock problems we address here can be
> better fixed by making reclaim more intelligent then please post a
> patch and we will test it.  I am highly skeptical, but the proof is in
> the patch.

Then please test the patch that I posted here earlier to reclaim even if 
PF_MEMALLOC is set. It may require some fixups but it should address your 
issues in most vm load situations.



[PATCH] Minor fix to Documentation/powerpc/00-INDEX

2007-08-09 Thread Rob Landley
Signed-off-by: Rob Landley <[EMAIL PROTECTED]>

I have a python script to convert 00-INDEX files into index.html files, and a
second script to show 404 errors in the result as well as files/directories
nothing links to.   (It's not very useful yet, but in case you're wondering
http://kernel.org/doc/docdiridx.py and http://kernel.org/doc/doclinkcheck.py
.)

Anyway, my simple index.html generator breaks on the Documentation/powerpc
directory because one of the description lines is two lines long.  This
patch joins those two lines together into one line.  This is the only
instance (so far) of this problem.

---

In case you're wondering, here are the current 404 errors in the
various 00-INDEX files.  Fixing all this is on my todo list:

Documentation/ecryptfs.txt
Documentation/time_interpolators.txt
Documentation/arm/SA1100
Documentation/arm/XScale
Documentation/arm/empeg
Documentation/arm/nwfpe
Documentation/isdn/README.eicon
Documentation/fb/clgenfb.txt
Documentation/networking/ethertap.txt
Documentation/filesystems/reiser4.txt
Documentation/scsi/AM53C974.txt
Documentation/scsi/ChangeLog

The "files and directories not linked to" list is 679 lines long.

diff -r /dev/null Documentation/powerpc/00-INDEX
--- a/Documentation/powerpc/00-INDEX	Thu Aug 09 08:40:21 2007 -0700
+++ b/Documentation/powerpc/00-INDEX	Thu Aug 09 20:49:03 2007 -0500
@@ -6,8 +6,7 @@ 00-INDEX
 00-INDEX
- this file
 cpu_features.txt
-   - info on how we support a variety of CPUs with minimal compile-time
-   options.
+   - info on how we support a variety of CPUs with minimal compile-time options.
 eeh-pci-error-recovery.txt
- info on PCI Bus EEH Error Recovery
 hvcs.txt

-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.
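
An alternative to editing the index file is to make the parser tolerate
wrapped descriptions. The following is a rough C sketch of that idea, not
the Python script mentioned above: demo_join_continuations() is a made-up
name, and it assumes a continuation line is any indented, non-empty line
that does not start with "-".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `in` to `out`, folding each continuation line into the line before
 * it with a single space. Every output line ends in '\n'.
 * Returns 0 on success, -1 if `out` is too small.
 */
int demo_join_continuations(const char *in, char *out, size_t outlen)
{
    size_t o = 0;
    const char *p = in;

    assert(in != NULL && out != NULL);

    while (*p) {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);
        const char *q = p;
        size_t rest;
        int continuation;

        /* Skip leading blanks to find the first real character. */
        while (q < p + len && (*q == ' ' || *q == '\t'))
            q++;
        rest = len - (size_t)(q - p);

        /* Indented, non-empty, not a new "- ..." entry, and not first. */
        continuation = (q > p) && rest > 0 && *q != '-' && o > 0;

        if (continuation) {
            o--;    /* drop the '\n' that ended the previous line */
            while (o > 0 && (out[o - 1] == ' ' || out[o - 1] == '\t'))
                o--;    /* and any trailing blanks before it */
            if (o + 1 + rest + 1 >= outlen)
                return -1;
            out[o++] = ' ';
            memcpy(out + o, q, rest);
            o += rest;
        } else {
            if (o + len + 1 >= outlen)
                return -1;
            memcpy(out + o, p, len);
            o += len;
        }
        out[o++] = '\n';
        p = eol ? eol + 1 : p + len;
    }
    if (o >= outlen)
        return -1;
    out[o] = '\0';
    return 0;
}
```

Run over the old powerpc index, this would fold the stray "options." line
into the cpu_features.txt description before any HTML is generated.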


mlock() is working?

2007-08-09 Thread Tiago Vignatti

Hi guys,

I'm trying to lock some piece of the code in memory using mlock(). I wrote 
a simple program to test it, and to verify the result I am using my own 
simple page fault notifier [0]. The program is below.


-- cut --

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SIZE 1

int mlock_all = 0;

int
f(void)
{
	int c[SIZE];
	int i;

	if (mlock_all) {
		if (!mlockall(MCL_CURRENT))
			fprintf(stderr, "mlockall'ed successfully\n");
		else
			perror("mlockall");
	}
	else {
		/* note: the length argument is in bytes, not array elements */
		if (!mlock(&c[0], SIZE))
			fprintf(stderr, "mlock'ed successfully\n");
		else
			perror("mlock");
	}

	fprintf(stderr, "start: %p, end: %p\n", (void *)&c[0], (void *)&c[SIZE]);

	for (i = 0; i < SIZE; i++)
		c[i] = i;

	return 0;
}

int
main(int argc, char **argv)
{
	if (argv[1])
		mlock_all = 1;

	while(1) {
		f();
		sleep (15);
	}

	return 0;
}

-- cut --


So, if I use mlockall() I always obtain the desired result, i.e., 
'c[SIZE]' is locked. But when I switch to mlock() it never works, and my 
page fault notifier prints all the pages belonging to 'c[SIZE]'. Am I 
missing something? Is it possible to lock automatic variables?


My Linux is 2.6.22.2.

my regards

[0] http://lkml.org/lkml/2007/7/27/11
http://lkml.org/lkml/2007/7/27/8

--
Tiago Vignatti
C3SL - Centro de Computação Científica e Software Livre
www.c3sl.ufpr.br
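
For reference, mlock() works on whole pages and takes its length in bytes,
so a stack buffer can be handed to it once the range is rounded out to page
boundaries. The sketch below shows that rounding. The helper names
page_span() and lock_buffer() are made up for this illustration, and it
does not attempt to diagnose the behaviour reported above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: round [addr, addr + len) out to whole pages.
 * page_size must be a power of two.
 */
void page_span(uintptr_t addr, size_t len, size_t page_size,
               uintptr_t *start, size_t *span)
{
    uintptr_t mask, first, last;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    mask = ~(uintptr_t)(page_size - 1);
    first = addr & mask;                          /* round start down */
    last = (addr + len + page_size - 1) & mask;   /* round end up */
    *start = first;
    *span = (size_t)(last - first);
}

/*
 * Lock the pages backing a buffer, e.g. an automatic array.
 * Returns the result of mlock(), or -1 if the page size is unknown.
 */
int lock_buffer(const void *buf, size_t len)
{
    uintptr_t start;
    size_t span;
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0)
        return -1;
    page_span((uintptr_t)buf, len, (size_t)page, &start, &span);
    return mlock((const void *)start, span);
}
```

A caller would use lock_buffer(c, sizeof(c)) for an array c. Note that the
stack pages stay locked after the function returns, until munlock() or
process exit.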


Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK

2007-08-09 Thread Daniel Phillips
On 8/9/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> The allocations problems that this patch addresses can be fixed by making 
> reclaim
> more intelligent.

If you believe that the deadlock problems we address here can be
better fixed by making reclaim more intelligent then please post a
patch and we will test it.  I am highly skeptical, but the proof is in
the patch.

> If we can reclaim in an emergency even in ATOMIC contexts then things get much
> easier.

It is already easy, and it is already fixed in this patch series.
Sure, we can pare these patches down a little more, but you are going
to have a really hard time coming up with something simpler that
actually works.

Regards,

Daniel


[PATCH] obsolete fragment from ext4

2007-08-09 Thread Coly Li

Since ext4 does not implement fragments, and it is believed they will never be
implemented in the future, the fragment-related source code in ext4 should be
removed: no one will use it.

This patch removes fragment support from ext4.
Another patch, posted on linux-ext4, removes fragment support from e2fsprogs.
I tested both patches.


Signed-Off-By: Coly Li <[EMAIL PROTECTED]>

diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 427f830..b8b538d 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -576,11 +576,6 @@ got:
/* dirsync only applies to directories */
if (!S_ISDIR(mode))
ei->i_flags &= ~EXT4_DIRSYNC_FL;
-#ifdef EXT4_FRAGMENTS
- ei->i_faddr = 0;
- ei->i_frag_no = 0;
- ei->i_frag_size = 0;
-#endif
ei->i_file_acl = 0;
ei->i_dir_acl = 0;
ei->i_dtime = 0;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a4848e0..f283522 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2645,11 +2645,6 @@ void ext4_read_inode(struct inode * inode)
}
inode->i_blocks = le32_to_cpu(raw_inode->i_blocks);
ei->i_flags = le32_to_cpu(raw_inode->i_flags);
-#ifdef EXT4_FRAGMENTS
- ei->i_faddr = le32_to_cpu(raw_inode->i_faddr);
- ei->i_frag_no = raw_inode->i_frag;
- ei->i_frag_size = raw_inode->i_fsize;
-#endif
ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl);
if (EXT4_SB(inode->i_sb)->s_es->s_creator_os !=
cpu_to_le32(EXT4_OS_HURD))
@@ -2794,11 +2789,6 @@ static int ext4_do_update_inode(handle_t *handle,
raw_inode->i_blocks = cpu_to_le32(inode->i_blocks);
raw_inode->i_dtime = cpu_to_le32(ei->i_dtime);
raw_inode->i_flags = cpu_to_le32(ei->i_flags);
-#ifdef EXT4_FRAGMENTS
- raw_inode->i_faddr = cpu_to_le32(ei->i_faddr);
- raw_inode->i_frag = ei->i_frag_no;
- raw_inode->i_fsize = ei->i_frag_size;
-#endif
if (EXT4_SB(inode->i_sb)->s_es->s_creator_os !=
cpu_to_le32(EXT4_OS_HURD))
raw_inode->i_file_acl_high =
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 75adbb6..5e04d68 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1655,14 +1655,6 @@ static int ext4_fill_super (struct super_block *sb, void *data, int silent)
if (sbi->s_inode_size > EXT4_GOOD_OLD_INODE_SIZE)
sb->s_time_gran = 1 << (EXT4_EPOCH_BITS - 2);
}
- sbi->s_frag_size = EXT4_MIN_FRAG_SIZE <<
-le32_to_cpu(es->s_log_frag_size);
- if (blocksize != sbi->s_frag_size) {
- printk(KERN_ERR
-"EXT4-fs: fragsize %lu != blocksize %u (unsupported)\n",
-sbi->s_frag_size, blocksize);
- goto failed_mount;
- }
sbi->s_desc_size = le16_to_cpu(es->s_desc_size);
if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_64BIT)) {
if (sbi->s_desc_size < EXT4_MIN_DESC_SIZE_64BIT ||
@@ -1676,7 +1668,6 @@ static int ext4_fill_super (struct super_block *sb, void *data, int silent)
} else
sbi->s_desc_size = EXT4_MIN_DESC_SIZE;
sbi->s_blocks_per_group = le32_to_cpu(es->s_blocks_per_group);
- sbi->s_frags_per_group = le32_to_cpu(es->s_frags_per_group);
sbi->s_inodes_per_group = le32_to_cpu(es->s_inodes_per_group);
if (EXT4_INODE_SIZE(sb) == 0)
goto cantfind_ext4;
@@ -1700,12 +1691,6 @@ static int ext4_fill_super (struct super_block *sb, void *data, int silent)
sbi->s_blocks_per_group);
goto failed_mount;
}
- if (sbi->s_frags_per_group > blocksize * 8) {
- printk (KERN_ERR
- "EXT4-fs: #fragments per group too big: %lu\n",
- sbi->s_frags_per_group);
- goto failed_mount;
- }
if (sbi->s_inodes_per_group > blocksize * 8) {
printk (KERN_ERR
"EXT4-fs: #inodes per group too big: %lu\n",
diff --git a/include/linux/ext4_fs.h b/include/linux/ext4_fs.h
index cdee7aa..3baeb99 100644
--- a/include/linux/ext4_fs.h
+++ b/include/linux/ext4_fs.h
@@ -105,20 +105,6 @@
 #define EXT4_BLOCK_ALIGN(size, blkbits)	ALIGN((size), (1 << (blkbits)))

 /*
- * Macro-instructions used to manage fragments
- */
-#define EXT4_MIN_FRAG_SIZE   1024
-#define  EXT4_MAX_FRAG_SIZE  4096
-#define EXT4_MIN_FRAG_LOG_SIZE 10
-#ifdef __KERNEL__
-# define EXT4_FRAG_SIZE(s)   (EXT4_SB(s)->s_frag_size)
-# define EXT4_FRAGS_PER_BLOCK(s) (EXT4_SB(s)->s_frags_per_block)
-#else
-# define EXT4_FRAG_SIZE(s)   (EXT4_MIN_FRAG_SIZE << (s)->s_log_frag_size)
-# define EXT4_FRAGS_PER_BLOCK(s) (EXT4_BLOCK_SIZE(s) / EXT4_FRAG_SIZE(s))
-#endif
-
-/*
  * Structure of a blocks group descriptor
  */
 struct 

Re: JFFS2/mtdsuper modprobe "unknown symbol" in 2.6.23-rc1

2007-08-09 Thread Erez Zadok
In message <[EMAIL PROTECTED]>, Adrian Bunk writes:
> On Thu, Aug 09, 2007 at 10:38:18PM -0400, Erez Zadok wrote:
> > I'm getting an error modprobing jffs2 due to mtdsuper failing to insmod:
> >...
> > Does anyone know what am I missing?
> 
> You miss that 2.6.23-rc2 with this bug fixed has already been released.

Great, I'll upgrade to rc2 (I've had this problem since .22-rc).  Thanks for
the quick response.

Erez.


Re: JFFS2/mtdsuper modprobe "unknown symbol" in 2.6.23-rc1

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 10:38:18PM -0400, Erez Zadok wrote:
> I'm getting an error modprobing jffs2 due to mtdsuper failing to insmod:
>...
> Does anyone know what am I missing?

You miss that 2.6.23-rc2 with this bug fixed has already been released.

> Thanks,
> Erez.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



JFFS2/mtdsuper modprobe "unknown symbol" in 2.6.23-rc1

2007-08-09 Thread Erez Zadok
I'm getting an error modprobing jffs2 due to mtdsuper failing to insmod:

# modprobe jffs2
WARNING: Error inserting mtdsuper
(/lib/modules/2.6.23-rc1/kernel/drivers/mtd/mtdsuper.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error inserting jffs2
(/lib/modules/2.6.23-rc1/kernel/fs/jffs2/jffs2.ko): Unknown
symbol in module, or unknown parameter (see dmesg)

# dmesg | tail
mtdsuper: Unknown symbol get_mtd_device
mtdsuper: Unknown symbol put_mtd_device
jffs2: Unknown symbol get_sb_mtd
jffs2: Unknown symbol kill_mtd_super

My relevant .config is:

CONFIG_MTD=m
CONFIG_MTD_BLKDEVS=m
CONFIG_MTD_BLOCK=m
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
CONFIG_MTD_BLOCK2MTD=m
CONFIG_JFFS2_FS=m
CONFIG_JFFS2_FS_DEBUG=0
CONFIG_JFFS2_FS_WRITEBUFFER=y
CONFIG_JFFS2_SUMMARY=y
CONFIG_JFFS2_FS_XATTR=y
CONFIG_JFFS2_FS_POSIX_ACL=y
CONFIG_JFFS2_FS_SECURITY=y
CONFIG_JFFS2_COMPRESSION_OPTIONS=y
CONFIG_JFFS2_ZLIB=y
CONFIG_JFFS2_RTIME=y
CONFIG_JFFS2_CMODE_PRIORITY=y

A "quick hack" around this which I found is to add

  MODULE_LICENSE("GPL");

to the end of drivers/mtd/mtdsuper.c, but that doesn't sound like the right
fix.

Does anyone know what am I missing?

Thanks,
Erez.


Re: [PATCH 3/3] sysctl: Error on bad sysctl tables

2007-08-09 Thread Eric W. Biederman
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:

> Hello.
>
> In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007
> 14:09:29 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:
>
>> After going through the kernels sysctl tables several times it has
>> become clear that code review and testing is just not effective in
>> prevent problematic sysctl tables from being used in the stable
>> kernel.  I certainly can't seem to fix the problems as fast as
>> they are introduced.
> :
>> The biggest part of the code is the table of valid binary sysctl
>> entries, but since we have frozen our set of binary sysctls this table
>> should not need to change, and it makes it much easier to detect
>> when someone unintentionally adds a new binary sysctl value.
>
> I don't think everyone needs to have this code, so
> it is better to make it configurable via
> CONFIG_SYSCTL_DEBUG or something..., ...no?

I guess the other thing is that, except for code size, it doesn't matter,
as register_sysctl_table gets called very rarely.

Eric


Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread Eric W. Biederman
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:

> In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007
> 20:23:16 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:
>
>> YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:
>> 
>> > Would you explain why it does not work properly
>> > for those cases?
>> 
>> Mostly no appropriate strategy routine was setup to 
>> report the data to the caller of sys_sysctl.
>
> I assume that default strategy have been existing for it, no?!
> Maybe, I do miss something...

I'd have to go through it case by case.  But in general
unless your proc_handler is proc_dointvec the default
strategy routine which does a raw binary copy of your data
out will generally do the wrong thing.

So especially if your data is jiffies or otherwise needs
processing you don't want to use the default strategy
routine.

Until relatively recently no one was really policing the
sysctl interfaces and even now it isn't too serious.

Eric
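
To illustrate the difference between a raw binary copy and a converting
routine, here is a user-space sketch. It is not the kernel's sysctl code:
DEMO_HZ and both demo_* functions are invented for the example, and the
value is assumed to be stored internally in jiffies while callers expect
seconds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HZ 250   /* assumed tick rate, for illustration only */

/* Analogue of a raw copy: hand the stored bytes to the caller unchanged. */
int demo_raw_copy(const unsigned long *stored_jiffies, void *out, size_t outlen)
{
    assert(stored_jiffies != NULL && out != NULL);
    if (outlen < sizeof(*stored_jiffies))
        return -1;
    memcpy(out, stored_jiffies, sizeof(*stored_jiffies));
    return 0;
}

/* Analogue of a converting routine: report the value in seconds. */
int demo_jiffies_to_seconds(const unsigned long *stored_jiffies, int *out_seconds)
{
    assert(stored_jiffies != NULL && out_seconds != NULL);
    *out_seconds = (int)(*stored_jiffies / DEMO_HZ);
    return 0;
}
```

For a 30 second timeout stored as 7500 jiffies, the raw copy hands back
7500 while the converting routine hands back 30. Only the second matches
what a caller reading a value in seconds would expect.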


Re: [PATCH 3/3] sysctl: Error on bad sysctl tables

2007-08-09 Thread Eric W. Biederman
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:

> Hello.
>
> In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007
> 14:09:29 -0600), [EMAIL PROTECTED] (Eric W. Biederman) says:
>
>> After going through the kernels sysctl tables several times it has
>> become clear that code review and testing is just not effective in
>> prevent problematic sysctl tables from being used in the stable
>> kernel.  I certainly can't seem to fix the problems as fast as
>> they are introduced.
> :
>> The biggest part of the code is the table of valid binary sysctl
>> entries, but since we have frozen our set of binary sysctls this table
>> should not need to change, and it makes it much easier to detect
>> when someone unintentionally adds a new binary sysctl value.
>
> I don't think everyone needs to have this code, so
> it is better to make it configurable via
> CONFIG_SYSCTL_DEBUG or something..., ...no?

I wouldn't reject such a patch.  We are a ways out from the next
stable kernel merge window and I'd love to see what else falls out so
I'd like to have it on by default for a bit.

Eric


Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread YOSHIFUJI Hideaki / 吉藤英明
In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 20:23:16 -0600),
[EMAIL PROTECTED] (Eric W. Biederman) says:

> YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:
> 
> > Would you explain why it does not work properly
> > for those cases?
> 
> Mostly no appropriate strategy routine was setup to 
> report the data to the caller of sys_sysctl.

I assume a default strategy routine already exists for it, no?!
Maybe I am missing something...

--yoshfuji


Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread Eric W. Biederman
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:

> Would you explain why it does not work properly
> for those cases?

Mostly no appropriate strategy routine was setup to 
report the data to the caller of sys_sysctl.

Eric


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 07:00:40PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc2-mm1:
> > >...
> > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > >...
> > >  2.6.23 queue
> > >...
> > 
> > All drivers were converted to no longer use xtime directly since it 
> > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > as RNG...
> 
> This code doesn't care if the time is outdated, as it is simply
> periodically perturbing an RNG, but OK.
>...

I should have been a bit more concrete:

I have a patch pending to unexport xtime for catching unsafe usages, and 
you added an (ab)user.

>   Thanx, Paul

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread YOSHIFUJI Hideaki / 吉藤英明
In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 18:56:09 -0600),
[EMAIL PROTECTED] (Eric W. Biederman) says:

> 
> - In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary
>   sysctl names for a function that works with proc.
:

Well, retrans_time_ms and base_reachable_time_ms supersede
retrans_time and base_reachable_time, and we have warned about
their deprecation for a long time.  So maybe it is time to remove
the old interfaces (retrans_time and base_reachable_time)
and simplify ndisc_ifinfo_syctl_change().

--yoshfuji


Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread Eric W. Biederman
Andrew Morton <[EMAIL PROTECTED]> writes:

> But it is good to remove bad interfaces, if we possibly can.
>
> It is worth making the attempt.  Does anyone know of anything which will
> break?  I fed NET_NEIGH_ANYCAST_DELAY at random into
> http://www.google.com/codesearch and came up with nothing...

My current policy is that, since I could only find 5 real-world Linux
programs that even call sys_sysctl, if I find a broken sysctl binary
interface I'm lazy and just remove it.  The only networking one I know
of is radvd.

Added to that I just pushed an autochecking sysctl patch to Andrew
that fails register_sysctl_table if the sysctl table is broken.  And
all of these showed up.  So some fix was needed or things would have
been even worse.

Eric
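
The idea of refusing to register a broken table can be sketched in user
space. The code below is a toy model and not the checker from the patch:
struct demo_ctl_entry and both functions are hypothetical, and the only
problems it looks for are duplicate binary ids and duplicate names among
sibling entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_ctl_entry {
    int ctl_name;           /* binary id, 0 means none */
    const char *procname;   /* name under /proc/sys, NULL means none */
};

/*
 * Count problems in a table terminated by an entry with ctl_name == 0 and
 * procname == NULL. Each clashing pair of siblings counts once per clash.
 */
int demo_check_table(const struct demo_ctl_entry *t)
{
    const struct demo_ctl_entry *a, *b;
    int problems = 0;

    assert(t != NULL);
    for (a = t; a->ctl_name || a->procname; a++) {
        for (b = a + 1; b->ctl_name || b->procname; b++) {
            if (a->ctl_name && a->ctl_name == b->ctl_name)
                problems++;     /* same binary id used twice */
            if (a->procname && b->procname &&
                strcmp(a->procname, b->procname) == 0)
                problems++;     /* same name used twice */
        }
    }
    return problems;
}

/* Registration fails outright if the check finds anything. */
int demo_register_table(const struct demo_ctl_entry *t)
{
    return demo_check_table(t) ? -1 : 0;
}
```

Failing at registration time makes a bad table visible as soon as the
code that owns it is loaded, rather than when some caller trips over it.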



Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread YOSHIFUJI Hideaki / 吉藤英明
In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 18:49:21 -0700 (PDT)), 
David Miller <[EMAIL PROTECTED]> says:

> From: YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]>
> Date: Fri, 10 Aug 2007 10:47:10 +0900 (JST)
> 
> > I disagree.  It is bad to remove existing interface.
> > Ditto for other patches.
> 
> I think perhaps you misunderstand what Eric is doing.
> 
> sys_sysctl() isn't working properly for these cases and it is both a
> deprecated interface and not worth the pain of adding support
> in these cases.

Would you explain why it does not work properly
for those cases?

--yoshfuji


Re: [PATCH] Fix typo in arch/i386/kernel/tsc.c

2007-08-09 Thread Andrew Morton
On Thu, 09 Aug 2007 18:58:09 -0700 Josh Triplett <[EMAIL PROTECTED]> wrote:

> - *  We can use khz divisor instead of mhz to keep a better percision, since
> + *  We can use khz divisor instead of mhz to keep a better precision, since

I have an arbitrary i-dont-do-typos policy (unless they're in a printk or
in documentation).  [EMAIL PROTECTED] is the home for patches such as this,
please.


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-09 Thread Andrew Morton
On Thu, 9 Aug 2007 19:00:40 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc2-mm1:
> > >...
> > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > >...
> > >  2.6.23 queue
> > >...
> > 
> > All drivers were converted to no longer use xtime directly since it 
> > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > as RNG...
> 
> This code doesn't care if the time is outdated, as it is simply
> periodically perturbing an RNG, but OK.
> 
> So, what interface are we supposed to be using instead?  I cannot use
> get_random_bytes() due to locking issues.  This is not a cryptographically
> secure usage, so the perturbation does not need to be extremely high
> quality.
> 
> On x86, I would just grab the low-order bits of the TSC, but all of the
> world is not an x86.  ;-)
> 

One used to use sched_clock() for this, then get frowned at.  Now we
have cpu_clock()...
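
The kind of generator under discussion, a cheap pseudo-random sequence that
is nudged now and then by a time source, can be sketched in user space as
follows. This is not the rcutorture code: all demo_* names and the refresh
interval are invented, the multiplier and increment are common textbook
LCG constants, and clock_gettime() merely stands in for whichever kernel
clock ends up being used.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define DEMO_MULT    1664525u      /* textbook LCG multiplier */
#define DEMO_ADD     1013904223u   /* textbook LCG increment */
#define DEMO_REFRESH 1000u         /* draws between perturbations */

struct demo_rng {
    uint32_t state;
    uint32_t count;   /* draws left before the next perturbation */
};

typedef uint32_t (*demo_entropy_fn)(void);

/* User-space stand-in for a cheap, lock-free kernel time source. */
uint32_t demo_clock_entropy(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint32_t)ts.tv_nsec;
}

/* Return the next value; every DEMO_REFRESH draws, mix in `entropy`. */
uint32_t demo_rng_next(struct demo_rng *r, demo_entropy_fn entropy)
{
    assert(r != NULL);
    if (r->count == 0) {
        if (entropy != NULL)
            r->state += entropy();   /* quality does not matter much here */
        r->count = DEMO_REFRESH;
    }
    r->count--;
    r->state = r->state * DEMO_MULT + DEMO_ADD;
    return r->state;
}
```

Passing demo_clock_entropy as the second argument gives the perturbed
sequence. Passing NULL keeps the sequence deterministic, which is handy
for testing.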


Re: [PATCH 04/10] mm: slub: add knowledge of reserve pages

2007-08-09 Thread Christoph Lameter
On Thu, 9 Aug 2007, Daniel Phillips wrote:

> No matter how you look at this problem, you still need to have _some_
> sort of reserve, and limit access to it.  We extend existing methods,

The reserve is in the memory in the zone and reclaim can guarantee that 
there are a sufficient number of easily reclaimable pages in it.

> you are proposing to what seems like an entirely new reserve

The reserve always has been managed by per zone counters. Nothing new 
there.

> management system.  Great idea, maybe, but it does not solve the
> deadlocks.  You still need some organized way of being sure that your
> reserve is as big as you need (hopefully not an awful lot bigger) and
> you still have to make sure that nobody dips into that reserve further
> than they are allowed to.

Nope there is no need to have additional reserves. You delay the writeout 
until you are finished with reclaim. Then you do the writeout. During 
writeout reclaim may be called as needed. After the writeout is complete 
then you recheck the vm counters again to be sure that dirty ratio / 
easily reclaimable ratio and mem low / high boundaries are still okay. If 
not, go back to reclaim.
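
The ordering described above (reclaim first, then write out, then recheck and reclaim again) can be sketched as a toy model.  This is only an illustration under stated assumptions: struct zone_model, the batch size, and the fixed per-batch page cost of the writeout path are all hypothetical, and pages consumed by writeout are never returned in this model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-zone counters for the toy model. */
struct zone_model {
    long free_pages;
    long clean_pages;   /* easily reclaimable */
    long dirty_pages;   /* need writeout before they can be reclaimed */
    long high_wmark;    /* target for free_pages */
};

/* Throw out clean pages until the high watermark is reached. */
static void reclaim_clean(struct zone_model *z)
{
    while (z->free_pages < z->high_wmark && z->clean_pages > 0) {
        z->clean_pages--;
        z->free_pages++;
    }
}

/* Write out up to 'batch' dirty pages; the writeout path itself needs 'cost' pages. */
static int writeout_batch(struct zone_model *z, long batch, long cost)
{
    long n = z->dirty_pages < batch ? z->dirty_pages : batch;

    if (z->free_pages < cost)
        return -1;          /* the writeout path cannot allocate */
    z->free_pages -= cost;
    z->dirty_pages -= n;
    z->clean_pages += n;    /* written pages become easily reclaimable */
    return 0;
}

/* Reclaim, write out, recheck; returns the number of writeout rounds or -1. */
static int balance_zone(struct zone_model *z, long batch, long cost)
{
    int rounds = 0;

    assert(z != NULL && batch > 0 && cost >= 0);
    while (z->dirty_pages > 0) {
        reclaim_clean(z);   /* free memory is at its high point here */
        if (writeout_batch(z, batch, cost) != 0)
            return -1;
        rounds++;
    }
    reclaim_clean(z);       /* final recheck after the last writeout */
    return rounds;
}
```

The -1 return is the case the other side of this thread is worried about: writeout needing memory when nothing is left to hand out.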

> So translation: reclaim from "easily freeable" lists is an
> optimization, maybe a great one.  Probably great.  Reclaim from atomic
> context is also a great idea, probably. But you are talking about a
> whole nuther patch set.  Neither of those are in themselves a fix for
> these deadlocks.

Yes they are a much better fix and may allow code cleanup by getting rid 
of checks for PF_MEMALLOC. They integrate in a straightforward way 
into the existing reclaim methods.


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-09 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.23-rc2-mm1:
> >...
> > +allow-rcutorture-to-handle-synchronize_sched.patch
> >...
> >  2.6.23 queue
> >...
> 
> All drivers were converted to no longer use xtime directly since it 
> might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> as RNG...

This code doesn't care if the time is outdated, as it is simply
periodically perturbing an RNG, but OK.

So, what interface are we supposed to be using instead?  I cannot use
get_random_bytes() due to locking issues.  This is not a cryptographically
secure usage, so the perturbation does not need to be extremely high
quality.

On x86, I would just grab the low-order bits of the TSC, but all of the
world is not an x86.  ;-)

Thanx, Paul


Re: [PATCH 3/3] sysctl: Error on bad sysctl tables

2007-08-09 Thread YOSHIFUJI Hideaki / 吉藤英明
Hello.

In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 14:09:29 -0600), 
[EMAIL PROTECTED] (Eric W. Biederman) says:

> After going through the kernel's sysctl tables several times it has
> become clear that code review and testing is just not effective in
> preventing problematic sysctl tables from being used in the stable
> kernel.  I certainly can't seem to fix the problems as fast as
> they are introduced.
:
> The biggest part of the code is the table of valid binary sysctl
> entries, but since we have frozen our set of binary sysctls this table
> should not need to change, and it makes it much easier to detect
> when someone unintentionally adds a new binary sysctl value.

I don't think everyone needs to have this code, so
it is better to make it configurable via
CONFIG_SYSCTL_DEBUG or something..., ...no?

--yoshfuji


[PATCH] Fix typo in arch/i386/kernel/tsc.c

2007-08-09 Thread Josh Triplett
Signed-off-by: Josh Triplett <[EMAIL PROTECTED]>
---
 arch/i386/kernel/tsc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/i386/kernel/tsc.c b/arch/i386/kernel/tsc.c
index debd7db..8a58d30 100644
--- a/arch/i386/kernel/tsc.c
+++ b/arch/i386/kernel/tsc.c
@@ -80,7 +80,7 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
  * And since SC is a constant power of two, we can convert the div
  *  into a shift.
  *
- *  We can use khz divisor instead of mhz to keep a better percision, since
+ *  We can use khz divisor instead of mhz to keep a better precision, since
  *  cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits.
  *  ([EMAIL PROTECTED])
  *
-- 
1.5.2.1




Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread Andrew Morton
On Fri, 10 Aug 2007 10:47:10 +0900 (JST) YOSHIFUJI Hideaki / 吉藤英明 
<[EMAIL PROTECTED]> wrote:

> Hello.
> 
> In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 18:56:09 -0600), 
> [EMAIL PROTECTED] (Eric W. Biederman) says:
> 
> > 
> > - In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary
> >   sysctl names for a function that works with proc.
> > 
> > - In neighbour.c reorder the table to put the possibly unused entries
> >   at the end so we can remove them by terminating the table early.
> > 
> > - In neighbour.c kill the entries with questionable binary sysctl
> >   handling behavior.
> > 
> > - In neighbour.c if we don't have a strategy routine remove the
> >   binary path.  So we don't use the default sysctl strategy routine
> >   on data that is not ready for it.
> > 
> 
> I disagree.  It is bad to remove existing interface.

But it is good to remove bad interfaces, if we possibly can.

It is worth making the attempt.  Does anyone know of anything which will
break?  I fed NET_NEIGH_ANYCAST_DELAY at random into
http://www.google.com/codesearch and came up with nothing...




Re: [PATCH 04/10] mm: slub: add knowledge of reserve pages

2007-08-09 Thread Daniel Phillips
On 8/8/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 8 Aug 2007 10:57:13 -0700 (PDT)
> Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> > I think in general irq context reclaim is doable. Cannot see obvious
> > issues on a first superficial pass through rmap.c. The irq holdoff would
> > be pretty long though which may make it unacceptable.
>
> The IRQ holdoff could be tremendous.  But if it is sufficiently infrequent
> and if the worst effect is merely a network rx ring overflow then the tradeoff
> might be a good one.

Hi Andrew,

No matter how you look at this problem, you still need to have _some_
sort of reserve, and limit access to it.  We extend existing methods;
you are proposing what seems like an entirely new reserve
management system.  Great idea, maybe, but it does not solve the
deadlocks.  You still need some organized way of being sure that your
reserve is as big as you need (hopefully not an awful lot bigger) and
you still have to make sure that nobody dips into that reserve further
than they are allowed to.

So translation: reclaim from "easily freeable" lists is an
optimization, maybe a great one.  Probably great.  Reclaim from atomic
context is also a great idea, probably. But you are talking about a
whole nuther patch set.  Neither of those are in themselves a fix for
these deadlocks.

Regards,

Daniel


Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread David Miller
From: YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]>
Date: Fri, 10 Aug 2007 10:47:10 +0900 (JST)

> I disagree.  It is bad to remove existing interface.
> Ditto for other patches.

I think perhaps you misunderstand what Eric is doing.

sys_sysctl() isn't working properly for these cases and it is both a
deprecated interface and not worth the pain of adding support
in these cases.

The fact that nobody complains that none of this stuff works
via sys_sysctl() to me proves that it is never used.


Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK

2007-08-09 Thread Christoph Lameter
On Thu, 9 Aug 2007, Daniel Phillips wrote:

> You can fix reclaim as much as you want and the basic deadlock will
> still not go away.  When you finally do get to writing something out,
> memory consumers in the writeout path are going to cause problems,
> which this patch set fixes.

We currently also do *not* write out immediately. I/O is queued when 
submitted so it does *not* reduce memory. It is better to actually delay 
writeout until you have thrown out clean pages. At that point the free 
memory is at its high point. If memory goes below the high point again by 
these writes then we can again reclaim until things are right.

> Agreed that the idea of mempool always sounded strange, and we show
> how to get rid of them, but that is not the immediate purpose of this
> patch set.

Ok mempools are unrelated. The allocation problems that this patch 
addresses can be fixed by making reclaim more intelligent. This will likely 
make mempools less of an issue in the kernel. If we can reclaim in an 
emergency even in ATOMIC contexts then things get much easier.



Re: 2.6.22 x86_64 : kernel initial decompression hangs on vmware

2007-08-09 Thread Avi Kivity

Zachary Amsden wrote:
> Avi Kivity wrote:
> > We haven't seen any issue with the 2.6.22 boot decompressor.  Which 
> > of the four (fs, gs, ldt, or tr) were proving problematic and why?
>
> It was tr that was affecting Workstation, since we boot through normal 
> BIOS path, and only a 16-bit task was loaded at this point.

Ah.  Maybe we didn't have an exit while we were in long mode with the 
16-bit tss, so VT didn't notice the illegal combination.



Re: [PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread YOSHIFUJI Hideaki / 吉藤英明
Hello.

In article <[EMAIL PROTECTED]> (at Thu, 09 Aug 2007 18:56:09 -0600), 
[EMAIL PROTECTED] (Eric W. Biederman) says:

> 
> - In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary
>   sysctl names for a function that works with proc.
> 
> - In neighbour.c reorder the table to put the possibly unused entries
>   at the end so we can remove them by terminating the table early.
> 
> - In neighbour.c kill the entries with questionable binary sysctl
>   handling behavior.
> 
> - In neighbour.c if we don't have a strategy routine remove the
>   binary path.  So we don't use the default sysctl strategy routine
>   on data that is not ready for it.
> 
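
To illustrate the table-reordering point quoted above: when a table is terminated by a sentinel entry, putting the optional entries last lets them be dropped by moving the sentinel up.  The following is a minimal sketch under stated assumptions; struct ctl_entry_model, the use of a NULL procname as the sentinel, and the entry names in the usage note are hypothetical and are not the kernel's ctl_table definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified table entry. */
struct ctl_entry_model {
    const char *procname;   /* NULL marks the end of the table */
    int value;
};

/* Count the entries that come before the sentinel. */
static size_t table_len(const struct ctl_entry_model *t)
{
    size_t n = 0;

    while (t[n].procname != NULL)
        n++;
    return n;
}

/* Drop every entry from index 'keep' onward by terminating the table early. */
static void table_truncate(struct ctl_entry_model *t, size_t keep)
{
    assert(keep <= table_len(t));
    t[keep].procname = NULL;
}
```

With a table such as { "always_a", "always_b", "maybe_c", "maybe_d", sentinel }, calling table_truncate(t, 2) leaves only the first two entries visible to any code that walks the table.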

I disagree.  It is bad to remove existing interface.
Ditto for other patches.

Regards,

--yoshfuji


Re: [PATCH 3/4] Embed zone_id information within the zonelist->zones pointer

2007-08-09 Thread Christoph Lameter

On Fri, 10 Aug 2007, Mel Gorman wrote:

> > > +#if defined(CONFIG_SMP) && INTERNODE_CACHE_SHIFT > ZONES_SHIFT
> > 
> > Is this necessary? ZONES_SHIFT is always <= 2 so it will work with 
> > any pointer. Why disable this for UP?
> > 
> 
> Caution in case the number of zones increases. There was no guarantee of
> zone alignment. It's the same reason I have a BUG_ON in the encode
> function so that if we don't catch problems at compile-time, it'll go
> BANG in a nice predictable fashion.

Caution would lead to a BUG_ON but why the #if? Why exclude UP?
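
For readers following the encoding under discussion, a minimal userspace sketch of storing a small zone index in the low bits of an aligned pointer might look like this.  ZONE_ID_SHIFT, struct zone_model and the helper names are assumptions made for illustration, and assert() stands in for the BUG_ON mentioned above; this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed number of low pointer bits used for the zone index. */
#define ZONE_ID_SHIFT 2u
#define ZONE_ID_MASK  ((uintptr_t)((1u << ZONE_ID_SHIFT) - 1))

/* Hypothetical zone; any object aligned to at least 4 bytes works. */
struct zone_model {
    long pages;
};

/* Pack the index into the low bits; both checks fail loudly. */
static uintptr_t encode_zone(struct zone_model *z, unsigned zone_id)
{
    assert(((uintptr_t)z & ZONE_ID_MASK) == 0);   /* pointer is aligned */
    assert(zone_id <= ZONE_ID_MASK);              /* index fits in the bits */
    return (uintptr_t)z | zone_id;
}

/* Recover the original pointer by masking off the index bits. */
static struct zone_model *decode_zone(uintptr_t v)
{
    return (struct zone_model *)(v & ~ZONE_ID_MASK);
}

/* Recover the index without touching the zone itself. */
static unsigned decode_zone_id(uintptr_t v)
{
    return (unsigned)(v & ZONE_ID_MASK);
}
```

The runtime checks work the same on UP and SMP builds, which is the point of the question: they depend only on pointer alignment and on the index width.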


2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc2-mm1:
>...
> +allow-rcutorture-to-handle-synchronize_sched.patch
>...
>  2.6.23 queue
>...

All drivers were converted to no longer use xtime directly since it 
might be quite outdated, but this patch adds a usage of xtime.tv_nsec
as RNG...

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Paul E. McKenney
On Thu, Aug 09, 2007 at 03:24:40PM -0400, Chris Snook wrote:
> Paul E. McKenney wrote:
> >On Thu, Aug 09, 2007 at 02:13:52PM -0400, Chris Snook wrote:
> >>Paul E. McKenney wrote:
> >>>On Thu, Aug 09, 2007 at 01:14:35PM -0400, Chris Snook wrote:
>    If you're depending on volatile writes 
> being visible to other CPUs, you're screwed either way, because the CPU 
> can hold that data in cache as long as it wants before it writes it to 
> memory.  When this finally does happen, it will happen atomically, 
> which is all that atomic_set guarantees.  If you need to guarantee that 
> the value is written to memory at a particular time in your execution 
> sequence, you either have to read it from memory to force the compiler 
> to store it first (and a volatile cast in atomic_read will suffice for 
> this) or you have to use LOCK_PREFIX instructions which will invalidate 
> remote cache lines containing the same variable.  This patch doesn't 
> change either of these cases.
> >>>The case that it -can- change is interactions with interrupt handlers.
> >>>And NMI/SMI handlers, for that matter.
> >>You have a point here, but only if you can guarantee that the interrupt 
> >>handler is running on a processor sharing the cache that has the 
> >>not-yet-written volatile value.  That implies a strictly non-SMP 
> >>architecture.  At the moment, none of those have volatile in their 
> >>declaration of atomic_t, so this patch can't break any of them.
> >
> >This can also happen when using per-CPU variables.  And there are a
> >number of per-CPU variables that are either atomic themselves or are
> >structures containing atomic fields.
> 
> Accessing per-CPU variables in this fashion reliably already requires a 
> suitable smp/non-smp read/write memory barrier.  I maintain that if we 
> break anything with this change, it was really already broken, if less 
> obviously.  Can you give a real or synthetic example of legitimate code 
> that could break?

My main concern is actually the lack of symmetry -- I would expect
that an atomic_set() would have the same properties as atomic_read().
It is easy and cheap to provide them with similar properties, so why not?
Debugging even a single problem would consume far more time than simply
giving them corresponding semantics.

But you asked for examples.  These are synthetic, and of course legitimacy
is in the eye of the beholder.

1.  Watchdog variable.

	atomic_t watchdog = ATOMIC_INIT(0);

	...

	int i;
	while (!done) {

		/* Do some stuff that doesn't take more than a few us. */
		/* Could do atomic increment, but throughput penalty. */

		i++;
		atomic_set(&watchdog, i);
	}
	do_something_with();


	/* Every so often on some other CPU... */

	if ((new_watchdog = atomic_read(&watchdog)) == old_watchdog)
		die_horribly();
	old_watchdog = new_watchdog;


If atomic_set() did not have volatile semantics, the compiler
would be within its rights optimizing it to simply get the
final value of "i" after exit from the loop.  This would cause
the watchdog check to fail spuriously.  Memory barriers are
not required in this case, because the CPU cannot hang onto
the value for very long -- we don't care about the exact value,
or about exact synchronization, but rather about whether or
not the value is changing.

In this (toy) example, one might replace the atomic_set() with
an atomic increment (though that might be too expensive in some
cases) or with something like:

	atomic_set(&watchdog, atomic_read(&watchdog) + 1);

However, other cases might not permit this transformation,
for example, an existing heavily used API might take int rather
than atomic_t.

Some will no doubt argue that this example should use a
macro or an asm similar to the "forget()" asm put forward
elsewhere in this thread.

2.  Communicating both with interrupt handler and with other CPUs.
For example, data elements that are built up in a location visible
to interrupts and NMIs, and then added as a unit to a data structure
visible to other CPUs.  This more-realistic example is abbreviated
to the point of pointlessness as follows:

struct foo {
atomic_t a;
atomic_t b;
};

DEFINE_PER_CPU(struct foo *, staging) = NULL;

/* Create element in staging area. */

__get_cpu_var(staging) = kzalloc(sizeof(*p), GFP_WHATEVER);
if (__get_cpu_var(staging) == NULL)
die_horribly();
/* allocate an element of some per-CPU array, get the result in "i" */
atomic_set(__get_cpu_var(staging).a, i);
/* allocate another element of a per-CPU array, with result in "i" */

[RFC 3/3] SGI Altix cross partition memory (XPMEM)

2007-08-09 Thread Dean Nelson
This patch provides cross partition access to user memory (XPMEM) when
running multiple partitions on a single SGI Altix.

Signed-off-by: Dean Nelson <[EMAIL PROTECTED]>



xpmem-module.v002.bz2
Description: BZip2 compressed data


[RFC 2/3] SGI Altix cross partition memory (XPMEM)

2007-08-09 Thread Dean Nelson
This patch exports zap_page_range as it is needed by XPMEM.

Signed-off-by: Dean Nelson <[EMAIL PROTECTED]>

---

XPMEM would have used sys_madvise() except that madvise_dontneed()
returns an -EINVAL if VM_PFNMAP is set, which is
always true for the pages XPMEM imports from other partitions and is
also true for uncached pages allocated locally via the mspec allocator.
XPMEM needs zap_page_range() functionality for these types of pages as
well as 'normal' pages.

Index: linux-2.6/mm/memory.c
===
--- linux-2.6.orig/mm/memory.c  2007-08-09 07:07:55.762651612 -0500
+++ linux-2.6/mm/memory.c   2007-08-09 07:15:43.226389312 -0500
@@ -894,6 +894,7 @@
tlb_finish_mmu(tlb, address, end);
return end;
 }
+EXPORT_SYMBOL_GPL(zap_page_range);
 
 /*
  * Do a quick page-table lookup for a single page.


Re: [RFC 1/3] SGI Altix cross partition memory (XPMEM)

2007-08-09 Thread Dean Nelson
This patch exports __put_task_struct as it is needed by XPMEM.

Signed-off-by: Dean Nelson <[EMAIL PROTECTED]>

---

One struct file_operations function registered by XPMEM, xpmem_open(), calls
'get_task_struct(current->group_leader)' and another, xpmem_flush(), calls
'put_task_struct(tg->group_leader)'. The reason for this is given in the
comment block that appears in xpmem_open().

/*
 * Increment 'usage' and 'mm->mm_users' for the current task's thread
 * group leader. This ensures that both its task_struct and mm_struct
 * will still be around when our thread group exits. (The Linux kernel
 * normally tears down the mm_struct prior to calling a module's
 * 'flush' function.) Since all XPMEM thread groups must go through
 * this path, this extra reference to mm_users also allows us to
 * directly inc/dec mm_users in xpmem_ensure_valid_PFNs() and avoid
 * mmput() which has a scaling issue with the mmlist_lock.
 */
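
The reference pairing described in the comment above can be sketched as a small userspace model.  All names here (struct task_model, open_model(), flush_model()) are hypothetical, and the model covers only the task_struct 'usage' count, not mm_users.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted task_struct. */
struct task_model {
    int usage;   /* reference count */
    int freed;   /* set once the last reference is dropped */
};

/* Take a reference on a task that must still be alive. */
static void get_task_model(struct task_model *t)
{
    assert(t != NULL && !t->freed);
    t->usage++;
}

/* Drop a reference; the last put tears the task down. */
static void put_task_model(struct task_model *t)
{
    assert(t != NULL && t->usage > 0);
    if (--t->usage == 0)
        t->freed = 1;
}

/* Model of the 'open' path: pin the thread group leader. */
static void open_model(struct task_model *leader)
{
    get_task_model(leader);
}

/* Model of the 'flush' path: release the pin taken at open time. */
static void flush_model(struct task_model *leader)
{
    put_task_model(leader);
}
```

The extra reference taken at open time is what keeps the leader alive between the thread group's exit and the module's flush call.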

Index: linux-2.6/kernel/fork.c
===
--- linux-2.6.orig/kernel/fork.c2007-08-09 07:07:55.426611601 -0500
+++ linux-2.6/kernel/fork.c 2007-08-09 07:15:43.246391700 -0500
@@ -127,6 +127,7 @@
if (!profile_handoff_task(tsk))
free_task(tsk);
 }
+EXPORT_SYMBOL_GPL(__put_task_struct);
 
 void __init fork_init(unsigned long mempages)
 {


[RFC 0/3] SGI Altix cross partition memory (XPMEM)

2007-08-09 Thread Dean Nelson

Terminology

The term 'partition', adopted by the SGI hardware designers and which
percolated up into the software, is used in reference to a single SSI
when multiple SSIs are running on a single Altix. An Altix running
multiple SSIs is said to be 'partitioned', whereas one that is running
only a single SSI is said to be 'unpartitioned'.

The term '[a]cross partition' refers to a functionality that spans between
two SSIs on a multi-SSI Altix. ('XP' is its abbreviation.)

Introduction

This feature provides cross partition access to user memory (XPMEM) when
running multiple partitions on a single SGI Altix. XPMEM, like XPNET,
utilizes XPC to communicate between the partitions.

XPMEM allows a user process to identify portion(s) of its address space
that other user processes can attach (i.e. map) into their own address
spaces. These processes can be running on the same or a different
partition from the one whose memory they are attaching.

Known Issues

XPMEM is not currently using the kthread API (which is also true for XPC)
because it was in the process of being changed to require a kthread_stop()
be done for every kthread_create() and the kthread_stop() couldn't be called
for a thread that had already exited. In talking with Eric Biederman, there
was some thought of creating a kthread_orphan() which would eliminate the
need to call kthread_stop().



[PATCH 06/10] sysctl: Remove broken sunrpc debug binary sysctls

2007-08-09 Thread Eric W. Biederman

This is debug code so no need to support binary sysctl,
and the binary sysctls as they were written were not
consistent with what showed up in /proc so remove
the binary sysctl support.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/sunrpc/sysctl.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/sysctl.c b/net/sunrpc/sysctl.c
index 738db32..864b541 100644
--- a/net/sunrpc/sysctl.c
+++ b/net/sunrpc/sysctl.c
@@ -114,7 +114,6 @@ done:
 
 static ctl_table debug_table[] = {
{
-   .ctl_name   = CTL_RPCDEBUG,
.procname   = "rpc_debug",
.data   = _debug,
.maxlen = sizeof(int),
@@ -122,7 +121,6 @@ static ctl_table debug_table[] = {
.proc_handler   = _dodebug
},
{
-   .ctl_name   = CTL_NFSDEBUG,
.procname   = "nfs_debug",
.data   = _debug,
.maxlen = sizeof(int),
@@ -130,7 +128,6 @@ static ctl_table debug_table[] = {
.proc_handler   = _dodebug
},
{
-   .ctl_name   = CTL_NFSDDEBUG,
.procname   = "nfsd_debug",
.data   = _debug,
.maxlen = sizeof(int),
@@ -138,7 +135,6 @@ static ctl_table debug_table[] = {
.proc_handler   = _dodebug
},
{
-   .ctl_name   = CTL_NLMDEBUG,
.procname   = "nlm_debug",
.data   = _debug,
.maxlen = sizeof(int),
-- 
1.5.1.1.181.g2de0



Re: [PATCH][RFC] 4K stacks default, not a debug thing any more...?

2007-08-09 Thread Dan Merillat
On 8/1/07, Alan Cox <[EMAIL PROTECTED]> wrote:
> On Wed, 1 Aug 2007 15:33:58 +0200
> Andrea Arcangeli <[EMAIL PROTECTED]> wrote:
> > Tweaking kernel ptes is prohibitive during clone() because that's
> > kernel memory and it would require a flush tlb all with IPIs that
> > won't scale (IPIs are really the blocker)
>
> Agreed - except when doing debug work then its an acceptable cost. You
> still have to sort the debug side out because you are going to fault the
> kernel stack which will probably then cause a triple fault and reboot on
> the spot.

I was assuming debugging work, yes.  I was also thinking it wouldn't
be done at clone() time, but mapped (on a single CPU) at the time of a
context switch.  It would eliminate IPI, but would probably make the
rest of the TLB handling much too ugly to contemplate.  As an
alternative, could the TLB flush and associated IPI be deferred until
the process migrates?   First migration would trigger flush/IPI,
further migration would be as now, no?   I'd happily run it with
various dm/md layers underneath

On 8/1/07, Denis Vlasenko <[EMAIL PROTECTED]> wrote:
> Hmm, neat. Why do you need to _allocate second page_ at all?
> Just mark it "not present"...

Because the kernel mapping covers all physical memory contiguously, so
if the page isn't allocated, it could be used by a kernel data
structure you need to access.  Same reason the kernel stack has to be
contiguous pages.   Well, for non-highmem at least.  Either way, you
don't want to mark an in-use page as inaccessible, you never know
what's under there.


[PATCH 10/10] sysctl: Remove the binary interface for aio-nr, aio-max-nr, acpi_video_flags

2007-08-09 Thread Eric W. Biederman

aio-nr, aio-max-nr, acpi_video_flags are unsigned long values
which sysctl does not handle properly with a 64bit kernel
and a 32bit user space.

Since no one is likely to be using the binary sysctl values
and the ascii interface still works, this patch just removes
support for the binary sysctl interface from the kernel.
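
One plausible way the size mismatch shows up can be sketched in userspace: a 64-bit kernel holds an 8-byte unsigned long, while a 32-bit caller passes a 4-byte buffer.  The helper below is a hypothetical model of a length-limited copy, not the kernel's sysctl code, and fixed-width types stand in for the two ABIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of copying a kernel value into a user buffer: at most 'ulen'
 * bytes are copied, so a smaller buffer receives only part of the value.
 */
static size_t sysctl_copy_model(const void *kval, size_t klen,
                                void *ubuf, size_t ulen)
{
    size_t n = klen < ulen ? klen : ulen;

    assert(kval != NULL && ubuf != NULL);
    memcpy(ubuf, kval, n);
    return n;
}
```

A 32-bit caller reading an 8-byte value this way gets only half of it, and which half depends on byte order; the ascii interface in /proc avoids the problem because the value is exchanged as text.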

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 kernel/sysctl.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ccae8da..03759ab 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -688,7 +688,6 @@ static struct ctl_table kern_table[] = {
 #endif
 #if defined(CONFIG_ACPI_SLEEP) && defined(CONFIG_X86)
{
-   .ctl_name   = KERN_ACPI_VIDEO_FLAGS,
.procname   = "acpi_video_flags",
.data   = _realmode_flags,
.maxlen = sizeof (unsigned long),
@@ -1148,7 +1147,6 @@ static struct ctl_table fs_table[] = {
.extra2 = ,
},
{
-   .ctl_name   = FS_AIO_NR,
.procname   = "aio-nr",
.data   = _nr,
.maxlen = sizeof(aio_nr),
@@ -1156,7 +1154,6 @@ static struct ctl_table fs_table[] = {
.proc_handler   = _doulongvec_minmax,
},
{
-   .ctl_name   = FS_AIO_MAX_NR,
.procname   = "aio-max-nr",
.data   = _max_nr,
.maxlen = sizeof(aio_max_nr),
-- 
1.5.1.1.181.g2de0



[PATCH 09/10] sysctl: ipv4 remove binary sysctl paths where they are broken.

2007-08-09 Thread Eric W. Biederman

Currently tcp_available_congestion_control does not even
attempt to support being read through sys_sysctl, and ipfrag_max_dist,
while it works, allows invalid values to be set using
sys_sysctl.

So just kill the binary sys_sysctl support for these
sysctls.  If the support is not important enough to
test and get right it probably isn't important enough
to keep.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/ipv4/sysctl_net_ipv4.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 53ef0f4..282eb7e 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -672,7 +672,6 @@ ctl_table ipv4_table[] = {
.strategy   = _jiffies
},
{
-   .ctl_name   = NET_IPV4_IPFRAG_MAX_DIST,
.procname   = "ipfrag_max_dist",
.data   = _ipfrag_max_dist,
.maxlen = sizeof(int),
@@ -797,7 +796,6 @@ ctl_table ipv4_table[] = {
},
 #endif /* CONFIG_NETLABEL */
{
-   .ctl_name   = NET_TCP_AVAIL_CONG_CONTROL,
.procname   = "tcp_available_congestion_control",
.maxlen = TCP_CA_BUF_MAX,
.mode   = 0444,
-- 
1.5.1.1.181.g2de0



[PATCH 08/10] sysctl: Remove broken cdrom binary sysctls

2007-08-09 Thread Eric W. Biederman

The binary interface for the cdrom sysctls can't possibly work.
So remove the binary sysctls and update the test for finding
out which sysctl table entry we are dealing with to use the procname
and not the ctl_name (which I am removing).

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 drivers/cdrom/cdrom.c |   31 +--
 1 files changed, 9 insertions(+), 22 deletions(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 67ee3d4..f0c6318 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -3468,32 +3468,25 @@ static int cdrom_sysctl_handler(ctl_table *ctl, int write, struct file * filp,
else
*valp = 0;
 
-   switch (ctl->ctl_name) {
-   case DEV_CDROM_AUTOCLOSE: {
+   if (strcmp(ctl->procname, "autoclose") == 0) {
if (valp == _sysctl_settings.autoclose)
autoclose = cdrom_sysctl_settings.autoclose;
-   break;
-   }
-   case DEV_CDROM_AUTOEJECT: {
+   }
+   else if (strcmp(ctl->procname, "autoeject") == 0) {
if (valp == _sysctl_settings.autoeject)
autoeject = cdrom_sysctl_settings.autoeject;
-   break;
-   }
-   case DEV_CDROM_DEBUG: {
+   }
+   else if (strcmp(ctl->procname, "debug") == 0) {
if (valp == _sysctl_settings.debug)
debug = cdrom_sysctl_settings.debug;
-   break;
-   }
-   case DEV_CDROM_LOCK: {
+   }
+   else if (strcmp(ctl->procname, "lock") == 0) {
if (valp == _sysctl_settings.lock)
lockdoor = cdrom_sysctl_settings.lock;
-   break;
-   }
-   case DEV_CDROM_CHECK_MEDIA: {
+   }
+   else if (strcmp(ctl->procname, "check_media") == 0) {
if (valp == _sysctl_settings.check)
check_media_type = cdrom_sysctl_settings.check;
-   break;
-   }
}
/* update the option flags according to the changes. we
   don't have per device options through sysctl yet,
@@ -3507,7 +3500,6 @@ static int cdrom_sysctl_handler(ctl_table *ctl, int write, struct file * filp,
 /* Place files in /proc/sys/dev/cdrom */
 static ctl_table cdrom_table[] = {
{
-   .ctl_name   = DEV_CDROM_INFO,
.procname   = "info",
.data   = _sysctl_settings.info, 
.maxlen = CDROM_STR_SIZE,
@@ -3515,7 +3507,6 @@ static ctl_table cdrom_table[] = {
.proc_handler   = _sysctl_info,
},
{
-   .ctl_name   = DEV_CDROM_AUTOCLOSE,
.procname   = "autoclose",
.data   = _sysctl_settings.autoclose,
.maxlen = sizeof(int),
@@ -3523,7 +3514,6 @@ static ctl_table cdrom_table[] = {
.proc_handler   = _sysctl_handler,
},
{
-   .ctl_name   = DEV_CDROM_AUTOEJECT,
.procname   = "autoeject",
.data   = _sysctl_settings.autoeject,
.maxlen = sizeof(int),
@@ -3531,7 +3521,6 @@ static ctl_table cdrom_table[] = {
.proc_handler   = _sysctl_handler,
},
{
-   .ctl_name   = DEV_CDROM_DEBUG,
.procname   = "debug",
.data   = &cdrom_sysctl_settings.debug,
.maxlen = sizeof(int),
@@ -3539,7 +3528,6 @@ static ctl_table cdrom_table[] = {
.proc_handler   = &cdrom_sysctl_handler,
},
{
-   .ctl_name   = DEV_CDROM_LOCK,
.procname   = "lock",
.data   = &cdrom_sysctl_settings.lock,
.maxlen = sizeof(int),
@@ -3547,7 +3535,6 @@ static ctl_table cdrom_table[] = {
.proc_handler   = &cdrom_sysctl_handler,
},
{
-   .ctl_name   = DEV_CDROM_CHECK_MEDIA,
.procname   = "check_media",
.data   = &cdrom_sysctl_settings.check,
.maxlen = sizeof(int),
-- 
1.5.1.1.181.g2de0
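
For readers skimming the diff above: the handler stops switching on the
binary ctl_name and instead compares procname strings. The following is
only a user-space sketch of that dispatch pattern, not the driver code.
The struct fake_ctl type and the pick_setting() helper are hypothetical
names introduced for illustration; the option variables mirror the ones
assigned in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ctl_table: only procname is modelled. */
struct fake_ctl {
	const char *procname;
};

/* Stand-ins for the module-wide options the handler updates. */
static int autoclose, autoeject, debug, lockdoor, check_media_type;

/*
 * Map a table entry to the option it controls by comparing procname
 * strings, in the same order as the patched handler.  Returns NULL for
 * names that have no associated option (for example "info").
 */
static int *pick_setting(const struct fake_ctl *ctl)
{
	assert(ctl != NULL && ctl->procname != NULL);

	if (strcmp(ctl->procname, "autoclose") == 0)
		return &autoclose;
	else if (strcmp(ctl->procname, "autoeject") == 0)
		return &autoeject;
	else if (strcmp(ctl->procname, "debug") == 0)
		return &debug;
	else if (strcmp(ctl->procname, "lock") == 0)
		return &lockdoor;
	else if (strcmp(ctl->procname, "check_media") == 0)
		return &check_media_type;
	return NULL;
}
```

The string comparison costs a little more than an integer switch, but
it keeps working once the .ctl_name fields are removed from the table.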



[PATCH 07/10] sysctl: x86_64 remove unnecessary binary paths.

2007-08-09 Thread Eric W. Biederman

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 arch/x86_64/ia32/ia32_binfmt.c |1 -
 arch/x86_64/kernel/vsyscall.c  |   10 +-
 2 files changed, 1 insertions(+), 10 deletions(-)

diff --git a/arch/x86_64/ia32/ia32_binfmt.c b/arch/x86_64/ia32/ia32_binfmt.c
index dffd2ac..c80c3f1 100644
--- a/arch/x86_64/ia32/ia32_binfmt.c
+++ b/arch/x86_64/ia32/ia32_binfmt.c
@@ -291,7 +291,6 @@ static void elf32_init(struct pt_regs *regs)
 
 static ctl_table abi_table2[] = {
{
-   .ctl_name   = 99,
.procname   = "vsyscall32",
.data   = _vsyscall32,
.maxlen = sizeof(int),
diff --git a/arch/x86_64/kernel/vsyscall.c b/arch/x86_64/kernel/vsyscall.c
index 06c3494..69918b5 100644
--- a/arch/x86_64/kernel/vsyscall.c
+++ b/arch/x86_64/kernel/vsyscall.c
@@ -260,18 +260,10 @@ out:
return ret;
 }
 
-static int vsyscall_sysctl_nostrat(ctl_table *t, int __user *name, int nlen,
-   void __user *oldval, size_t __user *oldlenp,
-   void __user *newval, size_t newlen)
-{
-   return -ENOSYS;
-}
-
 static ctl_table kernel_table2[] = {
-   { .ctl_name = 99, .procname = "vsyscall64",
+   { .procname = "vsyscall64",
  .data = _gtod_data.sysctl_enabled, .maxlen = sizeof(int),
  .mode = 0644,
- .strategy = vsyscall_sysctl_nostrat,
  .proc_handler = vsyscall_sysctl_change },
{}
 };
-- 
1.5.1.1.181.g2de0



[PATCH 06/10] sysctl: Remove broken sunrpc debug binary sysctls

2007-08-09 Thread Eric W. Biederman
>From ddf280c9de903f1fb5d4ecf9c68df0c479d7c7d2 Mon Sep 17 00:00:00 2001
From: Eric W. Biederman <[EMAIL PROTECTED]>
Date: Thu, 9 Aug 2007 16:00:00 -0600
Subject: 

This is debug code so no need to support binary sysctl,
and the binary sysctls as they were written were not
consistent with what showed up in /proc so remove
the binary sysctl support.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/sunrpc/sysctl.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/sysctl.c b/net/sunrpc/sysctl.c
index 738db32..864b541 100644
--- a/net/sunrpc/sysctl.c
+++ b/net/sunrpc/sysctl.c
@@ -114,7 +114,6 @@ done:
 
 static ctl_table debug_table[] = {
{
-   .ctl_name   = CTL_RPCDEBUG,
.procname   = "rpc_debug",
.data   = _debug,
.maxlen = sizeof(int),
@@ -122,7 +121,6 @@ static ctl_table debug_table[] = {
.proc_handler   = _dodebug
},
{
-   .ctl_name   = CTL_NFSDEBUG,
.procname   = "nfs_debug",
.data   = _debug,
.maxlen = sizeof(int),
@@ -130,7 +128,6 @@ static ctl_table debug_table[] = {
.proc_handler   = _dodebug
},
{
-   .ctl_name   = CTL_NFSDDEBUG,
.procname   = "nfsd_debug",
.data   = _debug,
.maxlen = sizeof(int),
@@ -138,7 +135,6 @@ static ctl_table debug_table[] = {
.proc_handler   = _dodebug
},
{
-   .ctl_name   = CTL_NLMDEBUG,
.procname   = "nlm_debug",
.data   = _debug,
.maxlen = sizeof(int),
-- 
1.5.1.1.181.g2de0



[PATCH 05/10] sysctl: ipv6 route flushing (kill binary path)

2007-08-09 Thread Eric W. Biederman

We don't properly support the sysctl binary path for flushing
the ipv6 routes.  So remove support for a binary path.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/ipv6/route.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 55ea80f..0d23a46 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2458,7 +2458,6 @@ int ipv6_sysctl_rtcache_flush(ctl_table *ctl, int write, struct file * filp,
 
 ctl_table ipv6_route_table[] = {
{
-   .ctl_name   =   NET_IPV6_ROUTE_FLUSH,
.procname   =   "flush",
.data   =   _delay,
.maxlen =   sizeof(int),
-- 
1.5.1.1.181.g2de0



[PATCH 04/10] sysctl: Fix neighbour table sysctls.

2007-08-09 Thread Eric W. Biederman

- In ipv6, fix ndisc_ifinfo_sysctl_change so it doesn't depend on binary
  sysctl names for a function that works with proc.

- In neighbour.c reorder the table to put the possibly unused entries
  at the end so we can remove them by terminating the table early.

- In neighbour.c kill the entries with questionable binary sysctl
  handling behavior.

- In neighbour.c if we don't have a strategy routine remove the
  binary path, so we don't run the default sysctl strategy routine
  on data that is not ready for it.
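
The "terminate the table early" point in the list above can be shown
with a small user-space sketch. It assumes, as a simplification, that a
table walker stops at the first entry whose ctl_name and procname are
both zero. struct toy_entry and both helpers are hypothetical names,
not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified table entry; not the kernel's ctl_table. */
struct toy_entry {
	int ctl_name;          /* binary sysctl number, 0 if none */
	const char *procname;  /* /proc name, NULL if none */
};

/* Count entries up to the first all-zero terminator. */
static size_t toy_table_len(const struct toy_entry *t)
{
	size_t n = 0;

	assert(t != NULL);
	while (t[n].ctl_name || t[n].procname)
		n++;
	return n;
}

/*
 * Cut the table short: every entry from index 'at' onward becomes
 * invisible, because walkers stop at the first zeroed entry.
 */
static void toy_table_truncate(struct toy_entry *t, size_t at)
{
	assert(t != NULL);
	t[at].ctl_name = 0;
	t[at].procname = NULL;
}
```

With the optional entries grouped at the end, a single truncation hides
all of them without leaving holes in the middle of the table.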

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/core/neighbour.c |   75 ++
 net/ipv6/ndisc.c |   24 ++-
 2 files changed, 49 insertions(+), 50 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index ca2a153..27c3f4e 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2498,7 +2498,6 @@ static struct neigh_sysctl_table {
.proc_handler   = _dointvec,
},
{
-   .ctl_name   = NET_NEIGH_RETRANS_TIME,
.procname   = "retrans_time",
.maxlen = sizeof(int),
.mode   = 0644,
@@ -2543,27 +2542,40 @@ static struct neigh_sysctl_table {
.proc_handler   = _dointvec,
},
{
-   .ctl_name   = NET_NEIGH_ANYCAST_DELAY,
.procname   = "anycast_delay",
.maxlen = sizeof(int),
.mode   = 0644,
.proc_handler   = _dointvec_userhz_jiffies,
},
{
-   .ctl_name   = NET_NEIGH_PROXY_DELAY,
.procname   = "proxy_delay",
.maxlen = sizeof(int),
.mode   = 0644,
.proc_handler   = _dointvec_userhz_jiffies,
},
{
-   .ctl_name   = NET_NEIGH_LOCKTIME,
.procname   = "locktime",
.maxlen = sizeof(int),
.mode   = 0644,
.proc_handler   = _dointvec_userhz_jiffies,
},
{
+   .ctl_name   = NET_NEIGH_RETRANS_TIME_MS,
+   .procname   = "retrans_time_ms",
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = _dointvec_ms_jiffies,
+   .strategy   = _ms_jiffies,
+   },
+   {
+   .ctl_name   = NET_NEIGH_REACHABLE_TIME_MS,
+   .procname   = "base_reachable_time_ms",
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = _dointvec_ms_jiffies,
+   .strategy   = _ms_jiffies,
+   },
+   {
.ctl_name   = NET_NEIGH_GC_INTERVAL,
.procname   = "gc_interval",
.maxlen = sizeof(int),
@@ -2592,22 +2604,7 @@ static struct neigh_sysctl_table {
.mode   = 0644,
.proc_handler   = _dointvec,
},
-   {
-   .ctl_name   = NET_NEIGH_RETRANS_TIME_MS,
-   .procname   = "retrans_time_ms",
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = _dointvec_ms_jiffies,
-   .strategy   = _ms_jiffies,
-   },
-   {
-   .ctl_name   = NET_NEIGH_REACHABLE_TIME_MS,
-   .procname   = "base_reachable_time_ms",
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = _dointvec_ms_jiffies,
-   .strategy   = _ms_jiffies,
-   },
+   {}
},
.neigh_dev = {
{
@@ -2660,42 +2657,48 @@ int neigh_sysctl_register(struct net_device *dev, struct neigh_parms *p,
t->neigh_vars[9].data  = &p->anycast_delay;
t->neigh_vars[10].data = &p->proxy_delay;
t->neigh_vars[11].data = &p->locktime;
+   t->neigh_vars[12].data  = &p->retrans_time;
+   t->neigh_vars[13].data  = &p->base_reachable_time;
 
if (dev) {
dev_name_source = dev->name;
t->neigh_dev[0].ctl_name = dev->ifindex;
-   t->neigh_vars[12].procname = NULL;
-   t->neigh_vars[13].procname = NULL;
-   

[PATCH 03/10] sysctl: Remove binary sysctl support where it clearly doesn't work.

2007-08-09 Thread Eric W. Biederman

These functions all have wrapper functions for the proc interface
that are needed for them to work correctly.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 kernel/sysctl.c |7 ---
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index d6257ee..ccae8da 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -350,7 +350,6 @@ static struct ctl_table kern_table[] = {
},
 #ifdef CONFIG_PROC_SYSCTL
{
-   .ctl_name   = KERN_TAINTED,
.procname   = "tainted",
.data   = ,
.maxlen = sizeof(int),
@@ -359,7 +358,6 @@ static struct ctl_table kern_table[] = {
},
 #endif
{
-   .ctl_name   = KERN_CAP_BSET,
.procname   = "cap-bound",
.data   = _bset,
.maxlen = sizeof(kernel_cap_t),
@@ -635,7 +633,6 @@ static struct ctl_table kern_table[] = {
.proc_handler   = _dointvec,
},
{
-   .ctl_name   = KERN_NMI_WATCHDOG,
.procname   = "nmi_watchdog",
.data   = _watchdog_enabled,
.maxlen = sizeof (int),
@@ -818,7 +815,6 @@ static struct ctl_table vm_table[] = {
.extra2 = _hundred,
},
{
-   .ctl_name   = VM_DIRTY_WB_CS,
.procname   = "dirty_writeback_centisecs",
.data   = _writeback_interval,
.maxlen = sizeof(dirty_writeback_interval),
@@ -826,7 +822,6 @@ static struct ctl_table vm_table[] = {
.proc_handler   = _writeback_centisecs_handler,
},
{
-   .ctl_name   = VM_DIRTY_EXPIRE_CS,
.procname   = "dirty_expire_centisecs",
.data   = _expire_interval,
.maxlen = sizeof(dirty_expire_interval),
@@ -854,7 +849,6 @@ static struct ctl_table vm_table[] = {
},
 #ifdef CONFIG_HUGETLB_PAGE
 {
-   .ctl_name   = VM_HUGETLB_PAGES,
.procname   = "nr_hugepages",
.data   = _huge_pages,
.maxlen = sizeof(unsigned long),
@@ -1079,7 +1073,6 @@ static struct ctl_table fs_table[] = {
.proc_handler   = _dointvec,
},
{
-   .ctl_name   = FS_NRFILE,
.procname   = "file-nr",
.data   = _stat,
.maxlen = 3*sizeof(int),
-- 
1.5.1.1.181.g2de0



[PATCH 02/10] sysctl mqueue: Remove the binary sysctl numbers

2007-08-09 Thread Eric W. Biederman

Because of a conflict with FS_INODE_NR, none of the binary
sysctl numbers used by mqueue were available to user space.
So just remove them.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 ipc/mqueue.c |   10 --
 1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 145d5a0..13fdf67 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -44,12 +44,6 @@
 #define STATE_PENDING  1
 #define STATE_READY2
 
-/* used by sysctl */
-#define FS_MQUEUE  1
-#define CTL_QUEUESMAX  2
-#define CTL_MSGMAX 3
-#define CTL_MSGSIZEMAX 4
-
 /* default values */
 #define DFLT_QUEUESMAX 256 /* max number of message queues */
 #define DFLT_MSGMAX10  /* max number of messages in each queue */
@@ -1197,7 +1191,6 @@ static int msg_maxsize_limit_max = INT_MAX;
 
 static ctl_table mq_sysctls[] = {
{
-   .ctl_name   = CTL_QUEUESMAX,
.procname   = "queues_max",
.data   = _max,
.maxlen = sizeof(int),
@@ -1205,7 +1198,6 @@ static ctl_table mq_sysctls[] = {
.proc_handler   = _dointvec,
},
{
-   .ctl_name   = CTL_MSGMAX,
.procname   = "msg_max",
.data   = _max,
.maxlen = sizeof(int),
@@ -1215,7 +1207,6 @@ static ctl_table mq_sysctls[] = {
.extra2 = _max_limit_max,
},
{
-   .ctl_name   = CTL_MSGSIZEMAX,
.procname   = "msgsize_max",
.data   = _max,
.maxlen = sizeof(int),
@@ -1229,7 +1220,6 @@ static ctl_table mq_sysctls[] = {
 
 static ctl_table mq_sysctl_dir[] = {
{
-   .ctl_name   = FS_MQUEUE,
.procname   = "mqueue",
.mode   = 0555,
.child  = mq_sysctls,
-- 
1.5.1.1.181.g2de0



[PATCH 01/10] sysctl: Update sysctl_check_table

2007-08-09 Thread Eric W. Biederman


Well, it turns out that after I dug into the problems a
little more, I was returning a few false positives,
so this patch updates my logic to remove them.

- Don't complain about 0 ctl_names in sysctl_check_bin_path.
  It is valid for someone to remove the sysctl binary interface
  and still keep the same sysctl proc interface.

- Count ctl_names and procnames as matching if they both don't
  exist.

- Only warn about missing min when the generic functions care.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 kernel/sysctl_check.c |   30 --
 1 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index 389c4ba..930a514 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -1420,12 +1420,14 @@ static int sysctl_check_dir(struct ctl_table *table)
ref = sysctl_check_lookup(table);
if (ref) {
int match = 0;
-   if (table->procname && ref->procname &&
-   (strcmp(table->procname, ref->procname) == 0))
+   if ((!table->procname && !ref->procname) ||
+   (table->procname && ref->procname &&
+(strcmp(table->procname, ref->procname) == 0)))
match++;
 
-   if (table->ctl_name && ref->ctl_name &&
-   (table->ctl_name == ref->ctl_name))
+   if ((!table->ctl_name && !ref->ctl_name) ||
+   (table->ctl_name && ref->ctl_name &&
+(table->ctl_name == ref->ctl_name)))
match++;
 
if (match != 2) {
@@ -1462,8 +1464,8 @@ static void sysctl_check_bin_path(struct ctl_table *table, const char **fail)
 (strcmp(table->procname, ref->procname) != 0)))
set_fail(fail, table, "procname does not match binary path procname");
 
-   if (ref->ctl_name &&
-   (!table->ctl_name || table->ctl_name != ref->ctl_name))
+   if (ref->ctl_name && table->ctl_name &&
+   (table->ctl_name != ref->ctl_name))
set_fail(fail, table, "ctl_name does not match binary path ctl_name");
}
 }
@@ -1499,7 +1501,7 @@ int sysctl_check_table(struct ctl_table *table)
if (table->extra2)
set_fail(&fail, table, "Directory with extra2");
if (sysctl_check_dir(table))
-   set_fail(&fail, table, "Inconsistent directory");
+   set_fail(&fail, table, "Inconsistent directory names");
} else {
if ((table->strategy == sysctl_data) ||
(table->strategy == sysctl_string) ||
@@ -1520,14 +1522,14 @@ int sysctl_check_table(struct ctl_table *table)
if (!table->maxlen)
set_fail(&fail, table, "No maxlen");
}
-   if ((table->strategy == sysctl_intvec) ||
-   (table->proc_handler == proc_dointvec_minmax) ||
-   (table->proc_handler == proc_doulongvec_minmax) ||
+   if ((table->proc_handler == proc_doulongvec_minmax) ||
(table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) {
-   if (!table->extra1)
-   set_fail(&fail, table, "No min");
-   if (!table->extra2)
-   set_fail(&fail, table, "No max");
+   if (table->maxlen > sizeof (unsigned long)) {
+   if (!table->extra1)
+   set_fail(&fail, table, "No min");
+   if (!table->extra2)
+   set_fail(&fail, table, "No max");
+   }
}
if (table->ctl_name && !table->strategy)
set_fail(&fail, table, "Missing strategy");
-- 
1.5.1.1.181.g2de0
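
The revised matching rule in sysctl_check_dir (names count as matching
when both are absent, or when both are present and equal) can be tried
out in isolation. This is a user-space sketch only: struct name_pair
and names_match() are hypothetical names, and the predicate just
mirrors the two conditions added in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced entry: just the two names being compared. */
struct name_pair {
	int ctl_name;          /* 0 means "no binary name" */
	const char *procname;  /* NULL means "no proc name" */
};

/* True when both the proc name and the binary name are consistent. */
static bool names_match(const struct name_pair *table,
			const struct name_pair *ref)
{
	int match = 0;

	assert(table != NULL && ref != NULL);

	/* Both missing, or both present and identical. */
	if ((!table->procname && !ref->procname) ||
	    (table->procname && ref->procname &&
	     strcmp(table->procname, ref->procname) == 0))
		match++;

	/* Same rule for the binary sysctl number. */
	if ((!table->ctl_name && !ref->ctl_name) ||
	    (table->ctl_name && ref->ctl_name &&
	     table->ctl_name == ref->ctl_name))
		match++;

	return match == 2;
}
```

Under the old rule a directory with no ctl_name on either side scored
only one match and was reported as inconsistent; here it scores two.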



Re: [-mm patch] DMA engine kconfig improvements

2007-08-09 Thread Adrian Bunk
On Fri, Aug 03, 2007 at 07:15:31PM -0700, Dan Williams wrote:
> On 7/25/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > On Wed, Jul 25, 2007 at 04:03:04AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.22-rc6-mm1:
> > >...
> > > +dma-arch-fix.patch
> > >
> > >  Fix git-dma.patch
> > >...
> >
> > This results in an ARM-only driver in an X86-only menu...
> >
> > What about the patch below instead that also improves a few other things?
> I like it, just a few nits:
> 
> > -menu "DMA Engine support"
> > -   depends on HAS_DMA
> > +menuconfig DMADEVICES
> > +   bool "DMA Engine support"
> > +   depends on (PCI && X86) || ARCH_IOP32X || ARCH_IOP33X || 
> > ARCH_IOP13XX
> > +   help
> > + Intel(R) DMA engines
> > +
> Perhaps we should go ahead and define ARCH_HAS_DMA_OFFLOAD and have
> DMADEVICES depend on that option.  A ppc32 driver is in the works:
> http://marc.info/?l=linux-raid&m=117400143317440&w=2
>...

That would be overkill - what my patch does here is just a minor 
cosmetic thing that could be dropped if it became a problem.

> Regards,
> Dan (for Shannon while he is on vacation)

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: SLUB doesn't work with kdump kernel on Cell

2007-08-09 Thread Michael Ellerman
On 8/9/07, Lucio Correia <[EMAIL PROTECTED]> wrote:
> On Wed, 2007-08-08 at 23:10 +0200, Arnd Bergmann wrote:
> > On Wednesday 08 August 2007, Lucio Correia wrote:
> > >   DMA 0 ->12288
> > >   Normal  12288 ->12288
> > > early_node_map[2] active PFN ranges
> > > 0:0 -> 2560
> > > 1:12287 ->12288
> >
> > As Christoph found, this memory map is really strange. Other machines
> > have something like
> >
> > Zone PFN ranges:
> >   DMA 0 ->16384
> >   Normal  16384 ->16384
> > early_node_map[2] active PFN ranges
> > 0:0 -> 8192
> > 1: 8192 ->16384
> >
> > Lucio,
> > What code builds the memory map that gets passed to the kdump kernel?

It comes out of the device tree, just like a regular kernel. The
device tree for the kdump kernel is built by kexec-tools, it parses
/proc/device-tree and does a bunch of logic to avoid various reserved
regions: the kernel, TCE tables, RTAS etc.

> I also tried to pass maxcpus=1 for the command line of second kernel,
> and it didn't work. How can I alternatively disable the node?

maxcpus is poorly tested and is known to be broken on Cell, please
don't use it, or fix it first :)

cheers


Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!

2007-08-09 Thread Nick Piggin
On Thu, Aug 09, 2007 at 04:37:35PM +0100, Hugh Dickins wrote:
> On Thu, 9 Aug 2007, Mariusz Kozlowski wrote:
> > Hello,
> > 
> > Nothing unusual happening, allmodconfig compiling etc.
> > Not sure why it says kernel was tainted though ... hmmm.
> > 
> > [ cut here ]
> > kernel BUG at mm/swap_state.c:78!
> > invalid opcode:  [#1]
> > PREEMPT 
> > Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia 
> > 8250_pci 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too
> > CPU:0
> > EIP:0060:[]Tainted: PVLI
> > EFLAGS: 00010246   (2.6.23-rc2-mm1 #1)
> > EIP is at __add_to_swap_cache+0xc6/0xd7
> > eax: 4000   ebx: c11285c0   ecx: 00d0   edx: 0283
> > esi: c11285c0   edi: 0283   ebp: c1858f90   esp: c1858f84
> > ds: 007b   es: 007b   fs:   gs:   ss: 0068
> > Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000)
> > Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0  
> > c1858fcc 
> >c015307c 0001 0007 0002 0002 0283  
> > fffc 
> > c0152d5c c1858fe0 c0127f2e c0127ef8   
> >  
> > Call Trace:
> >  [] show_trace_log_lvl+0x1a/0x30
> >  [] show_stack_log_lvl+0xa9/0xd5
> >  [] show_registers+0x219/0x38d
> >  [] die+0x104/0x23e
> >  [] do_trap+0x83/0xad
> >  [] do_invalid_op+0x88/0x92
> >  [] error_code+0x6a/0x70
> >  [] add_to_swap_cache+0x22/0x58
> >  [] kprefetchd+0x320/0x364
> >  [] kthread+0x36/0x58
> >  [] kernel_thread_helper+0x7/0x14
> >  ===
> > INFO: lockdep is turned off.
> > Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 
> > 03 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe 
> > 0f 0b eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 
> > EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84
> 
> Don't worry about reproducing untainted, I got the same earlier
> and was just preparing and testing the hotfix: here it is...
> 
> 
> Nick's mm-clarify-__add_to_swap_cache-locking.patch is fine for mainline,
> but soon generates a "kernel BUG at mm/swap_state.c:78!" when it meets
> mm-implement-swap-prefetching.patch in 2.6.23-rc2-mm1.  We could add a
> fix to the latter, but I think it's better to adjust Nick's, so that
> it's right for whichever tree it's in: move the responsibility to
> SetPageLocked from read_swap_cache_async to add_to_swap_cache.

Hmm, yeah I like this better, it is more like add_to_page_cache now.
Thanks.



Re: SLUB doesn't work with kdump kernel on Cell

2007-08-09 Thread Michael Ellerman
On 8/9/07, Arnd Bergmann <[EMAIL PROTECTED]> wrote:
> On Wednesday 08 August 2007, Lucio Correia wrote:
> > DMA 0 -> 12288
> > Normal 12288 -> 12288
> > early_node_map[2] active PFN ranges
> > 0: 0 -> 2560
> > 1: 12287 -> 12288
>
> As Christoph found, this memory map is really strange. Other machines
> have something like
>
> Zone PFN ranges:
>   DMA 0 ->16384
>   Normal  16384 ->16384
> early_node_map[2] active PFN ranges
> 0:0 -> 8192
> 1: 8192 ->16384
>
> Lucio,
> What code builds the memory map that gets passed to the kdump kernel?
> Does the original kernel see the same map on your machine?

I'd have to check, but I'd guess it's the reserved region for RTAS.
You can confirm by checking the boot log to see where prom_init
instantiated RTAS.

cheers


Re: [PATCH 00/23] per device dirty throttling -v8

2007-08-09 Thread Bill Davidsen

Andrew Morton wrote:

On Wed, 08 Aug 2007 14:10:15 -0700
"Martin J. Bligh" <[EMAIL PROTECTED]> wrote:


Why isn't this easily fixable by just adding an additional dirty
flag that says atime has changed? Then we only cause a write
when we remove the inode from the inode cache, if only atime
is updated.


I think that could be made to work, and it would fix the performance
issue.

It is a behaviour change.  At present ext3 (for example) commits everything
every five seconds.  After a change like this, a crash+recovery could cause
a file's atime to go backwards by an arbitrarily large time interval - it
could easily be months.

I would think that (really) updating atime on open would be enough, 
hopefully without being too much. The "lazyatime" thing I was playing 
with only updated on open, final close, write, and fork.


I like the idea of updating once in a while, but one of the benefits of 
noatime is allowing drives to spin down via inactivity. If something 
does get done in the area of less but non-zero atime tracking, perhaps 
that could be taken into account. I have to check what "laptop_mode" 
actually does, since my laptops are old installs.
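
The "extra dirty flag" idea quoted at the top of this message can be
modelled in a few lines of user-space C. This is a toy illustration
under stated assumptions, not filesystem code: struct toy_inode and
both helpers are hypothetical, and "disk" is just a second field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy inode: tracks the in-memory atime separately from the copy
 * that has reached disk. */
struct toy_inode {
	long atime;            /* access time held in memory */
	long disk_atime;       /* access time last written to disk */
	bool atime_dirty;      /* set when only atime has changed */
	unsigned int writes;   /* simulated disk writes so far */
};

/* Record an access: update memory and mark dirty, but queue no write. */
static void toy_touch_atime(struct toy_inode *ino, long now)
{
	assert(ino != NULL);
	ino->atime = now;
	ino->atime_dirty = true;
}

/* Drop the inode from the cache: the only point where atime is
 * written back in this model. */
static void toy_evict(struct toy_inode *ino)
{
	assert(ino != NULL);
	if (ino->atime_dirty) {
		ino->disk_atime = ino->atime;
		ino->atime_dirty = false;
		ino->writes++;
	}
}
```

Any number of accesses cost one write at eviction, which is also the
downside raised above: until then disk_atime is stale, so a crash
loses every atime update since the last writeback.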


--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


Re: Documentation files in html format?

2007-08-09 Thread Rene Herman

On 08/10/2007 12:27 AM, Francois Romieu wrote:


Andi Kleen <[EMAIL PROTECTED]> :
[...]



I don't think that is used by Linuxdoc. Try a make pdfdocs and see for
yourself.


It reminds me of an old PII but it does not really make clear how html to
pdf conversion would improve the situation.


With HTML the source format is itself the preferred object format for many 
purposes (something which I assume you wouldn't want to claim of DocBook 
source) meaning that for those uses there is no conversion.


Which given the number of times "make *docs" has bombed out on me through 
the years I find a definite improvement. Add in that it's much easier to 
produce HTML, that it covers most all formatting needs something like the 
kernel documentation directory needs, integrates unchanged, directly and 
nicely into the effort Rob Landley is doing with collecting documentation 
online and is a format you can read with a program most users have open and 
available 100% of the time rather than requiring a complete stack of 
semi-obscure external software -- and I just don't see why anyone would want 
to argue that DocBook and its associated crapola should _not_ be buried in 
that same dark, desolate place where other abortive attempts at improvement 
such as GNU info already reside.


Rene.



Re: [PATCH 02/10] mm: system wide ALLOC_NO_WATERMARK

2007-08-09 Thread Daniel Phillips
On 8/9/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Thu, 9 Aug 2007, Daniel Phillips wrote:
> > On 8/8/07, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> > > On Wed, 8 Aug 2007, Daniel Phillips wrote:
> > > Maybe we need to kill PF_MEMALLOC
> > Shrink_caches needs to be able to recurse into filesystems at least,
> > and for the duration of the recursion the filesystem must have
> > privileged access to reserves.  Consider the difficulty of handling
> > that with anything other than a process flag.
>
> Shrink_caches needs to allocate memory? Hmmm... Maybe we can only limit
> the PF_MEMALLOC use.

PF_MEMALLOC is not such a bad thing.  It will usually be less code
than mempool for the same use case, besides being able to handle a
wider range of problems.  We introduce __GFP_MEMALLOC for situations
where the need for reserve memory is locally known, as in the network
stack, which is similar or identical to the use case for mempool.  One
could reasonably ask why we need mempool with a lighter alternative
available.  But this is a case of to each their own I think.  Either
technique will work for reserve management.

> > In theory, we could reduce the size of the global memalloc pool by
> > including "easily freeable" memory in it.  This is just an
> > optimization and does not belong in this patch set, which fixes a
> > system integrity issue.
>
> I think the main thing would be to fix reclaim to not do stupid things
> like triggering writeout early in the reclaim pass and to allow reentry
> into reclaim. The idea of memory pools always sounded strange to me given
> that you have a lot of memory in a zone that is reclaimable as needed.

You can fix reclaim as much as you want and the basic deadlock will
still not go away.  When you finally do get to writing something out,
memory consumers in the writeout path are going to cause problems,
which this patch set fixes.

Agreed that the idea of mempool always sounded strange, and we show
how to get rid of them, but that is not the immediate purpose of this
patch set.

Regards,

Daniel


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-09 Thread Andrew Morton
On Fri, 10 Aug 2007 01:23:07 +0200
Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:

> Hello,
> 
>   This probably doesn't have great impact ;) but ...
> To reproduce: run torture tests for RCU and then sysrq+q.
> 
> SysRq : Show Pending Timers
> Timer List Version: v0.3
> HRTIMER_MAX_CLOCK_BASES: 2
> now at 1764338760370 nsecs
> 
> cpu: 0
>  clock 0:
>   .index:  0
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_real
>   .offset: 1186699025823815427 nsecs
> active timers:
>  clock 1:
>   .index:  1
>   .resolution: 1 nsecs
>   .get_time:   ktime_get
>   .offset: 0 nsecs
> active timers:
>  #0: <3>BUG: sleeping function called from invalid context at 
> kernel/mutex.c:86
> in_atomic():1, irqs_disabled():1
> INFO: lockdep is turned off.
> irq event stamp: 0
> hardirqs last  enabled at (0): [<>] 0x0
> hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c
> softirqs last  enabled at (0): [] copy_process+0x4c6/0x144c
> softirqs last disabled at (0): [<>] 0x0
>  [] show_trace_log_lvl+0x1a/0x30
>  [] show_trace+0x12/0x14
>  [] dump_stack+0x15/0x17
>  [] __might_sleep+0xb7/0xc9
>  [] mutex_lock+0x15/0x1f
>  [] lookup_module_symbol_name+0x17/0xc0
>  [] lookup_symbol_name+0x3f/0x43
>  [] print_name_offset+0x1f/0x96
>  [] timer_list_show+0x802/0xcbd
>  [] sysrq_timer_list_show+0xc/0xe
>  [] sysrq_handle_show_timers+0x8/0xa
>  [] __handle_sysrq+0x7b/0x115
>  [] handle_sysrq+0x20/0x24
>  [] kbd_event+0x3a8/0x5c7
>  [] input_pass_event+0x8f/0x91
>  [] input_handle_event+0x98/0x38d
>  [] input_event+0x54/0x67
>  [] atkbd_interrupt+0x200/0x59e
>  [] serio_interrupt+0x7c/0x80
>  [] i8042_interrupt+0x17a/0x289
>  [] handle_IRQ_event+0x28/0x59
>  [] handle_level_irq+0xad/0x10b
>  [] do_IRQ+0x93/0xd0
>  [] common_interrupt+0x2e/0x34
>  [] rcu_read_delay+0x8/0x36 [rcutorture]
>  [] rcu_torture_reader+0x6e/0x169 [rcutorture]
>  [] kthread+0x36/0x58
>  [] kernel_thread_helper+0x7/0x1c
>  ===

We seem to have made a mess in there.  timer_list_show() ends up calling
lookup_module_symbol_name(), which takes a mutex.  However print_symbol()
(which is called at oops time, interrupt time, etc) calls
module_address_lookup(), which is basically the same, only it doesn't take
the mutex.

I guess a quicky fix would be to switch
kernel/time/timer_list.c:print_name_offset() from
lookup_module_symbol_name() to module_address_lookup().  But we'd still
have a mess in there.

(adds ccs, runs away)



Re: Unable to handle kernel paging request at virtual address

2007-08-09 Thread Dave Jones
On Fri, Aug 10, 2007 at 01:44:24AM +0200, shacky wrote:
 > > You snipped the most important part.  Even a digital photo of the
 > > crash would be more useful than what we have above.
 > > So far, there's not really much to go on.
 > 
 > Could you tell me what is the most important part, so I try to rewrite
 > it by hand?

It's hard to blindly guess because there's so little to go on.
At the least, a complete list of the modules loaded, the EIP/RIP
and the call trace. (This makes up 90% of the output, hence the
suggestion to take a photograph).

 > I don't think a digital photo will be very useful because the whole
 > error is written on a simple black screen with white characters.

That's expected. We usually cope with that quite well.

Dave

-- 
http://www.codemonkey.org.uk


Re: problems while mounting /boot partition

2007-08-09 Thread Jan Engelhardt

On Aug 8 2007 18:28, Michal Piotrowski wrote:
>
>Hi Brian,
>
>Brian J. Murrell pisze:
>> I am using Ubuntu Gutsy, which is the in-development branch heading for
>> their next stable release.
>
>You forgot about message subject, so no one has read this report.

Actually, given the volume on LKML, a message without a subject draws
the most attention, since all the others do have one. :)


Jan
-- 


Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-09 Thread Mariusz Kozlowski
Hello,

=
[ INFO: inconsistent lock state ]
2.6.23-rc2-mm1 #7
-
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (>lock){+...}, at: [] rtl8139_interrupt+0x27/0x46b [8139too]
{in-hardirq-W} state was registered at:
  [] __lock_acquire+0x949/0x11ac
  [] lock_acquire+0x99/0xb2
  [] _spin_lock+0x35/0x42
  [] rtl8139_interrupt+0x27/0x46b [8139too]
  [] handle_IRQ_event+0x28/0x59
  [] handle_level_irq+0xad/0x10b
  [] do_IRQ+0x93/0xd0
  [] common_interrupt+0x2e/0x34
  [] cpuidle_idle_call+0x74/0x99
  [] cpu_idle+0x87/0x89
  [] rest_init+0x60/0x62
  [] start_kernel+0x23a/0x2c5
  [<>] 0x0
  [] 0x
irq event stamp: 1777
hardirqs last  enabled at (1777): [] kfree+0xee/0x105
hardirqs last disabled at (1776): [] kfree+0x87/0x105
softirqs last  enabled at (1756): [] dev_deactivate+0x86/0xa5
softirqs last disabled at (1754): [] _spin_lock_bh+0xe/0x47

other info that might help us debug this:
1 lock held by ifconfig/5492:
 #0:  (rtnl_mutex){--..}, at: [] mutex_lock+0x1c/0x1f

stack backtrace:
 [] show_trace_log_lvl+0x1a/0x30
 [] show_trace+0x12/0x14
 [] dump_stack+0x15/0x17
 [] print_usage_bug+0x145/0x14f
 [] mark_lock+0x61f/0x70c
 [] __lock_acquire+0x73e/0x11ac
 [] lock_acquire+0x99/0xb2
 [] _spin_lock+0x35/0x42
 [] rtl8139_interrupt+0x27/0x46b [8139too]
 [] free_irq+0x11b/0x146
 [] rtl8139_close+0x8a/0x14a [8139too]
 [] dev_close+0x57/0x74
 [] dev_change_flags+0x8e/0x190
 [] devinet_ioctl+0x4af/0x652
 [] inet_ioctl+0x56/0x71
 [] sock_ioctl+0xa5/0x1d4
 [] do_ioctl+0x22/0x71
 [] vfs_ioctl+0x55/0x29e
 [] sys_ioctl+0x33/0x69
 [] sysenter_past_esp+0x5f/0x99
 ===

Regards,

Mariusz


Re: [PATCH] pata_artop: fix UDMA5 for AEC6280[R] and UDMA6 for AEC6880[R]

2007-08-09 Thread Bartlomiej Zolnierkiewicz
On Friday 10 August 2007, Alan Cox wrote:
> On Thu, 9 Aug 2007 23:19:34 +0200
> Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> wrote:
> 
> > 
> > Maximum supported UDMA mode for AEC6280[R] is UDMA5 (not UDMA4)
> > and for AEC6880[R] it is UDMA6 (not UDMA5):
> > 
> > * Fix the problem by adding missing struct ata_port_info to 
> > artop_init_one().
> > 
> > * Use the right naming (s/626/628/).
> > 
> > * Bump driver version.
> > 
> > Fixes IDE->libata regression, problem was never present in IDE aec62xx 
> > driver.
> 
> Have you tested this ??

-ENODEV, so no; testing is welcome.

However I went over both drivers to make sure that this change is safe
and correct.

BTW presence of the above bugs would strongly indicate that pata_artop has
never been tested (properly) with AEC6x80[R], otherwise these bugs should
have been noticed and fixed much earlier.

Thanks,
Bart


Re: Unable to handle kernel paging request at virtual address

2007-08-09 Thread shacky
> You snipped the most important part.  Even a digital photo of the
> crash would be more useful than what we have above.
> So far, there's not really much to go on.

Could you tell me what the most important part is, so I can try to copy
it out by hand?
I don't think a digital photo will be very useful because the whole
error is written on a plain black screen with white characters.


Re: 2.6.22 x86_64 : kernel initial decompression hangs on vmware

2007-08-09 Thread Zachary Amsden

Avi Kivity wrote:


> We haven't seen any issue with the 2.6.22 boot decompressor.  Which of 
> the four (fs, gs, ldt, or tr) were proving problematic and why?


It was tr that was affecting Workstation, since we boot through normal 
BIOS path, and only a 16-bit task was loaded at this point.


Just to make the state comprehensive, I opted to reload everything.

Zach


Re: Unable to handle kernel paging request at virtual address

2007-08-09 Thread Dave Jones
On Fri, Aug 10, 2007 at 01:17:14AM +0200, shacky wrote:

 > [87.935473] BUG: unable to handle kernel paging request at virtual
 > address 6d207972
 > [...] printing eip:
 > [...] 6d207972
 > [...] *pde = 
 > [...] Oops: 000 [#2]
 > [...] SMP
 > [...] Modules linked in: bluetooth capability lirc_dev
 > speedstep_lib cpufreq_powersave cpufreq_stats cpufreq_userspace
 > cpufreq_ondemand cpufreq_conservative freq_table video container sbs
 > button dock ac battery ipv6 sbp2 lp fuse snd_emu10k1_synth
 > snd_emux_synth snd_seq_virmidi snd_seq_midi_emul  [] etc.
 > 
 > I'm using the kernel 2.6.22.
 > 
 > I'm omitting the rest of the error because it is very long and I
 > would have to retype it by hand, since I can't copy it. If you need
 > any other information please ask. :-)

You snipped the most important part.  Even a digital photo of the
crash would be more useful than what we have above.
So far, there's not really much to go on.

Dave

-- 
http://www.codemonkey.org.uk


Re: [2.6 patch] cpqphp_ctrl.c: remove dead code

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 04:20:01PM -0700, Kristen Carlson Accardi wrote:
> On Fri, 10 Aug 2007 01:04:36 +0200
> Adrian Bunk <[EMAIL PROTECTED]> wrote:
> 
> > On Thu, Aug 09, 2007 at 03:47:02PM -0700, Kristen Carlson Accardi wrote:
> > > 
> > > fine by me - let's NAK this patch (and all future ones for this driver) 
> > > until 
> > > someone with hardware steps up to maintain this driver.  Eventually it
> > > will just die I guess.
> > 
> > We have tons of unmaintained drivers and none of them has such a silly 
> > auto-NAK policy.
> > 
> > cu
> > Adrian
> 
> OK - "all future ones" was too extreme.  I'll take trivial patches (of
> which this one is not).

As I wrote in the patch description, all it does is remove an if() 
check that could never be false (which is easily verifiable if you look 
at the source code).

I've also verified that my patch does not change a single bit in the 
object file (after compilation with gcc 4.2.1).

What's your definition of a trivial patch?

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: i386 doublefault handler is broken with CONFIG_DEBUG_SPINLOCK

2007-08-09 Thread Chuck Ebbert
On 08/09/2007 07:16 PM, Andi Kleen wrote:
> 
> I tested it. Even on a box without spin lock debugging I get a hard
> hang after
> 
> double fault, gdt at c1404000 [255 bytes]
> 
> even though it should have printed the registers.
> So it looks like there is more broken in the DF handler than just
> this.

Looks like it just fails the ptr_ok() test:

#define ptr_ok(x) ((x) > PAGE_OFFSET && (x) < PAGE_OFFSET + 0x1000000)

page_offset  c0000000
+             1000000
             --------
             c1000000 < c1404000

What should that be changed to, or is there some easier way to test that?
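To make the arithmetic concrete, here is a small sketch of the same range test. It assumes the usual i386 PAGE_OFFSET of 0xc0000000 and a 0x1000000 window above it; both constants and the name ptr_ok_sketch() are assumptions for illustration, not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: the default i386 PAGE_OFFSET and a 16 MB window above it. */
#define PAGE_OFFSET_ASSUMED 0xc0000000u
#define WINDOW_ASSUMED      0x01000000u

/* Same shape as the ptr_ok() test quoted above, with the assumed constants. */
static bool ptr_ok_sketch(uint32_t x)
{
    /* The upper bound must not wrap around 32 bits for the test to make sense. */
    assert(PAGE_OFFSET_ASSUMED + WINDOW_ASSUMED > PAGE_OFFSET_ASSUMED);
    return x > PAGE_OFFSET_ASSUMED &&
           x < PAGE_OFFSET_ASSUMED + WINDOW_ASSUMED;
}
```

With these assumptions ptr_ok_sketch(0xc1404000) is false, because the GDT address lies above the 0xc1000000 upper bound, while an address such as 0xc0400000 passes.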


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Gabriel C
Alan Cox wrote:
>>> [   28.828484] :00:1f.1: cannot adjust BAR0 (not I/O)
>>> [   28.828487] :00:1f.1: cannot adjust BAR1 (not I/O)
>>> [   28.828489] :00:1f.1: cannot adjust BAR2 (not I/O)
>>> [   28.828491] :00:1f.1: cannot adjust BAR3 (not I/O)
> 
> This means it didn't do anything. (wrongly because its checking I/O bits
> on a BAR which are ignored according to the spec)
> 
>>> Region 0: [virtual] Memory at 01f0 (32-bit, non-prefetchable) 
>>> [disabled] [size=8]
>>> Region 1: [virtual] Memory at 03f0 (type 3, non-prefetchable) 
>>> [disabled] [size=1]
>>> Region 2: [virtual] Memory at 0170 (32-bit, non-prefetchable) 
>>> [disabled] [size=8]
>>> Region 3: [virtual] Memory at 0370 (type 3, non-prefetchable) 
>>> [disabled] [size=1]
> 
> The controller is disabled and when disabled it seems to think its
> memory. Valid but interesting.
> 
> 

The box is a Dell Precision WorkStation 530 MT.

Actually I have an ATA-7 disc on the primary EIDE connector (one port free)
and an oldish CDROM on the secondary EIDE connector (one port free).

http://194.231.229.228/lara/lara.dmesg ( from 2.6.23-rc2-mm1 with the 2 patches reverted )
http://194.231.229.228/lara/lara.lspci ( lspci - -nn )
http://194.231.229.228/lara/lara.html ( lshw html output )

If you want me to do/try something let me know.


Gabriel


Re: [PATCH 3/4] Embed zone_id information within the zonelist->zones pointer

2007-08-09 Thread Mel Gorman
On (09/08/07 14:37), Christoph Lameter didst pronounce:
> On Thu, 9 Aug 2007, Mel Gorman wrote:
> 
> >  }
> >  
> > +#if defined(CONFIG_SMP) && INTERNODE_CACHE_SHIFT > ZONES_SHIFT
> 
> Is this necessary? ZONES_SHIFT is always <= 2 so it will work with 
> any pointer. Why disable this for UP?
> 

Caution in case the number of zones increases. There was no guarantee of
zone alignment. It's the same reason I have a BUG_ON in the encode
function so that if we don't catch problems at compile-time, it'll go
BANG in a nice predictable fashion.
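For readers following along, a minimal sketch of the general technique (packing a small id into the low bits of an aligned pointer, with a loud failure if the bits are not free). The names encode_zone(), decode_zone(), decode_zone_id() and struct zone_sketch are hypothetical and are not the helpers from the patch; assert() stands in for BUG_ON.

```c
#include <assert.h>
#include <stdint.h>

#define ZONES_SHIFT_SKETCH 2   /* assumed: the zone id fits in 2 bits */
#define ZONE_ID_MASK ((uintptr_t)((1u << ZONES_SHIFT_SKETCH) - 1))

/* Placeholder type; an int member gives at least 4-byte alignment on common ABIs. */
struct zone_sketch { int dummy; };

/* Pack a zone id into the low bits of an aligned pointer. */
static uintptr_t encode_zone(struct zone_sketch *z, unsigned int zone_id)
{
    uintptr_t p = (uintptr_t)z;

    /* Plays the role of the BUG_ON: the low bits must be free and the id must fit. */
    assert((p & ZONE_ID_MASK) == 0);
    assert(zone_id <= ZONE_ID_MASK);
    return p | zone_id;
}

/* Recover the original pointer by masking off the id bits. */
static struct zone_sketch *decode_zone(uintptr_t enc)
{
    return (struct zone_sketch *)(enc & ~ZONE_ID_MASK);
}

/* Recover the id from the low bits. */
static unsigned int decode_zone_id(uintptr_t enc)
{
    return (unsigned int)(enc & ZONE_ID_MASK);
}
```

The round trip decode_zone(encode_zone(&z, 3)) gives back &z, and decode_zone_id() gives back 3. If the number of id bits ever grew past what the pointer alignment guarantees, the first assert would fire rather than silently corrupting the pointer.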

> > --- linux-2.6.23-rc1-mm2-010_use_zonelist/mm/vmstat.c   2007-08-07 
> > 14:45:11.0 +0100
> > +++ linux-2.6.23-rc1-mm2-015_zoneid_zonelist/mm/vmstat.c2007-08-09 
> > 15:52:12.0 +0100
> > @@ -365,11 +365,11 @@ void refresh_cpu_vm_stats(int cpu)
> >   */
> >  void zone_statistics(struct zonelist *zonelist, struct zone *z)
> >  {
> > -   if (z->zone_pgdat == zonelist->zones[0]->zone_pgdat) {
> > +   if (z->zone_pgdat == zonelist_zone(zonelist->_zones[0])->zone_pgdat) {
> > __inc_zone_state(z, NUMA_HIT);
> > } else {
> > __inc_zone_state(z, NUMA_MISS);
> > -   __inc_zone_state(zonelist->zones[0], NUMA_FOREIGN);
> > +   __inc_zone_state(zonelist_zone(zonelist->_zones[0]), 
> > NUMA_FOREIGN);
> > }
> > if (z->node == numa_node_id())
> > __inc_zone_state(z, NUMA_LOCAL);
> 
> H. I hope the compiler does subexpression optimization on 
> 
>   zonelist_zone(zonelist->_zones[0]) 
> 

I'll check

> Acked-by: Christoph Lameter <[EMAIL PROTECTED]>
> 

-- 
-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-09 Thread Mariusz Kozlowski
Hello,

This probably doesn't have great impact ;) but ...
To reproduce: run torture tests for RCU and then sysrq+q.

SysRq : Show Pending Timers
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 1764338760370 nsecs

cpu: 0
 clock 0:
  .index:  0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset: 1186699025823815427 nsecs
active timers:
 clock 1:
  .index:  1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset: 0 nsecs
active timers:
 #0: <3>BUG: sleeping function called from invalid context at kernel/mutex.c:86
in_atomic():1, irqs_disabled():1
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last  enabled at (0): [<>] 0x0
hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c
softirqs last  enabled at (0): [] copy_process+0x4c6/0x144c
softirqs last disabled at (0): [<>] 0x0
 [] show_trace_log_lvl+0x1a/0x30
 [] show_trace+0x12/0x14
 [] dump_stack+0x15/0x17
 [] __might_sleep+0xb7/0xc9
 [] mutex_lock+0x15/0x1f
 [] lookup_module_symbol_name+0x17/0xc0
 [] lookup_symbol_name+0x3f/0x43
 [] print_name_offset+0x1f/0x96
 [] timer_list_show+0x802/0xcbd
 [] sysrq_timer_list_show+0xc/0xe
 [] sysrq_handle_show_timers+0x8/0xa
 [] __handle_sysrq+0x7b/0x115
 [] handle_sysrq+0x20/0x24
 [] kbd_event+0x3a8/0x5c7
 [] input_pass_event+0x8f/0x91
 [] input_handle_event+0x98/0x38d
 [] input_event+0x54/0x67
 [] atkbd_interrupt+0x200/0x59e
 [] serio_interrupt+0x7c/0x80
 [] i8042_interrupt+0x17a/0x289
 [] handle_IRQ_event+0x28/0x59
 [] handle_level_irq+0xad/0x10b
 [] do_IRQ+0x93/0xd0
 [] common_interrupt+0x2e/0x34
 [] rcu_read_delay+0x8/0x36 [rcutorture]
 [] rcu_torture_reader+0x6e/0x169 [rcutorture]
 [] kthread+0x36/0x58
 [] kernel_thread_helper+0x7/0x1c
 ===
, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 176433900 nsecs [in 239630 nsecs]
 #1: , it_real_fn, S:01, do_setitimer, artsd/7461
 # expires at 1764742781512 nsecs [in 404021142 nsecs]
 #2: , hrtimer_wakeup, S:01, do_nanosleep, kwrapper/7452
 # expires at 1764922105491 nsecs [in 583345121 nsecs]
 #3: , it_real_fn, S:01, do_setitimer, syslogd/6719
 # expires at 1790027922194 nsecs [in 25689161824 nsecs]
  .expires_next   : 176433900 nsecs
  .hres_active: 1
  .nr_events  : 1422687
  .nohz_mode  : 2
  .idle_tick  : 46585900 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 165857
  .idle_calls : 1812679
  .idle_sleeps: 1761361
  .idle_entrytime : 466865075138 nsecs
  .idle_sleeptime : 357976883572 nsecs
  .last_jiffies   : 166865
  .next_jiffies   : 166866
  .idle_expires   : 46595100 nsecs
jiffies: 1464338


Tick Device: mode: 1
Clock Event Device: pit
 max_delta_ns:   27461866
 min_delta_ns:   12571
 mult:   5124677
 shift:  32
 mode:   3
 next_event: 176433900 nsecs
 set_next_event: pit_next_event
 set_mode:   init_pit_timer
 event_handler:  hrtimer_interrupt

Regards,

Mariusz
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.23-rc2-mm1
# Fri Aug 10 00:12:50 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_NONIRQ_WAKEUP=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SWAP_PREFETCH=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CONTAINERS is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_PROC_KPAGEMAP=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is 

Re: [2.6 patch] cpqphp_ctrl.c: remove dead code

2007-08-09 Thread Alan Cox
> Silly is in the eye of the beholder.  I don't want to take this patch
> because it needs to be reviewed by someone who really knows the intent
> of the driver.  Seems silly to me to blindly take patches.

For unmaintained code we usually work on wackipedia theory ("its probably
right but if not we can revert it/update it cheaply")

Alan


Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.

2007-08-09 Thread Sean Hefty

> How about we just remove the RDMA stack altogether?  I am not at all
> kidding.  If you guys can't stay in your sand box and need to cause
> problems for the normal network stack, it's unacceptable.  We were
> told all along that if RDMA went into the tree none of this kind of
> stuff would be an issue.


There are currently two RDMA solutions available.  Each solution has 
different requirements and uses the normal network stack differently. 
Infiniband uses its own transport.  iWarp runs over TCP.


We have tried to leverage the existing infrastructure where it makes sense.


> After TCP port reservation, what's next?  It seems an at least
> bi-monthly event that the RDMA folks need to put their fingers
> into something else in the normal networking stack.  No more.


Currently, the RDMA stack uses its own port space.  This causes a 
problem for iWarp, and is what Steve is looking for a solution for.  I'm 
not an iWarp guru, so I don't know what options exist.  Can iWarp use 
its own address family?  Identify specific IP addresses for iWarp use? 
Restrict iWarp to specific port numbers?  Let the app control the 
correct operation?  I don't know.


Steve merely defined a problem and suggested a possible solution.  He's 
looking for constructive help trying to solve the problem.


- Sean


Re: [2.6 patch] cpqphp_ctrl.c: remove dead code

2007-08-09 Thread Kristen Carlson Accardi
On Fri, 10 Aug 2007 01:04:36 +0200
Adrian Bunk <[EMAIL PROTECTED]> wrote:

> On Thu, Aug 09, 2007 at 03:47:02PM -0700, Kristen Carlson Accardi wrote:
> > 
> > fine by me - let's NAK this patch (and all future ones for this driver) 
> > until 
> > someone with hardware steps up to maintain this driver.  Eventually it
> > will just die I guess.
> 
> We have tons of unmaintained drivers and none of them has such a silly 
> auto-NAK policy.
> 
> cu
> Adrian

OK - "all future ones" was too extreme.  I'll take trivial patches (of
which this one is not).


Re: [2.6 patch] cpqphp_ctrl.c: remove dead code

2007-08-09 Thread Kristen Carlson Accardi
On Fri, 10 Aug 2007 01:04:36 +0200
Adrian Bunk <[EMAIL PROTECTED]> wrote:

> We have tons of unmaintained drivers and none of them has such a silly 
> auto-NAK policy.
> 
> cu
> Adrian

Silly is in the eye of the beholder.  I don't want to take this patch
because it needs to be reviewed by someone who really knows the intent
of the driver.  Seems silly to me to blindly take patches.


Unable to handle kernel paging request at virtual address

2007-08-09 Thread shacky
Hi.
I installed Ubuntu 7.04 and then upgraded to the upcoming 7.10 on my
system, which is based on an Asus Pundit barebone with 512 MB RAM and a
120 GB IDE hard disk.
The system works without any problem, but when I try to shut down or
restart it, after a while during the shutdown process the
system hangs and I see this error:

[87.935473] BUG: unable to handle kernel paging request at virtual
address 6d207972
[...] printing eip:
[...] 6d207972
[...] *pde = 
[...] Oops: 000 [#2]
[...] SMP
[...] Modules linked in: bluetooth capability lirc_dev
speedstep_lib cpufreq_powersave cpufreq_stats cpufreq_userspace
cpufreq_ondemand cpufreq_conservative freq_table video container sbs
button dock ac battery ipv6 sbp2 lp fuse snd_emu10k1_synth
snd_emux_synth snd_seq_virmidi snd_seq_midi_emul  [] etc.

I'm using the kernel 2.6.22.

I'm omitting the rest of the error because it is very long and I
would have to retype it by hand, since I can't copy it. If you need
any other information please ask. :-)

Could you help me, please? What could be the problem?

Thank you very much!
Bye.


Re: i386 doublefault handler is broken with CONFIG_DEBUG_SPINLOCK

2007-08-09 Thread Andi Kleen
On Thu, Aug 09, 2007 at 02:40:27PM -0400, Chuck Ebbert wrote:
> On 08/09/2007 01:49 PM, Andi Kleen wrote:
> > Chuck Ebbert <[EMAIL PROTECTED]> writes:
> >> Initializing FS in the doublefault_tss should fix it.
> >>
> >> Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>
> >>
> >> ---
> >>
> >>  NOTE: not even compile tested.
> > 
> > Can you please test it?
> > 
> 
> It compiles but I can't really test it further right now.

I tested it. Even on a box without spin lock debugging I get a hard
hang after

double fault, gdt at c1404000 [255 bytes]

even though it should have printed the registers.
So it looks like there is more broken in the DF handler than just
this.

-Andi


Re: [2.6 patch] cpqphp_ctrl.c: remove dead code

2007-08-09 Thread Alan Cox
> fine by me - let's NAK this patch (and all future ones for this driver) until 
> someone with hardware steps up to maintain this driver.  Eventually it
> will just die I guess.

If you want to NAK it perhaps you should become maintainer ;)

Alan


Re: CFS review

2007-08-09 Thread Roman Zippel
Hi,

On Wed, 1 Aug 2007, Ingo Molnar wrote:

> just to make sure, how does 'top' output of the l + "lt 3" testcase look 
> like now on your laptop? Yesterday it was this:
> 
>  4544 roman 20   0  1796  520  432 S 32.1  0.4   0:21.08 lt
>  4545 roman 20   0  1796  344  256 R 32.1  0.3   0:21.07 lt
>  4546 roman 20   0  1796  344  256 R 31.7  0.3   0:21.07 lt
>  4547 roman 20   0  1532  272  216 R  3.3  0.2   0:01.94 l
> 
> and i'm still wondering how that output was possible.

I disabled the jiffies logic and the result is still the same, so this 
problem isn't related to resolution at all.
I traced it a little and what's happening is that the busy loop really gets 
only a little time; it only runs in between the timer tasks. When the timer 
task is woken up, __enqueue_sleeper() updates sleeper_bonus, and a little 
later, when the busy loop is preempted, __update_curr() is called one last 
time and it's fully hit by the sleeper_bonus. So the timer tasks use less 
time than they actually get and thus produce overflows; the busy loop OTOH 
is punished and underflows.
So it seems my initial suspicion was right and this logic is dodgy, what 
is it actually supposed to do? Why is some random task accounted with the 
sleeper_bonus?

bye, Roman

PS: Can I still expect an answer about all the other stuff?


Re: [2.6 patch] cpqphp_ctrl.c: remove dead code

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 03:47:02PM -0700, Kristen Carlson Accardi wrote:
> 
> fine by me - let's NAK this patch (and all future ones for this driver) until 
> someone with hardware steps up to maintain this driver.  Eventually it
> will just die I guess.

We have tons of unmaintained drivers and none of them has such a silly 
auto-NAK policy.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH 1/24] make atomic_read() behave consistently on alpha

2007-08-09 Thread Segher Boessenkool

So, why not use the well-defined alternative?

Because we don't need to, and it hurts performance.

It hurts performance by implementing 32-bit atomic reads in assembler?


No, I misunderstood the question.  Implementing 32-bit atomic reads in 
assembler is redundant, because any sane compiler, *particularly* an 
optimizing compiler (and we're only in this mess because of optimizing 
compilers)


Oh please, don't tell me you don't want an optimising compiler.
And if you _do_ want one, well you're in this mess because you
chose C as implementation language and C has some pretty strange
rules.  Trying to use not-all-that-well-defined-and-completely-
misunderstood features of the language doesn't make things easier;
trying to use something that isn't even part of the language and
that your particular compiler originally supported by accident,
and that isn't yet an officially supported feature, and that on
top of it all has a track record of problems -- well it makes me
wonder if you're in this game for fun or what.


 will give us that automatically without the assembler.


No, it does *not* give it to you automatically; you have to do
either the asm() thing, or the not-defined-at-all *(volatile *)&
thing.
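For reference, the cast form under dispute looks roughly like the sketch below. The type atomic_sketch_t and the function atomic_read_sketch() are illustrative names only, and whether the language actually guarantees anything for such an access is exactly what this thread is arguing about; the sketch shows the shape of the code, not a claim that it is well-defined.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for an atomic counter type. */
typedef struct { int counter; } atomic_sketch_t;

/*
 * Read the counter through a volatile-qualified lvalue, so that compilers
 * which honour volatile on the access emit a load on every call instead of
 * reusing a cached value.
 */
static int atomic_read_sketch(const atomic_sketch_t *v)
{
    assert(v != NULL);
    return *(const volatile int *)&v->counter;
}
```

The asm() alternative mentioned above is architecture-specific and is not sketched here.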

Yes, it is legal for a compiler to violate this assumption.  It is 
also legal for us to refuse to maintain compatibility with compilers 
that suck this badly.


So that's rm include/linux/compiler-gcc*.h then.  Good luck with
the intel compiler, maybe it works more to your liking.


Segher



Re: [PATCH] at91 pm: Compilation fix for at91sam926x

2007-08-09 Thread Ulf Samuelsson

> > +#if defined(CONFIG_ARCH_AT91RM9200)
> > at91_sys_write(AT91_SDRAMC_SRR, 1); /*
> > self-refresh mode */

> Why not use:
> if (cpu_is_at91rm9200())
> at91_sys_write(AT91_SDRAMC_SRR, 1);

What is the benefit?
Will the optimizer remove the code if the CPU is not the at91rm9200?
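A sketch of the pattern in question, under the assumption that the cpu_is_*() test expands to a compile-time constant when the CPU is not configured. Everything here (SKETCH_ARCH_AT91RM9200, sketch_cpu_is_rm9200(), sketch_sys_write(), the register value 0x10) is a hypothetical stand-in, not the real at91 headers.

```c
#include <assert.h>  /* used by the check in the usage note */

/* Hypothetical stand-in for a Kconfig symbol; define it to model an RM9200 build. */
/* #define SKETCH_ARCH_AT91RM9200 */

#ifdef SKETCH_ARCH_AT91RM9200
#define sketch_cpu_is_rm9200() (1)
#else
#define sketch_cpu_is_rm9200() (0)
#endif

/* Counts simulated register writes so the effect can be observed. */
static int srr_writes;

/* Stand-in for a system register write. */
static void sketch_sys_write(unsigned int reg, unsigned int val)
{
    (void)reg;
    (void)val;
    srr_writes++;
}

/* Returns the number of register writes performed so far. */
static int enter_self_refresh(void)
{
    /*
     * Unlike an #if block, this branch is always parsed and type-checked.
     * When the condition is the constant 0, the compiler is free to drop
     * the call as dead code.
     */
    if (sketch_cpu_is_rm9200())
        sketch_sys_write(0x10, 1);
    return srr_writes;
}
```

With the stand-in symbol left undefined, assert(enter_self_refresh() == 0) holds: the write never happens, yet the code inside the branch still has to compile. That is the usual argument for if (cpu_is_...()) over #if defined(...), provided the macro really is a constant in that configuration.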

Best Regards
Ulf Samuelsson


Re: [2.6 patch] cpqphp_ctrl.c: remove dead code

2007-08-09 Thread Kristen Carlson Accardi
On Thu, 9 Aug 2007 15:24:27 -0700
Greg KH <[EMAIL PROTECTED]> wrote:
> On Thu, Aug 09, 2007 at 02:51:40PM -0700, Kristen Carlson Accardi wrote:
> > On Mon, 23 Jul 2007 16:51:05 +0200
> > Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > 
> > > If !mem_node we did already return -ENOMEM above in the function.
> > > 
> > > Spotted by the Coverity checker.
> > > 
> > > Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> > 
> > Greg - you are listed as the maintainer for this driver.
> 
> Not anymore, look at 2.6.23-rc1 :)
> 
> > Can you either
> > point me to someone who can review this patch or review it yourself?  
> > Looking at the code, it looks like it's possible that the driver writer
> > wanted this code path to be able to be taken if it got IO resources
> > and not MEM resources, and if they didn't there's other cleanups that
> > should be done for the no iomem case.
> 
> Hm, I agree that this looks like the way the code was intended to work,
> but as this code has been working just fine so far the way it is, I'm
> not inclined to change it much, if any.
> 
> Especially as I no longer even have the hardware to test it on :(
> 
> So, how about we just leave it alone?
> 
> thanks,
> 
> greg k-h
> 

fine by me - let's NAK this patch (and all future ones for this driver) until 
someone with hardware steps up to maintain this driver.  Eventually it
will just die I guess.


Re: [PATCH] pata_artop: fix UDMA5 for AEC6280[R] and UDMA6 for AEC6880[R]

2007-08-09 Thread Alan Cox
On Thu, 9 Aug 2007 23:19:34 +0200
Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]> wrote:

> 
> Maximum supported UDMA mode for AEC6280[R] is UDMA5 (not UDMA4)
> and for AEC6880[R] it is UDMA6 (not UDMA5):
> 
> * Fix the problem by adding missing struct ata_port_info to artop_init_one().
> 
> * Use the right naming (s/626/628/).
> 
> * Bump driver version.
> 
> Fixes IDE->libata regression, problem was never present in IDE aec62xx driver.

Have you tested this ??


