Re: 2.6.23-rc1-mm1 -- mostly fails to build

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 23:41:46 +0100 Andy Whitcroft <[EMAIL PROTECTED]> wrote:

> On Wed, Jul 25, 2007 at 05:36:56PM +0100, Andy Whitcroft wrote:
> 
> > Will investigate the NUMA-Q explosion and report on that separately.
> 
> Ok, I've been looking at the NUMA-Q boot panic below:
> 
> BUG: unable to handle kernel NULL pointer dereference at virtual address 
> 
>  printing eip:
> c111689f
> *pdpt = 01387001
> *pde = 
> Oops:  [#1]
> SMP
> Modules linked in:
> CPU:0
> EIP:0060:[]Not tainted VLI
> EFLAGS: 00010286   (2.6.23-rc1-mm1-gc8131905-dirty #251)
> EIP is at pci_create_bus+0x11b/0x277
> eax:    ebx: c9352e00   ecx: c9073e94   edx: c9325400
> esi: c9325400   edi: c932559c   ebp: 0002   esp: c9073e90
> ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
> Process swapper (pid: 1, ti=c9072000 task=c9070030 task.ti=c9072000)
> Stack: c12adc4f c9325400  0002    c934c800
>00d6  c1116a09  c9073ed5 c11b178a  c934c940
> 02f4 c12bd8ac c934c800 c12bd8b4 c9325000 c1119825 c934c800
> Call Trace:
>  [] pci_scan_bus_parented+0xe/0x21
>  [] pci_fixup_i450nx+0xa7/0x101
>  [] pci_do_fixups+0x2d/0x38
>  [] pci_device_add+0x48/0x77
>  [] pci_scan_single_device+0x1a/0x1f
>  [] pci_scan_slot+0x15/0x47
>  [] pci_scan_child_bus+0x19/0x7c
>  [] pci_scan_bus_parented+0x19/0x21
>  [] pcibios_scan_root+0x75/0x7e
>  [] pci_numa_init+0x2c/0xe4
>  [] kernel_init+0x0/0xa1
>  [] do_initcalls+0x73/0x1a3
>  [] proc_register+0xa0/0xa7
>  [] create_proc_entry+0x73/0x86
>  [] register_irq_proc+0x75/0x92
>  [] kernel_init+0x0/0xa1
>  [] kernel_init+0x5f/0xa1
>  [] kernel_thread_helper+0x7/0x10
>  ===
> Code: ff 8b 83 84 00 00 00 c7 04 24 4f dc 2a c1 89 44 24 04 e8 f8 42 f0 ff 83 
> 7c 24 14 00 75 15 8b 93 84 00 00 00 85 d2 74 0b 8b 43 44 <8b> 00 89 82 50 01 
> 00 00 c7 44 24 04 9a 04 00 00 8d bb 88 00 00
> EIP: [] pci_create_bus+0x11b/0x277 SS:ESP 0068:c9073e90
> Kernel panic - not syncing: Attempted to kill init!
> 
> This seems to have been caused by the introduction of the following
> code fragment into pci_create_bus:
> 
> if (!parent)
> set_dev_node(b->bridge, pcibus_to_node(b));
> 
> This has come as part of the -mm patch below:
> 
> try-parent-numa_node-at-first-before-using-default-v2.patch
> 
> This patch does not seem to be wrong in and of itself.  It does
> expose the fact that we are building busses with NULL sysdata.
> This has come up at least three times now.  Below is the patch
> proposed the last couple of times.  It is needed to allow any machine
> with i450nx quirk, plus for NUMA-Q systems.

All this could have happened due to my hamfisted repairing of yet another
reject storm, too.

> Andrew please could you add this to -mm again.
> 

ok..
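
The failure mode discussed above can be shown outside the kernel. The sketch below is illustrative only: demo_bus, demo_sysdata and the demo_bus_to_node*() helpers are hypothetical stand-ins, not the kernel's struct pci_bus or pcibus_to_node(). It assumes the node lookup reads a field through the bus's sysdata pointer, which is why a bus built with NULL sysdata faults unless the pointer is valid or the lookup is guarded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; names are illustrative. */
struct demo_sysdata {
	int node;		/* NUMA node the bus belongs to */
};

struct demo_bus {
	void *sysdata;		/* NULL if the caller supplied none */
};

#define DEMO_NO_NODE (-1)

/*
 * Unguarded lookup: the pattern that faults when sysdata is NULL.
 * Callers must guarantee that bus->sysdata is valid.
 */
int demo_bus_to_node_unchecked(const struct demo_bus *bus)
{
	assert(bus != NULL && bus->sysdata != NULL);
	return ((const struct demo_sysdata *)bus->sysdata)->node;
}

/* Guarded lookup: reports "no node" instead of dereferencing NULL. */
int demo_bus_to_node(const struct demo_bus *bus)
{
	const struct demo_sysdata *sd = bus ? bus->sysdata : NULL;

	return sd ? sd->node : DEMO_NO_NODE;
}
```

Whether to guard the lookup or to make every caller supply sysdata is a separate design question; the patch referred to in the thread is not reproduced here.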
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: -mm merge plans for 2.6.23

2007-07-25 Thread Nick Piggin

Andrew Morton wrote:

> All this would end up needing runtime configurability and tweakability and
> customisability.  All standard fare for userspace stuff - much easier than
> patching the kernel.
>
> So.  We can
>
> a) provide a way for userspace to reload pagecache and
>
> b) merge maps2 (once it's finished) (pokes mpm)
>
> and we're done?

The userspace solution has been brought up before. It could be a good way
to go. I was thinking about how to do refetching of file backed pages from
the kernel, and it isn't impossible, but it seems like locking would be
quite hard and it would be pretty complex and inflexible compared to a
userspace solution. Userspace might know what to chuck out, what to keep,
what access patterns to use...

Not that I want to say anything about swap prefetch getting merged: my
inbox is already full of enough "helpful suggestions" about that, so I'll
just be happy to have a look at little things like updatedb.

--
SUSE Labs, Novell Inc.


[PATCH] fix 'dynreloc miscount' link error on Powerpc

2007-07-25 Thread Sam Ravnborg
Nathan Lynch <[EMAIL PROTECTED]> reported:
2.6.23-rc1 breaks the build for 64-bit powerpc for me (using
maple_defconfig):

  LD  vmlinux.o
powerpc64-unknown-linux-gnu-ld: dynreloc miscount for
kernel/built-in.o, section .opd
powerpc64-unknown-linux-gnu-ld: can not edit opd Bad value
make: *** [vmlinux.o] Error 1

However, I see a possibly related binutils patch:
http://article.gmane.org/gmane.comp.gnu.binutils/33650

It was tracked down to be caused by the weak prototype
declaration in mm.h:
__attribute__((weak)) const char *arch_vma_name(struct vm_area_struct *vma);

But there is no need to make the declaration weak - only the definition
needs to be marked weak. So drop the weak declaration.
And in the process drop the duplicate definition in page.h for powerpc.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
Note - the arch_vma_name fix for x86_64 needs to be applied first to avoid 
breaking x86_64



diff --git a/include/asm-powerpc/page.h b/include/asm-powerpc/page.h
index 10c51f4..236a921 100644
--- a/include/asm-powerpc/page.h
+++ b/include/asm-powerpc/page.h
@@ -190,7 +190,6 @@ extern void copy_user_page(void *to, void *from, unsigned long vaddr,
 extern int page_is_ram(unsigned long pfn);
 
 struct vm_area_struct;
-extern const char *arch_vma_name(struct vm_area_struct *vma);
 
 #include 
 #endif /* __ASSEMBLY__ */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c456c3a..3e9e8fe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1246,7 +1246,7 @@ void drop_slab(void);
 extern int randomize_va_space;
 #endif
 
-__attribute__((weak)) const char *arch_vma_name(struct vm_area_struct *vma);
+const char * arch_vma_name(struct vm_area_struct *vma);
 
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
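
To illustrate the point about weak declarations versus weak definitions, here is a minimal userspace sketch, assuming GCC or Clang on an ELF target. struct demo_vma, demo_arch_vma_name() and demo_vma_label() are hypothetical names standing in for the kernel's. The declaration is plain and only the default definition carries the weak attribute, so another object file could override it with an ordinary (strong) definition.

```c
#include <assert.h>
#include <stddef.h>

struct demo_vma;	/* opaque, hypothetical stand-in for vm_area_struct */

/* Plain declaration, as a shared header would carry it: not marked weak. */
const char *demo_arch_vma_name(struct demo_vma *vma);

/*
 * Default definition, marked weak. An architecture that wants to name a
 * mapping provides a strong definition in its own object file, and the
 * linker picks that one instead of this default.
 */
__attribute__((weak)) const char *demo_arch_vma_name(struct demo_vma *vma)
{
	(void)vma;
	return NULL;	/* no architecture-specific name by default */
}

/* Caller: uses the fallback label when no architecture name is supplied. */
const char *demo_vma_label(struct demo_vma *vma, const char *fallback)
{
	const char *name;

	assert(fallback != NULL);
	name = demo_arch_vma_name(vma);
	return name ? name : fallback;
}
```

Built on its own, the weak default is the only definition, so demo_vma_label() returns the fallback it was given.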




Re: What's does KPROBE_ENTRY mean?

2007-07-25 Thread jidong xiao
Can anyone help with this?

On 6/21/07, jidong xiao <[EMAIL PROTECTED]> wrote:
> I searched in linux kernel 2.6.10, didn't find it, then I tried
> 2.6.20, it is there. But I am not familiar with assembly language, so
> can anybody kindly explain it, I don't know the difference between
> KPROBE_ENTRY and ENTRY, however, I can find both of these items in
> some files, such as arch/x86_64/kernel/entry.S.
>
> Thank you.
>
> Regards
> Jason Xiao
>


Re: 2.6.23-rc1-mm1 - drivers/char/nozomi.c overflow in implicit constant conversion , warnings

2007-07-25 Thread Greg KH
On Wed, Jul 25, 2007 at 10:42:07PM +0200, Gabriel C wrote:
> 
> ...
> 
> drivers/char/nozomi.c: In function 'interrupt_handler':
> drivers/char/nozomi.c:1298: warning: overflow in implicit constant conversion
> drivers/char/nozomi.c: In function 'nozomi_card_init':
> drivers/char/nozomi.c:1568: warning: overflow in implicit constant conversion
> drivers/char/nozomi.c:1592: warning: overflow in implicit constant conversion
> drivers/char/nozomi.c: In function 'nozomi_card_exit':
> drivers/char/nozomi.c:1673: warning: overflow in implicit constant conversion

Ick, yeah, that driver needs help.  Luckily someone is starting to work
on it now and help clean it up.

thanks for letting me know.

greg k-h


[-mm patch] DMA engine kconfig improvements

2007-07-25 Thread Adrian Bunk
On Wed, Jul 25, 2007 at 04:03:04AM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.22-rc6-mm1:
>...
> +dma-arch-fix.patch
> 
>  Fix git-dma.patch
>...

This results in an ARM-only driver in an X86-only menu...

What about the patch below instead that also improves a few other things?


<--  snip  -->


This patch contains the following changes to the DMA engine menus:
- switch to menuconfig
- INTEL_IOATDMA must depend on X86
- INTEL_IOATDMA must select DCA
- device drivers shouldn't "default m"
- DCA shouldn't be a user visible option
- make it clear in the INTEL_IOATDMA help text that this driver is for
  rare hardware the user most likely doesn't have
- let DMA_ENGINE be select'ed by the DMA devices, making it less likely
  for a user to accidentally enable NET_DMA

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 drivers/dca/Kconfig |7 +
 drivers/dma/Kconfig |   59 +---
 2 files changed, 36 insertions(+), 30 deletions(-)

--- linux-2.6.23-rc1-mm1/drivers/dma/Kconfig.old	2007-07-26 06:45:46.0 +0200
+++ linux-2.6.23-rc1-mm1/drivers/dma/Kconfig	2007-07-26 07:08:46.0 +0200
@@ -2,42 +2,51 @@
 # DMA engine configuration
 #
 
-menu "DMA Engine support"
-   depends on HAS_DMA
+menuconfig DMADEVICES
+   bool "DMA Engine support"
+   depends on (PCI && X86) || ARCH_IOP32X || ARCH_IOP33X || ARCH_IOP13XX
+   help
+ Intel(R) DMA engines
+
+if DMADEVICES
+
+comment "DMA Devices"
+
+config INTEL_IOATDMA
+   tristate "Intel I/OAT DMA support"
+   depends on PCI && X86
+   select DMA_ENGINE
+   select DCA
+   help
+ Enable support for the Intel(R) I/OAT DMA engine present
+ in recent Intel Xeon CPUs.
+
+ Say yes if you have such a CPU.
+
+ If unsure, say N.
+
+config INTEL_IOP_ADMA
+   tristate "Intel IOP ADMA support"
+   depends on ARCH_IOP32X || ARCH_IOP33X || ARCH_IOP13XX
+   select ASYNC_CORE
+   select DMA_ENGINE
+   help
+ Enable support for the Intel(R) IOP Series RAID engines.
 
 config DMA_ENGINE
-   bool "Support for DMA engines"
-   ---help---
-  DMA engines offload bulk memory operations from the CPU to dedicated
-  hardware, allowing the operations to happen asynchronously.
+   bool
 
 comment "DMA Clients"
+   depends on DMA_ENGINE
 
 config NET_DMA
bool "Network: TCP receive copy offload"
depends on DMA_ENGINE && NET
default y
-   ---help---
+   help
  This enables the use of DMA engines in the network stack to
  offload receive copy-to-user operations, freeing CPU cycles.
  Since this is the main user of the DMA engine, it should be enabled;
  say Y here.
 
-comment "DMA Devices"
-
-config INTEL_IOATDMA
-   tristate "Intel I/OAT DMA support"
-   depends on DMA_ENGINE && PCI
-   default m
-   ---help---
- Enable support for the Intel(R) I/OAT DMA engine.
-
-config INTEL_IOP_ADMA
-tristate "Intel IOP ADMA support"
-depends on DMA_ENGINE && (ARCH_IOP32X || ARCH_IOP33X || ARCH_IOP13XX)
-   select ASYNC_CORE
-default m
----help---
-  Enable support for the Intel(R) IOP Series RAID engines.
-
-endmenu
+endif
--- linux-2.6.23-rc1-mm1/drivers/dca/Kconfig.old	2007-07-26 06:49:32.0 +0200
+++ linux-2.6.23-rc1-mm1/drivers/dca/Kconfig	2007-07-26 06:49:41.0 +0200
@@ -3,9 +3,6 @@
 #
 
 config DCA
-   tristate "DCA support for clients and providers"
-   ---help---
-  This is a server to help modules that want to use Direct Cache
- Access to find DCA providers that will supply correct CPU tags.
-   default m
+   tristate
+
 



Re: [PATCH 3/4] include asm-mips add missing edac h file

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 14:55:01 -0600 [EMAIL PROTECTED] wrote:

> --- /dev/null
> +++ linux-2.6.23-rc1/include/asm-mips/edac.h
> @@ -0,0 +1,35 @@
> +#ifndef ASM_EDAC_H
> +#define ASM_EDAC_H
> +
> +/* ECC atomic, DMA, SMP and interrupt safe scrub function */
> +
> +static __inline__ void atomic_scrub(void *va, u32 size)

Please don't use __inline__ or __inline.  Good old "inline" will do.





> +{
> + unsigned long *virt_addr = va;
> + unsigned long temp;
> + u32 i;
> +
> + for (i = 0; i < size / sizeof(unsigned long); i++, virt_addr++) {
> +
> + /*
> +  * Very carefully read and write to memory atomically
> +  * so we are interrupt, DMA and SMP safe.
> +  *
> +  * Intel: asm("lock; addl $0, %0"::"m"(*virt_addr));
> +  */
> +
> + __asm__ __volatile__ (
> + "   .set    mips3           \n"
> + "1: ll      %0, %1  # atomic_add\n"
> + "   ll      %0, %1  # atomic_add\n"
> + "   addu    %0, $0          \n"
> + "   sc      %0, %1          \n"
> + "   beqz    %0, 1b          \n"
> + "   .set    mips0           \n"
> + : "=&r" (temp), "=m" (*virt_addr)
> + : "m" (*virt_addr));
> +
> + }
> +}

hm, I'd have thought that we could use plain old atomic_add() for this.
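
A userspace sketch of the idea behind that remark, assuming GCC or Clang. It uses the compiler's __atomic_fetch_add builtin rather than the kernel's atomic_add(), which operates on atomic_t instead of a raw memory word. demo_atomic_scrub() is a hypothetical name, and the sketch assumes va is aligned for unsigned long. Note that a compiler is free to lower an atomic add of zero without an actual store, so this shows the shape of the idea rather than a guaranteed write-back to memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Touch every word in [va, va + size) with an atomic add of zero.
 * The stored values are unchanged. Trailing bytes that do not fill a
 * whole word are skipped, matching the loop bound in the quoted patch.
 */
void demo_atomic_scrub(void *va, size_t size)
{
	unsigned long *word = va;
	size_t i;

	/* The builtin needs naturally aligned words. */
	assert(((uintptr_t)va % sizeof(unsigned long)) == 0);

	for (i = 0; i < size / sizeof(unsigned long); i++, word++)
		(void)__atomic_fetch_add(word, 0UL, __ATOMIC_SEQ_CST);
}
```

Real scrubbing code would have to confirm that the chosen primitive really performs a read-modify-write cycle on the target hardware.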


Re: [PATCH 1/4] drivers edac fix reset edac_mc pollmsec

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 14:54:21 -0600 [EMAIL PROTECTED] wrote:

> +void edac_mc_reset_delay_period(int value)
>  {
> - /* cancel the current workq request */
> - edac_mc_workq_teardown(mci);
> + struct mem_ctl_info *mci;
> + struct list_head *item;
> +
> + mutex_lock(&mem_ctls_mutex);
> +
> + /* scan the list and turn off all workq timers, doing so under lock
> +  */
> + list_for_each(item, &mc_devices) {
> + mci = list_entry(item, struct mem_ctl_info, link);
> +
> + if (mci->op_state == OP_RUNNING_POLL)
> + cancel_delayed_work(&mci->work);
> + }
> +
> + mutex_unlock(&mem_ctls_mutex);

cancel_delayed_work() on its own looks a bit racy.  The work could
presently be running on another CPU.

So generally we'll run flush_workqueue() or cancel_work_sync() after the
cancel_delayed_work() to make sure that it has really gone away.

Beware however that you're holding a lock here.  If any of the work
functions which can be at mci->work also take mem_ctls_mutex then it is
deadlocky to run flush_workqueue() or cancel_work_sync() while holding that
lock.
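
The ordering described above (cancel under the lock, wait only after dropping it) can be sketched in userspace with POSIX threads. This is an analogy, not kernel code: a thread stands in for the delayed work item, a flag stands in for cancel_delayed_work(), and pthread_join() stands in for flush_workqueue() or cancel_work_sync(). All demo_* names are hypothetical. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical poller: a thread stands in for the delayed work item. */
struct demo_poller {
	pthread_mutex_t lock;	/* plays the role of mem_ctls_mutex */
	pthread_t thread;
	bool stop;		/* protected by lock */
	unsigned long polls;	/* protected by lock */
};

/* Work function: takes the same lock that the canceller uses. */
static void *demo_poll_fn(void *arg)
{
	struct demo_poller *p = arg;
	bool stop;

	do {
		pthread_mutex_lock(&p->lock);
		stop = p->stop;
		if (!stop)
			p->polls++;
		pthread_mutex_unlock(&p->lock);
		sched_yield();	/* give the canceller a chance at the lock */
	} while (!stop);

	return NULL;
}

int demo_poller_start(struct demo_poller *p)
{
	assert(p != NULL);
	p->stop = false;
	p->polls = 0;
	if (pthread_mutex_init(&p->lock, NULL) != 0)
		return -1;
	if (pthread_create(&p->thread, NULL, demo_poll_fn, p) != 0) {
		pthread_mutex_destroy(&p->lock);
		return -1;
	}
	return 0;
}

/*
 * Stop the poller. The flag is set under the lock, but the wait happens
 * only after the lock is dropped: joining while still holding it would
 * deadlock, because the work function needs the lock to see the flag.
 */
void demo_poller_stop(struct demo_poller *p)
{
	pthread_mutex_lock(&p->lock);
	p->stop = true;			/* "cancel": no further polls start */
	pthread_mutex_unlock(&p->lock);

	pthread_join(p->thread, NULL);	/* "flush": wait for the running one */
	pthread_mutex_destroy(&p->lock);
}
```

Moving the pthread_join() call above the unlock reproduces the deadlock the review warns about.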



Re: [PATCH 1/1] Char: moxa, fix and optimise empty timer

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 23:43:29 +0200 (CEST) Jiri Slaby <[EMAIL PROTECTED]> wrote:

> moxa, fix and optimise empty timer
> 
> don't wait and delete empty timer in empty timer function. Also fire next
> empty timer at rounded jiffies to save power.
> 

What is actually being "fixed" here?

> 
> ---
> commit da52793b6347e8b6b048526ce2422e29b20bb335
> tree ad0bee78e45beef89dc740f81e0606d782296542
> parent 3a69b463dcad1ff142f46e8fb74e7dc5a092eb60
> author Jiri Slaby <[EMAIL PROTECTED]> Wed, 25 Jul 2007 23:42:49 +0200
> committer Jiri Slaby <[EMAIL PROTECTED]> Wed, 25 Jul 2007 23:42:49 +0200
> 
>  drivers/char/moxa.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/moxa.c b/drivers/char/moxa.c
> index ed76f0a..5000b3b 100644
> --- a/drivers/char/moxa.c
> +++ b/drivers/char/moxa.c
> @@ -1040,14 +1040,14 @@ static void check_xmit_empty(unsigned long data)
>   struct moxa_port *ch;
>  
>   ch = (struct moxa_port *) data;
> - del_timer_sync(&moxa_ports[ch->port].emptyTimer);
>   if (ch->tty && (ch->statusflags & EMPTYWAIT)) {
>   if (MoxaPortTxQueue(ch->port) == 0) {
>   ch->statusflags &= ~EMPTYWAIT;
>   tty_wakeup(ch->tty);
>   return;
>   }
> - mod_timer(&moxa_ports[ch->port].emptyTimer, jiffies + HZ);
> + mod_timer(&moxa_ports[ch->port].emptyTimer,
> + round_jiffies(jiffies + HZ));
>   } else
>   ch->statusflags &= ~EMPTYWAIT;
>  }

A call to check_xmit_empty() used to guarantee that the timer was no longer
running (if the first `if' succeeds and the second does not), but that is
no longer the case.  You're sure this is correct?
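
For readers unfamiliar with the rounding used in the patch, the sketch below approximates what rounding a timer expiry to a whole second looks like. It is a simplification written for illustration, not the kernel's round_jiffies(): demo_round_ticks() and DEMO_HZ are hypothetical names, the quarter-second threshold is an assumption, and details such as per-CPU skew and counter wraparound are deliberately left out.

```c
#include <assert.h>

#define DEMO_HZ 1000UL	/* assumed tick rate; the kernel's HZ is configurable */

/*
 * Round an absolute tick count to a whole second. Values less than a
 * quarter second past a boundary round down, everything else rounds up.
 * If rounding would move the expiry to "now" or into the past, the
 * original value is kept.
 */
unsigned long demo_round_ticks(unsigned long expires, unsigned long now)
{
	unsigned long rem = expires % DEMO_HZ;
	unsigned long rounded;

	assert(expires >= now);

	if (rem < DEMO_HZ / 4)
		rounded = expires - rem;
	else
		rounded = expires - rem + DEMO_HZ;

	return rounded > now ? rounded : expires;
}
```

Timers that all expire on second boundaries tend to fire together, which is what lets an otherwise idle CPU wake up less often.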



Re: [GIT PATCH] ACPI patches for 2.6.23-rc1

2007-07-25 Thread Al Boldi
[EMAIL PROTECTED] wrote:
> On Thu, 26 Jul 2007, Len Brown wrote:
> > On Wednesday 25 July 2007 16:40, Al Boldi wrote:
> >> Linus Torvalds wrote:
> >>> On Wed, 25 Jul 2007, Len Brown wrote:
> >>>> git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git
> >>>> release
> >>>>
> >>>> Fixes regressions -- a build failure, an oops, some dmesg spam.
> >>>> Also fixes some D-state issues and adds ACPI module auto-loading.
> >>>> Yes, I'd hoped to get the last two in before rc1.
> >>>> I'm hopeful that a couple-days into rc2 is sufficiently early for
> >>>> them.
> >>>
> >>> I hate pulling this, but I did. However, what I hate even more after
> >>> having done so is that ACPI now seems to select CPU hotplug. Why?
> >>>
> >>> That is just *broken*. Sure, if you select STR or hibernation, we need
> >>> CPU hotplug,
> >>
> >> You are kidding, right?  CPU hotplug is broken big time; it kills a
> >> machine like virus-scanner.  I always turn it off as a rule.  And now
> >> you want STR/STD to be dependent on it?  Even on UP?  Why?
> >
> > CPU_HOTPLUG is needed to take the non-boot processors off-line before
> > the suspend, and to bring them on-line upon the resume.  If you have
> > specific problems with bringing logical processors offline and online,
> > then please speak up because many are depending on this functionality
> > working.
>
> nobody is arguing that CPU_HOTPLUG should not be a requirement for
> suspend, what we are questioning is why simply enabling ACPI should
> require CPU_HOTPLUG.
>
> not everyone who configures ACPI wants to use suspend (of any flavor)

Actually, I would go one step further and just rip out hotplug from the 
kernel proper, and let userland handle it.  Really, just like devfs, hotplug 
has no place in the kernel.


Thanks!

--
Al



Re: [PATCH] fix ppc kernels after build-id addition

2007-07-25 Thread Paul Mackerras
Meelis Roos writes:

> This patch fixes arch/ppc kernels, at least for prep subarch, after 
> build-id addition. Without the patch, kernels were 3 times the size and 
> bootloader refused to load them. Now they are back to normal again.

I just built an ARCH=ppc kernel for the prep subarch and the vmlinux
size was normal.  Can you identify exactly why the kernels got so much
bigger, i.e. what is taking up all the space?

Paul.


Re: [GIT PATCH] ACPI patches for 2.6.23-rc1

2007-07-25 Thread david

On Thu, 26 Jul 2007, Len Brown wrote:

> > > CONFIG_ACPI_SLEEP.  Not trivial for a user to select it
> > > when it doesn't even appear on the menu.  It doesn't appear
> > > because CONFIG_SUSPEND_SMP isn't enabled, but that doesn't
> > > appear either -- because CONFIG_HOTPLUG_CPU isn't selected.
> >
> > so have something like
> >
> > config ACPI_SLEEP
> >  select HOTPLUG_CPU if X86 && SMP
> >  select SUSPEND_SMP if X86 && SMP
> >
> > instead of making it dependent on ACPI.
>
> If more config options were better, then this
> would indeed be an improvement over 2.6.22.
> But more config options isn't better -- except for "some people":-)

coupling unrelated features together isn't better either.

David Lang


Re: -mm merge plans for 2.6.23

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 09:09:01 -0700
"Ray Lee" <[EMAIL PROTECTED]> wrote:

> No, there's a third case which I find the most annoying. I have
> multiple working sets, the sum of which won't fit into RAM. When I
> finish one, the kernel had time to preemptively swap back in the
> other, and yet it didn't. So, I sit around, twiddling my thumbs,
> waiting for my music player to come back to life, or thunderbird,
> or...

Yes, I'm thinking that's a good problem statement and it isn't something
which the kernel even vaguely attempts to address, apart from normal
demand paging.

We could perhaps improve things with larger and smarter fault readaround,
perhaps guided by refault-rate measurement.  But that's still demand-paged
rather than being proactive/predictive/whatever.

None of this is swap-specific though: exactly the same problem would need
to be solved for mmapped files and even plain old pagecache.

In fact I'd restate the problem as "system is in steady state A, then there
is a workload shift causing transition to state B, then the system goes
idle.  We now wish to reinstate state A in anticipation of a resumption of
the original workload".

swap-prefetch solves a part of that.

A complete solution for anon and file-backed memory could be implemented
(ta-da) in userspace using the kernel inspection tools in -mm's maps2-*
patches.  We would need to add a means by which userspace can repopulate
swapcache, but that doesn't sound too hard (especially when you haven't
thought about it).

And userspace can right now work out which pages from which files are in
pagecache so this application can handle pagecache, swap and file-backed
memory.  (file-backed memory might not even need special treatment, given
that it's pagecache anyway).


And userspace can do a much better implementation of this
how-to-handle-large-load-shifts problem, because it is really quite
complex.  The system needs to be monitored to determine what is the "usual"
state (ie: the thing we wish to reestablish when the transient workload
subsides).  The system then needs to be monitored to determine when the
exceptional workload has started, and when it has subsided, and userspace
then needs to decide when to start reestablishing the old working set, at
what rate, when to abort doing that, etc.

All this would end up needing runtime configurability and tweakability and
customisability.  All standard fare for userspace stuff - much easier than
patching the kernel.


So.  We can

a) provide a way for userspace to reload pagecache and

b) merge maps2 (once it's finished) (pokes mpm)

and we're done?
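
As a rough illustration of the userspace side of this proposal, the sketch below uses two existing Linux/POSIX calls: mincore() to see which pages of a file are resident in pagecache, and posix_fadvise() with POSIX_FADV_WILLNEED to ask for them to be read back in. The demo_* function names are hypothetical. It covers file-backed pagecache only; nothing here repopulates swapcache, which the message above says would need a new interface. Newer kernels may also report residency conservatively for files the caller cannot write, so the counts should be treated as a hint.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for mincore() on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Count how many pages of 'path' are currently resident in pagecache.
 * Returns the resident count and stores the file's page count in *total,
 * or returns -1 on error (including an empty file, which cannot be mapped).
 */
long demo_resident_pages(const char *path, long *total)
{
	struct stat st;
	unsigned char *vec;
	void *map;
	long page_size, pages, resident = -1;
	int fd;

	assert(path != NULL && total != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return -1;
	}

	page_size = sysconf(_SC_PAGESIZE);
	pages = (long)((st.st_size + page_size - 1) / page_size);

	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);			/* the mapping keeps the file referenced */
	if (map == MAP_FAILED)
		return -1;

	vec = malloc((size_t)pages);	/* one status byte per page */
	if (vec != NULL && mincore(map, (size_t)st.st_size, vec) == 0) {
		long i;

		resident = 0;
		for (i = 0; i < pages; i++)
			resident += vec[i] & 1;	/* bit 0 set: page is resident */
		*total = pages;
	}
	free(vec);
	munmap(map, (size_t)st.st_size);
	return resident;
}

/* Ask the kernel to start reading the whole file back into pagecache. */
int demo_prefetch_file(const char *path)
{
	int fd, err;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);	/* len 0: to EOF */
	close(fd);
	return err == 0 ? 0 : -1;
}
```

A monitoring daemon along the lines described above would record residency while the system is in its usual state and call the prefetch helper once the transient workload has subsided; deciding when that is remains the hard part.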


Re: [GIT PATCH] ACPI patches for 2.6.23-rc1

2007-07-25 Thread Len Brown
On Wednesday 25 July 2007 22:20, [EMAIL PROTECTED] wrote:
> On Wed, 25 Jul 2007, Len Brown wrote:
> 
> > On Wednesday 25 July 2007 14:48, Linus Torvalds wrote:
> >
> >> ... ACPI now seems to select CPU hotplug. Why?
> >
> > ACPI=y SMP=y systems require SUSPEND_SMP=y for system sleep support,
> > and that requires HOTPLUG_CPU=y.
> >
> > Note that ACPI=y SMP=n systems do not need it,
> > and thus will not select HOTPLUG_CPU=y
> >
> >> That is just *broken*. Sure, if you select STR or hibernation, we need CPU
> >> hotplug, but just for picking ACPI? Why?
> >
> > My assumption is that if somebody selects CONFIG_ACPI,
> > that 99% of the time, they intend that to include support for
> > the ACPI hooks for system sleep states.
> >
> > Conversely, supporting the 1% of people who don't want it
> > isn't worth messing with the 99% who do, nor is
> > the burden of yet another config option to maintain and
> > #ifdefs in the code.
> 
> so you are saying that you know better than we do what we need?

Feel free to share what you know about the benefits vs. the costs
of maintaining CONFIG_ACPI_SLEEP as a build option.

> some people configure ACPI only because their system won't work properly 
> without it. they have no intention of ever doing a STR or hibernate.

I agree.

Further, I expect that 100% - (some people %) = 99%

If you feel that your system has been degraded
because it now includes what used to be excluded under
CONFIG_ACPI_SLEEP=n, please let me know how.

> > On UP, they'd get ACPI system sleep support 100% of the time
> > by default, but on SMP this option had become problematic.
> >
> > We used to have this:
> >
> > if ACPI
> > ...
> > config ACPI_SLEEP
> >bool "Sleep States"
> >depends on X86 && (!SMP || SUSPEND_SMP)
> >depends on PM
> >default y
> >
> > So the poster-child failure was i386/defconfig itself...
> > It couldn't support suspend to RAM because it didn't include
> > CONFIG_ACPI_SLEEP.  Not trivial for a user to select it
> > when it doesn't even appear on the menu.  It doesn't appear
> > because CONFIG_SUSPEND_SMP isn't enabled, but that doesn't
> > appear either -- because CONFIG_HOTPLUG_CPU isn't selected.
> 
> so have something like
> 
> config ACPI_SLEEP
>  select HOTPLUG_CPU if X86 && SMP
>  select SUSPEND_SMP if X86 && SMP
> 
> instead of making it dependent on ACPI.

If more config options were better, then this
would indeed be an improvement over 2.6.22.
But more config options isn't better -- except for "some people":-)

thanks.
-Len

> > Most users don't want that.
> >
> > So today we have this:
> >
> > menuconfig ACPI
> > ...
> >select HOTPLUG_CPU if X86 && SMP
> >select SUSPEND_SMP if X86 && SMP
> >
> > Which I think leads to fewer surprises, and less complicated code.
> > (even though using select itself is fraught with peril:-)
> >
> > thanks,
> > -Len
> >


Re: [PATCH]gx-suspmod.c use boot_cpu_data instead of current_cpu_data

2007-07-25 Thread Andrew Morton
On Thu, 26 Jul 2007 04:20:10 + "Dave Young" <[EMAIL PROTECTED]> wrote:

> >On 7/25/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 25 Jul 2007 14:19:05 + Dave Young <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > > in preemptible kernel will report BUG: using smp_processor_id() in 
> > > preemptible, so use boot_cpu_data instead of current_cpu_data.
> > >
> > > Signed-off-by: Dave Young <[EMAIL PROTECTED]>
> > >
> > > ---
> > > arch/i386/kernel/cpu/cpufreq/gx-suspmod.c |4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff -pur linux/arch/i386/kernel/cpu/cpufreq/gx-suspmod.c 
> > > linux.new/arch/i386/kernel/cpu/cpufreq/gx-suspmod.c
> > > --- linux/arch/i386/kernel/cpu/cpufreq/gx-suspmod.c   2007-07-25 
> > > 14:11:06.0 +
> > > +++ linux.new/arch/i386/kernel/cpu/cpufreq/gx-suspmod.c   2007-07-25 
> > > 13:57:29.0 +
> > > @@ -181,8 +181,8 @@ static __init struct pci_dev *gx_detect_
> > >   struct pci_dev *gx_pci = NULL;
> > >
> > >   /* check if CPU is a MediaGX or a Geode. */
> > > - if ((current_cpu_data.x86_vendor != X86_VENDOR_NSC) &&
> > > - (current_cpu_data.x86_vendor != X86_VENDOR_CYRIX)) {
> > > + if ((boot_cpu_data.x86_vendor != X86_VENDOR_NSC) &&
> > > + (boot_cpu_data.x86_vendor != X86_VENDOR_CYRIX)) {
> > >   dprintk("error: no MediaGX/Geode processor found!\n");
> > >   return NULL;
> > >   }
> >
> > um, I suspect it really wants to get at the current CPU.  But putting a
> > preempt_disable() around just that code is meaningless: the current CPU
> > could change immediately before or after the code block. It needs deeper
> > fixing, methinks.
> The only goal is to read the CPU vendor, so boot_cpu_data is sufficient;
> drivers/mtd/nand/cs553x_nand.c does the same.

I think there's some vague ambition in there to support non-identical CPUs.
In which case reading from the local CPU would make more sense.  (waves
frantically at cpufreq developers).

otoh, it'll take some work I suspect.  It'll need to sort out the overall
scope of "local cpu".  At what point and for how long should this code pin
itself on a cpu?



Re: [PATCH]gx-suspmod.c use boot_cpu_data instead of current_cpu_data

2007-07-25 Thread Dave Young
>On 7/25/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 25 Jul 2007 14:19:05 + Dave Young <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > in preemptible kernel will report BUG: using smp_processor_id() in 
> > preemptible, so use boot_cpu_data instead of current_cpu_data.
> >
> > Signed-off-by: Dave Young <[EMAIL PROTECTED]>
> >
> > ---
> > arch/i386/kernel/cpu/cpufreq/gx-suspmod.c |4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff -pur linux/arch/i386/kernel/cpu/cpufreq/gx-suspmod.c 
> > linux.new/arch/i386/kernel/cpu/cpufreq/gx-suspmod.c
> > --- linux/arch/i386/kernel/cpu/cpufreq/gx-suspmod.c   2007-07-25 
> > 14:11:06.0 +
> > +++ linux.new/arch/i386/kernel/cpu/cpufreq/gx-suspmod.c   2007-07-25 
> > 13:57:29.0 +
> > @@ -181,8 +181,8 @@ static __init struct pci_dev *gx_detect_
> >   struct pci_dev *gx_pci = NULL;
> >
> >   /* check if CPU is a MediaGX or a Geode. */
> > - if ((current_cpu_data.x86_vendor != X86_VENDOR_NSC) &&
> > - (current_cpu_data.x86_vendor != X86_VENDOR_CYRIX)) {
> > + if ((boot_cpu_data.x86_vendor != X86_VENDOR_NSC) &&
> > + (boot_cpu_data.x86_vendor != X86_VENDOR_CYRIX)) {
> >   dprintk("error: no MediaGX/Geode processor found!\n");
> >   return NULL;
> >   }
>
> um, I suspect it really wants to get at the current CPU.  But putting a
> preempt_disable() around just that code is meaningless: the current CPU
> could change immediately before or after the code block. It needs deeper
> fixing, methinks.
The only goal is to read the CPU vendor, so boot_cpu_data is sufficient;
drivers/mtd/nand/cs553x_nand.c does the same.


Re: i386-show-unhandled-signals-v3

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 16:40:06 -0700 [EMAIL PROTECTED] (Masoud Asgharifard Sharbiani) wrote:

> This patch makes the i386 behave the same way that x86_64 does when a
> segfault happens. A line gets printed to the kernel log so that tools
> that need to check for failures can behave more uniformly between
> different kernels. Like x86_64, it can be disabled by setting
> debug.show_unhandled_signals sysctl variable to 0 (or by doing
> echo 0 > /proc/sys/debug/show_unhandled_signals)

Is that still correct?  Methinks /proc/sys/debug/exception-trace.





> Also, all of the lines being printed are now using printk_ratelimit()
> to deny the ability of DoS from a local user with a program like the
> following:
> main()
> {
>while (1)
>if (!fork()) *(int *)0 = 0;
> }

yup.
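
For readers who have not looked at how a limiter like this blunts the fork
loop quoted above, here is a minimal token-bucket sketch in userspace C. It
is an assumption-laden illustration, not the kernel's printk_ratelimit()
code: the struct, the function names and the "ticks" time base are all made
up for the example.

```c
#include <stdbool.h>

/* Hypothetical limiter state; time is passed in as "ticks" so the
 * behaviour is deterministic. */
struct ratelimit {
	unsigned long interval;	/* ticks needed to earn one message */
	unsigned long burst;	/* how many messages can be saved up */
	unsigned long tokens;	/* credit, measured in ticks */
	unsigned long last;	/* tick of the previous call */
};

static void ratelimit_init(struct ratelimit *rl, unsigned long interval,
			   unsigned long burst, unsigned long now)
{
	rl->interval = interval;
	rl->burst = burst;
	rl->tokens = interval * burst;	/* start with a full bucket */
	rl->last = now;
}

/* Returns true if the caller may print, false if the line is dropped. */
static bool ratelimit_ok(struct ratelimit *rl, unsigned long now)
{
	unsigned long max = rl->interval * rl->burst;

	rl->tokens += now - rl->last;	/* earn credit for elapsed time */
	rl->last = now;
	if (rl->tokens > max)
		rl->tokens = max;
	if (rl->tokens < rl->interval)
		return false;		/* bucket empty: drop the message */
	rl->tokens -= rl->interval;
	return true;
}
```

With a burst of 3 and an interval of 5 ticks, a process that faults in a
tight loop gets three lines out and is then held to one line per 5 ticks,
however fast it forks.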


Re: [GIT PATCH] ACPI patches for 2.6.23-rc1

2007-07-25 Thread david

On Thu, 26 Jul 2007, Len Brown wrote:


On Wednesday 25 July 2007 16:40, Al Boldi wrote:

Linus Torvalds wrote:

On Wed, 25 Jul 2007, Len Brown wrote:

git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git
release

Fixes regressions -- a build failure, an oops, some dmesg spam.
Also fixes some D-state issues and adds ACPI module auto-loading.
Yes, I'd hoped to get the last two in before rc1.
I'm hopeful that a couple-days into rc2 is sufficiently early for them.


I hate pulling this, but I did. However, what I hate even more after
having done so is that ACPI now seems to select CPU hotplug. Why?

That is just *broken*. Sure, if you select STR or hibernation, we need CPU
hotplug,


You are kidding, right?  CPU hotplug is broken big time; it kills a machine
like a virus scanner.  I always turn it off as a rule.  And now you want
STR/STD to be dependent on it?  Even on UP?  Why?


CPU_HOTPLUG is needed to take the non-boot processors off-line before the 
suspend,
and to bring them on-line upon the resume.  If you have specific problems
with bringing logical processors offline and online, then please speak up
because many are depending on this functionality working.


nobody is arguing that CPU_HOTPLUG should not be a requirement for 
suspend, what we are questioning is why simply enabling ACPI should 
require CPU_HOTPLUG.


not everyone who configures ACPI wants to use suspend (of any flavor)

David Lang


Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-25 Thread Jeff Garzik

Bartlomiej Zolnierkiewicz wrote:

On Wednesday 25 July 2007, Ingo Molnar wrote:
you don't _have to_ cooperate with the maintainer, but it's certainly 
useful to work with good maintainers, if your goal is to improve Linux. 
Or if for some reason communication is not working out fine then grow 
into the job and replace the maintainer by doing a better job.


The idea of growing into the job and replacing the maintainer by proving
that you are doing a better job was viable a few years ago but may not be
feasible today.


IMO...  Tejun is an excellent counter-example.  He showed up as an 
independent developer, put a bunch of his own spare time and energy into 
the codebase, and is probably libata's main engineer (in terms of code 
output) today.  If I get hit by a bus tomorrow, I think the Linux 
community would be quite happy with him as the libata maintainer.




Another problem is that sometimes it seems that independent developers
have to jump through more hoops than enterprise ones, and it is a really
frustrating experience for them.  There is no conspiracy here - it is only
the natural mechanism of trusting more the code of people you work with more.


I think Tejun is a counter-example here too :)  Everyone's experience is 
different, but from my perspective, Tejun "appeared out of nowhere" 
producing good code, and so, it got merged rapidly.


Personally, for merging code, I tend to trust people who are most in 
tune with "the Linux Way(tm)."  It is hard to quantify, but quite often, 
independent developers "get it" when enterprise developers do not.




Now could I ask people to stop all these -ck threads and give the developers
involved in the recent events some time to calmly rethink the whole case?


Indeed...

Jeff




Re: [GIT PATCH] ACPI patches for 2.6.23-rc1

2007-07-25 Thread Len Brown
On Wednesday 25 July 2007 16:40, Al Boldi wrote:
> Linus Torvalds wrote:
> > On Wed, 25 Jul 2007, Len Brown wrote:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git
> > > release
> > >
> > > Fixes regressions -- a build failure, an oops, some dmesg spam.
> > > Also fixes some D-state issues and adds ACPI module auto-loading.
> > > Yes, I'd hoped to get the last two in before rc1.
> > > I'm hopeful that a couple-days into rc2 is sufficiently early for them.
> >
> > I hate pulling this, but I did. However, what I hate even more after
> > having done so is that ACPI now seems to select CPU hotplug. Why?
> >
> > That is just *broken*. Sure, if you select STR or hibernation, we need CPU
> > hotplug,
> 
> You are kidding, right?  CPU hotplug is broken big time; it kills a machine 
> like a virus scanner.  I always turn it off as a rule.  And now you want 
> STR/STD to be dependent on it?  Even on UP?  Why?

CPU_HOTPLUG is needed to take the non-boot processors off-line before the 
suspend,
and to bring them on-line upon the resume.  If you have specific problems
with bringing logical processors offline and online, then please speak up
because many are depending on this functionality working.

SMP system sleep support has always depended on CPU_HOTPLUG, per above.

No, uniprocessor system sleep does not depend on CPU_HOTPLUG.

-Len


updatedb

2007-07-25 Thread Rene Herman

On 07/25/2007 07:15 PM, Robert Deaton wrote:


On 7/25/07, Rene Herman <[EMAIL PROTECTED]> wrote:


And there we go again -- off into blabber-land. Why does swap-prefetch 
help updatedb? Or doesn't it? And if it doesn't, why should anyone 
trust anything else someone who said it does says?


I don't think anyone has ever argued that swap-prefetch directly helps 
the performance of updatedb in any way


People have argued (claimed, rather) that swap-prefetch helps their system 
after updatedb has run -- you are doing so now.



however, I do recall people mentioning that updatedb, being a ram
intensive task, will often cause things to be swapped out while it runs
on say a nightly cronjob.


Problem spot no. 1.

RAM intensive? If I run updatedb here, it never grows itself beyond 2M. Yes, 
two. I'm certainly willing to accept that me and my systems are possibly not 
the reference but assuming I'm _very_ special hasn't done much for me either 
in the past.


The thing updatedb does do, or at least has the potential to do, is fill 
memory with cached inodes/dentries but Linux does not swap to make room for 
caches. So why will updatedb "often cause things to be swapped out"?


[ snip ]


Swap prefetch, on the other hand, would have kicked in shortly after
updatedb finished, leaving the applications in swap for a speedy
recovery when the person comes back to their computer.


Problem spot no. 2.

If updatedb filled all of RAM with inodes/dentries, that RAM is now used 
(ie, not free) and swap-prefetch wouldn't have anywhere to prefetch into so 
would _not_ have kicked in.


So what's happening? If you sit down with a copy of "top" in one terminal 
and updatedb in another, what does it show?


Rene.



Re: [DRIVER SUBMISSION] DRBD wants to go mainline

2007-07-25 Thread Kyle Moffett

On Jul 25, 2007, at 22:03:37, [EMAIL PROTECTED] wrote:

On Wed, 25 Jul 2007, Satyam Sharma wrote:

On 7/25/07, Lars Ellenberg <[EMAIL PROTECTED]> wrote:

On Wed, Jul 25, 2007 at 04:41:53AM +0530, Satyam Sharma wrote:

[...]
But where does the "send" come into the picture over here -- a
send won't block forever, so I don't foresee any issues
whatsoever w.r.t. kthreads conversion for that. [ BTW I hope
you're *not* using any signals-based interface for your kernel
thread _at all_. Kthreads disallow (ignore) all signals by
default, as they should, and you really shouldn't need to write
any logic to handle or do-certain-things-on-seeing a signal
in a well designed kernel thread. ]

and the sending latency is crucial to performance, while the
recv will not timeout for the next few seconds.

Again, I don't see what sending latency has to do with a
kernel_thread to kthread conversion. Or with signals, for that
matter. Anyway, as Kyle Moffett mentioned elsewhere, you could
probably look at other examples (say
cifs_demultiplexer_thread() in fs/cifs/connect.c).


the basic problem, and what we use signals for, is:  it is  
waiting in recv, waiting for the peer to say something.  but I  
want it to stop recv, and go send something "right now".


That's ... weird. Most (all?) communication between any two  
parties would follow a protocol where someone recv's stuff, does  
something with it, and sends it back ... what would you send  
"right now" if you didn't receive anything?


because even though you didn't receive anything you now have  
something important to send.


remember that both sides can be sitting in receive mode. this puts  
them both in a position to respond to the other if the other has  
something to say.


Why not just have 2 threads, one for "sending" and one for  
"receiving"?  When your receiving thread gets data it takes  
appropriate locks and processes it, then releases the locks and goes  
back to waiting for packets.  Your sending thread would take  
appropriate locks, generate data to send, release locks, and transmit  
packets.  You don't have to interrupt the receive thread to send  
packets, so where's the latency problem, exactly?


If I were writing that in userspace I would have:

(A) The pool of IO-generating threads (IE: What would ordinarily be  
userspace)

(B) One or a small number of data-reception threads.
(C) One or a small number of data-transmission threads.

When you get packets to process in your network-reception thread(s),  
you queue appropriate disk IOs and any appropriate responses with  
your transmission thread(s).  You can basically just sit in a loop on  
tcp_recvmsg=>demultiplex=>do-stuff.  When your IO-generators actually  
make stuff to send you queue such data for disk IO, then packetize it  
and hand it off to your data-transmission threads.


If you made all your sockets and inter-thread pipes nonblocking then  
in userspace you would just epoll_wait() on the sockets and pipes and  
be easily able to react to any IO from anywhere.


In kernel space there are similar nonblocking interfaces, although it  
would probably be easier just to use a couple threads.
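
To make the nonblocking variant above a bit more concrete, here is a small
userspace sketch of "wait on the peer socket and on a wake-up pipe at the
same time". It uses poll() rather than epoll_wait() purely to keep the
example short; the function name, the enum and the one-byte wake-up
convention are assumptions for illustration, not anything taken from DRBD.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

enum wake_reason {
	WAKE_ERROR = -1,
	WAKE_TIMEOUT = 0,
	WAKE_RECV = 1,	/* the peer said something */
	WAKE_SEND = 2,	/* someone local has data to send right now */
};

/*
 * Sleep until either the peer socket is readable or another thread has
 * written a byte to the wake-up pipe to say "there is something to send".
 */
static enum wake_reason wait_for_work(int sock_fd, int wake_fd, int timeout_ms)
{
	struct pollfd fds[2] = {
		{ .fd = sock_fd, .events = POLLIN },
		{ .fd = wake_fd, .events = POLLIN },
	};
	int n = poll(fds, 2, timeout_ms);

	if (n < 0)
		return WAKE_ERROR;
	if (n == 0)
		return WAKE_TIMEOUT;
	/* Sending latency matters most, so look at the wake-up pipe first. */
	if (fds[1].revents & POLLIN) {
		char c;

		if (read(wake_fd, &c, 1) != 1)
			return WAKE_ERROR;
		return WAKE_SEND;
	}
	if (fds[0].revents & POLLIN)
		return WAKE_RECV;
	return WAKE_ERROR;
}
```

The point is only that nothing has to be interrupted with a signal: the
thread that wants data sent writes one byte to the pipe, and the waiter
returns from poll() straight away.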


Cheers,
Kyle Moffett



[PATCH]MTD:Fix ctrl-alt-del cann't reboot for intel flash bug

2007-07-25 Thread Kevin Hao

hi,

 When we press ctrl-alt-del, kernel_restart_prepare will invoke
cfi_intelext_reboot, which sets the flash to read array mode. But later,
when device_shutdown is invoked, it may put the current work queue to
sleep, and another process may be scheduled to run and program the flash,
taking it out of FL_READY mode again. So we can't boot up if this flash
is used for the bootloader. This patch is against the current Linus git tree.
  Sorry, my English is a little rusty. :-)
  Any comments are appreciated.

diff --git a/drivers/mtd/chips/cfi_cmdset_0001.c b/drivers/mtd/chips/cfi_cmdset_0001.c
index 2f19fa7..a1009df 100644
--- a/drivers/mtd/chips/cfi_cmdset_0001.c
+++ b/drivers/mtd/chips/cfi_cmdset_0001.c
@@ -653,7 +653,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
 resettime:
   timeo = jiffies + HZ;
 retry:
-   if (chip->priv && (mode == FL_WRITING || mode == FL_ERASING || mode == FL_OTP_WRITE)) {
+   if (chip->priv && (mode == FL_WRITING || mode == FL_ERASING || mode == FL_OTP_WRITE || mode == FL_SHUTDOWN)) {
   /*
* OK. We have possibility for contension on the write/erase
* operations which are global to the real chip and not per
@@ -798,6 +798,9 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
   if (mode == FL_READY && chip->oldstate == FL_READY)
   return 0;

+   case FL_SHUTDOWN:
+   /* The machine is rebooting now,so no one can get chip anymore */
+   return -EIO;
   default:
   sleep:
   set_current_state(TASK_UNINTERRUPTIBLE);
@@ -2402,10 +2405,10 @@ static int cfi_intelext_reset(struct mtd_info *mtd)
  and switch to array mode so any bootloader in
  flash is accessible for soft reboot. */
   spin_lock(chip->mutex);
-   ret = get_chip(map, chip, chip->start, FL_SYNCING);
+   ret = get_chip(map, chip, chip->start, FL_SHUTDOWN);
   if (!ret) {
   map_write(map, CMD(0xff), chip->start);
-   chip->state = FL_READY;
+   chip->state = FL_SHUTDOWN;
   }
   spin_unlock(chip->mutex);
   }
diff --git a/include/linux/mtd/flashchip.h b/include/linux/mtd/flashchip.h
index a293a3b..39e7d2a 100644
--- a/include/linux/mtd/flashchip.h
+++ b/include/linux/mtd/flashchip.h
@@ -40,6 +40,7 @@ typedef enum {
   FL_POINT,
   FL_XIP_WHILE_ERASING,
   FL_XIP_WHILE_WRITING,
+   FL_SHUTDOWN,
   FL_UNKNOWN
} flstate_t;
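
The idea behind the patch can be shown with a stripped-down state machine.
This is only a sketch under simplifying assumptions: the three-state enum,
the one-field struct flchip and the chip_reboot() helper are cut-down
stand-ins for illustration, and the real get_chip() sleeps and retries
where this one returns -EAGAIN.

```c
#include <errno.h>

typedef enum { FL_READY, FL_WRITING, FL_SHUTDOWN } flstate_t;

struct flchip {
	flstate_t state;
};

/* Simplified: hand out the chip only if it is idle and not shut down. */
static int get_chip(struct flchip *chip, flstate_t mode)
{
	if (chip->state == FL_SHUTDOWN)
		return -EIO;	/* rebooting: nobody gets the chip again */
	if (chip->state != FL_READY)
		return -EAGAIN;	/* the real driver would sleep and retry */
	chip->state = mode;
	return 0;
}

/* Reboot hook: take the chip and leave it parked in FL_SHUTDOWN. */
static int chip_reboot(struct flchip *chip)
{
	/* real code would also switch the flash to read-array mode here */
	return get_chip(chip, FL_SHUTDOWN);
}
```

Once chip_reboot() has run, any writer that gets scheduled later sees -EIO
instead of being allowed to put the flash back into a non-array mode.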


Re: 2.6.23-rc1-mm1 - seems OK on Dell Latitude D820, except for tpm_tis

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 18:03:14 -0400 [EMAIL PROTECTED] wrote:

> On Wed, 25 Jul 2007 04:03:04 PDT, Andrew Morton said:
> 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm1/
> 
> It built and booted on the first try for my Dell Latitude D820 laptop, Core2
> T7200 x86_64 kernel. Now at about 5 hours of uptime. I guess I got lucky and
> my stock .config doesn't trip over any of the issues others are hitting.  I
> did hit *one* problem:
> 
> Under 2.6.22-rc6-mm1, 'modprobe tpm_tis' did this:
> 
> [   10.028000] tpm_tis 00:0f: 1.2 TPM (device-id 0x1001, rev-id 2)
> [   10.088000] tpm_tis 00:0f: Unable to request irq: 8 for probe
> 
> and the modprobe returned immediately.  Under 23-rc1-mm1, the modprobe
> takes a *long* time:
> 
> [   23.787331] tpm_tis 00:0f: 1.2 TPM (device-id 0x1001, rev-id 2)
> [   23.787353] tpm0 (IRQ 3) handled a spurious interrupt
> [  143.803891] tpm_tis 00:0f: tpm_transmit: tpm_send: error -62
> [  143.803920] tpm0 (IRQ 3) handled a spurious interrupt
> [  263.736163] tpm_tis 00:0f: tpm_transmit: tpm_send: error -62
> [  383.668381] tpm_tis 00:0f: tpm_transmit: tpm_send: error -62
> [  385.667261] tpm_tis 00:0f: tpm_transmit: tpm_send: error -62
> 
> then it finally returns.  I snuck in a few 'echo t > /proc/sysrq-trigger',
> and it's always waiting here:
> 
> [  193.154317] modprobe  S 001eeae8252d  5488  1446  1
> [  193.154321]  8100043d7a98 0082  
> 80795480
> [  193.154325]  8100043d7a48 810003538000 806813e0 
> 810003538290
> [  193.154329]  04173800 0202 00ff 
> 8023ba74
> [  193.154332] Call Trace:
> [  193.154336]  [] __mod_timer+0xc4/0xd6
> [  193.154340]  [] schedule_timeout+0x8d/0xb4
> [  193.154344]  [] process_timeout+0x0/0xb
> [  193.154347]  [] schedule_timeout+0x88/0xb4
> [  193.154353]  [] :tpm_tis:wait_for_stat+0xb0/0x11a
> [  193.154356]  [] autoremove_wake_function+0x0/0x38
> [  193.154360]  [] :tpm_tis:get_burstcount+0x63/0x8d
> [  193.154365]  [] :tpm_tis:tpm_tis_send+0x191/0x1d3
> [  193.154370]  [] tpm_transmit+0x98/0x1f1
> [  193.154374]  [] transmit_cmd+0x14/0x2e
> [  193.154377]  [] tpm_get_timeouts+0xe9/0x13c
> [  193.154382]  [] :tpm_tis:tpm_tis_init+0x405/0x44c
> [  193.154387]  [] :tpm_tis:tpm_tis_pnp_init+0x2e/0x30
> [  193.154391]  [] pnp_device_probe+0x7b/0xa3
> [  193.154394]  [] driver_probe_device+0xfa/0x17e
> [  193.154397]  [] __driver_attach+0x0/0x94
> [  193.154400]  [] __driver_attach+0x5b/0x94
> [  193.154403]  [] bus_for_each_dev+0x49/0x7a
> [  193.154408]  [] driver_attach+0x1c/0x1e
> [  193.154410]  [] bus_add_driver+0x86/0x1a9
> [  193.154414]  [] driver_register+0x72/0x76
> [  193.154417]  [] pnp_register_driver+0x1c/0x1e
> [  193.154421]  [] :tpm_tis:init_tis+0x81/0x89
> [  193.154425]  [] sys_init_module+0x14db/0x1657
> [  193.154433]  [] system_call+0x7e/0x83
> 
> Here's my /proc/interrupts:
> 
>CPU0   CPU1   
>   0:   15121156  0   IO-APIC-edge  timer
>   1:   3022  0   IO-APIC-edge  i8042
>   3:  0  0   IO-APIC-edge  tpm0
>   8:  0  0   IO-APIC-edge  rtc
>   9:  2  0   IO-APIC-fasteoi   acpi
>  12: 85  0   IO-APIC-edge  i8042
>  14:  46529  0   IO-APIC-edge  libata
>  15: 94  0   IO-APIC-edge  libata
>  16:1301032  0   IO-APIC-fasteoi   nvidia
>  17:148  0   IO-APIC-fasteoi   iwl3945
>  19: 10  0   IO-APIC-fasteoi   ohci1394, yenta
>  20:   9548  0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
>  21:  10544  0   IO-APIC-fasteoi   uhci_hcd:usb3, HDA Intel
>  22:  0  0   IO-APIC-fasteoi   uhci_hcd:usb4
>  23:  0  0   IO-APIC-fasteoi   uhci_hcd:usb5
> 506:111  68979   PCI-MSI-edge  eth0
> NMI:  0  0 
> LOC:   15121050   15120952 
> ERR:  0
> 
> (tpm0 on 3 is new with 23-rc1-mm1.  I have *no* idea why 22-rc6-mm1 tried to
> put it on irq 8. Yes, kernel is currently tainted with nvidia module, but
> at the time of the traceback listed above, I had not yet even gotten the
> module built, so the kernel was untainted at that point).
> 

I can't imagine what we did to break tpm_tis, sorry.  Nothing has changed
in there for ages.

Perhaps something broke at the bus level.  It would be useful to add

--- a/drivers/char/tpm/tpm_tis.c~a
+++ a/drivers/char/tpm/tpm_tis.c
@@ -595,9 +595,11 @@ static int __devinit tpm_tis_pnp_init(st
  const struct pnp_device_id *pnp_id)
 {
resource_size_t start, len;
+
start = pnp_mem_start(pnp_dev, 0);
len = pnp_mem_len(pnp_dev, 0);
-
+   printk("%s: start=%llu, len=%llu\n", __FUNCTION__,
+   (unsigned long long)start, (unsigned long long)len);
return tpm_tis_init(&pnp_dev->dev, start, len);
 }
 

Re: [PATCH 1/7] lguest: documentation pt I: Preparation

2007-07-25 Thread Rusty Russell
On Wed, 2007-07-25 at 18:22 -0400, Rob Landley wrote:
> On Monday 23 July 2007 9:01:48 pm Rusty Russell wrote:
> > > IOW, I'd be interested in hearing Rob and Randy's opinions on it all,
> > > please.
> >
> > So they can see what we're talking about, here's an example of the
> > output:
> >
> > http://lguest.ozlabs.org/lguest-journey.c.bz2
> 
> Er, so you read the readme, and then you type "make Preparation!" (which I 
> wouldn't have guessed from the comment at the end of the readme), and it 
> spits this to stdout.

Hi Rob!

I'm going to ask an odd thing.  I *don't* think this should be part of
the generally-available kernel documentation.  I'm sure you can see
several reasons for this, but I'll spell them out for posterity.

People reading the lguest *code* documentation should feel a sense of
achievement.  It starts with a slight puzzle, works its way through deep
details of code, and emerges with the reader feeling confident enough to
start hacking on it.

Thanks for your understanding!
Rusty.




Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-25 Thread Bjorn Helgaas
On Wednesday 25 July 2007 08:21:06 pm Shaohua Li wrote:
> On Wed, 2007-07-25 at 17:37 -0700, Yinghai Lu wrote:
> > On 7/25/07, Bjorn Helgaas <[EMAIL PROTECTED]> wrote:
> > > Yinghai, you mentioned the same issue on boxes with multiple root
> > > bridges.  Any chance you could try this out there as well?
> > >
> > it doesn't solve pci_root_bus reverse problem.
> > 
> > is that too late for PNP0A03?
> > 
> > I wonder if we need to modify acpi_device_register to sort them.
> The pci root driver is an acpi driver not a pnp driver, so Bjorn's patch
> will not work.

Right, I forgot that the PCI root driver is an ACPI driver.

This is a longer-term idea, but the way I'd like to solve this is
by converting ACPI drivers into PNP drivers.  Then the PNPACPI sort
would work for all of them.  PNP would provide a hook to retrieve
the ACPI handle corresponding to a PNP device.

> Maybe the ACPI core (ACPICA) should do the sort? 

I thought about that, but I didn't see a nice way to do it.  The
current namespace interfaces like acpi_get_devices() are walk-
oriented -- they call a callback function for every node that
meets some criteria.  Sorting requires some sort of buffer so
you can look at all the matching nodes before returning any of
them.
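
A small sketch of the "buffer first, then sort" shape described above. The
node type, the walker and the uid sort key are all hypothetical stand-ins
for the example; none of this is an ACPICA interface.

```c
#include <stdlib.h>

/* Hypothetical node type and callback-style walker. */
struct node {
	int uid;
};
typedef void (*walk_cb)(const struct node *n, void *ctx);

static void walk_nodes(const struct node *nodes, size_t count,
		       walk_cb cb, void *ctx)
{
	for (size_t i = 0; i < count; i++)
		cb(&nodes[i], ctx);
}

struct collect_buf {
	const struct node *items[16];
	size_t used;
};

/* Callback: remember the node instead of acting on it right away. */
static void collect_cb(const struct node *n, void *ctx)
{
	struct collect_buf *buf = ctx;

	if (buf->used < sizeof(buf->items) / sizeof(buf->items[0]))
		buf->items[buf->used++] = n;
}

static int cmp_uid(const void *a, const void *b)
{
	const struct node *x = *(const struct node *const *)a;
	const struct node *y = *(const struct node *const *)b;

	return (x->uid > y->uid) - (x->uid < y->uid);
}

/* Walk first, sort the buffered matches, and only then hand them out. */
static size_t walk_sorted(const struct node *nodes, size_t count,
			  struct collect_buf *buf)
{
	buf->used = 0;
	walk_nodes(nodes, count, collect_cb, buf);
	qsort(buf->items, buf->used, sizeof(buf->items[0]), cmp_uid);
	return buf->used;
}
```

The extra buffer is exactly the awkward part: a pure callback walker can
hand back matches one at a time with no storage, while a sorted walk has to
see every match before it can return the first one.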

Bjorn


Minor errors in 2.6.23-rc1-rt2 series

2007-07-25 Thread Peter Williams
I've just been reviewing these patches and have spotted a couple of
errors that look like they were caused by fuzz during the patch process.

A patch that corrects the errors is attached.

Cheers
Peter
-- 
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

diff -r e02fd64426b9 arch/i386/boot/compressed/Makefile
--- a/arch/i386/boot/compressed/Makefile   Thu Jul 26 10:33:58 2007 +1000
+++ b/arch/i386/boot/compressed/Makefile   Thu Jul 26 11:17:35 2007 +1000
@@ -9,10 +9,9 @@ EXTRA_AFLAGS   := -traditional
 EXTRA_AFLAGS   := -traditional
 
 LDFLAGS_vmlinux := -T
-CFLAGS := -m32 -D__KERNEL__ -Iinclude -O2  -fno-strict-aliasing
 hostprogs-y:= relocs
 
-CFLAGS  := -m32 -D__KERNEL__ $(LINUX_INCLUDE) -O2 \
+CFLAGS  := -m32 -D__KERNEL__ $(LINUX_INCLUDE) -Iinclude -O2 \
   -fno-strict-aliasing -fPIC \
   $(call cc-option,-ffreestanding) \
   $(call cc-option,-fno-stack-protector)
diff -r e02fd64426b9 arch/i386/kernel/smp.c
--- a/arch/i386/kernel/smp.c   Thu Jul 26 10:33:58 2007 +1000
+++ b/arch/i386/kernel/smp.c   Thu Jul 26 11:17:35 2007 +1000
@@ -651,7 +651,6 @@ fastcall notrace void smp_reschedule_int
 fastcall notrace void smp_reschedule_interrupt(struct pt_regs *regs)
 {
trace_special(regs->eip, 0, 0);
-   trace_special(regs->eip, 0, 0);
ack_APIC_irq();
set_tsk_need_resched(current);
 }
diff -r e02fd64426b9 include/asm-mips/mipsregs.h
--- a/include/asm-mips/mipsregs.h   Thu Jul 26 10:33:58 2007 +1000
+++ b/include/asm-mips/mipsregs.h   Thu Jul 26 11:17:35 2007 +1000
@@ -710,7 +710,7 @@ do {
\
unsigned long long __val;   \
unsigned long __flags;  \
\
-   local_irq_save(flags);  \
+   local_irq_save(__flags);\
if (sel == 0)   \
__asm__ __volatile__(   \
".set\tmips64\n\t"  \


Re: i386-show-unhandled-signals-v3

2007-07-25 Thread Masoud Sharbiani

On 7/25/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

On Wed, 25 Jul 2007 16:40:06 -0700
[EMAIL PROTECTED] (Masoud Asgharifard Sharbiani) wrote:

> > Look: if there's a way in which an unprivileged user can trigger a printk
> > we fix it, end of story.  I don't know why this even slightly
> > controversial.
> >
>
> Fair enough. Here it is:

My favourite words.

> ---
> Hello,
> This patch makes the i386 behave the same way that x86_64 does when a
> segfault happens. A line gets printed to the kernel log so that tools
> that need to check for failures can behave more uniformly between
> different kernels. Like x86_64, it can be disabled by setting
> debug.show_unhandled_signals sysctl variable to 0 (or by doing
> echo 0 > /proc/sys/debug/show_unhandled_signals)

Do we really need the ratelimiting?  If the admin turns this on then he's
presumably prepared for the consequences.

I guess "yes", as people (even distros) are likely to turn this on and
forget about it.

The patch is larger than I expected, ho hum.



So, are we happy? What else can I chop from this patch to make it more
acceptable to the people involved?
Please be advised that with this patch, the old exception_trace that
was enabled becomes disabled by default; x86_64 had that enabled, and
i386 didn't have anything...
cheers,
Masoud


Re: [PATCH] config_zone_movable [1/2] clean up zone config by renumbering

2007-07-25 Thread KAMEZAWA Hiroyuki
On Wed, 25 Jul 2007 20:10:16 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Thu, 26 Jul 2007 11:50:20 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> 
> wrote:
> 
> > I refreshed config_zone_movable patch set against 2.6.23-rc1.
> > Reflected comments on previous version.
> > Tested on ia64/NUMA system and my small i386 desktop.
> > 
> > Andrew, I like this patch but know that there are many types of memory 
> > layout. 
> > Could you test this set in -mm ?
> > I'll refresh this against rc1-mm1 if necessary.
> 
> Right now I want to concentrate on getting the present -mm queue vaguely
> stabilised, so I'd prefer to not be adding features or nontrivial cleanups
> for a few days at the minimum.
> 
ok. I'll wait.

> This process would be greatly aided by developers testing it out and
> sending fixes (or at least, reports) for anything they encounter.
>
yes. I'll test -mm.
 

> >  /*
> > + * Test zone type is configured or not.
> > + * You can use this functio for avoiding #ifdef.
> > + *
> > + * #ifdef OCNFIG_ZONE_DMA
> 
> typo there.
> 
Oops, I'll fix this typo when I send this again.

Thanks,
-Kame



Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-25 Thread Matthew Hawkins
On 7/26/07, Ray Lee <[EMAIL PROTECTED]> wrote:
> Yeah, I know about inotify, but it doesn't scale.

Yeah, the nonrecursive behaviour is a bugger.  Also I found it helped
to queue operations in userspace and execute periodically rather than
trying to execute on every single notification.  Worked well for
indexing, for virus scanning though you'd want to do some risk
analysis.
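
A rough sketch of the "queue in userspace, run periodically" approach, with
the inotify plumbing left out. Everything here (the struct, the limits, the
index_path() placeholder) is invented for illustration; the real work
function would be whatever the indexer does per file.

```c
#include <string.h>

#define MAX_PENDING 8
#define MAX_PATH_LEN 64

/* Hypothetical pending set: one slot per distinct path. */
struct pending {
	char paths[MAX_PENDING][MAX_PATH_LEN];
	int count;
};

/* Called for every notification; duplicates collapse into one entry.
 * Returns 0 on success, -1 if the path is too long or the set is full. */
static int pending_add(struct pending *p, const char *path)
{
	if (strlen(path) >= MAX_PATH_LEN)
		return -1;
	for (int i = 0; i < p->count; i++)
		if (strcmp(p->paths[i], path) == 0)
			return 0;
	if (p->count == MAX_PENDING)
		return -1;
	strcpy(p->paths[p->count++], path);
	return 0;
}

/* Called from a periodic timer; runs the work once per distinct path. */
static int pending_flush(struct pending *p, void (*work)(const char *))
{
	int done = p->count;

	for (int i = 0; i < p->count; i++)
		work(p->paths[i]);
	p->count = 0;
	return done;
}

/* Placeholder work function that just counts how often it ran. */
static int indexed;
static void index_path(const char *path)
{
	(void)path;
	indexed++;
}
```

A file that is written a hundred times between two flushes is then indexed
once, not a hundred times.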

It'd be nice to have a filesystem that handled that sort of thing
internally *cough*winfs*cough*.  That was my hope for reiserfs a very
long time ago with its pluggable fs modules feature.

-- 
Matt


Re: [PATCH] config_zone_movable [1/2] clean up zone config by renumbering

2007-07-25 Thread Andrew Morton
On Thu, 26 Jul 2007 11:50:20 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:

> I refreshed config_zone_movable patch set against 2.6.23-rc1.
> Reflected comments on previous version.
> Tested on ia64/NUMA system and my small i386 desktop.
> 
> Andrew, I like this patch but know that there are many types of memory 
> layout. 
> Could you test this set in -mm ?
> I'll refresh this against rc1-mm1 if necessary.

Right now I want to concentrate on getting the present -mm queue vaguely
stabilised, so I'd prefer to not be adding features or nontrivial cleanups
for a few days at the minimum.

This process would be greatly aided by developers testing it out and
sending fixes (or at least, reports) for anything they encounter.


> -Kame
> 
> ==
> zone_ifdef_cleanup_by_renumbering.patch
> 
> Now, this patch defines zone_idx for not-configured-zones.
> like 
>   enum_zone_type {
>   (ZONE_DMA configured)
>   (ZONE_DMA32 configured)
>   ZONE_NORMAL
>   (ZONE_HIGHMEM configured)
>   ZONE_MOVABLE
>   MAX_NR_ZONES,
>   (ZONE_DMA not-configured)
>   (ZONE_DMA32 not-configured)
>   (ZONE_HIGHMEM not-configured)
>   };
> 
> By this, we can determine zone is configured or not by
> 
>   zone_idx < MAX_NR_ZONES.
> 
> We can avoid #ifdef for CONFIG_ZONE_xxx to some extent.
> 
> This patch also replaces CONFIG_ZONE_DMA_FLAG by is_configured_zone(ZONE_DMA).
> 
> Changelog: v1 -> v2
>   - rebased to 2.6.23-rc1
>   - Removed MAX_POSSIBLE_ZONES
>   - Added comments

Is this patch a bugfix?

box:/home/akpm> grep ifdef ~/x
zone_ifdef_cleanup_by_renumbering.patch
We can avoid #ifdef for CONFIG_ZONE_xxx to some extent.
+ * You can use this functio for avoiding #ifdef.
+ * #ifdef OCNFIG_ZONE_DMA
-#ifdef CONFIG_HIGHMEM
-#ifdef CONFIG_HIGHMEM
-#ifdef CONFIG_ZONE_DMA32
-#ifdef CONFIG_ZONE_DMA
-#ifdef CONFIG_ZONE_DMA
-#ifdef CONFIG_ZONE_DMA32
-#ifdef CONFIG_HIGHMEM
-#ifdef CONFIG_ZONE_DMA
-#ifdef CONFIG_ZONE_DMA32
-#ifdef CONFIG_HIGHMEM
-#ifdef CONFIG_ZONE_DMA
-#ifdef CONFIG_ZONE_DMA32
-#ifdef CONFIG_HIGHMEM
-#ifdef CONFIG_HIGHMEM
-#ifdef CONFIG_HIGHMEM

ooh, me like.
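
For anyone skimming, the renumbering trick from the changelog quoted above
boils down to something like the following. This is a cut-down sketch that
pretends only ZONE_NORMAL and ZONE_MOVABLE are configured; the real enum is
built with #ifdef/#ifndef pairs, and nr_configured_extra_zones() is a
made-up helper just to show a caller.

```c
#include <stdbool.h>

/* Pretend only ZONE_NORMAL and ZONE_MOVABLE are configured. */
enum zone_type {
	ZONE_NORMAL,
	ZONE_MOVABLE,
	MAX_NR_ZONES,
	/* not configured: these get numbers after MAX_NR_ZONES */
	ZONE_DMA,
	ZONE_DMA32,
	ZONE_HIGHMEM,
};

static inline bool is_configured_zone(enum zone_type idx)
{
	return idx < MAX_NR_ZONES;
}

/* Caller with no #ifdef: each test is a compile-time constant. */
static int nr_configured_extra_zones(void)
{
	int n = 0;

	if (is_configured_zone(ZONE_DMA))
		n++;
	if (is_configured_zone(ZONE_DMA32))
		n++;
	if (is_configured_zone(ZONE_HIGHMEM))
		n++;
	if (is_configured_zone(ZONE_MOVABLE))
		n++;
	return n;
}
```

Since every zone name always exists as an enum value, code can mention
ZONE_DMA unconditionally and let the compiler throw away the dead branch.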

>  /*
> + * Test zone type is configured or not.
> + * You can use this functio for avoiding #ifdef.
> + *
> + * #ifdef OCNFIG_ZONE_DMA

typo there.




[PATCH][resend] sysfs/file.c - use mutex instead of semaphore

2007-07-25 Thread Dave Young
Use a mutex instead of a semaphore in fs/sysfs/file.c (struct sysfs_buffer).

Signed-off-by: Dave Young <[EMAIL PROTECTED]> 

---
fs/sysfs/file.c |   14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)

diff -upr linux/fs/sysfs/file.c linux.new/fs/sysfs/file.c
--- linux/fs/sysfs/file.c   2007-07-26 10:55:11.0 +
+++ linux.new/fs/sysfs/file.c   2007-07-26 10:57:13.0 +
@@ -8,8 +8,8 @@
 #include 
 #include 
 #include 
+#include 
 #include 
-#include 
 
 #include "sysfs.h"
 
@@ -55,7 +55,7 @@ struct sysfs_buffer {
loff_t  pos;
char* page;
struct sysfs_ops* ops;
-   struct semaphore        sem;
+   struct mutex            mutex;
int needs_read_fill;
int event;
 };
@@ -128,7 +128,7 @@ sysfs_read_file(struct file *file, char 
struct sysfs_buffer * buffer = file->private_data;
ssize_t retval = 0;
 
-   down(&buffer->sem);
+   mutex_lock(&buffer->mutex);
if (buffer->needs_read_fill) {
retval = fill_read_buffer(file->f_path.dentry,buffer);
if (retval)
@@ -139,7 +139,7 @@ sysfs_read_file(struct file *file, char 
retval = simple_read_from_buffer(buf, count, ppos, buffer->page,
 buffer->count);
 out:
-   up(&buffer->sem);
+   mutex_unlock(&buffer->mutex);
return retval;
 }
 
@@ -228,13 +228,13 @@ sysfs_write_file(struct file *file, cons
struct sysfs_buffer * buffer = file->private_data;
ssize_t len;
 
-   down(&buffer->sem);
+   mutex_lock(&buffer->mutex);
len = fill_write_buffer(buffer, buf, count);
if (len > 0)
len = flush_write_buffer(file->f_path.dentry, buffer, len);
if (len > 0)
*ppos += len;
-   up(&buffer->sem);
+   mutex_unlock(&buffer->mutex);
return len;
 }
 
@@ -294,7 +294,7 @@ static int sysfs_open_file(struct inode 
if (!buffer)
goto err_out;
 
-   init_MUTEX(&buffer->sem);
+   mutex_init(&buffer->mutex);
buffer->needs_read_fill = 1;
buffer->ops = ops;
file->private_data = buffer;


[PATCH] config_zone_movable [2/2] config_zone_movable

2007-07-25 Thread KAMEZAWA Hiroyuki

Make ZONE_MOVABLE configurable.

Based on "zone_ifdef_cleanup_by_renumbering.patch"

Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>



---
 include/linux/gfp.h|3 ++-
 include/linux/mmzone.h |   15 +++
 include/linux/vmstat.h |   13 +++--
 mm/Kconfig |   12 
 mm/page_alloc.c|6 ++
 mm/vmstat.c|8 +++-
 6 files changed, 49 insertions(+), 8 deletions(-)

Index: linux-2.6.23-rc1/include/linux/mmzone.h
===
--- linux-2.6.23-rc1.orig/include/linux/mmzone.h
+++ linux-2.6.23-rc1/include/linux/mmzone.h
@@ -154,7 +154,9 @@ enum zone_type {
 */
ZONE_HIGHMEM,
 #endif
+#ifdef CONFIG_ZONE_MOVABLE
ZONE_MOVABLE,
+#endif
MAX_NR_ZONES,
 #ifndef CONFIG_ZONE_DMA
ZONE_DMA,
@@ -165,6 +167,9 @@ enum zone_type {
 #ifndef CONFIG_HIGHMEM
ZONE_HIGHMEM,
 #endif
+#ifndef CONFIG_ZONE_MOVABLE
+   ZONE_MOVABLE,
+#endif
 };
 
 /*
@@ -550,11 +555,13 @@ static inline int zone_idx_is(enum zone_
 
 static inline int zone_movable_is_highmem(void)
 {
-#if CONFIG_ARCH_POPULATES_NODE_MAP
-   if (is_configured_zone(ZONE_HIGHMEM))
-   return movable_zone == ZONE_HIGHMEM;
-#endif
+#ifdef CONFIG_ARCH_POPULATES_NODE_MAP
+   return is_configured_zone(ZONE_HIGHMEM) &&
+  is_configured_zone(ZONE_MOVABLE) &&
+   (movable_zone == ZONE_HIGHMEM);
+#else
return 0;
+#endif
 }
 
 static inline int is_highmem_idx(enum zone_type idx)
Index: linux-2.6.23-rc1/include/linux/gfp.h
===
--- linux-2.6.23-rc1.orig/include/linux/gfp.h
+++ linux-2.6.23-rc1/include/linux/gfp.h
@@ -104,7 +104,8 @@ static inline enum zone_type gfp_zone(gf
if (is_configured_zone(ZONE_DMA32) && (flags & __GFP_DMA32))
return ZONE_DMA32;
 
-   if ((flags & (__GFP_HIGHMEM | __GFP_MOVABLE)) ==
+   if (is_configured_zone(ZONE_MOVABLE) &&
+   (flags & (__GFP_HIGHMEM | __GFP_MOVABLE)) ==
(__GFP_HIGHMEM | __GFP_MOVABLE))
return ZONE_MOVABLE;
 
Index: linux-2.6.23-rc1/mm/Kconfig
===
--- linux-2.6.23-rc1.orig/mm/Kconfig
+++ linux-2.6.23-rc1/mm/Kconfig
@@ -112,6 +112,18 @@ config SPARSEMEM_EXTREME
def_bool y
depends on SPARSEMEM && !SPARSEMEM_STATIC
 
+
+config ZONE_MOVABLE
+   bool "Zone for movable pages"
+   depends on ARCH_POPULATES_NODE_MAP
+   help
+ Allows creating a zone type only for movable pages, e.g. page cache
+ and anonymous memory. Because movable pages are easily reclaimed
+ and page migration technique can move them, your chance for allocating
+ contiguous memory such as huge pages will be better than other zones.
+ To use this zone, please see "kernelcore=" or "movablecore=" in
+ Documentation/kernel-parameters.txt
+
 # eventually, we can have this option just 'select SPARSEMEM'
 config MEMORY_HOTPLUG
bool "Allow for memory hot-add"
Index: linux-2.6.23-rc1/mm/page_alloc.c
===
--- linux-2.6.23-rc1.orig/mm/page_alloc.c
+++ linux-2.6.23-rc1/mm/page_alloc.c
@@ -82,7 +82,9 @@ int sysctl_lowmem_reserve_ratio[MAX_NR_Z
 #ifdef CONFIG_HIGHMEM
 32,
 #endif
+#ifdef CONFIG_ZONE_MOVABLE
 32,
+#endif
 };
 
 EXPORT_SYMBOL(totalram_pages);
@@ -3444,6 +3446,10 @@ static int __init cmdline_parse_core(cha
if (!p)
return -EINVAL;
 
+   if (!is_configured_zone(ZONE_MOVABLE)) {
+   printk ("ZONE_MOVABLE is not configured, %s is ignored.\n",p);
+   return 0;
+   }
coremem = memparse(p, &p);
*core = coremem >> PAGE_SHIFT;
 
Index: linux-2.6.23-rc1/mm/vmstat.c
===
--- linux-2.6.23-rc1.orig/mm/vmstat.c
+++ linux-2.6.23-rc1/mm/vmstat.c
@@ -471,8 +471,14 @@ const struct seq_operations fragmentatio
 #define TEXT_FOR_HIGHMEM(xx)
 #endif
 
+#ifdef CONFIG_ZONE_MOVABLE
+#define TEXT_FOR_MOVABLE(xx) xx "_movable",
+#else
+#define TEXT_FOR_MOVABLE(xx)
+#endif
+
 #define TEXTS_FOR_ZONES(xx) TEXT_FOR_DMA(xx) TEXT_FOR_DMA32(xx) xx "_normal", \
-   TEXT_FOR_HIGHMEM(xx) xx "_movable",
+   TEXT_FOR_HIGHMEM(xx) xx 
TEXT_FOR_MOVABLE(xx)
 
 static const char * const vmstat_text[] = {
/* Zoned VM counters */
Index: linux-2.6.23-rc1/include/linux/vmstat.h
===
--- linux-2.6.23-rc1.orig/include/linux/vmstat.h
+++ linux-2.6.23-rc1/include/linux/vmstat.h
@@ -25,7 +25,14 @@
 #define HIGHMEM_ZONE(xx)
 #endif
 
-#define FOR_ALL_ZONES(xx) DMA_ZONE(xx) DMA32_ZONE(xx) xx##_NORMAL 
HIGHMEM_ZONE(xx) , xx##_MOVABLE
+#ifdef CONFIG_ZONE_MOVABLE

[PATCH] config_zone_movable [1/2] clean up zone config by renumbering

2007-07-25 Thread KAMEZAWA Hiroyuki
I refreshed config_zone_movable patch set against 2.6.23-rc1.
Reflected comments on previous version.
Tested on ia64/NUMA system and my small i386 desktop.

Andrew, I like this patch but know that there are many types of memory layout. 
Could you test this set in -mm ?
I'll refresh this against rc1-mm1 if necessary.

-Kame

==
zone_ifdef_cleanup_by_renumbering.patch

This patch defines a zone_idx even for zones that are not configured,
like this:
enum zone_type {
(ZONE_DMA configured)
(ZONE_DMA32 configured)
ZONE_NORMAL
(ZONE_HIGHMEM configured)
ZONE_MOVABLE
MAX_NR_ZONES,
(ZONE_DMA not-configured)
(ZONE_DMA32 not-configured)
(ZONE_HIGHMEM not-configured)
};

With this layout, whether a zone is configured can be tested with

zone_idx < MAX_NR_ZONES.

This lets us avoid some of the #ifdef CONFIG_ZONE_xxx blocks.

This patch also replaces CONFIG_ZONE_DMA_FLAG by is_configured_zone(ZONE_DMA).
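
A small standalone sketch of the renumbering trick may help. The DEMO_*
names and the two build switches are hypothetical stand-ins for the real
zone names and CONFIG_ZONE_* options; only the layout idea is taken from
the patch.

```c
/* Hypothetical build-time switches standing in for CONFIG_ZONE_*. */
#define DEMO_HAVE_DMA     1
#define DEMO_HAVE_HIGHMEM 0

enum demo_zone {
#if DEMO_HAVE_DMA
    DEMO_ZONE_DMA,
#endif
    DEMO_ZONE_NORMAL,
#if DEMO_HAVE_HIGHMEM
    DEMO_ZONE_HIGHMEM,
#endif
    DEMO_MAX_NR_ZONES,      /* every configured zone sorts below this */
    /* Zones that are not configured still get a name, but above the limit. */
#if !DEMO_HAVE_DMA
    DEMO_ZONE_DMA,
#endif
#if !DEMO_HAVE_HIGHMEM
    DEMO_ZONE_HIGHMEM,
#endif
};

/* A zone is configured exactly when its index is below DEMO_MAX_NR_ZONES. */
static int demo_is_configured_zone(enum demo_zone idx)
{
    return idx < DEMO_MAX_NR_ZONES;
}

/* True only if target is configured and idx refers to it. */
static int demo_zone_idx_is(enum demo_zone idx, enum demo_zone target)
{
    return demo_is_configured_zone(target) && idx == target;
}
```

Because every zone name always exists, callers can write an ordinary
if (demo_is_configured_zone(DEMO_ZONE_HIGHMEM)) instead of an #ifdef, and
the compiler can still drop the dead branch since the test is a constant.
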

Changelog: v1 -> v2
- rebased to 2.6.23-rc1
- Removed MAX_POSSIBLE_ZONES
- Added comments

Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

---
 include/linux/gfp.h|   16 -
 include/linux/mmzone.h |   79 ++---
 include/linux/vmstat.h |   24 +++---
 mm/Kconfig |5 ---
 mm/page-writeback.c|7 +---
 mm/page_alloc.c|   33 
 mm/slab.c  |4 +-
 7 files changed, 87 insertions(+), 81 deletions(-)

Index: linux-2.6.23-rc1/include/linux/mmzone.h
===
--- linux-2.6.23-rc1.orig/include/linux/mmzone.h
+++ linux-2.6.23-rc1/include/linux/mmzone.h
@@ -155,10 +155,36 @@ enum zone_type {
ZONE_HIGHMEM,
 #endif
ZONE_MOVABLE,
-   MAX_NR_ZONES
+   MAX_NR_ZONES,
+#ifndef CONFIG_ZONE_DMA
+   ZONE_DMA,
+#endif
+#ifndef CONFIG_ZONE_DMA32
+   ZONE_DMA32,
+#endif
+#ifndef CONFIG_HIGHMEM
+   ZONE_HIGHMEM,
+#endif
 };
 
 /*
+ * Test zone type is configured or not.
+ * You can use this function for avoiding #ifdef.
+ *
+ * #ifdef CONFIG_ZONE_DMA
+ * do_something...
+ * #endif
+ * can be written as
+ * if (is_configured_zone(ZONE_DMA)) {
+ * do_something..
+ * }
+ */
+static inline int is_configured_zone(enum zone_type zoneidx)
+{
+   return (zoneidx < MAX_NR_ZONES);
+}
+
+/*
  * When a memory allocation must conform to specific limitations (such
  * as being suitable for DMA) the caller will pass in hints to the
  * allocator in the gfp_mask, in the zone modifier bits.  These bits
@@ -511,28 +537,35 @@ static inline int populated_zone(struct 
 
 extern int movable_zone;
 
-static inline int zone_movable_is_highmem(void)
+/*
+ * Check zone is configured && specified "idx" is equal to target zone type.
+ * Zone's index is calculated by zone_idx() above.
+ */
+static inline int zone_idx_is(enum zone_type idx, enum zone_type target)
 {
-#if defined(CONFIG_HIGHMEM) && defined(CONFIG_ARCH_POPULATES_NODE_MAP)
-   return movable_zone == ZONE_HIGHMEM;
-#else
+   if (is_configured_zone(target) && (idx == target))
+   return 1;
return 0;
+}
+
+static inline int zone_movable_is_highmem(void)
+{
+#if CONFIG_ARCH_POPULATES_NODE_MAP
+   if (is_configured_zone(ZONE_HIGHMEM))
+   return movable_zone == ZONE_HIGHMEM;
 #endif
+   return 0;
 }
 
 static inline int is_highmem_idx(enum zone_type idx)
 {
-#ifdef CONFIG_HIGHMEM
-   return (idx == ZONE_HIGHMEM ||
-   (idx == ZONE_MOVABLE && zone_movable_is_highmem()));
-#else
-   return 0;
-#endif
+   return (zone_idx_is(idx, ZONE_HIGHMEM) ||
+  (zone_idx_is(idx, ZONE_MOVABLE) && zone_movable_is_highmem()));
 }
 
 static inline int is_normal_idx(enum zone_type idx)
 {
-   return (idx == ZONE_NORMAL);
+   return zone_idx_is(idx, ZONE_NORMAL);
 }
 
 /**
@@ -543,36 +576,22 @@ static inline int is_normal_idx(enum zon
  */
 static inline int is_highmem(struct zone *zone)
 {
-#ifdef CONFIG_HIGHMEM
-   int zone_idx = zone - zone->zone_pgdat->node_zones;
-   return zone_idx == ZONE_HIGHMEM ||
-   (zone_idx == ZONE_MOVABLE && zone_movable_is_highmem());
-#else
-   return 0;
-#endif
+   return is_highmem_idx(zone_idx(zone));
 }
 
 static inline int is_normal(struct zone *zone)
 {
-   return zone == zone->zone_pgdat->node_zones + ZONE_NORMAL;
+   return zone_idx_is(zone_idx(zone), ZONE_NORMAL);
 }
 
 static inline int is_dma32(struct zone *zone)
 {
-#ifdef CONFIG_ZONE_DMA32
-   return zone == zone->zone_pgdat->node_zones + ZONE_DMA32;
-#else
-   return 0;
-#endif
+   return zone_idx_is(zone_idx(zone), ZONE_DMA32);
 }
 
 static inline int is_dma(struct zone *zone)
 {
-#ifdef CONFIG_ZONE_DMA
-   return zone == zone->zone_pgdat->node_zones + ZONE_DMA;
-#else
-   return 0;
-#endif
+   return 

Re: Question about core file generation

2007-07-25 Thread Carlo Florendo

Ravinandan Arakali (rarakali) wrote:

Hi,
When a process dumps core, the do_coredump() initiates the core
file generation. Is this operation synchronous(does the kernel
wait for core to be completely written to disk) ?


The two operations, where

(1) a process is in the process of exiting, while
(2) the kernel is doing a core dump,

are asynchronous with respect to each other, since the process could have 
received an abort signal while the kernel initiated the core dump.



Basically, if I have the parent process waiting for exit of
child which dumped core, can the parent access the core immediately
on receipt of "child exit" message ? Is it possible that the
core is still in the process of being written ? If so, what's
the event the parent needs to wait for to be assured of a complete
core.


When a child exits, check if the core dump file is still opened by a handle 
(HINT: lsof).


AFAICS from the kernel code, the core dump data or file routine is mutexed 
to ensure that there is only one process handling the core dump file.


You could check if there are still open file handles on the dump.  If there 
are none, it means that the dump had been completed.
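
As a rough sketch of the parent's side, the wait status itself reports
whether the child produced a core. The helper names below are made up for
illustration, and WCOREDUMP is a common extension rather than plain POSIX,
so the code guards its use. The status only says that a core was produced;
it is not a statement about the file having reached the disk.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits at once with exit_code. */
static pid_t demo_spawn_child(int exit_code)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(exit_code);   /* child */
    return pid;             /* parent: child's pid, or -1 if fork() failed */
}

/*
 * Wait for child `pid` to terminate.
 * Returns 1 if it was killed by a signal and the status reports a core
 * dump, 0 if it terminated without one, -1 if waitpid() failed.
 */
static int demo_wait_and_check_core(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) != pid)
        return -1;
#ifdef WCOREDUMP
    if (WIFSIGNALED(status) && WCOREDUMP(status))
        return 1;
#endif
    return 0;
}
```
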


HTH.

Best Regards,

Carlo

--
Carlo Florendo
Software Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, Diliman 1101, Quezon City
Philippines
http://www.astra.ph

--
The Astra Group of Companies
5-3-11 Sekido, Tama City
Tokyo 206-0011, Japan
http://www.astra.co.jp


Re: [PATCH RFC] extent mapped page cache

2007-07-25 Thread Nick Piggin
On Wed, Jul 25, 2007 at 10:10:07PM -0400, Chris Mason wrote:
> On Thu, 26 Jul 2007 03:37:28 +0200
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> >  
> > > One advantage to the state tree is that it separates the state from
> > > the memory being described, allowing a simple kmap style interface
> > > that covers subpages, highmem and superpages.
> > 
> > I suppose so, although we should have added those interfaces long
> > ago ;) The variants in fsblock are pretty good, and you could always
> > do an arbitrary extent (rather than block) based API using the
> > pagecache tree if it would be helpful.
> 
> Yes, you could use fsblock for the state bits and make a separate API
> to map the actual pages.
> 
> >  
> > 
> > > It also more naturally matches the way we want to do IO, making for
> > > easy clustering.
> > 
> > Well the pagecache tree is used to reasonable effect for that now.
> > OK the code isn't beautiful ;). Granted, this might be an area where
> > the separate state tree ends up being better. We'll see.
> > 
> 
> One thing it gains us is finding the start of the cluster.  Even if
> called by kswapd, the state tree allows writepage to find the start of
> the cluster and send down a big bio (provided I implement trylock to
> avoid various deadlocks).

That's very true, we could potentially also do that with the block extent
tree that I want to try with fsblock.

I'm looking at "cleaning up" some of these aops APIs so hopefully most of
the deadlock problems go away. Should be useful to both our efforts. Will
post patches hopefully when I get time to finish the draft this weekend.


> > > O_DIRECT becomes a special case of readpages and writepages: the
> > > memory used for IO just comes from userland instead of the page
> > > cache.
> > 
> > Could be, although you'll probably also need to teach the mm about
> > the state tree and/or still manipulate the pagecache tree to prevent
> > concurrency?
> 
> Well, it isn't coded yet, but I should be able to do it from the FS
> specific ops.

Probably, if you invalidate all the pagecache in the range beforehand
you should be able to do it (and I guess you want to do the invalidate
anyway). Although, the deadlock issues below might still bite somewhere...


> > But isn't the main aim of O_DIRECT to do as little locking and
> > synchronisation with the pagecache as possible? I thought this is
> > why your race fixing patches got put on the back burner (although
> > they did look fairly nice from a correctness POV).
> 
> I put the placeholder patches on hold because of a corner case
> where userland did O_DIRECT from a mmap'd region of the same file (Linus
> pointed it out to me).  Basically my patches had to work in 64k chunks
> to avoid a deadlock in get_user_pages.  With the state tree, I can
> allow the page to be faulted in but still properly deal with it.

Oh right, I didn't think of that one. Would you still have similar
issues with the external state tree? I mean, the filesystem doesn't
really know why the fault is taken. O_DIRECT read from a file into
mmapped memory of the same block in the file is almost hopeless I
think.


> > Well I'm kind of handwaving when it comes to O_DIRECT ;) It does look
> > like this might be another advantage of the state tree (although you
> > aren't allowed to slow down buffered IO to achieve the locking ;)).
> 
> ;) The O_DIRECT benefit is a fringe thing.  I've long wanted to help
> clean up that code, but the real point of the patch is to make general
> usage faster and less complex.  If I can't get there, the O_DIRECT
> stuff doesn't matter.

Sure, although unifying code is always a plus so I like that you've
got that in mind.


> > > The ability to put in additional tracking info like the process that
> > > first dirtied a range is also significant.  So, I think it is worth
> > > trying.
> > 
> > Definitely, and I'm glad you are. You haven't converted me yet, but
> > I look forward to finding the best ideas from our two approaches when
> > the patches are further along (ext2 port of fsblock coming along, so
> > we'll be able to have races soon :P).
> 
> I'm sure we can find some river in Cambridge, winner gets to throw
> Axboe in.

Very noble of you to donate your colleague to such a worthy cause.



Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-25 Thread Bartlomiej Zolnierkiewicz

Hi,

Some general thoughts about submitter/maintainer responsibilities,
not necessarily connected with the recent events (I haven't been
following them closely - some people don't have that much free time
on their hands to burn ;)...

On Wednesday 25 July 2007, Ingo Molnar wrote:
> 
> * Satyam Sharma <[EMAIL PROTECTED]> wrote:
> 
> > > concentrate on making sure that both you and the maintainer 
> > > understand the problem correctly,
> > 
> > This itself may require some "convincing" to do. What if the 
> > maintainer just doesn't recognize the problem? Note that the 
> > development model here is more about the "social" thing than purely a 
> > "technical" thing. People do handwave, possibly due to innocent 
> > misunderstandings, possibly without. Often it's just a case of seeing 
> > different reasons behind the "problematic behaviour". Or it could be a 
> > case of all of the above.
> 
> sure - but i was really not talking about from the user's perspective, 
> but from the enterprising kernel developer's perspective who'd like to 
> solve a particular problem. And the nice thing about concentrating on 
> the problem: if you do that well, it does not really matter what the 
> maintainer thinks!

Yes, this is a really good strategy to get your changes upstream (and it
works) - just make changes so perfect that nobody can really complain. :)

The only problem is that the bigger the change becomes, the less likely it
is to be perfect. So for really big changes it is also useful to show the
maintainer that you take responsibility for your changes (by taking bug
reports and potential review issues very seriously instead of ignoring them;
the past history of your merged changes also has a big influence here), so
he will know that you won't leave him out in the cold with your code when
bug reports arrive. And be _sure_ that they will arrive with bigger changes.

> ( Talking to the maintainer can of course be of enormous help in the 
>   quest for understanding the problem and figuring out the best fix - 
>   the maintainer will most likely know more about the subject than 
>   yourself. More communication never hurts. It's an additional bonus if 
>   you manage to convince the maintainer to take up the matter for 
>   himself. It's not a given right though - a maintainer's main task is 
>   to judge code that is being submitted, to keep a subsystem running
>   smoothly and to not let it regress - but otherwise there can easily be
>   different priorities of what tasks to tackle first, and in that sense 
>   the maintainer is just one of the many overworked kernel developers 
>   who has no real obligation what to tackle first. )

Yep, and the patch author should try to help the maintainer understand both
the problem he is trying to fix and the solution, i.e. throwing some
undocumented patches at the maintainer and screaming at him to merge them is
not the way to go.

> If the maintainer rejects something despite it being well-reasoned, 
> well-researched and robustly implemented with no tradeoffs and 
> maintenance problems at all then it's a bad maintainer. (but from all 
> i've seen in the past few years the VM maintainers do their job pretty 
> damn fine.) And note that i _do_ disagree with them in this particular 
> swap-prefetch case, but still, the non-merging of swap-prefetch was not 
> a final decision at all. It was more of a "hm, dunno, i still dont 
> really like it - shouldnt this be done differently? Could we debug this 
> a bit better?" reaction. Yes, it can be frustrating after more than one 
> year.
> 
> > > possibly write some testcase that clearly exposes it, and
> > 
> > Oh yes -- that'll be helpful, but definitely not necessarily a 
> > prerequisite for all issues, and then you can't even expect everybody 
> > to write or test/benchmark with testcases. (oh, btw, this is assuming 
> > you do find consensus on a testcase)
> 
> no, but Con is/was certainly more than capable to write testcases and to 
> debug various scenarios. That's the way how new maintainers are found 
> within Linux: people take matters in their own hands and improve a 
> subsystem so that they'll either peacefully co-work with the other 
> maintainers or they replace them (sometimes not so peacefully - like in 
> the IDE/SATA/PATA saga).

Heh, now that you've raised IDE saga I feel obligated to stand up
and say a few words...

The latest opening of the IDE saga was quite interesting in the current
context because we had exactly the reverse situation there - an "independent"
maintainer and an "enterprise" developer (imagine the amount of frustration
on both sides) but the root cause was quite similar (inability to get changes
merged).

IMO the root of the conflict lay in coming from different perspectives
and having somewhat different priorities (stabilising/cleaning current code vs
adding new features on top of pile of crap).  In such situations it is very
important to be able to stop for a moment and look at the situation from
the other person's 

sub.

2007-07-25 Thread Ying Chu

help


Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-25 Thread Shaohua Li
On Wed, 2007-07-25 at 17:37 -0700, Yinghai Lu wrote:
> On 7/25/07, Bjorn Helgaas <[EMAIL PROTECTED]> wrote:
> > On Wednesday 25 July 2007 07:32:53 am Sébastien Dugué wrote:
> > > On Wed, 25 Jul 2007 07:16:44 -0600 Bjorn Helgaas <[EMAIL PROTECTED]> 
> > > wrote:
> > >
> > > > The _DDN is a "DOS device name", and the _UID is a "logical device ID
> > > > that does not change across reboots."  Both are optional, and PNPACPI
> > > > ignores them.  But maybe we could change PNPACPI to sort by them if
> > > > they are present.  I'll think about this a bit.
> > >
> > >   That would be nice, but I wish you good luck with all those
> > > crappy BIOSes out there.
> >
> > Yeah, it's an ugly world we live in.  Would you be able to try the
> > attached patch just for testing?  It should sort devices with the
> > same _HID by their _UID.  It doesn't have any effect on my systems,
> > because my devices are already ordered by _UID by default.  But I
> > think it should switch your COM1/COM2 ports back to the order you
> > expect.
> >
> > Yinghai, you mentioned the same issue on boxes with multiple root
> > bridges.  Any chance you could try this out there as well?
> >
> it doesn't solve pci_root_bus reverse problem.
> 
> is that too late for PNP0A03?
> 
> I wonder if we need to modify acpi_device_register to sort them.
The pci root driver is an acpi driver not a pnp driver, so Bjorn's patch
will not work. Maybe the ACPI core (ACPICA) should do the sort?
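
To make the "sort devices with the same _HID by their _UID" idea concrete,
here is a small standalone sketch. It is not Bjorn's patch: struct demo_dev
is a made-up, simplified record, it assumes a numeric _UID, and for
simplicity it sorts the whole list by (_HID, _UID, discovery order) rather
than reordering only within each _HID group.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified device record; not the kernel's struct pnp_dev. */
struct demo_dev {
    char     hid[9];   /* e.g. "PNP0501" */
    unsigned uid;      /* assumes a numeric _UID */
    int      seq;      /* discovery order, used as the final tie-breaker */
};

/* Order by _HID first, then _UID, then discovery order. */
static int demo_dev_cmp(const void *a, const void *b)
{
    const struct demo_dev *x = a;
    const struct demo_dev *y = b;
    int c = strcmp(x->hid, y->hid);

    if (c != 0)
        return c;
    if (x->uid != y->uid)
        return x->uid < y->uid ? -1 : 1;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

static void demo_sort_devs(struct demo_dev *devs, size_t n)
{
    qsort(devs, n, sizeof(*devs), demo_dev_cmp);
}
```

With two serial ports reported in reverse order, sorting puts the one with
the lower _UID first, which is the COM1/COM2 ordering asked about above.
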

Thanks,
Shaohua


Re: 2.6.23-rc1-mm1

2007-07-25 Thread Andrew Morton
On Thu, 26 Jul 2007 01:55:03 + "Dave Young" <[EMAIL PROTECTED]> wrote:

> Hi,
> drivers/built-in.o(.text+0xc649): In function `acpi_pci_choose_state':
> : undefined reference to `acpi_pm_device_sleep_state'
> drivers/built-in.o(.text+0x3fe08): In function `pnpacpi_suspend':
> : undefined reference to `acpi_pm_device_sleep_state'
> make: *** [.tmp_vmlinux1] Error 1
> 
> The pci-acpi.c depends on CONFIG_ACPI_SLEEP

There's a hopeful-fix for this in 
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm1/hot-fixes/

(will be readable in a few minutes - I had to chmod it)


Re: [GIT PATCH] ACPI patches for 2.6.23-rc1

2007-07-25 Thread david

On Wed, 25 Jul 2007, Len Brown wrote:


On Wednesday 25 July 2007 14:48, Linus Torvalds wrote:


... ACPI now seems to select CPU hotplug. Why?


ACPI=y SMP=y systems require SUSPEND_SMP=y for system sleep support,
and that requires HOTPLUG_CPU=y.

Note that ACPI=y SMP=n systems do not need it,
and thus will not select HOTPLUG_CPU=y


That is just *broken*. Sure, if you select STR or hibernation, we need CPU
hotplug, but just for picking ACPI? Why?


My assumption is that if somebody selects CONFIG_ACPI,
that 99% of the time, they intend that to include support for
the ACPI hooks for system sleep states.

Conversely, supporting the 1% of people who don't want it
isn't worth messing with the 99% who do, nor is
the burden of yet another config option to maintain and
#ifdefs in the code.


so you are saying that you know better than we do what we need?

some people configure ACPI only because their system won't work properly 
without it. they have no intention of ever doing a STR or hibernate.


David Lang


On UP, they'd get ACPI system sleep support 100% of the time
by default, but on SMP this option had become problematic.

We used to have this:

if ACPI
...
config ACPI_SLEEP
   bool "Sleep States"
   depends on X86 && (!SMP || SUSPEND_SMP)
   depends on PM
   default y

So the poster-child failure was i386/defconfig itself...
It couldn't support suspend to RAM because it didn't include
CONFIG_ACPI_SLEEP.  Not trivial for a user to select it
when it doesn't even appear on the menu.  It doesn't appear
because CONFIG_SUSPEND_SMP isn't enabled, but that doesn't
appear either -- because CONFIG_HOTPLUG_CPU isn't selected.


so have something like

config ACPI_SLEEP
select HOTPLUG_CPU if X86 && SMP
select SUSPEND_SMP if X86 && SMP

instead of making it dependent on ACPI.

David Lang


Most users don't want that.

So today we have this:

menuconfig ACPI
...
   select HOTPLUG_CPU if X86 && SMP
   select SUSPEND_SMP if X86 && SMP

Which I think leads to fewer surprises, and less complicated code.
(even though using select itself is fraught with peril:-)

thanks,
-Len



serial flow control appears broken

2007-07-25 Thread Lee Howard

Hello.

I have fax modems that will, in their proper behavior with certain 
features, send up to 64 kilobytes of data to the host DTE all at once.  
(So, the fax modem handles an incoming fax and periodically will send 
between 256 bytes and 64 kilobytes of data in bursts.)


When the DCE-DTE (modem-to-host) communication rate is established at 
115200 bps, data loss occurs on systems using at least Linux kernels 2.6.5 
and 2.6.18 (and probably everything in between, and then some).  This 
is because the modem overflows the host's buffer.  This is evidenced in 
kernel logging:


Jul 23 14:01:30 gollum kernel: ttyS1: 1 input overrun(s)
Jul 23 17:09:45 gollum kernel: ttyS1: 1 input overrun(s)

Normally I would blame the modem itself for not honoring the host's flow 
control signals.  However, I have worked with the modem manufacturer 
closely on this matter for over three months now.  In that process they 
have improved the responsiveness of the modem and have fixed other 
problems, but the end result is that it truly does appear that the 
serial tty driver is not using flow control.  Whether software flow 
control (XON/XOFF) or hardware flow control (RTS/CTS) is used the result 
is the same.


This is evidenced in hardware flow control by a little LED labeled "RTS" 
that is on the external modem.  This LED lights up when pin 7 of the DB9 
serial connection is given +12Vdc current (signalling "RTS" is on - that 
the host can accept data).  The LED goes dark when the current is 
removed (signalling that the host cannot accept data).  This "RTS" LED 
never flickers at all, as it should, when receiving these bursts of data 
- the LED stays lit as long as the serial cable is connected to the 
host... and yet I will see those "input overrun" messages.  Thus, it 
seems quite clear that the Linux serial tty driver is not deasserting 
RTS as it should in hardware flow control.  (And probably the analogous 
problem exists in software flow control, too.)
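
For reference, this is roughly how a userspace program asks the tty layer
for RTS/CTS flow control at 115200 bps through termios. It is only a sketch
of the request side, with a made-up helper name, and says nothing about what
the driver then does with the RTS line. CRTSCTS is an extension to POSIX
termios, which is why the feature-test macro is defined first.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* CRTSCTS is an extension, not plain POSIX */
#endif
#include <termios.h>

/*
 * Hypothetical helper: request 115200 bps with hardware flow control on fd.
 * Returns 0 if the tty reports CRTSCTS set afterwards, -1 otherwise.
 */
static int demo_enable_rtscts(int fd)
{
    struct termios tio;

    if (tcgetattr(fd, &tio) != 0)
        return -1;

    cfsetispeed(&tio, B115200);
    cfsetospeed(&tio, B115200);
    tio.c_cflag |= CRTSCTS;            /* hardware (RTS/CTS) flow control */
    tio.c_iflag &= ~(IXON | IXOFF);    /* no software (XON/XOFF) flow control */

    if (tcsetattr(fd, TCSANOW, &tio) != 0)
        return -1;

    /* tcsetattr() may succeed after applying only part of the request,
     * so read the settings back and check the flag. */
    if (tcgetattr(fd, &tio) != 0)
        return -1;
    return (tio.c_cflag & CRTSCTS) ? 0 : -1;
}
```

Running "stty -a" on the port while the fax software has it open shows the
same flag (crtscts or -crtscts), which is a quick way to confirm what was
actually requested.
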


Please tell me what I can do to help you resolve and/or remedy this 
matter.  Also, please let me know if I have contacted the wrong people.  
(I have cross-posted to linux-kernel as a catch-all.  I am not 
subscribed to either linux-serial or linux-kernel mailing lists.  So 
please CC me in any list responses.)


If it is of any value to know (perhaps they have common code?), the same 
error occurs on FreeBSD 6.2 as well.   The problem does not occur on 
Windows.  The problem does not occur on RedHat 6.0 (kernel 2.2.5).


Thanks,

Lee.



Re: [PATCH RFC] extent mapped page cache

2007-07-25 Thread Chris Mason
On Thu, 26 Jul 2007 03:37:28 +0200
Nick Piggin <[EMAIL PROTECTED]> wrote:

>  
> > One advantage to the state tree is that it separates the state from
> > the memory being described, allowing a simple kmap style interface
> > that covers subpages, highmem and superpages.
> 
> I suppose so, although we should have added those interfaces long
> ago ;) The variants in fsblock are pretty good, and you could always
> do an arbitrary extent (rather than block) based API using the
> pagecache tree if it would be helpful.

Yes, you could use fsblock for the state bits and make a separate API
to map the actual pages.

>  
> 
> > It also more naturally matches the way we want to do IO, making for
> > easy clustering.
> 
> Well the pagecache tree is used to reasonable effect for that now.
> OK the code isn't beautiful ;). Granted, this might be an area where
> the separate state tree ends up being better. We'll see.
> 

One thing it gains us is finding the start of the cluster.  Even if
called by kswapd, the state tree allows writepage to find the start of
the cluster and send down a big bio (provided I implement trylock to
avoid various deadlocks).

>  
> > O_DIRECT becomes a special case of readpages and writepages: the
> > memory used for IO just comes from userland instead of the page
> > cache.
> 
> Could be, although you'll probably also need to teach the mm about
> the state tree and/or still manipulate the pagecache tree to prevent
> concurrency?

Well, it isn't coded yet, but I should be able to do it from the FS
specific ops.

> 
> But isn't the main aim of O_DIRECT to do as little locking and
> synchronisation with the pagecache as possible? I thought this is
> why your race fixing patches got put on the back burner (although
> they did look fairly nice from a correctness POV).

I put the placeholder patches on hold because of a corner case
where userland did O_DIRECT from a mmap'd region of the same file (Linus
pointed it out to me).  Basically my patches had to work in 64k chunks
to avoid a deadlock in get_user_pages.  With the state tree, I can
allow the page to be faulted in but still properly deal with it.

> 
> Well I'm kind of handwaving when it comes to O_DIRECT ;) It does look
> like this might be another advantage of the state tree (although you
> aren't allowed to slow down buffered IO to achieve the locking ;)).

;) The O_DIRECT benefit is a fringe thing.  I've long wanted to help
clean up that code, but the real point of the patch is to make general
usage faster and less complex.  If I can't get there, the O_DIRECT
stuff doesn't matter.
> 
>  
> > The ability to put in additional tracking info like the process that
> > first dirtied a range is also significant.  So, I think it is worth
> > trying.
> 
> Definitely, and I'm glad you are. You haven't converted me yet, but
> I look forward to finding the best ideas from our two approaches when
> the patches are further along (ext2 port of fsblock coming along, so
> we'll be able to have races soon :P).

I'm sure we can find some river in Cambridge, winner gets to throw
Axboe in.

-chris


Re: [patch] agp: don't lock pages

2007-07-25 Thread Nick Piggin
On Thu, Jul 26, 2007 at 11:44:22AM +1000, Dave Airlie wrote:
> >
> >Yeah I had a bit of a look around, and it seems OK (but would
> >appreciate an ack from someone who knows the code).
> >
> >These pages will never get seen by page reclaim, so we're OK
> >there. There is a get_page before the SetPageLocked and a put_page
> >right before the unlock_page, so refcounting should not be broken
> >if it wasn't already: note that the lock_page doesn't pin a
> >reference on a page in general -- we can use it as such for pagecache
> >(although it isn't very clean), because the lock pins the page in
> >pagecache and the pagecache holds a ref.
> >
> >Anyway, if Dave or David can take a look, that would be appreciated.
> >We'll need this for 2.6.23.
> 
> I talked with Ben earlier and I can't see anything inherently wrong
> with removing the lock_page, I assume it was put there to stop things
> getting swapped but if the get/put does that then I'd be happy to
> remove it.

Well it is prevented from being swapped out because it never gets
put on swapout lists, but the get/put certainly doesn't hurt :)
 

> I'm just a bit confused how this didn't get picked up in -mm at all.

Beats me. It was in there for nearly 5 months. Mustn't have been
tested or reported.


Re: [DRIVER SUBMISSION] DRBD wants to go mainline

2007-07-25 Thread david

On Wed, 25 Jul 2007, Satyam Sharma wrote:


On 7/25/07, Lars Ellenberg <[EMAIL PROTECTED]> wrote:

 On Wed, Jul 25, 2007 at 04:41:53AM +0530, Satyam Sharma wrote:
>  [...]
> 
>  But where does the "send" come into the picture over here -- a send
>  won't block forever, so I don't foresee any issues whatsoever w.r.t.
>  kthreads conversion for that. [ BTW I hope you're *not* using any
>  signals-based interface for your kernel thread _at all_. Kthreads
>  disallow (ignore) all signals by default, as they should, and you really
>  shouldn't need to write any logic to handle or do-certain-things-on-seeing
>  a signal in a well designed kernel thread. ]
> 
> > and the sending
> > latency is crucial to performance, while the recv
> > will not timeout for the next few seconds.
> 
>  Again, I don't see what sending latency has to do with a kernel_thread
>  to kthread conversion. Or with signals, for that matter. Anyway, as
>  Kyle Moffett mentioned elsewhere, you could probably look at other
>  examples (say cifs_demultiplexer_thread() in fs/cifs/connect.c).

 the basic problem, and what we use signals for, is:

 it is waiting in recv, waiting for the peer to say something.
 but I want it to stop recv, and go send something "right now".


That's ... weird. Most (all?) communication between any two parties
would follow a protocol where someone recv's stuff, does something
with it, and sends it back ... what would you send "right now" if you
didn't receive anything?


because even though you didn't receive anything you now have something 
important to send.


remember that both sides can be sitting in receive mode. this puts them 
both in a position to respond to the other if the other has something to 
say.


David Lang


Re: 2.6.23-rc1-mm1

2007-07-25 Thread Dave Young

Hi,
drivers/built-in.o(.text+0xc649): In function `acpi_pci_choose_state':
: undefined reference to `acpi_pm_device_sleep_state'
drivers/built-in.o(.text+0x3fe08): In function `pnpacpi_suspend':
: undefined reference to `acpi_pm_device_sleep_state'
make: *** [.tmp_vmlinux1] Error 1

The pci-acpi.c depends on CONFIG_ACPI_SLEEP


Re: 2.6.23-rc1-mm1: git-kgdb breaks sh compilation

2007-07-25 Thread Paul Mundt
On Wed, Jul 25, 2007 at 11:17:41PM +0200, Adrian Bunk wrote:
> On Wed, Jul 25, 2007 at 04:03:04AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.22-rc6-mm1:
> >...
> >  git-kgdb.patch
> > 
> >  git trees
> >...
> 
> This causes the following compile error on sh:
> 
> <--  snip  -->
> 
> ...
>   CC  drivers/serial/sh-sci.o
> drivers/serial/sh-sci.c: In function 'put_string':
> drivers/serial/sh-sci.c:188: error: 'hexchars' undeclared (first use in this function)
> drivers/serial/sh-sci.c:188: error: (Each undeclared identifier is reported only once
> drivers/serial/sh-sci.c:188: error: for each function it appears in.)
> make[3]: *** [drivers/serial/sh-sci.o] Error 1
> 
> <--  snip  -->
> 
Cool, it's like 5 years ago all over again. It looks like most of
the kgdb stuff is going to need to be re-ported, as it's effectively
thrown out years of changes, and perhaps not surprisingly, blows up quite
spectacularly in the process.

At least this problem with KGDB disabled is an easy fix. I'll see about
getting CONFIG_KGDB=y resynced with the current serial driver and arch
stub.

-

serial: sh-sci: Fix build failure from kgdb fallout.

  CC  drivers/serial/sh-sci.o
drivers/serial/sh-sci.c: In function 'put_string':
drivers/serial/sh-sci.c:188: error: 'hexchars' undeclared (first use in this function)
drivers/serial/sh-sci.c:188: error: (Each undeclared identifier is reported only once
drivers/serial/sh-sci.c:188: error: for each function it appears in.)
make[3]: *** [drivers/serial/sh-sci.o] Error 1

Reported-by: Adrian Bunk <[EMAIL PROTECTED]>
Signed-off-by: Paul Mundt <[EMAIL PROTECTED]>

--

 drivers/serial/sh-sci.c |    1 +
 1 file changed, 1 insertion(+)

diff -X linux-2.6.23-rc1-mm1/Documentation/dontdiff -urN linux-2.6.23-rc1-mm1.orig/drivers/serial/sh-sci.c linux-2.6.23-rc1-mm1/drivers/serial/sh-sci.c
--- linux-2.6.23-rc1-mm1.orig/drivers/serial/sh-sci.c	2007-07-26 10:23:51.0 +0900
+++ linux-2.6.23-rc1-mm1/drivers/serial/sh-sci.c	2007-07-26 10:34:59.0 +0900
@@ -161,6 +161,7 @@
 
 #if defined(CONFIG_SH_STANDARD_BIOS) || defined(CONFIG_SH_KGDB)
 	int checksum;
+	const char hexchars[] = "0123456789abcdef";
 	int usegdb=0;
 
 #ifdef CONFIG_SH_STANDARD_BIOS
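
For readers wondering what a hexchars table is for: code that speaks the
gdb remote serial protocol frames each packet as "$<payload>#<cc>",
where cc is the payload's byte sum modulo 256 written as two hex digits.
The sketch below shows that framing with the same lookup table. The
function name gdb_frame is hypothetical, and the driver's real
put_string() is not shown in this message, so this is an illustration of
the technique rather than the driver code.

```c
#include <stddef.h>
#include <string.h>

static const char hexchars[] = "0123456789abcdef";

/*
 * Frame payload as "$<payload>#<cc>". Returns the number of bytes
 * written (excluding the trailing NUL), or 0 if out is too small.
 */
size_t gdb_frame(const char *payload, char *out, size_t outlen)
{
	size_t n = strlen(payload);
	unsigned char checksum = 0;
	size_t i;

	if (outlen < n + 5)	/* '$' + payload + '#' + 2 digits + NUL */
		return 0;

	out[0] = '$';
	for (i = 0; i < n; i++) {
		out[1 + i] = payload[i];
		checksum += (unsigned char)payload[i];	/* wraps mod 256 */
	}
	out[1 + n] = '#';
	out[2 + n] = hexchars[checksum >> 4];	/* high nibble */
	out[3 + n] = hexchars[checksum & 0xf];	/* low nibble */
	out[4 + n] = '\0';
	return n + 4;
}
```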


Re: [patch] agp: don't lock pages

2007-07-25 Thread Dave Airlie


On Thu, Jul 26, 2007 at 07:26:53AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2007-07-25 at 13:19 +0200, Nick Piggin wrote:
> > Hi,
> >
> > Does this patch solve the X problem? Does anyone see anything wrong
> > with it or know why agp was locking the pages?
>
> We need to do a little bit of auditing here, but I suspect it will turn
> out all right. I think the reason it locked them in the first place was
> to avoid AGP pages mapped into process space from being swapped out.
>
> I think that should be taken care of by appropriate vma flags nowadays,
> but we need to double check. It also might have been a way around dodgy
> refcounting at one point but I think we got that right nowadays (I
> remember fixing issues in that area when we removed PageReserved from
> those pages back then).

Yeah I had a bit of a look around, and it seems OK (but would
appreciate an ack from someone who knows the code).

These pages will never get seen by page reclaim, so we're OK
there. There is a get_page before the SetPageLocked and a put_page
right before the unlock_page, so refcounting should not be broken
if it wasn't already: note that the lock_page doesn't pin a
reference on a page in general -- we can use it as such for pagecache
(although it isn't very clean), because the lock pins the page in
pagecache and the pagecache holds a ref.

Anyway, if Dave or David can take a look, that would be appreciated.
We'll need this for 2.6.23.


I talked with Ben earlier and I can't see anything inherently wrong
with removing the lock_page, I assume it was put there to stop things
getting swapped but if the get/put does that then I'd be happy to
remove it.

I'm just a bit confused how this didn't get picked up in -mm at all.

Dave.


Re: [PATCH RFC] extent mapped page cache

2007-07-25 Thread Nick Piggin
On Wed, Jul 25, 2007 at 08:18:53AM -0400, Chris Mason wrote:
> On Wed, 25 Jul 2007 04:32:17 +0200
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> > Having another tree to store block state I think is a good idea as I
> > said in the fsblock thread with Dave, but I haven't clicked as to why
> > it is a big advantage to use it to manage pagecache state. (and I can
> > see some possible disadvantages in locking and tree manipulation
> > overhead).
> 
> Yes, there are definitely costs with the state tree, it will take some
> careful benchmarking to convince me it is a feasible solution. But,
> storing all the state in the pages themselves is impossible unless the
> block size equals the page size. So, we end up with something like
> fsblock/buffer heads or the state tree.

Yep, we have to have something.

 
> One advantage to the state tree is that it separates the state from
> the memory being described, allowing a simple kmap style interface
> that covers subpages, highmem and superpages.

I suppose so, although we should have added those interfaces long
ago ;) The variants in fsblock are pretty good, and you could always
do an arbitrary extent (rather than block) based API using the
pagecache tree if it would be helpful.
 

> It also more naturally matches the way we want to do IO, making for
> easy clustering.

Well the pagecache tree is used to reasonable effect for that now.
OK the code isn't beautiful ;). Granted, this might be an area where
the separate state tree ends up being better. We'll see.

 
> O_DIRECT becomes a special case of readpages and writepages: the
> memory used for IO just comes from userland instead of the page cache.

Could be, although you'll probably also need to teach the mm about
the state tree and/or still manipulate the pagecache tree to prevent
concurrency?

But isn't the main aim of O_DIRECT to do as little locking and
synchronisation with the pagecache as possible? I thought this is
why your race fixing patches got put on the back burner (although
they did look fairly nice from a correctness POV).

Well I'm kind of handwaving when it comes to O_DIRECT ;) It does look
like this might be another advantage of the state tree (although you
aren't allowed to slow down buffered IO to achieve the locking ;)).

 
> The ability to put in additional tracking info like the process that
> first dirtied a range is also significant.  So, I think it is worth
> trying.

Definitely, and I'm glad you are. You haven't converted me yet, but
I look forward to finding the best ideas from our two approaches when
the patches are further along (ext2 port of fsblock coming along, so
we'll be able to have races soon :P).

Thanks,
Nick


Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-25 Thread Yinghai Lu

On 7/25/07, Yinghai Lu <[EMAIL PROTECTED]> wrote:

On 7/25/07, Bjorn Helgaas <[EMAIL PROTECTED]> wrote:
> On Wednesday 25 July 2007 07:32:53 am Sébastien Dugué wrote:
> > On Wed, 25 Jul 2007 07:16:44 -0600 Bjorn Helgaas <[EMAIL PROTECTED]> wrote:
> >
> > > The _DDN is a "DOS device name", and the _UID is a "logical device ID
> > > that does not change across reboots."  Both are optional, and PNPACPI
> > > ignores them.  But maybe we could change PNPACPI to sort by them if
> > > they are present.  I'll think about this a bit.
> >
> >   That would be nice, but I wish you good luck with all those
> > crappy BIOSes out there.
>
> Yeah, it's an ugly world we live in.  Would you be able to try the
> attached patch just for testing?  It should sort devices with the
> same _HID by their _UID.  It doesn't have any effect on my systems,
> because my devices are already ordered by _UID by default.  But I
> think it should switch your COM1/COM2 ports back to the order you
> expect.
>
> Yinghai, you mentioned the same issue on boxes with multiple root
> bridges.  Any chance you could try this out there as well?
>
it doesn't solve pci_root_bus reverse problem.

is that too late for PNP0A03?

I wonder if we need to modify acpi_device_register to sort them.


local hack for pci_root reverse problem would be

or in acpi_pci_register_driver, sort the acpi_pci_roots before using it.

YH
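
The ordering idea quoted above (sort devices with the same _HID by their
_UID, leave everything else where it is) can be sketched in plain C as
follows. The struct and function names are hypothetical, _UID is assumed
to be a plain integer here although ACPI also allows string _UIDs, and
this is not the PNPACPI patch itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an enumerated device. */
struct pnp_dev_sketch {
	char hid[9];		/* e.g. "PNP0501" or "PNP0A03" */
	unsigned int uid;	/* assumed integer _UID */
};

/*
 * Reorder only among devices that share a _HID so that their _UIDs
 * ascend; slots held by devices with other _HIDs are not moved.
 */
void sort_same_hid_by_uid(struct pnp_dev_sketch *devs, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		for (j = i + 1; j < n; j++) {
			if (strcmp(devs[i].hid, devs[j].hid) == 0 &&
			    devs[j].uid < devs[i].uid) {
				struct pnp_dev_sketch tmp = devs[i];

				devs[i] = devs[j];
				devs[j] = tmp;
			}
		}
	}
}
```

The quadratic loop is chosen for brevity; device lists of this kind are
short.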


Re: [PATCH RFC] e1000: clear ICR before requesting an IRQ line

2007-07-25 Thread Fernando Luis Vázquez Cao
On Wed, 2007-07-25 at 08:27 -0700, Kok, Auke wrote:
> Fernando Luis Vázquez Cao wrote:
> > I made an interesting finding while testing the two patches below.
> > 
> > http://lkml.org/lkml/2007/7/19/685
> > http://lkml.org/lkml/2007/7/19/687
> > 
> > These patches modify the traditional CONFIG_DEBUG_KERNEL in such a way
> > that request_irq prints a warning if the handler, when called, returns
> > IRQ_HANDLED.
> > 
> > The code looks like this:
> > 
> > int request_irq(unsigned int irq, irq_handler_t handler,
> >                 unsigned long irqflags, const char *devname, void *dev_id)
> > .
> >         if (irqflags & IRQF_DISABLED) {
> >                 unsigned long flags;
> > 
> >                 local_irq_save(flags);
> >                 retval = handler(irq, dev_id);
> >                 local_irq_restore(flags);
> >         } else
> >                 retval = handler(irq, dev_id);
> >         if (retval == IRQ_HANDLED) {
> >                 printk(KERN_WARNING
> >                        "%s (IRQ %d) handled a spurious interrupt\n",
> >                        devname, irq);
> >         }
> > .
> > 
> > I discovered that the e1000 driver handles the "fake" interrupt, which,
> > in principle, is not correct because it obviously isn't a real interrupt
> > and it could have been an interrupt coming from another device that is
> > sharing the IRQ line.
> > 
> > The problem is that the interrupt handler assumes that if ICR!=0 it was
> > its device who generated the interrupt and, consequently, it should be
> > handled. But, unfortunately, that is not always the case. If the network
> > link is active when we open the device (e1000_open) the ICR will have
> > the E1000_ICR_LSC bit set (by the way, is this the expected behavior?).
> 
> yes. is it really a problem though?
It seems we may end up handling spurious interrupts or interrupts coming
from other devices.

> > This means that _any_ interrupt coming in after allocating our interrupt
> > (e1000_request_irq) will be handled, no matter where it came from.
> 
> we actually generate this LSC interrupt ourselves in the driver, to make sure
> that we cascade into the watchdog which then enables or disables the link code
> based on the link status change. This allows us to _not_ do any link checking in
> _open and makes things a bit more simple.
I am not referring to the LSC interrupt the driver itself generates in
e1000_open just before returning. The ICR is masked (ICR==0) after
executing the driver probe (e1000_probe), but when we enter e1000_open
the E1000_ICR_LSC bit will already be set, before the function even
starts executing. I also observed that when the link is active the line

  /* fire a link status change interrupt to start the watchdog */
  E1000_WRITE_REG(&adapter->hw, ICS, E1000_ICS_LSC);

is redundant because the E1000_ICS_LSC bit is already set. In fact, the
irq handler gets invoked twice in a row with the interrupt cause being a
link status change.

> > The solution I came up with is clearing the ICR before calling
> > request_irq. I have to admit that I am not familiar enough with this
> > driver, so it is quite likely that this is not the right fix. I would
> > appreciate your comments on this.
> 
> Clearing the ICR before requesting an irq might not work - at the same time the
> device could generate another LSC irq...
Is it not possible to prevent the device from generating interrupts until we 
call request_irq?

Thank you for your feedback!

  - Fernando

> Of course, we probably should just schedule some delayed work to run our 
> watchdog in e1000_open, but I haven't checked if that actually works.
> 
> 
> Auke
> 
> > Signed-off-by: Fernando Luis Vazquez Cao <[EMAIL PROTECTED]>
> > ---
> > 
> > diff -urNp linux-2.6.22-orig/drivers/net/e1000/e1000_main.c linux-2.6.22-pendirq/drivers/net/e1000/e1000_main.c
> > --- linux-2.6.22-orig/drivers/net/e1000/e1000_main.c	2007-07-19 18:18:53.0 +0900
> > +++ linux-2.6.22-pendirq/drivers/net/e1000/e1000_main.c	2007-07-25 17:22:54.0 +0900
> > @@ -1378,6 +1378,17 @@ e1000_alloc_queues(struct e1000_adapter 
> >  }
> >  
> >  /**
> > + * e1000_clear_interrupts
> > + * @adapter: address of board private structure
> > + *
> > + * Mask interrupts
> > + **/
> > +static void
> > +e1000_clear_interrupts(struct e1000_adapter *adapter) {
> > +   E1000_READ_REG(&adapter->hw, ICR);
> > +}
> > +
> > +/**
> >   * e1000_open - Called when a network interface is made active
> >   * @netdev: network interface device structure
> >   *
> > @@ -1431,6 +1442,9 @@ e1000_open(struct net_device *netdev)
> >  * so we have to setup our clean_rx handler before we do so.  */
> > e1000_configure(adapter);
> >  
> > +   /* Discard any possible pending interrupts. */
> > +   e1000_clear_interrupts(adapter);
> > +
> > err = e1000_request_irq(adapter);
> > if (err)
> > goto err_req_irq;
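
The problem described in this message can be modelled in a few lines of
ordinary C. This is only a sketch: the sketch_* names are hypothetical,
the register is a plain struct field rather than real hardware, and the
LSC bit value is assumed for illustration. It relies on the same
property the patch relies on, namely that reading the cause register
clears it.

```c
#include <stdint.h>

#define SKETCH_ICR_LSC 0x00000004u	/* bit value assumed for illustration */

enum sketch_irqreturn { SKETCH_IRQ_NONE = 0, SKETCH_IRQ_HANDLED = 1 };

/* Hypothetical stand-in for the device: only the cause register. */
struct sketch_nic {
	uint32_t icr;
};

/* Model of a read-to-clear interrupt cause register. */
static uint32_t sketch_read_icr(struct sketch_nic *nic)
{
	uint32_t cause = nic->icr;

	nic->icr = 0;
	return cause;
}

/*
 * Shared-line handler: claims the interrupt whenever the cause register
 * is non-zero. A stale LSC bit therefore makes it claim an interrupt
 * that some other device on the line raised.
 */
enum sketch_irqreturn sketch_intr(struct sketch_nic *nic)
{
	if (sketch_read_icr(nic) == 0)
		return SKETCH_IRQ_NONE;
	return SKETCH_IRQ_HANDLED;
}
```

With a stale LSC bit latched, the first call returns SKETCH_IRQ_HANDLED
no matter who raised the line. Reading the register once before the
handler is installed makes the first call return SKETCH_IRQ_NONE
instead. As noted in the reply, the real device could latch a new cause
between that read and request_irq, which the model does not capture.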


Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-25 Thread Ray Lee

On 7/25/07, Matthew Hawkins <[EMAIL PROTECTED]> wrote:

On 7/26/07, Ray Lee <[EMAIL PROTECTED]> wrote:
> I'd just like updatedb to amortize its work better. If we had some way
> to track all filesystem events, updatedb could keep a live and
> accurate index on the filesystem. And this isn't just updatedb that
> wants that, beagle and tracker et al also want to know filesystem
> events so that they can index the documents themselves as well as the
> metadata. And if they do it live, that spreads the cost out, including
> the VM pressure.

We already have this, it's called inotify (and if I'm not mistaken,
beagle already uses it).


Yeah, I know about inotify, but it doesn't scale.

[EMAIL PROTECTED]:~$ find ~ -type d | wc -l
17933
[EMAIL PROTECTED]:~$

That's not fun with inotify, and that's just my home directory. The
vast majority of those are quiet the vast majority of the time, which
is the crux of the problem, and why inotify isn't a great fit for
on-demand virus scanners or indexers.

Ray


Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered unnecessarily

2007-07-25 Thread Linus Torvalds


On Wed, 25 Jul 2007, Linus Torvalds wrote:
> 
> Hmm. I really think you should take this up with the gcc people. That 
> looks like a gcc bug - because there really is nothing that guarantees 
> that the asm doesn't change the array that "x" points to, and the asm 
> clearly talks about clobbering memory.

Actually, I take that back. I think gcc does the right thing, and yes, 
it's explained by the memory clobber being just a blind "write to memory" 
rather than read memory. My bad.

It does leave us with very few ways of saying that an asm can *read* 
memory, and so it might be good to have it clarified that "volatile" 
implies that (at least with the memory clobber).

Your examples are good, I think.

Linus


Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered unnecessarily

2007-07-25 Thread Linus Torvalds


On Wed, 25 Jul 2007, Trent Piepho wrote:
> 
> Specifically, check test6_memasm.s.  The C code looks like this:
> 
> extern int a; /* keep asm from being elided for having no used output */
> static inline void bar(void) { asm("call bar" : "=m"(a) : : "memory"); }
> /* float x can't alias asm's output int a */
> void foo(float *x) { x[20] = 1; bar(); x[20] = 2; }
> 
> The asm code ends up like this:
> foo:
> 	call	bar
> 	movl	4(%esp), %eax	# x, x
> 	movl	$0x40000000, 80(%eax)	#,
> 	ret

Hmm. I really think you should take this up with the gcc people. That 
looks like a gcc bug - because there really is nothing that guarantees 
that the asm doesn't change the array that "x" points to, and the asm 
clearly talks about clobbering memory.

> Notice that the first write to x[20] was NOT done.  It's also not done for a
> volatile asm without a memory clobber.  But if you combine both volatile and a
> memory clobber, then it is!  How to explain that?

I can't explain it. I do think you've found a gcc bug.

That said, the kernel mostly uses "asm volatile()" _together_ with a 
memory clobber for these kinds of things, so it sounds like the kernel 
wouldn't be impacted. But you're definitely right - the above report makes 
me worry.

> The difference between test2_volasm.s and test2_normasm.s is hard to explain
> too.  It seems like some times gcc forgets that imull is commutative.  It will
> emit "imull %edx, %eax" in some cases, but change an asm slightly and it will
> decide it must do "imull %eax, %edx ; movl %edx, %eax" for no apparent reason.

Well, that's likely just a subtle register allocation issue, and 
understandable. Generating perfect code is impossible, you want to 
generate good code on average.

Linus


Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-25 Thread Matthew Hawkins

On 7/26/07, Ray Lee <[EMAIL PROTECTED]> wrote:

I'd just like updatedb to amortize its work better. If we had some way
to track all filesystem events, updatedb could keep a live and
accurate index on the filesystem. And this isn't just updatedb that
wants that, beagle and tracker et al also want to know filesystem
events so that they can index the documents themselves as well as the
metadata. And if they do it live, that spreads the cost out, including
the VM pressure.


We already have this, it's called inotify (and if I'm not mistaken,
beagle already uses it).  Several years ago when it was still a little
flakey patch, I built a custom filesystem indexer into an enterprise
search engine using it (I needed to pull apart Unix mbox files).  The
only trouble of course is the action is triggered immediately, which
may not always be ideal (but that's a userspace problem)

--
Matt


Re: [PATCH] IPv6: ipv6_addr_type() doesn't know about RFC4193 addresses

2007-07-25 Thread Dave Johnson
David Miller writes:
> Contrarily, there may be ipv6_addr_type() call sites that really
> do want to reject rfc4193 addresses.

A quick look through the callers and only these functions should be
affected, they check either RESERVED or UNICAST from ipv6_addr_type():

net/ipv6/addrconf.c:                ipv6_dev_get_saddr()
net/ipv6/exthdrs.c:                 ipv6_dest_hao()
net/ipv6/ip6_tunnel.c:              ip6_tnl_set_cap()
net/ipv6/netfilter/ip6t_REJECT.c:   send_reset()
net/ipv6/route.c:                   ip6_route_add()
net/ipv6/route.c:                   ip6_pkt_drop()
net/sctp/ipv6.c:                    sctp_v6_available()
net/sctp/ipv6.c:                    sctp_v6_addr_valid()
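
For reference, RFC 4193 unique local addresses are the fc00::/7 block,
so recognising one comes down to masking the top seven bits of the first
byte. The userspace sketch below shows only that test; the helper names
are hypothetical and it is not the kernel's ipv6_addr_type().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* True if the address lies in fc00::/7 (RFC 4193 unique local). */
static bool is_rfc4193_ula(const struct in6_addr *addr)
{
	return (addr->s6_addr[0] & 0xfe) == 0xfc;
}

/*
 * Parse a textual IPv6 address.
 * Returns 1 if it is unique local, 0 if not, -1 if it does not parse.
 */
int ula_from_text(const char *text)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -1;
	return is_rfc4193_ula(&addr) ? 1 : 0;
}
```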

-- 
Dave Johnson
Starent Networks



Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered unnecessarily

2007-07-25 Thread Trent Piepho
On Tue, 24 Jul 2007, Linus Torvalds wrote:
> On Tue, 24 Jul 2007, Trent Piepho wrote:
> >
> > Speaking of that, why are all the asm functions in arch/i386/lib/string.c
> > defined as having a memory clobber, even those which don't modify memory
> > like strcmp, strchr, strlen and so on?
>
> That's because the memory clobber will serialize the inline asm with
> anything else that reads or writes memory.
>
> So even if we don't actually change any memory, if we cannot describe what
> we *read*, then we need to tell gcc to not re-order us wrt things that
> could *write*. And the simplest way to do that is to say that you clobber
> memory, even if you don't.

I went and made a test suite to see what really happened, and this isn't how
it works.  It appears that a memory clobber only tells gcc that the asm
writes to memory.  It does _not_ tell gcc that the asm reads from memory.

It's at http://www.speakeasy.org/~xyzzy/download/opttest.tar.gz
It's only 3k, but there are 16 files so I'm not inlining it.

The suite has a few test c files, which are compiled with various different
functions, inline norm asm, inline volatile asm, inline asm with a memory
clobber, a normal function, a __attribute__((const)) function, and so on.

They are compiled to asm file, and then a perl script scans the asm files to
figure out what optimizations gcc made.

"make test" will compile all the tests and run them through the perl scripts.
"make test1" will just run test1, etc.

It appears that a normal asm, one without volatile or a memory clobber, is
treated like a const function, which returns the output via a struct (not
using pass-by-address).  It has no side-effects, can't read or write global
variables, and can't dereference pointer arguments.

Adding volatile tells gcc the asm has some hidden side-effect.  It still can't
r/w globals or dereference inputs.  But it won't get elided if there are no
used outputs, common subexpression merged, or treated as a loop invariant.

Adding a memory clobber tells gcc that the asm modifies memory.  It doesn't
modify un-aliased local variables in registers.  It does modify aliased local
variables.  It does not read from memory.  gcc will move or elide a memory
write before an asm with a memory clobber if nothing else (besides the asm)
could see the write.  A memory clobber doesn't count as a side-effect either,
a non-volatile asm without unused outputs will be elided, even if has a memory
clobber.

Specifically, check test6_memasm.s.  The C code looks like this:

extern int a; /* keep asm from being elided for having no used output */
static inline void bar(void) { asm("call bar" : "=m"(a) : : "memory"); }
/* float x can't alias asm's output int a */
void foo(float *x) { x[20] = 1; bar(); x[20] = 2; }

The asm code ends up like this:
foo:
	call	bar
	movl	4(%esp), %eax	# x, x
	movl	$0x40000000, 80(%eax)	#,
	ret

Notice that the first write to x[20] was NOT done.  It's also not done for a
volatile asm without a memory clobber.  But if you combine both volatile and a
memory clobber, then it is!  How to explain that?

The difference between test2_volasm.s and test2_normasm.s is hard to explain
too.  It seems like sometimes gcc forgets that imull is commutative.  It will
emit "imull %edx, %eax" in some cases, but change an asm slightly and it will
decide it must do "imull %eax, %edx ; movl %edx, %eax" for no apparent reason.
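
One way to avoid depending on what a "memory" clobber implies about
reads is to name the memory the asm reads as an explicit "m" input
operand. The sketch below is architecture-neutral because the asm
templates are empty, so it shows only the constraints, not real
instructions. The helper names are hypothetical and this is not code
from the test suite.

```c
/*
 * Compiler-only barrier: no instructions are emitted. This is the
 * volatile-plus-"memory"-clobber combination that the tests above
 * found to keep earlier stores from being dropped.
 */
static inline void compiler_barrier(void)
{
	__asm__ __volatile__("" : : : "memory");
}

/*
 * Declares *p as a memory input, so the compiler must have the current
 * value of *p stored before the asm, whatever the clobber list says.
 */
static inline void asm_reads_float(const float *p)
{
	__asm__ __volatile__("" : : "m"(*p));
}

float store_twice(float *x)
{
	x[20] = 1.0f;
	asm_reads_float(&x[20]);	/* first store is an input to the asm */
	x[20] = 2.0f;
	compiler_barrier();
	return x[20];
}
```

Compiling this with -O2 -S and looking for two stores to x[20] is a
quick way to compare against the test6 result.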


Re: [patch] agp: don't lock pages

2007-07-25 Thread Nick Piggin
[one more try]

On Thu, Jul 26, 2007 at 02:41:14AM +0200, Nick Piggin wrote:
> [forgot to cc Dave Jones...]
> 
> 
> On Thu, Jul 26, 2007 at 07:26:53AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2007-07-25 at 13:19 +0200, Nick Piggin wrote:
> > > Hi,
> > > 
> > > Does this patch solve the X problem? Does anyone see anything wrong
> > > with it or know why agp was locking the pages?
> > 
> > We need to do a little bit of auditing here, but I suspect it will turn
> > out all right. I think the reason it locked them in the first place was
> > to avoid AGP pages mapped into process space from being swapped out.
> > 
> > I think that should be taken care of by appropriate vma flags nowadays,
> > but we need to double check. It also might have been a way around dodgy
> > refcounting at one point but I think we got that right nowadays (I
> > remember fixing issues in that area when we removed PageReserved from
> > those pages back then).
> 
> Yeah I had a bit of a look around, and it seems OK (but would
> appreciate an ack from someone who knows the code).
> 
> These pages will never get seen by page reclaim, so we're OK
> there. There is a get_page before the SetPageLocked and a put_page
> right before the unlock_page, so refcounting should not be broken
> if it wasn't already: note that the lock_page doesn't pin a
> reference on a page in general -- we can use it as such for pagecache
> (although it isn't very clean), because the lock pins the page in
> pagecache and the pagecache holds a ref.
> 
> Anyway, if Dave or David can take a look, that would be appreciated.
> We'll need this for 2.6.23.
> 
> Nick
> 
> > 
> > Ben.
> > 
> > > --
> > > AGP should not need to lock pages. They are not protecting any race
> > > because there is no lock_page calls, only SetPageLocked.
> > > 
> > > This is causing hangs with d00806b183152af6d24f46f0c33f14162ca1262a.
> > > 
> > > Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
> > > 
> > > diff --git a/drivers/char/agp/generic.c b/drivers/char/agp/generic.c
> > > index d535c40..3db4f40 100644
> > > --- a/drivers/char/agp/generic.c
> > > +++ b/drivers/char/agp/generic.c
> > > @@ -1170,7 +1170,6 @@ void *agp_generic_alloc_page(struct agp_
> > >   map_page_into_agp(page);
> > >  
> > >   get_page(page);
> > > - SetPageLocked(page);
> > > >   atomic_inc(&agp_bridge->current_memory_agp);
> > >   return page_address(page);
> > >  }
> > > @@ -1187,7 +1186,6 @@ void agp_generic_destroy_page(void *addr
> > >   page = virt_to_page(addr);
> > >   unmap_page_from_agp(page);
> > >   put_page(page);
> > > - unlock_page(page);
> > >   free_page((unsigned long)addr);
> > > >   atomic_dec(&agp_bridge->current_memory_agp);
> > >  }
> > > diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
> > > index a124060..2f319f4 100644
> > > --- a/drivers/char/agp/intel-agp.c
> > > +++ b/drivers/char/agp/intel-agp.c
> > > @@ -213,7 +213,6 @@ static void *i8xx_alloc_pages(void)
> > >   }
> > >   global_flush_tlb();
> > >   get_page(page);
> > > - SetPageLocked(page);
> > > >   atomic_inc(&agp_bridge->current_memory_agp);
> > >   return page_address(page);
> > >  }
> > > @@ -229,7 +228,6 @@ static void i8xx_destroy_pages(void *add
> > >   change_page_attr(page, 4, PAGE_KERNEL);
> > >   global_flush_tlb();
> > >   put_page(page);
> > > - unlock_page(page);
> > >   __free_pages(page, 2);
> > >   atomic_dec(_bridge->current_memory_agp);
> > >  }
> > > diff --git a/drivers/char/agp/sgi-agp.c b/drivers/char/agp/sgi-agp.c
> > > index cda608c..98cf8ab 100644
> > > --- a/drivers/char/agp/sgi-agp.c
> > > +++ b/drivers/char/agp/sgi-agp.c
> > > @@ -51,7 +51,6 @@ static void *sgi_tioca_alloc_page(struct
> > >   return NULL;
> > >  
> > >   get_page(page);
> > > - SetPageLocked(page);
> > >   atomic_inc(_bridge->current_memory_agp);
> > >   return page_address(page);
> > >  }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL try#2] Blackfin update

2007-07-25 Thread Bryan Wu
On Wed, 2007-07-25 at 11:26 -0700, Linus Torvalds wrote:
> 
> On Wed, 25 Jul 2007, Bryan Wu wrote:
> > 
> > Please pull from 'for-linus' branch of
> 
> This really is too big for post-rc1.
> 
> I realize that this is all blackfin-only, and that it doesn't matter from 
> a practical standpoint, but I want you guys to learn to follow the merge 
> window, and the only way I can do that is by enforcing it.
> 
>   Linus

Exactly. During the 2.6.23-rc1 merge window I was busy submitting Blackfin
driver patches and other new material, so the bug fixes and some small
updates ended up in this series.

I will review the series again, submit the critical bug fixes for rc2, and
maybe split the rest across the following -rc releases.

Thanks Jeff, after using your script I generated this big email. It's
beautiful, but I missed the merge window.

Thanks a lot
Best Regards,
- Bryan Wu


Re: [patch] agp: don't lock pages

2007-07-25 Thread Nick Piggin
[forgot to cc Dave Jones...]


On Thu, Jul 26, 2007 at 07:26:53AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2007-07-25 at 13:19 +0200, Nick Piggin wrote:
> > Hi,
> > 
> > Does this patch solve the X problem? Does anyone see anything wrong
> > with it or know why agp was locking the pages?
> 
> We need to do a little bit of auditing here, but I suspect it will turn
> out all right. I think the reason it locked them in the first place was
> to avoid AGP pages mapped into process space from being swapped out.
> 
> I think that should be taken care of by appropriate vma flags nowadays,
> but we need to double check. It also might have been a way around dodgy
> refcounting at one point but I think we got that right nowadays (I
> remember fixing issues in that area when we removed PageReserved from
> those pages back then).

Yeah I had a bit of a look around, and it seems OK (but would
appreciate an ack from someone who knows the code).

These pages will never get seen by page reclaim, so we're OK
there. There is a get_page before the SetPageLocked and a put_page
right before the unlock_page, so refcounting should not be broken
if it wasn't already: note that the lock_page doesn't pin a
reference on a page in general -- we can use it as such for pagecache
(although it isn't very clean), because the lock pins the page in
pagecache and the pagecache holds a ref.

Anyway, if Dave or David can take a look, that would be appreciated.
We'll need this for 2.6.23.

Nick

> 
> Ben.
> 
> > --
> > AGP should not need to lock pages. They are not protecting any race
> > because there is no lock_page calls, only SetPageLocked.
> > 
> > This is causing hangs with d00806b183152af6d24f46f0c33f14162ca1262a.
> > 
> > Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
> > 
> > diff --git a/drivers/char/agp/generic.c b/drivers/char/agp/generic.c
> > index d535c40..3db4f40 100644
> > --- a/drivers/char/agp/generic.c
> > +++ b/drivers/char/agp/generic.c
> > @@ -1170,7 +1170,6 @@ void *agp_generic_alloc_page(struct agp_
> > map_page_into_agp(page);
> >  
> > get_page(page);
> > -   SetPageLocked(page);
> > atomic_inc(&agp_bridge->current_memory_agp);
> > return page_address(page);
> >  }
> > @@ -1187,7 +1186,6 @@ void agp_generic_destroy_page(void *addr
> > page = virt_to_page(addr);
> > unmap_page_from_agp(page);
> > put_page(page);
> > -   unlock_page(page);
> > free_page((unsigned long)addr);
> > atomic_dec(&agp_bridge->current_memory_agp);
> >  }
> > diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
> > index a124060..2f319f4 100644
> > --- a/drivers/char/agp/intel-agp.c
> > +++ b/drivers/char/agp/intel-agp.c
> > @@ -213,7 +213,6 @@ static void *i8xx_alloc_pages(void)
> > }
> > global_flush_tlb();
> > get_page(page);
> > -   SetPageLocked(page);
> > atomic_inc(&agp_bridge->current_memory_agp);
> > return page_address(page);
> >  }
> > @@ -229,7 +228,6 @@ static void i8xx_destroy_pages(void *add
> > change_page_attr(page, 4, PAGE_KERNEL);
> > global_flush_tlb();
> > put_page(page);
> > -   unlock_page(page);
> > __free_pages(page, 2);
> > atomic_dec(&agp_bridge->current_memory_agp);
> >  }
> > diff --git a/drivers/char/agp/sgi-agp.c b/drivers/char/agp/sgi-agp.c
> > index cda608c..98cf8ab 100644
> > --- a/drivers/char/agp/sgi-agp.c
> > +++ b/drivers/char/agp/sgi-agp.c
> > @@ -51,7 +51,6 @@ static void *sgi_tioca_alloc_page(struct
> > return NULL;
> >  
> > get_page(page);
> > -   SetPageLocked(page);
> > atomic_inc(&agp_bridge->current_memory_agp);
> > return page_address(page);
> >  }


Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-25 Thread Yinghai Lu

On 7/25/07, Bjorn Helgaas <[EMAIL PROTECTED]> wrote:

On Wednesday 25 July 2007 07:32:53 am Sébastien Dugué wrote:
> On Wed, 25 Jul 2007 07:16:44 -0600 Bjorn Helgaas <[EMAIL PROTECTED]> wrote:
>
> > The _DDN is a "DOS device name", and the _UID is a "logical device ID
> > that does not change across reboots."  Both are optional, and PNPACPI
> > ignores them.  But maybe we could change PNPACPI to sort by them if
> > they are present.  I'll think about this a bit.
>
>   That would be nice, but I wish you good luck with all those
> crappy BIOSes out there.

Yeah, it's an ugly world we live in.  Would you be able to try the
attached patch just for testing?  It should sort devices with the
same _HID by their _UID.  It doesn't have any effect on my systems,
because my devices are already ordered by _UID by default.  But I
think it should switch your COM1/COM2 ports back to the order you
expect.

Yinghai, you mentioned the same issue on boxes with multiple root
bridges.  Any chance you could try this out there as well?
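
The ordering proposed above (devices sharing a _HID sorted by their _UID) can be sketched in userspace C. This is only an illustration, not the attached patch: the struct, its field names, and both functions are hypothetical, _UID is assumed to be numeric, and the sketch also groups devices by _HID so that the comparator is a consistent total order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an enumerated PNP device. */
struct pnp_dev_info {
    const char *hid;    /* _HID string, e.g. "PNP0501" */
    unsigned long uid;  /* _UID, assumed numeric here */
    int order;          /* original enumeration order, used as tie-breaker */
};

/* Compare by _HID, then _UID, then original enumeration order. */
static int pnp_cmp(const void *a, const void *b)
{
    const struct pnp_dev_info *x = a;
    const struct pnp_dev_info *y = b;
    int c = strcmp(x->hid, y->hid);

    if (c != 0)
        return c;
    if (x->uid != y->uid)
        return x->uid < y->uid ? -1 : 1;
    return (x->order > y->order) - (x->order < y->order);
}

/* Sort the device table in place. */
static void pnp_sort(struct pnp_dev_info *devs, size_t n)
{
    qsort(devs, n, sizeof(*devs), pnp_cmp);
}
```

With two "PNP0501" serial ports enumerated in reverse _UID order, pnp_sort() would put the port with the lower _UID first, which is the COM1/COM2 ordering the message hopes to restore.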


It doesn't solve the reversed pci_root_bus ordering problem.

Is that too late for PNP0A03?

I wonder if we need to modify acpi_device_register to sort them.

YH


Re: 2.6.22 x86_64 : kernel initial decompression hangs on vmware

2007-07-25 Thread Zachary Amsden

Gabriel Barazer wrote:

Hi,

After upgrading kernel to 2.6.22 on a Vmware workstation guest version 
5.5 and 6 , the kernel decompression stage ("Decompressing Linux...") 
is hanging for a very long time (~5 minutes) before finally  
succeeding (displaying "done.\nBooting the kernel.\n"). During this 
time, the VM process is eating all the CPU time during the 
decompression, like an infinite loop. 


LKML is not the right place for this question.  You are running on Intel 
hardware?


Bring the discussion over to http://www.vmware.com/community forums

Zach


Re: 2.6.23-rc1-mm1

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 17:07:05 -0700
Greg KH <[EMAIL PROTECTED]> wrote:

> > Guessing is this patch ?
> > 
> > gregkh-driver-warn-when-statically-allocated-kobjects-are-used.patch:   
> > __tracedata_end = .;
> > gregkh-driver-warn-when-statically-allocated-kobjects-are-used.patch:+  
> > _sdata = .; /* End of text section */
> 
> This patch is a horrible hack to try to see if kobjects are static and
> not dynamically created.

You could perhaps use something like module_address_lookup() for this
debugging aid.
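
A userspace analogue of the static-versus-dynamic check discussed above can be sketched as an address-range test against linker-provided symbols. This is an assumption about the general technique, not the contents of the gregkh patch: it relies on the etext and end symbols that the GNU linker provides on Linux/ELF (see end(3)), and is_static_object() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Linker-provided symbols on Linux/ELF: their addresses mark the end of
 * the text segment and the end of the uninitialized data (bss). */
extern char etext, end;

/* Hypothetical helper: returns 1 if p points into the program's static
 * data (between the end of text and the end of bss), 0 otherwise. */
static int is_static_object(const void *p)
{
    uintptr_t a = (uintptr_t)p;

    return a >= (uintptr_t)&etext && a < (uintptr_t)&end;
}

/* A statically allocated object to probe. */
static int static_obj;
```

Heap and stack addresses fall outside that range, so a pointer obtained from malloc() or the address of a local variable would be reported as non-static.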


Re: [RFC 0/1] lro: Generic Large Receive Offload for TCP traffic

2007-07-25 Thread David Miller
From: Andrew Gallatin <[EMAIL PROTECTED]>
Date: Wed, 25 Jul 2007 13:17:54 -0400

> I've ported myri10ge to use the new LRO interface.  I have attached a
> preliminary patch to myri10ge.  I'm very pleased to note that the
> performance is on-par with my own LRO used by our out-of-tree driver.
> (except when using mixed MTUS, see performance data below).

Thanks for posting this port and feedback on the generic LRO
code.


[PATCH] SCTP: IPv4 mapped addr not returned in SCTPv6 accept()

2007-07-25 Thread Dave Johnson

An accept() call on an SCTPv6 socket that returns due to the connection of
an IPv4-mapped peer will fill out the 'struct sockaddr' with a zero
IPv6 address instead of the IPv4-mapped address of the peer.

This is due to the v4mapped flag not getting copied into the new
socket on accept(), as well as a missing check for the INET6 socket type in
sctp_v4_to_sk_*addr().

Signed-off-by: Dave Johnson <[EMAIL PROTECTED]>
Cc: Srinivas Akkipeddi <[EMAIL PROTECTED]>

= net/sctp/ipv6.c 1.108 vs edited =
--- 1.108/net/sctp/ipv6.c   2007-07-05 20:40:15 -04:00
+++ edited/net/sctp/ipv6.c  2007-07-25 16:30:41 -04:00
@@ -641,6 +641,8 @@
newsctp6sk = (struct sctp6_sock *)newsk;
inet_sk(newsk)->pinet6 = &newsctp6sk->inet6;
 
+   sctp_sk(newsk)->v4mapped = sctp_sk(sk)->v4mapped;
+
newinet = inet_sk(newsk);
newnp = inet6_sk(newsk);
 
= net/sctp/protocol.c 1.130 vs edited =
--- 1.130/net/sctp/protocol.c   2007-05-04 16:36:30 -04:00
+++ edited/net/sctp/protocol.c  2007-07-25 16:28:21 -04:00
@@ -257,13 +257,28 @@
 /* Initialize sk->sk_rcv_saddr from sctp_addr. */
 static void sctp_v4_to_sk_saddr(union sctp_addr *addr, struct sock *sk)
 {
-   inet_sk(sk)->rcv_saddr = addr->v4.sin_addr.s_addr;
+   if ((sk->sk_family == PF_INET6) && (sctp_sk(sk)->v4mapped)) {
+   inet6_sk(sk)->rcv_saddr.s6_addr32[0] = 0;
+   inet6_sk(sk)->rcv_saddr.s6_addr32[1] = 0;
+   inet6_sk(sk)->rcv_saddr.s6_addr32[2] = htonl(0x0000ffff);
+   inet6_sk(sk)->rcv_saddr.s6_addr32[3] =
+   addr->v4.sin_addr.s_addr;
+   } else {
+   inet_sk(sk)->rcv_saddr = addr->v4.sin_addr.s_addr;
+   }
 }
 
 /* Initialize sk->sk_daddr from sctp_addr. */
 static void sctp_v4_to_sk_daddr(union sctp_addr *addr, struct sock *sk)
 {
-   inet_sk(sk)->daddr = addr->v4.sin_addr.s_addr;
+   if ((sk->sk_family == PF_INET6) && (sctp_sk(sk)->v4mapped)) {
+   inet6_sk(sk)->daddr.s6_addr32[0] = 0;
+   inet6_sk(sk)->daddr.s6_addr32[1] = 0;
+   inet6_sk(sk)->daddr.s6_addr32[2] = htonl(0x0000ffff);
+   inet6_sk(sk)->daddr.s6_addr32[3] = addr->v4.sin_addr.s_addr;
+   } else {
+   inet_sk(sk)->daddr = addr->v4.sin_addr.s_addr;
+   }
 }
 
 /* Initialize a sctp_addr from an address parameter. */
@@ -557,6 +572,8 @@
newsk->sk_protocol = IPPROTO_SCTP;
newsk->sk_backlog_rcv = sk->sk_prot->backlog_rcv;
sock_reset_flag(newsk, SOCK_ZAPPED);
+
+   sctp_sk(newsk)->v4mapped = sctp_sk(sk)->v4mapped;
 
newinet = inet_sk(newsk);
 

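For reference, an IPv4-mapped IPv6 address has 80 zero bits, then 16 one bits, then the 32-bit IPv4 address. The userspace sketch below builds one byte by byte; v4_to_v4mapped() is a hypothetical helper for illustration and is not part of the kernel or of this patch.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: build ::ffff:a.b.c.d from an IPv4 address that is
 * already in network byte order. */
static struct in6_addr v4_to_v4mapped(struct in_addr v4)
{
    struct in6_addr v6;

    memset(&v6, 0, sizeof(v6));   /* bytes 0..9 stay zero */
    v6.s6_addr[10] = 0xff;        /* bytes 10..11 are all ones */
    v6.s6_addr[11] = 0xff;
    /* bytes 12..15 carry the IPv4 address unchanged */
    memcpy(&v6.s6_addr[12], &v4.s_addr, sizeof(v4.s_addr));
    return v6;
}
```

The result satisfies the standard IN6_IS_ADDR_V4MAPPED() test, which is one way an application could tell whether accept() returned the mapped peer address rather than an all-zero one.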


[PATCH] IPv6: ipv6_addr_type() doesn't know about RFC4193 addresses

2007-07-25 Thread Dave Johnson

ipv6_addr_type() doesn't check for 'Unique Local IPv6 Unicast
Addresses' (RFC4193) and returns IPV6_ADDR_RESERVED for that range.

SCTP uses this function and will fail bind() and connect() calls that
use RFC4193 addresses. SCTP will also ignore inbound connections from
RFC4193 addresses if listening on IPV6_ADDR_ANY.

There may be other users of ipv6_addr_type() that could also have
problems.

Signed-off-by: Dave Johnson <[EMAIL PROTECTED]>
Cc: Srinivas Akkipeddi <[EMAIL PROTECTED]>

= net/ipv6/addrconf_core.c 1.2 vs edited =
--- 1.2/net/ipv6/addrconf_core.c2007-02-26 14:42:57 -05:00
+++ edited/net/ipv6/addrconf_core.c 2007-07-25 15:21:41 -04:00
@@ -50,6 +50,9 @@
if ((st & htonl(0xFFC00000)) == htonl(0xFEC00000))
return (IPV6_ADDR_SITELOCAL | IPV6_ADDR_UNICAST |
IPV6_ADDR_SCOPE_TYPE(IPV6_ADDR_SCOPE_SITELOCAL));   
/* addr-select 3.1 */
+   if ((st & htonl(0xFE000000)) == htonl(0xFC000000))
+   return (IPV6_ADDR_UNICAST |
+   IPV6_ADDR_SCOPE_TYPE(IPV6_ADDR_SCOPE_GLOBAL));  
/* RFC 4193 */
 
if ((addr->s6_addr32[0] | addr->s6_addr32[1]) == 0) {
if (addr->s6_addr32[2] == 0) {
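
RFC 4193 unique local addresses occupy fc00::/7, so the test comes down to comparing the top seven bits of the address. A minimal userspace sketch of that prefix check follows; is_unique_local() is a hypothetical name and is not the kernel's ipv6_addr_type().

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: returns 1 if addr lies in fc00::/7 (RFC 4193).
 * Masking the first byte with 0xfe keeps exactly the top seven bits. */
static int is_unique_local(const struct in6_addr *addr)
{
    return (addr->s6_addr[0] & 0xfe) == 0xfc;
}
```

Addresses starting with fc or fd match, while the deprecated site-local range (fec0::/10) and global unicast addresses do not.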



Re: pte_offset_map for ppc assumes HIGHPTE

2007-07-25 Thread Benjamin Herrenschmidt
On Wed, 2007-07-25 at 18:30 -0500, Dave McCracken wrote:
> On Wednesday 25 July 2007, Benjamin Herrenschmidt wrote:
> > Depends... if you have CONFIG_HIGHMEM and not CONFIG_HIGHPTE, you are
> > wasting time going through kmap_atomic unnecessarily no ? it will probably
> > not do anything because the PTE page is in lowmem but still...
> 
> Probably not much time.  You still need to do the page to virtual 
> translation, 
> which kmap_atomic does for you.

Fair enough.

Ben.




Re: sysfs/udev broken in 2.6.23-rc1 [input, i2c, ...] (Was: sysfs/udev broken in latest git?)

2007-07-25 Thread Kay Sievers

On Wed, 2007-07-25 at 17:11 -0700, Greg KH wrote:
> On Wed, Jul 25, 2007 at 09:58:08AM +0200, Cornelia Huck wrote:
> > On Wed, 25 Jul 2007 02:19:18 +0200,
> > "Kay Sievers" <[EMAIL PROTECTED]> wrote:
> > 
> > > > >> Removing the dev->parent->bus check fixes it:
> > > 
> > > Yes, let's remove the check, I will check now if we possibly need to
> > > fix more than this or only the block-device patch.
> > 
> > It seems this is the only place we check for dev->parent->bus in the
> > current git tree.
> > 
> > Patch below.
> 
> Thanks for figuring this out, I'll add this to my tree.
> 
> So what is the input layer doing so differently from everyone else here?
> Is it correct?  (sorry, am at a conference this week, so can't dig into
> it as much as I would like to until Friday...)

It was the only place where we stacked (had a hierarchy of) class
devices. We got a class device being a child of another class device.

We never did anything like that and it was the reason to go for a
unified tree at /sys/devices/ instead of putting small hierarchy trees
all over the place, which can never be changed later, as they are the
defined entry points into the device tree.

Kay



2.6.22 x86_64 : kernel initial decompression hangs on vmware

2007-07-25 Thread Gabriel Barazer

Hi,

After upgrading the kernel to 2.6.22 in a VMware Workstation guest (versions
5.5 and 6), the kernel decompression stage ("Decompressing Linux...")
hangs for a very long time (~5 minutes) before finally succeeding
(displaying "done.\nBooting the kernel.\n"). During this time, the VM
process eats all the CPU time, like an infinite loop.
Between these two strings is the gunzip() function in
boot/compressed/misc.c, which does the real job, and the problem seems
to have appeared with commit 1ab60e0f72f71ec54831e525a3e1154f1c092408
(2.6.22-rc1 hangs, 2.6.21.6 works). The problem occurs with or without
CONFIG_RELOCATABLE enabled.


What are the possible ways to confirm where the problem is coming
from?


Thanks,

Gabriel


Re: [PATCH] IPv6: ipv6_addr_type() doesn't know about RFC4193 addresses

2007-07-25 Thread David Miller
From: Dave Johnson <[EMAIL PROTECTED]>
Date: Wed, 25 Jul 2007 19:49:09 -0400

> 
> ipv6_addr_type() doesn't check for 'Unique Local IPv6 Unicast
> Addresses' (RFC4193) and returns IPV6_ADDR_RESERVED for that range.
> 
> SCTP uses this function and will fail bind() and connect() calls that
> use RFC4193 addresses, SCTP will also ignore inbound connections from
> RFC4193 addresses if listening on IPV6_ADDR_ANY.
> 
> There may be other users of ipv6_addr_type() that could also have
> problems.

Contrarily, there may be ipv6_addr_type() call sites that really
do want to reject rfc4193 addresses.


Re: sysfs/udev broken in 2.6.23-rc1 [input, i2c, ...] (Was: sysfs/udev broken in latest git?)

2007-07-25 Thread Greg KH
On Wed, Jul 25, 2007 at 09:58:08AM +0200, Cornelia Huck wrote:
> On Wed, 25 Jul 2007 02:19:18 +0200,
> "Kay Sievers" <[EMAIL PROTECTED]> wrote:
> 
> > > >> Removing the dev->parent->bus check fixes it:
> > 
> > Yes, let's remove the check, I will check now if we possibly need to
> > fix more than this or only the block-device patch.
> 
> It seems this is the only place we check for dev->parent->bus in the
> current git tree.
> 
> Patch below.

Thanks for figuring this out, I'll add this to my tree.

So what is the input layer doing so differently from everyone else here?
Is it correct?  (sorry, am at a conference this week, so can't dig into
it as much as I would like to until Friday...)

thanks,

greg k-h


Re: 2.6.23-rc1-mm1

2007-07-25 Thread Greg KH
On Wed, Jul 25, 2007 at 11:05:22PM +0200, Gabriel C wrote:
> Gabriel C wrote:
> > H. Peter Anvin wrote:
> >> Sam Ravnborg wrote:
> >>> On Wed, Jul 25, 2007 at 08:48:50PM +0200, Michal Piotrowski wrote:
> >>>> On 25/07/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> >>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm1/
> >>>>>
> >>>>>
> >>>> Andi, this might be interesting for you
> >>>>
> >>>> make allmodconfig
> >>>> make
> >>>> [...]
> >>>> WARNING: Absolute relocations present
> >>>> Offset Info Type Sym.Value Sym.Name
> >>>> c02041b3 00705601   R_386_32 c0308aa8  _sdata
> >>> Who spat out this warning? Can you please provide make V=1 output
> >>> where we see the lines preceding the warning.
> >>>
> >>> And config please.
> >>>
> >> config: "make allmodconfig"
> >>
> >> System.map would be more interesting, especially what is at 0xc02041b3.
> > 
> > I get here :
> > 
> > WARNING: Absolute relocations present
> > Offset Info Type Sym.Value Sym.Name
> > c0202e73 00703601   R_386_32 c03071bc  _sdata
> > 
> > $ grep c03071bc System.map
> > c03071bc R __tracedata_end
> > c03071bc A _sdata
> 
> 
> Guessing is this patch ?
> 
> gregkh-driver-warn-when-statically-allocated-kobjects-are-used.patch: 
>   __tracedata_end = .;
> gregkh-driver-warn-when-statically-allocated-kobjects-are-used.patch:+  
> _sdata = .; /* End of text section */

This patch is a horrible hack to try to see if kobjects are static and
not dynamically created.

Dave, any ideas what is happening here?

thanks,

greg k-h


Re: i386-show-unhandled-signals-v3

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 16:40:06 -0700
[EMAIL PROTECTED] (Masoud Asgharifard Sharbiani) wrote:

> > Look: if there's a way in which an unprivileged user can trigger a printk
> > we fix it, end of story.  I don't know why this even slightly
> > controversial.
> > 
> 
> Fair enough. Here it is:

My favourite words.

> ---
> Hello,
> This patch makes the i386 behave the same way that x86_64 does when a
> segfault happens. A line gets printed to the kernel log so that tools
> that need to check for failures can behave more uniformly between
> different kernels. Like x86_64, it can be disabled by setting
> debug.show_unhandled_signals sysctl variable to 0 (or by doing
> echo 0 > /proc/sys/debug/show_unhandled_signals)

Do we really need the ratelimiting?  If the admin turns this on then he's
presumably prepared for the consequences.

I guess "yes", as people (even distros) are likely to turn this on and
forget about it.

The patch is larger than I expected, ho hum.


Re: 2.6.23-rc1: i386 section mismatch warnings

2007-07-25 Thread Gabriel C
Sam Ravnborg wrote:
> On Tue, Jul 24, 2007 at 09:48:48AM +0200, Gabriel C wrote:
>> Al Viro wrote:
>>> On Mon, Jul 23, 2007 at 09:18:38PM -0400, Jeff Garzik wrote:
>>>> make allmodconfig on i386:
>>>>
>>>> WARNING: vmlinux(.text+0xc0101183): Section mismatch: reference to
>>> Ignore.  vmlinux.o ones are interesting; so are ones in modules.
>>> vmlinux ones are either duplicates of vmlinux.o or false positives.
>> allyesconfig has a lot Section mismatch warnings , are these false positive 
>> too ?
>>
>>
>> http://lkml.org/lkml/2007/7/22/312
> 
> Fixed in latest kbuild.git. See the following two patches.
>

One left on vmlinux.o :) and some on vmlinux.

...

WARNING: vmlinux.o(.text+0x183): Section mismatch: reference to 
.init.text.1:start_kernel (between 'is386' and 'check_x87')

...

This is on current git head with the 2 patches and allyesconfig. 

 
>   Sam
> 


Gabriel


RE: [PATCH -mm] dma: INTEL_IOATDMA build fix

2007-07-25 Thread Nelson, Shannon
Satyam Sharma [mailto:[EMAIL PROTECTED] 
>
>Make CONFIG_INTEL_IOATDMA select CONFIG_DCA because it uses code
>exported from said dependency:
>
># CONFIG_DCA is not set
>CONFIG_INTEL_IOATDMA=m
>
>ERROR: "alloc_dca_provider" [drivers/dma/ioatdma.ko] undefined!
>ERROR: "register_dca_provider" [drivers/dma/ioatdma.ko] undefined!
>ERROR: "unregister_dca_provider" [drivers/dma/ioatdma.ko] undefined!
>ERROR: "free_dca_provider" [drivers/dma/ioatdma.ko] undefined!
>make[1]: *** [__modpost] Error 1
>
>"select" seems ok because CONFIG_DCA looks library-like and 
>doesn't itself
>depend upon anything else.
>
>Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>

Well, ioatdma should be smarter and not try to use ioat_dca.c if there
is no CONFIG_DCA.  However, until that happens, this is fine.

Signed-off-by: Shannon Nelson <[EMAIL PROTECTED]>


[ANNOUNCE] Stacked GIT 0.13

2007-07-25 Thread Catalin Marinas
Stacked GIT 0.13 release is available from http://www.procode.org/stgit/.

StGIT is a Python application providing similar functionality to Quilt
(i.e. pushing/popping patches to/from a stack) on top of GIT. These
operations are performed using GIT commands and the patches are stored
as GIT commit objects, allowing easy merging of the StGIT patches into
other repositories using standard GIT functionality.

The main features in this release:

* Documentation directory with man pages
* Safety checks for the 'rebase' command
* Various contrib scripts
* 'cp' command to copy files
* 'sink' command to complement 'float'
* '--diff-opts' option to some commands for passing additional
arguments to 'git-diff-*'
* 'stgit.mail.prefix' configuration option for the default 'mail
--prefix' value
* Interactive 2-way merging via xxdiff or emacs (previously, only
3-way merging had this feature)
* Slightly changed behaviour to the 'patches' command when no
argument is given to show the patches touching the locally modified
files
* Correct importing of multipart e-mails
* '--unrelated' option to 'mail' to send patches unthreaded and
without sequence numbering
* '--update' option to 'refresh' to only check in the files
already modified by the current patch (similar to 'pick --update')
* '--keep' option to 'goto' (though it only works for patch popping)
* '--expose' option to 'pick' to append the picked commit id to
the log (similar to the 'git cherry-pick -x' command)
* The 'new' command can automatically generate the patch name from
the given log
* 'uncommit' can generate patches up to a given commit id
* Bug fixes


Acknowledgements (special thanks to Yann Dirson who was the main
contributor to this release; many thanks to all the contributors
below):

Yann Dirson (88):
  Refuse to pull/rebase when rendered unsafe by (un)commits.
  Rework of the 'stg pull' policy.
  Correctly raise exception on unknown pull-policy.
  Add a testcase for the safety of pull-policy='pull'.
  Factorize editor handling.
  Add a Documentation directory inspired by the git one.
  Merge doc/ and Documentation/ directories.
  Add missing files to stgit manifest.
  Add contrib/stg-whatchanged: look at what would be changed by refreshing.
  Add contrib/stg-show-old: show old version of a patch.
  Add contrib/stg-fold-files-from: pick selected changes from a
patch above current one
  Add contrib/stg-swallow: completely merge an unapplied patch
into current one
  Add contrib/stg-cvs: helper script to manage a mixed cvs/stgit
working copy
  Add contrib/stg-gitk: helper script to run gitk
  Add contrib/stg-mdiff: display diffs of diffs.
  Add contrib/stg-sink: sink a patch to given location (mirrors float).
  Fix bash completion to not garble the screen with an error message.
  Add 'stg cp' command.
  Do not link to docs that will never be written.
  Some clarifications to the main doc.
  Make 'stg cp' 2nd form safe.
  Fixed t2102-pull-policy-rebase to really test 'rebase' policy.
  Fix stack deletion when orig-base is present.
  Fix diagnostic messages on patch deletion and simplify others.
  Make use exception raised by removedirs.
  Drop utils.remove_dirs() in favor of os.removedirs().
  Copy parentbranch setting on 'stg branch --clone'.
  Remove debugging output from contrib/stg-sink
  Add a script to quickly run an interactive test session.
  Add support for unsetting config vars.
  Cleanup parent information on stgit branch deletion.
  Fix typo in help string.
  Don't use section 7 for main manpage.
  Add doc for 'clone' and 'init'.
  Add doc for 'branch'.
  Fix doc cross-refs.
  Make the documentation of options more consistent.
  Stop advertising 'pull' as the only operation blocked on protected stacks.
  Add "stg bury" command, with the functionality of contrib/stg-sink.
  Document the new 'stg branch --description' features.
  Avoid contrib/stg-swallow deleting unrelated empty patches.
  Cleanup variable names in pick.
  Copy patchlogs when cloning a stack or picking a patch.
  Teach bash to complete branch names in some places.
  Document patch syntax.
  Stop recording branch.*.remote to '.' for local parents.
  Rename "bury" back to "sink".
  Add 2 new contrib scripts.
  Call external commands without a shell where possible.
  Make diff flags handling more modular.
  Add new --diff-opts/-O flag to diff- and status-related commands.
  Fix deletion and move of a hidden patch (gna bug #9244).
  Fix removal of series with non-existant trash dir.
  Fix removal of series to nuke the formatversion config item.
  Robustify rebase test: check patches are reapplied.
  Catch early trying rebasing to unknown ref, and add testcase.
  Fixed typo in 

Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-25 Thread André Goddard Rosa

Question:
  Could those who have found this prefetch helps them a lot say how
  many disks they have?  In particular, is their swap on the same
  disk spindle as their root and user files?

Answer - for me:
  On my system where updatedb is a big problem, I have one, slow, disk.


On both desktop and laptop.

Cheers,
--
[]s,
André Goddard


Re: i386-show-unhandled-signals-v3

2007-07-25 Thread Masoud Asgharifard Sharbiani
On Wed, Jul 25, 2007 at 04:25:28PM -0700, Andrew Morton wrote:
> On Wed, 25 Jul 2007 14:07:56 -0700
> "Masoud Sharbiani" <[EMAIL PROTECTED]> wrote:
> 
> > On 7/25/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > On Wed, 25 Jul 2007 16:57:43 +0200
> > > Andi Kleen <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Wednesday 25 July 2007 16:45, Kirill Korotaev wrote:
> > > > > plz don't enable it by default... :/
> > > > > any user can spam syslog with these messages and if syslog is run as 
> > > > > root
> > > > > can take the whole diskspace...
> > > >
> > > > There are plenty of other ways to cause syslog messages anyways;
> > >
> > > tell us what they are and we'll fix them?
> > >
> > > > this argument is 100% bogus.
> > >
> > > people don't like leaving themselves open to logspamming.
> > >
> > >
> > > For this particular issue: someone please send a patch.
> > >
> > Andrew,
> > This is rate limited; Do you need me to rewrite it with it being
> > disabled by default?
> > 
> 
> Yes please.
> 
> Look: if there's a way in which an unprivileged user can trigger a printk
> we fix it, end of story.  I don't know why this even slightly
> controversial.
> 

Fair enough. Here it is:
---
Hello,
This patch makes the i386 behave the same way that x86_64 does when a
segfault happens. A line gets printed to the kernel log so that tools
that need to check for failures can behave more uniformly between
different kernels. Like x86_64, it can be disabled by setting
debug.show_unhandled_signals sysctl variable to 0 (or by doing
echo 0 > /proc/sys/debug/show_unhandled_signals)

Also, all of the lines being printed now go through printk_ratelimit(),
so that a local user cannot mount a DoS on the log with a program like
the following:
main()
{
   while (1)
   if (!fork()) *(int *)0 = 0;
}
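
As a rough illustration of why rate limiting blunts a program like that,
here is a minimal userspace sketch of a token-bucket limiter. It is not the
kernel's printk_ratelimit() code: struct ratelimit, ratelimit_init() and
ratelimit_allow() are hypothetical names, and time is passed in as a plain
number so the behaviour is easy to follow.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a token-bucket limiter; not kernel code. */
struct ratelimit {
	long interval;	/* time units needed to earn one message */
	long burst;	/* maximum messages allowed back to back */
	long tokens;	/* accumulated credit, in time units */
	long last;	/* timestamp of the previous call */
};

void ratelimit_init(struct ratelimit *rl, long interval, long burst, long now)
{
	assert(rl && interval > 0 && burst > 0);
	rl->interval = interval;
	rl->burst = burst;
	rl->tokens = interval * burst;	/* start with a full bucket */
	rl->last = now;
}

/* Returns true if the caller may log a message at time `now`. */
bool ratelimit_allow(struct ratelimit *rl, long now)
{
	assert(rl && now >= rl->last);

	/* Earn credit for the time elapsed, capped at a full bucket. */
	rl->tokens += now - rl->last;
	rl->last = now;
	if (rl->tokens > rl->interval * rl->burst)
		rl->tokens = rl->interval * rl->burst;

	/* Each message costs one interval's worth of credit. */
	if (rl->tokens >= rl->interval) {
		rl->tokens -= rl->interval;
		return true;
	}
	return false;
}
```

With an interval of 5 and a burst of 10, ten messages get through at once
and after that at most one per 5 time units, however many children fault.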


cheers,
Masoud

Signed-off-by: Masoud Sharbiani <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/signal.c b/arch/i386/kernel/signal.c
index d574e38..f5dd856 100644
--- a/arch/i386/kernel/signal.c
+++ b/arch/i386/kernel/signal.c
@@ -199,6 +199,13 @@ asmlinkage int sys_sigreturn(unsigned long __unused)
return eax;
 
 badframe:
+   if (show_unhandled_signals && printk_ratelimit())
+   printk("%s%s[%d] bad frame in sigreturn frame:%p eip:%lx"
+  " esp:%lx oeax:%lx\n",
+   current->pid > 1 ? KERN_INFO : KERN_EMERG,
+   current->comm, current->pid, frame, regs->eip,
+   regs->esp, regs->orig_eax);
+
force_sig(SIGSEGV, current);
return 0;
 }  
diff --git a/arch/i386/kernel/traps.c b/arch/i386/kernel/traps.c
index 18c1c28..c20283c 100644
--- a/arch/i386/kernel/traps.c
+++ b/arch/i386/kernel/traps.c
@@ -611,6 +611,13 @@ fastcall void __kprobes do_general_protection(struct pt_regs * regs,
 
current->thread.error_code = error_code;
current->thread.trap_no = 13;
+   if (show_unhandled_signals && unhandled_signal(current, SIGSEGV) &&
+   printk_ratelimit())
+   printk(KERN_INFO
+   "%s[%d] general protection eip:%lx esp:%lx error:%lx\n",
+   current->comm, current->pid,
+   regs->eip, regs->esp, error_code);
+
force_sig(SIGSEGV, current);
return;
 
diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
index 1ecb3e4..52c940b 100644
--- a/arch/i386/mm/fault.c
+++ b/arch/i386/mm/fault.c
@@ -283,6 +283,8 @@ static inline int vmalloc_fault(unsigned long address)
return 0;
 }
 
+int show_unhandled_signals = 0;
+
 /*
  * This routine handles page faults.  It determines the address,
  * and the problem, and then passes it off to one of the appropriate
@@ -470,6 +472,14 @@ bad_area_nosemaphore:
if (is_prefetch(regs, address, error_code))
return;
 
+   if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&
+   printk_ratelimit()) {
+   printk("%s%s[%d]: segfault at %08lx eip %08lx "
+   "esp %08lx error %lx\n",
+   tsk->pid > 1 ? KERN_INFO : KERN_EMERG,
+   tsk->comm, tsk->pid, address, regs->eip,
+   regs->esp, error_code);
+   }
tsk->thread.cr2 = address;
/* Kernel addresses are always protection faults */
tsk->thread.error_code = error_code | (address >= TASK_SIZE);
diff --git a/arch/x86_64/kernel/signal.c b/arch/x86_64/kernel/signal.c
index 290f5d8..f9506f6 100644
--- a/arch/x86_64/kernel/signal.c
+++ b/arch/x86_64/kernel/signal.c
@@ -480,7 +480,7 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where)
 { 
struct task_struct *me = current; 
-   if (exception_trace)
+   if (show_unhandled_signals && printk_ratelimit())
printk("%s[%d] bad 

Re: [GIT PATCH] more USB patches for 2.6.22

2007-07-25 Thread Linus Torvalds


On Thu, 19 Jul 2007, Greg KH wrote:
>
> Here are some more USB patches and fixes against your 2.6.22 git tree.
> 
> They add a new usb gadget driver, more urb->status cleanups, a new sysfs
> attribute to get the raw config of the usb device, and some bugfixes and
> documentation updates.

I have a flaky(?) USB multi-card reader, and I just got an oops with it on 
x86-64. It was preceded by some of the IO errors:

end_request: I/O error, dev sdc, sector 0
sd 11:0:0:1: [sdc] Result: hostbyte=0x07 driverbyte=0x00
end_request: I/O error, dev sdc, sector 0
Buffer I/O error on device sdc, logical block 0
usb 2-5: reset high speed USB device using ehci_hcd and address 10
usb 2-5: reset high speed USB device using ehci_hcd and address 10
usb 2-5: reset high speed USB device using ehci_hcd and address 10
usb 2-5: reset high speed USB device using ehci_hcd and address 10
usb 2-5: reset high speed USB device using ehci_hcd and address 10
usb 2-5: reset high speed USB device using ehci_hcd and address 10
usb 2-5: device descriptor read/all, error 0

but the oops itself happened when I then removed the USB device due to 
the errors, causing this:

usb 2-5: USB disconnect, address 10
sd 11:0:0:1: [sdc] Result: hostbyte=0x07 driverbyte=0x00
end_request: I/O error, dev sdc, sector 0
Buffer I/O error on device sdc, logical block 0
sd 11:0:0:1: [sdc] Result: hostbyte=0x07 driverbyte=0x00
end_request: I/O error, dev sdc, sector 0
Buffer I/O error on device sdc, logical block 0
sd 11:0:0:1: [sdc] Result: hostbyte=0x01 driverbyte=0x00
end_request: I/O error, dev sdc, sector 0
Buffer I/O error on device sdc, logical block 0
sd 11:0:0:1: [sdc] Result: hostbyte=0x01 driverbyte=0x00
end_request: I/O error, dev sdc, sector 0
Buffer I/O error on device sdc, logical block 0
sd 11:0:0:1: [sdc] Result: hostbyte=0x01 driverbyte=0x00
end_request: I/O error, dev sdc, sector 0
Buffer I/O error on device sdc, logical block 0
Dev sdc: unable to read RDB block 0
sd 11:0:0:1: [sdc] Result: hostbyte=0x01 driverbyte=0x00
end_request: I/O error, dev sdc, sector 0
Buffer I/O error on device sdc, logical block 0
 unable to read partition table
sd 11:0:0:1: [sdc] Attached SCSI removable disk
sd 11:0:0:1: Attached scsi generic sg3 type 0
usb-storage: device scan complete

and finally the oops itself:


general protection fault:  [1] SMP
CPU 0
Modules linked in:
Pid: 214, comm: khubd Not tainted 2.6.22-g20082208 #56
RIP: 0010:[]  [] kfree+0x27/0x81
RSP: 0018:81012bd0dd90  EFLAGS: 00010212
RAX: 037d001b2d7d01b8 RBX: 81000100 RCX: 80314f0f
RDX: 81012337b738 RSI: 037c811b2e7d01b8 RDI: ff241b0cff251c0b
RBP: ff241b0cff251c0b R08: 8062eed0 R09: 81012bc0f430
R10: 0287 R11: 803ed953 R12: 81008642f140
R13:  R14: 1540 R15: 0008
FS:  () GS:806a() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 2b02340410a0 CR3: 00010bd4b000 CR4: 06e0
Process khubd (pid: 214, threadinfo 81012bd0c000, task 81012bed36b0)
Stack:  81012337b738 81011e9fa800 81008642f140 803f50c4
 81012337b738 81011e9fa800 8064ae70 81011e9fa888
 81012ad60978 81012ad60800 81012ad60800 803ed96c
Call Trace:
 [] usb_destroy_configuration+0x85/0xee
 [] usb_release_dev+0x19/0x55
 [] kobject_cleanup+0x52/0x70
 [] kobject_release+0x0/0x9
 [] kref_put+0x5d/0x68
 [] hub_thread+0x390/0xb27
 [] autoremove_wake_function+0x0/0x2e
 [] hub_thread+0x0/0xb27
 [] kthread+0x47/0x76
 [] child_rip+0xa/0x12
 [] kthread+0x0/0x76
 [] child_rip+0x0/0x12

Code: 48 8b 06 25 00 40 02 00 48 3d 00 40 02 00 75 04 48 8b 76 10
RIP  [] kfree+0x27/0x81
 RSP 

Looks like another reference counting bug...

Linus


Re: Problems with framebuffer in 2.6.22-git17

2007-07-25 Thread Antonino A. Daplas
On Tue, 2007-07-24 at 22:45 +0100, Adrian McMenamin wrote:
> On 23/07/07, Antonino A. Daplas <[EMAIL PROTECTED]> wrote:
> > On Sun, 2007-07-22 at 19:41 +0100, Adrian McMenamin wrote:
> > > I am having problems with the pvr2 fb on the Dreamcast in 2.6.22-git17
> > > - when the code is executed it appears to lock the Dreamcast up.
> > >
> > > The problem seems to be:
> > >
> > >  fb_notifier_call_chain(FB_EVENT_FB_REGISTERED, );
> > >
> > > In drivers/video/fbmem.c
> > >
> > > This hasn't been an issue before, so are there any recent changes that
> > > might have caused this?
> >
> > What's the last kernel that worked for you? Can you also post your
> > config?
> >
> > >
> > > (fb_notifier_call_chain calls a succession of stubs ending in
> > > __blocking_notifier_call_chain in kernel/sys.c)
> > >
> >
> > Try reverting commit a66ad56eb2c9644717da4d7f05f971d6786145e3.
> >
> > Tony
> >
> 
> Tony,
> 
> I have checked this a few times now, including against Paul's git as
> well as Linus's and the Dreamcast won't boot without its reversion.
> Don't know why, but it needs to be reverted until a better fix is
> available.

I'm also confused. Can you change the color depth to 32 bpp ('fbset
-depth 32')?  I'm thinking of a possible pseudo_palette overrun.

Tony 




Re: pte_offset_map for ppc assumes HIGHPTE

2007-07-25 Thread Dave McCracken
On Wednesday 25 July 2007, Benjamin Herrenschmidt wrote:
> Depends... if you have CONFIG_HIGHMEM and not CONFIG_HIGHPTE, you are
> wasting time going through kmap_atomic unnecessarily no ? it will probably
> not do anything because the PTE page is in lowmem but still...

Probably not much time.  You still need to do the page to virtual translation, 
which kmap_atomic does for you.

Dave McCracken


Re: 2.6.23-rc1-mm1

2007-07-25 Thread Len Brown
On Wednesday 25 July 2007 14:58, Andrew Morton wrote:
> On Wed, 25 Jul 2007 13:23:04 -0400
> Len Brown <[EMAIL PROTECTED]> wrote:
> 
> > Andrew, you want to re-pull the acpi tree, or do you want me to send
> > you some patches on top of the current mm?
> 
> I'd appreciate a fix for this one, please - I'll drop it in the hot-fixes
> directory as quite a few people seem to be hitting this.

Maybe simpler for mm1 to go backwards in time rather than forwards.
This should fix the problem at hand.

cheers,
-Len

commit 106994f83cdd97c77bfe1b333ca369560b6d0649
Author: Len Brown <[EMAIL PROTECTED]>
Date:   Wed Jul 25 19:17:38 2007 -0400

ACPI: revert d-states branch from Jun-17 to Jun-19 for 2.6.23-rc1-mm1

Signed-off-by: Len Brown <[EMAIL PROTECTED]>
---
 drivers/acpi/sleep/main.c  |   75 -
 drivers/pci/pci-acpi.c |   28 +---
 drivers/pci/pci.c  |8 +--
 drivers/pci/pci.h  |2
 drivers/pnp/driver.c   |5 --
 drivers/pnp/pnpacpi/core.c |   14 --
 include/acpi/acpi_bus.h|2
 include/linux/pnp.h|4 -
 8 files changed, 9 insertions(+), 129 deletions(-)

diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c
index 34abe8e..ada2a6e 100644
--- a/drivers/acpi/sleep/main.c
+++ b/drivers/acpi/sleep/main.c
@@ -261,81 +261,6 @@ static struct platform_hibernation_ops acpi_hibernation_ops = {
 };
 #endif /* CONFIG_SOFTWARE_SUSPEND */
 
-/**
- * acpi_pm_device_sleep_state - return preferred power state of ACPI device
- * in the system sleep state given by %acpi_target_sleep_state
- * @dev: device to examine
- * @wake: if set, the device should be able to wake up the system
- * @d_min_p: used to store the upper limit of allowed states range
- * Return value: preferred power state of the device on success, -ENODEV on
- * failure (ie. if there's no 'struct acpi_device' for @dev)
- *
- * Find the lowest power (highest number) ACPI device power state that
- * device @dev can be in while the system is in the sleep state represented
- * by %acpi_target_sleep_state.  If @wake is nonzero, the device should be
- * able to wake up the system from this sleep state.  If @d_min_p is set,
- * the highest power (lowest number) device power state of @dev allowed
- * in this system sleep state is stored at the location pointed to by it.
- *
- * The caller must ensure that @dev is valid before using this function.
- * The caller is also responsible for figuring out if the device is
- * supposed to be able to wake up the system and passing this information
- * via @wake.
- */
-
-int acpi_pm_device_sleep_state(struct device *dev, int wake, int *d_min_p)
-{
-   acpi_handle handle = DEVICE_ACPI_HANDLE(dev);
-   struct acpi_device *adev;
-   char acpi_method[] = "_SxD";
-   unsigned long d_min, d_max;
-
-   if (!handle || ACPI_FAILURE(acpi_bus_get_device(handle, &adev))) {
-   printk(KERN_ERR "ACPI handle has no context!\n");
-   return -ENODEV;
-   }
-
-   acpi_method[2] = '0' + acpi_target_sleep_state;
-   /*
-* If the sleep state is S0, we will return D3, but if the device has
-* _S0W, we will use the value from _S0W
-*/
-   d_min = ACPI_STATE_D0;
-   d_max = ACPI_STATE_D3;
-
-   /*
-* If present, _SxD methods return the minimum D-state (highest power
-* state) we can use for the corresponding S-states.  Otherwise, the
-* minimum D-state is D0 (ACPI 3.x).
-*
-* NOTE: We rely on acpi_evaluate_integer() not clobbering the integer
-* provided -- that's our fault recovery, we ignore retval.
-*/
-   if (acpi_target_sleep_state > ACPI_STATE_S0)
-   acpi_evaluate_integer(handle, acpi_method, NULL, &d_min);
-
-   /*
-* If _PRW says we can wake up the system from the target sleep state,
-* the D-state returned by _SxD is sufficient for that (we assume a
-* wakeup-aware driver if wake is set).  Still, if _SxW exists
-* (ACPI 3.x), it should return the maximum (lowest power) D-state that
-* can wake the system.  _S0W may be valid, too.
-*/
-   if (acpi_target_sleep_state == ACPI_STATE_S0 ||
-   (wake && adev->wakeup.state.enabled &&
-adev->wakeup.sleep_state <= acpi_target_sleep_state)) {
-   acpi_method[3] = 'W';
-   acpi_evaluate_integer(handle, acpi_method, NULL, &d_max);
-   /* Sanity check */
-   if (d_max < d_min)
-   d_min = d_max;
-   }
-
-   if (d_min_p)
-   *d_min_p = d_min;
-   return d_max;
-}
-
 /*
  * Toshiba fails to preserve interrupts over S1, reinitialization
  * of 8259 is needed after S1 resume.
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 67c63d1..c806249 100644
--- 

Re: i386-show-unhandled-signals-v3

2007-07-25 Thread Andrew Morton
On Wed, 25 Jul 2007 14:07:56 -0700
"Masoud Sharbiani" <[EMAIL PROTECTED]> wrote:

> On 7/25/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 25 Jul 2007 16:57:43 +0200
> > Andi Kleen <[EMAIL PROTECTED]> wrote:
> >
> > > On Wednesday 25 July 2007 16:45, Kirill Korotaev wrote:
> > > > plz don't enable it by default... :/
> > > > any user can spam syslog with these messages and if syslog is run as root
> > > > can take the whole diskspace...
> > >
> > > There are plenty of other ways to cause syslog messages anyways;
> >
> > tell us what they are and we'll fix them?
> >
> > > this argument is 100% bogus.
> >
> > people don't like leaving themselves open to logspamming.
> >
> >
> > For this particular issue: someone please send a patch.
> >
> Andrew,
> This is rate limited; Do you need me to rewrite it with it being
> disabled by default?
> 

Yes please.

Look: if there's a way in which an unprivileged user can trigger a printk
we fix it, end of story.  I don't know why this is even slightly
controversial.



Re: pte_offset_map for ppc assumes HIGHPTE

2007-07-25 Thread Benjamin Herrenschmidt
On Thu, 2007-07-26 at 01:18 +0200, Andreas Schwab wrote:
> Satya <[EMAIL PROTECTED]> writes:
> 
> > hello,
> > The implementation of pte_offset_map() for ppc assumes that PTEs are
> > kept in highmem (CONFIG_HIGHPTE). There is only one implementation of
> > pte_offset_map() as follows (include/asm-ppc/pgtable.h):
> >
> > #define pte_offset_map(dir, addr)   \
> >  ((pte_t *) kmap_atomic(pmd_page(*(dir)), KM_PTE0) + pte_index(addr))
> >
> > Shouldn't this be made conditional according to whether CONFIG_HIGHPTE
> > is defined or not
> 
> kmap_atomic is always defined with or without CONFIG_HIGHPTE.
> 
> > (as implemented in include/asm-i386/pgtable.h) ?
> 
> I don't think that needs it either.

Depends... if you have CONFIG_HIGHMEM and not CONFIG_HIGHPTE, you are wasting
time going through kmap_atomic unnecessarily, no? It will probably not do
anything because the PTE page is in lowmem, but still...

Ben.



Re: pte_offset_map for ppc assumes HIGHPTE

2007-07-25 Thread Andreas Schwab
Satya <[EMAIL PROTECTED]> writes:

> hello,
> The implementation of pte_offset_map() for ppc assumes that PTEs are
> kept in highmem (CONFIG_HIGHPTE). There is only one implementation of
> pte_offset_map() as follows (include/asm-ppc/pgtable.h):
>
> #define pte_offset_map(dir, addr)   \
>  ((pte_t *) kmap_atomic(pmd_page(*(dir)), KM_PTE0) + pte_index(addr))
>
> Shouldn't this be made conditional according to whether CONFIG_HIGHPTE
> is defined or not

kmap_atomic is always defined with or without CONFIG_HIGHPTE.

> (as implemented in include/asm-i386/pgtable.h) ?

I don't think that needs it either.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: Time Problems with 2.6.23-rc1-gf695baf2

2007-07-25 Thread Eric Sesterhenn / Snakebyte
* Len Brown ([EMAIL PROTECTED]) wrote:
> > > > > [   13.506890] ACPI Exception (processor_throttling-0084): 
> > > > > AE_NOT_FOUND, Evaluating _PTC [20070126]
> > > > > [   13.507101] ACPI Exception (processor_throttling-0147): 
> > > > > AE_NOT_FOUND, Evaluating _TSS [20070126]
> 
> Note that these are just noise -- new code being verbose when looking for an 
> optional feature.
> 
> The fact that hitting the power button a bunch of times makes the
> system move along suggests some sort of missing interrupt problem --
> most likely the timer itself.
> 
> [   13.868574] Probing IDE interface ide0...
> [  387.279576] Clocksource tsc unstable (delta = 370195339890 ns)
> 
> 5-minutes -- a long probe:-)
> 
> > CONFIG_NO_HZ=y
> 
> does CONFIG_NO_HZ=n make a difference?

[   41.007654] EXT3 FS on hda1, internal journal
[  322.133656] Clocksource tsc unstable (delta = 276476174785 ns)
Boot went fine but the system got pretty unresponsive later: 2-3 seconds
delay after keypresses on an idle system, and a hang during shutdown which I
had to resolve by pressing the power button (not to switch it off the hard
way, but to keep it rebooting).

> > CONFIG_HIGH_RES_TIMERS=y
> 
> does CONFIG_HIGH_RES_TIMERS=n make a difference?

Doesn't change anything.

> does "irqpoll" make any difference?
> does "notsc" make any difference?
> does "idle=poll" make any difference?

I tried these with HIGH_RES_TIMERS=n: irqpoll and notsc don't change
a thing, idle=poll makes it boot normally.


Greetings, Eric


Re: very big device. try to use READ CAPACITY(16)

2007-07-25 Thread Robert Hancock

Arkadiusz Miskiewicz wrote:

Hello,

What does "very big device. try to use READ CAPACITY(16)" mean for user? Is 
this advice for driver developer or for user (if for user then what does it 
mean exactly) ?


It isn't really advice at all, just indicates that the sd driver needed 
to use the bigger version of READ CAPACITY for that device. It looks 
like that message has now been reworded to "trying to use" instead of 
"try to use", so it doesn't sound like it's saying the user has to do 
something.
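
To make the "bigger version" point concrete, here is a small sketch of the
decision involved, assuming the usual SCSI convention that READ CAPACITY(10)
reports 0xFFFFFFFF as the last LBA when the real value does not fit in 32
bits. The helper names are made up for illustration and this is not the sd
driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * READ CAPACITY(10) carries the last LBA in a 32-bit field. A device whose
 * last LBA does not fit reports this value, which tells the initiator to
 * retry with READ CAPACITY(16) and its 64-bit field.
 */
#define RC10_OVERFLOW 0xFFFFFFFFu

/* Hypothetical helper: does the 10-byte reply call for the 16-byte one? */
bool needs_read_capacity_16(uint32_t rc10_last_lba)
{
	return rc10_last_lba == RC10_OVERFLOW;
}

/* Hypothetical helper: capacity in decimal megabytes, rounded to nearest. */
uint64_t capacity_mb(uint64_t sectors, uint32_t sector_size)
{
	assert(sector_size > 0);
	return (sectors * sector_size + 500000u) / 1000000u;
}
```

The device in the log reports 4823210240 sectors of 512 bytes, which is more
than a 32-bit LBA field can express, hence the message. Rounding that
capacity to the nearest decimal megabyte gives the 2469484 MB shown in the
log, though the driver may well compute it differently.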





sdc : very big device. try to use READ CAPACITY(16).
SCSI device sdc: 4823210240 512-byte hdwr sectors (2469484 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 12 00 00
SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

sdc : very big device. try to use READ CAPACITY(16).
SCSI device sdc: 4823210240 512-byte hdwr sectors (2469484 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 12 00 00
SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

 sdc: unknown partition table
sd 0:2:0:0: Attached scsi disk sdc

2.6.21.6, stex driver



--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: pte_offset_map for ppc assumes HIGHPTE

2007-07-25 Thread Benjamin Herrenschmidt
On Wed, 2007-07-25 at 17:16 -0500, Satya wrote:
> hello,
> The implementation of pte_offset_map() for ppc assumes that PTEs are
> kept in highmem (CONFIG_HIGHPTE). There is only one implementation of
> pte_offset_map() as follows (include/asm-ppc/pgtable.h):
> 
> #define pte_offset_map(dir, addr)   \
>  ((pte_t *) kmap_atomic(pmd_page(*(dir)), KM_PTE0) + pte_index(addr))
> 
> Shouldn't this be made conditional according to whether CONFIG_HIGHPTE
> is defined or not (as implemented in include/asm-i386/pgtable.h) ?
> 
> the same goes for pte_offset_map_nested and the corresponding unmap functions.

Do we have CONFIG_HIGHMEM without CONFIG_HIGHPTE ? If yes, then indeed,
we should change that. Though I'm not sure I see the point of splitting
those 2 options.
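
For anyone following along, the conditional form being asked about would
look roughly like the sketch below. It is a userspace mock, not the ppc or
i386 header: the types, the mock_ helpers, MOCK_HIGHPTE and the simplified
two-argument macro are all stand-ins, chosen only to show that the
non-HIGHPTE variant skips the kmap_atomic() call and uses the page's
lowmem address directly.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for kernel types and helpers; everything here is a mock. */
typedef unsigned long pte_t;
struct page { pte_t *vaddr; };

int kmap_calls;	/* counts trips through the mock kmap_atomic() */

/* Mock of page_address(): a lowmem page already has a kernel address. */
pte_t *mock_page_address(struct page *pg)
{
	assert(pg != NULL);
	return pg->vaddr;
}

/* Mock of kmap_atomic(): same result for a lowmem page, plus bookkeeping. */
pte_t *mock_kmap_atomic(struct page *pg)
{
	assert(pg != NULL);
	kmap_calls++;
	return pg->vaddr;
}

#ifdef MOCK_HIGHPTE
/* PTE pages may live in highmem, so a temporary mapping is required. */
#define pte_offset_map(pg, idx)	(mock_kmap_atomic(pg) + (idx))
#else
/* PTE pages are always in lowmem, so the direct address is enough. */
#define pte_offset_map(pg, idx)	(mock_page_address(pg) + (idx))
#endif
```

Either variant yields the same pointer for a lowmem PTE page; the only
difference in this mock is whether the mapping helper gets called at all.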

Ben.
 



very big device. try to use READ CAPACITY(16)

2007-07-25 Thread Arkadiusz Miskiewicz

Hello,

What does "very big device. try to use READ CAPACITY(16)" mean for user? Is 
this advice for driver developer or for user (if for user then what does it 
mean exactly) ?


sdc : very big device. try to use READ CAPACITY(16).
SCSI device sdc: 4823210240 512-byte hdwr sectors (2469484 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 12 00 00
SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdc : very big device. try to use READ CAPACITY(16).
SCSI device sdc: 4823210240 512-byte hdwr sectors (2469484 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 12 00 00
SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdc: unknown partition table
sd 0:2:0:0: Attached scsi disk sdc

2.6.21.6, stex driver

-- 
Arkadiusz Miśkiewicz        PLD/Linux Team
arekm / maven.pl            http://ftp.pld-linux.org/


Re: [PATCH 4/7] eCryptfs: Comments for some structs

2007-07-25 Thread Satyam Sharma

Trivial nits ...

On 7/26/07, Michael Halcrow <[EMAIL PROTECTED]> wrote:

[...]
+/**
+ * ecryptfs_global_auth_tok structs refer to authentication token keys
+ * in the user keyring that apply to newly created files. A list of
+ * these objects hangs off of the mount_crypt_stat struct for any
+ * given eCryptfs mount. This struct maintains a reference to both the
+ * key contents and the key itself so that the key can be put on
+ * unmount.
+ */


/** is used to annotate kernel-doc style comments, which this
one isn't -- IIRC, kernel-doc doesn't like this (?)


 struct ecryptfs_global_auth_tok {
 #define ECRYPTFS_AUTH_TOK_INVALID 0x0001
u32 flags;
-   struct list_head mount_crypt_stat_list;
-   struct key *global_auth_tok_key;
-   struct ecryptfs_auth_tok *global_auth_tok;
-   unsigned char sig[ECRYPTFS_SIG_SIZE_HEX + 1];



+   struct list_head mount_crypt_stat_list; /* Default auth_tok list for
+* the mount_crypt_stat */
+   struct key *global_auth_tok_key; /* The key from the user's keyring for
+ * the sig */


Tsk. You could consider using kernel-doc style itself to comment the
structure -- this stuff goes up there and doesn't look icky.


+   struct ecryptfs_auth_tok *global_auth_tok; /* The key contents */
+   unsigned char sig[ECRYPTFS_SIG_SIZE_HEX + 1]; /* The key identifier */
 };

+/**
+ * Typically, eCryptfs will use the same ciphers repeatedly throughout
+ * the course of its operations. In order to avoid unnecessarily
+ * destroying and initializing the same cipher repeatedly, eCryptfs
+ * keeps a list of crypto API contexts around to use when needed.
+ */


Again, you could consider using kernel-doc style comments here.
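
For reference, a kernel-doc style version of the first structure might look
something like the sketch below. The member descriptions are lifted from
the patch's own comments; the stand-in types and the value of
ECRYPTFS_SIG_SIZE_HEX are assumptions added only so the fragment compiles
outside the kernel tree.

```c
#include <assert.h>

/* Stand-ins so the sketch compiles outside the kernel tree. */
typedef unsigned int u32;
struct list_head { struct list_head *next, *prev; };
struct key;
struct ecryptfs_auth_tok;
#define ECRYPTFS_SIG_SIZE_HEX 16	/* assumed value, for illustration */

/**
 * struct ecryptfs_global_auth_tok - auth token key applied to new files
 * @flags: status flags such as ECRYPTFS_AUTH_TOK_INVALID
 * @mount_crypt_stat_list: default auth_tok list for the mount_crypt_stat
 * @global_auth_tok_key: the key from the user's keyring for the sig
 * @global_auth_tok: the key contents
 * @sig: the key identifier
 *
 * A list of these objects hangs off of the mount_crypt_stat struct for
 * any given eCryptfs mount. Both the key contents and the key itself are
 * referenced so that the key can be put on unmount.
 */
struct ecryptfs_global_auth_tok {
#define ECRYPTFS_AUTH_TOK_INVALID 0x0001
	u32 flags;
	struct list_head mount_crypt_stat_list;
	struct key *global_auth_tok_key;
	struct ecryptfs_auth_tok *global_auth_tok;
	unsigned char sig[ECRYPTFS_SIG_SIZE_HEX + 1];
};

/* sig is one byte longer than the hex signature length. */
static_assert(sizeof(((struct ecryptfs_global_auth_tok *)0)->sig) ==
	      ECRYPTFS_SIG_SIZE_HEX + 1, "sig sized for hex string plus one");
```

The per-member text moves into the @member lines of the header comment, so
the structure body itself stays free of trailing comments.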


 struct ecryptfs_key_tfm {
struct crypto_blkcipher *key_tfm;
size_t key_size;
struct mutex key_tfm_mutex;
-   struct list_head key_tfm_list;
+   struct list_head key_tfm_list; /* The module's tfm list */
unsigned char cipher_name[ECRYPTFS_MAX_CIPHER_NAME_SIZE + 1];
 };



Satyam


Re: [GIT PATCH] ACPI patches for 2.6.23-rc1

2007-07-25 Thread Len Brown
On Wednesday 25 July 2007 14:48, Linus Torvalds wrote:

> ... ACPI now seems to select CPU hotplug. Why? 

ACPI=y SMP=y systems require SUSPEND_SMP=y for system sleep support,
and that requires HOTPLUG_CPU=y.

Note that ACPI=y SMP=n systems do not need it,
and thus will not select HOTPLUG_CPU=y

> That is just *broken*. Sure, if you select STR or hibernation, we need CPU 
> hotplug, but just for picking ACPI? Why?

My assumption is that if somebody selects CONFIG_ACPI,
that 99% of the time, they intend that to include support for
the ACPI hooks for system sleep states.

Conversely, supporting the 1% of people who don't want it
isn't worth messing with the 99% who do, nor is
the burden of yet another config option to maintain and
#ifdefs in the code.

On UP, they'd get ACPI system sleep support 100% of the time
by default, but on SMP this option had become problematic.

We used to have this:

if ACPI
...
config ACPI_SLEEP
bool "Sleep States"
depends on X86 && (!SMP || SUSPEND_SMP)
depends on PM
default y

So the poster-child failure was i386/defconfig itself...
It couldn't support suspend to RAM because it didn't include
CONFIG_ACPI_SLEEP.  Not trivial for a user to select it
when it doesn't even appear on the menu.  It doesn't appear
because CONFIG_SUSPEND_SMP isn't enabled, but that doesn't
appear either -- because CONFIG_HOTPLUG_CPU isn't selected.

Most users don't want that.

So today we have this:

menuconfig ACPI
...
select HOTPLUG_CPU if X86 && SMP
select SUSPEND_SMP if X86 && SMP

Which I think leads to fewer surprises, and less complicated code.
(even though using select itself is fraught with peril:-)

thanks,
-Len



Re: MMC/SD Root filesystem suspend/resume problems

2007-07-25 Thread Pavel Machek
On Wed 2007-07-25 20:20:42, Richard Purdie wrote:
> On Wed, 2007-07-25 at 19:01 +, Pavel Machek wrote:
> > > I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing was
> > > "fixed". I think having this option is a bad idea (in its current form)
> > > as it doesn't actually stop filesystem corruption.
> > > 
> > > With the option disabled, if a filesystem is mounted when you suspend my
> > > tests show the filesystem is corrupted. At least if the option is
> > > enabled, the filesystem is only corrupted if you remove the card whilst
> > > suspended which is more preferable.
> > 
> > Are we talking _corruption_ here, or are we talking 'the kind of
> > corruption recoverable by fsck that happens on powerfail'?
> 
> There was more damage to the system than just a dirty bit set. Yes, fsck
> could fix it but I don't think it should happen in the first place...

Well, that's "ok", that happens on sudden powerdowns, too.

(Well, but we do sync() during suspend, so it is a bit strange). Do
you have fsck logs perhaps?
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.23-rc1-mm1: chipsfb_pci_suspend problem

2007-07-25 Thread Pavel Machek
Hi!

> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm1/
> > > 
> > > from pm-move-definition-of-struct-pm_ops-to-suspendh.patch :
> > > 
> > > drivers/video/chipsfb.c: In function 'chipsfb_pci_suspend':
> > > drivers/video/chipsfb.c:461: error: 'PM_SUSPEND_MEM' undeclared (first use in this function)
> > > drivers/video/chipsfb.c:461: error: (Each undeclared identifier is reported only once
> > > drivers/video/chipsfb.c:461: error: for each function it appears in.)
> > 
> > Well, actually, this is a bug in chipsfb.c, as it shouldn't use
> > PM_SUSPEND_MEM in there, but PMSG_SUSPEND (patch untested).
> > 
> > ---
> >  drivers/video/chipsfb.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Index: linux-2.6.23-rc1/drivers/video/chipsfb.c
> > ===
> > --- linux-2.6.23-rc1.orig/drivers/video/chipsfb.c
> > +++ linux-2.6.23-rc1/drivers/video/chipsfb.c
> > @@ -458,7 +458,7 @@ static int chipsfb_pci_suspend(struct pc
> >  
> > if (state.event == pdev->dev.power.power_state.event)
> > return 0;
> > -   if (state.event != PM_SUSPEND_MEM)
> > +   if (state != PMSG_SUSPEND)
> > goto done;
> >  
> > acquire_console_sem();
> 
> For reasons which aren't immediately obvious, the compiler didn't like
> that: comparing with an immediate struct like that is a bit tricky.

Yes, that was deliberate "type safety".

> This is equivalent, and works:
> 
> 
> --- a/drivers/video/chipsfb.c~chipsfb-use-correct-pm-state
> +++ a/drivers/video/chipsfb.c
> @@ -24,6 +24,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -458,7 +459,7 @@ static int chipsfb_pci_suspend(struct pc
>  
>   if (state.event == pdev->dev.power.power_state.event)
>   return 0;
> - if (state.event != PM_SUSPEND_MEM)
> + if (state.event != PM_EVENT_SUSPEND)

And this is indeed correct. ACK.
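
A stripped-down illustration of the "type safety" point, for the archives.
This is a mock rather than the real pm.h: the typedef mirrors the idea of
wrapping the event code in a struct, and the numeric values are
illustrative. C has no == or != for struct operands, so the event member
has to be compared instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock of a struct-wrapped event code; the numeric values are illustrative. */
typedef struct pm_message {
	int event;
} pm_message_t;

#define PM_EVENT_ON      0
#define PM_EVENT_SUSPEND 2
#define PMSG_SUSPEND ((pm_message_t){ .event = PM_EVENT_SUSPEND })

static_assert(PM_EVENT_SUSPEND != PM_EVENT_ON, "event codes must be distinct");

/*
 * `state != PMSG_SUSPEND` does not compile: C defines no == or != for
 * struct operands. Comparing the integer member is what works.
 */
bool is_suspend(pm_message_t state)
{
	return state.event == PM_EVENT_SUSPEND;
}
```

Wrapping the int in a struct means a stray integer cannot be passed where a
pm_message_t is expected, at the cost of having to spell out .event in
comparisons like the one in the patch.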

> Is this a 2.6.23 thing?

Should not hurt anything, I'd say so.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-25 Thread Bjorn Helgaas
On Wednesday 25 July 2007 07:32:53 am Sébastien Dugué wrote:
> On Wed, 25 Jul 2007 07:16:44 -0600 Bjorn Helgaas <[EMAIL PROTECTED]> wrote:
> 
> > The _DDN is a "DOS device name", and the _UID is a "logical device ID
> > that does not change across reboots."  Both are optional, and PNPACPI
> > ignores them.  But maybe we could change PNPACPI to sort by them if
> > they are present.  I'll think about this a bit.
> 
>   That would be nice, but I wish you good luck with all those
> crappy BIOSes out there.

Yeah, it's an ugly world we live in.  Would you be able to try the
attached patch just for testing?  It should sort devices with the
same _HID by their _UID.  It doesn't have any effect on my systems,
because my devices are already ordered by _UID by default.  But I
think it should switch your COM1/COM2 ports back to the order you
expect.

Yinghai, you mentioned the same issue on boxes with multiple root
bridges.  Any chance you could try this out there as well?

Index: w/drivers/pnp/pnpacpi/core.c
===
--- w.orig/drivers/pnp/pnpacpi/core.c   2007-07-25 13:56:02.0 -0600
+++ w/drivers/pnp/pnpacpi/core.c2007-07-25 16:25:27.0 -0600
@@ -224,18 +224,91 @@
return -EINVAL;
 }
 
-static acpi_status __init pnpacpi_add_device_handler(acpi_handle handle,
+struct pnpacpi_device {
+   struct acpi_device  *device;
+   unsigned long   unique_id;
+   struct list_head    list;
+};
+
+static LIST_HEAD(pnpacpi_dev_list);
+
+static inline int hardware_id_match(struct acpi_device *dev1,
+   struct acpi_device *dev2)
+{
+   return !strncmp(acpi_device_hid(dev1), acpi_device_hid(dev2),
+   sizeof(acpi_device_hid(dev1)));
+}
+
+static inline int precedes(struct pnpacpi_device *dev1,
+  struct pnpacpi_device *dev2)
+{
+   return hardware_id_match(dev1->device, dev2->device) &&
+   dev1->unique_id < dev2->unique_id;
+}
+
+static acpi_status __init pnpacpi_find_device_handler(acpi_handle handle,
u32 lvl, void *context, void **rv)
 {
struct acpi_device *device;
+   struct pnpacpi_device *dev, *cur, *prev = NULL;
+   struct list_head *node;
+   acpi_status status;
+   unsigned long id;
 
-   if (!acpi_bus_get_device(handle, &device))
-   pnpacpi_add_device(device);
-   else
+   if (acpi_bus_get_device(handle, &device))
return AE_CTRL_DEPTH;
+
+   dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+   if (!dev) {
+   pnp_err("Out of memory");
+   return AE_OK;
+   }
+
+   status = acpi_evaluate_integer(handle, METHOD_NAME__UID, NULL, &id);
+   if (ACPI_FAILURE(status))
+   id = 0;
+
+   INIT_LIST_HEAD(&dev->list);
+   dev->device = device;
+   dev->unique_id = id;
+
+   /*
+* If several devices have the same _HID, sort them by _UID so
+* device names are ordered by _UID.  Note that _UID can be
+* either an integer or a string; we only order the integer ones.
+*/
+   if (list_empty(&pnpacpi_dev_list)) {
+   list_add(&dev->list, &pnpacpi_dev_list);
+   return AE_OK;
+   }
+
+   list_for_each(node, &pnpacpi_dev_list) {
+   cur = list_entry(node, struct pnpacpi_device, list);
+   if (precedes(dev, cur) ||
+   (prev && hardware_id_match(prev->device, cur->device))) {
+   list_add_tail(&dev->list, node);
+   return AE_OK;
+   }
+   prev = cur;
+   }
+
+   list_add_tail(&dev->list, &pnpacpi_dev_list);
return AE_OK;
 }
 
+static void __init pnpacpi_add_devices(void)
+{
+   struct list_head *node, *next;
+   struct pnpacpi_device *dev;
+
+   list_for_each_safe(node, next, &pnpacpi_dev_list) {
+   dev = list_entry(node, struct pnpacpi_device, list);
+   pnpacpi_add_device(dev->device);
+   list_del(node);
+   kfree(dev);
+   }
+}
+
 static int __init acpi_pnp_match(struct device *dev, void *_pnp)
 {
struct acpi_device  *acpi = to_acpi_device(dev);
@@ -282,7 +355,8 @@
pnp_info("PnP ACPI init");
pnp_register_protocol(&pnpacpi_protocol);
register_acpi_bus_type(&acpi_pnp_bus);
-   acpi_get_devices(NULL, pnpacpi_add_device_handler, NULL, NULL);
+   acpi_get_devices(NULL, pnpacpi_find_device_handler, NULL, NULL);
+   pnpacpi_add_devices();
pnp_info("PnP ACPI: found %d devices", num);
unregister_acpi_bus_type(&acpi_pnp_bus);
pnp_platform_devices = 1;
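
For readers following along, the ordering rule described above (devices sharing a _HID are kept in ascending _UID order, everything else stays in discovery order) can be sketched in user space. This is only an illustration under assumptions: the fixed-size array, `dev_entry`, and `insert_sorted` are hypothetical names, not kernel code, and the sketch applies just the `precedes()` test rather than the full insertion condition used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVS 16

/* Hypothetical stand-in for one discovered device: its _HID and integer _UID. */
struct dev_entry {
    char hid[9];
    unsigned long uid;
};

static struct dev_entry devs[MAX_DEVS];
static int ndevs;

/* Same idea as precedes() in the patch: same _HID and a smaller _UID. */
static int precedes(const struct dev_entry *a, const struct dev_entry *b)
{
    return strcmp(a->hid, b->hid) == 0 && a->uid < b->uid;
}

/*
 * Insert a device in front of the first entry it precedes; if there is
 * no such entry, append it.  Returns the index used, or -1 when full.
 */
static int insert_sorted(const char *hid, unsigned long uid)
{
    struct dev_entry e;
    int pos;

    assert(hid != NULL);
    if (ndevs >= MAX_DEVS)
        return -1;

    memset(&e, 0, sizeof(e));
    strncpy(e.hid, hid, sizeof(e.hid) - 1);
    e.uid = uid;

    for (pos = 0; pos < ndevs; pos++)
        if (precedes(&e, &devs[pos]))
            break;

    /* Shift the tail up by one slot to make room at pos. */
    memmove(&devs[pos + 1], &devs[pos], (ndevs - pos) * sizeof(devs[0]));
    devs[pos] = e;
    ndevs++;
    return pos;
}
```

Inserting two serial ports discovered in the order _UID 2, then _UID 1, with an unrelated device in between, leaves the ports ordered by _UID and the unrelated device after them.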


Re: 2.6.23-rc1-mm1 -- mostly fails to build

2007-07-25 Thread Andy Whitcroft
On Wed, Jul 25, 2007 at 05:36:56PM +0100, Andy Whitcroft wrote:

> Will investigate the NUMA-Q explosion and report on that separatly.

Ok, I've been looking at the NUMA-Q boot panic below:

BUG: unable to handle kernel NULL pointer dereference at virtual address 

 printing eip:
c111689f
*pdpt = 01387001
*pde = 
Oops:  [#1]
SMP
Modules linked in:
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010286   (2.6.23-rc1-mm1-gc8131905-dirty #251)
EIP is at pci_create_bus+0x11b/0x277
eax:    ebx: c9352e00   ecx: c9073e94   edx: c9325400
esi: c9325400   edi: c932559c   ebp: 0002   esp: c9073e90
ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
Process swapper (pid: 1, ti=c9072000 task=c9070030 task.ti=c9072000)
Stack: c12adc4f c9325400  0002    c934c800
   00d6  c1116a09  c9073ed5 c11b178a  c934c940
    02f4 c12bd8ac c934c800 c12bd8b4 c9325000 c1119825 c934c800
Call Trace:
 [] pci_scan_bus_parented+0xe/0x21
 [] pci_fixup_i450nx+0xa7/0x101
 [] pci_do_fixups+0x2d/0x38
 [] pci_device_add+0x48/0x77
 [] pci_scan_single_device+0x1a/0x1f
 [] pci_scan_slot+0x15/0x47
 [] pci_scan_child_bus+0x19/0x7c
 [] pci_scan_bus_parented+0x19/0x21
 [] pcibios_scan_root+0x75/0x7e
 [] pci_numa_init+0x2c/0xe4
 [] kernel_init+0x0/0xa1
 [] do_initcalls+0x73/0x1a3
 [] proc_register+0xa0/0xa7
 [] create_proc_entry+0x73/0x86
 [] register_irq_proc+0x75/0x92
 [] kernel_init+0x0/0xa1
 [] kernel_init+0x5f/0xa1
 [] kernel_thread_helper+0x7/0x10
 ===
Code: ff 8b 83 84 00 00 00 c7 04 24 4f dc 2a c1 89 44 24 04 e8 f8 42 f0 ff 83 
7c 24 14 00 75 15 8b 93 84 00 00 00 85 d2 74 0b 8b 43 44 <8b> 00 89 82 50 01 00 
00 c7 44 24 04 9a 04 00 00 8d bb 88 00 00
EIP: [] pci_create_bus+0x11b/0x277 SS:ESP 0068:c9073e90
Kernel panic - not syncing: Attempted to kill init!

This seems to have been caused by the introduction of the following
code fragment into pci_create_bus:

if (!parent)
set_dev_node(b->bridge, pcibus_to_node(b));

This has come as part of the -mm patch below:

try-parent-numa_node-at-first-before-using-default-v2.patch

This patch does not seem to be wrong in and of itself.  It does
expose the fact that we are building buses with NULL sysdata.
This has come up at least three times now.  Below is the patch
proposed the last couple of times.  It is needed on any machine
with the i450nx quirk, as well as on NUMA-Q systems.
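
The failure mode can be sketched outside the kernel.  This is a simplified illustration under assumptions: `bus_sketch`, `sysdata_sketch`, `bus_to_node_sketch` and `bus_init_sketch` are hypothetical names, and the real patch passes a default sysdata at the call sites that used to pass NULL rather than substituting it centrally as done here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the sysdata structure and the bus. */
struct sysdata_sketch {
    int node;               /* NUMA node, -1 when unknown */
};

struct bus_sketch {
    void *sysdata;
};

/* Default used when the caller has no locality information. */
static struct sysdata_sketch default_sysdata_sketch = { .node = -1 };

/*
 * A node lookup of this shape trusts sysdata to be a valid pointer;
 * with a NULL sysdata it would dereference NULL, as in the oops above.
 */
static int bus_to_node_sketch(const struct bus_sketch *bus)
{
    assert(bus != NULL && bus->sysdata != NULL);
    return ((const struct sysdata_sketch *)bus->sysdata)->node;
}

/* Never leave sysdata NULL: fall back to the default entry. */
static void bus_init_sketch(struct bus_sketch *bus, void *sysdata)
{
    assert(bus != NULL);
    bus->sysdata = sysdata ? sysdata : &default_sysdata_sketch;
}
```

With this shape, a bus created without sysdata reports node -1 instead of faulting, and a bus created with real sysdata reports its own node.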

Andrew, could you please add this to -mm again?

-apw

===
pci device ensure sysdata initialised v3

We have been seeing panics on NUMA systems in pci_call_probe() in
2.6.19-rc1-mm1 and later.  This is related to the changes introduced
in the commit below:

[x86, PCI] Switch pci_bus::sysdata from NUMA node integer to a pointer
0a247a58fc3e2ecfc17654301033e8b8d08df2a2

In this change the sysdata has changed from directly representing
a value (the node number in NUMA) to a pointer to a structure.
However, it seems that we do not always initialise this sysdata
before we probe the device.

Prior to the changes above, the node defaulted to 'NULL',
allocating the devices to node 0 unconditionally.  This patch adds
a default sysdata entry (pci_default_sysdata), which is then used
where 'NULL' was used previously.  pci_default_sysdata defaults
the node to unknown (-1).  This is a more accurate assignment,
mirroring the value returned where no topology support is provided
and no locality information is available.

There are only two uses of this value in the affected architectures
(x86, x86_64) and generic code:

1) in x86_64, dma_alloc_pages() looks up the node in order to
   allocate node local memory.  Here if the node is invalid we
   will default to the first online node.  Behaviour here should
   be unchanged.
2) in generic, pci_call_probe() looks up the node in order to
   restrict execution of the probe to the card local node, to
   favor node local allocation.  Where the node is unknown we
   previously forced execution (and thereby allocation) to node 0;
   this is arguably wrong, and using -1 releases this restriction.

In an ideal world we should be supplying a sysdata for the
appropriate node where it is known.  Where it is not known defaulting
to -1 seems a better course, and would help us where node 0 is
short of memory.
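
As a rough sketch of the two uses listed above, the handling of an unknown node might look like the following.  The function names and parameters are hypothetical and only model the behaviour described in this message, not the actual dma_alloc_pages() or pci_call_probe() code.

```c
#include <assert.h>

/*
 * Allocation side: an invalid or unknown node (-1) falls back to the
 * first online node, so behaviour is unchanged for that caller.
 */
static int alloc_node_sketch(int dev_node, int first_online_node, int nr_nodes)
{
    assert(nr_nodes > 0);
    if (dev_node < 0 || dev_node >= nr_nodes)
        return first_online_node;
    return dev_node;
}

/*
 * Probe side: only restrict execution to a node when one is known.
 * Returns 1 if the probe should be pinned, 0 if it may run anywhere.
 */
static int probe_pinned_sketch(int dev_node)
{
    return dev_node >= 0;
}
```

With node 0 as the default, the probe side would always be pinned to node 0; with -1 it is left unrestricted.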

Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]>
---
---
diff --git a/arch/i386/pci/common.c b/arch/i386/pci/common.c
index 85503de..362b7b4 100644
--- a/arch/i386/pci/common.c
+++ b/arch/i386/pci/common.c
@@ -27,6 +27,8 @@ unsigned long pirq_table_addr;
 struct pci_bus *pci_root_bus;
 struct pci_raw_ops *raw_pci_ops;
 
+struct pci_sysdata pci_default_sysdata = { .node = -1 };
+
 static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
return raw_pci_ops->read(0, bus->number, devfn, where, size, value);
diff --git a/arch/i386/pci/fixup.c b/arch/i386/pci/fixup.c
index 

Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-25 Thread Michael Chang

On 7/25/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> Question:
>   Could those who have found this prefetch helps them alot say how
>   many disks they have?  In particular, is their swap on the same
>   disk spindle as their root and user files?

I have found that swap prefetch helped on all four of the machines
I have, although the effect is more noticeable on machines
with slower disks. They all have one hard disk, and root and swap were
always on the same disk. I have no idea how to determine how many disk
spindles they have, but since the drives are mainly low-end consumer
models sold with low-end sub $500 PCs...

--
Michael Chang

Please avoid sending me Word or PowerPoint attachments. Send me ODT,
RTF, or HTML instead.
See http://www.gnu.org/philosophy/no-word-attachments.html
Thank you.


Re: -mm merge plans for 2.6.23

2007-07-25 Thread Jesper Juhl

On 26/07/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> > and the fact is: updatedb discards a considerable portion of the cache
> > completely unnecessarily: on a reasonably complex box no way do all the
>
> I'm wondering how much of this updatedb problem is due to poor layout
> of swap and other file systems across disk spindles.
>
> I'll wager that those most impacted by updatedb have just one disk.
>
[snip]
>
> Question:
>   Could those who have found this prefetch helps them alot say how
>   many disks they have?  In particular, is their swap on the same
>   disk spindle as their root and user files?

Swap prefetch helps me.

In my case I have a single (10K RPM, Ultra 160 SCSI) disk.

# fdisk -l /dev/sda

Disk /dev/sda: 36.7 GB, 36703918080 bytes
255 heads, 63 sectors/track, 4462 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         974     7823623+  83  Linux
/dev/sda2             975        1218     1959930   83  Linux
/dev/sda3            1219        1341      987997+  82  Linux swap
/dev/sda4            1342        4462    25069432+  83  Linux

sda1 is "/", sda2 is "/usr/local/" and sda4 is "/home/"


But, I don't think updatedb is the problem, at least not just updatedb
on its own.
My machine has 2GB of RAM, so a single updatedb on its own will not
cause it to start swapping, but it does eat up a chunk of mem no doubt
about that.
The problem with updatedb is simply that it can be a contributing
factor to stuff being swapped out, but any memory hungry application
can do that - just try building an allyesconfig kernel and see how
much the linker eats towards the end.

What swap prefetch helps is not updatedb specifically.  In my
experience it helps any case where you have applications running, then
start some memory hungry job that runs for a limited time, pushes the
previously started apps out to swap and then dies (like updatedb or a
compile job).

Without swap prefetch those apps that were pushed to swap won't be
brought back in before they are used (at which time the user is going
to have to sit there and wait for them).
With swap prefetch, the apps that got swapped out will slowly make
their way back once the mem hungry app has died and will then be fully
or partly back in memory when the user comes back to them.

That's how swap prefetch helps, it's got nothing to do with updatedb
as such - at least not as I see it.
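
To make the behaviour described above concrete, here is a toy model of one idle-time prefetch step.  It is not the swap prefetch patch: `mem_sketch`, `prefetch_step_sketch`, the batch size and the free-page reserve are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy view of memory after the memory hungry job has exited. */
struct mem_sketch {
    unsigned long free_pages;    /* pages currently unused in RAM */
    unsigned long swapped_pages; /* pages still sitting in swap */
};

/*
 * One idle-time step: bring back at most `batch` pages, never more than
 * are swapped out, and never dipping below `reserve` free pages.
 * Returns the number of pages moved from swap back into RAM.
 */
static unsigned long prefetch_step_sketch(struct mem_sketch *m,
                                          unsigned long batch,
                                          unsigned long reserve)
{
    unsigned long room, n;

    assert(m != NULL);
    if (m->free_pages <= reserve)
        return 0;

    room = m->free_pages - reserve;
    n = batch;
    if (n > room)
        n = room;
    if (n > m->swapped_pages)
        n = m->swapped_pages;

    m->free_pages -= n;
    m->swapped_pages -= n;
    return n;
}
```

Calling the step repeatedly while the machine is idle drains the swapped-out pages back into RAM a batch at a time, which is the "slowly make their way back" effect; without it the pages stay in swap until something touches them.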

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-25 Thread Chris Snook

Li, Tong N wrote:
> On Wed, 2007-07-25 at 16:55 -0400, Chris Snook wrote:
>> Chris Friesen wrote:
>>> Ingo Molnar wrote:
>>>> the 3s is the problem: change that to 60s! We no way want to
>>>> over-migrate for SMP fairness, the change i did gives us reasonable
>>>> long-term SMP fairness without the need for high-rate rebalancing.
>>>
>>> Actually, I do have requirements from our engineering guys for
>>> short-term fairness.  They'd actually like decent fairness over even
>>> shorter intervals...1 second would be nice, 2 is acceptable.
>>>
>>> They are willing to trade off random peak performance for predictability.
>>>
>>> Chris
>>
>> The sysctls for CFS have nanosecond resolution.  They default to
>> millisecond-order values, but you can set them much lower.  See
>> sched_fair.c for the knobs and their explanations.
>>
>> -- Chris
>
> This is incorrect. Those knobs control local-CPU fairness granularity
> but have no control over fairness across CPUs.
>
> I'll do some benchmarking as Ingo suggested.
>
>   tong


CFS naturally enforces cross-CPU fairness anyway, as Ingo demonstrated. 
Lowering the local CPU parameters should cause system-wide fairness to converge 
faster.  It might be worthwhile to create a more explicit knob for this, but I'm 
inclined to believe we could do it in much less than 700 lines.  Ingo's 
one-liner to improve the 10/8 balancing case, and the resulting improvement, 
were exactly what I was saying should be possible and desirable.  TCP Nagle 
aside, it generally shouldn't take 700 lines of code to speed up the rate of 
convergence of something that already converges.


Until now I've been watching the scheduler rewrite from the sidelines, but I'm 
digging into it now.  I'll try to give some more constructive criticism soon.


-- Chris


Re: [PATCH 1/7] lguest: documentation pt I: Preparation

2007-07-25 Thread Rob Landley
On Monday 23 July 2007 9:01:48 pm Rusty Russell wrote:
> > IOW, I'd be interested in hearing Rob and Randy's opinions on it all,
> > please.
>
> So they can see what we're talking about, here's an example of the
> output:
>
>   http://lguest.ozlabs.org/lguest-journey.c.bz2

Er, so you read the readme, and then you type "make Preparation!" (which I 
wouldn't have guessed from the comment at the end of the readme), and it 
spits this to stdout.  Ok, I can add "make 'Preparation!' > lguest.txt" to my 
update script so I can mirror a copy of the output on the web.

Are there any other build targets that produce documentation which I should 
know about?

> Cheers,
> Rusty.

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.


Re: -mm merge plans for 2.6.23

2007-07-25 Thread Zan Lynx
On Wed, 2007-07-25 at 15:05 -0700, Paul Jackson wrote:
[snip]
> Question:
>   Could those who have found this prefetch helps them alot say how
>   many disks they have?  In particular, is their swap on the same
>   disk spindle as their root and user files?
> 
> Answer - for me:
>   On my system where updatedb is a big problem, I have one, slow, disk.
>   On my system where updatedb is a small problem, swap is on a separate
> spindle.
>   On my system where updatedb is -no- problem, I have so much memory
> I never use swap.
> 
> I'd expect the laptop crowd to mostly have a single, slow, disk, and
> hence to find updatedb more painful.

A well done swap-to-flash would help here.  I sometimes do it anyway to
a 4GB CF card but I can tell it's hitting the read/update/write cycles
on the flash blocks.  The sad thing is that it is still a speed
improvement over swapping to laptop disk.
-- 
Zan Lynx <[EMAIL PROTECTED]>



