Re: x86-64 bad pmds in 2.6.11.6

2005-08-08 Thread Andy Davidson
On Wed, 6 Apr, 2005 22:49:03 -0400, Dave Jones wrote: On Thu, Mar 31, 2005 at 12:41:17PM +0200, Andi Kleen wrote: > On Wed, Mar 30, 2005 at 04:44:55PM -0500, Dave Jones wrote: > > I arrived at the office today to find my workstation had this spew > > in its dmesg buffer.. > Looks like rando

Debugging patch was Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-22 Thread Andi Kleen
Can people who can reproduce the x86-64 2.6.11 pmd bad problem please apply the following patch and see (a) if it can be still reproduced with it and send the output generated. Also a strace of the program that showed it (pid and name of it should be dumped) would be useful if not too big. Afte

Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-19 Thread Hugh Dickins
On Tue, 19 Apr 2005, Andi Kleen wrote: > On Fri, Apr 15, 2005 at 06:58:20PM +0100, Hugh Dickins wrote: > > > > I must confess, with all due respect to Andi, that I don't understand his > > dismissal of the possibility that load_cr3 in leave_mm might be the fix > > (to create_elf_tables writing use

Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-19 Thread Andi Kleen
On Fri, Apr 15, 2005 at 06:58:20PM +0100, Hugh Dickins wrote: > On Fri, 15 Apr 2005, Chris Wright wrote: > > * Andi Kleen ([EMAIL PROTECTED]) wrote: > > > On Thu, Apr 14, 2005 at 11:27:12AM -0700, Chris Wright wrote: > > > > Yes, I've seen it in .11 and earlier kernels. Happen to have same > > > >

Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-15 Thread Dave Jones
On Fri, Apr 15, 2005 at 06:58:20PM +0100, Hugh Dickins wrote: > > > If there was a fix for the bad pmd problem it might be a candidate > > > for stable, but so far we dont know what causes it yet. > > If I figure a way to trigger here, I'll report back. > > Dave, earlier on you were quite ab

Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-15 Thread Hugh Dickins
On Fri, 15 Apr 2005, Chris Wright wrote: > * Andi Kleen ([EMAIL PROTECTED]) wrote: > > On Thu, Apr 14, 2005 at 11:27:12AM -0700, Chris Wright wrote: > > > Yes, I've seen it in .11 and earlier kernels. Happen to have same > > > "x86_64" string on my bad pmd dumps, but can't reproduce it at all. > >

Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-15 Thread Chris Wright
* Andi Kleen ([EMAIL PROTECTED]) wrote: > On Thu, Apr 14, 2005 at 11:27:12AM -0700, Chris Wright wrote: > > Yes, I've seen it in .11 and earlier kernels. Happen to have same > > "x86_64" string on my bad pmd dumps, but can't reproduce it at all. > > So, for now, I can hold off on adding the reload

Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-15 Thread Andi Kleen
On Thu, Apr 14, 2005 at 11:27:12AM -0700, Chris Wright wrote: > * Andi Kleen ([EMAIL PROTECTED]) wrote: > > > I will take a closer look at the rc1/rc2 patches later this evening > > > and see if I can spot something. Can only report back tomorrow though. > > > > Actually it started in .11 already

Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-14 Thread Chris Wright
* Andi Kleen ([EMAIL PROTECTED]) wrote: > > I will take a closer look at the rc1/rc2 patches later this evening > > and see if I can spot something. Can only report back tomorrow though. > > Actually it started in .11 already - sigh - on rereading the thread. > That will make the code audit harde

Re: x86-64 bad pmds in 2.6.11.6 II

2005-04-14 Thread Andi Kleen
> I will take a closer look at the rc1/rc2 patches later this evening > and see if I can spot something. Can only report back tomorrow though. Actually it started in .11 already - sigh - on rereading the thread. That will make the code audit harder :/ -Andi

Re: x86-64 bad pmds in 2.6.11.6

2005-04-14 Thread Andi Kleen
On Thu, Apr 14, 2005 at 06:34:58PM +0100, Hugh Dickins wrote: > On Thu, 14 Apr 2005, Andi Kleen wrote: > > > > Thanks for the analysis. However I doubt the load_cr3 patch can fix > > it. All it does is to stop the CPU from prefetching mappings (which > > can cause different problem). > > I though

Re: x86-64 bad pmds in 2.6.11.6

2005-04-14 Thread Hugh Dickins
On Thu, 14 Apr 2005, Andi Kleen wrote: > > Thanks for the analysis. However I doubt the load_cr3 patch can fix > it. All it does is to stop the CPU from prefetching mappings (which > can cause different problem). I thought that the leave_mm code (before your patch) flushes the TLB, but restores c

Re: x86-64 bad pmds in 2.6.11.6

2005-04-14 Thread Andi Kleen
> It looks very much as if the mm being created has for pmd a page > which was used for user stack in the outgoing mm; but somehow exec's > exit_mmap TLB flushing hasn't taken effect. I only now noticed this > patch where you fix just such an issue. Thanks for the analysis. However I doubt the lo

Re: x86-64 bad pmds in 2.6.11.6

2005-04-14 Thread Hugh Dickins
On Thu, 7 Apr 2005, Andi Kleen wrote: > Dave Jones wrote: > > I realised today that this happens every time X starts up for > > the first time. I did some experiments, and found that with 2.6.12rc1 > > it's gone. Either it got fixed accidentally, or its hidden now > > by one of the many changes i

re: x86-64 bad pmds in 2.6.11.6

2005-04-08 Thread Clem Taylor
Dave Jones reported seeing bad pmd messages in 2.6.11.6. I've been seeing them with 2.6.11 and today with 2.6.11.6. When I first saw the problem I ran memtest86 and it didn't catch anything after ~3hours. However, I don't see them when X starts. They tend to happen after a program segfaults: 2.6.1

Re: x86-64 bad pmds in 2.6.11.6

2005-04-06 Thread Andi Kleen
> I realised today that this happens every time X starts up for > the first time. I did some experiments, and found that with 2.6.12rc1 > it's gone. Either it got fixed accidentally, or its hidden now > by one of the many changes in 4-level patches. > > I'll try and narrow this down a little mor

Re: x86-64 bad pmds in 2.6.11.6

2005-04-06 Thread Dave Jones
On Thu, Mar 31, 2005 at 12:41:17PM +0200, Andi Kleen wrote: > On Wed, Mar 30, 2005 at 04:44:55PM -0500, Dave Jones wrote: > > [apologies to Andi for getting this twice, I goofed the l-k address > > the first time] > > > > > > I arrived at the office today to find my workstation had this

Re: x86-64 bad pmds in 2.6.11.6

2005-04-01 Thread Sergey S. Kostyliov
On Friday 01 April 2005 01:52, Dave Jones wrote: > On Thu, Mar 31, 2005 at 12:41:17PM +0200, Andi Kleen wrote: > > On Wed, Mar 30, 2005 at 04:44:55PM -0500, Dave Jones wrote: > > > [apologies to Andi for getting this twice, I goofed the l-k address > > > the first time] > > > > > > > > >

Re: x86-64 bad pmds in 2.6.11.6

2005-03-31 Thread Dave Jones
On Thu, Mar 31, 2005 at 12:41:17PM +0200, Andi Kleen wrote: > On Wed, Mar 30, 2005 at 04:44:55PM -0500, Dave Jones wrote: > > [apologies to Andi for getting this twice, I goofed the l-k address > > the first time] > > > > > > I arrived at the office today to find my workstation had this

Re: x86-64 bad pmds in 2.6.11.6

2005-03-31 Thread Andi Kleen
On Wed, Mar 30, 2005 at 04:44:55PM -0500, Dave Jones wrote: > [apologies to Andi for getting this twice, I goofed the l-k address > the first time] > > > I arrived at the office today to find my workstation had this spew > in its dmesg buffer.. Looks like random memory corruption to me. Can

x86-64 bad pmds in 2.6.11.6

2005-03-30 Thread Dave Jones
[apologies to Andi for getting this twice, I goofed the l-k address the first time] I arrived at the office today to find my workstation had this spew in its dmesg buffer.. mm/memory.c:97: bad pmd 81004b017438(0038a5500a88). mm/memory.c:97: bad pmd 81004b017440(000