Re: [Cooker] XFS+HIGHMEM lockup, with fix.

2002-11-22 Thread Bryan Whitehead
Todd Lyons wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Bryan Whitehead wrote on Mon, Nov 18, 2002 at 04:52:57PM -0800 :


ahh, great work, just came home after being away for the weekend, and 
that one made me very happy=)

I'd hope Mandrake would release a new kernel. If we are hitting this 
bug, I'm sure others are


Bryan, is there any indication that this bug exists in only the XFS
driver (ie patches we got were bad...grabbed at just the wrong moment in
time) or is it more kernel wide?  It seems to be isolated to the XFS
driver only in my limited view.


This is only in the XFS driver. The patch I sent will only affect the 
xfs module.

Blue skies...			Todd
- -- 
   MandrakeSoft USA   http://www.mandrakesoft.com
Mandrake: An amalgam of good ideas from RedHat, Debian, and MandrakeSoft.
All in all, IMHO, an unbeatable combination.   --Levi Ramsey on Cooker ML
   Cooker Version mandrake-release-9.1-0.1mdk Kernel 2.4.19-19mdksecure
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE93TwQlp7v05cW2woRAuBpAJ9NHqeydQNuqoKqFVFmz20jqsRycwCgsDdZ
ka7WVDJe/HbgCdrv4SW5Sdg=
=Ik2f
-END PGP SIGNATURE-


--
Bryan Whitehead
SysAdmin - JPL - Interferometry Systems and Technology
Phone: 818 354 2903
[EMAIL PROTECTED]





Re: [Cooker] XFS+HIGHMEM lockup, with fix.

2002-11-21 Thread Todd Lyons
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Bryan Whitehead wrote on Mon, Nov 18, 2002 at 04:52:57PM -0800 :
 
 ahh, great work, just came home after being away for the weekend, and 
 that one made me very happy=)
 I'd hope Mandrake would release a new kernel. If we are hitting this 
 bug, I'm sure others are

Bryan, is there any indication that this bug exists in only the XFS
driver (ie patches we got were bad...grabbed at just the wrong moment in
time) or is it more kernel wide?  It seems to be isolated to the XFS
driver only in my limited view.

Blue skies...   Todd
- -- 
   MandrakeSoft USA   http://www.mandrakesoft.com
Mandrake: An amalgam of good ideas from RedHat, Debian, and MandrakeSoft.
All in all, IMHO, an unbeatable combination.   --Levi Ramsey on Cooker ML
   Cooker Version mandrake-release-9.1-0.1mdk Kernel 2.4.19-19mdksecure
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE93TwQlp7v05cW2woRAuBpAJ9NHqeydQNuqoKqFVFmz20jqsRycwCgsDdZ
ka7WVDJe/HbgCdrv4SW5Sdg=
=Ik2f
-END PGP SIGNATURE-




Re: [Cooker] XFS+HIGHMEM lockup, with fix.

2002-11-18 Thread Per Øyvind Karlsen
Bryan Whitehead wrote:


After working with SGI for the past week, with kdb backtraces and many 
rebuilds of the kernel, the problem with the xfs filesystem as 
Mandrake ships it was found. Finally! Now I can use all the memory we 
bought for our machines!! :)

I included a patch that needs to be dropped into the 
2.4.19-q16/patches directory in the kernel source as shipped with 
Mandrake 9.0.

Please apply and release a new kernel... :)



diff -ru linux-2.4.19/fs/xfs.orig/pagebuf/page_buf_io.c linux-2.4.19/fs/xfs/pagebuf/page_buf_io.c
--- linux-2.4.19/fs/xfs.orig/pagebuf/page_buf_io.c	2002-11-13 17:25:21.0 -0800
+++ linux-2.4.19/fs/xfs/pagebuf/page_buf_io.c	2002-11-13 17:31:40.0 -0800
@@ -309,7 +309,8 @@
		__pb_block_prepare_write_async(ip, page,
			cpoff, cpoff+csize, at_eof, NULL,
			pbmapp, PBF_WRITE);
-		memset((void *) (kmap(page) + cpoff), 0, csize);
+		/* __pb_block_prepare_write already kmap'd the page */
+		memset((void *) (page_address(page) + cpoff), 0, csize);
		pagebuf_commit_write_core(ip, page, cpoff, cpoff + csize);
		pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) +
			cpoff + csize;
Only in linux-2.4.19/fs/xfs/pagebuf: page_buf_io.c.orig
 

ahh, great work, just came home after being away for the weekend, and 
that one made me very happy=)

--
Mvh Per Øyvind Karlsen
Delonic Technology Group AS
Sysadmin, developer, greasemonkey
www.delonic.no - +47 41681061





Re: [Cooker] XFS+HIGHMEM lockup, with fix.

2002-11-18 Thread Bryan Whitehead
Per Øyvind Karlsen wrote:

Bryan Whitehead wrote:


After working with SGI for the past week, with kdb backtraces and many 
rebuilds of the kernel, the problem with the xfs filesystem as 
Mandrake ships it was found. Finally! Now I can use all the memory we 
bought for our machines!! :)

I included a patch that needs to be dropped into the 
2.4.19-q16/patches directory in the kernel source as shipped with 
Mandrake 9.0.

Please apply and release a new kernel... :)



diff -ru linux-2.4.19/fs/xfs.orig/pagebuf/page_buf_io.c linux-2.4.19/fs/xfs/pagebuf/page_buf_io.c
--- linux-2.4.19/fs/xfs.orig/pagebuf/page_buf_io.c	2002-11-13 17:25:21.0 -0800
+++ linux-2.4.19/fs/xfs/pagebuf/page_buf_io.c	2002-11-13 17:31:40.0 -0800
@@ -309,7 +309,8 @@
		__pb_block_prepare_write_async(ip, page,
			cpoff, cpoff+csize, at_eof, NULL,
			pbmapp, PBF_WRITE);
-		memset((void *) (kmap(page) + cpoff), 0, csize);
+		/* __pb_block_prepare_write already kmap'd the page */
+		memset((void *) (page_address(page) + cpoff), 0, csize);
		pagebuf_commit_write_core(ip, page, cpoff, cpoff + csize);
		pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) +
			cpoff + csize;
Only in linux-2.4.19/fs/xfs/pagebuf: page_buf_io.c.orig
 

ahh, great work, just came home after being away for the weekend, and 
that one made me very happy=)


I'd hope Mandrake would release a new kernel. If we are hitting this 
bug, I'm sure others are too.

--
Bryan Whitehead
SysAdmin - JPL - Interferometry Systems and Technology
Phone: 818 354 2903
[EMAIL PROTECTED]




Re: [Cooker] XFS+HIGHMEM lockup, with fix.

2002-11-15 Thread Todd Lyons
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Bryan Whitehead wrote on Fri, Nov 15, 2002 at 03:51:03PM -0800 :
 After working with SGI for the past week, with kdb backtraces and many 
 rebuilds of the kernel, the problem with the xfs filesystem as Mandrake 
 ships it was found. Finally! Now I can use all the memory we bought for 
 our machines!! :)
 I included a patch that needs to be dropped into the 2.4.19-q16/patches 
 directory in the kernel source as shipped with Mandrake 9.0.

Sweet!  I'm impressed that you found such an obscure bug.  Could you go
through some of the steps that you went through to do this debugging?  I
would suggest the following mentionables:
1) enabling kdb
2) interpreting the data
3) What made you zero in on that particular memset (ties back to #2)
4) Documentation that says __pb_block_prepare_write already kmap'ed
the page.  If the answer is "use the source, Luke" then so be it.

Congrats!

Blue skies...   Todd
- -- 
   MandrakeSoft USA   http://www.mandrakesoft.com
  cat /boot/vmlinuz > /dev/dsp  #for great justice
   Cooker Version mandrake-release-9.1-0.1mdk Kernel 2.4.19-19mdksecure
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE91Y6tlp7v05cW2woRAksAAJ9Ae8Jjj292JrCb5OoaySlEipd+RQCfft3S
G/OzSTk3d2/Xvu8NUerhl9s=
=CjD+
-END PGP SIGNATURE-




Re: [Cooker] XFS+HIGHMEM lockup, with fix.

2002-11-15 Thread Bryan Whitehead
Sweet!  I'm impressed that you found such an obscure bug.  Could you go
through some of the steps that you went through to do this debugging?  I
would suggest the following mentionables:
1) enabling kdb


Not that hard in the Mandrake kernel: just switch the kdb option in the 
.spec file from 0 to 1.

2) interpreting the data


I did a full backtrace of every process that was on the machine (after 
the machine locked up). Once you're in the kdb console you just type ps 
for the process list, and then bta to backtrace all processes.

The problem was that there were NO xfs backtraces; however, one process 
was screwed up and a backtrace could not be done. This meant that 
something in that process got totally hosed. It's also the process that 
hard locked the machine.

I pointed out that the load on the machine can be very low when the bug 
is triggered, and that it doesn't get triggered when there is no high 
memory available (even when running the highmem kernel).

In Linux, when HIGHMEM is turned on there is still regular (LOWMEM) 
memory. As long as xfs was only playing with the LOWMEM segment of 
memory, there were no lockups. But as soon as xfs used HIGHMEM (no 
matter the load on the machine) it locked hard.

3) What made you zero in on that particular memset (ties back to #2)

I ran SGI's pre3 of XFS 1.2 kernels and had no problems. After that I 
used their own brew of 1.1 with highmem, and didn't have a problem. 
After I told SGI, they sent me a one-line patch.

4) Documentation that says __pb_block_prepare_write already kmap'ed
the page.  If the answer is "use the source, Luke" then so be it.


SGI read the source... and knows it... ;)

Mandrake must have pulled the xfs patches from SGI at exactly the wrong 
time :(
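
For anyone wondering why an extra kmap() would only bite on HIGHMEM pages, 
here is a rough user-space model. It is a sketch, not kernel code, and it 
is only one plausible reading of the bug: the thread never spells out the 
mechanism. It relies on two facts about 2.4: kmap() on a lowmem page just 
returns its permanent address, while kmap() on a highmem page takes a 
reference on a slot from a small fixed pool and sleeps when the pool is 
empty. Everything named below (toy_page, toy_kmap, toy_kunmap, run_writes, 
POOL_SLOTS) is made up for illustration, and the assumption that the write 
path unmaps each page exactly once is mine.

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_SLOTS 4            /* hypothetical: tiny stand-in for the kmap pool */

struct toy_page {
	bool highmem;           /* true if the page lives in high memory */
	int  map_count;         /* outstanding toy_kmap() calls on this page */
};

static int slots_in_use;

/* Returns 0 on success, -1 where a real kmap() would sleep waiting for a slot. */
static int toy_kmap(struct toy_page *p)
{
	if (!p->highmem)
		return 0;       /* lowmem is always mapped, no slot needed */
	if (p->map_count == 0) {
		if (slots_in_use == POOL_SLOTS)
			return -1;
		slots_in_use++;
	}
	p->map_count++;
	return 0;
}

static void toy_kunmap(struct toy_page *p)
{
	if (!p->highmem)
		return;
	assert(p->map_count > 0);
	if (--p->map_count == 0)
		slots_in_use--; /* last reference gone, slot is free again */
}

/*
 * Writes npages distinct pages and returns how many writes finished before
 * a kmap would have stalled. extra_kmap models the pre-patch memset() that
 * called kmap() on a page the prepare step had already mapped.
 */
static int run_writes(bool highmem, bool extra_kmap, int npages)
{
	int done = 0;

	slots_in_use = 0;
	for (int i = 0; i < npages; i++) {
		struct toy_page page = { highmem, 0 };

		if (toy_kmap(&page) != 0)       /* prepare step maps the page */
			break;
		if (extra_kmap)
			(void)toy_kmap(&page);  /* second, unmatched mapping */
		toy_kunmap(&page);              /* assumed: one unmap per write */
		done++;
	}
	return done;
}
```

In this model run_writes(false, true, 10) finishes all 10 writes because 
lowmem never touches the pool, run_writes(true, true, 10) stalls after 
POOL_SLOTS writes because each page leaks one reference, and 
run_writes(true, false, 10) finishes all 10 again. Swapping kmap(page) for 
page_address(page) in the patch corresponds to dropping the extra_kmap 
step: the address is looked up without taking another reference.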


--
Bryan Whitehead
SysAdmin - JPL - Interferometry Systems and Technology
Phone: 818 354 2903
[EMAIL PROTECTED]