Re: XFS corruption during power-blackout
On Tuesday, 5 July 2005 20:10, Sonny Rao wrote: On Tue, Jul 05, 2005 at 08:25:11PM +0300, Al Boldi wrote: Sonny Rao wrote: { On Wed, Jun 29, 2005 at 07:53:09AM +0300, Al Boldi wrote:

What I found were 4 things in the dest dir:
1. Missing dirs, files. That's OK.
2. Files of size 0. That's acceptable.
3. Corrupted files. That's unacceptable.
4. Corrupted files with original fingerprint. That's ABSOLUTELY unacceptable.

Moral of the story is: What's ext3 doing that the others aren't?

Ext3 has stronger guarantees than basic filesystem consistency. I.e. in ordered mode, file data is always written before metadata, so the worst that can happen is that a growing file's new data is written but the metadata isn't updated before a power failure... so the new writes wouldn't be seen afterwards. }

Sonny, thanks for your input! Is there an option in XFS, ReiserFS or JFS to enable ordered mode?

I believe in newer 2.6 kernels Reiser has ordered mode (IIRC, courtesy of Chris Mason),

And SuSE, ack. ftp://ftp.suse.com/pub/people/mason/patches/data-logging They have been around for some time ;-)

but XFS and JFS do not support it. I seem to remember Shaggy (the JFS maintainer) saying that in older 2.4 kernels he tried to write file data before metadata but had to change that behavior in 2.6; not really sure why, or anything beyond that.

Greetings, Dieter
--
Dieter Nützel @home: Dieter () nuetzel-hh ! de
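[Editor's note: the ordered mode discussed above is selected at mount time. A hypothetical /etc/fstab sketch; device names and mount points are made up, and reiserfs only understands data=ordered with the data-logging patches applied or a late enough 2.6 kernel:]

```
# ext3: ordered mode (the usual default) - data blocks hit disk before metadata
/dev/hda2  /home  ext3      defaults,data=ordered  1 2
# reiserfs: needs Chris Mason's data-logging patches (or a recent 2.6 kernel)
/dev/hda3  /srv   reiserfs  defaults,data=ordered  1 2
```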
Re: ReiserFS quota and 2.6.10
On Wednesday, 23 February 2005 11:28, Jan Kara wrote: Because I like to keep people happy, especially Valdis Kletnieks, I'll repost my query from another email account without a disclaimer... here we go. :)

The disclaimer was quite a bit longer than the message... ==

From the subject, you have probably already guessed what I am about to ask: is quota working OK with vanilla 2.6.10 with ReiserFS?

Yes, quota should be working fine - I suggest waiting a bit for 2.6.11, as it has fixed some deadlocks in the quota code that appeared under high load.

Is this fixed in the SLES kernels already?

Thanks, Dieter
Re: Question on Reiser4 regarding power failures
On Monday, 29 November 2004 10:25, Bernhard Prell wrote: Thank you very much for your feedback so far!

Kerin Millar wrote: For this reason, and because I believe that the stability of reiserfs was improved drastically in later revisions of the 2.4 kernel, I would urge you to consider using a modern 2.4 kernel and the latest reiserfs tools if possible!

I don't want to describe why we are still using such an outdated system :-) but we will move to Gentoo with a current 2.6.x kernel and reiser4 soon. I just wanted to know if this will eliminate the problem once and for all - at least in theory - because of the new concepts in reiser4 (atomic transactions). --

Christian Mayrhuber wrote: I'd suggest doing the following for pull-the-plug scenarios on production systems with reiserfs: 1) Disable write caching for IDE drives with hdparm -W 0 /dev/hdX This is the most important thing to do.

Will this really help to protect against partially written sectors and the resulting read errors (if a disk loses power while writing a sector, the CRC check will fail and the disk reports a read error that's not caused by a real hardware defect)? Changing the write cache strategy just moves the problem in time - maybe the probability that something happens is lower, because the amount of data that gets written at a certain point in time is smaller.

4) If you want to prevent data corruption and not just filesystem corruption, use a recent 2.6.x kernel which incorporates Chris Mason's data logging patches. Alternatively you could use a newer SuSE, which has the data logging patches in the 2.4 kernel series, too. Maybe the installation of a newer 2.4.x kernel rpm from ftp.suse.com will work for you.

As reiser4 is atomic in its operations, it should protect your data even better than ext3/reiserfs with the data=journal mount option, unless you have the write cache for IDE drives enabled. Reiser4 is still beta.
I am not concerned about the current beta quality; we will migrate early next year, and someday the bugs will be fixed. Updating a SuSE system (without reinstalling) is quite a pain, as I have experienced; it will be easier with Gentoo to keep a reasonably current system.

So why don't you use a current 2.4 kernel with Chris Mason's (SuSE) latest ReiserFS 3.6.xx patches? Or maybe a newer SuSE 2.4 kernel (with Chris' patches)? ftp://ftp.suse.com/pub/people/mason/patches/data-logging 2.4.25 is current for data=ordered|journal

Greetings, Dieter
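[Editor's note: for completeness, step 1 from the list above looks like this in practice. The device name is an example; -W 0 turns off the drive's write-back cache, at a real cost in write throughput:]

```
# show the drive's current write-caching setting
hdparm -W /dev/hda
# disable the IDE write cache, as recommended for pull-the-plug safety
hdparm -W 0 /dev/hda
```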
Re: 2.6.4 corruption
On Monday, 22 March 2004 01:55, Tom Vier wrote: After rebooting (no umount, sysrq wouldn't respond) due to a radeon problem, some files had their data blocks switched. One file was truncated, and what was truncated (it's mounted notail, btw) was attached to another file. One file was completely corrupt. This isn't considered normal behavior, I assume. I'm currently bootstrapping an altroot, so I can run reiserfsck.

Use the data-logging patches for 2.6.4 and up, or wait for 2.6.5/2.6.6. http://marc.theaimsgroup.com/?l=reiserfs&m=107967943828838&w=2 and http://marc.theaimsgroup.com/?l=reiserfs&m=107981018429164&w=2

BTW, I've been with DRI-Devel since the beginning, and data-logging WORKS, even with 2.4.

Regards, Dieter
Re: new v3 2.6.4 logging/xattr patches
On Sunday, 21 March 2004 20:26, Dieter Nützel wrote: On Sunday, 21 March 2004 20:22, Dieter Nützel wrote: On Sunday, 21 March 2004 17:55, you wrote: On Sun, 2004-03-21 at 11:44, Dieter Nützel wrote:

The suse kernel of the day ;-) My experimental directory is a dump of all the suse reiserfs patches. kernel-source-2.6.4-9.19.i586 (18.03.2004) or kernel-source-2.6.4-13.2.i586 (20.03.2004) Works out of the box. Only Bootsplash is missing. Argh, K3b isn't working, again. All SCSI (DVD, CD-RW, etc.).

The latest SuSE 9.0 kernel, kernel-source-2.6.4-13.12.i586 (21.03.2004), Linux version 2.6.4-13.12-smp ([EMAIL PROTECTED]) (gcc version 3.3.1 (SuSE Linux)) #1 SMP Mon Mar 22 14:19:29 CET 2004, is very GOOD in all the above aspects. ;-)

dual Athlon MP 1900+
1 GB DDR266, CL2 (2x 512 MB)
Fujitsu MAS3184NP, 15k rpm

Software RAID1

SunWave1 share/dbench# free -t
        total    used    free  shared  buffers  cached
Mem:  1036680  405904  630776       0    53880  137472
-/+ buffers/cache:  214552  822128
Swap: 2103288       0 2103288
Total: 3139968 405904 2734064

SunWave1 share/dbench# time dbench 32
32 clients started
Throughput 118.808 MB/sec (NB=148.51 MB/sec 1188.08 MBit/sec)
32 procs
8.250u 35.218s 0:35.55 122.2% 0+0k 0+0io 0pf+0w

SunWave1 share/dbench# free -t
        total    used    free  shared  buffers  cached
Mem:  1036680  594592  442088       0    52348   96232
-/+ buffers/cache:  446012  590668
Swap: 2103288       0 2103288
Total: 3139968 594592 2545376

Max load was ~15.
--
Second run:
Throughput 131.63 MB/sec (NB=164.537 MB/sec 1316.3 MBit/sec)
32 procs
8.347u 36.151s 0:32.09 138.6% 0+0k 0+0io 0pf+0w

SunWave1 share/dbench# free -t
        total    used    free  shared  buffers  cached
Mem:  1036680  661580  375100       0    49748   72040
-/+ buffers/cache:  539792  496888
Swap: 2103288       0 2103288
Total: 3139968 661580 2478388

Max load was ~18.
** Software RAID0

SunWave1 SOURCE/dbench# free -t
        total    used    free  shared  buffers  cached
Mem:  1036680  665544  371136       0    51852   76056
-/+ buffers/cache:  537636  499044
Swap: 2103288       0 2103288
Total: 3139968 665544 2474424

SunWave1 SOURCE/dbench# time dbench 32
Throughput 190.978 MB/sec (NB=238.723 MB/sec 1909.78 MBit/sec)
32 procs
8.204u 32.067s 0:22.12 182.0% 0+0k 0+0io 0pf+0w

SunWave1 SOURCE/dbench# free -t
        total    used    free  shared  buffers  cached
Mem:  1036680  437108  599572       0    59028   71192
-/+ buffers/cache:  306888  729792
Swap: 2103288       0 2103288
Total: 3139968 437108 2702860

Max load was ~9.
--
SunWave1 SOURCE/dbench# free -t
        total    used    free  shared  buffers  cached
Mem:  1036680  437280  599400       0    59460   71032
-/+ buffers/cache:  306788  729892
Swap: 2103288       0 2103288
Total: 3139968 437280 2702688

SunWave1 SOURCE/dbench# time dbench 32
Throughput 184.295 MB/sec (NB=230.369 MB/sec 1842.95 MBit/sec)
32 procs
8.115u 32.166s 0:22.92 175.6% 0+0k 0+0io 0pf+0w

SunWave1 SOURCE/dbench# free -t
        total    used    free  shared  buffers  cached
Mem:  1036680  523316  513364       0    78532   67260
-/+ buffers/cache:  377524  659156
Swap: 2103288       0 2103288
Total: 3139968 523316 2616652

Max load was ~10.

Greetings, Dieter
Re: [PATCH] updated data=ordered patch for 2.6.3
On Wednesday, 3 March 2004 10:13, Marc-Christian Petersen wrote: On Tuesday, 02 March 2004 20:53, Dieter Nützel wrote:

Hi Dieter,

I'll try on SuSE 2.6.3-16. Sorry for my ignorance, but where do you find 2.6.3-_16_? I only find -0.

ftp://sunsite.informatik.rwth-aachen.de/pub/Linux/suse/people/kraxel Or the original site and its mirrors.

Regards, Dieter
Re: v3 logging speedups for 2.6
On Thursday, 11 December 2003 19:42, Chris Mason wrote: On Thu, 2003-12-11 at 13:30, Dieter Nützel wrote: On Thursday, 11 December 2003 19:10, Chris Mason wrote:

Hello everyone, This is part one of the data logging port to 2.6; it includes all the cleanups and journal performance fixes. Basically, it's everything except the data=journal and data=ordered changes. The 2.6 merge has a few new things as well; I've changed things around so that metadata and log blocks will go onto the system dirty lists. This should make it easier to improve log performance, since most of the work will be done outside the journal locks.

The code works for me, but should be considered highly experimental. In general, it is significantly faster than vanilla 2.6.0-test11; I've done tests with dbench, iozone, synctest and a few others. Streaming writes didn't see much improvement (they were already at disk speeds), but most other tests did. Anyway, for the truly daring among you: ftp.suse.com/pub/people/mason/patches/data-logging/experimental/2.6.0-test11 The more bug reports I get now, the faster I'll be able to stabilize things. Get the latest reiserfsck and check your disks after each use.

Chris, with which kernel should I start on my SuSE 9.0? A special SuSE 2.6.0-test11 + data logging? Or plain vanilla? --- There are so many patches in SuSE kernels...

For the moment you can only try it on vanilla 2.6.0-test11. The suse 2.6 rpms have acls/xattrs and the new logging stuff won't apply. Jeff and I will fix that when the logging merge is really complete. At the rate I'm going, that should be by the end of next week; this part of the merge was the really tricky bits.

Chris, can we have something against Gerd Knorr's [EMAIL PROTECTED] SuSE 2.6.1 kernel version, please?
reiserfs-journal-writer works fine (applies), still compiling... ;-)

reiserfs-logging shows some rejects:

SunWave1 src/linux# patch -p1 -E -N < ../patches/reiserfs-logging
patching file fs/reiserfs/journal.c
Hunk #38 FAILED at 2217.
Hunk #39 succeeded at 2256 (offset 3 lines).
Hunk #40 FAILED at 2294.
Hunk #41 succeeded at 2423 with fuzz 1 (offset 40 lines).
Hunk #42 succeeded at 2438 (offset 40 lines).
Hunk #43 succeeded at 2456 (offset 40 lines).
Hunk #44 succeeded at 2480 (offset 40 lines).
Hunk #45 succeeded at 2519 (offset 40 lines).
Hunk #46 succeeded at 2581 (offset 56 lines).
Hunk #47 succeeded at 2606 (offset 56 lines).
Hunk #48 succeeded at 2657 (offset 60 lines).
Hunk #49 succeeded at 2727 (offset 60 lines).
Hunk #50 succeeded at 2744 (offset 60 lines).
Hunk #51 succeeded at 2754 (offset 60 lines).
Hunk #52 succeeded at 2792 (offset 60 lines).
Hunk #53 succeeded at 2832 (offset 60 lines).
Hunk #54 succeeded at 2856 (offset 60 lines).
Hunk #55 succeeded at 2888 (offset 60 lines).
Hunk #56 succeeded at 2897 (offset 60 lines).
Hunk #57 succeeded at 2985 (offset 60 lines).
Hunk #58 FAILED at 3036.
Hunk #59 succeeded at 3062 (offset 64 lines).
Hunk #60 succeeded at 3096 with fuzz 1 (offset 67 lines).
Hunk #61 succeeded at 3113 (offset 67 lines).
Hunk #62 succeeded at 3147 (offset 67 lines).
Hunk #63 succeeded at 3163 (offset 67 lines).
Hunk #64 succeeded at 3176 (offset 67 lines).
Hunk #65 succeeded at 3183 (offset 67 lines).
Hunk #66 succeeded at 3219 (offset 67 lines).
Hunk #67 succeeded at 3241 (offset 67 lines).
3 out of 67 hunks FAILED -- saving rejects to file fs/reiserfs/journal.c.rej
patching file fs/reiserfs/objectid.c
patching file fs/reiserfs/super.c
Hunk #1 succeeded at 61 (offset 2 lines).
Hunk #2 succeeded at 90 (offset 2 lines).
Hunk #3 succeeded at 844 with fuzz 1 (offset 35 lines).
Hunk #4 succeeded at 862 with fuzz 2 (offset 37 lines).
Hunk #5 succeeded at 1442 with fuzz 1 (offset 47 lines).
patching file fs/reiserfs/ibalance.c
patching file fs/reiserfs/procfs.c
patching file fs/reiserfs/fix_node.c
patching file fs/reiserfs/inode.c
Hunk #1 FAILED at 960.
Hunk #2 succeeded at 1629 (offset 12 lines).
1 out of 2 hunks FAILED -- saving rejects to file fs/reiserfs/inode.c.rej
patching file fs/reiserfs/do_balan.c
patching file mm/page-writeback.c
patching file include/linux/reiserfs_fs_i.h
Hunk #2 FAILED at 50.
1 out of 2 hunks FAILED -- saving rejects to file include/linux/reiserfs_fs_i.h.rej
patching file include/linux/reiserfs_fs_sb.h
Hunk #1 succeeded at 107 (offset 1 line).
Hunk #2 succeeded at 121 (offset 1 line).
Hunk #3 FAILED at 155.
Hunk #4 succeeded at 166 (offset 5 lines).
Hunk #5 succeeded at 207 (offset 5 lines).
Hunk #6 succeeded at 228 (offset 5 lines).
Hunk #7 succeeded at 421 (offset 9 lines).
Hunk #8 succeeded at 491 (offset 24 lines).
Hunk #9 succeeded at 500 (offset 24 lines).
1 out of 9 hunks FAILED -- saving rejects to file include/linux/reiserfs_fs_sb.h.rej
patching file include/linux/reiserfs_fs.h

I don't have the time to do it myself today...

--
Dieter Nützel @home: Dieter.Nuetzel () hamburg ! de
Re: v3 logging speedups for 2.6
On Monday, 12 January 2004 21:08, Dieter Nützel wrote: On Thursday, 11 December 2003 19:42, Chris Mason wrote: [full quote of Chris Mason's data logging announcement snipped; it appears in the previous message]

Chris, can we have something against Gerd Knorr's [EMAIL PROTECTED] SuSE 2.6.1 kernel version, please?

reiserfs-journal-writer works fine (applies), still compiling... ;-)

Works fine!
Greetings, Dieter

[reiserfs-logging reject listing snipped; it appears in full in the previous message]
Re: reiserfsprogs-3.6.12-pre1 release
Vitaly Fertman wrote (ao): The new pre-release is available for download at ftp://ftp.namesys.com/pub/reiserfsprogs/pre/reiserfsprogs-3.6.12-pre1.tar.gz The release includes: * bad block support; documentation is available at http://www.namesys.com/bad-block-handling.html * reiserfsck can check ro-mounted filesystems.

Why is the SuSE (even 9.0) reiserfsprogs version (reiserfs-3.6.9-33.src.rpm) so OLD? --- It is much slower and outdated. I used the latest (3.6.11) on my older SuSE 7.3 without a hitch. It is somewhat annoying that SuSE 9.0 was released without at least 3.6.11. I couldn't compile 3.6.11 and 3.6.12-pre1 myself because SuSE applies some patches to their source tarball.

Greetings, Dieter
Re: Horrible ftruncate performance
On Wednesday, 16 July 2003 12:57, Oleg Drokin wrote: Hello! On Wed, Jul 16, 2003 at 12:47:53PM +0200, Dieter Nützel wrote:

Somewhat. Mouse movement is OK, now. But...

1+0 records out
0.000u 3.090s 0:16.81 18.3% 0+0k 0+0io 153pf+0w
0.000u 0.050s 0:00.27 18.5% 0+0k 0+0io 122pf+0w
INSTALL/SOURCE> time dd if=/dev/zero of=sparse1 bs=1 seek=200G count=1 ; time sync
1+0 records in
1+0 records out
0.000u 3.010s 0:15.27 19.7% 0+0k 0+0io 153pf+0w
0.000u 0.020s 0:01.01 1.9% 0+0k 0+0io 122pf+0w

So you create a file in 15 seconds. Right. And remove it in 15 seconds. No. Normally ~5 seconds. Ah, yes. Looking at the wrong timing info ;) I see that yesterday without the patch you had 1m, 9s, 5s, 2m times for 4 deletes... Kind of nothing changed, except the mouse now moves.

INSTALL/SOURCE> time rm sparse ; time sync
0.000u 14.990s 1:31.15 16.4% 0+0k 0+0io 130pf+0w
0.000u 0.030s 0:00.22 13.6% 0+0k 0+0io 122pf+0w

So the stuff fell out of cache and we need to read it again. Shouldn't this take only 15 seconds, then? Probably there was some seeking due to removal of lots of blocks. Worst case was ~5 minutes. Yeah, this is of course sad. BTW, is this with the search_reada patch? Yes. What if you try without it? Does _NOT_ really help.

INSTALL/SOURCE> l
total 1032
drwxrwxr-x  2 root    root       176 Jul 17 20:05 .
drwxr-xr-x  3 root    root        72 Jul  3 01:39 ..
-rw-r--r--  1 nuetzel users       452390 Jul 15 00:29 kmplayer-0.7.96.tar.bz2
-rw-r--r--  1 nuetzel users       403358 Jul 14 21:46 modutils-2.4.21-18.src.rpm
-rw-r--r--  1 nuetzel users       194505 Jul 14 22:01 procps-2.0.13-1.src.rpm

INSTALL/SOURCE> time dd if=/dev/zero of=sparse bs=1 seek=200G count=1 ; time sync
1+0 records in
1+0 records out
0.000u 2.770s 0:15.88 17.4% 0+0k 0+0io 153pf+0w
0.000u 0.000s 0:00.79 0.0% 0+0k 0+0io 122pf+0w
INSTALL/SOURCE> time dd if=/dev/zero of=sparse1 bs=1 seek=200G count=1 ; time sync
1+0 records in
1+0 records out
0.010u 2.440s 0:15.03 16.3% 0+0k 0+0io 153pf+0w
0.010u 0.020s 0:01.08 2.7% 0+0k 0+0io 122pf+0w
INSTALL/SOURCE> time dd if=/dev/zero of=sparse2 bs=1 seek=200G count=1 ; time sync
1+0 records in
1+0 records out
0.010u 2.710s 0:14.94 18.2% 0+0k 0+0io 153pf+0w
0.000u 0.000s 0:01.76 0.0% 0+0k 0+0io 122pf+0w

INSTALL/SOURCE> l
total 615444
drwxrwxr-x  2 root    root           248 Jul 17 20:06 .
drwxr-xr-x  3 root    root            72 Jul  3 01:39 ..
-rw-r--r--  1 nuetzel users       452390 Jul 15 00:29 kmplayer-0.7.96.tar.bz2
-rw-r--r--  1 nuetzel users       403358 Jul 14 21:46 modutils-2.4.21-18.src.rpm
-rw-r--r--  1 nuetzel users       194505 Jul 14 22:01 procps-2.0.13-1.src.rpm
-rw-r--r--  1 nuetzel users 214748364801 Jul 17 20:06 sparse
-rw-r--r--  1 nuetzel users 214748364801 Jul 17 20:06 sparse1
-rw-r--r--  1 nuetzel users 214748364801 Jul 17 20:07 sparse2

INSTALL/SOURCE> time sync
0.000u 0.000s 0:00.02 0.0% 0+0k 0+0io 122pf+0w
INSTALL/SOURCE> time rm sparse2 ; time sync
0.000u 4.860s 0:04.82 100.8% 0+0k 0+0io 130pf+0w
0.010u 0.000s 0:00.03 33.3% 0+0k 0+0io 122pf+0w
INSTALL/SOURCE> time rm sparse1 ; time sync
0.000u 4.910s 0:04.82 101.8% 0+0k 0+0io 130pf+0w
0.000u 0.020s 0:00.03 66.6% 0+0k 0+0io 122pf+0w
!!!
INSTALL/SOURCE> time rm sparse ; time sync
0.010u 6.500s 0:48.47 13.4% 0+0k 0+0io 130pf+0w
0.000u 0.000s 0:00.02 0.0% 0+0k 0+0io 122pf+0w
!!!
INSTALL/SOURCE> l
total 1032
drwxrwxr-x  2 root    root       176 Jul 17 20:08 .
drwxr-xr-x  3 root    root        72 Jul  3 01:39 ..
-rw-r--r--  1 nuetzel users       452390 Jul 15 00:29 kmplayer-0.7.96.tar.bz2
-rw-r--r--  1 nuetzel users       403358 Jul 14 21:46 modutils-2.4.21-18.src.rpm
-rw-r--r--  1 nuetzel users       194505 Jul 14 22:01 procps-2.0.13-1.src.rpm

Overwrite:

INSTALL/SOURCE> time sync
0.000u 0.000s 0:00.02 0.0% 0+0k 0+0io 122pf+0w
INSTALL/SOURCE> time dd if=/dev/zero of=sparse bs=1 seek=200G count=1 ; time sync
1+0 records in
1+0 records out
0.010u 2.890s 0:16.17 17.9% 0+0k 0+0io 153pf+0w
0.000u 0.020s 0:01.27 1.5% 0+0k 0+0io 122pf+0w

INSTALL/SOURCE> l
total 205836
drwxrwxr-x  2 root    root           200 Jul 17 20:09 .
drwxr-xr-x  3 root    root            72 Jul  3 01:39 ..
-rw-r--r--  1 nuetzel users       452390 Jul 15 00:29 kmplayer-0.7.96.tar.bz2
-rw-r--r--  1 nuetzel users       403358 Jul 14 21:46 modutils-2.4.21-18.src.rpm
-rw-r--r--  1 nuetzel users       194505 Jul 14 22:01 procps-2.0.13-1.src.rpm
-rw-r--r--  1 nuetzel users 214748364801 Jul 17 20:09 sparse

INSTALL/SOURCE> time sync
0.000u 0.010s 0:00.09 11.1% 0+0k 0+0io 122pf+0w
INSTALL/SOURCE> time dd if=/dev/zero
Re: Horrible ftruncate performance
On Wednesday, 16 July 2003 12:35, Oleg Drokin wrote: Hello! On Tue, Jul 15, 2003 at 09:55:09PM +0200, Dieter Nützel wrote:

Somewhat. Mouse movement is OK, now. But...

1+0 records out
0.000u 3.090s 0:16.81 18.3% 0+0k 0+0io 153pf+0w
0.000u 0.050s 0:00.27 18.5% 0+0k 0+0io 122pf+0w
INSTALL/SOURCE> time dd if=/dev/zero of=sparse1 bs=1 seek=200G count=1 ; time sync
1+0 records in
1+0 records out
0.000u 3.010s 0:15.27 19.7% 0+0k 0+0io 153pf+0w
0.000u 0.020s 0:01.01 1.9% 0+0k 0+0io 122pf+0w

So you create a file in 15 seconds. Right. And remove it in 15 seconds. No. Normally ~5 seconds.

INSTALL/SOURCE> time rm sparse2 ; time sync
0.000u 4.930s 0:04.88 101.0% 0+0k 0+0io 130pf+0w
0.000u 0.010s 0:00.02 50.0% 0+0k 0+0io 122pf+0w

Kind of nothing changed, except the mouse now moves. Yes. Am I reading this wrong? No. ;-)

INSTALL/SOURCE> time rm sparse ; time sync
0.000u 14.990s 1:31.15 16.4% 0+0k 0+0io 130pf+0w
0.000u 0.030s 0:00.22 13.6% 0+0k 0+0io 122pf+0w

So the stuff fell out of cache and we need to read it again. Shouldn't this take only 15 seconds, then? Worst case was ~5 minutes. Hence the increased time. Hm, probably this case can be optimized if there is only one item in the leaf and this item should be removed. Need to take a closer look at the balancing code. Now, out of office.

Greetings, Dieter
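[Editor's note: the `dd if=/dev/zero of=sparse bs=1 seek=200G count=1` command used throughout this thread creates its 200 GB test file sparsely. A minimal Python sketch (not from the thread) of the same trick, showing that only a handful of blocks are actually allocated:]

```python
import os
import tempfile

# Seek far past EOF and write a single byte, just like the dd command:
# the logical size becomes 200 GiB + 1 byte, but the rest is a hole.
path = os.path.join(tempfile.mkdtemp(), "sparse")
with open(path, "wb") as f:
    f.seek(200 * 1024**3)  # 200 GiB hole
    f.write(b"\0")         # one byte of real data at the far end

st = os.stat(path)
print("logical size:", st.st_size)       # 214748364801 bytes, as in the ls output
print("allocated:", st.st_blocks * 512)  # far smaller than the logical size
os.remove(path)
```

This is why creating the file is nearly free while removing it can be slow: the unlink has to walk and free whatever metadata the filesystem built for the hole.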
Re: Horrible ftruncate performance
On Friday, 11 July 2003 17:24, Oleg Drokin wrote: Hello! On Fri, Jul 11, 2003 at 05:16:56PM +0200, Dieter Nützel wrote:

OK, some hand work... Where does this come from? I can't find it in my tree: fs/reiserfs/inode.c

- if (un.unfm_nodenum) inode->i_blocks += inode->i_sb->s_blocksize / 512; //mark_tail_converted (inode);

Thanks, Dieter
Re: Horrible ftruncate performance
On Friday, 11 July 2003 17:32, Oleg Drokin wrote: Hello! On Fri, Jul 11, 2003 at 05:27:25PM +0200, Dieter Nützel wrote:

Actually I did it already, as the data-logging patches can be applied to 2.4.22-pre3 (where this truncate patch was included). No -aaX. Right.

Maybe it _IS_ time for this _AND_ all the other data-logging patches? 2.4.22-pre5? It's Chris' turn. I thought it was a good idea to test in -ac first, though (even taking into account that these patches are part of SuSE's stock kernels). I don't think -ac would make it = no big Reiser crowd involved... Would make what? I think Alan has already agreed to put the data-logging code in. OK, good to hear. But all the ReiserFS data-logging users are running -aa (SuSE) kernels (apart from WOLK ;-).

-Dieter
Re: Horrible ftruncate performance
On Friday, 11 July 2003 17:36, Marc-Christian Petersen wrote: On Friday, 11 July 2003 17:32, Dieter Nützel wrote:

Hi Dieter,

Where does this come from? I can't find it in my tree: fs/reiserfs/inode.c

- if (un.unfm_nodenum) inode->i_blocks += inode->i_sb->s_blocksize / 512; //mark_tail_converted (inode);

The reiserfs-quota patch removes this. Ah, thank you very much. I can go ahead, then.

-Dieter
Re: Horrible ftruncate performance
On Friday, 11 July 2003 20:32, Chris Mason wrote: On Fri, 2003-07-11 at 13:27, Dieter Nützel wrote:

2.5 porting work has restarted at last; Oleg's really been helpful with keeping the 2.4 stuff up to date. Nice, but patches against the latest -aa could be helpful, then. Hmmm, the latest -aa isn't all that latest right now. True. 2.4.21-jam1 has 2.4.21pre8aa1 plus several other things. Do you want something against 2.4.21-rc8aa1, or should I wait until Andrea updates to 2.4.22-pre something? I think the latter is better.

Up and running with hand-crafted stuff ;-)

As a reminder, the old numbers (single U160, IBM 10k rpm), 2.4.21-jam1 (aa1) plus all data-logging:

SOURCE/dri-trunk> time dd if=/dev/zero of=sparse bs=1 seek=200G count=1
0.000u 362.760s 8:26.34 71.6% 0+0k 0+0io 124pf+0w

It was running with a parallel C++ (-j20) compilation. Now the new ones (I've tested with and without the above C++ compilation ;-)

INSTALL/SOURCE> time dd if=/dev/zero of=sparse2 bs=1 seek=200G count=1
1+0 records in
1+0 records out
0.000u 0.010s 0:00.00 0.0% 0+0k 0+0io 153pf+0w

INSTALL/SOURCE> l
total 19294
drwxrwxr-x  2 root    root           192 Jul 11 21:46 .
drwxr-xr-x  3 root    root            72 Jul  3 01:39 ..
-rw-r--r--  1 nuetzel users      1696205 Jul  3 01:32 atlas3.5.6.tar.bz2
-rw-r--r--  1 nuetzel users      2945814 Jul  2 02:53 k3b-0.9pre2.tar.gz
-rw-r--r--  1 nuetzel users     15078557 Jul  2 03:04 movix-0.8.0rc2.tar.gz
-rw-r--r--  1 nuetzel users 214748364801 Jul 11 21:46 sparse2

More than 50,000 times faster: 506.34 seconds (8:26.34) / 0.01 seconds = 50,634 times ;-)))

GREAT stuff! Thanks, Dieter
Re: Error messages.
On Thursday, 6 March 2003 13:23, Anders Widman wrote:

Hardware is so much fun to debug sometimes, and when you are the 1% case life can really suck. Do you have: bad cooling? Nope. Not warmer than 35°C anywhere, including the surface of the drives. A bad power supply? Well, this has been checked too, though I could not be entirely sure. I have used two different Chieftec 340W PSUs. Bad voltage from the power company? The power distribution facility is just about 300m from here. And I have installed line filters that take care of noise and spikes. Electrical noise? How do I measure this?... Might be, as there are many drives installed. They might disturb each other. ?

I think though that Oleg is right that you should contact the IDE guys and ask them for their list of things to check for that particular error message (and tell us about it). Yes, I am trying to get in contact with the linux-ide mailing list. We'll see what comes out of that. Prior experience with that list is not very successful ;) Maybe simply l-k? Or Alan Cox or Andre Hedrick directly? ;-)

Regards, Dieter
Re: What Filesystem?
On Wednesday, 29 January 2003 16:28, Ross Vandegrift wrote: On Wed, Jan 29, 2003 at 03:20:26PM -0500, James Thompson wrote:

I am a visual artist and musician.

Check out the document at http://myweb.cableone.net/eviltwin69/ALSA_JACK_ARDOUR.html. There's a section that benchmarks various filesystems for their latency. The short story is that ReiserFS wins handily over ext2, ext3, and FAT32 (duh! why would someone test that...). Unfortunately, there's no comparison between Reiser/JFS/XFS. I think that would be more of a fair match.

XFS could win _today_ for real time (guaranteed video bandwidth, for which it was designed in the first place), but this could change (be compensated for) with the latest ReiserFS 3.x patches (data logging) and finally Reiser4 (see the Reiser homepage).

Anyhow, for general low-latency multimedia work, ReiserFS looks like it's a good choice. Yes. Go with the low-latency _and_ preemption patches (try Gentoo, it has it all). I did beta testing for Robert Love (MontaVista) for ages, and it is _the way to go_ for multimedia machines.

Greetings, Dieter
--
Dieter Nützel Graduate Student, Computer Science University of Hamburg Department of Computer Science @home: Dieter.Nuetzel at hamburg.de (replace at with @)
Re: What Filesystem?
On Wednesday, 29 January 2003 19:16, [EMAIL PROTECTED] wrote: On Wed, 29 Jan 2003 15:20:26 EST, James Thompson [EMAIL PROTECTED] said: ...

I have done much research in the field of computer /hardware/ suitable for commercial Digital Content Creation (P4 Xeon; Wildcat graphics; ART's PURE raytracing PCI render board, etc. ...) and now

Hmm... so we're looking at high-end rendering, which is usually a CPU hog.

1. Kernel Patches - preempt and low latency; so I'm not clear on why you're worried about low latency? Remember that it *does* come with an overhead. What? --- Sorry, quantify it. I can't. All low-latency and preemption tests have shown _improved_ throughput (yes) and a much better multimedia (video/audio) experience ;-) No measurable overhead on single and SMP systems (both Athlon). That's why it is in 2.5/2.6.

The low-latency stuff is good if you're more concerned about fast response than total system load (for instance, on my laptop I'm willing to give up 5% of the CPU if it makes the X server run perceivably faster; if I was doing a lot of rendering, I'd want that 5% for user cycles). But you want to have the much better task-switching behavior together with the brand new O(1) scheduler.

Could you be more specific about what sort of content you are making? (i.e. single-frame images suitable for monitor display (1600x1200 and smaller), or large-format for high-resolution printing (posters, etc.), or video, etc.) The resources needed to produce a 10-minute video clip are different from the things you'll need to produce a 4-foot x 5-foot poster at 600 DPI. You mean single vs. 2-/4-/8-/etc. (NUMA) SMP systems or even clusters? ;-)

Greetings, Dieter
--
Dieter Nützel Graduate Student, Computer Science University of Hamburg Department of Computer Science @home: Dieter.Nuetzel at hamburg.de (replace at with @)
Re: slightly [OT] highmem (was Re: 2.4.20 at kernel.org and data logging)
On Friday, 24 January 2003 18:03, Oleg Drokin wrote: Hello! On Fri, Jan 24, 2003 at 06:00:19PM +0100, Dieter Nützel wrote: highmem4GB / highmem64GB with PAE, or does it produce more overhead, as you mention below? You get no advantage of course. But lots of overhead. Rumours have it that 256M systems with highmem-enabled kernels (default for the RedHat beta, it seems) are swapping much more than when the same kernel is built with highmem off. But that could be because they have forgotten to enable HIGHMEM IO? See Andrea Arcangeli's -aa kernels. What HIGHMEM IO? There is exactly NO highmem, so highmem IO code won't be used. Yes, you are right. There isn't any physical highmem. Shows how stupid the above idea is. Manuel's system doesn't have highmem either. So he should see the performance degradation, too. Thanks, Dieter 2.4.20-aa1 for example: CONFIG_HIGHIO: If you want to be able to do I/O to high memory pages, say Y. Otherwise low memory pages are used as bounce buffers causing a degrade in performance.
Re: what do you do that stresses your filesystem?
On Monday, 23 December 2002 12:37, Anders Widman wrote: We were discussing how to optimize reiser4 best, and came to realize that us developers did not have a good enough intuition for what users do that stresses their filesystem enough that they care about its performance. If you just do edits of files it probably does not matter too much what fs you use. depends on how large edits =).. Like video-editing requires big r/w performance. Like Andrew Morton suggests all the time (deadline IO tests): Running (heavy) constant (streaming) IO in the background during normal IO (20-40 parallel gcc/g++ compilers (make -j20) in my case). WMs, e.g. KDE 3.1, aren't a problem case any longer...;-) Give your workers a break. Merry Christmas! -- Dieter Nützel Graduate Student, Computer Science University of Hamburg Department of Computer Science @home: Dieter.Nuetzel at hamburg.de (replace at with @)
Re: [reiserfs-dev] Re: [ANNOUNCE] reiser4 snaphots
On Friday, 8 November 2002 21:39, Todd Lyons wrote: [EMAIL PROTECTED] wanted us to know: Todd Lyons wrote: Maybe my understanding is a little off, but I thought that I could just rebuild the modules and install them and run depmod and the new module symbols would be in effect. But that didn't work (same error). So then I also tried a make bzImage and make install (which updates the System.map, something that didn't get done with make modules_install). It still would not load (same error). Finally, I rebooted into the new kernel and that fixed it. No comments needed, unless they are "what you did should have worked". I've seen similar issues when the /boot/System.map file doesn't match the kernel in use (this is particularly an issue for 'depmod' - the standard 'make modules_install' target passes a -F flag to depmod to use the just-created System.map file. That I get. You're saying that my make install step which updated the System.map in /boot was nullified because it didn't exactly match the running kernel version. And you're also saying that -F makes it ignore what /proc/ksyms has. But then that leads to the next problem. That System.map doesn't match /proc/ksyms, so I'm still left with a reboot as my only option. That's normal for all kernel developers. If you build a new kernel with new symbols, some stuff can't match until you have rebooted. Thinking out loud. But I didn't change the way that the kernel was compiled. Reiser4 is _still_ a module. What am I missing? Are you sure? -Dieter
Re: [reiserfs-list] Reiserfs with Samba vs NetApp Filer
On Thursday, 10 October 2002 20:02, Hans Reiser wrote: Philippe Gramoullé wrote: Hi, Yes, that's what I meant by not *out of the box* snapshots. We never really played with LVM as it adds another software layer and I didn't feel it was mature enough at the time we put everything in production, but given the benefits we might want to give it a try. BTW, Hans, what would you recommend: LVM or EVMS? Thanks, Philippe On Thu, 10 Oct 2002 21:31:29 +0400 Hans Reiser [EMAIL PROTECTED] wrote: | Netapp as well has tons of features that linux boxes +PV210S can't | do: Clustering | | volume copy, out of the box snapshots, etc.. | | Linux has snapshots if you use lvm, but it is true that Netapp | makes a good product. I lack the expertise to advise on this. There has been a lot of controversy on this. Keep an eye out for lvm2 to come out. Read about it on LKML. LVM is dropped with 2.5.41+. LVM2 comes under the hood of EMVS. -Dieter
Re: [reiserfs-list] Reiserfs with Samba vs NetApp Filer
On Thursday, 10 October 2002 20:57, Dieter Nützel wrote: Read about it on LKML. LVM is dropped with 2.5.41+. LVM2 comes under the hood of EMVS. Of course EVMS ;-) -Dieter
Re: [reiserfs-list] reiserfsprogs-3.6.4-pre2
On Wednesday 18 September 2002 03:55, Manuel Krause wrote: On 09/17/2002 03:23 PM, Vitaly Fertman wrote: Hi all, A new reiserfsprogs pre-release is available at ftp.namesys.com/pub/reiserfsprogs/pre/reiserfsprogs-3.6.4-pre2.tar.gz Changes went into 3.6.4-pre2: fix-fixable sets correct item formats in item headers if needed. rebuild got some extra checks for invalid tails on pass0. fsck check does not complain on wrong file sizes if a safelink exists. check dma mode/speed of harddrive and warn the user if it decreased -- it could happen due to some hardware problem. Bugs: during conversion of tails to indirect items on pass2 and back conversion on semantic pass. not properly cleaning flags in item headers. during relocation of shared objects. new block allocating on pass2 (very rare case). Hi Vitaly! Does this mean these bugs are [a] only in 3.6.4-pre2, so: newly created [b] in 3.6.4-pre2 and the previous -pres [c] or even in the 3.6.3 release, too??? Manuel go to bed or read it properly...;-) It should read (I think): * New stuff --- Changes... * with -pre2 fixed bugs... Good night! Dieter -- Dieter Nützel Graduate Student, Computer Science University of Hamburg Department of Computer Science @home: Dieter.Nuetzel at hamburg.de (replace at with @)
[reiserfs-list] Re: Linux-2.4.17-rc1 boot problem (fwd)
On Monday, 17 December 2001 04:14, Chris Mason wrote: On Monday, December 17, 2001 01:52:23 AM +0100 Dieter Nützel [EMAIL PROTECTED] wrote: [cursing] I think it's almost the same bug we found on Friday in the P patch (not in the kernel yet) where 3.6.x filesystems are missing some initialization on remounts from ro to rw. Marcelo, you're bcc'd to let you know progress has been made, and to keep replies out of your inbox until we've all agreed this is the right fix. Patch attached, it sets the CONVERT bit and the hash function code when mounting readonly. Chris, do you think I can't see the bug 'cause I am running with the new P patch? Any hints how I can do some tests with the P patch here? Correct, the P patch also fixes the biggest of the problems. I think we should also set the hash, but that is probably up for debate. Testing the unlink-truncate code that completes unlinks and truncates after a crash would be good. dd if=/dev/zero of=foo bs=1MB count=3000 # or some other big number truncate.pl foo 40 # silly perl script attached # wait 5 secs, hit reset Also, dbench with 20 or more procs will hit the unlink code; wait for the plus signs to start showing up to indicate processes are finishing, then hit reset. The bug is when we are remounting from ro to rw, so if you are doing this on a test drive, mount readonly first, then use mount -o rw,remount Will try the above tomorrow, but I have something around for a quick response. My father worked on the wires in our house (he has an electrical engineering education, so no need to worry :-) and an older cable reinduced a bypass... ...so I had a real-world test scenario...:-))) Here comes what I got in dmesg: Linux version 2.4.17-pre5-preempt (root@SunWave1) (gcc version 2.95.3 20010315 (SuSE)) #1 Fre Dez 7 20:39:06 CET 2001 [-] reiserfs: checking transaction log (device 08:03) ... Using r5 hash to sort names ReiserFS version 3.6.25 VFS: Mounted root (reiserfs filesystem) readonly. 
Freeing unused kernel memory: 208k freed Adding Swap: 1028120k swap-space (priority -1) reiserfs: checking transaction log (device 08:02) ... reiserfs: replayed 6 transactions in 1 seconds Using r5 hash to sort names ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:05) ... Using r5 hash to sort names ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:06) ... reiserfs: replayed 1 transactions in 1 seconds Using r5 hash to sort names Removing [511 2732 0x0 SD]..4done Removing [511 2729 0x0 SD]..4done Removing [511 2728 0x0 SD]..4done Removing [511 2724 0x0 SD]..4done Removing [511 2723 0x0 SD]..4done Removing [511 2722 0x0 SD]..4done Removing [511 2717 0x0 SD]..4done Removing [511 2716 0x0 SD]..4done Removing [511 2700 0x0 SD]..4done Removing [511 2295 0x0 SD]..4done Removing [511 1287 0x0 SD]..4done Removing [511 1241 0x0 SD]..4done There were 12 uncompleted unlinks/truncates. Completed ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:07) ... reiserfs: replayed 12 transactions in 1 seconds Using r5 hash to sort names ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:08) ... reiserfs: replayed 2 transactions in 1 seconds Using r5 hash to sort names ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:11) ... reiserfs: replayed 2 transactions in 4 seconds Using r5 hash to sort names ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:15) ... reiserfs: replayed 2 transactions in 4 seconds Using r5 hash to sort names ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:16) ... reiserfs: replayed 1 transactions in 4 seconds Using r5 hash to sort names ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:17) ... reiserfs: replayed 1 transactions in 5 seconds Using r5 hash to sort names ReiserFS version 3.6.25 reiserfs: checking transaction log (device 08:18) ... 
reiserfs: replayed 1 transactions in 4 seconds Using r5 hash to sort names ReiserFS version 3.6.25 Looks good, to me. -Dieter
Re: [reiserfs-list] Re: funny file permission after reiserfsck
On Thursday, 13 December 2001 23:16, Chris Mason wrote: On Thursday, December 13, 2001 11:17:12 PM +0100 Manuel Krause [EMAIL PROTECTED] wrote: On 12/13/2001 10:45 PM, Chris Mason wrote: On Thursday, December 13, 2001 10:41:13 PM +0100 Manuel Krause [EMAIL PROTECTED] wrote: It's very clear now that the expanding-truncate-4 patch needs to be excluded and/or adjusted, too. :-) I hope that works for you after your inode-attrs experiment! Odd, which low latency patches were you running? -chris With 2.4.16 + reiserfs-patches: Andrew Morton's low-latency patch http://www.zip.com.au/~akpm/linux/2.4.16-low-latency.patch.gz (from page http://www.zip.com.au/~akpm/linux/schedlat.html ) This should be safe. Any chance I could talk you into testing 2.4.17-pre8 + Andrew's patch? Maybe Manuel will...;-) The latest lock-break-rml-2.4.17-pre8-1.patch from Robert is based on Andrew's patch and includes some (small) lock-breaks for ReiserFS. Any comments, Chris? Has Robert overlooked something? Maybe only when expanding-truncate-4.patch comes into play? -Dieter BTW I am going after Manuel and will try 2.4.17-pre8-preempt + lock-break + K-N+P without Chris's one.
[reiserfs-list] Fwd: Re: [PATCH] lock-breaking in reiserfs for the preemptible kernel
Some useful things you should all know about, too. Regards, Dieter -- Forwarded Message -- Subject: Re: [PATCH] lock-breaking in reiserfs for the preemptible kernel Date: Sat, 1 Dec 2001 16:35:30 +0100 From: Dieter Nützel [EMAIL PROTECTED] To: Robert Love [EMAIL PROTECTED] Cc: Torrey Hoffman [EMAIL PROTECTED], Nikita Danilov [EMAIL PROTECTED] On Saturday, 1 December 2001 06:23, Robert Love wrote: Attached find an updated lock-break-reiserfs patch. Specifically, fix the dumb compile problem and change the lock depths as reported by Torrey. There are still some unaccounted for and thus they might cause excessive activity ... Robert Love Sorry Robert, but where is the attachment? ;-))) BTW Can I use it on top of 2.4.16, because I use all the pending ReiserFS patches (A-O)? --- Will try it anyway. Ah, 2.4.17-pre2 is out with the ReiserFS patches, Nikita? Thanks, Dieter
Re: [reiserfs-list] Re: [REISERFS TESTING] new patches on ftp.namesys.com: 2.4.15-pre7
On Wednesday, 21 November 2001 22:20, Andreas Dilger wrote: On Nov 21, 2001 21:19 +0100, Dieter Nützel wrote: Some files are _NOT_ deletable even as root, argh? The normal ext2 solution in this case is to move it all to a separate dir, cp -a from the old dir, and then wait for e2fsck to clean up. Sorry, I don't understand you here. If I _first_ move all to a separate dir, the original dir is empty, no? You mean move it all to a separate dir, cp -a from the new to the old dir, delete the new one, and then wait for e2fsck to clean up? Since reiserfsck won't do this yet, just consider it a few kB of lost space in /dev/bad or whatever. This _should_ be OK, because cp doesn't know anything about attributes. SunWave1 /home/nuetzel# lsattr / BD--j //bin BD--j //dev -DX-j //dvd BD--j //etc BD--j //lib BD--j //mnt BDXE- //opt BDXE- //tmp BDXE- //var BDXE- //usr BD--j //boot No, this is garbage. Most of these are ext2-specific flags, and sadly some of them are probably not even settable by chattr, I don't know, but you could try. I think only 'j' is supported by chattr (full data journaling, which isn't implemented in reiserfs yet). chattr -R -j / Worked. SunWave1 /# time chattr -R -B / usage: chattr [-RV] [-+=AacdijsSu] [-v version] files... 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 105pf+0w The others are unused, undocumented attributes, such as: B=compressed block D=compressed dirty file X=compression raw access E=compression error You can ignore these for now, they won't really hurt you, but maybe before pushing the reiserfs attributes patch, there should be a feature flag put in the superblock which means "attributes are valid", and reiserfsck will set this flag (if unset) after zeroing all of these fields. If it is an issue of the kernel not zeroing these fields for new inodes which is fixed by the attribute patch, then an attribute-aware reiserfs should refuse to mount with attributes enabled until reiserfsck --clear-attr is run to clear the attributes and set the flag. 
I think a compile-time switch with a _BIG_ warning that this is _experimental_ stuff is needed, too. Thanks, Dieter
Re: [reiserfs-list] Re: [REISERFS TESTING] new patches on ftp.namesys.com: 2.4.15-pre7
On Wednesday, 21 November 2001 22:20, Andreas Dilger wrote: On Nov 21, 2001 21:19 +0100, Dieter Nützel wrote: Some files are _NOT_ deletable even as root, argh? The normal ext2 solution in this case is to move it all to a separate dir, cp -a from the old dir, and then wait for e2fsck to clean up. Since reiserfsck won't do this yet, just consider it a few kB of lost space in /dev/bad or whatever. This _should_ be OK, because cp doesn't know anything about attributes. Sorry, I've forgotten one question. Any idea about init spawning processes and the system hang? Thanks again, Dieter
Re: [reiserfs-list] Re: [REISERFS TESTING] new patches on ftp.namesys.com: 2.4.15-pre7
On Wednesday, 21 November 2001 23:51, Andreas Dilger wrote: So, you have a lot of bad inodes in /dev, do this (untested, but easily reversible): mv /dev /.badattr mkdir /dev lsattr -d /dev Hopefully /dev is created without any attributes. If it is, then you need to find a directory which has no attribute bits set, create /dev there, and mv it to the root directory. cp -a /.badattr/* /dev lsattr -R /dev Hopefully all of the new inodes in /dev will not have attributes set. You are the man! They had not, and I am up and running now. Presumably, the reiserfs attribute code does not inherit attributes for files which do not support them (e.g. special files), because ioctls on these files will talk to the device/socket/etc instead of to the filesystem. This might need to be fixed in the reiserfs patch. Yes, I see some garbage all around. In the end, these attributes don't do anything bad for you, so they could all just be ignored. You can put other bad files into .badattr until then also. I could even delete it at least ;-) My first try was to delete it under an older kernel. That worked, but I got some broken inodes through the next reboot cycle again. /dev/pts and /dev/shm So during my second run I left them alone, and bingo. I could delete them with rmdir (including .badattr) after the next boot. Finally I have 2.4.15-pre8 + preempt + ReiserFS A-N + Andrea Arcangeli's 00_lowlatency-fixes-3 up and running. Now it is time to go to bed, get some sleep and have a really happy birthday. No, I do not need congratulations from all around the world...;-) See you. Dieter
[reiserfs-list] Re: [PATCH] 2.4.10 improved reiserfs a lot, but could still be better
On Monday, September 24, 2001 14:46:09 PM -0400 Chris Mason wrote: On Monday, September 24, 2001 10:09:59 PM +0800 Beau Kuiper [EMAIL PROTECTED] wrote: Hi all again, I have updated my last set of patches for reiserfs to run on the 2.4.10 kernel. The new set of patches creates a new method to do kupdated syncs. On filesystems that do not support this new method, the regular write_super method is used. Then reiserfs, on a kupdated super_sync, simply calls the flush_old_commits code with immediate mode off. Ok, I think the patch is missing ;-) That's what I've found first, too :-))) What we need to do now is look more closely at why the performance increases. I can't second that. First let me tell you that _all_ of my previously posted benchmarks were recorded _WITHOUT_ write caching. Even if my IBM DDSY-T18350N 18 GB U160 10k disk has 4 MB cache (8 and 16 MB versions are available, too) it is NOT enabled per default (firmware and Linux SCSI driver). Do you know of a Linux SCSI tool to enable it for testing purposes? I know it _IS_ unsafe for server (journaling) systems. Below are my numbers for 2.4.10-preempt plus some little additional patches and your latest patch. Greetings, Dieter * inode.c-schedule.patch (Andrea, me) APPLIED Could be the culprit for my slow block IO with bonnie++ But better preempting. Really? --- linux/fs/inode.c Mon Sep 24 00:31:58 2001 +++ linux-2.4.10-preempt/fs/inode.c Mon Sep 24 01:07:06 2001 @@ -17,6 +17,7 @@ #include <linux/swapctl.h> #include <linux/prefetch.h> #include <linux/locks.h> +#include <linux/compiler.h> /* * New inode.c implementation. @@ -296,6 +297,12 @@ * so we have to start looking from the list head. */ tmp = head; + + if (unlikely(current->need_resched)) { + spin_unlock(&inode_lock); + schedule(); + spin_lock(&inode_lock); + } } } journal.c-1-patch (Chris) APPLIED Do we need this really? Shouldn't hurt? -- As Chris told me. 
--- linux/fs/reiserfs/journal.c Sat Sep 8 08:05:32 2001 +++ linux/fs/reiserfs/journal.c Thu Sep 20 13:15:04 2001 @@ -2872,17 +2872,12 @@ /* write any buffers that must hit disk before this commit is done */ fsync_inode_buffers(&(SB_JOURNAL(p_s_sb)->j_dummy_inode)) ; - /* honor the flush and async wishes from the caller */ + /* honor the flush wishes from the caller, simple commits can + ** be done outside the journal lock, they are done below + */ if (flush) { - flush_commit_list(p_s_sb, SB_JOURNAL_LIST(p_s_sb) + orig_jindex, 1) ; flush_journal_list(p_s_sb, SB_JOURNAL_LIST(p_s_sb) + orig_jindex, 1) ; - } else if (commit_now) { - if (wait_on_commit) { - flush_commit_list(p_s_sb, SB_JOURNAL_LIST(p_s_sb) + orig_jindex, 1) ; - } else { - commit_flush_async(p_s_sb, orig_jindex) ; - } } /* reset journal values for the next transaction */ @@ -2944,6 +2939,16 @@ atomic_set(&(SB_JOURNAL(p_s_sb)->j_jlock), 0) ; /* wake up any body waiting to join. */ wake_up(&(SB_JOURNAL(p_s_sb)->j_join_wait)) ; + + if (!flush && commit_now) { + if (current->need_resched) + schedule() ; + if (wait_on_commit) { + flush_commit_list(p_s_sb, SB_JOURNAL_LIST(p_s_sb) + orig_jindex, 1) ; + } else { + commit_flush_async(p_s_sb, orig_jindex) ; + } + } return 0 ; } vmalloc.c-patch (Andrea) NOT APPLIED Do we need it? --- linux/mm/vmalloc.c.~1~ Thu Sep 20 01:44:20 2001 +++ linux/mm/vmalloc.c Fri Sep 21 00:40:48 2001 @@ -144,6 +144,7 @@ int ret; dir = pgd_offset_k(address); + flush_cache_all(); spin_lock(&init_mm.page_table_lock); do { pmd_t *pmd; * 2.4.10 + patch-rml-2.4.10-preempt-kernel-1 + patch-rml-2.4.10-preempt-ptrace-and-jobs-fix + patch-rml-2.4.10-preempt-stats-1 + inode.c-schedule.patch + journal.c-1-patch 32 clients started
[reiserfs-list] URGENCY: IBM U160 SCSI disk spin-down from time to time
= 4) is a 16550A ttyS1 at 0x02f8 (irq = 3) is a 16550A Configuring serial ports done Running /etc/init.d/boot.local Configuring MTTR for Intel Corporation EtherExpress PRO/100+... done Creating /var/log/boot.msg done Enabling syn flood protection done Enabling IP forwarding done Boot logging started at Mon Sep 24 20:59:12 2001 Master Resource Control: previous runlevel: N, switching to runlevel: 3 Starting personal-firewall (initial)/sbin/SuSEpersonal-firewall: /proc/sys/net/ipv4/icmp_echoreply_rate: No such file or directory [active] done Initializing random number generator done Setting up network device eth0 done Setting up network device eth1 done Setting up routing (using /etc/route.conf) done Bringing up ADSL link done Starting SSH daemon done Starting syslog services done Starting service usbmgr done Starting service at daemon: done Starting cupsd done Loading keymap qwertz/de-latin1-nodeadkeys.map.gz done Loading compose table winkeys shiftctrl latin1.add done Loading console font lat9w-16.psfu.gz done Loading unimap lat1u.uni done Setting up console ttys done Starting name server. done Initializing SMTP port. (sendmail) done Starting SAMBA nmbd : done Starting SAMBA smbd : done Starting CRON daemon done Starting lan browser daemon for KDE done Starting Name Service Cache Daemon done Starting console mouse support (gpm): done Starting inetd done Starting httpd [ SuSEHelp contrib ] done Starting personal-firewall (final)/sbin/SuSEpersonal-firewall: /proc/sys/net/ipv4/icmp_echoreply_rate: No such file or directory [active] done Starting WWW-proxy squid: done Master Resource Control: runlevel 3 has been reached Any hints, tips are very well come. Thank you very much in advance. -Dieter -- Dieter Nützel Graduate Student, Computer Science
[reiserfs-list] Re: [PATCH] Preemption Latency Measurement Tool
On Sunday, 23 September 2001 05:14, george anzinger wrote: Robert Love wrote: On Sat, 2001-09-22 at 19:40, safemode wrote: ok. The preemption patch helps realtime applications in linux be a little closer to realtime. I understand that. But your mp3 player shouldn't need root permission or renicing or realtime priority flags to play mp3s. It doesn't, it needs them to play with a dbench 32 running in the background. This isn't necessarily acceptable, either, but it is a difference. Note one thing the preemption patch does is really make `realtime' apps excel. Without it, regardless of the priority of the application, the app can be starved due to something in kernel mode. Now it can't, and since said application is high priority, it will get the CPU when it wants it. This is not to say the preemption patch is no good if you don't run stuff at realtime -- I don't (who uses nice, anyhow? :), and I notice a difference. To test how well the latency patches are working you should be running things all at the same priority. The main issue people are having with skipping mp3s is not in the decoding of the mp3 or in the retrieving of the file, it's in the playing in the soundcard. That's being affected by dbench flooding the system with irq requests. I'm inclined to believe it's irq requests because the _only_ time I have problems with mp3s (and I don't change priority levels) is when A. I do a cdparanoia -Z -B 1 or dbench 32. I bet if someone did these tests on scsi hardware with the latency patch, they'd find much better results than us users of ide devices. The skips are really big to be irq requests, although perhaps you are right in that the handling of the irq (we disable preemption during irq_off, of course, but also during bottom half execution) is the problem. However, I am more inclined to believe it is something else. All these long-held locks can indeed be the problem. 
I am on an all UW2 SCSI system, and I have no major blips playing during a `dbench 16' (never ran 32). However, many other users (Dieter, I believe) are on a SCSI system too. Dieter, could you post your .config file? It might have a clue or two. Here it comes. Good night ;-) -Dieter BTW I have very good results (not the hiccup) for 2.4.10-pre14 + ReiserFS journal.c-2-patch from Chris config.bz2 Description: BZip2 compressed data
[reiserfs-list] Re: [PATCH] Preemption Latency Measurement Tool
On Friday, 21 September 2001 18:18, Stefan Westerfeld wrote: Hi! On Fri, Sep 21, 2001 at 02:42:56AM +0200, Roger Larsson wrote: You might try stracing artsd to see if it hangs at a particular syscall. Use -tt or -r for timestamps and pipe the output through tee (to a file on your ramfs). I tried playing an mp3 with noatun via artsd. Starting dbench 32 I get the same kind of dropouts - and no indication with my latency profiling patch = no process with higher prio is waiting to run! One of noatun or artsd is waiting for something else! (That is why I included Stefan Westerfeld... artsd) I noticed a very nice improvement when renicing (all) artsd and noatun. (I did also change the buffer size in artsd down to 40 ms) (This part mostly for Stefan... So I thought - let's try to run artsd with RT prio - changed the option HOW can it get RT prio when it is not suid? I guess it can not... So I manually added the suid bit - but then noatun could not connect to artsd... bug?, backed out the suid change... but it behaves as if it works, could be that it has such short bursts that prio never gets a chance to drop) If you want to run artsd with RT prio, add suid root to artswrapper if it is not already there - it should be by default, but some distributors/packagers don't get this right. If you start artsd via kcminit, check the realtime checkbox; if you do it manually, run artswrapper instead of artsd, i.e. Thank you very much Stefan! I've an updated SuSE 7.1 (KDE-2.2.1, etc.) apart from my kernel and XFree86 DRI hacking running here. SuSE missed the suid root on artswrapper (security). I've changed it. 
-rwxr-xr-x 1 root root 211892 Sep 15 22:59 /opt/kde2/bin/artsbuilder -rwxr-xr-x 1 root root 690 Sep 15 20:19 /opt/kde2/bin/artsc-config -rwxr-xr-x 1 root root 30388 Sep 15 20:48 /opt/kde2/bin/artscat -rwxr-xr-x 1 root root 157576 Sep 15 23:00 /opt/kde2/bin/artscontrol -rwxr-xr-x 1 root root 125040 Sep 15 20:48 /opt/kde2/bin/artsd -rwxr-xr-x 1 root root 2271 Sep 15 20:19 /opt/kde2/bin/artsdsp -rwxr-xr-x 1 root root 8568 Sep 15 20:50 /opt/kde2/bin/artsmessage -rwxr-xr-x 1 root root 17924 Sep 15 20:48 /opt/kde2/bin/artsplay -rwxr-xr-x 1 root root 30396 Sep 15 20:49 /opt/kde2/bin/artsshell -rwsr-xr-x 1 root root 4452 Sep 15 20:48 /opt/kde2/bin/artswrapper After that I've tested again. This time 2.4.10-pre10 + patch-rml-2.4.10-pre12-preempt-kernel-1 + patch-rml-2.4.10-pre12-preempt-stats-1 + journal.c (from Chris Mason) dbench 32 Throughput 26.485 MB/sec (NB=33.1062 MB/sec 264.85 MBit/sec) 15.070u 60.340s 2:40.50 46.9% 0+0k 0+0io 911pf+0w max load: 2936 The artsd daemon waits in wait_on_b (WCHAN), D state (STAT), during the hiccup (for nearly 3 seconds). The hiccup appears at the beginning of dbench (after 9-10 seconds). I've noticed no other hiccup. 
snapshot 9:33pm up 2:03, 1 user, load average: 27.28, 10.53, 5.87 122 processes: 112 sleeping, 10 running, 0 zombie, 0 stopped CPU states: 15.4% user, 43.2% system, 0.0% nice, 41.3% idle Mem: 642840K av, 607640K used, 35200K free, 0K shrd, 20244K buff Swap: 1028120K av, 196K used, 1027924K free 465668K cached PID USER PRI NI PAGEIN SIZE SWAP RSS SHARE WCHAN STAT %CPU %MEM TIME COMMA 7529 nuetzel 20 0 1274 71280 7128 4252 do_select S 2.7 1.1 0:05 artsd 7682 nuetzel 20 0 17 71280 7128 4252 rt_sigsus S 2.7 1.1 0:05 artsd 7725 nuetzel 18 0 25 5240 524 400 wait_on_b D 1.9 0.0 0:01 dbench 7726 nuetzel 19 0 25 5240 524 400 do_journa D 1.9 0.0 0:01 dbench 7727 nuetzel 18 0 25 5240 524 400 wait_on_b D 1.9 0.0 0:01 dbench 7724 nuetzel 18 0 25 5240 524 400 do_journa D 1.7 0.0 0:01 dbench 7729 nuetzel 18 0 25 5240 524 400 wait_on_b D 1.7 0.0 0:01 dbench 7730 nuetzel 18 0 25 5240 524 400 do_journa D 1.7 0.0 0:01 dbench 7737 nuetzel 18 0 25 5240 524 400 do_journa D 1.7 0.0 0:01 dbench 7738 nuetzel 18 0 25 5240 524 400 R 1.7 0.0 0:01 dbench 7731 nuetzel 18 0 25 5240 524 400 wait_on_b D 1.5 0.0 0:01 dbench 7732 nuetzel 18 0 25 5240 524 400 wait_on_b D 1.5 0.0 0:01 dbench 7733 nuetzel 18 0 25 5240 524 400 do_journa D 1.5 0.0 0:01 dbench 7734 nuetzel 18 0 25 5240 524 400 do_journa D 1.5 0.0 0:01 dbench 7735 nuetzel 18 0 25 5240 524 400 do_journa D 1.5 0.0 0:01 dbench 7736 nuetzel 18 0 25 5240 524 400 wait_on_b D 1.5 0.0 0:01 dbench 7740 nuetzel 18 0 25 5240 524 400 R 1.5 0.0 0:01 dbench
[reiserfs-list] Re: [PATCH] Significant performace improvements on reiserfs systems
On Thu, 20 Sep 2001, Beau Kuiper wrote: On Thu, 20 Sep 2001, Chris Mason wrote: On Thursday, September 20, 2001 03:12:44 PM +0800 Beau Kuiper [EMAIL PROTECTED] wrote: Hi, Reiserfs on 2.4 has always been bog slow. I have identified kupdated as the culprit, and have 3 patches that fix the performance problems I have been suffering. Thanks for sending these along. I would like these patches to be reviewed and put into the mainline kernel so that others can test the changes. Patch 1. This patch fixes reiserfs to use the kupdated code path when told to resync its super block, like it did in 2.2.19. This is the culprit for bad reiserfs performance in 2.4. Unfortunately, this fix relies on the second patch to work properly. I promised Linus I would never reactivate this code, it is just too nasty ;-) The problem is that write_super doesn't know if it is called from sync or from kupdated. The right fix is to have an extra param to write_super, or another super_block method that gets called instead of write_super when an immediate commit is not required. I have observed that ReiserFS suffers from kupdated since 2.4.7-ac3. Whenever I do kill -STOP kupdated the performance is much better. I know this is unsafe... I don't think that this could happen until 2.5.x though, as either solution touches every file system. However, if we added an extra method, we could do this while only slightly touching the other filesystems (where kupdated sync == real sync). Simply see if the method exists (is non-null) and call that instead with a kupdate sync instead of the normal super_sync. Are you interested in me writing a patch to do this? It is possible to get almost the same behaviour as 2.2.x by changing the metadata sync interval in bdflush to 30 seconds. But then kupdate doesn't flush normal data as regularly as it should, plus it is almost as messy as Patch 1 :-) Patch 2 This patch implements a simple mechanism to ensure that each superblock only gets told to be flushed once. 
With reiserfs and the first patch, the superblock is still dirty after being told to sync (probably because it doesn't want to write out the entire journal every 5 seconds when kupdate calls it). This caused an infinite loop because sync_supers would always find the reiserfs superblock dirty when called from kupdated. I am not convinced that this patch is the best one for this problem (suggestions?) It is ok to leave the superblock dirty, after all, since the commit wasn't done, the super is still dirty. If the checks from reiserfs_write_super are actually slowing things down, then it is probably best to fix the checks. I meant, there might be a better way to prevent the endless loop than adding an extra field to the superblock data structure. I believe (I haven't explored reiserfs code much) the slowdown is caused by the journal being synced with the superblock, thus causing: 1) Too much contention for disk resources. 2) A huge increase in the number of times programs must be suspended to wait for the disk Please have a look at Robert Love's Linux kernel preemption patches and the conversation about my reported latency results. It seems that ReiserFS is involved in the poor audio behavior (hiccups during MP2/MP3/Ogg-Vorbis playback). Re: [PATCH] Preemption Latency Measurement Tool http://marc.theaimsgroup.com/?l=linux-kernel&m=100097432006605&w=2 Taken from Andrea's latest post: those are kernel addresses, can you resolve them via System.map rather than trying to find their start/end line number? Worst 20 latency times of 8033 measured in this period. usec cause mask start line/file address end line/file 10856 spin_lock 1 1376/sched.c c0114db3 697/sched.c I can (with Randy Dunlap's ksysmap, http://www.osdlab.org/sw_resources/scripts/ksysmap). SunWave1 ./ksysmap /boot/System.map c0114db3 ksysmap: searching '/boot/System.map' for 'c0114db3' c0114d60 T preempt_schedule c0114db3 . 
c0114e10 T wake_up_process With dbench 48 we went to 10 msec latency as far as I can see (still far from 0.5~1 sec). dbench 48 is longer so there is more probability of hitting the higher latency, and it does more I/O, probably also seeks more, so there are many variables (slower insertion into I/O queues first of all, etc.). However 10 msec isn't that bad, it means 100 Hz, something that the human eye cannot see. 0.5~1 sec would have been horribly bad latency instead.. :) 10705 BKL 1 1302/inode.c c016f359 697/sched.c c016f300 T reiserfs_dirty_inode c016f359 . c016f3f0 T reiserfs_sync_inode 10577 spin_lock 1 1376/sched.c c0114db3 303/namei.c c0114d60 T preempt_schedule c0114db3 . c0114e10 T wake_up_process 9427 spin_lock 1 547/sched.c c0112fe4 697/sched.c c0112fb0 T schedule c0112fe4 . c0113500 T
[reiserfs-list] Re: [PATCH] Preemption Latency Measurement Tool
On Friday, 21 September 2001 00:03, Oliver Xymoron wrote: On Thu, 20 Sep 2001, Dieter Nützel wrote: On Thursday, 20 September 2001 23:10, Robert Love wrote: On Thu, 2001-09-20 at 04:21, Andrea Arcangeli wrote: You've forgotten a one liner. #include <linux/locks.h> +#include <linux/compiler.h> whoops, didn't catch it because of gcc 3.0.2. thanks. But this is not enough. Even with reniced artsd (-20). Some shorter hiccups (0.5~1 sec). I'm not familiar with the output of the latency bench, but I actually read 4617 usec as the worst latency, that means 4 msec, not 500/1000 msec. Right, the patch is returning the length preemption was unavailable (which is when a lock is held) in us. So it is indeed 4 ms. But, I think Dieter is saying he _sees_ 0.5~1 s latencies (in the form of audio skips). This is despite the 4 ms locks being held. Yes, that's the case. During dbench 16,32,40,48, etc... You might actually be waiting on disk I/O and not blocked. Does your audio source depend on any files (e.g. mp3s) and if so, could they be moved to a ramfs? Do the skips go away then? Good point. I've copied one video (MP2) and one Ogg-Vorbis file into /dev/shm. Little bit better but hiccup still there :-( dbench 16 Throughput 25.7613 MB/sec (NB=32.2016 MB/sec 257.613 MBit/sec) 7.500u 29.870s 1:22.99 45.0% 0+0k 0+0io 511pf+0w Worst 20 latency times of 3298 measured in this period. usec cause mask start line/file address end line/file 11549 spin_lock 1 678/inode.c c01566d7 704/inode.c c01566a0 T prune_icache c01566d7 . c0156800 T shrink_icache_memory 7395 spin_lock 1 291/buffer.c c014151c 285/buffer.c c0141400 T kupdate c014151c . c0141610 T set_buffer_async_io 7372 spin_lock 1 291/buffer.c c01413e3 280/buffer.c c0141290 T bdflush c01413e3 . c0141400 T kupdate 5702 reacqBKL 1 1375/sched.c c0114d94 697/sched.c c0114d60 T preempt_schedule c0114d94 . c0114e10 T wake_up_process 4744 BKL 0 2763/buffer.c c01410aa 697/sched.c c0141080 t sync_old_buffers c01410aa . 
c01411b0 T block_sync_page 4695 spin_lock 1 291/buffer.c c014151c 280/buffer.c 4551 spin_lock 1 1376/sched.c c0114db3 1380/sched.c 4466 spin_lock 1 547/sched.c c0112fe4 1306/inode.c 4464 spin_lock 1 1376/sched.c c0114db3 697/sched.c 4146 reacqBKL 1 1375/sched.c c0114d94 842/inode.c 4131 spin_lock 0 547/sched.c c0112fe4 697/sched.c 3900 reacqBKL 1 1375/sched.c c0114d94 929/namei.c 3390 spin_lock 1 547/sched.c c0112fe4 1439/namei.c 3191 BKL 0 1302/inode.c c016f359 842/inode.c 2866 BKL 0 1302/inode.c c016f359 1381/sched.c 2803 reacqBKL 0 1375/sched.c c0114d94 1381/sched.c 2762 BKL 0 30/inode.c c016ce51 52/inode.c 2633 BKL 0 2763/buffer.c c01410aa 1380/sched.c 2629 BKL 0 2763/buffer.c c01410aa 1381/sched.c 2466 spin_lock 1 468/vmscan.c c0133c35 415/vmscan.c *** dbench 16 + renice artsd -20 works GREAT! *** dbench 32 and above + renice artsd -20 fail. Writing this during dbench 32 ...:-))) dbench 32 + renice artsd -20 Throughput 18.5102 MB/sec (NB=23.1378 MB/sec 185.102 MBit/sec) 15.240u 63.070s 3:49.21 34.1% 0+0k 0+0io 911pf+0w Worst 20 latency times of 3679 measured in this period. usec cause mask start line/file address end line/file 17625 spin_lock 1 678/inode.c c01566d7 704/inode.c c01566a0 T prune_icache c01566d7 . c0156800 T shrink_icache_memory 9829 spin_lock 1 547/sched.c c0112fe4 697/sched.c 9186 spin_lock 1 547/sched.c c0112fe4 1306/inode.c 7447 reacqBKL 1 1375/sched.c c0114d94 697/sched.c 7097 BKL 1 1302/inode.c c016f359 697/sched.c 5974 spin_lock 1 1376/sched.c c0114db3 697/sched.c 5231 BKL 1 1437/namei.c c014c42f 697/sched.c 5192 spin_lock 0 1376/sched.c c0114db3 1380/sched.c 4992 reacqBKL 1 1375/sched.c c0114d94 1381/sched.c 4875 spin_lock 1 305/dcache.c c0153acd 80/dcache.c 4390 BKL 1 927/namei.c c014b2bf 929/namei.c 3616 reacqBKL 0 1375/sched.c c0114d94 1306/inode.c 3498 spin_lock 1 547/sched.c c0112fe4 929/namei.c 3427 spin_lock 1 547/sched.c c0112fe4 842/inode.c 3323 BKL 1 1302/inode.c
[reiserfs-list] Re: [PATCH] Significant performace improvements on reiserfs systems
On Thursday, 20 September 2001 22:52, Robert Love wrote: On Thu, 2001-09-20 at 13:39, Andrew Morton wrote: Andrew, are these still maintained or should I pull out the reiserfs bits? This is the reiserfs part - it applies to 2.4.10-pre12 OK. For the purposes of Robert's patch, conditional_schedule() should be defined as if (current->need_resched && current->lock_depth == 0) { unlock_kernel(); lock_kernel(); } which is somewhat crufty, because the implementation of lock_kernel() is arch-specific. But all architectures seem to implement it the same way. patch snipped Looks nice, Andrew. Anyone try this? (I don't use ReiserFS). Yes, I will...:-) Send it along. I am putting together a conditional scheduling patch to fix some of the worst cases, for use in conjunction with the preemption patch, and this might be useful. The missing conditional_schedule() definition hampered me from running it already. -Dieter
[reiserfs-list] Re: Feedback on preemptible kernel patch
On Friday, 14 September 2001 06:35, Robert Love wrote: On Thu, 2001-09-13 at 22:47, Dieter Nützel wrote: -- ReiserFS may be another problem. Can't wait for that. Most wanted, now. Third, you may be experiencing problems with a kernel optimized for Athlon. This may or may not be related to the current issues with an Athlon-optimized kernel. Basically, functions in arch/i386/lib/mmx.c seem to need some locking to prevent preemption. I have a basic patch and we are working on a final one. Can you please send this stuff along to me? You know I own an Athlon (since yesterday an Athlon II 1 GHz :-) and need some input... Yes, find the Athlon patch below. Please let me know if it works. Tried it and it works so far. It seems to be that kswapd puts some additional load on the disk from time to time. Or is it the ReiserFS thing again? Mobo is MSI MS-6167 Rev 1.0B (AMD Irongate C4, yes the very first one). Kernel with preempt patch and mmx/3dnow! optimization crashes randomly. Never had that (without preempt) during the last two years. Oh, I did not remember you having problems with Athlon. I try to keep a list of what problems are being had. Have you a copy of my posted ksymoops file? But the oopses seem to be cured. Are there any other configurations where you have problems? I don't know exactly 'cause kernel hacking is not my main focus. But have you thought about the MMX/3DNow! stuff in Mesa/OpenGL (XFree86 DRI)? And what do you think about the XFree86 Xv extensions (video) or the whole MPEG2/3/4, Ogg-Vorbis, etc. (multimedia stuff)? Do all these libraries (progs) need some preempt patches? That's why I cross-posted to the DRI-Devel list (sorry). Before applying, note there are new patches at http://tech9.net/rml/linux - a patch for 2.4.10-pre9 and a _new_ patch for 2.4.9-ac10. These include everything (highmem, etc.) except the Athlon patch. 
The problem with Athlon compiled kernels is that MMX/3DNow! routines used in the kernel are not preempt safe (but SMP safe, so I missed them). This patch should correct it. I understand ;-) It seems to calm it.

diff -urN linux-2.4.10-pre8/arch/i386/kernel/i387.c linux/arch/i386/kernel/i387.c
--- linux-2.4.10-pre8/arch/i386/kernel/i387.c	Thu Sep 13 19:24:48 2001
+++ linux/arch/i386/kernel/i387.c	Thu Sep 13 20:00:57 2001
@@ -10,6 +10,7 @@
 #include <linux/config.h>
 #include <linux/sched.h>
+#include <linux/spinlock.h>
 #include <asm/processor.h>
 #include <asm/i387.h>
 #include <asm/math_emu.h>
@@ -65,6 +66,8 @@
 {
 	struct task_struct *tsk = current;
+	ctx_sw_off();
+
 	if (tsk->flags & PF_USEDFPU) {
 		__save_init_fpu(tsk);
 		return;
diff -urN linux-2.4.10-pre8/include/asm-i386/i387.h linux/include/asm-i386/i387.h
--- linux-2.4.10-pre8/include/asm-i386/i387.h	Thu Sep 13 19:27:28 2001
+++ linux/include/asm-i386/i387.h	Thu Sep 13 20:01:30 2001
@@ -12,6 +12,7 @@
 #define __ASM_I386_I387_H
 #include <linux/sched.h>
+#include <linux/spinlock.h>
 #include <asm/processor.h>
 #include <asm/sigcontext.h>
 #include <asm/user.h>
@@ -24,7 +25,7 @@
 extern void restore_fpu( struct task_struct *tsk );
 extern void kernel_fpu_begin(void);
-#define kernel_fpu_end() stts()
+#define kernel_fpu_end() stts(); ctx_sw_on()
 #define unlazy_fpu( tsk ) do { \

Now, here are my results. 
Athlon II 1 GHz (0.18 µm)
MSI MS-6167 Rev 1.0B (Irongate C4)
640 MB PC100-2-2-2 SDRAM
IBM DDYS 18 GB U160 (on AHA-2940UW)
ReiserFS 3.6 on all partitions

dbench-1.1, 32 clients

2.4.10-pre9
Throughput 22.8881 MB/sec (NB=28.6102 MB/sec 228.881 MBit/sec)
15.000u 52.710s 3:05.59 36.4% 0+0k 0+0io 911pf+0w
load: 3168

2.4.10-pre9 + patch-rml-2.4.10-pre9-preempt-kernel-1
Throughput 22.7157 MB/sec (NB=28.3946 MB/sec 227.157 MBit/sec)
15.070u 52.730s 3:06.97 36.2% 0+0k 0+0io 911pf+0w
load: 2984

bonnie++ 2.4.10-pre
Version 1.92a  --Sequential Output--  --Sequential Input-  --Random-
               -Per Chr- --Block-- -Rewrite-  -Per Chr- --Block--  --Seeks--
Machine  Size  K/sec %CP K/sec %CP K/sec %CP  K/sec %CP K/sec %CP  /sec %CP
SunWave1 1248M   117  95 14510  16  6206   6    189  98 27205  16  289.8  4
Latency        107ms     2546ms     720ms     99241us   75832us     3449ms
Version 1.92a  --Sequential Create--  --Random Create--
SunWave1       -Create-- --Read--- -Delete--  -Create-- --Read--- -Delete--
files           /sec %CP  /sec %CP  /sec %CP   /sec %CP  /sec %CP  /sec %CP
16              4215  38  + +++    14227  93   8885  79  + +++     4324  35
Latency        584ms     8221us    14158us     7681us   14274us     794ms
load: 321

2.4.10-pre9 + patch-rml-2.4.10-pre9-preempt-kernel-1
Version 1.92a  --Sequential Output--  --Sequential Input-  --Random-
               -Per Chr- --Block-- -Rewrite-  -Per Chr- --Block--  --Seeks--
Machine  Size  K/sec %CP K/sec %CP K/sec %CP  K/sec %CP K/sec %CP
[reiserfs-list] Re: [reiserfs-dev] 2.4.9ac7 vs. 2.4.10pre4
If I use kernel 2.4.10-pre4 and you explain your doubts apply-to-kernel-version-? with expanding-truncate or get_block for 2.4.8 what is not tested too much but it works for you ... and you're using kernel 2.4.7+ -- point me to the one I should use *now*, please. Try the get_block patch. A --rebuild-tree before using the patch shouldn't be a problem, if vmware+reiserfs won't fill my disk unnecessarily after a sudden low battery on my notebook afterwards. So, for now, I didn't test one, for lack of time today. And, what I finally/originally meant is: What fills the hole of 2.4.7-unlink-truncate-rename-rmdir.dif (or what that patch did for vmware and maybe soffice usage) ... for 2.4.10-preX ?! The unlink.. patch and the get_block (or expanding truncate) patch solve different problems: the unlink.. patch fixes a long-standing problem in reiserfs, where disk space gets lost if a crash occurred while some process held a reference to an unlinked file. get_block (or expanding truncate) fixes wrong handling of races in reiserfs_writepage which occur when a file is expanded by truncate and mmapped. That wrong handling of races is indicated by pap-5710 and vs-825 (of vs-: reiserfs_get_block: [XX YY0xZZ UNKNOWN] should not be found in kernels older than 2.4.10-pre4). Note that the get_block patch still has pap-5710 (and a few other additional printks I forgot to remove), but it should handle the races right. Did I answer your questions? For me mostly, yes. But after all I can only repeat that you (the ReiserFS team) should then push to build a new set of patches against 2.4.10-pre4 and 2.4.9-ac9, even if there are some small problems with them now. If more people can do some testing on different cases you should find the right solution faster. I think two cases have to be differentiated: 1. some space is lost during normal replay --- do reiserfsck --rebuild-tree 2. randomly files get corrupted during replay --- find them and delete them :-( I have some examples of the latter case, here... Thanks, Dieter
[reiserfs-list] Re: [reiserfs-dev] 2.4.9ac7 vs. 2.4.10pre4
Nikita Danilov wrote: As far as reiserfs is concerned, the latest -ac kernels include the same set of bug-fixes as 2.4.10-pre4. The only additional things in the -ac kernel are big cleanups (that don't change functionality), patches for endianness support, and support for displaying various performance related data under /proc/fs/reiserfs/. When I compare 2.4.9-ac7 and 2.4.10-pre4 I can't find that your latest patches (apart from the cleanups in -ac7) are in both of them. When will we see the outstanding two? 2.4.7-unlink-truncate-rename-rmdir.dif 2.4.7-plug-hole-and-pap-5660-pathrelse-fixes.dif Second: Did anybody find something out about the several-times-reported slowdown since 2.4.7-ac3? http://marc.theaimsgroup.com/?l=reiserfs&m=99961580609519&w=2 I think it was the 2.4.7-ac3 to ac4 transition and later Linus and AC kernels. Expect some experiences with an IDE 0+1 RAID in a few hours. I got a lilo-2.2 beta release from John Coffman for these tests. Thanks, Dieter
[reiserfs-list] Re: 2.4.7-ac4 disk thrashing
On Wednesday, 8 August 2001 17:41, Daniel Phillips wrote: On Wednesday 08 August 2001 12:57, Alan Cox wrote: Could it be that the ReiserFS cleanups in ac4 do harm? http://marc.theaimsgroup.com/?l=reiserfs&m=99683332027428&w=2 I suspect the use-once patch is the more relevant one. Two things to check: - Linus found a bug in balance_dirty_state yesterday. Is the fix applied? No, I'll try. - The original use-once patch tends to leave referenced pages on the inactive_dirty queue longer, not in itself a problem, but it can expose other problems. The previously posted patch below fixes that, is it applied? To apply (with use-once already applied): Yes, it was with -ac9. But it wasn't much different from ac6/7/8 without it. All nearly equally bad. The disk steps like mad compared against 2.4.7-ac1 and ac3. I can hear it and the whole system feels slow. 2.4.7-ac1 + transaction-tracking-2 (Chris) + use-once-pages (Daniel) + 2.4.7-unlink-truncate-rename-rmdir.dif (Nikita) is the best Linux I've ever run. I did several (~10) dbench-1.1 runs (should I retry with dbench-1.2?) and all gave nearly the same results. ac1, ac3 + fixes GREAT; ac5, ac6, ac7, ac8, ac9 + fixes BAD. Thanks, Dieter

cd /usr/src/your.2.4.7.source.tree
patch -p0 < this.patch

--- ../2.4.7.clean/mm/filemap.c	Sat Aug 4 14:27:16 2001
+++ ./mm/filemap.c	Sat Aug 4 23:41:00 2001
@@ -979,9 +979,13 @@
 static inline void check_used_once (struct page *page)
 {
-	if (!page->age) {
-		page->age = PAGE_AGE_START;
-		ClearPageReferenced(page);
+	if (!PageActive(page)) {
+		if (page->age)
+			activate_page(page);
+		else {
+			page->age = PAGE_AGE_START;
+			ClearPageReferenced(page);
+		}
 	}
 }
[reiserfs-list] What about your reiserfs-cleanup patch?
Hello Chris, here comes the starting snippet of it:

diff -Nru a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
--- a/fs/reiserfs/inode.c	Thu May 31 09:55:13 2001
+++ b/fs/reiserfs/inode.c	Thu May 31 09:55:13 2001
@@ -43,7 +43,6 @@
 	windex = push_journal_writer(delete_inode) ;
 	reiserfs_delete_object (th, inode);
-	reiserfs_remove_page_from_flush_list(th, inode) ;
 	pop_journal_writer(windex) ;
 	reiserfs_release_objectid (th, inode->i_ino);
@@ -102,6 +101,11 @@
 	ih->u.ih_entry_count = cpu_to_le16 (entry_count);
 }
+static void add_to_flushlist(struct inode *inode, struct buffer_head *bh) {
+	struct inode *jinode = &(SB_JOURNAL(inode->i_sb)->j_dummy_inode) ;
+
+	buffer_insert_inode_queue(bh, jinode) ;
+}

Are you working on this or is it obsolete? Thanks, Dieter