Re: reiser4: mount -o remount,ro / causes error on reboot
On Tue, 12 Sep 2006 21:48:16 -0500, David Masover wrote:

snip...

>> Sorry to report this as an r4 bug, although it's interesting to note that the 1.12.4 baselayout did NOT cause this problem in reiserfs3.6
>
> Mine was doubtlessly a Reiser4 bug, as it resulted in either an oops or a panic, I'm not sure which. I think it was an oops, and after that oops, the disk is inaccessible. Since it's a root FS, this is a problem! The init scripts should not be able to cause this, no matter how buggy they are.

That's good to know...I guess :) I haven't done this already, because everything works now, and no one's asked me to yet. I find this curious, don't you?

I'm wondering if kernel preemption settings and anticipatory read ahead settings could play a role? Mine are CONFIG_PREEMPT=y and CONFIG_DEFAULT_IOSCHED=CFQ. One other thing of note is that my kernel has Jens Axboe's iosched-rollup-2.6.17.4-2 patch. My reiser4 patch is 2.6.17-3. The iosched patch removed a lot of dma messages in the kernel log (not disk errors).

Anyway, whatever changed in my downgraded baselayout, at least it's not causing reiser4 to hiccup. Curious problem in that it only occurs immediately after a shutdown, not after the error and a reboot from a maintenance prompt.

--
Peter
+ Do not reply to this email, it is a spam trap and not monitored.
I can be reached via this list, or via jabber: pete4abw at jabber.org
ICQ: 73676357
Re: reiser4: mount -o remount,ro / causes error on reboot
Hello

On Wednesday 13 September 2006 01:10, Peter wrote:
> On Sun, 10 Sep 2006 17:01:18 +, Peter wrote:
>
> all snip...
>
> To Vladimir and David:
>
> This appears to be a nasty gentoo issue. After perusing the forums and bugzilla, it appears that we are not alone in having difficulties with the baselayout. Nonetheless, as the reporter did, I downgraded baselayout from 1.12.4-r7-r7 to 1.11.15-r3 and the reboot problem I noted went away. It is interesting to note that it may be a C program, start-stop-daemon.c, that is the culprit. I don't expect much help from the gentoo devs since they won't support reiser4, but thought I would throw this out.
>
> Sorry to report this as an r4 bug, although it's interesting to note that the 1.12.4 baselayout did NOT cause this problem in reiserfs3.6

I still think that the problem is in reiser4. When the system fails on boot it usually outputs something which may help to understand the problem. Do you see anything like that on faulty startups? You can use either serial or network console to catch kernel output.
Re: reiser4: mount -o remount,ro / causes error on reboot
On Wed, 13 Sep 2006 14:49:05 +0400, Vladimir V. Saveliev wrote:

snip...

> I still think that the problem is in reiser4. When the system fails on boot it usually outputs something which may help to understand the problem. Do you see anything like that on faulty startups? You can use either serial or network console to catch kernel output.

Well, the output is from the gentoo rc script and from the imported functions.sh script. It showed two segfaults. As far as I can tell, it occurred when the dmcrypt addon is called. It uses the start-stop-daemon program. I tried taking a photo of it, but it was blurry and the flash obscured it.

I backtracked all the way from the errant baselayout, and none of the 1.12 series of baselayouts worked until I got back to 1.11.15-r3. Too bad I can't get gpm to run, otherwise I could capture the output. However, the only output is lines from the script with uninitialized variables, basically the source of the scripts. No dumps, stack traces, or panics.

--
Peter
Re: reiser4: mount -o remount,ro / causes error on reboot
On Wed, 13 Sep 2006 14:49:05 +0400, Vladimir V. Saveliev wrote:

all snip.

Here is a screen shot I posted along with the bug report on this: http://bugs.gentoo.org/attachment.cgi?id=96874&action=view . I am sorry the pic is a little blurred, but I had battery trouble.

There are two segfaults that occur, 2399 and 2524, and the text that is printed is from line 390 of rc and line 181 of functions.sh. As you can see, there are no panics or dumps, and it appears that for whatever reason the init scripts just cannot continue. However, after the reboot (ctrl-d), the scripts execute fine. As I noted previously, the error occurs in all unmasked 1.12 baselayout versions. 1.11.15-r3 works fine.

--
Peter
Re: setfacl curiously slow
Vladimir V. Saveliev wrote:
> Hello
>
> On Tuesday 12 September 2006 23:23, Dragan Krnic wrote:
>> Hi, everyone,
>>
>> I've migrated important user data from an older PC to some fairly contemporary hardware.
>>
>> The old hardware was Intel Pentium 4 3 GHz, single CPU, 2 GB RAM, 6 x 250 GB S-ATA connected via 2 Promise Tx2 cards and the on-board 2-port controller, managed as a software raid5 with 5 disks and 1 spare, with 83% used up, which is about 790 gigabytes net.
>>
>> The new hardware is a Tyan S2895, 2 dual-core Opteron280, 8 GB RAM, 8 x 500 GB S-ATA connected via Areca ARC-1120 133 MHz/64-bit PCI-X 8-port, managed as a hardware raid5 with 8 disks no spares.
>>
>> The tar backup of the old machine to a second similarly new PC ran at about 30 MB/s. The tar restore onto the new hardware was about twice as fast, as was expected.
>>
>> But! The restoration of ACLs took an enormous amount of time. The ACLs backup was created on the old machine in about 28 minutes, but it took 74 minutes to restore the ACLs on the new hardware.
>>
>> During that time the disk activity looked like what you can see in the enclosed PDF file. Short intervals of intense writing with much longer intervals of inactivity. In top the process setfacl --restore=acls.local and pdflush were in state D all the time. From time to time a process reiserfsd joined them in the same state. A login to the computer during that time was considerably slowed down. Otherwise the computer was still free from any load.
>>
>> I'm not sure if that's a problem or just normal but you will know better.
>
> I believe Jeff as reiserfs acl author will be interested by this question. But may I ask you to try your test on ext2 to check whether ACL restoring is much faster there?
>
>> There were 796,115 files to apply ACLs to.

What file system was on the old hardware?

Taking longer to restore the ACLs than backing them up isn't notable at all. It's expected that reads are faster than writes.

pdflush is responsible for flushing dirty pages to disk, kreiserfsd is the journal commit thread. Both are essential for writing on a reiserfs file system, and their being in D state is normal.

It's known that ACLs on reiserfs have performance issues. There have been several threads on this list and linux-kernel discussing it and we don't need to rehash them.

-Jeff

--
Jeff Mahoney
SUSE Labs
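For scale, the figures quoted in this thread (796,115 files, about 28 minutes to create the ACL backup, 74 minutes to restore it) can be turned into per-file rates. The snippet below is only arithmetic on those reported numbers; the constant and function names are made up for illustration and nothing here was measured independently.

```python
# Per-file ACL rates derived from the numbers reported in the thread.
FILES = 796_115          # files with ACLs to apply
BACKUP_MINUTES = 28      # time to create the ACL backup (old machine)
RESTORE_MINUTES = 74     # time for setfacl --restore (new machine)


def files_per_second(file_count: int, minutes: float) -> float:
    """Average number of files handled per second over the whole run."""
    return file_count / (minutes * 60)


backup_rate = files_per_second(FILES, BACKUP_MINUTES)
restore_rate = files_per_second(FILES, RESTORE_MINUTES)
slowdown = RESTORE_MINUTES / BACKUP_MINUTES

print(f"ACL backup:  {backup_rate:.0f} files/s")
print(f"ACL restore: {restore_rate:.0f} files/s")
print(f"restore took {slowdown:.1f}x as long as the backup")
```

Note that the two runs were on different hardware, so the ratio mixes the read/write difference with the machine difference.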
Re: Reiser FS will not boot after crash
Vladimir V. Saveliev wrote:
> Hello
>
> On Monday 04 September 2006 23:26, [EMAIL PROTECTED] wrote:
>> Hi,
>>
>> I am observing the following Reiser failure:
>>
>> I am trying to use camorama with a Creative WebCam Live spca5xx driver (recently downloaded and compiled). Camorama does not start and the computer freezes (no response to mouse or keyboard; can't change to a terminal window). Reset or pulling the plug leaves Knoppix 5.0.1-DVD unbootable. The actual message from GRUB is "inconsistent filesystem"?! From a boot loader?
>
> After an unclean shutdown, journal replay is necessary to return reiserfs to a consistent state. Maybe GRUB did not do that?

Grub uses the journal in a read-only mode. It doesn't replay it in a writable fashion. When grub needs a block, it scans the blocks used in the journal, and uses the most recent copy it finds there before looking out to the rest of the file system.

>> The 'fix' is even stranger. I execute fsck.reiserfs from another OS partition on the Knoppix 5.0.1-DVD partition (takes forever).
>
> Did fsck complete? What did it report?
>
>> Somehow 'reading' the Knoppix filesystem 'fixes' whatever was preventing Knoppix 5.0.1-DVD from booting.
>
> fsck replayed the journal.
>
> Does camorama work now? If it still causes a computer freeze, can you please install a serial or network console and try to catch what the kernel outputs when it freezes?

If the system doesn't work still, I believe grub has a debugging output mode that could yield more information. You'd need to rebuild and reinstall it.

-Jeff

--
Jeff Mahoney
SUSE Labs
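The read-only journal lookup described above (prefer the newest journaled copy of a block, otherwise fall back to the block's home location, never write anything back) can be modelled in a few lines. This is a toy sketch, not GRUB's code: the journal is represented as a plain list of transactions and the disk as a dict, and the names `newest_journal_copies` and `read_block` are hypothetical.

```python
from typing import Dict, List, Tuple

# One transaction is a list of (block number, block contents) pairs.
Transaction = List[Tuple[int, bytes]]


def newest_journal_copies(transactions: List[Transaction]) -> Dict[int, bytes]:
    """Map each journaled block number to its most recent logged contents.

    Transactions are assumed to be ordered oldest first, so a later entry
    for the same block overwrites an earlier one.
    """
    newest: Dict[int, bytes] = {}
    for transaction in transactions:
        for block_number, contents in transaction:
            newest[block_number] = contents
    return newest


def read_block(block_number: int,
               newest: Dict[int, bytes],
               disk: Dict[int, bytes]) -> bytes:
    """Return the journal copy if one exists, else the on-disk block.

    Nothing is written back, so the file system itself stays unreplayed.
    """
    if block_number in newest:
        return newest[block_number]
    return disk[block_number]
```

In this model a later real replay (for example by fsck or by mounting the file system) is what finally copies the journaled blocks to their home locations; the reader above only consults them.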
Re: Problems whit partition
[EMAIL PROTECTED] wrote:

After mounting the partition and copying files, the disk blocks (sleeps). The system otherwise works OK and poweroff finishes correctly. Files are probably damaged!

Full dmesg after the bug is below.

Hardware:
mainboard qdi advance 10b/f
video nvidia 400 mx

Other information is in the dmesg attachment.

Thanks
Cristian Sartori
Italy

Linux version 2.6.17-1.2174_FC5 ([EMAIL PROTECTED]) (gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)) #1 Tue Aug 8 15:30:55 EDT 2006
BIOS-provided physical RAM map:
 BIOS-e820: - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 2fff (usable)
 BIOS-e820: 2fff - 2fff3000 (ACPI NVS)
 BIOS-e820: 2fff3000 - 3000 (ACPI data)
 BIOS-e820: - 0001 (reserved)
0MB HIGHMEM available.
767MB LOWMEM available.
Using x86 segment limits to approximate NX protection
On node 0 totalpages: 196592
 DMA zone: 4096 pages, LIFO batch:0
 Normal zone: 192496 pages, LIFO batch:31
DMI 2.3 present.
ACPI: RSDP (v000 QDIGRP) @ 0x000f7e00
ACPI: RSDT (v001 QDIGRP AWRDACPI 0x42302e31 AWRD 0x) @ 0x2fff3000
ACPI: FADT (v001 QDIGRP AWRDACPI 0x42302e31 AWRD 0x) @ 0x2fff3040
ACPI: DSDT (v001 QDIGRP AWRDACPI 0x1000 MSFT 0x010c) @ 0x
ACPI: PM-Timer IO Port: 0x4008
Allocating PCI resources starting at 4000 (gap: 3000:cfff)
Built 1 zonelists
Kernel command line: ro root=/dev/hdb6 rhgb quiet
Local APIC disabled by BIOS -- you can enable it with lapic
mapped APIC to d000 (01607000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
CPU 0 irqstacks, hard=c075c000 soft=c075b000
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 997.568 MHz processor.
Using pmtmr for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 774024k/786368k available (2068k kernel code, 11800k reserved, 1127k data, 216k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 1997.33 BogoMIPS (lpj=3994663)
Security Framework v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383f9ff
CPU: After vendor identify, caps: 0383f9ff
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
CPU: After all inits, caps: 0383f1ff 0040
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: Intel Pentium III (Coppermine) stepping 0a
Checking 'hlt' instruction... OK.
ACPI: setting ELCR to 0800 (from 0e20)
checking if image is initramfs... it is
Freeing initrd memory: 933k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfb1d0, last bus=1
Setting up standard PCI resources
ACPI: Subsystem revision 20060127
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
PCI quirk: region 6000-607f claimed by vt82c686 HW-mon
PCI quirk: region 5000-500f claimed by vt82c686 SMB
Boot video device is :01:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 1 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 1 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 1 3 4 5 6 7 10 11 12 14 15) *9
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 13 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try pci=routeirq. If it helps, post a report
PCI: Bridge: :00:01.0
 IO window: disabled.
 MEM window: d800-d9ff
 PREFETCH window: d000-d7ff
PCI: Setting latency timer of device :00:01.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
TCP bind hash table entries: 65536 (order: 8,
[PATCH] reiser4: fix readv
Hello, Andrew

reiser4 in 2.6.18-rc6-mm2 has a bug: it cannot do readv.

The attached patch fixes it by implementing reiser4's aio_read file operation. Unfortunately, the implementation ends up containing a loop which is very similar to the one in fs/read_write.c:do_loop_readv_writev().

Alternatively, if do_loop_readv_writev were EXPORT_SYMBOL-ed, reiser4's aio_read could use it instead. But there is a problem with EXPORT_SYMBOL-ing do_loop_readv_writev: one of its arguments is io_fn_t, which is declared in fs/read_write.h. If it is OK to move the io_fn_t and do_loop_readv_writev declarations to include/linux/fs.h and to EXPORT_SYMBOL do_loop_readv_writev, the fix will be smaller.

Please let me know which you would prefer.

From: Vladimir Saveliev [EMAIL PROTECTED]

This patch adds an implementation of the aio_read file operation for reiser4. It is needed because in reiser4 there are files which cannot be dealt with via the generic page cache routines. In the case of readv, reiser4 has no means to find out the file type and to choose the proper way to read it. As a result, the generic page cache read gets called for files which cannot be read that way. Reiser4's aio_read method fixes that problem.
Signed-off-by: Vladimir Saveliev [EMAIL PROTECTED]

diff -puN fs/reiser4/plugin/object.c~reiser4-add-aio_read fs/reiser4/plugin/object.c
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/object.c~reiser4-add-aio_read 2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/object.c 2006-09-13 20:18:23.0 +0400
@@ -101,7 +101,7 @@ file_plugin file_plugins[LAST_FILE_PLUGI
 			.llseek = generic_file_llseek,
 			.read = read_unix_file,
 			.write = do_sync_write,
-			.aio_read = generic_file_aio_read,
+			.aio_read = aio_read_unix_file,
 			.aio_write = generic_file_aio_write,
 			.ioctl = ioctl_unix_file,
 			.mmap = mmap_unix_file,
diff -puN fs/reiser4/plugin/file/file.c~reiser4-add-aio_read fs/reiser4/plugin/file/file.c
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/file/file.c~reiser4-add-aio_read 2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/file/file.c 2006-09-13 20:52:30.0 +0400
@@ -2011,6 +2011,54 @@ out:
 	return result;
 }
 
+/**
+ * aio_read_unix_file - aio_read of struct file_operations
+ * @iocb: i/o control block
+ * @iov: i/o vector
+ * @nr_segs: number of segments in the i/o vector
+ * @pos: file position to read from
+ *
+ * When it is called within reiser4 context (this happens when sys_read is
+ * reading a file built of extents) - just call generic_file_aio_read to
+ * perform read into page cache. When it is called without reiser4 context
+ * (sys_readv) - call read_unix_file for each segments of i/o vector, so that
+ * read_unix_file will be able to choose whether the file is to be read into
+ * page cache or the file is built of tail items and page cache read is not
+ * suitable for it.
+ */
+ssize_t aio_read_unix_file(struct kiocb *iocb, const struct iovec *iov,
+			   unsigned long nr_segs, loff_t pos)
+{
+	ssize_t ret = 0;
+
+	if (is_in_reiser4_context())
+		return generic_file_aio_read(iocb, iov, nr_segs, pos);
+
+	while (nr_segs > 0) {
+		void __user *base;
+		size_t len;
+		ssize_t nr;
+
+		base = iov->iov_base;
+		len = iov->iov_len;
+		iov++;
+		nr_segs--;
+
+		nr = read_unix_file(iocb->ki_filp, base, len, &iocb->ki_pos);
+		if (nr < 0) {
+			if (!ret)
+				ret = nr;
+			break;
+		}
+		ret += nr;
+		if (nr != len)
+			break;
+	}
+
+	return ret;
+
+}
+
 static ssize_t read_unix_file_container_tails(
 	struct file *file, char __user *buf, size_t read_amount, loff_t *off)
 {
diff -puN fs/reiser4/plugin/file/file.h~reiser4-add-aio_read fs/reiser4/plugin/file/file.h
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/file/file.h~reiser4-add-aio_read 2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/file/file.h 2006-09-13 20:18:23.0 +0400
@@ -15,6 +15,8 @@ int setattr_unix_file(struct dentry *, s
 /* file operations */
 ssize_t read_unix_file(struct file *, char __user *buf, size_t read_amount,
 		       loff_t *off);
+ssize_t aio_read_unix_file(struct kiocb *, const struct iovec *,
+			   unsigned long nr_segs, loff_t pos);
 ssize_t write_unix_file(struct file *, const char __user *buf, size_t write_amount,
 			loff_t * off);
 int ioctl_unix_file(struct inode *, struct file *, unsigned int cmd, _
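The return-value rules of the per-segment loop in the patch are easy to misread in diff form, so here is a small user-space model of them in Python. It is only an illustration of the loop's control flow, not kernel code: `read_segments` and `make_reader` are hypothetical names, and `read_one` stands in for a call to read_unix_file on one segment (returning a byte count, or a negative number for an error).

```python
from typing import Callable, Sequence


def read_segments(read_one: Callable[[int], int],
                  segment_lengths: Sequence[int]) -> int:
    """Mimic the loop: read each segment in turn and total the bytes.

    An error is reported only if nothing was read before it; a short
    read ends the loop and the partial total is returned.
    """
    total = 0
    for length in segment_lengths:
        nr = read_one(length)
        if nr < 0:
            if total == 0:
                total = nr      # error on the very first segment
            break
        total += nr
        if nr != length:        # short read, e.g. end of file
            break
    return total


def make_reader(file_size: int) -> Callable[[int], int]:
    """Build a fake sequential reader over a file of file_size bytes."""
    position = 0

    def read_one(length: int) -> int:
        nonlocal position
        nr = min(length, file_size - position)
        position += nr
        return nr

    return read_one
```

For example, reading three 4-byte segments from a 10-byte file with this model stops after the third (short) segment and reports 10 bytes.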
Relocating files for faster boot/start-up on reiser(fs/4)
I have been playing around with relocating file data to improve boot time and app start-up time (like OpenOffice) on reiser(fs/4). This is done by monitoring the files accessed during boot/start-up, then copying these files into a single directory with sequential names (0001, 0002, ...) matching the access order. Finally, the new files are hard linked (rename should work too) to the same location as the original files.

As I understand it, both reiserfs and reiser4 assign keys to items based on the file name and the parent directory. The file system then attempts to match block order with key order. This allows the above trick to work for placing files in a specific order next to each other on disk.

I am using readahead-watch on Ubuntu. This little tool uses inotify to monitor all file accesses while it runs. The accessed files are written to a text file in disk order. I have modified this tool to also write them in access-time order. I then use a script (ruby) to do the above copy and link using the output from readahead-watch.

I have done some tests on my Athlon 2200 laptop running reiserfs. The hard drive is a 40GB Hitachi Travelstar 80GB with a max real transfer rate of 25MB/s and an access time of 12ms. The reiserfs partition size is 36G with 8.9G used.

I used readahead-watch to create a readahead log during boot on Ubuntu Edgy, much like the default configuration with the profile boot option, except set to record by access time, and I manually killed it after the system had fully booted. With this log used for readahead, the system booted in 2:15 from grub load to usable desktop (auto login), as measured manually with a stop watch. After running the relocate script, the boot time with the same readahead log was 1:38. I then reran readahead-watch during boot, set to sort by disk order, resulting in a boot time of 1:15. I booted twice for each test to make sure the results were within a few seconds.
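The copy-and-swap step described above can be sketched as follows. The actual script was written in Ruby and is not shown in this post, so this is a reconstruction under assumptions: the function name `relocate` is hypothetical, it swaps with a rename rather than a hard link, the staging directory must be on the same file system as the originals, and `shutil.copy2` does not preserve ownership or hard-link relationships.

```python
import os
import shutil
from typing import List, Sequence, Tuple


def relocate(paths_in_access_order: Sequence[str], staging_dir: str) -> None:
    """Re-create files in access order, then swap them into place.

    Pass 1 copies every file into staging_dir under sequential names
    (0001, 0002, ...), so the file system sees them created in one
    directory in the desired order. Pass 2 renames each copy over its
    original, leaving the directory tree looking the same as before.
    """
    os.makedirs(staging_dir, exist_ok=True)
    staged: List[Tuple[str, str]] = []

    for index, original in enumerate(paths_in_access_order, start=1):
        copy_path = os.path.join(staging_dir, f"{index:04d}")
        shutil.copy2(original, copy_path)   # contents, mode and timestamps
        staged.append((copy_path, original))

    for copy_path, original in staged:
        os.replace(copy_path, original)     # rename; same file system only
```

Whether the new copies actually end up contiguous depends on the file system and its free space, and replacing files that running programs have open (shared libraries, for instance) is risky, so a sketch like this should only be tried on data that can be restored.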
I also used bootchart, but this didn't measure Gnome start-up and requires a bit of ambition to analyze thoroughly. But it was evident that running the relocate script did increase peak disk throughput from 6MB/s to 13MB/s and increased the average throughput rate. But most of the boot time is still spent waiting on the disk. My relocate script relocated 310MB of files. If those were perfectly contiguous on disk, this drive should be able to load that in under 20s, though I expect only a fraction of that is actually accessed during boot. Using 'filefrag' it is evident that the relocate script's attempt to relocate the files contiguously was a bit half assed, but from the boot times it was clearly an improvement.

I also used readahead-watch to monitor the files accessed by openoffice writer on startup. The initial cold start time was 17s (about 0.5s variation from load to load). A warm start (a start right after it's closed) was 3.6s. The results from readahead-watch were filtered through a script to remove all files that were open when openoffice wasn't running (using fuser). Running the relocate script on some of the X and gnome libraries broke my system nicely until a reboot. After running the relocate script, the cold start time became 14s. When readahead-list was run on the same relocated files before starting openoffice, the load time was 6.5s. sudo sh -c "echo 1 > /proc/sys/vm/drop_caches" was used to ensure the disk was read between runs.

Of course, these results are highly dependent on how fragmented the files were before and how effectively the relocation worked. I expect others could reproduce speedups, but how much will vary. I did these tests on my laptop with a slow hard drive so the results would be more evident.

I also did some tests with fresh reiserfs, reiser4, and ext3 on a 100MB loopback to see how well the file system would take the hint to order data sequentially.
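As a rough check on the "under 20s" estimate and on the boot times quoted earlier in this post, the arithmetic can be written out. This only restates the reported figures (310 relocated, taken here to mean megabytes; 25MB/s maximum transfer; boot times of 2:15, 1:38 and 1:15); the names are made up for illustration.

```python
# Back-of-the-envelope numbers from the figures reported in this post.
RELOCATED_MB = 310        # data moved by the relocate script (assumed MB)
MAX_TRANSFER_MB_S = 25    # drive's quoted maximum real transfer rate


def seconds_to_read(megabytes: float, mb_per_second: float) -> float:
    """Time to read the data sequentially at a constant rate."""
    return megabytes / mb_per_second


def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a stop-watch reading such as 2:15 to seconds."""
    return minutes * 60 + seconds


best_case_seconds = seconds_to_read(RELOCATED_MB, MAX_TRANSFER_MB_S)

baseline = to_seconds(2, 15)                 # access-order log, no relocation
after_relocate = to_seconds(1, 38)           # same log, after relocating
after_relocate_disk_order = to_seconds(1, 15)  # relocated, disk-order log

print(f"best-case sequential read: {best_case_seconds:.1f} s")
print(f"speedup after relocate: {baseline / after_relocate:.2f}x")
print(f"speedup with disk-order log: {baseline / after_relocate_disk_order:.2f}x")
```

The gap between the best-case sequential read time and the measured boot times is consistent with the observation that most of the boot is still spent waiting on the disk.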
Creating 10 5MB files with sequential names on reiser4 resulted in one fragment (measured by 'filefrag') for the whole bunch, probably a disk allocation bitmap, so nearly perfect. reiserfs generally would end up with 3-4 fragments for the same test. And ext3 didn't appear to make any real attempt to order the files sequentially on disk.

I have a 29GB reiser4 partition with 21GB used that I have been running for a few years now (since sometime before release). When I ran the same 10 5MB file test on it, the total resulted in 1000+ fragments (I didn't bother to count, but it was a lot). But the files were allocated head to tail. It's a bit scary to think the file system can't find an unallocated region of a few MB on disk. Clearly a repacker would be really nice.

Relocating file data to match pre-measured access patterns can clearly make a big performance difference. Reiser(fs/4) provides an easy mechanism to hint at disk order, which can be used to measurably improve boot/startup times. But I expect more can be done to achieve better results. This includes better measurements of read
Re: Relocating files for faster boot/start-up on reiser(fs/4)
On Wed, 13 Sep 2006 14:51:39 -0600, Quinn Harris wrote:

> Thoughts?

Yes. Why on earth would you do this? Copying the files, then renaming and hardlinking them, is nothing a sysadmin would ever do. Just by copying you are allowing reiser to optimize the dir. You're trying to duplicate what a tree-based design does automatically. Moreover, remember that reiser packs files into clusters so that you may read more than just your one file from time to time, which could end up adding time to your test. If reiser needs speedup it certainly won't be done by renaming files!

JM$0.02

--
Peter
Re: Relocating files for faster boot/start-up on reiser(fs/4)
Peter, I think you misunderstood what I was doing and why. Let me try to clarify. My test is far from perfect. It's merely an exercise to verify the basic idea.

> Just by copying you are allowing reiser to optimize the dir.

Exactly, but I am copying in a way that implicitly suggests what order those files will be accessed in. I was attempting to reorder the data on disk to minimize disk seeks, using knowledge of the order in which that data will be accessed. This was done by taking advantage of the way reiser assigns keys to files based on their name, and its affinity to match key order with block order.

> You're trying to duplicate what a tree-based design does automatically.

This works because of the tree-based design of reiser. Reiser must assign each file (item, actually) some key, so why not take advantage of knowledge of the order those items will be accessed in? The current key assignment algorithm is a best guess at that, given the limited information it has (file/directory name). Remember, key assignment roughly translates to on-disk position.

The relocate script can leave the file system in the exact same state from a semantic standpoint (what files and directories are there) but relocate the data on disk. Copying those files to a single directory with numeric names was a kludge to implicitly tell the file system to place those files in a specific order and near each other on disk. The rename step is to switch the old unoptimized file position with the new, more optimized position.

> Moreover, remember that reiser packs files into clusters so that you may read more than just your one file from time to time which could end up adding time to your test.

The boot optimization was over 3885 files. Ideally those files would be ordered head to tail in a sequence that perfectly matches the order they will be read. As a result, multiple items in a node will all need to be read at nearly the same time. That didn't happen in my test, but it was much closer to that after I ran the relocate script than before. Hence the performance improvement. With this script, reiser4 and a repacker, I have reason to believe the ordering will be nearly perfect. Of course, that is excluding random access patterns inside the same file and the directory data needed to get at the files.

This basic technique can be made into a boot script much like the readahead script already in Ubuntu, just improved. Boot once with a profile option; it measures read patterns (it already does this), then reorders data on disk with this trick, or maybe something better. Then the next time you boot it's 1.5-2x faster. Better yet, include this profile information in the distro packages. When a package is installed, this info is used to help assign item keys, resulting in a better disk layout and faster boot times, with no weird file copy-rename mumbo jumbo.

I bring this up here because I expect that with reiser4, a repacker, and this trick, reiser4 could deliver at least 50% better reproducible real-world boot and app load performance than any other file system. At least until other file systems implement something similar, like what MS did with XP. Can something similar be done (or has it been) on ext(2/3/4), XFS, JFS or other linux file systems?

Windows XP boots much faster than Windows 2000 in part because it does what I am talking about. File access is recorded at boot, then the disk is defragged with this knowledge. Check out http://msdn.microsoft.com/msdnmag/issues/01/12/xpkernel/default.aspx under Prefetch. Also look at http://kerneltrap.org/node/2157

MS's implementation required implementing a defrag utility with a specific feature that could position disk data based on access logs. Reiser4 can do the same thing as part of its basic functionality, with the addition of a much, much simpler tool to help assign keys based on that access log. Then a repacker (when it devaporizes) can further optimize for that access pattern without any code specific to that purpose. Seems like good orthogonal design to me.

Hope that clarifies. Like my previous post, whatever it did, it did it in way too many words.

--
Quinn Harris
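The profiling half of the idea (turn a recorded access log into the order used for the sequential names) might look like the sketch below. The log format is an assumption made for illustration, a list of (timestamp, path) pairs; it is not the real readahead-watch output format, and `access_order` and `sequential_names` are hypothetical helper names. The `exclude` set plays the role of the filtering step mentioned in the original post.

```python
from typing import AbstractSet, Dict, Iterable, List, Sequence, Tuple


def access_order(log: Iterable[Tuple[float, str]],
                 exclude: AbstractSet[str] = frozenset()) -> List[str]:
    """Return each path once, ordered by the time it was first accessed.

    Paths listed in exclude (for example files that were already open
    for unrelated reasons) are dropped.
    """
    seen = set()
    ordered: List[str] = []
    for _timestamp, path in sorted(log):
        if path in exclude or path in seen:
            continue
        seen.add(path)
        ordered.append(path)
    return ordered


def sequential_names(paths: Sequence[str]) -> Dict[str, str]:
    """Assign 0001, 0002, ... to paths in the given order."""
    return {path: f"{index:04d}" for index, path in enumerate(paths, start=1)}
```

A package-level variant of the same idea would ship such an ordering with the package instead of measuring it on each machine, as suggested above.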