Re: reiser4: mount -o remount,ro / causes error on reboot

2006-09-13 Thread Peter
On Tue, 12 Sep 2006 21:48:16 -0500, David Masover wrote:

snip...
>> Sorry to report this as an r4 bug, although it's interesting to note that
>> the 1.12.4 baselayout did NOT cause this problem in reiserfs3.6
>
> Mine was doubtlessly a Reiser4 bug, as it resulted in either an oops or
> a panic, I'm not sure which.  I think it was an oops, and after that
> oops, the disk is inaccessible.  Since it's a root FS, this is a
> problem!  The init scripts should not be able to cause this, no matter
> how buggy they are.

That's good to know...I guess :)

> I haven't done this already, because everything works now, and no one's
> asked me to yet.

I find this curious, don't you?

I'm wondering if kernel preemption settings and anticipatory read ahead
settings could play a role? Mine are CONFIG_PREEMPT=y and
CONFIG_DEFAULT_IOSCHED=CFQ. One other thing of note is that my kernel has
Jens Axboe's iosched-rollup-2.6.17.4-2 patch. My reiser4 patch is
2.6.17-3. The iosched patch removed a lot of dma messages in the kernel
log (not disk errors).

Anyway, whatever changed in my downgraded baselayout, at least it's not
causing reiser4 to hiccup. Curious problem in that it only occurs
immediately after a shutdown, not after the error and a reboot from a
maintenance prompt.

-- 
Peter
+
Do not reply to this email, it is a spam trap and not monitored.
I can be reached via this list, or via 
jabber: pete4abw at jabber.org
ICQ: 73676357



Re: reiser4: mount -o remount,ro / causes error on reboot

2006-09-13 Thread Vladimir V. Saveliev
Hello

On Wednesday 13 September 2006 01:10, Peter wrote:
> On Sun, 10 Sep 2006 17:01:18 +, Peter wrote:
> all snip...
>
> To Vladimir and David:
>
> This appears to be a nasty gentoo issue. After perusing the forums and
> bugzilla, it appears that we are not alone in having difficulties with the
> baselayout. Nonetheless, as the reporter did, I downgraded baselayout from
> 1.12.4-r7-r7 to 1.11.15-r3 and the reboot problem I noted went away. It is
> interesting to note that it may be a C program startstop-daemon.c that may
> be the culprit. I don't expect much help from the gentoo devs since they
> won't support reiser4, but thought I would throw this out.
>
> Sorry to report this as an r4 bug, although it's interesting to note that
> the 1.12.4 baselayout did NOT cause this problem in reiserfs3.6


I still think that the problem is in reiser4. When the system fails on boot it 
usually outputs something which may help to understand the problem. Do you 
see anything like that on faulty startups? You can use either serial or 
network console to catch kernel output.


Re: reiser4: mount -o remount,ro / causes error on reboot

2006-09-13 Thread Peter
On Wed, 13 Sep 2006 14:49:05 +0400, Vladimir V. Saveliev wrote:

snip...
 
> I still think that the problem is in reiser4. When the system fails on boot
> it usually outputs something which may help to understand the problem. Do you
> see anything like that on faulty startups? You can use either serial or
> network console to catch kernel output.

Well, the output is from the gentoo rc script and from the imported
functions.sh script. It showed two segfaults. As far as I can tell, they
occurred when the dmcrypt addon is called. It uses the start-stop-daemon
program. I tried taking a photo of it, but it was blurry and the flash
obscured it.

I backtracked all the way from the errant baselayout, and none of the 1.12
series of baselayouts worked until I reached 1.11.15-r3. Too bad I can't get
gpm to run, otherwise I could capture the output. However, the only output is
lines from the script with uninitialized variables -- basically the source
of the scripts. No dumps, stack traces, or panics.

-- 
Peter



Re: reiser4: mount -o remount,ro / causes error on reboot

2006-09-13 Thread Peter
On Wed, 13 Sep 2006 14:49:05 +0400, Vladimir V. Saveliev wrote:

all snip.

Here is a screen shot I posted along with the bug report on this:

http://bugs.gentoo.org/attachment.cgi?id=96874&action=view . I am sorry
the pic is a little blurred, but I had battery trouble.

There are two segfaults that occur, 2399 and 2524 and the text that is
printed is from line 390 of rc and line 181 of functions.sh.

As you can see, there are no panics or dumps and it appears that, for
whatever reason, the init scripts just cannot continue. However, after the
reboot (ctrl-d), the scripts execute fine.

As I noted previously, the error occurs with all unmasked 1.12 baselayout
versions. 1.11.15-r3 works fine.

-- 
Peter



Re: setfacl curiously slow

2006-09-13 Thread Jeff Mahoney
Vladimir V. Saveliev wrote:
> Hello
>
> On Tuesday 12 September 2006 23:23, Dragan Krnic wrote:
>> Hi, everyone,
>>
>> I've migrated important user data from an older PC to some fairly
>> contemporary hardware.
>>
>> The old hardware was Intel Pentium 4 3 GHz, single CPU, 2 GB RAM,
>> 6 x 250 GB S-ATA connected via 2 Promise Tx2 cards and the on-board
>> 2-port controller, managed as a software raid5 with 5 disks and 1 spare,
>> with 83% used up, which is about 790 gigibytes net.
>>
>> The new hardware is a Tyan S2895, 2 dual-core Opteron280, 8 GB RAM,
>> 8 x 500 GB S-ATA connected via Areca ARC-1120 133 MHz/64-bit PCI-X
>> 8-port, managed as a hardware raid5 with 8 disks no spares.
>>
>> The tar backup of the old machine to a second similarly new PC ran at
>> about 30 MB/s. The tar restore onto the new hardware was about twice as
>> fast, as was expected.
>>
>> But! The restoration of ACLs took an enormous amount of time.
>>
>> The ACLs backup was created on the old machine in about 28 minutes,
>> but it took 74 minutes to restore the ACLs on the new hardware. During
>> that time the disk activity looked like what you can see in the enclosed
>> PDF file. Short intervals of intense writing with much longer intervals of
>> inactivity. In top the process setfacl --restore=acls.local and pdflush
>> were in state D all the time. From time to time a process reiserfsd
>> joined them in the same state.
>>
>> A login to the computer during that time was considerably slowed down.
>> Otherwise the computer was still free from any load.
>>
>> I'm not sure if that's a problem or just normal but you will know better.
>
> I believe Jeff as reiserfs acl author will be interested by this question.
> But may I ask you to try your test on ext2 to check whether ACL restoring is
> much faster there?
>
>> There were 796,115 files to apply ACLs to.

What file system was on the old hardware? Taking longer to restore the
ACLs than backing them up isn't notable at all. It's expected that reads
are faster than writes.

pdflush is responsible for flushing dirty pages to disk, kreiserfsd is
the journal commit thread. Both are essential for writing on a reiserfs
file system, and their being in D state is normal.

It's known that ACLs on reiserfs have performance issues. There have been
several threads on this list and linux-kernel discussing it and we don't
need to rehash them.

-Jeff

-- 
Jeff Mahoney
SUSE Labs


Re: Reiser FS will not boot after crash

2006-09-13 Thread Jeff Mahoney
Vladimir V. Saveliev wrote:
> Hello
>
> On Monday 04 September 2006 23:26, [EMAIL PROTECTED] wrote:
>> Hi,
>>
>>   I am observing the following Reiser failure:
>>
>>   I am trying to use camorama with a Creative WebCam Live spca5xx driver
>> (recently downloaded and compiled) . Camorama does not start and computer
>> freezes (no response to mouse, or keyboard. Can't change to terminal
>> window.
>>
>>   Reset or pull plug leaves Knoppix 5.0.1-DVD unbootable:
>>
>>   The actual message from GRUB is inconsistent filesystem?! From a boot
>> loader?
>
> after unclean shutdown journal replay is necessary to return reiserfs to
> consistent state. Maybe GRUB did not do that?

Grub uses the journal in a read-only mode. It doesn't replay it in a
writable fashion. When grub needs a block, it scans the blocks used in
the journal, and uses the most recent copy it finds there before looking
out to the rest of the file system.
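
The lookup described above can be modeled with a toy sketch. This is not GRUB's code: the `journal` list, the `disk` dict, and the `read_block` function are hypothetical names chosen for illustration, and the real on-disk journal format is more involved. It only shows the idea of preferring the newest journaled copy of a block without ever writing anything back.

```python
def read_block(block_no, journal, disk):
    """Return the newest copy of a block without replaying the journal.

    journal: list of (block_no, data) entries, oldest first (hypothetical layout).
    disk:    dict mapping block_no -> data as last written in place.
    """
    # Scan the journal newest-first so the most recent logged copy wins.
    for logged_no, data in reversed(journal):
        if logged_no == block_no:
            return data
    # Not present in the journal: fall back to the in-place copy.
    return disk[block_no]


# Block 1 was logged twice after the crash; block 2 was never journaled.
disk = {1: b"old-1", 2: b"old-2"}
journal = [(1, b"new-1a"), (1, b"new-1b")]
```

Because nothing is written, the in-place copy of block 1 stays stale until something (the kernel at mount time, or fsck) actually replays the journal.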

>>   The 'fix' is even stranger. I execute fsck.reiserfs from another OS
>> partition on the Knoppix 5.0.1-DVD partition (takes forever).
>
> Did fsck complete? What did it report?
>
>> Somehow
>> 'reading' the Knoppix filesystem 'fixes' whatever was preventing Knoppix
>> 5.0.1-DVD from booting.
>
> fsck replayed the journal.
> Does camorama work now?
>
> If it still causes computer freeze - can you please install serial or network
> console and try to catch what does kernel output when it freezes.

If the system still doesn't work, I believe grub has a debugging output
mode that could yield more information. You'd need to rebuild and
reinstall it.

-Jeff

-- 
Jeff Mahoney
SUSE Labs


Re: Problems whit partition

2006-09-13 Thread Jeff Mahoney
[EMAIL PROTECTED] wrote:
 After partition mount and copy files block disk ( sleep )
 System function ok poweroff finished correct.
 Probable files damaged!!
 dmesg total after bug.
 Hardware:
 mainboard qdi advance 10b/f
 video nvidia 400 mx
 Other information on dmesg attachment.
 
 
 Thank
 
 Cristian Sartori
 
 Italy
 
 
 
 
 
 
 Linux version 2.6.17-1.2174_FC5 ([EMAIL PROTECTED]) (gcc version 4.1.1 
 20060525 (Red Hat 4.1.1-1)) #1 Tue Aug 8 15:30:55 EDT 2006
 BIOS-provided physical RAM map:
  BIOS-e820:  - 0009fc00 (usable)
  BIOS-e820: 0009fc00 - 000a (reserved)
  BIOS-e820: 000f - 0010 (reserved)
  BIOS-e820: 0010 - 2fff (usable)
  BIOS-e820: 2fff - 2fff3000 (ACPI NVS)
  BIOS-e820: 2fff3000 - 3000 (ACPI data)
  BIOS-e820:  - 0001 (reserved)
 0MB HIGHMEM available.
 767MB LOWMEM available.
 Using x86 segment limits to approximate NX protection
 On node 0 totalpages: 196592
   DMA zone: 4096 pages, LIFO batch:0
   Normal zone: 192496 pages, LIFO batch:31
 DMI 2.3 present.
 ACPI: RSDP (v000 QDIGRP) @ 0x000f7e00
 ACPI: RSDT (v001 QDIGRP AWRDACPI 0x42302e31 AWRD 0x) @ 0x2fff3000
 ACPI: FADT (v001 QDIGRP AWRDACPI 0x42302e31 AWRD 0x) @ 0x2fff3040
 ACPI: DSDT (v001 QDIGRP AWRDACPI 0x1000 MSFT 0x010c) @ 0x
 ACPI: PM-Timer IO Port: 0x4008
 Allocating PCI resources starting at 4000 (gap: 3000:cfff)
 Built 1 zonelists
 Kernel command line: ro root=/dev/hdb6 rhgb quiet
 Local APIC disabled by BIOS -- you can enable it with lapic
 mapped APIC to d000 (01607000)
 Enabling fast FPU save and restore... done.
 Enabling unmasked SIMD FPU exception support... done.
 Initializing CPU#0
 CPU 0 irqstacks, hard=c075c000 soft=c075b000
 PID hash table entries: 4096 (order: 12, 16384 bytes)
 Detected 997.568 MHz processor.
 Using pmtmr for high-res timesource
 Console: colour VGA+ 80x25
 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
 Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
 Memory: 774024k/786368k available (2068k kernel code, 11800k reserved, 1127k 
 data, 216k init, 0k highmem)
 Checking if this processor honours the WP bit even in supervisor mode... Ok.
 Calibrating delay using timer specific routine.. 1997.33 BogoMIPS 
 (lpj=3994663)
 Security Framework v1.0.0 initialized
 SELinux:  Initializing.
 SELinux:  Starting in permissive mode
 selinux_register_security:  Registering secondary module capability
 Capability LSM initialized as secondary
 Mount-cache hash table entries: 512
 CPU: After generic identify, caps: 0383f9ff    
   
 CPU: After vendor identify, caps: 0383f9ff    
   
 CPU: L1 I cache: 16K, L1 D cache: 16K
 CPU: L2 cache: 256K
 CPU: After all inits, caps: 0383f1ff   0040  
  
 Intel machine check architecture supported.
 Intel machine check reporting enabled on CPU#0.
 CPU: Intel Pentium III (Coppermine) stepping 0a
 Checking 'hlt' instruction... OK.
 ACPI: setting ELCR to 0800 (from 0e20)
 checking if image is initramfs... it is
 Freeing initrd memory: 933k freed
 NET: Registered protocol family 16
 ACPI: bus type pci registered
 PCI: PCI BIOS revision 2.10 entry at 0xfb1d0, last bus=1
 Setting up standard PCI resources
 ACPI: Subsystem revision 20060127
 ACPI: Interpreter enabled
 ACPI: Using PIC for interrupt routing
 ACPI: PCI Root Bridge [PCI0] (:00)
 PCI: Probing PCI hardware (bus 00)
 ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
 PCI quirk: region 6000-607f claimed by vt82c686 HW-mon
 PCI quirk: region 5000-500f claimed by vt82c686 SMB
 Boot video device is :01:00.0
 ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
 ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 *10 11 12 14 15)
 ACPI: PCI Interrupt Link [LNKB] (IRQs 1 3 4 5 6 7 10 *11 12 14 15)
 ACPI: PCI Interrupt Link [LNKC] (IRQs 1 3 4 *5 6 7 10 11 12 14 15)
 ACPI: PCI Interrupt Link [LNKD] (IRQs 1 3 4 5 6 7 10 11 12 14 15) *9
 Linux Plug and Play Support v0.97 (c) Adam Belay
 pnp: PnP ACPI init
 pnp: PnP ACPI: found 13 devices
 usbcore: registered new driver usbfs
 usbcore: registered new driver hub
 PCI: Using ACPI for IRQ routing
 PCI: If a device doesn't work, try pci=routeirq.  If it helps, post a report
 PCI: Bridge: :00:01.0
   IO window: disabled.
   MEM window: d800-d9ff
   PREFETCH window: d000-d7ff
 PCI: Setting latency timer of device :00:01.0 to 64
 NET: Registered protocol family 2
 IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
 TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
 TCP bind hash table entries: 65536 (order: 8, 

[PATCH] reiser4: fix readv

2006-09-13 Thread Vladimir V. Saveliev
Hello, Andrew

reiser4 in 2.6.18-rc6-mm2 has a bug. It cannot do readv.

The attached patch fixes it by implementing reiser4's aio_read file operation.
Unfortunately, it ended up with a loop which is very similar to the one in
fs/read_write.c:do_loop_readv_writev().
Alternatively, if do_loop_readv_writev were EXPORT_SYMBOL-ed,
reiser4's aio_read could use it instead. But there is a problem with
EXPORT_SYMBOL-ing do_loop_readv_writev:
one of its arguments is io_fn_t, which is declared in fs/read_write.h.
If it is ok to move the io_fn_t and do_loop_readv_writev declarations to
include/linux/fs.h and to EXPORT_SYMBOL
do_loop_readv_writev, the fix will be smaller. Please let me know which you
would prefer.


From: Vladimir Saveliev [EMAIL PROTECTED]

This patch adds an implementation of the aio_read file operation for reiser4.
It is needed because in reiser4 there are files which cannot be dealt
with via the generic page cache routines.
In the case of readv, reiser4 has no means to find out the file type and to
choose the proper way to read it. As a result, the generic page cache read
gets called for files which cannot be read that way. Reiser4's aio_read
method fixes that problem.

Signed-off-by: Vladimir Saveliev [EMAIL PROTECTED]




diff -puN fs/reiser4/plugin/object.c~reiser4-add-aio_read fs/reiser4/plugin/object.c
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/object.c~reiser4-add-aio_read	2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/object.c	2006-09-13 20:18:23.0 +0400
@@ -101,7 +101,7 @@ file_plugin file_plugins[LAST_FILE_PLUGI
 	.llseek = generic_file_llseek,
 	.read = read_unix_file,
 	.write = do_sync_write,
-	.aio_read = generic_file_aio_read,
+	.aio_read = aio_read_unix_file,
 	.aio_write = generic_file_aio_write,
 	.ioctl = ioctl_unix_file,
 	.mmap = mmap_unix_file,
diff -puN fs/reiser4/plugin/file/file.c~reiser4-add-aio_read fs/reiser4/plugin/file/file.c
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/file/file.c~reiser4-add-aio_read	2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/file/file.c	2006-09-13 20:52:30.0 +0400
@@ -2011,6 +2011,54 @@ out:
 	return result;
 }
 
+/**
+ * aio_read_unix_file - aio_read of struct file_operations
+ * @iocb: i/o control block
+ * @iov: i/o vector
+ * @nr_segs: number of segments in the i/o vector
+ * @pos: file position to read from
+ *
+ * When it is called within reiser4 context (this happens when sys_read is
+ * reading a file built of extents) - just call generic_file_aio_read to
+ * perform read into page cache. When it is called without reiser4 context
+ * (sys_readv) - call read_unix_file for each segments of i/o vector, so that
+ * read_unix_file will be able to choose whether the file is to be read into
+ * page cache or the file is built of tail items and page cache read is not
+ * suitable for it.
+ */
+ssize_t aio_read_unix_file(struct kiocb *iocb, const struct iovec *iov,
+			   unsigned long nr_segs, loff_t pos)
+{
+	ssize_t ret = 0;
+
+	if (is_in_reiser4_context())
+		return generic_file_aio_read(iocb, iov, nr_segs, pos);
+
+	while (nr_segs > 0) {
+		void __user *base;
+		size_t len;
+		ssize_t nr;
+
+		base = iov->iov_base;
+		len = iov->iov_len;
+		iov++;
+		nr_segs--;
+
+		nr = read_unix_file(iocb->ki_filp, base, len, &iocb->ki_pos);
+		if (nr < 0) {
+			if (!ret)
+				ret = nr;
+			break;
+		}
+		ret += nr;
+		if (nr != len)
+			break;
+	}
+
+	return ret;
+
+}
+
 static ssize_t read_unix_file_container_tails(
 	struct file *file, char __user *buf, size_t read_amount, loff_t *off)
 {
diff -puN fs/reiser4/plugin/file/file.h~reiser4-add-aio_read fs/reiser4/plugin/file/file.h
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/file/file.h~reiser4-add-aio_read	2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/file/file.h	2006-09-13 20:18:23.0 +0400
@@ -15,6 +15,8 @@ int setattr_unix_file(struct dentry *, s
 /* file operations */
 ssize_t read_unix_file(struct file *, char __user *buf, size_t read_amount,
 		       loff_t *off);
+ssize_t aio_read_unix_file(struct kiocb *, const struct iovec *,
+			   unsigned long nr_segs, loff_t pos);
 ssize_t write_unix_file(struct file *, const char __user *buf, size_t write_amount,
 			loff_t * off);
 int ioctl_unix_file(struct inode *, struct file *, unsigned int cmd,

_
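
For readers who want to see the control flow of the segment loop outside the kernel, here is a small user-space model in Python. It is only a sketch: `loop_readv` and `read_segment` are hypothetical names, `read_segment` stands in for read_unix_file, and negative return values stand in for kernel error codes. It mirrors the three exits of the loop in the patch: an error (reported only if nothing has been read yet), a short read, and running out of segments.

```python
import io


def loop_readv(read_segment, buffers):
    """Fill each buffer in turn, mirroring the per-segment loop in the patch.

    read_segment(buf) fills `buf` and returns the byte count, or a negative
    errno-style value on failure (a stand-in for read_unix_file).
    """
    total = 0
    for buf in buffers:
        nr = read_segment(buf)
        if nr < 0:
            # Report the error only if no data has been transferred so far.
            if total == 0:
                total = nr
            break
        total += nr
        if nr != len(buf):
            # Short read (for example end of file): stop early.
            break
    return total


# Eleven bytes spread over three segments; the last segment is only
# partly filled, which ends the loop.
src = io.BytesIO(b"hello world")
bufs = [bytearray(5), bytearray(4), bytearray(8)]
result = loop_readv(src.readinto, bufs)
```

Here `result` is the total number of bytes placed into the segments, the same quantity a readv call reports to its caller.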



Relocating files for faster boot/start-up on reiser(fs/4)

2006-09-13 Thread Quinn Harris
I have been playing around with relocating file data to improve boot time and 
app start-up time (like OpenOffice) on reiser(fs/4).  This is done by 
monitoring the files accessed during boot/start-up then copying these files 
into a single directory with sequential names 0001 0002 ... matching the 
access order.  Finally the new files are hard linked (rename should work too) 
to the same location as the original files.

As I understand it, both reiserfs and reiser4 assign keys to items based on the 
file name and the parent directory.  The file system then attempts to match 
block order with key order.  This allows the above trick to work for placing 
files in a specific order next to each other on disk.
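
The copy-and-link procedure described above can be sketched roughly as follows. This is not the original script (which was written in Ruby and is not shown in this post); it is a minimal Python illustration under stated assumptions: `relocate`, `staging_dir`, and the `.relocate-tmp` suffix are hypothetical names, the staging directory sits on the same file system as the originals, and ownership, open file handles, and files with existing hard links are ignored.

```python
import os
import shutil
import tempfile


def relocate(access_order, staging_dir):
    """Copy files in access order, then link the copies over the originals.

    access_order: paths in the order they were read (e.g. from a readahead log).
    staging_dir:  directory on the same file system that receives the copies.
    """
    os.makedirs(staging_dir, exist_ok=True)
    staged = []
    # Pass 1: write every file under a sequential name (0001, 0002, ...),
    # so that name order matches access order.
    for index, path in enumerate(access_order, start=1):
        copy_path = os.path.join(staging_dir, f"{index:04d}")
        shutil.copy2(path, copy_path)
        staged.append((copy_path, path))
    # Pass 2: hard link each copy to a temporary name next to the original,
    # then atomically rename it over the original.
    for copy_path, path in staged:
        tmp = path + ".relocate-tmp"  # hypothetical temporary name
        os.link(copy_path, tmp)
        os.replace(tmp, path)


# Tiny demonstration on throwaway files.
with tempfile.TemporaryDirectory() as root:
    first = os.path.join(root, "b.so")
    second = os.path.join(root, "a.so")
    for path, payload in ((first, b"B"), (second, b"A")):
        with open(path, "wb") as fh:
            fh.write(payload)
    staging = os.path.join(root, "staging")
    relocate([first, second], staging)
    staged_names = sorted(os.listdir(staging))
    with open(first, "rb") as fh:
        first_payload = fh.read()
    same_inode = os.path.samefile(first, os.path.join(staging, "0001"))
```

After the run, the original path and the staged copy are the same inode, so the file's contents are unchanged while its data is whatever the file system allocated for the sequentially named copy.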

I am using readahead-watch on Ubuntu.  This little tool uses inotify to 
monitor all file accesses while it runs.  The accessed files are written to a 
text file by disk order.  I have modified this tool to also write them by 
access time.  I then use a script (ruby) to do the above copy and link using 
the output from readahead-watch.

I have done some tests on my Athlon 2200 laptop running reiserfs.  The hard 
drive, a 40GB Hitachi Travelstar 80GB, has a maximum real transfer rate of 
25MB/s and an access time of 12ms.

The reiserfs partition size is 36G with 8.9G used.

I used readahead-watch to create a readahead log during boot on Ubuntu Edgy, 
much like the default configuration with the profile boot option, except that 
it was set to record by access time and I manually killed it after the system 
had fully booted.  With this log used for readahead, the system booted in 2:15 
from grub load to usable desktop (auto login), as measured manually with a 
stopwatch.  After running the relocate script, the boot time with the same 
readahead log was 1:38.  I then reran readahead-watch during boot, set to 
sort by disk order, resulting in a boot time of 1:15.  I booted twice for 
each test to make sure the results were within a few seconds.
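
To put those three stopwatch readings side by side, the ratios can be worked out directly. The snippet below uses only the times quoted above; the variable names are illustrative.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' stopwatch reading to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


baseline = to_seconds("2:15")    # readahead log recorded by access time
relocated = to_seconds("1:38")   # same log, after the relocate script
disk_order = to_seconds("1:15")  # after relocation, log re-recorded in disk order

speedup_relocated = baseline / relocated    # roughly 1.38x
speedup_disk_order = baseline / disk_order  # 1.8x
```

So relocation alone cut about 37 seconds (roughly a 1.4x speedup), and relocation combined with a disk-ordered readahead log cut a full minute (1.8x).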

I also used bootchart, but this didn't measure Gnome start-up and requires a 
bit of ambition to analyze thoroughly.  But it was evident that running the 
relocate script did increase peak disk throughput from 6MB/s to 13MB/s and 
increased the average throughput rate.  But most of boot time is still spent 
waiting on the disk.  My relocate script relocated 310MB of files.  If those 
were perfectly contiguous on disk, this drive should be able to load that in 
under 20s.  Though I expect only a fraction of that is actually accessed 
during boot.
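
The "under 20s" estimate follows from simple division, assuming the 310 figure is in megabytes and using the 25MB/s maximum transfer rate quoted earlier for this drive. The variable names below are illustrative.

```python
relocated_mb = 310       # data moved by the relocate script (assumed MB)
max_rate_mb_s = 25       # drive's quoted maximum real transfer rate
observed_peak_mb_s = 13  # peak throughput seen in bootchart after relocation

# Perfectly sequential read at the drive's maximum rate.
best_case_s = relocated_mb / max_rate_mb_s
# The same amount of data at the peak rate actually observed.
at_observed_peak_s = relocated_mb / observed_peak_mb_s
```

The best case comes to about 12.4 seconds; even at the observed 13MB/s peak it would be just under 24 seconds, well short of the measured boot times, which is consistent with most of the time going to seeks rather than transfer.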

Using 'filefrag', it is evident that the relocate script's attempt to lay the 
files out contiguously was a bit half-assed, but from the boot times it was 
clearly an improvement.

I also used readahead-watch to monitor the files accessed by openoffice writer 
on startup.  The initial cold start time was 17s (about 0.5s variation from 
load to load).  A warm start (a start right after it's closed) was 3.6s.  The 
results from readahead-watch were filtered through a script to remove all 
files that were open when openoffice wasn't running (using fuser).  Running 
the relocate script on some of the X and gnome libraries broke my system 
nicely until a reboot.  After running the relocate script, the cold start time 
became 14s.  When readahead-list was run on the same relocated files before 
starting openoffice, the load time was 6.5s.
sudo sh -c 'echo 1 > /proc/sys/vm/drop_caches' was used to ensure the disk 
was read between runs.

Of course, these results are highly dependent on how fragmented the files 
were before and how effectively the relocation worked.  I expect others 
could reproduce speedups, but how much will vary.  I did these tests on my 
laptop with a slow hard drive so the results would be more evident.

I also did some tests with fresh reiserfs, reiser4, and ext3 on a 100MB 
loopback to see how well the file system would take the hint to order data 
sequentially.  Creating 10 5MB files with sequential names on reiser4 
resulted in one fragment (measured by 'filefrag') for the whole bunch, with 
the single break probably being a disk allocation bitmap: nearly perfect.  
reiserfs generally would end up with 3-4 fragments for the same test.  And 
ext3 didn't appear to make any real attempt to order the files sequentially 
on disk.
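
A test along these lines could be scripted as below. This is a sketch, not the procedure actually used: `create_sequential_files` and `report_fragments` are hypothetical helpers, and the post does not say exactly how filefrag was invoked, so the plain `filefrag <files>` call is an assumption. The directory passed in would be the mount point of the loopback file system under test.

```python
import os
import shutil
import subprocess
import tempfile


def create_sequential_files(directory, count=10, size=5 * 1024 * 1024):
    """Create `count` files named 0001, 0002, ... of `size` bytes each."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index in range(1, count + 1):
        path = os.path.join(directory, f"{index:04d}")
        with open(path, "wb") as fh:
            fh.write(os.urandom(size))
            fh.flush()
            os.fsync(fh.fileno())  # make sure blocks are allocated
        paths.append(path)
    return paths


def report_fragments(paths):
    """Return filefrag's output for the files, or None if it is not installed."""
    if shutil.which("filefrag") is None:
        return None
    result = subprocess.run(["filefrag", *paths],
                            capture_output=True, text=True, check=False)
    return result.stdout


# Small dry run in a temporary directory (tiny files, not the real 5MB test).
with tempfile.TemporaryDirectory() as root:
    demo_paths = create_sequential_files(root, count=3, size=4096)
    demo_names = [os.path.basename(p) for p in demo_paths]
    demo_sizes = [os.path.getsize(p) for p in demo_paths]
```

Note that filefrag reports extents per file; judging whether the files sit head to tail on disk additionally requires comparing their physical offsets, for example from `filefrag -v`.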

I have a 29GB reiser4 partition with 21GB used that I have been running for a 
few years now (since sometime before release).  When I ran the same 10 5MB 
file test on it, the total was 1000+ fragments (I didn't bother to count, but 
it was a lot).  But the files were allocated head to tail.  It's a bit scary 
to think the file system can't find a few MB of unallocated region on disk.  
Clearly a repacker would be really nice.

Relocating file data to match pre-measured access patterns can clearly make a 
big performance difference.  Reiser(fs/4) provides an easy mechanism to hint 
at disk order which can be used to measurably improve boot/startup times.  
But, I expect more can be done to achieve better results.  This includes 
better measurements of read 

Re: Relocating files for faster boot/start-up on reiser(fs/4)

2006-09-13 Thread Peter
On Wed, 13 Sep 2006 14:51:39 -0600, Quinn Harris wrote:
 
> Thoughts?


Yes. Why on earth would you do this? Copying the files, then renaming and
hardlinking them, is nothing a sysadmin would ever do. Just by copying you
are allowing reiser to optimize the dir. You're trying to duplicate what a
tree-based design does automatically. Moreover, remember that reiser packs
files into clusters, so you may read more than just your one file from
time to time, which could end up adding time to your test.

If reiser needs speedup it certainly won't be done by renaming files!

JM$0.02

-- 
Peter



Re: Relocating files for faster boot/start-up on reiser(fs/4)

2006-09-13 Thread Quinn Harris
Peter,

I think you misunderstood what and why I was doing this.  Let me try to 
clarify.

My test is far from perfect.  It's merely an exercise to verify the basic idea.

> Just by copying you are allowing reiser to optimize the dir.
Exactly, but I am copying in a way that implicitly suggests what order those 
files will be accessed in.

I was attempting to reorder the data on disk to minimize disk 
seeks with knowledge of the order that data will be accessed.  This was done 
by taking advantage of the way reiser assigns keys to files based on their 
name and its affinity to match key order with block order.  

> You're trying to duplicate what a tree-based design does automatically.
This works because of the tree-based design of reiser.

Reiser must assign each file (item, actually) some key, so why not take 
advantage of knowledge of the order those items will be accessed in?  The 
current key assignment algorithm is a best guess at that given the limited 
information it has (file/directory name).  Remember, key assignment roughly 
translates to on-disk position.

The relocate script can leave the file system in the exact same state from a 
semantic standpoint (what files and directories are there) but relocate the 
data on disk.  Copying those files to single directory with numeric names was 
a kludge to implicitly tell the file system to place those files in a 
specific order and near each other on disk.  The rename step is to switch the 
old unoptimized file position with the new more optimized position.

> Moreover, remember that reiser packs
> files into clusters so that you may read more than just your one file from
> time to time which could end up adding time to your test.
The boot optimization was over 3885 files.  Ideally those files would be 
ordered head to tail in a sequence that perfectly matches the order they will 
be read.  As a result multiple items in a node will all need to be read at 
nearly the same time.  That didn't happen in my test, but it was much closer 
to that after I ran the relocate script than before.  Hence the performance 
improvement.  With this script, reiser4 and a repacker I have reason to 
believe the ordering will be nearly perfect.  Of course, that is excluding 
random access patterns inside the same file and the directory data needed to 
get at the files.

This basic technique can be made into a boot script much like the readahead 
script already in Ubuntu, just improved.  Boot once with a profile option, it 
measures read patterns (it already does this), then reorders data on disk with 
this trick, or maybe something better.  Then the next time you boot, it's 
1.5-2x faster.  Better yet, include this profile information in the distro 
packages.  When a package is installed, this info is used to help assign item 
keys, resulting in a better disk layout and faster boot times, with no weird 
file copy rename mumbo jumbo.

I bring this up here because I expect that with reiser4, a repacker, and this 
trick, reiser4 could deliver at least 50% better reproducible real world boot 
and app load performance than any other file system.  At least until other 
file systems implement something similar, like what MS did with XP.  Can 
something similar be done (or has it been) on ext(2/3/4), XFS, JFS or other 
linux file systems?

Windows XP boots much faster than Windows 2000 in part because it does what I 
am talking about.  File access is recorded at boot, then the disk is 
defragmented with this knowledge.  Check out
http://msdn.microsoft.com/msdnmag/issues/01/12/xpkernel/default.aspx
under Prefetch.

Also look at http://kerneltrap.org/node/2157

MS's implementation required implementing a defrag utility with a specific 
feature that could position disk data based on access logs.  Reiser4 can do 
the same thing as part of its basic functionality with the addition of a much 
much simpler tool to help assign keys based on that access log.  Then a 
repacker (when it devaporizes) can further optimize for that access pattern 
without any code specific to that purpose.  Seems like good orthogonal design 
to me.

Hope that clarifies.  Like my previous post, whatever it did, it did it in way 
too many words.



-- 
Quinn Harris