Re: Reiser4 status: benchmarked vs. V3 (and ext3)
On Thu, 2003-08-14 at 06:04, Yury Umanets wrote:
> Yes, you are right. Device driver cannot take care about leveling.

The hardware device driver doesn't. The 'translation layer' does, in the case where you are using a traditional block-based file system. If you consider the translation layer and the underlying raw hardware driver together to form the 'device driver' from the filesystem's perspective and in the context of the above sentence, then you're incorrect -- it can, and in general it _does_ take care of wear levelling.

> It is able only to take care about simple caching (one erase block) in order to make wear out smaller and do not read/write whole block if one sector should be written.

Whatever meaning of 'device driver' you meant to use -- no. The raw hardware driver provides only raw read/write/erase functionality; no caching is appropriate. The optional translation layer which simulates a block device provides far more than simple caching -- it provides wear levelling, bad block management, etc. All using a standard layout on the flash hardware for portability. (Except in the special case of the 'mtdblock' translation layer, which is not suitable for anything but read-only operation on devices without any bad blocks to be worked around.)

> Part of a filesystem called block allocator should take care about leveling.

That's insufficient. In a traditional file system, blocks get overwritten without being freed and reallocated -- the allocator isn't always involved.

If you want to teach a file system about flash and wear levelling, you end up ditching the pretence that it's a block device entirely and working directly with the flash hardware driver. Either that or use a translation layer which does it _all_ for the file system and then just use a standard file system on that simulated block device. Between those two extremes, very little actually makes sense.

If you introduce the gratuitous extra 'block device' abstraction layer which doesn't really fit the reality of flash hardware very well at all, you end up wanting to violate the layering in so many ways that you realise you really shouldn't have been pretending to be a block device in the first place.

-- 
dwmw2
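The difference between "simple caching" and what a translation layer does can be illustrated with a toy model. The sketch below is not the layout or algorithm of any real translation layer: the class name `ToyTranslationLayer` and the one-logical-block-per-erase-block simplification are assumptions made purely for illustration. It only shows the idea that every overwrite of a logical block is redirected to the least-worn free erase block, so wear is spread even though no filesystem allocator is ever consulted.

```python
class ToyTranslationLayer:
    """Hypothetical, simplified model: one logical block per erase block."""

    def __init__(self, physical_blocks):
        self.erase_count = [0] * physical_blocks   # wear per physical block
        self.data = [None] * physical_blocks       # contents of each physical block
        self.mapping = {}                          # logical block -> physical block
        self.free = set(range(physical_blocks))    # erased blocks ready for writing

    def write(self, logical, payload):
        # Pick the least-worn free block (ties broken by lowest index).
        target = min(self.free, key=lambda b: (self.erase_count[b], b))
        self.free.remove(target)
        self.data[target] = payload
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        if old is not None:
            # The stale copy is erased and returned to the free pool.
            self.data[old] = None
            self.erase_count[old] += 1
            self.free.add(old)

    def read(self, logical):
        return self.data[self.mapping[logical]]


ftl = ToyTranslationLayer(physical_blocks=4)
for i in range(9):
    ftl.write(0, f"version {i}")   # overwrite the same logical block repeatedly

print(ftl.read(0), ftl.erase_count)
```

In this toy run the same logical block is overwritten nine times, yet the resulting eight erases are shared between all four physical blocks instead of landing on one.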
Re: r4 v. ext3, quick speed vs. cpu experiments
How much memory do you have? How big is mozilla-1.5a.tar? Did you include 'sync' in the tests? It seems the reiser4 numbers are mostly in-memory operations, with not all data flushed to disk, while this is apparently not true for ext3.

BTW, XFS numbers would be interesting too, perhaps more so; ext[23] is pretty outdated.

BTW, from your numbers it seems ext3 gives better overall performance.

	Szaka

On Tue, 5 Aug 2003, Grant Miner wrote:
> mozilla-1.5a.tar is mozilla 1.5alpha source tar, uncompressed. Partition mkfs.ext3 or mkfs.reiser4 --keys=SHORT is run before each run. Linux is 2.6.0-test2.
>
> untar mozilla-1.5a.tar (file is on a reiser3 partition):
>   ext3:    17.64s      28% cpu
>   reiser4: 10.79s      67% cpu
>   sum: reiser4 0.61x time, 2.39x cpu
>
> cp -a mozilla-src mozilla-src-copy, same partition:
>   ext3:    0:56.35sec  11% cpu
>   reiser4: 0:16.50     55% cpu
>   sum: reiser4 0.29x time, 5x cpu
>
> tar c mozilla-src mozilla.tar, same partition:
>   ext3:    0:36.47sec  10% cpu
>   reiser4: 0:16.90sec  25% cpu
>   sum: reiser4 0.46x time, 2.5x cpu
>
> i'm impressed!
Re: ReiserFS problems
Hello!

On Wed, Aug 06, 2003 at 08:22:52PM +0200, Rogier Wolff wrote:
> Only list the file/directory that's being worked upon when explicitly requested. When not explicitly requested, set an alarm handler to print it every second (or so). Lots of time is now spent in writing to

I think we already do something like this. Vitaly should know exact details.

Bye,
    Oleg
Re: Filesystem corruption
Hello!

On Thu, Aug 14, 2003 at 12:05:28AM +0800, Locke wrote:
> the files. I'm guessing the reason why it recovered so little was because I was running a 7.8GB+40GB LVM and the 40GB physical volume wasn't working and left it with only 7.8GB.

Yes, of course.

> is_tree_node: node level 0 does not match to the expected one 1
> vs-5150: search_by_key: invalid format found in block 8838461. Fsck?

So LVM substitutes zero-filled blocks for the data if a physical volume is unavailable. Of course reiserfsck happily threw all of those blocks out of the tree.

> And also when rebooting after the corruption I saw several error messages for all drives, hda, hdb and hdg **
>
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

Also you should consider replacing the noisy IDE cable on your primary IDE controller with one that is not noisy. Or just run in a lower UDMA mode.

> **The messages are copied from the FAQ on namesys.com because they looked similar, so I'm not sure if they're exactly the same.

Well, if they are not the same, you'd better write them down on paper.

> Is there anything I can try to recover more data?

You might try to get LVM up again and run reiserfsck --rebuild-tree. Some more stuff will be restored. Though you will still have lots of files' content lost, and there is no way to restore it anymore. Also use reiserfsck 3.6.11.

Bye,
    Oleg
AW: rebuildfs
First I need to copy the failed RAID member, but I assume this will also work with:

dd_rescue /dev/sda /dev/sdb

Then, when I again have a working (of course critical) RAID, I will copy it to the IDE drive.

-Original Message-
From: Vitaly Fertman [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 5 August 2003 18:27
To: Thorsten Mauch; '[EMAIL PROTECTED]'
Subject: Re: AW: rebuildfs

On Tuesday 05 August 2003 19:55, Thorsten Mauch wrote:
> my failed HDD is a raid member. Is it possible to use dd_rescue also to copy the raw hdd ?

yes, you can dd_rescue your /dev/rd/c0d0 to /dev/hda.

-- 
Thanks,
Vitaly Fertman
non-standard journal breaks autodetect
[EMAIL PROTECTED] root]# mkreiserfs -l hosts -s 16386 /dev/sdc4
[EMAIL PROTECTED] root]# mount /dev/sdc4 /mnt/
mount: you must specify the filesystem type
[EMAIL PROTECTED] root]# mount -t reiserfs /dev/sdc4 /mnt/
[EMAIL PROTECTED] root]#

-- 
Tom Vier [EMAIL PROTECTED]
DSA Key ID 0xE6CB97DA
Re: Filesystem Tests
Mike Fedyk [EMAIL PROTECTED] wrote:
> On Wed, Aug 06, 2003 at 06:34:10PM +0200, Diego Calleja García wrote:
>> On Wed, 06 Aug 2003 18:06:37 +0400, Hans Reiser [EMAIL PROTECTED] wrote:
>>> I don't think ext2 is a serious option for servers of the sort that Linux specializes in, which is probably why he didn't measure it.
>>
>> Why?
>
> Because if you have a power outage, or a crash, you have to run the filesystem check tools on it or risk damaging it further.
>
> Journaled filesystems have a much smaller chance of having problems after a crash.

Journalled filesystems have a runtime cost, and you're paying that all the time.

If you're going 200 days between crashes on a disk-intensive box then using a journalling fs to save 30 minutes at reboot time just doesn't stack up: you've lost much, much more time than that across the 200 days.

It all depends on what the machine is doing and what your max downtime requirements are.
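The 200-days-versus-30-minutes argument can be put into numbers. The following is only a rough sketch under the message's own framing: the sole benefit counted is the fsck time saved per crash, and the sole cost is a constant fractional slowdown while the machine is up. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def break_even_overhead(days_between_crashes, fsck_minutes):
    """Fraction of run time a journal may cost before it outweighs the fsck it saves."""
    uptime_seconds = days_between_crashes * SECONDS_PER_DAY
    fsck_seconds = fsck_minutes * 60
    return fsck_seconds / uptime_seconds


overhead = break_even_overhead(days_between_crashes=200, fsck_minutes=30)
print(f"break-even journalling overhead: {overhead:.4%}")
```

With these figures the break-even point is roughly 0.01% of run time, so on total time alone almost any measurable journalling cost loses; the case for journalling then rests on the maximum tolerable downtime rather than on the total.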
Re: ReiserFS problems
On Wed, Aug 06, 2003 at 11:43:31AM -0600, Andreas Dilger wrote:
> On Aug 06, 2003 19:18 +0200, Rogier Wolff wrote:
>> later. So we hit control-C on the fsck. That was big mistake. It was only a couple of percent done. All we have to do now is run it again, and let it continue.
>
> From a user-safety point-of-view, you should use tty() to see if the program is running interactively, and then trap CTRL-C and have it print a warning in the signal handler that pressing CTRL-C again in the next second will kill it. All you need then is to call time() and save it in a static, and if the signal handler is called more than once in the same second only then exit.

No. The warning should not be that pressing control-C again will kill the program, but that interrupting a rebuild-tree will make your filesystem unmountable, and that pressing control-C again will interrupt the running rebuild-tree.

			Roger.

-- 
+-- Rogier Wolff -- www.harddisk-recovery.nl -- 0800 220 20 20 --
| Files vanished, data lost, everything gone?!
| Stay calm and contact Harddisk-recovery.nl!
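The "warn on the first Ctrl-C, act only on a quick second one" idea discussed above could look roughly like the sketch below. It is an illustration in Python, not code from reiserfsck: the names `InterruptGuard` and `install` and the warning text are assumptions, the interactivity check uses `isatty()` (the message says `tty()`), and a one-second window on a monotonic clock stands in for comparing `time()` values.

```python
import signal
import sys
import time

# Wording follows the suggestion that the warning should name the consequence.
WARNING = ("Interrupting --rebuild-tree will leave the filesystem unmountable.\n"
           "Press Ctrl-C again within one second to interrupt anyway.\n")


class InterruptGuard:
    """Reports True only when two interrupts arrive within `window` seconds."""

    def __init__(self, window=1.0, clock=time.monotonic):
        self.window = window
        self.clock = clock
        self.last = None   # time of the previous interrupt, if any

    def should_exit(self):
        now = self.clock()
        confirmed = self.last is not None and now - self.last <= self.window
        self.last = now
        return confirmed


def install(guard):
    """Trap SIGINT only when running interactively; returns whether it did."""
    if not sys.stdin.isatty():
        return False

    def handler(signum, frame):
        if guard.should_exit():
            raise KeyboardInterrupt
        sys.stderr.write(WARNING)

    signal.signal(signal.SIGINT, handler)
    return True

# install(InterruptGuard())   # call once before starting the long-running job
```

Keeping the timing decision in a small class separate from the handler makes it possible to exercise the logic with a fake clock, without sending real signals.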
Re: ReiserFS problems
Hello!

On Thu, Aug 07, 2003 at 11:12:27AM -0700, Mike Fedyk wrote:
>> Well. This is actually unfortunate, I agree. In such a case you'd better move your reiserfs images to some other place for the time of reiserfsck --rebuild-tree run. or compress them.
>
> But if there was at any time an uncompressed reiserfs image within the outer reiserfs filesystem you're fscking, won't that screw it up too?

Yes. The fs in file will be completely destroyed. Some stuff from it may appear in outer fs. (possibly in lost+found, no actual file data, just the names and directory structure).

> So you can compress it, but if you uncompress it to work with it, it still fscks fsck... Right? :-/

Yes.

Bye,
    Oleg
Re: r4 v. ext3, quick speed vs. cpu experiments
Szakacsits Szabolcs [EMAIL PROTECTED] writes:
> Yes, if you have enough CPU capacity (aka you don't run anything else, just benchmarking filesystems). Otherwise it seems to be slower. That's what I was referring to.

This has been the situation with reiserfs 3.5/3.6 before, and it got resolved, or so it appears. I don't have ext3-vs-reiserfs3.6 figures at hand, but I'm not aware of CPU bottlenecks in reiserfs3.6 code. Just wait a couple of months until the reiserfs gurus have got their reiserfs4 beast stable and debugged and can focus on tuning.

To a previous post about code size and execution speed: it's not generally true that larger code is also slower. It depends on how that code is arranged. If you have many abstractions, then maybe it's slower. If you have many specialized functions in an otherwise flat profile, it can be a good deal faster than simpler (less complex) code.

-- 
Matthias Andree
Re: Reiser4 and linux 2.6.0
Henning Westerholt writes:
> On Sunday, 10 August 2003 04:02, Tupshin Harper wrote:
>> It would still be wonderful to have a way of getting such patches without going through bk. I requested that a working (complete) patch be made against a recent kernel version (2.6.0-test2 or later at this point) a few weeks ago, and while I got a positive response, I still haven't seen anything. I would think you would want to make this very easy for people who are already going through the effort of testing 2.6 kernels.
>>
>> -Tupshin
>
> Hello List,
>
> I would love to see a patch against a 2.6.0-test kernel too. I don't want to obtain a BitKeeper licence. An anoncvs gateway as an alternative would also be OK ;)
> As a happy reiserfs user, it is hard to read about the various changes in v4 and not be able to test them for yourself.

Snapshot will be done to-day (2003.08.11).

> Henning

Nikita.
Re: r4 v. ext3, quick speed vs. cpu experiments
Grant Miner wrote:
> Szakacsits Szabolcs wrote:
>> How much memory you have? How big is mozilla-1.5a.tar? Did you include 'sync' in the tests? It seems reiser4 numbers are mostly in-memory operations and not all data flushed to disk while this is apparently not true for ext3.
>>
>> BTW, XFS numbers would be also/more interesting, ext[23] is pretty outdated.
>>
>> BTW, from your numbers it seems ext3 gives better overall performance.
>>
>> Szaka
>
> Good suggestion. With ext3, 'sync' adds 10.2 seconds average to total time (others about 1.6 sec). Here is a list of averages, including sync time. Each fs was run 3 times. Note that I did not count sync's cpu % in cpu %.
>
>   xfs:  average 44.3 seconds, 32% cpu
>   ext3: average 44.0 seconds, 27% cpu
>   r4:   average 30.2 seconds, 39% cpu
>
> I have 512MB memory. File tree is about 295 MB. This was just a for-fun test, and it is probably not accurate. I may try better ones later.

Expect the CPU time to drop a lot, because we first got rid of the IO consuming kruft, now we are getting rid of the CPU consuming kruft. That is, expect it to drop up until we ship a compression plugin.

Can you post your numbers on lkml also?

-- 
Hans
Re: Filesystem Tests
> I've never wrote I made my guesses from the CPU percentage alone, you explained correctly why. I encourage you too to calculate yourself how much more CPU time reiser4 needs.

Ok, fair enough :)

-- Jamie
Re: rebuild fs
Oleg Drokin wrote:

Hello!

On Tue, Aug 05, 2003 at 04:56:55PM +0400, Hans Reiser wrote:

rephrase that as, use 3.6.11, if it still fails, tell us, the segfault will at least be fixed regardless of whether fsck has enough data to do its job.

But it was not failing on the IDE drive anyway.

I don't understand the relevance of your statement to mine.

Since transferring the image to IDE made reiserfsck not fail (and it failed on raid5 due to raid errors, I think), your "if it still fails" statement was not adequate.

Even with a broken hard drive, there should be no userspace segfault, or am I wrong?

Current problem is that not everything is restored and some important files were lost. Now, I know that recently we introduced some serious changes in reiserfsck, and now if a block has some slight corruption, it is not immediately discarded, but fsck actually tries to extract some useful data out of it if it thinks this is really a reiserfs metadata block. That's why a newer reiserfsck might achieve better results.

Bye,
    Oleg

-- 
Hans
Re: Filesystem Tests
Hans Reiser wrote:
> reiser4 cpu consumption is still dropping rapidly as others and I find kruft in the code and remove it. Major kruft remains still.

If a file system is getting greater throughput, that means the relevant code is being run more, which means more CPU will be used for the purpose of setting up DMA, etc. That is, if a FS gets twice the throughput, it would not be unreasonable to expect it to use 2x the CPU time.

Furthermore, in order to achieve greater throughput, one has to write more intelligent code. More intelligent code is probably going to require more computation time.

That is to say, if your FS is twice as fast, saying it has a problem purely on the basis that it's using more CPU ignores certain facts and basic logic.

Now, if you can manage to make it twice as fast while NOT increasing the CPU usage, well, then that's brilliant, but the fact that ReiserFS uses more CPU doesn't bother me in the least.
Re: ReiserFS problems
On Aug 06, 2003 19:18 +0200, Rogier Wolff wrote:
> later. So we hit control-C on the fsck. That was big mistake. It was only a couple of percent done. All we have to do now is run it again, and let it continue.

From a user-safety point-of-view, you should use tty() to see if the program is running interactively, and then trap CTRL-C and have it print a warning in the signal handler that pressing CTRL-C again in the next second will kill it. All you need then is to call time() and save it in a static, and if the signal handler is called more than once in the same second only then exit.

Cheers, Andreas
-- 
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/
Re: Reiser4 status: benchmarked vs. V3 (and ext3)
On Sun, 27 Jul 2003, Yury Umanets wrote:
> On Sun, 2003-07-27 at 18:10, Daniel Egger wrote:
>> On Sun, 2003-07-27 at 15:28, Hans Reiser wrote:
>>> or for which a wear leveling block device driver is used (I don't know if one exists for Linux).
>>
>> This is normally done by the filesystem (e.g. JFFS2).
>
> Normally device driver should be concerned about making wear out smaller. It is up to it IMHO.

The driver should do the logical to physical mapping, but the portability vanishes if the filesystem to physical mapping is not the same for all machines and operating systems. For pluggable devices this is important.

The leveling seems to be done by JFFS2 in a portable way, and that's as it should be. If the leveling were in the driver I don't believe even FAT would work.

-- 
bill davidsen [EMAIL PROTECTED]
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
Re: ReiserFS problems
Hello!

On Wed, Aug 06, 2003 at 06:20:55PM +0200, Rogier Wolff wrote:
> Reiserfs messed up our filesystem again (one file gives us permission

And you use what kernel with what patches on what hardware?

> A surface scan needs to read all the datablocks. But an fsck doesn't.

At least that's the normal case. reiserfsck --rebuild-tree is special: it actually reads in all the blocks on the device that are marked as used, to find metadata blocks and connect them to the tree (even if they were previously unconnected). Unlike many other filesystems out there, reiserfs does not have fixed metadata locations, hence we absolutely need this scan.

> later. So we hit control-C on the fsck. That was big mistake.
> But now mounting the filesystem gives us:
>
> ReiserFS version 3.6.25
> reiserfs: checking transaction log (device 09:00) ...
> is_tree_node: node level 0 does not match to the expected one 65534
> vs-5150: search_by_key: invalid format found in block 0. Fsck?
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat data of [1 2 0x0 SD]
> Using r5 hash to sort names
> is_tree_node: node level 0 does not match to the expected one 65534
> vs-5150: search_by_key: invalid format found in block 0. Fsck?
> vs-2140: finish_unfinished: search_by_key returned -2
>
> and fsck without --rebuild-tree gives us that an unfinished --rebuild-tree was in progress. So we've restarted the tree-rebuild.

Yes. Once you run tree-rebuild, you must wait until it is completed. (Documentation update is scheduled just now. But in fact we mention this in our FAQ.)

> Question: If it is reading all datablocks, I'm guessing that it is

All the ones that are marked as occupied in the bitmaps.

> looking for the magics that build up the filesystem. We're a

Yes.

> datarecovery company. We probably don't have any current datarecoveries of people with Reiserfs on their disk. But if we had a disk-image with a valid (or not) Reiserfs on it, would it link that into our filesystem?

Yes, it will. So basically speaking you do not want to run the rebuild-tree operation on an FS that contains files with reiserfs metadata embedded in them in the clear. This is also explained in our FAQ.

> Anyway, when I first started out with Reiserfs, it didn't support 2G files (or was it 4G?) I had to patch the kernel and (irreversibly!) upgrade the on-disk format.

Yes. Linux by itself was not supporting 2G some time ago, and people used patches and changed their on-disk formats even for other filesystems out there.

> We've noticed horrible slowdowns when the filesystem is 90% full. It turns out that when a block group is more than 90% full reiserfs will prefer a different block group. i.e. it is ALWAYS switching block groups when the whole disk is 90% full. Something like that. When we report something like that it's always: Ah, yes, that's an old bug we've fixed it. Use patch.

In fact this is not exactly true: it only switches to another block group if you are creating a new file. Why do you think this is a problem? (Of course I am speaking of 2.4.20+ kernels.)

Bye,
    Oleg
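Why an embedded filesystem image gets linked into the outer tree follows from how the scan is described: every block marked as used is inspected, and anything that looks like metadata is treated as metadata. The toy sketch below only illustrates that reasoning. The block size, the `NODE` marker and the function names are invented for the example; real reiserfs nodes are recognised by their header format, not by a four-byte magic.

```python
BLOCK_SIZE = 16
NODE_MAGIC = b"NODE"   # hypothetical marker standing in for a valid node header


def pad(payload):
    """Pad a payload with zero bytes to exactly one toy block."""
    return payload.ljust(BLOCK_SIZE, b"\0")


def find_candidate_nodes(device, used_bitmap):
    """Return the numbers of used blocks that look like tree nodes."""
    found = []
    for block_no, used in enumerate(used_bitmap):
        if not used:
            continue   # blocks marked free in the bitmap are never read
        block = device[block_no * BLOCK_SIZE:(block_no + 1) * BLOCK_SIZE]
        if block.startswith(NODE_MAGIC):
            found.append(block_no)
    return found


device = b"".join(pad(p) for p in (
    NODE_MAGIC + b" real tree",   # block 0: genuine metadata of the outer fs
    b"plain file data",           # block 1: ordinary file contents
    NODE_MAGIC + b" in image",    # block 2: file contents that are an fs image
    NODE_MAGIC + b" stale",       # block 3: old node in a block now marked free
))
used_bitmap = [True, True, True, False]

candidates = find_candidate_nodes(device, used_bitmap)
print(candidates)
```

Block 2 is plain file data from the outer filesystem's point of view, but the scan cannot tell it apart from block 0, which is the situation the advice about moving or compressing images is meant to avoid.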
Re: can not compile reiser4
I figured out the problem; I forgot to use bk -r get.

- Original Message -
From: Marcelo Pacheco [EMAIL PROTECTED]
To: Jack Byer [EMAIL PROTECTED]
Sent: Sunday, August 10, 2003 8:10 PM
Subject: Re: can not compile reiser4

What I know is I installed bk on my machine, downloaded their 3 bk areas and with that patch I have successfully compiled a reiser4 capable kernel (haven't tested reiser4 functionality yet).

Marcelo

On Sunday 10 August 2003 21:05, Jack Byer wrote:

I don't understand how the patch could be the problem. It doesn't change anything in the fs/reiser4 directory at all. The file that won't compile is fs/reiser4/entd.c, which is the most recent version from bk://bk.namesys.com/bk/reiser4

- Original Message -
From: Marcelo Pacheco [EMAIL PROTECTED]
To: Jack Byer [EMAIL PROTECTED]
Sent: Sunday, August 10, 2003 2:27 PM
Subject: Re: can not compile reiser4

That patch is old and outdated. All you need is in the bk trees, except for the attached small compilation patch that namesys hasn't taken action on yet.

Marcelo

On Sunday 10 August 2003 13:35, Jack Byer wrote:

I'm trying to compile a 2.6.0-test2 kernel with reiser4 on a spare system. I downloaded the latest reiser 4 sources from bitkeeper into the fs directory of a vanilla 2.6.0-test2 tree using the instructions on your web site ( bk clone bk://bk.namesys.com/bk/reiser4)

Then I applied the 2.6.0-test2-reiser4-2.6.0-test2.diff patch from your ftp site. When I try to compile, I get the following error:

CC fs/reiser4/entd.o
In file included from include/asm/hardirq.h:6,
                 from fs/reiser4/debug.h:17,
                 from fs/reiser4/entd.c:5:
include/linux/irq.h:69: warning: size of `irq_desc' is 28672 bytes
fs/reiser4/entd.c: In function `wait_for_flush':
fs/reiser4/entd.c:387: structure has no member named `pressure'
make[2]: *** [fs/reiser4/entd.o] Error 1
make[1]: *** [fs/reiser4] Error 2
make: *** [fs] Error 2

Also, the "size of `irq_desc' is 28672 bytes" warning was printed for every file in the reiser4 directory up to that point.
linux 2.6.0 and reiser4 (patch/bugfix)
Date: 2003-08-02 07:58
From: Pillars.NET [EMAIL PROTECTED]
To: [EMAIL PROTECTED]

Figured out how to use bk to pull in the latest trees from linux.bkbits.net and bk.namesys.com and merge the two.

Tried compiling a linux 2.6.0-test2 kernel with reiser4 built-in (not as a module)

Ran into a compile-time error: undefined reference to __udivdi3, which is described by one LKML author as "somebody is doing a 64-bit integer divide without pulling in the relevant gcc library".

Poked around and found in include/div64.h a helper function called div_long_long_rem which appears to be custom-made for this type of problem.

Here's what I changed to make the compiler happy:

[EMAIL PROTECTED]:/usr/src/linux-2.6.0# diff -u fs/reiser4/plugin/item/ctail.c.orig fs/reiser4/plugin/item/ctail.c
--- ctail.c.orig	2003-08-02 06:53:07.0 -0400
+++ fs/reiser4/plugin/item/ctail.c	2003-08-02 06:41:15.0 -0400
@@ -55,7 +55,8 @@
 cluster_index_by_coord(const coord_t * coord)
 {
     reiser4_key key;
-    return get_key_offset(item_key_by_coord(coord, &key)) / cluster_size_by_coord(coord),&rem;
+    unsigned long rem;
+    return div_long_long_rem(get_key_offset(item_key_by_coord(coord, &key)),cluster_size_by_coord(coord),&rem);
 }
 
 static char *
@@ -764,13 +765,14 @@
 utmost_child_ctail(const coord_t * coord, sideof side, jnode ** child)
 {
     reiser4_key key;
+    long unsigned rem;
 
     assert("edward-257", coord != NULL);
     assert("edward-258", child != NULL);
     assert("edward-259", side == LEFT_SIDE);
     assert("edward-260", item_plugin_by_coord(coord) == item_plugin_by_id(CTAIL_ID));
 
-    if (get_key_offset(&key) != cluster_size_by_coord(coord) * (get_key_offset(&key) / cluster_size_by_coord(coord)))
+    if (get_key_offset(&key) != cluster_size_by_coord(coord) * div_long_long_rem(get_key_offset(&key),cluster_size_by_coord(coord),&rem))
         *child = NULL;
     else
         *child = jlook_lock(current_tree, get_key_objectid(item_key_by_coord(coord, &key)), cluster_index_by_coord(coord));
Re: nfsd-fh: found a name that I didn't expect
Hello!

On Wed, Aug 06, 2003 at 05:00:03PM -0400, John Dalbec wrote:
> I just got an "nfsd-fh: found a name that I didn't expect" yesterday. I'm using a Red Hat 2.4.20 RPM with 2.4.20-pending+data-logging+quota. Should I apply just this patch or both this patch and the iget5_locked_2.4.20 patch?

You only need the patch below. iget5_locked_2.4.20 patch is broken.

Bye,
    Oleg

= fs/reiserfs/inode.c 1.42 vs edited =
--- 1.42/fs/reiserfs/inode.c	Thu Feb 13 15:42:42 2003
+++ edited/fs/reiserfs/inode.c	Thu Feb 20 17:23:24 2003
@@ -20,6 +20,10 @@
 static int reiserfs_get_block (struct inode * inode, long block,
                                struct buffer_head * bh_result, int create);
 
+/* This spinlock guards inode pkey in private part of inode
+   against race between find_actor() vs reiserfs_read_inode2 */
+static spinlock_t keycopy_lock = SPIN_LOCK_UNLOCKED;
+
 void reiserfs_delete_inode (struct inode * inode)
 {
     int jbegin_count = JOURNAL_PER_BALANCE_CNT * 2;
@@ -898,8 +902,9 @@
     bh = PATH_PLAST_BUFFER (path);
     ih = PATH_PITEM_HEAD (path);
 
-
+    spin_lock(&keycopy_lock);
     copy_key (INODE_PKEY (inode), &(ih->ih_key));
+    spin_unlock(&keycopy_lock);
     inode->i_blksize = PAGE_SIZE;
 
     INIT_LIST_HEAD(&inode->u.reiserfs_i.i_prealloc_list) ;
@@ -1220,10 +1225,27 @@
                                 unsigned long inode_no, void *opaque )
 {
     struct reiserfs_iget4_args *args;
+    int retval;
 
     args = opaque;
+    /* We protect against possible parallel init_inode() on another CPU here. */
+    spin_lock(&keycopy_lock);
     /* args is already in CPU order */
-    return le32_to_cpu(INODE_PKEY(inode)->k_dir_id) == args -> objectid;
+    if (le32_to_cpu(INODE_PKEY(inode)->k_dir_id) == args -> objectid)
+        retval = 1;
+    else
+        /* If The key does not match, lets see if we are racing
+           with another iget4, that already progressed so far
+           to reiserfs_read_inode2() and was preempted in
+           call to search_by_key(). The signs of that are:
+           Inode is locked
+           dirid and object id are zero (not yet initialized)*/
+        retval = (inode->i_state & I_LOCK) &&
+                 !INODE_PKEY(inode)->k_dir_id &&
+                 !INODE_PKEY(inode)->k_objectid;
+
+    spin_unlock(&keycopy_lock);
+    return retval;
 }
 
 struct inode * reiserfs_iget (struct super_block * s, const struct cpu_key * key)
Re: Filesystem Tests
On Wed, Aug 06, 2003 at 08:45:14PM +0200, Diego Calleja García wrote:
> On Wed, 6 Aug 2003 11:04:27 -0700, Mike Fedyk [EMAIL PROTECTED] wrote:
>> Journaled filesystems have a much smaller chance of having problems after a crash.
>
> I've had (several) filesystem corruption in a desktop system with (several) journaled filesystems on several disks. (They seem pretty stable these days, though) However I've not had any fs corruption in ext2; ext2 is (from my experience) rock stable.
>
> Personally I'd consider twice the really serious option for a serious server.

I've had corruption caused by hardware, and nothing else. I haven't run into any serious bugs.

But with servers, the larger your filesystem, the longer it will take to fsck. And that is bad for uptime. Period.

I would be running ext2 also if I wasn't running so many test kernels (and they do oops on you), and I've been glad that I didn't have to fsck every time I oopsed (though I do every once in a while, just to make sure).
Re: reiser4 snapshot
On Tue, 2003-08-12 at 11:22, Cyrille Chepelov wrote:
> On Tue, Aug 12, 2003, at 10:05:42AM +0400, Oleg Drokin wrote:
>> Hello!
>
> Hello,
>
>> On Mon, Aug 11, 2003 at 05:32:25PM -0700, Boris Tschirschwitz wrote:
>>> I thought I'd give it a try on 2.6.0-test3-mm1. Even with 'make mrproper' before compiling, I get the following error message: (Is there any interest in such error reports?)
>>
>> Yes, there is.
>
> I have a problem: reiserfs4progs doesn't seem to pay attention to the --prefix when it comes to locating libaal.

--prefix is not the prefix where libraries are looked for. It is the prefix under which the package's libraries and includes will be installed.

> I configured libaal with --prefix=/scratch/riesling/reiser4-inst and installed it there, then tried to configure reiserfs4progs with the same prefix, and it still fails to locate libaal.

You need to let the dynamic linker know that some interesting libraries lie at some location. Edit /etc/ld.so.conf and add the line

/scratch/riesling/reiser4-inst

Or set the env. variable LD_LIBRARY_PATH like the following:

export LD_LIBRARY_PATH=/scratch/riesling/reiser4-inst:$LD_LIBRARY_PATH

> When I force it a little by prepending the call to ./configure with suitable CFLAGS and LDFLAGS, it goes past locating libaal, but chokes on locating aal/aal.h.

This will be fixed. Thanks. A temporary cure is to specify CFLAGS during make:

make CFLAGS=-I/scratch/riesling/reiser4-inst/include/aal

> I'll sure get past that, but it's a little annoying, and might get in the way of distributors (depending on the way they package libaal, ie separately or merged with the main reiserfs4progs package).

libaal is planned to be used with other similar projects too, as it contains useful utilities like device abstraction, etc. So it is better to have it as a separate package. But reiser4progs building may be automated.

> -- Cyrille

-- 
We're flying high, we're watching the world passes by...
Re: Filesystem Tests
On Sat, 9 Aug 2003, Jamie Lokier wrote:
> reiser4 is using approximately twice the CPU percentage, but completes in approximately half the time, therefore it uses about the same amount of CPU time as the others.
>
> Therefore on a loaded system, with a load carefully chosen to make the test CPU bound rather than I/O bound, one could expect reiser4 to complete in approximately the same time as the others, _not_ slowest.

Depends how you define approximation, margins. I dropped them and calculated that reiser4 needs the most CPU time. Hans wrote that it's being worked on.

However, guessing performance on a loaded system, however carefully chosen, from results on an unloaded system is exactly that: a guess, not fact.

> That's why it's misleading to draw conclusions from the CPU percentage alone.

I never wrote that I made my guesses from the CPU percentage alone; you explained correctly why. I encourage you, too, to calculate for yourself how much more CPU time reiser4 needs.

	Szaka
Re: reiser4 snapshot
On Tuesday, 12 August 2003 10:56, Nikita Danilov wrote:
[known issues]
>> 3) I'm also unable to build reiser4 as module:
>> [...]
>> include/linux/irq.h:69: warning: size of `irq_desc' is 28672 bytes
>>   LD [M]  fs/reiser4/reiser4.o
>>   LD      fs/built-in.o
>>   GEN     .version
>>   CHK     include/linux/compile.h
>>   UPD     include/linux/compile.h
>>   CC      init/version.o
>>   LD      init/built-in.o
>>   LD      .tmp_vmlinux1
>> arch/i386/kernel/built-in.o(.data+0x7c0): In function `sys_call_table':
>> : undefined reference to `sys_reiser4'
>> make: *** [.tmp_vmlinux1] Error 1
>>
>> Are these known issues?
>
> Yes. Does it build as module with CONFIG_REISER4_FS_SYSCALL off?
>
> Nikita.

No, it doesn't build with the following options:

CONFIG_REISER4_FS=m
# CONFIG_REISER4_FS_SYSCALL is not set
CONFIG_REISER4_LARGE_KEY=y
# CONFIG_REISER4_CHECK is not set
# CONFIG_REISER4_USE_EFLUSH is not set
# CONFIG_REISER4_BADBLOCKS is not set

** uname -rvmpio
2.6.0-test3-reiserfs4 #4 Tue Aug 12 02:59:22 CEST 2003 i686 AMD Athlon(tm) XP 1900+ AuthenticAMD GNU/Linux
** gcc version 3.2.3 20030422 (Gentoo Linux 1.4 3.2.3-r1, propolice)
** Gentoo 1.4 Stable

Henning
Re: Filesystem Tests
On Tue, 5 Aug 2003, Andrew Morton wrote:
> Solutions to this inaccuracy are to make the test so long-running (ten minutes or more) that the difference is minor, or to include the `sync' in the time measurement.

And/or reduce RAM at kernel boot, etc. Anyway, I also asked for 'sync' yesterday and Grant included some, but not after every test.

I ran the results through some scripts to make them more readable. They indeed show some interesting things ...

(each cell: elapsed seconds, CPU %)

           reiser4    reiserfs        ext3         XFS         JFS
copy     33.39,34%   39.55,32%   39.42,25%   43.50,32%   48.15,20%
sync      1.54, 0%    3.15, 1%    9.05, 0%    2.08, 1%    3.05, 1%
recopy1  31.09,34%   75.15,13%   79.96, 9%  102.37,12%  108.39, 5%
recopy2  33.15,33%   77.62,13%   98.84, 7%  108.00,12%  114.96, 5%
sync      2.89, 3%    3.84, 1%    8.15, 0%    2.40, 2%    3.86, 0%
du        2.05,42%    2.46,21%    3.31,11%    3.73,32%    2.42,17%
delete    7.41,52%    5.22,58%    3.71,39%    8.75,56%   15.33, 7%
tar      52.25,25%   90.83,12%   74.93,13%  157.61, 7%  135.86, 6%
sync      6.77, 2%    4.19, 3%    1.67, 1%    0.95, 1%   38.18, 0%
overall 171.28,30%  302.53,16%  319.71,11%  429.79,13%  470.88, 6%

BTW, zsh has a built-in 'time', so measuring a full operation can easily be done as 'sync; time ( my_test; sync )'

	Szaka
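Since several messages in this thread argue about CPU percentage versus CPU time, the "overall" row above can be converted into approximate CPU seconds by multiplying elapsed time by CPU percentage. This is only a back-of-the-envelope sketch: the percentages in the table are rounded, so the results are indicative at best, and the helper name `cpu_seconds` is made up for the example.

```python
# "overall" row of the table: filesystem -> (elapsed seconds, CPU %)
overall = {
    "reiser4":  (171.28, 30),
    "reiserfs": (302.53, 16),
    "ext3":     (319.71, 11),
    "XFS":      (429.79, 13),
    "JFS":      (470.88, 6),
}


def cpu_seconds(elapsed, cpu_percent):
    """Approximate CPU time consumed: elapsed time scaled by CPU utilisation."""
    return elapsed * cpu_percent / 100


cpu = {fs: cpu_seconds(elapsed, pct) for fs, (elapsed, pct) in overall.items()}
for fs, seconds in cpu.items():
    print(f"{fs:9s}{seconds:6.1f} CPU seconds")
```

On these figures reiser4 comes out at roughly 51 CPU seconds against about 35 for ext3, which is a much smaller gap than the raw percentages (30% against 11%) suggest.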
Re: Filesystem Tests
On Wed, Aug 06, 2003 at 07:37:42PM -0400, Timothy Miller wrote:
> Hans Reiser wrote:
>> reiser4 cpu consumption is still dropping rapidly as others and I find kruft in the code and remove it. Major kruft remains still.
>
> Now, if you can manage to make it twice as fast while NOT increasing the CPU usage, well, then that's brilliant, but the fact that ReiserFS uses more CPU doesn't bother me in the least.

Basically he's saying it's faster and still not at its peak efficiency yet, too.
Re: FS Corruption with VIA MVP3 + UDMA/DMA
Nothing runs on this one ;-) WinXP/2003 will die from registry and unrecoverable NTFS filesystem corruption. Win98 will randomly corrupt driver files eventually leading to an unbootable system, or worse, a completely corrupted filesystem as scandisk happily crosslinks all the files (experienced this several times, just thought it was the hard drives and windows...since the drives would fail a few months later and since I had past experience with a Pentium 166 and HX system running Win95 doing this). Linux fared better, but still would corrupt the filesystem, sometimes leading to an unusable system say if an important library is moved to lost+found during fsck. It was much more reliable than any Windows install and easily repairable. With windows, I had no choice but to re-install (backing up the registry after every boot worked until NTFS would eventually die). I lost a few data and help files under linux, but at this point I backed up all the time anyway (after my first installation was hopelessly mangled). I've tried several PCI tweaks with 2.4 which didn't really seem to cure anything. My powertweak doesn't seem to like the 2.5 series kernels, so I haven't tried that. Not that it seems to matter, the promise controllers have much better throughput anyway even with the same modes and settings in hdparm. I tried all the hdparm combinations of dma modes and other settings with only a slight decrease in the chance of corruption and a corresponding dive in throughput. It worked through 2.5.74, but I finally disabled it for everything except my IDE ZIP drive and stuck in another promise card after concluding that it was just hopelessly broken. It would have been nice if 2.4 would just refuse to use DMA, that way I'd have known about the problem much earlier. I would think with all the stuff in the kernel about the RZ1000, the problems with the MVP3 would be mentioned as well. 
As just a typical end user I couldn't figure out why Linux and reiserfs, which are supposed to be so stable, weren't. At this point I'd already run exhaustive memory, hard drive bad sector, and CPU tests without any failures so I was pretty certain it wasn't a hardware issue. Everyone I knew had crashes with Windows so those didn't surprise me so much. It's a decent computer for web browsing and lets me gauge the performance of my business apps. It's a pretty good low-end target machine now that it doesn't write garbage to my drives. I just think this should be documented in case someone sets up a proxy/firewall machine with this configuration. For the majority of home users, any higher-end machine is probably wasted on such an application. I set up such a system to share my parents' dial-up connection over a wireless network. Of course, it's using an HX chipset and P233MMX so it's rock solid, only needing rebooted when the modem locks up (happened twice since I set it up a year ago). Even though it's running 2.4.18 and my dad likes to reset it rather than CTRL-ALT-DEL when the modem locks up, it has yet to corrupt reiserfs. That's the kind of stability that got me really wondering about my system... Jamie Lokier wrote: insecure wrote: The VP2/97 also had severe problems with DMA. I could never run standard kernels on mine in the 2.0 days, and distro installs would always lock up during installation, although Mandrake 8 seemed reliable so something improved. I had a VIA VPX some time ago. AFAIR it worked fine... I suspect PCI conf tweaks etc could work around this trouble. I'm afraid there won't be much interest in fixing these oldies. For example, I got rid of that board (exchanged for Socket A one) - no way to test fixes :( I found a hdparm command which fixed it, though it wasn't much use during distro installs. It was very pleasant to see Mandrake 8 just work. Fwiw, Windows 95, 98 and NT4 have no problems on the box.
It's now my Internet Explorer 4 test rig :) -- Jamie
Re: Reiser4 and linux 2.6.0
On Sunday, 10 August 2003 04:02, Tupshin Harper wrote: It would still be wonderful to have a way of getting such patches without going through bk. I requested that a working (complete) patch be made against a recent kernel version (2.6.0-test2 or later at this point) a few weeks ago, and while I got a positive response, I still haven't seen anything. I would think you would want to make this very easy for people who are already going through the effort of testing 2.6 kernels. -Tupshin Hello List, I would love to see a patch against a 2.6.0-test kernel too. I don't want to obtain a BitKeeper licence. An anoncvs gateway as an alternative would also be OK ;) As a happy reiserfs user, it is hard to read about the various changes in v4 and not be able to test them yourself. Henning
Re: Filesystem Tests
On Wed, 06 Aug 2003 18:06:37 +0400, Hans Reiser [EMAIL PROTECTED] wrote: I don't think ext2 is a serious option for servers of the sort that Linux specializes in, which is probably why he didn't measure it. Why? reiser4 cpu consumption is still dropping rapidly as others and I find kruft in the code and remove it. Major kruft remains still. Cool.
Re: ReiserFS problems
Rogier Wolff wrote: In fact this is not exactly true, it only switches to other block group if you are creating new file. Why do you think this is a problem? (of course I am speaking of 2.4.20+ kernels). Well we were recovering data into 1G files, but performance of adding a new block was horrible. It was doing this for every block. Either it was doing a fruitless search on every block-add or it was actually adding the block to another block group. Anyway, performance dropped -=*A LOT*=- when this happened. I think you're describing the way it should be, or is now, but there was a bug that caused it to behave differently. Roger. Can you help Oleg investigate this more closely by providing an exact account of what to do to replicate it? Oleg, replicate this and observe what happens. -- Hans
Re: ReiserFS problems
On Thu, Aug 07, 2003 at 05:03:02PM +0400, Hans Reiser wrote: Rogier Wolff wrote: In fact this is not exactly true, it only switches to other block group if you are creating new file. Why do you think this is a problem? (of course I am speaking of 2.4.20+ kernels). Well we were recovering data into 1G files, but performance of adding a new block was horrible. It was doing this for every block. Either it was doing a fruitless search on every block-add or it was actually adding the block to another block group. Anyway, performance dropped -=*A LOT*=- when this happened. I think you're describing the way it should be, or is now, but there was a bug that caused it to behave differently. Can you help Oleg investigate this more closely by providing an exact account of what to do to replicate it? Oleg, replicate this and observe what happens. What part of "we reported it a while back, and you told us it was fixed" don't you understand? Roger. -- +-- Rogier Wolff -- www.harddisk-recovery.nl -- 0800 220 20 20 -- | Files vanished, documents lost, all data gone?! | Stay calm and contact Harddisk-recovery.nl!
Re: can not compile reiser4
I don't understand how the patch could be the problem. It doesn't change anything in the fs/reiser4 directory at all. The file that won't compile is fs/reiser4/entd.c, which is the most recent version from bk://bk.namesys.com/bk/reiser4

- Original Message -
From: Marcelo Pacheco [EMAIL PROTECTED]
To: Jack Byer [EMAIL PROTECTED]
Sent: Sunday, August 10, 2003 2:27 PM
Subject: Re: can not compile reiser4

That patch is old and outdated. All you need is on the bk trees, except for the attached small compilation patch that namesys hasn't taken action on yet. Marcelo

On Sunday 10 August 2003 13:35, Jack Byer wrote:

I'm trying to compile a 2.6.0-test2 kernel with reiser4 on a spare system. I downloaded the latest reiser 4 sources from bitkeeper into the fs directory of a vanilla 2.6.0-test2 tree using the instructions on your web site (bk clone bk://bk.namesys.com/bk/reiser4). Then I applied the 2.6.0-test2-reiser4-2.6.0-test2.diff patch from your ftp site. When I try to compile, I get the following error:

  CC fs/reiser4/entd.o
In file included from include/asm/hardirq.h:6,
                 from fs/reiser4/debug.h:17,
                 from fs/reiser4/entd.c:5:
include/linux/irq.h:69: warning: size of `irq_desc' is 28672 bytes
fs/reiser4/entd.c: In function `wait_for_flush':
fs/reiser4/entd.c:387: structure has no member named `pressure'
make[2]: *** [fs/reiser4/entd.o] Error 1
make[1]: *** [fs/reiser4] Error 2
make: *** [fs] Error 2

Also, the "size of `irq_desc' is 28672 bytes" warning was printed for every file in the reiser4 directory up to that point.

linux 2.6.0 and reiser4 (patch/bugfix)
Date: 2003-08-02 07:58
From: Pillars.NET [EMAIL PROTECTED]
To: [EMAIL PROTECTED]

Figured out how to use bk to pull in the latest trees from linux.bkbits.net and bk.namesys.com and merge the two.
Tried compiling a linux 2.6.0-test2 kernel with reiser4 built-in (not as a module). Ran into a compile-time error: undefined reference to __udivdi3, which is described by one LKML author as "somebody is doing a 64-bit integer divide without pulling in the relevant gcc library". Poked around and found in include/div64.h a helper function called div_long_long_rem which appears to be custom-made for this type of problem. Here's what I changed to make the compiler happy:

[EMAIL PROTECTED]:/usr/src/linux-2.6.0# diff -u fs/reiser4/plugin/item/ctail.c.orig fs/reiser4/plugin/item/ctail.c
--- ctail.c.orig 2003-08-02 06:53:07.0 -0400
+++ fs/reiser4/plugin/item/ctail.c 2003-08-02 06:41:15.0 -0400
@@ -55,7 +55,8 @@ cluster_index_by_coord(const coord_t * coord)
 {
     reiser4_key key;
-    return get_key_offset(item_key_by_coord(coord, key)) / cluster_size_by_coord(coord),rem;
+    unsigned long rem;
+    return div_long_long_rem(get_key_offset(item_key_by_coord(coord, key)),cluster_size_by_coord(coord),rem);
 }
 static char *
@@ -764,13 +765,14 @@ utmost_child_ctail(const coord_t * coord, sideof side, jnode ** child)
 {
     reiser4_key key;
+    long unsigned rem;
     assert(edward-257, coord != NULL);
     assert(edward-258, child != NULL);
     assert(edward-259, side == LEFT_SIDE);
     assert(edward-260, item_plugin_by_coord(coord) == item_plugin_by_id(CTAIL_ID));
-    if (get_key_offset(key) != cluster_size_by_coord(coord) * (get_key_offset(key) / cluster_size_by_coord(coord)))
+    if (get_key_offset(key) != cluster_size_by_coord(coord) * div_long_long_rem(get_key_offset(key),cluster_size_by_coord(coord),rem))
         *child = NULL;
     else
         *child = jlook_lock(current_tree, get_key_objectid(item_key_by_coord(coord, key)), cluster_index_by_coord(coord));
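For readers following the patch above, here is a minimal userspace sketch of the divide-with-remainder pattern it relies on. This is not the kernel's div_long_long_rem: the helper name div64_rem_sketch, its signature, and the two wrappers are made up for illustration. In userspace the compiler's runtime library supplies the 64-bit division; the kernel does not link that library, which is why a dedicated helper is needed there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a divide-with-remainder helper:
 * returns the quotient and stores the remainder through `rem`.
 */
static uint64_t div64_rem_sketch(uint64_t dividend, uint32_t divisor, uint32_t *rem)
{
    assert(divisor != 0 && rem != NULL);
    *rem = (uint32_t)(dividend % divisor);
    return dividend / divisor;
}

/* Index of the cluster that contains byte `offset`. */
static uint64_t cluster_index(uint64_t offset, uint32_t cluster_size)
{
    uint32_t rem;
    return div64_rem_sketch(offset, cluster_size, &rem);
}

/* An offset sits on a cluster boundary exactly when the remainder is zero. */
static int is_cluster_aligned(uint64_t offset, uint32_t cluster_size)
{
    uint32_t rem;
    (void)div64_rem_sketch(offset, cluster_size, &rem);
    return rem == 0;
}
```

The second hunk of the patch tests alignment by multiplying the quotient back and comparing it with the offset; checking the remainder against zero, as is_cluster_aligned does here, expresses the same condition with a single division.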
Re: rebuild fs
in this case (IO error) reiserfsck does abort() which ends up as signal number 5, and core is dumped if this is allowed. Looks pretty much like segfault too. Though a message is printed prior to this that we cannot read some block. Bye, Oleg yuck. vs, complain to vitaly please. It does not look the same as the user gets different messages on the terminal. With hardware problems like IO errors he gets "Aborting", although this can dump the core file also. But what a user should not get even with the broken hardware is "Segmentation fault" messages. And core dumping is what looks really pretty much the same. As some old version of reiserfsck (3.6.3) stopped unexpectedly, Oleg suggested to use the latest one -- 3.6.11 -- which worked ok for now. Regarding IO errors, reiserfsck prints "Block ## cannot be read" before aborting, and the latest versions also suggest checking the hardware. BTW, if there are some bad blocks I would advise to use dd_rescue instead of dd as dd has some problems with bad blocks handling. -- Thanks, Vitaly Fertman
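The difference between an abort and a segfault discussed above can be checked from a parent process. The following is a rough sketch, not anything taken from reiserfsck: the helper terminating_signal and the three small test functions are hypothetical names. It runs a function in a child process and reports which signal, if any, killed it. On Linux, abort() normally raises SIGABRT (conventionally signal 6) and a segmentation fault is SIGSEGV (11); both can leave a core file, which is why they look alike at first glance.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `fn` in a child process. Returns the number of the signal that
 * killed the child, 0 if it exited normally, or -1 on fork/wait failure.
 */
static int terminating_signal(void (*fn)(void))
{
    assert(fn != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit no_core = {0, 0};
        setrlimit(RLIMIT_CORE, &no_core);  /* do not leave core files behind */
        signal(SIGABRT, SIG_DFL);          /* make sure default actions apply */
        signal(SIGSEGV, SIG_DFL);
        fn();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

static void do_abort(void)    { abort(); }          /* what a deliberate bail-out does */
static void do_segfault(void) { raise(SIGSEGV); }   /* simulates a crash */
static void do_nothing(void)  { }
```

A wrapper script or test harness could use the same check to tell a deliberate abort after an I/O error apart from a genuine crash.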
Re: reiser4 snapshot
Hello! On Mon, Aug 11, 2003 at 05:32:25PM -0700, Boris Tschirschwitz wrote: I thought I'd give it a try on 2.6.0-test3-mm1. Even with 'make mrproper' before compiling, I get the following error message: (Is there any interest in such error reports?)

Yes, there is.

bobele linux # make bzImage
  CHK include/linux/version.h
  UPD include/linux/version.h
  Making asm->asm-i386 symlink
  CC scripts/empty.o
  MKELF scripts/elfconfig.h
  HOSTCC scripts/file2alias.o
  HOSTCC scripts/modpost.o
  HOSTLD scripts/modpost
  SPLIT include/linux/autoconf.h -> include/config/*
  CC arch/i386/kernel/asm-offsets.s
  CHK include/asm-i386/asm_offsets.h
  UPD include/asm-i386/asm_offsets.h
  CC init/main.o
In file included from include/linux/unistd.h:9,
                 from init/main.c:18:
include/asm/unistd.h: In function `reiser4':
include/asm/unistd.h:400: error: `__NR_reiser4' undeclared (first use in this function)
include/asm/unistd.h:400: error: (Each undeclared identifier is reported only once
include/asm/unistd.h:400: error: for each function it appears in.)
make[1]: *** [init/main.o] Error 1
make: *** [init] Error 2

Hm, this is strange. __NR_reiser4 is clearly defined in include/asm-i386/unistd.h. Probably you had that part of the patch rejected? Can you please verify? Bye, Oleg
Re: Reiser4 status: benchmarked vs. V3 (and ext3)
On Wed, 2003-08-13 at 21:12, Bill Davidsen wrote: The driver should do the logical to physical mapping, but the portability vanishes if the filesystem to physical mapping is not the same for all machines and operating systems. For pluggable devices this is important. The portability also vanishes if the file system layout is not the same for all machines and operating systems... what's your point? Just like there are standard file systems, there are also standard 'translation layers' -- pseudofilesystems which are used to emulate a hard drive on flash storage -- and some of these are implemented for Linux. Take a PCMCIA flash card (real flash, not CF) with FTL and FAT on it, and it'll work just fine under both Windows and Linux, because they both use the standard FTL and FAT formats. FTL provides the logical-physical mapping and the wear levelling, FAT is just normal FAT. The leveling seems to be done by JFFs2 in a portable way, and that's as it should be. You seem to be very confused here. JFFS2 works on flash directly; nothing's pretending to be a block device. It doesn't seem to be at all relevant to this discussion. JFFS2 does its own wear levelling and flash management, because it works directly on the flash. FAT can't do that -- it needs some other code (like the FTL code) to emulate a normal hard drive for it, providing wear levelling and logical-physical translation for it. See http://www.infradead.org/~dwmw2/mtd-upper-layers.jpeg Wear levelling is not done in the driver -- the driver just drives the flash, and in fact is below the bottom of the diagram since it's largely irrelevant. It just gives you read/write/erase functions for the raw flash. Wear levelling is done either in the file system which works directly on the flash (JFFS2, YAFFS), or in the 'translation layer' which uses the flash to pretend to be a block device (FTL, NFTL, INFTL, SMTL). 
(In the case of the extremely naïve 'mtdblock' translation layer, no translation and no wear levelling is done at all.) If the leveling were in the driver I don't believe even FAT would work. I think that by 'driver' you actually mean the 'translation layer' or the combination of translation layer and underlying hardware driver, in which case you would be incorrect to say that it wouldn't work. That _is_ how it works, portably. -- dwmw2
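To make the role of a translation layer concrete, here is a toy sketch of logical-to-physical mapping with wear levelling. It does not follow the on-flash format of FTL, NFTL or any real layer; the structure toy_ftl, the block counts and the "least-worn free block" policy are all assumptions chosen for brevity. The file system above only ever sees logical block numbers, so it can overwrite the same block repeatedly without involving its allocator, and the layer still spreads the erases.

```c
#include <assert.h>
#include <string.h>

#define PHYS_BLOCKS    8   /* physical erase blocks on the toy flash */
#define LOGICAL_BLOCKS 4   /* blocks shown to the file system; the rest are spares */
#define UNMAPPED     (-1)

struct toy_ftl {
    int map[LOGICAL_BLOCKS];             /* logical -> physical, or UNMAPPED */
    int in_use[PHYS_BLOCKS];             /* 1 if a logical block lives here */
    unsigned erase_count[PHYS_BLOCKS];   /* wear per physical block */
};

static void toy_ftl_init(struct toy_ftl *f)
{
    memset(f, 0, sizeof(*f));
    for (int i = 0; i < LOGICAL_BLOCKS; i++)
        f->map[i] = UNMAPPED;
}

/* Pick the free physical block that has been erased the fewest times. */
static int least_worn_free(const struct toy_ftl *f)
{
    int best = -1;
    for (int p = 0; p < PHYS_BLOCKS; p++) {
        if (f->in_use[p])
            continue;
        if (best < 0 || f->erase_count[p] < f->erase_count[best])
            best = p;
    }
    return best;
}

/*
 * Overwrite a logical block: the new data goes to a fresh physical
 * block, then the stale copy is erased and returned to the free pool.
 * Returns the physical block now holding the data.
 */
static int toy_ftl_write(struct toy_ftl *f, int logical)
{
    assert(logical >= 0 && logical < LOGICAL_BLOCKS);
    int target = least_worn_free(f);
    assert(target >= 0);                 /* spares guarantee a free block */
    int old = f->map[logical];
    f->in_use[target] = 1;
    f->map[logical] = target;
    if (old != UNMAPPED) {
        f->erase_count[old]++;           /* erase the stale copy */
        f->in_use[old] = 0;
    }
    return target;
}
```

Rewriting one logical block many times in this model rotates the data through all eight physical blocks, whereas an in-place scheme with no translation would erase the same physical block every time.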
Re: Reiser4 status: benchmarked vs. V3 (and ext3)
On Thu, 2003-08-14 at 00:12, Bill Davidsen wrote: On Sun, 27 Jul 2003, Yury Umanets wrote: On Sun, 2003-07-27 at 18:10, Daniel Egger wrote: On Sun, 2003-07-27 at 15:28, Hans Reiser wrote: or for which a wear leveling block device driver is used (I don't know if one exists for Linux). This is normally done by the filesystem (e.g. JFFS2). Normally device driver should be concerned about making wear out smaller. It is up to it IMHO. The driver should do the logical to physical mapping, but the portability vanishes if the filesystem to physical mapping is not the same for all machines and operating systems. For pluggable devices this is important. The leveling seems to be done by JFFs2 in a portable way, and that's as it should be. If the leveling were in the driver I don't believe even FAT would work. Hello Bill, Yes, you are right. Device driver cannot take care about leveling. It is able only to take care about simple caching (one erase block) in order to make wear out smaller and do not read/write whole block if one sector should be written. Part of a filesystem called block allocator should take care about leveling.
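The "simple caching (one erase block)" idea mentioned above can be sketched as follows. This is a toy model, not the code of any real Linux layer: the names toy_flash and block_cache, the sector and erase-block sizes, and the write-back policy are assumptions. It shows how holding one erase block in RAM lets several sector writes share a single erase. It does no remapping, so every erase for a given sector range still lands on the same physical block, and it therefore reduces wear without levelling it.

```c
#include <assert.h>
#include <string.h>

#define SECTOR_SIZE             512
#define SECTORS_PER_ERASE_BLOCK 8
#define ERASE_BLOCKS            4
#define ERASE_BLOCK_SIZE (SECTOR_SIZE * SECTORS_PER_ERASE_BLOCK)

struct toy_flash {
    unsigned char data[ERASE_BLOCKS][ERASE_BLOCK_SIZE];
    unsigned erase_count[ERASE_BLOCKS];
};

struct block_cache {
    struct toy_flash *flash;
    int cached;                          /* erase block held in RAM, or -1 */
    int dirty;                           /* 1 if buf differs from flash */
    unsigned char buf[ERASE_BLOCK_SIZE];
};

static void cache_init(struct block_cache *c, struct toy_flash *fl)
{
    memset(c, 0, sizeof(*c));
    c->flash = fl;
    c->cached = -1;
}

/* Write the cached erase block back: one erase, then one full program. */
static void cache_flush(struct block_cache *c)
{
    if (c->cached >= 0 && c->dirty) {
        c->flash->erase_count[c->cached]++;
        memcpy(c->flash->data[c->cached], c->buf, ERASE_BLOCK_SIZE);
    }
    c->dirty = 0;
}

/* Update one sector in RAM; flash is only touched when the cache moves. */
static void cache_write_sector(struct block_cache *c, int sector,
                               const unsigned char *src)
{
    assert(sector >= 0 && sector < ERASE_BLOCKS * SECTORS_PER_ERASE_BLOCK);
    int blk = sector / SECTORS_PER_ERASE_BLOCK;
    if (blk != c->cached) {
        cache_flush(c);                                      /* evict old block */
        memcpy(c->buf, c->flash->data[blk], ERASE_BLOCK_SIZE); /* read new one */
        c->cached = blk;
    }
    memcpy(c->buf + (sector % SECTORS_PER_ERASE_BLOCK) * SECTOR_SIZE,
           src, SECTOR_SIZE);
    c->dirty = 1;
}
```

Writing all eight sectors of one erase block through this cache costs one erase at flush time; without the cache each sector write would need its own read, erase and rewrite of the whole block.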