Re: Reiser4 status: benchmarked vs. V3 (and ext3)

2003-08-14 Thread David Woodhouse
On Thu, 2003-08-14 at 06:04, Yury Umanets wrote:
 Yes, you are right. Device driver cannot take care about leveling.

The hardware device driver doesn't. The 'translation layer' does, in the
case where you are using a traditional block-based file system. 

If you consider the translation layer and the underlying raw hardware
driver together to form the 'device driver' from the filesystem's
perspective and in the context of the above sentence, then you're
incorrect -- it can, and in general it _does_ take care of wear
levelling.

 It can only take care of simple caching (one erase block), in 
 order to reduce wear and to avoid reading/writing a whole block when a 
 single sector is to be written.

Whatever meaning of 'device driver' you meant to use -- no.

The raw hardware driver provides only raw read/write/erase
functionality; no caching is appropriate. 

The optional translation layer which simulates a block device provides
far more than simple caching -- it provides wear levelling, bad block
management, etc. All using a standard layout on the flash hardware for
portability.

(Except in the special case of the 'mtdblock' translation layer, which
is not suitable for anything but read-only operation on devices without
any bad blocks to be worked around.)

 The part of a filesystem called the block allocator should take care of 
 leveling.

That's insufficient. In a traditional file system, blocks get
overwritten without being freed and reallocated -- the allocator isn't
always involved. 

If you want to teach a file system about flash and wear levelling, you
end up ditching the pretence that it's a block device entirely and
working directly with the flash hardware driver. 

Either that or use a translation layer which does it _all_ for the file
system and then just use a standard file system on that simulated block
device.

Between those two extremes, very little actually makes sense.

If you introduce the gratuitous extra 'block device' abstraction layer
which doesn't really fit the reality of flash hardware very well at all,
you end up wanting to violate the layering in so many ways that you
realise you really shouldn't have been pretending to be a block device
in the first place.

-- 
dwmw2
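
The point above, that a traditional file system overwrites blocks in place
without involving its allocator, can be illustrated with a toy translation
layer. This is only a sketch under stated assumptions: the class name, the
one-sector-per-erase-block simplification and the in-memory bookkeeping are
all hypothetical, and it does not reproduce any real on-flash layout. It also
assumes at least one spare erase block is always available.

```python
class ToyTranslationLayer:
    """Toy block-device emulation over flash (hypothetical, illustration only)."""

    def __init__(self, physical_blocks):
        self.erase_counts = [0] * physical_blocks  # wear per physical erase block
        self.free = set(range(physical_blocks))    # erased blocks ready for writing
        self.mapping = {}                          # logical block -> physical block
        self.contents = {}                         # physical block -> data

    def write(self, logical, data):
        if not self.free:
            raise RuntimeError("no spare erase block; a real layer would garbage-collect")
        # Pick the least-worn free block, even when the file system is
        # merely overwriting a logical block it already owns.
        target = min(self.free, key=lambda p: (self.erase_counts[p], p))
        self.free.remove(target)
        self.contents[target] = data
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        if old is not None:
            # The previous copy is obsolete: erase it and recycle it.
            del self.contents[old]
            self.erase_counts[old] += 1
            self.free.add(old)

    def read(self, logical):
        return self.contents[self.mapping[logical]]


layer = ToyTranslationLayer(physical_blocks=4)
for i in range(100):
    layer.write(0, f"version {i}")  # the file system rewrites one block in place
print(layer.read(0), layer.erase_counts)
```

Although the file system only ever touches logical block 0, the erases end up
spread over all physical blocks, which is something a block allocator alone
cannot arrange.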



Re: r4 v. ext3, quick speed vs. cpu experiments

2003-08-14 Thread Szakacsits Szabolcs

How much memory do you have? How big is mozilla-1.5a.tar? Did you include
'sync' in the tests? It seems the reiser4 numbers are mostly in-memory
operations, with not all data flushed to disk, while this is apparently not
true for ext3. BTW, XFS numbers would also be (more) interesting; ext[23] is
pretty outdated.

BTW, from your numbers it seems ext3 gives better overall performance.

Szaka

On Tue, 5 Aug 2003, Grant Miner wrote:

 mozilla-1.5a.tar is mozilla 1.5alpha source tar, uncompressed.
 Partition mkfs.ext3 or mkfs.reiser4 --keys=SHORT is run before each run.
 Linux is 2.6.0-test2.

 untar mozilla-1.5a.tar (file is on a reiser3 partition):
 ext3: 17.64s 28% cpu
 reiser4: 10.79s 67% cpu
 sum: reiser4 0.61x time, 2.39x cpu

 cp -a mozilla-src mozilla-src-copy, same partition:
 ext3: 0:56.35sec 11% cpu
 reiser4: 0:16.50 55% cpu
 sum: reiser4 0.29x time, 5x cpu

 tar c mozilla-src > mozilla.tar, same partition:
 ext3: 0:36.47sec 10%cpu
 reiser4: 0:16.90sec 25%cpu
 sum: reiser4 0.46x time, 2.5x cpu

 i'm impressed!
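
Since the posted figures give elapsed time and CPU percentage separately, the
CPU time actually consumed is their product. A minimal sketch of that
arithmetic, using only the numbers quoted above (the function name is made up
for illustration, and the runs did not include 'sync', so the results are
rough):

```python
def cpu_seconds(elapsed_s, cpu_percent):
    """CPU time consumed = wall-clock time x CPU utilisation."""
    return elapsed_s * cpu_percent / 100.0


# (elapsed seconds, cpu %) as posted: ext3 first, reiser4 second
runs = {
    "untar": ((17.64, 28), (10.79, 67)),
    "cp -a": ((56.35, 11), (16.50, 55)),
    "tar c": ((36.47, 10), (16.90, 25)),
}

for name, (ext3, reiser4) in runs.items():
    e, r = cpu_seconds(*ext3), cpu_seconds(*reiser4)
    print(f"{name}: ext3 {e:.2f}s cpu, reiser4 {r:.2f}s cpu, ratio {r / e:.2f}x")
```

The "x cpu" figures in the post compare CPU percentages; the ratio printed
here compares CPU seconds, which is the quantity that matters on a box that
has other work to do.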



Re: ReiserFS problems

2003-08-14 Thread Oleg Drokin
Hello!

On Wed, Aug 06, 2003 at 08:22:52PM +0200, Rogier Wolff wrote:

 Only list the file/directory that's being worked upon when explicitly
 requested. When not explicitly requested, set an alarm handler to
 print it every second (or so). Lots of time is now spent in writing to

I think we already do something like this.
Vitaly should know exact details.

Bye,
Oleg


Re: Filesystem corruption

2003-08-14 Thread Oleg Drokin
Hello!

On Thu, Aug 14, 2003 at 12:05:28AM +0800, Locke wrote:
 the files. I'm guessing the reason it recovered so little was 
 that I was running a 7.8GB+40GB LVM and the 40GB 
 physical volume wasn't working, which left it with only 7.8GB.

Yes of course.

 is_tree_node: node level 0 does not match to the expected one 1
 vs-5150: search_by_key: invalid format found in block 8838461. Fsck?

So LVM substitutes zero-filled blocks for the data if a physical volume
is unavailable.
Of course reiserfsck happily threw all of those blocks out of the tree.

 And also when rebooting after the corruption I saw several error 
 messages for all drives, hda, hdb and hdg
 **
 hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
 hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
 hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
 hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

Also you should consider replacing the noisy IDE cable on your primary IDE
controller with a less noisy one. Or just run in a lower UDMA mode.

 **The messages are copied from the FAQ in namesys.com because they 
 looked similar so I'm not sure if they're the exactly same.

Well, if they are not the same, you'd better write them down on paper.

 Is there anything I can try to recover more data?

You might try to get LVM up again and run reiserfsck --rebuild-tree.
Some more stuff will be restored.
You will still have lost the content of lots of files, though, and there is no way
to restore it anymore.
Also use reiserfsck 3.6.11.

Bye,
Oleg


AW: rebuildfs

2003-08-14 Thread Thorsten Mauch
First I need to copy the failed raid member.
But I assume this will also work with:

dd_rescue /dev/sda /dev/sdb

and then, when I again have a working (though of course critical) raid,
I copy it to the IDE drive.


-Original Message-
From: Vitaly Fertman [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 5 August 2003 18:27
To: Thorsten Mauch; '[EMAIL PROTECTED]'
Subject: Re: AW: rebuildfs


On Tuesday 05 August 2003 19:55, Thorsten Mauch wrote:
 my failed HDD is a raid member. Is it possible to use
 dd_rescue also to copy the raw hdd ?

yes, you can dd_rescue your /dev/rd/c0d0 to /dev/hda.

-- 
Thanks,
Vitaly Fertman


non-standard journal breaks autodetect

2003-08-14 Thread Tom Vier

[EMAIL PROTECTED] root]# mkreiserfs -l hosts -s 16386 /dev/sdc4
[EMAIL PROTECTED] root]# mount /dev/sdc4 /mnt/
mount: you must specify the filesystem type
[EMAIL PROTECTED] root]# mount -t reiserfs /dev/sdc4 /mnt/
[EMAIL PROTECTED] root]# 

-- 
Tom Vier [EMAIL PROTECTED]
DSA Key ID 0xE6CB97DA


Re: Filesystem Tests

2003-08-14 Thread Andrew Morton
Mike Fedyk [EMAIL PROTECTED] wrote:

 On Wed, Aug 06, 2003 at 06:34:10PM +0200, Diego Calleja García wrote:
   On Wed, 06 Aug 2003 18:06:37 +0400, Hans Reiser [EMAIL PROTECTED] wrote:
   
I don't think ext2 is a serious option for servers of the sort that 
Linux specializes in, which is probably why he didn't measure it.
   
   Why?
 
  Because if you have a power outage, or a crash, you have to run the
  filesystem check tools on it or risk damaging it further.
 
  Journaled filesystems have a much smaller chance of having problems after a
  crash.

Journalled filesystems have a runtime cost, and you're paying that all the
time.

If you're going 200 days between crashes on a disk-intensive box then using
a journalling fs to save 30 minutes at reboot time just doesn't stack up:
you've lost much, much more time than that across the 200 days.

It all depends on what the machine is doing and what your max downtime
requirements are.
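
The trade-off above can be put into numbers. A rough sketch, assuming the box
is disk-bound for the whole interval so that any journalling overhead
translates directly into lost time (the function name is hypothetical; the
200 days and 30 minutes are the figures from the message):

```python
def break_even_overhead(days_between_crashes, fsck_minutes_saved):
    """Fraction of running time a journal may cost before it outweighs the fsck it avoids."""
    running_minutes = days_between_crashes * 24 * 60
    return fsck_minutes_saved / running_minutes


overhead = break_even_overhead(days_between_crashes=200, fsck_minutes_saved=30)
print(f"break-even journalling overhead: {overhead:.4%}")
```

With these inputs the break-even point is roughly one hundredth of a percent,
so under this assumption any measurable journalling slowdown costs more than
the fsck it saves. Whether that matters depends, as stated, on the maximum
downtime the machine can tolerate.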


Re: ReiserFS problems

2003-08-14 Thread Rogier Wolff
On Wed, Aug 06, 2003 at 11:43:31AM -0600, Andreas Dilger wrote:
 On Aug 06, 2003  19:18 +0200, Rogier Wolff wrote:
later. So we hit control-C on the fsck.
   
   That was big mistake.
  
  It was only a couple of percent done. All we have to do now is run it
  again, and let it continue.
 
  From a user-safety point-of-view, you should use tty() to see if
  the program  is running interactively, and then trap CTRL-C and
  have it print a warning in  the signal handler that pressing
  CTRL-C again in the next second will kill it.   All you need then
  is to call time() and save it in a static, and if the  signal
  handler is called more than once in the same second only then exit.

No. The warning should not be that pressing control-C again will kill
the program, but that interrupting a rebuild-tree will make your
filesystem unmountable, and that pressing control-C again will
interrupt the running rebuild-tree. 

Roger. 

-- 
+-- Rogier Wolff -- www.harddisk-recovery.nl -- 0800 220 20 20 --
| Files vanished, files lost, all data gone?!
| Stay calm and contact Harddisk-recovery.nl!
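
The behaviour discussed in this exchange (warn on the first Ctrl-C, interrupt
only on a quick second one, and say what interrupting will do) might look
roughly like the sketch below. It is not reiserfsck code. The class name, the
warning wording and the use of `sys.stdin.isatty()` for the interactivity
check are assumptions, and a configurable window on a monotonic clock stands
in for the "same second according to time()" test.

```python
import signal
import sys
import time


class InterruptGuard:
    """First Ctrl-C warns; a second one within `window` seconds really interrupts."""

    WARNING = ("Interrupting a rebuild-tree will leave the filesystem unmountable.\n"
               "Press Ctrl-C again within {window:g}s to interrupt anyway.\n")

    def __init__(self, window=1.0, clock=time.monotonic, out=sys.stderr):
        self.window = window
        self.clock = clock
        self.out = out
        self.last = None  # time of the previous Ctrl-C, if any

    def should_exit(self):
        """Record one Ctrl-C; return True only if it confirms an earlier one."""
        now = self.clock()
        confirmed = self.last is not None and now - self.last <= self.window
        self.last = now
        if not confirmed:
            self.out.write(self.WARNING.format(window=self.window))
        return confirmed

    def install(self):
        """Trap SIGINT, but only when running interactively."""
        if not sys.stdin.isatty():
            return False

        def handler(signum, frame):
            if self.should_exit():
                raise KeyboardInterrupt

        signal.signal(signal.SIGINT, handler)
        return True
```

A long-running tool would call `InterruptGuard().install()` once at startup.
Keeping the decision in `should_exit()` separate from the signal plumbing
makes the timing logic testable with a fake clock.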


Re: ReiserFS problems

2003-08-14 Thread Oleg Drokin
Hello!

On Thu, Aug 07, 2003 at 11:12:27AM -0700, Mike Fedyk wrote:
  Well. This is actually unfortunate, I agree. In such a case you'd better
  move your reiserfs images to some other place for the time of reiserfsck 
  --rebuild-tree run.
  or compress them.
 But if there was at any time an uncompressed reiserfs image within the outer
 reiserfs filesystem you're fscking, won't that screw it up too?

Yes.
The fs in file will be completely destroyed.
Some stuff from it may appear in outer fs. (possibly in lost + found,
no actual file data, just the names and directory structure).

 So you can compress it, but if you uncompress it to work with it, it still
 fscks fsck...  Right? :-/

Yes.

Bye,
Oleg


Re: r4 v. ext3, quick speed vs. cpu experiments

2003-08-14 Thread Matthias Andree
Szakacsits Szabolcs [EMAIL PROTECTED] writes:

 Yes, if you have enough CPU capacity (i.e. you don't run anything else, just
 benchmarking filesystems). Otherwise it seems to be slower. That's what I was
 referring to.

This has been the situation with reiserfs 3.5/3.6 before, and it got
resolved, or so it appears. I haven't ext3-vs-reiserfs3.6 figures at
hand, but I'm not aware of CPU bottlenecks in reiserfs3.6 code. Just
wait a couple of months until the reiserfs gurus have got their reiserfs4
beast stable and debugged and can focus on tuning.

To a previous post about code size and execution speed: it's not
generally true that larger code is also slower. It depends how that code
is arranged. If you have many abstractions, then maybe it's slower. If
you have many specialized functions in an otherwise flat profile, it can
be a good deal faster than simpler (less complex) code.

-- 
Matthias Andree


Re: Reiser4 and linux 2.6.0

2003-08-14 Thread Nikita Danilov
Henning Westerholt writes:
  On Sunday, 10 August 2003 04:02, Tupshin Harper wrote:
   It would still be wonderful to have a way of getting such patches
   without going through bk. I requested that a working (complete) patch be
   made against a recent kernel version(2.6.0-test2 or later at this point)
   a few weeks ago, and while I got a positive response, I still haven't seen
   anything. I would think you would want to make this very easy for people
   who are already going through the effort of testing 2.6 kernels.
  
   -Tupshin
  
  Hello List,
  
  I would love to see a patch against a 2.6.0-test kernel too. I don't want to 
  obtain a bitkeeper licence.
  An anoncvs gateway as an alternative would also be ok ;)
  
  As a happy reiserfs user, it is hard to read about the various changes in v4 
  and not be able to test them for yourself. 
  

Snapshot will be done to-day (2003.08.11).

  
  Henning
  

Nikita.


Re: r4 v. ext3, quick speed vs. cpu experiments

2003-08-14 Thread Hans Reiser
Grant Miner wrote:

Szakacsits Szabolcs wrote:

How much memory do you have? How big is mozilla-1.5a.tar? Did you include
'sync' in the tests? It seems the reiser4 numbers are mostly in-memory
operations, with not all data flushed to disk, while this is apparently not
true for ext3. BTW, XFS numbers would also be (more) interesting; ext[23] is
pretty outdated.

BTW, from your numbers it seems ext3 gives better overall performance.

Szaka

Good suggestion.  With ext3, 'sync' adds 10.2 seconds average to total
time (others about 1.6 sec).  Here is a list of averages, including sync
time.  Each fs was run 3 times.  Note that I did not count sync's cpu %
in cpu %.
xfs: average 44.3 seconds, 32% cpu
ext3: average 44.0 seconds, 27% cpu
r4: average 30.2 seconds, 39% cpu
I have 512MB memory.  File tree is about 295 MB.  This was just a
for-fun test, and it is probably not accurate.  I may try better ones later.



Expect the CPU time to drop a lot, because we first got rid of the IO 
consuming kruft, now we are getting rid of the CPU consuming kruft.

That is, expect it to drop up until we ship a compression plugin.

Can you post your numbers on lkml also?

--
Hans



Re: Filesystem Tests

2003-08-14 Thread Jamie Lokier
 I never wrote that I made my guesses from the CPU percentage alone; you
 explained correctly why. I encourage you, too, to calculate for yourself how
 much more CPU time reiser4 needs.

Ok, fair enough :)

-- Jamie


Re: rebuild fs

2003-08-14 Thread Hans Reiser
Oleg Drokin wrote:

Hello!

On Tue, Aug 05, 2003 at 04:56:55PM +0400, Hans Reiser wrote:

 

rephrase that as, use 3.6.11, if it still fails, tell us, the segfault 
will at least be fixed regardless of whether fsck has enough data to do 
its job.
   

But it was not failing on the IDE drive anyway.
 

I don't understand the relevance of your statement to mine.
   

Since transferring the image to IDE made reiserfsck stop failing (and it failed on raid5 due to raid errors,
I think), your "if it still fails" statement was not adequate.

Even with a broken hard drive, there should be no userspace segfault 
or am I wrong?

The current problem is that not everything is restored and some important files were lost.
Now, I know that we recently introduced some serious changes in reiserfsck: now,
if a block has some slight corruption, it is not immediately discarded; instead fsck
actually tries to extract some useful data out of it if it thinks this really is a
reiserfs metadata block.
That's why a newer reiserfsck might achieve better results.
Bye,
   Oleg
 



--
Hans



Re: Filesystem Tests

2003-08-14 Thread Timothy Miller


Hans Reiser wrote:

reiser4 cpu consumption is still dropping rapidly as others and I find 
kruft in the code and remove it.  Major kruft remains still.


If a file system is getting greater throughput, that means the relevant 
code is being run more, which means more CPU will be used for the 
purpose of setting up DMA, etc.  That is, if a FS gets twice the 
throughput, it would not be unreasonable to expect it to use 2x the CPU 
time.

Furthermore, in order to achieve greater throughput, one has to write 
more intelligent code.  More intelligent code is probably going to 
require more computation time.

That is to say, if your FS is twice as fast, saying it has a problem 
purely on the basis that it's using more CPU ignores certain facts and 
basic logic.

Now, if you can manage to make it twice as fast while NOT increasing the 
CPU usage, well, then that's brilliant, but the fact that ReiserFS uses 
more CPU doesn't bother me in the least.



Re: ReiserFS problems

2003-08-14 Thread Andreas Dilger
On Aug 06, 2003  19:18 +0200, Rogier Wolff wrote:
   later. So we hit control-C on the fsck.
  
  That was big mistake.
 
 It was only a couple of percent done. All we have to do now is run it
 again, and let it continue.

 From a user-safety point-of-view, you should use tty() to see if the program
is running interactively, and then trap CTRL-C and have it print a warning in
the signal handler that pressing CTRL-C again in the next second will kill it.
All you need then is to call time() and save it in a static, and if the
signal handler is called more than once in the same second only then exit.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



Re: Reiser4 status: benchmarked vs. V3 (and ext3)

2003-08-14 Thread Bill Davidsen
On Sun, 27 Jul 2003, Yury Umanets wrote:

 On Sun, 2003-07-27 at 18:10, Daniel Egger wrote:
  On Sun, 2003-07-27 at 15:28, Hans Reiser wrote:

   or for which a wear leveling block device driver is used (I don't know
   if one exists for Linux).
  
  This is normally done by the filesystem (e.g. JFFS2).
 
 Normally device driver should be concerned about making wear out
 smaller. It is up to it IMHO.

The driver should do the logical to physical mapping, but the portability
vanishes if the filesystem to physical mapping is not the same for all
machines and operating systems. For pluggable devices this is important.

The leveling seems to be done by JFFS2 in a portable way, and that's as it
should be. If the leveling were in the driver I don't believe even FAT
would work.

-- 
bill davidsen [EMAIL PROTECTED]
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



Re: ReiserFS problems

2003-08-14 Thread Oleg Drokin
Hello!

On Wed, Aug 06, 2003 at 06:20:55PM +0200, Rogier Wolff wrote:

 Reiserfs messed up our filesystem again (one file gives us permission

And you use what kernel with what patches on what hardware?

 A surface scan needs to read all the datablocks. But an fsck
 doesn't. At least that's the normal case.

reiserfsck --rebuild-tree is special, it actually reads in all
the blocks on the device that are marked as used, to find metadata blocks and
connect them to the tree (even if they were previously unconnected).
Unlike many other filesystems out there, reiserfs does not have fixed metadata
locations, hence we absolutely need this scan.

 later. So we hit control-C on the fsck.

That was a big mistake.

 But now mounting the filesystem gives us: 
 ReiserFS version 3.6.25
 reiserfs: checking transaction log (device 09:00) ...
 is_tree_node: node level 0 does not match to the expected one 65534
 vs-5150: search_by_key: invalid format found in block 0. Fsck?
 vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat data of [1 2 0x0 SD]
 Using r5 hash to sort names
 is_tree_node: node level 0 does not match to the expected one 65534
 vs-5150: search_by_key: invalid format found in block 0. Fsck?
 vs-2140: finish_unfinished: search_by_key returned -2
 and fsck without --rebuild-tree gives us that an unfinished
 --rebuild-tree was in progress. So we've restarted the tree-rebuild.

Yes. Once you run tree-rebuild, you must wait until it is completed.
(A documentation update has just been scheduled, but in fact we already mention this in our FAQ.)

 Question: If it is reading all datablocks, I'm guessing that it is

All the ones that are marked as occupied in the bitmaps.

 looking for the magics that build up the filesystem. We're a

Yes.

 datarecovery company. We probably don't have any current
 datarecoveries of people with Reiserfs on their disk. But if we had a
 disk-image with a valid (or not) Reiserfs on it, would it link that
 into our filesystem?

yes it will.
So, basically, you do not want to run the rebuild-tree operation on an
FS that contains files with reiserfs metadata embedded in them in the clear.
This is also explained in our FAQ.

 Anyway, when I first started out with Reiserfs, it didn't support >2G
 files (or was it 4G?). I had to patch the kernel and (irreversibly!) 
 upgrade the on-disk format. 

Yes. Linux itself did not support >2G files some time ago, and people used patches
and changed their on-disk formats even for other filesystems out there.

 We've noticed horrible slowdowns when the filesystem is >90% full. It
 turns out that when a block group is more than 90% full reiserfs will
 prefer a different block group, i.e. it is ALWAYS switching block
 groups when the whole disk is >90% full. Something like that. When we
 report something like that it's always: "Ah, yes, that's an old bug,
 we've fixed it. Use patch."

In fact this is not exactly true: it only switches to another block group if
you are creating a new file. Why do you think this is a problem?
(Of course I am speaking of 2.4.20+ kernels.)

Bye,
Oleg


Re: can not compile reiser4

2003-08-14 Thread Jack Byer
I figured out the problem; I forgot to use bk -r get.

- Original Message - 
From: Marcelo Pacheco [EMAIL PROTECTED]
To: Jack Byer [EMAIL PROTECTED]
Sent: Sunday, August 10, 2003 8:10 PM
Subject: Re: can not compile reiser4


 What I know is I installed bk on my machine, downloaded their 3 bk areas and
 with that patch I have successfully compiled a reiser4-capable kernel (haven't
 tested reiser4 functionality yet).

 Marcelo

 On Sunday 10 August 2003 21:05, Jack Byer wrote:
  I don't understand how the patch could be the problem. It doesn't change
  anything in the fs/reiser4 directory at all. The file that won't compile is
  fs/reiser4/entd.c, which is the most recent version from
  bk://bk.namesys.com/bk/reiser4
 
  - Original Message -
  From: Marcelo Pacheco [EMAIL PROTECTED]
  To: Jack Byer [EMAIL PROTECTED]
  Sent: Sunday, August 10, 2003 2:27 PM
  Subject: Re: can not compile reiser4
 
   That patch is old and outdated.
   All you need is on the bk trees, except for the attached small compilation
   patch that namesys hasn't taken action on yet.
 
   Marcelo
  
   On Sunday 10 August 2003 13:35, Jack Byer wrote:
I'm trying to compile a 2.6.0-test2 kernel with reiser4 on a spare system.
I downloaded the latest reiser 4 sources from bitkeeper into the fs
directory of a vanilla 2.6.0-test2 tree using the instructions on your web
site ( bk clone bk://bk.namesys.com/bk/reiser4)
Then I applied the 2.6.0-test2-reiser4-2.6.0-test2.diff patch from your ftp
site.
When I try to compile, I get the following error:
   
  CC  fs/reiser4/entd.o
In file included from include/asm/hardirq.h:6,
 from fs/reiser4/debug.h:17,
 from fs/reiser4/entd.c:5:
include/linux/irq.h:69: warning: size of `irq_desc' is 28672 bytes
fs/reiser4/entd.c: In function `wait_for_flush':
fs/reiser4/entd.c:387: structure has no member named `pressure'
make[2]: *** [fs/reiser4/entd.o] Error 1
make[1]: *** [fs/reiser4] Error 2
make: *** [fs] Error 2
   
Also, the "size of `irq_desc' is 28672 bytes" warning was printed for every
file in the reiser4 directory up to that point.
  
   linux 2.6.0 and reiser4 (patch/bugfix)
   Date: 2003-08-02 07:58
   From: Pillars.NET [EMAIL PROTECTED]
   To: [EMAIL PROTECTED]
  
   Figured out how to use bk to pull in the latest trees from
   linux.bkbits.net and bk.namesys.com and merge the two.
  
   Tried compiling a linux 2.6.0-test2 kernel with reiser4 built-in (not
   as a module)
  
   Ran into a compile-time error: "undefined reference to __udivdi3",
   which is described by one LKML author as "somebody is doing a 64-bit
   integer divide without pulling in the relevant gcc library".
  
   Poked around and found in include/div64.h a helper function called
   div_long_long_rem which appears to be custom-made for this type of
   problem.
  
   Here's what I changed to make the compiler happy:
  
   [EMAIL PROTECTED]:/usr/src/linux-2.6.0# diff -u fs/reiser4/plugin/item/ctail.c.orig fs/reiser4/plugin/item/ctail.c
   --- ctail.c.orig 2003-08-02 06:53:07.0 -0400
   +++ fs/reiser4/plugin/item/ctail.c 2003-08-02 06:41:15.0 -0400
   @@ -55,7 +55,8 @@
    cluster_index_by_coord(const coord_t * coord)
    {
       reiser4_key  key;
   -   return get_key_offset(item_key_by_coord(coord, &key)) / cluster_size_by_coord(coord),rem;
   +   unsigned long rem;
   +   return div_long_long_rem(get_key_offset(item_key_by_coord(coord, &key)),cluster_size_by_coord(coord),&rem);
    }
   
    static char *
   @@ -764,13 +765,14 @@
    utmost_child_ctail(const coord_t * coord, sideof side, jnode ** child)
    {
       reiser4_key key;
   +   long unsigned rem;
   
       assert("edward-257", coord != NULL);
       assert("edward-258", child != NULL);
       assert("edward-259", side == LEFT_SIDE);
       assert("edward-260", item_plugin_by_coord(coord) == item_plugin_by_id(CTAIL_ID));
   -   if (get_key_offset(&key) != cluster_size_by_coord(coord) * (get_key_offset(&key) / cluster_size_by_coord(coord)))
   +   if (get_key_offset(&key) != cluster_size_by_coord(coord) * div_long_long_rem(get_key_offset(&key),cluster_size_by_coord(coord),&rem))
       *child = NULL;
       else
       *child = jlook_lock(current_tree, get_key_objectid(item_key_by_coord(coord, &key)),
       cluster_index_by_coord(coord));






Re: nfsd-fh: found a name that I didn't expect

2003-08-14 Thread Oleg Drokin
Hello!

On Wed, Aug 06, 2003 at 05:00:03PM -0400, John Dalbec wrote:

 I just got an nfsd-fh: found a name that I didn't expect yesterday. 
 I'm using a Red Hat 2.4.20 RPM with 2.4.20-pending+data-logging+quota.
 Should I apply just this patch or both this patch and the 
 iget5_locked_2.4.20 patch?

You only need the patch below. iget5_locked_2.4.20 patch is broken.

Bye,
Oleg
 = fs/reiserfs/inode.c 1.42 vs edited =
 --- 1.42/fs/reiserfs/inode.c Thu Feb 13 15:42:42 2003
 +++ edited/fs/reiserfs/inode.c   Thu Feb 20 17:23:24 2003
 @@ -20,6 +20,10 @@
  static int reiserfs_get_block (struct inode * inode, long block,
 struct buffer_head * bh_result, int create);
  
 +/* This spinlock guards inode pkey in private part of inode
 +   against race between find_actor() vs reiserfs_read_inode2 */
 +static spinlock_t keycopy_lock = SPIN_LOCK_UNLOCKED;
 +
  void reiserfs_delete_inode (struct inode * inode)
  {
  int jbegin_count = JOURNAL_PER_BALANCE_CNT * 2; 
 @@ -898,8 +902,9 @@
  bh = PATH_PLAST_BUFFER (path);
  ih = PATH_PITEM_HEAD (path);
  
 -
 +spin_lock(&keycopy_lock);
  copy_key (INODE_PKEY (inode), &(ih->ih_key));
 +spin_unlock(&keycopy_lock);
  inode->i_blksize = PAGE_SIZE;
  
  INIT_LIST_HEAD(&inode->u.reiserfs_i.i_prealloc_list) ;
 @@ -1220,10 +1225,27 @@
  unsigned long inode_no, void *opaque )
  {
  struct reiserfs_iget4_args *args;
 +int retval;
  
  args = opaque;
 +/* We protect against possible parallel init_inode() on another CPU here. */
 +spin_lock(&keycopy_lock);
  /* args is already in CPU order */
 -return le32_to_cpu(INODE_PKEY(inode)->k_dir_id) == args -> objectid;
 +if (le32_to_cpu(INODE_PKEY(inode)->k_dir_id) == args -> objectid)
 +retval = 1;
 +else
 +/* If The key does not match, lets see if we are racing
 +   with another iget4, that already progressed so far
 +   to reiserfs_read_inode2() and was preempted in
 +   call to search_by_key(). The signs of that are:
 + Inode is locked
 + dirid and object id are zero (not yet initialized)*/
 +retval = (inode->i_state & I_LOCK) &&
 + !INODE_PKEY(inode)->k_dir_id &&
 + !INODE_PKEY(inode)->k_objectid;
 +
 +spin_unlock(&keycopy_lock);
 +return retval;
  }
  
  struct inode * reiserfs_iget (struct super_block * s, const struct cpu_key * key)
 
 
 


Re: Filesystem Tests

2003-08-14 Thread Mike Fedyk
On Wed, Aug 06, 2003 at 08:45:14PM +0200, Diego Calleja García wrote:
 On Wed, 6 Aug 2003 11:04:27 -0700, Mike Fedyk [EMAIL PROTECTED] wrote:
 
  
  Journaled filesystems have a much smaller chance of having problems after a
  crash.
 
 I've had (several) filesystem corruptions on a desktop system with (several)
 journaled filesystems on several disks. (They seem pretty stable these days,
 though.)
 
 However, I've not had any fs corruption with ext2; ext2 is (in my experience)
 rock stable.
 
 Personally I'd think twice about which is the really serious option for a serious server.

I've had corruption caused by hardware, and nothing else.  I haven't run
into any serious bugs.

But with servers, the larger your filesystem, the longer it will take to
fsck.  And that is bad for uptime.  Period.

I would be running ext2 also if I wasn't running so many test kernels (and
they do oops on you), and I've been glad that I didn't have to fsck every
time I oopsed (though I do every once in a while, just to make sure).


Re: reiser4 snapshot

2003-08-14 Thread Yury Umanets
On Tue, 2003-08-12 at 11:22, Cyrille Chepelov wrote:
 On Tue, Aug 12, 2003, at 10:05:42AM +0400, Oleg Drokin wrote:
  Hello!
  
Hello,
  On Mon, Aug 11, 2003 at 05:32:25PM -0700, Boris Tschirschwitz wrote:
  
   I thought I'd give it a try on 2.6.0-test3-mm1.
   Even with 'make mrproper' before compiling, I get the following error
   message:
   (Is there any interest in such error reports?)
  
  Yes, there is.
 
 I have a problem: reiserfs4progs doesn't seem to pay attention to the
 --prefix when it comes to locating libaal.

--prefix is not the prefix under which libraries are looked for. It is the prefix
under which the package's libraries and includes will be installed.

  I configured libaal with
 --prefix=/scratch/riesling/reiser4-inst and installed it there, then tried
 to configure reiserfs4progs with the same prefix, and it still fails to
 locate libaal.

You need to let the dynamic linker know that some interesting libraries live
at some location.

Edit /etc/ld.so.conf and add the line /scratch/riesling/reiser4-inst there.
Or set the env. variable LD_LIBRARY_PATH like the following:

export LD_LIBRARY_PATH=/scratch/riesling/reiser4-inst:$LD_LIBRARY_PATH


  When I force it a little by prepending the call to
 ./configure with suitable CFLAGS and LDFLAGS, it goes past locating libaal,
 but chokes on locating aal/aal.h.
This will be fixed. Thanks. A temporary cure is to specify CFLAGS during
make:

make CFLAGS=-I/scratch/riesling/reiser4-inst/include/aal

 
 I'll sure get past that, but it's a little annoying, and might get in the
 way of distributors (depending on the way they package libaal, ie separately
 or merged with the main reiserfs4progs package).
libaal is planned to be used with other similar projects too, as it
contains useful utilities like device abstraction, etc.  So, it is
better to have it as a separate package. But reiser4progs building may be
automated. 
 
   -- Cyrille
-- 
We're flying high, we're watching the world passes by...



Re: Filesystem Tests

2003-08-14 Thread Szakacsits Szabolcs

On Sat, 9 Aug 2003, Jamie Lokier wrote:
 reiser4 is using approximately twice the CPU percentage, but completes
 in approximately half the time, therefore it uses about the same
 amount of CPU time as the others.

 Therefore on a loaded system, with a load carefully chosen to make the
 test CPU bound rather than I/O bound, one could expect reiser4 to
 complete in approximately the same time as the others, _not_ slowest.

Depends on how you define approximation and margins. I dropped them and
calculated that reiser4 needs the most CPU time. Hans wrote that it's being worked on.

However, guessing performance on a loaded system, however carefully chosen,
from results on an unloaded system is exactly that: a guess, not a fact.

 That's why it's misleading to draw conclusions from the CPU percentage alone.

I never wrote that I made my guesses from the CPU percentage alone; you
explained correctly why. I encourage you, too, to calculate for yourself how
much more CPU time reiser4 needs.

Szaka



Re: reiser4 snapshot

2003-08-14 Thread Henning Westerholt
On Tuesday, 12 August 2003 10:56, Nikita Danilov wrote:
   [known issues]
   3)
   I'm also unable to build reiser4 as module:
   [...]
   include/linux/irq.h:69: warning: size of `irq_desc' is 28672 bytes
 LD [M]  fs/reiser4/reiser4.o
 LD  fs/built-in.o
 GEN .version
 CHK include/linux/compile.h
 UPD include/linux/compile.h
 CC  init/version.o
 LD  init/built-in.o
 LD  .tmp_vmlinux1
  
   arch/i386/kernel/built-in.o(.data+0x7c0): In function `sys_call_table':
   : undefined reference to `sys_reiser4'
  
   make: *** [.tmp_vmlinux1] Error 1
  
  
   Are these known issues?

 Yes. Does it build as module with CONFIG_REISER4_FS_SYSCALL off?

 Nikita.

No, it doesn't build with the following options:

CONFIG_REISER4_FS=m
# CONFIG_REISER4_FS_SYSCALL is not set
CONFIG_REISER4_LARGE_KEY=y
# CONFIG_REISER4_CHECK is not set
# CONFIG_REISER4_USE_EFLUSH is not set
# CONFIG_REISER4_BADBLOCKS is not set

**
uname -rvmpio
2.6.0-test3-reiserfs4 #4 Tue Aug 12 02:59:22 CEST 2003 i686 AMD Athlon(tm) XP 1900+ AuthenticAMD GNU/Linux
**
gcc version 3.2.3 20030422 (Gentoo Linux 1.4 3.2.3-r1, propolice)
**
Gentoo 1.4 Stable


Henning



Re: Filesystem Tests

2003-08-14 Thread Szakacsits Szabolcs

On Tue, 5 Aug 2003, Andrew Morton wrote:

 Solutions to this inaccuracy are to make the test so long-running (ten
 minutes or more) that the difference is minor, or to include the `sync' in
 the time measurement.

And/or reduce RAM at kernel boot, etc. Anyway, I also asked for 'sync'
yesterday and Grant included some, but not after each test.

I ran the results through some scripts to make them more readable.
It indeed has some interesting things ...

            reiser4   reiserfs       ext3        XFS        JFS
copy      33.39,34%  39.55,32%  39.42,25%  43.50,32%  48.15,20%
sync       1.54, 0%   3.15, 1%   9.05, 0%   2.08, 1%   3.05, 1%
recopy1   31.09,34%  75.15,13%  79.96, 9% 102.37,12% 108.39, 5%
recopy2   33.15,33%  77.62,13%  98.84, 7% 108.00,12% 114.96, 5%
sync       2.89, 3%   3.84, 1%   8.15, 0%   2.40, 2%   3.86, 0%
du         2.05,42%   2.46,21%   3.31,11%   3.73,32%   2.42,17%
delete     7.41,52%   5.22,58%   3.71,39%   8.75,56%  15.33, 7%
tar       52.25,25%  90.83,12%  74.93,13% 157.61, 7% 135.86, 6%
sync       6.77, 2%   4.19, 3%   1.67, 1%   0.95, 1%  38.18, 0%
overall  171.28,30% 302.53,16% 319.71,11% 429.79,13% 470.88, 6%
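
The two numbers in each cell can be combined into an approximate CPU time.
A minimal sketch, assuming the first number is elapsed seconds and the second
is CPU utilisation over that interval (the table itself does not state its
units); the names OVERALL and cpu_seconds are made up for illustration:

```python
# Elapsed time and CPU% taken from the "overall" row of the table above.
OVERALL = {
    "reiser4": (171.28, 30),
    "reiserfs": (302.53, 16),
    "ext3": (319.71, 11),
    "XFS": (429.79, 13),
    "JFS": (470.88, 6),
}


def cpu_seconds(elapsed, cpu_percent):
    """Approximate CPU time, assuming cpu_percent is utilisation over elapsed."""
    return elapsed * cpu_percent / 100.0


CPU_TIME = {fs: cpu_seconds(t, pct) for fs, (t, pct) in OVERALL.items()}

if __name__ == "__main__":
    # Print the filesystems ordered by estimated CPU time.
    for fs, secs in sorted(CPU_TIME.items(), key=lambda kv: kv[1]):
        print(f"{fs:9s} {secs:6.2f} s CPU")
```

Under those assumptions the overall row works out to roughly 51 s of CPU for
reiser4 and 48 s for reiserfs. The percentages are rounded to whole numbers,
so the products are only good to within a few seconds.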

BTW, zsh has a built-in 'time' so measuring a full operation can be
easily done as 'sync; time ( my_test; sync )'
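
The same pattern (sync first, then time the test together with its trailing
sync) can be sketched outside the shell. This is only an illustration: the
function name timed_with_sync is invented here, it measures wall-clock time
only, and os.sync() is available on Unix with Python 3.3 or later.

```python
import os
import time


def timed_with_sync(workload, sync=os.sync):
    """Run workload() and return elapsed seconds, including the final sync."""
    sync()                       # start clean, like the leading 'sync;'
    start = time.perf_counter()
    workload()
    sync()                       # charge the flush of dirty data to the test
    return time.perf_counter() - start
```

The sync parameter exists only so that the flush can be replaced in a dry
run; in a real measurement the default should be left alone.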

Szaka



Re: Filesystem Tests

2003-08-14 Thread Mike Fedyk
On Wed, Aug 06, 2003 at 07:37:42PM -0400, Timothy Miller wrote:
 
 
 Hans Reiser wrote:
 
 reiser4 cpu consumption is still dropping rapidly as others and I find 
 kruft in the code and remove it.  Major kruft remains still.

 Now, if you can manage to make it twice as fast while NOT increasing the 
 CPU usage, well, then that's brilliant, but the fact that ReiserFS uses 
 more CPU doesn't bother me in the least.

Basically he's saying it's faster, and still not at its peak efficiency yet
either.


Re: FS Corruption with VIA MVP3 + UDMA/DMA

2003-08-14 Thread Wes Janzen
Nothing runs on this one ;-)

WinXP/2003 will die from registry and unrecoverable NTFS filesystem 
corruption.  Win98 will randomly corrupt driver files eventually leading 
to an unbootable system, or worse, a completely corrupted filesystem as 
scandisk happily crosslinks all the files (experienced this several 
times, just thought it was the hard drives and windows...since the 
drives would fail a few months later and since I had past experience 
with a Pentium 166 and HX system running Win95 doing this).

Linux fared better, but still would corrupt the filesystem, sometimes 
leading to an unusable system say if an important library is moved to 
lost+found during fsck.  It was much more reliable than any Windows 
install and easily repairable.  With windows, I had no choice but to 
re-install (backing up the registry after every boot worked until NTFS 
would eventually die).  I lost a few data and help files under linux, 
but at this point I backed up all the time anyway (after my first 
installation was hopelessly mangled).

I've tried several PCI tweaks with 2.4 which didn't really seem to cure 
anything.  My powertweak doesn't seem to like the 2.5 series kernels, so 
I haven't tried that.  Not that it seems to matter, the promise 
controllers have much better throughput anyway even with the same modes 
and settings in hdparm.  I tried all the hdparm combinations of dma 
modes and other settings with only a slight decrease in the chance of 
corruption and a corresponding dive in throughput.  It worked through 
2.5.74, but I finally disabled it for everything except my IDE ZIP drive 
and stuck in another promise card after concluding that it was just 
hopelessly broken.

It would have been nice if 2.4 would just refuse to use DMA, that way 
I'd have known about the problem much earlier.  I would think with all 
the stuff in the kernel about the RZ1000, the problems with the MVP3 
would be mentioned as well.  As just a typical end user I couldn't 
figure out why Linux and reiserfs, which are supposed to be so stable, 
weren't.  At this point I'd already run exhaustive memory, 
hard drive bad sector, and CPU tests without any failures so I was 
pretty certain it wasn't a hardware issue.  Everyone I knew had crashes 
with Windows so those didn't surprise me so much.

It's a decent computer for web browsing and lets me gauge the 
performance of my business apps.  It's a pretty good low-end target 
machine now that it doesn't write garbage to my drives.

I just think this should be documented in case someone sets up a 
proxy/firewall machine with this configuration.  For the majority of 
home users, any higher-end machine is probably wasted on such an 
application.  I set up such a system to share my parents' dial-up 
connection over a wireless network.  Of course, it's using an HX chipset 
and P233MMX so it's rock solid, only needing rebooted when the modem 
locks up (happened twice since I set it up a year ago).  Even though 
it's running 2.4.18 and my dad likes to reset it rather than 
CTRL-ALT-DEL when the modem locks up, it has yet to corrupt reiserfs. 

That's the kind of stability that got me really wondering about my system...

Jamie Lokier wrote:

insecure wrote:
 

The VP2/97 also had severe problems with DMA.  I could never run
standard kernels on mine in the 2.0 days, and distro installs would
always lock up during installation, although Mandrake 8 seemed
reliable so something improved.
 

I had a VIA VPX sometime ago. AFAIR it worked fine...

I suspect PCI conf tweaks etc could work around
this trouble. I'm afraid there won't be much interest
in fixing these oldies. For example, I got rid of that
board (exchanged for Socket A one) - no way to test fixes :(
   

I found a hdparm command which fixed it, though it wasn't much use
during distro installs.  It was very pleasant to see Mandrake 8 just
work.  Fwiw, Windows 95, 98 and NT4 have no problems on the box.  It's
now my Internet Explorer 4 test rig :)
-- Jamie

 




Re: Reiser4 and linux 2.6.0

2003-08-14 Thread Henning Westerholt
On Sunday, 10 August 2003 04:02, Tupshin Harper wrote:
 It would still be wonderful to have a way of getting such patches
 without going through bk. I requested that a working (complete) patch be
 made against a recent kernel version(2.6.0-test2 or later at this point)
 a few weeks ago, and while I got a positive response, I still haven't seen
 anything. I would think you would want to make this very easy for people
 who are already going through the effort of testing 2.6 kernels.

 -Tupshin

Hello List,

I would love to see a patch against a 2.6.0-test kernel too. I don't want to 
obtain a BitKeeper licence.
An anoncvs gateway as an alternative would also be OK ;)

As a happy reiserfs user, it is hard to read about the various changes in v4 
and not be able to test them for yourself. 


Henning



Re: Filesystem Tests

2003-08-14 Thread Diego Calleja García
On Wed, 06 Aug 2003 18:06:37 +0400, Hans Reiser [EMAIL PROTECTED] wrote:

 I don't think ext2 is a serious option for servers of the sort that 
 Linux specializes in, which is probably why he didn't measure it.

Why?

 
 reiser4 cpu consumption is still dropping rapidly as others and I find 
 kruft in the code and remove it.  Major kruft remains still.

Cool.


Re: ReiserFS problems

2003-08-14 Thread Hans Reiser
Rogier Wolff wrote:

In fact this is not exactly true, it only switches to other block
group if you are creating new file. Why do you think this is a
problem?  (of course I am speaking of 2.4.20+ kernels).
   

Well we were recovering data into 1G files, but performance of adding
a new block was horrible. It was doing this for every block. Either it
was doing a fruitless search on every block-add or it was actually
adding the block to another block group. Anyway, performance dropped
-=*A LOT*=- when this happened.
I think you're describing the way it should be, or is now, but there
was a bug that caused it to behave differently.
	Roger. 

 

Can you help Oleg investigate this more closely by providing an exact 
account of what to do to replicate it?  Oleg, replicate this and observe 
what happens.

--
Hans



Re: ReiserFS problems

2003-08-14 Thread Rogier Wolff
On Thu, Aug 07, 2003 at 05:03:02PM +0400, Hans Reiser wrote:
 Rogier Wolff wrote:
 
 In fact this is not exactly true, it only switches to other block
 group if you are creating new file. Why do you think this is a
 problem?  (of course I am speaking of 2.4.20+ kernels).

 
 
 Well we were recovering data into 1G files, but performance of adding
 a new block was horrible. It was doing this for every block. Either it
 was doing a fruitless search on every block-add or it was actually
 adding the block to another block group. Anyway, performance dropped
 -=*A LOT*=- when this happened.
 
 I think you're describing the way it should be, or is now, but there
 was a bug that caused it to behave differently.

 Can you help Oleg investigate this more closely by providing an exact 
 account of what to do to replicate it?  Oleg, replicate this and observe 
 what happens.

What part of: we reported it a while back, and you told us it was
fixed don't you understand?

Roger. 

-- 
+-- Rogier Wolff -- www.harddisk-recovery.nl -- 0800 220 20 20 --
| Files vanished, documents lost, all data gone?!
| Stay calm and contact Harddisk-recovery.nl!


Re: can not compile reiser4

2003-08-14 Thread Jack Byer
I don't understand how the patch could be the problem. It doesn't change
anything in the fs/reiser4 directory at all. The file that won't compile is
fs/reiser4/entd.c, which is the most recent version from
bk://bk.namesys.com/bk/reiser4

- Original Message - 
From: Marcelo Pacheco [EMAIL PROTECTED]
To: Jack Byer [EMAIL PROTECTED]
Sent: Sunday, August 10, 2003 2:27 PM
Subject: Re: can not compile reiser4


 That patch is old and outdated.
 All you need is on the bk trees, except for the attached small compilation
 patch that namesys hasn't taken action on yet.

 Marcelo

 On Sunday 10 August 2003 13:35, Jack Byer wrote:
  I'm trying to compile a 2.6.0-test2 kernel with reiser4 on a spare system.
  I downloaded the latest reiser 4 sources from bitkeeper into the fs
  directory of a vanilla 2.6.0-test2 tree using the instructions on your web
  site ( bk clone bk://bk.namesys.com/bk/reiser4)
  Then I applied the 2.6.0-test2-reiser4-2.6.0-test2.diff patch from your ftp
  site.
  When I try to compile, I get the following error:
 
CC  fs/reiser4/entd.o
  In file included from include/asm/hardirq.h:6,
   from fs/reiser4/debug.h:17,
   from fs/reiser4/entd.c:5:
  include/linux/irq.h:69: warning: size of `irq_desc' is 28672 bytes
  fs/reiser4/entd.c: In function `wait_for_flush':
  fs/reiser4/entd.c:387: structure has no member named `pressure'
  make[2]: *** [fs/reiser4/entd.o] Error 1
  make[1]: *** [fs/reiser4] Error 2
  make: *** [fs] Error 2
 
  Also, the "size of `irq_desc' is 28672 bytes" warning was printed for every
  file in the reiser4 directory up to that point.

 linux 2.6.0 and reiser4 (patch/bugfix)
 Date: 2003-08-02 07:58
 From: Pillars.NET [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]

 Figured out how to use bk to pull in the latest trees from
 linux.bkbits.net and bk.namesys.com and merge the two.

 Tried compiling a linux 2.6.0-test2 kernel with reiser4 built-in (not
 as a module)

 Ran into a compile-time error: "undefined reference to _udivdi3",
 which is described by one LKML author as "somebody is doing a 64-bit
 integer divide without pulling in the relevant gcc library."

 Poked around and found in include/div64.h a helper function called
 div_long_long_rem which appears to be custom-made for this type of
 problem.

 Here's what I changed to make the compiler happy:

 [EMAIL PROTECTED]:/usr/src/linux-2.6.0# diff -u
fs/reiser4/plugin/item/ctail.c.orig fs/reiser4/plugin/item/ctail.c
 --- ctail.c.orig2003-08-02 06:53:07.0 -0400
 +++ fs/reiser4/plugin/item/ctail.c  2003-08-02
06:41:15.0 -0400
 @@ -55,7 +55,8 @@
  cluster_index_by_coord(const coord_t * coord)
  {
 reiser4_key  key;
 -   return get_key_offset(item_key_by_coord(coord, key)) /
cluster_size_by_coord(coord),rem;
 +   unsigned long rem;
 +   return div_long_long_rem(get_key_offset(item_key_by_coord(coord,
key)),cluster_size_by_coord(coord),rem);
  }

  static char *
 @@ -764,13 +765,14 @@
  utmost_child_ctail(const coord_t * coord, sideof side, jnode ** child)
  {
 reiser4_key key;
 +   long unsigned rem;

 assert(edward-257, coord != NULL);
 assert(edward-258, child != NULL);
 assert(edward-259, side == LEFT_SIDE);
 assert(edward-260, item_plugin_by_coord(coord) ==
item_plugin_by_id(CTAIL_ID));

 -   if (get_key_offset(key) != cluster_size_by_coord(coord) *
(get_key_offset(key) / cluster_size_by_coord(coord)))
 +   if (get_key_offset(key) != cluster_size_by_coord(coord) *
div_long_long_rem(get_key_offset(key),cluster_size_by_coord(coord),rem))
 *child = NULL;
 else
 *child = jlook_lock(current_tree,
get_key_objectid(item_key_by_coord(coord, key)),
cluster_index_by_coord(coord));
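
The test in the second hunk compares the key offset with
cluster_size * (offset / cluster_size), which differs from the offset exactly
when the offset is not a multiple of the cluster size. A small sketch of that
arithmetic in Python (not kernel code; cluster_index and is_cluster_aligned
are names invented here), using divmod to get quotient and remainder in one
step, which, judging by how the patch uses it, is what div_long_long_rem
provides:

```python
def cluster_index(offset, cluster_size):
    """Quotient of a byte offset by the cluster size."""
    index, _rem = divmod(offset, cluster_size)
    return index


def is_cluster_aligned(offset, cluster_size):
    """True when offset is an exact multiple of cluster_size."""
    index, rem = divmod(offset, cluster_size)
    # Same condition as in the patch: offset == cluster_size * (offset / cluster_size)
    assert (offset == cluster_size * index) == (rem == 0)
    return rem == 0
```

Python integers are arbitrary precision, so the 64-bit division problem that
the patch works around does not arise here; the sketch only shows the
quotient/remainder relationship.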






Re: rebuild fs

2003-08-14 Thread Vitaly Fertman
 in this case (IO error) reiserfsck does abort() which ends up as signal
  number 5, and core is dumped if this is allowed. Looks pretty much like
  segfault too.
 Though a message is printed prior to this that we cannot read some block.
 
 Bye,
 Oleg

 yuck.  vs, complain to vitaly please.

It does not look the same, as the user gets different messages on the terminal.
With hardware problems like IO errors he gets "Aborting", although this can also
dump a core file. But what a user should not get, even with broken hardware, 
is "Segmentation fault" messages. The core dump is the part that really looks 
much the same. 

As some old version of reiserfsck (3.6.3) stopped unexpectedly, Oleg suggested 
to use the latest one -- 3.6.11 -- which worked ok for now.

Regarding IO errors, reiserfsck prints "Block ## cannot be read" before aborting, 
and the latest versions also suggest checking the hardware.

BTW, if there are bad blocks, I would advise using dd_rescue instead of dd,
as dd has some problems with bad block handling.
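
The general idea of a copy that keeps going past bad blocks can be sketched
as follows. This is not dd_rescue's actual algorithm, only an illustration of
the principle; rescue_copy and the read_block callable are hypothetical.

```python
def rescue_copy(read_block, n_blocks, block_size=512):
    """Copy n_blocks, substituting zeros for unreadable blocks.

    read_block(i) is a hypothetical callable that returns block i as bytes
    or raises OSError on a read error.
    """
    image = bytearray()
    bad = []
    for i in range(n_blocks):
        try:
            data = read_block(i)
        except OSError:
            data = bytes(block_size)   # keep offsets intact, remember the hole
            bad.append(i)
        image += data
    return bytes(image), bad
```

Padding unreadable blocks keeps every later block at its original offset, so
the resulting image can still be handed to a checker; the list of bad block
numbers records what was lost.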

-- 
Thanks,
Vitaly Fertman


Re: reiser4 snapshot

2003-08-14 Thread Oleg Drokin
Hello!

On Mon, Aug 11, 2003 at 05:32:25PM -0700, Boris Tschirschwitz wrote:

 I thought I'd give it a try on 2.6.0-test3-mm1.
 Even with 'make mrproper' before compiling, I get the following error
 message:
 (Is there any interest in such error reports?)

Yes, there is.

 bobele linux # make bzImage
   CHK include/linux/version.h
   UPD include/linux/version.h
   Making asm->asm-i386 symlink
   CC  scripts/empty.o
   MKELF   scripts/elfconfig.h
   HOSTCC  scripts/file2alias.o
   HOSTCC  scripts/modpost.o
   HOSTLD  scripts/modpost
   SPLIT   include/linux/autoconf.h -> include/config/*
   CC  arch/i386/kernel/asm-offsets.s
   CHK include/asm-i386/asm_offsets.h
   UPD include/asm-i386/asm_offsets.h
   CC  init/main.o
 In file included from include/linux/unistd.h:9,
  from init/main.c:18:
 include/asm/unistd.h: In function `reiser4':
 include/asm/unistd.h:400: error: `__NR_reiser4' undeclared (first use in this 
 function)
 include/asm/unistd.h:400: error: (Each undeclared identifier is reported only once
 include/asm/unistd.h:400: error: for each function it appears in.)
 make[1]: *** [init/main.o] Error 1
 make: *** [init] Error 2

Hm, this is strange.
__NR_reiser4 is clearly defined in include/asm-i386/unistd.h

Probably you had that part of the patch rejected? Can you please verify?
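
One way to verify this is to look for the define in the header and for
reject files left behind by patch(1). A rough sketch; defines_symbol and
rejected_hunks are helper names made up for this example, and the header
path in the usage note is the one mentioned above:

```python
import re
from pathlib import Path


def defines_symbol(source_text, symbol="__NR_reiser4"):
    """True if source_text contains a '#define <symbol>' line."""
    pattern = re.compile(r"^\s*#\s*define\s+" + re.escape(symbol) + r"\b", re.M)
    return bool(pattern.search(source_text))


def rejected_hunks(tree):
    """List the *.rej files under tree (hunks that patch could not apply)."""
    return sorted(str(p) for p in Path(tree).rglob("*.rej"))
```

From the top of the kernel tree this could be used as
defines_symbol(Path("include/asm-i386/unistd.h").read_text()) together with
rejected_hunks("include"); a missing define plus a unistd.h.rej file would
point at a rejected hunk.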

Bye,
Oleg


Re: Reiser4 status: benchmarked vs. V3 (and ext3)

2003-08-14 Thread David Woodhouse
On Wed, 2003-08-13 at 21:12, Bill Davidsen wrote:
 The driver should do the logical to physical mapping, but the portability
 vanishes if the filesystem to physical mapping is not the same for all
 machines and operating systems. For pluggable devices this is important.

The portability also vanishes if the file system layout is not the same
for all machines and operating systems... what's your point?

Just like there are standard file systems, there are also standard
'translation layers' -- pseudofilesystems which are used to emulate a
hard drive on flash storage -- and some of these are implemented for
Linux.

Take a PCMCIA flash card (real flash, not CF) with FTL and FAT on it,
and it'll work just fine under both Windows and Linux, because they both
use the standard FTL and FAT formats.

FTL provides the logical-physical mapping and the wear levelling, FAT
is just normal FAT. 

 The leveling seems to be done by JFFs2 in a portable way, and that's as it
 should be. 

You seem to be very confused here. JFFS2 works on flash directly;
nothing's pretending to be a block device. It doesn't seem to be at all
relevant to this discussion.

JFFS2 does its own wear levelling and flash management, because it works
directly on the flash. 

FAT can't do that -- it needs some other code (like the FTL code) to
emulate a normal hard drive for it, providing wear levelling and
logical-physical translation for it. 

See http://www.infradead.org/~dwmw2/mtd-upper-layers.jpeg

Wear levelling is not done in the driver -- the driver just drives the
flash, and in fact is below the bottom of the diagram since it's largely
irrelevant. It just gives you read/write/erase functions for the raw
flash.

Wear levelling is done either in the file system which works directly on
the flash (JFFS2, YAFFS), or in the 'translation layer' which uses the
flash to pretend to be a block device (FTL, NFTL, INFTL, SMTL). (In the
case of the extremely naïve 'mtdblock' translation layer, no translation
and no wear levelling is done at all.)
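
What a wear-levelling translation layer does can be shown with a toy model.
This is a sketch under heavy assumptions, not how FTL or NFTL are implemented:
the class name is invented, there is no on-flash metadata, no garbage
collection and no bad block handling, and it needs at least one spare
physical block beyond the logical blocks in use.

```python
class LevellingLayer:
    """Toy translation layer: logical blocks mapped onto erase blocks."""

    def __init__(self, n_physical):
        self.erase_count = [0] * n_physical
        self.data = [None] * n_physical
        self.mapping = {}                       # logical -> physical
        self.free = set(range(n_physical))

    def write(self, logical, payload):
        # Pick the least-worn free block instead of rewriting in place.
        target = min(self.free, key=lambda p: (self.erase_count[p], p))
        self.free.remove(target)
        self.erase_count[target] += 1           # erase before programming
        self.data[target] = payload
        old = self.mapping.get(logical)
        if old is not None:
            self.free.add(old)                  # the stale copy becomes reusable
        self.mapping[logical] = target

    def read(self, logical):
        return self.data[self.mapping[logical]]
```

With four erase blocks, overwriting logical block 0 eight times leaves erase
counts of [2, 2, 2, 2], where rewriting in place would have erased one block
eight times. The file system above it never sees the remapping, which is why
an unmodified FAT can sit on top.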

 If the leveling were in the driver I don't believe even FAT
 would work.

I think that by 'driver' you actually mean the 'translation layer' or
the combination of translation layer and underlying hardware driver, in
which case you would be incorrect to say that it wouldn't work. That
_is_ how it works, portably.


-- 
dwmw2



Re: Reiser4 status: benchmarked vs. V3 (and ext3)

2003-08-14 Thread Yury Umanets
On Thu, 2003-08-14 at 00:12, Bill Davidsen wrote:
 On Sun, 27 Jul 2003, Yury Umanets wrote:
 
  On Sun, 2003-07-27 at 18:10, Daniel Egger wrote:
   On Sun, 2003-07-27 at 15.28, Hans Reiser wrote:
 
or for which a wear leveling block device driver is used (I don't know
if one exists for Linux).
   
   This is normally done by the filesystem (e.g. JFFS2).
  
  Normally device driver should be concerned about making wear out
  smaller. It is up to it IMHO.

 
 The driver should do the logical to physical mapping, but the portability
 vanishes if the filesystem to physical mapping is not the same for all
 machines and operating systems. For pluggable devices this is important.
 
 The leveling seems to be done by JFFs2 in a portable way, and that's as it
 should be. If the leveling were in the driver I don't believe even FAT
 would work.

Hello Bill,

Yes, you are right. Device driver cannot take care about leveling.

It is able only to take care about simple caching (one erase block) in 
order to make wear out smaller and do not read/write whole block if one 
sector should be written.
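
The kind of one-erase-block caching described here can be sketched as a toy
write-back cache. It is an illustration only (the class and its fields are
invented for this example): it saves erases when neighbouring sectors are
written together, but it does nothing to spread wear across blocks.

```python
class EraseBlockCache:
    """Toy write-back cache holding a single erase block of sectors."""

    def __init__(self, flash, sectors_per_block):
        self.flash = flash               # list of erase blocks, each a list of sectors
        self.spb = sectors_per_block
        self.cached = None               # index of the erase block held in RAM
        self.buf = None
        self.dirty = False
        self.erases = 0

    def flush(self):
        if self.dirty:
            self.erases += 1             # erase and rewrite the whole block once
            self.flash[self.cached] = list(self.buf)
            self.dirty = False

    def write_sector(self, sector, payload):
        block, offset = divmod(sector, self.spb)
        if block != self.cached:
            self.flush()                 # write back the previous block first
            self.cached = block
            self.buf = list(self.flash[block])
        self.buf[offset] = payload
        self.dirty = True
```

Four consecutive sector writes into one erase block cost a single erase when
the cache is flushed, instead of one erase per sector.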

Part of a filesystem called block allocator should take care about 
leveling.