Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-04-01 Thread Pierre Etchemaïté
On Tue, 21 Mar 2006 23:41:22 -0800, Hans Reiser <[EMAIL PROTECTED]> wrote:

>  It may be that we need to port some of
> the block allocation optimizations from V3 to V4 (Jeff's work) to help
> with 90% full filesystems.

Speaking of that, I read on the backuppc mailing list about a localized
performance problem with reiserfs 3 (which otherwise performs about as
well as xfs for that task). I wonder whether it was ever reported to
you, as was suggested on that list...

http://sourceforge.net/mailarchive/message.php?msg_id=8646808

My understanding is that backuppc is hitting the worst case for
reiserfs 3 hard links.

Backuppc creates a huge pool of all versions of all files from all
backups, compressed, organized using MD5 hashing (handling collisions
of course), and hardlinked from their different backup views. [Some
metadata is stored separately, so that several files with the same
content but different metadata can still be shared on disk. But I
digress.] At night, a sweeping process removes backups that are too
old (according to user policy), and may check whether more background
sharing/compression can be done.

If I remember correctly, v3 puts directory entries and their
corresponding inodes next to each other on disk. When hardlinks are
created, new directory entries are created, pointing to the same inode.
If the first directory entry is removed, the inode may no longer be
stored near any of the entries pointing to it.

Since backuppc routinely removes directory entries in FIFO order,
this is almost guaranteed to happen every time. Hence a very bad
distribution of inodes on disk after some time...
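
The hard-link behaviour described above can be reproduced on any POSIX
filesystem with a small sketch. The directory names (view1, view2) and
the function name are hypothetical, chosen only for illustration. It
shows that both directory entries share one inode and that the inode
survives removal of the first entry; it says nothing about where
reiserfs 3 actually places that inode on disk.

```shell
# Create a file, hard-link it from a second directory, remove the first
# entry, and print the inode number seen before and after the removal.
hardlink_demo() {
    tmp=$(mktemp -d) || return 1
    mkdir "$tmp/view1" "$tmp/view2"
    echo data > "$tmp/view1/file"              # first directory entry
    ln "$tmp/view1/file" "$tmp/view2/file"     # second entry, same inode
    before=$(ls -i "$tmp/view1/file" | awk '{ print $1 }')
    rm "$tmp/view1/file"                       # oldest entry goes first (FIFO)
    after=$(ls -i "$tmp/view2/file" | awk '{ print $1 }')
    rm -rf "$tmp"
    echo "$before $after"                      # the two numbers are identical
}

hardlink_demo
```

The surviving entry in view2 still points at the inode that was
originally allocated alongside the entry in view1.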

I don't know exactly what xfs does (blocks of preallocated inodes?),
but it does better in this case.

Hope it helps,
Pierre.


Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-29 Thread Philippe Gramoullé

Hello Laurent,

On Wed, 29 Mar 2006 08:16:55 +0200
Laurent Riffard <[EMAIL PROTECTED]> wrote:

  | So I found more conclusive to write 150M and thus to fill up the 2 FS.

Thanks for the explanations.

Truly yours,

Philippe


Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-28 Thread Laurent Riffard
On 29.03.2006 00:49, Philippe Gramoullé wrote:
> Hello Laurent,
> 
> On Tue, 28 Mar 2006 22:19:01 +0200
> Laurent Riffard <[EMAIL PROTECTED]> wrote:
> 
>   | These FS are quite similars. Now guess what ? I filled these FS with
>   |  dd.
>   | 
>   | Original FS
>   | ===
>   | # sync
>   | # time dd if=/dev/zero of=toto bs=1M count=150
>   | 103+0 enregistrements lus.
>   | 102+0 enregistrements écrits.
>   | Command exited with non-zero status 1
> 
> Well, at least on my system , such a command exits with a 0 status

Oops! I trimmed a line when I cut and pasted. dd exits with the
message "Aucun espace disponible sur le périphérique" which means
"No space left on device".

> Also, not a single of your posts in this thread has this error except this one
> and the one below

Yes, I changed my test somewhat. In the previous test, I dd'd 100M to
the FS.

As the original FS and its copy have different amounts of free space,
writing 100M to each leaves 3M free versus 30M free. I ran this test
and it takes about 2'20" versus 15". But I feared someone would object,
"It's because you have less free space on the first FS."

So I found it more conclusive to write 150M and thus fill up both FS.

>   | 0.00user 2.94system 3:32.18elapsed 1%CPU (0avgtext+0avgdata
>   | 0maxresident)k
>   | # time sync
>   | 0inputs+0outputs (0major+279minor)pagefaults 0swaps
>   | 0.00user 0.01system 0:00.18elapsed 6%CPU (0avgtext+0avgdata
>   | 0maxresident)k
>   | 0inputs+0outputs (0major+191minor)pagefaults 0swaps
>   | 
>   | Copy FS
>   | ===
>   | # sync
>   | # time dd if=/dev/zero of=toto bs=1M count=150
>   | dd: écriture de `toto': Aucun espace disponible sur le périphérique
>   | 132+0 enregistrements lus.
>   | 131+0 enregistrements écrits.
>   | Command exited with non-zero status 1
> 
> Here, i can understand the "exited with non-zero status 1" as
> "Aucun espace disponible sur le périphérique" is french for 
> "No space left on device"

yes, see above.

>   | 0.00user 4.08system 0:15.95elapsed 25%CPU (0avgtext+0avgdata
>   | 0maxresident)k
>   | 0inputs+0outputs (1major+279minor)pagefaults 0swaps
>   | # time sync
>   | 0.00user 0.00system 0:00.17elapsed 0%CPU (0avgtext+0avgdata
>   | 0maxresident)k
>   | 0inputs+0outputs (0major+190minor)pagefaults 0swaps
>   | disk$
>   | 
>   | See ? 3'30" versus 16".
> 
> Are the 16" due to the fact that the above command exited earlier than it 
> should have ?

No (see above): both FS were filled up to 0M free space.

> Thanks,
> 
> Philippe
> 

Thanks for your comments. I hope this makes it clear.

To be fair, there are some differences between the two FS:
- the copy is larger than the original: 1003520 vs 995998 blocks of
1 KiB (as reported by /proc/partitions), which is about 0.75% larger.
- the original FS resides on a logical partition inside the extended
partition (/dev/hda8), while the copy is on an LVM logical volume
(/dev/vglinux1/test). This LV is hosted on /dev/hda4.

I hope these differences do not have a big impact on the results.
I'll try to dd of=/dev/hda8 if=/dev/vglinux1/test, and see whether it
makes a difference when I dd a 100M file onto the FS.
~~
laurent
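
The size difference quoted in this message can be checked with a short
sketch. The function name is hypothetical; the two numbers are the
#blocks values for hda8 and dm-5 from the /proc/partitions output
earlier in the thread, assumed here to be in 1 KiB units.

```shell
# Print by how many percent size B exceeds size A.
percent_larger() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", (b - a) * 100 / a }'
}

percent_larger 995998 1003520   # hda8 (original) vs dm-5 (copy); prints 0.755
```

LC_ALL=C keeps the decimal point independent of the French locale used
in the transcripts.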


Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-28 Thread Philippe Gramoullé

Hello Laurent,

On Tue, 28 Mar 2006 22:19:01 +0200
Laurent Riffard <[EMAIL PROTECTED]> wrote:

  | These FS are quite similars. Now guess what ? I filled these FS with
  |  dd.
  | 
  | Original FS
  | ===
  | # sync
  | # time dd if=/dev/zero of=toto bs=1M count=150
  | 103+0 enregistrements lus.
  | 102+0 enregistrements écrits.
  | Command exited with non-zero status 1

Well, at least on my system, such a command exits with status 0.

Also, none of your other posts in this thread shows this error, only
this one and the one below.

  | 0.00user 2.94system 3:32.18elapsed 1%CPU (0avgtext+0avgdata
  | 0maxresident)k
  | # time sync
  | 0inputs+0outputs (0major+279minor)pagefaults 0swaps
  | 0.00user 0.01system 0:00.18elapsed 6%CPU (0avgtext+0avgdata
  | 0maxresident)k
  | 0inputs+0outputs (0major+191minor)pagefaults 0swaps
  | 
  | Copy FS
  | ===
  | # sync
  | # time dd if=/dev/zero of=toto bs=1M count=150
  | dd: écriture de `toto': Aucun espace disponible sur le périphérique
  | 132+0 enregistrements lus.
  | 131+0 enregistrements écrits.
  | Command exited with non-zero status 1

Here, I can understand the "exited with non-zero status 1", as
"Aucun espace disponible sur le périphérique" is French for
"No space left on device".

  | 0.00user 4.08system 0:15.95elapsed 25%CPU (0avgtext+0avgdata
  | 0maxresident)k
  | 0inputs+0outputs (1major+279minor)pagefaults 0swaps
  | # time sync
  | 0.00user 0.00system 0:00.17elapsed 0%CPU (0avgtext+0avgdata
  | 0maxresident)k
  | 0inputs+0outputs (0major+190minor)pagefaults 0swaps
  | disk$
  | 
  | See ? 3'30" versus 16".

Are the 16" due to the fact that the above command exited earlier than
it should have?

Thanks,

Philippe


Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-28 Thread Hans Reiser
I think what this means is that after we have a repacker, we should gain
performance advantages over our competition as a result.  It is far
easier for us to code an online repacker than it is for them.

Hans



Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-28 Thread Hans Reiser
Laurent Riffard wrote:

>
>
>See ? 3'30" versus 16".
>
>I packed the metadata of my original FS to a file, you can grab it
>from http://laurent.riffard.free.fr/kernel.reiser4.bz2 (6.7M).
>  
>
Wow.  We need to do the repacker.  We might also need to examine whether
there are optimizations in V3 block allocation we should apply to V4,
but mostly we need the repacker.  Ok, well, right after we go into the
kernel it will be done.

Thanks much Laurent, you did a great job of analyzing this for us.

>Note I was unable to unpack it:
>
>># bunzip2 -c /tmp/kernel.reiser4.bz2 | debugfs.reiser4 -U /dev/vglinux1/test
>>debugfs.reiser4 1.0.5
>>Copyright (C) 2001, 2002, 2003, 2004 by Hans Reiser, licensing governed by
>>reiser4progs/COPYING.
>>
>>Info : The metadata were packed with the reiser4progs 1.0.5.
>>
>>Error: Can't unpack filesystem.
>
>~~
>laurent



Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-28 Thread Laurent Riffard

On 22.03.2006 20:04, Hans Reiser wrote:
> Instead of using sync, could you increase the size of the files you
> write so that they are 10x ram size?
> 
> I have a suspicion we are slow at sync. I am not sure why, but I
> have seen other data where sync was slow for us, and maybe we need to
> optimize that code path.
> 
> Hans
> 

Hello Hans, sorry for the long delay in replying.

I'm not sure this is a problem with _sync_. I had concerns about sync
on reiser4, but I thought they were related to the FS policy, which
tries to do a lot of work in memory, so that when sync time comes
there is a huge amount of data to write back to disk.

Well, I'm not a file systems expert, this is a wild guess...
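
One way to check a guess like this would be to watch how much dirty
data is waiting for write-back while dd and sync run. The sketch below
assumes a Linux /proc/meminfo with Dirty and Writeback fields; the
function name is hypothetical, and the optional file argument exists
only so the parsing can be tried on a saved copy.

```shell
# Print the Dirty and Writeback counters (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
dirty_kb() {
    awk '/^(Dirty|Writeback):/ { printf "%s %s kB\n", $1, $2 }' "${1:-/proc/meminfo}"
}

# Possible usage in a second terminal while the test runs:
#   while sleep 1; do dirty_kb; done
```

A Dirty value that climbs during dd and only drains during sync would
be consistent with the guess above, though it would not prove it.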

Anyway, I didn't try to "write a file of size 10x ram size". My test
case is a 925M FS with 100M free, and I have 512M of RAM. My guess is
that there is a problem with the Reiser4 internal data. It's an old
FS; I have done thousands of kernel builds on it.

I allocated a new logical volume (about the same size, same HD), made
it a reiser4 FS and copied all my data onto it.

> [EMAIL PROTECTED] ~]# grep reiser4 /proc/mounts 
> /dev/hda8 /home/laurent/kernel reiser4 
> rw,nosuid,nodev,atom_max_size=0x7e22,atom_max_age=0x249f0,atom_min_size=0x100,atom_max_flushers=0x1,cbk_cache_slots=0x10
>  0 0
> /dev/vglinux1/test /mnt/disk reiser4 
> rw,atom_max_size=0x7e22,atom_max_age=0x249f0,atom_min_size=0x100,atom_max_flushers=0x1,cbk_cache_slots=0x10
>  0 0
> [EMAIL PROTECTED] ~]# grep -e hda8 -e dm-5 /proc/partitions
>    3     8     995998 hda8
>  254     5    1003520 dm-5
> [EMAIL PROTECTED] ~]# cp -pRL /home/laurent/kernel/. /mnt/disk
[cut errors with symbolic links]
> [EMAIL PROTECTED] ~]# df /home/laurent/kernel /mnt/disk
> Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
> /dev/hda8 925M  822M  103M  89% /home/laurent/kernel
> /dev/mapper/vglinux1-test
>   932M  800M  132M  86% /mnt/disk

These FS are quite similar. Now guess what? I filled both of them
with dd.

Original FS
===
# sync
# time dd if=/dev/zero of=toto bs=1M count=150
103+0 enregistrements lus.
102+0 enregistrements écrits.
Command exited with non-zero status 1
0.00user 2.94system 3:32.18elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+279minor)pagefaults 0swaps
# time sync
0.00user 0.01system 0:00.18elapsed 6%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+191minor)pagefaults 0swaps

Copy FS
===
# sync
# time dd if=/dev/zero of=toto bs=1M count=150
dd: écriture de `toto': Aucun espace disponible sur le périphérique
132+0 enregistrements lus.
131+0 enregistrements écrits.
Command exited with non-zero status 1
0.00user 4.08system 0:15.95elapsed 25%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+279minor)pagefaults 0swaps
# time sync
0.00user 0.00system 0:00.17elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+190minor)pagefaults 0swaps
disk$

See? 3'30" versus 16".
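
To put the two runs on a common scale, the elapsed fields can be turned
into write throughput. This is a minimal sketch with hypothetical
function names; it takes the number of 1M records dd reports as written
(102 and 131) and the "elapsed" field printed by time.

```shell
# Convert a time "elapsed" field (M:SS.ss or H:MM:SS) to seconds.
elapsed_to_seconds() {
    echo "$1" | LC_ALL=C awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; printf "%.2f\n", s }'
}

# Print throughput in MiB/s for MIB written during ELAPSED.
throughput() {
    LC_ALL=C awk -v mib="$1" -v secs="$(elapsed_to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", mib / secs }'
}

throughput 102 3:32.18   # original FS; prints 0.48
throughput 131 0:15.95   # copy FS;     prints 8.21
```

By this measure the fresh copy writes roughly 17 times faster than the
aged original, even though it also runs until the device is full.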

I packed the metadata of my original FS into a file; you can grab it
from http://laurent.riffard.free.fr/kernel.reiser4.bz2 (6.7M).

Note that I was unable to unpack it:
> # bunzip2 -c /tmp/kernel.reiser4.bz2 | debugfs.reiser4 -U /dev/vglinux1/test 
> debugfs.reiser4 1.0.5
> Copyright (C) 2001, 2002, 2003, 2004 by Hans Reiser, licensing governed by 
> reiser4progs/COPYING. 
> 
> Info : The metadata were packed with the reiser4progs 1.0.5.  
>   
> 
> Error: Can't unpack filesystem. 

~~
laurent


Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-23 Thread Nate Diller
On 3/23/06, Jindrich Makovicka <[EMAIL PROTECTED]> wrote:
> Hans Reiser wrote:
> > Instead of using sync, could you increase the size of the files you
> > write so that they are 10x ram size?
> >
> > I have a suspicion we are slow at sync. I am not sure why, but I
> > have seen other data where sync was slow for us, and maybe we need to
> > optimize that code path.
>
> My impression is rather that the bottleneck is the amount of seeking the
> sync causes - would it be possible to reorder the write operations
> somehow, still preserving atomicity?

yeah, the kernel is not good at ordering flush during sync, it would
work much better if Reiser4 could just be told to do a full sync, and
then have only one thread that climbs through the fake inode and
squallocs everything.

> Also, a comparison of Reiser4 performance on NCQ vs. non-NCQ drive could
> be interesting (I don't have NCQ, maybe that's the problem).

the scheduler could make a difference too, most likely in the area of
'congestion' threshold and handling.

NATE


Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-23 Thread Jindrich Makovicka
Hans Reiser wrote:
> Instead of using sync, could you increase the size of the files you
> write so that they are 10x ram size?
> 
> I have a suspicion we are slow at sync. I am not sure why, but I
> have seen other data where sync was slow for us, and maybe we need to
> optimize that code path.

My impression is rather that the bottleneck is the amount of seeking the
sync causes - would it be possible to reorder the write operations
somehow, still preserving atomicity?

Also, a comparison of Reiser4 performance on NCQ vs. non-NCQ drive could
be interesting (I don't have NCQ, maybe that's the problem).

Regards,
-- 
Jindrich Makovicka


Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-22 Thread Hans Reiser
Instead of using sync, could you increase the size of the files you
write so that they are 10x ram size?

I have a suspicion we are slow at sync. I am not sure why, but I
have seen other data where sync was slow for us, and maybe we need to
optimize that code path.
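
A file of 10x RAM size could be sized as in the sketch below, assuming
a Linux /proc/meminfo with a MemTotal line in kB. The function name is
hypothetical. Presumably the point of such a large file is that most of
it cannot stay in the page cache, so the dd time alone already includes
the write-back cost.

```shell
# Print the dd count, in 1 MiB blocks, needed to write 10x the RAM size.
# Reads MemTotal (kB) from a meminfo-style file; defaults to /proc/meminfo.
ten_times_ram_mib() {
    awk '/^MemTotal:/ { print int($2 * 10 / 1024) }' "${1:-/proc/meminfo}"
}

# Possible usage (left commented out because it writes a very large file):
#   time dd if=/dev/zero of=toto bs=1M count="$(ten_times_ram_mib)"
```

On a filesystem with only about 100M free, such a file would not fit,
so this test needs a larger volume.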

Hans

Laurent Riffard wrote:

>On 22.03.2006 08:41, Hans Reiser wrote:
>  
>
>>Laurent Riffard wrote:
>>
>>
>>
>>
>>>Hello,
>>>
>>>Writing big files is very slow on reiser4 now. 
>>>
>>>"dd if=/dev/zero of=toto bs=1k count=102400; sync"
>>>
>>>  
>>>
>>try bs=4M, and tell me what happens.  also try an empty fs, and an fs
>>that is equally full to reiserfs.  Note that reiserfs in your test is
>>68% full vs. 90% full for V4.  It may be that we need to port some of
>>the block allocation optimizations from V3 to V4 (Jeff's work) to help
>>with 90% full filesystems.  Thanks for doing this.  Real users always
>>teach me a lot when they test things differently from how I did.
>>
>>Hans
>>
>>
>
>Hello Hans,
>
>Yesterday, I realized that my tests were not fair. So I did some
>further tests trying to have the same situation for 3 different FS
>(reiserfs/ext2/reiser4) and I sent the result to the list, but this
>mail never reached the list. I have resent it.
>
>As per your request, I tried to replay my dd test on my 90% full
>reiser4 FS, using a 4M block size. Here are the results:
>
>-
>  
>
>>Desktop$ cd ~/kernel
>>
>>kernel$ rm toto
>>rm: détruire fichier régulier `toto'? o
>>
>>kernel$ df .
>>Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
>>/dev/hda8 925M  748M  177M  81% /home/laurent/kernel
>>
>>kernel$ grep /dev/hda8 /rpoc/mounts
>>grep: /rpoc/mounts: Aucun fichier ou répertoire de ce type
>>
>>kernel$ grep /dev/hda8 /proc/mounts
>>/dev/hda8 /home/laurent/kernel reiser4 
>>rw,nosuid,nodev,atom_max_size=0x7e0c,atom_max_age=0x249f0,atom_min_size=0x100,atom_max_flushers=0x1,cbk_cache_slots=0x10
>> 0 0
>>
>>kernel$  sync; time dd if=/dev/zero of=toto bs=4M count=25; time sync
>>25+0 enregistrements lus.
>>25+0 enregistrements écrits.
>>0.00user 2.89system 0:17.18elapsed 16%CPU (0avgtext+0avgdata 0maxresident)k
>>0inputs+0outputs (0major+252minor)pagefaults 0swaps
>>0.00user 0.00system 2:19.91elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
>>0inputs+0outputs (0major+191minor)pagefaults 0swaps
>>
>>kernel$ sync; time dd if=/dev/zero of=toto bs=4M count=25; time sync
>>25+0 enregistrements lus.
>>25+0 enregistrements écrits.
>>0.00user 2.96system 1:16.42elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k
>>0inputs+0outputs (0major+252minor)pagefaults 0swaps
>>0.00user 0.00system 0:08.70elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
>>0inputs+0outputs (0major+190minor)pagefaults 0swaps
>>
>>
>-
>
>I tried to run an "iostat 10" simultaneously with dd+sync. I
>attached the output. Hope this helps.
>~~
>laurent
>  
>
>
>
>Le script a débuté sur mer 22 mar 2006 19:12:56 CET
>Desktop$ cd ~/kernel
>kernel$ 
>kernel$  sleep 15 && echo SYNC && sync && echo DD && time dd if=/dev/zero 
>of=toto bs=4M count=25 && echo SYNC && time sync && echo END &
>[1] 4657
>kernel$  iostat -t 10 /dev/hda8
>Linux 2.6.16-rc6-mm2 (antares.localdomain) 22.03.2006
>
>Heure: 19:13:32
>avg-cpu:  %user   %nice %system %iowait   %idle
>           5,01    0,02   11,07    4,45   79,46
>
>Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>hda8              5,34         0,27       217,58       1297    1026592
>
>Heure: 19:13:42
>avg-cpu:  %user   %nice %system %iowait   %idle
>           0,10    0,00    0,20    0,20   99,50
>
>Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>hda8              0,00         0,00         0,00          0          0
>
>SYNC
>DD
>Heure: 19:13:52
>avg-cpu:  %user   %nice %system %iowait   %idle
>           1,50    0,00   79,32    8,29   10,89
>
>Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>hda8             20,38         3,20      1202,00         32      12032
>
>Heure: 19:14:02
>avg-cpu:  %user   %nice %system %iowait   %idle
>           2,30    0,00   81,08   16,62    0,00
>
>Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>hda8             33,53         0,00      1398,20          0      13968
>
>Heure: 19:14:12
>avg-cpu:  %user   %nice %system %iowait   %idle
>           1,90    0,00   88,51    9,59    0,00
>
>Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>hda8             25,27         0,00       893,51          0       8944
>
>Heure: 19:14:22
>avg-cpu:  %user   %nice %system %iowait   %idle
>           3,19    0,00   85,63   11,18    0,00
>
>Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>hda8             27,35         0,00      1288,62          0      12912
>
>Heure: 19:14:32
>avg-cpu:  %user   %nice %system %iowait   %idle
>           0,80    0,00   90,01    9,19

Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-22 Thread Laurent Riffard
On 22.03.2006 08:41, Hans Reiser wrote:
> Laurent Riffard wrote:
> 
> 
>>Hello,
>>
>>Writing big files is very slow on reiser4 now. 
>>
>>"dd if=/dev/zero of=toto bs=1k count=102400; sync"
>>
> 
> try bs=4M, and tell me what happens.  also try an empty fs, and an fs
> that is equally full to reiserfs.  Note that reiserfs in your test is
> 68% full vs. 90% full for V4.  It may be that we need to port some of
> the block allocation optimizations from V3 to V4 (Jeff's work) to help
> with 90% full filesystems.  Thanks for doing this.  Real users always
> teach me a lot when they test things differently from how I did.
> 
> Hans

Hello Hans,

Yesterday, I realized that my tests were not fair. So I ran some
further tests, trying to set up the same situation for 3 different FS
(reiserfs/ext2/reiser4), and I sent the results to the list, but that
mail never reached the list. I have resent it.

As you requested, I reran my dd test on my 90% full reiser4 FS, using
a 4M block size. Here are the results:

-
> Desktop$ cd ~/kernel
> 
> kernel$ rm toto
> rm: détruire fichier régulier `toto'? o
> 
> kernel$ df .
> Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
> /dev/hda8 925M  748M  177M  81% /home/laurent/kernel
> 
> kernel$ grep /dev/hda8 /rpoc/mounts
> grep: /rpoc/mounts: Aucun fichier ou répertoire de ce type
> 
> kernel$ grep /dev/hda8 /proc/mounts
> /dev/hda8 /home/laurent/kernel reiser4 
> rw,nosuid,nodev,atom_max_size=0x7e0c,atom_max_age=0x249f0,atom_min_size=0x100,atom_max_flushers=0x1,cbk_cache_slots=0x10
>  0 0
> 
> kernel$  sync; time dd if=/dev/zero of=toto bs=4M count=25; time sync
> 25+0 enregistrements lus.
> 25+0 enregistrements écrits.
> 0.00user 2.89system 0:17.18elapsed 16%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+252minor)pagefaults 0swaps
> 0.00user 0.00system 2:19.91elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+191minor)pagefaults 0swaps
> 
> kernel$ sync; time dd if=/dev/zero of=toto bs=4M count=25; time sync
> 25+0 enregistrements lus.
> 25+0 enregistrements écrits.
> 0.00user 2.96system 1:16.42elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+252minor)pagefaults 0swaps
> 0.00user 0.00system 0:08.70elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+190minor)pagefaults 0swaps
-

I also ran "iostat 10" while dd+sync was running. The output is
attached. Hope this helps.
~~
laurent
Le script a débuté sur mer 22 mar 2006 19:12:56 CET
Desktop$ cd ~/kernel
kernel$ 
kernel$  sleep 15 && echo SYNC && sync && echo DD && time dd if=/dev/zero of=toto bs=4M count=25 && echo SYNC && time sync && echo END &
[1] 4657
kernel$  iostat -t 10 /dev/hda8
Linux 2.6.16-rc6-mm2 (antares.localdomain)  22.03.2006

Heure: 19:13:32
avg-cpu:  %user   %nice %system %iowait   %idle
           5,01    0,02   11,07    4,45   79,46

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda8              5,34         0,27       217,58       1297    1026592

Heure: 19:13:42
avg-cpu:  %user   %nice %system %iowait   %idle
           0,10    0,00    0,20    0,20   99,50

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda8              0,00         0,00         0,00          0          0

SYNC
DD
Heure: 19:13:52
avg-cpu:  %user   %nice %system %iowait   %idle
           1,50    0,00   79,32    8,29   10,89

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda8             20,38         3,20      1202,00         32      12032

Heure: 19:14:02
avg-cpu:  %user   %nice %system %iowait   %idle
           2,30    0,00   81,08   16,62    0,00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda8             33,53         0,00      1398,20          0      13968

Heure: 19:14:12
avg-cpu:  %user   %nice %system %iowait   %idle
           1,90    0,00   88,51    9,59    0,00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda8             25,27         0,00       893,51          0       8944

Heure: 19:14:22
avg-cpu:  %user   %nice %system %iowait   %idle
           3,19    0,00   85,63   11,18    0,00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda8             27,35         0,00      1288,62          0      12912

Heure: 19:14:32
avg-cpu:  %user   %nice %system %iowait   %idle
           0,80    0,00   90,01    9,19    0,00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda8             25,17         0,00       800,00          0       8008

Heure: 19:14:42
avg-cpu:  %user   %nice %system %iowait   %idle
           0,30    0,00   74,93   24,78    0,00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda8             54,35         0,00      3138,46          0      31416

Heure: 19:14:52
avg-cpu:  %user   %nice %system %iowait   %idle
           0,20    0,00   81,62   18,18    0,00

Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-22 Thread Laurent Riffard
[this is a second post, the first post seemed to never reach the list]

On 21.03.2006 22:16, Laurent Riffard wrote:
> Hello,
> 
> Writing big files is very slow on reiser4 now. 
> 
> "dd if=/dev/zero of=toto bs=1k count=102400; sync" takes more than 2 minutes 
> on 
> reiser4 fs, but only 15 seconds on reiserfs fs.

Oops! My tests were not fair: my reiser4 FS was almost full while my
reiserfs FS had plenty of free space.

> kernel$ df .
> Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
> /dev/hda8 925M  825M  101M  90% /home/laurent/kernel
> kernel$ grep hda8 /proc/mounts
> /dev/hda8 /home/laurent/kernel reiser4 
> rw,nosuid,nodev,atom_max_size=0x7e0c,atom_max_age=0x249f0,atom_min_size=0x100,atom_max_flushers=0x1,cbk_cache_slots=0x10
>  0 0
[snip]
> ~$ df .
> Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
> /dev/mapper/vglinux1-lvhome
>   7,0G  4,8G  2,3G  68% /home
> ~$ grep lvhome /proc/mounts
> /dev/vglinux1/lvhome /home reiserfs rw 0 0

So I did some tests with a 2GB logical volume. I formatted it
(reiserfs/ext2/reiser4), untarred a copy of a kernel tree onto it,
and wrote a 100 MB file 3 times.

FS          Elapsed time for dd + sync
reiserfs:   14.22s
ext2:       11.12s
reiser4:    19.71s

I won't discuss why reiser4 is slow here. Maybe my tests are not so
good. The interesting point of this thread is that reiser4 seems not
to like situations with little space available. I should rerun these
tests with a 90% full FS (but it's time to go to bed now...).
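
A repeatable version of this "dd + sync, N times" measurement might
look like the sketch below. The function name and its arguments are
hypothetical, and timing uses whole seconds from date, which is coarser
than the time output in the attached logs. bs=1048576 is the same block
size as bs=1M, written out in bytes.

```shell
# Time "dd + sync" for a file of SIZE_MB MiB in DIR, repeated RUNS times.
bench_write() {
    dir=$1; size_mb=$2; runs=$3
    i=1
    while [ "$i" -le "$runs" ]; do
        sync                      # flush dirty pages left from earlier activity
        start=$(date +%s)
        dd if=/dev/zero of="$dir/toto" bs=1048576 count="$size_mb" 2>/dev/null
        sync                      # include write-back in the measured interval
        end=$(date +%s)
        echo "run $i: $((end - start))s"
        rm -f "$dir/toto"
        i=$((i + 1))
    done
}

# Possible usage, matching the test in this message:
#   bench_write /mnt/disk 100 3
```

Running the same function on each freshly formatted FS, and again after
filling it to about 90%, would give directly comparable numbers.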

The full logs of my tests are attached below.
~~
laurent

Le script a débuté sur mar 21 mar 2006 22:40:11 CET
[EMAIL PROTECTED] ~]# lvdisplay /dev/vglinux1/test
  --- Logical volume ---
  LV Name                /dev/vglinux1/test
  VG Name                vglinux1
  LV UUID                1IdmIn-9Ne8-IZDS-PUYF-IyLP-Xz54-c50H2E
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                2,00 GB
  Current LE             512
  Segments               2
  Allocation             inherit
  Read ahead sectors     0
  Block device           254:5
   
[EMAIL PROTECTED] ~]# mkfs.reiserfs /dev/vglinux1/test 
mkfs.reiserfs 3.6.19 (2003 www.namesys.com)

A pair of credits:
Yury Umanets  (aka Umka)  developed  libreiser4,  userspace  plugins,  and  all
userspace tools (reiser4progs) except of fsck.

Hans Reiser was the project initiator,  source of all funding for the first 5.5
years. He is the architect and official maintainer.


Guessing about desired format.. Kernel 2.6.16-rc6-mm2 is running.
Format 3.6 with standard journal
Count of blocks on the device: 524288
Number of blocks consumed by mkreiserfs formatting process: 8227
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: 9f9b271b-1ed6-4ffb-9cde-243d3859b221
ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
ALL DATA WILL BE LOST ON '/dev/vglinux1/test'!
Continue (y/n):y
Initializing journal - 0%20%40%60%80%100%
Syncing..ok

Tell your friends to use a kernel based on 2.4.18 or later, and especially not a
kernel based on 2.4.9, when you use reiserFS. Have fun.

ReiserFS is successfully created on /dev/vglinux1/test.
[EMAIL PROTECTED] ~]# mount /dev/vglinux1/test /mnt/disk 
[EMAIL PROTECTED] ~]# cd /mnt/disk
[EMAIL PROTECTED] disk]# tar -xjf  ~laurent/.ketchup/linux-2.6.15.tar.bz2
[EMAIL PROTECTED] disk]# df .
Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
/dev/mapper/vglinux1-test
  2,0G  260M  1,8G  13% /mnt/disk
[EMAIL PROTECTED] disk]# ls
linux-2.6.15
[EMAIL PROTECTED] disk]#  sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync
102400+0 enregistrements lus.
102400+0 enregistrements écrits.
0.04user 1.60system 0:01.73elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+250minor)pagefaults 0swaps
0.00user 0.06system 0:15.53elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+191minor)pagefaults 0swaps

[EMAIL PROTECTED] disk]#  sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync
102400+0 enregistrements lus.
102400+0 enregistrements écrits.
0.02user 1.60system 0:01.65elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+251minor)pagefaults 0swaps
0.00user 0.04system 0:09.72elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+190minor)pagefaults 0swaps

[EMAIL PROTECTED] disk]#  sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync
102400+0 enregistrements lus.
102400+0 enregistrements écrits.
0.04user 1.63system 0:01.69elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+250minor)pagefaults 0swaps
0.00user 0.06system 0:15.58elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+192minor)pagefaults 0swaps

[EMAIL PROTECTED] disk]#  sync; time dd if=/dev/zero of=toto bs=1k 
count=1024

Re: 2.6.16-rc6-mm2: slow writes on reiser4.

2006-03-21 Thread Hans Reiser
Laurent Riffard wrote:

>Hello,
>
>Writing big files is very slow on reiser4 now. 
>
>"dd if=/dev/zero of=toto bs=1k count=102400; sync"
>
try bs=4M, and tell me what happens.  also try an empty fs, and an fs
that is equally full to reiserfs.  Note that reiserfs in your test is
68% full vs. 90% full for V4.  It may be that we need to port some of
the block allocation optimizations from V3 to V4 (Jeff's work) to help
with 90% full filesystems.  Thanks for doing this.  Real users always
teach me a lot when they test things differently from how I did.

Hans

> takes more than 2 minutes on 
>reiser4 fs, but only 15 seconds on reiserfs fs.
>
>Actually, writing on reiser4 is not uniformly slow; it seems to block for
>ages from time to time. I monitored the number of dirty pages from
>/proc/meminfo and I hit sysrq-T when the system was stalling:
>
>
>ddD 17DE 0 21930  21929 (NOTLB)
>   d7169c74 e0c98b05 0246 17de  f396aa00 003d1249 d0b68140
>   d0b68030 f396aa00 003d1249 6d519e00 0002 c0396434 d8bf8e30 d8bf8e38
>   0246 d7169ca0 c0270f08 d0b68030 0001 d0b68030 c0113b25 d8bf8e38
>Call Trace:
> [] __down+0x81/0xdc
> [] __down_failed+0xa/0x10
> [] .text.lock.lock+0x15/0x1b [reiser4]
> [] longterm_lock_znode+0x5b4/0x7b0 [reiser4]
> [] cbk_level_lookup+0x8a/0x954 [reiser4]
> [] traverse_tree+0x752/0xa0d [reiser4]
> [] coord_by_handle+0x781/0x789 [reiser4]
> [] object_lookup+0x1eb/0x230 [reiser4]
> [] find_file_item+0x18d/0x1b7 [reiser4]
> [] write_flow+0x208/0x6e1 [reiser4]
> [] write_unix_file+0x3d9/0x5b0 [reiser4]
> [] vfs_write+0x8a/0x133
> [] sys_write+0x3b/0x60
> [] sysenter_past_esp+0x54/0x75
>
>Below are the detailed test I ran. Feel free to ask for more information.
>
>Reiser4 FS
>==
>
>Desktop$ cd ~/kernel
>
>kernel$ df .
>Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
>/dev/hda8 925M  825M  101M  90% /home/laurent/kernel
>
>kernel$ grep hda8 /proc/mounts
>/dev/hda8 /home/laurent/kernel reiser4 
>rw,nosuid,nodev,atom_max_size=0x7e0c,atom_max_age=0x249f0,atom_min_size=0x100,atom_max_flushers=0x1,cbk_cache_slots=0x10
> 0 0
>
>kernel$ sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync 
>102400+0 enregistrements lus.
>102400+0 enregistrements écrits.
>0.06user 13.95system 1:42.09elapsed 13%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+250minor)pagefaults 0swaps
>0.00user 0.00system 1:22.90elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+191minor)pagefaults 0swaps
>
>kernel$ sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync 
>102400+0 enregistrements lus.
>102400+0 enregistrements écrits.
>0.08user 14.01system 1:45.57elapsed 13%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+249minor)pagefaults 0swaps
>0.00user 0.00system 0:09.78elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+191minor)pagefaults 0swaps
>
>kernel$ sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync 
>102400+0 enregistrements lus.
>102400+0 enregistrements écrits.
>0.06user 14.13system 2:18.27elapsed 10%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+251minor)pagefaults 0swaps
>0.00user 0.00system 0:08.48elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+190minor)pagefaults 0swaps
>
>kernel$ sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync 
>102400+0 enregistrements lus.
>102400+0 enregistrements écrits.
>0.06user 14.27system 1:56.34elapsed 12%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+251minor)pagefaults 0swaps
>0.00user 0.00system 0:10.46elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+190minor)pagefaults 0swaps
>
>
>Reiserfs FS
>===
>kernel$ cd
>
>~$ df .
>Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
>/dev/mapper/vglinux1-lvhome
>  7,0G  4,8G  2,3G  68% /home
>[/dev/mapper/vglinux1-lvhome resides on /dev/hda4]
>
>~$ grep lvhome /proc/mounts
>/dev/vglinux1/lvhome /home reiserfs rw 0 0
>
>~$ sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync 
>102400+0 enregistrements lus.
>102400+0 enregistrements écrits.
>0.04user 1.75system 0:02.05elapsed 87%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+249minor)pagefaults 0swaps
>0.00user 0.10system 0:12.93elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+191minor)pagefaults 0swaps
>
>~$ sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync 
>102400+0 enregistrements lus.
>102400+0 enregistrements écrits.
>0.04user 1.83system 0:01.98elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+250minor)pagefaults 0swaps
>0.00user 0.16system 0:14.45elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (0major+191minor)pagefaults 0swaps
>
>~$ sync; time dd if=/dev/zero of=toto bs=1k count=102400; time sync 
>102400+0 enregistrements lus.
>102400+0