Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Marc Dionne
On Mon, Nov 2, 2009 at 7:57 AM, Harald Barth wrote:
>
>> I've never seen anything about ext4 (which is more and more
>> supported by recent kernels/distributions) : is it completely
>> deprecated for afs data store? Or does it just bring nothing more
>> (in terms of speed, reliability...) than ext3?
>
> There was a lot of enthusiasm about reiserfs in the beginning, too.
>
> Let it prove itself in a big distro on desktop systems for a while.
>
> It is just too new to say something about stability/reliability IMHO.
>
> Harald.

There have been many reports of extensive ext4 filesystem corruption
after unclean shutdowns during the current 2.6.32-rc kernel series
(see for example http://bugzilla.kernel.org/show_bug.cgi?id=14354).

I personally experienced this twice on a desktop and lost hundreds of
files (looked like roughly the last half-hour of activity) each time.
In my case this didn't touch anything too critical (although I had
mild corruption of an openafs git clone), but having this happen on a
/vicepx partition would not be pleasant.  It definitely needs to
mature before being considered for use as a server data store.

Marc
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Harald Barth

> I've never seen anything about ext4 (which is more and more
> supported by recent kernels/distributions) : is it completely
> deprecated for afs data store? Or does it just bring nothing more
> (in terms of speed, reliability...) than ext3?

There was a lot of enthusiasm about reiserfs in the beginning, too.

Let it prove itself in a big distro on desktop systems for a while.

It is just too new to say something about stability/reliability IMHO.

Harald.


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Harald Barth
> > Hm, is ext3 slower if used on server? In that case, anyone checked why?
> Compared to ext2 I mean.  Or was your mail just difficult to parse? :-)

Parsing help:

On server: ext3 slower than xfs.
On client: ext3 slower than ext2

I have not compared ext2 with xfs.

I would not use ext2 on a server for obvious reasons (it has no journal),
but it would probably be faster than ext3.

Harald.


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Harald Barth

> > Ext3 works too
> > (server or client), but slower.
> >   
> Hm, is ext3 slower if used on server? In that case, anyone checked why?

My benchmarks (done some years ago) showed that ext3 is extremely slow
on metadata operations like file creation and deletion (compared to
xfs). Vos backup and vos move (everything that makes clones and deletes
them later) do a lot of those. I have not bothered to debug ext3 (where,
in addition to that, you can run out of inodes (*)) but used the faster
xfs file system instead.
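
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of a create/delete timing loop. It is not the benchmark referred to above; the function names and the default file count are made up for illustration. It also does not call fsync, so it mostly exercises the in-memory metadata path and will understate journal commit costs.

```python
import os
import tempfile
import time


def time_create_delete(directory, count=1000):
    """Create and then delete `count` empty files; return (create_s, delete_s)."""
    paths = [os.path.join(directory, f"bench_{i:06d}") for i in range(count)]

    start = time.perf_counter()
    for path in paths:
        # O_EXCL guarantees a new directory entry is created each time.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        os.close(fd)
    create_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.unlink(path)
    delete_seconds = time.perf_counter() - start

    return create_seconds, delete_seconds


def run_benchmark(parent, count=1000):
    """Run the timing loop in a scratch subdirectory of `parent`."""
    # The scratch directory is removed afterwards, so nothing else is touched.
    with tempfile.TemporaryDirectory(dir=parent) as scratch:
        return time_create_delete(scratch, count)
```

Calling run_benchmark() once on a directory of each filesystem under test (ext3, xfs, ...) gives two numbers per filesystem to compare. The hardware and the mount options should be the same for every run.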

Harald.

(*) Which of course recently happened on our Lustre file system
partitions where you have to use ext3.
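
Running out of inodes can be spotted before it happens. A small sketch using os.statvfs(), which reports the same figures as "df -i"; the 0.9 threshold and the function names are assumptions chosen for illustration only.

```python
import os


def inode_usage(path):
    """Return (total, free, used_fraction) inode figures for the filesystem at `path`."""
    st = os.statvfs(path)
    total = st.f_files
    free = st.f_ffree
    if total == 0:
        # Some filesystems do not report a fixed inode count.
        return total, free, None
    return total, free, (total - free) / total


def inodes_nearly_exhausted(path, threshold=0.9):
    """True if the used-inode fraction is at or above `threshold` (a hypothetical cutoff)."""
    _, _, used = inode_usage(path)
    return used is not None and used >= threshold
```

Something like inodes_nearly_exhausted("/vicepa") could then be run from a nightly cron job; the mount point here is only an example.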


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Frédéric Grelot

> After some benchmarking a while ago - see here:
> 
>  http://fbo.no-ip.org/cgi-bin/twiki/view/Instantafs/WhichFs
> 
> I decided to use ext3 since most people I asked hadn't been happy with
> reiser3's stability.
> 

I've never seen anything about ext4 (which is more and more supported by recent
kernels/distributions): is it completely deprecated for afs data store? Or
does it just bring nothing more (in terms of speed, reliability...) than ext3?
I've seen enormous improvement in disk operations on standard desktop systems
just by migrating from ext3 to ext4, which is why I'm asking myself (and you...)
the question.

Thanks for your answer.

Frederic.


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Frank Burkhardt
Hi,

On Mon, Nov 02, 2009 at 12:30:10PM +0100, Anders Magnusson wrote:
> Harald Barth wrote:
> >Ext3 works too
> >(server or client), but slower.
> >  
> Hm, is ext3 slower if used on server? In that case, anyone checked why?

After some benchmarking a while ago - see here:

 http://fbo.no-ip.org/cgi-bin/twiki/view/Instantafs/WhichFs

I decided to use ext3 since most people I asked hadn't been happy with
reiser3's stability.

However, the hardware configuration mentioned on the benchmark page is no
longer in use here (Core2Quad instead of Xeon, RAID6/Areca instead of
RAID5/3Ware). Maybe some of the filesystems' properties changed, too. I'm
currently 95% happy with ext3. There is just one problem. "Sometimes" (esp.
when my nightly debian mirror script runs), removing a directory takes forever
(up to 10 seconds per rmdir() according to strace) while there's no other load
on either the server or the client.
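
One way to catch the slow rmdir() calls without running the whole job under strace is to time them directly. This is only a sketch, not the mirror script mentioned above; the function names and the one-second limit are made up.

```python
import os
import time


def timed_rmdir(path):
    """Remove an empty directory and return how long the rmdir() call took, in seconds."""
    start = time.perf_counter()
    os.rmdir(path)
    return time.perf_counter() - start


def slow_rmdirs(paths, limit_seconds=1.0):
    """Remove each directory; return [(path, seconds)] for calls slower than the limit."""
    slow = []
    for path in paths:
        elapsed = timed_rmdir(path)
        if elapsed > limit_seconds:
            slow.append((path, elapsed))
    return slow
```

Logging the returned list together with a timestamp would show whether the slow removals cluster around particular directories or particular times of the night.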

Regards,

Frank


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Anders Magnusson

Anders Magnusson wrote:
> Harald Barth wrote:
>> Ext3 works too
>> (server or client), but slower.
>
> Hm, is ext3 slower if used on server? In that case, anyone checked why?

Compared to ext2 I mean.  Or was your mail just difficult to parse? :-)

-- Ragge


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Anders Magnusson

Harald Barth wrote:
> Ext3 works too
> (server or client), but slower.

Hm, is ext3 slower if used on server? In that case, anyone checked why?

-- Ragge


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Harald Barth
> REISERFS warning (device cciss/c0d2p1): journal-d1 flush_commit_list:
> jl->j_len = 0; jl->j_state = 0; jl->j_trans_id = 0; jl->j_refcount = 0;
> journal->trans_id = 1744730; oldest live jl->j_trans_id = 1742447

Short summary: DO NOT USE REISERFS.

Long explanation can be found in the openafs-info archives if you
search around a bit. We have found xfs to be a good file system for
/vicepX partitions (with namei fileserver) and ext2 (it's only a cache
and can be rebuilt after a crash) for client caches. Ext3 works too
(server or client), but slower.
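
To see quickly which filesystem each /vicepX partition is on, something like the following sketch could be used. It assumes the usual /proc/mounts layout (device, mount point, filesystem type, options, ...); the function names are made up for illustration.

```python
def vicep_filesystems(mounts_text):
    """Map each /vicep* mount point to its filesystem type.

    `mounts_text` is expected to be in /proc/mounts format:
    device, mount point, fs type, options, dump, pass.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        if mount_point.startswith("/vicep"):
            result[mount_point] = fs_type
    return result


def reiserfs_partitions(mounts_text):
    """Return the sorted /vicep* mount points that are on reiserfs."""
    found = vicep_filesystems(mounts_text)
    return sorted(mp for mp, fs in found.items() if fs == "reiserfs")
```

On a fileserver, passing open("/proc/mounts").read() to reiserfs_partitions() lists the partitions that would be candidates for moving to something else.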

> Do I have to start worrying about it, or is it simply an openafs bug?

It's simply a reiserfs bug. If you have a backup and can move
to something else, no need to worry.

Harald.


Re: [OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Simon Wilkinson


On 2 Nov 2009, at 10:26, Claudio Prono wrote:

> Hello all,
>
> I have seen some strange errors in my dmesg. The system is an OpenSuse
> 11.0 with openafs-1.4.8-3.1.


The OpenAFS fileserver is just a standard user space program on Linux  
- it uses the underlying filesystem in exactly the same way as any  
other application would. If you're seeing filesystem bugs, then it's  
not a problem with OpenAFS, but an issue with the underlying filesystem.


Historically, reiserfs has had a number of stability and reliability  
issues - it's possible that you're just running into one of those.
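
The oops itself says where the failure happened: the "kernel BUG at" line names the source file, and the "EIP is at" line names the function and the module. A rough sketch that pulls those fields out of a dmesg dump follows. The regular expressions are written against the 32-bit oops format posted in this thread and are an assumption for anything else.

```python
import re

# Patterns match the lines as they appear in the oops quoted in this thread.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
EIP_RE = re.compile(
    r"EIP is at (?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)
COMM_RE = re.compile(r"comm: (?P<comm>\S+)")


def summarize_oops(text):
    """Extract source location, faulting function/module and process name from an oops."""
    summary = {}
    m = BUG_RE.search(text)
    if m:
        summary["file"] = m.group("file")
        summary["line"] = int(m.group("line"))
    m = EIP_RE.search(text)
    if m:
        summary["function"] = m.group("func")
        # module is None when the function is built into the kernel image
        summary["module"] = m.group("module")
    m = COMM_RE.search(text)
    if m:
        summary["process"] = m.group("comm")
    return summary
```

For the trace in the original report this gives fs/reiserfs/journal.c, line 1048, function flush_commit_list in module reiserfs, with fileserver only as the process that happened to call fsync().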


Cheers,

Simon.



[OpenAFS] Strange kernel messages from yesterday...

2009-11-02 Thread Claudio Prono
Hello all,

I have seen some strange errors in my dmesg. The system is an OpenSuse
11.0 with openafs-1.4.8-3.1.

These are the errors:

REISERFS warning (device cciss/c0d2p1): journal-d1 flush_commit_list:
jl->j_len = 0; jl->j_state = 0; jl->j_trans_id = 0; jl->j_refcount = 0;
journal->trans_id = 1744730; oldest live jl->j_trans_id = 1742447

[ cut here ]
kernel BUG at fs/reiserfs/journal.c:1048!
invalid opcode:  [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu1/online
Modules linked in: af_packet libafs(P) binfmt_misc iptable_filter
ip_tables ip6_tables x_tables microcode firmware_class reiserfs loop
dm_mod sg cpqphp container sr_mod pci_hotplug button i2c_piix4
sworks_agp cdrom hpwdt agpgart i2c_core tg3 pl2303 usbserial ohci_hcd
usbcore edd ext3 mbcache jbd fan pata_serverworks libata dock cciss
scsi_mod thermal processor [last unloaded: speedstep_lib]

Pid: 25063, comm: fileserver Tainted: PN (2.6.25.20-0.1-pae #1)
EIP: 0060:[] EFLAGS: 00010246 CPU: 0
EIP is at flush_commit_list+0x98/0x5c9 [reiserfs]
EAX: 00d7 EBX: f8fe ECX: f4d5df28 EDX: 
ESI: f509f700 EDI: 0024d7cb EBP: f6d9bf1c ESP: f6d9bedc
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process fileserver (pid: 25063, ti=f6d9a000 task=f518a120 task.ti=f6d9a000)
Stack: 0001 f783b400  f8fe f7422a00  f6d9bf20
f649ec80
   ff9c f6d9bf18 c0178913 f649ec80  f8fe f783b400
0024d7cb
   f6d9bf64 f9034a53 f7065408 f509f700   0001

Call Trace:
 [] reiserfs_commit_for_inode+0x14f/0x17d [reiserfs]
 [] reiserfs_sync_file+0x36/0x74 [reiserfs]
 [] do_fsync+0x48/0x75
 [] __do_fsync+0x1f/0x2f
 [] sys_fsync+0xd/0xf
 [] sysenter_past_esp+0x6d/0xa9
 [] 0xe430
 ===
Code: ff 75 c8 ff 76 64 ff 76 04 6a 00 68 c8 df 03 f9 68 34 a8 03 f9 68
4f e0 03 f9 ff 75 c4 e8 a8 77 ff ff 83 c4 28 83 7e 08 00 75 04 <0f> 0b
eb fe 8b 45 cc 8b 55 c8 3b 50 18 75 04 0f 0b eb fe ff 46
EIP: [] flush_commit_list+0x98/0x5c9 [reiserfs] SS:ESP
0068:f6d9bedc
---[ end trace 7fcf772dce1ad017 ]---
[ cut here ]
WARNING: at kernel/exit.c:892 do_exit+0x31/0x5c6()
Modules linked in: af_packet libafs(P) binfmt_misc iptable_filter
ip_tables ip6_tables x_tables microcode firmware_class reiserfs loop
dm_mod sg cpqphp container sr_mod pci_hotplug button i2c_piix4
sworks_agp cdrom hpwdt agpgart i2c_core tg3 pl2303 usbserial ohci_hcd
usbcore edd ext3 mbcache jbd fan pata_serverworks libata dock cciss
scsi_mod thermal processor [last unloaded: speedstep_lib]
Pid: 25063, comm: fileserver Tainted: P  D N 2.6.25.20-0.1-pae #1
 [] dump_trace+0x63/0x227
 [] show_trace+0x15/0x29
 [] dump_stack+0x5b/0x65
 [] warn_on_slowpath+0x41/0x67
 [] do_exit+0x31/0x5c6
 [] die+0x15e/0x166
 [] do_trap+0x8a/0xa3
 [] do_invalid_op+0x6c/0x76
 [] error_code+0x72/0x80
 [] flush_commit_list+0x98/0x5c9 [reiserfs]
 [] reiserfs_commit_for_inode+0x14f/0x17d [reiserfs]
 [] reiserfs_sync_file+0x36/0x74 [reiserfs]
 [] do_fsync+0x48/0x75
 [] __do_fsync+0x1f/0x2f
 [] sys_fsync+0xd/0xf
 [] sysenter_past_esp+0x6d/0xa9
 [] 0xe430
 ===
---[ end trace 7fcf772dce1ad017 ]---


Btw there is no physical problem with the disks, as reported by the HP
smartarray utility:

cciss_vol_status -q /dev/cciss/c0d2
/dev/cciss/c0d2: (Smart Array 5i) RAID 5 Volume 0(?) status: OK.
/dev/cciss/c0d2: (Smart Array 5i) RAID 5 Volume 1(?) status: OK.

Do I have to start worrying about it, or is it simply an openafs bug?

Cordially,

Claudio.



-- 

Claudio Prono
Systems Development @ PSS Srl, Divisione Implementazione Sistemi
Via San Bernardino, 17 - 10137 Torino (TO) - IT
Tel +39-011.32.72.100  Fax +39-011.32.46.497
PGP Fingerprint: 75C2 4049 E23D 2FBF A65F  40DB EA5C 11AC C2B0 3647
Disclaimer: http://atpss.net/disclaimer
 
