Re: [OpenAFS] Strange kernel messages from yesterday...
On Mon, Nov 2, 2009 at 7:57 AM, Harald Barth wrote:
>
>> I've never seen anything about ext4 (which is more and more
>> supported by recent kernels/distributions) : is it completely
>> deprecated for afs data store? Or does it just bring nothing more
>> (in terms of speed, reliability...) than ext3?
>
> There was a lot of enthusiasm about reiserfs in the beginning, too.
>
> Let it prove itself in a big distro on desktop systems for a while.
>
> It is just too new to say something about stability/reliability IMHO.
>
> Harald.

There have been many reports of extensive ext4 filesystem corruption after unclean shutdowns during the current 2.6.32-rc kernel series (see for example http://bugzilla.kernel.org/show_bug.cgi?id=14354). I personally experienced this twice on a desktop and lost hundreds of files (looked like roughly the last half-hour of activity) each time.

In my case this didn't touch anything too critical (although I had mild corruption of an openafs git clone), but having this happen on a /vicepx partition would not be pleasant. It definitely needs to mature before being considered for use as a server data store.

Marc

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
Re: [OpenAFS] Strange kernel messages from yesterday...
> I've never seen anything about ext4 (which is more and more
> supported by recent kernels/distributions) : is it completely
> deprecated for afs data store? Or does it just bring nothing more
> (in terms of speed, reliability...) than ext3?

There was a lot of enthusiasm about reiserfs in the beginning, too.

Let it prove itself in a big distro on desktop systems for a while.

It is just too new to say something about stability/reliability IMHO.

Harald.
Re: [OpenAFS] Strange kernel messages from yesterday...
> > Hm, is ext3 slower if used on server? In that case, anyone checked why?
> Compared to ext2 I mean. Or was your mail just difficult to parse? :-)

Parsing help:

On server: ext3 slower than xfs.
On client: ext3 slower than ext2.

I have not compared ext2 with xfs. I would not use ext2 on server for obvious reasons, but it would probably be faster than ext3.

Harald.
Re: [OpenAFS] Strange kernel messages from yesterday...
> > Ext3 works too
> > (server or client), but slower.
> >
> Hm, is ext3 slower if used on server? In that case, anyone checked why?

My benchmarks (done some years ago) showed that ext3 is extremely slow on metadata operations like file creation and deletion (compared to xfs). Vos backup, move (everything that makes clones and deletes them later) does a lot of those. I have not bothered to debug ext3 (where, in addition to that, you can run out of inodes (*)) but used the faster xfs file system instead.

Harald.

(*) Which of course recently happened on our Lustre file system partitions, where you have to use ext3.
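For anyone who wants to repeat that kind of comparison on their own hardware, a minimal sketch of a metadata micro-benchmark follows. It is not the benchmark referred to above: the function name `time_metadata_ops`, the file naming scheme, and the default count are all assumptions made for illustration. It creates and then deletes a batch of empty files in one directory and reports wall-clock time for each phase. It does not call fsync, so results partly reflect caching and should only be compared between filesystems on the same machine.

```python
import os
import tempfile
import time


def time_metadata_ops(directory, count=1000):
    """Create, then delete, `count` empty files in `directory`.

    Returns (create_seconds, delete_seconds). Hypothetical helper,
    not part of any OpenAFS tooling.
    """
    paths = [os.path.join(directory, f"bench_{i:06d}") for i in range(count)]

    start = time.perf_counter()
    for path in paths:
        # Open and close immediately: no file data is written,
        # so the cost is dominated by metadata updates.
        with open(path, "w"):
            pass
    create_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.unlink(path)
    delete_seconds = time.perf_counter() - start

    return create_seconds, delete_seconds


if __name__ == "__main__":
    # Assumption: to test a specific filesystem, pass dir=<path on it>
    # to TemporaryDirectory instead of using the system default.
    with tempfile.TemporaryDirectory() as scratch:
        created, deleted = time_metadata_ops(scratch, count=1000)
        print(f"create: {created:.3f}s  delete: {deleted:.3f}s")
```

Run it once per candidate filesystem with the scratch directory placed on that filesystem, and compare the two numbers rather than reading either in isolation.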
Re: [OpenAFS] Strange kernel messages from yesterday...
> After some benchmarking a while ago - see here:
>
> http://fbo.no-ip.org/cgi-bin/twiki/view/Instantafs/WhichFs
>
> I decided to use ext3 since most people I asked hadn't been happy with
> reiser3's stability.
>

I've never seen anything about ext4 (which is more and more supported by recent kernels/distributions) : is it completely deprecated for afs data store? Or does it just bring nothing more (in terms of speed, reliability...) than ext3?

I've seen enormous improvement in disk operations on standard desktop systems just by migrating from ext3 to ext4, which is why I'm asking myself (and you...) the question.

Thanks for your answer.

Frederic.
Re: [OpenAFS] Strange kernel messages from yesterday...
Hi,

On Mon, Nov 02, 2009 at 12:30:10PM +0100, Anders Magnusson wrote:
> Harald Barth wrote:
> >Ext3 works too
> >(server or client), but slower.
> >
> Hm, is ext3 slower if used on server? In that case, anyone checked why?

After some benchmarking a while ago - see here:

http://fbo.no-ip.org/cgi-bin/twiki/view/Instantafs/WhichFs

I decided to use ext3 since most people I asked hadn't been happy with reiser3's stability.

However, the hardware configuration mentioned on the benchmark page is no longer in use here (Core2Quad instead of Xeon, RAID6/Areca instead of RAID5/3Ware). Maybe some of the filesystems' properties changed, too.

I'm currently 95% happy with ext3. There is just one problem. "Sometimes" (esp. when my nightly debian mirror script runs), removing a directory takes forever (up to 10 seconds per rmdir() according to strace) while there's no other load on either the server or the client.

Regards,
Frank
Re: [OpenAFS] Strange kernel messages from yesterday...
Anders Magnusson wrote:
> Harald Barth wrote:
>> Ext3 works too
>> (server or client), but slower.
>
> Hm, is ext3 slower if used on server? In that case, anyone checked why?

Compared to ext2 I mean. Or was your mail just difficult to parse? :-)

-- Ragge
Re: [OpenAFS] Strange kernel messages from yesterday...
Harald Barth wrote:
> Ext3 works too
> (server or client), but slower.

Hm, is ext3 slower if used on server? In that case, anyone checked why?

-- Ragge
Re: [OpenAFS] Strange kernel messages from yesterday...
> REISERFS warning (device cciss/c0d2p1): journal-d1 flush_commit_list:
> jl->j_len = 0; jl->j_state = 0; jl->j_trans_id = 0; jl->j_refcount = 0;
> journal->trans_id = 1744730; oldest live jl->j_trans_id = 1742447

Short summary: DO NOT USE REISERFS.

Long explanation can be found in the openafs-info archives if you search around a bit.

We have found xfs to be a good file system for /vicepX partitions (with namei fileserver) and ext2 (it's only a cache and can be rebuilt after a crash) for client caches. Ext3 works too (server or client), but slower.

> Do I have to start worrying about it, or is it simply an openafs bug?

It's simply a reiserfs bug. If you have a backup and can move to something else, no need to worry.

Harald.
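As a quick way to see which filesystem each server partition currently sits on, here is a minimal sketch that reads text in the /proc/mounts format and picks out the /vicep* mount points. The helper names `vicep_filesystems` and `reiserfs_partitions` are hypothetical, and the sample text is invented for illustration: the device name is taken from the warning above, but the mount points /vicepa and /vicepb and the other devices are assumptions.

```python
def vicep_filesystems(mounts_text):
    """Return {mount_point: fs_type} for /vicep* entries.

    `mounts_text` is expected in /proc/mounts layout:
    device, mount point, filesystem type, options, dump, pass.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        if mount_point.startswith("/vicep"):
            result[mount_point] = fs_type
    return result


def reiserfs_partitions(mounts_text):
    """Return the sorted /vicep* mount points that are on reiserfs."""
    found = vicep_filesystems(mounts_text)
    return sorted(mp for mp, fs in found.items() if fs == "reiserfs")


if __name__ == "__main__":
    # Invented sample; on a real server, read /proc/mounts instead.
    sample = (
        "/dev/sda1 / ext3 rw 0 0\n"
        "/dev/cciss/c0d2p1 /vicepa reiserfs rw 0 0\n"
        "/dev/sdb1 /vicepb xfs rw 0 0\n"
    )
    for mount_point, fs_type in sorted(vicep_filesystems(sample).items()):
        note = "  <-- reiserfs" if fs_type == "reiserfs" else ""
        print(f"{mount_point}: {fs_type}{note}")
```

On a live fileserver the same functions can be fed the contents of /proc/mounts, which gives a short list of partitions to plan a migration for.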
Re: [OpenAFS] Strange kernel messages from yesterday...
On 2 Nov 2009, at 10:26, Claudio Prono wrote:
> Hello all,
>
> I have seen some strange errors in my dmesg. The system is an
> OpenSuse 11.0 with openafs-1.4.8-3.1.

The OpenAFS fileserver is just a standard user space program on Linux - it uses the underlying filesystem in exactly the same way as any other application would. If you're seeing filesystem bugs, then it's not a problem with OpenAFS, but an issue with the underlying filesystem.

Historically, reiserfs has had a number of stability and reliability issues - it's possible that you're just running into one of those.

Cheers,

Simon.
[OpenAFS] Strange kernel messages from yesterday...
Hello all,

I have seen some strange errors in my dmesg. The system is an OpenSuse 11.0 with openafs-1.4.8-3.1.

These are the errors:

REISERFS warning (device cciss/c0d2p1): journal-d1 flush_commit_list: jl->j_len = 0; jl->j_state = 0; jl->j_trans_id = 0; jl->j_refcount = 0; journal->trans_id = 1744730; oldest live jl->j_trans_id = 1742447
[ cut here ]
kernel BUG at fs/reiserfs/journal.c:1048!
invalid opcode: [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu1/online
Modules linked in: af_packet libafs(P) binfmt_misc iptable_filter ip_tables ip6_tables x_tables microcode firmware_class reiserfs loop dm_mod sg cpqphp container sr_mod pci_hotplug button i2c_piix4 sworks_agp cdrom hpwdt agpgart i2c_core tg3 pl2303 usbserial ohci_hcd usbcore edd ext3 mbcache jbd fan pata_serverworks libata dock cciss scsi_mod thermal processor [last unloaded: speedstep_lib]

Pid: 25063, comm: fileserver Tainted: PN (2.6.25.20-0.1-pae #1)
EIP: 0060:[] EFLAGS: 00010246 CPU: 0
EIP is at flush_commit_list+0x98/0x5c9 [reiserfs]
EAX: 00d7 EBX: f8fe ECX: f4d5df28 EDX: ESI: f509f700 EDI: 0024d7cb EBP: f6d9bf1c ESP: f6d9bedc
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process fileserver (pid: 25063, ti=f6d9a000 task=f518a120 task.ti=f6d9a000)
Stack: 0001 f783b400 f8fe f7422a00 f6d9bf20 f649ec80 ff9c f6d9bf18 c0178913 f649ec80 f8fe f783b400 0024d7cb f6d9bf64 f9034a53 f7065408 f509f700 0001
Call Trace:
 [] reiserfs_commit_for_inode+0x14f/0x17d [reiserfs]
 [] reiserfs_sync_file+0x36/0x74 [reiserfs]
 [] do_fsync+0x48/0x75
 [] __do_fsync+0x1f/0x2f
 [] sys_fsync+0xd/0xf
 [] sysenter_past_esp+0x6d/0xa9
 [] 0xe430
 ===
Code: ff 75 c8 ff 76 64 ff 76 04 6a 00 68 c8 df 03 f9 68 34 a8 03 f9 68 4f e0 03 f9 ff 75 c4 e8 a8 77 ff ff 83 c4 28 83 7e 08 00 75 04 <0f> 0b eb fe 8b 45 cc 8b 55 c8 3b 50 18 75 04 0f 0b eb fe ff 46
EIP: [] flush_commit_list+0x98/0x5c9 [reiserfs] SS:ESP 0068:f6d9bedc
---[ end trace 7fcf772dce1ad017 ]---
[ cut here ]
WARNING: at kernel/exit.c:892 do_exit+0x31/0x5c6()
Modules linked in: af_packet libafs(P) binfmt_misc iptable_filter ip_tables ip6_tables x_tables microcode firmware_class reiserfs loop dm_mod sg cpqphp container sr_mod pci_hotplug button i2c_piix4 sworks_agp cdrom hpwdt agpgart i2c_core tg3 pl2303 usbserial ohci_hcd usbcore edd ext3 mbcache jbd fan pata_serverworks libata dock cciss scsi_mod thermal processor [last unloaded: speedstep_lib]
Pid: 25063, comm: fileserver Tainted: P D N 2.6.25.20-0.1-pae #1
 [] dump_trace+0x63/0x227
 [] show_trace+0x15/0x29
 [] dump_stack+0x5b/0x65
 [] warn_on_slowpath+0x41/0x67
 [] do_exit+0x31/0x5c6
 [] die+0x15e/0x166
 [] do_trap+0x8a/0xa3
 [] do_invalid_op+0x6c/0x76
 [] error_code+0x72/0x80
 [] flush_commit_list+0x98/0x5c9 [reiserfs]
 [] reiserfs_commit_for_inode+0x14f/0x17d [reiserfs]
 [] reiserfs_sync_file+0x36/0x74 [reiserfs]
 [] do_fsync+0x48/0x75
 [] __do_fsync+0x1f/0x2f
 [] sys_fsync+0xd/0xf
 [] sysenter_past_esp+0x6d/0xa9
 [] 0xe430
 ===
---[ end trace 7fcf772dce1ad017 ]---

Btw, there is no physical problem with the disks, as reported by the HP smartarray utility:

cciss_vol_status -q /dev/cciss/c0d2
/dev/cciss/c0d2: (Smart Array 5i) RAID 5 Volume 0(?) status: OK.
/dev/cciss/c0d2: (Smart Array 5i) RAID 5 Volume 1(?) status: OK.

Do I have to start worrying about it, or is it simply an openafs bug?

Cordially,

Claudio.

--
Claudio Prono
Systems Development @ PSS Srl, Divisione Implementazione Sistemi
Via San Bernardino, 17 - 10137 Torino (TO) - IT
Tel +39-011.32.72.100
Fax +39-011.32.46.497
PGP Fingerprint: 75C2 4049 E23D 2FBF A65F 40DB EA5C 11AC C2B0 3647
Disclaimer: http://atpss.net/disclaimer