Christian Kujau wrote:
On Tue, October 16, 2007 00:27, Charles Perreault wrote:
I've got an Ubuntu backup server using JFS on an mdadm RAID 1 array. The
server has been using XFS for months on a 2.6.17 kernel and has been
You are aware of http://oss.sgi.com/projects/xfs/faq.html#dir2 ?
No, I wasn't, but I know about the infamous "bug" or "security feature",
call it whatever you like, that sometimes fills files with zeros when
the server crashes. That's part of why I'm moving to JFS.
Twelve days ago, the filesystem remounted itself in read-only mode. I
didn't even know that would happen in case of a problem.
See mount(8):
Mount options for jfs
[...]
errors=continue / errors=remount-ro / errors=panic
Do you really know people who actually read user manuals before they
face a problem? Little joke about tech support :P Yeah, I've seen that
line since the first time the filesystem was corrupted, and I agree
remount-ro is the best default choice.
I did a fsck in read-only mode and remounted read-write.
Do you mean you remounted the device RO and ran fsck, or did you unmount
the device and perform a RO fsck (fsck.jfs -n)? Please try to use a
current version of jfsprogs and try to fsck (without -n) the device.
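
A minimal Python sketch of the check-only fsck discussed above: it refuses to build the command while the device still appears in the mount table, since the point of the question is whether the device was unmounted first. The device names (`/dev/md0`, `/dev/md1`) and the function names are hypothetical, and the binary is assumed to be installed as `fsck.jfs`. Matching on the device field is a simplification, because the kernel may list the same device under another name.

```python
def is_mounted(mounts_text, device):
    """Return True if `device` is the source of any entry in the mount table text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:
            return True
    return False


def readonly_fsck_command(device, mounts_text):
    """Build a check-only fsck command for an unmounted JFS device.

    Assumption: the checker is installed as `fsck.jfs`; `-n` asks it to
    report problems without repairing anything.
    """
    if is_mounted(mounts_text, device):
        raise RuntimeError(device + " is still mounted; unmount it first")
    return ["fsck.jfs", "-n", device]
```

To use it, pass the contents of /proc/mounts as `mounts_text` and hand the returned list to `subprocess.run` as root.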
I mean I didn't remount anything before doing the fsck: the filesystem
put itself in read-only mode. No, I didn't use fsck -n; it was a
repair fsck that I ran. My jfsprogs are v1.1.11. I'll try doing the -n
check the next time the filesystem gets corrupted.
This morning, I woke up to find my server again remounted in read-only
mode. A funny thing is that mount reports the drive to be mounted (rw),
but it's impossible to touch any file.
Sometimes /etc/mtab does not represent the contents of /proc/mounts.
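
A minimal Python sketch of that comparison: it reads the options recorded for a mount point in both /etc/mtab and /proc/mounts, so a stale "rw" in one can be set against the kernel's "ro" in the other. The function names and the `/backup` mount point are hypothetical. On newer systems /etc/mtab is often a symlink to /proc/self/mounts, in which case the two always agree.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list of the last entry mounted at mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point):
    """Return True if the entry for mount_point carries the `ro` option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


def compare_tables(mount_point, paths=("/etc/mtab", "/proc/mounts")):
    """Map each table path to the options it records (None if unreadable or absent)."""
    result = {}
    for path in paths:
        try:
            with open(path) as handle:
                result[path] = mount_options(handle.read(), mount_point)
        except OSError:
            result[path] = None
    return result
```

Calling `compare_tables("/backup")` returns one option list per file; if they differ, /proc/mounts is the one that reflects what the kernel is actually doing.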
Again, an fsck followed by a remount gets my array back online, but this
is annoying. Does anyone know why this corruption would occur?
Hm, occasional corruption can stem from anything from software bugs to
bad hardware. 2.6.23.1 is out; you could check the changelog to see if
there's anything related to this issue. But I'm more curious about the
outcome of fsck on the unmounted device....
Christian.
I'm actually testing the hardware on that server. Memtest86+ ran 75 passes
without any errors (36 straight hours), and the CPU passed several stress
tests too. I had filesystem corruption on another server due to bad RAM in
the past, so I know what pain that can cause. Both hard drives passed
SMART tests and were zeroed with dd without problems. The only hardware
part that would need more extensive testing is the SATA
controller. It's a Silicon Image 3114, which libata lists as stable
(production ready). Also, as the HDD tests are good and they go through
the SATA controller, I'd say it's working correctly. I'll try compiling
2.6.23 later this week.
Charles
_______________________________________________
Jfs-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jfs-discussion