Thanks Mauro,

I've checked for overlaps and confirmed there are none that could account
for this. My next step is to check the swap files, as Michael suggests,
though if we had that sort of problem I'd expect to see much more
corruption.
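
For anyone who wants to repeat the overlap check by script rather than by
eye, here is a minimal sketch. It assumes the minidisk extents on a single
volume have already been copied out of the DISKMAP report by hand into
(label, start cylinder, end cylinder) tuples. The function name, the labels
and the cylinder numbers are made up for illustration, and nothing here
parses real DISKMAP or DIRMAP output.

```python
from typing import List, Tuple

# (label, first cylinder, last cylinder), both ends inclusive.
# All extents are assumed to live on the same volume.
Extent = Tuple[str, int, int]


def find_overlaps(extents: List[Extent]) -> List[Tuple[str, str]]:
    """Return pairs of labels whose cylinder ranges intersect."""
    ordered = sorted(extents, key=lambda e: (e[1], e[2]))
    overlaps = []
    for i, (label_a, _start_a, end_a) in enumerate(ordered):
        for label_b, start_b, _end_b in ordered[i + 1:]:
            if start_b > end_a:
                # Sorted by start cylinder: nothing later can touch extent A.
                break
            overlaps.append((label_a, label_b))
    return overlaps


if __name__ == "__main__":
    # Hypothetical extents, not taken from any real system.
    sample = [
        ("LINUX1 0201", 1, 3338),
        ("LINUX2 0201", 3339, 6676),
        ("LINUX3 0202", 6000, 7000),
    ]
    for a, b in find_overlaps(sample):
        print(f"overlap: {a} <-> {b}")
```

An empty result only says that the extents you typed in do not overlap; it
says nothing about disks shared in other ways, such as links from another
guest.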

The investigation continues...
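
In case it helps with narrowing down the time frame from backups: a rough
sketch of how one might flag 4K-aligned blocks in a binary file that look
like plain text, which is the symptom from my original report. The 95%
threshold and the function names are my own assumptions, and a legitimate
string table inside a ".so" could trigger a false positive, so treat any hit
as a hint rather than proof.

```python
BLOCK_SIZE = 4096  # block size of the underlying DASD, per the original report


def is_mostly_text(block: bytes, threshold: float = 0.95) -> bool:
    """True if at least `threshold` of the bytes are printable ASCII or tab/LF/CR."""
    if not block:
        return False
    printable = sum(1 for b in block if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(block) >= threshold


def text_block_offsets(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return the byte offsets of block-aligned chunks that look like text."""
    return [
        offset
        for offset in range(0, len(data), block_size)
        if is_mostly_text(data[offset:offset + block_size])
    ]


if __name__ == "__main__":
    import sys

    # Usage: python scan_blocks.py <restored-copy-of-the-file>
    with open(sys.argv[1], "rb") as f:
        contents = f.read()
    for offset in text_block_offsets(contents):
        print(f"text-like block at offset {offset} (block {offset // BLOCK_SIZE})")
```

Running this against each restored backup copy of the same file should show
the first backup in which the suspicious block appears.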

Cheers,
Donald Russell



On Wed, Oct 2, 2013 at 10:28 AM, Mauro Souza <thoriu...@gmail.com> wrote:

> Data corruption should be *very* rare in any working system. The majority
> of cases occur when two systems write to the same disk, uncoordinated.
> When you changed from ext3 to ext2, you threw away the only thing
> preventing this from happening. Ext3 detected something amiss and locked
> the disk to prevent further damage. Ext2 doesn't see anything amiss, writes
> down the data, and corrupts everything. Working as designed. You just
> removed the safety pin that kept your gun from firing when aimed at your
> foot (or your head).
>
> You should run DISKMAP or DIRMAP on your system to see if there are
> overlapping minidisks. You should see an overlap. Correcting the overlap
> should correct the corruption.
> And, please, revert to ext3. It will save your data, and you can see
> messages in syslog if things go astray...
>
> Mauro
> http://mauro.limeiratem.com - registered Linux User: 294521
> Scripture is both history, and a love letter from God.
>
>
> 2013/10/2 Donald Russell <russell....@gmail.com>
>
> > RHEL 5.8 zLinux on zVM 6.1 using ECKD disks
> >
> > We have a recurring problem where files get corrupted. It always happens
> > in the same directory, and we always wind up running fsck in single user
> > mode to get the file system back together.
> >
> > The file system is EXT-2. (We changed from EXT-3 on this FS due to
> > numerous journaling errors which caused the FS to go to RO mode)
> >
> > In the case I'm currently working on, a ".so" file (binary) has a chunk
> > of plain text in the middle of it. The "chunk" is 4K bytes long, and is a
> > piece of a program listing. 4K is the block size of the underlying DASD.
> >
> > I am now in the process of trying to find when this happened by restoring
> > backup copies and seeing if I can narrow the time frame down.
> >
> > Obviously I don't expect a "do this to fix the problem", but what I'm
> > wondering is, has anybody else encountered this? What could cause a block
> > in the middle of the file to be overwritten this way? Are there any
> > tools I can run (preferably while the system is at runlevel 3 or 5) to
> > check if two (or more) files are using the same block in a file?
> >
> > I envision a "bad block pointer" somewhere, but how does that happen?
> >
> > I'm considering converting the file system to EXT-4... I don't think
> > there's a conversion per-se, I'd create a new EXT-4 FS then copy all the
> > files from the EXT2 FS to the EXT-4 one.
> >
> > Any suggestions/help are greatly appreciated.
> >
> > Thank you,
> > Donald Russell
> >
> > ----------------------------------------------------------------------
> > For LINUX-390 subscribe / signoff / archive access instructions,
> > send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or
> > visit
> > http://www.marist.edu/htbin/wlvindex?LINUX-390
> > ----------------------------------------------------------------------
> > For more information on Linux on System z, visit
> > http://wiki.linuxvm.org/
> >
>
>

