Re: Corrupted FS every 50 checks

2010-08-15 Thread Merciadri Luca
Henrique de Moraes Holschuh wrote:
> On Sat, 14 Aug 2010, Merciadri Luca wrote:
>   
>> always on the same: /dev/sdc5. Well, this is where I have all my docs,
>> my university stuff, and this is even more annoying. I could do backups,
>> 
>
> I sure hope you *DID* extensive backups.  Often.  And stored some of
> them for permanent archival.
>   
:)
>> but I can't understand why this filesystem is problematic, because I
>> 
>
> The typical answer is: "because it is always getting corrupted again".
>   
But why again?
>> always get the same errors. This time, it was mainly *things like error
>> filesystem Inode has EXTENTS_FL flag set, but not too many.*
>> 
>
> This could mean you've managed to mess with an ext4 filesystem as if
> it were ext3.  That won't corrupt file data, but can cause lots of other
> problems, and could cause data loss if you manage to confuse the kernel
> and the fsck tools enough.  With some luck, it will end up in lost+found
> after the proper fsck.
>
> I sure hope that's what is happening.
>   
Well, I never modify things like the fs structure type, such as ext3 or
ext4. Where could I have modified this? And why does it often happen to
the same files?
>> *2) Should I think about buying another hdd? I tried with the hdd life
>> 
>
> You should do a 24H memtest86 marathon on that box.
I did it many months ago for another issue, and it went okay.
>  And you should make
> sure you're *always* using that filesystem as ext4, since you apparently
> have a part of it using ext4 features.
>
> If it is a hardware problem, it is a strange one... it should be messing
> with the entire disk, not just a set of files in the same partition. It
> can't be bad magnetic media, as that causes sector read errors, not
> corruption.
>
> I hope this answer helps you.  I won't be able to help you further on this.
> Maybe someone else can (or has a better idea of what the problem could be).
>   
Thanks.

-- 
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.






signature.asc
Description: OpenPGP digital signature


Re: Corrupted FS every 50 checks

2010-08-15 Thread Merciadri Luca
Bob Proulx wrote:
> Does your disk support S.M.A.R.T.?
>
>   http://en.wikipedia.org/wiki/S.M.A.R.T.
>
> Try it and see if the disk drive reports any physical errors.
>
>   $ sudo apt-get install smartmontools
>
> Here are some example uses:
>
>   $ sudo smartctl -i /dev/sda
>   SMART support is: Available - device has SMART capability.
>   SMART support is: Enabled
>
> If smart is available but disabled then you would need to enable it
> before making use of it.
>
>   $ sudo smartctl -s on /dev/sda
>
> Then check the disk health status.
>
>   $ sudo smartctl -H /dev/sda
>   SMART overall-health self-assessment test result: PASSED
>   
That's what I was speaking about (smartmontools). Well, no problem, as I
said:

==
# smartctl -H /dev/sdc5
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
==

> You can manually run a selftest like this:
>
>   $ sudo smartctl -t short /dev/sda
>
> Then wait a couple of minutes for the test to complete and then
> observe the results.
>
>   $ sudo smartctl -l selftest /dev/sda
>   
==
# smartctl -l selftest /dev/sdc5
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     13106         -
==

> I recommend setting up automatic regular selftests by configuring
> smartmontools to run them regularly.
>   
I'll do that.
> Hopefully your disk is okay and not reporting physical errors.
>   
Yes. But I even ran a longer test months before, and it reported no
error. Weird.

-- 
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.






Re: Corrupted FS every 50 checks

2010-08-15 Thread Merciadri Luca
Bob Proulx wrote:
> Hanspeter Spalinger wrote:
>   
>> Merciadri Luca wrote:
>> 
>>> problem is that I always get errors when e2fsck verifies the fs, and
>>> always on the same: /dev/sdc5.  ...
>>> but I can't understand why this filesystem is problematic, because I
>>> don't use it often, at least these times. I always have errors about the
>>>   
>> Do you actually FIX those errors? AFAIK the fsck at startup does not fix
>> all errors (it plays it safe). Try running fsck manually (but make a backup
>> first and read the man-page).
>> 
>
> The control for this is in /etc/default/rcS with the FSCKFIX
> variable.  If it is set to no then no fix happens.  If it is set to
> yes then at boot time fsck is enabled to automatically fix what it
> can.
>
> In /etc/default/rcS file:
>   FSCKFIX=yes
>
> Note that if it is set to no on a remote server that you do not have
> console access to then it is possible to get into a state where the
> machine will not reboot on its own because it will be waiting for
> console access to get past the fsck questions.
>
> These are documented in the rcS man page.
>
>   $ man rcS
>
> FSCKFIX
>   When the root and all other file systems are checked,
>   fsck is invoked with the -a option which means
>   "autorepair".  If there are major inconsistencies then
>   the fsck process will bail out.  The system will print a
>   message asking the administrator to repair the file
>   system manually and will present a root shell prompt
>   (actually a sulogin prompt) on the console.  Setting
>   this option to yes causes the fsck commands to be run
>   with the -y option instead of the -a option.  This will
>   tell fsck always to repair the file systems without
>   asking for permission.
>   
I just modified /etc/default/rcS accordingly. (`FSCKFIX' was set to `no'.)
>> I assume you use the other partitions more often, with no error, so I
>> don't think your problem is hardware related.
>> 
>
> A good diagnosis!  But I would still look to be sure.  :-)
>   


-- 
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.








Re: Corrupted FS every 50 checks

2010-08-15 Thread Merciadri Luca
Hanspeter Spalinger wrote:
> On 08/14/2010 09:01 PM, Merciadri Luca wrote:
> > Hi,
>
> > I have had the fs check interval set to 50 mounts for some years now. The
> > problem is that I always get errors when e2fsck verifies the fs, and
> > always on the same: /dev/sdc5. Well, this is where I have all my docs,
> > my university stuff, and this is even more annoying. I could do backups,
> > but I can't understand why this filesystem is problematic, because I
> > don't use it often, at least these times. I always have errors about the
> > AucTeX files. Well, auctex is simply an emacs extension for LaTeX, and I
> > can't understand why some specific files always cause troubles. I don't
> > always get the same errors. This time, it was mainly *things like error
> > filesystem Inode has EXTENTS_FL flag set, but not too many.*
> > **
> > *1) Why are they happening? My disk is not old. (/dev/sdc5 is on another
> > disk than the /etc, etc.)*
> > *2) Should I think about buying another hdd? I tried with the hdd life
> > tools (I can't remember their names) which are bundled with Debian, but
> > they don't show any failing thing (they have quite the same score as
> > other hdds in my computer).*
> > ***
>
> > Thanks.
> > *
>
> Hi,
> Do you actually FIX those errors? AFAIK the fsck at startup does not fix
> all errors (it plays it safe). Try running fsck manually (but make a backup
> first and read the man-page).
Yes, I always fix them. If necessary, I fix them manually.
> I assume you use the other partitions more often, with no error, so I
> don't think your problem is hardware related.
Huh? I can't understand that. Would you mean that when a partition is
not used often, it is more error-prone (subject to errors)?

Thanks.

-- 
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.








Re: Corrupted FS every 50 checks

2010-08-14 Thread Bob Proulx
T o n g wrote:
> Bob Proulx wrote:
> > Note that if it is set to no on a remote server that you do not have
> > console access to then it is possible to get into a state where the
> > machine will not reboot on its own because it will be waiting for
> > console access to get past the fsck questions.
> 
> So remote servers mostly set FSCKFIX to yes?

Most of the time there aren't any problems.  This is something that
only becomes visible when there is a problem that triggers the
interactive fsck.  So actually I think that most of the machines have
the default which is no.  And since problems are rare it isn't a big
deal.  But I guarantee you that the admin will change it after the
second time that it burns them!  :-)

I was burned once by this and so now I always change it.  I find it
hard to believe that it isn't the default.  But it is easily locally
customized.

Bob




Re: Corrupted FS every 50 checks

2010-08-14 Thread Henrique de Moraes Holschuh
On Sat, 14 Aug 2010, Merciadri Luca wrote:
> always on the same: /dev/sdc5. Well, this is where I have all my docs,
> my university stuff, and this is even more annoying. I could do backups,

I sure hope you *DID* extensive backups.  Often.  And stored some of
them for permanent archival.

> but I can't understand why this filesystem is problematic, because I

The typical answer is: "because it is always getting corrupted again".

> always get the same errors. This time, it was mainly *things like error
> filesystem Inode has EXTENTS_FL flag set, but not too many.*

This could mean you've managed to mess with an ext4 filesystem as if
it were ext3.  That won't corrupt file data, but can cause lots of other
problems, and could cause data loss if you manage to confuse the kernel
and the fsck tools enough.  With some luck, it will end up in lost+found
after the proper fsck.

I sure hope that's what is happening.

> *2) Should I think about buying another hdd? I tried with the hdd life

You should do a 24H memtest86 marathon on that box.  And you should make
sure you're *always* using that filesystem as ext4, since you apparently
have a part of it using ext4 features.
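
One way to check this, as a minimal sketch only: `tune2fs -l` prints a
"Filesystem features:" line, and /proc/mounts shows the type the kernel
actually mounted the partition as. The helper name has_extent_feature is
made up for illustration; /dev/sdc5 is the partition from this thread.

```shell
# Hypothetical helper: reads `tune2fs -l` output on stdin and succeeds
# if the "Filesystem features:" line lists the ext4 `extent` feature.
has_extent_feature() {
    grep '^Filesystem features:' | grep -qw 'extent'
}

# Usage on the partition from this thread (needs root):
#   tune2fs -l /dev/sdc5 | has_extent_feature && echo "extents in use"
#   grep sdc5 /proc/mounts    # third field is the type it is mounted as
```

If the feature is listed but the partition shows up as ext3 in
/proc/mounts or /etc/fstab, that would match the mismatch described above.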

If it is a hardware problem, it is a strange one... it should be messing
with the entire disk, not just a set of files in the same partition. It
can't be bad magnetic media, as that causes sector read errors, not
corruption.

I hope this answer helps you.  I won't be able to help you further on this.
Maybe someone else can (or has a better idea of what the problem could be).

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org 
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20100815012142.ga16...@khazad-dum.debian.net



Re: Corrupted FS every 50 checks

2010-08-14 Thread Bob Proulx
Hanspeter Spalinger wrote:
> Merciadri Luca wrote:
> > problem is that I always get errors when e2fsck verifies the fs, and
> > always on the same: /dev/sdc5.  ...
> > but I can't understand why this filesystem is problematic, because I
> > don't use it often, at least these times. I always have errors about the
>
> Do you actually FIX those errors? AFAIK the fsck at startup does not fix
> all errors (it plays it safe). Try running fsck manually (but make a backup
> first and read the man-page).

The control for this is in /etc/default/rcS with the FSCKFIX
variable.  If it is set to no then no fix happens.  If it is set to
yes then at boot time fsck is enabled to automatically fix what it
can.

In /etc/default/rcS file:
  FSCKFIX=yes

Note that if it is set to no on a remote server that you do not have
console access to then it is possible to get into a state where the
machine will not reboot on its own because it will be waiting for
console access to get past the fsck questions.

These are documented in the rcS man page.

  $ man rcS

  FSCKFIX
      When the root and all other file systems are checked,
      fsck is invoked with the -a option which means
      "autorepair".  If there are major inconsistencies then
      the fsck process will bail out.  The system will print a
      message asking the administrator to repair the file
      system manually and will present a root shell prompt
      (actually a sulogin prompt) on the console.  Setting
      this option to yes causes the fsck commands to be run
      with the -y option instead of the -a option.  This will
      tell fsck always to repair the file systems without
      asking for permission.
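
To see what a machine is currently set to, something like the following
could be used. This is a rough sketch: the helper name fsckfix_value is
made up for illustration, and it only looks for a plain FSCKFIX=... line
in the file it is given.

```shell
# Hypothetical helper: print the FSCKFIX value set in an rcS-style file
# (defaults to /etc/default/rcS), or "unset" if there is no such line.
fsckfix_value() {
    file="${1:-/etc/default/rcS}"
    # If the variable is assigned more than once, the last line wins.
    value=$(sed -n 's/^FSCKFIX=//p' "$file" | tail -n 1)
    echo "${value:-unset}"
}

# Usage:
#   fsckfix_value
#   sudo sed -i 's/^FSCKFIX=.*/FSCKFIX=yes/' /etc/default/rcS
```

The sed line assumes GNU sed (for -i) and that an FSCKFIX line already
exists in the file; otherwise edit the file by hand.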

> I assume you use the other partitions more often, with no error, so I
> don't think your problem is hardware related.

A good diagnosis!  But I would still look to be sure.  :-)

Bob




Re: Corrupted FS every 50 checks

2010-08-14 Thread Bob Proulx
Merciadri Luca wrote:
> but I can't understand why this filesystem is problematic, because I
> don't use it often, at least these times. I always have errors about the

Does your disk support S.M.A.R.T.?

  http://en.wikipedia.org/wiki/S.M.A.R.T.

Try it and see if the disk drive reports any physical errors.

  $ sudo apt-get install smartmontools

Here are some example uses:

  $ sudo smartctl -i /dev/sda
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled

If smart is available but disabled then you would need to enable it
before making use of it.

  $ sudo smartctl -s on /dev/sda

Then check the disk health status.

  $ sudo smartctl -H /dev/sda
  SMART overall-health self-assessment test result: PASSED

You can manually run a selftest like this:

  $ sudo smartctl -t short /dev/sda

Then wait a couple of minutes for the test to complete and then
observe the results.

  $ sudo smartctl -l selftest /dev/sda

I recommend setting up automatic regular selftests by configuring
smartmontools to run them regularly.
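
As a sketch of what that could look like: smartd reads /etc/smartd.conf,
and its -s directive takes a schedule pattern for self-tests. The helper
name smartd_line is made up for illustration, and the schedule shown
(short test daily in the 02:00 hour, long test on Saturdays in the 03:00
hour) is only an example.

```shell
# Hypothetical helper: print a smartd.conf line for one disk that
# monitors all attributes (-a) and schedules self-tests (-s):
# short test every day in the 02 hour, long test Saturdays in the 03 hour.
smartd_line() {
    printf '%s -a -s (S/../.././02|L/../../6/03)\n' "$1"
}

# Usage (review the output before adding it to /etc/smartd.conf):
#   smartd_line /dev/sda
```

Check the smartd.conf man page before relying on this. In particular, if
the file contains a DEVICESCAN line, per-device lines placed after it are
not read.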

Hopefully your disk is okay and not reporting physical errors.

Bob




Re: Corrupted FS every 50 checks

2010-08-14 Thread Hanspeter Spalinger
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

On 08/14/2010 09:01 PM, Merciadri Luca wrote:
> Hi,
> 
> I have had the fs check interval set to 50 mounts for some years now. The
> problem is that I always get errors when e2fsck verifies the fs, and
> always on the same: /dev/sdc5. Well, this is where I have all my docs,
> my university stuff, and this is even more annoying. I could do backups,
> but I can't understand why this filesystem is problematic, because I
> don't use it often, at least these times. I always have errors about the
> AucTeX files. Well, auctex is simply an emacs extension for LaTeX, and I
> can't understand why some specific files always cause troubles. I don't
> always get the same errors. This time, it was mainly *things like error
> filesystem Inode has EXTENTS_FL flag set, but not too many.*
> **
> *1) Why are they happening? My disk is not old. (/dev/sdc5 is on another
> disk than the /etc, etc.)*
> *2) Should I think about buying another hdd? I tried with the hdd life
> tools (I can't remember their names) which are bundled with Debian, but
> they don't show any failing thing (they have quite the same score as
> other hdds in my computer).*
> ***
> 
> Thanks.
> *
> 
Hi,
Do you actually FIX those errors? AFAIK the fsck at startup does not fix
all errors (it plays it safe). Try running fsck manually (but make a backup
first and read the man page).
I assume you use the other partitions more often, with no error, so I
don't think your problem is hardware related.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iF4EAREIAAYFAkxm+XoACgkQpjmLjrU66/6hpQEAiqgvUOf5Baw2D4tewIOf4pkY
zM4vOwiRBs8qcsjV8n0BALKwn0DvqLiN/R2x5X+BR1wGJuD6NajEan2EEpzwbVOz
=02sD
-END PGP SIGNATURE-





Corrupted FS every 50 checks

2010-08-14 Thread Merciadri Luca
Hi,

I have had the fs check interval set to 50 mounts for some years now. The
problem is that I always get errors when e2fsck verifies the fs, and
always on the same: /dev/sdc5. Well, this is where I have all my docs,
my university stuff, and this is even more annoying. I could do backups,
but I can't understand why this filesystem is problematic, because I
don't use it often, at least these times. I always have errors about the
AucTeX files. Well, auctex is simply an emacs extension for LaTeX, and I
can't understand why some specific files always cause troubles. I don't
always get the same errors. This time, it was mainly *things like error
filesystem Inode has EXTENTS_FL flag set, but not too many.*
*1) Why are they happening? My disk is not old. (/dev/sdc5 is on another
disk than the /etc, etc.)*
*2) Should I think about buying another hdd? I tried with the hdd life
tools (I can't remember their names) which are bundled with Debian, but
they don't show any failing thing (they have quite the same score as
other hdds in my computer).*

Thanks.

-- 
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.

To be uncertain is uncomfortable; but to be certain is ridiculous. (Goethe)


