On 29.08.20 10:18, Alexander V. Makartsev wrote:
On 29.08.2020 07:59, Long Wind wrote:
installation of linux to sdb1 fails
i believe hard disk has bad sector
If the hard drive has bad sectors or has recently encountered them, this
should be recorded in the drive's SMART data.
Alternatively, you can use the "badblocks" program from the "e2fsprogs"
package to scan the hard drive for bad blocks.
I'd test a wiped hard drive with a non-destructive read test first,
followed by a write test.
Testing media for bad blocks can be time-consuming if the hard drive is
multiple terabytes in size.
i use e2fsck with -c, i.e. read-only test
it doesn't report any error
I second the recommendation to use badblocks.
If you need to rescue data from the disk first (although your question
sounds as if there is no data left on it worth rescuing), run ddrescue
from the gddrescue package before anything else.
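A rough sketch of such a rescue run, assuming the failing disk is
/dev/sdb and another, healthy disk has enough free space for a full
image. The function name and the file names are only placeholders of
mine, so please adapt them and check the ddrescue manual first:

```shell
#!/bin/sh
# Copy a failing disk to an image file with GNU ddrescue
# (Debian package: gddrescue). Run as root.
# rescue_disk is a hypothetical helper name, not part of ddrescue.
rescue_disk() {
    src=$1      # failing disk, e.g. /dev/sdb (assumption)
    image=$2    # image file on a different, healthy disk
    map=$3      # map file; lets an interrupted run resume where it stopped
    # -d   direct disc access for the input, bypassing the kernel cache
    # -r3  retry the bad areas up to three times
    ddrescue -d -r3 "$src" "$image" "$map"
}

# Example call (not executed here):
#   rescue_disk /dev/sdb /mnt/backup/sdb.img /mnt/backup/sdb.map
```

Keeping the map file matters: with it, a second call with the same
arguments continues the rescue instead of starting over.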
Then run badblocks in write mode with option "-n", for the following
reason. If I am correctly informed, disks with S.M.A.R.T. usually keep
a reserve of spare blocks. The disk's firmware remaps blocks it has
detected as bad to this reserve by itself, without the operating system
seeing it. The statistics about these permanent remappings are kept in
the disk's S.M.A.R.T. log, which you can read with the smartctl program
(package smartmontools).

But the disk will normally remap a bad block only when that block is
written. A read alone does not trigger the remapping; at most the block
is counted as pending (attribute Current_Pending_Sector), and it does
not appear among the reallocated blocks in the S.M.A.R.T. report. When
you write to the disk later (for example during the OS installation
that failed for you), the disk either protects you invisibly by
remapping to spare blocks or, if no spare blocks are left, leaves the
problem to the operating system. This may be what is happening in your
case right now.

The operating system then has to maintain its own list of bad blocks,
namely those the disk itself can no longer take care of. Again, only
reading from the disk may not be enough to find these blocks. I
therefore recommend letting the operating system search for them by
running badblocks in write mode, with option "-n" (non-destructive
read-write test) or "-w" (destructive write test, which erases the
disk); please consult the man page to see which one fits your needs.

I would actually repeat such a run several times to see whether the
number of bad blocks stays constant or keeps increasing. If it is
increasing, you should definitely replace the disk. If it is constant
but badblocks already finds bad blocks that the disk could not remap, I
would still seriously consider replacing the disk now, if your finances
allow it. If you want to avoid a replacement for now, then at least
keep monitoring the situation very frequently and, of course, keep a
proper backup of your data on a known-good medium at all times. A disk
that can no longer absorb problems for you automatically has to be
monitored frequently, so you will have to weigh the time this costs,
and the lack of trust in the disk, against the price of a new one.
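To make the repeated runs comparable, something along these lines
could work. It is only a sketch: the helper names are mine, /dev/sdb is
an assumption, and the S.M.A.R.T. attribute names differ between
vendors, so the grep pattern may need adjusting for your disk:

```shell
#!/bin/sh
# Hypothetical helpers; run as root, with the disk not mounted.

# Show the S.M.A.R.T. counters for remapped and pending sectors.
# These two attribute names are common on ATA disks but not universal.
show_remap_counters() {
    smartctl -A "$1" | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector'
}

# Run one non-destructive read-write pass over disk $1 and store the
# list of bad blocks in file $2.
# -n non-destructive read-write mode, -s show progress, -v verbose,
# -o write the bad block list to a file
scan_once() {
    badblocks -n -s -v -o "$2" "$1"
}

# Count the bad blocks in a list file (one block number per line).
count_bad() {
    grep -c . "$1" || true
}

# Compare the lists of an older run ($1) and a newer run ($2).
compare_runs() {
    old=$(count_bad "$1")
    new=$(count_bad "$2")
    if [ "$new" -gt "$old" ]; then
        echo "increasing: $old -> $new"
    elif [ "$new" -eq 0 ]; then
        echo "no bad blocks found"
    else
        echo "not increasing: $old -> $new"
    fi
}

# Example session (not executed here):
#   show_remap_counters /dev/sdb
#   scan_once /dev/sdb run1.txt
#   scan_once /dev/sdb run2.txt
#   compare_runs run1.txt run2.txt
```

Looking at the smartctl counters before and after each badblocks pass
shows whether the disk itself remapped anything during the write test.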
If the disk turns out not to be the cause of your trouble, you could
check whether the components on the motherboard that move data around
are still fine:
- Write a large amount of data to the disk: use dd to copy a large
amount of data from a drive connected externally via USB to the drive
you currently suspect. Maybe the motherboard sometimes fails to handle
such a job without errors; in that case the disk might still be fine
and could be reused elsewhere, while the data path on your motherboard
has started to fail. For this check I would not simply use
if=/dev/zero but really read the data from another drive, so that the
same data path on the motherboard is used as during your OS
installation or during future copy operations.
- Check your RAM with memtest86+. You will have to find a live USB
stick that offers it in its GRUB boot menu; I am not perfectly sure
right now, but I would expect the Knoppix Linux distribution to offer
it.
- Check your CPU with stress-ng.
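For the dd and stress-ng checks, a sketch like the following might
serve as a starting point. The helper names, the device names in the
example and the amount of data are assumptions of mine; note that
writing to the target drive destroys whatever is stored on it:

```shell
#!/bin/sh
# Hypothetical helpers; run as root. Writing to the target is destructive.

# Copy the first $3 MiB from source $1 to target $2, then compare both.
copy_and_verify() {
    src=$1; dst=$2; count=$3
    # conv=fsync makes dd flush the written data to the target before exiting
    dd if="$src" of="$dst" bs=1M count="$count" conv=fsync 2>/dev/null || return 1
    # -s: silent, -n: compare only the bytes that were just copied
    if cmp -s -n $((count * 1024 * 1024)) "$src" "$dst"; then
        echo "copy verified"
    else
        echo "copy differs"
        return 1
    fi
}

# Load all CPUs for the duration $1 (e.g. 10m) and verify the results.
# --cpu 0 starts one worker per online CPU.
cpu_stress() {
    stress-ng --cpu 0 --verify --timeout "$1"
}

# Example session (not executed here; /dev/sdc is the USB drive and
# /dev/sdb the suspected drive, both assumptions):
#   copy_and_verify /dev/sdc /dev/sdb 20000
#   cpu_stress 10m
```

On a real disk the comparison may be answered partly from the kernel's
page cache, so it is more meaningful to drop the caches (as root:
echo 3 > /proc/sys/vm/drop_caches) between the copy and the compare, or
to compare a larger amount of data than fits in RAM.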
Good luck!
Marco