On 1/21/24 14:48, gene heskett wrote:
On 1/21/24 16:13, David Christensen wrote:
On 1/21/24 03:47, gene heskett wrote:
On 1/21/24 01:33, David Christensen wrote:
I am still uncertain if those are internal SSD errors or SATA
errors. Please check if you see matching errors in dmesg(1).
There aren't any. Those hours would very closely correspond to my
attempts to rsync, when the OOM daemon killed the machine -- which it
did around 10 times. So logging by then had been killed. That to me is
the smoking gun.
The kernel ring buffer is renewed with each boot, and newer messages
overwrite older ones. So, you will want to save or clear the ring
buffer with dmesg(1), save a full SMART report, exercise the disk with
dd(1) and/or a SMART self-test, save the ring buffer again, save another
full SMART report, and analyze everything to see whether you have disk
problems, SATA problems, and/or system problems. Once everything passes
without error, the disk is ready to be put into service.
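That check cycle might be sketched roughly like this. It is a sketch
only: /dev/sdX is a placeholder device name, and the RUN guard keeps it
inert until you have substituted the real disk.

```shell
#!/bin/sh
# Sketch of the disk-verification cycle described above.
# /dev/sdX is a placeholder; leave RUN=no until you substitute the
# real device name (requires root, and dd reads the whole disk).
RUN=no
DEV=/dev/sdX

if [ "$RUN" = yes ]; then
    dmesg --clear                          # start with an empty ring buffer
    smartctl -x "$DEV" > smart-before.txt  # full SMART report, before
    dd if="$DEV" of=/dev/null bs=1M        # read-exercise the entire disk
    smartctl -t long "$DEV"                # queue a long SMART self-test
    # ...wait for the self-test to finish, then:
    smartctl -x "$DEV" > smart-after.txt
    dmesg > dmesg-after.txt                # any new disk/SATA errors?
    diff smart-before.txt smart-after.txt  # did the SMART counters move?
fi
```

Comparing the before/after SMART reports and the ring buffer is what
separates internal SSD errors from SATA link errors.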
2T is enough /home for the nonce, so I'll do the rsync thing going
the other direction, using it as a backup of /home until I'm ready
for trixie.
However, I am tempted to zero the drives and recreate the RAID without
partitioning, since mdadm seems capable of assembling whole,
unpartitioned drives into an array, giving me a backup that size-wise
is about the same as the single 2T drive has now.
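A whole-drive array of that sort might look like the sketch below. The
device names are hypothetical, and note that mdadm only creates the
array; a filesystem still has to go on top of /dev/md0.

```shell
#!/bin/sh
# Sketch only: hypothetical device names. mdadm --create destroys the
# drives' contents, so leave RUN=no until you substitute real, empty drives.
RUN=no

if [ "$RUN" = yes ]; then
    # RAID10 across four whole, unpartitioned drives
    mdadm --create /dev/md0 --level=10 --raid-devices=4 \
        /dev/sda /dev/sdb /dev/sdc /dev/sdd
    mkfs.ext4 /dev/md0    # mdadm provides the array, not the filesystem
    mdadm --detail --scan >> /etc/mdadm/mdadm.conf   # persist across boots
fi
```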
And although my single experience with LVM, over a decade ago, was a
total disaster (built out of used spinning rust), I may now see how the
other four 2T's do assembled as an 8T LVM volume for amanda's vtapes,
to back up the whole system, which in addition to the 4 cnc'd machines
has over the last 5 years seen a train of 3d printers go by. If all
3, currently a WIP, get rebuilt, the smallest is 305 by, the largest
is 400 by. And all, I hope, will lay plastic at 200+ mm a second.
Normal consumer stuff is 40 to 60.
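Assembling four 2T drives into one 8T volume for amanda's vtapes could
be sketched as follows. The device names, the vg_vtapes/lv_vtapes
names, and the mount point are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: hypothetical devices and names; set RUN=yes after editing.
RUN=no

if [ "$RUN" = yes ]; then
    pvcreate /dev/sde /dev/sdf /dev/sdg /dev/sdh   # mark drives as PVs
    vgcreate vg_vtapes /dev/sde /dev/sdf /dev/sdg /dev/sdh
    lvcreate -l 100%FREE -n lv_vtapes vg_vtapes    # one ~8T logical volume
    mkfs.ext4 /dev/vg_vtapes/lv_vtapes
    mount /dev/vg_vtapes/lv_vtapes /amanda/vtapes  # hypothetical mount point
fi
```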
Obviously I have an eclectic choice of too many hobbies. ;o)>
Now if curiosity doesn't kill this cat, I need to find some breakfast
and git to it.
This and other threads have led me to the conclusion that consumer
SSD's are meant for devices that are off most of the time -- e.g.
notepad, laptop, desktop, and workstation computers. If you put them
into a NAS/ file server and run them 24x7, they will die sometime
after 2 years.
That has not been my experience at all, David. I bought a 4-pack of 120G
SSD's when they were the biggest available and replaced 3 spinning-rust
drives that had 50-70k hours on them. My cnc machines are all wired so
power for the mill/lathe/what-have-you is totally controlled by the
enable key, F2, so if F2 is off only the computer is running. That was
at least 6 years ago. Then I installed a 240G as an extra drive on the
rpi4 that runs my biggest lathe and made a buildbot out of it to pull
linuxcnc-master from github and build it, plus armhf kernels for
linuxcnc's realtime needs. The 120G disappeared in about a year; I
replaced the adapter with a StarTech, and the drive was and is just
fine. There are now at least 5 years on every one of those original
120's, with zero SSD problems in the whole lot.
I also have small SSD's that have lasted far longer than 2 years on
mixed duty, including 24x7 (Intel SSD 520 Series 60 GB). The relevant
recent threads on this list seem to involve 1+ TB Samsungs.
It is interesting to note that BackBlaze does not seem to use Samsung SSD's:
https://www.backblaze.com/blog/ssd-edition-2023-mid-year-drive-stats-review/
3. For Amanda, either add more HDD's to the storage server or build
another server. If another server, shut it down when you are not
using it.
Speaking as someone who has used amanda for about 25 years:

People don't always understand that one of Amanda's prime directives is
to balance the size of an individual backup run by promoting the level
3 scheduled for tonight to a level 0 if this run would otherwise be
small. The only guarantee is that, if you have a 10-day schedule, all
machines/DLEs will get that level 0 backup not more than 10 days after
the last one. You choose how many days long that cycle is. I adjust it
so the storage is around 75 to 80% used after the schedule has
stabilized; this may take quite a few such cycles.
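In amanda.conf terms, that cycle is governed by a few knobs; the values
below are illustrative only, not a recommendation:

```
# amanda.conf (fragment) -- illustrative values only
dumpcycle 10 days    # every DLE gets a level 0 at least every 10 days
runspercycle 10      # amdump runs per dumpcycle (one per night here)
tapecycle 15 tapes   # vtapes in rotation; keep > runspercycle for safety
runtapes 1           # vtapes used per run
```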
Designed to run every night when things are relatively quiet, how well
this works depends on the other machines it is backing up being
available. Machines missing at backup time can and will muck up this
efficient scheduling. Corporate users of Amanda, used to doing it their
way by backing up the week's business on Friday nights, just don't
understand that the Amanda way -- 100% coverage by backing up only the
differences from the previous run of each DLE every night -- is far
superior to their Friday night, when most of the office's machines are
turned off for the weekend. For those cases we recommend composing two
or more DLE files and rigging cron to run them on a more appropriate
schedule. This can be fairly easily done, but MBA's watching every watt
on the power bill just don't see it that way. They will get burnt when
their place burns.
I believe all of my computers have one or both of the following
BIOS/UEFI features:
1. Wake on LAN.
2. Wake at preset day/time.
Perhaps your client computers also have such features, and you could
configure them to coordinate with Amanda.
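For example, a client with Wake-on-LAN enabled in firmware can be woken
from the backup server just before the nightly run. The MAC address and
interface name below are placeholders, and etherwake is one of several
tools that can send the magic packet (Debian package "etherwake"):

```shell
#!/bin/sh
# Sketch only: hypothetical MAC and interface; set RUN=yes after editing.
RUN=no

if [ "$RUN" = yes ]; then
    # On the client: keep the NIC armed for magic-packet wake ("g")
    ethtool -s eth0 wol g
    # On the server, shortly before amdump: send the magic packet
    etherwake -i eth0 00:11:22:33:44:55
fi
```

The etherwake call would typically go in the server's crontab a few
minutes ahead of the amdump entry.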
David