On 8/7/2020 6:23 PM, David Christensen wrote:
Filesystem       Size  Used Avail Use% Mounted on
/dev/md0          28T   22T  6.0T  79% /RAID
Backup:/Backup    44T   44T  512K 100% /Backup
The NAS array is 8 @ 5 TB live drives and 1 @ 5 TB hot spare?
It was. I was in the process of upgrading. Now it is 8 x 8 TB plus an
8 TB spare.
The backup system array is 8 @ 8 TB data drives and 1 @ 8 TB hot spare?
Yep. I always upgrade the backup before I upgrade the main array.
Well, wait a second. To be clear, that is 6 x 8T of data, plus 2 x 8T
of parity, plus 1 x 8T of spare.
No LVM?
No. I don't feel a need for LVM on the data arrays. I use the entire,
unpartitioned drive for /RAID.
AIUI you are running desktop motherboards without ECC memory and XFS
does not protect against bit rot.  Are you concerned?
Yes. I have routines that compare the data on the main array and the
backup array via checksum. When needed, the backups supply a third
vote. The odds of two bits flipping at the very same spot are
astronomically low. There has been some bit rot, but so far it has been
manageable.
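As a bare-bones sketch of the idea (not my actual routines; it assumes
the backup host answers to the name Backup, as in the df output above,
and compares the trees under /RAID and /Backup):

    # Build a checksum manifest on each side, then compare them.
    (cd /RAID && find . -type f -print0 | xargs -0 sha256sum | sort -k2) \
        > /tmp/raid.sums
    ssh Backup 'cd /Backup && find . -type f -print0 | xargs -0 sha256sum | sort -k2' \
        > /tmp/backup.sums
    diff /tmp/raid.sums /tmp/backup.sums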
I agree that the 79% usage on the NAS array means action is required.
Uh-huh.
As I understand md RAID6, the only way to add capacity is to back up,
rebuild the array with additional and/or larger drives, and restore (?).
No, not at all. To add a drive:
`mdadm /dev/md0 --add /dev/sdX`
`mdadm -v /dev/md0 --grow --raid-devices=Y`
Note if an internal bitmap is set, it must be removed prior to growing
the array. It can be added back once the grow operation is complete.
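For example, with an internal bitmap:
`mdadm /dev/md0 --grow --bitmap=none`
(grow the array as above, then)
`mdadm /dev/md0 --grow --bitmap=internal`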
To increase the drive size, replace any smaller drives with larger
drives one at a time:
`mdadm /dev/md0 --add /dev/sdX`
`mdadm /dev/md0 --fail /dev/sdY`
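Once the rebuild onto the new drive completes, drop the failed drive
from the array:
`mdadm /dev/md0 --remove /dev/sdY`
Newer versions of mdadm can also do the swap without losing redundancy
during the copy, once the new drive has been added as a spare:
`mdadm /dev/md0 --replace /dev/sdY --with /dev/sdX`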
Once all the drives are larger than the current device size used by the
array:
`mdadm /dev/md0 --grow --size=max`
This will set the device size based upon the smallest device in the
array.  A smaller device size can be given explicitly with the -z
(--size) parameter instead of "max".  Once the array is grown, the
filesystem needs to be expanded with whatever tool the file system in
question provides.
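For an XFS filesystem like the one on /RAID here, that would be (run
against the mounted filesystem; other filesystems have their own tools,
e.g. resize2fs for ext4):
`xfs_growfs /RAID`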
Are you concerned about 100% usage on the backup server array?
Some, yes. I am going to fix it by removing some very large but
unnecessary files. It has only been at 100% for a few days.
> plus several T of additional files I don't need on the main server.
44 TB total - 22 TB backup = 22 TB additional.  That explains the 100%
usage.
Actually, no. There are not two backup copies on the file system.
Believe it or not, there are 22T of files from other sources.
Have you considered putting the additional files on another server that
is not backed up, only archived?
They should no longer be needed. Once I confirm that (in a few minutes
from now, actually), they will be deleted. If any of the files in
question turn out to be necessary, I will do that very thing.
On 2020-08-06 18:58, Leslie Rhorer wrote:
> The servers have 10G optical links between them.  A full backup to the
> RAID 6 array takes several days.
One 10 Gbps network connection per server?
Yes. I don't have slots for additional NIC boards, and my boards only
have one port.
22E+12 bytes in 2.8 days is ~90 MB/s.  That is a fraction of 4 Gbps and
an even smaller fraction of 10 Gbps.  Have you identified the bottleneck?
That was a calculated number. Did I make a mistake?
...
Oops. That should have been about 15 hours or so; 22 TB at roughly
4 Gbps (about 500 MB/s) works out to a bit over 12 hours of pure
transfer time, plus overhead. The transfer rate for a large file is
close to 4 Gbps, which is about the best I would expect from this
hardware. It's good enough.
> A full backup to single drives takes
> 2 weeks, because single drives are limited to about 800Mbps, while the
> array can gulp down nearly 4Gbps.  Nightly backups (via rsync) take a
> few minutes, at most.
800 Mbps network throughput should be ~88 MB/s HDD throughput.  2 to 4
TB drives should be faster.  Have you identified the bottleneck?
It's probably the internal SATA controller on this old motherboard.
I'm not using a high-dollar controller for external drives. Again,
since I don't do this sort of thing daily, I am not worried about it. I
start the backup and walk away. When I come back, it's done.
Differential backups are small, so I only very rarely need a second drive.
44E+12 bytes in 15 days is ~34 MB/s.  Is this due to a DAR manual
workflow that limits you to one or two archive drives per day?
No, that's about what I get on average transfers to external drives.
Are you using hot-swap for the archive drives?  What make and model
rack?  What HBA/RAID card?  Same for hot spares and HBA?  Same for the
16 bay rack, HBA, port replicators?
Yes on the hot swap. I just use a little eSATA docking station
attached to an eSATA port on the motherboard. 'Definitely a poor man's
solution.
If you have two HDD hot-swap bays, can DAR leap-frog destination media?
I believe it can, yes. A script to handle that should be pretty
simple. I have never done so.
E.g. You insert two archive drives and have DAR begin writing to the
first.  When the first is full, DAR begins writing to the second and
notifies you.  You pull the first drive, insert the third drive, and
notify DAR.  When the second drive is full, DAR begins writing to the
third, and notifies you.  Etc.?
Right. I just use the device ID (rather than the name) to write the
files and pause when the drive is full. It should be possible to do it
with multiple device ID targets. In fact, I know it would be. The
script I use right now pauses and waits for the user to replace the
drive and press <Enter>. It would be trivial to have the script
continue with a different device ID instead of pausing. Iterating
through a list of IDs is hardly any more difficult.
Hmm. You have given me an idea. Thanks!
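Something along these lines, as a rough sketch only (the device IDs,
mount point, and staging directory are made up; with two hot-swap bays
the next drive is already seated, so it just notifies you to swap the
full one and keeps writing):

    #!/bin/sh
    # Move finished DAR slices onto each archive drive in turn.
    for ID in ata-EXAMPLE1 ata-EXAMPLE2 ata-EXAMPLE3; do
        mount "/dev/disk/by-id/${ID}-part1" /mnt/archive
        for SLICE in /RAID/staging/*.dar; do
            [ -e "$SLICE" ] || break      # no slices left
            # Stop filling this drive when less than 5 GiB remains.
            AVAIL=$(df -B1G --output=avail /mnt/archive | tail -n 1 | tr -d ' ')
            [ "$AVAIL" -lt 5 ] && break
            mv "$SLICE" /mnt/archive/
        done
        umount /mnt/archive
        wall "Archive drive ${ID} is full or the run is done - swap it out."
    done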
If you have many HDD hot-swap bays, can DAR write in parallel?  With
leap-frog?
No, I don't think so, at least not in general. I suppose one could
create a front-end process which divides up the source and passes the
individual chunks to multiple DAR processes. A Python script should be
able to handle it pretty well.
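As a very rough sketch (the subdirectory names and mount points are
invented; a real front-end would balance the subsets by size):

    #!/bin/sh
    # Run two DAR processes at once, each on its own subset of /RAID and
    # each writing 2 GiB slices to its own archive drive.
    dar -c /mnt/archive1/part1 -R /RAID -g Videos   -s 2G &
    dar -c /mnt/archive2/part2 -R /RAID -g Pictures -s 2G &
    wait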
In my experience, HDDs that are stored for long periods have the bad
habit of failing within hours of being put back into service.  Does this
concern you?
No, not really. If a target drive fails during a backup, I can just
isolate the existing portion and then start a new backup on the isolate.
A failed drive during a restore could be a bitch, but that's pretty
unlikely. Something like dd_rescue could be a great help.
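For reference, dd_rescue's basic invocation is just source and
destination (the device names here are hypothetical):
`dd_rescue /dev/sdX /dev/sdY`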
What is your data destruction policy?
You mean for live data? I don't have one. Do you mean for the
backups? There is no formal one.
One design pattern for ZFS is a pool of striped virtual devices (VDEV),
each VDEV being two or more mirrored drives of the same size and type
(e.g. SSD, SAS, SATA, etc.).  Cache, intent log, and spare devices can
be added for performance and/or reliability.  To add capacity, you
insert another pair of drives and add them into the pool as a VDEV
mirror.  The top-level file system is automatically resized.  File
systems without size restrictions can use the additional capacity.
Performance increases.  For backup, choices include replication to
another pool and mirror tricks (add one drive to each VDEV mirror, allow
it to resilver, remove one drive from each mirror in rotation).
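For example (pool and device names are placeholders):
`zpool add tank mirror /dev/sdX /dev/sdY`
adds another mirror VDEV and grows the pool, while
`zpool attach tank /dev/sdX /dev/sdZ`
followed later by
`zpool detach tank /dev/sdZ`
is the mirror trick: attach a third drive to an existing mirror, let it
resilver, then pull it back out.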
Oh, yes. For an enterprise system, ZFS is the top contender, in my
book. These are for my own use, and my business is small, however. If
I ever get to the point where I have more than 10 employees, I will no
doubt switch to ZFS.
Let me put it this way: if a business has the need for a separate IT
manager, his filesystem of choice for the file server(s) is pretty much
without question ZFS. For a small business or for personal use the
learning curve may be a bit more than the non-IT user might want to tackle.
Or not. I certainly would not discourage anyone who wants to take on
the challenge.