Re: File systems mounted under `/media/root/` ?

2023-12-12 Thread Max Nikulin

On 10/12/2023 23:38, Stefan Monnier wrote:

Max Nikulin [2023-12-10 21:49:46] wrote:

 udisksctl dump
 udevadm info --query=all --name=sda

for various hints related to udisks. Perhaps a better variant of udevadm
options exists.

Thanks.  Now I have some thread on which to pull 


I have realized that I am unsure whether drives are automounted by udisksd 
itself or whether it needs cooperation from a user application that requests a 
mount in response to an event for an object having specific hints.


Another idea: inspect the definitions and origins of the units reported by

systemctl list-units --type automount
systemctl list-units --type mount
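
For a specific mount point, something along these lines (the path and the
derived unit name are placeholders) may show which unit created the mount and
what triggered it:

systemctl status /media/root/example
journalctl -b -u media-root-example.mount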



Re: On file systems

2023-12-12 Thread Thomas Schmitt
Hi,

to...@tuxteam.de wrote:
> Remember
> Apple's "fat binaries", which contained a binary for 68K and another
> for PowerPC? Those were made with "forks", which was Apple's variant
> of "several streams in one file". And so on.

The most extreme example i know is Solaris:

  https://docs.oracle.com/cd/E36784_01/html/E36883/fsattr-5.html

  fsattr - extended file attributes

  Description
  Attributes are logically supported as files within the file system.
  The file system is therefore augmented with an orthogonal name space of
  file attributes. Any file (including attribute files) can have an
  arbitrarily deep attribute tree associated with it. Attribute values
  are accessed by file descriptors obtained through a special attribute
  interface.

So every file can have a second job as directory ... in theory.
In practice, though:

  Implementations are [...] permitted to reject operations that are not
  supported. For example, the implementation for the UFS file system
  allows only regular files as attributes (for example, no
  sub-directories) and rejects attempts to place attributes on attributes.


Have a nice day :)

Thomas



On file systems [was: Image handling in mutt]

2023-12-11 Thread tomas
On Tue, Dec 12, 2023 at 06:04:04AM +0800, jeremy ardley wrote:

[...]

> If you look at the NTFS file system [...]

> Underneath the hood of an NTFS file are alternate data streams (ADS). That is,
> a single file can contain many different 'sub files' of completely different
> content types. Each ADS has metadata describing the stream.

I think the idea "was in the air" back then (mid-1980s), covering a wide
field between "rich file metadata" and several "streams" per file, cf.
Apple's HFS, which evolved into HFS+; NTFS is itself an evolution of
OS/2's HPFS, etc., etc.

You also see, back then, increasing use of B and B+ trees in different
roles in file systems.

After all, designers moved from company to company and carried with them
ideas and teams. Companies were aggressively hiring people off other
companies.

It's actually risky to say "so-and-so had this feature first" without
deep research. Where do you put the limit between "file metadata" and
"file substream"? 65K? 4T? HP-UX implemented a file stream mechanism
on top of its Unix file system (those were directories which looked
like files), explicitly to support multi-architecture binaries. Remember
Apple's "fat binaries", which contained a binary for 68K and another
for PowerPC? Those were made with "forks", which was Apple's variant
of "several streams in one file". And so on.

Interesting times :-)

Cheers
-- 
t




Re: File systems mounted under `/media/root/` ?

2023-12-10 Thread Stefan Monnier
to...@tuxteam.de [2023-12-10 17:47:41] wrote:
> You ssh in as root (or serial port)?

I do over the serial port, but over SSH, I always login as myself first
and then `su -` to root.

> Perhaps it's a "user session" thingy playing games on you?

Could be,


Stefan



Re: File systems mounted under `/media/root/` ?

2023-12-10 Thread tomas
On Sun, Dec 10, 2023 at 11:42:42AM -0500, Stefan Monnier wrote:
> Stanislav Vlasov [2023-12-10 21:16:54] wrote:
> > Disks get mounted under /media/ by the GUI. Stefan must be using root in a GUI login.
> 
> Except:
> - I never do a "GUI login" as root.
> - "This is on a headless ARM board running Debian stable".
>   I access it via SSH (and occasionally serial port).

You ssh in as root (or serial port)?

Perhaps it's a "user session" thingy playing games on you?

Cheers
-- 
t




Re: File systems mounted under `/media/root/` ?

2023-12-10 Thread Stefan Monnier
Stanislav Vlasov [2023-12-10 21:16:54] wrote:
> Disks get mounted under /media/ by the GUI. Stefan must be using root in a GUI login.

Except:
- I never do a "GUI login" as root.
- "This is on a headless ARM board running Debian stable".
  I access it via SSH (and occasionally serial port).


Stefan



Re: File systems mounted under `/media/root/` ?

2023-12-10 Thread Stefan Monnier
Max Nikulin [2023-12-10 21:49:46] wrote:
> On 10/12/2023 02:49, Stefan Monnier wrote:
>> "magically" mounted as
>> `/media/root/`.
> [...]
>> Any idea who/what does that, and how/where I can control it?
>
> This path is used by udisks, however I am unsure what may cause
> automounting for root.
>
> I would check
>
> udisksctl dump
> udevadm info --query=all --name=sda
>
> for various hints related to udisks. Perhaps a better variant of udevadm
> options exists.

Thanks.  Now I have some thread on which to pull :-)

Nicholas Geovanis [2023-12-10 08:04:08] wrote:
> Were there any reboots in between?

Yes, I've seen the problem appear a few times over the last few days
where I've been rebooting a few times (while replacing/moving some
drives, and making various adjustments along the way).


Stefan



Re: File systems mounted under `/media/root/` ?

2023-12-10 Thread Stanislav Vlasov
2023-12-10 19:49 GMT+05:00, Max Nikulin :
> On 10/12/2023 02:49, Stefan Monnier wrote:
>> "magically" mounted as
>> `/media/root/`.
> [...]
>> Any idea who/what does that, and how/where I can control it?
>
> This path is used by udisks, however I am unsure what may cause
> automounting for root.
>

Disks get mounted under /media/ by the GUI. Stefan must be using root in a GUI login.

-- 
Stanislav



Re: File systems mounted under `/media/root/` ?

2023-12-10 Thread Max Nikulin

On 10/12/2023 02:49, Stefan Monnier wrote:

"magically" mounted as
`/media/root/`.

[...]

Any idea who/what does that, and how/where I can control it?


This path is used by udisks, however I am unsure what may cause 
automounting for root.


I would check

udisksctl dump
udevadm info --query=all --name=sda

for various hints related to udisks. Perhaps a better variant of udevadm 
options exists.
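
For instance, udev properties along these lines are documented udisks hints
(whether they are actually set for your devices is exactly what the commands
above would show):

UDISKS_IGNORE=1   # hide the device from udisks entirely
UDISKS_AUTO=0     # hint that the device should not be automounted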





Re: File systems mounted under `/media/root/` ?

2023-12-10 Thread Nicholas Geovanis
On Sat, Dec 9, 2023, 1:50 PM Stefan Monnier 
wrote:

> Recently I noticed some unused ext4 filesystems (i.e. filesystems that
> aren't in /etc/fstab, that I normally don't mount, typically because
> they're snapshots or backups) "magically" mounted as
> `/media/root/`.
>
> This is on a headless ARM board running Debian stable.
>
> Not sure when this happens, but I noticed it happening at least once in
> response to `vgchange -ay`.
>

Were there any reboots in between?

Any idea who/what does that, and how/where I can control it?
>
> Stefan
>


File systems mounted under `/media/root/` ?

2023-12-09 Thread Stefan Monnier
Recently I noticed some unused ext4 filesystems (i.e. filesystems that
aren't in /etc/fstab, that I normally don't mount, typically because
they're snapshots or backups) "magically" mounted as
`/media/root/`.

This is on a headless ARM board running Debian stable.

Not sure when this happens, but I noticed it happening at least once in
response to `vgchange -ay`.

Any idea who/what does that, and how/where I can control it?


Stefan



Re: defining deduplication (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-23 Thread Thomas Schmitt
Hi,

hw wrote:
> with CDs/DVDs, writing is not so easy.

Thus it is not as easy to overwrite them by mistake.
The complicated part of optical burning can be put into scripts.

But i agree that modern HDD sizes cannot be easily covered by optical
media.


I wrote:
> > [...] LTO tapes [...]

hw wrote:
> The problem is that, for unknown reasons, the development seems
> to have stopped at some point, except maybe for datacenters that
> might have rooms full of tapes and robots to fetch them.  Perhaps
> physical limits made tapes with more capacity not really feasible.

Even if there are no technical obstacles, the price/capacity ratio of
HDD storage makes it hard to establish a new generation of tapes or
optical media. It would need an Elon Musk to invest billions into the
mass production of fancy new drives and media which then need to be
pushed into the market with competitive prices.


> > The backup part of a computer system should be its most solid and
> > artless part. No shortcuts, no fancy novelties, no cumbersome user
> > procedures.

> That's easier said than accomplished ...

Well, i worked my way down from shell scripting via multi-volume backup
planning to creating the backup filesystem and operating my optical drives
with my own SCSI driver code.
In hindsight it looks doable.


Have a nice day :)

Thomas



Re: defining deduplication (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-23 Thread hw
On Thu, 2022-11-10 at 15:32 +0100, Thomas Schmitt wrote:
> Hi,
> 
> i wrote:
> > > the time window in which the backuped data
> > > can become inconsistent on the application level.
> 
> hw wrote:
> > Or are you referring to the data being altered while a backup is in
> > progress?
> 
> Yes.

Ah I was referring to snapshots of backups.

> [...]
> 
> > Yes, I'm re-using the many small hard discs that have accumulated
> > over the
> > years.
> 
> If it's only their size which disqualifies them for production
> purposes,
> then it's ok. But if they are nearing the end of their life time, then
> i would consider to decommission them.

They aren't really disqualified.  They've been replaced mostly because
electricity is expensive and because it can be unwieldy having so many
disks in a machine.  When they will reach the end of their lifetime is
unpredictable.  Why would I throw away perfectly good disks?

> > I wish we could still (relatively) easily make backups on tapes.
> 
> My personal endeavor with backups on optical media began when a
> customer
> had a major data mishap and all backup tapes turned out to be
> unusable.

That can happen with any media.

> Frequent backups had been made and allegedly been checkread. But in
> the
> end it was big drama.
> I then proposed to use a storage where the boss of the department can
> make random tests with the applications which made and read the files.
> So i came to writing backup scripts which used mkisofs and cdrecord
> for CD-RW media.

When CDs came out, they were nice for a short time because you didn't
need to switch between so many diskettes.  It didn't take long for CDs
to become too small, and DVDs had the same problem.  A single CD instead
of 20 diskettes is an improvement, but 20 or 100 CDs or DVDs is not, and
it became obvious that optical media are even more unwieldy than magnetic
media.  At least you could just push a diskette into the drive and start
reading and writing; with CDs/DVDs, writing is not so easy.

> > Just change
> > the tape every day and you can have a reasonable number of full
> > backups.
> 
> If you have thousandfold the size of Blu-rays worth of backup, then
> probably a tape library would be needed. (I find LTO tapes with up to
> 12 TB in the web, which is equivalent to 480 BD-R.)

Yes ... The problem is that, for unknown reasons, the development seems
to have stopped at some point, except maybe for datacenters that
might have rooms full of tapes and robots to fetch them.  Perhaps
physical limits made tapes with more capacity not really feasible.

> 
> > A full new backup takes ages
> 
> It would help if you could divide your backups into small agile parts
> and
> larger parts which don't change often.
> The agile ones need frequent backup, whereas the lazy ones would not
> suffer
> so much damage if the newest available backup is a few days old.

Nah, that would make things very complicated.  Rsync figures it out
automatically.

> > I need to stop modifying stuff and not start all over again
> 
> The backup part of a computer system should be its most solid and
> artless
> part. No shortcuts, no fancy novelties, no cumbersome user procedures.

That's easier said than accomplished ...



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-15 Thread hw
On Mon, 2022-11-14 at 15:08 -0500, Michael Stone wrote:
> On Mon, Nov 14, 2022 at 08:40:47PM +0100, hw wrote:
> > Not really, it was just an SSD.  Two of them were used as cache, and that
> > they failed was not surprising.  It's really unfortunate that SSDs fail
> > particularly fast when used for purposes they can be particularly useful for.
> 
> If you buy hard drives and use them in the wrong application, they also 
> fail quickly.

And?

>  And, again, you weren't using the right SSD so it *wasn't*
> particularly useful.

Sure it was.  An SSD that is readily available and relatively inexpensive can be
very useful and be the right one for the purpose.  Another one that isn't
readily available and is relatively expensive may last longer, but that doesn't
mean that it's the right one for the purpose.

>  But at this point you seem to just want to argue 
> in circles for no reason, so like others I'm done with this waste of 
> time.

IIRC, you're the one trying to argue against experience, claiming that the
experience wasn't what it actually was, and that conclusions drawn from it
are invalid when they aren't.



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-15 Thread hw
On Mon, 2022-11-14 at 20:37 +0100, Linux-Fan wrote:
> hw writes:
> 
> > On Fri, 2022-11-11 at 22:11 +0100, Linux-Fan wrote:
> > > hw writes:
> > > > On Thu, 2022-11-10 at 22:37 +0100, Linux-Fan wrote:
> [...]
> > How do you intend to copy files at any other level than at file level?  At  
> > that
> > level, the only thing you know about is files.
> 
> You can copy only a subset of files but you cannot mirror only a subset of a  
> volume in a RAID unless you specifically designed that in at the time of  
> partitioning. With RAID redundancy you have to decide upfront what you  
> want to have mirrored. With files, you can change it any time.

You can do that with RAID as well.  It might take more work, though.

> [...]
> 
> > > Multiple, well established tools exist for file tree copying. In RAID 
> > > scenarios the mode of operation is integral to the solution.
> > 
> > What has file tree copying to do with RAID scenarios?
> 
> Above, I wrote that making copies of the data may be recommendable over  
> using a RAID. You answered “Huh?” which I understood as a question to expand  
> on the advantages of copying files rather than using RAID.

So file tree copying doesn't have anything to do with RAID scenarios.

> [...]
> 
> > > File trees can be copied to slow target storages without slowing down the 
> > > source file system significantly. On the other hand, in RAID scenarios, 
> 
> [...]
> 
> > Copying the VM images to the slow HDD would slow the target down just as it
> > might slow down a RAID array.
> 
> This is true and does not contradict what I wrote.

I didn't say that it contradicts.  It's just that it doesn't matter what kind
of files you're copying to a disk for the disk to slow down, while you seemed
to make a distinction that isn't necessary as far as slowing down disks goes.

> 
> > > ### when
> > > 
> > > For file copies, the target storage need not always be online. You can 
> > > connect it only for the time of synchronization. This reduces the chance 
> > > that line overvoltages and other hardware faults destroy both copies at  
> > > the same time. For a RAID, all drives must be online at all times (lest
> > > the 
> > > array becomes degraded).
> > 
> > No, you can always turn off the array just as you can turn off single disks.
> > When I'm done making backups, I shut down the server and not much can
> > happen  
> > to
> > the backups.
> 
> If you try this in practice, it is quite limited compared to file copies.

What's the difference between the target storage being offline and the target
storage server being switched off?  You can't copy the files either way because
there's nothing available to copy them to.

> 
> > > Additionally, when using files, only the _used_ space matters. Beyond  
> > > that, the size of the source and target file systems are decoupled. On the
> > > other 
> > > hand, RAID mandates that the sizes of disks adhere to certain properties 
> > > (like all being equal or wasting some of the storage).
> > 
> > And?
> 
> If these limitations are insignificant to you then lifting them provides no  
> advantage to you. You can then safely ignore this point :)

Since you can't copy files into thin air, limitations always apply.

> 
> [...]
> 
> > > > Hm, I haven't really used Debian in a long time.  There's probably no
> > > > reason 
> > > > to change that.  If you want something else, you can always go for it.
> > > 
> > > Why are you asking on a Debian list when you neither use it nor intend to
> > > use it?
> > 
> > I didn't say that I don't use Debian, nor that I don't intend to use it.
> 
> This must be a language barrier issue. I do not understand how your  
> statements above do not contradict each other.

It's possible that the context has escaped you because it hasn't been quoted.

> [...]
> 
> > > Now check with <https://popcon.debian.org/by_vote>
> > > 
> > > I get the following (smaller number => more popular):
> > > 
> > > 87   e2fsprogs
> > > 1657 btrfs-progs
> > > 2314 xfsprogs
> > > 2903 zfs-dkms
> > > 
> > > Surely this does not really measure whether people are actually using these
> > > file systems. Feel free to provide a more accurate means of measurement.  
> > > For me this strongly suggests that the most popular FS on Debian is ext4.
> > 
> > ext4 doesn't show up in this list.  And it doesn't matter if ext4 is most
> 
> e2fsprogs contains the related tools like `mkfs.ext4`.

Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-14 Thread David Christensen

On 11/14/22 13:48, hw wrote:

On Fri, 2022-11-11 at 21:55 -0800, David Christensen wrote:



Lots of snapshots slows down commands that involve snapshots (e.g.  'zfs
list -r -t snapshot ...').  This means sysadmin tasks take longer when
the pool has more snapshots.


Hm, how long does it take?  It's not like I'm planning on making hundreds of
snapshots ...


2022-11-14 18:00:12 toor@f3 ~
# time zfs list -r -t snapshot bootpool | wc -l
  49

real    0m0.020s
user    0m0.011s
sys     0m0.012s

2022-11-14 18:00:55 toor@f3 ~
# time zfs list -r -t snapshot soho2_zroot | wc -l
 222

real    0m0.120s
user    0m0.041s
sys     0m0.082s

2022-11-14 18:01:18 toor@f3 ~
# time zfs list -r -t snapshot p3 | wc -l
3864

real    0m0.649s
user    0m0.159s
sys     0m0.494s


I surprised myself -- I recall p3 taking 10+ seconds to list all the 
snapshots.  But, I added another mirror since then, I try to destroy old 
snapshots periodically, and the machine has been up for 16+ days (so 
metadata is likely cached).




The Intel Optane Memory Series products are designed to be cache devices
-- when using compatible hardware, Windows, and Intel software.  My
hardware should be compatible (Dell PowerEdge T30), but I am unsure if
FreeBSD 12.3-R will see the motherboard NVMe slot or an installed Optane
Memory Series product.


Try it out?



Eventually, yes.



I thought Optane comes as very expensive PCIe cards.  I don't have any M.2
slots, and it seems difficult to even find mainboards with at least two slots
that support the same cards, which would be a requirement because there's no
storing data without redundancy.



I was thinking of getting an NVMe M.2 SSD to PCIe x4 adapter card for 
the machines without a motherboard M.2 slot.




# zpool status
   pool: moon
  state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        moon        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdg     ONLINE       0     0     0
          raidz1-1  ONLINE       0     0     0
            sdl     ONLINE       0     0     0
            sdm     ONLINE       0     0     0
            sdn     ONLINE       0     0     0
            sdp     ONLINE       0     0     0
            sdq     ONLINE       0     0     0
            sdr     ONLINE       0     0     0
          raidz1-2  ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
            sdh     ONLINE       0     0     0
            sdi     ONLINE       0     0     0
            sdj     ONLINE       0     0     0
          mirror-3  ONLINE       0     0     0
            sdk     ONLINE       0     0     0
            sdo     ONLINE       0     0     0


Some of the disks are 15 years old ...  It made sense to me to group the disks
by the ones that are the same (size and model) and use raidz or mirror depending
on how many disks there are.

I don't know if that's ideal.  Would zfs have it figured out by itself if I had
added all of the disks in a raidz?  With two groups of only two disks each that
might have wasted space?



So, 16 HDD's of various sizes?


Without knowing the interfaces, ports, and drives that correspond to 
devices sd[cdefghijklmnopqr], it is difficult to comment.  I do find it 
surprising that you have two mirrors of 2 drives each and two raidz1's 
of 6 drives each.



If you want maximum server IOPS and bandwidth, layout your pool of 16 
drives as 8 mirrors of 2 drives each.  Try to match the sizes of the 
drives in each mirror.  It is okay if the mirrors are not all the same 
size.  ZFS will proportion writes to top-level vdev's based upon their 
available space.  Reads come from whichever vdev's have the data.
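
As a rough sketch of such a layout (the pool name and device names are
placeholders, not your actual devices):

# zpool create tank \
      mirror sda sdb  mirror sdc sdd  mirror sde sdf  mirror sdg sdh \
      mirror sdi sdj  mirror sdk sdl  mirror sdm sdn  mirror sdo sdp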



When I built my latest server, I tried different pool layouts with 4 
HDD's and ran benchmarks.  (2 striped mirrors of 2 HDD's each was the 
winner.)



You can monitor pool I/O with:

# zpool iostat -v moon 10


On FreeBSD, top(1) includes ZFS ARC memory usage:

ARC: 8392M Total, 5201M MFU, 797M MRU, 3168K Anon, 197M Header, 2194M Other
 3529M Compressed, 7313M Uncompressed, 2.07:1 Ratio


Is the SSD cache even relevant for a backup server?  



Yes, because the backup server is really a secondary server in a 
primary-secondary scheme.  Both servers contain a complete set of data, 
backups, archives, and images.  The primary server is up 24x7.  I boot 
the secondary periodically and replicate.  If the primary dies, I will 
swap roles and try to recover content that changed since the last 
replication.




I might have two unused
80GB SSDs I may be able to plug in to use as cache.  



Split each SSD into two or more partitions.  Add one partition on each 
SSD as a cache device for the HDD pool.  Using another partition on each 
SSD, add a dedicated dedup mirror for the HDD pool.
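
A minimal sketch of those two steps (pool and partition names are placeholders;
a dedicated dedup vdev class needs a reasonably recent OpenZFS):

# zpool add backuppool cache sdx1 sdy1
# zpool add backuppool dedup mirror sdx2 sdy2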



I am thinking of using a third 

Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-14 Thread Linux-Fan

hw writes:


On Fri, 2022-11-11 at 21:26 +0100, Linux-Fan wrote:
> hw writes:
>
> > On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:
> > > On Thu, Nov 10, 2022 at 06:55:27PM +0100, hw wrote:
> > > > On Thu, 2022-11-10 at 11:57 -0500, Michael Stone wrote:
> > > > > On Thu, Nov 10, 2022 at 05:34:32PM +0100, hw wrote:
> > > > > > And mind you, SSDs are *designed to fail* the sooner the more  
> > > > > > data you write to

> > > > > > them.  They have their uses, maybe even for storage if you're so
> > > > > > desperate, but not for backup storage.
>
> [...]
>
> > Why would anyone use SSDs for backups?  They're way too expensive for  
> > that.

>
> I actually do this for offsite/portable backups because SSDs are shock 
> resistant (they don't lose data when being dropped, etc.).

I'd make offsite backups over internet.  If you can afford SSDs for backups,
well, why not.


Yes, I do offsite backup over Internet too. Still, an attacker could  
possibly delete those from my running system. Not so much for the detached  
separate portable SSD.


> The most critical thing to acknowledge about using SSDs for backups is  
> that the data retention time of SSDs (when not powered) is decreasing with each 

> generation.

Do you mean each generation of SSDs or of backups?  What do manufacturers say
about how long you can store an SSD on a shelf before the data on it degrades?


Generations of SSDs.

In the early days, a shelf life in the magnitude of years was claimed. Later  
on, most datasheets I have seen have been lacking in this regard.


[...]


>  The small (drive size about 240GB) ones I use for backup are much less 
> durable.

You must have quite a lot of them.  That gets really expensive.


Two: I write the new backup to the one I have here, then carry it to the
offsite location and take back the old copy from there. I do not recall these
SSDs' prices, but they weren't expensive units at the time.



>  For one of them, the manufacturer claims 306TBW, the other has 
> 360 TBW specified. I do not currently know how much data I have written to 
> them already. As you can see from the sizes, I backup only a tiny subset  
> of the data to SSDs i.e. the parts of my data that I consider most critical  
> (VM images not being among them...).


Is that because you have them around anyway because they were replaced with
larger ones, or did you actually buy them to put backups on them?


I still had them around: I started my backups with 4 GiB CF cards for  
backup, then quickly upgraded to 16 GiB cards all bought specifically for  
the backup purposes. Then I upgraded to 32 GB cards. Somewhere around that  
time I equipped my main system with dual 240 GB SSDs. Later, I upgraded to 2  
TB SSDs with the intention to not only run the OS off the SSDs, but also  
the VMs. By this, I got the small SSDs out, ready to serve the increased  
need for backup storage since 32 GB CF cards were no longer feasible...


Nowadays, my “important data backup” is still close to the historical 32 GiB
limit although slightly above it (it ranges between 36 and 46 GiB or such).  
There are large amounts of free space on the backup SSDs filled by (1) a  
live system to facilitate easy restore and (2) additional, less important  
data (50 GiB or so). 


[...]

> > There was no misdiagnosis.  Have you ever had a failed SSD?  They  
> > usually just disappear.  I've had one exception in which the SSD at


[...]


> Just for the record I recall having observed this once in a very similar 
> fashion. It was back when a typical SSD size was 60 GiB. By now we should 
> mostly be past these “SSD fails early with controller fault” issues. It can
> still happen and I still expect SSDs to fail with less notice compared to 
> HDDs.

Why did they put bad controllers into the SSDs?


Maybe because the good ones were too expensive at the time? Maybe because the
manufacturers were yet to acquire good experience on how to produce them  
reliably. I can only speculate.


[...]

YMMV
Linux-Fan





Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-14 Thread hw
>   mirror-0    ONLINE   0 0 0
>     gpt/p1a.eli   ONLINE   0 0 0
>     gpt/p1b.eli   ONLINE   0 0 0
>   mirror-1    ONLINE   0 0 0
>     gpt/p1c.eli   ONLINE   0 0 0
>     gpt/p1d.eli   ONLINE   0 0 0
> dedup   
>   mirror-2    ONLINE   0 0 0
>     gpt/CVCV**D0180EGN-2.eli  ONLINE   0 0 0
>     gpt/CVCV**7K180EGN-2.eli  ONLINE   0 0 0
> cache
>   gpt/CVCV**D0180EGN-1.eli    ONLINE   0 0 0
>   gpt/CVCV**7K180EGN-1.eli    ONLINE   0 0 0
> 
> errors: No known data errors

Is the SSD cache even relevant for a backup server?  I might have two unused
80GB SSDs I may be able to plug in to use as cache.  Once I can get my network
card to not block every now and then, I can at least sustain about 200MB/s
writing, which is about as much as the disks I'm reading from can deliver.

> > > I suggest creating a ZFS pool with a mirror vdev of two HDD's.  If you
> > > can get past your dislike of SSD's, add a mirror of two SSD's as a
> > > dedicated dedup vdev.  (These will not see the hard usage that cache
> > > devices get.)  Create a filesystem 'backup'.  Create child filesystems,
> > > one for each host.  Create grandchild filesystems, one for the root
> > > filesystem on each host.
> > 
> > Huh?  What's with these relationships?
> 
> 
> ZFS datasets can be organized into hierarchies.  Child dataset 
> properties can be inherited from the parent dataset.  Commands can be 
> applied to an entire hierarchy by specifying the top dataset and using a 
> "recursive" option.  Etc..

Ah, ok, that's what you mean.

> When a host is decommissioned and you no longer need the backups, you 
> can destroy the backups for just that host.  When you add a new host, 
> you can create filesystems for just that host.  You can use different 
> backup procedures for different hosts.  Etc..

I'll probably make file systems for host-specific data and some for types of
other data.  Some of it doesn't need compression, so I can turn that off per
file system.
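
Such a hierarchy could look roughly like this (pool and dataset names are
placeholders):

# zfs create backup/hosts
# zfs create backup/hosts/alpha
# zfs create -o compression=off backup/media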

> 
> > >    Set up daily rsync backups of the root
> > > filesystems on the various hosts to the ZFS grandchild filesystems.  Set
> > > up zfs-auto-snapshot to take daily snapshots of everything, and retain
> > > 10 snapshots.  Then watch what happens.
> > 
> > What do you expect to happen?  
> 
> 
> I expect the first full backup and snapshot will use an amount of 
> storage that is something less than the sum of the sizes of the source 
> filesystems (due to compression).  The second through tenth backups and 
> snapshots will each increase the storage usage by something less than 
> the sum of the daily churn of the source filesystems.  On day 11, and 
> every day thereafter, the oldest snapshot will be destroyed, daily churn 
> will be added, and usage will stabilize.  Any source system upgrades and 
> software installs will cause an immediate backup storage usage increase. 
>   Any source system cleanings and software removals will cause a backup 
> storage usage decrease after 10 days.

Makes sense ... How does that work with destroying the oldest snapshot?  IIRC,
when a snapshot is removed (destroyed?  That's strange wording, "merge" seems
better ...), it's supposed to somehow merge with the data it has been created
from such that the "first data" becomes what the snapshot was unless the
snapshot is destroyed (that wording would make sense then), meaning it doesn't
exist anymore without merging and the "first data" is still there as it was.

I remember trying to do stuff with snapshots a long time ago and zfs would freak
out telling me that I can't merge a snapshot because there were other snapshots
that were getting in the way (as if I'd care, just figure it out yourself, darn
it, that's your job not mine ...) and it was a nightmare to get rid of those.

> > I'm thinking about changing my backup sever ...
> > In any case, I need to do more homework first.
> 
> Keep your existing backup server and procedures operational.

Nah, I haven't used it in a long time and I switched it from Fedora to Debian
now and it's all very flexible.  Backups are better done in the winter.  There's
a reason why there are 8 fans in it, and then some.

Since my memory is bad, I even forgot that I had switched out the HP smart array
controllers some time ago for some 3ware controllers that support JBOD.  That
was quite a pleasant surprise ...

>   If y

Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-14 Thread hw
On Fri, 2022-11-11 at 21:26 +0100, Linux-Fan wrote:
> hw writes:
> 
> > On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:
> > > On Thu, Nov 10, 2022 at 06:55:27PM +0100, hw wrote:
> > > > On Thu, 2022-11-10 at 11:57 -0500, Michael Stone wrote:
> > > > > On Thu, Nov 10, 2022 at 05:34:32PM +0100, hw wrote:
> > > > > > And mind you, SSDs are *designed to fail* the sooner the more data  
> > you
> > > > > > write
> > > > > > to
> > > > > > them.  They have their uses, maybe even for storage if you're so
> > > > > > desperate,
> > > > > > but
> > > > > > not for backup storage.
> 
> [...]
> 
> > Why would anyone use SSDs for backups?  They're way too expensive for that.
> 
> I actually do this for offsite/portable backups because SSDs are shock  
> resistant (they don't lose data when being dropped, etc.).

I'd make offsite backups over internet.  If you can afford SSDs for backups,
well, why not.

> The most critical thing to acknowledge about using SSDs for backups is that  
> the data retention time of SSDs (when not powered) is decreasing with each  
> generation.

Do you mean each generation of SSDs or of backups?  What do manufacturers say
about how long you can store an SSD on a shelf before the data on it degrades?

> Write endurance has not become critical in any of my SSD uses so far.  
> Increasing workloads have also resulted in me upgrading the SSDs. So far I  
> always upgraded faster than running into the write endurance limits. I do  
> not use the SSDs as caches but as full-blown file system drives, though.

I use them as system disks because they don't mind being switched off and on and
partly because they don't need as much electricity as hard disks.  If it wasn't
for that, I'd use hard disks for system disks.  I don't use them for storage,
they're way too small and expensive for that.  Fortunately, system disks can be
small; the data is on the server anyway.

There are exceptions, like hard drives suck for laptops and SSDs are much better
for them, and things that greatly benefit from low latencies, lots of IOPS,
high transfer rates.

> On the current system, the SSDs report having written about 14 TB and are  
> specified by the manufacturer for an endurance of 6300 TBW (drive size is 4  
> TB).

Wow you have expensive drives.

>  The small (drive size about 240GB) ones I use for backup are much less  
> durable.

You must have quite a lot of them.  That gets really expensive.

>  For one of them, the manufacturer claims 306TBW, the other has  
> 360 TBW specified. I do not currently know how much data I have written to  
> them already. As you can see from the sizes, I backup only a tiny subset of  
> the data to SSDs i.e. the parts of my data that I consider most critical (VM  
> images not being among them...).

Is that because you have them around anyway because they were replaced with
larger ones, or did you actually buy them to put backups on them?

> [...]
> 
> > There was no misdiagnosis.  Have you ever had a failed SSD?  They usually  
> > just
> > disappear.  I've had one exception in which the SSD at first only sometimes
> > disappeared and came back, until it disappeared and didn't come back.
> 
> [...]
> 
> Just for the record I recall having observed this once in a very similar  
> fashion. It was back when a typical SSD size was 60 GiB. By now we should  
> mostly be past these “SSD fails early with controller fault” issues. It can
> still happen and I still expect SSDs to fail with less notice compared to  
> HDDs.


Why did they put bad controllers into the SSDs?

> When I had my first (and so far only) disk failure (on said 60G SSD) I  
> decided to:
> 
>  * Retain important data on HDDs (spinning rust) for the time being
> 
>  * and also implement RAID1 for all important drives
> 
> Although in theory running two disks instead of one should increase the  
> overall chance of having one fail, no disks failed after this change so  
> far.

I don't have any disks that aren't important.  Even when the data can be
recovered, it's not worth the trouble not to use redundancy.  I consider
redundancy a requirement; there is no storing anything on a single disk.  I'd
only tolerate it for backups, when there are multiple backups, and only when it
can't be avoided.  Sooner or later, a disk will fail.



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-14 Thread Michael Stone

On Mon, Nov 14, 2022 at 08:40:47PM +0100, hw wrote:

Not really, it was just an SSD.  Two of them were used as cache, and that they
failed was not surprising.  It's really unfortunate that SSDs fail particularly
fast when used for purposes they can be particularly useful for.


If you buy hard drives and use them in the wrong application, they also 
fail quickly. And, again, you weren't using the right SSD so it *wasn't*
particularly useful. But at this point you seem to just want to argue 
in circles for no reason, so like others I'm done with this waste of 
time.




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-14 Thread hw
On Fri, 2022-11-11 at 14:48 -0500, Michael Stone wrote:
> On Fri, Nov 11, 2022 at 07:15:07AM +0100, hw wrote:
> > There was no misdiagnosis.  Have you ever had a failed SSD?  They usually
> > just
> > disappear.
> 
> Actually, they don't; that's a somewhat unusual failure mode.

What else happens?  All the ones I have seen failing had disappeared.

> [...]
> I've had way more dead hard drives, which is typical.

Because there were more hard drives than SSDs?

> > There was no "not normal" territory, either, unless maybe you consider ZFS
> > cache
> > as "not normal".  In that case, I would argue that SSDs are well suited for
> > such
> > applications because they allow for lots of IOPS and high data transfer
> > rates,
> > and a hard disk probably wouldn't have failed in place of the SSD because
> > they
> > don't wear out so quickly.  Since SSDs are so well suited for such purposes,
> > that can't be "not normal" territory for them.  Perhaps they just need to be
> > more resilient than they are.
> 
> You probably bought the wrong SSD.

Not really, it was just an SSD.  Two of them were used as cache, and that they
failed was not surprising.  It's really unfortunate that SSDs fail particularly
fast when used for purposes they can be particularly useful for.

> SSDs write in erase-block units, 
> which are on the order of 1-4MB. If you're writing many many small 
> blocks (as you would with a ZFS ZIL cache) there's significant write 
> amplification. For that application you really need a fairly expensive 
> write-optimized SSD, not a commodity (read-optimized) SSD.

If you can get one you can use one.  The question is if it's worthwhile to spend
the extra money for special SSDs which aren't readily available or if it's
better to just replace common ones which are readily available from time to
time.

>  (And in fact, 
> SSD is *not* ideal for this because the data is written sequentially and 
> basically never read so low seek times aren't much benefit; NVRAM is 
> better suited.)

if you can get that

> If you were using it for L2ARC cache then mostly that 
> makes no sense for a backup server. Without more details it's really 
> hard to say any more. Honestly, even with the known issues of using 
> commodity SSD for SLOG I find it really hard to believe that your 
> backups were doing enough async transactions for that to matter--far 
> more likely is still that you simply got a bad copy, just like you can 
> get a bad hd. Sometimes you get a bad part, that's life. Certainly not 
> something to base a religion on.
> 

That can happen one way or another.  The last SSD that failed was used as a
system disk in a Linux server in a btrfs mirror.  Nothing much was written to
it.

> > Considering that, SSDs generally must be of really bad quality for that to
> > happen, don't you think?
> 
> No, I think you're making unsubstantiated statements, and I'm mostly 
> trying to get better information on the record for others who might be 
> reading.

I didn't keep detailed records and don't remember all the details, so the better
information you're looking for is not available.  I can say that SSDs failed
about the same as HDDs because that is my experience, and that's enough for me.

You need to understand what experience is and what it means.



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-14 Thread Linux-Fan

hw writes:


On Fri, 2022-11-11 at 22:11 +0100, Linux-Fan wrote:
> hw writes:
> > On Thu, 2022-11-10 at 22:37 +0100, Linux-Fan wrote:
>
> [...]
>
> > >  If you do not value the uptime making actual (even
> > >  scheduled) copies of the data may be recommendable over
> > >  using a RAID because such schemes may (among other advantages)
> > >  protect you from accidental file deletions, too.
> >
> > Huh?
>
> RAID is limited in its capabilities because it acts at the file system, 
> block (or in case of hardware RAID even disk) level. Copying files can 
> operate on any subset of the data and is very flexible when it comes to 
> changing what is going to be copied, how, when and where to.

How do you intend to copy files at any other level than at file level?  At
that level, the only thing you know about is files.


You can copy only a subset of files but you cannot mirror only a subset of a  
volume in a RAID unless you specifically designed that in at the time of  
partitioning. With RAID redundancy you have to decide upfront what you  
want to have mirrored. With files, you can change it any time.


[...]


> Multiple, well established tools exist for file tree copying. In RAID 
> scenarios the mode of operation is integral to the solution.

What has file tree copying to do with RAID scenarios?


Above, I wrote that making copies of the data may be recommendable over  
using a RAID. You answered “Huh?” which I understood as a question to expand  
on the advantages of copying files rather than using RAID.


[...]


> File trees can be copied to slow target storages without slowing down the 
> source file system significantly. On the other hand, in RAID scenarios, 


[...]


Copying the VM images to the slow HDD would slow the target down just as it
might slow down a RAID array.


This is true and does not contradict what I wrote.


> ### when
>
> For file copies, the target storage need not always be online. You can 
> connect it only for the time of synchronization. This reduces the chance 
> that line overvoltages and other hardware faults destroy both copies at  
> the same time. For a RAID, all drives must be online at all times (lest the 

> array becomes degraded).

No, you can always turn off the array just as you can turn off single disks.
When I'm done making backups, I shut down the server and not much can happen
to the backups.


If you try this in practice, it is quite limited compared to file copies.

> Additionally, when using files, only the _used_ space matters. Beyond  
> that, the size of the source and target file systems are decoupled. On the other 

> hand, RAID mandates that the sizes of disks adhere to certain properties 
> (like all being equal or wasting some of the storage).

And?


If these limitations are insignificant to you then lifting them provides no  
advantage to you. You can then safely ignore this point :)


[...]


> > Hm, I haven't really used Debian in a long time.  There's probably no
> > reason 
> > to change that.  If you want something else, you can always go for it.
>
> Why are you asking on a Debian list when you neither use it nor intend to
> use it?


I didn't say that I don't use Debian, nor that I don't intend to use it.


This must be a language barrier issue. I do not understand how your  
statements above do not contradict each other.


[...]


> Now check with <https://popcon.debian.org/by_vote>
>
> I get the following (smaller number => more popular):
>
>     87   e2fsprogs
> 1657 btrfs-progs
> 2314 xfsprogs
> 2903 zfs-dkms
>
> Surely this does not really measure whether people are actually using these
> file systems. Feel free to provide a more accurate means of measurement.  
> For me this strongly suggests that the most popular FS on Debian is ext4.


ext4 doesn't show up in this list.  And it doesn't matter if ext4 is most


e2fsprogs contains the related tools like `mkfs.ext4`.


widespread on Debian when more widespread distributions use different file
systems.  I don't have a way to get the numbers for that.

Today I installed Debian on my backup server and didn't use ext4.  Perhaps  
the "most widely-deployed" file system is FAT.


Probably yes. With the advent of ESPs it may have even increased in  
popularity again :)


[...]

> I like to be able to store my backups on any file system. This will not  
> work for snapshots unless I “materialize” them by copying out all files of a 

> snapshot.
>
> I know that some backup strategies suggest always creating backups based  
> on snapshots rather than the live file system as to avoid issues with  
> changing files during the creation of backups.

>
> I can see the merit in implementing it this way but have not yet found a 
&

Re: Sorry for the misattribution [was: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)]

2022-11-14 Thread hw
On Sat, 2022-11-12 at 07:27 +0100, to...@tuxteam.de wrote:
> On Fri, Nov 11, 2022 at 07:22:19PM +0100, to...@tuxteam.de wrote:
> 
> [...]
> 
> > I think what hede was hinting at was that early SSDs had a (pretty)
> > limited number of write cycles [...]
> 
> As was pointed out to me, the OP wasn't hede. It was hw. Sorry for the
> mis-attribution.
> 
> Cheers

What I was saying is that the number of failures out of some number of disks
doesn't show the amount of storage space failing (or however you want to call
it).

For example, when you have 100 SSDs with 500GB each and 2 of them failed, then
49TB of storage survived.  With hard disks, you may have a 100 of them with 16TB
each, and when 2 of them failed, 1568TB of storage have survived.  That would
mean that the "failure rate" of SSDs is 32 times higher than the one of hard
disks.

I don't know what the actual numbers are.  Just citing some report saying 2/1518
vs. 44/1669 failures of SSDs vs. hard disks is meaningless, especially when
these disks were used for different things.  If the average SSD size was 1TB and
the average HDD size was 16TB, then that would mean that the actual survival
rate of storage on hard disks is over 17 times higher than the survival rate of
storage on SSDs, ignoring what the disks were used for.



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-14 Thread hw
On Fri, 2022-11-11 at 17:05 +, Curt wrote:
> On 2022-11-11,   wrote:
> > 
> > I just contested that their failure rate is higher than that of HDDs.
> > This is something which was true in early days, but nowadays it seems
> > to be just a prejudice.
> 
> If he prefers extrapolating his anecdotal personal experience to a
> general rule rather than applying a verifiable general rule to his
> personal experience, then there's just nothing to be done for him!
> 

There is no "verifiable general rule" here.  You can argue all you want and
claim that SSDs fail less than hard disks, and it won't change the fact that, as
I've said, I've seen them failing about the same.

It's like you argue that the sun didn't go up today.  It won't change the fact
that I've seen that the sun has gone up today.



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-13 Thread David Christensen

On 11/13/22 13:02, hw wrote:

On Fri, 2022-11-11 at 07:55 -0500, Dan Ritter wrote:

hw wrote:

On Thu, 2022-11-10 at 20:32 -0500, Dan Ritter wrote:

Linux-Fan wrote:


[...]
* RAID 5 and 6 restoration incurs additional stress on the other
   disks in the RAID which makes it more likely that one of them
   will fail. The advantage of RAID 6 is that it can then recover
   from that...


Disks are always being stressed when used, and they're being stressed as well
when other types of RAID arrays than 5 or 6 are being rebuilt.  And is there
evidence that disks fail *because* RAID arrays are being rebuilt or would
they have failed anyway when stressed?


Does it matter? The observed fact is that some notable
proportion of RAID 5/6 rebuilds fail because another drive in
that group has failed.


Fortunately, I haven't observed that.  And why would only RAID 5 or 6 be
affected and not RAID 1 or other levels?



Any RAID level can suffer additional disk failures while recovering from 
a disk failure.  I saw this exact scenario on my SOHO server in August 
2022.  The machine has a stripe of two mirrors of two HDD's each (e.g. 
ZFS equivalent of RAID10).  One disk was dying, so I replaced it.  While 
the replacement disk was resilvering, a disk in the other mirror started 
dying.  I let the first resilver finish, then replaced the second disk. 
Thankfully, no more disks failed.  I got lucky.



David



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-13 Thread hw
On Fri, 2022-11-11 at 22:11 +0100, Linux-Fan wrote:
> hw writes:
> 
> > On Thu, 2022-11-10 at 22:37 +0100, Linux-Fan wrote:
> 
> [...]
> 
> > >  If you do not value the uptime making actual (even
> > >  scheduled) copies of the data may be recommendable over
> > >  using a RAID because such schemes may (among other advantages)
> > >  protect you from accidental file deletions, too.
> > 
> > Huh?
> 
> RAID is limited in its capabilities because it acts at the file system,  
> block (or in case of hardware RAID even disk) level. Copying files can  
> operate on any subset of the data and is very flexible when it comes to  
> changing what is going to be copied, how, when and where to.

How do you intend to copy files at any other level than at file level?  At that
level, the only thing you know about is files.

> ### what
> 
> When copying files, it's a standard feature to allow certain patterns of file
> names to be excluded.

sure

> [...]
> ### how
> 
> Multiple, well established tools exist for file tree copying. In RAID  
> scenarios the mode of operation is integral to the solution.

What has file tree copying to do with RAID scenarios?

> ### where to
> 
> File trees are much easier copied to network locations compared to adding a  
> “network mirror” to any RAID (although that _is_ indeed an option, DRBD was  
> mentioned in another post...).

Dunno, btrfs and ZFS have some ability to send file systems over the network,
which is intended to make copying more efficient.  There must be reasons why
this feature was developed.
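
For ZFS, that typically looks something like this (dataset, snapshot, and host
names are placeholders):

# zfs snapshot pool/data@today
# zfs send -i pool/data@yesterday pool/data@today | ssh backuphost zfs receive backup/data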

> File trees can be copied to slow target storages without slowing down the  
> source file system significantly. On the other hand, in RAID scenarios,  
> slow members are expected to slow down the performance of the entire array.  
> This alone may allow saving a lot of money. E.g. one could consider copying  
> the entire tree of VM images that is residing on a fast (and expensive) SSD  
> to a slow SMR HDD that only costs a fraction of the SSD. The same thing is  
> not possible with a RAID mirror except by slowing down the write operations  
> on the mirror to the speed of the HDD or by having two (or more) of the  
> expensive SSDs. SMR drives are advised against in RAID scenarios btw.

Copying the VM images to the slow HDD would slow the target down just as it
might slow down a RAID array.

> ### when
> 
> For file copies, the target storage need not always be online. You can  
> connect it only for the time of synchronization. This reduces the chance  
> that line overvoltages and other hardware faults destroy both copies at the  
> same time. For a RAID, all drives must be online at all times (lest the  
> array becomes degraded).

No, you can always turn off the array just as you can turn off single disks. 
When I'm done making backups, I shut down the server and not much can happen to
the backups.

> Additionally, when using files, only the _used_ space matters. Beyond that,  
> the size of the source and target file systems are decoupled. On the other  
> hand, RAID mandates that the sizes of disks adhere to certain properties  
> (like all being equal or wasting some of the storage).

And?

> > > > Is anyone still using ext4?  I'm not saying it's bad or anything, it  
> > > > only seems that it has gone out of fashion.
> > > 
> > > IIRC its still Debian's default.
> > 
> > Hm, I haven't really used Debian in a long time.  There's probably no
> > reason  
> > to change that.  If you want something else, you can always go for it.
> 
> Why are you asking on a Debian list when you neither use it nor intend to use
> it?

I didn't say that I don't use Debian, nor that I don't intend to use it.

> [...]
> > > licensing or stability issues whatsoever. By its popularity its probably  
> > > one of the most widely-deployed Linux file systems which may enhance the  
> > > chance that whatever problem you incur with ext4 someone else has had
> > > before...
> > 
> > I'm not sure it's most widespread.
> [...]
> Now check with <https://popcon.debian.org/by_vote>
> 
> I get the following (smaller number => more popular):
> 
> 87   e2fsprogs
> 1657 btrfs-progs
> 2314 xfsprogs
> 2903 zfs-dkms 
> 
> Surely this does not really measure whether people are actually using these
> file systems. Feel free to provide a more accurate means of measurement. For  
> me this strongly suggests that the most popular FS on Debian is ext4.

ext4 doesn't show up in this list.  And it doesn't matter if ext4 is most
widespread on Debian when more widespread distributions use different file
systems.  I don't have a way to get the numbers for that.

Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-13 Thread hw
On Fri, 2022-11-11 at 07:55 -0500, Dan Ritter wrote:
> hw wrote: 
> > On Thu, 2022-11-10 at 20:32 -0500, Dan Ritter wrote:
> > > Linux-Fan wrote: 
> > > 
> > > 
> > > [...]
> > > * RAID 5 and 6 restoration incurs additional stress on the other
> > >   disks in the RAID which makes it more likely that one of them
> > >   will fail. The advantage of RAID 6 is that it can then recover
> > >   from that...
> > 
> > Disks are always being stressed when used, and they're being stressed as well
> > when other types of RAID arrays than 5 or 6 are being rebuilt.  And is there
> > evidence that disks fail *because* RAID arrays are being rebuilt or would
> > they
> > have failed anyway when stressed?
> 
> Does it matter? The observed fact is that some notable
> proportion of RAID 5/6 rebuilds fail because another drive in
> that group has failed.

Fortunately, I haven't observed that.  And why would only RAID 5 or 6 be
affected and not RAID 1 or other levels?

>  The drives were likely to be from the
> same cohort of the manufacturer, and to have experienced very
> similar read/write activity over their lifetime.

Yes, and that means that they might all fail at about the same time due to age
and not because an array is being rebuilt.

The question remains what the ratio between surviving volumes and lost volumes
is.



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-12 Thread Dan Ritter
David Christensen wrote: 
> The Intel Optane Memory Series products are designed to be cache devices --
> when using compatible hardware, Windows, and Intel software.  My hardware
> should be compatible (Dell PowerEdge T30), but I am unsure if FreeBSD 12.3-R
> will see the motherboard NVMe slot or an installed Optane Memory Series
> product.
> 
> 
> Intel Optane Memory M10 16 GB PCIe M.2 80mm are US $18.25 on Amazon.
> 
> 
> Intel Optane Memory M.2 2280 32GB PCIe NVMe 3.0 x2 are US $69.95 on Amazon.

Note that the entire product line is discontinued, so if you want this,
assume that you will not be able to get a replacement in future.

-dsr-



Sorry for the misattribution [was: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)]

2022-11-11 Thread tomas
On Fri, Nov 11, 2022 at 07:22:19PM +0100, to...@tuxteam.de wrote:

[...]

> I think what hede was hinting at was that early SSDs had a (pretty)
> limited number of write cycles [...]

As was pointed out to me, the OP wasn't hede. It was hw. Sorry for the
mis-attribution.

Cheers
-- 
t




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread David Christensen

On 11/11/22 00:43, hw wrote:

On Thu, 2022-11-10 at 21:14 -0800, David Christensen wrote:

On 11/10/22 07:44, hw wrote:

On Wed, 2022-11-09 at 21:36 -0800, David Christensen wrote:

On 11/9/22 00:24, hw wrote:
   > On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:



Taking snapshots is fast and easy.  The challenge is deciding when to
destroy them.


That seems like an easy decision, just keep as many as you can and destroy the
ones you can't keep.



As with most filesystems, performance of ZFS drops dramatically as you 
approach 100% usage.  So, you need a data destruction policy that keeps 
storage usage and performance at acceptable levels.
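
A minimal sketch of such a policy, keeping only the newest 10 snapshots under
one dataset (the dataset name is a placeholder; GNU head and xargs are assumed
for the -n -10 and -r options):

# zfs list -H -t snapshot -o name -s creation -r p1/backup | head -n -10 | xargs -r -n1 zfs destroy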



Lots of snapshots slows down commands that involve snapshots (e.g.  'zfs 
list -r -t snapshot ...').  This means sysadmin tasks take longer when 
the pool has more snapshots.




I have considered switching to one Intel Optane Memory
Series and a PCIe 4x adapter card in each server [for a ZFS cache].


Isn't that very expensive, and doesn't it wear out just as well?  



The Intel Optane Memory Series products are designed to be cache devices 
-- when using compatible hardware, Windows, and Intel software.  My 
hardware should be compatible (Dell PowerEdge T30), but I am unsure if 
FreeBSD 12.3-R will see the motherboard NVMe slot or an installed Optane 
Memory Series product.



Intel Optane Memory M10 16 GB PCIe M.2 80mm are US $18.25 on Amazon.


Intel Optane Memory M.2 2280 32GB PCIe NVMe 3.0 x2 are US $69.95 on Amazon.



Wouldn't it be better to have the cache in RAM?



Adding memory should help in more ways than one.  Doing so might reduce 
ZFS cache device usage, but I am not certain.  But, more RAM will not 
address the excessive wear problems when using a desktop SSD as a ZFS 
cache device.



8 GB ECC memory modules to match the existing modules in my SOHO server 
are $24.95 each on eBay.  I have two free memory slots.




Please run and post the relevant command for LVM, btrfs, whatever.


Well, what would that tell you?



That would provide accurate information about the storage configuration 
of your backup server.



Here is the pool in my backup server.  mirror-0 and mirror-1 each use 
two Seagate 3 TB HDD's.  dedup and cache each use partitions on two 
Intel SSD 520 Series 180 GB SSD's:


2022-11-11 20:41:09 toor@f1 ~
# zpool status p1
  pool: p1
 state: ONLINE
  scan: scrub repaired 0 in 7 days 22:18:11 with 0 errors on Sun Sep  4 
14:18:21 2022

config:

NAME  STATE READ WRITE CKSUM
p1ONLINE   0 0 0
  mirror-0ONLINE   0 0 0
gpt/p1a.eli   ONLINE   0 0 0
gpt/p1b.eli   ONLINE   0 0 0
  mirror-1ONLINE   0 0 0
gpt/p1c.eli   ONLINE   0 0 0
gpt/p1d.eli   ONLINE   0 0 0
dedup   
  mirror-2ONLINE   0 0 0
gpt/CVCV**D0180EGN-2.eli  ONLINE   0 0 0
gpt/CVCV**7K180EGN-2.eli  ONLINE   0 0 0
cache
  gpt/CVCV**D0180EGN-1.eliONLINE   0 0 0
  gpt/CVCV**7K180EGN-1.eliONLINE   0 0 0

errors: No known data errors



I suggest creating a ZFS pool with a mirror vdev of two HDD's.  If you
can get past your dislike of SSD's, add a mirror of two SSD's as a
dedicated dedup vdev.  (These will not see the hard usage that cache
devices get.)  Create a filesystem 'backup'.  Create child filesystems,
one for each host.  Create grandchild filesystems, one for the root
filesystem on each host.


Huh?  What's with these relationships?



ZFS datasets can be organized into hierarchies.  Child dataset 
properties can be inherited from the parent dataset.  Commands can be 
applied to an entire hierarchy by specifying the top dataset and using a 
"recursive" option.  Etc..



When a host is decommissioned and you no longer need the backups, you 
can destroy the backups for just that host.  When you add a new host, 
you can create filesystems for just that host.  You can use different 
backup procedures for different hosts.  Etc..




   Set up daily rsync backups of the root
filesystems on the various hosts to the ZFS grandchild filesystems.  Set
up zfs-auto-snapshot to take daily snapshots of everything, and retain
10 snapshots.  Then watch what happens.


What do you expect to happen?  



I expect the first full backup and snapshot will use an amount of 
storage that is something less than the sum of the sizes of the source 
filesystems (due to compression).  The second through tenth backups and 
snapshots will each increase the storage usage by something less than 
the sum of the daily churn of the source filesystems.  On day 11, and 
every day thereafter, the oldest snapshot will be destroyed, so storage usage 
should roughly level off at the full backup plus about ten days of churn.
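
A sketch of that daily cycle -- the paths, host name and dataset names are
placeholders, and in practice zfs-auto-snapshot's cron job would handle the
snapshot and retention part:

  # exclude pseudo-filesystems (/proc, /sys, /dev, ...) in a real setup
  rsync -aHAX --delete root@hostA:/ /p1/backup/hostA/root/
  zfs snapshot -r p1/backup@daily-$(date +%F)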

Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-11 Thread Linux-Fan

hw writes:


On Thu, 2022-11-10 at 22:37 +0100, Linux-Fan wrote:


[...]


>  If you do not value the uptime making actual (even
>  scheduled) copies of the data may be recommendable over
>  using a RAID because such schemes may (among other advantages)
>  protect you from accidental file deletions, too.

Huh?


RAID is limited in its capabilities because it acts at the file system,  
block (or in case of hardware RAID even disk) level. Copying files can  
operate on any subset of the data and is very flexible when it comes to  
changing what is going to be copied, how, when and where to.


### what

When copying files, it's a standard feature to allow certain patterns of file  
names to be excluded. This allows fine-tuning the system to avoid  
unnecessary storage costs by not duplicating files whose duplicates  
are not needed (.iso or /tmp files could be an example of files that some  
users may not consider worth duplicating).
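
For example (just a sketch; the patterns and paths are arbitrary):

  rsync -a --exclude='*.iso' --exclude='/tmp/' /data/ /mnt/backup/data/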


### how

Multiple, well established tools exist for file tree copying. In RAID  
scenarios the mode of operation is integral to the solution.


### where to

File trees are much easier copied to network locations compared to adding a  
“network mirror” to any RAID (although that _is_ indeed an option, DRBD was  
mentioned in another post...).


File trees can be copied to slow target storages without slowing down the  
source file system significantly. On the other hand, in RAID scenarios,  
slow members are expected to slow down the performance of the entire array.  
This alone may allow saving a lot of money. E.g. one could consider copying  
the entire tree of VM images that is residing on a fast (and expensive) SSD  
to a slow SMR HDD that only costs a fraction of the SSD. The same thing is  
not possible with a RAID mirror except by slowing down the write operations  
on the mirror to the speed of the HDD or by having two (or more) of the  
expensive SSDs. SMR drives are advised against in RAID scenarios btw.


### when

For file copies, the target storage need not always be online. You can  
connect it only for the time of synchronization. This reduces the chance  
that line overvoltages and other hardware faults destroy both copies at the  
same time. For a RAID, all drives must be online at all times (lest the  
array becomes degraded).


Additionally, when using files, only the _used_ space matters. Beyond that,  
the size of the source and target file systems are decoupled. On the other  
hand, RAID mandates that the sizes of disks adhere to certain properties  
(like all being equal or wasting some of the storage).


> > Is anyone still using ext4?  I'm not saying it's bad or anything, it  
> > only seems that it has gone out of fashion.

>
> IIRC its still Debian's default.

Hm, I haven't really used Debian in a long time.  There's probably no reason  
to change that.  If you want something else, you can always go for it.


Why are you asking on a Debian list when you neither use it nor intend to use  
it?



>  Its my file system of choice unless I have 
> very specific reasons against it. I have never seen it fail outside of 
> hardware issues. Performance of ext4 is quite acceptable out of the box. 
> E.g. it seems to be slightly faster than ZFS for my use cases. 
> Almost every Linux live system can read it. There are no problematic 
> licensing or stability issues whatsoever. By its popularity its probably  
> one of the most widely-deployed Linux file systems which may enhance the  
> chance that whatever problem you incur with ext4 someone else has had before...


I'm not sure it's most widespread.  Centos (and Fedora) defaulted to xfs quite  
some time ago, and Fedora more recently defaulted to btrfs (a while after Redhat  
announced they would remove btrfs from RHEL altogether).  Centos went down the  
drain when it mutated into an outdated version of Fedora, and RHEL probably  
isn't any better.


~$ dpkg -S zpool | cut -d: -f 1 | sort -u
[...]
zfs-dkms
zfsutils-linux
~$ dpkg -S mkfs.ext4
e2fsprogs: /usr/share/man/man8/mkfs.ext4.8.gz
e2fsprogs: /sbin/mkfs.ext4
~$ dpkg -S mkfs.xfs
xfsprogs: /sbin/mkfs.xfs
xfsprogs: /usr/share/man/man8/mkfs.xfs.8.gz
~$ dpkg -S mkfs.btrfs
btrfs-progs: /usr/share/man/man8/mkfs.btrfs.8.gz
btrfs-progs: /sbin/mkfs.btrfs

Now check with <https://popcon.debian.org/by_vote>

I get the following (smaller number => more popular):

87   e2fsprogs
1657 btrfs-progs
2314 xfsprogs
	2903 zfs-dkms 

Surely this does not really measure whether people actually use these  
file systems. Feel free to provide a more accurate means of measurement. For  
me this strongly suggests that the most popular FS on Debian is ext4.
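
Locally one can at least check what is actually mounted; a sketch only, and it
says nothing about popularity, just about one machine:

  findmnt -t ext4,xfs,btrfs,zfs -o TARGET,FSTYPE,SOURCE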



So assuming that RHEL and Centos may be more widespread than Debian because
there's lots of hardware support

Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread Linux-Fan

hw writes:


On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:
> On Thu, Nov 10, 2022 at 06:55:27PM +0100, hw wrote:
> > On Thu, 2022-11-10 at 11:57 -0500, Michael Stone wrote:
> > > On Thu, Nov 10, 2022 at 05:34:32PM +0100, hw wrote:
> > > > And mind you, SSDs are *designed to fail* the sooner the more data  
you

> > > > write
> > > > to
> > > > them.  They have their uses, maybe even for storage if you're so
> > > > desperate,
> > > > but
> > > > not for backup storage.


[...]


Why would anyone use SSDs for backups?  They're way too expensive for that.


I actually do this for offsite/portable backups because SSDs are shock  
resistant (don't lose data when being dropped etc.).


The most critical thing to acknowledge about using SSDs for backups is that  
the data retention time of SSDs (when not powered) is decreasing with each  
generation.


Write endurance has not become critical in any of my SSD uses so far.  
Increasing workloads have also resulted in me upgrading the SSDs. So far I  
always upgraded faster than running into the write endurance limits. I do  
not use the SSDs as caches but as full-blown file system drives, though.


On the current system, the SSDs report having written about 14 TB and are  
specified by the manufacturer for an endurance of 6300 TBW (drive size is 4  
TB). The small (drive size about 240GB) ones I use for backup are much less  
durable. For one of them, the manufacturer claims 306TBW, the other has  
360 TBW specified. I do not currently know how much data I have written to  
them already. As you can see from the sizes, I backup only a tiny subset of  
the data to SSDs i.e. the parts of my data that I consider most critical (VM  
images not being among them...).
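
For what it's worth, the drives themselves report this. A sketch, with
placeholder device names:

  smartctl -A /dev/sda      # SATA SSDs: often an attribute like Total_LBAs_Written
  smartctl -A /dev/nvme0    # NVMe: "Data Units Written" in the health log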


[...]

There was no misdiagnosis.  Have you ever had a failed SSD?  They usually  
just

disappear.  I've had one exception in which the SSD at first only sometimes
disappeared and came back, until it disappeared and didn't come back.


[...]

Just for the record I recall having observed this once in a very similar  
fashion. It was back when a typical SSD size was 60 GiB. By now we should  
mostly be past these “SSD fails early with controller fault” issues. It can  
still happen and I still expect SSDs to fail with less notice compared to  
HDDs.


When I had my first (and so far only) disk failure (on said 60G SSD) I  
decided to:


* Retain important data on HDDs (spinning rust) for the time being

* and also implement RAID1 for all important drives

Although in theory running two disks instead of one should increase the  
overall chance of having one fail, no disks failed after this change so  
far.
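
For reference, one way to set that up on Linux is an mdadm mirror; a minimal
sketch with placeholder device names, not my actual setup:

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  mkfs.ext4 /dev/md0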


YMMV
Linux-Fan

öö 




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread Michael Stone

On Fri, Nov 11, 2022 at 02:05:33PM -0500, Dan Ritter wrote:

300TB/year. That's a little bizarre: it's 9.51 MB/s. Modern
high end spinners also claim 200MB/s or more when feeding them
continuous writes. Apparently WD thinks that can't be sustained
more than 5% of the time.


Which makes sense for most workloads. Very rarely do people write 
continuously to disks *and never keep the data there to read it later*. 
There are exceptions (mostly of the transaction log type for otherwise 
memory-resident data), and you can get write optimized storage, but 
you'll pay more. For most people that's a bad deal, because it would 
mean paying for a level of write endurance that they'll never use.




Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-11 Thread Michael Stone

On Fri, Nov 11, 2022 at 09:03:45AM +0100, hw wrote:

On Thu, 2022-11-10 at 23:12 -0500, Michael Stone wrote:

The advantage to RAID 6 is that it can tolerate a double disk failure.
With RAID 1 you need 3x your effective capacity to achieve that and even
though storage has gotten cheaper, it hasn't gotten that cheap. (e.g.,
an 8 disk RAID 6 has the same fault tolerance as an 18 disk RAID 1 of
equivalent capacity, ignoring pointless quibbling over probabilities.)


so with RAID6, 3x8 is 18 instead of 24


you have 6 disks of useable capacity with the 8 disk raid 6, two disks 
worth of parity. 6 disks of useable capacity on a triple redundant 
mirror is 6*3 = 18.




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread Michael Stone

On Fri, Nov 11, 2022 at 07:15:07AM +0100, hw wrote:

There was no misdiagnosis.  Have you ever had a failed SSD?  They usually just
disappear.


Actually, they don't; that's a somewhat unusual failure mode. I have had 
a couple of ssd failures, out of hundreds. (And I think mostly from a 
specific known-bad SSD design; I haven't had any at all in the past few 
years.) I've had way more dead hard drives, which is typical.



There was no "not normal" territory, either, unless maybe you consider ZFS cache
as "not normal".  In that case, I would argue that SSDs are well suited for such
applications because they allow for lots of IOPS and high data transfer rates,
and a hard disk probably wouldn't have failed in place of the SSD because they
don't wear out so quickly.  Since SSDs are so well suited for such purposes,
that can't be "not normal" territory for them.  Perhaps they just need to be
more resilient than they are.


You probably bought the wrong SSD. SSDs write in erase-block units, 
which are on the order of 1-4MB. If you're writing many many small 
blocks (as you would with a ZFS ZIL cache) there's significant write 
amplification. For that application you really need a fairly expensive 
write-optimized SSD, not a commodity (read-optimized) SSD. (And in fact, 
SSD is *not* ideal for this because the data is written sequentially and 
basically never read so low seek times aren't much benefit; NVRAM is 
better suited.) If you were using it for L2ARC cache then mostly that 
makes no sense for a backup server. Without more details it's really 
hard to say any more. Honestly, even with the known issues of using 
commidity SSD for SLOG I find it really hard to believe that your 
backups were doing enough async transactions for that to matter--far 
more likely is still that you simply got a bad copy, just like you can 
get a bad hd. Sometimes you get a bad part, that's life. Certainly not 
something to base a religion on.
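
For context, this is roughly how such devices are attached to a pool; a sketch
with placeholder pool and device names:

  zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1    # SLOG (ZIL) device
  zpool add tank cache /dev/sdx                          # L2ARC device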



Considering that, SSDs generally must be of really bad quality for that to
happen, don't you think?


No, I think you're making unsubstantiated statements, and I'm mostly 
trying to get better information on the record for others who might be 
reading.




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread Dan Ritter
to...@tuxteam.de wrote: 
> 
> I think what hede was hinting at was that early SSDs had a (pretty)
> limited number of write cycles per "block" [1] before failure; they had
> (and have) extra blocks to substitute broken ones and do a fair amount
> of "wear leveling behind the scenes. So it made more sense to measure
> failures along the "TB written" axis than along the time axis.

They still do, and in fact each generation gets worse in terms
of durability while getting better in price/capacity.

Here's Western Digital's cheap line of NVMe SSDs:
https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-blue-nvme-ssd/product-brief-wd-blue-sn570-nvme-ssd.pdf

MTBF is listed as 1.5 million hours... about 171 years.

Lifetime endurance is listed as 150TB for the 250GB version, and
300TB for the 500GB version. 600 full writes expected.

Here's the more expensive Red line:
https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-red-ssd/product-brief-western-digital-wd-red-sn700-nvme-ssd.pdf

MTTF: 1.7 million hours. Snicker.

Endurance: 1000TB for the 500GB version, 2000TB for the 1TB
version. A nice upgrade from 600 writes to 2000 writes.

Unrelated, but cool: the 4TB version weighs 10 grams.

-dsr-



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread Dan Ritter
Jeffrey Walton wrote: 
> On Fri, Nov 11, 2022 at 2:01 AM  wrote:
> >
> > On Fri, Nov 11, 2022 at 07:15:07AM +0100, hw wrote:
> > > On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:
> >... Here's a report
> > by folks who do lots of HDDs and SSDs:
> >
> >   https://www.backblaze.com/blog/backblaze-hard-drive-stats-q1-2021/
> >
> > The gist, for disks playing similar roles (they don't use yet SSDs for bulk
> > storage, because of the costs): 2/1518 failures for SSDs, 44/1669 for HDDs.
> 
> Forgive my ignorance... Isn't Mean Time Between Failures (MTBF) the
> interesting statistic?
> 
> When selecting hardware, like HDD vs SSD, you don't know if and when a
> failure is going to occur. You can only estimate failures using MTBF.
> 
> After the installation and with failure data in hand, you can check if
> the MTBF estimate is accurate. I expect most of the HDD and SSD
> failures to fall within 1 standard deviation of the reported MTBF. And
> you will have some data points that show failure before MTBF, and some
> data points that show failure after MTBF.


You'd like that to be true. I'd like that to be true. What do we
actually see?

Here's Western Digital's new 22TB 7200RPm NAS disk:
https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-red-pro-hdd/product-brief-western-digital-wd-red-pro-hdd.pdf

Claimed MTBF: 1 million hours. Believe it or not, this is par
for the course for high-end disks.

24 hours a day, 365 days a year: 8760 hours per year.
1,000,000 / 8,760 ≈ 114 years.

So, no: MTBF numbers must be presumed to be malicious lies.

More reasonable: this drive comes with a 5 year warranty. Treat
that as the expected lifetime, and you'll be somewhat closer to
the truth maybe.

Here's a new number that used to be just for SSDs, but is now
for spinners as well: expected workload (per year) or Total TB
Written (lifetime). For the above disk family, all of them claim
300TB/year. That's a little bizarre: it's 9.51 MB/s. Modern
high end spinners also claim 200MB/s or more when feeding them
continuous writes. Apparently WD thinks that can't be sustained
more than 5% of the time.

-dsr-



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread tomas
On Fri, Nov 11, 2022 at 12:53:21PM -0500, Jeffrey Walton wrote:
> On Fri, Nov 11, 2022 at 2:01 AM  wrote:
> >
> > On Fri, Nov 11, 2022 at 07:15:07AM +0100, hw wrote:
> > > On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:
> >... Here's a report
> > by folks who do lots of HDDs and SSDs:
> >
> >   https://www.backblaze.com/blog/backblaze-hard-drive-stats-q1-2021/
> >
> > The gist, for disks playing similar roles (they don't use yet SSDs for bulk
> > storage, because of the costs): 2/1518 failures for SSDs, 44/1669 for HDDs.
> 
> Forgive my ignorance... Isn't Mean Time Between Failures (MTBF) the
> interesting statistic?

I think what hede was hinting at was that early SSDs had a (pretty)
limited number of write cycles per "block" [1] before failure; they had
(and have) extra blocks to substitute broken ones and do a fair amount
of "wear leveling behind the scenes. So it made more sense to measure
failures along the "TB written" axis than along the time axis.

Cheers

[1] In a very sloppy sense: those beasts have big write units, 256K and
up.

-- 
t




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread Jeffrey Walton
On Fri, Nov 11, 2022 at 2:01 AM  wrote:
>
> On Fri, Nov 11, 2022 at 07:15:07AM +0100, hw wrote:
> > On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:
>... Here's a report
> by folks who do lots of HDDs and SSDs:
>
>   https://www.backblaze.com/blog/backblaze-hard-drive-stats-q1-2021/
>
> The gist, for disks playing similar roles (they don't use yet SSDs for bulk
> storage, because of the costs): 2/1518 failures for SSDs, 44/1669 for HDDs.

Forgive my ignorance... Isn't Mean Time Between Failures (MTBF) the
interesting statistic?

When selecting hardware, like HDD vs SSD, you don't know if and when a
failure is going to occur. You can only estimate failures using MTBF.

After the installation and with failure data in hand, you can check if
the MTBF estimate is accurate. I expect most of the HDD and SSD
failures to fall within 1 standard deviation of the reported MTBF. And
you will have some data points that show failure before MTBF, and some
data points that show failure after MTBF.

Jeff



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread tomas
On Fri, Nov 11, 2022 at 05:05:51PM -, Curt wrote:
> On 2022-11-11,   wrote:
> >
> > I just contested that their failure rate is higher than that of HDDs.
[...]

> If he prefers extrapolating his anecdotal personal experience to a
> general rule rather than applying a verifiable general rule to his
> personal experience, then there's just nothing to be done for him!
> 
> 
> > I'm out of this thread.

So you lured me in again, you baddy :)

Everyone has a right to their prejudices. I cherish mine too. My beef
was rather with being misrepresented myself.

Cheers
-- 
t




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread Curt
On 2022-11-11,   wrote:
>
> I just contested that their failure rate is higher than that of HDDs.
> This is something which was true in early days, but nowadays it seems
> to be just a prejudice.

If he prefers extrapolating his anecdotal personal experience to a
general rule rather than applying a verifiable general rule to his
personal experience, then there's just nothing to be done for him!


> I'm out of this thread.
>



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-11 Thread Dan Ritter
hw wrote: 
> On Thu, 2022-11-10 at 20:32 -0500, Dan Ritter wrote:
> > Linux-Fan wrote: 
> > 
> > 
> > [...]
> > * RAID 5 and 6 restoration incurs additional stress on the other
> >   disks in the RAID which makes it more likely that one of them
> >   will fail. The advantage of RAID 6 is that it can then recover
> >   from that...
> 
> Disks are always being stressed when used, and they're being stressed as well
> when other types of RAID arrays than 5 or 6 are being rebuilt.  And is there
> evidence that disks fail *because* RAID arrays are being rebuilt or would they
> have failed anyway when stressed?

Does it matter? The observed fact is that some notable
proportion of RAID 5/6 rebuilds fail because another drive in
that group has failed. The drives were likely to be from the
same cohort of the manufacturer, and to have experienced very
similar read/write activity over their lifetime.

To some extent this can be ameliorated by using disks from
multiple manufacturers or different batches, but there are only
three rotating disk makers left and managing this is difficult
to arrange at scale.


> > Most of the computers in my house have one disk. If I value any
> > data on that disk,
> 
> Then you don't use only one disk but redundancy.  There's also your time and
> nerves you might value.

It turns out to be really hard to fit a second disk in a laptop,
or in a NUC-sized machine.


> >  I back it up to the server, which has 4 4TB
> > disks in ZFS RAID10. If a disk fails in that, I know I can
> > survive that and replace it within 24 hours for a reasonable
> > amount of money -- rather more reasonable in the last few
> > months.
> 
> How do you get a new suitable disk within 24 hours?  For reasonable amounts of
> money?  Disk prices keep changing all the time.

My local store, MicroCenter, has --- 20ish 4TB disks in stock. I
can go get one in an hour.

Amazon will ship me a suitable drive next day or faster -- I
have ordered some items in the morning and received them before
nightfall -- at a lower cost, but at the price of enriching
Bezos.


-dsr-



network raid (Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread hede

On 10.11.2022 14:40, Curt wrote:

(or maybe a RAID array is
conceivable over a network and a distance?).


Not only conceivable, but indeed practicable: Linbit DRBD
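
A rough sketch of a two-node DRBD resource -- host names, devices and addresses
are invented here, and the exact syntax depends on the DRBD version:

  # /etc/drbd.d/r0.res
  resource r0 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    meta-disk internal;
    on alpha { address 192.168.1.1:7789; }
    on beta  { address 192.168.1.2:7789; }
  }

  # on both nodes:
  drbdadm create-md r0
  drbdadm up r0
  # once, on the node that should become primary:
  drbdadm primary --force r0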



Re: deduplicating file systems: VDO with Debian?

2022-11-11 Thread rhkramer
On Thursday, November 10, 2022 09:06:39 AM Dan Ritter wrote:
> If you need a filesystem that is larger than a single disk (that you can
> afford, or that exists), RAID is the name for the general approach to
> solving that.

Picking a nit, I would say: "RAID is the name for *a* general approach to
solving that." (LVM is another.)

-- 
rhk

If you reply: snip, snip, and snip again; leave attributions; avoid HTML; 
avoid top posting; and keep it "on list".  (Oxford comma included at no 
charge.)  If you change topics, change the Subject: line. 

Writing is often meant for others to read and understand (legal agreements 
excepted?) -- make it easier for your reader by various means, including 
liberal use of whitespace and minimal use of (obscure?) jargon, abbreviations, 
acronyms, and references.

If someone else has already responded to a question, decide whether any 
response you add will be helpful or not ...

A picture is worth a thousand words -- divide by 10 for each minute of video 
(or audio) or create a transcript and edit it to 10% of the original.



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread tomas
On Fri, Nov 11, 2022 at 09:12:36AM +0100, hw wrote:
> Backblaze does all kinds of things.

whatever.

> > The gist, for disks playing similar roles (they don't use yet SSDs for bulk
> > storage, because of the costs): 2/1518 failures for SSDs, 44/1669 for HDDs.
> > 
> > I'll leave the maths as an exercise to the reader.
> 
> Numbers never show you the real picture, especially not statistical ones.

Your gut feeling might be more relevant for you. But it's not for me,
so that's why I'll bow out of this thread :-)

>   You
> even say it yourself that the different types of disks were used for different
> purposes.  That makes your numbers meaningless.

Please, re-read what I wrote (you even quoted it above). It's nearly the
opposite of what you say here. I said "similar roles". I won't read the
blog entry for you aloud here.

> Out of curiosity, what do these numbers look like in something like survivors
> per TB?  Those numbers will probably show a very different picture, won't
> they?

Did I say similar roles? The devices compared are doing the same job, so TB
per unit time are, for all we know, comparable.

> And when even Backblaze doesn't use SSDs for backup storage because they're
> expensive, then why would you suggest or assume that anyone do or does that?

There again. Please read what others write before paraphrasing them
wrongly. I never said you should use SSDs for bulk data storage. They
are too expensive for that. That's something you, backblaze and me
all agreed from the start. Why on earth do you bring that up here, then?

I just contested that their failure rate is higher than that of HDDs.
This is something which was true in early days, but nowadays it seems
to be just a prejudice.

I'm out of this thread.

Cheers
-- 
t




Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-11 Thread DdB
On 11.11.2022 at 07:36, hw wrote:
> That's on https://docs.freebsd.org/en/books/handbook/zfs/
> 
> I don't remember where I read about 8, could have been some documentation 
> about
> FreeNAS.

Well, OTOH there do exist some considerations which may have led to
that number sticking somewhere, but I have seen people with MUCH larger
pools.

In order to avoid wasting too much space, there was a "formula" to
calculate the optimum vdev width:
First take a number of data disks that is a power of two (like
2/4/8/16/...) and then add the number of disks you need for redundancy
(raidz = 1, raidz2 = 2, raidz3 = 3). That gives nice spots like 4+2
= 6 (identical) disks for raidz2, or 8+3 = 11 for raidz3. Those numbers are
sweet spots for the size of vdevs; otherwise, more space gets wasted on
the drives. But that is only ONE consideration. My motherboard has 8
connectors for SATA, + 2 for NVMe, which limited my options more than
anything.
And after long consideration, I opted for 4 mirrored vdevs, giving even
more space to redundancy, but gaining read speed.
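
For reference, such a layout would be created roughly like this (disk names
are placeholders; use stable device names in practice):

  zpool create tank mirror sda sdb mirror sdc sdd \
                    mirror sde sdf mirror sdg sdh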



Re: deduplicating file systems: VDO with Debian?

2022-11-11 Thread hw
On Thu, 2022-11-10 at 13:40 +, Curt wrote:
> On 2022-11-08, The Wanderer  wrote:
> > 
> > That more general sense of "backup" as in "something that you can fall
> > back on" is no less legitimate than the technical sense given above, and
> > it always rubs me the wrong way to see the unconditional "RAID is not a
> > backup" trotted out blindly as if that technical sense were the only one
> > that could possibly be considered applicable, and without any
> > acknowledgment of the limited sense of "backup" which is being used in
> > that statement.
> > 
> 
> Maybe it's a question of intent more than anything else. I thought RAID
> was intended for a server scenario where if a disk fails, your down
> time is virtually null, whereas a backup is intended to prevent data
> loss. RAID isn't ideal for the latter because it doesn't ship the saved
> data off-site from the original data (or maybe a RAID array is
> conceivable over a network and a distance?).
> 
> Of course, I wouldn't know one way or another, but the complexity (and
> substantial verbosity) of this thread seem to indicate that that all
> these concepts cannot be expressed clearly and succinctly, from which I
> draw my own conclusions.
> 

But the performance is great ;)



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread hw
On Thu, 2022-11-10 at 21:14 -0800, David Christensen wrote:
> On 11/10/22 07:44, hw wrote:
> > On Wed, 2022-11-09 at 21:36 -0800, David Christensen wrote:
> > > On 11/9/22 00:24, hw wrote:
> > >   > On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:
> 
> [...]
>
> > 
> Taking snapshots is fast and easy.  The challenge is deciding when to 
> destroy them.

That seems like an easy decision, just keep as many as you can and destroy the
ones you can't keep.

> [...]
> > > Without deduplication or compression, my backup set and 78 snapshots
> > > would require 3.5 TiB of storage.  With deduplication and compression,
> > > they require 86 GiB of storage.
> > 
> > Wow that's quite a difference!  What makes this difference, the compression
> > or
> > the deduplication? 
> 
> 
> Deduplication.

Hmm, that means that deduplication shrinks your data down to about 1/40 of its
size.  That's an awesome rate.

> > When you have snapshots, you would store only the
> > differences from one snapshot to the next, 
> > and that would mean that there aren't
> > so many duplicates that could be deduplicated.
> 
> 
> I do not know -- I have not crawled the ZFS code; I just use it.

Well, it's like a miracle :)

> > > Users can recover their own files without needing help from a system
> > > administrator.
> > 
> > You have users who know how to get files out of snapshots?
> 
> 
> Not really; but the feature is there.

That means you're still the one to get the files.

> [...]
> > 
> > 
> > > What were the makes and models of the 6 disks?  Of the SSD's?  If you
> > > have a 'zpool status' console session from then, please post it.
> > 
> > They were (and still are) 6x4TB WD Red (though one or two have failed over
> > time)
> > and two Samsung 850 PRO, IIRC.  I don't have an old session anymore.
> > 
> > These WD Red are slow to begin with.  IIRC, both SSDs failed and I removed
> > them.
> > 
> > The other instance didn't use SSDs but 6x2TB HGST Ultrastar.  Those aren't
> > exactly slow but ZFS is slow.
> 
> 
> Those HDD's should be fine with ZFS; but those SSD's are desktop drives, 
> not cache devices.  That said, I am making the same mistake with Intel 
> SSD 520 Series.  I have considered switching to one Intel Optane Memory 
> Series and a PCIe 4x adapter card in each server.

Isn't that very expensive and doesn't it wear out just as well?  Wouldn't it be better to
have the cache in RAM?


> Please run and post the relevant command for LVM, btrfs, whatever.

Well, what would that tell you?

> [...]
> > 
> > > What is the make and model of your controller cards?
> > 
> > They're HP smart array P410.  FreeBSD doesn't seem to support those.
> 
> 
> I use the LSI 9207-8i with "IT Mode" firmware (e.g. host bus adapter, 
> not RAID):

Well, I couldn't get those when I wanted them.  Since I didn't plan on using
ZFS, the P410s have to do.

> [...]
> > ... the data to back up is mostly (or even all) on btrfs. ... copy the
> > files over with rsync.  ...
> > the data comes from different machines and all backs up to one volume.
> 
> 
> I suggest creating a ZFS pool with a mirror vdev of two HDD's.

That would be way too small.

>   If you 
> can get past your dislike of SSD's,

I don't dislike them.  I'm using them where they give me advantages, and I don't
use them where they would give me disadvantages.

>  add a mirror of two SSD's as a 
> dedicated dedup vdev.  (These will not see the hard usage that cache 
> devices get.)

I think I have 2x80GB SSDs that are currently not in use.

>   Create a filesystem 'backup'.  Create child filesystems, 
> one for each host.  Create grandchild filesystems, one for the root 
> filesystem on each host.

Huh?  What's with these relationships?

>   Set up daily rsync backups of the root 
> filesystems on the various hosts to the ZFS grandchild filesystems.  Set 
> up zfs-auto-snapshot to take daily snapshots of everything, and retain 
> 10 snapshots.  Then watch what happens.

What do you expect to happen?  I'm thinking about changing my backup server ... 
In any case, I need to do more homework first.



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-11 Thread hw
On Fri, 2022-11-11 at 08:01 +0100, to...@tuxteam.de wrote:
> On Fri, Nov 11, 2022 at 07:15:07AM +0100, hw wrote:
> > On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:
> 
> [...]
> 
> > Why would anyone use SSDs for backups?  They're way too expensive for that.
> 
> Possibly.
> 
> > So far, the failure rate with SSDs has been not any better than the failure
> > rate
> > of hard disks.  Considering that SSDs are supposed to fail less, the
> > experience
> > with them is pretty bad.
> 
> You keep pulling things out of whatever (thin air, it seems).

That's more like what you're doing.  In my case, it's my own experience.

>  Here's a report
> by folks who do lots of HDDs and SDDs:
> 
>   https://www.backblaze.com/blog/backblaze-hard-drive-stats-q1-2021/

Backblaze does all kinds of things.

> The gist, for disks playing similar roles (they don't use yet SSDs for bulk
> storage, because of the costs): 2/1518 failures for SSDs, 44/1669 for HDDs.
> 
> I'll leave the maths as an exercise to the reader.

Numbers never show you the real picture, especially not statistical ones.  You
even say it yourself that the different types of disks were used for different
purposes.  That makes your numbers meaningless.

Out of curiosity, what do these numbers look like in something like survivors
per TB?  Those numbers will probably show a very different picture, won't they?

And when even Backblaze doesn't use SSDs for backup storage because they're
expensive, then why would you suggest or assume that anyone do or does that?



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-11 Thread hw
On Thu, 2022-11-10 at 23:12 -0500, Michael Stone wrote:
> On Thu, Nov 10, 2022 at 08:32:36PM -0500, Dan Ritter wrote:
> > * RAID 5 and 6 restoration incurs additional stress on the other
> >  disks in the RAID which makes it more likely that one of them
> >  will fail.
> 
> I believe that's mostly apocryphal; I haven't seen science backing that 
> up, and it hasn't been my experience either.

Maybe it's a myth that comes about when someone rebuilds a RAID and yet another
disk in it fails (because they're all the same age and have been running under
the same conditions).  It's easy to jump to conclusions, and easy jumps are what
people like.

OTOH, it's not implausible that a disk might fail just when it's working
particularly hard.  If it hadn't been working so hard, maybe it would have
failed later, after having had more time to wear out, or when the ambient
temperatures are higher in the summer.  So who knows?

> >  The advantage of RAID 6 is that it can then recover
> >  from that...
> 
> The advantage to RAID 6 is that it can tolerate a double disk failure. 
> With RAID 1 you need 3x your effective capacity to achieve that and even 
> though storage has gotten cheaper, it hasn't gotten that cheap. (e.g., 
> an 8 disk RAID 6 has the same fault tolerance as an 18 disk RAID 1 of 
> equivalent capacity, ignoring pointless quibbling over probabilities.)

so with RAID6, 3x8 is 18 instead of 24

With 18 disks more can go wrong than with 8.  That's all kinda confusing.



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread hw
On Thu, 2022-11-10 at 20:32 -0500, Dan Ritter wrote:
> Linux-Fan wrote: 
> 
> 
> [...]
> * RAID 5 and 6 restoration incurs additional stress on the other
>   disks in the RAID which makes it more likely that one of them
>   will fail. The advantage of RAID 6 is that it can then recover
>   from that...

Disks are always being stressed when used, and they're being stressed as well
when other types of RAID arrays than 5 or 6 are being rebuilt.  And is there
evidence that disks fail *because* RAID arrays are being rebuilt or would they
have failed anyway when stressed?

> * RAID 10 gets you better read performance in terms of both
>   throughput and IOPS relative to the same number of disks in
>   RAID 5 or 6. Most disk activity is reading.
> 

and it requires more disks for the same capacity

For disks used for backups, most activity is writing.  That goes for some other
purposes as well.

> [...]
> 
>  The power of open source software is that we can make
> opportunities open to people with small budgets that are
> otherwise reserved for people with big budgets.

That's only one advantage.

> Most of the computers in my house have one disk. If I value any
> data on that disk,

Then you don't use only one disk but redundancy.  There's also your time and
nerves you might value.

>  I back it up to the server, which has 4 4TB
> disks in ZFS RAID10. If a disk fails in that, I know I can
> survive that and replace it within 24 hours for a reasonable
> amount of money -- rather more reasonable in the last few
> months.

How do you get a new suitable disk within 24 hours?  For reasonable amounts of
money?  Disk prices keep changing all the time.

Backups are no substitute for redundancy.



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread hw
.. :)

> I still worry enough about backups to have written my own software:
> https://masysma.net/32/jmbb.xhtml
> and that I am also evaluating new developments in that area to probably  
> replace my self-written program by a more reliable (because used by more  
> people!) alternative:
> https://masysma.net/37/backup_tests_borg_bupstash_kopia.xhtml
> > 

cool :)

> [...]
> > 
> > Is anyone still using ext4?  I'm not saying it's bad or anything, it only  
> > seems that it has gone out of fashion.
> 
> IIRC its still Debian's default.

Hm, I haven't really used Debian in a long time.  There's probably no reason to
change that.  If you want something else, you can always go for it.

>  Its my file system of choice unless I have  
> very specific reasons against it. I have never seen it fail outside of  
> hardware issues. Performance of ext4 is quite acceptable out of the box.  
> E.g. it seems to be slightly faster than ZFS for my use cases.  
> Almost every Linux live system can read it. There are no problematic  
> licensing or stability issues whatsoever. By its popularity its probably one  
> of the most widely-deployed Linux file systems which may enhance the chance  
> that whatever problem you incur with ext4 someone else has had before...
> 

I'm not sure it's most widespread.  Centos (and Fedora) defaulted to xfs quite
some time ago, and Fedora more recently defaulted to btrfs (a while after Redhat
announced they would remove btrfs from RHEL altogether).  Centos went down the
drain when it mutated into an outdated version of Fedora, and RHEL probably
isn't any better.

So assuming that RHEL and Centos may be more widespread than Debian because
there's lots of hardware supporting those but not Debian, I wouldn't think that
ext4 is the most widespread; xfs is probably more common, until btrfs has replaced it.

> > I'm considering using snapshots.  Ext4 didn't have those last time I
> > checked.
> 
> Ext4 still does not offer snapshots.

awww

It's a shame that btrfs is going so dead slow that it might be replaced with
something new before ever getting there, leaving Linux without a fully featured
file system while we've been waiting on it for the last 10 years.

>  The traditional way to do snapshots  
> outside of fancy BTRFS and ZFS file systems is to add LVM to the equation  
> although I do not have any useful experience with that.

Ugh.  Don't even try it, LVM sucks badly.  It works, but it's inflexible to the
point of being not only completely useless but even detrimental in practice. 
The idea was to provide faster storage for VM images than files residing in a
file system.  Well, screw that.  I can copy and/or move a VM image in a file
just as easily as any other file, even from one machine to another.  Good luck
trying that with LVM.  I lost a whole VM that way once.  And are the files any
slower?  It doesn't seem so.

Snapshots?  IIRC only on the same LVM volume, and when that is full, you're out
of luck.  It's a way to waste like 60% of your disk space because you have to
keep spare space in case you want to make another snapshot or VM.  You could
make one with dd, but that's unwieldy.
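
To illustrate the spare-space point: an LVM snapshot reserves a fixed
copy-on-write area up front. A sketch only, with made-up names and sizes:

  lvcreate -s -L 20G -n vmimg-snap /dev/vg0/vmimg
  # if more than 20G of the origin changes, the snapshot becomes invalid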

>  Specifically, I am  
> not using snapshots at all so far, besides them being readily available on  
> ZFS :)

Well, for me they seem to be a really good option for incremental backups :)



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread tomas
On Fri, Nov 11, 2022 at 07:15:07AM +0100, hw wrote:
> On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:

[...]

> Why would anyone use SSDs for backups?  They're way too expensive for that.

Possibly.

> So far, the failure rate with SSDs has been not any better than the failure 
> rate
> of hard disks.  Considering that SSDs are supposed to fail less, the 
> experience
> with them is pretty bad.

You keep pulling things out of whatever (thin air, it seems). Here's a report
by folks who do lots of HDDs and SSDs:

  https://www.backblaze.com/blog/backblaze-hard-drive-stats-q1-2021/

The gist, for disks playing similar roles (they don't use yet SSDs for bulk
storage, because of the costs): 2/1518 failures for SSDs, 44/1669 for HDDs.

I'll leave the maths as an exercise to the reader.

Cheers
-- 
t




Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread hw
On Thu, 2022-11-10 at 14:28 +0100, DdB wrote:
> On 10.11.2022 at 13:03, Greg Wooledge wrote:
> > If it turns out that '?' really is the filename, then it becomes a ZFS
> > issue with which I can't help.
> 
> just tested: I could create, rename, delete a file with that name on a
> zfs filesystem just as with any other filesystem.
> 
> But: i recall having seen an issue with corrupted filenames in a
> snapshot once (several years ago though). At the time, i did resort to
> send/recv to get the issue straightened out.

Well, the ZFS version in use is ancient ...  But that I could rename it is a
good sign.

> But it is very much more likely, that the filename '?' is entirely
> unrelated to zfs. Although zfs is perceived as being easy to handle
> (only 2 commands need to be learned: zpool and zfs),

Ha, it's far from easy.  These commands have many options ...

>  it takes a while to
> get acquainted with all the concepts and behaviors. Take some time to
> play with an installation (in a vm or just with a file based pool should
> be considered).

Ah, yes, that's a good idea :)



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread hw
On Thu, 2022-11-10 at 08:48 -0500, Dan Ritter wrote:
> hw wrote: 
> > And I've been reading that when using ZFS, you shouldn't make volumes with
> > more
> > than 8 disks.  That's very inconvenient.
> 
> 
> Where do you read these things?

I read things like this:

"Sun™ recommends that the number of devices used in a RAID-Z configuration be
between three and nine. For environments requiring a single pool consisting of
10 disks or more, consider breaking it up into smaller RAID-Z groups. If two
disks are available, ZFS mirroring provides redundancy if required. Refer to
zpool(8) for more details."

That's on https://docs.freebsd.org/en/books/handbook/zfs/

I don't remember where I read about 8, could have been some documentation about
FreeNAS.  I've also been reading different amounts of RAM required for
deduplication, so who knows what's true.

> The number of disks in a vdev can be optimized, depending on
> your desired redundancy method, total number of drives, and
> tolerance for reduced performance during resilvering. 
> 
> Multiple vdevs together form a zpool. Filesystems are allocated from
> a zpool.
> 
> 8 is not a magic number.
> 

You mean like here:
https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/

That seems rather complicated.  I guess it's just a bad guide.  I'll find out if
I use ZFS.



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread hw
On Thu, 2022-11-10 at 23:05 -0500, Michael Stone wrote:
> On Thu, Nov 10, 2022 at 06:55:27PM +0100, hw wrote:
> > On Thu, 2022-11-10 at 11:57 -0500, Michael Stone wrote:
> > > On Thu, Nov 10, 2022 at 05:34:32PM +0100, hw wrote:
> > > > And mind you, SSDs are *designed to fail* the sooner the more data you
> > > > write
> > > > to
> > > > them.  They have their uses, maybe even for storage if you're so
> > > > desperate,
> > > > but
> > > > not for backup storage.
> > > 
> > > It's unlikely you'll "wear out" your SSDs faster than you wear out your
> > > HDs.
> > > 
> > 
> > I have already done that.
> 
> Then you're either well into "not normal" territory and need to buy an 
> SSD with better write longevity (which I seriously doubt for a backup 
> drive) or you just got unlucky and got a bad copy (happens with 
> anything) or you've misdiagnosed some other issue.
> 

Why would anyone use SSDs for backups?  They're way too expensive for that.

So far, the failure rate with SSDs has been not any better than the failure rate
of hard disks.  Considering that SSDs are supposed to fail less, the experience
with them is pretty bad.

There was no misdiagnosis.  Have you ever had a failed SSD?  They usually just
disappear.  I've had one exception in which the SSD at first only sometimes
disappeared and came back, until it disappeared and didn't come back.

There was no "not normal" territory, either, unless maybe you consider ZFS cache
as "not normal".  In that case, I would argue that SSDs are well suited for such
applications because they allow for lots of IOPS and high data transfer rates,
and a hard disk probably wouldn't have failed in place of the SSD because they
don't wear out so quickly.  Since SSDs are so well suited for such purposes,
that can't be "not normal" territory for them.  Perhaps they just need to be
more resilient than they are.

You could argue that the SSDs didn't fail because they were worn out but for
other reasons.  I'd answer that it's irrelevant for the user why exactly a disk
failed, especially when it just disappears, and that hard disks don't fail
because the storage media wears out like SSDs do but for other reasons.

Perhaps you could buy an SSD that withstands being written to it better.  The
question then is if that's economical.  It would also be based on the assumption
that SSDs don't so much fail for other reasons than the storage media being worn
out.  Since all the failed SSDs have disappeared, I have to assume that they
didn't fail because the storage media was worn out but for other reasons.

Considering that, SSDs generally must be of really bad quality for that to
happen, don't you think?



Re: weird directory entry on ZFS volume (Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)))

2022-11-10 Thread David Christensen

On Thu, Nov 10, 2022 at 05:54:00AM +0100, hw wrote:

ls -la
insgesamt 5
drwxr-xr-x  3 namefoo namefoo    3 16. Aug 22:36 .
drwxr-xr-x 24 root    root    4096  1. Nov 2017  ..
drwxr-xr-x  2 namefoo namefoo    2 21. Jan 2020  ?
namefoo@host /srv/datadir $ ls -la '?'
ls: Zugriff auf ? nicht möglich: Datei oder Verzeichnis nicht gefunden
[i.e., "ls: cannot access ?: No such file or directory"]
namefoo@host /srv/datadir $


This directory named ? appeared on a ZFS volume for no reason and I can't
access it and can't delete it.  A scrub doesn't repair it.  It doesn't seem to
do any harm yet, but it's annoying.

Any idea how to fix that?



2022-11-10 21:24:23 dpchrist@f3 ~/foo
$ freebsd-version ; uname -a
12.3-RELEASE-p7
FreeBSD f3.tracy.holgerdanske.com 12.3-RELEASE-p6 FreeBSD 
12.3-RELEASE-p6 GENERIC  amd64


2022-11-10 21:24:45 dpchrist@f3 ~/foo
$ bash --version
GNU bash, version 5.2.0(3)-release (amd64-portbld-freebsd12.3)
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 



This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

2022-11-10 21:24:52 dpchrist@f3 ~/foo
$ ll
total 13
drwxr-xr-x   2 dpchrist  dpchrist   2 2022/11/10 21:24:21 .
drwxr-xr-x  14 dpchrist  dpchrist  30 2022/11/10 21:24:04 ..

2022-11-10 21:25:03 dpchrist@f3 ~/foo
$ touch '?'

2022-11-10 21:25:08 dpchrist@f3 ~/foo
$ ll
total 14
drwxr-xr-x   2 dpchrist  dpchrist   3 2022/11/10 21:25:08 .
drwxr-xr-x  14 dpchrist  dpchrist  30 2022/11/10 21:24:04 ..
-rw-r--r--   1 dpchrist  dpchrist   0 2022/11/10 21:25:08 ?

2022-11-10 21:25:11 dpchrist@f3 ~/foo
$ rm '?'
remove ?? y

2022-11-10 21:25:19 dpchrist@f3 ~/foo
$ ll
total 13
drwxr-xr-x   2 dpchrist  dpchrist   2 2022/11/10 21:25:19 .
drwxr-xr-x  14 dpchrist  dpchrist  30 2022/11/10 21:24:04 ..


David



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread David Christensen

On 11/10/22 07:44, hw wrote:

On Wed, 2022-11-09 at 21:36 -0800, David Christensen wrote:

On 11/9/22 00:24, hw wrote:
  > On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:



Be careful that you do not confuse a ~33 GiB full backup set, and 78
snapshots over six months of that same full backup set, with a full
backup of 3.5 TiB of data.



The full backup isn't deduplicated?



"Full", "incremental", etc., occur at the backup utility level -- e.g. 
on top of the ZFS filesystem.  (All of my backups are full backups using 
rsync.)  ZFS deduplication occurs at the block level -- e.g. the bottom 
of the ZFS filesystem.  If your backup tool is writing to, or reading 
from, a ZFS filesystem, the backup tool is oblivious to the internal 
operations of ZFS (compression or none, deduplicaton or none, etc.) so 
long as the filesystem "just works".
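
One can see the effect from the filesystem side; a sketch with a placeholder
dataset name:

  zfs get used,logicalused,compressratio p1/backup
  zpool list p1                  # the DEDUP column shows the dedup ratio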




Writing to a ZFS filesystem with deduplication is much slower than
simply writing to, say, an ext4 filesystem -- because ZFS has to hash
every incoming block and see if it matches the hash of any existing
block in the destination pool.  Storing the existing block hashes in a
dedicated dedup virtual device will expedite this process.


But when it needs to write almost nothing because almost everything gets
deduplicated, can't it be faster than having to write everything?



There are many factors that affect how fast ZFS can write files to disk. 
 You will get the best answers if you run benchmarks using your 
hardware and data.
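
A very rough sketch of such a benchmark -- the path and sizes are placeholders,
and /dev/zero would be a poor source when compression is on, since zeros
compress away:

  # /dev/urandom may itself be the bottleneck on fast pools; fio with
  # pre-generated data, or a copy of real data, gives more honest numbers
  time sh -c 'dd if=/dev/urandom of=/tank/test/bench.bin bs=1M count=4096 && sync'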




  >> I run my backup script each night.  It uses rsync to copy files and
  >
  > Aww, I can't really do that because my server eats like 200-300W because it
  > has so many disks in it.  Electricity is outrageously expensive here.


Perhaps platinum rated power supplies?  Energy efficient HDD's/ SSD's?


If you pay for it ... :)

Running it once in a while for some hours to make backups is still possible.
Replacing the hardware is way more expensive.



My SOHO server has ~1 TiB of data.  A ZFS snapshot takes a few seconds. 
ZFS incremental replication to the backup server proceeds at anywhere 
from 0 to 50 MB/s, depending upon how much content is new or has changed.





  > Sounds like a nice setup.  Does that mean you use snapshots to keep multiple
  > generations of backups and make backups by overwriting everything after you
  > made a snapshot?

Yes.


I start thinking more and more that I should make use of snapshots.



Taking snapshots is fast and easy.  The challenge is deciding when to 
destroy them.



zfs-auto-snapshot can do both automatically:

https://packages.debian.org/bullseye/zfs-auto-snapshot

https://manpages.debian.org/bullseye/zfs-auto-snapshot/zfs-auto-snapshot.8.en.html



Without deduplication or compression, my backup set and 78 snapshots
would require 3.5 TiB of storage.  With deduplication and compression,
they require 86 GiB of storage.


Wow that's quite a difference!  What makes this difference, the compression or
the deduplication? 



Deduplication.



When you have snapshots, you would store only the
differences from one snapshot to the next, 
and that would mean that there aren't

so many duplicates that could be deduplicated.



I do not know -- I have not crawled the ZFS code; I just use it.



Users can recover their own files without needing help from a system
administrator.


You have users who know how to get files out of snapshots?



Not really; but the feature is there.



  >>>> For compressed and/or encrypted archives, image, etc., I do not use
  >>>> compression or de-duplication
  >>>
  >>> Yeah, they wouldn't compress.  Why no deduplication?
  >>
  >> Because I very much doubt that there will be duplicate blocks in such files.
  >
  > Hm, would it hurt?

Yes.  ZFS deduplication is resource intensive.


But you're using it already.



I have learned the hard way to only use deduplication when it makes sense.



What were the makes and models of the 6 disks?  Of the SSD's?  If you
have a 'zpool status' console session from then, please post it.


They were (and still are) 6x4TB WD Red (though one or two have failed over time)
and two Samsung 850 PRO, IIRC.  I don't have an old session anymore.

These WD Red are slow to begin with.  IIRC, both SSDs failed and I removed them.

The other instance didn't use SSDs but 6x2TB HGST Ultrastar.  Those aren't
exactly slow but ZFS is slow.



Those HDD's should be fine with ZFS; but those SSD's are desktop drives, 
not cache devices.  That said, I am making the same mistake with Intel 
SSD 520 Series.  I have considered switching to one Intel Optane Memory 
Series and a PCIe 4x adapter card in each server.




MySQL appears to have the ability to use raw disks.  Tuned correctly,
this should give the best results:

https://dev.mysql.com/doc/refman/8.0/en/innodb-system-tablespace.html#innodb-raw-devices


Could mysql 5.6 already do that?  I'll have to see if mariadb can do that now
...



I do not know -- I do not run MySQL or Maria.




Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread Michael Stone

On Thu, Nov 10, 2022 at 08:32:36PM -0500, Dan Ritter wrote:

* RAID 5 and 6 restoration incurs additional stress on the other
 disks in the RAID which makes it more likely that one of them
 will fail.


I believe that's mostly apocryphal; I haven't seen science backing that 
up, and it hasn't been my experience either.



 The advantage of RAID 6 is that it can then recover
 from that...


The advantage to RAID 6 is that it can tolerate a double disk failure. 
With RAID 1 you need 3x your effective capacity to achieve that and even 
though storage has gotten cheaper, it hasn't gotten that cheap. (e.g., 
an 8 disk RAID 6 has the same fault tolerance as an 18 disk RAID 1 of 
equivalent capacity, ignoring pointless quibbling over probabilities.)
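Spelled out with hypothetical 2 TB drives:

  8-disk RAID 6:   (8 - 2) x 2 TB = 12 TB usable, survives any two failures
  3-way mirrors:   12 TB / 2 TB  = 6 mirror sets x 3 disks = 18 disks for
                   the same capacity and two-failure tolerance per set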




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread Michael Stone

On Thu, Nov 10, 2022 at 06:55:27PM +0100, hw wrote:

On Thu, 2022-11-10 at 11:57 -0500, Michael Stone wrote:

On Thu, Nov 10, 2022 at 05:34:32PM +0100, hw wrote:
> And mind you, SSDs are *designed to fail* the sooner the more data you write
> to
> them.  They have their uses, maybe even for storage if you're so desperate,
> but
> not for backup storage.

It's unlikely you'll "wear out" your SSDs faster than you wear out your
HDs.



I have already done that.


Then you're either well into "not normal" territory and need to buy an 
SSD with better write longevity (which I seriously doubt for a backup 
drive) or you just got unlucky and got a bad copy (happens with 
anything) or you've misdiagnosed some other issue.




Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread Dan Ritter
Linux-Fan wrote: 
> I think the arguments of the RAID5/6 critics summarized were as follows:
> 
> * Running in a RAID level that is 5 or 6 degrades performance while
>   a disk is offline significantly. RAID 10 keeps most of its speed and
>   RAID 1 only degrades slightly for most use cases.
> 
> * During restore, RAID5 and 6 are known to degrade performance more compared
>   to restoring one of the other RAID levels.

* RAID 5 and 6 restoration incurs additional stress on the other
  disks in the RAID which makes it more likely that one of them
  will fail. The advantage of RAID 6 is that it can then recover
  from that...

* RAID 10 gets you better read performance in terms of both
  throughput and IOPS relative to the same number of disks in
  RAID 5 or 6. Most disk activity is reading.

> * Disk space has become so cheap that the savings of RAID5 may
>   no longer rectify the performance and reliability degradation
>   compared to RAID1 or 10.

I think that's a case-by-case basis. Every situation is
different, and should be assessed for cost, reliability and
performance concerns.

> All of these arguments come from a “server” point of view where it is
> assumed that
> 
> (1) You win something by running the server so you can actually
> tell that there is an economic value in it. This allows for
> arguments like “storage is cheap” which may not be the case at
> all if you are using up some tightly limited private budget.
> 
> (2) Uptime and delivering the service is paramount. Hence there
> are some considerations regarding the online performance of
> the server while the RAID is degraded and while it is restoring.
> If you are fine to take your machine offline or accept degraded
> performance for prolonged times then this does not apply of
> course. If you do not value the uptime making actual (even
> scheduled) copies of the data may be recommendable over
> using a RAID because such schemes may (among other advantages)
> protect you from accidental file deletions, too.

Even in household situations, knowing that you could have traded $100
last year for a working computer right now is an incentive to set up
disk mirroring. If you're storing lots of data that other
people in the household depend on, that might factor in to your
decisions, too.

Everybody has a budget. Some have big budgets, and some have
small. The power of open source software is that we can make
opportunities open to people with small budgets that are
otherwise reserved for people with big budgets.

Most of the computers in my house have one disk. If I value any
data on that disk, I back it up to the server, which has 4 4TB
disks in ZFS RAID10. If a disk fails in that, I know I can
survive that and replace it within 24 hours for a reasonable
amount of money -- rather more reasonable in the last few
months.

> > Is anyone still using ext4?  I'm not saying it's bad or anything, it
> > only seems that it has gone out of fashion.
> 
> IIRC it's still Debian's default. It's my file system of choice unless I have
> very specific reasons against it. I have never seen it fail outside of
> hardware issues. Performance of ext4 is quite acceptable out of the box.
> E.g. it seems to be slightly faster than ZFS for my use cases. Almost every
> Linux live system can read it. There are no problematic licensing or
> stability issues whatsoever. By its popularity it's probably one of the most
> widely-deployed Linux file systems which may enhance the chance that
> whatever problem you incur with ext4 someone else has had before...

All excellent reasons to use ext4.

-dsr-



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread DdB

Am 10.11.2022 um 22:37 schrieb Linux-Fan:
> Ext4 still does not offer snapshots. The traditional way to do
> snapshots outside of fancy BTRFS and ZFS file systems is to add LVM
> to the equation although I do not have any useful experience with
> that. Specifically, I am not using snapshots at all so far, besides
> them being readily available on ZFS

Yes, although i am heavily dependent on zfs, my OS resides on an
nvme-ssd with ext4. i rsync it as needed to an image file that i keep on
compressed zfs, and which i snapshot, back up and so on, rather than
the ext4-partition itself. - Seems like a good compromise to me.
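A rough sketch of that kind of arrangement, with made-up names and sizes:

  # compressed dataset holding the image file (names/sizes are made up)
  zfs create -o compression=lz4 tank/images
  truncate -s 100G /tank/images/root.img
  mkfs.ext4 -F /tank/images/root.img
  mkdir -p /mnt/rootimg
  mount -o loop /tank/images/root.img /mnt/rootimg

  # copy the live ext4 root into the image, then snapshot the dataset
  rsync -aHAXx --delete / /mnt/rootimg/
  umount /mnt/rootimg
  zfs snapshot tank/images@$(date +%F)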



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread Linux-Fan

hw writes:


On Wed, 2022-11-09 at 19:17 +0100, Linux-Fan wrote:
> hw writes:
> > On Wed, 2022-11-09 at 14:29 +0100, didier gaumet wrote:
> > > Le 09/11/2022 à 12:41, hw a écrit :


[...]


> > I'd
> > have to use mdadm to create a RAID5 (or use the hardware RAID but that  
> > isn't

>
> AFAIK BTRFS also includes some integrated RAID support such that you do  
> not necessarily need to pair it with mdadm.


Yes, but RAID56 is broken in btrfs.

> It is advised against using for RAID 
> 5 or 6 even in most recent Linux kernels, though:
>
> 
https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid56-status-and-recommended-practices

Yes, that's why I would have to use btrfs on mdadm when I want to make a  
RAID5.

That kinda sucks.
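For what it's worth, that stack is simple to assemble; a sketch with
placeholder device names:

  # three-disk RAID5 array out of whole disks (names are placeholders)
  mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd

  # a single-device btrfs filesystem on top of the md device
  mkfs.btrfs -L backup /dev/md0
  mkdir -p /srv/backup
  mount /dev/md0 /srv/backup

btrfs on top still gives checksums, compression and snapshots, while mdadm
handles the parity; only the btrfs-native raid56 mode is avoided.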

> RAID 5 and 6 have their own issues you should be aware of even when  
> running 

> them with the time-proven and reliable mdadm stack. You can find a lot of 
> interesting results by searching for “RAID5 considered harmful” online.  
> This 
> one is the classic that does not seem to make it to the top results,  
> though:


Hm, really?  The only time that RAID5 gave me trouble was when the hardware  


[...]

I have never used RAID5 so how would I know :)

I think the arguments of the RAID5/6 critics summarized were as follows:

* Running in a RAID level that is 5 or 6 degrades performance while
  a disk is offline significantly. RAID 10 keeps most of its speed and
  RAID 1 only degrades slightly for most use cases.

* During restore, RAID5 and 6 are known to degrade performance more compared
  to restoring one of the other RAID levels.

* Disk space has become so cheap that the savings of RAID5 may
  no longer rectify the performance and reliability degradation
  compared to RAID1 or 10.

All of these arguments come from a “server” point of view where it is  
assumed that


(1) You win something by running the server so you can actually
tell that there is an economic value in it. This allows for
arguments like “storage is cheap” which may not be the case at
all if you are using up some tightly limited private budget.

(2) Uptime and delivering the service is paramount. Hence there
are some considerations regarding the online performance of
the server while the RAID is degraded and while it is restoring.
If you are fine to take your machine offline or accept degraded
performance for prolonged times then this does not apply of
course. If you do not value the uptime making actual (even
scheduled) copies of the data may be recommendable over
using a RAID because such schemes may (among other advantages)
protect you from accidental file deletions, too.

Also note that in today's computing landscape, not all unwanted file  
deletions are accidental. With the advent of “crypto trojans” adversaries  
exist that actually try to encrypt or delete your data to extort a ransom.


More than one disk can fail?  Sure can, and it's one of the reasons why I make
backups.

You also have to consider costs.  How much do you want to spend on storage
and on backups?  And do you want to make yourself crazy worrying about your
data?


I am pretty sure that if I separate my PC into GPU, CPU, RAM and Storage, I  
spent most on storage actually. Well established schemes of redundancy and  
backups make me worry less about my data.


I still worry enough about backups to have written my own software:
https://masysma.net/32/jmbb.xhtml
and that I am also evaluating new developments in that area to probably  
replace my self-written program by a more reliable (because used by more  
people!) alternative:

https://masysma.net/37/backup_tests_borg_bupstash_kopia.xhtml


> https://www.baarf.dk/BAARF/RAID5_versus_RAID10.txt
>
> If you want to go with mdadm (irrespective of RAID level), you might also 
> consider running ext4 and trade the complexity and features of the  
> advanced file systems for a good combination of stability and support.


Is anyone still using ext4?  I'm not saying it's bad or anything, it only  
seems that it has gone out of fashion.


IIRC it's still Debian's default. It's my file system of choice unless I have
very specific reasons against it. I have never seen it fail outside of  
hardware issues. Performance of ext4 is quite acceptable out of the box.  
E.g. it seems to be slightly faster than ZFS for my use cases.  
Almost every Linux live system can read it. There are no problematic  
licensing or stability issues whatsoever. By its popularity it's probably one
of the most widely-deployed Linux file systems which may enhance the chance  
that whatever problem you incur with ext4 someone else has had before...



I'm considering using snapshots.  Ext4 didn't have those last time I checked.


Ext4 still does not offer snapshots. The traditional way to do snapshots  
outside of fancy BTRFS and ZFS file systems is to add LVM

Re: weird directory entry on ZFS volume (Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)))

2022-11-10 Thread Greg Wooledge
On Thu, Nov 10, 2022 at 06:54:31PM +0100, hw wrote:
> Ah, yes.  I tricked myself because I don't have hd installed,

It's just a symlink to hexdump.

lrwxrwxrwx 1 root root 7 Jan 20  2022 /usr/bin/hd -> hexdump

unicorn:~$ dpkg -S usr/bin/hd
bsdextrautils: /usr/bin/hd
unicorn:~$ dpkg -S usr/bin/hexdump
bsdextrautils: /usr/bin/hexdump

> It's an ancient Gentoo

Ah.  Anyway, from the Debian man page:

   -C, --canonical
  Canonical  hex+ASCII display.  Display the input offset in hexa‐
  decimal, followed by sixteen space-separated, two-column,  hexa‐
  decimal  bytes, followed by the same sixteen bytes in %_p format
  enclosed in '|' characters.  Invoking the program as hd  implies
  this option.

Why on earth the default format of "hexdump" uses that weird 16-bit
little endian nonsense is beyond me.
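Roughly what the two formats produce for the same four bytes (column spacing
approximate):

  $ printf waht | hexdump
  0000000 6177 7468
  0000004
  $ printf waht | hexdump -C
  00000000  77 61 68 74                                       |waht|
  00000004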



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread hw
On Thu, 2022-11-10 at 11:57 -0500, Michael Stone wrote:
> On Thu, Nov 10, 2022 at 05:34:32PM +0100, hw wrote:
> > And mind you, SSDs are *designed to fail* the sooner the more data you write
> > to
> > them.  They have their uses, maybe even for storage if you're so desperate,
> > but
> > not for backup storage.
> 
> It's unlikely you'll "wear out" your SSDs faster than you wear out your 
> HDs.
> 

I have already done that.



Re: weird directory entry on ZFS volume (Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)))

2022-11-10 Thread hw
On Thu, 2022-11-10 at 09:30 -0500, Greg Wooledge wrote:
> On Thu, Nov 10, 2022 at 02:48:28PM +0100, hw wrote:
> > On Thu, 2022-11-10 at 07:03 -0500, Greg Wooledge wrote:
> 
> [...]
> > printf '%s\0' * | hexdump
> > 000 00c2 6177 7468     
> > 007
> 
> I dislike this output format, but it looks like there are two files
> here.  The first is 0xc2, and the second is 0x77 0x61 0x68 0x74 if
> I'm reversing and splitting the silly output correctly.  (This spells
> "waht", if I got it right.)
> > 

Ah, yes.  I tricked myself because I don't have hd installed, so I redirected
the output of printf into a file --- which I wanted to name 'what' but I
mistyped as 'waht' --- so I could load it into emacs and use hexl-mode.  But the
display kinda sucked and I found I have hexdump installed and used that. 
Meanwhile I totally forgot about the file I had created.

> [...]
> > 
> The file in question appears to have a name which is the single byte 0xc2.
> Since that's not a valid UTF-8 character, ls chooses something to display
> instead.  In your case, it chose a '?' character.

I'm the only one who can create files there, and I didn't create that.  Using
0xc2 as a file name speaks loudly against the idea that I'd have created that
file accidentally.

>   I'm guessing this is on
> an older release of Debian.

It's an ancient Gentoo which couldn't be updated in years because they broke the
update process.  Back then, Gentoo was the only Linux distribution that didn't
need fuse for ZFS that I could find.

> In my case, it does this:
> 
> unicorn:~$ mkdir /tmp/x && cd "$_"
> unicorn:/tmp/x$ touch $'\xc2'
> unicorn:/tmp/x$ ls -la
> total 80
> -rw-r--r--  1 greg greg 0 Nov 10 09:21 ''$'\302'
> drwxr-xr-x  2 greg greg  4096 Nov 10 09:21  ./
> drwxrwxrwt 20 root root 73728 Nov 10 09:21  ../
> 
> In my version of ls, there's a --quoting-style= option that can help
> control what you see.  But that's a tangent you can explore later.
> 
> Since we know the actual name of the file (subdirectory) now, let's just
> rename it to something sane.
> 
> mv $'\xc2' subdir
> 
> Then you can investigate it, remove it, or do whatever else you want.

Cool, I've renamed it, thank you very much :)  I'm afraid that the file system
will crash when I remove it ...  It's an empty directory.  Ever since I noticed
it, I couldn't do anything with it and I thought it's some bug in the file
system.



Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread hw
On Wed, 2022-11-09 at 14:22 +0100, Nicolas George wrote:
> hw (12022-11-08):
> > When I want to have 2 (or more) generations of backups, do I actually want
> > deduplication?  It leaves me with only one actual copy of the data which
> > seems
> > to defeat the idea of having multiple generations of backups at least to
> > some
> > extent.
> 
> The idea of having multiple generations of backups is not to have the
> data physically present in multiple places, this is the role of RAID.
> 
> The idea if having multiple generations of backups is that if you
> accidentally overwrite half your almost-completed novel with lines of
> ALL WORK AND NO PLAY MAKES JACK A DULL BOY and the backup tool runs
> before you notice it, you still have the precious data in the previous
> generation.

Nicely put :)

Let me rephrase a little:

How likely is it that a storage volume (not the underlying media, like discs in
a RAID array) would become unreadable in only some places so that it could be an
advantage to have multiple copies of the same data on the volume?

It's like I can't help unconsciously thinking that it's an advantage to have
multiple copies on a volume for some reason other than not overwriting the
almost complete novel.  At the same time, I find it difficult to imagine how
a volume could get damaged in only some places, and I don't see other reasons
than that.

Ok, another reason to keep multiple full copies on a volume is making things
simple, easy and thus perhaps more reliable than more complicated solutions.  At
least that's an intention.  But it costs a lot of disk space.



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread Michael Stone

On Thu, Nov 10, 2022 at 05:34:32PM +0100, hw wrote:

And mind you, SSDs are *designed to fail* the sooner the more data you write to
them.  They have their uses, maybe even for storage if you're so desperate, but
not for backup storage.


It's unlikely you'll "wear out" your SSDs faster than you wear out your 
HDs.




Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread hw
On Thu, 2022-11-10 at 10:47 +0100, DdB wrote:
> Am 10.11.2022 um 06:38 schrieb David Christensen:
> > What is your technique for defragmenting ZFS?
> well, that was meant more or less as a joke: there is none apart from
> offloading all the data, destroying and rebuilding the pool, and filling
> it again from the backup. But i do it from time to time if fragmentation
> got high, the speed improvements are obvious. OTOH the process takes
> days on my SOHO servers
> 

Does the faster access save you more time than the days you spend on
defragmenting it?

Perhaps after so many days of not defragging, but how many days?

Maybe use an archive pool that doesn't get deleted from?



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread hw
On Thu, 2022-11-10 at 02:19 -0500, gene heskett wrote:
> On 11/10/22 00:37, David Christensen wrote:
> > On 11/9/22 00:24, hw wrote:
> >  > On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:
> 
> [...]
> Which brings up another suggestion in two parts:
> 
> 1: use amanda, with tar and compression to reduce the size of the 
> backups.  And use a backup cycle of a week or 2 because amanda will if 
> advancing a level, only backup that which has been changed since the 
> last backup. On a quiet system, a level 3 backup for a 50gb network of 
> several machines can be under 100 megs. More on a busy system of course.
> Amanda keeps track of all that automatically.

Amanda is nice, yet quite unwieldy (try to get a file out of the backups ...). 
I used it a long time ago (with tapes) and I'd have to remember or re-learn how to
use amanda to back up particular directories and such ...

I think I might be better off learning more about snapshots.

> 2: As disks fail, replace them with SSD's which use much less power than 
> spinning rust. And they are typically 5x faster than commodity spinning 
> rust.

Is this a joke?

https://www.dell.com/en-us/shop/visiontek-16tb-class-qlc-7mm-25-ssd/apd/ab329068/storage-drives-media

Cool, 30% discount on black friday saves you $2280 for every pair of disks, and
it even starts right now.  (Do they really mean that? What if I had a datacenter
and ordered 512 or so of them?  I'd save almost $1.2 million, what a great
deal!)

And mind you, SSDs are *designed to fail* the sooner the more data you write to
them.  They have their uses, maybe even for storage if you're so desperate, but
not for backup storage.

> Here, and historically with spinning rust, backing up 5 machines, at 3am 
> every morning is around 10gb total and under 45 minutes. This includes 
> the level 0's it does by self adjusting the schedule to spread the level 
> 0's, AKA the fulls, out over the backup cycle so the amount of storage 
> used for any one backup run is fairly consistent.

That's almost half a month for 4TB.  Why does it take so long?



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread hw
 Why no deduplication?
>  >>
>  >>
>  >> Because I very much doubt that there will be duplicate blocks in 
> such files.
>  >
>  > Hm, would it hurt?
> 
> 
> Yes.  ZFS deduplication is resource intensive.

But you're using it already.

>  > Oh it's not about performance when degraded, but about performance. 
> IIRC when
>  > you have a ZFS pool that uses the equivalent of RAID5, you're still 
> limited to
>  > the speed of a single disk.  When you have a mysql database on such a ZFS
>  > volume, it's dead slow, and removing the SSD cache when the SSDs 
> failed didn't
>  > make it any slower.  Obviously, it was a bad idea to put the database 
> there, and
>  > I wouldn't do again when I can avoid it.  I also had my data on such 
> a volume
>  > and I found that the performance with 6 disks left much to desire.
> 
> 
> What were the makes and models of the 6 disks?  Of the SSD's?  If you 
> have a 'zpool status' console session from then, please post it.

They were (and still are) 6x4TB WD Red (though one or two have failed over time)
and two Samsung 850 PRO, IIRC.  I don't have an old session anymore.

These WD Red are slow to begin with.  IIRC, both SSDs failed and I removed them.

The other instance didn't use SSDs but 6x2TB HGST Ultrastar.  Those aren't
exactly slow but ZFS is slow.

> Constructing a ZFS pool to match the workload is not easy.

Well, back then there wasn't much information because ZFS was a pretty new
thing.

>   STFW there 
> are plenty of articles.  Here is a general article I found recently:
> 
> https://klarasystems.com/articles/choosing-the-right-zfs-pool-layout/

Thanks!  If I make a zpool for backups (or anything else), I need to do some
reading beforehand anyway.

> MySQL appears to have the ability to use raw disks.  Tuned correctly, 
> this should give the best results:
> 
> https://dev.mysql.com/doc/refman/8.0/en/innodb-system-tablespace.html#innodb-raw-devices

Could mysql 5.6 already do that?  I'll have to see if mariadb can do that now
...

> If ZFS performance is not up to your expectations, and there are no 
> hardware problems, next steps include benchmarking, tuning, and/or 
> adding or adjusting the hardware and its usage.

In theory, yes :)

I'm very reluctant to mess with the default settings of file systems.  When xfs
became available for Linux some time in the 90s, I managed to lose data when an
xfs file system got messed up.  Fortunately, I was able to recover almost all of
it from backups and from the file system.  I never really found out what caused
it, but a long time later I figured that I probably hadn't used mounting options
I should have used.  I had messed with the defaults for some reason I don't
remember.  That taught me a lesson.

>  >> ... invest in hardware to get performance.
> 
>  > Hardware like?
> 
> 
> Server chassis, motherboards, chipsets, processors, memory, disk host 
> bus adapters, disk racks, disk drives, network interface cards, etc..

Well, who's gonna pay for that?

>  > In theory, using SSDs for cache with ZFS should improve
>  > performance.  In practise, it only wore out the SSDs after a while, 
> and now it's
>  > not any faster without SSD cache.
> 
> 
> Please run 'zpool status' and post the console session (prompt, command 
> entered, output displayed).  Please correlate the vdev's to disk drive 
> makes and models.

See above ... The pool is a raidz1-0 with the 6x4TB Red drives, and no SSDs are
left.

> On 11/9/22 03:41, hw wrote:
> 
> > I don't have anything without ECC RAM, 
> 
> 
> Nice.

Yes :)  Buying used has its advantages.  You don't get the fastest, but you get
tons of ECC RAM and awesome CPUs and reliability.

> > and my server was never meant for ZFS.
> 
> 
> What is the make and model of your server?

I put it together myself.  The backup server uses a MSI mainboard with the
designation S0121 C204 SKU in a Chenbro case that has a 16xLFF backplane.  It
has only 16GB RAM and would max out at 32GB.  Unless you want ZFS with
deduplication, that's more than enough to make backups :)

I could replace it with a Dell r720 to get more RAM, but those can have only
12xLFF.  I could buy a new Tyan S7012 WGM4NR for EUR 50 before they're sold out
and stuff at least 48GB RAM into it and 2x5690 Xeons (which are supposed to go
into a Z800 I have sitting around and could try to sell, but I'm lazy), but then
I'd probably have to buy CPU coolers for it (I'm not sure the coolers of the
Z800 fit) and a new UPS because it would need so much power.  (I also have the
48GB because they came in a server I bought for the price of the 5690s (to get
the 5690s) and another 48GB in the Z800, but not all of it might fit ...)

It would be fun, but I don't really feel like throwing money at technology t

Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread d-u
On Wed, 09 Nov 2022 13:28:46 +0100
hw  wrote:

> On Tue, 2022-11-08 at 09:52 +0100, DdB wrote:
> > Am 08.11.2022 um 05:31 schrieb hw:  
> > > > That's only one point.  
> > > What are the others?
> > >   
> > > >  And it's not really some valid one, I think, as 
> > > > you do typically not run into space problems with one single
> > > > action (YMMV). Running multiple sessions and out-of-band
> > > > deduplication between them works for me.  
> > > That still requires you to have enough disk space for at least
> > > two full backups.
> > > I can see it working for three backups because you can
> > > deduplicate the first two, but not for two.  And why would I
> > > deduplicate when I have sufficient disk
> > > space.
> > >   
> > Your wording likely confuses 2 different concepts:  
> 
> No, I'm not confusing that :)  Everyone says so and I don't know
> why ...
> 
> > Deduplication avoids storing identical data more than once.
> > whereas
> > Redundancy stores information on more than one place on purpose to
> > avoid loos of data in case of havoc.
> > ZFS can do both, as it combines the features of a volume manager
> > with those of a filesystem and a software RAID.( I am using
> > zfsonlinux since its early days, for over 10 years now, but without
> > dedup. )
> > 
> > In the past, i used shifting/rotating external backup media for that
> > purpose, because, as the saying goes: RAID is NOT a backup! Today, i
> > have a second server only for the backups, using zfs as well, which
> > allows for easy incremental backups, minimizing traffic and disk
> > usage.
> > 
> > but you should be clear as to what you want: redundancy or
> > deduplication?  
> 
> The question is rather if it makes sense to have two full backups on
> the same machine for redundancy and to be able to go back in time, or
> if it's better to give up on redundancy and to have only one copy and
> use snapshots or whatever to be able to go back in time.

And the answer is no. The redundancy you gain from this is almost,
though not quite, meaningless, because of the large set of common
data-loss scenarios against which it offers no protection. You've made
it clear that the cost of storage media is a problem in your situation.
Doubling your backup server's requirement for scarce and expensive disk
space in order to gain a tiny fraction of the resiliency that's
normally implied by "redundancy" doesn't make sense. And being able to
go "back in time" can be achieved much more efficiently by using a
solution (be it off-the-shelf or roll-your-own) that starts with a full
backup and then just stores deltas of changes over time (aka incremental
backups). None of this, for the record, is "deduplication", and I
haven't seen any indication in this thread so far that actual
deduplication is relevant to your use case.
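A minimal sketch of the "full backup plus deltas" idea using plain rsync hard
links (paths and dates are placeholders):

  # yesterday's run serves as the hard-link reference; unchanged files
  # become hard links instead of new copies
  rsync -a --delete --link-dest=/backup/2022-11-09/ /home/ /backup/2022-11-10/

Each dated directory presents a complete tree, but unchanged files consume no
additional space because they are hard-linked to the previous run.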

> Of course it would be better to have more than one machine, but I don't
> have that.

Fine, just be realistic about the fact that this means you cannot in
any meaningful sense have "two full backups" or "redundancy". If and
when you can some day devote an RPi tethered to some disks to the job,
then you can set it up to hold a second, completely independent,
store of "full backup plus deltas". And *then* you would have
meaningful redundancy that offers some real resilience. Even better if
the second one is physically offsite. 

In the meantime, storing multiple full copies of your data on one
backup server is just a way to rapidly run out of disk space on your
backup server for essentially no reason.


Cheers!
 -Chris



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread Dan Ritter
Brad Rogers wrote: 
> On Thu, 10 Nov 2022 08:48:43 -0500
> Dan Ritter  wrote:
> 
> Hello Dan,
> 
> >8 is not a magic number.
> 
> Clearly, you don't read Terry Pratchett.   :-)

In the context of ZFS, 8 is not a magic number.

May you be ridiculed by Pictsies.

-dsr-



Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread Curt
On 2022-11-10, Nicolas George  wrote:
> Curt (12022-11-10):
>> Why restate it then needlessly? 
>
> To NOT state that you were wrong when you were not.
>
> This branch of the discussion bores me. Goodbye.
>

This isn't solid enough for a branch. It couldn't support a hummingbird.
And me too! That old ennui! Adieu!






Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread Nicolas George
Curt (12022-11-10):
> Why restate it then needlessly? 

To NOT state that you were wrong when you were not.

This branch of the discussion bores me. Goodbye.

-- 
  Nicolas George



Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread Curt
On 2022-11-10, Nicolas George  wrote:
> Curt (12022-11-10):
>> > one drive fails → you can replace it immediately, no downtime
>> That's precisely what I said,
>
> I was not stating that THIS PART of what you said was wrong.

Why restate it then needlessly? 

>>  so I'm baffled by the redundancy of your
>> words.
>
> Hint: my mail did not stop at the line you quoted. Reading mails to the
> end is usually a good practice to avoid missing information.
>

It's also an insect repellent.




Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread Nicolas George
Curt (12022-11-10):
> > one drive fails → you can replace it immediately, no downtime
> That's precisely what I said,

I was not stating that THIS PART of what you said was wrong.

>   so I'm baffled by the redundancy of your
> words.

Hint: my mail did not stop at the line you quoted. Reading mails to the
end is usually a good practice to avoid missing information.

-- 
  Nicolas George



Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread Curt
On 2022-11-10, Nicolas George  wrote:
> Curt (12022-11-10):
>> Maybe it's a question of intent more than anything else. I thought RAID
>> was intended for a server scenario where if a disk fails, your downtime
>> is virtually null, whereas a backup is intended to prevent data
>> loss.
>
> Maybe just use common sense. RAID means your data is present on several
> drives. You can just deduce what it can help for:
>
> one drive fails → you can replace it immediately, no downtime

That's precisely what I said, so I'm baffled by the redundancy of your
words. Or are you a human RAID? 




definitions of "backup" (was Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread The Wanderer
On 2022-11-10 at 09:06, Dan Ritter wrote:

> Now, RAID is not a backup because it is a single store of data: if
> you delete something from it, it is deleted. If you suffer a
> lightning strike to the server, there's no recovery from molten
> metal.

Here's where I find disagreement.

Say you didn't use RAID, and you had two disks in the same machine.

In order to avoid data loss in the event that one of the disks failed,
you engaged in a practice of copying all files from one disk onto the other.

That process could, and would, easily be referred to as backing up the
files. It's not a very distant backup, and it wouldn't protect against
that lightning strike, but it's still a separate backed-up copy.

But copying those files manually is a pain, so you might well set up a
process to automate it. That then becomes a scheduled backup, from one
disk onto another.

That scheduled process means that you have periods where the most
recently updated copy of the live data hasn't made it into the backup,
so there's still a time window where you're at risk of data loss if the
first disk fails. So you might set things up for the automated process
to in effect run continuously, writing the data to both disks in
parallel as it comes in.

And at that point you've basically reinvented mirroring RAID.

You've also lost the protection against "if you delete something from
it"; unlike deeper, more robust forms of backup, RAID does not protect
against accidental deletion. But you still have the protection against
"if one disk fails" - and that one single layer of protection against
one single cause of data loss is, I contend, still valid to refer to as
a "backup" just as much as the original manually-made copies were.

> Some filesystems have snapshotting. Snapshotting can protect you
> from the accidental deletion scenario, by allowing you to recover
> quickly, but does not protect you from lightning.
> 
> The lightning scenario requires a copy of the data in some other 
> location. That's a backup.

There are many possible causes of data loss. My contention is that
anything that protects against *any* of them qualifies as some level of
backup, and that there are consequently multiple levels / tiers /
degrees / etc. of backup.

RAID is not an advanced form of protection against data loss; it only
protects against one type of cause. But it still does protect against
that one type, and thus it is not valid to kick it out of that circle
entirely.

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw





Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread Brad Rogers
On Thu, 10 Nov 2022 08:48:43 -0500
Dan Ritter  wrote:

Hello Dan,

>8 is not a magic number.

Clearly, you don't read Terry Pratchett.   :-)

-- 
 Regards  _   "Valid sig separator is {dash}{dash}{space}"
 / )  "The blindingly obvious is never immediately apparent"
/ _)rad   "Is it only me that has a working delete key?"
It's only the children of the f** wealthy tend to be good looking
Ugly - The Stranglers




Re: defining deduplication (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread Thomas Schmitt
Hi,

i wrote:
> > the time window in which the backuped data
> > can become inconsistent on the application level.

hw wrote:
> Or are you referring to the data being altered while a backup is in
> progress?

Yes. Data of different files or at different places in the same file
may have relations which may become inconsistent during change operations
until the overall change is complete.
If you are unlucky you can even catch a plain text file that is only half
stored.

The risk for this is not 0 with filesystem snapshots, but it grows further
if there is a time interval during which changes may or may not be copied
into the backup, depending on filesystem internals and bad luck.


> Would you even make so many backups on the same machine?

It depends on the alternatives.
If you have other storage systems which can host backups, then it is of
course good to use them for backup storage. But if you have less separate
storage than independent backups, then it is still worthwhile to put more
than one backup on the same storage.


> Isn't 5 times a day a bit much?

It depends on how much you are willing to lose in case of a mishap.
My $HOME backup runs last about 90 seconds each. So it is not overly
cumbersome.


>  And it's an odd number.

That's because the early afternoon backup is done twice. (A tradition
which started when one of my BD burners began to become unreliable.)


> Yes, I'm re-using the many small hard discs that have accumulated over the
> years.

If it's only their size which disqualifies them for production purposes,
then it's ok. But if they are nearing the end of their lifetime, then
i would consider decommissioning them.


> I wish we could still (relatively) easily make backups on tapes.

My personal endeavor with backups on optical media began when a customer
had a major data mishap and all backup tapes turned out to be unusable.
Frequent backups had been made and allegedly been checkread. But in the
end it was big drama.
I then proposed to use a storage where the boss of the department can
make random tests with the applications which made and read the files.
So i came to writing backup scripts which used mkisofs and cdrecord
for CD-RW media.


> Just change
> the tape every day and you can have a reasonable number of full backups.

If you have thousandfold the size of Blu-rays worth of backup, then
probably a tape library would be needed. (I find LTO tapes with up to
12 TB in the web, which is equivalent to 480 BD-R.)


> A full new backup takes ages

It would help if you could divide your backups into small agile parts and
larger parts which don't change often.
The agile ones need frequent backup, whereas the lazy ones would not suffer
so much damage if the newest available backup is a few days old.


> I need to stop modifying stuff and not start all over again

The backup part of a computer system should be its most solid and artless
part. No shortcuts, no fancy novelties, no cumbersome user procedures.


Have a nice day :)

Thomas



Re: weird directory entry on ZFS volume (Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)))

2022-11-10 Thread Greg Wooledge
On Thu, Nov 10, 2022 at 02:48:28PM +0100, hw wrote:
> On Thu, 2022-11-10 at 07:03 -0500, Greg Wooledge wrote:
> good idea:
> 
> printf %s * | hexdump
> 000 77c2 6861 0074 
> 005

Looks like there might be more than one file here.

> > If you misrepresented the situation, and there's actually more than one
> > file in this directory, then use something like this instead:
> > 
> > shopt -s failglob
> > printf '%s\0' ? | hd
> 
> shopt -s failglob
> printf '%s\0' ? | hexdump
> 000 00c2   
> 002

OK, that's a good result.

> > Note that the ? is *not* quoted here, because we want it to match any
> > one-character filename, no matter what that character actually is.  If
> > this doesn't work, try ?? or * as the glob, until you manage to find it.
> 
> printf '%s\0' ?? | hexdump
> -bash: Keine Entsprechung: ??
> 
> (meaning something like "no equivalent")

The English version is "No match".

> printf '%s\0' * | hexdump
> 000 00c2 6177 7468 
> 007

I dislike this output format, but it looks like there are two files
here.  The first is 0xc2, and the second is 0x77 0x61 0x68 0x74 if
I'm reversing and splitting the silly output correctly.  (This spells
"waht", if I got it right.)

> > If it turns out that '?' really is the filename, then it becomes a ZFS
> > issue with which I can't help.
> 
> I would think it is.  Is it?

The file in question appears to have a name which is the single byte 0xc2.
Since that's not a valid UTF-8 character, ls chooses something to display
instead.  In your case, it chose a '?' character.  I'm guessing this is on
an older release of Debian.

In my case, it does this:

unicorn:~$ mkdir /tmp/x && cd "$_"
unicorn:/tmp/x$ touch $'\xc2'
unicorn:/tmp/x$ ls -la
total 80
-rw-r--r--  1 greg greg 0 Nov 10 09:21 ''$'\302'
drwxr-xr-x  2 greg greg  4096 Nov 10 09:21  ./
drwxrwxrwt 20 root root 73728 Nov 10 09:21  ../

In my version of ls, there's a --quoting-style= option that can help
control what you see.  But that's a tangent you can explore later.

Since we know the actual name of the file (subdirectory) now, let's just
rename it to something sane.

mv $'\xc2' subdir

Then you can investigate it, remove it, or do whatever else you want.



Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread Dan Ritter
Curt wrote: 
> On 2022-11-08, The Wanderer  wrote:
> >
> > That more general sense of "backup" as in "something that you can fall
> > back on" is no less legitimate than the technical sense given above, and
> > it always rubs me the wrong way to see the unconditional "RAID is not a
> > backup" trotted out blindly as if that technical sense were the only one
> > that could possibly be considered applicable, and without any
> > acknowledgment of the limited sense of "backup" which is being used in
> > that statement.
> >
> 
> Maybe it's a question of intent more than anything else. I thought RAID
> was intended for a server scenario where if a disk fails, your downtime
> is virtually null, whereas a backup is intended to prevent data
> loss. RAID isn't ideal for the latter because it doesn't ship the saved
> data off-site from the original data (or maybe a RAID array is
> conceivable over a network and a distance?).

RAID means "redundant array of inexpensive disks". The idea, in the
name, is to bring together a bunch of cheap disks to mimic a single more
expensive disk, in a way which hopefully is more resilient to failure.

If you need a filesystem that is larger than a single disk (that you can
afford, or that exists), RAID is the name for the general approach to
solving that.

The three basic technologies of RAID are:

striping: increase capacity by writing parts of a data stream to N
disks. Can increase performance in some situations.

mirroring: increase resiliency by redundantly writing the same data to
multiple disks. Can increase performance of reads.

checksums/erasure coding: increase resilency by writing data calculated
from the real data (but not a full copy) that allows reconstruction of
the real data from a subset of disks. RAID5 allows one failure, RAID6
allows recovery from two simultaneous failures, fancier schemes may
allow even more.

You can work these together, or separately.

Now, RAID is not a backup because it is a single store of data: if you
delete something from it, it is deleted. If you suffer a lightning
strike to the server, there's no recovery from molten metal.

Some filesystems have snapshotting. Snapshotting can protect you from
the accidental deletion scenario, by allowing you to recover quickly,
but does not protect you from lightning.

The lightning scenario requires a copy of the data in some other
location. That's a backup.

You can store the backup on a RAID. You might need to store the backup
on a RAID, or perhaps by breaking it up into pieces to store on tapes or
optical disks or individual hard disks. The kind of RAID you choose for
the backup is not related to the kind of RAID you use on your primary
storage.

> Of course, I wouldn't know one way or another, but the complexity (and
> substantial verbosity) of this thread seem to indicate that that all
> these concepts cannot be expressed clearly and succinctly, from which I
> draw my own conclusions.

The fact that many people talk about things that they don't understand
does not restrict the existence of people who do understand it. Only
people who understand what they are talking about can do so clearly and
succinctly.

-dsr-



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread Dan Ritter
hw wrote: 
> And I've been reading that when using ZFS, you shouldn't make volumes with 
> more
> than 8 disks.  That's very inconvenient.


Where do you read these things?

The number of disks in a vdev can be optimized, depending on
your desired redundancy method, total number of drives, and
tolerance for reduced performance during resilvering. 

Multiple vdevs together form a zpool. Filesystems are allocated from
a zpool.

8 is not a magic number.

-dsr-



Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread Nicolas George
Curt (12022-11-10):
> Maybe it's a question of intent more than anything else. I thought RAID
> was intended for a server scenario where if a disk fails, your downtime
> is virtually null, whereas a backup is intended to prevent data
> loss.

Maybe just use common sense. RAID means your data is present on several
drives. You can just deduce what it can help for:

one drive fails → you can replace it immediately, no downtime

one drive fails → the data is present elsewhere, no data loss

several¹ drives fail → downtime and data loss²

1: depending on RAID level
2: or not if you have backups too

>   RAID isn't ideal for the latter because it doesn't ship the saved
> data off-site from the original data (or maybe a RAID array is
> conceivable over a network and a distance?).

It is always a matter of compromise. You cannot duplicate your data
off-site at the same rate as you duplicate it on a second local drive.

That means your off-site data will survive an EMP, but you will lose
minutes / hours / days of data prior to the EMP. OTOH, RAID will not
survive an EMP, but it will prevent all data loss caused by isolated
hardware failure.

-- 
  Nicolas George



definitions of "backup" (was Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread The Wanderer
On 2022-11-10 at 08:40, Curt wrote:

> On 2022-11-08, The Wanderer  wrote:
> 
>> That more general sense of "backup" as in "something that you can
>> fall back on" is no less legitimate than the technical sense given
>> above, and it always rubs me the wrong way to see the unconditional
>> "RAID is not a backup" trotted out blindly as if that technical
>> sense were the only one that could possibly be considered
>> applicable, and without any acknowledgment of the limited sense of
>> "backup" which is being used in that statement.
> 
> Maybe it's a question of intent more than anything else. I thought
> RAID was intended for a server scenario where if a disk fails, your
> downtime is virtually null, whereas a backup is intended to
> prevent data loss.

If the disk fails, the data stored on the disk is lost (short of
forensic-style data recovery, anyway), so anything that ensures that
that data is still available serves to prevent data loss.

RAID ensures that the data is still available even if the single disk
fails, so it qualifies under that criterion.

> RAID isn't ideal for the latter because it doesn't ship the saved 
> data off-site from the original data (or maybe a RAID array is 
> conceivable over a network and a distance?).

Shipping the data off-site is helpful to protect against most possible
causes for data loss, such as damage to or theft of the on-site
equipment. (Or, for that matter, accidental deletion of the live data.)

It's not necessary to protect against some causes, however, such as
failure of a local disk. For that cause, RAID fulfills the purpose just
fine.

RAID does not protect against most of those other scenarios, however, so
there's certainly still a role for - and a reason to recommend! -
off-site backup. It's just that the existence of those options does not
mean RAID does not have a role to play in avoiding data loss, and
thereby a valid sense in which it can be considered to provide something
to fall back on, which is the approximate root meaning of the
nontechnical sense of "backup".

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw





weird directory entry on ZFS volume (Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)))

2022-11-10 Thread hw
On Thu, 2022-11-10 at 07:03 -0500, Greg Wooledge wrote:
> On Thu, Nov 10, 2022 at 05:54:00AM +0100, hw wrote:
> > ls -la
> > insgesamt 5
> > drwxr-xr-x  3 namefoo namefoo    3 16. Aug 22:36 .
> > drwxr-xr-x 24 root    root    4096  1. Nov 2017  ..
> > drwxr-xr-x  2 namefoo namefoo    2 21. Jan 2020  ?
> > namefoo@host /srv/datadir $ ls -la '?'
> > ls: Zugriff auf ? nicht möglich: Datei oder Verzeichnis nicht gefunden
> > namefoo@host /srv/datadir $ 
> > 
> > 
> > This directory named ? appeared on a ZFS volume for no reason and I can't
> > access
> > it and can't delete it.  A scrub doesn't repair it.  It doesn't seem to do
> > any
> > harm yet, but it's annoying.
> > 
> > Any idea how to fix that?
> 
> ls -la might not be showing you the true name.  Try this:
> 
> printf %s * | hd
> 
> That should give you a hex dump of the bytes in the actual filename.

good idea:

printf %s * | hexdump
000 77c2 6861 0074 
005

> If you misrepresented the situation, and there's actually more than one
> file in this directory, then use something like this instead:
> 
> shopt -s failglob
> printf '%s\0' ? | hd

shopt -s failglob
printf '%s\0' ? | hexdump
000 00c2   
002

> Note that the ? is *not* quoted here, because we want it to match any
> one-character filename, no matter what that character actually is.  If
> this doesn't work, try ?? or * as the glob, until you manage to find it.

printf '%s\0' ?? | hexdump
-bash: Keine Entsprechung: ??

(meaning something like "no equivalent")


printf '%s\0' * | hexdump
000 00c2 6177 7468 
007


> If it turns out that '?' really is the filename, then it becomes a ZFS
> issue with which I can't help.

I would think it is.  Is it?

perl -e 'print chr(0xc2) . "\n"'

... prints a blank line.  What's 0xc2?  I guess that should be UTF8 ...


printf %s *
aht

What would you expect it to print after shopt?



Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread Curt
On 2022-11-08, The Wanderer  wrote:
>
> That more general sense of "backup" as in "something that you can fall
> back on" is no less legitimate than the technical sense given above, and
> it always rubs me the wrong way to see the unconditional "RAID is not a
> backup" trotted out blindly as if that technical sense were the only one
> that could possibly be considered applicable, and without any
> acknowledgment of the limited sense of "backup" which is being used in
> that statement.
>

Maybe it's a question of intent more than anything else. I thought RAID
was intended for a server scenario where if a disk fails, your downtime
is virtually null, whereas a backup is intended to prevent data
loss. RAID isn't ideal for the latter because it doesn't ship the saved
data off-site from the original data (or maybe a RAID array is
conceivable over a network and a distance?).

Of course, I wouldn't know one way or another, but the complexity (and
substantial verbosity) of this thread seems to indicate that all
these concepts cannot be expressed clearly and succinctly, from which I
draw my own conclusions.



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread DdB
Am 10.11.2022 um 13:03 schrieb Greg Wooledge:
> If it turns out that '?' really is the filename, then it becomes a ZFS
> issue with which I can't help.

just tested: i could create, rename, delete a file with that name on a
zfs filesystem just as with any other filesystem.

But: i recall having seen an issue with corrupted filenames in a
snapshot once (several years ago though). At the time, i did resort to
send/recv to get the issue straightened out.

But it is much more likely that the filename '?' is entirely
unrelated to zfs. Although zfs is perceived as being easy to handle
(only 2 commands need to be learned: zpool and zfs), it takes a while to
get acquainted with all the concepts and behaviors. Take some time to
play with an installation (a vm or just a file-based pool is worth
considering).



block devices vs. partitions (Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)))

2022-11-10 Thread hw
On Thu, 2022-11-10 at 10:59 +0100, DdB wrote:
> Am 10.11.2022 um 04:46 schrieb hw:
> > On Wed, 2022-11-09 at 18:26 +0100, Christoph Brinkhaus wrote:
> > > Am Wed, Nov 09, 2022 at 06:11:34PM +0100 schrieb hw:
> > > [...]
> [...]
> > > 
> > Why would partitions be better than the block device itself?  They're like
> > an
> > additional layer and what could be faster and easier than directly using the
> > block devices?
> > 
> > 
> hurts my eyes to see such disinformation circulating.

What's wrong about it?



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread hw
On Thu, 2022-11-10 at 10:34 +0100, Christoph Brinkhaus wrote:
> Am Thu, Nov 10, 2022 at 04:46:12AM +0100 schrieb hw:
> > On Wed, 2022-11-09 at 18:26 +0100, Christoph Brinkhaus wrote:
> > > Am Wed, Nov 09, 2022 at 06:11:34PM +0100 schrieb hw:
> > > [...]
> [...]
> > > 
> > 
> > Why would partitions be better than the block device itself?  They're like
> > an
> > additional layer and what could be faster and easier than directly using the
> > block devices?
>  
>  Using the block device is no issue until you have a mirror or so.
>  In case of a mirror ZFS will use the capacity of the smallest drive.

But you can't make partitions larger than the drive.

>  I have read that, for example, a 100GB disk might be slightly larger
>  than 100GB. When you want to replace a 100GB disk with a spare one
>  which is slightly smaller than the original one, the pool will not fit on
>  the disk and the replacement fails.

Ah yes, right!  I kinda did that a while ago for spinning disks that might be
replaced by SSDs eventually and wanted to make sure that the SSDs wouldn't be
too small.  I forgot about that, my memory really isn't what it used to be ...

>  With partitions you can specify the space. It does not hurt if there
>  are a few MB unallocated. But then the partitions of the disks have
>  exactly the same size.

yeah
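A sketch of that approach with made-up device names, leaving a little slack at
the end of each 4 TB disk:

  # identical partitions slightly smaller than the raw disks
  sgdisk --new=1:0:+3600G /dev/sdb
  sgdisk --new=1:0:+3600G /dev/sdc

  # mirror the partitions instead of the whole disks
  zpool create tank mirror /dev/sdb1 /dev/sdc1

A replacement disk that is a few MB smaller than the originals can then still
hold an identically sized partition.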



Re: defining deduplication (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread hw
On Wed, 2022-11-09 at 12:08 +0100, Thomas Schmitt wrote:
> Hi,
> 
> i wrote:
> > >   https://github.com/dm-vdo/kvdo/issues/18
> 
> hw wrote:
> > So the VDO ppl say 4kB is a good block size
> 
> They actually say that it's the only size which they support.
> 
> 
> > Deduplication doesn't work when files aren't sufficiently identical,
> 
> The definition of sufficiently identical probably differs much between
> VDO and ZFS.
> ZFS has more knowledge about the files than VDO has. So it might be worth
> for it to hold more info in memory.

Dunno, apparently they keep checksums of blocks in memory.  More checksums, more
memory ...

> > It seems to make sense that the larger
> > the blocks are, the lower chances are that two blocks are identical.
> 
> Especially if the filesystem's block size is smaller than the VDO
> block size, or if the filesystem does not align file content intervals
> to block size, like ReiserFS does.

That would depend on the files.

> > So how come that deduplication with ZFS works at all?
> 
> Inner magic and knowledge about how blocks of data form a file object.
> A filesystem does not have to hope that identical file content is
> aligned to a fixed block size.

No, but when it uses large blocks it can store more files in a block and won't
be able to deduplicate the identical files in a block because the blocks are
atoms in deduplication.  The larger the blocks are, the less likely it seems
that multiple blocks are identical.

> didier gaumet wrote:
> > > > The goal being primarily to optimize storage space
> > > > for a provider of networked virtual machines to entities or customers
> 
> I wrote:
> > > Deduplicating over several nearly identical filesystem images might indeed
> > > bring good size reduction.
> 
> hw wrote:
> > Well, it's independent of the file system.
> 
> Not entirely. As stated above, i would expect VDO to work not well for
> ReiserFS with its habit to squeeze data into unused parts of storage blocks.
> (This made it great for storing many small files, but also led to some
> performance loss by more fragmentation.)

VDO is independent of the file system, and 4k blocks are kinda small.  It
doesn't matter how files are aligned to blocks of a file system because VDO
always uses chunks of 4k each, compares them, and always works the same.  You
can always create a file system with an unlucky block size for the files on it,
or even one that makes sure that all the 4k blocks are not identical.  We could
call it spitefs maybe :)
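
For illustration only, here is a rough way to see that kind of block-granular
matching outside of VDO (the input path is hypothetical and this is not how VDO
itself is implemented; keep the test file small): split the file into 4 kB
chunks, hash every chunk, and count how often each hash repeats.

  split -b 4096 -d -a 6 /path/to/testfile /tmp/chunk.
  md5sum /tmp/chunk.* | awk '{print $1}' | sort | uniq -c | sort -rn | head

Any count above 1 is a chunk that a 4 kB block-level deduplicator could fold
into a single copy, regardless of which file system the file came from.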

> > Do I want/need controlled redundancy with
> > backups on the same machine, or is it better to use snapshots and/or
> > deduplication to reduce the controlled redundancy?
> 
> I would want several independent backups on the first hand.

Independent?  Like two full copies like I'm making?

> The highest risk for backup is when a backup storage gets overwritten or
> updated. So i want several backups still untouched and valid, when the
> storage hardware or the backup software begin to spoil things.

That's what I thought, but I'm about to run out of disk space for multiple full
copies.

> Deduplication increases the risk that a partial failure of the backup
> storage damages more than one backup. On the other hand it decreases the
> work load on the storage

It may make all backups unusable because the single copy that deduplication has
left has been damaged.  However, how likely is a partial failure of a storage
volume to happen, and how relevant is it?  How often does a storage volume ---
the underlying media doesn't necessarily matter; for example, when a disk goes
bad in a RAID, you replace it and keep going --- go bad in only one place?  When
the volume has gone away, so have all the copies.

>  and the time window in which the backuped data
> can become inconsistent on the application level.

Huh?

> Snapshot before backup reduces that window size to 0. But this still
> does not prevent application level inconsistencies if the application is
> caught in the act of reworking its files.

You make the snapshot of the backup before starting to make a backup, not while
making one.

Or are you referring to the data being altered while a backup is in progress?

> So i would use at least four independent storage facilities interchangeably.
> I would make snapshots, if the filesystem supports them, and backup those
> instead of the changeable filesystem.
> I would try to reduce the activity of applications on the filesystem when
> the snapshot is made.

right

> I would allow each independent backup storage to do its own deduplication,
> not sharing it with the other backup storages.

If you have them on different machines or volumes, it would be difficult to do
it otherwise.

> > > In case of VDO i expect that you need to use different deduplicating
> > > devices to get controlled redundancy.
> 
> > How would the devices matter?  It's the volume residing on devices that gets
> > deduplicated, not the devices.
> 
> I understand that one VDO device 

Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread Greg Wooledge
On Thu, Nov 10, 2022 at 05:54:00AM +0100, hw wrote:
> ls -la
> total 5
> drwxr-xr-x  3 namefoo namefoo    3 16. Aug 22:36 .
> drwxr-xr-x 24 root    root    4096  1. Nov 2017  ..
> drwxr-xr-x  2 namefoo namefoo    2 21. Jan 2020  ?
> namefoo@host /srv/datadir $ ls -la '?'
> ls: cannot access ?: No such file or directory
> namefoo@host /srv/datadir $ 
> 
> 
> This directory named ? appeared on a ZFS volume for no reason and I can't 
> access
> it and can't delete it.  A scrub doesn't repair it.  It doesn't seem to do any
> harm yet, but it's annoying.
> 
> Any idea how to fix that?

ls -la might not be showing you the true name.  Try this:

printf %s * | hd

That should give you a hex dump of the bytes in the actual filename.

If you misrepresented the situation, and there's actually more than one
file in this directory, then use something like this instead:

shopt -s failglob
printf '%s\0' ? | hd

Note that the ? is *not* quoted here, because we want it to match any
one-character filename, no matter what that character actually is.  If
this doesn't work, try ?? or * as the glob, until you manage to find it.

If it turns out that '?' really is the filename, then it becomes a ZFS
issue with which I can't help.
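
A small demonstration of that technique in a throwaway directory (the directory
and its name are invented here): create an entry whose name is a single
non-ASCII character and let an unquoted glob reveal its real bytes:

  mkdir /tmp/scratch && cd /tmp/scratch
  mkdir $'\xc3\xa4'            # one UTF-8 character; may show up as ? in some locales
  printf '%s\0' * | hd         # prints c3 a4 00 -- the actual bytes of the name

Once the real bytes are known, the entry can usually be addressed directly,
e.g. rmdir $'\xc3\xa4' for an empty directory.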



Re: deduplicating file systems: VDO with Debian?

2022-11-10 Thread hede

On Wed, 09 Nov 2022 13:52:26 +0100 hw  wrote:

Does that work?  Does bees run as long as there's something to 
deduplicate and

only stops when there isn't?


Bees is a service (daemon) which runs 24/7 watching btrfs transaction 
state (the checkpoints). If there are new transactions then it kicks in. 
But it's a niced service (man nice, man ionice). If your backup process 
has higher priority than "idle" (which is typically the case) and 
produces high load it will potentially block out bees until the backup 
is finished (maybe!).
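
If in doubt, the scheduling class of a running bees instance can be checked
from the shell (the process name "bees" is an assumption here; adjust it to
whatever the service actually runs):

  ionice -p "$(pidof bees)"             # should report the idle class
  ps -o pid,ni,comm -p "$(pidof bees)"  # nice value of the process

Anything in the idle I/O class only gets disk time when nothing else wants it,
which is why a normal-priority backup job can starve it for a while.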



I thought you start it when the data is in place and
not before that.


That's the case with fdupes, duperemove, etc.

You can easily make changes to two full copies --- "make changes" 
meaning that
you only change what has been changed since last time you made the 
backup.


Do you mean to modify (make changes) to one of the backups? I never 
considered making changes to my backups. I do make changes to the live 
data and next time (when the incremental backup process runs) these 
changes do get into backup storage. Making changes to some backups ... I 
won't call that backups anymore.


Or do you mean you have two copies and alternatively "update" these 
copies to reflect the live state? I do not see a benefit in this. At 
least if both reside on the same storage system. There's a waste in 
storage space (doubled files). One copy with many incremental backups 
would be better. And if you plan to deduplicate both copies, simply use a 
backup solution with incremental backups.


Syncing two adjacent copies means submitting all changes a second time, 
even though they were already transferred for the first copy. And the 
second copy is still in some older state at the moment you update the 
other one.


Yet again I do prefer a single process for having one[sic] consistent 
backup storage with a working history.


Two copies on two different locations is some other story, that indeed 
can have benefits.



> For me only the first backup is a full backup, every other backup is
> incremental.

When you make a second full backup, that second copy is not 
incremental.  It's a

full backup.


Correct. That's the reason I do make incremental backups. And with 
incremental backups I mean that I can restore "full" backups for 
several days: every day of the last week, one day for every month of the 
year, even several days of past years and so on. But the total storage for 
all those "full" backups is not even two full backups in size. It's less 
in size but offers more.


For me a single full backup needs several days (terabytes via DSL upload 
to the backup location) while incremental backups are MUCH faster 
(typically a few minutes if not much has changed). So I use 
the latter.
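
One cheap way to get that kind of "full-looking but incremental" history is
rsync with hard links; a minimal sketch, with purely hypothetical host and
path names:

  today=$(date +%F)
  rsync -a --delete --link-dest=/backup/host/latest /srv/data/ /backup/host/$today/
  ln -sfn "$today" /backup/host/latest

Unchanged files are hard-linked against the previous run, so every dated
directory looks like a full backup while only changed files consume new space.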


What difference does it make whether the deduplication is block based or 
somehow file based (whatever that means).


File-based deduplication means files get compared as a whole. Result: 
two big and nearly identical files need to be stored in full, because 
they do differ.
Say, for example, a backup of a virtual machine image where the VM got 
started between two backup runs. More than 99% of the image is the same 
as before, but because some log was written inside the VM image the two 
backups do differ. Those files are nearly identical, even in the 
position of the identical data.


Block-based deduplication can find parts of a file to be exclusive 
(changed blocks) and other parts to be set shared (blocks with the same 
content):

#
# btrfs fi du file1 file2

     Total   Exclusive  Set shared  Filename
   2.30GiB    23.00MiB     2.28GiB  file1
   2.30GiB   149.62MiB     2.16GiB  file2
#
Here both files share data but also have their own exclusive data.
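
If two such images had been written completely independently (no reflinks), an
out-of-band deduplicator can establish that sharing after the fact; a hedged
sketch with the same hypothetical file names:

  duperemove -dh file1 file2     # -d actually submits the dedupe, -h human-readable sizes
  btrfs fi du file1 file2        # "Set shared" should grow, "Exclusive" shrink

bees does essentially the same thing continuously in the background instead of
in one batch run.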


I'm flexible, but I distrust "backup solutions".


I would say, it depends. I do also distrust everything, but a sane 
solution I maybe distrust a little less than my "self-built" one. ;-)


Don't trust your own solution more than others "on principle", without 
some real reasons for distrust.


Sounds good.  Before I try it, I need to make a backup in case 
something goes

wrong.


;-)

regards
hede



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread DdB
Am 10.11.2022 um 04:46 schrieb hw:
> On Wed, 2022-11-09 at 18:26 +0100, Christoph Brinkhaus wrote:
>> Am Wed, Nov 09, 2022 at 06:11:34PM +0100 schrieb hw:
>> [...]
>>> FreeBSD has ZFS but can't even configure the disk controllers, so that won't
>>> work.  
>>
>> If I understand you right you mean RAID controllers?
> 
> yes
> 
>> According to my knowledge ZFS should be used without any RAID
>> controllers. Disks or better partions are fine.
> 
> I know, but it's what I have.  JBOD controllers are difficult to find.  And it
> doesn't really matter because I can configure each disk as a single disk ---
> still RAID though.  It may even be an advantage because the controllers have 
> 1GB
> cache each and the computers CPU doesn't need to do command queuing.
> 
> And I've been reading that when using ZFS, you shouldn't make volumes with 
> more
> than 8 disks.  That's very inconvenient.
> 
> Why would partitions be better than the block device itself?  They're like an
> additional layer and what could be faster and easier than directly using the
> block devices?
> 
> 
hurts my eyes to see such disinformation circulating. But I myself have
only been one happy zfs user for a decade by now. I suggest getting in
contact with the zfs gurus on ZoL (or reading the archive from
https://zfsonlinux.topicbox.com/groups/zfs-discuss)



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-10 Thread DdB
Am 10.11.2022 um 06:38 schrieb David Christensen:
> What is your technique for defragmenting ZFS?
Well, that was meant more or less as a joke: there is none apart from
offloading all the data, destroying and rebuilding the pool, and filling
it again from the backup. But I do it from time to time when fragmentation
gets high; the speed improvements are obvious. OTOH the process takes
days on my SOHO servers.
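
For what it's worth, a rough sketch of such a rebuild with send/receive (pool,
vdev and dataset names are all hypothetical, and details like properties and
mountpoints are glossed over):

  zfs snapshot -r tank@evacuate
  zfs send -R tank@evacuate | zfs receive -u scratch/tank-copy   # park everything on another pool
  zpool destroy tank
  zpool create tank raidz2 sda sdb sdc sdd                       # recreate the empty pool
  zfs send -R scratch/tank-copy@evacuate | zfs receive tank/restored

The refilled data is written sequentially into an empty pool, which is where
the speed improvement comes from.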



Re: else or Debian (Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?))

2022-11-10 Thread Christoph Brinkhaus
Am Thu, Nov 10, 2022 at 04:46:12AM +0100 schrieb hw:
> On Wed, 2022-11-09 at 18:26 +0100, Christoph Brinkhaus wrote:
> > Am Wed, Nov 09, 2022 at 06:11:34PM +0100 schrieb hw:
> > [...]
> > > FreeBSD has ZFS but can't even configure the disk controllers, so that 
> > > won't
> > > work.  
> > 
> > If I understand you right you mean RAID controllers?
> 
> yes
> 
> > According to my knowledge ZFS should be used without any RAID
> > controllers. Disks or better partions are fine.
> 
> I know, but it's what I have.  JBOD controllers are difficult to find.  And it
> doesn't really matter because I can configure each disk as a single disk ---
> still RAID though.  It may even be an advantage because the controllers have 
> 1GB
> cache each and the computers CPU doesn't need to do command queuing.
> 
> And I've been reading that when using ZFS, you shouldn't make volumes with 
> more
> than 8 disks.  That's very inconvenient.
> 
> Why would partitions be better than the block device itself?  They're like an
> additional layer and what could be faster and easier than directly using the
> block devices?
 
 Using the block device is no issue until you have a mirror or so.
 In case of a mirror ZFS will use the capacity of the smallest drive.

 I have read that, for example, a 100GB disk might be slightly larger
 than 100GB. When you want to replace a 100GB disk with a spare one
 which is slightly smaller than the original one, the pool will not fit on
 the disk and the replacement fails.

 With partitions you can specify the space. It does not hurt if there
 are a few MB unallocated. But then the partitions of the disks have
 exactly the same size.

 Kind regards,
 Christoph
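
A hedged sketch of that approach (device names and the size are made up; the
idea is just to leave a little headroom below the nominal disk size and give
every member an identical partition):

  parted -s /dev/sdb mklabel gpt
  parted -s /dev/sdb mkpart primary 1MiB 930GiB
  # repeat for every member disk, then build the vdev from the partitions
  zpool create tank mirror /dev/sda1 /dev/sdb1

A replacement disk that is a few hundred MB smaller than the originals can then
still hold an identically sized partition.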



Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-09 Thread gene heskett

On 11/10/22 00:37, David Christensen wrote:

On 11/9/22 00:24, hw wrote:
 > On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:

 > Hmm, when you can backup like 3.5TB with that, maybe I should put 
FreeBSD on my
 > server and give ZFS a try.  Worst thing that can happen is that it 
crashes and
 > I'd have made an experiment that wasn't successful.  Best thing, I 
guess, could
 > be that it works and backups are way faster because the server 
doesn't have to
 > actually write so much data because it gets deduplicated and reading 
from the

 > clients is faster than writing to the server.


Be careful that you do not confuse a ~33 GiB full backup set, and 78 
snapshots over six months of that same full backup set, with a full 
backup of 3.5 TiB of data.  I would suggest a 10 TiB pool to back up the 
latter.



Writing to a ZFS filesystem with deduplication is much slower than 
simply writing to, say, an ext4 filesystem -- because ZFS has to hash 
every incoming block and see if it matches the hash of any existing 
block in the destination pool.  Storing the existing block hashes in a 
dedicated dedup virtual device will expedite this process.
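
For reference, a hedged sketch of both points, assuming an existing pool "tank"
with a dataset tank/backup (the names and the NVMe device are hypothetical, and
a dedicated dedup vdev needs a reasonably recent OpenZFS):

  zfs set dedup=on tank/backup        # enable dedup only where it pays off
  zpool add tank dedup nvme0n1        # dedicated fast vdev for the dedup table (DDT)
  zpool status -D tank                # DDT histogram: how many blocks are referenced how often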



 >> I run my backup script each night.  It uses rsync to copy files and
 >
 > Aww, I can't really do that because my server eats like 200-300W 
because it has

 > so many disks in it.  Electricity is outrageously expensive here.



Which brings up another suggestion in two parts:

1: use amanda, with tar and compression to reduce the size of the 
backups.  And use a backup cycle of a week or 2 because amanda, when 
advancing a level, only backs up what has changed since the last backup. 
On a quiet system, a level 3 backup for a 50gb network of several 
machines can be under 100 megs. More on a busy system of course. (A 
hedged config sketch follows after point 2.)

Amanda keeps track of all that automatically.

2: As disks fail, replace them with SSD's which use much less power than 
spinning rust. And they are typically 5x faster than commodity spinning 
rust.
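
For completeness, the promised sketch: the relevant knobs live in amanda.conf
and the disklist. Everything below is hypothetical and loosely based on the
stock example config (in particular the comp-user-tar dumptype), so treat it as
a starting point only:

  # amanda.conf (excerpt)
  dumpcycle 7 days        # every DLE gets a level 0 at least once a week
  runspercycle 7
  tapecycle 15 tapes

  # disklist: host, directory, dumptype (client-side compressed tar)
  workstation1.lan  /home  comp-user-tar

amanda then decides per run which entries get a full and which only an
incremental, which is the self-adjusting behaviour mentioned in point 1.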


Here, and historically with spinning rust, backing up 5 machines at 3am 
every morning is around 10gb total and takes under 45 minutes. This 
includes the level 0's; amanda self-adjusts the schedule to spread the 
level 0's, AKA the fulls, out over the backup cycle so the amount of 
storage used for any one backup run is fairly consistent.


Perhaps platinum rated power supplies?  Energy efficient HDD's/ SSD's?


 >> directories from various LAN machines into ZFS filesystems named after
 >> each host -- e.g. pool/backup/hostname (ZFS namespace) and
 >> /var/local/backup/hostname (Unix filesystem namespace).  I have a
 >> cron(8) that runs zfs-auto-snapshot once each day and once each month
 >> that takes a recursive snapshot of the pool/backup filesystems.  Their
 >> contents are then available via Unix namespace at
 >> /var/local/backup/hostname/.zfs/snapshot/snapshotname.  If I want to
 >> restore a file from, say, two months ago, I use Unix filesystem 
tools to

 >> get it.
 >
 > Sounds like a nice setup.  Does that mean you use snapshots to keep 
multiple
 > generations of backups and make backups by overwriting everything 
after you made

 > a snapshot?


Yes.


 > In that case, is deduplication that important/worthwhile?  You're not
 > duplicating it all by writing another generation of the backup but 
store only

 > what's different through making use of the snapshots.


Without deduplication or compression, my backup set and 78 snapshots 
would require 3.5 TiB of storage.  With deduplication and compression, 
they require 86 GiB of storage.
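
Those ratios can be read straight off the pool and dataset (the names here are
hypothetical):

  zpool list -o name,size,allocated,dedupratio tank
  zfs get -r compressratio,logicalused,used tank/backup

compressratio and dedupratio together account for most of the gap between the
logical data size and what is actually stored.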



 > ... I only never got around to figure [ZFS snapshots] out because I 
didn't have the need.



I accidentally trash files on occasion.  Being able to restore them 
quickly and easily with a cp(1), scp(1), etc., is a killer feature. 
Users can recover their own files without needing help from a system 
administrator.
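
A hedged sketch of that kind of restore (the snapshot name, dataset and
restored file are hypothetical):

  zfs snapshot -r pool/backup@2022-11-10
  ls /var/local/backup/hostname/.zfs/snapshot/2022-11-10/
  cp -a /var/local/backup/hostname/.zfs/snapshot/2022-11-10/etc/fstab /tmp/fstab.restored

The .zfs directory is normally hidden from ls but can always be entered by
name.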



 > But it could also be useful for "little" things like taking a 
snapshot of the
 > root volume before updating or changing some configuration and being 
able to

 > easily to undo that.


FreeBSD with ZFS-on-root has a killer feature called "Boot Environments" 
that has taken that idea to the next level:


https://klarasystems.com/articles/managing-boot-environments/
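
On the command line that boils down to bectl; a hedged sketch (the environment
name is made up):

  bectl create pre-upgrade      # preserve the current root as a boot environment
  bectl list
  bectl activate pre-upgrade    # if the upgrade goes wrong: activate and reboot into it

Debian users can approximate this with a snapshot of the root filesystem before
upgrading, though it is less integrated with the boot loader.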


 >> I have 3.5 TiB of backups.


It is useful to group files with similar characteristics (size, 
workload, compressibility, duplicates, backup strategy, etc.) into 
specific ZFS filesystems (or filesystem trees).  You can then adjust ZFS 
properties and backup strategies to match.



  For compressed and/or encrypted archives, image, etc., I do not use
  compression or de-duplication
 >>>
 >>> Yeah, they wouldn't compress.  Why no deduplication?
 >>
 >>
 >> Because I very much doubt that there will be duplicate blocks in 
such files.

 >
 > Hm, would it hurt?


Yes.  ZFS deduplication is resource intensive.


 > Oh it's not about performance when degraded, but about performance. 
IIRC when
 > 

Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-09 Thread David Christensen

On 11/9/22 01:35, DdB wrote:

> But
i am satisfied with zfs performance from spinning rust, if i dont fill
up the pool too much, and defrag after a while ...



What is your technique for defragmenting ZFS?


David




Re: defining deduplication (was: Re: deduplicating file systems: VDO with Debian?)

2022-11-09 Thread David Christensen

On 11/9/22 03:08, Thomas Schmitt wrote:


So i would use at least four independent storage facilities interchangeably.
I would make snapshots, if the filesystem supports them, and backup those
instead of the changeable filesystem.
I would try to reduce the activity of applications on the filesystem when
the snapshot is made.
I would allow each independent backup storage to do its own deduplication,
not sharing it with the other backup storages.



+1


David
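
A hedged sketch of the snapshot-then-backup part (dataset, mountpoint and
target are hypothetical):

  zfs snapshot tank/data@backup-run
  rsync -a /srv/data/.zfs/snapshot/backup-run/ backuphost:/backup/data/
  zfs destroy tank/data@backup-run

The backup then reads from a frozen view of the filesystem, so files cannot
change underneath it while the run is in progress.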


