On Mon, Dec 20, 2021 at 12:52 PM Rich Freeman <ri...@gentoo.org> wrote:
>
> On Mon, Dec 20, 2021 at 1:52 PM Mark Knecht <markkne...@gmail.com> wrote:
> >
> > I've recently built 2 TrueNAS file servers. The first (and main) unit
> > runs all the time and serves to back up my home user machines.
> > Generally speaking I (currently) put data onto it using rsync, but it
> > also has an NFS mount that serves as a location for my Raspberry Pi to
> > store duplicate copies of astrophotography pictures live as they come
> > off the DSLR in the middle of the night.
> >
> > ...
> >
> > The thing is that the ZIL is only used for synchronous writes. I
> > don't know whether anything I'm doing to back up my user machines,
> > which currently is just rsync commands, is synchronous or could be
> > made synchronous, and I don't know whether the NFS writes from the
> > R_Pi are synchronous or could be made so.
> >
>
> Disclaimer: some of this stuff is a bit arcane and the documentation
> isn't great, so I could be missing a nuance somewhere.
>
> First, one of your options is to set sync=always on the zfs dataset,
> if synchronous behavior is strongly desired.  That will force ALL
> writes at the filesystem level to be synchronous.  It will of course
> normally kill performance, but the ZIL may very well save you if
> your SSD performs adequately.  This still only applies at the
> filesystem level, which may be an issue with NFS (read on).
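>
> Something like this should do it (pool/dataset names here are just
> examples):
>
>   zfs set sync=always tank/backups
>   zfs get sync tank/backups    # confirm the setting took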
>
> I'm not sure exactly how you're using rsync from the description
> above (rsyncd, direct client access, etc.).  In any case I don't
> think rsync has any kind of option to force synchronous behavior,
> and I'm not sure whether manually running sync on the server after
> an rsync run will use the ZIL or not.  If you're using sync=always
> then that covers rsync no matter how you're doing it.
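>
> If you did want to experiment with the manual approach anyway, it
> would be roughly this (host and paths are placeholders):
>
>   # copy, then force the server to flush its caches:
>   rsync -a /home/mark/ machine1:/backup/mark/ && ssh machine1 sync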
>
> NFS is a little different, as both the server side and the client
> side can behave asynchronously.  By default the nfs client is
> asynchronous, so data can be cached on the client before it is even
> sent to the server.  That can be disabled with the sync mount option
> on the client side, which forces all data to be sent to the server
> immediately.  No nfs server or filesystem setting on the server
> side can have any impact if the client doesn't transmit the data in
> the first place.  The server also has a sync export setting, which
> defaults to on, and on top of that it delays and gathers writes in
> yet another layer, which can be disabled with no_wdelay on the
> export.  Those server-side settings probably delay anything from
> reaching the filesystem at all, so they take precedence over any
> filesystem-level settings.
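>
> Concretely, that combination looks something like this (hostnames
> and paths are made up):
>
>   # client side, e.g. on the Pi, in /etc/fstab:
>   machine1:/export/astro  /mnt/astro  nfs  sync  0 0
>
>   # server side, in /etc/exports:
>   /export/astro  pi4(rw,sync,no_wdelay)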
>
> As you can see, you need a bit of a kill-it-with-fire approach to
> get synchronous behavior, because it traditionally performs so
> poorly that everybody takes steps to prevent it from happening.
>
> I'll also note that the main thing synchronous behavior protects you
> from is unclean shutdown of the server.  It has no bearing on what
> happens if a client goes down uncleanly.  If you don't expect server
> crashes it may not provide much benefit.
>
> If you're using a dedicated ZIL device you should consider mirroring
> it, as any loss of the ZIL device will otherwise risk data loss.
> Use of the ZIL is also going to create wear on your SSD, so weigh
> that and your overall disk load before setting sync=always on the
> dataset.  Since the setting is at the dataset level you could have
> multiple mountpoints, each with a different sync policy.  The
> default is normal POSIX behavior, which only syncs when requested
> (sync, fsync, O_SYNC, etc.).
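>
> For reference, adding a mirrored log device and setting per-dataset
> policies would look roughly like this (pool and device names are
> examples only):
>
>   # mirrored log (SLOG) across two SSDs:
>   zpool add tank log mirror /dev/disk/by-id/ssd-a /dev/disk/by-id/ssd-b
>   # per-dataset sync policy:
>   zfs set sync=always tank/nfs
>   zfs set sync=standard tank/scratch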
>
> --
> Rich
>

Rich & Wols,
   Thanks for the responses. I'll post a single reply here. I had
thought of the need to mirror the ZIL but didn't have enough physical
disk slots in the backup machine for the second SSD. I do think that
is a critical point if I were to use the ZIL at all.

   Based on inputs from the two of you I'm investigating a different
overall setup for my home network:

Previously - a new main desktop that holds all my data. Lots of disk
space, lots of data. All of my big data work - audio recording
sessions and astrophotography - is done on this machine. Two
__backup__ machines: desktop machines are backed up to machine 1,
machine 1 is backed up to machine 2, and machine 2 is eventually
backed up to some cloud service.

Now - a new desktop machine that holds the audio recording data
currently being recorded and used, due to real-time latency
requirements. Two new network machines: Machine 1 would be both a
backup machine and a file server. The file server portion of this
machine holds astrophotography data and recorded video files.
PixInsight running on my desktop reads and writes over the network to
machine 1. Instead of a ZIL in machine 1, the SSD becomes an L2ARC
read cache, most likely holding a cached copy of the currently active
astrophotography projects. Machine 1 may also run a couple of VMs
over time. Machine 2 is a pure backup of everything on Machine 1.
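
If I go that route, adding the SSD as a read cache should be as
simple as something like this (pool and device names are guesses at
this point):

  zpool add tank cache /dev/disk/by-id/ssd-a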

FYI - Machine 1 will always be located close to my desktop machines
on the 1 Gb/s wired network. iperf suggests I get about 850 Mb/s on
and off of Machine 1. Machine 2 will be remote and generally backed
up overnight over wireless.
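
(That number is just the basic iperf test - roughly "iperf -s" on
Machine 1 and "iperf -c machine1" from the desktop.)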

   As always, I'm interested in your comments on what works or
doesn't work about this sort of setup.

Cheers,
Mark
