On 2026-04-29 08:51, Radosław Korzeniewski wrote:
The idea of an "incremental accelerator" for common local filesystems
is great. But the inotify API is not the best solution for this case
IMVHO, YMMV. The main issue is scalability.
True. IMHO, this is a quick and dirty solution path which can be
acceptable for some use cases. I consider the reliability of a backup
system too important, and missing a file unforgivable.
You want to use an "incremental accelerator" for all your high-volume
storage, but inotify functionality does not scale. The more directories
you want to watch with inotify, the more resources it requires. A
tradeoff, sure.
Yes, it doesn't scale. Still, it can monitor millions of files.
For several million files, it will probably consume several hundred
megabytes (at least), and this will be unswappable memory.
From what I remember (and I read this is still the case), there is a
much worse problem with inotify: it doesn't throttle. When a queue
overflow occurs, events get dropped, and for most use cases this is a
disaster. No retry, no recovery from this.
The kernel does send a notification, but only that an overflow
happened, without detailed info about each lost event.
This means that events could slip through your fingers unnoticed.
Even the inotify(7) man page warns about inconsistencies due to race
conditions.
There are other potential issues as well. The man page says:
-----BEGIN-----
If a filesystem is mounted on top of a monitored directory, no event is
generated, and no events are generated for objects immediately under
the new mount point. If the filesystem is subsequently unmounted,
events will subsequently be generated for the directory and the objects
it contains.
-----END-----
I did a prototype of this feature years ago (no coding LLM existed yet)
and it wasn't worth the effort. The architecture was different: when fd
starts it spawns a special thread and registers the required watches
from the config file, so an incremental job can ask this thread what
changed. It was a never-ending story of issues, race conditions, edge
cases, etc.
Today there is a modern and scalable fanotify API (or FSEvents on macOS
or the USN Change Journal on Windows) which should be used instead,
again IMVHO, YMMV.
Fanotify might be more scalable and reliable, but it can still lose
events.
There is no guaranteed completeness: the stream of events you receive
is not guaranteed to represent all filesystem activity that actually
happened.
Its man page fanotify(7) states: "The event queue can overflow. In this
case, events are lost."
And from what I understand, just like with inotify, you get a single
overflow notification, so you know that a loss occurred but you don't
know what was lost.
In short, there are many cases where fanotify is better than inotify but
it comes with some other limitations and design choices that should be
considered when using it.
Regards
--
Josip Deanovic