I'm in the middle of putting together a use case, experimenting and
prototyping, and I wondered if I could take advantage of all of your
expertise on the subject :-)

I need to replicate a large number of large volumes (multi-gigabyte each,
many terabytes of data in total) to at least two other servers besides the
one holding the read-write volume, and to perform these releases relatively
frequently (much more than once a day, preferably). Essentially, I need a
relatively fast read-only archive of images, audio and video, distributed
around a bit for general access and for scalability/HA. The volumes will be
nested inside each other in the tree as the individual volumes grow... in
other words, we'll allocate more volumes into the tree as the total data
involved gets larger and larger, scaling it out across the whole of the AFS
system.

Also, the two (or more) read-only replicas of each read-write volume will
live on remote servers, with releases transiting relatively fat, but
less-than-gigabit, pipes (100+ megabits).
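
To make that concrete, here's roughly how I'm planning to provision each
new volume as the tree grows: a quick Python sketch shelling out to the
standard vos/fs tools. The server, partition, volume and path names are
just placeholders for my cell.

    #!/usr/bin/env python3
    """Sketch: provision a new volume with two remote read-only sites.

    All server/partition/volume/path names are placeholders.
    """
    import subprocess

    RW_SERVER = "afs1.example.com"     # holds the read-write volume
    RO_SERVERS = ["afs2.example.com",  # remote read-only sites, reached
                  "afs3.example.com"]  # over the 100+ Mbit links
    PARTITION = "/vicepa"

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def provision(volume, mount_point):
        # Create the read-write volume on the local server.
        run("vos", "create", RW_SERVER, PARTITION, volume)
        # Declare a read-only site on each remote server.
        for server in RO_SERVERS:
            run("vos", "addsite", server, PARTITION, volume)
        # Mount it into the tree; the mount point has to be made via
        # the read-write path (the /afs/.cell convention).
        run("fs", "mkmount", mount_point, volume)
        # First release does a full copy to every RO site.
        # (The parent volume also needs a release before the new
        # mount point shows up in the read-only tree.)
        run("vos", "release", volume)

    if __name__ == "__main__":
        provision("archive.img.0042",
                  "/afs/.example.com/archive/img/0042")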

I'm also working on backups, etc. That's a whole different question, and I
can handle backups locally... that part is served nicely by the AFS backup
and shadowing facilities, so no need to worry about it.

For the moment, what I have decided to experiment with is a simple system.
My initial idea is to layer the AFS read-only volume tree into an AUFS
union, with a local read-write partition as the writable branch. This way,
writes land locally, but I can periodically "flush" them to the AFS tree:
copy into the read-write volume, double-check they have been written and
released, and only then remove them from the local partition. This should
maintain integrity and high availability for the up-to-the-moment
recordings, given that I RAID the local volume. Obviously, the local
partition is still a single point of failure... so I'd like to flush as
frequently as possible. Incidentally, it seems you can NFS-export such a
union fairly simply.
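
For illustration, here's a rough sketch of that flush step, again just
Python shelling out to vos. The mount line, paths and volume name are
placeholders, and it assumes a single volume covers the subtree being
flushed.

    #!/usr/bin/env python3
    """Sketch: periodic flush from the AUFS writable branch to AFS.

    Assumed layout (placeholders for my setup):
      mount -t aufs -o br=/local/rw=rw:/afs/example.com/archive=ro \
            none /export/archive
    """
    import filecmp
    import shutil
    import subprocess
    from pathlib import Path

    LOCAL_RW = Path("/local/rw")                # AUFS writable branch
    AFS_RW = Path("/afs/.example.com/archive")  # read-write path
    VOLUME = "archive.img.0042"                 # volume for this subtree

    def flush():
        # 1. Copy everything in the writable branch into the RW volume.
        for src in LOCAL_RW.rglob("*"):
            if src.is_file():
                dst = AFS_RW / src.relative_to(LOCAL_RW)
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)
        # 2. Push the changes out to the read-only sites.
        subprocess.run(["vos", "release", VOLUME], check=True)
        # 3. Double-check each file made it, then drop the local copy;
        #    the union then falls through to the released AFS branch.
        for src in list(LOCAL_RW.rglob("*")):
            if src.is_file():
                dst = AFS_RW / src.relative_to(LOCAL_RW)
                if dst.is_file() and filecmp.cmp(src, dst, shallow=False):
                    src.unlink()

    if __name__ == "__main__":
        flush()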

But I feel as if I am missing something... it has become clear that
releasing is a fairly intensive operation, and if we're talking about
multiple gigabytes per release, I can imagine it becoming extremely
painful. Is there a scheme I can use with OpenAFS that will help alleviate
this problem? Or perhaps another approach I'm missing that would solve it
better?

Best,

--Timothy Balcer

PS: FYI, the reason I chose AFS in the first place was its ability to
produce replicas, present a uniform file space, and require no special
interface to interact with it (i.e., it's a filesystem, not a storage
system), plus its robustness. I am not convinced any of the other
distributed filesystems are as robust, and none of them offer the same
collection of features.
