Using rsync or unison (or any of the improvements built on top of
rsync) as a direct replacement for FTP would be a low-hanging-fruit
improvement. Tunneling it over an encrypted layer such as SSH improves
security as well.
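
For instance (hostnames and paths here are made up; substitute your
own), a push from one producer machine might look like:

    # mirror the outgoing directory to one web server, tunneled over SSH
    rsync -az -e ssh /var/spool/outgoing/ webuser@web1.dmz.example.com:/var/www/files/

-a preserves permissions and timestamps, -z compresses on the wire,
and -e ssh keeps the whole transfer inside an encrypted channel.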

You're absolutely right that if you have a cluster of machines in the DMZ,
a more scalable architecture is to have them all share storage rather than
unicast-replicating to each web server individually.

So you have a choice between SAN and NAS to do this. NAS is generally
the cheaper route for a mature solution on a budget.

Perhaps your next-lowest-hanging fruit is to put a secure NAS on your DMZ,
then replicate to it from your internal trusted network via a single port:
rsync or unison again, over an encrypted protocol like SSH.
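
A minimal sketch, assuming a cron-driven push and made-up hostnames --
the only firewall hole is outbound SSH (TCP 22) from the trusted side
to the NAS:

    # crontab on the internal staging host: mirror to the DMZ NAS every minute
    * * * * * rsync -az --delete -e ssh /srv/share/ sync@nas.dmz.example.com:/export/share/

Note the direction: the trusted network pushes out; nothing in the DMZ
ever initiates a connection back into the LAN.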

There are many possible variations: using stronger or weaker security on
the replication channel depending on the exact use case and data;
replicating on an event-driven basis instead of via polling (see the
sketch below); or buying replication software or NAS appliance(s) that
handle replication already.
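
As a sketch of the event-driven variant (this assumes inotify-tools is
installed, and reuses the made-up paths from above):

    # push each file as soon as it is finished being written, instead
    # of waiting for the next polling interval
    inotifywait -m -e close_write --format '%w%f' /srv/share |
    while read -r f; do
        rsync -az -e ssh "$f" sync@nas.dmz.example.com:/export/share/
    done

That gets you close to your 30-second delivery target without
hammering the storage with full-tree scans.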

Your instincts are correct that it /is/ largely a solved problem!

Good Luck,
-D


Flaherty, Patrick wrote:
> I've been soliciting solutions from everyone I can think of on moving a
> large number of files from inside our LAN to a DMZ on a regular basis.
>
> I have a cluster of machines producing 20k small files (30 kbytes or
> so) inside our LAN. After the files are created, they are pushed to a
> few web servers in the DMZ using FTP. The push is done by the machine
> that created the file. Ideally, the files make it out to the DMZ in
> less than 30 seconds, but there have been some issues.
>
> FTP seems to fall down when scaling out to more than a web server or
> two: many retries and transfer failures. It also adds complexity to
> the processing. What if one of the web servers is down? How many times
> do you retry? Should you notify the other hosts in the cluster? All
> that logic needs to be in the pushing script, which becomes a bit
> ungainly. There's also the issue of constantly opening new FTP
> sessions, which is a bit expensive.
>
> So I'm looking for a cleaner architecture. An ideal solution would be
> an NFS/CIFS share internal to the LAN, replicated read-only to an
> NFS/CIFS share in the DMZ. The cluster can write to the NFS share; the
> web servers can read from it. Everyone is happy. The big sticking
> point is being careful not to violate security by multi-homing the
> storage. Many solutions require an open network connection on many
> ports between the two storage boxes, which would be an easy way into
> our LAN.
>
> So far I'm poking at (and some downsides):
>  FUSE + (sshfs/ftpfs): high performance hit (60% or so, from what I've read)
>  ZFS + StorageTek: great, but another operating system to train people on.
>  DRBD: requires a full network connection between the LAN and DMZ boxes.
>  DataPlow SFS + DAS box: sales people will promise you the world.
>  Software SAN replicators of too many names to mention.
>
> This is such a common problem, I'm not sure why there isn't a nice
> canned solution built from two cheap pieces of hardware. Maybe I'm
> just an idiot and there is. Oh please please please tell me I'm an idiot.
>
> Anyone have any brilliant ideas?
>
> Best,
> Patrick


_______________________________________________
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
