On Tue, 9 Jan 2007 15:55:51 +0000 Keith Lofstrom <[EMAIL PROTECTED]> wrote:
> On Tue, Jan 09, 2007 at 08:19:35AM -0700, hanj wrote:
> > Hello All
> >
> > I have an interesting and annoying situation. On one remote server, I'm
> > having issues with corrupted MAC via SSH and my session disconnects.
> > This appears to be a hardware problem somewhere on my route.. and I'm
> > working with my ISP's network admins on the problems. Now.. the
> > question to you. When this happens, my dirvish image fails since I'm
> > disconnected in the middle of the backup.
> >
> > Is it possible to repair the image? Currently, I have to delete the
> > dated folder and try again, and cross my fingers it doesn't fail on
> > this try. I would really like to just repair the image from the point
> > it failed.
> >
> > I tried copying files, etc from 'good' images, but it doesn't see them
> > for the next pass the following day.
>
> This is more a network question than a dirvish question - dirvish needs
> rsync to be working, and rsync needs the underlying network transport to
> be working. I don't think you should be trying to run dirvish (or any
> other backup tool) over a network until you can get the network operating
> properly. This can be due to many things, very likely a configuration
> problem since typical IP transport protocols are tolerant of lost packets,
> but intolerant of configuration errors that continuously misdirect them.
>
> Sometimes the "configuration error" is a zombied machine somewhere on
> the path. Do not rule out enemy action. While ssh can tunnel through
> hostile networks, it will get confused and have to restart a lot if
> another machine is pretending to be one of two legitimate endpoints.
> However, it is more likely to be something like an iptables and NAT
> misconfiguration - this has happened to me, and I fixed it mostly by
> careful reading of the iptables docs and proper configuration rather
> than by observing packets.
>
> You will need a network guru, not an rsync guru, for now.
> If you need to build test cases that stress a probably-working network,
> rsync can be good for that, but avoid the complexities of dirvish and
> build some simplified test cases. For example, use rsync alone to copy
> directories between two machines, identical process each time (same
> initial source and target data). If you get varying results from
> identical rsync copy processes and you cannot figure out what is
> happening from the tcpdump logs, then pick an easier-to-understand
> application than rsync.

Hello

Thanks for writing. I understand your point. The network issue is being
addressed. It's most likely a faulty interface on a router on one of the
hops to my colo server. I verified that my home office route causes this
problem to any server in their facility, but I'm fine from other networks.

The problem is that they're working on it. Been working on it.. but in
the meantime, some of my daily backups fail. I understand your point about
not running backups during this phase, but I would feel much more
comfortable having backups while they work on the problem on their end.

The SSH corrupted MAC error is basically a dropped packet. SSH can't
'reset' like normal TCP/IP traffic.. so it drops the connection. This is
intermittent, so backups can sometimes work. I can continue where it left
off by issuing the rsync command (listed in summary) pointing to the image
it just failed on. It completes the backup, but the following day dirvish
still thinks it's missing those files. Must be a link or some other magic
that dirvish does after the rsync phase.

As I said, my ISP is working on the problem, but the suspected
interface/device is outside of their direct network, and convincing the
upstream provider that they have a problem is time-consuming.

Thanks again!
hanji
_______________________________________________
Dirvish mailing list
[email protected]
http://www.dirvish.org/mailman/listinfo/dirvish
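
Keith's suggestion of repeating an identical rsync copy and comparing the results could be sketched roughly as follows. This is only an illustration, not anything dirvish itself does: the helper names `tree_digest` and `run_trial` are made up here, the rsync options are an assumed minimal set (`-a` over ssh), and the target directory is wiped before each trial so that every run starts from the same state.

```python
import hashlib
import os
import shutil
import subprocess


def tree_digest(root):
    """Return one SHA-256 hex digest covering every relative path and file body under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix the traversal order so the digest is repeatable
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode("utf-8") + b"\0")
            if os.path.islink(path):
                # Hash the link target text rather than following the link.
                digest.update(os.readlink(path).encode("utf-8") + b"\0")
                continue
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(1 << 16), b""):
                    digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()


def run_trial(source, target):
    """Empty target, run one rsync copy into it, return (rsync exit code, digest of target)."""
    shutil.rmtree(target, ignore_errors=True)  # same initial target data every time
    os.makedirs(target)
    result = subprocess.run(["rsync", "-a", "-e", "ssh", source, target])
    return result.returncode, tree_digest(target)
```

For example, calling `run_trial("colo.example.net:/srv/testdata/", "/tmp/rsync-trial")` several times (host and paths are placeholders) and collecting the returned pairs should give the same digest and a zero exit code on every run over a healthy network; differing digests or non-zero exit codes from identical trials would point at the transport rather than at dirvish.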
