Hi! > Please cc to me - I am currently off the list. Ok. > > One question: can crak be used for process migration (assuming nodes > > share filesystem)? (As in, node of > > cluster is going down so we checkpoint and resume on some other node?) > > Yes, as long as the resources (opened files) can be accessed on both > nodes. Good. > > PS: Can it checkpoint/restart X applications? I guess some games would > > be easier with ability to checkpoint ;-) > > Which means we need to support migrating network sockets. I added > TCP/IPv4 socket support this spring (currently for 2.2.19 and will port to > 2.4 shortly), and I tested migrating X. In certain cases I > successfully migrated some applications like Emacs, Acroread, etc, but > there is a prob lem. (The socket migration code has not been put online, > but I'd like to discuss how it works here) > > Basically I took three steps to migrate a TCP socket. Assuming A and B > are the two peers: > > 1. shutdown process A while keep B open > 2. restart A and re-establish the socket which points to B > 3 . change the socket on B to point to the new location of A This assumes both A and B are on same machine, right? > The problem is, during this stage, if B sends packets to A before 3 is > complete, B's socket will get a RST. In the case of X, if you click or > move cursor on A's window when A is being migrated, it will crash. <EVIL SOLUTION> You might shutdown machine's networking between checkpoint and restart. That way, packets are silently lost, and there's no RST to be generated. </EVIL> > One solution might be that freezing B when A is being migrated. There are > two ways to freeze B: > > 1) send a SIGSTOP to B and later SIGCONT it. It's simple to do but woul d > result in freezing the whole process, which is bad in certain cases (e.g., > the whole X server is stopped - the screen freezes). Assuming they are on same machine. > 2) freeze the socket only. I tried to set window sizes of B's socket to > zero, but it didn't work (I didn't try too hard though). I'd like to know > whether there i s a way to do so. You don't want to decrease window size, you want all packets silently discarded. > Unfortunately, even we use 1), it still doesn't solve the whole problem. > For exmaple, when the X connection is tunneled through ssh, you can only > freeze the sshd process, but packets are still sent to it when you click > on the server side, which will crash the connection as ell (at least for > my current implementation). One reason might be I didn't take care of > pending packets when I migrage a socket, but in fact, the real problem of > socket migration is that you don't know what would happen if the network > address is changed. Appliactions may depend on it (such a s FTP). A > virtual network interface should be provided to solve the problem > gracefully. > > As of migrating games, hmmm, here are my 2cents: > > 1) Most online games use UDP, and CRAK hasn't implemented UDP support. > It's much easier than TCP though. I guess you can't checkpoint/restart when there's remote machine involved. I was not thinking online games, I was thinking about tuxracer (game on localhost). Pavel -- I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care." Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/