Bugs item #1273562, was opened at 2005-08-25 17:57
Message generated for change (Comment added) made by miedward
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=1273562&group_id=9368
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Installation
Group: 4.2
Status: Open
Resolution: None
Priority: 9
Submitted By: Mengjuei Hsieh (mjhsieh)
Assigned to: Bernard Li (bernardli)
Summary: client stalls with error "nc: connect: connection refused"

Initial Comment:
During the installation, the client console sometimes shows the error message "nc: connect: connection refused". If there are too many of these messages, the client rsync process simply stalls.

I was not running si_monitor, so the "nc: connect: connection refused" message seems reasonable. However, I did not expect the installation to stop.

----------------------------------------------------------------------

Comment By: Michael Edwards (miedward)
Date: 2005-08-30 12:40

Message:
Logged In: YES
user_id=1192055

I swapped out the junker node I was using (PII, 3c905b NIC) for a known good machine with a spare hd (P4, Broadcom GigE) and the problem vanished. I ran part of the installation with the monitor off, and while I did see some nc:connect:connection refused messages scroll by, the data flow was not interrupted. When I turned the monitor on, it picked up the install in progress as it should and everything completed successfully.

This problem might be at least in part related to poor bandwidth (or poor compatibility with current drivers) on slower/older systems.

----------------------------------------------------------------------

Comment By: Michael Edwards (miedward)
Date: 2005-08-30 09:46

Message:
Logged In: YES
user_id=1192055

On RHEL4, if I turn off si_monitor, the installation locks up quickly with rsync error messages on the test node (though mine mention timeouts or refused connections).

On the other hand, I am getting periodic lockups that may or may not be related to this issue anyway, so I am not sure I can rule out hardware failure in my cases.

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-30 08:12

Message:
Logged In: YES
user_id=338721

Are you sure that nc is the reason for this? The nc is called every few seconds; the message comes from STDERR of the logger process in the background. If this disrupts rsync, something is badly broken. I'd look for other reasons for the stall: pfilter, /var/log/lastlog, ...?

----------------------------------------------------------------------

Comment By: John (muglerj)
Date: 2005-08-29 14:18

Message:
Logged In: YES
user_id=505737

Upping to 9, as discussed on the call.

----------------------------------------------------------------------

Comment By: Bernard Li (bernardli)
Date: 2005-08-25 23:32

Message:
Logged In: YES
user_id=879102

I guess there are 2 ways to fix this issue:

1) Turn on si_monitor (monitor daemon) all the time on the headnode

2) Change when to append IMAGESERVER= to the si kernel (right now this is done in setup_pxe; perhaps we can move this to when the user clicks on "Monitor Cluster Deployment"?)

Need to think about this some more...

----------------------------------------------------------------------

_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
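
Illustration only, not taken from the SystemImager or OSCAR sources: the thread describes a background logger that contacts the monitor every few seconds and prints "connection refused" to STDERR when si_monitor is not running. A minimal sketch of that pattern, written in Python rather than the shell/nc the installer actually uses, might look like the following. The names `report_status` and `start_reporter`, the host/port arguments, and the message format are all hypothetical. The point it tries to show is that a refused or timed-out connection is logged and then ignored, so the reporter by itself has no reason to block the main transfer.

```python
import socket
import sys
import threading


def report_status(host, port, message, timeout=2.0):
    """Try once to send a status line to the monitor; never raise.

    Returns True if the message was handed to the socket, False otherwise.
    (Hypothetical helper; not part of SystemImager.)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(message.encode("utf-8") + b"\n")
        return True
    except OSError as exc:
        # ConnectionRefusedError and timeouts are OSError subclasses.
        # Log to STDERR and carry on instead of propagating the failure.
        print(f"monitor report failed: {exc}", file=sys.stderr)
        return False


def start_reporter(host, port, get_status, interval=5.0):
    """Report get_status() every `interval` seconds in a daemon thread.

    Returns a threading.Event; call .set() on it to stop the loop.
    """
    stop = threading.Event()

    def loop():
        while not stop.is_set():
            report_status(host, port, get_status())
            # Sleep, but wake up early if asked to stop.
            stop.wait(interval)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

With a reporter of this shape, a missing monitor only produces STDERR noise. If the install still stalls, that would be consistent with Erich's suggestion to look for the cause elsewhere.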
