Hi Bob,

Bob Doolittle wrote:
[...]
>> Any further suggestions welcome.
>>
>
> Does utgstatus start reporting correctly again after a few minutes? If
> so, this might possibly be a thread scheduling issue. The Group Manager
> is responsible for sending out broadcast/multicast group membership
> "advertisements" every 20 seconds, and there is a dedicated thread in
> utauthd for sending and collecting such advertisements that determines
> group membership. It's conceivable that if the other threads were
> somehow preventing the GM thread from being scheduled (which shouldn't
> be possible, but...) a problem like this could occur. I'd expect it to
> "repair itself" after a couple of minutes, however.

Well, sometimes not all of the servers in the FOG fail immediately when
one server goes down. Depending on the length of the downtime, or
perhaps on the exact moment at which the server suddenly becomes
unavailable to the others, either one or all of the remaining servers in
the FOG lose their connection to the group, which interrupts the Sun Ray
sessions. Only a "utrestart" on each server that reports it can't
establish the group connection (utgstatus reports an error) helps to
restore the sessions.

> How large is your site? How many servers in this FOG, how many Sun Rays?
> Are you using the JRE supplied with SRSS (i.e. did you specify it during
> utinstall, so that /etc/opt/SUNWut/jre now points to it)? How much RAM
> is in your server, and what does ps report for utauthd:
> On Solaris: # pargs `pgrep -f utauthd`
> On Linux: # ps wwwaux | grep utauthd

In fact, we have two failover groups. One consists of two Solaris 10
servers running SRSS 4.1 in kiosk mode to serve uttsc/rdesktop sessions.
In the other, three servers are connected in a FOG with different SRSS
versions (4.1 and 4.0) and operating systems (Linux/Solaris). However,
in both failover groups the problem occurs as soon as I reboot one of
the servers or suddenly interrupt its network connection.
As soon as I reissue a "utrestart" on the affected servers, the sessions
return correctly.

All in all we have about 90 Sun Rays connected to these two failover
groups with 5 servers. However, it does not really seem to matter how
many Sun Rays are connected to the individual Sun Ray servers.

On our Solaris (SPARC) machines the jre link points to:

/etc/opt/SUNWut/jre -> /usr/java
java version "1.5.0_17"

And on our Linux (Ubuntu x86_64) machines the jre link points to:

/etc/opt/SUNWut/jre -> /usr/lib/jvm/ia32-java-6-sun/jre
java version "1.6.0_10"

Here is the output of the ps commands:

Solaris:
19495: /etc/opt/SUNWut/jre/bin/java -client auth.utauthd.utauthd
argv[0]: /etc/opt/SUNWut/jre/bin/java
argv[1]: -client
argv[2]: auth.utauthd.utauthd

Linux:
root 9157 0.0 0.0 220984 18132 ? Sl 12:10 0:07 /etc/opt/SUNWut/jre/bin/java -client auth.utauthd.utauthd

cheers,
jens
--
Jens Langner Ph: +49-351-2602757
Forschungszentrum Dresden-Rossendorf e.V.
Institute of Radiopharmacy - PET Center [email protected]
Germany http://www.fzd.de/
_______________________________________________
SunRay-Users mailing list
[email protected]
http://www.filibeto.org/mailman/listinfo/sunray-users
