Hi Bob,

Bob Doolittle wrote:

[...]
>> Any further suggestions welcome.
>>   
> 
> Does utgstatus start reporting correctly again after a few minutes? If
> so, this might possibly be a thread scheduling issue. The Group Manager
> is responsible for sending out broadcast/multicast group membership
> "advertisements" every 20 seconds, and there is a dedicated thread in
> utauthd for sending and collecting such advertisements that determines
> group membership. It's conceivable that if the other threads were
> somehow preventing the GM thread from being scheduled (which shouldn't
> be possible, but...) a problem like this could occur. I'd expect it to
> "repair itself" after a couple of minutes, however.

Well, sometimes not all of the servers in the FOG fail immediately when
one server goes down. Depending on the length of the downtime, or
perhaps on the exact moment at which the server suddenly becomes
unavailable to the others, either one or all of the remaining servers in
the FOG lose their connection to the group and thus interrupt the Sun
Ray connections. Only a "utrestart" on each server that reports it can't
establish the group connection (utgstatus reports an error) restores the
sessions.
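
The manual workaround above could be scripted roughly as follows. This
is only a sketch: it assumes that utgstatus exits with a non-zero status
when the group connection cannot be established (an assumption, not
confirmed behavior), that the tools live in the default /opt/SUNWut/sbin
location, and check_group_and_restart is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: restart the Sun Ray services on this host when utgstatus
# fails. ASSUMPTION: utgstatus returns a non-zero exit status when the
# group connection cannot be established; verify this on your SRSS
# version before relying on it.

# Default SRSS install locations; override via the environment if needed.
UTGSTATUS=${UTGSTATUS:-/opt/SUNWut/sbin/utgstatus}
UTRESTART=${UTRESTART:-/opt/SUNWut/sbin/utrestart}

# check_group_and_restart is a hypothetical helper name, not an SRSS tool.
check_group_and_restart() {
    # Discard the status report itself; only the exit status is used here.
    if "$UTGSTATUS" >/dev/null 2>&1; then
        echo "group connection ok"
        return 0
    fi
    echo "utgstatus failed, running utrestart"
    "$UTRESTART"
}
```

Calling check_group_and_restart as root on each server in the FOG would
then mirror the steps described above; it does not address why the group
connection is lost in the first place.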

> How large is your site? How many servers in this FOG, how many Sun Rays?
> Are you using the JRE supplied with SRSS (i.e. did you specify it during
> utinstall, so that /etc/opt/SUNWut/jre now points to it)? How much RAM
> is in your server, and what does ps report for utauthd:
> On Solaris: # pargs `pgrep -f utauthd`
> On Linux: # ps wwwaux | grep utauthd

In fact, we have two failover groups. One consists of two Solaris 10
servers running SRSS 4.1 in kiosk mode to serve uttsc/rdesktop sessions.
The other three servers are connected in a FOG with different SRSS
versions (4.1 and 4.0) and operating systems (Linux/Solaris). However,
in both failover groups the problem occurs as soon as I reboot one of
the servers or suddenly interrupt the network connection. As soon as I
issue a "utrestart" on the affected servers, the sessions return
correctly. All in all we have about 90 Sun Rays connected to these two
failover groups with 5 servers. However, it does not really seem to
matter how many Sun Rays are connected to the individual Sun Ray
servers.

On our Solaris (SPARC) machines the jre link points to:

/etc/opt/SUNWut/jre -> /usr/java
java version "1.5.0_17"

And on our Linux (Ubuntu x86_64) machines the jre link points to:

/etc/opt/SUNWut/jre -> /usr/lib/jvm/ia32-java-6-sun/jre
java version "1.6.0_10"

Here is the output of the ps commands:

Solaris:
19495:  /etc/opt/SUNWut/jre/bin/java -client auth.utauthd.utauthd
argv[0]: /etc/opt/SUNWut/jre/bin/java
argv[1]: -client
argv[2]: auth.utauthd.utauthd

Linux:
root      9157  0.0  0.0 220984 18132 ?        Sl   12:10   0:07 /etc/opt/SUNWut/jre/bin/java -client auth.utauthd.utauthd

cheers,
jens
-- 
Jens Langner                                         Ph: +49-351-2602757
Forschungszentrum Dresden-Rossendorf e.V.
Institute of Radiopharmacy - PET Center                 [email protected]
Germany                                               http://www.fzd.de/
_______________________________________________
SunRay-Users mailing list
[email protected]
http://www.filibeto.org/mailman/listinfo/sunray-users
