I may have a way to auto-detect a hung session (it works for our 26B hangs,
at least).  Run utwho -a.  Any session that is owned by root is a hung
session for us (provided root is not actually logged into any of our Sun Ray
sessions).  The root ownership seems to block Xnewt from spinning up on
that display, giving the dreaded 26B (and maybe 26D?).
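
A rough sketch of that check, in case it is useful. It assumes the user name
shows up as its own whitespace-separated field in each line of utwho -a
output; I have not pinned down the column position, so it checks every
field. The function names are made up for illustration.

```shell
#!/usr/bin/bash
# Sketch only: flag utwho-style lines that are owned by root.
# Assumption: the owner appears as a separate whitespace-delimited field.
# The column position is NOT assumed, so every field is compared.
root_owned_sessions() {
    awk '{ for (i = 1; i <= NF; i++) if ($i == "root") { print; next } }'
}

# Read utwho-style output on stdin and report root-owned sessions.
# Returns 0 if at least one was found, 1 otherwise.
check_for_hung() {
    local hung
    hung=$(root_owned_sessions)
    if [[ -n "$hung" ]]; then
        echo "Possible hung sessions (owned by root):"
        echo "$hung"
        return 0
    fi
    echo "No root-owned sessions found."
    return 1
}
```

It would be run as "utwho -a | check_for_hung". It only reports; it does not
reset anything, and it will give false positives if root really is logged
into a Sun Ray session.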

On Fri, Oct 3, 2014 at 1:51 PM, James Michels <
[email protected]> wrote:

> Hello all,
>
> As promised, I'm posting some feedback about your ideas. In short, a
> mix of both Scott's and Daniel's ideas has *almost* worked for me. Using
> just the utload & utsession commands didn't seem to work. However, I
> tried adding and removing steps and tracing their effects, and my
> conclusions are the following:
>
>    - The command combination that seems to work for me is "utdesktop -d
>    XXXXXXXXXXXX" + "utload -r -t pseudo.XXXXXXXXXXXX" + "utsession -k -t
>    pseudo.XXXXXXXXXXXX".
>    - The strange thing is that when run just once, the client still seems
>    to be hung after reboot. Only after running the same commands 5-6 times
>    does the client seem to start correctly. That seems pretty strange to
>    me, as the three commands are run exactly the same way each time.
>    - This combination covers Scott's solution except for one path:
>    /tmp/SUNWut/kiosk/:DISPLAY..., so I made a script that also removes
>    that path in addition to running the 3 commands above.
>    - As for my goal of detecting hung sessions before the client is
>    restarted for the first time, it seems that it won't work the way I
>    intended. The utquery command returns correct values for the
>    cmdcachesize parameter before rebooting, so it can't be used as an
>    indicator. I'm still looking for a way to detect those hung clients
>    without telling our workers to power cycle them manually.
>
> This has been a big help as at least I can reset the hung state of the
> clients now and I don't need to wait for the nightly utstart -c, so thank
> you Scott and Daniel.
>
> If someone has an idea on how to detect a hung client without needing to
> power cycle it, I'd be very grateful.
>
> James
>
>
> 2014-10-02 20:25 GMT+01:00 James Michels <[email protected]>:
>
>> Hello Daniel,
>>
>> 2014-10-02 18:51 GMT+01:00 Beckman, Daniel <[email protected]>:
>>
>>> This may be a different issue, but with SRS 5.3.1 on Solaris 10, when we
>>> run “utdesktop -lw” we sometimes see units that can’t get a session and
>>> are “stuck”. I think they are 26Ds as well.
>>>
>>>
>>>
>>> On an individual DTU basis, to fix remotely we issue this:
>>>
>>>
>>>
>>> /opt/SUNWut/lib/utload -r -t pseudo.00144fd6d4c7 &&
>>>     utsession -k -t pseudo.00144fd6d4c7
>>>
>>>
>>>
>>> Where “pseudo.xxx..” is the identifier that shows up in output of
>>> “utdesktop -lw”.
>>>
>>>
>>>
>>
>> In our case the 26D seems to be a bit more complicated to detect (not
>> sure if OL and Sun base might have different behaviors), but when a session
>> gets hung, initially it's still recognized as a valid session and doesn't
>> appear in the -lw list until someone power cycles the client. Then the 26D
>> is shown again, but this time it appears in the -lw list. Our aim is to
>> find out a way to discover those hung sessions in their 'initial' state,
>> where the SRSS doesn't catalogue them as hung yet.
>>
>> I have a possible idea, but I still have to test it quite a lot. It's
>> based on the output of the utquery command, specifically on the
>> cmdcachesize parameter, which seems to be 0 when the session is hung, but
>> as I said, I'm not quite sure of this yet.
>>
>> I've tried using the 'utsession -k -t' command, but on its own it doesn't
>> seem to "unstick" the session. However, I've not tried it in combination
>> with the 'utload' command, so I'll try that tomorrow and see how it
>> behaves.
>>
>> Thanks very much for that idea, too.
>>
>> James
>>
>>
>>> What that will do is “unstick” (for lack of a better term) the session
>>> associated with that DTU and then reboot it.
>>>
>>>
>>>
>>> To make things easier we have a simple script that asks for the
>>> identifier:
>>>
>>>
>>>
>>> #!/usr/bin/bash
>>>
>>> read -p "Enter the Desktop ID: " DesktopID
>>>
>>> /opt/SUNWut/lib/utload -r -t pseudo.$DesktopID &&
>>>     utsession -k -t pseudo.$DesktopID
>>>
>>>
>>>
>>> To make things automated we have a script that runs via cron by the hour:
>>>
>>>
>>>
>>> #!/usr/bin/bash
>>>
>>> # Set variable for "DesktopID" based on output of utdesktop -lw
>>>
>>> DesktopID=$(utdesktop -lw | awk 'NR>3 && NR<5 {print $1}' )
>>>
>>> # Only continue if "utdesktop -lw" reports a hung session, indicated by
>>> # existence of ID starting with 00
>>>
>>> if [[ "$DesktopID" == 00* ]]
>>>
>>> then
>>>
>>>     echo "There's hung sessions -- fixing them..."
>>>
>>>     /opt/SUNWut/lib/utload -r -t pseudo.$DesktopID &&
>>>         utsession -k -t pseudo.$DesktopID
>>>
>>> else
>>>
>>>     echo "No hung sessions -- we're done here!"
>>>
>>> fi
>>>
>>>
>>>
>>> I’m not a programmer so I apologize if the scripts are a bit crude – but
>>> they work for us.
>>>
>>>
>>>
>>> Hope that helps!
>>>
>>>
>>>
>>> Best,
>>>
>>> Daniel
>>>
>>>
>>>
>>>
>>>
>>> *From:* James Michels [mailto:[email protected]]
>>> *Sent:* Thursday, October 02, 2014 8:33 AM
>>> *To:* [email protected]
>>> *Subject:* [SunRay-Users] 26D and ability to effectively erase sessions
>>>
>>>
>>>
>>> Hello,
>>>
>>> We're sometimes getting spontaneous and unpredictable 26D screens. This
>>> doesn't happen very often, but what's worrying is that we're unable to
>>> reset the affected client's state.
>>>
>>> We've tried to reset the client using the utsession -k -t command and
>>> also utdisplay -d, and both of them seem ineffective: when rebooted, the
>>> client remains in the same 26D state.
>>>
>>> The only thing that helps is a complete server reboot.
>>>
>>> When the client reconnects to the server we're seeing this in the log so
>>> maybe it's related:
>>>
>>> Oct  2 12:43:52 srss7 utauthd: search_for_entries(): Found multiple
>>> matching entries, was expecting a single match
>>>
>>>
>>>
>>> I deduce that the session is not being cleaned up entirely, so here's my
>>> question:
>>>
>>>
>>>
>>> Is there an *effective* way to completely wipe the information for a
>>> client? Something like this must be possible, otherwise a complete server
>>> restart wouldn't help either.
>>>
>>>
>>>
>>> I don't mind connecting to the local LDAP server and deleting
>>> 'something' by hand, but I'd like to know a way. We're running OL6.5.
>>>
>>>
>>>
>>> Thank you.
>>>
>>>
>>>
>>> James
>>>
>>> _______________________________________________
>>> SunRay-Users mailing list
>>> [email protected]
>>> http://www.filibeto.org/mailman/listinfo/sunray-users
>>>
>>>
>>
>


-- 


Greg Rodenhiser
Technical Services Engineer
College of the Holy Cross
