Hi there,
I've implemented the following script, which seems to work nicely for us.
However, we've seen some cases where the client is hung yet reports being
IDLE in the utsession -p output, so this will probably also need a script
that restarts idle sessions from time to time (say, once every 2 hours);
after cycling, the session should report the correct value. Odd, though.
Again, thanks for your help!
#!/bin/bash
# Paths to commands
LOGGER=$(command -v logger)
UTSESSIONCMD=/opt/SUNWut/sbin/utsession
UTDESKTOPCMD=/opt/SUNWut/sbin/utdesktop
UTLOADCMD=/opt/SUNWut/lib/utload
# Split command output on newlines only, so each row stays in one piece
IFS=$'\n'
# Disable filename expansion so '???' in the output stays literal
set -f
# The following loop returns every session in "SID Login Status" format
for dtu in $($UTSESSIONCMD -p | tr -s ' ' | cut -d' ' -f1,3,5); do
    # The first returned row is empty, so we discard it
    [ -z "$dtu" ] && continue
    # Acquire the needed values for each row
    sid=$(echo "$dtu" | cut -d' ' -f1)
    login=$(echo "$dtu" | cut -d' ' -f2)
    state=$(echo "$dtu" | cut -d' ' -f3)
    # The utsession output doesn't seem to be meant for scripting, so some
    # rows (the header and the dashed separator) carry no useful info.
    [ "$sid" = "Token" ] && continue
    [[ $sid =~ ^--- ]] && continue
    # If the DTU has no associated login and no IDLE flag either, it is
    # probably hung, so we restart it!
    if [ "$login" = '???' ] && [[ $state != I* ]]; then
        $UTDESKTOPCMD -d "$(echo "$sid" | cut -d'.' -f2)"
        $UTLOADCMD -r -t "$sid"
        $UTSESSIONCMD -k -t "$sid"
        $LOGGER -t hung_session "Restarting $sid as it seems to be hung"
        echo "DTU $sid has been restarted"
    fi
done
exit 0
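In case anyone wants to try the detection rule without touching live sessions, here is a minimal sketch that isolates it into two functions, so it can be run against saved output. The names is_hung and list_hung are made up for illustration, and the input is assumed to be already reduced to "SID Login State" rows, as the loop above does. The rule itself (login is ??? and the state does not start with I) is the one used in the script.

```shell
#!/bin/bash
# is_hung LOGIN STATE
# Hypothetical helper: succeeds when a row looks hung, i.e. the login
# column is '???' and the state does not start with 'I' (idle).
is_hung() {
    local login=$1 state=$2
    [ "$login" = '???' ] && [[ $state != I* ]]
}

# list_hung: reads "SID Login State" rows on stdin (assumed format) and
# prints the SID of every row that looks hung.
list_hung() {
    local sid login state
    while read -r sid login state; do
        [ -z "$sid" ] && continue
        if is_hung "$login" "$state"; then
            echo "$sid"
        fi
    done
    return 0
}
```

Something like "utsession -p | tr -s ' ' | cut -d' ' -f1,3,5 | list_hung" would then only list candidates, without restarting anything.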
2014-10-03 19:42 GMT+01:00 James Michels <[email protected]>:
> Greg,
>
> 2014-10-03 19:32 GMT+01:00 Rodenhiser, Greg <[email protected]>:
>
>> The other way I've seen, which works even before a power cycle, is via
>> utsession -p. If the Unix session column is ??? but the state is U (as
>> opposed to IU), it is the Sunray session that is hung on 26B. At 4 AM every
>> day I run utstop/utstart via cron, and I check every morning for the hung
>> session. In our case it's always just a single display out of the 35 we use.
>>
>>
> This seems very promising! I'll check on Monday and post some feedback
> about it, along with the script I implement to kill hung sessions.
>
> Thank you all, your tips will most probably solve my issue.
>
> Happy weekend,
>
> James
>
>
>> On Fri, Oct 3, 2014 at 2:14 PM, James Michels <
>> [email protected]> wrote:
>>
>>> Hello Greg,
>>>
>>> 2014-10-03 19:01 GMT+01:00 Rodenhiser, Greg <[email protected]>:
>>>
>>>> I may have a way to auto detect a hung session (at least for our 26B
>>>> hangs it works). Run a utwho -a. Any session that is owned by root is a
>>>> hung session for us (provided root is not actually logged into any of our
>>>> Sunray sessions). The fact that it's owned by root seems to block Xnewt
>>>> from spinning up on that display, giving the dreaded 26B (and maybe D)?
>>>>
>>>>
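For reference, the root-ownership check described above could be sketched as a small filter. The function name root_owned is hypothetical, and since the column layout of the utwho output is not shown in this thread, the filter makes no assumption about which column holds the user: it simply prints any line that has a field exactly equal to "root".

```shell
#!/bin/bash
# root_owned: hypothetical filter; prints every stdin line that has a
# whitespace-separated field exactly equal to "root". No assumption is
# made about which column holds the user name.
root_owned() {
    awk '{ for (i = 1; i <= NF; i++) if ($i == "root") { print; next } }'
}

# Usage, assuming utwho is on the PATH:
#   utwho -a | root_owned
```

This only helps if root is not genuinely logged in on any display, as noted above.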
>>> This is the method we use, but it only works once the client has already
>>> been power-cycled :-( After that, the utwho -ca command shows that MAC as
>>> being owned by root, as you describe, but only after the fact. Until it is
>>> power-cycled, it appears to be an idle session. That's why I was tinkering
>>> with utquery to look at its parameters, but they're the same as for a
>>> "sane" session. Even the packets sent from that hung session seem to be
>>> the same as those of a normal session, so its camouflage is brilliant :-)
>>>
>>> Thanks for the tip!
>>>
>>> James
>>>
>>>
>>>
>>>> On Fri, Oct 3, 2014 at 1:51 PM, James Michels <
>>>> [email protected]> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> As promised, I'm posting some feedback about your ideas. In short, a mix
>>>>> of both Scott's and Daniel's ideas has *almost* worked for me. Using just
>>>>> the utload and utsession commands didn't seem to work. However, I tried
>>>>> adding and removing things and tracing their effects, and the conclusions
>>>>> are the following:
>>>>>
>>>>> - The command combination that seems to work for me is "utdesktop
>>>>>   -d XXXXXXXXXXXX" + "utload -r -t pseudo.XXXXXXXXXXXX" + "utsession -k
>>>>>   -t pseudo.XXXXXXXXXXXX".
>>>>> - The strange thing is that, when run just once, the client still seems
>>>>>   to be hung after reboot. Only after running the same commands 5-6
>>>>>   times does the client seem to start correctly. That seems pretty
>>>>>   strange to me, as the three commands are run in exactly the same way
>>>>>   all 5-6 times.
>>>>> - This combination already includes Scott's solution except for one
>>>>>   path, /tmp/SUNWut/kiosk/:DISPLAY..., so I made a script that
>>>>>   additionally removes that path and added it to the 3 commands above.
>>>>> - As for my aim of detecting hung sessions before the client is
>>>>>   restarted for the first time, it seems that it won't work the way I
>>>>>   meant. The utquery command returns correct values for the cmdcachesize
>>>>>   parameter before rebooting, so that approach is out. I'm still looking
>>>>>   for a way to detect those hung clients without telling our workers to
>>>>>   power cycle them manually.
>>>>>
>>>>> This has been a big help as at least I can reset the hung state of the
>>>>> clients now and I don't need to wait for the nightly utstart -c, so thank
>>>>> you Scott and Daniel.
>>>>>
>>>>> If someone has an idea on how to detect a hung client without needing
>>>>> to power cycle them, I'll be very grateful.
>>>>>
>>>>> James
>>>>>
>>>>>
>>>>> 2014-10-02 20:25 GMT+01:00 James Michels <
>>>>> [email protected]>:
>>>>>
>>>>>> Hello Daniel,
>>>>>>
>>>>>> 2014-10-02 18:51 GMT+01:00 Beckman, Daniel <[email protected]>:
>>>>>>
>>>>>>> This may be a different issue, but with SRS 5.3.1 on Solaris 10,
>>>>>>> when we run “utdesktop -lw” we sometimes see units that can’t get a
>>>>>>> session and are “stuck”. I think they are 26Ds as well.
>>>>>>>
>>>>>>> On an individual DTU basis, to fix remotely we issue this:
>>>>>>>
>>>>>>> /opt/SUNWut/lib/utload -r -t pseudo.00144fd6d4c7 && utsession -k -t pseudo.00144fd6d4c7
>>>>>>>
>>>>>>> Where “pseudo.xxx..” is the identifier that shows up in the output of
>>>>>>> “utdesktop -lw”.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> In our case the 26D seems to be a bit more complicated to detect (not
>>>>>> sure if OL and Sun base might have different behaviors). When a session
>>>>>> gets hung, initially it's still recognized as a valid session and doesn't
>>>>>> appear in the -lw list until someone power cycles the client. Then the
>>>>>> 26D is shown again, but this time it appears in the -lw list. Our aim is
>>>>>> to find a way to discover those hung sessions in their 'initial' state,
>>>>>> where the SRSS doesn't catalogue them as hung yet.
>>>>>>
>>>>>> I have a possible idea, but I still have to test it quite a lot. It's
>>>>>> based on the output of the utquery command, specifically on the
>>>>>> cmdcachesize parameter, which seems to be 0 when the session is hung,
>>>>>> but as I said, I'm not quite sure of this yet.
>>>>>>
>>>>>> I've tried using the 'utsession -k -t' command, but it doesn't seem to
>>>>>> "unstick" the session by itself. However, I haven't tried it in
>>>>>> combination with the 'utload' command, so I'll also try that tomorrow and
>>>>>> see how it behaves.
>>>>>>
>>>>>> Thanks very much for that idea, too.
>>>>>>
>>>>>> James
>>>>>>
>>>>>>
>>>>>>> What that will do is “unstick” (for lack of a better term) the
>>>>>>> session associated with that DTU and then reboot it.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> To make things easier we have a simple script that asks for the
>>>>>>> identifier:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> #!/usr/bin/bash
>>>>>>>
>>>>>>> read -p "Enter the Desktop ID: " DesktopID
>>>>>>>
>>>>>>> /opt/SUNWut/lib/utload -r -t "pseudo.$DesktopID" && utsession -k -t "pseudo.$DesktopID"
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> To make things automated we have a script that runs via cron by the
>>>>>>> hour:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> #!/usr/bin/bash
>>>>>>>
>>>>>>> # Set variable for "DesktopID" based on output of utdesktop -lw
>>>>>>> # (the awk filter picks line 4 only, so one DTU is handled per run)
>>>>>>>
>>>>>>> DesktopID=$(utdesktop -lw | awk 'NR>3 && NR<5 {print $1}' )
>>>>>>>
>>>>>>> # Only continue if "utdesktop -lw" reports a hung session, indicated
>>>>>>> # by the existence of an ID starting with 00
>>>>>>>
>>>>>>> if [[ "$DesktopID" == 00* ]]
>>>>>>> then
>>>>>>>     echo "There are hung sessions -- fixing them..."
>>>>>>>     /opt/SUNWut/lib/utload -r -t "pseudo.$DesktopID" && utsession -k -t "pseudo.$DesktopID"
>>>>>>> else
>>>>>>>     echo "No hung sessions -- we're done here!"
>>>>>>> fi
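Since the cron script above reads only line 4 of the utdesktop -lw output, it fixes at most one DTU per run. A possible variant that walks every row is sketched below. The names hung_ids and fix_all are hypothetical, and the layout assumptions (data rows start after line 3, the ID is the first column and starts with 00) are taken from the script above rather than from documentation.

```shell
#!/bin/bash
# hung_ids: reads `utdesktop -lw` output on stdin and prints the first
# column of every row after line 3 whose ID starts with "00".
hung_ids() {
    awk 'NR > 3 && $1 ~ /^00/ { print $1 }'
}

# fix_all: hypothetical driver; restarts each listed DTU in turn using the
# same two commands as the script above (both assumed reachable as shown).
fix_all() {
    local id
    utdesktop -lw | hung_ids | while read -r id; do
        echo "Fixing hung session on $id"
        /opt/SUNWut/lib/utload -r -t "pseudo.$id" && utsession -k -t "pseudo.$id"
    done
}
```

Piping a saved copy of the output through hung_ids first shows which IDs would be touched before fix_all is put into cron.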
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I’m not a programmer so I apologize if the scripts are a bit crude –
>>>>>>> but they work for us.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hope that helps!
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Daniel
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* James Michels [mailto:[email protected]]
>>>>>>> *Sent:* Thursday, October 02, 2014 8:33 AM
>>>>>>> *To:* [email protected]
>>>>>>> *Subject:* [SunRay-Users] 26D and ability to effectively erase sessions
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> We're sometimes getting spontaneous and unpredictable 26D screens.
>>>>>>> This doesn't happen very often, but what's worrying is that we're
>>>>>>> unable to reset the affected client's state.
>>>>>>>
>>>>>>> We've tried to reset the client using the utsession -k -t command and
>>>>>>> also utdisplay -d, and both seem ineffective: when rebooted, the
>>>>>>> client remains in the same 26D state.
>>>>>>>
>>>>>>> The only thing that helps is a complete server reboot.
>>>>>>>
>>>>>>> When the client reconnects to the server we're seeing this in the
>>>>>>> log so maybe it's related:
>>>>>>>
>>>>>>> Oct 2 12:43:52 srss7 utauthd: search_for_entries(): Found multiple matching entries, was expecting a single match
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I deduce that the session is not being cleaned up entirely, so
>>>>>>> here's my question:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Is there an *effective* way to completely wipe the information for
>>>>>>> a client? Something like this must be possible, otherwise a complete
>>>>>>> server restart wouldn't help either.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I don't mind connecting to the local LDAP server and deleting
>>>>>>> 'something' by hand, but I'd like to know a way. We're running OL6.5.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> James
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> SunRay-Users mailing list
>>>>>>> [email protected]
>>>>>>> http://www.filibeto.org/mailman/listinfo/sunray-users
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> Greg Rodenhiser
>>>> Technical Services Engineer
>>>> College of the Holy Cross
>>>
>>
>>
>